Commit

Merge pull request #1286 from cal-itp/starter_kit
Starter_Kit
amandaha8 authored Nov 8, 2024
2 parents d8f8043 + 6fd6e87 commit 7b579b0
Showing 6 changed files with 91 additions and 64 deletions.
2 changes: 0 additions & 2 deletions Makefile
@@ -75,14 +75,12 @@ build_starterkit_ha:
pip install -r portfolio/requirements.txt
make build_portfolio_site
make git_check_no_sections
make production_portfolio

build_starterkit_LASTNAME:
$(eval export site = YOUR_SITE_NAME)
pip install -r portfolio/requirements.txt
make build_portfolio_site
make git_check_no_sections
make production_portfolio

build_fund_split:
$(eval export site = sb125_fund_split_analysis)
39 changes: 0 additions & 39 deletions portfolio/sites/gtfs_digest_testing.yml

This file was deleted.

44 changes: 38 additions & 6 deletions starter_kit/2024_basics_01.ipynb
@@ -5,7 +5,7 @@
"id": "247e773f-0e29-4ed6-ab4d-5856325611b4",
"metadata": {},
"source": [
"# Exercise 1: `pandas`,`python`, `f-strings`, Importing and Exporting data.\n",
"# Exercise 1: `Git`, `pandas`,`python`, `f-strings`, Importing and Exporting data.\n",
"If you are new to Python, there are many resources to help you! Below is just a small sample of what is available.\n",
"* There are introductory Python courses available through [Caltrans's LinkedIn Learning Library](https://www.linkedin.com/learning/search?keywords=python&u=36029164).\n",
"* [Practical Python for Data Science](https://www.practicalpythonfordatascience.com/00_python_crash_course) is an incredibly helpful resource. Material from it is linked throughout.\n",
@@ -23,6 +23,28 @@
"* While the values we are working with today are all <i>fake</i>, the exercise is based on the actual data and work we've done. "
]
},
{
"cell_type": "markdown",
"id": "41f1a8ae-23c0-42ba-8b50-5a1014d75fd0",
"metadata": {},
"source": [
"## GitHub - Making a Branch\n",
"* You are probably on the `main` branch of our `data-analyses` repo. \n",
"* The `main` branch is [here](https://github.com/cal-itp/data-analyses).\n",
"* We never work on the `main` branch. \n",
"* You can think of the `main` branch as an area that contains our work only when it's at a good stopping point.\n",
"* We typically save (or the proper term is `commit`) our work to the `main` branch at the end of the work week.\n",
"* The rest of the time, we work on our own branches. \n",
"* Let's make (or rather `check out`) our own branch.\n",
"\n",
"**Steps**\n",
"1. Go to the terminal.\n",
"2. Paste `git pull origin main` which pulls down the main branch with the latest work. \n",
"3. Paste `git switch -c your_branch` in the terminal. Swap out `your_branch` with something else.\n",
" * We typically name branches with all lowercase. Instead of `Amanda_Branch`, write `amanda_branch`.\n",
"4. Your terminal should now show `jovyan@jupyter-your_name ~/data-analyses (your_branch) $ ` which means you successfully made your new branch!"
]
},
{
"cell_type": "markdown",
"id": "4dd32eed-55a4-4fd1-874b-02f9b4bd94a7",
@@ -650,13 +672,23 @@
"id": "69d211b4-89f0-4b2c-9093-1118114ba649",
"metadata": {},
"source": [
"## You're almost done!\n",
"* Name this notebook `YOURNAME_exercise1.ipynb`\n",
"## Git - `Committing` Code\n",
"* In the terminal, paste `git mv 2024_basics_01.ipynb your_new_notebook.ipynb`. \n",
" * This renames your notebook.\n",
" * You can't right click and rename the file, since this notebook is tracked with Git. \n",
" * Rename it using `git mv OLDNAME.ipynb NEWNAME.ipynb`. \n",
" * The `mv` stands for move, and renaming a file is basically \"moving\" its path. \n",
" * Doing it this way retains the git history associated with the notebook. If you rename directly with right click, rename, you destroy the git history.\n",
"* Use a descriptive commit message (ex: adding chart, etc). GitHub already tracks who makes the commit, the date, the timestamp of it, the files being affected, so your commit message should be more descriptive than the metadata already stored."
" * If you rename directly with right click, rename, you destroy the git history.\n",
" * Doing it this way retains the git history associated with the notebook.\n",
"* In the terminal, paste `your_new_notebook.ipynb`. \n",
" * This adds your new notebook.\n",
" * To add all files with a certain extension, write `git add *ipynb`.\n",
"* Continuing in the terminal, paste `git commit -m 'write a message here'`\n",
" * This details the work you did this particular coding session. \n",
" * A typical message would be: `git commit -m 'added charts'` or `git commit -m 'worked on exercise 1'`\n",
" * GitHub already tracks the change's date and timestamp, the files being affected, who made the change, and more so you don't need to include details like these details.\n",
"* Finally, in the terminal, paste `git push origin your_branch`.\n",
" * This pushes up your change to the remote `data-analyses` repo onto your own branch.\n",
" * Now, all your work is safely stored on and recorded by GitHub."
]
}
],
48 changes: 35 additions & 13 deletions starter_kit/2024_basics_03.ipynb
@@ -5,7 +5,7 @@
"id": "3f74a524-f90a-4ad5-8d98-368afc398b46",
"metadata": {},
"source": [
"# Exercise 3: Strings, Functions, If Else, For Loops"
"# Exercise 3: Strings, Functions, If Else, For Loops, Git"
]
},
{
@@ -631,6 +631,7 @@
"source": [
"## For Loops \n",
"* For Loops are one of the greatest gifts of Python. \n",
"* It runs code from the beginning to the end of a list. \n",
"* Below is a simple for loop that prints out all the numbers in range of 10.\n"
]
},
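The two loop shapes described in this cell can be sketched in plain Python. This is only an illustration, not the notebook's own code: the `columns` list and its names are made up, standing in for real dataframe column names.

```python
# Loop over the numbers 0 through 9, as in the notebook's first example.
numbers = []
for i in range(10):
    numbers.append(i)
    print(i)

# Loop over a list: the body runs once per item, from first to last.
columns = ["col_a", "col_b", "col_c"]  # hypothetical column names
visited = []
for column in columns:
    visited.append(column.upper())
    print(f"Now looking at {column}")
```

Collecting results in a list (`numbers`, `visited`) is optional; it is done here only so the effect of each pass through the loop can be inspected afterwards.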
@@ -667,21 +668,13 @@
" display(df[column].describe())"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "51cd25c3-2234-4d89-a715-4e5365d7c99a",
"metadata": {},
"outputs": [],
"source": []
},
{
"cell_type": "markdown",
"id": "ded54884-4bad-46ae-a82f-2a67936c57dd",
"metadata": {},
"source": [
"### Practice using a for loop\n",
"* Below, I have already aggregated the dataframe for you."
"* I have aggregated the dataframe for you."
]
},
{
@@ -758,7 +751,7 @@
"id": "a47dc93c-ab8b-4be7-a90d-3ca941e94050",
"metadata": {},
"source": [
"* Use the function to create a chart out of the aggregated dataset."
"* Use the function above to create a chart out of the aggregated dataset."
]
},
{
@@ -774,10 +767,12 @@
"id": "eff3b0be-7091-4995-b2b8-63d62bf9b6c4",
"metadata": {},
"source": [
"\n",
"* We have a couple of other columns left that still need to be visualized. \n",
"* This is the perfect case for using a for loop, since all we want to do is replace the column above with the two remainig columns. \n",
"* This is the perfect case for using a for loop, since all we want to do is replace the column above with the two remaining columns. \n",
"* Try this below! \n",
" * Hint: you'll have to wrap the function with `display()` to get your results."
" * You'll have to create a `list` that contains the rest of the columns.\n",
" * You'll have to wrap the function with `display()` to get your results."
]
},
{
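The hint in this cell (put the remaining columns in a list, then call the chart function inside a loop) might look roughly like the sketch below. Everything named here is an assumption for illustration: `make_chart` is a hypothetical stand-in for the notebook's chart function, the column names and numbers are invented, and a plain dictionary replaces the pandas dataframe so the example runs without extra libraries.

```python
def make_chart(data: dict, column: str) -> str:
    """Hypothetical stand-in for the notebook's chart function."""
    total = sum(data[column])
    return f"Chart of {column} (total: {total})"


# Made-up aggregated data; the real exercise uses a pandas dataframe.
aggregated = {
    "total_projects": [3, 5, 2],
    "total_funds": [100, 250, 75],
}

# Put the remaining columns in a list, then loop over that list.
remaining_columns = ["total_projects", "total_funds"]
charts = []
for column in remaining_columns:
    chart = make_chart(aggregated, column)
    charts.append(chart)
    print(chart)  # in a notebook, display(chart) plays this role
```

The explicit `print` (or `display()` in Jupyter) matters because a notebook only renders the last expression of a cell automatically; objects created inside a loop are not shown unless you ask for them.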
@@ -787,6 +782,33 @@
"metadata": {},
"outputs": [],
"source": []
},
{
"cell_type": "markdown",
"id": "35d78005-8c55-4fd9-90cc-2af30ed3fd6b",
"metadata": {},
"source": [
"## GitHub - Pull Requests\n",
"* In Exercise 1, you created a new branch that you are working on now.\n",
"* Now that you are done with Exercise 3, you are at a nice stopping point to commit your work to our `main` branch.\n",
"\n",
"**Steps**\n",
"1. Do the normal workflow of `committing` your work. \n",
"2. Navigate to the our `data-analyses` [repo over here](https://github.com/cal-itp/data-analyses).\n",
"3. Follow the steps detailed in [this video](https://youtu.be/nCKdihvneS0?si=nPlBOAMcgO1nv3v1&t=95). \n",
"4. Once you're done writing, scroll down the bottom and click `merge pull request` \n",
"<img src= \"./starter_kit_img.png\">\n",
"5. Your work is now merged into the `main` branch of our `data-analyses` repo. \n",
"6. To check, navigate to the our [repo](https://github.com/cal-itp/data-analyses) and to this `starter_kit` folder to make sure your notebooks are on the `main` branch.\n",
"7. Delete the branch `your_branch`. \n",
" * It's considered outdated now because your changes are on the `main branch`. In the terminal, paste `git branch -d your_branch`. \n",
" * If that doesn't work, paste `git branch -D your_branch`.\n",
"8. Continuing in the terminal, paste `git switch main`. \n",
"9. Paste `git pull origin main`. \n",
" * This pulls down the work you just uploaded, along with the other work your coworkers have committed onto the main branch. \n",
"9. Create a new branch `git switch -c your_branch` to continue working on exercises 4 and 5.\n",
" * Your new branch can have the same name as the branch you just merged in."
]
}
],
"metadata": {
22 changes: 18 additions & 4 deletions starter_kit/2024_basics_05.ipynb
@@ -20,11 +20,10 @@
" * This process of looping over variables to generate new notebooks is called parameterizing a notebook.\n",
" \n",
"**Resources**\n",
" * [Preparing notebooks for the portfolio](https://docs.calitp.org/data-infra/publishing/sections/4_notebooks_styling.html)\n",
" * [Publishing to the portfolio](https://docs.calitp.org/data-infra/publishing/sections/5_analytics_portfolio_site.html)\n",
"* [Preparing notebooks for the portfolio](https://docs.calitp.org/data-infra/publishing/sections/4_notebooks_styling.html)\n",
"* [Publishing to the portfolio](https://docs.calitp.org/data-infra/publishing/sections/5_analytics_portfolio_site.html)\n",
"\n",
"**Let's make a portfolio**\n",
"* Feel free to delete all the instruction markdown cells (including this one) off once you're done. \n",
"* Spoiler alert! Your end result will look something like [this](https://ha-starterkit-district--cal-itp-data-analyses.netlify.app/readme)."
]
},
@@ -269,7 +268,7 @@
"**Step 12: Something not right?**\n",
"* What if something is a little off? After updating your code, rerun this line of code to redo your portfolio. \n",
" * You must always `clean` your portfolio before regenerating new notebooks. \n",
"` python portfolio/portfolio.py clean REPLACE_YML_NAME && python portfolio/portfolio.py build REPLACE_YML_NAME --deploy`\n",
"`python portfolio/portfolio.py clean REPLACE_YML_NAME && python portfolio/portfolio.py build REPLACE_YML_NAME --deploy`\n",
"* There are many other specifications you can add to `python portfolio/portfolio.py build`, detailed on [DDS Other Specifications](https://docs.calitp.org/data-infra/publishing/sections/5_analytics_portfolio_site.html#other-specifications). "
]
},
@@ -297,6 +296,21 @@
"* Make sure you retain all the `\t` spaces! \n",
"* At the root of the repo run `Make build_starterkit_LASTNAME`.\n"
]
},
{
"cell_type": "markdown",
"id": "847603b7-da59-4757-8302-08f1485dd2cd",
"metadata": {},
"source": [
"**Step 14: Remove your Portfolio**\n",
"* Let's remove your portfolio from analysis.calitp.org.\n",
"* Make sure you're at the root of our repo.\n",
"* Run `pip install -r portfolio/requirements.txt`. \n",
"* Run `python portfolio/portfolio.py clean REPLACE_YML_NAME` to delete all the new notebooks.\n",
"* Run `cd portfolio/sites` and `git rm REPLACE_YML_NAME.yml` to remove the yml file.\n",
"* `cd ../..` back to the root of our repo.\n",
"* Run `python portfolio/portfolio.py index --deploy --prod` which will refresh analysis.calitp.org."
]
}
],
"metadata": {
Binary file added starter_kit/starter_kit_img.png

0 comments on commit 7b579b0