diff --git a/Makefile b/Makefile
index d65fded4f..ea54dbb8b 100644
--- a/Makefile
+++ b/Makefile
@@ -75,14 +75,12 @@ build_starterkit_ha:
pip install -r portfolio/requirements.txt
make build_portfolio_site
make git_check_no_sections
- make production_portfolio
build_starterkit_LASTNAME:
$(eval export site = YOUR_SITE_NAME)
pip install -r portfolio/requirements.txt
make build_portfolio_site
make git_check_no_sections
- make production_portfolio
build_fund_split:
$(eval export site = sb125_fund_split_analysis)
diff --git a/portfolio/sites/gtfs_digest_testing.yml b/portfolio/sites/gtfs_digest_testing.yml
deleted file mode 100644
index 518a3a732..000000000
--- a/portfolio/sites/gtfs_digest_testing.yml
+++ /dev/null
@@ -1,39 +0,0 @@
-directory: ./gtfs_digest/
-notebook: ./gtfs_digest/03_report.ipynb
-parts:
-- chapters:
- - caption: District 04 - Oakland
- params:
- district: 04 - Oakland
- sections:
- - organization_name: Alameda-Contra Costa Transit District
- - organization_name: Capitol Corridor Joint Powers Authority
- - organization_name: Central Contra Costa Transit Authority
- - organization_name: City and County of San Francisco
- - organization_name: City of Fairfield
- - organization_name: City of Menlo Park
- - organization_name: City of Petaluma
- - organization_name: City of Union City
- - organization_name: City of Vacaville
- - organization_name: Eastern Contra Costa Transit Authority
- - organization_name: Emeryville Transportation Management Agency
- - organization_name: Livermore-Amador Valley Transit Authority
- - organization_name: Marin County Transit District
- - organization_name: Mission Bay Transportation Management Agency
- - organization_name: Mountain View Transportation Management Association
- - organization_name: Napa Valley Transportation Authority
- - organization_name: Peninsula Corridor Joint Powers Board
- - organization_name: Presidio Trust
- - organization_name: San Francisco Bay Area Rapid Transit District
- - organization_name: San Francisco International Airport
- - organization_name: San Mateo County Transit District
- - organization_name: Santa Clara Valley Transportation Authority
- - organization_name: Solano County Transit
- - organization_name: Sonoma County
- - organization_name: Sonoma-Marin Area Rail Transit District
- - organization_name: Stanford University
- - organization_name: University of California, Berkeley
- - organization_name: Western Contra Costa Transit Authority
-
-readme: ./gtfs_digest/README.md
-title: GTFS Digest
diff --git a/starter_kit/2024_basics_01.ipynb b/starter_kit/2024_basics_01.ipynb
index 9a838649e..3d7020cba 100644
--- a/starter_kit/2024_basics_01.ipynb
+++ b/starter_kit/2024_basics_01.ipynb
@@ -5,7 +5,7 @@
"id": "247e773f-0e29-4ed6-ab4d-5856325611b4",
"metadata": {},
"source": [
- "# Exercise 1: `pandas`,`python`, `f-strings`, Importing and Exporting data.\n",
+ "# Exercise 1: `Git`, `pandas`, `python`, `f-strings`, Importing and Exporting data.\n",
"If you are new to Python, there are many resources to help you! Below is just a small sample of what is available.\n",
"* There are introductory Python courses available through [Caltrans's LinkedIn Learning Library](https://www.linkedin.com/learning/search?keywords=python&u=36029164).\n",
"* [Practical Python for Data Science](https://www.practicalpythonfordatascience.com/00_python_crash_course) is an incredibly helpful resource. Material from it is linked throughout.\n",
@@ -23,6 +23,28 @@
"* While the values in we are working with today are all fake, the exercise is based on the actual data and work we've done. "
]
},
+ {
+ "cell_type": "markdown",
+ "id": "41f1a8ae-23c0-42ba-8b50-5a1014d75fd0",
+ "metadata": {},
+ "source": [
+ "## GitHub - Making a Branch\n",
+ "* You are probably on the `main` branch of our `data-analyses` repo. \n",
+ "* The `main` branch is [here](https://github.com/cal-itp/data-analyses).\n",
+ "* We never work on the `main` branch. \n",
+ "* You can think of the `main` branch as an area that contains our work only when it's at a good stopping point.\n",
+ "* We typically save (or the proper term is `commit`) our work to the `main` branch at the end of the work week.\n",
+ "* The rest of the time, we work on our own branches. \n",
+ "* Let's make (or rather `check out`) our own branch.\n",
+ "\n",
+ "**Steps**\n",
+ "1. Go to the terminal.\n",
+ "2. Paste `git pull origin main`, which pulls down the latest work from the `main` branch. \n",
+ "3. Paste `git switch -c your_branch` in the terminal. Swap out `your_branch` with something else.\n",
+ " * We typically name branches with all lowercase. Instead of `Amanda_Branch`, write `amanda_branch`.\n",
+ "4. Your terminal should now show `jovyan@jupyter-your_name ~/data-analyses (your_branch) $ ` which means you successfully made your new branch!"
+ ]
+ },
{
"cell_type": "markdown",
"id": "4dd32eed-55a4-4fd1-874b-02f9b4bd94a7",
@@ -650,13 +672,23 @@
"id": "69d211b4-89f0-4b2c-9093-1118114ba649",
"metadata": {},
"source": [
- "## You're almost done!\n",
- "* Name this notebook `YOURNAME_exercise1.ipynb`\n",
+ "## Git - `Committing` Code\n",
+ "* In the terminal, paste `git mv 2024_basics_01.ipynb your_new_notebook.ipynb`. \n",
+ " * This renames your notebook.\n",
" * You can't right click and rename the file, since this notebook is tracked with Git. \n",
- " * Rename it using `git mv OLDNAME.ipynb NEWNAME.ipynb`. \n",
" * The `mv` stands for move, and renaming a file is basically \"moving\" its path. \n",
- " * Doing it this way retains the git history associated with the notebook. If you rename directly with right click, rename, you destroy the git history.\n",
- "* Use a descriptive commit message (ex: adding chart, etc). GitHub already tracks who makes the commit, the date, the timestamp of it, the files being affected, so your commit message should be more descriptive than the metadata already stored."
+ " * Renaming with `git mv` retains the git history associated with the notebook.\n",
+ " * If you rename directly with right click > rename, you destroy the git history.\n",
+ "* In the terminal, paste `git add your_new_notebook.ipynb`. \n",
+ " * This adds your new notebook.\n",
+ " * To add all files with a certain extension, write `git add *.ipynb`.\n",
+ "* Continuing in the terminal, paste `git commit -m 'write a message here'`\n",
+ " * This details the work you did this particular coding session. \n",
+ " * A typical message would be: `git commit -m 'added charts'` or `git commit -m 'worked on exercise 1'`\n",
+ " * GitHub already tracks the change's date and timestamp, the files being affected, who made the change, and more, so you don't need to include these details in your message.\n",
+ "* Finally, in the terminal, paste `git push origin your_branch`.\n",
+ " * This pushes up your change to the remote `data-analyses` repo onto your own branch.\n",
+ " * Now, all your work is safely stored on and recorded by GitHub."
]
}
],
diff --git a/starter_kit/2024_basics_03.ipynb b/starter_kit/2024_basics_03.ipynb
index 58be931be..0a96096ce 100644
--- a/starter_kit/2024_basics_03.ipynb
+++ b/starter_kit/2024_basics_03.ipynb
@@ -5,7 +5,7 @@
"id": "3f74a524-f90a-4ad5-8d98-368afc398b46",
"metadata": {},
"source": [
- "# Exercise 3: Strings, Functions, If Else, For Loops"
+ "# Exercise 3: Strings, Functions, If Else, For Loops, Git"
]
},
{
@@ -631,6 +631,7 @@
"source": [
"## For Loops \n",
"* For Loops are one of the greatest gifts of Python. \n",
+ "* A for loop runs the same block of code once for each item in a list, from the beginning to the end. \n",
"* Below is a simple for loop that prints out all the numbers in range of 10.\n"
]
},
@@ -667,21 +668,13 @@
" display(df[column].describe())"
]
},
- {
- "cell_type": "code",
- "execution_count": null,
- "id": "51cd25c3-2234-4d89-a715-4e5365d7c99a",
- "metadata": {},
- "outputs": [],
- "source": []
- },
{
"cell_type": "markdown",
"id": "ded54884-4bad-46ae-a82f-2a67936c57dd",
"metadata": {},
"source": [
"### Practice using a for loop\n",
- "* Below, I have already aggregated the dataframe for you."
+ "* I have aggregated the dataframe for you."
]
},
{
@@ -758,7 +751,7 @@
"id": "a47dc93c-ab8b-4be7-a90d-3ca941e94050",
"metadata": {},
"source": [
- "* Use the function to create a chart out of the aggregated dataset."
+ "* Use the function above to create a chart out of the aggregated dataset."
]
},
{
@@ -774,10 +767,12 @@
"id": "eff3b0be-7091-4995-b2b8-63d62bf9b6c4",
"metadata": {},
"source": [
+ "\n",
"* We have a couple of other columns left that still need to be visualized. \n",
- "* This is the perfect case for using a for loop, since all we want to do is replace the column above with the two remainig columns. \n",
+ "* This is the perfect case for using a for loop, since all we want to do is replace the column above with the two remaining columns. \n",
"* Try this below! \n",
- " * Hint: you'll have to wrap the function with `display()` to get your results."
+ " * You'll have to create a `list` that contains the rest of the columns.\n",
+ " * You'll have to wrap the function with `display()` to get your results."
]
},
{
@@ -787,6 +782,33 @@
"metadata": {},
"outputs": [],
"source": []
+ },
+ {
+ "cell_type": "markdown",
+ "id": "35d78005-8c55-4fd9-90cc-2af30ed3fd6b",
+ "metadata": {},
+ "source": [
+ "## GitHub - Pull Requests\n",
+ "* In Exercise 1, you created a new branch that you are working on now.\n",
+ "* Now that you are done with Exercise 3, you are at a nice stopping point to merge your work into our `main` branch.\n",
+ "\n",
+ "**Steps**\n",
+ "1. Do the normal workflow of `committing` your work: `git add`, `git commit -m`, then `git push origin your_branch`. \n",
+ "2. Navigate to our `data-analyses` [repo over here](https://github.com/cal-itp/data-analyses).\n",
+ "3. Follow the steps detailed in [this video](https://youtu.be/nCKdihvneS0?si=nPlBOAMcgO1nv3v1&t=95). \n",
+ "4. Once you're done writing, scroll down to the bottom and click `merge pull request`. \n",
+ "\n",
+ "5. Your work is now merged into the `main` branch of our `data-analyses` repo. \n",
+ "6. To check, navigate to our [repo](https://github.com/cal-itp/data-analyses) and to this `starter_kit` folder to make sure your notebooks are on the `main` branch.\n",
+ "7. In the terminal, paste `git switch main`. \n",
+ " * Git won't delete a branch while you have it checked out, so switch off `your_branch` first.\n",
+ "8. Paste `git pull origin main`. \n",
+ " * This pulls down the work you just uploaded, along with the other work your coworkers have committed onto the main branch. \n",
+ "9. Delete the branch `your_branch`, which is outdated now that your changes are on the `main` branch. \n",
+ " * Paste `git branch -d your_branch`. If that doesn't work, paste `git branch -D your_branch`.\n",
+ "10. Create a new branch `git switch -c your_branch` to continue working on exercises 4 and 5.\n",
+ " * Your new branch can have the same name as the branch you just merged in."
+ ]
}
],
"metadata": {
diff --git a/starter_kit/2024_basics_05.ipynb b/starter_kit/2024_basics_05.ipynb
index f53ee454f..a13e6c179 100644
--- a/starter_kit/2024_basics_05.ipynb
+++ b/starter_kit/2024_basics_05.ipynb
@@ -20,11 +20,10 @@
" * This process of looping over variables to generate new notebooks is called parameterizing a notebook.\n",
" \n",
"**Resources**\n",
- " * [Preparing notebooks for the portfolio](https://docs.calitp.org/data-infra/publishing/sections/4_notebooks_styling.html)\n",
- " * [Publishing to the portfolio](https://docs.calitp.org/data-infra/publishing/sections/5_analytics_portfolio_site.html)\n",
+ "* [Preparing notebooks for the portfolio](https://docs.calitp.org/data-infra/publishing/sections/4_notebooks_styling.html)\n",
+ "* [Publishing to the portfolio](https://docs.calitp.org/data-infra/publishing/sections/5_analytics_portfolio_site.html)\n",
"\n",
"**Let's make a portfolio**\n",
- "* Feel free to delete all the instruction markdown cells (including this one) off once you're done. \n",
"* Spoiler alert! Your end result will look something like [this](https://ha-starterkit-district--cal-itp-data-analyses.netlify.app/readme)."
]
},
@@ -269,7 +268,7 @@
"**Step 12: Something not right?**\n",
"* What if something is a little off? After updating your code, rerun this line of code to redo your portfolio. \n",
" * You must always `clean` your portfolio before regenerating new notebooks. \n",
- "` python portfolio/portfolio.py clean REPLACE_YML_NAME && python portfolio/portfolio.py build REPLACE_YML_NAME --deploy`\n",
+ "`python portfolio/portfolio.py clean REPLACE_YML_NAME && python portfolio/portfolio.py build REPLACE_YML_NAME --deploy`\n",
"* There are many other specifications you can add to `python portfolio/portfolio.py build`, detailed on [DDS Other Specifications](https://docs.calitp.org/data-infra/publishing/sections/5_analytics_portfolio_site.html#other-specifications). "
]
},
@@ -297,6 +296,21 @@
"* Make sure you retain all the `\t` spaces! \n",
"* At the root of the repo run `Make build_starterkit_LASTNAME`.\n"
]
+ },
+ {
+ "cell_type": "markdown",
+ "id": "847603b7-da59-4757-8302-08f1485dd2cd",
+ "metadata": {},
+ "source": [
+ "**Step 14: Remove your Portfolio**\n",
+ "* Let's remove your portfolio from analysis.calitp.org.\n",
+ "* Make sure you're at the root of our repo.\n",
+ "* Run `pip install -r portfolio/requirements.txt`. \n",
+ "* Run `python portfolio/portfolio.py clean REPLACE_YML_NAME` to delete all the new notebooks.\n",
+ "* Run `cd portfolio/sites` and `git rm REPLACE_YML_NAME.yml` to remove the yml file.\n",
+ "* Run `cd ../..` to go back to the root of our repo.\n",
+ "* Run `python portfolio/portfolio.py index --deploy --prod`, which will refresh analysis.calitp.org."
+ ]
}
],
"metadata": {
diff --git a/starter_kit/starter_kit_img.png b/starter_kit/starter_kit_img.png
new file mode 100644
index 000000000..7a8f4f8b9
Binary files /dev/null and b/starter_kit/starter_kit_img.png differ