Updating week 5 after lecture 5A.
ian-mitchell committed Feb 5, 2025
1 parent 330c495 commit 4722847
Showing 4 changed files with 89 additions and 6 deletions.
Binary file added files/Lec09_WebAsData.pdf
Binary file not shown.
62 changes: 62 additions & 0 deletions notes/week05/billboard_demo.py
@@ -0,0 +1,62 @@
# This is a copy of the solution for A_demo.py from the Class 5A workspace.

import pandas as pd
import billboard

# We will write some code to get the billboard 100 for last week using the billboard.py package

chart = billboard.ChartData('hot-100','2024-02-06')
# print(chart) # messy but complete!

song = chart[0]
print(song.title)

# The loop below didn't work: it prints the same artist every time. Why?
# Hint: look at what song is/was
for i in range(len(chart)):
    print(song.artist)

# Print out only the chart artists!
for song in chart:
    print(song.artist)

# Task: Create a list of dictionaries
# with each dictionary being one song
# The attributes we want are:
# title, artist, peakPos, lastPos, weeks, rank, isNew

chart_songs = []

for song in chart:
    testdict = {}

    testdict['title'] = song.title
    testdict['artist'] = song.artist
    testdict['peakPos'] = song.peakPos
    testdict['lastPos'] = song.lastPos
    testdict['weeks'] = song.weeks
    testdict['rank'] = song.rank
    testdict['isNew'] = song.isNew

    chart_songs.append(testdict)

print(type(chart_songs),len(chart_songs)) # it works!

# Task: Convert list of dictionaries to a dataframe

df = pd.DataFrame(chart_songs)

print(type(df),len(df)) # Good! It's a dataframe!

print(df)

# filtering examples
# print only rows where weeks < 10

print(df[df['weeks']<10])

# Task: Export to CSV
df.to_csv('top100.csv')

# Export again, this time without the index column
df.to_csv('top100.csv', index=False)
30 changes: 25 additions & 5 deletions notes/week05/class5A.md
@@ -1,20 +1,40 @@
# Class Meeting 5A

Due to a weather disruption, today's class was presented remotely via Zoom. Here is the [link to the Zoom recording](https://ubc.zoom.us/rec/share/xfO_r8AAmjtVhAdAuCTZJrJwpozieJMpxeFOeiKZ2XTqPAkWQexp3qypifZOxtYc.w0hkvsDLdlRNdkW7); you will need the passcode `n4vvf$WB` to access it. *Note:* There is no Panopto recording for today's lecture.


The slides from today's class are embedded below.
Feel free to download them to keep them locally, or leave them archived here and just bookmark them.
We will leave the website open even after the course is over for a reasonable number of years.

<div>
<iframe src="../../Lec09_WebAsData.pdf" width="100%" height="600px" frameBorder="0"> </iframe>
</div>

[Download the Slides from today](https://github.com/ubc-cs/cpsc203/raw/main/files/Lec09_WebAsData.pdf)

During the lecture we played around with the `ChartData` and `ChartEntry` classes from the `billboard` module. I wrote and deleted a lot of code as we explored the VSCode environment and these classes, but here is the final version of the code. It should print one line for each entry in the Hot-100 list showing the current position and the position for the previous week.

```python
import pandas as pd
import billboard

# We will write some code to get the billboard 100 for last week using the billboard.py package
chart = billboard.ChartData('hot-100','2025-01-04')

# enumerate counts from 0 by default, so start at 1 to match chart positions
for i, s in enumerate(chart.entries, start=1):
    print("position " + str(i) + ", last week: " + str(s.lastPos))
```

Your task to complete *before* Thursday's lecture is:
* Start with the data in `chart.entries: List[ChartEntry]`, which has one `ChartEntry` object for each song in the Hot-100 list.
* Create a variable `chart_songs: List[Dict]` with one dictionary for each song in the Hot-100 list.
* Each dictionary should have one key for each attribute of the corresponding `ChartEntry` object: `title`, `artist`, `peakPos`, `lastPos`, `weeks`, `rank`, `isNew`. The value for each key should be the value for the corresponding attribute for that song.

In other words, you are converting from `ChartEntry` objects to dictionary objects. On Thursday we will load those dictionaries into a Pandas dataframe and then play around with the data.
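If you want to test the conversion without a network connection, here is one possible sketch. It is not the official solution: `FakeEntry` is a hypothetical stand-in for `ChartEntry` that only assumes the seven attribute names listed above, the two entries are made-up data, and `entry_to_dict` and `ATTRIBUTES` are names invented for this illustration.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class FakeEntry:
    """Hypothetical stand-in for billboard.ChartEntry (same attribute names)."""
    title: str
    artist: str
    peakPos: int
    lastPos: int
    weeks: int
    rank: int
    isNew: bool


# The attribute names listed in the task above.
ATTRIBUTES = ["title", "artist", "peakPos", "lastPos", "weeks", "rank", "isNew"]


def entry_to_dict(entry) -> Dict:
    """Copy each listed attribute of one entry into a new dictionary."""
    return {name: getattr(entry, name) for name in ATTRIBUTES}


# Made-up data so the sketch runs without contacting Billboard.
entries: List[FakeEntry] = [
    FakeEntry("Song A", "Artist A", 1, 2, 12, 1, False),
    FakeEntry("Song B", "Artist B", 2, 0, 1, 2, True),
]

# One dictionary per entry, in chart order.
chart_songs: List[Dict] = [entry_to_dict(e) for e in entries]
print(type(chart_songs), len(chart_songs))
```

With the real package you would loop over `chart.entries` instead of the made-up `entries` list. Using `getattr` with a list of names is only one design choice; writing out one assignment per key is just as valid and may be easier to read.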

## Links for today

- [beautifulsoup library](https://pypi.org/project/beautifulsoup4/) and [Beautiful Soup documentation](https://www.crummy.com/software/BeautifulSoup/bs4/doc/).
- [billboard library](https://github.com/guoguo12/billboard-charts).
- [LOTS More on Pandas](https://firas.moosvi.com/courses/data301/2022_WT2/notes/week05/Class5A/Class5A.html) (courtesy of Dr. Moosvi's former DATA 301 course at UBC-O).
- [beautifulsoup library](https://pypi.org/project/beautifulsoup4/) (and [Beautiful Soup documentation](https://www.crummy.com/software/BeautifulSoup/bs4/doc/)).
- [billboard library](https://github.com/guoguo12/billboard-charts) (and [The Python Package Index version](https://pypi.org/project/billboard.py/))

<!--
## Important links for today:
3 changes: 2 additions & 1 deletion notes/week05/class5B.md
@@ -10,7 +10,8 @@ We will leave the website open even after the course is over for a reasonable nu

[Download the Slides from today](https://github.com/ubc-cs/cpsc203/raw/main/files/Lec10_Plotting_DataFrames.pdf)

## Optional links for today
## Links for today
- [LOTS More on Pandas](https://firas.moosvi.com/courses/data301/2022_WT2/notes/week05/Class5A/Class5A.html) (courtesy of Dr. Moosvi's former DATA 301 course at UBC-O).

<!--
## Important links for today:
