County vs. state information for USA: total or remainder? #79

Open
emmanuelle opened this issue Mar 23, 2020 · 1 comment
Labels
bug Something isn't working

Comments

@emmanuelle

The Johns Hopkins dataset mixes state-level and county-level rows in the Province/State column. For some states (e.g. California) there is a row for the state as well as rows for some of its counties, although most of the time the county-level numbers are 0. It is not clear whether the state row is the total for the whole state or only the remainder not attributed to a county, so it is not clear how the group by should be performed in this case.

@emmanuelle added the bug label on Mar 23, 2020
@emmanuelle

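For example, for New York the county rows are all zeros on the dates shown, while the single state row carries the counts: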

In [50]: df[df['Province/State'].str[-2:] == 'NY']                                                                                  
Out[50]: 
             Province/State Country/Region      Lat     Long  1/22/20  ...  3/18/20  3/19/20  3/20/20  3/21/20  3/22/20
274      Suffolk County, NY             US  40.9849 -72.6151        0  ...        0        0        0        0        0
275       Ulster County, NY             US  41.8586 -74.3118        0  ...        0        0        0        0        0
285     Rockland County, NY             US  41.1489 -73.9830        0  ...        0        0        0        0        0
286     Saratoga County, NY             US  43.0324 -73.9360        0  ...        0        0        0        0        0
308       Nassau County, NY             US  40.6546 -73.5594        0  ...        0        0        0        0        0
319     New York County, NY             US  40.7128 -74.0060        0  ...        0        0        0        0        0
332  Westchester County, NY             US  41.1220 -73.7949        0  ...        0        0        0        0        0

[7 rows x 65 columns]

In [51]: df[df['Province/State'] == 'New York']                                                                                     
Out[51]: 
   Province/State Country/Region      Lat     Long  1/22/20  1/23/20  ...  3/17/20  3/18/20  3/19/20  3/20/20  3/21/20  3/22/20
99       New York             US  42.1657 -74.9481        0        0  ...     1706     2495     5365     8310    11710    15793

[1 rows x 65 columns]
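
One possible way to keep the two levels from being mixed in a group by is sketched below. It does not settle whether the state row is a total or a remainder. It only separates county rows from state rows so they can be compared. The tiny frame reuses three rows from the output above, reduced to two date columns. The rule that a county row ends in a comma followed by a two-letter code is an assumption based on the rows shown here, not something the dataset documents.

```python
import pandas as pd

# Three rows copied from the output above, reduced to two date columns.
df = pd.DataFrame(
    {
        "Province/State": ["New York", "Suffolk County, NY", "Nassau County, NY"],
        "Country/Region": ["US", "US", "US"],
        "3/21/20": [11710, 0, 0],
        "3/22/20": [15793, 0, 0],
    }
)
date_cols = ["3/21/20", "3/22/20"]

# Assumption: county rows are written as "<name>, <two-letter state code>".
# na=False treats missing Province/State values as "not a county".
is_county = df["Province/State"].str.contains(r", [A-Z]{2}$", regex=True, na=False)

# Aggregate each level separately so county rows are never added to state rows.
state_totals = df[~is_county].groupby("Province/State")[date_cols].sum()
county_totals = df[is_county][date_cols].sum()

print(state_totals)
print(county_totals)
```

If the county sums are zero while the state row is not, as in this New York sample, summing both levels gives the same number as the state row alone, so the ambiguity only matters for states where both levels are non-zero.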
