Bug report: EIA API for demand data is timing out #322

Open
1 task done
victoriahunt opened this issue Nov 4, 2022 · 4 comments
@victoriahunt

🪲

  • I have checked that this issue has not already been reported.

Bug summary

The EIA API is proposed as the source of demand data for the HIFLD project, as described in issue #293.
I am trying to retrieve data for the list of BAs created in issue #241.
During the week of Oct 31 - Nov 4, I have been attempting to use the EIA API to retrieve data for that BA list, with varying degrees of success: it times out at unpredictable intervals and throws an error when it does. It is also slow, taking >10 minutes per BA in some cases.

Code for reproduction

import getpass
import pandas as pd
from prereise.gather.demanddata.eia.get_eia_data import get_ba_demand

start = pd.to_datetime('2016-01-01 00:00:00')
end = pd.to_datetime('2016-12-31 23:00:00')

key = getpass.getpass(prompt='api_key=')
 #BA shp list is list of BAs from 
get_ba_demand(ba_shplist, start, end, key)

Actual outcome

Screen Shot 2022-11-04 at 3 31 55 PM
Other possible error I've observed: IncompleteRead: IncompleteRead(303104 bytes read)

Expected outcome

What it looks like downloading data from the API:
Screen Shot 2022-11-04 at 4 00 03 PM

Additional context

As of Nov 4, 2022, the API site shows multiple warnings that may be contributing to these issues.
There is a 'scheduled maintenance' on Nov 4, but the issue also happened on other days this week.
A likely more important warning reads: "Notice: EIA will discontinue support for its legacy API (APIv1) in November, 2022. Excel add-in v1 sheets will continue to function as they are. Please refer to our documentation for the APIv2 interaction methods and our APIv2 query browser to view the data."
This second warning may be contributing to the issues with the API and likely requires a long-term fix.
Screen Shot 2022-11-04 at 4 04 17 PM

@victoriahunt victoriahunt added bug Something isn't working hifld Related to ingestion of the HIFLD data labels Nov 4, 2022
@BainanXia
Collaborator

BainanXia commented Nov 4, 2022

Based on the discussion with @victoriahunt, the bug is caused by an unknown issue with the current EIA API that makes the download process unstable. According to the official notice, EIA APIv2 has been released and support for the current API will be discontinued after Nov. 2022. Hence, we will need to update the download function in the code base to reflect the change. According to the APIv2 documentation, it seems all we need to do is update the URL from "http://api.eia.gov/series/?api_key=" to "http://api.eia.gov/v2/series/?api_key=". However, this hasn't been tested yet.

@victoriahunt
Author

@BainanXia Unfortunately I have some new evidence that simply updating the URL doesn't work. I changed the API URL to the new one and got the following error when running the eastern_demand_v5_demo notebook, otherwise unchanged, on the develop branch:
Screen Shot 2022-11-08 at 10 23 54 AM

@rouille
Collaborator

rouille commented Nov 8, 2022

The documentation for the new API can be found here. The route has changed. It seems that demand data can be found through: https://api.eia.gov/v2/electricity/rto/region-sub-ba-data/?api_key=

It looks like only data from 06/15/2018 and later are available.
Screen Shot 2022-11-08 at 1 12 41 PM
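
For anyone who wants to try the new route, here is a rough sketch of building a query URL against it. The base route is the one quoted above; the trailing `data/` segment, the parameter names (`frequency`, `data[0]`, `start`, `end`, `offset`, `length`) and the hourly date format are assumptions based on my reading of the APIv2 documentation and have not been tested against this dataset. `build_v2_url` is a hypothetical helper, not something in prereise.

```python
from urllib.parse import urlencode

# Route quoted in the comment above; the trailing "data/" segment is an assumption.
BASE_URL = "https://api.eia.gov/v2/electricity/rto/region-sub-ba-data/data/"


def build_v2_url(api_key, start, end, offset=0, length=5000):
    """Build an hourly APIv2 query URL (parameter names are assumptions)."""
    params = [
        ("api_key", api_key),
        ("frequency", "hourly"),
        ("data[0]", "value"),  # ask for the data column itself
        ("start", start),      # assumed hourly format, e.g. "2018-06-15T00"
        ("end", end),
        ("offset", offset),    # paging: first row to return
        ("length", length),    # paging: number of rows per request
    ]
    return BASE_URL + "?" + urlencode(params)


url = build_v2_url("YOUR_KEY", "2018-06-15T00", "2018-06-16T00")
```

The paging parameters are there because a full year of hourly data may not come back in a single response, so the caller would likely need to loop over `offset`.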

@victoriahunt
Author

victoriahunt commented Nov 10, 2022

Looping through the BAs one at a time works with a short 'sleep' pause as follows (using the old API URL).


import time

listname = []
for ba in ba_shplist:
    # Request one BA at a time, then pause before the next request
    temp = get_ba_demand([ba], start, end, key)
    listname.append(temp)
    time.sleep(20)

Note that it takes between one and two minutes on average per BA using this code on my device, so running through 120+ BAs takes more than 2 hours -- but it doesn't time out. Of course this may not work if/once the API URL support is dropped by EIA.
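
One possible extension of the loop above is to retry a BA when a request fails instead of stopping the whole run. This is a minimal sketch, not tested against the EIA API: `fetch_with_retry` is a hypothetical helper, and the exception types it catches are an assumption about what the download actually raises (the `IncompleteRead` reported earlier in this issue is included).

```python
import time
from http.client import IncompleteRead


def fetch_with_retry(fetch, ba, retries=3, pause=20, sleep=time.sleep):
    """Call fetch(ba); on a transient error, wait and try again."""
    for attempt in range(1, retries + 1):
        try:
            return fetch(ba)
        except (IncompleteRead, TimeoutError, ConnectionError):
            if attempt == retries:
                raise  # give up after the last attempt
            sleep(pause * attempt)  # wait longer after each failure
```

In the loop it could be called as `fetch_with_retry(lambda ba: get_ba_demand([ba], start, end, key), ba)`. The `sleep` argument exists only so the pause can be replaced when testing.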
