Why are the Related bills getting removed and getting added back very frequently #147

Open
prajwal19988 opened this issue Mar 21, 2024 · 5 comments
Assignees
Labels
Investigating question Further information is requested

Comments

@prajwal19988

Hi Admin,
I hope you are doing well. I am from Quorum.
We use the API to work with bills and their related data, and it has been an excellent and extensive resource.
Lately, however, we have observed that the related bills section in the xlsx data dump is being updated very frequently.
Many new bills are being added, and many previous entries are being removed.

Could you please let us know why we are seeing this behaviour and whether there is a way forward?
Thank you!

@prajwal19988 prajwal19988 changed the title from "Related bills keep getting removed and getting added back frequently" to "Why are the Related bills getting removed and getting added back very frequently" Mar 21, 2024
@jonquandt jonquandt self-assigned this Mar 21, 2024
@jonquandt jonquandt added the question Further information is requested label Mar 21, 2024
@jonquandt
Member

Can you clarify what services and resources you are querying? Examples of requests will help us investigate. Thanks.

@prajwal19988
Author

Definitely. We download the bills' zip and check the individual xml files: https://www.govinfo.gov/bulkdata/BILLSTATUS/118/hr/BILLSTATUS-118-hr.zip
We are finding differences very often, on a daily basis, even though the bill action is quite old.
Example: BILLSUM-118hr2670.xml
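
To make "finding differences on a daily basis" concrete, here is a minimal sketch of one way to compare two daily downloads of the zip and list which xml members were added, removed, or modified. It is not the reporter's actual tooling: the function names `member_digests` and `changed_members` are hypothetical, and it assumes both archives are already on disk or in memory.

```python
import hashlib
import zipfile


def member_digests(zip_source):
    """Map each .xml member name in the archive to a SHA-256 digest of its bytes."""
    with zipfile.ZipFile(zip_source) as archive:
        return {
            name: hashlib.sha256(archive.read(name)).hexdigest()
            for name in archive.namelist()
            if name.endswith(".xml")
        }


def changed_members(old_zip, new_zip):
    """Return (added, removed, modified) member names between two snapshots.

    old_zip and new_zip may be file paths or binary file-like objects
    (hypothetical inputs: yesterday's and today's download of the same zip).
    """
    old = member_digests(old_zip)
    new = member_digests(new_zip)
    added = sorted(new.keys() - old.keys())
    removed = sorted(old.keys() - new.keys())
    # A member counts as modified when it exists in both but its bytes differ.
    modified = sorted(
        name for name in old.keys() & new.keys() if old[name] != new[name]
    )
    return added, removed, modified
```

A byte-level digest flags any change at all, including ones unrelated to related bills, so it only narrows down which files are worth inspecting further.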

@jonquandt
Member

jonquandt commented Mar 22, 2024

Thanks for this.
Are you using the API to pull the BILLSTATUS files, e.g. by calling https://api.govinfo.gov/collections/2024-03-21T00:00:00Z?offsetMark=*&pageSize=100&api_key=DEMO_KEY

or are you using the xml or json endpoints for bulkdata?

https://www.govinfo.gov/bulkdata/xml
https://www.govinfo.gov/bulkdata/json

The link you provided suggests the latter or at least that you are pulling the ZIP files for Congress/bill types on a recurring basis from the bulkdata repository.

The example xml file you referenced is a BILLSUM file, not a BILLSTATUS xml file. I want to better understand the specific issue you are seeing so we can troubleshoot.

In BILLSTATUS-118hr2670.xml, there is a relatedBills tag that lists a large number of relatedBills items. Are you saying that you are seeing items being removed from this list?

My initial guess is that there are changes on the upstream congress.gov API (the source for our BILLSTATUS and BILLSUM xml) that may be causing changes in the resulting xml.
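
For anyone trying to reproduce this, a minimal sketch of diffing the relatedBills list between two saved copies of the same BILLSTATUS file might look like the following. It assumes each entry is an `item` element under `relatedBills` with `congress`, `type`, and `number` children and no XML namespace; that layout is an assumption here and should be checked against a real file. The function names are hypothetical.

```python
import xml.etree.ElementTree as ET


def related_bill_keys(xml_text):
    """Collect a (congress, type, number) tuple for every relatedBills item."""
    root = ET.fromstring(xml_text)
    keys = set()
    # ASSUMPTION: entries are <item> children of a <relatedBills> element,
    # each carrying <congress>, <type>, and <number> child elements.
    for item in root.findall(".//relatedBills/item"):
        keys.add(
            tuple(
                (item.findtext(tag) or "").strip()
                for tag in ("congress", "type", "number")
            )
        )
    return keys


def diff_related_bills(old_xml, new_xml):
    """Report which related-bill entries appeared or disappeared between copies."""
    old = related_bill_keys(old_xml)
    new = related_bill_keys(new_xml)
    return {"added": sorted(new - old), "removed": sorted(old - new)}
```

Running this against two dated copies of one file gives a short list of the specific bills that were added or removed, which is the kind of example that would help narrow down whether the change originates upstream.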

@prajwal19988
Author

prajwal19988 commented Mar 22, 2024

Yes, we are finding a large number of changes to related bills in the billsum file. Many items in related bills are being removed and a few others are being added on a daily basis.

@prajwal19988
Author

prajwal19988 commented Mar 29, 2024

Hi @jonquandt, the crux of the issue is that on Jan 25, an entry for H.R. 6056 was added to the related bills of H.R. 2670. The peculiarity is that neither of these bills had any recent updates. It was introduced in October 2023.

For example:
HR-3746: the newer update does not include the related bills that were in the older version.

Can you please take a look and let us know whether this is the expected behaviour and, if not, whether it can be rectified in the future?
