Increase updating frequency #2

Open
martinszy opened this issue Jun 16, 2014 · 1 comment

Comments

@martinszy

Right now, ogov-importer crawls the whole site to find out what has changed. This could be reduced to checking only projects introduced in the current or the previous calendar year, since older projects lose "parliamentary status" after one year and would (theoretically) never be updated again.
If this is implemented, perhaps we can increase the updating frequency of the data, since each run would not be so intensive.
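
A minimal sketch of the year filter proposed above. Both function names are hypothetical and are not part of ogov-importer; the sketch also assumes the year a project was introduced is available to the importer.

```javascript
// Hypothetical helper, not part of ogov-importer.
// Returns true when a project introduced in `introducedYear` could still
// change, i.e. it was introduced in the current or the previous calendar year.
function mayStillChange(introducedYear, now = new Date()) {
  const currentYear = now.getFullYear();
  return introducedYear === currentYear || introducedYear === currentYear - 1;
}

// Hypothetical helper: January 1st of the previous calendar year (local time),
// usable as the lower bound of the date window to re-import.
function importWindowStart(now = new Date()) {
  return new Date(now.getFullYear() - 1, 0, 1);
}
```

With a filter like this, each run would only revisit projects inside the two-year window, which is what would make more frequent runs affordable.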

@seykron
Owner

seykron commented Jun 16, 2014

Maybe we need a smarter heuristic to deal with progressive imports. Since we import full pages, we don't know which bills fall within a specific range, and the Congress search engine does not guarantee any particular order of results. (This is actually a known issue: the importer assumes that a query always retrieves the same set of bills, which may not be true. The only way to force a fresh import is to clear the cache.)

Anyway, if you want to import only one year, you can change the query in BillImporter.js. Just set the "fecha_inicio" parameter in the DATA_SOURCE constant to whatever you want.
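
As a rough illustration of that change, the sketch below sets a "fecha_inicio" value on a query URL. The base URL is a placeholder, `buildQuery` is a hypothetical helper, and the dd/mm/yyyy date format is an assumption; the real URL and format are whatever DATA_SOURCE in BillImporter.js already uses.

```javascript
// Placeholder endpoint. The real value is the DATA_SOURCE constant in
// BillImporter.js, which is not reproduced here.
const BASE_URL = "https://example.org/search";

// Hypothetical helper: returns the query URL with "fecha_inicio" set.
// The parameter name comes from the thread; the date format is assumed.
function buildQuery(baseUrl, startDate) {
  const url = new URL(baseUrl);
  url.searchParams.set("fecha_inicio", startDate);
  return url.toString();
}

// Example: restrict the import to bills introduced since the start of 2013.
const query = buildQuery(BASE_URL, "01/01/2013");
```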

Regarding the import process, the last round of performance tuning reduced the running time by 80%. If you already have bills in the cache, it takes ~10 minutes to process 100K bills on an Intel Core Duo. And since the memory leak was fixed, the whole process needs a steady ~150MB of memory and a steady system load of ~2.0.

I will think of a strategy for progressive updates... let me know if you have any ideas.
