Faster access to the list of dataflows #228

Open: viguice wants to merge 9 commits into master

viguice (Contributor) commented Jul 7, 2023

My first tests showed that the app was slower than I would like when loading all dataflows from a provider.
To improve performance, I have:

  • Added the allstubs option for detail, which results in a smaller file when downloading all dataflows. The drawback is that the DSD identifiers are not loaded at this stage, so I have removed them from the list; the DSD information is now loaded only when a particular dataflow is selected.
  • Forced compression of the response. I noticed that Eurostat can compress the response, but this was not working in the app. It is now fixed and the response is very fast. In addition, I think the custom configuration for Eurostat could be removed: compression seems to be enough, and I am not sure the asynchronous message is still used (or mandatory).
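As a rough illustration of the two changes above, here is a minimal sketch of a stub-only dataflow query and gzip handling. It is not the code from this PR: the class and method names, and the `https://example.org/sdmx` endpoint, are placeholders. It assumes a provider that follows the SDMX 2.1 REST conventions (`detail=allstubs`) and that answers `Accept-Encoding: gzip` with `Content-Encoding: gzip`.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.zip.GZIPInputStream;
import java.util.zip.GZIPOutputStream;

class StubDataflowSketch {

    // Hypothetical helper (not the project's API): SDMX 2.1 REST query for all
    // dataflows of an agency, asking for stubs only so the response stays small.
    static String allStubsQuery(String endpoint, String agency) {
        return endpoint + "/dataflow/" + agency + "/all/latest?detail=allstubs";
    }

    // With HttpURLConnection, a client would call
    // setRequestProperty("Accept-Encoding", "gzip") before connecting and pass
    // getContentEncoding() here. The body is unwrapped only if the server compressed it.
    static InputStream decode(InputStream body, String contentEncoding) {
        try {
            return "gzip".equalsIgnoreCase(contentEncoding) ? new GZIPInputStream(body) : body;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Test helper: gzip a string so the decoding path can be tried without a network.
    static byte[] gzip(String text) {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (GZIPOutputStream out = new GZIPOutputStream(bytes)) {
            out.write(text.getBytes(StandardCharsets.UTF_8));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return bytes.toByteArray();
    }

    // Reads a stream fully as UTF-8 text and closes it.
    static String readAll(InputStream in) {
        try (InputStream src = in) {
            ByteArrayOutputStream out = new ByteArrayOutputStream();
            byte[] buf = new byte[4096];
            int n;
            while ((n = src.read(buf)) != -1) {
                out.write(buf, 0, n);
            }
            return new String(out.toByteArray(), StandardCharsets.UTF_8);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(allStubsQuery("https://example.org/sdmx", "ESTAT"));
        byte[] compressed = gzip("<Structure/>");
        System.out.println(readAll(decode(new ByteArrayInputStream(compressed), "gzip")));
    }
}
```

Because stubs omit the DSD reference, a caller following this approach would issue a second, single-dataflow query once the user picks a flow.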

The next steps would be:

  • To remove children as references from createStructureQuery (full) and add descendants to createDataflowQuery (full) instead. This would allow getting the DSD and the attached codelists in one call, but it requires adapting the DataflowParser.
  • To add referencepartial as detail when retrieving the dataflow with descendants. This would retrieve only the codes relevant to this dataflow rather than all the codes. It works fine for Eurostat but, I am afraid, not for all providers. Where it does not work, the alternative is to filter using the codes returned by the schema endpoint when it is available (possible for ECB, where referencepartial is not allowed). As a last resort, the filtering could be done on the client side using serieskeysonly results (with the possibility of mimicking the availability endpoint when it is not available).
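The fallback order described in these next steps could be sketched as follows. This is only an illustration under assumptions: the class, the enum, and the two capability flags are hypothetical and not part of the project, and the URLs follow the SDMX 2.1 REST conventions (`references=descendants`, `detail=referencepartial`, `detail=serieskeysonly`) against a placeholder endpoint.

```java
class CodeFilterSketch {

    // Hypothetical names for the three ways of restricting codes to one dataflow.
    enum Strategy { REFERENCE_PARTIAL, SCHEMA_ENDPOINT, SERIES_KEYS_ONLY }

    // Dataflow plus its DSD and codelists in one call; with partial = true the
    // provider is asked to return only the codes used by this dataflow.
    static String dataflowWithDescendants(String endpoint, String agency,
                                          String flowId, boolean partial) {
        String detail = partial ? "referencepartial" : "full";
        return endpoint + "/dataflow/" + agency + "/" + flowId
                + "/latest?references=descendants&detail=" + detail;
    }

    // Last-resort query: series keys only, to be reduced to used codes client side.
    static String seriesKeysQuery(String endpoint, String flowId) {
        return endpoint + "/data/" + flowId + "/all?detail=serieskeysonly";
    }

    // Prefer referencepartial, then the schema endpoint, then client-side filtering.
    // Both flags are assumed per-provider capabilities.
    static Strategy choose(boolean supportsReferencePartial, boolean hasSchemaEndpoint) {
        if (supportsReferencePartial) {
            return Strategy.REFERENCE_PARTIAL;
        }
        if (hasSchemaEndpoint) {
            return Strategy.SCHEMA_ENDPOINT;
        }
        return Strategy.SERIES_KEYS_ONLY;
    }

    public static void main(String[] args) {
        System.out.println(dataflowWithDescendants("https://example.org/sdmx", "ESTAT", "FLOW_ID", true));
        System.out.println(choose(false, true));
        System.out.println(seriesKeysQuery("https://example.org/sdmx", "FLOW_ID"));
    }
}
```

Keeping the choice in one place would let each provider declare its capabilities once, rather than scattering provider-specific checks through the parsers.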

amattioc (Owner) previously approved these changes Sep 6, 2023

Hi @viguice,

Thanks a lot for the time you spent on this. Compression is now managed in the provider properties, using the file that was introduced recently. Every provider will need to declare whether it supports compression.

Regarding the allstubs option, it has to be managed in a more granular way: in the helper, the DSDs are needed because you may want to check which flows are related to the same structure. In getflows calls outside the helper, it can be useful for improving response time. I'll try to add this behaviour in the next releases.

I have also already incorporated the many small code fixes that you included in the code; thanks for those as well!

Attilio
