I was reading save_stories_from_feed in tasks.py, and it appears to make one database call per feed entry to check for duplicates.
The per-entry normalized_url_exists check could be replaced by a single database query that checks all feed entries at once.
There could be a function called getValidFeedEntries that applies the logic already in save_stories_from_feed for skipping invalid entries.
A single database call would then identify the duplicates, followed by one bulk insert and one commit.
If this sounds reasonable, I can give it a try. Is this likely to become the bottleneck of the current implementation?
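
To make the proposal concrete, here is a minimal sketch of the batched flow using the standard-library sqlite3 module. It is not the project's actual code: the `stories` table, its `title` and `normalized_url` columns, the dict-shaped feed entries, and the validity rule (an entry needs a title and a URL) are all assumptions made for illustration, and `get_valid_feed_entries` is a hypothetical snake_case stand-in for the getValidFeedEntries helper suggested above.

```python
import sqlite3


def get_valid_feed_entries(entries):
    """Hypothetical filter: keep entries that have a title and a URL.

    The real skip logic lives in save_stories_from_feed and may differ.
    """
    return [e for e in entries if e.get("title") and e.get("normalized_url")]


def find_existing_urls(conn, urls):
    """Return the subset of `urls` already stored, using one query."""
    if not urls:
        return set()
    # One "?" placeholder per URL. Very large batches would need chunking,
    # because SQLite caps the number of bound parameters per statement.
    placeholders = ",".join("?" for _ in urls)
    rows = conn.execute(
        f"SELECT normalized_url FROM stories "
        f"WHERE normalized_url IN ({placeholders})",
        urls,
    ).fetchall()
    return {row[0] for row in rows}


def save_stories_bulk(conn, entries):
    """Filter, de-duplicate, and insert a batch; return the rows inserted."""
    valid = get_valid_feed_entries(entries)

    # Drop duplicates inside the batch itself, keeping the first occurrence.
    by_url = {}
    for entry in valid:
        by_url.setdefault(entry["normalized_url"], entry)

    # Single round trip to find URLs that are already in the database.
    existing = find_existing_urls(conn, list(by_url))
    new_rows = [
        (entry["title"], url)
        for url, entry in by_url.items()
        if url not in existing
    ]

    # The connection context manager commits once, or rolls back on error.
    with conn:
        conn.executemany(
            "INSERT INTO stories (title, normalized_url) VALUES (?, ?)",
            new_rows,
        )
    return len(new_rows)
```

With this shape, a feed of N entries costs one SELECT and one bulk INSERT instead of N existence checks. A unique constraint on the URL column would still be worth keeping as a safety net, since another worker could insert the same URL between the SELECT and the INSERT.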