-
Notifications
You must be signed in to change notification settings - Fork 73
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Mongo can handle larger databases now #174
base: master
Are you sure you want to change the base?
Conversation
Made changes to mongodb backend logic so that it no longer creates multiple queries to backend. Now mongodb backend constructs one filter from the request and queries data from mongodb just once. Previously multiple queries were made and filtering was done based on document IDs returned from those queries, this was bad practice since mongodb would have to go trough all documents multiple times filtering them using a huge list of IDs returned from previous queries. Using just one filter allows mongodb to handle the single query internally without having to match document IDs to a sometimes very large array of IDs.
This pull request doesn't contain a test, since triggering the bug mentioned in Issue #173 |
Here is a sample of an alternative approach I came up with which uses windowing (this is for finding max versions): [
{
"$setWindowFields": {
"partitionBy": "$id",
"output": {
"_max_version": {
"$max": "$_manifest.version"
}
}
}
},
{
"$match": {
"$expr": {
"$eq": [
"$_manifest.version",
"$_max_version"
]
}
}
},
{
"$unset": [
"_max_version"
]
}
] I don't know how it would compare in terms of efficiency. Perhaps it is more efficient since it doesn't require sorting all of the documents first? |
I've been facing this exact same problem recently as I've had to account for a worst case configuration in Sentinel of having a connector set to pull all indicators once a minute, I've tested your change and the difference is like night and day. I'm able to pull ~11k indicators in 1.3 seconds now. Many thanks for this @Halk0 |
Changes to matching versions based on aproach @chisholm had taken. Sorting documents while filtering versions was removed, this should have a improving effect on performance. I also removed unnecessary loop that I added, since it does the same thing as the line below it, only difference being it was more complicated. This also should have a performance increasing effect
@chisholm thats nice, I made some changes based your approach! |
Made changes to mongodb backend logic so that it no longer creates multiple queries to backend. Now mongodb backend constructs one filter from the request and queries data from mongodb just once. Previously multiple queries were made and filtering was done based on document IDs returned from those queries, this was bad practice since mongodb would have to go trough all documents multiple times filtering them using a huge list of IDs returned from previous queries. Using just one filter allows mongodb to handle the single query internally without having to match document IDs to a sometimes very large array of IDs.
Closes #173