QueuedTracking setup on multiple servers (new guide) #134
Comments
Hello |
+1 |
Forgot to mention we are using matomo 3.14.1 |
Hi @okossuth @danielsss |
Just a suggestion: if we used lpop, or better blpop, that would eliminate potential race conditions, allow the use of a single shared queue, and let any number of workers process the same queue with no need for complicated locking. It would also scale to any level. Our workers stopped for a day and now we have a 60GB queue that we are trying to catch up with, but it's taking forever because only one worker can process each queue. The main downside is that if processing of the popped data fails, there are no retries. However, I don't think that's a big deal, and even if it is, we can work around it by adding the data back to the beginning of the list, or into a failed queue. |
Thanks @uglyrobot, the problem is less about Redis and more about Matomo and how it tracks data. There's a related issue in core, e.g. matomo-org/matomo#6415. Basically, if two workers were working on the same queue and one worker processed the second tracking request of a visit slightly faster than the other worker processed the first tracking request, Matomo could store wrong data in its database and sometimes even create multiple visits. |
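To illustrate the ordering constraint described in the comment above, here is a minimal sketch of why keeping every request of a visitor in one queue, with one worker per queue, preserves the order of that visitor's requests. The hash-based routing, the function names, and `NUM_QUEUES = 8` are assumptions made for illustration only; this is not Matomo's or QueuedTracking's actual routing code.

```python
import hashlib
from collections import defaultdict

NUM_QUEUES = 8  # assumption for this sketch


def queue_for_visitor(visitor_id: str, num_queues: int = NUM_QUEUES) -> int:
    """Hypothetical routing: map a visitor ID to one fixed queue index."""
    digest = hashlib.md5(visitor_id.encode("utf-8")).hexdigest()
    return int(digest, 16) % num_queues


def route(requests, num_queues: int = NUM_QUEUES):
    """Group (visitor_id, action) pairs into per-queue lists, keeping arrival order."""
    queues = defaultdict(list)
    for visitor_id, action in requests:
        queues[queue_for_visitor(visitor_id, num_queues)].append((visitor_id, action))
    return queues


requests = [
    ("visitor-a", "pageview 1"),
    ("visitor-b", "pageview 1"),
    ("visitor-a", "pageview 2"),
]
queues = route(requests)

# All of visitor-a's requests sit in a single queue, in arrival order, so a
# single worker on that queue cannot process "pageview 2" before "pageview 1".
for queue_id, items in sorted(queues.items()):
    print(queue_id, items)
```

With a shared queue and several workers popping from it, nothing would stop two workers from picking up consecutive requests of the same visit and finishing them out of order, which is the situation the comment warns about.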
I seem to be having an issue with the following command, which is stopping me from executing this correctly: ./console queuedtracking:process --queue-id=X When running ./console queuedtracking:process --queue-id=0, specifically for queue-id=0, it doesn't work. I get this error: ERROR [2020-07-06 09:10:58] 4700 Uncaught exception: C:\inetpub\wwwroot\vendor\symfony\console\Symfony\Component\Console\Input\ArgvInput.php(242): The "--queue-id" option requires a value. It works fine for "./console queuedtracking:process --queue-id=1". Is this a known issue, or am I doing something incorrectly? |
@StevieKay90 could you send us the output of your system check see https://matomo.org/faq/troubleshooting/how-do-i-find-and-copy-the-system-check-in-matomo-on-premise/ ? The output should be anonymised automatically. |
Thanks for quick response! its here; |
Hi Thomas, I've just found out that if you set ./console queuedtracking:process --queue-id=00 it works, good help from the community! One thing which is vexing me though is why queue 0 seems to be the most full; it's not evenly distributing the load. The other queues have just a handful of requests in them, but queue 0 has over 200. |
Thanks for this. I still can't reproduce it just yet. @sgiehl any chance you have a Windows machine running Matomo and can try to reproduce this? I'm wondering if it's maybe Windows related. |
@tsteur don't have a matomo running directly on windows. But I could check if my Windows VM where I had set this up once is still running. But I guess it's already outdated and I would need to set it up again. Let me know if it's important enough to spend time on it. |
@StevieKay90 could you remove the |
@tsteur I have done. I'm not using the command line at all now, I'm using the "Process during tracking request" option |
@StevieKay90 it will likely catch up and process these requests. If, overall, it always pushes more requests into the first queue, that might be because a lot of the requests come from the same IP address or a lot of them use the same visitorId or userId (if the userId feature is used). It's possible that simply the visits in the queue |
btw you could maybe also try |
@tsteur the command --queue-id=00 seems to work on Windows to process queue 0. However, the problem I'm now suffering from is way deeper (like you, I thought this was the issue, but now I don't think it is). Previously, not stating an ID did actually process queue 0; it's just that |
@tsteur OK, I've done some research and have some very interesting findings! Forcing queue ID 0: This worker finished queue processing with 3.2req/s (150 requests in 46.91 seconds) So it's not that more requests are being routed to queue ID 0; it's that the processing time of this specific queue is incredibly slow in comparison to the others! UPDATE: I have now opted for 16 workers, as I figured that the relative speed of the other 15 would counterbalance the slow-moving queue 0. However, queue 0 is now performing a lot better (figuratively speaking, at about 12-20req/s) but queue number 6 is now the naughty boy! There was nothing especially wrong in the verbose process output when I processed this queue manually, only the fact that it was slow and I could read most of the lines as they went by, when normally it's just a black and white fuzzy blur. |
@StevieKay90 any chance you're using our log analytics, for example, to track / import data? This would explain why more requests go into the first queue and why it's slower, since every request might consist of multiple tracking requests. Or if you do custom tracking with bulk tracking requests, that would explain it too. That another queue now has more entries would likely be expected if you're not using the regular JS tracker. It'd be great to know how you track the data @StevieKay90 |
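As a quick check of the figures quoted in the comment above, the following sketch recomputes the reported rate and estimates how long a backlog would take to drain at that rate. The helper names and the `incoming_rate` parameter are hypothetical; the 150 requests and 46.91 seconds come from the quoted worker output, and the backlog of 200 is the rough queue size mentioned earlier in the thread.

```python
def requests_per_second(requests: int, seconds: float) -> float:
    """Throughput of one worker run."""
    return requests / seconds


def seconds_to_drain(backlog: int, rate: float, incoming_rate: float = 0.0) -> float:
    """Time to empty a queue; infinite if requests arrive at least as fast as they are processed."""
    net_rate = rate - incoming_rate
    if net_rate <= 0:
        return float("inf")
    return backlog / net_rate


rate = requests_per_second(150, 46.91)
print(f"{rate:.1f} req/s")  # 3.2 req/s, matching the worker output

# Backlog of about 200 requests, assuming no new requests arrive meanwhile.
print(f"{seconds_to_drain(200, rate):.0f} seconds to drain")
```

If new requests keep arriving for that queue faster than the worker processes them, `seconds_to_drain` returns infinity, which corresponds to the "pool of data which can't be cleared fast enough" described later in the thread.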
Thanks for the response thomas. All data is from the regular JS tracker.
It looks like I’m going to have to return to matomo 3 to check if it was
the upgrade which changed the queued tracker process.
Currently with QT switched on I eventually get a pool of data in a
queue which can’t be cleared fast enough and without QT I get a lot of
strain on the db server
|
Let us know how you go with the downgrade to Matomo 3. Generally, there wasn't really any change though in queued tracking so I don't think it would make a difference. Be interesting to see though. |
@tsteur is queued tracking compatible with PHP 8, out of interest? |
AFAIK it should be @StevieKay90 |
Hi, we are using QueuedTracking on 3 frontend servers, each with 24 cores, and a backend (DB + Redis) with 128 cores and 1TB RAM. We have 16 queues and 10 requests per batch, processing 6 queues on the 1st frontend and 5 queues each on the second and third frontends. Each queue processor is hitting ~80% CPU, but the frontend servers still have spare CPU power. Is it possible to increase the number of queues beyond 16 to get even more performance? Do you have any other advice for increasing QueuedTracking capacity here? |
Hi @bitactive. I'm sorry you're experiencing issues. Sadly, 16 is currently the maximum number of queues supported. You could try adjusting the number of requests processed in each batch. I believe that the default is 25. Any other recommendations @AltamashShaikh ? |
@bitactive We would recommend increasing the number of requests here |
@snake14 @AltamashShaikh Increased the number of requests from 10 per batch to 25 per batch. Now each of the 16 workers is at ~80% CPU and total throughput (processed requests per second) increased by ~15%. We are still not able to process the queue in real time during peak hours with 16 workers, each at 80% CPU on 3.8GHz cores. What are further possible steps to increase efficiency, e.g. by an additional 100%? We track one big website and have nearly unlimited resources for this (machines / CPU cores / memory). |
@bitactive What if you change the number of requests to 50? |
@snake14 @AltamashShaikh Changing requests per batch to 50 gives another 10-15% throughput increase. Will try 100 soon as traffic increases. Meanwhile, I have another question about this configuration. If I would like to add a second big project to this Matomo instance, is it possible to configure it so for example matomo project As far as I know, different Matomo projects can be processed independently, so it should be possible to direct requests from one project to one Redis queue and from the second project to another Redis queue, and then process them independently with another 16 workers? |
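For reference, a rough sketch of how the batch-size gains reported in the comments above combine. The only inputs are the approximate percentages quoted in this thread (~15% for 10 to 25 requests per batch, 10-15% for 25 to 50); the helper name and the reading of "an additional 100%" as doubling the throughput measured at 25 per batch are assumptions for illustration.

```python
def compound(gains):
    """Combine successive relative gains, e.g. [0.15, 0.10] -> 1.15 * 1.10."""
    total = 1.0
    for gain in gains:
        total *= 1.0 + gain
    return total


# Throughput relative to the original 10-requests-per-batch setup.
after_25 = compound([0.15])
after_50_low = compound([0.15, 0.10])
after_50_high = compound([0.15, 0.15])

# Assumed goal: double the throughput measured at 25 per batch.
target = 2.0 * after_25

print(f"after 50 per batch: {after_50_low:.2f}x to {after_50_high:.2f}x of the original")
print(f"still missing: {target / after_50_high:.2f}x to {target / after_50_low:.2f}x")
```

On these numbers, larger batches alone leave a sizeable gap to the stated goal.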
Hi @bitactive . I'm glad that helped. As far as I can tell, each Matomo instance would need a separate Redis database. Can you confirm @AltamashShaikh ? |
@bitactive You can specify the database if you want to use the same Redis for 2 instances |
Here are some notes I wrote earlier and thought it would be useful to put in the FAQ maybe?
How do I set up QueuedTracking on multiple tracking servers?
Say you have
then on each of your 4 frontend servers, you need to:
./console queuedtracking:process --queue-id=X
Where X is the queue ID. Each server handles 2 queues. So the 4 servers handle the 8 queues.
Queue ID starts at 0.
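The assignment described above (8 queues over 4 servers, 2 queues each, IDs starting at 0) can be sketched as follows. The helper name and the choice of contiguous ID ranges per server are assumptions for illustration; any split should work as long as each queue ID is processed by exactly one worker. Only the `./console queuedtracking:process --queue-id=X` command itself comes from the notes.

```python
def queue_ids_for_server(server_index: int, num_servers: int = 4, num_queues: int = 8):
    """Return the queue IDs (starting at 0) that one server should process."""
    if num_queues % num_servers != 0:
        raise ValueError("this sketch assumes queues divide evenly across servers")
    if not 0 <= server_index < num_servers:
        raise ValueError("server_index out of range")
    per_server = num_queues // num_servers
    start = server_index * per_server
    return list(range(start, start + per_server))


# Print the command to run on each server, one line per queue.
for server in range(4):
    for queue_id in queue_ids_for_server(server):
        print(f"server {server}: ./console queuedtracking:process --queue-id={queue_id}")
```

With the defaults, server 0 gets queues 0 and 1, server 1 gets queues 2 and 3, and so on up to queue 7 on server 3.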
Notes:
./console queuedtracking:monitor
to track the state of the queue