-
-
Notifications
You must be signed in to change notification settings - Fork 2.6k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Dependent archives don't get invalidated when invalidating reports making them show outdated data #18772
Comments
@justinvelluppillai that shouldn't have anything to do with #18790 |
thanks @sgiehl . Will leave this open for now then. |
@tsteur @sgiehl The ticket is pretty hard to read so apologies if I missed something. I understand that we're archiving a number of reports for future data, hence we're wasting CPU on reports that just show zeroes. Am I correct in understanding we would save up a lot of CPU with this, or would this hardly make a difference? Let me know if I'm missing any other value that would be added by working on this ticket. I'm just wondering if we need to do this in 5.1.0 or whether it should simply be pushed much further, let's say 5.5.0 or such. |
I would need to think that through and check the code to be able to say may. But it might possibly be already fixed with #18773. So maybe it would for now be enough to plan in some time to check if that issue is still reproducible or if it is already fixed. |
The bug would need to be confirmed. If confirmed, the problem is that we're at times showing 0 data when there actually is data. And at times we may be showing outdated/inaccurate data. That applies to reports that rely on dependent data like "Goals new visitors / returning visitors data" and some media analytics metrics (eg it might fix PG-599). |
We've closed the internal ticket relating to this and have a follow up ticket to review the PR and adjust the tests. |
refs DEV-2418 for more details.
refs #18773 which might fix this issue as well depending how it is fixed. If #18773 fixes it by fixing the invalidation, then this is also fixed.
I noticed when archiving a week, then in https://github.com/matomo-org/matomo/blob/4.7.1/core/Archive.php#L629-L643 we archive up to 2 days into the future. That means if today is
2022-02-09
, then when archiving this week we also archive2022-02-10
and2022-02-11
. I noticed this in the archive tables:In above screenshots we can see that the archive was created eg on 2nd Feb for a date in the future.
That means we create archives for in the future and of course these archives will have 0 data.
I assume that these archives in the future are also created for regular archives, not only dependent archives. However, I believe it works for these regular archives because they are invalidated again regularly and therefore the data is re-archived one or two days later. The logic for invalidating archives currently doesn't know which dependent archives exist with what segments. Meaning I'm assuming these get never invalidated.
I'm meaning calls like https://github.com/matomo-org/matomo/blob/4.7.1/plugins/Goals/Archiver.php#L124-L127
$this->getProcessor()->processDependentArchive('Goals', VisitFrequencyAPI::NEW_VISITOR_SEGMENT);
. These calls also happen in non-core plugins specifically MediaAnalytics for example.The goals of this issue is when invalidating any period (including any older period), then the dependent archives are also invalidated. Meaning if invalidating eg the "All Visits" segment, then it should also invalidate archives from dependent archives where it uses segments like
media_spent_time>0
and therefore archives for the flag ofdonec276dccc46ea23c1bffdd046127f41d6.%
needs to be invalidated too.Maybe we need to have some event in something like https://github.com/matomo-org/matomo/blob/4.7.1/core/DataAccess/Model.php#L119 to also trigger archives for segments that are used by
processDependentArchive
.The text was updated successfully, but these errors were encountered: