Proposal: option to provide a static url->title mapping for the "Page titles" report #234

mackuba · 2018-11-28T14:22:32Z

I want to use Matomo to track visits on my site, and I've decided to use log analytics only. I'm aware of the limitations: some things will be missing compared to JS tracking, such as resolution info, outgoing links and plugin support, and I'm fine with that.

However, I've realized that on a site that changes relatively rarely, like mine (a blog with a few articles posted per year at most), I can solve one of these things: the "Page titles" report. The titles are normally read in JS from the <title> tag, but I can instead provide a static list of url -> title mappings in a file passed as a parameter to the import script.

The file looks like this (the first = is treated as the separator and whitespace is trimmed from both sides, but of course we can change the format to e.g. CSV or something else):

https://mackuba.eu/ = mackuba.eu
https://mackuba.eu/2018/09/07/new-stuff-from-wwdc-2018/ = New stuff from WWDC 2018 &ndash; mackuba.eu
https://mackuba.eu/2018/07/10/dark-side-mac-2/ = Dark Side of the Mac: Updating Your App &ndash; mackuba.eu
https://mackuba.eu/2018/06/11/notifications-in-ios-12/ = What's new in notifications in iOS 12 &ndash; mackuba.eu
...
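To make the format concrete, here is a minimal sketch of how such a file could be parsed. It is not the code from this PR; the function name `parse_page_titles` is hypothetical, and the only rules assumed are the ones described above (split on the first `=`, trim whitespace on both sides).

```python
def parse_page_titles(lines):
    """Build a url -> title dict from 'url = title' lines.

    Hypothetical helper: splits each line on the first '=' only and
    trims whitespace on both sides, as described in the proposal.
    """
    titles = {}
    for line in lines:
        line = line.strip()
        if not line or "=" not in line:
            continue  # skip blank lines and lines without a separator
        url, title = line.split("=", 1)  # later '=' characters stay in the title
        titles[url.strip()] = title.strip()
    return titles
```

Because only the first `=` separates the two sides, a title may contain `=`, but a URL in the file may not. In practice this would be called with an open file, e.g. `parse_page_titles(open("page_titles.txt", encoding="utf-8"))`.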

Now, when I call import_logs.py with --page-titles-from=page_titles.txt, whenever the parser sees a hit with a URL such as https://mackuba.eu/2018/07/10/dark-side-mac-2/, it will set the action_name to "Dark Side of the Mac: Updating Your App – mackuba.eu", and so on. My "Page titles" report looks just like it does with the JS tracker, and I only need to remember to update the file whenever I post a new article (or better, automate it).
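The lookup itself could look roughly like the sketch below. This is an illustration rather than the actual change to import_logs.py: `title_for_hit` is a hypothetical name, and dropping the query string before the lookup is an assumption based on the title of the second commit in this PR.

```python
from urllib.parse import urlsplit, urlunsplit


def title_for_hit(url, titles):
    """Return the mapped title for a hit URL, or None if it is not listed.

    `titles` is a dict of url -> title, e.g. loaded from page_titles.txt.
    """
    parts = urlsplit(url)
    # Assumption: the query string (and fragment) is dropped before the lookup.
    bare_url = urlunsplit((parts.scheme, parts.netloc, parts.path, "", ""))
    return titles.get(bare_url)
```

A hit whose URL is not in the mapping returns `None`, in which case the importer would presumably leave action_name unset, as it does today.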

I believe quite a lot of sites using log analytics could use something like this. Depending on the size and type of the site, the file can be maintained manually, or built from a database of articles/pages using a script or an action on the server. In my case, I wrote a small script that loads my sitemap.xml file, goes through all the links listed there, fetches each page's HTML and extracts the <title> from it.
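For anyone who wants to generate the file the same way, here is a rough standard-library sketch of such a script. It is not the script mentioned above; all function names are hypothetical, and it assumes a standard sitemap using the sitemaps.org 0.9 namespace with one `<loc>` per page.

```python
import urllib.request
import xml.etree.ElementTree as ET
from html.parser import HTMLParser

SITEMAP_NS = {"sm": "http://www.sitemaps.org/schemas/sitemap/0.9"}


class TitleParser(HTMLParser):
    """Collects the text inside the first <title> element."""

    def __init__(self):
        super().__init__()
        self.in_title = False
        self.done = False
        self.chunks = []

    def handle_starttag(self, tag, attrs):
        if tag == "title" and not self.done:
            self.in_title = True

    def handle_endtag(self, tag):
        if tag == "title" and self.in_title:
            self.in_title = False
            self.done = True  # ignore any later <title> elements

    def handle_data(self, data):
        if self.in_title:
            self.chunks.append(data)


def extract_title(html):
    """Return the trimmed <title> text, or '' if the page has none."""
    parser = TitleParser()
    parser.feed(html)
    parser.close()
    return "".join(parser.chunks).strip()


def urls_from_sitemap(xml_text):
    """Return every <loc> URL found in the sitemap, in document order."""
    root = ET.fromstring(xml_text)
    return [loc.text.strip() for loc in root.findall(".//sm:loc", SITEMAP_NS) if loc.text]


def fetch_html(url):
    """Download a page and decode it using the charset from the response headers."""
    with urllib.request.urlopen(url) as resp:
        charset = resp.headers.get_content_charset() or "utf-8"
        return resp.read().decode(charset, errors="replace")


def build_page_titles(sitemap_xml, fetch=fetch_html):
    """Return 'url = title' lines for every sitemap URL that has a title."""
    lines = []
    for url in urls_from_sitemap(sitemap_xml):
        title = extract_title(fetch(url))
        if title:
            lines.append(f"{url} = {title}")
    return lines
```

The `fetch` parameter is there so the function can be tried against canned HTML without touching the network; writing the returned lines to page_titles.txt produces a file in the format shown earlier.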

mackuba added 2 commits January 26, 2019 15:22

option to read a list of page titles from a file

8ba3e87

look up the url without query part in page titles list

8c55c0d

mackuba force-pushed the page_titles branch from e7085a3 to 8c55c0d Compare January 26, 2019 13:22

tsteur changed the base branch from master to 3.x-dev January 13, 2020 22:45
