
albumsQuery: return flag indicating contiguous or noncontiguous groups in the album tracks #1272

Merged

Conversation

darrell-k (Contributor)

See #1269 and particularly #1269 (comment) for the design rationale.

@michaelherger (Member) commented Jan 1, 2025

Ok, I understand this solves part of the other (referenced) issue in a simpler way? I'm a bit concerned about the performance due to the additional queries. Would this be called on individual albums, or when fetching the album list?

Comment on lines 640 to 645
my $col = '(SELECT COUNT(1) FROM (SELECT 1 FROM tracks WHERE tracks.album=albums.id GROUP BY work,grouping,performance))';
$c->{$col} = 1;
$as->{$col} = 'group_count';
$col = "(SELECT GROUP_CONCAT(SUBSTR('00000'||tracknum,-5) || '->' || COALESCE(work,'') || '##' || COALESCE(performance,'') || '##' || COALESCE(grouping,''),',,') FROM tracks WHERE tracks.album = albums.id)";
$c->{$col} = 1;
$as->{$col} = 'group_structure';
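For context, a toy sketch of what the first subselect counts, run against a simplified tracks table (Python's sqlite3 here rather than the LMS Perl, and not the real LMS schema):

```python
import sqlite3

# Simplified stand-in for the LMS tracks table: two tracks in work 12,
# one in work 13, all on album 1.
con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE tracks (album INTEGER, tracknum INTEGER, "
            "work INTEGER, grouping TEXT, performance TEXT)")
con.executemany("INSERT INTO tracks (album, tracknum, work) VALUES (?,?,?)",
                [(1, 1, 12), (1, 2, 12), (1, 3, 13)])

# The group_count subselect from the diff, with albums.id fixed to 1.
(group_count,) = con.execute(
    "SELECT COUNT(1) FROM (SELECT 1 FROM tracks WHERE tracks.album=1 "
    "GROUP BY work,grouping,performance)").fetchone()
print(group_count)  # 2: one group per distinct (work, grouping, performance)
```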
@michaelherger (Member) commented Jan 1, 2025

Is this really necessary? Two additional requests to the tracks table, one of which would potentially process all tracks if looping through all albums? Why not just have a relation to the tracks table in the main request and process the raw data in code? Then at least you'd process all tracks (worst case) only once, and not more than once with this construct?

And would it be possible to store the results in a JSON object rather than a hard to parse (and even harder to read!) code/field? See https://www.sqlite.org/json1.html (I plan to officially drop MySQL support anyway...).

@michaelherger (Member)

JSON option? Replace the SELECT with the following:

SELECT json_group_object(tracks.tracknum, json_array(tracks.work, tracks.performance, tracks.grouping) ) 

Which would return something like:

{
	"1": [
		12,
		null,
		null
	],
	"2": [
		12,
		null,
		null
	],
	"3": [
		12,
		null,
		null
	],
	"4": [
		13,
		null,
		null
	],
	"5": [
		13,
		null,
		null
	],
	"6": [
		13,
		null,
		null
	]
}

instead of:

00001->12####,,00002->12####,,00003->12####,,00004->13####,,00005->13####,,00006->13####
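To see the difference concretely, here is the suggested query run against a toy table (Python's sqlite3, a simplified schema, and an SQLite build that has the JSON functions compiled in, which cannot be taken for granted):

```python
import json
import sqlite3

# Toy schema, not the LMS one: just enough columns to run the suggested query.
con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE tracks (tracknum INTEGER, work INTEGER, "
            "performance TEXT, grouping TEXT)")
con.executemany("INSERT INTO tracks VALUES (?,?,?,?)",
                [(1, 12, None, None), (2, 12, None, None), (3, 13, None, None)])

(raw,) = con.execute(
    "SELECT json_group_object(tracks.tracknum, "
    "json_array(tracks.work, tracks.performance, tracks.grouping)) "
    "FROM tracks").fetchone()
result = json.loads(raw)
print(result)  # {'1': [12, None, None], '2': [12, None, None], '3': [13, None, None]}
```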

@michaelherger (Member)

I have no clue whether there's any performance pro/con, but it would certainly simplify the remainder of the code and improve readability.

@darrell-k (Contributor Author)

> Is this really necessary? Two additional requests to the tracks table, one of which would potentially process all tracks if looping through all albums? Why not just have a relation to the tracks table in the main request and process the raw data in code? Then at least you'd process all tracks (worst case) only once, and not more than once with this construct?
>
> And would it be possible to store the results in a JSON object rather than a hard to parse (and even harder to read!) code/field? See https://www.sqlite.org/json1.html (I plan to officially drop MySQL support anyway...).

I don't think it matters much from a performance point of view: the database engine will be optimising, as I think my performance tests mentioned in #1269 proved. But the JSON idea is a good one. Presumably it would cause a hard failure (no albums list!) for MySQL users. Have you announced yet that 9.0 no longer supports MySQL?

@michaelherger (Member)

I'll keep asking questions... isn't the overall creation of that string overly complicated? Is the zero padding needed? The arrow as a separator, then double commas... Couldn't you do with one single separator for them all?

And wouldn't it be good enough to run that query only if group_count was larger than zero down around line 840? I bet the vast majority of albums wouldn't need it anyway. For all those albums this really is just unnecessary overhead.

@darrell-k (Contributor Author)

> I'll keep asking questions... isn't the overall creation of that string overly complicated? Is the zero padding needed? The arrow as a separator, then double commas... Couldn't you do with one single separator for them all?

We need two separate separators, so that the first split (on ",,") results in, for example, this array:

  "00001->####",
  "00002->####",
  "00003->55992####",
  "00004->####",
  "00005->55992####",
  "00006->####",
  "00007->55992####",
  "00008->####",
  "00009->####",
  "00010->55992####",

The zero-padded track numbers are required to ensure that the sorted array is in track number sequence: as per the SQLite documentation, the sequence of the data returned from GROUP_CONCAT is undefined.

Once we've got an array sorted on track number, we have no further use for the track number, so we extract the grouping string with a split on ->.

We then get rid of ^####$ so that we can easily ignore the "null group" - we don't care if tracks in the null group are non-contiguous, though we do care if the tracks of a "real" group are separated by "null group" tracks (as in the example above), so we must have the null group tracks in the array.

It's an ugly GROUP_CONCAT (made uglier by me trying to use separators which wouldn't be found in PERFORMANCE or GROUPING tags, which are text strings) but it results in a string which can be parsed fairly easily when processing in the results loop. (If only the json functions were available to us!)
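A hypothetical Python sketch of that parsing (the actual code is Perl; the names and the helper here are invented for illustration, and the null group is reduced to an empty string as described):

```python
# Hypothetical sketch of the group_structure parsing described above;
# this is NOT the actual LMS Perl code. Work 55992's tracks are split
# by a null-group track, so the album's groups are non-contiguous.
raw = "00002->####,,00001->55992####,,00003->55992####"

parts = sorted(raw.split(",,"))          # zero padding: lexicographic sort == track order
groups = [p.split("->", 1)[1] for p in parts]        # track numbers no longer needed
groups = ["" if g == "####" else g for g in groups]  # "####" is the null group

def groups_contiguous(ordered):
    """True unless a non-null group resumes after being interrupted."""
    closed = set()
    prev = None
    for g in ordered:
        if g == prev:
            continue
        if g != "" and g in closed:      # a real group reappeared after a break
            return False
        closed.add(prev)                 # the previous run of tracks is finished
        prev = g
    return True

print(groups_contiguous(groups))  # False
```

The null group is allowed to reappear, which is why only non-empty groups are checked against the set of finished runs.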

> And wouldn't it be good enough to run that query only if group_count was larger than zero down around line 840? I bet the vast majority of albums wouldn't need it anyway. For all those albums this really is just unnecessary overhead.

The issue there is that it would mean running a separate query, inside the results loop, for the albums so identified: more database accesses, even if only for a few albums.

As we found recently with the performance testing of the group_count PR, it was much more efficient in the main loop, and as I mentioned above, the performance with this PR is not materially worse, probably because the database engine is clever enough to optimise the two subselects on the same table/key into a single join under the covers.

@michaelherger (Member) commented Jan 2, 2025

SQLite is too smart to make us think harder 😉.

One more thought: SQLite allows you to create custom functions. The FTS plugin makes heavy use of that:

https://github.com/LMS-Community/slimserver/blob/public/9.1/Slim/Plugin/FullTextSearch/Plugin.pm#L657-L670

We could use that to implement basic JSON support... I'll give that a try!
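In Python's sqlite3 the equivalent of DBD::SQLite's sqlite_create_function looks like this (a minimal sketch of the idea, not the FTS plugin code; the function name is made up):

```python
import sqlite3

con = sqlite3.connect(":memory:")
# Register a custom scalar function callable from SQL, analogous to what
# DBD::SQLite's sqlite_create_function does in the FTS plugin.
con.create_function("REV", 1, lambda s: s[::-1])
(val,) = con.execute("SELECT REV('albums')").fetchone()
print(val)  # 'smubla'
```

The function runs inside the SQLite engine per row, so it avoids extra round trips to the database.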

@darrell-k (Contributor Author)

I did consider using a join to tracks instead of the two subselects (it's already joined sometimes, depending on the input criteria) but that might mess things up in hard to understand ways when we're grouping by performance/work/album rather than just album (ie when displaying album-works).

Also I did consider custom functions like in FTS (in this case I would have just had it return the contiguous flag directly), but that would end up running separate queries against the database, with performance implications, I think. Good idea in general, though.

@michaelherger (Member)

Are you set up to run more tests? I've created a PR against yours, using a custom function. I'd really be interested if that had a performance impact.

https://github.com/darrell-k/slimserver/pull/15/files

IMHO using a standard serialization tool would be preferable to some custom string parsing, because even if your separators were unlikely to be found in a potential string, they still could be. Using JSON would take care of that.

@darrell-k (Contributor Author)

Just commented on your commit: the logic's not there yet, but it's a good idea (and it doesn't hit the database again, which was my concern). Once I've sorted out the logic, I'll run the performance test again.

@darrell-k (Contributor Author)

@michaelherger , looks like the SQLite json functions are not available to us:

> The JSON functions and operators are built into SQLite by default, as of SQLite version 3.38.0 (2022-02-22). They can be omitted by adding the -DSQLITE_OMIT_JSON compile-time option. Prior to version 3.38.0, the JSON functions were an extension that would only be included in builds if the -DSQLITE_ENABLE_JSON1 compile-time option was included. In other words, the JSON functions went from being opt-in with SQLite version 3.37.2 and earlier to opt-out with SQLite version 3.38.0 and later.

(LMS on Debian Stable is running SQLite v3.22)
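A quick way to probe whether the SQLite at hand has the JSON functions compiled in (a Python sketch; on the Perl side the same SELECT through DBI would do):

```python
import sqlite3

# Try a trivial JSON call: it raises OperationalError ("no such function")
# on builds without the JSON1 functions.
con = sqlite3.connect(":memory:")
try:
    con.execute("SELECT json('[]')")
    has_json = True
except sqlite3.OperationalError:
    has_json = False
print(sqlite3.sqlite_version, has_json)
```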

@michaelherger (Member)

Argh... I got used to using them for the stats endpoint and similar. Didn't think twice about the availability. Thanks for checking! And sorry for sending you down the wrong trail...

@darrell-k (Contributor Author)

Useful to know about, even if we can't use them (yet).

@darrell-k (Contributor Author)

@michaelherger do you still want changes to this?

@michaelherger (Member)

> @michaelherger do you still want changes to this?

Please see my various questions in https://github.com/LMS-Community/slimserver/pull/1272/files#r1900383838

michaelherger added a commit to michaelherger/slimserver that referenced this pull request Jan 2, 2025
…instead of hand made serialization

I'd prefer to let an existing serialization tool do the job, rather than using a custom string concatenation which might still be subject to parsing failure.

@michaelherger (Member) left a comment

Let's get that tested. Thanks!

@michaelherger michaelherger merged commit f35d39a into LMS-Community:public/9.0 Jan 3, 2025