Implement site search with Pagefind #7485
Draft
I stumbled across #4271 while looking through old issues and thought it looked interesting, so I decided to take a shot at it.
This PR is a proof of concept rather than a ready-to-merge implementation, which is why I've opened it as a draft. It proves the possibility and outlines the general process, but leaves a fair bit to be desired in terms of implementation. It also raises the question of whether TWiR wants a search function at all, since it's been going fine without one for 2.5 years now, but I leave that question to the appropriate authorities.
I tested the updated build process in this PR on macOS and Windows through WSL. I tested the website on Chromium on a MacBook Pro, on Firefox on a Windows PC, on Firefox on Android, and on Safari on a simulated iPhone on a MacBook. In my experience, building the website took an additional 3-5 seconds with the indexing step to enable search functionality, and I didn't notice any additional or unusual slowness when loading the website.
If you've got any questions, comments, concerns, etc., don't hesitate to let me know.
Problem
The TWiR site previously had search functionality provided by Pelican's search plugin, which under the hood is powered by Stork. However, because of a bug that caused loading problems on iOS/iPad devices, the search functionality was temporarily removed and, as far as I can tell, hasn't been revisited since.
In diving into this issue, I began by reenabling the old search functionality and trying to build the site, but after spending an afternoon playing whack-a-mole with numerous errors, I still couldn't get it to build. I didn't find that too surprising, however, as Pelican's search plugin was last updated 2 years ago, and Stork officially ceased development 3 years ago. At this point I was left to find an alternative, and the developer of Stork helpfully recommends several. After looking over the recommendations, I landed on Pagefind as the best-looking alternative from my perspective.
Solution
Pagefind is a static site search binary that, like Stork, is written in Rust (yay!). Because static sites, by definition, don't have a server to handle search queries live, Pagefind works by "indexing" the site at build time and bundling that output into the static site, which a little JavaScript magic on the client's side turns into a functional search bar. Admittedly, it's pretty nifty.
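To make the "index at build time, query on the client" idea concrete, here is a toy sketch in Python. It is not how Pagefind stores or loads its index (Pagefind uses its own chunked binary format and a JavaScript/WASM client); the `PAGES` data and the `build_index` and `search` helpers are made up purely for illustration.

```python
import json
import re
from collections import defaultdict


def build_index(pages):
    """Build-time step: map each lowercase word to the sorted URLs containing it."""
    index = defaultdict(set)
    for url, text in pages.items():
        for word in re.findall(r"[a-z0-9]+", text.lower()):
            index[word].add(url)
    return {word: sorted(urls) for word, urls in index.items()}


def search(index, query):
    """Client-side step: return the URLs that contain every word of the query."""
    words = re.findall(r"[a-z0-9]+", query.lower())
    if not words:
        return []
    hits = set(index.get(words[0], []))
    for word in words[1:]:
        hits &= set(index.get(word, []))
    return sorted(hits)


# Hypothetical pages standing in for the rendered site.
PAGES = {
    "/issue-1/": "Rust 1.0 released with stable cargo",
    "/issue-2/": "Cargo gains workspaces and Rust adds async",
}

# At build time the index is serialized and shipped as a static file...
serialized = json.dumps(build_index(PAGES))
# ...and in the browser it is loaded back and queried with no server involved.
loaded = json.loads(serialized)
print(search(loaded, "async rust"))
```

The point of the sketch is only that all the expensive work happens once per build, and the query side needs nothing but the static file.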
Pagefind conveniently has a Python wrapper package on PyPI, which makes installation a breeze. However, Pagefind is a little too new for the `python:3.8.16` image that the build container is derived from, so I had to bump it up to the `python:3.9.25` image. I know that can theoretically cause compatibility issues, though I didn't encounter any in testing, even when bumping it as high as `python:3.11.14`.
The Pagefind settings out of the box are pretty reasonable, but adding a few tags to the site's HTML templates gives it a big boost, such as always sorting results by the most recent issue date.
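As a rough idea of what such a tag could look like, the sketch below renders a `<time>` element carrying Pagefind's `data-pagefind-sort` attribute, which registers the element's text as a sort value under the given key. The key name `date`, the helper `issue_sort_tag`, and the choice of a `<time>` element are assumptions for illustration, not necessarily what this PR's templates use.

```python
from datetime import date
from html import escape


def issue_sort_tag(issue_date: date) -> str:
    """Render an element whose text Pagefind can index as the `date` sort key."""
    iso = issue_date.isoformat()  # ISO dates sort correctly as plain strings
    return f'<time data-pagefind-sort="date">{escape(iso)}</time>'


print(issue_sort_tag(date(2024, 1, 3)))
```

The search side would still have to request a descending sort on that key for the newest issues to come first.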
Cons
With a significant feature implementation like this, there will always be some negative consequences, no matter how much I enjoy saying "zero-cost abstraction" into the mirror. The biggest potential problem is likely the image bump from `python:3.8.16-slim` to `python:3.9.25-slim` due to the possibility of compatibility issues, though as I said I didn't encounter any.
Additionally, the indexing process does add time to every build of the website, around 3-5 seconds in my experience. Some kind of `--dont-index` flag could probably be added to the justfile to suppress the indexing step during development if it's a major issue, though it never bothered me.
Probably the biggest current issue is simply the design. I used the prebuilt search bar bundled with Pagefind and inserted it where the old search bar was before removal. The bar's horizontal spacing is a little awkward, and the search result text doesn't seem to change color in dark mode, making it nigh unreadable in that situation. Furthermore, it might make more sense to put the search function on its own page and link to that page in the header. All worthwhile questions, but I decided that trying to address them before starting a discussion was putting the cart before the horse, so I left the implementation as-is.
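Returning to the indexing-time point above, one way the opt-out could work is a small post-build script that the justfile calls, sketched below. It assumes the Pagefind Python wrapper is installed and invoked as `python -m pagefind --site <dir>`; the `output` directory name and the script itself are hypothetical and not part of this PR.

```python
import argparse
import subprocess
import sys


def indexing_command(site_dir):
    """Command line for the Pagefind wrapper; `--site` points at the built HTML."""
    return [sys.executable, "-m", "pagefind", "--site", site_dir]


def parse_args(argv):
    parser = argparse.ArgumentParser(description="Post-build search indexing step")
    # "output" is an assumed directory name; adjust to wherever the site is built.
    parser.add_argument("--site", default="output", help="directory with the built HTML")
    parser.add_argument("--dont-index", action="store_true",
                        help="skip the Pagefind indexing step (e.g. during development)")
    return parser.parse_args(argv)


def main(argv=None):
    args = parse_args(argv)
    if args.dont_index:
        print("Skipping search indexing")
        return 0
    # Propagate Pagefind's exit code so a failed index fails the build.
    return subprocess.run(indexing_command(args.site)).returncode
```

Calling `main(["--dont-index"])` returns immediately without touching Pagefind, while `main([])` runs the indexer against the default directory.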