⚡ Bolt: Optimize generate-latest-post.js performance with two-pass algorithm #85
NickJLange wants to merge 1 commit into main
…read algorithm

This commit reduces the build time for Docusaurus environments with large chronologically ordered blog posts directories. Previously, the script executed `fs.readFileSync` and `gray-matter` parsing on *every* file newer than the currently tracked maximum while iterating through the directories.

The optimized algorithm splits the operation:

1. Pass 1: O(N) array iteration over filenames mapping them against a date regex (O(1) Memory, O(1) I/O reads).
2. Pass 2: A single `fs.readFileSync` and `matter(content)` operation for the exact most recent file identified in pass 1.

Includes associated documentation updates for `.jules/bolt.md`.

Co-authored-by: NickJLange <1529105+NickJLange@users.noreply.github.com>
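The two passes described in the commit message can be sketched roughly as follows. This is a minimal illustration, not the code from this PR: the function names `findLatestFilename` and `readLatestPost` are hypothetical, and the sketch assumes every post filename starts with a `YYYY-MM-DD` prefix, which may not match the regex the real script uses. The real script also runs `matter(content)` on the file it reads; that step is left out here so the sketch has no dependencies beyond Node's standard library.

```javascript
const fs = require("fs");
const path = require("path");

// Assumption: post filenames start with an ISO date, e.g. 2024-06-15-title.md.
const DATE_PREFIX = /^(\d{4}-\d{2}-\d{2})/;

// Pass 1: inspect names only. No file is opened in this function.
function findLatestFilename(filenames) {
  let latestName = null;
  let latestDate = "";
  for (const name of filenames) {
    const match = DATE_PREFIX.exec(name);
    // ISO dates sort correctly when compared as plain strings.
    if (match && match[1] > latestDate) {
      latestDate = match[1];
      latestName = name;
    }
  }
  return latestName;
}

// Pass 2: read exactly one file, the one selected by pass 1.
function readLatestPost(blogDir) {
  const latestName = findLatestFilename(fs.readdirSync(blogDir));
  if (latestName === null) return null; // no dated posts found
  const content = fs.readFileSync(path.join(blogDir, latestName), "utf8");
  return { filename: latestName, content };
}
```

Keeping pass 1 as a pure function over an array of names means it can be tested without touching the disk; only `readLatestPost` performs I/O.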
⚡ What:
Rewrote `scripts/generate-latest-post.js` to implement a two-pass algorithm.

- First pass: Identifies the latest blog post by evaluating only the filenames and paths, without opening any files.
- Second pass: Performs a single I/O read (`fs.readFileSync`) and a single frontmatter parse (`matter()`) strictly on the identified target post.

🎯 Why:
The original script performed synchronous, sequential I/O and frontmatter parsing on every post file that was more recent than the running maximum at that point in the iteration. In directories with many chronologically ordered posts, this resulted in up to O(N) file reads and blocking delays during `bun run build`.

📊 Impact:
Reduces disk reads during extraction of the latest post from up to O(N), depending on the order in which files are visited, to exactly one file read and parse. This speeds up both the `start` and `build` commands for large or heavily populated blog repos.

🔬 Measurement:
Run `bun run scripts/generate-latest-post.js` locally on a machine with a heavily populated directory (or artificially populate a mock directory) and measure execution time or I/O system calls using `strace` or standard profiling. Alternatively, run `bun run build` and observe the delay around the "Build preparation complete." message, before Docusaurus begins standard site aggregation.

PR created automatically by Jules for task 377610626140408433 started by @NickJLange
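The read-count argument in the Why and Impact sections can be illustrated with a small model that performs no I/O. It is a sketch under assumptions: `countReadsSinglePass` and `countReadsTwoPass` are hypothetical helpers that only model the behavior described above (the old script reading a file whenever it is newer than the running maximum), and the `YYYY-MM-DD` filename prefix is assumed rather than taken from the script.

```javascript
// Assumption: post filenames start with an ISO date.
const DATE_PREFIX = /^(\d{4}-\d{2}-\d{2})/;

// Models the old behavior: one read per file that beats the running maximum.
function countReadsSinglePass(filenames) {
  let reads = 0;
  let latestDate = "";
  for (const name of filenames) {
    const match = DATE_PREFIX.exec(name);
    if (match && match[1] > latestDate) {
      reads += 1; // the old script would read and parse this file
      latestDate = match[1];
    }
  }
  return reads;
}

// Models the new behavior: at most one read, for the overall winner.
function countReadsTwoPass(filenames) {
  return filenames.some((name) => DATE_PREFIX.test(name)) ? 1 : 0;
}

// Five posts visited oldest-first, the worst case for the single-pass model.
const ascending = Array.from({ length: 5 }, (_, i) => `2024-01-0${i + 1}-post.md`);

console.log(countReadsSinglePass(ascending)); // 5
console.log(countReadsTwoPass(ascending)); // 1
console.log(countReadsSinglePass([...ascending].reverse())); // 1
```

In this model the single-pass cost depends entirely on visit order, which is why a chronologically named directory listed in name order is the expensive case, while the two-pass count stays at one regardless of order.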
Summary by cubic

Speed up latest-post generation by switching `scripts/generate-latest-post.js` to a two-pass algorithm that reads and parses only one file. This cuts disk reads from O(N) to O(1) and reduces build time for large blogs.

- Only the latest post is read and parsed with `gray-matter`.
- Updates `.jules/bolt.md`, documenting the two-pass approach.

Written for commit 8d1f4a1. Summary will update on new commits.