F/schema optimize #143

fnlearner · 2025-10-21T08:25:32Z

Add test cases and optimize the restore method.
the assistant is here

Shared demos:
https://gemini.google.com/share/a906a1cb5532
https://gemini.google.com/share/339a5681f618
https://gemini.google.com/share/b0600d9292ff

gemini-code-assist · 2025-10-21T08:26:36Z

Summary of Changes

Hello @fnlearner, I'm Gemini Code Assist¹! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!

This pull request significantly enhances the schema optimization process by introducing a new restore utility function designed to decompress data efficiently. It includes robust testing, such as a synthetic benchmark to ensure the restored output matches the original schema, and specific tests to handle empty elements correctly. Furthermore, it updates the project's documentation to reflect the new compression strategy and integrates the optimized restore method into the main example.

Highlights

Restore Method Optimization: The core restore utility function has been optimized to efficiently decode compressed JSON structures back into their full schema representation, including handling abbreviated keys and numeric type mappings.
Comprehensive Test Cases: New test files (restore.benchmark.spec.ts and restore.empty-filter.spec.ts) have been introduced to validate the restore function's correctness, including a synthetic benchmark and specific tests for pruning empty elements.
Schema Compression Documentation: The prompt.md file has been updated with detailed instructions and mapping tables for abbreviated JSON keys and numeric type values, outlining the new schema compression strategy.
Example Usage Update: The example/main.tsx file now utilizes the optimized restore method with a new compressed JSON example (test_json7.json), demonstrating its integration.
Dependency Addition: The ajv library has been added as a development dependency to package.json, likely for schema validation in the new benchmark tests.
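As a rough illustration of the decoding idea described in the highlights, the sketch below expands abbreviated keys and numeric type codes recursively. The `KEY_MAP` and `TYPE_MAP` tables, the `Json` type, and the `expand` function are hypothetical names and values chosen for this example; the actual mappings are the ones documented in prompt.md and implemented in example/utils.ts.

```typescript
// Hypothetical abbreviation table; the real mapping lives in prompt.md.
const KEY_MAP: Record<string, string> = {
  t: 'type',
  v: 'value',
  p: 'phrases',
};

// Hypothetical numeric codes for the `type` field.
const TYPE_MAP: Record<number, string> = {
  1: 'text',
  2: 'entity',
};

type Json = null | boolean | number | string | Json[] | { [key: string]: Json };

// Recursively rewrites short keys to full keys and numeric type codes to names.
// Unknown keys and unknown codes pass through unchanged.
function expand(node: Json): Json {
  if (Array.isArray(node)) return node.map(expand);
  if (node === null || typeof node !== 'object') return node;
  const out: { [key: string]: Json } = {};
  for (const [shortKey, raw] of Object.entries(node)) {
    const key = KEY_MAP[shortKey] ?? shortKey;
    out[key] = key === 'type' && typeof raw === 'number' ? TYPE_MAP[raw] ?? raw : expand(raw);
  }
  return out;
}
```

Passing unknown keys through untouched keeps the sketch tolerant of uncompressed input; a strict mode could reject them instead.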

Using Gemini Code Assist

The full guide for Gemini Code Assist can be found on our documentation page. Here are some quick tips.

Invoking Gemini

You can request assistance from Gemini at any point by creating a comment using either /gemini <command> or @gemini-code-assist <command>. Below is a summary of the supported commands on the current page.

| Feature | Command | Description |
| --- | --- | --- |
| Code Review | `/gemini review` | Performs a code review for the current pull request in its current state. |
| Pull Request Summary | `/gemini summary` | Provides a summary of the current pull request in its current state. |
| Comment | `@gemini-code-assist` | Responds in comments when explicitly tagged, both in pull request comments and review comments. |
| Help | `/gemini help` | Displays a list of available commands. |

Customization

To customize the Gemini Code Assist for GitHub experience, repository maintainers can create a configuration file and/or provide a custom code review style guide (such as PEP-8 for Python) by adding files to a .gemini/ folder at the root of the repository. Detailed instructions can be found here.

Limitations & Feedback

Gemini Code Assist may make mistakes. Please leave feedback on any instances where its feedback is incorrect or counterproductive. You can react with 👍 and 👎 on @gemini-code-assist comments. If you're interested in giving feedback about your experience with Gemini Code Assist for GitHub and other Google products, sign up here.

You can also get AI-powered code generation, chat, as well as code reviews directly in the IDE at no cost with the Gemini Code Assist IDE Extension.

¹ Review the Privacy Notices, Generative AI Prohibited Use Policy, and Terms of Service, and learn how to configure Gemini Code Assist in GitHub here. Gemini can make mistakes, so double-check it and use code with caution.

gemini-code-assist

Code Review

This pull request introduces a significant optimization by defining a compressed JSON schema and implementing a restore function to decompress it, which will help reduce payload size. The addition of both a benchmark test and specific filter tests is excellent, providing strong confidence in the new functionality. My review focuses on the implementation of the restore utility. I've identified a type definition mismatch where the SectionSpec type is missing a title property that the restore function correctly adds. I've also noted a few opportunities for refactoring within the restore utility to reduce code duplication and improve maintainability. Overall, this is a great enhancement to the project.

gemini-code-assist · 2025-10-21T08:28:25Z

example/utils.ts

@@ -0,0 +1,615 @@
+// -------------------- Decoding map constants --------------------
+import { NarrativeTextSpec, SectionSpec } from '../src/schema/structure';


The restore function correctly adds a title property to section objects, which is consistent with the data in origin.json and xiaomi_origin.json. However, the SectionSpec type imported from ../src/schema/structure.ts does not include this property, leading to a type inconsistency. To ensure type safety and reflect the actual data structure, the SectionSpec and StandardSectionSpec type definitions should be updated to include the optional title property.

For example, in src/schema/structure.ts:

export type StandardSectionSpec = {
  title?: ParagraphSpec; // Or a more specific type for the title
  paragraphs?: ParagraphSpec[];
};

gemini-code-assist · 2025-10-21T08:28:25Z

example/utils.ts

+    // Coerce numeric origin strings to numbers
+    if (originalKey === 'origin' && typeof restoredValue === 'string' && /^\d+(\.\d+)?$/.test(restoredValue)) {
+      restoredValue = Number(restoredValue);
+    }


The logic to convert the origin property from a string to a number is useful. However, this same logic is duplicated in the restorePhrasesArray function (line 344). To avoid code duplication and ensure consistent behavior, this transformation could be centralized. A good approach would be to move this conversion logic into the decodeValue function, so it's automatically applied whenever a key named origin is decoded.
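One way to centralize the conversion is a small helper that both call sites (or decodeValue itself) can delegate to. This is a minimal sketch: `coerceOrigin` and `NUMERIC_STRING` are hypothetical names, and the real decodeValue signature in example/utils.ts may differ from what is assumed here.

```typescript
// Same pattern as in the diff above: non-negative integers or decimals such as "12" or "3.50".
const NUMERIC_STRING = /^\d+(\.\d+)?$/;

// Hypothetical helper; not part of the PR. Converts a numeric string stored
// under the `origin` key to a number and returns every other value unchanged.
function coerceOrigin(originalKey: string, value: unknown): unknown {
  if (originalKey === 'origin' && typeof value === 'string' && NUMERIC_STRING.test(value)) {
    return Number(value);
  }
  return value;
}
```

With a helper like this, the main restore loop and restorePhrasesArray would share one definition of what counts as a numeric origin, so the two paths cannot drift apart.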

gemini-code-assist · 2025-10-21T08:28:25Z

example/utils.ts

+function restoreBulletItem(raw: unknown, path: string, strict: boolean): BulletItemSpec | null {
+  if (!raw || typeof raw !== 'object') return null;
+  // Compressed form: { p:{ dt:1, i:[ {...phrase...} ] } } or { dt:1,i:[...] }
+  if ('p' in raw) {
+    const phrasesContainer = raw.p;
+    if (
+      phrasesContainer &&
+      typeof phrasesContainer === 'object' &&
+      'dt' in phrasesContainer &&
+      'i' in phrasesContainer
+    ) {
+      const items = Array.isArray((phrasesContainer as Record<string, unknown>).i)
+        ? ((phrasesContainer as Record<string, unknown>).i as unknown[])
+        : [];
+      const phrases = restorePhrasesArray(
+        items,
+        String(decodeValue('type', (phrasesContainer as Record<string, unknown>).dt) || 'text'),
+        { parentKey: 'p', path, strict },
+      );
+      const bulletItem: BulletItemSpec = { type: 'bullet-item', phrases };
+      // Handle nested bullets: compressed key bs -> subBullet paragraph (t:11), or a direct dt:32,i:[...] list
+      if ('bs' in raw && raw.bs && typeof raw.bs === 'object') {
+        const sub = raw.bs;
+        // Two forms are supported: { dt:32,i:[{p:{...}},...] , io:true } or { b:{ dt:32,i:[...] } }
+        if ('dt' in sub && 'i' in sub) {
+          const subItems = Array.isArray(sub.i) ? (sub.i as unknown[]) : [];
+          const subBullets = subItems
+            .map((bi: unknown, biIdx: number) => restoreBulletItem(bi, `${path}.subBullet.bullets[${biIdx}]`, strict))
+            .filter(Boolean);
+          if (subBullets.length)
+            bulletItem.subBullet = {
+              type: ParagraphType.BULLETS,
+              isOrder: !!(sub as Record<string, unknown>).io,
+              bullets: subBullets,
+            };
+        } else if ('b' in sub && sub.b && typeof sub.b === 'object' && 'dt' in sub.b && 'i' in sub.b) {
+          const bInner = sub.b as Record<string, unknown>;
+          const subItems = Array.isArray(bInner.i) ? (bInner.i as unknown[]) : [];
+          const subBullets = subItems
+            .map((bi: unknown, biIdx: number) => restoreBulletItem(bi, `${path}.subBullet.bullets[${biIdx}]`, strict))
+            .filter(Boolean);
+          if (subBullets.length)
+            bulletItem.subBullet = { type: ParagraphType.BULLETS, isOrder: !!bInner.io, bullets: subBullets };
+        }
+      }
+      return bulletItem;
+    }
+  }
+  if ('dt' in raw && 'i' in raw) {
+    const items = Array.isArray((raw as Record<string, unknown>).i)
+      ? ((raw as Record<string, unknown>).i as unknown[])
+      : [];
+    const phrases = restorePhrasesArray(
+      items,
+      String(decodeValue('type', (raw as Record<string, unknown>).dt) || 'text'),
+      { parentKey: 'p', path, strict },
+    );
+    const bulletItem: BulletItemSpec = { type: 'bullet-item', phrases };
+    if ('bs' in raw && raw.bs && typeof raw.bs === 'object') {
+      const sub = raw.bs;
+      if ('dt' in sub && 'i' in sub) {
+        const subItems = Array.isArray(sub.i) ? (sub.i as unknown[]) : [];
+        const subBullets = subItems
+          .map((bi: unknown, biIdx: number) => restoreBulletItem(bi, `${path}.subBullet.bullets[${biIdx}]`, strict))
+          .filter(Boolean);
+        if (subBullets.length)
+          bulletItem.subBullet = {
+            type: ParagraphType.BULLETS,
+            isOrder: !!(sub as Record<string, unknown>).io,
+            bullets: subBullets,
+          };
+      } else if ('b' in sub && sub.b && typeof sub.b === 'object' && 'dt' in sub.b && 'i' in sub.b) {
+        const bInner = sub.b as Record<string, unknown>;
+        const subItems = Array.isArray(bInner.i) ? (bInner.i as unknown[]) : [];
+        const subBullets = subItems
+          .map((bi: unknown, biIdx: number) => restoreBulletItem(bi, `${path}.subBullet.bullets[${biIdx}]`, strict))
+          .filter(Boolean);
+        if (subBullets.length)
+          bulletItem.subBullet = { type: ParagraphType.BULLETS, isOrder: !!bInner.io, bullets: subBullets };
+      }
+    }
+    return bulletItem;
+  }
+  return null;
+}


This function contains duplicated logic for handling sub-bullets (the bs property). The block of code that processes bs is repeated within both the if ('p' in raw) and if ('dt' in raw && 'i' in raw) branches. To improve maintainability and reduce redundancy, this logic could be extracted into a separate helper function. Alternatively, the function could be restructured to first determine the phrases and then handle the optional subBullet property once at the end, regardless of how the phrases were parsed.
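The extraction suggested above could look roughly like the following. It is a sketch under simplifying assumptions: `BulletItem` is a reduced stand-in for BulletItemSpec, the string `'bullets'` stands in for ParagraphType.BULLETS, and `unwrapSubBullets`, `attachSubBullet`, and the `restoreItem` callback are hypothetical names that do not appear in the PR.

```typescript
type RawObject = Record<string, unknown>;

// Simplified stand-in for BulletItemSpec.
interface BulletItem {
  type: 'bullet-item';
  phrases: unknown[];
  subBullet?: { type: 'bullets'; isOrder: boolean; bullets: BulletItem[] };
}

const isObject = (v: unknown): v is RawObject => typeof v === 'object' && v !== null;

// Accepts either { dt, i, io } or { b: { dt, i, io } } and returns the list container.
function unwrapSubBullets(bs: unknown): RawObject | null {
  if (!isObject(bs)) return null;
  if ('dt' in bs && 'i' in bs) return bs;
  if (isObject(bs.b) && 'dt' in bs.b && 'i' in bs.b) return bs.b;
  return null;
}

// Restores nested bullets once, regardless of how the phrases were parsed.
function attachSubBullet(
  item: BulletItem,
  raw: RawObject,
  restoreItem: (raw: unknown, index: number) => BulletItem | null,
): void {
  const container = unwrapSubBullets(raw.bs);
  if (!container) return;
  const subItems: unknown[] = Array.isArray(container.i) ? container.i : [];
  const bullets = subItems
    .map((child, index) => restoreItem(child, index))
    .filter((b): b is BulletItem => b !== null);
  if (bullets.length) {
    item.subBullet = { type: 'bullets', isOrder: Boolean(container.io), bullets };
  }
}
```

Each branch of restoreBulletItem would then build `phrases`, create the bullet item, and call `attachSubBullet` once, passing a callback that recurses with the appropriate path and strict flag.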

gemini-code-assist · 2025-10-21T08:28:25Z

__tests__/restore/benchmark.spec.ts

+// Load schema
+const schemaPath = path.resolve(__dirname, '../schema.json');
+const schema = JSON.parse(readFileSync(schemaPath, 'utf-8'));
+const ajv = new Ajv({ allErrors: true, strict: false });


Initializing Ajv with strict: false can be useful for benchmarks to allow for a wider range of generated test data, but it may also mask subtle schema violations, such as unexpected properties in the restored object. For more rigorous validation, it's generally recommended to use strict: true. If strict: false is intentionally used here, consider adding a comment to explain why it's necessary (e.g., if the schema itself is not fully compliant with strict mode).

gemini-code-assist bot reviewed Oct 21, 2025


fnlearner added 5 commits October 21, 2025 21:25

feat: 🎸 add schema optimizatino

c5be13a

feat: 🎸 add omit key

aa48b86

fix: 🐛 context simulation

2a9a203

feat: 🎸 add test case and optimizing the retore method

35dcc70

test: 💍 add compression test case

6db615e

fnlearner force-pushed the f/schema_optimize branch from 82eca91 to 6db615e October 21, 2025 13:26

fnlearner added 5 commits October 21, 2025 21:42

test: 💍 update test case

df1377e

fix: 🐛 fix no-compressive pa && nest bullets

e5b9148

chore: 🤖 update test path

08bbb9c

test: 💍 add regression testing

83a48d7

test: 💍 extract common method

f4e1483
