@fnlearner fnlearner commented Oct 21, 2025

@gemini-code-assist
Summary of Changes

Hello @fnlearner, I'm Gemini Code Assist [1]! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!

This pull request significantly enhances the schema optimization process by introducing a new restore utility function designed to decompress data efficiently. It includes robust testing, such as a synthetic benchmark to ensure the restored output matches the original schema, and specific tests to handle empty elements correctly. Furthermore, it updates the project's documentation to reflect the new compression strategy and integrates the optimized restore method into the main example.

Highlights

  • Restore Method Optimization: The core restore utility function has been optimized to efficiently decode compressed JSON structures back into their full schema representation, including handling abbreviated keys and numeric type mappings.
  • Comprehensive Test Cases: New test files (restore.benchmark.spec.ts and restore.empty-filter.spec.ts) have been introduced to validate the restore function's correctness, including a synthetic benchmark and specific tests for pruning empty elements.
  • Schema Compression Documentation: The prompt.md file has been updated with detailed instructions and mapping tables for abbreviated JSON keys and numeric type values, outlining the new schema compression strategy.
  • Example Usage Update: The example/main.tsx file now utilizes the optimized restore method with a new compressed JSON example (test_json7.json), demonstrating its integration.
  • Dependency Addition: The ajv library has been added as a development dependency to package.json, likely for schema validation in the new benchmark tests.
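
To make the decoding step in the first highlight concrete, here is a minimal sketch of expanding abbreviated keys and numeric type codes. The entries in `KEY_MAP` and `TYPE_MAP` and the `expand` function are hypothetical placeholders for illustration; the actual tables are the ones documented in prompt.md and implemented in the restore utility.

```typescript
// Hypothetical mapping tables, for illustration only.
// The real mappings are defined in prompt.md and the restore utility.
const KEY_MAP = new Map<string, string>([
  ['t', 'type'],
  ['v', 'value'],
  ['io', 'isOrder'],
]);
const TYPE_MAP = new Map<number, string>([
  [1, 'text'],
  [2, 'entity'],
]);

type Json = string | number | boolean | null | Json[] | { [key: string]: Json };

// Recursively expand abbreviated keys and numeric type codes.
function expand(node: Json): Json {
  if (Array.isArray(node)) return node.map(expand);
  if (node === null || typeof node !== 'object') return node;

  const out: { [key: string]: Json } = {};
  for (const [shortKey, value] of Object.entries(node)) {
    // Unknown keys pass through unchanged.
    const fullKey = KEY_MAP.get(shortKey) ?? shortKey;
    if (fullKey === 'type' && typeof value === 'number') {
      // Unknown type codes are kept as numbers rather than dropped.
      out[fullKey] = TYPE_MAP.get(value) ?? value;
    } else {
      out[fullKey] = expand(value);
    }
  }
  return out;
}

console.log(JSON.stringify(expand({ t: 1, v: 'hi' })));
// prints {"type":"text","value":"hi"}
```

Passing unknown keys and codes through unchanged is a design choice of this sketch; a strict mode could instead report them as errors.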

Footnotes

  1. Review the Privacy Notices, Generative AI Prohibited Use Policy, Terms of Service, and learn how to configure Gemini Code Assist in GitHub here. Gemini can make mistakes, so double check it and use code with caution.

@gemini-code-assist gemini-code-assist bot left a comment

Code Review

This pull request introduces a significant optimization by defining a compressed JSON schema and implementing a restore function to decompress it, which will help reduce payload size. The addition of both a benchmark test and specific filter tests is excellent, providing strong confidence in the new functionality. My review focuses on the implementation of the restore utility. I've identified a type definition mismatch where the SectionSpec type is missing a title property that the restore function correctly adds. I've also noted a few opportunities for refactoring within the restore utility to reduce code duplication and improve maintainability. Overall, this is a great enhancement to the project.

@@ -0,0 +1,615 @@
// -------------------- Decoding map constants --------------------
import { NarrativeTextSpec, SectionSpec } from '../src/schema/structure';

high

The restore function correctly adds a title property to section objects, which is consistent with the data in origin.json and xiaomi_origin.json. However, the SectionSpec type imported from ../src/schema/structure.ts does not include this property, leading to a type inconsistency. To ensure type safety and reflect the actual data structure, the SectionSpec and StandardSectionSpec type definitions should be updated to include the optional title property.

For example, in src/schema/structure.ts:

export type StandardSectionSpec = {
  title?: ParagraphSpec; // Or a more specific type for the title
  paragraphs?: ParagraphSpec[];
};

Comment on lines +293 to +296
// Coerce a numeric origin string to a number
if (originalKey === 'origin' && typeof restoredValue === 'string' && /^\d+(\.\d+)?$/.test(restoredValue)) {
  restoredValue = Number(restoredValue);
}

medium

The logic to convert the origin property from a string to a number is useful. However, this same logic is duplicated in the restorePhrasesArray function (line 344). To avoid code duplication and ensure consistent behavior, this transformation could be centralized. A good approach would be to move this conversion logic into the decodeValue function, so it's automatically applied whenever a key named origin is decoded.
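
A minimal sketch of what the centralized conversion could look like. `coerceOrigin` is a hypothetical helper name; the regular expression is the one from the diff above. The idea is that decodeValue (or whichever function ends up owning the rule) would call it as its last step, so both call sites share a single implementation.

```typescript
// Same pattern as in the diff: an unsigned integer or decimal string.
const NUMERIC_STRING = /^\d+(\.\d+)?$/;

// Hypothetical helper: converts numeric strings stored under the
// `origin` key to numbers and leaves every other value untouched.
function coerceOrigin(key: string, value: unknown): unknown {
  if (key === 'origin' && typeof value === 'string' && NUMERIC_STRING.test(value)) {
    return Number(value);
  }
  return value;
}

console.log(coerceOrigin('origin', '12.5')); // prints 12.5
console.log(coerceOrigin('origin', 'n/a')); // prints n/a
```

Keeping the rule in one function also makes it simple to unit test independently of the rest of restore.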

Comment on lines +384 to +468
function restoreBulletItem(raw: unknown, path: string, strict: boolean): BulletItemSpec | null {
  if (!raw || typeof raw !== 'object') return null;
  // Compressed form: { p:{ dt:1, i:[ {...phrase...} ] } } or { dt:1,i:[...] }
  if ('p' in raw) {
    const phrasesContainer = raw.p;
    if (
      phrasesContainer &&
      typeof phrasesContainer === 'object' &&
      'dt' in phrasesContainer &&
      'i' in phrasesContainer
    ) {
      const items = Array.isArray((phrasesContainer as Record<string, unknown>).i)
        ? ((phrasesContainer as Record<string, unknown>).i as unknown[])
        : [];
      const phrases = restorePhrasesArray(
        items,
        String(decodeValue('type', (phrasesContainer as Record<string, unknown>).dt) || 'text'),
        { parentKey: 'p', path, strict },
      );
      const bulletItem: BulletItemSpec = { type: 'bullet-item', phrases };
      // Handle child bullets: compressed key bs -> subBullet paragraph (t:11) or a direct dt:32,i:[...] list
      if ('bs' in raw && raw.bs && typeof raw.bs === 'object') {
        const sub = raw.bs;
        // Two forms are supported: { dt:32,i:[{p:{...}},...] , io:true } or { b:{ dt:32,i:[...] } }
        if ('dt' in sub && 'i' in sub) {
          const subItems = Array.isArray(sub.i) ? (sub.i as unknown[]) : [];
          const subBullets = subItems
            .map((bi: unknown, biIdx: number) => restoreBulletItem(bi, `${path}.subBullet.bullets[${biIdx}]`, strict))
            .filter(Boolean);
          if (subBullets.length)
            bulletItem.subBullet = {
              type: ParagraphType.BULLETS,
              isOrder: !!(sub as Record<string, unknown>).io,
              bullets: subBullets,
            };
        } else if ('b' in sub && sub.b && typeof sub.b === 'object' && 'dt' in sub.b && 'i' in sub.b) {
          const bInner = sub.b as Record<string, unknown>;
          const subItems = Array.isArray(bInner.i) ? (bInner.i as unknown[]) : [];
          const subBullets = subItems
            .map((bi: unknown, biIdx: number) => restoreBulletItem(bi, `${path}.subBullet.bullets[${biIdx}]`, strict))
            .filter(Boolean);
          if (subBullets.length)
            bulletItem.subBullet = { type: ParagraphType.BULLETS, isOrder: !!bInner.io, bullets: subBullets };
        }
      }
      return bulletItem;
    }
  }
  if ('dt' in raw && 'i' in raw) {
    const items = Array.isArray((raw as Record<string, unknown>).i)
      ? ((raw as Record<string, unknown>).i as unknown[])
      : [];
    const phrases = restorePhrasesArray(
      items,
      String(decodeValue('type', (raw as Record<string, unknown>).dt) || 'text'),
      { parentKey: 'p', path, strict },
    );
    const bulletItem: BulletItemSpec = { type: 'bullet-item', phrases };
    if ('bs' in raw && raw.bs && typeof raw.bs === 'object') {
      const sub = raw.bs;
      if ('dt' in sub && 'i' in sub) {
        const subItems = Array.isArray(sub.i) ? (sub.i as unknown[]) : [];
        const subBullets = subItems
          .map((bi: unknown, biIdx: number) => restoreBulletItem(bi, `${path}.subBullet.bullets[${biIdx}]`, strict))
          .filter(Boolean);
        if (subBullets.length)
          bulletItem.subBullet = {
            type: ParagraphType.BULLETS,
            isOrder: !!(sub as Record<string, unknown>).io,
            bullets: subBullets,
          };
      } else if ('b' in sub && sub.b && typeof sub.b === 'object' && 'dt' in sub.b && 'i' in sub.b) {
        const bInner = sub.b as Record<string, unknown>;
        const subItems = Array.isArray(bInner.i) ? (bInner.i as unknown[]) : [];
        const subBullets = subItems
          .map((bi: unknown, biIdx: number) => restoreBulletItem(bi, `${path}.subBullet.bullets[${biIdx}]`, strict))
          .filter(Boolean);
        if (subBullets.length)
          bulletItem.subBullet = { type: ParagraphType.BULLETS, isOrder: !!bInner.io, bullets: subBullets };
      }
    }
    return bulletItem;
  }
  return null;
}

medium

This function contains duplicated logic for handling sub-bullets (the bs property). The block of code that processes bs is repeated within both the if ('p' in raw) and if ('dt' in raw && 'i' in raw) branches. To improve maintainability and reduce redundancy, this logic could be extracted into a separate helper function. Alternatively, the function could be restructured to first determine the phrases and then handle the optional subBullet property once at the end, regardless of how the phrases were parsed.
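
One possible shape for that restructuring, as a self-contained sketch rather than a drop-in patch. The types are simplified stand-ins for BulletItemSpec and the bullets paragraph type, `restorePhrases` is a stub standing in for restorePhrasesArray, and the names `asContainer` and `restoreSubBullet` are hypothetical. The `strict` flag and the decodeValue call are omitted to keep the example short.

```typescript
// Simplified stand-ins for the project's types (assumptions, not the real definitions).
type Phrase = { type: string; value: string };
type BulletList = { type: 'bullets'; isOrder: boolean; bullets: BulletItem[] };
type BulletItem = { type: 'bullet-item'; phrases: Phrase[]; subBullet?: BulletList };
type Dict = Record<string, unknown>;

const isDict = (v: unknown): v is Dict =>
  typeof v === 'object' && v !== null && !Array.isArray(v);

// Returns v when it looks like a compressed container ({ dt, i }), else undefined.
function asContainer(v: unknown): Dict | undefined {
  return isDict(v) && 'dt' in v && 'i' in v ? v : undefined;
}

// Stub standing in for restorePhrasesArray.
function restorePhrases(items: unknown[]): Phrase[] {
  return items.filter(isDict).map((item) => ({ type: 'text', value: String(item.v ?? '') }));
}

// Single place that handles both sub-bullet forms:
// { dt, i, io } directly, or wrapped as { b: { dt, i, io } }.
function restoreSubBullet(bs: unknown, path: string): BulletList | undefined {
  if (!isDict(bs)) return undefined;
  const list = asContainer(bs) ?? asContainer(bs.b);
  if (!list) return undefined;
  const items = Array.isArray(list.i) ? list.i : [];
  const bullets = items
    .map((item, idx) => restoreBulletItem(item, `${path}.subBullet.bullets[${idx}]`))
    .filter((b): b is BulletItem => b !== null);
  return bullets.length ? { type: 'bullets', isOrder: Boolean(list.io), bullets } : undefined;
}

function restoreBulletItem(raw: unknown, path: string): BulletItem | null {
  if (!isDict(raw)) return null;
  // Resolve the phrases container first: either raw.p or raw itself.
  const container = asContainer(raw.p) ?? asContainer(raw);
  if (!container) return null;
  const items = Array.isArray(container.i) ? container.i : [];
  const bulletItem: BulletItem = { type: 'bullet-item', phrases: restorePhrases(items) };
  // Sub-bullets are handled once, regardless of how the phrases were found.
  const subBullet = restoreSubBullet(raw.bs, path);
  if (subBullet) bulletItem.subBullet = subBullet;
  return bulletItem;
}
```

With this layout, a change to the sub-bullet format touches only `restoreSubBullet`, and the type-guard filter avoids the loosely typed `.filter(Boolean)` result.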

// Load schema
const schemaPath = path.resolve(__dirname, '../schema.json');
const schema = JSON.parse(readFileSync(schemaPath, 'utf-8'));
const ajv = new Ajv({ allErrors: true, strict: false });

medium

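Initializing Ajv with strict: false can be convenient in a benchmark, because it lets Ajv compile a schema that is not fully strict-mode compliant (for example, one that uses unknown keywords). Note that Ajv's strict mode governs how the schema itself is checked, not the data: with it disabled, mistakes in schema.json such as a misspelled keyword are silently ignored, which in turn weakens the validation of the restored object. For more rigorous validation, it's generally recommended to keep strict mode enabled (strict: true). If strict: false is used intentionally here, consider adding a comment to explain why it's necessary (e.g., if the schema itself is not fully compliant with strict mode).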
