-
Notifications
You must be signed in to change notification settings - Fork 6
F/schema optimize #143
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
F/schema optimize #143
Conversation
Summary of ChangesHello @fnlearner, I'm Gemini Code Assist1! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed! This pull request significantly enhances the schema optimization process by introducing a new Highlights
Using Gemini Code AssistThe full guide for Gemini Code Assist can be found on our documentation page, here are some quick tips. Invoking Gemini You can request assistance from Gemini at any point by creating a comment using either
Customization To customize Gemini Code Assist for GitHub experience, repository maintainers can create a configuration file and/or provide a custom code review style guide (such as PEP-8 for Python) by creating and adding files to a Limitations & Feedback Gemini Code Assist may make mistakes. Please leave feedback on any instances where its feedback is incorrect or counter productive. You can react with 👍 and 👎 on @gemini-code-assist comments. If you're interested in giving your feedback about your experience with Gemini Code Assist for Github and other Google products, sign up here. You can also get AI-powered code generation, chat, as well as code reviews directly in the IDE at no cost with the Gemini Code Assist IDE Extension. Footnotes
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Code Review
This pull request introduces a significant optimization by defining a compressed JSON schema and implementing a restore function to decompress it, which will help reduce payload size. The addition of both a benchmark test and specific filter tests is excellent, providing strong confidence in the new functionality. My review focuses on the implementation of the restore utility. I've identified a type definition mismatch where the SectionSpec type is missing a title property that the restore function correctly adds. I've also noted a few opportunities for refactoring within the restore utility to reduce code duplication and improve maintainability. Overall, this is a great enhancement to the project.
| @@ -0,0 +1,615 @@ | |||
| // -------------------- 解码映射常量 -------------------- | |||
| import { NarrativeTextSpec, SectionSpec } from '../src/schema/structure'; | |||
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The restore function correctly adds a title property to section objects, which is consistent with the data in origin.json and xiaomi_origin.json. However, the SectionSpec type imported from ../src/schema/structure.ts does not include this property, leading to a type inconsistency. To ensure type safety and reflect the actual data structure, the SectionSpec and StandardSectionSpec type definitions should be updated to include the optional title property.
For example, in src/schema/structure.ts:
export type StandardSectionSpec = {
title?: ParagraphSpec; // Or a more specific type for the title
paragraphs?: ParagraphSpec[];
};| // origin 数字强制转换 | ||
| if (originalKey === 'origin' && typeof restoredValue === 'string' && /^\d+(\.\d+)?$/.test(restoredValue)) { | ||
| restoredValue = Number(restoredValue); | ||
| } |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The logic to convert the origin property from a string to a number is useful. However, this same logic is duplicated in the restorePhrasesArray function (line 344). To avoid code duplication and ensure consistent behavior, this transformation could be centralized. A good approach would be to move this conversion logic into the decodeValue function, so it's automatically applied whenever a key named origin is decoded.
| function restoreBulletItem(raw: unknown, path: string, strict: boolean): BulletItemSpec | null { | ||
| if (!raw || typeof raw !== 'object') return null; | ||
| // 压缩形式: { p:{ dt:1, i:[ {...phrase...} ] } } 或 { dt:1,i:[...] } | ||
| if ('p' in raw) { | ||
| const phrasesContainer = raw.p; | ||
| if ( | ||
| phrasesContainer && | ||
| typeof phrasesContainer === 'object' && | ||
| 'dt' in phrasesContainer && | ||
| 'i' in phrasesContainer | ||
| ) { | ||
| const items = Array.isArray((phrasesContainer as Record<string, unknown>).i) | ||
| ? ((phrasesContainer as Record<string, unknown>).i as unknown[]) | ||
| : []; | ||
| const phrases = restorePhrasesArray( | ||
| items, | ||
| String(decodeValue('type', (phrasesContainer as Record<string, unknown>).dt) || 'text'), | ||
| { parentKey: 'p', path, strict }, | ||
| ); | ||
| const bulletItem: BulletItemSpec = { type: 'bullet-item', phrases }; | ||
| // 处理子级 bullets: 压缩键 bs -> subBullet paragraph (t:11) 或直接 dt:32,i:[...] 列表 | ||
| if ('bs' in raw && raw.bs && typeof raw.bs === 'object') { | ||
| const sub = raw.bs; | ||
| // 支持两种: { dt:32,i:[{p:{...}},...] , io:true } 或 { b:{ dt:32,i:[...] } } | ||
| if ('dt' in sub && 'i' in sub) { | ||
| const subItems = Array.isArray(sub.i) ? (sub.i as unknown[]) : []; | ||
| const subBullets = subItems | ||
| .map((bi: unknown, biIdx: number) => restoreBulletItem(bi, `${path}.subBullet.bullets[${biIdx}]`, strict)) | ||
| .filter(Boolean); | ||
| if (subBullets.length) | ||
| bulletItem.subBullet = { | ||
| type: ParagraphType.BULLETS, | ||
| isOrder: !!(sub as Record<string, unknown>).io, | ||
| bullets: subBullets, | ||
| }; | ||
| } else if ('b' in sub && sub.b && typeof sub.b === 'object' && 'dt' in sub.b && 'i' in sub.b) { | ||
| const bInner = sub.b as Record<string, unknown>; | ||
| const subItems = Array.isArray(bInner.i) ? (bInner.i as unknown[]) : []; | ||
| const subBullets = subItems | ||
| .map((bi: unknown, biIdx: number) => restoreBulletItem(bi, `${path}.subBullet.bullets[${biIdx}]`, strict)) | ||
| .filter(Boolean); | ||
| if (subBullets.length) | ||
| bulletItem.subBullet = { type: ParagraphType.BULLETS, isOrder: !!bInner.io, bullets: subBullets }; | ||
| } | ||
| } | ||
| return bulletItem; | ||
| } | ||
| } | ||
| if ('dt' in raw && 'i' in raw) { | ||
| const items = Array.isArray((raw as Record<string, unknown>).i) | ||
| ? ((raw as Record<string, unknown>).i as unknown[]) | ||
| : []; | ||
| const phrases = restorePhrasesArray( | ||
| items, | ||
| String(decodeValue('type', (raw as Record<string, unknown>).dt) || 'text'), | ||
| { parentKey: 'p', path, strict }, | ||
| ); | ||
| const bulletItem: BulletItemSpec = { type: 'bullet-item', phrases }; | ||
| if ('bs' in raw && raw.bs && typeof raw.bs === 'object') { | ||
| const sub = raw.bs; | ||
| if ('dt' in sub && 'i' in sub) { | ||
| const subItems = Array.isArray(sub.i) ? (sub.i as unknown[]) : []; | ||
| const subBullets = subItems | ||
| .map((bi: unknown, biIdx: number) => restoreBulletItem(bi, `${path}.subBullet.bullets[${biIdx}]`, strict)) | ||
| .filter(Boolean); | ||
| if (subBullets.length) | ||
| bulletItem.subBullet = { | ||
| type: ParagraphType.BULLETS, | ||
| isOrder: !!(sub as Record<string, unknown>).io, | ||
| bullets: subBullets, | ||
| }; | ||
| } else if ('b' in sub && sub.b && typeof sub.b === 'object' && 'dt' in sub.b && 'i' in sub.b) { | ||
| const bInner = sub.b as Record<string, unknown>; | ||
| const subItems = Array.isArray(bInner.i) ? (bInner.i as unknown[]) : []; | ||
| const subBullets = subItems | ||
| .map((bi: unknown, biIdx: number) => restoreBulletItem(bi, `${path}.subBullet.bullets[${biIdx}]`, strict)) | ||
| .filter(Boolean); | ||
| if (subBullets.length) | ||
| bulletItem.subBullet = { type: ParagraphType.BULLETS, isOrder: !!bInner.io, bullets: subBullets }; | ||
| } | ||
| } | ||
| return bulletItem; | ||
| } | ||
| return null; | ||
| } |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This function contains duplicated logic for handling sub-bullets (the bs property). The block of code that processes bs is repeated within both the if ('p' in raw) and if ('dt' in raw && 'i' in raw) branches. To improve maintainability and reduce redundancy, this logic could be extracted into a separate helper function. Alternatively, the function could be restructured to first determine the phrases and then handle the optional subBullet property once at the end, regardless of how the phrases were parsed.
__tests__/restore/benchmark.spec.ts
Outdated
| // Load schema | ||
| const schemaPath = path.resolve(__dirname, '../schema.json'); | ||
| const schema = JSON.parse(readFileSync(schemaPath, 'utf-8')); | ||
| const ajv = new Ajv({ allErrors: true, strict: false }); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Initializing Ajv with strict: false can be useful for benchmarks to allow for a wider range of generated test data, but it may also mask subtle schema violations, such as unexpected properties in the restored object. For more rigorous validation, it's generally recommended to use strict: true. If strict: false is intentionally used here, consider adding a comment to explain why it's necessary (e.g., if the schema itself is not fully compliant with strict mode).
82eca91 to
6db615e
Compare
add test case and optimizing the restore method
the assistant is here
shared demos:
https://gemini.google.com/share/a906a1cb5532
https://gemini.google.com/share/339a5681f618
https://gemini.google.com/share/b0600d9292ff