[codex] Prevent Hancom from rejecting HWPX roundtrips #40

Merged: airmang merged 1 commit into main from codex/hwpx-hancom-compat-main on May 4, 2026

airmang commented May 4, 2026

Summary

  • Preserve Hancom-compatible root namespace declarations and standalone="yes" on section/header HWPML parts during document serialization.
  • Preserve original ZIP entry order and per-entry metadata when saving an existing HWPX package.
  • Make package_validator fail on the exact regression that produced Hancom “damaged/tampered” behavior, and normalize section/header roots when packing directories back to HWPX.
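The second bullet, preserving entry order and per-entry metadata, can be illustrated with the standard-library `zipfile` module. This is a minimal sketch and not the code from this PR: the function name `rewrite_preserving_entries` and the `replacements` mapping are hypothetical, and it only carries over what `ZipInfo` exposes (name, timestamp, compression method, attributes, extra field, comment).

```python
import copy
import io
import zipfile


def rewrite_preserving_entries(src_bytes: bytes, replacements: dict[str, bytes]) -> bytes:
    """Rewrite a ZIP archive, keeping entry order and per-entry metadata.

    Hypothetical helper: `replacements` maps entry names to new payloads.
    Entries that are not listed are copied through unchanged.
    """
    out = io.BytesIO()
    with zipfile.ZipFile(io.BytesIO(src_bytes)) as src, zipfile.ZipFile(out, "w") as dst:
        # infolist() yields entries in the order stored in the source archive.
        for info in src.infolist():
            data = replacements.get(info.filename)
            if data is None:
                data = src.read(info)
            # Passing a ZipInfo (not a bare name) keeps the original timestamp,
            # compression method and attributes. A copy is used because
            # writestr() updates sizes and offsets on the object it is given.
            dst.writestr(copy.copy(info), data)
    return out.getvalue()
```

Writing by bare entry name instead would stamp each entry with the current time and the output archive's default compression, which is the kind of drift the bullet describes.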

Root cause

The previous save path produced XML that generic parsers accept but Hancom Office can reject, for three reasons: section/header roots lost the broad set of HWPML namespace declarations, the XML declaration could omit standalone="yes", and ZIP entries were rewritten without preserving the original archive ordering and metadata.
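For illustration only, here is how a generic parse-and-serialize step can lose both pieces of root metadata while still producing well-formed XML. The sketch uses the standard-library `xml.etree.ElementTree`; it does not claim this is what the library's earlier save path did, and the `urn:example:*` namespace URIs and the `naive_roundtrip` name are placeholders, not real HWPML values.

```python
import xml.etree.ElementTree as ET

# Placeholder document: the namespace URIs are illustrative, not HWPML's.
ORIGINAL = (
    b'<?xml version="1.0" encoding="UTF-8" standalone="yes"?>'
    b'<hs:sec xmlns:hs="urn:example:section" xmlns:hp="urn:example:paragraph"/>'
)


def naive_roundtrip(xml_bytes: bytes) -> bytes:
    """Parse and re-serialize with ElementTree defaults (hypothetical helper)."""
    root = ET.fromstring(xml_bytes)
    # ElementTree writes a declaration without a standalone attribute and
    # emits only the namespaces actually used by elements or attributes.
    return ET.tostring(root, encoding="UTF-8", xml_declaration=True)


roundtripped = naive_roundtrip(ORIGINAL)
```

The result still parses to the same root element, but the `standalone` attribute and the unused `hp` declaration are gone, which is why well-formedness alone is not a sufficient check here.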

Validation

  • python -m pytest tests/test_gap_closure_tools.py tests/test_opc_package.py -q — 26 passed
  • python -m pytest -q — 256 passed, 2 skipped
  • pyright — 0 errors
  • Real HWPX add-paragraph roundtrip against the provided form: validate_package passed, section/header roots retained standalone="yes" and no required namespace declarations were missing
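The root audit in the last item could look roughly like the sketch below. It is an assumption-laden illustration, not the project's `package_validator`: `audit_root` is a hypothetical name, the caller supplies the set of required prefixes, and the check works on prefixes rather than on namespace URIs.

```python
import io
import re
import xml.etree.ElementTree as ET

_XML_DECL = re.compile(rb"<\?xml\s[^?]*\?>")
_STANDALONE_YES = re.compile(rb"""standalone\s*=\s*["']yes["']""")


def audit_root(xml_bytes: bytes, required_prefixes: set[str]) -> tuple[bool, list[str]]:
    """Return (has standalone="yes", required prefixes missing on the root).

    Hypothetical helper; `required_prefixes` is chosen by the caller.
    """
    decl = _XML_DECL.match(xml_bytes)
    standalone = bool(decl and _STANDALONE_YES.search(decl.group(0)))

    declared: set[str] = set()
    events = ET.iterparse(io.BytesIO(xml_bytes), events=("start-ns", "start"))
    for event, payload in events:
        if event == "start":
            # The first "start" event is the root element. Every "start-ns"
            # event seen before it was declared on the root itself.
            break
        declared.add(payload[0])  # payload is a (prefix, uri) pair

    return standalone, sorted(required_prefixes - declared)
```

A declaration that sits on a child element instead of the root is reported as missing, matching the idea that the root must carry the full set.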

Notes

Manual opening in the Hancom Office GUI was not performed from this environment.

Keep section/header XML roots and archive metadata aligned with Hancom-authored packages so simple read-modify-save operations do not produce files that look damaged or tampered with.

Constraint: Hancom Office is stricter than generic XML parsers about HWPML root declarations, standalone XML declarations, and OPC ZIP entry metadata.
Rejected: Relying on XML well-formedness alone | it allowed files that validate in Python but can be rejected by Hancom.
Confidence: high
Scope-risk: moderate
Directive: Preserve Hancom-compatible HWPML root metadata when adding new serializers or pack/unpack paths.
Tested: python -m pytest tests/test_gap_closure_tools.py tests/test_opc_package.py -q; python -m pytest -q; pyright; real HWPX add_paragraph roundtrip validator/root namespace audit
Not-tested: Manual opening in Hancom Office GUI

airmang marked this pull request as ready for review May 4, 2026 15:35
airmang merged commit dbc5ea6 into main May 4, 2026
6 checks passed