Loop if XSD is recursive #9

Open
svanteschubert opened this issue Mar 27, 2023 · 3 comments

@svanteschubert

The European e-invoice standard uses OASIS UBL XML and UN/CEFACT CII XML. The latter is used from its second 2016 release (D16B); you can find the XSD here: https://unece.org/DAM/cefact/xml_schemas/D16B_SCRDM__Subset__CII.zip

It contains something nasty, an element that can contain itself (see ItemGroupedWorkItem below):

<xsd:complexType name="GroupedWorkItemType">
	<xsd:sequence>
		<xsd:element name="ID" type="udt:IDType"/>
		<xsd:element name="PrimaryClassificationCode" type="udt:CodeType" minOccurs="0" maxOccurs="unbounded"/>
		<xsd:element name="AlternativeClassificationCode" type="udt:CodeType" minOccurs="0" maxOccurs="unbounded"/>
		<xsd:element name="TypeCode" type="udt:CodeType" minOccurs="0" maxOccurs="unbounded"/>
		<xsd:element name="Comment" type="udt:TextType" minOccurs="0" maxOccurs="unbounded"/>
		<xsd:element name="TotalQuantity" type="udt:QuantityType" minOccurs="0"/>
		<xsd:element name="Index" type="udt:TextType" minOccurs="0"/>
		<xsd:element name="RequestedActionCode" type="udt:CodeType" minOccurs="0" maxOccurs="unbounded"/>
		<xsd:element name="PriceListItemID" type="udt:IDType" minOccurs="0"/>
		<xsd:element name="ContractualLanguageCode" type="udt:CodeType" minOccurs="0"/>
		<xsd:element name="TotalCalculatedPrice" type="ram:CalculatedPriceType" minOccurs="0" maxOccurs="unbounded"/>
		<xsd:element name="ItemGroupedWorkItem" type="ram:GroupedWorkItemType" minOccurs="0" maxOccurs="unbounded"/>
		<xsd:element name="ItemBasicWorkItem" type="ram:BasicWorkItemType" minOccurs="0" maxOccurs="unbounded"/>
		<xsd:element name="ChangedRecordedStatus" type="ram:RecordedStatusType" minOccurs="0" maxOccurs="unbounded"/>
		<xsd:element name="ActualWorkItemComplexDescription" type="ram:WorkItemComplexDescriptionType" minOccurs="0" maxOccurs="unbounded"/>
		<xsd:element name="ReferencedSpecifiedBinaryFile" type="ram:SpecifiedBinaryFileType" minOccurs="0" maxOccurs="unbounded"/>
	</xsd:sequence>
</xsd:complexType>

The trick to solve it was to call the constructor only when the element does not already exist.
For this reason I added a Map<String, XsdElement> at the document root

public Map<String, XsdElement> allElements = new HashMap<>();

and replaced the two constructors with two static factory methods. A final tweak was to move the recursive init() out of the constructor (otherwise the case above would still loop) :-)

    public static XsdElement newXsdElement(XSElementDeclaration element, XsdDocument parent) {
        // Key by qualified name "{namespace}name", or by the bare name if there is no namespace.
        String ns = element.getNamespace();
        String name = element.getName();
        if (ns != null && !ns.isEmpty()) {
            name = "{" + ns + "}" + name;
        }
        XsdElement existing = parent.allElements.get(name);
        if (existing != null) {
            return existing;
        }
        XsdElement xsdElement = new XsdElement(element, parent);
        // Register before init(), so that a recursive reference finds this instance.
        parent.allElements.put(name, xsdElement);
        xsdElement.init();
        return xsdElement;
    }

    public static XsdElement newXsdElement(XSParticle elementDefinition, XsdElement parent) {
        // Cast to the interface instead of the Xerces implementation class XSElementDecl.
        XSElementDeclaration element = (XSElementDeclaration) elementDefinition.getTerm();
        // Key by qualified name "{namespace}name", or by the bare name if there is no namespace.
        String ns = element.getNamespace();
        String name = element.getName();
        if (ns != null && !ns.isEmpty()) {
            name = "{" + ns + "}" + name;
        }
        XsdElement existing = parent.document.allElements.get(name);
        if (existing != null) {
            return existing;
        }
        XsdElement xsdElement = new XsdElement(elementDefinition, parent);
        // Register before init(), so that a recursive reference finds this instance.
        parent.document.allElements.put(name, xsdElement);
        xsdElement.init();
        return xsdElement;
    }
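
To show why the order "create, register, then init()" matters, here is a minimal self-contained sketch of the same idea without Xerces. TypeDecl, Node, NodeRegistry and RecursiveSchemaDemo are hypothetical stand-ins made up for this illustration; they are not classes of this project.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical stand-in for a schema type: a name plus the names of its child types. */
final class TypeDecl {
    final String name;
    final List<String> childTypeNames;

    TypeDecl(String name, List<String> childTypeNames) {
        this.name = name;
        this.childTypeNames = childTypeNames;
    }
}

/** Hypothetical stand-in for XsdElement; its constructor does no recursion. */
final class Node {
    final TypeDecl decl;
    final List<Node> children = new ArrayList<>();

    Node(TypeDecl decl) {
        this.decl = decl;
    }
}

final class NodeRegistry {
    private final Map<String, TypeDecl> schema;
    private final Map<String, Node> allNodes = new HashMap<>();

    NodeRegistry(Map<String, TypeDecl> schema) {
        this.schema = schema;
    }

    /** Returns the cached node, or creates, registers and only then initializes a new one. */
    Node nodeFor(String typeName) {
        Node existing = allNodes.get(typeName);
        if (existing != null) {
            return existing;
        }
        Node node = new Node(schema.get(typeName));
        allNodes.put(typeName, node); // register first, so a self-reference finds it
        for (String child : node.decl.childTypeNames) {
            node.children.add(nodeFor(child)); // recursion ends at the cache lookup above
        }
        return node;
    }

    int size() {
        return allNodes.size();
    }
}

final class RecursiveSchemaDemo {
    public static void main(String[] args) {
        Map<String, TypeDecl> schema = new HashMap<>();
        schema.put("GroupedWorkItemType",
                new TypeDecl("GroupedWorkItemType", List.of("IDType", "GroupedWorkItemType")));
        schema.put("IDType", new TypeDecl("IDType", List.of()));

        NodeRegistry registry = new NodeRegistry(schema);
        Node grouped = registry.nodeFor("GroupedWorkItemType");
        // The recursive child is the very same instance as its parent.
        System.out.println(grouped.children.get(1) == grouped);
    }
}
```

If the put() were moved after the loop, the lookup for the self-referencing child would miss the cache and recurse without end, which is the loop described above.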

Now the compare method is still looping, likely for a similar reason, and
I will check tomorrow. I plan to provide a patch if you would like one (just answer); otherwise I might save the time.
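
If the compare loop does turn out to have a similar cause, one possible guard is to remember which pairs of nodes have already been visited. The following is only a sketch under that assumption; SchemaNode and CycleSafeCompare are hypothetical names and not the project's actual compare code.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

/** Hypothetical node type; it does not override equals/hashCode, so identity is used. */
final class SchemaNode {
    final String name;
    final List<SchemaNode> children = new ArrayList<>();

    SchemaNode(String name) {
        this.name = name;
    }
}

final class CycleSafeCompare {
    /** Structural comparison that terminates even if the node graphs contain cycles. */
    static boolean sameStructure(SchemaNode oldNode, SchemaNode newNode) {
        return sameStructure(oldNode, newNode, new HashSet<>());
    }

    private static boolean sameStructure(SchemaNode oldNode, SchemaNode newNode,
                                         Set<List<SchemaNode>> visited) {
        // A pair seen before is either finished or still in progress further up the stack;
        // in both cases descending again adds nothing, so treat it as matching here.
        if (!visited.add(List.of(oldNode, newNode))) {
            return true;
        }
        if (!oldNode.name.equals(newNode.name)
                || oldNode.children.size() != newNode.children.size()) {
            return false;
        }
        for (int i = 0; i < oldNode.children.size(); i++) {
            if (!sameStructure(oldNode.children.get(i), newNode.children.get(i), visited)) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        SchemaNode oldNode = new SchemaNode("ItemGroupedWorkItem");
        oldNode.children.add(oldNode); // element that contains itself
        SchemaNode newNode = new SchemaNode("ItemGroupedWorkItem");
        newNode.children.add(newNode);
        System.out.println(sameStructure(oldNode, newNode));
    }
}
```

Returning true for an already visited pair is safe in this sketch because any real mismatch found elsewhere still makes the overall result false.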

@yoep
Owner

yoep commented Mar 27, 2023

Hi

If you want to and can create a PR for it, it would be much appreciated.
Otherwise, I'll see if I can find the time to patch it later this week.

@svanteschubert
Author

Hi yoep,

I did so now: #10
It was a pleasure to work on this, and I have learned a lot about XSD (reading the spec) and about the Xerces API!
I had never heard of Lombok before, very useful! Learned a lot from you, thank you!

PS: Please be gentle if I have renamed a few things. For instance, I now use oldNode and newNode (dropping the sometimes-used prefix "original" in favor of "old"). In the end these are just names and perhaps personal taste. You could start changing things back, but perhaps let's have a quick chat first to understand the intentions ;-)
