Multiple XML Schemas with different namespaces in the same WSDL #5

Open
holtkamp opened this issue Feb 15, 2017 · 15 comments

@holtkamp

holtkamp commented Feb 15, 2017

While experimenting with this library against https://cal-test.adyen.com/cal/services/Account/v1?wsdl I noticed errors like:

php vendor/bin/wsdl2php convert config.yml v1.wsdl
[GoetasWebservices\XML\XSDReader\Exception\TypeException]                                   
  Can't find type named {http://common.services.adyen.com}#Address, at line 26 in /Users/user/workspace/payment-api-adyen/v1.wsdl         

This particular element resides in a different namespace than the target namespace of the first <xsd:schema/> tag.

As we can see in the WSDL, the <wsdl:types/> tag contains multiple <xsd:schema/> tags.

[Screenshot, 2017-02-15: the <wsdl:types/> tag of the WSDL containing multiple <xsd:schema/> tags]

I wonder, is this currently supported?

@goetas
Member

goetas commented Feb 15, 2017

This should work... I remember that I have already used it to generate some stuff for an Adyen integration...

I should investigate...

@holtkamp
Author

holtkamp commented Feb 15, 2017

Wow, that was a quick answer! I was still investigating myself. It should indeed work, judging by this:

https://github.com/goetas-webservices/wsdl-reader/blob/7ca312187f620c667f6199ad73ec4ad0c7c4217d/src/DefinitionsReader.php#L155-L164

Will fiddle around a bit more and get back here.

@holtkamp
Author

holtkamp commented Feb 15, 2017

Is it possible that the DefinitionsReader chokes on:

<xsd:element minOccurs="0" name="address" nillable="true" type="ns1:Address" />

because when trying to process that xsd:element using SchemaReader::schemaNode(), the type ns1:Address has not been defined yet since the schema it belongs to has not been processed yet?

@holtkamp
Author

holtkamp commented Feb 15, 2017

Reduced it to this problematic WSDL:

<?xml version="1.0" encoding="UTF-8"?>
<wsdl:definitions xmlns:ns1="http://common.services.adyen.com" xmlns:wsdl="http://schemas.xmlsoap.org/wsdl/" xmlns:xsd="http://www.w3.org/2001/XMLSchema" targetNamespace="http://account.marketplace.services.adyen.com">
  <wsdl:types>
    <xsd:schema xmlns:xsd="http://www.w3.org/2001/XMLSchema" attributeFormDefault="qualified" elementFormDefault="qualified" targetNamespace="http://account.marketplace.services.adyen.com">
      <xsd:import namespace="http://common.services.adyen.com" />
      <xsd:complexType name="AccountHolderDetails">
        <xsd:sequence>
          <xsd:element minOccurs="0" name="address" nillable="true" type="ns1:Address" />
        </xsd:sequence>
      </xsd:complexType>
    </xsd:schema>
    <xsd:schema xmlns:xsd="http://www.w3.org/2001/XMLSchema" attributeFormDefault="qualified" elementFormDefault="qualified" targetNamespace="http://common.services.adyen.com">
      <xsd:complexType name="Address">
        <xsd:sequence>
          <xsd:element minOccurs="0" name="city" nillable="true" type="xsd:string" />
        </xsd:sequence>
      </xsd:complexType>
    </xsd:schema>
  </wsdl:types>
</wsdl:definitions>

I got the idea that the import of the namespace in the second schema is going wrong. When the first schema is processed, it is stored in the loadedFiles array as /Users/user/workspace/payment-api-adyen/v1.wsdl#1, but it cannot be found under that key when it is imported later on in SchemaReader::loadImport(), since that method only considers /Users/user/workspace/payment-api-adyen/v1.wsdl (note the missing hash suffix).

Keying it internally (in DefinitionsReader::loadTypes()) by the target namespace instead of by a numeric index might be an idea:

$childNode->ownerDocument->documentURI . '#'  . $childNode->getAttribute('targetNamespace');

Note that this does not resolve the problem that the type of an XSD element should be defined BEFORE it is used...
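
The key mismatch described above can be reproduced in isolation. The following is a minimal sketch, not the library's code: `$byIndex` and `$byNamespace` are plain arrays standing in for `loadedFiles`, and `indexKey()` / `namespaceKey()` are hypothetical helpers for the two key schemes.

```php
<?php
// Hypothetical stand-ins for the two key schemes discussed above.
function indexKey(string $documentUri, int $index): string
{
    return $documentUri . '#' . $index;
}

function namespaceKey(string $documentUri, string $targetNamespace): string
{
    return $documentUri . '#' . $targetNamespace;
}

$file = '/Users/user/workspace/payment-api-adyen/v1.wsdl';
$ns = 'http://common.services.adyen.com';

// Index-based key: an import knows the file and the namespace, but not the position.
$byIndex = [indexKey($file, 1) => 'schema for ' . $ns];
$foundByFile = isset($byIndex[$file]);

// Namespace-based key: can be rebuilt from the import's "namespace" attribute.
$byNamespace = [namespaceKey($file, $ns) => 'schema for ' . $ns];
$foundByNamespace = isset($byNamespace[namespaceKey($file, $ns)]);

var_dump($foundByFile, $foundByNamespace); // bool(false), bool(true)
```

The lookup by bare file path misses the entry stored under the indexed key, while a key derived from the namespace can be recomputed on both the storing and the importing side.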

@goetas
Member

goetas commented Feb 16, 2017 via email

@goetas
Member

goetas commented Feb 16, 2017

I had a look at the problem... and it is not an easy one :(

The schema is trying to resolve a type that does not exist yet (it will be loaded in the next iteration). I was partially aware of the problem and there is some code and a strategy to solve it... but the current code will definitely not work, and making it work will require more than a simple bugfix :(

All the callbacks and call_user_func calls were trying to "solve" the problem, but they will hardly work... the solution should be multi-step parsing:

  1. set the schemas to load (most of the time only one, but in this case two or more)
  2. load all schemas that are not loaded yet
  3. resolve types
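
The three steps above can be sketched with PHP's DOM extension. This is only an illustration under simplifying assumptions, not the library's implementation: types are tracked in a plain `$types` array keyed by `{namespace}name`, only `xsd:complexType` definitions and prefixed `type` attributes are handled, and the embedded WSDL is a trimmed copy of the problematic one from this issue.

```php
<?php
// Sketch only: plain arrays instead of the library's Schema objects.
const XSD_NS = 'http://www.w3.org/2001/XMLSchema';

$wsdl = <<<'XML'
<wsdl:definitions xmlns:ns1="http://common.services.adyen.com" xmlns:wsdl="http://schemas.xmlsoap.org/wsdl/" xmlns:xsd="http://www.w3.org/2001/XMLSchema">
  <wsdl:types>
    <xsd:schema targetNamespace="http://account.marketplace.services.adyen.com">
      <xsd:import namespace="http://common.services.adyen.com" />
      <xsd:complexType name="AccountHolderDetails">
        <xsd:sequence>
          <xsd:element minOccurs="0" name="address" nillable="true" type="ns1:Address" />
        </xsd:sequence>
      </xsd:complexType>
    </xsd:schema>
    <xsd:schema targetNamespace="http://common.services.adyen.com">
      <xsd:complexType name="Address">
        <xsd:sequence>
          <xsd:element minOccurs="0" name="city" nillable="true" type="xsd:string" />
        </xsd:sequence>
      </xsd:complexType>
    </xsd:schema>
  </wsdl:types>
</wsdl:definitions>
XML;

$doc = new DOMDocument();
$doc->loadXML($wsdl);
$xpath = new DOMXPath($doc);
$xpath->registerNamespace('xsd', XSD_NS);

// Step 1: set the schemas to load.
$schemas = iterator_to_array($xpath->query('//xsd:schema'));

// Step 2: load every schema, registering type names only (no references yet).
$types = [];
foreach ($schemas as $schema) {
    $ns = $schema->getAttribute('targetNamespace');
    foreach ($xpath->query('xsd:complexType[@name]', $schema) as $type) {
        $types['{' . $ns . '}' . $type->getAttribute('name')] = true;
    }
}

// Step 3: resolve references now that all types are known.
// Assumes every type attribute is a prefixed QName such as "ns1:Address".
$unresolved = [];
foreach ($xpath->query('//xsd:element[@type]') as $element) {
    [$prefix, $local] = explode(':', $element->getAttribute('type'), 2);
    $ns = $element->lookupNamespaceURI($prefix);
    if ($ns !== XSD_NS && !isset($types['{' . $ns . '}' . $local])) {
        $unresolved[] = '{' . $ns . '}' . $local;
    }
}

var_dump($unresolved); // empty: ns1:Address resolves although it is defined later
```

Because references are only checked after every schema has been registered, the order of the <xsd:schema/> tags no longer matters in this sketch.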

@goetas goetas added the bug label Feb 16, 2017
@goetas goetas self-assigned this Feb 16, 2017
@holtkamp
Author

Thanks for diving into this at such short notice! That "multi-step parsing" seems a biggie indeed...

But please note that for the example snippet (where the types are defined before they are referred to), this 'big' issue does not apply. In that particular case, the already processed schema cannot be 'applied' / read properly:

  • the code looks for a schema with key /Users/user/workspace/payment-api-adyen/v1.wsdl
  • the code should look for a schema with key /Users/user/workspace/payment-api-adyen/v1.wsdl#1 (an index is appended to the documentURI); after the root schema, it is the second schema, so index = 1

I think we can conclude that there are two aspects:

  • impossible to 'find' an element of an already loaded schema, since referencing goes wrong => relatively small issue
  • impossible to refer to types that are not loaded yet => big issue, major effort. Probably prioritize the parsing steps? Definitions first, then references...

Disclaimer: I did the puzzling late at night, so my issue description might not have been so well structured 😨

@goetas
Member

goetas commented Feb 16, 2017

Thanks for taking the time to elaborate on my analysis.

> impossible to 'find' an element of an already loaded schema, since referencing goes wrong => relatively small issue

Types are loaded by namespace and name pair, not by path. The path is used only in error messages and to avoid loading the same file twice (which improves performance). So instead of documentURI and index, we can use any random string.

> impossible to refer to types that are not loaded yet => big issue, major effort. Probably prioritize the parsing steps? Definitions first, then references...

Exactly!

> Disclaimer: I did the puzzling late at night, so my issue description might not have been so well structured

I did it in the early morning 😄

@holtkamp
Author

holtkamp commented Feb 16, 2017

> Types are loaded by namespace and name pair, not by path. The path is used just in error messages and to avoid re-loading the same file twice (this increases performance).

True for the loading of the types. But the schemas that have been processed are stored in an array, $loadedFiles (a local cache?), under a certain key. At https://github.com/goetas-webservices/wsdl-reader/blob/7ca312187f620c667f6199ad73ec4ad0c7c4217d/src/DefinitionsReader.php#L160 you can see that a numeric index ($k) is appended to this key: 0 for the first processed schema, then 1, 2, 3, etc. (this is where it goes wrong!)

Later on, this index makes it 'impossible' to look up the schema again for an <xsd:import/> (see the second schema in my example, which imports the previously defined schema) in the $loadedFiles array here:
https://github.com/goetas-webservices/xsd-reader/blob/0dec546c3d0b430fea142ec7eed3c9b17b09f8f9/src/SchemaReader.php#L654

So I suggested appending the target namespace instead of an index number:

DefinitionsReader::loadTypes() {
    ...
    // Key the schema by document URI plus target namespace instead of a numeric index.
    $key = $childNode->ownerDocument->documentURI . '#' . $childNode->getAttribute('targetNamespace');
    $schema = $this->reader->readNode($childNode, $key);
    $definitions->getSchema()->addSchema($schema);
}

SchemaReader::loadImport() {
    ...
    // Rebuild the same key from the import's "namespace" attribute to find the loaded schema.
    elseif (isset($this->loadedFiles[$file . '#' . $node->getAttribute('namespace')])) {
        $schema->addSchema($this->loadedFiles[$file . '#' . $node->getAttribute('namespace')]);
    ...
    }
}

I am just not sure whether that loading mechanism with the namespace appended always works... 😄

@goetas
Member

goetas commented Feb 16, 2017

The approach is nice, but unfortunately the import is parsed in the first loop iteration, while the "http://common.services.adyen.com" schema would only be parsed in the second. That never happens, since the exception is thrown in the first iteration of the loop:

DefinitionsReader::loadTypes(){
   ...
   $key = $childNode->ownerDocument->documentURI . "#" .$childNode->getAttribute('targetNamespace');
   $schema = $this->reader->readNode($childNode,  $key);
   $definitions->getSchema()->addSchema($schema);
}

@holtkamp
Author

doh! Sorry.

I found out I changed my testing WSDL snippet: first define the types, THEN refer to them:

<?xml version="1.0" encoding="UTF-8"?>
<wsdl:definitions xmlns:ns1="http://common.services.adyen.com" xmlns:wsdl="http://schemas.xmlsoap.org/wsdl/" xmlns:xsd="http://www.w3.org/2001/XMLSchema" targetNamespace="http://account.marketplace.services.adyen.com">
  <wsdl:types>
   <!-- first define the types! -->
    <xsd:schema xmlns:xsd="http://www.w3.org/2001/XMLSchema" attributeFormDefault="qualified" elementFormDefault="qualified" targetNamespace="http://common.services.adyen.com">
      <xsd:complexType name="Address">
        <xsd:sequence>
          <xsd:element minOccurs="0" name="city" nillable="true" type="xsd:string" />
        </xsd:sequence>
      </xsd:complexType>
    </xsd:schema>

    <!-- now we can reference them -->
    <xsd:schema xmlns:xsd="http://www.w3.org/2001/XMLSchema" attributeFormDefault="qualified" elementFormDefault="qualified" targetNamespace="http://account.marketplace.services.adyen.com">
      <xsd:import namespace="http://common.services.adyen.com" />
      <xsd:complexType name="AccountHolderDetails">
        <xsd:sequence>
          <xsd:element minOccurs="0" name="address" nillable="true" type="ns1:Address" />
        </xsd:sequence>
      </xsd:complexType>
    </xsd:schema>
  </wsdl:types>
</wsdl:definitions>

That works with the suggested fix: no more exception is thrown and PHP classes + YAML metadata files are generated properly. Indeed this is not a fix for the second, bigger problem where types are used before they are defined...

@goetas
Member

goetas commented Feb 16, 2017

Yes, I agree. If you can change the order of the schemas in your WSDL, your approach will work! I will be happy to merge a PR fixing it.

Regarding the second, bigger problem, we can put it in the backlog 😄

@holtkamp
Author

holtkamp commented Feb 17, 2017

> you can change the order of the schemas in your WSDL

Yeah, tried that. The problem is that there is no possible sequence that does not trigger the error. I will isolate and prepare the XML Schemas using some best practices to avoid this problem.
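
One situation in which no ordering can work is when schemas import each other. As a rough sketch (assuming that is what happens here, which the thread does not confirm), a depth-first ordering over the import graph either yields a usable sequence or reports a cycle. `orderSchemas()` and the `urn:` namespaces are hypothetical and not taken from the Adyen WSDL.

```php
<?php
/**
 * Returns the namespaces in an order where every import precedes its importer,
 * or null when the imports are circular.
 * $imports maps a namespace to the list of namespaces it imports.
 */
function orderSchemas(array $imports): ?array
{
    $ordered = [];
    $state = []; // namespace => 'visiting' | 'done'

    $visit = function (string $ns) use (&$visit, &$ordered, &$state, $imports): bool {
        if (($state[$ns] ?? null) === 'done') {
            return true;
        }
        if (($state[$ns] ?? null) === 'visiting') {
            return false; // back edge: circular import
        }
        $state[$ns] = 'visiting';
        foreach ($imports[$ns] ?? [] as $dependency) {
            if (!$visit($dependency)) {
                return false;
            }
        }
        $state[$ns] = 'done';
        $ordered[] = $ns;
        return true;
    };

    foreach (array_keys($imports) as $ns) {
        if (!$visit($ns)) {
            return null;
        }
    }
    return $ordered;
}

// Hypothetical import graphs for illustration only.
$acyclic = ['urn:account' => ['urn:common'], 'urn:common' => []];
$cyclic = ['urn:account' => ['urn:common'], 'urn:common' => ['urn:account']];

var_dump(orderSchemas($acyclic), orderSchemas($cyclic));
```

For the acyclic graph the imported namespace comes first; for the cyclic one the function returns null, which is the case where reordering the <xsd:schema/> tags cannot help.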

> I will be happy to merge a PR fixing it.

OK, will try to!

> Regarding the second bigger problem, we can put it in backlog

I was thinking about a viable approach. What about this, instead of invoking all callbacks directly:

  • gather all callbacks (only traverse the import/include statements to acquire them all, reach all nodes in the tree)
  • sort them in such a way that either:
    • type definitions are processed first
    • the element definitions are processed last
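
The idea in the list above could look roughly like this. It is a sketch under assumptions, not the library's callback mechanism: the `PRIORITY_*` constants and the `[priority, callback]` pairs are hypothetical, and arrow functions require PHP 7.4 or later.

```php
<?php
// Hypothetical priorities: lower values run first.
const PRIORITY_TYPE = 0;
const PRIORITY_ELEMENT = 1;

$types = [];     // resolved type registry, keyed by "{namespace}name"
$elements = [];  // element name => resolved type key
$callbacks = []; // deferred work, gathered in document order

// Gathered first, because the referring schema comes first in the WSDL.
$callbacks[] = [PRIORITY_ELEMENT, function () use (&$types, &$elements) {
    $key = '{http://common.services.adyen.com}Address';
    if (!isset($types[$key])) {
        throw new RuntimeException("Can't find type named $key");
    }
    $elements['address'] = $key;
}];

// Gathered second, although the element above depends on it.
$callbacks[] = [PRIORITY_TYPE, function () use (&$types) {
    $types['{http://common.services.adyen.com}Address'] = 'complexType';
}];

// Sort so that type definitions are processed before element definitions.
usort($callbacks, fn (array $a, array $b) => $a[0] <=> $b[0]);

foreach ($callbacks as [, $callback]) {
    $callback();
}

var_dump($elements);
```

Invoking the callbacks in gathering order would throw, while the sorted run registers the Address type before the address element asks for it.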

Did you get inspired by other libraries for parsing XML Schemas? Or... homebrew 😄

@goetas
Member

goetas commented Feb 17, 2017

Regarding the approach you suggested, more or less that was the way I had in mind too.

Where did I get inspired? This set of XML-based projects started more than 10 years ago, and I was also not so experienced... so it is just a lot of hours spent coding, coding and... coding.

The first version still depended on the DOM, even after parsing... which was a really bad idea 😄
And this is the latest available state. It already has some drawbacks: there is no difference between sequence and choice, and there is no way to express two elements with the same name but different types.

After so much time, my conclusion is that XSD was a great idea, but overcomplicated by trying to support every single micro case that could be expressed in XML. Such complexity was not trivial at all to implement in parsers and other tools... especially 10 years ago. With SOAP, the overcomplication got even bigger.

Now JSON folks are re-implementing for JSON everything that was already built for XML, because XML became so complicated that it takes years to write a parser... or to build something on top of it.
JSON folks decided to drop a lot of the "cool" stuff that XML was offering (even if maybe nobody really needed it). Sometimes I see some "overcomplications" in JSON too... but definitely fewer than in XSD.

Homebrew? Never read a single line of that project 😄

@holtkamp
Author

> homebrew

Haha, I did not mean the project, but "did you make it all yourself?". Apparently so! I recognize the constant change of ideas and strategies over the years! I guess this "we might need multi-step parsing" is another one of those bumps in the road ahead 😄

I guess your contribution / work on JMS Serializer also brought new insight on how to design a parser... Thanks anyway!
