Commit

Fix Resolve Schema Location for xsi:SchemaLocation in Config files
- Currently, due to a bug in Xerces, xsi:schemaLocation URIs are absolutized. This causes resolveCommon to not set resolvedUri to the path it resolves from the namespace, since the absolutized URI doesn't match the end of resolvedUri. This fix converts systemId to a URI and uses its path when comparing with resolvedUri, which results in a successful resolution.
- add comment explaining systemIdPath check change
- add schema location to config to prove it doesn't cause failure

DAFFODIL-2339
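
To illustrate the comparison described in the commit message, here is a minimal standalone sketch. The object name `SystemIdPathSketch` and the helper `matchesResolved` are hypothetical and exist only for this illustration; they are not part of Daffodil. The `systemIdPath` logic is modeled on the change in this commit, and the example URIs are the ones used in the code comment.

```scala
import java.net.URI

// Hypothetical, illustration-only object; not part of Daffodil.
object SystemIdPathSketch {

  // Modeled on the systemIdPath logic in this commit: if the systemId is a
  // file: URI (as produced when Xerces absolutizes it), use only its path;
  // otherwise use the systemId unchanged.
  // Note: new URI(...) throws URISyntaxException for strings that are not
  // valid URIs (for example, ones containing spaces).
  def systemIdPath(systemId: String): String = {
    val uri = if (systemId != null) new URI(systemId) else null
    if (uri != null && uri.getScheme == "file") uri.getPath else systemId
  }

  // Hypothetical helper mirroring the endsWith condition used for resolvedId.
  def matchesResolved(systemId: String, resolvedUri: String): Boolean =
    resolvedUri != null &&
      (systemId == null || resolvedUri.endsWith(systemIdPath(systemId)))
}
```

With the absolutized `file:/some/path.xsd`, comparing the raw systemId against `file:/some/absolute/path/to/some/path.xsd` fails, while comparing only the path `/some/path.xsd` succeeds. A relative systemId has no scheme, so it passes through unchanged.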
olabusayoT committed Oct 29, 2024
1 parent 36f2d87 commit e5851a7
Showing 2 changed files with 36 additions and 22 deletions.
@@ -216,12 +216,41 @@ class DFDLCatalogResolver private ()
// because the nsURI will resolve to the including schema file.
// This will cause the including schema to be repeatedly parsed resulting in a stack overflow.

+ lazy val systemIdUri = if (systemId != null) {
+ new URI(systemId)
+ } else {
+ null
+ }
+
+ /**
+ * Xerces has a bug where it absolutizes systemId, i.e., the user supplies
+ * {{{
+ * <xs:schema...
+ * ... xsi:schemaLocation="urn:some:namespace /some/path.xsd"
+ * }}}
+ * Xerces takes that schemaLocation URI and absolutizes it to {{{ file:/some/path.xsd }}}
+ * and passes that to our resolveEntity and in turn resolveCommon, which, while it's able
+ * to find the namespace, fails to set the resolvedUri since the file:/some/path.xsd will
+ * never match anything resolved from our catalog since that'd return something like
+ * {{{ file:/some/absolute/path/to/some/path.xsd }}}
+ *
+ * This is a workaround to that bug where we convert systemId to a URI and check if the
+ * path (from URI.getPath) matches the end of resolvedUri. Note: This can ignore absolute
+ * URIs passed in for schemaLocation, but those are edge cases where the user expects
+ * the namespace to match a different file (i.e., what they provide in the schemaLocation)
+ * than what we find in the catalog.
+ */
+ lazy val systemIdPath = if (systemIdUri != null && systemIdUri.getScheme == "file") {
+ systemIdUri.getPath
+ } else {
+ systemId
+ }
val resolvedId = {
if (resolvedSystem != null && resolvedSystem != resolvedUri) {
resolvedSystem
} else if (
resolvedUri != null && ((systemId == null) || (systemId != null && resolvedUri.endsWith(
- systemId
+ systemIdPath
)))
) {
resolvedUri
@@ -698,7 +727,8 @@ class DaffodilXMLLoader(val errorHandler: org.xml.sax.ErrorHandler)

// We must use XMLReader setProperty() function to set the entity resolver--calling
// setEntityResolver with the Xerces XML reader causes validation to fail for some
- // reason. We call the right function below, but unfortunately, scala-xml calls
+ // reason (we get a "cvc-elt.1.a: Cannot find the declaration of element 'schema'" error).
+ // We call the right function below, but unfortunately, scala-xml calls
// setEntityResolver in loadDocument(), which cannot be disabled and scala-xml does not
// want to change. To avoid this, we wrap the Xerces XMLReader in an XMLFilterImpl and
// override setEntityResolver to a no-op. However, XMLFilterImpl parse() calls
@@ -15,26 +15,10 @@
See the License for the specific language governing permissions and
limitations under the License.
-->
-
- <!--
- Note: Bug DAFFODIL-2339
- We'd like to have these schemaLocation attributes on daf:dfdlConfig, but this breaks tests because
- it tries to load the schema from the schemaLocation, and can't resolve org/apache/daffodil/xsd/dafext.xsd.
- Simple things like adding an sbt dependency from daffodil-cli back to daffodil-lib, whether always or "it->test"
- dependent, don't fix this.
- The CLI is using DaffodilXMLLoader to load this config file, so the resolver should be doing the right thing by
- finding this dafext.xsd on the class path inside of daffodil-lib's jar.
- The failure seems to happen earlier. A SAX fatal error is invoked when looking up dfdl:anyOther a symbol
- in the dfdl schema annotation schemas, from the XMLSchema_for_DFDL.
- xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
- xsi:schemaLocation="urn:ogf:dfdl:2013:imp:daffodil.apache.org:2018:ext org/apache/daffodil/xsd/dafext.xsd"
- -->
- <daf:dfdlConfig xmlns:daf="urn:ogf:dfdl:2013:imp:daffodil.apache.org:2018:ext">
+ <daf:dfdlConfig xmlns:daf="urn:ogf:dfdl:2013:imp:daffodil.apache.org:2018:ext"
+ xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
+ xsi:schemaLocation="urn:ogf:dfdl:2013:imp:daffodil.apache.org:2018:ext /org/apache/daffodil/xsd/dafext.xsd"
+ >
<daf:externalVariableBindings xmlns="http://www.w3.org/2001/XMLSchema"
xmlns:ex="http://example.com">
<daf:bind name="ex:var1">-9</daf:bind>