Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Bump org.xmlresolver:xmlresolver from 5.2.3 to 6.0.9 #5419

Closed

Conversation

dependabot[bot]
Copy link
Contributor

@dependabot dependabot bot commented on behalf of github Aug 20, 2024

Bumps org.xmlresolver:xmlresolver from 5.2.3 to 6.0.9.

Release notes

Sourced from org.xmlresolver:xmlresolver's releases.

6.0.9

No release notes provided.

6.0.8

Restores the NamespaceResolver API. This API, like the other shims for 5.x backwards compatibility, is deprecated, but it does make version 6.0.8 a better drop-in replacement for version 5.x.

6.0.7

Removes two external dependencies. This makes the 6.x resolver release a drop-in replacement for the 5.x release in most cases.

6.0.6

This is a maintenance release of the 6.0.x resolver, which is still in "beta".

The following changes are included:

  • Improvements to support for UNC paths

    Documents and catalogs that use Windows UNC paths are better supported if the FIX_WINDOWS_SYSTEM_IDENTIFIERS feature is enabled.

  • Support for Java 21

    The build still produces class (and jar) files that will work with any version of Java back to Java 8. However, the build itself now uses Java 21. This closes issue #173.

  • Refactor the SAX entity resolvers

    The SAX EntityResolver and EntityResolver2 interfaces are now implemented on separate objects. This closes issue #183.

  • Fix namespace-based lookup in the DOM

The LSResourceAdapter API used when resolving documents for the DOM did not correctly handle XML Schema validation. It was failing to lookup based on the namespace. This closes issue #180.

  • Improved support for concurrency

    The catalog lookup code was not sufficiently careful about multi-threaded access. My thanks to JFK-DXML for the patch. This closes issue #182.

  • Reworked the FIX_WINDOWS_SYSTEM_IDENTIFIERS feature

    This feature now applies irrespective of platform. This will enable Windows documents and catalogs to work correctly even on non-Windows systems. This closes issue #184.

A number of smaller issues (testing and build system issues and issues not expected to be user visible) have also been corrected. See the commit log for more details.

6.0.4

No release notes provided.

6.0.3

This release introduces (deprecated) APIs that implement some of the most common features from the V5.x APIs. This should allow version 6.x to substitute for version 5.x until dependent libraries can be updated.

It also fixes a bug where classpath: URIs were not being masked.

6.0.2

No release notes provided.

... (truncated)

Commits

Dependabot compatibility score

You can trigger a rebase of this PR by commenting @dependabot rebase.


Dependabot commands and options

You can trigger Dependabot actions by commenting on this PR:

  • @dependabot rebase will rebase this PR
  • @dependabot recreate will recreate this PR, overwriting any edits that have been made to it
  • @dependabot merge will merge this PR after your CI passes on it
  • @dependabot squash and merge will squash and merge this PR after your CI passes on it
  • @dependabot cancel merge will cancel a previously requested merge and block automerging
  • @dependabot reopen will reopen this PR if it is closed
  • @dependabot close will close this PR and stop Dependabot recreating it. You can achieve the same result by closing it manually
  • @dependabot show <dependency name> ignore conditions will show all of the ignore conditions of the specified dependency
  • @dependabot ignore this major version will close this PR and stop Dependabot creating any more for this major version (unless you reopen the PR or upgrade to it yourself)
  • @dependabot ignore this minor version will close this PR and stop Dependabot creating any more for this minor version (unless you reopen the PR or upgrade to it yourself)
  • @dependabot ignore this dependency will close this PR and stop Dependabot creating any more for this dependency (unless you reopen the PR or upgrade to it yourself)

Note
Automatic rebases have been disabled on this pull request as it has been open for over 30 days.

Bumps [org.xmlresolver:xmlresolver](https://github.com/xmlresolver/xmlresolver) from 5.2.3 to 6.0.9.
- [Release notes](https://github.com/xmlresolver/xmlresolver/releases)
- [Commits](xmlresolver/xmlresolver@5.2.3...6.0.9)

---
updated-dependencies:
- dependency-name: org.xmlresolver:xmlresolver
  dependency-type: direct:production
  update-type: version-update:semver-major
...

Signed-off-by: dependabot[bot] <[email protected]>
@dependabot dependabot bot requested a review from a team as a code owner August 20, 2024 03:43
@dependabot dependabot bot added dependencies Pull requests that update a dependency file java Pull requests that update Java code labels Aug 20, 2024
@dizzzz
Copy link
Member

dizzzz commented Aug 23, 2024

still not a drop-in replacement @ndw

@ndw
Copy link

ndw commented Aug 23, 2024

There's a lot of output there. Any chance this can be narrowed down a little bit? I see what appear to be a bunch of attempts to pull in xml-resolver version 1.2. I wonder if there might be classpath ordering issues?

A possibly relevant failing test appears to be org.exist.xquery.functions.validate.JaxpDtdCatalogTest.dtd_anyURI_catalog_valid. What does it do and how does it fail?

@dizzzz
Copy link
Member

dizzzz commented Aug 23, 2024

@ndw yes sorry I have been isolating the situation already. I will provide the relevant details

@dizzzz
Copy link
Member

dizzzz commented Aug 24, 2024

Interestingly, version 6.0.9 did actually fix an issue where the xmlresolver did throw an NPE in "ResourceAccess.java", line 42 and/or 82 ; these test now pass, that is good :-)

image

In the current issues, the resolver can't find the a DTD, which is stored in the database:

    <message level="Warning" line="25" column="4">schema_reference.4: 
Failed to read schema document 'xmldb:/db/parse/instance/valid-dtd.xml', because 
1) could not find the document; 
2) the document could not be read; 
3) the root element of the document is not &lt;xsd:schema&gt;.</message>

I got this visible by adding a system.println to https://github.com/dizzzz/exist/blob/4cd81b6d44218879e27f549e51b54a2bb4c9cca0/exist-core/src/test/java/org/exist/xquery/functions/validate/JaxpDtdCatalogTest.java#L163

The interesting thing here -for now- is that '/db/parse/instance/valid-dtd.xml' is identified as a schema document whilst it is a XML document that should be validated with a dtd...

Data is found on https://github.com/eXist-db/exist/blob/b1fc56ef202d186d1ad15040c22a6b5c84b7e9a8/exist-samples/src/main/resources/org/exist/samples/validation/parse

queries:

validation:jaxp-report( xs:anyURI('/db/parse/instance/valid-dtd.xml'), 
false(),
doc('/db/parse/catalog.xml') )
validation:jaxp-report( xs:anyURI('/db/parse/instance/valid-dtd.xml'), 
false(),
xs:anyURI('/db/parse/catalog.xml') )

The data can be easily uploaded using the java client

`bin/client.sh -l' (embedded mode)

image

query can be run via the 'binoculars'

image

catalog in /db/parse/...

<catalog xmlns="urn:oasis:names:tc:entity:xmlns:xml:catalog">
    <uri name="MyNameSpace" uri="schemas/MyNameSpace.xsd" />
    <uri name="AnotherNamespace" uri="schemas/AnotherNamespace.xsd" />
    
    <public publicId="MNS" uri="dtds/MyNameSpace.dtd"/>
</catalog>

valid-dtd.xml

<!DOCTYPE A PUBLIC "MNS" "MyNameSpace.dtd">
<A>
    <b>b0</b>
    <c>0</c>
    <Z>
        <X>50</X>
        <Y>2006-05-04T18:13:51.0Z</Y>
    </Z>
</A>

@ndw
Copy link

ndw commented Aug 24, 2024

Can you make sure you've turned off ResolverFeature.ALWAYS_RESOLVE? Unless you've integrated support for the xmldb: URI scheme deeply into the JVM in some magic way, the resolver is going to fail if it tries to read that resource.

That feature is enabled by default because otherwise HTTP redirects cause the parser to choke. ☹️

@dizzzz
Copy link
Member

dizzzz commented Aug 24, 2024

We did integrate resolving that url with a custom url resolver indeed. That works for 15+ years or so..

@dizzzz
Copy link
Member

dizzzz commented Aug 24, 2024

What if... we could inject a custom protocol resolver into the xmlresolver ?

@ndw
Copy link

ndw commented Aug 25, 2024

If ALWAYS_RESOLVE is true, which it is by default, then it will always try to resolve the resource and if it tries to resolve an xmldb: URI, it's going to fail. First question is, if you turn that feature off, does it work?

I'm not opposed to supporting custom URI resolvers. How does that work? I quick web search didn't turn up any obvious answers.

@dizzzz
Copy link
Member

dizzzz commented Aug 25, 2024

Some explanation to be found here, our implementation here. I will check if I can intercept here , but for XML/xsds the mechanism/resolving seems to work.....

I will try ResolverFeature.ALWAYS_RESOLVE ...

@dizzzz
Copy link
Member

dizzzz commented Aug 25, 2024

The tests indeed pass if I set System.setProperty("xml.catalog.alwaysResolve", "false"); ..

@dizzzz
Copy link
Member

dizzzz commented Aug 25, 2024

The other issues are now visible (again):

[ERROR] Tests run: 3, Failures: 2, Errors: 0, Skipped: 0, Time elapsed: 1.572 s <<< FAILURE! -- in org.exist.validation.CollectionConfigurationValidationModeTest
[ERROR] org.exist.validation.CollectionConfigurationValidationModeTest.insertModeAuto -- Time elapsed: 0.378 s <<< FAILURE!
java.lang.AssertionError: URI must not be null in getResource
	at org.junit.Assert.fail(Assert.java:89)
	at org.exist.validation.CollectionConfigurationValidationModeTest.insertModeAuto(CollectionConfigurationValidationModeTest.java:175)
	at java.base/java.lang.reflect.Method.invoke(Method.java:569)

[ERROR] org.exist.validation.CollectionConfigurationValidationModeTest.insertModeTrue -- Time elapsed: 0.111 s <<< FAILURE!
java.lang.AssertionError: URI must not be null in getResource
	at org.junit.Assert.fail(Assert.java:89)
	at org.exist.validation.CollectionConfigurationValidationModeTest.insertModeTrue(CollectionConfigurationValidationModeTest.java:131)
	at java.base/java.lang.reflect.Method.invoke(Method.java:569)

Code test location here.

This boils down to the throw new NullPointerException("URI must not be null in getResource"); in ResourceAccess.java which I saw in earlier versions of 6.x as well. 5.x does not have issues (class is new?).

@ndw
Copy link

ndw commented Sep 10, 2024

What we need to figure out here is how we arrive at getResource() with a null URI. That shouldn't happen.

I cloned exist and am trying to get it to build so I can run that test under the debugger. It is not going smoothly.

@adamretter
Copy link
Contributor

I cloned exist and am trying to get it to build

@ndw I usually run:

mvn -V -T2C clean package -DskipTests -Ddependency-check.skip=true -Dappbundler.skip=true -Ddocker=false -P \!mac-dmg-on-mac,\!codesign-mac-app,\!codesign-mac-dmg,\!mac-dmg-on-unix,\!installer,\!concurrency-stress-tests,\!micro-benchmarks,skip-build-dist-archives

@ndw
Copy link

ndw commented Sep 10, 2024

Thanks, Adam. That seemed to configure things so that I could build it. (I'm not sure why it builds given that I can't see where org.exist.xquery.parser.XQueryLexer is and there's an import for it, but nevermind...)

FWIW, insertModeTrue() passes for me. So there's more to the problem.

If you can reproduce the problem, but I can't, then I guess a stack trace would be the next most useful thing.

@adamretter
Copy link
Contributor

adamretter commented Sep 10, 2024

I can't see where org.exist.xquery.parser.XQueryLexer

@ndw This is Java code that is generated from Antlr v2 code during the build. The Antlr v2 source code is in exist-core/src/main/antlr/org/exist/xquery/parser/

If you can reproduce the problem, but I can't, then I guess a stack trace would be the next most useful thing.

I think @dizzzz was looking into this... but I think he is on holiday for a couple of weeks right now.

@ndw
Copy link

ndw commented Sep 10, 2024

(re: XQueryLexer, that makes sense. I guess my IntelliJ project didn't figure that out. But it builds so I don't think it matters.)

Okay. Holidays are a good thing. But I don't think I can do much more here until I can reproduce the problem or there's some way to get more detail. It's tempting to just return null instead of throwing the exception, but it would be nice to understand how we got here. AFAICT, the only way to get this result is to construct a request that has neither a resolved URI nor a request URI, then attempt to get it. That shouldn't happen.

@ndw
Copy link

ndw commented Sep 11, 2024

Rookie error. I was running it on the wrong branch,.

I can now reproduce the problem, but only if the ALWAYS_RESOLVE feature is set to true.

And now that I look back at the comments, I'm confused. First @dizzzz says that the tests pass if that feature is set to false. Then in the next comment, he says other issues are visible again and points to this test.

Because you have custom URI schemes in the mix, I don't think the XML Resolver will succeed unless you set the ALWAYS_RESOLVE feature to false. If it works when that feature is false, I think you're good to go. If it fails when that feature is set to true, then set it to false :-)

If you need it to be true for some reason, or if you can't control how it's set, then we'll need to work out some API where you can tell the resolver how to find resources with your custom URI scheme.

@dizzzz
Copy link
Member

dizzzz commented Sep 13, 2024

Thank you Norm, for looking into it so far. Unfortunately I don't have access to a laptop for the coming 2 weeks, so I can't experiment myself.

That said...

I have never been a fan of the Custom URL Resolver (which I introduced a long long time ago). In the past it gave issues for exist running in tomcat, as tomcat did already the same thing, and it is only possible to register one, once, to a JVM.

We need to have this URL resolver as there is no other way to have 3rd party libs (Saxon, Xalan, XProc, XMLresolver,... ) working with in eXist stored resources.

Actually (a long time ago) I already have been thinking of extending the original resolver with a pluggable URL resolver, to solve things without registering this custom resolver.

I would be in favor of having such an API as you suggested as it makes the resolver more versatile, e.g. it would be possible to introduce an extension for commons-vfs ...

https://commons.apache.org/proper/commons-vfs/filesystems.html

The xmldb:exist:/// one (maybe to be simplified as exist:// - another option, but with impact) would be then just another one....

@ndw
Copy link

ndw commented Sep 13, 2024

I think there are two separate questions here. First, under what circumstances are you seeing the bug reported above about a null URI in getResource(). I can't reproduce that with ALWAYS_RESOLVE set to false and I can't tell from the comments if you can or not.

Second, a mechanism for registering custom URI resolvers might be possible. I'll give that a think.

(Enjoy your vacation! Talk to you in a couple of weeks!)

@ndw
Copy link

ndw commented Sep 13, 2024

Custom scheme resolvers turned out to be not too hard, xmlresolver/xmlresolver#210 (comment)

Copy link
Contributor Author

dependabot bot commented on behalf of github Sep 23, 2024

Superseded by #5448.

@dependabot dependabot bot closed this Sep 23, 2024
@dependabot dependabot bot deleted the dependabot/maven/org.xmlresolver-xmlresolver-6.0.9 branch September 23, 2024 03:06
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
dependencies Pull requests that update a dependency file java Pull requests that update Java code
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants