Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Improve alternative index handling. #124

Draft
wants to merge 4 commits into
base: main
Choose a base branch
from

Conversation

Carreau
Copy link
Contributor

@Carreau Carreau commented Jul 21, 2024

This improve the handling of indexes and error messages.

Before this PR all errors reaching an index would be ValueError, thus not finding a wheel was the same as the index page having a non-handled content type, or even failing to parse the html because of a typo.

This made it hard to debug, or even realize that an index url was wrong, or that the pages served by the index were incorrect.

I thus introduce 3 New Errors that do not inherit from ValueError:

  • IndexMetadataFetchError
  • UnsupportedParserContentTypeError
  • WheelNotFoundError

The first two of which trigger real error, as they likely suggest a problem with the index or the user config, and should not be ignored.

In addition the verbose= option was unconditionally setting the log level. I think this is improper as if the logging level is set somewhere else (to DEBUG, or something else), then it is overwritten.

  • This then adds the option to pass verbose=None, (and make it the default), in which case it will not change the default.

python -m http.server will serve ContentType with ; charset=utf-8, which is not recognized.

  • This now handle the case where the index ContentType header contains parameters for example ; charset=utf-8, by discarding everything after the semicolon ;; this is not proper parsing, but should be sufficient here.

  • I found that more debug logging would be useful and added a number of debug logs calls

With this I appear to get proper error message and debug statements when trying to work on the Cors proxy.

Should close #123, and #121, and help with pyodide/pyodide#4898

This improve the handling of indexes and error messages.

Before this PR all errors reaching an index would be ValueError, thus
not finding a wheel was the same as the index page having a non-handled
content type, or even failing to parse the html because of a typo.

This made it hard to debug, or even realize that an index url was wrong,
or that the pages served by the index were incorrect.

I thus introduce 3 New Errors that do not inherit from `ValueError`:
 - IndexMetadataFetchError
 - UnsupportedParserContentTypeError
 - WheelNotFoundError

The first two of which trigger real error, as they likely suggest a
problem with the index or the user config, and should not be ignored.

In addition the `verbose=` option was unconditionally setting the log
level. I think this is improper as if the logging level is set somewhere
else (to DEBUG, or something else), then it is overwritten.

- This then adds the option to pass `verbose=None`, (and make it the
  default), in which case it will not change the default.

python -m http.server will serve ContentType with `; charset=utf-8`,
which is not recognized.

- This now handle the case where the index ContentType header
  contains parameters for example `; charset=utf-8`, by discarding
  everything after the semicolon `;`; this is not proper parsing, but
  should be sufficient here.

 - I found that more debug logging would be useful and added a number
   of debug logs calls

With this I appear to get proper error message and debug statements
when trying to work on the Cors proxy.

Should close pyodide#123, and pyodide#121, and help with pyodide/pyodide#4898
@Carreau Carreau marked this pull request as draft July 21, 2024 12:27
@Carreau
Copy link
Contributor Author

Carreau commented Jul 21, 2024

There are still a few case that needs handling, in particular when fetching for a given wheel and the page does not exists, is it a problem with the index, or does the package really not exists.

I think I might have to tweak the behavior a bit more.

@Carreau
Copy link
Contributor Author

Carreau commented Jul 21, 2024

Ok, one of the issues is that in particular PYPI does not set CORS on 404, therefore we get an OS error when trying to fetch a non-existing page.

I'm wondering if this is a bug upstream

@Carreau Carreau marked this pull request as ready for review July 22, 2024 09:42
@hoodmane
Copy link
Member

Thanks @Carreau.

PYPI does not set CORS on 404

This does seem to be a warehouse bug.

@Carreau
Copy link
Contributor Author

Carreau commented Jul 23, 2024

Thanks @Carreau.

PYPI does not set CORS on 404

This does seem to be a warehouse bug.

Yep and it's known pypi/warehouse#14229

Copy link
Member

@ryanking13 ryanking13 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks. Looks good to me overall. Could you update the changelog and fix the failing tests? I think you need to update the Error type here

micropip/package_index.py Outdated Show resolved Hide resolved
@Carreau
Copy link
Contributor Author

Carreau commented Jul 23, 2024

Yeah, I'm looking at understanding the failing tests.

I'd like to figure out how to both get proper error messages if the index is invalid, and just keep going on 404, but because of PyPI not setting CORS on 404 I don't know if it's currently possible.

So I'm wondering if i I should bake a an boolean for indexes no_cors_on_404 or similar. Otherwise we'll get back to original behavior of swallowing almost every errors (including parse error). I'm not even sure the case where an index return a real 404 with proper cors is properly handled, and does not raise a parse error as query_package only error handing is OSError and does not check response status code, or does not use raises_for_status.

(I also have limited time, so might not touch this that much by next week).

@Carreau Carreau marked this pull request as draft July 23, 2024 13:02
@hoodmane
Copy link
Member

We should look into fixing the warehouse bug. I tried a little bit but I didn't get the warehouse test suite running.

@ryanking13
Copy link
Member

Yep, +1 for fixing the warehouse bug. Let me see if I can understand and modify the warehouse codebase.

@Carreau
Copy link
Contributor Author

Carreau commented Jul 29, 2024

Also pyodide/pyodide#4974 will be relevant for later refactor.

@Carreau
Copy link
Contributor Author

Carreau commented Jul 29, 2024

crosslinking pypi/warehouse#16339 as well.

@Carreau
Copy link
Contributor Author

Carreau commented Jul 30, 2024

So should I assume that we'll always get micropip on the latest pyodide, or can we have conditional that depends on versions ?

@ryanking13
Copy link
Member

If there are some changes in Pyodide and if you want to use those features in the micropip, you are welcome to do so.

We try to keep the compatibility of micropip and Pyodide versions if possible, but it is not mandatory.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Verbose setting overwrite custom logger level.
3 participants