baseurl does not properly handle http GET query parameter #467

Open
SeanDougherty opened this issue Feb 29, 2024 · 5 comments

@SeanDougherty

SeanDougherty commented Feb 29, 2024

Describe the bug

When a baseurl is hosted behind access control, tdnf does not correctly append paths to the baseurl.

E.g.,
A package repository is hosted in a cloud service, and access to anything within the cloud storage is controlled by passing HTTP query parameters.
(For example, in the URL https://github.com/vmware/tdnf/issues/new?assignees=&labels=bug&projects=&template=bug-report.yml, the substring ?assignees=&labels=bug&projects=&template=bug-report.yml is the query string.)
In the cloud storage situation, the unauthenticated baseurl may look like: foo.cloudservice.storage.com/reporoot and the authenticated baseurl will look like: foo.cloudservice.storage.com/reporoot?<authenticationtoken>.

This authentication token must be appended to every HTTP GET request when interacting with the cloud storage service.

In TDNF, there is no handling of this query string, so the final path gets appended after it:

foo.cloudservice.storage.com/reporoot?<authenticationtoken>/repodata/repomd.xml
instead of
foo.cloudservice.storage.com/reporoot/repodata/repomd.xml?<authenticationtoken>

This prevents the use of tdnf with secured package repositories.

When a baseurl contains an HTTP query string, tdnf fails to retrieve repomd.xml and exits with a non-zero exit code.
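
The misplacement described above can be illustrated with a short sketch. It is written in Python for brevity (tdnf itself is written in C), `append_repo_path` is a hypothetical helper and not a tdnf function, and `sig=TOKEN` is a placeholder standing in for the real authentication token.

```python
from urllib.parse import urlsplit, urlunsplit


def append_repo_path(baseurl: str, relpath: str) -> str:
    """Append relpath to the path component of baseurl, keeping the query last.

    Hypothetical helper for illustration only; not part of tdnf.
    """
    parts = urlsplit(baseurl)
    # Join on the path component only, so the query string stays at the end.
    new_path = parts.path.rstrip("/") + "/" + relpath.lstrip("/")
    return urlunsplit(
        (parts.scheme, parts.netloc, new_path, parts.query, parts.fragment)
    )


# Placeholder token, assumed for this example.
baseurl = "https://foo.cloudservice.storage.com/reporoot?sig=TOKEN"

# Plain string concatenation: the path lands inside the query string.
naive = baseurl + "/repodata/repomd.xml"

# Component-aware join: the path is extended and the query is re-attached.
fixed = append_repo_path(baseurl, "repodata/repomd.xml")

print(naive)
print(fixed)
```

A baseurl without a query string passes through the same helper unchanged apart from the appended path, so unauthenticated repositories would not need a separate code path in this sketch.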

Reproduction steps

  1. Create a private container in Azure Storage (Please reach out if I can help with this)
  2. Create a .repo file that points to the authenticated baseurl of the storage account
  3. Attempt to call tdnf makecache
    ...

Expected behavior

tdnf makecache and all other tdnf commands should behave the same as with an unauthenticated baseurl.

Additional context

No response

@oliverkurth
Contributor

If I understand correctly, you have this configured in your *.repo file:

baseurl=https://foo.cloudservice.storage.com/reporoot?<authenticationtoken>

And tdnf just mindlessly appends /repodata/repomd.xml to it, yielding an invalid URL, correct?

@SeanDougherty
Author

This is correct! Thank you for the fast response 👍

@oliverkurth
Contributor

Thinking about this, I get more questions:

Let's say we have successfully fetched the metadata file repomd.xml. It will refer to additional metadata files which we also need to fetch. Do we also need to append the query params to those URLs, or will they already be baked in? And then again, we will get the locations of the packages, for which we need to construct the URLs. Do we again need to append the query params?

I am wondering how dnf handles this, but I was not able to find out.

We do have username/password authentication, see https://github.com/vmware/tdnf/wiki/Repository-Configuration#password . Does your server support that?

@SeanDougherty
Author

SeanDougherty commented Mar 4, 2024

> Thinking about this, I get more questions:
>
> Let's say we have successfully fetched the metadata file repomd.xml. It will refer to additional metadata files which we also need to fetch. Do we also need to append the query params to those URLs, or will they already be baked in? And then again, we will get the locations of the packages, for which we need to construct the URLs. Do we again need to append the query params?
>
> I am wondering how dnf handles this, but I was not able to find out.
>
> We do have username/password authentication, see https://github.com/vmware/tdnf/wiki/Repository-Configuration#password . Does your server support that?


Re: Initial question

Your hunch is correct: the query params would need to be used in every request to the storage endpoint (package location fetching, package fetching, etc.).
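
As a rough illustration of "every request", the sketch below carries the baseurl's query string over to each URL derived from repomd.xml. It is Python rather than tdnf's C, and everything in it is assumed for the example: the helper names, the trimmed repomd.xml content, the package path, and the `sig=TOKEN` placeholder. It does not describe how dnf or the patch mentioned in this thread handle the problem.

```python
import xml.etree.ElementTree as ET
from urllib.parse import urlsplit, urlunsplit

# XML namespace used by repomd.xml.
REPO_NS = {"repo": "http://linux.duke.edu/metadata/repo"}

# Trimmed, made-up repomd.xml content for illustration.
REPOMD = """<repomd xmlns="http://linux.duke.edu/metadata/repo">
  <data type="primary"><location href="repodata/primary.xml.gz"/></data>
  <data type="filelists"><location href="repodata/filelists.xml.gz"/></data>
</repomd>
"""


def metadata_locations(repomd_xml):
    """Return the href of every <location> element in a repomd.xml document."""
    root = ET.fromstring(repomd_xml)
    return [
        loc.get("href")
        for loc in root.findall("repo:data/repo:location", REPO_NS)
    ]


def authenticated_urls(baseurl, relpaths):
    """Yield one URL per relative path, each carrying the baseurl's query."""
    scheme, netloc, root, query, _fragment = urlsplit(baseurl)
    for rel in relpaths:
        path = root.rstrip("/") + "/" + rel.lstrip("/")
        yield urlunsplit((scheme, netloc, path, query, ""))


# Placeholder token and a hypothetical package path.
baseurl = "https://foo.cloudservice.storage.com/reporoot?sig=TOKEN"
relpaths = (
    ["repodata/repomd.xml"]
    + metadata_locations(REPOMD)
    + ["Packages/foo-1.0-1.x86_64.rpm"]
)

urls = list(authenticated_urls(baseurl, relpaths))
for url in urls:
    print(url)
```

Splitting the baseurl once and re-attaching the query per request keeps the token out of the path handling, which is the property the metadata and package fetches both rely on here.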

I am not familiar with how the password field behaves in tdnf. Will this accomplish the requested behavior?

For added context, the Azure Storage service does not readily use username/password (unless controlled by Azure CLI or PowerShell [Link]). Instead, a construct known as "SAS tokens" is used (these are the query params that I've alluded to). This [Link] provides more information on SAS tokens. Additionally, I'm happy to work with you 1:1 to generate a private storage account, protected by SAS tokens, that we can use to test functionality with TDNF.


Re: dnf

I do know that dnf handles this scenario, but I do not know exactly how. However, the size of the dnf package makes it an unattractive option, and tdnf much more interesting.


Additional thought

A colleague of mine has a working patch to the tdnf package that has been shown to solve this issue. I can work with him tomorrow afternoon and get a PR sent out for your review, if you are open to solutions.

With your permission, I would much rather upstream this fix than apply the patch to just my own tdnf package.

Thanks
Sean

@oliverkurth
Contributor

I think I understand how it works. I am not enthusiastic about how it works - HTTP has headers which could be used, and there are standardized authentication mechanisms. I don't know why Microsoft has to use query parameters.

I am very short on bandwidth right now, and this is a feature for which I am not aware of any other use cases, so I don't know when and if I am able to work on this. But if you or your colleague have a patch and it's of good quality I would accept that.
