Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

PublicKey id length #24

Open
oed opened this issue Nov 18, 2020 · 14 comments
Open

PublicKey id length #24

oed opened this issue Nov 18, 2020 · 14 comments

Comments

@oed
Copy link

oed commented Nov 18, 2020

Currently the id, specifically what comes after #, is the entire encoded public key. Is this really needed? Can they be made shorter by just using the first few characters of the pubkey?

@OR13
Copy link
Contributor

OR13 commented Sep 23, 2021

this is how the method works currently... it has advantages, IMO its not worth changing.

@oed
Copy link
Author

oed commented Sep 23, 2021

What's the advantages of using the full key?

@OR13
Copy link
Contributor

OR13 commented Sep 28, 2021

@oed makes string comparisons for keys possible.

I could live with making the id the JWK thumbprint... but I expect folks would object to that...

@oed
Copy link
Author

oed commented Sep 28, 2021

makes string comparisons for keys possible.

Can you elaborate on what this means? Comparing would still work if it's only the first 15 chars of the key?

@OR13
Copy link
Contributor

OR13 commented Sep 28, 2021

@oed https://datatracker.ietf.org/doc/html/rfc7638

TLDR ... not safe to just compare the first few character of a string in order to implement a binary quality check.

@oed
Copy link
Author

oed commented Sep 28, 2021

@OR13 should be safe if you use DID + id?

@OR13
Copy link
Contributor

OR13 commented Sep 28, 2021

@oed not if you use relative ref ids...

{
  "@context": [
    "https://www.w3.org/ns/did/v1",
    "https://w3id.org/security/suites/jws-2020/v1"
  ],
  "id": "did:key:z6Mkff5wEYRgDuuPzg4u5FYhvCzGHQDGGCHL8C9YNaEs2nwF",
  "verificationMethod": [
    {
      "id": "#z6Mk...",
      "type": "JsonWebKey2020",
      "controller": "did:key:z6Mkff5wEYRgDuuPzg4u5FYhvCzGHQDGGCHL8C9YNaEs2nwF",
      "publicKeyJwk": {
        "kty": "OKP",
        "crv": "Ed25519",
        "x": "EeM5xZGniNFEdHXYX6YhzDY9MEaVhdna_9c-HFaSqcY"
      }
    },

@OR13
Copy link
Contributor

OR13 commented Sep 28, 2021

FWIW I feel the same way as you do about this... But i think these are the only options:

1. use the full multiform for the fragment value 
`z6Mkff5wEYRgDuuPzg4u5FYhvCzGHQDGGCHL8C9YNaEs2nwF` (base58 encoded)

2. use a hash of the key for the fragment value 
`NzbLsXh8uDCcd-6MNwXF4W_7noWXFZAfHkxZsRGC9Xs` (base64url encoded)

3. last few characters of did  value for the fragment 
(`#s2nwF`)

4. use integer index for fragment 
(`#key-0`)

@oed
Copy link
Author

oed commented Sep 28, 2021

not if you use relative ref ids...

If you see a relative ref, wouldn't it make sense to concat it with the DID?

I would modify 3. above to use the last n chars (as to not include multicodec byte).

I would be fine with using integer index (4) since this DID is not mutable.

@oed
Copy link
Author

oed commented Mar 4, 2022

@OR13 how would you feel about a PR that introduces #0 and #1 etc for ids?

The old ids could be kept for backward compatibility.
I'm mainly worried about protected JWS headers becoming super long with the current ids, e.g.:

eyJhbGciOiJFZERTQSIsImNhcCI6ImlwZnM6Ly9iYWZ5cmVpY3JqZnF4eGNoZGFweGFkMmo2N3RlM2x0bGdjdW8zbnl1d2F4NmoyaXp4bWJpZmZxYXdkaSIsImtpZCI6ImRpZDprZXk6ejZNa3ZRaXpMeXprcVRSeVhCN2JmNjdQN3ZWQVl1NjFVektOWkRueUs2SGFGNWp1I3o2TWt2UWl6THl6a3FUUnlYQjdiZjY3UDd2VkFZdTYxVXpLTlpEbnlLNkhhRjVqdSJ9

@OR13
Copy link
Contributor

OR13 commented Mar 4, 2022

@oed

Thanks for the header, you are proposing:

{
  "alg": "EdDSA",
  "cap": "ipfs://bafyreicrjfqxxchdapxad2j67te3ltlgcuo3nyuwax6j2izxmbiffqawdi",
  "kid": "did:key:z6MkvQizLyzkqTRyXB7bf67P7vVAYu61UzKNZDnyK6HaF5ju#z6MkvQizLyzkqTRyXB7bf67P7vVAYu61UzKNZDnyK6HaF5ju"
}

be changed to:

{
  "alg": "EdDSA",
  "cap": "ipfs://bafyreicrjfqxxchdapxad2j67te3ltlgcuo3nyuwax6j2izxmbiffqawdi",
  "kid": "did:key:z6MkvQizLyzkqTRyXB7bf67P7vVAYu61UzKNZDnyK6HaF5ju#0"
}

I am open to such a PR.

It would be a major breaking change, and we would have to make a bunch of updates, but we are willing to do that to make the identifiers that end up in credentials shorter.

However, we are not the only implementers of did:key.

It's one of the most popular, if not the most popular and interoperable did method.

So breaking changes may take a long time to negotiate.

cc @msporny @kimdhamilton @mavarley @mprorock ... please add others to help with the discussion.

@oed
Copy link
Author

oed commented Mar 4, 2022

Note that it would be possible to keep the old id around in generated DID documents.
Perhaps this could be added as a Backwards compatibility note?

@msporny
Copy link
Contributor

msporny commented Mar 4, 2022

It would be a major breaking change, and we would have to make a bunch of updates, but we are willing to do that to make the identifiers that end up in credentials shorter.

Digital Bazaar would be a -1 on this change for the following reasons:

  • It tries to optimize byte usage at the wrong layer. JWS/JWT is a text-based format with terrible compression characteristics (133% bloat due to use of base64 which is known to compress terribly).
  • It deviates from other multikey expression formats that need to do key rotation and guarantee that key identifiers don't collide during rotations. Using # 0 and # 1 is a pattern that should be discouraged... we should be using key fingerprints that are designed to be cryptographically collision-free.
  • It establishes a pattern that might be copied to other DID Methods where you cannot safely do distributed key generation w/o looking at the history of a DID Document. That is, you now need to ask the question: "Was # 1 ever used before, guess I'll have to get the DID Document and apply all of history to make sure I don't accidentally re-use a key ID that someone has marked as "expired" or "compromised" in their internal systems/caches."
  • We have been able to compress did:keys to ~36 bytes when sent over a binary transport (such as when using CBOR-LD), so it's doable if you compress at the right layer.
  • It perpetuates the same broken layering that went into JWTs -- rather than creating a "human-readable layer" that maps to a "binary-format layer" that is capable of being properly compressed (like CBOR), JWTs tried to name properties using three character values, e.g., "iss", "kid", "typ" (to save space!)... and then made the mistake of double-base-encoding values (which leads to compression bloat), further perpetuating the inability to properly compose primitives. That ended up pushing the "we're concerned about how much space we're wasting" up to the higher level application space, which then results in requests like this issue. The problem here is that JWT's leaky abstractions are now affecting DID Methods -- which is a clear signal that JWT got the layering wrong.

Just signalling very early that we will be opposed to this change for architectural layering and security design reasons.

@baha-ai
Copy link

baha-ai commented Mar 4, 2022

I agree with @msporny

Adding to his argument:

Sending this (from @OR13's examples above) base64 encoded string:
ewogICJhbGciOiAiRWREU0EiLAogICJjYXAiOiAiaXBmczovL2JhZnlyZWljcmpmcXh4Y2hkYXB4YWQyajY3dGUzbHRs \ Z2N1bzNueXV3YXg2ajJpenhtYmlmZnFhd2RpIiwKICAia2lkIjogImRpZDprZXk6ejZNa3ZRaXpMeXprcVRSeVhCN2JmN \ jdQN3ZWQVl1NjFVektOWkRueUs2SGFGNWp1I3o2TWt2UWl6THl6a3FUUnlYQjdiZjY3UDd2VkFZdTYxVXpLTlpEbnlL \
NkhhRjVqdSIKfQ==
compared to the suggested encoding (with #0 suffix in the kid):
ewogICJhbGciOiAiRWREU0EiLAogICJjYXAiOiAiaXBmczovL2JhZnlyZWljcmpmcXh4Y2hkYXB4YWQyajY3dGUzbHR \ sZ2N1bzNueXV3YXg2ajJpenhtYmlmZnFhd2RpIiwKICAia2lkIjogImRpZDprZXk6ejZNa3ZRaXpMeXprcVRSeVhCN2J \ mNjdQN3ZWQVl1NjFVektOWkRueUs2SGFGNWp1IzAiCn0=

saves you about 64 bytes. We're not saving much here. Using CBOR is probably a better option for saving bytes than JSON based messages.

Also a didKey is a special kind of DID document without a service bloc. Its kid field is a representation of the public key, not an index of the key in the document as it has no VMs. Using the index #0 would imply having VMs in the DID doc, but it's not the case for DIDkeys. On the other hand using the full key multibase encoding as a suffix of the kid indicates that this is a didKey document. Finally this DIDkey has a full kid value and cannot be referenced by a relative id (ie #0 is meaningless in a didkey while #z6MkvQizLyzkqTRyXB7bf67P7vVAYu61UzKNZDnyK6HaF5ju represents the public key).

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

4 participants