Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

DTLS: Add server side stateless and CID QoL API #8224

Open
wants to merge 8 commits into
base: master
Choose a base branch
from

Conversation

julek-wolfssl
Copy link
Member

  • wolfDTLS_accept_stateless - statelessly listen for incoming connections
  • wolfSSL_inject - insert data into WOLFSSL object
  • wolfSSL_SSL(Enable|Disable)Read - enable/disable reading from IO
  • wolfSSL_get_wfd - get the write side file descriptor
  • wolfSSL_dtls_set_pending_peer - set the pending peer that will be upgraded to regular peer when we successfully de-protect a DTLS record
  • wolfSSL_dtls_get0_peer - zero copy access to the peer address
  • wolfSSL_is_stateful - boolean to check if we have entered stateful processing
  • wolfSSL_dtls_cid_get0_rx - zero copy access to the rx cid
  • wolfSSL_dtls_cid_get0_tx - zero copy access to the tx cid
  • wolfSSL_dtls_cid_parse - extract cid from a datagram/message

Copy link
Contributor

@rizlik rizlik left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this is a big step in the right direction but there are things unclear. These comments are considering wolfSSL/wolfssl-examples#472 as the user of the changes in this PR.

  1. what's the purpose of splitting the fd in write and read?

  2. I don't get why we should use the pendingPeer and the example doesn't use it. It looks complicate and there is code that get hit after we copy the pendingPeer to the current Peer that it doesn't look doing nothing other than flipping the processPendingRecord bit.

  3. I don't see much gain in using wolfSSL_accept_stateless.
    AFAIU it just boils down in a) disabling read so that it can return before reading from the socket and b) to return after a good CH. a) avoids dropping other peer messages that can be received in the EmbedRecvFrom. But, as shown in the example, the server has already to read the messages itself outside of the wolfSSL to do proper demux. I wonder if then this is superfluous.

I would suggest of using a custom internal field like the pendingPeer, but called, as you suggested, currentRecordPeer.
Until the ssl is stateless, this is set on recvfrom, but no filtering happens bc of this.
It's used as the destination of the HRR/HVR and to compute the cookie.
It's not used anywhere else and it's not automatically copied over the other peer.
This way an ssl object can be used to reply to the CHs w/o cookies.
If we got a CH w/ valid cookie we can then leverage our existing goodch callback to set the right peer for sending and continue the connection.
If the application wants to demux, it needs to manage the reading from the socket itself anyway.

Just to recap, my suggestion are:

  • In EmbedRecvFrom to set a "peer to send the HRR/HRV" to if we are in a stateless
  • Application needs to use ChGoodCB to set the proper peer (is it already like this?)
  • Make clear that application needs to manage the read itself if it wants to demux over a single socket

The new APIs (inject, parse_cid) allows the application to manage the connection with a good tradeoff of complexity and control imo.

doc/dox_comments/header_files/ssl.h Outdated Show resolved Hide resolved
doc/dox_comments/header_files/ssl.h Outdated Show resolved Hide resolved
src/dtls.c Outdated
Comment on lines 1421 to 1437
if (msgSz < DTLS_RECORD_HEADER_SZ + cidSz)
return;
/* content type(1) + version(2) + epoch(2) + sequence(6) */
*cid = msg + ENUM_LEN + VERSION_SZ + OPAQUE16_LEN + OPAQUE16_LEN +
OPAQUE32_LEN;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

why the calculation of lengths are different between the check and the parsing?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not sure what you mean?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

we are checking DTLS_RECORD_HEADER_SZ + cidSz but then we are coping from offset ENUM_LEN, VERSION_SZ, LEN, LEN and OPAQUE32_LEN

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The CID field is not the last field which is why the checks are different. DTLS_RECORD_HEADER_SZ is the sum of the remaining static fields.

     struct {
         ContentType outer_type = tls12_cid;
         ProtocolVersion version;
         uint16 epoch;
         uint48 sequence_number;
         opaque cid[cid_length];               // New field
         uint16 length;
         opaque enc_content[DTLSCiphertext.length];
     } DTLSCiphertext;

int result = DtlsMsgPoolSend(ssl, 0);
int result;
#ifdef WOLFSSL_DTLS13
if (IsAtLeastTLSv1_3(ssl->version))
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is this change related to the feature or is a general fix?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

General improvement. Adding 1.3 support to wolfSSL_dtls_retransmit.

src/ssl.c Outdated

if (ssl->buffers.dtlsCtx.peer.sa != NULL &&
ssl->buffers.dtlsCtx.peer.sz == peerSz &&
XMEMCMP(ssl->buffers.dtlsCtx.peer.sa, peer, peerSz) == 0) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is not the correct way to compare sockaddr structure, see sockAddrEqual

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Using sockAddrEqual.

if (ssl->buffers.inputBuffer.length - *inOutIdx <
(word32)cidSz + LENGTH_SZ)
return LENGTH_ERROR;
if (cidSz > DTLS_CID_MAX_SIZE ||
wolfSSL_dtls_cid_get_rx(ssl, cid, cidSz) != WOLFSSL_SUCCESS)
if (cidSz != DtlsGetCidRxSize(ssl) ||
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

is this check redundant? we set cidSz to DtlsGetCidRxSize() before

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You're right. Removing.


WOLFSSL_ENTER("wolfSSL_read_internal");

if (ssl == NULL || data == NULL || sz <= 0)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

should we allow sz == 0?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't see how a 0-length inject would be valid.

src/dtls.c Outdated
int wolfSSL_dtls_cid_max_size(void)
{
return DTLS_CID_MAX_SIZE;
}

void wolfSSL_dtls_cid_parse(const unsigned char* msg, unsigned int msgSz,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

return WOLFSSL_SUCCESS or WOLFSSL_FAILURE.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

cid is already the return. I don't see the point of complicating with an extra return. If anything, I can have the function return the pointer instead of outputting it to a parameter.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ok, I do think checking for the return is more uniform than checking if *cid is NULL. Please add this to the documentation of the API.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok so I'll change it to return the CID instead of setting the parameter.

\sa wolfSSL_dtls_set_peer
\sa wolfSSL_dtls
*/
int wolfSSL_dtls_get0_peer(WOLFSSL* ssl, const void** peer, unsigned int* peerSz);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

don't expose the internal cid.
Prefer to add an API to compare to the ssl cid object if afraid of performance loss of copying the cid.
(like wolfSSL_dtls_cid_cmp(WOLFSSL*ssl,byte *cid, int cidSz))

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is for the peer address. Did you mean wolfSSL_dtls_cid_get0_rx? I think its better to allow the user to compare the CID however they want (using many memcmp or a bloom filter for example).

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I do think it exposes our internal implementation, if we haven't any request of doing that I'll avoid it. but I'm ok if you think it's important.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It only exposes the peer directly. The parameter is marked const so it signals to the user that this should not be modified.

@julek-wolfssl
Copy link
Member Author

what's the purpose of splitting the fd in write and read?

It was always split. I just added a getter for that field.

I don't get why we should use the pendingPeer and the example doesn't use it. It looks complicate and there is code that get hit after we copy the pendingPeer to the current Peer that it doesn't look doing nothing other than flipping the processPendingRecord bit.
[...]
It's used as the destination of the HRR/HVR and to compute the cookie.

The pending peer is used in the example when CID's are used. The bit flipping is just to make sure that the pending peer is only valid for the next parsed record. We already have a good mechanism for cookie calculation, there is no need to change it. The pendingPeer is used in conjunction with CID's to validate the incoming record before updating the peer address. An off-path attacker sending records with the CID set can change the address if we don't first attempt to de-protect the record. Attempting to de-protect an off-path attacker's packets should fail and we will reject the pendingPeer internally without any extra help from the application needed.

I don't see much gain in using wolfSSL_accept_stateless.

You're the one who suggested adding it :D. I think it makes sense to have it to abstract away from the user the complexity of the stateless handling.

But, as shown in the example, the server has already to read the messages itself outside of the wolfSSL to do proper demux. I wonder if then this is superfluous.

But we also support a server object that reads from the network directly. We have to support that use case as well. This API helps a lot with that case to make sure no records are lost.

To respond to the suggestions:

In EmbedRecvFrom to set a "peer to send the HRR/HRV" to if we are in a stateless

Adding.

Application needs to use ChGoodCB to set the proper peer (is it already like this?)

It does not need to do that. It calls wolfSSL_dtls_set_peer for peers during the connection process. wolfSSL_dtls_set_pending_peer is used during the connection with CID's.

Make clear that application needs to manage the read itself if it wants to demux over a single socket

I think that is clear from the example. Where do you think we need more documentation?

src/internal.c Outdated
Comment on lines 21554 to 21569
if (ssl->options.dtls) {
if (!ssl->buffers.dtlsCtx.processingPendingRecord)
ssl->buffers.dtlsCtx.processingPendingRecord = 1;
else {
ssl->buffers.dtlsCtx.processingPendingRecord = 0;
if (ssl->buffers.dtlsCtx.pendingPeer.sa != NULL) {
/* Clear the pending peer */
XFREE(ssl->buffers.dtlsCtx.pendingPeer.sa, ssl->heap,
DYNAMIC_TYPE_SOCKADDR);
ssl->buffers.dtlsCtx.pendingPeer.sa = NULL;
ssl->buffers.dtlsCtx.pendingPeer.sz = 0;
ssl->buffers.dtlsCtx.pendingPeer.bufSz = 0;
}
}
}
#endif
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this and the other block to update the peer feel misplaced here. One reason is that it will run for every packet, even the ones where we don't have set any pending peer. Another reason is that is adding more noise in the already quite convoluted processReply.

But I can't find a good solution because I don't see a good point where we can clear the pending peer on error, as we have many exit points in this function.

The only idea I've is to modify wolfssl_inject so that it also invokes wolfSSL_negotiate if necessary and wolfSSL_read. If there are application-data we can return them from wolfssl_inject.

The basic idea is that we include what the app is doing inside dispatchExistingConnection inside the wolfSSl_inject API.

If this makes sens to you we can include the the peer in the api itself. It's true what we are doing two things in a single API, but it's also true that the data that we want to inject usually have a source, so the two information are related.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If you reject the inject modification, can you move this and the other block in separate functions and adds a longer comments on why we are doing that?

Copy link
Member Author

@julek-wolfssl julek-wolfssl Dec 11, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

run for every packet

Only bit flipping will run for every pass.

But I can't find a good solution because I don't see a good point where we can clear the pending peer on error, as we have many exit points in this function.

That is precisely the reason for this. If we could have a single cleanup goto it would make it easier but this function is complicated enough.

I don't want to add anything else to the inject API. I think it should do only what the user of an inject API would expect. I'll add more documentation and refactor it to make it cleaner.

src/dtls.c Outdated
Comment on lines 1421 to 1437
if (msgSz < DTLS_RECORD_HEADER_SZ + cidSz)
return;
/* content type(1) + version(2) + epoch(2) + sequence(6) */
*cid = msg + ENUM_LEN + VERSION_SZ + OPAQUE16_LEN + OPAQUE16_LEN +
OPAQUE32_LEN;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

we are checking DTLS_RECORD_HEADER_SZ + cidSz but then we are coping from offset ENUM_LEN, VERSION_SZ, LEN, LEN and OPAQUE32_LEN

@julek-wolfssl julek-wolfssl force-pushed the dtls-server-demux branch 5 times, most recently from 6a1039b to 5977bdb Compare December 12, 2024 18:36
@julek-wolfssl julek-wolfssl requested a review from rizlik December 12, 2024 18:45
Copy link
Contributor

@rizlik rizlik left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This looks better, thanks!

src/internal.c Show resolved Hide resolved
src/internal.c Show resolved Hide resolved
- wolfDTLS_accept_stateless - statelessly listen for incoming connections
- wolfSSL_inject - insert data into WOLFSSL object
- wolfSSL_SSL(Enable|Disable)Read - enable/disable reading from IO
- wolfSSL_get_wfd - get the write side file descriptor
- wolfSSL_dtls_set_pending_peer - set the pending peer that will be upgraded to regular peer when we successfully de-protect a DTLS record
- wolfSSL_dtls_get0_peer - zero copy access to the peer address
- wolfSSL_is_stateful - boolean to check if we have entered stateful processing
- wolfSSL_dtls_cid_get0_rx - zero copy access to the rx cid
- wolfSSL_dtls_cid_get0_tx - zero copy access to the tx cid
- wolfSSL_dtls_cid_parse - extract cid from a datagram/message
Interrupted connection should return control to the user since they may want to handle the signal that caused the interrupt. Otherwise, we might never give back control to the user (the timeout would error out but that causes a big delay).

socat.yml: in test 475, the test would send a SIGTERM after 3 seconds. We would continue to ignore this signal and continue to call `recvfrom`. Instead we should error out and give control back to the user.
@julek-wolfssl
Copy link
Member Author

Retest this please

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants