Skip to content

Conversation

@mathiaspeters
Copy link
Contributor

@mathiaspeters mathiaspeters commented Oct 14, 2025

Problem

Our trust-dns fork was much behind upstream

Solution

Update the fork from v0.24.0 to v0.25.2 and bump the reference in libtelio

The branch in trust-dns has 7 commits that can also be reviewed

☑️ Definition of Done checklist

  • Commit history is clean (requirements)
  • README.md is updated
  • Functionality is covered by unit or integration tests

tomaszklak
tomaszklak previously approved these changes Oct 17, 2025
Copy link
Contributor

@tomaszklak tomaszklak left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

+1

Comment on lines 247 to 284
.map_err(|err|
match err.kind() {
ResolveErrorKind::Proto(proto) => {
match proto.kind() {
ProtoErrorKind::NoRecordsFound {
query: _,
soa: _,
ns: _,
negative_ttl: _,
response_code,
trusted: _,
authorities:_,
} => {
if *response_code == ResponseCode::NoError {
telio_log_debug!("Got an error response with NoError code for {name}, this should not happen so converting to ServFail");
// Failed query with no error - convert that to a real error,
// otherwise the LookupError::from will panic in debug builds.
// If we use a number from the 'private use' range:
// https://datatracker.ietf.org/doc/html/rfc2929#section-2.3 like
// LookupError::from(ResponseCode::Unknown(3841))
// the trust-dns will end up looping until the requests with some other error
// is returned. This will make the original dns request (eg. by nslookup or dig or some app)
// never complete. To avoid that, lets return ServFail which will produce
// an empty respones.
LookupError::from(ResponseCode::ServFail)
} else {
LookupError::from(*response_code)
}
}
// NOTE: this is probably incorrect, and should be at least 24, most likely
// in range 3841-4095 ('private use' range), instead of '0'.
_ => LookupError::from(ResponseCode::Unknown(0)),
}
}
}
// NOTE: this is probably incorrect, and should be at least 24, most likely
// in range 3841-4095 ('private use' range), instead of '0'.
_ => LookupError::from(ResponseCode::Unknown(0)),
})
// NOTE: this is probably incorrect, and should be at least 24, most likely
// in range 3841-4095 ('private use' range), instead of '0'.
_ => LookupError::from(ResponseCode::Unknown(0)),
});
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Now that we have LookupControlFlow maybe we do not need this block with rather weird, and likely incorrect under some edge-cases, conversions and we can simply pass the result without map_err()'ing it?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I looked into it and I think this will be staying, but I tried to clean it up at least a little bit

Basically, this is here to give us greater control over when we get an empty result and when the DNS request properly fails

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could you elaborate a bit more? As this, looking at first glance, does not look correct error mapping 🤔


// NOTE: this is probably incorrect, and should be at least 24, most likely
// in range 3841-4095 ('private use' range), instead of '0'.
LookupError::from(ResponseCode::Unknown(0))
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we are still converting any error other then err.kind() == ResolveErrorKind::Proto(...) to LookupError::ResponseCode which IMHO is incorrect.

IMHO we should convert to LookupError::ProtoError()`

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Generally I would agree with you, but this has a specific use case: https://github.com/NordSecurity/trust-dns/blob/release/0.25.2/crates/server/src/authority/catalog.rs#L837

The way this is handled in our fork of hickory is that we want to have certain errors actually result in an error and others resulting in an empty response. I believe this was initially set to Unknown so that we can have more control over when that happens, and this just copies that. We can change this to be one of the variants of ProtoError, but then I think it's more likely that the special handling of errors in our hickory fork might behave differently

Copy link
Contributor

@Jauler Jauler Nov 24, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🤔 🤔 🤔

In the link you provided, in one of the cases - we are converting IO errors (hence DNS request was probably not even sent, or response not even received into empty response? As telio-dns is kind of transparent proxy - it means we should not respond as well 🤔. And definitely should not convert the response into empty, but successful response.

Additionally if we have unknown code (the other case) responded from upstream server - we should not convert it to successful response as well.

name = "hickory-resolver"
version = "0.24.0"
source = "git+https://github.com/NordSecurity/trust-dns.git?tag=v3.0.1#88204024fd3ecdadccf4574eb7976167e7c52001"
version = "0.25.2"
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Friendly reminder to close a dependabot issue after merging this PR since there's one for hickory-proto

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

5 participants