Refactor: `codec-sv2` crate #1129

rrybarczyk · 2024-08-21T19:49:34Z

Background

This task is an outcome of the protocols Rust docs issues tracked in #845.

While documenting protocols::v2::codec-sv2 in #1012, areas of potential code debt were identified. This issue serves as a place to list these items so that they can be addressed in an organized manner. The initial Rust documentation effort was an immediate action, while a refactoring (which implies breaking API changes) is not an urgent priority, so for now we should leave this in the backlog for an appropriate moment in the future.

Note: This issue will replace the #873 discussion.

Identified Potential Code Debt

Error Variant Naming

Suggestions for codec_sv2::error::Error and ::CError refactor:

- Instead of importing at the top of the file:
  - Use the noise_sv2 crate directly in the Error variants, like AeadError(noise_sv2::AeadError) and NoiseSv2Error(noise_sv2::Error).
  - Use the framing_sv2 crate directly in the Error variant, like FramingSv2Error(framing_sv2::Error).
- There are two variants that wrap framing_sv2::Error: FramingError and FramingSv2Error. Remove one of these variants and handle any downstream effects.
- Rename (applies to both the codec_sv2::Error and codec_sv2::CError enums):
  - AeadError -> either keep as is, or rename to NoiseSv2Aead or NoiseAead
  - BinarySv2Error -> BinarySv2 or Binary
  - FramingSv2Error/FramingError -> FramingSv2 or Framing
  - NoiseSv2Error -> Noise
- Alphabetize the error variants, the match arms of the implementations for the Error enums, and the order of the impl From<external-error> for Error blocks.
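
As a rough illustration of the suggestions above, here is a minimal sketch of how the Error enum could look after the rename. The binary_sv2, framing_sv2 and noise_sv2 modules below are local stand-ins with placeholder types (an assumption made so the snippet compiles on its own), and the chosen names (Binary, Framing, Noise, NoiseAead) are just one of the options listed, not a decision.

```rust
// Stand-in modules so the sketch compiles on its own. In the real crate these
// would be the external `binary_sv2`, `framing_sv2` and `noise_sv2` crates.
mod binary_sv2 {
    #[derive(Debug, PartialEq)]
    pub struct Error; // placeholder type, not the real error
}
mod framing_sv2 {
    #[derive(Debug, PartialEq)]
    pub struct Error; // placeholder type, not the real error
}
mod noise_sv2 {
    #[derive(Debug, PartialEq)]
    pub struct AeadError; // placeholder type, not the real error
    #[derive(Debug, PartialEq)]
    pub struct Error; // placeholder type, not the real error
}

/// Variants are alphabetized and name their source crate by path,
/// so no error types need to be imported at the top of the file.
#[derive(Debug, PartialEq)]
pub enum Error {
    Binary(binary_sv2::Error),
    Framing(framing_sv2::Error),
    Noise(noise_sv2::Error),
    NoiseAead(noise_sv2::AeadError),
    NotInHandshake,
}

// `From` impls follow the same alphabetical order as the variants.
impl From<binary_sv2::Error> for Error {
    fn from(e: binary_sv2::Error) -> Self {
        Error::Binary(e)
    }
}

impl From<framing_sv2::Error> for Error {
    fn from(e: framing_sv2::Error) -> Self {
        Error::Framing(e)
    }
}

impl From<noise_sv2::Error> for Error {
    fn from(e: noise_sv2::Error) -> Self {
        Error::Noise(e)
    }
}

impl From<noise_sv2::AeadError> for Error {
    fn from(e: noise_sv2::AeadError) -> Self {
        Error::NoiseAead(e)
    }
}
```

With a single Framing variant there is only one conversion path from framing_sv2::Error, which is the downstream effect of removing the duplicate variant.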

Uniform Spelling

We are using both HandShake and Handshake. Everything should be Handshake with a lowercase "s" (this is how it is defined in The Noise Protocol Framework). This impacts:
- codec_sv2::
  - lib::State::HandShake -> lib::State::Handshake
  - error::Error::NotInHandShake -> error::Error::NotInHandshake
  - error::CError::NotInHandShake -> error::CError::NotInHandshake
- framing_sv2::framing2:: (there may be more in framing_sv2, but these are what I have noticed so far):
  - HandShakeFrame -> HandshakeFrame
  - EitherFrame::HandShake -> EitherFrame::Handshake
We have decoder::WithNoise::decode_noise_frame and then encoder::NoiseEncoder::encode. Would it make sense to make these methods uniform like decoder::WithNoise::decode and encoder::NoiseEncoder::encode?
We have decoder::WithNoise and decoder::WithoutNoise. Then we have encoder::NoiseEncoder and encoder::Encoder. Should we make it decoder::WithNoise, decoder::WithoutNoise, encoder::WithNoise, and encoder::WithoutNoise? Or decoder::NoiseDecoder, decoder::Decoder, encoder::NoiseEncoder and encoder::Encoder?
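
Since these renames are breaking changes, one option (not proposed above, only a possibility) is to keep the old spelling around for a release as a deprecated alias. The sketch below assumes a simplified HandshakeFrame with a single payload field; the real frame type in framing_sv2 has a different definition.

```rust
/// Frame type under the corrected spelling.
/// Simplified stand-in: the real type in `framing_sv2` is defined differently.
#[derive(Debug, PartialEq)]
pub struct HandshakeFrame {
    payload: Vec<u8>,
}

impl HandshakeFrame {
    pub fn new(payload: Vec<u8>) -> Self {
        Self { payload }
    }

    /// Number of payload bytes held by the frame.
    pub fn payload_len(&self) -> usize {
        self.payload.len()
    }
}

/// Old spelling kept as an alias so downstream code keeps compiling,
/// while the compiler points users at the new name.
#[deprecated(note = "renamed to `HandshakeFrame`")]
pub type HandShakeFrame = HandshakeFrame;
```

This only works for types. Enum variants such as State::HandShake cannot be aliased this way, so those renames would remain hard breaks.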

Imports

Small thing, but in codec_sv2::lib, I am not a big fan of importing framing_sv2::framing2::handshake_message_to_frame as h2f. I think we should import framing_sv2::framing2 then use framing2::handshake_message_to_frame directly. The upside is it is clearer to the reader, the downside is it is more verbose.
use core::cmp is imported in codec_sv2::error under the #[cfg(test)] attribute, but when running cargo test I get the following warning saying it is not used. Should it be removed?

warning: unused import: `core::cmp`
 --> v2/codec-sv2/src/error.rs:8:5
  |
8 | use core::cmp;
  |     ^^^^^^^^^

In codec_sv2::decoder, some imports can be combined into one line. For example, use binary_sv2::{GetSize, Serialize}.
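
To make the h2f point concrete, here is a small sketch of the two import styles. The framing_sv2 module is a local stand-in, and the signature and body of handshake_message_to_frame are placeholders (assumptions), not the real function.

```rust
// Local stand-in for the external `framing_sv2` crate.
mod framing_sv2 {
    pub mod framing2 {
        // Placeholder signature and body; the real function lives in
        // `framing_sv2::framing2` and returns a frame type.
        pub fn handshake_message_to_frame(message: &[u8]) -> Vec<u8> {
            message.to_vec()
        }
    }
}

// Current style (short alias, unclear at the call site):
// use framing_sv2::framing2::handshake_message_to_frame as h2f;

// Suggested style: import the module and call through it.
use framing_sv2::framing2;

/// Hypothetical caller, only here to show what the call site looks like.
pub fn first_handshake_frame(message: &[u8]) -> Vec<u8> {
    framing2::handshake_message_to_frame(message)
}
```

The call is longer than h2f(message), but a reader can see where the function comes from without scrolling back to the imports.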

Use of `Result` types

The following methods use core::result::Result but should use codec_sv2::Result (unless there is a reason I am not seeing that requires using core::result::Result):

State::step_0
State::step_1
State::step_2
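
A minimal sketch of the change, assuming a crate-level alias of the usual shape. The State and Error enums here are reduced stand-ins, and the step_0 signature and return type are hypothetical; the real methods take and return different types.

```rust
/// Reduced stand-in for `codec_sv2::Error`.
#[derive(Debug, PartialEq)]
pub enum Error {
    NotInHandshake,
}

/// Crate-level alias, so signatures only need to name the success type.
pub type Result<T> = core::result::Result<T, Error>;

/// Reduced stand-in for `codec_sv2::State`.
pub enum State {
    Handshake,
    Transport,
}

impl State {
    // Before: pub fn step_0(&mut self) -> core::result::Result<Vec<u8>, Error>
    // After: the crate alias carries the error type.
    pub fn step_0(&mut self) -> Result<Vec<u8>> {
        match self {
            State::Handshake => Ok(Vec::new()),
            State::Transport => Err(Error::NotInHandshake),
        }
    }
}
```

The two forms are the same type, so this change by itself would not break callers.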

Uniform Struct Fields

decoder::WithNoise and encoder::WithoutNoise have some shared fields. These fields should appear first and be listed in the same order in each struct.

Misc

Should hard-coded default buffer sizes be defined as constants?
The error module imports core::cmp under #[cfg(test)], but it is unused.
In the decoder.rs module the types StandardEitherFrame and StandardSv2Frame are defined, however these frames can represent an encoded OR decoded Sv2 frame. I think we should move them out of the decoder module into a more appropriate place.
In the encoder.rs module, we have type Slice = buffer_sv2::Slice; with the #[cfg(feature = "with_buffer_pool")] attribute. Why do we need this type if it is simply assigned to buffer_sv2::Slice? Could we just use buffer_sv2::Slice directly?
In encoder.rs, we have the type Slice = Vec<u8> commented out. Can this be completely removed?
new methods should always come first (exception is if default is present, which it is not for this crate) in an impl. See impl<T: Serialize + GetSize> Encoder<T> in encoder.
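
Two of the items above (named constants for buffer sizes, and new first in the impl) could look roughly like this. The constant name and its value of 512 are made up for illustration, and the Encoder here is a simplified stand-in without the Serialize + GetSize bounds.

```rust
use core::marker::PhantomData;

/// Hypothetical name and value; the real constant should carry whatever
/// size is currently hard coded in the crate.
pub const DEFAULT_BUFFER_CAPACITY: usize = 512;

/// Simplified stand-in for `encoder::Encoder`.
pub struct Encoder<T> {
    buffer: Vec<u8>,
    frame: PhantomData<T>,
}

impl<T> Encoder<T> {
    // `new` comes first in the impl block.
    pub fn new() -> Self {
        Self {
            buffer: Vec::with_capacity(DEFAULT_BUFFER_CAPACITY),
            frame: PhantomData,
        }
    }

    /// Capacity currently reserved by the internal buffer.
    pub fn capacity(&self) -> usize {
        self.buffer.capacity()
    }
}
```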


plebhash · 2024-09-16T14:43:29Z

we should seriously consider merging codec_sv2 + framing_sv2 crates

#873 (comment)

perhaps also binary_sv2, although I still need to spend more time on that one to form a better opinion

Fi3 · 2024-09-16T14:47:04Z

we should seriously consider merging codec_sv2 + framing_sv2 crates

#873 (comment)

perhaps also binary_sv2, although I still need to spend more time on that one to form a better opinion

why?

rrybarczyk · 2024-09-16T15:39:31Z

we should seriously consider merging codec_sv2 + framing_sv2 crates
#873 (comment)
perhaps also binary_sv2, although I still need to spend more time on that one to form a better opinion

why?

These crates are closely linked. To my understanding, you never use one without the other. For this reason, we should combine them. Currently, with them separate, you constantly have to cross-reference between these two crates.

Unless there is a reason to not combine these two crates, we should plan on this refactor. @Fi3, is there any reason not to combine the crates that you know of?

Fi3 · 2024-09-16T15:44:53Z

we should seriously consider merging codec_sv2 + framing_sv2 crates
#873 (comment)
perhaps also binary_sv2, although I still need to spend more time on that one to form a better opinion

why?

These crates are closely linked. To my understanding, you never use one without the other. For this reason, we should combine them. Currently, with them separate, you constantly have to cross-reference between these two crates.

Unless there is a reason to not combine these two crates, we should plan on this refactor. @Fi3, is there any reason not to combine the crates that you know of?

I just don't see an advantage that justifies the work of putting them together. What is this advantage? Why should we spend time doing that?

rrybarczyk · 2024-09-16T15:46:32Z

we should seriously consider merging codec_sv2 + framing_sv2 crates
#873 (comment)
perhaps also binary_sv2, although I still need to spend more time on that one to form a better opinion

why?

These crates are closely linked. To my understanding, you never use one without the other. For this reason, we should combine them. Currently, with them separate, you constantly have to cross-reference between these two crates.
Unless there is a reason to not combine these two crates, we should plan on this refactor. @Fi3, is there any reason not to combine the crates that you know of?

I just don't see an advantage that justifies the work of putting them together. What is this advantage? Why should we spend time doing that?

Now is the time when we are really scrutinizing these crates to solidify them moving forward. Can you help me understand by giving me an example of when one of these crates would be used without the other?

plebhash · 2024-09-16T15:49:35Z

We need to make sure the low-level APIs are lean and easy to use. Right now, they are extremely convoluted. There's no way to understand how to use one single crate without looking for references in multiple other crates.

Maybe this is a blind spot for you, @Fi3, because the APIs are fresh in your head and you can navigate them easily (since you wrote most of them).

But for us, even while just documenting these crates, it has been extremely challenging to understand how everything is meant to be used together.

Our mission here is to make sure the codebase can be easily used by the entire ecosystem, not just one single pool.

Fi3 · 2024-09-16T15:58:12Z

IMHO it is a lot easier to make a crate that exports everything and is well documented, rather than refactor all the crates
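
A minimal sketch of what such an umbrella crate could look like, assuming it simply re-exports items from the low-level crates. The sv2 module name and the Encoder and Sv2Frame unit structs are stand-ins for illustration, with local modules playing the role of the separate crates.

```rust
// Local stand-ins for the separate low-level crates.
mod codec_sv2 {
    pub struct Encoder; // placeholder for the real encoder type
}
mod framing_sv2 {
    pub struct Sv2Frame; // placeholder for the real frame type
}

/// Hypothetical facade: one documented entry point that re-exports
/// what users need, leaving the underlying crates untouched.
pub mod sv2 {
    pub use super::codec_sv2::Encoder;
    pub use super::framing_sv2::Sv2Frame;
}
```

Users would then write sv2::Encoder and sv2::Sv2Frame without needing to know which crate each item lives in.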

Fi3 · 2024-09-16T16:00:10Z

https://github.com/demand-open-source/share-accounting-ext/blob/master/Cargo.toml#L15C22-L15C23

here for example I need framing but not codec

Fi3 · 2024-09-16T16:03:32Z

Also, low-level crates are not supposed to be used directly; we should use a higher-level library designed to be easy to use, etc. The low-level crates should be used only on special occasions where you really need them, for example for FFI.

rrybarczyk · 2024-09-16T16:49:53Z

Ok, I will take a look at that extension and dig deeper into the code. Thank you.

Fi3 · 2024-09-16T16:51:42Z

The ext is just an example; the main point is:

IMHO it is a lot easier to make a crate that exports everything and is well documented, rather than refactor all the crates

jbesraa · 2024-09-17T08:48:47Z

I don't have good input on merging vs. not merging. I will only share that, as a new dev on the team, whenever I saw a crate under the protocols folder, I immediately thought that there was a protocol specification for the crate living in sv2-specs. Now I know this is not the case, but I do think it is misleading.

Fi3 · 2024-09-17T08:55:09Z

Having several low-level crates has the main advantage of keeping the lib as small as possible and easier to test and review. framing-sv2 is very small and I agree that it could be merged; I would put it in binary-sv2, not in codec. I also think that it is more than fine as it is, and that we shouldn't waste our time and energy merging it.

plebhash · 2024-09-17T12:18:44Z

I can definitely entertain @Fi3's argument about effort. Whenever we start planning the scope of how we will mitigate technical debt, I fully agree that it is very important for us to evaluate the trade-offs of all proposed changes.

And for sure, there will absolutely be cases where the benefits will not justify the efforts.

I'm not saying this will be the case here. I don't have a clear perspective around that yet, especially since we still have not fully studied the entire scope of protocols.

To be honest, this discussion is still somewhat premature (which is my fault, since I brought up this topic). After we cover 100% of the protocols crates, we will be in a much better position to have these debates.

What I said about potentially merging framing_sv2 + codec_sv2 was just a random thought that occurred to me while I reviewed #1040.

It's actually quite interesting that @Fi3 said that it would probably be better to merge framing_sv2 into binary_sv2 instead of codec_sv2. I will tag @Shourya742 here so he is aware, but with a careful warning that his mindset while reading this should be simply to entertain this possibility while he documents and reviews binary_sv2/no-serde.

For now, at least in my mind, all options are on the table. The main point is that it is very important that we do our best to learn protocols crates and really become experts at them, just like @Fi3 is. And maybe at the end of that journey, we could even come to the conclusion that everything is fine the way it is.

Or maybe not. Time will tell. 🧘

plebhash · 2024-09-17T20:47:49Z

on the top issue description @rrybarczyk suggested:

Uniform Spelling

...

We have decoder::WithNoise and decoder::WithoutNoise. Then we have encoder::NoiseEncoder and encoder::Encoder. Should we make it decoder::WithNoise, decoder::WithoutNoise, encoder::WithNoise, and encoder::WithoutNoise? Or decoder::NoiseDecoder, decoder::Decoder, encoder::NoiseEncoder and encoder::Encoder?

I would suggest the following naming:

encoder.rs:

#[cfg(feature = "noise_sv2")]
pub type Encoder<T> = EncryptedEncoder<T>;

#[cfg(not(feature = "noise_sv2"))] // redundant, but written to make it explicitly clear
pub type Encoder<T> = PlainEncoder<T>;

pub struct EncryptedEncoder<T> {
    noise_buffer: Buffer,
    sv2_buffer: Buffer,
    frame: PhantomData<T>,
}

pub struct PlainEncoder<T> {
    buffer: Vec<u8>,
    frame: PhantomData<T>,
}

decoder.rs:

#[cfg(feature = "noise_sv2")]
pub type Decoder<B, T> = EncryptedDecoder<B, T>;

#[cfg(not(feature = "noise_sv2"))] // redundant, but written to make it explicitly clear
pub type Decoder<B, T> = PlainDecoder<B, T>;

pub struct EncryptedDecoder<B, T> {
    frame: PhantomData<T>,
    missing_noise_b: usize,
    noise_buffer: B,
    sv2_buffer: B,
}

pub struct PlainDecoder<B, T> {
    frame: PhantomData<T>,
    missing_b: usize,
    buffer: B,
}

It's not necessarily the final code organization, just some pseudo-rust to convey the idea around the keywords:

Encrypted vs Plain (which depends on noise_sv2 feature flag)
Encoder vs Decoder

rrybarczyk added refactor Implies refactoring code protocols Lowest level protocol logic codec-sv2 labels Aug 21, 2024

rrybarczyk self-assigned this Aug 21, 2024

rrybarczyk mentioned this issue Aug 21, 2024

Rust Docs + refactor: protocols crates #845

