Skip to content

Conversation

@to11mtm
Copy link
Member

@to11mtm to11mtm commented Jan 30, 2025

Changes

Adds public ReadOnlySequence<byte> ToReadOnlySequence() to ByteString.

Checklist

For significant changes, please ensure that the following have been completed (delete if not relevant):

Latest dev Benchmarks

N/A

This PR's Benchmarks

N/A

Notes:

ReadOnlySequence is a bit of a tradeoff; On one hand, it is a larger struct (two objects and two ints) and does some fun typechecks against said objects which makes it not as fast as ReadOnlySpan.

On the other hand, for non-compacted ByteString, rather than requiring a full copy+alloc of new array, it's essentially a linked-list of the segments (each segment being a long of the total length of previous segments, a Memory<T> and the next ReadOnlySequenceSegment).

(For already 'compacted' ByteStrings, no allocation is required since ReadOnlySequence has logic to to avoid allocation for a single ReadOnlyMemory being passed in.)

The other advantage however, is that ReadOnlySequence is *notaref struct`.

This makes it easier to do certain user code (i.e. not having to worry about ref struct rules as much, e.x. including in a class being passed as a message.

As far as ecosystem support, I know MessagePack-CSharp supports ReadOnlySequence<byte> for input, System.Text.Json should eventually include it based on dotnet/runtime#95187 / dotnet/runtime#67337, and Protobuf-Net appears to have support for it as well. Probably others too but those are the first that came to mind for serialization.

Comment on lines 50 to 55
if (bArr.Length == 2)
{
item = bArr[1];
last = new ByteStringReadOnlySequenceSegment(item, bs.Count-item.Count);
first.Next = last;
}
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

TBH this might be premature optimization....

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes it is, please remove

Comment on lines 58 to 68
var prior = first;
for (int i = 1; i < bArr.Length; i++)
{
item = bArr[i];
var curr = new ByteStringReadOnlySequenceSegment(item, prior.RunningIndex+prior.Memory.Length);
prior.Next = curr;
prior = curr;
}
last = prior;
}
return new ReadOnlySequence<byte>(first,0,last,last.Memory.Length);
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

IDK if I like this less or more than just 'going backwards on the array' and just building the Segments from last to first.

Happy for feedback (I suppose main thing being the pattern here vs setting next explicitly in ctor could impact JIT optimizations... but probably overthinking things a bit.)

[DebuggerDisplay("(Count = {_count}, Buffers = {_buffers})")]
public sealed class ByteString : IEquatable<ByteString>, IEnumerable<byte>
{
public class ByteStringReadOnlySequenceSegment : ReadOnlySequenceSegment<byte>
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not sure where to put in class (or maybe even move to partial/etc?)

It's a nested class for the sake of being able to directly access _buffers in the create. OTOH if we 'flipped' the construction (see other comment) we wouldn't need to nest it and could just build in .ToReadOnlySequence().

... Either way this should be sealed though

@to11mtm to11mtm marked this pull request as ready for review February 1, 2025 18:25
@to11mtm
Copy link
Member Author

to11mtm commented Feb 1, 2025

Needs API review

@to11mtm
Copy link
Member Author

to11mtm commented Feb 1, 2025

All observed failures in tests appear to be races since they wouldn't be using these new APIs.

Copy link
Member

@Aaronontheweb Aaronontheweb left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We are probably going to restructure ByteString to implement System.Memory directly, so anything that's more complicated than

public ReadOnlySequence<byte> ToReadOnlySequence()
        {
            if (_count == 0)
            {
                return ReadOnlySequence<byte>.Empty;
            }
            else if (_buffers.Length == 1)
            {
                //Happy path, we can just pass ArraySegment here and avoid alloc.
                return new ReadOnlySequence<byte>(_buffers[0]);
            }
            else
            {
                return ByteStringReadOnlySequenceSegment.CreateSequence(this);
            }
        }

needs to be deleted.

@to11mtm
Copy link
Member Author

to11mtm commented Feb 5, 2025

We are probably going to restructure ByteString to implement System.Memory directly, so anything that's more complicated than

needs to be deleted.

Did my best to simplify the creation path; If there's specific things to handle please LMK.

@Aaronontheweb
Copy link
Member

Thanks @to11mtm

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants