
Rewrite ChunkedMemoryStream #2828

Open · wants to merge 10 commits into base: release/3.1.x
Conversation

@JimBobSquarePants (Member) commented Oct 22, 2024

Prerequisites

  • I have written a descriptive pull-request title
  • I have verified that there are no overlapping pull-requests open
  • I have verified that I am following the existing coding patterns and practice as demonstrated in the repository. These follow strict Stylecop rules 👮.
  • I have provided test coverage for my change (where applicable)

Description

Fixes #2806

ChunkedMemoryStream contained multiple writing bugs and was too costly to fix/maintain relative to performance benefits, so I'm just ditching it.

  • Complete rewrite of ChunkedMemoryStream to simplify it and fix numerous bugs.
  • Ensure ImageEncoder uses the chunked stream when encoding non-seekable streams.

@antonfirsov (Member) commented Oct 22, 2024

> relative to performance benefits

IMO the memory benefits were significant. With the current decoder & encoder design, non-seekable streams are always fully buffered into memory. A switch to MemoryStream will reintroduce significant GC allocations for large inputs all around the library, which will be a noticeable regression for users who deal with HTTP or other kinds of network streams.

Recommendations:

  • In case there is no bug in stream writing, do not remove ChunkedMemoryStream for decoders, since the largest streams are typically the input ones.
  • Instead of using MemoryStream for encoders, implement our own non-chunked MemoryStream that still uses MemoryAllocator. Although buffers over 4MB won't be pooled, in typical scenarios encoded files are smaller. (A rough sketch of this idea follows below.)
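To make the second recommendation concrete, here is a rough, hypothetical sketch of a non-chunked, MemoryAllocator-backed write stream for encoder output. The class name and shape are illustrative only, not part of this PR or ImageSharp's API; the only library call assumed is MemoryAllocator.Allocate<byte>(int).

// Hypothetical sketch only (not ImageSharp code): a non-chunked write stream that
// rents its single backing buffer from MemoryAllocator and grows it by doubling.
using System;
using System.Buffers;
using System.IO;
using SixLabors.ImageSharp.Memory;

internal sealed class PooledWriteStream : Stream
{
    private readonly MemoryAllocator allocator;
    private IMemoryOwner<byte> buffer;
    private int length;

    public PooledWriteStream(MemoryAllocator allocator, int initialCapacity = 81920)
    {
        this.allocator = allocator;
        this.buffer = allocator.Allocate<byte>(initialCapacity);
    }

    public override bool CanRead => false;
    public override bool CanSeek => false;
    public override bool CanWrite => true;
    public override long Length => this.length;
    public override long Position { get => this.length; set => throw new NotSupportedException(); }

    public override void Write(ReadOnlySpan<byte> source)
    {
        this.EnsureCapacity(this.length + source.Length);
        source.CopyTo(this.buffer.Memory.Span[this.length..]);
        this.length += source.Length;
    }

    public override void Write(byte[] buffer, int offset, int count)
        => this.Write(buffer.AsSpan(offset, count));

    // Copy the buffered bytes to the real (non-seekable) target stream.
    public void WriteTo(Stream destination)
        => destination.Write(this.buffer.Memory.Span[..this.length]);

    public override void Flush()
    {
    }

    public override int Read(byte[] buffer, int offset, int count) => throw new NotSupportedException();
    public override long Seek(long offset, SeekOrigin origin) => throw new NotSupportedException();
    public override void SetLength(long value) => throw new NotSupportedException();

    private void EnsureCapacity(int required)
    {
        if (required <= this.buffer.Memory.Length)
        {
            return;
        }

        // Buffers above the allocator's pooling threshold simply come from the GC heap.
        IMemoryOwner<byte> grown = this.allocator.Allocate<byte>(Math.Max(required, this.buffer.Memory.Length * 2));
        this.buffer.Memory.Span[..this.length].CopyTo(grown.Memory.Span);
        this.buffer.Dispose();
        this.buffer = grown;
    }

    protected override void Dispose(bool disposing)
    {
        if (disposing)
        {
            this.buffer.Dispose();
        }

        base.Dispose(disposing);
    }
}

With something along these lines, encoder output could keep using pooled buffers while a chunked stream stays reserved for buffering non-seekable decoder input.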

@JimBobSquarePants changed the title from "Remove ChunkedMemoryStream" to "Rewrite ChunkedMemoryStream" on Oct 23, 2024
@JimBobSquarePants (Member Author)

> relative to performance benefits
>
> IMO the memory benefits were significant. With the current decoder & encoder design, non-seekable streams are always fully buffered into memory. A switch to MemoryStream will reintroduce significant GC allocations for large inputs all around the library, which will be a noticeable regression for users who deal with HTTP or other kinds of network streams.
>
> Recommendations:
>
>   • In case there is no bug in stream writing, do not remove ChunkedMemoryStream for decoders, since the largest streams are typically the input ones.
>   • Instead of using MemoryStream for encoders, implement our own non-chunked MemoryStream that still uses MemoryAllocator. Although buffers over 4MB won't be pooled, in typical scenarios encoded files are smaller.

Thanks for the review @antonfirsov. I've instead chosen to completely rewrite the ChunkedMemoryStream to simplify the implementation. It's much easier to maintain now!

public override void Flush()
{
_ = this.Read(this.singleByteBuffer, 0, 1);
return MemoryMarshal.GetReference<byte>(this.singleByteBuffer);
@mgravell commented Oct 30, 2024

since the byte[] path goes via AsSpan(), why not use the span approach directly?

Span<byte> buffer = stackalloc byte[1];
return Read(buffer) == 1 ? buffer[0] : -1;

ideally with SkipLocalsInit enabled

alt that elides a range check:

byte b = 0;
return Read(MemoryMarshal.CreateSpan(ref b, 1)) == 1 ? b : -1;

(you can't do this for the async path, though)

@JimBobSquarePants (Member Author)

Ah... I'd completely forgotten about that method!

Rather than decorating a public method with [SkipLocalsInit] I've opted for the following:

/// <inheritdoc/>
public override int ReadByte()
{
    Unsafe.SkipInit(out byte b);
    return this.Read(MemoryMarshal.CreateSpan(ref b, 1)) == 1 ? b : -1;
}

Member

Looks much better!

int offset = 0;
int count = buffer.Length;
while (count > 0)
while (bytesToRead != 0 && this.currentChunk != this.memoryChunkBuffer.Length)


I don't know enough about the underlying implementation here; if this is doing additional downstream reads, you might prefer to exit after the first read (you're only required to read "some" data - you don't need to fill the supplied buffer, just return at least 1 byte, or 0 for EOF). If all the data is already loaded, I wonder whether your memoryChunkBuffer is duplicating the innards of ReadOnlySequence<byte> - that already has all the Slice, CopyTo etc. you might want; just a suggestion, though (I can help you grok ReadOnlySequence<T> if you're not already familiar).

@JimBobSquarePants (Member Author)

Yeah, looking at MemoryChunkBuffer there is definitely some overlap. Ideally, I should be tracking the buffer and chunk indexes internally within that class.

However, this type needs to be expandable on-demand, which AFAIK is not possible with ReadOnlySequence<T>.

@mgravell commented Oct 31, 2024

The underlying buffer-chain is as mutable as you want it to be (it is your chain, ultimately); if you want to resize it, that is usually as simple as creating a new ROS (which is a lightweight struct just tracking the start and end), specifying the new bounds. The chain bits aren't trivial, but not too complex. I guess if what you already have works well, it might be overkill to touch it, though.
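For readers unfamiliar with the type, this is roughly what the segment-chain approach looks like; a minimal hedged sketch with illustrative names, not the PR's MemoryChunkBuffer:

using System;
using System.Buffers;

// Each pooled chunk becomes a node in a linked chain of segments.
internal sealed class ChunkSegment : ReadOnlySequenceSegment<byte>
{
    public ChunkSegment(ReadOnlyMemory<byte> memory) => this.Memory = memory;

    // Append a new chunk to the chain and return it so appends can continue.
    public ChunkSegment Append(ReadOnlyMemory<byte> memory)
    {
        ChunkSegment next = new(memory)
        {
            RunningIndex = this.RunningIndex + this.Memory.Length,
        };

        this.Next = next;
        return next;
    }
}

// Usage: whenever the chain grows, rebuild the (cheap, struct) sequence over the
// current bounds; Slice, CopyTo and ToArray then come for free.
// ChunkSegment first = new(chunkA);
// ChunkSegment last = first.Append(chunkB).Append(chunkC);
// ReadOnlySequence<byte> sequence = new(first, 0, last, last.Memory.Length);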

}

return chunkBuffer.GetSpan()[this.readOffset++];
MemoryMarshal.Write(this.singleByteBuffer, ref value);


ditto stackalloc; or possibly even the more exotic:

byte b = 0;
var span = MemoryMarshal.CreateSpan(ref b, 1);

@JimBobSquarePants (Member Author)

/// <inheritdoc/>
public override void WriteByte(byte value)
    => this.Write(MemoryMarshal.CreateSpan(ref value, 1));

/// </summary>
/// <returns>The <see cref="T:byte[]"/>.</returns>
/// <returns>A new <see cref="T:byte[]"/>.</returns>


Note that this is very inefficient; I would suggest trying to deprecate this kind of API - especially if we can use ROS, for example:

[Obsolete("prefer " + nameof(AsReadOnlySequence))] public byte[] ToArray() => AsReadOnlySequence().ToArray();
public ReadOnlySequence<byte> AsReadOnlySequence() => /* magic happens */

@JimBobSquarePants (Member Author) commented Oct 31, 2024

Unfortunately, we need this when reading XMP data for the V3 build; however, I wish to rewrite the XMPProfile type for V4 to avoid passing arrays around. Once this is merged to the V3 branch I'll upstream and make additional changes.

@antonfirsov (Member)

@JimBobSquarePants I believe I will have some time to also review this over the weekend.

@@ -12,44 +14,23 @@ namespace SixLabors.ImageSharp.IO;
/// Chunks are allocated by the <see cref="MemoryAllocator"/> assigned via the constructor
/// and is designed to take advantage of buffer pooling when available.
/// </summary>
internal sealed class ChunkedMemoryStream : Stream
public class ChunkedMemoryStream : Stream


Consider reverting this back to being sealed. I.e. is this class designed to be sub-classed? If not, then sealing can allow the JITter to make certain optimisations around calls to the virtual/override methods.

@JimBobSquarePants (Member Author)

It's supposed to be internal sealed actually! I forgot to change it back after rewriting (public makes the IDE tell me to add method docs).

private readonly int allocatorCapacity;

// Has the stream been disposed.
private long length;


Sidenote: IMO it's fairly conventional in C# land to prefix private field names with an underscore, to allow easy distinction from local variables and to avoid excessive use of "this.". I assume this is your personal preference, but thought I'd mention it, as IMO it is somewhat non-idiomatic C#.

@JimBobSquarePants (Member Author)

Thanks, but I'd rather stick to using the language as designed rather than using conventions carried over from C.

The this keyword provides important context IMO and encourages consistency throughout a codebase.

Member

@colgreen the project's preferred coding style is based on Framework design guidelines, and on StyleCop recommendations. The guidelines explicitly prohibit prefixing variables.


The guidelines state "Internal and private fields are not covered by guidelines" (I believe those guidelines are primarily related to the public API surface, rather than private/internal naming). The underscore prefix for private fields is very common in my experience, e.g. it's used widely in Microsoft .NET repos.

However, this topic is probably not relevant in the context of this PR :)

// Has the stream been disposed.
private long length;
private long position;
private int currentChunk;


These names could be considered a little misleading/confusing.

E.g. currentChunk is an index into memoryChunkBuffer, so I think maybe call it currentChunkIdx.

Whereas currentChunkIdx is an offset/index within the current chunk, so maybe call it intraChunkByteIdx, chunkByteIdx, or chunkByteOffset? etc.

@JimBobSquarePants (Member Author)

Yeah.. good point. I've opted for bufferIndex and chunkIndex as an improvement.

chunk.Dispose();
chunk = chunk.Next;
this.Dispose(true);
GC.SuppressFinalize(this);


I think it's not necessary to call GC.SuppressFinalize(this) in a sealed class with no finalizer. This would be to cover sub-types that have a finalizer (in scenarios where there is no finalizer defined directly on the type).

@JimBobSquarePants (Member Author)

Force of habit. Well spotted!

@antonfirsov (Member) left a comment

  • Previous reviewers had good points.
  • Added some nitpicking.
  • Improving functional coverage would be valuable.

Looks good otherwise.

return i < 16 ? b128K * (1 << (int)((uint)i / 4)) : b4M;
}

private void Dispose(bool disposing)
Member

Given https://github.com/SixLabors/ImageSharp/pull/2828/files#r1823357071, the Dispose(bool disposing) is not even needed.

Same for MemoryChunk.

@JimBobSquarePants (Member Author)

That's an override of the base Stream method though; I can't implement Dispose() directly. All other implementations have been simplified now.
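For context on the pattern being discussed, a minimal sketch (placeholder class, not the PR's code): the base Stream.Dispose() already calls Dispose(true) and GC.SuppressFinalize(this), so a sealed subclass with no finalizer only supplies the protected override.

using System.IO;

internal sealed class DisposeSketchStream : Stream
{
    private bool isDisposed;

    // Stream.Dispose() -> Close() -> Dispose(true) + GC.SuppressFinalize(this),
    // so nothing more is needed in a sealed type without a finalizer.
    protected override void Dispose(bool disposing)
    {
        if (!this.isDisposed && disposing)
        {
            // Return pooled chunks to the allocator here.
        }

        this.isDisposed = true;
        base.Dispose(disposing);
    }

    // Remaining abstract members stubbed out only to keep the sketch compilable.
    public override bool CanRead => false;
    public override bool CanSeek => false;
    public override bool CanWrite => false;
    public override long Length => 0;
    public override long Position { get => 0; set { } }
    public override void Flush() { }
    public override int Read(byte[] buffer, int offset, int count) => 0;
    public override long Seek(long offset, SeekOrigin origin) => 0;
    public override void SetLength(long value) { }
    public override void Write(byte[] buffer, int offset, int count) { }
}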

Comment on lines 216 to 221
if (remaining > count)
{
remaining = count;
}

Span<byte> chunkBuffer = this.writeChunk.Buffer.GetSpan();
int chunkSize = this.writeChunk.Length;
int count = buffer.Length;
int offset = 0;
while (count > 0)
int bytesToWrite = (int)remaining;
Member

remaining is not being used after this line.

-        if (remaining > count)
-        {
-            remaining = count;
-        }

-        int bytesToWrite = (int)remaining;
+        int bytesToWrite = count;

Same for Read.

Comment on lines 446 to 450
public IEnumerator<MemoryChunk> GetEnumerator()
=> ((IEnumerable<MemoryChunk>)this.memoryChunks).GetEnumerator();

IEnumerator IEnumerable.GetEnumerator()
=> ((IEnumerable)this.memoryChunks).GetEnumerator();
Member

I don't see any code enumerating this with foreach, so IEnumerable implementation can be deleted.


@@ -30,7 +30,7 @@ public class ChunkedMemoryStreamTests
[Fact]
Member

Is it possible to extend these tests to stress the corner cases which were buggy in the previous implementation?

@JimBobSquarePants (Member Author)

Yeah. Tests have been massively expanded. We now test reading larger buffers and test encoding to WebP for all test images.

Comment on lines 107 to 108
[InlineData(DefaultSmallChunkSize * 16)]
public void MemoryStream_ReadByteBufferSpanTest(int length)
@antonfirsov (Member) commented Nov 3, 2024

I would also include lengths over DefaultSmallChunkSize * 16 and make buffer.Length a parameter. That would cover cases where buffer.Length > DefaultSmallChunkSize.
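Something along these lines would cover it; a hedged sketch in the style of the existing xunit tests, assuming the ChunkedMemoryStream(MemoryAllocator) constructor, a seekable stream after writing, and a hypothetical CreateRandomBytes helper:

[Theory]
[InlineData(DefaultSmallChunkSize * 16, 128)]
[InlineData(DefaultSmallChunkSize * 32, DefaultSmallChunkSize * 2)]
public void MemoryStream_ReadSpan_VariableBufferLength(int length, int bufferLength)
{
    byte[] expected = CreateRandomBytes(length); // hypothetical helper
    using ChunkedMemoryStream stream = new(MemoryAllocator.Create());
    stream.Write(expected);
    stream.Position = 0;

    byte[] actual = new byte[length];
    Span<byte> buffer = new byte[bufferLength];
    int read = 0;
    while (read < length)
    {
        int n = stream.Read(buffer);
        Assert.True(n > 0);
        buffer[..n].CopyTo(actual.AsSpan(read));
        read += n;
    }

    Assert.Equal(expected, actual);
}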

@JimBobSquarePants (Member Author)

length is already determined by the parameter, but I've expanded it to double the previous maximum length.

@@ -167,18 +167,20 @@ public void MemoryStream_WriteToTests()
[Fact]
public void MemoryStream_WriteToSpanTests()
Member

Write tests could stress more cases with different sizes. See my comment on the read tests.

int offset = 0;
int count = buffer.Length;
while (count > 0)
while (bytesToRead != 0 && this.currentChunk != this.memoryChunkBuffer.Length)
@antonfirsov (Member) commented Nov 3, 2024

Even though the current code looks good and Slice would throw for a negative number, I find this safer to maintain. Same for Write.

Suggested change:
- while (bytesToRead != 0 && this.currentChunk != this.memoryChunkBuffer.Length)
+ while (bytesToRead > 0 && this.currentChunk != this.memoryChunkBuffer.Length)
