Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add NIOFilesystem #2615

Merged
merged 17 commits into from
Jan 16, 2024
Merged

Add NIOFilesystem #2615

merged 17 commits into from
Jan 16, 2024

Conversation

glbrntt
Copy link
Contributor

@glbrntt glbrntt commented Dec 21, 2023

Motivation:

I/O is typically blocking, but options for interacting with the
filesystem using async APIs are limited. This change adds a new module
to NIO, NIOFileSystem. At the moment this is exposed as the
_NIOFileSystem product while users get a chance to try it out and
provide feedback.

During this time the API of NIOFileSystem is subject to change.
Breaking changes will come in minor NIO releases and additive changes
may come in patch releases of NIO.

Modifications:

  • Add NIOFileSystem, NIOFileSystemFoundationCompat,
    NIOFileSystemTests, NIOFileSystemIntegrationTests, and
    NIOFileSystemFoundationCompatTests.
  • Update docs
  • Update CNIOLinux and CNIODarwin shims

Result:

File system can be accessed via async APIs

Motivation:

I/O is typically blocking, but options for interacting with the
filesystem using `async` APIs are limited. This change adds a new module
to NIO, `NIOFileSystem`. At the moment this is exposed as the
`_NIOFileSystem` product while users get a chance to try it out and
provide feedback.

During this time the API of `NIOFileSystem` is subject to change.
Breaking changes will come in _minor_ NIO releases and additive changes
may come in _patch_ releases of NIO.

Modifications:

- Add `NIOFileSystem`, `NIOFileSystemFoundationCompat`,
  `NIOFileSystemTests`, `NIOFileSystemIntegrationTests`, and
  `NIOFileSystemFoundationCompatTests`.
- Update docs
- Update CNIOLinux and CNIODarwin shims

Result:

File system can be accessed via `async` APIs
@glbrntt
Copy link
Contributor Author

glbrntt commented Dec 21, 2023

This change is split into two commits:

Copy link
Contributor

@adam-fowler adam-fowler left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

From an initial look this is really cool. So far I've used this to read, write files and read chunks and write buffered files. I've included a few comments below.

I did have one crash. deinit was run on a lock that still had a count. I cannot be sure if it was my code or the file system code though. It seemed to be related to a task using a readChunks AsyncSequence being cancelled. I cannot replicate it though

range: FileChunks.ChunkRange,
lowWatermark: Int,
highWatermark: Int
) -> BufferedStream<ByteBuffer> {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There is no public method for setting the low and high watermarks.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That's intentional, I don't think they are something users should need to worry about. I suspect we'll need to tweak these to find an optimal value for most users though.

/// A global shared instance of ``FileSystem``.
///
/// See also ``FileSystem/shared``.
public static var shared: FileSystem {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Having FileSystem.shared and FileSystemProtocol.shared generates confusing errors when combined with a method that includes a closure eg

await withThrowingTaskGroup(of: Void.self) { group in
    group.addTask {
        try await FileSystem.shared.withFileHandle(forWritingAt: FilePath(filename)) { fileHandle in
            try await fileHandle.write(contentsOf: buffer.readableBytesView, toAbsoluteOffset: 0)
        }
    }
}

Produces the error ambiguous use of shared
This hides the actual error which is write is returning an Int64 when the withFileHandle closure should return a Void.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It seems the compiler isn't able to see the @discardableResult in some contexts and drop the result in favour of Void.

Given:

@discardableResult
func discardableInt() -> Int { 42 }

func withResultOfClosure<R>(_ fn: () throws -> R) rethrows -> R {
    try fn()
}

func withVoidResult(_ fn: () throws -> Void) rethrows {
    try fn()
}

This warns:

func fn() {
    withResultOfClosure { // WARN: Result of call to 'withResultOfClosure' is unused
        discardableInt()
    }
}

Adding an explicit return suppresses the warning:

func fn() {
    return withResultOfClosure {
        discardableInt()
    }

    // Also suppresses the warning:
    //
    // withResultOfClosure {
    //     discardableInt()
    // } as Void
}

Wrapping the closure in a closure which must return Void (more or less what you've run into) emits an error:

func fn() {
    withVoidResult { // ERROR: Type of expression is ambiguous without a type annotation
        withResultOfClosure {
            discardableInt()
        }
    }
}

Adding the explicit return 'fixes' it:

func fn() {
    withVoidResult {
        return withResultOfClosure {
            discardableInt()
        }
    }
}

So I agree this isn't a great situation but I think this is on the compiler to improve. Having the shared instance is the right thing to do as is returning the number of bytes written as a discardable result. I'm open to workarounds if you have suggestions though!

fileSystem: .shared
)
}
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It'd be nice to have a ByteBuffer.write(to: FilePath) helper function

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Agreed. There are extensions on Sequence<UInt8> and by extension ByteBufferView but I think it's worth adding shims on ByteBuffer which call these.

@hassila
Copy link
Contributor

hassila commented Dec 22, 2023

Very cool. Haven't had time to dig in looking at the implementation yet, but just wanted to check if the design has a memory ownership model suitable for (and is considering...) true async backends (ie. Io_uring) - would make sense to use that if available (even if not part of the initial implementation, it would be nice to be able to add it in later without too much trouble).

@adam-fowler
Copy link
Contributor

One thing I've noticed is GH actions running on a macOS 13 runner fail to compile because they can't find the FTS symbols

Copy link
Member

@FranzBusch FranzBusch left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Great stuff! I think this PR is already using swift-format right? Should we setup format checking for just the new modules as a way to get us started with adopting swift-format and avoiding format divergence in those new modules?

Sources/NIOFileSystem/BufferedReader.swift Outdated Show resolved Hide resolved
Sources/NIOFileSystem/ByteCount.swift Outdated Show resolved Hide resolved
Sources/NIOFileSystem/Docs.docc/NIOFileSystem.md Outdated Show resolved Hide resolved
/// more information about the error by calling ``FileSystemError/detailedDescription()`` which
/// returns a structured multi-line string containing information about the error.
@available(macOS 10.15, iOS 13.0, watchOS 6.0, tvOS 13.0, *)
public final class FileSystem: Sendable, FileSystemProtocol {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Does this actually need to be a class?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@glbrntt you never replied here

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh sorry, I missed this one. Just changed it.

@glbrntt
Copy link
Contributor Author

glbrntt commented Jan 8, 2024

From an initial look this is really cool. So far I've used this to read, write files and read chunks and write buffered files. I've included a few comments below.

Thanks for the feedback!

I did have one crash. deinit was run on a lock that still had a count. I cannot be sure if it was my code or the file system code though. It seemed to be related to a task using a readChunks AsyncSequence being cancelled. I cannot replicate it though

Interesting, could you share what the calling code looking like?

@glbrntt
Copy link
Contributor Author

glbrntt commented Jan 8, 2024

One thing I've noticed is GH actions running on a macOS 13 runner fail to compile because they can't find the FTS symbols

Thanks for highlighting this, will investigate.

@glbrntt
Copy link
Contributor Author

glbrntt commented Jan 8, 2024

Great stuff! I think this PR is already using swift-format right? Should we setup format checking for just the new modules as a way to get us started with adopting swift-format and avoiding format divergence in those new modules?

It was, yes. We can but if we do that it should be in a separate PR.

@adam-fowler
Copy link
Contributor

I did have one crash. deinit was run on a lock that still had a count. I cannot be sure if it was my code or the file system code though. It seemed to be related to a task using a readChunks AsyncSequence being cancelled. I cannot replicate it though

Interesting, could you share what the calling code looking like?

Err that was last year 😬 . It's in one of the Soto multipart S3 upload/download functions. They are pretty complex. I just had a quick look through the code and I found an unstructured Task in one of the download functions (That section was written in the swift 5.5 anything goes days) it isn't exactly in the file saving part but means the code is probably untrustworthy. I'll rewrite it, I was meaning to anyway, and see if I can replicate after that.

@glbrntt
Copy link
Contributor Author

glbrntt commented Jan 8, 2024

Very cool. Haven't had time to dig in looking at the implementation yet, but just wanted to check if the design has a memory ownership model suitable for (and is considering...) true async backends (ie. Io_uring) - would make sense to use that if available (even if not part of the initial implementation, it would be nice to be able to add it in later without too much trouble).

There is API space for different backend implementations but I don't have enough experience with io_uring to know if or how well it'd fit. It'd be great to have support but it isn't a priority for us and is unlikely to be one any time soon.

@hassila
Copy link
Contributor

hassila commented Jan 8, 2024

Very cool. Haven't had time to dig in looking at the implementation yet, but just wanted to check if the design has a memory ownership model suitable for (and is considering...) true async backends (ie. Io_uring) - would make sense to use that if available (even if not part of the initial implementation, it would be nice to be able to add it in later without too much trouble).

There is API space for different backend implementations but I don't have enough experience with io_uring to know if or how well it'd fit. It'd be great to have support but it isn't a priority for us and is unlikely to be one any time soon.

It is a true async syscall interface with basically two ring buffers on in each direction, we can pray that we will get something similar on Darwin in the future too as it's really a great step forward on many fronts - we are working on getting backend support for io_uring for the networking part in #2357 and there were some discussions in #1890 and #1805 with regard to memory ownership in that context.

The major issue in general is typically memory ownership of buffers and the assumption that syscalls are synchronous in nature, rather than allowing for a completion event to finish the syscall - designing api surfaces so that it's possible to write a backend which is async in nature would be nice future proofing - but now, I still haven't had time to go through the API in detail, just wanted to share the links above as they might be relevant.

@glbrntt
Copy link
Contributor Author

glbrntt commented Jan 15, 2024

One thing I've noticed is GH actions running on a macOS 13 runner fail to compile because they can't find the FTS symbols

@adam-fowler FYI this should be fixed in 52a0091

@glbrntt glbrntt requested a review from FranzBusch January 15, 2024 11:06
@adam-fowler
Copy link
Contributor

@adam-fowler FYI this should be fixed in 52a0091

I reran the failing job and it worked 👍

Copy link
Member

@FranzBusch FranzBusch left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM!

@glbrntt glbrntt enabled auto-merge (squash) January 16, 2024 11:20
@glbrntt
Copy link
Contributor Author

glbrntt commented Jan 16, 2024

Credit also goes to @stefanadranca and @gjcairo who contributed to this work.

@glbrntt glbrntt added the 🆕 semver/minor Adds new public API. label Jan 16, 2024
//===----------------------------------------------------------------------===//

/// Represents the number of bytes.
public struct ByteCount: Hashable, Sendable {
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is definitely a nice type to have outside of NIOFileSystem as well, could we move this?

@glbrntt glbrntt merged commit f80c69f into apple:main Jan 16, 2024
8 of 10 checks passed
@glbrntt glbrntt deleted the filesystem branch January 16, 2024 12:55
glbrntt added a commit to glbrntt/swift-nio that referenced this pull request Jan 16, 2024
Motivation:

In apple#2615 some of the syscall tests use a mocking infrastrucutre which is
compiled out in release builds. The tests weren't correctly guarded for
this so fail to build in release mode.

Modifications:

- Add missing guards

Result:

Tests compile in release mode
glbrntt added a commit to glbrntt/swift-nio that referenced this pull request Jan 16, 2024
Motivation:

In apple#2615 some of the syscall tests use a mocking infrastrucutre which is
compiled out in release builds. The tests weren't correctly guarded for
this so fail to build in release mode.

Modifications:

- Add missing guards
- Add missing availability on tests

Result:

Tests compile in release mode
glbrntt added a commit that referenced this pull request Jan 16, 2024
Motivation:

In #2615 some of the syscall tests use a mocking infrastrucutre which is
compiled out in release builds. The tests weren't correctly guarded for
this so fail to build in release mode.

Modifications:

- Add missing guards
- Add missing availability on tests

Result:

Tests compile in release mode
}

int CNIOLinux_renameat2(int oldfd, const char* old, int newfd, const char* newName, unsigned int flags) {
return renameat2(oldfd, old, newfd, newName, flags);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This broke my Android CI today because Android didn't add this to its Bionic libc till API 30 a couple years ago, whereas I build against the older API 24.

Since this approach is still experimental, can we just disable rename() alone on Android or something like that?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is also used during file creation so we can't just disable it on Android. I think we can work around this though as this also isn't available on Darwin so I think we can just do what we're doing on Darwin.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

See #2627

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
🆕 semver/minor Adds new public API.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

7 participants