Skip to content

Conversation

@rarensu
Copy link
Contributor

@rarensu rarensu commented Jun 24, 2025

Guys! It's finally happening. I have completed the long-anticipated major refactor to eliminate the unwanted reply callback paradigm inherited from libfuse.

Purpose: a goal of this pull request is to avoid functional changes and preserve existing features. I just wanted the things that worked before to continue working. In theory, code written against the new fuser API should compile to a nearly-identical binary. This is about ergonomics, readability, maintainability, and style. I documented everything I could. I even used cargo clippy. The main benefit of this work is that it paves the way for future developments such as async/await.

Excerpt from CHANGELOG:

  • Major API Refactor: The Filesystem trait methods have been refactored to return Result instead of using Reply objects for callbacks.
    • All Filesystem trait methods that previously accepted a reply: ReplyT parameter now return a Result<T, Errno>, where T is a struct containing the success data for that operation.
    • The Request object passed to Filesystem methods has been replaced with RequestMeta, a smaller struct containing only the request's metadata (uid, gid, pid, unique id). The full request parsing is now handled internally.
    • Additional public structs are introduced to simplify handling of request data, response data, and errors.
    • This change unifies the implementation of Filesystem methods and brings them more in line with Idiomatic Rust.
    • Examples and tests have been updated to match this new API. A few unrelated bugs in examples and tests have been fixed.
    • Idiomatic implementation of Passthrough and Notify are TODO items.

Status: on my machine:

  • All features compile.
  • All tests pass.
  • All the examples function correctly, to the best of my knowledge.

Next:

  1. I need help to test on more kinds of devices and with more complex applications. I don't have a macos device and I don't understand how to make Ioctl do anything.
  2. We need to discuss what to do about Passthrough and Notify. It wasn't obvious to me how to minimally change them to where they just work without the callback. I do have opinions about how to bring them into the new paradigm, but it would involve functional changes, which I believed was out of the scope of the PR.

@rarensu
Copy link
Contributor Author

rarensu commented Jun 24, 2025

Some related issues I would like to mention.
#164 -- the original issue that motivated this PR.
#274 -- this PR could address this issue, but doesn't yet.
#292 -- this PR could address this issue, but doesn't yet.
#296 -- this PR would close this issue.
#297 -- this PR would re-open this issue.
#298 -- this PR would partially address this issue.
#320 -- an issue related to the next steps after this PR.
#335 -- this PR would close this issue.

@allisonkarlitskaya
Copy link
Contributor

Regarding fd passthrough: I guess I'd suggest moving the backing ID registration stuff to either the request object or as a method on the trait itself (although I have a hard time imagining how that might look in terms of implementation)... and then the return value of the open function would become an enum, I guess?

for what it's worth, though, I think I disagree with this PR in general. Forcing the user to return a Vec from the read function? That seems like a pointless copy. What about flags? What about when some new feature gets added and you can't simply add a new API but now you need to add a new field to a type? What about alternate execution models where you can punt the reply object to a thread for background processing?

I really like the ergonomics improvement about returning Result and handling Errno via Err() return. That's a definite improvement. I also like the safety of forcing a reply value to be generated (although see the above comment about the thread thing).

So ya: overall I feel like this PR isn't an improvement...

@rarensu
Copy link
Contributor Author

rarensu commented Jun 25, 2025

Regarding fd passthrough: I guess I'd suggest moving the backing ID registration stuff to either the request object or as a method on the trait itself (although I have a hard time imagining how that might look in terms of implementation)... and then the return value of the open function would become an enum, I guess?

An open function that returns an enum makes a lot of sense to me. If the file can be opened, by one method or another, then the open function can return that.

How important is it that the application gets to handle the backing id? In your example, the application handles the backing ids in a custom structure. That feels very opposite to the use case I would have guessed for this feature. When I am building a filesystem, if I want to passthrough file handles, then I assume I also want my application to be minimal. I just want to point the fuser crate in the general direction of a file, fire and forget. If the fuser crate were responsible for storing the backing id, then it would greatly simplify the api. This feels more correct for the use case I imagine.

So I think an open function might look like:

fn open(self, ... ) -> Result<OpenEnum, Errno>{
    match flip_a_coin? { 
        heads => {
            fh: u64 = ... ;
            Ok(OpenEnum::OwnedHandle( fh, ... ))
        },
        tails => {
            name: OsString = ... ;
            Ok(OpenEnum::Passthrough( name, ... ))
        },
        _ => Err(Errno::coin_on_edge)
    }
}

In that case, I would move the backing map to the session object. The release block in requests.rs would switch between putting file handles into the backing map or passing on the file handle on to the application as appropriate.

@rarensu
Copy link
Contributor Author

rarensu commented Jun 25, 2025

for what it's worth, though, I think I disagree with this PR in general. Forcing the user to return a Vec from the read function? That seems like a pointless copy.

It does seem like a pointless copy. However, if written correctly (some extra steps), then it's actually a no-cost move. I acknowledge that the added steps required to avoid a pointless copy are a disadvantage.

There are disadvantages to references, too. If the trait method returns a reference, then the application developer needs to know how to set an explicit lifetime for the data, or wrap it in some kind a shared reference-counting object. Both are extremely unfriendly to the async/await paradigm that I am planning to support.

Sometimes, the copy is inevitable. For example, in the asynchronous case, or in the case where the application lacks complete control over the source data (like, when reading from an underlying filesystem), then the data needs to be copied into a buffer; otherwise someone could be writing over the data at the source while you're waiting for the kernel to come and read it. The advantage of the immutable reference is only really relevant for the exact case of a synchronous filesystem that keeps its data in memory. For everyone else, copying into an owned vector is not a big difference.

For these reasons, I felt that the benefits of the owned vector outweighed the benefits of the reference. If I am proven wrong, then I will change it back to reference.

@rarensu
Copy link
Contributor Author

rarensu commented Jun 25, 2025

What about flags? What about when some new feature gets added and you can't simply add a new API but now you need to add a new field to a type?

Sorry, I don't understand what you're asking me. The current API already struggles to adopt new features. Adding a field to a struct is usually less invasive than adding an whole new argument to a function. I thought this would be an improvement.

@rarensu
Copy link
Contributor Author

rarensu commented Jun 25, 2025

What about alternate execution models where you can punt the reply object to a thread for background processing?

My understanding of fuse file system applications is it they generally want concurrency more than they want multiprocessing. The way you usually achieve concurrency with multi-threading is by occupying multiple CPU cores, and that's not very efficient for a lightweight application that spends most of its time waiting for file IO. More or less, the whole point of async: it's the easiest way for tasks to be concurrent on a single CPU. In the interest of performance, it's probably more important to support async than it is to support threading. Additionally, multithreading and async can coexist; for example, we can support multi-threading at the session level, while doing async at the request level. That's what the async runtime (such as tokio) is for; it negotiates between tasks and threads. In that scenario, the dispatch loop, the filesystem, and the reply are all async tasks that are assigned to threads in a pool. So the answer to your question is yes, we will be doing what you described, and it does not require that the reply be a callback.

@allisonkarlitskaya
Copy link
Contributor

allisonkarlitskaya commented Jun 25, 2025

It does seem like a pointless copy. However, if written correctly (some extra steps), then it's actually a no-cost move. I acknowledge that the added steps required to avoid a pointless copy are a disadvantage.

You mention your system in-memory filesystem, which I think is a good place to start: then you're forced to copy each time (and then yet another copy at the syscall boundary). Being able to pass a &[u8] is a much nicer API, but of course that doesn't work with return values because of the scope thing... although maybe you could figure out a way, to be honest. The methods on the filesystem impl are &mut and if you returned data scoped to the same lifetime as self, it would basically be a way to say "this is good until you call the next method". But then you force the value to continue to live, when maybe it really was a Vec originally and you wanted it to be freed (so that you don't have to cache it somewhere). So, maybe have it both ways with Cow? That would cover a good number of cases, but not all: maybe we want to run some specific code when freeing (like munmap). You can't return an opaque impl via a trait method unfortunately... You could make the filesystem impl generic over the data type it returns from its read function but if you go that way you'd probably also want to make it generic over a half dozen other things. This would get messy fast.

In short: it's a real benefit that the filesystem implementation gets to run again after the reply is sent in order to do cleanup tasks (even if they're implicit via Drop trait impls). The only other way is the copy...

So the answer to your question is yes, we will be doing what you described, and it does not require that the reply be a callback.

Unless you make all of the trait methods async, I don't understand how. And note also: async methods on traits is realllly tricky.

Note also this existing documentation:

It's very clearly intended behaviour that you run the session and move/Send the reply objects to other threads. I'd be very surprised if nobody was doing this.

Sorry, I don't understand what you're asking me. The current API already struggles to adopt new features. Adding a field to a struct is usually less invasive than adding an whole new argument to a function. I thought this would be an improvement.

So in my experiencing adding the passthrough-fd thing, for example: I was able to do that without breaking anything by simply adding some new methods on the reply object. If this were a structure and I added a field to it then you're breaking API: all existing users would need to make sure that that (newly) missing field value now got filled in. If you're looking at enums, of course, then you could make an argument that it's safe to add new members if you're using #[non_exhaustive]...

Random other thing I forgot to mention last time: I also noticed that your implementation of directory reading using a vector doesn't give the "stop adding items" signal that the existing API features. That's important information that's missing.

I think you could maybe rescue this with async, enum, some clever use of things like Cow, etc., but it would be an awful lot of work, and it might prove to be impossible in the end. There are parts here that are helpful incremental improvements, though, like being able to return an errno...

@rarensu
Copy link
Contributor Author

rarensu commented Jun 25, 2025

Random other thing I forgot to mention last time: I also noticed that your implementation of directory reading using a vector doesn't give the "stop adding items" signal that the existing API features. That's important information that's missing.

The feature exists; I just moved it to the fuser internal side. If the vector is too big, fuser won't send all of it. Instead, I pass the max_bytes as a number to the filesystem. The filesystem developer can optionally choose to be responsible for respecting max bytes. A helper function might be appropriate?

If you have a specific reason why you think the filesystem developer should be required to use the fuser implementation of this logic, then I'm ready to listen. We could wrap the vanilla vector in a struct that comes with the helper functions you like.

@rarensu
Copy link
Contributor Author

rarensu commented Jun 25, 2025

Unless you make all of the trait methods async, I don't understand how. And note also: async methods on traits is realllly tricky.

Yes, all of the trait methods will be async. Yes, it will be tricky. That is beyond the scope of this PR.

Note also this existing documentation:

It's very clearly intended behaviour that you run the session and move/Send the reply objects to other threads. I'd be very surprised if nobody was doing this.

I don't agree with your assessment that it's very clearly intended. I think people are doing it because threads are what is currently available.

@cberner
Copy link
Owner

cberner commented Jun 30, 2025

Thanks for working on this! I haven't had time to review the whole PR, but my high level thoughts are:

  1. I like that this makes the API more ergonomic by enforcing return values and avoiding callbacks
  2. It would be helpful to me to see what the final async API is going to look like. I haven't thought it through, but do you know if it's possible to say add an AsyncFilesystem trait that has the async API we want, and then an adapter implementation that takes that AsyncFilesystem trait and translates the calls to the current legacy trait? If that's possible, my preferred approach would be: a) add that async trait and merge it to master, b) release it and let people try it out, c) once it's in good shape remove the current Filesystem trait and switch everything over.
  3. In its current state, I think this PR has some blockers. The ownership issues that @allisonkarlitskaya & I commented on are one, and then the removal of multi-threading is another. I think if we can go down the path of a separate AsyncFilesystem trait that would side step both of those issues
  4. The current PR is very large, let's find a way to split the functionality up into smaller pieces, so that I can review them. Unfortunately, I have less time to work on fuser than I used to 😓

Updated poll method to remove one more callback.
@rarensu
Copy link
Contributor Author

rarensu commented Jun 30, 2025

  1. In its current state, I think this PR has some blockers. The ownership issues that @allisonkarlitskaya & I commented on are one, ...

Sorry, I don't see where you @cberner commented on an ownership issue. Can you be more specific?

  1. It would be helpful to me to see what the final async API is going to look like. I haven't thought it through, but do you know if it's possible to say add an AsyncFilesystem trait that has the async API we want, and then an adapter implementation that takes that AsyncFilesystem trait and translates the calls to the current legacy trait?

I am planning to implement an async filesystem trait probably towards the end of July. My current plan is to support both sync and async variant at the crate API level forever, while also unifying them at the crate internal level so there is no duplicate code. Using current technology, this can be done using a clever combination of modules and macros. Eventually, Rust will adopt a "generic over async-ness" function property and should will take over the job of code unification. If I have planned correctly, migrating from macros to the generic async property will be straightforward and put fuser way ahead of the pack.

... If that's possible, my preferred approach would be: a) add that async trait and merge it to master, b) release it and let people try it out, c) once it's in good shape remove the current Filesystem trait and switch everything over.

Would you considered adding a develop branch or a feature branch to fuser? You can merge from my fork to that branch for testing and sharing, and then just delete the branch if it doesn't work out.

Comment on lines -434 to +445
name: &OsStr,
name: OsString,
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Are these changes from &OsStr to OsString necessary? I think we should structure the API to minimize copying data

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would like to understand where the copy operation you are worried about takes place. When the data comes in from the kernel, it's not OsStr, it's u8. So, we must construct an OsStr at least once. Given that we have to construct it, constructing OsStr and constructing OsString are pretty much the same thing. The difference only becomes relevant after the data is passed to the filesystem implementation. If the session object needs to both pass the OsStr to the filesystem and hold a copy for later, then it matters, because borrow is cheaper than clone. But why does the session need to keep a copy of the OsStr after passing the data to the filesystem? If it doesn't need to, I would argue that moving the data with ownership is more ergonomic than moving it without ownership. I will admit, that this is a weak argument. I am not aware of anything technical that prevents us from favoring a borrowed OsStr in the trait signature. For example, both OsStr and OsString are both trait send and trait sync.

Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I left a comment on one of the copies

Comment on lines -574 to +590
data: &[u8],
data: Vec<u8>,
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same question about ownership. I think we should leave this as a borrow

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

same as the OsStr comment. This seems to be a matter of preference. I must ponder.

lock_owner: Option<u64>,
reply: ReplyData,
) {
) -> Result<Vec<u8>, Errno> {
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We should use an output parameter here, like out: &mut [u8] otherwise implementations are required to allocate the return Vec<u8> buffer which will be more expensive. I rely on avoiding a Vec allocate in fleetfs, see here which calls this

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you for the example. I see from your example how a callback is compatible with borrowed data. It is a neat trick that I admit does make it easier to avoid an unnecessary copy.

I still don't see how to do that trick without the callback; maybe it's theoretically possible by adding lifetime parameters? That would not feel very ergonomic to me.

Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not too familiar with the ergonomics of async APIs, but ya it looks like Tokio does it with a lifetime: https://docs.rs/tokio/latest/tokio/io/trait.AsyncReadExt.html#method.read_exact

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I do not yet understanding the consequences of using a pre-allocated data out buffer. Does that imply that the entire buffer needs to be allocated in advance, every time, whether or not there will actually be any data?

@cberner
Copy link
Owner

cberner commented Jun 30, 2025

Sorry, I don't see where you @cberner commented on an ownership issue. Can you be more specific?

Oops, I forgot to hit submit on my review. I submitted it now

Would you considered adding a develop branch or a feature branch to fuser? You can merge from my fork to that branch for testing and sharing, and then just delete the branch if it doesn't work out.

I don't want to add a branch in this repo, but people should be able to use your repo & branch for testing I think. Is there any testing that requires the branch to be in my repo?

Comment on lines -450 to +656
x.name().as_ref(),
x.name().into(),
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@rarensu here is where one of the copies happens that I am referring to

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am beginning to understand. Path was constructed by borrowing from the underlying bytes, so there was no copy. Converting further to OsStr is zero copy, and converting even further to str could even be zero copy. On the other hand, OsString is owned. There must be a copy.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What if I refactored the lower level request handling so that raw data is parsed into owned structs from the get-go? The upside of this is that the data stays owned at all times, so there is no need to copy. it's just move-move-move. The potential downside is that the data is no longer accessible to the session loop after it gets passed to the filesystem. But, is that an actual downside? Is there any reason that the session loop might need the raw request data after the filesystem takes it?

@rarensu
Copy link
Contributor Author

rarensu commented Jun 30, 2025

@cberner @allisonkarlitskaya thank you for your feedback about my PR. I have been contemplating the Vec issue. I would like to propose a solution that is near-optimal for a wide variety of use cases, at the cost of being a bit fancier.

pub enum DataU8<'a> {
    /// A borrowed slice from an existing memory location.
    Borrowed(&'a [u8]),
    /// An owned, sized data structure. 
    Owned(Box<[u8]>),
} 

fn Foo() -> Result<DataU8<'a>, Errno> {
    if a {
      let cached_data: 'static [u8] = something;
      // a true zero copy (the lifetime is your problem now)
      Ok(DataU8::Borrowed(&cached_data)
    }
    if b {
      let allocated_data: Vec<u8> = something;
      // a low-cost move (the bytes are not copied, just the pointer).
      Ok(DataU8::Owned(allocated_data.into_boxed_slice())
    }
}

@allisonkarlitskaya
Copy link
Contributor

@cberner @allisonkarlitskaya thank you for your feedback about my PR. I have been contemplating the Vec issue. I would like to propose a solution that is near-optimal for a wide variety of use cases, at the cost of being a bit fancier.

take a look at std::borrow::Cow

@rarensu
Copy link
Contributor Author

rarensu commented Jul 1, 2025

take a look at std::borrow::Cow

Ok, I see it, and now I am embarrassed to have proposed a redundant implementation.

I need to think harder.

@rarensu
Copy link
Contributor Author

rarensu commented Jul 21, 2025

This PR is obsolete. See PR #347

@rarensu rarensu closed this Jul 21, 2025
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants