How to create file system snapshots with fs_snapshot_create?

The online documentation for fs_snapshot_create, which is on a website which apparently I'm not allowed to link to on this forum, mentions that some entitlement is necessary, but doesn't specify which one. Searching online I found someone mentioning com.apple.developer.vfs.snapshot, but when adding this to my entitlement file and building my Xcode project, I get the error

Provisioning profile "Mac Team Provisioning Profile: com.example.myApp" doesn't include the com.apple.developer.vfs.snapshot entitlement.

Searching some more online, I found someone mentioning that one has to request this entitlement from DTS. Is this true? I couldn't find any official documentation.

I actually want to make a snapshot of a user-selected directory so that my app can sync it to another volume while avoiding that the user makes changes during the sync process that would make the copy inconsistent. Would fs_snapshot_create be faster than traversing the chosen directory and creating clones of each nested file with copyfile and the flag COPYFILE_CLONE? Although I have the impression that only fs_snapshot_create could make a truly consistent snapshot.

I’m gonna let Kevin answer your main question, but I want to address this:

which is on a website which apparently I'm not allowed to link to on this forum

Yeah you are, you just have to do it in the clear. See tip 14 in the increasingly badly named Quinn’s Top Ten DevForums Tips.

Share and Enjoy

Quinn “The Eskimo!” @ Developer Technical Support @ Apple
let myEmail = "eskimo" + "1" + "@" + "apple.com"

Thanks. Here is the link to the documentation: https://manp.gs/mac/2/fs_snapshot_create

(The validation message I get when trying to use a Markdown link is quite misleading: "This domain is not a permitted domain on this forums.")

I’m gonna let Kevin answer your main question

I can never resist a good file system question.

Searching some more online, I found someone mentioning that one has to request this entitlement from DTS. Is this true?

Yes, that is correct. However, the criteria for granting this entitlement are very narrow, so it is only granted to "backup" applications. I'm not sure that this would qualify:

I actually want to make a snapshot of a user-selected directory so that my app can sync it to another volume while avoiding that the user makes changes during the sync process that would make the copy inconsistent.

Having said that, I'm also not sure you need volume snapshotting for that. Breaking down your options:

Would fs_snapshot_create be faster than traversing the chosen directory and creating clones of each nested file with copyfile and the flag COPYFILE_CLONE?

So, there are three options here, loosely ordered by performance:

  1. Create a snapshot with fs_snapshot_create.

  2. Clone the entire directory using clonefile() (see "man clonefile")

  3. Iteratively clone the files and recreate the directory (which is what copyfile would do).

Moving into the comparison side of things:

Although I have the impression that only fs_snapshot_create could make a truly consistent snapshot.

Semantically, #1 & #2 above are identical. Both of them create an atomic clone of a given hierarchy and, in fact, you can basically think of "fs_snapshot_create" as a special case of directory cloning that just happens to target the entire volume. You're correct that #3 does not create a TRULY consistent "snapshot", however, how important that actually is depends a lot on the overall context.

Comparing #2 & #3, the issue here is a tradeoff between performance and safety. I actually did a detailed pair of forum posts on that here, but the summary is that:

  • Directory cloning is significantly faster (~10x).

  • Iterative cloning is still quite fast in absolute terms.

  • The danger with directory cloning is that you can (potentially) block access to large parts of the file system, causing major performance problems, potentially even panic'ing the kernel.

That last point is the BIG issue here. If you're going to use directory cloning, then you need to either:

  • Already "know" that the source hierarchy is of "reasonable" size. Note that "reasonable" here isn't necessarily "small". Results will vary widely depending on the specifics of the underlying hardware but, for reference, it takes ~0.6s for clonefile() to clone a directory of ~10,000 files on my MacBook Pro.

  • Ensure/"know" that cloning the source directory won't have system-wide consequences and/or disrupt the user.

...and, ideally, both. The big point here is that you probably shouldn't just let the user select an arbitrary source and then clone whatever directory they point at. However, the performance benefit is large enough that it's worth considering when the circumstances are appropriate.
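For reference, option #2 comes down to a single call. The sketch below is a minimal Swift wrapper around clonefile(); the function name cloneItem is made up for illustration, and it assumes that both paths are on the same volume, that the volume supports cloning (for example APFS), and that the destination does not exist yet.

```swift
import Foundation

/// Clones `source` (a file or a whole directory hierarchy) to `destination`
/// with clonefile(2). Hypothetical helper, not part of any Apple API.
/// Assumes both URLs are on the same clone-capable volume and that
/// `destination` does not exist yet.
func cloneItem(at source: URL, to destination: URL) throws {
    // Flags 0: default behavior (symbolic links are followed).
    if clonefile(source.path, destination.path, 0) != 0 {
        // clonefile() reports failures through errno.
        throw POSIXError(POSIXErrorCode(rawValue: errno) ?? .EIO)
    }
}
```

Given the caveats above, a call like `try cloneItem(at: sourceDirectory, to: privateCopy)` is only reasonable for hierarchies you already know to be of manageable size.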

__
Kevin Elliott
DTS Engineer, CoreOS/Hardware

Thanks for your detailed explanations. My app lets users sync directory pairs and can be considered a backup app, since one of the two directories can be the backup (on an external volume). Wouldn't this qualify it for that entitlement?

The reason why I'm considering cloning the source directory is that I would like to add an option to calculate the checksum for each source file so that the app can check at certain points in time if the backup files are corrupt. This would be problematic if between copying the source file to the backup volume and calculating the checksum of the source file, the source file is changed (either by the user, or by other apps, such as Photos, which regularly updates its database). The app would make a bad impression if the copied files would immediately be flagged as corrupt because the checksum was calculated on a different version of the source file.

From what I understand I should avoid cloning an arbitrary, user-selected directory, and the only efficient solution I see is to create a source volume snapshot. The other, much less efficient solution that comes to my mind would be to check that the checksum of the backup file corresponds to the checksum of the source file calculated before the copy operation, and if it doesn't, copy the file again until it does.

Accepted Answer

Thanks for your detailed explanations. My app lets users sync directory pairs and can be considered a backup app, since one of the two directories can be the backup (on an external volume). Wouldn't this qualify it for that entitlement?

I can't say for certain, as the final determination is up to the engineering team. The entitlement has generally only been granted for "whole volume" backups, but it's possible the engineering team would agree with you. You're welcome to submit a request, I just can't guarantee that your request will be approved.

Note that if you submit a request, please include as much detail as possible about your product, including your marketing page, etc. If your product is still in development, "tmutil" can be used to create/manage/interact with snapshots well enough to prototype and experiment, even if your product isn't ready to be released.

From what I understand I should avoid cloning an arbitrary, user-selected directory, and the only efficient solution I see is to create a source volume snapshot.

Yes and no. What I'm actually saying is that:

  1. Directory cloning has significant "logic" benefits (since it captures the hierarchy at a single moment in time).

  2. Directory cloning can have ENORMOUS performance benefits.

  3. It can be dangerous if used blindly/carelessly.

Jumping back to my previous message, I want to highlight this point:

Already "know" that the source hierarchy is of "reasonable" size.

Something like a "data sync" app is EXACTLY the kind of app that ends up "knowing" this kind of information. That's because of things like:

  • You're working with the same directory over and over again and often end up saving your own "copy" of that directory state for your own purposes.

  • You scan the entire hierarchy looking for changed and new files.

In addition, there are more options here than "clone the entire source directory" and "clone individual files". You also have the option of a "mixed" solution where you clone specific sub-directories (for the performance benefit) and then clone/create the individual objects that couldn't be duplicated through directory cloning.

I'll also say that once external copying is involved, the performance of file cloning can be somewhat... misleading. The time required to clone an individual file is effectively "noise" relative to the cost of copying the actual file data. The way I would actually do something like this is the following:

  1. On one thread, start cloning the original source hierarchy to your private source location. This could be done with copyfile(), but there might also be some benefit to writing your own "engine".

  2. As objects "finish" in #1, they're "fed" into the copy engine, which copies data from the private source hierarchy to the final hierarchy.

Properly implemented, I think the performance of this approach is basically identical to directly copying from the original source, even if you're cloning individual files. On my own device, the "file clone" rate of copyfile works out to ~1500+ files/second. That means that a delay as short as 0.1s puts engine #1 at LEAST 100+ files "ahead" of #2. Now, the actual rate will vary depending on the source device; however, the device's performance will affect your copy rate MORE than it affects the clone rate. No matter what the situation is, #1 is going to finish LONG before #2 is anywhere close to "done".
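As a rough sketch of that two-engine layout, the code below uses one serial queue for cloning and another for copying, with each finished clone handed over immediately. Everything here is an assumption made for illustration: the names stageAndCopy and FailureLog, the flat list of uniquely named files, and the use of clonefile() and FileManager as the two "engines". A real implementation would also have to recreate the directory structure.

```swift
import Foundation

/// Collects per-file failure messages. It is only mutated on `copyQueue` and
/// only read after the group has finished, hence the unchecked conformance.
final class FailureLog: @unchecked Sendable {
    var failures: [URL: String] = [:]
}

/// Clones each file into `staging` (engine #1) and feeds every finished clone
/// to a second queue that copies it into `destination` (engine #2).
/// Hypothetical sketch: assumes a flat list of files with unique names and
/// that `staging` is on the same clone-capable volume as the sources.
/// Blocks until all work is done, so don't call it on the main thread.
func stageAndCopy(_ files: [URL], staging: URL, destination: URL) -> [URL: String] {
    let cloneQueue = DispatchQueue(label: "example.clone")  // engine #1
    let copyQueue = DispatchQueue(label: "example.copy")    // engine #2
    let group = DispatchGroup()
    let log = FailureLog()

    cloneQueue.async(group: group) {
        for file in files {
            let staged = staging.appendingPathComponent(file.lastPathComponent)
            let target = destination.appendingPathComponent(file.lastPathComponent)
            // Cloning is cheap and freezes the file's content at this moment.
            let cloneError: Int32 = clonefile(file.path, staged.path, 0) == 0 ? 0 : errno
            // Hand the result to the copy engine without waiting for the copy.
            copyQueue.async(group: group) {
                guard cloneError == 0 else {
                    log.failures[file] = String(cString: strerror(cloneError))
                    return
                }
                do {
                    try FileManager.default.copyItem(at: staged, to: target)
                } catch {
                    log.failures[file] = error.localizedDescription
                }
            }
        }
    }
    group.wait()
    return log.failures
}
```

Because the copy queue only ever reads from the staged clones, changes the user makes to the original files after they are cloned don't affect what ends up at the destination.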

The other, much less efficient solution that comes to my mind would be to check that the checksum of the backup file corresponds to the checksum of the source file calculated before the copy operation, and if it doesn't, copy the file again until it does.

No. The alternative here is:

  1. Clone the file
  2. Copy the clone.
  3. Compare the checksum of the backup against the checksum of the clone.

Keep in mind that this is basically the way I'd recommend almost any app work with almost ANY file. File clones are all but "free" in file system terms, so why NOT work with your own "private" copy instead of dealing with the live file system issues.
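The three steps above could look roughly like this in Swift. This is a sketch under stated assumptions rather than a recommended implementation: the function names are hypothetical, SHA-256 via CryptoKit is just one possible checksum, and `privateClone` is assumed to be a not-yet-existing path on the same clone-capable volume as `source`.

```swift
import Foundation
import CryptoKit

/// SHA-256 of a file, read in 1 MiB chunks so large files are never held
/// in memory all at once.
func sha256(of url: URL) throws -> SHA256.Digest {
    let handle = try FileHandle(forReadingFrom: url)
    defer { try? handle.close() }
    var hasher = SHA256()
    while let chunk = try handle.read(upToCount: 1 << 20), !chunk.isEmpty {
        hasher.update(data: chunk)
    }
    return hasher.finalize()
}

/// Hypothetical helper: clones `source`, copies the clone to `destination`,
/// and returns whether the backup's checksum matches the clone's.
func backUpAndVerify(_ source: URL, privateClone: URL, destination: URL) throws -> Bool {
    // 1. Clone the file. From here on, the live file is no longer touched.
    if clonefile(source.path, privateClone.path, 0) != 0 {
        throw POSIXError(POSIXErrorCode(rawValue: errno) ?? .EIO)
    }
    // 2. Copy the clone to the backup location.
    try FileManager.default.copyItem(at: privateClone, to: destination)
    // 3. Compare the checksum of the backup against the checksum of the clone.
    return try sha256(of: destination) == sha256(of: privateClone)
}
```

Since both checksums are computed on files the app controls, a later edit to the original can't cause a false "corrupt" result.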

Note that you could even collect/save these over time in the same way you could with snapshots. So, for example, you could have a directory like this on the source volume:

BackupSets/
	Backup 6_1_25/
	Backup 6_2_25/
	Backup 6_3_25/
	Backup 6_4_25/

...where you save the clone files you previously backed up.

__
Kevin Elliott
DTS Engineer, CoreOS/Hardware

I actually thought of that third possibility while I started reading your post. It's so obvious that I couldn't possibly see it before. I'm now thinking that cloning a file before copying it to the target volume and then calculating the checksum on the clone is the best solution when the source volume is the root file system, because then I can create the clone in the user's temporary directory. In theory it would be sufficient if the source volume simply supports file cloning, but it feels a little ugly creating a temporary clone on a source volume that doesn't have a temporary system directory.

Hi,

In theory it would be sufficient if the source volume simply supports file cloning, but it feels a little ugly creating a temporary clone on a source volume that doesn't have a temporary system directory.

The name doesn't exactly roll off the tongue, but this issue is what "url(for:in:appropriateFor:create:)" is actually "for". The "appropriateFor" argument lets you specify the volume you need the directory to be on, and the system will then return you a directory on that volume. One of its major roles is providing temporary directories on "arbitrary" volumes, which is why that's specifically mentioned at the end of the Discussion section.
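A minimal sketch of that call, using the .itemReplacementDirectory and .userDomainMask combination that the documentation's Discussion section describes for temporary directories. The wrapper name temporaryDirectory(onVolumeOf:) is made up for illustration.

```swift
import Foundation

/// Asks the system for a new temporary directory on the same volume as `item`.
/// Hypothetical wrapper around FileManager.url(for:in:appropriateFor:create:).
func temporaryDirectory(onVolumeOf item: URL) throws -> URL {
    try FileManager.default.url(for: .itemReplacementDirectory,
                                in: .userDomainMask,
                                appropriateFor: item,  // selects the volume
                                create: true)
}
```

A clone created inside the returned directory stays on the source volume, so cloning can actually work. The app is responsible for removing the directory when it's done.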

Also, as a purely personal comment:

creating a temporary clone on a source volume that doesn't have a temporary system directory.

Developers tend to think of volumes as either:

  1. The boot system volume.

  2. An external/removable/etc. volume which isn't really "part" of the user's system.

On a purely personal level, that's not necessarily true. I've long made a habit of keeping my home directory on a secondary volume of the same physical device, primarily because it simplifies data migration between system versions: destroying a system volume is a lot less scary when your kids' photos aren't on that volume. The point here is that changing your behavior on the assumption that non-system volumes are truly "different" isn't necessarily the right choice.

__
Kevin Elliott
DTS Engineer, CoreOS/Hardware

Thanks. Sure, I just meant that it seems a little "unclean" to have a temporary file on a volume that is not controlled by the OS and could potentially be lying there forever if the volume is disconnected in the middle of the operation.

I'm going to be using URLResourceKey.volumeSupportsFileCloningKey to determine if cloning is available and use clonefile in that case, and otherwise use copyfile to copy the file to the system temporary directory and then copy it from there to the actual destination volume. The latter may be a problem if it's a big file for which the system volume has not enough free space, but in that case the user should just disable checksum calculation in my app.

Thanks. Sure, I just meant that it seems a little "unclean" to have a temporary file on a volume that is not controlled by the OS and could potentially be lying there forever if the volume is disconnected in the middle of the operation.

There's a forum post I did here about the choice between using the directory returned by url(for:in:appropriateFor:create:) vs creating your own user-visible directory, but the bottom line is that there isn't really any single, "right" answer to this question. You're going to leave data behind if the operation is interrupted, and you have to decide whether it's better to:

  1. Make things "seamless" by minimizing user involvement, but risk "orphaning" data in ways that aren't necessarily visible to the user.

  2. Complicate the experience by making what you're doing visible to the user.

The right choice here depends entirely on the details of the product you’re building.

I'm going to be using URLResourceKey.volumeSupportsFileCloningKey to determine if cloning is available and use clonefile in that case, and otherwise use copyfile to copy the file to the system temporary directory and then copy it from there to the actual destination volume.

I'm not sure how specifically you thought this through, but I'd be careful about edge cases and crossing volumes. You don't want to end up unnecessarily copying to the system volume when you could have cloned by staying "inside" the correct file system.
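One way to avoid that trap, sketched below: check the capability on the source item's own volume, and if cloning is supported, ask for a temporary directory on that same volume before calling clonefile(). Only if that fails does the sketch fall back to copying into the system temporary directory. The function name makeWorkingCopy is hypothetical, and network volumes are not handled specially.

```swift
import Foundation

/// Creates a private working copy of `source`, cloning on the source's own
/// volume when possible and copying to the system temporary directory
/// otherwise. Hypothetical helper; returns the URL of the working copy.
func makeWorkingCopy(of source: URL) throws -> URL {
    let fm = FileManager.default
    let values = try source.resourceValues(forKeys: [.volumeSupportsFileCloningKey])

    if values.volumeSupportsFileCloning == true {
        // Stay "inside" the source file system so the clone can succeed.
        let dir = try fm.url(for: .itemReplacementDirectory, in: .userDomainMask,
                             appropriateFor: source, create: true)
        let clone = dir.appendingPathComponent(source.lastPathComponent)
        if clonefile(source.path, clone.path, 0) == 0 {
            return clone
        }
        // Cloning failed anyway: clean up and fall through to a plain copy.
        try? fm.removeItem(at: dir)
    }

    // Fallback: a regular copy into the system temporary directory.
    let dir = fm.temporaryDirectory.appendingPathComponent(UUID().uuidString, isDirectory: true)
    try fm.createDirectory(at: dir, withIntermediateDirectories: true)
    let copy = dir.appendingPathComponent(source.lastPathComponent)
    try fm.copyItem(at: source, to: copy)
    return copy
}
```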

Also, if network file systems are an important use case, then that's something that's worth looking at as its own case. Our smb client does not declare support for file cloning, but it will in fact clone files if you call the right (obscure) API as long as you’re copying within the remote hierarchy and the underlying file system supports cloning. Even if you’re forced to copy, the Carbon API will give you a much faster/more "atomic" copy than copyfile (or any other copy API).

Also, on the smb side, I don't think there's currently any way to preserve file clones across the smb copy, even when both sides support cloning. So copying across smb can increase real storage usage compared to the same copy directly between APFS volumes.

__
Kevin Elliott
DTS Engineer, CoreOS/Hardware

You don't want to end up unnecessarily copying to the system volume when you could have cloned by staying "inside" the correct file system.

I don't understand. Doesn't URLResourceKey.volumeSupportsFileCloningKey allow me to detect if cloning is supported?

on the smb side, I don't think there's currently any way to preserve file clones across the smb copy, even when both sides support cloning

I thought you were saying that with the deprecated Carbon API one can clone files, but then I don't understand why file clones are not preserved. Do you mean when copying a folder that contains file clones, those files are copied and not cloned?

I don't understand. Doesn't URLResourceKey.volumeSupportsFileCloningKey allow me to detect if cloning is supported?

It does, and I think you'll find that it returns "false" when called on any SMB volume.

I thought you were saying that with the deprecated Carbon API one can clone files, but then I don't understand why file clones are not preserved. Do you mean when copying a folder that contains file clones, those files are copied and not cloned?

First of all, please try this with Finder paying attention to what you're doing and how it affects storage usage on the server. I've explained what's going on below, but it will make a lot more sense when you do the testing yourself.

So, there are two different cases here:

  1. Copy from the local device to the remote server.

  2. Copying contents within a volume "on" the remote server.

For case #1, the Carbon API and our other copy APIs have the same behavior and similar performance (if anything, Carbon will be slightly slower). As part of that, cloned files will be "lost" because, as I noted above, smb says it doesn't support file cloning.

HOWEVER, the behavior you'll see in #2 will be WILDLY different, with the Carbon API being ENORMOUSLY faster than any other copy API. What's going on here is that most of our APIs work by reading the data from the server, then writing that data BACK to the server, just like a "normal" file copy would.

However, that is NOT what the Finder or the Carbon API are doing. What they're doing is iterating through the hierarchy and sending a command to the SMB server saying "copy this to here", which the SMB server then executes locally. On an APFS volume that means the server clones the files, while on other volume formats the copy itself occurs as a server-side operation.

Jumping back to here:

I don't understand. Doesn't URLResourceKey.volumeSupportsFileCloningKey allow me to detect if cloning is supported?

What "volumeSupportsFileCloningKey" actually means is "does the clonefile() syscall do something?" In our current smb implementation, the answer to that is "no". However, the Carbon API happens to "bypass" that limitation because it ends up calling through to a different syscall, and THAT syscall ends up (eventually) causing the server to call "clonefile".

__
Kevin Elliott
DTS Engineer, CoreOS/Hardware

I see, thanks. I'd prefer to stay away from deprecated APIs, but if a user complains about the speed when copying between SMB volumes, I'll come back to this topic. Perhaps the only remaining question would be: how would I find out if the Carbon API / the SMB volume effectively supports cloning, or is it not possible?

Perhaps the only remaining question would be: how would I find out if the Carbon API / the SMB volume effectively supports cloning, or is it not possible?

That's a tricky one, as you have very little ability to interrogate the target system and can't really determine what's on the other "side". If "VOL_CAP_INT_COPYFILE"* is "false" then you know that it won't "work" but that doesn't cover cases like the remote volume being an HFS+ volume (not APFS).

*As backstory, yes, there are in fact two totally different functions in the system which were both called "copyfile()". One is the public API, the other is a private syscall hook to the "VNOP_COPYFILE" VFS hook. Note that afp (Apple Filing Protocol, NOT "APFS", the Apple File System) and smb are currently the ONLY file systems that implement VNOP_COPYFILE. The history here is that a VERY long time ago VNOP_COPYFILE was originally added to support AFP (which is where the "do the copy on the server" idea came from) and SMB picked it up "later" to provide performance parity as we transitioned away from AFP. In parallel with that, we implemented a public API for copying files... which we called "copyfile()".

In terms of deciding when to use the Carbon API, honestly, I think this is something you'd need to test and experiment with yourself to figure out the right "balance" of factors. In particular:

  1. Carbon will be FAR faster when copying inside an APFS volume, due to cloning.

  2. I think Carbon will be significantly faster when copying inside an HFS+ volume, since it avoids the network I/O path.

  3. I'm not sure how it will behave when copying across volumes/shares. It's possible you'll still see a performance boost, but it's also possible the volumes are being treated as "unrelated" and all the data is being shunted over the network.

  4. I'm not sure how its "worst case" (meaning, all the data is being transferred over the network) performance actually compares to FileManager/copyfile(). I believe Carbon is "slower", but the difference is much smaller than the difference of the "best case" in #1/#2.

I think the key issue here is really about exactly what your product is doing and how it's being used. In your case, where you're specifically talking about doing things like large scale copying and/or duplicating files for later comparison, the performance benefit possible with #1 & #2 is SO big that it's probably worth looking at. For more "general" applications, I probably wouldn't bother.

__
Kevin Elliott
DTS Engineer, CoreOS/Hardware
