[MlMt] High Sierra, APFS, Time Machine, and MailMate...

Steven M. Bellovin smb at cs.columbia.edu
Tue Dec 12 20:10:10 EST 2017

On 12 Dec 2017, at 17:48, Bill Cole wrote:

> On 12 Dec 2017, at 1:10 (-0500), Steven M. Bellovin wrote:
>> On 11 Dec 2017, at 23:26, Bill Cole wrote:
>>> On 10 Dec 2017, at 21:14 (-0500), Steven M. Bellovin wrote:
>>>> My suspicion is that the problem has to do with very large 
>>>> directories on APFS file systems
>>> This would be shocking. One of the rationales for APFS existing is 
>>> that the HFS foundation was played out for dealing with large 
>>> directories efficiently. I haven't looked into the details (life is 
>>> short...) but if APFS is *worse* than HFS{+,X} with large 
>>> directories then Apple is in a worse state than I had thought...
>> Yah. I have no other explanation, though. To give a current example, 
>> on a machine -- an old one, to be sure -- a Time Machine backup 
>> started almost 10 hours ago. It's dumped 77.5 MB -- out of a total of 
>> 152.7 MB -- in that time, and it's been at about 77 MB for the last 
>> ~7-8 hours. At some point, though, it will pass the expensive point 
>> and run at a reasonable rate. This dump is to a directly connected 
>> USB 2.0 drive. And the CPU is about 96% idle, according to 'top'.
>> Btw: by "big", I mean that I have one mailbox with 114K messages; the 
>> directory itself is 3.6 MB. No other mailbox is more than half that 
>> size, though I have four that are over 1 MB.
> Oh my.

Yah. I knew some were large, but I didn't think *that* large. Worse yet, 
one of the top few is my inbox, which I haven't been cleaning out of 
late. I've been following the MailMate mantra: just create smart 
mailboxes.

> Since the backup disk can't be APFS (Time Machine relies on 
> hard-linked directories, which APFS won't do) you're still dealing 
> with that huge directory in HFS+ on the write side. If that directory 
> has changes it is going to be spectacularly slow for TM to do 114k 
> file hard links and copy a handful of changed files into a new 
> directory.
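
To make the cost concrete, here's a small Python sketch of the 
hard-link scheme Bill describes; the directory names are invented, the 
toy mailbox holds 1,000 messages rather than 114k, and a real Time 
Machine snapshot repeats this linking for every file, every hour:

```python
# Sketch: one new hard link per message file into a fresh snapshot
# directory, even when nothing in the mailbox changed.
import os
import tempfile

with tempfile.TemporaryDirectory() as root:
    source = os.path.join(root, "mailbox")     # hypothetical mailbox dir
    snapshot = os.path.join(root, "snapshot")  # hypothetical TM snapshot
    os.makedirs(source)
    os.makedirs(snapshot)

    # A toy "mailbox" of 1,000 messages (the real one holds 114k).
    n = 1000
    for i in range(n):
        with open(os.path.join(source, f"{i}.eml"), "w") as f:
            f.write("message body\n")

    # Even with zero changed messages, the backup must create n links,
    # each a directory insertion on the HFS+ write side.
    for name in os.listdir(source):
        os.link(os.path.join(source, name), os.path.join(snapshot, name))

    # Each file now has two directory entries pointing at one inode.
    st = os.stat(os.path.join(source, "0.eml"))
    print(st.st_nlink)              # 2
    print(len(os.listdir(snapshot)))  # 1000
```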

Right, which explains the older slowness, but not the sudden problem.

> Also, USB 2.0 historically has been cripplingly slow on MacOS X, at 
> least through El Capitan. I haven't tested it on my Sierra machine and 
> don't have anything on High Sierra yet, so I can't say whether that 
> might be a part of the problem.

My USB 2.0-only machine is, as of about 1:30 AM today, officially 
retired from hot spare status; I just got a new laptop and have moved my 
previous one to hot spare status. But the problem was on the 3.0 
machine, too.

> 3 suggestions, in order of least to greatest effect on your specific 
> issue (although the first 2 are good general TM housekeeping):
> 1. Use the 'tmutil' tool to thin your old backups more aggressively 
> than TM does. (See the man page for details) This reduces the 
> complexity of the filesystem btrees, making it easier for TM to do its 
> work and also frees space so that you can avoid TM's arbitrary 
> deletion of the oldest backups when the disk fills.

Possibly, though I doubt it will help with this issue. Time Machine did 
a massive delete on one of my disks (and then a massive new backup...), 
but it was just as slow afterwards.
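
For reference, the thinning Bill suggests looks roughly like the 
following; the volume, machine name, and snapshot date below are 
placeholders, deletion needs root, and the exact behavior is in the 
tmutil man page:

```shell
# List the snapshots Time Machine is currently keeping.
tmutil listbackups

# Delete a specific old snapshot (path is a placeholder example).
sudo tmutil delete "/Volumes/TM Backup/Backups.backupdb/mymac/2016-01-15-103000"
```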
> 2. Rebuild the filesystem btree structures on the Time Machine disk. 
> This can be done with fsck_hfs using the "-Race" option or with 
> Alsoft's DiskWarrior software. This will tidy up the mess that TM 
> creates by building a full image of the source disk with mostly hard 
> links every hour and then thinning them out over time, usually 
> resulting in suboptimal structures. Note that either fsck_hfs or 
> DiskWarrior may take an hour or more to rebuild the btrees but it will 
> more than pay for itself in faster backups and especially in speeding 
> up the filesystem verification TM does occasionally, which is also the 
> source of the dreaded "you need to create a new backup" alert.

Ah, I didn't know about that one. I'll certainly try it.
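
A sketch of the fsck_hfs route, for anyone else trying it: the device 
node below is a placeholder (check yours with diskutil), the volume must 
be unmounted first, and per the man page the letters after -R select 
which btrees to rebuild:

```shell
# Find the backup volume's device node (e.g. /dev/disk2s2 -- a placeholder).
diskutil list

# Unmount the volume so fsck_hfs can get exclusive access.
diskutil unmountDisk /dev/disk2

# Rebuild the attribute (a), catalog (c), and extents overflow (e)
# btrees. Expect this to run for an hour or more on a big TM volume.
sudo fsck_hfs -Race /dev/disk2s2
```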
> 3. Split that huge mailbox up into smaller slices that mostly never 
> change, so that TM never has to do the appalling task of making the 
> umpteenth hard link of each one of 114k files into a new huge 
> directory. HFS+ starts to get noticeably slow with around 1k files in 
> a directory. I try to keep my archives split into subfolders with 
> nothing more than 2k messages because it feels like the speed 
> degradation is worse than O(n) and is painful by 2k.

That's my current plan (though not to that small), but since it's just 
about intersession (I'm a professor) I have enough time to play and try 
to understand more of what's happening.
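
The partitioning arithmetic for suggestion 3 is trivial; this sketch 
just illustrates it (the split itself should of course be done inside 
the mail client, not by moving files around under it, and the file 
names are invented):

```python
# Sketch: group a large mailbox's messages into slices of at most
# 2,000, per Bill's suggestion, so each slice's directory stays small
# and mostly unchanging.
def slice_names(names, per_slice=2000):
    """Group message file names into sorted slices of at most per_slice."""
    names = sorted(names)
    return [names[i:i + per_slice] for i in range(0, len(names), per_slice)]

# A 114k-message mailbox would become 57 slices of 2,000 each.
slices = slice_names([f"{i}.eml" for i in range(114_000)])
print(len(slices))     # 57
print(len(slices[0]))  # 2000
```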

         --Steve Bellovin, https://www.cs.columbia.edu/~smb
