[MlMt] High Sierra, APFS, Time Machine, and MailMate...

Bill Cole mmlist-20120120 at billmail.scconsult.com
Tue Dec 12 17:48:59 EST 2017


On 12 Dec 2017, at 1:10 (-0500), Steven M. Bellovin wrote:

> On 11 Dec 2017, at 23:26, Bill Cole wrote:
>
>> On 10 Dec 2017, at 21:14 (-0500), Steven M. Bellovin wrote:
>>
>>> My suspicion is that the problem has to do with very large 
>>> directories on APFS file systems
>>
>> This would be shocking. One of the rationales for APFS existing is 
>> that the HFS foundation was played out for dealing with large 
>> directories efficiently. I haven't looked into the details (life is 
>> short...) but if APFS is *worse* than HFS{+,X} with large directories 
>> then Apple is in a worse state than I had thought...
>
> Yah. I have no other explanation, though. To give a current example, 
> on a machine -- an old one, to be sure -- a Time Machine backup 
> started almost 10 hours ago. It's dumped 77.5 MB -- out of a total of 
> 152.7 MB -- in that time, and it's been at about 77 MB for the last 
> ~7-8 hours. At some point, though, it will pass the expensive point 
> and run at a reasonable rate. This dump is to a directly connected USB 
> 2.0 drive. And the CPU is about 96% idle, according to 'top'.
>
> Btw: by "big", I mean that I have one mailbox with 114K messages; the 
> directory itself is 3.6 MB. No other mailbox is more than half that 
> size, though I have four that are over 1 MB.

Oh my.

Since the backup disk can't be APFS (Time Machine relies on hard-linked 
directories, which APFS won't do), you're still dealing with that huge 
directory in HFS+ on the write side. If anything in that directory has 
changed, TM has to make 114k file hard links and copy a handful of 
changed files into a new directory, and that is going to be 
spectacularly slow.
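
To make the cost concrete, here is a minimal Python sketch of the general 
hard-link snapshot technique. It is an illustration only, not Time 
Machine's actual implementation: the function name, the flat-directory 
restriction, and the size-plus-mtime change test are all assumptions made 
for the example. The point is that every unchanged file still costs one 
link operation in the new directory, so a 114k-message mailbox with a 
handful of changes still means roughly 114k directory insertions.

```python
import os
import shutil


def snapshot_with_hard_links(source, previous, new):
    """Build `new` as a snapshot of the flat directory `source`.

    Files that look unchanged relative to `previous` (same size and same
    whole-second mtime; an assumed, simplistic test) are hard-linked from
    `previous`. Everything else is copied from `source`.
    Returns a (linked, copied) tuple of counts.
    """
    os.makedirs(new)  # raises if `new` already exists
    linked = copied = 0
    with os.scandir(source) as entries:
        for entry in entries:
            if not entry.is_file(follow_symlinks=False):
                continue  # subdirectories are out of scope for this sketch
            old_path = os.path.join(previous, entry.name)
            new_path = os.path.join(new, entry.name)
            src_stat = entry.stat(follow_symlinks=False)
            unchanged = False
            if os.path.isfile(old_path):
                old_stat = os.stat(old_path)
                unchanged = (
                    old_stat.st_size == src_stat.st_size
                    and int(old_stat.st_mtime) == int(src_stat.st_mtime)
                )
            if unchanged:
                os.link(old_path, new_path)  # one directory insert per file
                linked += 1
            else:
                shutil.copy2(entry.path, new_path)  # copy2 keeps the mtime
                copied += 1
    return linked, copied
```

Calling it once per backup with the previous snapshot as `previous` 
returns how many files were linked versus copied; with a large, mostly 
static mailbox the linked count dominates.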

Also, USB 2.0 historically has been cripplingly slow on Mac OS X, at 
least through El Capitan. I haven't tested it on my Sierra machine and 
don't have anything on High Sierra yet, so I can't say whether that 
might be a part of the problem.

Three suggestions, in order from least to greatest effect on your 
specific issue (although the first 2 are good general TM housekeeping):

1. Use the 'tmutil' tool to thin your old backups more aggressively than 
TM does. (See the man page for details.) This reduces the complexity of 
the filesystem btrees, making it easier for TM to do its work, and it 
also frees space so that you can avoid TM's arbitrary deletion of the 
oldest backups when the disk fills.

2. Rebuild the filesystem btree structures on the Time Machine disk. 
This can be done with fsck_hfs using the "-Race" option or with Alsoft's 
DiskWarrior software. This will tidy up the mess that TM creates by 
building a full image of the source disk with mostly hard links every 
hour and then thinning them out over time, usually resulting in 
suboptimal structures. Note that either fsck_hfs or DiskWarrior may take 
an hour or more to rebuild the btrees, but the rebuild will more than pay 
for itself in faster backups and especially in speeding up the filesystem 
verification TM does occasionally, which is also the source of the 
dreaded "you need to create a new backup" alert.

3. Split that huge mailbox up into smaller slices that mostly never 
change, so that TM never has to do the appalling task of making the 
umpteenth hard link of each one of 114k files into a new huge directory. 
HFS+ starts to get noticeably slow with around 1k files in a directory. 
I try to keep my archives split into subfolders with nothing more than 
2k messages because it feels like the speed degradation is worse than 
O(n) and is painful by 2k.
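
To find candidates for splitting, a small Python sketch like the one 
below can list directories holding more than a chosen number of files. 
The function name and the default limit of 2000 (taken from my 2k rule of 
thumb above) are assumptions for illustration; point it at whatever 
directory your mail store actually lives in. It only reads the tree and 
never moves or modifies anything.

```python
import os


def oversized_directories(root, limit=2000):
    """Return (path, entry_count) pairs for every directory under `root`
    that directly contains more than `limit` non-directory entries,
    sorted with the largest directory first."""
    results = []
    for dirpath, _dirnames, filenames in os.walk(root):
        if len(filenames) > limit:
            results.append((dirpath, len(filenames)))
    results.sort(key=lambda item: item[1], reverse=True)
    return results
```

For example, `oversized_directories("/path/to/mail/store")` returns a 
list you can print or inspect; the path there is a placeholder, not a 
real MailMate location. Do the actual splitting from within the mail 
client rather than by moving files around by hand.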



-- 
Bill Cole
bill at scconsult.com or billcole at apache.org
(AKA @grumpybozo and many *@billmail.scconsult.com addresses)
Currently Seeking Steady Work: https://linkedin.com/in/billcole

