[MlMt] High Sierra, APFS, Time Machine, and MailMate...
Steven M. Bellovin
smb at cs.columbia.edu
Tue Dec 12 20:10:10 EST 2017
On 12 Dec 2017, at 17:48, Bill Cole wrote:
> On 12 Dec 2017, at 1:10 (-0500), Steven M. Bellovin wrote:
>
>> On 11 Dec 2017, at 23:26, Bill Cole wrote:
>>
>>> On 10 Dec 2017, at 21:14 (-0500), Steven M. Bellovin wrote:
>>>
>>>> My suspicion is that the problem has to do with very large
>>>> directories on APFS file systems
>>>
>>> This would be shocking. One of the rationales for APFS existing is
>>> that the HFS foundation was played out for dealing with large
>>> directories efficiently. I haven't looked into the details (life is
>>> short...) but if APFS is *worse* than HFS{+,X} with large
>>> directories then Apple is in a worse state than I had thought...
>>
>> Yah. I have no other explanation, though. To give a current example,
>> on a machine -- an old one, to be sure -- a Time Machine backup
>> started almost 10 hours ago. It's dumped 77.5 MB -- out of a total of
>> 152.7 MB -- in that time, and it's been at about 77 MB for the last
>> ~7-8 hours. At some point, though, it will pass the expensive point
>> and run at a reasonable rate. This dump is to a directly connected
>> USB 2.0 drive. And the CPU is about 96% idle, according to 'top'.
>>
>> Btw: by "big", I mean that I have one mailbox with 114K messages; the
>> directory itself is 3.6 MB. No other mailbox is more than half that
>> size, though I have four that are over 1 MB.
>
> Oh my.
Yah. I knew some were large, but I didn't think *that* large. Worse yet,
one of the top few is my inbox, which I haven't been cleaning out of
late. I've been following the MailMate mantra: just create smart
folders...
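
For anyone wanting to find their own oversized mailboxes, here is a minimal sketch that counts the entries in each directory under a given root and lists the largest. The function names are hypothetical, and the root you pass in (wherever your mail client keeps its message files) is an assumption you would need to supply yourself.

```python
import os

def entry_counts(root):
    """Return a list of (entry_count, path) for every directory under root,
    sorted with the most populated directory first."""
    counts = []
    for dirpath, dirnames, filenames in os.walk(root):
        # Count both files and subdirectories, since both occupy
        # slots in the directory itself.
        counts.append((len(dirnames) + len(filenames), dirpath))
    counts.sort(reverse=True)
    return counts

def largest_directories(root, limit=5):
    """Return the `limit` directories with the most entries."""
    return entry_counts(root)[:limit]
```

Calling largest_directories(path_to_mail_store, 10) would show the ten most crowded directories, which are the likely candidates for splitting.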
>
> Since the backup disk can't be APFS (Time Machine relies on
> hard-linked directories, which APFS won't do) you're still dealing
> with that huge directory in HFS+ on the write side. If that directory
> has changes it is going to be spectacularly slow for TM to do 114k
> file hard links and copy a handful of changed files into a new
> directory.
Right, which explains older slowness, but not the sudden problem.
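
To make the cost Bill describes concrete, here is a rough sketch of a hard-link snapshot of one flat directory. It is not Time Machine's actual implementation (which does not decide what changed by comparing file contents); the function name and the content comparison are assumptions chosen only to keep the example self-contained. The point is that even when a handful of files changed, the loop still performs one link operation per unchanged file.

```python
import filecmp
import os
import shutil

def snapshot_directory(source, previous, new):
    """Build `new` as a snapshot of `source`, reusing files from `previous`.

    Unchanged files become hard links to the copy in `previous`;
    new or changed files are copied from `source`.
    Returns (links_made, files_copied).
    """
    os.makedirs(new)
    links = copies = 0
    for name in sorted(os.listdir(source)):
        src = os.path.join(source, name)
        if not os.path.isfile(src):
            continue  # this sketch handles a flat directory only
        old = os.path.join(previous, name)
        dst = os.path.join(new, name)
        if os.path.isfile(old) and filecmp.cmp(src, old, shallow=False):
            os.link(old, dst)        # unchanged: one more hard link
            links += 1
        else:
            shutil.copy2(src, dst)   # new or changed: real copy
            copies += 1
    return links, copies
```

For a mailbox of 114K messages with a few new arrivals, a snapshot like this would make roughly 114K links and a few copies on every backup in which the directory changed.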
>
> Also, USB 2.0 historically has been cripplingly slow on MacOS X, at
> least through El Capitan. I haven't tested it on my Sierra machine and
> don't have anything on High Sierra yet, so I can't say whether that
> might be a part of the problem.
My USB 2.0-only machine is, as of about 1:30 AM today, officially
retired from hot spare status; I just got a new laptop and have moved my
previous one to hot spare status. But the problem was on the 3.0
machine, too.
>
> 3 suggestions, in order of least to greatest effect on your specific
> issue (although the first 2 are good general TM housekeeping):
>
> 1. Use the 'tmutil' tool to thin your old backups more aggressively
> than TM does. (See the man page for details) This reduces the
> complexity of the filesystem btrees, making it easier for TM to do its
> work and also frees space so that you can avoid TM's arbitrary
> deletion of the oldest backups when the disk fills.
Possibly, though I doubt it will help with this issue. Time Machine did
a massive delete on one of my disks (and then a massive new backup...),
but it was just as slow afterwards.
>
> 2. Rebuild the filesystem btree structures on the Time Machine disk.
> This can be done with fsck_hfs using the "-Race" option or with
> Alsoft's DiskWarrior software. This will tidy up the mess that TM
> creates by building a full image of the source disk with mostly hard
> links every hour and then thinning them out over time, usually
> resulting in suboptimal structures. Note that either fsck_hfs or
> DiskWarrior may take an hour or more to rebuild the btrees but it will
> more than pay for itself in faster backups and especially in speeding
> up the filesystem verification TM does occasionally, which is also the
> source of the dreaded "you need to create a new backup" alert.
Ah, I didn't know about that one. I'll certainly try it.
>
> 3. Split that huge mailbox up into smaller slices that mostly never
> change, so that TM never has to do the appalling task of making the
> umpteenth hard link of each one of 114k files into a new huge
> directory. HFS+ starts to get noticeably slow with around 1k files in
> a directory. I try to keep my archives split into subfolders with
> nothing more than 2k messages because it feels like the speed
> degradation is worse than O(n) and is painful by 2k.
>
That's my current plan (though not into slices that small), but since it's just
about intersession (I'm a professor) I have enough time to play and try
to understand more of what's happening.
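
As a sketch of the splitting idea in suggestion 3, the following only plans the slices; it moves nothing. The function name, the file-name pattern, and the default of 2000 per folder (taken from Bill's rule of thumb) are illustrative assumptions, and the actual moves would presumably be done from within the mail client rather than on the files directly.

```python
def plan_slices(names, max_per_folder=2000):
    """Split message names into consecutive slices of at most
    `max_per_folder` items, in sorted order."""
    if max_per_folder < 1:
        raise ValueError("max_per_folder must be at least 1")
    ordered = sorted(names)
    return [ordered[i:i + max_per_folder]
            for i in range(0, len(ordered), max_per_folder)]

# Hypothetical names standing in for a 114K-message mailbox.
names = [f"{i:06d}.eml" for i in range(114000)]
slices = plan_slices(names)
print(len(slices), "folders, largest holds", max(len(s) for s in slices))
```

A larger max_per_folder gives fewer folders at the cost of more link operations whenever one of them changes.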
--Steve Bellovin, https://www.cs.columbia.edu/~smb