[MlMt] MailMate so slow it locks up

Bill Cole mmlist-20120120 at billmail.scconsult.com
Fri Jun 17 19:51:16 EDT 2016


On 16 Jun 2016, at 6:56, Alex Bligh wrote:

> Benny,
>
> Thanks for your reply.
>
>> On 15 Jun 2016, at 22:36, Benny Kjær Nielsen 
>> <mailinglist at freron.com> wrote:
>>
>> On 15 Jun 2016, at 17:10, Alex Bligh wrote:
>>
>> I'm afraid I cannot provide you with a solution, but at least I can 
>> provide a few comments/explanations. It may not be clear below, but 
>> in general I'm always embarrassed when performance clearly could be 
>> better.
>>
>>> ... Thunderbird accesses both accounts directly in the normal way.
>>
>> But maybe not in a fully offline mode?
>
> Thunderbird and Mail.app access the smaller account in the normal way, 
> i.e. all folders are synced and work online and offline.

That's actually optional in TBird. Each account has its own sync 
settings and they are quite flexible. A side effect of this is that 
TBird search is incomplete until the first sync is complete and that can 
be never if you don't have an account set to download everything.

> On the larger account Thunderbird is only subscribed to some of the 
> mailboxes (in this respect just like Mail.app) and works online and 
> offline. Mail.app can't cope with that which is why I use the 
> filtering app I wrote.
>
>> MailMate is a so-called offline IMAP email client and there are no 
>> options to change that. This means that MailMate fetches and stores a 
>> local copy of every email in the account(s).
>
> Like Mail.app and like, (I believe) Thunderbird in respect of 
> subscribed mailboxes. My main client is Mail.app, and I'd be happy if 
> MailMate worked just like Mail.app in respect of subscribed mailboxes 
> and ignored everything else.
>
>> In addition to that, every header of every email is indexed.
>
> I believe the same is true in Mail.app and Thunderbird;

Unsure about current TBird but that has not been the case in the past (I 
bailed on it right around the time versioning jumped from 3.x to double 
digits, breaking every plugin). When last I looked, it indexed a defined 
subset of headers.

Mail.app doesn't index all headers, unless it is doing so without 
exposing that indexing as search functionality.

Benny is modestly understating the MM index. In addition to indexing ALL 
headers, it has a couple dozen synthetic pseudo-header metadata fields 
created for each message in the index and can decompose sub-fields of 
many headers with structured formats. There are 7134

> Mail.app (at least) also indexes the bodies.

As does MailMate. It maintains distinct indices for quoted and unquoted 
text, each with a case-squashed index (I assume: the files are tagged 
'lc' and are slightly smaller than the untagged files) which is how it 
does body searches blindingly fast. It is also why MM's indexing 
database is pretty consistently 1/3 as large as the store of messages.

>> Although optimizations mean that MailMate can handle fairly large 
>> email accounts (some users have more than a million),
>
> Elsewhere on a list populated by Mac users with large mail spools, 
> there seems to be a binary split between people who've got Mailmate to 
> work with an enormous number of messages, and those who haven't. This 
> split is not based on spool size. Most of them have (hence I got a 
> recommendation to use it). It may be that I'm doing something wrong.
>
>> MailMate was not really designed to handle “huge” accounts. There 
>> is likely more that I could do to handle large accounts more 
>> efficiently, but few of them are likely to be quick fixes. In short, 
>> MailMate is extremely flexible, but some parts do not scale well.
>
> I think one thing that is problematic and is possibly easy to fix is 
> simply that when downloading it does not 'yield' to the UI often 
> enough. I would care less about the high CPU if the GUI would redraw. 
> Oh, and also if I could see what it was actually doing and how far it 
> was through. I tried the activity window but it appears to show IMAP 
> instructions without showing which mailboxes it is syncing.

Well, it does, but you have to find the 'SELECT' command at the start of 
the sync procedure for a mailbox.
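
For what it's worth, if you copy the activity text out into a file, picking out the mailbox names is easy to script. This is only a rough sketch: I'm assuming each client command shows up on one line in the usual IMAP form (a tag, then SELECT or EXAMINE, then the mailbox name, possibly quoted), and the function name is my own invention, not anything MM provides.

```python
import re

# Assumed line shape: "<tag> SELECT <mailbox>", optionally with other text
# (timestamp, direction marker) in front. EXAMINE is the read-only variant.
COMMAND_RE = re.compile(
    r'([A-Za-z0-9.]+)\s+(?:SELECT|EXAMINE)\s+("(?:[^"\\]|\\.)*"|\S+)',
    re.IGNORECASE,
)

# Status words that can precede "SELECT completed" in a server reply.
STATUS_WORDS = {"OK", "NO", "BAD"}


def mailboxes_selected(log_lines):
    """Return the mailbox names from SELECT/EXAMINE commands, in order."""
    names = []
    for line in log_lines:
        match = COMMAND_RE.search(line)
        if not match:
            continue
        tag, name = match.group(1), match.group(2)
        if tag.upper() in STATUS_WORDS:
            continue  # a reply such as "A4 OK SELECT completed", not a command
        if name.startswith('"'):
            # Strip the quotes and undo backslash escapes inside them.
            name = re.sub(r"\\(.)", r"\1", name[1:-1])
        names.append(name)
    return names
```

Everything logged between one SELECT and the next belongs to the mailbox named in the first, so the last name in the list is the mailbox being synced when the log was captured.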

I'm mostly with you on the idea that the UI should work while doing 
massive tasks like initial sync and re-indexing of the local message 
store, but that's a 2-edged sword because of how MM is designed to be 
used: primarily with synthetic mailboxes, both cross-account merged 
Inbox etc. and "Smart Mailboxes" that are live views using the index DB. 
This creates a deep problem in the structure of MM: if one thread is 
roaring through a flood of index changes while retrieving multiple 
streams of messages for the first time, UI updates would have to be 
either just as fast OR decoupled from (and so routinely out of sync 
with) the index. Benny's choice to just beachball in some circumstances 
seems like a reasonable surrender to that problem being intrinsic: 
routine sync happens in the background, while large piles of changes 
stall the UI. Mail and TBird take the approach of almost entirely 
decoupling UI updates from index updates because they index less and 
don't use their indices to fill in the UI. 
Only Benny can say if there's some easy way to loosen up the UI 
dependence on the index DB or make updates lazier, but I'd guess that 
since sometimes the main viewer window appears mid-rebuild, it may be 
something he can tune to get better responsiveness at the cost of 
perfect data coherence.

>>> On installing MailMate, I set up the smaller account first (300,000 
>>> messages). It correctly imported the settings, but after that there 
>>> was a world of pain. It sync'd all the mail, but only by running 2 
>>> CPU cores at 100% continuously for over 24 hours. Neither Mail.app 
>>> nor Thunderbird do this; they import in the background, and (for 
>>> that mail account) take only a few hours. Saying that, after 24 
>>> hours (perhaps a bit more), it had pulled all the mail in and 
>>> appeared to work.
>>
>> That's more time than I would have expected if you are on a fast 
>> connection and it isn't Gmail (Google throttles traffic), but since 
>> it did finish then some kind of looping behavior is probably 
>> unlikely.
>
> It can pull about 50-70Mb/s from the mail server.

Is that through a single IMAP session retrieving messages or just 
pumping a flood of bits? IMAP servers use all sorts of different storage 
formats and I don't recall the details on Cyrus, but it is certain that 
you cannot pull 10,000 10KB messages in a tree of 100 mailboxes down 
over one IMAP connection as fast as you can transfer a single 100MB file 
via HTTP or FTP or SCP.
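
To put rough numbers on that, here is a back-of-the-envelope model: total time is raw transfer time plus a fixed cost per request. The figures are assumptions for illustration only. I'm reading "50-70Mb/s" as megabits and taking the low end, and the 20 ms per-message cost is made up, not measured on Cyrus or anything else.

```python
def transfer_seconds(total_bytes, bytes_per_second,
                     requests=1, seconds_per_request=0.0):
    """Estimate elapsed time as raw transfer time plus a fixed cost per request."""
    return total_bytes / bytes_per_second + requests * seconds_per_request


# Assumption: 50 megabits per second, converted to bytes per second.
LINK_BYTES_PER_SECOND = 50_000_000 / 8
# Assumption: 20 ms of round trip and server lookup per message (invented).
PER_MESSAGE_SECONDS = 0.020

# One 100MB file in a single request, with no per-request cost.
one_file = transfer_seconds(100_000_000, LINK_BYTES_PER_SECOND)

# The same 100MB as 10,000 messages of 10KB each, fetched one at a time.
many_messages = transfer_seconds(10_000 * 10_000, LINK_BYTES_PER_SECOND,
                                 requests=10_000,
                                 seconds_per_request=PER_MESSAGE_SECONDS)

print(f"single 100MB file:      {one_file:.0f} s")
print(f"10,000 x 10KB messages: {many_messages:.0f} s")
```

With those assumed figures the single file takes about 16 seconds and the message-by-message fetch about 216, so the per-message cost dominates and the link speed hardly matters.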

My 1st guess would be that you only have one IMAP session open and could 
benefit from opening more concurrent IMAP sessions to the server, but 
that you cannot for some reason. By default, MM will use up to 3 
concurrent connections per account, and you can increase that for a 
specific source account by manually adding a "maxNumberOfConnections" 
integer key to the account's dictionary in ~/Library/Application 
Support/MailMate/Sources.plist. MailMate will silently reduce its 
operational limit when it gets an indication from the server that it has 
hit a session limit (Benny magic... there's no consistency in how 
servers handle session limits.) If you've got multiple MUAs online with 
the same account simultaneously, it's not hard to hit a server-side 
limit, which (when they exist) tend to be ~10 but may be as high as 30. 
The fact that you're only eating 2 cores during an initial sync and not 
3 or 4 may indicate that Cyrus is telling MM to get lost on the 2nd and 
3rd connection attempts and MM is making do with one. On the other hand, 
I could be wrong: Benny may have MM behaving in a special way for a 1st 
synch.
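
To illustrate the kind of back-off I mean, here is a toy model. This is emphatically not Benny's code, and the class and method names are invented: it only shows how a client configured for 3 connections can end up working with 1 after a refusal.

```python
class ConnectionBudget:
    """Toy model of a per-account connection limit that shrinks on refusal."""

    def __init__(self, configured_max=3):
        self.configured_max = configured_max   # what the user asked for
        self.operational_max = configured_max  # what the server tolerates so far
        self.open_count = 0

    def can_open(self):
        """True if another connection attempt is allowed right now."""
        return self.open_count < self.operational_max

    def opened(self):
        """Record a connection the server accepted."""
        self.open_count += 1

    def closed(self):
        """Record a connection that went away."""
        self.open_count = max(0, self.open_count - 1)

    def refused(self):
        """Record a refusal: cap the limit at what is open now, never below 1."""
        self.operational_max = max(1, self.open_count)


# Example: the server accepts one session and refuses the second.
budget = ConnectionBudget(configured_max=3)
budget.opened()    # first connection succeeds
budget.refused()   # second attempt is rejected by the server
print(budget.operational_max, budget.can_open())
```

After the refusal the budget reports a working limit of 1 and declines to try again, which is the "making do with one" case described above.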

> There is no throttling in place. It's writing to a fast SSD. I'm 
> guessing some sort of O(n^2) problem is in play.

That does not seem consistent with the fact that I can rebuild a MM 
index database from ~500k messages on local spinning rust on a vintage 
(late 2009 i7) iMac in under 2 hours while your MacPro took 24 hours to 
import 300k via IMAP. The only advantage I can see on my side to explain 
that is how fast I can pull message data from disk compared to how fast 
you can pull message data over IMAP. The rebuild task provides a 
progress window showing which mailbox it is working on, which strongly 
suggests that it is a fully serialized process: reading one message at a 
time and indexing it, very possibly using a single thread for that work. 
This makes it easy for me to test the lower bound for how fast a rebuild 
could be if it was entirely i/o bound by serially reading every message 
in my whole local MM spool. It turns out I can read 525 messages/second, 
~4MB/s. That would be an order of magnitude higher if HFS+ didn't suck, 
but it's still probably much faster than you can retrieve and store 
messages in a single IMAP session. Maybe even faster than you could in 3 
sessions.
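
If you want to run the same sort of test, something like this will do. It is a minimal sketch that assumes nothing about the store except that it is a directory tree of files: it reads every file under a directory, one at a time, and reports the rate. The function name is mine, and you supply the path.

```python
import os
import time


def serial_read_rate(root):
    """Read every file under root serially; return (files, bytes, seconds)."""
    files = 0
    total_bytes = 0
    start = time.perf_counter()
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            with open(os.path.join(dirpath, name), "rb") as handle:
                total_bytes += len(handle.read())
            files += 1
    elapsed = time.perf_counter() - start
    return files, total_bytes, elapsed


def describe(files, total_bytes, elapsed):
    """Format the result as messages/second and MB/second."""
    if elapsed <= 0:
        return f"{files} files, {total_bytes} bytes (too fast to time)"
    return (f"{files / elapsed:.0f} messages/s, "
            f"{total_bytes / elapsed / 1e6:.1f} MB/s")
```

Run it on a cold cache or the numbers will flatter the disk. For scale: at 525 messages/second, ~500k messages take about 950 seconds, or roughly 16 minutes, of pure reading, and ~4MB/s over 525 messages/second works out to an average of about 7.6KB per message.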

Can you try again and check how many sessions MM is actually opening and 
using?
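
One crude way to check from outside MM is to save the output of `netstat -an` during a sync and count established TCP connections to the IMAP port. The sketch below assumes IMAP over TLS on port 993 (use 143 otherwise) and the usual netstat column order of protocol, two queue counters, local address, foreign address, state. It counts every client on the machine, not just MailMate, so quit other mail programs first.

```python
def count_established(netstat_text, port=993):
    """Count ESTABLISHED TCP connections whose remote end is the given port."""
    # macOS prints addresses as host.port, Linux as host:port; accept both.
    suffixes = (f".{port}", f":{port}")
    count = 0
    for line in netstat_text.splitlines():
        fields = line.split()
        if len(fields) < 6 or not fields[0].startswith("tcp"):
            continue  # header lines, UDP, unix sockets, and so on
        foreign, state = fields[4], fields[5]
        if state == "ESTABLISHED" and foreign.endswith(suffixes):
            count += 1
    return count
```

If that reports 1 while MM is busy with a large sync, the server-side session limit theory looks a lot more likely.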


