[MlMt] lock ups

Benny Kjær Nielsen mailinglist at freron.com
Thu Nov 29 09:58:35 UTC 2012


Hi Bill,

First of all, thanks for all the gory details. I believe this is the 
first time I've seen data for ~400K messages. I would have thought it 
was not currently possible with MailMate -- which would also be a valid 
conclusion based on the numbers, but I prefer the positive viewpoint 
:-).

I'll comment on the MailMate related parts below. Mostly for my own 
benefit when looking into memory usage later on.

On 29 Nov 2012, at 0:05, Bill Cole wrote:

> Data point from 'top' output on a machine with >390K messages:
>
> PID    COMMAND   %CPU TIME     #TH  #WQ #PORTS #MREGS RPRVT  RSHRD  RSIZE  VPRVT  VSIZE
> 13146- MailMate  0.0  03:03:09 30   1   353+   3924+  1820M+ 78M+   1654M+ 2662M+ 3544M+
>
> That was after 55 hours of runtime on a Lion machine with 8GB RAM and 
> no overall memory pressure (>2GB free). MM has 8 source accounts (7 in 
> use, 1 testing account mostly offline). Database.noindex eats 1.6GB 
> and Messages 3.0GB

You seem to have a relatively low average message size, but as 
previously mentioned the number of messages is the main contributor to 
the memory usage. For some parts of MailMate it is the number of body 
parts that matters, and that is likely to be about twice as large as 
the number of messages.

> There are a few things here which seem a bit odd:
>
> PORTS: Not insanely high, but 354 puts MM in the top 10 on this 
> machine and that number seems to grow (leak?) slowly over time.

I'll make a mental note of that. MailMate did leak ports in the past 
(a long time ago), but that quickly led to problems (when running out 
of ports). I'm guessing it's growing and not leaking, but I may be wrong.

> MREGS: A long-running MM process always ends up with a huge number of 
> memory regions, even beating out sloppy apps like Safari and iTunes. 
> Also grows slowly over time and never shrinks much, so there may be a 
> leak. 3 hours after that sample, the count is now 3959.

MailMate loads a lot of things lazily, i.e., only when needed, but 
MailMate is not very good at throwing things out again. You can view 
that as a leak, but it's not the kind of leak where MailMate has 
forgotten to free data. That kind of leak may also exist, but I believe 
you are mainly seeing the first kind of “leak”. It's on the to-do list 
for memory optimizations.
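
The difference between "loaded lazily but never thrown out" and a true 
leak can be illustrated with a minimal Python sketch. This is not 
MailMate's code: `LazyIndexStore`, the `load_index` callback, and the 
header names are hypothetical and exist only for illustration.

```python
class LazyIndexStore:
    """Loads an index the first time it is requested and then keeps it forever."""

    def __init__(self, load_index):
        self._load_index = load_index   # hypothetical loader, e.g. reads an index file
        self._loaded = {}               # header name -> index data; never shrinks

    def get(self, header):
        # Load on first use only (lazy loading).
        if header not in self._loaded:
            self._loaded[header] = self._load_index(header)
        return self._loaded[header]

    def resident_count(self):
        return len(self._loaded)


# Stand-in loader: returns a placeholder instead of reading a real index file.
store = LazyIndexStore(lambda header: f"<index for {header}>")
for header in ["subject", "from", "x-spam-score", "subject", "list-id"]:
    store.get(header)

# Every header seen so far stays resident, so the footprint only grows,
# yet nothing has been "forgotten": each entry is still reachable.
print(store.resident_count())  # 4
```

A classic leak would be memory that is no longer reachable at all; here 
everything is still reachable, it is simply never evicted.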

> RPRVT/RSHRD/RSIZE: I have no idea how RPRVT>RSIZE can be, but it 
> routinely is for MM. Obviously this is partly a top bug, but whatever 
> mis-counting causes top to violate the concept of RPRVT+RSHRD=RSIZE is 
> uniquely exercised by MM.

Without looking up the definitions (which I have to do every time I look 
at the memory usage categories) I'm guessing it could somehow be related 
to the extensive use of memory mapping in MailMate...
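
For reference, the arithmetic behind the observation, using only the 
figures from the quoted sample. This is a quick Python check of the 
numbers; it says nothing about why top reports them this way.

```python
# Values copied from the quoted 'top' sample (in MB).
rprvt, rshrd, rsize = 1820, 78, 1654

expected_rsize = rprvt + rshrd      # what RPRVT + RSHRD = RSIZE would imply
discrepancy = expected_rsize - rsize

print(expected_rsize)  # 1898
print(discrepancy)     # 244
print(rprvt > rsize)   # True
```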

> After a few hours of running, MM's "private" memory use grows to make 
> MM persistently the 2nd largest process on the machine (behind 
> Spotlight's 'mds', which has a pathologically large data set on this 
> machine and hence a pathologically large virtual size.)

During the first few hours, MailMate spends memory on loading database 
index files. Whenever a message arrives that uses a header not yet seen 
in that run, MailMate loads the corresponding index file into memory. 
This is the period of fast growth.

> There seems to be some slow leak because both RPRVT and VPRVT keep 
> growing. 3 hours after the sample shown, RPRVT has grown by 40M and 
> VPRVT by 24MB.

Once most of the commonly needed index files have been loaded, 
MailMate uses additional memory every time a message is displayed. This 
is the cache where MailMate fails to free stuff later on (if I remember 
correctly). Fixing this would probably fix part of (maybe most of) the 
slow growth over time.
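
One common way to keep such a cache from growing without bound is a 
least-recently-used (LRU) policy. The following is a minimal Python 
sketch of that general technique, not MailMate's implementation; 
`DisplayCache`, its `capacity`, and the `render` callback are 
hypothetical names chosen for illustration.

```python
from collections import OrderedDict


class DisplayCache:
    """Keeps at most `capacity` rendered messages, evicting the least recently used."""

    def __init__(self, capacity):
        self.capacity = capacity
        self._items = OrderedDict()     # message id -> rendered data, oldest first

    def get(self, message_id, render):
        if message_id in self._items:
            self._items.move_to_end(message_id)   # mark as most recently used
            return self._items[message_id]
        data = render(message_id)                 # hypothetical render callback
        self._items[message_id] = data
        if len(self._items) > self.capacity:
            self._items.popitem(last=False)       # drop the least recently used entry
        return data

    def __len__(self):
        return len(self._items)

    def __contains__(self, message_id):
        return message_id in self._items


cache = DisplayCache(capacity=2)
for message_id in [1, 2, 1, 3]:
    cache.get(message_id, lambda mid: f"<rendered {mid}>")

print(len(cache))                          # 2
print(1 in cache, 2 in cache, 3 in cache)  # True False True
```

With a fixed capacity the memory used by displayed messages stays 
bounded no matter how long the process runs; the cost is re-rendering a 
message that has been evicted.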

With the reservation that I may be overlooking something, I think 
MailMate would primarily benefit from the following 3 memory 
optimizations:

1. Even better handling of database index files related to headers 
(space efficient data structures and even more lazy loading).
2. Smaller memory footprint for the large number of message sets used 
for smart mailboxes and many other aspects of MailMate.
3. Less caching of messages displayed.

1 and 2 are the hard ones. Number 3 is less important, but the current 
behavior is almost a bug and probably not that hard to fix.

That said, this is a game I cannot win. If I make MailMate efficient 
with 200K messages then someone is going to ask for 400K. And I know 
some users have millions of messages and MailMate cannot do that without 
cutting features. I'm not even sure an NSOutlineView (the messages 
outline) can handle a million entries. (But that does not mean I think 
MailMate could/should not be improved as listed above.)

> Also, after a long runtime MM can take many minutes to quit with top 
> showing it very slowly shedding MREGS and VSIZE. I can count on MM 
> causing a restart or logout to timeout if I don't manually quit it 
> first.

This is a side effect of MailMate cleaning up nicely when exiting. In 
debug mode this is used to look for memory leaks, but in production mode 
I should probably change this to exit quickly after closing server 
connections and saving any unsaved changes to disk. I'll look into that.
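
The two exit paths can be sketched as follows. This is a minimal Python 
illustration under stated assumptions, not MailMate's shutdown code: the 
`shutdown` function and its three callbacks are hypothetical stand-ins 
for closing server connections, saving unsaved changes, and the full 
teardown used for leak checking.

```python
def shutdown(debug, close_connections, save_changes, free_everything):
    """Run the exit steps and return the names of the steps performed."""
    steps = []
    close_connections()
    steps.append("close_connections")
    save_changes()
    steps.append("save_changes")
    if debug:
        # Full teardown is only useful when a leak checker inspects what is left.
        free_everything()
        steps.append("free_everything")
    # In production the process would exit here and let the OS reclaim memory.
    return steps


noop = lambda: None
print(shutdown(False, noop, noop, noop))  # ['close_connections', 'save_changes']
print(shutdown(True, noop, noop, noop))   # ['close_connections', 'save_changes', 'free_everything']
```

Skipping the teardown in production is safe for memory because the 
operating system reclaims a process's memory when it exits; only state 
that must survive (connections closed cleanly, changes written to disk) 
needs explicit handling.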

-- 
Benny

