[MlMt] Content-type markup=markdown and Microsoft/Yahoo Mail filtering

Bill Cole mmlist-20120120 at billmail.scconsult.com
Mon Feb 26 17:25:00 EST 2018


On 26 Feb 2018, at 12:11, Filip Stokkeland wrote:

> Hi!
> I discovered something interesting with Outlook and Yahoo Mail's
> junk filtering.
>
> If I send to an Outlook.com address with markdown selected in
> MailMate's composer, it's treated as junk on the Microsoft end.
> And similar result with Yahoo Mail.

Yes, email service tends to be worth *at most* what one pays for it. 
Spam control is one of the most expensive quality discriminators for 
mail service. The effects of scale are complex and perverse. The giant 
providers have special challenges and their scores of staff rarely do as 
good a job as small single-domain systems with one or two skilled 
admins.

> It seems like the issue is this header.
> `Content-Type: text/plain; markup=markdown`
>
> I'm not sure why, it doesn't make sense to me.
> (Why should `markup=markdown` make a difference to the junk filter?)

It should not. It is very unlikely that there's a specific test for 
that. It is far more likely that whatever/whoever maintains those 
filters has determined that unknown/non-standard parameters in the 
Content-Type header (which are valid under the MIME spec) are more 
common in spam than non-spam.
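
To illustrate the point that an unfamiliar parameter is still a well-formed one, here is a minimal sketch using Python's standard `email` package. The sample header and body are made up for the example, and this only shows how one MIME parser reads the header; it says nothing about how Microsoft's or Yahoo's filters actually treat it.

```python
from email import policy
from email.parser import Parser

# Hypothetical message, modeled on the header quoted above.
raw = (
    "Content-Type: text/plain; markup=markdown\n"
    "\n"
    "Some *Markdown* body text.\n"
)

msg = Parser(policy=policy.default).parsestr(raw)
ctype = msg["Content-Type"]

# The media type is still plain text; the extra parameter is simply
# carried along as a name/value pair.
content_type = ctype.content_type
params = dict(ctype.params)

print(content_type)
print(params)
# Any parse problems the library noticed would be listed here.
print(ctype.defects)
```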

But I expect that it is something else... Read on.

> But if I select just Plain text in the Composer, and if I even
> add weird/utf-8 (Scandinavian) characters, then it works.
> The email is **not** treated as junk.
>
> This is the header then:
> `Content-Type: text/plain; charset=utf-8; format=flowed`

And in both types of message there's also a Content-Transfer-Encoding 
header which is probably different between the two formats, although it 
may simply not exist in the first type. That *header* itself is unlikely 
to be the reason for spam filters to hit one message and not the other, 
but the effect of the specific encodings on the body is likely to be.

The reason that the mail with "Content-Type: text/plain; 
markup=markdown" header may not have a Content-Transfer-Encoding header 
is that it is missing a charset parameter, indicating the default value: 
us-ascii. That means the content can be represented in the traditional 
"7bit" form, which is the default value for the 
Content-Transfer-Encoding header, so the header isn't needed as long as 
there's no need to use QP or B64 encoding to accommodate long lines.
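
The defaulting described above can be sketched with Python's standard `email` package. The helper name `effective_charset_and_cte` and both sample messages are hypothetical; the fallbacks are the defaults named in the paragraph above (us-ascii and 7bit), applied by hand when the parameter or header is absent.

```python
from email import policy
from email.parser import Parser

def effective_charset_and_cte(raw_message):
    """Return (charset, transfer encoding), applying the MIME defaults
    when the charset parameter or the CTE header is missing."""
    msg = Parser(policy=policy.default).parsestr(raw_message)
    charset = msg.get_content_charset("us-ascii")
    cte = str(msg.get("Content-Transfer-Encoding", "7bit")).strip().lower()
    return charset, cte

# Hypothetical Markdown-style message: no charset, no CTE header.
markdown_msg = (
    "Content-Type: text/plain; markup=markdown\n"
    "\n"
    "Plain ASCII body.\n"
)

# Hypothetical plain-text message with explicit charset and CTE.
utf8_msg = (
    "Content-Type: text/plain; charset=utf-8; format=flowed\n"
    "Content-Transfer-Encoding: quoted-printable\n"
    "\n"
    "bl=C3=A5b=C3=A6r\n"
)

print(effective_charset_and_cte(markdown_msg))
print(effective_charset_and_cte(utf8_msg))
```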

With utf-8, the best CTE choice is almost always base64, which 
eliminates all resemblance between the message as composed and the 
message in transit: a single-character addition or removal anywhere 
in an encoded part changes the encoded form of everything after 
the change in that part. For a behemoth like Yahoo or MS, decoding 
base64 parts to get a meaningful text that can be subjected to Bayesian 
or linguistic analysis is expensive, and they may be just skipping body 
analysis of some or all base64 messages.
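
The effect of a one-character edit on base64 output can be seen with a small sketch using Python's standard `base64` module. The sample sentence and the helper `common_prefix_len` are invented for the illustration. Because base64 maps each group of three bytes to four characters, inserting one byte shifts the grouping of everything after it.

```python
import base64

def common_prefix_len(a, b):
    """Count how many leading characters two strings share."""
    n = 0
    for x, y in zip(a, b):
        if x != y:
            break
        n += 1
    return n

original = "The quick brown fox jumps over the lazy dog"
# Insert a single character after the fourth character of the text.
edited = original[:4] + "X" + original[4:]

enc_original = base64.b64encode(original.encode("utf-8")).decode("ascii")
enc_edited = base64.b64encode(edited.encode("utf-8")).decode("ascii")

shared = common_prefix_len(enc_original, enc_edited)

print(enc_original)
print(enc_edited)
print("shared leading characters:", shared, "of", len(enc_original))
```

The two encoded strings agree only for the first few characters, even though the decoded texts differ by a single letter, so a filter comparing or tokenizing the raw encoded body learns very little without decoding it first.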

Also, if your use of Markdown meant you were generating 
multipart/alternative messages with a text/html alternative, that 
can also make a difference in how a message is filtered.
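
For reference, this is roughly what such a message structure looks like when built with Python's standard `email` package. It is a sketch of a generic multipart/alternative message with made-up content, not a reproduction of what MailMate actually sends.

```python
from email.message import EmailMessage

msg = EmailMessage()
msg["Subject"] = "Markdown test"  # hypothetical subject

# The plain-text part (here, the Markdown source as typed).
msg.set_content("Some *Markdown* text.\n")
# Adding an HTML rendering converts the message to multipart/alternative.
msg.add_alternative("<p>Some <em>Markdown</em> text.</p>\n", subtype="html")

top_level_type = msg.get_content_type()
part_types = [part.get_content_type() for part in msg.iter_parts()]

print(top_level_type)
print(part_types)
```

A filter now has two bodies to evaluate instead of one, and rules that look at the HTML part, or at the relationship between the two parts, come into play.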

-- 
Bill Cole
bill at scconsult.com or billcole at apache.org
(AKA @grumpybozo and many *@billmail.scconsult.com addresses)
Currently Seeking Steady Work: https://linkedin.com/in/billcole

