[MlMt] ML E-mail tagging?

Jim Bates jim at batesiv.net
Sat Jul 19 07:51:28 EDT 2025


I developed a MailMate Bundle that takes the from, subject, and body 
of an e-mail, sends them to OpenAI for analysis, and gets back a 
one-word TAG, which is then applied to the e-mail. The Bundle worked 
when executed manually; however, it crashed MailMate on startup 
whenever I created a rule on the Inbox folder to execute the Bundle 
automatically, so I have not released the Bundle.

I used identical logic to develop a Python script that performs the 
same activity. While I would prefer to have the Bundle kick off 
automatically when a new e-mail enters the Inbox folder(s), I simply 
have cron execute the script every 5 minutes.

So, about the code: enjoy it for yourself. I'm just sharing it since it 
was fun to do and it works well enough for my personal satisfaction. 
It's running in my cron right now and working as intended. I'd like to 
get the Bundle working and not have MM crash on startup, but I think 
that's going to take a bit more time since I can't really trace that 
out (yet).

One thing I did learn through all this fun: I was sending MORE OpenAI 
tokens with each e-mail than I realized. I had to change my model from 
gpt-4o (accepts 30k tokens) to gpt-4o-mini (accepts 200k tokens). I 
also hit a rate limit since I was applying this logic to ALL my 
previous e-mails :) First time for that one...
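
One way to keep both the per-message token count and the rate limit in 
check is to cap the text sent for each e-mail and to retry with a 
growing delay when a request is rejected. This is only a minimal 
sketch and not part of the script in this post: `clip_text`, 
`call_with_backoff`, the 4000-character budget, and the retry settings 
are all hypothetical choices for illustration, and a character cap is 
just a rough stand-in for a real token count.

```python
import time


def clip_text(text, max_chars=4000):
    """Cap text at max_chars characters (a rough proxy for a token budget)."""
    if len(text) <= max_chars:
        return text
    return text[:max_chars]


def call_with_backoff(func, retry_on=(Exception,), retries=5,
                      base_delay=1.0, sleep=time.sleep):
    """Call func(); on a retry_on error, wait base_delay * 2**attempt and retry."""
    for attempt in range(retries):
        try:
            return func()
        except retry_on:
            if attempt == retries - 1:
                raise  # out of retries: surface the last error
            sleep(base_delay * 2 ** attempt)
```

With the script's variables, this might be used as 
`query = clip_text(query)` before building the request, with the 
`client.chat.completions.create(...)` call wrapped in a small function 
passed to `call_with_backoff`. As far as I know, the rate-limit 
exception in the current `openai` package is `openai.RateLimitError`, 
which would be the natural value for `retry_on`.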

Oh, I also spent about $0.03 in OpenAI credits doing all this, so it's 
not a real financial deal breaker!

**Code Below**
---
```
import imaplib
import email
from email.header import decode_header
from openai import OpenAI

client = OpenAI(api_key = "sk-...")


def decode_hdr(value):
    """Decode an RFC 2047 encoded header value into a plain string."""
    if not value:
        return ""
    parts = decode_header(value)
    decoded = ''
    for part, encoding in parts:
        if isinstance(part, bytes):
            decoded += part.decode(encoding or 'utf-8', errors='ignore')
        else:
            decoded += part
    return decoded

def extract_email_fields(raw_email_bytes):
    """Parse a raw message and return its headers and plain-text body."""
    msg = email.message_from_bytes(raw_email_bytes)

    subject = decode_hdr(msg.get("Subject"))
    from_ = decode_hdr(msg.get("From"))
    to = decode_hdr(msg.get("To"))
    date = decode_hdr(msg.get("Date"))

    # Get plain text body: the first text/plain part that is not an
    # attachment. HTML-only messages leave the body empty.
    body = ""
    if msg.is_multipart():
        for part in msg.walk():
            content_type = part.get_content_type()
            content_dispo = str(part.get("Content-Disposition", "")).lower()
            is_attachment = "attachment" in content_dispo
            if content_type == "text/plain" and not is_attachment:
                charset = part.get_content_charset() or "utf-8"
                payload = part.get_payload(decode=True)
                body = payload.decode(charset, errors="ignore")
                break
    else:
        if msg.get_content_type() == "text/plain":
            charset = msg.get_content_charset() or "utf-8"
            payload = msg.get_payload(decode=True)
            body = payload.decode(charset, errors="ignore")

    return {
        "subject": subject.strip(),
        "from": from_.strip(),
        "to": to.strip(),
        "date": date.strip(),
        "body": body.strip()
    }


# Connect and log in
imap = imaplib.IMAP4_SSL("imap.fastmail.com")
imap.login("user@fastmail.com", "abcd1234")

# Select the mailbox
imap.select("INBOX")

# Search for e-mails that do not yet carry the "Processed" keyword
status, messages = imap.search(None, 'UNKEYWORD', 'Processed')
for num in messages[0].split():
    # Note: fetching RFC822 also marks the message as read (\Seen)
    status, data = imap.fetch(num, "(RFC822)")
    raw_email = data[0][1]
    fields = extract_email_fields(raw_email)

    strings = [fields["from"], fields["subject"], fields["body"]]
    query = " ".join(strings)
    response = client.chat.completions.create(
        # switched from gpt-4o; see the note on tokens above
        model="gpt-4o-mini",
        messages=[
            {"role": "system", "content": (
                "You are an email classification system. Based on the "
                "email's subject and body, assign one of the following "
                "categories: Ads, Finance, Medical, News, Other, Radio, "
                "Shipping, Update. Return only the category name, "
                "nothing else."
            )},
            {"role": "user", "content": query},
        ]
    )

    # Apply the returned category as an IMAP keyword, then mark the
    # message as processed so the next cron run skips it
    tag = response.choices[0].message.content.strip()
    imap.store(num, '+FLAGS', tag)
    imap.store(num, '+FLAGS', 'Processed')

imap.logout()
```
---
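
Since the script stores whatever text the model returns as an IMAP 
keyword, a reply with extra words or punctuation would end up as an 
unexpected flag. Here is a minimal sketch of a guard, assuming the 
category list from the prompt in the script; `ALLOWED_TAGS`, 
`normalize_tag`, and the `Other` fallback are hypothetical names and 
choices, not part of the script as posted.

```python
ALLOWED_TAGS = ("Ads", "Finance", "Medical", "News", "Other", "Radio",
                "Shipping", "Update")


def normalize_tag(raw, allowed=ALLOWED_TAGS, fallback="Other"):
    """Map a model reply onto one of the allowed categories, else fallback."""
    # Drop surrounding whitespace, then stray quotes or a trailing period
    cleaned = (raw or "").strip().strip(".\"'").lower()
    for tag in allowed:
        if cleaned == tag.lower():
            return tag
    return fallback
```

In the loop, the value passed to `imap.store(num, '+FLAGS', ...)` 
would then be `normalize_tag(...)` of the model's reply, so only the 
eight known categories can ever be applied as keywords.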

--
Jim Bates
(804) 690-9143 (Cell/Signal)

On 17 Jul 2025, at 11:05, Jim Bates via mailmate wrote:

> **TLDR:**
>
> Does anyone have a machine learning, Sieve, MailMate rule system or 
> any other systematic process which will [create tags | move mail]  
> based on machine learning algorithms (or alternative logic) instead of 
> hard coded word/syntax matches?
>
> --------
>
> I have grown weary sorting e-mail based upon e-mail addresses and a 
> few, select searchable keywords.
>
> I was playing around with some machine learning (ML) analysis of my 
> e-mail corpus and had various levels of failure :) No real success so 
> far.
>
> I thought I'd send a note out to the MailMate team and see if anyone 
> had developed their own solution.
>
> On a slightly different note, I have purchased and use Spam Sieve. 
> It's integrated into MailMate and, though I receive very little spam 
> anyway, it does a nice job of “learning” about my mail content. I 
> have asked the Spam Sieve team if they are planning to develop the 
> feature set I've described; they say they're considering it for the 
> product, but it does not exist today...
>
>
> Jim Bates
> (804) 690-9143 (Cell/Signal)