
Python-Parse Email Body And Truncate MIME Headers

I have an email body which looks somewhat like . Now I want to remove all the headers from it and keep just the conversation text. How can I do this in Python? I tried email.par

Solution 1:

import imaplib
import email

hst = "your.host.address.com"
usr = "login"
pwd = "password"

# Plain IMAP connection; use imaplib.IMAP4_SSL(hst) if your server expects TLS
imap = imaplib.IMAP4(hst)

try:
    imap.login(usr, pwd)
except Exception as e:
    raise IOError(e)

try:
    imap.select("Inbox")  # choose the mailbox to read from
    result, data = imap.uid('search', None, "ALL")
    latest = data[0].split()[-1]  # UID of the newest message
    result, data = imap.uid('fetch', latest, '(RFC822)')
    a = data[0][1]  # raw message: headers and body together
except Exception as e:
    raise IOError(e)

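msg = email.message_from_bytes(a)  # on Python 3 the fetch returns bytes
b = ""
# walk() also yields msg itself, so non-multipart mails are covered too
for part in msg.walk():
    if part.get_content_maintype() == "text":
        raw = part.get_payload(decode=True)  # undo base64 / quoted-printable
        b = raw.decode(part.get_content_charset() or "utf-8", errors="replace")
        if part.get_content_type() == "text/plain":
            break  # prefer the plain-text part over HTML

print(b)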

This strips the headers and the MIME wrapping, leaving only the body text. I've tested this with your code. You didn't show how you load the mail (your a), so I guess that's where your decoding problem comes from.
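To try the header stripping without an IMAP server, here is a minimal sketch that parses a raw message held in memory. The sample message and the helper name extract_body are made up for illustration; the sketch uses the newer email.policy.default API instead of the manual payload handling above, so treat it as one possible approach rather than the only one.

```python
import email
from email import policy

# Hypothetical raw message, standing in for the data fetched over IMAP
RAW = (
    "From: alice@example.com\r\n"
    "To: bob@example.com\r\n"
    "Subject: Lunch\r\n"
    "MIME-Version: 1.0\r\n"
    'Content-Type: multipart/alternative; boundary="XYZ"\r\n'
    "\r\n"
    "--XYZ\r\n"
    'Content-Type: text/plain; charset="utf-8"\r\n'
    "\r\n"
    "Are we still on for lunch?\r\n"
    "--XYZ\r\n"
    'Content-Type: text/html; charset="utf-8"\r\n'
    "\r\n"
    "<p>Are we still on for <b>lunch</b>?</p>\r\n"
    "--XYZ--\r\n"
)


def extract_body(raw):
    """Return the text body of a raw message, without any headers."""
    if isinstance(raw, bytes):
        msg = email.message_from_bytes(raw, policy=policy.default)
    else:
        msg = email.message_from_string(raw, policy=policy.default)
    # Prefer the text/plain part and fall back to text/html
    part = msg.get_body(preferencelist=("plain", "html"))
    if part is None:
        return ""
    return part.get_content()  # decoded to str using the part's charset


body = extract_body(RAW)
print(body)
```

The helper accepts both str and bytes, so it can be fed the a from the IMAP fetch directly.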

If the body turns out to be HTML, you can strip the markup with BeautifulSoup:

from bs4 import BeautifulSoup
soup = BeautifulSoup(b, 'html.parser')
text = soup.get_text()  # text content only, tags removed
print(text)

That should do the job for now, but I'd advise you to swap html.parser (the parser that ships with Python) for lxml or html5lib, both of which BeautifulSoup can use once they are installed.
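If installing third-party packages is not an option, a rough standard-library alternative to the BeautifulSoup snippet is sketched below. It is not what the answer above uses: TextExtractor and html_to_text are hypothetical names, and the class only collects text nodes while skipping script and style contents, so it will be less forgiving of broken markup than a full parser.

```python
from html.parser import HTMLParser


class TextExtractor(HTMLParser):
    """Collect text nodes, skipping the contents of <script> and <style>."""

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self._chunks = []
        self._skip_depth = 0

    def handle_starttag(self, tag, attrs):
        if tag in ("script", "style"):
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in ("script", "style") and self._skip_depth:
            self._skip_depth -= 1

    def handle_data(self, data):
        if not self._skip_depth:
            self._chunks.append(data)

    def text(self):
        # Collapse the runs of whitespace left behind by removed tags
        return " ".join("".join(self._chunks).split())


def html_to_text(html):
    """Return the visible text of an HTML string."""
    parser = TextExtractor()
    parser.feed(html)
    parser.close()  # flush any buffered text
    return parser.text()


print(html_to_text("<p>Are we still on for <b>lunch</b>?</p>"))
```

Called on the HTML body b from above, html_to_text(b) returns the text with tags removed and entities such as &amp; decoded.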

