Reliable way to only get the email text, excluding previous emails

I'm creating a basic system that allows users to reply to a thread on the website via email. However, most email clients include the text of the previous emails in their reply emails. This text is unwanted on the website.

Is there a reliable way to extract only the new message, without prior knowledge of the earlier emails? I'm using Python's email module.


Example message:

Content-Type: text/plain; charset=ISO-8859-1

test message! This is the part I want.

On Thu, Mar 24, 2011 at 3:51 PM, <test@test.com> wrote:

> Hi!
>
> Herman just posted a comment on the website:
>
>
> From: Herman
> "Hi there! I might be interested"
>
>
> Regards,
> The Website Team
> http://www.test.com
>

This is a reply message from Gmail; other clients probably format theirs differently. A good start would probably be to ignore the lines that start with >, but quoted lines can also appear in between parts of the new message, and those should probably be kept. That would still leave me with the Content-Type line and the date line.
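To make the idea concrete, here is a minimal sketch of that heuristic, not a reliable solution. It assumes the reply text comes first and the quoted thread sits at the very end, and that the quote is introduced by a single-line "On ... wrote:" attribution as in the Gmail example above. The names `extract_new_text` and `ATTRIBUTION_RE` are made up for illustration; only the standard `email` and `re` modules are used. Parsing the message with `email` takes care of the Content-Type header, and only the trailing run of `>` lines is removed, so quoted lines in between the new text are kept.

```python
import email
import re

# Assumed pattern for a Gmail-style attribution line ("On <date>, <sender> wrote:").
# Other clients, and other languages, will use different wording.
ATTRIBUTION_RE = re.compile(r"^On .+ wrote:\s*$")


def extract_new_text(raw_message):
    """Heuristically return the reply text without the trailing quoted block."""
    msg = email.message_from_string(raw_message)

    # Use the first text/plain part; walk() also covers non-multipart messages.
    body = ""
    for part in msg.walk():
        if part.get_content_type() == "text/plain":
            payload = part.get_payload(decode=True)
            charset = part.get_content_charset() or "utf-8"
            body = payload.decode(charset, errors="replace")
            break

    lines = body.splitlines()

    # Remove only the trailing run of blank and quoted lines, so that
    # quotes interleaved with new text earlier in the message survive.
    while lines and (not lines[-1].strip() or lines[-1].lstrip().startswith(">")):
        lines.pop()

    # Remove the attribution line that introduced the quoted block, if any.
    if lines and ATTRIBUTION_RE.match(lines[-1].strip()):
        lines.pop()

    return "\n".join(lines).strip()
```

This will miss attribution lines that a client wraps over two lines, signatures, and clients that put the reply below the quote, so it is a starting point rather than a general answer.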

asked by Herman Schaaf, 24 March 2011 at 14:11