Parsing email "Received:" headers


We need to parse email "Received:" headers according to RFC 5321. We need to extract the domains/IPs that the mail has passed through. In addition, we need to determine whether an IP is an internal IP or not. Is there already a library that can help, preferably in C/C++?

For example,

Received: from server.mymailhost.com (mail.mymailhost.com [126.43.75.123]) by pilot01.cl.msu.edu (8.10.2/8.10.2) with ESMTP id NAA23597;

The format used by 'Received' lines is defined in RFC 2821 (since superseded by RFC 5321), and a regex cannot parse it.

(You can try anyway, and you may succeed for a limited subset of headers produced by known software, but once you point it at the range of weird stuff found in real-world mail, it will fail.)

Use an existing RFC 2821 parser and you should be OK. Otherwise, expect failures and write your software to cope with them.
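To illustrate what "cope with failure" might look like, here is a minimal best-effort sketch in C++17. It is not an RFC 2821/5321 parser: it only skips parenthesised comments and picks out the word after the first "from" and "by" keywords, and it returns an empty result when it finds neither. The names `ReceivedHops`, `skip_cfws`, `read_word` and `parse_received` are made up for this example, and the input is assumed to be the unfolded header value after the "Received:" prefix.

```cpp
#include <cctype>
#include <cstddef>
#include <optional>
#include <string>

// Hypothetical result type: the host names claimed in the two clauses.
struct ReceivedHops {
    std::string from;  // word following "from", may be empty
    std::string by;    // word following "by", may be empty
};

// Advance pos past whitespace and (possibly nested) parenthesised comments.
void skip_cfws(const std::string& s, std::size_t& pos) {
    for (;;) {
        while (pos < s.size() &&
               std::isspace(static_cast<unsigned char>(s[pos]))) {
            ++pos;
        }
        if (pos >= s.size() || s[pos] != '(') return;
        int depth = 0;
        while (pos < s.size()) {
            char c = s[pos++];
            if (c == '\\' && pos < s.size()) { ++pos; continue; }  // quoted-pair
            if (c == '(') ++depth;
            else if (c == ')' && --depth == 0) break;
        }
    }
}

// Read one word, stopping at whitespace, a comment start, or the ';'
// that introduces the date part.
std::string read_word(const std::string& s, std::size_t& pos) {
    std::size_t start = pos;
    while (pos < s.size() &&
           !std::isspace(static_cast<unsigned char>(s[pos])) &&
           s[pos] != '(' && s[pos] != ';') {
        ++pos;
    }
    return s.substr(start, pos - start);
}

// Best-effort extraction. Returns std::nullopt when neither clause is
// found, so the caller has to handle headers this sketch cannot read.
std::optional<ReceivedHops> parse_received(const std::string& value) {
    ReceivedHops hops;
    std::size_t pos = 0;
    for (;;) {
        skip_cfws(value, pos);
        if (pos >= value.size() || value[pos] == ';') break;
        std::string key = read_word(value, pos);
        if (key.empty()) return std::nullopt;  // defensive: no progress
        for (char& c : key) {
            c = static_cast<char>(std::tolower(static_cast<unsigned char>(c)));
        }
        skip_cfws(value, pos);
        if (key == "from" && hops.from.empty()) {
            hops.from = read_word(value, pos);
        } else if (key == "by" && hops.by.empty()) {
            hops.by = read_word(value, pos);
        }
    }
    if (hops.from.empty() && hops.by.empty()) return std::nullopt;
    return hops;
}
```

Calling `parse_received` on the example header from the question (without the "Received:" prefix) should yield `server.mymailhost.com` and `pilot01.cl.msu.edu`. Anything it cannot make sense of comes back as `std::nullopt`, which the caller should treat as "unknown hop" rather than as an error in the mail.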

We need to extract the "by" server.

'from' is more likely to be of use. The hostname given in a 'by' clause is the name as seen by the host itself, so there is no guarantee that it will be a publicly resolvable FQDN, and of course you do not tend to get valid (TCP-Info) there.
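For the "internal IP" part of the question, a rough sketch of how the bracketed address in the 'from' clause's (TCP-Info) comment could be classified. It assumes "internal" means the RFC 1918 private ranges plus loopback and link-local, and it handles IPv4 only. Your site may define "internal" differently. The function names `bracketed_address`, `parse_ipv4` and `is_internal_ipv4` are invented for this example.

```cpp
#include <array>
#include <cctype>
#include <cstddef>
#include <cstdint>
#include <optional>
#include <string>

using Ipv4 = std::array<std::uint8_t, 4>;

// Return the text between the first '[' and the next ']' of a
// TCP-info comment such as "(mail.mymailhost.com [126.43.75.123])".
std::optional<std::string> bracketed_address(const std::string& comment) {
    std::size_t open = comment.find('[');
    if (open == std::string::npos) return std::nullopt;
    std::size_t close = comment.find(']', open + 1);
    if (close == std::string::npos) return std::nullopt;
    return comment.substr(open + 1, close - open - 1);
}

// Parse strict dotted-quad text; std::nullopt on anything malformed.
std::optional<Ipv4> parse_ipv4(const std::string& text) {
    Ipv4 out{};
    std::size_t pos = 0;
    for (int i = 0; i < 4; ++i) {
        unsigned value = 0;
        int digits = 0;
        while (pos < text.size() &&
               std::isdigit(static_cast<unsigned char>(text[pos]))) {
            value = value * 10 + static_cast<unsigned>(text[pos] - '0');
            ++pos;
            if (++digits > 3) return std::nullopt;
        }
        if (digits == 0 || value > 255) return std::nullopt;
        out[static_cast<std::size_t>(i)] = static_cast<std::uint8_t>(value);
        if (i < 3) {
            if (pos >= text.size() || text[pos] != '.') return std::nullopt;
            ++pos;
        }
    }
    if (pos != text.size()) return std::nullopt;  // trailing junk
    return out;
}

// Assumption: "internal" = private, loopback, or link-local IPv4.
bool is_internal_ipv4(const Ipv4& a) {
    if (a[0] == 10) return true;                              // 10.0.0.0/8
    if (a[0] == 172 && a[1] >= 16 && a[1] <= 31) return true; // 172.16.0.0/12
    if (a[0] == 192 && a[1] == 168) return true;              // 192.168.0.0/16
    if (a[0] == 127) return true;                             // 127.0.0.0/8
    if (a[0] == 169 && a[1] == 254) return true;              // 169.254.0.0/16
    return false;
}
```

With the example header, `bracketed_address` gives `126.43.75.123`, which `is_internal_ipv4` reports as not internal. Keep in mind that only the address recorded by a server you trust is reliable; earlier hops can write whatever they like into the header.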

