zbateson / mail-mime-parser
An email parser written in PHP
Home Page: https://mail-mime-parser.org/
License: BSD 2-Clause "Simplified" License
Zaahid,
The only remaining issue is with fetching the email body:
$res = $message->getTextStream();
echo stream_get_contents($res);
The email body also contains the contents of the previously replied mails. Do you know how to remove them? I just need the final mail contents in the thread.
Thanks!
Subject: =?iso-2022-jp?Q?=1B$B300Y=1B(I5]W2]C^S=1B(B(25?=
=?iso-2022-jp?Q?)(=1B$B%G%b=1B(B)=1B$B7h:QLsDj$N$*CN$i$;=1B(B?=
This subject is not converted by iconv_mime_decode, which is used in the decodeMime function. With mb_decode_mimeheader it is decoded correctly.
Hi Zahid,
Right now I'm using your library in my mail processing code, I'm very happy with it and I enjoy using it, thank you for releasing this.
I would be interested to learn whether you are considering to also implement setter functions? Right now I read the email, parse it with mailmimeparser and then create a new mime object via SwiftMailer API in which I copy/construct elements that I have read/parsed via MailMimeParser. This works but doesn't feel elegant so if there's any inclination on providing setter functions that would be great news for me :-)
Thanks again for your great work and kind regards
Meint
Hi there!
Just wanted to thank you for this awesome library, this is a huge help for the next version of https://externals.io and I cannot thank you enough!
(sorry for the noise)
Bit of a nice-to-have: it would be neat to be able to name the digital signature mime part. Right now it's something like "Nameless attachment 00019.dat", which can lead to customer confusion (what is this attachment, is it a virus?).
Request from @ThomasLandauer in a pull request... moving here so I can merge but keep the request.
Please explain in more detail, how the automatic decoding works - maybe here:
https://github.com/zbateson/MailMimeParser/wiki/ZBateson-MailMimeParser-MimePart#getcontentresourcehandle
Some questions:
Content-Transfer-Encoding: header of the mime part?
Support for uuencoded streams with a custom stream filter.
I'm trying to build a webmail client using this library that supports PGP messages. The clearsigned emails are easily implemented but I'm having a problem with multipart/signed messages.
To verify the signature I need to get the signature part and the raw first part of the message (including child parts and headers) as a string. I cannot find a method to do this in the library or am I missing something?
Apple Mail creates a very complex, nested structure for HTML e-mail that MailMimeParser is not able to recognize correctly. Example sent as attachment directly.
not a biggie but the release notes for 0.4.2 mention functions countTextParts and countHtmlParts, however these are not available in this version of the library.
Kind regards
Meint
Hello, my friend.
First of all, let me thank you for your awesome lib. It's a really good alternative to those that use the mailparse extension.
I found some strange behaviour when parsing simple mail content.
When I just echo the html content, it differs from that in the email.
$email = file_get_contents($_SERVER['DOCUMENT_ROOT'].'/mail_debug_content.txt');
$mailParser = new ZBateson\MailMimeParser\MailMimeParser();
$Message = $mailParser->parse($email);
$html = $Message->getHtmlContent();
$atts = $Message->getAllAttachmentParts();
echo $html;
If it is due to my misunderstanding of what's going on, please advise where to look.
Thanks a lot for your help!
MMP should let the user know somehow that the encoding type detected from a Content-Type header isn't supported.
Hi Zaahid,
I have a text/plain message where I retrieve the text content with getTextContent(). I convert this to a HTML representation and add this to the message via setHtmlPart(). I subsequently remove the text part via removeTextPart(). This last action causes duplication of the HTML content, ie the same HTML message is now present twice in the same mime part. This doesn't happen when the text/plain message has an attachment.
$message = parseInput(EMAIL);
$text = $message->getTextContent();
$body = '<pre>' . $text . '</pre>';
$content = renderTemplate(TEMPLATE_OUTGOING, ['body' => $body, 'tracker_url' => $trackerUrl]);
$message->setHtmlPart($content, 'utf8');
$message->removeTextPart();
Calling $message->removeTextPart(); earlier in the process doesn't yield any effect, the text part remains in the e-mail body.
Kind regards
Meint
I'm trying to get attachments from raw email data that is more than 9mb. The script runs just fine when the raw email is around 3-4mb. But generates this error when parsing a larger file:
Call to a member function getContentResourceHandle() on a non-object
Any idea what I might be doing wrong?
Some mails are not in English and have no charset in the mail header, so the charset cannot be detected correctly. Can I set a user-defined default charset for all mails?
I found this part in "Stream/CharsetStreamFilter.php". Can someone show me how I can change $charset to a user-defined variable?
public function onCreate()
{
    $charset = 'ISO-8859-1';
    $to = 'UTF-8';
    if (!empty($this->params['charset'])) {
        $charset = $this->params['charset'];
    }
    if (!empty($this->params['to'])) {
        $to = $this->params['to'];
    }
    $di = SimpleDi::singleton();
    $this->converter = $di->newCharsetConverter($charset, $to);
}
thx
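A minimal sketch of the idea in this question, assuming the goal is simply "use a caller-chosen charset when the mail declares none". The helper name toUtf8WithDefault is hypothetical and is not part of MailMimeParser; it only illustrates the fallback logic with PHP's own iconv().

```php
<?php
// Hypothetical helper (not part of MailMimeParser): converts text to UTF-8,
// using a caller-supplied default when no charset was declared.
function toUtf8WithDefault(string $text, ?string $declaredCharset, string $defaultCharset = 'ISO-8859-1'): string
{
    $from = ($declaredCharset !== null && $declaredCharset !== '')
        ? $declaredCharset
        : $defaultCharset;
    // iconv() returns false (and raises a notice/warning) for unknown charsets.
    $converted = @iconv($from, 'UTF-8', $text);
    // On failure, hand back the input untouched instead of losing the text.
    return $converted === false ? $text : $converted;
}

// No declared charset: the byte 0xE9 is read with the default, ISO-8859-1.
echo toUtf8WithDefault("caf\xE9", null), "\n";
// Same byte with a user-chosen default: in CP1251 it is a Cyrillic letter.
echo toUtf8WithDefault("\xE9", null, 'CP1251'), "\n";
```

The default is passed in as a parameter rather than hard-coded, which is the change the question asks about.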
Error: Fatal error: Call to a member function getParent() on null in /var/task/vendor/zbateson/mail-mime-parser/src/Message.php on line 477
Hi,
It could be caused by some not printable characters, but I'm not sure.
Raw contents of the email:
problem-email-original.txt
Thanks
Hi zbateson,
I just tried this library everything works well, really appreciating your effort and time!
I would like to know more about attachment handling. In short, how can I save all of the attachments to a specific directory on my server?
Here is where I am now:
$att = $message->getAttachmentPart(0);
$atts = $message->getAllAttachmentParts();
Could you please help me with this? It would be very helpful, I have already spent a lot of time trying to figure it out :(
Again Thanks for this great library!
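One possible approach, sketched under assumptions: the helper saveStreamsToDirectory below is hypothetical and only uses PHP's built-in stream functions. It expects an array that maps a file name to a readable stream. With this library, such streams could presumably be collected by looping over $message->getAllAttachmentParts() and calling getContentResourceHandle() on each part (both are mentioned in this thread); how the file name is obtained is left open here.

```php
<?php
// Hypothetical helper: writes each readable stream into $dir.
// $streams maps a file name to an open, readable stream resource.
function saveStreamsToDirectory(array $streams, string $dir): array
{
    if (!is_dir($dir) && !mkdir($dir, 0775, true)) {
        throw new RuntimeException("Cannot create directory: $dir");
    }
    $written = [];
    foreach ($streams as $name => $handle) {
        // basename() drops any path components a sender put into the name.
        $target = $dir . DIRECTORY_SEPARATOR . basename((string) $name);
        $out = fopen($target, 'wb');
        if ($out === false) {
            throw new RuntimeException("Cannot open for writing: $target");
        }
        stream_copy_to_stream($handle, $out);
        fclose($out);
        $written[] = $target;
    }
    return $written;
}

// Stand-in for an attachment stream, so the sketch runs without the library.
$demo = fopen('php://temp', 'r+');
fwrite($demo, 'attachment bytes');
rewind($demo);

$dir = sys_get_temp_dir() . DIRECTORY_SEPARATOR . 'attachments-demo';
print_r(saveStreamsToDirectory(['report.txt' => $demo], $dir));
```

Copying stream to stream avoids loading a whole attachment into memory, which matters for large mails.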
First, thanks for writing the library. Highly useful without having to install mailparse.
I do have a question regarding a feature, if it is available. Is there an option to return the raw email (headers + body) with the MIME parts intact but without attachments? I am looking at feeding this to a spam filter without attachments. If I build a test message using the included features, spamassassin is very unhappy about it missing the MIME boundaries.
I have emails with duplicate header names. When I call the "getHeader" method on these names it returns the last entry only. It would be desirable if it would return an array in this case.
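To illustrate the requested behaviour, here is a small sketch that is independent of MailMimeParser: a hypothetical collectHeaders function that groups raw header lines by name, so a repeated header keeps every value instead of only the last one.

```php
<?php
// Hypothetical helper, independent of MailMimeParser: groups raw header lines
// by lower-cased name so repeated headers keep every value.
function collectHeaders(string $rawHeaders): array
{
    // Unfold continuation lines (a line break followed by a space or tab).
    $unfolded = preg_replace('/\r?\n[ \t]+/', ' ', $rawHeaders);
    $headers = [];
    foreach (preg_split('/\r?\n/', $unfolded) as $line) {
        if (strpos($line, ':') === false) {
            continue; // not a header line
        }
        [$name, $value] = explode(':', $line, 2);
        $headers[strtolower(trim($name))][] = trim($value);
    }
    return $headers;
}

$raw = "Received: from a.example\r\nReceived: from b.example\r\nSubject: Hi\r\n";
print_r(collectHeaders($raw)['received']);
```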
Hi, I have an issue I would appreciate any help with.
When I set a custom header, for example: $mime->setRawHeader("X-HEADER", "value");
it is saved in the raw mime as follows: X--HEADER: value. It adds another "-" to the header
and it continues to add a "-" on every run.
For example, on the second run we will see 2 headers:
and so forth.
The question is why,
and how can I avoid this?
Thanks in advance for the help.
Hi Zaahid, i have had a look at the other issues regarding attachments such as #15 but i can not seem to get the attachments working as one would expect, My initial thoughts are i maybe doing something wrong or the second thought is that my testing method may be the cause.
How i am testing is... Sending an email from outlook to AWS SES who then stores the email as an S3 object and then the parser retrieves the s3 object, parses it and displays the required content.
I have tried by attaching some images in outlook and then sending them as normal, Everything works perfect apart from the attachment part.
I am using laravel as the php framework and when i inspect the s3 object i can see that the image i sent is in base64 encoded within the file, Here is the header:
--005_AM4PR0301MB1908EBED5C3BCD19A66F77D8C53E0AM4PR0301MB1908
Content-Type: image/png; name="2 bed floor plan (1).png"
Content-Description: 2 bed floor plan (1).png
Content-Disposition: attachment; filename="2 bed floor plan (1).png"; size=82879; creation-date="Fri, 24 Mar 2017 10:36:55 GMT"; modification-date="Fri, 24 Mar 2017 10:36:55 GMT"
Content-Transfer-Encoding: base64

VBORw0KGgoAAAANSUhE................
I have tried to use a foreach loop as suggested in #15 but no matter what i do i cant seem to get the content displayed or even save the attachment to a directory.
I am able to save the full object and all other content such as from, to, subject, html, text ect ect just not the attachment(s)...
Any help would be greatly appreciated Zaahid and i would also be more than happy to make a donation for the excellent work you have put into the parser, Maybe a donation button on the summary page would allow myself and other appreciative people to show our appreciation :)
I would greatly appreciate it if you could help me out on this. If I use $message->getAttachmentCount() then it returns the correct count, in this case 2,
but if I use $message->getAllAttachmentParts() I just get an array with empty entries at 0 and 1, and if I use $message->getAttachmentPart(0) I don't get the content either.
I have attached a text file of print_r($message->getAttachmentPart(0)) to give you an idea of what I mean. It's quite a large file, so I apologise. My guess is you will probably know exactly where to look :)
Thanks, i really appreciate the work you have done :)
Thanks for this library, it is extremely well-written. I have a slight issue and I'm not clear if I'm using it incorrectly, however:
I have a Message of type multipart/alternative containing two parts, text/html and text/plain. When I call addAttachmentPart(), this calls setMessageAsMixed(), which turns the message into multipart/mixed and adds the multipart/alternative as a child.
This results in a multipart/mixed containing the text/html, text/plain, an empty multipart/alternative and the attachment part.
My question is: should the textPart and htmlPart not be moved to be children of the multipart/mixed?
Might be a fix aligning hhvm to php for base64 decoding -- need to check which version breaks in hhvm and fix appropriately.
I came across a message (X-Mailer: IncrediMail) where content types were not in lowercase. This still works for most parts, but Multipart/Alternative causes getTextContent() to fail.
I'm unable to do a pull request currently, but it can be fixed by adding strtolower() in line 121 of Message.php:
private function addToAlternativeContentPartFromParsed(MimePart $part)
{
    $partType = strtolower($this->contentPart->getHeaderValue('Content-Type'));
Background: I have emails that I fetch from a database rather than from files.
Problem: It is not currently possible to parse a mail directly from a string (as far as I can see from both the code and the Usage Guide).
Solution: Add a #parseString method to the MailMimeParser:
$mailParser = new MailMimeParser();
$message = $mailParser->parseString($mailAsString);
Appendix: Ideally the Message could also have a static convenience method #fromString (*):
$message = Message::fromString($mailAsString);
(*) = It could simply instantiate a MailMimeParser for me, as I don't need that afterwards anyway.
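Until something like the proposed method exists, one workaround is to wrap the string in a temporary stream. The sketch below assumes the parser accepts a readable stream resource; the helper name stringToStream is hypothetical and uses only PHP built-ins.

```php
<?php
// Hypothetical helper: copies a string into a temporary stream so it can be
// handed to code that expects a readable resource.
function stringToStream(string $data)
{
    // php://temp keeps small payloads in memory and spills to disk when large.
    $handle = fopen('php://temp', 'r+');
    fwrite($handle, $data);
    rewind($handle);
    return $handle;
}

$mailAsString = "Subject: Test\r\n\r\nBody";
$handle = stringToStream($mailAsString);
echo stream_get_contents($handle);
fclose($handle);
```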
Is it possible to get only the original message, without replies and forwards?
For example, if I have in the body:
some message
On Tue, Mar 6, 2018 at 7:33 PM, Yurii Ploskyi [email protected]
wrote:
asdfasdfasdf
On Tue, Mar 6, 2018 at 7:33 PM, Yurii Ploskyi [email protected]
wrote:asdfasdfasdf
On Tue, Mar 6, 2018 at 7:33 PM, Yurii Ploskyi [email protected]
wrote:asdf
On Tue, Mar 6, 2018 at 7:32 PM, Yurii Ploskyi [email protected]
wrote:sdfg
On Tue, Mar 6, 2018 at 7:32 PM, <venue@sandbox7fa88b3af93c4130
8174561625a2cc98.mailgun.org> wrote:Isn't it fun getting mail?!
You've got a message in your inbox."asd..."
I want to parse only "some message".
I tried to find such a function, but none of the methods returns this.
Thanks in advance
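A heuristic sketch only, and not a MailMimeParser feature: the hypothetical stripQuotedReplies function below keeps the text before the first English-style "On ... wrote:" attribution or the first ">"-quoted line. Attribution lines differ by mail client and language, so this will miss many real-world formats.

```php
<?php
// Heuristic only: keeps the text before the first "On <date>, <name> wrote:"
// attribution line or the first ">"-quoted line.
function stripQuotedReplies(string $text): string
{
    $lines = preg_split('/\r?\n/', $text);
    $kept = [];
    foreach ($lines as $i => $line) {
        // The attribution may wrap onto the next line, so test both joined.
        $joined = $line . ' ' . ($lines[$i + 1] ?? '');
        if (preg_match('/^On\s.+\swrote:/', $joined) || strpos(ltrim($line), '>') === 0) {
            break;
        }
        $kept[] = $line;
    }
    return trim(implode("\n", $kept));
}

$body = "some message\n\nOn Tue, Mar 6, 2018 at 7:33 PM, Someone <user@example.com>\nwrote:\n\nasdfasdfasdf";
echo stripQuotedReplies($body), "\n";
```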
Decoding errors sometimes happen with attachments.
The problem (when $this->leftover = '0') is here, in Base64DecodeStreamFilter.php:
if (!empty($this->leftover)) {
Fix:
if (strlen($this->leftover)) {
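The difference between the two checks can be seen in isolation. PHP's empty() treats the string '0' as empty, so a leftover chunk consisting of the single character '0' is skipped by the first condition, while strlen() still sees it:

```php
<?php
$leftover = '0'; // a single base64 character carried over between chunks

var_dump(empty($leftover));       // bool(true): '0' counts as empty
var_dump(strlen($leftover) > 0);  // bool(true): the character is still there
```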
Hi Zaahid,
I have another Apple case for you :-) (well, it's actually a bit more generic, but this test case was generated by Mail on iPad)
Date: Fri, 10 Jul 2016 15:29:52 GMT
From: "Test Sender" <[email protected]>
To: "Test Recipient" <[email protected]>
Message-ID: <[email protected]>
Subject: [36]-text-plain-ipad
MIME-Version: 1.0
X-Mailer: iPad Mail (12H143)
Test
The body isn't recognized by any of the existing functions (getTextPart, getHtmlPart, getTextStream, getHtmlStream, getTextContent and getHtmlContent), I'm assuming this happens because of the lack of content-type for this simplest of email message formats?
However the message is a valid message.
The only function that triggers is getAllParts() but that just reflects the entire message, including headers.
Kind regards
Meint
Fatal error: Call to a member function getParent() on null in /var/task/vendor/zbateson/mail-mime-parser/src/Message.php on line 477
Hi,
From time to time I get an error about the iconv() call in CharsetConverter.php.
The error looks like:
PHP Fatal error: Uncaught Exception: Error writing to storage ErrorException: [ERR] iconv(): Wrong charset, conversion from `UTF-8^M
<!DOCTYPE HTML>' to `UTF-8//TRANSLIT//IGNORE' is not allowed in /app/smtpd.walla.co.il/vendor/zbateson/mail-mime-parser/src/Stream/Helper/CharsetConverter.php on line 335 in /app/smtpd.walla.co.il/app/bootstrap.php:29
Stack trace:
#0 [internal function]: errHandle(8, 'iconv(): Wrong ...', '/app/smtpd.wall...', 335, Array)
#1 /app/smtpd.walla.co.il/vendor/zbateson/mail-mime-parser/src/Stream/Helper/CharsetConverter.php(335): iconv('UTF-8\r\n <!DOCTY...', 'UTF-8//TRANSLIT...', '\r\n<html lang="h...')
#2 /app/smtpd.walla.co.il/vendor/zbateson/mail-mime-parser/src/Stream/CharsetStreamFilter.php(48): ZBateson\MailMimeParser\Stream\Helper\CharsetConverter->convert('\r\n<html lang="h...')
#3 [internal function]: ZBateson\MailMimeParser\Stream\CharsetStreamFilter->filter(Resource id #23210, Resource id #23211, NULL, false)
#4 /app/smtpd.walla.co.il/vendor/zbateson/mail-mime-parser/src/Message/Writer/Mim in /app/smtpd.walla.co.il/src/Worker/Store.php on line 45
This is probably because of parsing issues. What can I do to fix it?
I think it would be best to add validation in the findSupportedCharset function to check whether the string contains UTF-8 and, if so, to set the charset to that.
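A sketch of the kind of validation suggested here, under the assumption that the charset value simply has trailing junk after the real name. The helper sanitizeCharsetName is hypothetical and is not the library's findSupportedCharset; it keeps only the leading charset token.

```php
<?php
// Hypothetical helper: keeps only the leading charset token and discards
// anything after it (line breaks, markup, stray parameters).
function sanitizeCharsetName(string $charset, string $fallback = 'UTF-8'): string
{
    if (preg_match('/^\s*([A-Za-z0-9][A-Za-z0-9._:\-]*)/', $charset, $m)) {
        return strtoupper($m[1]);
    }
    return $fallback;
}

// The value from the error message above: a charset followed by CRLF and HTML.
echo sanitizeCharsetName("UTF-8\r\n<!DOCTYPE HTML>"), "\n"; // UTF-8
```

Taking the leading token is more general than checking for the substring UTF-8, since it also cleans up other charsets that arrive with trailing content.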
Example:
Subject: =?koi8-r?B?9MXIzsnexdPLycUg0sHCz9TZIChFUlAg58HMwcvUycvBIMkg79TexdTZIPTk?=
=?koi8-r?Q?)?=
Problem in splitRawValue function :
Input Parameter: (=?koi8-r?B?9MXIzsnexdPLycUg0sHCz9TZIChFUlAg58HMwcvUycvBIMkg79TexdTZIPTk?=
=?koi8-r?Q?)?=)
Split Pattern: (\.|"|(|)|\s+)
Result Array:
(
[0] => =?koi8-r?B?9MXIzsnexdPLycUg0sHCz9TZIChFUlAg58HMwcvUycvBIMkg79TexdTZIPTk?=
[1] =>
[2] => =?koi8-r?Q?
[3] => )
[4] => ?=
)
I do not know whether it is possible to remove these symbols from the pattern
Hi,
Thank you very much for developing the MailMimeParser. So far it is doing exactly what I need it to do; however, I am having difficulty figuring out how to parse the MESSAGE-ID from an eml file.
Hi,
In file MailMimeParser/src/Stream/PartStreamRegistry.php 'uudecode' is missing.
Details:
ZBateson\MailMimeParser\Stream::attachEncodingFilterToStream();
is
case 'x-uudecode':
stream_filer_
should be
case 'x-uuencode':
case 'uuencode':
stream_filter_
Could you correct this issue?
Hi Zaahid, hope you're well
Title says it all: if I get an email containing an S/MIME pkcs7 signature, setAsMultipartSigned doesn't work, in the sense that the body of the signature gets set yet the Content-Type header keeps pointing to the original signature. This causes the email not to open in Outlook, probably because it gets confused by seeing two pkcs7 signature bodies.
I have added a redacted example.
Kind regards
Meint
hey!
thanks for this package..it works perfectly!
i compared the memory usage and time needed to parse ~10 messages
and saw it uses way more than the mailparse extension.
is there a way to optimize this?
cheers max
nor does getHtmlContent()
message format of multipart-digest is weird, looks like it isn't being recognized as either text/plain or text/html?
I want to parse a simple mail. When I attach an image, $message->getAttachmentPart(0) gives me the correct attachment part and all is fine, but when I do the same with an ics file attachment it gives me no results. What am I doing wrong? I have attached an example of such a file to my question. Sorry for the trouble if this is due to my misunderstanding.
Hi,
I'm wondering if it would be possible to add a feature to retrieve all attachments from any header. I have some emails with the attachment at the inline header instead of attachment.
However, I've updated to the latest version, 0.4.0, to try to fix it using the new method MimePart::getAllPartsByMimeType(), but it doesn't retrieve anything.
Is there any way to solve this problem?
I'll try parsing all the ways an attachment can be sent to me in order to fix it.
Thank you so much for this repository :)
Hi, I have a mail part that has a header
Content-transfer-encoding: Quoted-Printable
but the content is not really encoded as quoted printable. When I do $part->getContent() I get empty content.
How can I fix it? Thanks
Hey again!
I have a problem with images that get malformed.
When I use getHtmlStream(), links and image tags appear like this:
<img src=3D"http://g-ec2.images-amazon.com/images/G/01/e-=
mail/logos/a_de_prime_logo_48.gif" alt=3D"Amazon.de" border=3D"0" hspace=3D=
"6" vspace=3D"6">
Where does the 3D come from? Something with encoding seems to be broken?
any hints? ideas?
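For context, "=3D" is how quoted-printable encodes an equals sign, and a trailing "=" at the end of a line is a soft line break. The output above therefore looks like content that has not been quoted-printable decoded. A minimal illustration with PHP's built-in decoder, using a made-up example URL:

```php
<?php
// Quoted-printable input: "=3D" stands for "=", and "=" at the end of a line
// is a soft line break that joins the line with the next one.
$encoded = "<img src=3D\"http://example.com/e-=\r\nmail/logo.gif\" alt=3D\"Logo\">";

echo quoted_printable_decode($encoded), "\n";
// <img src="http://example.com/e-mail/logo.gif" alt="Logo">
```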
Mail headers sometimes contain the shift_jis charset, as shown below.
Subject: =?shift_jis?B?g1qDfoNJgVuDX4FbirSKb4LFkUmC1IFBg1eDg4NQg2KDZw==?=
=?shift_jis?B?gUWDcIOTg2OBRYNYg0qBW4NngqqTb4/qgV6BdYKrgr+C8YLGgXaCxg==?=
=?shift_jis?B?gXWDiYNOg2CDk4F2gvCBQo90gsySyovOlZ6CyYKogreCt4Lfg0GDQw==?=
=?shift_jis?B?g2WDgA==?=
The charset is registered with IANA, so it conforms to RFC 2047.
I guess that it would be solved by adding '_' to this regex pattern.
Thanks.
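The effect of allowing an underscore can be sketched with a stand-alone pattern. The regex below is an assumption written for illustration and is not the library's actual pattern; it only shows that a charset character class including "_" matches the shift_jis encoded word from the example.

```php
<?php
// Assumed pattern for illustration; the library's actual regex may differ.
// The charset class includes "_" so names like shift_jis are accepted.
$pattern = '/=\?([A-Za-z0-9\-_]+)\?([BbQq])\?([^?]*)\?=/';
$header = '=?shift_jis?B?g2WDgA==?=';

if (preg_match($pattern, $header, $m)) {
    echo $m[1], "\n"; // shift_jis
    echo $m[2], "\n"; // B
}
```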
Hi there,
first of all thank you for this library, its really great and I'm having a lot of fun with it!
I have one functionality that I haven't been able to get fully working which is the getPersonName method in a situation where there are multiple to addresses. I can get a list of all the email to addresses via getHeader('To')->getAddresses() and also via getHeader('To')->getParts() and I can get the first name of the first email address via $message->getHeader('To')->getPersonName(). However how do I get all full names if there is more than one email address?
thanks and kind regards
Meint
php composer.phar require zbateson/mail-mime-parser
- Installing zbateson/mail-mime-parser (0.4.8): Loading from cache
Failed to execute unzip -qq '/Users/evtuhovdo/www/project/vendor/zbateson/mail-mime-parser/0786fef0db1349231721e3580a31f74a' -d '/Users/evtuhovdo/www/project/vendor/composer/4a5fe707'
error: cannot create /Users/evtuhovdo/www/project/vendor/composer/4a5fe707/zbateson-MailMimeParser-b05f32a/tests/_data/emails/files/HasenundFr��sche.txt
Illegal byte sequence
The archive may contain identical file names with different capitalization (which fails on case insensitive filesystems)
Unzip with unzip command failed, falling back to ZipArchive class
If the email contains an attachment of type text/rtf, an unnecessary charset conversion is applied to it.
The problem is here:
PartStreamRegistry.php
private function attachCharsetFilterToStream(MimePart $part, $handle)
{
    $contentType = strtolower($part->getHeaderValue('Content-Type', 'text/plain'));
    if (strpos($contentType, 'text/') === 0) {
Fix:
    if ($contentType === 'text/plain' || $contentType === 'text/html') {
Following on #8 - additional charsets may need to be added. This list of aliases in Python may be handy: http://python.ca/nas/python/sap-25/Lib/encodings/aliases.py