mail-mime-parser's People

Contributors

dependabot[bot], droet, erlangsec, itabrezshaikh, joelharkes, josh-g, jszaszi, krisbuist, kstenschke, mariuszkrzaczkowski, matthiaskuehneellerhold, michalbundyra, nielsvanpach, ondrejmirtes, pableu, peter279k, phpfui, pupaxxo, stefaans, stollr, styks1987, sunmar, thomaslandauer, tivnet, zbateson

mail-mime-parser's Issues

Strip previous mail reply contents!

Zaahid,

The only remaining issue while fetching email body,

$res = $message->getTextStream();  
echo stream_get_contents($res);

The email body also contains the contents of the previously replied mails. Do you know how to remove them? I just need the final mail contents in the thread.

Thanks!

setter functions

Hi Zaahid,

Right now I'm using your library in my mail processing code, I'm very happy with it and I enjoy using it, thank you for releasing this.

I would be interested to learn whether you are considering also implementing setter functions. Right now I read the email, parse it with MailMimeParser, and then create a new MIME object via the SwiftMailer API, into which I copy/construct the elements that I have read/parsed via MailMimeParser. This works but doesn't feel elegant, so if there's any inclination to provide setter functions, that would be great news for me :-)

Thanks again for your great work and kind regards
Meint

Feature request: ability to name digital signature part

Bit of a nice-to-have: it would be neat to be able to name the digital signature MIME part. Right now it's something like "Nameless attachment 00019.dat", which can lead to customer confusion (what is this attachment, is it a virus?).

Getting the raw part from a multipart/signed email

I'm trying to build a webmail client using this library that supports PGP messages. The clearsigned emails are easily implemented but I'm having a problem with multipart/signed messages.

To verify the signature I need to get the signature part and the raw first part of the message (including child parts and headers) as a string. I cannot find a method to do this in the library. Am I missing something?

Incorrect Html content parsing

Hello, my friend.
First of all, let me thank you for your awesome lib. It's a really good alternative to those that use the mailparse extension.
I found some strange behaviour when parsing simple mail content.
When I just echo the HTML content, it differs from that in the email.
$email = file_get_contents($_SERVER['DOCUMENT_ROOT'].'/mail_debug_content.txt');
$mailParser = new ZBateson\MailMimeParser\MailMimeParser();
$Message = $mailParser->parse($email);
$html = $Message->getHtmlContent();
$atts = $Message->getAllAttachmentParts();
echo $html;
If this is due to my misunderstanding of what's going on, please advise where to look.

Thanks a lot for your help!

mail_debug_content.txt

removeTextPart() issue?

Hi Zaahid,

I have a text/plain message where I retrieve the text content with getTextContent(). I convert this to an HTML representation and add it to the message via setHtmlPart(). I subsequently remove the text part via removeTextPart(). This last action causes duplication of the HTML content, i.e. the same HTML message is now present twice in the same MIME part. This doesn't happen when the text/plain message has an attachment.

$message = parseInput(EMAIL);
$text = $message->getTextContent();
$body = '<pre>' . $text . '</pre>';
$content = renderTemplate(TEMPLATE_OUTGOING, ['body' => $body, 'tracker_url' => $trackerUrl]);
$message->setHtmlPart($content, 'utf8');
$message->removeTextPart();

Calling $message->removeTextPart(); earlier in the process has no effect: the text part remains in the e-mail body.

Kind regards
Meint

set default charset?

Some mails are not in English and have no charset in the mail header, so the charset cannot be detected correctly. Can I set a user default charset for all mails?

I found this part in "Stream/CharsetStreamFilter.php". Can someone show me how to change $charset to a user-defined variable?

public function onCreate()
{
    $charset = 'ISO-8859-1';
    $to = 'UTF-8';
    if (!empty($this->params['charset'])) {
        $charset = $this->params['charset'];
    }
    if (!empty($this->params['to'])) {
        $to = $this->params['to'];
    }

    $di = SimpleDi::singleton();
    $this->converter = $di->newCharsetConverter($charset, $to);
}

thx
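
For illustration, the fallback idea asked about above can be sketched outside the library. The function convertToUtf8() and its $defaultCharset parameter are hypothetical names, not part of mail-mime-parser, and the sketch assumes the mbstring extension is available:

```php
<?php
// Hypothetical helper, not part of mail-mime-parser: use the charset declared
// by the mail when there is one, otherwise a caller-supplied default.
function convertToUtf8(string $bytes, ?string $declaredCharset, string $defaultCharset = 'ISO-8859-1'): string
{
    $charset = ($declaredCharset !== null && $declaredCharset !== '')
        ? $declaredCharset
        : $defaultCharset;
    return mb_convert_encoding($bytes, 'UTF-8', $charset);
}

// No charset declared: the byte 0xC0 is read as Windows-1251 (Cyrillic "А").
echo convertToUtf8("\xC0", null, 'Windows-1251'), PHP_EOL;
```

Whether the library exposes a hook for such a default is not stated in this issue; the sketch only shows the selection logic.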

Attachment saving to specific Directory

Hi zbateson,

I just tried this library and everything works well. I really appreciate your effort and time!

I would like to know more about attachment handling. In short, how can I save all attachments to a specific directory on my server?
Here is where I am now:
$att = $message->getAttachmentPart(0);
$atts = $message->getAllAttachmentParts();

Could you please help me with this? It would be very helpful, I have already spent a lot of time trying to figure it out :(

Again Thanks for this great library!
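
A minimal sketch of the file-writing half of this question, using only built-in PHP functions. saveAttachmentBytes() is a hypothetical helper, and how the filename and content are read from each attachment part depends on the library version, so that step is deliberately left out:

```php
<?php
// Hypothetical helper: writes one attachment's bytes into $dir and returns the path.
// Obtaining $filename and $content from an attachment part is not shown here.
function saveAttachmentBytes(string $dir, string $filename, string $content): string
{
    if (!is_dir($dir) && !mkdir($dir, 0775, true)) {
        throw new RuntimeException("Cannot create directory: $dir");
    }
    // Drop any path components so a crafted filename cannot escape $dir.
    $safeName = basename(str_replace('\\', '/', $filename));
    if ($safeName === '' || $safeName === '.' || $safeName === '..') {
        $safeName = 'attachment.dat';
    }
    $path = rtrim($dir, '/') . '/' . $safeName;
    if (file_put_contents($path, $content) === false) {
        throw new RuntimeException("Cannot write file: $path");
    }
    return $path;
}
```

Calling this once per part inside a loop over $message->getAllAttachmentParts() would cover the "all attachments" case, assuming each part can supply its name and decoded content.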

get raw email without attachments

First, thanks for writing the library. Highly useful without having to install mailparse.

I do have a question regarding a feature, if it is available. Is there an option to return the raw email (headers + body) with the MIME parts intact but without attachments? I am looking at feeding this to a spam filter without attachments. If I build a test message using the included features, SpamAssassin is very unhappy about the missing MIME boundaries.

Duplicate Header Clobbering

I have emails with duplicate header names. When I call the "getHeader" method on these names it returns the last entry only. It would be desirable if it would return an array in this case.
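
To make the requested behaviour concrete, here is a rough sketch that groups raw header lines by name and keeps every value. groupHeaders() is a hypothetical function working on the raw header text, not the library's API, and it does not decode encoded words:

```php
<?php
// Sketch only: groups raw header lines by lower-cased name, keeping every value.
function groupHeaders(string $rawHeaders): array
{
    // Unfold continuation lines (a line break followed by a space or tab).
    $unfolded = preg_replace('/\r?\n[ \t]+/', ' ', $rawHeaders);
    $grouped = [];
    foreach (preg_split('/\r?\n/', $unfolded) as $line) {
        if (strpos($line, ':') === false) {
            continue; // skip blank or malformed lines
        }
        [$name, $value] = explode(':', $line, 2);
        $grouped[strtolower(trim($name))][] = trim($value);
    }
    return $grouped;
}

$headers = "Received: from a.example\r\nReceived: from b.example\r\nSubject: Hi\r\n";
print_r(groupHeaders($headers)['received']);
```

Here both Received values survive in order, instead of only the last one.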

Mime - setRawHeader changed header string given

Hi, I have an issue I would appreciate any help with.

When I set a custom header, for example: $mime->setRawHeader("X-HEADER", "value");
it is saved in the raw MIME as follows: X--HEADER: value. Another "-" is added to the header,
and a further "-" is added on every run.
For example, after the second run we see 2 headers:

  1. X--HEADER: value
  2. X---HEADER: value

and so forth.

The question is why?
And how can I avoid this?

Thanks in advance for the help.

Fantastic library, only issue I have is getting attachments

Hi Zaahid, I have had a look at the other issues regarding attachments, such as #15, but I cannot seem to get the attachments working as one would expect. My initial thought is that I may be doing something wrong; my second thought is that my testing method may be the cause.

How I am testing: I send an email from Outlook to AWS SES, which stores the email as an S3 object. The parser then retrieves the S3 object, parses it and displays the required content.

I have tried attaching some images in Outlook and then sending them as normal. Everything works perfectly apart from the attachment part.

I am using Laravel as the PHP framework, and when I inspect the S3 object I can see that the image I sent is base64 encoded within the file. Here is the header:

--005_AM4PR0301MB1908EBED5C3BCD19A66F77D8C53E0AM4PR0301MB1908
Content-Type: image/png; name="2 bed floor plan (1).png"
Content-Description: 2 bed floor plan (1).png
Content-Disposition: attachment; filename="2 bed floor plan (1).png"; size=82879; creation-date="Fri, 24 Mar 2017 10:36:55 GMT"; modification-date="Fri, 24 Mar 2017 10:36:55 GMT"
Content-Transfer-Encoding: base64

VBORw0KGgoAAAANSUhE................

I have tried to use a foreach loop as suggested in #15, but no matter what I do I can't seem to get the content displayed or even save the attachment to a directory.

I am able to save the full object and all other content such as from, to, subject, html, text, etc., just not the attachment(s)...

Any help would be greatly appreciated, Zaahid, and I would also be more than happy to make a donation for the excellent work you have put into the parser. Maybe a donation button on the summary page would allow me and other appreciative people to show our appreciation :)

I would greatly appreciate it if you could help me out on this. If I use $message->getAttachmentCount(), it returns the correct count; in this case the count is 2,

but if I use $message->getAllAttachmentParts() I just get an empty array of 0 and 1, and if I call $message->getAttachmentPart(0) I don't get the content either.

I have attached a text file of print_r($message->getAttachmentPart(0)) to give you an idea of what I mean. It's quite a large file, so I apologise. My guess is you will probably know exactly where to look :)

print_r.txt

Thanks, I really appreciate the work you have done :)

addAttachmentPart() on multipart/alternative results in empty multipart

Thanks for this library, it is extremely well-written. I have a slight issue and I'm not clear if I'm using it incorrectly, however:

I have a Message of type multipart/alternative containing two parts, text/html and text/plain. When I call addAttachmentPart(), this calls setMessageAsMixed() which turns the message into multipart/mixed and adds the multipart/alternative as a child.

This results in a multipart/mixed containing the text/html, text/plain, an empty multipart/alternative and the attachment part.

My question is: should the textPart and htmlPart not be moved to be children of the multipart/mixed?

Tests failing on latest HHVM

Might be a fix aligning hhvm to php for base64 decoding -- need to check which version breaks in hhvm and fix appropriately.

Support target encoding other than UTF-8

@tivnet suggests making the target encoding support charsets other than UTF-8.

@tivnet -- could you explain what the use case is? Sorry, my experience with charsets is somewhat limited; UTF-8 has always been fine for my needs.

Thank you

Bug when Multipart/Alternative is not lowercase.

I came across a message (X-Mailer: IncrediMail) where content types were not in lowercase. This still works for most parts, but Multipart/Alternative causes getTextContent() to fail.

I'm unable to do a pull request currently, but it can be fixed by adding strtolower() in line 121 of Message.php:

private function addToAlternativeContentPartFromParsed(MimePart $part)
{
    $partType = strtolower($this->contentPart->getHeaderValue('Content-Type'));

Allow parsing mail directly from string

Background: I have emails that I fetch from a database rather than from files.

Problem: It is not currently possible to parse a mail directly from a string (as far as I can see from both the code and the Usage Guide).

Solution: Add a #parseString method to the MailMimeParser:

$mailParser = new MailMimeParser();
$message = $mailParser->parseString($mailAsString);

Appendix: Ideally the Message could also have a static convenience method #fromString (*):

$message = Message::fromString($mailAsString);

(*) = It could simply instantiate a MailMimeParser for me, as I don't need that afterwards anyway.
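
Until something like that exists, one possible workaround is to wrap the string in an in-memory stream. This is a sketch using only built-in PHP functions; stringToStream() is a hypothetical name, and it assumes parse() accepts a stream resource, which this issue implies but does not state:

```php
<?php
// Sketch: wrap a string in an in-memory stream so it can be handed to any API
// that expects a resource handle.
function stringToStream(string $data)
{
    $handle = fopen('php://temp', 'r+');
    fwrite($handle, $data);
    rewind($handle); // reading must start at the beginning
    return $handle;
}

$handle = stringToStream("Subject: Test\r\n\r\nBody");
echo stream_get_contents($handle);
fclose($handle);
```

php://temp keeps small payloads in memory and spills to a temporary file for larger ones, so the same helper works for big mails.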

Get last message

Is it possible to get only the original message, without replies and forwards?
For example, if I have in the body:

some message

On Tue, Mar 6, 2018 at 7:33 PM, Yurii Ploskyi [email protected]
wrote:

asdfasdfasdf

On Tue, Mar 6, 2018 at 7:33 PM, Yurii Ploskyi [email protected]
wrote:

asdfasdfasdf

On Tue, Mar 6, 2018 at 7:33 PM, Yurii Ploskyi [email protected]
wrote:

asdf

On Tue, Mar 6, 2018 at 7:32 PM, Yurii Ploskyi [email protected]
wrote:

sdfg

On Tue, Mar 6, 2018 at 7:32 PM, <venue@sandbox7fa88b3af93c4130
8174561625a2cc98.mailgun.org> wrote:

Isn't it fun getting mail?!
You've got a message in your inbox.

"asd..."

I want to parse only "some message".
I tried to find such a function, but none of the methods returns this.
Thanks in advance
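
For illustration only, a heuristic that cuts the body at the first "On ... wrote:" attribution, as in the example above. stripQuotedReplies() is a hypothetical helper, not a library method, and the pattern only covers this one English attribution style:

```php
<?php
// Heuristic sketch: keep only the text before the first "On <date>, <someone> wrote:"
// line. The attribution may wrap onto a second line, as in the example above.
function stripQuotedReplies(string $body): string
{
    $parts = preg_split('/^On\b[^\n]*(?:\n[^\n]*)?\bwrote:\s*$/m', $body, 2);
    return rtrim($parts[0]);
}

$body = "some message\n\nOn Tue, Mar 6, 2018 at 7:33 PM, Someone\nwrote:\n\nasdfasdfasdf\n";
echo stripQuotedReplies($body), PHP_EOL;
```

Mail clients and locales format attributions differently, so a real solution would need more patterns than this single one.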

Error in base64 decoding

Decoding errors sometimes happen with attachments.
The problem (when $this->leftover = '0') is here, in Base64DecodeStreamFilter.php:
if (!empty($this->leftover)) {
Fix:
if (strlen($this->leftover)) {
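
The difference between the two checks can be seen in isolation. This snippet uses a plain local variable rather than the filter class:

```php
<?php
// empty() treats the string "0" as empty, so a lone "0" left over between
// chunks would be skipped; strlen() only reports whether any bytes exist.
$leftover = '0';
var_dump(empty($leftover));      // bool(true)
var_dump(strlen($leftover) > 0); // bool(true)
```

So with !empty() a leftover of '0' is treated as if nothing were left, while the strlen() check still sees one byte.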

Can't get body from simplest email

Hi Zaahid,

I have another Apple case for you :-) (well, it's actually a bit more generic, but this test case is generated by Mail on iPad)

Date: Fri, 10 Jul 2016 15:29:52 GMT
From: "Test Sender" <[email protected]>
To: "Test Recipient" <[email protected]>
Message-ID: <[email protected]>
Subject: [36]-text-plain-ipad
MIME-Version: 1.0
X-Mailer: iPad Mail (12H143)

Test

The body isn't recognized by any of the existing functions (getTextPart, getHtmlPart, getTextStream, getHtmlStream, getTextContent and getHtmlContent), I'm assuming this happens because of the lack of content-type for this simplest of email message formats?

However the message is a valid message.

The only function that triggers is getAllParts() but that just reflects the entire message, including headers.

Kind regards
Meint

Charset converter

Hi,
From time to time I get an error about iconv() in CharsetConverter.php.
The error looks like:
PHP Fatal error:  Uncaught Exception: Error writing to storage ErrorException: [ERR] iconv(): Wrong charset, conversion from `UTF-8^M
 <!DOCTYPE HTML>' to `UTF-8//TRANSLIT//IGNORE' is not allowed in /app/smtpd.walla.co.il/vendor/zbateson/mail-mime-parser/src/Stream/Helper/CharsetConverter.php on line 335 in /app/smtpd.walla.co.il/app/bootstrap.php:29
Stack trace:
#0 [internal function]: errHandle(8, 'iconv(): Wrong ...', '/app/smtpd.wall...', 335, Array)
#1 /app/smtpd.walla.co.il/vendor/zbateson/mail-mime-parser/src/Stream/Helper/CharsetConverter.php(335): iconv('UTF-8\r\n <!DOCTY...', 'UTF-8//TRANSLIT...', '\r\n<html lang="h...')
#2 /app/smtpd.walla.co.il/vendor/zbateson/mail-mime-parser/src/Stream/CharsetStreamFilter.php(48): ZBateson\MailMimeParser\Stream\Helper\CharsetConverter->convert('\r\n<html lang="h...')
#3 [internal function]: ZBateson\MailMimeParser\Stream\CharsetStreamFilter->filter(Resource id #23210, Resource id #23211, NULL, false)
#4 /app/smtpd.walla.co.il/vendor/zbateson/mail-mime-parser/src/Message/Writer/Mim in /app/smtpd.walla.co.il/src/Worker/Store.php on line 45

This is probably because of parsing issues. What can I do to fix it?
I think it would be best to add validation in the findSupportedCharset function that checks whether the string contains UTF-8 and, if so, sets the charset to that.
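
A slightly more general form of that validation could be sketched as follows. cleanCharsetName() is a hypothetical helper, not the library's findSupportedCharset, and the fallback value is an assumption:

```php
<?php
// Hypothetical helper: keep only the leading run of characters that can appear
// in a charset name, discarding anything after it (such as a CRLF and HTML).
function cleanCharsetName(string $raw, string $fallback = 'ISO-8859-1'): string
{
    if (preg_match('/^\s*([A-Za-z0-9][A-Za-z0-9_.:\-]*)/', $raw, $m)) {
        return strtoupper($m[1]);
    }
    return $fallback;
}

echo cleanCharsetName("UTF-8\r\n <!DOCTYPE HTML>"), PHP_EOL; // UTF-8
```

This trims the value seen in the stack trace down to UTF-8 before it would reach iconv().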

Parsing is incorrect when the last character is '"' or ')' and is on a new line

Example:
Subject: =?koi8-r?B?9MXIzsnexdPLycUg0sHCz9TZIChFUlAg58HMwcvUycvBIMkg79TexdTZIPTk?=
=?koi8-r?Q?)?=

Problem in splitRawValue function :
Input Parameter: (=?koi8-r?B?9MXIzsnexdPLycUg0sHCz9TZIChFUlAg58HMwcvUycvBIMkg79TexdTZIPTk?=
=?koi8-r?Q?)?=)
Split Pattern: (\.|"|(|)|\s+)

Result Array:
(
[0] => =?koi8-r?B?9MXIzsnexdPLycUg0sHCz9TZIChFUlAg58HMwcvUycvBIMkg79TexdTZIPTk?=
[1] =>

[2] => =?koi8-r?Q?
[3] => )
[4] => ?=

)

I do not know whether it is possible to remove these symbols from the pattern

Message-ID

Hi,

Thank you very much for developing the MailMimeParser. So far it is doing exactly what I need it to do. However, I am having difficulty figuring out how to parse the Message-ID from an .eml file.

lack of uuencode

Hi,
In file MailMimeParser/src/Stream/PartStreamRegistry.php 'uudecode' is missing.
Details:
ZBateson\MailMimeParser\Stream::attachEncodingFilterToStream();
is

case 'x-uudecode':
         stream_filter_

should be

 case 'x-uuencode':
 case 'uuencode':
            stream_filter_

Could you correct this issue?

setAsMultipartSigned doesn't work properly if there is an existing s/mime signature in the email

Hi Zaahid, hope you're well
Title says it all: if I get an email containing an S/MIME pkcs7 signature, setAsMultipartSigned doesn't work, in the sense that the body of the signature gets set yet the Content-Type header keeps pointing to the original signature. This causes the email not to open in Outlook, probably because it gets confused by seeing two pkcs7 signature bodies.

I have added a redacted example.

example.txt

Kind regards
Meint

Performance / Memory Usage

hey!
thanks for this package..it works perfectly!

i compared the memory usage and time needed to parse ~10 messages
and saw it uses way more than the mailparse extension.

is there a way to optimize this?

cheers max

Attachment parts don't want to show up

I want to parse a simple mail. When I attach an image, $message->getAttachmentPart(0) gives me the correct attachment part and all is fine, but when I do the same with an ics file attachment, it gives me no results. What am I doing wrong? I have attached an example of such a file to my question. Sorry for bothering you if this is due to my misunderstanding.

mail_body.txt

Retrieve all attachments from any header

Hi,

I'm wondering if it would be possible to add a feature to retrieve all attachments regardless of their disposition header. I have some emails where the attachment is marked as inline instead of attachment.

I've updated to the latest version, 0.4.0, to try to fix it using the new method MimePart::getAllPartsByMimeType(), but it doesn't retrieve anything...

Is there any way to solve this problem?

I'll try parsing all the ways an attachment can be sent to me in order to fix it...

Thank you so much for this repository :)

Images won't appear in Message.

Hey again!
I have a problem with images that get malformed.

When I use getHtmlStream(), links and image tags appear like this:

    <img src=3D"http://g-ec2.images-amazon.com/images/G/01/e-=
mail/logos/a_de_prime_logo_48.gif" alt=3D"Amazon.de" border=3D"0" hspace=3D=
"6" vspace=3D"6">

Where does the 3D come from? Is something broken with the encoding?

Any hints? Ideas?
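
For context, "=3D" is the quoted-printable escape for "=", and a "=" at the end of a line is a soft line break, so the output above looks like content that has not been decoded from quoted-printable. A small demonstration with PHP's built-in decoder, using a made-up URL:

```php
<?php
// "=3D" decodes to "=", and "=" followed by a line break joins the two lines.
$encoded = "<img src=3D\"http://example.com/e-=\r\nmail/logo.gif\" alt=3D\"Logo\">";
echo quoted_printable_decode($encoded), PHP_EOL;
```

This prints the tag with plain "=" signs and the URL rejoined. Why the stream was not decoded in this case is not answered by the sketch.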

Parse incorrect when charset contains shift_jis

Mail headers sometimes contain the shift_jis charset, as shown below.

Subject: =?shift_jis?B?g1qDfoNJgVuDX4FbirSKb4LFkUmC1IFBg1eDg4NQg2KDZw==?=
 =?shift_jis?B?gUWDcIOTg2OBRYNYg0qBW4NngqqTb4/qgV6BdYKrgr+C8YLGgXaCxg==?=
 =?shift_jis?B?gXWDiYNOg2CDk4F2gvCBQo90gsySyovOlZ6CyYKogreCt4Lfg0GDQw==?=
 =?shift_jis?B?g2WDgA==?=

The charset is registered with IANA, so it conforms to RFC 2047.

I guess this could be solved by adding '_' to this regex pattern.

Thanks.
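
As a rough illustration of that suggestion, here is an encoded-word pattern whose charset token allows "_". The pattern and decodeEncodedWords() are illustrative, not the library's own, and the sketch assumes the mbstring extension is available:

```php
<?php
// Sketch: decode RFC 2047 encoded words, allowing "_" in the charset name
// (as in shift_jis).
function decodeEncodedWords(string $value): string
{
    // Whitespace between two adjacent encoded words is not part of the text.
    $value = preg_replace('/(\?=)\s+(=\?)/', '$1$2', $value);
    return preg_replace_callback(
        '/=\?([A-Za-z0-9_\-]+)\?([BbQq])\?([^?]*)\?=/',
        function (array $m): string {
            $bytes = strtoupper($m[2]) === 'B'
                ? base64_decode($m[3])
                : quoted_printable_decode(str_replace('_', ' ', $m[3]));
            return mb_convert_encoding($bytes, 'UTF-8', $m[1]);
        },
        $value
    );
}

// 0x82 0xA0 is the hiragana "あ" in Shift_JIS; "gqA=" is its base64 form.
echo decodeEncodedWords('=?shift_jis?B?gqA=?='), PHP_EOL;
```

Without "_" in the charset character class, the pattern would not match the shift_jis words at all and they would be left undecoded.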

How to get all personnames?

Hi there,
First of all, thank you for this library, it's really great and I'm having a lot of fun with it!

I have one functionality that I haven't been able to get fully working which is the getPersonName method in a situation where there are multiple to addresses. I can get a list of all the email to addresses via getHeader('To')->getAddresses() and also via getHeader('To')->getParts() and I can get the first name of the first email address via $message->getHeader('To')->getPersonName(). However how do I get all full names if there is more than one email address?

thanks and kind regards
Meint

Illegal byte sequence while install

php composer.phar require zbateson/mail-mime-parser

- Installing zbateson/mail-mime-parser (0.4.8): Loading from cache
    Failed to execute unzip -qq  '/Users/evtuhovdo/www/project/vendor/zbateson/mail-mime-parser/0786fef0db1349231721e3580a31f74a' -d '/Users/evtuhovdo/www/project/vendor/composer/4a5fe707'

error:  cannot create /Users/evtuhovdo/www/project/vendor/composer/4a5fe707/zbateson-MailMimeParser-b05f32a/tests/_data/emails/files/HasenundFr��sche.txt
        Illegal byte sequence

    The archive may contain identical file names with different capitalization (which fails on case insensitive filesystems)
    Unzip with unzip command failed, falling back to ZipArchive class

text/rtf

If the email contains an attachment of type text/rtf, an unnecessary charset conversion is applied.
The problem is here:
PartStreamRegistry.php
private function attachCharsetFilterToStream(MimePart $part, $handle)
{
    $contentType = strtolower($part->getHeaderValue('Content-Type', 'text/plain'));
    if (strpos($contentType, 'text/') === 0) {
Fix:
if ($contentType === 'text/plain' || $contentType === 'text/html') {
