mailgun / flanker
Python email address and MIME parsing library
Home Page: http://www.mailgun.com
License: Apache License 2.0
Recent releases appear not to be tagged in git. This makes it hard to figure out if a certain change is part of a release. For example I'm trying to figure out why #21 is not included in the current release.
The build is not running with Python 3.
Has anyone ever tried running the tests with Python 3? Is there a reason why it's not supported yet?
When mail_exchanger_lookup() fails to connect to the mail exchanger for a specific domain, it sets the corresponding cache entry to False (line 149). However, lookup_exchanger_in_cache() looks for the string 'False' to figure out if there was an MX connection failure for a domain in cache.
This asymmetry means that caches that implement proper dict semantics (i.e. not coercing all values to strings, unlike the Redis driver) fail miserably. I believe that a simple in-memory cache like defaultdict(lambda: None) should just work and not fail with TypeError. This behaviour is also the root cause of #31.
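The mismatch described above can be reproduced without Flanker. The sketch below is only an illustration: both helper functions are hypothetical stand-ins, not Flanker's actual implementation, and the tolerant variant is one possible fix rather than the project's chosen one.

```python
from collections import defaultdict


def lookup_failed_strict(cache, domain):
    # Hypothetical stand-in for the reported behaviour: only the *string*
    # 'False' is recognised as a recorded MX connection failure.
    return cache[domain] == 'False'


def lookup_failed_tolerant(cache, domain):
    # Hypothetical fix: accept both the boolean and its string form, so a
    # plain dict-like cache and a string-coercing cache behave the same.
    value = cache[domain]
    return value is False or value == 'False'


# A simple in-memory cache that keeps values as real Python objects.
cache = defaultdict(lambda: None)
cache['example.com'] = False  # what the failing lookup is said to store

strict_result = lookup_failed_strict(cache, 'example.com')
tolerant_result = lookup_failed_tolerant(cache, 'example.com')
```

With a cache that preserves types, the strict check misses the failure that was just recorded, while the tolerant check sees it.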
The email module has a convenient replace_header method. I don't see a clear way to do the same with Flanker.
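For reference, this is the standard-library behaviour the request refers to. The example uses only Python's email package; it does not show a Flanker API, and the header values are made up for illustration.

```python
from email.message import Message

msg = Message()
msg['Subject'] = 'old subject'

# replace_header() overwrites the first matching header in place and keeps
# its position; it raises KeyError if the header does not exist.
msg.replace_header('Subject', 'new subject')

subjects = msg.get_all('Subject')
```

Assigning msg['Subject'] a second time would instead append another Subject header, which is why replace_header is convenient.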
I get the following when installing on Ubuntu:
Package libffi was not found in the pkg-config search path.
Perhaps you should add the directory containing `libffi.pc'
to the PKG_CONFIG_PATH environment variable
No package 'libffi' found
compilation terminated.
error: Setup script exited with error: command 'x86_64-linux-gnu-gcc' failed with exit status 1
Hi
I have a use case where I would just like to ensure an email has a valid format. I would love to see some additional separation in the solution.
For this purpose this module has heavy dependencies: chardet dnsq expiringdict mock nose WebOb redis regex dnspython
Aside from my, perhaps slim, use-case there is also the matter of maybe separating mime-parsing and email address validation. I really don't foresee using these two areas in the same parts of my system.
I understand if these aren't relevant, but since your module seems good I figured I'd chip in my thoughts for my use-case.
This import is not working:
from flanker.addresslib import address
Traceback (most recent call last):
File "/home/jabur/PycharmProjects/scripts/tmp/flanker.py", line 1, in <module>
from flanker.addresslib import address
File "/home/jabur/PycharmProjects/scripts/tmp/flanker.py", line 1, in <module>
from flanker.addresslib import address
ImportError: No module named addresslib
Hey there, excellent work on the flanker library. I'm looking to use it in a Docker environment where I do not have a local Redis cache. I would like to submit a PR that adds configurable Redis tuning for RedisCache.
My thought was to make it configurable via the environment variables REDIS_HOST, REDIS_PORT, and REDIS_DB, with the defaults matching the current behavior. If I submitted this, what are the chances of it being included in a (near) future release?
If you do not like this approach, how would you like me to approach the PR?
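A minimal sketch of the proposal, assuming the three variable names from the post. The helper name and the fallback values (localhost, 6379, database 0) are assumptions chosen for illustration; the post only says the defaults should match the current behavior, which is not shown here.

```python
import os


def redis_settings_from_env(environ=None):
    """Read Redis connection settings from environment variables.

    Hypothetical helper: the fallback values below are assumed, not taken
    from Flanker's source.
    """
    environ = os.environ if environ is None else environ
    return {
        'host': environ.get('REDIS_HOST', 'localhost'),
        'port': int(environ.get('REDIS_PORT', '6379')),
        'db': int(environ.get('REDIS_DB', '0')),
    }


# Example: only the host is overridden, the rest falls back to defaults.
settings = redis_settings_from_env({'REDIS_HOST': 'redis.internal'})
```

The resulting dict could then be passed as keyword arguments to whatever client the cache driver constructs.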
Relevant here: https://github.com/mailgun/flanker/blob/master/flanker/mime/message/headers/wrappers.py#L185
Other languages deviate from the mostly standard Re: prefix. For example, some Outlook clients in the German locale will use "AW" (reply) and "WG" (forwarded).
A more complete list: http://en.wikipedia.org/wiki/List_of_email_subject_abbreviations#Abbreviations_in_other_languages
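One way to handle localized prefixes is a single case-insensitive pattern. The sketch below is an assumption about how such stripping could look, not the code in wrappers.py; it covers only Re, Fw/Fwd and the two German abbreviations mentioned above, and the function name is hypothetical.

```python
import re

# One or more "prefix:" groups at the start of the subject.
_PREFIX_RE = re.compile(r'^\s*(?:(?:re|fwd?|aw|wg)\s*:\s*)+', re.IGNORECASE)


def strip_reply_prefixes(subject):
    """Remove leading reply/forward markers such as 'Re:' or 'AW: WG:'."""
    return _PREFIX_RE.sub('', subject)


cleaned = strip_reply_prefixes('AW: WG: Angebot')
```

Further abbreviations from the linked list could be added to the alternation, at the cost of more false positives on short words.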
Maybe it's a bug...
[email protected] - It's a valid email, but the API returns it as invalid.
I've been using Flanker for a while with no issues, but I just got a new feed and I'm struggling to fix my problem.
Flanker finds the attachment type with this: msg.parts[1] and then it strips and decodes the attachment with this: msg.parts[1].body
The problem is, the attachments from my new feed aren't in those sections. I get IndexError: List out of range
I don't know how to look anywhere else. If I do msg.parts[2], that doesn't work. Any help?
I'm basically doing this:
message_string = sys.stdin.read()
msg = mime.from_string(message_string)
msg.headers.items()
print msg.parts[1]
print msg.parts[1].body
on this email: http://pastebin.com/A3EXGBTB
But can't get it to work.
I am getting errors for:
if msg.content_type.is_multipart():
    for part in msg.parts:
        print 'Content-Type: {} Body: {}'.format(part, part.body)
This is the error:
UnicodeEncodeError: 'ascii' codec can't encode character u'\u017e' in position 159: ordinal not in range(128)
It looks like your library only works with US (ASCII) characters. That sucks.
Here are a few of the characters as they appear in the raw email: ka=BEem, prili=E8no.
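The raw fragments quoted above are quoted-printable. The sketch below decodes them with the standard library, assuming the message charset is ISO-8859-2; that charset is an inference from the u'\u017e' in the error message (byte 0xBE maps to that character in ISO-8859-2), not something stated in the report. It then shows that the decoded text cannot be encoded as ASCII, which is the operation the traceback complains about.

```python
import quopri

raw = b'ka=BEem, prili=E8no'

# Undo the quoted-printable transfer encoding, then decode the bytes using
# the assumed charset.
text = quopri.decodestring(raw).decode('iso-8859-2')

# Encoding the decoded text as ASCII fails, mirroring the reported error.
try:
    text.encode('ascii')
    ascii_ok = True
except UnicodeEncodeError:
    ascii_ok = False
```

This suggests the body was decoded correctly and the failure happens later, when the unicode text is implicitly converted back to a byte string for printing.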
Given a MIME header as follows:
Content-Type: multipart/alternative; boundary=decafbad
Doing a round-trip parse-then-serialize using Flanker will result in the header being represented as:
Content-Type: multipart/alternative; boundary="decafbad"
This is due to flanker.mime.messages.headers.encoding.encode_param calling email.message._formatparam without explicitly specifying the optional quote argument (which defaults to True).
>>> email.message._formatparam('boundary', 'hello')
'boundary="hello"'
>>> email.message._formatparam('boundary', 'hello', quote=False)
'boundary=hello'
>>> email.message._formatparam('boundary', 'hel<lo')
'boundary="hel<lo"'
>>> email.message._formatparam('boundary', 'hel<lo', quote=False)
'boundary="hel<lo"'
As seen above, if Flanker passes quote=False when calling email.message._formatparam, the parameter value would only be quoted if the value contains special characters.
AFAICT, header parameters can be always double-quoted (even if the value is a legal "token", and contains no special characters -- as the example above shows.) So, this is arguably not a bug.
However, I have found (empirically) that Gmail tends to generate hex boundary values and does not double-quote them in the Content-Type header. This causes verification of the DKIM signature (which covers the Content-Type header) to fail on messages that have gone through this roundtrip conversion.
We have found this to be an issue when using the Mailgun API to retrieve inbound messages with the Accept: message/rfc2822 header. We found that the message returned has an incorrectly re-encoded boundary parameter in the Content-Type header. This in turn causes DKIM verification to fail. I'm mentioning this because we think that Mailgun is using Flanker internally.
Obviously, the "fix" above could break in the opposite scenario, where the original header had a double-quoted boundary parameter value even though it contains no special characters. Unfortunately, I do not have evidence whether such messages are out there in the wild and, if so, how frequent they are.
I also realise that there are other ways that DKIM verification could fail when doing such roundtrip conversions. However, at least for one major email provider (Gmail), fixing this header value encoding takes care of the issue in our testing.
When a header fails parsing, if there is a character that cannot be decoded to ASCII, it will cause logging to fail.
Traceback (most recent call last):
File "/app/src/flanker/flanker/mime/message/headers/encodedword.py", line 79, in mime_to_unicode
b64encode(header)))
File "/usr/lib/python2.7/base64.py", line 53, in b64encode
encoded = binascii.b2a_base64(s)[:-1]
UnicodeEncodeError: 'ascii' codec can't encode character u'\xea' in position 11: ordinal not in range(128)
FYI. :)
→ sudo pip install flanker
[…]
442 warnings generated.
cc -bundle -undefined dynamic_lookup -arch x86_64 -arch i386 -Wl,-F. build/temp.macosx-10.9-intel-2.7/Python2/_regex.o -o build/lib.macosx-10.9-intel-2.7/_regex.so
Running setup.py install for dnspython
Successfully installed flanker chardet dnsq expiringdict mock nose Paste redis regex dnspython
Cleaning up...
Then on the CLI
→ python
Python 2.7.5 (default, Aug 25 2013, 00:04:04)
[GCC 4.2.1 Compatible Apple LLVM 5.0 (clang-500.0.68)] on darwin
Type "help", "copyright", "credits" or "license" for more information.
>>> from flanker.addresslib import address
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
File "/Library/Python/2.7/site-packages/flanker/addresslib/address.py", line 38, in <module>
import flanker.addresslib.parser
File "/Library/Python/2.7/site-packages/flanker/addresslib/parser.py", line 79, in <module>
from flanker.mime.message.headers.encoding import encode_string
File "/Library/Python/2.7/site-packages/flanker/mime/__init__.py", line 61, in <module>
from flanker.mime.message.errors import DecodingError, EncodingError, MimeError
File "/Library/Python/2.7/site-packages/flanker/mime/message/__init__.py", line 1, in <module>
from flanker.mime.message.scanner import ContentType
File "/Library/Python/2.7/site-packages/flanker/mime/message/scanner.py", line 4, in <module>
from flanker.mime.message.headers import parsing, is_empty, ContentType
File "/Library/Python/2.7/site-packages/flanker/mime/message/headers/__init__.py", line 1, in <module>
from flanker.mime.message.headers.headers import MimeHeaders
File "/Library/Python/2.7/site-packages/flanker/mime/message/headers/headers.py", line 1, in <module>
from paste.util.multidict import MultiDict
ImportError: No module named paste.util.multidict
See this issue for full example: nylas/sync-engine#174
Hello,
Thanks for your work. I'm using this package to parse email, but I've run into a problem.
The question arises from
flanker/flanker/mime/message/headers/parsing.py
Lines 61 to 65 in 8421b0e
Is the MAX_LINE_LENGTH necessary?
The Yahoo e-mail plugin rejects disposable e-mail addresses created using Yahoo's "AddressGuard" feature. This feature allows accounts to create a basename prefix and then append a hyphen (-) and a keyword to end. It's similar to Gmail's + suffix but the basename is different from the Yahoo username.
The basename and keyword parts appear to support the same characters as a normal username: [ a-z, 0-9, dot, period ]. There doesn't appear to be a minimum length on keyword value as I have seen e-mail addresses using "1" as their keyword.
A few links explaining the functionality are available from Yahoo:
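The address shape described above could be matched roughly as follows. This is only a sketch under stated assumptions: the character set (letters, digits, dot) is taken from the post, the yahoo.com domain is assumed, and the function name is hypothetical. It is not the rule used by Flanker's Yahoo plugin.

```python
import re

# basename, a hyphen, then a keyword of at least one character.
_YAHOO_DISPOSABLE_RE = re.compile(
    r'^[a-z0-9.]+-[a-z0-9.]+@yahoo\.com$', re.IGNORECASE)


def looks_like_yahoo_disposable(addr):
    """Return True if addr has the basename-keyword@yahoo.com shape."""
    return _YAHOO_DISPOSABLE_RE.match(addr) is not None


matches = looks_like_yahoo_disposable('basename-1@yahoo.com')
```

A single-character keyword such as "1" is accepted, matching the observation that there appears to be no minimum keyword length.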
Adding a HACKING.md would be helpful to potential contributors
nosetests?)
flanker can't decode some messages with charset "gb2312", e.g.:
Content-Type: text/plain;
charset="gb2312"
Content-Transfer-Encoding: base64
DQogICCyze+LmEkNCiAgIDIwMTUtMS0yOQ0K
See StackOverflow for more details.
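The sample body above can be inspected with the standard library alone. After base64 decoding, it contains the byte pair EF 8B, whose second byte falls below 0xA1 and therefore outside the two-byte range GB2312 allows, so a strict GB2312 decode is expected to reject it. The sketch below tries the declared charset first and then falls back to GB18030, a superset encoding. The fallback is an assumption about a possible workaround, not Flanker's behaviour, and the helper name is hypothetical.

```python
import base64

payload = base64.b64decode('DQogICCyze+LmEkNCiAgIDIwMTUtMS0yOQ0K')


def decode_with_fallback(data, charsets=('gb2312', 'gb18030')):
    """Try each charset in order and return (charset_used, text)."""
    for charset in charsets:
        try:
            return charset, data.decode(charset)
        except UnicodeDecodeError:
            continue
    raise ValueError('none of the charsets could decode the data')


used_charset, text = decode_with_fallback(payload)
```

Whether silently widening the charset is acceptable depends on the application; the declared charset in the message is still technically wrong.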
Currently flanker has a hard version pin on a very old version of the regex package. This version does not support Python 3, which results in flanker also not being usable in Python 3 projects.
The version pin has a comment that indicates that this is done for performance reasons. I am wondering a few things:
I was developing something utilizing Flanker on Windows successfully. When I switched over to a Mac, connect_to_mail_exchanger() began timing out. Using netcat on Linux, Mac, and Windows, I've discovered that I can't establish a connection to the mail server on port 25 at all. I believe it's because FIOS is blocking it. I can't explain why the code seemed to be working on Windows and Linux.
Would it be useful to provide a way to suppress this check for those developing (seems likely) or serving (less likely) on a connection that blocks connections on port 25?
from flanker.addresslib import address
email = address.parse('foo@examplecom', addr_spec_only=True)
print email
gives out: foo@examplecom
Hi there,
I'm a researcher studying software evolution. As part of my current research, I'm studying the implications of open-sourcing proprietary software, for instance, whether the project succeeds in attracting newcomers. However, I observed that some projects, like flanker, deleted their software history.
Knowing that software history is indispensable for developers (e.g., developers need to refer to history several times a day), I would like to ask flanker developers the following four brief questions:
Thanks in advance for your collaboration,
Gustavo Pinto, PhD
http://www.gustavopinto.org
Flanker parses attachment's body from email as None:
In [19]: mail = open("/tmp/1.txt", "rb").read()
In [20]: mail
Out[20]: 'Delivered-To: [email protected]\nReceived: by 10.27.184.6 with SMTP id i6csp81852wlf;\n Wed, 9 Sep 2015 01:42:19 -0700 (PDT)\nX-Received: by 10.180.75.176 with SMTP id d16mr54538910wiw.75.1441788139160;\n Wed, 09 Sep 2015 01:42:19 -0700 (PDT)\nReturn-Path: <[email protected]>\nReceived: from mail-wi0-f180.google.com (mail-wi0-f180.google.com. [209.85.212.180])\n by mx.google.com with ESMTPS id kf6si11215377wjb.11.2015.09.09.01.42.19\n for <[email protected]>\n (version=TLSv1.2 cipher=ECDHE-RSA-AES128-GCM-SHA256 bits=128/128);\n Wed, 09 Sep 2015 01:42:19 -0700 (PDT)\nReceived-SPF: softfail (google.com: domain of transitioning [email protected] does not designate 209.85.212.180 as permitted sender) client-ip=209.85.212.180;\nAuthentication-Results: mx.google.com;\n spf=softfail (google.com: domain of transitioning [email protected] does not designate 209.85.212.180 as permitted sender) [email protected]\nReceived: by wiclk2 with SMTP id lk2so12858549wic.1\n for <[email protected]>; Wed, 09 Sep 2015 01:42:19 -0700 (PDT)\nX-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed;\n d=1e100.net; s=20130820;\n h=x-gm-message-state:from:content-type:subject:message-id:date:to\n :mime-version;\n bh=iADgoaV1xyNMOvT6xlZQjOp+2r7wUfsTdfZA3wUI/eg=;\n b=H6GiBZGlEfhUxlw6ytg1vbcHiqXd69rOWl0z09HqH6ywhG8dSDXlFFVfe0rYKVZAjc\n bD0YAjmEAw1BjgRJUXMsVa4zS48+iRLSqRboeWBjnbxJAseUHesxCKzCOd0FTITxHAA6\n S9E3MwSqUv+zwK6ES7DV90X0hWvxVUyzzVSDtemBnV/rkWr7jlZ9uyAvnaK7dztiTZos\n lKwuz4+H0OvDw0LV1d1y/23rr0R6TMGqd8QmGnlVqyCTI8E6LQjoeHWaQ3b7tLJxHMtM\n d5NIhqkRAl58aVSSTSAbKOEiAUqgBq98ZJpz4q5Nw3stPdu1btF/uDxyLUyaQmoTU8nr\n vIUA==\nX-Gm-Message-State: ALoCoQmaqZjZegl0KF6Y/see4tzw8O/hXN1+vW7W0waIfhTff9DYQa3y+iMBYjCE6XlOJAsq2d1U\nX-Received: by 10.180.230.197 with SMTP id ta5mr31843529wic.26.1441788138945;\n Wed, 09 Sep 2015 01:42:18 -0700 (PDT)\nReturn-Path: <[email protected]>\nReceived: from [192.168.1.9] (215-81-133-95.pool.ukrtel.net. 
[95.133.81.215])\n by smtp.gmail.com with ESMTPSA id fn8sm2658059wib.2.2015.09.09.01.42.18\n for <[email protected]>\n (version=TLSv1 cipher=ECDHE-RSA-RC4-SHA bits=128/128);\n Wed, 09 Sep 2015 01:42:18 -0700 (PDT)\nFrom: Michael Korbakov <[email protected]>\nContent-Type: multipart/mixed; boundary="Apple-Mail=_C9C2B061-E965-4274-A8D2-0CAAB92A2F17"\nSubject: =?utf-8?Q?Fwd=3A_=D1=82=D0=B5=D1=81=D1=82_ApplMail?=\nMessage-Id: <[email protected]>\nDate: Wed, 9 Sep 2015 11:42:16 +0300\nTo: Anton Koval <[email protected]>\nMime-Version: 1.0 (Mac OS X Mail 8.2 \\(2104\\))\nX-Mailer: Apple Mail (2.2104)\n\n\n--Apple-Mail=_C9C2B061-E965-4274-A8D2-0CAAB92A2F17\nContent-Transfer-Encoding: base64\nContent-Type: text/plain;\n\tcharset=utf-8\n\n0KLQtdGB0YINCg==\n--Apple-Mail=_C9C2B061-E965-4274-A8D2-0CAAB92A2F17\nContent-Disposition: attachment;\n\tfilename*=utf-8\'\'%D1%82%D0%B5%D1%81%D1%82%20ApplMail.eml\nContent-Type: message/rfc822;\n\tx-mac-hide-extension=yes;\n\tname="=?utf-8?Q?=D1=82=D0=B5=D1=81=D1=82_ApplMail=2Eeml?="\nContent-Transfer-Encoding: 7bit\n\nDelivered-To: [email protected]\nReceived: by 10.27.173.129 with SMTP id w123csp81617wle;\n Wed, 9 Sep 2015 01:38:33 -0700 (PDT)\nX-Received: by 10.180.101.164 with SMTP id fh4mr54549269wib.25.1441787913118;\n Wed, 09 Sep 2015 01:38:33 -0700 (PDT)\nReturn-Path: <[email protected]>\nReceived: from mail-wi0-f172.google.com (mail-wi0-f172.google.com. 
[209.85.212.172])\n by mx.google.com with ESMTPS id l20si11145816wjw.125.2015.09.09.01.38.33\n for <[email protected]>\n (version=TLS1_2 cipher=ECDHE-RSA-AES128-GCM-SHA256 bits=128/128);\n Wed, 09 Sep 2015 01:38:33 -0700 (PDT)\nReceived-SPF: softfail (google.com: domain of transitioning [email protected] does not designate 209.85.212.172 as permitted sender) client-ip=209.85.212.172;\nAuthentication-Results: mx.google.com;\n spf=softfail (google.com: domain of transitioning [email protected] does not designate 209.85.212.172 as permitted sender) [email protected]\nReceived: by wicfx3 with SMTP id fx3so12722933wic.0\n for <[email protected]>; Wed, 09 Sep 2015 01:38:33 -0700 (PDT)\nX-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed;\n d=1e100.net; s=20130820;\n h=x-gm-message-state:mime-version:date:message-id:subject:from:to\n :content-type;\n bh=yCsUOUoICXWy85xw3Vv9Q1aD4vzogMR3DYZ0+y+X65w=;\n b=jIoY2gfVjfrjya69jmfc3W/2p108e3c1+3TDCbKQNAgT2B1BQfEtvZqWXHXcij4NbK\n YCyiva/EJ3CBFQ9C0B4j+fiKUCi5DLTkUXF6E9W6INNM5HFdQJiIpWqi/kHdI7gtBN2G\n 6WlrcVm1NHvAvESVh7j65iDcmdimNDC/zjrUFb0nCkpvmldmhP0dJGTJ0K8t1Ho/3RhH\n 6P2idxr+HpR1RbEaV5+0ehUmiEiVZinaEyGUuT8fcLnsz/ztCy6LueNIyT+jmNbvz1HH\n /R5/lIykXTI40Q5FUz5vuLxx09u1s4f7JPvAWISnzmQTm51NIesZQ65F94twSce8wLtC\n A7sA==\nX-Gm-Message-State: ALoCoQmKzyo22NkyAdTk4HymZiAf2dqGE2bdsk3Drc+uloOqVIX4tuICf8KIgMN5JuNsFIQbZlFn\nMIME-Version: 1.0\nX-Received: by 10.194.201.71 with SMTP id jy7mr56686474wjc.93.1441787912788;\n Wed, 09 Sep 2015 01:38:32 -0700 (PDT)\nReceived: by 10.27.176.135 with HTTP; Wed, 9 Sep 2015 01:38:32 -0700 (PDT)\nDate: Wed, 9 Sep 2015 11:38:32 +0300\nMessage-ID: <CABxjYs9Kk=4OnT6uYrR5=kiQ3H+yGw9XNcu_JaJvwRDU_U5GSA@mail.gmail.com>\nSubject: =?UTF-8?B?0YLQtdGB0YIgQXBwbE1haWw=?=\nFrom: Anton Koval <[email protected]>\nTo: Michael Korbakov <[email protected]>\nContent-Type: multipart/alternative; boundary=047d7bae4944623830051f4c6811\n\n--047d7bae4944623830051f4c6811\nContent-Type: text/plain; 
charset=UTF-8\nContent-Transfer-Encoding: base64\n\n0JAg0YHQtNC10LvQsNC5INGA0LXQv9C70LDQuSDQvdCwINGN0YLQviDQv9C40YHRjNC80L4g0YfQ\ntdGA0LXQtyDRjdC/0L/Qu9C+INC60LvQuNC10L3Rgi4NCg==\n--047d7bae4944623830051f4c6811\nContent-Type: text/html; charset=UTF-8\nContent-Transfer-Encoding: base64\n\nPGRpdiBkaXI9Imx0ciI+0JAg0YHQtNC10LvQsNC5INGA0LXQv9C70LDQuSDQvdCwINGN0YLQviDQ\nv9C40YHRjNC80L4g0YfQtdGA0LXQtyDRjdC/0L/Qu9C+INC60LvQuNC10L3Rgi48YnI+PC9kaXY+\nDQo=\n--047d7bae4944623830051f4c6811--\n\n--Apple-Mail=_C9C2B061-E965-4274-A8D2-0CAAB92A2F17--'
In [21]: msg = mime.from_string(mail)
In [23]: parts = [p for p in msg.walk(with_self=True)]
In [24]: parts
Out[24]:
[<flanker.mime.message.part.MimePart at 0x10bfa7d50>,
<flanker.mime.message.part.MimePart at 0x10bfa7ad0>,
<flanker.mime.message.part.MimePart at 0x10bfa7cd0>,
<flanker.mime.message.part.MimePart at 0x10bfa7c50>,
<flanker.mime.message.part.MimePart at 0x10bfa7b50>,
<flanker.mime.message.part.MimePart at 0x10bfa7bd0>]
In [25]: [(p.is_attachment(), p) for p in parts]
Out[25]:
[(False, <flanker.mime.message.part.MimePart at 0x10bfa7d50>),
(False, <flanker.mime.message.part.MimePart at 0x10bfa7ad0>),
(True, <flanker.mime.message.part.MimePart at 0x10bfa7cd0>),
(False, <flanker.mime.message.part.MimePart at 0x10bfa7c50>),
(False, <flanker.mime.message.part.MimePart at 0x10bfa7b50>),
(False, <flanker.mime.message.part.MimePart at 0x10bfa7bd0>)]
In [26]: attach = parts[2]
In [27]: attach.dete
attach.detected_content_type attach.detected_file_name attach.detected_format attach.detected_subtype
In [27]: attach.detected_file_name
Out[27]: u'\u0442\u0435\u0441\u0442 ApplMail.eml'
In [28]: attach.body is None
Out[28]: True
However:
In [29]: p_attach = attach.to_python_message()
In [32]: p_attach.get_payload()[0].as_string()
Out[32]: 'Delivered-To: [email protected]\nReceived: by 10.27.173.129 with SMTP id w123csp81617wle;\n Wed, 9 Sep 2015 01:38:33 -0700 (PDT)\nX-Received: by 10.180.101.164 with SMTP id fh4mr54549269wib.25.1441787913118; \n Wed, 09 Sep 2015 01:38:33 -0700 (PDT)\nReturn-Path: <[email protected]>\nReceived: from mail-wi0-f172.google.com (mail-wi0-f172.google.com.\n [209.85.212.172])\n by mx.google.com with ESMTPS id l20si11145816wjw.125.2015.09.09.01.38.33\n for <[email protected]>\n (version=TLS1_2 cipher=ECDHE-RSA-AES128-GCM-SHA256 bits=128/128);\n Wed, 09 Sep 2015 01:38:33 -0700 (PDT)\nReceived-SPF: softfail (google.com: domain of transitioning [email protected]\n does not designate 209.85.212.172 as permitted sender)\n client-ip=209.85.212.172; \nAuthentication-Results: mx.google.com;\n spf=softfail (google.com: domain of transitioning [email protected] does not\n designate 209.85.212.172 as permitted sender) [email protected]\nReceived: by wicfx3 with SMTP id fx3so12722933wic.0\n for <[email protected]>; Wed, 09 Sep 2015 01:38:33 -0700 (PDT)\nX-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed;\n d=1e100.net; s=20130820;\n h=x-gm-message-state:mime-version:date:message-id:subject:from:to\n :content-type;\n bh=yCsUOUoICXWy85xw3Vv9Q1aD4vzogMR3DYZ0+y+X65w=;\n b=jIoY2gfVjfrjya69jmfc3W/2p108e3c1+3TDCbKQNAgT2B1BQfEtvZqWXHXcij4NbK\n YCyiva/EJ3CBFQ9C0B4j+fiKUCi5DLTkUXF6E9W6INNM5HFdQJiIpWqi/kHdI7gtBN2G\n 6WlrcVm1NHvAvESVh7j65iDcmdimNDC/zjrUFb0nCkpvmldmhP0dJGTJ0K8t1Ho/3RhH\n 6P2idxr+HpR1RbEaV5+0ehUmiEiVZinaEyGUuT8fcLnsz/ztCy6LueNIyT+jmNbvz1HH\n /R5/lIykXTI40Q5FUz5vuLxx09u1s4f7JPvAWISnzmQTm51NIesZQ65F94twSce8wLtC\n A7sA==\nX-Gm-Message-State: ALoCoQmKzyo22NkyAdTk4HymZiAf2dqGE2bdsk3Drc+uloOqVIX4tuICf8KIgMN5JuNsFIQbZlFn\nMIME-Version: 1.0\nX-Received: by 10.194.201.71 with SMTP id jy7mr56686474wjc.93.1441787912788;\n Wed, 09 Sep 2015 01:38:32 -0700 (PDT)\nReceived: by 10.27.176.135 with HTTP; Wed, 9 Sep 2015 01:38:32 -0700 (PDT)\nDate: Wed, 9 Sep 2015 11:38:32 
+0300\nMessage-ID: <CABxjYs9Kk=4OnT6uYrR5=kiQ3H+yGw9XNcu_JaJvwRDU_U5GSA@mail.gmail.com>\nSubject: =?UTF-8?B?0YLQtdGB0YIgQXBwbE1haWw=?=\nFrom: Anton Koval <[email protected]>\nTo: Michael Korbakov <[email protected]>\nContent-Type: multipart/alternative; boundary=047d7bae4944623830051f4c6811\n\n--047d7bae4944623830051f4c6811\nContent-Type: text/plain; charset=UTF-8\nContent-Transfer-Encoding: base64\n\n0JAg0YHQtNC10LvQsNC5INGA0LXQv9C70LDQuSDQvdCwINGN0YLQviDQv9C40YHRjNC80L4g0YfQ\ntdGA0LXQtyDRjdC/0L/Qu9C+INC60LvQuNC10L3Rgi4NCg==\n--047d7bae4944623830051f4c6811\nContent-Type: text/html; charset=UTF-8\nContent-Transfer-Encoding: base64\n\nPGRpdiBkaXI9Imx0ciI+0JAg0YHQtNC10LvQsNC5INGA0LXQv9C70LDQuSDQvdCwINGN0YLQviDQ\nv9C40YHRjNC80L4g0YfQtdGA0LXQtyDRjdC/0L/Qu9C+INC60LvQuNC10L3Rgi48YnI+PC9kaXY+\nDQo=\n--047d7bae4944623830051f4c6811--\n'
Maybe I'm trying to get the attachment's body the wrong way?
There is no LONG_DESCRIPTION in setup.py, which means if you go to the PyPI page for flanker you don't see any information about the app.
Since the URL points to Mailgun, there is also no easy way for somebody to find the documentation (which is inside this repo).
The best way to handle this is probably to just use RST for the core README, have long_description pull from that, and change the URL to point to this repo instead of the main Mailgun site.
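The suggestion could look roughly like this. The helper name and the README.rst filename are assumptions, and the setup() call is left as a comment because the project's real setup.py arguments are not shown here.

```python
import os


def read_long_description(base_dir, filename='README.rst'):
    """Return the README contents for use as long_description."""
    path = os.path.join(base_dir, filename)
    with open(path, encoding='utf-8') as handle:
        return handle.read()


# In setup.py this would be used along these lines (sketch only):
#
#   setup(
#       name='flanker',
#       long_description=read_long_description(os.path.dirname(__file__)),
#       url='https://github.com/mailgun/flanker',
#       ...
#   )
```

Keeping the README in RST means the same file renders on both GitHub and the package index without a conversion step.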
Is expiringdict still required for this project? The GitHub search bar only finds it in setup.py. If it is no longer needed, it should be removed from setup.py.
Following this:
https://github.com/mailgun/flanker
I get this:
ubuntu@ubuntu:~/Desktop/python/flanker$ python setup.py install
Traceback (most recent call last):
File "setup.py", line 4, in <module>
from setuptools import setup, find_packages
ImportError: No module named setuptools
This fixes it:
wget https://bootstrap.pypa.io/ez_setup.py -O - | sudo python
Hello there,
I recently bumped into a strange case.
Passing this URL to the address.validate_list function returns it as if it were a valid email address:
http://mail.bg/#message/inbox/1/1/all
>>> address.validate_list("http://mail.bg/")
[http://mail.bg/]
This happens also with this url:
http://broshura.bg/shops/hippoland
At the same time, other URLs were not returned as valid email addresses, which is the expected result.
Passing the urls from above to address.validate_address works as expected and returns None.
>>> address.validate_address("http://mail.bg/")
>>>
Regards,
Lyubo
Hi,
We use flanker to parse MIME parts and I think I found a special case where the parser crashes on slightly-malformed content. I've narrowed it to the following test case:
Delivered-To: [email protected]
Date: 11 Jan 2013 18:54:26 -0000
MIME-Version: 1.0
To: <[email protected]>
Message-ID: <1357884894.S.69618.18751.f5mail-224-118.example.com>
Sender: [email protected]
Subject: Dear sir
From: "John Doe " <[email protected]>
Content-Type: multipart/mixed;
boundary="=_e6ddd3579a993208589b263b76d66bec"
--=_e6ddd3579a993208589b263b76d66bec
Content-Transfer-Encoding: 7bit
Content-Type: text/plain; charset="UTF-8"
URGENT - HELP ME DISTRIBUTE MY $15 MILLION TO CHARITY
IN SUMMARY:- I have 15,000,000.00 (fifteen million) U.S. Dollars and I want you to assist me in distributing the money to charity organizations.
--=_e6ddd3579a993208589b263b76d66bec
Content-Transfer-Encoding:
Content-Type: message/rfc822;
name="ForwardedMessage";
Content-Disposition: inline;
filename="ForwardedMessage";
--=_e6ddd3579a993208589b263b76d66bec--
I think the parser expects the Message part to be followed by \r\n, which makes it crash.
I use the following program to trigger the bug:
import sys
from flanker import mime
fd = open(sys.argv[1], "r")
contents = fd.read()
parsed = mime.from_string(contents)
for mimepart in parsed.walk(with_self=parsed.content_type.is_singlepart()):
    print mimepart.headers
The traceback is:
Traceback (most recent call last):
File "/contrib/flanker/testcase.py", line 11, in <module>
print mimepart.headers
File "/contrib/flanker/flanker/mime/message/part.py", line 389, in headers
return self._container.headers
File "/contrib/flanker/flanker/mime/message/part.py", line 42, in headers
self._load_headers()
File "/contrib/flanker/flanker/mime/message/part.py", line 65, in _load_headers
self.stream.seek(self.start)
TypeError: an integer is required
I'd be happy to contribute a patch if you could point me in the right direction.
Suppose CompanyA buys CompanyB and maintains their email service but changes the MX servers of CompanyB to point to the MX servers of CompanyA. This throws our grammar check for a loop: are we checking the grammar of CompanyA or CompanyB?
A modern-day example: AOL bought CompuServe. So when someone tries to validate [email protected] (where x is an integer) we try against AOL grammar and mark it as invalid.
We shouldn't rely completely on MX servers, but also take the domain name into consideration when checking custom grammar.
The culprit is this piece of code in flanker/mime/message/part.py:
def detected_file_name(self):
    ...
    cdisp = self.content_disposition
    if cdisp.value == 'attachment':
        file_name = cdisp.params.get('filename', '') or file_name
It doesn't check the inline content disposition.
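A possible shape for the fix is to accept both disposition values. The function below is a hypothetical, standalone stand-in for the logic in detected_file_name, written so it can run without Flanker; it is not the project's actual patch.

```python
def pick_file_name(disposition, params, fallback=''):
    """Choose a file name from Content-Disposition parameters.

    disposition: the disposition value, e.g. 'attachment' or 'inline'
    params: dict of disposition parameters
    fallback: name detected by other means (hypothetical)
    """
    # Treat inline parts the same as attachments when they carry a filename.
    if disposition in ('attachment', 'inline'):
        return params.get('filename', '') or fallback
    return fallback


name = pick_file_name('inline', {'filename': 'ForwardedMessage'})
```

An empty or missing filename parameter still falls through to the previously detected name, as in the original snippet.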
For an example message:
It has non-ASCII or UTF-8 characters in its Subject line.
This only causes an error on accessing the .subject property.
to_unicode in ./flanker/flanker/mime/message/headers/parsing.py does:
return unicode(val, 'utf-8', 'strict')
However, other places that try to convert strings to UTF-8 use ignore, for example flanker.utils.to_utf8.
See the following issue on the validator-demo repo: mailgun/validator-demo#5
The basic issue is that MS Exchange allows apostrophes, so the validator should as well even though RFC grammar does not.
I've received failed validation on a number of addresses containing the Swedish characters åäö as well as the slightly more French é. From what I've gathered, any UTF-8 is valid (https://en.wikipedia.org/wiki/International_email#Email_addresses). It might not be nice and I will consider rejecting these anyway, but what kind of ruleset does Flanker actually run on?
When attempting to extract emails from the Sent folder, the MIME parts include the original message but not the actual sent response. Is that by design?
When a domain's DNS is hosted on one server and its MX record is on another server, Flanker says the email address is invalid.
We're using flanker to parse emails from the Gmail API. The data we receive for the "From:" header looks like "some\"thing" <[email protected]> and is not treated as a valid email by Flanker.
I understand that this string can be invalid according to RFC, but that's what we're getting from Gmail.
The original message has the following From field:
From: =?iso-8859-1?Q?W=F6rz=2C_Michael?= <[email protected]>
After parsing the whole message with flanker.mime.from_string():
ipdb> sndr = headers['From']
ipdb> sndr
u'W\xf6rz, Michael <[email protected]>'
and
ipdb> address.parse(sndr) is None
True
It looks like something went wrong at the step of decoding the value of the From header?
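One plausible reading is that the decoded display name contains an unquoted comma, which then looks like a list separator to an address parser. The sketch below shows the alternative order of operations, splitting the address first and decoding the display name afterwards, using only the standard library rather than Flanker. The address user@example.com is a placeholder, since the original address is obscured in this page.

```python
from email.header import decode_header, make_header
from email.utils import parseaddr

raw_from = '=?iso-8859-1?Q?W=F6rz=2C_Michael?= <user@example.com>'

# Step 1: split into display name and addr-spec while the name is still an
# RFC 2047 encoded-word (the comma is hidden as =2C at this point).
encoded_name, addr = parseaddr(raw_from)

# Step 2: decode only the display name.
display_name = str(make_header(decode_header(encoded_name)))
```

Parsing the already-decoded string instead would require the display name to be quoted first, because of the comma it contains.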
Flanker's EmailAddress object implements an __eq__ method to support comparison with strings, which is super convenient, but the missing __ne__ (and friends) creates some crazy behavior:
>>> email_address = address.parse('[email protected]')
>>> email_address == '[email protected]'
True
>>> email_address != '[email protected]'
True
Both of these things can't be true, obviously. The answer here is to implement __ne__ and the rest of the Python comparison methods.
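A minimal sketch of the suggested fix, using a hypothetical Address class rather than Flanker's EmailAddress. Python 3 derives __ne__ from __eq__ automatically, but Python 2 does not, which is why the behavior above appears there; defining __ne__ explicitly works on both. The isinstance check against str is written for Python 3.

```python
class Address(object):
    """Hypothetical stand-in for an address object comparable to strings."""

    def __init__(self, address):
        self.address = address

    def __eq__(self, other):
        if isinstance(other, Address):
            return self.address == other.address
        if isinstance(other, str):
            return self.address == other
        return NotImplemented

    def __ne__(self, other):
        # Mirror __eq__ so that == and != can never both be True.
        result = self.__eq__(other)
        if result is NotImplemented:
            return result
        return not result

    def __hash__(self):
        # Defining __eq__ disables the default hash, so restore one.
        return hash(self.address)


addr = Address('user@example.com')
equal = addr == 'user@example.com'
not_equal = addr != 'user@example.com'
```

Ordering methods (__lt__ and so on) are left out here; whether addresses should be orderable at all is a separate design question.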
How does flanker work with large attachments? From what I saw, you need to pass a string to flanker, which means you first need to read the whole MIME message into a string, so it will all be in memory, right? Is there a way to use streams?
Some email clients use the form:
First Last ([email protected])
Flanker doesn't parse this. You may have a reason for not allowing it and if so that's fine, but since some common email clients use that form it is probably worth adding support for?
It looks like according to RFC1341 section 7.2.1 and RFC822 section 4.1 two CRLF between parts should be enough, however:
>>> s = """Content-Type: multipart/mixed; boundary="----=_20140710132934_74779"
...
... ------=_20140710132934_74779
... Content-Type: text/plain; charset="windows-1251"
... Content-Transfer-Encoding: 8bit
...
... ------=_20140710132934_74779
... Content-Type: application/x-zip; name="ZIP-1.zip"
... Content-Transfer-Encoding: base64
... Content-Disposition: attachment; filename="ZIP-1.zip"
...
... xxx"""
>>>
>>>
>>>
>>> m = mime.from_string(s)
>>> m.parts
deque([<flanker.mime.message.part.MimePart object at 0x2a84e90>, <flanker.mime.message.part.MimePart object at 0x2a84f10>])
>>> m.parts[0].content_type
('text/plain', {'charset': u'windows-1251'})
>>> m.parts[0].body
u'------=_20140710132934_74779\nContent-Type: application/x-zip; name="ZIP-1.zip"\nContent-Transfer-Encoding: base64\nContent-Disposition: attachment; filename="ZIP-1.zip"\n\nxxx'
>>>
Hi
How is the email validation function supposed to work?
from flanker.addresslib import address
if __name__ == '__main__':
    isValid = address.parse('[email protected]', addr_spec_only=True)
    isValid2 = address.validate_address('[email protected]')
    print isValid
    print isValid2
The email [email protected] does not exist, and the second, [email protected], is my email.
The result is:
[email protected]
None
It's not working.
The character detection code in chardet is very, very slow. A simple profiling of flanker parsing shows that ~85% of CPU time is spent in chardet.
I know that something like https://github.com/hbattat/verifyEmail can validate email addresses using PHP, but using it often will get the IP blocked, so the number of checks is limited.
I want to know whether Flanker will hurt the IP's reputation or not.
Thank you
Hi,
We are using the flanker validation (we are using the web service, not self-hosted) for our web application and I'm getting a complaint from a user using a hotmail.nl address. His email address is:
I've changed the letters and numbers to anonymize the email address. The api response is:
{
  "address": "[email protected]",
  "did_you_mean": "[email protected]",
  "is_valid": false,
  "parts": {
    "display_name": null,
    "domain": null,
    "local_part": null
  }
}
I have tested the email address and can confirm the email address exists.
According to the HTTP RFC 2388:
Field names originally in non-ASCII character sets may be encoded
within the value of the "name" parameter using the standard method
described in RFC 2047.
This may occur in cases where an HTTP payload is converted into SMTP MIME. An HTTP payload uses the Content-Type to declare that the file is an "application/octet-stream". This is invalid for SMTP MIME, thus Flanker's conversion methods.
If the Content-Type value is broken or missing, Flanker will attempt to reconstruct it in the method fix_content_type().
As a result, if an encoded file name is present, value.lower().split("/") could truncate the name if a slash exists in the filename.
This method should probably check for and ignore encoded filenames by inspecting the string for UTF prefixes, e.g. "=?UTF-8?b?0L/RgNC+0LHQu9C10LzQsC5wbmc=?=".
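The check suggested above could be sketched as follows, using the standard library only. Both helper names are hypothetical and this is not the code in fix_content_type(); the pattern follows the RFC 2047 encoded-word layout (=?charset?encoding?text?=).

```python
import re
from email.header import decode_header

# charset, then B or Q encoding, then the encoded text.
_ENCODED_WORD_RE = re.compile(r'=\?[^?\s]+\?[bBqQ]\?[^?\s]*\?=')


def contains_encoded_word(value):
    """Return True if value holds an RFC 2047 encoded-word."""
    return _ENCODED_WORD_RE.search(value) is not None


def decode_encoded_words(value):
    """Decode any encoded-words in value to text."""
    parts = []
    for data, charset in decode_header(value):
        if isinstance(data, bytes):
            parts.append(data.decode(charset or 'ascii'))
        else:
            parts.append(data)
    return ''.join(parts)


sample = '=?UTF-8?b?0L/RgNC+0LHQu9C10LzQsC5wbmc=?='
is_encoded = contains_encoded_word(sample)
decoded = decode_encoded_words(sample)

# The naive approach splits inside the base64 text, which contains a slash.
naive_parts = sample.lower().split('/')
```

Note that lowercasing the value before splitting also corrupts the base64 payload, so the encoded-word check would need to happen before any case folding.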