go-message

A Go library for the Internet Message Format. It implements:

Features

  • Streaming API
  • Automatic encoding and charset handling (to decode all charsets, add import _ "github.com/emersion/go-message/charset" to your application)
  • A mail subpackage to read and write mail messages
  • DKIM-friendly
  • A textproto subpackage that just implements the wire format

License

MIT

go-message's Issues

NextPart() fails with quoted-printable encoding and = in boundary

When I tried to parse a mail with some attachments using emersion/go-imap, I came across a strange behavior: If the mail is encoded in "quoted-printable", and the boundary contains any = signs, then the Reader.NextPart() fails to correctly detect the parts of the message, leading to an "unexpected EOF" on the first call.

Changing either the encoding to something else or changing the boundary to not contain the = sign avoids the error.

Is this an actual bug in the package, or am I doing something wrong?

Example code:

The following code reproduces the unexpected behavior of the MultiPartReader:

package main

import (
	"fmt"
	"github.com/emersion/go-message/mail"
	"io/ioutil"
	"strings"
)

func main() {
	content := `Content-Type: multipart/mixed;charset="utf-8";boundary="BrokenBoundary=="
Content-Transfer-Encoding: quoted-printable

--BrokenBoundary==
Content-Type: text/html;charset="utf-8"
Content-Transfer-Encoding: 8bit

Some Fancy Content.

--BrokenBoundary==--
`

	mr, _ := mail.CreateReader(strings.NewReader(content))
	for {
		part, err := mr.NextPart()
		if err != nil {
			fmt.Println(err)
			break
		}
		b, _ := ioutil.ReadAll(part.Body)
		fmt.Println(string(b))
	}
}

Expected output:

Some Fancy Content.

EOF

Actual output:

multipart: NextPart: unexpected EOF
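One possible way to see why the = sign matters: in quoted-printable, = is the escape character, and a = at the end of a line is a soft line break. The sketch below only uses the standard library's mime/quotedprintable package and assumes, without confirmation from the report, that the declared transfer encoding gets applied to the multipart body before the boundary lines are scanned. Under that assumption a delimiter line ending in == can no longer match the boundary.

```go
package main

import (
	"fmt"
	"io"
	"mime/quotedprintable"
	"strings"
)

// decodeQP runs s through the standard library's quoted-printable decoder.
func decodeQP(s string) (string, error) {
	b, err := io.ReadAll(quotedprintable.NewReader(strings.NewReader(s)))
	return string(b), err
}

func main() {
	for _, in := range []string{
		"--PlainBoundary\r\nbody\r\n",    // no '=': expected to pass through
		"--BrokenBoundary==\r\nbody\r\n", // trailing '=': treated as QP syntax
	} {
		out, err := decodeQP(in)
		fmt.Printf("in=%q\nout=%q err=%v\n\n", in, out, err)
	}
}
```

If this is indeed the cause, skipping the transfer decoding for multipart/* entities would be one direction to explore.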

unused code

textproto/multipart.go:34:2: U1000: field `disposition` is unused 
	disposition       string

textproto/multipart.go:35:2: U1000: field `dispositionParams` is unused 
	dispositionParams map[string]string

textproto/multipart.go:46:16: U1000: func `(*Part).parseContentDisposition` is unused 
func (p *Part) parseContentDisposition() {

Handle MIME-encoded address list headers

Currently, in mail/header.go, we use h.Get(key) to grab the header to
parse. Sometimes, address list headers include MIME-encoded segments,
particularly in the display names. Currently, we don't decode these
segments. Is this a bug? Is there some reason we shouldn't be using
h.Text(key) instead?

Happy to send a patch, just wanted to check if it is a valid fix first.

ParseHeader erroneously accepts spaces in header names

This is an issue that I encountered in Aerc, which uses this library:

https://todo.sr.ht/~sircmpwn/aerc2/341

According to RFC 5322, spaces are not valid in email header field names. This parser, however, erroneously accepts headers with spaces in them. Relevant code is linked here. I noticed a reference to RFC 7230 in this code, but I'm not familiar enough with it to understand why. Should we be following RFC 7230 or RFC 5322 for the email headers?

Let me know if I'm missing something or misunderstanding the RFCs. And I'd be happy to submit a pull request to fix this! :)
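For illustration, a minimal sketch of a stricter check, assuming the RFC 5322 rule that a field name consists of printable US-ASCII characters (33 to 126) other than the colon. The function name validFieldName is hypothetical and is not part of go-message.

```go
package main

import "fmt"

// validFieldName reports whether name is a legal RFC 5322 header field name:
// one or more printable US-ASCII characters (33 to 126), excluding the colon.
// A space is character 32, so it is rejected.
func validFieldName(name string) bool {
	if name == "" {
		return false
	}
	for i := 0; i < len(name); i++ {
		c := name[i]
		if c < 33 || c > 126 || c == ':' {
			return false
		}
	}
	return true
}

func main() {
	for _, name := range []string{"Subject", "X-Mailer", "Bad Header", ""} {
		fmt.Printf("%q valid: %v\n", name, validFieldName(name))
	}
}
```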

Add limits to prevent abuse

It would be great to have options to limit the max number of mime parts to be processed and max nested levels to descend.
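A rough sketch of what such limits could look like, written against the standard library's mime/multipart rather than go-message so that it stays self-contained. The limits struct, its field names and the two error values are hypothetical and only serve the illustration.

```go
package main

import (
	"errors"
	"fmt"
	"io"
	"mime"
	"mime/multipart"
	"strings"
)

var (
	errTooManyParts = errors.New("too many MIME parts")
	errTooDeep      = errors.New("MIME nesting too deep")
)

// limits is a hypothetical option struct; neither field exists in go-message.
type limits struct {
	maxParts int // maximum number of leaf parts to process
	maxDepth int // maximum number of nested multipart levels
}

// walk visits every leaf part below body, enforcing lim as it goes.
func walk(body io.Reader, contentType string, lim limits, depth int, seen *int) error {
	mediaType, params, err := mime.ParseMediaType(contentType)
	if err != nil || !strings.HasPrefix(mediaType, "multipart/") {
		// Leaf part: count it, then drain it so the parent reader can advance.
		*seen++
		if *seen > lim.maxParts {
			return errTooManyParts
		}
		_, copyErr := io.Copy(io.Discard, body)
		return copyErr
	}
	if depth >= lim.maxDepth {
		return errTooDeep
	}
	mr := multipart.NewReader(body, params["boundary"])
	for {
		p, err := mr.NextPart()
		if err == io.EOF {
			return nil
		}
		if err != nil {
			return err
		}
		if err := walk(p, p.Header.Get("Content-Type"), lim, depth+1, seen); err != nil {
			return err
		}
	}
}

// countParts returns the number of leaf parts seen, or an error once a limit is hit.
func countParts(msg, contentType string, lim limits) (int, error) {
	seen := 0
	err := walk(strings.NewReader(msg), contentType, lim, 0, &seen)
	return seen, err
}

// sample has one text part plus a nested multipart/alternative with two parts.
const sample = "--outer\r\nContent-Type: text/plain\r\n\r\nhello\r\n" +
	"--outer\r\nContent-Type: multipart/alternative; boundary=inner\r\n\r\n" +
	"--inner\r\nContent-Type: text/plain\r\n\r\na\r\n" +
	"--inner\r\nContent-Type: text/html\r\n\r\n<p>a</p>\r\n--inner--\r\n" +
	"--outer--\r\n"

func main() {
	const ct = "multipart/mixed; boundary=outer"
	fmt.Println(countParts(sample, ct, limits{maxParts: 10, maxDepth: 2}))
	fmt.Println(countParts(sample, ct, limits{maxParts: 2, maxDepth: 2}))
	fmt.Println(countParts(sample, ct, limits{maxParts: 10, maxDepth: 1}))
}
```

Returning distinct error values lets a caller tell a limit violation apart from a malformed message.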

Commit our enhancement back to go standard library

Hi, I think our implementation of the textproto.Header struct is more advanced than the Go standard library's textproto.MIMEHeader (which is a map[string][]string), especially in how it keeps the original raw headers (necessary for DKIM). Would it be possible to submit a change request to the Go project so that it becomes part of the standard library and replaces textproto.MIMEHeader? That way we wouldn't need our own copy of multipart.go, and we would also benefit from upstream Go bug fixes.

Additional ascii charset

I've encountered several cases where a message has a "utf-ascii" charset, so I've tried adding it as a valid charset, like "us-ascii", and it works as expected (in my cases).

Could you consider doing the same?
(I'll create a PR, if you like)
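A small sketch of the idea, assuming an alias table that is consulted before the charset lookup. The table and the canonicalCharset helper are hypothetical; only the "utf-ascii" label comes from the report above.

```go
package main

import (
	"fmt"
	"strings"
)

// charsetAliases maps non-standard labels seen in the wild to a canonical name.
// This table is an illustration, not the one used by go-message.
var charsetAliases = map[string]string{
	"utf-ascii": "us-ascii",
	"ascii":     "us-ascii",
}

// canonicalCharset lowercases and trims label, then resolves known aliases.
func canonicalCharset(label string) string {
	name := strings.ToLower(strings.TrimSpace(label))
	if alias, ok := charsetAliases[name]; ok {
		return alias
	}
	return name
}

func main() {
	for _, label := range []string{"UTF-ASCII", "us-ascii", "utf-8"} {
		fmt.Printf("%s -> %s\n", label, canonicalCharset(label))
	}
}
```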

mail: doesn't work with git-send-email

From: emersion <redacted>
To: redacted
Subject: [PATCH] server: add wl_display_destroy_clients
Date: Mon, 11 Dec 2017 23:02:35 +0100
Message-Id: <20171211220235.17968-1>
X-Mailer: git-send-email 2.15.1

wl_display_destroy doesn't destroy clients, so client socket file descriptors are being kept open until the compositor process exits.

To maintain ABI compatibility, we cannot destroy clients in wl_display_destroy. Thus, a new wl_display_destroy_clients functions is added and should be called by compositors right before wl_display_destroy.

See https://patchwork.freedesktop.org/patch/128832/

Signed-off-by: emersion <redacted>
---
 src/wayland-server-core.h |  3 +++
 src/wayland-server.c      | 22 ++++++++++++++++++++++
 2 files changed, 25 insertions(+)

diff --git a/src/wayland-server-core.h b/src/wayland-server-core.h
index fd458c5..2e725d9 100644
--- a/src/wayland-server-core.h
+++ b/src/wayland-server-core.h
@@ -214,6 +214,9 @@ wl_display_run(struct wl_display *display);
 void
 wl_display_flush_clients(struct wl_display *display);
 
+void
+wl_display_destroy_clients(struct wl_display *display);
+
 struct wl_client;
 
 typedef void (*wl_global_bind_func_t)(struct wl_client *client, void *data,
diff --git a/src/wayland-server.c b/src/wayland-server.c
index 82a3b01..7f24ef1 100644
--- a/src/wayland-server.c
+++ b/src/wayland-server.c
@@ -1264,6 +1264,28 @@ wl_display_flush_clients(struct wl_display *display)
 	}
 }
 
+WL_EXPORT void
+wl_display_destroy_clients(struct wl_display *display)
+{
+	struct wl_list tmp_client_list;
+	struct wl_client *client, *next;
+
+	// Move the whole client list to a temporary head because some new clients
+	// might be added to the original head
+	wl_list_init(&tmp_client_list);
+	wl_list_insert_list(&tmp_client_list, &display->client_list);
+	wl_list_init(&display->client_list);
+
+	wl_list_for_each_safe(client, next, &tmp_client_list, link) {
+		wl_client_destroy(client);
+	}
+
+	if (!wl_list_empty(&display->client_list)) {
+		wl_log("wl_display_destroy_clients: cannot destroy all clients because "
+			   "new ones were created by destroy callbacks\n");
+	}
+}
+
 static int
 socket_data(int fd, uint32_t mask, void *data)
 {
-- 
2.15.1

Add buffered e-mail helpers

Not sure this is a good idea yet. Would need to have more use-cases in mind.

Some users of the library may not want to use a streaming API, for instance because they need to perform operations with multiple passes. We could expose APIs that let users process messages by loading them completely in memory, like almost all (or all?) other e-mail libraries do.

Of course this comes with the downside that whole messages (with their attachments) are loaded in memory.

Add a package with all charsets

It seems some people don't care about binary size and want to register all charsets supported by the Go x/text/encoding library. We could add a charset/all package which registers all of them.

returning error on len(line) > maxLineOctets can lead to missing attachments

Inside readLineSlice function:

if len(line) > maxLineOctets {
	return nil, TooBigError{"line"}
}

This check is very useful for ignoring content that may be spam, but it can cause the caller to skip the following attachments in a mail message. Would it be possible to use break here instead of returning an error?
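As a sketch of the alternative being asked about, the reader below keeps at most a fixed number of bytes per line, drops the remainder of an overlong line, and reports the truncation through a flag instead of an error. It is built on bufio from the standard library; readLineCapped and readAllCapped are hypothetical names and this is not how readLineSlice is implemented.

```go
package main

import (
	"bufio"
	"fmt"
	"strings"
)

// readLineCapped returns the next line, keeping at most max bytes of it.
// The rest of an overlong line is consumed and dropped, and the boolean
// result is set, so the caller can carry on with the following line.
func readLineCapped(r *bufio.Reader, max int) ([]byte, bool, error) {
	var line []byte
	truncated := false
	for {
		chunk, err := r.ReadSlice('\n')
		if room := max - len(line); len(chunk) > room {
			chunk = chunk[:room]
			truncated = true
		}
		line = append(line, chunk...)
		if err == bufio.ErrBufferFull {
			continue // still inside the same line, more data to come
		}
		return line, truncated, err
	}
}

// readAllCapped collects every line of input together with its truncation flag.
func readAllCapped(input string, max int) ([]string, []bool) {
	br := bufio.NewReaderSize(strings.NewReader(input), 16)
	var lines []string
	var flags []bool
	for {
		line, truncated, err := readLineCapped(br, max)
		if len(line) > 0 {
			lines = append(lines, string(line))
			flags = append(flags, truncated)
		}
		if err != nil {
			return lines, flags
		}
	}
}

// sample has a 100-byte line between two short ones.
var sample = "short\n" + strings.Repeat("x", 100) + "\nnext\n"

func main() {
	lines, flags := readAllCapped(sample, 20)
	for i, l := range lines {
		fmt.Printf("%q truncated=%v\n", l, flags[i])
	}
}
```

Whether silently truncating is acceptable depends on the caller; the flag at least makes the situation visible.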

Add Entity.Reader

Add an Entity.Reader function which returns an io.Reader for the entity. This would allow users to read a header, inspect it and then forward the reader to someone else.

Ideally this wouldn't decode and re-encode the body: we'd save the original io.Reader. That can only work if the headers aren't modified.

Bonus points if message.Read can be optimized in case it's passed the result of an Entity.Reader call (just like we have for NewMultipart and Entity.MultipartReader).

How to read an attachment?

Could I ask for an example how to save an attachment to disk or as byte string?

I know that the name of an attachment can be obtained with something like this

	case mail.AttachmentHeader:
			// This is an attachment
			filename, _ := h.Filename()
			log.Println(filename)

thx
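A minimal sketch of the general pattern, using the standard library's mime/multipart instead of go-message so that it runs on its own: each part is an io.Reader, so it can be copied to a file with io.Copy or read into a byte slice with io.ReadAll. In go-message the reader to copy from would be the part's Body field, as in the other snippets on this page. saveAttachments and demo are hypothetical helpers, and unlike go-message this sketch does not decode the Content-Transfer-Encoding.

```go
package main

import (
	"fmt"
	"io"
	"mime/multipart"
	"os"
	"path/filepath"
	"strings"
)

// saveAttachments writes every part that carries a filename into dir and
// returns the paths it created.
func saveAttachments(body io.Reader, boundary, dir string) ([]string, error) {
	var saved []string
	mr := multipart.NewReader(body, boundary)
	for {
		p, err := mr.NextPart()
		if err == io.EOF {
			return saved, nil
		}
		if err != nil {
			return saved, err
		}
		name := p.FileName()
		if name == "" {
			continue // no filename: treat it as inline text, not an attachment
		}
		// filepath.Base keeps a hostile filename from escaping dir.
		dst := filepath.Join(dir, filepath.Base(name))
		f, err := os.Create(dst)
		if err != nil {
			return saved, err
		}
		_, copyErr := io.Copy(f, p)
		closeErr := f.Close()
		if copyErr != nil {
			return saved, copyErr
		}
		if closeErr != nil {
			return saved, closeErr
		}
		saved = append(saved, dst)
	}
}

const sample = "--b\r\nContent-Type: text/plain\r\n\r\nhello\r\n" +
	"--b\r\nContent-Type: application/octet-stream\r\n" +
	"Content-Disposition: attachment; filename=\"note.txt\"\r\n\r\n" +
	"attachment bytes\r\n--b--\r\n"

// demo saves the sample's attachment to a temporary directory and reads it back.
func demo() (string, error) {
	dir, err := os.MkdirTemp("", "attachments")
	if err != nil {
		return "", err
	}
	defer os.RemoveAll(dir)
	saved, err := saveAttachments(strings.NewReader(sample), "b", dir)
	if err != nil {
		return "", err
	}
	if len(saved) != 1 {
		return "", fmt.Errorf("expected 1 attachment, got %d", len(saved))
	}
	data, err := os.ReadFile(saved[0])
	return string(data), err
}

func main() {
	fmt.Println(demo())
}
```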

Error handling for charset errors

@mnagy raised this issue in #8

For the moment, charset errors are silently ignored. This is bad, but it keeps the parser from hard-failing on these errors.

A solution would be to return charset errors (which could be checked with a function like IsCharsetError) but to allow users to ignore this error if they want.
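The proposal could be sketched roughly as follows: the decoding step returns the body it has together with a dedicated error type, and the caller decides whether to ignore it. Only the name IsCharsetError comes from the text above; charsetError and decodeBody are hypothetical stand-ins.

```go
package main

import (
	"errors"
	"fmt"
)

// charsetError marks a failure that is specific to charset handling.
type charsetError struct {
	charset string
}

func (e *charsetError) Error() string { return "unknown charset: " + e.charset }

// IsCharsetError reports whether err is, or wraps, a charset error.
func IsCharsetError(err error) bool {
	var ce *charsetError
	return errors.As(err, &ce)
}

// decodeBody is a stand-in for a real decoding step. It only knows two
// charsets and returns the body untouched, plus an error, for anything else.
func decodeBody(charset, body string) (string, error) {
	switch charset {
	case "utf-8", "us-ascii":
		return body, nil
	}
	return body, &charsetError{charset: charset}
}

func main() {
	body, err := decodeBody("x-unknown", "hello")
	if IsCharsetError(err) {
		fmt.Println("ignoring charset problem:", err)
	} else if err != nil {
		fmt.Println("fatal:", err)
		return
	}
	fmt.Println(body)
}
```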

RFC 5322 character use limitation

RFC 5322 limits use of characters in general (US-ASCII) and for some specific scopes. For example in headers only printable ASCII characters ('!'..'~'), SPACE and HTAB are allowed and \r\n sequence is used for line folding.

It's currently possible to pass some disallowed characters to go-message. The most noticeable case is a single \n in a header value, which is treated either as a regular character or as part of a \r\n sequence. As discussed in #77, the best way to handle such cases is to fail with an explicit error.

I have a few possible solutions, but none of them is clearly "the best":

  • Checking whether \n is part of the folding sequence; this solves only a subproblem, but does not change behavior on externally folded lines
  • Matching strings with a regex ("[\t -~]*" for headers); probably the cleanest, but may not perform well
  • Manually checking strings character by character; this may allow optimizations such as a single-pass foldLine, but is certainly not as fancy as the previous option

I don't know what would be better to implement (or maybe I don't see something else), so I'd like to discuss it here before coding.
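To make the third option concrete, here is a minimal sketch of a byte-by-byte check that accepts the same set as the regex "[\t -~]*", that is HTAB plus the range from SPACE to '~'. It is meant for an unfolded header value, and validHeaderValue is a hypothetical name.

```go
package main

import "fmt"

// validHeaderValue returns an error for the first byte that is neither
// HTAB nor in the printable range from SPACE to '~'.
func validHeaderValue(v string) error {
	for i := 0; i < len(v); i++ {
		c := v[i]
		if c == '\t' || (c >= ' ' && c <= '~') {
			continue
		}
		return fmt.Errorf("invalid byte 0x%02x at offset %d in header value", c, i)
	}
	return nil
}

func main() {
	for _, v := range []string{"plain value", "tab\tseparated", "line\nbreak"} {
		fmt.Printf("%q: %v\n", v, validHeaderValue(v))
	}
}
```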

Partial reads only leading to invalid multipart parsing

Hi there

For some reason, my IMAP emails aren't read entirely. Only the first 4096 bytes of the message seem to be read (which seems to be the peekBufferSize in Go's multipart), which leads to incomplete message bodies and then an error from multipart: `multipart: expecting a new Part; got line "You can also turn email notifications off:\r\n"`

My code is very simple (I'm using go-imap):

r := msg.GetBody("BODY[]")

		if r == nil {
			log.Errorf("Server returned empty body")
			continue
		}

		bodyParser, err := message.Read(r)
		if message.IsUnknownEncoding(err) {
			// Non-fatal error
			log.Noticef("Unknown encoding: %s", err)
		} else if err != nil {
			log.Errorf("Failed to read message: %s", err)
			continue
		}

		if mr := bodyParser.MultipartReader(); mr != nil {
			// This is a multipart message
			for {
				p, err := mr.NextPart()
				if err == io.EOF {
					break
				} else if err != nil {
					log.Errorf("Error parsing email part: %s", err) // This throws the error
					// log.Infof("%s", msgBytes)
					continue
				}

				t, _, _ := p.Header.ContentType()
				log.Infof("Part is type %s", t)

				part, err := ioutil.ReadAll(p.Body)

				log.Infof("Part: %s", part) // This shows an incomplete email
			}
		} else {
			t, _, _ := bodyParser.Header.ContentType()
			log.Infof("Non-multipart message of type %s", t)
		}

E-mails are simple Slack notification emails. I'm going to see if I can anonymize them and include them in the unit tests.

Tagging a new release

Hey! You have added the ability to decode more message charsets in your master branch. Could you tag a new release with those features, @emersion?

BTW really love your go email libraries so far.

some more exotic iso charmaps missing.

I found out today that iso-8859-9 is missing, causing applications using the charset package to just error out.
I'd rather keep using your package than just use a fork. PR is prepared if you want to add it.
Thank you for your awesome work.

make it a constant

header.go:57:10: string `text/plain` has 4 occurrences, make it a constant (goconst)
		return "text/plain", nil, nil

Simple example

Hi!
Where can I find a simple example of reading an email message body from an IMAP server using a login and password?

mail: access parent parts

It may be necessary to get the parent parts header when handling leaf parts, for instance to know if parts should be appended to each other (multipart/related) or if they are different versions of the same content (multipart/alternative).

But do we want to support multipart/related?

Tag master for vgo

Hi Emersion,

Thanks so much for the great library!

GoLand synced go.mod with v0.10.7 but build fails with:

header.Len undefined (type *"github.com/emersion/go-message/mail".Header has no field or method Len)

Compared master and v0.10.7 tag, bug is fixed in master. Can you tag master as v0.10.8 please?

Thanks!

mail: add a way to create non-multipart messages

This is enough for simple plaintext messages. API could look like this:

CreateSingleWriter(w io.Writer, msgHeader Header, inlineHeader *InlineHeader) (io.WriteCloser, error)

Where inlineHeader can be nil.

Encoding trimming

I received a mail with:

Content-Type: text/html; charset="UTF-8" 
Content-Transfer-Encoding: quoted-printable\n\n
X-Spam-Checker-Version: ... 

This led to an unhandled encoding error in encoding.go.
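One way to tolerate such values, sketched under the assumption that the stray characters after the encoding name are whitespace: trim and lowercase the header value before comparing it with the known encodings. normalizeEncoding is a hypothetical helper, not a function of go-message.

```go
package main

import (
	"fmt"
	"strings"
)

// normalizeEncoding strips surrounding whitespace (including stray CR and LF)
// and lowercases a Content-Transfer-Encoding value.
func normalizeEncoding(v string) string {
	return strings.ToLower(strings.TrimSpace(v))
}

func main() {
	raw := "Quoted-Printable \n\n"
	fmt.Printf("%q -> %q\n", raw, normalizeEncoding(raw))
}
```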

Linefolding introduces illegal spaces into Q-Encoded subjects

This seems related to #5 and also to the recent changes to fix #44.

The end-effect we are seeing are subject lines that only display correctly in some email clients. As a particular example, MacOS Mail will display subjects like this one: =?utf-8?q? =F0=9F=8E=9F_Test_folding_of_subject_header_with_unicode_start?= without decoding them.

I think this is explainable as a bug as follows.

background
If a subject line contains utf-8 that needs encoding (e.g. emoji), then mime.QEncoding.Encode("utf-8", s), as called by EncodeHeader, will produce one or more Q-Encoded words. For longer subject lines, formatHeaderField will call foldLine to split the subject line across multiple lines.

the bug

  1. Q-Encoded words may not contain spaces (https://tools.ietf.org/html/rfc2047#section-2)
  2. The folding algorithm should only introduce CRLF before whitespace (https://tools.ietf.org/html/rfc5322#section-2.2.3)

However the current folding algorithm appears to detect some Q-encoded sequences as quoted-printable and introduces CRLFSP sequences inside the quoted-words. When the receiving mail client subsequently unfolds the subject only the CRLF are removed (leaving the SP). This results in invalid Q-Encoded words containing a space which are displayed literally to the end-user.

This analysis is supported by the fact that short subject lines containing emoji that do not require folding display correctly in MacOS Mail.

If I am correct in my analysis then the three test cases starting at https://github.com/emersion/go-message/blob/master/textproto/header_test.go#L311 are incorrect. All introduce \r\n into the middle of Q-encoded words. The last two of the three also appear to start with sequences that are not valid Q-Encodings (because they contain literal spaces).
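To illustrate the two rules above, here is a small sketch of a folder that only ever inserts CRLF immediately before an existing space, so an encoded word is never split. It is not the library's foldLine; foldAtSpaces and roundTrip are hypothetical names, and a word longer than the limit is simply left unbroken.

```go
package main

import (
	"fmt"
	"mime"
	"strings"
)

// foldAtSpaces inserts CRLF before a space whenever adding the next word
// would push the current line past max. It never breaks inside a word.
func foldAtSpaces(s string, max int) string {
	var b strings.Builder
	lineLen := 0
	for i, w := range strings.Split(s, " ") {
		if i > 0 {
			if lineLen+1+len(w) > max {
				b.WriteString("\r\n")
				lineLen = 0
			}
			b.WriteByte(' ')
			lineLen++
		}
		b.WriteString(w)
		lineLen += len(w)
	}
	return b.String()
}

// roundTrip Q-encodes subject, folds it, unfolds it by removing the CRLFs
// (as a receiver would) and decodes the result again.
func roundTrip(subject string, max int) (string, error) {
	encoded := mime.QEncoding.Encode("utf-8", subject)
	folded := foldAtSpaces(encoded, max)
	unfolded := strings.ReplaceAll(folded, "\r\n", "")
	return new(mime.WordDecoder).DecodeHeader(unfolded)
}

func main() {
	subject := "🎟 Test folding of subject header with unicode start, long enough to need several encoded words"
	fmt.Printf("%q\n", foldAtSpaces(mime.QEncoding.Encode("utf-8", subject), 40))
	fmt.Println(roundTrip(subject, 40))
}
```

Because every CRLF sits directly in front of a space that was already there, removing the CRLFs restores the encoded words exactly.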

default charset?

hello, I've stumbled upon your library and it's really great for me.

However I stumbled upon a line that might be wrong:
https://github.com/emersion/go-message/blob/master/entity.go#L52
shouldn't there be a default encoding of ASCII i.e. windows1252?

Currently if an email does not define an encoding it is normally encoded in ascii.

The code should probably be something like:

ch, ok := mediaParams["charset"]
if !ok {
    ch = "ascii"
}
if converted, charsetErr := charset.Reader(ch, body); charsetErr != nil {
    err = unknownEncodingError{charsetErr}
} else {
    body = converted
}

(couldn't find an appropriate RFC yet)

Read part

I need only the text of the message, without attachments, so I fetch BODY[1]. When I try to read it, I get an error: malformed MIME header line: ------=_Part_431250_2039599834.1547645922837. If I request both BODY[HEADER] and BODY[1] when fetching, the responses arrive in different body sections.

Error return value not checked

We could do more checking of returned error values :-)

mail/mail.go:30:14: Error return value of `binary.Write` is not checked 
	binary.Write(&now, binary.BigEndian, time.Now().UnixNano())
mail/mail.go:31:11: Error return value of `rand.Read` is not checked 
	rand.Read(nonce)
multipart.go:66:21: Error return value of `w.CloseWithError` is not checked 
				w.CloseWithError(err)
multipart.go:71:21: Error return value of `w.CloseWithError` is not checked 
				w.CloseWithError(err)
textproto/header.go:304:16: Error return value of `r.UnreadByte` is not checked 
			r.UnreadByte()
textproto/header.go:317:14: Error return value of `r.UnreadByte` is not checked 
	r.UnreadByte()
textproto/multipart.go:234:9: Error return value of `io.Copy` is not checked 
	io.Copy(ioutil.Discard, p)
textproto/multipart.go:441:13: Error return value of `WriteHeader` is not checked 
	WriteHeader(&b, header)
writer.go:41:21: Error return value of `ww.mw.SetBoundary` is not checked 
			ww.mw.SetBoundary(mediaParams["boundary"])

Fetching message documentation error

Fetching message Doc

This section:

switch h := p.Header.(type) {
		case mail.TextHeader:
			// This is the message's text (can be plain-text or HTML)
			b, _ := ioutil.ReadAll(p.Body)
			log.Println("Got text: %v", string(b))
		case mail.AttachmentHeader:
			// This is an attachment
			filename, _ := h.Filename()
			log.Println("Got attachment: %v", filename)
		}

TextHeader does not even exist... I changed it to InlineHeader then it gives the following error.

impossible type switch case: part.Header (type "github.com/emersion/go-message/mail".PartHeader) cannot have dynamic type "github.com/emersion/go-message/mail".InlineHeader (Add method has pointer receiver)

and

impossible type switch case: part.Header (type "github.com/emersion/go-message/mail".PartHeader) cannot have dynamic type "github.com/emersion/go-message/mail".AttachmentHeader (Add method has pointer receiver)

And I can't find a way to find out the header... Any idea?
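The compiler message "Add method has pointer receiver" suggests that the type switch has to name pointer types. The sketch below reproduces the situation with stand-in types declared locally; they only mimic the shape implied by the error message and are not the definitions from the mail package.

```go
package main

import "fmt"

// PartHeader, InlineHeader and AttachmentHeader are local stand-ins.
type PartHeader interface {
	Add(key, value string)
}

type InlineHeader struct{ fields map[string][]string }

// Add has a pointer receiver, so only *InlineHeader implements PartHeader.
func (h *InlineHeader) Add(key, value string) {
	if h.fields == nil {
		h.fields = map[string][]string{}
	}
	h.fields[key] = append(h.fields[key], value)
}

type AttachmentHeader struct{ name string }

// Add also has a pointer receiver.
func (h *AttachmentHeader) Add(key, value string) {
	if key == "filename" {
		h.name = value
	}
}

func (h *AttachmentHeader) Filename() (string, error) { return h.name, nil }

// describe switches on pointer types; "case InlineHeader:" would not compile
// here because the value type does not implement PartHeader.
func describe(h PartHeader) string {
	switch h := h.(type) {
	case *InlineHeader:
		return "inline"
	case *AttachmentHeader:
		filename, _ := h.Filename()
		return "attachment: " + filename
	}
	return "unknown"
}

func main() {
	att := &AttachmentHeader{}
	att.Add("filename", "report.pdf")
	fmt.Println(describe(&InlineHeader{}))
	fmt.Println(describe(att))
}
```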

Option to turn off auto encoding and charset handling

First I want to say that this library is really awesome! Huge thanks.

I hope we can have an option to turn off the auto encoding/decoding and charset handling feature.

The reason is:

  1. There are too many non-conformant emails in the real world, and we want to handle them in a customized way. For example, we would process those with a supported charset and pass through those with an unknown charset instead of failing.
  2. In some cases encoding/decoding is not necessary. Turning it off would improve performance.

Thank you

Add message.Pipe

Same as io.Pipe but for messages. This would allow chaining message transformations without having to format and parse messages at each step.
