Comments (11)
The hard coded use of utf-8 is likely the cause of bug #44 as well
from offlineimap3.
@thekix yes, sorry I forgot to close it after packaging the fix in Debian.
from offlineimap3.
Another user reported the same issue at https://bugs.debian.org/981685
from offlineimap3.
I'm affected by this as well. I'm using Danish characters in my email signature and have set send_charset = "us-ascii:iso-8859-1:utf-8" in my muttrc. This allows mutt to recode my emails down to the first character set that works, which in most cases is ISO-8859-1, which offlineimap3 seems unhappy about.
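To illustrate the fallback described above, here is a rough Python sketch. It assumes mutt tries each charset in send_charset from left to right and keeps the first one that can represent the whole text. The function name first_fitting_charset is hypothetical, and this is only an illustration of the idea, not mutt's actual code.

```python
def first_fitting_charset(text, send_charset="us-ascii:iso-8859-1:utf-8"):
    """Return the first charset in the colon-separated list that can encode text.

    Hypothetical helper: it mimics the left-to-right fallback only.
    """
    for charset in send_charset.split(":"):
        try:
            text.encode(charset)
        except UnicodeEncodeError:
            # This charset cannot represent the text, try the next one.
            continue
        return charset
    return None


print(first_fitting_charset("Best regards"))              # plain ASCII text
print(first_fitting_charset("Med venlig hilsen, Søren"))  # needs a Latin-1 character
print(first_fitting_charset("Price: 10 €"))               # not representable in Latin-1
```

With this ordering, a signature containing Danish characters ends up as ISO-8859-1 rather than UTF-8, which is the case that triggers the bug.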
Exciting with the Python 3 version of OfflineIMAP!
from offlineimap3.
Like @ahf , I have noticed this on sent mail using mutt as well.
Furthermore, the patch listed at Debian bug 981485 will result in issues when the message is synced to an IMAP server, since the encoding is hard coded to utf-8. This creates a discrepancy between the content type listed in the email header and the actual encoding. In other words, the email will be encoded as 'utf-8' but say it's encoded as 'iso-8859-1', resulting in mangled text when viewed in an email client.
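The mismatch can be reproduced in a few lines of Python. This is a minimal sketch of the effect only: the helper name view_with_declared_charset is made up for the example and nothing here is taken from the offlineimap code.

```python
def view_with_declared_charset(text, actual_charset, declared_charset):
    """Encode text with one charset, then decode it the way a mail client
    would if the header declared a different charset."""
    body = text.encode(actual_charset)
    return body.decode(declared_charset)


original = "rødgrød"

# Body written as UTF-8 while the header still says iso-8859-1.
shown = view_with_declared_charset(original, "utf-8", "iso-8859-1")

print(original)
print(shown)  # each "ø" turns into two unrelated characters
```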
So a proper fix would either need to mangle the original message to change the encoding type, or the code will need to factor in and store the encoding so that it can be properly encoded/decoded at various points throughout the software.
A workaround in the interim is to set send_charset = "us-ascii:utf-8" and avoid using other charsets like 'iso-8859-1'. Changing the mutt configuration to get around this offlineimap bug is not ideal, but it sidesteps the issue of composing messages in mutt for the time being, at the cost of a few extra bytes here and there.
from offlineimap3.
I am affected by this bug as well (the change of encoding from utf-8 to iso-8859-1 is not due to German umlauts in my case, but to the “&” character).
Being the user who reported bug #44, I think that the two bugs are probably related indeed.
from offlineimap3.
Hello,
this bug is very interesting, but IMO, it is hard to solve it in the right way :-)
I will try to explain it:
    # Interface from BaseFolder
    def getmessage(self, uid):
        """Return the content of the message."""
        filename = self.messagelist[uid]['filename']
        filepath = os.path.join(self.getfullname(), filename)
        file = open(filepath, 'rt')
        retval = file.read()
        file.close()
        # TODO: WHY are we replacing \r\n with \n here? And why do we
        # read it as text?
        return retval.replace("\r\n", "\n")
This function reads the message as text (rt), then replaces the carriage returns. Probably this is not the right way to do it, and we should simply read the message as binary (note the rb), something like:
    # Interface from BaseFolder
    def getmessage(self, uid):
        """Return the content of the message."""
        filename = self.messagelist[uid]['filename']
        filepath = os.path.join(self.getfullname(), filename)
        file = open(filepath, 'rb')
        retval = file.read()
        file.close()
        return retval
The problem is that we then need to make more changes in other parts of the code. We need to check the header to read some values.
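As a sketch of what "read as binary, then check the header" could look like, the example below uses Python's standard email package. The helper name read_message and the sample message are assumptions made for illustration. This is not the code that was eventually merged into offlineimap3.

```python
import os
import tempfile
from email import message_from_bytes, policy


def read_message(filepath):
    """Read a message file as raw bytes and parse it without guessing a charset.

    Hypothetical helper: returns the untouched bytes plus a parsed message
    object that gives access to the headers.
    """
    with open(filepath, "rb") as f:
        raw = f.read()
    return raw, message_from_bytes(raw, policy=policy.default)


# Sample message whose body is ISO-8859-1, as mutt might produce it.
raw_in = (
    b"From: a@example.org\r\n"
    b"Subject: test\r\n"
    b"Content-Type: text/plain; charset=iso-8859-1\r\n"
    b"Content-Transfer-Encoding: 8bit\r\n"
    b"\r\n"
    b"r\xf8dgr\xf8d\r\n"
)

with tempfile.TemporaryDirectory() as tmpdir:
    path = os.path.join(tmpdir, "msg")
    with open(path, "wb") as f:
        f.write(raw_in)
    raw, msg = read_message(path)

print(msg["Subject"])             # header values are available after parsing
print(msg.get_content_charset())  # charset declared in the Content-Type header
print(msg.get_content().strip())  # body decoded with the declared charset
```

The raw bytes stay exactly as they were on disk, so they can be written back to the server unchanged, while the parsed object is only used to look up header values.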
IMO, there are three options to solve the problem:
- Deep analysis of this code and rewrite some functions. Read it as binary,...
- Include a new option in the configuration file to specify the charset (like mutt)
- Try to detect the charset
I will try with the last option, because it is backward compatible with offlineimap2 and it is faster than option 1 (I am very very busy these days)
Regards,
kix
PS. I won't close this bug, because I will try to look into option 1.
PS2. Please note that the new patch requires a new library, chardet (python3-chardet in Debian). @sudipm-mukherjee please check the Depends.
PS3. If the patch is working for you, please add a smile or something to this post (as feedback). Thanks.
from offlineimap3.
@thekix - I have made significant progress with option 1 with #48, if you have time to give it a look.... I can create a pull request if desired as I am just getting to testing the changes.
The problem with option 3 is that it won't work, given that messages can contain multiple encodings, and simply detecting one encoding doesn't save you when you go to write the message back to the server. I will explain further in the #53 comments.
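A small sketch of the multiple-encodings case, using only the standard email package. The sample message and the helper name decode_parts are invented for this example. It shows one multipart message with two text parts that each declare a different charset, so no single detected charset fits the whole file.

```python
from email import message_from_bytes, policy

# One message, two text parts, two different charsets.
raw = (
    b"MIME-Version: 1.0\r\n"
    b'Content-Type: multipart/mixed; boundary="XX"\r\n'
    b"\r\n"
    b"--XX\r\n"
    b"Content-Type: text/plain; charset=iso-8859-1\r\n"
    b"Content-Transfer-Encoding: 8bit\r\n"
    b"\r\n"
    b"r\xf8dgr\xf8d\r\n"
    b"--XX\r\n"
    b"Content-Type: text/plain; charset=utf-8\r\n"
    b"Content-Transfer-Encoding: 8bit\r\n"
    b"\r\n"
    b"r\xc3\xb8dgr\xc3\xb8d\r\n"
    b"--XX--\r\n"
)


def decode_parts(raw_bytes):
    """Return (declared charset, decoded text) for every text part."""
    msg = message_from_bytes(raw_bytes, policy=policy.default)
    result = []
    for part in msg.walk():
        if part.get_content_maintype() == "text":
            result.append((part.get_content_charset(), part.get_content().strip()))
    return result


for charset, text in decode_parts(raw):
    print(charset, text)

# Decoding the whole file with a single charset cannot be right for both parts.
try:
    raw.decode("utf-8")
    whole_file_is_utf8 = True
except UnicodeDecodeError:
    whole_file_is_utf8 = False
print(whole_file_is_utf8)
```

Decoding per part, based on each part's own header, gives the same text for both parts, whereas treating the whole file as UTF-8 fails outright and treating it as ISO-8859-1 mangles the second part.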
from offlineimap3.
Hi @jishac
Of course, IMO the option 1 is the best. I was checking your repo/patch, amazing!
Some comments:
Please double check the syntax, for example, some spaces here:
msg.add_header(headername,headervalue)
return msg.get_all(headername,[])
I think you are replacing the function "get_message_date()" (file emailutil):
- message_timestamp = emailutil.get_message_date(content, 'Date')
+ message_timestamp = self.get_message_date(msg, 'Date')
IMO it is better to change it in the same file. Check that you are changing all calls:
kix@inle:~/src/offlineimap3$ rgrep get_message_date * | grep -v binar
offlineimap/emailutil.py:def get_message_date(content, header='Date'):
offlineimap/folder/IMAP.py: rtime = emailutil.get_message_date(content)
offlineimap/folder/Maildir.py: message_timestamp = emailutil.get_message_date(content, 'Date')
offlineimap/folder/Maildir.py: message_timestamp = emailutil.get_message_date(
offlineimap/folder/Maildir.py: datestr = emailutil.get_message_date(content)
offlineimap/folder/Maildir.py: date = emailutil.get_message_date(content, 'Date')
offlineimap/folder/Maildir.py: datestr = emailutil.get_message_date(content)
kix@inle:~/src/offlineimap3$
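For reference, a version of get_message_date() that stays in emailutil but accepts an already parsed message object could look roughly like the sketch below. The signature with msg as first argument is an assumption based on the diff above (the existing function takes the message content as a string), and the body is only an illustration built on email.utils, not the project's code.

```python
from email import message_from_bytes
from email.utils import mktime_tz, parsedate_tz


def get_message_date(msg, header="Date"):
    """Return the given date header as seconds since the epoch (UTC).

    Sketch only: takes a parsed message object and returns None if the
    header is missing or cannot be parsed.
    """
    value = msg.get(header)
    if value is None:
        return None
    datetuple = parsedate_tz(str(value))
    if datetuple is None:
        return None
    return mktime_tz(datetuple)


msg = message_from_bytes(b"Date: Thu, 01 Jan 1970 01:00:00 +0100\r\n\r\nbody\r\n")
print(get_message_date(msg))              # 0, i.e. midnight UTC on 1970-01-01
print(get_message_date(msg, "Received"))  # None, the header is not present
```

Keeping a single helper in one module means every call site listed above would pass the parsed message instead of the raw content.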
Please, could you create a new pull request with these changes and with the current offlineimap status? (remove my stuff and include your code).
Again, thanks a lot for your amazing work!!
Best regards,
kix
from offlineimap3.
Hello @sudipm-mukherjee
probably we can close this bug. Is it ok?
Regards!
kix
from offlineimap3.
Thanks!!
from offlineimap3.