Giter VIP home page Giter VIP logo

Comments (17)

dnebauer avatar dnebauer commented on June 17, 2024 1

With kix' new script I am now happily running offlineimap3. You are correct that this problem was with the pythonscript and not core offlineimap, but I suspect this solution will be useful to many users. Thank you both for helping me through the upgrade.

from offlineimap3.

sudipm-mukherjee avatar sudipm-mukherjee commented on June 17, 2024 1

Thanks @thekix

from offlineimap3.

thekix avatar thekix commented on June 17, 2024

HI @sudipm-mukherjee

after checking offlineimap and this issue, IMO the problem is not related to offlineimap. I will try to explain the problem.

Using Python 2

offlineimap is using the mailfolders as byte objects. Then, the object is sent to the lambda function used in nametrans:

nametrans = lambda foldername: foldername.decode('imap4-utf-7').encode('utf-8')

This call is more or less:

def nametrans(byte_object_in_imap4-utf-7):
    string_in_imap4-utf-7 = byte_object_in_imap4-utf-7.decode('imap4-utf-7')
    byte_object_in_utf8 = string_in_imap4-utf-7.encode('utf-8')
    return byte_object_in_utf8

So the function receives a byte object and returns a byte object. It uses an string to change one format to other.

Using Python 3

offlineimap is using the mailfolders as (utf-8) string objects. Then, the object is sent to the lambda function used in nametrans. This function is waiting for a bytes object and then it fails in the byte_object_in_imap4-utf-7.decode('imap4-utf-7') and we get the error:

   ERROR: While attempting to sync account 'testmail'
      'str' object has no attribute 'decode'

Then?

IMO, the problem is not offlineimap, and then, we should not change offlineimap for it. In Python 3, all strings objects are already UTF-8, so we don't need convert them. If we need change the string format, we need change it in a different way in the attached offlineimap.py script.

The new script should convert an (utf-8) "string imap_utf7" to (utf-8) "string utf-8".

offlineimap is doing this conversion in some places. For example, check the function "decode_mailbox_name" and other functions like "IMAP_utf8" and "utf8_IMAP" (in the same file, next functions):

https://github.com/OfflineIMAP/offlineimap3/blob/master/offlineimap/imaputil.py#L321

I saw these functions when I migrated offlineimap to Python 3. Probably these functions are not perfect. I check them and check the functions used in IMAPClient (https://github.com/mjs/imapclient), and probably the functions in imapclient could be better. But this is other history.

Probably you can check this code: https://github.com/MarechJ/py3_imap_utf7 to use as lambda in the offlineimap.py file. Seems to be better than imapclient. I didn't test it, but probably you can use it as (including it as pythonfile; do not forget include the import subprocess if you are using the password in remotepasseval):

nametrans = lambda foldername: decode(encode(foldername))

Please, try this new option. If the new options is working and you agree with my explanation, feel free to close this bug.

Best Regards,
Rodolfo (kix)

from offlineimap3.

dnebauer avatar dnebauer commented on June 17, 2024

I'm the downstream bug reporter that @sudipm-mukherjee has been helping. Using your link I created the following pythonfile:

import binascii
# import codecs
# from base64 import b64encode, b64decode

# encoding


def modified_base64(s):
    s = s.encode('utf-16be')
    return binascii.b2a_base64(s).rstrip(b'\n=').replace(b'/', b',')
    # return b64encode(s, b"+,")#.rstrip('\n=')


def doB64(_in, r):
    if _in:
        r.append(b'&' + modified_base64(''.join(_in)) + b'-')
    del _in[:]


def encode(s: str):
    r = []
    _in = []
    for c in s:
        ordC = ord(c)
        if 0x20 <= ordC <= 0x25 or 0x27 <= ordC <= 0x7e:
            doB64(_in, r)
            r.append(c.encode())
        elif c == '&':
            doB64(_in, r)
            r.append(b'&-')
        else:
            _in.append(c)
    doB64(_in, r)
    return b''.join(r)


# decoding


def modified_unbase64(s):
    # b = b64decode(s, b"+,")
    b = binascii.a2b_base64(s.replace(b',', b'/') + b'===')
    return b.decode('utf-16be')


def decode(s: bytes):
    r = []
    decode = bytearray()
    for c in s:
        if c == ord('&') and not decode:
            decode.append(ord('&'))
        elif c == ord('-') and decode:
            if len(decode) == 1:
                r.append('&')
            else:
                ab = modified_unbase64(decode[1:])
                r.append(ab)
            decode = bytearray()
        elif decode:
            decode.append(c)
        else:
            r.append(chr(c))

    if decode:
        r.append(modified_unbase64(decode[1:]))

    bin_str = ''.join(r)
    return bin_str


# required for reading passwords

import subprocess

and used the nametrans lambda you suggested above:

nametrans = lambda foldername: decode(encode(foldername))

Clearly, however, my python-fu is not strong enough. This configuration resulted in the same error @sudipm-mukherjee reported at the top of this issue. Playing with the function names in the pythonfile shows that the "attribute 'decode'" in the error message refers to the function "decode" in the pythonfile -- if you change the name of the "decode" function in the pythonfile and nametrans call, the changed name appears in the error message.

from offlineimap3.

thekix avatar thekix commented on June 17, 2024

Hi @dnebauer!

Thanks for your comment. Could you send me the foldername with problems?

Regards,
kix

from offlineimap3.

dnebauer avatar dnebauer commented on June 17, 2024

I'm not sure I understand your question. The error I reported occurs with every account I have defined, if I use the pythonfile and nametrans lambda shown above. The defined accounts are 'austdomains' and 'bigpond'. The 'austdomains' account has the mailboxes 'INBOX', 'INBOX.Archive', 'INBOX.Drafts', 'INBOX.Junk', 'INBOX.Sent', 'INBOX.spam', and 'INBOX.Trash'. The 'bigpond' account has the mailboxes 'Deleted', 'Drafts', 'INBOX', 'Junk', 'Sent', and 'Trash'. Let me know if you were after something else.

from offlineimap3.

sudipm-mukherjee avatar sudipm-mukherjee commented on June 17, 2024

I am unable to reproduce this error with the file that @dnebauer sent me. This is how I am testing:

  1. Delete all the mail folders from laptop.
  2. create new label "abc®D" in my test gmail account.
  3. use the offlineimap.py that @dnebauer sent me
  4. Use nametrans = lambda foldername: decode(encode(foldername)) in offlineimaprc
  5. sync using offlineimap -o -u quiet
    I dont get any error, and after syncing I do see a local folder named abc&AK4-D.

Without the nametrans and all the same steps as above I get a folder with name 'abc&AK4-D'. Only difference between these two are the '

from offlineimap3.

thekix avatar thekix commented on June 17, 2024

Hi,

the problem is the ampersand character. In utf-7 the character should be a plus (+), so the folder name should be abc+AK4-D instead of abc&AK4-D. Then, the folder name is converted to abc@D

I did a modified version of the conversion script:

import subprocess  # required for reading passwords
import re


def kix_utf7_to_utf8(str_imap):
    """
    This function converts an IMAP_UTF-7 string object to UTF-8.
    It first replaces the ampersand (&) character with plus character (+)
    in the cases of UTF-7 character and then decode the string to utf-8.

    If the str_imap string is already UTF-8, return it.

    For example, "abc&AK4-D" is translated to "abc+AK4-D"
    and then, to "abc@D"

    Example code:
    my_string = "abc&AK4-D"
    print(kix_utf7_to_utf8(my_string))

    Args:
        bytes_imap: IMAP UTF7 string

    Returns: UTF-8 string

    """
    try:
        str_utf7 = re.sub('&(\w{3}\-)', '+\\1', str_imap)
        str_utf8 = str_utf7.encode('utf-8').decode('utf_7')
        return str_utf8
    except UnicodeDecodeError:
        # Error decoding, already utf-8. return original string.
        return str_imap

The nametrans should be:

nametrans = lambda foldername: kix_utf7_to_utf8(foldername)

It works fine downloading the folder, but I have a problem when offlineimap reads the local folder and try to create it on the remote. It fails. Then, I created this modified version:

import subprocess  # required for reading passwords
import re


def kix_utf7_to_utf8(str_imap):
    """
    This function converts an IMAP_UTF-7 string object to UTF-8.
    It first replaces the ampersand (&) character with plus character (+)
    in the cases of UTF-7 character and then decode the string to utf-8.

    If the str_imap string is already UTF-8, return it.

    For example, "abc&AK4-D" is translated to "abc+AK4-D"
    and then, to "abc@D"

    Example code:
    my_string = bytes("abc&AK4-D", 'utf-8')
    print(kix_utf7_to_utf8(my_string))

    Args:
        bytes_imap: IMAP UTF7 bytes

    Returns: UTF-8 string

    """
    try:
        str_utf7 = re.sub('&(\w{3}\-)', '+\\1', str_imap)
        str_utf8 = str_utf7.encode('utf-8').decode('utf_7')
        return str_utf8
    except UnicodeDecodeError:
        # Error decoding, already utf-8. convert to utf-7
        str_utf8 = str_imap.encode('utf_7').decode('utf-8')
        str_utf7 = re.sub('\+(\w{3}\-)', '&\\1', str_utf8)
        return str_utf7

It works fine the first run, but the second run fails again. The error message is:

 ERROR: INFINITE FOLDER CREATION DETECTED! Folder 'abc®D' (repository 'Local-gmail') would be created as folder 'abc®D' (repository 'Remote-gmail'). The latter becomes 'abc&AK4-D' in return, leading to infinite folder creation cycles.
 SOLUTION: 1) Do set your nametrans rules on both repositories so they lead to identical names if applied back and forth. 2) Use folderfilter settings on a repository to prevent some folders from being created on the other side.
WARNING:OfflineImap:ERROR: INFINITE FOLDER CREATION DETECTED! Folder 'abc®D' (repository 'Local-gmail') would be created as folder 'abc®D' (repository 'Remote-gmail'). The latter becomes 'abc&AK4-D' in return, leading to infinite folder creation cycles.
 SOLUTION: 1) Do set your nametrans rules on both repositories so they lead to identical names if applied back and forth. 2) Use folderfilter settings on a repository to prevent some folders from being created on the other side.

Probably I need create two different functions, one for every foldersettings (Remote and Local). I think this is the way, I will continue later. You can work on this direction if you want.

Regards,
kix

from offlineimap3.

dnebauer avatar dnebauer commented on June 17, 2024

Since I never intend to create new folders on my local repository to be synced to a remote repository, I don't see that I will ever trigger the problem you had with your first conversion script above. Is that correct?

That first conversion script works perfectly for the example from @sudipm-mukherjee in that it searches for '&' followed by three alphanumeric characters and a dash. Is that the correct universal algorithm? I thought all ampersands had to be replaced by plus characters.

from offlineimap3.

sudipm-mukherjee avatar sudipm-mukherjee commented on June 17, 2024

Thanks for checking @thekix.
As @dnebauer mentioned, I too think creating folder locally and then syncing with remote is a very rare usecase and most of the average user will not face that problem. If you want I can keep this issue open for this usecase or close this one as I think the main problem that @dnebauer had is solved now.

from offlineimap3.

thekix avatar thekix commented on June 17, 2024

Hi,

@dnebauer , about the question about conversion script, you have more info here: https://en.wikipedia.org/wiki/UTF-7 and here http://string-functions.com/encodingtable.aspx?encoding=65000&decoding=20127 As you can see, is +three characters-.
we need change the ampersand (&) to plus to make this codification in the previous examples.

@sudipm-mukherjee IMO, there is no problem with offlineimap, and we provided an script and other information about this problem. IMO, is better close the bug if the problem is solved.

Finally, I can confirm that my dovecot test server returns an @ character instead of &AK4- and offlineimap works fine.

So, if all of us agree that the problem is solved, we can close the bug.

from offlineimap3.

sudipm-mukherjee avatar sudipm-mukherjee commented on June 17, 2024

With kix' new script I am now happily running offlineimap3. You are correct that this problem was with the pythonscript and not core offlineimap, but I suspect this solution will be useful to many users. Thank you both for helping me through the upgrade.

I agree with @dnebauer that the script @thekix gave might be useful for other users also.
@thekix do you think it might be useful to add the modified pythonfile in the contrib folder and use that commit to close this issue? If not I can directly close this issue.

from offlineimap3.

thekix avatar thekix commented on June 17, 2024

Hi @sudipm-mukherjee

Yes, of course. I can add the file to the contrib folder. What script do you think is better (please, paste it)? About the filename, nametrans_imap_to_utf8_example.py could be fine? I will add an small header to the file and I will remove the import subprocess # required for reading passwords. Comments are welcome.

Regards,
kix

from offlineimap3.

sudipm-mukherjee avatar sudipm-mukherjee commented on June 17, 2024

@dnebauer is an user who is using nametrans, so I think I will leave it on @dnebauer to say which script is better.

from offlineimap3.

dnebauer avatar dnebauer commented on June 17, 2024

I appreciate @sudipm-mukherjee leaving the decision to me, but it should be pretty obvious by now that I'm hardly an expert on offlineimap, python, or utf encodings. For what it's worth, though, I went with a minor variation on the first script by @thekix because I was only interested in a utf7-to-utf8 conversion, given that I never want to create local folders to be synchronised back to a remote server, and the second script can potentially return utf7.

For what it's worth, here is the script I'm now using:

import subprocess  # noqa, required for reading passwords
import re


def convert_utf7_to_utf8(str_imap):
    """
    This function converts an IMAP_UTF-7 string object to UTF-8.
    It first replaces the ampersand (&) character with plus character (+)
    in the cases of UTF-7 character and then decode the string to utf-8.

    If the str_imap string is already UTF-8, return it.

    For example, "abc&AK4-D" is translated to "abc+AK4-D"
    and then, to "abc@D"

    Example code:
    my_string = "abc&AK4-D"
    print(kix_utf7_to_utf8(my_string))

    Args:
        bytes_imap: IMAP UTF7 string

    Returns: UTF-8 string

    Source: https://github.com/OfflineIMAP/offlineimap3/issues/23

    """
    try:
        str_utf7 = re.sub(r'&(\w{3}\-)', '+\\1', str_imap)
        str_utf8 = str_utf7.encode('utf-8').decode('utf_7')
        return str_utf8
    except UnicodeDecodeError:
        # error decoding because already utf-8, so return original string
        return str_imap

You will see I added the raw string prefix r before the re.sub pattern. This prevents linters complaining that \w and \- are invalid escape sequences.

Thanks again to both of you for all your help.

from offlineimap3.

thekix avatar thekix commented on June 17, 2024

Thanks @sudipm-mukherjee @dnebauer !

Uploaded!
kix

from offlineimap3.

hrehfeld avatar hrehfeld commented on June 17, 2024

I found the above unsatisfying, and wasn't sure what exactly to put where.

Using python-imap-tools I adapted what I had for python2 in .offlineimap.py, which seems to work for offlineimap3 and will hopefully work both ways (untested).

#!/usr/bin/python
from imap_tools.imap_utf7 import encode, decode

def from_imap_nametrans(foldername):
    assert isinstance(foldername, str), str
    foldername = foldername.encode('utf-8')
    foldername = decode(foldername)
    return foldername

def to_imap_nametrans(foldername):
    assert isinstance(foldername, str), foldername
    foldername = encode(foldername)
    foldername = foldername.decode('utf-8')
    return foldername

Then use nametrans = to_imap_nametrans for local repos and nametrans = from_imap_nametrans for remote repos in .offlineimaprc

ps. I find it extremely irritating that I have to convert Umlaute in foldernames with custom scripting. A lot of German users will be affected by this. As a user, I would heavily bin this into "bug".

from offlineimap3.

Related Issues (20)

Recommend Projects

  • React photo React

    A declarative, efficient, and flexible JavaScript library for building user interfaces.

  • Vue.js photo Vue.js

    🖖 Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.

  • Typescript photo Typescript

    TypeScript is a superset of JavaScript that compiles to clean JavaScript output.

  • TensorFlow photo TensorFlow

    An Open Source Machine Learning Framework for Everyone

  • Django photo Django

    The Web framework for perfectionists with deadlines.

  • D3 photo D3

    Bring data to life with SVG, Canvas and HTML. 📊📈🎉

Recommend Topics

  • javascript

    JavaScript (JS) is a lightweight interpreted programming language with first-class functions.

  • web

    Some thing interesting about web. New door for the world.

  • server

    A server is a program made to process requests and deliver data to clients.

  • Machine learning

    Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.

  • Game

    Some thing interesting about game, make everyone happy.

Recommend Org

  • Facebook photo Facebook

    We are working to build community through open source technology. NB: members must have two-factor auth.

  • Microsoft photo Microsoft

    Open source projects and samples from Microsoft.

  • Google photo Google

    Google ❤️ Open Source for everyone.

  • D3 photo D3

    Data-Driven Documents codes.