Comments (18)

jakeogh commented on June 25, 2024

Oops, it's not valid under IDNA2008: http://unicode.org/cldr/utility/character.jsp?a=2603 Closing.

from idna.

annevk commented on June 25, 2024

It's valid UTS46 though, which is what browsers use. You might want to reconsider this.

nlevitt commented on June 25, 2024

Nice utility: http://unicode.org/cldr/utility/idna.jsp?a=%E2%98%83.net

kjd commented on June 25, 2024

It's not valid UTS46 for IDNA 2008, it is only valid for IDNA 2003. Look for the "NV8" in the UTS46 table data. Now there may be an argument to add fall-through IDNA 2003 processing, but as of today this library only supports IDNA 2008.
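
As a rough illustration of the IDNA 2003 side of this distinction, the sketch below uses only Python's standard library, whose built-in `idna` codec implements IDNA 2003 (RFC 3490) rather than IDNA 2008. It is not the `idna` package discussed in this thread, and the variable names are made up for the example.

```python
# Standard library only: the built-in "idna" codec follows IDNA 2003 (RFC 3490).
import encodings.idna as idna2003

snowman = "\u2603"  # U+2603 SNOWMAN, allowed by IDNA 2003 but not by IDNA 2008

# ToASCII works on a single label.
label_ascii = idna2003.ToASCII(snowman)

# The codec splits on dots itself, so it can take a whole domain name.
domain_ascii = (snowman + ".net").encode("idna")

# Decoding goes back to the Unicode form.
round_trip = domain_ascii.decode("idna")

print(label_ascii, domain_ascii, round_trip)
```

An IDNA 2008 implementation is expected to reject the same label, which is the behaviour described above.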

jakeogh commented on June 25, 2024

+1 for optional 2003 fall-through, I run into real web corner cases that are IDNA2003.

annevk commented on June 25, 2024

@kjd UTS46 (the actual document) doesn't have a processing mode where this code point is somehow rejected and this is the first implementation I have seen that does such a thing.

jribbens commented on June 25, 2024

That's because this isn't an implementation of UTS46, it's an implementation of IDNA2008. If for some reason you want only UTS46 and not IDNA2008 then presumably you can call idna.uts46_remap directly.

nlevitt commented on June 25, 2024

idna.uts46_remap doesn't encode anything though...

kjd commented on June 25, 2024

import idna
import encodings.idna as idna2003

def questionable_encode(s):
    # Try strict IDNA 2008 first, with UTS46 mapping applied beforehand.
    try:
        return idna.encode(s, uts46=True)
    except idna.IDNAError:
        # Fall back to the standard library's IDNA 2003 ToASCII.
        try:
            return idna2003.ToASCII(idna.uts46_remap(s))
        except Exception:
            raise idna.IDNAError("Input string is supported by no flavour of IDNA")

>>> questionable_encode("\u2603")
b'xn--n3h'

nlevitt commented on June 25, 2024

Thanks! Stick that function in the library ;)

So is this a bona fide implementation of uts46? (Sorry for still not being totally clear on what the spec entails)

nlevitt commented on June 25, 2024

Doh.

>>> questionable_encode("\u2603.net")
b'xn--.net-4g3b'

kjd commented on June 25, 2024

This was just a quick function I rattled off the top of my head, not tested. You probably need to do a few more lines to break down the input string into individual labels to use the idna2003 portion. If we added support to do something like this in this library (see issue #18) then it will have a proper test suite etc.
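
A minimal sketch of the label-splitting point, using only the standard library's IDNA 2003 module. `to_ascii_2003` is a hypothetical helper name, and the sketch assumes the input has no empty labels (for example no trailing dot), since `ToASCII` rejects those.

```python
import encodings.idna as idna2003

def to_ascii_2003(domain: str) -> bytes:
    """Apply the IDNA 2003 ToASCII step to each dot-separated label separately."""
    return b".".join(idna2003.ToASCII(label) for label in domain.split("."))

name = "\u2603.net"

# ToASCII treats its argument as ONE label, so the dot and "net" get
# folded into the punycode output instead of staying a separate label.
whole = idna2003.ToASCII(name)

# Encoding label by label keeps "net" as its own plain ASCII label.
per_label = to_ascii_2003(name)

print(whole, per_label)
```

The first result is the malformed kind of output reported above; the second is the expected `xn--n3h.net`.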

nlevitt commented on June 25, 2024

Ok, thanks. This version works for at least these two test inputs:

import idna
import encodings.idna as idna2003

def questionable_encode(s):
    # Try strict IDNA 2008 first, with UTS46 mapping applied beforehand.
    try:
        return idna.encode(s, uts46=True)
    except idna.IDNAError:
        # Fall back to IDNA 2003, encoding each label separately.
        try:
            labels = idna.uts46_remap(s).split(".")
            punycode_labels = [idna2003.ToASCII(label) for label in labels]
            return b".".join(punycode_labels)
        except Exception:
            raise idna.IDNAError("Input string is supported by no flavour of IDNA")

>>> questionable_encode("\u2603.net")
b'xn--n3h.net'
>>> questionable_encode("\u2603")
b'xn--n3h'

nlevitt commented on June 25, 2024

"If we added support to do something like this in this library (see issue #18) then it will have a proper test suite etc."

Yes please!

nlevitt commented on June 25, 2024

FYI the function above mishandles faß.de. Correct result is fass.de

>>> questionable_encode('faß.de')
b'xn--fa-hia.de'

In fact a number of the examples from http://unicode.org/cldr/utility/idna.jsp don't work.

annevk commented on June 25, 2024

No, fass.de is only the correct result for transitional mode, which is not what we want to align on.
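
To make the two outcomes concrete, here is a standard-library-only sketch. It assumes that the IDNA 2003 mapping of ß to "ss" stands in for the transitional result, and it hand-rolls the "keep ß" result with the `punycode` codec. `keep_and_punycode` is a hypothetical helper, not a full UTS46 or IDNA 2008 implementation (no validation, no mapping).

```python
def keep_and_punycode(domain: str) -> bytes:
    """Punycode each non-ASCII label as-is, without any character mapping."""
    out = []
    for label in domain.split("."):
        if label.isascii():
            out.append(label.encode("ascii"))
        else:
            out.append(b"xn--" + label.encode("punycode"))
    return b".".join(out)

name = "fa\u00df.de"  # "faß.de"

# IDNA 2003 (stdlib codec) maps ß to "ss" before encoding.
mapped = name.encode("idna")

# Keeping ß and encoding it directly gives a different ASCII name.
kept = keep_and_punycode(name)

print(mapped, kept)
```

The two byte strings name different DNS hosts, which is why the choice of processing mode matters here.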

nlevitt commented on June 25, 2024

Oh. Well, Chromium does fass.de, Firefox does xn--fa-hia.de.

annevk commented on June 25, 2024

Yeah I know, bugs have been filed.
