SmaugPool commented on August 17, 2024

Problem

We need to decide which escape sequences are valid in UPLC string constants.

For now, the Aiken UPLC parser does not support any:

rule string() -> String
    = "\"" s:[^ '"']* "\"" { String::from_iter(s) }
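
To make the consequence concrete, here is a minimal std-only sketch that mimics the rule above (opening quote, any characters other than `"`, closing quote). The function name `parse_string_no_escapes` is hypothetical and is not part of the Aiken code base; it only illustrates the behaviour.

```rust
/// Hypothetical stand-in for the grammar rule above: an opening quote,
/// any characters other than `"`, then a closing quote.
/// Returns the string contents and the unparsed remainder.
pub fn parse_string_no_escapes(input: &str) -> Option<(String, &str)> {
    // The literal must start with a double quote.
    let body = input.strip_prefix('"')?;
    // The first double quote found always ends the literal,
    // even if it is preceded by a backslash.
    let end = body.find('"')?;
    Some((body[..end].to_string(), &body[end + 1..]))
}
```

With such a rule, a backslash is kept as a literal character, so `"a\nb"` yields the four characters `a`, `\`, `n`, `b`, and there is no way to embed a double quote: `"say \"hi\""` ends at the first inner quote.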

Plutus Spec

The Plutus Core Spec says in Appendix A.1:

Concrete syntax for strings. Strings are represented as sequences of Unicode characters enclosed in
double quotes, and may include standard escape sequences.

However, although some escape sequences are standardized for particular languages, like C, there is as far as I know no such thing as "standard escape sequences".

PlutusTx

PlutusTx's conText seems to rely on megaparsec's charLiteral:

-- | Parser for string constants. They are wrapped in double quotes.    
conText :: Parser T.Text    
conText = lexeme . fmap T.pack $ char '\"' *> manyTill Lex.charLiteral (char '\"') 

Which implements the Haskell Report grammar rules:

The literal character is parsed according to the grammar rules defined in the Haskell report.

I'm not sure exactly what those rules support; it seems to be what is described here:
https://book.realworldhaskell.org/read/characters-strings-and-escaping-rules.html

This includes quite a few uncommon ones and uses \xHEX for Unicode escape sequences (instead of C's \uHEX or the common \u{HEX}, as in Rust).
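
The difference between the two hexadecimal styles can be sketched as follows. This is a std-only illustration, not megaparsec's or Aiken's implementation, and the names `hex_to_char`, `decode_backslash_x` and `decode_backslash_u_braces` are hypothetical. It assumes that the `\xHEX` form consumes every hex digit that follows (as I understand the Haskell lexical rules), while the `\u{HEX}` form is delimited by braces.

```rust
/// Hypothetical helper: converts hex digits to a Unicode scalar value.
/// Rejects empty input, non-hex characters, surrogates and out-of-range values.
fn hex_to_char(digits: &str) -> Option<char> {
    if digits.is_empty() || !digits.chars().all(|c| c.is_ascii_hexdigit()) {
        return None;
    }
    u32::from_str_radix(digits, 16).ok().and_then(char::from_u32)
}

/// `\xHEX` style: the digits run until the first non-hex character.
/// Returns the decoded character and the remaining input.
pub fn decode_backslash_x(s: &str) -> Option<(char, &str)> {
    let body = s.strip_prefix("\\x")?;
    let end = body
        .find(|c: char| !c.is_ascii_hexdigit())
        .unwrap_or(body.len());
    Some((hex_to_char(&body[..end])?, &body[end..]))
}

/// `\u{HEX}` style: the digits are explicitly delimited by braces.
/// Returns the decoded character and the remaining input.
pub fn decode_backslash_u_braces(s: &str) -> Option<(char, &str)> {
    let body = s.strip_prefix("\\u{")?;
    let end = body.find('}')?;
    Some((hex_to_char(&body[..end])?, &body[end + 1..]))
}
```

Under these assumptions, `\x41B` decodes to the single character U+041B because the trailing `B` is itself a hex digit, whereas `\u{41}B` decodes to `A` followed by a literal `B`. That ambiguity is one practical argument to weigh when choosing between the two syntaxes.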

Aiken

It may also make sense for the Aiken UPLC compiler to support the same escape sequences as the Aiken language.

For now, Aiken seems to support a few single-character escape sequences in its escape lexer, but no Unicode ones:

let escape = just('\\').ignore_then(        
    just('\\')            
        .or(just('/'))            
        .or(just('"'))            
        .or(just('b').to('\x08'))            
        .or(just('f').to('\x0C'))            
        .or(just('n').to('\n'))            
        .or(just('r').to('\r'))            
        .or(just('t').to('\t')),            
);         

I'm also not sure why it supports the odd \/ escape, since / does not require escaping.
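
For reference, the same escape table can be written as a small std-only function, which may be handy for testing whichever set is eventually chosen. This is only a sketch: `unescape` is a hypothetical name, it does not use the parser-combinator library from the snippet above, and rejecting unknown escapes with an error is an assumption rather than Aiken's documented behaviour.

```rust
/// Hypothetical helper: decodes the single-character escapes listed above.
/// Any other character after a backslash is reported as an error.
pub fn unescape(input: &str) -> Result<String, String> {
    let mut out = String::with_capacity(input.len());
    let mut chars = input.chars();
    while let Some(c) = chars.next() {
        if c != '\\' {
            // Ordinary character: copy it through unchanged.
            out.push(c);
            continue;
        }
        // A backslash was seen: decode the character that follows it.
        let decoded = match chars.next() {
            Some('\\') => '\\',
            Some('/') => '/',
            Some('"') => '"',
            Some('b') => '\x08',
            Some('f') => '\x0C',
            Some('n') => '\n',
            Some('r') => '\r',
            Some('t') => '\t',
            Some(other) => {
                return Err(format!("unsupported escape sequence: \\{}", other));
            }
            None => return Err("dangling backslash at end of input".to_string()),
        };
        out.push(decoded);
    }
    Ok(out)
}
```

With this table, `\u0041` is rejected because no Unicode escape is defined, which matches the gap described above.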

Conclusion

  1. We need to decide on a "standard" set of escape sequences and whether it should match the PlutusTx one, and maybe try to make it official in the Plutus Core spec.
  2. We need to decide whether we want the same set in the Aiken language and the Aiken UPLC compiler.
  3. We can then implement those, with tests and documentation in the guide.

rvcas commented on August 17, 2024

@SmaugPool

Cool, that makes sense. Thanks for writing this.
