
Comments (4)

hlorenzi commented on June 8, 2024

How would we go about describing instruction invocations? They're not context-free -- they're completely mysterious until it's matching time, and are parsed in context of each possible instruction. Meaning if two possible instructions have expression slots in different spots, what is considered an expression is going to change for the same invocation.

For example:

#ruledef
{
  ld   {a}, x + 1 => 0x11
  ld x + 1,   {a} => 0x22
}

x = 0
ld x + 1, x + 1 ; invocation

It's undefined what the invocation syntax is until the parsing algorithm runs, which will try to parse it twice: one pass for each rule you declared beforehand. When x + 1 is specified verbatim in an instruction's pattern, it's not parsed as an expression -- it's simply parsed as a sequence of characters (currently, not even as proper tokens!).
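A rough illustration of that matching step, written in Python and not taken from customasm's own implementation: each rule pattern is compiled into a regular expression in which `{a}` captures arbitrary text and everything else must appear character for character. The names `RULES`, `pattern_to_regex`, and `candidates` are hypothetical, and stripping all whitespace before matching is a simplifying assumption.

```python
import re

# Hypothetical rule table mirroring the #ruledef above: (pattern, output byte).
RULES = [
    ("ld {a}, x + 1", 0x11),
    ("ld x + 1, {a}", 0x22),
]

def pattern_to_regex(pattern):
    """Compile a rule pattern: {name} slots capture text, the rest is literal."""
    compact = "".join(pattern.split())          # assumption: whitespace is insignificant
    parts = re.split(r"\{(\w+)\}", compact)     # odd indices are slot names
    pieces = []
    for index, part in enumerate(parts):
        if index % 2 == 1:
            pieces.append(f"(?P<{part}>.+?)")   # expression slot, parsed later
        else:
            pieces.append(re.escape(part))      # verbatim characters, not an expression
    return re.compile("".join(pieces))

def candidates(invocation):
    """Try the invocation against every rule and collect each successful match."""
    text = "".join(invocation.split())
    found = []
    for pattern, output in RULES:
        match = pattern_to_regex(pattern).fullmatch(text)
        if match:
            found.append((output, match.groupdict()))
    return found
```

Calling `candidates("ld x + 1, x + 1")` returns one candidate per rule, and in each one a different `x + 1` is the captured slot while the other is plain literal text. That is the sense in which the invocation has no fixed syntax before matching.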

With that in mind, do you still think it would make sense to keep an EBNF grammar around? Maybe for the other parts of the language?

The reason asm block parameters need enclosing braces is to enable the assembler to perform substitution token-for-token, without syntactic context -- since braces are some of the only tokens not allowed to be part of an instruction's pattern, it's easy to spot them in a context-free manner.
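A minimal sketch of what token-for-token substitution could look like, again in Python and only as an illustration. The tokenizer regex and the `substitute` function are assumptions made for this example, not customasm's real code.

```python
import re

def substitute(body, args):
    """Replace every `{name}` token triple in `body` with the tokens bound to name.

    `args` maps a parameter name to its already-tokenized argument.
    No syntactic context is consulted: only the brace tokens are inspected.
    """
    # Hypothetical tokenizer: identifiers, decimal digits, or any single symbol.
    tokens = re.findall(r"[A-Za-z_]\w*|\d+|\S", body)
    out = []
    i = 0
    while i < len(tokens):
        is_param = (
            tokens[i] == "{"
            and i + 2 < len(tokens)
            and tokens[i + 2] == "}"
            and tokens[i + 1] in args
        )
        if is_param:
            out.extend(args[tokens[i + 1]])  # splice the argument's tokens in place
            i += 3
        else:
            out.append(tokens[i])            # everything else passes through untouched
            i += 1
    return out
```

Because only braced names are replaced, a bare `x` in the body is left alone even when a parameter named `x` exists: `substitute("ld x, {x}", {"x": ["7"]})` gives `["ld", "x", ",", "7"]`.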

Now, the reason you can also specify numerical asm block parameters without the braces is kind of an oversight of mine -- behavior from before I realized you need token-for-token substitution to cover all cases. Behavior which maybe should be deprecated? All types of parameters should work fine with enclosing braces anyway, albeit changing the semantics a little.

from customasm.

parasyte commented on June 8, 2024

My argument is that there is a grammar for the metalanguage. It might look something like this, just kind of making it up:

letter = "A" | "B" | "C" | "D" | "E" | "F" | "G"
       | "H" | "I" | "J" | "K" | "L" | "M" | "N"
       | "O" | "P" | "Q" | "R" | "S" | "T" | "U"
       | "V" | "W" | "X" | "Y" | "Z" | "a" | "b"
       | "c" | "d" | "e" | "f" | "g" | "h" | "i"
       | "j" | "k" | "l" | "m" | "n" | "o" | "p"
       | "q" | "r" | "s" | "t" | "u" | "v" | "w"
       | "x" | "y" | "z" ;
nonzero digit = "1" | "2" | "3" | "4" | "5" | "6" | "7" | "8" | "9" ;
digit = nonzero digit | "0" ;
binary digit = "0" | "1" ;
octal digit = binary digit | "2" | "3" | "4" | "5" | "6" | "7" ;
hex digit = digit | "A" | "B" | "C" | "D" | "E" | "F"
       | "a" | "b" | "c" | "d" | "e" | "f" ;
character = letter | digit | "_" ;

binary = "0b", binary digit, { binary digit } ;
octal = "0o", octal digit, { octal digit } ;
decimal = nonzero digit, { digit } ;
hex = "0x", hex digit, { hex digit } ;

identifier = ( letter | "_" ), { character } ;
number = [ "-" ], ( binary | octal | decimal | hex ) ;
string = '"', { all characters - '"' }, '"' ;

ruledef directive = "#ruledef", white space, [ identifier ], white space, ruledef arguments ;
ruledef arguments = "{", match expression, { match expression }, "}" ;
match expression = match rule, match body ;
match rule = white space, { all characters }, "=>", white space ;
match body = expression | expressions ;

expressions = "{", white space, expression, { white space, expression }, white space, "}" ;

white space = ? white space characters ? ;
all characters = ? all visible characters ? ;

With this, define what an expression is and you have a good starting point for a grammar to write #ruledef directives, identifiers, numbers, and strings.
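As a quick sanity check of the lexical rules, the `number` and `identifier` productions translate directly into regular expressions. This is a sketch in Python; `NUMBER`, `IDENTIFIER`, and `classify` are names invented for the example.

```python
import re

# number = [ "-" ], ( binary | octal | decimal | hex ) ;
NUMBER = re.compile(r"-?(?:0b[01]+|0o[0-7]+|[1-9][0-9]*|0x[0-9A-Fa-f]+)")

# identifier = ( letter | "_" ), { character } ;
IDENTIFIER = re.compile(r"[A-Za-z_][A-Za-z0-9_]*")

def classify(text):
    """Return which lexical rule accepts `text`, or None if neither does."""
    if NUMBER.fullmatch(text):
        return "number"
    if IDENTIFIER.fullmatch(text):
        return "identifier"
    return None
```

One thing this surfaces: as written, `decimal` requires a nonzero leading digit, so `classify("0")` returns `None`, even though `x = 0` appears in the example earlier in the thread. A finished grammar would presumably need to admit a bare zero.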

I don't think it's worth trying to define the grammar of the instructions defined inside #ruledef, which seems to be where you are getting stuck. It's enough to understand the grammar at a higher level.

> When x + 1 is specified verbatim in an instruction's pattern, it's not parsed as an expression -- it's simply parsed as a sequence of characters (currently, not even as proper tokens!).

That's perfectly fine! The grammar for the metalanguage should specify this and that solves it.

> Now, the reason you can also specify numerical asm block parameters without the braces is kind of an oversight of mine -- behavior from before I realized you need token-for-token substitution to cover all cases. Behavior which maybe should be deprecated? All types of parameters should work fine with enclosing braces anyway, albeit changing the semantics a little.

This would probably be nice to address. AFAIK wrapping integral typed parameters in braces in the asm context does not work.

[Screenshot: Screen Shot 2022-07-14 at 8 38 38 PM]


hlorenzi commented on June 8, 2024

I think the confusion with the asm blocks will mostly be resolved with the next release I'm working on, where all arguments can be specified with braces within the asm block. Feel free to open this again if you still think the EBNF is worth it!


parasyte commented on June 8, 2024

I think some specification of the metalanguage syntax is still important, even if it is not EBNF. For instance, when I wrote a syntax definition for Sublime Text, I didn't have a great resource for defining the parser. It is mostly just an approximation based on the wiki and empirical observation.

Cf. #105 (comment)

