mscript's Issues

Unify error object in the mscript codebase

Right now, the VM and the parser have separate error types that are pretty much the same: both report whether an error occurred, plus some further information. Regrettably, the bytecode generator doesn't have an error type at all. It started as a very simple series of routines and has since grown a lot of error conditions that are now ignored. I'd like to come up with a single error object and result type that can be used throughout the application to report errors.
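
One possible shape for such a shared type, sketched in Python rather than in the project's own code. Every name here (Stage, Error, Result, generate_bytecode, the "OPC_PLACEHOLDER" string) is made up for illustration and does not come from the mscript codebase.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Generic, Optional, TypeVar

T = TypeVar("T")


class Stage(Enum):
    # Hypothetical stage names; the real codebase may divide things differently.
    PARSER = "parser"
    CODEGEN = "codegen"
    VM = "vm"


@dataclass(frozen=True)
class Error:
    stage: Stage   # which component reported the error
    message: str   # human-readable description


@dataclass(frozen=True)
class Result(Generic[T]):
    value: Optional[T] = None
    error: Optional[Error] = None

    @property
    def ok(self) -> bool:
        return self.error is None


def generate_bytecode(ast: list) -> Result:
    # Stand-in for the bytecode generator: it reports failure through the
    # same Result type instead of ignoring the error condition.
    if not ast:
        return Result(error=Error(Stage.CODEGEN, "empty program"))
    return Result(value=["OPC_PLACEHOLDER"])  # placeholder, not a real opcode


result = generate_bytecode([])
if not result.ok:
    print(f"{result.error.stage.value}: {result.error.message}")
```

With one type, the parser, generator, and VM can all be checked by callers in the same way.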

Create a single VM value type for global references

Right now, globals are referenced in the VM by pushing expression values onto the data stack and passing the opcode an argument giving the number of data elements that are global nodes. I want to represent a global reference as a single data element with its own VM value type, but since global nodes can be any arbitrary expression, I'm not sure how that can be done elegantly.

Permit variable argument and keyword argument style function calls/definitions

In Python, functions can be defined with variadic positional arguments using *ident, which collects the extras into a tuple, and with arbitrary keyword arguments using **ident, which collects them into a dict. Parameters in function definitions may also specify default values, which are used when the caller omits the argument. mscript should support a similar paradigm:

func Foo(a, *b, c=null, **d) { }

...could be called as:

Foo("a string", "another string", "a third string", c=1, some_dict_val=null)
//  ^a          ^b[0]             ^b[1]             ^c   ^**d
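
For comparison, this is how Python itself binds the same call. The function name foo and the returned dict are only there to make the bindings visible.

```python
def foo(a, *b, c=None, **d):
    # Return the bindings so the mapping of arguments to parameters is visible.
    return {"a": a, "b": b, "c": c, "d": d}


bound = foo("a string", "another string", "a third string", c=1, some_dict_val=None)
print(bound)
# {'a': 'a string', 'b': ('another string', 'a third string'), 'c': 1, 'd': {'some_dict_val': None}}
```

Note that Python collects the extra positional arguments into a tuple; whether mscript would use a list or another type is an open design choice.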

Fix `NEW_NAME` bytecode generation in iterator `for` loops

In iterator style for statements, the OPC_NEW_NAME opcode is pushed to the bytecode stack before pushing the iterator expression. This means that if the iterator name is the same as the for loop variable name, the loop will fail:

i = [1, 2, 3];
for var i in i {
    // we will literally never get here because
    // `var i` occurs before calling __next__ on
    // the iterator i
}

This works fine in Python:

i = [1, 2, 3]
for i in i:
    print(i)

prints

1
2
3

Related to issue #37

Names are new'ed in bytecode before pushing values, so names can't be shadowed

The OPC_NEW_NAME opcode is pushed to the bytecode stack before any existing values are pushed onto the data stack. This means that if a declared variable has the same name as a variable used in its initializer expression, that variable evaluates to null inside the initializer:

var index := 1;
func test() {
    var index := index + 1; // index => null + 1
    return index;
}
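
The ordering problem can be reproduced with a toy stack machine, sketched here in Python. Only the NEW_NAME name echoes the issue; the other instructions, the scope model, and the treatment of null + 1 as null are assumptions made for the sketch, not the real VM.

```python
def run(code, outer):
    """Run a toy instruction list inside a fresh scope nested in `outer`."""
    scopes = [dict(outer), {}]   # [enclosing scope, function scope]
    stack = []
    for op, arg in code:
        if op == "NEW_NAME":      # declare `arg` in the innermost scope as null
            scopes[-1][arg] = None
        elif op == "LOAD":        # push the innermost visible binding of `arg`
            scope = next(s for s in reversed(scopes) if arg in s)
            stack.append(scope[arg])
        elif op == "PUSH":
            stack.append(arg)
        elif op == "ADD":
            rhs, lhs = stack.pop(), stack.pop()
            # Assumption: null + 1 is modelled as null.
            stack.append(None if lhs is None or rhs is None else lhs + rhs)
        elif op == "STORE":       # pop the value into the innermost scope
            scopes[-1][arg] = stack.pop()
    return scopes[-1]


initializer = [("LOAD", "index"), ("PUSH", 1), ("ADD", None)]

# Order described in the issue: the name is declared before its initializer runs.
declare_first = [("NEW_NAME", "index")] + initializer + [("STORE", "index")]
# Alternative order: evaluate the initializer, then declare and store.
declare_last = initializer + [("NEW_NAME", "index"), ("STORE", "index")]

print(run(declare_first, {"index": 1}))  # {'index': None}
print(run(declare_last, {"index": 1}))   # {'index': 2}
```

Emitting the declaration after the initializer lets the inner index see the outer value, which is the behaviour shadowing needs.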

Implement a defer statement to provide easy context cleanup

In Go, programmers can write defer stmt to postpone cleanup steps like closing files or database connections until the surrounding function returns. This makes cleanup easier than traditional approaches like C's goto or lots of nested if statements.
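
Python has no defer statement, but contextlib.ExitStack gives a rough feel for the semantics: callbacks are registered as the code runs and fire in reverse order when the block exits. This is only an analogy for what a defer in mscript might do; the log list and process function are made up for the example.

```python
from contextlib import ExitStack

log = []


def process():
    with ExitStack() as deferred:
        log.append("open file")
        deferred.callback(log.append, "close file")        # like `defer close(file)`
        log.append("open connection")
        deferred.callback(log.append, "close connection")  # like `defer close(conn)`
        log.append("do work")
    # By this point the callbacks have run, last registered first.


process()
print(log)
# ['open file', 'open connection', 'do work', 'close connection', 'close file']
```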

Optimize tail-calls

Lua's interpreter seems to use a special opcode OP_TAILCALL to make a tail call.
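
The effect a tail-call opcode aims for, a call in tail position that does not grow the stack, can be imitated in Python with a trampoline. This is a different technique from a dedicated opcode and says nothing about how Lua implements OP_TAILCALL; trampoline and count_down are names invented for the sketch.

```python
def trampoline(fn, *args):
    """Run `fn`, looping while it returns a deferred tail call (a thunk)."""
    result = fn(*args)
    while callable(result):
        result = result()
    return result


def count_down(n, acc=0):
    if n == 0:
        return acc
    # Return the tail call as a thunk instead of making it directly,
    # so the Python stack does not grow with n.
    return lambda: count_down(n - 1, acc + 1)


print(trampoline(count_down, 100_000))  # 100000
```

A direct recursive version of count_down would exceed Python's default recursion limit at this depth.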

Certain code causes the interpreter to hang

The code `name =` causes the interpreter to hang indefinitely and apparently consume an enormous amount of memory. Heap summary after hanging for a few minutes:

==86835== HEAP SUMMARY:
==86835==     in use at exit: 1,047,424,811 bytes in 32,731,207 blocks
==86835==   total heap usage: 32,749,227 allocs, 18,020 frees, 1,049,981,757 bytes allocated

Remove increment style `for` statement

Python has no incrementing for statement like mscript's for var i := begin : end : step { }. Instead, it provides a generic range function, which I think I prefer since it means we no longer have to maintain a separate syntactic form.
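
A range-style function needs no special syntax; it can be an ordinary generator. The sketch below follows Python's convention of excluding the end value. Whether mscript's begin : end : step form includes the end is not stated here, so treat that as an assumption. The name mrange is made up.

```python
def mrange(begin, end, step=1):
    """Yield begin, begin + step, ... stopping before reaching end."""
    if step == 0:
        raise ValueError("step must not be zero")
    i = begin
    while (step > 0 and i < end) or (step < 0 and i > end):
        yield i
        i += step


print(list(mrange(0, 10, 3)))   # [0, 3, 6, 9]
print(list(mrange(5, 0, -2)))   # [5, 3, 1]
```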

Attach line and column to each AST element

Only the lexer and parser know the line and column number of each symbol in the code, since positions are not recorded in the AST. This makes certain errors pretty vague, because the report cannot say where the problem is:

> index;
mscript: reference to undefined variable `index`
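
A minimal sketch of what carrying positions on AST nodes could look like, in Python. Pos, Ident, and the line:column message format are illustrative assumptions, not the project's actual types or output.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Pos:
    line: int
    column: int


@dataclass(frozen=True)
class Ident:
    name: str
    pos: Pos    # copied from the token when the parser builds the node


def undefined_variable(node: Ident) -> str:
    # Later stages can now point at the source location of the node.
    return (f"mscript:{node.pos.line}:{node.pos.column}: "
            f"reference to undefined variable `{node.name}`")


print(undefined_variable(Ident("index", Pos(1, 1))))
# mscript:1:1: reference to undefined variable `index`
```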

Create array/list object

The only composite data structure in mscript is the object for now (well... at least that was the plan). However, it'd be nice to have an array type too.

Permit multiple declarations on one line

The grammar in parser.c actually suggests that this syntax should be allowed:

var first := "John", middle := "H", last := "Smith"

However, the actual parser quits after a single variable declaration. It should accept the whole comma-separated list.
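
The missing piece is presumably a loop over the comma-separated declarations. A rough Python sketch over a flat token list follows; the real parser in parser.c works differently, and expressions are reduced to a single token here to keep it short.

```python
def parse_var_decl(tokens):
    """Parse `var name := expr {, name := expr}` from a flat token list."""
    pos = 0

    def expect(value=None):
        nonlocal pos
        if pos >= len(tokens) or (value is not None and tokens[pos] != value):
            found = tokens[pos] if pos < len(tokens) else "end of input"
            raise SyntaxError(f"expected {value or 'a token'}, found {found}")
        pos += 1
        return tokens[pos - 1]

    expect("var")
    decls = []
    while True:
        name = expect()
        expect(":=")
        decls.append((name, expect()))
        # Keep going while a comma follows the declaration just parsed.
        if pos < len(tokens) and tokens[pos] == ",":
            pos += 1
            continue
        return decls


tokens = ["var", "first", ":=", '"John"', ",",
          "middle", ":=", '"H"', ",", "last", ":=", '"Smith"']
print(parse_var_decl(tokens))
# [('first', '"John"'), ('middle', '"H"'), ('last', '"Smith"')]
```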

Run semantic checks on the AST

The parser generates an AST for any input text which matches the grammar, but there are no checks for correct language semantics such as:

  • break and continue statements may only appear in for statement blocks
  • return statements may only appear in function blocks
  • shadowing variables in containing lexical blocks (warning)
  • redeclaring a variable name in the same lexical block
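
The first two checks in the list above can be sketched as a single recursive walk that tracks whether the current node sits inside a loop or a function. The dict-based AST and the message wording are assumptions for the example, and the shadowing and redeclaration checks are left out.

```python
def check(node, in_loop=False, in_func=False, errors=None):
    """Walk a toy AST of dicts and collect misplaced break/continue/return."""
    if errors is None:
        errors = []
    kind = node["kind"]
    if kind in ("break", "continue") and not in_loop:
        errors.append(f"`{kind}` outside of a for block")
    elif kind == "return" and not in_func:
        errors.append("`return` outside of a function block")
    if kind == "for":
        in_loop = True
    elif kind == "func":
        # A function body starts fresh: a loop around the definition
        # does not make `break` legal inside it.
        in_loop, in_func = False, True
    for child in node.get("body", []):
        check(child, in_loop, in_func, errors)
    return errors


program = {"kind": "block", "body": [
    {"kind": "break"},
    {"kind": "for", "body": [{"kind": "continue"}, {"kind": "return"}]},
    {"kind": "func", "body": [{"kind": "return"}, {"kind": "break"}]},
]}
for message in check(program):
    print(message)
```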

Stop returning surrounding quote characters with string tokens

For the string a string, the lexer returns the token text "a string" if the literal was written with " quotes in the source, or 'a string' if it was written with ' quotes. This, of course, means that later consumers need to account for these quotes, which is just annoying and counterproductive.
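
A sketch of the intended behaviour in Python: the lexer drops the delimiters once, so nothing downstream has to. The function name and its return shape are hypothetical, and escape sequences are ignored.

```python
def lex_string(source, start):
    """Lex a quoted string beginning at source[start].

    Returns (value, next_index), where value excludes the delimiters.
    """
    quote = source[start]
    if quote not in ('"', "'"):
        raise ValueError("not at a string literal")
    end = source.find(quote, start + 1)
    if end == -1:
        raise SyntaxError("unterminated string literal")
    return source[start + 1:end], end + 1


print(lex_string('"a string"', 0))  # ('a string', 10)
print(lex_string("'a string'", 0))  # ('a string', 10)
```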

Compound assignment for bitwise operators

For now, mscript has +=, -=, *=, /=, \=, and %= operators. There should be equivalent compound assignment operators for the standard bitwise ops too: &=, |=, ^=. Maybe <<= and >>= too?
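
If compound assignment is treated as sugar for name = name op value, the new operators reduce to one table entry each. A Python sketch using the standard operator module, with a plain dict standing in for the variable environment (an assumption for the example):

```python
import operator

# Maps each proposed compound operator to its underlying binary operation.
COMPOUND = {
    "&=": operator.and_,
    "|=": operator.or_,
    "^=": operator.xor,
    "<<=": operator.lshift,
    ">>=": operator.rshift,
}


def compound_assign(env, name, op, value):
    """Apply `name op value` as `name = name <binary op> value`."""
    env[name] = COMPOUND[op](env[name], value)


env = {"flags": 0b1010}
compound_assign(env, "flags", "&=", 0b0110)   # flags == 2
compound_assign(env, "flags", "|=", 0b0100)   # flags == 6
compound_assign(env, "flags", "^=", 0b0011)   # flags == 5
compound_assign(env, "flags", "<<=", 2)       # flags == 20
compound_assign(env, "flags", ">>=", 1)       # flags == 10
print(env["flags"])  # 10
```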

Remove `merge` statement

I pulled the merge statement verbatim from M, but I'm not sure if it makes sense in mscript. It could still work, but I think there may be other better idioms.

Parse object literals

Object literals can appear in code much the same way as dicts in Python, objects in JavaScript, or tables in Lua:

{ key1: value1, key2: value2, ... }

Create context management protocol like C# or Python

Python uses the with ... as ...: statement to manage context for unmanaged resources and C# uses using (x = ...) { ... } for the same. mscript should include a context management protocol (probably the same way Python does with __enter__ and __exit__ methods). This probably fulfills the same role as the suggested defer statement referenced in issue #18.
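
For reference, this is the Python protocol in miniature: __enter__ runs on entry and __exit__ runs on exit, even when the block raises. The Resource class and log list are made up for the example.

```python
class Resource:
    def __init__(self, name, log):
        self.name, self.log = name, log

    def __enter__(self):
        self.log.append(f"acquire {self.name}")
        return self

    def __exit__(self, exc_type, exc, tb):
        self.log.append(f"release {self.name}")
        return False  # do not swallow the exception


log = []
try:
    with Resource("db", log) as r:
        log.append(f"use {r.name}")
        raise RuntimeError("boom")
except RuntimeError:
    log.append("error propagated")

print(log)
# ['acquire db', 'use db', 'release db', 'error propagated']
```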

Correctly verify variable declarations in `for` loops

The declarations (or assignments) used in for loops semantically occur within the loop block, but the current checks are performed as if they were outside of that block. This means we probably won't correctly catch shadowing and will incorrectly flag redeclarations within the same scope.
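
One way to get this right is to open the loop's scope before declaring the loop variable. A Python sketch over a toy dict-based AST follows; the node shapes and messages are assumptions, not the project's checker.

```python
def check_scopes(node, scopes=None, report=None):
    """Collect shadowing warnings and redeclaration errors from a toy AST."""
    if scopes is None:
        scopes, report = [set()], []
    kind = node["kind"]
    if kind == "var":
        name = node["name"]
        if name in scopes[-1]:
            report.append(("error", f"redeclaration of `{name}`"))
        elif any(name in scope for scope in scopes[:-1]):
            report.append(("warning", f"`{name}` shadows an outer variable"))
        scopes[-1].add(name)
    elif kind in ("block", "for"):
        scopes.append(set())          # open the block's scope first
        if kind == "for":
            # The loop variable is declared inside the loop's own scope.
            check_scopes({"kind": "var", "name": node["var"]}, scopes, report)
        for child in node["body"]:
            check_scopes(child, scopes, report)
        scopes.pop()
    return report


program = {"kind": "block", "body": [
    {"kind": "var", "name": "i"},
    {"kind": "for", "var": "i", "body": []},   # shadows the outer i
    {"kind": "for", "var": "k", "body": []},
    {"kind": "for", "var": "k", "body": []},   # separate scope, not a redeclaration
]}
print(check_scopes(program))
# [('warning', '`i` shadows an outer variable')]
```

Declaring the loop variable in the enclosing scope instead would miss the shadowing warning for i and wrongly report the second k as a redeclaration.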
