titzer / virgil

A fast and lightweight native programming language

Shell 31.53% Java 22.64% C 19.83% Emacs Lisp 0.82% Io 0.07% JavaScript 5.20% C++ 5.25% WebAssembly 0.58% Makefile 1.26% Assembly 3.07% Python 0.49% Vim Script 1.25% TypeScript 5.68% HTML 2.34%

compiler garbage-collection native programming-language system-programming systems webassembly

virgil's Issues

Typo in sample code

In https://github.com/titzer/virgil/blob/master/doc/tutorial/Variance.md, where you have "f(list.head)" you obviously mean "f(l.head)"

How to compile programs into WASI?

Hi,

I'm starting to trial Virgil a little bit more, and I was unable to get an application compiling to WASI.

Here's what I'm doing (in latest master)

$ cd apps/HelloWorld/
$ ../../bin/dev/v3c-wasi -run HelloWorld.v3
!EvalUnimplemented: CallAddress
	in main() [HelloWorld.v3 @ 2:20]

I also tried to compile and then run it with Wasmer, but it seems that the Wasm file is not valid:

$ cd apps/HelloWorld/
$ ../../bin/dev/v3c-wasi HelloWorld.v3
$ wasmer HelloWorld.wasm
error: failed to run `HelloWorld.wasm`
│   1: module instantiation failed (engine: universal, compiler: cranelift)
╰─▶ 2: Validation error: type mismatch: expected i32 but nothing on stack (at offset 409)

Did I miss some steps here? What can I do to help fix the issue?

Why is Virgil so fast?

I ran a Hello World + Fibonacci benchmark comparing Virgil with Rust and TinyGo (the two most often cited Wasm compilers), and the results seem too good to be true!

Virgil outperforms both Rust and TinyGo by orders of magnitude in terms of both compiler speed and executable file sizes. Yes, the 0.00s compile time is correct — time(1) reports to the nearest 1/100s (when I compiled my first Virgil program it was so fast I thought it hadn't run).
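
Because time(1) as used here rounds to 1/100s, any compile that finishes in under 10 ms shows up as 0.00s. A minimal sketch of one way to measure with finer resolution, assuming Python 3 is available: `best_wall_time` is a hypothetical helper, and the command being timed is only the Python interpreter itself, standing in for a real compiler invocation.

```python
import subprocess
import sys
import time


def best_wall_time(cmd, repeats=5):
    """Run cmd `repeats` times and return the fastest wall-clock time in seconds."""
    best = float("inf")
    for _ in range(repeats):
        start = time.perf_counter()
        subprocess.run(cmd, check=True,
                       stdout=subprocess.DEVNULL, stderr=subprocess.DEVNULL)
        best = min(best, time.perf_counter() - start)
    return best


# Stand-in command; replace it with the compiler invocation being benchmarked.
elapsed = best_wall_time([sys.executable, "-c", "pass"])
print(f"{elapsed * 1000:.2f} ms")
```

The figure includes process start-up cost, so it is an upper bound on the work the program itself does.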

The Numbers

wasm

	Compile time (secs)	Executable size (B)	Execution time (secs)
go	4.13s	428,547	0.62s
rust	0.33s	2,054,632	1.80s
virgil	0.00s	8,802	0.96s

wasm-optimised

	Compile time (secs)	Executable size (B)	Execution time (secs)
go	3.88s	191,265	0.62s
rust	0.80s	301,363	0.66s
virgil	0.01s	7,891	1.07s

x86-64-linux

	Compile time (secs)	Executable size (B)	Execution time (secs)
go	1.94s	503,640	0.39s
rust	0.30s	3,853,504	1.53s
virgil	0.01s	20,552	0.63s

x86-64-linux-optimised

	Compile time (secs)	Executable size (B)	Execution time (secs)
go	2.04s	140,056	0.38s
rust	2.73s	1,653,736	0.32s
virgil	0.01s	19,552	0.64s

WebAssembly Performance

The Virgil compiler is ~50x faster than the Rust compiler and over 300x faster than the TinyGo compiler.
The optimised Virgil executable is over 35x smaller than the Rust executable and over 20x smaller than the TinyGo executable.
The TinyGo executable runs ~7% faster than the Rust executable and ~83% faster than the Virgil executable (executed on the wasmtime runtime).

x86-64 Performance

The Virgil compiler is ~30x faster than the Rust compiler and ~200x faster than the TinyGo compiler.
The optimised Virgil executable is ~80x smaller than the Rust executable and ~7x smaller than the TinyGo executable.
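
The executable-size ratios quoted in the WebAssembly and x86-64 summaries can be rechecked from the two optimised tables. A small sketch, assuming the byte counts are copied correctly from those tables; the `go` key follows the table's row label, and `size_ratio` is a name introduced here for illustration.

```python
# Executable sizes in bytes, copied from the optimised tables above.
sizes = {
    "wasm": {"go": 191_265, "rust": 301_363, "virgil": 7_891},
    "x86-64-linux": {"go": 140_056, "rust": 1_653_736, "virgil": 19_552},
}


def size_ratio(target, other):
    """How many times larger `other`'s executable is than Virgil's on `target`."""
    table = sizes[target]
    return table[other] / table["virgil"]


for target in sizes:
    for other in ("rust", "go"):
        ratio = size_ratio(target, other)
        print(f"{target}: {other} is {ratio:.1f}x larger than virgil")
```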

Notes

Virgil Wasm code generated with the compiler's -opt=all option ran slower than without it, but the executable was ~10% smaller, so currently there's not a lot to be gained from using the -opt=all option.
Importing the fmt package increased the size of the TinyGo Wasm executable from 8KB to 191KB (an increase of 183KB), whereas importing the Virgil Strings component increased the size of the Virgil Wasm executable from 3.6KB to 7.9KB (an increase of only 4.3KB).
The compiled Wasm files were executed with wasmtime-cli 0.39.1

Details

The raw data along with source code and platform information is attached.
go-results.txt
rust-results.txt
virgil-results.txt

Support 64-bit integers

Language or Compiler Enhancement?
===========================

Language and Compiler

Justification (summary of perceived benefit)
===============================

Large (64-bit) integers are necessary to represent large quantities, like 
offsets in large files, etc.
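
To make the file-offset case concrete, here is a short Python sketch (not Virgil) of what happens when a 5 GiB offset is forced into a 32-bit signed integer. It relies on the fact that `ctypes` integer types do no overflow checking; the 5 GiB figure is an arbitrary example.

```python
import ctypes

offset = 5 * 1024**3  # a byte offset 5 GiB into a file

# c_int32 does no overflow checking, so the value wraps modulo 2**32.
as_int32 = ctypes.c_int32(offset).value
# A 64-bit integer holds the offset unchanged.
as_int64 = ctypes.c_int64(offset).value

print(as_int32)
print(as_int64)
```

The 32-bit value silently comes out as an offset of 1 GiB, which is a valid but wrong position in the file.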

Impact estimate (select one)
----------------------
2 - small
================

Priority (select one)
----------------------
2 - medium
================

Original issue reported on code.google.com by [email protected] on 24 Jan 2013 at 7:29

Allow _ for super constructor calls

Language or Compiler Enhancement?
===========================

Language

Justification (summary of perceived benefit)
===============================

The current design requires subclasses to re-declare constructor parameters 
that are simply passed along to super constructors. This tends to be redundant 
and increases the amount of work necessary to write subclasses.

E.g.

class A {
  new(i: int) { }
}

class B extends A {
  new(i: int, j: int) super(i) { }
}


B could be written more easily:

class B extends A {
  new(j: int) super(_) { }
}

or even:

class A(i: int) { }
class B(j: int) extends A(_) { }
class C extends B(3, _) { }

Impact estimate (select one)
----------------------
2 - small
================

Priority (select one)
----------------------
2 - medium
================

Original issue reported on code.google.com by [email protected] on 24 Jan 2013 at 7:24

-1 bytes

user@host009:~/src/virgil/bin/dev$ wave -?
==25740==WARNING: AddressSanitizer failed to allocate 0xffffffffffffffff bytes
==25740==AddressSanitizer's allocator is terminating the process instead of returning 0
==25740==If you don't like this behavior set allocator_may_return_null=1
==25740==AddressSanitizer CHECK failed: /build/llvm-toolchain-6.0-QjOn7h/llvm-toolchain-6.0-6.0/projects/compiler-rt/lib/sanitizer_common/sanitizer_allocator.cc:225 "((0)) != (0)" (0x0, 0x0)
    #0 0x4f0645  (/home/user/src/virgil/bin/dev/wave+0x4f0645)
    #1 0x50def5  (/home/user/src/virgil/bin/dev/wave+0x50def5)
    #2 0x4f6a36  (/home/user/src/virgil/bin/dev/wave+0x4f6a36)
    #3 0x4f6a76  (/home/user/src/virgil/bin/dev/wave+0x4f6a76)
    #4 0x432837  (/home/user/src/virgil/bin/dev/wave+0x432837)
    #5 0x51ff8f  (/home/user/src/virgil/bin/dev/wave+0x51ff8f)
    #6 0x52a31b  (/home/user/src/virgil/bin/dev/wave+0x52a31b)
    #7 0x527efe  (/home/user/src/virgil/bin/dev/wave+0x527efe)
    #8 0x5257d7  (/home/user/src/virgil/bin/dev/wave+0x5257d7)
    #9 0x7f0a40925b96  (/lib/x86_64-linux-gnu/libc.so.6+0x21b96)
    #10 0x427639  (/home/user/src/virgil/bin/dev/wave+0x427639)

doc/wiki/Functions.md has a few Markdown misformattings

... for code blocks, making it hard to read: https://github.com/titzer/virgil/blob/master/doc/wiki/Functions.md . Cases can be found e.g. by searching for:

```

NullCheckException in LinearScanRegAlloc.assignEnd

127 user@host009:~/src/virgil/bin$ ./v3c-x86-linux ../apps/Multi/Multi.v3
!NullCheckException
in LinearScanRegAlloc.assignEnd() [/Users/titzer/virgil/aeneas/src/mach/RegAlloc.v3 @ 95:66]
in LinearScanRegAlloc.assignRegs() [/Users/titzer/virgil/aeneas/src/mach/RegAlloc.v3 @ 68:57]
in X86CodeGen.genCode() [/Users/titzer/virgil/aeneas/src/x86/X86CodeGen.v3 @ 72:32]
in X86Runtime.genX86Code() [/Users/titzer/virgil/aeneas/src/x86/X86Runtime.v3 @ 24:28]
in MachProgram.layoutCode() [/Users/titzer/virgil/aeneas/src/mach/MachProgram.v3 @ 405:32]
in X86Linux.encodeCode() [/Users/titzer/virgil/aeneas/src/x86/X86Linux.v3 @ 86:32]
in X86Linux.emit() [/Users/titzer/virgil/aeneas/src/x86/X86Linux.v3 @ 37:27]
in X86Target.emit() [/Users/titzer/virgil/aeneas/src/x86/X86Target.v3 @ 21:24]
in Compilation.emit() [/Users/titzer/virgil/aeneas/src/main/Compiler.v3 @ 292:36]
in Compiler.compile() [/Users/titzer/virgil/aeneas/src/main/Compiler.v3 @ 128:34]
in Aeneas.compile() [/Users/titzer/virgil/aeneas/src/main/Aeneas.v3 @ 111:33]
in Aeneas.main() [/Users/titzer/virgil/aeneas/src/main/Aeneas.v3 @ 71:32]

error in test/all.bash with x86-64

user@DESKTOP-3H33NCK:~/src/virgil$ bash test/all.bash:

build scripts & test harness in Virgil?

What do you think about having the build & test scripts written in Virgil? I'm happy to toy around with it if it's a valid endeavor.

Top-level namespacing

How can you disambiguate same-named components and classes from different source files (aka the module problem)?

I see that this issue was addressed as "Future Work" in the 2013 paper https://dl.acm.org/doi/10.1145/2491956.2491962

The current EBNF grammar file contains "import" and "export" keywords but I haven't been able to find a description of their semantics:

virgil/doc/virgil-grammar.ebnf

Line 5 in 3038dea

ComponentDecl ::= "import"? "component" IDENTIFIER "{" Member* "}"

virgil/doc/virgil-grammar.ebnf

Line 35 in 3038dea

 ExportDecl ::= "export" ( DefMethod | ( STRING | Ident ) ( "=" SymbolParam )? ";" ) 

Generate stacktraces for System.error

System.error should generate a stacktrace on native platforms.

Currently it does not because it requires a compiler intrinsic to get the 
caller IP and SP to begin stack walking.

Original issue reported on code.google.com by [email protected] on 28 Mar 2012 at 7:02

turn doc/tutorial into wiki

Github has the great option to integrate markdown documentation in its project wiki.

If you check out the default wiki (just add /wiki to your project URL) you can add it back via git submodule.

All .md files will automatically appear in a dropdown list.

See here for a (mediocre) example:
https://github.com/pannous/wasp/wiki <=
https://github.com/pannous/wasp/

Where to ask for help - slices / memcopy

I'm having some fun making a tokenizer/parser and learning Virgil. Really enjoying the language so far. Are there any community places such as IRC or Discord for questions?

If not, I'm trying to answer these two questions:

how to use Pointers
how to slice / get a substring of an array without making a copy

The use case is basically just a memcopy from the end of an array to the beginning. I tried using Arrays.copy and Arrays.copyInto but couldn't figure out how to slice the last n elements.

And when I try to use something like Pointer.atContents() I'm getting an UnresolvedIdentifier: identifier "Pointer" cannot be found. I looked and couldn't find a file where Pointer is defined. 🤔
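
For reference, the operation being asked about (copying the last n elements over the first n, in place, without an intermediate slice) comes down to the index arithmetic below. This is a Python sketch of the idea only; it says nothing about the actual signatures of Arrays.copy or Arrays.copyInto, and `copy_tail_to_front` is a hypothetical name.

```python
def copy_tail_to_front(buf, n):
    """Copy the last n elements of buf over its first n elements, in place."""
    if not 0 <= n <= len(buf):
        raise ValueError("n out of range")
    src = len(buf) - n  # index of the first element of the tail
    # Copy in ascending order. The destination starts at 0 and src >= 0, so
    # every source element is read before it can be overwritten, even when
    # the two ranges overlap.
    for i in range(n):
        buf[i] = buf[src + i]
    return buf


data = list(b"abcdefgh")
copy_tail_to_front(data, 3)
print(bytes(data))
```

No temporary array is created: only the source start index `len(buf) - n` and the element count are needed.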

UTF-8 string literals

Does Virgil support UTF-8 string literals?

The documentation suggests it does:

virgil/doc/lib-issues.txt

Line 116 in 3038dea

- Aeneas parses text as bytes, only allows UTF-8 inside string constants

Here I've inserted the copyright character in a string literal:

$ cat hello.v3    
def main() {
        System.puts("Hello World ©\n");
}

$ virgil run tmp/hello.v3
[tmp/hello.v3 @ 2:21] ParseError: invalid string literal
        System.puts("Hello World ©\n");
                    ^

Hex byte values work though:

$ cat hello.v3
def main() {
        System.puts("Hello World \xC2\xA9\n");
}

$ virgil run hello.v3
Hello World ©
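
The working escape sequence is simply the UTF-8 encoding of the character: © is U+00A9, which encodes to the two bytes C2 A9. As a stopgap, a small Python sketch like the following could generate such escapes; `to_hex_escapes` is a hypothetical helper, not part of any Virgil tooling.

```python
def to_hex_escapes(text):
    """Return text with each non-ASCII character replaced by \\xNN escapes
    of its UTF-8 bytes, and newlines written as \\n."""
    out = []
    for ch in text:
        if ch == "\n":
            out.append("\\n")
        elif ord(ch) < 0x80:
            # Other ASCII characters pass through unchanged; quotes and
            # backslashes are NOT escaped by this sketch.
            out.append(ch)
        else:
            out.extend(f"\\x{b:02X}" for b in ch.encode("utf-8"))
    return "".join(out)


print(to_hex_escapes("Hello World ©\n"))
```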

SWIG bindings for Virgil?

Ref http://www.swig.org/exec.html

Allowing Python, Java, (etc etc) teams to drop to Virgil for functions is a key incremental strategy for converting larger projects, and indeed the developers too. A bottom-up world take over .. or pick your own lofty claim.

docs / release?

I read the Virgil III paper, and it seems like a really cool mix of high level features and low level code -- the holy grail :)

I tried it out and made an echo.v3 program run under the interpreter, ran a jar, and ran an x86 executable.

Any plans for a release or maybe a README describing how to use it? I had to read the shell scripts to figure out how to invoke it.

I'm interested in the compile-time metaprogramming among other things.

It looks like it is still aimed more at embedded systems, since there's no module system or C integration? (as far as I can see)

Built-in interpreter

What IR does the built-in interpreter interpret? Is it Wasm? It seems that this basic question is left untouched in the documentation, which I have dug through multiple times.

Exceptions don't work on MacOS 10.8

What steps will reproduce the problem?
1. Write a program that throws an exception.
2. Run the program.

What is the expected output? What do you see instead?

Expected output should be an exception stacktrace. Instead, program gets stuck, 
probably in the signal handler.

Original issue reported on code.google.com by [email protected] on 18 Apr 2013 at 9:18

I wonder if it is feasible to add an ARM backend for Virgil

Hi,

I'm looking for a VM that can run a PLC.
The idea is:

Have some kind of high-level IDE (disclaimer: I built one: https://github.com/vlsi/ide61131)
Make that IDE compile the program into some kind of bytecode
Make the PLC execute that bytecode

The bytecode is there so that the IDE etc. can be PLC-independent (that is, one can easily support different kinds of CPUs without changing the front end too much).

I do like Virgil's idea of "allocating all the memory at initialization".

For the "OS" I am thinking of http://www.freertos.org/ (that will provide TCP/UDP, and some multitasking).

I aim for devices with 200 KiB...1 MiB of RAM and 200 MHz+ CPUs.
I aim to fit "simple logic computation" under 1 ms.

The devices might have different CPUs (some kind of ARM, AURIX) and the common denominator is probably some kind of C compiler (gcc, TI, Keil, etc).

Is Virgil feasible/reasonable for that?

wish: run-time eval (aka self-embedding and runtime linking)

Could Aeneas embed [the entirety of] itself if it sees a very special command ("eval()", for instance) when compiling? (similar to how it appends the GC source when needed)

The next request after that is to enable linking of compiled code from the running code.

Gracefully handle stackoverflow on native platforms

When running on the JVM, a stack overflow manifests itself with a 
java.lang.StackOverflowError, which terminates the Virgil program. On native 
platforms, a stack overflow manifests itself by the program running past the 
end of the stack's mapped pages, causing a SIGSEGV (or SIGBUS), which cannot be 
handled by the runtime's signal handler (since there is no stack), resulting in 
termination of the program by the operating system.

Native platforms should check for stack overflow via one of the known 
mechanisms:

1. Use sigaltstack for signals
2. Stack banging
3. Explicit stack checks

Call graph analysis can likely eliminate most of #2 and #3 statically.

Original issue reported on code.google.com by [email protected] on 28 Mar 2012 at 6:46

Looks great!

Very RUST-y :) (My favorite programming language)

Foreach loop over sequences, lists, maps

Language or Compiler Enhancement?
===========================

Language

Justification (summary of perceived benefit)
===============================

Virgil currently only supports foreach loops over arrays. Devise a mechanism to 
allow cons Lists, Sequences, Maps, and other user types to be foreach-iterable. 
This must be done as efficiently as possible, without implicitly allocating 
objects on the heap.
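
One familiar way to walk a cons list without creating an iterator object is internal iteration: the list walker calls a function on each element. The Python sketch below only illustrates that shape; it is not the mechanism proposed here or adopted by Virgil, and `Cons` and `apply` are names introduced for the example.

```python
class Cons:
    """One cell of a singly linked (cons) list; None stands for the empty list."""
    __slots__ = ("head", "tail")

    def __init__(self, head, tail=None):
        self.head = head
        self.tail = tail


def apply(lst, f):
    """Call f on each element by walking the cells directly."""
    l = lst
    while l is not None:
        f(l.head)
        l = l.tail


items = Cons(1, Cons(2, Cons(3)))
seen = []
apply(items, seen.append)
print(seen)
```

The loop keeps only a single cursor variable, which is the property a foreach construct over user types would want to preserve.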


Impact estimate (select one)
----------------------
2 - small
================

Priority (select one)
----------------------
3 - high
================

Original issue reported on code.google.com by [email protected] on 24 Jan 2013 at 7:28

[MacOS] Rosetta 2 refuses to run Virgil x86-64-darwin binaries

I recently ported Virgil to x86-64-darwin, i.e. 64-bit MacOS on Intel. I expected that the binaries generated would work automatically under Rosetta 2, but they apparently do not.

I could use some help debugging some issues.

% aeneas bootstrap
% cd apps/HelloWorld/
% v3c-x86-64-darwin -output=/tmp HelloWorld.v3 
% /tmp/HelloWorld 
rosetta error: /tmp/HelloWorld: overlapping Mach-O segments
Trace/BPT trap: 5
%

Tinkering with the segment layouts in aeneas/src/x86-64/X86_64Darwin.v3 can get past that, but other errors remain. I am not sure what the rules are for Rosetta.
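
When experimenting with segment layouts, a quick sanity check for overlapping virtual address ranges can help. The following is a minimal Python sketch under stated assumptions: the segment names, addresses and sizes are hypothetical and not taken from X86_64Darwin.v3, and Rosetta may well enforce further rules (for example on file offsets) that this does not model.

```python
def find_overlaps(segments):
    """Return pairs of segment names whose [vmaddr, vmaddr + vmsize) ranges intersect.

    Each segment is a (name, vmaddr, vmsize) tuple.
    """
    ordered = sorted(segments, key=lambda s: s[1])
    overlaps = []
    for i, (name_a, addr_a, size_a) in enumerate(ordered):
        for name_b, addr_b, _ in ordered[i + 1:]:
            if addr_b >= addr_a + size_a:
                break  # sorted by address, so no later segment can overlap this one
            overlaps.append((name_a, name_b))
    return overlaps


# Hypothetical layout for illustration only.
layout = [
    ("__PAGEZERO", 0x0, 0x1000),
    ("__TEXT", 0x1000, 0x3000),
    ("__DATA", 0x3000, 0x2000),  # starts inside __TEXT
]
print(find_overlaps(layout))
```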

Consider the αcτµαlly pδrταblε εxεcµταblε (APE) cross-platform executable format

Hi, I just saw your project Virgil, and the interesting part is that it compiles to different targets (native machine and virtual machine).
I don't know which language you used to write the first Virgil compiler, but my guess is C. I bring this up because of an open-source project that approaches cross-compilation to native machine code from a different angle. I will not give more detail than the links below, where you can read about, download, compile and test it via the blog posts of the author (Justine Tunney):

αcτµαlly pδrταblε εxεcµταblε
cosmopolitan libc : "your build-once run-anywhere c library."

In summary, it is about compiling once to native machine code (x86 at the moment) and running the result on any operating system (Darwin, Linux, BSD, Windows). That project looks as promising as yours.

Improve JVM bytecode generation

What steps will reproduce the problem?
1. Try to parse Aeneas.jar with IKVM (http://ikvm.net/)
2. Look at the warnings
3. See the conversation at https://sourceforge.net/p/ikvm/bugs/281/

What is the expected output? What do you see instead?
IKVM is pretty good at converting pretty much every JAR I've stumbled upon. So, 
while I personally didn't check the JAR for correctness against the JRE spec, my bet 
is that the generated bytecode is wrong.

What version of the product are you using? On what operating system?
Virgil binary from latest commit, IKVM 7.2 binary, Windows 8.1 x64, .NET 4

Please provide any additional information below.

Original issue reported on code.google.com by [email protected] on 22 Sep 2013 at 1:38

Grammar railroad diagram

Going through the code at https://github.com/titzer/virgil/blob/master/aeneas/src/vst/Parser.v3, I used https://github.com/mingodad/CocoR-CSharp to create an LL(1) parser and from it generate an EBNF understood by https://www.bottlecaps.de/rr/ui, which produces a railroad diagram (https://en.wikipedia.org/wiki/Syntax_diagram). I've got an initial version that already shows a big chunk of the Virgil grammar.

Copy the EBNF shown below into the Edit Grammar tab at https://www.bottlecaps.de/rr/ui, then switch to the View Diagram tab. I think it's useful for documentation and for understanding and developing Virgil's syntax:

//
// EBNF generated by CocoR parser generator to be viewed with https://www.bottlecaps.de/rr/ui
//

//
// productions
//

Virgil ::=  parseToplevelDecl* EOF
parseToplevelDecl ::=  TK_class parseIdentCommon ( "(" ( parseClassParam ( "," parseClassParam )* )? ")" )? ( TK_extends parseTypeRef )? parseTupleExpr? parseMembers | TK_component parseIdentVoid parseMembers | TK_import TK_component parseIdentVoid parseMembers | parseVar | parseDef | parseVariant | parseEnum | parseExport
parseIdentCommon ::=  identParam parseTypeRef ( "," parseTypeRef )* ">" | ident
parseClassParam ::=  TK_var? parseParamWithOptType
parseTypeRef ::=  ( "(" ( parseTypeRef ( "," parseTypeRef )* )? ")" | parseIdentCommon ( "." parseIdentCommon )* ) ( "->" parseTypeRef )*
parseTupleExpr ::=  "(" ( parseExpr ( "," parseExpr )* )? ")"
parseMembers ::=  "{" parseMember* TK_rbrace
parseIdentVoid ::=  ident
parseVar ::=  TK_var parseIdentVoid parseFieldSuffix
parseDef ::=  TK_def TK_var? ( parseIndexed | parseIdentCommon ( parseMethodSuffix | parseFieldSuffix ) )
parseVariant ::=  TK_type parseIdentCommon ( "(" ( parseVariantCaseParam ( "," parseVariantCaseParam )* )? ")" )? parseVariantCases
parseEnum ::=  TK_enum parseIdentVoid ( "(" ( parseEnumParam ( "," parseEnumParam )* )? ")" )? "{" ( parseEnumCase ( "," parseEnumCase )* )? TK_rbrace
parseExport ::=  TK_export ( parseDef | ( parseStringLiteral | parseIdent ) ( "=" parseIdent parseDottedVarExpr? )? ";" )
parseVariantCaseParam ::=  parseParamWithOptType
parseVariantCases ::=  "{" parseVariantCase* TK_rbrace
parseVariantCase ::=  parseDef | TK_case parseIdentVoid ( "(" ( parseVariantCaseParam ( "," parseVariantCaseParam )* )? ")" )? ( ";" | parseMembers )
parseStringLiteral ::=  string
parseIdent ::=  ident
parseDottedVarExpr ::=  "." parseTypeRef ( "." parseTypeRef )*
parseEnumParam ::=  parseParamWithOptType
parseEnumCase ::=  parseIdentVoid ( "(" ( parseExpr ( "," parseExpr )* )? ")" )?
parseExpr ::=  parseSubExpr ( "=" parseExpr | addBinOpSuffixes )?
parseMember ::=  TK_private? ( parseDef | parseNew | parseVar )
parseNew ::=  TK_new "(" ( parseNewParam ( "," parseNewParam )* )? ")" ( ":"? TK_super parseTupleExpr )? parseBlockStmt
parseNewParam ::=  TK_var? parseParamWithOptType
parseBlockStmt ::=  "{" parseStmt* TK_rbrace
parseTypeParam ::=  parseIdentVoid
parseStmt ::=  parseBlockStmt | parseEmptyStmt | parseIfStmt | parseWhileStmt | parseMatchStmt | parseVarStmt | parseDefStmt | parseBreakStmt | parseContinueStmt | parseReturnStmt | parseForStmt | parseExprStmt
parseEmptyStmt ::=  ";"
parseIfStmt ::=  TK_if parseControlExpr parseStmt ( TK_else parseStmt )?
parseWhileStmt ::=  TK_while parseControlExpr parseStmt
parseMatchStmt ::=  TK_match parseControlExpr "{" ( parseMatchCase parseMatchCase* )? TK_rbrace ( TK_else parseStmt )?
parseVarStmt ::=  TK_var parseIdentVoid parseVars*
parseDefStmt ::=  TK_def parseIdentVoid parseVars*
parseBreakStmt ::=  TK_break ";"
parseContinueStmt ::=  TK_continue ";"
parseReturnStmt ::=  TK_return parseExpr? ";"
parseForStmt ::=  TK_for "(" parseLocal ( "<" parseExpr | TK_in parseExpr | ";" parseExpr ";" parseExpr ) ")" parseStmt
parseExprStmt ::=  parseExpr ";"
parseControlExpr ::=  "(" parseExpr ")"
parseLocal ::=  parseIdentVoid ( ":" parseTypeRef )? ( "=" parseExpr )?
parseMatchCase ::=  "_" "=>" parseStmt | matchPattern matchPattern*
matchPattern ::=  parseMatchPattern ( "," parseMatchPattern )* "=>" parseStmt
parseMatchPattern ::=  parseIdMatchPattern | parseByteLiteral | "-"? parseNumber
parseIdMatchPattern ::=  ( TK_true | TK_false | TK_null ) | parseIdentCommon ( ":" parseTypeRef | parseDottedVarExpr? ( "(" ( parseMatchParam ( "," parseMatchParam )* )? ")" )? )
parseByteLiteral ::=  charcon
parseNumber ::=  bincon | floatcon | intcon
parseMatchParam ::=  parseIdentVoid
parseVars ::=  ( ":" parseTypeRef )? ( "=" parseExpr )? ( "," parseIdentVoid parseVars? | ";" )
parseFieldSuffix ::=  parseVars
parseIndexed ::=  "[" ( parseMethodParam ( "," parseMethodParam )* )? "]" ( "=" parseMethodParam | "->" parseTypeRef ) ( ";" | parseBlockStmt )
parseMethodSuffix ::=  "(" ( parseMethodParam ( "," parseMethodParam )* )? ")" ( "->" ( TK_this | parseTypeRef ) )? ( ";" | parseBlockStmt )
parseMethodParam ::=  TK_var? parseParamWithOptType
parseParamWithOptType ::=  parseIdentCommon ( ":" parseTypeRef )?
parseSubExpr ::=  parseTerm ( termMultSuffix termMultSuffix* incOrDec? | incOrDec )?
addBinOpSuffixes ::=  parseInfix parseSubExpr ( parseInfix parseSubExpr )*
parseTerm ::=  TK_if "(" parseExpr "," parseExpr ( "," parseExpr )? ")" | TK_true | TK_false | TK_null | "-"? ( parseNumber | parseTupleExpr | parseIdentCommon ) | ( "!" | "~" ) parseSubExpr | parseByteLiteral | parseStringLiteral | parseArrayLiteral | parseParamExpr | incOrDec parseSubExpr
termMultSuffix ::=  addMemberSuffix | parseTupleExpr | parseArrayLiteral
incOrDec ::=  "++" | "--"
addMemberSuffix ::=  "." ( parseIdentUnchecked | ( "!" | "?" ) ( "<" parseTypeRef ( "," parseTypeRef )* ">" )? | parseInfix | intcon | "~" | "[" "]" "="? )
parseArrayLiteral ::=  "[" ( parseExpr ( "," parseExpr )* )? "]"
parseIdentUnchecked ::=  parseIdentCommon
parseInfix ::=  "==" | "!=" | "||" | "&&" | "<" | "<=" | ">" | ">=" | ( "|" | "&" | "<<" | "<<<" | TK_shr | ">>>" | "+" | "-" | "*" | "/" | "%" | "^" ) "="?
parseParamExpr ::=  "_"

//
// tokens
//

TK_break ::= "break"
TK_case ::= "case"
TK_class ::= "class"
TK_component ::= "component"
TK_continue ::= "continue"
TK_def ::= "def"
TK_else ::= "else"
TK_enum ::= "enum"
TK_export ::= "export"
TK_extends ::= "extends"
TK_false ::= "false"
TK_for ::= "for"
TK_if ::= "if"
TK_import ::= "import"
TK_in ::= "in"
TK_layout ::= "layout"
TK_match ::= "match"
TK_new ::= "new"
TK_null ::= "null"
TK_private ::= "private"
TK_return ::= "return"
TK_struct ::= "struct"
TK_super ::= "super"
TK_this ::= "this"
TK_true ::= "true"
TK_type ::= "type"
TK_var ::= "var"
TK_while ::= "while"
TK_shr ::= ">>"
TK_rbrace ::= "}"

Here is the LL(1) parser, which still needs fixes before it can parse all the .v3 files of this project:

#include "Scanner-virgil.nut"

COMPILER Virgil
	int scanStateDepth = 0;

TERMINALS
	T_SYMBOL

CHARACTERS
	letter    = "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz".
	oct        = '0'..'7'.
	digit     = "0123456789".
	bindigit     = "01".
	bindigitwsep     = "01_".
	nzdigit    = '1'..'9'.
	digitwsep     = "0123456789_".
	cr        = '\r'.
	lf        = '\n'.
	tab       = '\t'.
	stringCh  = ANY - '"' - '\\' - cr - lf.
	charCh    = ANY - '\'' - '\\' - cr - lf.
	printable = '\u0020' .. '\u007e'.
	hex       = "0123456789abcdefABCDEF".
	hexwsep       = "0123456789abcdefABCDEF_".

	newLine   = cr + lf.
	notNewLine = ANY - newLine .
	ws         = " " + tab + '\u000b' + '\u000c'.

TOKENS
	ident     = letter { letter | digit | '_'}.
	identParam  = letter { letter | digit | '_'} '<'.
	floatcon =
		( digit {digitwsep} '.' digit {digitwsep} [('e'|'E')  ['+'|'-'] digit {digit}]
		| digit {digitwsep} ('e'|'E')  ['+'|'-'] digit {digit}
		) ['f'|'F' | 'd' | 'D']
		| digit {digitwsep} ('f'|'F') .

	intcon   = ( digit {digitwsep}
		//| '0' {oct}
		| ("0x"|"0X") hex {hexwsep}
		) [('u'|'U') ['l'|'L'] | ('l'|'L') ['u'|'U'] | ('d' | 'D')] .

	bincon  = '0' ('b' | 'B') bindigit {bindigitwsep} ['u' | 'U'].

	string    = '"' { stringCh | '\\' printable } '"'.
	badString = '"' { stringCh | '\\' printable } (cr | lf).
	charcon      = '\'' ( charCh | '\\' printable { hex } ) '\''.

	TK_break = "break" .
	TK_case = "case" .
	TK_class = "class" .
	TK_component = "component" .
	TK_continue = "continue" .
	TK_def = "def" .
	TK_else = "else" .
	TK_enum = "enum" .
	TK_export = "export" .
	TK_extends = "extends" .
	TK_false = "false" .
	TK_for = "for" .
	TK_if = "if" .
	TK_import = "import" .
	TK_in = "in" .
	TK_layout = "layout" .
	TK_match = "match" .
	TK_new : ident = "new" .
	TK_null = "null" .
	TK_private = "private" .
	TK_return = "return" .
	TK_struct = "struct" .
	TK_super = "super" .
	TK_this : ident = "this" .
	TK_true = "true" .
	TK_type = "type" .
	TK_var = "var" .
	TK_while = "while" .

	//types
	//TK_Array = "Array" .
	//TK_bool = "bool" .
	//TK_byte = "byte" .
	//TK_double = "double" .
	//TK_float = "float" .
	//TK_int = "int" .
	//TK_long = "long" .
	//TK_short = "short" .
	//TK_string = "string" .
	//TK_void = "void" .

	//Operators
	TK_shr = ">>" . //(. print("<<DAD>>"); .)

	//Punctuation
	TK_rbrace = '}' .

PRAGMAS

	COMMENTS FROM "/*" TO "*/" NESTED
	COMMENTS FROM "//" TO lf

IGNORE cr + lf + tab

/*-------------------------------------------------------------------------*/

PRODUCTIONS

Virgil =
	{parseToplevelDecl}
	EOF
	.

parseToplevelDecl =
	"class" parseIdentCommon ['(' [parseClassParam {',' parseClassParam}] ')'] ["extends" parseTypeRef] [parseTupleExpr] parseMembers
	| "component" parseIdentVoid parseMembers
	| "import" "component" parseIdentVoid parseMembers
	| parseVar
	| parseDef
	| parseVariant
	| parseEnum
	| parseExport
	.

parseVariant =
	"type" parseIdentCommon ['(' [parseVariantCaseParam {',' parseVariantCaseParam}] ')'] parseVariantCases
	.

parseVariantCases =
	'{' {parseVariantCase}'}'
	.

parseVariantCase =
	parseDef
	| "case" parseIdentVoid ['(' [parseVariantCaseParam {',' parseVariantCaseParam}] ')'] (';' | parseMembers)
	.

parseExport =
	"export" (parseDef | (parseStringLiteral | parseIdent) ['=' parseIdent [parseDottedVarExpr]] ';')
	.

parseEnum =
	"enum" parseIdentVoid ['(' [parseEnumParam {',' parseEnumParam}] ')']
		'{' [ parseEnumCase {','
			(. if(la.kind == ParserTokens._TK_rbrace) {break; /*allow trailing separator*/} .)
			parseEnumCase} ] '}'
	.

parseEnumCase =
	parseIdentVoid ['(' [parseExpr {',' parseExpr}] /*[',']*/ ')']
	.

parseMembers =
	'{' {parseMember} '}'
	.

parseMember =
	["private"] (parseDef | parseNew | parseVar)
	.

parseNew =
	"new" '(' [parseNewParam {',' parseNewParam}] ')' [([':'] "super") parseTupleExpr] parseBlockStmt
	.
/*
parseParamCommon =
	parseIdentCommon [parseIdentVoid] [':' parseTypeRef]
	.
*/
parseTypeRef =
	(
		'(' [parseTypeRef {',' parseTypeRef}] ')'
		| parseIdentCommon {'.' parseIdentCommon}
	)
	{"->" parseTypeRef}
	.

parseTypeParam =
	parseIdentVoid
	.

parseStmt =
	parseBlockStmt
	| parseEmptyStmt
	| parseIfStmt
	| parseWhileStmt
	| parseMatchStmt
	| parseVarStmt
	| parseDefStmt
	| parseBreakStmt
	| parseContinueStmt
	| parseReturnStmt
	| parseForStmt
	| parseExprStmt
	.

parseBlockStmt =
	'{' {parseStmt} '}'
	.

parseEmptyStmt =
	';'
	.

parseControlExpr =
	'(' parseExpr ')'
	.

parseIfStmt =
	"if" parseControlExpr parseStmt ["else" parseStmt]
	.

parseWhileStmt =
	"while" parseControlExpr parseStmt
	.

parseForStmt =
	"for" '(' parseLocal ('<' parseExpr | "in" parseExpr | ';' parseExpr ';' parseExpr) ')' parseStmt
	.

parseMatchStmt =
	"match" parseControlExpr '{' [parseMatchCase {parseMatchCase}] '}' ["else" parseStmt]
	.

parseMatchCase =
	'_' "=>" parseStmt
	| matchPattern {matchPattern}
	.

matchPattern =
	parseMatchPattern {',' parseMatchPattern} "=>" parseStmt
	.

parseMatchPattern =
	parseIdMatchPattern
	| parseByteLiteral
	| ['-'] parseNumber
	.

parseDottedVarExpr =
	'.' parseTypeRef {'.' parseTypeRef}
	.

parseIdMatchPattern =
	("true" | "false" | "null")
	| parseIdentCommon (
		':' parseTypeRef
		| [parseDottedVarExpr] ['(' [parseMatchParam {',' parseMatchParam}] ')']
	)
	.

parseMatchParam =
	parseIdentVoid
	.

parseVarStmt =
	"var" parseIdentVoid {parseVars}
	.

parseDefStmt =
	"def" parseIdentVoid {parseVars}
	.

parseBreakStmt =
	"break" ';'
	.

parseContinueStmt =
	"continue" ';'
	.

parseReturnStmt =
	"return" [parseExpr] ';'
	.

parseVar =
	"var" parseIdentVoid parseFieldSuffix
	.

parseDef =
	"def" ["var"] (parseIndexed | parseIdentCommon (parseMethodSuffix | parseFieldSuffix))
	.

parseIndexed =
	'[' [parseMethodParam {',' parseMethodParam}] ']' ('=' parseMethodParam | "->" parseTypeRef) (';' | parseBlockStmt)
	.

parseMethodSuffix =
	'(' [parseMethodParam {',' parseMethodParam}] ')' ["->" ("this" | parseTypeRef)] (';' | parseBlockStmt)
	.

parseParamWithOptType =
	parseIdentCommon [':' parseTypeRef]
	.

parseMethodParam =
	["var"] parseParamWithOptType
	.

parseNewParam =
	["var"] parseParamWithOptType
	.

parseClassParam =
	["var"] parseParamWithOptType
	.

parseEnumParam =
	parseParamWithOptType
	.

parseVariantCaseParam =
	parseParamWithOptType
	.

parseExprStmt =
	parseExpr ';'
	.

parseExpr =
	parseSubExpr ['=' parseExpr | addBinOpSuffixes]
	.

parseSubExpr =
	parseTerm [termMultSuffix {termMultSuffix} [incOrDec] | incOrDec]
	.

incOrDec =
	"++" | "--"
	.

termMultSuffix =
	addMemberSuffix | parseTupleExpr | parseArrayLiteral
	.

addMemberSuffix = (. scanner.stateNo=6; .) //for tuple indexing by integers
	'.' (
		parseIdentUnchecked
		| ('!' | '?') ['<' parseTypeRef {',' parseTypeRef} '>']
		| parseInfix
		| intcon
		| '~'
		| '[' ']' ['=']
	) (. scanner.stateNo=0; .)
	.

parseTerm =
	"if" '(' parseExpr ',' parseExpr [',' parseExpr]')'
	//| parseVarExpr
	| "true"
	| "false"
	| "null"
	| ['-'] (parseNumber | parseTupleExpr | parseIdentCommon)
	| ('!' | '~') parseSubExpr
	| parseByteLiteral
	| parseStringLiteral
	| parseArrayLiteral
	| parseParamExpr
	| incOrDec parseSubExpr
	.

parseParamExpr =
	'_'
	.

parseByteLiteral =
	charcon
	.

parseStringLiteral =
	string
	.

parseTupleExpr =
	'(' [parseExpr {',' parseExpr}] ')'
	.

parseArrayLiteral =
	'[' [parseExpr {',' parseExpr}] ']'
	.

parseNumber =
	bincon //BinLiteral
	//| HexLiteral
	| floatcon //FloatLiteral
	| intcon //DecLiteral
	.
/*
parseVarExpr =
	parseIdentCommon
	| "true"
	| "false"
	| "null"
	.
*/
parseIdent =
	ident
	.

parseIdentVoid =
	ident
	.

parseIdentCommon =
	identParam
		(. if(scanStateDepth++ == 0) scanner.stateNo = 5; .)
		parseTypeRef {',' parseTypeRef} '>'
		(. if(--scanStateDepth == 0) scanner.stateNo = 0; .)
	| ident
	.

parseIdentUnchecked =
	parseIdentCommon
	.

parseLocal =
	parseIdentVoid [':' parseTypeRef] ['=' parseExpr]
	.

parseVars =
	[':' parseTypeRef] ['=' parseExpr] (',' parseIdentVoid [parseVars] | ';')
	.

parseFieldSuffix =
	parseVars
	.

addBinOpSuffixes =
	parseInfix parseSubExpr {parseInfix parseSubExpr}
	.

parseInfix =
	"=="
	| "!="
	| "||"
	| "&&"
	| '<'
	| "<="
	| '>'
	| ">="
	| (
		'|'
		| '&'
		| "<<"
		| "<<<"
		| ">>"
		| ">>>"
		| '+'
		| '-'
		| '*'
		| '/'
		| '%'
		| '^'
	) ['=']
	.

END Virgil.

non-void main in wasm - expected 1 elements on the stack for fallthru to @1, found 2

While trying to diagnose the remainder of #18, I changed hello-v3.v3 to read:

def main() -> int {
	System.puts("Hello world!\n");
	return 33;
}

and got this output after compilation:

user@host009:~/src/virgil/bin/dev$ wave hello-v3.wasm
<unknown>:-1: Uncaught CompileError: WebAssembly.Module(): Compiling wasm function
 "wasm-function[4]" failed: expected 1 elements on the stack for fallthru to @1, found 2 @+232
ERROR: could not compile hello-v3.wasm
255 user@host009:~/src/virgil/bin/dev$

Windows support

What steps will reproduce the problem?
1. Download virgil
2. Try to invoke aeneas.jar
3. There is no output.

What is the expected output? What do you see instead?
Since Virgil is bootstrapped and able to target the JVM, I expected it to work on
Windows out of the box.
But it depends on a shell script, which is Unix-specific.

What version of the product are you using? On what operating system?
Tried virgil-starter and latest source checkout (fe2ba7c2d39d), on Windows 8.1 
x64.

Please provide any additional information below.
Using Cygwin is not an option; it's very bloated and slow. I'll try to hack together an
equivalent of the virgil shell script, though.

Original issue reported on code.google.com by [email protected] on 22 Sep 2013 at 12:12

Port Virgil to x86-64 Linux

This is a tracking issue for porting Virgil to x86-64 Linux.

I've recently filled out lib/asm/x64/X64Assembler.v3 and soon will start on the codegen.

Add -program-name option

Add an option to allow renaming the binaries output from the compiler.

% v3c -program-name=Blah.exe Foo.v3

This writes the output binary to Blah.exe instead of Foo.

Impact estimate (select one)
----------------------
1 - trivial
================

Priority (select one)
----------------------
3 - high
================

Original issue reported on code.google.com by [email protected] on 13 Apr 2012 at 5:26

Improve register allocation for Virgil

Issue for tracking implementation of a graph-coloring based register allocation for Virgil with both spilling and splitting of live ranges. The splitting heuristic implemented should be cheap to compute and should not sacrifice the quality of allocation. A simple splitting heuristic is implemented in the V8 optimizing JS compiler.

The main concern with a graph-coloring approach is the cost of rebuilding the interference graph after splitting or spilling a live range. Ideally, the interference graph reconstruction is cheap. An approach like the tiling based approach of Callahan-Koblenz [1] can be considered.

Current Roadmap

1. Rip out liveness analysis from current register allocator
2. Interference graph construction and graph coloring (with spilling)
3. Refine splitting heuristic

[1]: Callahan, David, and Brian Koblenz. “Register Allocation via Hierarchical Graph Coloring.” Proceedings of the ACM SIGPLAN 1991 Conference on Programming Language Design and Implementation, vol. 26, no. 6, 1991, pp. 192–203.
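To make the second roadmap step concrete, here is a minimal sketch of Chaitin-style simplify/select coloring with pessimistic spilling. It is written in Python purely for illustration and is not code from the Virgil compiler. The names `color_graph`, `interference`, `num_regs`, and `spill_cost` are all hypothetical. The sketch only marks spilled nodes; a real allocator would insert spill code and rebuild the interference graph, which is exactly the cost discussed above.

```python
def color_graph(interference, num_regs, spill_cost=None):
    """Color an interference graph with at most num_regs colors.

    interference: dict mapping each live range to the set of ranges it
    interferes with (assumed symmetric). Returns (assignment, spilled).
    All names here are illustrative assumptions, not Virgil compiler APIs.
    """
    spill_cost = spill_cost or {}
    graph = {n: set(adj) for n, adj in interference.items()}
    remaining = set(graph)
    stack = []    # nodes that are guaranteed colorable, in removal order
    spilled = []  # nodes chosen for spilling (left uncolored)

    # Simplify: repeatedly remove a node with fewer than num_regs neighbors.
    while remaining:
        low = [n for n in sorted(remaining)
               if len(graph[n] & remaining) < num_regs]
        if low:
            node = low[0]
            stack.append(node)
        else:
            # No trivially colorable node: spill the one with the lowest
            # cost-per-degree (ties broken by name for determinism).
            node = min(sorted(remaining),
                       key=lambda n: spill_cost.get(n, 1)
                       / max(1, len(graph[n] & remaining)))
            spilled.append(node)
        remaining.remove(node)

    # Select: pop nodes and give each the lowest color unused by neighbors.
    assignment = {}
    while stack:
        node = stack.pop()
        used = {assignment[m] for m in graph[node] if m in assignment}
        assignment[node] = next(r for r in range(num_regs) if r not in used)
    return assignment, spilled


if __name__ == "__main__":
    g = {"a": {"b", "c", "d"}, "b": {"a", "c"}, "c": {"a", "b"}, "d": {"a"}}
    print(color_graph(g, 2))
```

With two registers, the triangle a-b-c cannot be colored, so one of its nodes is spilled and the rest receive registers. A splitting heuristic would instead try to break the chosen live range into shorter pieces before giving up on it.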

Improve WASM targets

Virgil compiles to WASM and has three different runtime implementations:

(1) rt/wave, the (W)eb(A)ssembly (V)irgil (E)nvironment, a minimal set of imports just to get Virgil running, which is implemented only in Wizard, plus a bit-rotted version in V8 using the C embedding API.
(2) rt/wasi_snapshot_preview1, an implementation of System against the wasi_snapshot_preview1 API, but it is not complete.
(3) rt/node, an implementation of System against the node.js APIs.

I could use some help with (2), as there are a few things not working. Also, there is no equivalent of chmod, which is a minor thing, but needed to bootstrap the compiler.

Also, I haven't fully tested (3) or made it robust. Could use some help here.

Lambda support

virgil/doc/tutorial/Functions.md

Line 174 in 3709c05

Virgil doesn't currently support lambdas, but support will be added soon.

I guess this means anonymous functions with closures?

For example:

def intSeq() -> () -> int {
    var i = 0;
    return () -> int {
        i = i + 1;
        return i;
    }
}

Precedence of virgil syntax

Is there any material that specifies operator precedence in Virgil's syntax?

I'm trying to write a grammar.js for Virgil for tree-sitter, which would bring syntax highlighting and basic code navigation for Virgil to common editors such as Vim and VS Code.

For comparison, see the Rust precedence table: https://github.com/tree-sitter/tree-sitter-rust/blob/0f14a10011ac6e56f309fb99a94829c3312b743a/grammar.js#L1
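For context on why a table is needed: a grammar that parses binary expressions as a flat sequence of operands and infix operators leaves precedence to a later pass. Below is a minimal sketch of that pass (precedence climbing), in Python for illustration only. The `PRECEDENCE` values are placeholders I made up, not Virgil's actual levels, and `fold` is a hypothetical name.

```python
# Placeholder precedence levels (higher binds tighter).
# These numbers are assumptions for illustration, NOT taken from Virgil.
PRECEDENCE = {
    "||": 1, "&&": 2,
    "==": 3, "!=": 3,
    "<": 4, "<=": 4, ">": 4, ">=": 4,
    "+": 5, "-": 5,
    "*": 6, "/": 6, "%": 6,
}


def fold(operands, operators):
    """Turn a flat `a op b op c ...` sequence into a nested tuple tree.

    All operators are treated as left-associative.
    """
    assert len(operands) == len(operators) + 1
    pos = 0  # index of the next unconsumed operator

    def climb(lhs, min_prec):
        nonlocal pos
        while pos < len(operators) and PRECEDENCE[operators[pos]] >= min_prec:
            op = operators[pos]
            pos += 1
            rhs = operands[pos]
            # Let tighter-binding operators to the right claim rhs first.
            while (pos < len(operators)
                   and PRECEDENCE[operators[pos]] > PRECEDENCE[op]):
                rhs = climb(rhs, PRECEDENCE[operators[pos]])
            lhs = (op, lhs, rhs)
        return lhs

    return climb(operands[0], 0)


if __name__ == "__main__":
    print(fold(["a", "b", "c"], ["+", "*"]))
```

In a tree-sitter grammar.js the same information would be expressed declaratively, with one `prec.left(level, ...)` rule per operator group, once the real levels are known.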

Rename !Exception to !Violation

Since these safety violations result in a termination of the program, it's not 
really accurate to call them exceptions. Violation = Termination!

Original issue reported on code.google.com by [email protected] on 28 Mar 2012 at 6:52

compile-time initialization System.puts error

Is it supposed to still work?

lib/util: Implement %f to print floating point numbers

Since floating point was added back in 2020, the Virgil utility libraries have not yet gained support for printing/rendering floating-point numbers (in decimal). In C, the printf format string supports %f to specify that an argument is a floating-point number. The analogous place to add code to handle this is in StringBuilder, which can deal with floating point strings.

Slight differences between C printf and Virgil StringBuilder:

StringBuilder uses %d (for decimal) for specifying decimal output of integer values. I think it'd be nice to just extend
StringBuilder doesn't yet support specifying the width (in characters) of the output item, nor left- or right-justification
Performance of printing floats isn't highly important; as long as the routine doesn't allocate memory, it should be fine to use a naive algorithm

I could use some help on this, and it might make a good starter project for someone new wanting to kick the tires with Virgil and contribute.
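For anyone picking this up, here is a rough sketch of what a naive fixed-precision algorithm could look like. It is in Python only to show the digit-by-digit logic and is not a proposal for the actual StringBuilder API. `render_fixed` and its `precision` parameter are hypothetical names. A Virgil version would write digits into a preallocated buffer instead of building a list.

```python
def render_fixed(value: float, precision: int = 6) -> str:
    """Render value in decimal with exactly `precision` fractional digits.

    Naive approach: scale by 10**precision, round to an integer, then peel
    off decimal digits. Hypothetical helper, not part of Virgil's lib/util.
    """
    if value != value:  # NaN is the only value not equal to itself
        return "nan"
    sign = "-" if value < 0 else ""
    magnitude = -value if value < 0 else value
    if magnitude == float("inf"):
        return sign + "inf"

    scale = 10 ** precision
    scaled = int(magnitude * scale + 0.5)  # round half up on the scaled value
    int_part, frac_part = divmod(scaled, scale)

    digits = []  # collected least-significant digit first, reversed at the end
    for _ in range(precision):
        frac_part, d = divmod(frac_part, 10)
        digits.append(chr(ord("0") + d))
    if precision > 0:
        digits.append(".")
    if int_part == 0:
        digits.append("0")
    while int_part > 0:
        int_part, d = divmod(int_part, 10)
        digits.append(chr(ord("0") + d))
    return sign + "".join(reversed(digits))


if __name__ == "__main__":
    print(render_fixed(3.25, 2))
```

This is deliberately not a shortest-round-trip algorithm: values that are not exactly representable in binary may round differently from C's printf in the last digit, which seems acceptable given the stated priorities.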

How to compile Virgil into WASI

I've been playing with the idea of getting compilers and new languages into the Wasm/WASI world, as I really think it enriches the ecosystem.
Zig recently added support for compiling to Wasm/WASI, and I was wondering how we could get the self-hosting Virgil compiler into the WASI world.

I tried playing a bit with the /bin/dev bash files, but I was unable to compile Virgil (the compiler itself) to WASI.
Any help will be greatly appreciated!

interop with C?

How is interop with C supposed to work? Via an FFI or Clang preprocessing?

Aeneas.wasm hello-v3.v3 - no output

user@host009:~/src/virgil/bin/dev$ wave ../current/wasm/Aeneas.wasm run \
../../rt/wave/wave.v3 ../../rt/wave/System.v3 ../../test/wave/hello-v3.v3
user@host009:~/src/virgil/bin/dev$ echo $?
0

Shouldn't Rosetta be able to run x86?

/opt/virgil/ arch
i386
/opt/virgil/ ./bin/v3c-x86-darwin demo.v3
/opt/virgil/ ./demo             
zsh: bad CPU type in executable: ./demo

Enable optimizations later in the pipeline

With 2b34472, I've added switches for turning on optimizations later in the pipeline, after specialization and normalization.

This is a tracking issue. There are a couple of bugs, and inlining should be enabled as well.

Allow ; for empty constructors

Syntactically, ";" for an empty constructor body is allowed, but it just results in
a !UnimplementedException at runtime when invoking the constructor, which is not very
useful. Either allow ";" for an empty constructor body, or make it a compile-time
error.

Original issue reported on code.google.com by [email protected] on 28 Mar 2012 at 6:49

Idea: TS Transpiler to Virgil

Since TypeScript can have programs with fully statically defined types, I wonder if it may be possible to have a transpiler that transforms TS codebases into Virgil and then we use it to compile to very lean Wasm code.

There are already some projects that do static compilation of TypeScript:

AssemblyScript (subset of TS) https://www.assemblyscript.org/
TypeScriptCompiler (compiles TS to LLVM IR) https://github.com/ASDAlexander77/TypeScriptCompiler
Microsoft's PXT https://github.com/microsoft/pxt/tree/master/pxtcompiler (it has an interpreter, an optimal JS converter and also supports ARM Thumb compilation)

Could this be achieved with Virgil? Where do you think the limitations may lie?

Simplified for loop

This kind of for loop (/lib/util/Arrays.v3) is not documented in the tutorial:

for (i < index) n[i] = array[i];

How is the variable i initialized and incremented?
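As far as I understand the shorthand, `for (i < index)` declares a fresh loop variable i, starts it at 0, and increments it by 1 until it reaches index. Treat that reading as an assumption until the tutorial confirms it. Under that assumption, the line from Arrays.v3 behaves like this Python model (`copy_prefix` is a hypothetical name):

```python
def copy_prefix(array, index):
    """Model of `for (i < index) n[i] = array[i];` under the assumed semantics."""
    n = [0] * index
    # Assumed meaning of Virgil's `for (i < index)`: i = 0, 1, ..., index - 1
    for i in range(index):
        n[i] = array[i]
    return n


if __name__ == "__main__":
    print(copy_prefix([7, 8, 9], 2))
```

In other words, the short form would be equivalent to writing out the initialization, the comparison, and the increment by hand in a three-part for loop.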

x86-64-linux executables exit with random non-zero exit codes

This is an odd one: x86-64-linux executables sometimes return a (seemingly random) non-zero exit code from the main() function:

$ cat hello.v3 && v3c-x86-64-linux hello.v3 && ./hello; echo $?
def main() {
  System.puts("Hello World\n");
}
Hello World
216

If a second print statement is added the exit code is zero:

$ cat hello.v3 && v3c-x86-64-linux hello.v3 && ./hello; echo $?
def main() {
  System.puts("Hello World\n");
  System.puts("Hello World\n");
}
Hello World
Hello World
0

Explicitly returning an exit code from main() seems to resolve the problem:

$ cat hello.v3 && v3c-x86-64-linux hello.v3 && ./hello; echo $?
def main() -> int {
  System.puts("Hello World\n");
  return 0;
}
Hello World
0

The x86-linux target does not seem to exhibit this behaviour.

$ cat hello.v3 && v3c-x86-linux hello.v3 && ./hello; echo $? 
def main() {
  System.puts("Hello World\n");
}
Hello World
0

Support floating point

Language or Compiler Enhancement?
===========================

Language and Compiler

Justification (summary of perceived benefit)
===============================

Support for 32 and 64 bit floating point numbers is necessary to perform all 
manner of scientific and graphics calculations.

Impact estimate (select one)
----------------------
3 - large
================

Priority (select one)
----------------------
2 - medium
================