mattjquinn / jcompiler

32.0 3.0 0.0 12.74 MB

a compiler for the J programming language

License: GNU General Public License v3.0

J 3.70% Rust 70.44% C 23.50% Shell 2.26% Assembly 0.09%

j c rust llvm arm

jcompiler's Issues

Verbs should inherently be entities that can be printed to stdout

In ijconsole, writing a verb like + and hitting return results in the text representation of that verb being printed to stdout; jcompiler should do the same.

Dyadic verbs should display "length error" for, e.g., 1 2 3 + 4 5

Compiler should replicate interpreter's behavior/wording:

$ ijconsole
   1 2 + 3 4 5
|length error
|   1 2    +3 4 5
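A minimal sketch of where such a check could live, assuming a simplified noun model restricted to integer scalars and lists. The `Noun` type and `dyadic_plus` function are hypothetical and not taken from jcompiler's runtime; only the first line of the interpreter's message is reproduced.

```rust
/// A J noun restricted to integer scalars and lists (hypothetical type).
#[derive(Debug, Clone, PartialEq)]
enum Noun {
    Scalar(i64),
    List(Vec<i64>),
}

/// Dyadic `+` with the length check done before any arithmetic.
fn dyadic_plus(x: &Noun, y: &Noun) -> Result<Noun, String> {
    match (x, y) {
        (Noun::Scalar(a), Noun::Scalar(b)) => Ok(Noun::Scalar(a + b)),
        // A scalar is extended to every element of the list.
        (Noun::Scalar(a), Noun::List(ys)) => {
            Ok(Noun::List(ys.iter().map(|v| a + v).collect()))
        }
        (Noun::List(xs), Noun::Scalar(b)) => {
            Ok(Noun::List(xs.iter().map(|v| v + b).collect()))
        }
        // Two lists must have the same length; otherwise report the error.
        (Noun::List(xs), Noun::List(ys)) => {
            if xs.len() != ys.len() {
                return Err("|length error".to_string());
            }
            Ok(Noun::List(xs.iter().zip(ys).map(|(a, b)| a + b).collect()))
        }
    }
}
```

Returning a `Result` rather than exiting inside the verb leaves it to the caller to print the message and the offending sentence.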

Add benchmarks, with errors for regressions

Benchmark both compiler speed, and the speed of compiled binaries.

Add initial support for dyadic $ (Shape) verb.

https://code.jsoftware.com/wiki/Vocabulary/dollar#dyadic

Binaries segfault when global ident is used without declaration/initialization

The compiler correctly assigns IDs to global identifiers in the global table, but it does not enforce the requirement that an identifier be assigned before it is used. A simple change to llvm.rs is needed: if an ident hasn't been declared/initialized yet, refuse to compile.
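One way such a check could look, as a sketch only: a pass over the statements that tracks which globals have been assigned so far. The `Expr` and `Stmt` types are hypothetical stand-ins, not the AST that llvm.rs actually consumes, and the error wording is made up.

```rust
use std::collections::HashSet;

/// Minimal stand-in for the compiler's AST (hypothetical).
enum Expr {
    Number(i64),
    Ident(String),
    Add(Box<Expr>, Box<Expr>),
}

enum Stmt {
    /// `name =: expr`
    GlobalAssign(String, Expr),
    /// A bare expression whose value is printed.
    Print(Expr),
}

/// Fails if `expr` mentions an identifier that has not been assigned yet.
fn check_expr(expr: &Expr, assigned: &HashSet<String>) -> Result<(), String> {
    match expr {
        Expr::Number(_) => Ok(()),
        Expr::Ident(name) => {
            if assigned.contains(name) {
                Ok(())
            } else {
                Err(format!("identifier `{}` used before assignment", name))
            }
        }
        Expr::Add(lhs, rhs) => {
            check_expr(lhs, assigned)?;
            check_expr(rhs, assigned)
        }
    }
}

/// Walks the program in order, refusing to continue on the first bad use.
fn check_program(program: &[Stmt]) -> Result<(), String> {
    let mut assigned = HashSet::new();
    for stmt in program {
        match stmt {
            Stmt::GlobalAssign(name, value) => {
                // The right-hand side is checked before the name becomes visible.
                check_expr(value, &assigned)?;
                assigned.insert(name.clone());
            }
            Stmt::Print(expr) => check_expr(expr, &assigned)?,
        }
    }
    Ok(())
}
```

Running the check before any IR is emitted turns the segfault into a compile-time error.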

Run valgrind and clang static analysis on each build

Run with all warnings switched on, stop build if any warnings/errors.

Support padded print and width detection for doubles

e.g.,

a =: 2 3 $ 1.2323 4.43435 9.23333333333333 8.32 9.1121111111111 5
a

currently triggers an "unsupported" message and an exit. Add the appropriate snprintf/printf calls, as was done for integers.
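The width-detection idea can be sketched in two passes: format every cell first, then right-align each column to its widest cell. This is written in Rust for illustration, although the issue concerns the C print code. `format_cell` and `format_table` are hypothetical names, and the five-decimal limit is an arbitrary simplification rather than J's exact display rule.

```rust
/// Format one double with at most five decimals, trimming trailing zeros.
fn format_cell(v: f64) -> String {
    let s = format!("{:.5}", v);
    s.trim_end_matches('0').trim_end_matches('.').to_string()
}

/// Render a table of doubles with right-aligned, per-column padding.
fn format_table(rows: &[Vec<f64>]) -> String {
    // Pass 1: format every cell so the widths can be measured.
    let cells: Vec<Vec<String>> = rows
        .iter()
        .map(|row| row.iter().map(|&v| format_cell(v)).collect())
        .collect();

    // Each column is as wide as its widest cell.
    let ncols = cells.iter().map(|row| row.len()).max().unwrap_or(0);
    let mut widths = vec![0usize; ncols];
    for row in &cells {
        for (j, cell) in row.iter().enumerate() {
            widths[j] = widths[j].max(cell.len());
        }
    }

    // Pass 2: pad each cell to its column width, one space between columns.
    let lines: Vec<String> = cells
        .iter()
        .map(|row| {
            row.iter()
                .enumerate()
                .map(|(j, cell)| format!("{:>width$}", cell, width = widths[j]))
                .collect::<Vec<_>>()
                .join(" ")
        })
        .collect();
    lines.join("\n")
}
```

For the 2 3 table above, the first column ends up six characters wide and the other two seven, so the 5 in the last row is padded on the left.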

Print large numbers using scientific notation

Above a certain magnitude, J displays numbers in scientific notation, e.g., 5e36. Modify jprint to handle this.
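As a rough illustration of the switch, again in Rust rather than the C of jprint: the threshold below is an assumed value chosen for the sketch, not the magnitude at which J itself changes notation, and `format_number` is a hypothetical name.

```rust
/// Magnitude at which this sketch switches notation (assumed, not taken from J).
const SCI_THRESHOLD: f64 = 1e16;

/// Plain notation below the threshold, exponent notation at or above it.
fn format_number(v: f64) -> String {
    if v.abs() >= SCI_THRESHOLD {
        // `{:e}` prints the shortest mantissa followed by `e` and the exponent.
        format!("{:e}", v)
    } else {
        format!("{}", v)
    }
}
```

With this threshold, `format_number(5e36)` yields `5e36` and `format_number(1234.0)` yields `1234`. Negative values would still need J's `_` sign convention applied on top.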

Experiment with defining C runtime funcs in Rust instead on branch `with-rust-runtime`

Possibly use Cargo's workspace feature to contain both the compiler crate and the dylib runtime crate so that they can share definitions such as type and verb enums.

Add benchmarks for compiled binaries.

A set of J programs that together exercise a wide subset of J should be benchmarked to ensure compiler/runtime changes don't cause performance regressions.

Add support for sparse n-dim arrays and/or lazy evaluation

Will improve performance when using large arrays that either:

don't have their internal elements accessed at all (e.g., $ 100 100 100 $ 9999)
only have parts of their element space accessed
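A sketch of the lazy variant, assuming a hypothetical `LazyReshape` type that records the requested shape and the source items without ever building the full array. The shape query is answered from metadata alone, and elements are computed only when indexed.

```rust
/// A reshape that is described but never materialized (hypothetical type).
struct LazyReshape {
    shape: Vec<usize>,
    source: Vec<i64>,
}

impl LazyReshape {
    /// Models `shape $ source`; nothing beyond `source` is allocated.
    fn new(shape: Vec<usize>, source: Vec<i64>) -> Self {
        LazyReshape { shape, source }
    }

    /// Monadic `$` (Shape Of): no element is touched.
    fn shape(&self) -> &[usize] {
        &self.shape
    }

    /// Element at a multi-dimensional index, computed on demand.
    /// The row-major offset wraps around the source, as reshape reuses
    /// its right argument cyclically.
    fn get(&self, index: &[usize]) -> Option<i64> {
        if index.len() != self.shape.len() || self.source.is_empty() {
            return None;
        }
        let mut flat = 0usize;
        for (&i, &dim) in index.iter().zip(&self.shape) {
            if i >= dim {
                return None; // out of bounds along this axis
            }
            flat = flat * dim + i;
        }
        Some(self.source[flat % self.source.len()])
    }
}
```

Under this model, `$ 100 100 100 $ 9999` only reads the stored shape, and the million-element array is never allocated.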

Tests fail to run in parallel due to memory leak/sharing in compiler.

Running the tests in parallel (cargo's default) fails due to memory corruption; this is likely due to something being mismanaged in llvm.rs.

Add Rust macros to eliminate duplication of code

e.g., in compiler_tests.rs

Before next stable release, add rustdoc comments to all Rust and C functions

Add tests for dyadic Power, Residue, Reciprocal verbs (using both ints and doubles like the tests for division)

Reorganize llvm.rs, make as much as possible private

For example, global_idents map on Module struct should not be accessible outside module.

Improve parser/compiler error messages (especially using `ansi_term` crate).

The PEST parser already provides good error messages, but compile-time failures need improvement.

SIMD/CUDA support

J's focus on arrays lends itself naturally to parallelization. Experiment with ways to harness SIMD/CUDA for extra speedups.

Implement J's random generator to gain a source of dynamic expressions with which to test.

This will allow writing of tests that ensure code doesn't expect that all expressions are known statically at compile time.
Use C's srand(seed) to implement.

Support dyadic `,` Append verb for strings.

i.e., string concatenation

Replace exit() calls in the C library code with a handler that cleans up properly

In LLVM IR builder, only import/declare J library functions that are actually used by the program being compiled.

e.g., if a program doesn't call any monads, there's no need to import jmonad from jverbs.c

For release 0.1.0, update CHANGELOG.md and provide precompiled binaries in "Releases" tab.

Eliminate code duplication among print-related functions.

Current print code has a large amount of overlap due to need to detect table column widths; format specifiers are used multiple times rather than in a single place. This should all be consolidated.

Write more parser tests.

Because one is not enough...

Detect performance regressions of both the compiler and compiled binaries during builds.

Use criterion's comparison feature to compare benchmarks with the previous commit; if a performance decrease occurs, the build should fail.

Unable to consistently deploy built documentation from Travis to GitHub Pages.

Using cargo doc-upload gives inconsistent results from the Travis build; it worked at one point but now it doesn't. Rather than use GitHub pages, something like S3 should be used to prevent use of a repo branch itself to store the documentation.

Also note that benchmark reports using criterion should be placed in a subfolder of the built docs so that the latest reports can be linked to from, e.g., README.md.

Replace JArrayType with JNDimensionalArrayType

All arrays should have multi-dimensional characteristics by default (i.e., shape and rank).

Support escaping of single quotes within strings.

Parser currently gets tripped up on escaped single quotes within strings.

Handle divide by zero

4 % 0 is currently displayed as inf; it should be displayed as _, as in J.
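A small sketch of the display mapping, assuming a hypothetical `format_j` helper; the actual fix would belong in the C print code. In J, `_` denotes infinity, `__` negative infinity, and a leading `_` marks a negative number. NaN is not handled here.

```rust
/// Render a double using J's conventions for infinities and negative signs.
fn format_j(v: f64) -> String {
    if v == f64::INFINITY {
        "_".to_string()
    } else if v == f64::NEG_INFINITY {
        "__".to_string()
    } else if v < 0.0 {
        // J writes negative numbers with a leading underscore.
        format!("_{}", -v)
    } else {
        format!("{}", v)
    }
}
```

Since IEEE 754 division already produces infinity for `4.0 / 0.0`, only the print path needs to change under this approach.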

Minimize heap allocations, put as much as possible on the stack.

For performance reasons, as much as possible should be allocated on the stack. Currently malloc calls are favored to keep the code understandable in its early stages, but in the future stack-only allocations should become a focus.

For instance, rather than malloc arrays we can decrement the stack pointer continually until we reach the end. The number of decrements is the length of the array. We can then apply a monadic function directly to this array, or for dyads, we remember the verb, store the next array, and intelligently operate on the two arrays. As a rough example:

1 + 2 3 4 becomes: sub $sp, store 1 $len1, (remember "+"), store $sp,
sub $sp (3 times), store 3 $len2, then call add 3 times

Arrays of random length, or whose length is unknown at compile time, should be handled the same way; the number of decrements to the stack pointer simply becomes variable, and verb code will need to handle it accordingly.

Arthur Whitney seems to use only the stack in his B language interpreter, so maybe get some inspiration there:
http://kparc.com/b/
https://docs.google.com/document/d/1W83ME5JecI2hd5hAUqQ1BVF32wtCel8zxb7WPq-D4f8/edit
https://github.com/tlack/b-decoded

And while this early interpreter for J uses malloc, there should be something to learn here as well:
https://code.jsoftware.com/wiki/Essays/Incunabulum

http://www.jsoftware.com/help/jforc/contents.htm#_Toc191734291

APL: A Glimpse of Heaven: https://news.ycombinator.com/item?id=19325361
K7 Tutorial: https://news.ycombinator.com/item?id=19418570
