jou's Introduction

Jou programming language

Jou is an experimental toy programming language. It looks like this:

import "stdlib/io.jou"

def main() -> int:
    puts("Hello World")
    return 0

See the examples and tests directories for more example programs or read the Jou tutorial.

So far, Jou is usable for writing small programs that don't have a lot of dependencies. For example, I solved all problems of Advent of Code 2023 in Jou. See examples/aoc2023 for the code.

Goals:

  • Minimalistic feel of C + simple Python-style syntax
  • Possible target audiences:
    • People who find C programming fun
    • Python programmers who want to try programming at a lower level (maybe to eventually learn C or Rust)
  • Compatibility with C, not just as one more feature but as the recommended way to do many things
  • Self-hosted compiler
  • Eliminate some stupid things in C. For example:
    • Many useful warnings being disabled by default
    • UB for comparing pointers into different memory areas (as in array <= foo && foo < array+sizeof(array)/sizeof(array[0]))
    • negative % positive is negative or zero, should IMO be positive or zero (unless that is a lot slower, of course)
    • Strict aliasing
    • int possibly being only 16 bits
    • long possibly being only 32 bits
    • char possibly being more than 8 bits
    • char possibly being signed
    • char being named char even though it's really a byte
  • Generics, so that you can implement a generic list (dynamically growing array) better than in C
  • Compiler errors for most common bugs in C (missing free(), double free(), use after free, etc.)
  • More keywords (def, decl, forwarddecl)
  • Enumerated unions = C union together with a C enum to tell which union member is active
  • Windows support that doesn't suck
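
To illustrate the "negative % positive" item in the list above, here is a minimal Python sketch comparing the two conventions. The function names c_style_mod and floor_mod are made up for this example and are not part of Jou. C truncates the quotient toward zero, so the remainder takes the sign of the dividend; Python floors the quotient, so the remainder takes the sign of the divisor.

```python
def c_style_mod(a: int, b: int) -> int:
    """Remainder as C's % computes it: the quotient is truncated toward zero."""
    quotient = abs(a) // abs(b)
    if (a < 0) != (b < 0):
        quotient = -quotient  # truncation toward zero, not flooring
    return a - quotient * b


def floor_mod(a: int, b: int) -> int:
    """Remainder with a floored quotient, which is what Python's % already does."""
    return a % b


print(c_style_mod(-7, 3))  # -1 (negative or zero)
print(floor_mod(-7, 3))    # 2  (positive or zero)
```

The floored version is handy for things like wrapping an index around an array, because the result is always a valid index when the divisor is positive.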

Non-goals:

  • Yet another big language that doesn't feel at all like C (C++, Zig, Rust, ...)
  • Garbage collection (should feel lower level than that)
  • Wrapper functions for the C standard library
  • Wrapper libraries for existing C libraries (should just use the C library directly)
  • Trying to detect every possible memory bug at compile time (Rust already does it better than I can, and even then it can be painful to use)
  • Copying Python's gotchas (e.g. complicated import system with weird syntax and much more weird runtime behavior)

Setup

These instructions are for using Jou. The instructions for developing Jou are in CONTRIBUTING.md.

Linux
  1. Install the dependencies:
    $ sudo apt install git llvm-14-dev clang-14 make
    
    Let me know if you use a distro that doesn't have apt, and you need help with this step.
  2. Download and compile Jou.
    $ git clone https://github.com/Akuli/jou
    $ cd jou
    $ make
    
  3. Run the hello world program to make sure that Jou works:
    $ ./jou examples/hello.jou
    Hello World
    
    You can now run other Jou programs in the same way.
  4. (Optional) If you want to run Jou programs with simply jou filename instead of something like ./jou filename or /full/path/to/jou filename, you can add the jou directory to your PATH. To do so, edit ~/.bashrc (or whatever other file you have instead, e.g. ~/.zshrc):
    $ nano ~/.bashrc
    
    Add the following line to the end:
    export PATH="$PATH:/home/yourname/jou/"
    
    Replace /home/yourname/jou/ with the path to the folder (not the executable file) where you downloaded Jou. Note that the ~ character does not work here, so you need to use a full path (or $HOME) instead.

These LLVM/clang versions are supported:

  • LLVM 11 with clang 11
  • LLVM 13 with clang 13
  • LLVM 14 with clang 14

By default, the make command picks the latest available version. You can also specify the version manually by setting the LLVM_CONFIG variable:

$ sudo apt install llvm-11-dev clang-11
$ make clean    # Delete files that were compiled with previous LLVM version
$ LLVM_CONFIG=llvm-config-11 make

MacOS

MacOS support is new. Please create an issue if something doesn't work.

  1. Install Git, make and LLVM 13. If you do software development on MacOS, you probably already have Git and make, because they come with Xcode Command Line Tools. You can use brew to install LLVM 13:
    $ brew install llvm@13
    
  2. Download and compile Jou.
    $ git clone https://github.com/Akuli/jou
    $ cd jou
    $ make
    
  3. Run the hello world program to make sure that Jou works:
    $ ./jou examples/hello.jou
    Hello World
    
    You can now run other Jou programs in the same way.
  4. (Optional) If you want to run Jou programs with simply jou filename instead of something like ./jou filename or /full/path/to/jou filename, you can add the jou directory to your PATH. To do so, edit ~/.bashrc (or whatever other file you have instead, e.g. ~/.zshrc):
    $ nano ~/.bashrc
    
    Add the following line to the end:
    export PATH="$PATH:/Users/yourname/jou/"
    
    Replace /Users/yourname/jou/ with the path to the folder (not the executable file) where you downloaded Jou. Note that the ~ character does not work here, so you need to use a full path (or $HOME) instead.

NetBSD

Support for NetBSD is still experimental. Please report bugs and shortcomings.

  1. Install the dependencies:
    # pkgin install bash clang git gmake libLLVM
    
    Optionally diffutils can be installed for coloured diff outputs.
  2. Download and compile Jou.
    $ git clone https://github.com/Akuli/jou
    $ cd jou
    $ gmake
    
  3. Run the hello world program to make sure that Jou works:
    $ ./jou examples/hello.jou
    Hello World
    
    You can now run other Jou programs in the same way.
  4. (Optional) If you want to run Jou programs with simply jou filename instead of something like ./jou filename or /full/path/to/jou filename, you can add the jou directory to your PATH. Refer to the manual page of your login shell for exact syntax.

NB: Using Clang and LLVM libraries built as a part of the base system is not currently supported.

64-bit Windows
  1. Go to releases on GitHub. It's in the sidebar on the right.
  2. Choose a release (latest is probably good) and download a .zip file whose name starts with jou_windows_64bit_.
  3. Extract the zip file somewhere on your computer.
  4. You should now have a folder that contains jou.exe, lots of .dll files, and subfolders named stdlib and mingw64. Add this folder to PATH. If you don't know how to add a folder to PATH, you can e.g. search "windows add to path" on YouTube.
  5. Write Jou code into a file and run jou filename.jou on a command prompt. Try the hello world program, for example.

Updating to the latest version of Jou

Run jou --update. On old versions of Jou that don't have --update, you need to instead delete the folder where you installed Jou and go through the setup instructions above again.

Editor support

Tell your editor to syntax-highlight .jou files as if they were Python files. You may want to copy some other Python settings too, such as how to handle indentations and comments.

If your editor uses a langserver for Python, make sure it doesn't use the same langserver for Jou. For example, vscode uses the Pylance language server, and you need to disable it for .jou files; otherwise you get lots of warnings whenever you edit Jou code that would be invalid as Python code.

For example, I use the following configuration with the Porcupine editor:

[Jou]
filename_patterns = ["*.jou"]
pygments_lexer = "pygments.lexers.Python3Lexer"
syntax_highlighter = "pygments"
comment_prefix = '#'
autoindent_regexes = {dedent = 'return( .+)?|break|pass|continue', indent = '.*:'}

To apply this configuration, copy/paste it to the end of Porcupine's filetypes.toml (menubar at top --> Settings --> Config Files --> Edit filetypes.toml).

How does the compiler work?

See CONTRIBUTING.md.

jou's People

Contributors

akuli, arrinao, littlewhitecloud, sumeshir26, taahol

jou's Issues

segfault with --verbose

akuli@Akuli-ThinkPad:~/jou$ make && ./jou --verbose examples/primes.jou 

...

line 7: Define a function: is_prime(n: int [line 7]) -> bool [line 7]
|--- [line 8] if
| |--- condition: [line 8] lt
| | |--- [line 8] get variable "n"
| | `--- [line 8] 2 (32-bit signed)
| `--- body:
|   `--- [line 9] return a value
|     `--- [line 9] False
|--- [line 11] For loop
| |--- init: [line 11] assign
| | |--- [line 11] get variable "divisor"
| | `--- [line 11] 2 (32-bit signed)
| |--- cond: [line 11] lt
| | |--- [line 11] get variable "divisor"
| | `--- [line 11] get variable "n"
| |--- incr: [line 11] expression statement
| | `--- [line 11] post-increment
| |   `--- [line 11] get variable "divisor"
| `--- body:
Segmentation fault

complicated subscript expressions

Doesn't work:

declare putchar(ch: int) -> int

def main() -> int:
    foo = "hey"
    putchar(foo[1+1])
    putchar('\n')
    return 0

expected output: y printed, no error

actual output: compiler error in file "asd.jou", line 5: expected a ']', got '+'

Unicode variable names

I wanted to make variables named yrjö and yrjö2, so I did this:

diff --git a/src/tokenize.c b/src/tokenize.c
index 559bb08..e47e5a9 100644
--- a/src/tokenize.c
+++ b/src/tokenize.c
@@ -66,7 +66,7 @@ static void unread_byte(struct State *st, char c)
 
 static bool is_identifier_first_byte(char c)
 {
-    return ('A'<=c && c<='Z') || ('a'<=c && c<='z') || c=='_';
+    return ('A'<=c && c<='Z') || ('a'<=c && c<='z') || c=='_' || (unsigned char)c > 127;
 }
 
 static bool is_identifier_continuation(char c)

-1

Currently the only way to create a negative number is to write e.g. 0-1. I think a simple -1 would be better.

It should probably be parsed like this: whenever we are parsing an expression with additions and subtractions, we check if the first token is -, and in that case the first term is negated. So -x+y parses as Add(Neg(Var(x)), Var(y)).
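
The parsing rule described above can be sketched in a few lines of Python. This is only an illustration and not the code in Jou's parser: the tokenizer, the tuple-based AST and the names tokenize, parse_term and parse_sum are all assumptions made for this example.

```python
import re


def tokenize(code):
    # Identifiers, integer literals, and the + and - operators only.
    return re.findall(r"[A-Za-z_]\w*|\d+|[-+]", code)


def parse_term(tokens):
    token = tokens.pop(0)
    if token.isdigit():
        return ("Int", int(token))
    return ("Var", token)


def parse_sum(tokens):
    # A leading "-" negates the first term only.
    negate = bool(tokens) and tokens[0] == "-"
    if negate:
        tokens.pop(0)
    result = parse_term(tokens)
    if negate:
        result = ("Neg", result)
    # The remaining terms are combined left to right.
    while tokens and tokens[0] in ("+", "-"):
        operator = tokens.pop(0)
        rhs = parse_term(tokens)
        result = ("Add" if operator == "+" else "Sub", result, rhs)
    return result


print(parse_sum(tokenize("-x+y")))  # ('Add', ('Neg', ('Var', 'x')), ('Var', 'y'))
print(parse_sum(tokenize("-1")))    # ('Neg', ('Int', 1))
```

Because the negation is applied before the loop, -x+y means (-x)+y rather than -(x+y).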

Moosems' thoughts about what Jou should/shouldn't have

  • "a good easy package manager like pip goes a long way"
  • "And good stack traces and errors like in python"
  • "I dislike the whole importing thing in C/C++ ... for example the header files and source files could just be one"

I agree with all of these points.

Double import

This is not an error, but it should be:

from "stdlib/io.jou" import printf, printf

Cyclic imports

If foo.jou imports bar.jou and bar.jou imports foo.jou, this should not be a problem. I don't think that quite works currently.

Importing a file that doesn't exist

File asd.jou contains:

from "x.jou" import printf

Error message is:

compiler error in file "x.jou": cannot open file: No such file or directory

It should IMO be something like:

compiler error in file "asd.jou", line 1: cannot import from "x.jou": No such file or directory

very difficult to cast to smaller type

To make a byte value that holds an int, say 128, you need to do:

    # Set x_byte to a byte value 128
    x_int = 128
    x_int_ptr: void* = &x_int
    x_byte_ptr: byte* = x_int_ptr
    x_byte: byte = *x_byte_ptr

This is ridiculous.

Should structs have methods?

In this context, "class" means "struct with methods".

C doesn't have methods in structs, but a lot of people would like to have a "C with classes" language with most of the minimality of C. This was historically C++ but definitely isn't anymore.

Another advantage of classes is the similarity to Python: all Python developers know classes and expect to see them in a language, and the class keyword syntax highlights nicely when abusing Python's syntax highlighting.

A disadvantage with classes is that they easily lead to a lot of unnecessary fluff that doesn't belong in a minimal language (like C or Jou):

  • Generics
  • Templates
  • Inheritance and other fancy/disgusting OOP tricks
  • Runtime type info
  • Static methods / class methods / extrafancy abstract boilerplate poopoo methods
  • ...

A bigger disadvantage of classes is that when you have classes, it becomes tempting to make wrappers for everything in the standard library. If you want to stay close to C (like C++ did and Jou does), you end up with two copies of everything. For example, in C++ there is std::cin and stdin, std::string and char*, smart pointers and C pointers, etc.

One more disadvantage is magic methods that can hide a lot, such as copy constructors, destructors and operator overloading. Ewww.

I think I will add classes to Jou, but intentionally restrict them a lot to prevent these problems:

  • No new standard-library classes. For example, there will be no std::string that hides all the fun details inside itself. Jou should be a language where you enjoy playing with pointers, not a language where you abstract everything away until your code is unreadable and then blame other people for not understanding it. (Looking at you, C++)
  • Wrapping C code is painless, and doesn't involve making new structs/classes. Below is an idea about how that could work.
# io.jou
opaque struct FILE:  # opaque = fields not known to compiler
    functionmethod fprintf(self: FILE*, format: byte*, ...) -> int
    functionmethod fclose(self: FILE*)



# asdasd.jou
import io

def main() -> int:
    file: FILE* = fopen("hello.txt", "w")
    file->fprintf("hello\n")
    file->fclose()
    return 0

importing creates weird file names

# a.jou
def foo(


# b.jou
from "./a.jou" import foo

akuli@akuli-desktop:~/jou$ ./jou b.jou
compiler error in file "././a.jou", line 1: expected an argument name, got end of line

What's up with ././a.jou, should be simply a.jou imo.
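
A guess at how the doubled prefix arises, and one way to get rid of it, sketched in Python. The function name display_path and both steps are assumptions made for illustration; the compiler itself is written in C and may build the path differently. Joining the importing file's directory (which is "." for b.jou) with "./a.jou" gives "././a.jou", and normalizing the joined path collapses the redundant parts.

```python
import posixpath


def raw_import_path(importing_file, import_string):
    # Directory of the importing file; "." when the file has no directory part.
    directory = posixpath.dirname(importing_file) or "."
    return posixpath.join(directory, import_string)


def display_path(importing_file, import_string):
    # normpath removes "." components and duplicate slashes.
    return posixpath.normpath(raw_import_path(importing_file, import_string))


print(raw_import_path("b.jou", "./a.jou"))  # ././a.jou
print(display_path("b.jou", "./a.jou"))     # a.jou
```

Note that normpath also collapses ".." purely textually, which can point at a different file when symlinks are involved, so a real fix might want to strip only the "." components.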

cannot assign to array[index]

declare malloc(size: int) -> void*

def main() -> int:
    x: int* = malloc(111)
    x[1] = 8

Expected result: no error

Actual result:

akuli@Akuli-ThinkPad:~/jou$ ./jou asd3.jou
compiler error in file "asd3.jou", line 5: cannot assign to a newly calculated value

Attempting this quick and dirty fix...

diff --git a/src/typecheck.c b/src/typecheck.c
index 127a013..be7c777 100644
--- a/src/typecheck.c
+++ b/src/typecheck.c
@@ -569,12 +569,12 @@ static void typecheck_statement(TypeContext *ctx, const AstStatement *stmt)
                         break;
                     case AST_EXPR_GET_FIELD:
                     case AST_EXPR_DEREF_AND_GET_FIELD:
+                    default:
                         snprintf(
                             errmsg, sizeof errmsg,
                             "cannot assign a value of type FROM into field '%s' of type TO",
                             targetexpr->data.field.fieldname);
                         break;
-                    default: assert(0);
                 }
                 const ExpressionTypes *targettypes = typecheck_expression(ctx, targetexpr);
                 typecheck_expression_with_implicit_cast(ctx, valueexpr, targettypes->type, errmsg);

...gives a new error:

akuli@Akuli-ThinkPad:~/jou$ ./jou asd3.jou
compiler error in file "asd3.jou", line 5: cannot assign to a newly calculated value

No compile error, if you forget return

cdecl puts(string: byte*) -> int

def main() -> int:
    puts("Hello World")
    # No return statement. Should not compile, but currently segfaults at runtime.

I wonder if I should ask LLVM whether the end of the function can be reached without a return statement, or should I check for that myself.
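
For the "check for that myself" option, a conservative check can be quite small. The Python sketch below uses a made-up tuple representation of statements (("return",), ("call",), ("if", then_body, else_body)); it is not the compiler's AST, and it ignores loops entirely, so it would reject a function that ends in an infinite loop.

```python
def always_returns(body):
    """Return True if executing the statement list can never fall off its end."""
    for statement in body:
        kind = statement[0]
        if kind == "return":
            return True
        if kind == "if":
            _, then_body, else_body = statement
            # Both branches must return; a missing else is an empty list.
            if always_returns(then_body) and always_returns(else_body):
                return True
    return False


hello_without_return = [("call",)]
hello_with_return = [("call",), ("return",)]
branching = [("if", [("return",)], [("return",)])]
half_branching = [("if", [("return",)], [])]

print(always_returns(hello_without_return))  # False: should be a compile error
print(always_returns(hello_with_return))     # True
print(always_returns(branching))             # True
print(always_returns(half_branching))        # False
```

A function with a non-void return type would then be an error whenever always_returns gives False for its body.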

Jou is slower than C

I created two test programs, fib40.c and fib40.jou.

#include <stdio.h>

int fib(int n) {
    if (n <= 2)
        return n;
    return fib(n-1) + fib(n-2);
}

int main()
{
    printf("fib(40) = %d\n", fib(40));
    return 0;
}

declare printf(format: byte*, ...) -> int

def fib(n: int) -> int:
    if n <= 1:
        return n
    return fib(n-1) + fib(n-2)

def main() -> int:
    printf("fib(40) = %d\n", fib(40))
    return 0

On this computer, the C program runs in about 0.30 seconds and the Jou program in about 0.47 seconds, both with -O3.

akuli@akuli-desktop:~/jou$ clang -O3 fib40.c && time ./a.out
fib(40) = 165580141

real	0m0,298s
user	0m0,298s
sys	0m0,000s
akuli@akuli-desktop:~/jou$ time ./jou -O3 fib40.jou
fib(40) = 102334155

real	0m0,472s
user	0m0,460s
sys	0m0,012s

However, this includes parsing and compiling the Jou code. To measure only the runtime, I placed clang's LLVM IR to fib40c.ll and Jou's LLVM IR to fib40j.ll.

$ clang -S -emit-llvm fib40.c -O3 -o fib40c.ll
$ ./jou --verbose -O3 fib40.jou
...copy "Optimized LLVM IR" part of output...
$ cat > fib40j.ll
...paste from clipboard...

Turns out that Jou's LLVM IR is slower by the same amount:

akuli@akuli-desktop:~/jou$ clang -O3 fib40c.ll && time ./a.out 
fib(40) = 165580141

real	0m0,301s
user	0m0,301s
sys	0m0,000s
akuli@akuli-desktop:~/jou$ clang -O3 fib40j.ll && time ./a.out 
warning: overriding the module target triple with x86_64-pc-linux-gnu [-Woverride-module]
1 warning generated.
fib(40) = 102334155

real	0m0,479s
user	0m0,478s
sys	0m0,000s

A few ideas about what might be causing this:

  • Clang's warning, which basically says that Jou's LLVM IR doesn't specify the kind of CPU I'm using. Maybe LLVM would optimize better if I told it to target my CPU specifically?
  • The LLVM IR generated from C contains a lot of things I don't recognize (such as dso_local and attributes #0).

Content of fib40c.ll (from C code):

; ModuleID = 'fib40.c'
source_filename = "fib40.c"
target datalayout = "e-m:e-p270:32:32-p271:32:32-p272:64:64-i64:64-f80:128-n8:16:32:64-S128"
target triple = "x86_64-pc-linux-gnu"

@.str = private unnamed_addr constant [14 x i8] c"fib(40) = %d\0A\00", align 1

; Function Attrs: nounwind readnone uwtable
define dso_local i32 @fib(i32 %0) local_unnamed_addr #0 {
  %2 = icmp slt i32 %0, 3
  br i1 %2, label %11, label %3

3:                                                ; preds = %1, %3
  %4 = phi i32 [ %8, %3 ], [ %0, %1 ]
  %5 = phi i32 [ %9, %3 ], [ 0, %1 ]
  %6 = add nsw i32 %4, -1
  %7 = tail call i32 @fib(i32 %6)
  %8 = add nsw i32 %4, -2
  %9 = add nsw i32 %7, %5
  %10 = icmp slt i32 %4, 5
  br i1 %10, label %11, label %3

11:                                               ; preds = %3, %1
  %12 = phi i32 [ 0, %1 ], [ %9, %3 ]
  %13 = phi i32 [ %0, %1 ], [ %8, %3 ]
  %14 = add nsw i32 %13, %12
  ret i32 %14
}

; Function Attrs: nofree nounwind uwtable
define dso_local i32 @main() local_unnamed_addr #1 {
  %1 = tail call i32 @fib(i32 40)
  %2 = tail call i32 (i8*, ...) @printf(i8* nonnull dereferenceable(1) getelementptr inbounds ([14 x i8], [14 x i8]* @.str, i64 0, i64 0), i32 %1)
  ret i32 0
}

; Function Attrs: nofree nounwind
declare dso_local i32 @printf(i8* nocapture readonly, ...) local_unnamed_addr #2

attributes #0 = { nounwind readnone uwtable "correctly-rounded-divide-sqrt-fp-math"="false" "disable-tail-calls"="false" "frame-pointer"="none" "less-precise-fpmad"="false" "min-legal-vector-width"="0" "no-infs-fp-math"="false" "no-jump-tables"="false" "no-nans-fp-math"="false" "no-signed-zeros-fp-math"="false" "no-trapping-math"="true" "stack-protector-buffer-size"="8" "target-cpu"="x86-64" "target-features"="+cx8,+fxsr,+mmx,+sse,+sse2,+x87" "unsafe-fp-math"="false" "use-soft-float"="false" }
attributes #1 = { nofree nounwind uwtable "correctly-rounded-divide-sqrt-fp-math"="false" "disable-tail-calls"="false" "frame-pointer"="none" "less-precise-fpmad"="false" "min-legal-vector-width"="0" "no-infs-fp-math"="false" "no-jump-tables"="false" "no-nans-fp-math"="false" "no-signed-zeros-fp-math"="false" "no-trapping-math"="true" "stack-protector-buffer-size"="8" "target-cpu"="x86-64" "target-features"="+cx8,+fxsr,+mmx,+sse,+sse2,+x87" "unsafe-fp-math"="false" "use-soft-float"="false" }
attributes #2 = { nofree nounwind "correctly-rounded-divide-sqrt-fp-math"="false" "disable-tail-calls"="false" "frame-pointer"="none" "less-precise-fpmad"="false" "no-infs-fp-math"="false" "no-nans-fp-math"="false" "no-signed-zeros-fp-math"="false" "no-trapping-math"="true" "stack-protector-buffer-size"="8" "target-cpu"="x86-64" "target-features"="+cx8,+fxsr,+mmx,+sse,+sse2,+x87" "unsafe-fp-math"="false" "use-soft-float"="false" }

!llvm.module.flags = !{!0}
!llvm.ident = !{!1}

!0 = !{i32 1, !"wchar_size", i32 4}
!1 = !{!"Debian clang version 11.0.1-2"}

Content of fib40j.ll (from Jou code):

source_filename = "fib40.jou"

@string_literal = private global [14 x i8] c"fib(40) = %d\0A\00"

; Function Attrs: nofree nounwind
declare i32 @printf(i8* nocapture readonly, ...) local_unnamed_addr #0

; Function Attrs: nounwind readnone
define i32 @fib(i32 %0) local_unnamed_addr #1 {
block0:
  %int_lt7 = icmp slt i32 %0, 2
  br i1 %int_lt7, label %block1, label %block4

block1:                                           ; preds = %block4, %block0
  %accumulator.tr.lcssa = phi i32 [ 0, %block0 ], [ %int_add, %block4 ]
  %.tr.lcssa = phi i32 [ %0, %block0 ], [ %int_sub4, %block4 ]
  %accumulator.ret.tr = add i32 %.tr.lcssa, %accumulator.tr.lcssa
  ret i32 %accumulator.ret.tr

block4:                                           ; preds = %block0, %block4
  %.tr9 = phi i32 [ %int_sub4, %block4 ], [ %0, %block0 ]
  %accumulator.tr8 = phi i32 [ %int_add, %block4 ], [ 0, %block0 ]
  %int_sub = add nsw i32 %.tr9, -1
  %fib_return_value = tail call i32 @fib(i32 %int_sub)
  %int_sub4 = add nsw i32 %.tr9, -2
  %int_add = add i32 %fib_return_value, %accumulator.tr8
  %int_lt = icmp slt i32 %.tr9, 4
  br i1 %int_lt, label %block1, label %block4
}

; Function Attrs: nofree nounwind
define i32 @main() local_unnamed_addr #0 {
block0:
  %fib_return_value = tail call i32 @fib(i32 40)
  %printf_return_value = tail call i32 (i8*, ...) @printf(i8* nonnull dereferenceable(1) getelementptr inbounds ([14 x i8], [14 x i8]* @string_literal, i64 0, i64 0), i32 %fib_return_value)
  ret i32 0
}

attributes #0 = { nofree nounwind }
attributes #1 = { nounwind readnone }
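
One more thing that may matter here: the two programs print different results (165580141 from C, 102334155 from Jou). The C version uses n <= 2 as its base case and the Jou version uses n <= 1, which also shows up in the IR as a comparison against 3 in fib40c.ll and against 2 in fib40j.ll. A smaller base case means more recursive calls. The Python sketch below counts them for both variants; the helper names fib and count_calls and the base parameter are made up for this illustration, and it uses n = 20 to keep the run short.

```python
def fib(n: int, base: int) -> int:
    """Same recursion as in the test programs, with a configurable base case."""
    if n <= base:
        return n
    return fib(n - 1, base) + fib(n - 2, base)


def count_calls(n: int, base: int) -> int:
    """Number of calls that fib(n, base) makes, including the outermost one."""
    if n <= base:
        return 1
    return 1 + count_calls(n - 1, base) + count_calls(n - 2, base)


jou_calls = count_calls(20, base=1)  # base case of fib40.jou
c_calls = count_calls(20, base=2)    # base case of fib40.c
print(fib(20, base=1), fib(20, base=2))  # 6765 10946
print(jou_calls, c_calls)                # 21891 13529
print(jou_calls / c_calls)               # roughly 1.62
```

The ratio of call counts stays close to 1.6 as n grows, which is in the same range as the measured 0.47 s versus 0.30 s. So the differing base case might explain most of the gap, although the timings were not re-measured with matching programs.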

No unreachable warning for some infinite loops

I'm preparing a pull request to show warnings for unreachable code, but it doesn't work in at least these two cases I tried:

def after_infinite_loop_with_variable() -> void:
    flag = True
    while flag:
        puts("hi")
    puts("yooooo wat")  # Warning: this code will never run

def after_infinite_loop_with_variable_set_after_loop() -> void:
    flag = True
    while flag:
        puts("hi")
    flag = False  # Warning: this code will never run

Spreading imports on multiple lines

For example, this should work:

from "stdlib/io.jou" import (
    printf,
    puts,
    scanf,
    sscanf,
    asdasd,
)

Because asdasd does not exist in io.jou, I should get an error. The line number of the error should point at asdasd.

Evaluating variable expressions

declare printf(fmt: byte*, ...) -> int

def main() -> int:
    n = 0
    printf("%d %d\n", n, ++n)
    return 0

Expected output: 0 1

Actual output: 1 1

Function arguments are evaluated left to right, but the problem is that "evaluating" the n argument doesn't actually do anything. The result of that "evaluation" is just the n variable, which gets reassigned to 1 before it is used.
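
The difference between the two behaviours can be modelled in Python. This is only a model of the explanation above, not the compiler's code: the dictionary env stands in for local variables, and the function names are invented for this sketch. In the first function, "evaluating" n produces something that is read later; in the second, the value is copied at the moment the argument is evaluated.

```python
def call_with_references(env):
    # Models the bug: the first argument is "the variable n", read only at call time.
    args = [lambda: env["n"]]
    env["n"] += 1                    # evaluating ++n
    args.append(lambda: env["n"])
    return [read() for read in args]


def call_with_snapshots(env):
    # Models the expected behaviour: each argument's value is copied right away.
    args = [env["n"]]
    env["n"] += 1                    # evaluating ++n
    args.append(env["n"])
    return args


print(call_with_references({"n": 0}))  # [1, 1]
print(call_with_snapshots({"n": 0}))   # [0, 1]
```

In compiler terms, the snapshot version corresponds to loading the variable into a temporary as soon as the argument expression is evaluated.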

Initializing struct fields

This prints 0, but the value is never explicitly set to zero. It's just whatever LLVM happens to give from its alloca instruction.

declare printf(fmt: byte*, ...) -> int

struct Foo:
    n: int

def main() -> int:
    foo = Foo{}
    printf("%d\n", foo.n)
    return 0

We should zero-initialize all fields that aren't set explicitly (probably with a memset).

A way to mark symbols as private

Currently all functions and structs are made available for other files to import. There should be a way to make functions and structs be visible only within the current file.

Maybe static?

static def internal_util_function() -> int:
    return 123

static struct PrivateStruct:
    x: int
    y: int

def public_function() -> int:
    return internal_util_function() * 2

struct PublicStruct:
    message: byte*
    line: int

old comment

jou/src/jou_compiler.h

Lines 41 to 48 in ba1938f

/*
After parsing, the AST contains types only where the types were specified
by the user, e.g. in function declarations. A separate "typing pass" fills
in the types of all expressions.
This is a bit weird, but I don't want to duplicate the AST into "typed AST"
and "untyped AST". I have done that previously in other projects.
*/

fill_types is gone: #9

Maybe should also grep for the word "fill", as in "fill types".

I think returning a string doesn't work.

jou/src/codegen.c

Lines 85 to 93 in ea8dcbd

static LLVMValueRef make_a_string_constant(const struct State *st, const char *s)
{
    // https://stackoverflow.com/a/37906139
    LLVMValueRef array = LLVMConstString(s, strlen(s), false);
    LLVMValueRef stack_space_ptr = LLVMBuildAlloca(st->builder, LLVMTypeOf(array), "string_data");
    LLVMBuildStore(st->builder, array, stack_space_ptr);
    LLVMTypeRef string_type = LLVMPointerType(LLVMInt8Type(), 0);
    return LLVMBuildBitCast(st->builder, stack_space_ptr, string_type, "string_ptr");
}

Presumably string literals should live in static memory, not on the stack, but I couldn't figure out how to do it when I wrote this code.

import relies on current working directory

./jou examples/hello.jou works, but after cd examples, I cannot do ../jou hello.jou:

akuli@akuli-desktop:~/jou/examples$ ../jou hello.jou 
compiler error in file "hello.jou", line 1: cannot import from "stdlib/io.jou": No such file or directory

I don't like the cdecl keyword

A declare keyword would be much more obvious. The same keyword could then be used for external C functions and forward-declaring Jou functions.

casting to signed int doesn't seem to work

declare printf(fmt: byte*, ...) -> int

def main() -> int:
    # Set x_byte to a byte value 128
    x_int = 128
    x_int_ptr: void* = &x_int
    x_byte_ptr: byte* = x_int_ptr
    x_byte: byte = *x_byte_ptr

    y: int = x_byte
    printf("%d %d\n", x_byte, y)
    return 0

Expected result: 128 128

Actual result: 128 -128
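
The actual result looks like what a sign extension of the byte would produce, while the expected result needs a zero extension. That is a guess about the cause, not something confirmed from the compiler source. The two conversions can be modelled in Python like this, with function names chosen for the example:

```python
def zero_extend_8(value: int) -> int:
    """Treat the low 8 bits as an unsigned byte (0..255)."""
    return value & 0xFF


def sign_extend_8(value: int) -> int:
    """Treat the low 8 bits as a signed byte (-128..127)."""
    value &= 0xFF
    return value - 256 if value >= 128 else value


print(zero_extend_8(128))  # 128, the expected result
print(sign_extend_8(128))  # -128, the actual result
print(zero_extend_8(127), sign_extend_8(127))  # 127 127
```

The two agree for values up to 127, which would explain why the problem only shows up with 128 and above.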

Unused variable warning

There should be a warning if:

  • variable set but not used
  • variable never set

Example:

declare puts(message: byte*) -> int

def set_but_not_used() -> void:
    message = "hi"
    puts("hi")

# Uses hypothetical `x: int` declaration syntax that doesn't exist yet
def never_set() -> void:
    x: int
    puts("hi")

annoyingly many unreachability warnings

code

cdecl printf(format: byte*, ...) -> int

def main() -> int:
    return 0

    i = 0
    while True:
        printf("%d->", i)
        if i == 3:
            printf("end\n")
            break
            printf("yooooooo\n")
        i = i+1
        printf("%d ", i)

expected result:

compiler warning for file "../asd.jou", line 6: this code will never run

actual result:

compiler warning for file "../asd.jou", line 12: this code will never run
compiler warning for file "../asd.jou", line 13: this code will never run
compiler warning for file "../asd.jou", line 10: this code will never run
compiler warning for file "../asd.jou", line 8: this code will never run
compiler warning for file "../asd.jou", line 7: this code will never run
compiler warning for file "../asd.jou", line 6: this code will never run
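
One way to get a single warning is to report only the first statement after a return, break or continue and then stop looking at the rest of that block, so dead code nested inside dead code is never visited. The Python sketch below shows the idea on a made-up representation of the program above, where each statement is (line number, kind, child statements). It does not reflect how the compiler finds unreachable code, and it assumes the check runs on the syntax tree.

```python
def unreachable_warnings(body):
    """Return the line numbers that should get a 'this code will never run' warning."""
    warnings = []
    for index, (lineno, kind, children) in enumerate(body):
        warnings.extend(unreachable_warnings(children))
        if kind in ("return", "break", "continue") and index + 1 < len(body):
            warnings.append(body[index + 1][0])
            break  # the rest of this block is dead, one warning is enough
    return warnings


loop = (7, "while", [
    (8, "call", []),
    (9, "if", [(10, "call", []), (11, "break", []), (12, "call", [])]),
    (13, "assign", []),
    (14, "call", []),
])
main_body = [(4, "return", []), (6, "assign", []), loop]

print(unreachable_warnings(main_body))  # [6]
print(unreachable_warnings([loop]))     # [12]
```

Without the early return on line 4, the loop would be analyzed normally and only line 12 would be reported.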

indexing signed-ness

// TODO: does indexing with all types work? signed 8bit?

I believe it doesn't work, because signed and unsigned types are indistinguishable in LLVM and we don't tell LLVM anywhere whether to interpret the index as signed or unsigned.

fuzzer.sh false positives

The fuzzer works by shuffling lines of code from test files in a few ways. It then runs the resulting code, and expects to get a compiler error from running it.

Sometimes the resulting code happens to be a valid program, and it runs without errors. This causes the fuzzer to think that something is wrong.

Same function defined in two different files

# foo1.jou
def foo() -> int:
    return 1

# foo2.jou
def foo() -> int:
    return 2

# foo2_wrapper.jou
from "./foo2.jou" import foo
def wrapped_foo() -> int:
    return foo()

# main.jou
from "stdlib/io.jou" import printf
from "./foo1.jou" import foo
from "./foo2_wrapper.jou" import wrapped_foo
def main() -> int:
    printf("%d %d\n", foo(), wrapped_foo())
    return 0

IMO this should be a linker error. Each file by itself should compile fine, because the name foo is not used twice within the same file, but the linker still has to deal with two functions named foo.

I haven't tried what this does currently.

"=" ambiguity

Struct instantiation syntax is currently Point{x = 1, y = 2}. The problem here is that x = 1 by itself is an expression with a different meaning: it sets the variable x. So Point{(x = 1), (y = 2)} is not invalid syntax, but also not the same as Point{x = 1, y = 2}. This feels wrong and misleading.

I'm planning to also support Point{1, 2}. This makes the problem even worse: if a user knows about the Point{1, 2} syntax, and also knows that x = 1 is an expression, they will misunderstand what Point{x = 1, y = 2} does.

I think the correct solution is to make x = 1 be a statement rather than an expression. It also has the advantage of being consistent with Python and less confusing: I am trying to make a simple language, not a C++-like language that makes you do fancy tricks and then get angry when other people don't understand your code.

Another (worse) option I considered would be Point{x: 1, y: 2}, which is consistent with how JavaScript does it, but also looks too much like a dict/object/mapping. Structs and dicts are conceptually different: dict keys are typically some kind of user input, but struct field names and types are fixed and known at compile time.

One more option would be Point{.x = 1, .y = 2}. It looks a bit weird at first but makes sense after getting used to it, and also works nicely with the { [FOO]=1, [BAR]=2 } array initializers.

confusing & error with temporary struct object

I will soon create a PR that adds structs. It will include a funny corner case...

struct Foo:
    x: int

def foo() -> void:
    lol = Foo{x=1}.x

Expected result: no error or an understandable error.

Actual result: compiler error in file "asd.jou", line 5: the address-of operator '&' cannot be used with a newly calculated value

The error message is confusing because it talks about & operator which doesn't even appear in the code.

This happens because to evaluate foo.bar, we need to compute the offset of bar in the struct and add it to the address of foo. So we need to know the address of foo. In this case it means finding the address (&) of a temporary value which is not allowed.
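
The "address of the struct plus offset of the field" computation can be tried out with Python's ctypes module. This only illustrates the address arithmetic and is unrelated to how the compiler generates code; the second field y is added here so that there is a field with a nonzero offset.

```python
import ctypes


class Foo(ctypes.Structure):
    _fields_ = [("x", ctypes.c_int), ("y", ctypes.c_int)]


foo = Foo(x=1, y=2)

# Address of foo.y = address of foo + offset of y within the struct.
field_address = ctypes.addressof(foo) + Foo.y.offset
value = ctypes.c_int.from_address(field_address).value

print(Foo.x.offset)  # 0
print(value)         # 2
```

This works because foo is stored in memory and therefore has an address. A temporary such as Foo{x=1} would first have to be stored somewhere before the same computation can be applied to it.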

no error if variable already exists

def main() -> int:
    x: int
    x: int
    return 0

Should error on line 3 but does not. Another example that doesn't error either, even though it really should:

def main() -> int:
    x: int
    x: byte
    return 0

importing a struct

# a.jou
struct Foo:
    x: int
    y: int

# b.jou
from "./a.jou" import Foo
from "stdlib/io.jou" import printf
def main() -> int:
    f = Foo{x=1, y=2}
    printf("%d %d\n", f.x, f.y)  # Output: 1 2
    return 0

$ ./jou b.jou

Expected result: 1 2 printed

Actual result:

compiler error in file "b.jou", line 1: file "././a.jou" does not contain a function named 'Foo'
