bolt's Introduction

Bolt

Bolt will be a programming language and compiler, intended to be a replacement for C in my own personal projects. For this reason it will largely be a general-purpose language, capable of tasks ranging from kernel development to app development.

// A simple Hello, World program in Bolt.
import std

func<Int> main() {
	print("Hello, World")
	return 0
}

The current state of Bolt

Bolt is still very much in its infancy. The language is still being fleshed out and designed. The ideas are changing and evolving almost daily. It is by no means ready for actual use.

I've made it publicly viewable so that people can either get involved if they want to, or watch the evolution of a new language and compiler. Will it ever match the popularity or usage of something like Rust, Swift or Python? Most likely not, but that's not why I'm developing it.

v0.0.2 Milestone

The current v0.0.1 of the compiler is capable of building a basic Hello, World program. This represents the first major milestone of the project. With that reached, my sights are now on the second major milestone. This includes improvements to the architecture of the compiler, the ability to use variadic arguments, better type coverage, compiler directives, arithmetic, bracketed expressions and order of operations.

High level concepts and ideas for Bolt

Important: Details listed below may change quickly and frequently, and are meant to represent a general idea of the concepts currently employed by the project.

Bolt is primarily being designed to be readable and easily written. This is easier said than done, and will likely mean that the syntax of the language remains changeable for some time.

One of the concepts currently being played with is how types are associated with declarations. Take the following declaration in C.

// Define foo to be an integer with value 50.
int foo = 50;

This is quite a simple definition, and the type leads the definition. With Bolt we'd write the same thing like this:

var<Int> foo = 50

This is a convention used throughout many declarations in Bolt: the root keyword of the declaration (var, let, func, etc.) is followed by the resulting type in angle brackets < >. The goal is that it should be easy to tell what type something ultimately is.

License

Bolt is provided under the MIT License, and is thus free to make use of in any way you see fit. All that I ask is that you provide the following license with any other versions of Bolt.

This license applies to the Compiler and the Standard Library of Bolt, but not the language itself... though I would ask for some form of attribution in any implementation of the language.

Copyright (c) 2019 Tom Hancocks

Permission is hereby granted, free of charge, to any person obtaining a copy
of this software and associated documentation files (the "Software"), to deal
in the Software without restriction, including without limitation the rights
to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
copies of the Software, and to permit persons to whom the Software is
furnished to do so, subject to the following conditions:

The above copyright notice and this permission notice shall be included in all
copies or substantial portions of the Software.

THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
SOFTWARE.

bolt's People

Contributors

tjhancocks

bolt's Issues

Add support for conditions

The Feature

There is currently no branching in Bolt, which means programs will execute in a linear fixed manner. There needs to be support for conditions (if-elseif-else) so that programs can execute appropriate functionality based on runtime state.

What needs to be done?

There will be a number of aspects that need to be added for this, including parsers, sema, code generation, etc.

The basic idea for conditions will be as follows:

if condition {
    // code here
}
elif condition {
    // code here
}
else {
    // code here
}

This will require further operators/operations to be implemented on top of those added in #38, in order to allow conditions to be formed (these include ==, !=, <, >, >=, <=, &&, &, ||, |, !).

Increase type coverage in language

The Feature

Most programming languages and indeed programs require more than just Int, String, Int8 and None as types.

What needs to be done?

The following scalar types need to be added to the language, as well as a mechanism in the compiler that allows specifying the target architecture so that IntPointer, UIntPointer, Int, and UInt all adopt the correct bit width.

Type Checklist

  • Int16
  • Int32
  • Int64
  • IntPointer
  • UInt8
  • UInt16
  • UInt32
  • UInt64
  • UInt
  • UIntPointer
  • Bool
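
As a rough illustration of the target-architecture mechanism described above, the sketch below maps type names to bit widths. It is written in Python purely for illustration and is not the compiler's code. The names FIXED_WIDTHS, TARGET_SIZED and bit_width are hypothetical, and two things are assumptions rather than decisions recorded in this issue: that Int and UInt follow the pointer width of the target, and that Bool occupies a single bit.

```python
# Hypothetical table of fixed-width scalar types; names follow the checklist above.
# Assumption: Bool is modelled as a single bit.
FIXED_WIDTHS = {
    "Int8": 8, "Int16": 16, "Int32": 32, "Int64": 64,
    "UInt8": 8, "UInt16": 16, "UInt32": 32, "UInt64": 64,
    "Bool": 1,
}

# Types whose width follows the target architecture.
# Assumption: Int and UInt track the pointer width, like IntPointer and UIntPointer.
TARGET_SIZED = {"Int", "UInt", "IntPointer", "UIntPointer"}


def bit_width(type_name: str, pointer_bits: int) -> int:
    """Return the width in bits of a scalar type for a target with the given pointer size."""
    if type_name in FIXED_WIDTHS:
        return FIXED_WIDTHS[type_name]
    if type_name in TARGET_SIZED:
        return pointer_bits
    raise ValueError(f"unknown scalar type: {type_name}")
```

With this shape, choosing a 32-bit or 64-bit target only changes the pointer_bits argument, and the fixed-width types are unaffected.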

Produce an initial syntax definition for Sublime Text

The hope with the project is to provide integrations with various tools right alongside the compiler, and to make them as well supported by the toolchain as possible.

The Sublime Text syntax definition should stay as closely in sync with the current language grammar as possible and provide at least the following features:

  • Syntax highlighting
  • Build/Run actions for the compiler
  • Auto-completions for definitions, types and standard library elements.

Refactor the Abstract Syntax Tree in the Compiler

The Issue

v0.0.1 was about reaching Hello, World at all costs. v0.0.2 is about making a foundation to build upon in future. The AST built in v0.0.1 is very OO, and not in a well thought out way. This should be rectified, and the whole thing cleaned up or rebuilt.

What needs to be done?

I see 2 main ways that this could be solved.

  1. Clean up and massively reorganise the AST. Try to make use of generics and protocols to eliminate code duplication without relying on too much inheritance.

  2. Use an Enum, with each case representing a different type of expression in the AST.

The problem has arisen due to my desire to create a traversal algorithm whilst trying to minimise the number of nodes present. This means the number of distinct/unique decisions that need to be made on each node increases, thus massively complicating a traversal algorithm. Whilst the traversal itself is simple, trying to act on some nodes and not others and mutate the tree at the same time has proved to be a bad idea and led to some questionable decisions.

There may be a requirement to add a Concrete Syntax Tree to the compiler, which is generated prior to the AST's creation.
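
To make the second option a little more concrete, here is a minimal sketch of an enum-style tree, where each case is a small data type and a traversal dispatches on the case without mutating anything. It is written in Python for illustration only. The node names (IntLiteral, StringLiteral, Call, Return) and the count_nodes function are hypothetical and do not describe the compiler's actual AST.

```python
from dataclasses import dataclass
from typing import Union


@dataclass
class IntLiteral:
    value: int


@dataclass
class StringLiteral:
    value: str


@dataclass
class Call:
    callee: str
    arguments: list


@dataclass
class Return:
    value: "Expr"


# The "enum": an expression is exactly one of these cases.
Expr = Union[IntLiteral, StringLiteral, Call, Return]


def count_nodes(node) -> int:
    """Walk the tree without mutating it and count every node."""
    if isinstance(node, (IntLiteral, StringLiteral)):
        return 1
    if isinstance(node, Call):
        return 1 + sum(count_nodes(argument) for argument in node.arguments)
    if isinstance(node, Return):
        return 1 + count_nodes(node.value)
    raise TypeError(f"unhandled node: {node!r}")
```

Because every case is handled in one place, a traversal that needs to act on only some nodes can do so with an explicit branch per case, and a pass that needs to change the tree can return a new tree rather than mutating the one it is walking.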

Create a foundation for Token Parsing

The compiler needs to parse a stream of tokens and identify syntactic structures within it. At its most basic it needs to be able to do the following:

  • Identify and expand import directives
  • Identify function declarations
  • Identify function definitions
  • Identify code blocks
  • Identify type information
  • Identify symbol information
  • Construct an Abstract Syntax Tree
  • Construct a Symbol Table

The bare minimum should be built for this to allow for the target Hello World executable and basic standard library imports required in this version to be met.
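
A very small sketch of the kind of parsing described above, written in Python for illustration only. It recognises a single parameterless function definition of the form used in the Hello, World example, records it in a symbol table, and returns a tree node. The function name parse_function, the dictionary-based node and the plain list of token strings are all assumptions made for this sketch. The body is kept as raw tokens and nested braces are not handled.

```python
def parse_function(tokens: list, symbols: dict) -> dict:
    """Parse `func<Type> name() { ... }` from a list of token strings."""
    position = 0

    def expect(text: str) -> None:
        # Consume one token, or fail with a message naming what was found instead.
        nonlocal position
        found = tokens[position] if position < len(tokens) else "end of input"
        if found != text:
            raise SyntaxError(f"expected {text!r}, found {found!r}")
        position += 1

    def take() -> str:
        # Consume and return the next token, whatever it is.
        nonlocal position
        if position >= len(tokens):
            raise SyntaxError("unexpected end of input")
        token = tokens[position]
        position += 1
        return token

    expect("func")
    expect("<")
    return_type = take()
    expect(">")
    name = take()
    expect("(")
    expect(")")
    expect("{")
    body = []
    while position < len(tokens) and tokens[position] != "}":
        body.append(take())
    expect("}")

    symbols[name] = return_type  # symbol table: function name -> result type
    return {"kind": "function", "name": name, "type": return_type, "body": body}
```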

Link object files into executable.

Object files that were produced by the compiler should be linked together into a single executable. This executable should be able to run on the current host architecture.

Code generation using LLVM and the Abstract Syntax Tree

Once semantic analysis has been completed, the compiler needs to begin the process of code generation (we're ignoring potential optimisations at the moment). For this it needs to produce the LLVM IR code necessary to represent the program being compiled.

Create a foundation for Lexical Analysis.

The lexical analysis functionality of the compiler needs to be implemented to the point of being able to identify the following types of token:

  • Comment (Discard)
  • String
  • Integer
  • Float/Double
  • Keyword
  • Identifier
  • Operators
  • Symbols

The lexer should not attempt to resolve type information or symbol names, as this will be done by the parser later in the Compiler.
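
The token categories above could be sketched with a table-driven lexer like the one below. This is Python for illustration only, not the compiler's lexer. The keyword set, the split between OPERATOR and SYMBOL characters, and the (kind, text) tuple representation are assumptions. Comments and whitespace are discarded, and identifiers are only reclassified as keywords; no type or symbol resolution happens here.

```python
import re

KEYWORDS = {"import", "func", "var", "let", "return"}  # assumed keyword set

# Order matters: COMMENT is tried before OPERATOR so `//` is not read as two divisions,
# and FLOAT is tried before INTEGER so `1.5` is not split at the dot.
TOKEN_SPEC = [
    ("COMMENT", r"//[^\n]*"),
    ("FLOAT", r"\d+\.\d+"),
    ("INTEGER", r"\d+"),
    ("STRING", r'"[^"\n]*"'),
    ("IDENTIFIER", r"[A-Za-z_][A-Za-z0-9_]*"),
    ("OPERATOR", r"[=+\-*/%]"),
    ("SYMBOL", r"[<>(){},]"),
    ("WHITESPACE", r"\s+"),
]
TOKEN_RE = re.compile("|".join(f"(?P<{name}>{pattern})" for name, pattern in TOKEN_SPEC))


def lex(source: str) -> list:
    """Split source text into (kind, text) tokens, discarding comments and whitespace."""
    tokens = []
    position = 0
    while position < len(source):
        match = TOKEN_RE.match(source, position)
        if match is None:
            raise SyntaxError(f"unexpected character {source[position]!r} at offset {position}")
        kind, text = match.lastgroup, match.group()
        position = match.end()
        if kind in ("COMMENT", "WHITESPACE"):
            continue
        if kind == "IDENTIFIER" and text in KEYWORDS:
            kind = "KEYWORD"
        tokens.append((kind, text))
    return tokens
```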

Add support for basic arithmetic operations

The Feature

Currently the language has no means of performing basic arithmetic operations. This makes it extremely limited. Support for the basic arithmetic operations needs to be added to the language.

What needs to be done?

In order to implement this, the concepts of prefix, postfix and infix operations need to be added to the language, as either unary or binary operations. In addition, support for precedence levels between operations needs to be added, along with the appropriate semantic analysis, optimisation and code generation.

Checklist

  • Addition
  • Subtraction
  • Multiplication
  • Division
  • Modulo
  • Increment
  • Decrement
  • Operation precedence levels
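
One common way to handle infix operators with precedence levels is precedence climbing. The sketch below shows that technique in Python for illustration only. It is not taken from the compiler, and it evaluates integers directly instead of building tree nodes. The precedence numbers and function names are assumptions, increment and decrement are not covered, and division uses Python's floor division, which may differ from what Bolt ends up specifying for negative operands.

```python
import re

# Assumed precedence levels: a higher number binds more tightly.
PRECEDENCE = {"+": 1, "-": 1, "*": 2, "/": 2, "%": 2}

OPERATIONS = {
    "+": lambda a, b: a + b,
    "-": lambda a, b: a - b,
    "*": lambda a, b: a * b,
    "/": lambda a, b: a // b,  # floor division, chosen for this sketch only
    "%": lambda a, b: a % b,
}


def tokenize(text: str) -> list:
    """Split an arithmetic expression into integer, operator and bracket tokens."""
    return re.findall(r"\d+|[-+*/%()]", text)


def parse_primary(tokens: list, index: int):
    """Parse an integer, a bracketed expression, or a prefix minus."""
    if index >= len(tokens):
        raise SyntaxError("unexpected end of expression")
    token = tokens[index]
    if token == "(":
        value, index = parse_expression(tokens, index + 1, 1)
        if index >= len(tokens) or tokens[index] != ")":
            raise SyntaxError("expected ')'")
        return value, index + 1
    if token == "-":  # prefix (unary) minus
        value, index = parse_primary(tokens, index + 1)
        return -value, index
    return int(token), index + 1


def parse_expression(tokens: list, index: int = 0, min_precedence: int = 1):
    """Parse binary operators whose precedence is at least min_precedence."""
    left, index = parse_primary(tokens, index)
    while index < len(tokens) and PRECEDENCE.get(tokens[index], 0) >= min_precedence:
        operator = tokens[index]
        # Requiring a strictly higher precedence on the right makes operators left-associative.
        right, index = parse_expression(tokens, index + 1, PRECEDENCE[operator] + 1)
        left = OPERATIONS[operator](left, right)
    return left, index


def evaluate(text: str) -> int:
    value, _ = parse_expression(tokenize(text))
    return value
```

For example, evaluate("1 + 2 * 3") multiplies before adding, while evaluate("(1 + 2) * 3") lets the brackets override the precedence levels.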

End of file comment doesn't lex if it is not terminated with a newline

If a source code file is terminated by a comment, and that comment is not terminated by a newline, then the lexer will run outside the bounds of the file and throw an error.

let<Int> foo = 24 // No newline at the end of this comment

There will likely need to be an improvement to character consumption in the lexer to ensure this doesn't occur in other scenarios either.
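
The general shape of a fix is to test for the end of the input before inspecting a character. A minimal sketch in Python, for illustration only; the function name skip_line_comment is hypothetical and this is not the compiler's lexer code.

```python
def skip_line_comment(source: str, position: int) -> int:
    """Return the offset just past a `//` comment that starts at `position`.

    The loop checks the end of the input before looking at a character, so a
    comment on the final line with no trailing newline stops at len(source).
    """
    if not source.startswith("//", position):
        raise ValueError("position does not point at a line comment")
    position += 2
    while position < len(source) and source[position] != "\n":
        position += 1
    return position
```

Applying the same "bounds first, character second" ordering to every consumption loop in the lexer would guard the other scenarios as well.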

Ensure the necessary tooling is in place to build and install the project.

Although the compiler itself can be built using swift run or swift build, it would be nice if the project had make rules and triggers to kick off installing various integrations, builds and what have you.

Everything should be as easy to use as humanly possible and not require new users to torture themselves for no good reason.

Add directive syntax for providing compiler instructions

The Feature

libc.bolt requires compiled programs to link against the C standard library. This is currently hard coded into the linking phase of the compiler. Indeed, not all programs will require it, and some may even need it to not happen. There should be a way for libc.bolt to instruct the compiler to include it. This can then also expand out for future stdlib files to include additional libraries.

What needs to be done?

The current syntax idea for providing a compiler directive is this:

@pragma(<domain>, <flag>)
@pragma(linker, -lc)

Support for this will need to be added into the language, and an appropriate definition added into the Sublime Text syntax definition.
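
As a sketch of how the proposed directive form might be recognised and turned into linker flags, the Python below scans source lines for @pragma(<domain>, <flag>). It is for illustration only. The regular expression, the function names and the line-by-line scan are assumptions; the real implementation would presumably live in the lexer and parser.

```python
import re

# Matches the proposed form `@pragma(<domain>, <flag>)` on a line of its own.
PRAGMA_RE = re.compile(r"^@pragma\(\s*(?P<domain>\w+)\s*,\s*(?P<flag>[^)]+?)\s*\)$")


def parse_pragma(line: str):
    """Return (domain, flag) for a pragma line, or None for any other line."""
    match = PRAGMA_RE.match(line.strip())
    if match is None:
        return None
    return match.group("domain"), match.group("flag")


def linker_flags(source: str) -> list:
    """Collect the flags of every `linker` pragma in a source file."""
    flags = []
    for line in source.splitlines():
        pragma = parse_pragma(line)
        if pragma is not None and pragma[0] == "linker":
            flags.append(pragma[1])
    return flags
```

With something like this in place, the linking phase could take its flags from the pragmas found in imported files instead of hard coding -lc.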

Create a Semantic Analyser

The AST generated by the Parser will potentially have errors (due to syntactic errors in the source code) and thus needs validation to ensure correctness. This is the role of the Semantic Analyser.

It needs to be able to ensure matching types between components of an expression, correct arity of function calls, and so on.

The initial implementation will likely be simple, serving to get to the goal of a Hello World executable.

  • Highlight type errors in arguments of function calls. Do the arguments match the type of the corresponding parameter?
  • Highlight arity issues in function calls. Trying to pass 0, 2 or 3 arguments to a single argument function should throw an error.
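
The two checks above can be sketched as a single function that compares a call against a function signature. This is Python for illustration only. FunctionSignature and check_call are hypothetical names, types are compared as plain strings, and the print signature used in the usage note is an assumption.

```python
from dataclasses import dataclass


@dataclass
class FunctionSignature:
    name: str
    parameter_types: list


def check_call(signature: FunctionSignature, argument_types: list) -> list:
    """Return a list of error messages for one call; an empty list means the call is valid."""
    errors = []
    expected = len(signature.parameter_types)
    if len(argument_types) != expected:
        # Arity error: report it and stop, since pairing up types would be meaningless.
        errors.append(f"{signature.name} expects {expected} argument(s), got {len(argument_types)}")
        return errors
    for index, (parameter, argument) in enumerate(zip(signature.parameter_types, argument_types)):
        if parameter != argument:
            errors.append(f"{signature.name} argument {index + 1}: expected {parameter}, got {argument}")
    return errors
```

For a signature such as FunctionSignature("print", ["String"]), a call with one String argument produces no errors, a call with no arguments produces an arity error, and a call with an Int argument produces a type error.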

Add variadic argument support to function declarations.

The Feature

The C standard library, and indeed the Bolt standard library, will need to support variadic arguments in functions for the implementation of aspects like printf() in libc.bolt.

What needs to be done?

Support needs to be added to the parameter representation in the AST to allow it to represent a variadic argument, and then into the code generation of functions so that they can correctly create the IR representation of such a function.

Further, to test this, a new function declaration for printf should be added to libc.bolt.

[Optional] Checklist

  • Add appropriate AST Nodes and update the parser accordingly.
  • Add appropriate code generation.
  • Add printf declaration to libc.

Error reporting improvements

Currently error reporting has rudimentary support in the compiler but it is not fantastic and it is not comprehensive. This really needs to be sorted before v0.0.1 so that it does not become tedious to manage later on.

Fatal Errors in the Compiler

Some error states are impossibilities for the compiler to be in, and thus should be treated as such. If the compiler gets into such a state it should raise a fatalError() with a message about why the fatal error was raised. This can help track down bugs in the compiler.

Source Errors

These are the kinds of errors that need to be reported to the user, regarding the code that they are trying to compile. The compiler already has a concept of locations in the source code (see Mark) and should be passing this via the Error. However, the coverage of such errors is spotty and/or inconsistent. Further to this, some errors are using .unknown for the Mark.

Integrate LLVM into compiler

Bolt is going to be using LLVM as the backend for the compiler as it will give access to assembly code generation, object file generation, linking, optimisation, etc. There may be some experimentation without LLVM in the future, but currently there is no desire to do so.

This project will be making use of LLVMSwift.
