lol3rrr / cpiler Goto Github PK
View Code? Open in Web Editor NEWA Rust based Compiler for C
A Rust based Compiler for C
Right now the Backend for Mac is kind of a mess and all over the Place and its pretty confusing to work with as some parts are repeated all over the place and in general its not very nice to work with.
Start by breaking up the Code and restructuring it a bit to make more sense and pack common functionality into a single location. Then probably also start by building a better Assembly abstraction to allow for an easier time while working with it to generate code as it actually does not need to be soo close to the actual assembly in the end
Right now the Variable spilling kind of works but breaks down for more complicated Situations and sometimes fails to properly spill Variables leading to the Problem that it just tries to spill Variables over and over again without making any real Progress.
The Way Variables are spilled should therefore be revisited and improved to hopefully mitigate these Problems and provide a more advanced and solid solution
Right now Conditionals and Phi Nodes are causing a lot of Problems with spilling as their interference is not really handled correctly.
Right now the Compiler basically treats them very dumb like this
if (..) {
x_1 = 5;
} else {
x_2 = 13;
}
// For the Compiler x_3 only starts existing after this point
x_3 = phi(x_1, x_2);
when in reality the semantics/logic of this is more like this
if (..) {
x_1 = 5;
// X_3 exists at this point
// Implicit: x_3 = x_1
} else {
x_2 = 13;
// X_3 exists at this point as well
// Implicit: x_3 = x_2
}
// This only exists because we need a single Definition but in the final non SSA form this is moved up like shown
x_3 = phi(x_1, x_2);
This mainly causes Problems when the compiler attempts to insert a spill before the Phi Node, because it needs a Register for the x_3
Variable but handles it incorrectly as it has wrong assumptions about the interference
C expects certain Binary Logic Operations to short-circuit, like when one part of an and (&&) is false the Rest will not get evaluated anymore.
This is currently not supported and could lead to undefined behaviour, as there is definetly code that relies on this Property.
The easiest solution would be to have the Semantics generate the correct IR by inserting the correct Jumps in the IR before evaluating other Parts of the Expression
Right now Enums are not supported by the Compiler and it will simply Panic when it encounters one, which is obviously not how it is supposed to work so this will need some more work in the future
Right now these loops are only kind of supported, as they work in the basic Case, but when their Condition modifies any Variable the generated IR is actually not correct
Works
int x = 0;
while (x) {
x++;
}
Does not work
int x = 0;
while (x++) {
x++;
}
Currently Phi Nodes dont really work well with Register Allocation, because it gets "invalided" when destructuring the SSA Form of the Phi Nodes
This is caused by an oversight of the interference graph/register allocation.
Lets say we define x1 in b1 and x2 in b2 and now define x3 as Phi(x1, x2). With the current System this means that we assign Registers to x1, x2 and x3 independently of each other and the Register assigned to x3 might be used before x3 was assigned with the Phi Node. When we then destructure our SSA we essentially move the assignment of x3 up into b1 and b2, but because the Register for x3 was also used before we originally assigned x3 using the Phi we now overwrite its register in between.
Original SSA:
b1:
x1 = 13
jump b3
b2:
x2 = 23
jump b3
b3:
test1 = 123
print(test1)
x3 = Phi(x1, x2)
print (x3)
Destructured
b1:
x1 = 13
x3 = x1
jump b3
b2:
x2 = 23
x3 = x2
jump b3
b3:
test1 = 123 // Could be storing into the same Register that x3 was previously stored in and therefore corrupting the Data
print (test1)
print (x3)
Because test1 and x3 dont interfere in the original SSA they could very well be assigned to the same Register meaning that we then generate an invalid SSA after destructuring because the Register for x3 is overwritten, but we just use it again afterwards as if nothing happened
Currently the IR can not really be cloned/copied in the Sense that you get independant Versions of it. This is very Problematic because it hinders the Options for Scaling the Optimizer as it cant run in in parallel if it needs access to information about other Functions, like for inlining.
The Old Design relies on Arc and WeakPointer to represent the IRs Controlflow, which works fine as it was intended, however as already noted above, this does not really allow for proper cloning/copying to create independant Objects that can be used concurrently.
Add a new Representation where each Function contains a List of Blocks and then each Block is identified by its index in that List. All references to other Blocks are then replaced with their Index. This makes the entire Data Structure easily clonable and also makes it easier to keep track of everything.
There should also be a way to easily convert between these Formats as most of the Compiler will probably continue to use the Old-Design and just adopt the New-Design where needed, like in the Optimizer.
Right now the Compiler represents a char as a signed 8 bit integer, but this is actually not really what I want in most cases when using it for example as i cant store for example 255 into it.
Change the Code to instead interpret the char as a unsigned 8 bit integer which is easier for me overall to deal with as well
Right now every Variable in the Program with the same Name also kind of refers to the same Variable even if they are in different Scopes and should have nothing to do with each other.
For this to work properly there needs to be a way to sort of rename Variables depending on their current Scope or the like to help differentiate Variables with the same Name between different Scopes as well as allowing Variables in the inner scope to shadow Variables from the outer Scope.
int main() {
int x = 13;
if (1) {
int x = 25;
}
}
This currently fails to compile as far as I am aware, but it is actually valid code and should work
For loops are currently behaving pretty weird when using other Variables outside the loop as it somehow creates a phi node that contains itself but nothing else resulting in some weird behaviour
Right now the compiler does not support Global Variables of any sort, which is really bad and should obviously be supported.
However I currently dont know where these Variables should be stored to avoid any sort of Problems.
Right now the Compiler only keeps the Variables in Registers, this however wont work as soon as Pointers are introduced and also supported by the Backends. Because of this the IR was extended to support 2 new Statements, LoadVariable
and StoreVariable
.
This is intended to force the Compiler to read the Variable from the Stack (or wherever it is actually placed) back into its register, i.e. forcing a reload of it.
This is intended to force the Compiler to write the Variable out from the Register to the actual storage Location, making it visible from Pointers as well. This should now be generated after every assignment as it ensures that Pointers always read the correct Value from it. To not hamper Performance to much we should also later on add a new Optimization-Pass that tries to remove unnecessary stores, but only where it is proven to be valid.
A declarative, efficient, and flexible JavaScript library for building user interfaces.
🖖 Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.
TypeScript is a superset of JavaScript that compiles to clean JavaScript output.
An Open Source Machine Learning Framework for Everyone
The Web framework for perfectionists with deadlines.
A PHP framework for web artisans
Bring data to life with SVG, Canvas and HTML. 📊📈🎉
JavaScript (JS) is a lightweight interpreted programming language with first-class functions.
Some thing interesting about web. New door for the world.
A server is a program made to process requests and deliver data to clients.
Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.
Some thing interesting about visualization, use data art
Some thing interesting about game, make everyone happy.
We are working to build community through open source technology. NB: members must have two-factor auth.
Open source projects and samples from Microsoft.
Google ❤️ Open Source for everyone.
Alibaba Open Source for everyone
Data-Driven Documents codes.
China tencent open source team.