silo's People

Contributors

salmanahmad

silo's Issues

Test VOID Returns

What happens if you have a native function that returns void? Will the compiler artificially insert a null for you?

Example:

public class foo extends Function {
    @Function.Body
    public static void invoke(ExecutionContext context) {
        System.out.println("Hi!");
    }
}

Incorrect Type Propagation

A function like this:

func(main(null => String) {
    while(true { return("Hello") })
})

will cause a compilation error because the compiler will think "Object" is being returned rather than "String", since the while loop will propagate an "Object" on the stack.

Try in Your Browser

Create a Silo REPL shell that allows developers to execute commands. The shell would also need to include a sandbox capability because I would want to deploy this shell interface over the Web for a "Try in your Browser".

Variables

The first challenge is supporting local variables in a sane way. One way that I could handle this is to macro expand the code and then rewrite assignment nodes (=) so that they use a HashMapActor or something similar. This HashMap actor can be serialized and even stored in a cookie to reduce server-side state for the Web app.

Runtime

Each session should run in its own runtime so I can avoid name clashes.

Security

The first thing that I need to do is protect against obvious security issues. I can do this by enabling sandboxing in the JVM with a security manager as described here:

http://stackoverflow.com/questions/502218/sandbox-against-malicious-code-in-a-java-application

In addition, I need to be wary of users that create threads. To prevent creating threads I need to look into ThreadGroups and setting the security manager to prevent new threads from being created.

This is a good start, but I also need to be wary of runaway code that could run in an infinite loop or attempt to allocate a huge amount of memory. To prevent this, I need to run each command on a new thread with a timeout, so that the command is canceled after a certain amount of time. There are several ways of doing this. One simple way is to use Executors.newSingleThreadExecutor() and Future#get(DURATION, TimeUnit.SECONDS) followed by ExecutorService#shutdownNow(). This approach, along with the problem with shutdownNow(), is discussed here:

http://stackoverflow.com/questions/2275443/how-to-timeout-a-thread
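A minimal Java sketch of the timeout approach, assuming each REPL command can be wrapped in a Callable. The class name TimedCommand and the method runWithTimeout are hypothetical and not part of Silo.

```java
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

// Hypothetical helper: runs one command on a fresh thread with a deadline.
final class TimedCommand {
    /** Returns the command's result, or null if it did not finish in time. */
    static <T> T runWithTimeout(Callable<T> command, long timeout, TimeUnit unit) {
        ExecutorService executor = Executors.newSingleThreadExecutor();
        Future<T> future = executor.submit(command);
        try {
            return future.get(timeout, unit);
        } catch (TimeoutException e) {
            return null; // the worker is interrupted in the finally block
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("caller was interrupted", e);
        } catch (ExecutionException e) {
            throw new IllegalStateException("command failed", e.getCause());
        } finally {
            // shutdownNow() only interrupts the worker thread; code that
            // ignores interruption keeps running after this call returns.
            executor.shutdownNow();
        }
    }

    public static void main(String[] args) {
        Integer fast = runWithTimeout(() -> 1 + 1, 1, TimeUnit.SECONDS);
        String slow = runWithTimeout(() -> {
            Thread.sleep(5_000); // responds to interruption, so it stops early
            return "late";
        }, 100, TimeUnit.MILLISECONDS);
        System.out.println(fast + " " + slow);
    }
}
```

The sleeping command stops promptly only because Thread.sleep reacts to the interrupt; a busy loop that never checks the interrupt flag would keep its thread alive.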

However, we are not there yet. This only works if the user code pays attention to Thread.interrupt(), and a malicious thread may ignore that. To avoid that we could use Thread.stop(), which is deprecated but is an option, or we could macro expand the code and then insert if(thread.isInterrupted()) throw("foo"). Both of these work nicely; however, a malicious thread could still circumvent them by catching all exceptions (including the ThreadDeath error thrown by Thread.stop()) and ignoring them. Which leads us to our last security precaution, a simple blacklist:

https://github.com/Raynes/tryclojure

https://github.com/Raynes/clojail/blob/master/src/clojail/testers.clj

Basically, we blacklist certain commands / forms that we do not like. The try-catch statement is one of those.

Fall Back

As a fallback, I should kill the JVM process every 20 minutes to ensure proper quality of service. To do this, I should have a Silo program that is monitoring another Silo program running in another JVM. This will avoid any runaway threads that may be leaked.

Additionally, the Silo program should be run on a dedicated Linux box that is disconnected from everything else (so no SSH keys, etc.). The OS user account should also be a low-privilege user. This is not as simple as it may seem since I would want the process to be listening on port 80. Perhaps I create a Silo reverse proxy application (or just use nginx / haproxy, which is less interesting).

Exception Handling with Fibers + Actor Migration

If a resumed Fiber throws an exception, the function that catches the exception will not have its local variables initialized correctly and will also not clear the relevant execution frames, so the execution may be replayed over and over again.

The first step is easy - inside all catch blocks I should emit code that re-sets the local variables if the ExecutionContext has a status of THROWING (as opposed to RUNNING or YIELDING).

The second step is much harder - how do I invalidate all of the ExecutionFrames? I need a mechanism that allows me to determine which ExecutionFrame belongs to the current function, find it, and then remove all ExecutionFrames on top of it. One idea is to embed an interned string with the fully qualified name of the function. I then call a utility method that iterates through the ExecutionFrames in reverse and returns the first frame that matches the fully qualified name. It then "nulls" out the rest.
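The reverse scan could look roughly like the following Java sketch. The Frame class here is a hypothetical stand-in for ExecutionFrame that carries only the owning function's fully qualified name, and unwindTo is an assumed name for the utility method.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch of the frame-invalidation utility.
final class FrameUnwinder {
    /** Stand-in for ExecutionFrame; only the owner's name matters here. */
    static final class Frame {
        final String functionName;
        Frame(String functionName) { this.functionName = functionName; }
    }

    /**
     * Scans from the top of the frame list downward, nulls out every frame
     * above the first one owned by functionName, and returns that frame's
     * index. Returns -1 and changes nothing if no frame matches.
     */
    static int unwindTo(List<Frame> frames, String functionName) {
        for (int i = frames.size() - 1; i >= 0; i--) {
            Frame frame = frames.get(i);
            // equals() is used for safety; with interned names == would also work.
            if (frame != null && frame.functionName.equals(functionName)) {
                for (int j = i + 1; j < frames.size(); j++) {
                    frames.set(j, null);
                }
                return i;
            }
        }
        return -1;
    }

    public static void main(String[] args) {
        List<Frame> frames = new ArrayList<>();
        for (String name : new String[] {"app.main", "app.handler", "io.read", "io.parse"}) {
            frames.add(new Frame(name));
        }
        System.out.println(unwindTo(frames, "app.handler"));
    }
}
```

Because the scan starts at the top, a recursive function would match its innermost activation first, which may or may not be the frame that owns the catch block.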

Optimize Coroutines with Custom Stack Frames

Create a custom class for each call site with fields and static initialization method that matches the call site's JVM operand stack.

This has several benefits. First, it is very efficient: the static initialization method can "pop" all of the elements off the stack in one go. Additionally, it does not require any auto-boxing and needs only minimal casting (when restoring the stack). Second, it reduces the emitted code size. Instead of emitting a separate pop for each operand, everything is popped in one go, and instead of swapping to create an array, there is a single static method call. Additionally, I eliminate the need for many of the casts.

While I do this, I should also add two other "special" variables: returnValue and currentFrame. These variables are used to avoid having to constantly load the ExecutionContext and ExecutionFrame (through a virtual method call) and to avoid having to swap the return value when clearing dummy values during coroutine resumption.

The challenge with this is that it causes me to generate a lot of small classes. I cannot dynamically create the StackFrames at runtime and re-use them across functions because of AOT compilation: the class loader used to load a stdlib function will likely not be a RuntimeClassLoader. However, what I can do is share custom StackFrames across functions that use the same CompilationContext. This will likely eliminate a lot of the small temporary classes while not having to worry about AOT compilation issues.

When creating these custom stack frames, I need to be wary about duplicated names and clashes. I am of two minds here. First, if there is a name conflict, it shouldn't matter because the two classes will technically be exactly the same (the types of the operand stack are represented in the name, like frame_IIL (int, int, object)). On the other hand, I worry that this will create issues in two contexts: (1) the JVM may reject the class and create a ClassCastError because an internal manifest field may be different and (2) it may cause the class writing to skip writing the stack frame class to disk, and when the files are copied to a new machine the stack frame may be missing. Thus, I am also debating using a UUID to name the stack frames since they will practically never collide with one another.
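For illustration, a generated frame class for an operand stack of (int, int, object) might look like the Java sketch below. The field names s0 to s2 and the method name save are assumptions; the real compiler would emit bytecode directly rather than Java source.

```java
// Hypothetical shape of a generated stack frame for the operand types I, I, L.
final class Frame_IIL {
    final int s0;      // bottom of the saved operand stack
    final int s1;
    final Object s2;   // top of the saved operand stack

    private Frame_IIL(int s0, int s1, Object s2) {
        this.s0 = s0;
        this.s1 = s1;
        this.s2 = s2;
    }

    /**
     * Called with the three operands still on the JVM stack, so a single
     * invokestatic consumes all of them with no boxing and no array.
     */
    static Frame_IIL save(int s0, int s1, Object s2) {
        return new Frame_IIL(s0, s1, s2);
    }

    public static void main(String[] args) {
        Frame_IIL frame = save(1, 2, "top");
        // Restoring pushes the fields back in order; only s2 needs a cast.
        String top = (String) frame.s2;
        System.out.println(frame.s0 + " " + frame.s1 + " " + top);
    }
}
```

The primitives stay unboxed in their own typed fields, and only the reference slot needs a checkcast when the stack is restored.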

Long Literals are not Working

For some reason I cannot create a Long literal. This does not work:

l : long = 500L

The workaround for now is silly:

l : long = Integer(500)#longValue()

Semantics of Top Level Code

Right now, top-level code could create duplicated __function__1 definitions. I should consider how I am going to solve that problem.

A good test / use case for this is when you want to add a new function to an existing AOT compiled package. That would certainly create multiple ___function___1 definitions.

Inconsistent Stack Height with Finally Clause

The following code (using the HTTP API) creates a compilation error.

    try({
        connection.readAll(c)
        connection.writeAll(c, 200, null, "Hello, World!");
    } catch(e : Exception) {
        println("exception")
    } finally {
        println("Finally!")
    })

error: java.lang.RuntimeException: java.lang.VerifyError: (class: handler, method: invoke signature: (Lsilo/lang/ExecutionContext;Lsilo/net/http/connection/Connection;)Ljava/lang/Object;) Inconsistent stack height 3 != 2

Cryptic Error with If Statement

I wrote the following code:

if(...) {

}

(Note the braces are NOT inside the parens).

This led to a cryptic error message that was hard to track down.

String Concatenation

Support string concatenation using the plus operator. Example:

s : String = "Hello" + "World!" + 9

Profile Compiler

Profile the compiler and figure out any bottlenecks. In particular, why does core.silo take so long to compile?

One of my current guesses is that Node#getChildren() is really slow since it has to copy everything and is called a bunch of times. Once replaced with persistent vectors, it could be much faster.

Add Test Case for Void Returns

Add a simple test that ensures that void returns from Java don't mess everything up.

map : HashMap = HashMap()
fn(h : HashMap, h#clear())(map)

Also

// Should be a type error
i : int = map#clear()

Also

// 'i' should be null
i : Object = map#clear()
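Java reflection offers one precedent for the "null for void" behavior: Method#invoke returns null when the underlying method's return type is void. The sketch below only demonstrates that JDK behavior with a hypothetical helper class; whether the Silo compiler does the same is exactly what the test case should establish.

```java
import java.lang.reflect.Method;
import java.util.HashMap;

// Hypothetical demo: a void Java method called reflectively yields null.
final class VoidReturnDemo {
    /** Invokes HashMap#clear() reflectively and returns whatever invoke() yields. */
    static Object invokeClear(HashMap<String, String> map) {
        try {
            Method clear = HashMap.class.getMethod("clear");
            // For a void method, Method#invoke returns null.
            return clear.invoke(map);
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        HashMap<String, String> map = new HashMap<>();
        map.put("key", "value");
        Object result = invokeClear(map);
        System.out.println(result + " " + map.isEmpty());
    }
}
```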

Exclusively Package Imports

If I want access to the File operation commands I cannot do:

import(silo.io.file)

file.path("foo", "bar")

What I want to be able to do is something like:

include(silo.io.file)

Alias will also not work.

Shutdown ExecutorService

Right now, the runtime has no hook to shut down the ExecutorService. Does this happen automatically?
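As far as the JDK goes, a pool from Executors.newFixedThreadPool uses non-daemon worker threads by default and keeps them alive while idle, so it is safer to assume that shutdown does not happen on its own. A shutdown hook for the runtime could follow the usual two-phase pattern sketched below; the class and method names are hypothetical.

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

// Hypothetical helper the runtime could call when it is done.
final class ExecutorShutdown {
    /** Returns true if the pool terminated within the allotted time. */
    static boolean shutdownAndAwait(ExecutorService pool, long timeout, TimeUnit unit) {
        pool.shutdown(); // stop accepting new tasks, let queued tasks finish
        try {
            if (pool.awaitTermination(timeout, unit)) {
                return true;
            }
            pool.shutdownNow(); // interrupt tasks that are still running
            return pool.awaitTermination(timeout, unit);
        } catch (InterruptedException e) {
            pool.shutdownNow();
            Thread.currentThread().interrupt();
            return false;
        }
    }

    public static void main(String[] args) {
        ExecutorService pool = Executors.newFixedThreadPool(2);
        pool.execute(() -> System.out.println("task ran"));
        System.out.println(shutdownAndAwait(pool, 5, TimeUnit.SECONDS));
    }
}
```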

String Comparison Operators

Support the full gamut of operators with String objects. For example, all of the following should be valid and performed using String#compareTo.

a : String = ...
b : String = ...

a < b
a > b
a <= b
a >= b
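In Java terms, each operator would lower to a comparison of String#compareTo against zero. The sketch below shows that mapping; the class and method names are hypothetical.

```java
// Hypothetical lowering of the four relational operators for String operands.
final class StringCompare {
    static boolean lessThan(String a, String b)       { return a.compareTo(b) < 0; }
    static boolean greaterThan(String a, String b)    { return a.compareTo(b) > 0; }
    static boolean lessOrEqual(String a, String b)    { return a.compareTo(b) <= 0; }
    static boolean greaterOrEqual(String a, String b) { return a.compareTo(b) >= 0; }

    public static void main(String[] args) {
        System.out.println(lessThan("apple", "banana"));
        System.out.println(greaterOrEqual("banana", "banana"));
    }
}
```

String#compareTo orders by UTF-16 code unit, so uppercase letters sort before lowercase ones ("Zebra" compares as less than "apple"), and a null operand throws a NullPointerException.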

Left Associativity in the Grammar

Most operators are implemented using right associativity when they should be left. The # operator is especially troublesome and should be updated sooner rather than later.

Scoped Imports

It would be nice to be able to scope imports and alias to a package block. So, for example:

package(silo.net.http.connection {
    import(silo.net.http)

    server.createServer(...) // Okay
})

server.createServer(...) // [server] not found because the import is no longer active.

Actor System

Augment the runtime with the following fields and methods:

Runtime
- ExecutorService taskPool <-- Fixed
- ExecutorService backgroundTaskPool <-- Cached
- ConcurrentHashMap<String, Actor> actors
- HashMap<String, Integer> pendingActors;
- sendMessage()
- scheduleActor()
- unscheduleActor()

Actor
- String address
- ArrayList mailbox <-- All access is synchronized
- ExecutionContext context

Scheduling Algorithm
The scheduleActor() method is completely synchronized. When it is called, it checks pendingActors to see if the actor is currently pending. If the actor is not in the HashMap then the actor will be sent to the taskPool. If the actor is there, the number is incremented.

After the taskPool executes an actor it calls unscheduleActor, which is synchronized on the same lock as scheduleActor. unscheduleActor will look into pendingActors and check whether the number is zero. If the number is zero, then it will remove the actor from the map and return. If the number is nonzero, it will set the value to zero and immediately re-execute the task.

Basically, this algorithm gives the actor "one more chance" to execute before unscheduling it. It is somewhat likely that the actor will be resumed twice (which is okay because resuming a fiber is really fast) but it is very unlikely that it will be resumed more than that because the second time will likely start up, check that there is nothing in the mailbox, and immediately quit again, all very quickly. Thus, the window for another thread to schedule the actor is really small.

This two-way checking is needed to avoid a race condition in which an actor is scheduled but is never actually sent to the taskPool. This becomes a problem since we want to prevent an actor from being executed concurrently by two threads.
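The algorithm above can be sketched in Java as follows. This is an illustration under assumptions, not the runtime's actual code: the taskPool is typed as Executor (which any ExecutorService satisfies), the runActor callback stands in for resuming the actor's fiber, and "immediately re-execute" is modeled as a loop on the same worker thread.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.concurrent.Executor;
import java.util.function.Consumer;

// Hypothetical sketch of scheduleActor() / unscheduleActor().
final class ActorScheduler {
    // address -> number of schedule requests that arrived while pending
    private final Map<String, Integer> pendingActors = new HashMap<>();
    private final Executor taskPool;
    private final Consumer<String> runActor; // assumed: resumes the actor once

    ActorScheduler(Executor taskPool, Consumer<String> runActor) {
        this.taskPool = taskPool;
        this.runActor = runActor;
    }

    synchronized void scheduleActor(String address) {
        Integer extra = pendingActors.get(address);
        if (extra == null) {
            pendingActors.put(address, 0);
            taskPool.execute(() -> runUntilIdle(address));
        } else {
            pendingActors.put(address, extra + 1); // already pending: just count it
        }
    }

    // Runs the actor, then gives it "one more chance" while requests came in.
    private void runUntilIdle(String address) {
        do {
            runActor.accept(address);
        } while (unscheduleActor(address));
    }

    /** Returns true if the actor must run again, false once it is unscheduled. */
    synchronized boolean unscheduleActor(String address) {
        Integer extra = pendingActors.get(address);
        if (extra == null || extra == 0) {
            pendingActors.remove(address);
            return false;
        }
        pendingActors.put(address, 0);
        return true;
    }

    synchronized boolean isPending(String address) {
        return pendingActors.containsKey(address);
    }
}
```

Since an actor is handed to the taskPool only when it is absent from pendingActors, and it is removed only after its run completes, at most one thread runs a given actor at a time. The sketch does not handle a runActor call that throws, which would leave the actor marked as pending.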

Change Blocks in Grammar

Code blocks {...} currently compile to do(...). Instead, they should compile to a node with a null label.

Support Types, Classes, and Subclasses

I need to provide some capabilities to allow Silo code to create and extend Java classes and interfaces.

In the case of classes (which is the more general case) I need syntax to:

  • Give a name
  • List superclasses and super interfaces
  • Define methods (public, private, protected)
  • Define fields (public, private, protected)

Importantly, the following are non-goals and do not need to be supported:

  • Static methods (just use Silo functions, perhaps "NO_BLOCK" functions)
  • Static fields (just use Silo constants)
  • Define new abstract classes or interfaces (just use Silo traits)
  • Enforcing final semantics

class(Reader(Object, Comparable, List) {
    field(foo : int = 5)
    field(public bar : int = 5)

    init(this : Reader, foo : int {
        this.foo = foo
    })

    method(public toString(this => String) {
        "Foo is: " + this.foo
    })
})

Ignore Macro

Adding an ignore macro could be interesting.

f : File = null
ignore(FileNotFoundException {
    f = File.open("foo.txt")
})

Auto-Boxing for Math and Relational Operations

If either a math or relational operation is performed on a wrapper class (Integer, Double, etc.) the operands should be unboxed and the operation should proceed as normal.

In the case of a math operation, the outcome should be boxed up once again at the end.
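Written out by hand in Java, the unbox, operate, re-box sequence might look like the sketch below. The class and method names are hypothetical; the compiler would emit the equivalent bytecode.

```java
// Hypothetical illustration of what the compiler would emit for boxed operands.
final class BoxedMath {
    /** Math operation: unbox both operands, add, then box the result again. */
    static Integer add(Integer a, Integer b) {
        int sum = a.intValue() + b.intValue();
        return Integer.valueOf(sum);
    }

    /** Relational operation: unbox both operands; the boolean result is not boxed. */
    static boolean lessThan(Integer a, Double b) {
        return a.intValue() < b.doubleValue();
    }

    public static void main(String[] args) {
        System.out.println(add(2, 3));
        System.out.println(lessThan(2, 2.5));
    }
}
```

A null wrapper operand would throw a NullPointerException during unboxing, so the compiler may want a clearer error for that case.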

Function Resolution Error (Most Specific Parameters)

If you have methods:

void foo(CharSequence, Object) // 1
void foo(String, Object) // 2

The current compiler cannot determine that you should pick the second because String is more specific. The issue is inside Invoke#resolveFunctionByArguments on the following line:

if(java.util.Arrays.equals(parameters[options.get(index)], parameters[options.get(i)]))
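One possible replacement for the equality check is a "most specific" comparison based on Class#isAssignableFrom, sketched below. It assumes every candidate is already applicable to the call and ignores primitive widening; the class and method names are hypothetical and this is not the code in Invoke#resolveFunctionByArguments.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch of picking the most specific overload.
final class MostSpecific {
    /** True if each parameter type in a can be passed where b's type is expected. */
    static boolean atLeastAsSpecific(Class<?>[] a, Class<?>[] b) {
        if (a.length != b.length) {
            return false;
        }
        for (int i = 0; i < a.length; i++) {
            if (!b[i].isAssignableFrom(a[i])) {
                return false;
            }
        }
        return true;
    }

    /** Index of the candidate at least as specific as all others, or -1 if ambiguous. */
    static int pick(List<Class<?>[]> candidates) {
        for (int i = 0; i < candidates.size(); i++) {
            boolean best = true;
            for (int j = 0; j < candidates.size() && best; j++) {
                if (i != j && !atLeastAsSpecific(candidates.get(i), candidates.get(j))) {
                    best = false;
                }
            }
            if (best) {
                return i;
            }
        }
        return -1;
    }

    public static void main(String[] args) {
        List<Class<?>[]> candidates = new ArrayList<>();
        candidates.add(new Class<?>[] {CharSequence.class, Object.class}); // 1
        candidates.add(new Class<?>[] {String.class, Object.class});       // 2
        System.out.println(pick(candidates));
    }
}
```

For the two foo overloads above, pick selects the (String, Object) candidate, since String is assignable to CharSequence but not the other way around.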

Fiber Thread Safety & Immutability

I need to ensure that a Fiber cannot ever be executed by multiple threads concurrently. I don't know exactly how I can do this. This can happen if you were to send a Fiber to another Actor (which I want to support). Perhaps Fibers can be made to be immutable?

Change Order in InstanceOf and CheckCast

It should be:

instanceof(o, java.util.Vector)

and not:

instanceof(java.util.Vector, o)

that way you can pipe them to one another:

o | instanceof(java.util.Vector)

Create HTTP File Server

Similar to the Python SimpleHTTPServer. Basically, you run it on the command line and it serves the current directory that you are in.

There are a couple of things that go along with this. The first is the notion of system-wide installs for 3rd party tools. This could be similar to NPM or Gem. I need to figure this out because this tool should NOT be part of the Silo standard library.

Second, it forces me to figure out the HTTP pipeline for the Silo core library.

Figure Apache Bench Issue with Handler

In HttpServerHandler when I enable this:

this.actor = runtime.spawn(connection.actorId, handle, handler, connection)

Apache Bench starts to crap out on connection. I am not sure why. I am starting to suspect that it could be calling the function dynamically with Function#apply.

Add Test Case for Resumption Anonymous Function

Add a test case that creates a fiber that calls a function that calls an anonymous function before yielding. Something like this should be good:

func(foo() {
    f : Function = fn(return(fiber.yield()))
    f()
})

fiber : fiber.Fiber = fiber.Fiber(foo)
fiber.resume(fiber)

Better Error Message for Functions Defined in Finally

Functions that are defined in a finally block will be duplicated during compilation since the finally block essentially copies code into different places. This currently leads to an error message that is hard to decipher. I should make it more straightforward.

Pattern matching API

Create an API that allows nodes to be matched against a pattern. Look to regular expressions, Haskell, Caml for inspiration.

Make Node Immutable

In particular, Node makes heavy use of java.util.Vector, which needs to be replaced.

Check for Dead Fibers

Add a safety check to see if a fiber cannot be resumed and is "dead". This means that the function that the fiber was calling returns naturally with an ExecutionContext status of RUNNING. This fiber should NOT be allowed to be resumed as weird things could happen.

Mark Functions as NO_BLOCK

If a function does not perform any blocking operations that would cause the current coroutine to be switched out, I can execute that function as a normal Java method instead of including the ceremony of beginCall(), endCall(), stack store, and stack restore. Practically speaking this will likely not improve performance but it will reduce code size.

All functions are considered to be blocking unless specifically told that they are not. Thus, NO_BLOCK is opt-in. Certain library calls like vector.create(...) will be marked as NO_BLOCK because it just does computation. If a function $F$ only calls NO_BLOCK functions then $F$ is assumed to also be NO_BLOCK.
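The inference rule in the last sentence can be computed as a fixed point over the call graph. The Java sketch below assumes the compiler can supply, for each function, the names of the functions it calls; the class name and data shapes are hypothetical.

```java
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Hypothetical sketch of NO_BLOCK inference over a call graph.
final class NoBlockInference {
    /**
     * declaredNoBlock: functions explicitly marked NO_BLOCK.
     * callees: for each analyzed function, the functions it calls.
     * Returns every function that is declared or inferred to be NO_BLOCK.
     */
    static Set<String> infer(Set<String> declaredNoBlock, Map<String, List<String>> callees) {
        Set<String> noBlock = new HashSet<>(declaredNoBlock);
        boolean changed = true;
        while (changed) {
            changed = false;
            for (Map.Entry<String, List<String>> entry : callees.entrySet()) {
                // A function with no callees at all also passes this check.
                if (!noBlock.contains(entry.getKey())
                        && noBlock.containsAll(entry.getValue())) {
                    noBlock.add(entry.getKey());
                    changed = true;
                }
            }
        }
        return noBlock;
    }

    public static void main(String[] args) {
        Set<String> declared = Set.of("vector.create");
        Map<String, List<String>> callees = Map.of(
                "f", List.of("vector.create"),
                "g", List.of("f"),
                "h", List.of("f", "actor.receive"));
        System.out.println(infer(declared, callees).contains("g"));
    }
}
```

Functions that are unknown to the analysis, and functions that only reach each other through recursion, stay blocking, which matches the opt-in default.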

Async API

Create async.spawn. Note that you do not need to do anything else with this approach to Async because it will have runtime support and no other functions are needed. I will use async.spawn (instead of actor.spawn) and then the normal actor API (read, receive, etc.)

Exceptions do not Propagate with CLI

I had a file that looked like:

options : silo.lang.FooBarBaz = silo.lang.FooBarBaz()

When run with silo file.silo, no exception was reported that "FooBarBaz" does not exist.

Changing Size of ActorExecutor

The Runtime.actorExecutor service is a fixed size thread pool. It would be nice to be able to change the size of the thread pool dynamically - is that something that can be done? A key use case is if an actor wants to "lock" the underlying Java thread so that it executes the actor and no other actor. However, you would also want to replace this thread with another one so it does not block the entire system. Is that something that could be done?
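On the JDK side, java.util.concurrent.ThreadPoolExecutor does allow its size to be changed after construction through setCorePoolSize and setMaximumPoolSize. The sketch below assumes the runtime constructs the pool as a ThreadPoolExecutor directly; the class and method names are hypothetical.

```java
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;

// Hypothetical sketch of a fixed-size pool that can be resized later.
final class ResizablePool {
    /** Same shape as a fixed thread pool, but typed so it can be resized. */
    static ThreadPoolExecutor fixedPool(int size) {
        return new ThreadPoolExecutor(size, size, 0L, TimeUnit.MILLISECONDS,
                new LinkedBlockingQueue<>());
    }

    /** Keeps core <= max at every step, which both setters require. */
    static void resize(ThreadPoolExecutor pool, int newSize) {
        if (newSize > pool.getMaximumPoolSize()) {
            pool.setMaximumPoolSize(newSize); // grow: raise the ceiling first
            pool.setCorePoolSize(newSize);
        } else {
            pool.setCorePoolSize(newSize);    // shrink: lower the core first
            pool.setMaximumPoolSize(newSize);
        }
    }

    public static void main(String[] args) {
        ThreadPoolExecutor pool = fixedPool(2);
        resize(pool, 3); // e.g. an actor "locked" one thread, so add a spare
        System.out.println(pool.getCorePoolSize());
        pool.shutdown();
    }
}
```

For the "locked thread" use case, the runtime could grow the pool by one when an actor pins a thread and shrink it again when the actor releases it.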

Migrate from IPersistentVector to PersistentVector

There are a couple of areas where I need to be mindful of the transition:

  • Varargs uses a calling convention that relies on IPersistentVector
  • Node makes heavy use of java.util.Vector, which needs to be replaced
  • CompilationContext defaults for aliases and imports
