apigee / trireme

Embed Node.js inside a Java Virtual Machine

License: Other

Trireme

This is a set of libraries for running Node.js scripts inside Java.

What is Trireme?

Trireme runs Node.js scripts inside the JVM. This is important because there is a lot of software out there (including our own) that is built in Java and isn't going to get rewritten in JavaScript now or in the future.

Trireme is specifically designed to be embeddable within any Java program. There is a lot of support inside Trireme for this specific case:

Many Node.js scripts can run inside a single JVM, subject to memory constraints.
Each script is totally isolated from the others -- there is no way for one script to affect the heap of the others.
A sandbox is provided that lets the container control how, or if, the script gains access to the filesystem and to the network.
The HTTP server implementation is pluggable. An "HTTP Adapter" is supported that allows a container to embed a Node.js script inside an existing HTTP container like a web server or other product. (A sample adapter, built using Netty 4.0, is included.)
The sandbox supports a Rhino feature that makes it possible to limit the execution time for a script. With this feature enabled, a script that runs an infinite loop can be terminated after some time.

For a more detailed introduction, see our intro presentation:

Introduction to Trireme

So, again, why would I use Trireme?

To embed Node.js apps inside an existing Java application
To run Node.js apps that take advantage of Java libraries you can't live without, like JDBC drivers and XML parsers

If neither of those reasons applies to you, then stick with "regular node!"

How do I get it?

From NPM

sudo npm install -g trireme
trireme -h
trireme <your script name here>

The NPM package for Trireme lets you run it on the command line just like "node".

Unfortunately, Trireme does not support the "repl" yet (and it's hard since Java gives us limited control over the TTY) so just running "trireme" with no arguments produces an error right now.

From Maven Central

The best reason to use Trireme is to embed Node.js code inside an existing Java application. In that case you will use the modules under "io.apigee.trireme" on Maven Central:

io.apigee.trireme

The "module map" later in this document shows which modules to use in which cases.

From GitHub

See the releases page to download the latest release files.

"trireme-x.y.z.jar" is always a stand-alone jar that you can run just like "node":

java -jar trireme-x.y.z.jar script.js

What version of Node.js does Trireme Support?

Trireme supports two versions of Node.js:

0.10.32. This is the default, fully-supported version.
0.12.7. This version is still a work in progress.

Support for Node.js 4.0 depends on more complete ES6 code in Rhino. The Rhino community is making progress on this but it will be quite some time before we are ready to support 4.0.

Running Trireme

Using NPM

If you installed Trireme using NPM, just run:

trireme <script name>

Trireme will execute your script just like Node.

In addition, the environment variable TRIREME_CLASSPATH may be used to add extra JARs or directories to the classpath used to run Trireme. Anything on this path will be appended to the classpath used to launch Trireme. This allows you to add JDBC drivers, etc.

For help, use:

trireme -h

Using Java

The "jar" module builds a self-contained JAR that may be used to launch Trireme on the command line just like regular Node.js:

mvn install
java -jar jar/target/trireme.X.Y.Z.jar <script name>

(With no arguments it will launch the "repl", but that implementation is not complete.)

For Apigee Edge Customers

Node.js apps on Apigee Edge run inside Trireme using a few specific settings that make them slightly different from the standard Trireme runtime. In particular, an "HTTP adapter" is used, which is not 100% the same as the standard HTTP module. This does not affect most applications, but it may affect applications that depend on undocumented or internal functions and properties of the default HTTP module. (While this is a poor programming practice, it's quite common.)

In order to test applications that will run on Apigee Edge in the most compatible environment, the module "samples/apigee-edge-like-runner" will build a self-contained JAR that sets up Trireme this way. Use it just like the JAR in the "Using Java" section above:

mvn install
java -jar samples/apigee-edge-like-runner/target/apigee-edge-like-launcher-X-Y-Z-SNAPSHOT.jar <script name>

Embedding Trireme as a servlet

The "war" sample shows how to assemble a Node.js application into a WAR file. It uses the trireme-servlet module to link the servlet to the Node.js script. Any script that operates as an HTTP server using the "http" module can be embedded in this way.

Embedding Trireme Anywhere Else that Java Runs

There is JavaDoc for the "NodeEnvironment" and "NodeScript" classes, and many other features. Here are the basics:

import java.io.File;

import io.apigee.trireme.core.NodeEnvironment;
import io.apigee.trireme.core.NodeScript;
import io.apigee.trireme.core.ScriptStatus;

// The NodeEnvironment controls the environment for many scripts
NodeEnvironment env = new NodeEnvironment();

// Pass in the script file name, a File pointing to the actual script, and an Object[] containing "argv"
NodeScript script = env.createScript("my-test-script.js",
                                     new File("my-test-script.js"), null);

// Run the script and wait for it to complete
ScriptStatus status = script.execute().get();

// Exit the JVM with the script's exit code
System.exit(status.getExitCode());

Selecting the Node version

The all-in-one JAR, and the "trireme" NPM package, include code for both Node.js 0.10 and 0.12. To select a version from the command-line, use the "--node-version" option, like this:

trireme --node-version=0.12 foo.js

When embedding Trireme, select the version using the "setNodeVersion" method in the NodeScript class.

With this scheme, both versions of Node can run inside the same JVM.

The version numbers are "semver-style" although they do not support every single feature of semver. The best bet is to use "0.10" and "0.12" to select each.

Trireme Extensions

There are a few NPM modules that only work in Trireme. These allow access to features of the Java platform that are normally accessed via native code in regular Node.js. These modules are as follows:

trireme-jdbc: This module provides access to JDBC drivers from inside Node.js code. This makes it possible to use databases that have excellent JDBC drivers (such as Oracle) without compiling any native code.
trireme-xslt: This module provides access to the XSLT processor inside the Java platform, which is faster and can support more of the XSLT standard than any of the native Node.js options.
trireme-support: Additional Trireme-specific support functions. These include the ability to detect if Trireme is being used, and the ability to dynamically load additional Node.js modules from a JAR file.

Logging

Trireme uses slf4j for logging the stuff that is happening in Java. The pre-built JAR, and the NPM "trireme" wrapper, use "slf4j-simple". To turn on debug logging, set the system property "org.slf4j.simpleLogger.defaultLogLevel" to "debug". (Or "trace" for even more output.)

When embedding Trireme, you can use any SLF4J-compatible logging framework you wish, such as logback.
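
When launching from the command line, the debug property can be passed with the standard JVM flag, for example "java -Dorg.slf4j.simpleLogger.defaultLogLevel=debug -jar trireme-x.y.z.jar script.js". An embedding application that uses "slf4j-simple" can set the same property in code. The following is a minimal sketch under that assumption; the class name "DebugLogging" is hypothetical, and the property should be set before the first logger is created, since slf4j-simple reads its configuration at initialization.

```java
// Hypothetical helper, not part of Trireme: sets the slf4j-simple log level.
class DebugLogging {
    // The system property named in the text above.
    static final String LEVEL_KEY = "org.slf4j.simpleLogger.defaultLogLevel";

    // Call this before any SLF4J logger is created, e.g. first thing in main().
    static void enable(String level) {
        System.setProperty(LEVEL_KEY, level);
    }

    public static void main(String[] args) {
        enable("debug"); // or "trace" for even more output
        System.out.println(System.getProperty(LEVEL_KEY));
    }
}
```

With another SLF4J binding such as logback, this property has no effect and the level is configured through that framework instead.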

How Complete is Trireme?

Trireme supports most of the Node.js APIs and passes much of the Node.js test suite.

The table below shows each module and its status. "Complete" means that a module is functionally complete, although it may not necessarily pass all the node.js tests.

Module	Status	Source
assert	Complete	node.js
child_process	Partial	Trireme
cluster	Not Implemented Yet	node.js
console	Complete	node.js
crypto	Complete	node.js + Trireme
debugger	Not Supported
dgram	Complete	node.js + Trireme
dns	Complete	Trireme
domain	Complete	node.js + Trireme
events	Complete	node.js
fs	Complete	node.js + Trireme
globals	Complete	node.js + Trireme
http	Complete	node.js + Trireme
https	Complete but See Notes	Trireme
module	Complete	node.js
net	Complete	node.js + Trireme
os	Partial	Trireme
path	Complete	node.js
process	Complete	Trireme
punycode	Complete	node.js
querystring	Complete	node.js
readline	Partial	node.js
repl	Not Supported	node.js
stream	Complete	node.js
string_decoder	Complete	node.js
timers	Complete	node.js + Trireme
tls	Complete but See Notes	Trireme
tty	Complete	Trireme
url	Complete	node.js
util	Complete	node.js
vm	Complete	Trireme
zlib	Complete	Trireme

What are the Major Differences with "real" node.js?

A few of the modules are different, some in major ways:

JavaScript Language

Trireme runs in the JVM on Rhino, which is the most complete JavaScript implementation for the JVM and the one that works on the largest variety of Java distributions. The latest version of Rhino has many of the new JavaScript language features that Node users are used to, but still does not have all the features that are supported when the "--harmony" flag is set.

For specifics, see the Rhino compatibility table. Trireme is currently using Rhino 1.7.7.

Most of the time the differences between V8 and Rhino do not affect Node.js code, but occasionally there is a problem. We would love some help from the Rhino community to start to address these differences.

TLS/SSL and HTTPS

Trireme uses Java's standard "SSLEngine" for TLS/SSL and HTTPS support, whereas standard Node.js uses OpenSSL. The TLS implementation in Node.js is a fairly thin layer on top of OpenSSL and we chose not to try and replicate this in Java.

For the most part, TLS and HTTPS in Trireme will work just like they do in Node.js. However, SSLEngine and OpenSSL are not exactly the same. There are a few differences:

Most notably, especially with Java 7, SSLEngine supports a different set of cipher suites, particularly the various elliptic curve ciphers. The two have ciphers in common (otherwise almost everything would break), but many are not shared. Many Node.js tests that rely on older cipher suites using DES or RC4 will not run on Trireme because many of these older and weaker cipher suites are disabled by default in Java. However, "OpenSSL style" names work in Trireme just as they do in regular Node, and if the JVM supports a particular cipher suite from OpenSSL, you will get the same one in Trireme.
Java handles SSL sessions differently, and gives the user less control over them. Right now, Trireme cannot let a TLS or HTTPS client retrieve the session from an existing connection and re-use it for another TCP connection.
Java also will produce different certificate validation errors than OpenSSL does. The errors will still come in the same places and for the same reasons, but if your code depends on a specific error message, it will likely get a different one.
Java's SSLEngine relies on its own "keystore" files, whereas OpenSSL can operate on a variety of files but typically processes PEM files. Trireme handles this disparity by using the "Bouncy Castle" crypto framework to translate PEM files into keys and certificates that SSLEngine can understand. In addition, you can also use regular Java keystore files, as described below.

In order to support TLS and HTTPS using PEM files, the "trireme-crypto" module and its dependencies (Bouncy Castle) must be in the class path. If they are not present, then TLS is still available, but it will only work with Java keystore files (see below) or without using any keys at all. Trireme checks for this dependency at runtime, so enabling PEM support is simply a matter of including the module on the class path: operations that need it fail at runtime when it is missing, and everything else works regardless.

(For instance, Trireme can still execute a Node program that acts as an HTTPS client using only default certificates without requiring trireme-crypto. But if it needs to validate a particular CA certificate or if it needs to use a client-side certificate then trireme-crypto is also necessary.)
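
The general technique of checking for an optional dependency at runtime can be sketched as follows. This is an illustration only, not Trireme's actual check: the class "OptionalDependency" is hypothetical, and the Bouncy Castle provider class is used here simply as a convenient class to probe for.

```java
// Hypothetical sketch of a runtime check for an optional class path dependency.
class OptionalDependency {
    // Returns true if the named class can be found on the class path.
    // The class is located but not initialized.
    static boolean isPresent(String className) {
        try {
            Class.forName(className, false, OptionalDependency.class.getClassLoader());
            return true;
        } catch (ClassNotFoundException e) {
            return false;
        } catch (LinkageError e) {
            // The class exists but cannot be linked; treat it as unavailable.
            return false;
        }
    }

    public static void main(String[] args) {
        // Assumption: probing for the Bouncy Castle JCE provider class.
        boolean pem = isPresent("org.bouncycastle.jce.provider.BouncyCastleProvider");
        System.out.println(pem ? "PEM support available" : "Keystore-only TLS");
    }
}
```

A check like this lets the PEM code path fail only when it is actually requested, while everything else keeps working without the extra JARs.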

In addition, the TLS and HTTPS-related methods in Trireme can use a Java keystore instead of PEM files. There are three parameters that are relevant here:

keystore: The file name of a Java ".jks" keystore file containing a key and certificate
truststore: The file name of a Java ".jks" keystore file containing trusted CA certificates
passphrase: The passphrase for the keystore and truststore

The corresponding Trireme script may be written like this, as it would be in any Node.js program. However, if the "trireme-crypto" module is not present in the classpath, then this will raise an exception:

var fs = require('fs');
var https = require('https');

// "common" is the helper module from the Node.js test suite
var options = {
  key: fs.readFileSync(common.fixturesDir + '/keys/agent1-key.pem'),
  cert: fs.readFileSync(common.fixturesDir + '/keys/agent1-cert.pem')
};

var server = https.createServer(options, function(req, res) {
  console.log('got request');
});

In addition, the following is also valid, and "trireme-crypto" will not be needed:

var https = require('https');

// "common" is the helper module from the Node.js test suite
var options = {
  keystore: common.fixturesDir + '/keys/agent1.jks',
  passphrase: 'secure'
};

var server = https.createServer(options, function(req, res) {
  console.log('got request');
});

Crypto

Like TLS, certain features (Sign/Verify in particular) only work if the "trireme-crypto" module and its dependencies are in the class path. If they are not present then these methods will throw an exception. This is primarily because the trireme-crypto module uses Bouncy Castle to implement PEM file reading and decryption. It is possible to run Trireme without Bouncy Castle if these features are not needed.

Child Process

Child processes are supported. Arbitrary commands may be executed, just like in standard Node.js. The Sandbox may be used to restrict whether particular commands may be executed, or if none should be executed at all.

When a Trireme script uses "fork" to spawn a new instance of itself, the script runs as a separate thread inside the same JVM, rather than as a separate OS process as it works in regular Node.js. The parent may use "send" on the child process to send messages to the child, and the child can use "process.send" to talk back to the parent. This "IPC" mechanism works just like regular Node.js except that it all happens inside the same JVM using a concurrent queue.

Support for "handles" is not currently implemented, however, so a parent may not send a TCP socket to the child and expect the child to be able to handle it.
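
The idea of a "forked" child that is really a thread exchanging messages over concurrent queues can be sketched like this. It is a simplified illustration, not Trireme's implementation: the class "InProcessChild", the echo behavior, and the "exit" message are all hypothetical.

```java
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.TimeUnit;

// Hypothetical sketch: a "child process" that is a thread in the same JVM.
class InProcessChild {
    // One queue per direction, standing in for the IPC channel.
    private final BlockingQueue<String> toChild = new LinkedBlockingQueue<>();
    private final BlockingQueue<String> toParent = new LinkedBlockingQueue<>();

    InProcessChild() {
        Thread worker = new Thread(() -> {
            try {
                while (true) {
                    String message = toChild.take();  // like a "message" event in the child
                    if ("exit".equals(message)) {
                        return;
                    }
                    toParent.add("echo:" + message);  // like process.send() in the child
                }
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        }, "forked-script");
        worker.setDaemon(true);
        worker.start();
    }

    // Parent side: like child.send(message).
    void send(String message) {
        toChild.add(message);
    }

    // Parent side: waits up to five seconds for a reply; returns null on timeout.
    String receive() {
        try {
            return toParent.poll(5, TimeUnit.SECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return null;
        }
    }

    public static void main(String[] args) {
        InProcessChild child = new InProcessChild();
        child.send("ping");
        System.out.println(child.receive()); // prints "echo:ping"
        child.send("exit");
    }
}
```

Because only strings travel over the queues in this sketch, there is no equivalent of passing a socket "handle", which mirrors the limitation described above.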

Some Node.js scripts rely on the ability to spawn a process called "./node" in order to fork itself. Trireme looks for this and tries to use it to spawn a new thread but it does not work in all cases. It does seem to be mostly the Node.js test suite itself that does this.

Cluster

The "cluster" module is not yet supported. When it is, it will support running multiple scripts within a single JVM, just as the child process module does, as described above.

Filesystem

The filesystem is fairly complete, but remember that Java is an abstraction on top of the OS so it may not behave exactly the same as it does on Linux.

On Java 6, the filesystem implementation falls back to using only the APIs supported in this version of Java, which means that many things like symbolic links are not supported, and support for "chmod" and the like is not exactly the same as in standard Node.js. On Java 7, Trireme is able to use additional features and the filesystem support is much more complete.

Programs that make extensive use of the filesystem, such as NPM, work on Java 7 but we cannot guarantee that they will work on Java 6.

OS

Again, Trireme runs on top of the JVM, which presents an operating-system abstraction. Some things that Node.js programs can normally do, such as setting up signal handlers, are not supported.

How Fast is It?

Rhino on the JVM is much slower than V8. (In some benchmarks it is 50 times slower.) However, Node.js programs take advantage of a lot of native code, especially when HTTP and TLS are used, so Trireme generally fares much better.

In general, we have seen simple HTTP benchmarks run at about one-half the speed of the same programs on standard Node.js. Some things are slower than that, and others are faster -- it all depends, as it does with all benchmarks.

Furthermore, Java is notoriously slow to start up, and this especially hurts Trireme when it's used to run command-line tools. So please try it as a long-running server (which is Java's strong suit) before dismissing the whole thing because "trireme /tmp/hello-world.js" runs 40 times slower than node. Thanks!

Finally, we would love to be able to use a faster JavaScript implementation, which would speed up all of Trireme. However, for many programs, Trireme on Rhino will be just fine, and the ability to embed Trireme inside another container is especially helpful.

Package Map

Trireme today consists of several modules. A typical application will wish to include the following in CLASSPATH:

trireme-kernel
trireme-core
trireme-node10src
trireme-crypto
trireme-util

The last two packages are optional, for environments that are constrained by space or have a strong aversion to third-party dependencies.

The bare minimum set of required modules is:

trireme-kernel
trireme-core
Either trireme-node10src or trireme-node12src

Note that if Maven is used, trireme-node10src, trireme-node12src, trireme-crypto, and trireme-util will not be automatically pulled in by trireme-core -- it is the responsibility of the calling application to include each one explicitly. This way, Trireme may be used in environments where space is an issue.

This table will help keep them straight:

module	Required?	Recommended?	Description
trireme-kernel	X	X	Generic runtime support needed by the core
trireme-core	X	X	The core module containing the guts of Trireme
trireme-node10src	See Notes	X	JavaScript code that makes Trireme implement Node.js 0.10
trireme-node12src	See Notes	X	JavaScript code that makes Trireme implement Node.js 0.12
trireme-crypto		X	Support code for reading PEM files and some other crypto operations. Uses Bouncy Castle. If not in the classpath, certain crypto operations (notably PEM file support for TLS and HTTPS) will not work. Nonetheless, this is a separate package in case some implementations are wary of distributing Bouncy Castle.
trireme-util		X	Native Trireme / Java implementations of a few Node.js modules, notably "iconv". These are faster than the usual packages from NPM. If in the classpath, these modules will be used instead of searching the module path for a regular module.
trireme-servlet			A generic servlet that may be packaged with Node code so that it may run in a WAR.
trireme-net			An HttpAdaptor implementation that uses Netty. Mainly useful as an example to show how to write an HTTP adaptor for embedding into another container.
trireme-shell			A command-line shell for Trireme that mimics "node"
trireme-jar			A package that builds an all-in-one jar that contains all of the above.
rhino-compiler			A Maven plugin that compiles JavaScript into .class files for use in Rhino. Used in the build process for "node10src" and others.

Additional modules in this directory are used only for testing.

What Are the Dependencies?

Since Trireme is supposed to be highly embeddable, we try to minimize the dependencies.

Rhino

This is the most mature framework for running JavaScript under Java and it works on all versions of Java. Nashorn (new in Java 8) is faster in nearly all cases, but making Trireme run on Nashorn is more of a re-write of Trireme than a simple "port."

Slf4j

This is the de facto standard logging API for Java.

Java SE 6

Trireme runs on Java 6 and up, although at least Java 7 is recommended. Java 7 supports a much richer filesystem abstraction, which the "fs" module depends upon. Certain more complex Node applications, such as "NPM," can only run on Trireme when Java 7 or higher is used.

Trireme works fine on Java 8. It uses the standalone version of Rhino, so it is not affected by the fact that the default JavaScript engine was changed between Java 7 and Java 8.

Design

Node.js Implementation

Trireme has a similar architecture to Node.js itself. Many of the core modules in standard Node.js rely on a JavaScript shell, with native modules underneath that are written in C++.

Trireme is similar, and in many cases it exposes Java modules that mimic the interfaces of the C++ native modules in Node.js. So for instance, Trireme implements a native "tcp_wrap" module in Java that uses NIO to emulate the same API as the "tcp_wrap" module in Node.js. The same goes for udp, HTTP parsing, and many other things.

Threading Model

Each Trireme script runs in a single thread. In other words, when the script is executed, it spawns a new thread and occupies it until the script exits. Ticks and timers are implemented within that single thread. If the script exits (has no ticks or timers, is not "pinned" by a library like http, and falls off the bottom of the code) then the thread exits.

This way, there is no need for synchronization for most of the things that the scripts do, just like in regular Node.js.

However, some modules, such as the filesystem, may block, so those modules dispatch to a thread pool, just like in many other Java programs.

Similarly, the "HTTP adapter" allows Trireme to be embedded inside an existing server container, and in that case HTTP requests may come from many different threads. For that reason, the main event loop for each Trireme script depends on underlying collections that are thread-safe, so that different threads may place events on the event loop.
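
The threading model described above can be sketched as a single loop thread draining a thread-safe queue that any other thread may add to. This is a simplified illustration rather than Trireme's actual event loop; the class "MiniEventLoop" and its methods are hypothetical.

```java
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;

// Hypothetical sketch, not Trireme's actual event loop.
class MiniEventLoop {
    // Thread-safe: any thread may add events here.
    private final BlockingQueue<Runnable> events = new LinkedBlockingQueue<>();
    // Touched only by the loop thread, so no locking is needed.
    private boolean running = true;

    // Called from any thread, e.g. a worker thread of an HTTP container.
    void post(Runnable event) {
        events.add(event);
    }

    // Asks the loop to exit once the events queued before this call have run.
    void stop() {
        post(() -> running = false);
    }

    // Runs events one at a time on the calling thread until stop() takes effect.
    void run() {
        try {
            while (running) {
                events.take().run();
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
    }

    // Starts `producers` threads that each post one event, runs the loop on the
    // calling thread, and returns how many of those events were handled.
    static int demo(int producers) {
        if (producers <= 0) {
            return 0;
        }
        MiniEventLoop loop = new MiniEventLoop();
        int[] handled = {0}; // modified only by events running on the loop thread
        for (int i = 0; i < producers; i++) {
            new Thread(() -> loop.post(() -> {
                if (++handled[0] == producers) {
                    loop.stop();
                }
            })).start();
        }
        loop.run();
        return handled[0];
    }

    public static void main(String[] args) {
        System.out.println(demo(4)); // prints 4
    }
}
```

Only the queue is shared between threads. The counter is updated without synchronization because every event runs on the loop thread, which is the property that lets script code avoid locking.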

In the future, we may choose to support multi-tenant script threads, so that many isolated scripts may run in the same thread. That would decrease memory usage and context switching for servers that run many scripts.

HTTP Adapter

The HTTP adapter is an interface that a server may implement and plug in to Trireme. When it is plugged in, the adapter is responsible for calling Trireme when new HTTP requests arrive, and for presenting the actual HTTP requests and responses.

When this is used, Trireme scripts work just as they do in standard Node.js, but the "server" part of http is delegated to the adapter. (The client side of http continues to work the same way, however.)

The Sandbox

The sandbox is an interface that a server may implement that allows control over what scripts are allowed to do. It allows a script to accept or reject requests to access the filesystem, access the network, and execute programs. Using the sandbox, it is possible to run Node.js scripts in a totally isolated environment in a multi-tenant server.

trireme's Issues

Add callback for ClassShutter whitelist

By default, Trireme uses the Rhino ClassShutter to explicitly reject access to any Java classes other than ones that Trireme needs to do its work. However, some applications that embed trireme actually want to use other Java classes, and we need to create a way to extend Trireme to allow that.

Options are adding a callback to Sandbox that Trireme can call from its ClassShutter, or allowing the user to pass a Set of class names, or regexps for that matter.
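
The "Set of class names, or regexps" option could look roughly like the sketch below. It is only an illustration of the proposal: the class "WhitelistShutter" and its builder-style methods are hypothetical. The method name "visibleToScripts" matches the one declared by Rhino's ClassShutter interface, but this sketch does not implement that interface so that it stays free of dependencies.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;
import java.util.regex.Pattern;

// Hypothetical sketch of a user-extensible class whitelist.
class WhitelistShutter {
    private final Set<String> exactNames = new HashSet<>();
    private final List<Pattern> patterns = new ArrayList<>();

    // Allows a single fully-qualified class name.
    WhitelistShutter allowClass(String className) {
        exactNames.add(className);
        return this;
    }

    // Allows every class whose full name matches the regular expression.
    WhitelistShutter allowPattern(String regex) {
        patterns.add(Pattern.compile(regex));
        return this;
    }

    // Same shape as Rhino's ClassShutter.visibleToScripts(String).
    boolean visibleToScripts(String fullClassName) {
        if (exactNames.contains(fullClassName)) {
            return true;
        }
        for (Pattern p : patterns) {
            if (p.matcher(fullClassName).matches()) {
                return true;
            }
        }
        return false; // everything else stays hidden from scripts
    }

    public static void main(String[] args) {
        WhitelistShutter shutter = new WhitelistShutter()
            .allowClass("java.util.HashMap")
            .allowPattern("com\\.example\\..*"); // hypothetical application package
        System.out.println(shutter.visibleToScripts("java.lang.Runtime")); // prints false
    }
}
```

In a real integration, Trireme's own ClassShutter would consult something like this after checking the classes it needs for itself.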

Scripts that compile to more than 64K fail

Any script that compiles to more than 64K of bytecode fails because the JVM cannot execute classes that large. This is a limitation in Java and it can't be changed.

However, in Rhino we can put the context into interpreted mode, run the script, and then switch back to compiled mode. We will try this and see if it works.

trireme vs avatar-js vs nodyn

Hi,

I've just discovered trireme project. I find very interesting that it can run on Rhino with pure Java code.
As I know 2 other similar projects (avatar-js and nodyn), I would like to know what would be the advantage of using trireme as at least nashorn/avatar-js are from Oracle, and I guess, somehow official for Java-8.
I'm looking for a real nodejs "commonjs" way of loading modules installed with npm. Is this feasible with trireme?

Thanks in advance,
jgran

Add a BSON module to trireme-utils.

The "bson" module is used pretty often, most notably in the MongoDB client. It can run in JavaScript mode and in native mode. Native mode, of course, does not work in Trireme.

We should add a BSON implementation to trireme-utils so that there is a Java version out there for this module.

Complete "crypto" module

Tasks remaining:

Cipher (including insecure but Node-compatible password-based encryption)

Diffie-Hellman (need to extend the noderunner-crypto module since this will depend on Bouncy Castle -- the native stuff in Java does not have an API that will work with the Node API)

Finish DSA signature support

Support "iconv" and "iconv-lite"

Customers are starting to ask about character set conversion. This is typically done using one of two NPM modules:

"iconv" uses the native "iconv" library on most systems. It therefore does not work in Noderunner since we do not support native code.

"iconv-lite" is built entirely in JavaScript, but works using gigantic, generated JavaScript source files that are too large to run on the JVM because they result in .class files larger than 64K.

Both of these modules have simple interfaces. We should build Java implementations of them (using the built-in charset support) and install them as built-in modules so that they override any other "iconv" or "iconv-lite" module that is supported.

This is blocking at least one customer who is trying to work with an email client.
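
The core of a Java-backed conversion is small, since the JDK's charset support does the work. A minimal sketch, assuming a hypothetical "CharsetConverter" helper (the real module would also need to expose this through the "iconv" and "iconv-lite" JavaScript interfaces):

```java
import java.nio.charset.Charset;

// Hypothetical sketch of charset conversion using only the JDK.
class CharsetConverter {
    // Decodes `input` from one charset and re-encodes it in another.
    // Malformed or unmappable input is silently replaced by the JDK's default
    // replacement character, which may differ from iconv's behavior.
    static byte[] convert(byte[] input, String fromCharset, String toCharset) {
        String text = new String(input, Charset.forName(fromCharset));
        return text.getBytes(Charset.forName(toCharset));
    }

    public static void main(String[] args) {
        // 0xE9 is "e with acute accent" in ISO-8859-1; in UTF-8 it takes two bytes.
        byte[] utf8 = convert(new byte[] { (byte) 0xE9 }, "ISO-8859-1", "UTF-8");
        System.out.println(utf8.length); // prints 2
    }
}
```

Charset.forName throws an unchecked exception for unknown or unsupported names, so a real module would need to map that to whatever error the JavaScript API is expected to raise.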

Move code from Node.js to a separate repo and module

Today, whenever a new Node release comes out, we copy the relevant .js files from the Node project into the Trireme repo. This means that Trireme can run only one release of "Node" at a time.

Instead, we should copy these files into a separate repo, and pre-compile them there. We can include other Trireme-specific JS files there as well. This will eventually make it easier to support multiple releases of "node.js" on the same JVM as long as the "native" interface remains backward-compatible.

Cannot replace process.argv

Node.js allows the user to replace the process.argv vector, but Trireme does not.
Create simple test.js file:

console.log(process.argv);
process.argv = [];
console.log(process.argv);

After that:

> node test.js
[ 'C:\\Program Files\\nodejs\\node.exe', 'T:\\Temp\\test.js' ]
[]
> java -jar trireme.0.7.5.jar test.js
[ './node', 'T:\\Temp\\test.js' ]
[ './node', 'T:\\Temp\\test.js' ]

This blocks us from replacing the environment, for example, for the typescript compiler (which currently takes arguments only from process.argv).

child_process.spawn hangs on no permission

If the sandbox is set up to prevent launching child processes, then child_process.spawn hangs rather than emitting an "error" event. This is holding up NPM on Apigee.

Re-do datagram using NIO

The datagram support in Trireme is pretty basic, and spawns a lot of threads. NIO supports datagrams and should be a lot cleaner and more efficient.

The main reason to make this better is so that we can use it as the basis of a more complete DNS implementation.

Make more parts of "socket" available when HTTP adapter is used

When the HTTP adapter is used, each HTTP request has a "socket" object on it with a few basic properties and methods. However, it should have more, such as a "remoteAddress" field.

On the same lines, there's supposed to be a "connection" event that is fired on a new HTTP connection -- it's not clear how often it's used but the adapter should at least emulate it.

Add a JDBC module

There are some databases out there without native Node.js drivers, and it'd be better to have a way to use them that takes advantage of the power of Java.

Create a "trireme-jdbc" NPM module that is backed by a native Java module that exposes JDBC. At the very least, it should:

Allow connections to any JDBC driver in the class path using the usual convention of a JDBC URL.
Support common properties on the connection
Support prepared statements, wildcards, and other basic ways of executing queries
Return query results to an EventListener, as in a stream

It's debatable whether this should support connection pooling -- if it does not, then we will need another Node.js module to do that part.

Un-seal the root context

In order to save memory and start up time, each NodeEnvironment creates a single Rhino context that contains all the standard objects (String, Date, etc) and "seals" it so that these cannot be modified. It then shares them between all scripts.

However, lots of Node scripts like to modify these things. We should stop doing this, and incur the cost of calling "initStandardObjects" for each script, and not trying to share a global root.

Implement faster "ascii" character set

We use the built-in character set conversions in Java for all string encoding and decoding. This works great for lots of character sets, but for ASCII it is quite pedantic in the way that it checks for valid and invalid characters. Native Node, on the other hand, appears to just lop off every byte and cast it directly to an ASCII character, with no validation.

Looking at benchmark results, Trireme suffers more than usual against Node when ASCII characters are used. We should see if implementing a "dumber" version of the ASCII character set from Java would speed these up and make them more compatible with native Node.
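
A "dumber" decoder of the kind proposed here might look like the sketch below. It is an illustration, not Trireme's code: the class "FastAscii" is hypothetical, and masking each byte to its low seven bits is an assumption based on Node's documented behavior of stripping the high bit for the "ascii" encoding.

```java
// Hypothetical sketch of a validation-free ASCII decoder.
class FastAscii {
    // Converts each byte directly to a char, keeping only the low 7 bits.
    // No CharsetDecoder is involved and no input is ever rejected.
    static String decode(byte[] bytes) {
        char[] chars = new char[bytes.length];
        for (int i = 0; i < bytes.length; i++) {
            chars[i] = (char) (bytes[i] & 0x7f);
        }
        return new String(chars);
    }

    public static void main(String[] args) {
        System.out.println(decode(new byte[] { 72, 105 })); // prints "Hi"
    }
}
```

Whether this is actually faster than the JDK's built-in decoder would need to be measured with the same benchmarks mentioned above.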

Hide network interfaces via Sandbox option

The "os" module may be used to enumerate the IP addresses of the host. When embedding Trireme inside a security-sensitive environment, it may not be safe to expose internal IP addresses to Node.js scripts. There should be a way to turn that off.

Add a JMS module

JMS is used often in some environments and by definition it is Java only. Trireme can help with this.

Create a "trireme-jms" NPM module that exposes JMS to Node.js apps. It should support, at the minimum:

Connect using the usual JNDI parameters so we can use any JMS provider on the class path
Automatically pool sessions and producers so that the client API is as simple as possible
Receive messages as a stream using one or more message listeners
Map the various JMS object types to JavaScript as best we can.

It's debatable how many additional features, such as transacted sessions and the ability to get access to the actual "session" object should be supported.

We should not bother supporting XA.

Support WebSockets in the HTTP Adapter

In order to integrate Trireme with systems that support WebSockets, we're going to have to come up with a way for the HTTP adapter to allow an HTTP connection to be upgraded to WebSockets, and then pass the traffic back and forth.

Use "node.js" for the main loop

Today the Trireme main loop is coded in Java. This is nice, but it has led to some complex coding as we try to implement the fairly complex error handling and "tick" execution logic of regular Node.js.

We should see if we can use "node.js" itself as the main event loop, which would let us take out some complex code. Obviously performance is one thing that we need to test.

Assertion error when talking to HTTPS target

When calling "end" on an HTTPS client request, connections may fail with an assertion error on http.js:1710, related to the "socket.writable" state in the "responseOnEnd" method.

The fix is to properly manage the "writable" property in tls.js.

NodeScript.executeModule() loses script name in process.argv

Whenever I access process.argv in a standard Node.js environment, it looks like this:

0: The path used to execute node
1: The path to the script being executed
...: The command line arguments

So if I invoked my application using node myscript.js test, I'd get this: ['node', '/full/path/to/myscript.js', 'test']. When I do the same thing via Trireme, the path to the script being executed is missing. I'd get something like this: ['./node', 'test']. For anyone used to using process.argv.slice(2) to get their application arguments, this fails. Here is an example in Java:

// ...
NodeScript nodeScript = nodeEnv.createScript("myscript.js", new File("myscript.js"), new String[] { "test" });
ScriptFuture runningScript = nodeScript.executeModule();
Scriptable module = runningScript.getModuleResult();
// ...
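
To make the failure concrete, here is a minimal sketch in plain JavaScript. The two arrays are literal stand-ins for process.argv, copied from the description above; the variable names are only for illustration.

```javascript
// Literal stand-ins for process.argv, copied from the description above.
var nodeArgv = ['node', '/full/path/to/myscript.js', 'test'];
var triremeArgv = ['./node', 'test'];

// The usual idiom skips the executable and the script path.
var nodeArgs = nodeArgv.slice(2);       // ['test']
var triremeArgs = triremeArgv.slice(2); // []: the 'test' argument is lost

console.log(nodeArgs, triremeArgs);
```

With the script path missing, the first real argument sits at index 1, so slice(2) skips it.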

Looking into this, I see what is going on. Basically NodeScript.executeModule() creates a wrapper module string and uses the NodeScript constructor that isn't passed a File object. So when it comes time to set the script name in process.argv, the assumption is that we are evaluating a string and the script name is lost. Looking at the code, I came up with a simple change that conditionally sets the process.argv script to null:

diff --git a/core/src/main/java/io/apigee/trireme/core/internal/ScriptRunner.java b/core/src/main/java/io/apigee/trireme/core/internal/ScriptRunner.java
index 503a7b1..6d94140 100644
--- a/core/src/main/java/io/apigee/trireme/core/internal/ScriptRunner.java
+++ b/core/src/main/java/io/apigee/trireme/core/internal/ScriptRunner.java
@@ -537,12 +537,12 @@ public class ScriptRunner
             if ((scriptFile == null) && (script == null)) {
                 // Just have trireme.js process "process.argv"
                 process.setForceRepl(forceRepl);
-                setArgv(null);
+                setArgv(forceRepl ? null : scriptFileName);
             } else if (scriptFile == null) {
                 // If the script was passed as a string, pretend that "-e" was used to "eval" it.
                 process.setEval(script);
                 process.setPrintEval(scriptObject.isPrintEval());
-                setArgv(null);
+                setArgv(scriptObject.isPrintEval() ? null : scriptFileName);
             } else {
                 // Otherwise, assume that the script was the second argument to "argv".
                 setArgv(scriptFileName);

Per Greg's request, I've created this issue to allow for a better understanding of the problem and to see example code.

SSLException when using https.request

I'm getting an SSLException with "Unsupported record version" when using https.request.

The URL I'm using is https://api.usergrid.com/kevinswiber/sandbox/store-locations.

Example script to reproduce:

var https = require('https');

var url = 'https://api.usergrid.com/kevinswiber/sandbox/store-locations';

var req = https.request(url, function(res) {
  console.log('yippee');
  res.on('data', function(d) {
    process.stdout.write(d);
  });
});

req.end();

req.on('error', function(e) {
  console.error(e);
});

Attach context to HTTP requests in HTTP adapter

We may need to pass data associated with a particular HTTP request between a container and an HTTP server using the HTTP adapter. We should:

Add a parameter to the adapter's HTTP request
Set it on the HTTP 'request' object
Allow it to be set to any Java object.

Cache compiled classes between scripts

Rhino generates bytecode every time a script is compiled, and that bytecode goes into the PermGen until the script is no longer used and it gets GCed. In a server with hundreds of node.js scripts running, if each is using a large set of modules such as Express, then we end up with hundreds of megabytes of bytecode sitting in PermGen.

We can drastically reduce the memory usage in large deployments by caching bytecode. For instance:

SHA1 hash each script before compiling it
Check a thread-safe cache to see if the same script has already been compiled
Only compile if the bytecode is not cached
Make the cache use weak / soft / whatever references so that it will be GCed on memory pressure and not grow forever.
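
The steps above can be sketched as follows. This is only an illustration in JavaScript using Node's built-in "crypto" and "vm" modules; the real cache would live in Trireme's Java code around the Rhino compiler. The names compileCached and cache are hypothetical, and WeakRef (available in recent Node versions) stands in for Java's weak or soft references.

```javascript
var crypto = require('crypto');
var vm = require('vm');

// Hypothetical cache: SHA-1 of the source text -> weak reference to the compiled script.
var cache = new Map();

function compileCached(source, filename) {
  // Step 1: hash the script before compiling it.
  var key = crypto.createHash('sha1').update(source, 'utf8').digest('hex');

  // Step 2: look for a compiled copy that has not been collected yet.
  var ref = cache.get(key);
  var compiled = ref && ref.deref();

  // Step 3: compile only on a miss.
  if (!compiled) {
    compiled = new vm.Script(source, { filename: filename });
    // Step 4: hold the result weakly so the collector can reclaim it.
    cache.set(key, new WeakRef(compiled));
  }
  return compiled;
}

// Two requests for identical source text share one compiled object.
var first = compileCached('1 + 2', 'a.js');
var second = compileCached('1 + 2', 'b.js');
console.log(first === second);
console.log(second.runInThisContext());
```

JavaScript is single-threaded, so the Map needs no locking here; a Java version would need a concurrent map, and either version should eventually prune entries whose referent has been collected.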

npm does not run

It'd be nice to be able to get NPM to run on Noderunner so that we can automatically run it in our cloud.

Right now it doesn't run because "umask" is not supported.

Re-write Java/JavaScript interface using javax.script API

Recording this here so that we know why we didn't do it.

The "javax.script" package, aka JSR 223, is the standard interface for scripting in Java. It lets a Java program run JavaScript code, and it lets the JavaScript code directly create and access Java classes.

Rewriting Trireme to use this API instead of the Rhino APIs would make it easier to port to other script engines in the future. However there are a lot of issues to overcome. We're not going to do this now and this issue is going to record why.

javax.script allows us to:

Run a script inside the JVM
The script can use a special API to create Java objects and call their methods and access their fields
Lots of scripts can be running inside the JVM in different contexts.

Using Rhino, today we are also doing things that javax.script can't do:

Creating JavaScript objects and populating their fields from inside Java code. We do this all over the place. In theory we could do it by calling functions in JS for each object that we want to create (see below though) but it would be complicated and might not even work the way that we need.
Passing a function down from JavaScript to Java so that we can invoke it again. This happens all over Trireme as well, as the Java code is constantly asked to do something asynchronously and call a function when it is complete. Again in theory we can create lots of top-level functions in JavaScript to do this but it would again be ugly.
Implementing functions in Java that can be called from JavaScript via a variable argument list. This is less important however.

Also, a few more items of interest:

Rhino lets us set up a "class shutter" to control which Java classes JavaScript code can access. This is a critical security feature. Doing that with javax.script would require some sort of custom classloader or security manager hack that few people would be able to understand and debug.
Rhino can call us back every N instructions, and allow us to check the current running time of a Node "tick" and throw an exception. That lets us abort long-running tasks in a clean way.
Rhino gives us very clean control of the prototype hierarchy, which makes it easier to cleanly implement modules like "vm". Not sure how to do this in javax.script yet.

mocha does not run

Looks like the issue is that we don't have all the bits in place to make "tty" work like regular node.

Use Nashorn instead of Rhino (if available)

It would be great if Trireme could use Nashorn instead of Rhino if Nashorn is available. I think ideally, there should be a setting to specify the name of the engine to use, then if the engine isn't specified, Nashorn should be tried followed by Rhino.

Since Nashorn is faster and provides more JavaScript features than Rhino, it seems to be the clear choice (when there is one), and with Nashorn backports popping up, it may be a more realistic option soon even for users of Java < 8.

Support Windows

There are a number of issues that prevented Windows support from working:

Node tries to turn "C:/" paths into UNC paths that start with "//?/" which is legal in some universe but not in Java.
"posix" file permissions and other file attributes don't work in Windows, so we have to handle them differently
A few other minor things.

Some basic OOTB Node.js vs Noderunner perf with Argo.

I ran a basic ab test using Node v0.10.4 and then on Noderunner.

I'm running this on Mac OS X v10.8.3, 2.4GHz Intel Core i5, 8GB DDR3 RAM, only using a single core for Node (Not sure how the Netty implementation handles this. Could be multiple?)

Here's the command:

~/projects/httpd-2.4.3/support/ab -n10000 -c100 -k \
http://127.0.0.1:3000/web/greeting

I'm using a compiled version of ab because the version that comes with Mac OS X is somewhat broken.

Here's the script I'm running: https://github.com/argo/argo/blob/master/example/map.js

Calling /web/greeting takes a particularly circuitous route, so it should be much slower than some of the other examples in that repo.

Here are the results!

Running Node.js v0.10.4:

Server Software:        
Server Hostname:        127.0.0.1
Server Port:            3000

Document Path:          /web/greeting
Document Length:        13 bytes

Concurrency Level:      100
Time taken for tests:   1.819 seconds
Complete requests:      10000
Failed requests:        0
Write errors:           0
Keep-Alive requests:    10000
Total transferred:      1590000 bytes
HTML transferred:       130000 bytes
Requests per second:    5497.50 [#/sec] (mean)
Time per request:       18.190 [ms] (mean)
Time per request:       0.182 [ms] (mean, across all concurrent requests)
Transfer rate:          853.62 [Kbytes/sec] received

Connection Times (ms)
              min  mean[+/-sd] median   max
Connect:        0    0   0.4      0       5
Processing:     2   18   1.8     18      26
Waiting:        2   18   1.8     18      26
Total:          2   18   1.7     18      26

Percentage of the requests served within a certain time (ms)
  50%     18
  66%     19
  75%     19
  80%     19
  90%     20
  95%     21
  98%     22
  99%     23
 100%     26 (longest request)

Running Noderunner at commit d3a90b8 (Note: This is after running the benchmark several times. It seems like Java needs to heat up before the numbers start looking decent. That's cool. So does V8. I'm assuming Java is doing some optimizations at runtime?):

Server Software:        
Server Hostname:        127.0.0.1
Server Port:            3000

Document Path:          /web/greeting
Document Length:        13 bytes

Concurrency Level:      100
Time taken for tests:   3.771 seconds
Complete requests:      10000
Failed requests:        0
Write errors:           0
Keep-Alive requests:    10000
Total transferred:      1590000 bytes
HTML transferred:       130000 bytes
Requests per second:    2651.82 [#/sec] (mean)
Time per request:       37.710 [ms] (mean)
Time per request:       0.377 [ms] (mean, across all concurrent requests)
Transfer rate:          411.76 [Kbytes/sec] received

Connection Times (ms)
              min  mean[+/-sd] median   max
Connect:        0    0   1.3      0      15
Processing:    12   37   8.6     34     110
Waiting:       12   37   8.6     34     110
Total:         12   38   9.0     34     110

Percentage of the requests served within a certain time (ms)
  50%     34
  66%     35
  75%     39
  80%     46
  90%     51
  95%     56
  98%     61
  99%     67
 100%    110 (longest request)

No insightful comments. This is just informational. :)

No errors with Argo. No problems. This project is moving along wonderfully from my perspective. 👍

Add a big integer implementation to "trireme-utils"

Large integer arithmetic is going to be important to a variety of Node programs. Doing it in JavaScript is going to be a lot slower than having a native Java implementation.

We should add a big integer implementation to the "trireme-utils" module so that we can have a built-in Java implementation, like we have for "iconv."

There are many possibilities on NPM. "bignumber.js" seems to be the one that has the most dependents.

Build a servlet adapter and sample

Build an adapter that lets you use the "HTTP adapter" to construct a WAR file that runs a Node.js app. Create a sample app as well.

jade template engine does not work

In "lib/compiler.js," Jade relies on String.trimLeft().

The fix for this is to get an updated version of Rhino.

Fix "escape()" so that it is V8 compatible

The (old, ugly, and deprecated) "escape()" global JavaScript function works slightly differently in V8 and Rhino.

In V8 it takes one parameter and ignores the rest. Rhino looks at an optional second "mask" parameter that comes from ancient JavaScript history.

The problem comes when you try to pass "escape" as a parameter to "String.replace," which is done in a lot of places, notably various SDKs for Amazon Web Services, like this:

var output = encodeURIComponent(string);
output = output.replace(/[^A-Za-z0-9_.~\-%]+/g, escape);

The workaround is to replace "escape" with a version that ignores all but the first parameter, like this:

var oldEscape = escape;
escape = function(s) {
    return oldEscape(s);
};

We will fix this in Trireme by putting that code in "trireme.js" so that "escape" in Trireme works just like it does in V8.
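
The reason the second parameter matters: String.replace calls its callback with the matched text, then the match offset, then the whole string. An "escape" that honors a second parameter therefore receives the offset as its "mask". A small sketch (the recorder function is hypothetical and only logs what it receives):

```javascript
var seen = [];

// Hypothetical callback that records every argument String.replace passes in.
function recorder() {
  seen.push(Array.prototype.slice.call(arguments));
  return '_';
}

var result = 'a!b*c'.replace(/[!*]/g, recorder);

console.log(result); // 'a_b_c'
console.log(seen);   // [ [ '!', 1, 'a!b*c' ], [ '*', 3, 'a!b*c' ] ]
```

The wrapper shown above works because it forwards only the first of these arguments to the original "escape".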

How do I execute noderunner?

If I want to test some JavaScript against this bad boy, how do I do it?

Provide a native implementation of "node-expat"

"node-expat" is a popular module that uses expat to do SAX. It's one of the most popular XML modules. Since it uses native C code it won't work in Trireme. We should build support into the "utils" module that implements the interface using the platform's SAX parser.

Suitability in RingoJS

I've been using RingoJS for a few years now, and I want to add some node libraries to my projects, such as the AWS SDK. Would trireme be suitable to leverage commonjs modules with node dependencies in an existing javascript framework such as RingoJS, or does it require the node application structure and implementation?

Java 6 filesystem code should fail more gracefully

A number of filesystem features can't be supported in Java 6, such as symbolic links and "chmod". These features should fail gracefully with proper error messages and not just error out with Java exceptions like they do now. That way programs like NPM that recover gracefully from certain filesystem failures will behave better on Java 6.

Error when running mongodb client

Wrapped java.lang.ClassCastException: org.mozilla.javascript.NativeObject cannot be cast to io.apigee.trireme.core.modules.Process$ProcessImpl
at /Users/Apigee/src/test/mongo/node_modules/mongodb/lib/mongodb/connection/server.js:201 (anonymous)
at /Users/Apigee/src/test/mongo/node_modules/mongodb/lib/mongodb/db.js:347 (anonymous)
at /Users/Apigee/src/test/mongo/node_modules/mongodb/lib/mongodb/db.js:285 (anonymous)
at /home/ec2-user/workspace/Trireme/core/src/main/javascript/io/apigee/trireme/scripts/trireme_loop_support.js:42 (submitTick)

Support charset detection

One customer has asked about charset detection. This is an inexact science, but the "jschardet" module does it. However, it works using gigantic generated source files that do not work on the JVM.

We can work around this by creating a Noderunner-specific module based on an open-source library such as:

https://code.google.com/p/juniversalchardet/

Then make this another built-in module.

Load JAR files from NPM

Figure out a way to load JAR files from NPM.

The idea is that if someone uses NPM to install a module that requires native code, we can find a JAR in there and automatically load the code from there, similar to the way that NPM loads native code today.

Give Trireme its own version of Rhino

We have a fork of Rhino with a lot of important compatibility updates sitting in GitHub waiting to go. However, we're not making progress on getting it pushed into the official Mozilla Maven group, which means that we have to call the module "io.apigee.rhino".

Once we do that, there is a potential classpath conflict with any project that wishes to use a non-Apigee version of Rhino, in case they have something else in their project that depends on both "io.apigee.rhino" and "org.mozilla.rhino".

Furthermore, we have run into some compatibility issues recently that would be improved if we could optimize for V8 compatibility rather than Rhino compatibility -- the "escape" function's undocumented second parameter is one example.

Finally, we are going to need to extend Rhino to support some of the new things that Node 0.11 and 0.12 will need, namely the ability to bind external array data to an object, which has to be built in to the runtime.

So this issue proposes that we use our Rhino fork and rename the packages to "io.apigee.rhino" so that we can have a version that is bound to Trireme.

Use

Finish dns implementation

The "dnsjava" library looks like it has everything that we need and does not depend on any native code. Various versions are available in Maven Central.

http://www.dnsjava.org/

Replace native IO code with Netty

Pros of doing this:

-- A bunch of complex code in "TCPWrap" is replaced with much simpler code using Netty
-- A bunch of even more complex code in "SSLWrap" and "tls.js" is also replaced
-- We benefit from many people fixing bugs and improving performance, especially in the TLS area
-- We can potentially use multiple CPU cores to offload the SSL processing

Cons:

-- We depend on Netty 4.0
-- Netty 4.0 has a very fast release train
-- This will make integration with other third-party components more complex and potentially make Trireme less embeddable.

Latest npm fails with 'Cannot call method "call" of undefined'

NPM 1.4.9 (and possibly other versions) fails due to a problem on line 310 of the "glob" module, with:

Cannot call method "call" of undefined  
(/Users/Apigee/src/noderunner/apptests/src/test/resources/npm/node_modules/npm/node_modules/glob/glob.js#310)

I will paste the source to this function below.

This is easily fixable by moving the "next" function in the code below above the code that calls it. The problem is that for some reason on Trireme/Rhino the function "next" is undefined in this scope but in Node/V8 it is not. The code below is legal JavaScript AFAIK and Rhino can handle simpler cases like this, but not this one.

For those who need to run NPM on Trireme, we can fork NPM, but I hesitate to do that. We can ask the NPM people to change the code but there is no good reason for them to change it because the code is (excessively cute and complicated but) perfectly correct.

Glob.prototype._processEmitQueue = function (m) {
  this.log("pEQ paused=%j processing=%j m=%j", this.paused,
           this._processingEmitQueue, m)
  var done = false
  while (!this._processingEmitQueue &&
         !this.paused) {
    this._processingEmitQueue = true
    var m = this._emitQueue.shift()
    this.log(">processEmitQueue", m === this.EOF ? ":EOF:" : m)
    if (!m) {
      this.log(">processEmitQueue, falsey m")
      this._processingEmitQueue = false
      break
    }

    if (m === this.EOF || !(this.mark && !this.stat)) {
      this.log("peq: unmarked, or eof")
      next.call(this, 0, false)
    } else if (this.statCache[m]) {
      var sc = this.statCache[m]
      var exists
      if (sc)
        exists = sc.isDirectory() ? 2 : 1
      this.log("peq: stat cached")
      next.call(this, exists, exists === 2)
    } else {
      this.log("peq: _stat, then next")
      this._stat(m, next)
    }

    function next(exists, isDir) {
      this.log("next", m, exists, isDir)
      var ev = m === this.EOF ? "end" : "match"

      // "end" can only happen once.
      assert(!this._endEmitted)
      if (ev === "end")
        this._endEmitted = true

      if (exists) {
        // Doesn't mean it necessarily doesn't exist, it's possible
        // we just didn't check because we don't care that much, or
        // this is EOF anyway.
        if (isDir && !m.match(/\/$/)) {
          m = m + "/"
        } else if (!isDir && m.match(/\/$/)) {
          m = m.replace(/\/+$/, "")
        }
      }
      this.log("emit", ev, m)
      this.emit(ev, m)
      this._processingEmitQueue = false
      if (done && m !== this.EOF && !this.paused)
        this._processEmitQueue()
    }
  }
  done = true
}
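
The pattern at issue can be reduced to a minimal sketch: a function declared inside a loop body and called earlier in the same block. The function names callBeforeDeclaration and callAfterDeclaration are made up for this illustration. On Node/V8 both forms work; the second applies the reordering described above, so it does not depend on the declaration being hoisted within the block.

```javascript
// Mirrors the glob code: "next" is declared inside the loop body,
// after the statement that calls it.
function callBeforeDeclaration() {
  var result;
  while (true) {
    result = next(1);
    break;
    function next(x) { return x + 1; }
  }
  return result;
}

// The workaround: the declaration comes before the call that uses it.
function callAfterDeclaration() {
  var result;
  while (true) {
    function next(x) { return x + 1; }
    result = next(1);
    break;
  }
  return result;
}

console.log(callBeforeDeclaration(), callAfterDeclaration());
```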

Implement "cluster" module

This will require completing the "child_process" module. Today this module can spawn another script as a thread inside the VM, but for cluster to work it needs to be able to communicate between scripts.

Once that is done, cluster should in theory be fairly simple. Passing "file descriptors" back and forth would require that we pass objects between the various scripts and deal with NIO selectors.

Implement an XSLT module

The leading XSLT module is "node_xslt", which depends on native code.

Add XSLT support that uses what is built in to Java.

NodeJs hosting .html fails with ENOENT

I am trying to serve an HTML file with express. This is my code here, and it works perfectly locally.

var express = require('express'),
    fs = require('fs'),
    path = require('path');

// Start app
var app = express(express.logger());
app.use(express.json());
app.use(express.urlencoded());

app.get('/test', function(req, res) {
  console.log(path.join(__dirname + '/test.html'));
  fs.readFile(path.join(__dirname + '/test.html'), 'utf8', function (err, data) {
    if (err) {
      throw err;
    } else {
      res.send(data);
    }
  });
});

// test.html is just a basic HTML form. It's on the same level as app.js

But when I deploy the same code to Edge, I get an ENOENT error:
{"fault":{"faultstring":"Script executed prematurely: ENOENT","detail":{"errorcode":"messaging.runtime.ScriptExitedError"}}}

The console.log resolves to
/organizations/asu/environments/test/apis//test.html ('apiproxy/node/resources' is missing)
When run locally, the console.log resolves properly.

Symbolic links to relative destinations don't come out right

NPM uses a lot of symbolic links. When Trireme creates them, it tries to turn the destination into an absolute path relative to the current working directory, which is wrong for two reasons. First, it comes up with the wrong answer. Second, the link is no longer relative, so you can't move it around with the rest of the directory.

Note that this fix will only work for Java 7 since Java 6 does not support symbolic links.
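
For reference, this is the behavior expected from regular Node on a POSIX system, sketched with the standard "fs" module (the directory and file names are made up). The relative destination is stored as given, resolved against the link's own directory rather than the current working directory, and survives a move of the parent directory.

```javascript
var fs = require('fs');
var os = require('os');
var path = require('path');

// Work in a throwaway directory so the sketch cleans up after itself.
var dir = fs.mkdtempSync(path.join(os.tmpdir(), 'symlink-demo-'));
fs.writeFileSync(path.join(dir, 'target.txt'), 'hello');

// Create a link whose destination is relative to the link's own directory.
fs.symlinkSync('target.txt', path.join(dir, 'link.txt'));

// The stored destination is still the relative string...
var stored = fs.readlinkSync(path.join(dir, 'link.txt'));

// ...so the link keeps working after the whole directory is moved.
var moved = dir + '-moved';
fs.renameSync(dir, moved);
var contents = fs.readFileSync(path.join(moved, 'link.txt'), 'utf8');

console.log(stored, contents);
fs.rmSync(moved, { recursive: true, force: true });
```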

HPE_INVALID_CONSTANT error with request module

Using the request module to delete an entity on a customer OPDK. Works fine on V8, but with Trireme it dumps with:
error { [Error: Parse Error] message: 'Parse Error', code: 'HPE_INVALID_CONSTANT' }
TypeError: Cannot read property "statusCode" from undefined.

This gist https://gist.github.com/girishbr/4378ce246f9ee579a8b2 contains the traces. Please look.