RedStorm - JRuby on Storm

RedStorm provides a Ruby DSL using JRuby integration for the Storm distributed realtime computation system.

Like RedStorm? visit us on IRC at #redstorm on freenode

Documentation

New versions of RedStorm will likely introduce changes that break compatibility or change the development workflow. To prevent out-of-sync documentation, version-specific documentation is kept in the wiki when necessary.

Dependencies

Stable 0.6.6

  • Tested on OSX 10.8.3, Ubuntu Linux 12.10 using Storm 0.9.0-wip16, JRuby 1.7.4, OpenJDK 7

Current 0.7.0.beta1

  • Tested on OSX 10.9.1, Ubuntu Linux 12.10 using Storm 0.9.1-incubating, JRuby 1.7.11, OpenJDK 7

Installation

  • RubyGems

    $ gem install redstorm
  • Bundler

    source "https://rubygems.org"
    gem "redstorm", "~> 0.6.6"

Usage

Overview

  • create a project directory

  • install the RedStorm gem

  • create a subdirectory for your topology code

  • perform the initial setup to build and install dependencies

    $ redstorm install
  • run your topology in local mode

    $ redstorm local <path/to/topology_class_file_name.rb>

Initial setup

$ redstorm install

This will install default Java jar dependencies in target/dependency, generate & compile the Java bindings in target/classes.

By default, the Java compilation will use the current JVM version as source/target compilation compatibility. If you want to force a specific source/target version, use the --[JVM VERSION] option. For example, to force compiling for 1.6 use:

$ redstorm install --1.6

Create a topology

Create a subdirectory for your topology code and create your topology class using this naming convention: the underscored file name topology_class_file_name.rb MUST correspond to its CamelCase class name, TopologyClassFileName.

Here's an example hello_world_topology.rb

require 'red_storm'

class HelloWorldSpout < RedStorm::DSL::Spout
  on_init {@words = ["hello", "world"]}
  on_send {@words.shift unless @words.empty?}
end

class HelloWorldBolt < RedStorm::DSL::Bolt
  on_receive :emit => false do |tuple|
    log.info(tuple[0])
  end
end

class HelloWorldTopology < RedStorm::DSL::Topology
  spout HelloWorldSpout do
    output_fields :word
  end

  bolt HelloWorldBolt do
    source HelloWorldSpout, :global
  end
end
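
The naming convention can be checked with a few lines of Ruby. This is a minimal sketch, not RedStorm's own lookup code: `expected_class_name` is a hypothetical helper that CamelCases the underscored base name of a topology file.

```ruby
# Hypothetical helper, not part of RedStorm: derive the class name that the
# naming convention expects from an underscored topology file name.
def expected_class_name(path)
  File.basename(path, ".rb").split("_").map { |part| part.capitalize }.join
end

puts expected_class_name("examples/hello_world_topology.rb")
```

For the example above, the helper yields HelloWorldTopology, which matches the topology class defined in hello_world_topology.rb.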

Gems in your topology

RedStorm requires Bundler if gems are needed in your topology. Supply a Gemfile in the root of your project directory with the gems required in your topology. If you also use Bundler for gems other than those required in the topology, you should group the topology gems in a Bundler group of your choice.

Note that bundler is only used to help package the gems prior to running a topology. Your topology code should not use Bundler. With require "red_storm" in your topology class file, RedStorm will take care of setting the gems path. Do not require 'bundler/setup' in the topology.

  1. have Bundler install the gems locally
$ bundle install
  2. copy the topology gems into the target/gems directory
$ redstorm bundle [BUNDLER_GROUP]
  3. make sure your topology class has require "red_storm"
require 'red_storm'

The redstorm bundle command copies the gems specified in the Gemfile (in a specific group if specified) into the target/gems directory. In order for the topology to run in a Storm cluster, the fully installed gems must be packaged and self-contained in a single jar file. This has an important consequence: the gems will not be installed on the cluster target machines; they are already installed in the jar file. This could lead to problems if the machine used to install the gems is of a different architecture than the cluster target machines and some of these gems have C or FFI extensions.

Custom Jar dependencies in your topology (XML Warning! :P)

By default, RedStorm installs the Storm and JRuby jar dependencies into target/dependency. RedStorm uses Ivy 2.3 to manage dependencies. You can fully control and customize these dependencies.

There are two distinct sets of dependencies: the storm dependencies manage the requirements (Storm jars) for the Storm local mode runtime, and the topology dependencies manage the requirements (JRuby jars) for the topology runtime.

You can supply custom storm and topology dependencies by creating ivy/storm_dependencies.xml and ivy/topology_dependencies.xml files. Below is the current default content for these files:

  • ivy/storm_dependencies.xml

    <?xml version="1.0"?>
    <ivy-module version="2.0" xmlns:m="http://ant.apache.org/ivy/maven">
      <info organisation="redstorm" module="storm-deps"/>
      <dependencies>
        <dependency org="storm" name="storm" rev="0.9.0-wip16" conf="default" transitive="true" />
        <override org="org.slf4j" module="slf4j-log4j12" rev="1.6.3"/>
      </dependencies>
    </ivy-module>
  • ivy/topology_dependencies.xml

    <?xml version="1.0"?>
    <ivy-module version="2.0" xmlns:m="http://ant.apache.org/ivy/maven">
      <info organisation="redstorm" module="topology-deps"/>
      <dependencies>
        <dependency org="org.jruby" name="jruby-core" rev="1.7.4" conf="default" transitive="true"/>
    
        <!-- explicitly specify jffi to also fetch the native jar. make sure to update jffi version matching jruby-core version -->
        <!-- this is the only way I found using Ivy to fetch the native jar -->
        <dependency org="com.github.jnr" name="jffi" rev="1.2.5" conf="default" transitive="true">
          <artifact name="jffi" type="jar" />
          <artifact name="jffi" type="jar" m:classifier="native"/>
        </dependency>
    
      </dependencies>
    </ivy-module>

The jar repositories can be configured by adding the ivy/settings.xml file in the root of your project. For information on the Ivy settings format, see the Ivy Settings Documentation. Below is the current default:

  • ivy/settings.xml

    <?xml version="1.0"?>
    <ivysettings>
      <settings defaultResolver="repositories"/>
      <resolvers>
        <chain name="repositories">
          <ibiblio name="ibiblio" m2compatible="true"/>
          <ibiblio name="sonatype" root="http://repo.maven.apache.org/maven2/" m2compatible="true"/>
          <ibiblio name="clojars" root="http://clojars.org/repo/" m2compatible="true"/>
          <ibiblio name="conjars" root="http://conjars.org/repo/" m2compatible="true"/>
        </chain>
      </resolvers>
    </ivysettings>

Run in local mode

$ redstorm local <sources_directory_path/topology_class_file_name.rb>

note that the topology can also be launched with the following command:

$ java -cp "target/classes:target/dependency/storm/default/*:target/dependency/topology/default/*:<sources_directory_path>" redstorm.TopologyLauncher local <sources_directory_path/topology_class_file_name.rb>
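
The manual launch command can also be assembled from Ruby, which makes the classpath entries easier to read and adjust. This is a sketch only: `local_launch_command` is a hypothetical helper, and the classpath entries are copied from the command above.

```ruby
# Hypothetical helper, not a RedStorm API: build the argument list for the
# manual local-mode launch shown above.
def local_launch_command(sources_dir, topology_file)
  classpath = [
    "target/classes",                        # compiled Java bindings
    "target/dependency/storm/default/*",     # Storm jars
    "target/dependency/topology/default/*",  # JRuby jars
    sources_dir                              # your topology sources
  ].join(":")

  ["java", "-cp", classpath, "redstorm.TopologyLauncher", "local", topology_file]
end

cmd = local_launch_command("examples/simple", "examples/simple/word_count_topology.rb")
puts cmd.join(" ")
```

Passing the array to `system(*cmd)` runs the command without a shell, so the `*` wildcards reach the JVM unexpanded.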

See examples below to run examples in local mode or on a production cluster.

Run on production cluster

Locally installing the Storm distribution is not required. However, RedStorm does not provide the Storm command line utilities; to use them you will need to install the Storm distribution.

  1. create the Storm config file ~/.storm/storm.yaml and add the following
nimbus.host: "host_name_or_ip"

you can also keep the config file at an alternate path and pass it with the --config <other/path/to/config.yaml> option

  2. generate target/cluster-topology.jar. This jar file will include your sources directory plus the required dependencies
$ redstorm jar <sources_directory1> <sources_directory2> ...
  3. submit the cluster topology jar file to the cluster
$ redstorm cluster <sources_directory/topology_class_file_name.rb>

or if you have an alternate Storm config path:

$ redstorm cluster --config <some/other/path/to/config.yaml> <sources_directory/topology_class_file_name.rb>

note that the cluster topology jar can also be submitted using the storm command with:

$ storm jar target/cluster-topology.jar -Djruby.compat.version=RUBY1_9 redstorm.TopologyLauncher cluster <sources_directory/topology_class_file_name.rb>

The Storm wiki has instructions on setting up a production cluster. You can also manually submit your topology.
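
If you provision machines with scripts, the Storm config file from the first step can be generated instead of written by hand. The following is a minimal sketch under stated assumptions: `write_storm_config` is a hypothetical helper, and it writes only the nimbus.host key mentioned above. The example call writes to a scratch directory rather than ~/.storm/storm.yaml.

```ruby
require "yaml"
require "fileutils"
require "tmpdir"

# Hypothetical helper, not a RedStorm command: write a minimal Storm config
# file that contains only the nimbus.host key.
def write_storm_config(path, nimbus_host)
  FileUtils.mkdir_p(File.dirname(path))
  File.open(path, "w") { |f| f.write({ "nimbus.host" => nimbus_host }.to_yaml) }
  path
end

# Scratch location for illustration; pass such a path with --config.
config_path = write_storm_config(
  File.join(Dir.tmpdir, "redstorm-example", "storm.yaml"),
  "host_name_or_ip"
)
puts File.read(config_path)
```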

Examples

RedStorm includes several example topologies to help get you started. You can find documentation for the examples here.

Ruby DSL

Ruby DSL Documentation

Multilang ShellSpout & ShellBolt support

Please refer to Using non JVM languages with Storm for the complete information on Multilang & shelling in Storm.

In RedStorm ShellSpout and ShellBolt are supported using the following construct in the topology definition:

bolt JRubyShellBolt, ["python", "splitsentence.py"] do
  output_fields "word"
  source SimpleSpout, :shuffle
end
  • JRubyShellBolt must be used for a ShellBolt. The array argument ["python", "splitsentence.py"] is passed to the class constructor and holds the command the ShellBolt will run.

  • The directory containing the topology class must contain a resources/ directory with all the shell files.
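
A quick pre-flight check can catch a missing shell file before the topology is launched. This is a sketch, not RedStorm behavior: `missing_shell_files` is a hypothetical helper, and it assumes the first element of the command array is the interpreter and the remaining elements are files expected under resources/.

```ruby
# Hypothetical helper, not part of RedStorm. Assumption: command[0] is the
# interpreter and every following element names a file under resources/.
def missing_shell_files(topology_dir, command)
  resources = File.join(topology_dir, "resources")
  command.drop(1).reject { |file| File.exist?(File.join(resources, file)) }
end

# Lists the files that are absent; the result depends on your project layout.
p missing_shell_files("examples/shell", ["python", "splitsentence.py"])
```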

See the shell topology example

Transactional and LinearDRPC topologies

Although both transactional and linear DRPC topologies are deprecated as of Storm 0.8.1, work on these has been merged in RedStorm 0.6.5. Much of this work is also required for Storm Trident topologies. Documentation and examples for transactional and linear DRPC topologies will be added shortly.

Known issues

  • SnakeYAML conflict between Storm and JRuby

    See issue. This is a classic Java world jar conflict. Storm 0.9.0 uses snakeyaml 1.9 and JRuby 1.7.x uses snakeyaml 1.11. If you try to use YAML serialization in your topology it will crash with an exception. This problem is easy to solve when running topologies in local mode: simply override the snakeyaml version in the storm dependencies with the correct jar version. You can do this by creating a custom storm dependencies file:

    • ivy/storm_dependencies.xml

      <?xml version="1.0"?>
      <ivy-module version="2.0">
        <info organisation="redstorm" module="storm-deps"/>
        <dependencies>
          <dependency org="storm" name="storm" rev="0.9.0-wip16" conf="default" transitive="true" />
          <override org="org.slf4j" module="slf4j-log4j12" rev="1.6.3"/>
          <override org="org.yaml" module="snakeyaml" rev="1.11"/>
        </dependencies>
      </ivy-module>

    In remote cluster mode you will have to update snakeyaml manually or with your favorite deployment/provisioning tool.
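
If you maintain several overrides, the custom storm dependencies file can be rendered from a small Ruby script. This is a sketch only: `render_storm_dependencies` is a hypothetical helper, not a RedStorm command, and the revisions passed in are the ones listed above.

```ruby
# Hypothetical helper, not a RedStorm API: render the content of
# ivy/storm_dependencies.xml with one <override> line per entry.
def render_storm_dependencies(storm_rev, overrides)
  lines = []
  lines << '<?xml version="1.0"?>'
  lines << '<ivy-module version="2.0">'
  lines << '  <info organisation="redstorm" module="storm-deps"/>'
  lines << '  <dependencies>'
  lines << %(    <dependency org="storm" name="storm" rev="#{storm_rev}" conf="default" transitive="true" />)
  overrides.each do |(org, mod), rev|
    lines << %(    <override org="#{org}" module="#{mod}" rev="#{rev}"/>)
  end
  lines << '  </dependencies>'
  lines << '</ivy-module>'
  lines.join("\n") + "\n"
end

xml = render_storm_dependencies("0.9.0-wip16", {
  ["org.slf4j", "slf4j-log4j12"] => "1.6.3",
  ["org.yaml", "snakeyaml"] => "1.11"
})
puts xml
```

Write the result to ivy/storm_dependencies.xml in your project root, then run the initial setup again so the overridden jar is fetched.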

RedStorm Development

It is possible to fork the RedStorm project and run local and remote/cluster topologies directly from the project sources without installing the gem. This is a useful setup when contributing to the project.

Requirements

  • JRuby 1.7.4

Workflow

  • fork project and create branch

  • install RedStorm required gems

    $ bundle install
  • install dependencies in target/dependency

    $ bundle exec redstorm deps
  • generate and build Java source into target/classes

    $ bundle exec redstorm build

    By default, the Java compilation will use the current JVM version as source/target compilation compatibility. If you want to force a specific source/target version, use the --[JVM VERSION] option. For example, to force compiling for 1.6 use:

    $ bundle exec redstorm build --1.6

    If you modify any of the RedStorm Ruby code or Java binding code, you need to run this command again to refresh the code and rebuild the bindings.

  • follow the normal usage patterns to run the topology in local or remote cluster.

    $ bundle exec redstorm bundle ...
    $ bundle exec redstorm local ...
    $ bundle exec redstorm jar ...
    $ bundle exec redstorm cluster ...

Remote cluster testing

Vagrant & Chef configuration to create a single node test Storm cluster is available in https://github.com/colinsurprenant/redstorm/tree/master/vagrant/

Notes about 1.8/1.9 JRuby compatibility

Ruby 1.9 is the default runtime mode in JRuby 1.7.x

If you require Ruby 1.8 support, there are two ways to have JRuby run in 1.8 runtime mode:

  • by setting the JRUBY_OPTS env variable

    $ export JRUBY_OPTS=--1.8
  • by using the --1.8 option

    $ jruby --1.8 -S redstorm ...

By default, a topology will be executed in the same mode as the interpreter running the $ redstorm command. You can force RedStorm to choose a specific JRuby compatibility mode using the [--1.8|--1.9] parameter for the topology execution in local or remote cluster.

$ redstorm local|cluster [--1.8|--1.9] ...

If you are not using the DSL and only using the proxy classes (like in examples/native) you will need to manually set the JRuby version in the Storm Backtype::Config object like this:

class SomeTopology
  RedStorm::Configuration.topology_class = self

  def start(base_class_path, env)
    builder = TopologyBuilder.new
    builder.setSpout ...
    builder.setBolt ...

    conf = Backtype::Config.new
    conf.put("topology.worker.childopts", "-Djruby.compat.version=RUBY1_8")

    StormSubmitter.submitTopology("some_topology", conf, builder.createTopology);
  end
end
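
Rather than hard-coding the option, the value can be derived from the Ruby version of the running interpreter. This is a minimal sketch: `compat_childopts` is a hypothetical helper, and it only covers the two modes discussed in this section.

```ruby
# Hypothetical helper, not part of RedStorm: map a Ruby version string to the
# JVM option used above for "topology.worker.childopts".
def compat_childopts(ruby_version = RUBY_VERSION)
  case ruby_version
  when /\A1\.8\./ then "-Djruby.compat.version=RUBY1_8"
  when /\A1\.9\./ then "-Djruby.compat.version=RUBY1_9"
  else nil # other versions: leave the option unset
  end
end

puts compat_childopts("1.8.7")
puts compat_childopts("1.9.3")
```

In the topology above, the result could be passed to conf.put in place of the literal string when it is not nil.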

How to contribute

Fork the project, create a branch and submit a pull request.

Some ways you can contribute:

  • by reporting bugs using the issue tracker
  • by suggesting new features using the issue tracker
  • by writing or editing documentation
  • by writing specs
  • by writing code
  • by refactoring code
  • ...

Projects using RedStorm

If you want to list your RedStorm project here, contact me.

  • Tweitgeist - realtime computation of the top trending hashtags on Twitter. See Live Demo.

Author

Colin Surprenant, http://github.com/colinsurprenant/, @colinsurprenant, http://colinsurprenant.com/

License

Apache License, Version 2.0. See the LICENSE.md file.

redstorm's Issues

automatically set JRuby 1.8/1.9 mode in remote cluster context

remote cluster JRuby mode is controlled using the topology configure block with something like:

configure do |env|
  set "topology.worker.childopts", "-Djruby.compat.version=RUBY1_9"
end

just like with local execution we should default to the current JRuby interpreter version if not explicitly configured.

undefined method 'new'

We occasionally see the following error

java.lang.RuntimeException: org.jruby.exceptions.RaiseException: (NoMethodError) undefined method `new' for main:Object
        at backtype.storm.utils.DisruptorQueue.consumeBatchToCursor(DisruptorQueue.java:87)
        at backtype.storm.utils.DisruptorQueue.consumeBatchWhenAvailable(DisruptorQueue.java:58)
        at backtype.storm.disruptor$consume_batch_when_available.invoke(disruptor.clj:62)
        at backtype.storm.daemon.executor$fn__4022$fn__4031$fn__4078.invoke(executor.clj:658)
        at backtype.storm.util$async_loop$fn__437.invoke(util.clj:377)
        at clojure.lang.AFn.run(AFn.java:24)
        at java.lang.Thread.run(Thread.java:679)
Caused by: org.jruby.exceptions.RaiseException: (NoMethodError) undefined method `new' for main:Object

It generally happens soon after the topology first starts processing tuples, and the topology requires restarting when it occurs.

redstorm should support namespaced class definition

I tried adding a namespace to word_count_topology.rb in examples/simple/ :

require 'examples/simple/random_sentence_spout'
require 'examples/simple/split_sentence_bolt'
require 'examples/simple/word_count_bolt'
module Example
  module Simple
    class WordCountTopology < RedStorm::SimpleTopology
      spout RandomSentenceSpout, :parallelism => 5

      bolt SplitSentenceBolt, :parallelism => 8 do
        source RandomSentenceSpout, :shuffle
      end

      bolt WordCountBolt, :parallelism => 12 do
        source SplitSentenceBolt, :fields => ["word"]
      end

      configure :word_count do |env|
        case env
        when :local
          debug true
          max_task_parallelism 3
        when :cluster
          debug true
          num_workers 20
          max_spout_pending(1000);
        end
      end

      on_submit do |env|
        if env == :local
          sleep(5)
          cluster.shutdown
        end
      end
    end

  end
end

But after adding the namespace, redstorm can't find WordCountTopology anymore. For real projects, we need to use namespaces to organize code, so redstorm should support namespaced class definitions.

stone-laptop:redstorm-try stone$ redstorm local examples/simple/word_count_topology.rb 
launching java -cp "/Users/stone/tmp/redstorm-try/target/classes:/Users/stone/tmp/redstorm-try/target/dependency/*" redstorm.TopologyLauncher local examples/simple/word_count_topology.rb
file:/Users/stone/tmp/redstorm-try/target/dependency/jruby-complete-1.6.5.jar!/builtin/javasupport/core_ext/object.rb:99 warning: already initialized constant Config
RedStorm v0.2.1 starting topology WordCountTopology in local environment
1    [main] ERROR org.apache.zookeeper.server.NIOServerCnxn  - Thread Thread[main,5,main] died
org.jruby.exceptions.RaiseException: (NameError) uninitialized constant WordCountTopology
        at org.jruby.RubyModule.const_missing(org/jruby/RubyModule.java:2590)
        at #.(eval)((eval):1)
        at org.jruby.RubyModule.module_eval(org/jruby/RubyModule.java:2246)
        at #.main(/Users/stone/.rvm/gems/jruby-1.6.5/gems/redstorm-0.2.0/lib/red_storm/topology_launcher.rb:43)
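
The trace suggests the launcher looks the class up by its bare CamelCase name. For reference, a fully qualified name can be resolved in plain Ruby by walking the namespace one constant at a time. This is a sketch of that general technique, not RedStorm's launcher code; `resolve_class` is a hypothetical helper, and the empty class below stands in for the real topology.

```ruby
# Stand-in for the namespaced topology from the report above.
module Example
  module Simple
    class WordCountTopology; end
  end
end

# Hypothetical helper: resolve "A::B::C" by starting at Object and calling
# const_get once per name segment.
def resolve_class(qualified_name)
  qualified_name.split("::").inject(Object) { |scope, name| scope.const_get(name) }
end

puts resolve_class("Example::Simple::WordCountTopology")
```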

better handling of ant dependency

as reported in Storm ML: https://groups.google.com/d/topic/storm-user/jLY_f320Hek/discussion

when ant is not locally installed :

redstorm install
IOError: Cannot run program "ant" (in directory "/home/ant/r"): java.io.IOException: error=2, No such file or directory
popen at org/jruby/RubyIO.java:3613
load_from_ant at /home/ant/.rvm/rubies/jruby-1.6.7.2/lib/ruby/site_ruby/shared/ant.rb:5
load at /home/ant/.rvm/rubies/jruby-1.6.7.2/lib/ruby/site_ruby/shared/ant.rb:38
Ant at /home/ant/.rvm/rubies/jruby-1.6.7.2/lib/ruby/site_ruby/shared/ant.rb:61
(root) at /home/ant/.rvm/rubies/jruby-1.6.7.2/lib/ruby/site_ruby/shared/ant.rb:3
require at org/jruby/RubyKernel.java:1033
require at /home/ant/.rvm/rubies/jruby-1.6.7.2/lib/ruby/site_ruby/1.8/rubygems/custom_require.rb:36
(root) at /home/ant/.rvm/rubies/jruby-1.6.7.2/lib/ruby/site_ruby/shared/ant.rb:1
load at org/jruby/RubyKernel.java:1058
run at /home/ant/.rvm/gems/jruby-1.6.7.2/gems/redstorm-0.6.2/lib/tasks/red_storm.rake:54
(root) at /home/ant/.rvm/gems/jruby-1.6.7.2/gems/redstorm-0.6.2/bin/redstorm:15
load at org/jruby/RubyKernel.java:1058
(root) at /home/ant/.rvm/gems/jruby-1.6.7.2/bin/redstorm:23

java.io.NotSerializableException when serialize Hash between tasks

One of the fields in our spout is Hash, but it seems that JRuby's RubyHashEntry is not serializable.

1461 [Thread-49] ERROR backtype.storm.util  - Async loop died!
java.io.NotSerializableException: org.jruby.RubyHash$RubyHashEntry
        at java.io.ObjectOutputStream.writeObject0(ObjectOutputStream.java:1164)
        at java.io.ObjectOutputStream.defaultWriteFields(ObjectOutputStream.java:1518)
        at java.io.ObjectOutputStream.writeSerialData(ObjectOutputStream.java:1483)
        at java.io.ObjectOutputStream.writeOrdinaryObject(ObjectOutputStream.java:1400)
        at java.io.ObjectOutputStream.writeObject0(ObjectOutputStream.java:1158)
        at java.io.ObjectOutputStream.writeObject(ObjectOutputStream.java:330)
        at backtype.storm.serialization.JavaSerialization.serialize(JavaSerialization.java:20)
        at backtype.storm.serialization.JavaSerialization.serialize(JavaSerialization.java:14)
        at backtype.storm.serialization.FieldSerialization.serialize(FieldSerialization.java:26)
        at backtype.storm.serialization.ValuesSerializer.serializeInto(ValuesSerializer.java:50)
        at backtype.storm.serialization.TupleSerializer.serialize(TupleSerializer.java:31)
        at backtype.storm.daemon.worker$fn__3076$exec_fn__860__auto____3077$fn__3137$fn__3138.invoke(worker.clj:189)
        at backtype.storm.daemon.worker$fn__3076$exec_fn__860__auto____3077$fn__3137.invoke(worker.clj:184)
        at clojure.lang.AFn.applyToHelper(AFn.java:165)
        at clojure.lang.AFn.applyTo(AFn.java:151)
        at clojure.core$apply.invoke(core.clj:540)
        at backtype.storm.util$async_loop$fn__445.invoke(util.clj:215)
        at clojure.lang.AFn.run(AFn.java:24)
        at java.lang.Thread.run(Thread.java:680)

JRubyShellSpout is in wrong package

I installed redstorm-0.6.4 from rubygems, and am having trouble setting up a shell spout. It looks like the package for JRubyShellSpout is backtype.storm.clojure, but the simple topology is expecting it to be redstorm.storm.jruby.

I'd submit a pull request to fix this, but the other issue is that I can't seem to find JRubyShellSpout.java or JRubyShellBolt.java in the GitHub repo, though they exist in my installed gems.

Manually emitting tuples

Suppose I have a bolt that does not auto-emit. How do I manually emit a tuple using your DSL? For example,

class MyBolt < RedStorm::SimpleBolt
  output_fields :my_output_field

  on_receive :emit => false do |tuple|
    ...
    unanchored_emit ["output-field-value"] # also tried without the square brackets
  end
end

results in the following error:

10934 [Thread-38] ERROR backtype.storm.util  - Async loop died!
java.lang.NullPointerException
  at backtype.storm.tuple.Tuple.<init>(Tuple.java:43)

Should the above work? Also, should the DSL have an "emit" keyword, seeing as the bolt is already specified as unanchored (by default)?

BTW, great work on the DSL.

cannot find redstorm gem when using rbenv

I followed the instructions in the README, but when I try the examples:

redstorm examples/local_exclamation_topology.rb

I get:

Exception in thread "main" java.lang.ExceptionInInitializerError
Caused by: org.jruby.exceptions.RaiseException: (LoadError) no such file to load -- red_storm/version

I am using OSX 10.6.8 and JRuby 1.6.5. Specifically, the error is caused by

java -cp "target/classes:target/dependency/*" redstorm.TopologyLauncher examples/local_exclamation_topology.rb

So JRuby can't find my installed gems. I am using rbenv (http://github.com/sstephenson/rbenv) so my gems are in a non-standard location. However, should the required gems also be in target/classes? For example, if I:

gem install -i target/classes redstorm ruby-maven --no-rdoc --no-ri

then the problem goes away.

0.3.0 release?

Any chance of seeing red storm 0.3.0 being released on rubygems?

The documentation says that running gem install redstorm will install redstorm 0.3.0, but it is still 0.2.1 that is the latest release available.

How do I run in 1.9 mode?

I can't find a way to run my code in 1.9 mode, I only get 1.8.7. Setting JRUBY_OPTS has no effect. Is there any way to get RedStorm to run the Ruby code in 1.9?

generate local Rakefile instead of using redstorm bin?

see if instead of using the redstorm bin for all commands in workflow, we could use a generator to create a local Rakefile and expose rake tasks for the workflow. This would allow easy local customization if needed.

gem dependencies and bundler not picking up groups

Hey guys,

I've traced a problem here https://github.com/colinsurprenant/redstorm/blob/master/lib/tasks/red_storm.rake#L160

Which causes redstorm bundle to not copy any gems.

Where bundler keeps the default group as the symbol :default and not the string 'default':

irb(main):019:0> Bundler.definition.groups
=> [:default]

This is bundler 1.1.4. I didn't make a pull request because I'm not sure why the string 'default' is used in the first place... any idea?

thanks

Fields grouping must be declared with strings, not symbols

This topology code works

bolt UniqBolt do
  source FriendsBolt, :fields => ['friend_id']
end

but this does not

bolt UniqBolt do
  source FriendsBolt, :fields => [:friend_id]
end

the latter fails with the following message

1    [main] ERROR org.apache.zookeeper.server.NIOServerCnxn  - Thread Thread[main,5,main] died
org.jruby.exceptions.RaiseException: (TypeError) can't define singleton
    at org.jruby.RubyModule.extend_object(org/jruby/RubyModule.java:2066)
    at #<Class:0x42623a9d>.define_grouping(/Users/theo/.rvm/gems/jruby-1.6.5@redstorm_demo/gems/redstorm-0.2.1/lib/red_storm/simple_topology.rb:40)
    at org.jruby.RubyArray.each(org/jruby/RubyArray.java:1612)
    at RedStorm::SimpleTopology::BoltDefinition.define_grouping(/Users/theo/.rvm/gems/jruby-1.6.5@redstorm_demo/gems/redstorm-0.2.1/lib/red_storm/simple_topology.rb:35)
    at Topology.start(/Users/theo/.rvm/gems/jruby-1.6.5@redstorm_demo/gems/redstorm-0.2.1/lib/red_storm/simple_topology.rb:112)
    at org.jruby.RubyArray.each(org/jruby/RubyArray.java:1612)
    at RedStorm::SimpleTopology.start(/Users/theo/.rvm/gems/jruby-1.6.5@redstorm_demo/gems/redstorm-0.2.1/lib/red_storm/simple_topology.rb:109)
    at #<Class:0x32f2e8ca>.main(/Users/theo/.rvm/gems/jruby-1.6.5@redstorm_demo/gems/redstorm-0.2.1/lib/red_storm/topology_launcher.rb:43)

Since fields are declared with symbols, it feels logical that it should be possible to refer to them using symbols too.
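
One way to accept both forms would be to convert field names to strings before they reach the grouping code. This is a sketch of that idea only, not the current RedStorm implementation; `normalize_fields` is a hypothetical helper.

```ruby
# Hypothetical helper, not part of RedStorm: accept a single field or a list
# of fields, given as symbols or strings, and return an array of strings.
def normalize_fields(fields)
  Array(fields).map { |field| field.to_s }
end

p normalize_fields([:friend_id])
p normalize_fields(["friend_id"])
```

Both calls return the same array of strings, so either declaration style would reach the grouping code in the form that currently works.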

`redstorm bundle` fails to load git based gems

Updated

Only gems from RubyGems will work when bundling a topology. If a gem is from a git repo, the .gemspec will not be compiled like the ones from RubyGems. As a result, it will fail to load.

Steps to reproduce:

# Gemfile
group :topology do
  gem 'multi_json', git: 'https://github.com/intridea/multi_json.git'
end

bundle exec redstorm bundle topology

Then attempt to run the topology locally to get a LoadError.

on_ack and on_fail not getting called

I have a topology with a single spout and a single bolt:

class NumberSpout < RedStorm::SimpleSpout
  output_fields :number

  on_ack do |msg_id|
    puts "ACK #{msg_id}"
  end

  on_fail do |msg_id|
    puts "FAIL #{msg_id}"
  end

  on_send do
    rand(10)
  end
end

class AcksBolt < RedStorm::SimpleBolt
  on_receive :emit => false do |tuple|
    ack tuple
  end
end

class AcksTopology < RedStorm::SimpleTopology
  spout NumberSpout

  bolt AcksBolt, :parallelism => 1 do
    source NumberSpout, :shuffle
  end

  configure :topology do |environment|
    case environment
    when :local
      debug true
      max_task_parallelism 5
    end
  end
end

The bolt just acks every tuple it receives from the spout. Unfortunately on_ack is never getting called. Should it be? on_fail is never getting called either.

allow specification of output_fields in topology DSL

output_fields is currently defined in the bolt/spout DSL. I think it would make sense to allow this definition to occur in the topology definition DSL which would then provide all the spouts & bolts "wiring" or routing information: output_fields, source and grouping definition.

TopologyLauncher does not run in Ruby 1.9 in the 19mode branch

Using the 19mode branch, when you run storm jar ... it will run the TopologyLauncher using Ruby 1.8 by default, which will in turn run your topology configuration script in Ruby 1.8 rather than 1.9, even if you've got JRUBY_OPTS=--1.9 set in the environment you deploy from.

To fix this we need to set the JVM property -Djruby.compat.version=RUBY1_9 before we run the TopologyLauncher. This has to be done by storm whilst it creates the command to call java ...., if there isn't another way to do this we should file a ticket with the main storm project to add support for this.

Modifying Storm locally in this way, I've found this to work.

Support for Trident API and other topologies

Is there anything planned regarding the support of the newly introduced Trident API (Storm 0.8) as well as the other existing topologies?
I just updated the storm dependency of redstorm to storm 0.8.0 for my own testing and it seems that the existing redstorm samples work flawlessly.

add support for Distributed RPC

I do not currently see a way to handle DRPC in RedStorm. DSL / proxy support for LinearDRPCTopologyBuilder and related DRPC classes would be very useful.

Update RedStorm to target Storm 0.8

RedStorm currently targets 0.7.4. Storm is up to 0.8.1 with 0.8.2 to be released soon. It adds some awesome features I'm sure a lot of RedStorm users would appreciate.

As it stands, I've been using RedStorm with 0.8.0 and 0.8.1 and it seems to work alright.

Storm's testing facilities Integration

Storm will expose its testing facilities through the Java API; they are currently only available in Clojure. ( https://github.com/nathanmarz/storm/issues/72 ).

Testing is indispensable for writing reliable Storm topologies, so redstorm should integrate the testing facilities of Storm once they are available.

For the time being, @nathanmarz suggests using Clojure for the tests. Any suggestions on Clojure & JRuby interop, so we can use the testing facilities?

Don't shell out unless absolutely necessary

When running the Redstorm tasks there's a lot of shelling out going on. Redstorm shells out to rmvn, which, IIRC, shells out to mvn. Shelling out is very slow on the JVM, and doubly so since all the processes started are also JVMs. It feels unnecessary and can probably be sped up considerably if all the work was done in-process.

Perhaps the dependency management could be done with Pom Pom Pom instead? It uses Ivy under the hood, has a simple programmatic API (even experimental Bundler integration), and does everything in-process. (Full disclosure: I'm the author, if you hadn't guessed already).

I haven't published a gem for v2 yet, v1 is very old and not very good. If you don't like the idea, perhaps I could help just rewriting the dependency management parts using Ivy directly?

redstorm examples fails when directory examples already exists

it actually tries to launch a topology class "examples" because File.exist? returns true on "examples".

launching java -cp "/Users/colin/dev/praized/src/internal/redstorm-test/target/classes:/Users/colin/dev/praized/src/internal/redstorm-test/target/dependency/*" redstorm.TopologyLauncher examples
file:/Users/colin/dev/praized/src/internal/redstorm-test/target/dependency/jruby-complete-1.6.5.jar!/builtin/javasupport/core_ext/object.rb:99 warning: already initialized constant Config
RedStorm v0.1.1 starting topology Examples
0 [main] ERROR org.apache.zookeeper.server.NIOServerCnxn - Thread Thread[main,5,main] died
org.jruby.exceptions.RaiseException: (LoadError) no such file to load -- examples
at org.jruby.RubyKernel.require(org/jruby/RubyKernel.java:1038)
at Kernel.require(file:/Users/colin/dev/praized/src/internal/redstorm-test/target/dependency/jruby-complete-1.6.5.jar!/META-INF/jruby.home/lib/ruby/site_ruby/1.8/rubygems/custom_require.rb:36)
at #<Class:0x36656758>.main(/Users/colin/.rvm/gems/jruby-1.6.5@redstorm-test/gems/redstorm-0.1.1/lib/red_storm/topology_launcher.rb:39)

redstorm uses absolute paths for generated Java proxy classes

This concerns the generated Java proxy classes in target/src/redstorm/ (TopologyLauncher.java, proxy/{Bolt.java, Spout.java}).

I noticed that the generated source code uses absolute paths. Will this cause errors when deployed to other machines?

  __ruby__.executeScript(source, "/Users/stone/.rvm/gems/jruby-1.6.5/gems/redstorm-0.2.1/lib/red_storm/proxy/spout.rb");
  __ruby__.executeScript(source, "/Users/stone/.rvm/gems/jruby-1.6.5/gems/redstorm-0.2.1/lib/red_storm/proxy/bolt.rb");
  __ruby__.executeScript(source, "/Users/stone/.rvm/gems/jruby-1.6.5/gems/redstorm-0.2.1/lib/red_storm/topology_launcher.rb"); 
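If the generator wanted a machine-independent name for these scripts, one option is to record the path relative to the gem's lib directory. The sketch below only shows that path computation; `load_path_relative` is a hypothetical helper, and whether the second argument to executeScript matters at runtime is not established by this report.

```ruby
require 'pathname'

# Hypothetical helper: expresses an absolute script path relative to
# a library root, so the result does not depend on the build machine.
def load_path_relative(absolute_path, lib_root)
  Pathname.new(absolute_path).relative_path_from(Pathname.new(lib_root)).to_s
end

lib_root = '/Users/stone/.rvm/gems/jruby-1.6.5/gems/redstorm-0.2.1/lib'
script   = "#{lib_root}/red_storm/proxy/spout.rb"

relative = load_path_relative(script, lib_root)
puts relative
```

Pathname#relative_path_from works purely on the path strings, so it does not require the files to exist on the machine running it.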

redstorm install fails when using --verbose flags

When I run "redstorm install" I get the following error when it gets to the first jruby compile statement:

--> Compiling JRuby
/Users/mkovacs/.rvm/rubies/jruby-1.5.6/lib/ruby/1.8/optparse.rb:1451:in `complete': invalid option: --verbose (OptionParser::InvalidOption)
from /Users/mkovacs/.rvm/rubies/jruby-1.5.6/lib/ruby/1.8/optparse.rb:1449:in `catch'
from /Users/mkovacs/.rvm/rubies/jruby-1.5.6/lib/ruby/1.8/optparse.rb:1449:in `complete'
from /Users/mkovacs/.rvm/rubies/jruby-1.5.6/lib/ruby/1.8/optparse.rb:1262:in `parse_in_order'
from /Users/mkovacs/.rvm/rubies/jruby-1.5.6/lib/ruby/1.8/optparse.rb:1255:in `catch'
from /Users/mkovacs/.rvm/rubies/jruby-1.5.6/lib/ruby/1.8/optparse.rb:1255:in `parse_in_order'
from /Users/mkovacs/.rvm/rubies/jruby-1.5.6/lib/ruby/1.8/optparse.rb:1249:in `order!'
from /Users/mkovacs/.rvm/rubies/jruby-1.5.6/lib/ruby/1.8/optparse.rb:1340:in `permute!'
from /Users/mkovacs/.rvm/rubies/jruby-1.5.6/lib/ruby/1.8/optparse.rb:1361:in `parse!'
from /Users/mkovacs/.rvm/rubies/jruby-1.5.6/lib/ruby/site_ruby/shared/jruby/compiler.rb:74:in `compile_argv'
from /Users/mkovacs/.rvm/rubies/jruby-1.5.6/lib/ruby/1.8/optparse.rb:792:in `initialize'
from /Users/mkovacs/.rvm/rubies/jruby-1.5.6/lib/ruby/site_ruby/shared/jruby/compiler.rb:34:in `new'
from /Users/mkovacs/.rvm/rubies/jruby-1.5.6/lib/ruby/site_ruby/shared/jruby/compiler.rb:34:in `compile_argv'
from /Users/mkovacs/.rvm/rubies/jruby-1.5.6/bin/jrubyc:9

Pulling --verbose out of the jrubyc and javac invocations fixes things for me. Not sure why it's breaking for me... I'm running Java 1.6 and JRuby 1.5.6, as you can see in the output.
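The trace shows Ruby's OptionParser rejecting a flag the parser never declared. A minimal sketch of that failure mode, where the `--target` option is purely illustrative and not taken from jrubyc:

```ruby
require 'optparse'

# A parser that, like the one in the trace above, does not declare
# a --verbose option. The --target option is illustrative only.
strict_parser = OptionParser.new do |opts|
  opts.on('--target DIR', 'Output directory') { |dir| dir }
end

message = nil
begin
  strict_parser.parse!(['--verbose', 'foo.rb'])
rescue OptionParser::InvalidOption => e
  # Any flag the parser does not know about ends up here.
  message = e.message
end

puts message
```

This suggests the installed jrubyc simply did not accept --verbose, so the flag would need to be passed only to tools that declare it.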

Storm UI

Hi, is there a way to use the Storm UI through Ruby? And can I still use it when running in local mode?

Update on_receive block to handle yield statements

on_receive method blocks should be able to yield a tuple, which would in turn get emitted. Right now, it looks like the current implementation iterates through an array of tuples and emits them; however, aggregating them inside the on_receive block might yield poor performance.

Example:

(current form)

class DemultiplexerBolt < RedStorm::SimpleBolt
  on_receive do |tuple|
    tuples = []
    tuple.getValueByField(:ids).each do |id|
      tuples << tuple + [id] 
    end
    tuples
  end
end

(proposed form)

class DemultiplexerBolt < RedStorm::SimpleBolt
  on_receive do |tuple|
    tuple.getValueByField(:ids).each do |id|
      yield tuple + [id] 
    end
  end
end

Does anyone see any issues with updating the logic within SimpleBolt#execute to account for this?
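One caveat with the proposed form: in Ruby, a yield inside a block belongs to the method that lexically encloses the block, not to the method that later calls it, so a yield written directly in a class body has no caller to yield to. A minimal sketch of an alternative, assuming a hypothetical emit callable passed as a second block argument (SketchBolt and its methods are stand-ins, not RedStorm's API, and a plain hash stands in for a Storm tuple):

```ruby
# Hypothetical stand-in for a bolt base class; not RedStorm's implementation.
class SketchBolt
  # Stores the block given in the subclass body.
  def self.on_receive(&block)
    @on_receive_block = block
  end

  def self.on_receive_block
    @on_receive_block
  end

  attr_reader :emitted

  def initialize
    @emitted = []
  end

  # Calls the stored block, handing it a callable that records each
  # emitted value immediately instead of collecting a return array.
  def execute(tuple)
    emit = ->(values) { @emitted << values }
    self.class.on_receive_block.call(tuple, emit)
  end
end

class DemultiplexerSketch < SketchBolt
  on_receive do |tuple, emit|
    tuple[:ids].each do |id|
      emit.call(tuple[:values] + [id])
    end
  end
end

bolt = DemultiplexerSketch.new
bolt.execute({ ids: [1, 2], values: ['a'] })
p bolt.emitted
```

Each value is handed off as soon as it is produced, which is the behaviour the proposal is after, without relying on yield.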

no method 'setSpout' after upgrade to redstorm 0.4.0 and storm 0.6.2

I upgraded our project to redstorm 0.4.0 today (it was previously based on redstorm 0.2.1). When running the topology locally, the following exception occurs:

0    [main] ERROR org.apache.zookeeper.server.NIOServerCnxn  - Thread Thread[main,5,main] died
org.jruby.exceptions.RaiseException: (NameError) no method 'setSpout' for arguments (org.jruby.RubyString,redstorm.storm.jruby.JRubySpout,org.jruby.RubyFixnum) on Java::BacktypeStormTopology::TopologyBuilder
    at EventTopology.start(/Users/stone/.rvm/gems/jruby-1.6.6/gems/redstorm-0.4.0/lib/red_storm/simple_topology.rb:110)
    at org.jruby.RubyArray.each(org/jruby/RubyArray.java:1612)
    at RedStorm::SimpleTopology.start(/Users/stone/.rvm/gems/jruby-1.6.6/gems/redstorm-0.4.0/lib/red_storm/simple_topology.rb:108)
    at #.main(/Users/stone/.rvm/gems/jruby-1.6.6/gems/redstorm-0.4.0/lib/red_storm/topology_launcher.rb:43)

make topology class configure statement optional

I'm writing my first topology, so beware :-)

When running it, I got:

launching java -Djruby.compat.version=RUBY1_9 -cp "/Users/thbar/git/project/storm-prototype/target/classes:/Users/thbar/git/project/storm-prototype/target/dependency/storm/default/*:/Users/thbar/git/project/storm-prototype/target/dependency/topology/default/*:/Users/thbar/git/project/storm-prototype/" redstorm.TopologyLauncher local simple_topology.rb
SLF4J: The requested version 1.5.8 by your slf4j binding is not compatible with [1.6]
SLF4J: See http://www.slf4j.org/codes.html#version_mismatch for further details.
0    [main] ERROR org.apache.zookeeper.server.NIOServerCnxn  - Thread Thread[main,5,main] died
org.jruby.exceptions.RaiseException: (NoMethodError) undefined method `name' for nil:NilClass
  at (Anonymous).main(/Users/thbar/.rvm/gems/jruby-1.7.0@project/gems/redstorm-0.6.4/lib/red_storm/topology_launcher.rb:42)

The workaround is to add a configure block. If configure is actually required, do we need better error handling here?
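One way to make the statement optional would be to fall back to defaults when configure was never called. The sketch below assumes a hypothetical base class (SketchTopology and all of its methods are illustrative; RedStorm's real internals may differ):

```ruby
# Hypothetical stand-in for a topology base class.
class SketchTopology
  # Optional: records a custom name and a configuration block.
  def self.configure(custom_name = nil, &block)
    @topology_name = custom_name
    @configure_block = block
  end

  # Falls back to the underscored class name when configure was never called.
  def self.topology_name
    (@topology_name || underscore(name)).to_s
  end

  # Returns a no-op block when no configuration was given.
  def self.configure_block
    @configure_block || proc {}
  end

  # Converts "WordCountTopology" to "word_count_topology".
  def self.underscore(camel_case)
    camel_case.gsub(/([a-z\d])([A-Z])/, '\1_\2').downcase
  end
end

# No configure statement at all.
class WordCountTopology < SketchTopology
end

puts WordCountTopology.topology_name
WordCountTopology.configure_block.call
```

With defaults like these, a missing configure statement would no longer surface as a NoMethodError on nil.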

(NameError) uninitialized constant Java::BacktypeStorm::Config::CONFIG

I tried redstorm with extra gem dependencies (Mongoid, Resque). I start the topology using the following command:

RACK_ENV=development redstorm local lib/redstorm-topology/event_topology.rb

But it throws an uninitialized constant Java::BacktypeStorm::Config::CONFIG error. Also, I am using Mac OS X, but the JRuby stack trace has some weird lines related to Win32API:

launching java -cp "/Users/stone/code/redstorm-topology/target/classes:/Users/stone/code/redstorm-topology/target/dependency/*" redstorm.TopologyLauncher local lib/redstorm-topology/event_topology.rb
file:/Users/stone/code/redstorm-topology/target/dependency/jruby-complete-1.6.5.jar!/builtin/javasupport/core_ext/object.rb:99 warning: already initialized constant Config
RedStorm v0.2.1 starting topology EventTopology in local environment
1    [main] ERROR org.apache.zookeeper.server.NIOServerCnxn  - Thread Thread[main,5,main] died
org.jruby.exceptions.RaiseException: (NameError) uninitialized constant Java::BacktypeStorm::Config::CONFIG
    at org.jruby.RubyModule.const_missing(org/jruby/RubyModule.java:2590)
    at #.(root)(file:/Users/stone/code/redstorm-topology/target/dependency/jruby-complete-1.6.5.jar!/META-INF/jruby.home/lib/ruby/site_ruby/shared/Win32API.rb:2)
    at org.jruby.RubyKernel.require(org/jruby/RubyKernel.java:1038)
    at #.Dir(file:/Users/stone/code/redstorm-topology/target/dependency/jruby-complete-1.6.5.jar!/META-INF/jruby.home/lib/ruby/site_ruby/shared/Win32API.rb:14)
    at #.(root)(file:/Users/stone/code/redstorm-topology/target/dependency/jruby-complete-1.6.5.jar!/META-INF/jruby.home/lib/ruby/1.8/tmpdir.rb:9)
    at org.jruby.RubyKernel.require(org/jruby/RubyKernel.java:1038)
    at org.jruby.RubyKernel.require(org/jruby/RubyKernel.java:1038)
    at #.(root)(file:/Users/stone/code/redstorm-topology/target/dependency/jruby-complete-1.6.5.jar!/META-INF/jruby.home/lib/ruby/1.8/tmpdir.rb:5)
    at org.jruby.RubyKernel.require(org/jruby/RubyKernel.java:1038)
    at #.(root)(/Users/stone/.rvm/gems/jruby-1.6.5/gems/rack-1.2.4/lib/rack/utils.rb:1)
    at org.jruby.RubyKernel.require(org/jruby/RubyKernel.java:1038)
    at #.(root)(/Users/stone/.rvm/gems/jruby-1.6.5/gems/rack-1.2.4/lib/rack/request.rb:3)
    at org.jruby.RubyKernel.require(org/jruby/RubyKernel.java:1038)
    at #.(root)(/Users/stone/.rvm/gems/jruby-1.6.5/gems/rack-1.2.4/lib/rack/showexceptions.rb:1)
    at org.jruby.RubyKernel.require(org/jruby/RubyKernel.java:1038)
    at #.(root)(/Users/stone/.rvm/gems/jruby-1.6.5/gems/sinatra-1.2.7/lib/sinatra/showexceptions.rb:6)
    at org.jruby.RubyKernel.require(org/jruby/RubyKernel.java:1038)
    at #.(root)(/Users/stone/.rvm/gems/jruby-1.6.5/gems/sinatra-1.2.7/lib/sinatra/base.rb:1)
    at org.jruby.RubyKernel.require(org/jruby/RubyKernel.java:1038)
    at Bundler::Runtime.require(/Users/stone/.rvm/gems/jruby-1.6.5/gems/resque-1.19.0/lib/resque/server.rb:68)
    at org.jruby.RubyArray.each(org/jruby/RubyArray.java:1612)
    at Bundler::Runtime.require(/Users/stone/.rvm/gems/jruby-1.6.5/gems/bundler-1.0.21/lib/bundler/runtime.rb:66)
    at org.jruby.RubyArray.each(org/jruby/RubyArray.java:1612)
    at Bundler::Runtime.require(/Users/stone/.rvm/gems/jruby-1.6.5/gems/bundler-1.0.21/lib/bundler/runtime.rb:55)
    at #.require(/Users/stone/.rvm/gems/jruby-1.6.5/gems/bundler-1.0.21/lib/bundler.rb:122)
    at #.(root)(/Users/stone/code/redstorm-topology/lib/redstorm-topology/boot.rb:17)
    at org.jruby.RubyKernel.require(org/jruby/RubyKernel.java:1038)
    at Kernel.require(/Users/stone/code/redstorm-topology/lib/redstorm-topology/boot.rb:55)
    at #.(root)(./lib/redstorm-topology/event_topology.rb:4)
    at org.jruby.RubyKernel.require(org/jruby/RubyKernel.java:1038)
    at Kernel.require(./lib/redstorm-topology/event_topology.rb:55)
    at #.main(/Users/stone/.rvm/gems/jruby-1.6.5/gems/redstorm-0.2.1/lib/red_storm/topology_launcher.rb:42)

I am using Bundler to manage gem dependencies; boot.rb is just copied and pasted from Rails:

require 'rubygems'

# Set up gems listed in the Gemfile.
PROJECT_ROOT = File.expand_path('../../..', __FILE__)

gemfile = File.join(PROJECT_ROOT, 'Gemfile')
begin
  ENV['BUNDLE_GEMFILE'] = gemfile
  require 'bundler'
  Bundler.setup
rescue Bundler::GemNotFound => e
  STDERR.puts e.message
  STDERR.puts "Try running `bundle install`."
  exit!
end if File.exist?(gemfile)

Bundler.require(:default)

BTW: I think it would be great if you could create a redstorm starter project to demonstrate the project source code layout, how to manage dependencies, and how to use JRuby together with Java to develop Storm topologies. (redstorm itself is a great example, but another one that focuses on the usage of redstorm would be awesome.)

Thanks for your help

Stone
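The "already initialized constant Config" warning at the top of the log hints at a possible cause: a top-level Config constant being replaced by Storm's Config class, so that later lookups of Config::CONFIG (as in Win32API.rb) resolve against the Java class. This is an assumption, not something the report confirms. A sketch of keeping an imported class inside a namespace instead, where FakeStorm and SketchNamespace are hypothetical names:

```ruby
require 'rbconfig'

# Hypothetical stand-in for an imported Java class such as Storm's Config.
module FakeStorm
  class Config
    def initialize
      @settings = {}
    end
  end
end

# Namespaced alias: the imported class is reachable without defining
# or replacing any top-level constant named Config.
module SketchNamespace
  Config = FakeStorm::Config
end

storm_config = SketchNamespace::Config.new
host_os = RbConfig::CONFIG['host_os']

puts storm_config.class
puts host_os
```

Keeping the alias inside a module leaves the interpreter's own configuration constants untouched for libraries that expect them.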
