killrweather / killrweather

KillrWeather is a reference application (work in progress) showing how to easily integrate streaming and batch data processing with Apache Spark Streaming, Apache Cassandra, Apache Kafka and Akka for fast, streaming computations on time series data in asynchronous event-driven environments.

License: Apache License 2.0

Scala 100.00%

killrweather's Introduction

KillrWeather

KillrWeather is a reference application (which we are constantly improving) showing how to easily leverage and integrate Apache Spark, Apache Cassandra, and Apache Kafka for fast, streaming computations in asynchronous Akka event-driven environments. This application focuses on the use case of time series data.

Sample Use Case

I need fast access to historical data on the fly for predictive modeling with real-time data from the stream.

Basic Samples

Basic Spark, Kafka, Cassandra Samples

Reference Application

KillrWeather Main App

Time Series Data

The use of time series data for business analysis is not new. What is new is the ability to collect and analyze massive volumes of data in sequence at extremely high velocity to get the clearest picture to predict and forecast future market changes, user behavior, environmental conditions, resource consumption, health trends and much, much more.

Apache Cassandra is a NoSQL database platform particularly suited for these types of Big Data challenges. Cassandra’s data model is an excellent fit for handling data in sequence regardless of data type or size. When writing data to Cassandra, data is sorted and written sequentially to disk. When retrieving data by row key and then by range, you get a fast and efficient access pattern due to minimal disk seeks – time series data is an excellent fit for this type of pattern. Apache Cassandra allows businesses to identify meaningful characteristics in their time series data as fast as possible to make clear decisions about expected future outcomes.
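
To make that access pattern concrete, here is a minimal sketch using the Spark Cassandra Connector, assuming the raw_weather_data table from create-timeseries.cql (keyed by weather station id plus year/month/day and clustered in time order); the station id is illustrative:

import org.apache.spark.{SparkConf, SparkContext}
import com.datastax.spark.connector._

object RangeReadSketch extends App {
  val conf = new SparkConf()
    .setMaster("local[*]")
    .setAppName("range-read-sketch")
    .set("spark.cassandra.connection.host", "127.0.0.1")
  val sc = new SparkContext(conf)

  // One station-day is stored together and sorted on disk, so this
  // predicate is served by a minimal-seek sequential read.
  val oneDay = sc.cassandraTable("isd_weather_data", "raw_weather_data")
    .where("wsid = ? AND year = ? AND month = ? AND day = ?",
      "725030:14732", 2008, 1, 1)

  oneDay.collect().foreach(println)
  sc.stop()
}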

There are many flavors of time series data. Some can be windowed in the stream; others cannot be windowed in the stream because queries are not by time slice but by a specific year, month, day, and hour. Spark Streaming lets you do both.
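
As a sketch of the first style, here is a self-contained Spark Streaming example that windows a keyed stream; the socket source and the "wsid,temperature" line format are stand-ins for the app's Kafka feed:

import org.apache.spark.SparkConf
import org.apache.spark.streaming.{Seconds, StreamingContext}

object WindowingSketch extends App {
  val conf = new SparkConf().setMaster("local[2]").setAppName("windowing-sketch")
  val ssc = new StreamingContext(conf, Seconds(5))

  val readings = ssc.socketTextStream("localhost", 9999)
    .map(_.split(","))
    .map(a => (a(0), a(1).toDouble)) // (wsid, temperature)

  // Windowed in the stream: max temperature per station over the last hour,
  // recomputed every minute. Queries by a specific year/month/day/hour are
  // instead served from the raw table in Cassandra.
  readings
    .reduceByKeyAndWindow((a: Double, b: Double) => math.max(a, b),
      Seconds(3600), Seconds(60))
    .print()

  ssc.start()
  ssc.awaitTermination()
}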

Start Here

Clone the repo

git clone https://github.com/killrweather/killrweather.git
cd killrweather

Build the code

If this is your first time running SBT, you will be downloading the internet.

cd killrweather
sbt compile
# For IntelliJ users, this creates IntelliJ project files, but as of
# version 14.x you should not need it; just import a new sbt project.
sbt gen-idea

Setup (for Linux & Mac) - 3 Steps

1. Download the latest Cassandra and unpack the archive.

2. Start Cassandra - you may need to prepend with sudo, or chown /var/lib/cassandra. On the command line:

./apache-cassandra-{version}/bin/cassandra -f

3. Run the setup CQL scripts to create the schema and populate the weather stations table. On the command line, start a cqlsh shell:

cd /path/to/killrweather/data
path/to/apache-cassandra-{version}/bin/cqlsh

Setup (for Windows) - 3 Steps

  1. Download the latest Cassandra and double click the installer.

  2. Choose to run Cassandra automatically during start-up.

  3. Run the setup CQL scripts to create the schema and populate the weather stations table. On the command line, start a cqlsh shell:

    cd c:/path/to/killrweather
    c:/path/to/cassandra/bin/cqlsh

In CQL Shell:

You should see:

 Connected to Test Cluster at 127.0.0.1:9042.
 [cqlsh {latest.version} | Cassandra {latest.version} | CQL spec {latest.version} | Native protocol {latest.version}]
 Use HELP for help.
 cqlsh>

Run the scripts, then keep the CQL shell open for querying once the apps are running:

 cqlsh> source 'create-timeseries.cql';
 cqlsh> source 'load-timeseries.cql';

Run

Logging

You will see this in all 3 app shells because log4j has been explicitly taken off the classpath:

log4j:WARN No appenders could be found for logger (kafka.utils.VerifiableProperties).
log4j:WARN Please initialize the log4j system properly.

What we are really trying to isolate here is what is happening in the apps with regard to the event stream. You can add a log4j configuration locally if you want one.

To change any package log levels and see more activity, simply modify the logging configuration (for instance, the logback.xml on each app's classpath).
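
If you prefer to adjust levels programmatically rather than edit config files, a small sketch (assuming logback-classic remains the SLF4J binding, as the app logs in the issues below show) looks like this:

import org.slf4j.LoggerFactory
import ch.qos.logback.classic.{Level, Logger}

object LogLevels {
  // Call early in an app's main. The casts are safe only while
  // logback-classic is the SLF4J binding on the classpath.
  def apply(): Unit = {
    LoggerFactory.getLogger("org.apache.spark").asInstanceOf[Logger].setLevel(Level.WARN)
    LoggerFactory.getLogger("com.datastax.killrweather").asInstanceOf[Logger].setLevel(Level.DEBUG)
  }
}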

From Command Line

1. Start KillrWeather

cd /path/to/killrweather
sbt app/run

As the KillrWeather app initializes, you will see the Akka Cluster start, followed by the embedded ZooKeeper and Kafka servers.

For all three apps, at load time you will see the Akka Cluster node join and metrics collection start. In a deployment with multiple nodes of each app, the cluster would use the health of each node for load balancing as the rest of the cluster nodes join:

2. Start the Kafka data feed app. In a second shell, run:

sbt clients/run

You should see:

Multiple main classes detected, select one to run:

[1] com.datastax.killrweather.KafkaDataIngestionApp
[2] com.datastax.killrweather.KillrWeatherClientApp

Select KafkaDataIngestionApp and watch the shells for activity. You can stop the data feed or let it keep running. After a few seconds you should see data arriving; confirm it by entering this in the cqlsh shell:

cqlsh> select * from isd_weather_data.raw_weather_data;

This confirms that the ingestion app has published data to Kafka, and that the KillrWeatherApp is streaming the raw data through Spark into Cassandra.

cqlsh> select * from isd_weather_data.daily_aggregate_precip;

Unfortunately the precipitation values are mostly 0 in the samples (To Do).
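
The write path these queries confirm looks roughly like the following sketch. It is not the app's actual code - the socket source, the RawReading type, and the column subset are illustrative stand-ins:

import org.apache.spark.SparkConf
import org.apache.spark.streaming.{Seconds, StreamingContext}
import com.datastax.spark.connector._
import com.datastax.spark.connector.streaming._

// Hypothetical parsed reading matching a subset of raw_weather_data's columns.
case class RawReading(wsid: String, year: Int, month: Int, day: Int,
                      hour: Int, temperature: Double)

object SaveToCassandraSketch extends App {
  val conf = new SparkConf().setMaster("local[2]").setAppName("save-sketch")
    .set("spark.cassandra.connection.host", "127.0.0.1")
  val ssc = new StreamingContext(conf, Seconds(5))

  // The real feed is Kafka; a socket source of CSV lines is stubbed in here.
  ssc.socketTextStream("localhost", 9999)
    .map(_.split(","))
    .map(a => RawReading(a(0), a(1).toInt, a(2).toInt, a(3).toInt, a(4).toInt, a(5).toDouble))
    .saveToCassandra("isd_weather_data", "raw_weather_data",
      SomeColumns("wsid", "year", "month", "day", "hour", "temperature"))

  ssc.start()
  ssc.awaitTermination()
}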

3. Open a third shell, run the same command again, and this time select KillrWeatherClientApp:

sbt clients/run

This API client runs queries against the raw and the aggregated data from the Kafka stream. It sends requests for varying locations and dates/times, and some of these trigger further aggregations at compute time, which are also saved to Cassandra (a minimal sketch of the request/response cycle follows this list):

  • current weather
  • daily temperatures
  • monthly temperatures
  • monthly high and low temperatures
  • daily precipitation
  • top-k precipitation
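
Here is a self-contained sketch of that request/response pattern; the message and actor types are illustrative stand-ins, not the app's actual weather-event protocol:

import akka.actor.{Actor, ActorSystem, Props}
import akka.pattern.ask
import akka.util.Timeout
import scala.concurrent.Await
import scala.concurrent.duration._

// Hypothetical messages; the real app's protocol differs in detail.
case class GetCurrentWeather(wsid: String)
case class CurrentWeather(wsid: String, temperature: Double)

class GuardianStub extends Actor {
  def receive = {
    case GetCurrentWeather(wsid) =>
      // The real app answers from the Kafka/Spark/Cassandra pipeline; stubbed here.
      sender() ! CurrentWeather(wsid, -3.1)
  }
}

object ClientSketch extends App {
  val system = ActorSystem("sketch")
  val guardian = system.actorOf(Props[GuardianStub], "guardian")
  implicit val timeout = Timeout(5.seconds)

  val reply = Await.result(guardian ? GetCurrentWeather("725030:14732"), timeout.duration)
  println(reply) // CurrentWeather(725030:14732,-3.1)

  system.terminate() // Akka 2.4+; older releases used system.shutdown()
}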

Next I will add some forecasting with ML :)

Watch the app and client activity as weather data and aggregation data flow through request and response. Because querying the API triggers further aggregation of the originally aggregated daily roll-ups, you can now see a new tier of temperature and precipitation aggregation in the CQL shell:

cqlsh> select * from isd_weather_data.daily_aggregate_temperature;
cqlsh> select * from isd_weather_data.daily_aggregate_precip;

From an IDE

  1. Run the app com.datastax.killrweather.KillrWeatherApp
  2. Run the kafka data ingestion server com.datastax.killrweather.KafkaDataIngestionApp
  3. Run the API client com.datastax.killrweather.KillrWeatherClientApp

To close the cql shell:

cqlsh> quit;

killrweather's People

Contributors

chbatey, evanvolgas, helena, mslinn, pmcfadin


killrweather's Issues

ZooKeeper fails to start

Hi,
I am trying to start the application, but it seems that ZooKeeper or Akka fails to start. The log is below; how can I fix this?

Thank you

root@debian:/git/killrweather# sbt app/run
[info] Loading project definition from /root/git/killrweather/project
[warn] There may be incompatibilities among your library dependencies.
[warn] Here are some of the libraries that were evicted:
[warn] * com.typesafe.sbt:sbt-git:0.6.2 -> 0.6.4
[warn] Run 'evicted' to see detailed eviction warnings
[info] Set current project to KillrWeather (in build file:/root/git/killrweather/)
[info] Running com.datastax.killrweather.KillrWeatherApp
SLF4J: Class path contains multiple SLF4J bindings.
SLF4J: Found binding in [jar:file:/root/.ivy2/cache/ch.qos.logback/logback-classic/jars/logback-classic-1.0.13.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: Found binding in [jar:file:/root/.ivy2/cache/org.slf4j/slf4j-log4j12/jars/slf4j-log4j12-1.6.1.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation.
SLF4J: Actual binding is of type [ch.qos.logback.classic.util.ContextSelectorStaticBinder]
[INFO] [2015-12-15 16:40:52,251] [akka.event.slf4j.Slf4jLogger]: Slf4jLogger started
[INFO] [2015-12-15 16:40:52,435] [Remoting]: Starting remoting
[INFO] [2015-12-15 16:40:53,215] [Remoting]: Remoting started; listening on addresses :[akka.tcp://[email protected]:2550]
[INFO] [2015-12-15 16:40:53,255] [Cluster(akka://KillrWeather)]: Cluster Node [akka.tcp://[email protected]:2550] - Starting up...
[INFO] [2015-12-15 16:40:53,476] [Cluster(akka://KillrWeather)]: Cluster Node [akka.tcp://[email protected]:2550] - Registered cluster JMX MBean [akka:type=Cluster]
[INFO] [2015-12-15 16:40:53,478] [Cluster(akka://KillrWeather)]: Cluster Node [akka.tcp://[email protected]:2550] - Started up successfully
[INFO] [2015-12-15 16:40:53,523] [Cluster(akka://KillrWeather)]: Cluster Node [akka.tcp://[email protected]:2550] - No seed-nodes configured, manual cluster join required
[INFO] [2015-12-15 16:40:53,652] [Cluster(akka://KillrWeather)]: Cluster Node [akka.tcp://[email protected]:2550] - Metrics collection has started successfully
error java.net.BindException: Cannot assign requested address
java.net.BindException: Cannot assign requested address
at sun.nio.ch.Net.bind0(Native Method)
at sun.nio.ch.Net.bind(Net.java:433)
at sun.nio.ch.Net.bind(Net.java:425)
at sun.nio.ch.ServerSocketChannelImpl.bind(ServerSocketChannelImpl.java:223)
at sun.nio.ch.ServerSocketAdaptor.bind(ServerSocketAdaptor.java:74)
at sun.nio.ch.ServerSocketAdaptor.bind(ServerSocketAdaptor.java:67)
at org.apache.zookeeper.server.NIOServerCnxnFactory.configure(NIOServerCnxnFactory.java:95)
at com.datastax.spark.connector.embedded.EmbeddedZookeeper.<init>(EmbeddedZookeeper.scala:51)
at com.datastax.spark.connector.embedded.EmbeddedKafka.<init>(EmbeddedKafka.scala:23)
at com.datastax.spark.connector.embedded.EmbeddedKafka.<init>(EmbeddedKafka.scala:15)
at com.datastax.spark.connector.embedded.EmbeddedKafka.<init>(EmbeddedKafka.scala:20)
at com.datastax.killrweather.KillrWeather.<init>(KillrWeatherApp.scala:77)
at com.datastax.killrweather.KillrWeather$.createExtension(KillrWeatherApp.scala:54)
at com.datastax.killrweather.KillrWeather$.createExtension(KillrWeatherApp.scala:50)
at akka.actor.ActorSystemImpl.registerExtension(ActorSystem.scala:713)
at akka.actor.ExtensionId$class.apply(Extension.scala:79)
at com.datastax.killrweather.KillrWeather$.apply(KillrWeatherApp.scala:50)
at com.datastax.killrweather.KillrWeatherApp$delayedInit$body.apply(KillrWeatherApp.scala:46)
at scala.Function0$class.apply$mcV$sp(Function0.scala:40)
at scala.runtime.AbstractFunction0.apply$mcV$sp(AbstractFunction0.scala:12)
at scala.App$$anonfun$main$1.apply(App.scala:71)
at scala.App$$anonfun$main$1.apply(App.scala:71)
at scala.collection.immutable.List.foreach(List.scala:318)
at scala.collection.generic.TraversableForwarder$class.foreach(TraversableForwarder.scala:32)
at scala.App$class.main(App.scala:71)
at com.datastax.killrweather.KillrWeatherApp$.main(KillrWeatherApp.scala:38)
at com.datastax.killrweather.KillrWeatherApp.main(KillrWeatherApp.scala)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:497)
[trace] Stack trace suppressed: run last app/*:run for the full output.

Data Ingestion using HTTP POST

I have followed all the instructions to run the app and the clients. I wanted to use the REST interface to ingest data, so in the KafkaDataIngestionApp I commented out the following lines to stop the initial data ingestion:
/*
/* Handles initial data ingestion in Kafka for running as a demo. */
for (fs <- initialData; data <- fs.data) {
  log.info("Sending {} to Kafka", data)
  router ! KafkaMessageEnvelope[String, String](KafkaTopic, KafkaKey, data)
}
*/
When I then used the following command to POST the data, I was not able to connect. I have changed the host name to the loopback address, changed the port number, etc., but it still gives the same connection-refused problem.

curl -v -X POST --header "X-DATA-FEED: ./data/load/sf-2008.csv.gz" http://localhost:8080/weather/data

* Adding handle: conn: 0x7f9f00803a00
* Adding handle: send: 0
* Adding handle: recv: 0
* Curl_addHandleToPipeline: length: 1
* - Conn 0 (0x7f9f00803a00) send_pipe: 1, recv_pipe: 0
* About to connect() to localhost port 8080 (#0)
*   Trying ::1...
*   Trying 127.0.0.1...
*   Trying fe80::1...
* Failed connect to localhost:8080; Connection refused
* Closing connection 0
curl: (7) Failed connect to localhost:8080; Connection refused

Can you please help?

App and clients are running on different IPs in the as-is configuration

Hi - thanks for this awesome application.

I followed the step-by-step guide (with C* 2.0.x on Mac OS X 10.9.5), and when starting the ingestion client, it does not find the app, because the app listens on 192.168.2.103 while the clients expect it to be on 127.0.0.1.

Changing to the app IP in killrweather-clients/src/main/resources/reference.conf solved that for the moment. You might want to align them, though :-)

start from a new Cassandra

I am trying to run the examples.
First, I got the newest Cassandra, 3.0.3, and hit this error:

16/03/02 11:29:57 ERROR NIOServerCnxnFactory: Thread Thread[main,5,main] died
java.io.IOException: Failed to open native connection to Cassandra at {192.168.211.128}:9042
at com.datastax.spark.connector.cql.CassandraConnector$.com$datastax$spark$connector$cql$CassandraConnector$$createSession(CassandraConnector.scala:181)
at com.datastax.spark.connector.cql.CassandraConnector$$anonfun$2.apply(CassandraConnector.scala:167)
at com.datastax.spark.connector.cql.CassandraConnector$$anonfun$2.apply(CassandraConnector.scala:167)
at com.datastax.spark.connector.cql.RefCountedCache.createNewValueAndKeys(RefCountedCache.scala:31)
at com.datastax.spark.connector.cql.RefCountedCache.acquire(RefCountedCache.scala:56)
at com.datastax.spark.connector.cql.CassandraConnector.openSession(CassandraConnector.scala:76)
at com.datastax.spark.connector.cql.CassandraConnector.withSessionDo(CassandraConnector.scala:104)

Then I got Cassandra 2.1, and it works.
We actually use Cassandra 3.0.x in the test environment; what can I do to fix this?
What if I upgrade the other dependencies?
I'm new to Spark, so please tell me step by step.
Many thanks.

Paging for weather stations

At the moment the response is limited to 100 stations, because otherwise the message would be too big for Akka. Implement paging so the dashboard can get all of the stations.

Unable to connect to Cassandra

I'm trying to run the demo. After importing the data into Cassandra I try to run with sbt app/run. Instead of being presented with three choices of which class to run, I am given an error. Some lines omitted:

vagrant@vagrant-ubuntu-trusty-64:~/killrweather$ sbt app/run
[info] Loading project definition from /home/vagrant/killrweather/project
[warn] There may be incompatibilities among your library dependencies.
[warn] Here are some of the libraries that were evicted:
[warn]  * com.typesafe.sbt:sbt-git:0.6.2 -> 0.6.4
[warn] Run 'evicted' to see detailed eviction warnings
[info] Set current project to KillrWeather (in build file:/home/vagrant/killrweather/)
[info] Running com.datastax.killrweather.KillrWeatherApp 
SLF4J: Class path contains multiple SLF4J bindings.
SLF4J: Found binding in [jar:file:/home/vagrant/.ivy2/cache/ch.qos.logback/logback-classic/jars/logback-classic-1.0.13.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: Found binding in [jar:file:/home/vagrant/.ivy2/cache/org.slf4j/slf4j-log4j12/jars/slf4j-log4j12-1.6.1.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation.
SLF4J: Actual binding is of type [ch.qos.logback.classic.util.ContextSelectorStaticBinder]
[INFO] [2015-09-16 15:00:59,279] [akka.event.slf4j.Slf4jLogger]: Slf4jLogger started
[INFO] [2015-09-16 15:00:59,676] [Remoting]: Starting remoting
[INFO] [2015-09-16 15:01:00,019] [Remoting]: Remoting started; listening on addresses :[akka.tcp://[email protected]:2550]
[INFO] [2015-09-16 15:01:00,042] [Cluster(akka://KillrWeather)]: Cluster Node [akka.tcp://[email protected]:2550] - Starting up...
[INFO] [2015-09-16 15:01:00,161] [Cluster(akka://KillrWeather)]: Cluster Node [akka.tcp://[email protected]:2550] - Registered cluster JMX MBean [akka:type=Cluster]
[INFO] [2015-09-16 15:01:00,162] [Cluster(akka://KillrWeather)]: Cluster Node [akka.tcp://[email protected]:2550] - Started up successfully
[INFO] [2015-09-16 15:01:00,189] [Cluster(akka://KillrWeather)]: Cluster Node [akka.tcp://[email protected]:2550] - No seed-nodes configured, manual cluster join required
[INFO] [2015-09-16 15:01:00,674] [Cluster(akka://KillrWeather)]: Cluster Node [akka.tcp://[email protected]:2550] - Metrics collection has started successfully
ZooKeeperServer isRunning: true
log4j:WARN No appenders could be found for logger (kafka.utils.VerifiableProperties).
log4j:WARN Please initialize the log4j system properly.
Starting the Kafka server at 10.0.2.15:2181
[INFO] [2015-09-16 15:01:07,192] [akka.event.slf4j.Slf4jLogger]: Slf4jLogger started
[INFO] [2015-09-16 15:01:07,220] [Remoting]: Starting remoting
[INFO] [2015-09-16 15:01:07,235] [Remoting]: Remoting started; listening on addresses :[akka.tcp://sparkDriver@vagrant-ubuntu-trusty-64:36445]
[INFO] [2015-09-16 15:01:07,667] [org.spark-project.jetty.server.Server]: jetty-8.y.z-SNAPSHOT
[INFO] [2015-09-16 15:01:07,719] [org.spark-project.jetty.server.AbstractConnector]: Started [email protected]:36072
[INFO] [2015-09-16 15:01:08,041] [org.spark-project.jetty.server.Server]: jetty-8.y.z-SNAPSHOT
[INFO] [2015-09-16 15:01:08,080] [org.spark-project.jetty.server.AbstractConnector]: Started [email protected]:4040
[INFO] [2015-09-16 15:01:09,078] [Cluster(akka://KillrWeather)]: Cluster Node [akka.tcp://[email protected]:2550] - Node [akka.tcp://[email protected]:2550] is JOINING, roles [analytics]
[INFO] [2015-09-16 15:01:09,099] [com.datastax.killrweather.NodeGuardian]: Starting at akka.tcp://[email protected]:2550
[INFO] [2015-09-16 15:01:09,100] [Cluster(akka://KillrWeather)]: Cluster Node [akka.tcp://[email protected]:2550] - Trying to join seed nodes [akka.tcp://[email protected]:2550] when already part of a cluster, ignoring
[INFO] [2015-09-16 15:01:09,197] [Cluster(akka://KillrWeather)]: Cluster Node [akka.tcp://[email protected]:2550] - Leader is moving node [akka.tcp://[email protected]:2550] to [Up]
[DEBUG] [2015-09-16 15:01:09,203] [com.datastax.killrweather.NodeGuardian]: Member [akka.tcp://[email protected]:2550] joined cluster.
[ERROR] [2015-09-16 15:01:11,059] [akka.actor.OneForOneStrategy]: Failed to open native connection to Cassandra at {10.0.2.15}:9042
akka.actor.ActorInitializationException: exception during creation

Cassandra is running

vagrant@vagrant-ubuntu-trusty-64:~/killrweather$ ps aux | grep cassandra
vagrant   1463  2.2 11.0 1693652 227192 pts/0  Sl+  14:55   0:11 java -ea -javaagent:./cassandra-2.2.1/bin/../lib/jamm-0.3.0.jar -XX:+CMSClassUnloadingEnabled -XX:+UseThreadPriorities -XX:ThreadPriorityPolicy=42 -Xms1000M -Xmx1000M -Xmn100M -XX:+HeapDumpOnOutOfMemoryError -Xss256k -XX:StringTableSize=1000003 -XX:+UseParNewGC -XX:+UseConcMarkSweepGC -XX:+CMSParallelRemarkEnabled -XX:SurvivorRatio=8 -XX:MaxTenuringThreshold=1 -XX:CMSInitiatingOccupancyFraction=75 -XX:+UseCMSInitiatingOccupancyOnly -XX:+UseTLAB -XX:+PerfDisableSharedMem -XX:CompileCommandFile=./cassandra-2.2.1/bin/../conf/hotspot_compiler -XX:CMSWaitDuration=10000 -XX:+CMSParallelInitialMarkEnabled -XX:+CMSEdenChunksRecordAlways -XX:CMSWaitDuration=10000 -XX:+UseCondCardMark -Djava.net.preferIPv4Stack=true -Dcassandra.jmx.local.port=7199 -XX:+DisableExplicitGC -Djava.library.path=./cassandra-2.2.1/bin/../lib/sigar-bin -Dlogback.configurationFile=logback.xml -Dcassandra.logdir=./cassandra-2.2.1/bin/../logs -Dcassandra.storagedir=./cassandra-2.2.1/bin/../data -Dcassandra-foreground=yes -cp ./cassandra-2.2.1/bin/../conf:./cassandra-2.2.1/bin/../build/classes/main:./cassandra-2.2.1/bin/../build/classes/thrift:./cassandra-2.2.1/bin/../lib/ST4-4.0.8.jar:./cassandra-2.2.1/bin/../lib/airline-0.6.jar:./cassandra-2.2.1/bin/../lib/antlr-runtime-3.5.2.jar:./cassandra-2.2.1/bin/../lib/apache-cassandra-2.2.1.jar:./cassandra-2.2.1/bin/../lib/apache-cassandra-clientutil-2.2.1.jar:./cassandra-2.2.1/bin/../lib/apache-cassandra-thrift-2.2.1.jar:./cassandra-2.2.1/bin/../lib/cassandra-driver-core-2.2.0-rc2-SNAPSHOT-20150617-shaded.jar:./cassandra-2.2.1/bin/../lib/commons-cli-1.1.jar:./cassandra-2.2.1/bin/../lib/commons-codec-1.2.jar:./cassandra-2.2.1/bin/../lib/commons-lang3-3.1.jar:./cassandra-2.2.1/bin/../lib/commons-math3-3.2.jar:./cassandra-2.2.1/bin/../lib/compress-lzf-0.8.4.jar:./cassandra-2.2.1/bin/../lib/concurrentlinkedhashmap-lru-1.4.jar:./cassandra-2.2.1/bin/../lib/crc32ex-0.1.1.jar:./cassandra-2.2.1/bin/../lib/disruptor-3.0.1.jar:./cassandra-2.2.1/bin/../lib/ecj-4.4.2.jar:./cassandra-2.2.1/bin/../lib/guava-16.0.jar:./cassandra-2.2.1/bin/../lib/high-scale-lib-1.0.6.jar:./cassandra-2.2.1/bin/../lib/jackson-core-asl-1.9.2.jar:./cassandra-2.2.1/bin/../lib/jackson-mapper-asl-1.9.2.jar:./cassandra-2.2.1/bin/../lib/jamm-0.3.0.jar:./cassandra-2.2.1/bin/../lib/javax.inject.jar:./cassandra-2.2.1/bin/../lib/jbcrypt-0.3m.jar:./cassandra-2.2.1/bin/../lib/jcl-over-slf4j-1.7.7.jar:./cassandra-2.2.1/bin/../lib/jna-4.0.0.jar:./cassandra-2.2.1/bin/../lib/joda-time-2.4.jar:./cassandra-2.2.1/bin/../lib/json-simple-1.1.jar:./cassandra-2.2.1/bin/../lib/libthrift-0.9.2.jar:./cassandra-2.2.1/bin/../lib/log4j-over-slf4j-1.7.7.jar:./cassandra-2.2.1/bin/../lib/logback-classic-1.1.3.jar:./cassandra-2.2.1/bin/../lib/logback-core-1.1.3.jar:./cassandra-2.2.1/bin/../lib/lz4-1.3.0.jar:./cassandra-2.2.1/bin/../lib/metrics-core-3.1.0.jar:./cassandra-2.2.1/bin/../lib/metrics-logback-3.1.0.jar:./cassandra-2.2.1/bin/../lib/netty-all-4.0.23.Final.jar:./cassandra-2.2.1/bin/../lib/ohc-core-0.3.4.jar:./cassandra-2.2.1/bin/../lib/ohc-core-j8-0.3.4.jar:./cassandra-2.2.1/bin/../lib/reporter-config-base-3.0.0.jar:./cassandra-2.2.1/bin/../lib/reporter-config3-3.0.0.jar:./cassandra-2.2.1/bin/../lib/sigar-1.6.4.jar:./cassandra-2.2.1/bin/../lib/slf4j-api-1.7.7.jar:./cassandra-2.2.1/bin/../lib/snakeyaml-1.11.jar:./cassandra-2.2.1/bin/../lib/snappy-java-1.1.1.7.jar:./cassandra-2.2.1/bin/../lib/stream-2.5.2.jar:./cassandra-2.2.1/bin/../lib/super-csv-2.1.0.jar:./cassandra-2.2.1/bin/../lib
/thrift-server-0.3.7.jar:./cassandra-2.2.1/bin/../lib/jsr223/*/*.jar org.apache.cassandra.service.CassandraDaemon
vagrant   2137  0.0  0.0  10460   936 pts/1    S+   15:04   0:00 grep --color=auto cassandra

Chunked time series and Spark

Hi,

I'm working on a system which has to deal with time series data. I've been happy using Cassandra for time series and Spark looks promising as a computational platform. The killrweather app seems like a good place to start.

I consider chunking time series in Cassandra necessary, e.g. by 3 weeks as kairosdb does it. This allows an 8 byte chunk start timestamp with 4 byte offsets for the individual measurements. And it keeps the data below 2x10^9 (I've understood this to be a hard limit in Cassandra) even at 1000 Hz.

This schema works quite okay when dealing with one time series at a time. Because the data is partitioned by time series id and chunk of time (e.g. the three weeks mentioned above), it requires a little client side logic to retrieve the partitions and glue them together, but this is quite okay.

However, when working with many or all of the time series in a table at once, e.g. in Spark, the story changes dramatically. Say I want to compute something as simple as a moving average: I have to deal with data all over the place. I can't currently think of anything but performing aggregateByKey, causing a shuffle every time; the best I can come up with is something like the sketch below.
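
A toy sketch of that shuffle-heavy approach, with an in-memory RDD of (seriesId, (timestamp, value)) rows standing in for the real chunked table scan:

import org.apache.spark.{SparkConf, SparkContext}

object ChunkedAggregationSketch extends App {
  val sc = new SparkContext(
    new SparkConf().setMaster("local[*]").setAppName("chunked-agg"))

  val readings = sc.parallelize(Seq(
    ("series-a", (1000L, 20.1)), ("series-a", (2000L, 20.7)),
    ("series-b", (1000L, 5.3)),  ("series-b", (2000L, 5.9))))

  // Per-series mean: every chunk of every series gets shuffled to its reducer.
  val means = readings
    .aggregateByKey((0.0, 0L))(
      { case ((sum, n), (_, v)) => (sum + v, n + 1) },
      { case ((s1, n1), (s2, n2)) => (s1 + s2, n1 + n2) })
    .mapValues { case (sum, n) => sum / n }

  means.collect().foreach(println)
  sc.stop()
}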

Could you indicate how killrweather would be transformed if it had to deal with raw data at a higher measurement frequency than once per hour, i.e. if the partition key in create-timeseries.cql had a time component in the partition part (e.g. year or month ...).

Cheers,
Frens Jan

Getting unresolved dependencies while building with SBT

Hi,

I have the following versions installed on my Windows 10 PC:

Scala : 2.11.7

SBT : 0.13

I have cloned the project and added a build.sbt under killrweather. I have the following config in build.sbt:

name := "killrweather"

version := "1.0"

scalaVersion := "2.11.7"

resolvers += Classpaths.typesafeResolver

addSbtPlugin("org.typelevel" % "sbt-typelevel" % "0.3.1")

addSbtPlugin("com.typesafe.sbt" % "sbt-git" % "0.6.4")

addSbtPlugin("com.eed3si9n" % "sbt-assembly" % "0.11.3")

addSbtPlugin("com.scalapenos" % "sbt-prompt" % "0.2.1")

I navigated to this path and tried to run the sbt compile command from the command prompt.

I am getting the dependency resolution error below. Here is the console output:

[warn] There may be incompatibilities among your library dependencies.
[warn] Here are some of the libraries that were evicted:
[warn] * com.typesafe.sbt:sbt-git:0.6.2 -> 0.6.4
[warn] Run 'evicted' to see detailed eviction warnings
[info] Loading project definition from C:\Users\puneethkumar\git\killrweather\project
[warn] There may be incompatibilities among your library dependencies.
[warn] Here are some of the libraries that were evicted:
[warn] * com.typesafe.sbt:sbt-git:0.6.2 -> 0.6.4
[warn] Run 'evicted' to see detailed eviction warnings
[info] Set current project to killrweather (in build file:/C:/Users/puneethkumar/git/killrweather/)
[info] Updating {file:/C:/Users/puneethkumar/git/killrweather/}root...
[warn] module not found: org.typelevel#sbt-typelevel;0.3.1
[warn] ==== local: tried
[warn] C:\Users\puneethkumar\.ivy2\local\org.typelevel\sbt-typelevel\scala_2.11\sbt_0.13\0.3.1\ivys\ivy.xml
[warn] ==== public: tried
[warn] https://repo1.maven.org/maven2/org/typelevel/sbt-typelevel_2.11_0.13/0.3.1/sbt-typelevel-0.3.1.pom
[warn] ==== typesafe-ivy-releases: tried
[warn] https://repo.typesafe.com/typesafe/ivy-releases/org.typelevel/sbt-typelevel/scala_2.11/sbt_0.13/0.3.1/ivys/ivy.xml
[warn] module not found: com.typesafe.sbt#sbt-git;0.6.4
[warn] ==== local: tried
[warn] C:\Users\puneethkumar\.ivy2\local\com.typesafe.sbt\sbt-git\scala_2.11\sbt_0.13\0.6.4\ivys\ivy.xml
[warn] ==== public: tried
[warn] https://repo1.maven.org/maven2/com/typesafe/sbt/sbt-git_2.11_0.13/0.6.4/sbt-git-0.6.4.pom
[warn] ==== typesafe-ivy-releases: tried
[warn] https://repo.typesafe.com/typesafe/ivy-releases/com.typesafe.sbt/sbt-git/scala_2.11/sbt_0.13/0.6.4/ivys/ivy.xml
[warn] module not found: com.eed3si9n#sbt-assembly;0.11.3
[warn] ==== local: tried
[warn] C:\Users\puneethkumar\.ivy2\local\com.eed3si9n\sbt-assembly\scala_2.11\sbt_0.13\0.11.3\ivys\ivy.xml
[warn] ==== public: tried
[warn] https://repo1.maven.org/maven2/com/eed3si9n/sbt-assembly_2.11_0.13/0.11.3/sbt-assembly-0.11.3.pom
[warn] ==== typesafe-ivy-releases: tried
[warn] https://repo.typesafe.com/typesafe/ivy-releases/com.eed3si9n/sbt-assembly/scala_2.11/sbt_0.13/0.11.3/ivys/ivy.xml
[warn] module not found: com.scalapenos#sbt-prompt;0.2.1
[warn] ==== local: tried
[warn] C:\Users\puneethkumar\.ivy2\local\com.scalapenos\sbt-prompt\scala_2.11\sbt_0.13\0.2.1\ivys\ivy.xml
[warn] ==== public: tried
[warn] https://repo1.maven.org/maven2/com/scalapenos/sbt-prompt_2.11_0.13/0.2.1/sbt-prompt-0.2.1.pom
[warn] ==== typesafe-ivy-releases: tried
[warn] https://repo.typesafe.com/typesafe/ivy-releases/com.scalapenos/sbt-prompt/scala_2.11/sbt_0.13/0.2.1/ivys/ivy.xml
[warn] ::::::::::::::::::::::::::::::::::::::::::::::
[warn] :: UNRESOLVED DEPENDENCIES ::
[warn] ::::::::::::::::::::::::::::::::::::::::::::::
[warn] :: org.typelevel#sbt-typelevel;0.3.1: not found
[warn] :: com.typesafe.sbt#sbt-git;0.6.4: not found
[warn] :: com.eed3si9n#sbt-assembly;0.11.3: not found
[warn] :: com.scalapenos#sbt-prompt;0.2.1: not found
[warn] ::::::::::::::::::::::::::::::::::::::::::::::
[warn]
[warn] Note: Some unresolved dependencies have extra attributes. Check that these dependencies exist with the requested attributes.
[warn] org.typelevel:sbt-typelevel:0.3.1 (scalaVersion=2.11, sbtVersion=0.13)
[warn] com.typesafe.sbt:sbt-git:0.6.4 (scalaVersion=2.11, sbtVersion=0.13)
[warn] com.eed3si9n:sbt-assembly:0.11.3 (scalaVersion=2.11, sbtVersion=0.13)
[warn] com.scalapenos:sbt-prompt:0.2.1 (scalaVersion=2.11, sbtVersion=0.13)
[warn]
[warn] Note: Unresolved dependencies path:
[warn] org.typelevel:sbt-typelevel:0.3.1 (scalaVersion=2.11, sbtVersion=0.13) (C:\Users\puneethkumar\git\killrweather\build.sbt#L9-10)
[warn] +- com.datastax.killrweather:killrweather_2.11:1.0
[warn] com.typesafe.sbt:sbt-git:0.6.4 (scalaVersion=2.11, sbtVersion=0.13) (C:\Users\puneethkumar\git\killrweather\build.sbt#L11-12)
[warn] +- com.datastax.killrweather:killrweather_2.11:1.0
[warn] com.eed3si9n:sbt-assembly:0.11.3 (scalaVersion=2.11, sbtVersion=0.13) (C:\Users\puneethkumar\git\killrweather\build.sbt#L13-14)
[warn] +- com.datastax.killrweather:killrweather_2.11:1.0
[warn] com.scalapenos:sbt-prompt:0.2.1 (scalaVersion=2.11, sbtVersion=0.13) (C:\Users\puneethkumar\git\killrweather\build.sbt#L15-16)
[warn] +- com.datastax.killrweather:killrweather_2.11:1.0
[warn] Scala version was updated by one of library dependencies:
[warn] * org.scala-lang:scala-compiler:2.10.0 -> 2.10.4
[warn] To force scalaVersion, add the following:
[warn] ivyScala := ivyScala.value map { _.copy(overrideScalaVersion = true) }
[warn] Run 'evicted' to see detailed eviction warnings
sbt.ResolveException: unresolved dependency: org.typelevel#sbt-typelevel;0.3.1: not found
unresolved dependency: com.typesafe.sbt#sbt-git;0.6.4: not found
unresolved dependency: com.eed3si9n#sbt-assembly;0.11.3: not found
unresolved dependency: com.scalapenos#sbt-prompt;0.2.1: not found
at sbt.IvyActions$.sbt$IvyActions$$resolve(IvyActions.scala:278)
at sbt.IvyActions$$anonfun$updateEither$1.apply(IvyActions.scala:175)
at sbt.IvyActions$$anonfun$updateEither$1.apply(IvyActions.scala:157)
at sbt.IvySbt$Module$$anonfun$withModule$1.apply(Ivy.scala:151)
at sbt.IvySbt$Module$$anonfun$withModule$1.apply(Ivy.scala:151)
at sbt.IvySbt$$anonfun$withIvy$1.apply(Ivy.scala:128)
at sbt.IvySbt.sbt$IvySbt$$action$1(Ivy.scala:56)
at sbt.IvySbt$$anon$4.call(Ivy.scala:64)
at xsbt.boot.Locks$GlobalLock.withChannel$1(Locks.scala:93)
at xsbt.boot.Locks$GlobalLock.xsbt$boot$Locks$GlobalLock$$withChannelRetries$1(Locks.scala:78)
at xsbt.boot.Locks$GlobalLock$$anonfun$withFileLock$1.apply(Locks.scala:97)
at xsbt.boot.Using$.withResource(Using.scala:10)
at xsbt.boot.Using$.apply(Using.scala:9)
at xsbt.boot.Locks$GlobalLock.ignoringDeadlockAvoided(Locks.scala:58)
at xsbt.boot.Locks$GlobalLock.withLock(Locks.scala:48)
at xsbt.boot.Locks$.apply0(Locks.scala:31)
at xsbt.boot.Locks$.apply(Locks.scala:28)
at sbt.IvySbt.withDefaultLogger(Ivy.scala:64)
at sbt.IvySbt.withIvy(Ivy.scala:123)
at sbt.IvySbt.withIvy(Ivy.scala:120)
at sbt.IvySbt$Module.withModule(Ivy.scala:151)
at sbt.IvyActions$.updateEither(IvyActions.scala:157)
at sbt.Classpaths$$anonfun$sbt$Classpaths$$work$1$1.apply(Defaults.scala:1318)
at sbt.Classpaths$$anonfun$sbt$Classpaths$$work$1$1.apply(Defaults.scala:1315)
at sbt.Classpaths$$anonfun$doWork$1$1$$anonfun$85.apply(Defaults.scala:1345)
at sbt.Classpaths$$anonfun$doWork$1$1$$anonfun$85.apply(Defaults.scala:1343)
at sbt.Tracked$$anonfun$lastOutput$1.apply(Tracked.scala:35)
at sbt.Classpaths$$anonfun$doWork$1$1.apply(Defaults.scala:1348)
at sbt.Classpaths$$anonfun$doWork$1$1.apply(Defaults.scala:1342)
at sbt.Tracked$$anonfun$inputChanged$1.apply(Tracked.scala:45)
at sbt.Classpaths$.cachedUpdate(Defaults.scala:1360)
at sbt.Classpaths$$anonfun$updateTask$1.apply(Defaults.scala:1300)
at sbt.Classpaths$$anonfun$updateTask$1.apply(Defaults.scala:1275)
at scala.Function1$$anonfun$compose$1.apply(Function1.scala:47)
at sbt.$tilde$greater$$anonfun$$u2219$1.apply(TypeFunctions.scala:40)
at sbt.std.Transform$$anon$4.work(System.scala:63)
at sbt.Execute$$anonfun$submit$1$$anonfun$apply$1.apply(Execute.scala:226)
at sbt.Execute$$anonfun$submit$1$$anonfun$apply$1.apply(Execute.scala:226)
at sbt.ErrorHandling$.wideConvert(ErrorHandling.scala:17)
at sbt.Execute.work(Execute.scala:235)
at sbt.Execute$$anonfun$submit$1.apply(Execute.scala:226)
at sbt.Execute$$anonfun$submit$1.apply(Execute.scala:226)
at sbt.ConcurrentRestrictions$$anon$4$$anonfun$1.apply(ConcurrentRestrictions.scala:159)
at sbt.CompletionService$$anon$2.call(CompletionService.scala:28)
at java.util.concurrent.FutureTask.run(FutureTask.java:266)
at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
at java.util.concurrent.FutureTask.run(FutureTask.java:266)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
at java.lang.Thread.run(Thread.java:745)
error sbt.ResolveException: unresolved dependency: org.typelevel#sbt-typelevel;0.3.1: not found
[error] unresolved dependency: com.typesafe.sbt#sbt-git;0.6.4: not found
[error] unresolved dependency: com.eed3si9n#sbt-assembly;0.11.3: not found
[error] unresolved dependency: com.scalapenos#sbt-prompt;0.2.1: not found
[error] Total time: 5 s, completed Oct 21, 2015 2:49:06 AM

Can anyone help with this issue? Thanks in advance.

DSE Spark failed to start

./dse spark
Spark assembly has been built with Hive, including Datanucleus jars on classpath
Exception in thread "main" java.lang.RuntimeException: java.lang.reflect.InvocationTargetException
    at org.apache.hadoop.util.ReflectionUtils.newInstance(ReflectionUtils.java:131)
    at org.apache.hadoop.security.Groups.<init>(Groups.java:64)
    at org.apache.hadoop.security.Groups.getUserToGroupsMappingService(Groups.java:240)
    at org.apache.hadoop.security.UserGroupInformation.initialize(UserGroupInformation.java:255)
    at org.apache.hadoop.security.UserGroupInformation.ensureInitialized(UserGroupInformation.java:232)
    at org.apache.hadoop.security.UserGroupInformation.loginUserFromSubject(UserGroupInformation.java:718)
    at org.apache.hadoop.security.UserGroupInformation.getLoginUser(UserGroupInformation.java:703)
    at org.apache.hadoop.security.UserGroupInformation.getCurrentUser(UserGroupInformation.java:605)
    at org.apache.spark.util.Utils$$anonfun$getCurrentUserName$1.apply(Utils.scala:1996)
    at org.apache.spark.util.Utils$$anonfun$getCurrentUserName$1.apply(Utils.scala:1996)
    at scala.Option.getOrElse(Option.scala:120)
    at org.apache.spark.util.Utils$.getCurrentUserName(Utils.scala:1996)
    at org.apache.spark.SecurityManager.<init>(SecurityManager.scala:207)
    at org.apache.spark.repl.SparkIMain.<init>(SparkIMain.scala:118)
    at org.apache.spark.repl.SparkILoop$SparkILoopInterpreter.<init>(SparkILoop.scala:187)
    at org.apache.spark.repl.SparkILoop.createInterpreter(SparkILoop.scala:216)
    at org.apache.spark.repl.SparkILoop$$anonfun$org$apache$spark$repl$SparkILoop$$process$1.apply$mcZ$sp(SparkILoop.scala:948)
    at org.apache.spark.repl.SparkILoop$$anonfun$org$apache$spark$repl$SparkILoop$$process$1.apply(SparkILoop.scala:944)
    at org.apache.spark.repl.SparkILoop$$anonfun$org$apache$spark$repl$SparkILoop$$process$1.apply(SparkILoop.scala:944)
    at scala.tools.nsc.util.ScalaClassLoader$.savingContextLoader(ScalaClassLoader.scala:135)
    at org.apache.spark.repl.SparkILoop.org$apache$spark$repl$SparkILoop$$process(SparkILoop.scala:944)
    at org.apache.spark.repl.SparkILoop.process(SparkILoop.scala:1058)
    at org.apache.spark.repl.Main$.main(Main.scala:31)
    at org.apache.spark.repl.Main.main(Main.scala)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:497)
    at org.apache.spark.deploy.SparkSubmit$.org$apache$spark$deploy$SparkSubmit$$runMain(SparkSubmit.scala:569)
    at org.apache.spark.deploy.SparkSubmit$.doRunMain$1(SparkSubmit.scala:166)
    at org.apache.spark.deploy.SparkSubmit$.submit(SparkSubmit.scala:189)
    at org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:110)
    at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala)
Caused by: java.lang.reflect.InvocationTargetException
    at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
    at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:62)
    at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
    at java.lang.reflect.Constructor.newInstance(Constructor.java:422)
    at org.apache.hadoop.util.ReflectionUtils.newInstance(ReflectionUtils.java:129)
    ... 32 more
Caused by: java.lang.UnsatisfiedLinkError: org.apache.hadoop.security.JniBasedUnixGroupsMapping.anchorNative()V
    at org.apache.hadoop.security.JniBasedUnixGroupsMapping.anchorNative(Native Method)
    at org.apache.hadoop.security.JniBasedUnixGroupsMapping.<clinit>(JniBasedUnixGroupsMapping.java:49)
    at org.apache.hadoop.security.JniBasedUnixGroupsMappingWithFallback.<init>(JniBasedUnixGroupsMappingWithFallback.java:38)
    ... 37 more

Missing dependency or reference to hadoop library or binary (Windows)

Running sbt on windows environment results in the following:

$ sbt app/run

[ERROR] [2015-05-11 14:25:17,117] [org.apache.hadoop.util.Shell]: Failed to locate the winutils binary in the hadoop binary path
java.io.IOException: Could not locate executable null\bin\winutils.exe in the Hadoop binaries.

Log output:

[INFO] [2015-05-11 14:25:14,524] [Remoting]: Starting remoting
[INFO] [2015-05-11 14:25:14,535] [Remoting]: Remoting started; listening on addresses :[akka.tcp://[email protected]:55917]
[INFO] [2015-05-11 14:25:15,642] [com.datastax.killrweather.NodeGuardian]: Starting at akka.tcp://[email protected]:2550
[INFO] [2015-05-11 14:25:15,653] [Cluster(akka://KillrWeather)]: Cluster Node [akka.tcp://[email protected]:2550] - Node [akka.tcp://[email protected]:2550] is JOINING, roles []
[INFO] [2015-05-11 14:25:16,431] [Cluster(akka://KillrWeather)]: Cluster Node [akka.tcp://[email protected]:2550] - Leader is moving node [akka.tcp://Killr[email protected]:2550] to [Up]
[INFO] [2015-05-11 14:25:16,433] [com.datastax.killrweather.NodeGuardian]: Member akka.tcp://[email protected]:2550 joined cluster.
[INFO] [2015-05-11 14:25:16,498] [com.datastax.killrweather.NodeGuardian]: Node is transitioning from 'uninitialized' to 'initialized'

Time: 1431368717000 ms

[ERROR] [2015-05-11 14:25:17,117] [org.apache.hadoop.util.Shell]: Failed to locate the winutils binary in the hadoop binary path
java.io.IOException: Could not locate executable null\bin\winutils.exe in the Hadoop binaries.
at org.apache.hadoop.util.Shell.getQualifiedBinPath(Shell.java:278) [hadoop-common-2.2.0.jar:na]
at org.apache.hadoop.util.Shell.getWinUtilsPath(Shell.java:300) [hadoop-common-2.2.0.jar:na]
at org.apache.hadoop.util.Shell.<clinit>(Shell.java:293) [hadoop-common-2.2.0.jar:na]
at org.apache.hadoop.fs.RawLocalFileSystem.setPermission(RawLocalFileSystem.java:639) [hadoop-common-2.2.0.jar:na]
at org.apache.hadoop.fs.FilterFileSystem.setPermission(FilterFileSystem.java:468) [hadoop-common-2.2.0.jar:na]
at org.apache.hadoop.fs.ChecksumFileSystem.create(ChecksumFileSystem.java:456) [hadoop-common-2.2.0.jar:na]
at org.apache.hadoop.fs.ChecksumFileSystem.create(ChecksumFileSystem.java:424) [hadoop-common-2.2.0.jar:na]
at org.apache.hadoop.fs.FileSystem.create(FileSystem.java:905) [hadoop-common-2.2.0.jar:na]
at org.apache.hadoop.fs.FileSystem.create(FileSystem.java:886) [hadoop-common-2.2.0.jar:na]
at org.apache.hadoop.fs.FileSystem.create(FileSystem.java:783) [hadoop-common-2.2.0.jar:na]
at org.apache.hadoop.fs.FileSystem.create(FileSystem.java:772) [hadoop-common-2.2.0.jar:na]
at org.apache.spark.streaming.CheckpointWriter$CheckpointWriteHandler.run(Checkpoint.scala:135) [spark-streaming_2.10-1.2.1.jar:1.2.1]
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145) [na:1.7.0_51]
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615) [na:1.7.0_51]
at java.lang.Thread.run(Thread.java:744) [na:1.7.0_51]
error java.lang.NullPointerException
java.lang.NullPointerException
at java.lang.ProcessBuilder.start(ProcessBuilder.java:1010)
at org.apache.hadoop.util.Shell.runCommand(Shell.java:404)
at org.apache.hadoop.util.Shell.run(Shell.java:379)
at org.apache.hadoop.util.Shell$ShellCommandExecutor.execute(Shell.java:589)
at org.apache.hadoop.util.Shell.execCommand(Shell.java:678)
at org.apache.hadoop.util.Shell.execCommand(Shell.java:661)
at org.apache.hadoop.fs.RawLocalFileSystem.setPermission(RawLocalFileSystem.java:639)
at org.apache.hadoop.fs.FilterFileSystem.setPermission(FilterFileSystem.java:468)
at org.apache.hadoop.fs.ChecksumFileSystem.create(ChecksumFileSystem.java:456)
at org.apache.hadoop.fs.ChecksumFileSystem.create(ChecksumFileSystem.java:424)
at org.apache.hadoop.fs.FileSystem.create(FileSystem.java:905)
at org.apache.hadoop.fs.FileSystem.create(FileSystem.java:886)
at org.apache.hadoop.fs.FileSystem.create(FileSystem.java:783)
at org.apache.hadoop.fs.FileSystem.create(FileSystem.java:772)
at org.apache.spark.streaming.CheckpointWriter$CheckpointWriteHandler.run(Checkpoint.scala:135)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
at java.lang.Thread.run(Thread.java:744)
[trace] Stack trace suppressed: run last app/*:run for the full output.
error java.lang.NullPointerException
java.lang.NullPointerException
at java.lang.ProcessBuilder.start(ProcessBuilder.java:1010)
at org.apache.hadoop.util.Shell.runCommand(Shell.java:404)
at org.apache.hadoop.util.Shell.run(Shell.java:379)
at org.apache.hadoop.util.Shell$ShellCommandExecutor.execute(Shell.java:589)
at org.apache.hadoop.util.Shell.execCommand(Shell.java:678)
at org.apache.hadoop.util.Shell.execCommand(Shell.java:661)
at org.apache.hadoop.fs.RawLocalFileSystem.setPermission(RawLocalFileSystem.java:639)
at org.apache.hadoop.fs.FilterFileSystem.setPermission(FilterFileSystem.java:468)
at org.apache.hadoop.fs.ChecksumFileSystem.create(ChecksumFileSystem.java:456)
at org.apache.hadoop.fs.ChecksumFileSystem.create(ChecksumFileSystem.java:424)
at org.apache.hadoop.fs.FileSystem.create(FileSystem.java:905)
at org.apache.hadoop.fs.FileSystem.create(FileSystem.java:886)
at org.apache.hadoop.fs.FileSystem.create(FileSystem.java:783)
at org.apache.hadoop.fs.FileSystem.create(FileSystem.java:772)
at org.apache.spark.streaming.CheckpointWriter$CheckpointWriteHandler.run(Checkpoint.scala:135)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
at java.lang.Thread.run(Thread.java:744)

[trace] Stack trace suppressed: run last app/*:run for the full output.

Time: 1431368717500 ms

Issue connecting a Spark cluster and Cassandra

Well, it is weird to connect to a Spark cluster and Cassandra.
I have Cassandra installed on my computer.

**First, I run this code in IntelliJ IDEA:**

val conf = new SparkConf(true)
  .set("spark.cassandra.connection.host", "127.0.0.1")

val sc = new SparkContext("local[*]", "test", conf)
val table: CassandraRDD[CassandraRow] = sc.cassandraTable("system_traces", "events")

val rowCount = table.count()
table.toLocalIterator foreach println

println(s"Total Rows in Table: $rowCount")
sc.stop()

Everything is OK.

Then I connect to the Spark cluster and Cassandra:

val conf = new SparkConf(true)
  .set("spark.cassandra.connection.host", "192.168.211.128")

val sc = new SparkContext("spark://192.168.210.47:7077", "test", conf)
val table: CassandraRDD[CassandraRow] = sc.cassandraTable("system_traces", "events")

val rowCount = table.count()
table.toLocalIterator foreach println

println(s"Total Rows in Table: $rowCount")
sc.stop()

IDE returns:
Using Spark's default log4j profile: org/apache/spark/log4j-defaults.properties
16/03/04 02:07:07 INFO SparkContext: Running Spark version 1.3.1
16/03/04 02:07:07 WARN NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
16/03/04 02:07:07 INFO SecurityManager: Changing view acls to: root
16/03/04 02:07:07 INFO SecurityManager: Changing modify acls to: root
16/03/04 02:07:07 INFO SecurityManager: SecurityManager: authentication disabled; ui acls disabled; users with view permissions: Set(root); users with modify permissions: Set(root)
16/03/04 02:07:08 INFO Slf4jLogger: Slf4jLogger started
16/03/04 02:07:08 INFO Remoting: Starting remoting
16/03/04 02:07:08 INFO Remoting: Remoting started; listening on addresses :[akka.tcp://sparkDriver@Master:37901]
16/03/04 02:07:08 INFO Utils: Successfully started service 'sparkDriver' on port 37901.
16/03/04 02:07:08 INFO SparkEnv: Registering MapOutputTracker
16/03/04 02:07:08 INFO SparkEnv: Registering BlockManagerMaster
16/03/04 02:07:08 INFO DiskBlockManager: Created local directory at /tmp/spark-1f1e100b-21a7-4485-99cf-19f89303419b/blockmgr-1556269c-ecae-486c-8366-ed10d1c61a8a
16/03/04 02:07:08 INFO MemoryStore: MemoryStore started with capacity 958.2 MB
16/03/04 02:07:08 INFO HttpFileServer: HTTP File server directory is /tmp/spark-837f5fcb-0047-4b87-b307-a58f379177cc/httpd-d44d74b1-2d37-4b5b-bd10-5ebe138ec972
16/03/04 02:07:08 INFO HttpServer: Starting HTTP Server
16/03/04 02:07:08 INFO Server: jetty-8.y.z-SNAPSHOT
16/03/04 02:07:08 INFO AbstractConnector: Started [email protected]:35873
16/03/04 02:07:08 INFO Utils: Successfully started service 'HTTP file server' on port 35873.
16/03/04 02:07:08 INFO SparkEnv: Registering OutputCommitCoordinator
16/03/04 02:07:13 INFO Server: jetty-8.y.z-SNAPSHOT
16/03/04 02:07:13 INFO AbstractConnector: Started [email protected]:4040
16/03/04 02:07:13 INFO Utils: Successfully started service 'SparkUI' on port 4040.
16/03/04 02:07:13 INFO SparkUI: Started SparkUI at http://Master:4040
16/03/04 02:07:13 INFO AppClient$ClientActor: Connecting to master akka.tcp://[email protected]:7077/user/Master...
16/03/04 02:07:33 INFO AppClient$ClientActor: Connecting to master akka.tcp://[email protected]:7077/user/Master...
16/03/04 02:07:53 INFO AppClient$ClientActor: Connecting to master akka.tcp://[email protected]:7077/user/Master...
16/03/04 02:08:13 ERROR SparkDeploySchedulerBackend: Application has been killed. Reason: All masters are unresponsive! Giving up.
16/03/04 02:08:13 WARN SparkDeploySchedulerBackend: Application ID is not initialized yet.
16/03/04 02:08:13 ERROR TaskSchedulerImpl: Exiting due to error from cluster scheduler: All masters are unresponsive! Giving up.

Then I change the Cassandra address:

val conf = new SparkConf(true)
  .set("spark.cassandra.connection.host", "192.168.211.128")

val sc = new SparkContext("local[*]", "test", conf)

IDE returns:
Using Spark's default log4j profile: org/apache/spark/log4j-defaults.properties
16/03/04 02:11:53 INFO SparkContext: Running Spark version 1.3.1
16/03/04 02:11:53 WARN NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
16/03/04 02:11:53 INFO SecurityManager: Changing view acls to: root
16/03/04 02:11:53 INFO SecurityManager: Changing modify acls to: root
16/03/04 02:11:53 INFO SecurityManager: SecurityManager: authentication disabled; ui acls disabled; users with view permissions: Set(root); users with modify permissions: Set(root)
16/03/04 02:11:54 INFO Slf4jLogger: Slf4jLogger started
16/03/04 02:11:54 INFO Remoting: Starting remoting
16/03/04 02:11:54 INFO Remoting: Remoting started; listening on addresses :[akka.tcp://sparkDriver@Master:60079]
16/03/04 02:11:54 INFO Utils: Successfully started service 'sparkDriver' on port 60079.
16/03/04 02:11:54 INFO SparkEnv: Registering MapOutputTracker
16/03/04 02:11:54 INFO SparkEnv: Registering BlockManagerMaster
16/03/04 02:11:54 INFO DiskBlockManager: Created local directory at /tmp/spark-2a5256d6-03a1-4f6f-92b4-a35eae7937a9/blockmgr-e116d3d6-7499-4e0e-b22f-e33c4d9b29dd
16/03/04 02:11:54 INFO MemoryStore: MemoryStore started with capacity 958.2 MB
16/03/04 02:11:54 INFO HttpFileServer: HTTP File server directory is /tmp/spark-411693ab-5fc3-4a6a-a186-24354b381129/httpd-32279b97-4e69-4491-b6f7-2926cb2edc4d
16/03/04 02:11:54 INFO HttpServer: Starting HTTP Server
16/03/04 02:11:54 INFO Server: jetty-8.y.z-SNAPSHOT
16/03/04 02:11:54 INFO AbstractConnector: Started [email protected]:55059
16/03/04 02:11:54 INFO Utils: Successfully started service 'HTTP file server' on port 55059.
16/03/04 02:11:54 INFO SparkEnv: Registering OutputCommitCoordinator
16/03/04 02:11:59 INFO Server: jetty-8.y.z-SNAPSHOT
16/03/04 02:11:59 INFO AbstractConnector: Started [email protected]:4040
16/03/04 02:11:59 INFO Utils: Successfully started service 'SparkUI' on port 4040.
16/03/04 02:11:59 INFO SparkUI: Started SparkUI at http://Master:4040
16/03/04 02:11:59 INFO Executor: Starting executor ID <driver> on host localhost
16/03/04 02:11:59 INFO AkkaUtils: Connecting to HeartbeatReceiver: akka.tcp://sparkDriver@Master:60079/user/HeartbeatReceiver
16/03/04 02:11:59 INFO NettyBlockTransferService: Server created on 55233
16/03/04 02:11:59 INFO BlockManagerMaster: Trying to register BlockManager
16/03/04 02:11:59 INFO BlockManagerMasterActor: Registering block manager localhost:55233 with 958.2 MB RAM, BlockManagerId(<driver>, localhost, 55233)
16/03/04 02:11:59 INFO BlockManagerMaster: Registered BlockManager
Exception in thread "main" java.io.IOException: Failed to open native connection to Cassandra at {192.168.211.128}:9042
at com.datastax.spark.connector.cql.CassandraConnector$.com$datastax$spark$connector$cql$CassandraConnector$$createSession(CassandraConnector.scala:181)
at com.datastax.spark.connector.cql.CassandraConnector$$anonfun$2.apply(CassandraConnector.scala:167)
at com.datastax.spark.connector.cql.CassandraConnector$$anonfun$2.apply(CassandraConnector.scala:167)
at com.datastax.spark.connector.cql.RefCountedCache.createNewValueAndKeys(RefCountedCache.scala:31)
at com.datastax.spark.connector.cql.RefCountedCache.acquire(RefCountedCache.scala:56)
at com.datastax.spark.connector.cql.CassandraConnector.openSession(CassandraConnector.scala:76)
at com.datastax.spark.connector.cql.CassandraConnector.withSessionDo(CassandraConnector.scala:104)
at com.datastax.spark.connector.cql.CassandraConnector.withClusterDo(CassandraConnector.scala:115)
at com.datastax.spark.connector.cql.Schema$.fromCassandra(Schema.scala:243)
at com.datastax.spark.connector.rdd.CassandraTableRowReaderProvider$class.tableDef(CassandraTableRowReaderProvider.scala:49)
at com.datastax.spark.connector.rdd.CassandraTableScanRDD.tableDef$lzycompute(CassandraTableScanRDD.scala:59)
at com.datastax.spark.connector.rdd.CassandraTableScanRDD.tableDef(CassandraTableScanRDD.scala:59)
at com.datastax.spark.connector.rdd.CassandraTableRowReaderProvider$class.verify(CassandraTableRowReaderProvider.scala:148)
at com.datastax.spark.connector.rdd.CassandraTableScanRDD.verify(CassandraTableScanRDD.scala:59)
at com.datastax.spark.connector.rdd.CassandraTableScanRDD.getPartitions(CassandraTableScanRDD.scala:118)
at org.apache.spark.rdd.RDD$$anonfun$partitions$2.apply(RDD.scala:219)
at org.apache.spark.rdd.RDD$$anonfun$partitions$2.apply(RDD.scala:217)
at scala.Option.getOrElse(Option.scala:120)
at org.apache.spark.rdd.RDD.partitions(RDD.scala:217)
at org.apache.spark.SparkContext.runJob(SparkContext.scala:1535)
at org.apache.spark.rdd.RDD.reduce(RDD.scala:900)
at com.datastax.spark.connector.rdd.CassandraTableScanRDD.count(CassandraTableScanRDD.scala:246)
at cassandraTest$delayedInit$body.apply(cassandraTest.scala:16)
at scala.Function0$class.apply$mcV$sp(Function0.scala:40)
at scala.runtime.AbstractFunction0.apply$mcV$sp(AbstractFunction0.scala:12)
at scala.App$$anonfun$main$1.apply(App.scala:71)
at scala.App$$anonfun$main$1.apply(App.scala:71)
at scala.collection.immutable.List.foreach(List.scala:318)
at scala.collection.generic.TraversableForwarder$class.foreach(TraversableForwarder.scala:32)
at scala.App$class.main(App.scala:71)
at cassandraTest$.main(cassandraTest.scala:9)
at cassandraTest.main(cassandraTest.scala)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:497)
at com.intellij.rt.execution.application.AppMain.main(AppMain.java:140)
Caused by: com.datastax.driver.core.exceptions.NoHostAvailableException: All host(s) tried for query failed (tried: /192.168.211.128:9042 (com.datastax.driver.core.TransportException: [/192.168.211.128:9042] Cannot connect))
at com.datastax.driver.core.ControlConnection.reconnectInternal(ControlConnection.java:223)
at com.datastax.driver.core.ControlConnection.connect(ControlConnection.java:78)
at com.datastax.driver.core.Cluster$Manager.init(Cluster.java:1230)
at com.datastax.driver.core.Cluster.getMetadata(Cluster.java:333)
at com.datastax.spark.connector.cql.CassandraConnector$.com$datastax$spark$connector$cql$CassandraConnector$$createSession(CassandraConnector.scala:174)
... 36 more

With the local Cassandra address and the Spark cluster:

val conf = new SparkConf(true)
  .set("spark.cassandra.connection.host", "127.0.0.1")

val sc = new SparkContext("spark://192.168.210.47:7077", "test", conf)

IDE returns:
Using Spark's default log4j profile: org/apache/spark/log4j-defaults.properties
16/03/04 02:19:06 INFO SparkContext: Running Spark version 1.3.1
16/03/04 02:19:06 WARN NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
16/03/04 02:19:06 INFO SecurityManager: Changing view acls to: root
16/03/04 02:19:06 INFO SecurityManager: Changing modify acls to: root
16/03/04 02:19:06 INFO SecurityManager: SecurityManager: authentication disabled; ui acls disabled; users with view permissions: Set(root); users with modify permissions: Set(root)
16/03/04 02:19:07 INFO Slf4jLogger: Slf4jLogger started
16/03/04 02:19:07 INFO Remoting: Starting remoting
16/03/04 02:19:07 INFO Remoting: Remoting started; listening on addresses :[akka.tcp://sparkDriver@Master:42883]
16/03/04 02:19:07 INFO Utils: Successfully started service 'sparkDriver' on port 42883.
16/03/04 02:19:07 INFO SparkEnv: Registering MapOutputTracker
16/03/04 02:19:07 INFO SparkEnv: Registering BlockManagerMaster
16/03/04 02:19:07 INFO DiskBlockManager: Created local directory at /tmp/spark-4a2d101b-f7aa-4d7e-b295-8da39377fcd4/blockmgr-c8d8ae90-10a4-4015-87fa-b627bb46427c
16/03/04 02:19:07 INFO MemoryStore: MemoryStore started with capacity 958.2 MB
16/03/04 02:19:07 INFO HttpFileServer: HTTP File server directory is /tmp/spark-e01aabb3-1cdd-4b1a-8ef6-db376dc55143/httpd-d535d35f-96ac-48a4-ad31-de4180f57085
16/03/04 02:19:07 INFO HttpServer: Starting HTTP Server
16/03/04 02:19:07 INFO Server: jetty-8.y.z-SNAPSHOT
16/03/04 02:19:07 INFO AbstractConnector: Started [email protected]:56722
16/03/04 02:19:07 INFO Utils: Successfully started service 'HTTP file server' on port 56722.
16/03/04 02:19:07 INFO SparkEnv: Registering OutputCommitCoordinator
16/03/04 02:19:12 INFO Server: jetty-8.y.z-SNAPSHOT
16/03/04 02:19:12 INFO AbstractConnector: Started [email protected]:4040
16/03/04 02:19:12 INFO Utils: Successfully started service 'SparkUI' on port 4040.
16/03/04 02:19:12 INFO SparkUI: Started SparkUI at http://Master:4040
16/03/04 02:19:12 INFO AppClient$ClientActor: Connecting to master akka.tcp://[email protected]:7077/user/Master...
16/03/04 02:19:32 INFO AppClient$ClientActor: Connecting to master akka.tcp://[email protected]:7077/user/Master...
16/03/04 02:19:52 INFO AppClient$ClientActor: Connecting to master akka.tcp://[email protected]:7077/user/Master...
16/03/04 02:20:12 ERROR SparkDeploySchedulerBackend: Application has been killed. Reason: All masters are unresponsive! Giving up.
16/03/04 02:20:12 WARN SparkDeploySchedulerBackend: Application ID is not initialized yet.
16/03/04 02:20:12 ERROR TaskSchedulerImpl: Exiting due to error from cluster scheduler: All masters are unresponsive! Giving up.

Generally speaking, whenever I connect to a Spark or Cassandra instance that is not on my own machine, the IDE returns an error.

Can somebody please help me?

Granulated timestamp partition key rather than one solid timestamp

I've looked at the data model for each of the tables. Why did you make a granulated timestamp (year, month, day) part of the partition key rather than using a single full timestamp as the partition key? Doesn't that make reads expensive by reading through three partitions at once?
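
For context, here is a minimal sketch of the access pattern this bucketing enables; the table layout below is an assumption for illustration, not a copy of the repo's actual CQL. With the station id and a coarse time component in the partition key, a query scoped to one station and one time bucket is a single-partition read that the connector can push down:

    import com.datastax.spark.connector._

    // Assumed layout: the partition key includes (wsid, year), with (month, day,
    // hour) as clustering columns, so one station + one year lands in one partition.
    val oneYear = sc.cassandraTable("isd_weather_data", "raw_weather_data")
      .where("wsid = ? AND year = ?", "724940:23234", 2008)

By contrast, if the full timestamp were the sole partition key, every distinct timestamp would be its own partition, and any time-range read would have to fan out across many partitions instead of scanning one sequentially.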

Getting com.datastax.driver.core.exceptions.NoHostAvailableException

I am using DSE Cassandra (cqlsh 5.0.1 | Cassandra 3.0.8.1293 | DSE 5.0.2) but getting a "com.datastax.driver.core.exceptions.NoHostAvailableException" error when Akka tries to connect to the Cassandra database. I have updated the rpc_address and broadcast_rpc_address parameters in the yaml file as shown below. Any thoughts on what could be wrong or how to determine the cause?

$ sbt app/run -Dcassandra.connection.host=127.0.0.1
[info] Loading project definition from /root/killrweather/project
[warn] There may be incompatibilities among your library dependencies.
[warn] Here are some of the libraries that were evicted:
[warn] * com.typesafe.sbt:sbt-git:0.6.2 -> 0.6.4
[warn] Run 'evicted' to see detailed eviction warnings
[info] Set current project to KillrWeather (in build file:/root/killrweather/)
[info] Running com.datastax.killrweather.KillrWeatherApp
SLF4J: Class path contains multiple SLF4J bindings.
SLF4J: Found binding in [jar:file:/root/.ivy2/cache/ch.qos.logback/logback-classic/jars/logback-classic-1.0.13.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: Found binding in [jar:file:/root/.ivy2/cache/org.slf4j/slf4j-log4j12/jars/slf4j-log4j12-1.6.1.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation.
SLF4J: Actual binding is of type [ch.qos.logback.classic.util.ContextSelectorStaticBinder]
[INFO] [2016-09-23 14:27:48,486] [akka.event.slf4j.Slf4jLogger]: Slf4jLogger started
[INFO] [2016-09-23 14:27:48,531] [Remoting]: Starting remoting
[INFO] [2016-09-23 14:27:48,801] [Remoting]: Remoting started; listening on addresses :[akka.tcp://[email protected]:2550]
[INFO] [2016-09-23 14:27:48,810] [Cluster(akka://KillrWeather)]: Cluster Node [akka.tcp://[email protected]:2550] - Starting up...
[INFO] [2016-09-23 14:27:48,856] [Cluster(akka://KillrWeather)]: Cluster Node [akka.tcp://[email protected]:2550] - Registered cluster JMX MBean [akka:type=Cluster]
[INFO] [2016-09-23 14:27:48,856] [Cluster(akka://KillrWeather)]: Cluster Node [akka.tcp://[email protected]:2550] - Started up successfully
[INFO] [2016-09-23 14:27:48,864] [Cluster(akka://KillrWeather)]: Cluster Node [akka.tcp://[email protected]:2550] - No seed-nodes configured, manual cluster join required
[INFO] [2016-09-23 14:27:48,898] [Cluster(akka://KillrWeather)]: Cluster Node [akka.tcp://[email protected]:2550] - Metrics collection has started successfully
ZooKeeperServer isRunning: true
log4j:WARN No appenders could be found for logger (kafka.utils.VerifiableProperties).
log4j:WARN Please initialize the log4j system properly.
Starting the Kafka server at 198.159.220.11:2181
[INFO] [2016-09-23 14:27:54,293] [akka.event.slf4j.Slf4jLogger]: Slf4jLogger started
[INFO] [2016-09-23 14:27:54,296] [Remoting]: Starting remoting
[INFO] [2016-09-23 14:27:54,302] [Remoting]: Remoting started; listening on addresses :[akka.tcp://[email protected]:43119]
[INFO] [2016-09-23 14:27:54,392] [org.spark-project.jetty.server.Server]: jetty-8.y.z-SNAPSHOT
[INFO] [2016-09-23 14:27:54,401] [org.spark-project.jetty.server.AbstractConnector]: Started [email protected]:59834
[INFO] [2016-09-23 14:27:54,498] [org.spark-project.jetty.server.Server]: jetty-8.y.z-SNAPSHOT
[INFO] [2016-09-23 14:27:54,506] [org.spark-project.jetty.server.AbstractConnector]: Started [email protected]:4040
[INFO] [2016-09-23 14:27:54,758] [Cluster(akka://KillrWeather)]: Cluster Node [akka.tcp://[email protected]:2550] - Node [akka.tcp://[email protected]:2550] is JOINING, roles [analytics]
[INFO] [2016-09-23 14:27:54,764] [com.datastax.killrweather.NodeGuardian]: Starting at akka.tcp://[email protected]:2550
[INFO] [2016-09-23 14:27:54,764] [Cluster(akka://KillrWeather)]: Cluster Node [akka.tcp://[email protected]:2550] - Trying to join seed nodes [akka.tcp://[email protected]:2550] when already part of a cluster, ignoring
[INFO] [2016-09-23 14:27:54,873] [Cluster(akka://KillrWeather)]: Cluster Node [akka.tcp://[email protected]:2550] - Leader is moving node [akka.tcp://[email protected]:2550] to [Up]
[DEBUG] [2016-09-23 14:27:54,874] [com.datastax.killrweather.NodeGuardian]: Member [akka.tcp://[email protected]:2550] joined cluster.
[ERROR] [2016-09-23 14:27:55,246] [akka.actor.OneForOneStrategy]: Failed to open native connection to Cassandra at {127.0.0.1}:9042
akka.actor.ActorInitializationException: exception during creation
at akka.actor.ActorInitializationException$.apply(Actor.scala:166) ~[akka-actor_2.10-2.3.11.jar:na]
at akka.actor.ActorCell.create(ActorCell.scala:596) ~[akka-actor_2.10-2.3.11.jar:na]
at akka.actor.ActorCell.invokeAll$1(ActorCell.scala:456) ~[akka-actor_2.10-2.3.11.jar:na]
at akka.actor.ActorCell.systemInvoke(ActorCell.scala:478) ~[akka-actor_2.10-2.3.11.jar:na]
at akka.dispatch.Mailbox.processAllSystemMessages(Mailbox.scala:263) ~[akka-actor_2.10-2.3.11.jar:na]
at akka.dispatch.Mailbox.run(Mailbox.scala:219) ~[akka-actor_2.10-2.3.11.jar:na]
at akka.dispatch.ForkJoinExecutorConfigurator$AkkaForkJoinTask.exec(AbstractDispatcher.scala:397) [akka-actor_2.10-2.3.11.jar:na]
at scala.concurrent.forkjoin.ForkJoinTask.doExec(ForkJoinTask.java:260) [scala-library-2.10.5.jar:na]
at scala.concurrent.forkjoin.ForkJoinPool$WorkQueue.runTask(ForkJoinPool.java:1339) [scala-library-2.10.5.jar:na]
at scala.concurrent.forkjoin.ForkJoinPool.runWorker(ForkJoinPool.java:1979) [scala-library-2.10.5.jar:na]
at scala.concurrent.forkjoin.ForkJoinWorkerThread.run(ForkJoinWorkerThread.java:107) [scala-library-2.10.5.jar:na]
Caused by: java.io.IOException: Failed to open native connection to Cassandra at {127.0.0.1}:9042
at com.datastax.spark.connector.cql.CassandraConnector$.com$datastax$spark$connector$cql$CassandraConnector$$createSession(CassandraConnector.scala:181) ~[spark-cassandra-connector_2.10-1.3.0-M1.jar:1.3.0-M1]
at com.datastax.spark.connector.cql.CassandraConnector$$anonfun$2.apply(CassandraConnector.scala:167) ~[spark-cassandra-connector_2.10-1.3.0-M1.jar:1.3.0-M1]
at com.datastax.spark.connector.cql.CassandraConnector$$anonfun$2.apply(CassandraConnector.scala:167) ~[spark-cassandra-connector_2.10-1.3.0-M1.jar:1.3.0-M1]
at com.datastax.spark.connector.cql.RefCountedCache.createNewValueAndKeys(RefCountedCache.scala:31) ~[spark-cassandra-connector_2.10-1.3.0-M1.jar:1.3.0-M1]
at com.datastax.spark.connector.cql.RefCountedCache.acquire(RefCountedCache.scala:56) ~[spark-cassandra-connector_2.10-1.3.0-M1.jar:1.3.0-M1]
at com.datastax.spark.connector.cql.CassandraConnector.openSession(CassandraConnector.scala:76) ~[spark-cassandra-connector_2.10-1.3.0-M1.jar:1.3.0-M1]
at com.datastax.spark.connector.cql.CassandraConnector.withSessionDo(CassandraConnector.scala:104) ~[spark-cassandra-connector_2.10-1.3.0-M1.jar:1.3.0-M1]
at com.datastax.spark.connector.cql.CassandraConnector.withClusterDo(CassandraConnector.scala:115) ~[spark-cassandra-connector_2.10-1.3.0-M1.jar:1.3.0-M1]
at com.datastax.spark.connector.cql.Schema$.fromCassandra(Schema.scala:243) ~[spark-cassandra-connector_2.10-1.3.0-M1.jar:1.3.0-M1]
at com.datastax.spark.connector.writer.TableWriter$.apply(TableWriter.scala:182) ~[spark-cassandra-connector_2.10-1.3.0-M1.jar:1.3.0-M1]
at com.datastax.spark.connector.streaming.DStreamFunctions.saveToCassandra(DStreamFunctions.scala:32) ~[spark-cassandra-connector_2.10-1.3.0-M1.jar:1.3.0-M1]
at com.datastax.killrweather.KafkaStreamingActor.<init>(KafkaStreamingActor.scala:45) ~[classes/:na]
at com.datastax.killrweather.NodeGuardian$$anonfun$1.apply(NodeGuardian.scala:42) ~[classes/:na]
at com.datastax.killrweather.NodeGuardian$$anonfun$1.apply(NodeGuardian.scala:42) ~[classes/:na]
at akka.actor.TypedCreatorFunctionConsumer.produce(Props.scala:343) ~[akka-actor_2.10-2.3.11.jar:na]
at akka.actor.Props.newActor(Props.scala:252) ~[akka-actor_2.10-2.3.11.jar:na]
at akka.actor.ActorCell.newActor(ActorCell.scala:552) ~[akka-actor_2.10-2.3.11.jar:na]
at akka.actor.ActorCell.create(ActorCell.scala:578) ~[akka-actor_2.10-2.3.11.jar:na]
... 9 common frames omitted
Caused by: com.datastax.driver.core.exceptions.NoHostAvailableException: All host(s) tried for query failed (tried: /127.0.0.1:9042 (com.datastax.driver.core.exceptions.InvalidQueryException: unconfigured table schema_keyspaces))
at com.datastax.driver.core.ControlConnection.reconnectInternal(ControlConnection.java:223) ~[cassandra-driver-core-2.1.5.jar:na]
at com.datastax.driver.core.ControlConnection.connect(ControlConnection.java:78) ~[cassandra-driver-core-2.1.5.jar:na]
at com.datastax.driver.core.Cluster$Manager.init(Cluster.java:1230) ~[cassandra-driver-core-2.1.5.jar:na]
at com.datastax.driver.core.Cluster.getMetadata(Cluster.java:333) ~[cassandra-driver-core-2.1.5.jar:na]
at com.datastax.spark.connector.cql.CassandraConnector$.com$datastax$spark$connector$cql$CassandraConnector$$createSession(CassandraConnector.scala:174) ~[spark-cassandra-connector_2.10-1.3.0-M1.jar:1.3.0-M1]
... 26 common frames omitted
[DEBUG] [2016-09-23 14:27:59,420] [com.datastax.killrweather.NodeGuardian]: NodeMetrics[heap-memory-committed:1008205824,heap-memory-max:1008205824,system-load-average:0.55,heap-memory-used:226654768,cpu-combined:0.0]
[DEBUG] [2016-09-23 14:28:08,914] [com.datastax.killrweather.NodeGuardian]: NodeMetrics[heap-memory-committed:1008205824,cpu-combined:0.009101701622477245,heap-memory-max:1008205824,system-load-average:0.62,heap-memory-used:230459632]
[DEBUG] [2016-09-23 14:28:18,913] [com.datastax.killrweather.NodeGuardian]: NodeMetrics[heap-memory-committed:1008205824,heap-memory-max:1008205824,heap-memory-used:232953840,cpu-combined:0.0070140280561122245,system-load-average:0.53]
[DEBUG] [2016-09-23 14:28:28,913] [com.datastax.killrweather.NodeGuardian]: NodeMetrics[system-load-average:0.45,heap-memory-committed:1008205824,heap-memory-used:234236752,heap-memory-max:1008205824,cpu-combined:0.0055151667084482325]

$ vi $CASS_HOME/resources/cassandra/conf/cassandra.yaml
rpc_address: 0.0.0.0
broadcast_rpc_address: 1.2.3.4

$ cat /etc/hosts
127.0.0.1 localhost localhost.localdomain localhost4 localhost4.localdomain4
....
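
For what it's worth, the "unconfigured table schema_keyspaces" message in the stack trace is the usual signature of a Java driver that predates the Cassandra 3.x system-schema changes (the trace shows cassandra-driver-core 2.1.5 against Cassandra 3.0.8). A hedged build.sbt sketch of the kind of dependency bump that addresses it; the version shown is an assumption and should be matched to your Spark version:

    // build.sbt sketch -- the version here is illustrative, not the repo's pin;
    // connector 1.6.x bundles a driver that understands the C* 3.x system schema.
    libraryDependencies += "com.datastax.spark" %% "spark-cassandra-connector" % "1.6.0"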

"sbt compile" breaks because spark-cassandra-connector no longer includes custom Logging

As of March 3rd, 2015, spark-cassandra-connector no longer includes the custom Logging class; it was removed in commit SPARKC-54.

So the v1.2.0-alpha3 spark-cassandra-connector dependency causes "sbt compile" to break at line 6 of killrweather/killrweather-app/src/it/scala/com/datastax/killrweather/Initializer.scala with an "object Logging is not a member of package com.datastax.spark.connector.util" error.

I suggest switching to org.apache.spark.Logging or similar.
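
A minimal sketch of that suggestion, assuming Spark 1.x (org.apache.spark.Logging is public there; it was made private in Spark 2.x). The class name below is hypothetical:

    import org.apache.spark.Logging

    // Hypothetical replacement: mix in Spark's Logging trait in place of the
    // removed com.datastax.spark.connector.util.Logging.
    class SchemaInitializer extends Logging {
      def init(): Unit = logInfo("keyspace and tables created") // logInfo comes from Logging
    }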

Raw weather data and daily aggregate not populating

Running in a Unix environment. I am able to get app/run, clients/ingestion_app, and clients/client_app running successfully. However, when querying the Cassandra DB, I don't see the two tables being populated, and I don't see any errors or incorrect states in the Kafka/Akka/Cassandra servers.

app/run

[info] Loading project definition from /home/eugene/killrweather/project
[warn] There may be incompatibilities among your library dependencies.
[warn] Here are some of the libraries that were evicted:
[warn] * com.typesafe.sbt:sbt-git:0.6.2 -> 0.6.4
[warn] Run 'evicted' to see detailed eviction warnings
[info] Set current project to KillrWeather (in build file:/home/eugene/killrweather/)
[info] Running com.datastax.killrweather.KillrWeatherApp
ZooKeeperServer isRunning: true
Starting the Kafka server at 127.0.1.1:2181
[INFO] [2015-05-11 16:15:59,620] [akka.event.slf4j.Slf4jLogger]: Slf4jLogger started
[INFO] [2015-05-11 16:15:59,717] [Remoting]: Starting remoting
[INFO] [2015-05-11 16:16:00,088] [Remoting]: Remoting started; listening on addresses :[akka.tcp://[email protected]:2550]
[INFO] [2015-05-11 16:16:00,104] [Cluster(akka://KillrWeather)]: Cluster Node [akka.tcp://[email protected]:2550] - Starting up...
[INFO] [2015-05-11 16:16:00,147] [Cluster(akka://KillrWeather)]: Cluster Node [akka.tcp://[email protected]:2550] - Registered cluster JMX MBean [akka:type=Cluster]
[INFO] [2015-05-11 16:16:00,148] [Cluster(akka://KillrWeather)]: Cluster Node [akka.tcp://[email protected]:2550] - Started up successfully
[INFO] [2015-05-11 16:16:00,165] [Cluster(akka://KillrWeather)]: Cluster Node [akka.tcp://[email protected]:2550] - No seed-nodes configured, manual cluster join required
OpenJDK Client VM warning: You have loaded library /home/eugene/killrweather/sigar/libsigar-x86-linux.so which might have disabled stack guard. The VM will try to fix the stack guard now.
It's highly recommended that you fix the library with 'execstack -c <libfile>', or link it with '-z noexecstack'.
[INFO] [2015-05-11 16:16:00,252] [Cluster(akka://KillrWeather)]: Cluster Node [akka.tcp://[email protected]:2550] - Metrics collection has started successfully
[WARN] [2015-05-11 16:16:00,285] [org.apache.spark.util.Utils]: Your hostname, eugene-VirtualBox resolves to a loopback address: 127.0.1.1; using 10.0.2.15 instead (on interface eth2)
[WARN] [2015-05-11 16:16:00,286] [org.apache.spark.util.Utils]: Set SPARK_LOCAL_IP if you need to bind to another address
[INFO] [2015-05-11 16:16:00,540] [akka.event.slf4j.Slf4jLogger]: Slf4jLogger started
[INFO] [2015-05-11 16:16:00,544] [Remoting]: Starting remoting
[INFO] [2015-05-11 16:16:00,557] [Remoting]: Remoting started; listening on addresses :[akka.tcp://[email protected]:39888]
[INFO] [2015-05-11 16:16:07,207] [com.datastax.killrweather.NodeGuardian]: Starting at akka.tcp://[email protected]:2550
[INFO] [2015-05-11 16:16:07,216] [Cluster(akka://KillrWeather)]: Cluster Node [akka.tcp://[email protected]:2550] - Node [akka.tcp://[email protected]:2550] is JOINING, roles []
[INFO] [2015-05-11 16:16:07,891] [com.datastax.killrweather.NodeGuardian]: Node is transitioning from 'uninitialized' to 'initialized'
[INFO] [2015-05-11 16:16:08,352] [Cluster(akka://KillrWeather)]: Cluster Node [akka.tcp://[email protected]:2550] - Leader is moving node [akka.tcp://[email protected]:2550] to [Up]

[WARN] [2015-05-11 16:16:08,623] [org.apache.spark.util.SizeEstimator]: Failed to check whether UseCompressedOops is set; assuming yes

Time: 1431375368500 ms

...

[INFO] [2015-05-11 16:17:19,650] [Cluster(akka://KillrWeather)]: Cluster Node [akka.tcp://[email protected]:2550] - Node [akka.tcp://[email protected]:2551] is JOINING, roles []

Time: 1431375440000 ms

[INFO] [2015-05-11 16:17:20,160] [Cluster(akka://KillrWeather)]: Cluster Node [akka.tcp://[email protected]:2550] - Leader is moving node [akka.tcp://[email protected]:2551] to [Up]

clients/ingestion_app

[info] Loading project definition from /home/eugene/killrweather/project
[warn] There may be incompatibilities among your library dependencies.
[warn] Here are some of the libraries that were evicted:
[warn] * com.typesafe.sbt:sbt-git:0.6.2 -> 0.6.4
[warn] Run 'evicted' to see detailed eviction warnings
[info] Set current project to KillrWeather (in build file:/home/eugene/killrweather/)
[warn] Multiple main classes detected. Run 'show discoveredMainClasses' to see the list

Multiple main classes detected, select one to run:

[1] com.datastax.killrweather.KafkaDataIngestionApp
[2] com.datastax.killrweather.KillrWeatherClientApp

Enter number: 1

[info] Running com.datastax.killrweather.KafkaDataIngestionApp
[INFO] [2015-05-11 16:17:18,345] [akka.event.slf4j.Slf4jLogger]: Slf4jLogger started
[INFO] [2015-05-11 16:17:18,462] [Remoting]: Starting remoting
[INFO] [2015-05-11 16:17:18,697] [Remoting]: Remoting started; listening on addresses :[akka.tcp://[email protected]:2551]
[INFO] [2015-05-11 16:17:18,714] [Cluster(akka://KillrWeather)]: Cluster Node [akka.tcp://[email protected]:2551] - Starting up...
[INFO] [2015-05-11 16:17:19,013] [Cluster(akka://KillrWeather)]: Cluster Node [akka.tcp://[email protected]:2551] - Registered cluster JMX MBean [akka:type=Cluster]
[INFO] [2015-05-11 16:17:19,015] [Cluster(akka://KillrWeather)]: Cluster Node [akka.tcp://[email protected]:2551] - Started up successfully
OpenJDK Client VM warning: You have loaded library /home/eugene/killrweather/sigar/libsigar-x86-linux.so which might have disabled stack guard. The VM will try to fix the stack guard now.
It's highly recommended that you fix the library with 'execstack -c <libfile>', or link it with '-z noexecstack'.
[INFO] [2015-05-11 16:17:19,182] [Cluster(akka://KillrWeather)]: Cluster Node [akka.tcp://[email protected]:2551] - Metrics collection has started successfully
[INFO] [2015-05-11 16:17:19,894] [Cluster(akka://KillrWeather)]: Cluster Node [akka.tcp://[email protected]:2551] - Welcome from [akka.tcp://[email protected]:2550]
[INFO] [2015-05-11 16:17:20,028] [com.datastax.killrweather.HttpNodeGuardian]: Starting at akka.tcp://[email protected]:2551
[INFO] [2015-05-11 16:17:20,233] [com.datastax.killrweather.HttpNodeGuardian]: Member akka.tcp://[email protected]:2551 joined cluster.
[INFO] [2015-05-11 16:17:20,252] [com.datastax.killrweather.HttpNodeGuardian]: Starting data ingestion on akka.tcp://[email protected]:2551.
[INFO] [2015-05-11 16:17:20,257] [com.datastax.killrweather.HttpNodeGuardian]: Sending 724940:23234,2008,01,01,00,11.7,-0.6,1023.8,50,7.2,2,0.0,0.0 to Kafka
[INFO] [2015-05-11 16:17:20,305] [com.datastax.killrweather.HttpNodeGuardian]: Sending 724940:23234,2008,01,01,01,10.6,3.3,1023.5,100,4.1,4,0.0,0.0 to Kafka
[INFO] [2015-05-11 16:17:20,311] [com.datastax.killrweather.HttpNodeGuardian]: Sending 724940:23234,2008,01,01,02,10.6,1.7,1023.6,110,4.6,4,0.0,0.0 to Kafka
[INFO] [2015-05-11 16:17:20,312] [com.datastax.killrweather.HttpNodeGuardian]: Sending 724940:23234,2008,01,01,03,10.0,1.1,1024.0,100,4.1,7,0.0,0.0 to Kafka

cassandra db

(0 rows)
cqlsh> select * from isd_weather_data.daily_aggregate_precip;

wsid | year | month | day | precipitation
------+------+-------+-----+---------------

(0 rows)
cqlsh> select * from isd_weather_data.raw_weather_data;

wsid | year | month | day | hour | dewpoint | one_hour_precip | pressure | six_hour_precip | sky_condition | sky_condition_text | temperature | wind_direction | wind_speed
------+------+-------+-----+------+----------+-----------------+----------+-----------------+---------------+--------------------+-------------+----------------+------------

(0 rows)

Exception thrown when sample no longer has any un-queried data

[ERROR] [2015-01-30 21:04:58,616] [akka.actor.OneForOneStrategy]: next on empty iterator
java.util.NoSuchElementException: next on empty iterator
        at scala.collection.Iterator$$anon$2.next(Iterator.scala:39) ~[scala-library.jar:0.13.5]
        at scala.collection.Iterator$$anon$2.next(Iterator.scala:37) ~[scala-library.jar:0.13.5]
        at scala.collection.IterableLike$class.head(IterableLike.scala:91) ~[scala-library.jar:0.13.5]
        at scala.collection.AbstractIterable.head(Iterable.scala:54) ~[scala-library.jar:0.13.5]
        at com.datastax.killrweather.WeatherApiQueries.queries(KillrWeatherClientApp.scala:77) ~[classes/:na]
        at com.datastax.killrweather.WeatherApiQueries$$anonfun$receive$1.applyOrElse(KillrWeatherClientApp.scala:61) ~[classes/:na]
        at akka.actor.Actor$class.aroundReceive(Actor.scala:465) ~[akka-actor_2.10-2.3.8.jar:na]
        at com.datastax.killrweather.WeatherApiQueries.aroundReceive(KillrWeatherClientApp.scala:37) ~[classes/:na]
        at akka.actor.ActorCell.receiveMessage(ActorCell.scala:516) [akka-actor_2.10-2.3.8.jar:na]
        at akka.actor.ActorCell.invoke(ActorCell.scala:487) [akka-actor_2.10-2.3.8.jar:na]
        at akka.dispatch.Mailbox.processMailbox(Mailbox.scala:254) [akka-actor_2.10-2.3.8.jar:na]
        at akka.dispatch.Mailbox.run(Mailbox.scala:221) [akka-actor_2.10-2.3.8.jar:na]
        at akka.dispatch.Mailbox.exec(Mailbox.scala:231) [akka-actor_2.10-2.3.8.jar:na]
        at scala.concurrent.forkjoin.ForkJoinTask.doExec(ForkJoinTask.java:260) [scala-library.jar:na]
        at scala.concurrent.forkjoin.ForkJoinPool$WorkQueue.runTask(ForkJoinPool.java:1339) [scala-library.jar:na]
        at scala.concurrent.forkjoin.ForkJoinPool.runWorker(ForkJoinPool.java:1979) [scala-library.jar:na]
        at scala.concurrent.forkjoin.ForkJoinWorkerThread.run(ForkJoinWorkerThread.java:107) [scala-library.jar:na]

Which is generated from

val sample: Day = fileFeed().flatMap { case f =>
  val source = FileSource(f).source
  val s = source.getLines().map(Day(_)).filterNot(previous).toSeq.headOption
  Try(source.close())
  s
}.head

The final head will fail if all the data has been previously queried.
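
A minimal sketch of one possible guard, keeping the surrounding types as-is: replace the final .head with .headOption so an exhausted feed yields None instead of throwing.

    // Sketch: an Option result lets the caller skip the query cycle cleanly
    // once every sample has been consumed, instead of crashing the actor.
    val sample: Option[Day] = fileFeed().flatMap { case f =>
      val source = FileSource(f).source
      val s = source.getLines().map(Day(_)).filterNot(previous).toSeq.headOption
      Try(source.close())
      s
    }.headOption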

ImportError: No module named cqlsh while running source 'load-timeseries.cql'

cqlsh> source 'load-timeseries.cql';
Using 7 child processes

Starting copy of isd_weather_data.weather_station with columns ['id', 'name', 'country_code', 'state_code', 'call_sign', 'lat', 'long', 'elevation'].
Traceback (most recent call last):
  File "<string>", line 1, in <module>
  File "C:\Python27\lib\multiprocessing\forking.py", line 380, in main
    prepare(preparation_data)
  File "C:\Python27\lib\multiprocessing\forking.py", line 503, in prepare
    file, path_name, etc = imp.find_module(main_name, dirs)
ImportError: No module named cqlsh
(the same traceback is printed, interleaved, once per child process)
load-timeseries.cql:7:8 child process(es) died unexpectedly, aborting
Processed: 0 rows; Rate: 0 rows/s; Avg. rate: 0 rows/s
0 rows imported from 0 files in 0.173 seconds (0 skipped).

No output in the respective Cassandra tables

I am new to this and am trying to set this app up on my Linux environment, but I am not able to see any output in the database.

Please find the logs for the respective apps:

  1. app/run :
    /killrweather$ sbt
    [info] Loading project definition from /home/mkanchwala/killrweather/project
    [warn] There may be incompatibilities among your library dependencies.
    [warn] Here are some of the libraries that were evicted:
    [warn] * com.typesafe.sbt:sbt-git:0.6.2 -> 0.6.4
    [warn] Run 'evicted' to see detailed eviction warnings
    [info] Set current project to KillrWeather.com (in build file:/home/mkanchwala/killrweather/)
    [SBT] mkanchwala@mkanchwala::root> app/run
    [info] Running com.datastax.killrweather.KillrWeatherApp
    SLF4J: Class path contains multiple SLF4J bindings.
    SLF4J: Found binding in [jar:file:/home/mkanchwala/.ivy2/cache/org.slf4j/slf4j-log4j12/jars/slf4j-log4j12-1.7.5.jar!/org/slf4j/impl/StaticLoggerBinder.class]
    SLF4J: Found binding in [jar:file:/home/mkanchwala/.ivy2/cache/ch.qos.logback/logback-classic/jars/logback-classic-1.0.13.jar!/org/slf4j/impl/StaticLoggerBinder.class]
    SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation.
    SLF4J: Actual binding is of type [org.slf4j.impl.Log4jLoggerFactory]
    02/16/15 06:47:19 INFO [run-main-0] com.datastax.killrweather.WeatherSettings: Starting up with spark master 'local[*]' cassandra hosts '127.0.0.1'
    ZooKeeperServer isRunning: true
    ZooKeeper Client connected.
    Starting the Kafka server at 127.0.1.1:2181
    02/16/15 06:47:27 WARN [KillrWeather-akka.actor.default-dispatcher-5] org.apache.spark.util.Utils: Your hostname, mkanchwala resolves to a loopback address: 127.0.1.1; using 192.168.117.131 instead (on interface eth0)
    02/16/15 06:47:27 WARN [KillrWeather-akka.actor.default-dispatcher-5] org.apache.spark.util.Utils: Set SPARK_LOCAL_IP if you need to bind to another address
    02/16/15 06:47:33 INFO [KillrWeather-akka.actor.default-dispatcher-2] com.datastax.killrweather.NodeGuardian: Starting at akka.tcp://[email protected]:2550
    02/16/15 06:47:34 INFO [KillrWeather-akka.actor.default-dispatcher-18] com.datastax.killrweather.NodeGuardian: Node is transitioning from 'uninitialized' to 'initialized'

  2. clients/run : DataFeedApp

sbt
[info] Loading project definition from /home/mkanchwala/killrweather/project
[warn] There may be incompatibilities among your library dependencies.
[warn] Here are some of the libraries that were evicted:
[warn] * com.typesafe.sbt:sbt-git:0.6.2 -> 0.6.4
[warn] Run 'evicted' to see detailed eviction warnings
[info] Set current project to KillrWeather.com (in build file:/home/mkanchwala/killrweather/)
[SBT] mkanchwala@mkanchwala::root> clients/run
[warn] Multiple main classes detected. Run 'show discoveredMainClasses' to see the list

Multiple main classes detected, select one to run:

[1] com.datastax.killrweather.DataFeedApp
[2] com.datastax.killrweather.KillrWeatherClientApp

Enter number: 1

[info] Running com.datastax.killrweather.DataFeedApp
[INFO] [2015-02-16 06:48:51,659] [akka.event.slf4j.Slf4jLogger]: Slf4jLogger started
[INFO] [2015-02-16 06:48:52,097] [Remoting]: Starting remoting
[INFO] [2015-02-16 06:48:52,438] [Remoting]: Remoting started; listening on addresses :[akka.tcp://[email protected]:47862]
[INFO] [2015-02-16 06:48:52,480] [Cluster(akka://KillrWeather)]: Cluster Node [akka.tcp://[email protected]:47862] - Starting up...
[INFO] [2015-02-16 06:48:52,728] [Cluster(akka://KillrWeather)]: Cluster Node [akka.tcp://[email protected]:47862] - Registered cluster JMX MBean [akka:type=Cluster]
[INFO] [2015-02-16 06:48:52,729] [Cluster(akka://KillrWeather)]: Cluster Node [akka.tcp://[email protected]:47862] - Started up successfully
[INFO] [2015-02-16 06:48:52,761] [Cluster(akka://KillrWeather)]: Cluster Node [akka.tcp://[email protected]:47862] - Metrics will be retreived from MBeans, and may be incorrect on some platforms. To increase metric accuracy add the 'sigar.jar' to the classpath and the appropriate platform-specific native libary to 'java.library.path'. Reason: java.lang.ClassNotFoundException: org.hyperic.sigar.Sigar
[INFO] [2015-02-16 06:48:52,771] [Cluster(akka://KillrWeather)]: Cluster Node [akka.tcp://[email protected]:47862] - No seed-nodes configured, manual cluster join required
[INFO] [2015-02-16 06:48:52,795] [Cluster(akka://KillrWeather)]: Cluster Node [akka.tcp://[email protected]:47862] - Metrics collection has started successfully
[INFO] [2015-02-16 06:48:52,851] [Cluster(akka://KillrWeather)]: Cluster Node [akka.tcp://[email protected]:47862] - Node [akka.tcp://[email protected]:47862] is JOINING, roles []
[INFO] [2015-02-16 06:48:52,895] [com.datastax.killrweather.AutomaticDataFeedActor]: Starting data file ingestion on akka.tcp://[email protected]:47862.
[INFO] [2015-02-16 06:48:53,512] [akka.http.HttpManager]: Bound to localhost/127.0.0.1:8080
[INFO] [2015-02-16 06:48:53,716] [com.datastax.killrweather.DynamicDataFeedActor]: Connected to [/127.0.0.1:8080] with [akka.stream.impl.ActorProcessor@52f16502]
[INFO] [2015-02-16 06:48:53,788] [Cluster(akka://KillrWeather)]: Cluster Node [akka.tcp://[email protected]:47862] - Leader is moving node [akka.tcp://[email protected]:47862] to [Up]
[INFO] [2015-02-16 06:48:57,942] [com.datastax.killrweather.FileFeedActor]: Ingesting /home/mkanchwala/killrweather/data/load/ny-sf-2008.csv.gz
[DEBUG] [2015-02-16 06:48:59,990] [com.datastax.killrweather.FileFeedActor]: Sending '724940:23234,2008,01,01,00,11.7,-0.6,1023.8,50,7.2,2,0.0,0.0'
[DEBUG] [2015-02-16 06:48:59,991] [com.datastax.killrweather.FileFeedActor]: Sending '724940:23234,2008,01,01,01,10.6,3.3,1023.5,100,4.1,4,0.0,0.0'
[DEBUG] [2015-02-16 06:48:59,994] [com.datastax.killrweather.FileFeedActor]: Sending '724940:23234,2008,01,01,02,10.6,1.7,1023.6,110,4.6,4,0.0,0.0'
.................

  3. KillrWeatherClientApp:
    clients/run
    [warn] Multiple main classes detected. Run 'show discoveredMainClasses' to see the list

Multiple main classes detected, select one to run:

[1] com.datastax.killrweather.DataFeedApp
[2] com.datastax.killrweather.KillrWeatherClientApp

Enter number: 2

[info] Running com.datastax.killrweather.KillrWeatherClientApp
[INFO] [2015-02-16 06:51:00,246] [akka.event.slf4j.Slf4jLogger]: Slf4jLogger started
[INFO] [2015-02-16 06:51:01,108] [Remoting]: Starting remoting
[INFO] [2015-02-16 06:51:01,694] [Remoting]: Remoting started; listening on addresses :[akka.tcp://[email protected]:39026]
[INFO] [2015-02-16 06:51:01,769] [Cluster(akka://KillrWeather)]: Cluster Node [akka.tcp://[email protected]:39026] - Starting up...
[INFO] [2015-02-16 06:51:02,097] [Cluster(akka://KillrWeather)]: Cluster Node [akka.tcp://[email protected]:39026] - Registered cluster JMX MBean [akka:type=Cluster]
[INFO] [2015-02-16 06:51:02,099] [Cluster(akka://KillrWeather)]: Cluster Node [akka.tcp://[email protected]:39026] - Started up successfully
[INFO] [2015-02-16 06:51:02,157] [Cluster(akka://KillrWeather)]: Cluster Node [akka.tcp://[email protected]:39026] - Metrics will be retreived from MBeans, and may be incorrect on some platforms. To increase metric accuracy add the 'sigar.jar' to the classpath and the appropriate platform-specific native libary to 'java.library.path'. Reason: java.lang.ClassNotFoundException: org.hyperic.sigar.Sigar
[INFO] [2015-02-16 06:51:02,185] [Cluster(akka://KillrWeather)]: Cluster Node [akka.tcp://[email protected]:39026] - Metrics collection has started successfully
[INFO] [2015-02-16 06:51:02,205] [Cluster(akka://KillrWeather)]: Cluster Node [akka.tcp://[email protected]:39026] - No seed-nodes configured, manual cluster join required
[INFO] [2015-02-16 06:51:02,339] [Cluster(akka://KillrWeather)]: Cluster Node [akka.tcp://[email protected]:39026] - Node [akka.tcp://[email protected]:39026] is JOINING, roles []
[INFO] [2015-02-16 06:51:02,383] [com.datastax.killrweather.WeatherApiQueries]: Starting.
[INFO] [2015-02-16 06:51:03,146] [Cluster(akka://KillrWeather)]: Cluster Node [akka.tcp://[email protected]:39026] - Leader is moving node [akka.tcp://[email protected]:39026] to [Up]
[INFO] [2015-02-16 06:51:06,355] [com.datastax.killrweather.WeatherApiQueries]: Requesting the current weather for weather station 724940:23234
[INFO] [2015-02-16 06:51:06,673] [com.datastax.killrweather.WeatherApiQueries]: Requesting annual precipitation for weather station 724940:23234 in year 2008
[INFO] [2015-02-16 06:51:06,688] [com.datastax.killrweather.WeatherApiQueries]: Requesting top-k Precipitation for weather station 724940:23234
[INFO] [2015-02-16 06:51:06,709] [com.datastax.killrweather.WeatherApiQueries]: Requesting the daily temperature aggregate for weather station 724940:23234
[INFO] [2015-02-16 06:51:06,724] [com.datastax.killrweather.WeatherApiQueries]: Requesting the high-low temperature aggregate for weather station 724940:23234
[INFO] [2015-02-16 06:51:06,739] [com.datastax.killrweather.WeatherApiQueries]: Requesting weather station 724940:23234
[INFO] [2015-02-16 06:51:08,610] [com.datastax.killrweather.WeatherApiQueries]: Requesting the current weather for weather station 722020:12839
[INFO] [2015-02-16 06:51:08,628] [com.datastax.killrweather.WeatherApiQueries]: Requesting annual precipitation for weather station 722020:12839 in year 2008
[INFO] [2015-02-16 06:51:08,631] [com.datastax.killrweather.WeatherApiQueries]: Requesting top-k Precipitation for weather station 722020:12839
[INFO] [2015-02-16 06:51:08,637] [com.datastax.killrweather.WeatherApiQueries]: Requesting the daily temperature aggregate for weather station 722020:12839
[INFO] [2015-02-16 06:51:08,642] [com.datastax.killrweather.WeatherApiQueries]: Requesting the high-low temperature aggregate for weather station 722020:12839
[INFO] [2015-02-16 06:51:08,651] [com.datastax.killrweather.WeatherApiQueries]: Requesting weather station 722020:12839
[INFO] [2015-02-16 06:51:10,444] [com.datastax.killrweather.WeatherApiQueries]: Requesting the current weather for weather station 722950:23174
[INFO] [2015-02-16 06:51:10,448] [com.datastax.killrweather.WeatherApiQueries]: Requesting annual precipitation for weather station 722950:23174 in year 2008
[INFO] [2015-02-16 06:51:10,453] [com.datastax.killrweather.WeatherApiQueries]: Requesting top-k Precipitation for weather station 722950:23174
[INFO] [2015-02-16 06:51:10,457] [com.datastax.killrweather.WeatherApiQueries]: Requesting the daily temperature aggregate for weather station 722950:23174
[INFO] [2015-02-16 06:51:10,462] [com.datastax.killrweather.WeatherApiQueries]: Requesting the high-low temperature aggregate for weather station 722950:23174
[INFO] [2015-02-16 06:51:10,466] [com.datastax.killrweather.WeatherApiQueries]: Requesting weather station 722950:23174
[INFO] [2015-02-16 06:51:12,266] [com.datastax.killrweather.WeatherApiQueries]: Requesting the current weather for weather station 725300:94846
[INFO] [2015-02-16 06:51:12,273] [com.datastax.killrweather.WeatherApiQueries]: Requesting annual precipitation for weather station 725300:94846 in year 2008
[INFO] [2015-02-16 06:51:12,279] [com.datastax.killrweather.WeatherApiQueries]: Requesting top-k Precipitation for weather station 725300:94846
[INFO] [2015-02-16 06:51:12,283] [com.datastax.killrweather.WeatherApiQueries]: Requesting the daily temperature aggregate for weather station 725300:94846
[INFO] [2015-02-16 06:51:12,284] [com.datastax.killrweather.WeatherApiQueries]: Requesting the high-low temperature aggregate for weather station 725300:94846
[INFO] [2015-02-16 06:51:12,289] [com.datastax.killrweather.WeatherApiQueries]: Requesting weather station 725300:94846
[ERROR] [2015-02-16 06:51:16,059] [akka.actor.OneForOneStrategy]: next on empty iterator
java.util.NoSuchElementException: next on empty iterator
at scala.collection.Iterator$$anon$2.next(Iterator.scala:39) ~[scala-library.jar:0.13.7]
at scala.collection.Iterator$$anon$2.next(Iterator.scala:37) ~[scala-library.jar:0.13.7]
at scala.collection.IterableLike$class.head(IterableLike.scala:91) ~[scala-library.jar:0.13.7]
at scala.collection.AbstractIterable.head(Iterable.scala:54) ~[scala-library.jar:0.13.7]
at com.datastax.killrweather.WeatherApiQueries.queries(KillrWeatherClientApp.scala:76) ~[classes/:na]
at com.datastax.killrweather.WeatherApiQueries$$anonfun$receive$1.applyOrElse(KillrWeatherClientApp.scala:61) ~[classes/:na]
at akka.actor.Actor$class.aroundReceive(Actor.scala:465) ~[akka-actor_2.10-2.3.8.jar:na]
at com.datastax.killrweather.WeatherApiQueries.aroundReceive(KillrWeatherClientApp.scala:37) ~[classes/:na]
at akka.actor.ActorCell.receiveMessage(ActorCell.scala:516) [akka-actor_2.10-2.3.8.jar:na]
at akka.actor.ActorCell.invoke(ActorCell.scala:487) [akka-actor_2.10-2.3.8.jar:na]
at akka.dispatch.Mailbox.processMailbox(Mailbox.scala:254) [akka-actor_2.10-2.3.8.jar:na]
at akka.dispatch.Mailbox.run(Mailbox.scala:221) [akka-actor_2.10-2.3.8.jar:na]
at akka.dispatch.Mailbox.exec(Mailbox.scala:231) [akka-actor_2.10-2.3.8.jar:na]
at scala.concurrent.forkjoin.ForkJoinTask.doExec(ForkJoinTask.java:260) [scala-library.jar:na]
at scala.concurrent.forkjoin.ForkJoinPool$WorkQueue.runTask(ForkJoinPool.java:1339) [scala-library.jar:na]
at scala.concurrent.forkjoin.ForkJoinPool.runWorker(ForkJoinPool.java:1979) [scala-library.jar:na]
at scala.concurrent.forkjoin.ForkJoinWorkerThread.run(ForkJoinWorkerThread.java:107) [scala-library.jar:na]
[INFO] [2015-02-16 06:51:16,091] [com.datastax.killrweather.WeatherApiQueries]: Starting.
[INFO] [2015-02-16 06:51:16,145] [com.datastax.killrweather.WeatherApiQueries]: Requesting the current weather for weather station 724940:23234
[INFO] [2015-02-16 06:51:16,153] [com.datastax.killrweather.WeatherApiQueries]: Requesting annual precipitation for weather station 724940:23234 in year 2008
[INFO] [2015-02-16 06:51:16,156] [com.datastax.killrweather.WeatherApiQueries]: Requesting top-k Precipitation for weather station 724940:23234
[INFO] [2015-02-16 06:51:16,161] [com.datastax.killrweather.WeatherApiQueries]: Requesting the daily temperature aggregate for weather station 724940:23234
[INFO] [2015-02-16 06:51:16,167] [com.datastax.killrweather.WeatherApiQueries]: Requesting the high-low temperature aggregate for weather station 724940:23234
[INFO] [2015-02-16 06:51:16,173] [com.datastax.killrweather.WeatherApiQueries]: Requesting weather station 724940:23234

@helena @searler : Can you suggest what I am doing wrong here?

Thanks

Cassandra connection error

[akka.actor.OneForOneStrategy]: Failed to open native connection to Cassandra at {127.0.1.1}:9042
akka.actor.ActorInitializationException: exception during creation

I have tried changing rpc_address to 127.0.1.1, but all in vain. Can you please suggest what I can do?
Thanks

Client app doesn't do anything

The client application com.datastax.killrweather.KillrWeatherClientApp doesn't do anything at the moment. When started, it displays the following logs and then hangs:

sbt clients/run
[info] Loading project definition from /Users/i027947/Projects/lmlambda/killrweather/project
[warn] There may be incompatibilities among your library dependencies.
[warn] Here are some of the libraries that were evicted:
[warn] 	* com.typesafe.sbt:sbt-git:0.6.2 -> 0.6.4
[warn] Run 'evicted' to see detailed eviction warnings
[info] Set current project to KillrWeather (in build file:/Users/i027947/Projects/lmlambda/killrweather/)
[warn] Multiple main classes detected.  Run 'show discoveredMainClasses' to see the list

Multiple main classes detected, select one to run:

 [1] com.datastax.killrweather.KafkaDataIngestionApp
 [2] com.datastax.killrweather.KillrWeatherClientApp

Enter number: 2

[info] Running com.datastax.killrweather.KillrWeatherClientApp 
SLF4J: Class path contains multiple SLF4J bindings.
SLF4J: Found binding in [jar:file:/Users/i027947/.ivy2/cache/ch.qos.logback/logback-classic/jars/logback-classic-1.0.13.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: Found binding in [jar:file:/Users/i027947/.ivy2/cache/org.slf4j/slf4j-log4j12/jars/slf4j-log4j12-1.6.1.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation.
SLF4J: Actual binding is of type [ch.qos.logback.classic.util.ContextSelectorStaticBinder]
[INFO] [2017-01-05 11:37:27,374] [akka.event.slf4j.Slf4jLogger]: Slf4jLogger started
[INFO] [2017-01-05 11:37:27,447] [Remoting]: Starting remoting
[INFO] [2017-01-05 11:37:27,654] [Remoting]: Remoting started; listening on addresses :[akka.tcp://[email protected]:2552]
[INFO] [2017-01-05 11:37:27,668] [Cluster(akka://KillrWeather)]: Cluster Node [akka.tcp://[email protected]:2552] - Starting up...
[INFO] [2017-01-05 11:37:27,754] [Cluster(akka://KillrWeather)]: Cluster Node [akka.tcp://[email protected]:2552] - Registered cluster JMX MBean [akka:type=Cluster]
[INFO] [2017-01-05 11:37:27,754] [Cluster(akka://KillrWeather)]: Cluster Node [akka.tcp://[email protected]:2552] - Started up successfully
[INFO] [2017-01-05 11:37:27,774] [Cluster(akka://KillrWeather)]: Cluster Node [akka.tcp://[email protected]:2552] - No seed-nodes configured, manual cluster join required
[INFO] [2017-01-05 11:37:27,815] [com.datastax.killrweather.ApiNodeGuardian]: Starting at akka.tcp://[email protected]:2552
[INFO] [2017-01-05 11:37:27,817] [Cluster(akka://KillrWeather)]: Cluster Node [akka.tcp://[email protected]:2552] - Metrics collection has started successfully
[INFO] [2017-01-05 11:37:27,823] [Cluster(akka://KillrWeather)]: Cluster Node [akka.tcp://[email protected]:2552] - Node [akka.tcp://[email protected]:2552] is JOINING, roles [client]
[INFO] [2017-01-05 11:37:27,825] [Cluster(akka://KillrWeather)]: Cluster Node [akka.tcp://[email protected]:2552] - Trying to join seed nodes [akka.tcp://[email protected]:2552] when already part of a cluster, ignoring
[INFO] [2017-01-05 11:37:27,838] [com.datastax.killrweather.AutomatedApiActor]: Starting.

This can be fixed by uncommenting the preStart method in ApiNodeGuardian. I intend to submit a PR with the fix.

2-second delay in FileFeedActor serves no purpose

The following code was probably intended to spread out the load so that one line is loaded every 2 seconds. Instead, it delays every line by the same 2 seconds, with the result that they are all processed together in one rather expensive burst (a possible fix is sketched after the log excerpt below).

Source(source.getLines).foreach { case data =>
  context.system.scheduler.scheduleOnce(2.second) {
    log.debug(s"Sending '{}'", data)
    guardian ! KafkaMessageEnvelope[String, String](DefaultTopic, DefaultGroup, data)
  }
}

This is illustrated by the following log entries. Note the 2-second delay between Ingesting and Sending, with no delay between the individual Sending lines:

[INFO] [2015-02-01 15:21:03,367] [com.datastax.killrweather.AutomaticDataFeedActor]: Starting data file ingestion on akka.tcp://[email protected]:51020.
[INFO] [2015-02-01 15:21:08,389] [com.datastax.killrweather.FileFeedActor]: Ingesting /home/rsearle/work/killrweather/./data/load/sf-2008.csv.gz
[DEBUG] [2015-02-01 15:21:10,407] [com.datastax.killrweather.FileFeedActor]: Sending '724940:23234,2008,01,01,00,11.7,-0.6,1023.8,50,7.2,2,0.0,0.0'
[DEBUG] [2015-02-01 15:21:10,408] [com.datastax.killrweather.FileFeedActor]: Sending '724940:23234,2008,01,01,01,10.6,3.3,1023.5,100,4.1,4,0.0,0.0'
[DEBUG] [2015-02-01 15:21:10,408] [com.datastax.killrweather.FileFeedActor]: Sending '724940:23234,2008,01,01,03,10.0,1.1,1024.0,100,4.1,7,0.0,0.0'
[DEBUG] [2015-02-01 15:21:10,412] [com.datastax.killrweather.FileFeedActor]: Sending '724940:23234,2008,01,01,02,10.6,1.7,1023.6,110,4.6,4,0.0,0.0'
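
A minimal sketch of one way to get the intended pacing, assuming the same actor context, guardian reference, and envelope type as the snippet above: stagger the delay by line index so each message actually fires 2 seconds after the previous one.

    import scala.concurrent.duration._
    import context.dispatcher // ExecutionContext required by the scheduler

    // Sketch: line i is scheduled at i * 2 seconds, spreading the sends out
    // instead of firing them all after one shared 2-second offset.
    source.getLines().zipWithIndex.foreach { case (data, i) =>
      context.system.scheduler.scheduleOnce((i * 2).seconds) {
        log.debug(s"Sending '{}'", data)
        guardian ! KafkaMessageEnvelope[String, String](DefaultTopic, DefaultGroup, data)
      }
    }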

ERROR SparkDeploySchedulerBackend: Application has been killed. Reason: All masters are unresponsive! Giving up

I tried to run [SimpleSparkJob].
It works well if I do not change anything.
I want to run the demo against a cluster, so I changed the code like this:

    val conf = new SparkConf(true)
      .set("spark.cassandra.connection.host", "127.0.0.1")
      .setMaster("spark://192.168.210.47:7077")
      .setAppName("spark.cassandra.connection")

The IDE returns this error:
Using Spark's default log4j profile: org/apache/spark/log4j-defaults.properties
16/03/03 15:29:46 INFO SparkContext: Running Spark version 1.3.1
16/03/03 15:29:46 WARN NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
16/03/03 15:29:47 INFO SecurityManager: Changing view acls to: root
16/03/03 15:29:47 INFO SecurityManager: Changing modify acls to: root
16/03/03 15:29:47 INFO SecurityManager: SecurityManager: authentication disabled; ui acls disabled; users with view permissions: Set(root); users with modify permissions: Set(root)
16/03/03 15:29:47 INFO Slf4jLogger: Slf4jLogger started
16/03/03 15:29:47 INFO Remoting: Starting remoting
16/03/03 15:29:47 INFO Remoting: Remoting started; listening on addresses :[akka.tcp://sparkDriver@Master:44565]
16/03/03 15:29:47 INFO Utils: Successfully started service 'sparkDriver' on port 44565.
16/03/03 15:29:47 INFO SparkEnv: Registering MapOutputTracker
16/03/03 15:29:47 INFO SparkEnv: Registering BlockManagerMaster
16/03/03 15:29:47 INFO DiskBlockManager: Created local directory at /tmp/spark-d7c6da22-eed7-469f-a120-f3aa0fb6d650/blockmgr-6bed18a9-6b12-4963-baae-ab99c60efadc
16/03/03 15:29:47 INFO MemoryStore: MemoryStore started with capacity 958.2 MB
16/03/03 15:29:47 INFO HttpFileServer: HTTP File server directory is /tmp/spark-81b29c62-e8a7-48a3-99f1-15789561cc16/httpd-b4767476-467f-41be-90d4-051a7a5d8171
16/03/03 15:29:47 INFO HttpServer: Starting HTTP Server
16/03/03 15:29:47 INFO Server: jetty-8.y.z-SNAPSHOT
16/03/03 15:29:47 INFO AbstractConnector: Started [email protected]:58063
16/03/03 15:29:47 INFO Utils: Successfully started service 'HTTP file server' on port 58063.
16/03/03 15:29:48 INFO SparkEnv: Registering OutputCommitCoordinator
16/03/03 15:29:53 INFO Server: jetty-8.y.z-SNAPSHOT
16/03/03 15:29:53 INFO AbstractConnector: Started [email protected]:4040
16/03/03 15:29:53 INFO Utils: Successfully started service 'SparkUI' on port 4040.
16/03/03 15:29:53 INFO SparkUI: Started SparkUI at http://Master:4040
16/03/03 15:29:53 INFO AppClient$ClientActor: Connecting to master akka.tcp://[email protected]:7077/user/Master...
16/03/03 15:30:13 INFO AppClient$ClientActor: Connecting to master akka.tcp://[email protected]:7077/user/Master...
16/03/03 15:30:33 INFO AppClient$ClientActor: Connecting to master akka.tcp://[email protected]:7077/user/Master...
16/03/03 15:30:53 ERROR SparkDeploySchedulerBackend: Application has been killed. Reason: All masters are unresponsive! Giving up.
16/03/03 15:30:53 WARN SparkDeploySchedulerBackend: Application ID is not initialized yet.
16/03/03 15:30:53 ERROR TaskSchedulerImpl: Exiting due to error from cluster scheduler: All masters are unresponsive! Giving up.

It seems that something is wrong with akka.tcp://[email protected]:7077/user/Master, so I ran this code:

    object RemoteActorApp extends App {
      val system = ActorSystem("spike-spark-issue")
      val actor = system.actorSelection("akka.tcp://[email protected]:7077/user/Master")
      if (actor == null) println("null actor") else println("correct")
    }

which prints correct.
I found this in the Spark log folder:

INFO Master: 192.168.23.101:36188 got disassociated, removing it.

Well, 192.168.23.101 is my computer's IP, and there is a cluster too.
Any good advice?
Many thanks.
