Comments (12)
@Jason0812 What is your MASTER variable set to? Could you please send me the H2O logs and, if you're running the application on YARN, the YARN logs as well?
@copoo Thanks for reporting this problem! On current Sparkling Water you can set the Spark configuration property spark.ext.h2o.repl.enabled to false, either on the command line by adding the extra argument --conf spark.ext.h2o.repl.enabled=false, or via the sparkConf instance in your application. This is only a temporary workaround: it disables the REPL, the feature that lets you write Scala code in the Flow UI (our GUI on top of Sparkling Water). If you aren't using that feature, this solution should work perfectly for now.
We plan to create a proper fix and enable REPL support for CDH 5.7 as well, together with the upcoming release for Spark 2.0.
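For reference, a minimal sketch of the in-application variant (assuming you construct the SparkConf yourself before creating the SparkContext; the application name here is a placeholder):

```scala
// Sketch: disable the Flow REPL from application code.
// Assumes a standard Spark 1.6 application; "my-sw-app" is a placeholder.
import org.apache.spark.{SparkConf, SparkContext}

val conf = new SparkConf()
  .setAppName("my-sw-app")
  .set("spark.ext.h2o.repl.enabled", "false") // the temporary workaround
val sc = new SparkContext(conf)
```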
Kuba
from sparkling-water.
Hi Jason,
thanks for reporting this! I have just a few questions, since this is the first time we've seen this error.
- Are you sure you are using Spark for Scala 2.10? So far we do not support Scala 2.11.
- Are you using the official Spark distribution or a modified one? Are you running pure Spark with Sparkling Water only?
- Can you post here the complete code you're trying to run and also the shell script which starts it?
- Have you downloaded the Sparkling Water for Spark 1.6 from our page, or did you build it on your own?
Just to explain what's happening there: in order to get our REPL (the interactive Scala interpreter) working, we have to alter the Spark environment a little using reflection, which is why we need to read the field classServer from the class SparkIMain. In the official Spark 1.6 distribution, the SparkIMain class does contain the classServer field.
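To illustrate the mechanism, here is a minimal sketch of reflective private-field access. Everything in it (the SparkIMainLike class, the readPrivateField helper, and the field value) is a hypothetical stand-in, not Sparkling Water's actual code:

```scala
// Minimal sketch of reading a private field via reflection, similar in
// spirit to what Sparkling Water does with SparkIMain.classServer.
class SparkIMainLike {
  private val classServer: String = "http://driver:12345" // stand-in value
}

object ReflectionSketch {
  // Read a private field by name; throws NoSuchFieldException if the
  // field does not exist in this particular build of the class.
  def readPrivateField(target: AnyRef, name: String): AnyRef = {
    val field = target.getClass.getDeclaredField(name)
    field.setAccessible(true) // bypass `private`
    field.get(target)
  }

  def main(args: Array[String]): Unit =
    println(readPrivateField(new SparkIMainLike, "classServer"))
}
```

If a distribution removes the field, getDeclaredField would throw a NoSuchFieldException: classServer at that point.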
Thanks, Kuba
Hi Kuba,
I use Spark 1.6 for Scala 2.10, and the Spark distribution is CDH 5.7's Spark 1.6.
The script I run to start Sparkling Water is ./bin/run-example.sh.
Thanks
Jason.
Thanks for the info.
I found what the problem is. CDH 5.7's Spark 1.6 contains some commits which were merged to master for the official Spark 2.0 release.
This can be seen in the Cloudera change log:
commit e0d03eb30e03f589407c3cf37317a64f18db8257
Author: Marcelo Vanzin <[email protected]>
Date: Thu Dec 10 13:26:30 2015 -0800
[SPARK-11563][CORE][REPL] Use RpcEnv to transfer REPL-generated classes.
This avoids bringing up yet another HTTP server on the driver, and
instead reuses the file server already managed by the driver's
RpcEnv. As a bonus, the repl now inherits the security features of
the network library.
There's also a small change to create the directory for storing classes
under the root temp dir for the application (instead of directly
under java.io.tmpdir).
Author: Marcelo Vanzin <[email protected]>
Closes #9923 from vanzin/SPARK-11563.
(cherry picked from commit 4a46b8859d3314b5b45a67cdc5c81fecb6e9e78c)
Since we don't support Spark 2.0 yet (and this change officially belongs to Spark 2.0), the corresponding changes are not implemented on our side so far. I'll file a JIRA and let you know about the progress.
Thanks.
I will move to the official Spark 1.6 to run Sparkling Water.
I moved to the official Spark 1.6 version and got the following error. Any suggestions?
Sparkling Water Context:
- H2O name: sparkling-water-root_1007693659
- number of executors: 1
- list of used executors:
  (executorId, host, port)
  (1,tracing030,54321)
Open H2O Flow in browser: http://172.168.0.19:54321 (CMD + click in Mac OSX)
Exception in thread "main" DistributedException from tracing030/172.168.0.30:54321, caused by java.lang.NullPointerException
at water.MRTask.getResult(MRTask.java:472)
at water.MRTask.doAll(MRTask.java:401)
at water.parser.ParseSetup.guessSetup(ParseSetup.java:222)
at water.parser.ParseSetup.guessSetup(ParseSetup.java:205)
at water.parser.ParseDataset.parse(ParseDataset.java:36)
at water.parser.ParseDataset.parse(ParseDataset.java:30)
at water.util.FrameUtils.parseFrame(FrameUtils.java:58)
at water.util.FrameUtils.parseFrame(FrameUtils.java:47)
at water.fvec.H2OFrame.<init>(H2OFrame.scala:65)
at water.fvec.H2OFrame.<init>(H2OFrame.scala:75)
at org.apache.spark.examples.h2o.DeepLearningDemo$.main(DeepLearningDemo.scala:47)
at org.apache.spark.examples.h2o.DeepLearningDemo.main(DeepLearningDemo.scala)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:606)
at org.apache.spark.deploy.SparkSubmit$.org$apache$spark$deploy$SparkSubmit$$runMain(SparkSubmit.scala:731)
at org.apache.spark.deploy.SparkSubmit$.doRunMain$1(SparkSubmit.scala:181)
at org.apache.spark.deploy.SparkSubmit$.submit(SparkSubmit.scala:206)
at org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:121)
at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala)
Caused by: java.lang.NullPointerException
at water.parser.ParseSetup$GuessSetupTsk.map(ParseSetup.java:284)
at water.MRTask.compute2(MRTask.java:587)
at water.H2O$H2OCountedCompleter.compute1(H2O.java:1184)
at water.parser.ParseSetup$GuessSetupTsk$Icer.compute1(ParseSetup$GuessSetupTsk$Icer.java)
at water.H2O$H2OCountedCompleter.compute(H2O.java:1180)
at jsr166y.CountedCompleter.exec(CountedCompleter.java:468)
at jsr166y.ForkJoinTask.doExec(ForkJoinTask.java:263)
at jsr166y.ForkJoinPool$WorkQueue.runTask(ForkJoinPool.java:974)
at jsr166y.ForkJoinPool.runWorker(ForkJoinPool.java:1477)
at jsr166y.ForkJoinWorkerThread.run(ForkJoinWorkerThread.java:104)
Hi Kuba,
We encounter the same issue (CDH 5.7.0, Spark 1.6.0). Is there any way to work around it, other than using the official Spark 1.6.1?
Hi Kuba,
Thanks for your reply.
After adding the conf option as you suggested, the exception java.lang.NoSuchFieldException: classServer is gone. However, another exception occurs; the relevant exception info is below:
16/06/06 14:38:59 INFO spark.SparkContext: Created broadcast 9 from textFile at AirlinesWithWeatherDemo2.scala:53
Exception in thread "main" DistributedException from nd10.localdomain/192.168.0.10:54321, caused by java.lang.NullPointerException
at water.MRTask.getResult(MRTask.java:472)
at water.MRTask.doAll(MRTask.java:401)
at water.parser.ParseSetup.guessSetup(ParseSetup.java:222)
at water.parser.ParseSetup.guessSetup(ParseSetup.java:205)
at water.parser.ParseDataset.parse(ParseDataset.java:36)
at water.parser.ParseDataset.parse(ParseDataset.java:30)
at water.util.FrameUtils.parseFrame(FrameUtils.java:58)
at water.util.FrameUtils.parseFrame(FrameUtils.java:47)
at water.fvec.H2OFrame.<init>(H2OFrame.scala:65)
at water.fvec.H2OFrame.<init>(H2OFrame.scala:75)
at org.apache.spark.examples.h2o.AirlinesWithWeatherDemo2$.main(AirlinesWithWeatherDemo2.scala:59)
at org.apache.spark.examples.h2o.AirlinesWithWeatherDemo2.main(AirlinesWithWeatherDemo2.scala)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:606)
at org.apache.spark.deploy.SparkSubmit$.org$apache$spark$deploy$SparkSubmit$$runMain(SparkSubmit.scala:731)
at org.apache.spark.deploy.SparkSubmit$.doRunMain$1(SparkSubmit.scala:181)
at org.apache.spark.deploy.SparkSubmit$.submit(SparkSubmit.scala:206)
at org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:121)
at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala)
Caused by: java.lang.NullPointerException
at water.parser.ParseSetup$GuessSetupTsk.map(ParseSetup.java:284)
at water.MRTask.compute2(MRTask.java:587)
at water.H2O$H2OCountedCompleter.compute1(H2O.java:1184)
at water.parser.ParseSetup$GuessSetupTsk$Icer.compute1(ParseSetup$GuessSetupTsk$Icer.java)
at water.H2O$H2OCountedCompleter.compute(H2O.java:1180)
at jsr166y.CountedCompleter.exec(CountedCompleter.java:468)
at jsr166y.ForkJoinTask.doExec(ForkJoinTask.java:263)
at jsr166y.ForkJoinPool$WorkQueue.runTask(ForkJoinPool.java:974)
at jsr166y.ForkJoinPool.runWorker(ForkJoinPool.java:1477)
at jsr166y.ForkJoinWorkerThread.run(ForkJoinWorkerThread.java:104)
The Spark cluster runs as Spark on YARN, and the job is submitted in yarn-client mode. 192.168.0.10 is one of the cluster's datanodes/Spark gateways.
Hi @Jason0812 @copoo,
sorry for the late response. I discussed this with my colleagues, and the problem in both your cases is that you are probably running Spark in a distributed environment such as YARN. In that case you need to ensure that the data from which you want to create H2OFrames is on a distributed filesystem such as HDFS, so that all nodes can see the file. AirlinesWithWeatherDemo2 is a demo which is supposed to be run in a local environment.
So before you start your Spark application, be sure that you put all the necessary files on HDFS. Then you can create an H2OFrame from a URI pointing to the location of the file on HDFS, like
new H2OFrame(new java.net.URI("hdfs://path/to/the/file"))
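A minimal sketch of that flow (the HDFS path and namenode address are placeholders, and this assumes a running Sparkling Water/H2O cluster):

```scala
// Sketch: create an H2OFrame from a file that all cluster nodes can reach.
// First copy the file to HDFS, e.g.: hadoop fs -put data.csv /data/data.csv
import java.net.URI
import water.fvec.H2OFrame

// "hdfs://namenode:8020/data/data.csv" is a placeholder URI.
val frame = new H2OFrame(new URI("hdfs://namenode:8020/data/data.csv"))
```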
Hi @Jason0812 @copoo ,
since I haven't heard from you for some time, I'm closing this issue. Feel free to reopen it or open a new one if you have other problems or questions.
Thanks, Kuba
I had the same issue with Cloudera CDH 5.8, Spark 1.6.1, and Sparkling Water 1.6.8.
The solution was to add the following, as explained by Kuba:
--conf spark.ext.h2o.repl.enabled=false
I'm having issues while H2O is trying to convert the data back to a Spark RDD:
Error: assertion failed: Should never be here, type is 3
at scala.Predef$.assert(Predef.scala:170)
at org.apache.spark.h2o.backends.internal.InternalReadConverterCtx$$anonfun$4.apply(InternalReadConverterCtx.scala:74)
at org.apache.spark.h2o.backends.internal.InternalReadConverterCtx$$anonfun$4.apply(InternalReadConverterCtx.scala:73)
at scala.collection.Map$WithDefault.default(Map.scala:53)
at scala.collection.MapLike$class.apply(MapLike.scala:141)
at scala.collection.AbstractMap.apply(Map.scala:59)
at org.apache.spark.h2o.backends.internal.InternalReadConverterCtx.string(InternalReadConverterCtx.scala:63)
at org.apache.spark.h2o.backends.internal.InternalReadConverterCtx.string(InternalReadConverterCtx.scala:29)
at org.apache.spark.h2o.converters.ReadConverterCtx$$anonfun$ExtractorsTable$8.apply(ReadConverterCtx.scala:111)
at org.apache.spark.h2o.converters.ReadConverterCtx$$anonfun$ExtractorsTable$8.apply(ReadConverterCtx.scala:111)
at org.apache.spark.h2o.backends.internal.InternalReadConverterCtx$$anonfun$returnOption$2.apply(InternalReadConverterCtx.scala:47)
at org.apache.spark.h2o.backends.internal.InternalReadConverterCtx$$anonfun$returnOption$2.apply(InternalReadConverterCtx.scala:46)
at scala.Option$WithFilter.flatMap(Option.scala:208)
at org.apache.spark.h2o.backends.internal.InternalReadConverterCtx.returnOption(InternalReadConverterCtx.scala:46)
at org.apache.spark.h2o.converters.ReadConverterCtx$$anonfun$org$apache$spark$h2o$converters$ReadConverterCtx$$OptionReadersMap$1$$anonfun$apply$1.apply(ReadConverterCtx.scala:119)
at org.apache.spark.h2o.converters.ReadConverterCtx$$anonfun$org$apache$spark$h2o$converters$ReadConverterCtx$$OptionReadersMap$1$$anonfun$apply$1.apply(ReadConverterCtx.scala:119)
at org.apache.spark.h2o.converters.H2ORDD$H2ORDDIterator$$anonfun$5$$anonfun$apply$1.apply(H2ORDD.scala:128)
at org.apache.spark.h2o.converters.H2ORDD$H2ORDDIterator$$anonfun$6.apply(H2ORDD.scala:132)
at org.apache.spark.h2o.converters.H2ORDD$H2ORDDIterator$$anonfun$6.apply(H2ORDD.scala:132)
at scala.collection.TraversableLike$$anonfun$map$1.apply(TraversableLike.scala:234)
at scala.collection.TraversableLike$$anonfun$map$1.apply(TraversableLike.scala:234)
at scala.collection.IndexedSeqOptimized$class.foreach(IndexedSeqOptimized.scala:33)
at scala.collection.mutable.ArrayOps$ofRef.foreach(ArrayOps.scala:186)
at scala.collection.TraversableLike$class.map(TraversableLike.scala:234)
at scala.collection.mutable.ArrayOps$ofRef.map(ArrayOps.scala:186)
at org.apache.spark.h2o.converters.H2ORDD$H2ORDDIterator.extractRow(H2ORDD.scala:132)
at org.apache.spark.h2o.converters.H2ORDD$H2ORDDIterator.readOneRow(H2ORDD.scala:188)
at org.apache.spark.h2o.converters.H2ORDD$H2ORDDIterator.hasNext(H2ORDD.scala:156)
at scala.collection.Iterator$$anon$11.hasNext(Iterator.scala:408)
at org.apache.spark.rdd.PairRDDFunctions$$anonfun$saveAsHadoopDataset$1$$anonfun$13$$anonfun$apply$7.apply$mcV$sp(PairRDDFunctions.scala:1210)
at org.apache.spark.rdd.PairRDDFunctions$$anonfun$saveAsHadoopDataset$1$$anonfun$13$$anonfun$apply$7.apply(PairRDDFunctions.scala:1210)
at org.apache.spark.rdd.PairRDDFunctions$$anonfun$saveAsHadoopDataset$1$$anonfun$13$$anonfun$apply$7.apply(PairRDDFunctions.scala:1210)
at org.apache.spark.util.Utils$.tryWithSafeFinallyAndFailureCallbacks(Utils.scala:1341)
at org.apache.spark.rdd.PairRDDFunctions$$anonfun$saveAsHadoopDataset$1$$anonfun$13.apply(PairRDDFunctions.scala:1218)
at org.apache.spark.rdd.PairRDDFunctions$$anonfun$saveAsHadoopDataset$1$$anonfun$13.apply(PairRDDFunctions.scala:1197)
at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:87)
at org.apache.spark.scheduler.Task.run(Task.scala:99)
at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:282)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
at java.lang.Thread.run(Thread.java:745)