
cdhproject's People

Contributors

fayson


cdhproject's Issues

Spark Streaming throws a "jaas.conf file does not exist" error when reading from Kafka

Hello,

I am submitting a Spark Streaming application with spark-submit in yarn-client mode to read from a Kerberos-enabled Kafka. As soon as the Kafka topic produces data, the application throws
java.io.IOException: /home/aspire/kerberos/jaas.conf (No such file or directory)
even though I have placed that file at this exact path on every node of the CDH cluster, and connecting to ZooKeeper with the same jaas.conf before reading from Kafka works fine.

Could you please help me figure this out?

My submit command is:
spark2-submit --master yarn --deploy-mode client \
--class com.zy.KrbKafkaStreaming --num-executors 2 \
--executor-memory 4G --executor-cores 2 \
--conf spark.core.connection.ack.wait.timeout=300 \
--conf spark.executor.memoryOverhead=1024 \
--conf spark.memory.storageFraction=0.4 \
--conf "spark.driver.extraJavaOptions=-Djava.security.auth.login.config=/home/aspire/kerberos/jaas.conf" \
--conf "spark.executor.extraJavaOptions=-Djava.security.auth.login.config=/home/aspire/kerberos/jaas.conf" \
/home/aspire/zhangyan/streamsimu/sparktrain-ch12-1.0.jar

The error is:
19/11/06 17:44:20 WARN scheduler.TaskSetManager: Lost task 1.0 in stage 2.0 (TID 6, node-62, executor 1): org.apache.kafka.common.KafkaException: Failed to construct kafka consumer
at org.apache.kafka.clients.consumer.KafkaConsumer.<init>(KafkaConsumer.java:789)
at org.apache.kafka.clients.consumer.KafkaConsumer.<init>(KafkaConsumer.java:608)
at org.apache.kafka.clients.consumer.KafkaConsumer.<init>(KafkaConsumer.java:589)
at org.apache.spark.streaming.kafka010.CachedKafkaConsumer.<init>(CachedKafkaConsumer.scala:45)
at org.apache.spark.streaming.kafka010.CachedKafkaConsumer$.get(CachedKafkaConsumer.scala:194)
at org.apache.spark.streaming.kafka010.KafkaRDDIterator.<init>(KafkaRDD.scala:252)
at org.apache.spark.streaming.kafka010.KafkaRDD.compute(KafkaRDD.scala:212)
at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:324)
at org.apache.spark.rdd.RDD.iterator(RDD.scala:288)
at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:49)
at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:324)
at org.apache.spark.rdd.RDD.iterator(RDD.scala:288)
at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:87)
at org.apache.spark.scheduler.Task.run(Task.scala:109)
at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:381)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
at java.lang.Thread.run(Thread.java:748)
Caused by: java.lang.SecurityException: java.io.IOException: /home/aspire/kerberos/jaas.conf (No such file or directory)
at sun.security.provider.ConfigFile$Spi.<init>(ConfigFile.java:137)
at sun.security.provider.ConfigFile.<init>(ConfigFile.java:102)
at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:62)
at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
at java.lang.reflect.Constructor.newInstance(Constructor.java:423)
at java.lang.Class.newInstance(Class.java:442)
at javax.security.auth.login.Configuration$2.run(Configuration.java:255)
at javax.security.auth.login.Configuration$2.run(Configuration.java:247)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.login.Configuration.getConfiguration(Configuration.java:246)
at org.apache.kafka.common.security.JaasContext.defaultContext(JaasContext.java:112)
at org.apache.kafka.common.security.JaasContext.load(JaasContext.java:96)
at org.apache.kafka.common.security.JaasContext.load(JaasContext.java:78)
at org.apache.kafka.common.network.ChannelBuilders.create(ChannelBuilders.java:103)
at org.apache.kafka.common.network.ChannelBuilders.clientChannelBuilder(ChannelBuilders.java:61)
at org.apache.kafka.clients.ClientUtils.createChannelBuilder(ClientUtils.java:86)
at org.apache.kafka.clients.consumer.KafkaConsumer.<init>(KafkaConsumer.java:710)
... 17 more
Caused by: java.io.IOException: /home/aspire/kerberos/jaas.conf (No such file or directory)
at sun.security.provider.ConfigFile$Spi.ioException(ConfigFile.java:666)
at sun.security.provider.ConfigFile$Spi.init(ConfigFile.java:262)
at sun.security.provider.ConfigFile$Spi.<init>(ConfigFile.java:135)
... 34 more

Thank you.
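For reference, a commonly reported cause of this symptom (not confirmed in this thread) is that the YARN containers cannot actually read the absolute path, for example because /home/aspire is not readable by the yarn user. A frequent workaround is to ship jaas.conf (and any keytab it references) with --files and point the executors at the container-local copy; the keytab filename below is an assumption:

```shell
# Sketch only: ship jaas.conf and its keytab into each YARN container,
# then reference the container-local copy on the executors. The driver
# runs on the submitting host, so it can keep the absolute path.
spark2-submit --master yarn --deploy-mode client \
  --class com.zy.KrbKafkaStreaming \
  --files /home/aspire/kerberos/jaas.conf,/home/aspire/kerberos/user.keytab \
  --conf "spark.driver.extraJavaOptions=-Djava.security.auth.login.config=/home/aspire/kerberos/jaas.conf" \
  --conf "spark.executor.extraJavaOptions=-Djava.security.auth.login.config=jaas.conf" \
  /home/aspire/zhangyan/streamsimu/sparktrain-ch12-1.0.jar
```

Note that if jaas.conf itself contains a keyTab=... entry, that path must also resolve inside the container; a relative name such as keyTab="user.keytab" works when the keytab is shipped with --files as above.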

Does scheduling Spark with Oozie require packaging hive-site.xml into the jar?

I am using CDH 5.12.1. When scheduling with Oozie, I put the spark-submit command in a shell script, and found that the job fails unless hive-site.xml is packaged into the jar. How can I point Oozie at hive-site.xml so that it no longer has to be bundled into the jar?
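One approach (a sketch, not verified against this setup) is to distribute hive-site.xml at submit time with --files instead of bundling it in the jar; the class and jar names below are placeholders:

```shell
# Inside the shell script that the Oozie action runs: ship hive-site.xml
# to the driver and executors so Spark SQL can locate the metastore,
# instead of compiling it into the application jar.
spark-submit --master yarn --deploy-mode cluster \
  --files /etc/hive/conf/hive-site.xml \
  --class com.example.MyJob \
  my-job.jar
```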

YARN resource queue question

Our CDH cluster has only Sentry enabled, not Kerberos. MapReduce jobs submitted through Hive/Beeline all run as the hive user, so we cannot isolate resource queues by the submitting user's group. How can this be solved?
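Pending a proper fix, one stopgap (an assumption on my part, not from this thread) is to route jobs by setting the target queue explicitly per Beeline session rather than relying on user-group placement rules, since every job arrives as the hive user anyway; the host and queue names below are placeholders:

```shell
# Pin this session's MapReduce jobs to a specific queue explicitly.
beeline -u "jdbc:hive2://hs2-host:10000/default" \
  --hiveconf mapreduce.job.queuename=root.etl \
  -e "select 1"
```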

oozie spark2 action on kerberized cluster

Hi,

I have recently been testing submitting a Spark action through Oozie on a CDH cluster and keep running into problems. I want to submit a spark2 action through Oozie on a cluster with Kerberos enabled; although I configured the principal and keytab as prompted, authentication keeps failing and I cannot figure out how to fix it. I would love to see an article on this topic. Thanks!

StreamSets UI time zone error

I am using StreamSets to sync binlog data and found that timestamp values are off by eight hours. Whichever time zone I choose on the StreamSets page, UTC or CST, the times come out wrong. What do I need to change?
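One knob worth checking (an assumption, not a confirmed fix for this report) is the Data Collector JVM's own time zone: SDC_JAVA_OPTS in sdc-env.sh can pin it to the database server's zone so binlog TIMESTAMP values are not shifted on read; the zone below is a placeholder:

```shell
# In sdc-env.sh (location varies by install method): make the SDC JVM
# use the same time zone as the MySQL server producing the binlog.
export SDC_JAVA_OPTS="-Duser.timezone=Asia/Shanghai ${SDC_JAVA_OPTS}"
```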

How can Spark access a Kerberos-enabled HBase in another cluster?

How should I go about reading data from cluster B's HBase into cluster A?

What I came up with is:

import org.apache.spark.SparkFiles
import org.apache.hadoop.conf.Configuration
import org.apache.hadoop.security.UserGroupInformation

// Ship cluster B's krb5.conf and keytab to every executor
sparkSession.sparkContext.addFile("hdfs://nameservice1/krb5B.conf")
sparkSession.sparkContext.addFile("hdfs://nameservice/clusterB.keytab")

val krb5Path = SparkFiles.get("krb5B.conf")
val principal = config.getJSONObject("auth").getString("principal")
val keytabPath = SparkFiles.get("clusterB.keytab")

// Point the JVM at cluster B's Kerberos config, then log in from the keytab
System.setProperty("java.security.krb5.conf", krb5Path)
val conf = new Configuration()
conf.set("hadoop.security.authentication", "kerberos")
UserGroupInformation.setConfiguration(conf)
UserGroupInformation.loginUserFromKeytab(principal, keytabPath)

Before reading from cluster B, call loginUserFromKeytab with cluster B's configuration; then, after the data has been read into a DataFrame, call loginUserFromKeytab again with cluster A's configuration.

I am not sure whether this approach is feasible.

hpl/sql create procedure

In an HPL/SQL stored procedure, is inserting data by partition supported? I keep getting the error "Function : not found partition".

Chinese characters display as mojibake after upgrading CDH from 5.15.0 to 6.3.1

Please help.
Environment:
Test environment: CentOS 7.5
CDH version: CDH 6.3.1
Problem:
After upgrading to 6.3.1, any YARN job containing Chinese characters displays them as garbage. For example, the Hive query string is shown as:
insert overwrite table temp.tmp_test_20191225 select '������' union all select 'iss' union all select '���������' union all select 'bigdata'
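A sketch of one common remedy (assuming the mojibake comes from a non-UTF-8 default encoding in the YARN-side JVMs, which this thread does not confirm): make sure the shell locale is UTF-8 and force file.encoding to UTF-8 for the Hive child tasks. The heap sizes below are placeholders.

```shell
# Ensure a UTF-8 locale on the submitting host, then override the
# MapReduce child JVM options so task JVMs also default to UTF-8.
export LANG=zh_CN.UTF-8
beeline -u "jdbc:hive2://localhost:10000/default" \
  --hiveconf mapreduce.map.java.opts="-Xmx1g -Dfile.encoding=UTF-8" \
  --hiveconf mapreduce.reduce.java.opts="-Xmx1g -Dfile.encoding=UTF-8" \
  -e "select '中文测试'"
```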

WeChat official-account question

Could you update the 必点 (must-read) menu with the titles from the message history? It looks like it has not been updated in a long time. I could not find a way to contact you through the official account, so I am asking here; sorry about that.

