
blinkdb's People

Contributors

alig, alupher, amirshim, anandpiyer, andyk, apanda, apivovarov, cengle, chenghao-intel, colorant, conviva-zz, dennybritz, haoyuan, harsha2010, harveyfeng, jasongiedymin, joshrosen, markhamstra, marmbrus, mateiz, mbautin, pwendell, rxin, sameeragarwal, sarahgerweck, sundeepn, tdas

blinkdb's Issues

hive_blinkdb build failed!!!!!

Hi,
Any help would be appreciated!

I followed the instructions at https://github.com/sameeragarwal/blinkdb/wiki/Running-BlinkDB-on-a-Cluster to build BlinkDB. However, the build fails at the first ant step, when building the hive_blinkdb submodule with
"$ ant package"

The version is 0.2.0.

$ cd blinkdb
$ git submodule init
$ git submodule update
$ cd hive_blinkdb
$ ant package

The build exits with the following errors (I changed none of the BlinkDB files or settings):

..................
ivy-retrieve:
[echo] Project: shims

compile:
[echo] Project: shims
[echo] Building shims 0.20

build_shims:
[echo] Project: shims
[echo] Compiling /home/ljz/blinkDB/blinkdb/hive_blinkdb/shims/src/common/java;/home/ljz/blinkDB/blinkdb/hive_blinkdb/shims/src/0.20/java against hadoop 0.20.2 (/home/ljz/blinkDB/blinkdb/hive_blinkdb/build/hadoopcore/hadoop-0.20.2)

ivy-init-settings:
[echo] Project: shims

ivy-resolve-hadoop-shim:
[echo] Project: shims
[ivy:resolve] :: loading settings :: file = /home/ljz/blinkDB/blinkdb/hive_blinkdb/ivy/ivysettings.xml

ivy-retrieve-hadoop-shim:
[echo] Project: shims
[echo] Building shims 0.20S

build_shims:
[echo] Project: shims
[echo] Compiling /home/ljz/blinkDB/blinkdb/hive_blinkdb/shims/src/common/java;/home/ljz/blinkDB/blinkdb/hive_blinkdb/shims/src/common-secure/java;/home/ljz/blinkDB/blinkdb/hive_blinkdb/shims/src/0.20S/java against hadoop 1.0.0 (/home/ljz/blinkDB/blinkdb/hive_blinkdb/build/hadoopcore/hadoop-1.0.0)

ivy-init-settings:
[echo] Project: shims

ivy-resolve-hadoop-shim:
[echo] Project: shims
[ivy:resolve] :: loading settings :: file = /home/ljz/blinkDB/blinkdb/hive_blinkdb/ivy/ivysettings.xml

ivy-retrieve-hadoop-shim:
[echo] Project: shims
[javac] Compiling 13 source files to /home/ljz/blinkDB/blinkdb/hive_blinkdb/build/shims/classes
[javac] warning: [options] bootstrap class path not set in conjunction with -source 1.6
[javac] /home/ljz/blinkDB/blinkdb/hive_blinkdb/shims/src/0.20S/java/org/apache/hadoop/hive/shims/Jetty20SShims.java:21: error: package org.mortbay.jetty.bio does not exist
[javac] import org.mortbay.jetty.bio.SocketConnector;
[javac] ^
[javac] /home/ljz/blinkDB/blinkdb/hive_blinkdb/shims/src/0.20S/java/org/apache/hadoop/hive/shims/Jetty20SShims.java:22: error: package org.mortbay.jetty.handler does not exist
[javac] import org.mortbay.jetty.handler.RequestLogHandler;
[javac] ^
[javac] /home/ljz/blinkDB/blinkdb/hive_blinkdb/shims/src/0.20S/java/org/apache/hadoop/hive/shims/Jetty20SShims.java:23: error: package org.mortbay.jetty.webapp does not exist
[javac] import org.mortbay.jetty.webapp.WebAppContext;
[javac] ^
[javac] /home/ljz/blinkDB/blinkdb/hive_blinkdb/shims/src/0.20S/java/org/apache/hadoop/hive/shims/Jetty20SShims.java:34: error: package org.mortbay.jetty does not exist
[javac] private static class Server extends org.mortbay.jetty.Server implements JettyShims.Server {
[javac] ^
[javac] /home/ljz/blinkDB/blinkdb/hive_blinkdb/shims/src/0.20S/java/org/apache/hadoop/hive/shims/Jetty20SShims.java:27: error: Jetty20SShims is not abstract and does not override abstract method startServer(String,int) in JettyShims
[javac] public class Jetty20SShims implements JettyShims {
[javac] ^
[javac] /home/ljz/blinkDB/blinkdb/hive_blinkdb/shims/src/0.20S/java/org/apache/hadoop/hive/shims/Jetty20SShims.java:28: error: startServer(String,int) in Jetty20SShims cannot implement startServer(String,int) in JettyShims
[javac] public Server startServer(String listen, int port) throws IOException {
[javac] ^
[javac] return type org.apache.hadoop.hive.shims.Jetty20SShims.Server is not compatible with org.apache.hadoop.hive.shims.JettyShims.Server
[javac] /home/ljz/blinkDB/blinkdb/hive_blinkdb/shims/src/0.20S/java/org/apache/hadoop/hive/shims/Jetty20SShims.java:34: error: org.apache.hadoop.hive.shims.Jetty20SShims.Server is not abstract and does not override abstract method stop() in org.apache.hadoop.hive.shims.JettyShims.Server
[javac] private static class Server extends org.mortbay.jetty.Server implements JettyShims.Server {
[javac] ^
[javac] /home/ljz/blinkDB/blinkdb/hive_blinkdb/shims/src/0.20S/java/org/apache/hadoop/hive/shims/Jetty20SShims.java:36: error: cannot find symbol
[javac] WebAppContext wac = new WebAppContext();
[javac] ^
[javac] symbol: class WebAppContext
[javac] location: class Server
[javac] /home/ljz/blinkDB/blinkdb/hive_blinkdb/shims/src/0.20S/java/org/apache/hadoop/hive/shims/Jetty20SShims.java:36: error: cannot find symbol
[javac] WebAppContext wac = new WebAppContext();
[javac] ^
[javac] symbol: class WebAppContext
[javac] location: class Server
[javac] /home/ljz/blinkDB/blinkdb/hive_blinkdb/shims/src/0.20S/java/org/apache/hadoop/hive/shims/Jetty20SShims.java:39: error: cannot find symbol
[javac] RequestLogHandler rlh = new RequestLogHandler();
[javac] ^
[javac] symbol: class RequestLogHandler
[javac] location: class Server
[javac] /home/ljz/blinkDB/blinkdb/hive_blinkdb/shims/src/0.20S/java/org/apache/hadoop/hive/shims/Jetty20SShims.java:39: error: cannot find symbol
[javac] RequestLogHandler rlh = new RequestLogHandler();
[javac] ^
[javac] symbol: class RequestLogHandler
[javac] location: class Server
[javac] /home/ljz/blinkDB/blinkdb/hive_blinkdb/shims/src/0.20S/java/org/apache/hadoop/hive/shims/Jetty20SShims.java:47: error: cannot find symbol
[javac] SocketConnector connector = new SocketConnector();
[javac] ^
[javac] symbol: class SocketConnector
[javac] location: class Server
[javac] /home/ljz/blinkDB/blinkdb/hive_blinkdb/shims/src/0.20S/java/org/apache/hadoop/hive/shims/Jetty20SShims.java:47: error: cannot find symbol
[javac] SocketConnector connector = new SocketConnector();
[javac] ^
[javac] symbol: class SocketConnector
[javac] location: class Server
[javac] Note: /home/ljz/blinkDB/blinkdb/hive_blinkdb/shims/src/common-secure/java/org/apache/hadoop/hive/shims/HadoopShimsSecure.java uses or overrides a deprecated API.
[javac] Note: Recompile with -Xlint:deprecation for details.
[javac] Note: /home/ljz/blinkDB/blinkdb/hive_blinkdb/shims/src/common-secure/java/org/apache/hadoop/hive/shims/HadoopShimsSecure.java uses unchecked or unsafe operations.
[javac] Note: Recompile with -Xlint:unchecked for details.
[javac] 13 errors
[javac] 1 warning

BUILD FAILED
/home/ljz/blinkDB/blinkdb/hive_blinkdb/build.xml:319: The following error occurred while executing this line:
/home/ljz/blinkDB/blinkdb/hive_blinkdb/build.xml:169: The following error occurred while executing this line:
/home/ljz/blinkDB/blinkdb/hive_blinkdb/shims/build.xml:92: The following error occurred while executing this line:
/home/ljz/blinkDB/blinkdb/hive_blinkdb/shims/build.xml:95: The following error occurred while executing this line:
/home/ljz/blinkDB/blinkdb/hive_blinkdb/shims/build.xml:84: Compile failed; see the compiler error output for details.
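
A note on reading the log above: the first four errors are all "package ... does not exist", and the remaining "cannot find symbol" and "does not override" errors plausibly follow from them. A small sketch that pulls the distinct missing packages out of javac output is below. The function name and the abbreviated sample lines are illustrative assumptions, not part of the BlinkDB or Hive build.

```python
import re

# Matches javac diagnostics of the form "error: package X does not exist".
MISSING_PACKAGE = re.compile(r"error: package ([\w.]+) does not exist")


def missing_packages(log_text):
    """Return the distinct missing package names, in order of first appearance."""
    seen = []
    for name in MISSING_PACKAGE.findall(log_text):
        if name not in seen:
            seen.append(name)
    return seen


# Abbreviated copies of the javac lines from the log (file paths shortened).
sample_log = """\
[javac] Jetty20SShims.java:21: error: package org.mortbay.jetty.bio does not exist
[javac] Jetty20SShims.java:22: error: package org.mortbay.jetty.handler does not exist
[javac] Jetty20SShims.java:23: error: package org.mortbay.jetty.webapp does not exist
[javac] Jetty20SShims.java:34: error: package org.mortbay.jetty does not exist
[javac] Jetty20SShims.java:36: error: cannot find symbol
"""

for package in missing_packages(sample_log):
    print(package)
```

Every package it reports sits under org.mortbay.jetty, which suggests that the Jetty jar was not on the compile classpath for the 0.20S shims. That is a reading of the log, not a confirmed diagnosis.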

integration of BlinkDB directly with Spark

Hi,

We are in the process of evaluating BlinkDB for supporting interactive count(*) queries on a table with various filters.
As per https://github.com/sameeragarwal/blinkdb/wiki/Running-BlinkDB-on-a-Cluster, we need to "Copy the Spark and BlinkDB directories to slaves".
But from the few BlinkDB talks I have seen, it looks like only the Spark client needs to be modified, so that it rewrites the query plan to use sample tables instead of the original table. That would mean we need to build and install BlinkDB on only one machine.
Please correct me if I am wrong about this.

Since our organization is huge, it would be difficult to ask the infrastructure team to apply a patch to Spark to support BlinkDB. We are currently using Spark 2.1.0. It would be easier for us to ask our infrastructure team to upgrade the Spark version.
So, we are wondering when native support for BlinkDB will be available in Spark.

Also, we would be grateful if someone could point us to documentation for creating a proof of concept around BlinkDB.

running example using "within xx seconds"

When I run simple SQL queries like

Select approx_count(*) from table1 where A = 'xx' within 0.1 seconds

I run into an error: Parse Error: mismatched input 'within' expecting EOF near ''table1''

Does BlinkDB support clauses like "within xx seconds", as shown in the running examples in the paper?

blinkdb converts the whole command line to lowercase

BlinkDB fails when a SERDE is used, for example

ROW FORMAT SERDE 'org.openx.data.jsonserde.JsonSerDe'

The JsonSerDe class will not be found, because BlinkDB converts the command to

ROW FORMAT SERDE 'org.openx.data.jsonserde.jsonserde'

I suggest modifying SharkDriver.scala, line 175.
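
One possible shape for such a change is to lowercase only the text outside quoted literals, so that case-sensitive class names survive. The sketch below illustrates the idea in Python; it is not the code in SharkDriver.scala, the function name is hypothetical, and it only handles single-quoted literals.

```python
import re

# A single-quoted literal, allowing backslash-escaped characters inside it.
QUOTED = re.compile(r"'(?:\\.|[^'\\])*'")


def lowercase_outside_quotes(command):
    """Lowercase a command while leaving single-quoted literals untouched."""
    parts = []
    pos = 0
    for match in QUOTED.finditer(command):
        # Text between the previous literal and this one is safe to lowercase.
        parts.append(command[pos:match.start()].lower())
        # The literal itself is copied through unchanged.
        parts.append(match.group(0))
        pos = match.end()
    parts.append(command[pos:].lower())
    return "".join(parts)


print(lowercase_outside_quotes(
    "ROW FORMAT SERDE 'org.openx.data.jsonserde.JsonSerDe'"
))
```

Double-quoted strings and backtick-quoted identifiers would need the same treatment before this could be relied on.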

Issue while running bin/blinkdb on EMR Hadoop2.4

15/12/14 18:06:03 ERROR hive.log: org.apache.hadoop.ipc.RemoteException: Server IPC version 9 cannot communicate with client version 4
at org.apache.hadoop.ipc.Client.call(Client.java:1070)
at org.apache.hadoop.ipc.RPC$Invoker.invoke(RPC.java:225)
at com.sun.proxy.$Proxy5.getProtocolVersion(Unknown Source)
at org.apache.hadoop.ipc.RPC.getProxy(RPC.java:396)
at org.apache.hadoop.ipc.RPC.getProxy(RPC.java:379)
at org.apache.hadoop.hdfs.DFSClient.createRPCNamenode(DFSClient.java:119)
at org.apache.hadoop.hdfs.DFSClient.<init>(DFSClient.java:238)
at org.apache.hadoop.hdfs.DFSClient.<init>(DFSClient.java:203)
at org.apache.hadoop.hdfs.DistributedFileSystem.initialize(DistributedFileSystem.java:89)
at org.apache.hadoop.fs.FileSystem.createFileSystem(FileSystem.java:1386)
at org.apache.hadoop.fs.FileSystem.access$200(FileSystem.java:66)
at org.apache.hadoop.fs.FileSystem$Cache.get(FileSystem.java:1404)
at org.apache.hadoop.fs.FileSystem.get(FileSystem.java:254)
at org.apache.hadoop.fs.FileSystem.get(FileSystem.java:123)
at org.apache.hadoop.fs.FileSystem.get(FileSystem.java:238)
at org.apache.hadoop.fs.Path.getFileSystem(Path.java:187)
at org.apache.hadoop.hive.metastore.Warehouse.getFs(Warehouse.java:104)
at org.apache.hadoop.hive.metastore.Warehouse.getDnsPath(Warehouse.java:136)
at org.apache.hadoop.hive.metastore.Warehouse.getWhRoot(Warehouse.java:151)
at org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.getDefaultDatabasePath(HiveMetaStore.java:475)
at org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.createDefaultDB_core(HiveMetaStore.java:353)
at org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.createDefaultDB(HiveMetaStore.java:371)
at org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.init(HiveMetaStore.java:278)
at org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.<init>(HiveMetaStore.java:248)
at org.apache.hadoop.hive.metastore.HiveMetaStoreClient.<init>(HiveMetaStoreClient.java:114)
at org.apache.hadoop.hive.ql.metadata.Hive.createMetaStoreClient(Hive.java:2092)
at org.apache.hadoop.hive.ql.metadata.Hive.getMSC(Hive.java:2102)
at org.apache.hadoop.hive.ql.metadata.Hive.getAllDatabases(Hive.java:1009)
at shark.memstore2.TableRecovery$.reloadRdds(TableRecovery.scala:49)
at shark.SharkCliDriver.<init>(SharkCliDriver.scala:273)
at shark.SharkCliDriver$.main(SharkCliDriver.scala:162)
at shark.SharkCliDriver.main(SharkCliDriver.scala)

Exception in thread "main" org.apache.hadoop.hive.ql.metadata.HiveException: MetaException(message:Got exception: org.apache.hadoop.ipc.RemoteException Server IPC version 9 cannot communicate with client version 4)
at org.apache.hadoop.hive.ql.metadata.Hive.getAllDatabases(Hive.java:1011)
at shark.memstore2.TableRecovery$.reloadRdds(TableRecovery.scala:49)
at shark.SharkCliDriver.<init>(SharkCliDriver.scala:273)
at shark.SharkCliDriver$.main(SharkCliDriver.scala:162)
at shark.SharkCliDriver.main(SharkCliDriver.scala)
Caused by: MetaException(message:Got exception: org.apache.hadoop.ipc.RemoteException Server IPC version 9 cannot communicate with client version 4)
at org.apache.hadoop.hive.metastore.MetaStoreUtils.logAndThrowMetaException(MetaStoreUtils.java:785)
at org.apache.hadoop.hive.metastore.Warehouse.getFs(Warehouse.java:106)
at org.apache.hadoop.hive.metastore.Warehouse.getDnsPath(Warehouse.java:136)
at org.apache.hadoop.hive.metastore.Warehouse.getWhRoot(Warehouse.java:151)
at org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.getDefaultDatabasePath(HiveMetaStore.java:475)
at org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.createDefaultDB_core(HiveMetaStore.java:353)
at org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.createDefaultDB(HiveMetaStore.java:371)
at org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.init(HiveMetaStore.java:278)
at org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.<init>(HiveMetaStore.java:248)
at org.apache.hadoop.hive.metastore.HiveMetaStoreClient.<init>(HiveMetaStoreClient.java:114)
at org.apache.hadoop.hive.ql.metadata.Hive.createMetaStoreClient(Hive.java:2092)
at org.apache.hadoop.hive.ql.metadata.Hive.getMSC(Hive.java:2102)
at org.apache.hadoop.hive.ql.metadata.Hive.getAllDatabases(Hive.java:1009)
... 4 more
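
For anyone triaging similar traces: the key line is the IPC version mismatch, which is commonly seen when Hadoop 1.x client jars talk to a Hadoop 2.x cluster (the title mentions Hadoop 2.4). The sketch below only extracts the two version numbers from a log; the function name is an illustrative assumption, and mapping the numbers to exact Hadoop releases is left out because it is not stated in this thread.

```python
import re

# The RemoteException text reported by the Hadoop RPC layer.
IPC_MISMATCH = re.compile(
    r"Server IPC version (\d+) cannot communicate with client version (\d+)"
)


def ipc_versions(log_text):
    """Return (server_version, client_version), or None if no mismatch is logged."""
    match = IPC_MISMATCH.search(log_text)
    if match is None:
        return None
    return int(match.group(1)), int(match.group(2))


line = (
    "ERROR hive.log: org.apache.hadoop.ipc.RemoteException: "
    "Server IPC version 9 cannot communicate with client version 4"
)
versions = ipc_versions(line)
if versions is not None:
    server, client = versions
    print(f"server speaks IPC v{server}, client speaks IPC v{client}")
```

A client version lower than the server version points at the Hadoop jars bundled with the client, here the BlinkDB/Shark build, being older than the cluster.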

Problem with using approx functions

I was trying to run the approx functions locally. However, they behave differently from what is in the tutorial. approx_sum needs 3 arguments, and I am not sure what to pass in order to get an approximation. Can someone help me use them? Also, I cannot use error, samplewith...
shark> describe rand5000;
OK
numbers int
Time taken: 0.143 seconds
shark> select approx_sum(numbers) from rand5000;
62.169: [GC (Metadata GC Threshold) 179982K->30429K(1005056K), 0.0271976 secs]
62.196: [Full GC (Metadata GC Threshold) 30429K->19981K(1005056K), 0.0866642 secs]
63.093: [GC (System.gc()) 45909K->21763K(1005056K), 0.0042580 secs]
63.097: [Full GC (System.gc()) 21763K->10910K(1005056K), 0.1983384 secs]
63.299: [GC (System.gc()) 15782K->11078K(1005056K), 0.0017398 secs]
63.301: [Full GC (System.gc()) 11078K->9947K(1005056K), 0.0761869 secs]
FAILED: Error in semantic analysis: Exactly one argument is expected.
shark> select approx_sum(numbers, numbers, numbers) from rand5000;
130.911: [GC (Allocation Failure) 264388K->30538K(1005056K), 0.0253560 secs]
132.398: [GC (Allocation Failure) 292682K->51941K(1005056K), 0.0284526 secs]
132.821: [GC (Allocation Failure) 309431K->30135K(1005056K), 0.0112377 secs]
OK
2471570.0 +/- NaN (99% Confidence)
Time taken: 4.154 seconds
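
For context on the "+/-" part of that output, here is a minimal sketch of how a sum can be estimated from a uniform random sample together with a 99% error bar. It is a textbook estimator written for illustration, not BlinkDB's implementation; the function name, its parameters, and the normal-approximation quantile are all assumptions.

```python
import math
import statistics


def approx_sum(sample, population_size, z=2.576):
    """Estimate a population sum from a uniform random sample.

    Returns (estimate, half_width). z=2.576 is the approximate two-sided
    99% quantile of the standard normal distribution.
    """
    n = len(sample)
    mean = statistics.fmean(sample)
    estimate = population_size * mean
    if n < 2:
        # The sample variance is undefined, so no error bar can be given.
        return estimate, math.nan
    std_err = statistics.stdev(sample) / math.sqrt(n)
    return estimate, z * population_size * std_err


estimate, half_width = approx_sum([1, 2, 3, 4, 5], population_size=100)
print(f"{estimate} +/- {half_width:.2f} (99% Confidence)")
```

In this sketch the error bar is NaN only when the variance cannot be computed. Whether something similar explains the NaN reported above is not known from this thread.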

Spark 1.0.x

Are there plans to have BlinkDB working with Spark 1.0.0/1.0.1?

CDH5 and Spark

Hi!
Do you have any experience deploying on a CDH5 cluster, using the Spark that ships with it?
Is standalone Spark strictly necessary?
Thanks

Is this project dead?

I thought BlinkDB was a very promising project. I do not see any activity on this fork or any other fork more recent than June 2014.

What's the current status? Is this project dead?

BlinkDB building issue

Hey,

I am trying to build BlinkDB using the command

sbt/sbt package

but it keeps throwing the following error:

[info] Resolving org.scala-tools.testing#test-interface;0.5 ...
[warn] ::::::::::::::::::::::::::::::::::::::::::::::
[warn] :: UNRESOLVED DEPENDENCIES ::
[warn] ::::::::::::::::::::::::::::::::::::::::::::::
[warn] :: org.apache.spark#spark-core_2.9.3;0.8.0-SNAPSHOT: not found
[warn] :: org.apache.spark#spark-repl_2.9.3;0.8.0-SNAPSHOT: not found
[warn] ::::::::::::::::::::::::::::::::::::::::::::::
sbt.ResolveException: unresolved dependency: org.apache.spark#spark-core_2.9.3;0.8.0-SNAPSHOT: not found
unresolved dependency: org.apache.spark#spark-repl_2.9.3;0.8.0-SNAPSHOT: not found

I tried rebuilding my Spark 0.8.0 using the following command,

SPARK_HADOOP_VERSION=2.0.0-cdh4.5.0 sbt/sbt publish-local

but all in vain.

Any help?
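
A quick way to check whether publish-local produced what the build is looking for: sbt's publish-local normally writes to the local Ivy repository under <ivy home>/local/<organization>/<module>/<revision>. The sketch below turns the "not found" lines into the directories one would expect to exist. The function name is hypothetical, and the default Ivy home of ~/.ivy2 is an assumption that may not match every setup.

```python
import re
from pathlib import Path

# Matches sbt/Ivy warnings such as ":: org#module;revision: not found".
UNRESOLVED = re.compile(r"::\s+([\w.\-]+)#([\w.\-]+);([\w.\-]+): not found")


def expected_local_dirs(log_text, ivy_home=Path.home() / ".ivy2"):
    """List the local Ivy directories where each unresolved module would live."""
    dirs = []
    for org, module, revision in UNRESOLVED.findall(log_text):
        candidate = Path(ivy_home) / "local" / org / module / revision
        if candidate not in dirs:
            dirs.append(candidate)
    return dirs


sample_log = """\
[warn] :: UNRESOLVED DEPENDENCIES ::
[warn] :: org.apache.spark#spark-core_2.9.3;0.8.0-SNAPSHOT: not found
[warn] :: org.apache.spark#spark-repl_2.9.3;0.8.0-SNAPSHOT: not found
"""

for directory in expected_local_dirs(sample_log):
    status = "present" if directory.is_dir() else "missing"
    print(f"{directory}: {status}")
```

If the directories are missing, it is worth comparing the version and Scala suffix that the Spark build actually published against the 0.8.0-SNAPSHOT and 2.9.3 that this build requests.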
