
hive-testbench's Introduction

RETIRED

This repository has moved to the Hortonworks GitHub.

Make pull requests against that repository.

hive-testbench's People

Contributors

cartershanklin, t3rmin4t0r


hive-testbench's Issues

Shouldn't data generation and loading be separated?

According to the TPC-DS specification (section 7.4.3), data generation time should not be measured. Right now hive-testbench does not allow generation to be separated from loading into the database, because both are done by the single tpcds-setup.sh script, so it is difficult to measure the data load alone.

Correct me if I haven't understood the specification or if I don't know how to use hive-testbench.

EDIT: After a moment of consideration I have a doubt: does hive-testbench perform an "in-line load" (as described in section 7.4.3.2 of the TPC-DS specification), in which case the generation should actually contribute to the load time?

tpcds-build.sh fails due to change in data structure

The top-level directory inside the TPC-DS kit archive has changed from "DS Tools" to "TPC-DS v1.3.0", so the mv step in the build fails.

I'll send a pull request once fixed.

hive-testbench]$ ./tpcds-build.sh
Maven not found, automatically installing it.
Building TPC-DS Data Generator
curl --output tpcds_kit.zip http://www.tpc.org/tpcds/dsgen/dsgen-download-files.asp?download_key=NaN
  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
100 1805k  100 1805k    0     0   664k      0  0:00:02  0:00:02 --:--:--  715k
mkdir -p target/
cp tpcds_kit.zip target/tpcds_kit.zip
test -d target/tools/ || (cd target; unzip tpcds_kit.zip)
Archive:  tpcds_kit.zip
   creating: TPC-DS v1.3.0/
[... snipped ...]
  inflating: TPC-DS v1.3.0/tools/y.tab.h
test -d target/tools/ || (cd target; mv "DS Tools/tools" tools)
mv: cannot stat `DS Tools/tools': No such file or directory
make: *** [target/tools/dsdgen] Error 1
TPC-DS Data Generator built, you can now use tpcds-setup.sh to generate data.

values.h should be limits.h

Building on Mac OS X fails with the error below.
Some googling showed that values.h is obsolete (it is not part of ANSI C) and that limits.h should be used instead. If the include in porting.h is changed accordingly, the build proceeds.

In file included from mkheader.c:37:
./porting.h:46:10: fatal error: 'values.h' file not found
#include <values.h>
         ^
1 error generated.
make[1]: *** [mkheader.o] Error 1
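
A minimal sketch of the change described above, assuming the only thing the code needs from values.h is a constant such as MAXINT (the issue does not say which symbols porting.h relies on). The helper function max_int_value is hypothetical and exists only so the excerpt can be exercised.

```c
/* Hypothetical excerpt in the spirit of porting.h, not the actual file. */
#include <limits.h>   /* INT_MAX: standard C replacement for values.h limits */
#include <float.h>    /* DBL_MAX: floating-point limits, also standard C */

/* values.h historically supplied MAXINT. Assumption: the toolkit uses it.
 * Define it from the standard name only if nothing else has done so. */
#ifndef MAXINT
#define MAXINT INT_MAX
#endif

/* Hypothetical helper so the mapping above can be checked. */
static int max_int_value(void) {
    return MAXINT;
}
```

Guarding the definition with #ifndef keeps the excerpt harmless on platforms where values.h is still present and already defines the macro.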

Failed to run job: user <user> to unknown queue: default

Hello!

Please help me to run the hive-testbench.
We have a YARN scheduler configuration that splits the queue in two (dev1 70% and dev2 30%).
When I tried to run $ ./tpcds-setup.sh 5, it gave me this error:
Exception in thread "main" java.io.IOException: Failed to run job : Application application_1444364220514_0016 submitted by user hive to unknown queue: default

I tried to add the user hive to a queue in the YARN scheduler, but that failed.
Where do I set the queue in the hive-testbench settings?

Any help is highly appreciated! Thanks!

clarity on branches and Hive/Tez versions

Can you provide some clarity about what branch should be used for what version of Hive/Tez?

The hive13 branch README.md references testbench.settings but the file is init.settings in that branch. That settings file also sets hive.execution.engine=tez.

The master branch uses testbench.settings and has a second file, not in the README - testbench-withATS.settings, but those settings use hive.execution.engine=mr.

hive run query24.sql error

Hi, please help me.
Running query24.sql in Hive gives this error:
FAILED: ParseException line 46:23 cannot recognize input near 'select' '0.05' '*' in expression specification

Hive version: 1.1.0

misspelled defines

Running tpcds-build.sh gives:

In file included from w_store_returns.c:40:
./w_store_sales.h:36:9: warning: 'W_STORE_SALES_H' is used as a header guard here, followed by #define of a different macro [-Wheader-guard]
#ifndef W_STORE_SALES_H
        ^~~~~~~~~~~~~~~
./w_store_sales.h:37:9: note: 'W_STORE_SLAES_H' is defined here; did you mean 'W_STORE_SALES_H'?
#define W_STORE_SLAES_H
        ^~~~~~~~~~~~~~~
        W_STORE_SALES_H
1 warning generated.
gcc -D_FILE_OFFSET_BITS=64 -D_LARGEFILE_SOURCE -DYYDEBUG -DLINUX -g -Wall -c -o w_store_sales.o w_store_sales.c
In file included from w_store_sales.c:39:
./w_store_sales.h:36:9: warning: 'W_STORE_SALES_H' is used as a header guard here, followed by #define of a different macro [-Wheader-guard]
#ifndef W_STORE_SALES_H
        ^~~~~~~~~~~~~~~
./w_store_sales.h:37:9: note: 'W_STORE_SLAES_H' is defined here; did you mean 'W_STORE_SALES_H'?
#define W_STORE_SLAES_H
        ^~~~~~~~~~~~~~~
        W_STORE_SALES_H
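
For illustration, a sketch of what a consistent include guard for w_store_sales.h could look like, following the compiler's own suggestion. The real header's declarations are not shown in the issue, so the function w_store_sales_guard_demo is a hypothetical placeholder.

```c
/* Sketch of a consistent include guard; not the actual w_store_sales.h. */
#ifndef W_STORE_SALES_H
#define W_STORE_SALES_H   /* spelled exactly like the #ifndef above */

/* The real header's declarations would go here.
 * Hypothetical placeholder so the guard has something to protect. */
static int w_store_sales_guard_demo(void) {
    return 1;
}

#endif /* W_STORE_SALES_H */
```

With the misspelled macro, the guard never takes effect: W_STORE_SALES_H is never defined, so a second inclusion of the header would be processed again in full.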

ANSI SQL-92 version of hive13 branch queries

In your blog post Benchmarking Apache Hive 13 for Enterprise Hadoop you cite this repo as the source, but Hive 0.10 requires ANSI SQL-92 join syntax and the hive13 branch contains only ANSI SQL-89 versions. In keeping with complete openness, can you add the ANSI SQL-92 versions of the queries that were used for the Hive 0.10 benchmark? I'll also point out that there are ANSI SQL-92 versions of the TPC-DS queries in the 1.1 branch, but their filters do not match the versions in the hive13 or master branch, so they cannot be used for comparison without modification.
