
mysharedocuments's Introduction

  1. Change the Alluxio configuration:
alluxio.worker.tieredstore.level0.dirs.quota=40GB
alluxio.user.block.size.bytes.default=128MB
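One way to apply the two Alluxio properties without leaving duplicate entries behind is sketched below. The `set_prop` helper is hypothetical, and the file location (`conf/alluxio-site.properties` under the Alluxio install directory) is an assumption about your setup rather than something stated in these steps.

```shell
#!/usr/bin/env bash
# set_prop FILE KEY VALUE
# Hypothetical helper: sets KEY=VALUE in a Java-style properties file,
# replacing an existing "KEY=" line or appending a new one.
set_prop() {
  local file="$1" key="$2" value="$3"
  touch "$file"
  # Keep every line that does not start with the literal text "KEY=".
  awk -v k="$key" 'index($0, k "=") != 1' "$file" > "${file}.tmp"
  printf '%s=%s\n' "$key" "$value" >> "${file}.tmp"
  mv "${file}.tmp" "$file"
}

# Usage (the path is an assumption; adjust it to your installation):
#   props=/opt/alluxio/conf/alluxio-site.properties
#   set_prop "$props" alluxio.worker.tieredstore.level0.dirs.quota 40GB
#   set_prop "$props" alluxio.user.block.size.bytes.default 128MB
```

Running the helper twice with the same key leaves a single entry holding the latest value, so the step can be repeated safely.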

  2. Change the Hive configuration

cp /opt/hive/conf/hive-env.sh.template /opt/hive/conf/hive-env.sh

Then edit /opt/hive/conf/hive-env.sh:

comment out the following lines

 if [ "$SERVICE" = "cli" ]; then
   if [ -z "$DEBUG" ]; then
     export HADOOP_OPTS="$HADOOP_OPTS -XX:NewRatio=12 -Xms10m -XX:MaxHeapFreeRatio=40 -XX:MinHeapFreeRatio=15 -XX:+UseParNewGC -XX:-UseGCOverheadLimit"
   else
     export HADOOP_OPTS="$HADOOP_OPTS -XX:NewRatio=12 -Xms10m -XX:MaxHeapFreeRatio=40 -XX:MinHeapFreeRatio=15 -XX:-UseGCOverheadLimit"
   fi
fi
export HADOOP_HEAPSIZE=1024

and add

export HIVE_OPTS="-hiveconf mapreduce.map.memory.mb=4096 -hiveconf mapreduce.reduce.memory.mb=5120"
  3. Change the Presto configuration. In /opt/presto/etc/config.properties, set:
query.max-memory=20GB
query.max-memory-per-node=20GB
query.max-total-memory-per-node=20GB

In /opt/presto/etc/jvm.config, set:

-Xmx35G
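The 20GB per-node limits and the 35G heap are related: in Presto versions that expose query.max-total-memory-per-node, the per-node total plus a reserved heap headroom has to fit inside the JVM heap. The sketch below checks that arithmetic. It assumes the headroom defaults to 30% of the heap, which may differ in your Presto version, and `check_presto_memory` is a hypothetical helper, not a Presto tool.

```shell
#!/usr/bin/env bash
# check_presto_memory HEAP_GB TOTAL_PER_NODE_GB [HEADROOM_PERCENT]
# Succeeds when total per-node query memory plus heap headroom fits in the heap.
# HEADROOM_PERCENT defaults to 30 (an assumption about the Presto default).
check_presto_memory() {
  local heap_gb="$1" total_gb="$2" headroom_pct="${3:-30}"
  # Work in tenths of a GB so integer arithmetic keeps one decimal place.
  local headroom_tenths=$(( heap_gb * headroom_pct / 10 ))
  local needed_tenths=$(( total_gb * 10 + headroom_tenths ))
  local heap_tenths=$(( heap_gb * 10 ))
  echo "needed=$(( needed_tenths / 10 )).$(( needed_tenths % 10 ))GB heap=${heap_gb}GB"
  [ "$needed_tenths" -le "$heap_tenths" ]
}

# Usage with the values from this step:
#   check_presto_memory 35 20 && echo "memory settings fit"
```

With the values above, 20GB plus 10.5GB of headroom fits in 35GB, while the same limits with a noticeably smaller heap would not.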
  4. Restart all services and mount the S3 data set
alluxio fs mount --readonly --option aws.accessKeyId=**** --option aws.secretKey=*** /s3 s3a://autobots-tpcds-test-data/parquet/scale100
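To keep the credentials out of shell history and scripts, the same mount command can be wrapped so that it reads them from the environment. This is a sketch only: the `mount_s3` function and the `ALLUXIO_BIN` override are hypothetical names, and it assumes `AWS_ACCESS_KEY_ID` and `AWS_SECRET_ACCESS_KEY` are already exported.

```shell
#!/usr/bin/env bash
# mount_s3: run the mount command from this step with credentials taken from
# the environment. ALLUXIO_BIN is a hypothetical override for the binary name;
# setting it to "echo" prints the arguments instead of mounting (dry run).
mount_s3() {
  : "${AWS_ACCESS_KEY_ID:?export AWS_ACCESS_KEY_ID first}"
  : "${AWS_SECRET_ACCESS_KEY:?export AWS_SECRET_ACCESS_KEY first}"
  "${ALLUXIO_BIN:-alluxio}" fs mount --readonly \
    --option "aws.accessKeyId=${AWS_ACCESS_KEY_ID}" \
    --option "aws.secretKey=${AWS_SECRET_ACCESS_KEY}" \
    /s3 s3a://autobots-tpcds-test-data/parquet/scale100
}

# Usage:
#   ALLUXIO_BIN=echo mount_s3   # dry run, prints the arguments
#   mount_s3                    # performs the mount
```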
  5. Put a small table, promotion, into Alluxio at the path alluxio://localhost:19998/promotion. The table size is 54KB.
wget https://autobots-tpcds-test-data.s3.amazonaws.com/parquet/scale100/promotion/part-00000-799bb353-4be8-4189-b02e-8ccb71463cbf-c000.snappy.parquet

alluxio fs mkdir /promotion

alluxio fs copyFromLocal part-00000-799bb353-4be8-4189-b02e-8ccb71463cbf-c000.snappy.parquet /promotion/
  6. Create the external Alluxio tables in Hive

Create the store_sales, store_returns, web_sales, and web_returns tables, whose data is in the Alluxio-mounted S3 directory, using hive -f createTestTables.sql

Create the promotion table, whose data is in the Alluxio /promotion directory, using hive -f createPromotionTable.sql

  7. Run the Presto queries twice and compare the times

Way 1 (suggested): start the Presto CLI with presto --server localhost:8080 --catalog hive and paste in the queries from prestoQuery.sql

This way Presto shows the detailed progress and time for each query.

Way 2: run presto --server localhost:8080 --catalog hive -f prestoQuery.sql directly. The query may appear to be stuck, because it takes several minutes and shows no progress.
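For way 2, a small wrapper can record how long each run takes so the two runs are easy to compare. The `time_run` helper below is hypothetical and only measures wall-clock time in whole seconds. It discards the command's standard output, so drop the redirect if you want to see the query results.

```shell
#!/usr/bin/env bash
# time_run LABEL CMD...
# Runs CMD, prints "LABEL: N s" (elapsed whole seconds), and returns
# CMD's exit status.
time_run() {
  local label="$1"; shift
  local start end status
  start=$(date +%s)
  "$@" > /dev/null
  status=$?
  end=$(date +%s)
  echo "${label}: $(( end - start )) s"
  return "$status"
}

# Usage with the command from way 2:
#   time_run "first run"  presto --server localhost:8080 --catalog hive -f prestoQuery.sql
#   time_run "second run" presto --server localhost:8080 --catalog hive -f prestoQuery.sql
```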
