Sqoop

For complete documentation, see the Apache Sqoop Java Client API.

What is it

This project provides two functions for Sqoop users:

  • scanner
  • worker

Scanner transforms a raw configuration file into a ready-to-use one by scanning a required source file. Currently, only CSV source files are supported, with each row in the form TABLE_NAME,PRIME_KEY_NAME. A sample CSV file is provided with the project.
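The transformation can be pictured as filling in the placeholders that appear in the demo raw.conf below. The following is a minimal sketch, not the project's actual implementation: the class and method names are hypothetical, and it assumes that TABLE_NAME fills {fromJobConfig.tableName} and PRIME_KEY_NAME fills {fromJobConfig.partitionColumn}. The [yyyy-MM-dd] date pattern is left untouched because this README does not say how it is handled.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical sketch of the Scanner step; names are illustrative only.
class ScannerSketch {

    // Fills the two placeholders for a single CSV row.
    // Assumption: column 1 is the table name, column 2 the partition column.
    static String expandTemplate(String template, String csvRow) {
        String[] fields = csvRow.split(",", -1);
        if (fields.length != 2) {
            throw new IllegalArgumentException(
                    "expected TABLE_NAME,PRIME_KEY_NAME but got: " + csvRow);
        }
        return template
                .replace("{fromJobConfig.tableName}", fields[0].trim())
                .replace("{fromJobConfig.partitionColumn}", fields[1].trim());
    }

    // Produces one expanded job definition per non-blank CSV row.
    static List<String> expandAll(String template, List<String> csvRows) {
        List<String> jobs = new ArrayList<>();
        for (String row : csvRows) {
            if (!row.trim().isEmpty()) {
                jobs.add(expandTemplate(template, row));
            }
        }
        return jobs;
    }

    public static void main(String[] args) {
        String template = String.join("\n",
                "jobConfig.name=oracle-{fromJobConfig.tableName}",
                "fromJobConfig.tableName={fromJobConfig.tableName}",
                "fromJobConfig.partitionColumn={fromJobConfig.partitionColumn}");
        // "orders,order_id" is a made-up row used only for illustration.
        for (String job : expandAll(template, Arrays.asList("orders,order_id"))) {
            System.out.println(job);
        }
    }
}
```

Under these assumptions, each CSV row yields one job definition, so a source file with N tables would produce N jobs that share the same links.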

Worker parses a ready-to-use configuration file into link configs and job configs, and then communicates with the Sqoop server to execute the appropriate commands, such as creating or updating links and jobs.
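The parsing half of this step might look like the sketch below. It is an illustration under stated assumptions rather than the project's code: the class and method names are hypothetical, definitions are assumed to be separated by blank lines, and a group is treated as a link or a job depending on whether it contains linkConfig. or jobConfig. keys. The calls to the Sqoop server are deliberately left out.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical sketch of the Worker's parsing step; names are illustrative only.
class WorkerSketch {

    // Splits configuration text into ordered key/value groups.
    // Assumption: blank lines separate definitions, '#' starts a comment.
    static List<Map<String, String>> parseGroups(String text) {
        List<Map<String, String>> groups = new ArrayList<>();
        Map<String, String> current = new LinkedHashMap<>();
        for (String raw : text.split("\\R")) {
            String line = raw.trim();
            if (line.isEmpty()) {
                if (!current.isEmpty()) {
                    groups.add(current);
                    current = new LinkedHashMap<>();
                }
                continue;
            }
            if (line.startsWith("#")) {
                continue;
            }
            int eq = line.indexOf('=');
            if (eq <= 0) {
                throw new IllegalArgumentException("not a key=value line: " + line);
            }
            current.put(line.substring(0, eq).trim(), line.substring(eq + 1).trim());
        }
        if (!current.isEmpty()) {
            groups.add(current);
        }
        return groups;
    }

    // True when any key in the group starts with the given prefix.
    static boolean hasPrefix(Map<String, String> group, String prefix) {
        for (String key : group.keySet()) {
            if (key.startsWith(prefix)) {
                return true;
            }
        }
        return false;
    }

    static boolean isLink(Map<String, String> group) {
        return hasPrefix(group, "linkConfig.");
    }

    static boolean isJob(Map<String, String> group) {
        return hasPrefix(group, "jobConfig.");
    }

    public static void main(String[] args) {
        String text = "# link #1\nlinkConfig.cid=2\nlinkConfig.name=Vampire\n\n"
                + "# job #1\njobConfig.fromLinkName=Vampire\njobConfig.name=demo\n";
        for (Map<String, String> group : parseGroups(text)) {
            // A real worker would create or update the link or job on the server here.
            System.out.println((isLink(group) ? "link " : "job ") + group);
        }
    }
}
```

The sketch reads the file line by line instead of using java.util.Properties because keys such as linkConfig.cid repeat across definitions, and a Properties object keeps only the last value for a repeated key.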

Usage

The built executable jar file includes the following options:

Usage: <main class> [options] 
  Options:
  * --action, -a
       the action to take:
		worker - use the specified configuration file to
       communicate with sqoop server.
		scanner - use the specified configuration file to
       generate ready-to-use configuration file for worker action.
  * --configuration, -c
       the path of the configuration file, used by worker action or scanner
       action
    --help, -h
       display help messages
       Default: false
    --input, -i
       the path of the input file, only when scanner action is taken, this
       parameter is *required*
    --output, -o
       the path of the output file, omit this parameter will output will be
       directed to console

Use the --help or -h option to show the message above:

# java -jar BUILT_JAR_FILE_PATH --help
java -jar sqoop-worker-1.0-SNAPSHOT.jar --help

Using the full functionality of Scanner:

# java -jar BUILT_JAR_FILE_PATH --action scanner --configuration RAW_CONF_FILE_PATH 
#                               --input INPUT_FILE_PATH --output OUTPUT_FILE_PATH 
java -jar sqoop-worker-1.0-SNAPSHOT.jar -a scanner -c raw.conf -i test.csv -o ready.conf

Using the full functionality of Worker:

# java -jar BUILT_JAR_FILE_PATH --action worker --configuration CONF_FILE_PATH 
java -jar sqoop-worker-1.0-SNAPSHOT.jar -a worker -c ready.conf
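The option rules in the usage message can be summarized in a few lines of code. This is only a sketch with hypothetical names, and it assumes that the asterisk in the usage message marks a required option: --action and --configuration are always needed, and --input is needed only for the scanner action.

```java
// Hypothetical sketch of the option rules; not the project's actual argument parser.
class OptionRules {

    // Returns an error message, or null when the combination of options is valid.
    // A null argument stands for an option that was not given on the command line.
    static String validate(String action, String configuration, String input) {
        if (action == null) {
            return "--action is required";
        }
        if (!action.equals("worker") && !action.equals("scanner")) {
            return "unknown action: " + action;
        }
        if (configuration == null) {
            return "--configuration is required";
        }
        if (action.equals("scanner") && input == null) {
            return "--input is required for the scanner action";
        }
        return null;
    }

    public static void main(String[] args) {
        // Mirrors the two example invocations: scanner with -i, worker without it.
        System.out.println(validate("scanner", "raw.conf", "test.csv"));
        System.out.println(validate("worker", "ready.conf", null));
        System.out.println(validate("scanner", "raw.conf", null));
    }
}
```

The --output option is not checked because the usage message describes it as optional, with output going to the console when it is omitted.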

Demo raw.conf File

# the packing process will proceed sequentially.
# Please arrange your definitions of links and jobs into groups

# link #1
linkConfig.cid=2
linkConfig.name=Vampire
linkConfig.creationUser=LiHe
linkConfig.connectionString=jdbc:mysql://localhost/my
linkConfig.jdbcDriver=com.mysql.jdbc.Driver
linkConfig.username=root
linkConfig.password=root

# link #2
linkConfig.cid=1
linkConfig.name=JOKE
linkConfig.uri=hdfs://nameservice1:8020/

# job #1
jobConfig.fromLinkName=Vampire
jobConfig.toLinkName=JOKE
jobConfig.name=oracle-{fromJobConfig.tableName}
jobConfig.creationUser=LiHe
fromJobConfig.schemaName=sqoop
fromJobConfig.tableName={fromJobConfig.tableName}
fromJobConfig.partitionColumn={fromJobConfig.partitionColumn}
fromJobConfig.sql=select * from {fromJobConfig.tableName}
toJobConfig.outputDirectory=/usr/tmp/{fromJobConfig.tableName}/[yyyy-MM-dd]
throttlingConfig.numExtractors=3
throttlingConfig.numLoaders=3
