GoFFish on Hama

Follow these steps to install and run GoFFish:

  1. Install Hadoop and Hama using this link.

  2. Build

    git clone https://github.com/dream-lab/goffish_v3
    cd goffish_v3/goffish-api
    mvn clean install						# Build api project
    cd ../sample
    mvn clean install						# Build sample project
    cd ../hama/v3.1/
    mvn clean install assembly:single		# Build goffish-hama
    
  3. Graphs

    3.1 Generate random graph using hama-examples:

    $HAMA_HOME/bin/hama jar hama-examples-x.x.x.jar gen fastgen -v 100 -e 10 -o randomgraph -t 2
    

    3.2 Supported graph formats

    3.2.1 Adjacency List (LongTextAdjacencyListReader.java)

    srcId sinkVertexId sinkVertexId

    job.setInputFormat(NonSplitTextInputFormat.class);
    job.setInputReaderClass(LongTextAdjacencyListReader.class);
    

    If no reader is specified in the job, this reader is used as the default.
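To make the adjacency-list layout concrete, here is a minimal Java sketch that splits one line of the form shown above into a source vertex and its sink vertices. It is not the code of LongTextAdjacencyListReader; the class and method names (AdjacencyLineSketch, parseLine) are hypothetical, and whitespace-separated long IDs are an assumption based on the format line.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical helper, not part of GoFFish: parses "srcId sinkVertexId sinkVertexId ...". */
final class AdjacencyLineSketch {
    final long srcId;
    final List<Long> sinkIds;

    private AdjacencyLineSketch(long srcId, List<Long> sinkIds) {
        this.srcId = srcId;
        this.sinkIds = sinkIds;
    }

    /** Assumes whitespace-separated long IDs; the first token is the source vertex. */
    static AdjacencyLineSketch parseLine(String line) {
        String[] tokens = line.trim().split("\\s+");
        if (tokens.length == 0 || tokens[0].isEmpty()) {
            throw new IllegalArgumentException("empty line");
        }
        long src = Long.parseLong(tokens[0]);
        List<Long> sinks = new ArrayList<>();
        for (int i = 1; i < tokens.length; i++) {
            sinks.add(Long.parseLong(tokens[i]));
        }
        return new AdjacencyLineSketch(src, sinks);
    }

    public static void main(String[] args) {
        AdjacencyLineSketch v = parseLine("1 2 3");
        System.out.println(v.srcId + " -> " + v.sinkIds); // prints "1 -> [2, 3]"
    }
}
```

A vertex with no outgoing edges is simply a line holding only its own ID, which this sketch returns with an empty sink list.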

    3.2.2 Partitioned Adjacency List (PartitionsLongTextAdjacencyReader.java)

    srcId partitionID sinkVertexId sinkVertexId

    job.setInputFormat(TextInputFormat.class);
    job.setInputReaderClass(PartitionsLongTextAdjacencyReader.class);
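As an illustration of the partitioned layout, the following Java sketch reads lines of the form "srcId partitionID sinkVertexId ..." and groups the source vertices by partition. It is only a sketch of the file format, not the logic of PartitionsLongTextAdjacencyReader; the names PartitionedLineSketch and verticesByPartition are hypothetical, and treating the partition ID as an int is an assumption.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

/** Hypothetical helper, not part of GoFFish: reads "srcId partitionID sinkVertexId ..." lines. */
final class PartitionedLineSketch {
    /** Groups source vertex IDs by the partition ID found in the second column. */
    static Map<Integer, List<Long>> verticesByPartition(List<String> lines) {
        Map<Integer, List<Long>> byPartition = new TreeMap<>();
        for (String line : lines) {
            String[] tokens = line.trim().split("\\s+");
            if (tokens.length < 2) {
                throw new IllegalArgumentException("expected srcId and partitionID: " + line);
            }
            long srcId = Long.parseLong(tokens[0]);
            int partitionId = Integer.parseInt(tokens[1]); // assumption: fits in an int
            // Tokens from index 2 onward are sink vertex IDs; they are ignored here.
            byPartition.computeIfAbsent(partitionId, k -> new ArrayList<>()).add(srcId);
        }
        return byPartition;
    }
}
```

For example, the lines "1 0 2 3", "2 1 1" and "3 0 1" place vertices 1 and 3 in partition 0 and vertex 2 in partition 1.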
    

    3.2.3 JSON Reader (LongTextJSONReader.java)

    [srcid,partitionid,srcvalue,[[sinkid1,edgeid1,edgevalue1],[sinkid2,edgeid2,edgevalue2]... ]]

    job.setInputFormat(TextInputFormat.class);
    job.setInputReaderClass(LongTextJSONReader.class);
    

    Note: Partition IDs start from 0. If a partition ID is specified, the number of input files to the job should be at least the number of partitions.
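The JSON layout is easier to read with concrete values. The Java sketch below formats one vertex as a record of the form [srcid,partitionid,srcvalue,[[sinkid,edgeid,edgevalue],...]]. The class and method names (JsonRecordSketch, toRecord) are hypothetical, and numeric vertex and edge values are an assumption; the value types LongTextJSONReader actually accepts are not stated here.

```java
import java.util.StringJoiner;

/** Hypothetical helper, not part of GoFFish: formats one vertex in the JSON layout shown above. */
final class JsonRecordSketch {
    /**
     * Each edge is given as {sinkId, edgeId, edgeValue}. Numeric IDs and values are an
     * assumption; the real reader may accept other value types.
     */
    static String toRecord(long srcId, int partitionId, long srcValue, long[][] edges) {
        StringJoiner edgeList = new StringJoiner(",", "[", "]");
        for (long[] e : edges) {
            edgeList.add("[" + e[0] + "," + e[1] + "," + e[2] + "]");
        }
        return "[" + srcId + "," + partitionId + "," + srcValue + "," + edgeList + "]";
    }

    public static void main(String[] args) {
        long[][] edges = { {2, 10, 5}, {3, 11, 7} };
        System.out.println(toRecord(1, 0, 0, edges)); // prints [1,0,0,[[2,10,5],[3,11,7]]]
    }
}
```

Here vertex 1 sits in partition 0 (the first valid partition ID) and has two outgoing edges, to vertices 2 and 3.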

  4. Running sample

    Place the jars generated in step 2 in the Hama classpath:

    cd goffish_v3/
    cp -t $HAMA_HOME/lib/ goffish-api/target/goffish-api-3.1.jar sample/target/goffish-sample-3.1.jar hama/v3.1/target/goffish-hama-3.1-jar-with-dependencies.jar
    

    General format for running a goffish-hama job:

    hama JobClass properties-file input-path output-path 
    

    Here input-path and output-path are the HDFS paths of the input graph and of the job output, respectively, and properties-file is a local file from which the job's properties are loaded. For example:

    hama in.dream_lab.goffish.job.DefaultJob ConnectedComponents.properties facebook_graph fbout
    

    Output and logs can be found at $HAMA_HOME/logs/tasklogs/job_id/

  5. Writing custom application

    Your application must extend AbstractSubgraphComputation (AbstractSubgraphComputation.java) and implement its compute function. The job configuration (the equivalent of the driver class in MapReduce) can be written either as a properties file or as a Java class. See DefaultJob.java and ConnectedComponents.properties for details.

    Add the following dependencies to your project's pom.xml:

    <dependency>
    	<groupId>in.dream_lab.goffish</groupId>
    	<artifactId>goffish-api</artifactId>
    	<version>3.1</version>
    </dependency>
    <dependency>
    	<groupId>in.dream_lab.goffish</groupId>
    	<artifactId>goffish-hama</artifactId>
    	<version>3.1</version>
    </dependency>
    

    Your application can be run in the same way as the sample job described above; you only have to put its jar file in the Hama classpath:

    export HAMA_CLASSPATH=/home/user/my_application.jar
    
