CodeOntology

RDF-ization of source code

CodeOntology is an extraction tool that parses Java source code to generate RDF triples. It supports both Maven and Gradle projects. For more details, see codeontology.org.

Set up

First, check the dependencies listed in the Dockerfile.
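
As a quick preflight before building, the sketch below checks that the command-line tools invoked by the commands in this README are on the PATH. The function name check_tools is illustrative, and the tool list (git, mvn, java) is an assumption drawn from the commands shown here; the Dockerfile remains the authoritative list of dependencies.

```shell
# Report which of the given commands are missing from PATH.
# Returns 0 if all are present, 1 otherwise.
check_tools() {
    missing=0
    for tool in "$@"; do
        if ! command -v "$tool" >/dev/null 2>&1; then
            echo "missing: $tool" >&2
            missing=1
        fi
    done
    return "$missing"
}

# Assumed tool list, taken from the commands used in this README.
check_tools git mvn java || echo "install the missing tools before building" >&2
```

The check only confirms that each command exists; it does not verify versions.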

To set up CodeOntology, clone the repository and build the tool:

$ git clone https://github.com/codeontology/parser
$ cd parser
$ mvn package -DskipTests

Now you can run the tool on any Java project:

$ ./codeontology -i <input_folder> -o <output_file>

For a complete list of all command line options, just type:

$ ./codeontology --help

Use cases

JDK

Let's use the tool to extract RDF triples from the OpenJDK 8 source code.

First, you need the OpenJDK 8 source code. It is available on GitHub:

$ git clone https://github.com/codeontology/openjdk8.git

Now, you have to install OpenJDK 8:

$ sudo dpkg -iR openjdk8/amd64

The above command should install OpenJDK 8. If you get dependency errors, just type:

$ sudo apt-get -f install

Set the newly installed version of Java as the default version:

$ sudo update-java-alternatives -s java-1.8.0-openjdk-amd64

If you get the following error, just ignore it:

update-java-alternatives: plugin alternative does not exist: /usr/lib/jvm/java-8-openjdk-amd64/jre/lib/amd64/IcedTeaPlugin.so

To verify that everything has worked, check that your Java version is correct:

$ java -version
openjdk version "1.8.0_121"
OpenJDK Runtime Environment (build 1.8.0_121-8u121-b13-4-b13)
OpenJDK 64-Bit Server VM (build 25.121-b13, mixed mode)
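
The same check can be scripted. The sketch below succeeds only when the first line of the version banner reports 1.8.x. The function name is_java8 and the JAVA override variable are illustrative assumptions, not part of CodeOntology.

```shell
# Succeeds if the default java reports a 1.8.x version.
# JAVA may be set to another launcher; it defaults to the java found on PATH.
is_java8() {
    # java -version writes to stderr, so redirect it into the pipe
    "${JAVA:-java}" -version 2>&1 | head -n 1 | grep -q 'version "1\.8\.'
}

if is_java8; then
    echo "Java 8 is the default"
else
    echo "default java is not 1.8.x (or java is not installed)"
fi
```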

Finally, you are ready to serialize the OpenJDK source code into RDF triples. Just type:

$ ./codeontology -i openjdk8/ -o openjdk8.nt

This command runs the tool on the openjdk8 directory and saves the extracted RDF triples to the file openjdk8.nt. Be aware that this may take about two and a half hours!
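
Once the run finishes, a quick sanity check on the output can be useful. The sketch below counts the triples in an N-Triples file and lists its predicates by frequency. It assumes the file holds one triple per line, as the N-Triples format prescribes; the function name nt_summary is illustrative.

```shell
# Print the number of triples in an N-Triples file, then the predicates by frequency.
# Assumes one triple per line, with the predicate as the second whitespace-separated field.
nt_summary() {
    # Skip blank lines and comment lines, which N-Triples allows
    awk 'NF == 0 || $1 ~ /^#/ { next } { n++ } END { printf "triples: %d\n", n }' "$1"
    awk 'NF == 0 || $1 ~ /^#/ { next } { print $2 }' "$1" | sort | uniq -c | sort -rn
}

# Example call on the output file named in the command above; shows the ten most common predicates.
if [ -f openjdk8.nt ]; then
    nt_summary openjdk8.nt | head -n 11
fi
```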

To annotate source code comments, see CommentLinker.

Maven Repository

Suppose you want to use the tool to extract RDF triples from a generic repository. Here, the Spoon Maven repository serves as an example.

First, you have to clone the repository:

$ git clone https://github.com/INRIA/spoon

The repository contains tests that cause problems when building the abstract syntax tree. The -f switch is added to solve this issue and get rid of the tests. The --dependencies switch is used here to parse all of the repository's dependencies, and the -v switch tells CodeOntology to print every file it processes.

$ ./codeontology -i spoon -o spoon.nt -vf --dependencies

Another interesting repository that can be used as an example is Apache Commons Math (it takes less than 2 minutes to build the triples).
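
To process several checked-out repositories in one go, the invocation can be wrapped in a loop. This is a minimal sketch: the function name extract_all, the CODEONTOLOGY override variable, and the directory names in the example call are illustrative assumptions. Only the switches themselves come from the Spoon example.

```shell
# Run CodeOntology on each given project directory, writing <name>.nt per project.
# CODEONTOLOGY may point at the launcher; it defaults to ./codeontology as used in this README.
extract_all() {
    tool="${CODEONTOLOGY:-./codeontology}"
    for project in "$@"; do
        name=$(basename "$project")
        # Same switches as the Spoon example: verbose, -f, and dependency parsing
        if ! "$tool" -i "$project" -o "$name.nt" -vf --dependencies; then
            echo "failed: $project" >&2
        fi
    done
}

# Example call with hypothetical checkout directories:
# extract_all spoon commons-math
```

A failed run is reported on standard error and the loop moves on to the next project, so one problematic repository does not stop the batch.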

Jar files

CodeOntology can also process jar files:

$ ./codeontology --jar <path_to_jar>

In the following example, a jar file is downloaded to show how it works.

$ wget -O weka.zip "http://downloads.sourceforge.net/project/weka/weka-3-8/3.8.0/weka-3-8-0.zip?r=https%3A%2F%2Fsourceforge.net%2Fprojects%2Fweka%2F&ts=1463402758&use_mirror=kent"
$ unzip -j weka.zip "weka-3-8-0/weka.jar" -d .
$ ./codeontology --jar weka.jar -v
