
Helge Holzmann's Projects

archivepig

An Apache Pig framework that facilitates access to Web archives and enables easy data extraction and derivation.

archivespark

An Apache Spark framework for easy data processing, extraction, and derivation for web archives and archival collections, developed at the Internet Archive.

archivespark-server

A server application that exposes ArchiveSpark through a Web service API, so that third-party applications can integrate temporal Web archive data through a flexible, easy-to-use interface.

aut

The Archives Unleashed Toolkit is an open-source platform for analyzing web archives.

engtagger

English Part-of-Speech Tagger Library; a Ruby port of Lingua::EN::Tagger

exspec

Don't write specs anymore, just save 'em while testing your code interactively. Specs will become a byproduct.

hadoopconcatgz

A splittable Hadoop InputFormat for concatenated GZIP files and *.(w)arc.gz

hadoopwebgraph

A Hadoop input format for using graphs in WebGraph's BV format with Hadoop and Spark.

micrawler

Create and cite micro Web archives with semantics as temporal representations of objects / entities / concepts on the Web

ruby-jobs

A simple way to run jobs, whether simple scripts or experiments for your research. You can log results and progress without hassle and experiment with different configurations.

tempas2archivespark

ArchiveSpark DataSpec to analyze the Internet Archive's Web archive through temporal search results returned by Tempas (v2)

web2warc

An easy-to-use and highly customizable crawler that enables you to create your own little Web archives (WARC/CDX)
