
Kristian Rickert's Projects

bash_crawler

Simple bash script that crawls a website from a list of files and saves the output. Useful for creating unit tests. Not much more.

collector-http

Norconex Web Crawler (or spider) is a flexible web crawler for collecting, parsing, and manipulating data from the Internet (or Intranet) to various data repositories such as search engines.

crawler-manager

Web crawler that runs on Selenium and extracts the text from plain HTML.

fusion-client-tools

Client tools for working with Fusion, for example to let the hbase-indexer send docs to a Fusion pipeline.

hugegherkin

Implementation of a Gherkin language for testing against Selenium.

ipaddresszipcodestatecountrylucenejavasearch

Automatically downloads a database of known IP addresses and other location data, then builds a Lucene index for spatial searches on IP addresses within a specific range (or by other criteria). Stores lat/lon, zip, city, country, and IP addresses for fast Lucene search. The finished index is 1 GB.

java-wget

Dead simple Java wget. Just one static class and a status enum. No dependencies.

javafast1x1transparentpixel

Sample code to return a 1x1 transparent GIF using Spring MVC as well as a simple servlet. This is useful if you want to implement tracking on your website. It's an in-memory servlet, so it's fast as hell and won't open a file handle like most people so dumbly do.
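To illustrate the in-memory idea, here is a minimal sketch using only the JDK. It is not the project's code: the class name `TransparentPixel` and its methods are hypothetical, and the Spring MVC or servlet wiring is left out. The bytes are a commonly used 42-byte transparent 1x1 GIF89a, kept in a static array so serving it never touches the filesystem.

```java
import java.io.IOException;
import java.io.OutputStream;

/** Hypothetical helper (not from the project): a 1x1 transparent GIF held in memory. */
final class TransparentPixel {
    // GIF89a, 1x1 canvas, 2-color palette, palette index 0 marked transparent.
    private static final byte[] PIXEL = {
        0x47, 0x49, 0x46, 0x38, 0x39, 0x61,             // "GIF89a" signature
        0x01, 0x00, 0x01, 0x00, (byte) 0x80, 0x00, 0x00, // 1x1 screen, global color table
        0x00, 0x00, 0x00, (byte) 0xff, (byte) 0xff, (byte) 0xff, // palette: black, white
        0x21, (byte) 0xf9, 0x04, 0x01, 0x00, 0x00, 0x00, 0x00,   // graphic control: transparency on
        0x2c, 0x00, 0x00, 0x00, 0x00, 0x01, 0x00, 0x01, 0x00, 0x00, // image descriptor, 1x1
        0x02, 0x01, 0x44, 0x00,                          // LZW-compressed pixel data
        0x3b                                             // trailer
    };

    static final String CONTENT_TYPE = "image/gif";

    private TransparentPixel() {}

    /** Returns a defensive copy so callers cannot modify the shared array. */
    static byte[] bytes() {
        return PIXEL.clone();
    }

    /** Writes the pixel to any stream, for example a servlet response's output stream. */
    static void writeTo(OutputStream out) throws IOException {
        out.write(PIXEL);
        out.flush();
    }
}
```

In a servlet or Spring MVC controller, one would presumably set the response content type to `TransparentPixel.CONTENT_TYPE`, set the content length to 42, and call `writeTo` with the response's output stream.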

kafka-committer

A committer that lets a Norconex crawl publish crawled data to a Kafka topic.

markdown-parser

Takes in Markdown documents and outputs well-structured text. Meant as a precursor to chunking in a text processing pipeline.

nlp-ner

NLP Named Entity Recognition Text Processor Microservice

pipeline-processor

Takes in a PipeDocument for a PipeService and runs it through the configured stages.

raspiwrite

A Python script that prepares and installs a Raspberry Pi compatible distro to an SD card.

search-api

Search API for the vector-based search engine ecosystem

solr

Apache Solr open-source search software
