
feiii2017's Introduction

What is this?

This repository contains our contribution to the FEIII Challenge 2017, held at the DSMM workshop at SIGMOD'17 in Chicago, IL, USA. Further information is available on the official webpages.

How do I replicate the results?

A near-minimal working example is provided in the Jupyter notebook FEIII_synfeatures. Its init, train and result sections show how the codebase can be used.

Create scorings

Start a Jupyter notebook server in the project root, open FEIII_final_scoring.ipynb and execute the cells in order. Within the pipeline() and score_func() functions, you can choose which classifier and scoring function to use by setting the variable directly after the function header.
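As a rough illustration of the "variable directly after the function header" pattern, the sketch below shows a function whose behaviour is switched by a single local variable. It is not the notebook's score_func(): the name example_score_func, the variable scoring and the two rules "max" and "margin" are assumptions made purely for this example.

```python
def example_score_func(probabilities):
    # Hypothetical switch: edit this line to select the scoring rule.
    scoring = "margin"  # the other option in this sketch is "max"

    # Sort class probabilities from most to least likely.
    ranked = sorted(probabilities, reverse=True)

    if scoring == "max":
        # Score is the probability of the most likely class.
        return ranked[0]
    if scoring == "margin":
        # Score is the gap between the two most likely classes.
        return ranked[0] - ranked[1]
    raise ValueError(f"unknown scoring rule: {scoring}")
```

Keeping the switch on the first line of the function body makes it easy to find and change before re-running a cell, at the cost of editing code instead of passing a parameter.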

If the train/eval split does not work out, reshuffle the sampling by executing:

data.shuffle_train_eval(n_docs_eval=3, max_tries=5)
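The implementation of shuffle_train_eval is not shown here. As a minimal sketch of what reshuffling a document-level split with a retry budget could look like, the example below assumes that each document carries a set of labels and that a split "works out" when every label appears on both the train and the eval side. The function name reshuffle_split, the doc_labels mapping and that acceptance criterion are all hypothetical; only the parameter names n_docs_eval and max_tries are taken from the call above.

```python
import random


def reshuffle_split(doc_labels, n_docs_eval=3, max_tries=5, seed=None):
    """Hypothetical sketch: draw n_docs_eval documents for evaluation,
    retrying up to max_tries times until both sides cover every label."""
    rng = random.Random(seed)
    docs = sorted(doc_labels)
    all_labels = set().union(*doc_labels.values())

    for _ in range(max_tries):
        # Sample whole documents so no document is split across both sides.
        eval_docs = set(rng.sample(docs, n_docs_eval))
        train_docs = set(docs) - eval_docs

        eval_labels = set().union(*(doc_labels[d] for d in eval_docs))
        train_labels = set().union(*(doc_labels[d] for d in train_docs))

        # Assumed criterion: each label must occur in train and in eval.
        if eval_labels == all_labels and train_labels == all_labels:
            return sorted(train_docs), sorted(eval_docs)

    raise RuntimeError(f"no acceptable split found in {max_tries} tries")
```

Raising after max_tries avoids an endless loop when no acceptable split exists, for example when a label occurs in only one document.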

Remember to adjust the classifier variable and scoring function accordingly before executing the cell that saves the frame(s).

Train and save embeddings

$ cd <project_root>/
$ python
>>> from code import feiii_transformers as ft
>>> e = ft._EmbeddingHolder('<path_to_full_reports>')
>>> e.train(num_files=30, num_epochs=25)
reading: ALLY_2016.html
reading: ALLY_2014.html
reading: CAPITAL-ONE_2013.html
...
extracting sentences...
words: 1630611
sentences: 56085
Training-Epoch: 0 | lr: 0.025
...
>>> e.save('<path_to_embedding>')
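
The log above reports a sentence-extraction step followed by word and sentence counts. The repository's own extraction code is not reproduced here; the sketch below only illustrates, under simplifying assumptions, how HTML reports could be turned into tokenised sentences using the Python standard library. The names TextExtractor and extract_sentences, the naive punctuation-based sentence split and the lowercase alphanumeric tokenisation are all assumptions for this example.

```python
import re
from html.parser import HTMLParser


class TextExtractor(HTMLParser):
    """Collects the text nodes of an HTML document (hypothetical helper)."""

    def __init__(self):
        super().__init__()
        self.chunks = []

    def handle_data(self, data):
        # Note: this also collects the contents of <script> and <style> tags.
        self.chunks.append(data)


def extract_sentences(html):
    """Return a list of sentences, each a list of lowercase tokens."""
    parser = TextExtractor()
    parser.feed(html)
    parser.close()

    # Join text nodes and collapse runs of whitespace.
    text = re.sub(r"\s+", " ", " ".join(parser.chunks)).strip()

    # Naive split: a sentence ends at '.', '!' or '?' followed by whitespace.
    raw_sentences = re.split(r"(?<=[.!?])\s+", text)

    tokenised = (re.findall(r"[a-z0-9]+", s.lower()) for s in raw_sentences)
    return [tokens for tokens in tokenised if tokens]
```

Summing len(tokens) over the returned list gives a word count and len() of the list gives a sentence count, comparable in kind to the figures printed in the log.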

Contributors

timrepke, milost
