nlvr_tau_nlp_final_proj

This repository contains the files, directories, and data used for our final project for the NLP and Advanced Machine Learning courses at Tel Aviv University, spring semester 2017.

Repository structure:

This is a short description of the files and directories; not all of them are listed here. A more detailed description can be found in the documentation and in the project assignment paper.

seq2seqModel:

Directory containing all the files and data for the implementation of the model.

.py files:

  • seq2seq.py : the only runnable file in the directory. It contains the TensorFlow implementation of the model architecture and the functions for training and evaluating the model.
  • beam_search.py : our implementation of epsilon-greedy randomized beam search.
  • beam_boosting.py : functions used for boosting the baseline performance of the beam search.
  • partial_program.py : contains the class PartialProgram, which wraps the programs in the beam.
  • hyper_params : constants and boolean properties of the model. These can be changed between runs.
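
To illustrate the idea behind epsilon-greedy randomized beam search, here is a minimal sketch of one common formulation: each beam slot is filled with a uniformly random remaining candidate with probability epsilon, and with the highest-scoring remaining candidate otherwise. The function name, its parameters, and the (item, score) candidate format are assumptions made for this example. They are not taken from beam_search.py, and the actual implementation may differ.

```python
import random


def epsilon_greedy_beam_step(candidates, beam_size, epsilon, rng=None):
    """Select up to beam_size entries from a list of (item, score) pairs.

    Hypothetical helper for illustration only. For each beam slot, with
    probability epsilon a uniformly random remaining candidate is taken;
    otherwise the highest-scoring remaining candidate is taken.
    """
    rng = rng or random.Random()
    # Sort once so the best remaining candidate is always at index 0.
    remaining = sorted(candidates, key=lambda pair: pair[1], reverse=True)
    beam = []
    while remaining and len(beam) < beam_size:
        if rng.random() < epsilon:
            index = rng.randrange(len(remaining))  # explore
        else:
            index = 0  # exploit: best remaining candidate
        beam.append(remaining.pop(index))
    return beam


# Toy candidates with made-up log-probability scores.
candidates = [("a", -0.1), ("b", -0.7), ("c", -1.2), ("d", -2.3)]
greedy_beam = epsilon_greedy_beam_step(candidates, beam_size=2, epsilon=0.0)
random_beam = epsilon_greedy_beam_step(
    candidates, beam_size=2, epsilon=1.0, rng=random.Random(0)
)
```

With epsilon set to 0 this reduces to ordinary top-k beam pruning. Larger values trade some score for diversity in the beam, which can help when training only from weak supervision.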

data directories:

  • learnedWeightsPreTrain : weights learned during pre-training on generated sentences and annotations of certain common patterns.
  • learnedWeightsWeaklySupervised : weights learned using the weakly supervised model (learning from denotations). The weights currently in the directory are those achieving the best results so far on the dev and test data sets.
  • running_logs : directory for saving logs with the results of training or testing the model. It currently contains the per-sentence results of running our best model on the dev and test sets.
  • word2vec : word embeddings used by the model and the code used for creating them.

data:

Contains most of the data needed for the project, including the original data set and other data used or generated by us.

  • nlvr-data : the original CNLVR data set
  • logical forms : data for using the logical forms in the model
  • parsed sentences : contains patterns of sentences with their annotations, as well as the pre-training dataset that was generated from them.
  • sentence processing : data needed for (or acquired through) pre-processing of the sentences.

.py files in root dir:

  • data_manager.py : loads the needed data, processes it, and returns it as an object that is convenient to work with.
  • sentence processing.py : used by the data manager to preprocess the sentences in the data, in order to reduce noise (e.g. from spelling errors).
  • logical forms.py: the code for the functions that are run when executing a logical form on a structured representation of an image.
  • structured_rep.py: classes representing the structured representation of an image in the data set.
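
As a rough illustration of what executing a logical form on a structured representation means, the sketch below models an image as a list of boxes, each holding items with a color and a shape, and evaluates one statement against it. The Item class, the helper functions, and the field names are hypothetical and chosen only for this example. They do not reflect the actual classes in structured_rep.py or the functions in logical forms.py.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass(frozen=True)
class Item:
    """A single object in a box (hypothetical, simplified attributes)."""
    color: str
    shape: str


Box = List[Item]
Image = List[Box]


def filter_items(box: Box, predicate: Callable[[Item], bool]) -> Box:
    """Return the items in a box that satisfy the predicate."""
    return [item for item in box if predicate(item)]


def exists_box(image: Image, condition: Callable[[Box], bool]) -> bool:
    """Return True if at least one box in the image satisfies the condition."""
    return any(condition(box) for box in image)


# A toy image with three boxes.
image = [
    [Item("yellow", "triangle"), Item("black", "circle")],
    [Item("blue", "square")],
    [Item("yellow", "square"), Item("yellow", "circle")],
]

# Logical form for: "There is a box with exactly two yellow items."
result = exists_box(
    image,
    lambda box: len(filter_items(box, lambda item: item.color == "yellow")) == 2,
)
```

Here the logical form is a composition of small functions, and its execution yields a truth value (True for this toy image) that can be compared with the label of the sentence-image pair.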
