Emma Strubell's Projects
Course materials for 11-767
Data and software for building the ACL Anthology.
Place to collect updated documents needed for ACL publications.
Annotated Gigaword Java API and Command Line Tools
Tokenizer for Arabic using jflex-scala
Bias detection in the news. Back and front end for areyoufakenews.com
The Berkeley Entity Resolution System jointly solves the problems of named entity recognition, coreference resolution, and entity linking with a feature-rich discriminative model.
Python code for pre-processing CoNLL-2009 data
BERT for Coreference Resolution
A Bayesian network for heart disease data.
A chain CRF for OCR
Monte Carlo image denoising using grid-structured conditional random field models.
Restricted Boltzmann machine for the MNIST handwritten digits dataset.
Restricted Boltzmann machine for a classification problem.
Author attribution of Yelp reviews
Competence-based Curriculum Learning
My curriculum vitae in LaTeX
A one-page resume template in XeTeX with two asymmetric columns, aimed at undergraduate Computer Science students
Torch installation in a self-contained folder
Pre-trained ELMo Representations for Many Languages
Assignments for CS650: Applied Information Theory.
FACTORIE is a toolkit for deployable probabilistic modeling, implemented as a software library in Scala. It provides its users with a succinct language for creating relational factor graphs, estimating parameters and performing inference.
For benchmarking various Factorie pipeline components
Bash scripts for training/testing Factorie NLP components.
Serialization and Deserialization of Factorie Documents into a lightweight protocol buffer format.
Patches the fbcunn library to work on AWS EC2 instances
The fast scanner generator for Java, modified to emit Scala code.
KenLM: Faster and Smaller Language Model Queries
Linguistically-Informed Self-Attention implemented in TensorFlow