
BRICA: Bias Research In Communicating Agents

This repo contains the code used for the experiments reported in our ACL 2019 paper, "Word-order biases in deep-agent emergent communication" (arXiv).

Installation

  • Cloning the repo: git clone --recursive git@github.com:facebookresearch/brica.git && cd brica
  • Creating & activating a conda environment: conda create -n brica_env python=3.6 && conda activate brica_env
  • Installing the requirements: pip install -r requirements.txt
  • Now the tests should pass: pytest

How the repo is organized

  • data/Iconic_LessCounting/ contains the training/validation/test data splits for the artificial languages that are used in the paper. Each artificial language reflects or violates various natural language trends, such as the tendency to avoid redundancy (e.g. iconicity_teacher vs iconicity_markers_teacher);
  • pytorch-seq2seq/ is a git submodule containing a third-party seq2seq framework built on top of PyTorch that is used in the experiments;
  • t2s/ and train.py contain the actual implementation of the experiment logic.

Training

Training of the agents is split into two distinct scenarios:

  • pre-training: training a single agent in isolation. This procedure lets us study how quickly an agent learns a particular language, on the assumption that the faster a language is learned, the easier its properties are for the agent;
  • iterated training: training agents in an iterated setup, where each agent is trained together with a parent agent. After convergence, the trained agent is frozen and used as the parent for the next child. This procedure lets us study the diachronic persistence of different language properties.

An example of a command that would pre-train an agent: python train.py --pretrain_agent=1 --n_epochs=10 --no_dev_eval=0 --no_test_eval=0

This agent can either be loaded for iterated learning, or one can start iterated learning by training the first agent from scratch:

python train.py --pretrain_agent=1 --no_dev_eval=0 --no_test_eval=0 --n_epochs=10 --hidden_size=16 --generations=1, where generations specifies the number of iterations in iterated learning.
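A plausible way to chain the two scenarios is to persist the pre-trained agent and reuse it as the first teacher. This is only a sketch: the checkpoint path is hypothetical, and it assumes --save_model_path and --init_A1 (described under "Other" below) compose this way:

python train.py --pretrain_agent=1 --n_epochs=10 --save_model_path=./agent1.pt
python train.py --init_A1=./agent1.pt --n_epochs=10 --generations=3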

Other useful parameters are listed below; a short illustrative command follows each group.

Model specification:

  • hidden_size: size of the hidden layer (default is 32),
  • max_len: maximal length of an utterance (default is 30),
  • n_layers: number of layers in a recurrent unit (default is 1),
  • use_attention: whether the decoder uses attention (default is 1),
  • tied: whether the input embedding layer of the encoder and the output embedding layer of the decoder are tied (default is 1),
  • polyglot: train an agent that speaks the free-order language (a polyglot).
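For instance, a pre-training run with a wider, two-layer model could be launched as follows (a sketch; the flags are the ones documented above, but the specific values are illustrative):

python train.py --pretrain_agent=1 --n_epochs=10 --hidden_size=64 --n_layers=2 --max_len=30 --use_attention=1 --tied=1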

Training parameters:

  • n_epochs: number of training epochs (default is 10),
  • batch_size: batch size (default is 32),
  • data_path_prefix: the path for the language data (default is ./data/Iconic_LessCounting/iconicity_markers),
  • teacher_forcing_ratio: sets the teacher forcing ratio (default is 1.0),
  • generations: the number of iterated learning generations.
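As an example, the following run changes the batch size and teacher forcing ratio while keeping the default language data (a sketch; the values are illustrative):

python train.py --pretrain_agent=1 --n_epochs=20 --batch_size=64 --teacher_forcing_ratio=0.5 --data_path_prefix=./data/Iconic_LessCounting/iconicity_markers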

Other:

  • random_seed: sets the random seed,
  • explosion_eval/explosion_train: specify the number of samples from a teacher that are used at test/training time, respectively (defaults are 120 and 20),
  • init_A1: initializes the teacher from a saved checkpoint,
  • init_A1_from_A2: initializes the teacher from a student agent stored in a checkpoint,
  • save_model_path: sets where the trained model is persisted,
  • no_test_eval/no_dev_eval: if set to 1, disable evaluation on the test/validation data during training.
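For a quick, reproducible run that skips evaluation during training (a sketch combining the flags above; the seed value is illustrative):

python train.py --pretrain_agent=1 --n_epochs=10 --random_seed=7 --no_dev_eval=1 --no_test_eval=1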

Citation

If you find this code or the ideas in the paper useful in your research, please consider citing the paper:

@inproceedings{Chaabouni2019,
    title={Word-order biases in deep-agent emergent communication},
    author={Chaabouni, Rahma and Kharitonov, Eugene and Lazaric, Alessandro and Dupoux, Emmanuel and Baroni, Marco},
    booktitle={ACL},
    year={2019}
}

License

BRICA is CC-BY-NC licensed, as found in the LICENSE file.
