DOM-Q-NET: Grounded RL on Structured Language

Sheng Jia, Jamie Kiros, Jimmy Ba. "DOM-Q-NET: Grounded RL on Structured Language." International Conference on Learning Representations (2019). [arxiv] [openreview]

Architecture

Possible extension: incorporate improvements such as those introduced in MuZero.

Demo

Trained multitask agent: https://www.youtube.com/watch?v=eGzTDIvX4IY
Facebook login: https://www.youtube.com/watch?v=IQytRUKmWhs&t=2s

Requirements

Install Selenium and the ChromeDriver that Selenium uses to drive Chrome.
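
A quick way to confirm both pieces are in place before launching anything is a small check like the one below. It is only a sketch: the function name `check_requirements` is hypothetical, and it assumes the driver executable is called `chromedriver` and is expected on your PATH. It uses only the standard library, so it runs even when Selenium itself is missing.

```python
import importlib.util
import shutil


def check_requirements(driver_name="chromedriver"):
    """Report whether the selenium package and a ChromeDriver binary can be found."""
    return {
        # True if `import selenium` would succeed in this interpreter.
        "selenium": importlib.util.find_spec("selenium") is not None,
        # Absolute path of the driver executable on PATH, or None if it is missing.
        "chromedriver": shutil.which(driver_name),
    }


if __name__ == "__main__":
    for name, found in check_requirements().items():
        print(f"{name}: {'ok' if found else 'MISSING'}")
```

The ChromeDriver version generally has to match the installed Chrome version, which this check does not verify.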

Installation (follow the instructions given in the notes and install the right Selenium driver)

  1. Clone this repo.
  2. Download the MiniWoB++ environment from the original repo, https://github.com/stanfordnlp/miniwob-plusplus,
    and copy the miniwob-plusplus/html folder to miniwob/html in this repo.
  3. The html folder can in fact be stored anywhere, as long as you do one of the following:
  • Set the environment variable "WOB_PATH" to
    file://"your-path-to-miniwob-plusplus"/html/miniwob
    E.g. "your-path-to-miniwob-plusplus" is "/h/sheng/DOM-Q-NET/miniwob"
  • Directly modify base_url on line 33 of instance.py to
    file://"your-path-to-miniwob-plusplus"/html/miniwob/
    In my case, base_url='file:///h/sheng/DOM-Q-NET/miniwob/html/miniwob/'
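
The path-to-URL rule in step 3 can be sketched in Python as follows. This is an illustration, not the code in instance.py: the helper names `build_base_url` and `resolve_base_url` are hypothetical, and the assumption is that WOB_PATH, when set, already holds the full file:// URL and takes precedence.

```python
import os
from pathlib import PurePosixPath
from urllib.parse import quote


def build_base_url(miniwob_root):
    """Turn an absolute POSIX path into the file:// URL of its html/miniwob folder.

    `miniwob_root` is the directory that contains the copied `html` folder.
    """
    html_dir = PurePosixPath(miniwob_root) / "html" / "miniwob"
    if not html_dir.is_absolute():
        raise ValueError("miniwob_root must be an absolute path")
    # Percent-encode unusual characters; '/' is left untouched by default.
    return "file://" + quote(str(html_dir)) + "/"


def resolve_base_url(default_root):
    """Prefer WOB_PATH when it is set; otherwise derive the URL from default_root."""
    wob_path = os.environ.get("WOB_PATH")
    if wob_path:
        return wob_path
    return build_base_url(default_root)
```

For example, `build_base_url("/h/sheng/DOM-Q-NET/miniwob")` gives `file:///h/sheng/DOM-Q-NET/miniwob/html/miniwob/`, the same value as the base_url quoted above.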

Run experiment

Experiment launch files are stored under runs. For example,

cd runs/hard2medium9tasks/
sh run1.sh

will launch an 11-task multitask experiment (social-media, search-engine, login-user, enter-password, click-checkboxes, click-option, enter-dynamic-text, enter-text, email-inbox-delete, click-tab-2, navigation-tree).
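
Before a long run, it can help to confirm that every task page is present in the copied html folder. The sketch below assumes MiniWoB++ stores each task as `<task-name>.html` directly inside html/miniwob. That layout is an assumption here, and `missing_task_pages` is a hypothetical helper, not part of this repo.

```python
from pathlib import Path

# Task names copied from the multitask example above.
TASKS = [
    "social-media", "search-engine", "login-user", "enter-password",
    "click-checkboxes", "click-option", "enter-dynamic-text", "enter-text",
    "email-inbox-delete", "click-tab-2", "navigation-tree",
]


def missing_task_pages(html_miniwob_dir, tasks=TASKS):
    """Return the tasks whose <task>.html page is absent from the directory."""
    root = Path(html_miniwob_dir)
    return [task for task in tasks if not (root / f"{task}.html").is_file()]
```

An empty list from `missing_task_pages("miniwob/html/miniwob")` would mean all task pages were found under the assumed layout.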

Multitask Assumptions

State & Action restrictions

Item                              Maximum number of items
DOM tree leaves (action space)    160
DOM tree                          200
Instruction tokens                16
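
One common way to enforce fixed limits like these is to clip each sequence to its maximum length and right-pad shorter ones, keeping a mask of which slots are real. The sketch below illustrates that idea only. Whether this repo truncates, pads, or rejects oversized inputs is not stated here, and the constant and function names are hypothetical.

```python
# Limits taken from the table above.
MAX_DOM_LEAVES = 160
MAX_DOM_NODES = 200
MAX_INSTRUCTION_TOKENS = 16


def pad_or_truncate(items, max_len, pad_value=None):
    """Clip `items` to max_len and right-pad; return (padded, mask).

    mask[i] is 1 for a real item and 0 for padding.
    """
    kept = list(items)[:max_len]
    n_pad = max_len - len(kept)
    padded = kept + [pad_value] * n_pad
    mask = [1] * len(kept) + [0] * n_pad
    return padded, mask


tokens, token_mask = pad_or_truncate(
    "click the button".split(), MAX_INSTRUCTION_TOKENS, pad_value="<pad>"
)
```

Here `tokens` has 16 entries, of which the first 3 are real, and `token_mask` marks exactly those 3.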

Attribute embeddings & vocabulary

Attribute                          Max vocabulary    Embedding dimension
Tag                                100               16
Text (shared with instructions)    600               48
Class                              100               16
  • Unknown (UNK) tokens
    Each is assigned a random vector, so that its cosine similarity with a matching text attribute can still reach 1.0 in the direct alignment.
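
One reading of the UNK rule is that an out-of-vocabulary token gets its own random vector, reused wherever that token appears, so the same unknown word in the instruction and in a DOM text attribute is compared against itself and scores 1.0. The sketch below illustrates that reading with the standard library only. The class `UnkAwareEmbedder` and the per-token cache are assumptions for illustration, not the repo's implementation.

```python
import math
import random

TEXT_EMBED_DIM = 48  # text embedding dimension from the table above


class UnkAwareEmbedder:
    """Look up known tokens; give each unknown token its own fixed random vector."""

    def __init__(self, known_vectors, dim=TEXT_EMBED_DIM, seed=0):
        self.known = dict(known_vectors)
        self.dim = dim
        self.rng = random.Random(seed)
        self.unk_cache = {}

    def embed(self, token):
        if token in self.known:
            return self.known[token]
        if token not in self.unk_cache:
            # First sighting of this unknown token: draw and remember a vector.
            self.unk_cache[token] = [
                self.rng.gauss(0.0, 1.0) for _ in range(self.dim)
            ]
        return self.unk_cache[token]


def cosine_similarity(u, v):
    """Cosine of the angle between two equal-length, non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)
```

With this setup, an unknown token compared with itself scores 1.0 up to floating-point rounding, while two different unknown tokens get unrelated random vectors and score well below that.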

Acknowledgement

Credit to Dopamine for the implementation of prioritized replay used in dstructs/dopamine_segtree.py

Contributors

kazel04, sheng-j
