
uhh-lt / wsd


A system for unsupervised knowledge-free interpretable word sense disambiguation based on distributional semantics

Home Page: http://jobimtext.org/wsd

License: GNU General Public License v3.0

Scala 36.38% HTML 0.34% Shell 6.22% Python 0.10% JavaScript 56.37% CSS 0.02% R 0.49% Makefile 0.09%
sense-disambiguation sense wsd word-sense-disambiguation distributional-semantics jobimtext distributional-analysis

wsd's Introduction

Unsupervised Knowledge Free Word Sense Disambiguation

Software to construct and visualize Word Sense Disambiguation models based on JoBimText models. This project implements the method described in the following paper. Please cite it if you use this project in your research:

@inproceedings{Panchenko:17:emnlp,
  author    = {Panchenko, Alexander and Marten, Fide and Ruppert, Eugen and Faralli, Stefano and Ustalov, Dmitry and Ponzetto, Simone Paolo and Biemann, Chris},
  title     = {{Unsupervised, Knowledge-Free, and Interpretable Word Sense Disambiguation}},
  booktitle = {In Proceedings of the Conference on Empirical Methods in Natural Language Processing (EMNLP 2017)},
  year      = {2017},
  address   = {Copenhagen, Denmark},
  publisher = {Association for Computational Linguistics},
  language  = {english}
}

Prerequisites

Serving the WSD model

Online demo

Download precalculated DB and pictures

We provide a ready-to-use database and a dump of pictures for all senses in the database. Downloading and untarring them requires 300 GB of free disk space!

To download and prepare the project with these two artifacts, run:

./wsd model:download
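Since the download needs about 300 GB, it may be worth checking the free space on the target volume first. The following is a minimal sketch using only the Python standard library; the helper name has_enough_space and the choice of the current directory as the target are assumptions for illustration and are not part of the wsd tooling.

```python
import shutil

# Disk space figure quoted in this README for the download step.
REQUIRED_GB = 300


def has_enough_space(path=".", required_gb=REQUIRED_GB):
    """Return True if the volume holding `path` has at least `required_gb` GiB free.

    Hypothetical helper, not part of the wsd scripts.
    """
    free_bytes = shutil.disk_usage(path).free
    return free_bytes >= required_gb * 1024**3


if __name__ == "__main__":
    if has_enough_space():
        print("Enough free space to run ./wsd model:download")
    else:
        print("Less than 300 GB free; free up space before downloading")
```

Run it from the directory where the project is checked out, so that the check applies to the volume that will receive the data.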

Note: For instructions on how to rebuild the DB with the model, see the section "Build your own DB" below.

Start the web application

To start the application:

./wsd web-app:start

The web application runs with Docker Compose. To customize your installation, adjust docker-compose.override.yml. See the official Docker Compose documentation for general information on this file.

To get further information on the running containers you can use all Docker Compose commands, such as docker-compose ps and docker-compose logs.

Build your own DB

First set the $SPARK_HOME environment variable or provide spark-submit on your path.
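The precondition above can be checked before starting a build. The sketch below looks for spark-submit under $SPARK_HOME and then on the path. It assumes the standard Spark layout, in which the launcher lives at $SPARK_HOME/bin/spark-submit; the function name find_spark_submit is hypothetical, and the wsd scripts may resolve the binary differently.

```python
import os
import shutil


def find_spark_submit(environ=None, which=shutil.which):
    """Return a path to spark-submit, or None if it cannot be located.

    Hypothetical helper. Assumes the usual $SPARK_HOME/bin/spark-submit layout.
    """
    environ = os.environ if environ is None else environ
    spark_home = environ.get("SPARK_HOME")
    if spark_home:
        # Prefer an explicitly configured Spark installation.
        return os.path.join(spark_home, "bin", "spark-submit")
    # Otherwise fall back to whatever is on the PATH.
    return which("spark-submit")


if __name__ == "__main__":
    location = find_spark_submit()
    if location is None:
        print("Set SPARK_HOME or put spark-submit on your PATH")
    else:
        print("Would use spark-submit at:", location)
```

The function only computes a candidate path; it does not verify that the file under $SPARK_HOME exists or is executable.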

By modifying the script scripts/spark_submit_jar.sh you can adjust the amount of memory used by Spark (consider changing --conf 'spark.driver.memory=4g' and --conf 'spark.executor.memory=1g').

We recommend first using a toy training data set to build a toy model, which takes only a few minutes.

Build small toy model

./wsd model:build-toy

This model only provides senses for the word "Python" but is fully functional.

Build full model

Building the full model takes nearly 11 hours on an eight-core machine with 30 GB of memory and needs around 300 GB of free disk space. It also downloads 4 GB of training data.

./wsd model:build-full

See also

./wsd --help

wsd's People

Contributors

alexanderpanchenko, fmarten


wsd's Issues

API not working

I downloaded the model and started the API and the web UI.
I'm getting 404 responses when I try to use the API and 'Network communication failed' when I try to use the web UI.

When I run ./wsd web-app:test, it just gets stuck and there is no output.

Please help.

Thanks

License?

Hi!

What is the license for this project?

no database found for wsd

Hello,

I am trying to install the WSD project (https://github.com/uhh-lt/wsd).
When I execute the commands './wsd model:download' and './wsd web-app:start', everything seems fine and the three containers start. But when I open a shell in the PostgreSQL container, I find three databases (postgres, template0, template1) and no tables in any of them. When I then execute sudo ./wsd web-app:test, I get the following error:

[ERROR] API responded with status code '500'.

For more details run:
curl 'http://localhost:9000/predictSense' -H 'content-type: application/json' -d '{"word":"python","context":"Pyhthon is a programming language.","model":"cos_traditional_self"}'

My docker-compose.yml looks like this:
version: '3'
services:
  db:
    image: postgres:9.5.5
    volumes:
      # Assumes you downloaded postgres data to the folder ./pgdata
      - ./pgdata/data:/var/lib/postgresql/data
    environment:
      - POSTGRES_USER=postgres
      - POSTGRES_PASSWORD=p0stgres
  api:
    build:
      context: api/target/docker/stage
    environment:
      - APPLICATION_SECRET=change_me
      - WSP_DB_SERVER_URL=jdbc:postgresql://db
      - WSP_FLICKR_API_KEY=change_me
      - WSP_BING_API_KEY=change_me
      - WSP_IMAGE_SEARCH_ENGINE=bing
      - WSP_API_PUBLIC_URL=set_me
      - WSP_API_BING_IMAGE_FOLDER=/imgdata/bing
    volumes:
      - ./imgdata/bing:/imgdata/bing
    depends_on:
      - db
    restart: always
  web:
    build:
      context: web
      args:
        public_url: "" # Run the web application on the root namespace
        api_host: "set_me"
        external_api_endpoint: "http://ltmaggie.informatik.uni-hamburg.de/wsd-server/predictWordSense"
        image_api_name: "bing"
    depends_on:
      - api

I urgently need your help.
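For debugging reports like the one above, the request printed by web-app:test can also be built from Python. This is a minimal sketch with the standard library only: the endpoint path, port, and JSON fields are copied from the curl command in the report, while the function name build_predict_request is hypothetical. The response format is not described in this document, so the sketch stops at constructing the request.

```python
import json
import urllib.request


def build_predict_request(base_url, word, context, model):
    """Build a POST request for the /predictSense endpoint.

    Hypothetical helper; field names follow the curl command quoted above.
    """
    payload = json.dumps(
        {"word": word, "context": context, "model": model}
    ).encode("utf-8")
    return urllib.request.Request(
        base_url.rstrip("/") + "/predictSense",
        data=payload,  # supplying a body makes this a POST request
        headers={"Content-Type": "application/json"},
    )


request = build_predict_request(
    "http://localhost:9000",
    "python",
    "Python is a programming language.",
    "cos_traditional_self",
)

# To actually send it (requires the API container to be running):
# with urllib.request.urlopen(request) as response:
#     print(response.status, response.read().decode("utf-8"))
```

A 500 status from this call would confirm that the failure is on the API side rather than in the web UI.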

how to add inventory into mysql

Hello,

I am trying to install the WSD-Server (https://github.com/eugenso/WSD-Server).
One of the steps is 'Add the inventories to a MySQL database'. After downloading the three inventories, how can I add them to MySQL? The content of the inventories does not seem to follow a regular format.

I urgently need your help.

any way to reduce size of model by removing Image database

I only need the API endpoints, not the GUI, which also loads images. For example:
Endpoint: /predictSense
Example request:
curl -H "Content-Type: application/json" \
  -X POST \
  -d '{"context":"Java is an island.","word":"Java", "model": "simwords"}' \
  $YOUR_API_SERVER/predictWordSense

I saw that imgdata/data stores the image data. Is there a way to remove these 135 GB of data and still have the API working?
