Giter VIP home page Giter VIP logo

lexuse's Introduction

LexUse

Scripts used to find and add usage examples to Wikidata Lexemes. All scripts here are licensed under GPL-v3 or later unless they are snippets from someone else (source is then noted above the code)

LexUse can be used as a library if you want. It contains the following modules:

  • config: setting up variables that affect all scripts
  • riksdagen: code related to the Riksdagen API
  • util: code reused among the language specific scripts

Requirements

  • Python >= 3.7 (datetime fromisoformat needed)
  • httpx
  • wikibaseintegrator

Install using pip: $ sudo pip install wikibaseintegrator httpx

If pip fails with errors related to python 2.7 you need to upgrade your OS. E.g. if you are using an old version of Ubuntu like 18.04.

Getting started

To get started install the following libraries with your package manager or python PIP:

  • httpx
  • wikibaseintegrator

Please create a bot password for running the script for safety reasons here: https://www.wikidata.org/wiki/Special:BotPasswords

Add the following variables to your ~/.bashrc (recommended): export LEXUSE_USERNAME="username" export LEXUSE_PASSWORD="password"

Alternatively edit the file named config.py yourself and adjust the following content:

username = "username" password= "password"

And delete the 2 lines related to environment labels.

Language specific scripts

Please help add support for more languages by making pull requests or issues with suggestions for new CC0 or out of copyright sources.

Maybe a wikisource module would be nice?

swedish.py

Script used to semi-automatically import usage examples from the Riksdagen Open Data API (400.000 documents) and possibly later from RAÄ K-samsök (10 mio. items with CC0 metadata) and https://www.wikidata.org/wiki/Q5412081.

For developers

It might be worthwile to add a REPL to the script and let the user choose what language to work on. For now they have to start the script named after the language to work on.

Pseudo code describing the internal operation of the script

fetch a list of lexeme forms and words loop through the list search for the word in choosen api extract sentence clean sentence present sentence for approval if approved if number of sense=1 present sense for approval else present senses for approval and ask the user to choose 1 add "demonstrates form" add "demonstrates sense" add a reference upload to WD

lexuse's People

Contributors

ainali avatar dpriskorn avatar lemyst avatar salgo60 avatar

Watchers

 avatar

Recommend Projects

  • React photo React

    A declarative, efficient, and flexible JavaScript library for building user interfaces.

  • Vue.js photo Vue.js

    🖖 Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.

  • Typescript photo Typescript

    TypeScript is a superset of JavaScript that compiles to clean JavaScript output.

  • TensorFlow photo TensorFlow

    An Open Source Machine Learning Framework for Everyone

  • Django photo Django

    The Web framework for perfectionists with deadlines.

  • D3 photo D3

    Bring data to life with SVG, Canvas and HTML. 📊📈🎉

Recommend Topics

  • javascript

    JavaScript (JS) is a lightweight interpreted programming language with first-class functions.

  • web

    Some thing interesting about web. New door for the world.

  • server

    A server is a program made to process requests and deliver data to clients.

  • Machine learning

    Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.

  • Game

    Some thing interesting about game, make everyone happy.

Recommend Org

  • Facebook photo Facebook

    We are working to build community through open source technology. NB: members must have two-factor auth.

  • Microsoft photo Microsoft

    Open source projects and samples from Microsoft.

  • Google photo Google

    Google ❤️ Open Source for everyone.

  • D3 photo D3

    Data-Driven Documents codes.