Giter VIP home page Giter VIP logo

bulgarian-nlp's Introduction

bulgarian-nlp's People

Contributors

amontgomerie avatar

Stargazers

 avatar  avatar  avatar  avatar  avatar

Watchers

 avatar  avatar

Forkers

kiivanova

bulgarian-nlp's Issues

Issues when using a paragraph as input

I have been recently using the model. However, when trying to use a paragraph or some long string as input the annotator shows an error due to

---> 22         assert len(tokens) == len(pos_tags)
     23         assert len(tokens) == len(ner_tags)
     24         annotation = {}

AssertionError: 

It seems like the number of tokens differs from the number of tags. This doesn't happen with a shorter string. I tried using POSTagger.generate_tags() and I get a list of tags that is around a third of the number of words in the paragraph. Is there some size restriction in the text that can be used as input? How can I work around this issue?

This is the text I was using as an example

През 1878 г., след почти век на културно и икономическо възраждане, неуспешни въстания и дипломатически борбиБългария възстановява държавността си под формата на монархия и се освобождава от петвековното османско владичество с помощтана Руската империя в Руско-турската Освободителна война. Малко след това България започва да води редица войни със своите съседии се съюзява с Германия по време на двете световни войни. На 15 септември 1946 г. монархията е заменена с народна република, от съветски тип и държавата се преименува на Народна република България, ръководена от Българската комунистическа партия. Социалистическият строй съществува до 1990 г., след което България поема по пътя на либералната демокрация и пазарната икономика. На 29 март 2004 г. страната се присъединява към НАТО, а на 1 януари 2007 г. – към Европейския съюз.

Examples do not work, project must be updated

Error on generating tags:

image

Also, imports must be updated.

I just followed the steps in the jupyter notebooks that you provided. The only thing I changed is the import from:
from models.postaggers import POSTagger
to:
from models.taggers import POSTagger

Recommend Projects

  • React photo React

    A declarative, efficient, and flexible JavaScript library for building user interfaces.

  • Vue.js photo Vue.js

    🖖 Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.

  • Typescript photo Typescript

    TypeScript is a superset of JavaScript that compiles to clean JavaScript output.

  • TensorFlow photo TensorFlow

    An Open Source Machine Learning Framework for Everyone

  • Django photo Django

    The Web framework for perfectionists with deadlines.

  • D3 photo D3

    Bring data to life with SVG, Canvas and HTML. 📊📈🎉

Recommend Topics

  • javascript

    JavaScript (JS) is a lightweight interpreted programming language with first-class functions.

  • web

    Some thing interesting about web. New door for the world.

  • server

    A server is a program made to process requests and deliver data to clients.

  • Machine learning

    Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.

  • Game

    Some thing interesting about game, make everyone happy.

Recommend Org

  • Facebook photo Facebook

    We are working to build community through open source technology. NB: members must have two-factor auth.

  • Microsoft photo Microsoft

    Open source projects and samples from Microsoft.

  • Google photo Google

    Google ❤️ Open Source for everyone.

  • D3 photo D3

    Data-Driven Documents codes.