tweeboparser's People

Contributors

ikekonglp, nkini, swabhs

tweeboparser's Issues

Try to run file run.sh

I completed all the previous steps.
When I run run.sh, I get an error saying that ./turboparser cannot be found.
What did I do wrong?

"(explanation in parentheses): some more text" parsed wrong

If you feed the parser something like this (a random tweet I grabbed)

9th grade to 10th grade to 11th grade to 12th grade (at my school at least): scum to student to bully to god. Isn't High School the best?

It outputs this

1	9th	_	A	A	_	2	_
2	grade	_	N	N	_	0	_
3	to	_	P	P	_	2	_
4	10th	_	$	$	_	5	_
5	grade	_	N	N	_	3	_
6	to	_	P	P	_	5	_
7	11th	_	$	$	_	8	_
8	grade	_	N	N	_	6	_
9	to	_	P	P	_	8	_
10	12th	_	$	$	_	11	_
11	grade	_	N	N	_	9	_
12	(	_	,	,	_	-1	_
13	at	_	P	P	_	0	_
14	my	_	D	D	_	15	_
15	school	_	N	N	_	13	_
16	at	_	P	P	_	15	_
17	least	_	A	A	_	16	MWE
18	):	_	,	,	_	-1	_
19	scum	_	N	N	_	0	_
20	to	_	P	P	_	19	_
21	student	_	N	N	_	20	_
22	to	_	P	P	_	21	_
23	bully	_	V	V	_	22	_
24	to	_	P	P	_	23	_
25	god	_	^	^	_	24	_
26	.	_	,	,	_	-1	_
27	Isn't	_	V	V	_	0	_
28	High	_	A	A	_	29	_
29	School	_	N	N	_	27	_
30	the	_	D	D	_	31	_
31	best	_	A	A	_	27	_
32	?	_	,	,	_	-1	_

Note line 18: the text "):" is being treated as a single sad-face token when it should probably be split into 2 separate tokens, ")" and ":", so as not to confuse programs that are trying to do e.g. sentiment analysis.
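One possible workaround on the caller's side is sketched below. It does not change TweeboParser's own tokenizer; it only rewrites a token list, splitting "):" into ")" and ":" when an opening parenthesis is still unmatched. The function name split_closing_paren_emoticon is hypothetical.

```python
def split_closing_paren_emoticon(tokens):
    """Split '):' into ')' and ':' only while a '(' is still open."""
    out = []
    open_parens = 0  # number of '(' tokens not yet closed
    for tok in tokens:
        if tok == "(":
            open_parens += 1
            out.append(tok)
        elif tok == ")" and open_parens > 0:
            open_parens -= 1
            out.append(tok)
        elif tok == "):" and open_parens > 0:
            # The ')' closes the parenthetical; the ':' is punctuation.
            open_parens -= 1
            out.extend([")", ":"])
        else:
            # Without an open '(', '):' is left alone as a likely emoticon.
            out.append(tok)
    return out
```

With no open parenthesis the token is left untouched, so a genuine sad face at the end of a tweet is still kept as one token.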

configure error and glog error

When I run ./install.sh I get this error:
Installing glog...
Cloning into 'glog'...
remote: Counting objects: 1720, done.
remote: Total 1720 (delta 0), reused 0 (delta 0), pack-reused 1720
Receiving objects: 100% (1720/1720), 1.45 MiB | 503.00 KiB/s, done.
Resolving deltas: 100% (1230/1230), done.
Checking connectivity... done.
expr: syntax error
./install_deps.sh: line 50: ./configure: No such file or directory
Done.

I also get this error:
g++ -DHAVE_CONFIG_H -I. -I../.. -I../util -I../classifier -I../parser -I../../deps/local/include -I/ad3 -I../../deps/local/include -I/ad3 -g -O2 -std=c++11 -MT SequenceFeatures.o -MD -MP -MF .deps/SequenceFeatures.Tpo -c -o SequenceFeatures.o SequenceFeatures.cpp
In file included from SequenceFeatures.cpp:19:
In file included from ./SequencePipe.h:22:
In file included from ../classifier/Pipe.h:23:
In file included from ../classifier/Features.h:22:
../classifier/Part.h:23:10: fatal error: 'glog/logging.h' file not found
#include <glog/logging.h>
^
1 error generated.
make[2]: *** [SequenceFeatures.o] Error 1
make[1]: *** [all-recursive] Error 1
make: *** [all] Error 2
Does anyone know how to fix this?

The output format

The link to the output format does not work.
Could you describe the format of the output, please? Thanks.
The documentation says the parser uses the CoNLL format, but the CoNLL website lists 10 columns while your output has only 8, and I cannot tell which 8 they are.
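Until the format is documented, the sample output quoted in the tokenization issue above can at least be read positionally. The sketch below assumes, without confirmation from the authors, that the 8 tab-separated columns correspond to the first 8 CoNLL-X columns (ID, FORM, LEMMA, CPOSTAG, POSTAG, FEATS, HEAD, DEPREL), with the last column carrying labels such as MWE. The Token type and parse_sentence function are hypothetical names.

```python
from typing import List, NamedTuple


class Token(NamedTuple):
    index: int       # column 1: token position in the tweet
    form: str        # column 2: the token text
    coarse_tag: str  # column 4: assumed coarse POS tag
    fine_tag: str    # column 5: assumed fine POS tag
    head: int        # column 7: index of the head token (0 and -1 also occur)
    label: str       # column 8: "_" or a label such as "MWE"


def parse_sentence(block: str) -> List[Token]:
    """Parse one block of 8-column, tab-separated parser output."""
    tokens = []
    for line in block.splitlines():
        if not line.strip():
            continue  # skip blank separator lines
        cols = line.split("\t")
        if len(cols) != 8:
            raise ValueError(f"expected 8 columns, got {len(cols)}: {line!r}")
        tokens.append(
            Token(int(cols[0]), cols[1], cols[3], cols[4], int(cols[6]), cols[7])
        )
    return tokens
```

Columns 3 and 6 are "_" throughout the sample, so the sketch ignores them.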

model downloads silently fail if curl is not installed

Hello,

In a fresh environment (e.g. Docker), the install.sh script fails silently if curl is not installed. This will cause the models to fail to download with no warning and can be confusing when attempting to execute run.sh after a seemingly successful installation. I'd like to request that a warning or error be raised if curl fails. Thanks so much.
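A pre-flight check along the following lines could be run before install.sh in the meantime. This is only a sketch of the requested behaviour, not part of the project; missing_tools is a hypothetical helper built on the standard library's shutil.which.

```python
import shutil


def missing_tools(required=("curl",), which=shutil.which):
    """Return the names in `required` that are not found on PATH."""
    return [name for name in required if which(name) is None]


# Example use before calling install.sh:
#   missing = missing_tools()
#   if missing:
#       raise SystemExit("missing required tools: " + ", ".join(missing))
```

The `which` parameter exists only so the lookup can be replaced in tests; in normal use the default is enough.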

Dryden

missing gflags/gflags.h file

When I try to run install.sh the installation breaks due to the following error:

../classifier/Options.h:24:10: fatal error: 'gflags/gflags.h' file not found

Is there a missing gflags/gflags.h file?

Unclear format

Could you please specify the output format (CoNLL-X or CoNLL-U)? It is currently unclear.

Script cannot deal with directories that have a space

I'm getting an error code (java.io.FileNotFoundException) when I run the run.sh script on a file in a directory with a space in the path.

(To clarify, the space is in the argument to the run.sh script)

This isn't a big deal as I simply changed the directory name but I figured it might be useful to point out. I may have some time in a couple weeks to try to fix this.
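Until run.sh itself handles such paths, a caller can sidestep the problem by copying the input to a location without spaces first. The sketch below does only that; stage_without_spaces is a hypothetical helper, and it assumes the system temporary directory has no spaces in its own path.

```python
import shutil
import tempfile
from pathlib import Path


def stage_without_spaces(src, staging_root=None):
    """Copy `src` into a fresh temp directory under a space-free file name."""
    src = Path(src)
    staging_dir = Path(tempfile.mkdtemp(prefix="tweebo_", dir=staging_root))
    target = staging_dir / src.name.replace(" ", "_")
    if " " in str(target):
        # The staging directory itself contains a space; pick another root.
        raise ValueError(f"staging path still contains a space: {target}")
    shutil.copy(src, target)
    return target
```

The returned path can then be passed to run.sh in place of the original file.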

Change the way the pretrained_models.tar.gz is handled

Hi, thanks for providing TweeboParser. I'd like to offer two suggestions for the install.sh and the handling of the pretrained_models.tar.gz because the download time from your server can take up to 10 minutes, even with a 700 MBit/s internet connection.

  1. Please add an optional parameter that skips "rm pretrained_models.tar.gz" if the parameter is set. In our case, we could use a cached copy of the models but still want the latest TweeboParser from GitHub.
  2. Allow another optional parameter for an alternative mirror for the curl call.
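The decision logic behind these two suggestions might look like the sketch below. It is written in Python only to illustrate the idea, since install.sh is a shell script; plan_model_fetch and all of its parameters are hypothetical, and no real download URL is assumed.

```python
from pathlib import Path


def plan_model_fetch(archive_path, default_url, mirror_url=None, keep_archive=False):
    """Decide whether to reuse a cached archive or download it, and from where."""
    if Path(archive_path).is_file():
        action, url = "reuse", None       # suggestion 1: use the cached copy
    else:
        action = "download"
        url = mirror_url or default_url   # suggestion 2: prefer the mirror if given
    return {
        "action": action,
        "url": url,
        # Skip the "rm pretrained_models.tar.gz" step when the caller asks to keep it.
        "delete_after_extract": not keep_archive,
    }
```

Keeping the decision separate from the download itself makes both options easy to test without network access.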

Include a script to convert CoNLL output to standoff format for use with brat visualizer.

It would be convenient if the output produced by the TweeboParser could be visualized using the brat annotation tool. To do so, one needs to convert the CoNLL-ish output produced by Tweebo to the standoff format.

Unfortunately, the Tweebo output is slightly non-standard CoNLL and hence breaks brat's own conversion tools.

The attached file contains a python script that performs the conversion as well as configuration files to use with Brat.

tweebo-brat.zip
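For readers without the attachment, the core of such a conversion can be sketched as follows. This is not the attached script; it is a minimal illustration that assumes tokens are given as (index, form, head) tuples, joins them with single spaces to form the text file, and uses the generic type names Token and Dep, which would have to be declared in brat's configuration. Heads that do not point at a token in the list (0 and -1 in the sample output) produce no relation.

```python
def to_brat(tokens):
    """Convert (index, form, head) tuples to (text, standoff annotations)."""
    text_parts, ann_lines, offsets = [], [], {}
    pos = 0
    for index, form, _head in tokens:
        start, end = pos, pos + len(form)
        offsets[index] = (start, end)
        # Text-bound annotation: ID, type with character offsets, covered text.
        ann_lines.append(f"T{index}\tToken {start} {end}\t{form}")
        text_parts.append(form)
        pos = end + 1  # account for the joining space
    rel_id = 0
    for index, _form, head in tokens:
        if head in offsets:  # skips heads such as 0 and -1
            rel_id += 1
            ann_lines.append(f"R{rel_id}\tDep Arg1:T{head} Arg2:T{index}")
    return " ".join(text_parts), "\n".join(ann_lines)
```

The first returned string would go into the .txt file and the second into the matching .ann file.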
