ikekonglp / tweeboparser
A Dependency Parser for Tweets
License: GNU General Public License v3.0
How to obtain noun phrases, word stems, or the head of the sentence
I completed all the previous steps.
When I run run.sh, I get an error saying that ./turboparser cannot be found.
What did I do wrong?
If you feed the parser something like this (a random tweet I grabbed)
9th grade to 10th grade to 11th grade to 12th grade (at my school at least): scum to student to bully to god. Isn't High School the best?
It outputs this
1 9th _ A A _ 2 _
2 grade _ N N _ 0 _
3 to _ P P _ 2 _
4 10th _ $ $ _ 5 _
5 grade _ N N _ 3 _
6 to _ P P _ 5 _
7 11th _ $ $ _ 8 _
8 grade _ N N _ 6 _
9 to _ P P _ 8 _
10 12th _ $ $ _ 11 _
11 grade _ N N _ 9 _
12 ( _ , , _ -1 _
13 at _ P P _ 0 _
14 my _ D D _ 15 _
15 school _ N N _ 13 _
16 at _ P P _ 15 _
17 least _ A A _ 16 MWE
18 ): _ , , _ -1 _
19 scum _ N N _ 0 _
20 to _ P P _ 19 _
21 student _ N N _ 20 _
22 to _ P P _ 21 _
23 bully _ V V _ 22 _
24 to _ P P _ 23 _
25 god _ ^ ^ _ 24 _
26 . _ , , _ -1 _
27 Isn't _ V V _ 0 _
28 High _ A A _ 29 _
29 School _ N N _ 27 _
30 the _ D D _ 31 _
31 best _ A A _ 27 _
32 ? _ , , _ -1 _
Note line 18: it's being parsed as a sad face, ):, when it should probably be treated as 2 separate tokens, ) and :, so as to not confuse programs that are trying to do e.g. sentiment analysis.
When I run ./install.sh, I receive this error:
Installing glog...
Cloning into 'glog'...
remote: Counting objects: 1720, done.
remote: Total 1720 (delta 0), reused 0 (delta 0), pack-reused 1720
Receiving objects: 100% (1720/1720), 1.45 MiB | 503.00 KiB/s, done.
Resolving deltas: 100% (1230/1230), done.
Checking connectivity... done.
expr: syntax error
./install_deps.sh: line 50: ./configure: No such file or directory
Done.
I also receive this error:
g++ -DHAVE_CONFIG_H -I. -I../.. -I../util -I../classifier -I../parser -I../../deps/local/include -I/ad3 -I../../deps/local/include -I/ad3 -g -O2 -std=c++11 -MT SequenceFeatures.o -MD -MP -MF .deps/SequenceFeatures.Tpo -c -o SequenceFeatures.o SequenceFeatures.cpp
In file included from SequenceFeatures.cpp:19:
In file included from ./SequencePipe.h:22:
In file included from ../classifier/Pipe.h:23:
In file included from ../classifier/Features.h:22:
../classifier/Part.h:23:10: fatal error: 'glog/logging.h' file not found
#include <glog/logging.h>
^
1 error generated.
make[2]: *** [SequenceFeatures.o] Error 1
make[1]: *** [all-recursive] Error 1
make: *** [all] Error 2
Does anyone know how to fix this?
The link to the output format does not work.
Could you provide the format of the output, please? Thanks.
The parser is described as using the CoNLL format, but the CoNLL website lists 10 columns, whereas your output has only 8. I do not know which 8 they are.
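Judging only from the sample output pasted earlier in this thread, the eight columns look like the first eight CoNLL-X columns (ID, FORM, LEMMA, CPOSTAG, POSTAG, FEATS, HEAD, DEPREL), with HEAD 0 apparently marking a root and HEAD -1 a token left out of the tree. That reading is an assumption, not something confirmed here. Under that assumption, a minimal Python sketch for loading one parsed tweet and picking out its head word(s) could look like this (the names `Token`, `parse_sentence`, `roots`, and `unattached` are hypothetical):

```python
from typing import List, NamedTuple


class Token(NamedTuple):
    # Column names are ASSUMED from the sample output (first eight CoNLL-X columns).
    idx: int
    form: str
    lemma: str
    cpostag: str
    postag: str
    feats: str
    head: int
    deprel: str


def parse_sentence(block: str) -> List[Token]:
    """Parse the output lines of one tweet into Token records."""
    tokens = []
    for line in block.strip().splitlines():
        # Accept tab-separated or space-separated columns.
        cols = line.split("\t") if "\t" in line else line.split()
        if len(cols) != 8:
            raise ValueError(f"expected 8 columns, got {len(cols)}: {line!r}")
        tokens.append(Token(int(cols[0]), cols[1], cols[2], cols[3],
                            cols[4], cols[5], int(cols[6]), cols[7]))
    return tokens


def roots(tokens: List[Token]) -> List[Token]:
    """Tokens whose HEAD is 0 (assumed to mean 'root of a tree')."""
    return [t for t in tokens if t.head == 0]


def unattached(tokens: List[Token]) -> List[Token]:
    """Tokens whose HEAD is -1 (assumed to mean 'not part of the parse')."""
    return [t for t in tokens if t.head == -1]


# A few lines copied from the sample output in this thread.
sample = """\
1 9th _ A A _ 2 _
2 grade _ N N _ 0 _
12 ( _ , , _ -1 _
17 least _ A A _ 16 MWE
"""
tokens = parse_sentence(sample)
print([t.form for t in roots(tokens)])
print([t.form for t in unattached(tokens)])
```

Noun phrases and stems are not present in this output, so they would need a separate tool; the sketch only recovers what the HEAD column already encodes.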
Can I visualize the output of TweeboParser as a tree view using a library?
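Without committing to any particular visualization library, a plain-text tree can be printed with the Python standard library alone. The sketch below assumes the seventh output column is the head index, that 0 marks a root, and that -1 marks a token outside the tree; `render_tree` is a hypothetical helper, not part of TweeboParser.

```python
from collections import defaultdict


def render_tree(rows):
    """rows: list of (idx, form, head) tuples; returns indented text lines."""
    children = defaultdict(list)
    forms = {}
    for idx, form, head in rows:
        forms[idx] = form
        children[head].append(idx)

    lines = []

    def walk(node, depth):
        # One line per token, indented by its depth in the tree.
        lines.append("  " * depth + f"{forms[node]} ({node})")
        for child in children[node]:
            walk(child, depth + 1)

    # Start from every token attached to the artificial root 0.
    # Tokens with head -1 are never reached and so are not drawn.
    for root in children[0]:
        walk(root, 0)
    return lines


# (idx, form, head) triples taken from the sample output in this thread.
rows = [(1, "9th", 2), (2, "grade", 0), (3, "to", 2),
        (4, "10th", 5), (5, "grade", 3), (12, "(", -1)]
print("\n".join(render_tree(rows)))
```

The same `children` mapping could be handed to a graph-drawing library if a graphical view is preferred.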
Hello,
In a fresh environment (e.g. Docker), the install.sh script fails silently if curl is not installed. This will cause the models to fail to download with no warning and can be confusing when attempting to execute run.sh after a seemingly successful installation. I'd like to request that a warning or error be raised if curl fails. Thanks so much.
Dryden
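Until the script itself reports the problem, a small preflight check run before install.sh can catch a missing curl. This is a sketch of one possible workaround, not a change to the installer; `missing_tools` is a hypothetical helper, and the list of required tools is an assumption based on this report.

```python
import shutil
import sys


def missing_tools(required=("curl",)):
    """Return the names in `required` that are not found on PATH."""
    return [tool for tool in required if shutil.which(tool) is None]


def preflight():
    """Print a warning and return a non-zero code if a tool is missing."""
    missing = missing_tools()
    if missing:
        print("Missing required tools: " + ", ".join(missing), file=sys.stderr)
        return 1
    return 0
```

Calling `preflight()` and stopping when it returns 1 gives the explicit error that the silent failure currently hides.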
When I try to run install.sh, the installation breaks due to the following error:
../classifier/Options.h:24:10: fatal error: 'gflags/gflags.h' file not found
Is there a missing gflags/gflags.h file?
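Both this error and the glog one reported earlier in this thread name a header the compiler could not find. The compile command in that earlier log passes -I../../deps/local/include, which suggests the dependency headers are expected under a deps/local/include directory. Where that directory sits in a given checkout is an assumption, so the sketch below takes it as a parameter; `missing_headers` is a hypothetical helper for checking whether the dependency step actually installed the headers.

```python
from pathlib import Path


def missing_headers(include_dir,
                    headers=("glog/logging.h", "gflags/gflags.h")):
    """Return the headers from `headers` that are absent under include_dir."""
    root = Path(include_dir)
    return [h for h in headers if not (root / h).is_file()]


# Example call; the path is an ASSUMPTION about the checkout layout.
# print(missing_headers("deps/local/include"))
```

A non-empty result would point at the dependency installation step rather than at the parser sources.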
Could you please give details on the output format (CoNLL-X or CoNLL-U)? It is unclear.
I'm getting an error code (java.io.FileNotFoundException) when I run the run.sh script on a file in a directory with a space in the path.
(To clarify, the space is in the argument to the run.sh script)
This isn't a big deal as I simply changed the directory name but I figured it might be useful to point out. I may have some time in a couple weeks to try to fix this.
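As a stopgap that avoids renaming directories, the input can be copied to a location whose file name has no spaces before it is handed to run.sh. This is only a sketch of that workaround; `stage_without_spaces` is a hypothetical helper, and it does not fix the quoting inside the script.

```python
import shutil
import tempfile
from pathlib import Path


def stage_without_spaces(src):
    """Copy `src` into a fresh temp directory under a space-free file name."""
    src = Path(src)
    # The temp directory itself is space-free on typical systems, but that
    # depends on the TMPDIR setting, so treat it as an assumption.
    staging = Path(tempfile.mkdtemp(prefix="tweebo_"))
    dest = staging / src.name.replace(" ", "_")
    shutil.copy2(src, dest)
    return dest


# Hypothetical usage (requires the TweeboParser checkout):
# import subprocess
# staged = stage_without_spaces("/data/my tweets/input.txt")
# subprocess.run(["./run.sh", str(staged)], check=True)
```

Where the results end up depends on the script, which this sketch makes no assumption about.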
Hi, thanks for providing TweeboParser. I'd like to offer two suggestions for the install.sh and the handling of the pretrained_models.tar.gz because the download time from your server can take up to 10 minutes, even with a 700 MBit/s internet connection.
It would be convenient if the output produced by the TweeboParser could be visualized using the brat annotation tool. To do so, one needs to convert the CoNLL-ish output produced by Tweebo to the standoff format.
Unfortunately, the Tweebo output is slightly non-standard CoNLL and hence breaks brat's own conversion tools.
The attached file contains a Python script that performs the conversion as well as configuration files to use with brat.
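The attached script is not reproduced here. As an independent illustration of the idea, the sketch below turns (index, form, head) triples into brat standoff lines: one text-bound annotation per token and one relation per dependency arc. The type names `Token` and `Dep` are hypothetical and would have to be declared in brat's annotation.conf. The head-index convention (0 for root, -1 for unattached) is assumed from the sample output in this thread.

```python
def to_brat(rows):
    """rows: list of (idx, form, head). Returns (text, annotation_lines)."""
    ann = []
    parts = []
    known = set()
    offset = 0
    for idx, form, _head in rows:
        start, end = offset, offset + len(form)
        # Text-bound annotation: ID, TAB, type and character span, TAB, text.
        ann.append(f"T{idx}\tToken {start} {end}\t{form}")
        parts.append(form)
        known.add(idx)
        offset = end + 1  # tokens are joined with a single space
    text = " ".join(parts)

    rel = 0
    for idx, _form, head in rows:
        # Skip roots (0), unattached tokens (-1), and dangling heads.
        if head > 0 and head in known:
            rel += 1
            ann.append(f"R{rel}\tDep Arg1:T{head} Arg2:T{idx}")
    return text, ann


text, ann = to_brat([(1, "9th", 2), (2, "grade", 0), (3, "(", -1)])
print(text)
print("\n".join(ann))
```

Writing `text` to a .txt file and the annotation lines to a matching .ann file gives brat a document pair it can display.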