Comments (2)
$ python semstr/convert.py test_files/20001001.sdp -f conll
Converting: 100%|████████████| 1/1 [00:00<00:00, 87.65file/s, file=20001001.sdp]
$ cat 20001001.conll
# format = sdp
# sent_id = 20001001
1 Pierre Pierre NNP NNP _ 0 ROOT _ _
2 Vinken _generic_proper_ne_ NNP NNP _ 1 compound _ _
2 Vinken _generic_proper_ne_ NNP NNP _ 6 ARG1 _ _
2 Vinken _generic_proper_ne_ NNP NNP _ 9 ARG1 _ _
3 , _ , , _ 0 ROOT _ _
4 61 _generic_card_ne_ CD CD _ 0 ROOT _ _
5 years year NNS NNS _ 4 ARG1 _ _
6 old old JJ JJ _ 5 measure _ _
7 , _ , , _ 0 ROOT _ _
8 will will MD MD _ 0 ROOT _ _
9 join join VB VB _ 12 ARG1 _ _
9 join join VB VB _ 17 loc _ _
10 the the DT DT _ 0 ROOT _ _
11 board board NN NN _ 9 ARG2 _ _
11 board board NN NN _ 10 BV _ _
12 as as IN IN _ 0 ROOT _ _
13 a a DT DT _ 0 ROOT _ _
14 nonexecutive _generic_jj_ JJ JJ _ 0 ROOT _ _
15 director director NN NN _ 12 ARG2 _ _
15 director director NN NN _ 13 BV _ _
15 director director NN NN _ 14 ARG1 _ _
16 Nov. Nov. NNP NNP _ 0 ROOT _ _
17 29 _generic_dom_card_ne_ CD CD _ 16 of _ _
18 . _ . . _ 0 ROOT _ _
Note that the CoNLL format does not allow multiple heads, so these are discarded in the example above.
If, however, you want CoNLL-U format, just say so:
$ python semstr/convert.py test_files/20001001.sdp -f conllu
Converting: 100%|████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 1/1 [00:00<00:00, 76.49file/s, file=20001001.sdp]
$ cat 20001001.conllu
# format = sdp
# sent_id = 20001001
# text = Pierre Vinken , 61 years old , will join the board as a nonexecutive director Nov. 29 .
1 Pierre Pierre NNP NNP _ 0 root 0:root _
2 Vinken _generic_proper_ne_ NNP NNP _ 1 compound 1:compound|6:ARG1|9:ARG1 _
3 , _ , , _ 1 orphan 1:orphan _
4 61 _generic_card_ne_ CD CD _ 1 orphan 1:orphan _
5 years year NNS NNS _ 4 ARG1 4:ARG1 _
6 old old JJ JJ _ 5 measure 5:measure _
7 , _ , , _ 1 orphan 1:orphan _
8 will will MD MD _ 1 orphan 1:orphan _
9 join join VB VB _ 1 orphan 1:orphan _
10 the the DT DT _ 1 orphan 1:orphan _
11 board board NN NN _ 9 ARG2 9:ARG2|10:BV _
12 as as IN IN _ 1 orphan 1:orphan _
13 a a DT DT _ 1 orphan 1:orphan _
14 nonexecutive _generic_jj_ JJ JJ _ 1 orphan 1:orphan _
15 director director NN NN _ 12 ARG2 12:ARG2|13:BV|14:ARG1 _
16 Nov. Nov. NNP NNP _ 1 orphan 1:orphan _
17 29 _generic_dom_card_ne_ CD CD _ 16 of 16:of _
18 . _ . . _ 1 orphan 1:orphan _
This way you get all heads in the deps column (the ninth column, second to last), each written as head:label and separated by "|".
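To read those heads back programmatically, the deps column can be split on "|" and each entry split at its first ":". The following is a minimal sketch, not part of semstr: the helper names `parse_deps` and `read_heads` are hypothetical, and it assumes plain integer token IDs (no multiword ranges or empty nodes) and whitespace-separated columns as in the listing above.

```python
def parse_deps(deps_field):
    """Split a deps value such as '1:compound|6:ARG1' into (head, label) pairs."""
    if deps_field == "_":  # "_" marks an empty deps column
        return []
    pairs = []
    for item in deps_field.split("|"):
        # Split at the first colon only, so labels containing ":" stay intact.
        head, _, label = item.partition(":")
        pairs.append((int(head), label))  # assumption: integer head IDs only
    return pairs


def read_heads(lines):
    """Map each token ID to its list of (head, label) pairs."""
    heads = {}
    for line in lines:
        line = line.strip()
        if not line or line.startswith("#"):  # skip blank lines and comments
            continue
        columns = line.split()  # files on disk are normally tab-separated
        token_id, deps = columns[0], columns[8]  # deps is the ninth column
        heads[int(token_id)] = parse_deps(deps)
    return heads


sample = [
    "# sent_id = 20001001",
    "1 Pierre Pierre NNP NNP _ 0 root 0:root _",
    "2 Vinken _generic_proper_ne_ NNP NNP _ 1 compound 1:compound|6:ARG1|9:ARG1 _",
    "11 board board NN NN _ 9 ARG2 9:ARG2|10:BV _",
]
all_heads = read_heads(sample)
print(all_heads[2])
```

For the sample rows, token 2 (Vinken) ends up with three heads (1, 6 and 9), matching its deps entry in the listing.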