An exhaustive unlexicalised PCFG parser. It does no Markovisation of rules and no annotation of internal symbols.
- Read treebank trees (texpr)
- Remove function tags (“-LOC” etc.) and “-NONE-” elements
- Binarize trees
- Read grammar off trees
- Output 2NF rules with probabilities
- Output inverse unary relations
- Read grammar
- Read input sentence (tokens with POS tags)
- Run CKY algorithm
- Output parse tree from CKY table (texpr)
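
The steps above can be sketched in Python roughly as follows. This is a minimal illustration and not the parser's actual code. The tuple-based tree encoding, the function names (`binarize`, `estimate`, `cky`) and the naming scheme for intermediate symbols are all assumptions made for the example. Trees are assumed to be already stripped of function tags and “-NONE-” elements. For brevity, unary rules are applied by iterating to a fixed point inside each cell rather than through precomputed inverse unary relations.

```python
import math
from collections import defaultdict

# Assumed encoding: a tree is (label, child, ...); a preterminal is (tag, word).

def binarize(tree):
    """Right-binarize. Intermediate labels record the children they cover,
    so no Markovisation (forgetting of siblings) takes place."""
    label, *kids = tree
    if len(kids) == 1 and isinstance(kids[0], str):
        return tree
    kids = [binarize(k) for k in kids]
    while len(kids) > 2:
        # Fold the two rightmost children under a fresh intermediate symbol.
        new_label = f"{label}|{kids[-2][0]}+{kids[-1][0]}"
        kids = kids[:-2] + [(new_label, kids[-2], kids[-1])]
    return (label, *kids)

def count_rules(tree, counts):
    label, *kids = tree
    if isinstance(kids[0], str):
        return  # tag over word: the input is already POS-tagged
    counts[label][tuple(k[0] for k in kids)] += 1
    for k in kids:
        count_rules(k, counts)

def estimate(trees):
    """Relative-frequency estimate; returns {(lhs, rhs): log probability}."""
    counts = defaultdict(lambda: defaultdict(int))
    for t in trees:
        count_rules(binarize(t), counts)
    rules = {}
    for lhs, rhs_counts in counts.items():
        total = sum(rhs_counts.values())
        for rhs, c in rhs_counts.items():
            rules[(lhs, rhs)] = math.log(c / total)
    return rules

def cky(tagged, rules, root="S"):
    """Viterbi CKY over (word, tag) pairs; returns the best tree or None."""
    n = len(tagged)
    unary = [(l, r[0], p) for (l, r), p in rules.items() if len(r) == 1]
    binary = [(l, r, p) for (l, r), p in rules.items() if len(r) == 2]
    best = defaultdict(dict)  # (i, j) -> {symbol: (log prob, backpointer)}

    def close_unaries(cell):
        changed = True
        while changed:  # repeat until no unary rule improves the cell
            changed = False
            for lhs, child, p in unary:
                if child in cell:
                    score = p + cell[child][0]
                    if lhs not in cell or score > cell[lhs][0]:
                        cell[lhs] = (score, (child,))
                        changed = True

    for i, (_, tag) in enumerate(tagged):
        best[i, i + 1][tag] = (0.0, None)
        close_unaries(best[i, i + 1])
    for width in range(2, n + 1):
        for i in range(n - width + 1):
            j = i + width
            cell = best[i, j]
            for k in range(i + 1, j):
                left, right = best[i, k], best[k, j]
                for lhs, (b, c), p in binary:
                    if b in left and c in right:
                        score = p + left[b][0] + right[c][0]
                        if lhs not in cell or score > cell[lhs][0]:
                            cell[lhs] = (score, (k, b, c))
            close_unaries(cell)

    def build(i, j, sym):
        back = best[i, j][sym][1]
        if back is None:
            return (sym, tagged[i][0])
        if len(back) == 1:
            return (sym, build(i, j, back[0]))
        k, b, c = back
        return (sym, build(i, k, b), build(k, j, c))

    return build(0, n, root) if root in best[0, n] else None
```

With a one-tree treebank, `estimate([("S", ("NP", ("DT", "the"), ("NN", "dog")), ("VP", ("VBZ", "barks")))])` yields three rules, each with probability 1, and `cky([("a", "DT"), ("cat", "NN"), ("sleeps", "VBZ")], rules)` recovers a tree of the same shape over the new words. The returned tree still contains any intermediate symbols introduced by `binarize`; removing them again is left out of this sketch.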