matlab-stanford-postagger's Introduction

A small function to show how to use the stanford-pos-tagger in MATLAB.

Requirements

It requires the following files:

  1. english-left3words-distsim.tagger must be in the current path when the function runs. It can be found in $STANFORD_POS_TAGGER_PATH/models/
  2. stanford-postagger.jar must be on the Java classpath. The MATLAB command to add it: javaaddpath('$STANFORD_POS_TAGGER_PATH/stanford-postagger.jar')
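
The two requirements above can be scripted. The following is a minimal sketch and not part of the original function: the directory name stanford-postagger is a hypothetical install location standing in for $STANFORD_POS_TAGGER_PATH, and the existence checks are an added convenience.

```matlab
% Hypothetical install location; replace with your $STANFORD_POS_TAGGER_PATH.
taggerDir = fullfile(pwd, 'stanford-postagger');
jarPath   = fullfile(taggerDir, 'stanford-postagger.jar');
modelName = 'english-left3words-distsim.tagger';
modelSrc  = fullfile(taggerDir, 'models', modelName);

% Requirement 2: add the jar to the dynamic Java classpath if it exists.
if exist(jarPath, 'file') == 2
    javaaddpath(jarPath);
else
    fprintf('Jar not found: %s\n', jarPath);
end

% Requirement 1: copy the model into the current directory.
if exist(modelSrc, 'file') == 2
    copyfile(modelSrc, fullfile(pwd, modelName));
else
    fprintf('Model not found: %s\n', modelSrc);
end

% Confirm whether the jar is now on the dynamic classpath.
jarLoaded = any(strcmp(javaclasspath('-dynamic'), jarPath));
```

If jarLoaded is false, the messages printed by the checks should show which path needs adjusting.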

Usage

To run it, drop PosTaggerM.m into the current working directory and call:

PosTaggerM(sample_sentence)

Sample input:

This is a very small sample sentence for test purpose - Chomsky.

Sample output:

[This/DT, is/VBZ, a/DT, very/RB, small/JJ, sample/NN, sentence/NN, for/IN, test/NN, purpose/NN, -/:, Chomsky/NNP, ./.]

The result is an ArrayList of TaggedWords.
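
For further processing in MATLAB it is often handier to have the words and tags as cell arrays. The sketch below works on the printed form of the result (the sample output above, which for a live result could presumably be obtained with char(tagged.toString())) and splits each item at its last slash. The variable names are illustrative only. On the live Java object, the list's size() and get() methods and each TaggedWord's word() and tag() methods should give the same information without string parsing.

```matlab
% Printed form of the result, copied from the sample output above.
printed = ['[This/DT, is/VBZ, a/DT, very/RB, small/JJ, sample/NN, ' ...
           'sentence/NN, for/IN, test/NN, purpose/NN, -/:, Chomsky/NNP, ./.]'];

% Strip the surrounding brackets and split the items on ", ".
items = regexp(printed(2:end-1), ', ', 'split');

% Split each item at its last "/" so a token containing "/" keeps it.
pairs = regexp(items, '^(.*)/([^/]*)$', 'tokens', 'once');
words = cellfun(@(p) p{1}, pairs, 'UniformOutput', false);
tags  = cellfun(@(p) p{2}, pairs, 'UniformOutput', false);

% Print one "word <tab> tag" line per token.
for k = 1:numel(words)
    fprintf('%s\t%s\n', words{k}, tags{k});
end
```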

Note on performance: see the discussion on this issue.

File path for english-left3words-distsim.tagger on Windows: see the discussion on, and resolution of, this issue.
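
One possible cause of a model file that cannot be resolved is that Java resolves relative paths against the JVM's own working directory, which does not necessarily follow MATLAB's current folder. This is an assumption here, since the linked discussion is not reproduced in this text. A minimal sketch of building an absolute path that could be passed to MaxentTagger in place of './english-left3words-distsim.tagger':

```matlab
modelName = 'english-left3words-distsim.tagger';

% Absolute path to the model in MATLAB's current folder.
modelPath = fullfile(pwd, modelName);

% Directory that Java uses to resolve relative paths.
javaDir = char(java.lang.System.getProperty('user.dir'));

fprintf('MATLAB folder: %s\nJava user.dir: %s\n', pwd, javaDir);

% Report a missing model before Java gets a chance to fail on it.
if exist(modelPath, 'file') ~= 2
    fprintf('Model file is missing: %s\n', modelPath);
end

% With the jar on the classpath, the tagger could then be created as:
% tagger = edu.stanford.nlp.tagger.maxent.MaxentTagger(modelPath);
```

If the two printed directories differ, a relative './' path may point somewhere other than the folder shown in MATLAB.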

Compatibility

Verified to work on:

  • 3.3.1 and 3.4.1 of the tagger
  • MATLAB versions 2010a, 8.3.0.532 (R2014a), R2016a and R2017a.
  • JRE version: 1.7 (JRE 7) and 1.8 (JRE 8).

Also, see this issue for more details.
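
MATLAB runs Java code on its own bundled JVM, which can differ from the Java installed on the system, so when reporting a compatibility problem it helps to record what MATLAB itself sees. The following is a minimal sketch, not from the original repository, that prints the relevant versions and checks whether the tagger class is visible to MATLAB:

```matlab
% JVM that MATLAB itself runs (may differ from the system-wide Java).
jvmInfo = version('-java');

% MATLAB release string, such as '2017a'.
release = version('-release');

fprintf('MATLAB release: %s\nMATLAB JVM: %s\n', release, jvmInfo);

% exist(..., 'class') should return 8 when MATLAB can see the class.
className = 'edu.stanford.nlp.tagger.maxent.MaxentTagger';
taggerVisible = (exist(className, 'class') == 8);

if ~taggerVisible
    fprintf('%s is not visible; check javaclasspath.\n', className);
end
```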

Acknowledgements

This was initially hosted on my homepage. Douglas found the code and improved it to work with the latest version of the tagger.

@johnnykast helped debug some compatibility issues.

@Sardar-Usama did a detailed analysis of compatibility.

matlab-stanford-postagger's People

Contributors

musically-ut

matlab-stanford-postagger's Issues

java.io.IOException

I'm trying to use your files. But I'm receiving these errors:

Error using PosTaggerM (line 40)
Java exception occurred:
edu.stanford.nlp.io.RuntimeIOException: Unrecoverable error while loading a tagger model
	at edu.stanford.nlp.tagger.maxent.MaxentTagger.readModelAndInit(MaxentTagger.java:758)
	at edu.stanford.nlp.tagger.maxent.MaxentTagger.<init>(MaxentTagger.java:289)
	at edu.stanford.nlp.tagger.maxent.MaxentTagger.<init>(MaxentTagger.java:253)
Caused by: java.io.IOException: Unable to resolve "./english-left3words-distsim.tagger" as either class path, filename or URL
	at edu.stanford.nlp.io.IOUtils.getInputStreamFromURLOrClasspathOrFileSystem(IOUtils.java:434)
	at edu.stanford.nlp.tagger.maxent.MaxentTagger.readModelAndInit(MaxentTagger.java:753)
	... 2 more

What I did: I added the folder of the file english-left3words-distsim.tagger in the path and added stanford-postagger.jar to the classpath as you suggested here. Then I used this code for testing:

sample_sentence = 'This is a very small sample sentence for test purpose - Chomsky.';
PosTaggerM(sample_sentence)

I have tried on both MATLAB R2014a and R2016a (64-bit, Windows 10).
I have tried with both 3.3.1 and 3.4.1 of the tagger, and also with some other versions.
I have tried with the latest Java, with Version 8 Update 73 (build 1.8.0_73-b02), and with Java 1.7.0_60-b19.

How to solve this issue?

Can't make it with matlab-stanford-postagger

OK, here's my problem.
I want to POS tag a text file of product reviews so that I can run sentiment analysis on those opinions.
I downloaded the Stanford parser (31/10/2016) and copied the file named english-left3words-distsim.tagger into MATLAB's current working path.
I have installed Java on my computer.
I also added stanford-postagger.jar to the classpath using the MATLAB command:
javaaddpath('./stanford-postagger-2016-10-31/stanford-postagger.jar')
Last, I call the PosTaggerM(str) function to POS tag the string contained in the str variable.
An error about MaxentTagger() occurs on line 40 of the PosTaggerM() function, which is:
tagger = MaxentTagger('./english-left3words-distsim.tagger');
I don't know Java, so I cannot debug the MaxentTagger method. Can I be advised of what to do?
PS: I saw that this POS tagging is compatible with MATLAB 2014b and I'm working with MATLAB 2010b, but I understand that the problem has to do with the Java part, not the MATLAB part of the program. Any help or modifications would be very much appreciated.

Compatibility

I'm using your code. After spending some time to find the latest tagger that works with your code, I have the following conclusion to make:

The latest versions of MATLAB, Java and the tagger that I have tried and found your code working with, as of now (20th July, 2017), are: MATLAB R2017a, Java 8 Update 141, Stanford-Postagger 3.4.1.

I am attaching a list of my details for future reference.

Stanford-Postagger

I have found your code working with:

  • 3.3.0
  • 3.3.1
  • 3.4.1

The following versions don't work:

  • 3.5.0
  • 3.5.1
  • 3.5.2
  • 3.7.0
  • 3.8.0

All above non-working versions give this error:

Arguments to IMPORT must either end with ".*" or else specify a fully qualified
class name: "edu.stanford.nlp.tagger.maxent.MaxentTagger" fails this test.

I didn't dig into this, but there may be a solution to the above problem as well. What I have observed is that version 3.5.0 and later, which use Java 8, are not compatible with the code as it is. The same issue was pointed out in this thread with version 3.7.0.

Java version

I have tried it on the latest version as of now (20th July, 2017), i.e. Java 8 Update 141 (1.8.0_141-b15, release date July 18, 2017), and found it working.

MATLAB version

I have tried it on the following versions:
R2016a (Windows)
R2017a (Windows)
Both work.

Import edu.stanford.nlp.tagger.maxent.MaxentTagger fail

Hi,

I am trying to use your function for importing the Stanford NLP Part-of-Speech Tagger. The error I get is on the import of this particular class:

Arguments to IMPORT must either end with ".*" or else specify a fully qualified class name:
"edu.stanford.nlp.tagger.maxent.MaxentTagger" fails this test.

What am I using?:

  • MATLAB 2016b (Running on a mac)
  • java version "1.8.0_51"
  • stanford-postagger-full-2016-10-31

What did I do? :

  • Added stanford-postagger.jar using javaaddpath and verified that it was added using javaclasspath. It is listed in the dynamic Java path.
  • Placed PosTaggerM.m in a directory, with english-left3words-distsim.tagger in the same directory.

Is there something I am missing in this?
