cognitive-linguisticanalysis-windows's Introduction

Microsoft Linguistic Analysis API: Windows Client Library & Sample

This repo contains the Windows client library & sample for the Microsoft Linguistic Analysis API, an offering within Microsoft Cognitive Services, formerly known as Project Oxford.

Getting Started

To obtain a subscription key for Linguistic Analysis, visit our website and sign up for free using a Microsoft account. Then open, modify, build, and run the accompanying Visual Studio solution. This sample is a C# Windows console application demonstrating the use of the Microsoft Linguistic Analysis API.

Build the Sample

  1. Start Microsoft Visual Studio 2015 and select File > Open > Project/Solution.
  2. Starting in the folder where you cloned the repository, go to the LinguisticAnalysis > Windows > Sample folder.
  3. Double-click the Visual Studio 2015 solution (.sln) file.
  4. Paste your Microsoft Cognitive Services Linguistic Analysis subscription key into the LinguisticsClient constructor parameter value in Program.cs.
  5. Press Ctrl+Shift+B, or select Build > Build Solution.

Run the Sample

After the build is complete, press F5 to run the sample.

A number of lines will be printed on the console, showing the results received from the Linguistics service. These include the list of available analyzers and the results of running those analyzers on some simple text.

Press any key to exit the app.

Contributing

We welcome contributions. Feel free to file issues and pull requests on the repo and we'll address them as we can. Learn more about how you can help on our Contribution Rules & Guidelines.

This project has adopted the Microsoft Open Source Code of Conduct. For more information see the Code of Conduct FAQ or contact [email protected] with any additional questions or comments.

License

All Microsoft Cognitive Services SDKs and samples are licensed with the MIT License. For more details, see LICENSE.

Sample images are licensed separately; please refer to LICENSE-IMAGE.

Developer Code of Conduct

Developers using Cognitive Services, including this client library & sample, are expected to follow the “Developer Code of Conduct for Microsoft Cognitive Services”, found at http://go.microsoft.com/fwlink/?LinkId=698895.

cognitive-linguisticanalysis-windows's People

Contributors

chrisquirk, cthrash, delfu, lightfrenzy, msftgits

cognitive-linguisticanalysis-windows's Issues

Tagging problems leading to entity extraction problems

I've found what appears to be poor tagging: words that should be tagged as nouns or proper nouns are tagged correctly in some instances, but as verbs or adjectives in others. I believe the utterances I'm using are fairly straightforward English that the API shouldn't have too much trouble with. The mistagging causes problems when extracting entities from utterances. Take the utterances below as examples.

For my purposes, each utterance starts with an uppercase letter and is otherwise forced into lowercase, so the API can't rely on uppercase letters to decide whether a word is a proper noun. We have the API's output, and a further bit of code on our end to assess whether the word is an entity. (That relies on the root tag and certain patterns of tags, so we could modify that on our end for a temporary workaround.)

"Call edward smith and vincenzo de campo":

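| Phrase   | Root Tag | Tags            | Is Entity |
|----------|----------|-----------------|-----------|
| Call     | VP       | VP-VP-VB        | False     |
| edward   | VP       | VP-VP-S-NP-NN   | True      |
| smith    | VP       | VP-VP-S-ADJP-JJ | True      |
| and      | VP       | VP-CC           | False     |
| vincenzo | VP       | VP-VP-VB        | False     |
| de       | VP       | VP-VP-PP-IN     | True      |
| campo    | VP       | VP-VP-PP-NN     | True      |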

Here, "edward" is interpreted as a noun. Our system therefore picks it up as an entity along with "smith" as they're found in the right context.

"call edith walker and edward smith":

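| Phrase | Root Tag | Tags            | Is Entity |
|--------|----------|-----------------|-----------|
| Call   | VP       | VP-VP-VB        | False     |
| edith  | VP       | VP-VP-S-ADJP-RB | True      |
| walker | VP       | VP-VP-S-ADJP-JJ | True      |
| and    | VP       | VP-CC           | False     |
| edward | VP       | VP-VP-VB        | False     |
| smith  | VP       | VP-VP-JJ        | True      |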

In this instance, the API doesn't recognise that "and" is extending the initial verb, so it seems to be interpreting "edward" as the verb rather than the noun, and only picking up "smith" as the noun. This is understandable as something the API isn't yet prepared to understand. It would be a lot more helpful if the API could make the connection that "call edith walker and edward smith" truly reads as "call edith walker and call edward smith".

"Book a meeting with edward walker and edith smith":

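| Phrase  | Root Tag | Tags              | Is Entity |
|---------|----------|-------------------|-----------|
| Book    | VP       | VP-NN             | False     |
| a       | VP       | VP-NP-NP-NP-DT    | False     |
| meeting | VP       | VP-NP-NP-NP-NN    | False     |
| with    | VP       | VP-NP-NP-PP-IN    | False     |
| edward  | VP       | VP-NP-NP-PP-NP-JJ | True      |
| walker  | VP       | VP-NP-NP-PP-NP-NN | True      |
| and     | VP       | VP-NP-CC          | False     |
| edith   | VP       | VP-NP-NP-JJ       | False     |
| smith   | VP       | VP-NP-NP-NN       | False     |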

Firstly, the API doesn't understand that "book" in this context is a verb, like "call" in the previous example. More importantly for the entity extraction, the API seems to interpret "edward" as an adjective modifying the following noun "walker". That reading would only be grammatical if an indefinite article were also present, as in "Book a meeting with [a] blue Walker", so this seems to be an error.

"Register edwin smith as a new employee":

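| Phrase   | Root Tag | Tags          | Is Entity |
|----------|----------|---------------|-----------|
| Register | ADJP     | ADJP-ADJP-RB  | False     |
| edwin    | ADJP     | ADJP-ADJP-RB  | False     |
| smith    | ADJP     | ADJP-ADJP-JJ  | True      |
| as       | ADJP     | ADJP-PP-IN    | False     |
| a        | ADJP     | ADJP-PP-NP-DT | False     |
| new      | ADJP     | ADJP-PP-NP-JJ | True      |
| employee | ADJP     | ADJP-PP-NP-NN | True      |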

Firstly, the API wrongly reads the verb "register" as an adverb, which would affect how it interprets the following word. Subsequently, we have the name, "edwin", which is also interpreted as an adverb, rather than a noun or proper noun. It's only "smith" which is correctly identified as a noun.

Full Penn Treebank tag list: http://web.mit.edu/6.863/www/PennTreebankTags.html
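For readers who want to reproduce tag strings like the ones in the tables above, here is a minimal sketch that walks a bracketed constituency parse and records, for each word, the chain of labels from the outermost constituent down to the part-of-speech tag. It assumes the input is a Penn Treebank style string such as the constituency analyzer returns, and that the artificial TOP node is skipped; the class and method names are hypothetical, and this is not the reporter's actual code.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class TagPathSketch
{
    // Returns one (word, path) pair per leaf. The path joins every enclosing
    // label except TOP with '-', ending in the part-of-speech tag.
    public static List<KeyValuePair<string, string>> LeafPaths(string tree)
    {
        var labels = new List<string>();   // labels of the currently open brackets
        var leaves = new List<KeyValuePair<string, string>>();
        int i = 0;
        while (i < tree.Length)
        {
            char c = tree[i];
            if (c == '(')
            {
                // Read the constituent label that follows the bracket.
                int start = ++i;
                while (i < tree.Length && !char.IsWhiteSpace(tree[i]) && tree[i] != '(' && tree[i] != ')')
                {
                    i++;
                }
                labels.Add(tree.Substring(start, i - start));
            }
            else if (c == ')')
            {
                if (labels.Count > 0)
                {
                    labels.RemoveAt(labels.Count - 1);
                }
                i++;
            }
            else if (char.IsWhiteSpace(c))
            {
                i++;
            }
            else
            {
                // Anything else is a word (leaf) under the innermost label.
                int start = i;
                while (i < tree.Length && !char.IsWhiteSpace(tree[i]) && tree[i] != ')')
                {
                    i++;
                }
                string word = tree.Substring(start, i - start);
                string path = string.Join("-", labels.Where(l => l != "TOP"));
                leaves.Add(new KeyValuePair<string, string>(word, path));
            }
        }
        return leaves;
    }

    public static void Main()
    {
        string tree = "(TOP (INTJ (UH Welcome) (PP (TO to) (NP (NNP Microsoft) (NNP Linguistic) (NNP Analysis))) (. !)))";
        foreach (var leaf in LeafPaths(tree))
        {
            Console.WriteLine(leaf.Key + "\t" + leaf.Value);
        }
    }
}
```

With paths in this form, a temporary workaround on the client side could match on the enclosing constituents (for example, everything under a VP after the leading verb) rather than on the final part-of-speech tag alone.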

Failed to gather list of analyzers

Hi All,

I receive the following error:

Failed to list supported analyzers: System.Exception: Failed to gather list of analyzers
   at Microsoft.ProjectOxford.Linguistics.Sample.Program.ListAnalyzers() in C:\Users\berat.kaan.celen\Desktop\Social Responsiblity App\Cognitive-LinguisticAnalysis-Windows-master\Sample\Program.cs:line 101
   at Microsoft.ProjectOxford.Linguistics.Sample.Program.Main(String[] args) in C:\Users\berat.kaan.celen\Desktop\Social Responsiblity App\Cognitive-LinguisticAnalysis-Windows-master\Sample\Program.cs:line 56

When I make the API call with Postman, I send:

GET /linguistics/v1.0/analyzers HTTP/1.1
Host: uksouth.api.cognitive.microsoft.com
Content-Type: application/json
ocp-apim-subscription-key: {my token}
Cache-Control: no-cache
Postman-Token: 17630898-bc37-e5e3-b334-5422cc311bd6

and receive:

{
    "error": {
        "code": "404",
        "message": "Resource not found"
    }
}
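To reproduce the same call outside Postman and the sample app, a rough sketch along these lines may help. It builds the GET request shown above with HttpClient and prints the status code, so that different hosts can be compared. The host and path are taken from the request in this issue; the environment variable name LINGUISTICS_KEY and the class and method names are assumptions made for illustration.

```csharp
using System;
using System.Net.Http;

public static class AnalyzersRequestSketch
{
    // Builds the "list analyzers" request for a given host.
    public static HttpRequestMessage BuildListAnalyzersRequest(string host, string subscriptionKey)
    {
        var uri = new Uri("https://" + host + "/linguistics/v1.0/analyzers");
        var request = new HttpRequestMessage(HttpMethod.Get, uri);
        request.Headers.Add("Ocp-Apim-Subscription-Key", subscriptionKey);
        return request;
    }

    public static void Main()
    {
        // LINGUISTICS_KEY is a hypothetical variable name; nothing is sent without a key.
        string key = Environment.GetEnvironmentVariable("LINGUISTICS_KEY");
        if (string.IsNullOrEmpty(key))
        {
            Console.Error.WriteLine("Set LINGUISTICS_KEY before running.");
            return;
        }

        using (var client = new HttpClient())
        using (var request = BuildListAnalyzersRequest("uksouth.api.cognitive.microsoft.com", key))
        {
            // Blocking is acceptable in a console app, which has no synchronization context.
            HttpResponseMessage response = client.SendAsync(request).GetAwaiter().GetResult();
            Console.WriteLine((int)response.StatusCode + " " + response.ReasonPhrase);
            Console.WriteLine(response.Content.ReadAsStringAsync().GetAwaiter().GetResult());
        }
    }
}
```

A 404 from this request would confirm that the problem lies with the endpoint rather than with the client library.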

Noun incorrectly tagged as an adverb

Utterance: "Register edwin smith as a new employee"
Keyword: “Edwin”
Expected result: ADJP-ADJP-NN
Current result: ADJP-ADJP-RB

For more information see: #7

LinguisticsClient json parse error

From @mehmetilker on April 8, 2016 16:2

When I run the LinguisticsClient sample, I get the following exception:
Unexpected character encountered while parsing value: [. Path '[0].result', line 1, position 64.

The service returns what looks like valid JSON, but the client throws when it tries to deserialize it at this line:

return JsonConvert.DeserializeObject<TResponse>(responseContent, settings);
I tried changing the settings, but that did not work either.

[{ "analyzerId" : "4fa79af1-f22c-408d-98bb-b7d7aeef7f04", "result" : [["NNP", "TO", "NNP", "NNP", "NNP", "."]] }, { "analyzerId" : "22a6b758-420f-4745-8a3c-46835a67c0d2", "result" : ["(TOP (INTJ (UH Welcome) (PP (TO to) (NP (NNP Microsoft) (NNP Linguistic) (NNP Analysis))) (. !)))"] }, { "analyzerId" : "08ea174b-bfdb-4e64-987e-602f85da7f72", "result" : [{ "Len" : 41, "Offset" : 0, "Tokens" : [{ "Len" : 7, "NormalizedToken" : "Welcome", "Offset" : 0, "RawToken" : "Welcome" }, { "Len" : 2, "NormalizedToken" : "to", "Offset" : 8, "RawToken" : "to" }, { "Len" : 9, "NormalizedToken" : "Microsoft", "Offset" : 11, "RawToken" : "Microsoft" }, { "Len" : 10, "NormalizedToken" : "Linguistic", "Offset" : 21, "RawToken" : "Linguistic" }, { "Len" : 8, "NormalizedToken" : "Analysis", "Offset" : 32, "RawToken" : "Analysis" }, { "Len" : 1, "NormalizedToken" : "!", "Offset" : 40, "RawToken" : "!" } ] } ] } ]

Copied from original issue: microsoft/ProjectOxford-ClientSDK#47
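In the response above, "result" has a different shape for each analyzer: an array of string arrays, an array of strings, and an array of objects. The error message suggests the deserializer expected a single value at '[0].result' and found an array, although the client's actual contract type is not shown in this issue. One possible way to handle this, sketched below under the assumption that Newtonsoft.Json is referenced, is to keep "result" as a JToken and inspect its shape afterwards. The type and method names are hypothetical, and the JSON is abbreviated from the response above.

```csharp
using System;
using Newtonsoft.Json;
using Newtonsoft.Json.Linq;

// Hypothetical contract type: "result" stays untyped so any shape deserializes.
public class RawAnalyzeResult
{
    [JsonProperty("analyzerId")]
    public Guid AnalyzerId { get; set; }

    [JsonProperty("result")]
    public JToken Result { get; set; }
}

public static class ResultShapeSketch
{
    // Abbreviated version of the response quoted in the issue.
    public const string SampleJson = @"[
      { ""analyzerId"": ""4fa79af1-f22c-408d-98bb-b7d7aeef7f04"", ""result"": [[""NNP"", ""TO"", ""NNP""]] },
      { ""analyzerId"": ""22a6b758-420f-4745-8a3c-46835a67c0d2"", ""result"": [""(TOP (INTJ (UH Welcome)))""] },
      { ""analyzerId"": ""08ea174b-bfdb-4e64-987e-602f85da7f72"", ""result"": [{ ""Len"": 7, ""Offset"": 0 }] }
    ]";

    // Classifies a result by the JSON type of its first element.
    public static string DescribeShape(JToken result)
    {
        var array = result as JArray;
        if (array == null || array.Count == 0)
        {
            return "empty or not an array";
        }
        switch (array[0].Type)
        {
            case JTokenType.Array: return "array of string arrays";
            case JTokenType.String: return "array of strings";
            case JTokenType.Object: return "array of objects";
            default: return "other";
        }
    }

    public static void Main()
    {
        var items = JsonConvert.DeserializeObject<RawAnalyzeResult[]>(SampleJson);
        foreach (var item in items)
        {
            Console.WriteLine(item.AnalyzerId + ": " + DescribeShape(item.Result));
        }
    }
}
```

Once the shape is known, each result can be converted with JToken.ToObject<T>() into whatever typed form the caller needs.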

Request just hangs

The request to get the analyzer list just hangs, specifically at this line:
return Client.ListAnalyzersAsync().Result;
This ultimately calls an HTTP client request.

My code is taken directly from the sample app.
Here is the default URL in the LinguisticsClient library:
private const string DefaultServiceHost = "https://api.projectoxford.ai/linguistics/v1.0";

Here is the class that I created:

public static class LinguisticAnalyzer
{
    private static readonly LinguisticsClient Client = new LinguisticsClient("Removed_id");

    public static string Parse(string line)
    {
        // List analyzers
        Analyzer[] supportedAnalyzers = null;
        try
        {
            supportedAnalyzers = ListAnalyzers();
            var analyzersAsJson = JsonConvert.SerializeObject(supportedAnalyzers, Formatting.Indented, jsonSerializerSettings);
            Console.WriteLine("Supported analyzers: " + analyzersAsJson);
        }
        catch (Exception e)
        {
            Console.Error.WriteLine("Failed to list supported analyzers: " + e.ToString());
            Environment.Exit(1);
        }

        // Analyze text with all available analyzers
        var analyzeTextRequest = new AnalyzeTextRequest()
        {
            Language = "en",
            AnalyzerIds = supportedAnalyzers.Select(analyzer => analyzer.Id).ToArray(),
            Text = line //"Welcome to Microsoft Linguistic Analysis!"
        };

        try
        {
            var analyzeTextResults = AnalyzeText(analyzeTextRequest);
            var resultsAsJson = JsonConvert.SerializeObject(analyzeTextResults, Formatting.Indented, jsonSerializerSettings);
            Console.WriteLine("Analyze text results: " + resultsAsJson);

            return resultsAsJson;
        }
        catch (Exception e)
        {
            // Message corrected: this branch handles text analysis, not analyzer listing.
            Console.Error.WriteLine("Failed to analyze text: " + e.ToString());
            Environment.Exit(1);
        }

        return "";
    }

    /// <summary>
    /// Default jsonserializer settings
    /// </summary>
    private static JsonSerializerSettings jsonSerializerSettings = new JsonSerializerSettings()
    {
        DateFormatHandling = DateFormatHandling.IsoDateFormat,
        NullValueHandling = NullValueHandling.Ignore,
        ContractResolver = new CamelCasePropertyNamesContractResolver()
    };

    /// <summary>
    /// List analyzers synchronously.
    /// </summary>
    /// <returns>An array of supported analyzers.</returns>
    private static Analyzer[] ListAnalyzers()
    {
        try
        {
            return Client.ListAnalyzersAsync().Result;
        }
        catch (Exception exception)
        {
            throw new Exception("Failed to gather list of analyzers", exception as ClientException);
        }
    }

    /// <summary>
    /// Analyze text synchronously.
    /// </summary>
    /// <param name="request">Analyze text request.</param>
    /// <returns>An array of analyze text result.</returns>
    private static AnalyzeTextResult[] AnalyzeText(AnalyzeTextRequest request)
    {
        try
        {
            return Client.AnalyzeTextAsync(request).Result;
        }
        catch (Exception exception)
        {
            throw new Exception("Failed to analyze text", exception as ClientException);
        }
    }
}
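Blocking on .Result can deadlock when the calling thread has a synchronization context (for example in a UI or classic ASP.NET application), and it also gives no signal when the server simply never answers. Whether either applies to this report is not known. As a diagnostic, the sketch below wraps an asynchronous call in a timeout so that a silent hang becomes an exception. FakeListAnalyzersAsync is a hypothetical stand-in for Client.ListAnalyzersAsync(), and the other names are likewise illustrative.

```csharp
using System;
using System.Threading.Tasks;

public static class TimeoutSketch
{
    // Completes with the task's result, or throws if the timeout elapses first.
    public static async Task<T> WithTimeout<T>(Task<T> work, TimeSpan timeout)
    {
        Task first = await Task.WhenAny(work, Task.Delay(timeout)).ConfigureAwait(false);
        if (first != work)
        {
            throw new TimeoutException("No response within " + timeout.TotalMilliseconds + " ms");
        }
        return await work.ConfigureAwait(false);
    }

    // Hypothetical stand-in for Client.ListAnalyzersAsync(); it only simulates latency.
    public static async Task<string[]> FakeListAnalyzersAsync(TimeSpan latency)
    {
        await Task.Delay(latency).ConfigureAwait(false);
        return new[] { "analyzer-a", "analyzer-b" };
    }

    public static void Main()
    {
        // A console app has no synchronization context, so blocking here cannot deadlock.
        string[] names = WithTimeout(
            FakeListAnalyzersAsync(TimeSpan.FromMilliseconds(10)),
            TimeSpan.FromSeconds(2)).GetAwaiter().GetResult();
        Console.WriteLine("Received " + names.Length + " analyzers");

        try
        {
            WithTimeout(
                FakeListAnalyzersAsync(TimeSpan.FromSeconds(5)),
                TimeSpan.FromMilliseconds(50)).GetAwaiter().GetResult();
        }
        catch (TimeoutException e)
        {
            Console.WriteLine("Timed out: " + e.Message);
        }
    }
}
```

In code that does run under a synchronization context, awaiting the call all the way up the call stack avoids the blocking wait altogether; the timeout only helps to tell a deadlock apart from an unresponsive endpoint.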

Noun incorrectly tagged as an adjective

Utterance: "Book a meeting with edward walker and edith smith"
Keyword: “Edward”
Expected result: VP-NP-NP-PP-NP-NN
Current result: VP-NP-NP-PP-NP-JJ

For more information see: #7

Some errors when running the Linguistic.Sample.sln

From @qindaying on April 12, 2016 9:15

Hi,
I ran into difficulty when I tried to run the sample in LinguisticAnalysis by following 'Getting started'. I pasted my subscription key into Program.cs (replacing "Your subscription key") and built the solution successfully, but I could not run the example successfully. The errors that appeared are as follows:

  1. The error that popped up when I opened the solution
    TF400324: Team Foundation services are not available from server http://vstfsng01:8080/tfs/cig.
    Technical information (for administrator):
    The remote name could not be resolved: 'vstfsng01'
    The active solution has been temporarily disconnected from source control because the server is unavailable. To attempt to reconnect to source control, close and then re-open the solution when the server is available. If you want to connect this solution to another server, use the Change Source Control dialog.
    The mappings for the solution could not be found.
    The mappings for the solution could not be found.
    The active solution has been temporarily disconnected from source control because the server is unavailable. To attempt to reconnect to source control, close and then re-open the solution when the server is available. If you want to connect this solution to another server, use the Change Source Control dialog.
  2. The errors after I pressed F5
    'Microsoft.ProjectOxford.Linguistics.Sample.vshost.exe' (CLR v4.0.30319: Microsoft.ProjectOxford.Linguistics.Sample.vshost.exe): Loaded 'C:\Windows\Microsoft.Net\assembly\GAC_MSIL\System.Runtime.Serialization\v4.0_4.0.0.0__b77a5c561934e089\System.Runtime.Serialization.dll'. Skipped loading symbols. Module is optimized and the debugger option 'Just My Code' is enabled.
    'Microsoft.ProjectOxford.Linguistics.Sample.vshost.exe' (CLR v4.0.30319: Microsoft.ProjectOxford.Linguistics.Sample.vshost.exe): Loaded 'C:\Windows\Microsoft.Net\assembly\GAC_MSIL\System.Xml.Linq\v4.0_4.0.0.0__b77a5c561934e089\System.Xml.Linq.dll'. Skipped loading symbols. Module is optimized and the debugger option 'Just My Code' is enabled.
    'Microsoft.ProjectOxford.Linguistics.Sample.vshost.exe' (CLR v4.0.30319: Microsoft.ProjectOxford.Linguistics.Sample.vshost.exe): Loaded 'C:\Windows\Microsoft.Net\assembly\GAC_32\System.Data\v4.0_4.0.0.0__b77a5c561934e089\System.Data.dll'. Skipped loading symbols. Module is optimized and the debugger option 'Just My Code' is enabled.
    'Microsoft.ProjectOxford.Linguistics.Sample.vshost.exe' (CLR v4.0.30319: Microsoft.ProjectOxford.Linguistics.Sample.vshost.exe): Loaded 'C:\Windows\Microsoft.Net\assembly\GAC_MSIL\System.Diagnostics.Debug\v4.0_4.0.0.0__b03f5f7f11d50a3a\System.Diagnostics.Debug.dll'. Skipped loading symbols. Module is optimized and the debugger option 'Just My Code' is enabled.
    Exception thrown: 'Newtonsoft.Json.JsonReaderException' in Newtonsoft.Json.dll
    Exception thrown: 'Newtonsoft.Json.JsonReaderException' in mscorlib.dll
    Exception thrown: 'System.AggregateException' in mscorlib.dll
    Exception thrown: 'System.Exception' in Microsoft.ProjectOxford.Linguistics.Sample.exe

Thank you very much! :)

Copied from original issue: microsoft/ProjectOxford-ClientSDK#49

Prebuilt Domains and Entities need clarity

LUIS now has a number of prebuilt entities and domains. All LUIS says about them is:

Prebuilt domain entities are parts of prebuilt domains. We recommend using entities from the same domain when possible since they work better together.

I want an entity for a person's name that I can use across intents for multiple uses. LUIS now gives Communication.ContactName, and also gives Entertainment.Person. They both seem to have very similar functions, but they're in different domains, and it's not clear how their functionality truly differs.
The same goes for Communication.Location and Events.Address.

  1. Why are the entities split into different domains?
  2. Does it truly matter if I just pick one of these entities and use it across domains for multiple uses in different intents?
