Comments (18)

rodrigopivi avatar rodrigopivi commented on May 31, 2024

you can use the npm package with --formatOptions=somejson.json, a json file with the template; there you can define the aliases. let me know if this works for you and i can close the ticket

from chatito.

alexmojaki avatar alexmojaki commented on May 31, 2024

I think there may be some misunderstanding.

Using this example:

%[findByCityAndCategory]
    ~[nearby] @[city]

~[nearby]
    close to
    in the area of
    within
    located at
    nearby

~[newYork]
    new york ~[city?]
    ny ~[city?]

~[sanFrancisco]
    san francisco
    san francisco city

~[atlanta]
    atlanta
    atlanta city

~[city]
    city

@[city]
    ~[newYork]
    ~[sanFrancisco]
    ~[atlanta]

The output from chatito@1.2.2 starts as follows:

{
    "rasa_nlu_data": {
        "regex_features": [],
        "entity_synonyms": [
            {
                "value": "newYork",
                "synonyms": [
                    "new york city",
                    "new york",
                    "ny city",
                    "ny"
                ]
            },
            {
                "value": "sanFrancisco",
                "synonyms": [
                    "san francisco",
                    "san francisco city"
                ]
            },
            {
                "value": "atlanta",
                "synonyms": [
                    "atlanta city"
                ]
            }
        ],
        "common_examples": [
            {
                "text": "located at ny city",
                "intent": "findByCityAndCategory",
                "entities": [
                    {
                        "start": 11,
                        "end": 18,
                        "value": "newYork",
                        "entity": "city"
                    }
                ]
            },
            {
                "text": "located at ny",
                "intent": "findByCityAndCategory",
                "entities": [
                    {
                        "start": 11,
                        "end": 13,
                        "value": "newYork",
                        "entity": "city"
                    }
                ]
            },

Whereas now (2.0.0) it starts like this:

{
 "rasa_nlu_data": {
  "regex_features": [],
  "entity_synonyms": [],
  "common_examples": [
   {
    "text": "nearby new york city",
    "intent": "findByCityAndCategory",
    "entities": [
     {
      "end": 20,
      "entity": "city",
      "start": 7,
      "value": "new york city"
     }
    ]
   },
   {
    "text": "close to ny city",
    "intent": "findByCityAndCategory",
    "entities": [
     {
      "end": 16,
      "entity": "city",
      "start": 9,
      "value": "ny city"
     }
    ]
   },

In particular it has "value": "new york city" and "value": "ny city" separately and no synonyms connecting them.

My understanding is that in #7 this problem was solved by making the values consistent, and later on synonyms were used instead, and somehow the behaviour has gone back to pre-#7 behaviour.

Personally I'm going to use version 1.2.2 until this is fixed.
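
Until the generator produces consistent values again, the 2.0.0 output could be post-processed. The following is a minimal sketch, not part of Chatito: `SURFACE_TO_ALIAS` and `normalize_entity_values` are hypothetical names, and the table is copied by hand from the DSL example above.

```python
import copy

# Hypothetical lookup, hand-copied from the DSL above: generated text -> alias name.
SURFACE_TO_ALIAS = {
    "new york city": "newYork",
    "new york": "newYork",
    "ny city": "newYork",
    "ny": "newYork",
    "san francisco": "sanFrancisco",
    "san francisco city": "sanFrancisco",
    "atlanta": "atlanta",
    "atlanta city": "atlanta",
}


def normalize_entity_values(dataset, surface_to_alias):
    """Return a copy of a Rasa-format dataset with entity values set to alias names."""
    result = copy.deepcopy(dataset)
    for example in result["rasa_nlu_data"]["common_examples"]:
        for entity in example["entities"]:
            # The entity span is the slice of the text between start and end.
            span = example["text"][entity["start"]:entity["end"]]
            # Unknown spans keep whatever value the generator produced.
            entity["value"] = surface_to_alias.get(span, entity["value"])
    return result
```

Applied to the 2.0.0 output above, both "new york city" and "ny city" would end up with the value "newYork", as in the 1.2.2 output.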

alexmojaki avatar alexmojaki commented on May 31, 2024

Also you mention --formatOptions=somejson.json, but I see no information anywhere about what is supposed to go in somejson.json. Is it documented?

rodrigopivi avatar rodrigopivi commented on May 31, 2024

hi @alexmojaki

v2 is a rewrite of the library, because v1 had issues with larger dataset generation that are now fixed.

you can pass a template of the output as a json file. if you are building a rasa dataset, you can pass something like this for your example:

format.json

{
    "rasa_nlu_data": {
        "regex_features": [],
        "entity_synonyms": [
            {
                "value": "newYork",
                "synonyms": [
                    "new york city",
                    "new york",
                    "ny city",
                    "ny"
                ]
            },
            {
                "value": "sanFrancisco",
                "synonyms": [
                    "san francisco",
                    "san francisco city"
                ]
            },
            {
                "value": "atlanta",
                "synonyms": [
                    "atlanta city"
                ]
            }
        ]
    }
}

then pass --formatOptions=format.json using the npm tool and it will contain those defaults.

That auto synonyms generation was removed because v2 is a rewrite and more generic, and that functionality is rasa specific, but i will add it back soon. for now you can just use the solution provided.
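
Writing the template by hand gets tedious as the number of aliases grows, so it could be generated instead. This is only a sketch under assumptions: `ALIASES`, `build_format_template` and `write_format_file` are hypothetical names, and the alias table has to be maintained by hand alongside the DSL file.

```python
import json
from pathlib import Path

# Hypothetical table kept next to the .chatito file: alias name -> synonyms.
ALIASES = {
    "newYork": ["new york city", "new york", "ny city", "ny"],
    "sanFrancisco": ["san francisco", "san francisco city"],
    "atlanta": ["atlanta city"],
}


def build_format_template(aliases):
    """Build the same structure as the format.json shown above."""
    return {
        "rasa_nlu_data": {
            "regex_features": [],
            "entity_synonyms": [
                {"value": name, "synonyms": list(synonyms)}
                for name, synonyms in aliases.items()
            ],
        }
    }


def write_format_file(path, aliases):
    """Serialize the template to `path` and return the path as a string."""
    text = json.dumps(build_format_template(aliases), indent=4)
    Path(path).write_text(text + "\n", encoding="utf-8")
    return str(path)
```

Calling `write_format_file("format.json", ALIASES)` would produce a file that can then be passed with --formatOptions=format.json as described above.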

YuukanOO avatar YuukanOO commented on May 31, 2024

This feature is clearly a must have!

anaportela avatar anaportela commented on May 31, 2024

Hi! How can I use the older version of chatito?

alexmojaki avatar alexmojaki commented on May 31, 2024

@anaportela uninstall your current version and then npm install 'chatito@1.2.2'

anaportela avatar anaportela commented on May 31, 2024

Thank you. Also, in the new version is it possible to generate testing data?

UtpalDas6 avatar UtpalDas6 commented on May 31, 2024

npx chatito trainClimateBot.chatito --format=rasa --formatOptions=format.json
SyntaxError: Unexpected end of JSON input
    at JSON.parse (<anonymous>)
    at Object.<anonymous> (/home/unix_roxx/ubot/node_modules/chatito/dist/bin.js:71:38)
    at Generator.next (<anonymous>)
    at /home/unix_roxx/ubot/node_modules/chatito/dist/bin.js:8:71
    at new Promise (<anonymous>)
    at __awaiter (/home/unix_roxx/ubot/node_modules/chatito/dist/bin.js:4:12)
    at __dirname (/home/unix_roxx/ubot/node_modules/chatito/dist/bin.js:42:8)
    at Object.<anonymous> (/home/unix_roxx/ubot/node_modules/chatito/dist/bin.js:95:4)
    at Module._compile (module.js:643:30)
    at Object.Module._extensions..js (module.js:654:10)
FULL ERROR REPORT:
SyntaxError: Unexpected end of JSON input
    at JSON.parse (<anonymous>)
    at Object.<anonymous> (/home/unix_roxx/ubot/node_modules/chatito/dist/bin.js:71:38)
    at Generator.next (<anonymous>)
    at /home/unix_roxx/ubot/node_modules/chatito/dist/bin.js:8:71
    at new Promise (<anonymous>)
    at __awaiter (/home/unix_roxx/ubot/node_modules/chatito/dist/bin.js:4:12)
    at __dirname (/home/unix_roxx/ubot/node_modules/chatito/dist/bin.js:42:8)
    at Object.<anonymous> (/home/unix_roxx/ubot/node_modules/chatito/dist/bin.js:95:4)
    at Module._compile (module.js:643:30)
    at Object.Module._extensions..js (module.js:654:10)

Any help here is appreciated! How do I generate a rasa format json? Please help!! :(
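
"Unexpected end of JSON input" is the message JSON.parse gives for an empty or truncated string, so one plausible cause here is that format.json is empty or cut off. A quick way to check the file before passing it to the CLI is sketched below; `check_format_options` is a hypothetical helper, not part of Chatito, and the requirement that the top level be an object is an assumption based on the format.json shown earlier in the thread.

```python
import json


def check_format_options(text):
    """Return (ok, message) for the raw contents of a --formatOptions file."""
    if not text.strip():
        return False, "file is empty"
    try:
        parsed = json.loads(text)
    except json.JSONDecodeError as err:
        return False, f"invalid JSON at line {err.lineno}, column {err.colno}: {err.msg}"
    # Assumption: the template is a JSON object, like the format.json shown earlier.
    if not isinstance(parsed, dict):
        return False, "top level is not a JSON object"
    return True, "ok"


# Usage (reads the file named in the command above):
# from pathlib import Path
# print(check_format_options(Path("format.json").read_text(encoding="utf-8")))
```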

rodrigopivi avatar rodrigopivi commented on May 31, 2024

just released version 2.1.0 with support for synonyms for rasa and snips, and also support for generating training and testing data on demand as part of the DSL spec @anaportela @YuukanOO @alexmojaki @manaschaturvedi

@UtpalDas6 the format options file is the default template used by the rasa/snips/etc adapter; as you can see, the online IDE now allows you to edit it. if you still have any problems please feel free to open a new ticket and paste your example.

Thanks to all for your feedback, closing the ticket now

anaportela avatar anaportela commented on May 31, 2024

@rodrigopivi I tried to use the training data generation with ('training': '2', 'testing': '1') like the spec says and I get this error:

==== CHATITO SYNTAX ERROR ====
Expected integer but "'" found.
Line: 34, Column: 14

anaportela avatar anaportela commented on May 31, 2024

@rodrigopivi and the rasa synonyms are also not getting generated on the new version

rodrigopivi avatar rodrigopivi commented on May 31, 2024

hi @anaportela just tested from the web IDE and the npm cli package, both working fine.. can you please post your failing example to test?

Tested this code with the rasa adapter:

%[findByCityAndCategory]('training': '2', 'testing': '1')
    ~[nearby] @[city]

~[nearby]
    close to
    in the area of
    within
    located at
    nearby

~[newYork]
    new york ~[city?]
    ny ~[city?]

~[sanFrancisco]
    san francisco
    san francisco city

~[atlanta]
    atlanta
    atlanta city

~[city]
    city

@[city]
    ~[newYork]
    ~[sanFrancisco]
    ~[atlanta]

And it generated this training dataset correctly for me:

{
 "rasa_nlu_data": {
  "regex_features": [],
  "entity_synonyms": [
   {
    "synonyms": [
     "ny city"
    ],
    "value": "newYork"
   },
   {
    "synonyms": [
     "san francisco city"
    ],
    "value": "sanFrancisco"
   }
  ],
  "common_examples": [
   {
    "text": "nearby ny city",
    "intent": "findByCityAndCategory",
    "entities": [
     {
      "end": 14,
      "entity": "city",
      "start": 7,
      "value": "ny city"
     }
    ]
   },
   {
    "text": "within san francisco city",
    "intent": "findByCityAndCategory",
    "entities": [
     {
      "end": 25,
      "entity": "city",
      "start": 7,
      "value": "san francisco city"
     }
    ]
   }
  ]
 }
}

Not sure what your issue is if you don't paste your example code

anaportela avatar anaportela commented on May 31, 2024

The error is in: %[ask_query]('training': '35000', 'testing': '5000').
And in the example you just posted, the synonyms for rasa are not generated: the values are still the same as the text, not the name of the entity they are a synonym of.

rodrigopivi avatar rodrigopivi commented on May 31, 2024

@anaportela AFAIK the rasa spec works the same if i just pass the entity_synonyms value. So i think we are good

anaportela avatar anaportela commented on May 31, 2024

Rasa Docs state:

Alternatively, you can add an “entity_synonyms” array to define several synonyms to one entity value. Here is an example of that:
{
    "rasa_nlu_data": {
        "entity_synonyms": [
            {
                "value": "New York City",
                "synonyms": ["NYC", "nyc", "the big apple"]
            }
        ]
    }
}
Note: Please note that adding synonyms using the above format does not improve the model’s classification of those entities. Entities must be properly classified before they can be replaced with the synonym value.

So I wouldn't say it's the same.

But still, I can't use the new code format: ('training': '35000', 'testing': '5000'). It says that it should be an integer and not a " ' ", like the old version.

anaportela avatar anaportela commented on May 31, 2024

Ok, now it's working. It was some updating issue.
But the synonyms, I still wouldn't say it's the same.

rodrigopivi avatar rodrigopivi commented on May 31, 2024

@anaportela i think in a practical sense it is the same. have you tried training a rasa nlu pipeline with that dataset? i think it should just work fine, or did you find issues? AFAIK the named entity recognition model just tags a word from the input, and the synonyms are just a post-processing step that replaces the tagged words with the value they are a synonym of. Please let me know how it goes after training the nlu model
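
The post-processing step described in this comment can be sketched roughly as follows. This is a simplified illustration, not Rasa's actual implementation: `build_synonym_lookup` and `replace_synonyms` are hypothetical names, and the case-insensitive match is an assumption.

```python
def build_synonym_lookup(entity_synonyms):
    """Flatten an "entity_synonyms" array into a synonym -> value dictionary."""
    lookup = {}
    for entry in entity_synonyms:
        for synonym in entry["synonyms"]:
            # Assumption: synonyms are matched case-insensitively.
            lookup[synonym.lower()] = entry["value"]
    return lookup


def replace_synonyms(entities, lookup):
    """Return new entity dicts whose values are replaced when a synonym matches."""
    return [
        {**entity, "value": lookup.get(str(entity["value"]).lower(), entity["value"])}
        for entity in entities
    ]


# The entity_synonyms section of the dataset generated above.
lookup = build_synonym_lookup([
    {"synonyms": ["ny city"], "value": "newYork"},
    {"synonyms": ["san francisco city"], "value": "sanFrancisco"},
])
tagged = [{"entity": "city", "value": "ny city"}, {"entity": "city", "value": "ny"}]
print(replace_synonyms(tagged, lookup))
```

In this sketch "ny city" is mapped to "newYork", while "ny" is left unchanged because it is not listed as a synonym. The replacement only happens after the entity has been tagged, which is the point made in the Rasa docs note quoted earlier.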
