Comments (18)
You can use the npm package with --formatOptions=somejson.json, pointing at a JSON file that contains the template; there you can define the aliases. Let me know if this works for you and I can close the ticket.
from chatito.
I think there may be some misunderstanding.
Using this example:
%[findByCityAndCategory]
    ~[nearby] @[city]

~[nearby]
    close to
    in the area of
    within
    located at
    nearby

~[newYork]
    new york ~[city?]
    ny ~[city?]

~[sanFrancisco]
    san francisco
    san francisco city

~[atlanta]
    atlanta
    atlanta city

~[city]
    city

@[city]
    ~[newYork]
    ~[sanFrancisco]
    ~[atlanta]
The output from chatito@1.2.2 starts as follows:
{
  "rasa_nlu_data": {
    "regex_features": [],
    "entity_synonyms": [
      {
        "value": "newYork",
        "synonyms": [
          "new york city",
          "new york",
          "ny city",
          "ny"
        ]
      },
      {
        "value": "sanFrancisco",
        "synonyms": [
          "san francisco",
          "san francisco city"
        ]
      },
      {
        "value": "atlanta",
        "synonyms": [
          "atlanta city"
        ]
      }
    ],
    "common_examples": [
      {
        "text": "located at ny city",
        "intent": "findByCityAndCategory",
        "entities": [
          {
            "start": 11,
            "end": 18,
            "value": "newYork",
            "entity": "city"
          }
        ]
      },
      {
        "text": "located at ny",
        "intent": "findByCityAndCategory",
        "entities": [
          {
            "start": 11,
            "end": 13,
            "value": "newYork",
            "entity": "city"
          }
        ]
      },
Whereas now (2.0.0) it starts like this:
{
  "rasa_nlu_data": {
    "regex_features": [],
    "entity_synonyms": [],
    "common_examples": [
      {
        "text": "nearby new york city",
        "intent": "findByCityAndCategory",
        "entities": [
          {
            "end": 20,
            "entity": "city",
            "start": 7,
            "value": "new york city"
          }
        ]
      },
      {
        "text": "close to ny city",
        "intent": "findByCityAndCategory",
        "entities": [
          {
            "end": 16,
            "entity": "city",
            "start": 9,
            "value": "ny city"
          }
        ]
      },
In particular it has "value": "new york city" and "value": "ny city" separately, with no synonyms connecting them.
My understanding is that in #7 this problem was solved by making the values consistent, and later on synonyms were used instead, and somehow the behaviour has gone back to pre-#7 behaviour.
Personally I'm going to use version 1.2.2 until this is fixed.
from chatito.
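As a stopgap for output like the 2.0.0 sample above, the per-example values can be mapped back to one canonical value per city after generation. The following is a minimal Python sketch, not part of chatito: the ALIASES table and the canonicalize helper are hypothetical names, and the alias lists are copied by hand from the example in this thread.

```python
import copy

# Hypothetical alias table: canonical value -> surface forms (copied by hand).
ALIASES = {
    "newYork": ["new york city", "new york", "ny city", "ny"],
    "sanFrancisco": ["san francisco", "san francisco city"],
    "atlanta": ["atlanta", "atlanta city"],
}


def canonicalize(dataset, aliases):
    """Return a copy of a Rasa-style dataset that uses canonical entity values."""
    lookup = {text: value for value, texts in aliases.items() for text in texts}
    data = copy.deepcopy(dataset)  # leave the generated dataset untouched
    body = data["rasa_nlu_data"]
    for example in body["common_examples"]:
        for entity in example["entities"]:
            # Unknown surface forms keep their original value.
            entity["value"] = lookup.get(entity["value"], entity["value"])
    # A surface form equal to its canonical value is not listed as a synonym.
    body["entity_synonyms"] = [
        {"value": value, "synonyms": [t for t in texts if t != value]}
        for value, texts in aliases.items()
    ]
    return data


# Two examples shaped like the 2.0.0 output quoted above.
generated = {
    "rasa_nlu_data": {
        "regex_features": [],
        "entity_synonyms": [],
        "common_examples": [
            {
                "text": "nearby new york city",
                "intent": "findByCityAndCategory",
                "entities": [
                    {"end": 20, "entity": "city", "start": 7, "value": "new york city"}
                ],
            },
            {
                "text": "close to ny city",
                "intent": "findByCityAndCategory",
                "entities": [
                    {"end": 16, "entity": "city", "start": 9, "value": "ny city"}
                ],
            },
        ],
    }
}

fixed = canonicalize(generated, ALIASES)
```

After this step both examples carry "value": "newYork", while the start and end offsets still point at the original text.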
Also, you mention --formatOptions=somejson.json, but I see no information anywhere about what is supposed to go in somejson.json. Is it documented?
from chatito.
Hi @alexmojaki,
v2 is a rewrite of the library, because the previous version had issues generating larger datasets; those are now fixed.
You can pass a template of the output as a JSON file. If you are building a Rasa dataset, you can pass something like this for your example:
format.json
{
  "rasa_nlu_data": {
    "regex_features": [],
    "entity_synonyms": [
      {
        "value": "newYork",
        "synonyms": [
          "new york city",
          "new york",
          "ny city",
          "ny"
        ]
      },
      {
        "value": "sanFrancisco",
        "synonyms": [
          "san francisco",
          "san francisco city"
        ]
      },
      {
        "value": "atlanta",
        "synonyms": [
          "atlanta city"
        ]
      }
    ]
  }
}
Then pass --formatOptions=format.json using the npm tool and the output will contain those defaults.
The automatic synonym generation was removed because v2 is a rewrite and more generic, and that functionality is Rasa-specific. I will add it back soon; for now you can use the solution above.
from chatito.
This feature is clearly a must have!
from chatito.
Hi! How can I use the older version of chatito?
from chatito.
@anaportela uninstall your current version and then install the older one by pinning it, e.g. npm install 'chatito@1.2.2' (the version mentioned earlier in this thread)
from chatito.
Thank you. Also, in the new version is it possible to generate testing data?
from chatito.
npx chatito trainClimateBot.chatito --format=rasa --formatOptions=format.json
SyntaxError: Unexpected end of JSON input
    at JSON.parse (<anonymous>)
    at Object.<anonymous> (/home/unix_roxx/ubot/node_modules/chatito/dist/bin.js:71:38)
    at Generator.next (<anonymous>)
    at /home/unix_roxx/ubot/node_modules/chatito/dist/bin.js:8:71
    at new Promise (<anonymous>)
    at __awaiter (/home/unix_roxx/ubot/node_modules/chatito/dist/bin.js:4:12)
    at __dirname (/home/unix_roxx/ubot/node_modules/chatito/dist/bin.js:42:8)
    at Object.<anonymous> (/home/unix_roxx/ubot/node_modules/chatito/dist/bin.js:95:4)
    at Module._compile (module.js:643:30)
    at Object.Module._extensions..js (module.js:654:10)
FULL ERROR REPORT:
SyntaxError: Unexpected end of JSON input
    at JSON.parse (<anonymous>)
    at Object.<anonymous> (/home/unix_roxx/ubot/node_modules/chatito/dist/bin.js:71:38)
    at Generator.next (<anonymous>)
    at /home/unix_roxx/ubot/node_modules/chatito/dist/bin.js:8:71
    at new Promise (<anonymous>)
    at __awaiter (/home/unix_roxx/ubot/node_modules/chatito/dist/bin.js:4:12)
    at __dirname (/home/unix_roxx/ubot/node_modules/chatito/dist/bin.js:42:8)
    at Object.<anonymous> (/home/unix_roxx/ubot/node_modules/chatito/dist/bin.js:95:4)
    at Module._compile (module.js:643:30)
    at Object.Module._extensions..js (module.js:654:10)
Any help here is appreciated! How do I generate a rasa format json? Please help!! :(
from chatito.
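"Unexpected end of JSON input" is the message JSON.parse produces when its input is empty or cut off before the document is complete. Since the trace above goes through JSON.parse, one plausible cause (an assumption, not confirmed in this thread) is that the file passed to --formatOptions is empty or truncated. A minimal Python sketch for checking the file before running the CLI; check_json_text and check_json_file are hypothetical helper names:

```python
import json


def check_json_text(text):
    """Return None if text is valid JSON, otherwise a short problem description."""
    if not text.strip():
        return "file is empty"
    try:
        json.loads(text)
    except json.JSONDecodeError as err:
        return f"invalid JSON at line {err.lineno}, column {err.colno}: {err.msg}"
    return None


def check_json_file(path):
    """Read a file such as format.json and report the first JSON problem, if any."""
    with open(path, encoding="utf-8") as handle:
        return check_json_text(handle.read())
```

Calling check_json_file("format.json") returns None when the file parses, so any other result is worth fixing before retrying the npx command.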
Just released version 2.1.0 with support for synonyms for Rasa and Snips, and also support for generating training and testing data on demand as part of the DSL spec. @anaportela @YuukanOO @alexmojaki @manaschaturvedi
@UtpalDas6 the format is the default template used by the rasa/snips/etc. adapter. As you can see, the online IDE now lets you edit it. If you still have any problems, please feel free to open a new ticket and paste your example.
Thanks to all for your feedback. Closing the ticket now.
from chatito.
@rodrigopivi I tried to use the training data generation with ('training': '2', 'testing': '1') like the spec says, and I get this error:
==== CHATITO SYNTAX ERROR ====
Expected integer but "'" found.
Line: 34, Column: 14
from chatito.
@rodrigopivi and the rasa synonyms are also not getting generated on the new version
from chatito.
Hi @anaportela, I just tested from the web IDE and the npm CLI package, and both work fine. Can you please post your failing example so I can test it?
I tested this code with the rasa adapter:
%[findByCityAndCategory]('training': '2', 'testing': '1')
    ~[nearby] @[city]

~[nearby]
    close to
    in the area of
    within
    located at
    nearby

~[newYork]
    new york ~[city?]
    ny ~[city?]

~[sanFrancisco]
    san francisco
    san francisco city

~[atlanta]
    atlanta
    atlanta city

~[city]
    city

@[city]
    ~[newYork]
    ~[sanFrancisco]
    ~[atlanta]
And it generated this training dataset correctly for me:
{
  "rasa_nlu_data": {
    "regex_features": [],
    "entity_synonyms": [
      {
        "synonyms": [
          "ny city"
        ],
        "value": "newYork"
      },
      {
        "synonyms": [
          "san francisco city"
        ],
        "value": "sanFrancisco"
      }
    ],
    "common_examples": [
      {
        "text": "nearby ny city",
        "intent": "findByCityAndCategory",
        "entities": [
          {
            "end": 14,
            "entity": "city",
            "start": 7,
            "value": "ny city"
          }
        ]
      },
      {
        "text": "within san francisco city",
        "intent": "findByCityAndCategory",
        "entities": [
          {
            "end": 25,
            "entity": "city",
            "start": 7,
            "value": "san francisco city"
          }
        ]
      }
    ]
  }
}
I can't tell what your issue is unless you paste your example code.
from chatito.
The error is in: %[ask_query]('training': '35000', 'testing': '5000').
And in the example you just posted, the synonyms for Rasa are not generated: the values are still identical to the text, rather than being the name of the entity that the text is a synonym of.
from chatito.
@anaportela AFAIK the Rasa spec works the same if I just pass the entity_synonyms value, so I think we are good.
from chatito.
Rasa Docs state:
Alternatively, you can add an “entity_synonyms” array to define several synonyms to one entity value. Here is an example of that:
{
  "rasa_nlu_data": {
    "entity_synonyms": [
      {
        "value": "New York City",
        "synonyms": ["NYC", "nyc", "the big apple"]
      }
    ]
  }
}
Note: Please note that adding synonyms using the above format does not improve the model’s classification of those entities. Entities must be properly classified before they can be replaced with the synonym value.
So I wouldn't say it's the same.
But still, I can't use the new code format: ('training': '35000', 'testing': '5000'). It says that it should be an integer and not a " ' ", like the old version.
from chatito.
Ok, now it's working. It was some updating issue.
But the synonyms, I still wouldn't say it's the same.
from chatito.
@anaportela I think that in practice it is the same. Have you tried training a Rasa NLU pipeline with that dataset? I think it should just work, or did you find issues? AFAIK the named entity recognition model just tags a word from the input, and the synonyms are a post-processing step that replaces the tagged words with the original value they are synonyms of. Please let me know how it goes after training the NLU model.
from chatito.
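The post-processing step discussed above can be illustrated with a small lookup. This is a minimal Python sketch and not Rasa's actual implementation: build_synonym_lookup and replace_synonyms are hypothetical helpers, and the case-insensitive match is an assumption. The synonym data is taken from the entity_synonyms array in the generated dataset posted earlier.

```python
def build_synonym_lookup(entity_synonyms):
    """Map each synonym (lower-cased, by assumption) to its canonical value."""
    lookup = {}
    for entry in entity_synonyms:
        for synonym in entry["synonyms"]:
            lookup[synonym.lower()] = entry["value"]
    return lookup


def replace_synonyms(entities, lookup):
    """Return new entity dicts whose values are replaced when a synonym matches."""
    return [
        {**entity, "value": lookup.get(str(entity["value"]).lower(), entity["value"])}
        for entity in entities
    ]


# entity_synonyms as generated in the example dataset above.
ENTITY_SYNONYMS = [
    {"synonyms": ["ny city"], "value": "newYork"},
    {"synonyms": ["san francisco city"], "value": "sanFrancisco"},
]

lookup = build_synonym_lookup(ENTITY_SYNONYMS)

# Entities as an extractor might tag them (hypothetical predictions).
tagged = [
    {"entity": "city", "value": "ny city"},
    {"entity": "city", "value": "ny"},
]
resolved = replace_synonyms(tagged, lookup)
```

Here "ny city" resolves to "newYork", while "ny" is left unchanged because it is not listed as a synonym. Replacement only happens after the extractor has tagged the text, which is the point made in the note quoted from the Rasa docs.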