gunthercox / chatterbot-corpus

1.3K 69.0 1.1K 549 KB

A multilingual dialog corpus

Home Page: http://chatterbot-corpus.readthedocs.io

License: BSD 3-Clause "New" or "Revised" License

Python 100.00%

chatterbot corpus dialog language yaml

chatterbot-corpus's People

dmichelin kennyl chagge lixiang0 vangogh0318 devaiyaaz galeej deepak-k-zefr orinify kmmao rahulmirdha slbinilkumar sudophils flpvsk wrapperband madhu009 sporting rollioforce michab23 corelmax vkosuri eusouamaro ashrovy iamhssingh vignesh10 zhangyunfang amusingthrone xavi-reloaded jlin9327 tsingly petecummings ljp215 jeanpaiva42 ajoeajoe 4575759ww stinvl afcentry akatsuki06 anshulmalviya hamzah3058 owen864720655 pastorenue nilopc-python cleangsb yumingvvv111 tonyklose1984 changyuhang suresrp nhrkr chiransh jintan2000 danfouer vunb fanfanba melody-xiaomi virtadpt rjr thinhngo224 lzbgt hailiang-wang esda-lab mylvcs shahnaansari milesqli stevenlol little1tow benjamesbabala micdes sisinduku deepakr6242 kumsantosh property404 3dluis pathriclee angelsilvan uchile-robotics-forks anjiang2016 nsnietol lvsj dimitrismav tyge318 v-zion limpapud neufii krypton3 shiyongde indiclinguist ealiums lizheng29 mdasifbinkhaled iamdinithi edwarddunn maggie0830 fabturing shdeng zhang9song mght xubbo2009 coresoft2 innerface

chatterbot-corpus's Issues

How to implement the API's response in yaml?

I have created the same chatbot using SUSI. The best part there is that I can wire in my own APIs, like this one, using indentation under the queries entered by users:

https://github.com/kumar-mind/kumar-bot/blob/master/conf/kumar/en_train_finding_skill.txt

Can we do the same thing in this project? I would be happy to learn how to implement this feature. I have found that YAML is the best approach for providing skill sets, rather than creating .txt files.

For your reference, I am adding this example:
https://kumar.rajendraarora.com

Training data taking ages, alternatives?

My code:

`from chatterbot import ChatBot
from chatterbot.trainers import ListTrainer
import os

bot = ChatBot('Bot')
bot.set_trainer(ListTrainer)

for files in os.listdir('PATH TO DATA'):
    data = open('PATH TO DATA' + files, 'r', encoding="utf8").readlines()
    bot.train(data)

while True:
    message = input('You:')
    if message.strip() != 'Bye':
        reply = bot.get_response(message)
        print('Chatbot :', reply)
    if message.strip() == 'Bye':
        print('ChatBot : Bye')
        break`

Heya! I have this training data, but when I start the program it takes ages on "List Trainer". Can't I train on the data once, save the result to a file or something, and never train again?
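One way to avoid retraining on every start is to rely on ChatterBot persisting what it learns through its storage adapter. The sketch below assumes the default SQL storage adapter, which writes to a SQLite file (db.sqlite3 in the working directory unless configured otherwise). The helper name needs_training is hypothetical and not part of ChatterBot.

```python
import os


def needs_training(db_path="db.sqlite3"):
    """Return True when no trained database exists yet at db_path."""
    # Assumption: an absent or empty file means the bot was never trained.
    return not os.path.exists(db_path) or os.path.getsize(db_path) == 0


# Intended use (kept as comments because it needs ChatterBot installed):
#   first_run = needs_training()      # check before constructing the ChatBot
#   bot = ChatBot('Bot')
#   if first_run:
#       ...run the training loop once...
#   ...then go straight to the chat loop...
```

The check is done before the bot is constructed because the storage adapter may create the database file as soon as the bot is instantiated.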

chatter bot training

I am using Django with ChatterBot and want to train the chatbot from a separate train.py file. When I import the settings from chatterbot.ext.django_chatterbot, I get an error:

Traceback (most recent call last):
  File "train.py", line 3, in <module>
    from chatterbot.ext.django_chatterbot import settings
  File "C:\Users\nuthalapativ\AppData\Local\Continuum\anaconda3\lib\site-packages\chatterbot\ext\django_chatterbot\settings.py", line 8, in <module>
    CHATTERBOT_SETTINGS = getattr(settings, 'CHATTERBOT', {})
  File "C:\Users\nuthalapativ\AppData\Local\Continuum\anaconda3\lib\site-packages\django\conf\__init__.py", line 79, in __getattr__
    self._setup(name)
  File "C:\Users\nuthalapativ\AppData\Local\Continuum\anaconda3\lib\site-packages\django\conf\__init__.py", line 64, in _setup
    % (desc, ENVIRONMENT_VARIABLE))
django.core.exceptions.ImproperlyConfigured: Requested setting CHATTERBOT, but settings are not configured. You must either define the environment variable DJANGO_SETTINGS_MODULE or call settings.configure() before accessing settings.

Can you help me with this?

The program that I have written is:

from chatterbot import ChatBot
from chatterbot.trainers import ChatterBotCorpusTrainer
from chatterbot.ext.django_chatterbot import settings
bot = ChatBot(**settings.CHATTERBOT)

trainer = ChatterBotCorpusTrainer(bot)

trainer.train('./training_data/')
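The error message names its own remedies: define DJANGO_SETTINGS_MODULE or call settings.configure() before any settings are accessed. A minimal sketch for a standalone train.py follows. The module path passed in (for example "example_app.settings") is a placeholder for your own project, and both helper names are hypothetical.

```python
import os


def point_at_settings(settings_module):
    """Set DJANGO_SETTINGS_MODULE unless the environment already defines it."""
    os.environ.setdefault("DJANGO_SETTINGS_MODULE", settings_module)
    return os.environ["DJANGO_SETTINGS_MODULE"]


def load_chatterbot_settings(settings_module):
    """Configure Django first, then import the ChatterBot settings."""
    point_at_settings(settings_module)
    import django  # third-party; imported lazily so the helper above stays standalone
    django.setup()
    # Only now is it safe to import modules that read django.conf.settings.
    from chatterbot.ext.django_chatterbot import settings
    return settings.CHATTERBOT
```

In train.py, the import of chatterbot.ext.django_chatterbot would then move below the call that configures Django, rather than sitting at the top of the file.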

Server program to run this on browser takes an input but does not answer anything, any help would be appreciated.

`from SimpleWebSocketServer import SimpleWebSocketServer, WebSocket
from chatbot import get_response

class ChatServer(WebSocket):

    def handleMessage(self):
        # pass the incoming message to the bot and send its reply back to the client
        message = self.data
        response = get_response(message)
        self.sendMessage(response)

    def handleConnected(self):
        print(self.address, 'connected')

    def handleClose(self):
        print(self.address, 'closed')

server = SimpleWebSocketServer('', 8000, ChatServer)
server.serveforever()`

ISSUE

C:\Users\user>cd desktop

C:\Users\user\Desktop>python ourchatbot.py
ourchatbot.py:8: SyntaxWarning: invalid escape sequence \D
for files in os.listdir('C:/Users/user\Desktop\chatterbot-corpus-master\chatterbot_corpus\data\english/'):
ourchatbot.py:9: SyntaxWarning: invalid escape sequence \D
data = open ('C:/Users/user\Desktop\chatterbot-corpus-master\chatterbot_corpus\data\english/'+ files,'r').readlines()
[nltk_data] Downloading package averaged_perceptron_tagger to
[nltk_data] C:\Users\user\AppData\Roaming\nltk_data...
[nltk_data] Package averaged_perceptron_tagger is already up-to-
[nltk_data] date!
[nltk_data] Downloading package punkt to
[nltk_data] C:\Users\user\AppData\Roaming\nltk_data...
[nltk_data] Package punkt is already up-to-date!
[nltk_data] Downloading package stopwords to
[nltk_data] C:\Users\user\AppData\Roaming\nltk_data...
[nltk_data] Package stopwords is already up-to-date!
Traceback (most recent call last):
  File "ourchatbot.py", line 6, in <module>
    bot.set_trainer(ListTrainer)
AttributeError: 'ChatBot' object has no attribute 'set_trainer'
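The AttributeError suggests that the installed ChatterBot no longer provides set_trainer. In the 1.0 series, a trainer is constructed with the bot instance and train is called on the trainer instead. A minimal sketch of that style, where the function name train_from_lines is hypothetical:

```python
def train_from_lines(bot_name, lines):
    """Train a bot on one list of statements using the trainer-object style."""
    # Imported lazily so this block can be loaded without ChatterBot installed.
    from chatterbot import ChatBot
    from chatterbot.trainers import ListTrainer

    bot = ChatBot(bot_name)
    trainer = ListTrainer(bot)   # replaces bot.set_trainer(ListTrainer)
    trainer.train(lines)         # replaces bot.train(lines)
    return bot
```

The SyntaxWarning lines are a separate matter: in an ordinary string literal, \D is not a valid escape sequence, so the Windows path is better written with forward slashes throughout or as a raw string.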

Chatterbot reading .yml files as text files

When I try to train the bot on a .yml file such as 'chatterbot_corpus/data/english/botprofile.yml', the bot responds with sentences that still include the leading hyphens. How can I train the bot so that it replies without hyphens?

It seems as if it did not recognize the file as a .yml and is just reading it as a text file.

Readme "project documentation" link is broken.

It currently links here: "https://github.com/gunthercox/ChatterBot/wiki/Training", which opens a page to create a new wiki entry in the ChatterBot project.

I imagine that it should direct here instead?

Missing hyphen in English training corpus file conversations.yml

When training my bot on the provided English training corpora, there seems to be a problem when loading the file conversations.yml:

The question statement "- What languages do you like to use?" (line 90) is missing one hyphen, which causes the sentence to be split into single characters and to be stored as such in the database (in my case SQLite). This might lead to answers composed of single characters sometimes.

Can you check this problem and fix the file?
Thank you very much!
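A plausible mechanism, assuming the trainer walks each conversation expecting a list of statements: with one hyphen missing, the loader can hand back a plain string where a list was expected, and iterating over a string yields single characters. The illustration below is plain Python with no ChatterBot involved, and the answer text is made up.

```python
# What the loader is expected to yield: a list of statements.
well_formed = ["What languages do you like to use?", "I use Python quite a bit."]

# What a missing hyphen can yield instead: a bare string.
malformed = "What languages do you like to use?"


def statements_in(conversation):
    """Mimic a trainer loop that walks a conversation item by item."""
    return [statement for statement in conversation]


print(statements_in(well_formed)[0])   # the whole question
print(statements_in(malformed)[:4])    # ['W', 'h', 'a', 't']
```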

Identify "Corpus" by IETF language tag rather than just language

My suggestion is to identify each corpus by IETF language tag rather than just by language. What are the benefits? It would allow different dialects to be kept independent, as I might want to "teach" just one of them.

Why am I bringing this idea up?

For example, the corpus for the Portuguese language seems to have mostly Brazilian Portuguese (pt-BR) strings with some, here and there, in European Portuguese (pt or pt-PT). Using said corpus makes the bot a bit of a bilingual freak, if I'm allowed to call it that :P

The same goes for the English corpus which, I'd say (not totally sure, but judging from some expressions), is English (en or en-GB) with United States English (en-US) mixed in.

Chinese is another language with a lot of dialects (but this one is an unknown to me, as I have zero knowledge of the language).
The story goes on...

~ @duramato

This ticket was originally opened in the ChatterBot repository when the corpus files were a part of the main package. I've moved it here to keep track of it as the projects move forward.

Previous ticket: gunthercox/ChatterBot#504

Switch from ruamel.yaml to PyYAML

This is in response to gunthercox/ChatterBot#913 (comment)

PyYAML seems like a better choice considering that a number of large projects depend on it. For example, the docker-compose project https://github.com/docker/compose/blob/f9aaa72c54957ee834141bdf7f7ec9b656f26697/requirements.txt#L17

change in training data is not reflecting

Hi all,

I changed the data in computers.yml and then executed the code below:

import os
from chatterbot import ChatBot
from chatterbot.trainers import ChatterBotCorpusTrainer

bot = ChatBot('Bot')
trainer = ChatterBotCorpusTrainer(bot)
corpus_path = 'chatterbot-corpus-master/chatterbot_corpus/data/english/'

# train on every corpus file in the directory
for file in os.listdir(corpus_path):
    trainer.train(corpus_path + file)

while True:
    message = input('you: ')
    if message.strip() != 'Bye':
        reply = bot.get_response(message)
        print('chatbot ', reply)
    if message.strip() == 'Bye':
        print(' chat:', 'Bye')
        break

Please suggest the way forward.

Thanks
vijay

support for regex and dynamic responses

What do you think it would take to support regex matching? For example, when the user input looks like a zip code, the response would always be the same.

The other part of this is "dynamic" responses where depending on the user input, an otherwise standard response is transformed in some way (e.g. confirming the zipcode to be correct).

What do you think?
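To make the idea concrete, here is a minimal sketch of a regex-triggered, input-dependent response in plain Python. It is not part of ChatterBot; in practice this kind of logic would probably live in a custom logic adapter, and that wiring is not shown. The pattern assumes US-style ZIP codes, and the function name and reply wording are hypothetical.

```python
import re

# Assumption: US-style ZIP codes, five digits with an optional four-digit suffix.
ZIP_PATTERN = re.compile(r"\b(\d{5})(?:-\d{4})?\b")


def zip_response(user_input):
    """Return a confirmation that echoes the ZIP code, or None if there is none."""
    match = ZIP_PATTERN.search(user_input)
    if match is None:
        return None
    return "Thanks, I have your zip code as {}. Is that correct?".format(match.group(0))
```

Returning None when nothing matches leaves room for the normal response selection to take over.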

Untrain or create new chatterbot

So I'm very new to python and chatterbot. I have started training a bot and have made some minor grammatical errors such as

Me: What is your name?
bot: Your name is DEX

It keeps saying this even after I set up a new conversation trainer to try to correct it. So I'm trying to wipe the bot I currently have so I can start fresh. I can't seem to find where the data is stored. I'm using the Python interpreter, so I don't understand where the files are being stored; no files on my computer that are associated with the Python interpreter have anything to do with ChatterBot. Any help would be greatly appreciated, thanks!

FileNotFoundError: [Errno 2] No such file or directory:

I am getting this error. Does anyone know how to fix it?

CODE_
from chatterbot import ChatBot
from chatterbot.trainers import ListTrainer
import os

bot = ChatBot('Bot')
bot.set_trainer(ListTrainer)

for files in os.listdir('C:/Users/nisha\chatterbot-corpus-master\chatterbot_corpus\data\english'):
    data = open('C:/Users/nisha\chatterbot-corpus-master\chatterbot_corpus\data\english' + files ,'r').readlines()
    bot.train(data)

while True:
    message = input('You:')
    if message.strip()!= 'Bye':
        reply = bot.get_response(message)
        print('ChatBot :',reply)
    if message.strip() == 'Bye':
        print('ChatBot : Bye')
        break

ERROR-
FileNotFoundError: [Errno 2] No such file or directory: 'C:/Users/nisha\chatterbot-corpus-master\chatterbot_corpus\data\englishai.yml'
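The file name in the error, englishai.yml, suggests that the directory name and the file name were concatenated without a separator between them. A small sketch of the difference, using a shortened relative path as a stand-in for the one in the report; the helper name corpus_file is hypothetical.

```python
import os


def corpus_file(directory, filename):
    """Join a directory and a file name, adding the separator if it is missing."""
    return os.path.join(directory, filename)


directory = "chatterbot_corpus/data/english"   # no trailing slash, as in the report

# Plain concatenation glues the file name onto the directory name.
print(directory + "ai.yml")                    # prints chatterbot_corpus/data/englishai.yml

# os.path.join inserts the separator for us.
print(corpus_file(directory, "ai.yml"))
```

Writing the Windows path with forward slashes throughout, or as a raw string, also avoids surprises from backslash escapes.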

Enable Read Only Mode

In which file, and with which statement, can we enable ChatterBot's read-only mode?

Please help !!
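Read-only mode is, to my knowledge, not switched on in a file inside the package. It is a keyword argument, read_only=True, passed where the ChatBot is constructed in your own code. A minimal sketch, with hypothetical helper names:

```python
def read_only_settings(name):
    """Keyword arguments for a bot that answers but does not learn from input."""
    return {"name": name, "read_only": True}


def make_read_only_bot(name):
    # Imported lazily so the settings helper works without ChatterBot installed.
    from chatterbot import ChatBot
    return ChatBot(**read_only_settings(name))
```

Written out directly, this amounts to ChatBot('Bot', read_only=True).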

How do I change file location of Chatterbot_corpus data?

Hi, I have developed my own UI for creating Q&A for chatterbot_corpus data. I can test chatterbot responses also. They are working fine.
Now when I create Q&A from my project, the YAML files are stored in "C:/Python27/Lib/site-packages/chatterbot_corpus/data/english/". To keep my project simple, I want to move them into my project folder so that I can use a relative path instead of an absolute one. Can I do that?

Can I set several related questions together before their common answers

How to define custom synonyms for words or sentences in yml file, any help would be appreciated.

For example, if I want the same answer (output) for eNodeB, eNB and evolved node B all 3 inputs.

.yml file:

The above code is not working it's just for example!
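One workaround that needs no special corpus syntax is to generate one question/answer pair per variant, so every spelling leads to the same answer. The sketch below is plain Python. The function name is hypothetical and the answer text is only a placeholder.

```python
def expand_variants(questions, answer):
    """Build one [question, answer] conversation per question variant."""
    return [[question, answer] for question in questions]


variants = ["What is eNodeB?", "What is eNB?", "What is evolved node B?"]
conversations = expand_variants(variants, "It is the base station in an LTE network.")

for conversation in conversations:
    print(conversation)
```

Each resulting pair could then be written to a .yml corpus file or handed to a list-based trainer one conversation at a time.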

Error on file voc.tar

When I run the command below, I get this error. Where do I get this file?

python3 main.py -tr ./data/botprofile.yml -la 1 -hi 512 -lr 0.0001 -it 50000 -b 64 -p 500 -s 1000

Error:
FileNotFoundError: [Errno 2] No such file or directory: './save/training_data/botprofile/voc.tar

Update Pyyaml to match chatterbot

I just started using chatterbot,

It does not install chatterbot-corpus as a dependency after running pipenv install chatterbot.

In [1]: from chatterbot import ChatBot
---------------------------------------------------------------------------
ModuleNotFoundError                       Traceback (most recent call last)
<ipython-input-1-22d2a9e85e25> in <module>
----> 1 from chatterbot import ChatBot

~/.local/share/virtualenvs/chatterbot-test-H1-ATVUx/lib/python3.7/site-packages/chatterbot/__init__.py in <module>
      2 ChatterBot is a machine learning, conversational dialog engine.
      3 """
----> 4 from .chatterbot import ChatBot
      5 
      6 __version__ = '1.0.5'

~/.local/share/virtualenvs/chatterbot-test-H1-ATVUx/lib/python3.7/site-packages/chatterbot/chatterbot.py in <module>
      1 import logging
----> 2 from chatterbot.storage import StorageAdapter
      3 from chatterbot.logic import LogicAdapter
      4 from chatterbot.search import IndexedTextSearch
      5 from chatterbot import utils

~/.local/share/virtualenvs/chatterbot-test-H1-ATVUx/lib/python3.7/site-packages/chatterbot/storage/__init__.py in <module>
----> 1 from chatterbot.storage.storage_adapter import StorageAdapter
      2 from chatterbot.storage.django_storage import DjangoStorageAdapter
      3 from chatterbot.storage.mongodb import MongoDatabaseAdapter
      4 from chatterbot.storage.sql_storage import SQLStorageAdapter
      5 

~/.local/share/virtualenvs/chatterbot-test-H1-ATVUx/lib/python3.7/site-packages/chatterbot/storage/storage_adapter.py in <module>
      1 import logging
      2 from chatterbot import languages
----> 3 from chatterbot.tagging import PosHypernymTagger
      4 
      5 

~/.local/share/virtualenvs/chatterbot-test-H1-ATVUx/lib/python3.7/site-packages/chatterbot/tagging.py in <module>
      2 from chatterbot import languages
      3 from chatterbot import utils
----> 4 from chatterbot.tokenizers import get_sentence_tokenizer
      5 from nltk import pos_tag
      6 from nltk.corpus import wordnet, stopwords

~/.local/share/virtualenvs/chatterbot-test-H1-ATVUx/lib/python3.7/site-packages/chatterbot/tokenizers.py in <module>
      2 from nltk.tokenize.punkt import PunktSentenceTokenizer, PunktTrainer
      3 from nltk.tokenize import _treebank_word_tokenizer
----> 4 from chatterbot.corpus import load_corpus, list_corpus_files
      5 from chatterbot import languages
      6 

~/.local/share/virtualenvs/chatterbot-test-H1-ATVUx/lib/python3.7/site-packages/chatterbot/corpus.py in <module>
      3 import glob
      4 import yaml
----> 5 from chatterbot_corpus.corpus import DATA_DIRECTORY
      6 
      7 

ModuleNotFoundError: No module named 'chatterbot_corpus'

(which should probably be an issue in ChatterBot)

Then when I try to install chatterbot-corpus I get the following error:

Installing chatterbot_corpus…
Collecting chatterbot-corpus
  Using cached https://files.pythonhosted.org/packages/ed/19/f8b41daf36fe4b0f43e283a820362ffdb2c1128600ab4ee187e84262fa4d/chatterbot_corpus-1.2.0-py2.py3-none-any.whl
Collecting PyYAML<4.0,>=3.12 (from chatterbot-corpus)
  Using cached https://files.pythonhosted.org/packages/9e/a3/1d13970c3f36777c583f136c136f804d70f500168edc1edea6daa7200769/PyYAML-3.13.tar.gz
Building wheels for collected packages: PyYAML
  Building wheel for PyYAML (setup.py): started
  Building wheel for PyYAML (setup.py): finished with status 'done'
  Stored in directory: /home/jonas/.cache/pipenv/wheels/ad/da/0c/74eb680767247273e2cf2723482cb9c924fe70af57c334513f
Successfully built PyYAML
Installing collected packages: PyYAML, chatterbot-corpus
  Found existing installation: PyYAML 5.1
    Uninstalling PyYAML-5.1:
      Successfully uninstalled PyYAML-5.1
Successfully installed PyYAML-3.13 chatterbot-corpus-1.2.0

Adding chatterbot_corpus to Pipfile's [packages]…
Pipfile.lock (6f58d5) out of date, updating to (2ccbda)…
Locking [dev-packages] dependencies…
Locking [packages] dependencies…

Warning: Your dependencies could not be resolved. You likely have a mismatch in your sub-dependencies.
  First try clearing your dependency cache with $ pipenv lock --clear, then try the original command again.
 Alternatively, you can use $ pipenv install --skip-lock to bypass this mechanism, then run $ pipenv graph to inspect the situation.
  Hint: try $ pipenv lock --pre if it is a pre-release dependency.
Could not find a version that matches pyyaml<4.0,<5.2,>=3.12,>=5.1
Tried: 3.10, 3.10, 3.11, 3.11, 3.12, 3.12, 3.12, 3.12, 3.12, 3.12, 3.12, 3.12, 3.13, 3.13, 3.13, 3.13, 3.13, 3.13, 3.13, 3.13, 3.13, 3.13, 3.13, 5.1, 5.1, 5.1, 5.1, 5.1, 5.1, 5.1, 5.1, 5.1, 5.1, 5.1
Skipped pre-versions: 3.13b1, 3.13b1, 3.13b1, 3.13b1, 3.13b1, 3.13b1, 3.13b1, 3.13b1, 3.13b1, 3.13b1, 3.13rc1, 3.13rc1, 3.13rc1, 3.13rc1, 3.13rc1, 3.13rc1, 3.13rc1, 3.13rc1, 3.13rc1, 3.13rc1, 3.13rc1, 4.2b1, 4.2b2, 4.2b4, 4.2b4, 4.2b4, 4.2b4, 4.2b4, 5.1b1, 5.1b3, 5.1b5, 5.1b5, 5.1b5, 5.1b5, 5.1b5, 5.1b5, 5.1b5, 5.1b5, 5.1b5, 5.1b5, 5.1b5, 5.1b7
There are incompatible versions in the resolved dependencies.

Notice how installing chatterbot-corpus uninstalled the PyYAML version that ChatterBot requires.
Running pipenv graph shows that there's a conflict in the dependencies:

ChatterBot==1.0.5
  ...
  - pyyaml [required: >=5.1,<5.2, installed: 3.13]
  ...
chatterbot-corpus==1.2.0
  - PyYAML [required: >=3.12,<4.0, installed: 3.13]

pymongo.errors.OperationFailure for Ubuntu corpus

Hi,

I've trained the chatterbot (for a while, maybe 15 hours) on the Ubuntu corpus. The training did not finish because it was aborted (my computer went to sleep), so I thought I'd try it out to see what effect it had. However, I just get the error below. Any idea what to do about this?

pymongo.errors.OperationFailure: distinct too big, 16mb cap

Thanks.

Dynamic responses

Is it possible to add dynamic responses?
What I'm looking for is this: if I ask the chatbot what time or date it is, I want the response to be the current time and date on the local host.

Thanks

Format for adding new corpus?

Hey,

I see that one is able to add new corpora. But how would one go about doing that for, e.g., the Cornell Movie DB (https://www.cs.cornell.edu/~cristian/Cornell_Movie-Dialogs_Corpus.html)? Would one need to parse the entire DB and create a huge JSON file in a format similar to the current corpora, as below?

[
"Good morning, how are you?",
"I am doing well, how about you?",
"I'm also good.",
"That's good to hear.",
"Yes it is."
],
...

Thanks.
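For what it's worth, the corpus files in this repository are YAML (.yml) files, so the parsed dialogs would be written in that layout rather than as JSON. The sketch below assumes the categories/conversations layout of the bundled .yml files. The function name and the "movies" category are hypothetical, and parsing the Cornell data itself is not shown.

```python
import json


def to_corpus_yaml(category, conversations):
    """Render conversations (lists of statements) in the corpus YAML layout."""
    lines = ["categories:", "- " + json.dumps(category), "conversations:"]
    for conversation in conversations:
        for index, statement in enumerate(conversation):
            prefix = "- - " if index == 0 else "  - "
            # json.dumps yields a double-quoted string, which YAML also accepts.
            lines.append(prefix + json.dumps(statement, ensure_ascii=False))
    return "\n".join(lines) + "\n"


print(to_corpus_yaml("movies", [
    ["Good morning, how are you?", "I am doing well, how about you?"],
]))
```

Quoting every statement is a deliberate choice here: it keeps colons, hashes and leading hyphens in the dialog text from being read as YAML syntax.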

Getting error: AttributeError: 'str' object has no attribute 'storage'

I am running the below code:

from chatterbot import ChatBot  # import the chatbot
from chatterbot.trainers import ListTrainer  # method to train chatterbot
from chatterbot.trainers import ChatterBotCorpusTrainer
import os

bot = ChatBot('Bot')
# bot.set_trainer(ListTrainer)
trainer = ListTrainer('Bot')

for files in os.listdir('C:/Users/XXX/Desktop/chatterbot-corpus-master/chatterbot_corpus/data/english/'):
    data = open('C:/Users/XXX/Desktop/chatterbot-corpus-master/chatterbot_corpus/data/english/' + files, 'r').readlines()
    trainer.train('Bot')

while True:
    message = input('You:')
    if message.strip() != 'Bye':
        reply = bot.get_response(message)
        print('ChatBot:', reply)
    if message.strip() == 'Bye':
        print('ChatBot: Bye')
        break

I am getting the below error while running the code Test1.py:
C:\Users\XXX>C:/Users/XXX\AppData\Local\Programs\Python\Python37-32\python.exe C:/Users/XXX\Desktop\ABC\Test1.py
[nltk_data] Downloading package averaged_perceptron_tagger to
[nltk_data] C:\Users\XXX\AppData\Roaming\nltk_data...
[nltk_data] Package averaged_perceptron_tagger is already up-to-
[nltk_data] date!
[nltk_data] Downloading package punkt to
[nltk_data] C:\Users\XXX\AppData\Roaming\nltk_data...
[nltk_data] Package punkt is already up-to-date!
[nltk_data] Downloading package stopwords to
[nltk_data] C:\Users\XXX\AppData\Roaming\nltk_data...
[nltk_data] Package stopwords is already up-to-date!
List Trainer: [####### ] 33%Traceback (most recent call last):
  File "C:/Users/XXX\Desktop\ABC\Test1.py", line 12, in <module>
    trainer.train('Bot')
  File "C:\Users\XXX\AppData\Local\Programs\Python\Python37-32\lib\site-packages\chatterbot\trainers.py", line 103, in train
    statement_search_text = self.chatbot.storage.tagger.get_bigram_pair_string(text)
AttributeError: 'str' object has no attribute 'storage'

chatterbot keeps download nltk's package that already exist

Whenever I run this, I always get the following error:

from chatterbot import ChatBot
chatbot = ChatBot ('Joe')

Traceback (most recent call last):
File "C:\Users\ASUS\AppData\Local\Programs\Python\Python36\lib\site-packages\chatterbot\utils.py", line 108, in nltk_download_corpus
find(resource_path)
File "C:\Users\ASUS\AppData\Local\Programs\Python\Python36\lib\site-packages\nltk\data.py", line 699, in find
raise LookupError(resource_not_found)
LookupError:

Resource not found.
Please use the NLTK Downloader to obtain the resource:

>>> import nltk

nltk.download('')

Attempted to load stopwords/

Searched in:
- 'C:\Users\ASUS/nltk_data'
- 'C:\Users\ASUS\AppData\Local\Programs\Python\Python36\nltk_data'
- 'C:\Users\ASUS\AppData\Local\Programs\Python\Python36\share\nltk_data'
- 'C:\Users\ASUS\AppData\Local\Programs\Python\Python36\lib\nltk_data'
- 'C:\Users\ASUS\AppData\Roaming\nltk_data'
- 'C:\nltk_data'
- 'D:\nltk_data'
- 'E:\nltk_data'

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
File "", line 1, in
File "C:\Users\ASUS\AppData\Local\Programs\Python\Python36\lib\site-packages\chatterbot\chatterbot.py", line 58, in init
self.initialize()
File "C:\Users\ASUS\AppData\Local\Programs\Python\Python36\lib\site-packages\chatterbot\chatterbot.py", line 80, in initialize
function()
File "C:\Users\ASUS\AppData\Local\Programs\Python\Python36\lib\site-packages\chatterbot\utils.py", line 190, in download_nltk_stopwords
nltk_download_corpus('stopwords')
File "C:\Users\ASUS\AppData\Local\Programs\Python\Python36\lib\site-packages\chatterbot\utils.py", line 110, in nltk_download_corpus
download(corpus_name)
File "C:\Users\ASUS\AppData\Local\Programs\Python\Python36\lib\site-packages\nltk\downloader.py", line 787, in download
for msg in self.incr_download(info_or_id, download_dir, force):
File "C:\Users\ASUS\AppData\Local\Programs\Python\Python36\lib\site-packages\nltk\downloader.py", line 636, in incr_download
info = self._info_or_id(info_or_id)
File "C:\Users\ASUS\AppData\Local\Programs\Python\Python36\lib\site-packages\nltk\downloader.py", line 609, in _info_or_id
return self.info(info_or_id)
File "C:\Users\ASUS\AppData\Local\Programs\Python\Python36\lib\site-packages\nltk\downloader.py", line 1019, in info
self._update_index()
File "C:\Users\ASUS\AppData\Local\Programs\Python\Python36\lib\site-packages\nltk\downloader.py", line 962, in _update_index
ElementTree.parse(urlopen(self._url)).getroot()
File "C:\Users\ASUS\AppData\Local\Programs\Python\Python36\lib\xml\etree\ElementTree.py", line 1196, in parse
tree.parse(source, parser)
File "C:\Users\ASUS\AppData\Local\Programs\Python\Python36\lib\xml\etree\ElementTree.py", line 597, in parse
self._root = parser._parse_whole(source)
xml.etree.ElementTree.ParseError: not well-formed (invalid token): line 1, column 0

It seems like ChatterBot always tries to download NLTK packages that already exist. Is there any way to solve this? I tried to fix the NLTK download issue, but no luck. Please help.

Make corpus data available to other programming languages

Because the data files in the chatterbot corpus are not Python specific, it should be possible to add packaging for other programming languages as well. I would be interested in experimenting to see how successfully this could be accomplished with a language such as Node JS.

I am getting this error. Tried all the versions.

[nltk_data] Downloading package averaged_perceptron_tagger to
[nltk_data] C:\Users\santosh.rawat\AppData\Roaming\nltk_data...
[nltk_data] Package averaged_perceptron_tagger is already up-to-
[nltk_data] date!
[nltk_data] Downloading package punkt to
[nltk_data] C:\Users\santosh.rawat\AppData\Roaming\nltk_data...
[nltk_data] Package punkt is already up-to-date!
[nltk_data] Downloading package stopwords to
[nltk_data] C:\Users\santosh.rawat\AppData\Roaming\nltk_data...
[nltk_data] Package stopwords is already up-to-date!
Traceback (most recent call last):
  File "C:\Users\santosh.rawat\AppData\Local\Programs\Python\Python37\chatnew.py", line 13, in <module>
    chatbot.set_trainer(ListTrainer)
AttributeError: 'ChatBot' object has no attribute 'set_trainer'

python2 error

  "chatterbot.corpus.english.conversations"
  File "/usr/local/lib/python2.7/dist-packages/chatterbot/trainers.py", line 143, in train
    corpora = self.corpus.load_corpus(corpus_path)
  File "/usr/local/lib/python2.7/dist-packages/chatterbot_corpus/corpus.py", line 87, in load_corpus
    data_file_paths = self.list_corpus_files(dotted_path)
  File "/usr/local/lib/python2.7/dist-packages/chatterbot_corpus/corpus.py", line 76, in list_corpus_files
    paths = glob.glob(corpus_path + '/**/*.' + CORPUS_EXTENSION, recursive=True)
TypeError: glob() got an unexpected keyword argument 'recursive'
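The recursive keyword of glob.glob was added in Python 3.5, which is why Python 2.7 rejects it here. If staying on an older interpreter, a recursive listing can be sketched with os.walk and fnmatch instead. The function name is hypothetical, and the "yml" default is an assumption about the value of CORPUS_EXTENSION.

```python
import fnmatch
import os


def list_files_recursive(root, extension="yml"):
    """Collect files under root with the given extension, without glob(recursive=True)."""
    matches = []
    for directory, _subdirs, filenames in os.walk(root):
        for name in fnmatch.filter(filenames, "*." + extension):
            matches.append(os.path.join(directory, name))
    return sorted(matches)
```

Both os.walk and fnmatch.filter are available in Python 2.7 as well as Python 3.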

Adding my corpus.json failed

Where is the chatterbot.search file, IndexedTextSearch, or the logic for primary_search_algorithm or search_algorithm? Any response would be appreciated.

Untraining Chatterbot

How do I untrain ChatterBot once it has been trained incorrectly?

Train from file

Hi,
I wonder if it is possible to train it from a specific file with conversations.
For example specify the file name in example_app/settings.py:

CHATTERBOT = {
    'name': 'ChatterBot Example',
    'trainer': 'chatterbot.trainers.ChatterBotCorpusTrainer',
    'training_data': [
#        'chatterbot.corpus.english.greetings'
        './data/custom.corpus/custom_conversations.txt'
    ],
    'django_app_name': 'django_chatterbot'
}

And in data/custom.corpus/custom_conversations.txt to have statements and responses like:

How are you?
I am doing great, thanks.
Do you enjoy the weather today?
Not really.
We should play some baseball.
I would love to play baseball.

When I run the python manage.py train command, I would like all of these answers to end up in my SQLite database.
How is this done? Thank you.
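A plain text file like this is not in the corpus format, so one option is to read it yourself and pass the statements to a list-based trainer from a small script. The sketch below only covers the reading step. It assumes alternating statement and response lines, as in the example file, and both function names are hypothetical.

```python
def read_statements(path):
    """Return the non-empty lines of a text file as an ordered list of statements."""
    with open(path, encoding="utf-8") as handle:
        return [line.strip() for line in handle if line.strip()]


def read_pairs(path):
    """Group alternating lines into [statement, response] pairs."""
    statements = read_statements(path)
    return [statements[i:i + 2] for i in range(0, len(statements), 2)]


# Intended use (needs ChatterBot installed, so kept as comments):
#   trainer = ListTrainer(bot)
#   for pair in read_pairs('./data/custom.corpus/custom_conversations.txt'):
#       trainer.train(pair)
```

Training pair by pair keeps "Do you enjoy the weather today?" from being learned as a reply to "I am doing great, thanks.", which would happen if the whole file were passed as one long conversation.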

Supporting multi-language conversations

Right now, the format of the data organizes the dialog by language. It would be useful if the format of the corpus supported the ability to annotate individual statements to specify a language.

Add a default_locale attribute to corpus files: "Primary language for a conversation".
Individual statements should be able to support languages of different parts. (Including start and end index). A validator will be needed for testing.

Support multiple responses to the same statement

User story: As a developer, I want to be able to specify multiple responses to the same statement in a conversation so that I don't have to create multiple copies of the same conversation with each variation.

For example,

1. Hello how are you?
  a. I am well.
  b. I am not well.

Corpus to include entities?

Look at this question and hopefully let me know if this can be done.

I use the chatterbot library for creating a bot about food. I need it to learn about entities so that a question about meatballs can have the same response action like the question about pasta. Is it possible to make chatterbot learn entities or could we add a corpus like dictionary with all entities our bot should know about?

Where can I find more corpora for ChatterBot?

Hi,

Using the "built-in English" corpus gives some decent answers, but there are really far too few to build a nice conversation bot.

Using the "built-in Ubuntu corpus" gives an error (the 16 MB pymongo error, already reported and hopefully soon worked on).

So, what choices do I have? Is there some "conversational corpus" that is not so big that it causes the pymongo errors like the Ubuntu corpus, but is still bigger than the built-in English corpus? I'm "only" looking to train the bot to hold a decent conversation; no need to debate the meaning of life, but it should be reliable in "normal conversation".

Thanks.

Several responses to a question

Hello,
I have seen that in the "english" directory, in the "computers" file, several responses to the same question are written like this:

- Question Q1
- A
- B
- ...

whereas in most of the other files ("ai.yml", for example), the question is repeated each time:

- Question Q1
- A
- Question Q1
- B

Do both solutions work? If yes, is there any difference, and is the first one recommended?

It's not clear to me :)

What function does Extra Data mean?

As you have probably gathered, I've only just started playing around with your cool project. (Thanks for your great work.)

But what does the field Extra Data mean when I create my own Statement/Response?

Thanks

Conditional response to questions

In ChatterBot, is it possible to have a response based on a condition? For example, if the user replies "yes" to "Does that help?", reply with "great!"; if the user replies "no", suggest another solution and ask the same "Does that help?" question again.

Basic question

Hi,
Looking through your corpus files.

What is the difference between

- Bla bla bla
  and
- Haba haba haba

Does the first one mean a new conversation, and the second one a continuation of the conversation above?

Thanks
Christian

Inaccurate conversations in Hindi and Marathi due to literal translation.

This is one of the greetings in Hindi data

- क्या हो रहा है?
- आकाश के ऊपर है, लेकिन मैं। आप के बारे में क्या ठीक धन्यवाद कर रहा हूँ?
  This seems to be a literal translation of its English counterpart.
- What's up?
- The sky's up but I'm fine thanks. What about you?
  But the Hindi one doesn't make much sense because "क्या हो रहा है?" --> "What's going on?" and "आकाश के ऊपर है, लेकिन मैं। आप के बारे में क्या ठीक धन्यवाद कर रहा हूँ?" --> " The sky's up but I'm fine thanks. What am I doing"
  Same is the case with Marathi.
  I presume the Hindi and Marathi data were generated by translating English ones on Google Translate. There are some more minor inaccuracies in Hindi data.
  I could work on fixing this.
  I could also add data for Tamil.

Does Chatterbot use Neural Networks? What other machine learning algorithms does it use other than search-based algorithms?

Create a branch

@gunthercox can you please add me to this so I can create my branch, create a Swedish corpus file, and also document how to create your own training data?

Django statement and response are reversed

Hi,
I have created my own corpus and successfully imported that data into Django, so I can see it under "admin -> Responds".

But when I ask a question that is worded exactly as in the corpus files, I get a totally different answer.
And if I remove that line from the database and ask the same question again, I get a different answer, but still not the correct one.

Am I doing anything wrong here?

Thanks

Is the corpus from chatterbot-corpus the full training data?

If not, where can we find the full training data? Thanks~

Adding corpuses in the Django app

How can I add existing corpuses and new corpus json files to the Django example app?

ModuleNotFoundError: No module named '_sqlite3'

Hi all,

I downloaded the code and ran the command python setup.py install.

We then ran this code:
import os
from chatterbot import ChatBot
from chatterbot.trainers import ListTrainer

bot = ChatBot('Bot')
bot.set_trainer(ListTrainer)

for files in os.listdir('/chatterbot-corpus-master/chatterbot_corpus/data'):
    data = open('chatterbot_corpus/data/english/' + files, 'r')
    bot.train(data)

while True:
    message = input('you ')
    if message.strip() != 'Bye':
        reply = bot.get_response(message)
        print('chatbot ', reply)
    if message.strip() == 'Bye':
        print(' chat :', 'Bye')
        break

The error we encountered is
  File "/usr/local/lib/python3.6/sqlite3/dbapi2.py", line 27, in <module>
    from _sqlite3 import *
ModuleNotFoundError: No module named '_sqlite3'

FYI
we are using python 3.6.3

Pls suggest
