armchair-expert's People

Contributors

csvance

armchair-expert's Issues

'ImportTrainingDataManager' is not defined

Traceback (most recent call last):
  File "import_text_file.py", line 43, in <module>
    main()
  File "import_text_file.py", line 28, in main
    data_manager = ImportTrainingDataManager()
NameError: name 'ImportTrainingDataManager' is not defined
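
The NameError means import_text_file.py instantiates the class without importing it first. A minimal sketch of the kind of fix involved, assuming ImportTrainingDataManager is defined elsewhere in the project (the module path below is hypothetical):

    # Hypothetical fix: import the class before instantiating it. The module
    # path is an assumption; use wherever ImportTrainingDataManager is
    # actually defined in this repository.
    from storage.imported import ImportTrainingDataManager

    data_manager = ImportTrainingDataManager()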

bot not auto-creating markov.json.zlib and structure-model.h5 in weights directory

When I start the bot it reaches the RUNNING state; however, I don't believe it is storing anything from my Discord channel.

I noticed that when I check the weights directory it is empty. Whenever I place the test Trump data that you provided in the weights folder, I get responses from the bot in Discord. Should the python program be creating markov.json.zlib and structure-model.h5 in the weights directory on start-up?

Images for reference:

[screenshot: capture]

[screenshot: weightss]

Thanks!

Bot doesn't come online in Discord

I've gone through so many steps to FINALLY get this bot working, but it's not coming online on Discord. I've set the right token and username, and invited it with an OAuth link that has bot and administrator privileges. What am I doing wrong?
[screenshot]
Is this what the console is supposed to look like?

Python namespace collision with common module

Hi!

I dropped your code into a Docker container and used pigar to build a requirements.txt file, and for some reason it pulled down a package called common and installed it via pip3.

Took me a while to figure out why common.nlp couldn't be found.

Could you at a later date change your namespace to be a bit more exotic, so other less capable Python folk don't trip over this?

I'll update this with the Docker container when it is up on Docker Hub.

Setup process needs feedback

The process of setting up the bot and training it initially doesn't provide much useful information about what the bot has learned. Features should be added to report statistics about the bot's knowledge to the user.

Update sentence generation process using new embedding neural nets

Generation Process

  1. Decide on subject word (NOUN or VERB or HASHTAG)
  2. Fill in all NOUN, VERB, HASHTAG based on subject
  3. Fill in all URL based on proximity with subject words. If no other subject PoS exists in the sentence structure, use the chosen initial subject word.
  4. Fill in all ADJ, ADV based on proximity with NOUN and VERB
  5. Fill in all EMOJI based on proximity with NOUN, VERB, ADJ, ADV, HASHTAG
  6. Fill in all other PoS using a uniform distribution of collected words in relational database
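
A rough Python sketch of that pipeline, purely for illustration; every helper below is a placeholder standing in for the real Markov/embedding logic, not existing project code:

    import random

    # Illustrative skeleton of the proposed generation process.
    def choose_subject(subject_candidates):
        # 1. Decide on a subject word (NOUN, VERB, or HASHTAG).
        return random.choice(subject_candidates)

    def fill(structure, pos_tags, anchor):
        # 2-5. Fill slots of the given PoS tags based on proximity/association
        # with the anchor word(s); stubbed out here.
        return structure

    def fill_remaining_uniform(structure, all_words):
        # 6. Fill any remaining PoS slots from a uniform distribution.
        return structure

    def generate_sentence(structure, subject_candidates, all_words):
        subject = choose_subject(subject_candidates)
        structure = fill(structure, ("NOUN", "VERB", "HASHTAG"), subject)
        structure = fill(structure, ("URL",), subject)
        structure = fill(structure, ("ADJ", "ADV"), ("NOUN", "VERB"))
        structure = fill(structure, ("EMOJI",), ("NOUN", "VERB", "ADJ", "ADV", "HASHTAG"))
        structure = fill_remaining_uniform(structure, all_words)
        return structure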

ValueError: pvals < 0, pvals > 1 or pvals contains NaNs

When setting up this bot with my Discord server, I get this exception randomly whenever someone sends the bot a message:

divide
  p_values = distance_magnitudes / sums
/Users/hwashere/Desktop/PROJECTS/python-copebot/common/ml.py:10: RuntimeWarning: divide by zero encountered in log
  preds = np.log(preds) / temperature
Traceback (most recent call last):
  File "copebot_python_edition.py", line 289, in <module>
    cpe.start(retrain_structure=args.retrain_structure, retrain_markov=args.retrain_markov)
  File "copebot_python_edition.py", line 95, in start
    self._main()
  File "copebot_python_edition.py", line 247, in _main
    reply = connector.generate(message, doc=doc)
  File "/Users/hwashere/Desktop/PROJECTS/python-copebot/connectors/connector_common.py", line 164, in generate
    return self._reply_generator.generate(message, doc)
  File "/Users/hwashere/Desktop/PROJECTS/python-copebot/connectors/bot_instance.py", line 17, in generate
    reply = ConnectorReplyGenerator.generate(self, message, doc, ignore_topics=[BOT_USERNAME.split('#')[0]])
  File "/Users/hwashere/Desktop/PROJECTS/python-copebot/connectors/connector_common.py", line 59, in generate
    sentences = generator.generate(db=self._markov_model)
  File "/Users/hwashere/Desktop/PROJECTS/python-copebot/markov_engine.py", line 364, in generate
    if not self._generate_words(db):
  File "/Users/hwashere/Desktop/PROJECTS/python-copebot/markov_engine.py", line 504, in _generate_words
    handle_projections()
  File "/Users/hwashere/Desktop/PROJECTS/python-copebot/markov_engine.py", line 473, in handle_projections
    word_choice_idx = temp(p_values, temperature=MARKOV_MODEL_TEMPERATURE)
  File "/Users/hwashere/Desktop/PROJECTS/python-copebot/common/ml.py", line 13, in temp
    probas = np.random.multinomial(1, preds, 1)
  File "mtrand.pyx", line 3863, in numpy.random.mtrand.RandomState.multinomial
  File "common.pyx", line 323, in numpy.random.common.check_array_constraint
ValueError: pvals < 0, pvals > 1 or pvals contains NaNs

Any idea what could be causing this crash?
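
The two RuntimeWarnings above show where the NaNs come from: a zero sum produces NaN p-values, log(0) gives -inf, and np.random.multinomial then rejects the vector. A defensive sketch of a temperature-sampling helper along the lines of common/ml.py; the original isn't shown in this issue, so this layout is an assumption:

    import numpy as np

    # Hedged sketch of a more defensive temperature-sampling helper.
    def temp(preds, temperature=1.0):
        preds = np.asarray(preds, dtype=np.float64)
        preds = np.where(np.isfinite(preds), preds, 0.0)   # drop NaN/inf p-values
        preds = np.clip(preds, 1e-12, None)                # avoid log(0) -> -inf
        preds = np.log(preds) / temperature
        exp_preds = np.exp(preds)
        preds = exp_preds / np.sum(exp_preds)
        probas = np.random.multinomial(1, preds, 1)
        return int(np.argmax(probas))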

Replace RDBMS-dependent Markov chain engine with custom engine

The current model's generation performance is very poor. Create a new Markov engine using a trie database. Instead of a neighbor system, we should store the frequency of each distance at which a word occurs from other words, so we can calculate the probability that it will be used in a given position.
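
A toy sketch of the distance-frequency idea; names and structure are invented for illustration, not project code:

    from collections import defaultdict

    # For each (word, other_word) pair, count how often other_word appears at
    # a given signed offset from word, so position probabilities can be
    # estimated later.
    distance_counts = defaultdict(lambda: defaultdict(int))

    def observe(tokens, window=5):
        for i, word in enumerate(tokens):
            lo, hi = max(0, i - window), min(len(tokens), i + window + 1)
            for j in range(lo, hi):
                if j != i:
                    distance_counts[(word, tokens[j])][j - i] += 1

    observe("the quick brown fox jumps over the lazy dog".split())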

AttributeError: 'bool' object has no attribute 'set'

Hi, firstly apologies because I hate to create issues, but I can't seem to fix this error by myself. I'm a bit of a noob.

It seems to finish loading and displays the training stats in the console, but after a few seconds another message appears that seems to be an error and stops the bot from working correctly.

Using Python 3.7, all modules installed and updated, on Discord.

Console:

D:\Mirai\armchair-expert-master>c:\Users\Jake\AppData\Local\Programs\Python\Python37\python.exe D:\Mirai\armchair-expert-master\armchair_expert.py
INFO:ArmchairExpert:Status: STARTING_UP
INFO:ArmchairExpert:Loaded Discord Connector.
INFO:ArmchairExpert:Loading spaCy model
INFO:ArmchairExpert:Training begin
INFO:ArmchairExpert:Training_Preprocessing_Markov(Import)
INFO:ArmchairExpert:Training_Preprocessing_Markov(Discord)
INFO:ArmchairExpert:Training(Markov)
INFO:ArmchairExpert:Training_Preprocessing_Structure(Import)
INFO:ArmchairExpert:Training_Preprocessing_Structure(Discord)
INFO:ArmchairExpert:Training(Structure)
Using TensorFlow backend.
Using TensorFlow backend.
WARNING:tensorflow:From c:\Users\Jake\AppData\Local\Programs\Python\Python37\lib\site-packages\tensorflow\python\framework\op_def_library.py:263: colocate_with (from tensorflow.python.framework.ops) is deprecated and will be removed in a future version.
Instructions for updating:
Colocations handled automatically by placer.
WARNING:tensorflow:From c:\Users\Jake\AppData\Local\Programs\Python\Python37\lib\site-packages\keras\backend\tensorflow_backend.py:3445: calling dropout (from tensorflow.python.ops.nn_ops) with keep_prob is deprecated and will be removed in a future version.
Instructions for updating:
Please use `rate` instead of `keep_prob`. Rate should be set to `rate = 1 - keep_prob`.
INFO:ArmchairExpert:Training end
INFO:ArmchairExpert:Status: RUNNING
_________________________________________________________________
Layer (type)                 Output Shape              Param #
=================================================================
embedding_1 (Embedding)      (None, 16, 120)           14400
_________________________________________________________________
lstm_1 (LSTM)                (None, 128)               127488
_________________________________________________________________
dense_1 (Dense)              (None, 120)               15480
=================================================================
Total params: 157,368
Trainable params: 157,368
Non-trainable params: 0
_________________________________________________________________
Task exception was never retrieved
future: <Task finished coro=<ConnectionState._delay_ready() done, defined at c:\Users\Jake\AppData\Local\Programs\Python\Python37\lib\site-packages\discord\state.py:286> exception=AttributeError("'bool' object has no attribute 'set'")>
Traceback (most recent call last):
  File "c:\Users\Jake\AppData\Local\Programs\Python\Python37\lib\site-packages\discord\state.py", line 322, in _delay_ready
    self.call_handlers('ready')
  File "c:\Users\Jake\AppData\Local\Programs\Python\Python37\lib\site-packages\discord\state.py", line 139, in call_handlers
    func(*args, **kwargs)
  File "c:\Users\Jake\AppData\Local\Programs\Python\Python37\lib\site-packages\discord\client.py", line 207, in _handle_ready
    self._ready.set()
AttributeError: 'bool' object has no attribute 'set'

After this the bot still appears to be online, and messaging it yields further console text every time a message is sent, whether in a server channel or in a DM:

Ignoring exception in on_message
Traceback (most recent call last):
  File "c:\Users\Jake\AppData\Local\Programs\Python\Python37\lib\site-packages\discord\client.py", line 255, in _run_event
    await coro(*args, **kwargs)
  File "D:\Mirai\armchair-expert-master\connectors\discord.py", line 54, in on_message
    if message.server is None and DISCORD_LEARN_FROM_DIRECT_MESSAGE:
AttributeError: 'Message' object has no attribute 'server'

The bot never sends any messages out to Discord. :(

Thanks,
-I.V.
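
Both tracebacks are characteristic of running the connector against a newer discord.py than it was written for: in the discord.py 1.0 rewrite, Message.server was renamed to Message.guild, which is exactly the attribute the 0.16-era code in connectors/discord.py reaches for. A minimal illustration of the post-1.0 spelling (not the project's code; pinning the older discord.py the project targets is the alternative):

    # discord.py >= 1.0 renamed Message.server to Message.guild; for DMs,
    # message.guild is None. Illustration only.
    def is_direct_message(message) -> bool:
        return message.guild is None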

Implement word embedding neural nets

Multiple embedding networks relating from subject words (NOUN, VERB, HASHTAG) to various PoS

NOUN

NOUN -> ADJ
NOUN -> VERB
NOUN -> NUM
NOUN -> SYM
NOUN -> HASHTAG
NOUN -> EMOJI
NOUN -> URL

VERB

VERB -> ADV
VERB -> NOUN
VERB -> NUM
VERB -> SYM
VERB -> HASHTAG
VERB -> EMOJI
VERB -> URL

HASHTAG

HASHTAG -> NOUN
HASHTAG -> VERB
HASHTAG -> ADV
HASHTAG -> ADJ
HASHTAG -> SYM
HASHTAG -> EMOJI
HASHTAG -> URL
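
A minimal Keras sketch of what one of these pair networks (e.g. NOUN -> ADJ) could look like; vocabulary sizes and dimensions are placeholders, not values from the project:

    from keras.models import Sequential
    from keras.layers import Embedding, Flatten, Dense

    # Illustrative subject -> PoS association network, e.g. NOUN -> ADJ.
    NOUN_VOCAB = 10000   # placeholder
    ADJ_VOCAB = 5000     # placeholder

    model = Sequential()
    model.add(Embedding(input_dim=NOUN_VOCAB, output_dim=64, input_length=1))
    model.add(Flatten())
    model.add(Dense(ADJ_VOCAB, activation='softmax'))
    model.compile(optimizer='adam', loss='categorical_crossentropy')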

Stuck on starting up page

Hello, when I tried running this and using it, it never worked. It was just stuck on this page.
C:\Users\Mohit\PycharmProjects\SuperAIPR\venv\Scripts\python.exe C:/Users/Mohit/PycharmProjects/SuperAIPR/armchair-expert/armchair_expert.py
INFO:ArmchairExpert:Status: STARTING_UP
INFO:ArmchairExpert:Loaded Discord Connector.
INFO:ArmchairExpert:Loading spaCy model
2020-11-10 13:41:58.346187: I tensorflow/stream_executor/platform/default/dso_loader.cc:48] Successfully opened dynamic library cudart64_101.dll
INFO:ArmchairExpert:Training begin
INFO:ArmchairExpert:Training_Preprocessing_Markov(Import)
INFO:ArmchairExpert:Training_Preprocessing_Markov(Discord)
INFO:ArmchairExpert:Training(Markov)
INFO:ArmchairExpert:Training_Preprocessing_Structure(Import)
INFO:ArmchairExpert:Training_Preprocessing_Structure(Discord)
INFO:ArmchairExpert:Training(Structure)
2020-11-10 13:41:58.930290: I tensorflow/stream_executor/platform/default/dso_loader.cc:48] Successfully opened dynamic library cudart64_101.dll
INFO:ArmchairExpert:Training end
INFO:ArmchairExpert:Status: RUNNING
2020-11-10 13:42:08.089287: I tensorflow/stream_executor/platform/default/dso_loader.cc:48] Successfully opened dynamic library nvcuda.dll
2020-11-10 13:42:08.173283: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1716] Found device 0 with properties:
pciBusID: 0000:01:00.0 name: GeForce GTX 680 computeCapability: 3.0
coreClock: 1.0585GHz coreCount: 8 deviceMemorySize: 2.00GiB deviceMemoryBandwidth: 179.05GiB/s
2020-11-10 13:42:08.174598: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1716] Found device 1 with properties:
pciBusID: 0000:20:00.0 name: Quadro P400 computeCapability: 6.1
coreClock: 1.2525GHz coreCount: 2 deviceMemorySize: 2.00GiB deviceMemoryBandwidth: 29.88GiB/s
2020-11-10 13:42:08.174914: I tensorflow/stream_executor/platform/default/dso_loader.cc:48] Successfully opened dynamic library cudart64_101.dll
2020-11-10 13:42:08.234582: I tensorflow/stream_executor/platform/default/dso_loader.cc:48] Successfully opened dynamic library cublas64_10.dll
2020-11-10 13:42:08.263904: I tensorflow/stream_executor/platform/default/dso_loader.cc:48] Successfully opened dynamic library cufft64_10.dll
2020-11-10 13:42:08.272198: I tensorflow/stream_executor/platform/default/dso_loader.cc:48] Successfully opened dynamic library curand64_10.dll
2020-11-10 13:42:08.350489: I tensorflow/stream_executor/platform/default/dso_loader.cc:48] Successfully opened dynamic library cusolver64_10.dll
2020-11-10 13:42:08.376081: I tensorflow/stream_executor/platform/default/dso_loader.cc:48] Successfully opened dynamic library cusparse64_10.dll
2020-11-10 13:42:08.379728: I tensorflow/stream_executor/platform/default/dso_loader.cc:48] Successfully opened dynamic library cudnn64_7.dll
2020-11-10 13:42:08.380804: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1812] Ignoring visible gpu device (device: 0, name: GeForce GTX 680, pci bus id: 0000:01:00.0, compute capability: 3.0) with Cuda compute capability 3.0. The minimum required Cuda capability is 3.5.
2020-11-10 13:42:08.381653: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1843] Ignoring visible gpu device (device: 1, name: Quadro P400, pci bus id: 0000:20:00.0, compute capability: 6.1) with core count: 2. The minimum required count is 8. You can adjust this requirement with the env var TF_MIN_GPU_MULTIPROCESSOR_COUNT.
2020-11-10 13:42:08.432344: I tensorflow/compiler/xla/service/service.cc:168] XLA service 0x1715f876120 initialized for platform Host (this does not guarantee that XLA will be used). Devices:
2020-11-10 13:42:08.432594: I tensorflow/compiler/xla/service/service.cc:176] StreamExecutor device (0): Host, Default Version
2020-11-10 13:42:08.434576: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1257] Device interconnect StreamExecutor with strength 1 edge matrix:
2020-11-10 13:42:08.434815: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1263]
Model: "sequential"


Layer (type) Output Shape Param #

embedding (Embedding) (None, 16, 120) 14400


lstm (LSTM) (None, 128) 127488


dense (Dense) (None, 120) 15480

Total params: 157,368
Trainable params: 157,368
Non-trainable params: 0


2020-11-10 13:42:09.270123: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1716] Found device 0 with properties:
pciBusID: 0000:01:00.0 name: GeForce GTX 680 computeCapability: 3.0
coreClock: 1.0585GHz coreCount: 8 deviceMemorySize: 2.00GiB deviceMemoryBandwidth: 179.05GiB/s
2020-11-10 13:42:09.270433: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1716] Found device 1 with properties:
pciBusID: 0000:20:00.0 name: Quadro P400 computeCapability: 6.1
coreClock: 1.2525GHz coreCount: 2 deviceMemorySize: 2.00GiB deviceMemoryBandwidth: 29.88GiB/s
2020-11-10 13:42:09.272560: I tensorflow/stream_executor/platform/default/dso_loader.cc:48] Successfully opened dynamic library cudart64_101.dll
2020-11-10 13:42:09.273504: I tensorflow/stream_executor/platform/default/dso_loader.cc:48] Successfully opened dynamic library cublas64_10.dll
2020-11-10 13:42:09.275615: I tensorflow/stream_executor/platform/default/dso_loader.cc:48] Successfully opened dynamic library cufft64_10.dll
2020-11-10 13:42:09.275742: I tensorflow/stream_executor/platform/default/dso_loader.cc:48] Successfully opened dynamic library curand64_10.dll
2020-11-10 13:42:09.275866: I tensorflow/stream_executor/platform/default/dso_loader.cc:48] Successfully opened dynamic library cusolver64_10.dll
2020-11-10 13:42:09.275991: I tensorflow/stream_executor/platform/default/dso_loader.cc:48] Successfully opened dynamic library cusparse64_10.dll
2020-11-10 13:42:09.276120: I tensorflow/stream_executor/platform/default/dso_loader.cc:48] Successfully opened dynamic library cudnn64_7.dll
2020-11-10 13:42:09.277873: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1812] Ignoring visible gpu device (device: 0, name: GeForce GTX 680, pci bus id: 0000:01:00.0, compute capability: 3.0) with Cuda compute capability 3.0. The minimum required Cuda capability is 3.5.
2020-11-10 13:42:09.278200: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1843] Ignoring visible gpu device (device: 1, name: Quadro P400, pci bus id: 0000:20:00.0, compute capability: 6.1) with core count: 2. The minimum required count is 8. You can adjust this requirement with the env var TF_MIN_GPU_MULTIPROCESSOR_COUNT.
2020-11-10 13:42:09.279155: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1257] Device interconnect StreamExecutor with strength 1 edge matrix:
2020-11-10 13:42:09.279284: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1263] 0 1
2020-11-10 13:42:09.279366: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1276] 0: N N
2020-11-10 13:42:09.279511: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1276] 1: N N
2020-11-10 13:42:09.281033: I tensorflow/compiler/xla/service/platform_util.cc:139] StreamExecutor cuda device (0) is of insufficient compute capability: 3.5 required, device is 3.0
2020-11-10 13:42:09.282005: I tensorflow/compiler/xla/service/service.cc:168] XLA service 0x17167ef3790 initialized for platform CUDA (this does not guarantee that XLA will be used). Devices:
2020-11-10 13:42:09.282233: I tensorflow/compiler/xla/service/service.cc:176] StreamExecutor device (0): Quadro P400, Compute Capability 6.1
It's been like this for 20 minutes and I don't know what to do.

Discord learning

  • Log data through Discord to a database to use for scheduled training sessions
  • Train the Markov module on new data while running
  • Train the neural nets on new data on startup

After the most recent update, bot only replies with "Huh?"

I abandoned running armchair-expert in a Docker container for now, but thought I'd play with using YouTube captions as a data source. I've managed to load the data OK, but no matter what I say to the bot, it either replies back with a single noun from what I wrote or with a "huh?".
I'd love to hear ideas on how to debug this. Thanks!

Embed sentence terminators in PoS tree

The punctuation marks . (sometimes), !, and ? indicate the end of a sentence. Store these in the PoS tree for use in generation later, so the generated sentence matches the original sentence's classification in both punctuation and PoS. Check whether spaCy can differentiate between a period used in an abbreviation and one used as a sentence terminator.
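
spaCy's sentence segmentation already treats abbreviation periods specially, which can be checked directly; a quick sketch (requires the en_core_web_sm model, example text arbitrary):

    import spacy

    # Check whether spaCy splits on an abbreviation's period; the final token
    # of each sentence is the terminator that would be stored in the PoS tree.
    nlp = spacy.load("en_core_web_sm")
    doc = nlp("Dr. Smith arrived late. What a surprise!")
    for sent in doc.sents:
        print(repr(sent.text), "->", repr(sent[-1].text))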

Improve sentence structure model training time

It looks like the dual-LSTM architecture and hidden size may be complete overkill, causing very slow / ineffective training. Going to test changing the dim to 64 and removing one of the LSTMs entirely.

Also, the number of epochs should be scaled by the amount of training data available.
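
A sketch of the simplified structure model being considered; the 64-unit hidden size and single LSTM come from the note above, while the other shapes are guesses taken from the model summaries elsewhere in this thread:

    from keras.models import Sequential
    from keras.layers import Embedding, LSTM, Dense

    # Proposed simplification sketch: a single 64-unit LSTM instead of the
    # dual-LSTM stack.
    model = Sequential()
    model.add(Embedding(input_dim=120, output_dim=120, input_length=16))
    model.add(LSTM(64))
    model.add(Dense(120, activation='softmax'))
    model.compile(optimizer='rmsprop', loss='categorical_crossentropy')

    # Scale the number of epochs with the amount of training data instead of
    # hardcoding it (constants are arbitrary).
    def epochs_for(num_sequences: int) -> int:
        return 10 + 5 * (num_sequences // 10000)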

Implement automated training

  • All unsupervised training operations should be able to be called in an automated fashion:
    • PoS tree
    • Capitalization
    • Word embeddings
  • If the words in the embedding networks never change, the bot can become stale
  • Moving from an RDBMS model to a neural network model, updating weights based on reactions may not be completely intuitive, and we can't just add new words on the fly
  • The entire neural net should be regenerated for subject words every night
  • Feed lines that generated good reactions multiple times
  • Keep track of line reactions in the lines table, instead of in word, wordassoc, and wordrelation
  • Configurable scheduling (see the sketch below)
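
One way to express the configurable nightly regeneration, using APScheduler purely as an example; the project's own scheduler (if any) isn't shown in this thread, and the function name is a placeholder:

    from apscheduler.schedulers.background import BackgroundScheduler

    # Illustration of a configurable nightly retraining job.
    def retrain_embedding_nets():
        pass  # placeholder for regenerating the subject-word embedding nets

    scheduler = BackgroundScheduler()
    scheduler.add_job(retrain_embedding_nets, "cron", hour=3)  # hour configurable
    scheduler.start()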

Can't get it to work

Okay, I have downloaded all the dependencies for this, but I'm still running into errors.
I have spent hours trying to get this to work, and I have no idea what I did wrong.
If you're wondering, here are the errors.
First the program runs into a syntax error:
[screenshot: syntax error]
Then at the end, it gets this error:
[screenshot: other error]

Replace RDBMS with custom Markov chain db

We have run into the performance ceiling with RDBMS for loading data and generating replies.

Going to look into creating an optimized Markov chain database. Some ideas:

  • Trie Structure
  • Should be able to completely run in RAM even with millions of words
  • Implemented in Cython for speed if necessary
  • Each word ends with a terminator flag as well as a usage count and rating
  • Each word also contains a list of neighbors and their occurrence rates / ratings
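
A sketch of what a node in such a trie might carry, following the bullets above; field names are illustrative, not project code:

    from dataclasses import dataclass, field
    from typing import Dict

    # Illustrative node layout for the proposed in-RAM Markov trie.
    @dataclass
    class TrieNode:
        children: Dict[str, "TrieNode"] = field(default_factory=dict)
        terminator: bool = False       # word can end a sentence here
        count: int = 0                 # usage count
        rating: float = 0.0            # learned rating from reactions
        neighbors: Dict[str, int] = field(default_factory=dict)  # word -> occurrences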

Can't pickle _thread.RLock objects

Hey. I've been trying to set this bot up for a Discord channel. I have very little experience with Python, so this was probably a bit too complex to dive into for me, but I thought I was making good progress. I got all the requirements and dependencies set up and managed to feed the training module a chunk of data that it processed just fine. But when it comes to actually connecting, it throws up this:

Traceback (most recent call last):
  File "C:\DiscordBot\Armchair\armchair_expert.py", line 334, in <module>
    ae.start(retrain_structure=args.retrain_structure, retrain_markov=args.retrain_markov)
  File "C:\DiscordBot\Armchair\armchair_expert.py", line 106, in start
    connector.start()
  File "C:\DiscordBot\Armchair\connectors\connector_common.py", line 122, in start
    self._scheduler.start()
  File "C:\DiscordBot\Armchair\connectors\connector_common.py", line 101, in start
    self._worker.start()
  File "C:\Users\Barrin\Anaconda3\lib\multiprocessing\process.py", line 105, in start
    self._popen = self._Popen(self)
  File "C:\Users\Barrin\Anaconda3\lib\multiprocessing\context.py", line 223, in _Popen
    return _default_context.get_context().Process._Popen(process_obj)
  File "C:\Users\Barrin\Anaconda3\lib\multiprocessing\context.py", line 322, in _Popen
    return Popen(process_obj)
  File "C:\Users\Barrin\Anaconda3\lib\multiprocessing\popen_spawn_win32.py", line 65, in __init__
    reduction.dump(process_obj, to_child)
  File "C:\Users\Barrin\Anaconda3\lib\multiprocessing\reduction.py", line 60, in dump
    ForkingPickler(file, protocol).dump(obj)
TypeError: can't pickle _thread.RLock objects
Traceback (most recent call last):
  File "<string>", line 1, in <module>
  File "C:\Users\Barrin\Anaconda3\lib\multiprocessing\spawn.py", line 105, in spawn_main
    exitcode = _main(fd)
  File "C:\Users\Barrin\Anaconda3\lib\multiprocessing\spawn.py", line 115, in _main
    self = reduction.pickle.load(from_parent)
EOFError: Ran out of input
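
On Windows, multiprocessing uses the spawn start method, so everything passed to the worker Process has to be pickled; any object that holds a lock (loaded models, schedulers, DB handles) fails exactly like this. A tiny standalone reproduction, unrelated to the bot's own code:

    import multiprocessing
    import threading

    class Holder:
        def __init__(self):
            self.lock = threading.RLock()  # RLock instances cannot be pickled

    def work(holder):
        pass

    if __name__ == "__main__":
        multiprocessing.set_start_method("spawn", force=True)
        p = multiprocessing.Process(target=work, args=(Holder(),))
        p.start()  # raises: cannot pickle '_thread.RLock' object
        p.join()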

Twitter learning

Add a thread to the Twitter frontend which pulls and stores new data for learning.

error

Traceback (most recent call last):
  File "C:\Users\Admin\CBL.tech\MARKPVNBOT 2\FTBot-master\ftbot_discord.py", line 1, in <module>
    from ftbot import *
  File "C:\Users\Admin\CBL.tech\MARKPVNBOT 2\FTBot-master\ftbot.py", line 2, in <module>
    from search import *
  File "C:\Users\Admin\CBL.tech\MARKPVNBOT 2\FTBot-master\search.py", line 1, in <module>
    from googleapiclient.discovery import build
ImportError: No module named 'googleapiclient'
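
The googleapiclient module comes from the google-api-python-client package, so installing it with pip (pip install google-api-python-client) should resolve this particular ImportError; note that the traceback is from FTBot rather than armchair-expert itself.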

Can't start up the chatbot

[screenshot]
I don't know what the problem is. I did everything as described in the readme: I installed the packages and did the discord.py setup.

Log important events

Implement some sort of logging framework, especially for exceptions and debug information.

Persistent per-channel configuration

Allow replyrate, shutup, and wakeup to be set on a per-channel basis for chat programs like Discord, Slack, IRC, etc. Add source-level configuration for private messages / DMs. Create DB tables for source (Discord, Slack, Twitter, etc.), server (where applicable), and channel. Source and server can both store private-message settings. The server will be checked first, and if it does not exist or has NULL entries for the options, the settings are loaded from the source instead.
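
A small sketch of the fallback lookup described above; the storage layer and names are invented for illustration, and the channel level is included only as one possible reading of the scheme:

    # Illustration only: resolve an option (e.g. replyrate) per channel,
    # falling back to the server and then to the source when unset.
    # "settings" stands in for whatever DB-backed mapping holds these values.
    def resolve_option(settings, source, server, channel, name):
        for scope in ((source, server, channel), (source, server, None), (source, None, None)):
            value = settings.get((scope, name))
            if value is not None:
                return value
        return None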

Can't merge non-disjoint spans. '🇦' is already part of tokens to merge.

I downloaded and updated all the modules that armchair requires. I imported my chat log using your text import tool. It works fine. But I get this after I start the bot.

INFO:ArmchairExpert:Training_Preprocessing_Markov(Import): 25.458330%
Traceback (most recent call last):
  File "armchair_expert.py", line 344, in <module>
    ae.start(retrain_structure=args.retrain_structure, retrain_markov=args.retrain_markov)
  File "armchair_expert.py", line 98, in start
    self.train(retrain_structure=True, retrain_markov=retrain_markov)
  File "armchair_expert.py", line 273, in train
    self._train_markov(retrain_markov)
  File "armchair_expert.py", line 223, in _train_markov
    spacy_preprocessor = self._preprocess_markov_data(all_training_data=retrain)
  File "armchair_expert.py", line 180, in _preprocess_markov_data
    doc = self._nlp(MarkovFilters.filter_input(message[0].decode()))
  File "/opt/rh/rh-python36/root/usr/lib64/python3.6/site-packages/spacy/language.py", line 402, in __call__
    doc = proc(doc, **component_cfg.get(name, {}))
  File "/opt/rh/rh-python36/root/usr/lib/python3.6/site-packages/spacymoji/__init__.py", line 87, in __call__
    retokenizer.merge(span)
  File "_retokenize.pyx", line 56, in spacy.tokens._retokenize.Retokenizer.merge
ValueError: [E102] Can't merge non-disjoint spans. '🇦' is already part of tokens to merge.

No invitation link with discord connector

I built TensorFlow (r2.0) from source to add AVX2 FMA CPU support (and because the basic pip installation wasn't working either).

Here is what I got when I start python armchair_expert.py --retrain-structure (running in a Docker Ubuntu 18.04 container):

INFO:ArmchairExpert:Status: STARTING_UP
/usr/local/lib/python3.6/dist-packages/tensorflow/python/framework/dtypes.py:516: FutureWarning: Passing (type, 1) or '1type' as a synonym of type is deprecated; in a future version of numpy, it will be understood as (type, (1,)) / '(1,)type'.
_np_qint8 = np.dtype([("qint8", np.int8, 1)])
/usr/local/lib/python3.6/dist-packages/tensorflow/python/framework/dtypes.py:517: FutureWarning: Passing (type, 1) or '1type' as a synonym of type is deprecated; in a future version of numpy, it will be understood as (type, (1,)) / '(1,)type'.
_np_quint8 = np.dtype([("quint8", np.uint8, 1)])
/usr/local/lib/python3.6/dist-packages/tensorflow/python/framework/dtypes.py:518: FutureWarning: Passing (type, 1) or '1type' as a synonym of type is deprecated; in a future version of numpy, it will be understood as (type, (1,)) / '(1,)type'.
_np_qint16 = np.dtype([("qint16", np.int16, 1)])
/usr/local/lib/python3.6/dist-packages/tensorflow/python/framework/dtypes.py:519: FutureWarning: Passing (type, 1) or '1type' as a synonym of type is deprecated; in a future version of numpy, it will be understood as (type, (1,)) / '(1,)type'.
_np_quint16 = np.dtype([("quint16", np.uint16, 1)])
/usr/local/lib/python3.6/dist-packages/tensorflow/python/framework/dtypes.py:520: FutureWarning: Passing (type, 1) or '1type' as a synonym of type is deprecated; in a future version of numpy, it will be understood as (type, (1,)) / '(1,)type'.
_np_qint32 = np.dtype([("qint32", np.int32, 1)])
/usr/local/lib/python3.6/dist-packages/tensorflow/python/framework/dtypes.py:525: FutureWarning: Passing (type, 1) or '1type' as a synonym of type is deprecated; in a future version of numpy, it will be understood as (type, (1,)) / '(1,)type'.
np_resource = np.dtype([("resource", np.ubyte, 1)])
/usr/local/lib/python3.6/dist-packages/tensorboard/compat/tensorflow_stub/dtypes.py:541: FutureWarning: Passing (type, 1) or '1type' as a synonym of type is deprecated; in a future version of numpy, it will be understood as (type, (1,)) / '(1,)type'.
_np_qint8 = np.dtype([("qint8", np.int8, 1)])
/usr/local/lib/python3.6/dist-packages/tensorboard/compat/tensorflow_stub/dtypes.py:542: FutureWarning: Passing (type, 1) or '1type' as a synonym of type is deprecated; in a future version of numpy, it will be understood as (type, (1,)) / '(1,)type'.
_np_quint8 = np.dtype([("quint8", np.uint8, 1)])
/usr/local/lib/python3.6/dist-packages/tensorboard/compat/tensorflow_stub/dtypes.py:543: FutureWarning: Passing (type, 1) or '1type' as a synonym of type is deprecated; in a future version of numpy, it will be understood as (type, (1,)) / '(1,)type'.
_np_qint16 = np.dtype([("qint16", np.int16, 1)])
/usr/local/lib/python3.6/dist-packages/tensorboard/compat/tensorflow_stub/dtypes.py:544: FutureWarning: Passing (type, 1) or '1type' as a synonym of type is deprecated; in a future version of numpy, it will be understood as (type, (1,)) / '(1,)type'.
_np_quint16 = np.dtype([("quint16", np.uint16, 1)])
/usr/local/lib/python3.6/dist-packages/tensorboard/compat/tensorflow_stub/dtypes.py:545: FutureWarning: Passing (type, 1) or '1type' as a synonym of type is deprecated; in a future version of numpy, it will be understood as (type, (1,)) / '(1,)type'.
_np_qint32 = np.dtype([("qint32", np.int32, 1)])
/usr/local/lib/python3.6/dist-packages/tensorboard/compat/tensorflow_stub/dtypes.py:550: FutureWarning: Passing (type, 1) or '1type' as a synonym of type is deprecated; in a future version of numpy, it will be understood as (type, (1,)) / '(1,)type'.
np_resource = np.dtype([("resource", np.ubyte, 1)])
Using TensorFlow backend.
WARNING: Logging before flag parsing goes to stderr.
W0805 14:12:44.053263 139799969265472 deprecation_wrapper.py:118] From /usr/local/lib/python3.6/dist-packages/keras/backend/tensorflow_backend.py:74: The name tf.get_default_graph is deprecated. Please use tf.compat.v1.get_default_graph instead.

W0805 14:12:44.061141 139799969265472 deprecation_wrapper.py:118] From /usr/local/lib/python3.6/dist-packages/keras/backend/tensorflow_backend.py:517: The name tf.placeholder is deprecated. Please use tf.compat.v1.placeholder instead.

W0805 14:12:44.062849 139799969265472 deprecation_wrapper.py:118] From /usr/local/lib/python3.6/dist-packages/keras/backend/tensorflow_backend.py:4138: The name tf.random_uniform is deprecated. Please use tf.random.uniform instead.

W0805 14:12:44.131823 139799969265472 deprecation_wrapper.py:118] From /usr/local/lib/python3.6/dist-packages/keras/backend/tensorflow_backend.py:133: The name tf.placeholder_with_default is deprecated. Please use tf.compat.v1.placeholder_with_default instead.

W0805 14:12:44.137647 139799969265472 deprecation.py:506] From /usr/local/lib/python3.6/dist-packages/keras/backend/tensorflow_backend.py:3445: calling dropout (from tensorflow.python.ops.nn_ops) with keep_prob is deprecated and will be removed in a future version.
Instructions for updating:
Please use rate instead of keep_prob. Rate should be set to rate = 1 - keep_prob.
W0805 14:12:44.326126 139799969265472 deprecation_wrapper.py:118] From /usr/local/lib/python3.6/dist-packages/keras/optimizers.py:790: The name tf.train.Optimizer is deprecated. Please use tf.compat.v1.train.Optimizer instead.

W0805 14:12:44.340184 139799969265472 deprecation_wrapper.py:118] From /usr/local/lib/python3.6/dist-packages/keras/backend/tensorflow_backend.py:3341: The name tf.log is deprecated. Please use tf.math.log instead.

2019-08-05 14:12:44.352372: I tensorflow/core/platform/profile_utils/cpu_utils.cc:94] CPU Frequency: 3795880000 Hz
2019-08-05 14:12:44.352505: I tensorflow/compiler/xla/service/service.cc:168] XLA service 0x1cbc580 executing computations on platform Host. Devices:
2019-08-05 14:12:44.352545: I tensorflow/compiler/xla/service/service.cc:175] StreamExecutor device (0): ,

Improve data storage performance

The training databases should only be indexed on whether the data is learned yet or not. Having a ton of (mostly unused) indexes makes inserts slower for no good reason.

Refactor training script names

We should have a group of scripts for importing data into the DB lines table, and a separate group of scripts for training the neural networks. Some scripts which import data into the DB lines table could use arguments instead of hardcoded paths.

Refactor database code to store PoS not handled in word embeddings

Since we are implementing word embedding neural nets in place of a relational database for most learning and generation, have the database take care of things we don't need to have running in a neural net. The engine can use SQLite again, and separate tables can be created for each PoS we need to store, to keep performance good.

The new tables should just contain id, text, and count fields. The sum(count) for calculating p-values can be calculated on startup and managed in memory instead of querying the database for it every time.
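
A small sketch of that table shape and the startup sum, using sqlite3 directly; the schema and table name are invented to match the description above:

    import sqlite3

    # Illustrative per-PoS table layout (id, text, count) and the startup
    # total used for p-value normalization.
    conn = sqlite3.connect("markov_pos.db")
    conn.execute(
        "CREATE TABLE IF NOT EXISTS adv (id INTEGER PRIMARY KEY, text TEXT, count INTEGER)"
    )

    # Compute sum(count) once at startup and keep it in memory.
    total = conn.execute("SELECT COALESCE(SUM(count), 0) FROM adv").fetchone()[0]

    def p_value(word_count: int) -> float:
        return word_count / total if total else 0.0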

Web API Connector

Hey, I know this project is old, but would it be possible to add a web server you can post messages to and get a response from, and have it learn from all the messages?
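
A minimal Flask sketch of such an endpoint; generate_reply and learn are placeholders for whatever the bot's connector layer would actually expose, so this is only an outline of the idea:

    from flask import Flask, request, jsonify

    app = Flask(__name__)

    # Placeholder hooks; a real connector would call into armchair-expert's
    # reply generator and training pipeline here.
    def generate_reply(text: str) -> str:
        return ""

    def learn(text: str) -> None:
        pass

    @app.route("/message", methods=["POST"])
    def message():
        text = request.get_json(force=True).get("text", "")
        learn(text)
        return jsonify({"reply": generate_reply(text)})

    if __name__ == "__main__":
        app.run(port=8080)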
