llu13701 / sms_analysis
A stats analysis on how much you and your partner text to each other
The README says to download the model with: python -m spacy download en_vectors_web_lg
However, I think spaCy has changed its naming convention, according to this GitHub post.
I think the download command should be changed to: python -m spacy download en_core_web_lg
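Until the README is updated, the script could try both names. This is only a sketch: load_first_available and its load parameter are hypothetical names that do not exist in the repo, and it assumes spacy.load raises OSError when a model package is not installed.

```python
from typing import Any, Callable, Optional, Sequence, Tuple


def load_first_available(
    names: Sequence[str],
    load: Optional[Callable[[str], Any]] = None,
) -> Tuple[str, Any]:
    """Return (name, model) for the first model name that loads.

    `load` defaults to spacy.load; it is a parameter only so the
    function can be exercised without spaCy installed (hypothetical helper).
    """
    if load is None:
        import spacy  # imported lazily so the module itself has no hard dependency

        load = spacy.load

    last_error: Optional[Exception] = None
    for name in names:
        try:
            return name, load(name)
        except OSError as err:  # assumption: a missing model surfaces as OSError
            last_error = err
    raise OSError(f"none of the models {list(names)} could be loaded") from last_error
```

Usage would be something like name, nlp = load_first_available(["en_core_web_lg", "en_vectors_web_lg"]), so either naming works depending on what is installed.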
/usr/local/bin/python3.9 /Users/xxx/code/persoonlijk/sms_analysis/simple_stats.py
Some weights of the model checkpoint at bert-base-uncased were not used when initializing BertForNextSentencePrediction: ['cls.predictions.bias', 'cls.predictions.transform.dense.weight', 'cls.predictions.transform.dense.bias', 'cls.predictions.decoder.weight', 'cls.predictions.transform.LayerNorm.weight', 'cls.predictions.transform.LayerNorm.bias']
- This IS expected if you are initializing BertForNextSentencePrediction from the checkpoint of a model trained on another task or with another architecture (e.g. initializing a BertForSequenceClassification model from a BertForPreTraining model).
- This IS NOT expected if you are initializing BertForNextSentencePrediction from the checkpoint of a model that you expect to be exactly identical (initializing a BertForSequenceClassification model from a BertForSequenceClassification model).
Enter your whatapp chat filename (ending in txt): WhatsApp-chat met xxx.txt
Please enter your whatsapp name: Xxxx
try whatsapp processing
file date_format is %d/%m/%y, %I:%M:%S %p
try messenger processing
date_format for the file is %d/%m/%y, %I:%M:%S %p
something is wrong with the file
cleaning messanger data
finish processing
Traceback (most recent call last):
  File "/Users/xxx/code/persoonlijk/sms_analysis/simple_stats.py", line 508, in <module>
    stats_collections()
  File "/Users/xxx/code/persoonlijk/sms_analysis/simple_stats.py", line 485, in stats_collections
    generate_master_summary(pd_text)
  File "/Users/xxx/code/persoonlijk/sms_analysis/simple_stats.py", line 283, in generate_master_summary
    custom_stopwords=identify_custom_stopwords(list_of_entire_text)
  File "/Users/xxx/code/persoonlijk/sms_analysis/incoming_outgoing_msg.py", line 52, in identify_custom_stopwords
    one_stdev=statistics.mean(diff)-0.7*statistics.stdev(diff)
  File "/usr/local/Cellar/python@3.9/3.9.0_2/Frameworks/Python.framework/Versions/3.9/lib/python3.9/statistics.py", line 797, in stdev
    var = variance(data, xbar)
  File "/usr/local/Cellar/python@3.9/3.9.0_2/Frameworks/Python.framework/Versions/3.9/lib/python3.9/statistics.py", line 739, in variance
    raise StatisticsError('variance requires at least two data points')
statistics.StatisticsError: variance requires at least two data points
Process finished with exit code 1
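The crash comes from statistics.stdev being called on fewer than two values. A possible guard is sketched below. It assumes diff is a plain sequence of numbers (how it is built in incoming_outgoing_msg.py is not shown here), and safe_threshold is a hypothetical helper name, not something from the repo. Falling back to the mean for a single value is also an assumption about what the threshold should mean in that case.

```python
import statistics
from typing import Optional, Sequence


def safe_threshold(diff: Sequence[float], factor: float = 0.7) -> Optional[float]:
    """Compute mean(diff) - factor * stdev(diff) without raising on tiny inputs."""
    if len(diff) == 0:
        # No data at all: the caller has to decide what an empty chat means.
        return None
    if len(diff) < 2:
        # stdev is undefined for one point; treat the spread as zero (assumption).
        return statistics.mean(diff)
    return statistics.mean(diff) - factor * statistics.stdev(diff)
```

Since the log above also prints "something is wrong with the file", the empty or near-empty input may itself be a symptom of the chat file not being parsed, so the guard would hide the crash rather than fix the parsing.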
Hello! The second GitHub link in the README seems to be broken.
Thanks for making this. I'm trying to wrap my head around it now.
If the user's conversation includes certain characters, the following error appears:
UnicodeDecodeError: 'charmap' codec can't decode byte X in position X: character maps to <undefined>
The issue stems from the input_into_list(file_name) function in preprocessing_script.py.
Changing a_file = open(file_name, "r") to a_file = open(file_name, "r", encoding="utf8") fixes the issue.
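For reference, a minimal sketch of reading the export with an explicit encoding. read_chat_lines is a hypothetical stand-in for input_into_list, whose real body is not shown here, and errors="replace" is an optional extra that swaps undecodable bytes for U+FFFD instead of raising.

```python
from typing import List


def read_chat_lines(file_name: str, encoding: str = "utf-8") -> List[str]:
    """Read a chat export as text without relying on the platform default codec."""
    # Without encoding=..., open() uses the locale's codec (cp1252 on many
    # Windows setups), which is where the 'charmap' error comes from.
    with open(file_name, "r", encoding=encoding, errors="replace") as a_file:
        return a_file.read().splitlines()
```

Using a with block also closes the file even if processing fails part way through.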
For simpler installation, the repo should have a requirements.txt file that lists each non-standard pip module it needs, so that one can simply run pip install -r requirements.txt
The file I have in mind would look like:
pandas
matplotlib
emoji
spacy
nltk
transformers
torch
https://github.com/explosion/spacy-models/releases/download/en_core_web_sm-2.3.1/en_core_web_sm-2.3.1.tar.gz#egg=en_core_web_sm
https://github.com/explosion/spacy-models/releases/download/en_vectors_web_lg-2.3.1/en_vectors_web_lg-2.3.1.tar.gz#egg=en_vectors_web_lg