10acad / git-get-started

License: MIT License

Jupyter Notebook 100.00%

git-get-started's Issues

clean_tweets defined but not used

The clean_tweets function should be used to clean the tweets.

Change and add as below:

# calculate sentiment
filtered_tweet = self.clean_tweets(status['text'])
blob = TextBlob(filtered_tweet)
Sentiment = blob.sentiment
polarity = Sentiment.polarity
subjectivity = Sentiment.subjectivity

tweet preprocessing was incorrectly used in clean_tweet method

The preprocessing module was imported as py but was used as p.clean() when cleaning the Twitter data.

Name error

#page attribute in tweepy.cursor and iteration
for page in tweepy.Cursor(api.search, q=keyword,count=200, include_rts=False):

NameError: name 'api' is not defined.

We can fix it by changing api.search to self.api.search.
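
For anyone unsure why the bare name fails, here is a minimal sketch of the difference. FakeApi and the method names broken and fixed are hypothetical stand-ins, not code from the notebook; only the api / self.api distinction mirrors the issue.

```python
class FakeApi:
    """Stand-in for a tweepy.API object (hypothetical, for illustration only)."""

    def search(self, q):
        return [f"tweet about {q}"]


class TweetSearch:
    def __init__(self, api):
        # Store the client on the instance so every method can reach it.
        self.api = api

    def broken(self, keyword):
        # Bare `api` is looked up as a local/global name, not as an attribute.
        return api.search(q=keyword)

    def fixed(self, keyword):
        # `self.api` resolves to the attribute set in __init__.
        return self.api.search(q=keyword)


ts = TweetSearch(FakeApi())

try:
    ts.broken("covid19")
except NameError as err:
    error_message = str(err)

result = ts.fixed("covid19")
```

The broken method only works if a module-level variable called api happens to exist, which is why the error may not show up in every notebook session.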

api is not defined

To resolve this issue, add the line api = tweepy.API(auth) before using api.search.

The filtered_tweet variable (line 134) is used in the get_tweet function of the tweet_search class, but it is only defined in the clean_tweets function, i.e. it has not been defined in the get_tweet function.

Failed to import csv and re modules

'Cursor' object is not iterable

The error 'Cursor' object is not iterable occurs on the following line:
for page in tweepy.Cursor(self.api.search, q=keyword,count=200, include_rts=False):
So we add .pages() to go through all the pages, or .pages(n) to go through a required number of pages.

for page in tweepy.Cursor(self.api.search, q=keyword,count=200, include_rts=False).pages():
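
To illustrate why the bare object fails in a for loop while .pages() works, here is a simplified sketch. PagedCursor is a hypothetical stand-in written for this comment; it is not tweepy's implementation, and the limit parameter is only meant to mimic the idea of capping the number of pages.

```python
class PagedCursor:
    """Simplified stand-in for a paginating cursor (not tweepy's real code)."""

    def __init__(self, items, page_size):
        self.items = items
        self.page_size = page_size

    def pages(self, limit=None):
        # Yield successive pages; stop early once `limit` pages were produced.
        produced = 0
        for start in range(0, len(self.items), self.page_size):
            if limit is not None and produced >= limit:
                return
            yield self.items[start:start + self.page_size]
            produced += 1


cursor = PagedCursor(list(range(7)), page_size=3)

try:
    for page in cursor:  # no __iter__ defined, so this raises TypeError
        pass
except TypeError as err:
    error_message = str(err)

all_pages = list(cursor.pages())
first_two = list(cursor.pages(limit=2))
```

The object itself only holds the query; the iterator comes from calling the method, which is the same shape as the fix above.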

Class not working

The class is not working for the following reasons:

The "self" keyword was not defined in the __init__ method.

It does not have a "get_data()" method, only "get_tweets()".

The file "data/ethiopia_covid19_23june2020.json" does not exist in the directory, but "covid19_23june2020.json" does.

A CSV file was parsed instead of a JSON file.

import & references Issues identified

Inconsistent use of variable names

The variable name given to the imported preprocessor library is inconsistent. ppr was used in the import, but p was used to call the clean method.

import preprocessor as ppr
#use preprocessor
>> tweet = p.clean(twitter_text)
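
The same mismatch can be reproduced with any module. The sketch below uses the standard json module instead of preprocessor so it runs without extra installs; the aliases js and j are made up for the example.

```python
import json as js  # the alias chosen here is the only name that gets bound

payload = {"text": "hello"}

try:
    encoded = j.dumps(payload)  # wrong alias: `j` was never defined
except NameError as err:
    error_message = str(err)

encoded = js.dumps(payload)  # matches the alias used in the import
```

Either rename the alias in the import or rename every call site; what matters is that both use the same name.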

Use of the regex library without an initial import

The re module was used to compile regex patterns but was never imported.

We have to add import re.

Calling an unavailable class method on a class object

The method get_data in ts.get_data is not available in the class tweetsearch

df = ts.get_data(covid_keywords, csvfile=tweets_file)
It appears the intended method is get_tweets.
Hence we ought to have:

df = ts.get_tweets(covid_keywords, csvfile=tweets_file)
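
A quick way to confirm which methods a class actually offers is to list them. This is a minimal sketch with a hypothetical skeleton class; only the method name get_tweets is taken from the issue, and the body is a placeholder.

```python
class TweetSearch:
    """Hypothetical skeleton; only the method name mirrors the issue."""

    def get_tweets(self, keyword, csvfile=None):
        return f"searching for {keyword}"


ts = TweetSearch()

try:
    ts.get_data("covid19")
except AttributeError as err:
    error_message = str(err)

# List the public callables to find the intended method name.
public_methods = [
    name for name in dir(ts)
    if callable(getattr(ts, name)) and not name.startswith("_")
]
result = ts.get_tweets("covid19")
```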

Missing modules.

Some modules have either not been imported or their aliases do not match what is in the code.

Examples

preprocessor: aliased as ppr during import but referenced as p in the tweetSearch() class
string import missing.

Wrong Instance

get_tweets should be called in place of get_data which was used in the code as nothing like get_data exists in the class

Tweetfile name variable was changed

The filename in the tweetfile variable was changed because it was not consistent with the filename sent to us on the Slack channel.

The tweetfile variable used ethiopia/........ but the filename didn't contain the word ethiopia.

TweepError: Failed to send request: Only unicode objects are escapable. Got None of type <class 'NoneType'>.

please help me fix this.
Thanks
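
One possible cause, stated as an assumption rather than a diagnosis: this message usually means one of the credentials passed to the auth handler is None, for example because an environment variable was not set. A minimal sketch of a check that fails early with a clearer message is below. The variable names in REQUIRED_KEYS and the helper load_credentials are hypothetical; use whatever names your notebook reads.

```python
import os

# Hypothetical variable names; use whatever names your notebook reads.
REQUIRED_KEYS = (
    "TWITTER_API_KEY",
    "TWITTER_API_SECRET",
    "TWITTER_ACCESS_TOKEN",
    "TWITTER_ACCESS_TOKEN_SECRET",
)


def load_credentials(env=None):
    """Return the credentials as a dict, or raise if any is unset or empty."""
    env = os.environ if env is None else env
    missing = [key for key in REQUIRED_KEYS if not env.get(key)]
    if missing:
        raise ValueError("Missing Twitter credentials: " + ", ".join(missing))
    return {key: env[key] for key in REQUIRED_KEYS}


# Demonstration with plain dicts standing in for the environment.
complete = {key: "dummy-value" for key in REQUIRED_KEYS}
credentials = load_credentials(complete)

try:
    load_credentials({"TWITTER_API_KEY": "dummy-value"})
except ValueError as err:
    error_message = str(err)
```

Running a check like this before creating the auth object should show which value is missing, if that is indeed the problem.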

TweetSearch method called does not exist - get_tweets should be called instead of get_data

fixed the call to TweetSearch.get_data() to TweetSearch.get_tweets

Class Error

No self parameter is passed to the __init__ function of the tweetsearch class.

The __init__ constructor is missing the parameter 'self':

def __init__(cols=None,auth=None):
    if not cols is None:
        self.cols = cols

In line 7, the self parameter is missing.
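
A minimal sketch of what goes wrong and one way to fix it. The classes Broken and Fixed are hypothetical; only the parameter names cols and auth come from the issue, and the empty-list default is an assumption.

```python
class Broken:
    def __init__(cols=None, auth=None):  # `self` is missing
        pass


class Fixed:
    def __init__(self, cols=None, auth=None):
        # Fall back to an empty list when no columns are supplied (assumed default).
        self.cols = cols if cols is not None else []
        self.auth = auth


try:
    # Python passes the new instance as the first positional argument,
    # so it collides with the `cols` keyword.
    Broken(cols=["text"], auth="token")
except TypeError as err:
    error_message = str(err)

fixed = Fixed(cols=["text"])
```

Called with no arguments, the broken version would instead bind the instance to cols and only fail later, at the first use of self inside the body.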

Stream data and save it to file

In class tweetsearch(), the api cannot be accessed by the function get_tweets().
The number of tweets saved from the available data was 1000.

json.decoder.JSONDecodeError: Extra data: line 2 column 1

Resolved this issue by parsing each line of the file as its own JSON document:
import json

tweets = []
with open('covid19_23june2020.json', 'r') as f:
    for line in f:
        tweets.append(json.loads(line))
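
For context, the error appears when a file holds one JSON object per line and is parsed as a single document. The sketch below reproduces it with an in-memory string; the two sample records are invented for the example.

```python
import json

# Two JSON objects, one per line (the "JSON Lines" layout).
raw = '{"id": 1, "text": "first"}\n{"id": 2, "text": "second"}\n'

try:
    json.loads(raw)  # parsing the whole text as one document fails
except json.JSONDecodeError as err:
    error_message = str(err)

# Parse each non-empty line as its own JSON document.
tweets = [json.loads(line) for line in raw.splitlines() if line.strip()]
```

Skipping blank lines avoids a second error if the file ends with an empty line.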

Error in xyz

line 6 xyz should be zyx because

The "__init__" constructor did not have the "self" parameter

There was no self parameter for the __init__ constructor.
There was also no concluding else statement for one of the if clauses in the __init__ constructor.

Name Error

Sqlite_connection not defined

Installing Tweepy

The snippet above gives the error: No module named 'tweepy'.
The code should explain the purpose of the commented-out pip installations.

TypeError: 'Cursor' object is not iterable

This was fixed by adding .pages() to this line of code:
#page attribute in tweepy.cursor and iteration
for page in tweepy.Cursor(self.api.search,q=keyword,count=200, include_rts=False).pages():

Error in defined function def get_tweets()

The method get_tweets(self,keyword,csvfile=None) uses api.search, which appears in the following line of the twitter_mining code:
#page attribute in tweepy.cursor and iteration
for page in tweepy.Cursor(api.search, q=keyword,count=200, include_rts=False):

The correct reference is stated below:
#page attribute in tweepy.cursor and iteration
for page in tweepy.Cursor(self.api.search, q=keyword,count=200, include_rts=False):

Reading the json file

When using the pandas read_json function to read in the JSON file:

Incorrect syntax:

if not csvfile is None:
    #If the file exists, then read the existing data from the CSV file.
    if os.path.exists(csvfile):
        df = pd.read_json('covid19_23june2020.json')

The code above gives the trailing data error. You can simply add lines=True as shown below.

Correct syntax:

if not csvfile is None:
    #If the file exists, then read the existing data from the CSV file.
    if os.path.exists(csvfile):
        df = pd.read_json('covid19_23june2020.json', lines=True)

Data folder not found

Cursor Object

The cursor object is used to loop through data.
Therefore we append .pages() to the end of the line since we are looping through pages.

Class not working

Nice description of the problem and error you found

Installing other library

The following installs were not found in the notebook:
!pip install textblob
!pip install preprocessor

Aside from !pip install tweepy, which was already given.
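
Before running the notebook, it can help to check which modules are importable. This is a minimal sketch using the standard importlib module; the helper missing_modules is hypothetical, and the list of names is an assumption based on the imports mentioned in these issues.

```python
import importlib.util


def missing_modules(names):
    """Return the top-level module names that cannot be found for import."""
    return [name for name in names if importlib.util.find_spec(name) is None]


# Assumed list of modules the notebook relies on.
needed = ["re", "string", "csv", "json", "tweepy", "textblob", "preprocessor", "nltk"]
missing = missing_modules(needed)
print("Modules still to install:", missing)
```

Note that the name used in import and the name used with pip install are not always identical, so check each package's install instructions.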

Missing files

Hi,
When I try to run the Stream part of the code, I get the error "No such file or directory", and I am having trouble adding the JSON file path.
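
One way to narrow this down is to check which candidate locations actually contain the file before opening it. The sketch below is a minimal example; the helper resolve_data_file is hypothetical, and the demonstration builds a temporary folder so it can run anywhere. The two filenames are the ones mentioned in the other issues here.

```python
import tempfile
from pathlib import Path


def resolve_data_file(name, search_dirs):
    """Return the first existing path for `name` among `search_dirs`."""
    for directory in search_dirs:
        candidate = Path(directory) / name
        if candidate.is_file():
            return candidate
    searched = ", ".join(str(d) for d in search_dirs)
    raise FileNotFoundError(f"{name} not found in: {searched}")


with tempfile.TemporaryDirectory() as tmp:
    # Build a throwaway "data" folder holding one file.
    data_dir = Path(tmp) / "data"
    data_dir.mkdir()
    (data_dir / "covid19_23june2020.json").write_text("{}\n")

    found = resolve_data_file("covid19_23june2020.json", [tmp, data_dir])
    try:
        resolve_data_file("ethiopia_covid19_23june2020.json", [tmp, data_dir])
    except FileNotFoundError as err:
        error_message = str(err)
```

In the notebook you would pass your own folders, for example the working directory and its data subfolder.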

cursor object and iteration and clean_tweets function

In the Cursor call, the argument is not correctly named, .pages() is not called for iteration, and the clean_tweets function is never called.

Error resolve

After debugging the code, I corrected the crucial parts that could be fixed immediately, but when running the code I keep getting ValueError: Only unicode objects are escapable. Got None of type <class 'NoneType'>.

Twitter Mining errors

The constructor of class tweetsearch is not defined with self.
tweetsearch has no method get_data; replace the call with get_tweets.
Replace for page in tweepy.Cursor(api.search, q=keyword,count=200, include_rts=False): with for page in tweepy.Cursor(self.api.search, q=keyword,count=200, include_rts=False).pages():
Change tweet = p.clean(twitter_text) to tweet = ppr.clean(twitter_text).
Change clean_tweets to a @staticmethod, since it does not require an instance of the class.
Fix the filtered words by adding clean_text = status['text'] and filtered_tweet = self.clean_tweets(clean_text).
Import the missing libraries:
import re  # regular expression
nltk.download('punkt')
from nltk.tokenize import word_tokenize
Add Twitter credentials to StdOutListener.
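
On the @staticmethod point in the list above, here is a minimal sketch of what that looks like. The cleaning rules are a simplified stand-in built only on re and string; they are an assumption for illustration and do not reproduce the notebook's preprocessor-based cleaning.

```python
import re
import string


class TweetSearch:
    @staticmethod
    def clean_tweets(text):
        """Strip URLs, mentions and punctuation (a simplified stand-in)."""
        text = re.sub(r"https?://\S+", "", text)   # drop links
        text = re.sub(r"@\w+", "", text)            # drop @mentions
        text = text.translate(str.maketrans("", "", string.punctuation))
        return " ".join(text.split())               # collapse whitespace


# Callable on the class or on an instance; no `self` is involved.
from_class = TweetSearch.clean_tweets("Hello @user, see https://t.co/abc !!")
from_instance = TweetSearch().clean_tweets("Stay safe, #covid19")
```

Because the method is static, self.clean_tweets(...) inside get_tweets still works, and the function can also be tested without creating an authenticated object.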

ModuleNotFoundError: No module named 'tweetpy'

No module named 'tweetpy'

Failing to load 'twitter_mining.ipynb' in Jupyter

It says 'file is not in Json format'

ValueError: when reading in the json file

Some important libraries were not imported

Some libraries needed for cleaning the tweet data were not imported. Also, the cleaning function was not referenced correctly: it should have been called through self, but it was not.

'cursor' object is not iterable

107 
108         #page attribute in tweepy.cursor and iteration
109         for page in tweepy.Cursor(self.api.search, q=keyword,count=200, include_rts=False):

line 109 raised the error and it can be fixed with
for page in tweepy.Cursor(self.api.search, q=keyword,count=200, include_rts=False).pages():

The "__init__" constructor did not have the parameter "self"

The initialization of the tweetsearch object doesn't have the 'self' parameter.

Parse Error

Code errors

In class tweetsearch, function __init__(), there's no 'self' keyword.
In class tweetsearch, function clean_tweets(), a typo occurred when using the import alias 'ppr'.
In class tweetsearch, the 're' package is used but has not been imported (line 85).
In class tweetsearch, function clean_tweets(), 'word_tokenize' is used but has not been imported (line 98).
In function clean_tweets(), the string module has been used but has not been imported (line 119).
In function get_tweets(), there is an unresolved reference to 'api.search' (line 132).
In function get_tweets(), there is an unresolved reference to the variable 'filtered_tweet'.

Line of code in get_tweets() method missing self

This particular line of code in get_tweets(self,keyword,csvfile=None) references api.search.
It says:

#page attribute in tweepy.cursor and iteration
for page in tweepy.Cursor(api.search, q=keyword,count=200, include_rts=False):
But it should be:
#page attribute in tweepy.cursor and iteration
for page in tweepy.Cursor(self.api.search, q=keyword,count=200, include_rts=False):

Error 404

TweepError: Twitter error response: status code = 400

I initially got a TypeError: 'Cursor' object is not iterable, so I added .pages() to the Cursor call. Then I got the TweepError described above, and I don't know what the issue is exactly.