mottl / getoldtweets3 Goto Github PK

This project forked from jefferson-henrique/getoldtweets-python

A Python 3 library and a corresponding command line utility for accessing old tweets

License: MIT License

Python 100.00%

python3 twitter twitter-api

getoldtweets3's Issues

How can I get the city name and coordinates of each tweet?

Document is empty...sometimes

Describe the bug
Most of the time, for all dates except those in 2015, I get "Document is empty". See below for examples. Any clue what's going on? Many thanks in advance.

WORKS
import GetOldTweets3 as got

tweetCriteria = got.manager.TweetCriteria().setQuerySearch('trump')\
                                           .setSince("2015-09-14")\
                                           .setUntil("2015-09-19")\
                                           .setMaxTweets(1)
tweet = got.manager.TweetManager.getTweets(tweetCriteria)[0]
print(tweet.text)

DOES NOT WORK
import GetOldTweets3 as got

tweetCriteria = got.manager.TweetCriteria().setQuerySearch('trump')\
                                           .setSince("2019-05-14")\
                                           .setUntil("2019-05-18")\
                                           .setMaxTweets(1)
tweet = got.manager.TweetManager.getTweets(tweetCriteria)[0]
print(tweet.text)

Yields:
tweet = got.manager.TweetManager.getTweets(tweetCriteria)[0]
File "/Library/Frameworks/Python.framework/Versions/3.5/lib/python3.5/site-packages/GetOldTweets3/manager/TweetManager.py", line 70, in getTweets
scrapedTweets = PyQuery(json['items_html'])
File "/Library/Frameworks/Python.framework/Versions/3.5/lib/python3.5/site-packages/pyquery/pyquery.py", line 255, in __init__
elements = fromstring(context, self.parser)
File "/Library/Frameworks/Python.framework/Versions/3.5/lib/python3.5/site-packages/pyquery/pyquery.py", line 99, in fromstring
result = getattr(lxml.html, meth)(context)
File "/Library/Frameworks/Python.framework/Versions/3.5/lib/python3.5/site-packages/lxml/html/__init__.py", line 875, in fromstring
doc = document_fromstring(html, parser=parser, base_url=base_url, **kw)
File "/Library/Frameworks/Python.framework/Versions/3.5/lib/python3.5/site-packages/lxml/html/__init__.py", line 764, in document_fromstring
"Document is empty")
lxml.etree.ParserError: Document is empty

How can I get tweets that contain at least one of the keywords provided in QuerySearch()?

I wonder if there is a way I can retrieve tweets with at least one of the keywords provided by the query search method.
I've tried this:

tweetCriteria = got.manager.TweetCriteria().setQuerySearch('europe refugees')\
                                           .setSince("2015-05-01")\
                                           .setUntil("2015-09-30")\
                                           .setMaxTweets(10)
tweet = got.manager.TweetManager.getTweets(tweetCriteria)[0]
print(tweet.text)

But I get back only the tweets that contain both keywords, europe and refugees. I want the first 10 tweets that contain either europe or refugees. Is this possible?
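Twitter's search syntax has an `OR` operator, so joining the keywords with `OR` may give the "either word" behaviour. The sketch below only builds the query string; it assumes the library forwards the query text to Twitter search unchanged, and `build_or_query` is a hypothetical helper, not part of GetOldTweets3.

```python
def build_or_query(keywords):
    """Join keywords with Twitter's OR search operator (hypothetical helper)."""
    cleaned = [word.strip() for word in keywords if word.strip()]
    return " OR ".join(cleaned)


query = build_or_query(["europe", "refugees"])
print(query)
```

The resulting string would be passed to setQuerySearch() in place of 'europe refugees'. Note also that indexing the result with [0] keeps only the first tweet, whatever setMaxTweets() is set to.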

Issue with --near

Hi, when I try to collect tweets from a specific location, it doesn't take the country into account. For example, I tried --near "Vienna, Austria" and it returned tweets from California.

Retweets and favorites

Hi,
Thank you for the great tool, it was very helpful.
I am wondering if there is any future plan to add new features, such as collecting all historical retweets and favorites?

Results shown only for the past 7 days

This code worked fine and returned results by --querysearch from 2009 to 2019, but now it only gives me tweets from the last 7 days.
I also applied the changes from Jefferson-Henrique#230, but that doesn't work either.

Help in running command line

I have never done this kind of thing before and usually never had to refer to an external library and hope I can get help here. So, I am trying to make sure I can run the command line to extract tweets to csv directly. My error is as follows:

GetOldTweets3 -h
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
NameError: name 'h' is not defined

pip installs the GetOldTweets3 folder into site-packages. If I could get a tutorial on how and where to run command lines, that would be really helpful.
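The `NameError` suggests the command was typed at the Python `>>>` prompt rather than in a system terminal: Python reads `GetOldTweets3 -h` as the expression "GetOldTweets3 minus h". The sketch below reproduces that parse, assuming the name `GetOldTweets3` had already been imported in the session (a placeholder object stands in for the module here).

```python
# Stand-in namespace: pretend GetOldTweets3 was already imported in the session.
namespace = {"GetOldTweets3": object()}

try:
    # Python parses this as the subtraction `GetOldTweets3 - h`.
    eval("GetOldTweets3 -h", namespace)
except NameError as exc:
    message = str(exc)

print(message)
```

Running the same command in the operating system's terminal (Command Prompt, PowerShell, or a Unix shell) instead of inside Python avoids this error.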

Can Only Get Tweets for This Month

I tried the command
GetOldTweets3 --username "barackobama" --since 2015-09-10 --until 2015-09-12 --maxtweets 10
and got nothing.
Then I removed the time range, like this:
GetOldTweets3 --username "barackobama"
I got tweets, but only for this month.
How can I fix this problem?
I need tweets for the last 10 years.

cannot run pip install GetOldTweets3

Hi,
I tried to execute "pip install GetOldTweets3" on a Linux server, but it was not successful and I got this message:

Could not find a version that satisfies the requirement GetOldTweets3 (from versions: )
No matching distribution found for GetOldTweets3

I need help to solve this. Thank you.

Multithreaded Date Range Based Download

Hello!

I've been modifying the source code to build a multithreaded crawler that downloads tweets in a given date range. I'm using this architecture to download a huge amount of tweets on a cluster. Since it seems to work pretty well, I thought about sharing my code. Would you guys find this useful?

P.S. I'm using this to download 60M tweets, which is about 20 GB of data. With a non-multithreaded scheme my program would have taken almost a month to download all the data. With the multithreaded scheme I can download it faster.

Cheers,
Victor
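The idea described above can be sketched roughly as follows: split the date range into day-sized chunks and hand each chunk to a worker thread. This is not Victor's code. `fetch_range` is a hypothetical placeholder for whatever actually downloads the tweets, and treating the end date of each chunk as exclusive is an assumption.

```python
from concurrent.futures import ThreadPoolExecutor
from datetime import date, timedelta


def split_date_range(since, until, days_per_chunk=1):
    """Split [since, until) into consecutive (start, end) date pairs."""
    chunks = []
    start = since
    step = timedelta(days=days_per_chunk)
    while start < until:
        end = min(start + step, until)
        chunks.append((start, end))
        start = end
    return chunks


def fetch_range(chunk):
    """Hypothetical placeholder: a real version would download tweets for the chunk."""
    start, end = chunk
    return f"{start.isoformat()}..{end.isoformat()}"


def download_parallel(since, until, fetch=fetch_range, workers=4):
    """Run one fetch per chunk in a thread pool; results keep chunk order."""
    chunks = split_date_range(since, until)
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(fetch, chunks))


results = download_parallel(date(2015, 9, 10), date(2015, 9, 13))
print(results)
```

More workers also means more requests per second, so a scheme like this may hit request limits sooner than a sequential download.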

tweet.to property is wrong if the Tweet is a response to multiple people

Update to the latest version of GetOldTweets3 before committing the issue!

Describe the bug
tweet.to property is wrong if the Tweet is a response to multiple people

Solution
TweetManager.py needs to be changed from:
tweet.to = usernames[1] if len(usernames) == 2 else None
to:
tweet.to = usernames[1] if len(usernames) >= 2 else None
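The effect of the proposed change can be checked in isolation. A minimal sketch, assuming `usernames` lists the author first and the replied-to accounts after it (the issue does not show how the list is built); `reply_target` is a hypothetical helper, not a library function.

```python
def reply_target(usernames):
    """Return the first replied-to username, or None when there is none.

    Assumes usernames[0] is the author and the rest are the users replied to.
    """
    return usernames[1] if len(usernames) >= 2 else None


print(reply_target(["author"]))
print(reply_target(["author", "alice"]))
print(reply_target(["author", "alice", "bob"]))
```

With the original `== 2` test, the third call would return None even though the tweet replies to two people.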

error in extracting all tweets

hello!
I was trying to extract tweets by running
python3 /Users/Jham/src/getoldtweets3/bin/GetOldTweets3 --usernames-from-file userlist.txt --since 2017-01-01 --until 2017-12-31

but I keep getting the following error:
Found 94 usernames in userlist.txt
Downloading tweets...
Saved 600Traceback (most recent call last):
File "/Users/Jham/src/getoldtweets3/bin/GetOldTweets3", line 206, in main
got.manager.TweetManager.getTweets(tweetCriteria, receiveBuffer, debug=debug)
File "/Users/Jham/src/getoldtweets3/GetOldTweets3/manager/TweetManager.py", line 88, in getTweets
rawtext = TweetManager.textify(tweetPQ("p.js-tweet-text").html(), tweetCriteria.emoji)
File "/Users/Jham/src/getoldtweets3/GetOldTweets3/manager/TweetManager.py", line 190, in textify
if "u-hidden" in attr["class"]:
KeyError: 'class'

'class'

Done. Output file generated "output_got.csv".

Thank you!!
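The traceback shows `attr["class"]` raising `KeyError` when an element has no class attribute. One possible guard is a lookup with a default value. A minimal sketch, assuming `attr` behaves like a dict of HTML attributes; `is_hidden` is a hypothetical helper, not the library's code.

```python
def is_hidden(attr):
    """Return True when the element's class attribute contains "u-hidden".

    `attr` is assumed to be a dict-like mapping of HTML attributes.
    """
    return "u-hidden" in attr.get("class", "")


print(is_hidden({"class": "u-hidden link"}))
print(is_hidden({"href": "https://example.com"}))
```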

How to export all tweets in csv file?

Update to the latest version of GetOldTweets3 before committing the issue!

Describe the bug
A clear and concise description of what the bug is.

debug.log
Run GetOldTweets with the --debug option:

GetOldTweets ... --debug > debug.log

Upload debug.log to somewhere like http://gist.github.com or https://pastebin.com and provide with the link to your debug.log.

For general issues with running GetOldTweets3
If you have a general question please provide with OS, Python version and the method you have used to install GetOldTweets3

[New feature] Add "replies count" field for Tweet

Is there a way to add the number of replies that a Tweet received? I think it would be similar to the favorites and retweets fields.

Wouldn't it be adding a line like this one?

tweet.replies = int(tweetPQ("span.ProfileTweet-action--reply span.ProfileTweet-actionCount").attr("data-tweet-stat-count").replace(",", ""))

After https://github.com/Mottl/GetOldTweets3/blob/master/GetOldTweets3/manager/TweetManager.py#L87

Can I get tweets' "via" information?

Hi, there!
This program is super cool!

I am using this program for analysis. To make the analysis more reliable, I want to get only tweets posted via "blah-blah". Is that possible? Or is there any method for it in this program?

Possible to include re-tweets?

Hi there, thanks for this version - this is a very helpful tool.

I am doing a search by username bounded by start and end dates and it works well. But I noticed that re-tweets are not returned, only the specific user's own tweets.

Is it possible to make it so that tweets AND retweets are returned? I don't need a full release for this, but if you could point me to where in the code I'd make the mods, it would be super appreciated. I spent 12+ hours going through it but I didn't find a way to do it. Thanks a MILLION!

Can i get the location of tweet?

Too Many Requests

I wonder if there is a way to break the download into pieces and pause between pieces to avoid the "Too Many Requests" error? I am getting tweets for one highly used word, and I want to break the download into batches of 10,000 tweets and pause between batches.
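The batching idea can be sketched independently of the library. In the sketch below `fetch_batch` is a hypothetical callable that returns one batch of items; how to map it onto GetOldTweets3 (for example, one date sub-range per batch) is left open, and the batch size and pause length are placeholders.

```python
import time


def fetch_in_batches(fetch_batch, total, batch_size=10000, pause_seconds=60, sleep=time.sleep):
    """Call fetch_batch(offset, size) repeatedly, pausing between batches.

    fetch_batch is a hypothetical callable that returns a list of items.
    """
    collected = []
    offset = 0
    while offset < total:
        size = min(batch_size, total - offset)
        batch = fetch_batch(offset, size)
        collected.extend(batch)
        if len(batch) < size:
            break  # the source ran out of items
        offset += size
        if offset < total:
            sleep(pause_seconds)  # pause only when another batch follows
    return collected


# Fake source and a recording sleep function, so the example runs instantly.
pauses = []
items = fetch_in_batches(
    lambda offset, size: list(range(offset, offset + size)),
    total=25, batch_size=10, pause_seconds=5, sleep=pauses.append,
)
print(len(items), pauses)
```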

Not all tweets collected

Hello,
Thanks to the contributors for this work !

I am having issues collecting all tweets related to a hashtag using the --querysearch argument.
Interestingly, the small number of tweets I am retrieving seems to be sampled from the whole database (I am getting tweets from as early as 2011, even though my topic of interest was buzzing in September 2018), and a given query always returns the same number of tweets and the same tweets.

Going through Twitter search page by hand gives me the expected results, though, i.e. a lot of recent tweets.

I tried to use another ISP but the results remain the same.

Do you have any idea where this could come from?
I am using Python 3.7.2.

Thanks in advance,
Best,
J.

Problem getting started (SyntaxError)

Hi, I wanted to use GetOldTweets3 to download the tweets of a userlist and went through the installation using sudo pip install -e git+https://github.com/Mottl/GetOldTweets3#egg=GetOldTweets3 but whatever I do, even if I just want to call the -h, it just gives me an error:

Traceback (most recent call last):
File "/usr/local/bin/GetOldTweets3", line 6, in <module>
exec(compile(open(__file__).read(), __file__, 'exec'))
File "/GetOldTweets3-master/src/getoldtweets3/bin/GetOldTweets3", line 144
nonlocal cnt
^
SyntaxError: invalid syntax

The old GetOldTweets by Jefferson-Henrique works without a problem and I'm at a bit of a loss since I don't know anything about the nonlocal statement in this context. Could somebody maybe help me out here?
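`nonlocal` is a Python 3 statement, and a Python 2 interpreter reports exactly this `SyntaxError` when it meets one. A possible explanation (an assumption, since the interpreter version is not shown) is that `sudo pip` installed the package for Python 2. The sketch below shows what `nonlocal` does; `make_counter` is an illustrative name, not code from the GetOldTweets3 script.

```python
import sys


def make_counter():
    cnt = 0

    def increment():
        nonlocal cnt  # rebinds cnt in the enclosing function (Python 3 only)
        cnt += 1
        return cnt

    return increment


counter = make_counter()
counter()
value = counter()
print(sys.version_info.major, value)
```

If the printed major version is 2 on your system's default `python`, installing with `python3 -m pip install` instead of `sudo pip install` may be worth trying.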

Always prints only one tweet. The output csv is not created.

Hi. I ran the code exactly as in the instructions.
However, it always prints only one tweet no matter what, which doesn't make sense given the user and the period.
It doesn't matter whether or not I set the maximum number of tweets using .setMaxTweets(1).

Also, the output csv file is not created anywhere. How can I create the csv file containing the results?

Here is my code:

import GetOldTweets3 as got

tweetCriteria = got.manager.TweetCriteria().setUsername("washingtonpost")\
                                           .setSince("2019-01-01")\
                                           .setUntil("2019-01-05")
tweet = got.manager.TweetManager.getTweets(tweetCriteria)[0]
print(tweet.text)

Here is the output:

C:/............../GetOldTweets.py
How to be likable, by @petridisheshttps://wapo.st/2F9uL1J

Process finished with exit code 0
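The snippet takes element [0] of the returned list and prints it, so a single line of output is expected however many tweets were found. Nothing in the snippet writes a file either. A minimal sketch of looping over every tweet and writing a CSV with the standard library follows; the `tweets` list is a stand-in for the result of getTweets(), and the `date` attribute is an assumption (only `text` appears in the code above).

```python
import csv
import io
from types import SimpleNamespace

# Stand-in for the list returned by getTweets(); real tweets are assumed
# to expose at least `date` and `text` attributes.
tweets = [
    SimpleNamespace(date="2019-01-01", text="First tweet, with a comma"),
    SimpleNamespace(date="2019-01-02", text="Second tweet"),
]


def write_tweets_csv(tweets, handle):
    """Write one CSV row per tweet and return the number of rows written."""
    writer = csv.writer(handle)
    writer.writerow(["date", "text"])
    for tweet in tweets:  # iterate over every tweet instead of taking [0]
        writer.writerow([tweet.date, tweet.text])
    return len(tweets)


# For a real file: open("tweets.csv", "w", newline="", encoding="utf-8")
buffer = io.StringIO()
rows_written = write_tweets_csv(tweets, buffer)
print(buffer.getvalue())
```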

Can we set the since parameter in the format YYYY-MM-DD HH:MM:SS or YYYY-MM-DD HH:MM?

I tested this option and it does not work properly.

Here is the log link

The log did not work properly either, so here is the command line query:

python Exporter.py --querysearch "europe refugees" --maxtweets 1000 --since 2018-02-21 18:00:00 --debug > debug.log
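One thing visible in the command itself: the date and time are separated by an unquoted space, so the shell passes `18:00:00` as a separate argument. The sketch below uses `shlex.split` to show how a POSIX shell would tokenize both variants. Quoting only ensures the value arrives in one piece; whether the tool accepts a time component is not established here.

```python
import shlex

unquoted = shlex.split('--since 2018-02-21 18:00:00 --debug')
quoted = shlex.split('--since "2018-02-21 18:00:00" --debug')

print(unquoted)
print(quoted)
```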

Problem with VPN

Hi,

First, let me say thank you for your work! You have made my life so much easier in the last couple of months.
Secondly, the program had been working flawlessly until today. Let me explain the issue. I use Windows 10 and PyCharm, and run GetOldTweets3 in Bash on Ubuntu. The program still works when I am not connected to a VPN. However, in the past (until yesterday) I was able to run GetOldTweets3 without any issues while connected to my VPN.
I have attached the debug file; any help would be wonderful. I have not changed anything, so I am not sure what is happening...

> GetOldTweets3 0.0.10
> Downloading tweets...
> https://twitter.com/i/search/timeline?f=tweets&vertical=news&q=Possession%20since%3A2012-05-30%20until%3A2012-09-04&src=typd&&include_available_features=1&include_entities=1&max_position=&reset_error_state=false
> Host: twitter.com
> User-Agent: Mozilla/5.0 (Windows NT 6.1; Trident/7.0; rv:11.0) like Gecko
> Accept: application/json, text/javascript, */*; q=0.01
> Accept-Language: en-US,en;q=0.5
> X-Requested-With: XMLHttpRequest
> Referer: https://twitter.com/i/search/timeline?f=tweets&vertical=news&q=Possession%20since%3A2012-05-30%20until%3A2012-09-04&src=typd&&include_available_features=1&include_entities=1&max_position=&reset_error_state=false
> Connection: keep-alive
> An error occured during an HTTP request: <urlopen error [Errno -3] Temporary failure in name resolution>
> Try to open in browser: https://twitter.com/search?q=Possession%20since%3A2012-05-30%20until%3A2012-09-04&src=typd
> 
> Done. Output file generated "output_got.csv".

https://gist.github.com/Appotrooper/adc89b1976e0dc96a4caa726103e9742

Thanks in advance!

Similar alternative for getting Followers / Followings list?

Great library of immense utility. Kudos to the developers.
I was wondering if there's a similar utility to grab the followers of a user too? It would be extremely important application-wise, since the rate limits on downloading followers/followings are even more painful than those on tweets.
If there's already an existing solution or other alternative, would you let me know?

How can I specify the time?

Emoji[😄😭💢] included, Tweets

Can this system get a tweet containing emoji[😄😭💢]?

very thankful for this system.
Thanks

Any of the words in a text file and the compound words

Hi, thanks for your contribution.
I wonder how I should get tweets for a compound word like "summer festival"?
Also, how can I provide a list of words in a text file for the query?
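In Twitter's search syntax, double quotes around several words request the exact phrase. A word list can be read from a text file and turned into one query in a few lines. A minimal sketch, assuming one term per line and that the query is forwarded to Twitter search unchanged; `build_query` and the file layout are hypothetical, not library features.

```python
import io


def build_query(terms):
    """Combine terms with OR, wrapping multi-word terms in double quotes."""
    parts = []
    for term in terms:
        term = term.strip()
        if not term:
            continue  # skip blank lines
        parts.append(f'"{term}"' if " " in term else term)
    return " OR ".join(parts)


# io.StringIO stands in for open("words.txt", encoding="utf-8"); one term per line.
word_file = io.StringIO("summer festival\nconcert\n\nopen air\n")
query = build_query(word_file)
print(query)
```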

setUntil or --until not returning tweets

Update to the latest version of GetOldTweets3 before committing the issue!

Describe the bug
Using "until" does not return any tweets
debug.log

https://gist.github.com/Fxa180017/282d4001795e99e41cd9cb23467328e1

Download stops after a lot of tweets

I tried to download tweets with query search 'bitcoin' since 2018-02-18 until 2018-02-19. The issue is that the script stopped before reaching the end of the until parameter.

The log was too big to put it all, so I deleted the log of the first 31000 tweets.

You can find the log here

Can this be because twitter detects a bot downloading a lot of tweets?

I get 0 tweets

Hello,

When I request tweets I don't receive any. Yesterday it worked perfectly well.

Do you know what can be the problem?

empty cells not divided by commas

In a CSV file generated, it appears that at least some of the cells that are empty are not marked properly - as a result the column for "favorites" contains what should be in "text", and so on - the database is unworkable as a result.

I'm wondering if there is a way to correct this on my side. It appears that it is broken in instances when a long www address is part of the text (with /?mbid=social_facebook&amp&utm_brand=p4k&amp... etc.) syntax
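Whether the file itself is malformed or is only being split on every comma is not clear from the description. If it is the latter, reading it with a CSV parser that honors quotes may already help. The sketch below builds a hypothetical row (invented values, with the column order assumed) and compares a naive split with `csv.reader`.

```python
import csv
import io

# Hypothetical row: the text field contains a comma and a long URL query string.
row = ["2019-01-01", "Read this, now: https://example.com/?mbid=social_facebook&utm_brand=p4k", "12"]

buffer = io.StringIO()
csv.writer(buffer, quoting=csv.QUOTE_ALL).writerow(row)
line = buffer.getvalue().strip()

naive = line.split(",")                       # breaks the text field apart
parsed = next(csv.reader(io.StringIO(line)))  # honors the quotes
print(len(naive), len(parsed))
```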

How to use this?

Hi. I have no background in programming, but I have a project on getting old tweets and I can't seem to make it work. Can anyone kindly show or tell me how to do it step by step? Thank you.

Remove Warnings from code

Rename all variables and fix the code structure to meet Python's PEP 8 guidelines.

How to run scripts using the PyCharm IDE?

My question is how to run the code using PyCharm. Which class should I execute, and where do I set the search parameters?

Thanks in advance.

Words from queries that contain the $ symbol are ignored

I need to search for tweets that contain a company name (e.g. Apple) AND the stock ticker symbol (e.g. $APPL). However, if I use the following command:

GetOldTweets3 --querysearch "Apple $APPL lang:en" --maxtweets 10

then the word $APPL is not present in any of the tweets in the output doc.

The same happens with any word that starts with $ symbol. The tool somehow ignores such words. What can I do to solve this issue?

Thank you in advance and thanks for the very useful tool.
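In POSIX shells such as bash, `$APPL` inside double quotes is expanded as a shell variable before the program starts, and an unset variable expands to nothing. If that is the cause here (an assumption, since the shell is not stated), single quotes keep the text literal. `shlex.quote` shows a safely quoted form of the same query:

```python
import shlex

query = "Apple $APPL lang:en"
command = "GetOldTweets3 --querysearch " + shlex.quote(query) + " --maxtweets 10"
print(command)
```

Escaping the dollar sign as `\$APPL` inside the double quotes is an equivalent alternative in such shells.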

Issue when using until for bound date search

My scraper uses a bound search to extract tweets for individual dates using the tweet criteria "QuerySearch", "Since" and "Until". Yesterday, it stopped returning a list of tweets and instead returned only an empty list. After some experimenting, it looks like setting the "until" property leads to this faulty behaviour:

GetOldTweets3 --querysearch "europe refugees" --since 2015-09-10 --until 2016-09-11 --maxtweets 10 --output "out.csv" --debug
/home/lks/anaconda3/bin/GetOldTweets3 --querysearch europe refugees --since 2015-09-10 --until 2016-09-11 --maxtweets 10 --output out.csv --debug
GetOldTweets3 0.0.10
Downloading tweets...
https://twitter.com/i/search/timeline?f=tweets&vertical=news&q=europe%20refugees%20since%3A2015-09-10%20until%3A2016-09-11&src=typd&&include_available_features=1&include_entities=1&max_position=&reset_error_state=false
Host: twitter.com
User-Agent: Mozilla/5.0 (Windows NT 6.1; WOW64; rv:63.0) Gecko/20100101 Firefox/63.0
Accept: application/json, text/javascript, */*; q=0.01
Accept-Language: en-US,en;q=0.5
X-Requested-With: XMLHttpRequest
Referer: https://twitter.com/i/search/timeline?f=tweets&vertical=news&q=europe%20refugees%20since%3A2015-09-10%20until%3A2016-09-11&src=typd&&include_available_features=1&include_entities=1&max_position=&reset_error_state=false
Connection: keep-alive
{"min_position":"thGAVUV0VFVBYBFgESNQAVACUAVQAVAAA=","has_more_items":false,"items_html":"\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n \n","new_latent_count":0,"focused_refresh_interval":30000}
---


Done. Output file generated "out.csv".

When the --until option is not provided (or .setUntil() is not used) the scraper works as expected.

Keep having "Twitter weird response..."

When I ran the command
"GetOldTweets3 --username "barackobama" --maxtweets 1",

I got the following output:
Downloading tweets...
Twitter weird response. Try to open in browser: https://twitter.com/search?q=%20from%3Abarackobama&src=typd
Unexpected error: <class 'urllib.error.URLError'>

Done. Output file generated "output_got.csv".

Does this problem happen only for me? Any idea how I can fix it?

Same number of tweets from different locations

I ran my query like this:

tweetCriteria = got.manager.TweetCriteria().setQuerySearch('#Barcelona')\
                                           .setSince("2017-01-01")\
                                           .setUntil("2018-12-31")\
                                           .setNear("'Medan, Indonesia'")\
                                           .setMaxTweets(100000000)

I got 1559 tweets with this code. Then I changed the location using this query:

tweetCriteria = got.manager.TweetCriteria().setQuerySearch('#Barcelona')\
                                           .setSince("2017-01-01")\
                                           .setUntil("2018-12-31")\
                                           .setNear("'Jambi, Indonesia'")\
                                           .setMaxTweets(100000000)

I got 1559 tweets too. Logically, this can't be right. Am I doing something wrong?

How to use GetOldTweets3

Hello.
Excuse me if this question is stupid, but how do I use the GetOldTweets3 command?
I'm inside the folder GetOldTweets3-master and I run the example:
GetOldTweets3 --username "barackobama" --maxtweets 1
or
GetOldTweets3 -h

But I get: "GetOldTweets3 is not recognized as an internal or external command,
operable program or batch file"

What exactly do I have to type in place of "GetOldTweets3" to run the code?

Thank you.

Tweets for big data infrastructure

Is it possible to run it in parallel conditions?

Issue to run the script

Hi,

I get an error when I run GetOldTweets: the script is not found and it says: module 'GetOldTweets3' has no attribute 'manager'

Geocode search

Hi there,
I am trying to scrape tweets from a geographic coordinates point (not by place name!). Twitter search allows it with a query: "geocode:latitude,longitude,radius around the point". As there is no argument in the parser to specify geocode, I tried to just put my geocode into the query and search. Looks like Twitter generally gets it. But sometimes it gives inconsistent results:

query = 'geocode:30.3255568815976,-81.7671865745302,0.578km'
tweetCriteria = got.manager.TweetCriteria().setQuerySearch(query)
storage = got.manager.TweetManager.getTweets(tweetCriteria)

GetOldTweets3 --querysearch "geocode:30.3255568815976,-81.7671865745302,0.578km"

gives 5 tweets, but there are about 90 if I search the same query in Twitter manually.

Could this be happening because I use setQuerySearch to search by coordinates instead of, say, setGeocode (if this existed, which would be awesome, by the way)? If not, do you have any ideas why this is happening? Thanks!

Log file here

Could the program run multiple queries in parallel?

Hello!!

The program is awesome, congrats!

I am using the program to run very heavy queries that take a long time to complete, and I was wondering if the program could be used to run queries in parallel and, if so, what would need to be changed.

Geo column is empty

Why is the Geo column empty in the output?

GOT THIS

An error occured during an HTTP request: <urlopen error [SSL: CERTIFICATE_VERIFY_FAILED] certificate verify failed: unable to get local issuer certificate (_ssl.c:1051)>
Try to open in browser: https://twitter.com/search?q=%20from%3Abarackobama&src=typd

Some errors in output when searching for accented words

Hi Mottl, thanks for bringing this script back to life.

I am here about a possible bug (or misunderstanding) when searching for accented words. Please take a look at my log.

I, [2018-12-14T11:46:09.782525 #4968]  INFO -- :  - Processing Term: Caja de compensación 18
I, [2018-12-14T11:46:09.782637 #4968]  INFO -- :    CMD: /bin/bash -l -c 'cd ~/venv/get-old-tweets3_mottl && source bin/activate && GetOldTweets3 --output "/home/ubuntu/artool-utils/releases/20181207142540/public/dl/Twitter_20181214-084609_750-#2.csv" --querysearch "Caja de compensación 18" --since "2018-12-11" --until "2018-12-15"'
I, [2018-12-14T11:46:10.055077 #4968]  INFO -- pid 5243 exit 0: Downloading tweets...
Traceback (most recent call last):
  File "/home/ubuntu/venv/get-old-tweets3_mottl/bin/GetOldTweets3", line 171, in main
    got.manager.TweetManager.getTweets(tweetCriteria, receiveBuffer, debug=debug)
  File "/home/ubuntu/venv/get-old-tweets3_mottl/lib/python3.5/site-packages/GetOldTweets3/manager/TweetManager.py", line 65, in getTweets
    json = TweetManager.getJsonReponse(tweetCriteria, refreshCursor, cookieJar, proxy, user_agent, debug=debug)
  File "/home/ubuntu/venv/get-old-tweets3_mottl/lib/python3.5/site-packages/GetOldTweets3/manager/TweetManager.py", line 180, in getJsonReponse
    url = url % (urllib.parse.quote(urlGetData.strip()), urlLang, urllib.parse.quote(refreshCursor))
  File "/usr/lib/python3.5/urllib/parse.py", line 706, in quote
    string = string.encode(encoding, errors)
UnicodeEncodeError: 'utf-8' codec can't encode character '\udcc3' in position 18: surrogates not allowed

'utf-8' codec can't encode character '\udcc3' in position 18: surrogates not allowed

Done. Output file generated "/home/ubuntu/artool-utils/releases/20181207142540/public/dl/Twitter_20181214-084609_750-#2.csv".

I, [2018-12-14T11:46:10.055220 #4968]  INFO -- :  - Processing Term: Caja de compensacion 18
I, [2018-12-14T11:46:10.055336 #4968]  INFO -- :    CMD: /bin/bash -l -c 'cd ~/venv/get-old-tweets3_mottl && source bin/activate && GetOldTweets3 --output "/home/ubuntu/artool-utils/releases/20181207142540/public/dl/Twitter_20181214-084609_750-#3.csv" --querysearch "Caja de compensacion 18" --since "2018-12-11" --until "2018-12-15"'
I, [2018-12-14T11:46:10.491267 #4968]  INFO -- pid 5337 exit 0: Downloading tweets...

Done. Output file generated "/home/ubuntu/artool-utils/releases/20181207142540/public/dl/Twitter_20181214-084609_750-#3.csv".

As you can see, the term "Caja de compensación 18" has an accented "ó", which produces the error backtrace above, while the other term is fine.

Anyway, the tweets seem to be downloaded just fine too.
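The character `'\udcc3'` at position 18 is a lone surrogate, which Python produces when bytes are decoded with the `surrogateescape` error handler. A likely cause (an assumption, not confirmed by the log) is that the command ran under a non-UTF-8 locale, so the UTF-8 bytes of "ó" in the argument could not be decoded. The sketch reproduces the error and shows one way to recover the original text.

```python
import urllib.parse

term = "Caja de compensaci\u00f3n 18"

# Simulate the argument arriving as UTF-8 bytes decoded under an ASCII locale
# (assumption about the cause): undecodable bytes become lone surrogates.
mangled = term.encode("utf-8").decode("ascii", "surrogateescape")

try:
    urllib.parse.quote(mangled)
    failed = False
except UnicodeEncodeError:
    failed = True

# Reverse the escape to recover the original bytes, then decode them as UTF-8.
repaired = mangled.encode("ascii", "surrogateescape").decode("utf-8")
encoded = urllib.parse.quote(repaired)
print(failed, encoded)
```

If the locale is indeed the cause, running the command with a UTF-8 locale set in the environment may avoid the need for any repair step.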

Get age or birth date

This is a wonderful tool!
Is it also possible to retrieve the age or birth date of the users somehow?

limit by the origin of accounts

Hi, is it possible to get results from US accounts only?

Download Stop - list index out of range for timestamps not in UTC

Hi All,

I tried this program and it works when I get data for tweets in the UK.
The problem is that I want to get tweets from Asian countries (the timestamps of tweets in the CSV are not in UTC), and after downloading some tweets it stops immediately.

The example command:

GetOldTweets3 --near "Jakarta Pusat, DKI Jakarta" --within 100mi --since 2018-11-29 --until 2018-11-30

and the result says: "Saved 92600 list index out of range"

Why does this error happen? Is there anything I need to modify if I download data from a country that is not in UTC? Thank you.
