vladkens / twscrape

2024! X / Twitter API scraper with authorization support. Allows you to scrape search results, user profiles (followers/following), tweets (favoriters/retweeters), and more.

Home Page: https://pypi.org/project/twscrape/

License: MIT License

Makefile 2.38% Python 97.62%

python twitter scraper twitter-api async automation httpx twitter-bot twitter-scraper snscrape

twscrape's Introduction

twscrape

Twitter GraphQL API implementation with SNScrape data models.

Install

pip install twscrape

Or development version:

pip install git+https://github.com/vladkens/twscrape.git

Features

Supports both the Search and GraphQL Twitter APIs
Async/await functions (can run multiple scrapers in parallel)
Login flow (with receiving verification code from email)
Saving/restoring account sessions
Raw Twitter API responses & SNScrape models
Automatic account switching to smooth Twitter API rate limits

Usage

Since this project works through an authorized API, accounts need to be added. You can register and add an account yourself, or search online for sites that provide accounts.

The email password is needed to receive the login verification code automatically (via the IMAP protocol).

Data models:

User
Tweet

import asyncio
from twscrape import API, gather
from twscrape.logger import set_log_level

async def main():
    api = API()  # or API("path-to.db") - default is `accounts.db`

    # ADD ACCOUNTS (for CLI usage see BELOW)
    await api.pool.add_account("user1", "pass1", "[email protected]", "mail_pass1")
    await api.pool.add_account("user2", "pass2", "[email protected]", "mail_pass2")
    await api.pool.login_all()

    # or add account with COOKIES (with cookies login not required)
    cookies = "abc=12; ct0=xyz"  # or '{"abc": "12", "ct0": "xyz"}'
    await api.pool.add_account("user3", "pass3", "[email protected]", "mail_pass3", cookies=cookies)

    # API USAGE

    # search (latest tab)
    await gather(api.search("elon musk", limit=20))  # list[Tweet]
    # change search tab (product), can be: Top, Latest (default), Media
    await gather(api.search("elon musk", limit=20, kv={"product": "Top"}))

    # tweet info
    tweet_id = 20
    await api.tweet_details(tweet_id)  # Tweet
    await gather(api.retweeters(tweet_id, limit=20))  # list[User]

    # Note: this method has small pagination on the X side, around 5 tweets per query
    await gather(api.tweet_replies(tweet_id, limit=20))  # list[Tweet]

    # get user by login
    user_login = "xdevelopers"
    await api.user_by_login(user_login)  # User

    # user info
    user_id = 2244994945
    await api.user_by_id(user_id)  # User
    await gather(api.following(user_id, limit=20))  # list[User]
    await gather(api.followers(user_id, limit=20))  # list[User]
    await gather(api.verified_followers(user_id, limit=20))  # list[User]
    await gather(api.subscriptions(user_id, limit=20))  # list[User]
    await gather(api.user_tweets(user_id, limit=20))  # list[Tweet]
    await gather(api.user_tweets_and_replies(user_id, limit=20))  # list[Tweet]

    # list info
    list_id = 123456789
    await gather(api.list_timeline(list_id))

    # NOTE 1: gather is a helper function to receive all data as list, FOR can be used as well:
    async for tweet in api.search("elon musk"):
        print(tweet.id, tweet.user.username, tweet.rawContent)  # tweet is `Tweet` object

    # NOTE 2: all methods have `raw` version (returns `httpx.Response` object):
    async for rep in api.search_raw("elon musk"):
        print(rep.status_code, rep.json())  # rep is `httpx.Response` object

    # change log level, default info
    set_log_level("DEBUG")

    # Tweet & User model can be converted to regular dict or json, e.g.:
    doc = await api.user_by_id(user_id)  # User
    doc.dict()  # -> python dict
    doc.json()  # -> json string

if __name__ == "__main__":
    asyncio.run(main())

Deprecated API methods (no longer available in X)

favoriters (ref)
liked_tweets (ref)

Stopping iteration with break

To release an account correctly when a loop is exited with break, a special syntax must be used. Otherwise, Python's event loop will release the lock on the account at some unspecified point in the future. See explanation here.

from contextlib import aclosing

async with aclosing(api.search("elon musk")) as gen:
    async for tweet in gen:
        if tweet.id < 200:
            break

CLI

Get help on CLI commands

# show all commands
twscrape

# help on a specific command
twscrape search --help

Add accounts

To add accounts use add_accounts command. Command syntax is:

twscrape add_accounts <file_path> <line_format>

Where <line_format> describes the columns of each line in the accounts file, separated by a delimiter. Possible tokens:

username – required
password – required
email – required
email_password – to receive email code (you can use --manual mode to get code)
cookies – can be any parsable format (string, json, base64 string, etc)
_ – skip this column when parsing

Tokens should be separated by the delimiter; ":" is typically used.

Example:

I have an accounts file named order-12345.txt with the format:

username:password:email:email password:user_agent:cookies

The command to add the accounts is (the user_agent column is skipped with _):

twscrape add_accounts ./order-12345.txt username:password:email:email_password:_:cookies
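
As an illustration of how a line format maps onto a line of the accounts file, here is a minimal sketch. `parse_account_line` is a hypothetical helper, not twscrape's actual parser. It assumes that the last column may itself contain the delimiter (cookies often do), which may differ from how twscrape handles it.

```python
def parse_account_line(line, line_format, delim=":"):
    """Map one line of an accounts file onto the tokens in line_format."""
    tokens = line_format.split(delim)
    # Split at most len(tokens) - 1 times so the last column (often cookies)
    # may itself contain the delimiter. This is an assumption of the sketch.
    values = line.rstrip("\n").split(delim, len(tokens) - 1)
    if len(values) != len(tokens):
        raise ValueError(f"expected {len(tokens)} columns, got {len(values)}")
    # "_" marks a column to skip.
    account = {tok: val for tok, val in zip(tokens, values) if tok != "_"}
    for required in ("username", "password", "email"):
        if not account.get(required):
            raise ValueError(f"missing required field: {required}")
    return account
```

With the format from the example above, the user_agent column is dropped and the remaining columns become named fields.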

Login accounts

Note: If you added accounts with cookies, login is not required.

Run:

twscrape login_accounts

twscrape will start the login flow for each new account. If X asks to verify the email and you provided email_password in add_account, twscrape will try to receive the verification code via IMAP. After a successful login, the account cookies are saved to the db file for future use.

Manual email verification

If your email provider does not support IMAP (ProtonMail, Tutanota, etc.) or IMAP is disabled in settings, you can enter the email verification code manually. To do this, run the login command with the --manual flag.

Example:

twscrape login_accounts --manual
twscrape relogin user1 user2 --manual
twscrape relogin_failed --manual

Get list of accounts and their statuses

twscrape accounts

# Output:
# username  logged_in  active  last_used            total_req  error_msg
# user1     True       True    2023-05-20 03:20:40  100        None
# user2     True       True    2023-05-20 03:25:45  120        None
# user3     False      False   None                 120        Login error

Re-login accounts

It is possible to re-login specific accounts:

twscrape relogin user1 user2

Or retry login for all failed logins:

twscrape relogin_failed

Use different accounts file

Useful if using a different set of accounts for different actions

twscrape --db test-accounts.db <command>

Search commands

twscrape search "QUERY" --limit=20
twscrape tweet_details TWEET_ID
twscrape tweet_replies TWEET_ID --limit=20
twscrape retweeters TWEET_ID --limit=20
twscrape user_by_id USER_ID
twscrape user_by_login USERNAME
twscrape following USER_ID --limit=20
twscrape followers USER_ID --limit=20
twscrape verified_followers USER_ID --limit=20
twscrape subscriptions USER_ID --limit=20
twscrape user_tweets USER_ID --limit=20
twscrape user_tweets_and_replies USER_ID --limit=20

By default, output goes to the console (stdout), one document per line, so it can be redirected to a file.

twscrape search "elon mask lang:es" --limit=20 > data.txt

By default, parsed data is returned. The original tweet responses can be retrieved with the --raw flag.

twscrape search "elon mask lang:es" --limit=20 --raw

Proxy

There are a few ways to use proxies.

You can add a proxy per account

proxy = "http://login:[email protected]:8080"
await api.pool.add_account("user4", "pass4", "[email protected]", "mail_pass4", proxy=proxy)

You can use a global proxy for all accounts

proxy = "http://login:[email protected]:8080"
api = API(proxy=proxy)
doc = await api.user_by_login("elonmusk")

You can set a proxy with the environment variable TWS_PROXY:

TWS_PROXY=socks5://user:[email protected]:1080 twscrape user_by_login elonmusk

You can change the proxy at any time:

api.proxy = "socks5://user:[email protected]:1080"
doc = await api.user_by_login("elonmusk")  # new proxy will be used
api.proxy = None
doc = await api.user_by_login("elonmusk")  # no proxy used

Proxy priorities

api.proxy has top priority
env.proxy is used if api.proxy is None
acc.proxy has the lowest priority

So if you want to use proxy PER ACCOUNT, do NOT override proxy with env variable or by passing proxy param to API.
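
The priority order above can be sketched as a small function. `resolve_proxy` is a hypothetical helper written for illustration and is not part of twscrape. The only name taken from the text is the `TWS_PROXY` environment variable.

```python
import os


def resolve_proxy(api_proxy, acc_proxy, environ=os.environ):
    """Return the proxy that wins under the priority order described above."""
    if api_proxy is not None:
        return api_proxy  # 1. api.proxy has top priority
    env_proxy = environ.get("TWS_PROXY")
    if env_proxy:
        return env_proxy  # 2. environment variable, used if api.proxy is None
    return acc_proxy  # 3. per-account proxy (None means no proxy)
```

Passing `environ` explicitly keeps the function easy to test without touching the real environment.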

Note: If the proxy is not working, an exception will be raised from the API class.

Environment variables

TWS_WAIT_EMAIL_CODE – timeout for email verification code during login (default: 30, in seconds)
TWS_RAISE_WHEN_NO_ACCOUNT – raise NoAccountError exception when no available accounts right now, instead of waiting for availability (default: false, possible value: false / 0 / true / 1)
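
For readers who want to mirror these settings in their own scripts, here is a minimal sketch of reading such variables. `env_int` and `env_bool` are hypothetical helpers, not twscrape functions, and the accepted boolean spellings are taken from the list above.

```python
import os


def env_int(name, default, environ=os.environ):
    """Read an integer setting, falling back to default when unset."""
    raw = environ.get(name)
    return int(raw) if raw is not None else default


def env_bool(name, default=False, environ=os.environ):
    """Read a boolean setting; "true" and "1" count as true."""
    raw = environ.get(name)
    if raw is None:
        return default
    return raw.strip().lower() in ("1", "true")
```

For example, `env_int("TWS_WAIT_EMAIL_CODE", 30)` yields 30 unless the variable is set.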

Limitations

After 1 July 2023, Twitter introduced new limits and continues to update them periodically.

The basic behaviour is as follows:

the request limit is reset every 15 minutes for each endpoint individually
e.g. each account has 50 search requests / 15 min, 50 profile requests / 15 min, etc.
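
A quick back-of-the-envelope calculation helps when sizing an account pool. The sketch below uses the example figures from the text (50 requests per 15-minute window) as defaults. Both function names are hypothetical, and the real limits may differ or change.

```python
def requests_per_hour(accounts, limit_per_window=50, window_minutes=15):
    """Upper bound on requests per hour for one endpoint across all accounts."""
    windows_per_hour = 60 // window_minutes
    return accounts * limit_per_window * windows_per_hour


def windows_needed(total_requests, accounts, limit_per_window=50):
    """Number of rate-limit windows needed to issue total_requests."""
    per_window = accounts * limit_per_window
    return -(-total_requests // per_window)  # ceiling division
```

Under these assumptions, four accounts allow at most 800 search requests per hour, and 500 requests need three windows.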

API data limits:

user_tweets & user_tweets_and_replies – can return ~3200 tweets maximum


twscrape's Issues

count limit in api.py/search_raw(self, q: str, limit=-1, kv=None) is not used

count is always 20 "count": 20

Locking accounts on httpx ReadTimeout

Hello, I added this exception in the req function in queue_client.py. I think it makes sense not to lock the accounts in this case.

except (httpx.ReadTimeout, httpx.ProxyError) as e:
    retry_count += 1
    if retry_count >= 3:
        logger.warning(f"Httpx error {type(e)}: {e}")

Do you think that's a good idea, @vladkens?

Twitter autoblocks accounts

I have tried twscrape with a new account and with a fairly active Twitter account that is over a year old. Both failed to log in with "Error logging in to : 'ct0'"

When I then try to log in directly to Twitter, I get a "Suspicious login prevented" dialog and cannot log in.

Any reason this is happening?

Request for Multi-Processing Example

Thank you for this amazing work! I've been using this project for a few days now, and it has been incredibly helpful in my work.

I'm currently working on a project where I need to search a large number of keywords in parallel. However, I'm new to multi-processing and async/await functions, so I would like to see a simple example that I can build upon.

Could you please provide an example that demonstrates how to use multi-processing to search a list of keywords in parallel? Thank you for your help!
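
One possible starting point is sketched below. It uses asyncio concurrency rather than multi-processing, which fits the library's async design. `search_many` and `collect` are hypothetical helpers, and `api` is assumed to be a twscrape `API` instance with accounts already added. Only `api.search(query, limit=...)` is taken from the README.

```python
import asyncio


async def collect(agen):
    """Drain an async generator into a list."""
    return [item async for item in agen]


async def search_many(api, queries, limit=20):
    """Run one search per query concurrently and return {query: results}."""
    tasks = [collect(api.search(q, limit=limit)) for q in queries]
    results = await asyncio.gather(*tasks)
    return dict(zip(queries, results))
```

Inside an async `main`, this could be called as `await search_many(api, ["python", "rust"], limit=20)`. The actual throughput still depends on how many accounts are available for the search queue.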

tweet.quotedTweet and tweet.retweetedTweet return None for quoted or retweeted Tweet

When scraping a user profile with replies, the Tweet model returns None for tweet.quotedTweet and tweet.retweetedTweet. With the raw function, the value is present in the JSON result.

Manual entry of email verification code

Make email_password optional and manually enter the verification code via the CLI command.

Searching tweets doesn't work in a Docker container

My dockerfile:

FROM python:3.10-bullseye
WORKDIR /app
COPY stopwords.txt /app/stopwords.txt
RUN mkdir /app/uploads
COPY requirements.txt .
RUN pip install git+https://github.com/JustAnotherArchivist/snscrape.git
RUN pip install git+https://github.com/vladkens/twscrape.git
RUN pip install flask[async]
RUN pip install --no-cache-dir -r requirements.txt
RUN pip install logstash
EXPOSE 5000
CMD ["python", "app.py"]

On my laptop everything works fine, but when I dockerize it on a VPS running Ubuntu 20.04, it does not work.
Getting accounts works fine, but search throws an exception:

Traceback (most recent call last):
  File "/app/twscrape_test.py", line 179, in <module>
    asyncio.run(main())
  File "/usr/local/lib/python3.10/asyncio/runners.py", line 44, in run
    return loop.run_until_complete(main)
  File "/usr/local/lib/python3.10/asyncio/base_events.py", line 649, in run_until_complete
    return future.result()
  File "/app/twscrape_test.py", line 14, in main
    followings = await gather(api.following(user_id))
  File "/usr/local/lib/python3.10/site-packages/twscrape/utils.py", line 16, in gather
    async for x in gen:
  File "/usr/local/lib/python3.10/site-packages/twscrape/api.py", line 188, in following
    async for rep in self.following_raw(uid, limit=limit, kv=kv):
  File "/usr/local/lib/python3.10/site-packages/twscrape/api.py", line 184, in following_raw
    async for x in self._gql_items(op, kv, limit=limit):
  File "/usr/local/lib/python3.10/site-packages/twscrape/api.py", line 63, in _gql_items
    obj = rep.json()
  File "/usr/local/lib/python3.10/site-packages/httpx/_models.py", line 756, in json
    return jsonlib.loads(self.text, **kwargs)
  File "/usr/local/lib/python3.10/json/__init__.py", line 346, in loads
    return _default_decoder.decode(s)
  File "/usr/local/lib/python3.10/json/decoder.py", line 337, in decode
    obj, end = self.raw_decode(s, idx=_w(s, 0).end())
  File "/usr/local/lib/python3.10/json/decoder.py", line 355, in raw_decode
    raise JSONDecodeError("Expecting value", s, err.value) from None
json.decoder.JSONDecodeError: Expecting value: line 1 column 2 (char 1)

Infinite Client error '400 Bad Request' when log in

Hello! Sometimes I get this error:

2023-07-12 09:58:38.518 | ERROR    | twscrape.login:next_login_task:180 - Error in LoginEnterUserIdentifierSSO: Client error '400 Bad Request' for url 'https://api.twitter.com/1.1/onboarding/task.json'
For more information check: https://httpstatuses.com/400

Or 429:

2023-07-12 10:00:39.694 | ERROR    | twscrape.login:next_login_task:180 - Error in LoginEnterUserIdentifierSSO: Client error '429 Too Many Requests' for url 'https://api.twitter.com/1.1/onboarding/task.json'
For more information check: https://httpstatuses.com/429

This repeats infinitely until I shut down the app. Can you add an error handler for this?
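
A common way to avoid this kind of endless loop is a bounded retry with exponential backoff. The sketch below is generic and is not twscrape's login code. `call_with_retries` and `backoff_delays` are hypothetical names. Catching every `Exception` is a simplification, and real code would catch only the HTTP errors it expects.

```python
import asyncio


def backoff_delays(max_attempts=5, base=1.0, cap=60.0):
    """Delays in seconds before each retry: base, 2*base, 4*base, ... up to cap."""
    return [min(cap, base * 2 ** i) for i in range(max_attempts - 1)]


async def call_with_retries(func, max_attempts=5, base=1.0, cap=60.0, sleep=asyncio.sleep):
    """Await func() up to max_attempts times, then re-raise the last error."""
    delays = backoff_delays(max_attempts, base, cap)
    for attempt in range(max_attempts):
        try:
            return await func()
        except Exception:
            if attempt == max_attempts - 1:
                raise  # give up instead of looping forever
            await sleep(delays[attempt])
```

With the defaults, a failing call is retried after 1, 2, 4, and 8 seconds, and the fifth failure is raised to the caller.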

retweetedTweet key frequently equal None

Maybe an API change again?

Translation from the Translate button

Since translating with a normal translating library makes the final result REALLY bad, are you able to implement the translation which comes from the Translate button? Thanks in advance :)

Ability to get promoted hashtags start date and end date

Hello, are you able to implement the start/end date of promoted hashtags on Twitter?

Thanks in advance!

Edit: not needed anymore, but would be a nice feature.

Feature Request: Adding ability to get alt text of photos from Tweets

I would like to request the ability to get alt text of photos from Tweets

Tweets with limited replies are not scraped

In gql responses, these tweets are contained in the tweet key of objects of type TweetWithVisibilityResults.

Here is a patch for the to_old_rep function:

def to_old_rep(obj: dict) -> dict[str, dict]:
    tmp = get_typed_object(obj, defaultdict(list))

    tweets_with_visibility_results =  [
        x["tweet"] 
        for x in tmp.get("TweetWithVisibilityResults", []) 
        if "legacy" in x["tweet"]
    ]
    tweets_with_visibility_results = {str(x["rest_id"]): to_old_obj(x) for x in tweets_with_visibility_results}

    tweets = [x for x in tmp.get("Tweet", []) if "legacy" in x]
    tweets = {str(x["rest_id"]): to_old_obj(x) for x in tweets}

    tweets_with_visibility_results.update(tweets)
    tweets = tweets_with_visibility_results

    users = [x for x in tmp.get("User", []) if "legacy" in x and "id" in x]
    users = {str(x["rest_id"]): to_old_obj(x) for x in users}

    return {"tweets": tweets, "users": users}

Fetching Ads

Please implement a way to fetch ads from a username with twscrape.

Here is an example of ad from FortniteGame on twitter: https://twitter.com/FortniteGame/status/1696926119700709551

Also note that this ad does not come up when using api.user_tweets()

Thanks in advance @vladkens !

Unable to login accounts.

I'm using 0.6.
I'm unable to log in to accounts.
The only way that works is copy-pasting cookies, which is annoying and slow when adding accounts.

Login with cookie data

Hello,
As others mentioned, most of the time the login flow detects automated logins and blocks twscrape. I tried various workarounds, and the only way I have managed to log in with other libraries is by using the ct0 and auth_token cookies.
I tried adding a dict to accounts.db, but it still fails (403 Forbidden).
Is there any official way to use these cookies in twscrape?

For example:
await pool.add_account(cookies={"ct0": "...", "auth_token": "..."})

Any help is much appreciated, thanks!

get_user call hangs when accounts.db already exists

Hello, I would like to thank for this wonderful library,

While testing it today, I came across this case: on the first run, the get_user_by_id call works correctly and returns user information, and I can see that the accounts.db SQLite db is created.

However, when I run the code a second time, no login is made, so I assume the account session from accounts.db is being reused, but the call to get_user_by_id never returns.

With DEBUG mode enabled, I see this message looping:

No accounts available for queue 'UserByRestId' (sleeping for 5 sec)

Login with only User & PW

If 2FA is off on a Twitter account, can't we log in with just the username and password? If so can we have that method added?

what should I do if login failed with login_confirm_email error?

When I use the API to log in, it sometimes fails with a login_confirm_email error. How can we deal with it?

ability to increase get_user_following count to more than 20

Sorry for opening a second issue,

In source code I see this snippet:

async def following_raw(self, uid: int, limit=-1):
    op = "IWP6Zt14sARO29lJT35bBw/Following"
    kv = {"userId": str(uid), "count": 20, "includePromotedContent": False}
    async for x in self._ql_items(op, kv, limit=limit):
        yield x

Would it be possible to pass a custom count to that call, or does Twitter itself hard-limit calls to 20 per request?

Wrong and missing t.co links

twscrape 0.7.0 (like 0.5.0) returns missing and wrong t.co links. Take this tweet as an example:

tweet = await api.tweet_details(1682072224013099008)
print(tweet)

tweet.rawContent returns the tweet as text, and it contains 4 t.co links. Here they are with their respective unshortened versions:
https://t.co/5ih4hl5Y78 = https://fn.gg/support (which redirects to this link)
https://t.co/odE9qJiOg3 = https://fn.gg/accounts (which redirects to this link)
https://t.co/CUfyetPlK8 = https://trello.com/b/Bs7hgkma/fortnite-community-issues
https://t.co/X55I5IuFkk = https://status.epicgames.com/

But tweet.links contains only 1 link:
"links":[
{
"url":"http://fn.gg/support",
"text":"fn.gg/support",
"tcourl":"https://t.co/I1uF2iqjzX"
}
]

Not only that, but the tcourl also does not match any of the links in rawContent.
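
A small check like the following can make the mismatch easy to reproduce. It is only a sketch: `missing_tco_links` is a hypothetical helper, and `links` is assumed to be a list of dicts with a `tcourl` key, shaped like the output quoted above.

```python
import re

# Matches shortened links as they appear in the tweet text.
TCO_RE = re.compile(r"https://t\.co/[A-Za-z0-9]+")


def missing_tco_links(raw_content, links):
    """Return t.co URLs that appear in the text but not in the parsed links."""
    in_text = TCO_RE.findall(raw_content)
    parsed = {link["tcourl"] for link in links}
    return [url for url in in_text if url not in parsed]
```

For the tweet in this report, all four t.co URLs from the text would be returned, because the single parsed tcourl matches none of them.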

How is this so messy? @vladkens Thanks in advance brother

Keeps requesting even when the account is rate-limited

QueueClient switches context whether or not the current account is rate-limited, and the context switch itself unlocks that account. This causes the script to switch constantly between all loaded accounts, limited or not.

How about keeping the lock when switching context, to prevent using an account that is already limited?

class RemainLocked(AccountsPool):
    async def unlock(self, username: str, queue: str, req_count=0):
        qs = f"""
        UPDATE accounts SET
            stats = json_set(stats, '$.{queue}', COALESCE(json_extract(stats, '$.{queue}'), 0) + {req_count}),
            last_used = datetime({utc_ts()}, 'unixepoch')
        WHERE username = :username
        """
        await execute(self._db_file, qs, {"username": username})

    async def get_for_queue(self, queue: str):
        qs = f"""
        SELECT * FROM accounts
        WHERE active = true AND (
            locks IS NULL
            OR json_extract(locks, '$.{queue}') IS NULL
            OR json_extract(locks, '$.{queue}') < datetime('now')
        )
        ORDER BY RANDOM()
        LIMIT 1
        """
        rs = await fetchone(self._db_file, qs)
        return Account.from_rs(rs) if rs else None

Add get_id_from_username() method

Please, add a method that allows getting the ID from a username. Otherwise, we would manually search and convert those usernames to IDs. Thanks.
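
Until such a method exists, a thin wrapper around `api.user_by_login` (shown in the README) may be enough. This is a sketch under assumptions: `get_id_from_username` and `get_ids` are hypothetical helpers, and it assumes the returned User object exposes an `id` attribute and that the call returns None when no user is found.

```python
import asyncio


async def get_id_from_username(api, username):
    """Resolve a username to a numeric user id, or None if no user is found."""
    user = await api.user_by_login(username)
    return user.id if user is not None else None


async def get_ids(api, usernames):
    """Resolve several usernames concurrently and return {username: id}."""
    ids = await asyncio.gather(*(get_id_from_username(api, u) for u in usernames))
    return dict(zip(usernames, ids))
```

Here `api` is assumed to be a twscrape `API` instance with at least one logged-in account.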

support for proxies?

Are proxies supported?

How do I Get recent followers

How do I get the most recent followers of a specified account? It only returns the top ones.

[CRUCIAL] Add exception when rate limit has been reached

Please add a way to know whether an account has been rate-limited. An exception would probably work, even though there may be more efficient ways to implement it.

Thanks!

[twitter/question] EDIT: wrong repo

EDIT: sorry posted question to the wrong repo

No account available for queue "SearchTimeline"

Hello, thanks for your great work!
I'm getting a message "2023-07-31 13:12:01.671 | INFO | twscrape.accounts_pool:get_for_queue_or_wait:260 - No account available for queue "SearchTimeline". Next available at 23:40:06"

I'm using this code:

import asyncio
from twscrape import API, gather
from twscrape.logger import set_log_level

async def main():
    api = API()  # or API("path-to.db") - default is accounts.db

    # ADD ACCOUNTS (for CLI usage see BELOW)
    # I use your method to obtain cookies from Issue #45
    cookies_acc1 = "xxx"
    cookies_acc2 = "xxx"
    cookies_acc3 = "xxx"
    cookies_acc4 = "xxx"

    # of course I use my login data for the accounts
    await api.pool.add_account("user1", "pass1", "[email protected]", "mail_pass1", cookies=cookies_acc1)
    await api.pool.add_account("user2", "pass2", "[email protected]", "mail_pass2", cookies=cookies_acc2)
    await api.pool.add_account("user3", "pass3", "[email protected]", "mail_pass3", cookies=cookies_acc3)
    await api.pool.add_account("user4", "pass4", "[email protected]", "mail_pass4", cookies=cookies_acc4)
    await api.pool.login_all()

    # API USAGE

    # search (latest tab)
    await gather(api.search("elon musk", limit=20))  # list[Tweet]

    # tweet info
    tweet_id = 20
    await api.tweet_details(tweet_id)  # Tweet
    await gather(api.retweeters(tweet_id, limit=20))  # list[User]
    await gather(api.favoriters(tweet_id, limit=20))  # list[User]

    # get user by login
    user_login = "twitterdev"
    await api.user_by_login(user_login)  # User

    # user info
    user_id = 2244994945
    await api.user_by_id(user_id)  # User
    await gather(api.followers(user_id, limit=20))  # list[User]
    await gather(api.following(user_id, limit=20))  # list[User]
    await gather(api.user_tweets(user_id, limit=20))  # list[Tweet]
    await gather(api.user_tweets_and_replies(user_id, limit=20))  # list[Tweet]

    # list info
    list_id = 123456789
    await gather(api.list_timeline(list_id))

    # NOTE 1: gather is a helper function to receive all data as list, FOR can be used as well:
    async for tweet in api.search("elon musk"):
        print(tweet.id, tweet.user.username, tweet.rawContent)  # tweet is `Tweet` object

    # NOTE 2: all methods have `raw` version (returns `httpx.Response` object):
    async for rep in api.search_raw("elon musk"):
        print(rep.status_code, rep.json())  # rep is `httpx.Response` object

    # change log level, default info
    set_log_level("DEBUG")

    # Tweet & User model can be converted to regular dict or json, e.g.:
    doc = await api.user_by_id(user_id)  # User
    doc.dict()  # -> python dict
    doc.json()  # -> json string

if __name__ == "__main__":
    asyncio.run(main())

My output is this:
2023-07-31 13:12:01.656 | WARNING | twscrape.accounts_pool:add_account:76 - Account user1 already exists
2023-07-31 13:12:01.659 | WARNING | twscrape.accounts_pool:add_account:76 - Account user2 already exists
2023-07-31 13:12:01.663 | WARNING | twscrape.accounts_pool:add_account:76 - Account user3 already exists
2023-07-31 13:12:01.665 | WARNING | twscrape.accounts_pool:add_account:76 - Account user4 already exists
2023-07-31 13:12:01.671 | INFO | twscrape.accounts_pool:get_for_queue_or_wait:260 - No account available for queue "SearchTimeline". Next available at 23:40:06

Is there something wrong with my accounts, or am I doing something wrong here? I have one account created on Gmail, one on Yahoo, one on Hotmail, and one on a Polish website (o2.pl).
Thanks for help in advance!

how to use new!!

Hello, I want to get tweets by giving the username of the account I want to scrape, plus a start date (since) and an end date (until). Could you give me an example of how to do that, please?
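
One way to approach this is to build a search query from X's standard search operators (`from:`, `since:`, `until:`) and pass it to `api.search`. The helper below is a hypothetical sketch. Whether the search endpoint honours the date operators exactly is not guaranteed here.

```python
from datetime import date


def build_user_query(username, since, until):
    """Build a search query for one account's tweets between two dates."""
    name = username.lstrip("@")
    return f"from:{name} since:{since.isoformat()} until:{until.isoformat()}"
```

The resulting string, for example `build_user_query("xdevelopers", date(2023, 7, 12), date(2023, 7, 13))`, can then be used as the query in `api.search(query, limit=...)`.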

Add outfile param and compression ability in CLI

Add the ability to specify an output file and compress results for search commands in the CLI.

Also, keep the ability to use these functions from code, in this style:

with dump_to_file(outfile) as fp:
    async for doc in api.search():
        fp.write(doc.json())

Formats: .jsonl, .jsonl.gz

the dump_to_file function should stream results into the file
the output format should be guessed from the outfile extension unless explicitly indicated

usage like:

# will save into `q1.jsonl` file without compression
twscrape search "QUERY" --limit=20 --out q1.jsonl

# will save into `q2.jsonl.gz` file with compression
twscrape search "QUERY" --limit=20 --out q2.jsonl.gz
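
To make the proposal concrete, here is a minimal sketch of what such a helper could look like. It is one possible reading of the request, not an existing twscrape function: `dump_to_file` follows the name used above, `write_docs` is a hypothetical companion, and compression is chosen purely from the `.gz` suffix.

```python
import gzip
from contextlib import contextmanager


@contextmanager
def dump_to_file(outfile):
    """Open outfile for text writing; use gzip when the name ends with .gz."""
    opener = gzip.open if outfile.endswith(".gz") else open
    fp = opener(outfile, "wt", encoding="utf-8")
    try:
        yield fp
    finally:
        fp.close()


def write_docs(outfile, docs):
    """Write one JSON string per line (JSONL), streaming as it goes."""
    with dump_to_file(outfile) as fp:
        for doc in docs:
            fp.write(doc + "\n")
```

Because the context manager is synchronous, it can also wrap an `async for doc in api.search(...)` loop inside an async function, as in the style shown above.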

Get user links when scraping profile

Hi @vladkens, would it be possible to scrape twitter user profile links, if present?
Right now I can't see this data in returned User type. Or isn't it possible, since Twitter API
is not returning it?

Search endpoint changed

Hello,
Search now returns 404. Judging by other scrapers, it seems that the endpoint has been changed to the GraphQL version too.

Trying to deploy twscrape with no luck

Hi, I tried to deploy a Flask app that returns tweets using twscrape, but I always get the same error: "sqlite version {version} is too old please use 3.41 or higher". I understand this error comes from db.py in twscrape, but it looks like every Flask app host out there uses an SQLite version lower than 3.41, and I can't seem to update it. How might this be fixed? Can I bypass the db altogether and log in with a hard-coded username and password? Thanks. Here's my code and my error:

from flask import Flask, jsonify
from flask_cors import CORS
from twscrape import API, gather
import asyncio

app = Flask(__name__)
CORS(app)

async def main():
    api = API()  # or API("path-to.db") - default is accounts.db

    # ADD ACCOUNTS (for CLI usage see BELOW)
    await api.pool.add_account("myusername", "andmypassword", "[email protected]", "pass")
    await api.pool.login_all()

    tweets = await gather(api.user_tweets("1492111965644283910", limit=10))
    tweets_array = []
    for tweet in tweets:
        tweets_array.append({
            'id': tweet.id_str,
            'url': tweet.url,
            'date': str(tweet.date),
            'lang': tweet.lang,
            'rawContent': tweet.rawContent,
            'replyCount': tweet.replyCount,
            'retweetCount': tweet.retweetCount,
            'likeCount': tweet.likeCount,
            'hashtags': tweet.hashtags,
            'viewCount': tweet.viewCount,
            'place': tweet.place,
        })
    return tweets_array

@app.route('/')
def hello_world():
    ls = asyncio.run(main())
    return jsonify(ls)

if __name__ == '__main__':
    app.run()

The error I'm getting: SQLite version [whatever version less than 3.41] is too old, please upgrade to 3.41

thanks
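
When debugging this kind of error, it helps to check which SQLite library the Python interpreter is actually linked against, since that can differ from the `sqlite3` command-line tool on the same host. The sketch below uses only the standard library. `sqlite_at_least` is a hypothetical helper, and the (3, 41) threshold in the usage note is simply the version quoted in the error above.

```python
import sqlite3


def sqlite_at_least(minimum):
    """True if the SQLite library linked into Python is >= minimum (a tuple)."""
    return sqlite3.sqlite_version_info >= tuple(minimum)


def describe_sqlite():
    """Return a short description of the linked SQLite version."""
    return f"SQLite {sqlite3.sqlite_version} (linked into Python)"
```

Running `print(describe_sqlite(), sqlite_at_least((3, 41)))` on the hosting platform shows whether the interpreter there meets the requirement.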

searching by date

Is searching by date supported?
I am trying to search by date with this:
search "elon since:2023-07-12_11:59:59_UTC until:2023-07-13_11:59:59_UTC"
but getting results as far back as 2021.

Infinite loop sqlite when creating Docker

Hi, thanks for your work!

I read the other issues related to sqlite3 versions 3.34 and 3.35 when building on a Debian OS.

But after switching to the suggested python:3.10-alpine image, I get the following infinite loop, and the app does not work because it is stuck in it. It worked locally, though, which is the strange thing.

This is the loop (seems it is trying to reconnect):

DEBUG:aiosqlite:executing functools.partial(<built-in method fetchone of sqlite3.Cursor object at 0x7f4016cef9c0>)
DEBUG:aiosqlite:operation functools.partial(<built-in method fetchone of sqlite3.Cursor object at 0x7f4016cef9c0>) completed
DEBUG:aiosqlite:executing functools.partial(<built-in method close of sqlite3.Cursor object at 0x7f4016cef9c0>)
DEBUG:aiosqlite:operation functools.partial(<built-in method close of sqlite3.Cursor object at 0x7f4016cef9c0>) completed
DEBUG:aiosqlite:executing functools.partial(<built-in method commit of sqlite3.Connection object at 0x7f40172caf40>)
DEBUG:aiosqlite:operation functools.partial(<built-in method commit of sqlite3.Connection object at 0x7f40172caf40>) completed
DEBUG:aiosqlite:executing functools.partial(<built-in method close of sqlite3.Connection object at 0x7f40172caf40>)
DEBUG:aiosqlite:operation functools.partial(<built-in method close of sqlite3.Connection object at 0x7f40172caf40>) completed
DEBUG:aiosqlite:executing <function connect.<locals>.connector at 0x7f4016cc5990>
DEBUG:aiosqlite:operation <function connect.<locals>.connector at 0x7f4016cc5990> completed
DEBUG:aiosqlite:executing functools.partial(<built-in method execute of sqlite3.Connection object at 0x7f40172caf40>, 'SELECT SQLITE_VERSION()', [])
DEBUG:aiosqlite:operation functools.partial(<built-in method execute of sqlite3.Connection object at 0x7f40172caf40>, 'SELECT SQLITE_VERSION()', []) completed
DEBUG:aiosqlite:executing functools.partial(<built-in method fetchone of sqlite3.Cursor object at 0x7f4016cef9c0>)
DEBUG:aiosqlite:operation functools.partial(<built-in method fetchone of sqlite3.Cursor object at 0x7f4016cef9c0>) completed
DEBUG:aiosqlite:executing functools.partial(<built-in method close of sqlite3.Cursor object at 0x7f4016cef9c0>)
DEBUG:aiosqlite:operation functools.partial(<built-in method close of sqlite3.Cursor object at 0x7f4016cef9c0>) completed
DEBUG:aiosqlite:executing functools.partial(<built-in method close of sqlite3.Connection object at 0x7f40172caf40>)
DEBUG:aiosqlite:operation functools.partial(<built-in method close of sqlite3.Connection object at 0x7f40172caf40>) completed
DEBUG:aiosqlite:executing <function connect.<locals>.connector at 0x7f4016cc6680>
DEBUG:aiosqlite:operation <function connect.<locals>.connector at 0x7f4016cc6680> completed
DEBUG:aiosqlite:executing functools.partial(<built-in method execute of sqlite3.Connection object at 0x7f40172cb840>, "\n            UPDATE accounts SET locks = json_s

And this is my Dockerfile

FROM python:3.10-alpine
ARG SQLITE_Y=2021
ARG SQLITE_V=3350400

RUN pip install --upgrade pip
RUN python -c "import sqlite3;print(sqlite3.sqlite_version)"

# https://www.sqlite.org/chronology.html
RUN apk add build-base
RUN wget https://sqlite.org/${SQLITE_Y}/sqlite-autoconf-${SQLITE_V}.tar.gz -O sqlite.tar.gz \
    && tar xvfz sqlite.tar.gz \
    && cd sqlite-autoconf-${SQLITE_V} \
    && ./configure --prefix=/usr/local --build=aarch64-unknown-linux-gnu \
    && make \
    && make install \
    && cd .. \
    && rm -rf sqlite*

RUN apk add --no-cache mariadb-connector-c-dev ;\
    apk add --no-cache --virtual .build-deps \
        build-base \
        mariadb-dev ;\
    pip install mysqlclient;\
    apk del .build-deps
RUN apk add --update --no-cache g++ gcc libxslt-dev
RUN python -m pip install --upgrade pip setuptools wheel
RUN python -m pip install Pillow

WORKDIR /app

COPY . .

RUN pip install --upgrade pip &&  \
    pip3 install --no-cache-dir -r requirements.txt && \
    pip3 install --no-cache-dir search-engine-parser

EXPOSE 8080

CMD ["python3", "-u", "./app.py"]

SQLITE Db error

Hi, thanks for your work.

I got this error when running the following with my credentials:

import asyncio
from twscrape import AccountsPool, API, gather
from twscrape.logger import set_log_level

async def main():
    pool = AccountsPool()  # or AccountsPool("path-to.db") - default is accounts.db
    await pool.add_account("", "", "@gmail.com", "")

    # log in to all new accounts
    await pool.login_all()

    api = API(pool)
    async for tweet in api.search("elon musk"):
        print(tweet.id, tweet.user.username, tweet.rawContent)  # tweet is `Tweet` object

if __name__ == "__main__":
    asyncio.run(main())

Traceback (most recent call last):
  File "/root/twscrape/test.py", line 17, in <module>
    asyncio.run(main())
  File "/usr/local/lib/python3.11/asyncio/runners.py", line 190, in run
    return runner.run(main)
           ^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/asyncio/runners.py", line 118, in run
    return self._loop.run_until_complete(task)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/asyncio/base_events.py", line 653, in run_until_complete
    return future.result()
           ^^^^^^^^^^^^^^^
  File "/root/twscrape/test.py", line 10, in main
    await pool.login_all()
  File "/root/twscrape/twscrape/accounts_pool.py", line 86, in login_all
    await self.login(x)
  File "/root/twscrape/twscrape/accounts_pool.py", line 75, in login
    await self.save(account)
  File "/root/twscrape/twscrape/accounts_pool.py", line 67, in save
    await execute(self._db_file, qs, data)
  File "/root/twscrape/twscrape/db.py", line 16, in wrapper
    raise e
  File "/root/twscrape/twscrape/db.py", line 13, in wrapper
    return await func(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/root/twscrape/twscrape/db.py", line 61, in execute
    await db.execute(qs, params)
  File "/usr/local/lib/python3.11/site-packages/aiosqlite/core.py", line 184, in execute
    cursor = await self._execute(self._conn.execute, sql, parameters)
             ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/site-packages/aiosqlite/core.py", line 129, in _execute
    return await future
           ^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/site-packages/aiosqlite/core.py", line 102, in run
    result = function()
             ^^^^^^^^^^
sqlite3.OperationalError: near "UPDATE": syntax error

I tested this on Debian 11 and Ubuntu 22.04.

Sqlite error when running inside ubuntu docker container

Hi, I've tried running twscrape in a dockerized FastAPI project, but I'm getting this exception
when trying to get a user by login:

  File "/usr/local/lib/python3.11/site-packages/huntersdao/twitter_scraper/core.py", line 112, in get_user_by_login
    user = await self.api.user_by_login(user_login)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/site-packages/twscrape/api.py", line 118, in user_by_login
    rep = await self.user_by_login_raw(login, kv=kv)
          ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/site-packages/twscrape/api.py", line 115, in user_by_login_raw
    return await self._gql_item(op, kv)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/site-packages/twscrape/api.py", line 62, in _gql_item
    async with QueueClient(self.pool, queue, self.debug) as client:
  File "/usr/local/lib/python3.11/site-packages/twscrape/queue_client.py", line 37, in __aenter__
    await self._get_ctx()
  File "/usr/local/lib/python3.11/site-packages/twscrape/queue_client.py", line 56, in _get_ctx
    acc = await self.pool.get_for_queue_or_wait(self.queue)
          ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/site-packages/twscrape/accounts_pool.py", line 162, in get_for_queue_or_wait
    account = await self.get_for_queue(queue)
              ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/site-packages/twscrape/accounts_pool.py", line 157, in get_for_queue
    rs = await fetchone(self._db_file, q2)
         ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/site-packages/twscrape/db.py", line 20, in wrapper
    raise e
  File "/usr/local/lib/python3.11/site-packages/twscrape/db.py", line 17, in wrapper
    return await func(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/site-packages/twscrape/db.py", line 126, in fetchone
    async with db.execute(qs, params) as cur:
  File "/usr/local/lib/python3.11/site-packages/aiosqlite/context.py", line 41, in __aenter__
    self._obj = await self._coro
                ^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/site-packages/aiosqlite/core.py", line 184, in execute
    cursor = await self._execute(self._conn.execute, sql, parameters)
             ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/site-packages/aiosqlite/core.py", line 129, in _execute
    return await future
           ^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/site-packages/aiosqlite/core.py", line 102, in run
    result = function()
             ^^^^^^^^^^
sqlite3.OperationalError: near "RETURNING": syntax error

Any ideas? I am using the latest version, 0.2.1, from PyPI.
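
A `near "RETURNING": syntax error` is what SQLite typically raises when the library linked into Python predates the RETURNING clause, which was added in SQLite 3.35.0. Whether that is the cause here is an assumption, but it is quick to check from inside the container. A minimal sketch (the helper name `supports_returning` is hypothetical and not part of twscrape):

```python
import sqlite3

# The RETURNING clause was introduced in SQLite 3.35.0.
MIN_RETURNING_VERSION = (3, 35, 0)


def supports_returning(version_info=sqlite3.sqlite_version_info):
    """Return True if the given SQLite library version understands RETURNING."""
    return tuple(version_info) >= MIN_RETURNING_VERSION


# sqlite3.sqlite_version is the version of the SQLite library that Python
# is linked against, not the version of the `sqlite3` command-line tool.
print("SQLite library version:", sqlite3.sqlite_version)
print("RETURNING supported:", supports_returning())
```

If this prints `False`, the SQLite library that Python loads is older than 3.35.0, which is usually resolved by using a base image that ships a newer SQLite.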

Add parameter cursor in functions

I saw that there is a function to get the cursor:

    def _get_cursor(self, obj: dict):
        if cur := find_obj(obj, lambda x: x.get("cursorType") == "Bottom"):
            return cur.get("value")
        return None

However, even once you have the cursor, there is no way to pass it to a function through a parameter. The only option is to call the _gql_items function directly, and I would like to see a simpler way to do this in the library.
Would it be possible to add a cursor parameter and have a function (e.g. followers) return the last received cursor, or return the cursor along with each user?
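
To illustrate what the `_get_cursor` lookup quoted above does, here is a self-contained sketch. The `find_obj` below is a hypothetical stand-in written for this example (a depth-first search for the first dict matching a predicate), and the sample response is made up. Neither is taken from the twscrape source.

```python
from typing import Any, Callable, Optional


def find_obj(obj: Any, predicate: Callable[[dict], bool]) -> Optional[dict]:
    """Hypothetical helper: depth-first search for the first matching dict."""
    if isinstance(obj, dict):
        if predicate(obj):
            return obj
        children = list(obj.values())
    elif isinstance(obj, list):
        children = obj
    else:
        return None
    for child in children:
        found = find_obj(child, predicate)
        if found is not None:
            return found
    return None


def get_cursor(obj: dict) -> Optional[str]:
    """Return the value of the "Bottom" cursor entry, or None if absent."""
    cur = find_obj(obj, lambda x: x.get("cursorType") == "Bottom")
    return cur.get("value") if cur else None


# Made-up response fragment containing a top and a bottom cursor entry.
sample_response = {
    "entries": [
        {"content": {"cursorType": "Top", "value": "top-cursor"}},
        {"content": {"cursorType": "Bottom", "value": "bottom-cursor"}},
    ]
}

print(get_cursor(sample_response))
```

A public cursor parameter, as requested above, would presumably take the value returned here and send it with the next request so that paging resumes from that point.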

user retweet content is not true full text

For example:

{
                                        "entryId": "tweet-1665951747842641921",
                                        "sortIndex": "1682387836784869348",
                                        "content": {
                                            "entryType": "TimelineTimelineItem",
                                            "__typename": "TimelineTimelineItem",
                                            "itemContent": {
                                                "itemType": "TimelineTweet",
                                                "__typename": "TimelineTweet",
                                                "tweet_results": {
                                                    "result": {
                                                        "__typename": "Tweet",
                                                        "rest_id": "1665951747842641921",
                                                        "core": {
                                                            "user_results": {
                                                                "result": {
                                                                    "__typename": "User",
                                                                    "id": "VXNlcjoxNTM5Nzc0OTEzOTY3NzE0MzA0",
                                                                    "rest_id": "1539774913967714304",
                                                                    "affiliates_highlighted_label": {},
                                                                    "has_graduated_access": true,
                                                                    "is_blue_verified": false,
                                                                    "profile_image_shape": "Square",
                                                                    "legacy": {
                                                                        "can_dm": false,
                                                                        "can_media_tag": true,
                                                                        "created_at": "Thu Jun 23 00:59:21 +0000 2022",
                                                                        "default_profile": true,
                                                                        "default_profile_image": false,
                                                                        "description": "AVIC International Holding Corporation (AVIC INTL) is a global share-holding enterprise affiliated to Aviation Industry Corporation of China (AVIC).",
                                                                        "entities": {
                                                                            "description": {
                                                                                "urls": []
                                                                            }
                                                                        },
                                                                        "fast_followers_count": 0,
                                                                        "favourites_count": 118,
                                                                        "followers_count": 5911,
                                                                        "friends_count": 110,
                                                                        "has_custom_timelines": true,
                                                                        "is_translator": false,
                                                                        "listed_count": 3,
                                                                        "location": "",
                                                                        "media_count": 172,
                                                                        "name": "AVIC INTL",
                                                                        "normal_followers_count": 5911,
                                                                        "pinned_tweet_ids_str": [],
                                                                        "possibly_sensitive": false,
                                                                        "profile_banner_url": "https://pbs.twimg.com/profile_banners/1539774913967714304/1656380265",
                                                                        "profile_image_url_https": "https://pbs.twimg.com/profile_images/1542841676573466625/Twwv8q0a_normal.jpg",
                                                                        "profile_interstitial_type": "",
                                                                        "screen_name": "_AVICINTL",
                                                                        "statuses_count": 244,
                                                                        "translator_type": "none",
                                                                        "verified": false,
                                                                        "verified_type": "Business",
                                                                        "want_retweets": false,
                                                                        "withheld_in_countries": []
                                                                    },
                                                                    "professional": {
                                                                        "rest_id": "1565386974147387392",
                                                                        "professional_type": "Business",
                                                                        "category": []
                                                                    }
                                                                }
                                                            }
                                                        },
                                                        "edit_control": {
                                                            "edit_tweet_ids": [
                                                                "1665951747842641921"
                                                            ],
                                                            "editable_until_msecs": "1686032423229",
                                                            "is_edit_eligible": false,
                                                            "edits_remaining": "5"
                                                        },
                                                        "edit_perspective": {
                                                            "favorited": false,
                                                            "retweeted": false
                                                        },
                                                        "is_translatable": false,
                                                        "views": {
                                                            "state": "Enabled"
                                                        },
                                                        "source": "<a href=\"https://mobile.twitter.com\" rel=\"nofollow\">Twitter Web App</a>",
                                                        "legacy": {
                                                            "bookmark_count": 0,
                                                            "bookmarked": false,
                                                            "created_at": "Tue Jun 06 05:20:23 +0000 2023",
                                                            "conversation_id_str": "1665951747842641921",
                                                            "display_text_range": [
                                                                0,
                                                                140
                                                            ],
                                                            "entities": {
                                                                "user_mentions": [
                                                                    {
                                                                        "id_str": "87775422",
                                                                        "name": "China Daily",
                                                                        "screen_name": "ChinaDaily",
                                                                        "indices": [
                                                                            3,
                                                                            14
                                                                        ]
                                                                    }
                                                                ],
                                                                "urls": [],
                                                                "hashtags": [],
                                                                "symbols": []
                                                            },
                                                            "favorite_count": 0,
                                                            "favorited": false,
                                                            "full_text": "RT @ChinaDaily: Today marks the arrival of a traditional Chinese solar term called mangzhong, or Grain in Ear, signifying a busy farming pe…",
                                                            "is_quote_status": false,
                                                            "lang": "en",
                                                            "quote_count": 0,
                                                            "reply_count": 0,
                                                            "retweet_count": 10,
                                                            "retweeted": false,
                                                            "user_id_str": "1539774913967714304",
                                                            "id_str": "1665951747842641921",
                                                            "retweeted_status_result": {
                                                                "result": {
                                                                    "__typename": "Tweet",
                                                                    "rest_id": "1665863576060575744",
                                                                    "core": {
                                                                        "user_results": {
                                                                            "result": {
                                                                                "__typename": "User",
                                                                                "id": "VXNlcjo4Nzc3NTQyMg==",
                                                                                "rest_id": "87775422",
                                                                                "affiliates_highlighted_label": {},
                                                                                "has_graduated_access": true,
                                                                                "is_blue_verified": true,
                                                                                "profile_image_shape": "Circle",
                                                                                "legacy": {
                                                                                    "can_dm": true,
                                                                                    "can_media_tag": true,
                                                                                    "created_at": "Thu Nov 05 20:30:10 +0000 2009",
                                                                                    "default_profile": false,
                                                                                    "default_profile_image": false,
                                                                                    "description": "Start a conversation as we share news and analysis from #China and beyond. \nFB: China Daily \nTelegram: https://t.co/7nB28KWLCr…\nIns: chinadailynews",
                                                                                    "entities": {
                                                                                        "description": {
                                                                                            "urls": [
                                                                                                {
                                                                                                    "display_url": "t.me/chinadaily_off",
                                                                                                    "expanded_url": "http://t.me/chinadaily_off",
                                                                                                    "url": "https://t.co/7nB28KWLCr",
                                                                                                    "indices": [
                                                                                                        103,
                                                                                                        126
                                                                                                    ]
                                                                                                }
                                                                                            ]
                                                                                        }
                                                                                    },
                                                                                    "fast_followers_count": 0,
                                                                                    "favourites_count": 1155,
                                                                                    "followers_count": 4161285,
                                                                                    "friends_count": 568,
                                                                                    "has_custom_timelines": false,
                                                                                    "is_translator": false,
                                                                                    "listed_count": 6975,
                                                                                    "location": "Beijing, China",
                                                                                    "media_count": 110028,
                                                                                    "name": "China Daily",
                                                                                    "normal_followers_count": 4161285,
                                                                                    "pinned_tweet_ids_str": [
                                                                                        "1679729828105445377"
                                                                                    ],
                                                                                    "possibly_sensitive": false,
                                                                                    "profile_banner_url": "https://pbs.twimg.com/profile_banners/87775422/1683168821",
                                                                                    "profile_image_url_https": "https://pbs.twimg.com/profile_images/1598185470353022976/-KlKi0WI_normal.jpg",
                                                                                    "profile_interstitial_type": "",
                                                                                    "screen_name": "ChinaDaily",
                                                                                    "statuses_count": 204098,
                                                                                    "translator_type": "none",
                                                                                    "verified": false,
                                                                                    "want_retweets": false,
                                                                                    "withheld_in_countries": []
                                                                                },
                                                                                "professional": {
                                                                                    "rest_id": "1473187902209536003",
                                                                                    "professional_type": "Creator",
                                                                                    "category": [
                                                                                        {
                                                                                            "id": 580,
                                                                                            "name": "Media & News Company",
                                                                                            "icon_name": "IconBriefcaseStroke"
                                                                                        }
                                                                                    ]
                                                                                }
                                                                            }
                                                                        }
                                                                    },
                                                                    "edit_control": {
                                                                        "edit_tweet_ids": [
                                                                            "1665863576060575744"
                                                                        ],
                                                                        "editable_until_msecs": "1686011401000",
                                                                        "is_edit_eligible": true,
                                                                        "edits_remaining": "5"
                                                                    },
                                                                    "edit_perspective": {
                                                                        "favorited": false,
                                                                        "retweeted": false
                                                                    },
                                                                    "is_translatable": false,
                                                                    "views": {
                                                                        "count": "5188",
                                                                        "state": "EnabledWithCount"
                                                                    },
                                                                    "source": "<a href=\"http://erased21340332_45IhsckRBy.com\" rel=\"nofollow\">erased21340332_45IhsckRBy</a>",
                                                                    "legacy": {
                                                                        "bookmark_count": 0,
                                                                        "bookmarked": false,
                                                                        "created_at": "Mon Jun 05 23:30:01 +0000 2023",
                                                                        "conversation_id_str": "1665863576060575744",
                                                                        "display_text_range": [
                                                                            0,
                                                                            128
                                                                        ],
                                                                        "entities": {
                                                                            "media": [
                                                                                {
                                                                                    "display_url": "pic.twitter.com/SQMrX99bWr",
                                                                                    "expanded_url": "https://twitter.com/ChinaDaily/status/1665863576060575744/video/1",
                                                                                    "id_str": "1665861064259686400",
                                                                                    "indices": [
                                                                                        129,
                                                                                        152
                                                                                    ],
                                                                                    "media_url_https": "https://pbs.twimg.com/ext_tw_video_thumb/1665861064259686400/pu/img/D1qkMrHVMTV_ifFk.jpg",
                                                                                    "type": "photo",
                                                                                    "url": "https://t.co/SQMrX99bWr",
                                                                                    "features": {},
                                                                                    "sizes": {
                                                                                        "large": {
                                                                                            "h": 640,
                                                                                            "w": 296,
                                                                                            "resize": "fit"
                                                                                        },
                                                                                        "medium": {
                                                                                            "h": 640,
                                                                                            "w": 296,
                                                                                            "resize": "fit"
                                                                                        },
                                                                                        "small": {
                                                                                            "h": 640,
                                                                                            "w": 296,
                                                                                            "resize": "fit"
                                                                                        },
                                                                                        "thumb": {
                                                                                            "h": 150,
                                                                                            "w": 150,
                                                                                            "resize": "crop"
                                                                                        }
                                                                                    },
                                                                                    "original_info": {
                                                                                        "height": 640,
                                                                                        "width": 296
                                                                                    }
                                                                                }
                                                                            ],
                                                                            "user_mentions": [],
                                                                            "urls": [],
                                                                            "hashtags": [],
                                                                            "symbols": []
                                                                        },
                                                                        "extended_entities": {
                                                                            "media": [
                                                                                {
                                                                                    "display_url": "pic.twitter.com/SQMrX99bWr",
                                                                                    "expanded_url": "https://twitter.com/ChinaDaily/status/1665863576060575744/video/1",
                                                                                    "id_str": "1665861064259686400",
                                                                                    "indices": [
                                                                                        129,
                                                                                        152
                                                                                    ],
                                                                                    "media_key": "7_1665861064259686400",
                                                                                    "media_url_https": "https://pbs.twimg.com/ext_tw_video_thumb/1665861064259686400/pu/img/D1qkMrHVMTV_ifFk.jpg",
                                                                                    "type": "video",
                                                                                    "url": "https://t.co/SQMrX99bWr",
                                                                                    "additional_media_info": {
                                                                                        "monetizable": false
                                                                                    },
                                                                                    "mediaStats": {
                                                                                        "viewCount": 961
                                                                                    },
                                                                                    "ext_media_availability": {
                                                                                        "status": "Available"
                                                                                    },
                                                                                    "features": {},
                                                                                    "sizes": {
                                                                                        "large": {
                                                                                            "h": 640,
                                                                                            "w": 296,
                                                                                            "resize": "fit"
                                                                                        },
                                                                                        "medium": {
                                                                                            "h": 640,
                                                                                            "w": 296,
                                                                                            "resize": "fit"
                                                                                        },
                                                                                        "small": {
                                                                                            "h": 640,
                                                                                            "w": 296,
                                                                                            "resize": "fit"
                                                                                        },
                                                                                        "thumb": {
                                                                                            "h": 150,
                                                                                            "w": 150,
                                                                                            "resize": "crop"
                                                                                        }
                                                                                    },
                                                                                    "original_info": {
                                                                                        "height": 640,
                                                                                        "width": 296
                                                                                    },
                                                                                    "video_info": {
                                                                                        "aspect_ratio": [
                                                                                            37,
                                                                                            80
                                                                                        ],
                                                                                        "duration_millis": 16000,
                                                                                        "variants": [
                                                                                            {
                                                                                                "content_type": "application/x-mpegURL",
                                                                                                "url": "https://video.twimg.com/ext_tw_video/1665861064259686400/pu/pl/pPnoCo_jLID-udSS.m3u8?tag=12&container=fmp4"
                                                                                            },
                                                                                            {
                                                                                                "bitrate": 632000,
                                                                                                "content_type": "video/mp4",
                                                                                                "url": "https://video.twimg.com/ext_tw_video/1665861064259686400/pu/vid/296x640/qQVrB3CenvchQsuP.mp4?tag=12"
                                                                                            }
                                                                                        ]
                                                                                    }
                                                                                }
                                                                            ]
                                                                        },
                                                                        "favorite_count": 27,
                                                                        "favorited": false,
                                                                        "full_text": "Today marks the arrival of a traditional Chinese solar term called mangzhong, or Grain in Ear, signifying a busy farming period. https://t.co/SQMrX99bWr",
                                                                        "is_quote_status": false,
                                                                        "lang": "en",
                                                                        "possibly_sensitive": false,
                                                                        "possibly_sensitive_editable": true,
                                                                        "quote_count": 5,
                                                                        "reply_count": 4,
                                                                        "retweet_count": 10,
                                                                        "retweeted": false,
                                                                        "user_id_str": "87775422",
                                                                        "id_str": "1665863576060575744"
                                                                    }
                                                                }
                                                            }
                                                        },
                                                        "quick_promote_eligibility": {
                                                            "eligibility": "IneligibleNotProfessional"
                                                        }
                                                    }
                                                },
                                                "tweetDisplayType": "Tweet"
                                            },
                                            "clientEventInfo": {
                                                "component": "tweet",
                                                "element": "tweet",
                                                "details": {
                                                    "timelinesDetails": {
                                                        "injectionType": "RankedOrganicTweet",
                                                        "controllerData": "DAACDAABDAABCgABAAAAAAAAAAAKAAkPUKvlsZRgAAAAAAA="
                                                    }
                                                }
                                            }
                                        }
                                    }

The true full text is "Today marks the arrival of a traditional Chinese solar term called mangzhong, or Grain in Ear, signifying a busy farming period. https://t.co/SQMrX99bWr", but the rawContent that twscrape returns is "RT @ChinaDaily: Today marks the arrival of a traditional Chinese solar term called mangzhong, or Grain in Ear, signifying a busy farming pe…"
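A possible workaround, sketched under assumptions: if the retweet object carries a reference to the original tweet, the untruncated text can be read from there. The `retweetedTweet` attribute name is an assumption about the Tweet model (only `rawContent` is named in this report), and `TweetStub` is a hypothetical stand-in so the example runs without the library.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class TweetStub:
    """Hypothetical stand-in for a Tweet model; only the fields used here."""
    rawContent: str
    retweetedTweet: Optional["TweetStub"] = None  # assumed attribute name


def full_text(tweet) -> str:
    """Prefer the original tweet's text when the tweet is a retweet."""
    original = getattr(tweet, "retweetedTweet", None)
    if original is not None:
        return original.rawContent
    return tweet.rawContent


original = TweetStub(
    "Today marks the arrival of a traditional Chinese solar term called "
    "mangzhong, or Grain in Ear, signifying a busy farming period. "
    "https://t.co/SQMrX99bWr"
)
retweet = TweetStub(
    "RT @ChinaDaily: Today marks the arrival of a traditional Chinese solar "
    "term called mangzhong, or Grain in Ear, signifying a busy farming pe…",
    retweetedTweet=original,
)

print(full_text(retweet))
```

The helper falls back to the tweet's own `rawContent` when no original is attached, so it can be applied to every item in a result list.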

No accounts available for queue 'UserTweetsAndReplies'

Hello,

Thanks for this code.
When I use your template I get this debug message:

2023-05-27 14:19:12.137 | DEBUG | twscrape.accounts_pool:get_for_queue_or_wait:129 - No accounts available for queue 'UserTweetsAndReplies' (sleeping for 5 sec)
2023-05-27 14:19:17.142 | DEBUG | twscrape.accounts_pool:get_for_queue_or_wait:129 - No accounts available for queue 'UserTweetsAndReplies' (sleeping for 5 sec)


    await gather(api.search("elon musk", limit=1))  # list[Tweet]

    # graphql api
    tweet_id, user_id, user_login = 20, 2244994945, "twitterdev"

    await api.tweet_details(tweet_id)  # Tweet
    await gather(api.retweeters(tweet_id, limit=1))  # list[User]
    await gather(api.favoriters(tweet_id, limit=1))  # list[User]

    await api.user_by_id(user_id)  # User
    await api.user_by_login(user_login)  # User
    await gather(api.followers(user_id, limit=1))  # list[User]
    await gather(api.following(user_id, limit=1))  # list[User]
    await gather(api.user_tweets(user_id, limit=1))  # list[Tweet]
    await gather(api.user_tweets_and_replies(user_id, limit=1))  # list[Tweet]

What is wrong?

Raise exception when no free accounts pool

Hello, I would like to thank you for your project.

I encountered an unexpected issue: when no accounts are available, the methods wait indefinitely until an account becomes free. This is a significant problem for me, and I need to be able to handle such situations.

Could you please consider adding a setting that would change this behavior?

The API creation could look like this:
api = API(DB_PATH, raiseWhenNoAccount=True)

And the function responsible for getting an account would be like this:

async def get_for_queue_or_raise(self, queue: str) -> Account:
    account = await self.get_for_queue(queue)
    if not account:
        nat = await self.next_available_at(queue)
        raise NoAvalibleAccountException(f'No account available for queue "{queue}". Next available at {nat}')
    return account
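Until such a setting exists, one caller-side option is to bound the wait with a timeout. The following is a minimal sketch using only `asyncio.wait_for` from the standard library; `NoAccountTimeout`, `with_timeout` and `never_finishes` are hypothetical names introduced for illustration, and how the library behaves when a call is cancelled mid-wait has not been verified here.

```python
import asyncio


class NoAccountTimeout(Exception):
    """Hypothetical exception raised when a call does not finish in time."""


async def with_timeout(coro, seconds: float):
    """Await `coro`, but give up after `seconds` instead of waiting forever."""
    try:
        return await asyncio.wait_for(coro, timeout=seconds)
    except asyncio.TimeoutError as exc:
        raise NoAccountTimeout(f"no result within {seconds} s") from exc


async def never_finishes():
    # Stand-in for a scraper call that keeps waiting for a free account.
    await asyncio.sleep(3600)


async def demo() -> str:
    try:
        await with_timeout(never_finishes(), 0.05)
    except NoAccountTimeout:
        return "timed out"
    return "finished"


result = asyncio.run(demo())
print(result)
```

In real use the stand-in would be replaced by the actual call, for example a `gather(...)` over a search, and the timeout chosen to match how long a rate-limit window is acceptable to wait for.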

get_sqlite_version() reports old version

Hello @vladkens ,
I constantly have to comment out the SystemError raised in check_version(), because it triggers every time even though I have a compatible, newer sqlite installation. Any idea? When I manually check which version gets imported, it is always the new one, and everything works as expected without any issues.

Is it possible that the function is loading the system-wide install of sqlite3 instead of the one from the venv in which it is running?

Also, unrelated: is a port to MySQL (or an option for it) on the to-do list?
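For context, Python's `sqlite3` module belongs to the standard library and is linked against a SQLite C library when the interpreter is built or installed; a virtual environment reuses that module rather than shipping its own. The sketch below prints the version the running interpreter actually uses, which can differ from a separately installed `sqlite3` command-line tool. Whether that mismatch is the cause in this report is an assumption.

```python
import sqlite3

# Version of the SQLite C library this interpreter's sqlite3 module uses.
linked_version = sqlite3.sqlite_version

# The same information, asked from SQLite itself through a query.
conn = sqlite3.connect(":memory:")
queried_version = conn.execute("select sqlite_version()").fetchone()[0]
conn.close()

print("sqlite3.sqlite_version:", linked_version)
print("select sqlite_version():", queried_version)
```

Running this with the same interpreter (and the same venv activated) that runs the scraper shows which SQLite version a version check inside that process would see.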

[CRUCIAL] 400 Bad Request on code from example

Running code from example

api = API()  # or API("path-to.db") - default is `accounts.db`

# ADD ACCOUNTS (for CLI usage see BELOW)
await api.pool.add_account("user1", "pass1", "[email protected]", "mail_pass1")
await api.pool.login_all()

Here are the logs:

2023-08-27 21:13:32.397 | WARNING  | twscrape.utils:raise_for_status:25 - login_confirm_email - 400 - {"errors":[{"code":399,"message":"Incorrect. Please try again."}]}
2023-08-27 21:13:32.397 | ERROR    | twscrape.login:next_login_task:181 - Error in LoginAcid: Client error '400 Bad Request' for url 'https://api.twitter.com/1.1/onboarding/task.json'
For more information check: https://httpstatuses.com/400
2023-08-27 21:13:32.555 | WARNING  | twscrape.utils:raise_for_status:25 - login_confirm_email - 400 - {"errors":[{"code":399,"message":"Incorrect. Please try again."}]}

Twitter Blue / Verified Accounts

I've heard that rate limits differ between regular and X Premium / Twitter Blue accounts. Is this true? If so, can we change the limits when signed in with a Premium account?

Unable to fetch tweets

It would be good to add an exception that lets you detect whether you have been rate limited. I am not able to fetch tweets and I think the account I am using has been rate limited, but I have no way to confirm that this is the case.

Edit: actually, I have not been rate limited, but it is taking WAY longer to fetch tweets. Why?

get_user_by_id call sometimes fails

Hi again, @vladkens. After running the library for some time in a test environment, I've observed that I sometimes
receive this error when calling api.get_user_by_id(id):

File "/usr/local/lib/python3.11/site-packages/huntersdao/twitter_scraper/core.py", line 117, in get_user_by_id
    user = await self.api.user_by_id(user_id)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/site-packages/twscrape/api.py", line 107, in user_by_id
    res = rep.json()
          ^^^^^^^^
AttributeError: 'NoneType' object has no attribute 'json'

Could it be related to accounts getting temporarily blocked or restricted?
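As a stopgap on the caller's side, the failing call can be retried a few times before giving up. This is a minimal sketch, not a fix for the underlying cause: `retry_on_attribute_error` and `flaky_user_by_id` are hypothetical names, and the stand-in function only imitates the `AttributeError` from the traceback above.

```python
import asyncio


async def retry_on_attribute_error(make_call, attempts: int = 3, delay: float = 0.0):
    """Call `make_call()` up to `attempts` times, retrying on AttributeError."""
    if attempts < 1:
        raise ValueError("attempts must be at least 1")
    last_error = None
    for _ in range(attempts):
        try:
            return await make_call()
        except AttributeError as exc:
            last_error = exc
            await asyncio.sleep(delay)  # pause before the next attempt
    raise last_error


calls = {"n": 0}


async def flaky_user_by_id():
    # Stand-in for the real lookup: fails twice, then succeeds.
    calls["n"] += 1
    if calls["n"] < 3:
        raise AttributeError("'NoneType' object has no attribute 'json'")
    return {"id": 2244994945}


user = asyncio.run(retry_on_attribute_error(flaky_user_by_id))
print(user, "after", calls["n"], "calls")
```

In real code the stand-in would be replaced by something like `lambda: api.user_by_id(user_id)`, with a non-zero `delay` so a temporarily restricted account has time to recover.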

Exception in thread Thread-5:
Traceback (most recent call last):
  File "/home/carlos/.local/lib/python3.11/site-packages/aiosqlite/core.py", line 109, in run
    get_loop(future).call_soon_threadsafe(set_result, future, result)
  File "/usr/lib/python3.11/asyncio/base_events.py", line 806, in call_soon_threadsafe
    self._check_closed()
  File "/usr/lib/python3.11/asyncio/base_events.py", line 519, in _check_closed
    raise RuntimeError('Event loop is closed')
RuntimeError: Event loop is closed

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/usr/lib/python3.11/threading.py", line 1038, in _bootstrap_inner
    self.run()
  File "/home/carlos/.local/lib/python3.11/site-packages/aiosqlite/core.py", line 117, in run
    get_loop(future).call_soon_threadsafe(set_exception, future, e)
  File "/usr/lib/python3.11/asyncio/base_events.py", line 806, in call_soon_threadsafe
    self._check_closed()
  File "/usr/lib/python3.11/asyncio/base_events.py", line 519, in _check_closed
    raise RuntimeError('Event loop is closed')
RuntimeError: Event loop is closed
Exception ignored in: <coroutine object QueueClient.__aexit__ at 0x7fee85f73680>
Traceback (most recent call last):
  File "/home/carlos/.local/lib/python3.11/site-packages/twscrape/queue_client.py", line 41, in __aexit__
    await self._close_ctx()
  File "/home/carlos/.local/lib/python3.11/site-packages/twscrape/queue_client.py", line 51, in _close_ctx
    await self.pool.unlock(ctx.acc.username, self.queue, ctx.req_count)
  File "/home/carlos/.local/lib/python3.11/site-packages/twscrape/accounts_pool.py", line 181, in unlock
    await execute(self._db_file, qs, {"username": username})
  File "/home/carlos/.local/lib/python3.11/site-packages/twscrape/db.py", line 17, in wrapper
    return await func(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/carlos/.local/lib/python3.11/site-packages/twscrape/db.py", line 123, in execute
    async with DB(db_path) as db:
  File "/home/carlos/.local/lib/python3.11/site-packages/twscrape/db.py", line 104, in __aenter__
    await check_version()
  File "/home/carlos/.local/lib/python3.11/site-packages/twscrape/db.py", line 38, in check_version
    ver = await get_sqlite_version()
          ^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/carlos/.local/lib/python3.11/site-packages/twscrape/db.py", line 31, in get_sqlite_version
    async with aiosqlite.connect(":memory:") as db:
  File "/home/carlos/.local/lib/python3.11/site-packages/aiosqlite/core.py", line 153, in __aexit__
    await self.close()
  File "/home/carlos/.local/lib/python3.11/site-packages/aiosqlite/core.py", line 171, in close
    await self._execute(self._conn.close)
  File "/home/carlos/.local/lib/python3.11/site-packages/aiosqlite/core.py", line 125, in _execute
    future = asyncio.get_event_loop().create_future()
             ^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/lib/python3.11/asyncio/events.py", line 677, in get_event_loop
    raise RuntimeError('There is no current event loop in thread %r.'
RuntimeError: There is no current event loop in thread 'MainThread'.

Here is my code:

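from contextlib import aclosing  # Python 3.10+

try:
    # aclosing() closes the async generator as soon as the block exits
    async with aclosing(api.search(query, limit=limit_tweets)) as gen:
        async for tweet in gen:
            if tweet.id < 123:
                break
except Exception as err:
    print(f'ERROR downloading: {err}')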

Thanks for everything.

Can I get tweet replies?

Like snscrape's TwitterTweetScraper.