duckduckgo_search's Introduction

Python >= 3.8

Duckduckgo_search

Search for words, documents, images, videos, news, and maps, and translate text, using the DuckDuckGo.com search engine. The package can also download files and images to a local hard drive.

Install

pip install -U duckduckgo_search

There is also a beta release that uses the httpx library:

pip install -U duckduckgo_search==6.2.8b1

Note

To use the text function with backend='html' or backend='lite', install lxml (size ≈ 12 MB):
pip install -U duckduckgo_search[lxml]

CLI version

ddgs --help

CLI examples:

# AI chat
ddgs chat
# text search
ddgs text -k "standard oil"
# find and download pdf files via proxy
ddgs text -k "pushkin filetype:pdf" -r wt-wt -m 50 -d -p https://1.2.3.4:1234
# using Tor Browser as a proxy (`tb` is an alias for `socks5://127.0.0.1:9150`)
ddgs text -k "'to kill a mockingbird' filetype:doc" -m 50 -d -p tb
# find and save to csv
ddgs text -k "'neuroscience exploring the brain' filetype:pdf" -m 70 -o csv
# find and download images
ddgs images -k "beware of false prophets" -r wt-wt -type photo -m 500 -d
# get news for the last day and save to json
ddgs news -k "sanctions" -m 100 -t d -o json
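
The -o csv and -o json options above save CLI results to a file. When calling the library from Python instead, a similar export can be sketched with the standard csv module. Note that results_to_csv is a hypothetical helper, not part of duckduckgo_search, and the column layout is an assumption rather than the CLI's actual output format.

```python
import csv
import io


def results_to_csv(results, stream):
    """Write a list of result dicts to `stream` as CSV.

    The header row is taken from the keys of the first result
    (an assumption: all results share roughly the same keys).
    """
    if not results:
        return
    writer = csv.DictWriter(
        stream,
        fieldnames=list(results[0].keys()),
        extrasaction="ignore",  # skip keys that are missing from the header
    )
    writer.writeheader()
    writer.writerows(results)


# Sample data shaped like the text() results shown later in this document.
sample = [{"title": "The Sun", "href": "https://www.thesun.co.uk/", "body": "News"}]
buffer = io.StringIO()
results_to_csv(sample, buffer)
print(buffer.getvalue())
```

To write to disk, pass a file opened with open("results.csv", "w", newline="", encoding="utf-8") instead of the in-memory buffer.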

Duckduckgo search operators

Keywords example          Result
cats dogs                 Results about cats or dogs
"cats and dogs"           Results for exact term "cats and dogs". If no results are found, related results are shown.
cats -dogs                Fewer dogs in results
cats +dogs                More dogs in results
cats filetype:pdf         PDFs about cats. Supported file types: pdf, doc(x), xls(x), ppt(x), html
dogs site:example.com     Pages about dogs from example.com
cats -site:example.com    Pages about cats, excluding example.com
intitle:dogs              Page title includes the word "dogs"
inurl:cats                Page URL includes the word "cats"
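
The operators above can be combined in one query string. The sketch below assembles such a string from separate arguments. The function build_query and its parameter names are hypothetical and not part of duckduckgo_search; the resulting string is what you would pass as keywords.

```python
from __future__ import annotations


def build_query(
    terms: str,
    exact: str | None = None,
    exclude: tuple[str, ...] = (),
    site: str | None = None,
    exclude_site: str | None = None,
    filetype: str | None = None,
) -> str:
    """Compose a query string from the search operators listed above."""
    parts = [terms]
    if exact:
        parts.append(f'"{exact}"')  # exact phrase
    parts.extend(f"-{word}" for word in exclude)  # fewer results with these words
    if site:
        parts.append(f"site:{site}")  # restrict to one site
    if exclude_site:
        parts.append(f"-site:{exclude_site}")  # drop one site
    if filetype:
        parts.append(f"filetype:{filetype}")  # restrict to a file type
    return " ".join(parts)


query = build_query("cats", exclude=("dogs",), filetype="pdf")
print(query)  # cats -dogs filetype:pdf
```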

Regions

xa-ar for Arabia
xa-en for Arabia (en)
ar-es for Argentina
au-en for Australia
at-de for Austria
be-fr for Belgium (fr)
be-nl for Belgium (nl)
br-pt for Brazil
bg-bg for Bulgaria
ca-en for Canada
ca-fr for Canada (fr)
ct-ca for Catalan
cl-es for Chile
cn-zh for China
co-es for Colombia
hr-hr for Croatia
cz-cs for Czech Republic
dk-da for Denmark
ee-et for Estonia
fi-fi for Finland
fr-fr for France
de-de for Germany
gr-el for Greece
hk-tzh for Hong Kong
hu-hu for Hungary
in-en for India
id-id for Indonesia
id-en for Indonesia (en)
ie-en for Ireland
il-he for Israel
it-it for Italy
jp-jp for Japan
kr-kr for Korea
lv-lv for Latvia
lt-lt for Lithuania
xl-es for Latin America
my-ms for Malaysia
my-en for Malaysia (en)
mx-es for Mexico
nl-nl for Netherlands
nz-en for New Zealand
no-no for Norway
pe-es for Peru
ph-en for Philippines
ph-tl for Philippines (tl)
pl-pl for Poland
pt-pt for Portugal
ro-ro for Romania
ru-ru for Russia
sg-en for Singapore
sk-sk for Slovak Republic
sl-sl for Slovenia
za-en for South Africa
es-es for Spain
se-sv for Sweden
ch-de for Switzerland (de)
ch-fr for Switzerland (fr)
ch-it for Switzerland (it)
tw-tzh for Taiwan
th-th for Thailand
tr-tr for Turkey
ua-uk for Ukraine
uk-en for United Kingdom
us-en for United States
ue-es for United States (es)
ve-es for Venezuela
vn-vi for Vietnam
wt-wt for No region
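
Because an unknown region code is easy to mistype, it may help to normalize user input before passing it as region. The sketch below falls back to wt-wt (no region) for anything it does not recognize. KNOWN_REGIONS is only a small subset of the list above, and normalize_region is a hypothetical helper, not part of the library.

```python
from __future__ import annotations

# Subset of the region codes listed above; extend as needed.
KNOWN_REGIONS = {"us-en", "uk-en", "de-de", "fr-fr", "ru-ru", "jp-jp", "wt-wt"}


def normalize_region(region: str | None, default: str = "wt-wt") -> str:
    """Return a lower-case region code, or `default` if it is not recognized."""
    if not region:
        return default
    code = region.strip().lower().replace("_", "-")
    return code if code in KNOWN_REGIONS else default


print(normalize_region("US_EN"))  # us-en
print(normalize_region("zz-zz"))  # wt-wt
```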

DDGS and AsyncDDGS classes

The DDGS and AsyncDDGS classes retrieve search results from DuckDuckGo.com. AsyncDDGS provides the same operations as coroutines for use with Python's asyncio library. Both classes accept the following optional arguments at initialization:

class DDGS:
    """DuckDuckgo_search class to get search results from duckduckgo.com

    Args:
        headers (dict, optional): Dictionary of headers for the HTTP client. Defaults to None.
        proxy (str, optional): proxy for the HTTP client, supports http/https/socks5 protocols.
            example: "http://user:pass@example.com:3128". Defaults to None.
        timeout (int, optional): Timeout value for the HTTP client. Defaults to 10.
    """

Here is an example of initializing the DDGS class.

from duckduckgo_search import DDGS

results = DDGS().text("python programming", max_results=5)
print(results)

Here is an example of initializing the AsyncDDGS class:

import asyncio

from duckduckgo_search import AsyncDDGS

async def aget_results(word):
    results = await AsyncDDGS(proxy=None).atext(word, max_results=100)
    return results

async def main():
    words = ["sun", "earth", "moon"]
    tasks = [aget_results(w) for w in words]
    results = await asyncio.gather(*tasks)
    print(results)

if __name__ == "__main__":
    asyncio.run(main())
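
The example above starts all searches at once with asyncio.gather. When the word list is long, limiting how many searches run concurrently may lower the chance of being rate limited. The sketch below does this with asyncio.Semaphore. gather_limited is a hypothetical helper, and fake_search is a stand-in used so the example runs without network access; in real use you would pass a coroutine function such as aget_results.

```python
import asyncio


async def gather_limited(coro_func, items, limit=2):
    """Run coro_func(item) for every item, with at most `limit` running at a time."""
    semaphore = asyncio.Semaphore(limit)

    async def worker(item):
        async with semaphore:  # wait here while `limit` workers are already active
            return await coro_func(item)

    # gather() returns results in the same order as `items`.
    return await asyncio.gather(*(worker(item) for item in items))


async def fake_search(word):
    """Stand-in for a real search coroutine (no network access)."""
    await asyncio.sleep(0)
    return [{"title": word}]


if __name__ == "__main__":
    results = asyncio.run(gather_limited(fake_search, ["sun", "earth", "moon"]))
    print(results)
```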

Proxy

The package supports http, https, and socks proxies, for example http://user:pass@example.com:3128. Use a rotating proxy if possible; otherwise, pass a new proxy each time you initialize DDGS or AsyncDDGS.

1. The easiest way: launch the Tor Browser

ddgs = DDGS(proxy="tb", timeout=20)  # "tb" is an alias for "socks5://127.0.0.1:9150"
results = ddgs.text("something you need", max_results=50)

2. Use any proxy server (example with iproyal rotating residential proxies)

ddgs = DDGS(proxy="socks5://user:password@geo.iproyal.com:32325", timeout=20)
results = ddgs.text("something you need", max_results=50)
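
If no rotating proxy is available, the advice above is to use a new proxy with each initialization. One way to sketch this is to cycle through a list of proxies and build a fresh client for each one. rotating_clients is a hypothetical helper, and the demo uses a plain dict as the client so it runs without the library installed.

```python
from itertools import cycle


def rotating_clients(proxies, client_factory):
    """Yield a fresh client for each proxy in turn, cycling through the list forever."""
    for proxy in cycle(proxies):
        yield client_factory(proxy)


# Demo with a stand-in factory; the proxy URLs are placeholders.
proxies = ["socks5://127.0.0.1:9150", "http://127.0.0.1:8080"]
clients = rotating_clients(proxies, lambda proxy: {"proxy": proxy})
for _ in range(3):
    print(next(clients)["proxy"])
```

With the library, the factory would be something like lambda p: DDGS(proxy=p, timeout=20), and each search would call next(clients) to get a new instance.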

Exceptions

Exceptions:

  • DuckDuckGoSearchException: Base exception for duckduckgo_search errors.
  • RatelimitException: Inherits from DuckDuckGoSearchException, raised for exceeding API request rate limits.
  • TimeoutException: Inherits from DuckDuckGoSearchException, raised for API request timeouts.
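
Rate-limit and timeout errors are often transient, so a common pattern is to retry with an increasing delay. The sketch below is generic: call_with_retry is a hypothetical helper, and the exception type to retry on is passed in by the caller. The demo uses the built-in TimeoutError and a simulated failure so that it runs without the library or network access.

```python
import time


def call_with_retry(func, retry_on, attempts=3, base_delay=1.0, sleep=time.sleep):
    """Call func(); if it raises an exception listed in retry_on, wait and try again.

    The delay doubles after each failed attempt (base_delay, 2 * base_delay, ...).
    The exception from the last attempt is re-raised.
    """
    for attempt in range(attempts):
        try:
            return func()
        except retry_on:
            if attempt == attempts - 1:
                raise
            sleep(base_delay * 2 ** attempt)


# Demo: a function that fails twice, then succeeds.
state = {"calls": 0}


def flaky_search():
    state["calls"] += 1
    if state["calls"] < 3:
        raise TimeoutError("simulated timeout")
    return ["result"]


print(call_with_retry(flaky_search, TimeoutError, base_delay=0.01))
```

With the library, func could be lambda: DDGS().text("python programming", max_results=5) and retry_on a tuple of the RatelimitException and TimeoutException classes listed above, assuming they can be imported from the package's exceptions module.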

1. chat() - AI chat

def chat(self, keywords: str, model: str = "gpt-4o-mini", timeout: int = 30) -> str:
    """Initiates a chat session with DuckDuckGo AI.

    Args:
        keywords (str): The initial message or question to send to the AI.
        model (str): The model to use: "gpt-4o-mini", "claude-3-haiku", "llama-3.1-70b", "mixtral-8x7b".
            Defaults to "gpt-4o-mini".
        timeout (int): Timeout value for the HTTP client. Defaults to 30.

    Returns:
        str: The response from the AI.
    """

Example

results = DDGS().chat("summarize Daniel Defoe's The Consolidator", model='claude-3-haiku')

# async
results = await AsyncDDGS().achat('describe the characteristic habits and behaviors of humans as a species')

2. text() - text search by duckduckgo.com

def text(
    keywords: str,
    region: str = "wt-wt",
    safesearch: str = "moderate",
    timelimit: str | None = None,
    backend: str = "api",
    max_results: int | None = None,
) -> list[dict[str, str]]:
    """DuckDuckGo text search generator. Query params: https://duckduckgo.com/params.

    Args:
        keywords: keywords for query.
        region: wt-wt, us-en, uk-en, ru-ru, etc. Defaults to "wt-wt".
        safesearch: on, moderate, off. Defaults to "moderate".
        timelimit: d, w, m, y. Defaults to None.
        backend: api, html, lite. Defaults to api.
            api - collect data from https://duckduckgo.com,
            html - collect data from https://html.duckduckgo.com,
            lite - collect data from https://lite.duckduckgo.com.
        max_results: max number of results. If None, returns results only from the first response. Defaults to None.

    Returns:
        List of dictionaries with search results.
    """

Example

results = DDGS().text('live free or die', region='wt-wt', safesearch='off', timelimit='y', max_results=10)
# Searching for pdf files
results = DDGS().text('russia filetype:pdf', region='wt-wt', safesearch='off', timelimit='y', max_results=10)

# async
results = await AsyncDDGS().atext('sun', region='wt-wt', safesearch='off', timelimit='y', max_results=10)
print(results)
[
    {
        "title": "News, sport, celebrities and gossip | The Sun",
        "href": "https://www.thesun.co.uk/",
        "body": "Get the latest news, exclusives, sport, celebrities, showbiz, politics, business and lifestyle from The Sun",
    }, ...
]
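
Search results often contain several pages from the same site. Based on the result shape shown above (title, href, body), the following sketch keeps only the first result per domain. unique_by_domain is a hypothetical helper, and the sample data is made up for illustration.

```python
from urllib.parse import urlparse


def unique_by_domain(results):
    """Keep the first result for each domain, preserving the original order."""
    seen = set()
    unique = []
    for item in results:
        domain = urlparse(item["href"]).netloc
        if domain not in seen:
            seen.add(domain)
            unique.append(item)
    return unique


sample = [
    {"title": "The Sun", "href": "https://www.thesun.co.uk/", "body": "..."},
    {"title": "Sport", "href": "https://www.thesun.co.uk/sport/", "body": "..."},
    {"title": "Sun", "href": "https://en.wikipedia.org/wiki/Sun", "body": "..."},
]
print(len(unique_by_domain(sample)))  # 2
```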

3. answers() - instant answers by duckduckgo.com

def answers(keywords: str) -> list[dict[str, str]]:
    """DuckDuckGo instant answers. Query params: https://duckduckgo.com/params.
    
    Args:
        keywords: keywords for query.
    
    Returns:
        List of dictionaries with instant answers results.
    """

Example

results = DDGS().answers("sun")

# async
results = await AsyncDDGS().aanswers("sun")
print(results)
[
    {
        "icon": None,
        "text": "The Sun is the star at the center of the Solar System. It is a massive, nearly perfect sphere of hot plasma, heated to incandescence by nuclear fusion reactions in its core, radiating the energy from its surface mainly as visible light and infrared radiation with 10% at ultraviolet energies. It is by far the most important source of energy for life on Earth. The Sun has been an object of veneration in many cultures. It has been a central subject for astronomical research since antiquity. The Sun orbits the Galactic Center at a distance of 24,000 to 28,000 light-years. From Earth, it is 1 AU or about 8 light-minutes away. Its diameter is about 1,391,400 km, 109 times that of Earth. Its mass is about 330,000 times that of Earth, making up about 99.86% of the total mass of the Solar System. Roughly three-quarters of the Sun's mass consists of hydrogen; the rest is mostly helium, with much smaller quantities of heavier elements, including oxygen, carbon, neon, and iron.",
        "topic": None,
        "url": "https://en.wikipedia.org/wiki/Sun",
    }, ...
]

4. images() - image search by duckduckgo.com

def images(
    keywords: str,
    region: str = "wt-wt",
    safesearch: str = "moderate",
    timelimit: str | None = None,
    size: str | None = None,
    color: str | None = None,
    type_image: str | None = None,
    layout: str | None = None,
    license_image: str | None = None,
    max_results: int | None = None,
) -> list[dict[str, str]]:
    """DuckDuckGo images search. Query params: https://duckduckgo.com/params.
    
    Args:
        keywords: keywords for query.
        region: wt-wt, us-en, uk-en, ru-ru, etc. Defaults to "wt-wt".
        safesearch: on, moderate, off. Defaults to "moderate".
        timelimit: Day, Week, Month, Year. Defaults to None.
        size: Small, Medium, Large, Wallpaper. Defaults to None.
        color: color, Monochrome, Red, Orange, Yellow, Green, Blue,
            Purple, Pink, Brown, Black, Gray, Teal, White. Defaults to None.
        type_image: photo, clipart, gif, transparent, line.
            Defaults to None.
        layout: Square, Tall, Wide. Defaults to None.
        license_image: any (All Creative Commons), Public (PublicDomain),
            Share (Free to Share and Use), ShareCommercially (Free to Share and Use Commercially),
            Modify (Free to Modify, Share, and Use), ModifyCommercially (Free to Modify, Share, and
            Use Commercially). Defaults to None.
        max_results: max number of results. If None, returns results only from the first response. Defaults to None.
    
    Returns:
        List of dictionaries with images search results.
    """

Example

results = DDGS().images(
    keywords="butterfly",
    region="wt-wt",
    safesearch="off",
    size=None,
    color="Monochrome",
    type_image=None,
    layout=None,
    license_image=None,
    max_results=100,
)

# async
results = await AsyncDDGS().aimages('sun', region='wt-wt', safesearch='off', max_results=20)
print(results)
[
    {
        "title": "File:The Sun by the Atmospheric Imaging Assembly of NASA's Solar ...",
        "image": "https://upload.wikimedia.org/wikipedia/commons/b/b4/The_Sun_by_the_Atmospheric_Imaging_Assembly_of_NASA's_Solar_Dynamics_Observatory_-_20100819.jpg",
        "thumbnail": "https://tse4.mm.bing.net/th?id=OIP.lNgpqGl16U0ft3rS8TdFcgEsEe&pid=Api",
        "url": "https://en.wikipedia.org/wiki/File:The_Sun_by_the_Atmospheric_Imaging_Assembly_of_NASA's_Solar_Dynamics_Observatory_-_20100819.jpg",
        "height": 3860,
        "width": 4044,
        "source": "Bing",
    }, ...
]

5. videos() - video search by duckduckgo.com

def videos(
    keywords: str,
    region: str = "wt-wt",
    safesearch: str = "moderate",
    timelimit: str | None = None,
    resolution: str | None = None,
    duration: str | None = None,
    license_videos: str | None = None,
    max_results: int | None = None,
) -> list[dict[str, str]]:
    """DuckDuckGo videos search. Query params: https://duckduckgo.com/params.
    
    Args:
        keywords: keywords for query.
        region: wt-wt, us-en, uk-en, ru-ru, etc. Defaults to "wt-wt".
        safesearch: on, moderate, off. Defaults to "moderate".
        timelimit: d, w, m. Defaults to None.
        resolution: high, standart. Defaults to None.
        duration: short, medium, long. Defaults to None.
        license_videos: creativeCommon, youtube. Defaults to None.
        max_results: max number of results. If None, returns results only from the first response. Defaults to None.
    
    Returns:
        List of dictionaries with videos search results.
    """

Example

results = DDGS().videos(
    keywords="cars",
    region="wt-wt",
    safesearch="off",
    timelimit="w",
    resolution="high",
    duration="medium",
    max_results=100,
)

# async
results = await AsyncDDGS().avideos('sun', region='wt-wt', safesearch='off', timelimit='w', max_results=10)
print(results)
[
    {
        "content": "https://www.youtube.com/watch?v=6901-C73P3g",
        "description": "Watch the Best Scenes of popular Tamil Serial #Meena that airs on Sun TV. Watch all Sun TV serials immediately after the TV telecast on Sun NXT app. *Free for Indian Users only Download here: Android - http://bit.ly/SunNxtAdroid iOS: India - http://bit.ly/sunNXT Watch on the web - https://www.sunnxt.com/ Two close friends, Chidambaram ...",
        "duration": "8:22",
        "embed_html": '<iframe width="1280" height="720" src="https://www.youtube.com/embed/6901-C73P3g?autoplay=1" frameborder="0" allowfullscreen></iframe>',
        "embed_url": "https://www.youtube.com/embed/6901-C73P3g?autoplay=1",
        "image_token": "6c070b5f0e24e5972e360d02ddeb69856202f97718ea6c5d5710e4e472310fa3",
        "images": {
            "large": "https://tse4.mm.bing.net/th?id=OVF.JWBFKm1u%2fHd%2bz2e1GitsQw&pid=Api",
            "medium": "https://tse4.mm.bing.net/th?id=OVF.JWBFKm1u%2fHd%2bz2e1GitsQw&pid=Api",
            "motion": "",
            "small": "https://tse4.mm.bing.net/th?id=OVF.JWBFKm1u%2fHd%2bz2e1GitsQw&pid=Api",
        },
        "provider": "Bing",
        "published": "2024-07-03T05:30:03.0000000",
        "publisher": "YouTube",
        "statistics": {"viewCount": 29059},
        "title": "Meena - Best Scenes | 02 July 2024 | Tamil Serial | Sun TV",
        "uploader": "Sun TV",
    }, ...
]
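
In the sample output above, duration is a string such as "8:22". To filter or sort videos by length it can be converted to seconds. The sketch assumes the format is M:SS or H:MM:SS, which is inferred from the sample rather than documented; duration_to_seconds is a hypothetical helper.

```python
def duration_to_seconds(duration: str) -> int:
    """Convert "M:SS" or "H:MM:SS" to a number of seconds."""
    seconds = 0
    for part in duration.split(":"):
        # Each colon-separated field is worth 60 times the next one.
        seconds = seconds * 60 + int(part)
    return seconds


print(duration_to_seconds("8:22"))  # 502
```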

6. news() - news search by duckduckgo.com

def news(
    keywords: str,
    region: str = "wt-wt",
    safesearch: str = "moderate",
    timelimit: str | None = None,
    max_results: int | None = None,
) -> list[dict[str, str]]:
    """DuckDuckGo news search. Query params: https://duckduckgo.com/params.
    
    Args:
        keywords: keywords for query.
        region: wt-wt, us-en, uk-en, ru-ru, etc. Defaults to "wt-wt".
        safesearch: on, moderate, off. Defaults to "moderate".
        timelimit: d, w, m. Defaults to None.
        max_results: max number of results. If None, returns results only from the first response. Defaults to None.
    
    Returns:
        List of dictionaries with news search results.
    """

Example

results = DDGS().news(keywords="sun", region="wt-wt", safesearch="off", timelimit="m", max_results=20)

# async
results = await AsyncDDGS().anews('sun', region='wt-wt', safesearch='off', timelimit='d', max_results=10)
print(results)
[
    {
        "date": "2024-07-03T16:25:22+00:00",
        "title": "Murdoch's Sun Endorses Starmer's Labour Day Before UK Vote",
        "body": "Rupert Murdoch's Sun newspaper endorsed Keir Starmer and his opposition Labour Party to win the UK general election, a dramatic move in the British media landscape that illustrates the country's shifting political sands.",
        "url": "https://www.msn.com/en-us/money/other/murdoch-s-sun-endorses-starmer-s-labour-day-before-uk-vote/ar-BB1plQwl",
        "image": "https://img-s-msn-com.akamaized.net/tenant/amp/entityid/BB1plZil.img?w=2000&h=1333&m=4&q=79",
        "source": "Bloomberg on MSN.com",
    }, ...
]
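
The date field in the sample output above looks like an ISO 8601 timestamp, so results can be ordered with datetime.fromisoformat. This is a sketch under that assumption; newest_first is a hypothetical helper and the sample data is made up.

```python
from datetime import datetime


def newest_first(items):
    """Sort news results by their ISO 8601 `date` field, most recent first."""
    return sorted(
        items,
        key=lambda item: datetime.fromisoformat(item["date"]),
        reverse=True,
    )


sample = [
    {"date": "2024-07-02T09:00:00+00:00", "title": "older"},
    {"date": "2024-07-03T16:25:22+00:00", "title": "newer"},
]
print([item["title"] for item in newest_first(sample)])  # ['newer', 'older']
```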

7. maps() - map search by duckduckgo.com

def maps(
    keywords,
    place: str | None = None,
    street: str | None = None,
    city: str | None = None,
    county: str | None = None,
    state: str | None = None,
    country: str | None = None,
    postalcode: str | None = None,
    latitude: str | None = None,
    longitude: str | None = None,
    radius: int = 0,
    max_results: int | None = None,
) -> list[dict[str, str]]:
    """DuckDuckGo maps search. Query params: https://duckduckgo.com/params.
    
    Args:
        keywords: keywords for query.
        place: if set, the other parameters are not used. Defaults to None.
        street: house number/street. Defaults to None.
        city: city of search. Defaults to None.
        county: county of search. Defaults to None.
        state: state of search. Defaults to None.
        country: country of search. Defaults to None.
        postalcode: postalcode of search. Defaults to None.
        latitude: geographic coordinate (north-south position). Defaults to None.
        longitude: geographic coordinate (east-west position); if latitude and
            longitude are set, the other parameters are not used. Defaults to None.
        radius: expand the search square by the distance in kilometers. Defaults to 0.
        max_results: max number of results. If None, returns results only from the first response. Defaults to None.
    
    Returns:
        List of dictionaries with maps search results.
    """

Example

results = DDGS().maps("school", place="Uganda", max_results=50)

# async
results = await AsyncDDGS().amaps('shop', place="Baltimore", max_results=10)
print(results)
[
    {
        "title": "The Bun Shop",
        "address": "239 W Read St, Baltimore, MD 21201-4845",
        "country_code": None,
        "url": "https://www.facebook.com/TheBunShop/",
        "phone": "+14109892033",
        "latitude": 39.3006042,
        "longitude": -76.6195788,
        "source": "https://www.tripadvisor.com/Restaurant_Review-g60811-d4819859-Reviews-The_Bun_Shop-Baltimore_Maryland.html?m=63959",
        "image": "",
        "desc": "",
        "hours": {
            "Fri": "07:00:00–03:00:00",
            "Mon": "07:00:00–03:00:00",
            "Sat": "07:00:00–03:00:00",
            "Sun": "07:00:00–03:00:00",
            "Thu": "07:00:00–03:00:00",
            "Tue": "07:00:00–03:00:00",
            "Wed": "07:00:00–03:00:00",
            "closes_soon": 0,
            "is_open": 1,
            "opens_soon": 0,
            "state_switch_time": "03:00",
        },
        "category": "Cafe",
        "facebook": "",
        "instagram": "",
        "twitter": "",
    }, ...
]
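
Each result above carries latitude and longitude, so results can be ranked or filtered by distance from a reference point after the search. The sketch below uses the haversine formula with a mean Earth radius of 6371 km. haversine_km is a hypothetical helper, and the reference point in the demo is an arbitrary illustration.

```python
import math


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in kilometers between two (latitude, longitude) points."""
    earth_radius_km = 6371.0
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    d_phi = math.radians(lat2 - lat1)
    d_lambda = math.radians(lon2 - lon1)
    a = (
        math.sin(d_phi / 2) ** 2
        + math.cos(phi1) * math.cos(phi2) * math.sin(d_lambda / 2) ** 2
    )
    return 2 * earth_radius_km * math.asin(math.sqrt(a))


# Distance from an arbitrary reference point to the sample result above.
place = {"latitude": 39.3006042, "longitude": -76.6195788}
distance = haversine_km(39.29, -76.61, place["latitude"], place["longitude"])
print(f"{distance:.2f} km")
```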

8. translate() - translation by duckduckgo.com

def translate(
    self,
    keywords: list[str] | str,
    from_: str | None = None,
    to: str = "en",
) -> list[dict[str, str]]:
    """DuckDuckGo translate.
    
    Args:
        keywords: string or list of strings to translate.
        from_: source language (detected automatically when None). Defaults to None.
        to: language to translate into. Defaults to "en".
    
    Returns:
        List of dictionaries with translated keywords.
    """

Example

keywords = 'school'
# also valid
keywords = ['school', 'cat']
results = DDGS().translate(keywords, to="de")

# async
results = await AsyncDDGS().atranslate('sun', to="de")
print(results)
[{"detected_language": "en", "translated": "Sonne", "original": "sun"}]

9. suggestions() - suggestions by duckduckgo.com

def suggestions(
    keywords,
    region: str = "wt-wt",
) -> list[dict[str, str]]:
    """DuckDuckGo suggestions. Query params: https://duckduckgo.com/params.
    
    Args:
        keywords: keywords for query.
        region: wt-wt, us-en, uk-en, ru-ru, etc. Defaults to "wt-wt".
    
    Returns:
        List of dictionaries with suggestions results.
    """

Example

results = DDGS().suggestions("fly")

# async
results = await AsyncDDGS().asuggestions('sun')
print(results)
[
    {"phrase": "sunshine live"},
    {"phrase": "sunexpress"},
    {"phrase": "sunday natural"},
    {"phrase": "sunrise village spiel"},
    {"phrase": "sunny portal"},
    {"phrase": "sundair"},
    {"phrase": "sunny cars"},
    {"phrase": "sunexpress online check-in"},
]

Disclaimer

This library is not affiliated with DuckDuckGo and is for educational purposes only. It is not intended for commercial use or any purpose that violates DuckDuckGo's Terms of Service. By using this library, you acknowledge that you will not use it in a way that infringes on DuckDuckGo's terms. The official DuckDuckGo website can be found at https://duckduckgo.com.

duckduckgo_search's People

Contributors

arabianq, c3tas, deedy5, desaiankitb, mrgick, shashankdeshpande

duckduckgo_search's Issues

RateLimitException on text() search

When using text search I am always seeing an error. This has been working without issue until within the last hour or so.

Detail of error -

duckduckgo_search.exceptions.DuckDuckGoSearchException: _get_url() https://links.duckduckgo.com/d.js RateLimitException: _get_url() https://links.duckduckgo.com/d.js RateLimitError: resp.status_code==202

example to reproduce error -

from duckduckgo_search import DDGS

with DDGS() as ddgs:
    for r in ddgs.text('live free or die', region='wt-wt', safesearch='off', timelimit='y', max_results=10):
        print(r)

error trace -

Traceback (most recent call last):
  File "/Users/keithnisbet/Bubblr/ew/ew-nlp/venv/lib/python3.9/site-packages/duckduckgo_search/duckduckgo_search.py", line 48, in _get_url
    raise RateLimitException(f"_get_url() {url}")
duckduckgo_search.exceptions.RateLimitException: _get_url() https://links.duckduckgo.com/d.js RateLimitError: resp.status_code==202

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "<input>", line 4, in <module>
  File "/Users/keithnisbet/Bubblr/ew/ew-nlp/venv/lib/python3.9/site-packages/duckduckgo_search/duckduckgo_search.py", line 103, in text
  File "/Users/keithnisbet/Bubblr/ew/ew-nlp/venv/lib/python3.9/site-packages/duckduckgo_search/duckduckgo_search.py", line 154, in _text_api
    cache = set()
  File "/Users/keithnisbet/Bubblr/ew/ew-nlp/venv/lib/python3.9/site-packages/duckduckgo_search/duckduckgo_search.py", line 57, in _get_url
    raise HTTPException(f"_get_url() {url} HttpError: {ex}")
duckduckgo_search.exceptions.DuckDuckGoSearchException: _get_url() https://links.duckduckgo.com/d.js RateLimitException: _get_url() https://links.duckduckgo.com/d.js RateLimitError: resp.status_code==202

This happens both locally (macOS) and in a Docker container, using duckduckgo_search v3.9.9.

Exclude in keywords

Adding an exclusion to the keywords causes zero results to be returned.

Example:
ddg("stuff -site:pinterest.com", region='us-en', safesearch='Off', time=None, max_results=30)

Referer header required

@deedy5 FYI DDG now requires a "www.duckduckgo.com" referer header. Without it, it will return an error on the second page of results (at least for image search, which is all I tested).

issue in ddg

Line 66 in ddg.py

if "506-00.js" in resp.url:

The local variable 'resp' is referenced before assignment. This has prevented me from using the function since the new update on May 12.

Search hangs and enters an endless wait while searching for words

I am using the code below, searching for words read from a txt file. Sometimes a word search hits a bug and hangs for hours: it goes into an endless sleep cycle without returning any results.

from duckduckgo_search import ddg

keywords = 'Bella Ciao'
results = ddg(keywords, region='wt-wt', safesearch='Moderate', time='y', max_results=28)
print(results)

URL

Before you open an issue:

  1. Make sure you have the latest version installed. Check: ddgs version. Update: pip install -U duckduckgo_search
  2. Try reinstalling the library: pip install -I duckduckgo_search
  3. Make sure the site https://duckduckgo.com is accessible in your browser
  4. Try using a proxy. The site may block ip for a while.

Describe the bug

Traceback (most recent call last):
  File "E:\call_search_engine.py", line 235, in <module>
    evidences = call_search_engine(query, engine=engine)
  File "E:\call_search_engine.py", line 37, in call_search_engine
    for result in results:
  File "D:\Anaconda\envs\langchain\lib\site-packages\duckduckgo_search\duckduckgo_search.py", line 106, in text
    for i, result in enumerate(results, start=1):
  File "D:\Anaconda\envs\langchain\lib\site-packages\duckduckgo_search\duckduckgo_search.py", line 134, in _text_api
    vqd = self._get_vqd(keywords)
  File "D:\Anaconda\envs\langchain\lib\site-packages\duckduckgo_search\duckduckgo_search.py", line 64, in _get_vqd
    resp = self._get_url("POST", "https://duckduckgo.com", data={"q": keywords})

Hello, the error suggests that the URL cannot be accessed, but my browser can open this URL. I hope you can help. Thanks!

IndexError: list index out of range

While running this:

import os
iskaggle = os.environ.get('KAGGLE_KERNEL_RUN_TYPE', '')
from duckduckgo_search import ddg_images
from fastcore.all import *

def search_images(term, max_images=30):
    print(f"Searching for '{term}'")
    return L(ddg_images(term, max_results=max_images)).itemgot('image')
urls = search_images('birds', max_images=5)
urls[0]

Searching for 'birds'

I'm getting this:

IndexError                                Traceback (most recent call last)
Cell In[2], line 2
      1 urls = search_images('birds', max_images=5)
----> 2 urls[0]

File F:\Program Files (x86)\Python_3_10_6\lib\site-packages\fastcore\foundation.py:112, in L.__getitem__(self, idx)
--> 112 def __getitem__(self, idx): return self._get(idx) if is_indexer(idx) else L(self._get(idx), use_list=None)

File F:\Program Files (x86)\Python_3_10_6\lib\site-packages\fastcore\foundation.py:116, in L._get(self, i)
    115 def _get(self, i):
--> 116     if is_indexer(i) or isinstance(i,slice): return getattr(self.items,'iloc',self.items)[i]
    117     i = mask2idxs(i)
    118     return (self.items.iloc[list(i)] if hasattr(self.items,'iloc')
    119             else self.items.__array__()[(i,)] if hasattr(self.items,'__array__')
    120             else [self.items[i_] for i_ in i])

IndexError: list index out of range

Yesterday everything worked fine, but today I encounter this issue. I've tried different versions of duckduckgo_search; it might work for a bit, but then it breaks again.

Runtime error: VQDExtractionException: Could not extract vqd.

Describe the bug
Just installed latest version of duckduckgo_search and started testing in a jupyter notebook.
Tried the little code extracted from the documentation:

from duckduckgo_search import DDGS

with DDGS() as ddgs:
    results = [r for r in ddgs.text("python programming", max_results=5)]
    print(results)

Debug log

DEBUG:httpx:load_ssl_context verify=True cert=None trust_env=True http2=False
DEBUG:httpx:load_verify_locations cafile='/Users/rober/GitRepo/solvegraph_corporate/.venv/lib/python3.11/site-packages/certifi/cacert.pem'
DEBUG:httpcore.connection:connect_tcp.started host='duckduckgo.com' port=443 local_address=None timeout=10 socket_options=None
DEBUG:httpcore.connection:connect_tcp.complete return_value=<httpcore._backends.sync.SyncStream object at 0x124a14750>
DEBUG:httpcore.connection:start_tls.started ssl_context=<ssl.SSLContext object at 0x124a484d0> server_hostname='duckduckgo.com' timeout=10
DEBUG:httpcore.connection:start_tls.complete return_value=<httpcore._backends.sync.SyncStream object at 0x12485f050>
DEBUG:httpcore.http2:send_connection_init.started request=<Request [b'POST']>
DEBUG:httpcore.http2:send_connection_init.complete
DEBUG:httpcore.http2:send_request_headers.started request=<Request [b'POST']> stream_id=1
DEBUG:hpack.hpack:Adding (b':method', b'POST') to the header table, sensitive:False, huffman:True
DEBUG:hpack.hpack:Encoding 3 with 7 bits
DEBUG:hpack.hpack:Adding (b':authority', b'duckduckgo.com') to the header table, sensitive:False, huffman:True
DEBUG:hpack.hpack:Encoding 1 with 6 bits
DEBUG:hpack.hpack:Encoding 11 with 7 bits
DEBUG:hpack.hpack:Adding (b':scheme', b'https') to the header table, sensitive:False, huffman:True
DEBUG:hpack.hpack:Encoding 7 with 7 bits
DEBUG:hpack.hpack:Adding (b':path', b'/') to the header table, sensitive:False, huffman:True
DEBUG:hpack.hpack:Encoding 4 with 7 bits
DEBUG:hpack.hpack:Adding (b'accept', b'text/html,application/xhtml+xml,application/xml;q=0.9,image/avif,image/webp,image/apng,*/*;q=0.8,application/signed-exchange;v=b3;q=0.7') to the header table, sensitive:False, huffman:True
DEBUG:hpack.hpack:Encoding 19 with 6 bits
DEBUG:hpack.hpack:Encoding 101 with 7 bits
DEBUG:hpack.hpack:Adding (b'accept-language', b'en-US,en;q=0.9') to the header table, sensitive:False, huffman:True
DEBUG:hpack.hpack:Encoding 17 with 6 bits
DEBUG:hpack.hpack:Encoding 11 with 7 bits
DEBUG:hpack.hpack:Adding (b'accept-encoding', b'gzip, deflate, br') to the header table, sensitive:False, huffman:True
DEBUG:hpack.hpack:Encoding 16 with 6 bits
DEBUG:hpack.hpack:Encoding 13 with 7 bits
DEBUG:hpack.hpack:Adding (b'referer', b'https://duckduckgo.com/') to the header table, sensitive:False, huffman:True
DEBUG:hpack.hpack:Encoding 51 with 6 bits
DEBUG:hpack.hpack:Encoding 17 with 7 bits
DEBUG:hpack.hpack:Adding (b'user-agent', b'Mozilla/5.0 (Windows NT 10.0) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/119.0.0.0 Safari/537.36') to the header table, sensitive:False, huffman:True
DEBUG:hpack.hpack:Encoding 58 with 6 bits
DEBUG:hpack.hpack:Encoding 76 with 7 bits
DEBUG:hpack.hpack:Adding (b'content-length', b'20') to the header table, sensitive:False, huffman:True
DEBUG:hpack.hpack:Encoding 28 with 6 bits
DEBUG:hpack.hpack:Encoding 2 with 7 bits
DEBUG:hpack.hpack:Adding (b'content-type', b'application/x-www-form-urlencoded') to the header table, sensitive:False, huffman:True
DEBUG:hpack.hpack:Encoding 31 with 6 bits
DEBUG:hpack.hpack:Encoding 24 with 7 bits
DEBUG:hpack.hpack:Encoded header block to b'\x83A\x8b\x92\xd2u\x92\xd2u\x98\xeb\x90\xf4\xff\x87\x84S\xe5I|\xa5\x89\xd3M\x1fC\xae\xba\x0cA\xa4\xc7\xa9\x8f3\xa6\x9a?\xdf\x9ah\xfa\x1du\xd0b\r&=Ly\xa6\x8f\xbe\xd0\x01w\xfe\x8dH\xe6+\x03\xeei~\x8dH\xe6+\x1e\x0b\x1d\x7fF\xa4s\x15\x81\xd7T\xdf_,|\xfd\xf6\x80\x0b\xbd\xf4:\xeb\xa0\xc4\x1aLz\x98A\xa6\xa8\xb2,_$\x9cuL_\xbe\xf0F\xcf\xdfh\x00\xbb\xbfQ\x8b-Kp\xdd\xf4Z\xbe\xfb@\x05\xdfP\x8d\x9b\xd9\xab\xfaRB\xcb@\xd2_\xa5#\xb3s\x91\x9d)\xad\x17\x18bKI\xd6KI\xd6c\xaeC\xd2\xc7z\xcc\xd0\x7ff\xa2\x81\xb0\xda\xe0S\xfa\xe4j\xa4?\x84)\xa7z\x81\x02\xe0\xfe\xd4\x86\xba\xe8/"\xc7\x98\xc9a\xb6]]\x97\x14\xfe\xb3c\xdf\xa3?\xd2\x94\x1b\xa9T\xc4Ru?\xf6\xa5\xe9\xec=%`!}p.\x05\xc0\xa6\xe1\xca;\x0c\xc3l\xba\xbb.\x7f\\\x82\x10?_\x98\x1du\xd0b\r&=Ly[\xc7\x8f\x0bJ{)Z\xdb(-D<\x85\x93'
DEBUG:httpcore.http2:send_request_headers.complete
DEBUG:httpcore.http2:send_request_body.started request=<Request [b'POST']> stream_id=1
DEBUG:httpcore.http2:send_request_body.complete
DEBUG:httpcore.http2:receive_response_headers.started request=<Request [b'POST']> stream_id=1
DEBUG:httpcore.http2:receive_remote_settings.started
DEBUG:httpcore.http2:receive_remote_settings.complete return_value=<RemoteSettingsChanged changed_settings:{ChangedSetting(setting=3, original_value=None, new_value=64), ChangedSetting(setting=4, original_value=65535, new_value=65536), ChangedSetting(setting=5, original_value=16384, new_value=16777215)}>
DEBUG:hpack.hpack:Decoding b' \x88v\x84\xaacU\xe7a\x96\xdfi~\x94\x03j_)\x14\x10\x04\xca\x82\x15\xc6Z\xb8\xdb\xcab\xd1\xbf_\x92I|\xa5\x89\xd3M\x1fj\x12q\xd8\x82\xa6\x0e\x1b\xf0\xac\xf7{\x8b\x84\x84-i[\x05D<\x86\xaao\x00\x8aAl\xee[\x16I\xa95S\x7f\x99I\xd2:>\xe4\xb6\xc8\x1b\x0f\xdc\x85A \xfen\x8c\x9dKT\x8ao:GG\xf3\x00\x8f\xf2\xb4\x96\x93\xac\x96\x93\xac\xc7Z\xc2\xa2\xda\x12\x8f\x011\x00\x91Bl1\x12\xb2l\x1dH\xac\xf6%d\x14\x96\xd8d\xfa\x8c\xa4~V\x1c\xc5\x81\x90\xb6\xcb\x80\x00?\x00\x8d\xac\xb6Rd \xc7\xa9\x0bVz\x0cO_\x8e5I-\x85BV!\xe7=\x89\x83\xfa\xfe\xff\x00\x90!\xeaIjJ\xc8)-\xb0\xc9\xf4\xb5g\xa0\xc4\xf5\xff\x9e\t\x90\xb2\x8e\xda\x12\xb2,"\x9f\xea\xa3\xd4_\xf4\xa7\xda\x84=U\x14\x89Y\x16\x11E\'JkE\xc6\x18\x92\xd2u\x92\xd2u\x98\xeb\x90\xf4\xa9:SZ.0\xc7\xca\xf2ZN\xb2ZN\xb3\x1dr\x1e\x95\'JkE\xc6\x18\x92\xd2u\x92\xd2u\x98\xf3L\xd0\xbc\xf49\x1d\x17\x96Q\xd0h?\x83\x8e\xc9c\x98\x94\xf7\x94\xd4\x8eT\xa5\xc4\xf8\x1c\xc8\xf1\xec\x9e\xc7"\xe7\xa8\xc7\xa9\x85\'JkE\xc6\x18Ev\x14rWa\xbb\x8c\x9e\x97!\xe9S\xedJGQ\xa5*\x12\xb2,"\x8aN\x94\xd6\x8b\x8c1%\xa4\xeb%\xa4\xeb1\xd7!\xe9Rt\xa6\xb4\\a\x8f\x95\xe4\xb4\x9dd\xb4\x9df:\xe4=*N\x94\xd6\x8b\x8c1%\xa4\xeb%\xa4\xeb1\xe6\x99\xa1y\xe8r:/,\xa3\xa0\xd0\x7f\x07\x1d\x92\xc71)\xef)\xa9\x1c\xa9K\x89\xf09\x91\xe3\xd9=\x8eE\xcfQ\x8fS\nN\x94\xd6\x8b\x8c0\x8a\xec(\xe4\xae\xc3w\x19=.C\xd2\xa7\xda\x94\x96C\rdXE\x14\x9d)\xad\x17\x18bKI\xd6KI\xd6c\xaeC\xd2\xa4\xe9Mh\xb8\xc3\x1f+\xc9i:\xc9i:\xccu\xc8zT\x9d)\xad\x17\x18bKI\xd6KI\xd6c\xcd3B\xf3\xd0\xe4t^YGA\xa0\xfe\x0e;%\x8ebS\xdeSR9R\x97\x13\xe0s#\xc7\xb2{\x1c\x8b\x9e\xa3\x1e\xa6\x14\x9d)\xad\x17\x18a\x15\xd8Q\xc9]\x86\xee2z\\\x87\xa5O\xb5\x10K\rZVE\x84R:\x0f\x1d\xc5\x14\x9d)\xad\x17\x18bKI\xd6KI\xd6c\xaeC\xd2\xa4\xe9Mh\xb8\xc3\x1f+\xc9i:\xc9i:\xccu\xc8zT\x9d)\xad\x17\x18bKI\xd6KI\xd6c\xcd3B\xf3\xd0\xe4t^YGA\xa0\xfe\x0e;%\x8ebS\xdeSR9R\x97\x13\xe0s#\xc7\xb2{\x1c\x8b\x9e\xa3\x1e\xa6\x14\x9d)\xad\x17\x18a\x15\xd8Q\xc9]\x86\xee2z\\\x87\xa5O\xf5mH\x1c\xa5X\xd5Pj\x8b\xfe\x94\xffV\xd4\x81\xcaU\x8b\xdct\x7f\xa5>\xd4
\x94\xf5%dXE$\x1aGqE\'JkE\xc6\x18\x92\xd2u\x92\xd2u\x98\xeb\x90\xf4\xa9:SZ.0\xc7\xca\xf2ZN\xb2ZN\xb3\x1dr\x1e\x95\'JkE\xc6\x18\x92\xd2u\x92\xd2u\x98\xf3L\xd0\xbc\xf49\x1d\x17\x96Q\xd0h?\x83\x8e\xc9c\x98\x94\xf7\x94\xd4\x8eT\xa5\xc4\xf8\x1c\xc8\xf1\xec\x9e\xc7"\xe7\xa8\xc7\xa9\x85\'JkE\xc6\x18Ev\x14rWa\xbb\x8c\x9e\x97!\xe9S\xedCS2\xc8\xb0\x8aH4\x8e\xe2\x8aN\x94\xd6\x8b\x8c1%\xa4\xeb%\xa4\xeb1\xd7!\xe9Rt\xa6\xb4\\a\x8f\x95\xe4\xb4\x9dd\xb4\x9df:\xe4=*N\x94\xd6\x8b\x8c1%\xa4\xeb%\xa4\xeb1\xe6\x99\xa1y\xe8r:/,\xa3\xa0\xd0\x7f\x07\x1d\x92\xc71)\xef)\xa9\x1c\xa9K\x89\xf09\x91\xe3\xd9=\x8eE\xcfQ\x8fS\nN\x94\xd6\x8b\x8c0\x8a\xec(\xe4\xae\xc3w\x19=.C\xd2\xa7\xda\x88O\xaa\n\xb2,"\x8aN\x94\xd6\x8b\x8c1%\xa4\xeb%\xa4\xeb1\xd7!\xe9Rt\xa6\xb4\\a\x8f\x95\xe4\xb4\x9dd\xb4\x9df:\xe4=*N\x94\xd6\x8b\x8c1%\xa4\xeb%\xa4\xeb1\xe6\x99\xa1y\xe8r:/,\xa3\xa0\xd0\x7f\x07\x1d\x92\xc71)\xef)\xa9\x1c\xa9K\x89\xf09\x91\xe3\xd9=\x8eE\xcfQ\x8fS\nN\x94\xd6\x8b\x8c0\x8a\xec(\xe4\xae\xc3w\x19=.C\xd2\xa7\xfa\xb6\xa4\x0eR\xacj\xa85E\xffJ}\xa8x\xfa\x14\x89Y\x16\x11O\xf5Q\xea/\xfaS\xedO\x07\xb3\xa9lY\x16\x11H\xe8<w\x14\xfbP\x93\x9a\x89\x16E\x84R:\x0f\x1d\xc5\x14\x9d)\xad\x17\x18bKI\xd6KI\xd6c\xaeC\xd2\xa4\xe9Mh\xb8\xc3\x1f+\xc9i:\xc9i:\xccu\xc8zT\x9d)\xad\x17\x18bKI\xd6KI\xd6c\xcd3B\xf3\xd0\xe4t^YGA\xa0\xfe\x0e;%\x8ebS\xdeSR9R\x97\x13\xe0s#\xc7\xb2{\x1c\x8b\x9e\xa3\x1e\xa6\x14\x9d)\xad\x17\x18a\x15\xd8Q\xc9]\x86\xee2z\\\x87\xa5O\xb5%\xb0t\x95dXE#\xa0\xf1\xdcQI\xd2\x9a\xd1q\x86$\xb4\x9dd\xb4\x9df:\xe4=*N\x94\xd6\x8b\x8c1\xf2\xbc\x96\x93\xac\x96\x93\xac\xc7\\\x87\xa5I\xd2\x9a\xd1q\x86$\xb4\x9dd\xb4\x9df<\xd34/=\x0eGE\xe5\x94t\x1a\x0f\xe0\xe3\xb2X\xe6%=\xe55#\x95)q>\x072<{\'\xb1\xc8\xb9\xea1\xeaaI\xd2\x9a\xd1q\x86\x11]\x85\x1c\x95\xd8n\xe3\'\xa5\xc8zT\xfbRS\xd9J\xc3"Lz\x94Rt\xa6\xb4\\a\x89-\'Y-\'Y\x8e\xb9\x0fJ\x93\xa55\xa2\xe3\x0c|\xaf%\xa4\xeb%\xa4\xeb1\xd7!\xe9Rt\xa6\xb4\\a\x89-\'Y-\'Y\x8f4\xcd\x0b\xcfC\x91\xd1ye\x1d\x06\x83\xf88\xec\x969\x89OyMH\xe5J\\O\x81\xcc\x8f\x1e\xc9\xecr.z\x8cz\x98Rt\xa6\xb4\\a\x84
WaG%v\x1b\xb8\xc9\xe9r\x1e\x95>\xd4\x96\xc1\xd2U\x87Q\n\x84\x9e\xc4)\xfe\x90Z%\xffJ}\xa9\x18\xd0U\xad\xb0\xca\x7f\xa4\x16\x89\x7f\xd2\x9fjGA\xc9\xd5a\xd1B\xd4\x9b\xc9dX\x87\xa9%\xa9*}\xff\x00\x8b\xf2\xb4\xb6\x0e\x92\xacz\xd2c\xd4\x8f\x89\xdd\x0e\x8c\x1a\xb6\xe4\xc5\x93O\x00\x8c\xf2\xb7\x94!j\xec:JD\x98\xf5\x7f\x89\x0f\xdd\'\x90\xb0GA\xc9\xd7\x00\x90\xf2\xb1\x0fRKRVO\xaa\xca\xb1\xebI\x8fR?\x85\xa8\xe8\xa8\xd2\xcb\x00\x8b\xb0\xb2\x96\xcb\x0bb\xd5\x9e\x83\x13\xd7\x85=\x86\x98\xd5\x7f\x00\x87/\x9a\xcaD\xacD\xff\x87\xa4~V\x1c\xc5\x80\x1f\x00\x85/\x9a\xcdaQ\x96\xdfi~\x94\x03j_)\x14\x10\x04\xca\x82\x15\xc6Z\xb8\xdb\xeab\xd1\xbf\x00\x89 \xc99V!\xeaM\x87\xa3\x87\xa4~V\x1c\xc5\x80?\x00\x8e\xf2\xb4\x96\x93\xac\x96\x93\xac\xc7Z\x83\x90t\x17\x84-Qp\xdd\x00\x8b!\xeaIjJ\xc5\xa8\x87\x90\xd5M\x02br'
DEBUG:hpack.hpack:Decoded 0, consumed 1 bytes
DEBUG:hpack.table:Resizing header table to 0 from 4096
DEBUG:hpack.hpack:Decoded 8, consumed 1 bytes
DEBUG:hpack.hpack:Decoded (b':status', b'200'), consumed 1
DEBUG:hpack.hpack:Decoded 54, consumed 1 bytes
DEBUG:hpack.hpack:Decoded 4, consumed 1 bytes
DEBUG:hpack.hpack:Decoded (b'server', b'nginx'), total consumed 6 bytes, indexed True
DEBUG:hpack.hpack:Decoded 33, consumed 1 bytes
DEBUG:hpack.hpack:Decoded 22, consumed 1 bytes
DEBUG:hpack.hpack:Decoded (b'date', b'Tue, 05 Dec 2023 22:34:58 GMT'), total consumed 24 bytes, indexed True
DEBUG:hpack.hpack:Decoded 31, consumed 1 bytes
DEBUG:hpack.hpack:Decoded 18, consumed 1 bytes
DEBUG:hpack.hpack:Decoded (b'content-type', b'text/html; charset=UTF-8'), total consumed 20 bytes, indexed True
DEBUG:hpack.hpack:Decoded 59, consumed 1 bytes
DEBUG:hpack.hpack:Decoded 11, consumed 1 bytes
DEBUG:hpack.hpack:Decoded (b'vary', b'Accept-Encoding'), total consumed 13 bytes, indexed True
DEBUG:hpack.hpack:Decoded 10, consumed 1 bytes
DEBUG:hpack.hpack:Decoded 25, consumed 1 bytes
DEBUG:hpack.hpack:Decoded (b'server-timing', b'total;dur=51;desc="Backend Total"'), total consumed 38 bytes, indexed False
DEBUG:hpack.hpack:Decoded 15, consumed 1 bytes
DEBUG:hpack.hpack:Decoded 1, consumed 1 bytes
DEBUG:hpack.hpack:Decoded (b'x-duckduckgo-results', <memory at 0x124980f40>), total consumed 19 bytes, indexed False
DEBUG:hpack.hpack:Decoded 17, consumed 1 bytes
DEBUG:hpack.hpack:Decoded 12, consumed 1 bytes
DEBUG:hpack.hpack:Decoded (b'strict-transport-security', b'max-age=31536000'), total consumed 32 bytes, indexed False
DEBUG:hpack.hpack:Decoded 13, consumed 1 bytes
DEBUG:hpack.hpack:Decoded 14, consumed 1 bytes
DEBUG:hpack.hpack:Decoded (b'permissions-policy', b'interest-cohort=()'), total consumed 30 bytes, indexed False
DEBUG:hpack.hpack:Decoded 16, consumed 1 bytes
DEBUG:hpack.hpack:Decoded 1309, consumed 3 bytes
DEBUG:hpack.hpack:Decoded (b'content-security-policy', b"default-src 'none' ; connect-src  https://duckduckgo.com/ https://*.duckduckgo.com https://duckduckgogg42xjoc72x3sjasowoarfbgcmvfimaftt6twagswzczad.onion/ https://spreadprivacy.com/ ; manifest-src  https://duckduckgo.com/ https://*.duckduckgo.com https://duckduckgogg42xjoc72x3sjasowoarfbgcmvfimaftt6twagswzczad.onion/ https://spreadprivacy.com/ ; media-src  https://duckduckgo.com/ https://*.duckduckgo.com https://duckduckgogg42xjoc72x3sjasowoarfbgcmvfimaftt6twagswzczad.onion/ https://spreadprivacy.com/ ; script-src blob:  https://duckduckgo.com/ https://*.duckduckgo.com https://duckduckgogg42xjoc72x3sjasowoarfbgcmvfimaftt6twagswzczad.onion/ https://spreadprivacy.com/ 'unsafe-inline' 'unsafe-eval' ; font-src data:  https://duckduckgo.com/ https://*.duckduckgo.com https://duckduckgogg42xjoc72x3sjasowoarfbgcmvfimaftt6twagswzczad.onion/ https://spreadprivacy.com/ ; img-src data:  https://duckduckgo.com/ https://*.duckduckgo.com https://duckduckgogg42xjoc72x3sjasowoarfbgcmvfimaftt6twagswzczad.onion/ https://spreadprivacy.com/ ; style-src  https://duckduckgo.com/ https://*.duckduckgo.com https://duckduckgogg42xjoc72x3sjasowoarfbgcmvfimaftt6twagswzczad.onion/ https://spreadprivacy.com/ 'unsafe-inline' ; object-src 'none' ; worker-src blob: ; child-src blob:  https://duckduckgo.com/ https://*.duckduckgo.com https://duckduckgogg42xjoc72x3sjasowoarfbgcmvfimaftt6twagswzczad.onion/ https://spreadprivacy.com/ ; frame-src blob:  https://duckduckgo.com/ https://*.duckduckgo.com https://duckduckgogg42xjoc72x3sjasowoarfbgcmvfimaftt6twagswzczad.onion/ https://spreadprivacy.com/ ; form-action  https://duckduckgo.com/ https://*.duckduckgo.com https://duckduckgogg42xjoc72x3sjasowoarfbgcmvfimaftt6twagswzczad.onion/ https://spreadprivacy.com/ ; frame-ancestors 'self' ; base-uri 'self' ; block-all-mixed-content ;"), total consumed 1330 bytes, indexed False
DEBUG:hpack.hpack:Decoded 11, consumed 1 bytes
DEBUG:hpack.hpack:Decoded 9, consumed 1 bytes
DEBUG:hpack.hpack:Decoded (b'x-frame-options', b'SAMEORIGIN'), total consumed 23 bytes, indexed False
DEBUG:hpack.hpack:Decoded 12, consumed 1 bytes
DEBUG:hpack.hpack:Decoded 9, consumed 1 bytes
DEBUG:hpack.hpack:Decoded (b'x-xss-protection', b'1;mode=block'), total consumed 24 bytes, indexed False
DEBUG:hpack.hpack:Decoded 16, consumed 1 bytes
DEBUG:hpack.hpack:Decoded 5, consumed 1 bytes
DEBUG:hpack.hpack:Decoded (b'x-content-type-options', b'nosniff'), total consumed 24 bytes, indexed False
DEBUG:hpack.hpack:Decoded 11, consumed 1 bytes
DEBUG:hpack.hpack:Decoded 5, consumed 1 bytes
DEBUG:hpack.hpack:Decoded (b'referrer-policy', b'origin'), total consumed 19 bytes, indexed False
DEBUG:hpack.hpack:Decoded 7, consumed 1 bytes
DEBUG:hpack.hpack:Decoded 7, consumed 1 bytes
DEBUG:hpack.hpack:Decoded (b'expect-ct', b'max-age=0'), total consumed 17 bytes, indexed False
DEBUG:hpack.hpack:Decoded 5, consumed 1 bytes
DEBUG:hpack.hpack:Decoded 22, consumed 1 bytes
DEBUG:hpack.hpack:Decoded (b'expires', b'Tue, 05 Dec 2023 22:34:59 GMT'), total consumed 30 bytes, indexed False
DEBUG:hpack.hpack:Decoded 9, consumed 1 bytes
DEBUG:hpack.hpack:Decoded 7, consumed 1 bytes
DEBUG:hpack.hpack:Decoded (b'cache-control', b'max-age=1'), total consumed 19 bytes, indexed False
DEBUG:hpack.hpack:Decoded 14, consumed 1 bytes
DEBUG:hpack.hpack:Decoded 4, consumed 1 bytes
DEBUG:hpack.hpack:Decoded (b'x-duckduckgo-locale', b'en_US'), total consumed 21 bytes, indexed False
DEBUG:hpack.hpack:Decoded 11, consumed 1 bytes
DEBUG:hpack.hpack:Decoded 2, consumed 1 bytes
DEBUG:hpack.hpack:Decoded (b'content-encoding', <memory at 0x124981b40>), total consumed 16 bytes, indexed False
DEBUG:httpcore.http2:receive_response_headers.complete return_value=(200, [(b'server', b'nginx'), (b'date', b'Tue, 05 Dec 2023 22:34:58 GMT'), (b'content-type', b'text/html; charset=UTF-8'), (b'vary', b'Accept-Encoding'), (b'server-timing', b'total;dur=51;desc="Backend Total"'), (b'x-duckduckgo-results', b'1'), (b'strict-transport-security', b'max-age=31536000'), (b'permissions-policy', b'interest-cohort=()'), (b'content-security-policy', b"default-src 'none' ; connect-src  https://duckduckgo.com/ https://*.duckduckgo.com https://duckduckgogg42xjoc72x3sjasowoarfbgcmvfimaftt6twagswzczad.onion/ https://spreadprivacy.com/ ; manifest-src  https://duckduckgo.com/ https://*.duckduckgo.com https://duckduckgogg42xjoc72x3sjasowoarfbgcmvfimaftt6twagswzczad.onion/ https://spreadprivacy.com/ ; media-src  https://duckduckgo.com/ https://*.duckduckgo.com https://duckduckgogg42xjoc72x3sjasowoarfbgcmvfimaftt6twagswzczad.onion/ https://spreadprivacy.com/ ; script-src blob:  https://duckduckgo.com/ https://*.duckduckgo.com https://duckduckgogg42xjoc72x3sjasowoarfbgcmvfimaftt6twagswzczad.onion/ https://spreadprivacy.com/ 'unsafe-inline' 'unsafe-eval' ; font-src data:  https://duckduckgo.com/ https://*.duckduckgo.com https://duckduckgogg42xjoc72x3sjasowoarfbgcmvfimaftt6twagswzczad.onion/ https://spreadprivacy.com/ ; img-src data:  https://duckduckgo.com/ https://*.duckduckgo.com https://duckduckgogg42xjoc72x3sjasowoarfbgcmvfimaftt6twagswzczad.onion/ https://spreadprivacy.com/ ; style-src  https://duckduckgo.com/ https://*.duckduckgo.com https://duckduckgogg42xjoc72x3sjasowoarfbgcmvfimaftt6twagswzczad.onion/ https://spreadprivacy.com/ 'unsafe-inline' ; object-src 'none' ; worker-src blob: ; child-src blob:  https://duckduckgo.com/ https://*.duckduckgo.com https://duckduckgogg42xjoc72x3sjasowoarfbgcmvfimaftt6twagswzczad.onion/ https://spreadprivacy.com/ ; frame-src blob:  https://duckduckgo.com/ https://*.duckduckgo.com https://duckduckgogg42xjoc72x3sjasowoarfbgcmvfimaftt6twagswzczad.onion/ https://spreadprivacy.com/ ; form-action  https://duckduckgo.com/ https://*.duckduckgo.com https://duckduckgogg42xjoc72x3sjasowoarfbgcmvfimaftt6twagswzczad.onion/ https://spreadprivacy.com/ ; frame-ancestors 'self' ; base-uri 'self' ; block-all-mixed-content ;"), (b'x-frame-options', b'SAMEORIGIN'), (b'x-xss-protection', b'1;mode=block'), (b'x-content-type-options', b'nosniff'), (b'referrer-policy', b'origin'), (b'expect-ct', b'max-age=0'), (b'expires', b'Tue, 05 Dec 2023 22:34:59 GMT'), (b'cache-control', b'max-age=1'), (b'x-duckduckgo-locale', b'en_US'), (b'content-encoding', b'br')])
INFO:httpx:HTTP Request: POST https://duckduckgo.com/ "HTTP/2 200 OK"
DEBUG:httpcore.http2:receive_response_body.started request=<Request [b'POST']> stream_id=1
DEBUG:httpcore.http2:receive_response_body.complete
DEBUG:httpcore.http2:response_closed.started stream_id=1
DEBUG:httpcore.http2:response_closed.complete
DEBUG:httpcore.connection:close.started
DEBUG:httpcore.connection:close.complete
---------------------------------------------------------------------------
VQDExtractionException                    Traceback (most recent call last)
Cell In[88], line 6
      3 from duckduckgo_search import DDGS
      5 with DDGS() as ddgs:
----> 6     results = [r for r in ddgs.text("python programming", max_results=5)]
      7     print(results)

Cell In[88], line 6, in <listcomp>(.0)
      3 from duckduckgo_search import DDGS
      5 with DDGS() as ddgs:
----> 6     results = [r for r in ddgs.text("python programming", max_results=5)]
      7     print(results)

File ~/GitRepo/solvegraph_corporate/.venv/lib/python3.11/site-packages/duckduckgo_search/duckduckgo_search.py:105, in DDGS.text(self, keywords, region, safesearch, timelimit, backend, max_results)
    102     results = self._text_lite(keywords, region, timelimit, max_results)
    104 if results:
--> 105     for i, result in enumerate(results, start=1):
    106         yield result
    107         if max_results and i >= max_results:

File ~/GitRepo/solvegraph_corporate/.venv/lib/python3.11/site-packages/duckduckgo_search/duckduckgo_search.py:133, in DDGS._text_api(self, keywords, region, safesearch, timelimit, max_results)
    118 """DuckDuckGo text search generator. Query params: https://duckduckgo.com/params
    119 
    120 Args:
   (...)
    129 
    130 """
    131 assert keywords, "keywords is mandatory"
--> 133 vqd = self._get_vqd(keywords)
    135 payload = {
    136     "q": keywords,
    137     "kl": region,
   (...)
    144     "sp": "0",
    145 }
    146 safesearch = safesearch.lower()

File ~/GitRepo/solvegraph_corporate/.venv/lib/python3.11/site-packages/duckduckgo_search/duckduckgo_search.py:65, in DDGS._get_vqd(self, keywords)
     63 resp = self._get_url("POST", "https://duckduckgo.com/", data={"q": keywords})
     64 if resp:
---> 65     return _extract_vqd(resp.content, keywords)

File ~/GitRepo/solvegraph_corporate/.venv/lib/python3.11/site-packages/duckduckgo_search/utils.py:38, in _extract_vqd(html_bytes, keywords)
     36     except ValueError:
     37         pass
---> 38 raise VQDExtractionException(f"Could not extract vqd. {keywords=}")

VQDExtractionException: Could not extract vqd. keywords='python programming'

Specify this information

  • OS: macOS
  • environment: pipenv --python 3.11.6
  • duckduckgo_search version: 3.9.9
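
The traceback above shows the exception escaping from the generator returned by ddgs.text(). If the failure turns out to be transient (the report does not confirm this), a caller could wrap the call in a retry loop with a growing delay. The sketch below is a generic helper and not part of duckduckgo_search; retry_call, its parameters, and the stand-in flaky_search are hypothetical names used only for illustration.

```python
import time


def retry_call(func, attempts=3, base_delay=1.0, exceptions=(Exception,), sleep=time.sleep):
    """Call func() up to `attempts` times, doubling the delay after each failure."""
    for attempt in range(attempts):
        try:
            return func()
        except exceptions:
            if attempt == attempts - 1:
                raise  # out of attempts: surface the last error to the caller
            sleep(base_delay * 2 ** attempt)


# Hypothetical stand-in for a search call that fails twice and then succeeds.
calls = {"count": 0}


def flaky_search():
    calls["count"] += 1
    if calls["count"] < 3:
        raise RuntimeError("Could not extract vqd.")
    return [{"title": "Python", "href": "https://www.python.org/"}]


delays = []  # record the requested delays instead of really sleeping
results = retry_call(flaky_search, sleep=delays.append)
print(results, delays)
```

With the library, func would be something like lambda: list(ddgs.text("python programming", max_results=5)). Building the list inside the callable matters, because the generator only raises once it is iterated.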

Problem with advertisements in search results

Describe the bug
In every search, an advertisement link appears as the first result. I have only noticed this in the latest version, 3.9.8.

code:

from duckduckgo_search import DDGS
from itertools import islice

web_results = []

with DDGS(timeout=10) as ddgs:
    result_web_search = ddgs.text(
        keywords='dog',
        region='br-pt',
        safesearch='off',
        timelimit='',
        backend='html',
    )
    
    for r in islice(result_web_search, 3):
        web_results.append(r)
list({i['href']: i for i in web_results}.values())

Specify this information

  • Manjaro Linux
  • Python 3.11.5
  • duckduckgo_search version: 3.9.8
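
Until the library filters such links itself, one option is to drop them on the caller's side. The sketch below assumes that ad links are redirects whose href starts with https://duckduckgo.com/y.js; the report does not show the offending links, so that prefix is an assumption to check against real output. The function name drop_ads_and_duplicates is hypothetical. It also folds in the de-duplication from the snippet above, but keeps the first result per href rather than the last.

```python
def drop_ads_and_duplicates(results, ad_prefix="https://duckduckgo.com/y.js"):
    """Return results without assumed ad redirects, keeping the first item per href."""
    unique = {}
    for item in results:
        href = item.get("href", "")
        if href.startswith(ad_prefix):
            continue  # assumed ad redirect, skip it
        unique.setdefault(href, item)  # dicts preserve insertion order
    return list(unique.values())


# Made-up sample data shaped like the dicts yielded by ddgs.text().
sample = [
    {"title": "Sponsored", "href": "https://duckduckgo.com/y.js?ad_domain=example.com"},
    {"title": "Dog", "href": "https://en.wikipedia.org/wiki/Dog"},
    {"title": "Dog (duplicate)", "href": "https://en.wikipedia.org/wiki/Dog"},
]
cleaned = drop_ads_and_duplicates(sample)
print(cleaned)
```

Because the filter can discard items, the caller may need to pull more than three results from the generator before slicing.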

403 forbidden error when hosting script

Describe the bug

I love duckduckgo-search, but I’ve been having issues with fetching images when hosting my script on Cybrancee. My script uses Python 3.10.12.

Whilst using the duckduckgo-search library to fetch images from DuckDuckGo, I encounter an "HTTPError 403 Client Error: Forbidden for url" error. This issue does not occur when running the bot locally – only when hosted on Cybrancee, which uses a Pterodactyl panel. Scraping web pages or search engines works fine, and fetching search results with duckduckgo-search works fine, too. Fetching images is the only thing that does not work.

I also tried proxies, headers, and a user agent. However, I still have the same problem.

For some odd reason, I’m able to scrape DuckDuckGo search results with duckduckgo-search just fine on my host:

ddg_link = DDGS(headers=new_headers, proxies=proxies, timeout=15).text(q)

However, when scraping image results instead, it does not work. Code:

rand_ua = get_ua()

logging.debug(f'[ddg_img.py] User agent: {rand_ua}')

headers = {
    "Accept": "text/html,application/xhtml+xml,application/xml;q=0.9,image/webp,image/apng,*/*;q=0.8,application/signed-exchange;v=b3;q=0.9",
    "Accept-Encoding": "gzip, deflate",
    "Accept-Language": "en-GB,en-US;q=0.9,en;q=0.8",
    "Dnt": "1",
    "Upgrade-Insecure-Requests": "1",
    "User-Agent": rand_ua,
}

ddgs = DDGS(headers=headers, proxies=proxies, timeout=15)

async def get_ddg_image(query, first=False):
    # Remove all punctuation marks except hyphens from query
    query = re.sub(r'[^\w\s-]', '', query)
    logging.debug(f'[get_ddg_image()] query: {query}')

    keywords = query
    ddgs_images_gen = ddgs.images(
        keywords,
        region='wt-wt',
        safesearch='On'
    )
    # Take up to the first 10 results from the generator
    images = list(itertools.islice(ddgs_images_gen, 10))

    if first:
        # Get the first image
        image = images[0] if images else None
    else:
        # Get random image
        image = random.choice(images) if images else None

May be related to #100; however, unlike that issue, it does not happen periodically for me. It happens with every attempt – but only when hosting, not when running the script locally.

I was using version 3.2.0 of duckduckgo-search, then updated to the latest version, 3.8.3. However, the issue still occurs in the same way it did before.

I have seen #84 and #98. However, you (@deedy5) said that updating might fix it, but it did not. You also said that it’s not a library problem and that a proxy or increasing the time between requests might fix the issue, but in my case, it occurs every time, even if I haven’t made any recent requests, and I have tried both with and without proxies.

The strange anomaly is that it functions perfectly locally but not when hosting on Cybrancee (I have not tried other hosts) – and that using the same library to scrape DDG search results works perfectly with the same headers, UA, and proxies, but when trying to get images, it does not work. I’m not sure what is causing this, but if you could offer some assistance in fixing this issue, it would be much appreciated, as I am quite lost!

Errors

WARNING:duckduckgo_search.duckduckgo_search:_get_url() https://duckduckgo.com/i.js HTTPError 403 Client Error: Forbidden for url: https://duckduckgo.com/i.js?l=wt-wt&o=json&s=0&q=Potato+picture&vqd=4-7287769708002951745556569444305599608&f=%2C%2C%2C%2C%2C&p=1
ERROR:__main__:Unhandled error in on_message
Traceback (most recent call last):
  File "/home/container/.local/lib/python3.10/site-packages/discord/client.py", line 441, in _run_event
    await coro(*args, **kwargs)
  File "/home/container/script.py", line 6134, in on_message
    image_query, image_url, image_title = await fetch_image(msg, ai_response, server_id, channel_id, should_fetch, fetch_image_type)
  File "/home/container/script.py", line 2799, in fetch_image
    image_url, image_title = await get_ddg_image(image_query)
  File "/home/container/ddg_img.py", line 73, in get_ddg_image
    images = list(itertools.islice(ddgs_images_gen, 10))
  File "/home/container/.local/lib/python3.10/site-packages/duckduckgo_search/duckduckgo_search.py", line 230, in images
    resp = self._get_url("GET", "https://duckduckgo.com/i.js", params=payload)
  File "/home/container/.local/lib/python3.10/site-packages/duckduckgo_search/duckduckgo_search.py", line 69, in _get_url
    raise ex
  File "/home/container/.local/lib/python3.10/site-packages/duckduckgo_search/duckduckgo_search.py", line 64, in _get_url
    resp.raise_for_status()
  File "/home/container/.local/lib/python3.10/site-packages/requests/models.py", line 1021, in raise_for_status
    raise HTTPError(http_error_msg, response=self)
requests.exceptions.HTTPError: 403 Client Error: Forbidden for url: https://duckduckgo.com/i.js?l=wt-wt&o=json&s=0&q=Potato+picture&vqd=4-7287769708002951745556569444305599608&f=%2C%2C%2C%2C%2C&p=1

Information

  • Environment: Cybrancee (Pterodactyl panel)
  • duckduckgo-search version: 3.8.3 (latest)

418 Client Error

I'm testing the package with multiple IPs (proxies) and different user agents. I'm using multithreading (each thread rotates its IP with every new request).
After some time I get:
requests.exceptions.HTTPError: 418 Client Error: for url: https://duckduckgo.com/
and
WARNING keywords=testword _get_vqd() is None. Refresh session and retry...

Although I rotate IPs and UAs on each request, after some time all requests get blocked with that error. I suppose it is related to DuckDuckGo's anti-bot system.
Are you aware of any limits related to IPs or UAs?
Do you think they also have limits related to searching specific sites (e.g. mrbeast site:instagram.com)?
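
If the blocking is driven by overall request rate rather than by a single IP or UA, one thing to try is a throttle shared by all threads. The sketch below is a minimal version of that idea; the Throttle class and the 2-second interval are hypothetical, the report does not state what limits DuckDuckGo applies, and throttling may not prevent the 418 response.

```python
import threading
import time


class Throttle:
    """Allow at most one call per `min_interval` seconds across all threads."""

    def __init__(self, min_interval, clock=time.monotonic, sleep=time.sleep):
        self.min_interval = min_interval
        self._clock = clock
        self._sleep = sleep
        self._lock = threading.Lock()
        self._next_allowed = 0.0

    def wait(self):
        """Reserve the next free time slot, sleep until it arrives, return the delay."""
        with self._lock:
            now = self._clock()
            slot = max(now, self._next_allowed)
            self._next_allowed = slot + self.min_interval
        delay = slot - now
        if delay > 0:
            self._sleep(delay)  # sleep outside the lock so other threads can reserve slots
        return delay


# Demonstration with a frozen fake clock so the example runs instantly.
waits = []
throttle = Throttle(2.0, clock=lambda: 100.0, sleep=waits.append)
delays = [throttle.wait() for _ in range(3)]
print(delays)
```

Each worker thread would call throttle.wait() on a single shared Throttle instance immediately before issuing its search request.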

HTTPError 403 Client Error: Forbidden for url

Describe the bug

I can't get the image search feature to work.
Here is a colab to reproduce the issue: https://colab.research.google.com/drive/1yAVjMaZxe_eaVJ-J1cvqPbfBcBaQWYvU?usp=sharing

Debug log

WARNING:duckduckgo_search.duckduckgo_search:_get_url() https://duckduckgo.com/i.js HTTPError 403 Client Error: Forbidden for url: https://duckduckgo.com/i.js?l=wt-wt&o=json&s=0&q=butterfly&vqd=4-127388145370558534196193019016331789945&f=%2C%2Ccolor%3AMonochrome%2C%2C%2C&p=-1
WARNING:duckduckgo_search.duckduckgo_search:_get_url() https://duckduckgo.com/i.js HTTPError 403 Client Error: Forbidden for url: https://duckduckgo.com/i.js?l=wt-wt&o=json&s=0&q=butterfly&vqd=4-127388145370558534196193019016331789945&f=%2C%2Ccolor%3AMonochrome%2C%2C%2C&p=-1
WARNING:duckduckgo_search.duckduckgo_search:_get_url() https://duckduckgo.com/i.js HTTPError 403 Client Error: Forbidden for url: https://duckduckgo.com/i.js?l=wt-wt&o=json&s=0&q=butterfly&vqd=4-127388145370558534196193019016331789945&f=%2C%2Ccolor%3AMonochrome%2C%2C%2C&p=-1
---------------------------------------------------------------------------
HTTPError                                 Traceback (most recent call last)
<ipython-input-2-fe9bf968464c> in <cell line: 16>()
     14     license_image=None,
     15 )
---> 16 for r in ddgs_images_gen:
     17     print(r)

3 frames
/usr/local/lib/python3.10/dist-packages/duckduckgo_search/duckduckgo_search.py in images(self, keywords, region, safesearch, timelimit, size, color, type_image, layout, license_image)
    395         cache = set()
    396         for _ in range(10):
--> 397             resp = self._get_url("GET", "https://duckduckgo.com/i.js", params=payload)
    398             if resp is None:
    399                 break

/usr/local/lib/python3.10/dist-packages/duckduckgo_search/duckduckgo_search.py in _get_url(self, method, url, **kwargs)
     69                 logger.warning(f"_get_url() {url} {type(ex).__name__} {ex}")
     70                 if i >= 2 or "418" in str(ex):
---> 71                     raise ex
     72             sleep(3)
     73         return None

/usr/local/lib/python3.10/dist-packages/duckduckgo_search/duckduckgo_search.py in _get_url(self, method, url, **kwargs)
     63                 if self._is_500_in_url(resp.url) or resp.status_code == 202:
     64                     raise requests.HTTPError
---> 65                 resp.raise_for_status()
     66                 if resp.status_code == 200:
     67                     return resp

/usr/local/lib/python3.10/dist-packages/requests/models.py in raise_for_status(self)
   1019 
   1020         if http_error_msg:
-> 1021             raise HTTPError(http_error_msg, response=self)
   1022 
   1023     def close(self):

HTTPError: 403 Client Error: Forbidden for url: https://duckduckgo.com/i.js?l=wt-wt&o=json&s=0&q=butterfly&vqd=4-127388145370558534196193019016331789945&f=%2C%2Ccolor%3AMonochrome%2C%2C%2C&p=-1

Specify this information

  • OS (Windows and Linux)
  • environment
  • duckduckgo_search version 3.5.0

Image search sometimes stops working

Describe the bug
The program raises httpx.RemoteProtocolError because the server disconnects without sending a response. It fails while downloading one specific image and then stops downloading the rest. I ran into this while running an image search from the command line in Colab, using this command: !ddgs images -k "Audi_A3" -m 180 -s off -d

Debug log
Downloading [##########----------------------------------------] 37/180 20% 00:00:44
Traceback (most recent call last):
File "/usr/local/lib/python3.10/dist-packages/httpx/_transports/default.py", line 60, in map_httpcore_exceptions
yield
File "/usr/local/lib/python3.10/dist-packages/httpx/_transports/default.py", line 353, in handle_async_request
resp = await self._pool.handle_async_request(req)
File "/usr/local/lib/python3.10/dist-packages/httpcore/_async/connection_pool.py", line 262, in handle_async_request
raise exc
File "/usr/local/lib/python3.10/dist-packages/httpcore/_async/connection_pool.py", line 245, in handle_async_request
response = await connection.handle_async_request(request)
File "/usr/local/lib/python3.10/dist-packages/httpcore/_async/connection.py", line 96, in handle_async_request
return await self._connection.handle_async_request(request)
File "/usr/local/lib/python3.10/dist-packages/httpcore/_async/http11.py", line 121, in handle_async_request
raise exc
File "/usr/local/lib/python3.10/dist-packages/httpcore/_async/http11.py", line 99, in handle_async_request
) = await self._receive_response_headers(**kwargs)
File "/usr/local/lib/python3.10/dist-packages/httpcore/_async/http11.py", line 164, in _receive_response_headers
event = await self._receive_event(timeout=timeout)
File "/usr/local/lib/python3.10/dist-packages/httpcore/_async/http11.py", line 214, in _receive_event
raise RemoteProtocolError(msg)
httpcore.RemoteProtocolError: Server disconnected without sending a response.

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
File "/usr/local/bin/ddgs", line 8, in <module>
sys.exit(cli())
File "/usr/local/lib/python3.10/dist-packages/click/core.py", line 1157, in __call__
return self.main(*args, **kwargs)
File "/usr/local/lib/python3.10/dist-packages/click/core.py", line 1078, in main
rv = self.invoke(ctx)
File "/usr/local/lib/python3.10/dist-packages/click/core.py", line 1719, in invoke
rv.append(sub_ctx.command.invoke(sub_ctx))
File "/usr/local/lib/python3.10/dist-packages/click/core.py", line 1434, in invoke
return ctx.invoke(self.callback, **ctx.params)
File "/usr/local/lib/python3.10/dist-packages/click/core.py", line 783, in invoke
return __callback(*args, **kwargs)
File "/usr/local/lib/python3.10/dist-packages/duckduckgo_search/cli.py", line 435, in images
download_results(keywords, data, images=True, proxy=proxy, threads=threads)
File "/usr/local/lib/python3.10/dist-packages/duckduckgo_search/cli.py", line 164, in download_results
asyncio.run(_download_results(keywords, results, images, proxy))
File "/usr/lib/python3.10/asyncio/runners.py", line 44, in run
return loop.run_until_complete(main)
File "/usr/lib/python3.10/asyncio/base_events.py", line 649, in run_until_complete
return future.result()
File "/usr/local/lib/python3.10/dist-packages/duckduckgo_search/cli.py", line 157, in _download_results
await future
File "/usr/lib/python3.10/asyncio/tasks.py", line 571, in _wait_for_one
return f.result() # May raise f.exception().
File "/usr/local/lib/python3.10/dist-packages/duckduckgo_search/cli.py", line 105, in download_file
async with client.stream("GET", url) as resp:
File "/usr/lib/python3.10/contextlib.py", line 199, in __aenter__
return await anext(self.gen)
File "/usr/local/lib/python3.10/dist-packages/httpx/_client.py", line 1573, in stream
response = await self.send(
File "/usr/local/lib/python3.10/dist-packages/httpx/_client.py", line 1617, in send
response = await self._send_handling_auth(
File "/usr/local/lib/python3.10/dist-packages/httpx/_client.py", line 1645, in _send_handling_auth
response = await self._send_handling_redirects(
File "/usr/local/lib/python3.10/dist-packages/httpx/_client.py", line 1682, in _send_handling_redirects
response = await self._send_single_request(request)
File "/usr/local/lib/python3.10/dist-packages/httpx/_client.py", line 1719, in _send_single_request
response = await transport.handle_async_request(request)
File "/usr/local/lib/python3.10/dist-packages/httpx/_transports/default.py", line 352, in handle_async_request
with map_httpcore_exceptions():
File "/usr/lib/python3.10/contextlib.py", line 153, in __exit__
self.gen.throw(typ, value, traceback)
File "/usr/local/lib/python3.10/dist-packages/httpx/_transports/default.py", line 77, in map_httpcore_exceptions
raise mapped_exc(message) from exc
httpx.RemoteProtocolError: Server disconnected without sending a response.

Specify this information

  • Mac
  • google colab
  • duckduckgo_search v3.8.3
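
In the traceback above, the exception from a single download_file call propagates through await future in _download_results and ends the whole run. One common way to keep the remaining downloads going is to collect per-URL errors instead of raising them. The sketch below illustrates that pattern with asyncio.gather(return_exceptions=True); it is not the code in cli.py, and download_all and fake_fetch are hypothetical names, with fake_fetch standing in for a real HTTP download.

```python
import asyncio


async def download_all(urls, fetch):
    """Run fetch(url) for every URL and collect errors instead of raising them."""
    outcomes = await asyncio.gather(*(fetch(url) for url in urls), return_exceptions=True)
    succeeded = {u: r for u, r in zip(urls, outcomes) if not isinstance(r, Exception)}
    failed = {u: r for u, r in zip(urls, outcomes) if isinstance(r, Exception)}
    return succeeded, failed


async def fake_fetch(url):
    # Stand-in for a real download; one URL simulates the dropped connection.
    await asyncio.sleep(0)
    if url.endswith("broken.jpg"):
        raise ConnectionError("Server disconnected without sending a response.")
    return f"saved {url}"


urls = [
    "https://example.com/a.jpg",
    "https://example.com/broken.jpg",
    "https://example.com/c.jpg",
]
succeeded, failed = asyncio.run(download_all(urls, fake_fetch))
print(len(succeeded), "downloaded;", len(failed), "failed")
```

The failed mapping can then be logged or retried without losing the files that did download.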

Search with site: resulting in RateLimitException

Describe the bug
Very frequently, maybe all the time, queries for a specific site result in RateLimitException.
This problem happens with both API calls and the ddgs script. I am showing the behavior with ddgs script because it is faster to reproduce without pasting any code here:

This works fine and gives me two results:

ddgs text -k 'ayrton senna' -m 2

This does not work and gives me the following error:

ddgs text -k 'ayrton senna site:wikipedia.org' -m 2


Traceback (most recent call last):
  File "/Users/joao/miniforge3/envs/test_ddg/bin/ddgs", line 8, in <module>
    sys.exit(cli())
  File "/Users/joao/miniforge3/envs/test_ddg/lib/python3.10/site-packages/click/core.py", line 1157, in __call__
    return self.main(*args, **kwargs)
  File "/Users/joao/miniforge3/envs/test_ddg/lib/python3.10/site-packages/click/core.py", line 1078, in main
    rv = self.invoke(ctx)
  File "/Users/joao/miniforge3/envs/test_ddg/lib/python3.10/site-packages/click/core.py", line 1719, in invoke
    rv.append(sub_ctx.command.invoke(sub_ctx))
  File "/Users/joao/miniforge3/envs/test_ddg/lib/python3.10/site-packages/click/core.py", line 1434, in invoke
    return ctx.invoke(self.callback, **ctx.params)
  File "/Users/joao/miniforge3/envs/test_ddg/lib/python3.10/site-packages/click/core.py", line 783, in invoke
    return __callback(*args, **kwargs)
  File "/Users/joao/miniforge3/envs/test_ddg/lib/python3.10/site-packages/duckduckgo_search/cli.py", line 150, in text
    for r in DDGS(proxies=proxy).text(
  File "/Users/joao/miniforge3/envs/test_ddg/lib/python3.10/site-packages/duckduckgo_search/duckduckgo_search.py", line 105, in text
    for i, result in enumerate(results, start=1):
  File "/Users/joao/miniforge3/envs/test_ddg/lib/python3.10/site-packages/duckduckgo_search/duckduckgo_search.py", line 156, in _text_api
    resp = self._get_url("GET", "https://links.duckduckgo.com/d.js", params=payload)
  File "/Users/joao/miniforge3/envs/test_ddg/lib/python3.10/site-packages/duckduckgo_search/duckduckgo_search.py", line 48, in _get_url
    raise RateLimitException(f"_get_url() {url}")
duckduckgo_search.exceptions.RateLimitException: _get_url() https://links.duckduckgo.com/d.js

Specify this information

  • OS:
    Apple M1 Pro
    macOS 14.0

  • environment
    New conda env created with:

conda create --name test_ddg python=3.10
pip install duckduckgo_search

Output of pip freeze:

aiofiles==23.2.1
anyio==4.1.0
Brotli==1.1.0
certifi==2023.11.17
click==8.1.7
duckduckgo-search==3.9.9
exceptiongroup==1.2.0
h11==0.14.0
h2==4.1.0
hpack==4.0.0
httpcore==1.0.2
httpx==0.25.2
hyperframe==6.0.1
idna==3.6
lxml==4.9.3
sniffio==1.3.0
socksio==1.0.0
  • duckduckgo_search version
duckduckgo_search-3.9.9-py3-none-any.whl

Thanks in advance for having a look at it and thanks for the amazing package!
Joao

Got empty results when using the example

I ran the code, but sometimes I just got None.

from duckduckgo_search import ddg

keywords = 'Bella Ciao'
results = ddg(keywords, region='wt-wt', safesearch='Off', time='y')
print(results)

Text search: TypeError related to _get_url()

UPDATE: This was an error on my end. Please feel free to delete!

Thanks for this awesome package. It is so useful! I encountered a new issue this morning: I updated to version 3.9.11, and when I do a text search, I get a TypeError that seems to be related to the _get_url function.

Here is the script I'm running:

results=[]
with DDGS() as ddgs:
    for r in ddgs.text("When was George Washington born?", safesearch='moderate', timelimit='y', max_results=3):
        results.append(r)
results

And here is the full error message:

TypeError                                 Traceback (most recent call last)
File ~/anaconda3/envs/python3/lib/python3.10/site-packages/duckduckgo_search/duckduckgo_search.py:45, in DDGS._get_url(self, method, url, **kwargs)
     44 try:
---> 45     resp = self._client.request(method, url, follow_redirects=True, **kwargs)
     46     if _is_500_in_url(str(resp.url)) or resp.status_code == 403:

TypeError: Client.request() got an unexpected keyword argument 'follow_redirects'

During handling of the above exception, another exception occurred:

DuckDuckGoSearchException                 Traceback (most recent call last)
Cell In[16], line 3
      1 results=[]
      2 with DDGS() as ddgs:
----> 3     for r in ddgs.text("When was George Washington born?", safesearch='moderate', timelimit='y', max_results=3):
      4         results.append(r)
      5 results

File ~/anaconda3/envs/python3/lib/python3.10/site-packages/duckduckgo_search/duckduckgo_search.py:106, in DDGS.text(self, keywords, region, safesearch, timelimit, backend, max_results)
    103     results = self._text_lite(keywords, region, timelimit, max_results)
    105 if results:
--> 106     for i, result in enumerate(results, start=1):
    107         yield result
    108         if max_results and i >= max_results:

File ~/anaconda3/envs/python3/lib/python3.10/site-packages/duckduckgo_search/duckduckgo_search.py:134, in DDGS._text_api(self, keywords, region, safesearch, timelimit, max_results)
    119 """DuckDuckGo text search generator. Query params: https://duckduckgo.com/params
    120 
    121 Args:
   (...)
    130 
    131 """
    132 assert keywords, "keywords is mandatory"
--> 134 vqd = self._get_vqd(keywords)
    136 payload = {
    137     "q": keywords,
    138     "kl": region,
   (...)
    145     "sp": "0",
    146 }
    147 safesearch = safesearch.lower()

File ~/anaconda3/envs/python3/lib/python3.10/site-packages/duckduckgo_search/duckduckgo_search.py:64, in DDGS._get_vqd(self, keywords)
     62 def _get_vqd(self, keywords: str) -> Optional[str]:
     63     """Get vqd value for a search query."""
---> 64     resp = self._get_url("POST", "https://duckduckgo.com/", data={"q": keywords})
     65     if resp:
     66         return _extract_vqd(resp.content, keywords)

File ~/anaconda3/envs/python3/lib/python3.10/site-packages/duckduckgo_search/duckduckgo_search.py:60, in DDGS._get_url(self, method, url, **kwargs)
     58     raise HTTPException(f"_get_url() {url} HttpError: {ex}")
     59 except Exception as ex:
---> 60     raise DuckDuckGoSearchException(f"_get_url() {url} {type(ex).__name__}: {ex}")

DuckDuckGoSearchException: _get_url() https://duckduckgo.com/ TypeError: Client.request() got an unexpected keyword argument 'follow_redirects'

automatically get new UserAgents

Is your feature request related to a problem? Please describe.
The user agents become outdated after a few days. An automatic way to update them would be good.

Describe the solution you'd like
Fetch the four newest user agents that are needed, without manual work.

Describe alternatives you've considered
Something like this could be used in "utils":


import requests

# Hard-coded fallback used when the remote list cannot be fetched
FALLBACK_USERAGENTS = [
    "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/117.0.0.0 Safari/537.36",
    "Mozilla/5.0 (Windows NT 10.0; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/117.0.0.0 Safari/537.36",
    "Mozilla/5.0 (Windows NT 10.0) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/117.0.0.0 Safari/537.36",
    "Mozilla/5.0 (Macintosh; Intel Mac OS X 14_0) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/117.0.0.0 Safari/537.36",
]

try:
    # JSON URL containing user agents
    json_url = "https://jnrbsn.github.io/user-agents/user-agents.json"

    # Fetch the JSON data from the URL; fail fast on network or HTTP errors
    response = requests.get(json_url, timeout=10)
    response.raise_for_status()

    # Keep the first 4 user agents as the USERAGENTS list
    USERAGENTS = response.json()[:4]

except (requests.RequestException, ValueError):
    # Network failure or invalid JSON: use the fallback list
    USERAGENTS = FALLBACK_USERAGENTS

Folder name error when using site: example.example.com for image download

To reproduce the error

  1. Downloading the package through pip
    pip install -U

  2. Use the ddg_images function with the following query, keeping the parameter download=True:

    keyword = 'bangladeshi 100 taka note site:en.numista.com'
    r = ddg_images(keywords=keyword,safesearch='off',layout='Wide',max_results=50,size='Large',download=True)

    print(r)

Stack trace of the error:

  File "D:\Folder\duck_duck_go_scraper.py", line 15, in <module>
    main()
  File "D:\Folder\duck_duck_go_scraper.py", line 11, in main
    r = ddg_images(keywords=keyword,safesearch='off',layout='Wide',max_results=50,size='Large',download=True)
        ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "D:\Folder\venv\Lib\site-packages\duckduckgo_search\ddg_images.py", line 143, in ddg_images
    os.makedirs(path, exist_ok=True)
  File "<frozen os>", line 225, in makedirs
NotADirectoryError: [WinError 267] The directory name is invalid: 'ddg_images_bangladeshi 100 taka note site:en.numista.com_20230415_120129'
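
The failure above comes from using the raw query, including the `:` from `site:`, as a directory name, which Windows rejects. A minimal sketch of one possible workaround follows, assuming the caller can clean the keywords before they become a path component. The helper name `safe_name` is hypothetical and not part of duckduckgo_search.

```python
import re

# Characters that Windows rejects in file and directory names, plus control characters.
_INVALID_CHARS = re.compile(r'[<>:"/\\|?*\x00-\x1f]')


def safe_name(keywords: str, replacement: str = "_") -> str:
    """Return keywords with filesystem-unsafe characters replaced (hypothetical helper)."""
    cleaned = _INVALID_CHARS.sub(replacement, keywords)
    # Windows also rejects names that end in a space or a dot.
    return cleaned.rstrip(" .") or replacement


print(safe_name("bangladeshi 100 taka note site:en.numista.com"))
```

The same substitution also removes `/` and `"`, so it would cover queries such as `site:"wikipedia.com/wiki/"` when they are used in an output file name.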

Date range in searches

Thank you very much for your project and the time you have dedicated to it. I find it both interesting and highly useful.

I have a small suggestion regarding the search functionality. It would be great if you could consider adding an option to restrict searches to a time interval. For instance, on duckduckgo.com the date limit is set by including the "df" parameter in the URL: &df=YYYY-MM-DD..YYYY-MM-DD

For example: https://duckduckgo.com/?q=test&df=2023-05-15..2023-05-18

Would it be possible to incorporate this feature into your project?

Once again, thank you for your exceptional work.
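
The `df` format described in this request can be sketched as follows. This only builds the browser URL from the example; whether the library's own endpoints accept `df` is an assumption that the request does not confirm, and `date_range_url` is a hypothetical helper name.

```python
from datetime import date
from urllib.parse import urlencode


def date_range_url(query: str, start: date, end: date) -> str:
    """Build a duckduckgo.com search URL limited to [start, end] via the df parameter."""
    if start > end:
        raise ValueError("start must not be after end")
    # df takes two ISO dates separated by two dots.
    df = f"{start.isoformat()}..{end.isoformat()}"
    return "https://duckduckgo.com/?" + urlencode({"q": query, "df": df})


print(date_range_url("test", date(2023, 5, 15), date(2023, 5, 18)))
# prints https://duckduckgo.com/?q=test&df=2023-05-15..2023-05-18
```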

DDGS: how to set max_results like ddg()

I would like to know how to set max_results for DDGS().text(), as with ddg().

With ddg(), I can do the following:

results = ddg("my query", max_results=5)

But DDGS seems to have lost that ability: it does not accept the max_results parameter. Will max_results be provided for DDGS in the future?
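
Until a parameter is available, one generic way to cap results is to slice the iterator on the caller's side. The sketch below assumes that DDGS().text() yields results lazily, as other reports on this page iterate over it. `take` and `fake_results` are hypothetical names, and the generator is only a stand-in so the example runs without network access.

```python
from itertools import islice


def take(results, limit):
    """Return at most `limit` items from any iterable, including generators."""
    return list(islice(results, limit))


def fake_results():
    # Stand-in for a results generator; it never ends on its own.
    n = 0
    while True:
        n += 1
        yield {"title": f"result {n}"}


first_five = take(fake_results(), 5)
print(len(first_five))
```

Because `islice` stops pulling from the generator once the limit is reached, no further items are requested after the fifth.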

Error on line 89

The following search strings failed:

  • "Alcyone" AAV gene therapy
  • "Andelyn Bio" AAV gene therapy
  • "Cevec" AAV gene therapy
  • "Locana Bio" AAV gene therapy

Other similar search strings failed as well, while some strings worked. There seems to be some sort of inconsistency. Not sure why. Maybe something to do with the cache?

Traceback (most recent call last):
  File "/home/DNA/anaconda3/lib/python3.8/site-packages/duckduckgo_search.py", line 89, in ddg
    s = r["n"].split("s=")[1].split('&')[0]
KeyError: 'n'

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "ducksearch.py", line 20, in <module>
    results = ddg(search_str, region='wt-wt', safesearch='Moderate', time=None, max_results=25)
  File "/home/DNA/anaconda3/lib/python3.8/site-packages/duckduckgo_search.py", line 98, in ddg
    'body': _normalize(r['a']),
  File "/home/DNA/anaconda3/lib/python3.8/site-packages/duckduckgo_search.py", line 39, in _normalize
    body = html.fromstring(text)
  File "/home/DNA/anaconda3/lib/python3.8/site-packages/lxml/html/__init__.py", line 875, in fromstring
    doc = document_fromstring(html, parser=parser, base_url=base_url, **kw)
  File "/home/DNA/anaconda3/lib/python3.8/site-packages/lxml/html/__init__.py", line 763, in document_fromstring
    raise etree.ParserError(
lxml.etree.ParserError: Document is empty
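
The second traceback shows the parser being handed an empty snippet. A minimal sketch of a guard follows, assuming the snippet lives under the key `a` as the traceback suggests. It swaps lxml for a plain regular expression so that it is self-contained, and `normalize` and `body_of` are hypothetical names rather than the library's own functions.

```python
import html
import re

_TAG = re.compile(r"<[^>]+>")


def normalize(raw):
    """Strip tags and unescape entities; return "" for empty or missing input."""
    if not raw or not raw.strip():
        return ""
    return html.unescape(_TAG.sub("", raw))


def body_of(row):
    # "a" is the key shown in the traceback; a missing key yields "".
    return normalize(row.get("a"))


print(body_of({"a": "<b>AAV</b> gene &amp; therapy"}))
```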

answers() doesn't work at all!

Describe the bug
answers() doesn't work at all for any query 😔

Debug log

answer_query = "france capital"
answers_results = list(ddgs.answers(answer_query))
print(answers_results)

Output:

<Response [200 OK]>
[]

Specify this information

  • OS: Windows
  • environment: Anaconda, Python 3.10
  • duckduckgo_search version: latest

Downloading files associated with a search keyword

Thanks so much, this is a fantastic project and I hope it stays alive for a long time.

As a friendly suggestion, it would be great to add a new function for downloading files associated with a key term, such as PDF or PPTX files, something like ddgs document.

Again, thanks for such a powerful package.
Regards.

File write errors out when slash is present in query keywords

File write errors out when slash is present in query keywords.

Stack trace of error when package is pip install'ed:

    results = ddg(
  File "/path/.venv/lib/python3.9/site-packages/duckduckgo_search/ddg.py", line 116, in ddg
    _do_output("ddg", keywords, output, results)
  File "/path/.venv/lib/python3.9/site-packages/duckduckgo_search/utils.py", line 111, in _do_output
    _save_json(
  File "/path/.venv/lib/python3.9/site-packages/duckduckgo_search/utils.py", line 70, in _save_json
    with open(jsonfile, "w") as file:
FileNotFoundError: [Errno 2] No such file or directory: 'george washington site:"wikipedia.com/wiki/"_20230322_085645.json'

list index out of range

Sometimes I get the following error:

IndexError                                Traceback (most recent call last)
Input In [111], in <cell line: 53>()
     47     # personSearch = search("senate {} site:ballotpedia.org".format(fullName))
     48     # for result in personSearch:
     49     #     # get the first one
     50     #     print(result)
     51     #     return result
     53 for fullName in politicians:
---> 54     personUrl = getBallotPediaUrl(fullName)
     55     # Get the page content
     56     r = requests.get(personUrl, allow_redirects=True, headers={
     57         "User-Agent": "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/104.0.0.0 Safari/537.36"
     58     })

Input In [111], in getBallotPediaUrl(fullName)
     42 def getBallotPediaUrl(fullName):
     43     print('searching', fullName)
---> 44     personSearch = ddg("{} site:ballotpedia.org".format(fullName), max_results=1)
     45     print(personSearch)
     46     return personSearch[0]['href']

File /opt/conda/lib/python3.9/site-packages/duckduckgo_search.py:39, in ddg(keywords, region, safesearch, time, max_results, **kwargs)
     36     if counter >= max_results:
     37         return results
---> 39 next_page = tree.xpath('.//div[@class="nav-link"]')[-1] 
     40 names = next_page.xpath('.//input[@type="hidden"]/@name')
     41 values = next_page.xpath('.//input[@type="hidden"]/@value')

IndexError: list index out of range

I imagine this is due to some rate limit page (that I do not see in the browser), but I have added the suggested 0.75 second sleep. Is this what is expected when you exceed rate limits?
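
Since the IndexError is raised inside the library, the caller can only catch it. A minimal caller-side sketch follows, assuming `search` is any callable that takes a query and returns a list of result dicts, for example a wrapper around ddg with max_results=1. `first_href` is a hypothetical helper name.

```python
def first_href(search, query):
    """Return the 'href' of the first result, or None if the search fails or is empty."""
    try:
        results = search(query)
    except IndexError:
        # Raised inside the library when the expected page element is missing.
        return None
    if not results:
        return None
    return results[0].get("href")


# Stand-in search function so the example runs offline.
print(first_href(lambda q: [], "senate example site:ballotpedia.org"))
```

Returning None lets the calling loop skip that name instead of stopping on the first failure.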

max_results parameter is misleading

Hello,

The max_results parameter does not represent the maximum, but rather the minimum number of results. I noticed that there is a hardcoded increment value for individual functions (different for ddg_news, ddg_images, etc.) by which the number of results is increased, in batches, until the total number of results is larger than max_results.

The easiest fix would be to rename max_results to min_results.

Additionally, the increment value could be exposed via an additional parameter (fetch_increment).

In any case, this is not critical. I understand not wanting to introduce a parameter name change that would break backwards compatibility for such a trivial thing, but the actual meaning of max_results should be documented.

Cheers,
Pawel
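
The overshoot described in this report can be illustrated with a little arithmetic. The sketch assumes whole batches are fetched until the total is at least max_results. The increment of 25 is made up for illustration, and both function names are hypothetical.

```python
import math


def results_returned(max_results: int, increment: int) -> int:
    """Total results when whole batches are fetched until max_results is reached."""
    if max_results <= 0 or increment <= 0:
        raise ValueError("max_results and increment must be positive")
    return math.ceil(max_results / increment) * increment


def truncate(results: list, max_results: int) -> list:
    """Caller-side fix: cut the list so max_results really is a maximum."""
    return results[:max_results]


print(results_returned(30, 25))
print(len(truncate(list(range(50)), 30)))
```

With these made-up numbers a request for 30 results yields 50, and slicing brings the list back to 30.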

Colon plus an asterisk in the search term causes an error

Adding a colon followed by an asterisk to the search term causes an error.

    data = resp.json()["results"]
  File "/usr/local/lib/python3.9/dist-packages/requests/models.py", line 899, in json
    return complexjson.loads(
  File "/usr/local/lib/python3.9/dist-packages/simplejson/__init__.py", line 525, in loads
    return _default_decoder.decode(s)
  File "/usr/local/lib/python3.9/dist-packages/simplejson/decoder.py", line 370, in decode
    obj, end = self.raw_decode(s)
  File "/usr/local/lib/python3.9/dist-packages/simplejson/decoder.py", line 400, in raw_decode
    return self.scan_once(s, idx=_w(s, idx).end())
simplejson.errors.JSONDecodeError: Expecting value: line 1 column 1 (char 0)

JSONDecodeError can be thrown when using ddg_news on queries that have no results

Hello,

Thanks a lot for putting up this library! I've started using it and so far it's been a great help.

Issue

I recently tried using ddg_news and most of the time it works fine.
However, when trying for elements which have possibly 0 answers, this error is thrown:

from duckduckgo_search import ddg_news
items = ddg_news(keywords="Jeremy Belpois", max_results=3, safesearch='Off', region='wt-wt')

---------------------------------------------------------------------------
JSONDecodeError                           Traceback (most recent call last)
c:\Anaconda3\lib\site-packages\requests\models.py in json(self, **kwargs)
959 try:
--> 960 return complexjson.loads(self.content.decode(encoding), **kwargs)
961 except UnicodeDecodeError:

c:\Anaconda3\lib\json\__init__.py in loads(s, cls, object_hook, parse_float, parse_int, parse_constant, object_pairs_hook, **kw)
345 parse_constant is None and object_pairs_hook is None and not kw):
--> 346 return _default_decoder.decode(s)
347 if cls is None:

c:\Anaconda3\lib\json\decoder.py in decode(self, s, _w)
336 """
--> 337 obj, end = self.raw_decode(s, idx=_w(s, 0).end())
338 end = _w(s, end).end()

c:\Anaconda3\lib\json\decoder.py in raw_decode(self, s, idx)
354 except StopIteration as err:
--> 355 raise JSONDecodeError("Expecting value", s, err.value) from None
356 return obj, end

JSONDecodeError: Expecting value: line 1 column 1 (char 0)

During handling of the above exception, another exception occurred:
...
--> 968 raise RequestsJSONDecodeError(e.msg, e.doc, e.pos)
969
970 try:

JSONDecodeError: Expecting value: line 1 column 1 (char 0)

Versions

I'm using python 3.9, requests==2.28.1, duckduckgo-search==1.8.

Thoughts

This is likely because an empty response body is passed to the JSON parser.
However, when this error happens, the execution time is significantly longer (about 9 s), which makes it hard to plan around without losing a fair amount of time in program execution.

I'll keep digging into this on my end, unless you have a quick fix in mind.
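
If the cause is indeed an empty or non-JSON body, a tolerant parse avoids the exception. The sketch below is only an illustration under that assumption. The `results` key matches the one shown in an earlier traceback on this page, and `parse_results` is a hypothetical helper name.

```python
import json


def parse_results(body: str) -> list:
    """Return the "results" list from a JSON body, or [] when the body is unusable."""
    try:
        data = json.loads(body)
    except json.JSONDecodeError:
        # Empty or non-JSON body, as in the traceback above.
        return []
    if not isinstance(data, dict):
        return []
    results = data.get("results", [])
    return results if isinstance(results, list) else []


print(parse_results(""))
print(parse_results('{"results": [{"title": "x"}]}'))
```

This does not address the roughly 9 second delay mentioned above. It only turns the failure into an empty list.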

some dorks with '"' don't work

Describe the bug
Some dorks don't work such as q = 'site:linkedin.com "wikipedia"'
This is my piece of code:

q = 'site:linkedin.com "{}"'.format(name)
with DDGS() as ddg:
	ddg_res = ddg.text(q)

It returns: Wikipedia, the Free Encyclopedia | 13,303 followers on LinkedIn. Imagine a world in which every single human being can freely share in the sum of all knowledge. | Wikipedia is a multilingual ...
While, if I make the query via DDG gui I have the following result: About us. Wikipedia is a multilingual online encyclopedia, based on open collaboration through a wiki-based content editing system. Website. https://www.wikipedia.org/. Industries. Software

It seems that queries containing '"' do not work.

Specify this information

  • OS: Ubuntu
  • environment: python 3.8.10
  • duckduckgo_search version: 3.8.3

issue with image search

An error is returned when trying to run an image search:

duckduckgo_search.exceptions.DuckDuckGoSearchException: _get_url() https://duckduckgo.com/i.js APIException: _get_url() https://duckduckgo.com/i.js 500 in url

Steps to reproduce:

from duckduckgo_search import DDGS

with DDGS() as ddgs:
    keywords = 'butterfly'
    ddgs_images_gen = ddgs.images(
        keywords,
        region="wt-wt",
        safesearch="off",
        size=None,
        color="Monochrome",
        type_image=None,
        layout=None,
        license_image=None,
        max_results=100,
    )
    for r in ddgs_images_gen:
        print(r)

This results in the following error trace:

Traceback (most recent call last):
File "/Users/keithnisbet/Bubblr/ew/ew-nlp/venv/lib/python3.9/site-packages/duckduckgo_search/duckduckgo_search.py", line 46, in _get_url
if _is_500_in_url(str(resp.url)) or resp.status_code == 403:
duckduckgo_search.exceptions.APIException: _get_url() https://duckduckgo.com/i.js 500 in url
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "", line 15, in
File "/Users/keithnisbet/Bubblr/ew/ew-nlp/venv/lib/python3.9/site-packages/duckduckgo_search/duckduckgo_search.py", line 381, in images
license_image = f"license:{license_image}" if license_image else ""
File "/Users/keithnisbet/Bubblr/ew/ew-nlp/venv/lib/python3.9/site-packages/duckduckgo_search/duckduckgo_search.py", line 57, in _get_url
except httpx.HTTPError as ex:
duckduckgo_search.exceptions.DuckDuckGoSearchException: _get_url() https://duckduckgo.com/i.js APIException: _get_url() https://duckduckgo.com/i.js 500 in url

  • duckduckgo_search version. 3.9.10

Is DDG making changes, or are issues like this a regular occurrence? I am thinking I may have to use a different search engine.

After a number of requests ddg returns an empty list

ref #35
Let's track that here:

import duckduckgo_search
a=True
i=0
while a:
    a=duckduckgo_search.ddg('"test test"')
    print(f"{i} {len(a)}")
    i+=1

This eventually returns [] because DuckDuckGo responds with:

window.execDeep=function(){return{is506:1,bn:{ivc:1,ibc:0}};};
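
One common way to cope with a service that starts returning nothing after many requests is to retry with growing pauses. This is a minimal sketch under the assumption that an empty list signals throttling. `search_with_backoff` is a hypothetical helper, and `search` is any callable that takes a query and returns a list.

```python
import time


def search_with_backoff(search, query, retries=4, base_delay=1.0, sleep=time.sleep):
    """Call search(query); on an empty result, wait and retry with doubling delays."""
    delay = base_delay
    for attempt in range(retries):
        results = search(query)
        if results:
            return results
        if attempt < retries - 1:
            sleep(delay)  # pause before the next attempt
            delay *= 2
    return []


# Stand-in search that succeeds on the third call, so the example runs offline.
calls = []


def flaky(query):
    calls.append(query)
    return [] if len(calls) < 3 else [{"href": "https://example.com"}]


print(search_with_backoff(flaky, '"test test"', sleep=lambda seconds: None))
```

Passing `sleep` as a parameter keeps the pauses out of tests. In real use the default `time.sleep` applies, and whether backing off actually lifts the block is not something this page confirms.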

safesearch off not working as expected

Searching for anything adult (of course we are human and have weird thoughts sometimes) with safesearch "Off" returns the same results as "Moderate", whereas the same search in a browser with safe search off gives proper results.

Can you kindly check...

ddg_images() sometimes does not create a file name

Suggestions:
Check whether a filename is provided by the API, then the Content-Disposition filename; as a last resort, check the file type for an extension and use it as image.{ext}.
Finally, fall back to image.jpg (an image viewer will detect the actual format).
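
The fallback order suggested above could be sketched like this, using only the standard library. `pick_filename` and its parameters are hypothetical. The sketch assumes the caller already has the API-provided name, if any, and the two response header values as strings.

```python
import mimetypes
from email.message import Message
from typing import Optional


def pick_filename(api_name: Optional[str],
                  content_disposition: Optional[str],
                  content_type: Optional[str]) -> str:
    """Choose a file name using the fallback chain from the suggestion."""
    # 1. Name supplied by the API, if any.
    if api_name:
        return api_name
    # 2. filename parameter of the Content-Disposition header.
    if content_disposition:
        msg = Message()
        msg["Content-Disposition"] = content_disposition
        name = msg.get_filename()
        if name:
            return name
    # 3. Extension guessed from the Content-Type header.
    if content_type:
        ext = mimetypes.guess_extension(content_type.split(";")[0].strip())
        if ext:
            return f"image{ext}"
    # 4. Last resort.
    return "image.jpg"


print(pick_filename(None, 'attachment; filename="cat.png"', "image/png"))
```

A name taken from a header should still be checked for path separators before it is written to disk.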

HTTPError using ddgs.text()

Describe the bug

I got this generic HTTPError when using text(), following the example code exactly.

Traceback (most recent call last):
  File "E:\Projects\test.py", line 5, in <module>
    for r in ddgs.text(1, region='wt-wt', safesearch='Off', timelimit='y'):
  File "E:\Projects\venv\Lib\site-packages\duckduckgo_search\duckduckgo_search.py", line 150, in text
    yield from self._text_api(keywords, region, safesearch, timelimit)
  File "E:\Projects\venv\Lib\site-packages\duckduckgo_search\duckduckgo_search.py", line 203, in _text_api
    resp = self._get_url(
           ^^^^^^^^^^^^^^
  File "E:\Projects\venv\Lib\site-packages\duckduckgo_search\duckduckgo_search.py", line 89, in _get_url
    raise ex
  File "E:\Projects\venv\Lib\site-packages\duckduckgo_search\duckduckgo_search.py", line 82, in _get_url
    raise httpx._exceptions.HTTPError("")
httpx.HTTPError

This is the code:

from duckduckgo_search import DDGS

q = input('Search: ')
with DDGS() as ddgs:
    for r in ddgs.text(q, region='wt-wt', safesearch='Off', timelimit='y'):
        print(r)

Information

  • Windows 10
  • duckduckgo_search version 3.8.4

RateLimitException in 3.9.10 & 3.9.11

Hi there,

First of all. Awesome package. Love it.

However, ever since the 3.9.10 & 3.9.11 updates, I'm getting this error:

Search Term #1 : what is seo
Error occurred while searching for 'what is seo': _get_url() https://duckduckgo.com
URL 1: None
Search Term #2 : what is cro
Error occurred while searching for 'what is cro': _get_url() https://duckduckgo.com
URL 2: None
Generic URLs: None

It would seem like it's just not grabbing any urls at all from the results anymore. I've spent about 8 hours trying to fix this now and was hoping to find some help here, hope that is OK :)

This is the part of the code I'm using:

import time

from duckduckgo_search import DDGS


def scraper(main_keyword, secondary_keyword_1, secondary_keyword_2, title):
    search_term = secondary_keyword_1
    print(f"Search Term #1 : {search_term}")
    
    wikipedia_url = None

    try:
        with DDGS() as ddgs:
            wikipedia_results = [r for r in ddgs.text(search_term, max_results=1)]
            if wikipedia_results:
                wikipedia_url = wikipedia_results[0]['href']
    except Exception as e:
        print(f"Error occurred while searching for '{search_term}': {e}")

    print(f"URL 1: {wikipedia_url}")

    time.sleep(5)

    secondary_search_term = secondary_keyword_2
    print(f"Search Term #2 : {secondary_search_term}")
    generic_url = None

    try:
        with DDGS() as ddgs:
            generic_results = [r for r in ddgs.text(secondary_search_term, max_results=1)]
            if generic_results:
                generic_url = generic_results[0]['href']
    except Exception as e:
        print(f"Error occurred while searching for '{secondary_search_term}': {e}")

    print(f"URL 2: {generic_url}")
    print(f"Generic URLs: {generic_url}")

Please excuse the horrendous use of variable names, I'm new to this!

Timeout error when trying image search from CLI or Python

Describe the bug
"TimeoutError: timed out" coming from the attempt at an http socket connection. This happens whether calling from Python or using the DDG CLI. Running pip install -U duckduckgo_search does not fix this.

Debug log
CLI version, using the suggested search from the README:

$ ddgs images -k "yuri kuklachev cat theatre" -m 500 -s off -d
Traceback (most recent call last):
  File "/Users/myusername/envs/blog/lib/python3.11/site-packages/httpcore/_exceptions.py", line 10, in map_exceptions
    yield
  File "/Users/myusername/envs/blog/lib/python3.11/site-packages/httpcore/_backends/sync.py", line 206, in connect_tcp
    sock = socket.create_connection(
           ^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/opt/homebrew/Cellar/python@3.11/3.11.4_1/Frameworks/Python.framework/Versions/3.11/lib/python3.11/socket.py", line 851, in create_connection
    raise exceptions[0]
  File "/opt/homebrew/Cellar/python@3.11/3.11.4_1/Frameworks/Python.framework/Versions/3.11/lib/python3.11/socket.py", line 836, in create_connection
    sock.connect(sa)
TimeoutError: timed out

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/Users/myusername/envs/blog/lib/python3.11/site-packages/httpx/_transports/default.py", line 66, in map_httpcore_exceptions
    yield
  File "/Users/myusername/envs/blog/lib/python3.11/site-packages/httpx/_transports/default.py", line 228, in handle_request
    resp = self._pool.handle_request(req)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/myusername/envs/blog/lib/python3.11/site-packages/httpcore/_sync/connection_pool.py", line 268, in handle_request
    raise exc
  File "/Users/myusername/envs/blog/lib/python3.11/site-packages/httpcore/_sync/connection_pool.py", line 251, in handle_request
    response = connection.handle_request(request)
               ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/myusername/envs/blog/lib/python3.11/site-packages/httpcore/_sync/connection.py", line 99, in handle_request
    raise exc
  File "/Users/myusername/envs/blog/lib/python3.11/site-packages/httpcore/_sync/connection.py", line 76, in handle_request
    stream = self._connect(request)
             ^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/myusername/envs/blog/lib/python3.11/site-packages/httpcore/_sync/connection.py", line 124, in _connect
    stream = self._network_backend.connect_tcp(**kwargs)
             ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/myusername/envs/blog/lib/python3.11/site-packages/httpcore/_backends/sync.py", line 205, in connect_tcp
    with map_exceptions(exc_map):
  File "/opt/homebrew/Cellar/python@3.11/3.11.4_1/Frameworks/Python.framework/Versions/3.11/lib/python3.11/contextlib.py", line 155, in __exit__
    self.gen.throw(typ, value, traceback)
  File "/Users/myusername/envs/blog/lib/python3.11/site-packages/httpcore/_exceptions.py", line 14, in map_exceptions
    raise to_exc(exc) from exc
httpcore.ConnectTimeout: timed out

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/Users/myusername/envs/blog/bin/ddgs", line 8, in <module>
    sys.exit(cli())
             ^^^^^
  File "/Users/myusername/envs/blog/lib/python3.11/site-packages/click/core.py", line 1157, in __call__
    return self.main(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/myusername/envs/blog/lib/python3.11/site-packages/click/core.py", line 1078, in main
    rv = self.invoke(ctx)
         ^^^^^^^^^^^^^^^^
  File "/Users/myusername/envs/blog/lib/python3.11/site-packages/click/core.py", line 1719, in invoke
    rv.append(sub_ctx.command.invoke(sub_ctx))
              ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/myusername/envs/blog/lib/python3.11/site-packages/click/core.py", line 1434, in invoke
    return ctx.invoke(self.callback, **ctx.params)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/myusername/envs/blog/lib/python3.11/site-packages/click/core.py", line 783, in invoke
    return __callback(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/myusername/envs/blog/lib/python3.11/site-packages/duckduckgo_search/cli.py", line 249, in images
    for r in DDGS(proxies=proxy).images(
  File "/Users/myusername/envs/blog/lib/python3.11/site-packages/duckduckgo_search/duckduckgo_search.py", line 351, in images
    vqd = self._get_vqd(keywords)
          ^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/myusername/envs/blog/lib/python3.11/site-packages/duckduckgo_search/duckduckgo_search.py", line 59, in _get_vqd
    resp = self._get_url("POST", "https://duckduckgo.com", data={"q": keywords})
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/myusername/envs/blog/lib/python3.11/site-packages/duckduckgo_search/duckduckgo_search.py", line 54, in _get_url
    raise ex
  File "/Users/myusername/envs/blog/lib/python3.11/site-packages/duckduckgo_search/duckduckgo_search.py", line 45, in _get_url
    resp = self._client.request(method, url, follow_redirects=True, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/myusername/envs/blog/lib/python3.11/site-packages/httpx/_client.py", line 814, in request
    return self.send(request, auth=auth, follow_redirects=follow_redirects)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/myusername/envs/blog/lib/python3.11/site-packages/httpx/_client.py", line 901, in send
    response = self._send_handling_auth(
               ^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/myusername/envs/blog/lib/python3.11/site-packages/httpx/_client.py", line 929, in _send_handling_auth
    response = self._send_handling_redirects(
               ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/myusername/envs/blog/lib/python3.11/site-packages/httpx/_client.py", line 966, in _send_handling_redirects
    response = self._send_single_request(request)
               ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/myusername/envs/blog/lib/python3.11/site-packages/httpx/_client.py", line 1002, in _send_single_request
    response = transport.handle_request(request)
               ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/myusername/envs/blog/lib/python3.11/site-packages/httpx/_transports/default.py", line 227, in handle_request
    with map_httpcore_exceptions():
  File "/opt/homebrew/Cellar/python@3.11/3.11.4_1/Frameworks/Python.framework/Versions/3.11/lib/python3.11/contextlib.py", line 155, in __exit__
    self.gen.throw(typ, value, traceback)
  File "/Users/myusername/envs/blog/lib/python3.11/site-packages/httpx/_transports/default.py", line 83, in map_httpcore_exceptions
    raise mapped_exc(message) from exc
httpx.ConnectTimeout: timed out

Python version: search_results = ddgs.images(keywords="dog images"):

---------------------------------------------------------------------------
TimeoutError                              Traceback (most recent call last)
File ~/envs/blog/lib/python3.11/site-packages/httpcore/_exceptions.py:10, in map_exceptions(map)
      9 try:
---> 10     yield
     11 except Exception as exc:  # noqa: PIE786

File ~/envs/blog/lib/python3.11/site-packages/httpcore/_backends/sync.py:206, in SyncBackend.connect_tcp(self, host, port, timeout, local_address, socket_options)
    205 with map_exceptions(exc_map):
--> 206     sock = socket.create_connection(
    207         address,
    208         timeout,
    209         source_address=source_address,
    210     )
    211     for option in socket_options:

File /opt/homebrew/Cellar/python@3.11/3.11.4_1/Frameworks/Python.framework/Versions/3.11/lib/python3.11/socket.py:851, in create_connection(address, timeout, source_address, all_errors)
    850 if not all_errors:
--> 851     raise exceptions[0]
    852 raise ExceptionGroup("create_connection failed", exceptions)

File /opt/homebrew/Cellar/python@3.11/3.11.4_1/Frameworks/Python.framework/Versions/3.11/lib/python3.11/socket.py:836, in create_connection(address, timeout, source_address, all_errors)
    835     sock.bind(source_address)
--> 836 sock.connect(sa)
    837 # Break explicitly a reference cycle

TimeoutError: timed out

The above exception was the direct cause of the following exception:

ConnectTimeout                            Traceback (most recent call last)
File ~/envs/blog/lib/python3.11/site-packages/httpx/_transports/default.py:66, in map_httpcore_exceptions()
     65 try:
---> 66     yield
     67 except Exception as exc:

File ~/envs/blog/lib/python3.11/site-packages/httpx/_transports/default.py:228, in HTTPTransport.handle_request(self, request)
    227 with map_httpcore_exceptions():
--> 228     resp = self._pool.handle_request(req)
    230 assert isinstance(resp.stream, typing.Iterable)

File ~/envs/blog/lib/python3.11/site-packages/httpcore/_sync/connection_pool.py:268, in ConnectionPool.handle_request(self, request)
    267         self.response_closed(status)
--> 268     raise exc
    269 else:

File ~/envs/blog/lib/python3.11/site-packages/httpcore/_sync/connection_pool.py:251, in ConnectionPool.handle_request(self, request)
    250 try:
--> 251     response = connection.handle_request(request)
    252 except ConnectionNotAvailable:
    253     # The ConnectionNotAvailable exception is a special case, that
    254     # indicates we need to retry the request on a new connection.
   (...)
    258     # might end up as an HTTP/2 connection, but which actually ends
    259     # up as HTTP/1.1.

File ~/envs/blog/lib/python3.11/site-packages/httpcore/_sync/connection.py:99, in HTTPConnection.handle_request(self, request)
     98         self._connect_failed = True
---> 99         raise exc
    100 elif not self._connection.is_available():

File ~/envs/blog/lib/python3.11/site-packages/httpcore/_sync/connection.py:76, in HTTPConnection.handle_request(self, request)
     75 try:
---> 76     stream = self._connect(request)
     78     ssl_object = stream.get_extra_info("ssl_object")

File ~/envs/blog/lib/python3.11/site-packages/httpcore/_sync/connection.py:124, in HTTPConnection._connect(self, request)
    123 with Trace("connect_tcp", logger, request, kwargs) as trace:
--> 124     stream = self._network_backend.connect_tcp(**kwargs)
    125     trace.return_value = stream

File ~/envs/blog/lib/python3.11/site-packages/httpcore/_backends/sync.py:205, in SyncBackend.connect_tcp(self, host, port, timeout, local_address, socket_options)
    200 exc_map: ExceptionMapping = {
    201     socket.timeout: ConnectTimeout,
    202     OSError: ConnectError,
    203 }
--> 205 with map_exceptions(exc_map):
    206     sock = socket.create_connection(
    207         address,
    208         timeout,
    209         source_address=source_address,
    210     )

File /opt/homebrew/Cellar/python@3.11/3.11.4_1/Frameworks/Python.framework/Versions/3.11/lib/python3.11/contextlib.py:155, in _GeneratorContextManager.__exit__(self, typ, value, traceback)
    154 try:
--> 155     self.gen.throw(typ, value, traceback)
    156 except StopIteration as exc:
    157     # Suppress StopIteration *unless* it's the same exception that
    158     # was passed to throw().  This prevents a StopIteration
    159     # raised inside the "with" statement from being suppressed.

File ~/envs/blog/lib/python3.11/site-packages/httpcore/_exceptions.py:14, in map_exceptions(map)
     13     if isinstance(exc, from_exc):
---> 14         raise to_exc(exc) from exc
     15 raise

ConnectTimeout: timed out

The above exception was the direct cause of the following exception:

ConnectTimeout                            Traceback (most recent call last)
Cell In[17], line 15
     12         return image_urls
     14 # example usage:
---> 15 urls = search_images("dog images", max_images=10)
     16 urls

Cell In[17], line 10, in search_images(term, max_images)
      6 with DDGS() as ddgs:
      7     # generator which yields dicts with:
      8     # {'title','image','thumbnail','url','height','width','source'}
      9     search_results = ddgs.images(keywords=term) # returns a generator
---> 10     image_urls = [next(search_results).get("image") for _ in range(max_images)]
     11     # convert to L (functionally extended list class from fastai)
     12     return image_urls

Cell In[17], line 10, in <listcomp>(.0)
      6 with DDGS() as ddgs:
      7     # generator which yields dicts with:
      8     # {'title','image','thumbnail','url','height','width','source'}
      9     search_results = ddgs.images(keywords=term) # returns a generator
---> 10     image_urls = [next(search_results).get("image") for _ in range(max_images)]
     11     # convert to L (functionally extended list class from fastai)
     12     return image_urls

File ~/envs/blog/lib/python3.11/site-packages/duckduckgo_search/duckduckgo_search.py:351, in DDGS.images(self, keywords, region, safesearch, timelimit, size, color, type_image, layout, license_image, max_results)
    326 """DuckDuckGo images search. Query params: https://duckduckgo.com/params
    327 
    328 Args:
   (...)
    347 
    348 """
    349 assert keywords, "keywords is mandatory"
--> 351 vqd = self._get_vqd(keywords)
    352 assert vqd, "error in getting vqd"
    354 safesearch_base = {"on": 1, "moderate": 1, "off": -1}

File ~/envs/blog/lib/python3.11/site-packages/duckduckgo_search/duckduckgo_search.py:59, in DDGS._get_vqd(self, keywords)
     57 def _get_vqd(self, keywords: str) -> Optional[str]:
     58     """Get vqd value for a search query."""
---> 59     resp = self._get_url("POST", "https://duckduckgo.com/", data={"q": keywords})
     60     if resp:
     61         return _extract_vqd(resp.content)

File ~/envs/blog/lib/python3.11/site-packages/duckduckgo_search/duckduckgo_search.py:54, in DDGS._get_url(self, method, url, **kwargs)
     52     logger.warning(f"_get_url() {url} {type(ex).__name__} {ex}")
     53     if i >= 2 or "418" in str(ex):
---> 54         raise ex
     55 sleep(3)

File ~/envs/blog/lib/python3.11/site-packages/duckduckgo_search/duckduckgo_search.py:45, in DDGS._get_url(self, method, url, **kwargs)
     43 for i in range(3):
     44     try:
---> 45         resp = self._client.request(method, url, follow_redirects=True, **kwargs)
     46         if _is_500_in_url(str(resp.url)) or resp.status_code == 202:
     47             raise httpx._exceptions.HTTPError("")

File ~/envs/blog/lib/python3.11/site-packages/httpx/_client.py:814, in Client.request(self, method, url, content, data, files, json, params, headers, cookies, auth, follow_redirects, timeout, extensions)
    799     warnings.warn(message, DeprecationWarning)
    801 request = self.build_request(
    802     method=method,
    803     url=url,
   (...)
    812     extensions=extensions,
    813 )
--> 814 return self.send(request, auth=auth, follow_redirects=follow_redirects)

File ~/envs/blog/lib/python3.11/site-packages/httpx/_client.py:901, in Client.send(self, request, stream, auth, follow_redirects)
    893 follow_redirects = (
    894     self.follow_redirects
    895     if isinstance(follow_redirects, UseClientDefault)
    896     else follow_redirects
    897 )
    899 auth = self._build_request_auth(request, auth)
--> 901 response = self._send_handling_auth(
    902     request,
    903     auth=auth,
    904     follow_redirects=follow_redirects,
    905     history=[],
    906 )
    907 try:
    908     if not stream:

File ~/envs/blog/lib/python3.11/site-packages/httpx/_client.py:929, in Client._send_handling_auth(self, request, auth, follow_redirects, history)
    926 request = next(auth_flow)
    928 while True:
--> 929     response = self._send_handling_redirects(
    930         request,
    931         follow_redirects=follow_redirects,
    932         history=history,
    933     )
    934     try:
    935         try:

File ~/envs/blog/lib/python3.11/site-packages/httpx/_client.py:966, in Client._send_handling_redirects(self, request, follow_redirects, history)
    963 for hook in self._event_hooks["request"]:
    964     hook(request)
--> 966 response = self._send_single_request(request)
    967 try:
    968     for hook in self._event_hooks["response"]:

File ~/envs/blog/lib/python3.11/site-packages/httpx/_client.py:1002, in Client._send_single_request(self, request)
    997     raise RuntimeError(
    998         "Attempted to send an async request with a sync Client instance."
    999     )
   1001 with request_context(request=request):
-> 1002     response = transport.handle_request(request)
   1004 assert isinstance(response.stream, SyncByteStream)
   1006 response.request = request

File ~/envs/blog/lib/python3.11/site-packages/httpx/_transports/default.py:227, in HTTPTransport.handle_request(self, request)
    213 assert isinstance(request.stream, SyncByteStream)
    215 req = httpcore.Request(
    216     method=request.method,
    217     url=httpcore.URL(
   (...)
    225     extensions=request.extensions,
    226 )
--> 227 with map_httpcore_exceptions():
    228     resp = self._pool.handle_request(req)
    230 assert isinstance(resp.stream, typing.Iterable)

File /opt/homebrew/Cellar/python@3.11/3.11.4_1/Frameworks/Python.framework/Versions/3.11/lib/python3.11/contextlib.py:155, in _GeneratorContextManager.__exit__(self, typ, value, traceback)
    153     value = typ()
    154 try:
--> 155     self.gen.throw(typ, value, traceback)
    156 except StopIteration as exc:
    157     # Suppress StopIteration *unless* it's the same exception that
    158     # was passed to throw().  This prevents a StopIteration
    159     # raised inside the "with" statement from being suppressed.
    160     return exc is not value

File ~/envs/blog/lib/python3.11/site-packages/httpx/_transports/default.py:83, in map_httpcore_exceptions()
     80     raise
     82 message = str(exc)
---> 83 raise mapped_exc(message) from exc

ConnectTimeout: timed out

Specify this information

  • OS: MacOS 13.4.1
  • environment
$ pip list 
Package                           Version
--------------------------------- ---------
absl-py                           1.4.0
accelerate                        0.20.3
aeiou                             0.0.20
aiofiles                          23.2.1
annotated-types                   0.6.0
anyio                             3.7.1
appdirs                           1.4.4
appnope                           0.1.3
argon2-cffi                       21.3.0
argon2-cffi-bindings              21.2.0
arrow                             1.2.3
asttokens                         2.2.1
astunparse                        1.6.3
async-lru                         2.0.4
attrs                             23.1.0
audioread                         3.0.0
Babel                             2.12.1
backcall                          0.2.0
beautifulsoup4                    4.12.2
black                             23.7.0
bleach                            6.0.0
blis                              0.7.11
bokeh                             3.1.1
braceexpand                       0.1.7
Brotli                            1.1.0
cachetools                        5.3.1
catalogue                         2.0.10
certifi                           2023.5.7
cffi                              1.15.1
charset-normalizer                3.1.0
click                             8.1.7
cloudpathlib                      0.16.0
colorcet                          3.0.1
comm                              0.1.3
confection                        0.1.3
contourpy                         1.0.7
cycler                            0.11.0
cymem                             2.0.8
debugpy                           1.6.7
decorator                         5.1.1
defusedxml                        0.7.1
docker-pycreds                    0.4.0
duckduckgo-search                 3.9.4
einops                            0.6.1
entrypoints                       0.4
execnb                            0.1.5
executing                         1.2.0
fastai                            2.7.13
fastcore                          1.5.29
fastdownload                      0.0.7
fastjsonschema                    2.17.1
fastprogress                      1.0.3
filelock                          3.12.1
flatbuffers                       23.5.26
fonttools                         4.39.4
fqdn                              1.5.1
gast                              0.4.0
ghapi                             1.0.3
gitdb                             4.0.10
GitPython                         3.1.31
google-auth                       2.19.1
google-auth-oauthlib              1.0.0
google-pasta                      0.2.0
grpcio                            1.54.2
h11                               0.14.0
h2                                4.1.0
h5py                              3.8.0
holoviews                         1.16.2
hpack                             4.0.0
httpcore                          1.0.1
httpx                             0.25.1
hyperframe                        6.0.1
idna                              3.4
ipyflow                           0.0.178
ipyflow-core                      0.0.178
ipykernel                         6.23.1
ipython                           8.14.0
ipython-genutils                  0.2.0
ipywidgets                        8.1.0
isoduration                       20.11.0
jedi                              0.18.2
Jinja2                            3.1.2
joblib                            1.2.0
json5                             0.9.14
jsonpointer                       2.4
jsonschema                        4.19.0
jsonschema-specifications         2023.7.1
jupyter                           1.0.0
jupyter_client                    8.3.0
jupyter-console                   6.6.3
jupyter-contrib-core              0.4.2
jupyter_core                      5.3.0
jupyter-events                    0.7.0
jupyter-lsp                       2.2.0
jupyter-nbextensions-configurator 0.6.3
jupyter_server                    2.7.0
jupyter_server_terminals          0.4.4
jupyterlab                        4.0.4
jupyterlab-pygments               0.2.2
jupyterlab_rise                   0.40.0
jupyterlab_server                 2.24.0
jupyterlab-widgets                3.0.8
keras                             2.13.1rc0
kiwisolver                        1.4.4
langcodes                         3.3.0
libclang                          16.0.0
librosa                           0.9.2
linkify-it-py                     2.0.2
llvmlite                          0.40.1rc1
lxml                              4.9.3
Markdown                          3.4.3
markdown-it-py                    2.2.0
MarkupSafe                        2.1.3
matplotlib                        3.7.1
matplotlib-inline                 0.1.6
mdit-py-plugins                   0.4.0
mdurl                             0.1.2
mistune                           3.0.1
mpmath                            1.3.0
mrspuff                           0.0.32
murmurhash                        1.0.10
mypy-extensions                   1.0.0
nbclassic                         1.0.0
nbclient                          0.8.0
nbconvert                         7.7.3
nbdev                             2.3.12
nbformat                          5.9.0
nest-asyncio                      1.5.6
networkx                          3.1
notebook                          7.0.2
notebook_shim                     0.2.3
numba                             0.57.0
numpy                             1.24.3
oauthlib                          3.2.2
opt-einsum                        3.3.0
overrides                         7.4.0
packaging                         23.1
pandas                            2.0.2
pandocfilters                     1.5.0
panel                             1.1.0
param                             1.13.0
parso                             0.8.3
pathspec                          0.11.2
pathtools                         0.1.2
pedalboard                        0.7.3
pexpect                           4.8.0
pickleshare                       0.7.5
Pillow                            9.5.0
pip                               23.2.1
platformdirs                      3.5.3
plotly                            5.15.0
pooch                             1.7.0
preshed                           3.0.9
prometheus-client                 0.17.1
prompt-toolkit                    3.0.38
protobuf                          4.23.2
psutil                            5.9.5
ptyprocess                        0.7.0
pure-eval                         0.2.2
pyasn1                            0.5.0
pyasn1-modules                    0.3.0
pyccolo                           0.0.48
pycparser                         2.21
pyct                              0.5.0
pydantic                          2.4.2
pydantic_core                     2.10.1
Pygments                          2.15.1
pynndescent                       0.5.10
pyparsing                         3.0.9
pyrsistent                        0.19.3
python-dateutil                   2.8.2
python-json-logger                2.0.7
pytz                              2023.3
pyviz-comms                       2.3.1
PyYAML                            6.0
pyzmq                             24.0.1
qtconsole                         5.4.3
QtPy                              2.3.1
quarto                            0.1.0
referencing                       0.30.2
requests                          2.31.0
requests-oauthlib                 1.3.1
resampy                           0.4.2
rfc3339-validator                 0.1.4
rfc3986-validator                 0.1.1
rise                              5.7.1
rpds-py                           0.9.2
rsa                               4.9
scikit-learn                      1.2.2
scipy                             1.10.1
Send2Trash                        1.8.2
sentry-sdk                        1.25.1
setproctitle                      1.3.2
setuptools                        67.6.1
six                               1.16.0
smart-open                        6.4.0
smmap                             5.0.0
sniffio                           1.3.0
socksio                           1.0.0
SoundFile                         0.10.2
soupsieve                         2.4.1
spacy                             3.7.2
spacy-legacy                      3.0.12
spacy-loggers                     1.0.5
srsly                             2.4.8
stack-data                        0.6.2
sympy                             1.12
tenacity                          8.2.2
tensorboard                       2.13.0
tensorboard-data-server           0.7.0
tensorflow                        2.13.0rc1
tensorflow-estimator              2.13.0rc0
tensorflow-macos                  2.13.0rc1
termcolor                         2.3.0
terminado                         0.17.1
thinc                             8.2.1
threadpoolctl                     3.1.0
tinycss2                          1.2.1
torch                             2.0.1
torchaudio                        2.0.2
torchvision                       0.15.2
tornado                           6.3.2
tqdm                              4.65.0
traitlets                         5.9.0
typer                             0.9.0
typing_extensions                 4.6.3
tzdata                            2023.3
uc-micro-py                       1.0.2
umap-learn                        0.5.3
uri-template                      1.3.0
urllib3                           1.26.16
wandb                             0.15.4
wasabi                            1.1.2
watchdog                          3.0.0
wcwidth                           0.2.6
weasel                            0.3.4
webcolors                         1.13
webdataset                        0.2.48
webencodings                      0.5.1
websocket-client                  1.6.1
Werkzeug                          2.3.6
wheel                             0.40.0
widgetsnbextension                4.0.8
wrapt                             1.15.0
xattr                             0.10.1
xyzservices                       2023.5.0
$ env
MANPATH=/opt/homebrew/share/man:
TERM_PROGRAM=Apple_Terminal
GEM_HOME=/Users/shawley/gems
SHELL=/bin/bash
TERM=xterm-256color
KMP_DUPLICATE_LIB_OK=TRUE
HOMEBREW_REPOSITORY=/opt/homebrew
TMPDIR=/var/folders/5s/dkk8t0jn5fv6df9f68j9xddr0000gn/T/
LIBRARY_PATH=/opt/homebrew/lib
PYTHONUNBUFFERED=1
TERM_PROGRAM_VERSION=447
OLDPWD=/Users/shawley/github/blog
TERM_SESSION_ID=87E738C7-EE2E-4ADA-8D3A-93A29BF86763
USER=shawley
CPATH=/opt/homebrew/include
SSH_AUTH_SOCK=/private/tmp/com.apple.launchd.J7IfhT2guA/Listeners
PYTORCH_ENABLE_MPS_FALLBACK=1
WINEARCH=win32
BASH_SILENCE_DEPRECATION_WARNING=1
VIRTUAL_ENV=/Users/shawley/envs/blog
LSCOLORS=gxfxcxdxbxegedabagacad
PATH=/Users/shawley/envs/blog/bin:/Users/shawley/.cargo/bin:/Users/shawley/gems/bin:/opt/homebrew/bin:/usr/local/bin:/System/Cryptexes/App/usr/bin:/usr/bin:/bin:/usr/sbin:/sbin:/Library/TeX/texbin:/opt/X11/bin:/Applications/quarto/bin:/var/run/com.apple.security.cryptexd/codex.system/bootstrap/usr/local/bin:/var/run/com.apple.security.cryptexd/codex.system/bootstrap/usr/bin:/var/run/com.apple.security.cryptexd/codex.system/bootstrap/usr/appleinternal/bin
LaunchInstanceID=21AB5167-758E-4CC1-AC72-202BB19C92C4
__CFBundleIdentifier=com.apple.Terminal
PWD=/Users/shawley/github/blog/posts
LANG=en_US.UTF-8
XPC_FLAGS=0x0
PS1=(blog) \[\e]0;\u@\h: \w\a\]${debian_chroot:+($debian_chroot)}\[\033[01;32m\]\u@\h\[\033[00m\]:\[\033[01;34m\]\w\[\033[00m\]\$ 
CONDA_BASE=/Users/shawley/opt/anaconda3
XPC_SERVICE_NAME=0
SHLVL=1
HOME=/Users/shawley
HOMEBREW_PREFIX=/opt/homebrew
LOGNAME=shawley
INFOPATH=/opt/homebrew/share/info:
HOMEBREW_CELLAR=/opt/homebrew/Cellar
DISPLAY=/private/tmp/com.apple.launchd.00NKVjlcg4/org.xquartz:0
SECURITYSESSIONID=186b3
VIRTUAL_ENV_PROMPT=(blog) 
_=/usr/bin/env
  • duckduckgo_search version 3.9.4

_normalize removes apostrophes from search results

In ddg.py, line 54, body = _normalize(row["a"]) strips apostrophe characters (&#x27;), which makes the search results less readable. The same happens with "title": _normalize(row["t"]) on line 58.
Judging from the examples on the duckduckgo_search GitHub page, this problem did not exist previously.
Removing apostrophes is particularly problematic in French, where the character is used so frequently that the results become unintelligible.
Thanks!
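
One possible way to keep apostrophes, sketched under assumptions: strip the tags first, then decode HTML entities with the standard library's html.unescape instead of deleting them. The function name normalize_keep_entities and the tag-stripping pattern are hypothetical and are not taken from the library's source.

```python
import html
import re

# Hypothetical pattern for stripping HTML tags; the library's own regex may differ.
TAG_RE = re.compile(r"<[^>]+>")


def normalize_keep_entities(raw_html: str) -> str:
    """Strip tags, then decode entities such as &#x27; into real characters."""
    without_tags = TAG_RE.sub("", raw_html)
    return html.unescape(without_tags)


print(normalize_keep_entities("l&#x27;<b>école</b> d&#x27;été"))
```

Decoding after tag removal means an entity-encoded angle bracket in the text is not mistaken for markup.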

_get_url() https://duckduckgo.com KeyError: 'https'

Before you open an issue:

  1. Make sure you have the latest version installed. Check: ddgs version. Update: pip install -U duckduckgo_search
  2. Try reinstalling the library: pip install -I duckduckgo_search
  3. Make sure the site https://duckduckgo.com is accessible in your browser
  4. Try using a proxy. The site may block your IP address for a while.

Describe the bug
duckduckgo_search.exceptions.DuckDuckGoSearchException: _get_url() https://duckduckgo.com KeyError: 'https'

Debug log
Traceback (most recent call last):
  File "/home/zhouzhenyu05/test/kwaipilot-tool-server/src/kuaishou/search/search.py", line 111, in <module>
    results = web_search('什么是okr')
  File "/home/zhouzhenyu05/test/kwaipilot-tool-server/src/kuaishou/search/search.py", line 94, in web_search
    google_result = google_search(query)
  File "/home/zhouzhenyu05/test/kwaipilot-tool-server/src/kuaishou/search/search.py", line 52, in google_search
    results = [SearchResult(title=item['title'],url=item['href'],content=item['body']) for item in search_results]
  File "/home/zhouzhenyu05/test/kwaipilot-tool-server/src/kuaishou/search/search.py", line 52, in <listcomp>
    results = [SearchResult(title=item['title'],url=item['href'],content=item['body']) for item in search_results]
  File "/opt/conda/lib/python3.9/site-packages/duckduckgo_search/duckduckgo_search.py", line 91, in text
    for i, result in enumerate(results, start=1):
  File "/opt/conda/lib/python3.9/site-packages/duckduckgo_search/duckduckgo_search.py", line 119, in _text_api
    vqd = self._get_vqd(keywords)
  File "/opt/conda/lib/python3.9/site-packages/duckduckgo_search/duckduckgo_search.py", line 54, in _get_vqd
    resp = self._get_url("POST", "https://duckduckgo.com", data={"q": keywords})
  File "/opt/conda/lib/python3.9/site-packages/duckduckgo_search/duckduckgo_search.py", line 50, in _get_url
    raise DuckDuckGoSearchException(f"_get_url() {url} {type(ex).__name__}: {ex}")
duckduckgo_search.exceptions.DuckDuckGoSearchException: _get_url() https://duckduckgo.com KeyError: 'https'

Specify this information

  • Linux x86_64
  • duckduckgo_search V4.1.0

HTTPStatusError: Client error '403 Forbidden' for url

Describe the bug
When I use the Python library, call ddgs.news and loop through the results, I get the error below most of the time. Sometimes it works. I even added sleep(2) and still get the error.

HTTPStatusError: Client error '403 Forbidden' for url 'https://duckduckgo.com/news.js?l=wt-wt&o=json&noamp=1&q=ING%20and%20%27Ralph%20Hamers%27&vqd=4-167272866629043070647631275929701091116&p=-2&df=m&s=0'

Debug log
Add logging.basicConfig(level=logging.DEBUG) to the beginning of your script and attach log to the issue.

Screenshots
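This is the code I use:
from duckduckgo_search import DDGS

limit = 10    # assumed value; the report does not define limit
results = []  # collected news items
i = 0         # number of items seen so far

with DDGS() as ddgs:
    keywords = "ING and 'Ralph Hamers'"
    ddgs_news_gen = ddgs.news(
        keywords,
        region="wt-wt",
        safesearch="Off",
        timelimit="m",
    )
    # stop once `limit` items have been collected
    for r in ddgs_news_gen:
        i += 1
        if i > limit:
            break
        results.append(r)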

Specify this information

  • Windows 10
  • Python 3.10
  • duckduckgo_search version

This library is not reliable. Most of the time it returns an HTTP 403 error.
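
Since the 403 responses appear intermittently, one thing to try is retrying with an increasing delay rather than a fixed sleep(2). The sketch below is generic and hypothetical: retry_with_backoff, its parameters, and the flaky stand-in function are illustrative names, not part of duckduckgo_search, and whether backoff actually avoids the block has not been verified here.

```python
import time
from typing import Callable, TypeVar

T = TypeVar("T")


def retry_with_backoff(
    fn: Callable[[], T],
    attempts: int = 4,
    base_delay: float = 2.0,
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Call fn(); on failure wait base_delay * 2**n seconds and try again."""
    for attempt in range(attempts):
        try:
            return fn()
        except Exception:
            if attempt == attempts - 1:
                raise  # out of attempts: surface the last error
            sleep(base_delay * 2**attempt)
    raise ValueError("attempts must be at least 1")


# Stand-in for a search call that fails twice before succeeding.
calls = {"n": 0}


def flaky() -> str:
    calls["n"] += 1
    if calls["n"] < 3:
        raise RuntimeError("403 Forbidden")
    return "ok"


delays = []  # records the requested waits instead of actually sleeping
print(retry_with_backoff(flaky, sleep=delays.append))
```

In real use, fn would wrap the actual call, for example lambda: list(ddgs.news(keywords)), and the except clause would be narrowed to the HTTP error type being raised.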

ddg_images() works only once

I could reproduce this issue both locally and in a remote Kaggle notebook.

Library version tested: 2.9.5
Python versions tested: 3.10.6 (native), 3.10.10 (conda)

from duckduckgo_search import ddg_images

result = ddg_images('chess rook', max_results=50)
print(result)

Running this code once returns an array of image URLs as expected. Running the script again returns an empty array. After waiting some time (~10 minutes), the script works again in Kaggle.
My best guess is that DuckDuckGo changed its API policy and now blocks IPs that access the API through scripts. This would also explain why the script recovers on Kaggle, as the VM might get a new public IP after some time.

httpx.HTTPError on 3.9.5

Yesterday someone reported this bug but then deleted the issue, so I don't know whether a solution exists.

code:

import asyncio

from duckduckgo_search import AsyncDDGS


async def async_search(query):
    # run one text search; return an empty list on any error
    try:
        async with AsyncDDGS() as ddgs:
            results = [r async for r in ddgs.text(query, max_results=5)]
            return results
    except Exception as e:
        print(e)
        return []


async def search_queries(queries):
    # start one task per query and wait for all of them
    tasks = []
    for query in queries:
        tasks.append(asyncio.create_task(async_search(query)))
    results = await asyncio.gather(*tasks)
    return results
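
If launching every query at once contributes to the errors, one option is to cap how many searches run concurrently. The following is a sketch only: fake_search stands in for the real AsyncDDGS call, and bounded_gather and max_concurrent are hypothetical names chosen for illustration.

```python
import asyncio


async def fake_search(query: str) -> list:
    """Placeholder for a real search coroutine; returns a dummy result."""
    await asyncio.sleep(0.01)
    return [f"result for {query}"]


async def bounded_gather(queries, max_concurrent: int = 2):
    """Run fake_search for each query, at most max_concurrent at a time."""
    semaphore = asyncio.Semaphore(max_concurrent)

    async def run_one(query):
        async with semaphore:  # waits here while max_concurrent searches are active
            return await fake_search(query)

    # gather returns results in the same order as the input queries
    return await asyncio.gather(*(run_one(q) for q in queries))


print(asyncio.run(bounded_gather(["cats", "dogs", "birds"])))
```

The semaphore limits only how many searches are in flight at once; it does not add any delay between them.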

Debug log

2023-11-15 09:52:35.202 Uncaught app exception
Traceback (most recent call last):
  File "/Users/gianjsx/Documents/fuentes/.venv/lib/python3.11/site-packages/streamlit/runtime/scriptrunner/script_runner.py", line 534, in _run_script
    exec(code, module.__dict__)
  File "/Users/gianjsx/Documents/fuentes/app.py", line 49, in <module>
    asyncio.run(main())
  File "/Library/Frameworks/Python.framework/Versions/3.11/lib/python3.11/asyncio/runners.py", line 190, in run
    return runner.run(main)
           ^^^^^^^^^^^^^^^^
  File "/Library/Frameworks/Python.framework/Versions/3.11/lib/python3.11/asyncio/runners.py", line 118, in run
    return self._loop.run_until_complete(task)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Library/Frameworks/Python.framework/Versions/3.11/lib/python3.11/asyncio/base_events.py", line 653, in run_until_complete
    return future.result()
           ^^^^^^^^^^^^^^^
  File "/Users/gianjsx/Documents/fuentes/app.py", line 46, in main
    for result in results:
  File "/Users/gianjsx/Documents/fuentes/.venv/lib/python3.11/site-packages/duckduckgo_search/duckduckgo_search.py", line 96, in text
    for i, result in enumerate(results, start=1):
  File "/Users/gianjsx/Documents/fuentes/.venv/lib/python3.11/site-packages/duckduckgo_search/duckduckgo_search.py", line 148, in _text_api
    resp = self._get_url("GET", "https://links.duckduckgo.com/d.js", params=payload)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/gianjsx/Documents/fuentes/.venv/lib/python3.11/site-packages/duckduckgo_search/duckduckgo_search.py", line 55, in _get_url
    raise ex
  File "/Users/gianjsx/Documents/fuentes/.venv/lib/python3.11/site-packages/duckduckgo_search/duckduckgo_search.py", line 48, in _get_url
    raise httpx._exceptions.HTTPError("")
httpx.HTTPError

Specify this information

  • ddgs version 3.9.5
  • venv
  • macbook pro m2
