For several APIs, parsing the serpapi_pagination.next

Pagination iterator doesn't work for APIs with token-based pagination about google-search-results-python HOT 5 CLOSED

serpapi commented on May 24, 2024 1

Pagination iterator doesn't work for APIs with token-based pagination

from google-search-results-python.

Comments (5)

kikohs commented on May 24, 2024 2

Good point, I was actually struggling to understand why pagination didn't work for scholar.
A consistent API would be great! Thanks for sharing the snippet.

jvmvik commented on May 24, 2024 2

I've just rolled out a new version of the package, and I have taken your code above into account.
My concern is that Google search returns duplicated search results when paginating.
Should we try to build a filtering mechanism on the back end or on the client side?
So far, we are letting the user deal with this issue.
My experiments show that it's not SerpApi's fault: the search engine itself returns similar search results from one page to the next.
What do you think?
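
For reference, a client-side filter could be as small as the sketch below. It is only an illustration: `unique_results` is a hypothetical helper, not part of the package, and it assumes each page is a response dict with an `organic_results` list whose items carry a `link` field.

```python
def unique_results(pages, key="link"):
    """Yield each result once, skipping items whose `key` value was already seen.

    Hypothetical helper: `organic_results` and `link` are assumed field names.
    """
    seen = set()
    for page in pages:
        for item in page.get("organic_results", []):
            marker = item.get(key)
            if marker is not None:
                if marker in seen:
                    continue  # duplicate carried over from an earlier page
                seen.add(marker)
            # Items without the key cannot be compared, so they are kept.
            yield item
```

Because it is a generator, it can wrap whatever iterable of pages the client already produces without loading every page first.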

jvmvik commented on May 24, 2024 2

Sorry for the long wait on this.
I was working with a couple of wrong assumptions:
1- The page parameters start and num are always supported (translated if needed).
2- start = f(x * num)

1- On top of the suggestion above, I can implement a mapping table per search engine.

    # set default
    self.start_key = "start"
    self.num_key = "num"
    self.end_key = "end"

    # override per search engine
    if engine == BAIDU_ENGINE:
      self.start_key = "pn"
      self.num_key = "rn"

So, the response.next can be parsed for most of the search engines, except Google Scholar.
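
The same override could also be written as a lookup table rather than a chain of `if` statements. A rough sketch, where `DEFAULT_KEYS`, `PAGINATION_KEYS` and `pagination_keys` are hypothetical names and the engine is assumed to be identified by a plain string; only the Baidu `pn`/`rn` entry comes from the snippet above:

```python
# Hypothetical mapping table; only the "baidu" entry is taken from the thread.
DEFAULT_KEYS = {"start": "start", "num": "num", "end": "end"}

PAGINATION_KEYS = {
    "baidu": {"start": "pn", "num": "rn"},
}


def pagination_keys(engine):
    """Return the parameter names for an engine, falling back to the defaults."""
    keys = dict(DEFAULT_KEYS)  # copy so the defaults are never mutated
    keys.update(PAGINATION_KEYS.get(engine, {}))
    return keys
```

Adding another engine then means adding one dictionary entry instead of another branch.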

2- The way the offset is returned by SerpApi is not consistent.
Google takes start=0, num=20

page 1 : start=0
page 2 : start=20
page 3 : start=40
see: tests/test_google_search.py#test_paginate in branch: 2.5.0

Ebay takes pn=0, rn=20

page 1 : start=2
page 2 : start=3
page 3 : start=4

see: tests/test_ebay_search.py#test_paginate in branch: 2.5.0
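
To make the mismatch concrete, the two sequences above can be reproduced with two different formulas. This is only a sketch that fits the numbers listed here; the function names are made up, and the assumption that eBay's value is a page counter rather than an offset is a guess from those numbers.

```python
def offset_style_start(page, num):
    # Reproduces the Google values listed above: 0, 20, 40 for pages 1..3 with num=20.
    return (page - 1) * num


def page_number_style_start(page):
    # Reproduces the eBay values listed above: 2, 3, 4 for pages 1..3.
    # Assumption: the value is a page counter, so it does not depend on num.
    return page + 1
```

A single start = f(x * num) rule cannot cover both, which is why the second assumption does not hold.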

https://github.com/serpapi/google-search-results-python/pull/new/2.5.0

Could we improve the consistency between search engines on the backend or the client?
Or do we even care? The user might not switch back and forth between search engines.

ilyazub commented on May 24, 2024 2

Could we improve the consistency between search engines on the backend or the client?

It depends on the target website: we mirror their query parameters. But consistency on the SerpApi backend should be improved too. For example, the response for the google_scholar_profiles engine contains pagination but no serpapi_pagination.

Currently, a reliable way to consume pagination across all search engines on the client is to use result['serpapi_pagination']['next'].
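
A minimal sketch of that approach, using only the standard library. `next_page_params` is a hypothetical helper, and it assumes `serpapi_pagination.next` is a full URL whose query string carries everything needed for the next request, whether that is an offset or a token:

```python
from urllib.parse import parse_qsl, urlsplit


def next_page_params(result):
    """Return the query parameters for the next page, or None if there is none.

    Hypothetical helper; `result` is assumed to be the decoded JSON response.
    """
    next_url = (result.get("serpapi_pagination") or {}).get("next")
    if not next_url:
        return None
    # All values come back as strings, exactly as they appear in the URL.
    return dict(parse_qsl(urlsplit(next_url).query))
```

The caller can merge the returned dict into its own parameters and repeat until the function returns None. For responses that lack serpapi_pagination, such as the google_scholar_profiles case mentioned above, it returns None right away.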

ilyazub commented on May 24, 2024 1

@jvmvik I'd not filter duplicated results on the SerpApi side.
