
Comments (3)

dimitryzub commented on May 24, 2024

Hi, @monk1337 👋

Can I directly call this URL https://scholar.google.com/scholar?q=related:gemrYG-1WnEJ:scholar.google.com/&scioq=Multi-label+text+classification+with+latent+word-wise+label+information&hl=en&as_sdt=0,21 using serp API?

  1. Yes, you can. Pass the value of the URL's q= parameter as the SerpApi q search parameter. For the URL you provided, the SerpApi q parameter would be: related:gemrYG-1WnEJ:scholar.google.com

  2. You can retrieve data directly from the URL using only curl and jq. We have an #AskSerpApi episode that covers exactly this question: #AskSerpApi: "How to extract a specific element from the JSON URL?" | CURL + JQ.
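
For point 1, the q value does not have to be copied by hand. Here is a minimal sketch using only Python's standard library, assuming the URL has the same shape as the one in the question. The function name scholar_query_from_url is hypothetical and not part of SerpApi:

```python
from urllib.parse import parse_qs, urlsplit


def scholar_query_from_url(url: str) -> str:
    """Return the q= value of a Google Scholar URL, without the trailing slash."""
    query_params = parse_qs(urlsplit(url).query)
    # parse_qs maps each parameter name to a list of values; take the first q.
    return query_params["q"][0].rstrip("/")


url = (
    "https://scholar.google.com/scholar?q=related:gemrYG-1WnEJ:scholar.google.com/"
    "&scioq=Multi-label+text+classification+with+latent+word-wise+label+information"
    "&hl=en&as_sdt=0,21"
)
print(scholar_query_from_url(url))
```

The returned string is what would go into the "q" entry of the params dict.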

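To extract related articles, read the related_pages_link key under inline_links for each entry of the organic_results list, i.e. result["inline_links"]["related_pages_link"] for every result in results["organic_results"].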

Example code to extract the related-articles links from the first page of results:

from serpapi import GoogleSearch

params = {
  "api_key": "...",
  "engine": "google_scholar",
  "q": "Coffee",
  "hl": "en"
}

search = GoogleSearch(params)
results = search.get_dict()

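for result in results.get("organic_results", []):
    # Not every result is guaranteed to carry a related-articles link, so avoid a KeyError.
    related_articles = result.get("inline_links", {}).get("related_pages_link")
    if related_articles:
        print(related_articles)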

Output (the q= value in each URL is a related-articles search query that can be passed as the SerpApi q search parameter):

https://scholar.google.com/scholar?q=related:sWzmct-yYzgJ:scholar.google.com/&scioq=Coffee&hl=en&as_sdt=0,11
https://scholar.google.com/scholar?q=related:9WouRiFbIK4J:scholar.google.com/&scioq=Coffee&hl=en&as_sdt=0,11
https://scholar.google.com/scholar?q=related:fGeQlvu-2_IJ:scholar.google.com/&scioq=Coffee&hl=en&as_sdt=0,11
https://scholar.google.com/scholar?q=related:-0fOFoq7wJ8J:scholar.google.com/&scioq=Coffee&hl=en&as_sdt=0,11
https://scholar.google.com/scholar?q=related:CZSAb_VNDkkJ:scholar.google.com/&scioq=Coffee&hl=en&as_sdt=0,11
https://scholar.google.com/scholar?q=related:Jt15QwxlEw0J:scholar.google.com/&scioq=Coffee&hl=en&as_sdt=0,11
https://scholar.google.com/scholar?q=related:31GOrHWBl_AJ:scholar.google.com/&scioq=Coffee&hl=en&as_sdt=0,11
https://scholar.google.com/scholar?q=related:KVT-hW9IrDoJ:scholar.google.com/&scioq=Coffee&hl=en&as_sdt=0,11
https://scholar.google.com/scholar?q=related:Ang0MOfBmAUJ:scholar.google.com/&scioq=Coffee&hl=en&as_sdt=0,11
https://scholar.google.com/scholar?q=related:QwF9cuvhnCoJ:scholar.google.com/&scioq=Coffee&hl=en&as_sdt=0,11

To fetch the related articles themselves with SerpApi, pass each extracted query back as q (keep in mind that this is only an example):

from serpapi import GoogleSearch
import re

def get_related_articles_query():
    params = {
      "api_key": "...",
      "engine": "google_scholar",
      "q": "Multi-label text classification",
      "hl": "en"
    }
    
    search = GoogleSearch(params)
    results = search.get_dict()

    related_articles = []

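    for result in results.get("organic_results", []):
        link = result.get("inline_links", {}).get("related_pages_link", "")
        # Capture the q= value up to the trailing "/&scioq": https://regex101.com/r/XuEhoh/1
        match = re.search(r"q=(.*)\/&scioq", link)
        if match:  # skip results without a related-articles link
            related_articles.append(match.group(1))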

    return related_articles


def get_related_articles_results():
    for related_article in get_related_articles_query():
        params = {
          "api_key": "...",
          "engine": "google_scholar",
          "q": related_article, # related:sWzmct-yYzgJ:scholar.google.com ...
          "hl": "en"
        }
        
        search = GoogleSearch(params)
        results = search.get_dict()

        for result in results.get("organic_results", []):
            print(result.get("title"), result.get("link"), result.get("publication_info", {}).get("summary"), sep="\n")

get_related_articles_results()

Outputs:

Deep learning for extreme multi-label text classification
https://dl.acm.org/doi/abs/10.1145/3077136.3080834
J Liu, WC Chang, Y Wu, Y Yang - … of the 40th international ACM SIGIR …, 2017 - dl.acm.org
...
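
One detail of the loop above: print writes the literal None for any field a result lacks. Here is a small sketch of a formatter that skips missing fields instead. format_result is a hypothetical helper, and sample is made-up data rather than real API output:

```python
def format_result(result: dict) -> str:
    """Join title, link and publication summary, skipping missing fields."""
    fields = (
        result.get("title"),
        result.get("link"),
        result.get("publication_info", {}).get("summary"),
    )
    return "\n".join(field for field in fields if field)


# Hypothetical result dict using the same keys as the loop above; it has no "link".
sample = {"title": "Example title", "publication_info": {"summary": "A Author - 2017"}}
print(format_result(sample))
```

Calling print(format_result(result)) inside the loop would then replace the three-argument print.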

Code example in the online IDE: https://replit.com/@DimitryZub1/Google-Scholar-SerpApi-API-Extract-Related-Articles

Let me know if it makes sense and if you need additional clarifications 🌼

from google-search-results-python.

dimitryzub commented on May 24, 2024

@monk1337 We've added a new serpapi_related_pages_link dict key to the JSON response.

So there is no longer any need to use a regex to extract the search query:

re.search(r"q=(.*)\/&scioq", result["inline_links"]["related_pages_link"]).group(1)
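
With the new key, the regex step can become a plain dictionary lookup. This is a minimal sketch, assuming serpapi_related_pages_link sits under inline_links next to related_pages_link. collect_related_links is a hypothetical helper, and the sample values are placeholders rather than real API output:

```python
def collect_related_links(results: dict, key: str = "serpapi_related_pages_link") -> list:
    """Collect one inline_links[key] value per organic result that has it."""
    links = []
    for result in results.get("organic_results", []):
        value = result.get("inline_links", {}).get(key)
        if value:
            links.append(value)
    return links


# Placeholder response: the second result has no inline_links at all.
sample_results = {
    "organic_results": [
        {"inline_links": {"serpapi_related_pages_link": "<placeholder-1>"}},
        {"title": "No related link here"},
    ]
}
print(collect_related_links(sample_results))
```

In real use, results would be the dict returned by search.get_dict().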

Let me know if you need any additional help 🙂


dimitryzub commented on May 24, 2024

Closing this as we implemented it. For more: https://serpapi.com/google-scholar-api

