Google Parser

Google Parser is a lightweight Google Search results scraper and parser built on an HTTP client. It sends browser-like requests out of the box, which helps scraping traffic blend in with a website's regular traffic.

Questions

Does this work with serverless functions? Yes, it works on AWS Lambda. I haven't tested it on other serverless platforms, but it should work there too.

Are more features coming? Yes, I am working on adding more features, such as pagination.

I'm stuck, what should I do? Create an issue on GitHub. Pull requests are also welcome.

Features

Proxy support ✅︎
Custom Headers support ✅︎

Installation

pnpm add @nrjdalal/google-parser

yarn or npm

yarn add @nrjdalal/google-parser

npm install @nrjdalal/google-parser

Usage

1. Browser Info

Usage:

import { browserInfo } from '@nrjdalal/google-parser'

const response = await browserInfo()

Response:

{
  method: 'GET',
  // IP address of the client
  clientIp: '182.69.180.111',
  // country code of the client
  countryCode: 'US',
  bodyLength: 0,
  headers: {
    'x-forwarded-for': '182.69.180.111',
    'x-forwarded-proto': 'https',
    'x-forwarded-port': '443',
    host: 'api.apify.com',
    // random user agent client hint
    'sec-ch-ua': '"Google Chrome";v="113", "Chromium";v="113", "Not-A.Brand";v="24"',
    // devices: ['Desktop']
    'sec-ch-ua-mobile': '?0',
    // operatingSystems: ['windows', 'linux', 'macos']
    'sec-ch-ua-platform': '"macOS"',
    'upgrade-insecure-requests': '1',
    // random user agent
    'user-agent': 'Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/113.0.0.0 Safari/537.36',
    accept: '*/*',
    'sec-fetch-site': 'same-site',
    'sec-fetch-mode': 'cors',
    'sec-fetch-user': '?1',
    'sec-fetch-dest': 'empty',
    'accept-encoding': 'gzip, deflate, br',
    'accept-language': 'en-US,en;q=0.5',
    'alt-used': 'www.google.com',
    referer: 'https://www.google.com/'
  }
}

2. Google Search

Usage:

import { googleSearch } from '@nrjdalal/google-parser'

const response = await googleSearch({ query: 'nrjdalal' })

Output:

{
  code: 200,
  status: 'success',
  message: 'Found 5 results in 1s',
  query: 'nrjdalal',
  data: {
    results: [
      {
        title: 'Neeraj Dalal nrjdalal',
        link: 'https://github.com/nrjdalal',
        description: 'Web Developer & Digital Strategist. Follow their code on GitHub.',
        ...
      }
    ]
  }
}

Error:

You get this error when Google blocks the request. This typically happens when too many requests come from the same IP address or when Google responds with a captcha.

{
  code: 429,
  status: 'error',
  message: 'Captcha or too many requests.',
  query: 'nrjdalal'
}
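One way to cope with a blocked request is to wait and try again. The sketch below is not part of the library: `searchWithRetry`, its `retries` and `baseDelayMs` options, and the doubling delay schedule are assumptions made for illustration. It also assumes the 429 object is returned as a value rather than raised as an exception, and it relies only on the `code` field shown above. The search function is passed in as an argument, so you can hand it `googleSearch`.

```javascript
// Hypothetical helper, not part of @nrjdalal/google-parser.
// Retries a search while the response carries code 429, doubling the wait each time.
const sleep = (ms) => new Promise((resolve) => setTimeout(resolve, ms))

async function searchWithRetry(search, params, { retries = 3, baseDelayMs = 1000 } = {}) {
  let response = await search(params)
  for (let attempt = 0; attempt < retries && response.code === 429; attempt++) {
    // Wait baseDelayMs, then 2x, then 4x, ... before trying again.
    await sleep(baseDelayMs * 2 ** attempt)
    response = await search(params)
  }
  // Either a non-429 response, or the last 429 once the retries are used up.
  return response
}
```

A call would look like `await searchWithRetry(googleSearch, { query: 'nrjdalal' })`. Retrying from the same IP may keep failing, so in practice a retry is often paired with a different proxy.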

3. Google Search with Same Headers

Why? Changing headers on every request can lead to detection, so reuse the same headers for all requests sent from a single IP.

Usage:

import { getHeaders, googleSearch } from '@nrjdalal/google-parser'

const headers = getHeaders()

// same headers for same IP
console.log(await googleSearch({ query: 'facebook', options: { headers } }))
console.log(await googleSearch({ query: 'apple', options: { headers } }))

// regeneration of headers for new IP if needed
console.log(
  await googleSearch({ query: 'netflix', options: { headers: getHeaders() } })
)
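When requests leave through several IPs, it helps to remember which headers belong to which IP. The following is a minimal sketch of that bookkeeping, not something the library provides: `createHeaderStore`, `headersFor`, and `reset` are hypothetical names, and the key can be any string that identifies an IP, such as a proxy URL. The header generator is passed in, so `getHeaders` can be used as is.

```javascript
// Hypothetical helper, not part of @nrjdalal/google-parser.
// Keeps one header set per key (for example a proxy URL) so every request
// leaving through the same IP presents the same browser fingerprint.
function createHeaderStore(generateHeaders) {
  const store = new Map()
  return {
    // Return the cached headers for this key, generating them on first use.
    headersFor(key) {
      if (!store.has(key)) store.set(key, generateHeaders())
      return store.get(key)
    },
    // Drop the cached headers, e.g. after the IP behind this key has changed.
    reset(key) {
      store.delete(key)
    },
  }
}
```

With `const store = createHeaderStore(getHeaders)`, each search can pass `options: { headers: store.headersFor('ip-1') }`, and `store.reset('ip-1')` forces fresh headers the next time that key is used.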

4. Google Search with Proxy

Usage:

import { googleSearch } from '@nrjdalal/google-parser'

console.log(
  await googleSearch({
    query: 'microsoft',
    options: {
      proxyUrl: 'http://username:password@host:port',
    },
  })
)
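Proxy credentials that contain characters such as `@` or `:` can break the `http://username:password@host:port` form. A small sketch of one way to build the URL safely follows. `buildProxyUrl` is a hypothetical helper, not part of the library, and it assumes the underlying HTTP client decodes percent-encoded credentials, which the source does not confirm.

```javascript
// Hypothetical helper, not part of @nrjdalal/google-parser.
// Builds a proxy URL in the 'http://username:password@host:port' form,
// percent-encoding the credentials so '@' or ':' inside them stay unambiguous.
function buildProxyUrl({ username, password, host, port, protocol = 'http' }) {
  const auth = `${encodeURIComponent(username)}:${encodeURIComponent(password)}`
  return `${protocol}://${auth}@${host}:${port}`
}
```

The result can then be passed as `options: { proxyUrl: buildProxyUrl({ username, password, host, port }) }`.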
