Comments (11)

hotheadhacker avatar hotheadhacker commented on May 27, 2024 1

@hotheadhacker Like what?

@shobrook I have bypassed the bot detection. If it still occurs, the user is prompted with a link to Stack Overflow to whitelist themselves. Works fine. You can check my forked repository.

Fork: https://github.com/hotheadhacker/rebound

from rebound.

Koubae avatar Koubae commented on May 27, 2024

Having the same issue here. I just pip installed it yesterday but am getting the above-mentioned error.

shobrook avatar shobrook commented on May 27, 2024

Hey all, I'm aware of this issue. It seems that Stack Overflow has gotten stricter about bot detection and is doing a captcha check every time rebound makes a request. One solution is to use the StackExchange API instead of a web scraper, but this would require rebound users to register for an API token. It would also require a refactor of rebound.py. Please let me know if you have any other ideas or would like to work on this.
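For reference, the API route mentioned above could look roughly like the following. This is only a sketch, not rebound's code: it assumes the public Stack Exchange API v2.3 `search/advanced` endpoint, and the function names `build_search_url` and `search_questions` are made up for illustration. The `key` parameter is treated as optional here.

```python
import gzip
import json
import urllib.parse
import urllib.request

API_ROOT = "https://api.stackexchange.com/2.3"


def build_search_url(query, api_key=None, page_size=5):
    """Build a search URL for the Stack Exchange API (hypothetical helper)."""
    params = {
        "order": "desc",
        "sort": "relevance",
        "q": query,
        "site": "stackoverflow",
        "pagesize": page_size,
    }
    if api_key:
        # Only sent when the user has registered for an API key.
        params["key"] = api_key
    return API_ROOT + "/search/advanced?" + urllib.parse.urlencode(params)


def search_questions(query, api_key=None):
    """Return (title, link) pairs for questions matching the query."""
    url = build_search_url(query, api_key)
    with urllib.request.urlopen(url, timeout=10) as resp:
        raw = resp.read()
        # The API compresses its responses; decompress if gzip was used.
        if resp.headers.get("Content-Encoding") == "gzip":
            raw = gzip.decompress(raw)
    items = json.loads(raw.decode("utf-8"))["items"]
    return [(item["title"], item["link"]) for item in items]
```

Calling `search_questions("TypeError: unhashable type")` would return JSON-derived results instead of scraped HTML, so no captcha page is involved, at the cost of the API's request quota.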

hotheadhacker avatar hotheadhacker commented on May 27, 2024

It would take a lot of time to switch from a web scraper to an API module, and API calls won't be enough and will become a bottleneck.

Why don't we use a more advanced web scraper?

shobrook avatar shobrook commented on May 27, 2024

@hotheadhacker Like what?

hotheadhacker avatar hotheadhacker commented on May 27, 2024

Let me fork this and explain

cristicretu avatar cristicretu commented on May 27, 2024

@shobrook I think I know what the problem is. Doing many requests in a short amount of time and using a random user agent for every request will trigger the captcha every time. I suggest using a single user-agent / answer search, randomizing it only when the program is run (or just using a fixed UA, but that isn't a very good idea). I don't know if it will fix it, but it certainly is a step in the right direction. Another option is to use the UA from the user's default browser, so it doesn't differ from normal browsing. I will look later and try to fix it. I changed the UA to be randomized only when the program is run, and also fixed some minor anti-pattern issues and cleaned up the code.

You can check my fork here: https://github.com/cristicretu/rebound

I will also try using a unique user agent: Google's Googlebot user agent (https://developers.google.com/search/blog/2019/10/updating-user-agent-of-googlebot). It sometimes fixed the captcha issues.
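The "randomize once per run" idea can be sketched like this. It is a minimal illustration using only the standard library, not the code from the fork: `make_opener` is a hypothetical helper and the user-agent strings are placeholders.

```python
import random
import urllib.request

# Placeholder user-agent strings, for illustration only.
USER_AGENTS = [
    "Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 "
    "(KHTML, like Gecko) Chrome/120.0 Safari/537.36",
    "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7) AppleWebKit/605.1.15 "
    "(KHTML, like Gecko) Version/17.0 Safari/605.1.15",
]


def make_opener(user_agents=USER_AGENTS, rng=random):
    """Pick one user agent and bake it into a reusable opener."""
    user_agent = rng.choice(user_agents)
    opener = urllib.request.build_opener()
    # Replaces the default Python-urllib header for every request
    # made through this opener.
    opener.addheaders = [("User-Agent", user_agent)]
    return opener


# Created once at startup; every later request reuses the same UA.
OPENER = make_opener()
```

Every request then goes through `OPENER.open(url)`, so the user agent stays constant for the whole run instead of changing per request.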

shobrook avatar shobrook commented on May 27, 2024

Thanks @cristicretu and @hotheadhacker. It seems like the user-agents are the issue here. Is there a reason why we can't just remove the list of user-agents and use the user's default agent when making the request to SO?

cristicretu avatar cristicretu commented on May 27, 2024

Is there a reason why we can't just remove the list of user-agents and use the user's default agent when making the request to SO?

That is the only solution, I think. Getting the user's default agent is a little bit tricky, but I will try to do it.

My idea is to use webbrowser to open a tab where you can get the UA, then pass it to the script and continue. This should be done only on the first run, and the info should then be stored.
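One way this could work, as a rough sketch under assumptions: start a throwaway local HTTP server, open it in the default browser, read the `User-Agent` header from the browser's request, and cache it. The cache path and the function names `capture_user_agent` and `get_user_agent` are hypothetical.

```python
import json
import webbrowser
from http.server import BaseHTTPRequestHandler, HTTPServer
from pathlib import Path

# Hypothetical cache location, not something rebound defines.
CACHE_PATH = Path.home() / ".rebound_user_agent.json"


def capture_user_agent(open_url=webbrowser.open, timeout=60):
    """Open a local page in the browser and read its User-Agent header."""
    captured = {}

    class Handler(BaseHTTPRequestHandler):
        def do_GET(self):
            captured["ua"] = self.headers.get("User-Agent", "")
            body = b"User agent captured. You can close this tab."
            self.send_response(200)
            self.send_header("Content-Type", "text/plain")
            self.send_header("Content-Length", str(len(body)))
            self.end_headers()
            self.wfile.write(body)

        def log_message(self, *args):
            pass  # keep the terminal quiet

    server = HTTPServer(("127.0.0.1", 0), Handler)  # port 0: any free port
    server.timeout = timeout  # give up if the browser never connects
    try:
        open_url("http://127.0.0.1:%d/" % server.server_port)
        server.handle_request()  # serve exactly one request
    finally:
        server.server_close()
    return captured.get("ua", "")


def get_user_agent(cache_path=CACHE_PATH, capture=capture_user_agent):
    """Return the cached UA, capturing and storing it on the first run."""
    if cache_path.exists():
        return json.loads(cache_path.read_text())["user_agent"]
    user_agent = capture()
    if user_agent:
        cache_path.write_text(json.dumps({"user_agent": user_agent}))
    return user_agent
```

After the first call, `get_user_agent()` reads the stored value and no browser tab is opened again.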

Do you have another idea? @shobrook

surajawal7 avatar surajawal7 commented on May 27, 2024

Hi @shobrook, I somehow managed to work with the captcha but it has some dependencies.

Workflow:

  • Run the current script
  • If the captcha page comes up, try to solve the captcha
  • If the captcha only requires ticking a checkbox, it will pass. If an advanced captcha shows up, the user is redirected to manual verification in Chrome

I have to start Google Chrome in debugging mode first and use Selenium to interact with the captcha. This dependency on Chrome's debugging mode and the Selenium web driver may cause issues depending on the device and platform. But with this method, I find that once the captcha is solved, it does not occur again until the Chrome debugging session is restarted; in the best case, the captcha does not show up even after Chrome is restarted.
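A rough sketch of the manual part of this workflow is below. It is not the commenter's code and does not try to tick the checkbox automatically. It assumes Chrome was started with `--remote-debugging-port=9222`, that the `selenium` package is installed, and that a captcha redirect can be recognized by `/nocaptcha` in the URL (an assumption about Stack Overflow's behavior). All function names are hypothetical.

```python
import re

# Assumed address of a Chrome instance started with
# --remote-debugging-port=9222.
DEBUG_ADDRESS = "127.0.0.1:9222"


def is_captcha_url(url):
    """Guess whether a URL is a captcha page (assumed URL pattern)."""
    return re.search(r"\.com/nocaptcha", url) is not None


def attach_to_debug_chrome(address=DEBUG_ADDRESS):
    """Attach Selenium to an already running Chrome in debugging mode."""
    from selenium import webdriver  # imported lazily: optional dependency

    options = webdriver.ChromeOptions()
    options.add_experimental_option("debuggerAddress", address)
    return webdriver.Chrome(options=options)


def fetch_with_manual_fallback(url):
    """Load a page, pausing for manual verification if a captcha appears."""
    driver = attach_to_debug_chrome()
    driver.get(url)
    if is_captcha_url(driver.current_url):
        input("Solve the captcha in the Chrome window, then press Enter...")
        driver.get(url)  # retry once the session has been verified
    return driver.page_source
```

Because the script attaches to an existing browser session, a captcha solved once stays solved for that session, which matches the behavior described above.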

tiagoarodrigues55 avatar tiagoarodrigues55 commented on May 27, 2024

Hi guys, did you come to any conclusions? Need help fixing the issue? I'm having the same problem and I thought the idea of the project was very popular, I wanted to see it work...
