
Comments (12)

rocketinventor commented on August 20, 2024

@leoncvlt Well, that was the book five days ago, but the Blinkist site was actually broken

from blinkist-scraper.

rocketinventor commented on August 20, 2024

Hi @albertlaudia

At what part of the script is this happening? At the sign-in, or somewhere else? What happens if you manually change the URL after seeing this page?

As the script is set up right now, it is meant to block the captcha from loading and just skip to the next page.
If you want to load the captcha, you can open up uBlock and switch hcaptcha.com from block to allow.

However, I don't think that will help you much, because if you are getting the captcha page already, then you'll probably just get it again (even if you solve it).


albertlaudia commented on August 20, 2024

I am actually not 100% sure how the captcha works. It just seems that the script gets stuck on the captcha and then throws an error saying there is no internet.

image


FirstClassCitizenFCC commented on August 20, 2024

Assuming you get the captcha after the login check, try to change the URL manually to https://www.blinkist.com/{language} when the captcha occurs.


obsessivelearner commented on August 20, 2024

Not OP but I have a similar issue and it started a day before this issue was opened. The script ran perfectly for about 2 weeks prior to this. I don't have a premium account, I just scrape the daily book at midnight every day.

That out of the way, I tried what @FirstClassCitizenFCC suggested. Changing the URL manually first redirects me to https://www.blinkist.com/en/nc/library, followed immediately by the same captcha page. Interestingly, even though the terminal initially says "logged into blinkist", the final error was "Failed to log in to Blinkist", so I am not sure whether the captcha comes before or after the login check. I'm assuming after, because it does load my account's library for a split second before it gets stuck on the captcha.

The first image is my terminal output on a regular run, the second image is the output I get when I manually change URL after I'm stuck on the captcha.

image

image


johndoe-dev00 commented on August 20, 2024

I had the same problem with the captcha not loading correctly.
Disabling uBlock did the trick for me.
Once you have successfully logged in (the cookie file has been created), you can activate it again.

I also did a few other workarounds for the login process. You can check my fork.


obsessivelearner commented on August 20, 2024

I had no clue how to disable uBlock in the script because I'm very new to coding, but disabling uBlock in my Chrome instance after scraping started let me solve a captcha. It then scraped the books as normal once I accepted cookies.

@johndoe-dev00 I see you have a docker build for this project! That is something I had been searching for like a madman. I'll definitely check out your fork and docker. I hope to run this project on my Synology NAS via docker :)

I realize the issue isn't truly solved, and that it may be closed now that we have found this inelegant workaround. I just wanted to thank everybody who's worked on the project, and I hope to pay it forward in the near future.


Riviss commented on August 20, 2024

What ended up working for me was disabling uBlock, then clicking on the captcha area quickly when the page first loads. The captcha would then actually pop up to be completed, and everything would work. (This may work without first disabling uBlock; I had already disabled it when I tried this.)

If I just left the page to load without clicking quickly, it would go to the page with the screenshot @albertlaudia posted.


obsessivelearner commented on August 20, 2024

Disabling uBlock manually doesn't work anymore. Redirects to the following work of art:

image

The Title of the daily book is "The Internet of Us: Knowing More and Understanding Less in the Age of Big Data" and I'm not even mad.

Terminal Output looks like this:

image


rocketinventor commented on August 20, 2024

@obsessivelearner The issue that you are having has nothing to do with the script. The site is just broken right now...

Try navigating to https://www.blinkist.com/en/nc/daily/reader/the-internet-of-us-en manually in your web browser; you should see the same issue.


leoncvlt commented on August 20, 2024

Also, I think that link appears broken simply because "The Internet of Us" is no longer available as the free daily book; it probably worked on the day it was. https://www.blinkist.com/en/nc/daily should dynamically resolve to the free daily book, but reading the book from that link doesn't send you to the book's generic reader page. Instead it goes to a special https://www.blinkist.com/en/nc/daily/reader/{book-slug} URL, which obviously works for one day only.
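
To make the two URL shapes above concrete, here is a minimal Python sketch. The helper names `daily_url` and `daily_reader_url` are hypothetical and not taken from blinkist-scraper; only the URL patterns themselves come from this thread.

```python
BASE_URL = "https://www.blinkist.com"


def daily_url(language: str = "en") -> str:
    """Landing page that should resolve to whatever the current free daily book is."""
    return f"{BASE_URL}/{language}/nc/daily"


def daily_reader_url(book_slug: str, language: str = "en") -> str:
    """Day-specific reader URL; per this thread it only works while
    the given book is the free daily book."""
    return f"{daily_url(language)}/reader/{book_slug}"


if __name__ == "__main__":
    print(daily_url())
    print(daily_reader_url("the-internet-of-us-en"))
```

A scraper that stores the second form would presumably need to re-resolve the first form on each run rather than reuse a saved reader link.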


jonaschn commented on August 20, 2024

Using --no-ublock worked for me.
Also manually using the privacy-pass extension makes scraping audio possible again.
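
For anyone wondering how a switch like `--no-ublock` is typically wired up, below is a minimal sketch using Python's standard `argparse`. It is not the project's actual code: the function names `build_parser` and `extensions_to_load` and the `bin/ublock` path are assumptions made up for illustration.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Hypothetical CLI with a single opt-out switch for the ad blocker."""
    parser = argparse.ArgumentParser(description="Scraper CLI sketch (illustrative only)")
    parser.add_argument(
        "--no-ublock",
        action="store_true",
        help="do not load the uBlock extension, so the captcha can render",
    )
    return parser


def extensions_to_load(args: argparse.Namespace, ublock_path: str = "bin/ublock") -> list:
    """Return the extension paths the browser should load.

    `ublock_path` is a placeholder, not a path known to exist in the project.
    """
    return [] if args.no_ublock else [ublock_path]


if __name__ == "__main__":
    args = build_parser().parse_args()
    print(extensions_to_load(args))
```

argparse turns the dashes in the flag into underscores, so the value is read as `args.no_ublock`.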
