
Comments (3)

Type-Delta commented on June 12, 2024

This error happens because AnimePahe, for some weird reason, decided it's a good idea to put DDoS-Guard on EVERY page, including API responses.
(The same goes for its other domains too: animepahe.ru, animepahe.org, animepahe.com.)

Here is how this error happens

When Animdl searches the site for an anime title, it makes an API request with a URL similar to this:
https://animepahe.ru/api?q=Oroka%20na%20Tenshi%20wa%20Akuma%20to%20Odoru&m=search

which should return a JSON response like this:

{
   "total": 3,
   "per_page": 8,
   "current_page": 1,
   "last_page": 1,
   "from": 1,
   "to": 3,
   "data": [
      {
         "id": 5442,
         "title": "Oroka na Tenshi wa Akuma to Odoru",
         "type": "TV",
         "episodes": 12,
         "status": "Currently Airing",
         "season": "Winter",
         "year": 2024,
         "score": 6.62,
         "poster": "https:\/\/i.animepahe.ru\/posters\/dedc73ea139e05bddd50651cb35112806aaf984deaea672d25093def0d2a60aa.jpg",
         "session": "f115f686-4214-ee80-a402-6e801f2f6534"
      },
      ...
   ]
}

Unfortunately, this is what we get instead:
Screenshot (2181)

Since httpx.get() returns the response body as soon as the server answers, without running any JavaScript or waiting for the page to change,
it gets the content of the fake loading screen instead.

When Animdl then tries to parse the fake loading screen as a search result, it fails with the error you received.
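
To make the failure above easier to recognize, here is a minimal sketch (not Animdl's actual code) that checks whether a response body is really JSON before using it. The names `parse_search_response` and `ChallengePageError` are made up for illustration, and it only assumes the response shape shown earlier, with the results under a "data" key.

```python
import json


class ChallengePageError(RuntimeError):
    """Raised when the body looks like an HTML interstitial rather than JSON."""


def parse_search_response(body: str) -> list[dict]:
    """Return the "data" list of a search response, or fail with a clear error."""
    try:
        payload = json.loads(body)
    except json.JSONDecodeError as exc:
        # An HTML loading screen is not valid JSON, so decoding fails here.
        snippet = body.lstrip()[:40]
        raise ChallengePageError(
            f"expected JSON but got something else (starts with {snippet!r})"
        ) from exc
    if not isinstance(payload, dict):
        raise ChallengePageError("expected a JSON object at the top level")
    # A missing "data" key is treated as "no results" in this sketch.
    return payload.get("data", [])
```

In practice the body would come from something like the text of an httpx response; this way the user at least sees "got a challenge page" instead of a confusing parse error.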

so...

Is there any way to fix or work around this?

I'm not an expert, but from what I know... not much, actually,
because this is exactly what DDoS-Guard is made for.

Unless AnimePahe removes this protection, this is what we can try:

  • Wait for the fake loading screen to go away
  • Use cookies hack

Wait for the fake loading screen to go away

The simplest method is probably to wait for the fake loading screen to go away,
because it disappears and shows the real content after a few seconds anyway.
If we can somehow send an API request, wait, and only evaluate the page content once the real content is displayed,
it could work for a while.
It does have some flaws, though. The first is that the fake loading screen may stick around longer
than the delay we wait for. This can be fixed fairly easily with a headless browser:
instead of waiting for a fixed delay, we can wait for the loading screen's page elements to disappear.
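
The "wait until the real content shows up" idea can be sketched as a small polling loop. This is only an illustration under assumptions: `fetch` and `is_loading_screen` are hypothetical callables you would supply yourself (for example, `fetch` could read the current page content from a headless browser), and the timeout values are arbitrary.

```python
import time
from typing import Callable, Optional


def wait_for_real_content(
    fetch: Callable[[], str],
    is_loading_screen: Callable[[str], bool],
    timeout: float = 15.0,
    interval: float = 0.5,
    sleep: Callable[[float], None] = time.sleep,
    clock: Callable[[], float] = time.monotonic,
) -> Optional[str]:
    """Poll until the content is no longer the loading screen.

    Returns the real content, or None if the loading screen outlasts `timeout`.
    """
    deadline = clock() + timeout
    while True:
        content = fetch()
        if not is_loading_screen(content):
            return content
        if clock() >= deadline:
            # The loading screen stayed longer than we are willing to wait.
            return None
        sleep(interval)
```

Checking a condition instead of sleeping for a fixed time is the same trade-off described above: it returns as soon as the real content appears and gives up cleanly when it never does.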

Use cookies hack

Cookies saved in the browser can prevent the fake loading screen from appearing a second time.
We can attach these cookies to the request to make the server believe it comes from a browser that has already passed the DDoS protection.
This method is a bit more advanced but probably the most efficient way.
BUT those cookies have an expiration date, like any real cookie, so we might have to keep regenerating them before they expire.
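
The expiration problem can be handled with a small cache that regenerates the cookies shortly before they run out. This is a rough sketch, not a working bypass: `obtain` is a hypothetical callable that returns a cookie dict plus its expiry timestamp (how it gets them is left open), and the cookie name in the usage below is a placeholder.

```python
import time
from typing import Callable, Dict, Tuple


class ExpiringCookies:
    """Cache cookies and refresh them shortly before they expire."""

    def __init__(
        self,
        obtain: Callable[[], Tuple[Dict[str, str], float]],
        clock: Callable[[], float] = time.time,
    ) -> None:
        self._obtain = obtain  # returns (cookies, expires_at as a Unix timestamp)
        self._clock = clock
        self._cookies: Dict[str, str] = {}
        self._expires_at = 0.0  # forces a refresh on first use

    def get(self, margin: float = 30.0) -> Dict[str, str]:
        """Return valid cookies, regenerating them if they expire within `margin` seconds."""
        if self._clock() + margin >= self._expires_at:
            self._cookies, self._expires_at = self._obtain()
        return dict(self._cookies)
```

The returned dict could then be passed along with each request, for example through the `cookies=` argument that httpx accepts.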

Still, it's not flawless

No matter what we do, DDoS-Guard is still active on the server. Sooner or later it could mistake
our API requests for an attack and throw everyone's favorite puzzle at us: a captcha.
The problem is that Animdl can't solve those automatically, so we'd be kinda stuck. Sucks, right?

PS: I'm still working on this even though I'm not a contributor, because I still need Animdl for my anime needs ;)
A really reliable fix will have to be left to someone with much more neuron power and time than me.


TaleWatcher commented on June 12, 2024

utils.http_client.integrate_ddg_bypassing(
    client,
    ".marin.moe",
)

Adding ".animepahe.ru" to this function call should fix it:

utils.http_client.integrate_ddg_bypassing(
    client,
    ".marin.moe",
    ".animepahe.ru",
)


justfoolingaround commented on June 12, 2024

I'm extremely busy right now due to university. That said, please refrain from posting (or even mentioning) that part of the codebase. It is a very little-known method that, if it gets patched, could sabotage a lot of scrapers.
