Comments (4)

camp00000 commented on June 12, 2024

This worked for me locally on the dev branch: I was able to fetch the chapter list and tested downloading the first 10 chapters.

Please try out the dev branch and see if the issue still persists. If it does, please post the logs. (See this for how to install the dev branch)

If the dev branch works, the issue will be fixed automatically by the next release, so please close the issue.

Please also wrap your log output in three backticks (`) at the start and end so that Markdown formatting isn't applied to the log.
It'll have a background like this text after you've done that.

from lightnovel-crawler.

nixos-s commented on June 12, 2024
```
================================================================================
                          [#] Lightnovel Crawler v3.4.2
                  https://github.com/dipu-bd/lightnovel-crawler
--------------------------------------------------------------------------------
Module load failed: c:\users\nixos\appdata\local\programs\python\PYTHON~1\Lib\site-packages\sources\ar\rewayatclub.py | 3
Module load failed: c:\users\nixos\appdata\local\programs\python\PYTHON~1\Lib\site-packages\sources\en\l\lnmtl.py | 3
Module load failed: c:\users\nixos\appdata\local\programs\python\PYTHON~1\Lib\site-packages\sources\en\n\novelmao.py | 3
Module load failed: c:\users\nixos\appdata\local\programs\python\PYTHON~1\Lib\site-packages\sources\en\r\ranobes.py | 3

-> Press  Ctrl + C  to exit

? Enter novel page url or query novel: https://www.lightnovelpub.com/novel/shadow-slave-05122222
Retrieving novel info...
Exception in thread Thread-1 (read_novel_info):
Traceback (most recent call last):
  File "c:\users\nixos\appdata\local\programs\python\PYTHON~1\Lib\site-packages\lncrawl\templates\browser\basic.py", line 88, in read_novel_info
    self.read_novel_info_in_scraper()
  File "c:\users\nixos\appdata\local\programs\python\PYTHON~1\Lib\site-packages\lncrawl\templates\browser\general.py", line 18, in read_novel_info_in_scraper
    soup = self.get_novel_soup()
           ^^^^^^^^^^^^^^^^^^^^^
  File "c:\users\nixos\appdata\local\programs\python\PYTHON~1\Lib\site-packages\lncrawl\templates\soup\general.py", line 41, in get_novel_soup
    return self.get_soup(self.novel_url)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "c:\users\nixos\appdata\local\programs\python\PYTHON~1\Lib\site-packages\lncrawl\core\scraper.py", line 304, in get_soup
    response = self.get_response(url, **kwargs)
               ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "c:\users\nixos\appdata\local\programs\python\PYTHON~1\Lib\site-packages\lncrawl\core\scraper.py", line 201, in get_response
    return self.__process_request(
           ^^^^^^^^^^^^^^^^^^^^^^^
  File "c:\users\nixos\appdata\local\programs\python\PYTHON~1\Lib\site-packages\lncrawl\core\scraper.py", line 130, in __process_request
    raise e
  File "c:\users\nixos\appdata\local\programs\python\PYTHON~1\Lib\site-packages\lncrawl\core\scraper.py", line 123, in __process_request
    response.raise_for_status()
  File "c:\users\nixos\appdata\local\programs\python\PYTHON~1\Lib\site-packages\requests\models.py", line 1021, in raise_for_status
    raise HTTPError(http_error_msg, response=self)
requests.exceptions.HTTPError: 403 Client Error: Forbidden for url: https://www.lightnovelpub.com/novel/shadow-slave-05122222

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "c:\users\nixos\appdata\local\programs\python\PYTHON~1\Lib\threading.py", line 1073, in _bootstrap_inner
    self.run()
  File "c:\users\nixos\appdata\local\programs\python\PYTHON~1\Lib\threading.py", line 1010, in run
    self._target(*self._args, **self._kwargs)
  File "c:\users\nixos\appdata\local\programs\python\PYTHON~1\Lib\site-packages\lncrawl\templates\browser\basic.py", line 95, in read_novel_info
    self.read_novel_info_in_browser()
  File "c:\users\nixos\appdata\local\programs\python\PYTHON~1\Lib\site-packages\lncrawl\templates\browser\general.py", line 49, in read_novel_info_in_browser
    self.novel_title = self.parse_title_in_browser()
                       ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "c:\users\nixos\appdata\local\programs\python\PYTHON~1\Lib\site-packages\lncrawl\templates\novelpub.py", line 81, in parse_title_in_browser
    return self.parse_title(self.browser.soup)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "c:\users\nixos\appdata\local\programs\python\PYTHON~1\Lib\site-packages\lncrawl\templates\novelpub.py", line 76, in parse_title
    assert tag
AssertionError

 ! Error: No chapters found
<class 'Exception'>
File "c:\users\nixos\appdata\local\programs\python\PYTHON~1\Lib\site-packages\lncrawl\bots\console\integration.py", line 107, in start
    raise e
  File "c:\users\nixos\appdata\local\programs\python\PYTHON~1\Lib\site-packages\lncrawl\bots\console\integration.py", line 101, in start
    _download_novel()
  File "c:\users\nixos\appdata\local\programs\python\PYTHON~1\Lib\site-packages\lncrawl\bots\console\integration.py", line 85, in _download_novel
    self.app.get_novel_info()
  File "c:\users\nixos\appdata\local\programs\python\PYTHON~1\Lib\site-packages\lncrawl\core\app.py", line 137, in get_novel_info
    raise Exception("No chapters found")


--------------------------------------------------------------------------------
 -  https://github.com/dipu-bd/lightnovel-crawler/issues
================================================================================
```

It opens the page in a browser and then fails the Cloudflare check.

camp00000 commented on June 12, 2024

If this was with a proxy, it's likely due to bad IP reputation on the proxy address. If it was without one, Cloudflare has probably temporarily banned you from accessing that site. On my end I'm able to fetch the full chapter list but was unable to download a large number of chapters.

Cloudflare can be a real pain to deal with.
The only thing that could be adjusted is the rate limiter, but when I tested that it didn't seem to help much either, so I'm unsure how to proceed.
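
For reference, here is a minimal sketch of what "setting the rate limiter" means in general: spacing requests so that no two go out closer together than a fixed interval. This is not lightnovel-crawler's own implementation or configuration; the class name `MinIntervalLimiter` and its `wait()` method are hypothetical, and slowing down is not guaranteed to get past a Cloudflare block.

```python
import threading
import time


class MinIntervalLimiter:
    """Hypothetical helper: allow at most one call per `min_interval` seconds.

    Safe to share between download threads, since slots are reserved
    under a lock before any thread sleeps.
    """

    def __init__(self, min_interval: float) -> None:
        self.min_interval = min_interval
        self._lock = threading.Lock()
        self._next_allowed = 0.0  # monotonic time of the next free slot

    def wait(self) -> float:
        """Block until this caller's slot arrives; return the seconds slept."""
        with self._lock:
            now = time.monotonic()
            delay = max(0.0, self._next_allowed - now)
            # Reserve the slot after this one for the next caller.
            self._next_allowed = max(now, self._next_allowed) + self.min_interval
        if delay > 0:
            time.sleep(delay)
        return delay
```

A download loop would call `limiter.wait()` immediately before each request, for example with `MinIntervalLimiter(2.0)` to send at most one request every two seconds.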

nixos-s commented on June 12, 2024

No proxy

I can still access the site via my default browser (Brave), but the Chrome tab that it opens seems to have a problem: it always fails, no matter how many times I try.

Maybe I can try passing the cookies from my browser to the other one.

I'll also try with a VPN or proxy to see if it works.
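
The cookie idea above could be sketched roughly as follows, using only the Python standard library rather than the crawler's own code. Everything here is an assumption for illustration: the helper name `build_request`, the placeholder cookie value, and the placeholder User-Agent string. Cloudflare's clearance cookie is commonly named `cf_clearance` and is usually tied to the User-Agent and IP address that earned it, so the header would need to match the browser the cookie was copied from, and even then this may not work.

```python
import urllib.request


def build_request(url: str, cookies: dict[str, str], user_agent: str) -> urllib.request.Request:
    """Hypothetical helper: attach browser cookies and a matching User-Agent.

    `cookies` maps cookie names to values copied by hand from the
    browser's developer tools for the target site.
    """
    cookie_header = "; ".join(f"{name}={value}" for name, value in cookies.items())
    return urllib.request.Request(
        url,
        headers={"Cookie": cookie_header, "User-Agent": user_agent},
    )


# Placeholder values only; real ones would come from the working browser.
request = build_request(
    "https://www.lightnovelpub.com/novel/shadow-slave-05122222",
    {"cf_clearance": "PLACEHOLDER_VALUE"},
    "PLACEHOLDER_USER_AGENT",
)
# urllib.request.urlopen(request) would send it; not called here.
```

The request is only constructed, not sent, so the snippet can be run without touching the site.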