Some documentation is generated client-side with JS (ex: <a href="http://docs.prezly.

Crawl JS generated docs (docsearch-scraper, CLOSED)

algolia commented on May 25, 2024

Crawl JS generated docs

from docsearch-scraper.

Comments (10)

ElPicador commented on May 25, 2024

Note to myself:
Add search on gns3: https://secure.helpscout.net/conversation/154616922/4890/?folderId=696715

pixelastic commented on May 25, 2024

This looks like a pretty big enhancement, considering that the underlying engine we're using for scraping (Scrapy) only does static HTML parsing. For SPAs, I would rather try to hit the API level if possible, or say that DocSearch is not compatible with their documentation.

Prezly is using readme.io, so maybe we could build something directly at the readme.io level.

proudlygeek commented on May 25, 2024

I'd say it's more a feature than an enhancement.

I would personally go with an optional HTTP proxy that can process JavaScript documentation (PhantomJS / Selenium) and feed the resulting static page into Scrapy / Python. What do you think about this approach?
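To make the idea concrete, here is a minimal sketch of the "render first, then parse statically" flow, using only the Python standard library. It is not the scraper's actual code: `HeadingExtractor`, `scrape`, `SPA_SHELL`, and `fake_render` are hypothetical names, and `fake_render` is just a stand-in for a real PhantomJS / Selenium step.

```python
from html.parser import HTMLParser
from typing import Callable, List, Optional


class HeadingExtractor(HTMLParser):
    """Collects the text of <h1>-<h3> elements from static HTML."""

    def __init__(self) -> None:
        super().__init__()
        self.headings: List[str] = []
        self._in_heading = False

    def handle_starttag(self, tag, attrs):
        if tag in ("h1", "h2", "h3"):
            self._in_heading = True
            self.headings.append("")

    def handle_endtag(self, tag):
        if tag in ("h1", "h2", "h3"):
            self._in_heading = False

    def handle_data(self, data):
        if self._in_heading:
            self.headings[-1] += data


def extract_headings(html: str) -> List[str]:
    """Static parsing only: no JavaScript is executed here."""
    parser = HeadingExtractor()
    parser.feed(html)
    parser.close()
    return [h.strip() for h in parser.headings]


def scrape(raw_html: str, render: Optional[Callable[[str], str]] = None) -> List[str]:
    """Optionally pass the page through a JS-rendering step, then parse statically."""
    html = render(raw_html) if render is not None else raw_html
    return extract_headings(html)


# Roughly what a plain HTTP fetch of a client-side rendered page returns.
SPA_SHELL = '<html><body><div id="app"></div><script src="bundle.js"></script></body></html>'


def fake_render(html: str) -> str:
    # Hypothetical stand-in for PhantomJS / Selenium: pretends the script filled in #app.
    return html.replace('<div id="app"></div>', '<div id="app"><h1>Getting started</h1></div>')
```

Without the rendering step, `scrape(SPA_SHELL)` finds no headings. With `render=fake_render` it finds the one the script would have injected. Keeping the renderer optional means statically generated docs skip the extra cost.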

ElPicador commented on May 25, 2024

There is Scrapy for JS: https://github.com/scrapinghub/scrapy-splash
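For context, scrapy-splash talks to a Splash rendering service over HTTP. As a rough sketch of what that looks like without the plugin, the helper below builds a request URL for Splash's `render.html` endpoint, which returns the page's HTML after JavaScript has run. The function name `splash_render_url` is hypothetical, and the base URL assumes a Splash instance running locally on its default port 8050.

```python
from urllib.parse import urlencode


def splash_render_url(
    page_url: str,
    splash_base: str = "http://localhost:8050",  # assumption: local Splash instance
    wait: float = 0.5,  # seconds Splash waits after load so scripts can run
) -> str:
    """Build a Splash render.html URL that returns the JS-rendered HTML of page_url."""
    query = urlencode({"url": page_url, "wait": wait})
    return f"{splash_base}/render.html?{query}"
```

The returned URL can be fetched like any static page, so the existing static HTML parsing would not need to change.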

proudlygeek commented on May 25, 2024

@ElPicador very cool! As far as I can see, it's basically what I said, just handier and already Dockerized 😄 Did you already give it a try?

ElPicador commented on May 25, 2024

Never, @redox was the one who told me about it

redox commented on May 25, 2024

@ElPicador @proudlygeek @pixelastic We've been thinking of making it the onboarding project of @aseure :)

proudlygeek commented on May 25, 2024

Awesomeness!!! 💯 👍

aseure commented on May 25, 2024

I've opened a PR to address these problematic documentation sites. Please see #46.

pixelastic commented on May 25, 2024

I think this can be closed
