Giter VIP home page Giter VIP logo

Comments (5)

erikrose avatar erikrose commented on September 5, 2024

Thanks for letting me know! Fathom doesn't list jsdom as a dependency because it strictly isn't one: Fathom is happy to run against any DOM-compliant object. For instance, it's quite happy running against an in-browser native DOM, and it shouldn't bundle jsdom into any in-browser deployments. We do use jsdom in the tests, so that's where we declare it, in devDependencies.

I think the real answer is going to take the form of switching to domino, which also has the advantage of not leaking. However, for the interrim, you might need to pass jsdom.jsdom(...).window.document to against(). I haven't tried it, but I'm guessing based on how https://github.com/tmpvar/jsdom#basic-usage looks. It appears that that's where they're not hiding their DOM-compliant object. If that does work and you're willing, I'd welcome a doc patch so other people don't run into trouble!

from fathom.

brightredchilli avatar brightredchilli commented on September 5, 2024

Cool! This works:

// tested with [email protected], [email protected]
const body = new jsdom.JSDOM(htmlString).window.document
const facts = rules.against(body)

Where should I make this pr?

from fathom.

brightredchilli avatar brightredchilli commented on September 5, 2024

Wow, it took me longer than it should have to understand what you were proposing. So this works perfectly using domino:

const body = domino.createDocument(htmlString, true)
const facts = rules.against(body)

I think perhaps the doc patch should switch to recommending this implementation for server side dom handling?

from fathom.

erikrose avatar erikrose commented on September 5, 2024

I'd love to see us switch over to domino entirely. Toward that, I tried replacing staticDom() (our helper procedure for creating server-side DOMs for tests) with your implementation. However, it didn't pass all the tests:

  5 failing

  1) Cluster tests distance() considers A and B to be far apart if one contains the other:
     TypeError: Cannot read property 'parentNode' of null
      at distance (clusters.js:112:30)
      at Context.<anonymous> (test/clusters_tests.js:199:26)

  2) Cluster tests clusters() groups nearby similar nodes together:

      AssertionError: expected [ Array(1) ] to deeply equal [ Array(2) ]
      + expected - actual

       [
         [
      +    "G"
      +  ]
      +  [
      +    "E"
      +    "D"
           "F"
      -    "D"
      -    "E"
      +    "B"
      +    "A"
           "C"
      -    "A"
      -    "B"
      -    "G"
         ]
       ]

      at Context.<anonymous> (test/clusters_tests.js:260:20)

  3) Readability ruleset finds content from... large inner clusters with some piddly little text in their container:
     TypeError: Cannot read property 'parentNode' of null
      at distance (clusters.js:112:30)
      at clusters (lhs.js:311:23)
      at new DistanceMatrix (clusters.js:186:44)
      at clusters (clusters.js:314:20)
      at BestClusterLhs.fnodes (lhs.js:308:24)
      at OutwardRule.results (rule.js:287:69)
      at BoundRuleset._execute (ruleset.js:202:31)
      at BoundRuleset.get (ruleset.js:116:40)
      at contentFnodes (examples/readability.js:105:35)
      at snippetsFrom (test/readability_tests.js:17:36)
      at Context.<anonymous> (test/readability_tests.js:44:26)

  4) Readability ruleset finds content from... large outer clusters with some piddly inner things contained:

      AssertionError: expected [ Array(3) ] to deeply equal [ Array(3) ]
      + expected - actual

       [
      +  "Smoo bars."
         "Mangaroo. Witches an"
      -  "Smoo bars."
         "Bing bang, I saw the"
       ]

      at Context.<anonymous> (test/readability_tests.js:72:16)

  5) Ruleset takes a subtree of a document and operates on it:

      AssertionError: expected 'root' to equal 'inner'
      + expected - actual

      -root
      +inner

      at Context.<anonymous> (test/ruleset_tests.js:293:16)

2 and 4 might just be brittleness of tests—not real failures—but the other ones look legit. If you want to look into it, you're most welcome, and I'm happy to answer questions. Otherwise, we'd better keep recommending jsdom in the docs until domino works with all parts of Fathom.

from fathom.

brightredchilli avatar brightredchilli commented on September 5, 2024

I'd be happy to, will take a look as soon as I can - but for now, the jsdom implementation as it stands in the fathom docs here does not work with [email protected]. Maybe worth a quick patch there.

from fathom.

Related Issues (20)

Recommend Projects

  • React photo React

    A declarative, efficient, and flexible JavaScript library for building user interfaces.

  • Vue.js photo Vue.js

    🖖 Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.

  • Typescript photo Typescript

    TypeScript is a superset of JavaScript that compiles to clean JavaScript output.

  • TensorFlow photo TensorFlow

    An Open Source Machine Learning Framework for Everyone

  • Django photo Django

    The Web framework for perfectionists with deadlines.

  • D3 photo D3

    Bring data to life with SVG, Canvas and HTML. 📊📈🎉

Recommend Topics

  • javascript

    JavaScript (JS) is a lightweight interpreted programming language with first-class functions.

  • web

    Some thing interesting about web. New door for the world.

  • server

    A server is a program made to process requests and deliver data to clients.

  • Machine learning

    Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.

  • Game

    Some thing interesting about game, make everyone happy.

Recommend Org

  • Facebook photo Facebook

    We are working to build community through open source technology. NB: members must have two-factor auth.

  • Microsoft photo Microsoft

    Open source projects and samples from Microsoft.

  • Google photo Google

    Google ❤️ Open Source for everyone.

  • D3 photo D3

    Data-Driven Documents codes.