Comments (13)

ZachSaucier commented on May 14, 2024

@iandunn The main problem with that approach is that I've come across sites that don't use the <article> element for the article itself at all; they use it only for previews of other articles (why they do so eludes all logical explanations I can fathom).

This is really just a symptom of the larger problem of how JR does auto-selection. If I have time I'll look at redoing all of it this summer, but time gets shorter every day, haha.

from just-read.

ZachSaucier commented on May 14, 2024

Thanks for the report! It's because each section is technically in their own container, so Just Read selects the first container. I'm still working on a fix for this in the auto-selection mode, but I'm not sure how I can do that without error across all sites.

However, have you tried out user selection mode or highlight mode? I was able to easily get all the content using both of those modes.

paulvancotthem commented on May 14, 2024

I have a similar issue on pages in the "motorsport.com" domain.
When I use the "Clearly" extension (now unsupported and no longer developed), it renders the page much better.

Example:
http://www.motorsport.com/f1/news/massa-returns-to-f1-as-bottas-replacement-865853/

ZachSaucier commented on May 14, 2024

@paulvancotthem That website seems to work when I use JR's auto-selection...

paulvancotthem commented on May 14, 2024

@ZachSaucier It does not fully work for me. The top H1-title of that article ("Mercedes confirms Bottas as Hamilton's teammate") and the photo at the top of the page do not show up after JR parses it. JR's output starts at the H2-subtitle, ignoring what's above it.

ZachSaucier commented on May 14, 2024

@paulvancotthem I understand now.

The title isn't obtained because JR checks the article's container for an h1 or h2 first, and only then searches more globally. I believe this approach is generally preferable; keep in mind that you can manually edit the title if you need to by hovering over it and clicking the pencil. The photo is just outside the article container, so it would be hard to programmatically find images like that and determine whether or not they should be included.
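
A minimal sketch of the lookup order described above, for illustration only. It is not Just Read's actual code: `PageNode`, `findFirst`, and `pickTitle` are hypothetical names, and a plain object tree stands in for the DOM so the example runs on its own.

```typescript
// Hypothetical stand-in for a DOM element (not Just Read's real data model).
interface PageNode {
  tag: string;
  text?: string;
  children?: PageNode[];
}

// Depth-first search for the first node whose tag is one of `tags`.
function findFirst(node: PageNode, tags: string[]): PageNode | undefined {
  if (tags.indexOf(node.tag) !== -1) return node;
  for (const child of node.children ?? []) {
    const hit = findFirst(child, tags);
    if (hit) return hit;
  }
  return undefined;
}

// Look for a heading inside the article container first,
// and fall back to the whole page only if none is found there.
function pickTitle(page: PageNode, container: PageNode): string | undefined {
  const heading =
    findFirst(container, ["h1", "h2"]) ?? findFirst(page, ["h1", "h2"]);
  return heading?.text;
}

// A page shaped like the one reported: the h1 sits outside the container.
const articleContainer: PageNode = {
  tag: "div",
  children: [
    { tag: "h2", text: "Subtitle inside the container" },
    { tag: "p", text: "Body text" },
  ],
};
const examplePage: PageNode = {
  tag: "body",
  children: [{ tag: "h1", text: "Main title" }, articleContainer],
};

// The h2 inside the container wins over the h1 outside it.
console.log(pickTitle(examplePage, articleContainer));
```

With this ordering, a page whose h1 sits outside the selected container but whose h2 sits inside it yields the h2, which matches the behavior reported for the motorsport.com article.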

paulvancotthem commented on May 14, 2024

@ZachSaucier When I use the "Clearly" extension (from Evernote; no longer supported, but still available for download here), it renders this page correctly: the title, the image, and the rest of the article are all parsed and displayed. So, there must be a way to do this programmatically...
screenshot-001

ZachSaucier commented on May 14, 2024

I seem unable to get that download of Clearly to work on my computer. I can look at the code though, so I'll try to break down what lets it select content better than JR does.

paulvancotthem commented on May 14, 2024

@ZachSaucier Oh wow, thanks Zach!

ZachSaucier commented on May 14, 2024

For future reference for myself, a running list of other sites with this same issue:

ZachSaucier commented on May 14, 2024

One idea I have (a potential feature) that would help with this, though not solve it, is to implement a "select more generally" or "select parent container" button. That way, if only part of the content is shown, users can stay in the Just Read format but select more content (which should tend to be the full article).
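
A rough sketch of what a "select parent container" action could do, under stated assumptions: `TreeNode`, `makeNode`, and `selectParentContainer` are hypothetical names used only for illustration, and a small object tree stands in for the DOM. It is not taken from Just Read.

```typescript
// Hypothetical stand-in for a DOM element with a parent link.
interface TreeNode {
  tag: string;
  parent?: TreeNode;
  children: TreeNode[];
}

// Build a node and wire up the parent pointers of its children.
function makeNode(tag: string, children: TreeNode[] = []): TreeNode {
  const node: TreeNode = { tag, children };
  for (const child of children) child.parent = node;
  return node;
}

// Widen the current selection by one level; the root stays selected.
function selectParentContainer(current: TreeNode): TreeNode {
  return current.parent ?? current;
}

// Two sibling sections: auto-selection picked only the first one.
const firstSection = makeNode("section");
const sectionWrapper = makeNode("div", [firstSection, makeNode("section")]);

let selection: TreeNode = firstSection;
selection = selectParentContainer(selection); // now the wrapper holding both sections
console.log(selection === sectionWrapper);
```

Each click moves the selection up one level, so a user who sees only the first section can widen the view until the whole article is covered.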

iandunn commented on May 14, 2024

"each section is technically in their own container, so Just Read selects the first container... I'm not sure how I can do that without error across all sites."

You may have already considered this, but would it be a problem to just include all the <article> elements inside the container?

Maybe some sites put too much crap in there, but with all the other formatting stripped away, it doesn't seem like it'd be that bad. At least for me, having too much (clean) text is a minor problem, while missing some important text is a major problem.
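
To make the suggestion concrete, here is a small sketch that contrasts taking only the first <article> with taking all of them. Everything in it is an assumption for illustration: `ContentNode`, `textOf`, `collectArticles`, `firstArticleText`, and `allArticlesText` are hypothetical names, and a plain object tree replaces the DOM so the code is self-contained.

```typescript
// Hypothetical stand-in for a DOM element.
interface ContentNode {
  tag: string;
  text?: string;
  children?: ContentNode[];
}

// Concatenate all text under a node, in document order.
function textOf(node: ContentNode): string {
  const parts: string[] = [];
  if (node.text) parts.push(node.text);
  for (const child of node.children ?? []) {
    const childText = textOf(child);
    if (childText) parts.push(childText);
  }
  return parts.join(" ");
}

// Collect the outermost <article> nodes under `root`, in document order.
function collectArticles(root: ContentNode): ContentNode[] {
  const found: ContentNode[] = [];
  for (const child of root.children ?? []) {
    if (child.tag === "article") found.push(child);
    else found.push(...collectArticles(child));
  }
  return found;
}

// "First container only": the behavior described in the report.
function firstArticleText(root: ContentNode): string {
  const articles = collectArticles(root);
  return articles.length > 0 ? textOf(articles[0]) : "";
}

// The suggestion: keep every <article> inside the container.
function allArticlesText(root: ContentNode): string {
  return collectArticles(root).map(textOf).join("\n\n");
}

const pageBody: ContentNode = {
  tag: "div",
  children: [
    { tag: "article", children: [{ tag: "p", text: "Part one." }] },
    { tag: "nav", text: "Menu" },
    { tag: "article", children: [{ tag: "p", text: "Part two." }] },
  ],
};

console.log(firstArticleText(pageBody)); // only the first section
console.log(allArticlesText(pageBody)); // both sections, navigation skipped
```

The trade-off is the one raised in the thread: on sites that wrap previews of other articles in <article>, the second function would pull in that extra text as well.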

ZachSaucier commented on May 14, 2024

This should be more or less fixed in the latest version (1.1.0) with Just Read's new auto-selection algorithm.
