Comments (13)
@iandunn The main problem with that approach is that I've come across sites that don't use the <article> element for the article at all; they use it only for previews of other articles (why they do so eludes every logical explanation I can fathom).
This is really just a symptom of the larger problem of how JR does auto-selection. If I have time I'll look at redoing all of it this summer, but time gets shorter every day, haha.
from just-read.
Thanks for the report! It happens because each section is technically in its own container, so Just Read selects only the first container. I'm still working on a fix for auto-selection mode, but I'm not sure how to do that without introducing errors on other sites.
However, have you tried out user selection mode or highlight mode? I was able to easily get all the content using both of those modes.
I have a similar issue on pages in the "motorsport.com" domain.
When I use the "Clearly" extension (now unsupported and no longer developed), it renders the page much better.
Example:
http://www.motorsport.com/f1/news/massa-returns-to-f1-as-bottas-replacement-865853/
@paulvancotthem That website seems to work when I use JR's auto-selection...
@ZachSaucier It does not fully work for me. The top H1-title of that article ("Mercedes confirms Bottas as Hamilton's teammate") and the photo at the top of the page do not show up after JR parses it. JR's output starts at the H2-subtitle, ignoring what's above it.
@paulvancotthem I understand now.
The title isn't picked up because JR checks the article's container for an h1 or h2 first and only then searches more globally. I believe this approach is generally more favorable, and you can always edit the title manually by hovering over it and clicking the pencil. The photo sits just outside the article container, so it would be hard to programmatically find images like that and decide whether or not they should be included.
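To make the lookup order concrete, here is a rough sketch of a "container first, then page-wide" title search. It is not Just Read's actual code: `PageNode`, `findFirstHeading`, and `pickTitle` are hypothetical names, the plain-object tree stands in for the DOM so the snippet runs without a browser, and taking the first h1 or h2 in document order is an assumption.

```typescript
// Minimal stand-in for a DOM element (hypothetical, for illustration only).
interface PageNode {
  tag: string;
  text?: string;
  children?: PageNode[];
}

// Depth-first search for the first h1 or h2 in document order.
function findFirstHeading(root: PageNode): PageNode | undefined {
  if (root.tag === "h1" || root.tag === "h2") {
    return root;
  }
  for (const child of root.children || []) {
    const hit = findFirstHeading(child);
    if (hit) {
      return hit;
    }
  }
  return undefined;
}

// Look inside the selected container first; fall back to the whole page
// only when the container has no heading of its own.
function pickTitle(page: PageNode, container: PageNode): string | undefined {
  const local = findFirstHeading(container);
  if (local) {
    return local.text;
  }
  const global = findFirstHeading(page);
  return global ? global.text : undefined;
}
```

With this order, a page whose h1 sits above the container and whose h2 sits inside it yields the h2 as the title, which matches the behavior described for the motorsport.com page.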
@ZachSaucier When I use the "Clearly" extension (from Evernote; no longer supported, but still available for download here), it renders this page correctly: the title, the image, and the rest of the article are all parsed and displayed. So there must be a way to do this programmatically...
I can't seem to get that download of Clearly to work on my computer. I can look at the code though, so I'll try to break down what it's doing that lets it select content better than JR does.
@ZachSaucier Oh wow, thanks Zach!
For future reference for myself, a running list of other sites with this same issue:
- http://www.catb.org/esr/faqs/things-every-hacker-once-knew/
- https://plato.stanford.edu/entries/faith/
- https://www.weforum.org/agenda/2017/03/how-to-turn-a-co-worker-who-sees-you-as-a-threat-into-an-ally/
- https://www.phaseone.com/Emmanuel-Bournot.aspx
- http://g1.globo.com/politica/noticia/em-cerimonia-no-planalto-temer-diz-que-conduzira-governo-ate-31-de-dezembro-de-2018.ghtml
- https://www.quantamagazine.org/a-path-less-taken-to-the-peak-of-the-math-world-20170627/
- http://main.poliquingroup.com/ArticlesMultimedia/Articles/Article/1143/Eight_Common_but_Dangerous_Mistakes_of_A_High-Fat_.aspx
- http://www.consumerreports.org/pharmacies/get-a-pharmacy-discount-with-medicare/
- http://lithub.com/why-you-should-aim-for-100-rejections-a-year/
- http://step-in-project.blogspot.com/2017/07/lisas-journey-to-iraq.html (mixed with #82)
One idea I have for a potential feature that would help with this, though not solve it, is a "select more generally" or "select parent container" button. That way, if only part of the content is shown, users can stay in the Just Read view and widen the selection to the parent container, which should usually contain the full article.
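As a rough illustration of what such a button could do, the sketch below moves the current selection up one level. `ContainerNode` and `selectParent` are hypothetical names, not part of Just Read; in a browser the same step would presumably rely on the element's parent reference.

```typescript
// Hypothetical stand-in for a selected element with a link to its parent.
interface ContainerNode {
  id: string;
  parent?: ContainerNode;
}

// Widen the selection by one level; stay on the same node at the root.
function selectParent(current: ContainerNode): ContainerNode {
  return current.parent ? current.parent : current;
}
```

Each click would call `selectParent` on the current selection and re-render, so a user who only sees the first section can step outward until the whole article is included.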
"each section is technically in their own container, so Just Read selects the first container... I'm not sure how I can do that without error across all sites."
You may have already considered this, but would it be a problem to just include all the <article> elements inside the container?
Maybe some sites put too much crap in there, but with all the other formatting stripped away, it doesn't seem like it'd be that bad. At least for me, having too much (clean) text is a minor problem, while missing some important text is a major problem.
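Roughly what I have in mind, as a sketch only: gather every article node under the chosen container and join their text. `DocNode`, `collectArticles`, `plainText`, and `combinedArticleText` are made-up names, and the plain-object tree is a stand-in for the DOM so the snippet runs without a browser.

```typescript
// Minimal stand-in for a DOM element (hypothetical, for illustration only).
interface DocNode {
  tag: string;
  text?: string;
  children?: DocNode[];
}

// Collect every <article> under root, without descending into a match,
// so nested articles are not counted twice.
function collectArticles(root: DocNode, out: DocNode[] = []): DocNode[] {
  if (root.tag === "article") {
    out.push(root);
    return out;
  }
  for (const child of root.children || []) {
    collectArticles(child, out);
  }
  return out;
}

// Flatten a node to plain text, one line per text-bearing descendant.
function plainText(node: DocNode): string {
  const own = node.text ? [node.text] : [];
  const nested = (node.children || [])
    .map(plainText)
    .filter((s) => s.length > 0);
  return own.concat(nested).join("\n");
}

// Join the text of all articles, separated by a blank line.
function combinedArticleText(container: DocNode): string {
  return collectArticles(container).map(plainText).join("\n\n");
}
```

On a site that uses article nodes for previews of other stories, this would pull those previews in too, which is the "too much text" trade-off mentioned above.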
This should be more or less fixed in the latest version (1.1.0) with Just Read's new auto-selection algorithm.