Comments (11)
I expect this can be done using a more extensively defined XPath query. Below are some examples (not from N24.de) that might be useful. Today is a slow news day, so I don't know yet whether tt-rss works well with these queries. According to XPath validators, they should.
Edit: unfortunately this does not seem to work with your code. So far it only uses the first entry returned by the query, instead of adding all of them to the article text.
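To illustrate the difference between taking only the first match and taking all of them, here is a minimal sketch in Python rather than the plugin's own code. The sample page and the helper names `first_match` and `all_matches` are hypothetical. Python's standard-library `xml.etree.ElementTree` only supports a small XPath subset (no `|`, `contains()` or `or`), so it cannot run the more complex queries in this thread.

```python
import xml.etree.ElementTree as ET

# Hypothetical, well-formed sample page. Real-world HTML usually needs a
# more tolerant parser than ElementTree.
PAGE = """
<html><body>
  <div class="news-single-item">
    <p>First paragraph.</p>
    <p>Second paragraph.</p>
    <p>Third paragraph.</p>
  </div>
</body></html>
"""


def first_match(root, query):
    """Serialize only the first node matched by the query."""
    node = root.find(query)
    return "" if node is None else ET.tostring(node, encoding="unicode").strip()


def all_matches(root, query):
    """Serialize every matched node and join them in document order."""
    return "\n".join(
        ET.tostring(node, encoding="unicode").strip()
        for node in root.findall(query)
    )


root = ET.fromstring(PAGE)
query = ".//div[@class='news-single-item']/p"

first_only = first_match(root, query)  # one paragraph
combined = all_matches(root, query)    # all three paragraphs
print(combined)
```

Joining all matches is what makes a query that selects several sibling elements useful as article text.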
Selecting several specific divs / tags:
//h1 | //h2 | //h3
//div[@id='artikelKolom']/div[@class='zaktxt clear']/div[@class='zak_normal'] | //div[@id='artikelKolom']/p
Note: sequence matters when doing it like this! //h1 | //h2 | //h3 will show first all h1's, followed by all h2's and then all h3's
//div[@id='artikelKolom']/*[contains(@class,'zaktxt') or name()='p']
Note: here the sequence in the query does not seem to matter; the results follow their order in the file
Select all divs with certain classes. The divs do not need to have the same parent
//div[@class='content illustrated' or @class='post-body']
//div[contains(@class,'illustration top')] | //div[contains(@class,'post-body')]
//div[contains(@class,'illustration top') or contains(@class,'post-body')]
Note: not sure whether sequence matters
Select all children from div id='artikelKolom', except children with div class='broodtxt' or div class='bannercenter ...'
//div[@id='artikelKolom']/*[@class!='broodtxt']
//div[@id='artikelKolom']/*[not(@class='broodtxt')]
//div[@id='artikelKolom']/*[not(contains(@class, 'broodtxt'))]
//div[@id='artikelKolom']/*[not(contains(@class, 'broodtxt')) and not(contains(@class, 'bannercenter'))]
from ttrss_plugin-af_feedmod.
I think it'll get too complicated if you need to "puzzle" the result together like this. Also it'll get worse when the source changes its layout (like N24 did some days ago).
Maybe I'll implement a blacklist which will remove certain XPath elements from the result. I think this is more robust.
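The blacklist idea could look roughly like the following. This is only a sketch in Python, not the plugin's code: the function name `remove_blacklisted`, the sample markup and the blacklist entries are all hypothetical. Because the standard-library `xml.etree.ElementTree` has no `contains()`, the class values are matched exactly here.

```python
import xml.etree.ElementTree as ET

# Hypothetical article node as selected by the main XPath query.
PAGE = """
<div id="artikelKolom">
  <p>Article text.</p>
  <div class="broodtxt">Breadcrumb trail</div>
  <div class="bannercenter wide">Advertisement</div>
  <p>More article text.</p>
</div>
"""


def remove_blacklisted(root, blacklist):
    """Remove every element matched by any query in `blacklist`, in place."""
    for query in blacklist:
        # ElementTree nodes do not know their parent, so build a
        # child -> parent map. It is rebuilt per query because the
        # tree changes as nodes are removed.
        parents = {child: parent for parent in root.iter() for child in parent}
        for node in root.findall(query):
            parent = parents.get(node)
            if parent is not None:
                parent.remove(node)
    return root


root = ET.fromstring(PAGE)
blacklist = [
    ".//div[@class='broodtxt']",
    ".//div[@class='bannercenter wide']",
]
cleaned = ET.tostring(remove_blacklisted(root, blacklist), encoding="unicode")
print(cleaned)
```

The main query stays simple (select the article container), and layout changes on the source site only require adjusting the blacklist entries.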
A blacklist would be really nice :D
I also have a big problem with welt.de: their feed URL links to an overview page. The source URL would need to be rewritten, for example:
http://www.welt.de/?config=articleidfromurl&artid=115415142
should be
http://www.welt.de/article115415142
It would be fantastic to see these features :D
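A rewrite like the one requested for welt.de could be sketched as follows. This is a Python illustration under assumptions, not an existing plugin feature: the function name `rewrite_welt_url` is hypothetical, and the only URL pattern it knows is the single example given above.

```python
from urllib.parse import parse_qs, urlsplit


def rewrite_welt_url(url):
    """Turn an overview-style welt.de link into a direct article link.

    URLs that do not match the expected pattern are returned unchanged.
    """
    parts = urlsplit(url)
    art_ids = parse_qs(parts.query).get("artid")
    if parts.netloc != "www.welt.de" or not art_ids or not art_ids[0].isdigit():
        return url
    return f"{parts.scheme}://{parts.netloc}/article{art_ids[0]}"


print(rewrite_welt_url("http://www.welt.de/?config=articleidfromurl&artid=115415142"))
# prints http://www.welt.de/article115415142
```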
Hi,
is there a way to use all entries from the query, instead of adding only the first one to the article text?
div[@class='news-single-item']/p ==> only returns the content of the first p found
div[@id='news-single-item']/*[not(div[@class='comments'])] ==> doesn't work :(
Thank you for your answer.
Kasad
Yes, but you need to make some changes to the init.php file. I did this last weekend and this week it seems to work as expected. See https://github.com/bfly75/ttrss_plugin-af_feedmod.
Ronald Capel
Wow, thank you very much - this works awesome :D
I think post-processing should also rip out (at least) id, class and style attributes from the content. Some pages I fetch using feedmod have elements with ids such as "overlay" in them that pick up tt-rss's styling, making things look wonky.
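As a rough illustration of that post-processing step, the sketch below strips `id`, `class` and `style` from every element of a fetched fragment. It is written in Python for brevity and is not taken from the plugin; the name `strip_attributes` and the sample fragment are hypothetical.

```python
import xml.etree.ElementTree as ET

# Attributes that can collide with the reader's own stylesheet.
STRIP_ATTRIBUTES = ("id", "class", "style")


def strip_attributes(root, names=STRIP_ATTRIBUTES):
    """Remove the given attributes from `root` and all its descendants."""
    for node in root.iter():
        for name in names:
            node.attrib.pop(name, None)
    return root


# Hypothetical fetched content with an id that clashes with the reader UI.
fragment = ET.fromstring(
    '<div id="overlay" class="post-body" style="color:red">'
    '<p class="lead">Text</p><a href="/more">More</a></div>'
)
cleaned = ET.tostring(strip_attributes(fragment), encoding="unicode")
print(cleaned)
```

Other attributes such as `href` and `src` are left alone, so links and images keep working.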
@bfly75: Thanks for that modification!
@mbirth: You should consider incorporating bfly75's modification, maybe by creating a new type (e.g. xpath-all-matches).
I just merged changes from @rangerer which add a new "cleanup" option to remove unwanted parts from the main XPath node. He also has provided a lot of examples.
Another thing this one should do: make all URLs absolute (i.e. fully qualified, including "http://www.example.org/"), because, as in #22, images with relative URLs are not shown.
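Resolving relative URLs against the article's address could be done along these lines. This is a Python sketch under assumptions, not the plugin's implementation: the name `absolutize_urls`, the choice of the `href` and `src` attributes, and the sample fragment are hypothetical.

```python
import xml.etree.ElementTree as ET
from urllib.parse import urljoin

# Attributes assumed to carry URLs in this sketch.
URL_ATTRIBUTES = ("href", "src")


def absolutize_urls(root, base_url):
    """Rewrite relative href/src values in place so they are fully qualified."""
    for node in root.iter():
        for name in URL_ATTRIBUTES:
            if name in node.attrib:
                # urljoin leaves already-absolute URLs unchanged.
                node.set(name, urljoin(base_url, node.get(name)))
    return root


fragment = ET.fromstring(
    '<div><img src="/images/photo.jpg" />'
    '<a href="story.html">Read</a>'
    '<a href="http://other.example.net/x">External</a></div>'
)
absolutize_urls(fragment, "http://www.example.org/news/today.html")
print(ET.tostring(fragment, encoding="unicode"))
```

The base URL would be the address of the fetched article page, so both root-relative and document-relative paths resolve correctly.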
Hi,
after my ttrss crashed I couldn't use the version of bfly75 any longer. Could you please add his way to display more than one div?
Greetings
K