Comments (8)
I also maintain an RSS reader, and you won't believe how many things you have to take into consideration. I saw that you've started implementing some of them, like supporting multiple feed standards, multiple date formats, etc.
I'm stripping the BOM away before parsing the feed. If I remember correctly, I'm actually stripping away everything up to the first < because there are also feeds that like to place strings before the actual XML tags ...
Another thing that I saw when you replaced your parsing with a multiplatform library is that you don't parse HTML. However, HTML websites can reference their feeds (<link rel="alternate" type="application/rss+xml" title="RSS" href="/rss.xml" />), so it's nice when, for instance, you try to add a website: the app can parse the HTML, collect all of the feeds (there can be more than one), and add them automatically in one go.
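The feed discovery described above could be sketched roughly like this. It is only an illustration: `FeedLink` and `discoverFeeds` are hypothetical names, the regex-based tag scan is a simplification (a real HTML parser is more robust), and relative `href` values such as /rss.xml would still need to be resolved against the page URL.

```kotlin
// Hypothetical types and names, for illustration only.
data class FeedLink(val title: String?, val href: String)

// Matches a whole <link ...> tag; a simplification compared to real HTML parsing.
private val linkTag = Regex("""<link\b[^>]*>""", RegexOption.IGNORE_CASE)

// Matches name="value" or name='value' attribute pairs inside a tag.
private val attribute = Regex("""([\w-]+)\s*=\s*(?:"([^"]*)"|'([^']*)')""")

// MIME types assumed to identify RSS and Atom feeds.
private val feedTypes = setOf("application/rss+xml", "application/atom+xml")

fun discoverFeeds(html: String): List<FeedLink> =
    linkTag.findAll(html).mapNotNull { tag ->
        // Collect the attributes of this tag, keyed by lower-cased name.
        val attrs = attribute.findAll(tag.value).associate { m ->
            val value = m.groups[2]?.value ?: m.groups[3]?.value ?: ""
            m.groupValues[1].lowercase() to value
        }
        val rel = attrs["rel"] ?: return@mapNotNull null
        val type = attrs["type"] ?: return@mapNotNull null
        val href = attrs["href"] ?: return@mapNotNull null

        // rel may hold several space-separated tokens; one must be "alternate".
        val isAlternate = rel.trim().split(Regex("""\s+"""))
            .any { it.equals("alternate", ignoreCase = true) }

        if (isAlternate && type.lowercase() in feedTypes && href.isNotBlank()) {
            FeedLink(attrs["title"], href)
        } else {
            null
        }
    }.toList()
```

Calling `discoverFeeds` on a page returns every matching link, so a site that advertises both an RSS and an Atom feed yields two entries that can be offered or added together.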
from twine.
Hi, thanks for reporting the issue. There might be an incorrect tag present in the RSS feed. Is the issue reproducible when using the Atom feed as well?
I will take a look at why the RSS feed parsing is failing.
Hello, yes, I have tried both the Atom and RSS feed links, and the same issue happens.
I'm operating on a ByteArray, which I then convert to a String using the desired charset (this is important, many Russian feeds otherwise break), followed by .removePrefix("\uFEFF") to drop the BOM. Then I use byteArrayString.indexOfAny to find any of the expected starting tags, such as <?xml, <html, or others from all the specifications that I use, and drop everything before that. For now this seems to be handling all of the cases. As for the Russian feed, I think it was this one: https://www.opennet.ru/opennews/opennews_review.rss. You can use it to test that parsing works correctly and that the Cyrillic alphabet is displayed.
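Put together, the steps above might look something like the following sketch. It assumes the JVM (for `java.nio.charset.Charset`); `sanitizeFeed` and the `startTags` list are hypothetical, and the tags beyond <?xml and <html are my guesses at what "all the specifications" could include, not the actual list.

```kotlin
import java.nio.charset.Charset

// Assumed list of starting tags; only "<?xml" and "<html" come from the comment above.
val startTags = listOf("<?xml", "<rss", "<feed", "<rdf:RDF", "<html", "<!DOCTYPE")

// Hypothetical helper: decode the bytes, drop the BOM, drop anything before the first known tag.
fun sanitizeFeed(bytes: ByteArray, charset: Charset = Charsets.UTF_8): String {
    // Decode with the feed's charset first; a wrong charset garbles non-ASCII text.
    val text = bytes.toString(charset).removePrefix("\uFEFF")

    // Earliest position of any known starting tag, or -1 if none is present.
    val start = text.indexOfAny(startTags, ignoreCase = true)

    // Leave the text untouched when no tag is found or it already starts with one.
    return if (start > 0) text.substring(start) else text
}
```

The charset would normally come from the HTTP response or the XML declaration; passing it explicitly keeps the helper independent of how it is detected. The tag search also covers feeds that place stray text before the XML, not only a BOM.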
Hmm, I searched the issues here and saw you mentioned W3C's RSS feed validator, so I tried that tool, and it looks like said Microsoft RSS feeds do not validate. Could that be the reason? 👀
It looks like the following feeds include a BOM at the start of the document, which prevents the parser from moving ahead with parsing.
Another thing that I saw when you replaced your parsing with a multiplatform library is that you don't parse HTML. However HTML websites can actually reference their feeds (<link rel="alternate" type="application/rss+xml" title="RSS" href="/rss.xml" />) so it's nice when you for instance try to add a website, automatically parse the html get all of the feeds (there can be more) and then add them automatically in one go.
Hi, I do parse HTML when trying to fetch a feed and find the link to the feed.
I'm stripping the bom away before parsing the Feed. If I remember correctly, I'm actually stripping anything away until the first < because there are also feeds that like to place strings before the actual xml tags ...
@vanniktech how are you doing this? Are you using a regex to strip the BOM, or a different method? At the moment I am trying a regex to strip it away, and it is working. I just want to check whether there is an alternative you have tried.