
rss-parser's Introduction

rss-parser


A small library for turning RSS XML feeds into JavaScript objects.

Installation

npm install --save rss-parser

Usage

You can parse RSS from a URL (parser.parseURL) or an XML string (parser.parseString).

Both callbacks and Promises are supported.

NodeJS

Here's an example in NodeJS using Promises with async/await:

let Parser = require('rss-parser');
let parser = new Parser();

(async () => {

  let feed = await parser.parseURL('https://www.reddit.com/.rss');
  console.log(feed.title);

  feed.items.forEach(item => {
    console.log(item.title + ':' + item.link)
  });

})();

TypeScript

When using TypeScript, you can set a type to control the custom fields:

import Parser from 'rss-parser';

type CustomFeed = {foo: string};
type CustomItem = {bar: number};

const parser: Parser<CustomFeed, CustomItem> = new Parser({
  customFields: {
    feed: ['foo', 'baz'],
    //            ^ will error because `baz` is not a key of CustomFeed
    item: ['bar']
  }
});

(async () => {
  const feed = await parser.parseURL('https://www.reddit.com/.rss');
  console.log(feed.title); // feed will have a `foo` property, typed as a string

  feed.items.forEach(item => {
    console.log(item.title + ':' + item.link) // item will have a `bar` property, typed as a number
  });
})();

Web

We recommend using a bundler like webpack, but we also provide pre-built browser distributions in the dist/ folder. If you use the pre-built distribution, you'll need a polyfill for Promise support.

Here's an example in the browser using callbacks:

<script src="/node_modules/rss-parser/dist/rss-parser.min.js"></script>
<script>

// Note: some RSS feeds can't be loaded in the browser due to CORS security.
// To get around this, you can use a proxy.
const CORS_PROXY = "https://cors-anywhere.herokuapp.com/"

let parser = new RSSParser();
parser.parseURL(CORS_PROXY + 'https://www.reddit.com/.rss', function(err, feed) {
  if (err) throw err;
  console.log(feed.title);
  feed.items.forEach(function(entry) {
    console.log(entry.title + ':' + entry.link);
  })
})

</script>

Upgrading from v2 to v3

A few minor breaking changes were made in v3. Here's what you need to know:

  • You need to construct a new Parser() before calling parseString or parseURL
  • parseFile is no longer available (for better browser support)
  • options are now passed to the Parser constructor
  • parsed.feed is now just feed (top-level object removed)
  • feed.entries is now feed.items (to better match RSS XML)

Output

Check out the full output format in test/output/reddit.json

feedUrl: 'https://www.reddit.com/.rss'
title: 'reddit: the front page of the internet'
description: ""
link: 'https://www.reddit.com/'
items:
    - title: 'The water is too deep, so he improvises'
      link: 'https://www.reddit.com/r/funny/comments/3skxqc/the_water_is_too_deep_so_he_improvises/'
      pubDate: 'Thu, 12 Nov 2015 21:16:39 +0000'
      creator: "John Doe"
      content: '<a href="http://example.com">this is a link</a> &amp; <b>this is bold text</b>'
      contentSnippet: 'this is a link & this is bold text'
      guid: 'https://www.reddit.com/r/funny/comments/3skxqc/the_water_is_too_deep_so_he_improvises/'
      categories:
          - funny
      isoDate: '2015-11-12T21:16:39.000Z'
Notes:
  • The contentSnippet field strips out HTML tags and unescapes HTML entities
  • The dc: prefix will be removed from all fields
  • Both dc:date and pubDate will be available in ISO 8601 format as isoDate
  • If author is specified, but not dc:creator, creator will be set to author (see article)
  • Atom's updated becomes lastBuildDate for consistency
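To illustrate the first note, here is a simplified sketch of how a contentSnippet-style value can be derived from content. This is only an illustration, not rss-parser's actual implementation, which handles many more HTML entities:

```javascript
// Strip HTML tags, then unescape a few common entities.
function toSnippet(html) {
  return html
    .replace(/<[^>]*>/g, '')   // strip tags first
    .replace(/&amp;/g, '&')    // then unescape entities
    .replace(/&lt;/g, '<')
    .replace(/&gt;/g, '>')
    .replace(/&quot;/g, '"');
}

const content = '<a href="http://example.com">this is a link</a> &amp; <b>this is bold text</b>';
console.log(toSnippet(content));
// → this is a link & this is bold text
```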

XML Options

Custom Fields

If your RSS feed contains fields that aren't currently returned, you can access them using the customFields option.

let parser = new Parser({
  customFields: {
    feed: ['otherTitle', 'extendedDescription'],
    item: ['coAuthor','subtitle'],
  }
});

parser.parseURL('https://www.reddit.com/.rss', function(err, feed) {
  console.log(feed.extendedDescription);

  feed.items.forEach(function(entry) {
    console.log(entry.coAuthor + ':' + entry.subtitle);
  })
})

To rename fields, you can pass in an array with two items, in the format [fromField, toField]:

let parser = new Parser({
  customFields: {
    item: [
      ['dc:coAuthor', 'coAuthor'],
    ]
  }
})

To pass additional flags, provide an object as the third array item. Two such flags are currently supported:

  • keepArray (default: false) - set to true to return all values for fields that can have multiple entries.
  • includeSnippet (default: false) - set to true to add an additional field, ${toField}Snippet, with HTML stripped out

let parser = new Parser({
  customFields: {
    item: [
      ['media:content', 'media:content', {keepArray: true}],
    ]
  }
})

Default RSS version

If your RSS Feed doesn't contain a <rss> tag with a version attribute, you can pass a defaultRSS option for the Parser to use:

let parser = new Parser({
  defaultRSS: 2.0
});

xml2js passthrough

rss-parser uses xml2js to parse XML. You can pass these options to new xml2js.Parser() by specifying options.xml2js:

let parser = new Parser({
  xml2js: {
    emptyTag: '--EMPTY--',
  }
});

HTTP Options

Timeout

You can set the amount of time (in milliseconds) to wait before the HTTP request times out (default 60 seconds):

let parser = new Parser({
  timeout: 1000,
});

Headers

You can pass headers to the HTTP request:

let parser = new Parser({
  headers: {'User-Agent': 'something different'},
});

Redirects

By default, parseURL will follow up to five redirects. You can change this with options.maxRedirects.

let parser = new Parser({maxRedirects: 100});

Request passthrough

rss-parser uses Node's http/https modules to make requests. You can pass options through to http.get()/https.get() by specifying options.requestOptions:

For example, to allow an unauthorized certificate:

let parser = new Parser({
  requestOptions: {
    rejectUnauthorized: false
  }
});

Contributing

Contributions are welcome! If you are adding a feature or fixing a bug, please be sure to add a test case.

Running Tests

The tests run the RSS parser against several sample RSS feeds in test/input and write the resulting JSON to test/output. If there are any changes to the output files, the tests will fail.

To check if your changes affect the output of any test cases, run

npm test

To update the output files with your changes, run

WRITE_GOLDEN=true npm test

Publishing Releases

npm run build
git commit -a -m "Build distribution"
npm version minor # or major/patch
npm publish
git push --follow-tags


rss-parser's Issues

Uncaught (in promise) Error: Unexpected close tag ; Chrome User-Agent?

I am currently seeing inconsistent behavior across browsers using rss-parser.

I am testing with this RSS feed: http://www.marketwatch.com/rss/topstories

I do get the same error for all RSS feeds I have tried on the same domain.

Here is the relevant code, Vue-flavored:

const CORS_PROXY = 'https://cors-anywhere.herokuapp.com/'

export default {
...
  mounted() {
    require('../../node_modules/rss-parser/dist/rss-parser.js')
    this.getFeeds()
  },

  methods: {
    async getFeeds() {
      const parser = new RSSParser({
        headers: {
          'User-Agent': 'rss-parser',
          'Cache-Control': 'no-cache',
          'Pragma': 'no-cache'
        },
        defaultRSS: 2.0,
        xml2js: {
          strict: true
        }
      })
      const feedRequests = this.$page.frontmatter.feeds.map(feed => {
        return parser.parseURL(CORS_PROXY + feed)
      })
      this.feeds = await Promise.all(feedRequests)
    }
  }
}

In Firefox, this will make a request that returns a valid RSS XML response; in Chrome, this is returning HTML, which makes rss-parser return the following error:

Uncaught (in promise) Error: Unexpected close tag
Line: 9
Column: 7
Char: >
    at error (rss-parser.js?e48f:12624)
    at strictFail (rss-parser.js?e48f:12648)
    at closeTag (rss-parser.js?e48f:12834)
    at SAXParser.write (rss-parser.js?e48f:13397)
    at Parser.parseString (rss-parser.js?e48f:11945)
    at Parser.eval [as parseString] (rss-parser.js?e48f:11620)
    at eval (rss-parser.js?e48f:8448)
    at new Promise (<anonymous>)
    at Parser.parseString (rss-parser.js?e48f:8447)
    at exports.IncomingMessage.eval (rss-parser.js?e48f:8520)

I can change all of the header attributes using the headers option, but User-Agent is the only one that will not change on Chrome (Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/70.0.3538.77 Safari/537.36). I suspect this has something to do with what I'm seeing.

Is there any way I can work around this? I will provide more information if necessary.

add xml close tag fix?

Some RSS/XML feeds are not standard and contain unclosed tags, like:
<img src="xxx">
Could this little problem be fixed in rss-parser?

cors and importing library ES2015 way

Hello, I want to parse rss-feeds client side.

I'm getting error:
Failed to load https://www.reddit.com/.rss: No 'Access-Control-Allow-Origin' header is present on the requested resource. Origin 'http://test.com:8080' is therefore not allowed access. If an opaque response serves your needs, set the request's mode to 'no-cors' to fetch the resource with CORS disabled.

How to deal with this?
I understand this is because I'm parsing feeds client side.

My code is:
import * as Parser from 'rss-parser';
let parser = new Parser();

parser.parseURL('https://www.reddit.com/.rss', function(err, feed) {
  console.log(feed.title);
  feed.items.forEach(entry => console.log(entry.title + ':' + entry.link));
});

Am I importing the library correctly?

Thank you!

Encoding special characters (ISO 639-1)

Hello,

I use rss-parser on some feeds in the Serbian language, which has several special characters, like: Š, Č, Ć, Ž.
All of these are (in rss-parser 2.10.1) replaced with \ufffd.

Is there any way encoding can be improved to cover these (latin-extended, ISO 639-1 codes) characters? Same would apply to several other languages - Slovenian, Croatian, Bosnian, etc.

I love the simplicity of use, despite the lack of async/await support :) but I will have to find an alternative if I can't solve this encoding issue.

Error: Attribute without value

I get this exception and I can't understand what the problem is:

Error: Attribute without value
Line: 52
Column: 15
Char: s
    at error (c:\projects\myproject\node_modules\xml2js\node_modules\sax\lib\sax.js:651:10)
    at strictFail (c:\projects\myproject\node_modules\xml2js\node_modules\sax\lib\sax.js:677:7)
    at SAXParser.write (c:\projects\myproject\node_modules\xml2js\node_modules\sax\lib\sax.js:1340:13)
    at Parser.exports.Parser.Parser.parseString (c:\projects\myproject\node_modules\xml2js\lib\parser.js:322:31)
    at Parser.parseString (c:\projects\myproject\node_modules\xml2js\lib\parser.js:5:59)
    at Promise (c:\projects\myproject\node_modules\rss-parser\lib\parser.js:30:22)
    at Promise (<anonymous>)
    at Parser.parseString (c:\projects\myproject\node_modules\rss-parser\lib\parser.js:29:16)
    at IncomingMessage.res.on (c:\projects\myproject\node_modules\rss-parser\lib\parser.js:97:23)
    at emitNone (events.js:110:20)

Hope someone can help.

Invalid character in RSS XML exception

I ran into the problem of parsing certain RSS feeds that did not have a valid XML structure due to illegal characters.
The problem might be specific to my setup and of little interest to the rss-parser package, but I'll describe it here anyway.
My problem was that I scraped a few RSS channels, and whenever one of these channels had invalid XML (which happens quite often) the app fully stopped.

The exception is thrown on this line: https://github.com/bobby-brennan/rss-parser/blob/master/index.js#L106 . My question is: why is the error not passed as the callback parameter, like so: https://github.com/7Ds7/rss-parser/blob/master/index.js#L107 ?

It would be verbose anyway, and it would not stop the app on a single exception. Is there a deeper reason that I can't figure out for it to throw this exception? Is there any other way to prevent this exception without changing the rss-parser source?

Anyway, many thanks for this package, totally worth it :)

Not working on RSS 1.0

Trying to use it on Craigslist, for example: https://seattle.craigslist.org/search/act?format=rss
I got this error:

events.js:142
throw er; // Unhandled 'error' event
^

TypeError: Cannot read property 'channel' of undefined
at /Users/gordon/workspace/NoLostDog/nld-crawler/node_modules/rss-parser/index.js:26:29
at Parser.<anonymous> (/Users/gordon/workspace/NoLostDog/nld-crawler/node_modules/xml2js/lib/xml2js.js:483:18)
at emitOne (events.js:78:13)
at Parser.emit (events.js:170:7)
at Object.onclosetag (/Users/gordon/workspace/NoLostDog/nld-crawler/node_modules/xml2js/lib/xml2js.js:444:26)
at emit (/Users/gordon/workspace/NoLostDog/nld-crawler/node_modules/sax/lib/sax.js:639:35)
at emitNode (/Users/gordon/workspace/NoLostDog/nld-crawler/node_modules/sax/lib/sax.js:644:5)
at closeTag (/Users/gordon/workspace/NoLostDog/nld-crawler/node_modules/sax/lib/sax.js:903:7)
at Object.write (/Users/gordon/workspace/NoLostDog/nld-crawler/node_modules/sax/lib/sax.js:1436:13)
at Parser.exports.Parser.Parser.parseString (/Users/gordon/workspace/NoLostDog/nld-crawler/node_modules/xml2js/lib/xml2js.js:502:31)

when item contains many "link" fields

Having an issue where the RSS feed contains the following:

<link rel="replies" type="application/atom+xml" href="http://testing.googleblog.com/feeds/3343661551924942594/comments/default" title="Post Comments" />
<link rel="replies" type="text/html" href="http://testing.googleblog.com/2017/07/evolution-of-gtac-and-engineering.html#comment-form" title="4 Comments" />
<link rel="edit" type="application/atom+xml" href="http://www.blogger.com/feeds/15045980/posts/default/3343661551924942594" />
<link rel="self" type="application/atom+xml" href="http://www.blogger.com/feeds/15045980/posts/default/3343661551924942594" />
<link rel="alternate" type="text/html" href="http://feedproxy.google.com/~r/blogspot/RLXA/~3/ptSfzdHRC4s/evolution-of-gtac-and-engineering.html" title="Evolution of GTAC and Engineering Productivity" />


It would be nice if I could specify that the link I want is the one with type="text/html".

Right now all I get is:

{
	"link": "http://testing.googleblog.com/feeds/3343661551924942594/comments/default",
}

Override parseURL proxy settings?

Can we override the parseURL function with custom proxies, or something else to get the required effect?

Maybe something like:

let parser = new Parser({
  proxy: "133.21.34.1:8080",
});

Getting wrong encode

Hey, great job with rss-parser. But when I try to fetch some iso-8.. RSS feeds, the encoding isn't correct. How can I fix this?

not passing multiple custom fields

Seems to only pass the last custom field passed to customFields.

ie:

let parser = new Parser({
    customFields: {
      item: ['s:type','s:type'], 
      item: ['s:vendor','s:vendor'],
      item: ['summary','summary'],
      //item: ['s:tag','s:tag'],
      //item: ['s:variant','s:variant', {keepArray: true}] //<---

    }
  });
   
  parser.parseURL('https://kith.com/collections/all/products.atom', function(err, feed) {
    //console.log(feed.extendedDescription);

if(!err){
    feed.items.forEach(function(entry) {
        //console.log(entry['s:type']);
        //console.log(entry['s:vendor']);
        console.log(entry.summary);
        //console.log(entry['s:tag']);
        //console.log(entry['s:variant']);
        //console.log(entry);
        
    });
    //console.log(feed.items[7]['s:type']);
    //console.log(feed.items[7]['s:vendor']);
    console.log(feed.items[7]['summary']);
    //console.log(feed.items[7]['s:tag']);
    //console.log(feed.items[5]['s:variant']['s:price'][0]['_']);
    //console.log(feed.items[5].summary);
   }else{
       console.log(err.message);
   };
});

Seems to only pass THE LAST item added to "customFields".
's:vendor' will show undefined until "item: ['summary','summary']," is commented out; then it will work.

problem validating feed url

I am making an API which validates whether a given feed URL is valid or not. I tested that URL with some online feed validators. They gave XML parsing errors for the URL, but when I tried testing it with rss-parser, it successfully returned data from the URL.

Here is the url for testing http://independentbanker.org/feed/

I want to return error if there is any error while parsing XML data.

How to get data and display on the screen ?

I just started using rss-parser yesterday, and I can't control the parsed data: the RSS values are only accessible inside the async function, so I can't push them into an array and render them.

I used: latest rss-parser, create-react-app

Using Promises

Rather than putting everything in an async function, like in your example

let Parser = require('rss-parser');
let parser = new Parser();

(async () => {

  let feed = await parser.parseURL('https://www.reddit.com/.rss');
  console.log(feed.title);

  feed.items.forEach(item => {
    console.log(item.title + ':' + item.link)
  });

})();

You can do something like:

let Parser = require('rss-parser');
let parser = new Parser();

function rssPromise(url) {
  return new Promise((res, rej) => {
    (async () => {
      try { res(await parser.parseURL(url)) } 
      catch (err) { rej(err) }
    })();
  })
}

function print(feed) {
  console.log(feed.title);

  feed.items.forEach(item => {
    console.log(item.title + ':' + item.link)
  });
}

rssPromise('https://www.reddit.com/.rss').then(print).catch(err => { throw err });

Library quietly hanging on occasionally bad RSS feed.

My fetchRSS function is below. I'm calling this function once per minute with an async timer, and fetchRSS gets about 25 feeds in a loop.

My issue is that after a period of time (5-20mins) the call to parseURL (with a different feed failing each time) will just never return.
"let feed = await parser.parseURL(currentRssFeed)" //sometimes does not return

I'm not passing any argument for retries, so I should be getting the default number of 5.

Before I tear apart my code, is there a failure mechanism within parseURL that wouldn't at least return an error?

Per the code below I'll eventually see one feed print "Fetching: ..." but never get to the "Got: ..." printout. So, it's hanging somewhere in parseURL()

Any help/guidance would be much appreciated.

[screenshot of the fetchRSS function code]

Is there any way to pass async option to XML2JS

I am getting the following error. I tried to fix it by passing async: true and it worked. Is there any workaround for this issue?

Error:
D:\Backend\node_modules\xml2js\lib\parser.js:329
throw err;
^

Error: Text data outside of root node.
Line: 1
Column: 1
Char: 
at error (D:\Backend\node_modules\sax\lib\sax.js:651:10)
at strictFail (D:\Backend\node_modules\sax\lib\sax.js:677:7)
at Object.write (D:\Backend\node_modules\sax\lib\sax.js:1035:15)
at Parser.exports.Parser.Parser.parseString (D:\Backend\node_modules\xml2js\lib\parser.js:322:31)
at Parser.parseString (D:\Backend\node_modules\xml2js\lib\parser.js:5:59)
at Object.exports.parseString (D:\Backend\node_modules\xml2js\lib\parser.js:354:19)
at Object.Parser.parseString (D:\Backend\node_modules\rss-parser\index.js:250:10)
at IncomingMessage.<anonymous> (D:\Backend\node_modules\rss-parser\index.js:297:21)
at emitNone (events.js:91:20)
at IncomingMessage.emit (events.js:185:7)
at endReadableNT (_stream_readable.js:974:12)
at _combinedTickCallback (internal/process/next_tick.js:80:11)
at process._tickCallback (internal/process/next_tick.js:104:9)

jspm compatibility

I tried using this library using jspm.
The file loads, but the callback function of parseURL is never called.
Is this an incompatibility with jspm?
Any ideas?

Can't get 'isoDate' of 'the verge'

My source code:

let Parser = require('rss-parser');
let parser = new Parser();

(async () => {
 
    let feed = await parser.parseURL('https://www.theverge.com/rss/index.xml');
    console.log(feed.title);
    console.log(feed.items[0])
   
  })();

The output:

The Verge -  All Posts
{ title: 'Twitter finally draws a line on extremism',
  link: 'https://www.theverge.com/2018/8/14/17686856/twitter-proud-boys-ban-alex-jones',
  pubDate: '2018-08-14T10:00:02.000Z',
  author: 'Casey Newton',
  content: '  \n    <img alt="" src="https://cdn.vox-cdn.com/thumbor/EoSybAJL-nFKN4H6fezFy0-azv8=/0x0:2040x1360/1310x873/cdn.vox-cdn.com/uploads/chorus_image/image/60827987/mdoying_180118_2249_twitter_0670stills.0.jpg" />\n\n\n\n  <p id="qYqT9b">On Friday I wrote about <a href="https://www.theverge.com/2018/8/11/17677518/alex-jones-ban-facebook-twitter-youtube">Twitter’s seeming paralysis when it came to enforcing its platform rules</a>. What, exactly, was going on over there? Late Friday evening, we got an answer of sorts. The company invited Cecilia Kang and Kate Conger of <em>The New York Times</em> to sit in on a meeting in which CEO Jack Dorsey and 18 of his colleagues debated safety policies. <a href="https://www.nytimes.com/2018/08/10/technology/twitter-free-speech-infowars.html">The meeting was rather … inconclusive</a>, they report:</p>\n<blockquote><p id="ucI5TJ">For about an hour, the group tried to get a handle on what constituted dehumanizing speech. At one point, Mr. Dorsey wondered if there was a technology solution. There was no agreement on an answer.</p></blockquote>\n<p id="i06dh6">Elsewhere in the piece, executives sound other notes we’ve heard before from this and other platforms: Free speech is valuable. Moderation...</p>\n  <p>\n    <a href="https://www.theverge.com/2018/8/14/17686856/twitter-proud-boys-ban-alex-jones">Continue reading&hellip;</a>\n  </p>\n\n',
  contentSnippet: 'On Friday I wrote about Twitter’s seeming paralysis when it came to enforcing its platform rules. What, exactly, was going on over there? Late Friday evening, we got an answer of sorts. The company invited Cecilia Kang and Kate Conger of The New York Times to sit in on a meeting in which CEO Jack Dorsey and 18 of his colleagues debated safety policies. The meeting was rather … inconclusive, they report:\nFor about an hour, the group tried to get a handle on what constituted dehumanizing speech. At one point, Mr. Dorsey wondered if there was a technology solution. There was no agreement on an answer.\nElsewhere in the piece, executives sound other notes we’ve heard before from this and other platforms: Free speech is valuable. Moderation...\n  \n    Continue reading&hellip;',
  id: 'https://www.theverge.com/2018/8/14/17686856/twitter-proud-boys-ban-alex-jones' }

Potential issue with self closing tags or not completely parsing the feed

Great work on the parser, but it seems like it is not able to parse certain feeds.

Expected: Full parsing of attributes and entries array
Actual: Partial parsing of attributes and only header info of entries

Repro:
Here's a live repo: https://runkit.com/duluca/rss-parser-parsing-issue-repro
or

const {promisify} = require('util')

const parser = require('rss-parser')
const parseURLAsync = promisify(parser.parseURL)


parseURLAsync('http://www.rferl.org/mobapp/articles.xml')
.then((data) => {
    console.log(data)
})

Status code 406

I'm trying to parse the Changelog RSS feed which is located at the following URL:
https://changelog.com/podcast/feed
but I'm getting status code 406.

After a short investigation, it looks like the response Content-Type is application/xml while we're requesting application/rss+xml.

ERROR in ./~/rss-parser/index.js Module not found: Error: Cannot resolve module 'fs'...

Hey, I've just downloaded and included your module in my project.
I am using webpack for bundling.

All I did was include the package in main.js like this:

import rss-parser from 'rss-parser';

It results an error:

ERROR in ./~/rss-parser/index.js Module not found: Error: Cannot resolve module 'fs' in /path/to/project/root/node_modules/rss-parser @ ./~/rss-parser/index.js 2:9-22

In webpack I use only babel-loader and json-loader.

My package.json dependencies:

"dependencies": {
    "babel-core": "^6.17.0",
    "babel-loader": "^6.2.5",
    "babel-preset-es2015": "^6.16.0",
    "copy-webpack-plugin": "^3.0.1",
    "json-loader": "^0.5.4",
    "rss-parser": "^2.5.2",
    "webpack": "^1.13.2"
}

Could you please help me to resolve this issue?
Thanks!

Does not parse <content:encoded> and <guid> elements (among others)

Trying to use rss-parser with an Instant Articles feed, the following RSS results in incomplete data:

<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" version="2.0">
  <channel>
    <title>Instant Article Test</title>
    <link>https://localhost:8000</link>
    <description>1, 2, 1, 2… check the mic!</description>
    <language>en</language>
    <pubDate>Fri, 13 May 2016 15:14:05 GMT</pubDate>
    <dc:date>2016-05-13T15:14:05Z</dc:date>
    <dc:language>en</dc:language>
    <item>
      <title>My first Instant Article</title>
      <link>https://localhost:8000</link>
      <description>Lorem ipsum</description>
      <content:encoded>&lt;b&gt;Lorem&lt;/b&gt; ipsum</content:encoded>
      <pubDate>Wed, 04 May 2016 06:53:45 GMT</pubDate>
      <guid>https://localhost:8000</guid>
      <dc:creator>tobi</dc:creator>
      <dc:date>2016-05-04T06:53:45Z</dc:date>
    </item>
  </channel>
</rss>

This is the result in JSON; note the missing guid and content:encoded data:

{
  "feed": {
    "entries": [
      {
        "title": "My first Instant Article",
        "link": "https://localhost:8000",
        "pubDate": "Wed, 04 May 2016 06:53:45 GMT",
        "content": "Lorem ipsum",
        "contentSnippet": "Lorem ipsum"
      }
    ],
    "title": "Instant Article Test",
    "description": "1, 2, 1, 2… check the mic!",
    "link": "https://localhost:8000"
  }
}

create-react-app can't minify parser.js

Hi there,

Having a problem with the library building from create-react-app from v3.0.0 onwards. 2.12.1 is fine. On running npm run build, I get:

> react-scripts build

Creating an optimized production build...
Failed to compile.

Failed to minify the code from this file:

 	./node_modules/rss-parser/lib/parser.js:16

Read more here: http://bit.ly/2tRViJ9

Looks like an ES5/ES6 issue?

Unhandled error when decorating iTunes result without an image URL

parser.parseURL() throws the following error for podcast rss feeds that don't have an image URL:

Cannot read property 'href' of undefined

This happens occasionally and I've got about 10 podcasts for which I can reproduce. I've forked and fixed by just not adding an image property to itunes during decoration. I'll send a PR shortly to fix if it looks 👌 .

Newer than 2.7 versions throwing Buffer.allocUnsafe

Bug

When doing npm install and pulling 2.10.2 (I believe this also happens in other versions newer than 2.7) the browser throws an error about Buffer.allocUnsafe.

[two screenshots of the error in the browser console]

Notes

Buffer is a Node-specific API. Since it is not available in the browser, it'll throw the exception. Maybe it's the way the package is being built and packaged?

ERROR TypeError: rss_parser_dist_rss_parser_min_js__WEBPACK_IMPORTED_MODULE_3__.RSSParser is not a constructor

I am using rss-parser in my Angular project. In a .ts file I am writing the code below inside ngOnInit(), and I am getting this error:

ERROR TypeError: rss_parser_dist_rss_parser_min_js__WEBPACK_IMPORTED_MODULE_3__.RSSParser is not a constructor

import { RSSParser } from 'rss-parser/dist/rss-parser.min.js';

ngOnInit() {
  const CORS_PROXY = "https://cors-anywhere.herokuapp.com/"

  let parser = new RSSParser();
  parser.parseURL(CORS_PROXY + 'https://www.reddit.com/.rss', function(err, feed) {
    console.log(feed.title);
    feed.items.forEach(function(entry) {
      console.log(entry.title + ':' + entry.link);
    })
  })
}
Thanks in Advance

Does not parse links for Google feedburner feeds

I'm trying to parse a few feedburner feeds, for example:

http://feeds.feedburner.com/blogspot/lQlzL

The link that is returned is of the: type="application/atom+xml". I'm not able to target the link to go to the type="text/html". The RSS feed also has <feedburner:origLink>http://googleadsdeveloper.blogspot.com/2016/06/register-now-for-fall-2016-adwords-api.html</feedburner:origLink>, which would work as well for the link. This may be too specific of an issue, but still thought I'd try.

Incorrectly rewrites URL scheme

I used the example code locally (serving from the file system) which fetches https://www.reddit.com/.rss, and I'm getting

rss-parser.js:5812 Fetch API cannot load file://www.reddit.com/.rss. URL scheme must be "http" or "https" for CORS request.

That's using dist/rss-parser.js in version 2.5.0.


If I downgrade to 2.4.0, (which in the /dist/rss-parser.js file is labeled 2.2.4), I get the following error:

Fetch API cannot load https://www.reddit.com/.rss. No 'Access-Control-Allow-Origin' header is present on the requested resource. Origin 'null' is therefore not allowed access. If an opaque response serves your needs, set the request's mode to 'no-cors' to fetch the resource with CORS disabled.

(Initially I didn't realize you updated it and wondered why I was suddenly getting a different error).
