
wikipedia's People

Contributors

0xflotus, bigmistqke, bumbummen99, dopecodez, friendofdog, github-actions[bot], greeshmareji, gtibrett, yg-i, zoetrope69


wikipedia's Issues

Error using page() to get infobox()

Do you have any thoughts on what "Invalid attempt to destructure non-iterable instance" is referring to in this context?

/PROJECTS/research/node_modules/wikipedia/dist/page.js:256
                throw new errors_1.infoboxError(error);
                      ^

infoboxError: infoboxError: TypeError: Invalid attempt to destructure non-iterable instance
    at Page.infobox (/PROJECTS/research/node_modules/wikipedia/dist/page.js:256:23)
    at process.processTicksAndRejections (node:internal/process/task_queues:95:5) {
  code: undefined
}

Node.js v18.16.0

The code:

const wiki = require('wikipedia')
let page, infobox
async function getPage(input) {
  try {
    page = await wiki.page(input)
    infobox = await page.infobox()
    console.log(infobox)
  } catch (error) {
    console.log(error)
  }
  return infobox
}

getPage('John M. Vining')

How do I pass page url instead of page text?

There are no clear instructions on page input parameters; there is just a single example with the text input 'Batman'.
How do I pass a page URL as the input instead?

As an example: use page('Oliver Ellsworth') and then retrieve the infobox().
Then do the same with page('1st United States Congress').
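The library takes page titles, not URLs, so one workaround is to strip the title out of the URL before calling the library. This is a sketch, not part of the package's documented API; `titleFromUrl` is a hypothetical helper:

```javascript
// Sketch: derive a page title from a standard /wiki/<title> Wikipedia URL.
// titleFromUrl is a hypothetical helper, not part of the wikipedia package.
function titleFromUrl(url) {
  const path = new URL(url).pathname;        // "/wiki/Oliver_Ellsworth"
  const raw = path.replace(/^\/wiki\//, ''); // "Oliver_Ellsworth"
  return decodeURIComponent(raw).replace(/_/g, ' ');
}

// With the library, usage would then look like:
//   const page = await wiki.page(titleFromUrl('https://en.wikipedia.org/wiki/Oliver_Ellsworth'));
//   const infobox = await page.infobox();
console.log(titleFromUrl('https://en.wikipedia.org/wiki/1st_United_States_Congress'));
// → 1st United States Congress
```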

Implement mobile html

There are a lot of new REST APIs for wikipedia present in the REST API docs.

We'll look through them one by one and implement them. Anyone who wants to pick up any other REST API should ideally open a new issue.

The REST API has a /page/mobile-html/{title} endpoint which provides mobile-friendly HTML.

Implementation for this should follow the summary or related method flow. Remember to write unit tests for all possible scenarios in your new functions, and try to use types as far as possible.

I'll be happy to help anyone who wants to pick this up.

Implement mobile sections

There are a lot of new REST APIs for wikipedia present in the REST API docs.

We'll look through them one by one and implement them. Anyone who wants to pick up any other REST API should ideally open a new issue.

The REST API has a /page/mobile-sections/{title} endpoint which provides the page as mobile-friendly HTML sections.

Implementation for this can follow #17. Remember to write unit tests for all possible scenarios in your new functions, and try to use types as far as possible.

Implement pdf api

There are a lot of new REST APIs for wikipedia present in the REST API docs.

We'll look through them one by one and implement them. Anyone who wants to pick up any other REST API should ideally open a new issue.

The REST API has a /page/pdf/{title} endpoint which provides the page in PDF format.

The API returns a file for direct download, so my initial thought is that we'll have to stream the data to actually get the file to the user.

Any discussion on this is welcome.

When I attempt to make multiple requests at once (parallel or sequential) I get a lot of pageErrors, even on valid items

When I do these one at a time they all resolve with a page; when I do more than ten at a time they start to throw pageErrors. I made a little code sample that illustrates the issue:

const wiki = require('wikipedia');

const subjects = [ "University of Washington", "USC Gould School of Law", "Watergate", "Supreme Court", "Justice Clarence Thomas", "Harlan Crow", "resignation", "impeachment", "public trust", "code of ethics", "University of Washington", "USC Gould School of Law", "Watergate", "Supreme Court", "Justice Clarence Thomas", "Harlan Crow", "resignation", "impeachment", "public trust", "code of ethics" ];

async function GetWikiSummary(subject) {
    let result = {};

    try {
        result.subject = subject;
        const page = await wiki.page(subject);
        result.canonicalurl = page.canonicalurl;
    } catch (error) {
        result.error = error;
    }

    return result;
}

async function getWikiSummaries(subjects) {
    const results = [];
  
    for (const subject of subjects) {
      try {
        const summary = await GetWikiSummary(subject);
        results.push(summary);
      } catch (error) {
        results.push({ subject });
      }
    }
  
    return results;
}

console.log("Starting");

(async () => {
    const converted = await getWikiSummaries(subjects);
    //const converted = await GetWikiSummary('impeachment');
    console.log(JSON.stringify(converted, null, 2));
})();

console.log("Done");

The list is deliberately made of duplicated items to show that the first 10 resolve while the last 10 (even though they are the same) throw page errors. If I have a list of 20 items, what is the recommended way to get all 20?

the result of the above code looks like this:

Done
[
  {
    "subject": "University of Washington",
    "canonicalurl": "https://en.wikipedia.org/wiki/University_of_Washington"
  },
  {
    "subject": "USC Gould School of Law",
    "canonicalurl": "https://en.wikipedia.org/wiki/USC_Gould_School_of_Law"
  },
  {
    "subject": "Watergate",
    "canonicalurl": "https://en.wikipedia.org/wiki/Watergate_scandal"
  },
  {
    "subject": "Supreme Court",
    "canonicalurl": "https://en.wikipedia.org/wiki/Supreme_court"
  },
  {
    "subject": "Justice Clarence Thomas",
    "canonicalurl": "https://en.wikipedia.org/wiki/Clarence_Thomas"
  },
  {
    "subject": "Harlan Crow",
    "canonicalurl": "https://en.wikipedia.org/wiki/Harlan_Crow"
  },
  {
    "subject": "resignation",
    "canonicalurl": "https://en.wikipedia.org/wiki/Resignation"
  },
  {
    "subject": "impeachment",
    "canonicalurl": "https://en.wikipedia.org/wiki/Impeachment"
  },
  {
    "subject": "public trust",
    "canonicalurl": "https://en.wikipedia.org/wiki/Public_trust"
  },
  {
    "subject": "code of ethics",
    "canonicalurl": "https://en.wikipedia.org/wiki/Ethical_code"
  },
  {
    "subject": "University of Washington",
    "error": {
      "name": "pageError"
    }
  },
  {
    "subject": "USC Gould School of Law",
    "error": {
      "name": "pageError"
    }
  },
  {
    "subject": "Watergate",
    "error": {
      "name": "pageError"
    }
  },
  {
    "subject": "Supreme Court",
    "error": {
      "name": "pageError"
    }
  },
  {
    "subject": "Justice Clarence Thomas",
    "error": {
      "name": "pageError"
    }
  },
  {
    "subject": "Harlan Crow",
    "error": {
      "name": "pageError"
    }
  },
  {
    "subject": "resignation",
    "error": {
      "name": "pageError"
    }
  },
  {
    "subject": "impeachment",
    "error": {
      "name": "pageError"
    }
  },
  {
    "subject": "public trust",
    "canonicalurl": "https://en.wikipedia.org/wiki/Public_trust"
  },
  {
    "subject": "code of ethics",
    "error": {
      "name": "pageError"
    }
  }
]

Thanks for the help!
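If the root cause is hitting the API too quickly, one workaround (a sketch, not an official recommendation from the library) is to throttle the list into small batches with a pause between them:

```javascript
// Sketch: run an async fn over items in batches of `size`, pausing between
// batches. The batch size and pause length are guesses, tune to taste.
async function inBatches(items, fn, size = 5, pauseMs = 1000) {
  const results = [];
  for (let i = 0; i < items.length; i += size) {
    const batch = items.slice(i, i + size).map(fn);
    results.push(...(await Promise.all(batch)));
    if (i + size < items.length) {
      // Back off before starting the next batch.
      await new Promise((r) => setTimeout(r, pauseMs));
    }
  }
  return results;
}

// Usage with the snippet above:
//   const converted = await inBatches(subjects, GetWikiSummary, 5, 1000);
```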

Implement media list REST API

There are a lot of new REST APIs for wikipedia present in the REST API docs.

We'll look through them one by one and implement them. Anyone who wants to pick up any other REST API should ideally open a new issue.

The REST API has a /page/media-list/{title} endpoint which lists the media files used in the page. This is something I would love to have in wikipedia.

Implementation for this should follow the summary or related method flow. Remember to write unit tests for all possible scenarios in your new functions, and try to use types as far as possible.

I'll be happy to help anyone who wants to pick this up.

geoSearchError: wikiError: TypeError: url_1.URLSearchParams is not a constructor

I'm currently working on a Vue application that has the following method

    async getLocations() {
      this.pages = []
      try {
        const geoResult = await wiki.geoSearch(2.088, 4.023, {
          radius: 5000,
          limit: 20,
        })
        console.log(geoResult[0]) // the closest page to given coordinates
      } catch (error) {
        console.log(error)
      }
    }

Unfortunately it returns this exception:

geoSearchError: wikiError: TypeError: url_1.URLSearchParams is not a constructor
    at AsyncFunction.wiki.geoSearch (webpack-internal:///./node_modules/wikipedia/dist/index.js:469:15)    
wiki.geoSearch = async (latitude, longitude, geoOptions) => {
    try {
        const geoSearchParams = {
            'list': 'geosearch',
            'gsradius': (geoOptions === null || geoOptions === void 0 ? void 0 : geoOptions.radius) || 1000,
            'gscoord': `${latitude}|${longitude}`,
            'gslimit': (geoOptions === null || geoOptions === void 0 ? void 0 : geoOptions.limit) || 10,
            'gsprop': 'type'
        };
        const results = await request_1.default(geoSearchParams);
        const searchPages = results.query.geosearch;
        return searchPages;
    }
    catch (error) {
        throw new errors_1.geoSearchError(error);
    }
};
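One possible explanation, assuming a webpack 5 build (which stopped auto-polyfilling Node core modules): the library's compiled code imports URLSearchParams from Node's 'url' module, which doesn't exist in the browser bundle. A sketch of a workaround is to point webpack at the 'url' polyfill package (which must be installed separately):

```javascript
// webpack.config.js (sketch, assumes webpack 5 and the 'url' npm package)
module.exports = {
  resolve: {
    fallback: {
      // Provide a browser polyfill for Node's 'url' core module.
      url: require.resolve('url/'),
    },
  },
};
```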

Error with proxy

How can I use it with a proxy? I get this error:

searchError: wikiError: FetchError: request to https://en.wikipedia.org/w/api.php?list=search&srprop=&srlimit=3&srsearch=Who%20is%20Harry%20Potter?&srinfo=suggestion&format=json&redirects=&action=query&origin=*& failed, reason: connect ECONNREFUSED 185.15.58.224:443
    at AsyncFunction.wiki.search (D:\developpement\Nodejs\wikipedia\node_modules\wikipedia\dist\index.js:55:15)
    at processTicksAndRejections (node:internal/process/task_queues:96:5)
    at async D:\developpement\Nodejs\wikipedia\index.js:5:31 {
  code: undefined
}

Incorrect data parsed from Infobox

https://en.wikipedia.org/wiki/All_Around_the_World_(Lisa_Stansfield_song)

const page = await wiki.page(pageTitle);
return page.infobox({ redirect: false });

returns

{
  name: 'All Around the World',
  cover: 'Lisa Stansfield - All Around the World.jpg',
  border: true,
  caption: 'Artwork for releases outside North America',
  type: 'Singles',
  artist: '2003',
  album: 'Affection (Lisa Stansfield album)',
  bSide: '"Wake Up Baby" (7"),"The Way You Want It" (12")',
  released: '16 October 1989',
  recorded: '1989',
  length: 'Duration',
  label: 'Arista Records',
  writer: [ 'Lisa Stansfield', 'Ian Devaney', 'Andy Morris' ],
  producer: [ 'Ian Devaney', 'Andy Morris' ],
  prevTitle: '8-3-1',
  prevYear: '2001',
  nextTitle: 'Too Hot (Kool & the Gang song)',
  nextYear: 'External music video',
  misc: 'Extra chronology',
  title: 'All Around the World (Norty Cotto Mixes)',
  year: '2003'
}

artist: '2003' is off; the values appear shifted relative to their keys.

CORS error fetching summary in Firefox

I created a vue app where I want to show info to a specific location.
This is my code for fetching the summary:
const summary = await wiki.summary(pageName);

In chrome everything works perfectly, but in Firefox I'm getting this error:

Cross-Origin Request Blocked: The Same Origin Policy disallows reading the remote resource at https://de.wikipedia.org/api/rest_v1/page/summary/Stuttgart. (Reason: header ‘user-agent’ is not allowed according to header ‘Access-Control-Allow-Headers’ from CORS preflight response).

Failing coverage check on forked Pull Requests

As seen on #17, #12, #11 and any other forked PRs to master, the coverage check fails.

The check fails because CC_TEST_REPORTER_ID, which is used to upload test reports to Code Climate, is not available to forks. The discussion at https://github.community/t/make-secrets-available-to-builds-of-forks/16166 is inconclusive, meaning we have to find our own solution or remove the check completely.

Possible solutions include:

  1. Find a way to make CC_TEST_REPORTER_ID available to forks, following the link above.
  2. Remove the check from PRs. This will involve playing around with the main.yaml GitHub Actions file to get it just right.
  3. (Non-ideal) Make CC_TEST_REPORTER_ID public. This is something we really shouldn't do, as people reusing parts of the code might end up using this secret.

Other languages not fully working

Getting results in other languages has problems. Grabbing a page works, but things like summaries and On This Day are not working. It seems like the wrong REST URL is being built.
For example, for the Swedish site the summary request should go to sv.wikipedia.org/api/rest_v1/page/summary/Stockholm,
but it tries to use sv.wikipedia.org/v1/page/page/summary/Stockholm instead.

const wiki = require('wikipedia');
 
(async () => {
    try {
        const changedLang = await wiki.setLang('sv');
        const page = await wiki.page('Stockholm'); // Works
        const summary = await wiki.summary('Stockholm'); // Fails
        console.log(page, summary);
    } catch (error) {
        console.log(error);
    }
})();

Implement generate citation data

There are a lot of new REST APIs for wikipedia present in the REST API docs.

We'll look through them one by one and implement them. Anyone who wants to pick up any other REST API should ideally open a new issue.

The REST API has a /data/citation/{format}/{query} endpoint which provides citation data for a given URL.

Implementation for this can follow #17. Remember to write unit tests for all possible scenarios in your new functions, and try to use types as far as possible.

Implement random page API

There are a lot of new REST APIs for wikipedia present in the REST API docs.

We'll look through them one by one and implement them. Anyone who wants to pick up any other REST API should ideally open a new issue.

The REST API has a /page/random/{format} endpoint which gives a random page in the given format. This is something I would love to have in wikipedia. More details are in the REST API docs.

Implementation for this should follow the summary or related method flow. Remember to write unit tests for all possible scenarios in your new functions, and try to use types as far as possible.

I'll be happy to help anyone who wants to pick this up.

Dropping support for node 10, 12

We are planning to drop support for node versions:

10.x.x
12.x.x

Our minimum supported version will be node 14.x.x.

It would be great to hear from the community if dropping these versions would cause any issues.

Clarity on browser support

The README claims that the package can be used in browsers, but I couldn't find any documentation on it.

Is this actually feasible, and if so, how?

Using `instanceof` to detect a `pageError`

I want to detect when a page doesn't exist, so I'm catching exceptions and trying to work out whether the exception is a pageError.

So I'm trying to use:

      if (wikiError instanceof pageError) {

It works if I import the class using:

import { pageError } from "wikipedia/dist/errors";

But if I use the barrelled main export from the d.ts, like so:

import { pageError } from "wikipedia";

It fails with

Right-hand side of 'instanceof' is not an object

Obviously I don't want to rely on digging into the dist folder; any ideas?
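One workaround that avoids importing from dist, assuming the error's `name` property is set to 'pageError' (as the stringified errors elsewhere in this thread suggest), is to match on the name instead of the class:

```javascript
// Sketch: detect a pageError by its `name` property rather than instanceof,
// so no import from wikipedia/dist is needed. The 'pageError' name is an
// assumption based on the serialized errors seen in this issue tracker.
function isPageError(err) {
  return err instanceof Error && err.name === 'pageError';
}

// try { await wiki.page(title); } catch (e) { if (isPageError(e)) { /* ... */ } }
```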

Move from node-fetch to got,ky or axios

Issue

Updating node-fetch is a headache because the module uses module formats that are not supported by modern TypeScript compilers and Jest, both of which are very widely used. Additionally, node-fetch is a few megabytes larger, as shown in https://www.npmjs.com/package/got#comparison, and is less actively maintained.

Solution

Analyze the other major HTTP modules like got, ky, or axios and transition wikipedia to one of these libraries instead of node-fetch.
My initial preference is got because of its small size and active maintenance, but other suggestions are welcome.

Implement featured content api

There are a lot of new REST APIs for wikipedia present in the REST API docs.

We'll look through them one by one and implement them. Anyone who wants to pick up any other REST API should ideally open a new issue.

The REST API has a /feed/featured/{year}/{mm}/{dd} endpoint which provides featured content for that particular day. Implementation for this can follow #8.

Implementation for this should follow the summary or related method flow. Remember to write unit tests for all possible scenarios in your new functions, and try to use types as far as possible.

Implement events on this day API

There are a lot of new REST APIs for wikipedia present in the REST API docs.

We'll look through them one by one and implement them. Anyone who wants to pick up any other REST API should ideally open a new issue.

The REST API has a /feed/onthisday/{type}/{mm}/{dd} endpoint which provides events that historically happened on the given day and month. We should support month and date as strings and also support the different event types.

Implementation for this should follow the summary or related method flow. Remember to write unit tests for all possible scenarios in your new functions, and try to use types as far as possible.

I'll be happy to help anyone who wants to pick this up.
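To make the requirements concrete, here is a sketch of the URL the proposed method would hit; zero-padding covers the "month and date in string format" point. This is illustrative only, not the implemented API:

```javascript
// Sketch: build the onthisday feed URL. `type` would be one of the event
// categories the endpoint supports (e.g. 'events', 'births', 'deaths').
function onThisDayUrl(type, month, day) {
  const mm = String(month).padStart(2, '0'); // accept 1 or '1' -> '01'
  const dd = String(day).padStart(2, '0');
  return `https://en.wikipedia.org/api/rest_v1/feed/onthisday/${type}/${mm}/${dd}`;
}

console.log(onThisDayUrl('events', 1, 5));
// → https://en.wikipedia.org/api/rest_v1/feed/onthisday/events/01/05
```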
