dopecodez / wikipedia
Wikipedia for node and the browser
License: MIT License
There are a lot of new REST APIs for Wikipedia present in the REST API docs.
We'll look through them one by one and implement them. Anyone who wants to pick up any other REST API should ideally open a new issue.
The REST API has a /page/mobile-html/{title} endpoint which provides mobile-friendly HTML.
Implementation for this should follow the summary or related method flow. Remember to write unit tests for all possible scenarios in your new functions, and try to use types as far as possible.
I'll be happy to help anyone who wants to pick this up.
There are a lot of new REST APIs for Wikipedia present in the REST API docs.
We'll look through them one by one and implement them. Anyone who wants to pick up any other REST API should ideally open a new issue.
The REST API has a /page/pdf/{title} endpoint which provides the page in PDF format.
The API returns a file for straight download, so my initial thought is that we'll have to stream the data to actually get the file to the user.
Any discussion on this is welcome.
How can I use it with a proxy? I get this error:
searchError: wikiError: FetchError: request to https://en.wikipedia.org/w/api.php?list=search&srprop=&srlimit=3&srsearch=Who%20is%20Harry%20Potter?&srinfo=suggestion&format=json&redirects=&action=query&origin=*& failed, reason: connect ECONNREFUSED 185.15.58.224:443
at AsyncFunction.wiki.search (D:\developpement\Nodejs\wikipedia\node_modules\wikipedia\dist\index.js:55:15)
at processTicksAndRejections (node:internal/process/task_queues:96:5)
at async D:\developpement\Nodejs\wikipedia\index.js:5:31 {
code: undefined
}
As seen on #17, #12, #11, and any other forked PRs to master, the coverage check fails.
The check fails due to CC_TEST_REPORTER_ID, which is used to upload test reports to Code Climate, not being available to forks. The discussion at https://github.community/t/make-secrets-available-to-builds-of-forks/16166 is inconclusive, meaning we have to find our own solution or remove the check completely.
Possible solutions include:
Making CC_TEST_REPORTER_ID available to forks following the above link.
Making CC_TEST_REPORTER_ID public. This is something we really shouldn't do, as people using parts of the code might end up using this secret.
page: https://fr.wikipedia.org/wiki/Marseille
Wikipedia value: 13055 et de [[Secteurs et arrondissements de Marseille|13201 à 13216]]
Expected: 13055 et de 13201 à 13216 (probably?)
Actual: Secteurs et arrondissements de Marseille
await wiki.setLang('fr');
const page = await wiki.page('Marseille');
console.log(await page.infobox());
There are a lot of new REST APIs for Wikipedia present in the REST API docs.
We'll look through them one by one and implement them. Anyone who wants to pick up any other REST API should ideally open a new issue.
The REST API has a /feed/featured/{year}/{mm}/{dd} endpoint which provides featured content for that particular day. Implementation for this can follow #8.
Implementation for this should follow the summary or related method flow. Remember to write unit tests for all possible scenarios in your new functions, and try to use types as far as possible.
The README claims that the package can be used in browsers, but I couldn't find any documentation on it.
Is this actually feasible, and if so, how?
I'm currently working on a Vue application that has the following method
async getLocations() {
  this.pages = []
  try {
    const geoResult = await wiki.geoSearch(2.088, 4.023, {
      radius: 5000,
      limit: 20,
    })
    console.log(geoResult[0]) // the closest page to given coordinates
  } catch (error) {
    console.log(error)
  }
}
Unfortunately it returns this exception:
geoSearchError: wikiError: TypeError: url_1.URLSearchParams is not a constructor
at AsyncFunction.wiki.geoSearch (webpack-internal:///./node_modules/wikipedia/dist/index.js:469:15)
wiki.geoSearch = async (latitude, longitude, geoOptions) => {
try {
const geoSearchParams = {
'list': 'geosearch',
'gsradius': (geoOptions === null || geoOptions === void 0 ? void 0 : geoOptions.radius) || 1000,
'gscoord': `${latitude}|${longitude}`,
'gslimit': (geoOptions === null || geoOptions === void 0 ? void 0 : geoOptions.limit) || 10,
'gsprop': 'type'
};
const results = await request_1.default(geoSearchParams);
const searchPages = results.query.geosearch;
return searchPages;
}
catch (error) {
throw new errors_1.geoSearchError(error);
}
};
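For context on the error above: URLSearchParams is available as a global in modern browsers and in Node, so building the query string without importing it from the 'url' module sidesteps this class of bundling problem. A sketch to illustrate, not the library's actual fix:

```javascript
// URLSearchParams is a global in browsers and in Node (>= 10), so no
// require('url') is needed; bundlers like webpack can then resolve it
// without shimming Node's 'url' module.
const params = new URLSearchParams({
  list: 'geosearch',
  gscoord: '2.088|4.023', // latitude|longitude
  gslimit: '20',
});

// '|' is percent-encoded as %7C in the serialized query string
const query = params.toString();
```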
Do you have any thoughts on what "Invalid attempt to destructure non-iterable instance" is referring to in this context?
/PROJECTS/research/node_modules/wikipedia/dist/page.js:256
throw new errors_1.infoboxError(error);
^
infoboxError: infoboxError: TypeError: Invalid attempt to destructure non-iterable instance
at Page.infobox (/PROJECTS/research/node_modules/wikipedia/dist/page.js:256:23)
at process.processTicksAndRejections (node:internal/process/task_queues:95:5) {
code: undefined
}
Node.js v18.16.0
The code:
const wiki = require('wikipedia')
let page, infobox
async function getPage(input) {
  try {
    page = await wiki.page(input)
    infobox = await page.infobox()
    console.log(infobox)
  } catch (error) {
    console.log(error)
  }
  return infobox
}
getPage('John M. Vining')
When I do these one at a time they all resolve with a page, but when I do more than ten at a time they start to throw page errors. I made a little code sample that perfectly illustrates the issue:
const wiki = require('wikipedia');
const subjects = [ "University of Washington", "USC Gould School of Law", "Watergate", "Supreme Court", "Justice Clarence Thomas", "Harlan Crow", "resignation", "impeachment", "public trust", "code of ethics", "University of Washington", "USC Gould School of Law", "Watergate", "Supreme Court", "Justice Clarence Thomas", "Harlan Crow", "resignation", "impeachment", "public trust", "code of ethics" ];
async function GetWikiSummary(subject) {
  let result = {};
  try {
    result.subject = subject;
    const page = await wiki.page(subject);
    result.canonicalurl = page.canonicalurl;
  } catch (error) {
    result.error = error;
  }
  return result;
}

async function getWikiSummaries(subjects) {
  const results = [];
  for (const subject of subjects) {
    try {
      const summary = await GetWikiSummary(subject);
      results.push(summary);
    } catch (error) {
      results.push({ subject });
    }
  }
  return results;
}

console.log("Starting");
(async () => {
  const converted = await getWikiSummaries(subjects);
  //const converted = await GetWikiSummary('impeachment');
  console.log(JSON.stringify(converted, null, 2));
})();
console.log("Done");
console.log("Done");
The list is actually duplicated items, to show that the first 10 resolve while the last 10 (even though they are the same) will throw page errors. If I have a list of 20 items, what is the recommended way to get all 20?
The result of the above code looks like this:
Done
[
{
"subject": "University of Washington",
"canonicalurl": "https://en.wikipedia.org/wiki/University_of_Washington"
},
{
"subject": "USC Gould School of Law",
"canonicalurl": "https://en.wikipedia.org/wiki/USC_Gould_School_of_Law"
},
{
"subject": "Watergate",
"canonicalurl": "https://en.wikipedia.org/wiki/Watergate_scandal"
},
{
"subject": "Supreme Court",
"canonicalurl": "https://en.wikipedia.org/wiki/Supreme_court"
},
{
"subject": "Justice Clarence Thomas",
"canonicalurl": "https://en.wikipedia.org/wiki/Clarence_Thomas"
},
{
"subject": "Harlan Crow",
"canonicalurl": "https://en.wikipedia.org/wiki/Harlan_Crow"
},
{
"subject": "resignation",
"canonicalurl": "https://en.wikipedia.org/wiki/Resignation"
},
{
"subject": "impeachment",
"canonicalurl": "https://en.wikipedia.org/wiki/Impeachment"
},
{
"subject": "public trust",
"canonicalurl": "https://en.wikipedia.org/wiki/Public_trust"
},
{
"subject": "code of ethics",
"canonicalurl": "https://en.wikipedia.org/wiki/Ethical_code"
},
{
"subject": "University of Washington",
"error": {
"name": "pageError"
}
},
{
"subject": "USC Gould School of Law",
"error": {
"name": "pageError"
}
},
{
"subject": "Watergate",
"error": {
"name": "pageError"
}
},
{
"subject": "Supreme Court",
"error": {
"name": "pageError"
}
},
{
"subject": "Justice Clarence Thomas",
"error": {
"name": "pageError"
}
},
{
"subject": "Harlan Crow",
"error": {
"name": "pageError"
}
},
{
"subject": "resignation",
"error": {
"name": "pageError"
}
},
{
"subject": "impeachment",
"error": {
"name": "pageError"
}
},
{
"subject": "public trust",
"canonicalurl": "https://en.wikipedia.org/wiki/Public_trust"
},
{
"subject": "code of ethics",
"error": {
"name": "pageError"
}
}
]
Thanks for the help!
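One possible workaround for the scenario above is to space the lookups out, on the assumption that the failures are caused by request frequency. This is a sketch, not a confirmed fix: the 300 ms delay is a guess, and the helper takes the wikipedia client as a parameter rather than requiring it directly.

```javascript
// Sketch: throttle page lookups with a fixed delay between requests.
// Assumes the pageErrors above are rate-limit related; the delay value
// may need tuning. `wikiClient` is the object require('wikipedia') returns.
const sleep = (ms) => new Promise((resolve) => setTimeout(resolve, ms));

async function getWikiSummariesThrottled(wikiClient, subjects, delayMs = 300) {
  const results = [];
  for (const subject of subjects) {
    try {
      const page = await wikiClient.page(subject);
      results.push({ subject, canonicalurl: page.canonicalurl });
    } catch (error) {
      results.push({ subject, error: error.name });
    }
    await sleep(delayMs); // space out consecutive lookups
  }
  return results;
}
```

Batching with a concurrency limiter (for example the p-limit package) would be an alternative if strictly sequential lookups are too slow.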
There are a lot of new REST APIs for Wikipedia present in the REST API docs.
We'll look through them one by one and implement them. Anyone who wants to pick up any other REST API should ideally open a new issue.
The REST API has a /page/random/{format} endpoint which gives a page in the given format. This is something I would love to have in wikipedia. Find more details here.
Implementation for this should follow the summary or related method flow. Remember to write unit tests for all possible scenarios in your new functions, and try to use types as far as possible.
I'll be happy to help anyone who wants to pick this up.
We are planning to drop support for node versions:
10.x.x
12.x.x
Our minimum supported version will be node 14.x.x.
It would be great to hear if the community feels there might be an issue with dropping these versions.
Getting results in other languages has problems. Grabbing a page works, but things like summaries and On This Day do not. It seems like it's not using the correct REST URL.
For example, for the Swedish site the summary URL should be sv.wikipedia.org/api/rest_v1/page/summary/Stockholm, but it tries to use sv.wikipedia.org/v1/page/page/summary/Stockholm instead.
const wiki = require('wikipedia');
(async () => {
  try {
    const changedLang = await wiki.setLang('sv');
    const page = await wiki.page('Stockholm'); // Works
    const summary = await wiki.summary('Stockholm'); // Fails
    console.log(page, summary);
  } catch (error) {
    console.log(error);
  }
})();
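For reference, the REST summary URL the library should be hitting can be built like this. This is a sketch to illustrate the expected URL shape from the report above; the helper name is hypothetical and not part of the package's internals.

```javascript
// Hypothetical helper: build the REST summary URL for a language + title,
// matching the working URL quoted in the report above.
function summaryUrl(lang, title) {
  return `https://${lang}.wikipedia.org/api/rest_v1/page/summary/${encodeURIComponent(title)}`;
}
```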
There are a lot of new REST APIs for Wikipedia present in the REST API docs.
We'll look through them one by one and implement them. Anyone who wants to pick up any other REST API should ideally open a new issue.
The REST API has a /page/mobile-sections/{title} endpoint which provides mobile-friendly HTML.
Implementation for this can follow #17. Remember to write unit tests for all possible scenarios in your new functions, and try to use types as far as possible.
There are a lot of new REST APIs for Wikipedia present in the REST API docs.
We'll look through them one by one and implement them. Anyone who wants to pick up any other REST API should ideally open a new issue.
The REST API has a /data/citation/{format}/{query} endpoint which provides citation data for a given URL.
Implementation for this can follow #17. Remember to write unit tests for all possible scenarios in your new functions, and try to use types as far as possible.
Created an issue to track #48
We should allow customization of the user agent, but without explicitly using environment variables in an npm package.
Updating node-fetch is a headache because the module seems to be using module formats which are not supported by modern TypeScript compilers and jest, both of which are very widely used tools. Additionally, node-fetch takes up a few more MBs as shown in https://www.npmjs.com/package/got#comparison, and is also less actively maintained.
Analyze the other major HTTP modules like got, ky, or axios and transition wikipedia to one of these libraries instead of node-fetch.
My initial feeling is to use got because of its small size and active maintenance, but other suggestions are welcome.
Since Travis CI stopped unlimited builds, we need to migrate the project to GitHub Actions. The README needs to be updated too to reflect this change. We can use the workflow defined here: https://github.com/dopecodez/pingman/tree/master/.github/workflows
Hi,
It would be nice to also add the URL of the Wikipedia page as a parameter of wiki.page(), not only the title and pageId.
Thanks!
Hi,
Would it be possible to remove this https://github.com/dopecodez/Wikipedia/blob/master/source/request.ts#L50, which is causing problems with the console in my app?
I can open a PR if that's the preference.
Thanks!
There are a lot of new REST APIs for Wikipedia present in the REST API docs.
We'll look through them one by one and implement them. Anyone who wants to pick up any other REST API should ideally open a new issue.
The REST API has a /page/media-list/{title} endpoint which lists the media files used in the page. This is something I would love to have in wikipedia.
Implementation for this should follow the summary or related method flow. Remember to write unit tests for all possible scenarios in your new functions, and try to use types as far as possible.
I'll be happy to help anyone who wants to pick this up.
https://en.wikipedia.org/wiki/All_Around_the_World_(Lisa_Stansfield_song)
const page = await wiki.page(pageTitle);
return page.infobox({ redirect: false });
returns
{
name: 'All Around the World',
cover: 'Lisa Stansfield - All Around the World.jpg',
border: true,
caption: 'Artwork for releases outside North America',
type: 'Singles',
artist: '2003',
album: 'Affection (Lisa Stansfield album)',
bSide: '"Wake Up Baby" (7"),"The Way You Want It" (12")',
released: '16 October 1989',
recorded: '1989',
length: 'Duration',
label: 'Arista Records',
writer: [ 'Lisa Stansfield', 'Ian Devaney', 'Andy Morris' ],
producer: [ 'Ian Devaney', 'Andy Morris' ],
prevTitle: '8-3-1',
prevYear: '2001',
nextTitle: 'Too Hot (Kool & the Gang song)',
nextYear: 'External music video',
misc: 'Extra chronology',
title: 'All Around the World (Norty Cotto Mixes)',
year: '2003'
}
The artist: '2003' value is off.
Mixed Content: The page at '{APP_DOMAIN}' was loaded over HTTPS, but requested an insecure resource 'http://en.wikipedia.org/w/api.php?list=search&srprop=&srlimit=10&srsearch={QUERY}&format=json&redirects=&action=query&origin=*&'. This request has been blocked; the content must be served over HTTPS.
There are no clear instructions on page input parameters. There is just a single example with the text input 'Batman'.
How do I account for passing a page URL as data?
As an example, use page('Oliver Ellsworth') and then retrieve the infobox().
Then retrieve page('1st United States Congress') and then its infobox().
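One way to handle the URL-as-input question above is a small helper that derives the title from the page URL before calling wiki.page(). The helper name is hypothetical and not part of the package:

```javascript
// Hypothetical helper: derive a page title from a Wikipedia article URL,
// since wiki.page() accepts a title or pageId but not a full URL.
function titleFromWikiUrl(url) {
  const path = new URL(url).pathname;          // e.g. '/wiki/Oliver_Ellsworth'
  const title = path.replace(/^\/wiki\//, ''); // strip the '/wiki/' prefix
  return decodeURIComponent(title).replace(/_/g, ' ');
}

// Usage sketch: const page = await wiki.page(titleFromWikiUrl(someUrl));
```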
I'd like to change to encodeURIComponent() instead of encodeURI() for better encoding of search params.
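To illustrate the difference: encodeURI() leaves reserved characters like & and ? intact because it treats them as URI structure, which can corrupt a query-string value, while encodeURIComponent() escapes them:

```javascript
// encodeURI treats '&' as URI structure and leaves it alone;
// encodeURIComponent escapes it, which is what a query parameter
// value like srsearch needs.
const query = 'AT&T';
const loose = encodeURI(query);           // '&' survives unescaped
const strict = encodeURIComponent(query); // '&' becomes '%26'
```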
I want to detect when a page doesn't exist, so I'm catching exceptions and trying to work out whether the exception is a pageError.
So I'm trying to use:
if (wikiError instanceof pageError) {
It works if I import the class using:
import { pageError } from "wikipedia/dist/errors";
...but if I use the barrelled main export types from the d.ts, like so:
import { pageError } from "wikipedia";
it fails with:
Right-hand side of 'instanceof' is not an object
Obviously I don't want to rely on digging into the dist folder; any ideas?
I created a vue app where I want to show info to a specific location.
This is my code for fetching the summary:
const summary = await wiki.summary(pageName);
In Chrome everything works perfectly, but in Firefox I'm getting this error:
Cross-Origin Request Blocked: The Same Origin Policy disallows reading the remote resource at https://de.wikipedia.org/api/rest_v1/page/summary/Stuttgart. (Reason: header ‘user-agent’ is not allowed according to header ‘Access-Control-Allow-Headers’ from CORS preflight response).
There are a lot of new REST APIs for Wikipedia present in the REST API docs.
We'll look through them one by one and implement them. Anyone who wants to pick up any other REST API should ideally open a new issue.
The REST API has a /feed/onthisday/{type}/{mm}/{dd} endpoint which provides events that historically happened on the provided day and month. We should support month and date in string format, and also support the types of events.
Implementation for this should follow the summary or related method flow. Remember to write unit tests for all possible scenarios in your new functions, and try to use types as far as possible.
I'll be happy to help anyone who wants to pick this up.
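As a starting point, the endpoint URL for the method above could be built like this. This is a sketch under the assumption that month and day may arrive as numbers or strings and need zero-padding; the helper name is hypothetical:

```javascript
// Hypothetical helper: build the /feed/onthisday URL, zero-padding the
// month and day so both 1 and '1' become '01' as the endpoint expects.
function onThisDayUrl(type, month, day) {
  const mm = String(month).padStart(2, '0');
  const dd = String(day).padStart(2, '0');
  return `https://en.wikipedia.org/api/rest_v1/feed/onthisday/${type}/${mm}/${dd}`;
}
```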