wolfgangfahl / py-3rdparty-mediawiki
Wrapper for pywikibot and mwclient MediaWiki API libraries with improvements for 3rd party wikis
License: Apache License 2.0
allow looping over a set of pages and modifying them according to a lambda function.
Typical lambdas would be:
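The loop-and-modify idea could be sketched as follows; `Page` and `editPages` are hypothetical stand-ins for illustration, not the library's actual API:

```python
# Hypothetical sketch of the proposed loop-and-modify feature;
# Page and editPages are illustrative names, not the library's API.
class Page:
    def __init__(self, title: str, text: str):
        self.title = title
        self.text = text

def editPages(pages, modify):
    """apply the given modification lambda to the text of each page"""
    for page in pages:
        page.text = modify(page.text)
    return pages

# a typical lambda: a simple search and replace
fixCategory = lambda text: text.replace("[[Category:", "[[has category::")

pages = editPages([Page("Test", "[[Category:Event]]")], fixCategory)
```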
add a duck-typed version of HTML page retrieval so that either pywikibot or mwclient may be used (mwclient might not have this API support at this time)
mark edits, pushes and the like with a tag to allow undo (signature)
e.g. -q [[Category:City]]
The join command seems to be problematic when there is no list of warnings.
AttributeError: 'list' object has no attribute 'keys'
is raised e.g. for the query:
{{#ask: [[Acronym::+]]
|mainlabel=Event
| ?Acronym = acronym
| ?_CDAT=creation date
| ?_MDAT=modification date
| limit=200
}}
on openresearch.org
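A defensive fix could tolerate both dict-shaped and list-shaped warnings in the API result; `warningKeys` below is a hypothetical helper sketching the idea, not the library's code:

```python
# Illustrative sketch of tolerating both dict-shaped and list-shaped
# "warnings" in an API result, avoiding the AttributeError above;
# warningKeys is a hypothetical helper, not the library's code.
def warningKeys(warnings):
    """return the warning keys, or an empty list if warnings is not a dict"""
    if isinstance(warnings, dict):
        return list(warnings.keys())
    return []

dictKeys = warningKeys({"main": "deprecated"})
listKeys = warningKeys([])  # empty list of warnings no longer raises
```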
e.g. "pushed from xyz by wikipush"
tables might display incorrectly if they are not fixed
ask results are limited, e.g. to 1,000/10,000 results, based on the server configuration and the rights of the account being used.
When doing a query like
[[modification date::+]]
The result size is the number of pages in the wiki, which might well be far more than 10,000.
There should be an approach to work around the limitation, e.g. by batching or splitting a query into multiple queries. E.g. the above query can be modified to
[[modification date::<2020]][[modification date::>2018]]
to only select a range of dates. By then going range by range, the query for the backup can be split to a point where it works, e.g. by trying it out with a count result first.
queries might take a while and the progress should be optionally made visible in this case.
If a query might return more than https://www.semantic-mediawiki.org/wiki/Help:$smwgQMaxLimit records the query needs to be split in batches to work properly.
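The date-range splitting above could be sketched like this, assuming yearly batches; `yearlyBatches` is an illustrative helper, not part of the library:

```python
# Sketch of splitting an unbounded SMW ask condition into yearly
# batches to stay below the server's result limit; yearlyBatches is
# an illustrative helper, not the library's API.
def yearlyBatches(condition, fromYear, toYear):
    """yield one date-restricted query condition per year in the range"""
    for year in range(fromYear, toYear):
        yield (f"{condition}"
               f"[[modification date::>{year}]]"
               f"[[modification date::<{year + 1}]]")

batches = list(yearlyBatches("[[modification date::+]]", 2018, 2020))
```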
During query escaping a blank is replaced with an underscore, see:
py-3rdparty-mediawiki/wikibot/smw.py
Line 206 in a7872b3
For example query "[[isA::Event series]]" is converted to "[[isA::Event_series]]".
On www.openresearch.org this yields two distinct results, but not on my own wiki.
A quick test showed that removing the line mentioned above seems to resolve this issue (the query is then encoded by the requests library, which turns " " into "+").
Should this issue be fixed or should it stay dependent on the wiki configuration?
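The two escapings under discussion can be compared directly; `quote_plus` from the standard library shows what URL encoding of the blank would look like:

```python
from urllib.parse import quote_plus

# compare the two escapings discussed above: the underscore replacement
# currently done in smw.py versus plain URL encoding of the blank
ask = "[[isA::Event series]]"
underscored = ask.replace(" ", "_")  # current behaviour
plusEncoded = quote_plus(ask)        # blank becomes "+"
```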
copying images should not be the default but should need to be activated, since some pages might have many images and this holds up the initial copy.
Say one has an x.wiki
file and would like to upload it to their wiki.
There is a method to upload files from a source wiki to a destination one, but I don't need a source wiki in my case since the file is on my own physical machine.
to get full path with scriptPath
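Deriving the full API path from the wiki's base URL and its scriptPath could look like this; `getApiUrl` is a hypothetical helper for illustration:

```python
# Hypothetical sketch of deriving the full api.php path from a wiki's
# base url and its scriptPath; getApiUrl is an illustrative helper,
# and the example url is made up.
def getApiUrl(url: str, scriptPath: str) -> str:
    """return the full path of the wiki's api.php"""
    return f"{url}{scriptPath}/api.php"

apiUrl = getApiUrl("https://www.example.org", "/w")
```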
The library pycrypto is discontinued.
The fork pycryptodome seems to be a good alternative which is actively maintained.
e.g.:
wikiedit -t test -q "[[isA::Event]]" --search "\\[Category:" --replace "[has category::"
with a query with some 8000 results
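The search/replace performed by the wikiedit call above corresponds to a regular-expression substitution like the following (the sample text is made up):

```python
import re

# illustrate the search/replace performed by the wikiedit call above:
# the escaped "[Category:" pattern is replaced by "[has category::"
text = "[[Category:Event]]"
fixed = re.sub(r"\[Category:", "[has category::", text)
```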
ini files should be writeable from the command line
e.g. "uploaded from doe.com intranet" ....
extra Command Line option which allows a selection dialog to appear in case of multi-page operation
Multiple checkboxes - one per page and one "all/none" option
see https://stackoverflow.com/questions/19827615/startswith-typeerror-in-function/19827633
This happens when the value of a link is empty and urlparse returns an empty byte-typed string.
e.g. when the wikiedit target login fails, this should be made visible.
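The TypeError can be reproduced with the standard library alone: urlparse on a bytes value returns bytes components, and calling startswith with a str prefix on bytes raises.

```python
from urllib.parse import urlparse

# reproduce the startswith TypeError: urlparse on a bytes value
# returns bytes components, and bytes.startswith with a str prefix raises
netloc = urlparse(b"").netloc  # empty byte-typed string b''
try:
    netloc.startswith("http")
    raised = False
except TypeError:
    raised = True
```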
The following query from https://www.semantic-mediawiki.org/wiki/Demo:JSON_-_datatype_export_examples
{{#ask:
[[Has_annotation_uri::+]]
|?Has_annotation_uri=anu
|?Has_boolean=boo
|?Has_code=cod
|?Has_date=dat
|?Has email address=ema
|?Has Wikidata item ID=eid
|?Has coordinates=geo
|?Has number=num
|?Has mlt=mlt
|?Has example=wpg
|?Telephone number=tel
|?Has temperatureExample=tem
|?Area=qty
|?SomeProperty=txt
|?Has Wikidata Reference=ref_rec
|?Soccer result=rec
|?Has_URL=uri
|format=json
|limit=1
}}
should work
Allow converting different data representations to a MediaWiki table.
E.g. a list of dicts as input:
from datetime import datetime

def dob(isoDateString):
    """parse the given ISO date string - stand-in for the original self.dob helper"""
    return datetime.strptime(isoDateString, "%Y-%m-%d").date()

listOfDicts = [
    {'name': 'Elizabeth Alexandra Mary Windsor', 'born': dob('1926-04-21'), 'numberInLine': 0, 'wikidataurl': 'https://www.wikidata.org/wiki/Q9682'},
    {'name': 'Charles, Prince of Wales', 'born': dob('1948-11-14'), 'numberInLine': 1, 'wikidataurl': 'https://www.wikidata.org/wiki/Q43274'},
    {'name': 'George of Cambridge', 'born': dob('2013-07-22'), 'numberInLine': 3, 'wikidataurl': 'https://www.wikidata.org/wiki/Q1359041'},
    {'name': 'Harry Duke of Sussex', 'born': dob('1984-09-15'), 'numberInLine': 6, 'wikidataurl': 'https://www.wikidata.org/wiki/Q152316'}
]
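Rendering such a list of dicts as MediaWiki table markup could be sketched as follows; `mediaWikiTable` is a hypothetical helper name, not the library's API:

```python
# Minimal sketch of rendering a list of dicts as MediaWiki table markup;
# mediaWikiTable is a hypothetical helper name, not the library's API.
def mediaWikiTable(listOfDicts):
    """render the given list of dicts as a MediaWiki wikitext table"""
    headers = list(listOfDicts[0].keys())
    lines = ['{| class="wikitable"', "! " + " !! ".join(headers)]
    for record in listOfDicts:
        lines.append("|-")
        lines.append("| " + " || ".join(str(record[h]) for h in headers))
    lines.append("|}")
    return "\n".join(lines)

table = mediaWikiTable([{"name": "Charles, Prince of Wales", "numberInLine": 1}])
```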
{{#ask: [[Category:City]] [[Located in::Germany]] |mainlabel=City |?Population |?Area#km² = Size in km² }}
should be valid and return a list of city records
wikibackup shall copy pages and images to a backup directory and optionally use git version control for the result. Basically it is a download functionality.
The user has read rights but the error message shows that there is a read rights issue.
copying 1 pages from media to wiki
1/1 ( 100%): copying Template:1toN ...:x::('readapidenied', 'You need read permission to use this module.', 'See http://media.bitplan.com/api.php for API usage. Subscribe to the mediawiki-api-announce mailing list at <https://lists.wikimedia.org/mailman/listinfo/mediawiki-api-announce> for notice of API deprecations and breaking changes.')
dependency missing in setup.py
is also needed by wikiCMS and e.g. SmartDataAnalytics/OpenResearch#91
wikipush -s or -t ormk -q "[[isA::Event series]]"
wikipush: TypeError("argument of type 'NoneType' is not iterable")
scripts don't work on macOS 13.6 using MacPorts
Pages in the File: namespace should be copied properly.
To allow choosing between pywikibot and mwclient, use a common base class Wiki which wraps core functionality in a consistent way.
only needs a target wiki but has similar options to wikipush
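The common base class could be sketched with Python's abc module; the class and method names below are illustrative assumptions, not the library's actual interface:

```python
# Sketch of a common Wiki base class wrapping pywikibot/mwclient behind
# one interface; class and method names are illustrative assumptions.
from abc import ABC, abstractmethod

class Wiki(ABC):
    """common access to a MediaWiki site, independent of the backing library"""

    @abstractmethod
    def getPage(self, title: str) -> str:
        """return the wikitext of the given page"""

    @abstractmethod
    def savePage(self, title: str, text: str, summary: str):
        """save the given wikitext under the given title"""

class InMemoryWiki(Wiki):
    """in-memory stand-in implementation, for illustration only"""

    def __init__(self):
        self.pages = {}

    def getPage(self, title: str) -> str:
        return self.pages.get(title, "")

    def savePage(self, title: str, text: str, summary: str):
        self.pages[title] = text

wiki = InMemoryWiki()
wiki.savePage("Test", "some text", "created")
```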
"exists" messages should be marked with ✅ in ignore mode. They might still be shown (even if somewhat verbose)
copying image File:Begriff Systemkontext.png ...✅:('fileexists-no-change', 'The upload is an exact duplicate of the current version of [[:File:Begriff Systemkontext.png]].', 'See http://swa.bitplan.com/api.php for API usage. Subscribe to the mediawiki-api-announce mailing list at <https://lists.wikimedia.org/mailman/listinfo/mediawiki-api-announce> for notice of API deprecations and breaking changes.')
Example:
{{#ask:[[Has conference::+]]
|mainlabel=Talk
|?Has description=Description
|?Has conference=Event
|sort=Has conference
|order=descending
|format=table
}}
The query field would be Event here; also, the duplicates should be removed from the query result in this case.
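The duplicate removal could be sketched as an order-preserving distinct filter; `distinctValues` is an illustrative helper, not the library's API:

```python
# Order-preserving duplicate removal for a query field's values;
# distinctValues is an illustrative helper, and the sample values
# are made up.
def distinctValues(values):
    """return the given values with duplicates removed, keeping order"""
    seen = set()
    result = []
    for value in values:
        if value not in seen:
            seen.add(value)
            result.append(value)
    return result

events = distinctValues(["ICWE 2020", "ESWC 2019", "ICWE 2020"])
```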
- if self.wikiUser.user is not None:
+ if self.wikiUser.user:
scripts are available via the pip install method
wikibackup -s wiki -q "[[modification date::>2021]]"
Start: 2021-01-02 09:15:18, End: 2021-02-09 03:47:47
wikibackup: TypeError("unsupported operand type(s) for /: 'datetime.timedelta' and 'NoneType'")
for help use --help
To make the handling of simple types easier, e.g. the empty string, the already implemented "delisting" that returns the item for lists of length one shall be extended to lists of length zero.
in query mode the limit is ignored and a full query is done. The performance is bad in these cases.
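The extended delisting behaviour could look like this; `deListify` is a hypothetical name for the sketch, not the library's actual function:

```python
# Sketch of the extended "delisting" behaviour: single-element lists
# collapse to their item, and empty lists now collapse to the empty
# string; deListify is a hypothetical name for illustration.
def deListify(value):
    """reduce a one-element list to its item and an empty list to an empty string"""
    if isinstance(value, list):
        if len(value) == 1:
            return value[0]
        if len(value) == 0:
            return ""
    return value
```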
The wikiquery command line should allow to get results in a given format:
The base functionality for this should be taken from pyLodStorage.
Acceptance criterion:
Situation
The query {{#ask: [[IsA::Event]][[Acronym::~ES*]][[start date::>2018]][[start date::<2019]] | ?Title = title | ?Event in series = series | ?ordinal=ordinal | ?Homepage = homepage |format=table }}
is given see https://www.openresearch.org/wiki/Workdocumentation_2021-02-16
Action
put the query in a file "query1.ask"
wikiquery -s or --queryFile query1.ask --format csv
wikiquery -s or --queryFile query1.ask --format json
Expected Result
wikipush -s media -t kaw -p "Template:1toN"
wikipush: TypeError("'<' not supported between instances of 'NoneType' and 'int'")
for help use --help
allow uploading Files from command line