HostBot: your Wikipedia robot friend

HostBot is a set of Python scripts that perform repetitive tasks on and for the Wikipedia Teahouse. HostBot is run by Jonathan Morgan and hosted by the Wikimedia Foundation. HostBot makes extensive use of the MediaWiki API and the wikitools Python framework. It may draw on pywikipediabot in the future, although it does not currently do so.

Information about what the Teahouse is and how it came to be can be found here. If you're curious about the current status of the project, read up here and here, or post to Jtmorgan's talk page on Wikipedia.

/new_editor_invites

These scripts maintain a database of new editors who meet the basic criteria for invitation to the Teahouse, and invite roughly 100 of those editors each day. You can read more about the inclusion criteria here, and you can see a daily report of the users invited here.

Scripts

  • teahouseInvitees.py : Grabs the daily sample of new editors to invite.
  • inviteCheck.py : Checks to see whether these editors have been invited to the Teahouse already by someone other than HostBot.
  • hostbotInvites.py : Invites all editors from today's sample who meet the invite criteria.

MySQL tables

These scripts use several of the standard Wikipedia database tables (enwiki.revision, enwiki.recentchanges, enwiki.user, and possibly a few others), which are available to account holders on toolserver.org.

They also use custom tables created to keep track of who gets invited to the Teahouse and whether they continue editing Wikipedia (for the purpose of quantifying the impact of Teahouse participation on new editor retention).

th_up_invitees
The master list of invitees. Not all of these fields are strictly necessary.

Used by: all new_editor_invites scripts
 +-------------------+------------------+------+-----+---------+----------------+
 | Field             | Type             | Null | Key | Default | Extra          |
 +-------------------+------------------+------+-----+---------+----------------+
 | id                | int(11) unsigned | NO   | PRI | NULL    | auto_increment |
 | user_id           | int(11)          | YES  | UNI | NULL    |                |
 | user_name         | varbinary(200)   | YES  |     | NULL    |                |
 | user_registration | varbinary(14)    | YES  |     | NULL    |                |
 | user_editcount    | int(11)          | YES  |     | NULL    |                |
 | email_status      | varbinary(14)    | YES  |     | NULL    |                |
 | edit_sessions     | int(11)          | YES  |     | NULL    |                |
 | sample_group      | varbinary(14)    | YES  |     | NULL    |                |
 | sample_date       | datetime         | YES  |     | NULL    |                |
 | sample_type       | varbinary(11)    | YES  |     | NULL    |                |
 | invite_status     | tinyint(1)       | YES  |     | NULL    |                |
 | hostbot_invite    | tinyint(1)       | YES  |     | NULL    |                |
 | hostbot_personal  | tinyint(1)       | YES  |     | NULL    |                |
 | hostbot_skipped   | tinyint(1)       | YES  |     | NULL    |                |
 | user_talkpage     | int(11)          | YES  |     | NULL    |                |
 +-------------------+------------------+------+-----+---------+----------------+
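
As a rough illustration of how the fields above could drive the daily invite step, the sketch below picks users from a sample who have not yet been invited or skipped. It is not HostBot's actual logic: the `Invitee` class, the `pick_invitees` function, and the reading of `invite_status` and `hostbot_skipped` as 0/1 flags are all assumptions made for this example.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Invitee:
    """Subset of th_up_invitees columns; the flag semantics are assumed."""
    user_id: int
    user_name: str
    invite_status: Optional[int] = None    # assumed: 1 = already invited
    hostbot_skipped: Optional[int] = None  # assumed: 1 = skipped by HostBot


def pick_invitees(sample: List[Invitee], daily_limit: int = 100) -> List[Invitee]:
    """Return up to daily_limit users who are neither invited nor skipped."""
    eligible = [
        user for user in sample
        if not user.invite_status and not user.hostbot_skipped
    ]
    return eligible[:daily_limit]
```

The default of 100 mirrors the approximate daily invite volume mentioned above; NULL flags are treated the same as 0.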

/monthly_metrics

These scripts generate automated metrics about Teahouse activity and post a simple metrics report to enwp.org/WP:Teahouse/Host_lounge/Metrics. They are currently set up to run on the first of every month.

Scripts

Currently, each of these five scripts posts a separate section to the automated metrics page.

  • intro_section.py : Posts introductory content (mostly static) to the top of the page. This script must be run before any of the others.
  • questions_section.py : Posts the number of questions asked this month, their response rate, etc. and compares with the previous month.
  • profiles_section.py : Posts the number of profiles created on the Teahouse Guests page this month, their response rate, etc. and compares with the previous month.
  • hosts_section.py : Posts the number of hosts who participated this month, and compares that with the previous month.
  • pageviews_section.py : Posts the overall pageviews to Wikipedia Teahouse (main page only). Pulls from stats.grok.se.

MySQL tables

These scripts use several of the standard Wikipedia database tables (enwiki.revision, enwiki.recentchanges, enwiki.user, and possibly a few others), which are available to account holders on toolserver.org. Several of the automated metrics scripts also use a set of custom tables that track various activity metrics on enwp.org/WP:Teahouse.

th_up_questions
Used by: questions_section.py
This table logs questions that have been asked on the Teahouse Q&A board (enwp.org/WP:Teahouse/Questions).

 +--------------------+-----------------+------+-----+---------+-------+
 | Field              | Type            | Null | Key | Default | Extra |
 +--------------------+-----------------+------+-----+---------+-------+
 | rev_id             | int(8) unsigned | NO   | PRI | 0       |       |
 | rev_user           | int(5) unsigned | NO   |     | 0       |       |
 | rev_user_text      | varbinary(255)  | NO   |     |         |       |
 | rev_timestamp      | varbinary(14)   | NO   |     |         |       |
 | rev_comment        | varbinary(255)  | YES  |     | NULL    |       |
 | post_date          | datetime        | YES  |     | NULL    |       |
 | week               | int(11)         | YES  |     | NULL    |       |
 | questioner_replies | int(11)         | YES  |     | NULL    |       |
 | answers            | int(11)         | YES  |     | NULL    |       |
 | first_answer_date  | datetime        | YES  |     | NULL    |       |
 +--------------------+-----------------+------+-----+---------+-------+
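
The response rate reported by questions_section.py could be derived from the `answers` column above. The following is a minimal sketch under that assumption, not the script's real code: `response_rate` and `change_vs_previous` are hypothetical helpers, and treating a NULL `answers` value as "unanswered" is a guess.

```python
from typing import Iterable, Optional


def response_rate(answer_counts: Iterable[Optional[int]]) -> float:
    """Fraction of questions with at least one answer (0.0 if no questions)."""
    counts = list(answer_counts)
    if not counts:
        return 0.0
    # Assumption: NULL (None) and 0 both mean the question went unanswered.
    answered = sum(1 for n in counts if n)
    return answered / len(counts)


def change_vs_previous(current: float, previous: float) -> float:
    """Difference between two rates, in percentage points."""
    return (current - previous) * 100
```

Each element of `answer_counts` stands for the `answers` value of one row in the month being reported; comparing two months is then a matter of calling `response_rate` once per month and passing both results to `change_vs_previous`.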

th_up_hosts
Used by: hosts_section.py
This table logs the activity of Teahouse hosts on WP:Teahouse and its subpages, as well as the location of their host profile.

 +----------------+------------------+------+-----+---------+----------------+
 | Field          | Type             | Null | Key | Default | Extra          |
 +----------------+------------------+------+-----+---------+----------------+
 | id             | int(11) unsigned | NO   | PRI | NULL    | auto_increment |
 | user_name      | varbinary(255)   | YES  | UNI | NULL    |                |
 | user_id        | int(11)          | YES  | UNI | NULL    |                |
 | user_talkpage  | int(11)          | YES  |     | NULL    |                |
 | join_date      | datetime         | YES  |     | NULL    |                |
 | last_move_date | datetime         | YES  |     | NULL    |                |
 | num_edits_2wk  | int(11)          | YES  |     | NULL    |                |
 | latest_edit    | datetime         | YES  |     | NULL    |                |
 | in_breakroom   | tinyint(1)       | YES  |     | NULL    |                |
 | retired        | tinyint(1)       | YES  |     | NULL    |                |
 | featured       | tinyint(1)       | YES  |     | NULL    |                |
 | colleague      | tinyint(1)       | YES  |     | NULL    |                |
 | has_profile    | tinyint(1)       | YES  |     | NULL    |                |
 +----------------+------------------+------+-----+---------+----------------+	

th_up_profiles
Used by: profiles_section.py
This table logs the profiles created on the Teahouse Guests page.


/host_profiles

These scripts work together to make sure that the list of profiles visible to guests on the host profile page reflects the hosts who are currently active in the Teahouse. Hosts who become inactive are moved to the host breakroom page. Currently, a host is considered inactive if they have not edited WP:Teahouse or any of its subpages or talk pages for at least 2 weeks. Host profiles are moved back to the host landing page (which is transcluded into Teahouse/Hosts) when the host becomes active again or checks in.
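
The two-week inactivity rule can be expressed as a simple date comparison. This is a sketch only: `is_inactive` is a hypothetical helper, `latest_edit` is assumed to correspond to the `latest_edit` column of th_up_hosts, and treating a missing value as inactive is an assumption.

```python
from datetime import datetime, timedelta
from typing import Optional

# "At least 2 weeks" without a Teahouse edit marks a host as inactive.
INACTIVITY_THRESHOLD = timedelta(weeks=2)


def is_inactive(latest_edit: Optional[datetime], now: datetime) -> bool:
    """Return True if the host has not edited for at least two weeks."""
    if latest_edit is None:
        # Assumption: a host with no recorded Teahouse edit counts as inactive.
        return True
    return now - latest_edit >= INACTIVITY_THRESHOLD
```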

These scripts use the MediaWiki API.

Scripts

  • deactivateHosts.py : Moves profiles of inactive hosts to the breakroom.
  • reactivateHosts.py : Moves profiles of newly active hosts to the host_landing page.
  • reorderHosts.py : Re-orders host profiles, putting the newest and most active hosts on top.
  • clearCheckins.py : Clears the checkins page, if not blank.
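
One possible reading of the ordering that reorderHosts.py applies is sketched below. How "newest" and "most active" are actually combined is not stated here, so this example assumes recent edit count comes first and join date breaks ties; the `Host` tuple and `reorder_hosts` function are hypothetical, with field names borrowed from th_up_hosts.

```python
from datetime import datetime
from typing import List, NamedTuple


class Host(NamedTuple):
    user_name: str
    join_date: datetime
    num_edits_2wk: int


def reorder_hosts(hosts: List[Host]) -> List[Host]:
    """Sort hosts so the most active come first, newest first among ties."""
    return sorted(
        hosts,
        key=lambda h: (h.num_edits_2wk, h.join_date),
        reverse=True,
    )
```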

Tables

Custom: th_up_hosts

Default: enwiki.revision

/guest_profiles

These scripts work together to make sure that the list of profiles visible to guests on the guest profile page is relatively short and contains only the profiles of the most recent visitors. Older guest profiles are moved to the guestbook page, after which MiszaBot II takes care of archiving. Currently, guest profiles are archived weekly on en.wp.teahouse from the two subpages, Left_column and Right_column, which are transcluded into Teahouse/Guests.

These scripts use the MediaWiki API.
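
For readers unfamiliar with the MediaWiki API, the sketch below builds a request URL that fetches the current wikitext of a page. It uses only the Python standard library rather than wikitools, and it is not taken from the HostBot scripts: `build_wikitext_query` is a hypothetical helper, and the page title in the usage note is a guess at how the Left_column subpage is named.

```python
from urllib.parse import urlencode

# Public Action API endpoint for English Wikipedia.
API_ENDPOINT = "https://en.wikipedia.org/w/api.php"


def build_wikitext_query(title: str) -> str:
    """Build a URL that requests the latest revision content of a page."""
    params = {
        "action": "query",
        "prop": "revisions",
        "rvprop": "content",
        "rvslots": "main",
        "titles": title,
        "format": "json",
    }
    return API_ENDPOINT + "?" + urlencode(params)
```

For example, `build_wikitext_query("Wikipedia:Teahouse/Guests/Left_column")` returns a URL that can be fetched with any HTTP client; the title shown is an assumed name and should be checked against the live page.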

Scripts

  • archiveGuestsLeft.py : Moves older guest profiles to the guestbook.
  • archiveGuestsRight.py : Moves older guest profiles to the guestbook.

Tables

None

/featured_content

These scripts work together to push recent content to the Teahouse front page.

These scripts use the MediaWiki API.

Scripts

Tables

Default: enwiki.revision
