There are a couple of other metrics that Wikimetrics doesn't cover, but that my team considers pretty important. Those are:
- Articles edited by student editors
- Page views of articles edited by student editors
You can get this data for courses using the EP extension (as well as longer-term page view data for any single article, and the articles edited by any single user) from my new coursestats tool on Tool Labs:
For page view queries of larger sets of articles, and for technical details, see the "technical details" section below.
If you have a list of usernames, you can find how many (and which) articles they edited by running a SQL query on WMF Tool Labs. This query lists mainspace, non-redirect pages edited by the provided set of users in the specified timeframe:
SELECT page_title
FROM page
WHERE page_id IN (
    SELECT DISTINCT rev_page
    FROM revision_userindex
    WHERE rev_user_text IN ( "Ragesoss","Ragesock" )
    AND rev_timestamp BETWEEN "201008" AND "201407" )
AND page_namespace = 0
AND NOT page_is_redirect
You can save the query as a .sql file with the desired cohort of usernames and date range, and then run the query and save the results to a csv file like this:
sql enwiki < myquery.sql > queryresults.csv
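If your cohort is long, pasting usernames into the query by hand gets tedious. Here's a minimal sketch of a helper that builds the query string from a Python list of usernames and writes it to myquery.sql; the function name, date bounds, and usernames below are just examples, not part of any existing tool.

```python
# Sketch: build the cohort SQL query from a list of usernames.
# build_cohort_query is a hypothetical helper; dates are rev_timestamp
# prefixes like those used in the query above.
def build_cohort_query(usernames, start, end):
    # Quote each username for the SQL IN ( ... ) list.
    name_list = ",".join('"{}"'.format(u) for u in usernames)
    return (
        "SELECT page_title FROM page WHERE page_id IN ( "
        "SELECT DISTINCT rev_page FROM revision_userindex "
        "WHERE rev_user_text IN ( " + name_list + " ) "
        'AND rev_timestamp BETWEEN "' + start + '" AND "' + end + '" ) '
        "AND page_namespace = 0 AND NOT page_is_redirect"
    )

# Write the query to a file you can pass to `sql enwiki`:
with open("myquery.sql", "w") as out:
    out.write(build_cohort_query(["Ragesoss", "Ragesock"], "201008", "201407"))
```

Then run it exactly as above: sql enwiki < myquery.sql > queryresults.csv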
Right now, the only tool we have for page views is stats.grok.se, which is fairly slow and only returns data for one article at a time. But if you have a list of articles (such as the one generated by the SQL query above), you can use a script like this to pull data for one article at a time and create a big CSV of cumulative page views.
#!/usr/bin/python3
import urllib
import urllib.error
import urllib.parse
import urllib.request
import json
import csv
import sys

articles = sys.argv[1]
baseurl = 'http://stats.grok.se/json/en/latest90/'
outputfile = sys.argv[2]

# get page views for a single article
def articleviews(article):
    articleurl = baseurl + urllib.parse.quote(article)
    # Try to get the data via url request, and retry if it fails
    attempts = 0
    while attempts < 10:
        try:
            response = urllib.request.urlopen(articleurl)
            break
        except urllib.error.HTTPError as e:
            print("HTTP Error:", e.code, articleurl)
            attempts += 1
            # Stop the program if 10 attempts fail.
            if attempts == 10:
                print('Too many tries on ' + articleurl)
                raise
    str_response = response.read().decode('utf-8')
    data = json.loads(str_response)
    views = data['daily_views']
    view_sum = sum(views.values())
    with open(outputfile, 'a') as f:
        w = csv.writer(f, delimiter=',')
        w.writerow([article, view_sum])

with open(articles, 'r') as f:
    for line in f:
        articleviews(line.rstrip())
Adjust the baseurl to point to the language you want, then run it (saved as 'pageviews.py') against a file containing the list of articles you want to check page views for (e.g., 'articles.csv'), one article per line, and it will create another file in CSV format with the view data:
python3 pageviews.py articles.csv pageviews.csv
It returns a few hundred results per hour, so if you have a lot of articles to check, it may run for a while.
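Once the script finishes, you may want a single cumulative total across the whole cohort. Here's a minimal sketch of a helper for that; total_views is a hypothetical function name, and it assumes the two-column article,view_sum layout the script above writes.

```python
import csv

# Sketch: sum the per-article view counts from the script's output CSV.
# Assumes two columns per row: article title, then the view total.
def total_views(path):
    total = 0
    with open(path, newline='') as f:
        for row in csv.reader(f):
            if len(row) == 2:
                total += int(row[1])
    return total
```

For example, total_views('pageviews.csv') would give the combined 90-day view count for every article in the file.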