Talk:Education/Dashboard
There are a couple of other metrics that Wikimetrics doesn't cover, but that my team considers pretty important. Those are:
- Articles edited by student editors
- Page views of articles edited by student editors
You can get this data for courses using the EP extension (as well as longer-term page view data for any single article, and the articles edited by any single user) from my new coursestats tool on Tool Labs:
http://tools.wmflabs.org/coursestats/
For page view queries of larger sets of articles, and for technical details, see the "technical details" section below.
Technical details
For people comfortable using the toolserver and running Python scripts, it's possible to pull those numbers. --Sage (Wiki Ed) (talk) 17:41, 3 July 2014 (UTC)

Articles edited

If you have a list of usernames, you can find how many (and which) articles they edited by running a SQL query on WMF Tool Labs. This query lists mainspace, non-redirect pages edited by the provided set of users in the specified timeframe:

SELECT page_title
FROM page
WHERE page_id IN
  (
    SELECT DISTINCT rev_page
    FROM revision_userindex
    WHERE rev_user_text IN ("Ragesoss", "Ragesock")
      AND rev_timestamp BETWEEN "201008" AND "201407"
  )
  AND page_namespace = 0
  AND NOT page_is_redirect;
You can save the query as a .sql file with the desired cohort of usernames and date range, and then run the query and save the results to a CSV file like this:

sql enwiki < myquery.sql > queryresults.csv
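The results file will normally start with the page_title column header, and database page titles use underscores rather than spaces. As a rough sketch (the file names here are just examples, and the exact output format of the sql wrapper is an assumption worth checking against your own results), you could strip the header to get a one-title-per-line list for the page view script below:

#!/usr/bin/python3
# Sketch: strip the header row from the query results so the file is just
# one article title per line, ready for the page view script below.
# Assumes the first line is the "page_title" column header.
import sys

infile = sys.argv[1]   # e.g. queryresults.csv
outfile = sys.argv[2]  # e.g. articles.csv

with open(infile, 'r') as f, open(outfile, 'w') as out:
    next(f)  # skip the header row
    for line in f:
        out.write(line.rstrip('\n').split('\t')[0] + '\n')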
Page views

Right now, the only tool we have for page views is stats.grok.se, which is fairly slow and only returns data for one article at a time. But if you have a list of articles (such as the one generated by the SQL query above), you can use a script like this to pull data for one article at a time and create a big CSV of cumulative page views:

#!/usr/bin/python3
import urllib.error
import urllib.parse
import urllib.request
import json
import csv
import sys

articles = sys.argv[1]
baseurl = 'http://stats.grok.se/json/en/latest90/'
outputfile = sys.argv[2]

# Get page views for a single article and append them to the output file.
def articleviews(article):
    articleurl = baseurl + urllib.parse.quote(article)
    # Try to get the data via url request, and retry if it fails.
    attempts = 0
    while attempts < 10:
        try:
            response = urllib.request.urlopen(articleurl)
            break
        except urllib.error.HTTPError as e:
            print("HTTP Error:", e.code, articleurl)
            attempts += 1
            # Stop the program if 10 attempts fail.
            if attempts == 10:
                print('Too many tries on ' + articleurl)
                raise
    str_response = response.read().decode('utf-8')
    data = json.loads(str_response)
    views = data['daily_views']
    view_sum = sum(views.values())
    with open(outputfile, 'a') as f:
        w = csv.writer(f, delimiter=',')
        w.writerow([article, view_sum])

with open(articles, 'r') as f:
    for line in f:
        articleviews(line.rstrip())
Adjust it to point to the language you want, and then use it (saved as 'pageviews.py') with a file containing the list of articles you want to check page views for (e.g. 'articles.csv'), one article per line, and it will create another file in CSV format with the view data:

python3 pageviews.py articles.csv pageviews.csv
It returns a few hundred results per hour, so if you've got a lot of articles to check, it may be running for a while.
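If what you want is a single cumulative total for the whole cohort rather than per-article figures, you can sum the second column of the output file. A minimal sketch, assuming the two-column article,view_sum format written by the script above:

#!/usr/bin/python3
# Sketch: sum the per-article view counts written by pageviews.py.
import csv
import sys

total = 0
with open(sys.argv[1], 'r') as f:
    for row in csv.reader(f):
        if row:  # skip any blank lines
            total += int(row[1])
print('Total page views:', total)

Saved as (say) totalviews.py, you would run it as: python3 totalviews.py pageviews.csv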