GLAM/Newsletter/November 2024/Contents/Memory of the World report
Wikidata and Wikipedia improvements
Untangling the mess
The first stage of improving the Memory of the World (MotW) Register on Wikimedia projects is to get a clean data set on Wikidata. This is what I have been working on this month. The goals are:
- A query on Wikidata for things with heritage designation → Memory of the World should retrieve all and only the inscriptions that are on the register (currently there are 494 of these).
- Wikidata should know the year when each inscription was added to the register, and its name in English.
- Each statement should be verifiable via a link to the relevant entry in the UNESCO online database. Links to the old database (which give a 404 error) or citations to Wikipedia should be removed.
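As a rough illustration, the first two goals could be checked with a SPARQL query along the following lines, built here as a Python string. P1435 is Wikidata's "heritage designation" property. The item ID for Memory of the World is left as a parameter (the `Q0000` in the usage note is a placeholder, not the real ID), and the choice of P580 ("start time") as the qualifier holding the inscription year is an assumption that should be checked against how the statements are actually modelled.

```python
import re
from string import Template

# P1435 is "heritage designation". The MotW item ID and the qualifier that
# holds the inscription year are supplied by the caller (assumptions).
QUERY_TEMPLATE = Template("""\
SELECT ?item ?itemLabel ?year WHERE {
  ?item p:P1435 ?statement .
  ?statement ps:P1435 wd:${motw_qid} .
  OPTIONAL {
    ?statement pq:${year_qualifier} ?date .
    BIND(YEAR(?date) AS ?year)
  }
  SERVICE wikibase:label { bd:serviceParam wikibase:language "en" . }
}
ORDER BY ?year ?itemLabel
""")


def build_inscription_query(motw_qid: str, year_qualifier: str = "P580") -> str:
    """Return SPARQL listing items designated as MotW, with label and year."""
    if not re.fullmatch(r"Q[0-9]+", motw_qid):
        raise ValueError(f"not a Wikidata item ID: {motw_qid!r}")
    if not re.fullmatch(r"P[0-9]+", year_qualifier):
        raise ValueError(f"not a Wikidata property ID: {year_qualifier!r}")
    return QUERY_TEMPLATE.substitute(
        motw_qid=motw_qid, year_qualifier=year_qualifier
    )
```

Calling `build_inscription_query("Q0000")` with the real item ID in place of the placeholder gives text that can be pasted into the Wikidata Query Service. Rows with an empty `?year` would point to inscriptions still missing a date.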
Once this is the case, it will be easy to generate list articles for many language editions of Wikipedia. It will also enable "target lists" of topics for article creation: for example, we can ask for a list of MotW inscriptions lacking an article on English Wikipedia.
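A target list of this kind might be produced with a query like the one sketched below. It uses the common Wikidata pattern of filtering out items that have a sitelink to a given Wikipedia. The function name and the placeholder item ID are illustrative, and the query reads only "truthy" statements (`wdt:`), so deprecated statements would be ignored.

```python
import re
from string import Template

# P1435 is "heritage designation"; the MotW item ID is supplied by the caller.
MISSING_ARTICLE_TEMPLATE = Template("""\
SELECT ?item ?itemLabel WHERE {
  ?item wdt:P1435 wd:${motw_qid} .
  FILTER NOT EXISTS {
    ?article schema:about ?item ;
             schema:isPartOf <https://${wiki_lang}.wikipedia.org/> .
  }
  SERVICE wikibase:label { bd:serviceParam wikibase:language "en" . }
}
ORDER BY ?itemLabel
""")


def build_missing_article_query(motw_qid: str, wiki_lang: str = "en") -> str:
    """Return SPARQL for MotW items with no article on one Wikipedia."""
    if not re.fullmatch(r"Q[0-9]+", motw_qid):
        raise ValueError(f"not a Wikidata item ID: {motw_qid!r}")
    if not re.fullmatch(r"[a-z][a-z-]*", wiki_lang):
        raise ValueError(f"not a Wikipedia language code: {wiki_lang!r}")
    return MISSING_ARTICLE_TEMPLATE.substitute(
        motw_qid=motw_qid, wiki_lang=wiki_lang
    )
```

Passing a different language code, for example `build_missing_article_query("Q0000", "fr")`, would give the equivalent target list for another Wikipedia.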
We are a long way off this state of affairs, but have made a lot of progress this month. As I look further into Wikidata and Wikipedia, I am finding more errors, but fixing most of them as I go. The biggest shortfall, as explained last month, is that about a third of the MotW inscriptions have no representation on Wikidata. For now, I am concentrating on the quality of the existing, incomplete data set.
My first query for things with heritage designation → Memory of the World returned about three thousand results. This was because of cases where the MotW inscription consists of multiple things, and each of them had been tagged with this designation. For example, for the National Debt Redemption Movement Digital Archive in Korea, 2439 documents in the archive had been tagged. For the Three Stelae of Kōzuke, each individual stele was also tagged with the designation.
We should be able to find individual objects in an archive or collection on the MotW register, but we don't need to tag every single object in order to do that. Where a collection or archive has MotW status, the "collection" property will identify objects in that archive. For instance, a query on that property lists all the documents in the National Debt Redemption Movement Digital Archive. When a group of objects (such as the Three Stelae) is on the MotW register, the "part of" property will connect each individual object to the inscription.
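The clean-up rule described here can be sketched in a few lines of Python. The idea is that an item is redundantly tagged if something it belongs to (through "collection" or "part of") already carries the designation. The function name, the input shapes and the item names are all hypothetical; real data would come from a Wikidata query.

```python
from __future__ import annotations


def find_redundant_tags(
    tagged: set[str], containers: dict[str, set[str]]
) -> set[str]:
    """Return tagged items whose collection or parent group is also tagged.

    tagged: items carrying heritage designation -> Memory of the World.
    containers: for each item, the items it points to through the
        "collection" or "part of" properties.
    """
    return {item for item in tagged if containers.get(item, set()) & tagged}


# Hypothetical example data, using names instead of real item IDs.
tagged_items = {"archive", "document-1", "document-2", "stelae", "stele-1"}
belongs_to = {
    "document-1": {"archive"},
    "document-2": {"archive"},
    "stele-1": {"stelae"},
}
redundant = find_redundant_tags(tagged_items, belongs_to)
```

Here `redundant` contains the two documents and the single stele, while the archive and the group of stelae keep their designation.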
Wikidata wrongly identified the Freddie Mercury song "Love Kills" as having Memory of the World status. The reason: "Love Kills" is "part of" the film Metropolis, and Metropolis is mentioned in the Memory of the World Register. However, these are two different things. The Metropolis that has MotW status is a restoration of the original 1927 film, while the song is part of the 1980s musical adaptation. So I made sure that only the restoration is tagged with heritage designation → Memory of the World.
There is not space to list all the errors and inconsistencies I have been fixing on Wikidata and Wikipedia, but here are some representative examples. For Codex Argenteus, Wikidata had one representation for the manuscript and another for the object on the MotW register. These are the same thing, so the representations have been merged. The Stone stele records of imperial examinations of the Lê and Mạc dynasties had two inconsistent dates of entry to the register; the incorrect one was removed. The human being Éric Schwab could be inferred to be on the register because his connection to The Family of Man was incorrectly described. I replaced the property "collection" with the more appropriate "archives at".
I discovered that the English Wikipedia article Memory of the World Register – Latin America and the Caribbean has a lot of superfluous entries. Someone with good intentions has pasted in a list of 113 instances of Brazilian cultural heritage, but these were nominated for MotW status by the Brazilian national committee; only a minority were accepted onto the MotW international register. I have posted a notice for other editors, and the list will need careful review to cut it down to just the correct entries.
I fixed the following problems by supplying or updating information:
- Broken links to an old UNESCO site from 10 articles on French Wikipedia (other old versions of the site still have links)
- Broken links from 2 articles on German Wikipedia
- 15 inscriptions lacking the year of entry to the register
- Various incorrect properties ("award received", "part of the series", "subclass of", "member of") used for Memory of the World instead of the correct property, "heritage designation"
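Misused properties of this kind could be found with a query like the one sketched below, which looks for any item linked to Memory of the World through one of the four properties named above instead of "heritage designation". The property IDs are the standard Wikidata ones; the function name and the placeholder item ID are illustrative.

```python
import re

# Properties that were found pointing at Memory of the World by mistake.
WRONG_PROPERTIES = {
    "P166": "award received",
    "P179": "part of the series",
    "P279": "subclass of",
    "P463": "member of",
}


def build_misused_property_query(motw_qid: str) -> str:
    """Return SPARQL for items linked to MotW via an inappropriate property."""
    if not re.fullmatch(r"Q[0-9]+", motw_qid):
        raise ValueError(f"not a Wikidata item ID: {motw_qid!r}")
    values = " ".join(f"wdt:{pid}" for pid in WRONG_PROPERTIES)
    return (
        "SELECT ?item ?itemLabel ?property WHERE {\n"
        f"  VALUES ?property {{ {values} }}\n"
        f"  ?item ?property wd:{motw_qid} .\n"
        '  SERVICE wikibase:label { bd:serviceParam wikibase:language "en" . }\n'
        "}\n"
    )
```

Each result row names the item and the property used, so the statement can be moved to "heritage designation" (P1435) by hand or with a batch tool.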
I noticed that the English Wikipedia article "Arolsen Archives – International Center on Nazi Persecution" had an unnecessarily long title, so got it renamed to Arolsen Archives.
One Wikidata entry for a MotW inscription, the Human Rights Archive of Chile, lacked an English label. When fixing this, I noticed that this archive had an article in Spanish Wikipedia but not in English Wikipedia. So I translated the article and added some academic citations, creating the first new Wikipedia article from this project: Human Rights Archive of Chile.