It's been a busy and historic month for open access work in the United States, spearheaded by community members from DC, New York City and Boston. Most of it was announced in January and early February: The Metropolitan Museum of Art celebrated the second anniversary of their open access initiative, and the Cleveland Museum of Art followed the Met's lead with their first-ever open access release. Read on for more details.
MIT/Met/Microsoft Hackathon and reveal
It could only be publicly revealed in January 2019, but on December 12-13, 2018, a special hackathon was held in Cambridge, Mass. with The Met, Massachusetts Institute of Technology and Microsoft Research where Sam Klein (User:SJ) and Andrew Lih (User:Fuzheado) represented the Wikimedia community.
The event centered on brainstorming and prototyping ideas for The Met's new metadata tags project, which had not yet been shown publicly. In 2018, The Met began and completed an ambitious subject keyword tagging project for their complete collection of artworks. Chief digital officer Loic Tallon described it as adding:
"quality-controlled subject keywords to the more than 300,000 digitized artworks in the collection... The 1,063 subject keywords range from 'trees' to 'castles,' from 'floods' and 'portraits,' and from 'ritual objects' to 'cats.'"
Jennie Choi of The Met Museum
Sam Klein of MIT
Team of Tag! That's It
The Met wanted to know what they could do with these keywords, and with Wikimedian in Residence Richard Knipel (User:Pharos) in house, they turned to the Wikimedia community as one of their main partners. Andrew Lih was brought on to assist with the strategy and development of using artificial intelligence and machine learning with the dataset and Wikidata. At the hackathon, Andrew worked with Jennie Choi, The Met's General Manager of Collection Information, and Nina Diamond, Managing Editor and Producer, along with Microsoft researchers Patrick Buehler, J.S. Tan and Sam Kazemi Nafchi, to train a machine learning model on Microsoft Azure that could predict labels for artworks. Using The Met's art vocabulary of roughly 1,000 terms and representative images to train the model, the team developed a proof-of-concept app at the hackathon. The results were impressive enough that Andrew finished creating a Wikidata Distributed Game, Depicts, to connect the subject keyword recommendations to Wikidata.
In the weeks after the hackathon, Andrew worked with The Met to create a crosswalk database of their art thesaurus terms to Wikidata items. The game was further refined, and on January 31, the first candidates generated from the AI system were fed to the Wikidata Game and Wikimedia community members added AI-generated statements to Wikidata. It is believed to be the first use of deep learning in computer vision for the Wikidata community (and possibly beyond).
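To make the crosswalk idea concrete, here is a minimal Python sketch of how keyword predictions might be turned into candidate "depicts" statements for human review. It is not the code used in the project: the three crosswalk entries, the `Candidate` type, the `candidates_for` function and the 0.8 score threshold are all illustrative assumptions. The only Wikidata specifics relied on are the "depicts" property (P180) and a few well-known item IDs.

```python
from typing import NamedTuple

DEPICTS = "P180"  # Wikidata property "depicts"

# Illustrative crosswalk from museum subject keywords to Wikidata items.
# The real crosswalk covers the full thesaurus; these rows are examples only.
CROSSWALK = {
    "cats": "Q146",      # house cat
    "trees": "Q10884",   # tree
    "castles": "Q23413", # castle
}


class Candidate(NamedTuple):
    """One proposed statement awaiting a yes/no judgment (hypothetical type)."""
    artwork_qid: str
    property_id: str
    value_qid: str
    score: float


def candidates_for(artwork_qid, predictions, threshold=0.8):
    """Map (keyword, score) predictions to candidate depicts statements.

    Keywords missing from the crosswalk and predictions scoring below the
    (assumed) threshold are dropped; the rest are sorted by score, highest first.
    """
    out = []
    for keyword, score in predictions:
        value_qid = CROSSWALK.get(keyword.lower())
        if value_qid is None or score < threshold:
            continue
        out.append(Candidate(artwork_qid, DEPICTS, value_qid, score))
    return sorted(out, key=lambda c: c.score, reverse=True)
```

In a setup like this, nothing is written to Wikidata automatically: each candidate would still be confirmed or rejected by a person, which is the role the game interface plays.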
On February 4, there was an event at The Metropolitan Museum of Art in NYC that demonstrated the result – the evening showcased the use of artificial intelligence and machine learning based on the Met's open access work, and included the unveiling of the new Wikidata Game.
The evening event for curators, trustees, journalists and technologists was held in the great entrance hall of The Met Museum, where more than 100 people played the Wikidata Game on three large-screen kiosks. More than 700 judgments were recorded from button presses by participants ranging from the museum's trustees to 12-year-old attendees. The project will help inform further game development and supplement the existing "Wiki Art Depiction Explorer," a Knight Foundation-funded project that Andrew has with Effie Kapsalis of The Smithsonian Institution and Robert Fernandez of Wikimedia DC. More work is to come in going beyond just the Wikidata Game interface for yes/no judgments.
The initiative and the "Tag! That's It" project that led to the game's development were covered in a number of publications.
Station setup near Egyptian wing of The Met Museum
Andrew Lih, Megan Wacha, Richard Knipel, Jane Alexander, Loic Tallon
Matthew Tarr, American Museum of Natural History; Andrew Lih; Jane Alexander, Cleveland Museum of Art
Playing Wikidata Game
Wikidata Game on Microsoft Surface kiosks
Andrew Lih and Richard Knipel
Open Access at the Cleveland Museum of Art
Inspired by The Met Museum's open access initiatives, on January 23, The Cleveland Museum of Art (CMA) released their digital content of more than 30,000 artworks and 60,000 object records under a CC0 license and uploaded images to Commons. Andrew Lih was asked by the CMA to assist in a novel approach – a simultaneous mass donation of Wikidata and metadata information along with the upload of images. To create 4,000+ Wikidata items and link them to existing and new images, he created several new Python-based tools for linking museum APIs to Wikidata item creation and will eventually be releasing them to the community under a free license.
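The tools themselves have not been published yet, so the following is only a rough sketch of the general idea, not Andrew's implementation. It assumes a museum API record arrives as a dictionary with `title` and `accession_number` keys (hypothetical field names) and turns it into QuickStatements V1 commands, one common route for batch item creation. The function name and the default "instance of" value (painting, Q3305213) are likewise assumptions; the collection's item ID is supplied by the caller.

```python
def quickstatements_for(record, collection_qid, instance_of_qid="Q3305213"):
    """Build QuickStatements V1 commands that create one artwork item.

    `record` is assumed to be a dict with "title" and "accession_number"
    keys; real museum APIs will use their own field names.
    """
    # QuickStatements wraps string values in double quotes and separates
    # columns with tabs, so remove both characters from the label.
    title = record["title"].replace('"', "'").replace("\t", " ").strip()
    accession = record["accession_number"].strip()
    rows = [
        ["CREATE"],                           # start a new item
        ["LAST", "Len", f'"{title}"'],        # English label
        ["LAST", "P31", instance_of_qid],     # instance of
        ["LAST", "P195", collection_qid],     # collection
        ["LAST", "P217", f'"{accession}"'],   # inventory number
    ]
    return "\n".join("\t".join(row) for row in rows)
```

A real pipeline would also need to check whether an item already exists for a given accession number before creating a new one, and would add the image and the institution's object identifier where available.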
The collaboration with the CMA's Chief Digital Information Officer Jane Alexander and Director of Technology Ethan Holda was furthered by Wikimedia presence at the Museum Computer Network conference in Denver (November 13-16, 2018), where Andrew gave the first-ever Wikidata tutorial for museum professionals at the influential annual gathering. News of the CMA open access donation made it to the front page of the main Cleveland newspaper.
A report on some of the project's outputs can be found in a live Wikidata Query notebook. Using a Python notebook approach for GLAM statistics has been very helpful for The Met and CMA projects, and has inspired other uses by GLAM Wiki community members (see Martin Poulter's notebook). There seems to be much interest in furthering this model as an education and insights tool. Please contact User:Fuzheado for more info on this.
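For readers curious what a notebook cell of this kind might look like, here is a small sketch using only the Python standard library. It is not taken from the notebooks mentioned above. The function names and the User-Agent string are made up for illustration; the query counts the most common "depicts" (P180) values among items whose "collection" (P195) is a given institution, and is sent to the public Wikidata Query Service endpoint.

```python
import json
import re
import urllib.parse
import urllib.request

WDQS_ENDPOINT = "https://query.wikidata.org/sparql"


def depicts_count_query(collection_qid, limit=20):
    """Return SPARQL counting the most frequent depicts values in a collection."""
    # Only accept a well-formed item ID, since it is interpolated into the query.
    if not re.fullmatch(r"Q[1-9]\d*", collection_qid):
        raise ValueError(f"not a Wikidata item ID: {collection_qid!r}")
    return f"""
SELECT ?depicted ?depictedLabel (COUNT(?work) AS ?works) WHERE {{
  ?work wdt:P195 wd:{collection_qid} ;
        wdt:P180 ?depicted .
  SERVICE wikibase:label {{ bd:serviceParam wikibase:language "en". }}
}}
GROUP BY ?depicted ?depictedLabel
ORDER BY DESC(?works)
LIMIT {int(limit)}
""".strip()


def run_query(query):
    """Send the query to the Wikidata Query Service and return result rows."""
    url = WDQS_ENDPOINT + "?" + urllib.parse.urlencode(
        {"query": query, "format": "json"}
    )
    # The service asks clients to identify themselves; this value is a placeholder.
    request = urllib.request.Request(
        url, headers={"User-Agent": "glam-notebook-sketch/0.1 (example)"}
    )
    with urllib.request.urlopen(request) as response:
        return json.load(response)["results"]["bindings"]
```

In a notebook, the rows returned by `run_query(depicts_count_query(...))` could be loaded into a table or chart and re-run at any time, which is what makes a report of this kind "live".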
The fruits of open access are already being seen – the CMA artworks have been fed into the AI system developed with The Met (described above) and depicts statements are already being added into Wikidata for their works via the Wikidata Game interface.