GLAM/Newsletter/April 2026/Contents/Asia report
Documenting and citing oral knowledge in audio and video
Tools, framework and GLAM collaborations
Recording speech, performance, and community memory as audio, video, subtitles, transcripts, and structured metadata has never been easy for Indigenous and other low-resourced languages. Oral knowledge documentation is needed where written sources are limited, incomplete, or not the primary form of knowledge transfer. OpenSpeaks Archives has worked with nearly 20 communities in South Asia to document their languages collaboratively, with recordings made in India, Nepal and Sri Lanka. The documented media is gradually entering language archives and libraries and has enriched nearly 1,000 Wikimedia pages in over 100 languages. The media files are being deposited in a dedicated collection at the Endangered Languages Archive (ELAR), and materials for two languages, Kusunda and Sora, are already available at the Language Archive Cologne (LAC).
OpenSpeaks has published an open framework for oral knowledge and language documentation that combines the FAIR and CARE principles with community review, multilingual transcription, and time-coded subtitles. A new tool called Subtitler (alpha version on Toolforge) is being readied for language archivists and Wikimedians alike. This subtitle editor loads audio, video, and subtitle files from a local device or from Wikimedia Commons, and lets users edit and translate subtitles and upload them to Commons. The Metadata Generator helps users import, create, and edit metadata, and export it to three destinations: ELAR, LAC, and Wikimedia Commons.
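To illustrate what "time-coded subtitles" means in practice, here is a minimal Python sketch that turns a list of cues into SubRip (SRT) text, a common subtitle format. It is not Subtitler's code, and the report does not say which formats Subtitler reads or writes; the `Cue` class and the `format_timestamp` and `to_srt` functions are hypothetical names used only for this example.

```python
from dataclasses import dataclass


@dataclass
class Cue:
    """One subtitle cue (hypothetical structure for illustration)."""
    start_ms: int  # start time in milliseconds
    end_ms: int    # end time in milliseconds
    text: str      # transcription or translation shown on screen


def format_timestamp(ms: int) -> str:
    """Render milliseconds as an SRT timestamp, HH:MM:SS,mmm."""
    hours, rem = divmod(ms, 3_600_000)
    minutes, rem = divmod(rem, 60_000)
    seconds, millis = divmod(rem, 1000)
    return f"{hours:02d}:{minutes:02d}:{seconds:02d},{millis:03d}"


def to_srt(cues: list[Cue]) -> str:
    """Serialize cues as numbered SRT blocks separated by blank lines."""
    blocks = []
    for index, cue in enumerate(cues, start=1):
        blocks.append(
            f"{index}\n"
            f"{format_timestamp(cue.start_ms)} --> {format_timestamp(cue.end_ms)}\n"
            f"{cue.text}\n"
        )
    return "\n".join(blocks)


if __name__ == "__main__":
    # Placeholder cues; real text would come from a community-reviewed transcript.
    print(to_srt([Cue(0, 1500, "First line"), Cue(1500, 4000, "Second line")]))
```

Keeping times as integer milliseconds avoids floating-point rounding when cues are shifted or split during editing.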
Upcoming training and resources

OpenSpeaks is also running a training series on community language documentation and archiving. It comprises seven online workshops and one in-person workshop, to be held in Kochi, Kerala, on 7 September 2026 (co-located with WikiConference India 2026).
The application deadline is 15 May. Both trainers and learners can apply.
The series will produce a curriculum, to be co-authored with the trainers. Training resources will include a text style guide for captioning, subtitling, and transcribing; a framework; tool documentation; templates; and example workflows that run from recording through to upload on Wikimedia and deposit in external archives.

