GLAM/Resources/File formats

From Outreach Wiki
GLAM

Galleries • Libraries • Archives • Museums
Email the team: glam﹫wikimedia.org



What file formats can be used for media at Wikipedia?

Wikimedia Commons, Wikipedia's media repository, only accepts free content in free file formats. Our mission requires content to be freely redistributable to all, and closed or patent-encumbered file formats fail to meet this standard.

The list of currently allowed formats is maintained at Wikimedia Commons:File types.

What is wrong with proprietary formats like MP3 and WMV?

Simply put, these formats are built on technology covered by software patents, and the organizations/companies that hold those patents enforce them. As a result, no one may implement a player or encoder for these files without paying a patent license fee to the controlling organizations/companies. This effectively prevents Free/Open Source Software developers from writing non-infringing players that decode these formats.

What are OGG and OGV?

OGG and OGV are the file extensions for the free/open audio and video file formats developed by the Xiph.Org Foundation. OGG is the common file extension for Ogg Vorbis audio (similar to MP3), while OGV is used for Ogg Theora video (similar to WMV).

Why does Wikimedia only accept OGG and OGV?

"Our mission requires content to be freely redistributable to all. Patent-encumbered formats fail to meet this standard" (Wikimedia Commons page on file types).

Free formats are necessary to ensure that individuals can continue to freely create and distribute free content. An essential part of the Wikimedia Foundation's mission is encouraging the development of free-content educational resources that may be created, used, and reused by anyone, without restriction. By taking a stand on this issue, as one of the most visited sites on the web, the Wikimedia Foundation is demonstrating to other institutions that websites can remain functional and popular with their users without relying on proprietary formats.

Why should cultural institutions support free file formats?

Cultural institutions generally have a mission of keeping important cultural works available long into the future; without them, many of these works would be lost. Everyone knows the importance of taking great care in preserving the physical collection so that the works will still be available to view by future generations. But digital preservation requires careful thought as well.

Proprietary formats depend on the companies that control the standards. To be able to read a file in a format that is not open, you must have a license from the company that owns it, or perhaps have a working copy of their software. Those companies become gatekeepers to your ability to distribute the materials for viewing and display. In the worst case, if they go out of business, the files may become unreadable.

Open formats require no licensing fees and do not depend on one vendor to produce the software needed to view them. The software and the specifications are freely available, so that even after a format has been superseded, it is still possible to use the files. Open formats are just another way of delivering information; institutions might want to present the same file in both proprietary and open formats, and that is a perfectly acceptable decision.

Finally, since Wikimedia only accepts open formats, files that institutions do not make available in open formats can only be added to Wikimedia projects with the loss of quality that comes with conversion (see below).

How can users play these free/open file formats?

The current versions of Mozilla Firefox, Google Chrome, and Opera are able to play Ogg Vorbis sound files and Ogg Theora videos without any additional codecs or software. Internet Explorer and standalone players like Windows Media Player and QuickTime require the user to download the codec (an extension) from https://www.xiph.org/. Alternatively, there are free media players with Ogg support built in that the user may download (see VLC media player). For more help with playing these free/open file formats, see Commons:Media help.

How can one use these free/open file formats when creating media?

For a list of software that you can use to create/edit free/open formats, see Commons:Software.

How can one convert non-free file formats to free/open ones?

Most of the software listed above will also allow you to convert non-free formats into free formats.

While converting from a patent-encumbered "lossy" format (like MP3) to a free/open one (like OGG) is beneficial from the perspective of eliminating the issues presented by non-free formats, the conversion can lead to a loss of quality. MP3 and OGG both use "lossy compression", which means that, in order to create files that are smaller than the raw audio, they throw away some information. (The original lossless files can be very large, which not only increases the cost of storing and distributing them, but may also make them difficult for many people to download or play.) Each format throws away different pieces of information to reduce the file size (though there is some overlap), and so conversion from one to the other means that more information is lost. (Imagine making a photocopy of a photocopy, but with digital files!)
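The "photocopy of a photocopy" effect can be illustrated with a deliberately simplified sketch. It does not model how MP3 or Ogg Vorbis actually work (real codecs use psychoacoustic models, not plain rounding); it only assumes that each lossy format discards information in its own way, represented here by rounding samples to different step sizes. The names `quantize`, `STEP_A`, and `STEP_B` are hypothetical and chosen for illustration.

```python
def quantize(sample: int, step: int) -> int:
    """Round a sample to the nearest multiple of `step`.

    A toy stand-in for lossy encoding: detail finer than `step` is discarded.
    """
    return step * ((sample + step // 2) // step)


def total_error(original, decoded):
    """Sum of absolute differences between the original and decoded samples."""
    return sum(abs(o - d) for o, d in zip(original, decoded))


# Hypothetical step sizes standing in for two different lossy formats.
STEP_A = 7   # "format A" (the format the file already exists in)
STEP_B = 10  # "format B" (the format we want to end up with)

source = list(range(100))  # stands in for the original/source recording

# Encoding straight from the source into format B.
direct = [quantize(s, STEP_B) for s in source]

# Encoding into format A first, then converting that copy into format B.
transcoded = [quantize(quantize(s, STEP_A), STEP_B) for s in source]

direct_error = total_error(source, direct)
transcoded_error = total_error(source, transcoded)

print("error when encoding from the source:", direct_error)
print("error when converting from format A:", transcoded_error)
```

In this toy model the converted copy is never closer to the source than the copy made directly from it, and for many samples it is further away, because the second encoding can only work with what the first one kept.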

However, when creating an Ogg Vorbis (audio) file from the original/source file (e.g., a WAV file), the resulting OGG file will be of equal or better quality than an MP3 at a similar bitrate/file size. Similarly, Ogg Theora (video) files are on par with the patent-encumbered H.264 format (used by YouTube, Apple iTunes Store, and others).

This is why it is recommended to preserve the original/source file. Using the original file, you can create any number of lossy formats for distribution (such as MP3, OGG, WMV, AAC), and still have the original to create high-quality copies in other formats as necessary. Additionally, this means that if neither source files nor files in open formats are made available to the public, downstream users have no choice but to use proprietary formats or suffer a loss of quality in converting them.
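As one possible way to produce open-format copies from a preserved source file, the sketch below builds command lines for FFmpeg. FFmpeg is not named on this page; it is used here as an assumption, and the example further assumes a build that includes the `libvorbis` and `libtheora` encoders. The function names, file names, and quality values are hypothetical and should be adjusted to your own material.

```python
from __future__ import annotations

from pathlib import Path


def ogg_vorbis_command(source: Path, quality: int = 5) -> list[str]:
    """Build an FFmpeg command that encodes an audio source as Ogg Vorbis (.ogg).

    The source file is only read; the output is written next to it.
    """
    target = source.with_suffix(".ogg")
    return [
        "ffmpeg", "-i", str(source),
        "-c:a", "libvorbis",      # Vorbis audio encoder
        "-q:a", str(quality),     # variable-bitrate quality level (assumed value)
        str(target),
    ]


def ogg_theora_command(source: Path, video_quality: int = 7,
                       audio_quality: int = 5) -> list[str]:
    """Build an FFmpeg command that encodes a video source as Ogg Theora (.ogv)."""
    target = source.with_suffix(".ogv")
    return [
        "ffmpeg", "-i", str(source),
        "-c:v", "libtheora",          # Theora video encoder
        "-q:v", str(video_quality),   # video quality level (assumed value)
        "-c:a", "libvorbis",          # Vorbis audio track inside the .ogv file
        "-q:a", str(audio_quality),
        str(target),
    ]


# Hypothetical file names, for illustration only.
audio_cmd = ogg_vorbis_command(Path("interview.wav"))
video_cmd = ogg_theora_command(Path("exhibit.mov"))
print(" ".join(audio_cmd))
print(" ".join(video_cmd))
```

To actually run a command, pass the list to `subprocess.run(audio_cmd, check=True)` on a machine where FFmpeg is installed. Because each command reads from the original file and writes a new one, the source stays untouched and can be re-encoded into other formats later.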