Omeka Developer Connects Museum Data to Wikipedia with Surprising Results

August 29, 2014

By Patrick Murray-John
Development Team Manager, Omeka

When an interesting set of data appears online, I'm a bit like an energetic German Shepherd spotting a squirrel. IMLS's recently released Museum Universe Data File (MUDF) definitely got my tail wagging.

I work on Omeka, a web publishing platform for galleries, libraries, archives, and museums (GLAMs), which aims to help cultural heritage institutions of all kinds publish online information and interpretations of their holdings. Built at the Roy Rosenzweig Center for History and New Media at George Mason University with generous original support from the Institute of Museum and Library Services, and subsequent support from the Mellon Foundation, Sloan Foundation, and Kress Foundation, Omeka is designed with the democratization and accessibility of our cultural heritage at its heart.

So why all the tail-wagging about data?

Our cultural heritage encompasses more than the books, paintings, sculptures, photographs, souvenirs, and the like that come to mind when we think of GLAMs. It also includes our video games and other software. And, since we live in an “Age of Data,” it includes the data we generate. What data we collect, how we gather and select it, how we publish it, and more, will be part of the cultural heritage we leave to future generations.

Put those two things together—a desire to involve the public writ large with cultural heritage, and the belief that our datasets are themselves cultural heritage artifacts—and you have the Museum Universe Data File and the Omeka site I built with it, US Museums Explorer.

I built the site by importing the MUDF into Omeka. During the import, I also augmented the data through a simple Linked Open Data (LOD) connection to information available in Wikipedia. Using each museum's web address, the import process checks whether a Wikipedia entry refers to the same address. If it finds one, it imports the museum's description and image from Wikipedia. Much more data is available in Wikipedia (and other LOD sources), offering possibilities for exploration along many different vectors.
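For readers who want to see the shape of that matching step, here is a minimal sketch in Python. It is an illustration under assumptions, not the code behind US Museums Explorer: the field names (`website`, `description`, `image`), the helper functions, and the in-memory list standing in for Wikipedia data are all hypothetical, and the normalization rules are just one plausible way to decide that two web addresses are "the same."

```python
from urllib.parse import urlsplit


def normalize_url(url):
    """Reduce a web address to a comparable key.

    Assumed rule: compare the lowercase host without a leading 'www.',
    plus the path without a trailing slash; ignore the scheme.
    """
    url = url.strip()
    if "://" not in url:
        url = "http://" + url  # urlsplit needs a scheme to locate the host
    parts = urlsplit(url)
    host = (parts.hostname or "").lower()
    if host.startswith("www."):
        host = host[4:]
    return host + parts.path.rstrip("/")


def build_index(wiki_entries):
    """Map normalized web addresses to (hypothetical) Wikipedia records."""
    return {
        normalize_url(entry["website"]): entry
        for entry in wiki_entries
        if entry.get("website")
    }


def augment(museum, index):
    """Return a copy of a museum record, enriched when an address matches."""
    enriched = dict(museum)
    website = museum.get("website")
    match = index.get(normalize_url(website)) if website else None
    if match:
        enriched["description"] = match.get("description")
        enriched["image"] = match.get("image")
    return enriched


if __name__ == "__main__":
    # Made-up sample data for illustration only.
    wiki_entries = [
        {
            "website": "http://www.example.org/",
            "description": "A small local history museum.",
            "image": "front.jpg",
        }
    ]
    index = build_index(wiki_entries)
    museum = {"name": "Example Museum", "website": "https://example.org"}
    print(augment(museum, index))
```

Matching on a normalized address rather than the raw string is a design choice: the same site is often recorded with and without `www.`, a trailing slash, or `https`, and an exact string comparison would miss those pairs. Museums with no matching entry pass through unchanged.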

Importantly, the infobox templates in Wikipedia add a fairly rich structure to the knowledge the Wikipedia community presents. That well-known structure is essential for linking data from the public with more curated data from IMLS and other sources. The greatest weakness in the process, which I confess surprised me, was how few museums and historical societies have Wikipedia entries. This is a missed opportunity for us to share information about the artifacts we know and care about.

The current results of the US Museums Explorer site are preliminary and suggestive. To my delight, being able to put the museums on a map and search by what's nearby led me to discover the quite charming and informative Arlington Historical Museum not far from where I live. I doubt I would have discovered it without the data file. How many others might have a similar experience?

Dr. Patrick Murray-John is a Research Assistant Professor and Omeka Development Team Manager at the Roy Rosenzweig Center for History and New Media at George Mason University. He can be found on Twitter at @patrick_mj and on his blog, Hacking The Humanities.