Jon Ippolito | Learning from Mario

This essay contrasts the crowdsourced model of preservation practiced by game emulation fans with the centralized model practiced by preservation professionals. Its recommendations include increasing social and technical access to the documentation of culture and building networks that respect the differences among institutions and communities.

A version of this essay was first presented at DOCAM 2010 in Montreal, 4 March 2010. For background it draws on a book by Richard Rinehart and Jon Ippolito called New Media and Social Memory (Cambridge: MIT Press, forthcoming in 2011).

We are falling behind in the race to save digital culture. Our best efforts to preserve the rich outpouring of the last few decades known as media art are being buried underneath an avalanche of obsolete floppy disks, restrictive End User License Agreements, and antisocial archival practices. Even when aware of promising strategies such as emulation, museums and other cultural institutions are having trouble adapting to them.

Let me illustrate this by starting with one of the few triumphs of the art world's preservation efforts: the renewal of Grahame Weinbren and Roberta Friedman's Erl King, one of the first examples of interactive video from 1982. This piece was on its last legs when the Variable Media Network chose it as a poster child for the exhibition Seeing Double, resulting in an emulated version that a survey of visitors showed was practically indistinguishable from the original. The technique of emulation, whereby a newer computer impersonates an older one, enabled preservationists to salvage the source code and user experience of the Erl King while replacing its body with up-to-date guts.

The successful emulation of the Erl King was only possible because of a "perfect storm" consisting of talented technicians, an eager and forthcoming artist, access to the original software and hardware, and organizations willing to fund. It's hard to imagine spending two years and tens of thousands of dollars to re-create every interactive video installation from the 1980s, much less every endangered example of media art.

So our shining example of a successful emulation is shining all the brighter because it's pretty much standing alone, surrounded by less fortunate works that are all going dark.

If we're falling behind, who's keeping up? Super Mario Brothers, that's who. When it comes to preservation, the Olympians of new media art are getting their butts kicked by an Italian plumber. While professional conservators have only managed to future-proof a tiny sliver of new media artworks created since 1980 in any systematic and extensible way, a global community of dispersed amateurs has safeguarded the lion's share of a different genre of early computational media: video games.

[Figure: Timeline of FCEUX emulator development]

Take, for example, the FCEUX emulator, at the time of this writing the top-ranked emulator on the prominent site Emulator Zone for the enormously popular Nintendo Entertainment System (NES). FCEUX can trace its genealogy back to an early emulator called Family Computer Emulator, or FCE, so called because Nintendo released the NES in Asia as "Family Computer." In the manner of many open source projects, no company controlled the source code for this emulator; instead the programmer, known by the name Bero, released his abashedly titled "dirty code" online for other gaming fans to tinker with and extend. One such fan, known as Xodnizel, released an improvement called FCE Ultra that became so popular in the early 2000s that it spawned a half-dozen "forks," or versions modified by other users. By the late 2000s, NES fans merged four of the forks to produce FCEUX, a cross-platform and cross-standard emulator released under the GPL open-source license.

I cannot think of a single instance of software created by the professional preservation community in this supple way, passed from hand to hand over decades, diverging, re-converging, and constantly improving without a single institution or copyright holder at the wheel. The amateur preservationists responsible for the FCE emulator stream aren't laboring away in some government-funded thinktank or corporate software lab. They're banging out code in their underwear in a room in the basement of their mother's house. These guys are self-professed underdogs. In fact, the Webmaster of the emulator community "Home of the Underdogs" apologized in 2002 for not updating the HOTU Web site by explaining that "I've been overhelmed with my exams (that, btw, aren't going very well :o( )"

Compounding the underground status of fan-based preservation is the fact that trading the read-only memory (ROM) versions of vintage games used for emulation is just as illegal as sharing copyrighted music or movies over the Internet. Game execs and intellectual property lawyers have proven a bigger threat to Mario than the dragons and mushrooms that supply his in-game adversaries. Maybe that's why so many of the Nintendo emulators contain the defiant-sounding acronym "FCEU."

Professionals have the artist, sources, institutions, funding, rights. So what have the fanboys got that we haven't got, besides litigious Nintendo lawyers breathing down their necks?

In short, they have crowdsourcing. Some people use this term to mean applying a lot of individual attention spans to the same problem, as when NASA invites a population of "crater locaters" dispersed across the Internet to identify features on the Martian landscape. But here I'm using crowdsourcing to mean not just more people, but also more connections among them.

Much as professional conservators might fear an army of amateurs, such "unreliable archivists" have kept their culture alive without any institutional mandate or managerial oversight, while highbrow electronic artworks decay into inert assemblages of wire and plastic in their climate-controlled crates. The 21st century may never know the splendor that was 20th-century media art, but the future of Mario is assured.

It's time for archivists, conservators, and others in the preservation profession to admit that we are behind and try to emulate the frontrunners in the race to save digital culture.

We've already got loads of resources. Scholars have built several excellent online databases for finding information on artists, artworks, and art movements. These include the archives of the Langlois Foundation for Art, Technology, and Science, coordinated by Alain Depocas; Media Art Net (MedienKunstNetz), coordinated by Dieter Daniels and Rudolf Frieling with the ZKM/Center for Art and Media Karlsruhe; and the Database of Virtual Art, coordinated by Oliver Grau and the Danube University Krems. All three are well researched, multilingual, relational databases packed with texts, images, and sometimes video documenting the fast-paced evolution of art and technology over the past fifty years. The accumulated knowledge accessible via their innovative interfaces represents many thousands of hours of research by archivists, interns, and software designers.

And yet a researcher who wants to learn more about, say, Shigeko Kubota, has to consult each of these important resources separately; there is currently no technical means to search all three at the same time. Search engines like Google are good at spidering pages that contain explicit links to each other, but as of this writing they are unable to dig up any Web pages accessed by a form, such as by typing "Shigeko Kubota" into a search field.

The technical challenge is formidable, which is one reason a solution has so far evaded the designers of online archives, not to mention Google's engineers. Yet there is another reason that speaks more to the ingrained habits of institutions than the structure of PHP or MySQL: today's collecting institutions, no matter how digitized, remain hamstrung by their own history as centralized repositories.

To correct this will require institutions to make their data more accessible--not just technically, but socially. That means stepping away from an authoritative role to accept the input of amateurs, whether or not they are associated with an institution.

Let's look at a recent development aimed at encouraging both social and technical access. Preservers of new media art have a variety of potentially confusing strategies to choose from, like storage, emulation, migration, and reinterpretation. The Variable Media Questionnaire is designed to make their job easier by recording opinions about how to preserve creative works when their current medium becomes obsolete. Originally the Questionnaire, which I contributed to as part of the Variable Media Network (now Forging the Future), was a standalone database template. My collaborators at the Langlois Foundation and I tried to make this template as accessible as possible by giving it away freely to any institution that requested it.

Originally the Questionnaire polled only the opinions of a single source: the artist. Over time, however, we realized that there are many cooks at work in the development of a creative work, and so we added the ability to distinguish between different opinions on the same work. For example, most databases record measurements ("This installation is 36 inches wide"), as though the registrar were Saint Paul writing down the words God spoke in his ear. The Variable Media Questionnaire, by contrast, records interpretations ("This installation should fill the room").

Some have suggested that this approach is limited to artworks that are somehow "conceptual." But it is in fact conventional databases that are based on a platonic ideal. They presume to record unchanging "facts," even though they actually vary by source and by version of the work. How do you know the current dimensions are correct? Or that they weren't different the last time it was installed? The Variable Media Questionnaire, on the other hand, records opinions voiced in interviews. And it is those interviews, rather than some supposedly eternal title, date, medium, or dimensions, that are the epistemological building blocks of the Questionnaire.

Of course, those interviews are only as inclusive as access to the Questionnaire allows. That's why the most recent version is no longer a standalone database, but has been rebuilt by Still Water Senior Researcher John Bell from the ground up as a Web service. Which means anyone can add their opinions about a work: the artist, a conservator, the artist's mother, even a random gallerygoer who happened to see the work installed somewhere. The philosophy of crowdsourcing doesn't presume all these motley opinions will be equally valuable to the future--but it does presume that we can't be sure which will be, and therefore we should cast the net as wide as possible.

The Variable Media Questionnaire has always been built with flexibility in mind: respondents can choose more than one answer to a question, and weight them according to their preferences. The latest Variable Media Questionnaire now enables users to add and customize components of works, questions that go with those components, and the answers that go with each question. And, unlike the vast majority of open-source projects, these modifications are easy to make by non-programmers: Bell has built an interface that knows how to alter itself.
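
To make the idea of multiple, weighted answers concrete, here is a minimal Python sketch. It is an illustration only, not the Questionnaire's actual code: the Response class, its field names, and the sample question and weights are all assumptions invented for this example.

```python
from dataclasses import dataclass, field


@dataclass
class Response:
    """One interviewee's answer to one question (hypothetical structure)."""
    interviewee: str
    question: str
    # Maps each chosen answer to a weight; a higher weight means more preferred.
    weights: dict[str, float] = field(default_factory=dict)

    def ranked(self) -> list[tuple[str, float]]:
        """Return (answer, share) pairs, most preferred first."""
        total = sum(self.weights.values())
        if total == 0:
            return []
        shares = {answer: w / total for answer, w in self.weights.items()}
        return sorted(shares.items(), key=lambda pair: pair[1], reverse=True)


# Invented sample: the respondent accepts two strategies but prefers one.
response = Response(
    interviewee="artist",
    question="What should be done when the original hardware fails?",
    weights={"emulation": 3, "migration": 1},
)
print(response.ranked())  # [('emulation', 0.75), ('migration', 0.25)]
```

Storing a weight per answer, rather than a single chosen value, is what lets a record say "emulate if you can, migrate if you must" instead of forcing one verdict.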

For example, let's say you wanted to add an interview about one of Daniel Spoerri's leftover meal installations to the Questionnaire. You would start by adding all the functional parts of the work, including some that are not so much physical components as essential aspects of the work to preserve. You might add types of material ("Custom Inert Material," a table to show it on), environment (a "Gallery" to put the table in), and interaction (a "Viewer" to look at it). Spoerri's work also contains remains of actual food--a critical component that unfortunately doesn't seem to match any of the parts available in the Questionnaire. A typical database would leave a registrar in this position with three unsatisfactory choices: write the info on a post-it and throw it in a physical file, shoehorn the information into an improper field, or call up the vendor and ask for a new field (good luck with that).

Not to worry! The Variable Media Questionnaire actually lets ordinary users add new components. So create a new part called Food, and add some appropriate questions ("What should be done with decaying food?") and answers ("Let it rot" or "Replace it with simulated food").

"Whoa," say the professional database developers, "you're delusional if you think the Questionnaire will be intelligible after you let every Tom, Dick, and Harry add their own crazy fields." Well, here's where we add some editorial oversight to the crowdsourcing project. No one can stop you from adding a custom part to your interview, but administrators must approve that new part before it shows up as an option for everyone else who adds interviews. (We've also applied this bottom-up approach to vocabulary, as we'll see later.)

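The rule just described, where anyone may add a part but only approved parts are offered to everyone, can be sketched in a few lines of Python. This is a guess at the logic, not the Questionnaire's real implementation: the PartCatalog class, its method names, and the user names are hypothetical.

```python
class PartCatalog:
    """Hypothetical catalog of component types shared by all interviews."""

    def __init__(self, approved):
        self.approved = set(approved)  # parts visible to every interviewer
        self.pending = {}              # part name -> user who proposed it

    def propose(self, name, user):
        """Record a custom part; its proposer may use it right away."""
        if name not in self.approved:
            self.pending.setdefault(name, user)

    def approve(self, name):
        """Administrator action: promote a pending part to the shared list."""
        if name in self.pending:
            del self.pending[name]
            self.approved.add(name)

    def options_for(self, user):
        """Parts a user can pick: approved ones plus that user's own proposals."""
        own = {n for n, proposer in self.pending.items() if proposer == user}
        return sorted(self.approved | own)


catalog = PartCatalog(["Custom Inert Material", "Gallery", "Viewer"])
catalog.propose("Food", user="registrar_a")
print(catalog.options_for("registrar_a"))  # includes "Food"
print(catalog.options_for("registrar_b"))  # does not, until approval
catalog.approve("Food")
print(catalog.options_for("registrar_b"))  # now includes "Food"
```

The point of the two-tier list is that moderation never blocks the person doing the interview; it only governs what becomes part of the common vocabulary.
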
Besides access, what really turbocharges crowdsourcing is connection. The traditional way to think of connecting data from different sources is to put it all in a giant union database, or if that's impractical, to enforce a common standard that everyone has to obey. These solutions have their uses--as mentioned, the Variable Media Questionnaire became more sociable as a Web service than as a standalone application. But they are also limiting when working with a community of heterogeneous practitioners and data. So we'll look at some options for stimulating sharing while respecting differences.

It's possible to build the interface to a single dataset in a way that respects differences. Users of the latest Variable Media Questionnaire can compare opinions as they vary by work, by interviewee, and by date. One of the most illuminating features of such comparisons is the ability to highlight disagreements--a situation you aren't likely to see on the wall label for an artwork, but one I think is especially interesting from a historical perspective. When the Questionnaire reveals that the Eva Hesse estate and her close friend Sol LeWitt disagreed about whether her deteriorating sculptures should be emulated or left to die, that provokes an illuminating discussion about the aesthetics of postminimal art.
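
Highlighting disagreements amounts to grouping opinions by work and question and flagging any group with more than one distinct answer. The Python sketch below shows one way to do that. The record layout, the disagreements function, and all of the sample data are invented for illustration and do not reproduce any real interview.

```python
from collections import defaultdict

# Invented sample data: (work, interviewee, year, question, answer).
opinions = [
    ("Work X", "interviewee_a", 2004, "deteriorating material", "emulate"),
    ("Work X", "interviewee_b", 2004, "deteriorating material", "leave as is"),
    ("Work X", "interviewee_a", 2004, "lighting", "daylight"),
    ("Work X", "interviewee_b", 2008, "lighting", "daylight"),
]


def disagreements(records):
    """Return {(work, question): {answer: [interviewees]}} where answers differ."""
    grouped = defaultdict(lambda: defaultdict(list))
    for work, interviewee, _year, question, answer in records:
        grouped[(work, question)][answer].append(interviewee)
    return {
        key: dict(answers)
        for key, answers in grouped.items()
        if len(answers) > 1  # more than one distinct answer means a dispute
    }


for (work, question), answers in disagreements(opinions).items():
    print(work, "/", question, "->", answers)
```

Because every opinion keeps its source and date, the same records could just as easily be grouped by interviewee or by year.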

It's a bit harder to build an interface to multiple datasets that respects difference, because each of the datasets can be so...different. This is why the dominant philosophy to date has been, "put all the data in one place." Which unfortunately turns out also to mean, "put a big bottleneck in between my data and my users."

Now, few database designers would deliberately build a bottleneck into their own system. Yet whether under pressure from copyright holders or simply by force of habit, almost all of them design their databases as segregated silos--which can amount to an entire array of bottlenecks.

To be sure, most museums and archives in the United States and Europe have developed in-house databases and/or Web sites, and a smaller but significant proportion have databases that can be searched via their Web sites. So a curator who wants to search for "television" can consult the comprehensive databases of the Langlois Foundation, MedienKunstNetz, or the Database of Virtual Art.

What a researcher currently cannot do, however, is to search for the theme "television" across all, or even a handful, of such databases. For efficiency, such online databases are typically accessed via server-side scripts that take the form "index.php?theme=television," a formula that Google and Yahoo cannot spider. As a result, millions of dollars and countless hours of staff time and expertise are spent squirreling data away in private silos inaccessible to a broader public.
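
For a sense of what a cross-archive search would involve, consider this rough Python sketch of a federated query. Everything specific in it is assumed: the example.org addresses are placeholders, real archives use different scripts and parameter names, and the fetch step is faked so the sketch runs without network access.

```python
from urllib.parse import urlencode

# Placeholder base URLs; real archives expose different scripts and parameters.
ARCHIVES = {
    "archive_a": "https://archive-a.example.org/index.php",
    "archive_b": "https://archive-b.example.org/index.php",
}


def query_urls(theme):
    """Build the form-style URL each archive would need for a theme search."""
    query = urlencode({"theme": theme})
    return {name: f"{base}?{query}" for name, base in ARCHIVES.items()}


def federated_search(theme, fetch):
    """Send one query to every archive and merge results, tagged by source.

    `fetch` is any callable that takes a URL and returns a list of titles.
    """
    merged = []
    for name, url in query_urls(theme).items():
        merged.extend((name, title) for title in fetch(url))
    return merged


def fake_fetch(url):
    # Stand-in for an HTTP request plus parsing of each archive's response.
    return ["Sample record"] if "theme=television" in url else []


print(federated_search("television", fake_fetch))
```

The hard part is hidden inside fetch: each archive returns results in its own format, so a real version needs one adapter per source.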

We've run into a similar incommensurability in Forging the Future, the alliance of museums and cultural organizations currently working on the release of new preservation tools. We wanted each tool to be useful on its own, but be even better when combined with other Forging the Future tools, or even with proprietary databases. But we soon found it difficult to convince our differing kinds of data and platforms to play nice; it's hard to get a Web-based union database and a desktop-based Filemaker client to speak the same language. The last thing we wanted to do was to jam everything into yet another silo'd database, so instead we went in search of a software equivalent of Star Trek's "universal translator"--maybe not strong enough to translate Klingon into English, but at least able to make the introductions between related people and artworks in different databases.

Still Water's John Bell came up with the idea of a Metaserver that could act like a sort of ISBN for art by generating unique, portable IDs for people, works, and vocabulary. Any database with access to the Internet--even a desktop application like Filemaker--can hook into the Metaserver through an open API, at which point a registrar adding records to that database could simultaneously view or add to related data from every other database on the system. The Metaserver tunnels between silos.

As co-developer Craig Dietrich likes to say, the Metaserver isn't an archive, but rather an "inverse archive" that stores pointers to records in other folks' archives. Of course, the Semantic Web has promised this for some time, but there are plenty of doubts about when, and whether, it will ever arrive. (It's like the joke about fusion: it's the technology of the future, and always will be.) But registries like the Metaserver are lightweight and easy to build with practical techniques we have right now.
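
A toy version of such a registry fits in a few dozen lines of Python. To be clear, this is not the Metaserver's API, which the essay does not document; the Registry class, its methods, the use of random UUIDs as portable IDs, and the record keys are all assumptions made for the sake of illustration.

```python
import uuid
from collections import defaultdict


class Registry:
    """Toy 'inverse archive': stores pointers to records, not the records."""

    def __init__(self):
        self._ids = {}                      # (kind, label) -> portable ID
        self._pointers = defaultdict(list)  # portable ID -> [(database, key)]

    def identify(self, kind, label):
        """Return the portable ID for a person, work, or term, minting one if new."""
        key = (kind, label.strip().lower())  # naive matching on a cleaned label
        if key not in self._ids:
            self._ids[key] = str(uuid.uuid4())
        return self._ids[key]

    def register(self, kind, label, database, local_key):
        """Note that `database` holds a record about this entity."""
        portable_id = self.identify(kind, label)
        self._pointers[portable_id].append((database, local_key))
        return portable_id

    def related(self, portable_id, exclude=None):
        """List pointers to records about the same entity in other databases."""
        return [p for p in self._pointers.get(portable_id, []) if p[0] != exclude]


registry = Registry()
work_id = registry.register("work", "TV Garden", "questionnaire", "rec-17")
registry.register("work", "tv garden ", "pool", "item-42")
print(registry.related(work_id, exclude="questionnaire"))  # [('pool', 'item-42')]
```

Matching entities by a cleaned-up label is the weakest link here; any serious registry would need a better way to decide when two records describe the same person or work.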

New media conservators such as Anne-Marie Zeppetelli and Joanna Phillips have described how hard it is to get IT departments and database vendors to add new modules for preservation. Forging the Future sidesteps this problem by injecting content into an existing database rather than adding new fields to it.

So far the Metaserver team has prototyped the API and is working on testbed implementations with two distinct databases, the 3rd-generation Variable Media Questionnaire and The Pool, an online environment that tracks collaboration. For example, look up Nam June Paik's TV Garden in the Variable Media Questionnaire and under "resources" you'll see that there is an associated record in The Pool. Click on that link and you'll find out what information The Pool has on TV Garden: namely that it's been rated highly by Pool users, and that it has inspired several subsequent works of new media.

Meanwhile, Forging partners Nick Hasty of Rhizome and Michael Katchen of Franklin Furnace have demonstrated how the Metaserver could help crowdsource a vocabulary shared among artists and curators in different institutions. In this model, VocabWiki, a cross-institutional collective vocabulary for variable media works, is an editable set of terms and definitions fed from tags contributed by Rhizome's Artbase community ("generative art", "posthuman"). Thanks to the Metaserver, occurrences of those tags on the Rhizome Web site will be automatically hotlinked to VocabWiki for the latest definitions. So if you happen upon a work tagged "Virtual Reality" on the Artbase, you'll see a link to a definition of that term on VocabWiki.
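
Automatic hotlinking of this kind boils down to a lookup: if a tag matches a known term, wrap it in a link to that term's definition. Here is a minimal Python sketch under stated assumptions. The VocabWiki address is a placeholder, since the essay gives no URL scheme, and the hotlink function and the hard-coded term list stand in for whatever the Metaserver would actually supply.

```python
from html import escape
from urllib.parse import quote

# Placeholder address; the real VocabWiki URL scheme is not given in the essay.
VOCAB_BASE = "https://vocabwiki.example.org/term/"

# Terms assumed to have definitions already; matched case-insensitively.
KNOWN_TERMS = {"generative art", "posthuman", "virtual reality"}


def hotlink(tag):
    """Return an HTML link for a known tag, or the escaped plain tag otherwise."""
    normalized = tag.strip().lower()
    if normalized not in KNOWN_TERMS:
        return escape(tag)
    return f'<a href="{VOCAB_BASE}{quote(normalized)}">{escape(tag)}</a>'


for tag in ["Virtual Reality", "net.art"]:
    print(hotlink(tag))
```

Since the link points at the term rather than at a copy of its definition, edits made on the wiki show up everywhere the tag appears.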

In the past decade, a number of exciting new contenders have joined the race to rescue digital culture, so that the field now includes veterans like INCCA and Matters in Media Art as well as newcomers like DOCAM and the third-generation Variable Media Questionnaire. Rather than waiting for time to knock all but one victor out of the running, I believe we should respect the differences between these various tools and communities. Protocols like the Metaserver should help connect the silos without destroying what makes each unique.
