- June 15, 2011
- 5:28 pm
- Sabine Schramm
Capturing the Continually Changing Web
By Bob Vail, SOC Website Reporter
There are enormous museums that catalog and archive important achievements in culture, science and history. How many preserve the history of the most influential invention of our time, the perpetually evolving World Wide Web? Well, none.
Meghan Dougherty, Assistant Professor—Digital Communication, discussed the challenges of archiving the Web in her presentation “Lost 404 Error Understanding Web History: Archive and analysis tools for studying digital cultural heritage” on April 6 in the Terry Student Center on Water Tower Campus. The presentation was part of the School of Communication’s Faculty Speaker Series, which included “The Making of an Interactive Web-Series,” presented by Aaron Greer, Associate Professor—Digital Cinema and Media Production, and “The Role of Gender in Gaming” presented by Adrienne Massanari, Assistant Professor—New and Digital Media.
“The Web tends to feel like an infinitely accessible library, a storehouse of information, waiting to be recalled, rendered and viewed, remixed and reposted,” said Dougherty, “but, in most spaces of the Web, things are continually overwritten, edited, excerpted, and deleted.”
We, as users of the Internet, tend to see it as a magical place where everything that is ever put on the Internet will be there forever. This is simply not true.
Dougherty used whitehouse.gov as an example. If you visit the site right now, you will see it as it exists right now. But what if you wanted to examine how President Obama’s use of the site has changed during his years in office? You can’t; no one bothered to save how the website operated over time. What if you wanted to study how it was used during different administrations? Forget about it; that data does not exist.
One place is trying to save the history of the Web: the Internet Archive and its Wayback Machine. Its intentions are good, but even it has problems. The Wayback Machine simply takes a snapshot of the page and does not archive any of the site’s functionality.
Dougherty believes that this is the new twist on archiving ephemera (something transitory or short-lived).
“We use the rigor of archiving on things that we know are important, like newspapers,” she said.
“All of the Web needs to get put in that category. Even the places we know we should archive, like the New York Times (nytimes.com), they’re only archiving the content,” she said. “They’re not actually thinking about the container of the website and how important that is, why it would matter, and the different kinds of questions we could ask about the structure of that thing.”
Dougherty explained that though content is saved, advertisements are not, nor is any record of which copy lies above or below the scroll. A lot of history is being lost. “We have lost a lot of perspective. We’ve lost the perspective about how easy things are now and how hard it used to be to create these things. We have lost a lot of the sense of the development of Web design. We’ve lost a lot of that history, it’s just gone,” she said.
The problem is not that difficult to fix, according to Dougherty, who outlined three steps: build a community that involves everybody (but especially archivists and tool builders coming together), create standards and practices for archiving, and build tools.
Loyola is currently involved in creating a digital repository for faculty and student work. Dougherty is currently working with the LUC Digital Repository Research Committee to create a space large enough to help alleviate some of these problems as they are faced by Loyola faculty and students. There have been other attempts to archive the Internet (like the Internet Archive and the Library of Congress) but they can only do so much.
The second step is establishing practices and standards for Internet archiving. “Web archiving should be standard practice for cultural institutions,” Dr. Dougherty said. “In academic circles, we need to instill in students the importance of archiving web materials as we study them.”
The third and final step is to build the tools that we will use to archive. “We need to support infrastructure development for e-research, funding and institutional structures,” she said, “but we also need support for individual, ad hoc contributions to that infrastructure. We need more researchers collaborating to build their own research tools for digital scholarship and share them.”
What is most important is that everyone needs to do their part. The good news is that it’s not that difficult. “Just save everything. Document everything because we can,” Dougherty said. “It takes zero time and storage is cheap. We can pile all kinds of storage together.
“Even if you make the worst archive in the world of a bunch of pages today, 10 years from now I might be really disappointed that I can only answer question A and not B and C because you didn’t capture that stuff, but at least that’s something. It’s better than nothing.”



