The archive designed to be destroyed: Inside the campaign to save Ukraine’s digital footprint

Over 1,300 volunteers are preserving Ukrainian heritage, one website at a time.

Aug. 17, 2022, 4:02 p.m.

By Carolyn Stein

An academic technology specialist at Stanford co-founded an online archive to preserve digital Ukrainian culture. Since its founding, the archive has preserved over 50 terabytes worth of Ukrainian websites and other online content. (MICHELLE FU/The Stanford Daily)

The exhibit was titled “I’m Ukrainian and that sounds proud.” The pieces featured sunflowers and maps of Ukraine. Even on a computer screen, the exhibit radiated joy. It was small, but it caught the eye of Stanford academic technology specialist Quinn Dombrowski. There was no time to waste. Doing what she does best, Dombrowski acted quickly, pulling out her digital archiving tools and saving the online exhibit.

Unlike some of the other websites Dombrowski archived, this exhibit wasn’t part of a museum. It wasn’t a collection of 16th century paintings. It wasn’t even created by a prominent artist. It was a collection of art pieces made by a group of children, featured in an after-school program’s blog post. 

“I think having multiple young children myself, some of these things geared towards kids have really spoken to me,” Dombrowski said. 

She archived it just in time: “I checked back with the site later, and they had actually pulled down that last blog post and the one before it, I think not wanting to sort of make themselves more of a visible target,” she said.

Watching Ukrainian websites be up one day and down the next is nothing new for Dombrowski, but a website going down is not always a citizen’s doing. Sometimes a website is lost to cyberattacks, power outages or Russian shelling. Even before the Feb. 24 invasion, another battle was taking place between Russia and Ukraine: a cyberwar.

Archiving the children’s art is the kind of work Dombrowski has been doing since Russia invaded Ukraine in late February. The physical attack was accompanied by a much less visible danger: the escalation of that cyberwar, which put Ukraine’s digital history and cultural heritage at risk of complete erasure.

In response, Dombrowski, along with Tufts University music librarian Anna Kijas and Austrian Centre for Digital Humanities and Cultural Heritage digital historian Sebastian Majstorovic, started an online archive to save Ukraine’s digital history and life. They named the archive Saving Ukrainian Cultural Heritage Online, or SUCHO for short. More than 1,300 cultural heritage professionals have since signed up to archive Ukraine’s at-risk websites.

It’s a tale both SUCHO volunteers and others find inspiring, on par with war stories of people using Molotov cocktails as a weapon of resistance, Dombrowski said.

However, there is also a darker reality that lingers in the minds of SUCHO volunteers.

“Yes, it’s a feel-good story,” Dombrowski said. “But it also points to a failure of infrastructure. It should never come to this. There needs to be a greater investment in pre-emptive web archiving.” 

That kind of pre-emptive web archiving could have ensured that the “I’m Ukrainian and that sounds proud” exhibit was saved before SUCHO stepped in. 

The children’s art exhibit still can’t be found online by the public: The goal of the online archive is not to become a digital museum for others to enjoy. Most of the digital items SUCHO volunteers archive aren’t even displayed on SUCHO’s website. Rather, the end goal for SUCHO is self-destruction. Once the war is over, Dombrowski and all the other volunteers intend to return the websites to their original owners and help restore Ukraine’s online presence. 

“Digital repatriation is our goal, not creating some kind of safe archive for people to study from the West,” Dombrowski said. “We want Ukraine to come out of this and be in a position to rebuild the cultural heritage sector. And if there is data that we have archived that is of use, we want to put that back in their hands.”

In other words, Dombrowski is “data-sitting.”

Data-Sitting 101

Two days after Russia invaded, Kijas, the music librarian at Tufts University, put out a call for a “data rescue session.” She already planned on attending a March 5 virtual conference for digital humanists and saw an opportunity to help Ukrainians.

Kijas wondered what she could do as a librarian familiar with digital tools. She had a list of specific libraries and archives in Ukraine with unique music collections. She decided that she could help find these items and digitally archive the web pages.

Kijas’ tweet caught the attention of Dombrowski and her colleague Majstorovic. The two approached Kijas with an idea: What if they rescued more than just the music collections — and did it at a faster pace? 

Initially, the team planned to use the Internet Archive’s Wayback Machine to save information, but the Wayback Machine offers only a “superficial crawling of web pages,” Kijas said. It can save only the first or second layer of a website. More complex elements, such as interactive features and scripted code, can be lost.

To counteract this problem, Majstorovic proposed that the team use Webrecorder, an open-source tool that can go seven or eight layers deep into a website and capture some of the more complex media. The open-source nature of the tool also meant that volunteers could directly edit the code and modify the tool to their specific needs. 
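To make the idea of “layers” concrete, here is a minimal sketch of a depth-limited crawl. It is not how the Wayback Machine or Webrecorder is implemented. It assumes a “layer” means one link hop away from the starting page, and the site map and the `crawl` function are hypothetical names invented for illustration.

```python
from collections import deque

# Hypothetical site map: each page lists the pages it links to.
SITE_LINKS = {
    "home": ["exhibits", "about"],
    "exhibits": ["exhibit-2022"],
    "exhibit-2022": ["gallery"],
    "gallery": ["photo-1"],
    "photo-1": [],
    "about": [],
}


def crawl(links, start, max_depth):
    """Return pages reachable within max_depth link hops of start (breadth-first)."""
    seen = {start}
    queue = deque([(start, 0)])
    captured = []
    while queue:
        page, depth = queue.popleft()
        captured.append(page)
        if depth == max_depth:
            continue  # do not follow links beyond the depth limit
        for target in links.get(page, []):
            if target not in seen:
                seen.add(target)
                queue.append((target, depth + 1))
    return captured


shallow = crawl(SITE_LINKS, "home", 1)  # one layer deep
deep = crawl(SITE_LINKS, "home", 7)     # seven layers deep
```

With a limit of one hop, the crawl stops at the pages linked from the home page and never reaches the gallery or its photos. With a limit of seven, every page in this small example is captured.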

“From there, once we started promoting our project and getting lots and lots of volunteers to join, people started to also think about, ‘Okay, what other tools can we use?’” Kijas said. “We also have no budget, right? So we’re just doing this as volunteers, and everyone who’s helping is a volunteer.”

Browsertrix Crawler is another tool that is popular among the volunteers and is particularly good for capturing websites that have more advanced features such as 3D tours and calendars. Carrie Pirmann, a librarian at Bucknell University who described herself as “reasonably techie,” set up the tool on both her home and work computers in early March. In her office, she will frequently run the crawler on her computer in the background as she works on other tasks — she has since crawled over 300 websites.

“This is kind of where my library and sleuthing skills really have come into play,” Pirmann said. “I don’t speak Ukrainian. I don’t read Cyrillic.”

Pirmann is also part of the situation monitoring team, where she uses Google Maps and library directories to identify any heritage websites that may need archiving. One of Pirmann’s favorite websites that she has archived belongs to a public library in a small town called Bohodukhiv, which shares a sister city relationship with Pirmann’s hometown of Boyertown, Pa.

“I’ve sort of started calling them digital scrapbooks,” Pirmann said, explaining that these websites not only have information about their library and book collections, but also extensive photo albums covering everything from events to children’s programming. Pirmann hopes to one day return this website to the library.

“Maybe people can find pictures of their kids. It may be one of the few places where if your apartment or your house has been bombed and you’ve lost everything, you might still be able to go back into that archive eventually and find a picture or two of your children,” Pirmann said.

Web crawling is not the only part of the archiving process. Producing metadata is another important part. Kim Martin, assistant professor of history at the University of Guelph, works on the metadata team. When an item is uploaded to the Internet Archive, it produces basic metadata such as a title, a description and a URL. The team, however, needed its metadata to capture more than that.

“So we needed not just like the URL for a JPEG image, but we needed to show what page that was linked off of, we needed to give it context,” Martin explained. The team now has 16 metadata fields that they regularly fill out, making conscious decisions about whether to fill out metadata fields in English, Ukrainian or Russian.

“We’re all learning from each other and really taking on tasks and moving forward as a group, which is pretty interesting, because none of us knew each other [before],” Martin said. 

From Lewisburg, Pa. to Vienna, Austria, SUCHO volunteers coordinate all of their work through a Slack workspace. There is never a dull moment in the Slack, with channels ranging from rapid response teams to ones just for sending memes.

“Someone found a Wild West restaurant in Ukraine that had the most whimsical description of cheese sticks as sheriff snacks, and almost like a micro-story in the menu description of cheese sticks,” Dombrowski recalled. It is these small moments that help sustain the SUCHO volunteers through their work.

“Memory Workers” and “Mobilized Humanities”

The darker side of culture preservation never leaves the minds of SUCHO volunteers. Every bomb alert, website and digital artifact is a reminder of the failure of infrastructure and of nations to proactively protect international culture. 

“It is a failure of infrastructure, it’s also a failure of like, international efforts of care and generosity, right?” Martin said. “We care so much about our own spaces and our own things that we put up on the web. And, you know, most librarians or archivists are memory workers, information workers and can speak to what their own country does, but not often across borders.” 

However, data rescue and emergency response humanities are not new fields. Alex Gil, a SUCHO volunteer and self-described “experimenter” at Columbia University, has long been an advocate for digital librarians and scholars to use their work to address modern day crises. Gil and his colleagues at Columbia have a specific name for this type of work: “mobilized humanities.”

“Everything that you’ve seen in my trajectory… all of that is an error-response to a failure of infrastructure,” Gil explained. “All of these things are things like, ‘Wow, yes, I’m glad we had that.’ And then there’s always a sense of like, ‘Why aren’t there full-time employees, either for the government or private or nonprofit, that do this for a living?’”

Global heritage preservation could benefit from a sister city model, both Gil and Dombrowski think. Under this system, cities across the world would partner with each other and, at least once a year, work together to archive items that they think need to be backed up. One of the main benefits of this model is that it would be decentralized and multinational.

“The Internet Archive right now does a great job being the world’s Internet Archive; they grab more stuff than anybody else. But it has one fatal flaw. It is centralized. And it is centralized in the United States. And there are biases seen in it,” Gil said, adding that the archive can skip websites not based in the U.S.

SUCHO coordinators have already reached out to organizations like UNESCO and the International Federation of Library Associations to discuss making the sister city archiving model a reality. Although digital archiving may not completely make up for the destruction of material artifacts, digital reproductions can still help in the process of reconstructing identity. 

The work to preserve digital heritage could also be used as evidence in future war crime tribunals. The International Criminal Court (ICC) has classified intentional destruction of cultural heritage as a war crime since 1998. However, the ICC didn’t convict anyone of intentional cultural heritage destruction until 2016. 

The ICC has sent a team of investigators and forensics experts to Ukraine. On May 23, Ukrainian courts sentenced a Russian soldier to life in prison in the first war crime trial. The Russian government is suspected of committing over 15,000 war crimes.

So, can 50 terabytes — and counting — worth of digital culture and heritage websites lead to justice?

Only time will tell. 

However, the work of the SUCHO volunteers has positive impacts, even if they feel minimal. Martin saw the impact when she spoke to her best friend of 20 years.

“She just wrote to me after seeing a news article and she was like, ‘You’re protecting my cultural heritage.’ I was like, ‘Oh my god, of course, you’re Ukrainian.’ I didn’t even put it together,” Martin said. “I just did this. I dove into this thing. And then I got a letter from her mom and I was like, ‘Oh my goodness, this is nuts.’” 

It is those moments, she said, that make this work worth it.

Carolyn Stein serves as the Magazine Editor for Vol. 263. She is double majoring in communications and East Asian studies. Her favorite activity is going on unnecessarily long walks. Contact her at news 'at' stanforddaily.com.