The ephemerality of online files: are our photos safe?

On September 1, 2017, Tuenti, the Spanish social network that launched in 2006 and was for a time the platform par excellence for teenagers, closed permanently. With it, the more than 6 billion photos that users had uploaded disappeared. The network had warned its users and offered them a tool to download their albums, but many people lost those images through forgetfulness, carelessness or simply not having found out in time.

Anyone with more than a decade of experience on the Internet can tell anecdotes about data they once believed to be eternal vanishing into the ether of the network: emails that evaporate when you stop using an email service (even if the account is never closed); messages and posts on forums that cease to exist; blogs deleted when the platform that hosted them closes; photos deleted when the company imposes a new limit; migration failures that cause 50 million songs to be lost.

The latest example is MySpace, which announced in 2019 that, oops, something had gone wrong and it had been impossible to recover those lost songs, uploaded to the service between 2009 and 2015. A few days later, however, the Internet Archive, an organization dedicated to preserving the web, published a catalog of almost 500,000 of those audio files. In other words, users had lost access and MySpace had kept no backup copies, but much of the data was hosted elsewhere: an academic group had downloaded that music a few years earlier and sent it to the Internet Archive. But if people who have lost their photos or emails do not know that copies exist on other servers, or cannot access them, then as far as they are concerned the data has disappeared. And that, for that matter, is not at all unusual.

“All our online content will disappear completely sooner or later,” says Daniel Gayo Avello, professor in the Area of Computer Languages and Systems at the University of Oviedo. How long it takes to disappear, he explains, will depend on how actively we work to preserve it. “If all my photographs, videos, messages and emails are on some platform, their permanence depends, obviously, on the terms of use and on the very survival of the platform. For example, depending on the terms of use, my content may disappear if I go some time without accessing my account (that is, I wouldn’t trust my Hotmail emails to still be there). On the other hand, if the company that owns the platform so decides, that content can disappear from one day to the next,” he elaborates.

Believing that the personal history we have been uploading or publishing in different corners of the internet will always be there is somewhat naive. Gayo Avello compares the web to a forest. “It can stand in one place for centuries and, although some of its trees may be centuries old, most are not. Trees grow, change, die, and the forest sometimes also grows, but at other times it shrinks, either due to chance events or intentional actions. The same thing happens with the Web: some websites arrive and others disappear,” he explains.

There are figures for all this: a recent Pew Research report indicates that 25% of the web pages that existed at some point between 2013 and 2023 are no longer accessible. Among the oldest ones, those from 2013, the share that has disappeared rises to 38%. Gayo Avello, who in 2022 gave a talk on this topic at the TechFest of the University of Oviedo, gives as an example The Million Dollar Homepage, a relic from almost twenty years ago that sought “a form of monetization that today seems quite childish: selling each pixel of a 1000×1000 pixel banner for $1. Each advertiser could buy the portion they wanted and have a link to their site. In 2014, less than ten years after its launch, more than 20% of the linked sites no longer existed,” he explains.

Returning to our personal files hosted on different services, should we begin to fear their disappearance? Are the images we keep in, for example, Google Photos in danger? Lorena González Manzano, cybersecurity specialist and member of the Computer Security Lab (COSEC) working group at Carlos III University, explains that “nothing is 100% secure and anything can be attacked.” However, “if the service provider is trustworthy or a large company (for example, Google), we assume reasonable security.”

A cyberattack could end with data being deleted, but the companies that host the data usually have “systems to prevent user data from being lost, both in the event of a cyberattack and in the event of a service failure.” Furthermore, the expert continues, the attackers’ objective is usually not to delete the data but simply to access it. “However, what attacks such as ransomware do is access the service where our data is hosted, encrypt it and demand money, either from us or from the company, in exchange for recovering it or for not revealing it or making it public,” she points out.

Studying history with disappearing data

The disappearance of websites and personal publications also means the loss of very valuable documentary sources for writing the history of these decades. To preserve at least some of the richness of the web, organizations like Archive Team have been archiving web content for years so that it is not lost: Blogger blogs (which may disappear if they are associated with inactive Google accounts), relevant public messages on Telegram, YouTube videos…

“The main problem of working in digital environments is the ephemerality of data,” agrees Elisa García Mingo, who holds a doctorate in Social Anthropology and teaches at the Faculty of Political Sciences and Sociology at the Complutense University of Madrid. “We notice it because we see things disappear in the course of our research: an account that you follow, a website…”, she points out.

A large part of scientific knowledge is also at risk. According to a study published earlier this year that investigated how digital copies of academic articles are archived (in many cases there is no longer a physical copy), a third of publishers did not appear to have any kind of archiving activity in place to preserve them. (And fewer than 1% of academic journal publishers archived their articles properly, with copies in at least three archives.)

On the other hand, talking about digital ephemerality does not mean that the opposite problem does not exist: content that we want to disappear and that does not, which has given rise to all the claims about the right to be forgotten. García Mingo, who studies digital sexual violence practices among young people, points out that there is something paradoxical in all this. “Sometimes we treat data as if it were going to be permanent when in reality it is ephemeral. But, on the other hand, people who carry out their digital social practices as if they were not going to be archived, as if they were going to be volatile, end up leaving much more of a digital trace,” she says. “The digital trace is much more permanent than, for example, adolescents experience it to be. Furthermore, even when you save or publish while being aware of its permanence, you create an archive over which you have no control. It’s like having an archive but not having control of the building in which it is housed; you don’t even have access to the staff who are managing it.”

How to preserve what we do want to save

In digital archiving there are almost as many styles as there are people. Elisa García Mingo explains that it is much like what happened in the analog era. “There were those who, when they had their photos developed, selected them, organized them and made a very elaborate album, and those who simply put them in a cookie tin,” she says. The same thing happens in the digital world. “There are people who create an archive without archival awareness, and there are people who have a very high level of digital archiving. They are the two poles: from a giant trail that you leave in a kind of conscious chaos to the most elaborate practices, all the people who make an album or a calendar or a video summary every year,” she explains.

If we want to make sure we never face the unpleasant surprise of having lost photos, emails or documents that we did want to keep, the level of archiving must be raised a little higher. “The United States Library of Congress coined an acronym, IDOM, sometimes IDEOM, which stands for ‘identify, decide, export, organize and make copies’,” says Daniel Gayo Avello. Although the idea is simple, it requires “effort and perseverance.”

The expert explains the steps:

  • “We must identify all the digital content we have and where (for example, photographs, videos, audio, messages, websites, other types of digital files, etc.).”
  • Decide “what content is most important (for example, do we really need the 200 photos we took on that trip? Do I need a copy of all my emails?).”
  • Depending on the content, we may need to export it: “emails, WhatsApp messages, our tweet archive…”.
  • Organize material, which involves “giving meaningful names to files and organizing them into directory structures.” This part is key to later finding what we are looking for (Gayo Avello admits that he skips it, but then it takes him a long time to locate what he wants).
  • Make copies. “The 3-2-1 rule can apply here: at least three copies of the data, using at least two different storage systems and with at least one copy in another physical location.”
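As an illustration of the last step, the 3-2-1 rule can be written as a simple check. The following is only a minimal sketch: the `BackupCopy` class, its fields and the example labels ("laptop", "cloud", and so on) are hypothetical names chosen for this example, and "another physical location" is interpreted here as the copies spanning at least two distinct locations.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass(frozen=True)
class BackupCopy:
    """One copy of the archive. All field values are illustrative labels."""
    name: str      # e.g. "laptop"
    medium: str    # storage system, e.g. "internal-ssd", "external-hdd", "cloud"
    location: str  # physical place, e.g. "home", "provider-datacenter"


def satisfies_3_2_1(copies: list[BackupCopy]) -> bool:
    """Return True if the copies meet the 3-2-1 rule as quoted above."""
    # At least three copies of the data.
    enough_copies = len(copies) >= 3
    # At least two different storage systems.
    enough_media = len({c.medium for c in copies}) >= 2
    # At least one copy elsewhere: assumed to mean two or more distinct locations.
    has_offsite = len({c.location for c in copies}) >= 2
    return enough_copies and enough_media and has_offsite


if __name__ == "__main__":
    plan = [
        BackupCopy("laptop", "internal-ssd", "home"),
        BackupCopy("usb-drive", "external-hdd", "home"),
        BackupCopy("cloud", "cloud", "provider-datacenter"),
    ]
    print(satisfies_3_2_1(plan))  # True
```

A plan with only the laptop and the USB drive, both kept at home, would fail all three conditions: too few copies and no copy off-site.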

All of this must also be updated and maintained, so that we do not end up with a highly organized archive of documents in obsolete formats that we no longer have any way to read.

From the point of view of cybersecurity, Lorena González Manzano recommends that, if we store very sensitive data in external services, we “encrypt it in some way.” On the other hand, if we do not want to rely on any service, “we can buy a hard drive to store the data ourselves or, better yet, a NAS, which is a high-capacity hard drive that recovers the data even if some of it gets damaged, for example, by a power outage.”
