You may not realize it, but the web we know and love is far more ephemeral than it appears. Behind its apparent stability lies an insidious threat that slowly but surely erodes our precious digital heritage: digital decay!
According to an in-depth study by the Pew Research Center, 38% of the web pages that existed in 2013 had vanished into the void of cyberspace a decade later. That figure comes from examining a representative sample of web pages drawn from the Common Crawl archives for each year from 2013 to 2023. Taken across the whole period, a quarter of all the pages that existed at some point between 2013 and 2023 are now inaccessible, their content lost forever… sniff…
It’s true that the web is in a constant state of flux. Sites change addresses, servers die, hosting companies shut down… and the damage doesn’t stop there, because this cyber gangrene also attacks links. Like the roads of a forgotten kingdom overgrown with weeds, 23% of pages on news sites and 21% of pages on government sites contain at least one link that now leads to a vanished destination, swallowed by time. Often a single page has been deleted or moved, but sometimes an entire site disappears.
Even Wikipedia, renowned for the quality of its sources, is not spared. Indeed, 54% of the articles in the collaborative encyclopedia have at least one reference pointing to an absent page. Enough to shake the foundations of the temple of knowledge!
As for social networks, it’s even worse. On Twitter (now X, thanks Elon for that crappy name!), nearly 20% of tweets are no longer publicly visible within months of being posted. This volatility is even more pronounced for tweets in Turkish or Arabic, with over 40% disappearing within three months. Tweets from accounts that use the default profile settings are also more likely to vanish. In 60% of cases, the tweet became invisible because the account was made private, suspended, or deleted; in the other 40%, the author deleted the individual tweet.
Faced with this hemorrhaging of data, the Internet Archive and its famous Wayback Machine strive to save entire swathes of the web before it’s too late… but more effort and imagination are needed. The web is growing at an astonishing speed, and preserving this immaterial heritage of humanity remains a major challenge.
Until a miraculous solution is found, remember to regularly back up your favorite sites and content, make archives that everyone can access, and don’t hesitate to report dead links to their owners! The web is our common good, and we must protect it from the ravages of time. Personally, I admit that I automatically delete my tweets after a few months, but as for my site’s archives, few pages have disappeared… I try to keep it all afloat, even though it feels a bit futile, since much of my old content is now outdated.
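If you want to hunt for dead links yourself before reporting them, here is a minimal Python sketch of how that could look. It is only an illustration under a few assumptions: the function names (`classify_status`, `fetch_status`, `closest_snapshot`, `wayback_lookup`, `report`) are made up for this example, the status categories are my own rough rule of thumb, and the JSON shape handled by `closest_snapshot` is what the Wayback Machine availability endpoint returns as far as I understand it.

```python
import json
import urllib.error
import urllib.parse
import urllib.request

# Wayback Machine availability endpoint (response shape assumed, see closest_snapshot).
WAYBACK_API = "https://archive.org/wayback/available"


def classify_status(status):
    """Label an HTTP status code; None means the request never got an answer."""
    if status is None:
        return "unreachable"
    if status in (404, 410):
        return "dead"
    if 200 <= status < 400:
        return "alive"
    # 403, 405, 429, 5xx...: could be temporary, or the server dislikes bots.
    return "uncertain"


def fetch_status(url, timeout=10):
    """Send a HEAD request and return the status code, or None on network failure."""
    request = urllib.request.Request(
        url, method="HEAD", headers={"User-Agent": "link-check-sketch"}
    )
    try:
        with urllib.request.urlopen(request, timeout=timeout) as response:
            return response.status
    except urllib.error.HTTPError as err:
        # The server answered, but with an error code such as 404.
        return err.code
    except OSError:
        # DNS failure, refused connection, timeout, and so on.
        return None


def closest_snapshot(payload):
    """Extract the archived URL from an availability response, if there is one."""
    closest = payload.get("archived_snapshots", {}).get("closest")
    if closest and closest.get("available"):
        return closest.get("url")
    return None


def wayback_lookup(url, timeout=10):
    """Ask the Wayback Machine whether it holds a snapshot of the given URL."""
    query = urllib.parse.urlencode({"url": url})
    with urllib.request.urlopen(f"{WAYBACK_API}?{query}", timeout=timeout) as response:
        return closest_snapshot(json.load(response))


def report(urls):
    """Print the state of each link, plus an archived copy for the broken ones."""
    for url in urls:
        state = classify_status(fetch_status(url))
        line = f"{state:12} {url}"
        if state in ("dead", "unreachable"):
            snapshot = wayback_lookup(url)
            line += f" -> archived at {snapshot}" if snapshot else " -> no snapshot found"
        print(line)
```

Calling `report([...])` with a list of the links from one of your own pages prints one line per link. Anything labeled "uncertain" deserves a manual look before you write to the site owner, since some servers simply refuse HEAD requests or automated visitors.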