In the late 19th century, astronomers began photographing the stars through prisms and gratings. They recorded stellar spectra (the dispersion of starlight into its colors) to determine what stars are made of. Since then, these photographs have become useful for another purpose: they allow scientists to chart past concentrations of ozone in the Earth’s stratosphere and help reveal whether some changes in the ozone hole are natural. The hardest part of the whole process is finding the glass plates. I know this because I have spent weeks reviewing collections at observatories around the world, from Germany to Australia.
What other historical data would be useful? There are many such stories. Thousands of records made during sailing voyages over past centuries are a treasure trove for studying weather patterns today. Images of glaciers from the past and present have mesmerized the world and provided indisputable evidence of climate change. Medical records on old punch cards, abandoned in the late 1950s and decoded decades later, have helped show how cholesterol levels can predict later disease.
To imagine the future, we must first study the past, but opportunities to do so are quickly fading through misunderstanding and neglect. Whether stored on glass plates, handwritten paper, old tapes, or floppy disks, few forms of “heritage data” are readily available for research today, so the information recorded in them is effectively lost.
Scientists used to complain that they could not get enough data. Today we talk about “big data” as if it were an untamed beast. The measurements being collected now are increasingly complex, but they tell us only about the present. Measurements recorded long ago can show how climate, ecosystems, and other conditions on Earth have changed, and data taken from individuals decades ago can inform modern medical and policy guidelines. To get that data, we need to start recovering it now.
The important question is: why don’t scientists in all fields seek to preserve old records, even though they are the best way to study long-term trends? Part of the answer lies in human psychology. At a talk I gave about the need to convert nearly lost astronomical data into a long-lived, easily shareable form, one audience member said: “Modern data is great.”
He missed the point. Few people have the desire to dig through old archival records. But legacy data yield information that, although obtained using outdated technology, is not available in any other form. Hydrologists in Cape Town, South Africa, have digitized handwritten data dating back 70 years to find out how non-native tree species affect water supply in natural environments. High-resolution color images of living birds cannot replace images of the now-extinct passenger pigeon and laughing owl.
Now is the time to rescue legacy data, while the older scholars who can supply the necessary context about how it was recorded are still alive. Techniques for digitizing many types of records are cheap and easy to access.
But digitization does not preserve everything. At least one epidemiologist has traced the spread of cholera in the Iberian Peninsula by sniffing envelopes. How? For centuries, post offices used vinegar to disinfect mail delivered from infected cities, and the envelopes still carry its scent.
So what can we do? The Data Rescue Interest Group of the international Research Data Alliance provides guidelines (go.nature.com/2pgzkfs) that steer researchers through the initial stages of data recovery, identify the necessary equipment, and set out the best way to tackle the recovery process. They also identify the most important criteria for recovering data on past large-scale, human-caused changes. Many fields, such as biodiversity (http://rebind.bgbm.org), volcanology, and oceanography, have made great strides in preserving old data, but more needs to be done, quickly and with better coordination.
The truth is that not all data can be saved. Setting priorities means looking for data that could shed light on otherwise unanswerable questions. Researchers often dismiss legacy materials without considering their potential applications. Hidden treasures of data, with all the knowledge they can provide, are left to rot on shelves.
Everyone can lend a helping hand. The first challenge is finding records, photographs, or other objects and understanding their value. Most have not been used for a long time and are stored in near-abandonment, where moisture, spiders, and mice often do their best to destroy them.
A second challenge is ensuring that the necessary metadata (such as date, location, and constraints) exist, so that the data can be placed accurately in time and space once converted to modern formats.
Finding the resources needed for preservation is often difficult, and funding is scarce and disorganized, but advocates have secured grants from agencies ranging from NASA to the US Agency for International Development and the German Research Foundation. It pays to cast a wide net. University archivists can share their expertise and mobilize citizen science groups.
One important but neglected resource is success stories. When researchers see once-neglected data brought back to life in modern form, they are more likely to spot hidden opportunities of their own. The next heroic rescue story may be yours, but you must hurry. Some data is being destroyed as I write this article, some may be unrecoverable by tomorrow, and the older colleagues whose memories we need may not be with us for long.