I’ve reported on Serge’s experimental model at Princeton before, at http://blog.orenblog.org/2010/05/12/csg-spring-2010-storing-data-forever/
This session covers a funding and operational model for long-term preservation of research data, now being piloted at Princeton.
Storing data forever.
What’s “forever”? We don’t usually tell people how long we keep stuff – like in libraries. We can treat data the same way as books – “indefinitely” – best effort to keep data around for a long time, which doesn’t have to be precisely defined.
Quotes Cliff Lynch – funding agencies don’t expect data to be kept forever. But Serge is uncomfortable with that.
The reality today is that we’re talking about an indefinite period of a “few years”.
Where do we store data? On your local web site; in a disciplinary repository; at another university; or in the cloud (Amazon, Google, Duracloud).
How to pay for storing data? The institution pays; grants pay, but they don’t go on forever; or we don’t know (the most popular model). Most mechanisms require ongoing payment. That answers the “what should we store” question: we store whatever someone’s willing to pay for. Duracloud is charging $1800/year/TB, which is not a reasonable charge for long-term preservation.
At Princeton they’re trying a Pay Once, Store Endlessly approach, based on the steadily declining cost of storage (computed per unit of storage). It turns out you can store research data forever for about twice the original storage cost. At Princeton that works out to about $5 per gigabyte (including tape drives) to store forever.
That does not include added services like curation or translation, just bit storage.
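To see why a one-time fee can plausibly cover "forever", here is a minimal sketch of the arithmetic. It is not Princeton's actual pricing formula. It assumes (my assumption, not something stated in the talk) that the storage is re-bought every replacement cycle and that each purchase costs a fixed fraction of the previous one. With a fraction of one half, the total converges to twice the original cost, which matches the "about twice" figure. The comparison with the $1800/year/TB cloud price also assumes 1 TB = 1000 GB. All function and parameter names are hypothetical.

```python
def pay_once_total(initial_cost, cost_ratio, cycles=None):
    """Total cost of buying storage now and re-buying it every replacement cycle.

    initial_cost: cost of the first purchase.
    cost_ratio:   assumed cost of each re-purchase as a fraction of the
                  previous one (0.5 means the price halves every cycle).
    cycles:       number of purchases to sum; None means "forever".
    """
    if not 0 <= cost_ratio < 1:
        raise ValueError("cost_ratio must be in [0, 1) for the total to converge")
    if cycles is None:
        # Infinite geometric series: C + C*r + C*r^2 + ... = C / (1 - r)
        return initial_cost / (1 - cost_ratio)
    # Finite geometric series over the given number of purchases
    return initial_cost * (1 - cost_ratio ** cycles) / (1 - cost_ratio)


def break_even_years(one_time_cost, annual_cost):
    """Years after which an annual fee exceeds a one-time fee (same capacity)."""
    return one_time_cost / annual_cost


if __name__ == "__main__":
    # Assumed: price halves each cycle, so "forever" costs twice the first purchase.
    print(pay_once_total(1.0, 0.5))

    # $5/GB one-time versus $1800/year/TB, assuming 1 TB = 1000 GB.
    print(break_even_years(5 * 1000, 1800))
```

Under those assumptions, $5 per gigabyte comes to $5000 per terabyte, which an $1800/year/TB service would exceed in just under three years.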
Serge looked at the data management plans for all grants submitted at Princeton since the mandate for a data management plan took effect: 93 grants in total. Of those, 27 (about 29%) have no data management plan. The most popular storage choice is a web site or local disk (20%), followed by DataSpace.