[CSG Spring 2010] Storage Strategies

Storage strategy survey results. Storage management is equally distributed between central IT, distributed, both, or not sure.

What’s provided centrally? All offer individual file space. Most offer backups for distributed servers and departmental file space. Half offer desktop backups.

Funding models – just about all have some variety of pay for what you use. Most have some common goods, and about half have base plus cost for extra.

About half do full cost recovery including staff time.

Challenges – data growth is top, tiered storage is next, along with centralizing and virtualization.

Biggest explicit challenges : Data growth, perception of cost, research storage.

Storage at Iowa
Central file storage: Base entitlement, individuals 1-5 GB, depts, 1 GB per FTE. 4 hour recovery objectives. 99.97% uptime. 89% participation. Enterprise level, high availability.

One price fits all network file storage, offered some lower-cost network storage, e.g. without replication or backup, now they’ve got lowest-cost bare server storage – lots of enthusiasm for that model.


Low cost SAN for servers $0.36 – $1.68 per year, depending on service level. Recovery is hw and sw, no staff time or data center charges.

Storage Census 2010

51% of storage being used by research. 35% Admin and Overhead (including email), 11% Teaching, 3% Public Service.

72% of storage is backup vs. online.

Next steps: identify and promote research solutions; build central backup service; build, promote archival solutions.

Storage @ U VIrginia – Jim Jolkl

Hierarchical Storage Manager Services: Storage for long-term research data (centrally funded but not well marketed); Library materials (funding via Library contributions to infrastructure); RESSCU (off-campus service for departmental disaster recovery backups).

Enterprise Storage – Based on Netapp clusters. NFS, CIFS for users, ISCSI, SAN internally. Works really well, highly reliable, replicated. Mostly used for central services. For departments it’s $3.20/GB/yr to $3.50 without backups. Lots of incidental sales to people who want a gigabyte or so for additional email quota. Doesn’t work for people who want a lot of storage.

New mid-tier storage service – focus on a reasonable and affordable storage service for departments and researchers.
Requirements: reliable, low cost, low overhead, self service. Unbundled services – optional remote replication and backups. Access via NFS and CIFS. Snapshots – users deal with their own restores. Offering Linux and WIndows versions. Doing group files based on their groups infrastructure. Using RAIDKING disk arrays. Using BetterFS on Fedora, Windows server for the windows side.

Cost model – 1 hour plus $0.34/GB/yr (raid5, but not replicated). Next year expect to drop price by 50%. Currently about 22 TB leased on NFS and only marginal WIndows use to date. All of the complaints about the costs of central storage have gone away. Research groups interested in buying big chunks.

Shel Waggener – Berkeley Storage & Backup Strategy

Shel says scale matters and no matter who says they’re doing it better faster cheaper, without scale they’re not.

2003 – every department runs own storage – including seven within central IT.
2004 – data center moves creates opportunity for common architecture
2006 – dedicated storage group formed. No further central storage purchases supported except throuh storage team.
2007 – Hitachi wins bakeoff. 250 TB. Email team works with storage group to move from direct-attached to SAN
2010 – over 500 hosts using pool – 1.25 PB expanding to 3 PB this year.

SAN-based approach. Lots of serial attached SCSI disk – moving away from fiber-channel.

Cheapest storage is now 25 cents gigabyte per month. The most expensive tier (now $4.00/GB/Month) bears the cost of the expensive infrastructure that the other tiers leverage.

Failure rate on cheap disk is reliable, but recovery time is longer.

At the cost of storage, they don’t have quotas for email.

One advantage is paying for today’s storage today. Departments buy big arrays and use 5% in the first two years, which is much more expensive. But that’s what’s supported by NIH and NSF.

Backing up 338 users’ desktops (in IST) takes up 1.3 TB.


Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out /  Change )

Facebook photo

You are commenting using your Facebook account. Log Out /  Change )

Connecting to %s

%d bloggers like this: