[CSG Spring 2008] Cyberinfrastructure Workshop – Jim Pepin


Disruptive Change –

Things creating exponential change – transistors, disk capacity, new mass storage, parallel apps, storage management, optics.

Federated identity (“Ken is a disruptive change”); team science/academics; CI as a tool for all scholarship.

Lack of diversity in computing architectures – x86 or x64 has “won” – maybe IBM/Power or Sun/SPARC at the edges. Innovation is in the consumer space – game boxes, iPhones, etc.

Network futures – optical bypasses (which we’ve brought on ourselves by building crappy networks with friction). GLIF examples. Security is driving researchers away from campus networks. Will we see our networks become the “campus phone switch” of 2010?

Data futures – Massive storage (really, really big); object oriented (in some cases); preservation and provenance (how do we know the data is real?); distributed; blur between databases and file systems; metadata.

New Operating Environments – “Operating systems” in the network (grids) are not really OSs. How to build petascale single systems – scaling apps is the biggest problem. “Cargo cult” systems and apps. We haven’t trained a generation of users or apps people to use these new parallel environments.

In response to a question Jim says that grids work for very special cases, but are too heavyweight for general use. Cloud computing works in embarrassingly parallel applications. Big problems want a bunch of big resources that you can’t get.

The distinction is made between high throughput computing and high performance computing.

100s of teraflops on campus – how to tie into national petascale systems; all the problems of TeraGrid and VOs on steroids – network security friction points, identity management, non-homogeneous operating environments.

Computation – massively parallel – many cores (doubling every 2-3 years). Massive collections of nodes with high speed interconnect – heat and power density, optical on chip technology. Legacy code scales poorly.
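The point that legacy code scales poorly on many cores can be illustrated with Amdahl's law. The talk does not name it, so treat this as an editorial sketch: the function name and the 95% parallel fraction are hypothetical, chosen only for illustration.

```python
def amdahl_speedup(parallel_fraction: float, cores: int) -> float:
    """Upper bound on speedup from Amdahl's law.

    parallel_fraction: share of the runtime that can run in parallel (0..1).
    cores: number of cores the parallel part is spread across.
    """
    if not 0.0 <= parallel_fraction <= 1.0:
        raise ValueError("parallel_fraction must be between 0 and 1")
    if cores < 1:
        raise ValueError("cores must be at least 1")
    serial_fraction = 1.0 - parallel_fraction
    return 1.0 / (serial_fraction + parallel_fraction / cores)


if __name__ == "__main__":
    # Hypothetical legacy code in which 95% of the work parallelizes.
    for cores in (2, 8, 64, 1024):
        print(cores, "cores ->", round(amdahl_speedup(0.95, cores), 2), "x")
```

With a 5% serial portion the speedup can never reach 20x no matter how many cores are added, which is one way to read "doubling cores every 2-3 years" against code that was never written for parallel hardware.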

Vis/remote access – SHDTV-like quality (4K) enables true telemedicine and robotic surgery; massive storage ties into this.

Versus – old code, written on 360s or VAXes, vector optimized; static IT models – defending the castle of the campus. Researchers don’t play well with others. Condo model evolving. Will we have to get used to the two-port internet? Thinking this is just for science and engineering – but there are social science apps (e.g. education outcomes at Clemson – large data, statistics on a huge scale) or the Shoah Foundation at USC – many terabytes of video.

Vision/sales pitch – access to various kinds of resources – parallel high performance, flexible node configurations, large storage of various flavors, viz, leading edge networks.

Storage farms – diverse data models: large streams (easy to do); large number of small files (hard to do); integrate mandates (security, preservation), blur between institution data, and personal/research; storage spans external, campus, departmental, local. The speed of light matters.
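Why the speed of light matters for storage that spans campus and external sites can be shown with a little arithmetic. This is a rough sketch with assumptions that are not from the talk: light in fiber travels at about two thirds of its vacuum speed, the 4000 km distance is a hypothetical cross-country path, and the 64 KiB window is the classic TCP limit without window scaling.

```python
SPEED_OF_LIGHT_KM_S = 299_792.458  # speed of light in vacuum, km/s


def min_rtt_ms(distance_km: float, velocity_factor: float = 2 / 3) -> float:
    """Lower bound on round-trip time over fiber, in milliseconds.

    velocity_factor is an assumed ratio of signal speed in fiber to c.
    Real paths add routing, queuing, and equipment delay on top of this.
    """
    if distance_km < 0 or not 0 < velocity_factor <= 1:
        raise ValueError("invalid distance or velocity factor")
    one_way_s = distance_km / (SPEED_OF_LIGHT_KM_S * velocity_factor)
    return 2 * one_way_s * 1000


def window_limited_mbps(window_bytes: int, rtt_ms: float) -> float:
    """Throughput ceiling of one stream that sends one window per round trip."""
    if rtt_ms <= 0:
        raise ValueError("rtt_ms must be positive")
    return window_bytes * 8 / (rtt_ms / 1000) / 1e6


if __name__ == "__main__":
    rtt = min_rtt_ms(4000)  # hypothetical 4000 km path
    print(f"minimum RTT: {rtt:.1f} ms")
    print(f"64 KiB window ceiling: {window_limited_mbps(65535, rtt):.1f} Mbit/s")
```

Under these assumptions the round trip is roughly 40 ms and a single untuned stream tops out on the order of 13 Mbit/s, regardless of how fast the link is. That is also why large numbers of small files, each paying at least one round trip, are harder than large streams.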
