[CSG Winter 2010] Collaborative Platforms workshop

UC San Diego is hosting CSG for the first time.

Ken Klingenstein (Internet2)
Agenda for the morning:

Use Cases: LIGO, iPlant, and Bamboo
Basic generic build issues/COmanage
Basic outsource issues/Google offerings
The social networking angle
Connect/Sharepoint experiences
Panel

Federated Identity – reducing the authentication barriers to collaboration, note that sometimes the IdP is not the enterprise but the VO or other org.

Multiple levels of assurance, depending on use.

A new term: Domestication of Applications – about refactoring applications to use the emergent identity services infrastructure. Make an external call.
Begins with federated identity and authentication, but gains a lot from group management for access control, etc.
Lots of different flavors.

Examples:
COmanage – the Dutch have done amazing things
Commercial offerings – Sharepoint, Adobe Connect, Google Sites, Wave, Google Apps
Repurposed LMS – Sakai, Croquet

Dutch National Collaboration Infrastructure – serving a million users
Domesticated tools – Adobe Connect; Alfresco; Foodle; FileSender; Confluence; Drupal; Google Apps; MyExperiment.org; etc.
Done both grid integration and workflow.

Key issues:
Extent of application domestication
Appliance, service, cloud offering
Waiting for other technologies to happen – interfederation, discovery, metadata tagging, etc.
We’re early in understanding the UI
Domain applications/ science portal – can I use my groups for getting to grid application?

Collaborations and Virtual Organizations
– Move from a tool-based identity world to a collaboration-centric space

Roles, schema, and attributes

Big Science Collaborations
LIGO (ligo.org)
The single largest VO funded by NSF – does gravitational wave physics.
Complex internal access issues – lots of internal competitive proposals requiring complex access control
A small number of very large files

iPlant – http://www.iplantcollaborative.org
NSF leading with cyberinfrastructure for the plant biology community
Broad outreach, education and training components – rich external access issues.
A very large number of small files

Needs of Big Science Researchers
Access to collaboration tools
No modifications to existing domain science apps – in some cases jobs run for years.
Command line tools – an interesting challenge
International capabilities
Multiple levels of assurance
Roles, attributes, metadata, and ontologies

Chad Kainz (U Chicago) – Humanities Research – A View from the Bamboo
Google is a big humanities project 🙂
Two facets of collaboration – substantive and methodological. Methodological is a common goal. Substantive is uncommon – “I’ve got my thing and I want to unleash it on the world”. Strong desire to shift towards methodological.

Scholarly Networking is not the same as Social Networking

Social networking is focused on individual trying to connect with known group (friends, colleagues, family)
Scholarly Networking – Individual is seeking different connections that cross disciplines or engage other individuals with similar interests elsewhere.

What is emerging are invisible colleges of like-minded individuals who work at different institutions.

Pub problem – Need to be in the right pub at the right time to make the right connection. “The most expensive dating service we have on campus is the VP of Research”

Five things came out of Bamboo workshops:
– Enable the discovery of scholars and their work at scholar-scholar level. Requires contextual metadata about projects, content, services, etc. Manual entry of metadata will fail (duh).
– Enable the creation of scholar profiles from data sources at institutions and create mechanisms to mine the data across institutions. Creates tension among institutions, scholars, and scholarly societies (academic “stars”).
– Organize – enable groups to organize outside of normal boundaries.
– Engage – enable a variety of scholars and institutions to engage in the network even if they don’t have organized data.
– Market – create a participatory market to promote greater SME interaction.

Ken – In LHC crowd, “discovery” is finding 2,000 processors they can use. In humanities it’s “is there a piece of software out there anywhere that can do…?” Distinction of provenance – in big science it’s store data so experiments can be repeated in the future. In the humanities it’s about giving proper credit – building arguments on arguments, so need to be able to work your way back through the subjective views. If an assertion is made by a grad student, who’s the advisor of that grad student?

A Middleware Unified Field Theory – Mike Gettes (MIT)
This is about Internet2’s COmanage.
We want inter-enterprise workgroup collaborations (or CO – Collaborative Organizations)
Identity, groups, federation, and applications.
Give control to community members – stop making people come to central IT.
Integrate with existing higher ed infrastructure.
Shib is federating technology. Group management. LDAP-PC publishes to LDAP, apps talk to LDAP.

Foodle – A federated Doodle.

Google Wave – Chris Hubing (Penn State)
Use cases: collaborative authoring of student documents – a teacher can play back a wave and watch the evolution of a document; wavelets for discussion generation – hard to do in Google Docs
Federation setup is “like email”: DNS SRV records; uses X.509 certs between wave servers; the wave federation protocol is an extension of XMPP.
Wave Fed prototype server – plugs into XEP-0114 Extension.
Wave providers right now are FedOne and Ruby on Sails (written by a high school student!)

Federation only works with wavesandbox.com – main google wave site doesn’t do federation yet
only trusts StartSSL certs – a little questionable on business practices
no web ui for fedone prototype server
if you use google apps for edu you need to disable chat service, because of name service collisions.


[ECAR Summer 2008] Gwen Jacobs – The (Neuro) Science of Learning


Gwen’s talk is subtitled What do we know about how the brain learns that can inform and improve pedagogy?

What do we know about the brain that might be useful?

Experience and learning change the physical structure of the brain, which organizes and reorganizes brain function. Different parts of the brain may be ready to learn at different times. Learning continues throughout life.

Four examples

Language learning. Two parts of learning language, which begins right when you’re born: perceptual part (hearing and perceiving) and production (practicing to make songs and words). Human language acquisition is mimicked in bird song – every step of the way. The babbling phase is practicing. Birds and humans learn the songs that they hear. As native language skills improve, perception of other languages decreases, so it becomes harder to acquire other languages later in life. Both learn better with a live tutor – social interaction is important to learning. In birds learning stops at sexual maturity – language ability decreases after age 14 in humans.

Why is it so hard to learn new languages as you get older? As you learn and focus on your native language you gradually lose the ability to perceive other languages. Example of experiment with Japanese speakers who can’t perceive difference between “ra” and “la”. Language area in the brain of bilingual speakers is enlarged.

What’s going on in teenager’s brains?

Different parts of the brain mature at different times (Toga’s work at UCLA). The wiring of the brain changes just prior to the onset of puberty. Sensory motor parts mature first, language and spatial reasoning during ages 6-12, frontal lobes (reasoning, decision making) mature last – not till age 20. If you look at the brain regions responsive to emotions like fear or anger, activity gradually shifts from those to the regions for reasoning.

Sex hormones can change brain structure and function. In songbirds males are the ones who learn to sing. If you give females testosterone they can learn to sing – they develop “male” brain structures. There are studies suggesting that there are gender differences in human brains.

Experience continues to modify the brain – learning takes place throughout our lives.

A study that looks at brain imaging in London taxi drivers. Asks them to remember a route – taxi drivers have a very large hippocampus, which is involved in short-term memory and navigation. Being a musician throughout life actually improves your cognitive abilities in many other areas. People who play checkers, Scrabble, Sudoku, etc – there’s a lot of evidence that improves ability to solve other problems. Deaf individuals learn language through signing – turns out that same brain regions are used for language, no matter what the sensory modality.

Our students – how can we engage them? Given their current experience, multitasking all the time, how can we engage them in the classroom?

Active learning = paying attention

Science of learning – National Science Foundation. Goals: advance frontiers of all the sciences of learning through integrated research. Started in 2004. Six centers funded so far, with very different explorations underway.

Temporal dynamics of learning – a distinct region of our brain is responsible for remembering faces. There’s a difference between categorizing an object vs. categorizing the object as something you can recognize and name. Musicians activate different brain regions when they look at music notation – musicians are better at multitasking than non-musicians. Individuals with autism or Asperger’s have a hard time recognizing and making facial expressions – turns out they don’t use that part of their brain. The UCSD center has developed a game that helps them recognize expressions.

LIFE center – learning how a toaster works from a video. People learn much better from a point of view camera vs. a side camera. Social homework game – kids train an AI agent to answer questions. Students learn and retain more from training these agents.

CELEST center – neuro-morphic engineering. DIVA – A model for speech. Looking at how we learn to make motor movements to create speech. Created a model for reproducing speech from listening.

There’s more, but I’ve got to leave to catch a shuttle to the airport…

[ECAR Summer 2008] Rudolf Dimper – Cyberinfrastructure at the European Synchrotron Radiation Facility and Its Impact on Science


Rudolf is head of computing services at ESRF.

Observations today done with sophisticated instruments – including synchrotrons. A synchrotron is a super microscope to examine condensed matter. Operates from the UV to the hard x-ray spectrum. Synchrotron light is used because it has remarkable properties: brilliance (1,000 billion times brighter than a hospital x-ray tube).

The European light source is a 6 GeV source in Grenoble. Concentrates 10,000 researchers and engineers. Cooperation between 18 countries. Annual budget of ~80 m€.

Some applications – studying the structure of spider silk; medical applications, including angiography that gives better results than conventional hospital techniques; geophysics, studying samples which undergo extreme changes; chemistry, how do catalytic processes function; semiconductors; paleontology, high res microtomography of fossils.

52 Beamlines, 6222 user visits in 2007. 15,308 eight hour shifts scheduled for experiments in 2007. > 1500 peer reviewed publications/year.

Over last 10 years the data volume has increased by a factor ~300. In 2007: 300TB, ~1*10E8 files.
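A quick back-of-envelope from the figures above (a sketch using the approximate numbers quoted: 300 TB across ~10^8 files):

```python
# Rough average file size implied by the ESRF 2007 figures above
# (300 TB of data in roughly 1e8 files; decimal units assumed).
total_bytes = 300 * 10**12   # 300 TB
file_count = 1e8             # ~10^8 files

avg_file_mb = total_bytes / file_count / 10**6
print(f"average file size: ~{avg_file_mb:.0f} MB")
```

About 3 MB per file on average – a very large number of fairly small files, which is the harder storage case.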

Storage policy is only to keep 6 months of data, even for internal users. This is under heavy discussion.

French network infrastructure has not increased bandwidth in three years. That’s a problem.

Network based on Extreme Networks switches, storage on NAS systems, StorageTek tape for offline. Commodity clusters in the data center, ~400 CPUs (totally insufficient).

ESRF Upgrade Programme – 290 m€ programme.

Single most productive facility producing protein structures for the Protein Databank.

New methods: nanobeams & raster scans. Looking to increase resolution by two orders of magnitude. Petabytes of data – how to carry away the data. Easy to imagine a PB per day in 10 years, contrasted to 15 PB per year for the LHC.
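To put the projection above in perspective (a sketch; a sustained 1 PB/day compared against the ~15 PB/year figure quoted for the LHC):

```python
# Compare the projected ESRF rate (~1 PB/day) with the LHC's ~15 PB/year.
esrf_pb_per_year = 1 * 365   # 1 PB/day sustained for a year
lhc_pb_per_year = 15

ratio = esrf_pb_per_year / lhc_pb_per_year
print(f"~{ratio:.0f}x the LHC's annual volume")
```

Roughly 24 times the LHC's annual data volume, from a single facility.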

Two fundamental problems:

Latency – diminish the time to measure, store, and analyze data.

Add functionality – new ways to measure, store, and analyze data. “Have to get our hands dirty with grid tools”

Looking desperately for 100 Gbps network.

I/O bottlenecks in research clusters is a big issue.

ESFRI position paper on digital repositories – lots of storage and access policies.

[ECAR Summer 2008] Heidi Hammel – The future of exploration in astronomy


Heidi Hammel led the Hubble Telescope team that looked at the Shoemaker-Levy asteroid impact and works at the Space Science Institute in Boulder (though she lives in Ridgefield CT).

Telescopes – devices for gathering light. Refractive telescopes use lenses. Reflective telescopes use mirrors. Viewing through the atmosphere blurs your image. Adaptive optics helps – e.g. the hexagonal segments of the Keck telescope mirror which can adapt at 90 Hz. So why put a telescope in space? Clouds, but even clear atmosphere distorts light. Even worse, it absorbs light.

More to light than meets the eye. Earth’s atmosphere absorbs UV and infrared and some radio. Adaptive optics not suited for all visible wavelengths.

James Webb space telescope – 6.5 m “mirror” (adaptive). 3 cameras, 1 spectrometer. Less than half the cost of Hubble (~$4.5B full life cycle). Launch date 2013. About a million miles out, at the L2 point – no way to service it. Collaborators from all over the country and all over the world. Need to coordinate. Needed a versatile platform for distributed configuration and data management – NGIN (Next Generation Integrated Network). Does all project management functions including risk management, action-item tracking, shared files for collaboration, etc. Being expanded and developed. Has kept the project on schedule and on budget for the past three years.

Webb origins science – four themes – first light and reionization; assembly of galaxies; birth of stars & protoplanetary systems; origins of the universe and life itself.

Large Synoptic Survey Telescope – to image entire accessible sky to a deep level with a wide field of view in a fast operational mode. To be built in Chile. Looking for time-variable phenomena in a huge wide swath of sky. Camera is a 3.5 degree sensor. 3200 megapixel camera. Designed to work fast. Six colors, 20k sq degrees at 0.2 arcsec/pixel with each field revisited 2,000 times. ~ 2 terabytes per hour, > 10 billion objects. “The Monster Truck of telescopes”. 6 GB of raw data every 15 seconds.
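As a rough cross-check of the two rates quoted above (a sketch based on the 6 GB per 15-second figure):

```python
# Raw data rate implied by the LSST figure above: 6 GB every 15 seconds.
gb_per_exposure = 6
seconds_per_exposure = 15

gb_per_hour = gb_per_exposure * 3600 / seconds_per_exposure
print(f"~{gb_per_hour / 1000:.1f} TB/hour raw")
```

That works out to ~1.4 TB/hour of raw data; the ~2 TB/hour figure presumably also counts processed products.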

Fundamental BIG question – what is the fate of the universe? Contrary to previous models (open, flat, closed), recent observations of supernovae indicate that the universe is accelerating. Some mysterious force is counteracting gravity – call it dark energy. It permeates all of space. As of the last couple of years, it’s been found that ~73% of the universe is dark energy (about 23% is dark matter). LSST is to investigate dark energy by taking precision measurements of four dark energy signatures in a single data set.

Another big question – what is our destiny? The military is observing meteor impacts from satellites that monitor large explosions. LSST will inventory the Near Earth Objects population.

Argo – Voyage Through The Outer Solar System

Use Neptune to get to the Kuiper Belt. Why go to Neptune again? Voyager flew by Neptune in 1989 (having launched in 1977). Old technology, and besides – everything we can detect in the Neptune system has changed in the last 20 years – cloud distribution, stratospheric temperature, its ring system, Triton’s atmosphere, etc. We can’t see the details to explain this. Kuiper belt – Pluto and 10,000 of his closest friends. Argo’s access is ~4000 times bigger than that of New Horizons (current mission to Pluto). If launched in 2020, it will get to Neptune in 2033, the Kuiper belt in 2041. The Argo team has never met in one place at one time. The entire mission is being remotely planned and executed. This mission is not unique – it’s emblematic of a new mode of operation.

Space Science Institute – a 501(c)(3) formed in 1992 in Boulder to enable world-class research in space and earth science. Heidi is the director of the research group. Also there’s a flight ops group that runs the Cassini spacecraft, and there’s a large education and public outreach group that builds museum exhibits. 30-50% of research staff distributed nationwide. Off-site from the inception of SSI, for over 15 years. She quit MIT when they told her she had to sit in her office 5 days a week in Cambridge, when she lived in Connecticut. The off-site option offers significantly reduced grant overheads when compared to universities. Growth management is a challenge – lots of people want to work this way. Many young scientists are leaving (or not going to) academia for these kinds of alternatives.

[ECAR Summer 2008] Kevin Trenberth – NCAR – Global Warming Affects us all: What must be done?


The IPCC report stating that warming of the climate is unequivocal and very likely caused by human activities was a remarkable demonstration of the strength of the evidence – passed by 130 nations.

Increasing CO2 – has a lifetime of 100 years before it gets taken out of the system. US continues to increase – 20% increase since 1990. China now represents a big and growing percentage of CO2 emissions. If we make gains in western world, will they be overwhelmed by the emerging world? There is pressure to look at emissions per capita rather than by nation. Western Europe is 2.5 x better than US – is that due to higher gas prices? Highlights the fact that population is a big part of the equation, but nobody is talking about that.

There are also differences between states – California vs. Texas, for example.

Evidence – sea level is rising – 48 mm since 1992 (as measured by satellites). Might be best measure. Glaciers are melting, even as snowfall rises. Snow season is getting shorter – meltoff in Pacific Northwest is 7-10 days earlier now. Risk of drought increases substantially, along with wildfire danger.
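The satellite figure above implies an average rate (a sketch; assumes the 48 mm span runs from 1992 to this 2008 talk):

```python
# Average sea-level rise rate implied by the figure above:
# 48 mm since 1992, as of the 2008 talk (end date is an assumption).
rise_mm = 48
years = 2008 - 1992

rate_mm_per_year = rise_mm / years
print(f"~{rate_mm_per_year:.0f} mm/year")
```

Roughly 3 mm/year on average over the satellite record.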

Everything that’s going on in climate has a natural variability component and a global warming component.

Modeling the climate system is complex. Need computers that are 10,000 times faster than those we have now to accurately model. Shows a slide that models global temperatures that accounts for the real observations by adding human effects to what would have occurred naturally.

Precipitation patterns change – the wet places get wetter and more intense, the dry places get dryer.

What do we do?

Mitigation, adaptation, or do nothing.

Doing nothing == adaptation without planning.

What you do relates to value systems. That’s where politicians get involved.

The UN Framework Convention on Climate Change (ratified in 1994, including by US). Kyoto Protocol is a legal instrument under that convention. US withdrew in 2001. In 2004 US emissions were 16% over 1990 levels for greenhouse gasses.

What about a carbon tax? If there was a value to CO2 presumably the production of CO2 as waste would be reduced. Cap and trade is a variation – favored by Congress at present (at least partly because it doesn’t have the term “tax” in it). Tracking sources of violators becomes a whole new industry. If countries don’t subscribe it can favor those who pollute.

Coal fired power plants have been brought online at a rate of 2 per week over the past 5 years. China leads with one every 3 days or so.

A freeze on emissions means that concentrations of CO2 continue to increase. We have to adapt to climate change.

Assess vulnerability; devise coping strategies; determine impacts of possible changes : we need information!

We need to observe and track climate changes as they occur; analyze global products with models; understand the changes and their origins; validate and improve models; initialize models and predict future developments; and assess impacts so as to provide advice.

Weather prediction – a problem of predicting the evolution of the atmosphere for minutes to days to perhaps 2 weeks ahead. Begins with observations of the initial state; the atmosphere is a chaotic fluid, and small uncertainties or model errors grow rapidly in time and make longer-term prediction impossible.

Climate prediction – the problem of predicting the patterns or character of weather and the evolution of the entire climate system. Often regarded as a “boundary value” problem. This means determining systematic departures from normal from the influences of the climate system and external forcings. The oceans and ice evolve slowly, providing some predictability on multi-year time scales. Because there are many possible weather situations, it is inherently probabilistic.

As the time scale is extended, the influence of anomalous boundary forcings grows to become noteworthy. The largest signal is El Niño, which involves knowing the state of the ocean. All climate prediction involves initial conditions of the climate system, leading to a seamless (in time) prediction problem. A challenge we’re not capable of meeting at present.

There have been no revolutionary changes in weather and climate model design since the 1970s. The models are somewhat better. Meanwhile, computing power is up by a factor of a million. That’s gone to increasing resolution of models and longer runs.

[ECAR Summer 2008] Bob Franza


Seattle Science Foundation – to nurture networks of experts. They have an 18k sq. ft. facility in Seattle, but if they build more bricks in the future they will have failed. Working in virtual environments (Second Life) – not games.

Using the virtual environment to solve real problems of distributed teams, not just as a neat technology.

CareCyte – why should workflow and health care be foreign concepts to each other? If you were going to redesign a health service facility, what would you build? Rethought the design – ultra-fast design, manufacture, assembly, near-laminar airflows, all internal walls are furniture that can be reconfigured easily, etc. Rendered the facility in about 2.5 days in second life, and can show people the ideas and design in an engaging way that can’t be accomplished with drawing and static images. Have to not use the technology in ways that replicate existing activities (e.g. giving lectures).

“Recruiters will look at somebody with a World of Warcraft score of 70 or above as CEO material.”

Doesn’t like the term “virtual” – has negative semantics around it, including accountability. Prefers the term “immersive”.

Bob talks about the ability to become things you aren’t in the environment, whether that’s a molecule to better understand how physics work or seeing what it’s like to be in a wheelchair or changing gender.

He’s asked what the implications of immersive environments are for university enterprises.

They’re looking at the undergraduate health sciences curriculum – can’t find anatomy profs anymore. Can’t supply cadavers for education – why do you need them? The curriculum is the same worldwide – you’ve got buildings on campuses with students coming in and getting bored. What does it cost to heat, cool, and illuminate those buildings? We have brought no imagination to these challenges. We have to look at the cost of operations. What is the cost of distributing rolls of toilet paper into thousands of classroom buildings?

All of the retired faculty could be participating in these immersive environments to bring education to many more people.

Macro nodes – very large data centers sitting next to hydroelectric generating stations. Biological scientists haven’t figured this out to the extent that astro and physicists have with things like the Hubble – collaborate to create a resource.

Have to stop thinking about physical space as the basis of anything except for those things that absolutely require it.

We have no technology excuses – the fundamental issue is will. Oil prices will drive that will.

We have to stop asking students to do pattern recognition. The way we’ve been evaluating students doesn’t have anything to do with the challenges they will face. But if they have to get along with a group of others to actually accomplish something, that will translate.

Bob invites people to contact him and work with them in Second Life.

[CSG Spring 2008] Cyberinfrastructure Workshop – Jim Pepin


Disruptive Change –

Things creating exponential change – transistors, disk capacity, new mass storage, parallel apps, storage management, optics.

Federated identity (“Ken is a disruptive change”) team science/academics; CI as a tool for all scholarship.

Lack of diversity in computing architectures – X86 or X64 has “won” – maybe IBM/Power or Sun/SPARC at the edges. Innovation is in consumer space – game boxes, iPhones, etc.

Network futures – optical bypasses (which we’ve brought on ourselves by building crappy networks with friction). GLIF examples. Security is driving researchers away from campus networks. Will we see our networks become the “campus phone switch” of 2010?

Data futures – massive storage (really, really big); object oriented (in some cases); preservation and provenance (how do we know the data is real?); distributed; blur between databases and file systems. Metadata.

New Operating Environments – Operating systems in network (grids) not really OSs. How to build petascale single systems – scaling apps is the biggest problem. “Cargo cult” systems and apps. Haven’t trained a generation of users or apps people to use these new parallel environments.

In response to a question Jim says that grids work for very special cases, but are too heavyweight for general use. Cloud computing works in embarrassingly parallel applications. Big problems want a bunch of big resources that you can’t get.

The distinction is made between high throughput computing and high performance computing.

100s of teraflops on campus – how to tie into national petascale systems, all the problems of teragrid and VOs on steroids – network security friction points, identity management, non-homogenous operating environments.

Computation – massively parallel – many cores (doubling every 2-3 years). Massive collections of nodes with high speed interconnect – heat and power density, optical on chip technology. Legacy code scales poorly.

Vis/remote access – SHDTV like quality (4k) enables true telemedicine and robotic surgery, massive storage ties to this,

Versus – old code, written on 360s or VAXes, vector optimized; static IT models – defending the castle of the campus; researchers don’t play well with others; condo model evolving; will we have to get used to the two-port internet? Thinking this is just for science and engineering – social science apps (e.g. education outcomes at Clemson – large data, statistics on a huge scale) or the Shoah Foundation at USC – many terabytes of video.

Vision/sales pitch – access to various kinds of resources – parallel high performance, flexible node configurations, large storage of various flavors, viz, leading edge networks.

Storage farms – diverse data models: large streams (easy to do); large number of small files (hard to do); integrate mandates (security, preservation), blur between institution data, and personal/research; storage spans external, campus, departmental, local. The speed of light matters.

[CSG Spring 2008] Cyberinfrastructure Workshop – Virtual Organizations


Ken Klingenstein –

An increasing artifact of the landscape of scientific research, largely from the cost nature of new instruments.

Always inter-institutional, frequently international – presents interesting security and privacy issues.

Having a “mission” in teaching and a need for administration. All of these proposals end with “in the final year of our proposal three thousand students will be able to do this simulation”. Three thousand students did hit the Teragrid a few months back for a challenge – 50% of the jobs never returned.

Tend to cluster around unique global scale facilities and instruments.

Heavily reflected in agency solicitations and peer review processes.

Being seen now in arts and humanities.

VO Characteristics – distributed across space and time; dynamic management structures; collaboratively enabled; computationally enhanced.

Building effective VOs. Workshop run by NSF in January 2008. A few very insightful talks, and many not-so-insightful talks. http://www.ci.uchicago.edu/events/VirtOrg2008/

Fell into the rathole of competing collab tools.

Virtual Org Drivers (VOSS) – solicitation just closed. Studying the sociology – org life cycles, production and innovation, etc.

NSF Datanet – to develop new methods, management structures, and technologies. “Those of us who are familiar with boiling the ocean recognize an opportunity.”

COmanage environment – externalizes identity management, privileges, and groups. Being developed by Internet2 with Stanford as lead institution. Apps being targeted: Confluence (done), Sympa, Asterisk, DimDim, Bedework, Subversion.

Two specimen VOs

LIGO-GEO-VIRGO (www.ligo.org)

Ocean Observing Initiative ( http://www.joiscience.org/ocean_observing )

The new order – stick sensors wherever you can and then correlate the hell out of them.

Lessons Learned – people collaborate externally but compete internally; time zones are hell; big turf issue of the local VO sysadmin – LIGO has 9 different wiki technologies spread out over 15 or more sites (collaboration hell). Diversity driven by autonomous sysadmins. Many instruments are black boxes – give you a shell script as your access control. Physical access control matters with these instruments. There are big science egos involved.

Jim Leous – Penn State – A VO Case Study.

Research as a process: lit search/forming the team; writing the proposal; funding; data collection; data processing; publish; archive.

Science & Engineering Indicators 2008

Publications with authors from multiple institutions grew from 41% to 65%. Coauthorship with foreign authors increased by 9% between 1995 and 2005.

How do we support this? Different collaborative tools. Lit Search – refworks, zotero, del.icio.us; Research info systems – Kuali Research; home grown; Proposals – wikis, google docs; etc. Lots of logins. COmanage moves the identity and access management out of individual tools and into the collaboration itself.

Need to manage attributes locally – not pollute the central directory with attributes for a specific collaboration effort.

What about institutions that don’t participate? LIGO – 600 scientists from 45 institutions.

LIGO challenges – data rates of 0.5 PB/yr across three detectors (> 1 TB/day); many institutions provide shared infrastructure (e.g. clusters, wikis, instrument control/calibration); international collaboration with other organizations; a typical researcher has dozens of accounts.
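The two figures quoted are consistent with each other, as a quick back-of-the-envelope check shows:

```python
# Sanity-check the quoted LIGO data rate: 0.5 PB/yr, using decimal (SI) units.
PB_PER_YEAR = 0.5
TB_PER_PB = 1000
DAYS_PER_YEAR = 365

tb_per_day = PB_PER_YEAR * TB_PER_PB / DAYS_PER_YEAR
print(round(tb_per_day, 2))  # about 1.37 TB/day, so "> 1 TB/day" holds
```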

Penn State Gravity Team implemented LIGO roster based on LDAP and Kerberos – Penn State “just went out and did it” – drove soul searching from LIGO folks – “why shouldn’t we do this?”. Led to LIGO Hackathon in January, which was very productive. Implemented Shibboleth, several SPs, Confluence, Grouper, etc.
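The point of a roster like this is that once it is the source of truth, each service (Confluence, wikis, clusters) can make access decisions from asserted attributes instead of maintaining local accounts. A minimal sketch of that pattern, assuming eduPerson-style attributes (eduPersonPrincipalName, isMemberOf) and a made-up group name – not LIGO's actual configuration:

```python
# Hypothetical: decide access to a collaboration wiki from federated
# attributes asserted by the IdP. Attribute values and the group name
# are illustrative only.
def can_edit_wiki(attributes: dict) -> bool:
    """Grant access if the user is in the (made-up) LIGO:wiki-editors group."""
    groups = attributes.get("isMemberOf", [])
    return "LIGO:wiki-editors" in groups

# A roster entry as a Shibboleth SP might see it after login:
user = {
    "eduPersonPrincipalName": "alice@example.edu",
    "isMemberOf": ["LIGO:members", "LIGO:wiki-editors"],
}
print(can_edit_wiki(user))  # True
```

The researcher with dozens of accounts then needs only one: group membership, managed once in the roster, travels with the login.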

Next steps are to leverage evolving LIGO IAM infrastructure; establish a permanent instance of LIGO COmanage; encourage remaining institutions to join InCommon; and (eventually) detect a gravity wave?

Bernie Gulachek – Positioning University of Minnesota’s Research Cyberinfrastructure – forming a Virtual Org at Minnesota – the Research Cyberinfrastructure Alliance.

A group of folks who have provided research technology support – Academic Health Center; College of Liberal Arts; Minnesota Supercomputing Institute; Library; etc.

Not (right now) a conversation about technology, but about organization, alliances, and partnerships. Folks not necessarily accountable to each other, but are willing to come together and change the way they think about things to achieve the greater common good.

Both the health center and the college of liberal arts came to IT to ask how to build sustainable support for research technology.

Assessing Readiness – will this be something successful, or a one-off partnership? What precepts need to be in place for partnership? The goal is to position the institution for computationally intensive research. They have a (short) set of principles for the Alliance.

Research support has been silo’ed – need to have a connection with a specific campus organization, and the researcher needs to bridge those individual organizations. The vision is to bring the silos together. Get research infrastructure providers talking together. Researcher consultations – hired a consultant.

Common Service Portfolio – Consulting Services; Application Support Services; Infrastructure Services – across the silos. Might be offered differently in different disciplines. Consulting Services are the front door to the researcher.

Group is meeting weekly, discussing projects and interests.

[Bamboo Workshop 1a] Day 2

The day starts with George Breslauer, Provost of UC Berkeley, talking to us. Three questions – 1. Impact of new technology – technology can make research more efficient, but how do we do this as smartly as possible? By the time you implement a new system in the university, you only have 12-18 months before the next cutting edge. Does technology have the capacity to transform the humanistic disciplines? 2. Where will shared technologies work best, and where will individual campuses need to invest? 3. How does Bamboo create a collaborative cultural model to sustain this effort? Making collaboration work depends on non-self-evident cultural factors.

We were broken up into tables of eight people for the morning to discuss scholarly practices. I was at a table with fascinating folks – Ted Warburton from UC Santa Cruz, a dancer who uses 3D motion capture to create new art; Niek Veldhuis from Berkeley, who researches ancient Sumerian from cuneiform clay tablets; Katherine Harris from San Jose State, whose research area is 19th century literary annuals; Sharon Goetz, a medievalist who manages digital publications at Berkeley’s Mark Twain Project; Tom Laughner, Director of Educational Technology Services at Smith College; Angela Thalis from UC Santa Cruz; and Michael Ashley, an archaeologist who is the program manager for Berkeley’s Media Vault.

The conversation was wide-ranging and captivating, covering how people do their research, how they connect to others in their field, through to publication and professional development. I thought the organizers posed two really good questions to get things flowing: On a really good day, what activities do you do; and in a really good term, what things do you accomplish?

In the afternoon we combined two tables to try to cluster and categorize the practices we identified in the morning. I found that less compelling, perhaps because we lost some of the fascinating details, perhaps because it was harder to have an involving conversation with sixteen people; or perhaps because I just got tired.

It will be interesting to see where this conversation evolves, both through the rest of this meeting and in the following meetings in Chicago, Princeton, and Paris.

[ECAR 2007 Winter] Robert Kraut – Conversation and Commitment in Online Communities

Robert Kraut is the Herbert A. Simon Professor of Human-Computer Interaction in the Business School at Carnegie Mellon.

It’s interesting to study online communities because the interactions are exposed and documented.

Defining success:

– Success is multidimensional: transactional (did your question get answered? were resources exchanged?); individual (was commitment developed?); and group (did it successfully recruit and retain members, and persist over time?).

Developing Commitment:

Commitment develops over time, with the early phase especially fragile, and it’s a bi-directional process. There’s a cost-benefit analysis.

It’s rational for groups to be skeptical of newcomers. Newcomers take resources from existing members. The group is more likely to be welcoming if they perceive newcomers as “deserving”. Thesis: individuals may use self-revealing introductions to signal both legitimacy and investment.

He’s looking at research questions about whether groups ignore newcomers and whether conversational strategies encourage group members to pay attention to newcomers. Looked at 99 Usenet groups, around 40k messages. They’ve seen that groups respond less to newcomers across the board, particularly in political and hobby groups. They then used machine learning to analyze messages to find self-introductory messages. Attempt to predict whether a given message will get a reply. Found that newcomers with a self-introduction are treated as well as old-timers without one. They found that messages with a group-oriented introduction (“I’ve been lurking here for a while…”) almost doubled the chance that a message would get a reply.
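The two-step pipeline described (detect group-oriented self-introductions, then predict whether a message gets a reply) could be sketched roughly as below. The phrase list and weights are invented for illustration; the study itself used machine learning over 99 Usenet groups:

```python
# Illustrative sketch only: detect group-oriented self-introductions,
# then score a message's chance of getting a reply with a toy logistic
# model. Phrases and weights are made up, not from Kraut's study.
import math

GROUP_INTRO_PHRASES = ["lurking here", "long-time reader", "new to this group"]

def has_group_intro(message: str) -> bool:
    text = message.lower()
    return any(phrase in text for phrase in GROUP_INTRO_PHRASES)

def reply_probability(message: str, is_newcomer: bool) -> float:
    # Newcomers start at a disadvantage; a group-oriented introduction
    # roughly offsets it, mirroring the "treated as well as old-timers" finding.
    score = 0.2
    if is_newcomer:
        score -= 0.8
    if has_group_intro(message):
        score += 0.9
    return 1 / (1 + math.exp(-score))  # logistic squash to a probability

intro_msg = "I've been lurking here for a while, and wanted to ask..."
plain_msg = "Quick question about the FAQ..."
print(reply_probability(intro_msg, is_newcomer=True) >
      reply_probability(plain_msg, is_newcomer=True))  # True
```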

I wonder how this connects with public profiles like those in Facebook?

When will individuals “join”?

Individuals evaluate potential benefits from the group. The reactions they get from initial attempts to engage the group will be especially meaningful. Hypothesis is that people will be more likely to continue to participate if people respond to them, and if the reply comes from people with higher status in the group and if they are positive in attitude. Found that only 20% of newcomers who don’t get a reply to their initial message are seen again, while 40% of those that do get a reply are active subsequently. The idea of the “welcoming committee”, like Wikipedia has, is very useful in developing commitment. The more central the replier is to the group, the more powerful it is for developing commitment of the newcomer. The tone of the welcoming language also has an effect.

I asked whether they’ve done any work in looking at the formation of new online communities and what factors might lead to success. It’s hard to research the formation of new communities because it’s hard to catch communities at the moment of formation. The problem with starting new groups is that there’s no content, so no reason for people to go there – a chicken and egg problem. One thing that helps is to find niche markets where people have a very high need for information sharing and will accept relatively low returns as worthwhile. Working with an existing organization might help.

Facebook is a fantastically successful community. Some of its success can be attributed to it having started with a small handful of communities (universities) that provide a pre-existing connection (students at the same institution), and then built on the early success, with the latest example being opening up the API to allow other people to build new services for the community.

In his courses they’ve been using Drupal because it offers lots more flexibility than course management systems. Even then they’ve had a hard time getting students to participate – so they’re learning how to issue challenges, use reputation-building systems, and other techniques to encourage participation. In on-campus communities, the hostility towards newcomers is less of a problem because people already consider themselves part of an existing collective.