CNI Fall 2013 – Creating A Data Interchange Standard For Researchers, Research, And Research Resources: VIVO-ISF

Dean B. Krafft, Brian Lowe, Cornell University

What is VIVO?

  • Software: an open-source semantic-web-based researcher and research discovery tool
  • Data: Institution-wide, publicly-visible information about research and researchers
  • Standards: A standard ontology (VIVO data) that interconnects researchers

VIVO normalizes complex inputs, connecting scientists and scholars with and through their research and scholarship.

Why is VIVO important?

  • The only standard way to exchange information about research and researchers across diverse institutions
  • Provides authoritative data from institutional databases of record as Linked Open Data
  • Supports search, analysis, and visualization of data
  • Extensible

An HTTP request can return HTML or RDF data.
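One common way a single URI serves both forms is HTTP content negotiation, where the client states the format it wants in the Accept header. The sketch below assumes that mechanism; the talk notes do not say how VIVO chooses the representation. The profile URI is hypothetical, and the request is only built, never sent.

```python
from urllib.request import Request

# Hypothetical profile URI, used only for illustration.
PROFILE_URI = "http://vivo.example.edu/individual/n1234"


def build_request(uri: str, want_rdf: bool) -> Request:
    """Build (but do not send) a GET request for one representation of a URI."""
    # Assumption: the server honors the Accept header (content negotiation).
    accept = "application/rdf+xml" if want_rdf else "text/html"
    return Request(uri, headers={"Accept": accept})


html_request = build_request(PROFILE_URI, want_rdf=False)
rdf_request = build_request(PROFILE_URI, want_rdf=True)

# Same URI, two different requested representations.
print(html_request.full_url, html_request.get_header("Accept"))
print(rdf_request.full_url, rdf_request.get_header("Accept"))
```

Passing either request to `urllib.request.urlopen` would perform the actual fetch against a real server.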

Value for institutions and consortia

  • Common data substrate
  • Distributed curation beyond what is officially tracked
  • Data that is visible gets fixed

The US Dept. of Agriculture is implementing VIVO for 45,000 intramural researchers, to link to Land Grant universities and international agricultural research institutions.

VIVO exploration and analytics

  • Structured data can be navigated, analyzed, and visualized within or across institutions
  • VIVO can visualize strengths of networks
  • Create dashboards to understand impact

Providing the context for research data

  • Context is critical to find, understand, and reuse research data
  • Contexts include: narrative publications, research grant data, etc.
  • VIVO dataset registries: Australian National Data Registry, Datastar tool at Cornell

Currently hiring a full-time VIVO project director.

VIVO and the Integrated Semantic Framework

What is the ISF?

  • A semantic infrastructure to represent people based on all the products of their research and activities
  • A partnership between VIVO, eagle-i, and ShareCenter
  • A Clinical and Translational Information Exchange Project (CTSAConnect): 18 months (Feb 2012-Aug 2013), funded by NIH

People and Resources – VIVO is interested primarily in people; eagle-i is interested in genes, anatomy, manufacturer. The two overlap in techniques, training, publications, protocols.

The ISF ontology is about making relationships: connecting researchers, resources, and clinical activities. It is not about classification and applying terms, but about linking things together.

Going beyond static CVs – distributed data, research and scholarship in context, context aids in disambiguation, contributor roles, outputs and outcomes beyond publications.

Linked Data Vocabularies: FOAF (Friend of a Friend) for people, organizations, groups; vCard (contact info, new version); BIBO (publications); SKOS (terminologies, controlled vocabularies, etc.).

Open Biomedical Ontologies (OBO family): OBI (Ontology for Biomedical Investigations); ERO (eagle-i Research Resource Ontology); RO (Relation Ontology); IAO (Information Artifact Ontology – goes beyond bibliographic).

Basic Formal Ontology from OBO – Process, Role, Occurrent, Continuant, Spatial Region, Site.

Reified Relationships – Person-Position-Org, Person-Authorship-Article. The plain RDF subject/predicate/object model breaks down for some things, such as modeling different position relationships over time. So use a three-part pattern in which the relationship is treated as an entity of its own with its own metadata. This allows aggregation over time, e.g. a Position can be held over a particular time interval, and it allows building a distributed CV over time. Name changes can be aggregated the same way, by attaching time properties to multiple vCards.
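A minimal sketch of the Person-Position-Org pattern, using plain Python tuples in place of a real RDF store. The identifiers and predicate names (`hasPosition`, `inOrganization`, `startYear`, and so on) are invented for illustration and are not the actual VIVO-ISF property names.

```python
# Triples as (subject, predicate, object) tuples; all names are illustrative.
# A direct triple such as (jane, worksAt, cornell) has nowhere to hang a title
# or dates, so each position is its own node with its own metadata.
TRIPLES = [
    ("person:jane", "hasPosition", "position:1"),
    ("position:1", "inOrganization", "org:cornell"),
    ("position:1", "title", "Postdoctoral Associate"),
    ("position:1", "startYear", 2005),
    ("position:1", "endYear", 2008),
    ("person:jane", "hasPosition", "position:2"),
    ("position:2", "inOrganization", "org:cornell"),
    ("position:2", "title", "Research Scientist"),
    ("position:2", "startYear", 2009),  # no endYear: still held
]


def objects(subject, predicate):
    """Return every object of triples matching (subject, predicate, ?)."""
    return [o for s, p, o in TRIPLES if s == subject and p == predicate]


def first(subject, predicate):
    """Return the first matching object, or None when there is none."""
    values = objects(subject, predicate)
    return values[0] if values else None


def titles_held(person, year):
    """List the titles of the positions a person held during a given year."""
    titles = []
    for position in objects(person, "hasPosition"):
        start = first(position, "startYear")
        end = first(position, "endYear")  # None means the position is ongoing
        if start is not None and start <= year and (end is None or year <= end):
            titles.append(first(position, "title"))
    return sorted(titles)


print(titles_held("person:jane", 2006))
print(titles_held("person:jane", 2012))
```

Because the dates live on the position node rather than on the person, the same person can hold any number of positions at the same organization without the statements colliding.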

Beyond publication bylines – What are people doing? Roles are important in VIVO ISF. Person-Role-Project. Roles and outputs: Person-Role-Project-document, resource, etc.
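The Person-Role-Project-output chain can be sketched the same way. In this hypothetical example, all people, roles, projects, and outputs are made up; the point is only that following roles through a project surfaces contributors who would never appear in a publication byline.

```python
# (person, role, project): invented sample data
ROLES = [
    ("Jane", "principal investigator", "Soil Study"),
    ("Ravi", "data curator", "Soil Study"),
    ("Mei", "instrument technician", "Soil Study"),
]

# (project, kind of output, output title): invented sample data
OUTPUTS = [
    ("Soil Study", "document", "Soil Carbon Report"),
    ("Soil Study", "dataset", "Soil Samples 2012"),
]


def contributors(output_title):
    """Follow output -> project -> role -> person and return (person, role) pairs."""
    projects = {project for project, _, title in OUTPUTS if title == output_title}
    return sorted(
        (person, role) for person, role, project in ROLES if project in projects
    )


for person, role in contributors("Soil Samples 2012"):
    print(f"{person}: {role}")
```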

Application examples: search (beta.vivosearch.org) can pull in data from distributed software (e.g. Harvard Profiles) using VIVO ontologies.

Use cases: Find publications supported by grants; discover and reuse expensive equipment and resources; demonstrate importance of facilities services to research results; discover people with access to resources or expertise in techniques.

Humanities and Artistic Works – performances of a work, translations, collections and exhibits. Steven McCauley and Theodore Lawless at Brown.

Collaborative development – DuraSpace VIVO-ISF Working Group. Biweekly calls Wed 2 pm ET. https://wiki.duraspace.org/display/VIVO/VIVO-ISF+Ontology+Working+Group

Linked Data for Libraries

On December 5, 2013, Mellon made a 2-year grant to Cornell, Harvard, and Stanford, starting Jan 2014, to develop a Scholarly Resource Semantic Information Store (SRSIS) model. The model is meant to capture the intellectual value that librarians and other domain experts add to information resources, together with the social value evident from patterns of research.

Outcomes: Open source extensible SRSIS ontology compatible with VIVO, BIBFRAME and other ontologies for libraries.

Sloan has funded Cornell to integrate ORCID more closely with VIVO. At Cornell they’re turning MARC records into RDF triples indexed with Solr – beta.blacklight.cornell.edu

 
