[CSG Fall 2005] Cliff Lynch on random musings


Cliff Lynch notes that we talked a tremendous amount about repositories today, and that the approach was largely technical. If you talk to administrators, faculty, or librarians, you’ll get very different views on what repositories are for, even though when you peel away the social and political layers you get something that looks very similar.

CNI has been concentrating for the past year or two on what are referred to as Institutional Repositories – a service with a significant institutional commitment behind it that documents the academic or cultural life of the institution. That covers a place where you put digital materials created by faculty, or material documenting performances that happen in a university community. Typically it’s not a place where you’re doing frontline teaching and learning – it’s not a course management system. Course management systems generally have many more specialized services than repositories. There are lots of questions about how rich the functionality in institutional repositories should be.

If you think about putting documents in, getting them out, and perhaps migrating document formats over time, that’s a good set of functions. Now let’s think about video. You can treat it as a large file – put it in, pull it out – and what you do with it afterward is not the repository’s business. Or you can treat video as something the repository actively serves, which includes streaming, handling different bit rates, and so on. These are the types of scoping questions we see around institutional repositories.
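To make the narrow end of that scoping question concrete, here is a minimal sketch of a repository that only does the three functions named above: put something in, get it out, and migrate its format. Everything in it is a hypothetical illustration, not the interface of any real repository software: the `MinimalRepository` class, its method names, and the use of a content hash as an identifier are all assumptions made for the example.

```python
import hashlib
from dataclasses import dataclass, field


@dataclass
class StoredItem:
    content: bytes
    media_type: str
    # Media types this item had before each migration, oldest first.
    history: list = field(default_factory=list)


class MinimalRepository:
    """Hypothetical narrow-scope repository: deposit, retrieve, migrate."""

    def __init__(self):
        self._items = {}

    def deposit(self, content: bytes, media_type: str) -> str:
        # Assumption: identify an item by a hash of its original bytes.
        item_id = hashlib.sha256(content).hexdigest()[:16]
        self._items[item_id] = StoredItem(content, media_type)
        return item_id

    def retrieve(self, item_id: str) -> bytes:
        return self._items[item_id].content

    def media_type(self, item_id: str) -> str:
        return self._items[item_id].media_type

    def migrate(self, item_id: str, new_media_type: str, convert) -> None:
        # `convert` is a caller-supplied function from old bytes to new bytes.
        item = self._items[item_id]
        item.history.append(item.media_type)
        item.content = convert(item.content)
        item.media_type = new_media_type


repo = MinimalRepository()
doc_id = repo.deposit(b"lecture notes", "text/plain")
# Stand-in for a real format conversion: upper-case the text.
repo.migrate(doc_id, "text/x-upper", bytes.upper)
```

In this sketch a video would just be another `deposit` of a large byte string. Streaming or serving multiple bit rates would mean adding services on top of this interface, which is exactly the scoping decision being discussed.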

There are lots of repositories on campus besides institutional repositories and course management systems. All kinds of research groups are setting up repositories, often on the same software used for the institutional repository. What’s different is the scale of implementation and the extent of institutional commitment.

There are two major streams of argument that have been used to support deployment of institutional repositories. One talks about the move to production of scholarship in digital form – scholarship that is more than page images and encompasses datasets, software, simulations, etc. that don’t fit into the tradition of scholarly journals or monographs. In order to keep scholarship healthy, institutions need to take responsibility for archiving and maintaining these materials.

The other argument centers on a set of issues that go under the rubric of “open access” – a policy position that says the reporting of scholarship should be free and openly accessible, that the Internet makes this possible at low cost, and that it breaks down barriers to scientific progress and bridges equity gaps between nations and communities. One other argument that has some political traction is that a tremendous amount of research is paid for by the government and that citizens have the right to access it. This is the open access thesis. One of the strategies is for scholars to deposit copies of works into public repositories, either institutional or discipline-based. This approach is getting traction in both the US and Europe.

You have these two justifications, but we don’t really know much about what is in institutional repositories or how many of them are deployed. CNI did a project on repositories in 13 countries and then pulled together a meeting in Amsterdam to understand similarities and variations in implementations. Two articles on this appeared in D-Lib Magazine last week.

A couple of significant highlights – there are a couple of nations in Europe that have an institutional repository deployed in every higher education institution in the country. There are other nations where deployment rates are very low. In the US they looked at the CNI membership, which is primarily research institutions. They found that around 40% had some sort of repository deployed, and around 80% of the rest had some planning underway.

In almost all institutions the intellectual leadership for this activity has come from the Libraries.

If you look at the European data, they are doing this mostly for open access, and the material in their repositories is mostly textual. In the US the picture is quite different – there’s lots of material that isn’t textual: everything from architectural models to video, datasets, and software. Institutional repositories in the US may be picking up a need for places to store data that is filled by national data centers in other countries.

While we thought we had a reasonable working definition of an institutional repository, the thing that came through very clearly is how chaotic the campus environment is. The relationship between repositories and course management systems is confused, and there are lots of departmental repositories whose owners don’t talk to each other or to the central repository. There is also lots of confusion over what’s a digital library and what’s an institutional repository. It would be useful to try to get some working definitions, at least at a campus level.

There is considerable interest at the policy level in the US in starting to get a handle on the datasets that are produced as a result of research activity. The NIH put a requirement on all grants over $0.5 million to have a data plan. The grant holders naturally want to hand over long-term responsibility for this to the institution. The National Science Board issued a set of policy recommendations around long-lived data standards. The report is worth looking at because it is the beginning of setting policy principles that will affect grant awards at institutions and will drive us to deal with data stewardship. The Office of Science and Technology Policy has also picked up on this report.

Earlier this week CNI started an informal call for experience from institutional representatives to get additional insight into what’s going on.

