CNI 2012 Fall meeting – Opening Plenary

I came in a bit late to this plenary, so forgive me for the incomplete notes.

Cliff Lynch, CNI

MOOCs – Machine grading will be important. Masses of data coming out of MOOCs – who controls it and who gets to do what with it? Nobody seems to have asked the students. What’s the pact around learning data? Google decided it would be fun to build a MOOC platform and ran a couple of instances of a course on how to search using Google. They did it in order to drive use of their product – an example of applying this technology outside of academia, which will become increasingly common. An affordable way of doing consumer education.

It’s clear that globally promiscuous admission to MOOCs doesn’t map well to the way institutions license content. Will drive greater use of open access materials. Will also drive broader community licensing of materials. Finally we’re seeing some serious work on alumni access to licensed materials – JSTOR is one of the players working on this. There will be pressure to move towards more rational personal licensing schemes, but these have high transactional costs.

E-textbooks – Seeing some attempts to do licensing at scale. May have some economic payoff. Completely remap the relationships between faculty, presses, bookstores, etc. One of the messages to take away from MOOCs and e-texts is we need a more deliberate strategy around licensing of instructional materials. A lot of this work has taken place in the IT community, who’ve been able to make some progress. Libraries have stayed away from licensing e-texts at the same time as they’ve developed sophisticated understandings of licensing of other materials that could be brought to bear.

About a month ago there was a short piece in the Chronicle about a new textbook platform. The distinctive feature was that it would report to the teacher whether you’d done the readings. Quoted a few faculty who thought this was wonderful. Do you find this at all creepy? Could ask the same question about MOOCs or LMS systems. Do the students even know? This is an issue waiting to hit the front pages. Need some conversations about privacy and informed consent around these platforms.

Debates going on about how vocational higher education should be, should we still do humanities, are there really decent jobs for STEM graduates, etc. These things are under debate with a new intensity that deserves some serious consideration. The role of employers in training as opposed to universities in teaching – revisit the comments on MOOCs in teaching outside of academia.

Science is under a great deal of pressure. There is a crisis about reproducibility of results bubbling up – some attempts to reproduce results aren’t going very well. If we don’t get this under control it will affect the public support and funding of science.

The rise of PLOS ONE – publishing a measurable part of the scholarly literature. Vetting for correctness rather than ranking. Offers a level of predictability not found in many major journals.

Public libraries are being largely cut out of ebooks, particularly mass market ebooks. Very good example of where licensing can take us, and the extraordinary power that licensing rather than first sale gives publishers.

We’ve had some encouraging court judgements supporting the principle of fair use. At the same time there are some very troubling things in the first sale area suggesting that we may see an increasing limitation to first sale. Broad populace is starting to wake up to this – is someone going to be able to inherit your ebooks?

One of the touchstones of CNI’s work has been understanding the changes in scholarly practice. That’s worth continual revisiting. Can identify a number of new developments over the past few years. One would be easy to misclassify as “big data” – we’ve moved into a world where there is an abundance of evidence, whether that’s a historian visiting records or an archaeologist seeking to understand urn making. We have lots of examples – we want to look at outliers, make sense of millions of email messages, etc. We’re seeing automatic search and clustering tools, analysis of social networks, etc. These are different from (and predate) big data tools.

Some new scholarly environments – Math Overflow – mathematicians and upper level grad students can post questions and get answers. Has an elaborate system of ratings and rankings – similar to Stack Overflow. Very sizeable scholarly practice communities are using these tools in their work. Wolfram Alpha – a new class of information system that has some capabilities for encoding computational knowledge. We need to be very open to recognizing these kinds of new systems showing up in scholarly communities. Scholarly practice does not stay still – change continues to ripple.

Data curation and research data management. CNI has been very active for a decade now, trying to look at what was coming. We’ve seen NSF and NIH requirements, other funding agencies are moving along this path. While we’ve changed the regulations we’re flying mostly blind. We know very little (collectively) about what’s being proposed, what effect it’s having on funding decisions, and whether people do what they say they’re going to do. There’s a tremendous need to collect data so we understand what’s working or not.

Some specific problematic areas – Individually identifiable data: reusing this is very hard. We need to think about research continuity and risk management. See Hurricane Sandy.

