CSG Fall 2012 – Driving Out Technical Debt

Sharif Nijim from Notre Dame is leading a discussion on driving out technical debt.

Technical debt is the concept that when you make technical compromises you build up debt that you then have to pay off later. At Stanford it’s being used to talk about not keeping bench-depth of staff in various areas, or deploying systems without the companion expertise in the tool. Taking on debt allows you to trade quick delivery now for paying it off over the long term.

Kitty notes that there is technical debt in all of the legacy services we keep hanging on to as we get thinner and thinner on the ground. Michael notes that it’s not just technology but also in skill sets, expertise and other areas. Laxmi brings up the concept of varying levels of risk in debt – not always the same.

Ilee – With ERP systems we create backlogs of requests from the user community that we can’t get to. With old technologies the cost per unit change is higher – that’s why you want to retire technical debt by moving to new technologies. You also want to incorporate new features that take care of some of that backlog. At USC they had systems that were 25 years old – it was getting increasingly hard to find people who could maintain the software, and it took a long time to get things done. They decided to go to Kuali for finance and research administration and to Workday for HR, and they are still working with the Kuali group on the Student system. The timeline started in 2009, and will end in 2016/17. They think that the cost per unit change will go down and that they will gain improved business processes. They’re automating processes that used to be manual, allowing more debt to be retired. They also created a data warehouse with Cognos that allows the University to do analytics.

One way of looking at that is replacing older, riskier debt with newer, higher-quality debt. It will also allow the retirement of debt in some distributed areas as shadow systems are eliminated.

I asked how one measures this technical debt – Teri-Lynn says that Gartner must have a method because they have done estimates of the size of technical debt overall, but she hadn’t been able to find any documented methodology.

Mark notes that there’s debt you take on intentionally or debt that you back into, and looking at it as balancing the risk in your portfolio is a way of understanding it. With financial debt we know what’s on our books, but we don’t know with technical debt.

Michael makes the point that not all debt is bad – Kitty likens this to “I’ll gladly pay you Tuesday for a hamburger today.” Tim says that as we try to get to yes in satisfying requests we are taking on more debt. Bruce says that as we make tactical decisions we have to be conscious of keeping things narrow enough to not let them creep into more debt. Kitty talks about the maturation of service management and service owners not always realizing the amount of technical debt that particular technologies are accumulating. Bruce notes that this is because people are attached to the technology, not the capability.

Steve notes that sometimes retiring technical debt can come at the cost of incurring political debt.

It would be good if we had periodic reviews of debt. We could categorize the debt the same way we do with risk management. Tim asks if we could catalog the debt perhaps as an addendum to our service catalog. Kitty replies that many of us do risk management, and incorporating this as a concept of risk management could help.
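
As a sketch of what such a catalog entry might look like, here is a small Python example that scores debt items the way a simple risk register does (likelihood times impact). The field names, the 1-5 scales, and the sample services are all hypothetical; nothing like this was presented in the session.

```python
from dataclasses import dataclass


@dataclass
class DebtItem:
    """One technical-debt entry attached to a service (hypothetical fields)."""
    service: str
    description: str
    likelihood: int  # 1 (unlikely to cause trouble) to 5 (almost certain)
    impact: int      # 1 (minor) to 5 (severe)

    @property
    def risk_score(self) -> int:
        # Common risk-register convention: likelihood multiplied by impact.
        return self.likelihood * self.impact


def review_order(items):
    """Return items sorted so the riskiest debt is reviewed first."""
    return sorted(items, key=lambda item: item.risk_score, reverse=True)


# Made-up sample entries for illustration only.
catalog = [
    DebtItem("Campus wiki", "Runs an unsupported version", 2, 2),
    DebtItem("Legacy finance system", "Few staff can maintain it", 4, 5),
    DebtItem("Departmental file server", "No documented backup owner", 3, 3),
]

for item in review_order(catalog):
    print(f"{item.risk_score:>2}  {item.service}: {item.description}")
```

A periodic review could then start from the top of that list, in the same way a risk register is walked through.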

Tim talks about Harvard’s approach – they’ve identified a person to evangelize the concept of technical debt, provide that level of awareness to the business. He’ll follow the approach of building risk statements into the services. It’s in a formative state.

In the chat room the question is raised as to whether we’re building “cloud debt” as we move services to the cloud.

CSG Fall 2012 – Balancing Central and Distributed Services

Bernie Gulachek from Minnesota is leading a discussion on Central and Distributed Services. This is not a new topic, but the context has changed. We’ve seen the delivery of technology services change over the years. In the late ’80s and early ’90s distributed service units in Libraries, Administration, Academic Computing, were amalgamated into central IT units. Then the conversation shifted to the current landscape of distributed technology units and a central unit. The model along the service continuum was often new technology emerging in the distributed units and then later being centralized for economies of scale. The cloud shifts this dynamic, where both central and distributed units can shift or bring new services into being in the cloud.

We’d like to believe that each unit that manages its own technology services is focused on its mission so as to create complementary and not duplicative services – sometimes that’s the case, sometimes it isn’t. What are the elements that facilitate this model? One comment is that what works is transparency – letting the deans and administrators know what is being offered centrally and going through the services each school is offering to see where there is duplication. The service catalog was very important in making this happen. Making that visible allows the conversation about efficiency and making sure that the quality of central services is acceptable to the schools.

Cornell has a structure where the distributed technology leaders also report in to an associate CIO in the central office – they are learning how to build the trust and efficiency in the group. They are building a brand of IT@Cornell that encompasses the entire concept, and that’s starting to work. The services organization is trying to lower cost and maximize efficiencies in order to provide the best service possible to demonstrate the utility value of central services so they don’t have to be duplicated locally.

Kitty notes that we have to be conscious that some services can only be delivered by the person who sits right next to the user – need to know the people, who’s got grant deadlines, etc. Also it’s a challenge for us to make core services easy enough to use.

Bernie notes that often the cloud services are superior to what we can offer, but the factors preventing us from moving in that direction are some of the same factors that prevent the distributed units from moving services to the center.

Elazar notes that trust is a key factor – they’re rolling out a new desktop support environment that will cover the whole institution, and it’s the same with consolidating data centers. Ron Kraemer notes that little things count – like referring to services as the “OIT” data center instead of the “Notre Dame” data center.

Tom says that the ability to present central services as something that distributed units can just use, much as they do the cloud, is important.

Sometimes the actual consolidation of services, even when everyone agrees it makes sense, can be perceived as threatening people’s jobs, which makes it hard to make progress.

Tracy notes that including people from the units within the central organization as much as possible can help build the relationships. Also you have to build a story, and stick to it, that gives people hope and a sense of purpose – where is the evolution of their position?

The concept of the say/do ratio is important – ideally would be 1:1.

Developing soft skills in the organization is important.

Bill notes that they started something called the Stanford Technical Leaders Program, where they brought in MOR to help build skills with 13 technical people from the central unit and 13 distributed people from around the campus. Last year they put on an un-conference, and registration fills within minutes after they open the web site. Once it gets to management it’s a failure – want to build the soft skills and the relationships.

It’s important to be honest that jobs are shifting and new skills will be needed – it won’t always be possible to retrain people, and in some cases groups will shrink.

At Brown they looked and found that they’re 49% central and 51% distributed, and in many cases the distributed people are being paid better than in central IT.

Tom notes that governance has helped, but that input from distributed units hasn’t always come through those administrative processes. Being able to prioritize and schedule work realistically is important.

Bill talks about “getting beyond polite.” He was told that his (Bill’s) presence in the room was too loud, and without him in the room the discussion gets more down to earth.

I noted that often people ask for the help of the central unit in solving problems but we don’t have the capacity to deliver help in a timely manner. Bernie then asks what happens when we build services that have been requested by the distributed technology services but the units then opt out and complain about cost increases? Chuck has found that an effective technique is to let the unit lead the project and be responsible for it end-to-end, including the announcement. Ilee says it’s important to make sure that the distributed units are involved in the definition of the services and to have a way to communicate with the deans.

Where we’re still in hot water is where we’ve over-promised and underestimated the complexity of replacing local services with central services, which burns our goodwill chips. We don’t want to stifle innovation in the units.

There are often pressures on the CIO to optimize cost in IT, but deans and other leaders can be hesitant to have conversations about the steps necessary to achieve those savings.

It might be possible to give schools score cards about where they are in comparison to each other and central units – has to be done independently (e.g. by the finance unit). Can help deans make decisions on how to allocate resources.

Having visibility into all the IT requests can help people understand what is happening and alert people to potential duplications of effort.

At one institution they don’t use the word “distributed” but use “federated”.

One person notes that if you have distributed people also report in to the central unit, you let the units off the hook a bit – can be a double-edged sword.

CSG Fall 2012 – Future of the IT Organization pt. 2

Now we’re having a series of point-counterpoint arguments about some issues around the future of IT organizations in higher ed.

The first is about the pace of change. Bernie Gulachek from Minnesota is arguing the point that change in higher education comes slowly and will continue to do so. Applications are up and enrollment is being managed, and until that is threatened our institutions have little impetus for change. Our institutions are not architected for change and we have a hard time affecting outcomes. We’ve had lots of opportunities for change in the last 10-15 years, but we’ve made conscious decisions to not bring about change. We’re now talking about commodity solutions in the cloud because we weren’t able to come together and create our own solutions collaboratively over the last ten years. Change will come incrementally.

Ted Dodds from Cornell takes the position that we’re in denial – people believe we can duck and cover and the old ways will serve us into the future. The notion that somehow we can protect and insulate our community from the change doesn’t make sense. What he’s seeing for the first time ever is that academic leadership is provoked and engaged by MOOCs. These will undeniably change and affect the business models of our institutions. If we as IT leaders aren’t out in front of and participating in this change we won’t survive.

Charlie Leonhart from Georgetown and Bill Clebsch from Stanford are arguing about efficiency in IT organizations. Charlie starts by stating that the era of big IT is over. When he walks around campus he hears complaints about “you’re too big, you’re too fat, you’re too slow, you have too many people, you’re in my way – you guys suck.” We have basic facts – decimated budgets, a need to cut spending. We require bold leadership. Five point plan – cut spending, do more with less, cut staff positions, let’s virtualize. Get rid of desktop devices. Encourage innovation that drives down costs. Build strategic partnerships. Reduce central services and let users vote with dollars.

Bill starts by saying there are 3 tsunamis: research computation and big data (not building wet labs anymore, but building to support cyberscience); online learning; and cloud and mobility. All this takes more money. The dollar savings aren’t there – but there are advantages in scaling, speed, and response to the user. Mobility is a cost, not a cost saver. This is becoming a larger part of our lives – we can’t stop spending more. This is a part of the critical mission of the University – can’t stop spending now.

Elazar thinks we can do both – achieve efficiencies in our organization and use the savings to invest.

Elazar Harel from UCSF and Ron Kraemer are arguing about Bring Your Own Device. Elazar starts by saying that he believes that everybody should be able to bring whatever devices they want and they should work well in our environments. It also means BYOA (apps), BYOC (connectors), BYOP (printers), etc. When you think about this, there are some solutions – we need to be in the cloud with our infrastructure and applications.

Ron asks what business are we in? We’re in the business of delivering an optimal educational experience. We can compromise that technology by trying to support everything. What he plans to do at Notre Dame is lower the cost of textbooks by delivering them on standard devices (iPads), and they’ll train faculty on those devices to develop and deliver content. iPads will dominate the education market, and they can use it to improve the experience.

There’s a general discussion of the emerging roles of vendor relationship managers, service and product managers and technical integrators. The crowd isn’t clear yet as to how much or how quickly the old roles will go away.

CSG Fall 2012 – Future of the IT Organization

Ron Kraemer from Notre Dame and Steve Fleagle from Iowa are leading a workshop on the future of the IT organization – what’s changing in IT and how will it affect us?

Steve starts by talking about drivers of change in higher ed. We live at the intersection between higher ed and IT – there are things that happen on each side that will drive our organizations to change.

IT drivers of change include cloud computing, the consumerization of technology, personal cloud (more than just storage, but the glue that links devices, information, people, and services), identity management, migration from the PC as the most common method of accessing IT, widespread availability and value of large data sets, high demand for skilled IT staff, business-driven IT, the success of disruptive technology (encouraging innovation), and a rate of change that will continue to increase. Access to click-stream and other data logs and how and whether to share them will also be interesting. Social network tools are increasingly important.

Higher ed challenges include the growing scrutiny and criticism of higher ed, financial pressures, growing enrollments, increased competition for research funding, likely market disruptions, and other challenges including a culture averse to change. Startups don’t have the same constraints, and are starting to receive funding from venture capitalists and some states. Students are flocking to these alternatives and some faculty are starting their own companies in this space. Some universities are responding (e.g. Coursera and edX), but alternatives are gaining traction – StraighterLine, Western Governors University, Excelsior College, Udacity. What happens when universities start accepting transfer credits from StraighterLine or MOOCs, or employers start taking those credits in place of degrees? There are emerging corporate universities that are training employees. Greg notes that higher education isn’t just one sector, it’s multiple sectors – community colleges are different from universities.

Ron Kraemer notes that states are decreasing their funding for higher education, but want their influence to increase. The difference in private institutions is that funds come from the benefactors that support us – they want influence too, but they bring resources along with it.

What Ron thinks about: Am I burning up my staff? Are we as efficient as we can be? Can we maintain quality as we are spread more thinly? Are we taking enough time for professional development? Does the current organization structure meet our needs? – for how long? How can we create a more comprehensive view outside IT? Why do most outside IT think this is easy?

Several folks note that our needed skill sets are changing and we need more skills in procurement, contracts, and business analysis.

One comment is about how computer science departments aren’t training people for our industry – where are the new people going to come from?

Mairead Martin is leading off a series of presentations on organizations. She asks to what extent do our organizational structures allow us to respond to change? Traditionally stability and job security brought and kept people in our organizations – that’s no longer the case. Process management has become big – project management, ITSM, etc. It still takes us a long time to recruit and hire people. Investing in staff well-being and organizational resilience is important – how do we help our organizations through tough times? Diversity is an issue too. Kitty notes that our younger staff aren’t necessarily seeking security – they’re far more willing to move on if they’re not getting what they’re looking for. There’s some talk about career ladders and Ron notes that perhaps we shouldn’t have traditional ladders but flatter and flatter structures where the technology contributors can have influence on decision making processes. Shel notes that there’s more job stagnation in higher-ed IT than any other IT organization, which leads to people defending old technologies that are no longer the right ones to champion. Ten years with one skill set is no longer viable in IT.

Shel Waggener from Internet2 is now talking about cloud services and the impact they are having. At the size of Google, Amazon, or Microsoft, scale changes how we approach our roles. “This is a tsunami… the potential upside is so big it’s hard for me to imagine any large research university that wouldn’t want to be involved” – Richard DeMillo. Education is scaling similarly.

Our constituents don’t care about our organization. What it’s like today to get something from IT: the campus identifies a need, contacts IT, makes a case for why it’s needed, begins the requirements process, realizes the project is bigger and more expensive, cuts scope, scrapes together more resources, finally sees an early prototype, starts testing, fixes bugs, and deploys. It takes months or years to do that. What they really want from IT is the magic iPhone app! Google an idea to find an app, install it on a device, start using the new app – it takes minutes. You can’t download an enterprise app, but what is an enterprise app now? An aggregation of many of those small apps.

We were too slow to adapt individually – we don’t have the time anymore. We have to find ways to work together to influence vendors. Our current organizations aren’t adequate to make this change. We have to find a way to involve our staff in the business changes that are happening in our institutions. In the past you could stop procurements of new waves of technologies – now those who try are going to get steamrolled by the wave.

Major investment is flowing into cloud services for commercial offerings that can be adapted to higher ed use. 70% of higher ed IT spend is on people. The only leverage you get from Moore’s law is on the 30% left.
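
A rough back-of-the-envelope sketch of that last point, in Python. The 70/30 split is from Shel's talk; the assumption that the non-people share halves in cost is purely illustrative, and the function name is made up for this example.

```python
def total_spend_after_savings(people_share, non_people_cost_factor):
    """Return total spend, as a fraction of today's budget, when only the
    non-people share of the budget gets cheaper.

    people_share: fraction of spend on people (0.70 in the talk).
    non_people_cost_factor: hypothetical multiplier on everything else,
        e.g. 0.5 if the same capability costs half as much.
    """
    non_people_share = 1.0 - people_share
    return people_share + non_people_share * non_people_cost_factor


# Illustrative assumption: the 30% non-people spend halves in cost.
remaining = total_spend_after_savings(0.70, 0.5)
print(f"Budget still needed: {remaining:.0%}")
print(f"Overall saving: {1 - remaining:.0%}")
```

Under that assumption the overall budget falls by only about 15%, which is why the people side of the budget is where the larger questions sit.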

Shel outlines the Net+ process of aggregating demand for cloud services.

Sharif from Notre Dame is talking about the Cloud Compromise. He uses Box as an example – it doesn’t handle groups well. What kind of compromises are we willing to take to use cloud services? Another example – Notre Dame turned on Google Sites for everyone – started getting errors when people started uploading content into Sites. Found out from Google that there was an undocumented limit to uploads for an enterprise into Sites. Google raised the limit for Notre Dame, but is that good enough? A comment is that we should get used to things not being perfect. Options – Wait and watch and see if the problem recurs or intentionally fill up a site until it hits the limit to make the point or pay Google for better service. Bruce notes that different providers respond to different pressure points – perhaps we can find the way to collectively stuff the ballot box with specific vendors. There’s an observation that we can influence smaller vendors but not the larger ones like Google.

CSG Fall 2012 – Global hiring, staffing, and procurement

Steve Huth from Carnegie Mellon is talking about this topic.

Want to think about the nature of staff you’re sending over. They need to be adaptable. Think “Start Up” rather than a mature organization. Are your staff prepared for that? In one of the locations the guy building out the network and servers also helped put in the automatic gate openers. Some of the best staff had never travelled internationally – don’t want to miss out on a good prospect because they don’t meet your mental model.

Is your organization ready to do this? We deal with a lot of process and constraints built up over a lot of time – in these situations what you might need is a bag of cash in the souk – will drive procurement folks crazy.

Need people who are capable of interacting with the ambassador or the government – a lot of non-technical skills.

Dealing in a global environment means being offset by a fair amount of time. Even the work week might differ. Maintaining communications takes a lot of flexibility. Technology can help, but nothing takes the place of people going and having the remote people come back. Censorship can be a big deal in some of these countries. If we’re dealing with academic materials, they come in ok, but when you’re dealing with staff their materials may not. Share the good and bad parts.

People need more help than they would on a business trip to navigate in these new societies. Find local people who can help with things like getting health care set up, or if you’re in a car accident. Need to understand the culture when you’re doing business. Odd sorts of situations that won’t resolve in any meaningful way in any meaningful amount of time.

Things happen back home that global staff will have a hard time dealing with. Can be a high personal cost – if something needs to happen quickly you need to be prepared to get people on a plane so they can get home to deal with emergencies.

International assignments provide an unparalleled opportunity for growth and development. CMU worked on an international work assignment program – have people submit ideas for work at international locations. Gets people thinking about the campus globally.

CSG Fall 2012 – Global IT Services & Environments

Bob Johnson from Duke is introducing the Global Networking panel – we’re all faced with the same networking issues. Bandwidth availability, political restrictions, latency, jitter. Asia Pacific activities underway – Duke has a medical school in Singapore, building a million square foot campus in Shanghai, NYU has presence in Shanghai, Chicago has a presence in Singapore and Beijing. NYU is opening a presence in Sydney.

Working on building an International Network Exchange Point and Co-location site to provide a common point for connection of regional R&E networks. Benefits are cost savings from building on R&E networks instead of leased lines to the US. Co-lo space is just cost sharing. It also provides a neutral location for data storage and hosting computer services (lower latency).

Where to put this? Reviewed three sites. The best ended up being TATA in Singapore.

Dale from Internet2 goes over the process and the support services required. There will be a telepresence and HD video exchange under the Internet2 Commons. Network performance monitoring will be available. Multiple functions for this facility: Co-location (initially capable of 10 racks, which can grow); Layer 3 capability; an instance of an Advanced Layer 2 Services exchange – support OpenFlow, SDN, Dynamic Layer 2 circuits; Exchange will operate as a GLIF Open Lightpath Exchange; Essentially policy free – if you can get a circuit in and pay the fees, you’re welcome.

Some sites might bring in their own address space and router, others might use shared space. 1 Gb/s physical link to commodity Internet, initially provisioned at 200 Mb/s on the 1 Gb/s circuit (with some burst capabilities). 1 Gb/s link to global switching building for peering – TEIN3, Gloriad (which ends up in Seattle). 1 Gb/s link to Hong Kong light to meet CERNET and CSTnet.

Timelines: Sept 14 – TATA agreements to be signed and in place and equipment ordered; Nov 1 – equipment delivered to Singapore; Dec 15 – everything in place to begin testing; Jan 1 – fully operational.

Kitty – what’s worked and what hasn’t?

NYU has been testing, focusing on latency and user experience – acceptable, tolerable, or frustrating.

Common issues – network bandwidth: the amount, and does it match contracted bandwidth? Response times are highly variable. Some apps aren’t tuned for latency. Latencies range from around 80 ms to over 300 ms depending on sites. Focused on two forms of testing/monitoring – latency simulator and actual testing from different locations. Implemented a tool to understand user experience for web-based applications.
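
To see why applications that aren't tuned for latency suffer, here is a minimal Python sketch: if a page needs a number of sequential round trips, the network wait grows linearly with round-trip time. Only the 80 ms and 300 ms latencies come from the notes above. The round-trip count, the LAN figure, and the thresholds for acceptable, tolerable, and frustrating are hypothetical and are not NYU's actual numbers.

```python
def network_wait_seconds(sequential_round_trips, rtt_ms):
    """Time spent waiting on the network when requests happen one after another."""
    return sequential_round_trips * rtt_ms / 1000.0


def rate_experience(wait_seconds, acceptable_s=2.0, tolerable_s=5.0):
    """Bucket a wait time using hypothetical thresholds (not NYU's)."""
    if wait_seconds <= acceptable_s:
        return "acceptable"
    if wait_seconds <= tolerable_s:
        return "tolerable"
    return "frustrating"


# Hypothetical page that makes 20 sequential requests.
ROUND_TRIPS = 20
SITES = [("campus LAN (assumed)", 1), ("near site", 80), ("far site", 300)]

for label, rtt_ms in SITES:
    wait = network_wait_seconds(ROUND_TRIPS, rtt_ms)
    print(f"{label}: {wait:.2f} s -> {rate_experience(wait)}")
```

The same page that feels instant on the LAN spends seconds waiting at 300 ms, so reducing the number of sequential round trips helps the most distant users the most.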

Implemented a long distance performance simulator to create profiles. Implemented a tool called TrueSight that’s a Web App performance tool – allows clear understanding of what happens in a web app. An appliance connected to the F5 SPAN port – captures all the traffic and analyzes. Performance metrics of http and https web traffic. Able to track usage over time then drill down into specific sessions. Service leads get daily or weekly reports. Anonymized data being moved to data warehouse for trend analysis.

Remediation – optimizing webpages, applications; tuning network; WAN acceleration

It’s hard for app builders and owners to think of applications this way. Network folks haven’t really understood how applications perform on networks. Most app builders assume their users are on the LAN, not across the world.

Aspire to do testing before going live, setting watch points on end-user app tool to watch how performance is doing. Working with cloud vendors on how they test instances before selecting.

CSG Fall Meeting 2012 – Big Data – open discussion

Raj opens the discussion by asking about who’s responsible for big data on campus, and do we even know who makes that decision?

In talking about repositories, are there campuses that are trying hard to fill their repositories? A few raise their hand, but most don’t.

Moving to the cloud for storage – will there still be a need for local storage expertise? John says the complexity and heterogeneity is increasing and won’t go away any time soon. Bruce would like to see us positioned to provide expertise and advice on advanced data handling like Hadoop. Mark notes that we should be using those technologies to mine our own data sets – e.g. security mining application and network logs.

In a discussion of data management and the need to engage faculty on planning for data management Kitty notes that she’s had some success hiring PhDs who didn’t want to go into faculty positions but had a lot of experience with data. They could talk to faculty in ways that neither librarians nor IT people can. Curt notes that last year’s data management workshop noted the need for grad students to be trained in data management as part of learning research methods.

John asks whether people are planning central services for systems that store secure data that cannot be made public. Bill notes that Stanford is definitely building that as part of their repository, including ways to store data that cannot be released for a certain number of years. Carnegie Mellon is also planning to do this. Columbia has a pilot project in this area.

Raj notes that on the admin side, Arizona State is doing a lot of mining of their student data and providing recommendations for students on courses, etc.

Mark notes that we don’t have a good story to tell about groups in controlling access. Michael says we do have a fairly good story on group technology (thanks to Tom’s work on Grouper), but we still need to work across boundaries and to develop new standards, such as OAuth and SCIM.

Mark also postulates that the volume of data generated by Massive Open Online Courses would be a really interesting place to think about big data.

There’s some general discussion about the discoverability of data in the future and the ability to understand and use data in the future when the creators might not be available. That’s where data curation becomes important.  Data curation should include standards for metadata and annotation, and also processes for migrating data forward as formats change. Ken quotes a scientist as saying “I would rather use someone else’s toothbrush than their ontology”.

On to lunch.