CSG Spring 2013 – Data Warehousing and Data Governance

We’re at Brown University in Providence, Rhode Island. The first workshop is about data warehousing and governance. 

Business Intelligence survey –

What are the most important business drivers for BI? Top 3: Have data consistent & reliable across organization (this is the therapy we provide as IT professionals, to get the business offices to agree on data); make better data-informed decisions – people expect data at their fingertips, and it’s not; easier access to data – access is easy, but trusting that people will make decisions based on correct interpretations of data is not uniform. What is the role of IT in making people come together on data?

What are the biggest barriers to using BI? Cooperation and coordination across depts – business offices don’t have the time or intellectual investment to help figure the issues out; No common source of data – data pops up everywhere, how do we corral it to have a system of record?; no consistency in definitions or terms; Customers knowing what they want in a BI solution – we know this might not be an IT problem, but the rest of campus doesn’t always. 

What business intelligence related initiatives are you interested in? Self-service – you give people access to tools and they take the data and put it in Excel; learning analytics and student outcomes (predictive analyses); big data.

BI – Just Do It – Suneetha Vaitheswaran, U of Chicago

There is a roadmap for doing BI the right way, as expressed by professional organizations. The state of universities doesn’t always support that approach – economic downturn, continued aggressive growth, a faculty-driven, bottom-up institution with a lack of standardization around data, yet executives are demanding data for decision making. Hundreds of legacy and siloed source systems, erratic data stewardship.

Took over management of systems and consultants to coordinate reporting approaches into an enterprise-wide solution. Began to revamp the training program, recognizing that not everyone should be a user of the ad hoc reporting tools. Began using consistent methods and tools. Started a Data Stewardship Council. Had a new registrar who championed the Student data warehouse project, and got reporting considered as part of the new research admin system.

The emerging vision is to log in to one place for reporting with a similar look/feel across subject areas, and to support different user skill levels, with training as a means to an end. High performance, availability and integration points are important. Had to commit to multiple projects at a time, focused on standards and process including coordinating with the PMO, and brought in SMEs earlier. Had to acknowledge surprises and new priorities as they come up – need to work them into the strategy. They do a fair amount of resource loading to know where they have capacity.

Marketing opportunities come up in new ways, so be ready. Planning happens all the time.

MIT has managed data services where they maintain servers and databases for local units and integrate tables with local data.

Managed self-service BI – Jerry Singer, UCSD

Definition – Managed facilities that empower BI users to become more self-reliant and less dependent on central IT

Primary objectives: Agility to respond to changing requirements; quick to market; lower costs; clearer definition of roles with emphasis on individual strengths; improve access to source data; facilitate data governance and collaboration; centrally control tool set and enterprise data; reduce toolset learning curve; provide function-rich BI tools with current state capabilities.

Key success factors: make BI tools easy to use; make BI results easy to consume and enhance; make DW solutions fast to deploy and easy to manage; make it easy to access source data. 

Roles – BI/DW Builder and info producer (Access); BI/DW builder; Information producer; information consumer and collaborator

Organizational considerations: governance; IT role (monitoring, oversight); Departments (access and security definition, requirements and design).

Recommendations: Just installing a tool does not get you there. Support collaborative BI, starting with friendly collaborators with current needs. Establish governance, reuse deliverables, release toolset functionality incrementally by creating a starter set of reports, analyses and widgets. Provide training and sandbox environments, enable data federation to include DW source.

Framework for starting or restarting a BI program – Todd Hill, Notre Dame

It was taking far too long to deliver business value to customers, and BI projects are incredibly complicated. Came up with a metaphor: BI is like a fine restaurant: variety of options; well-prepared (easy to use); consistent; timely; great service.

Looked more like: long time to deliver (15-18 months), customer frustration, poor morale, internal conflict, role confusion.

Data preparation is like the kitchen. Presentation is like the dining room experience. No matter how good your kitchen is, that’s not what the customers come for.

Step 1: Get serious about BI – create a sense of urgency. Build a guiding coalition with a strong sponsor – it’s a business problem not a technology one. 

Who’s driving the bus? Make sure you know. Is it central IT, or the business?

Step 2 – Drive accountability

Step 3 – Remember that BI is about much more than technology: Data governance, functional requirements, technology. 

Data input integrity is important – push changes back into the source systems, don’t change in the DW. 

80-90% of customers use pivot tables in Excel to analyze data – so why not support it?

Step 4: Begin with the end in mind. Mapped requirements by stakeholders. Shows commonality of requirements across organizations. EVP required prioritization of functions. 

Step 5 – … but start small – pick bite-size chunks that deliver value, don’t spend all your time in the kitchen. Did a 4-month project in HR, which resulted in the customers demoing to the EVP by themselves.

Step 6 – Embrace change – you’re not going to get it right at first. Embraced agile method – typical project is 4 months. Business value delivered in one-month increments. Think about the spectrum of BI from personal, through team, to enterprise. Hard to start with enterprise. Personal BI is enabled by in-memory tools that encourage iteration, then you want to share. Jump to team BI – get more feedback, iterate some more till it’s good for your department. They build in a burn-in period of 4 months. Then elevate to the enterprise, which is where you can invest in infrastructure, fine-grained access controls, etc.

This is not a prototype approach: each step involves delivery of business value. Issues found at the enterprise level take about 8x as long to fix as issues found in the personal space, so it’s important to find them early.

Guiding principles: keep it simple (don’t over-engineer a solution); be agile (flex on scope, not timeline); speak common language; Work smarter; celebrate successes (and failures) – they demo after the 2nd and 4th sprints to all customers.

Data security 

One case (from Suneetha) – legacy systems, so the DW must build custom security; each model driven by steward; department pressure for additional granularity; substantial administration and workflow; data usage requests to document feeds; data stewardship council surfaces issues. Model deals with culture and systems at each institution.

Data Governance Approaches – How do you secure and manage access to data? Anja Canfield-Budde, University of Washington.

Data Management Committee, appointed by Provost. Created a Data Map, aligning business and data perspectives. Seven main subject areas further classified into data domains, each of which has a steward identified. Developed a framework for managing access to data. A finite set of fourteen roles has been identified across all areas. Access granted by role across subject areas. Data Access Control (DAC) controls data access across systems. Data stewards classify users into the fourteen roles, and then apply security classifications to the data. DAC consumes data from DAC. Security views dynamically show only the data a user is allowed to see.
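A minimal sketch of the idea behind role-based security views, to make the mechanism concrete. This is not UW's implementation: the role names, classification levels, and sample records below are all made up for illustration, and a real deployment would have fourteen steward-defined roles and would filter inside the database.

```python
from dataclasses import dataclass

# Hypothetical classification levels, ordered from least to most sensitive.
LEVELS = {"public": 0, "restricted": 1, "confidential": 2}

# Hypothetical mapping: role -> highest classification allowed per subject area.
ROLE_CLEARANCE = {
    "academic_advisor": {"student": "restricted"},
    "registrar_staff": {"student": "confidential"},
    "finance_analyst": {"finance": "confidential", "student": "public"},
}


@dataclass(frozen=True)
class Record:
    subject_area: str
    classification: str  # applied by the data steward
    payload: str


def security_view(records, roles):
    """Return only the records that at least one of the user's roles may see."""
    visible = []
    for rec in records:
        for role in roles:
            allowed = ROLE_CLEARANCE.get(role, {}).get(rec.subject_area)
            if allowed is not None and LEVELS[rec.classification] <= LEVELS[allowed]:
                visible.append(rec)
                break  # one matching role is enough
    return visible


# Made-up sample data.
RECORDS = [
    Record("student", "public", "enrollment count by term"),
    Record("student", "restricted", "course grades"),
    Record("student", "confidential", "disciplinary notes"),
    Record("finance", "confidential", "salary detail"),
]
```

Calling security_view(RECORDS, ["academic_advisor"]) would return the first two student records only. A user with no roles sees nothing, which matches the point that access is granted by role rather than per person.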

The implementation of data security has resulted in rapid growth in BI use. This was not anticipated. Because not all data was assumed to be sensitive and people could trust the classification of data, they were more comfortable. Have a portal that not only provides access to data, but also metadata about security.

Data governance: http://www.washington.edu/uwit/im/dmc/

UW Data Warehouse and BI: http://decisionsupport.washington.edu 

Effective Practices for Data Governance – Mike Chapple, Notre Dame

Difficulty in defining terms like “student”. Overarching goal is to provide access to data. Model of pillars, with five principles: quality & consistency; policies & standards; security & privacy; compliance; retention & archiving.

Each project preceded by a data governance review that defines terms in those five principles. 

Governance model – Exec sponsors (EVP & CIO), Campus data steward, unit data stewards, coordinating committees (info governance committee, data-driven decision making steering committee). 

Building a data dictionary – definition, source, what’s the query, revision history, etc. Definition has to be in plain English understandable by average admins. Using Google Docs because it’s easy to do collaborative editing, but it’s a bad repository – plan to move to SharePoint for the repository. Defining each term requires a 1.5-hour meeting with 6-8 people plus an hour of prep and an hour of cleanup. Have currently defined around 100 terms.
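For illustration, a data dictionary entry with the fields listed above (definition, source, query, revision history) could be modeled roughly like this. The class and field names are my own, and the definition text, source system name, and query are placeholders, not Notre Dame's actual content.

```python
from dataclasses import dataclass, field
from datetime import date


@dataclass
class Revision:
    changed_on: date
    note: str
    previous_definition: str


@dataclass
class DictionaryEntry:
    term: str
    definition: str      # plain English, readable by a non-technical admin
    source_system: str   # system of record for the term
    query: str           # how the value is actually pulled
    revisions: list = field(default_factory=list)

    def revise(self, new_definition, note, changed_on):
        """Replace the definition and keep the old one in the revision history."""
        self.revisions.append(Revision(changed_on, note, self.definition))
        self.definition = new_definition


# Placeholder content only, to show the shape of an entry.
entry = DictionaryEntry(
    term="student",
    definition="(placeholder) A person enrolled in at least one course this term.",
    source_system="(placeholder) student information system",
    query="(placeholder) SELECT ... FROM enrollment WHERE ...",
)
```

Keeping the previous definition in each revision means the history survives a move from one repository to another.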

Give the group a starting point – don’t start with a blank sheet. Develop simple guiding principles (e.g. put jargon in its place). 

Business Intelligence Functional Approach – Chris Frederick, Notre Dame

Customers were unsatisfied, and deliveries were late and incomplete. 

Key issues: Insufficient collaboration, low accountability, user doesn’t know all the requirements up front, bad data, no shared data definitions, overly complex and hard to maintain (we were enamored of the technology), tools are too difficult.

Waterfall doesn’t work – takes too long and you don’t get the tight feedback loops you need. Be agile. Got visual with their scrum boards. Priorities change. 

Breakthrough – In-memory BI tools. Column-oriented databases instead of row-oriented. Can mash up data from disparate sources, can easily fit 10s of millions of rows into memory, reduces need to optimize databases, quickly create rich interactive dashboards. Security needs to be the first concern, because data sits on your laptop. Usually requires 64-bit architecture. Watch out for “spread marts”. Hot tools: Microsoft PowerPivot for Excel, Tableau, QlikView.
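A toy sketch of the row-oriented versus column-oriented distinction, using plain Python lists rather than any of the tools named above. The sample data is made up, and the code assumes every row has the same fields. The point is only that an aggregate over one measure reads a single column instead of every full row.

```python
# Row-oriented layout: one dict per record (made-up sample data).
rows = [
    {"dept": "HR", "year": 2012, "headcount": 40},
    {"dept": "HR", "year": 2013, "headcount": 42},
    {"dept": "IT", "year": 2012, "headcount": 55},
    {"dept": "IT", "year": 2013, "headcount": 60},
]


def to_columns(rows):
    """Pivot a list of row dicts into a dict of column lists."""
    columns = {}
    for row in rows:
        for name, value in row.items():
            columns.setdefault(name, []).append(value)
    return columns


def sum_by(columns, key, measure):
    """Group-and-sum that reads only the two columns it needs."""
    totals = {}
    for k, v in zip(columns[key], columns[measure]):
        totals[k] = totals.get(k, 0) + v
    return totals


columns = to_columns(rows)

# Same answer either way; the column version touches one list.
total_from_rows = sum(r["headcount"] for r in rows)
total_from_columns = sum(columns["headcount"])
```

Here sum_by(columns, "dept", "headcount") gives the per-department totals without looking at the year column at all, which is roughly why this layout suits dashboards that aggregate a few measures over many rows.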

Work with partners who are “shovel ready”. Motivation, availability, knowledge of business problem.  Co-locate – get out of the server room. Exec product owner came to the daily standups.

Schedule live BI demos. Learn in the least costly manner.


