Higher Ed Cloud Forum: Beyond the Architecture — Rethinking Responsibilities

Glenn Blackler (UC Santa Cruz)

Cloud-First! Now What…?

Santa Cruz’s approach – hardware infrastructure was going to turn into a pumpkin in spring 2018. “Screw it – we’re all in, let’s jump.”

What’s our approach? How can existing teams support this change? Program work vs. migration specific work. Our focus – enterprise applications.

Defining the program: Plan for a quick win (build confidence, get familiar, identify training needs). Go big – went from a small PHP app to identity management infrastructure. All in! – moved PeopleSoft and Banner. Run concurrent migrations.

But really. … why? Need to continually talk to customers about why they’re doing it. Benefits of cloud migration aren’t apparent – have to sell it. The pitch: elasticity, DR/BR, Accommodation (additional test environments); modernized tools and team structures; sustainability.

Teams – Separation of duties – now have separation between sysadmins and app admins and developers. Always been a handoff, ticket driven organization. Don’t know what org looks like in new world – took really smart people and threw them in a room and told them to figure it out. Core team includes App and Sys admins, plus less frequent contributions from security, DBA, networking, devs.

Looking at Cloud Engineering Team that incorporates OS Setup/Config/App Config/Maintenance. DBA team still a bit separate. Security contributing across the board, but not necessarily hands on all the time. Teams are learning new things about each other that they didn’t know in the ticket-driven world.

Future – shared responsibilities mean fewer handoffs; engineers with wider breadth of skills; improved cross-team collaboration through shared code base; continuous improvement through evolving technical design and available services; adjusted job titles and responsibilities; ITS reorganization; budget impact, review of recharge model.

New ways of collaborating: Sys and App admins using a single git repository for code; shared tools/technologies and password management; cross-functional tier 1 support.

Lessons learned – don’t lock decisions down too early, use governance to end debates, identify project goals that foster exploration (within timeline), use consultants carefully. Traditional PM will not work, push boundaries of what is possible, required vs. ideal – compromise is important; don’t compare with mature on-premise architecture; be prepared for rumors.

Not everyone is on the bus – what about those who don’t want to get on?

Higher Ed Cloud Forum – Lightning Round #1

Phil Robinson – Cloud Progress at Cornell Student Services IT

First AWS account – July 2015 – adopted a cloud first strategy. Now have about 30 apps on AWS (migrations, rewrites, new apps). Automate with Jenkins and Ansible. Retiring on-prem VMs.

Custom class-roster app, used by students to decide what to take. Added central syllabi feature this year. Using SNS+SQS as message bus, orchestrating events; CloudFront delivery for syllabi; On fly ClamAV scans on upload; ElasticSearch for searching; SES for notifications by email. Developed in 3632 hours.

Looking towards containerizing and VDI.

Gerard Shockley – BU iPaaS RDS AWS

iPaaS ODS (Operational Data Store) in RDS – integration service designed to integrate many data feeds into the SnapLogic platform. Using AWS Aurora.

Bob Winding – Cloud Automation Journey

Most fully automated in GovCloud project. CloudFormation (VPCs, IAM, Security Groups, Centralized alerts); Ansible and CloudFormation for server builds; Console federation with ADFS; consistent process for all project accounts; new project account in a couple of hours; decentralized maintenance of CF templates.

Penn –

What does “cloud native” mean at Penn?

Case study 1 – online giving portal: Data ETL (Talend); to Postgres RDS (fundraising metadata); S3 / CloudFront; to Oracle on prem. Near real-time.

Case study 2: Service ordering (VDI and Backup requests). On-prem PowerShell makes changes in AD groups, sends messages through SQS.

Case study 3 – Device registration. On-prem registration; does API keys in Lambda.

Sara Jeanes – Considerations in moving HPC workloads to the cloud

Initial framing questions: Do they have a preference for which cloud provider (do they have credits, different tech); Is there a multi-cloud resiliency need?

Workload questions: Can it be interrupted (use spot instances), large workloads firewall considerations (ScienceDMZ);

Jeff Minelli – Penn State – CloudCheckr enabling transparency at Penn State

Gain insights into financial transparency, spend optimization, resource utilization and right-sizing, cost allocation, best practices, security & compliance, collection and unification of AWS API data, continuous monitoring, reporting and alerts

Working with CloudCheckr to enable SAML. Basic group email notifications. Configuration of $100 spending alerts.

Trying to get CloudCheckr into InCommon.

Network Firewall Policies for Hybrid Cloud – Brian Jemes – University of Idaho

In cloud managing firewalls with server tags. Gets complicated when managing across on-prem and cloud. On-prem have Cisco tools to manage ASA firewalls.

Options: manage hybrid cloud policy in on-prem firewall; manage hybrid policies with traditional firewalls in cloud; develop a hybrid tool.

Looking at a startup called Bracket Computing – cloud firewall policy manager. brkt.com – Provides micro-segmentation.

John Bailey – Washington University (St. Louis). Cloud IAM

Balance between security and usability. Enhancing usability with SPNEGO integrated auth: leverages the Kerberos token from machine login to perform a web SSO login, making the web login invisible to the customer.

Lou Tiseo – how categorizing resources helps to understand cloud usage

Requiring seven different tags. Using Cloudyn management dashboard. Helped save costs by using reserved instances.

Chris Malek Caltech – Automation tools for AWS ECS and Batch

deployfish – configure almost all aspects of an ECS service (load balancing, app autoscaling, volumes, environment, etc). They’ve open sourced it. Create, inspect, scale, update, destroy and restart ECS services with single commands; manage multiple environments (test, qa, prod, etc). Integrates directly with Terraform. YAML driven.

batchbeagle — allowing people to manage AWS Batch. Create, update, disable, and destroy queues. Create, update, disable, and destroy compute environments. Create job descriptions. Submit and manage jobs, etc.

Amanda Tan – Washington

Enabling cost notifications on AWS. Cost monitoring is difficult – should be zero effort. Two-pronged attack: auto-tag resources, and send a daily email notification with total spend and resource usage. A CloudFormation template sets up CloudWatch, which invokes the auto-tag Lambda function. AutoTag tags resources with owner and principal-id. Notification works off DLT billing records, provided in S3 buckets twice a day.
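The auto-tagging step might look roughly like the sketch below. This is an illustration only, not the Washington code: it assumes the Lambda is handed a CloudTrail record for an EC2 RunInstances call, and the function names `owner_from_identity` and `tags_from_event` are hypothetical. The tag keys `owner` and `principal-id` come from the notes. The sketch only builds the tag list; the call that applies the tags in AWS is left out.

```python
def owner_from_identity(identity):
    """Pick a readable owner name from a CloudTrail userIdentity block."""
    if identity.get("type") == "IAMUser":
        return identity.get("userName", "unknown")
    # Assumed-role and federated sessions carry the session name
    # as the last path segment of the ARN.
    arn = identity.get("arn", "")
    return arn.rsplit("/", 1)[-1] if "/" in arn else "unknown"


def tags_from_event(detail):
    """Return (instance_ids, tags) for one CloudTrail RunInstances record.

    `detail` is assumed to be the CloudTrail record itself (hypothetical
    wiring; how the record reaches the function is not covered here).
    """
    identity = detail.get("userIdentity", {})
    response = detail.get("responseElements") or {}
    items = response.get("instancesSet", {}).get("items", [])
    instance_ids = [item["instanceId"] for item in items]
    tags = [
        {"Key": "owner", "Value": owner_from_identity(identity)},
        {"Key": "principal-id", "Value": identity.get("principalId", "unknown")},
    ]
    return instance_ids, tags
```

Keeping the tag-building logic in a plain function makes it easy to test without any AWS access.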


Self Service at Yale

Rob Starr, Jose Andrade, Louis Tiseo (Yale University)

Community told them needed to be able to spin machines up and down at will for classes, etc. Started with a big local open stack environment, now building it out at AWS.

Wanted to deliver agility, automate and simplify provisioning, shared resources, and support structures, and reduce on-premises data centers (one data center by July 2018).

Users can self-service request servers, etc. Spinup – CAS integration, patched regularly, AD, DNS, Networking, Approved security, custom images.

Self-service platform – current manual process takes (maybe) 5 days. With Self-Service, it takes 10 minutes. Offering: Compute, Storage, Databases, Platforms, DNS

All created in the same AWS account. All servers have private IP addresses.

ElasticSearch is the source of truth.

Users don’t get access to the AWS console, but can log into the machines.

Built initial iteration in 3 months with 3 people. Took about a year to build out the microservices environment with 3-4 people. Built on PHP Laravel.

Have a TryIt environment that’s free, with limits.

Have spun up 1854 services since starting, average life of server is 64 days.

Higher Ed Cloud Forum 2017 – Intro and Multi Account AWS Strategy

Survey Results

46 institutions attending, 4 vendors, 81 unique roles among 90 attendees.

40% cloud first, 12% have a documented cloud exit strategy.

82% AWS, 14% Azure, 4% Google, 2% other

Staff readiness is the #1 obstacle to broad adoption

42% have signed the I2 Net+ agreement, 11% have enterprise agreement with cloud provider

21% have containers/serverless in production, 9% non-prod, 70% not currently adopting.

Managing and Automating a Multi-Account Strategy in AWS: Brett Bendickenson (Arizona)

Have their own agreement with AWS. Currently have about 75 accounts in their consolidated billing. 24 accounts in central IT.

UITS Cloud Advisory Team — cross functional group from within UITS to advise and decide on cloud practices and policies.

  • Tagging Policy – extremely important to get right up front. Service, name, environment, created by, contactnetid, accountnumber, sub account
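A tagging policy like this is easy to check mechanically. The sketch below is a minimal illustration, not Arizona's tooling: the exact key spellings in `REQUIRED_TAGS` are assumptions derived from the list above, and `missing_tags` is a hypothetical helper name.

```python
# Key spellings are assumed; the notes give the tag names only informally.
REQUIRED_TAGS = (
    "service",
    "name",
    "environment",
    "createdby",
    "contactnetid",
    "accountnumber",
    "subaccount",
)


def missing_tags(tags, required=REQUIRED_TAGS):
    """Return the required tag keys that are absent or blank.

    `tags` is a plain dict of tag key to value; keys are compared
    case-insensitively, and whitespace-only values count as missing.
    """
    present = {key.lower() for key, value in tags.items() if str(value).strip()}
    return [key for key in required if key not in present]
```

For example, `missing_tags({"Service": "kuali", "Name": "web1"})` lists the five remaining keys, which could then drive a notification or a launch-time check.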

Multi-account strategy. Workloads segregated into production and non-prod accounts. Tipping point was properly restricting everything by permissions – can do it with IAM roles, but it’s a lot of work. Decided on further segregation by teams / technologies, e.g. Kuali, PeopleSoft, IAM. Each has prod and non-prod accounts.

Each account has an account steward (director or dept. head) — responsible for spend, security, etc. Each account has an email list, with the address used for the root login address. Password stored in common vault, secured with MFA hardware token (kept in Ops). Linked to a central billing account. Set of account foundation templates are deployed. Started using AWS Organizations.

Account foundation modeled after the AWS NIST 800-53 Quick Start CloudFormation template. Set of CloudFormation templates which deploy roles, security controls, etc. Sets up an EC2 instance that runs a set of Ansible playbooks that set up Shib, base AWS info, IAM, Logging, Lambda.

Federated Roles – SysAdmin, IAMAdmin, InstanceOps, ReadOnly, BillingPurchasing. Using Grouper for authorizations.

Using federated identities, no IAM users (generally).

CloudTrail enabled in all accounts. Enabled for all regions, records all API calls, sent to a central S3 Bucket in root account. CloudTrail logs also saved to CloudWatch logs in account for local reference.

Alarms set for changes in Network ACL, Security Group changes, Root Account activity, unauthorized access, IAM Policy changes, access key creation, cloud trail changes. (not all used in non-prod)

Lambda Functions – Alarm details (interrogates CloudTrail events and sends the actual API calls that raised the alarm); CreatedBy automated tagging for EC2 instances; OpsWorks tagging helper; Route53 helper (updates DNS); Tag monitoring – checks tags on instance launch (looking at Cloud Custodian from Capital One (open source)); AMI lookup.
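The "alarm details" idea, turning the CloudTrail events behind an alarm into a readable message, can be sketched as below. This is an assumption-laden illustration rather than Arizona's function: `summarize_events` is a hypothetical name, and fetching the events from CloudTrail and sending the message are left out. The field names used (`eventTime`, `eventSource`, `eventName`, `sourceIPAddress`, `userIdentity.arn`) are standard CloudTrail record fields.

```python
def summarize_events(records):
    """Build one line per CloudTrail record, oldest first."""
    lines = []
    # ISO 8601 timestamps sort correctly as plain strings.
    for rec in sorted(records, key=lambda r: r.get("eventTime", "")):
        who = rec.get("userIdentity", {}).get("arn", "unknown identity")
        lines.append(
            "{time} {source}:{name} by {who} from {ip}".format(
                time=rec.get("eventTime", "unknown time"),
                source=rec.get("eventSource", "unknown source"),
                name=rec.get("eventName", "unknown call"),
                who=who,
                ip=rec.get("sourceIPAddress", "unknown address"),
            )
        )
    return "\n".join(lines)
```

The returned text could be used as the body of the alarm email so the recipient sees which API calls triggered it.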

Arizona’s code: https://bitbucket.org/ua-ecs/service-catalog

CSG Winter 2017 – Cloud ERP Workshop

Stanford University – Cloud Transformations – Bruce Vincent

Why Cloud and Why now? Earthquake danger; campus space; quick provisioning; easy scalability; new features and functions more quickly

Vision for Stanford UIT cloud transformation program: Starting to behave like an enterprise. Shift most of service portfolio to cloud. A lot of self-examination – assessment of organization and staff. Refactoring of skills.

Trends and areas of importance: Cloud  – requires standards, process changes, amended roles; Automation – not just for efficiency – requires API integration; IAM – federated and social identities, post-password era nearing for SSO; Security – stop using address based access control; Strategic placement of strong tech staff in key positions; timescale of cloud ignores our annual cycles.

Challenges regarding cloud deployments: Business processes tightly coupled within SaaS products, e.g. ServiceNow and Salesforce; Tracking our assets which increasingly exist in disparate XaaS products; Representing the interrelationships between cloud assets; Not using our own domain namespace in URLs.

Trying to make ServiceNow the system of record about assets – need to integrate it with the automation of spinning instances up and down in the cloud.

Cloud ERP – Governance and Cloud ERP – Jim Phelps, Washington

UW going live with Workday in July. Migrating from old mainframe system and distributed business processes and systems. Business process change is difficult. Built an integrated service center (ISC) with 4 tiers of help.

Integrated Governance Model:  across business domains; equal voice from campus; linking business and technology; strategic, transformative, efficient…

Governance Design: Approach – set strategic direction; build roadmap; govern change – built out RACI diagram.

“Central” vs “Campus” change requests – set up a rubric for evaluating: governance should review and approve major changes.

Need for a common structured change request: help desk requests and structured change requests should be easily rerouted to each others’ queues.

Governance seats (proposed): 7 people – small and nimble, but representative of campus diversity.

Focus of governance group needs to be delivering greatest value for the whole university and leading transformational change of HR/P domains. Members must bring a transformational and strategic vision to the table. They must drive continuous change and improvements over time.

Next challenge: transition planning and execution – balancing implementation governance with ISC governance throughout transition – need to have a clear definition of stabilization.

Next steps: determine role of new EVP in RACI; Align with vision of executive director of ISC; provost to formally instantiate ISC governance; develop and implement transition plan; turn into operational processes

UMN ERP Governance – Sharon Ramallo

Went live with PeopleSoft 9.2 on 4/20/2015 – no issues at go-live!

Implemented governance process and continue to operate governance

Process: Planning, Budgeting; Refine; Execution; Refine

  • Executive Oversight Committee – Chair: VP Finance. Members: VP OIT, HR, Vice Provost
  • Operational Administrative Steering Committee: Chair: Sr. Dir App Dev;
  • Administrative Computing Steering Committee – people who run the operational teams
  • Change Approval Board

Their CAB process builds a calendar in ServiceNow.

USC Experience in the Cloud – Steve O’Donnell

Current admin systems  – Kuali KFS/Coeus, custom SIS (Mainframe), Lawson, Workday, Cognos

Staffing and skill modernization: Burden of support shifts from an IT knowledge base to more of a business knowledge base – in terms of accountability and knowledge.  IT skill still required for integrations, complex reporting, etc. USC staffing and skill requirements disrupted.

Challenges: Who drives the roadmap and support? IT Ownership vs. business ownership; Central vs. Decentralized; Attrition in legacy system support staff. At risk skills: legacy programmers, data center, platform support, analysts supporting individual areas.

Mitigation: establishing clear vision for system ownership and support; restructure existing support org; repurpose by offering re-tooling/training; Opportunity for less experienced resources – leverage recent grads, get fresh thinking; fellowship/internships to help augment teams.

Business Process Engineering – USC Use cases

Kuali Deployment: Don’t disrupt campus operations. No business process changes. Easier to implement, but no big bang.

Workday HCM/Payroll: Use delivered business process as starting point. Engaged folks from central business, without enough input from campus at large. Frustrating for academics. Workday as a design partner was challenging. Make change management core from beginning – real lever is conversations with campus partners. Sketch future state impact early and consult with individual areas.

Current Approach – FIN pre-implementation investment

Demonstrations & Data gathering (requirements gathering): Sep – Nov. Led by Deloitte consultants; cover each administrative area; work team identifies USC requirements; Community reviews and provides feedback. Use the services folks, not the sales folks.

Workshops (develop requirements): Nov – Feb. Led by USC business analysts, supported by Deloitte; Work teams further clarify requirements and identify how USC will use Workday; Community reviews draft and provides feedback

Playbacks (configure): March – May. Co-led by consultants and business analysts; Workday configured to execute high-level USC business requirements; Audience includes central and department-level users

Outcomes: Requirements catalog; application fit-gap; blueprint for new chart of accounts; future business process concepts; impacts on other enterprise systems; data conversion requirements; deployment scope, support model

CIO Panel – John Board; Bill Clebsch; Virginia Evans; Ron Kraemer; Kelli Trosvig

Cloud – ready for prime time ERP or not? Bill – approaching cautiously, we don’t know if these are the ultimate golden handcuffs. How do we get out of the SaaS vendors when we need to? Peoplesoft HR implementation has 6,000 customizations and a user community that is very used to being coddled to keep their processes. ERP is towards the bottom of the list for cloud.

Virginia – ERP was at the bottom of list, but business transformation and merger of medical center and physicians with university HR drove reconsideration. Eventually everything will be in the cloud.

John – ERP firmly at the bottom of the list.

Kelli – at Washington were not ready for the implementation they took on – trusted that they could keep quirky business processes, but that wasn’t the case. Took a lot of expenditure of political capital. Everyone around the table thought it was all about other people changing. Very difficult to get large institutions onto SaaS solutions because the business processes are so inflexible. Natural tendency is to stick with what you know – many people in our institutions have never worked anywhere else. Probably easier at smaller or more top-down institutions.

Ron – We should ask: is higher ed ready for prime time ERP or not? We keep trying to fix the flower when it fails to bloom. People changing ERPs are doing it because they have to – data center might be dying, COBOL programmers might be done. Try to spend time fixing the ecosystem. Stop fixing the damn flower.

Kelli – it’s about how you do systemic change, not at a theoretical level.

Bill – what problem are we trying to solve? Need to be clear when we go into implementations. At Stanford want to get rid of data centers – space is at too much of a premium, too hard to get permits, etc.

John – there’s an opportunity to be trusted to advise on system issues, integration, etc.

Kelli & Ron – The financial model of cap-ex vs. op-ex is a critical success factor.

Ron – separating pre-sales versions from reality is critical. That’s where we can play an important role.

John – we have massive intellectual expertise on campus, but we’ve done a terrible job of leveraging our information to help make the campus work better. We’ve got the data, but we haven’t been using it well.

Bernie – we need to start with rationalizing our university businesses before we tackle the ERP.

Ron – incumbent on us to tell a story to the Presidents. When ND looks at moving Ellucian they think what if they can stop running things that require infrastructure and licenses on campus? Positions us better than we are today. Epiphany over the last 6 months: We have to start telling stories – we can’t just pretend we know the right things to do. Let’s start gathering stories and sharing them.

Kitty – Part of the story is about the junk we have right now. The leaders don’t necessarily know how bad the business processes and proliferation of services are.

Cloud Forum 2016 – Cornell’s BI move to the cloud

Jeff Christen – Cornell

Source Systems – PeopleSoft, Kuali, Workday, Longview. Dimensional data marts: finance, student, contributor relations, research admin. BI Tools – OBIEE and Tableau

They do data replication and staging of data for the warehouses. Nightly replication to stage -> ETL -> Data Marts

Why replication/stage? Consistent view of data for ETL processing, protects production source systems; tuning for ETL performance.

Started journey to cloud 2 years ago. Were using Oracle Streams – high maintenance, but met some needs. Oracle purchased a more robust tool and de-supported Streams. ETL tools challenge – were using Cognos Data Manager for 90% of their work, but IBM didn’t continue to support it. Replaced it with WhereScape RED, but that requires rewriting jobs. Apps were already moving off-premise: Workday for HR/Payroll, PeopleSoft to AT&T hosting, Kuali financials moving to AWS. Launched pilot project to answer “what would it take to run data warehouse environment in AWS?”

Small pilot – Kuali warehouse in AWS. Which existing tools will work? Desire to use AWS services such as RDS where possible; Testing of both user query performance and ETL performance.

Why Oracle RDS and not Redshift? Approximately 80% of the Kuali DW is operational reporting. Needs fine-grained security at the database level; A lot of PL/SQL in the current environment; Currently exploring Redshift for non-sensitive high volume data

Some re-architecting: Oracle Streams not supported with Oracle RDS (used Attunity). Oracle Enterprise Manager scheduler not supported with Oracle RDS – using Jenkins (so beautiful and simple); No access to OS on RDS databases – installed Data Manager on separate Linux EC2 instance; Using WhereScape to call Data Manager from the RDS database.

Need to be more efficient. On premise the KDW had two physical servers. Found some inefficiencies in ETL code and some dashboard queries were masked by large servers. Prioritization of ETL code conversion by long running areas helped get AWS within nightly batch window. Some updates made to dashboards to improve performance or offer better filter options. Hired database tuning consultant (2wk) to help with Oracle tuning.

Testing and User Perception. Started with internal unit testing. Internal query execution time comparisons between on premise and AWS. User testing of dashboards on premise versus AWS. Repoint of production OBIEE financial dashboards to AWS for a day (x3). Some queries came back faster, some slower. Went through optimization and tuning to get it comparable across the board.
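A comparison of query execution times between the two environments could be automated along these lines. This is only a sketch under stated assumptions: `compare_timings` is a hypothetical helper, timings are taken to be seconds per named query, and the 10% tolerance is an arbitrary illustrative threshold, not a figure from the talk.

```python
def compare_timings(on_prem, cloud, tolerance=1.10):
    """Return queries that run slower in the cloud than on premise.

    `on_prem` and `cloud` map a query name to its execution time in
    seconds. A query is reported when cloud time / on-prem time exceeds
    `tolerance`. The result maps query name to that ratio, worst first.
    """
    slower = {}
    for name, base in on_prem.items():
        if name in cloud and base > 0:
            ratio = cloud[name] / base
            if ratio > tolerance:
                slower[name] = round(ratio, 2)
    return dict(sorted(slower.items(), key=lambda kv: kv[1], reverse=True))
```

The worst-first ordering gives a natural priority list for tuning work.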

Cutover to AWS. Cutover Sept. 8. Redirected all non-OBIEE ODBC client traffic in October. Agreed to keep the on premise KDW loading in parallel for two month end closings as a fall back.

Next Steps. Parallel Research Admin Mart already in AWS – expect cutover by end of CY. Need more progress on ETL conversion before moving student and contributor marts. Continue Big Data / non-traditional data investigation (Cloudera on AWS). Redshift for large non-sensitive data sets.

Lessons learned: Off premise hosting does not equal Cloud technology. Often hard to get data out of SaaS apps.

Cloud Forum 2016 – Lightning Rounds #2

Cloud VDI – Bob Winding (Notre Dame)

Use cases they looked at:

  • Classes that need locally installed software
  • Application delivery instead of high-end lab machines
  • Workstations for researchers where the whole project is in the cloud
    • NIST 800-171, ITAR, etc
    • Heavyweight, graphics and processing-intensive work

Looked at: Workspaces (AWS); Microsoft RDP and RDP Gateway, Fra.me, Ericom Blaze and Ericom Connect

Performance is everything – did tests with PTC Creo, Siemens NX10, and Solidworks. Set up test environment in Oregon. Nobody in central IT knew how to operate the software. Found in almost every case that the remote setup was beating the local desktop performance. In some cases, local environment crashed under load, but in AWS loaded in under 2 minutes. (G2X.large).

Researchers observed that they can transfer

Cloud Governance – Do You Need a Cloud Center of Excellence? Laura Babson (Arizona)

a group that leads an organization in an area of focus

Establish best practices, governance, and frameworks for an organization

Applications vs Operations – what do you do about tagging, automation, monitoring, security, etc. Don’t want to end up with different ops solutions for different applications.

CoE can help streamline decision making. CCoE can make decision if funding isn’t required, or make a recommendation to a budget committee if funding is required.

Recent decision making: Account strategy – how many and where to put each workload? Campus to Cloud Connectivity; Monitoring; Tagging policy

Can help with communication and engagement across the organization

AWS CloudFront – Gerard Shockley (Boston U)

What is a CDN? geographically dispersed low latency, high bandwidth solution for distributing http and https.

Terminology: Distribution (rules that control how cloudfront will access and deliver content); Origin (where the content lives)

Only works with publicly visible infrastructure at AWS

Easy to get metrics and drill down into specifics

DevOps != DevOps – Orrie Gartner (Colorado)

Brought a new data center online 3 years ago to consolidate IT across campus, built a private cloud

Ops and Devs teams work close together, automating everything, fine with accepting higher risks, building strong relations between teams, performing continuous integration and deployments.

Didn’t go well this summer moving to the public cloud – lack of understanding of vision and goals from other silos.

Ensure the entire enterprise strives for the same end goal, communicates that goal

Created a vision and articulated cloud strategy. 6 phase roadmap to public cloud, includes embracing DevOps culture. Line in strategic plan – encourages every team to articulate how they will embrace DevOps concepts.

Educate Up. Educate Laterally. Educate Down.

Change is not easy – changing culture in the organization. Prosci ADKAR – model embraced for making organizational change. Small steps, like encouraging process folks to use Jira, the same tool used by the devs and ops folks.

Us versus Them – a View From the Information Security Bleachers – David McCartney (Ohio State)

Security is not the enemy – they’re scared, unaware, and unprepared for the cloud.

Scared – “how can we stop you?”

Unaware – why move? what kind of data? what security is needed (vs. what you think you need)? what did we do to deserve this?

Unprepared – How do current security services expand? What do you mean “no agent”? Logging? Auditing? Access management? Vulnerability scans? incident response? What about regulatory and framework requirements?

Model Us + Them – Embrace security, buy them booze.

Engage security early, sell the opportunity to do something new and exciting, provide options for training and guidance.

MCloud: From Enable to Integrate – Mark Personett (Michigan)

MCloud is an umbrella service. Strictly IaaS – currently offering AWS, but might mean others later

First iteration launched in 2014 – access to UM enterprise agreement, optional consolidated billing; data egress waiver; M Cloud Consulting Service

Working on launching M Cloud AWS Integrate: provisioning – private network space, shibboleth integration, etc; Guardrails – security best practices, common logging, reporting, etc; Foundational services in AWS – AD, Shib, Kerb, DNS, etc; Site to Site VPN services.

Azure Remote App – Troy Igney (Washington U in St. Louis)

Two core requirements when enrollment in second year CS class spiked. Needed Visual Studio. New computers too expensive. On-prem VDI – too expensive. Off-prem VDI – Azure Remote App.

Goal – deliver consistent development environment across a range of BYOD devices.

Challenges: Support an entire class’s logons at once. Required Microsoft off-menu configuration.

Advantages – template once and deploy, capacity costs based on current enrollment – dynamically adjust for enrollment changes.

Largest RemoteApp deployment directly supporting classroom delivery.

Microsoft dropped RemoteApp in favor of Citrix virtualization technologies.

Lots of lessons learned supporting remote VDI

Adopting Cloud-Friendly Architecture for On-Premise Services – Eric Westfall (Indiana)

Indiana primarily on premise with an increasing amount of SaaS. Have newer data centers and heavy investment in VMWare. Inevitable to get to hybrid environment, but in the meantime working to be prepared – “cloud-ready” app architecture.

12 factor principles
Stateless Architecture
Microservices
Object Storage (using S3 API in on-prem solutions)
Non-Relational databases

Facilitating DevOps culture

Containerization – investing heavily in Docker. Adopting Docker Data Center

Hope it will allow them to take advantage of existing infrastructure investments, give dev and ops staff opportunities to experiment with cloud services, allow modernization of app architecture and delivery practices, and prepare for the inevitable future.
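One of the 12 factor principles mentioned above is keeping configuration in the environment rather than in code, which lets the same build run on premise or in a cloud. The sketch below shows the idea only; the `Settings` class, the `load_settings` function, and the variable names `DATABASE_URL`, `OBJECT_STORE_ENDPOINT` and `DEBUG` are illustrative assumptions, not Indiana's actual setup.

```python
import os
from dataclasses import dataclass


@dataclass(frozen=True)
class Settings:
    database_url: str
    object_store_endpoint: str  # e.g. an on-prem service exposing an S3-style API
    debug: bool


def load_settings(env=os.environ):
    """Read all deploy-specific settings from environment variables."""
    return Settings(
        # Required: fail fast with KeyError if it is not set.
        database_url=env["DATABASE_URL"],
        # Optional, with a local default for development (assumed value).
        object_store_endpoint=env.get("OBJECT_STORE_ENDPOINT", "http://localhost:9000"),
        debug=env.get("DEBUG", "false").lower() in ("1", "true", "yes"),
    )
```

Because the application only ever sees `Settings`, moving a service from an on-premise object store to a cloud one becomes a change to the environment, not to the code.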

Cloud Initiative and Research – Steve Kwak (Northwestern)

Cloud Governance – October 2015. IT Directors from the schools and enterprise IT. Hired a consultant to help develop governance.

Cloud Architecture and Consulting Team – April 2016 – 5 initial team members. Set up initial environments at AWS and Azure. Worked through billing and accounts, and providing consulting.

Running cloud days and “open mic” sessions with AWS.

Research environments – 3 centrally managed – HPC (heavy upfront investment for dedicated compute, always a queue); Social Science cluster (aging infrastructure, limited support); Research data storage (separate storage from HPC). Looking to burst HPC to the cloud and move the other two.

Genomics pilot in AWS. Hire on a 3rd party team to put architecture together.

HPC Environment – working on targeting specific workloads in cloud with scheduler, and figuring out bursting.

Controlled Approach to Research Computing in AWS – Paul Peterson (Emory)

Mindset of security team – need a similar set of controls in cloud as on-premise. This is quite challenging.

Started working to build Research Cloud. Collected 24 use cases and put them in three categories, divided into 2 VPC types. Worked with AWS professional services to build out VPCs. Pilot started this summer, going to end of year.

Type1 VPC – one availability zone, no Internet gateway – access only through Emory. Single sign-on with Shib.

Type2 has two availability zones, and an Internet gateway.

Goal of project team is to make requests for VPCs easy. Automation is key.

Generate VPC service. Created an inventory of accounts, LDS groups, Exchange distribution lists, and CIDR ranges.

Service gets next available account, adds admins to LDS group, creates SAML provider, creates account alias, selects CloudFormation template, gets next available CIDR range, creates stack, computes subnets for account. Takes less than 5 minutes.
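The "next available CIDR range" and "compute subnets" steps can be illustrated with Python's standard `ipaddress` module. This is a minimal sketch, not Emory's service: the function names, the idea of a single address pool, and the example ranges are all assumptions made for illustration.

```python
import ipaddress


def next_available_cidr(pool, allocated, prefixlen):
    """Return the first block of size /prefixlen in `pool` that does not
    overlap any range in `allocated`, or None if the pool is exhausted."""
    pool_net = ipaddress.ip_network(pool)
    taken = [ipaddress.ip_network(cidr) for cidr in allocated]
    for candidate in pool_net.subnets(new_prefix=prefixlen):
        if not any(candidate.overlaps(used) for used in taken):
            return str(candidate)
    return None


def split_into_subnets(vpc_cidr, count):
    """Split a VPC range into `count` equal subnets (count is rounded up
    to a power of two internally; only the first `count` are returned)."""
    bits = (count - 1).bit_length()
    subnets = ipaddress.ip_network(vpc_cidr).subnets(prefixlen_diff=bits)
    return [str(net) for _, net in zip(range(count), subnets)]
```

For instance, with a hypothetical pool of `10.20.0.0/16` and two /22 ranges already handed out, `next_available_cidr` returns the third /22, which `split_into_subnets` can then divide into per-subnet ranges for the stack parameters.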

We Demand, On-Demand: Berkeley Analytics Environments, VDI and the Cloud – Bill Allison (Berkeley)

Central IT budgets getting cut 10% year-over-year.

VDI use cases have been mostly around desktop apps, not research. Funded a pilot through December. User and use-case driven (faculty oriented) – need to tell story from a faculty perspective. Research IT group is like field workers, most with PhDs.

Analytics Environment on Demand – not a change in the way you compute, at least on the surface. Use the skills you know already. Creating an abstraction layer.

Art of Letting Go – Relationship advice for dev and ops in the cloud – Bryan Hopkins (Penn)

Team lead for cloud app dev team. Cloud First program – replace homegrown frameworks with off-the-shelf frameworks; replace waterfall with agile; replace monoliths with integrations and composed apps.

Three things we’ve learned so far: 1. Have a clear try-and-scrap phase in R&D – give it leeway. 2. Accept that interests and traditional roles will collide. Dev team can help with platform tasks, ops team can help with dev. Everyone cares about Jenkins. Bring them together. 3. Let go of notions of perfection and clean lines. Off-the-shelf means you get what’s on the shelf.