Gavin Burris – Wharton School
HPC – computing resources characterized by many nodes, many cores, lots of RAM, high-speed low-latency networks, and large data stores.
XSEDE – it’s free (if you’re funded by a national agency).
Cloud – more consoles, more code, less hardware.
Using Ansible to configure cloud resources the same as on-premise; boto to deploy EC2 clusters from Python; CfnCluster (CloudFormation cluster) to build and manage HPC clusters.
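A minimal boto3 sketch of that deployment pattern – launch EC2 compute nodes from Python, then point the existing Ansible playbooks at them. The AMI, key pair, security group, and playbook name are placeholders, not Wharton’s actual values.

```python
# Sketch only: launch a small EC2 "cluster" with boto3, then write an Ansible
# inventory so the same on-premise playbooks can configure the cloud nodes.
import boto3

ec2 = boto3.resource("ec2", region_name="us-east-1")

# Launch a handful of compute nodes (placeholder AMI, key pair, security group).
nodes = ec2.create_instances(
    ImageId="ami-0123456789abcdef0",
    InstanceType="c5.4xlarge",
    MinCount=4,
    MaxCount=4,
    KeyName="hpcc-key",
    SecurityGroupIds=["sg-0123456789abcdef0"],
)

# Wait until the instances are up and their addresses are known.
for node in nodes:
    node.wait_until_running()
    node.reload()

# Write a static Ansible inventory; the same playbooks used on-premise
# (e.g. a hypothetical compute-node.yml) can then be run against these hosts.
with open("cloud_inventory.ini", "w") as inv:
    inv.write("[cloud_compute]\n")
    for node in nodes:
        inv.write(f"{node.private_ip_address}\n")

# ansible-playbook -i cloud_inventory.ini compute-node.yml
```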
Univa UniCloud enables cloud resources to integrate with Univa scheduler.
Use case: C++ simulation modeling code needed 500 iterations, each taking 3–4 days. Used MIT StarCluster with spot bids; for $500, finished the job in 4 days.
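StarCluster placed the spot bids itself in that use case; purely to illustrate the kind of request involved, here is a boto3 sketch with placeholder AMI, key pair, and bid price:

```python
# Sketch only: request a batch of spot instances with a maximum bid price,
# roughly the kind of request a tool like StarCluster makes on your behalf.
import boto3

ec2 = boto3.client("ec2", region_name="us-east-1")

response = ec2.request_spot_instances(
    SpotPrice="0.50",            # max bid per instance-hour (placeholder)
    InstanceCount=10,
    LaunchSpecification={
        "ImageId": "ami-0123456789abcdef0",   # placeholder AMI
        "InstanceType": "c5.4xlarge",
        "KeyName": "hpcc-key",                # placeholder key pair
    },
)

# Each request is fulfilled (or not) depending on the bid vs. the spot price.
for req in response["SpotInstanceRequests"]:
    print(req["SpotInstanceRequestId"], req["State"])
```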
Use case: Where are the GPUs? Nobody was using them – different toolkits and code were needed to take advantage of them – so the GPUs were dropped in the hardware refresh. Used UniCloud to run cloud instances with GPUs instead.
“Cloud can accommodate outliers” – GPUs, large memory. Offered a la carte to the researcher, billed via tags. Policy-based launching of cloud instances.
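One possible shape for policy-based, tag-billed launching – the policy table, tag keys, and AMI below are illustrative assumptions, not the actual Wharton setup:

```python
# Sketch only: gate instance launches behind a per-group policy and attach
# cost-allocation tags so charges roll up to the requesting research group.
import boto3

# Hypothetical policy: which instance types each research group may launch.
LAUNCH_POLICY = {
    "finance-sim": {"allowed_types": ["c5.4xlarge", "r5.8xlarge"]},
    "deep-learning": {"allowed_types": ["p3.2xlarge"]},
}

def launch_for_group(group, instance_type, count=1):
    """Launch instances only if the group's policy allows the type,
    tagging them so costs are billed back to that group."""
    policy = LAUNCH_POLICY.get(group)
    if policy is None or instance_type not in policy["allowed_types"]:
        raise ValueError(f"{instance_type} not permitted for group {group}")

    ec2 = boto3.client("ec2", region_name="us-east-1")
    return ec2.run_instances(
        ImageId="ami-0123456789abcdef0",   # placeholder AMI
        InstanceType=instance_type,
        MinCount=count,
        MaxCount=count,
        TagSpecifications=[{
            "ResourceType": "instance",
            "Tags": [
                {"Key": "BillingGroup", "Value": group},   # cost-allocation tag
                {"Key": "Environment", "Value": "hpc-cloud"},
            ],
        }],
    )

# launch_for_group("deep-learning", "p3.2xlarge")
```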
Seamless transition – a VPC VPN link provided by central networking makes AWS look like just another server-room subnet. Consistent configuration management with the same Ansible playbooks. Wharton has a cloud mandate by 2020 – getting rid of server rooms to reclaim space.
They’re doing NFS over the VPN – getting great throughput.
Cost comparison – local HPCC hardware $328k vs. AWS $294k for FLOP-equivalent capacity.
Spotinst – manages spot-instance preemption and moves workloads to available instances.
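Spotinst handles this on its own; as a sketch of the underlying mechanism, a spot instance can detect a pending interruption by polling the EC2 instance metadata endpoint (IMDSv1-style, no session token; only meaningful from inside an EC2 instance):

```python
# Sketch only: poll the spot instance-action metadata endpoint, which returns
# 404 until EC2 schedules the instance for interruption.
import time
import urllib.request

METADATA_URL = "http://169.254.169.254/latest/meta-data/spot/instance-action"

def interruption_pending():
    """Return True once this spot instance has been scheduled for interruption."""
    try:
        with urllib.request.urlopen(METADATA_URL, timeout=1) as resp:
            return resp.status == 200
    except OSError:
        # 404 (no notice yet), timeout, or not running on EC2 at all.
        return False

# Poll every few seconds; when the notice arrives, checkpoint and drain jobs.
while not interruption_pending():
    time.sleep(5)

print("Interruption notice received – checkpoint and drain jobs now.")
```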