This cloud clinic is intended for NAIRR investigators and other research teams using cloud platforms for data science. Specifically, we will focus on checkpointing: storing partial results from a cloud compute task so that, if the task is interrupted, it can be restarted roughly where it left off.
The emphasis will be on cloud efficiency, terminology, use cases, and best practices, including GPU access, persistent (object) storage, distinguishing preemptible VM request types (e.g., “one-time” versus “persistent” on AWS), and useful details such as the user data option on AWS.
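As a rough illustration of the checkpointing idea described above, here is a minimal sketch in plain Python. It is not the clinic's implementation: the file name `checkpoint.json`, the helper names, and the toy "work" (summing step numbers) are all hypothetical, and a local file stands in for persistent object storage. The `stop_after` parameter only simulates a preemption.

```python
import json
import os
import tempfile

CHECKPOINT = "checkpoint.json"  # hypothetical path; stands in for persistent storage


def load_checkpoint(path):
    """Return the saved state, or a fresh state if no checkpoint exists yet."""
    if os.path.exists(path):
        with open(path) as f:
            return json.load(f)
    return {"step": 0, "total": 0}


def save_checkpoint(state, path):
    """Write to a temporary file, then rename, so a crash never leaves a half-written checkpoint."""
    directory = os.path.dirname(os.path.abspath(path))
    fd, tmp = tempfile.mkstemp(dir=directory)
    with os.fdopen(fd, "w") as f:
        json.dump(state, f)
    os.replace(tmp, path)  # atomic replacement of the old checkpoint


def run(path, n_steps, every=10, stop_after=None):
    """Do n_steps of toy work, saving progress every `every` steps.

    stop_after simulates an interruption: the function returns early
    without saving, as a preempted VM would.
    """
    state = load_checkpoint(path)
    while state["step"] < n_steps:
        if stop_after is not None and state["step"] >= stop_after:
            return state  # simulated preemption: unsaved progress is lost
        state["total"] += state["step"]  # placeholder for real computation
        state["step"] += 1
        if state["step"] % every == 0:
            save_checkpoint(state, path)
    save_checkpoint(state, path)
    return state
```

Calling `run(CHECKPOINT, 100, stop_after=25)` and then `run(CHECKPOINT, 100)` resumes from the last saved step rather than from zero; only the work done since the most recent checkpoint is repeated.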
The session will be recorded and will include links to documentation.
When: Thursday, January 23 from 11:30am-12:30pm Pacific (2:30-3:30pm Eastern)
Zoom: https://washington.zoom.us/j/93136411861 (Meeting ID: 931 3641 1861)
Email: help@cloudbank.org with any questions
Abstract: In today’s busy world, we can lose track of small details that have a big impact. Suppose you have a cloud budget of $10,000, but your computations could be scaled up beyond that limit to produce better results. What you need is access to immutable storage (easy), access to cheap preemptible cloud VM instances (easy), and a reliable method of checkpointing your progress (easy? hard?). With this one-two-three punch, if preemptible instances cost about 30% of the on-demand price, you can purchase roughly $33,333 worth of cloud computing for a mere $10,000 and get better research results as a consequence. This cloud clinic will catch you up on the how-tos and other small details behind such a substantial gain in compute power. We use a CNN as our example implementation of a compute-intensive research task.
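The budget figures in the abstract can be checked with a few lines of arithmetic. The sketch below assumes a 70% discount on preemptible instances relative to on-demand pricing; that rate is inferred from the $10,000 and $33,333 figures rather than stated in the announcement, and actual discounts vary by provider, instance type, and time. The function name is hypothetical.

```python
def on_demand_equivalent(budget, discount):
    """Value of on-demand compute that `budget` buys at a fractional discount.

    discount=0.70 means preemptible instances cost 30% of the on-demand price
    (an assumed rate, not a quoted one).
    """
    if not 0 <= discount < 1:
        raise ValueError("discount must be in the range [0, 1)")
    return budget / (1 - discount)


# $10,000 at an assumed 70% discount
print(round(on_demand_equivalent(10_000, 0.70)))  # 33333
```

The calculation ignores the cost of work lost between checkpoints and of storing the checkpoints themselves, so the real gain is somewhat smaller.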