Cold Storage Pilot Overview
Pilot Overview
Vision for the Cold Storage Pilot:
Select viable solution(s) for cold storage services for the HMS research community, validate the use cases, and test potential solutions.
Pilot goals:
Collect and validate use cases from labs and researchers
Gather feedback and address issues associated with proposed workflow(s)
Validate performance and cost expectations associated with storage targets and use cases
The pilot workflow will test the use of Starfish (a data movement tool) to transfer inactive data to AWS Glacier Services and Azure Cool Blob Archive. The Starfish graphical user interface (GUI) tool is a self-service visual interface that allows users to view their group’s storage amounts and locations.
HMS IT will gather feedback on:
user experience
platform limitations
functionality
The pilot will be built to test data archiving, tracking, and retrieval against predetermined criteria.
Important Information
User testing will begin on November 22, 2021, and will extend through the end of January 2022 to allow labs to adequately test their workflows and provide feedback.
This is an ongoing pilot and features may be added or removed throughout the process. Additionally, HMS IT internal testing will be ongoing in parallel with user testing.
The cold storage work streams that are being tested are not in production, and any data copied to the pilot platform is not replicated or backed up.
Pilot participants have been asked to identify one or more datasets that are a representative subset of the data they anticipate moving to Cold storage. For the purposes of pilot testing, please follow this guidance:
Total test data size should be less than ~10 TiB per lab
Please MAKE A COPY of the data you will be testing and add “test” or something similar to the name of the copy. We want to simulate what data movement would look like in a real-world scenario, which means actually removing the data from the source and moving it to Cold storage.
If you need assistance with copying data, please reach out to rdmhelp@hms.harvard.edu
Storage limits on lab folders where pilot testing is occurring can be temporarily increased during the pilot to accommodate duplicate datasets. If you anticipate needing a temporary increase, email rdmhelp@hms.harvard.edu.
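For labs that prefer to script the copy step, the guidance above could be sketched as follows. This is a minimal illustration and not an HMS-provided tool: the function names, the `_test` suffix, and the default limit of exactly 10 TiB are assumptions chosen to mirror the guidance on this page.

```python
import shutil
from pathlib import Path

TIB = 1024 ** 4
# Assumption: treat the "~10 TiB per lab" guidance as a hard limit of 10 TiB.
DEFAULT_MAX_BYTES = 10 * TIB


def dataset_size_bytes(root: Path) -> int:
    """Return the combined size of all regular files under root."""
    return sum(p.stat().st_size for p in root.rglob("*") if p.is_file())


def make_test_copy(source: Path, suffix: str = "_test",
                   max_bytes: int = DEFAULT_MAX_BYTES) -> Path:
    """Copy source to a sibling folder whose name ends in suffix.

    The original folder is left untouched, so only the copy is
    exposed to the pilot workflow.
    """
    size = dataset_size_bytes(source)
    if size >= max_bytes:
        raise ValueError(
            f"{source} is {size} bytes, which is not under the {max_bytes}-byte limit"
        )
    dest = source.with_name(source.name + suffix)
    # copytree raises FileExistsError if dest already exists,
    # which guards against overwriting an earlier test copy.
    shutil.copytree(source, dest)
    return dest
```

Calling `make_test_copy(Path("/path/to/lab/dataset"))` (a hypothetical path) would create `/path/to/lab/dataset_test` next to the original. The size check applies to a single dataset, so labs testing several datasets would need to add up the totals themselves.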
All pilot participant testing will be completed in the Starfish GUI. Starfish and HMS are exploring the possibility of bringing the command line interface to HMS users for work post-pilot.
Both AWS Glacier Services and Azure Cool Blob Archive will be tested by pilot participants. Participants will test AWS first and then run the same workflows with Azure. The goal is to keep the testing of each platform separate to avoid complexities with setup, testing, reporting, and feedback. We anticipate starting Azure testing in January; additional instructions will be provided at that time.
During our earlier meetings, we requested feedback on your preferred method for verifying and tracking data transfers to and from Cold storage. Possible options include email notifications, manifests in the source/destination folders, or a visual indicator in the online interface. We are currently working with Starfish to determine the best option moving forward. In the interim, you should be able to see that data has moved from one storage location to another when viewing the storage folders in the GUI.
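To make the manifest option concrete, the sketch below shows one way a lab might record and compare manifests on its own while the pilot's tracking method is still being decided. It is purely illustrative: it does not describe how Starfish, AWS, or Azure track transfers, and the function names and the choice of SHA-256 checksums are assumptions.

```python
import hashlib
from pathlib import Path


def sha256_of(path: Path, chunk_size: int = 1 << 20) -> str:
    """Compute a file's SHA-256 checksum, reading it in chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def build_manifest(root: Path) -> dict[str, str]:
    """Map each file's path (relative to root) to its checksum."""
    return {
        p.relative_to(root).as_posix(): sha256_of(p)
        for p in sorted(root.rglob("*"))
        if p.is_file()
    }


def diff_manifests(before: dict[str, str], after: dict[str, str]) -> dict[str, list[str]]:
    """List files that are missing, unexpected, or changed in 'after'."""
    shared = before.keys() & after.keys()
    return {
        "missing": sorted(before.keys() - after.keys()),
        "unexpected": sorted(after.keys() - before.keys()),
        "changed": sorted(k for k in shared if before[k] != after[k]),
    }
```

A lab could build a manifest of the test copy before archiving it, build a second one after retrieving the data, and treat three empty lists from `diff_manifests` as evidence that the round trip preserved every file.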
Lab-Specific Details
Lab-specific details will be communicated to pilot participants individually via email from the RDM team (rdmhelp@hms.harvard.edu).
How to Contact Us
Please direct all communications to the RDM team by emailing rdmhelp@hms.harvard.edu.