Galaxy, an open-source, scientific workflow system developed by the Galaxy Project (GP) Community, provides a means to build multi-step computational analyses using a graphical web user interface that allows a user to specify the type of data to operate on, what steps to take, and in what order. It accelerates innovation by allowing researchers to carry out analyses without having to do any programming. Galaxy is also heavily used as a tool integration platform for biology and genomics with thousands of popular tools available. It supports data uploads from a user endpoint and many well-known, online data sources (such as the UCSC Genome Browser, BioMart, and InterMine), allowing users to analyze public data or bring their own.
In the second half of 2020, the Galaxy Project team engaged with Trusted CI to review the security of a new Galaxy software distribution being developed as a containerized package, with an eye toward its use with sensitive information such as protected health information (PHI). Trusted CI's effort was funded through the partnership between the Science Gateways Community Institute (SGCI) and Trusted CI.
The teams met weekly over the engagement period to develop a shared understanding of Galaxy’s architecture, data flows, existing safeguards, and software development practices. Trusted CI used the NIST SP 800-53 control catalog to guide the discussions and created a Galaxy System Security Plan (SSP), which will be offered to the Galaxy Community as a template to support compliance with security regulations for local installations. The engagement concluded with a report containing a series of recommendations to further improve Galaxy’s security posture. Because the scope of the present engagement was limited to the containerized package, Trusted CI also identified opportunities for future engagements with Galaxy.
We on the Trusted CI team would like to extend our sincere thanks to the entire Galaxy team for their partnership throughout the engagement, and we look forward to future opportunities to collaborate.