Tuesday, August 3, 2021

Trusted CI new co-PIs: Peisert and Shute

I am happy to announce that Sean Peisert and Kelli Shute have taken on co-PI roles with Trusted CI. Both already have substantial leadership roles with Trusted CI. Sean is leading the 2021 annual challenge on software assurance and Kelli has been serving as Trusted CI’s Executive Director since August of 2020.

Thank you to Sean and Kelli for being willing to step up and take on these responsibilities.

Von

Trusted CI PI and Director


Initial Findings of the 2021 Trusted CI Annual Challenge on Software Assurance

In 2021, Trusted CI is conducting our focused “annual challenge” on the assurance of software used by scientific computing and cyberinfrastructure. The goal of this year-long project, involving seven Trusted CI members, is to broadly improve the robustness of software used in scientific computing with respect to security. The Annual Challenge team spent the first half of the 2021 calendar year engaging with developers of scientific software to understand the range of software development practices used and to identify opportunities to improve practices and code implementation and minimize the risk of vulnerabilities. In this blog post, the 2021 Trusted CI Annual Challenge team gives a high-level description of some of its more important findings from the past six months.

Later this year, the team will be leveraging its insights from open-science developer engagements to develop a guide specifically aimed at the scientific software community that covers software assurance in a way most appropriate to that community. Trusted CI will be reaching back out to the community sometime in the Fall for feedback on draft versions of that guide before the final version is published late in 2021.

In support of this effort, Trusted CI gratefully acknowledges the input of the following teams: FABRIC, the Galaxy Project, High Performance SSH/SCP (HPN-SSH) by the Pittsburgh Supercomputing Center (PSC), Open OnDemand by the Ohio Supercomputer Center, Rolling Deck to Repository (R2R) by Columbia University, and the Vera C. Rubin Observatory.

At a high level, the team identified challenges developers face in maintaining robust policy and process documentation; difficulties in identifying and staffing security leads and in ensuring clear lines of security responsibility among developers; difficulties in making effective use of code analysis tools; confusion about when, where, and how to find effective security training; and challenges in controlling the source code developed and the external libraries used, to ensure strong supply chain security. We now describe our examination process and findings in greater detail.


Goals and Approach

The motivation for this year’s Annual Challenge is that Trusted CI has reviewed many projects over its history and found significant anecdotal evidence of worrisome gaps in software assurance practices in scientific computing. We determined that if some common themes could be identified and paired with proportional remediations, the state of software assurance in science might be significantly improved.

Trusted CI has observed that currently available software development resources often do not match the needs of scientific projects; the backgrounds of the developers, the available resources, and the way the software is used do not necessarily map to existing resources for software assurance. Hence, Trusted CI put together a team with a range of security expertise, from academic research to operational security. That team then examined several software projects covering a range of sizes, applications, and NSF directorate funding sources, looking for commonalities among them related to software security. Our focus was on both procedures and the practical application of security measures and tools.

In preparing to examine these individual software projects, the Annual Challenge team enumerated a set of questions it felt would shed light on the software security challenges faced by scientific software developers, the most successful ways in which existing teams are addressing those challenges, and what developers wish they could change in the future, or would do differently if they were starting over.


Findings

The Annual Challenge team’s findings are generally aligned with one of five categories: process, organization/mission, tools, training, and code storage.

Process: The team found several common threads of challenges facing developers, most notably related to policy and process documentation, including policies covering onboarding, offboarding, code commits and pull requests, coding standards, design, communication with user communities about vulnerabilities, patching methodologies, and auditing practices. One common cause is that software projects start small and do not plan to grow or be used widely; when the software does grow and begins to be used broadly, it can be hard to introduce formal policies after developers are used to working in an informal, ad hoc manner. In addition, organizations often do not budget for security. Further, where policy documentation does exist, it can easily go stale (“documentation rot”). As a result, it would be helpful for Trusted CI to develop guides for, and examples of, such policies that the scientific software development community could adopt even at early stages.

Organization and Mission: Most projects faced difficulties in identifying, staffing, or funding a security lead and/or project manager. The few projects that had at least one of these roles filled had an advantage with regard to DevSecOps. In terms of acquiring broader security skills, some projects attempted to use institutional “audit services” but found mixed results. Several projects struggled to integrate security knowledge across different teams or individuals. Strong lines of responsibility can create valuable modularity, but they can also create points of weakness when the interfaces between different authors or repositories are not fully evaluated for security issues. Developers can ease this tension by establishing processes for developing security policies around software, ensuring ongoing management support and enforcement of those policies, and helping development teams understand the assignment of key responsibilities. These topics will be addressed in the software assurance guide that Trusted CI is developing.

Tools: Static analysis tools are commonly employed in continuous integration (CI) workflows to help detect security flaws, poor coding style, and potential errors in a project. A primary attribute of a static analysis tool is the set of language-specific rules and patterns it uses to search for style, correctness, and security issues. One major issue with static analysis tools is that they report a high number of false positives, which, as the Trusted CI team found, can cause developers to avoid using them. The team determined that it would be helpful for Trusted CI to develop tutorials appropriate for developers in the scientific software community on how to use these tools properly and overcome their traditional weaknesses without being buried in useless results.
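As a concrete illustration of one way to keep such a tool usable, the sketch below (a minimal example assuming a Python codebase and the open-source Bandit analyzer; neither choice is drawn from the projects we examined) filters a scan down to the findings the tool itself rates as high severity and high confidence, so a CI job fails only on high-signal results:

```python
"""Run a static analyzer and keep only high-signal findings.

Illustrative sketch assuming a Python project and the Bandit analyzer;
any comparable tool with machine-readable output could be substituted.
"""
import json
import subprocess

# Run Bandit recursively over the source tree, emitting JSON on stdout.
proc = subprocess.run(
    ["bandit", "-r", "src/", "-f", "json"],
    capture_output=True, text=True,
)
report = json.loads(proc.stdout)

# Suppress the noise: keep only findings Bandit rates as both high
# severity and high confidence -- the ones most worth a developer's time.
findings = [
    r for r in report.get("results", [])
    if r["issue_severity"] == "HIGH" and r["issue_confidence"] == "HIGH"
]

for f in findings:
    print(f'{f["filename"]}:{f["line_number"]}: {f["issue_text"]}')

# Fail the CI job only when high-signal findings remain.
raise SystemExit(1 if findings else 0)
```

Most static analysis tools offer comparable severity or confidence filters, and tuning them to the project is often what separates a tool that gets used from one that gets abandoned.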

The Trusted CI team found that dependency checking tools were commonly employed, particularly given some of the automation and analysis features built into GitHub. Such tools are useful to ensure the continued security of a project’s dependencies as new vulnerabilities are found over time. Thus, the Trusted CI team will explore developing (or referencing existing) materials to ensure that the application of dependency tracking is effective for the audience and application in question. It should be noted that tools in general could give a false sense of security if they are not carefully used.
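As a simple example of what automated dependency checking can look like outside of GitHub’s built-in features, the sketch below (assuming a Python project with a requirements.txt and the pip-audit tool; the tool choice is illustrative, not a recommendation from the interviewed projects) gates a CI job on whether any pinned dependency has a known vulnerability:

```python
"""Check project dependencies for known vulnerabilities.

Illustrative sketch assuming a Python project with a requirements.txt
and the pip-audit tool; similar scanners exist for other ecosystems.
"""
import subprocess
import sys

# pip-audit exits non-zero when it finds a dependency with a known
# vulnerability, so the exit code alone is enough to gate a CI job.
proc = subprocess.run(["pip-audit", "-r", "requirements.txt"])

if proc.returncode != 0:
    print("Vulnerable or unresolvable dependencies found; see output above.",
          file=sys.stderr)
sys.exit(proc.returncode)
```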

Training: Projects shared that developers of scientific software received almost no specific training in security or secure software development. A few of the projects that attempted to find online training resources reported finding themselves lost in a quagmire of tutorials. In some cases, developers had computer science backgrounds and relied on what they learned early in their careers, sometimes decades ago. In other cases, professional training was explored but found to be at the wrong level of detail to be useful, to have little emphasis specifically on security, or to be extremely expensive. In yet other cases, institutional training was leveraged. We found that any kind of ongoing training tended to be seen by developers as not worth the time and/or expense. To address this, Trusted CI should identify training resources appropriate for the specific needs, interests, and budgets of the scientific software community.

Code Storage: Although most projects were using version control in external repositories, the access control methods governing pull requests and commits were not sufficiently restrictive to maintain a secure posture. Many projects leverage GitHub’s dependency checking software; however, that tool is limited to checking libraries within GitHub’s domain. A few projects developed their own software in an attempt to navigate this dependency nightmare. Further, there was often little ability or attempt to vet external libraries; these were often accepted without inspection, mainly because there is no straightforward mechanism in place to vet such packages. In the Trusted CI software assurance guide, it would be useful to describe processes for leveraging two-factor authentication and for developing policies governing access controls, commits, pull requests, and the vetting of external libraries.
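For projects hosted on GitHub, some of these controls can already be enforced mechanically. The sketch below is a hypothetical example (the organization, repository, and token are placeholders, and the same settings can be configured through the web interface) that uses GitHub’s REST branch-protection endpoint to require at least one approving review before changes reach the main branch:

```python
"""Require reviewed pull requests before merging to a protected branch.

Hypothetical sketch assuming a repository hosted on GitHub; the owner,
repository, branch, and token below are placeholders.
"""
import requests  # third-party: pip install requests

OWNER, REPO, BRANCH = "example-org", "example-project", "main"
TOKEN = "replace-with-a-token-having-repo-admin-rights"

resp = requests.put(
    f"https://api.github.com/repos/{OWNER}/{REPO}/branches/{BRANCH}/protection",
    headers={
        "Authorization": f"Bearer {TOKEN}",
        "Accept": "application/vnd.github+json",
    },
    json={
        # Require at least one approving review on every pull request.
        "required_pull_request_reviews": {"required_approving_review_count": 1},
        # Apply the rules to repository administrators as well.
        "enforce_admins": True,
        # No required status checks or push restrictions in this sketch.
        "required_status_checks": None,
        "restrictions": None,
    },
    timeout=30,
)
resp.raise_for_status()
print(f"Branch protection enabled for {OWNER}/{REPO}@{BRANCH}")
```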


Next Steps

The findings from our examination of several representative scientific software development projects will direct our efforts toward the new content we believe is most needed by the scientific software development community.

Over the next six months, the Trusted CI team will be developing a guide based on this material, targeted toward anyone who is planning a software project, or already has one underway, and needs a security plan in place. While we hope that the guide will be broadly usable, a particular focus will be on projects that provide a user-facing front end exposed to the Internet, because such software is the most likely to be attacked.

This guide is meant as a “best practices” approach to the software lifecycle. We will recommend resources that should be leveraged in scientific software development, including the types of tools to run to expose vulnerabilities, best practices in coding, and procedures to follow in a large collaborative effort, such as how to share code safely. Ultimately, we hope the guide will support scientific discovery itself by providing guidance on how to minimize the risks incurred in creating and using scientific software.

Monday, July 19, 2021

Higher Education Regulated Research Workshop Series: A Collective Perspective

Regulated research data is a growing challenge for NSF-funded organizations in research and academia, with little guidance available on how to tackle regulated research institutionally. Trusted CI would like to bring the community’s attention to an important report released today by the organizers of a recent, NSF-sponsored* Higher Education Regulated Research Workshop Series that distills the input of 155 participants from 84 higher education institutions. Motivated by the higher ed community’s desire to standardize strategies and practices, the facilitated** workshop sought to find efficient ways for institutions large and small to manage regulated research data and smooth the path to compliance. It identified six main pillars of a successful research cybersecurity compliance program: Ownership and Roles, Financials and Cost, Training and Education, Auditing, Clarity of Controls, and Scoping. The report presents each pillar as a chapter, complete with best practices, challenges, and recommendations for research enablers on campus. While it focuses on Department of Defense (DOD) funded research, Controlled Unclassified Information (CUI), and health research, the report offers ideas and guidance on how to stand up a well-managed campus program that applies to all regulated research data. It represents a depth and breadth of community collaboration and institutional experience never before compiled in a single place.

Organized by Purdue University with co-organizers from Duke University, the University of Florida, and Indiana University, the workshop comprised six virtual sessions between November 2020 and June 2021. Participants included research computing directors, information security officers, compliance professionals, research administration officers, and personnel who support and train researchers.

The full report is available at the EDUCAUSE Cybersecurity Resources page at https://library.educause.edu/resources/2021/7/higher-education-regulated-research-workshop-series-a-collective-perspective. It was co-authored by contributors from Purdue University, Duke University, University of Florida, Indiana University, Case Western Reserve University, University of Central Florida, Clemson University, Georgia Institute of Technology, and University of South Carolina.

See https://www.trustedci.org/compliance-programs for additional materials from Trusted CI on the topic of compliance programs.

* NSF Grant #1840043, “Supporting Controlled Unclassified Information with a Campus Awareness and Risk Management Framework”, awarded to Purdue University
** by Knowinnovation

Tuesday, July 13, 2021

Trusted CI webinar: A capability-based authorization infrastructure for distributed High Throughput Computing July 26th @11am Eastern

Open Science Grid’s Brian Bockelman is presenting the talk, A capability-based authorization infrastructure for distributed High Throughput Computing, on Monday July 26th at 11am (Eastern).

Please register here. Be sure to check spam/junk folder for registration confirmation email.

The OSG Consortium provides researchers with the ability to bring their distributed high throughput computing (dHTC) workloads to a pool of resources consisting of hardware across approximately 100 different sites. Using this “Open Science Pool” resource, projects can leverage opportunistic access (nodes that would otherwise be idle at a site), dedicated hardware, or allocated time at large-scale NSF-funded resources.

While dHTC can be a powerful tool to advance scientific discovery, managing trust relationships with so many sites can be challenging; the OSG helps bootstrap the trust relationships between project and provider. Further, authorization in the OSG ecosystem is an evolving topic. On the national and international infrastructure, we are leading the transition from identity-based authorization -- basing decisions on “who you are” -- to capability-based authorization, which focuses on “what can you do?” and is implemented through tools like bearer tokens. Changing the mindset of an entire ecosystem is wide-ranging work, involving dedicated projects such as the new NSF-funded “SciAuth” and international partners like the Worldwide LHC Computing Grid.
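To make the distinction concrete, the sketch below is an illustrative example only (not OSG’s implementation; the key, token, and scope names are placeholders) of a service making an authorization decision from the scope claim carried in a signed bearer token, in the style of a SciToken, rather than from the caller’s identity:

```python
"""Capability-based authorization check on a bearer token.

Minimal illustrative sketch, not the OSG/SciTokens implementation: the
token, public key, and scope names are placeholders. The decision rests
on what the token authorizes ("what can you do?"), not on who presented it.
"""
import jwt  # third-party: pip install PyJWT


def is_authorized(token: str, public_key: str, action: str, path: str) -> bool:
    """Return True if the bearer token carries a scope covering the request."""
    # Verify the token's signature before trusting any claims.
    # (Audience and issuer checks are omitted here for brevity.)
    claims = jwt.decode(token, public_key, algorithms=["RS256"])

    # SciToken-style scopes look like "read:/store/user/alice"; the request
    # is allowed if some scope's action matches and its path is a prefix.
    for scope in claims.get("scope", "").split():
        granted_action, _, granted_path = scope.partition(":")
        if granted_action == action and path.startswith(granted_path or "/"):
            return True
    return False


# Example: permit a download only if the token grants read access to the path.
# allowed = is_authorized(presented_token, issuer_public_key,
#                         "read", "/store/user/alice/data.h5")
```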

In this talk, we’ll cover the OSG’s journey to capability-based authorization, as well as the challenges and opportunities of changing trust models for a functioning infrastructure.

Speaker Bio

Brian Bockelman is a Principal Investigator at the Morgridge Institute for Research and co-PI on the Partnership to Advance Throughput Computing (PATh) and the Institute for Research and Innovation in Software for High Energy Physics (IRIS-HEP). Within the OSG, he leads the Technology Area, which provides the software and technologies that underpin the OSG fabric of services. He is also a co-PI on the new SciAuth project, led by Jim Basney, which aims to coordinate the deployment of capability-based authorization across the science and engineering cyberinfrastructure.

Before joining Morgridge, Bockelman received a joint PhD in Mathematics and Computer Science from the University of Nebraska-Lincoln (UNL) and was an integral member of the Holland Computing Center at UNL. His team helps advance research computing activities at Morgridge and is a partner within the Center for High Throughput Computing (CHTC) at the University of Wisconsin-Madison.

Join Trusted CI's announcements mailing list for information about upcoming events. To submit topics or requests to present, see our call for presentations. Archived presentations are available on our site under "Past Events."

Wednesday, July 7, 2021

Trusted CI Concludes Engagement with FABRIC

FABRIC: Adaptive Programmable Research Infrastructure for Computer Science and Science Applications, funded under NSF grants 1935966 and 2029261, is a national-scale testbed that connects to existing NSF testbeds (e.g., PAWR), as well as NSF Clouds (e.g., Chameleon and CloudLab), HPC facilities, and the real Internet. FABRIC aims to expand its outreach by enabling new science applications, using a diverse array of networks, integrating machine learning, and preparing the next generation of computer science researchers.

FABRIC received its initial funding in 2019 and is projected to enter its operational phase in September 2023. FABRIC reached out to Trusted CI to request a review of its software development process, the trust boundaries in the FABRIC system, and the FABRIC security and monitoring architecture.

The five-month engagement began in February and concluded in June. During that time the teams worked together to review FABRIC’s project documentation, including a deep analysis of the security architecture. We then completed an asset inventory and risk assessment covering over 70 project assets, identifying attack surfaces and potential threats, and documenting current and planned security controls. Lastly, we documented the engagement findings in an internal report shared with FABRIC project leadership.

FABRIC also assisted with the Trusted CI 2021 Annual Challenge (Software Assurance) by participating in an interview with members of the software assurance team. The results of that interview will provide input to Trusted CI's forthcoming guide on software assurance for NSF projects.

Tuesday, July 6, 2021

Join Trusted CI at PEARC21, July 19th - 22nd

PEARC21 will be held virtually on July 19th - 22nd, 2021 (PEARC website).

Trusted CI will be hosting two events, our annual workshop and our Security Log Analysis tutorial.

Both events are scheduled at the same time; please note this when planning your agenda.

The details for each event are listed below. 

Workshop: The Fifth Trusted CI Workshop on Trustworthy Scientific Cyberinfrastructure provides an opportunity for sharing experiences, recommendations, and solutions for addressing cybersecurity challenges in research computing.  

Monday July 19th @ 8am - 11am Pacific.

  • 8:00 am - Welcome and opening remarks
  • 8:10 am - The Trusted CI Framework: A Minimum Standard for Cybersecurity Programs
    • Presenters: Scott Russell, Ranson Ricks, Craig Jackson, and Emily Adams; Trusted CI / Indiana University’s Center for Applied Cybersecurity Research
  • 8:40 am - Google Drive: The Unknown Unknowns
    • Presenter: Mark Krenz; Trusted CI / Indiana University’s Center for Applied Cybersecurity Research
  • 9:10 am - Experiences Integrating and Operating Custos Security Services
    • Presenters: Isuru Ranawaka, Dimuthu Wannipurage, Samitha Liyanage, Yu Ma, Suresh Marru, and Marlon Pierce; Indiana University
    • Dannon Baker, Alexandru Mahmoud, Juleen Graham, and Enis Afgan; Johns Hopkins University
    • Terry Fleury and Jim Basney; University of Illinois Urbana-Champaign
  • 9:40 am - 10 minute Break
  • 9:50 am - Drawing parallels and synergies between NSF and NIH cybersecurity projects
    • Presenters: Enis Afgan, Alexandru Mahmoud, Dannon Baker, and Michael Schatz; Johns Hopkins University
    • Jeremy Goecks; Oregon Health and Sciences University
  • 10:20 am - How InCommon is helping its members to meet NIH requirements for federated credentials
    • Presenter: Tom Barton; Internet2
  • 10:50 am - Wrap up and final thoughts (10 minutes)

More detailed information about the presentations is available on our website.


Tutorial: Security Log Analysis: Real-world, hands-on methods and techniques to detect attacks.

Monday July 19th @ 8am - 11am Pacific.

A half-day training on tying together various log and data sources to provide a more rounded, coherent picture of a potential security event. The tutorial also presents log analysis as a life cycle (collection, event management, analysis, response) that becomes more efficient over time. Interactive demonstrations will cover both automated and manual analysis using multiple log sources, with examples from real security incidents.
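As a small taste of the manual-analysis side, the snippet below is an illustrative sketch (not part of the tutorial materials; it assumes the common OpenSSH syslog message format and a log at /var/log/auth.log) that counts failed SSH logins by source address, a typical first step when triaging a potential brute-force attempt:

```python
"""Count failed SSH login attempts by source IP from a syslog-style auth log.

Illustrative sketch only; assumes the common OpenSSH
"Failed password for ... from <ip>" message format.
"""
import re
from collections import Counter

FAILED = re.compile(r"Failed password for (?:invalid user )?(\S+) from (\S+)")

counts = Counter()
with open("/var/log/auth.log", errors="replace") as log:
    for line in log:
        match = FAILED.search(line)
        if match:
            _user, source_ip = match.groups()
            counts[source_ip] += 1

# Sources with many failures are candidates for further investigation
# (e.g., correlation with firewall or web logs) or automated blocking.
for ip, n in counts.most_common(10):
    print(f"{ip}\t{n}")
```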


Thursday, June 24, 2021

The 2021 NSF Cybersecurity Summit Call For Participation - NOW OPEN - Deadline is Friday, July 2nd

It is our pleasure to announce that the 2021 NSF Cybersecurity Summit is scheduled to take place the week of October 11th, with the plenary sessions occurring on Tuesday, October 12th and Wednesday, October 13th. Due to the impact of the global pandemic, we will hold this year’s summit online instead of in person as originally planned.

The final program is still evolving, but we will maintain the mission to provide a format designed to increase the NSF community’s understanding of cybersecurity strategies that strengthen trustworthy science: what data, processes, and systems are crucial to the scientific mission, what risks they face, and how to protect them.

 

Call for Participation (CFP)

Program content for the summit is driven by our community. We invite proposals for presentations, breakout and training sessions, as well as nominations for student scholarships. The deadline for CFP submissions is July 5th. To learn more about the CFP, please visit: https://www.trustedci.org/2021-summit-cfp 

 More information can be found at https://www.trustedci.org/2021-cybersecurity-summit