Wednesday, January 27, 2021

2020 Trusted CI Fellow, Laura Christopherson, reports on Science and Security

Laura Christopherson, a 2020 Trusted CI Fellow, prepared the following final report, and agreed to publish it on the Trusted CI blog. 

Science and Security: Sound Odd?

I served as a Trusted CI Fellow during 2020 while also working on the Cyberinfrastructure Center of Excellence (CI CoE) Pilot project.1 In fact, it was through the CI CoE that I learned about Trusted CI and became interested in the fellowship. Over the past year, my work with CI CoE and the fellowship exposed me to the importance of information security in science. When I mentioned this intersection of science and security to friends or others outside of technology and academia, I often got puzzled looks. I think part of the confusion was because I was mainly talking about earth sciences (which is the type of research largely conducted by the research facilities that CI CoE supports) and I suppose people initially assumed I must be talking about health sciences. They could, of course, understand why security would be important in medicine. We all want our personal information (e.g., medical records) protected. And since March of 2020, COVID has been the leading story in all news sources, and those stories have included discussions about the importance of maintaining the integrity of COVID research data so that we can develop a vaccine as quickly as possible. It's a life or death issue. 

When I suggested that other kinds of science, other kinds of research, might also need protection, however, they seemed a little dumbfounded. For instance, I received comments to the effect of, "Well why would anyone want to steal images of a black hole? It's not private, confidential information. It's up there for all to see." And after all, don't we want to share this information? That's why scientists shared the first image of a black hole in April of 2019.2 That gave me pause, I admit: Well yeah, that's not private, personal information. No person would be compromised in any way or suffer any harm if the read-outs from a particle accelerator were disclosed by WikiLeaks, right? Earth science is not a life-or-death situation, after all. There’s no money in stealing data from the Laser Interferometer Gravitational-Wave Observatory (LIGO)3 or IceCube.4 Furthermore, don't people want this information shared? What about all that "open science" jazz anyway?

While it may be true that cyber thieves would be less inclined to attempt to steal information or disrupt the activities of scientists when sexier alternatives are available (e.g., the bank accounts of millions of Wells Fargo users, presidential election tabulations, juicy emails between a senator and his mistress, design schematics of a nuclear warhead, personal health information of patients participating in a highly controversial drug trial), it is possible that cyber criminals, in targeting those sexier alternatives, may unknowingly hit humble research organizations because they also happen to use the same systems that businesses and governments use. The SolarWinds hack5 is a good example of this, as described by Kim Milford, the executive director of the Research and Education Networks Information Sharing and Analysis Center at Indiana University. "While it does not seem at this time that higher education institutions or sensitive research secrets were the target of this attack, it is possible that hackers may have scooped up so much information they do not yet realize what they have," Milford said in an Inside Higher Ed article. In other words, cyberattackers may unknowingly steal a scientific Easter egg that they could crack later for what may turn out to be a goldmine of competitive research secrets.

When I think about the comments I received, I'm just not sure if the everyday Joe or Jane even thinks about science when considering the importance of security. I suspect they largely think about themselves and their personal information instead. It's only natural. But I wanted to understand how to articulate the importance of information security in science. So, I set out to find information on various questions around the intersection of science and security. Putting aside healthcare-related research (for the remainder of this paper), I wanted to know:
  • How does the average American conceive of security? What are average concerns about information security? Do scientific research organizations even enter the picture for the average American? Does the average American think it is important to protect scientific research?
  • Do scientific research facilities get attacked? What risks do they face? 
I scoured the library's databases of research articles, traipsed through the Internet using a variety of search terms, perused various polling/social science research organization websites (e.g., Pew, the National Academies Press), trawled through popular online tech magazines and blogs, and was ultimately unable to find fully satisfactory answers to my questions. So, I asked Von Welch, director of Trusted CI, the NSF Cybersecurity Center of Excellence, if he knew of any reports of attacks on scientific research organizations. He was able to locate only two publications related to this subject: one from the Australian National University,6 reporting a breach of its administrative systems, and an FBI case study7 reporting attacks on military sites, federal research labs, universities, and other sites, discovered in 2004 and resulting in the arrest of a 19-year-old man in 2005.

It appears that there is a dearth of information on:
  • The public's awareness of or views on whether security matters in science
  • Threats faced by scientific research organizations
  • Consequences and impacts if scientific research organizations experience loss of or damage to precious research findings.
In my review of information about attacks/security in the non-academic/research world, I uncovered two themes. One was about the nature of the attack and the second about who is usually attacked. The nature of an attack is often described as a theft of some kind. There always appears to be some discussion of what was lost and its value, what the hacker sought as his reward. The most commonly discussed prizes seem to be money (Wells Fargo bank accounts), power (presidential election tabulations), reputation (juicy emails), strength (nuclear warhead), or access to some deep secret (personal health information). Because earth science data won't really give you money, power, reputation, and strength in the way we usually think about those things, and because it won't give you access to deep, dark, personal secrets to leverage against your enemies, why would a cyber thief bother? 

Frequently discussed targets of attacks were financial institutions (money), governmental institutions (power), nations (strength), and individuals (reputation and secrets). (Research bodies are also mentioned but they tend to be those that conduct biomedical research which I would still classify as reputation and secrets, because the data at risk is often personal information of specific individuals, and it is often the risk to these individuals that the discussion centers on.) 

Had I surveyed the news over the past decade, I imagine I would have found very similar results… that most news stories primarily report on that which was stolen from individuals, profit-seeking businesses, or governmental/national/political organizations. Off the top of my head, when I think of recent, big news stories about security, I think of Russia and the 2016 presidential election, Facebook and Cambridge Analytica, Independence Blue Cross, WannaCry (ransomware), Target, Hillary Clinton's email server, Cal Cunningham (NC senatorial candidate), and Equifax, to name a few. I can't think of a single instance of any news story discussing an attack on an earth science research facility. Although my personal recollections don't confirm the absence of attacks (i.e., they just confirm that I haven't heard of any), I still ask you, my reader, did you hear of any? If you did, how many compared to the other kinds of attacks you also heard about? I suspect it's just not a hot topic for most news outlets.

The point of all this is to say that in spite of not finding any information that said, "Hell yeah, security is really important in science, for good reason," I still conclude that Hell yeah, security is really important in science for good reason… in fact for the same reasons, but perhaps with a different way of thinking about them. First, I think the more mainstream definition of security and what it means to secure data might require expansion when discussing research. For instance, many of the research facilities we work with in CI CoE have to protect their data from harsh environmental conditions. IceCube is located at the South Pole. Its equipment could freeze and data could be lost. So the data must be protected… from the ice (probably more so than from some hacker).

Additionally, it may be worthwhile to rethink those more commonly discussed prizes (money, power, reputation, strength, and juicy secrets). If we concede that a cyberattacker is less likely to find these prizes by stealing scientific data, do they (money, power, reputation, strength, and juicy secrets) enter into the discussion at all? I would say yes, but in a different way. Instead of being the reward at the end of the maze, I would argue that they are qualities inherent to science and so can't be stolen from it. They are not the hoped-for results of some activity (e.g., theft), but that which is intrinsic to science and, consequently, what makes it so vital to protect science.

Money = Valuable 

The NSF spends millions of dollars funding earth science research. If research activities are disrupted, if data is corrupted or lost, then that money has been wasted. So although you may not get rich off of studying earthquakes or by stealing images of the moon, science is a priority in our society and we've invested decades of money into it. The American public's tax dollars support scientific research, and we all want a good return on our investment. This affects us all. 

Power = Powerful  

It is said that "knowledge is power." Science seeks to uncover new knowledge, and it has empowered us in numerous ways. Consider a very simple and practical example of how science has improved our everyday lives. Because we sought to understand electricity and harness its power, we are able to enjoy the comfort of heating and cooling, have light to see by, and can enjoy hot meals cooked on a stove from ingredients preserved in a refrigerator. Science also tackles issues vital to our survival as a species on this planet. It explores questions about natural energy (which can be used to power medical devices that keep us alive), our carbon footprint (which impacts the resilience of Earth's ability to sustain life), and weather and climate change (which affects the habitability of Earth, important when considering future generations). So, in a sense, earth science is a life or death issue after all, but perhaps on a broader scale, because it concerns mankind as a whole.

Reputation = Noteworthy

Because we depend on science for so many things, it is important that the outcomes of scientific studies are accurate. If scientific data is put at risk, it calls into question the findings of scientific researchers. Years of work can be invalidated, reputations destroyed, and trust eroded. Each year in the history of our existence, we have continued to build upon this knowledge. We have a very sizeable bank account of knowledge from which to draw to help us advance. Just as our personal or business bank accounts containing money ought to be protected, so should this wealth of knowledge the scientific community has socked away.

Strength = Bold

To use another cliché, it is said that "there is strength in numbers." Most of the science research facilities that we work with in the CI CoE serve thousands of scientists (students and professionals) from around the world. For example, the partnership of the Seismological Facilities for the Advancement of Geoscience (SAGE)8 and the Geodetic Facility for the Advancement of Geoscience9 estimates that it serves roughly 10,000 scientists worldwide. NOIRLab10 (a collection of five telescopes) estimates a user base of 3,000-5,000 per quarter. The Natural Hazards Engineering Research Infrastructure (NHERI) is composed of multiple units. One of those, DesignSafe,11 which provides computation services for analyzing hazards data, estimates a user base of 5,000, with roughly 1,000 using its services each month. If each of these facilities serves approximately 1,000 people each month, then they collectively serve several hundred thousand scientists (students and professionals) from a variety of earth-science disciplines throughout each year.

These facilities also manage very large datasets. NHERI-DesignSafe manages roughly 200 TBs of data. The Ocean Observatories Initiative12 pulls in around 15,000 rows of data every 30 seconds, roughly 10 TBs of data every three months. The Cornell High Energy Synchrotron Source (CHESS)13 collects around 120 TBs every few months. SAGE ingests around 10 TBs of data per year and has a total archive of roughly 650 TBs that has been collected over 40 years. The image archive for NOIRLab manages almost five PBs of data.

Within these PBs of data is the possibility to uncover tremendous new insights about our world. Scientists from all over the globe depend on these facilities to support their work. Even though the present-day Galileo may not come to mind when thinking of information security, he and many others exist, and they rely on these PBs and PBs of data to uncover new knowledge about our world. Across the various facilities we serve in CI CoE, there exists a very strong user base that uses extremely large, multifaceted datasets that may very well exceed the bytes needed to store the emails on Hillary Clinton's server, the Target credit card accounts that were breached, and the 2016 election tabulations that may have been tampered with.

Secrets = Discoveries

Science data probably doesn't contain any personal information that might embarrass someone, put them in a negative light, or compromise their credit rating. However, it probably does contain an entire host of secrets that, unlike personal secrets, we want and need to uncover. For instance, LIGO collected data for more than a decade, waiting to discover new knowledge, before they finally detected gravitational waves that allowed us to look back 1.3 billion light years at two colliding black holes.

This discovery comes at the culmination of decades of instrument research and development, through a world-wide effort of thousands of researchers, and made possible by dedicated support for LIGO from the National Science Foundation. It also proves a prediction made 100 years ago by Einstein that gravitational waves exist. More excitingly, it marks the beginning of a new era of gravitational wave astronomy – the possibilities for discovery are as rich and boundless as they have been with light-based astronomy.14 

So to end on that auspicious note, I hope I have made a good case for the importance of security for science, in spite of the lack of research I was able to find in this area. It is because of this lack of work that I will now try to convince you of one last thing: There needs to be more research in this area. In the Inside Higher Ed SolarWinds article, Kim Milford encourages "cybersecurity leaders to provide thought leadership and guidance" on this subject.

For my part, I suggest there be work around the following questions/themes:

What does security look like in science? What are the threats?

I have suggested that it may be unlikely that cyber thieves will target science when it doesn't really afford them the prizes they may typically seek. So security may be less about guarding against malicious actors and more about making sure the data is well protected from other kinds of threats faced by so many research facilities. Perhaps this calls for redefining security when applied to science.

Are earth-science facilities the targets of malicious attacks? If so, how and why does this happen? How does that compare with the other threats they face?

Although I found little evidence of malicious attacks, Von Welch was able to locate information on the subject, so malicious attacks do happen. Why do they happen if they don't yield the same prizes that are stolen from other types of targets? Are there things to be gained—other prizes—I did not imagine? If so, this may be very helpful information to technology professionals working in science. It could also expose other dimensions to the motivations of black hat hackers, which could be explored by social scientists as well as computer scientists.

Why is it important to protect science? How does science benefit us all? 

I have suggested that when people think of security, they tend to think of themselves, their valuables, their secrets, their associations; and that perhaps this is why science may fail to come to mind when thinking about security. I have also attempted to point out that science has implications beyond the individual, group, or organization, that it concerns and benefits mankind as a whole. If this is so, then it is particularly important to raise awareness about the importance of protecting scientific data. I believe this will also help validate the work of technology professionals who stand guard at the gates of science. We hear about the latest scientific discovery and the scientists involved, but the contributions these guardians make to science may not be considered newsworthy. I get the sense that, as a result, they are often overlooked and perhaps not valued in the way they deserve. So I urge both the science and technology communities to work on changing this.

Finally, I think future Trusted CI Fellows are the perfect candidates to explore these questions and to publish on these subjects. I hope that I have inspired future Fellows to pursue these questions. May they achieve success no matter what they pursue in the future, and I wish them well.

1 https://cicoe-pilot.org
2 https://www.jpl.nasa.gov/edu/news/2019/4/19/how-scientists-captured-the-first-image-of-a-black-hole/
3 https://www.ligo.caltech.edu/
4 https://icecube.wisc.edu/
5 https://www.insidehighered.com/news/2021/01/06/unraveling-solarwinds-hacks-fallout-higher-ed
6 Australian National University. (2019). Incident Report on the Breach of the Australian National University's Administrative Systems. https://imagedepot.anu.edu.au/scapa/Website/SCAPA190209_Public_report_web_2.pdf
7 Ricker, Kathleen & Barlow, James & Adams, Craig. (2008). FBI Major Case 216: A Case Study. 10.13140/2.1.2775.2644.
8 https://www.iris.edu/hq/news/story/nsf_makes_5_year_93m_award_to_iris_to_manage_the_sage_facility
9 https://www.unavco.org/about/about.html
10 https://noirlab.edu/public/
11 https://www.designsafe-ci.org/
12 https://oceanobservatories.org/
13 https://www.chess.cornell.edu/
14 https://www.ligo.caltech.edu/detection

Friday, January 22, 2021

Trusted CI and SCiMMA Complete Engagement

The Scalable Cyberinfrastructure for Multi-Messenger Astrophysics (SCiMMA) project is a planned collaboration between data scientists, computer scientists, astronomers, astro-particle physicists, and gravitational wave physicists (https://scimma.org). Leveraging NSF investments in astronomical and multi-messenger facilities, and in advanced cyberinfrastructure (CI), SCiMMA intends to prototype a publish-subscribe system based on Kafka to distribute alerts from gravitational wave, neutrino, and electromagnetic observatories to authorized subscribers. The system will additionally rely on supporting infrastructure, including machine learning algorithms to analyze and classify alerts and event databases for richer data mining. The pub/sub prototype will be hosted on cloud resources, including a commercial cloud (e.g., AWS). Upon award completion, SCiMMA will request funding for a sustained distributed institute that will expand the scope and depth of the prototyped system.

To this end, a group from SCiMMA solicited information security guidance from Trusted CI on various components of their prototype CI. For example, they sought help in developing an IT security program, identifying appropriate security control sets/catalogs, and performing a risk assessment with a corresponding residual risk registry.

Trusted CI and the SCiMMA team refined and prioritized SCiMMA’s needs to the following goals: (i) performing a security review of SCiMMA’s CI using the Trusted CI Security Program Evaluation worksheet (https://trustedci.org/evalws) in order to assess the target level of cybersecurity needed; (ii) developing a nascent security program with the information documented in step (i), leveraging the master information security policies and procedures document (https://www.trustedci.org/guide); and (iii) documenting assets to be used by the security program in step (ii).

The SCiMMA team completed the Trusted CI Security Program Evaluation spreadsheet, finding the exercise highly valuable as it encouraged the team to discuss the cybersecurity concerns broached in the evaluation. From there, the SCiMMA team deemed that having data to present to stakeholders that captured the CI risk -- conveying the need for security resources -- was of high priority to the team. So the engagement team decided to tackle the task of documenting assets in order to produce an asset-based risk assessment spreadsheet. The task, however, was not without challenge; SCiMMA had a large number of assets, and its CI was still in flux. Thus, the team focused on documenting only critical assets, e.g., admin credentials, source code, and DLP backups.

In parallel to this, the SCiMMA team, after attending ‘The Trusted CI Framework’ workshop at the NSF Cybersecurity Summit (https://www.trustedci.org/2020-nsf-summit), sought to adopt many of the ideas promoted during that workshop, including leveraging the ‘CIS Controls v7.1 Tracking Tool’ (the tool was released by the presenters during the workshop and will be part of the Trusted CI Framework upon release in early 2021). Thus, in conjunction with the work on the asset inventory, considerable effort also went into understanding which controls make up (at least) ‘Implementation Group 1’ of their baseline control set or catalog (i.e., the CIS Critical Security Controls - Version 7.1: https://www.sans.org/critical-security-controls), and how they would be applied to SCiMMA’s CI.

The SCiMMA team’s desire to both identify a control set for their CI and then strive to understand the residual risk that would still be present after implementing the controls displays their grasp of key cybersecurity essentials. Similarly, their understanding of the need for a cybersecurity budget and dedicated personnel -- also key components of a sound security program -- bodes well for the project.

The engagement ran from July 1, 2020 to December 31, 2020, and was recorded in the document “SCiMMA / Trusted CI Engagement Final Report” (https://hdl.handle.net/2142/109187).

Thursday, January 21, 2021

SAVE THE DATE: Announcing the 2021 Virtual NSF Cybersecurity Summit, Oct 12th & 13th.

It is our pleasure to announce that the 2021 NSF Cybersecurity Summit plenary session is scheduled to take place October 12th and 13th. The dates for additional events, like training sessions and workshops, are still being determined and will likely fall in the days surrounding the plenary. Due to the continued impact of the global pandemic, we will hold this year’s summit online instead of in person.

The final program is still evolving, but we will maintain the mission to provide a format designed to increase the NSF community’s understanding of cybersecurity strategies that strengthen trustworthy science: what data, processes, and systems are crucial to the scientific mission, what risks they face, and how to protect them.

Please save the dates on your schedule. We look forward to seeing you there.

Monday, January 11, 2021

Trusted CI Webinar: SciTokens: Federated Authorization for Distributed Scientific Computing Mon Jan 25 @11am Eastern

Members of SciTokens are presenting the talk,
SciTokens: Federated Authorization for Distributed Scientific Computing
on Monday January 25th at 11am (Eastern)

Please register here. Be sure to check your spam/junk folder for the registration confirmation email.

SciTokens (https://scitokens.org/), an NSF CICI project, works to advance the use of bearer tokens and capabilities in distributed scientific infrastructures. It applies the JSON Web Token (JWT) and OAuth standards to the needs of scientific cyberinfrastructure, where widely distributed computing, data, instruments, and software services are harnessed for scientific workflows, requiring an authorization mechanism that is itself distributed. Typically, JWTs are used in a single web application, with a single token issuer and verifier, and OAuth2 deployment scenarios support only one or a few token issuers, using opaque tokens that must be validated by a callback to the corresponding issuer. In contrast, SciTokens supports many token issuers, with signing keys, policies, and endpoint URLs published via OAuth Authorization Server Metadata, using self-describing JWTs rather than opaque tokens, so the tokens can be independently verified by distributed services without requiring a callback to the token issuer.
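
To make the idea of a self-describing token concrete, here is a minimal Python sketch of a service checking a JWT on its own, with no callback to the issuer. It is an illustration under stated assumptions, not SciTokens code: the issuer URLs, the keys, the `read:/data` scope string, and the helper names `make_token` and `verify_token` are all hypothetical, and the sketch signs with a shared-secret HS256 key only so that it runs on the standard library. A real multi-issuer deployment would verify public-key signatures using keys published by each issuer.

```python
import base64
import hashlib
import hmac
import json


def b64url_encode(raw: bytes) -> str:
    """Base64url without padding, as used for JWT segments."""
    return base64.urlsafe_b64encode(raw).rstrip(b"=").decode("ascii")


def b64url_decode(text: str) -> bytes:
    """Restore the stripped padding, then decode."""
    return base64.urlsafe_b64decode(text + "=" * (-len(text) % 4))


def make_token(claims: dict, key: bytes) -> str:
    """Issuer side (hypothetical helper): sign header.payload with HS256."""
    header = {"alg": "HS256", "typ": "JWT"}
    signing_input = ".".join(
        b64url_encode(json.dumps(part, separators=(",", ":")).encode("utf-8"))
        for part in (header, claims)
    )
    signature = hmac.new(key, signing_input.encode("ascii"), hashlib.sha256).digest()
    return signing_input + "." + b64url_encode(signature)


def verify_token(token: str, trusted_keys: dict, required_scope: str, now: float) -> dict:
    """Service side: every check uses only the token and locally held keys."""
    header_b64, payload_b64, signature_b64 = token.split(".")
    header = json.loads(b64url_decode(header_b64))
    if header.get("alg") != "HS256":
        raise ValueError("unsupported algorithm")
    claims = json.loads(b64url_decode(payload_b64))
    # The token names its own issuer; pick that issuer's key from local config.
    key = trusted_keys.get(claims.get("iss"))
    if key is None:
        raise ValueError("unknown issuer")
    signing_input = f"{header_b64}.{payload_b64}".encode("ascii")
    expected = hmac.new(key, signing_input, hashlib.sha256).digest()
    if not hmac.compare_digest(expected, b64url_decode(signature_b64)):
        raise ValueError("bad signature")
    if claims.get("exp", 0) <= now:
        raise ValueError("token expired")
    # Exact-match scope check; an assumption made to keep the sketch short.
    if required_scope not in claims.get("scope", "").split():
        raise ValueError("missing scope")
    return claims


# Two made-up issuers trusted by the same storage service.
TRUSTED_KEYS = {
    "https://issuer-a.example": b"key-for-issuer-a",
    "https://issuer-b.example": b"key-for-issuer-b",
}
token = make_token(
    {"iss": "https://issuer-b.example", "sub": "workflow-42",
     "scope": "read:/data", "exp": 1_700_003_600},
    TRUSTED_KEYS["https://issuer-b.example"],
)
claims = verify_token(token, TRUSTED_KEYS, "read:/data", now=1_700_000_000)
print(claims["sub"], "may", claims["scope"])
```

The point of the sketch is the shape of `verify_token`: the issuer, expiry, and scope travel inside the token, so the service needs only a table of trusted issuers and their keys to decide, which is what lets many issuers and many services coexist without a validation round trip.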

The use of JWTs with OAuth is now a draft profile of the IETF OAuth working group. OAuth token refresh enables long-lived scientific workflows, and OAuth Token Exchange enables workflow systems to reduce token privileges, effectively implementing least-privilege delegation across the cyberinfrastructure ecosystem.

In this webinar, members of the SciTokens project will discuss progress since their 2019 NSF Summit presentation, including the project's latest open source software releases, interoperability with the WLCG Common JWT Profiles, updates from Fermilab, LIGO, XSEDE, and WLCG (presented at the recent TAGPMA Workshop on Token-Based Authentication and Authorization), and support for SciTokens in CILogon and HTCondor.

Speaker Bios: Jim Basney is a Principal Research Scientist in NCSA's Cybersecurity Division, Brian Bockelman is an Investigator at Morgridge Institute for Research, Todd Tannenbaum is a Researcher in Distributed Computing at University of Wisconsin-Madison, and Derek Weitzel is a Research Assistant Professor at University of Nebraska-Lincoln.

Join Trusted CI's announcements mailing list for information about upcoming events. To submit topics or requests to present, see our call for presentations. Archived presentations are available on our site under "Past Events."

Monday, January 4, 2021

Cyberinfrastructure Vulnerabilities 2020 Annual Report

The Cyberinfrastructure Vulnerabilities team provides concise announcements on critical vulnerabilities that affect science cyberinfrastructure (CI) of research and education centers, including those threats which may impact scientific instruments. This service is freely available to all by subscribing to Trusted CI’s mailing lists (see below).


We monitor a number of sources for software vulnerabilities of interest, then determine which ones are of the most critical interest to the community. While it’s easy to identify issues that have piqued the interest of the public news cycle, we strive to alert on issues that affect the CI community in particular. These are identified using the following criteria: the affected technology’s or software’s pervasiveness in the CI community; the technology’s or software’s importance to the CI community; the type and severity of the potential threat, e.g., remote code execution; the threat’s ability to be remotely triggered; the threat’s ability to affect critical core functions; and whether mitigation is available. For those issues which warrant alerts to the Trusted CI mailing lists, we also provide guidance on how operators and developers can reduce risks and mitigate threats. We coordinate with XSEDE, Open Science Grid (OSG), the NSF supercomputing centers, and the ResearchSOC on drafting and distributing alerts to minimize duplication of effort and maximize benefit from community expertise. Some of the sources we monitor for possible threats to CI include:


OpenSSL and OpenSSH

US-CERT advisories

XSEDE announcements

RHEL/EPEL advisories

REN-ISAC Alerts and Advisories

Social media, such as Twitter and Reddit (/r/netsec and /r/security)

News sources, such as The Hacker News, Threatpost, The Register, Naked Security, Slashdot, Krebs, SANS Internet Storm Center and Schneier


In 2020 the Cyberinfrastructure Vulnerabilities team discussed 50 vulnerabilities and issued 22 alerts to 158 subscribers. Additionally, the team surveyed the community to gauge the team’s impact; 87% of the respondents said that the alerts were relevant to their science mission and that they would recommend the services to peers, and all participants thought the alerts were concise.


If you wish to subscribe to the Cyberinfrastructure Vulnerability Alerts mailing list you may do so through https://list.iu.edu/sympa/subscribe/cv-announce-l. This mailing list is public and the archives are available at https://list.iu.edu/sympa/arc/cv-announce-l.


If you believe you have information on a cyberinfrastructure vulnerability, let us know by sending us an email at alerts@trustedci.org.