February 6, 2024

Machine learning identifies federally protected waters

At a Glance

  • Researchers used machine learning to estimate which waters are protected by the Clean Water Act, which provides the basis for federal regulation of water quality, including water for drinking.
  • The findings show the effects of recent rule changes and how machine learning could be used to help inform decisions about how to implement regulations.

Image: Pristine river running through a forest. The study showed that machine learning could help determine whether waters are protected by federal regulations. (DariaGa / Shutterstock)

The Clean Water Act (CWA) of 1972 provides the basis for federal regulation of water quality, including water for drinking. This act protects the “Waters of the United States,” but doesn't define what this includes. Thus, deciding what the CWA protects has been left open to interpretation by the courts and executive agencies. Based on executive and judicial rules, the Army Corps of Engineers (ACE) determines CWA jurisdiction on a case-by-case basis.

No nationwide account exists of which waters are regulated under the CWA. Earlier estimates assumed that all waters sharing certain characteristics were regulated, but they didn't take into account prior ACE determinations of which waters were actually covered by the CWA.

A team of researchers led by Simon Greenhill and Dr. Joseph Shapiro at the University of California, Berkeley, sought to create a more accurate estimate of which waters the act covers. To do so, they developed a machine learning model that they call Waters of the United States Machine Learning, or WOTUS-ML. The model incorporates aerial imagery along with data on soil variables, weather, and wetland and stream coverage to determine how likely a given site is to fall under CWA protection. The team trained the model on more than 150,000 approved determinations made by the ACE. The study, which was funded in part by NIH, appeared in Science on January 25, 2024.

Before 2020, the ACE primarily enforced regulations based on the Supreme Court’s ruling in Rapanos v. United States. Using this standard, the model estimated that two-thirds of the nation’s streams and more than half of its wetlands fell under CWA protection.

In 2020, the Trump administration issued the Navigable Waters Protection Rule. Under the new rule, the model estimated that the share of the nation’s streams subject to CWA protection fell to less than half, and the share of wetlands dropped to one-quarter. This implies that the rule deregulated almost 690,000 stream miles and 35 million wetland acres. These include 30% of the stream and wetland areas that supply drinking water to American households.

The team tested the model’s accuracy using ACE determinations not used to train the model. Accuracy varied depending on the setting, with the greatest accuracy among sites predicted to have very high or very low probability of being protected.
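
To illustrate how accuracy on held-out determinations can be broken down by predicted probability, here is a minimal Python sketch. It is not the study's evaluation code: the function name `accuracy_by_confidence`, the bin edges, the 0.5 decision cutoff, and the toy numbers are all assumptions made for illustration.

```python
from collections import defaultdict


def accuracy_by_confidence(probs, labels, edges=(0.0, 0.1, 0.4, 0.6, 0.9, 1.0)):
    """Group held-out sites by predicted probability and report accuracy per group.

    probs  -- predicted probability that each site is regulated (0 to 1)
    labels -- actual determinations for the same sites (1 = regulated, 0 = not)
    edges  -- hypothetical bin boundaries; not taken from the study
    """
    hits = defaultdict(int)
    counts = defaultdict(int)
    for p, y in zip(probs, labels):
        # Find the bin [lo, hi) that contains p; the last bin also includes 1.0.
        for lo, hi in zip(edges, edges[1:]):
            if lo <= p < hi or (hi == edges[-1] and p == hi):
                key = (lo, hi)
                break
        else:
            raise ValueError(f"probability {p} is outside the bin range")
        predicted_regulated = p >= 0.5  # assumed cutoff for a yes/no prediction
        counts[key] += 1
        hits[key] += int(predicted_regulated == bool(y))
    return {key: hits[key] / counts[key] for key in counts}


if __name__ == "__main__":
    # Made-up held-out sites, for illustration only.
    toy_probs = [0.02, 0.05, 0.45, 0.55, 0.48, 0.95, 0.97, 0.93]
    toy_labels = [0, 0, 1, 0, 0, 1, 1, 1]
    for bin_range, acc in sorted(accuracy_by_confidence(toy_probs, toy_labels).items()):
        print(bin_range, round(acc, 2))
```

In this toy example, every site with a predicted probability near 0 or 1 is classified correctly, while only one of the three sites near 0.5 is. That is the kind of pattern the team describes, though the numbers here are invented.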

The researchers suggest their model could provide immediate estimates of how likely a site is to be protected. If the model is confident enough in a site’s protection status, a determination could be made much more quickly than by the traditional process, which can take months. Doing so could save hundreds of millions of dollars per year in permitting costs. The model could also inform policy decisions by predicting the impact of rule changes. This study is just one example of how machine learning could be used to help clarify how to implement complex regulatory laws.
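
The idea of acting quickly only when the model is confident can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the researchers' procedure: the function name `triage`, the category labels, and the 0.95 confidence threshold are hypothetical.

```python
def triage(prob_regulated, confident=0.95):
    """Sort a site into a fast-track or full-review category.

    prob_regulated -- a model's predicted probability that the site is regulated
    confident      -- hypothetical threshold; the article does not specify one
    """
    if not 0.0 <= prob_regulated <= 1.0:
        raise ValueError("probability must be between 0 and 1")
    if prob_regulated >= confident:
        return "likely regulated"
    if prob_regulated <= 1.0 - confident:
        return "likely not regulated"
    # Anything in between would still go through the traditional process.
    return "needs case-by-case review"


if __name__ == "__main__":
    for p in (0.99, 0.50, 0.02):  # made-up probabilities
        print(p, "->", triage(p))
```

Raising the threshold sends more sites to case-by-case review but leaves fewer fast-tracked sites at risk of being misclassified. Where to set it would be a policy choice.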

“Using machine learning to understand these rules helps decode the DNA of environmental policy,” Shapiro says. “We can finally understand what the Clean Water Act actually protects.” 

The Supreme Court’s Sackett v. EPA decision in 2023 again limited CWA protections. This decision had not yet been implemented when the team conducted its analysis. Once Sackett is fully implemented, WOTUS-ML can help determine the new scope of protections and the decision’s impact.

—by Brian Doctrow, Ph.D.

References: Greenhill S, Druckenmiller H, Wang S, Keiser DA, Girotto M, Moore JK, Yamaguchi N, Todeschini A, Shapiro JS. Science. 2024 Jan 26;383(6681):406-412. doi: 10.1126/science.adi3794. Epub 2024 Jan 25. PMID: 38271507.

Funding: NIH’s National Institute on Aging (NIA); Ciriacy-Wantrup Postdoctoral Fellowship; Giannini Foundation of Agricultural Economics; Google Cloud Research Credits Program.