Empowering biological and environmental research with machine learning and artificial intelligence

Modeling, simulation, and data visualization advance scientific discovery 

Genoa Blankenship

Computing capabilities at the Environmental Molecular Sciences Laboratory enable researchers to simulate, model, and analyze data related to environmental and biological research. (Illustration provided by iStock)

Register to attend the free webinar at noon Pacific daylight time on Wednesday, June 28, to learn about EMSL's available computing capabilities. 

Artificial intelligence (AI) and machine learning (ML) are widely used tools that simplify everyday tasks, from selecting a recipe to choosing a song based on a person’s current mood.

But these technologies aren’t limited to menial daily tasks. They’re revolutionizing science.  

At the Environmental Molecular Sciences Laboratory (EMSL), a Department of Energy (DOE) Office of Science user facility, computational scientists are helping researchers who conduct biological and environmental experiments turn large, complex datasets into transformative scientific findings.

Researchers remotely access a wide array of software and workflows and run them on EMSL’s mid-range scientific computing resource, a cluster known as Tahoma. This customizable approach allows researchers to analyze datasets, identify patterns, and draw insights by integrating simulation, modeling, and visualization, with EMSL’s computational staff available to consult and advise on the best next steps.

These capabilities, along with EMSL’s expertise, are available to researchers through the EMSL User Program. DOE’s Biological and Environmental Research (BER) program sponsors these opportunities by funding EMSL’s open calls for proposals.

“EMSL’s Computing, Analytics, and Modeling [science area] aims to help BER scientists to get more and better science out of their data,” said Jay Bardhan, EMSL’s Computing, Analytics, and Modeling science area leader.  

A free webinar will be held at noon Pacific daylight time on Wednesday, June 28, to demonstrate EMSL computing capabilities for biological and environmental research. The hourlong webinar will highlight the EMSL Open OnDemand web portal and two open-source applications, JupyterHub and PFLOTRAN.

EMSL LEARN Webinar: “Using Open OnDemand Software to Remotely Access EMSL’s Computing Resources,” Wednesday, June 28, noon to 1 p.m. PDT, with Evan Felix and Maruti Mudunuru

EMSL Open OnDemand and Tahoma 

EMSL’s AI and ML capabilities can be accessed through a graphical user interface known as Open OnDemand. EMSL users (researchers selected for the user program and granted access to EMSL resources for their research) remotely access and run applications through EMSL’s Open OnDemand on the Tahoma cluster.

Evan Felix, an EMSL computing engineer and capability contact for EMSL Open OnDemand, said the portal is a customizable “platform for building things.”  

“Researchers can pull together the power of 184 computer nodes (from Tahoma) with lots of memory and run their science on that directly,” explained Felix. “The users can use the cluster to either simulate models of what's happening in real life, or to make predictions.” 

Researchers remotely access and run applications on EMSL’s scientific computing resource, called Tahoma. (Photo by Andrea Starr | Pacific Northwest National Laboratory)

Computational workflows for BER science 

Applications available through EMSL Open OnDemand support BER’s model-experiment (ModEx) approach of integrating observations, experiments, and measurements from the field or laboratory, with modeling and simulations of the same processes. 

Maruti Mudunuru, an Earth scientist at Pacific Northwest National Laboratory, has been using the JupyterHub and PFLOTRAN applications on Tahoma to provide workflows and an established ModEx pipeline that will take users through simulation, modeling, data analysis, and AI integrations. EMSL users can also use individual components of this workflow for PFLOTRAN modeling, sensitivity analysis, or data analysis (such as analysis of data collected using Fourier-transform ion cyclotron resonance).

“Our workflows on Tahoma are reproducible,” said Mudunuru. “They can use the JupyterHub notebooks and Open OnDemand tools to perform exploratory data analysis, AI modeling, PFLOTRAN simulations, and sensitivity analysis. Each part of the workflow can be tested separately and used by EMSL users as per their needs.” 

Mudunuru applied PFLOTRAN to EMSL’s Molecular Observation Network (MONet) to develop reaction network models to better understand carbon cycling and to acquire time-series data on organic carbon. MONet is an open science network designed to provide comprehensive molecular-level and microstructural information on the composition and structure of soil, water, resident microbial communities, and biogenic emissions. 

“Open-source infrastructure allows the EMSL community to collaborate easily and help advance science at a faster pace,” Mudunuru said. 

For more information on using EMSL computing resources for research, review the mid-range scientific computing resource web page on the EMSL website or register to attend the June 28 webinar.