Mentorship

Each summer, ASTG mentors talented interns through the NASA internship program. The interns are mentored by ASTG members and engage in a variety of software and research and development activities. Here we show some of the interesting projects from recent interns!

2024

sViz: A browser tool for creating visualizations using eViz

sViz
Sample plots created by sViz from a web browser.

Anna Boone
University of Oregon

Anna developed a web-based plugin that displays eViz static-map visualizations and makes them interactive, using a Streamlit-based framework. It allows users to generate visualizations from a dataset alone, without cloning a repository or working with code. The visualizations are kept in one place and can be accessed from the browser.

Implementation of the GISS ModelE in ASSERT

Report
Sample report produced by ASSERT after running two ModelE rundecks.

Michael Aidoo
University of Maryland, Baltimore County
Computer Science

ASSERT (A Software Suite for Earth-systems Regression Testing) is a centralized, Python-based regression testing toolkit that aims to allow users to build regression testing workflows for numerous Earth system models. ASSERT, created in ASTG, is designed to accommodate many models without interference between them and without changes to its underlying structure. Michael added code components to obtain, configure, compile, and run the GISS ModelE code using the ASSERT tool. He ran extensive tests to validate his components and to show that ASSERT produces the expected answers when performing regression tests on ModelE. His effort will help ASTG staff use ASSERT for daily regression testing of ModelE.

Geofoundational Modeling of Land Surface Fluxes

Architecture
Architecture of a vanilla autoencoder, a type of generative model.

Grace Liu
PhD Student
Penn State University
Department of Meteorology and Atmospheric Science

Geofoundation models (GFMs) are pre-trained on large amounts of unlabeled geospatial data, allowing modelers to utilize more data without paying the computational cost of processing it directly. To explore the use of GFMs in TERRAHydro, an Earth System Digital Twin being developed by ASTG, Grace investigated a foundation modeling approach within the underlying software framework and built an ERA5 emulator prototype. It is expected that a transformer-based GFM will improve TERRAHydro’s current performance.

NCCS Task Farming Toolbox: Essential Tools for Task Parallelism

Task distribution
Illustration of task distribution when the SLURM srun option is used.

Rodiat Sanni
University of Maryland, College Park
Computer Science

Rodiat developed two Python tools that allow users to run independent tasks on NCCS multicore platforms. The tools rely on GNU Parallel and SLURM srun native commands. Users only need to provide the list of their tasks (included in a text file) through command line interfaces and the tools will distribute the tasks to the available cores.
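
The workflow described above (a text file of independent tasks, distributed across the available cores) can be illustrated with a minimal sketch. This is not Rodiat's code and does not call GNU Parallel or SLURM srun. It is a standard-library stand-in that assumes one shell command per line in the task file. The names `read_tasks`, `run_task`, and `run_task_farm` are hypothetical.

```python
import os
import subprocess
from concurrent.futures import ThreadPoolExecutor
from pathlib import Path


def read_tasks(task_file):
    """Return the commands in task_file, skipping blank lines and # comments."""
    lines = Path(task_file).read_text().splitlines()
    return [ln.strip() for ln in lines
            if ln.strip() and not ln.lstrip().startswith("#")]


def run_task(command):
    """Run one shell command and return its exit code."""
    return subprocess.run(command, shell=True).returncode


def run_task_farm(task_file, max_workers=None):
    """Run every task in task_file, at most max_workers at a time.

    Threads are sufficient here because each task runs in its own
    subprocess; the threads only wait for the subprocesses to finish.
    Returns a list of (command, exit_code) pairs in input order.
    """
    tasks = read_tasks(task_file)
    workers = max_workers or os.cpu_count() or 1
    with ThreadPoolExecutor(max_workers=workers) as pool:
        codes = list(pool.map(run_task, tasks))
    return list(zip(tasks, codes))
```

Calling `run_task_farm("tasks.txt")` (a hypothetical file name) would run each listed command and report its exit code. A production tool on a cluster would instead hand the task list to GNU Parallel or srun so that tasks can be spread across the allocated cores and nodes.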

Kickstarting Moist Physics Porting in GEOS to NDSL for HPC on GPUs: A Focus on Radiation Coupling and Aerosol Activation

Matthew Weil, PhD student
Boston University, Department of Earth and Environment

The ASTG is beginning the port of Moist Physics from the GEOS model to the NOAA-NASA Domain Specific Language (NDSL) middleware layer for the Software Modernization Team’s 2024-2026 initiative, and Matthew helped to kickstart this effort. By leveraging open-source packages like GT4Py and DaCe, Matthew focused on Radiation Coupling and Aerosol Activation in the Atmospheric General Circulation Model, while contributing to the maturation and debugging of the developing framework.

Tracking the Movements of the International Space Station

ISS orbit
ISS orbits where the overpassed countries are identified.

Cindy Chen
High School student
Acton Boxborough Regional High School

The International Space Station (ISS) maintains an orbit with an average altitude of 250 miles. In 24 hours, the ISS makes 16 orbits of Earth, traveling through 16 sunrises and sunsets. It travels at a speed of five miles per second, circling Earth about every 90 minutes, and passes over places between latitudes 52 degrees south and 52 degrees north at different times of the day.
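
The figures quoted above are mutually consistent, which a short calculation can confirm. The sketch below assumes a circular orbit and a mean Earth radius of about 3,959 miles (a value not given in the text). The function names are illustrative only.

```python
import math

EARTH_RADIUS_MILES = 3959.0   # assumed mean Earth radius, not from the text
ALTITUDE_MILES = 250.0        # average ISS altitude quoted above
PERIOD_MINUTES = 90.0         # approximate orbital period quoted above


def orbits_per_day(period_minutes):
    """Number of orbits completed in 24 hours for a given period."""
    return 24 * 60 / period_minutes


def orbital_speed_miles_per_second(radius_miles, altitude_miles, period_minutes):
    """Speed along a circular orbit: circumference divided by period."""
    circumference = 2 * math.pi * (radius_miles + altitude_miles)
    return circumference / (period_minutes * 60)


if __name__ == "__main__":
    print(orbits_per_day(PERIOD_MINUTES))
    print(orbital_speed_miles_per_second(
        EARTH_RADIUS_MILES, ALTITUDE_MILES, PERIOD_MINUTES))
```

With a 90-minute period the first function gives 16 orbits per day, and the second gives a speed close to the five miles per second quoted above.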

At any given time, it is possible to know the planar position of ISS. Cindy wrote a Python application (relying on Pandas, Shapely, GeoPandas, MovingPandas, etc.) that collects a time series of the positions of ISS, identifies the orbits, determines the country name and the weather conditions at each position along the track, and performs visualizations.
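
One step in such a pipeline, splitting a position time series into individual orbits, can be sketched as follows. The text does not say how Cindy's application identifies orbits. This sketch assumes one simple convention: a new orbit starts each time the track crosses the equator heading north. The record layout `(time_s, lat_deg, lon_deg)` and the name `split_into_orbits` are hypothetical.

```python
import math


def split_into_orbits(track):
    """Split a list of (time_s, lat_deg, lon_deg) samples into orbits.

    Assumed convention: a new orbit begins at each south-to-north
    equator crossing, i.e. when the previous latitude is negative
    and the current latitude is zero or positive.
    """
    orbits = []
    current = []
    for i, sample in enumerate(track):
        crossed_northward = i > 0 and track[i - 1][1] < 0.0 <= sample[1]
        if crossed_northward and current:
            orbits.append(current)
            current = []
        current.append(sample)
    if current:
        orbits.append(current)
    return orbits


def synthetic_track(n_orbits=3, period_s=5400, step_s=60, max_lat=52.0):
    """Build an idealized track whose latitude oscillates sinusoidally.

    The longitude is a rough placeholder, not a real ground track.
    """
    track = []
    for t in range(0, n_orbits * period_s, step_s):
        lat = max_lat * math.sin(2 * math.pi * t / period_s)
        lon = (t / period_s * 360.0) % 360.0 - 180.0
        track.append((float(t), lat, lon))
    return track
```

Applied to `synthetic_track()`, the function returns one list of samples per orbit. Each list could then be passed on to the country lookup and plotting steps mentioned above.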

2023

eViz: Enhanced visualization utilities for NU-WRF models

WRF output
WRF visualizations with their corresponding YAML configuration files.

Aarav Khanna
Cornell University
Department of Computer Sciences
B.S. Candidate

The ASTG is developing the eViz data visualization tool, which helps Earth scientists diagnose outputs from large-scale Earth models and thus assists in the model validation process. Aarav developed eViz functions to support visualization of NU-WRF models (WRF and LIS) and wrote a utility named metadump, which facilitates the creation of the YAML files that provide specifications for the eViz tool.

ASSERT - A Software Suite for Earth-Systems Regression Testing

ASSERT class family trees
ASSERT class family trees.

Deon Kouatchou
Carnegie Mellon University
School of Computer Science
B.S. Candidate

ASSERT is a centralized, Python-based regression testing toolkit that aims to allow users to build regression testing workflows for numerous Earth System models. Deon helped in the early-stage development of ASSERT, particularly the development of the software infrastructure. This included, among other things, a large set of utilities, logging support, a comprehensive set of unit tests, and thorough documentation.

Processing ASTG Course Evaluation Surveys

Each year, ASTG provides several computer programming courses. At the end of each class, participants submit an evaluation survey whose results are stored in an Excel file. It is difficult for us to quickly analyze the results of each survey and extract important information that we can use for presentations. Cindy Chen (a rising sophomore in high school) has been writing a Python tool that automatically reads the Excel files and generates useful reports (plots, text, etc.) based on the responses to the evaluation surveys.

2022

Integrating satellite data sources in EViz visualization toolkit

Deon Kouatchou-Ngongang pres
Expanding data sources available in EViz.

Deon Kouatchou-Ngongang
Carnegie Mellon University
School of Computer Science
B.S. Candidate

NASA Earth System model scientists often need to compare model results with satellite observational data to evaluate experiments and simulations. eViz provides a quick, accessible, and flexible visualization toolkit for exploring and comparing data. By integrating satellite data sources into eViz, ASTG aims to enable visualizations that compare models with observations.

2021

eViz: Exploring Earth System Data Products and diagnosing Earth System Models using a visualization environment

Deepthi Raghunandan Poster
Streamlining the visualization of Earth System model data products.

Deepthi Raghunandan
University of Maryland
Department of Computer Sciences
PhD Candidate

NASA Earth System models, such as numerical weather models, produce large amounts of output. Scientists need quick access and flexible visualization tools to analyze and validate the data. ASTG has developed and continues to improve a modern Python framework for easy visualization (eViz) of Earth System data products.

Machine Learning for Non-parametric Data Assimilation in Hydrological Models

Logan Qualls Poster
The value of remote sensing data in data-driven streamflow models.

Logan Qualls
University of Alabama
Department of Geological Sciences
Master's Candidate

Recent advances in machine learning have led to powerful streamflow models based on long short-term memory (LSTM) networks. These models have been built using the CAMELS data, an in-situ 30-year streamflow record that spans 600 water basins in the CONUS. This project investigates the integration and use of remote sensing data for training.

2019

Machine Learning for Non-parametric Data Assimilation in Hydrological Models

Jonathan Frame Poster
Combining process-based models with machine learning.

Jonathan Frame
University of Alabama
Department of Geological Sciences
PhD Candidate

We test a method of analyzing hydrologic modeling predictions of soil moisture with a hybrid machine learning (ML) + physically-based modeling approach. This method is capable of non-parametric data assimilation and corrections to model structural error. Dr. Pelissier has developed a parallelized machine learning code for Gaussian Process Regression (GPR) as the ML component of this project. We are applying this code to improve predictability of soil moisture data at 10 FluxNet towers. The GPR shows significant improvement in out-of-sample soil moisture predictions over calibrated Noah-MP results. To compare the GPR against ‘traditional’ data assimilation techniques, we ran the Noah-MP model with an Ensemble Kalman Filter (EnKF). The results show a similar performance improvement between GPR and EnKF when run in sample, but EnKF provides little to no benefit when making predictions after only a few time steps following an observation. Our results show that this hybrid approach continues improving model predictions even without soil moisture state observations.

Publications

Pelissier, C., J. Frame, and G. Nearing. “Combining parametric land surface models with machine learning.” arXiv preprint arXiv:2002.06141 (2020).

2018

Quantum Computing and Linear Systems

Chris Culver Poster
Solving Linear Systems with Quantum Computers.

Chris Culver
The George Washington University
Department of Physics
PhD Candidate

The goal of this internship was to assess the current abilities of the IBM quantum computer, and to implement and investigate quantum algorithms to solve linear systems. Linear systems are ubiquitous throughout scientific computing, and quantum algorithms that provide increased computing power would have a large impact on NASA’s missions. The first phase of the project is to understand the errors and limitations of the current IBM quantum computer. The second phase is to understand, implement, and test quantum linear-system solvers on the IBM machine. Lastly, we will report the results of the study and the feasibility of solving linear systems and real-world problems on quantum computers today and in the near future.

Using Machine Learning to Model Global Terrestrial Carbon Flux

Donovan Murphy Poster
Carbon flux monitoring using machine learning.

Donovan Murphy
University of Alabama
Bachelor of Science Candidate

This project evaluates the use of machine learning to meet the challenges of scalability and continuity in carbon flux monitoring and prediction. Today, the measurement of terrestrial gas fluxes is limited in spatial representation, evidenced by the over eight hundred sites capturing carbon information in a globally distributed yet patchy network, FLUXNET. Using the eddy covariance method, these sites compute net ecosystem exchange (NEE), a measure of the vectoring of carbon dioxide through their ecosystems. Though these sites deliver high quality, in situ data, they are inherently restricted to hyper-local observation, limiting accurate inferences toward a finer spatial resolution. Additionally, many ecoregions lack FLUXNET site representation, including the Pacific coast of South America, Central Asia and the Middle East, and the vegetation-rich archipelagos of the Sunda Shelf (see Fig. 1). These absences are based on sites’ geographic distribution and independent of ecosystem type.