Each summer, ASTG hosts talented interns through the NASA internship program. The interns are mentored by ASTG members and engage in a variety of software research and development activities. Here we show some of the interesting projects from recent interns!
Sample plots created by sViz from a web browser.
Anna Boone
University of Oregon
Anna developed a web-based plugin that displays eViz static-map visualizations and makes them interactive, using a Streamlit-based framework. It allows users to generate visualizations from just a dataset, without cloning a repository or working with code. The visualizations are collected in one place and can be accessed from the browser.
Sample report produced by ASSERT after running two ModelE rundecks.
Michael Aidoo
University of Maryland, Baltimore County
Computer Science
ASSERT (A Software Suite for Earth-systems Regression Testing) is a centralized, Python-based regression testing toolkit that aims to allow users to build regression testing workflows for numerous Earth system models. ASSERT, created in ASTG, is designed to easily accommodate many models without any interference and without changing its underlying structure. Michael added code components to obtain, configure, compile, and run the GISS ModelE code using the ASSERT tool. He ran extensive tests to validate his components and to show that ASSERT produces the expected answers when performing regression tests on ModelE. His work will help ASTG staff use ASSERT for daily regression testing of ModelE.
Architecture of a vanilla autoencoder, a type of generative model.
Grace Liu
PhD Student
Penn State University
Department of Meteorology and Atmospheric Science
Geofoundation models (GFMs) are pre-trained on large amounts of unlabeled geospatial data, allowing modelers to utilize more data without paying the computational cost of processing it directly. To explore the use of GFMs in TERRAHydro, an Earth System Digital Twin being developed by ASTG, Grace investigated a foundation modeling approach within the underlying software framework and built an ERA5 emulator prototype. It is expected that a transformer-based GFM will improve TERRAHydro’s current performance.
Illustration of task distribution when SLURM srun option is used.
Rodiat Sanni
University of Maryland, College Park
Computer Science
Rodiat developed two Python tools that allow users to run independent tasks on NCCS multicore platforms. The tools rely on GNU Parallel and SLURM srun native commands. Users only need to provide the list of their tasks (included in a text file) through command line interfaces and the tools will distribute the tasks to the available cores.
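The idea of reading a task list from a text file and spreading the tasks over the available cores can be sketched in a few lines of Python. This is only an illustration, not Rodiat's tools: it swaps GNU Parallel and SLURM srun for a standard-library thread pool that launches each task as a separate process, and the function names (read_tasks, run_task, run_all) and the one-command-per-line file format are assumptions.

```python
import os
import shlex
import subprocess
from concurrent.futures import ThreadPoolExecutor


def read_tasks(path):
    """Return one command per non-empty line, skipping '#' comments.

    The one-command-per-line format is an assumption for this sketch.
    """
    with open(path) as handle:
        return [
            line.strip()
            for line in handle
            if line.strip() and not line.lstrip().startswith("#")
        ]


def run_task(command):
    """Run a single command as a child process; return (command, exit code)."""
    completed = subprocess.run(
        shlex.split(command), capture_output=True, text=True
    )
    return command, completed.returncode


def run_all(commands, max_workers=None):
    """Run independent commands concurrently, at most one per core by default.

    Each worker thread only waits on its child process, so the child
    processes themselves are what occupy the cores.
    """
    workers = max_workers or os.cpu_count() or 1
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # pool.map keeps the results in the same order as the task list.
        return list(pool.map(run_task, commands))
```

A caller would pass the path of the task file to read_tasks and hand the resulting list to run_all; any task with a non-zero exit code can then be reported or rerun.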
Matthew Weil
PhD student
Boston University
Department of Earth and Environment
The ASTG is beginning the port of Moist Physics from the GEOS model to the NOAA-NASA Domain Specific Language (NDSL) middleware layer for the Software Modernization Team’s 2024-2026 initiative, and Matthew helped to kickstart this effort. By leveraging open-source packages like GT4Py and DaCe, Matthew focused on Radiation Coupling and Aerosol Activation in the Atmospheric General Circulation Model, while contributing to the maturation and debugging of the developing framework.
ISS orbits where the overpassed countries are identified.
Cindy Chen
High School student
Acton Boxborough Regional High School
The International Space Station (ISS) maintains an orbit with an average altitude of 250 miles. In 24 hours, the ISS makes 16 orbits of Earth, traveling through 16 sunrises and sunsets. It travels at a speed of five miles per second, circling Earth about every 90 minutes, and passes over places between latitudes 52 degrees south and 52 degrees north at different times of the day.
At any given time, it is possible to know the planar position of ISS. Cindy wrote a Python application (relying on Pandas, Shapely, GeoPandas, MovingPandas, etc.) that collects a time series of the positions of ISS, identifies the orbits, determines the country name and the weather conditions at each position along the track, and performs visualizations.
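One way to identify orbits in a time series of positions is to start a new orbit each time the track crosses the equator heading north. The sketch below illustrates that idea in plain Python. It is an assumption about how the splitting could be done, not a description of Cindy's application, and the function name split_orbits and the (time, latitude, longitude) tuple layout are hypothetical.

```python
def split_orbits(track):
    """Split a ground track into orbits at south-to-north equator crossings.

    track: time-ordered list of (time, latitude, longitude) tuples,
    with latitude in degrees (negative in the southern hemisphere).
    Returns a list of orbits, each a list of the original tuples.
    """
    orbits = []
    current = []
    previous_lat = None
    for point in track:
        lat = point[1]
        # A northbound crossing closes the current orbit and opens a new one.
        crossed_north = previous_lat is not None and previous_lat < 0 <= lat
        if crossed_north and current:
            orbits.append(current)
            current = []
        current.append(point)
        previous_lat = lat
    if current:
        orbits.append(current)
    return orbits


# Toy track: the latitudes cross the equator northbound twice.
latitudes = [10, 40, 10, -20, -40, -10, 20, 40, -5, -30, 5]
toy_track = [(minute, lat, 0.0) for minute, lat in enumerate(latitudes)]
toy_orbits = split_orbits(toy_track)
```

The first and last segments are usually partial orbits, since the time series rarely starts or ends exactly on an equator crossing. Each per-orbit list can then be passed to whatever step looks up the country and weather at each position.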
WRF visualizations with their corresponding YAML configuration files.
Aarav Khanna
Cornell University
Department of Computer Sciences
B.S. Candidate
The ASTG is developing the eViz data visualization tool, which helps Earth scientists easily diagnose outputs from large-scale Earth models and thus assists in the model validation process. Aarav developed eViz functions to support visualization of NU-WRF models (WRF and LIS) and wrote a utility named metadump, which facilitates the creation of the YAML files that provide specifications for the eViz tool.
ASSERT class family trees.
Deon Kouatchou
Carnegie Mellon University
School of Computer Science
B.S. Candidate
ASSERT is a centralized, Python-based regression testing toolkit that aims to allow users to build regression testing workflows for numerous Earth System models. Deon helped in the early-stage development of ASSERT, particularly the development of the software infrastructure. This included, among other things, a large set of utilities, logging support, a comprehensive set of unit tests, and thorough documentation.
Each year, ASTG provides several computer programming courses. At the end of each class, participants submit an evaluation survey whose results are stored in an Excel file. It is difficult for us to quickly analyze the results of each survey and extract important information that we can use for presentations. Cindy Chen (a rising sophomore in high school) has been writing a Python tool that automatically reads the Excel files and generates useful reports (plots, text, etc.) based on the responses to an evaluation survey.
Expanding data sources available in EViz.
Deon Kouatchou-Ngongan
Carnegie Mellon University
School of Computer Science
B.S. Candidate
NASA Earth System model scientists often need to compare model results with satellite observational data to evaluate experiments and simulations. eViz provides a quick, accessible, and flexible visualization toolkit for exploring and comparing data. By integrating satellite data sources into eViz, ASTG aims to enable model-observation comparison visualizations.
Streamlining the visualization of Earth System model data products.
Deepthi Raghunandan
University of Maryland
Department of Computer Sciences
PhD Candidate
NASA Earth System models, such as numerical weather models, produce large amounts of output. Scientists need quick access and flexible visualization tools to analyze and validate the data. ASTG has developed and continues to improve a modern Python framework for easy visualization (eViz) of Earth System data products.
The value of remote sensing data in data-driven streamflow models.
Logan Qualls
University of Alabama
Department of Geological Sciences
Masters Candidate
Recent advances in machine learning have led to powerful streamflow models based on long short-term memory (LSTM) networks. These models have been built using the CAMELS data, an in-situ 30-year streamflow record that spans 600 water basins in the CONUS. This project investigates the integration and use of remote sensing data for training.
Combining process-based models with machine learning.
Johnathan Frame
University of Alabama
Department of Geological Sciences
PhD Candidate
We test a method of analyzing hydrologic modeling predictions of soil moisture with a hybrid machine learning (ML) + physically-based modeling approach. This method is capable of non-parametric data assimilation and corrections to model structural error. Dr. Pelissier has developed a parallelized machine learning code for Gaussian Process Regression (GPR) as the ML component of this project. We are applying this code to improve predictability of soil moisture data at 10 FluxNet towers. The GPR shows significant improvement in out-of-sample soil moisture predictions over calibrated Noah-MP results. To compare the GPR against ‘traditional’ data assimilation techniques, we ran the Noah-MP model with an Ensemble Kalman Filter (EnKF). The results show a similar performance improvement between GPR and EnKF when run in sample, but EnKF provides little to no benefit when making predictions after only a few time steps following an observation. Our results show that this hybrid approach continues improving model predictions even without soil moisture state observations.
Pelissier, C., J. Frame, and G. Nearing. “Combining parametric land surface models with machine learning.” arXiv preprint arXiv:2002.06141 (2020).
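A minimal sketch can make the hybrid idea concrete: a Gaussian process is fitted to the residual between observations and a physical model, and its prediction is added back to the model output. Everything here is a toy assumption for illustration. The sine "model", the synthetic "observations", the squared-exponential kernel, and the function names are not taken from the project's parallelized GPR code or from Noah-MP.

```python
import numpy as np


def rbf_kernel(a, b, length_scale=1.0):
    """Squared-exponential kernel between two 1-D arrays of inputs."""
    sq_dist = (a[:, None] - b[None, :]) ** 2
    return np.exp(-0.5 * sq_dist / length_scale ** 2)


def gpr_mean(x_train, y_train, x_test, noise=1e-4, length_scale=1.0):
    """Posterior mean of a zero-mean GP with an RBF kernel."""
    k_train = rbf_kernel(x_train, x_train, length_scale)
    k_train += noise * np.eye(len(x_train))  # observation noise / jitter
    k_cross = rbf_kernel(x_test, x_train, length_scale)
    chol = np.linalg.cholesky(k_train)
    alpha = np.linalg.solve(chol.T, np.linalg.solve(chol, y_train))
    return k_cross @ alpha


def rmse(a, b):
    return float(np.sqrt(np.mean((a - b) ** 2)))


# Toy setup (assumed): the "physical model" misses a slow component.
x = np.linspace(0.0, 10.0, 40)
model = np.sin(x)
observed = np.sin(x) + 0.3 * np.cos(0.5 * x)

train = np.arange(0, 40, 2)  # points where observations are available
test = np.arange(1, 40, 2)   # held-out points

# Learn the model error on the training points, then correct the model.
residual_hat = gpr_mean(x[train], observed[train] - model[train], x[test])
hybrid = model[test] + residual_hat

rmse_model = rmse(model[test], observed[test])
rmse_hybrid = rmse(hybrid, observed[test])
```

Comparing rmse_model with rmse_hybrid on the held-out points shows how much of the structural error the GP correction recovers in this toy case.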
Solving Linear Systems with Quantum Computers.
Chris Culver
The George Washington University
Department of Physics
PhD Candidate
The goal of this internship was to assess the current abilities of the IBM quantum computer, and to implement and investigate quantum algorithms to solve linear systems. Linear systems are ubiquitous throughout scientific computing, and quantum algorithms that provide increased computing power would have a large impact on NASA’s missions. The first phase of the project is to understand the errors and limitations of the current IBM quantum computer. The second phase is to understand, implement, and test quantum linear-system solvers on the IBM machine. Lastly, we will report the results of the study and the feasibility of solving linear systems and real-world problems on quantum computers today and in the near future.
Carbon flux monitoring using machine learning.
Donovan Murphy
University of Alabama
Bachelors in Science Candidate
This project evaluates the use of machine learning to meet the challenges of scalability and continuity in carbon flux monitoring and prediction. Today, the measurement of terrestrial gas fluxes is limited in spatial representation, evidenced by the over eight hundred sites capturing carbon information in a globally distributed yet patchy network, FLUXNET. Using the eddy covariance method, these sites compute net ecosystem exchange (NEE), a measure of the vectoring of carbon dioxide through their ecosystems. Though these sites deliver high quality, in situ data, they are inherently restricted to hyper-local observation, limiting accurate inferences toward a finer spatial resolution. Additionally, many ecoregions lack FLUXNET site representation, including the Pacific coast of South America, Central Asia and the Middle East, and the vegetation-rich archipelagos of the Sunda Shelf (see Fig. 1). These absences are based on sites’ geographic distribution and independent of ecosystem type.