eviz.lib.data package

Submodules

eviz.lib.data.data_source module

class eviz.lib.data.data_source.CSVDataSource

Bases: DataSource

load_data(file_path: str)
class eviz.lib.data.data_source.DataProcessor(model_name: str, file_list: dict, meta_coords: dict, meta_attrs: dict, season: Any | None = None)

Bases: DataSource

This class provides methods to access and process EVIZ data sources.

An instance of DataProcessor is created for each model and its associated file list. To remain model-agnostic, the model’s coordinates are referred to by the generic names xc, yc, zc, and tc. These names are mapped to the actual model coordinate names in the YAML file meta_coordinates.yaml. Likewise, the data attributes are stored and mapped in a dictionary defined in meta_attributes.yaml.

Parameters:
  • model_name (str) – The name of the supported model.

  • file_list (dict) – The data file names associated with the model.

  • meta_coords (dict) – A dictionary of metadata coordinate names from the file list.

  • meta_attrs (dict) – A dictionary of metadata attribute names from the file list.
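
The generic-to-model coordinate mapping described for DataProcessor can be pictured as a plain dictionary lookup. The following is a minimal sketch, not the eviz implementation: the layout of META_COORDS, the model keys, the coordinate names such as "lon" or "west_east", and the helper resolve_coord are all assumptions made for illustration. The real mapping is defined in meta_coordinates.yaml.

```python
# Hypothetical stand-in for the contents of meta_coordinates.yaml.
# The real file layout and coordinate names may differ.
META_COORDS = {
    "xc": {"geos": "lon", "wrf": "west_east"},
    "yc": {"geos": "lat", "wrf": "south_north"},
    "zc": {"geos": "lev", "wrf": "bottom_top"},
    "tc": {"geos": "time", "wrf": "Time"},
}


def resolve_coord(generic_name: str, model_name: str) -> str:
    """Return the model-specific coordinate name for a generic one."""
    try:
        return META_COORDS[generic_name][model_name]
    except KeyError as err:
        raise KeyError(
            f"no mapping for {generic_name!r} in model {model_name!r}"
        ) from err


print(resolve_coord("xc", "geos"))
print(resolve_coord("zc", "wrf"))
```

Keeping the lookup in one place means plotting code only ever asks for xc, yc, zc, or tc and never hard-codes a model's own names.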

static adjust_units(units)
static check_units(ref_da, dev_da, enforce_units=True)
static convert_kg_to_target_units(data_kg, target_units, kg_to_kgC)
static convert_units(dr, species_name, species_properties, target_units, interval=2678400.0, area_m2=None, delta_p=None, box_height=None)
static data_unit_is_mol_per_mol(da)
file_list: dict
static get_attrs(data, key)

Get attributes associated with a key

get_dataset(i)
get_datasets()
get_ds_index()
get_field(name, ds_index=0)

Extract field from xarray Dataset

Parameters:
  • name (str) – name of field to extract from dataset

  • ds_index (int) – fid index associated with dataset containing field name

Returns:

DataArray containing field data

get_meta_attrs(data, key)

Get attributes associated with a key

load_data(file_path)
property logger: Logger
meta_attrs: dict
meta_coords: dict
model_name: str
process_data()
season: Any = None
class eviz.lib.data.data_source.DataSource

Bases: ABC

Abstract class that defines data source object

A data source is an in-file representation of a model, such as GEOS. Therefore, the GEOS data source, or “-s geos” as represented in the CLI, represents objects that embody the data files generated by GEOS. Currently, these are all NetCDF4 files, but they could be in other formats. For example, WRF model data can be in NetCDF4 and Grib2 data formats (the latter is typically used by WPS). Thus, “-s wrf” represents data files that are either NetCDF4 or Grib2. The two are distinguished by the file extension (if any) or by treating a model as a “special” case.

abstract load_data(file_path)
property logger: Logger
class eviz.lib.data.data_source.DataSourceFactory

Bases: object

static get_data_class(file_extension: str) DataSource
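
Selecting a data source from a file extension is a classic factory lookup. The sketch below only illustrates that pattern under stated assumptions: the Sketch* classes, the _REGISTRY table, the extension strings, and get_data_class_sketch are hypothetical names, and the actual extensions recognized by DataSourceFactory are not documented here.

```python
from abc import ABC, abstractmethod


class SketchSource(ABC):
    """Hypothetical stand-in for an abstract data source."""

    @abstractmethod
    def load_data(self, file_path: str):
        ...


class SketchNetCDFSource(SketchSource):
    def load_data(self, file_path: str):
        return f"netcdf:{file_path}"


class SketchCSVSource(SketchSource):
    def load_data(self, file_path: str):
        return f"csv:{file_path}"


# Assumed extension table; the real factory may recognize other strings.
_REGISTRY = {
    ".nc": SketchNetCDFSource,
    ".nc4": SketchNetCDFSource,
    ".csv": SketchCSVSource,
}


def get_data_class_sketch(file_extension: str) -> SketchSource:
    """Return a source instance for the given extension (case-insensitive)."""
    key = file_extension.lower()
    if not key.startswith("."):
        key = "." + key
    try:
        source_cls = _REGISTRY[key]
    except KeyError:
        raise ValueError(f"unsupported file extension: {file_extension!r}")
    return source_cls()


source = get_data_class_sketch("NC4")
print(type(source).__name__, source.load_data("example.nc4"))
```

A registry dictionary keeps the dispatch in one table, so supporting another format means adding one entry rather than another branch.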
class eviz.lib.data.data_source.HDF4DataSource

Bases: DataSource

load_data(file_path: str)
class eviz.lib.data.data_source.HDF5DataSource

Bases: DataSource

load_data(file_path: str)
class eviz.lib.data.data_source.NetCDFDataSource(fid: int = 0)

Bases: DataSource

fid: int = 0
load_data(file_name: str)

Helper function to open and define a dataset

Parameters:
  • fid (int) – file id (starts at 0)

  • file_name (str) – name of file associated with fid

Returns:

dict with xarray dataset information

Return type:

unzipped_data (xarray.Dataset)

eviz.lib.data.data_source.get_season_from_file(file_name)
eviz.lib.data.data_source.make_fake_4D_dataset(nt=366, path=None)
eviz.lib.data.data_source.make_fake_column_dataset(path=None)
eviz.lib.data.data_source.make_fake_timeseries_dataset(path=None)

eviz.lib.data.data_utils module

eviz.lib.data.data_utils.apply_conversion(config, data2d, name)

Apply a unit conversion based on SPECS file entries

Parameters:
  • config (Config)

  • data2d (DataArray)

  • name (str)

Returns:

data2d (DataArray) with target units

For single plots, we rely on the specs file to determine the units and the unit conversion factor. For comparison plots, we rely on the “target” units specified in the specs file, and the unit conversion is provided by the Units conversion module.

eviz.lib.data.data_utils.apply_mean(config, d, level=None)

Compute various averages over coordinates

eviz.lib.data.data_utils.apply_zsum(config, data2d)

Sum over vertical levels (column sum)

eviz.lib.data.date_time module

Internal utilities for managing datetime objects and strings. Adopted from GCpy, with minor modifications.

eviz.lib.data.date_time.add_months(start_date, n_months)
Parameters:
  • start_date (numpy.datetime64) – numpy datetime64 object

  • n_months (integer)

Returns:

new_date (numpy.datetime64) – numpy datetime64 object with exactly n_months added to the date
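
Month arithmetic is easy to get wrong at year boundaries, so a small worked sketch may help. It is not the eviz code: it uses the standard library's datetime.date in place of numpy.datetime64 so that it runs without NumPy, the name add_months_sketch is hypothetical, and clamping the day to the end of a shorter month is an assumption.

```python
import calendar
from datetime import date


def add_months_sketch(start: date, n_months: int) -> date:
    """Return the date n_months after start (n_months may be negative)."""
    # Work in "months since year 0" so year rollover is handled by divmod.
    total = start.year * 12 + (start.month - 1) + n_months
    year, month_index = divmod(total, 12)
    month = month_index + 1
    # Assumption: clamp the day when the target month is shorter.
    last_day = calendar.monthrange(year, month)[1]
    return date(year, month, min(start.day, last_day))


print(add_months_sketch(date(2019, 11, 15), 3))
print(add_months_sketch(date(2019, 1, 31), 1))
```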

eviz.lib.data.date_time.get_timestamp_string(date_array)

Convenience function returning the datetime timestamp based on the given input

Parameters:

date_array (array) – Array of integers corresponding to [year, month, day, hour, minute, second]. Any integers not provided will be padded accordingly.

Returns:

date_str (string) – string in datetime format (e.g. 2019-01-01T00:00:00Z)
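
The padding behaviour can be illustrated with a short sketch. The helper name timestamp_string_sketch is hypothetical, and the padding values (month and day default to 1, time fields default to 0) are an assumption chosen to reproduce the documented example 2019-01-01T00:00:00Z.

```python
def timestamp_string_sketch(date_array):
    """Format [year, month, day, hour, minute, second] as a timestamp string."""
    values = list(date_array)
    if not 1 <= len(values) <= 6:
        raise ValueError("expected between 1 and 6 integers")
    # Assumed padding: first month and day, then midnight.
    defaults = [None, 1, 1, 0, 0, 0]
    year, month, day, hour, minute, second = values + defaults[len(values):]
    return (
        f"{year:04d}-{month:02d}-{day:02d}"
        f"T{hour:02d}:{minute:02d}:{second:02d}Z"
    )


print(timestamp_string_sketch([2019]))
print(timestamp_string_sketch([2019, 7, 4, 12]))
```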

eviz.lib.data.date_time.is_full_year(start_date, end_date)

Checks whether two dates span a full year starting on January 1.

Parameters:
  • start_date (numpy.datetime64) – numpy datetime64 object

  • end_date (numpy.datetime64) – numpy datetime64 object

Returns:

boolean
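
One plausible reading of "a full year starting Jan 1" is sketched below. It is an illustration only: it uses datetime.date instead of numpy.datetime64, the name is_full_year_sketch is hypothetical, and treating the end date as January 1 of the following year (an exclusive end) is an assumption that the documentation above does not confirm.

```python
from datetime import date


def is_full_year_sketch(start: date, end: date) -> bool:
    """True if start is January 1 and end is January 1 of the next year."""
    starts_jan_1 = (start.month, start.day) == (1, 1)
    # Assumption: the end date is exclusive, i.e. exactly 12 months later.
    ends_one_year_later = end == date(start.year + 1, 1, 1)
    return starts_jan_1 and ends_one_year_later


print(is_full_year_sketch(date(2019, 1, 1), date(2020, 1, 1)))
print(is_full_year_sketch(date(2019, 2, 1), date(2020, 2, 1)))
```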

eviz.lib.data.hdf4_reader module

class eviz.lib.data.hdf4_reader.HDF4DataReader(source_name: str)

Bases: DataReader

Class definitions for reading HDF4 files.

check_fid_coords(fid)

Checks if there are any file-level coordinates

Parameters:

fid – a file reader (SD) object

Returns:

False if there are no file-level coordinates, True if there are any

Return type:

bool

get_array(fid)

Returns an XArray DataArray of an HDF4 dataset given the file and Landsat reader objects

Parameters:

fid – an SD object

Returns:

an XArray DataArray

get_coord_bounds(fid)

Returns the coordinate boundaries for constructing coordinates at the dataset level

Parameters:

fid – a file reader (SD) object

Returns:

a Python dictionary of String keys and numeric values

get_dim_attrs(dim)

Returns the attributes of a given dimension (SDim) object

Parameters:

dim – an SDim object

Returns:

a Python dictionary (String keys and values)

get_dims(ds)

Returns the dimension (SDim) objects of a given dataset (SDS) object

Parameters:

ds – an SDS object

Returns:

a Python list of SDim objects

get_dims_attrs(ds)

Returns the dimension attributes for a given dataset (SDS) object

Parameters:

ds – an SDS object

Returns:

a Python dictionary of String keys and dictionary values

get_ds_coords(fid, ds)

Returns constructed coordinates given the file reader (SD) and a dataset (SDS) object

Parameters:
  • fid – an SD object

  • ds – an SDS object

Returns:

a Python dictionary of String keys and NumPy array values

get_fid()

Accesses the file reader object for a file

Returns:

a file reader (SD) object

get_fill(ds)

Returns the fill value of a given dataset (SDS) object

Parameters:

ds – an SDS object

Returns:

float, int, or ‘None’

get_offset(ds)

Returns the offset value of a given dataset (SDS) object

Parameters:

ds – an SDS object

Returns:

float, int, or ‘None’

get_scale(ds)

Returns the scale factor of a given dataset (SDS) object

Parameters:

ds – an SDS object

Returns:

float, int, or ‘None’

get_time(fid)

Returns the time(s) at which the data was measured/acquired

Parameters:

fid – an SD object

Returns:

a DateTime object

process_file(fid, ds)
read_data(file_path: str)

Reads an HDF4 data file and returns its data as an XArray Dataset

Returns:

an XArray Dataset

restore_data(ds)

Restores the data of a given dataset (SDS) object

Parameters:

ds – an SDS object

Returns:

a NumPy array
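
Taken together, get_scale, get_offset, get_fill, and restore_data suggest the usual workflow for packed data: mask the fill value, then apply the calibration. The sketch below is a minimal illustration on plain Python lists rather than NumPy arrays. It assumes the HDF4 calibration convention physical = scale * (raw - offset) and replaces fill values with NaN; whether eviz applies exactly this formula is not stated here, and restore_values_sketch is a hypothetical name.

```python
import math


def restore_values_sketch(raw, scale=None, offset=None, fill=None):
    """Unpack raw stored values into physical values."""
    # A missing scale or offset is treated as the identity transformation.
    scale = 1.0 if scale is None else scale
    offset = 0.0 if offset is None else offset
    restored = []
    for value in raw:
        if fill is not None and value == fill:
            # Fill values mark missing data; represent them as NaN.
            restored.append(math.nan)
        else:
            # Assumed HDF4 convention: scale * (raw - offset).
            restored.append(scale * (value - offset))
    return restored


print(restore_values_sketch([10, -9999, 20], scale=0.5, offset=2, fill=-9999))
```

Some other formats define the calibration as raw * scale + offset instead, so the convention should be checked against the data product before the values are trusted.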

eviz.lib.data.hdf5_reader module

class eviz.lib.data.hdf5_reader.HDF5DataReader(source_name: str)

Bases: DataReader

Class definitions for reading HDF5 files.

check_coords(dims, coords)

Rearranges order of the coordinates list to match the dimension shapes

Parameters:
  • dims – a Python dictionary of dimension name String keys and dimension size integer values

  • coords – a Python dictionary of file coordinates (NumPy arrays)

Returns:

a Python dictionary of rearranged file coordinates (NumPy arrays)

convert_dict_dtype(sample_dict)

Converts a dictionary of attributes from NumPy data types to general Python data types

Parameters:

sample_dict – a Python dictionary of attributes

Returns:

a Python dictionary of attributes

get_array(data_group, fid_coords)

Returns an XArray DataArray of an HDF5 dataset given the data field subgroup contents and file-level coordinates.

Parameters:
  • data_group – a Python dictionary of String keys and HDF5 dataset object values

  • fid_coords – a Python dictionary of file coordinates (NumPy arrays)

Returns:

an XArray DataArray

get_coords(fid)

Returns the coordinates of a file

Parameters:

fid – a file reader object

Returns:

a Python dictionary of String keys and NumPy array values

get_data_group(fid)

Finds and returns the contents of the file data field subgroup in dictionary format

Parameters:

fid – a file identifier object

Returns:

a Python dictionary of dataset name String keys and dataset object values

get_ds_attrs(ds)

Returns the attributes of an HDF5 dataset

Parameters:

ds – an HDF5 dataset object

Returns:

a Python dictionary of attributes

get_ds_dims(ds, coords)

Returns the dimension names of a dataset

Parameters:
  • ds – a dataset object

  • coords – a Python dictionary of file coordinates (NumPy arrays)

Returns:

a Python dictionary of dimension name String keys and dimension size integer values

get_fid()

Access the file reader object for a given HDF5 file

Returns:

an h5py file reader object

get_fid_attrs(fid)

Returns the file-level attributes (in Python data types)

Parameters:

fid – a file reader object

Returns:

a Python dictionary of attributes

get_fill(ds_attrs)

Returns the fill value of a dataset

Parameters:

ds_attrs – a Python dictionary of dataset attributes

Returns:

an integer, float, or ‘None’

get_findex(data_to_plot)
get_offset(ds_attrs)

Returns the offset value of a dataset

Parameters:

ds_attrs – a Python dictionary of dataset attributes

Returns:

an integer, float, or 0

get_plot_attrs(fid)

Returns the file plotting attributes (in Python data types)

Parameters:

fid – a file reader object

Returns:

a Python dictionary of attributes

get_scale(ds_attrs)

Returns the scale factor of a dataset

Parameters:

ds_attrs – a Python dictionary of dataset attributes

Returns:

an integer, float, or ‘None’

get_time(fid)

Returns the time at which the data was measured/acquired

Parameters:

fid – a file reader object

Returns:

a list containing a DateTime object

process_file(fid)
read_data(file_path: str) Any

Given an HDF5 file name, convert the data into an Xarray Dataset.

Parameters:

file_path (str) – HDF5 file name containing OMI data

Returns:

xr_ds

Return type:

Xarray Dataset

restore_data(ds)

Restores the data of a given dataset object

Parameters:

ds – an HDF5 dataset object

Returns:

a NumPy array

eviz.lib.data.netcdf4_reader module

class eviz.lib.data.netcdf4_reader.NetCDFDataReader(source_name: str, datasets: list = <factory>, findex: int = 0)

Bases: DataReader

datasets: list
findex: int = 0
get_findex(data_to_plot)
read_data(file_path: str) Any

Helper function to open and define a dataset

Parameters:
  • fid (int) – file id (starts at 0)

  • file_path (str) – name of file associated with fid

Returns:

dict with xarray dataset information

Return type:

unzipped_data (xarray.Dataset)

eviz.lib.data.processor module

class eviz.lib.data.processor.Interp(config: eviz.lib.autoviz.config.Config, data: List[Any])

Bases: object

config: Config
data: List[Any]
property logger: Logger
regrid(pid)

Wrapper for regrid method

This function regrids two fields (if necessary)

Parameters:

pid (str) – a plot identifier

Returns:

Regridded fields

class eviz.lib.data.processor.Overlays(config: Config, plot_type: str)

Bases: object

Class that defines overlays

Examples of overlays include:
  • specialized contours

  • specialized line plots

Parameters:
  • config (Config) – Config object

  • data (list) – list of data files

config: Config
static get_attrs(data, key)

Get attributes associated with a key

get_field(name, ds_index=0)

Extract field from xarray Dataset

Parameters:
  • name (str) – name of field to extract from dataset

  • ds_index (int) – fid index associated with dataset containing field name

Returns:

DataArray containing field data

get_meta_attrs(data, key)

Get attributes associated with a key

get_processed_data(field: str) Any | None
property logger: Logger
plot_type: str
process_data()

Get tropopause field and apply to a given experiment name

Parameters:
  • ds_meta (dict) – Dataset metadata

  • findex (int) – Dataset index (default=0, i.e. just one dataset)

static process_difference(data1: numpy.ndarray, data2: numpy.ndarray) numpy.ndarray
static process_scat(data: Any, field: str) numpy.ndarray
static process_xy(data: Any, field: str) numpy.ndarray
static process_yz(data: Any, field: str) numpy.ndarray
sphum_field(ds_meta, findex=0)

Get specific humidity field and apply to a given experiment name

Parameters:
  • ds_meta (dict) – Dataset metadata

  • findex (int) – Dataset index (default=0, i.e. just one dataset)

eviz.lib.data.reader module

class eviz.lib.data.reader.DataReader(source_name: str)

Bases: ABC

static get_attrs(data, key)

Get attributes associated with a key

get_dataset(i)
get_datasets()
get_field(name: str, ds_index: int)

Extract field from xarray Dataset

Parameters:
  • name (str) – name of field to extract from dataset

  • ds_index (int) – index of dataset to extract from

Returns:

DataArray containing field data

property logger: Logger
abstract read_data(file_path: str) Any
source_name: str
eviz.lib.data.reader.get_data_coords(data2d, attribute_name)

eviz.lib.data.tabular_reader module

class eviz.lib.data.tabular_reader.CSVDataReader(source_name: str, file_path: str | None = None)

Bases: DataReader

Class definitions for reading CSV data files.

file_path: str = None
read_data(file_path: str) Any

Reads CSV data files and returns its data as Pandas dataframe

Returns:

a Pandas dataframe

eviz.lib.data.units module

eviz.lib.data.units.Pa_to_hPa(Pa)
eviz.lib.data.units.Pa_to_mb(Pa)
class eviz.lib.data.units.Units(config: Config)

Bases: object

This class defines attributes and methods to perform unit conversions of xarray data arrays.

The conversion will be automatic if the fields are registered in eviz’s species database and the units are supported. Otherwise, the conversion specification can be made in eviz’s config files (APP and SPECS YAML files). Please see the user’s guide for more information.

Parameters:

config (Config) :

Representation of the model configuration used to specify data sources and user choices for the map generation. The config instance is created at the application level.

airmass: float
config: Config
convert(data, species_name, to_unit)

Conversion method for non-chemical species (e.g. atmospheric fields)

Parameters:
  • to_unit (str) – data destination unit

  • species_name (str) – species name of the data

  • data (xArray) – data to undergo unit conversion

convert_chem(data, sp_name, to_unit, air_column_density=None, airmass=None)

Conversion method for chemical species

Parameters:
  • to_unit (str) – data destination unit

  • sp_name (str) – species name of the data

  • data (xArray) – data to undergo unit conversion

property logger: Logger
species_db: dict
eviz.lib.data.units.adjust_units(units)

Creates a consistent unit string that will be used in the unit conversion routines below.

Parameters:

units (str) – Input unit string.

Returns:

adjusted_units (str) – Output unit string, adjusted to a consistent value.

Remarks:

The unit list is incomplete; it is currently geared to units from common model diagnostics (e.g. kg/m2/s, kg, and variants).

eviz.lib.data.units.c_to_f(c)

Convert Celsius to Fahrenheit

eviz.lib.data.units.c_to_k(c)

Convert Celsius to kelvin

eviz.lib.data.units.calculate_total_area(field)

Calculates the total surface area

Parameters:

field (xarray.DataArray) – The input data array.

Returns:

The total area over which the data array is defined.

Return type:

float

eviz.lib.data.units.calculate_total_column(species, airmass, species_name)

Calculate the total column of a given species in mol/m², molecules/cm², and Dobson Units (DU).

Parameters:
  • species (xarray.DataArray) – Mixing ratio of the species (in mol/mol)

  • airmass (xarray.DataArray) – Airmass (in kg/m²)

  • species_name (str) – Name of the species (for DU conversion)

Returns:
  • total_column_mol_m2 (xarray.DataArray) – Total column of the species in mol/m²

  • total_column_molecules_cm2 (xarray.DataArray) – Total column of the species in molecules/cm²

  • total_column_du (xarray.DataArray) – Total column of the species in Dobson Units (DU)

eviz.lib.data.units.calculate_total_mass(airmass)

Calculates the total mass given the airmass per square meter field.

Parameters:

airmass (xarray.DataArray) – The input data array representing mass per square meter.

Returns:

The total mass in kilograms.

Return type:

float

eviz.lib.data.units.check_units(ref_da, dev_da, enforce_units=True) bool

Ensures the units of two xarray DataArrays are the same.

Parameters:
  • ref_da (xarray DataArray) – First data array containing a units attribute.

  • dev_da (xarray DataArray) – Second data array containing a units attribute.

  • enforce_units (bool, optional) – Whether to stop the program if ref and dev units do not match (default: True)
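
The role of enforce_units can be illustrated with a small sketch. It is hypothetical throughout: objects with a units attribute stand in for xarray DataArrays, whitespace is stripped before comparing, and "stopping the program" is modelled by raising ValueError, which may differ from what eviz does.

```python
from types import SimpleNamespace


def check_units_sketch(ref_da, dev_da, enforce_units=True):
    """Return True if both objects carry the same units string."""
    units_match = ref_da.units.strip() == dev_da.units.strip()
    if not units_match and enforce_units:
        # Assumption: a mismatch is fatal when enforcement is on.
        raise ValueError(
            f"units do not match: ref {ref_da.units!r}, dev {dev_da.units!r}"
        )
    return units_match


ref = SimpleNamespace(units="kg/m2/s")
dev = SimpleNamespace(units="kg/m2/s ")
other = SimpleNamespace(units="kg")

print(check_units_sketch(ref, dev))
print(check_units_sketch(ref, other, enforce_units=False))
```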

eviz.lib.data.units.data_unit_is_mol_per_mol(da)

Check if the units of an xarray DataArray are mol/mol based on a set list of unit strings mol/mol may be.

Parameters:

da (xarray DataArray) – Data array containing a units attribute

Returns:

is_molmol (bool) – Whether input units are mol/mol
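
A membership test against a fixed list of spellings is enough to express this check. In the sketch below the tuple MOL_PER_MOL_UNITS is an assumed list, not the one eviz uses, and is_mol_per_mol_sketch is a hypothetical name; any object with a units attribute stands in for a DataArray.

```python
from types import SimpleNamespace

# Assumed spellings of mol/mol; the real list may be longer or different.
MOL_PER_MOL_UNITS = ("mol/mol", "mol mol-1", "mol/mol dry", "mol mol-1 dry")


def is_mol_per_mol_sketch(da) -> bool:
    """True if the object's units string is a known mol/mol spelling."""
    units = getattr(da, "units", "")
    return units.strip() in MOL_PER_MOL_UNITS


print(is_mol_per_mol_sketch(SimpleNamespace(units="mol mol-1")))
print(is_mol_per_mol_sketch(SimpleNamespace(units="kg/m2/s")))
```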

eviz.lib.data.units.download_airmass(url)

Downloads airmass file

Parameters:

url (str) – URL of the file

Returns:

xArray dataset

eviz.lib.data.units.du_to_mol(du, molar_mass_species, airmass)

Convert DU to molar fraction

eviz.lib.data.units.f_to_c(f)

Convert Fahrenheit to Celsius

eviz.lib.data.units.f_to_k(f)

Convert Fahrenheit to kelvin

eviz.lib.data.units.g_to_kg(gram)

Convert grams to kilograms

eviz.lib.data.units.g_to_mg(gram)

Convert grams to mg

eviz.lib.data.units.get_airmass(config)

Retrieves airmass field stored in a file or URL

Parameters:

config – eviz config object

Returns:

airmass field (xArray)

eviz.lib.data.units.get_species_name(species_name)
eviz.lib.data.units.hPa_to_Pa(hPa)
eviz.lib.data.units.hPa_to_mb(hPa)
eviz.lib.data.units.k_to_c(k)

Convert kelvin to Celsius

eviz.lib.data.units.k_to_f(k)

Convert kelvin to Fahrenheit

eviz.lib.data.units.kg_to_du(kg_frac, molar_mass_species, airmass)

Convert kg to DU

eviz.lib.data.units.kg_to_g(kilogram)

Convert kilograms to grams

eviz.lib.data.units.kg_to_mg(kilogram)

Convert kilograms to mg

eviz.lib.data.units.kg_to_mol(kg_frac, molar_mass_species)

Convert kg to molar fraction

eviz.lib.data.units.kg_to_ppb(kg_frac, molar_mass_species)

Convert from kg kg⁻¹ to parts per billion (ppb).

Parameters:
  • kg_frac (dataarray values) – mass fraction of the species (kg kg⁻¹)

  • molar_mass_species – molar mass of the species

Returns:

concentration in parts per billion (ppb)

eviz.lib.data.units.logger = <Logger eviz.lib.data.units (WARNING)>

Contains methods for converting the units of data. Some functions are adopted from GCpy, with minor modifications.

eviz.lib.data.units.mass_to_moles(g_of_element, molar_mass)

Convert mass of an element to moles

eviz.lib.data.units.mb_to_Pa(mb)
eviz.lib.data.units.mb_to_hPa(mb)
eviz.lib.data.units.mg_to_g(mgram)

Convert mg to grams

eviz.lib.data.units.mg_to_kg(mgram)

Convert mg to kilograms

eviz.lib.data.units.mol_to_du(mol_frac, molar_mass_species, airmass)

Convert molar fraction to DU

eviz.lib.data.units.mol_to_kg(mol_frac, molar_mass_species)

Convert molar fraction to kg

eviz.lib.data.units.mol_to_molecules_cm2(mol_frac, air_column_density)

Convert from mol mol⁻¹ to molecules cm⁻².

Parameters:
  • mol_frac – molar fraction of the species (mol mol⁻¹)

  • air_column_density – air column density (molecules cm⁻²)

Returns:

number of molecules per cm²

eviz.lib.data.units.mol_to_ppb(mol_frac)

Convert from mol mol⁻¹ to parts per billion (ppb).

Parameters:

mol_frac (dataarray values) – molar fraction of the species (mol mol⁻¹)

Returns:

concentration in parts per billion (ppb)

eviz.lib.data.units.moles_to_mass(num_moles, molar_mass)

Convert moles to mass

eviz.lib.data.units.ppb_to_mol(ppb)

Convert from parts per billion (ppb) to mol mol⁻¹.

Parameters:

ppb (dataarray values) – concentration in parts per billion

Returns:

mol_frac – molar fraction of the species (mol mol⁻¹)
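
The ppb conversions are a pure scaling by 1e9, so a round trip should return the starting value up to floating-point error. The sketch below demonstrates this with hypothetical *_sketch functions on plain floats; the same arithmetic applies element-wise to DataArray values.

```python
import math

PPB_PER_UNIT_FRACTION = 1.0e9  # 1 mol/mol corresponds to 1e9 ppb


def mol_to_ppb_sketch(mol_frac):
    """Molar fraction (mol/mol) to parts per billion."""
    return mol_frac * PPB_PER_UNIT_FRACTION


def ppb_to_mol_sketch(ppb):
    """Parts per billion to molar fraction (mol/mol)."""
    return ppb / PPB_PER_UNIT_FRACTION


original = 4.0e-8
round_trip = ppb_to_mol_sketch(mol_to_ppb_sketch(original))
print(mol_to_ppb_sketch(original), math.isclose(round_trip, original))
```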

Module contents