eviz.lib.data package
Submodules
eviz.lib.data.data_source module
- class eviz.lib.data.data_source.CSVDataSource
Bases:
DataSource
- load_data(file_path: str)
- class eviz.lib.data.data_source.DataProcessor(model_name: str, file_list: dict, meta_coords: dict, meta_attrs: dict, season: Any | None = None)
Bases:
DataSource
This class provides methods to access and process EVIZ data sources.
An instance of DataProcessor is created for each model and its associated file list. To remain model-agnostic, the model’s coordinates are referred to by the generic names xc, yc, zc, and tc. These names are mapped to the actual model coordinate names in the YAML file meta_coordinates.yaml. Likewise, the data attributes are stored and mapped in a dictionary defined in meta_attributes.yaml.
- Parameters:
model_name (str) – The name of the supported model.
file_list (dict) – The data file names associated with the model.
meta_coords (dict) – A dictionary of metadata coordinate names from the file list.
meta_attrs (dict) – A dictionary of metadata attribute names from the file list.
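The generic-to-actual coordinate mapping described above can be pictured with a small lookup. This is only a sketch: the nested dictionary layout, the `resolve_coord` helper, and the example coordinate names are assumptions made for illustration, and the real structure of meta_coordinates.yaml may differ.

```python
# Hypothetical layout: generic coordinate name -> {model name -> actual name}.
# The values are illustrative and are not read from meta_coordinates.yaml.
META_COORDS = {
    "xc": {"geos": "lon", "wrf": "west_east"},
    "yc": {"geos": "lat", "wrf": "south_north"},
    "zc": {"geos": "lev", "wrf": "bottom_top"},
    "tc": {"geos": "time", "wrf": "Time"},
}


def resolve_coord(meta_coords, model_name, generic_name):
    """Return the model-specific name for a generic coordinate (xc, yc, zc, tc)."""
    try:
        return meta_coords[generic_name][model_name]
    except KeyError:
        raise KeyError(
            f"no mapping for {generic_name!r} in model {model_name!r}"
        ) from None


print(resolve_coord(META_COORDS, "wrf", "xc"))
```

Code that works only with the generic names stays unchanged when a new model is added; only the mapping grows.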
- static adjust_units(units)
- static check_units(ref_da, dev_da, enforce_units=True)
- static convert_kg_to_target_units(data_kg, target_units, kg_to_kgC)
- static convert_units(dr, species_name, species_properties, target_units, interval=2678400.0, area_m2=None, delta_p=None, box_height=None)
- static data_unit_is_mol_per_mol(da)
- file_list: dict
- static get_attrs(data, key)
Get attributes associated with a key
- get_dataset(i)
- get_datasets()
- get_ds_index()
- get_field(name, ds_index=0)
Extract field from xarray Dataset
- Parameters:
name (str) – name of field to extract from dataset
ds_index (int) – fid index associated with dataset containing field name
- Returns:
DataArray containing field data
- get_meta_attrs(data, key)
Get attributes associated with a key
- load_data(file_path)
- property logger: Logger
- meta_attrs: dict
- meta_coords: dict
- model_name: str
- process_data()
- season: Any = None
- class eviz.lib.data.data_source.DataSource
Bases:
ABC
Abstract class that defines data source object
A data source is an in-file representation of a model, such as GEOS. Therefore, the GEOS data source, or “-s geos” as represented in the CLI, represents objects that embody the data files generated by GEOS. Currently, these are all NetCDF4 files, but they could be in other formats. For example, WRF model data can be in NetCDF4 and Grib2 data formats (the latter is typically used by WPS). Thus, “-s wrf” represents data files that are either NetCDF4 or Grib2. The distinction is made by the file extension (if any) or by treating a model as a “special” case.
- abstract load_data(file_path)
- property logger: Logger
- class eviz.lib.data.data_source.DataSourceFactory
Bases:
object
- static get_data_class(file_extension: str) DataSource
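Selecting a data source class by file extension, as the DataSource description suggests, can be sketched as follows. The class names, the extension table, and `get_source_for` are hypothetical stand-ins; they are not the eviz classes, and the extensions eviz actually recognizes are not stated here.

```python
import os
from abc import ABC, abstractmethod


class SketchSource(ABC):
    """Stand-in for an abstract data source."""

    @abstractmethod
    def load_data(self, file_path):
        ...


class SketchNetCDF(SketchSource):
    def load_data(self, file_path):
        # A real reader would open the file; the sketch only tags the path.
        return ("netcdf", file_path)


class SketchCSV(SketchSource):
    def load_data(self, file_path):
        return ("csv", file_path)


# Assumed extension table, for illustration only.
_EXTENSIONS = {".nc": SketchNetCDF, ".nc4": SketchNetCDF, ".csv": SketchCSV}


def get_source_for(file_path):
    """Instantiate the source class registered for the file's extension."""
    ext = os.path.splitext(file_path)[1].lower()
    try:
        return _EXTENSIONS[ext]()
    except KeyError:
        raise ValueError(f"unsupported file extension: {ext!r}") from None


print(get_source_for("output.nc4").load_data("output.nc4"))
```

Keeping the table in one place means a new format only needs a new subclass and one dictionary entry.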
- class eviz.lib.data.data_source.HDF4DataSource
Bases:
DataSource
- load_data(file_path: str)
- class eviz.lib.data.data_source.HDF5DataSource
Bases:
DataSource
- load_data(file_path: str)
- class eviz.lib.data.data_source.NetCDFDataSource(fid: int = 0)
Bases:
DataSource
- fid: int = 0
- load_data(file_name: str)
Helper function to open and define a dataset
- Parameters:
fid (int) – file id (starts at 0)
file_name (str) – name of file associated with fid
- Returns:
dict with xarray dataset information
- Return type:
unzipped_data (xarray.Dataset)
- eviz.lib.data.data_source.get_season_from_file(file_name)
- eviz.lib.data.data_source.make_fake_4D_dataset(nt=366, path=None)
- eviz.lib.data.data_source.make_fake_column_dataset(path=None)
- eviz.lib.data.data_source.make_fake_timeseries_dataset(path=None)
eviz.lib.data.data_utils module
- eviz.lib.data.data_utils.apply_conversion(config, data2d, name)
Apply a unit conversion based on SPECS file entries
- Parameters:
config (Config)
data2d (DataArray)
name (str)
- Returns:
data2d (DataArray) with target units
For single plots, we rely on the specs file to determine the units and the unit conversion factor. For comparison plots, we rely on the “target” units specified in the specs file, and the unit conversion is provided by the Units conversion module.
- eviz.lib.data.data_utils.apply_mean(config, d, level=None)
Compute various averages over coordinates
- eviz.lib.data.data_utils.apply_zsum(config, data2d)
Sum over vertical levels (column sum)
eviz.lib.data.date_time module
Internal utilities for managing datetime objects and strings. Adopted from GCpy, with minor modifications.
- eviz.lib.data.date_time.add_months(start_date, n_months)
- Parameters:
start_date (numpy.datetime64) – numpy datetime64 object
n_months (int) – number of months to add
- Returns:
numpy datetime64 object with exactly n_months added to the date
- Return type:
new_date (numpy.datetime64)
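Calendar-month arithmetic of this kind can be sketched with the standard library. This is not the add_months implementation: it uses `datetime.date` as a stand-in for numpy.datetime64, the name `add_months_sketch` is hypothetical, and clamping the day to the end of a shorter month is an assumption of the sketch.

```python
import calendar
from datetime import date


def add_months_sketch(start, n_months):
    """Return `start` shifted by n_months whole calendar months."""
    # Count months since year 0 so the year rolls over naturally.
    total = start.year * 12 + (start.month - 1) + n_months
    year, month0 = divmod(total, 12)
    month = month0 + 1
    # Assumption: clamp the day when the target month is shorter.
    day = min(start.day, calendar.monthrange(year, month)[1])
    return date(year, month, day)


print(add_months_sketch(date(2019, 1, 1), 12))
```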
- eviz.lib.data.date_time.get_timestamp_string(date_array)
Convenience function returning the datetime timestamp based on the given input
- Parameters:
date_array (array) – Array of integers corresponding to [year, month, day, hour, minute, second]. Any integers not provided will be padded accordingly.
- Returns:
string in datetime format (e.g. 2019-01-01T00:00:00Z)
- Return type:
date_str (str)
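The padding behaviour can be illustrated with a short sketch. The function name `timestamp_string_sketch` is hypothetical, and the padding values (month and day default to 1, time fields to 0) are an assumption chosen to reproduce the example string above.

```python
def timestamp_string_sketch(date_array):
    """Format [year, month, day, hour, minute, second] as an ISO-like string."""
    # Assumed defaults for missing trailing fields; the year is mandatory.
    defaults = [None, 1, 1, 0, 0, 0]
    parts = list(date_array)
    if not 1 <= len(parts) <= 6:
        raise ValueError("expected between 1 and 6 integers")
    parts += defaults[len(parts):]
    year, month, day, hour, minute, second = parts
    return (
        f"{year:04d}-{month:02d}-{day:02d}"
        f"T{hour:02d}:{minute:02d}:{second:02d}Z"
    )


print(timestamp_string_sketch([2019]))
```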
- eviz.lib.data.date_time.is_full_year(start_date, end_date)
Checks whether two dates span a full year starting on Jan 1.
- Parameters:
start_date (numpy.datetime64) – numpy datetime64 object
end_date (numpy.datetime64) – numpy datetime64 object
- Returns:
bool
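One plausible reading of the full-year check is sketched below. It uses `datetime.date` in place of numpy.datetime64 and assumes the end date is exclusive (Jan 1 of the following year); the actual is_full_year function may define the interval differently.

```python
from datetime import date


def is_full_year_sketch(start, end):
    """True if start is Jan 1 and end is Jan 1 of the next year (assumed exclusive end)."""
    starts_on_jan1 = start.month == 1 and start.day == 1
    return starts_on_jan1 and end == date(start.year + 1, 1, 1)


print(is_full_year_sketch(date(2019, 1, 1), date(2020, 1, 1)))
```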
eviz.lib.data.hdf4_reader module
- class eviz.lib.data.hdf4_reader.HDF4DataReader(source_name: str)
Bases:
DataReader
Class definitions for reading HDF4 files.
- check_fid_coords(fid)
Checks if there are any file-level coordinates
- Parameters:
fid – a file reader (SD) object
- Returns:
False if there are no file-level coordinates, True if there are any
- Return type:
bool
- get_array(fid)
Returns an XArray DataArray of an HDF4 dataset given the file and Landsat reader objects
- Parameters:
fid – an SD object
- Returns:
an XArray DataArray
- get_coord_bounds(fid)
Returns the coordinate boundaries for constructing coordinates at the dataset level
- Parameters:
fid – a file reader (SD) object
- Returns:
a Python dictionary of String keys and numeric values
- get_dim_attrs(dim)
Returns the attributes of a given dimension (SDim) object
- Parameters:
dim – an SDim object
- Returns:
a Python dictionary (String keys and values)
- get_dims(ds)
Returns the dimension (SDim) objects of a given dataset (SDS) object
- Parameters:
ds – an SDS object
- Returns:
a Python list of SDim objects
- get_dims_attrs(ds)
Returns the dimension attributes for a given dataset (SDS) object
- Parameters:
ds – an SDS object
- Returns:
a Python dictionary of String keys and dictionary values
- get_ds_coords(fid, ds)
Returns constructed coordinates given the file reader (SD) and a dataset (SDS) object
- Parameters:
fid – an SD object
ds – an SDS object
- Returns:
a Python dictionary of String keys and NumPy array values
- get_fid()
Accesses the file reader object for a file
- Returns:
a file reader (SD) object
- get_fill(ds)
Returns the fill value of a given dataset (SDS) object
- Parameters:
ds – an SDS object
- Returns:
float, int, or ‘None’
- get_offset(ds)
Returns the offset value of a given dataset (SDS) object
- Parameters:
ds – an SDS object
- Returns:
float, int, or ‘None’
- get_scale(ds)
Returns the scale factor of a given dataset (SDS) object
- Parameters:
ds – an SDS object
- Returns:
float, int, or ‘None’
- get_time(fid)
Returns the time(s) at which the data was measured/acquired
- Parameters:
fid – an SD object
- Returns:
a DateTime object
- process_file(fid, ds)
- read_data(file_path: str)
Reads an HDF4 data file and returns its data as an XArray Dataset
- Returns:
an XArray Dataset
- restore_data(ds)
Restores the data of a given dataset (SDS) object
- Parameters:
ds – an SDS object
- Returns:
a NumPy array
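How a fill value, scale factor, and offset typically combine when unpacking stored values can be sketched as follows. This is an assumption-laden illustration, not the restore_data implementation: it uses plain Python lists instead of NumPy arrays and the convention `value = raw * scale + offset`, whereas some HDF4 products instead use `scale * (raw - offset)`.

```python
import math


def restore_sketch(raw, scale=None, offset=None, fill=None):
    """Unpack raw values: mask fill values, then apply scale and offset."""
    # Missing attributes fall back to the identity transform.
    scale = 1.0 if scale is None else scale
    offset = 0.0 if offset is None else offset
    restored = []
    for value in raw:
        if fill is not None and value == fill:
            restored.append(math.nan)  # fill values become NaN
        else:
            restored.append(value * scale + offset)
    return restored


print(restore_sketch([0, 10, -999], scale=0.5, offset=2.0, fill=-999))
```

Checking the product documentation for the scaling convention matters, because the two conventions give different results whenever the offset is nonzero.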
eviz.lib.data.hdf5_reader module
- class eviz.lib.data.hdf5_reader.HDF5DataReader(source_name: str)
Bases:
DataReader
Class definitions for reading HDF5 files.
- check_coords(dims, coords)
Rearranges order of the coordinates list to match the dimension shapes
- Parameters:
dims – a Python dictionary of dimension name String keys and dimension size integer values
coords – a Python dictionary of file coordinates (NumPy arrays)
- Returns:
a Python dictionary of rearranged file coordinates (NumPy arrays)
- convert_dict_dtype(sample_dict)
Converts a dictionary of attributes from NumPy data types to general Python data types
- Parameters:
sample_dict – a Python dictionary of attributes
- Returns:
a Python dictionary of attributes
- get_array(data_group, fid_coords)
Returns an XArray DataArray of an HDF5 dataset given the data field subgroup contents and file-level coordinates.
- Parameters:
data_group – a Python dictionary of String keys and HDF5 dataset object values
fid_coords – a Python dictionary of file coordinates (NumPy arrays)
- Returns:
an XArray DataArray
- get_coords(fid)
Returns the coordinates of a file
- Parameters:
fid – a file reader object
- Returns:
a Python dictionary of String keys and NumPy array values
- get_data_group(fid)
Finds and returns the contents of the file data field subgroup in dictionary format
- Parameters:
fid – a file identifier object
- Returns:
a Python dictionary of dataset name String keys and dataset object values
- get_ds_attrs(ds)
Returns the attributes of an HDF5 dataset
- Parameters:
ds – an HDF5 dataset object
- Returns:
a Python dictionary of attributes
- get_ds_dims(ds, coords)
Returns the dimension names of a dataset
- Parameters:
ds – a dataset object
coords – a Python dictionary of file coordinates (NumPy arrays)
- Returns:
a Python dictionary of dimension name String keys and dimension size integer values
- get_fid()
Access the file reader object for a given HDF5 file
- Returns:
an h5py file reader object
- get_fid_attrs(fid)
Returns the file-level attributes (in Python data types)
- Parameters:
fid – a file reader object
- Returns:
a Python dictionary of attributes
- get_fill(ds_attrs)
Returns the fill value of a dataset
- Parameters:
ds_attrs – a Python dictionary of dataset attributes
- Returns:
an integer, float, or ‘None’
- get_findex(data_to_plot)
- get_offset(ds_attrs)
Returns the offset value of a dataset
- Parameters:
ds_attrs – a Python dictionary of dataset attributes
- Returns:
an integer, float, or 0
- get_plot_attrs(fid)
Returns the file plotting attributes (in Python data types)
- Parameters:
fid – a file reader object
- Returns:
a Python dictionary of attributes
- get_scale(ds_attrs)
Returns the scale factor of a dataset
- Parameters:
ds_attrs – a Python dictionary of dataset attributes
- Returns:
an integer, float, or ‘None’
- get_time(fid)
Returns the time at which the data was measured/acquired
- Parameters:
fid – a file reader object
- Returns:
a list containing a DateTime object
- process_file(fid)
- read_data(file_path: str) Any
Given an HDF5 file name, convert the data into an Xarray Dataset.
- Parameters:
file_path (str) – HDF5 file name containing OMI data
- Returns:
xr_ds
- Return type:
Xarray Dataset
- restore_data(ds)
Restores the data of a given dataset object
- Parameters:
ds – an HDF5 dataset object
- Returns:
a NumPy array
eviz.lib.data.netcdf4_reader module
- class eviz.lib.data.netcdf4_reader.NetCDFDataReader(source_name: str, datasets: list = <factory>, findex: int = 0)
Bases:
DataReader
- datasets: list
- findex: int = 0
- get_findex(data_to_plot)
- read_data(file_path: str) Any
Helper function to open and define a dataset
- Parameters:
fid (int) – file id (starts at 0)
file_path (str) – name of file associated with fid
- Returns:
dict with xarray dataset information
- Return type:
unzipped_data (xarray.Dataset)
eviz.lib.data.processor module
- class eviz.lib.data.processor.Interp(config: eviz.lib.autoviz.config.Config, data: List[Any])
Bases:
object
- data: List[Any]
- property logger: Logger
- regrid(pid)
Wrapper for regrid method
This function regrids two fields (if necessary)
- Parameters:
pid (str) – a plot identifier
- Returns:
Regridded fields
- class eviz.lib.data.processor.Overlays(config: Config, plot_type: str)
Bases:
object
Class that defines overlays
- Examples of overlays include:
specialized contours
specialized line plots
- Parameters:
config (Config) – Config object
data (list) – list of data files
- static get_attrs(data, key)
Get attributes associated with a key
- get_field(name, ds_index=0)
Extract field from xarray Dataset
- Parameters:
name (str) – name of field to extract from dataset
ds_index (int) – fid index associated with dataset containing field name
- Returns:
DataArray containing field data
- get_meta_attrs(data, key)
Get attributes associated with a key
- get_processed_data(field: str) Any | None
- property logger: Logger
- plot_type: str
- process_data()
Get tropopause field and apply to a given experiment name
- Parameters:
ds_meta (dict) – Dataset metadata
findex (int) – Dataset index (default=0, i.e. just one dataset)
- static process_difference(data1: numpy.ndarray, data2: numpy.ndarray) numpy.ndarray
- static process_scat(data: Any, field: str) numpy.ndarray
- static process_xy(data: Any, field: str) numpy.ndarray
- static process_yz(data: Any, field: str) numpy.ndarray
- sphum_field(ds_meta, findex=0)
Get specific humidity field and apply to a given experiment name
- Parameters:
ds_meta (dict) – Dataset metadata
findex (int) – Dataset index (default=0, i.e. just one dataset)
eviz.lib.data.reader module
- class eviz.lib.data.reader.DataReader(source_name: str)
Bases:
ABC
- static get_attrs(data, key)
Get attributes associated with a key
- get_dataset(i)
- get_datasets()
- get_field(name: str, ds_index: int)
Extract field from xarray Dataset
- Parameters:
name (str) – name of field to extract from dataset
ds_index (int) – index of dataset to extract from
- Returns:
DataArray containing field data
- property logger: Logger
- abstract read_data(file_path: str) Any
- source_name: str
- eviz.lib.data.reader.get_data_coords(data2d, attribute_name)
eviz.lib.data.tabular_reader module
- class eviz.lib.data.tabular_reader.CSVDataReader(source_name: str, file_path: str | None = None)
Bases:
DataReader
Class definitions for reading CSV data files.
- file_path: str = None
- read_data(file_path: str) Any
Reads CSV data files and returns its data as Pandas dataframe
- Returns:
a Pandas dataframe
eviz.lib.data.units module
- eviz.lib.data.units.Pa_to_hPa(Pa)
- eviz.lib.data.units.Pa_to_mb(Pa)
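The pressure helpers in this module are undocumented, but the unit relations they rely on are standard: 1 hPa = 100 Pa and 1 mb = 1 hPa. The sketch below restates those relations with hypothetical function names; it does not reproduce the eviz functions themselves.

```python
PA_PER_HPA = 100.0  # 1 hPa = 100 Pa; 1 mb is defined as 1 hPa


def pa_to_hpa_sketch(pa):
    """Pascals to hectopascals (numerically identical to millibars)."""
    return pa / PA_PER_HPA


def hpa_to_pa_sketch(hpa):
    """Hectopascals (or millibars) to pascals."""
    return hpa * PA_PER_HPA


print(pa_to_hpa_sketch(101325.0))
```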
- class eviz.lib.data.units.Units(config: Config)
Bases:
object
This class defines attributes and methods to perform unit conversions of xarray data arrays.
The conversion is automatic if the fields are registered in eviz’s species database and the units are supported. Otherwise, the conversion can be specified in eviz’s config files (APP and SPECS YAML files). Please see the user’s guide for more information.
- Parameters:
config (Config) – Representation of the model configuration used to specify data sources and user choices for the map generation. The config instance is created at the application level.
- airmass: float
- convert(data, species_name, to_unit)
Conversion method for non-chemical species (e.g. atmospheric fields)
- Parameters:
to_unit (str) – data destination unit
species_name (str) – species name of the data
data (xArray) – data to undergo unit conversion
- convert_chem(data, sp_name, to_unit, air_column_density=None, airmass=None)
Conversion method for chemical species
- Parameters:
to_unit (str) – data destination unit
sp_name (str) – species name of the data
data (xArray) – data to undergo unit conversion
- property logger: Logger
- species_db: dict
- eviz.lib.data.units.adjust_units(units)
Creates a consistent unit string that will be used in the unit conversion routines below.
- Parameters:
units (str) – Input unit string.
- Returns:
Output unit string, adjusted to a consistent value.
- Return type:
adjusted_units (str)
- Remarks:
The unit list is incomplete; it is currently geared to units from common model diagnostics (e.g. kg/m2/s, kg, and variants).
- eviz.lib.data.units.c_to_f(c)
Convert celsius to fahrenheit
- eviz.lib.data.units.c_to_k(c)
Convert celsius to kelvin
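The temperature helpers follow the standard formulas F = C × 9/5 + 32 and K = C + 273.15. The sketch below writes them out with hypothetical names, so it should be read as a restatement of the formulas rather than the eviz code.

```python
def c_to_f_sketch(celsius):
    """Celsius to Fahrenheit."""
    return celsius * 9.0 / 5.0 + 32.0


def c_to_k_sketch(celsius):
    """Celsius to kelvin."""
    return celsius + 273.15


def f_to_c_sketch(fahrenheit):
    """Fahrenheit to Celsius (inverse of c_to_f_sketch)."""
    return (fahrenheit - 32.0) * 5.0 / 9.0


print(c_to_f_sketch(100.0), c_to_k_sketch(0.0))
```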
- eviz.lib.data.units.calculate_total_area(field)
Calculates the total surface area
- Parameters:
field (xarray.DataArray) – The input data array.
- Returns:
The total area over which the data array is defined.
- Return type:
float
- eviz.lib.data.units.calculate_total_column(species, airmass, species_name)
Calculate the total column of a given species in mol/m², molecules/cm², and Dobson Units (DU).
- Parameters:
species (xarray.DataArray) – Mixing ratio of the species (in mol/mol)
airmass (xarray.DataArray) – Airmass (in kg/m²)
species_name (str) – Name of the species (for DU conversion)
- Returns:
total_column_mol_m2 (xarray.DataArray) – Total column of the species in mol/m²
total_column_molecules_cm2 (xarray.DataArray) – Total column of the species in molecules/cm²
total_column_du (xarray.DataArray) – Total column of the species in Dobson Units (DU)
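The arithmetic behind a total column can be sketched with plain Python lists. Several assumptions apply: airmass is taken to be a per-layer value in kg/m², the molar mass of dry air is taken as 0.0289644 kg/mol, and 1 DU is taken as 2.6867e16 molecules/cm². The function name is hypothetical, and the sketch omits the species_name argument that the real function accepts.

```python
AVOGADRO = 6.02214076e23     # molecules per mole
M_AIR = 0.0289644            # kg/mol, dry air (assumed value)
DU_IN_MOLEC_CM2 = 2.6867e16  # molecules/cm² per Dobson Unit


def total_column_sketch(mixing_ratio, airmass):
    """Sum per-layer moles of the species and convert to other column units.

    mixing_ratio: per-layer mol/mol values
    airmass: per-layer air mass in kg/m² (assumed layout)
    """
    # moles of air per m² in a layer = airmass / M_AIR; multiply by mol/mol.
    mol_m2 = sum(q * m / M_AIR for q, m in zip(mixing_ratio, airmass))
    molecules_cm2 = mol_m2 * AVOGADRO / 1.0e4  # 1 m² = 1e4 cm²
    dobson_units = molecules_cm2 / DU_IN_MOLEC_CM2
    return mol_m2, molecules_cm2, dobson_units


print(total_column_sketch([1e-6, 2e-6], [1000.0, 500.0]))
```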
- eviz.lib.data.units.calculate_total_mass(airmass)
Calculates the total mass given the airmass per square meter field.
- Parameters:
airmass (xarray.DataArray) – The input data array representing mass per square meter.
- Returns:
The total mass in kilograms.
- Return type:
float
- eviz.lib.data.units.check_units(ref_da, dev_da, enforce_units=True) bool
Ensures the units of two xarray DataArrays are the same.
- Parameters:
ref_da (xarray.DataArray) – First data array containing a units attribute.
dev_da (xarray.DataArray) – Second data array containing a units attribute.
- Keyword Args (optional):
enforce_units (bool) – Whether to stop the program if ref and dev units do not match (default: True)
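The comparison can be sketched with plain dictionaries standing in for the `attrs` of two DataArrays. The name `check_units_sketch` is hypothetical, and raising ValueError is an assumption about what "stop the program" means; the real function may fail differently.

```python
def check_units_sketch(ref_attrs, dev_attrs, enforce_units=True):
    """Return True if both attribute dicts carry the same 'units' string."""
    units_match = ref_attrs.get("units") == dev_attrs.get("units")
    if not units_match and enforce_units:
        # Assumed failure mode: raise instead of continuing silently.
        raise ValueError(
            f"units differ: {ref_attrs.get('units')!r} vs {dev_attrs.get('units')!r}"
        )
    return units_match


print(check_units_sketch({"units": "kg"}, {"units": "g"}, enforce_units=False))
```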
- eviz.lib.data.units.data_unit_is_mol_per_mol(da)
Check whether the units of an xarray DataArray are mol/mol, based on a fixed list of unit strings that denote mol/mol.
- Parameters:
da (xarray.DataArray) – Data array containing a units attribute
- Returns:
Whether input units are mol/mol
- Return type:
is_molmol (bool)
- eviz.lib.data.units.download_airmass(url)
Downloads airmass file
- Parameters:
url (str) – URL of the file
- Returns:
xArray dataset
- eviz.lib.data.units.du_to_mol(du, molar_mass_species, airmass)
Convert DU to molar fraction
- eviz.lib.data.units.f_to_c(f)
Convert fahrenheit to celsius
- eviz.lib.data.units.f_to_k(f)
Convert fahrenheit to kelvin
- eviz.lib.data.units.g_to_kg(gram)
Convert grams to kilograms
- eviz.lib.data.units.g_to_mg(gram)
Convert grams to mg
- eviz.lib.data.units.get_airmass(config)
Retrieves airmass field stored in a file or URL
- Parameters:
config – eviz config object
- Returns:
airmass field
- Return type:
xArray
- eviz.lib.data.units.get_species_name(species_name)
- eviz.lib.data.units.hPa_to_Pa(hPa)
- eviz.lib.data.units.hPa_to_mb(hPa)
- eviz.lib.data.units.k_to_c(k)
Convert kelvin to celsius
- eviz.lib.data.units.k_to_f(k)
Convert kelvin to fahrenheit
- eviz.lib.data.units.kg_to_du(kg_frac, molar_mass_species, airmass)
Convert kg to DU
- eviz.lib.data.units.kg_to_g(kilogram)
Convert kilograms to grams
- eviz.lib.data.units.kg_to_mg(kilogram)
Convert kilograms to mg
- eviz.lib.data.units.kg_to_mol(kg_frac, molar_mass_species)
Convert kg to molar fraction
- eviz.lib.data.units.kg_to_ppb(kg_frac, molar_mass_species)
Convert from kg kg⁻¹ to parts per billion (ppb).
- Parameters:
kg_frac (dataarray values) – mass fraction of the species (kg kg⁻¹)
molar_mass_species – molar mass of the species
- Returns:
concentration in parts per billion (ppb)
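Converting a mass fraction to a mole-based ppb value scales by the ratio of molar masses. The sketch below assumes dry air at 28.9644 g/mol and a molar mass given in g/mol; the function name is hypothetical, and the constants and units used by eviz are not stated here.

```python
M_AIR_G_MOL = 28.9644  # g/mol, dry air (assumed value)


def kg_to_ppb_sketch(kg_frac, molar_mass_species):
    """Mass fraction (kg/kg) to parts per billion by mole.

    molar_mass_species is assumed to be in g/mol.
    """
    mol_frac = kg_frac * M_AIR_G_MOL / molar_mass_species
    return mol_frac * 1.0e9


# A mass fraction of 1e-7 for a species with molar mass 48 g/mol
print(kg_to_ppb_sketch(1.0e-7, 48.0))
```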
- eviz.lib.data.units.logger = <Logger eviz.lib.data.units (WARNING)>
Contains methods for converting the units of data. Some functions are adopted from GCpy - with minor modifications
- eviz.lib.data.units.mass_to_moles(g_of_element, molar_mass)
Convert mass of an element to moles
- eviz.lib.data.units.mb_to_Pa(mb)
- eviz.lib.data.units.mb_to_hPa(mb)
- eviz.lib.data.units.mg_to_g(mgram)
Convert mg to grams
- eviz.lib.data.units.mg_to_kg(mgram)
Convert mg to kilograms
- eviz.lib.data.units.mol_to_du(mol_frac, molar_mass_species, airmass)
Convert molar fraction to DU
- eviz.lib.data.units.mol_to_kg(mol_frac, molar_mass_species)
Convert molar fraction to kg
- eviz.lib.data.units.mol_to_molecules_cm2(mol_frac, air_column_density)
Convert from mol mol⁻¹ to molecules cm⁻².
- Parameters:
mol_frac – molar fraction of the species (mol mol⁻¹)
air_column_density – air column density (molecules cm⁻²)
- Returns:
number of molecules per cm²
- eviz.lib.data.units.mol_to_ppb(mol_frac)
Convert from mol mol⁻¹ to parts per billion (ppb).
- Parameters:
mol_frac (dataarray values) – molar fraction of the species (mol mol⁻¹)
- Returns:
concentration in parts per billion (ppb)
- eviz.lib.data.units.moles_to_mass(num_moles, molar_mass)
Convert moles to mass
- eviz.lib.data.units.ppb_to_mol(ppb)
Convert from parts per billion (ppb) to mol mol⁻¹.
- Parameters:
ppb (dataarray values) – concentration in parts per billion
- Returns:
molar fraction of the species (mol mol⁻¹)
- Return type:
mol_frac
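The ppb conversions reduce to a factor of 10⁹ in either direction. A minimal sketch with hypothetical function names, operating on plain floats rather than data arrays:

```python
PPB_PER_MOL_FRAC = 1.0e9  # parts per billion per unit mole fraction


def mol_to_ppb_sketch(mol_frac):
    """Mole fraction (mol/mol) to parts per billion."""
    return mol_frac * PPB_PER_MOL_FRAC


def ppb_to_mol_sketch(ppb):
    """Parts per billion to mole fraction (mol/mol)."""
    return ppb / PPB_PER_MOL_FRAC


print(mol_to_ppb_sketch(4.0e-8))
```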