Code Documentation

class ProcessOBS.MJOobsProcessor(yaml_file_path, scaling_dict=None)

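Processes observational data to compute the EOFs, principal components, and RMM indices of the observed MJO, using settings read from a YAML configuration file.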

check_MJO_orientation(eof_list, pcs, lons)

Check the orientation of MJO’s first two Empirical Orthogonal Functions (EOFs).

Parameters:
  • eof_list (list): List of empirical orthogonal functions (EOFs).
  • pcs (xarray.Dataset): Xarray dataset containing the principal components (PCs).
  • lons (array_like): Longitudes.
Returns:
  • loc1 (int): Index of the first dominant EOF.
  • loc2 (int): Index of the second dominant EOF.
  • scale1 (int): Scaling factor for the first dominant EOF.
  • scale2 (int): Scaling factor for the second dominant EOF.
check_obs_eofs()

Checks the correlation between observed dataset EOF values and ERA5 EOF values for specific modes.

This function opens an observed dataset file, interpolates the data, and calculates the correlation coefficients between specific EOF modes of the observed dataset and corresponding ERA5 EOF values. It compares the calculated correlations to a predefined threshold and raises an assertion error if any correlation is below the threshold.

Returns:
bool: True if all correlations are above the threshold, indicating a high correlation between observed and ERA5 EOF values.
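The threshold check described above can be sketched as follows. This is a minimal illustration on synthetic arrays, not the package's implementation: the function name `eofs_match`, the threshold value 0.9, and the choice of modes are all assumptions made for the example.

```python
import numpy as np


def eofs_match(obs_eofs, ref_eofs, modes=(0, 1), threshold=0.9):
    """Return True if every selected mode correlates at or above `threshold`.

    obs_eofs, ref_eofs: arrays of shape (n_modes, n_points) on the same grid.
    The threshold and the mode selection are hypothetical defaults.
    """
    for mode in modes:
        # Pearson correlation between the two spatial patterns
        r = np.corrcoef(obs_eofs[mode], ref_eofs[mode])[0, 1]
        if r < threshold:
            return False
    return True


# Synthetic patterns standing in for observed and reference EOFs
x = np.linspace(0.0, 2.0 * np.pi, 144, endpoint=False)
ref = np.stack([np.sin(x), np.cos(x)])
obs = np.stack([np.sin(x), np.cos(x) + 0.05 * np.sin(3 * x)])
print(eofs_match(obs, ref))
```

The documented method raises an assertion error on failure; wrapping the call as `assert eofs_match(obs, ref)` would reproduce that behavior in this sketch.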
get_phase_and_eofs(eof_list, pcs, lons)

Calculate MJO phase and related EOFs.

Parameters:
  • eof_list (list): List of empirical orthogonal functions (EOFs).
  • pcs (xarray.Dataset): Xarray dataset containing the principal components (PCs).
  • lons (array_like): Longitudes.
Returns:
tot_dict (dict): Dictionary containing MJO phase, RMM indices, and EOFs.
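For readers unfamiliar with how an MJO phase is derived from the two RMM indices, here is a minimal sketch using the standard Wheeler-Hendon phase diagram convention (eight 45-degree sectors, counted counterclockwise starting from the negative RMM1 axis). The function name is hypothetical, and whether `get_phase_and_eofs` computes the phase in exactly this way is an assumption.

```python
import numpy as np


def rmm_phase_and_amplitude(rmm1, rmm2):
    """Return (phase, amplitude) for arrays of RMM1 and RMM2 values.

    Phase is an integer from 1 to 8; amplitude is sqrt(RMM1^2 + RMM2^2).
    """
    rmm1 = np.asarray(rmm1, dtype=float)
    rmm2 = np.asarray(rmm2, dtype=float)
    amplitude = np.hypot(rmm1, rmm2)
    # Angle in degrees in the interval (-180, 180]
    angle = np.degrees(np.arctan2(rmm2, rmm1))
    # Sector index 0..7 counted counterclockwise from the negative RMM1 axis;
    # the modulo folds an angle of exactly 180 degrees back into sector 0
    phase = (np.floor((angle + 180.0) / 45.0).astype(int) % 8) + 1
    return phase, amplitude


phase, amplitude = rmm_phase_and_amplitude(
    [-1.0, 0.5, 1.0, -0.5], [-0.5, -1.0, 0.5, 1.0]
)
print(phase)  # phases 1, 3, 5, 7
```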
make_observed_MJO()

Generate observed MJO indices using observational data.

Parameters:
  • yml_data (dict): Dictionary containing YAML configuration data.
  • lons_forecast (array_like): Longitudes of the forecast data.
Returns:
  • OBS_DS (xarray.Dataset): Xarray dataset containing the observed data.
  • eof_list (list): List of empirical orthogonal functions (EOFs).
  • pcs (xarray.Dataset): Xarray dataset containing the principal components (PCs).
  • MJO_fobs (xarray.Dataset): Xarray dataset with observed MJO indices.
pass_eof_scaling_factors(scaling_dictionary=None)

Sets the scaling factors for MJO observation EOF modes based on a user-defined dictionary.

This function lets users supply a dictionary of scaling factors for the EOF modes. If no dictionary is provided, default scaling factors are determined from the observations.

Parameters:
scaling_dictionary (dict, optional): A dictionary containing scaling factors for EOF modes. It must include the keys ‘loc1’, ‘loc2’, ‘scale1’, and ‘scale2’: the index (0 or 1) of each EOF and the scaling factor (1 or -1) applied to EOF modes 1 and 2 in the Wheeler-Hendon (WH) RMM calculation.
Raises:
KeyError: If the provided dictionary is missing any of the required keys.
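A minimal sketch of the key check and of how such a dictionary might be applied. The helper names `validate_scaling` and `apply_scaling` are hypothetical, and the value checks (raising `ValueError`) go beyond what the documentation states; only the `KeyError` on missing keys is documented.

```python
REQUIRED_KEYS = ("loc1", "loc2", "scale1", "scale2")


def validate_scaling(scaling):
    """Raise KeyError if a required key is missing (documented behavior)."""
    missing = [key for key in REQUIRED_KEYS if key not in scaling]
    if missing:
        raise KeyError(f"scaling dictionary is missing keys: {missing}")
    # Extra value checks: an assumption, not documented behavior
    for key in ("loc1", "loc2"):
        if scaling[key] not in (0, 1):
            raise ValueError(f"{key} must be 0 or 1")
    for key in ("scale1", "scale2"):
        if scaling[key] not in (1, -1):
            raise ValueError(f"{key} must be 1 or -1")


def apply_scaling(pcs, scaling):
    """Select and sign-flip the two leading PCs (hypothetical helper).

    pcs: sequence of two sequences of floats, one per EOF mode.
    """
    validate_scaling(scaling)
    first = [scaling["scale1"] * value for value in pcs[scaling["loc1"]]]
    second = [scaling["scale2"] * value for value in pcs[scaling["loc2"]]]
    return first, second


example = {"loc1": 1, "loc2": 0, "scale1": -1, "scale2": 1}
print(apply_scaling([[1.0, 2.0], [3.0, 4.0]], example))
```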
plot_eof(ax)

Customize the EOF plot axis.

Parameters:
ax (matplotlib.axes.Axes): The axis of the plot to customize.
Returns:
ax (matplotlib.axes.Axes): The customized axis of the plot.
plot_obs_eof(eof_list, pcs, varfrac, lons)

Plot the spatial structures of EOF1, EOF2, and EOF3.

Parameters:
  • eof_list (list): List of EOF arrays for OLR, U850, and U200.
  • pcs (numpy.array): Array containing principal components.
  • varfrac (numpy.array): Array containing the variance fraction values for each EOF.
  • lons (numpy.array): Array of longitudes.
Returns:
None
plot_phase_space(date_start, days_forward)

Plot MJO RMM phase space with a scatter plot indicating the progression of time.

Parameters:
  • date_start (str or pandas.Timestamp): Starting date for the plot.
  • days_forward (int): Number of days to project forward.
Raises:
RuntimeError: If ‘make_observed_MJO’ hasn’t been executed to create observational MJO data.
plot_varfrac(varfrac)

Plot the fraction of the total variance represented by each EOF.

Parameters:
varfrac (numpy.array): Array containing the variance fraction values for each EOF.
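To make the meaning of `varfrac` concrete, the sketch below computes variance fractions from a singular value decomposition of an anomaly matrix, which is one standard way to obtain them in EOF analysis. The function name is hypothetical and the package may compute these values with a different routine.

```python
import numpy as np


def variance_fractions(data):
    """Fraction of total variance explained by each EOF, in descending order.

    data: array of shape (n_time, n_space).
    """
    anomalies = data - data.mean(axis=0)  # remove the time mean at each point
    singular_values = np.linalg.svd(anomalies, compute_uv=False)
    eigenvalues = singular_values ** 2  # proportional to variance per mode
    return eigenvalues / eigenvalues.sum()


rng = np.random.default_rng(0)
sample = rng.normal(size=(50, 6))  # synthetic data, for illustration only
varfrac_example = variance_fractions(sample)
print(varfrac_example)
```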
save_out_obs(tot_dict, u200, u850, olr)

Save the observed MJO dataset to a NetCDF file.

Parameters:
  • tot_dict (dict): Dictionary containing MJO phase, RMM indices, and EOFs.
  • u200 (xarray.DataArray): Xarray DataArray containing normalized u200 data.
  • u850 (xarray.DataArray): Xarray DataArray containing normalized u850 data.
  • olr (xarray.DataArray): Xarray DataArray containing normalized OLR data.
Returns:
MJO_fobs (xarray.Dataset): Xarray dataset containing the observed MJO data.
class ProcessForecasts.MJOforecaster(yaml_file_path, eof_dict, MJO_fobs)

A class for forecasting the Madden-Julian Oscillation (MJO) using given parameters.

This class provides methods for forecasting the MJO based on user-defined configurations.

Parameters:
  • yaml_file_path (str): The path to the YAML configuration file.
  • eof_dict (dict): A dictionary containing Empirical Orthogonal Function (EOF) configurations.
  • MJO_fobs (xarray.Dataset): Observed MJO data as an xarray Dataset.
Attributes:
  • yml_data (dict): Parsed data from the YAML configuration file.
  • yml_usr_info (dict): User-defined information from the YAML configuration.
  • forecast_lons (list): List of longitudes used for forecasting.
  • base_dir (str): The base directory specified in the YAML configuration.
  • eof_dict (dict): Dictionary containing EOF configurations.
  • MJO_fobs (xarray.Dataset): Observed MJO data.
  • made_forecast_file (bool): Indicates if a forecast file has been created.
anomaly_ERA5(yml_data, DS_CESM_for, DS_climo_forecast, numdays_out)

Calculate anomalies of u850, u200, and OLR from forecast and ERA5 climatology datasets.

Parameters:
  • yml_data (dict): A dictionary containing user-defined information.
  • DS_CESM_for (xr.Dataset): Forecast dataset (CESM format).
  • DS_climo_forecast (xr.Dataset): ERA5 climatology dataset.
  • numdays_out (int): Number of forecast days.
Returns:
  • U850_cesm_anom (xr.DataArray): Anomalies of u850.
  • U200_cesm_anom (xr.DataArray): Anomalies of u200.
  • OLR_cesm_anom (xr.DataArray): Anomalies of OLR.
anomaly_LTD(yml_data, DS_CESM_for, DS_climo_forecast, numdays_out)

Calculate anomalies of u850, u200, and OLR from forecast and forecast climatology datasets.

Parameters:
  • yml_data (dict): A dictionary containing user-defined information.
  • DS_CESM_for (xr.Dataset): Forecast dataset (CESM format).
  • DS_climo_forecast (xr.Dataset): Forecast climatology dataset.
  • numdays_out (int): Number of forecast days.
Returns:
  • U850_cesm_anom (xr.DataArray): Anomalies of u850.
  • U200_cesm_anom (xr.DataArray): Anomalies of u200.
  • OLR_cesm_anom (xr.DataArray): Anomalies of OLR.
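Both anomaly methods subtract a climatology from the forecast. A minimal sketch of that step with plain NumPy arrays follows; the function name, the array layout (lead day, latitude, longitude, with an optional leading ensemble axis), and the assumption that the climatology is already aligned by lead day are all illustrative choices, not the package's actual data handling.

```python
import numpy as np


def forecast_anomaly(forecast, climatology, numdays_out):
    """Subtract a lead-day climatology from a forecast.

    forecast: (lead, lat, lon) or (ensemble, lead, lat, lon)
    climatology: (lead, lat, lon), aligned with the forecast lead days
    numdays_out: number of forecast days to keep
    """
    fc = np.asarray(forecast, dtype=float)[..., :numdays_out, :, :]
    clim = np.asarray(climatology, dtype=float)[:numdays_out]
    # Broadcasting applies the same climatology to every ensemble member
    return fc - clim


clim = np.arange(5 * 3 * 4, dtype=float).reshape(5, 3, 4)
members = np.stack([clim + 1.0, clim - 2.0])  # two synthetic ensemble members
anom = forecast_anomaly(members, clim, numdays_out=4)
print(anom.shape)
```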
check_forecast_files_runtime(for_file_list, yml_usr_info)

Check the forecast files for required variables and ensemble dimension.

Parameters:
  • for_file_list (str): File path of the forecast file to be checked.
  • yml_usr_info (dict): Dictionary containing user settings from YAML file.
Returns:
  • Bingo (bool): True if the required variables are present, False otherwise.
  • DS (xarray.Dataset): Updated dataset with added ‘ensemble’ dimension (if required).
create_forecasts(num_files=None)

Create forecast files for a given range of latitude and settings.

This function generates forecast files based on provided configurations, including latitude range, filtering parameters, and EOF settings. It processes each forecast file and saves the forecasted data into netCDF files. This function also performs data manipulation and filtering operations.

Parameters:
num_files (int, optional): The maximum number of forecast files to process. If None, process all files.
Returns:
  • DS_CESM_for (xr.Dataset): Forecast dataset containing processed forecasted data.
  • OLR_cesm_anom_filterd (numpy.ndarray): Filtered OLR anomalies.
  • U200_cesm_anom_filterd (numpy.ndarray): Filtered u200 anomalies.
  • U850_cesm_anom_filterd (numpy.ndarray): Filtered u850 anomalies.
filt_ndays(yml_data, DS_CESM_for, U850_cesm_anom, U200_cesm_anom, OLR_cesm_anom, DS_climo_forecast, numdays_out, AvgdayN, nensembs)

Perform anomaly filtering for atmospheric variables.

Parameters:
  • yml_data (dict): YAML data containing user-defined information.
  • DS_CESM_for (xr.Dataset): Forecast dataset (CESM format).
  • U850_cesm_anom (xarray.Dataset): Anomaly dataset for U850 variable.
  • U200_cesm_anom (xarray.Dataset): Anomaly dataset for U200 variable.
  • OLR_cesm_anom (xarray.Dataset): Anomaly dataset for OLR variable.
  • DS_climo_forecast (xr.Dataset): Climatology dataset (ERA5 or forecast climatology).
  • numdays_out (int): Number of forecast days.
  • AvgdayN (int): The number of days to be averaged for filtering.
  • nensembs (int): Number of ensemble members.
Returns:
Updated U850_cesm_anom_filterd, U200_cesm_anom_filterd, OLR_cesm_anom_filterd datasets.
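The documentation does not spell out the filter itself. One common choice in Wheeler-Hendon style real-time processing is to subtract the mean of the preceding N days from each day, and the sketch below illustrates that idea only. That `filt_ndays` works this way is an assumption, as are the function name and the split into a pre-initialization `history` array and a `forecast` array.

```python
import numpy as np


def remove_trailing_mean(history, forecast, avg_days):
    """Subtract from each forecast day the mean of the `avg_days` days before it.

    history: (n_hist, ...) anomalies preceding the forecast start
    forecast: (n_lead, ...) forecast anomalies
    """
    history = np.asarray(history, dtype=float)
    forecast = np.asarray(forecast, dtype=float)
    if history.shape[0] < avg_days:
        raise ValueError("history must cover at least avg_days days")
    series = np.concatenate([history, forecast], axis=0)
    n_hist = history.shape[0]
    filtered = np.empty_like(forecast)
    for lead in range(forecast.shape[0]):
        t = n_hist + lead
        # Mean over the avg_days days strictly before day t
        filtered[lead] = series[t] - series[t - avg_days:t].mean(axis=0)
    return filtered


filtered_example = remove_trailing_mean(np.ones(5), np.array([1.0, 1.0, 4.0]), 2)
print(filtered_example)
```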
get_forecast_LT_climo(yml_data, lons_forecast)

Get the forecast climatology dataset.

Parameters:
yml_data (dict): A dictionary containing user-defined information. lons_forecast (array-like): Array of forecast longitudes.
Returns:
DS_climo_forecast (xr.Dataset): Forecast climatology dataset.
plot_phase_space(Num_Ensembles, Lead)

Plot MJO RMM Ensemble phase space with a scatter plot indicating the progression of time.

Parameters:
  • Num_Ensembles: Number of ensemble members.
  • Lead: Forecast lead.
Raises:
RuntimeError: If ‘make_observed_MJO’ hasn’t been executed to create observational MJO data.
project_eofs(OLR_cesm_anom_filterd, U850_cesm_anom_filterd, U200_cesm_anom_filterd, numdays_out, nensembs, neofs_save, neof, eof_dict, svname, U200_cesm_anom)

Calculate and save RMM indices and EOFs.

Parameters:
  • OLR_cesm_anom_filterd (xarray.Dataset): Anomaly dataset for OLR variable.
  • U850_cesm_anom_filterd (xarray.Dataset): Anomaly dataset for U850 variable.
  • U200_cesm_anom_filterd (xarray.Dataset): Anomaly dataset for U200 variable.
  • numdays_out (int): Number of days to project.
  • nensembs (int): Number of ensemble members.
  • neofs_save (int): Number of EOFs to save.
  • neof (int): Number of EOFs to use for RMM calculation.
  • eof_dict (dict): Dictionary containing normalization factors and other parameters.
  • svname (str): Name of the netCDF file to save.
  • U200_cesm_anom: u200 anomalies.
Returns:
  • RMM1 (numpy.ndarray): RMM index 1.
  • RMM2 (numpy.ndarray): RMM index 2.
  • eofs_save (numpy.ndarray): Array of EOFs.
  • sv_olr (numpy.ndarray): Scaled and normalized OLR data.
  • sv_u200 (numpy.ndarray): Scaled and normalized U200 data.
  • sv_u850 (numpy.ndarray): Scaled and normalized U850 data.
  • sv_olr_unscaled (numpy.ndarray): Unscaled OLR data.
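The projection step can be illustrated as follows: each field is divided by a normalization factor, the three fields are concatenated along longitude, and the result is projected onto the EOFs. This is a sketch under stated assumptions. The function name, the argument layout, the OLR/U850/U200 ordering, and the `field_norms` and `pc_norms` inputs are hypothetical stand-ins for whatever `eof_dict` actually holds.

```python
import numpy as np


def project_onto_eofs(olr, u850, u200, eofs, field_norms, pc_norms):
    """Project normalized, latitude-averaged anomalies onto EOFs.

    olr, u850, u200: arrays of shape (n_time, n_lon)
    eofs: array of shape (n_eof, 3 * n_lon)
    field_norms: three scalars used to normalize OLR, U850, and U200
    pc_norms: array of shape (n_eof,) used to normalize the projections
    """
    combined = np.concatenate(
        [olr / field_norms[0], u850 / field_norms[1], u200 / field_norms[2]],
        axis=1,
    )
    pcs = combined @ eofs.T  # shape (n_time, n_eof)
    return pcs / np.asarray(pc_norms, dtype=float)


n_lon = 2
toy_eofs = np.eye(3 * n_lon)[:2]  # two trivially orthonormal patterns
olr_toy = np.array([[2.0, 3.0]])
zeros = np.zeros((1, n_lon))
rmm = project_onto_eofs(olr_toy, zeros, zeros, toy_eofs, (1.0, 1.0, 1.0), [2.0, 3.0])
print(rmm)
```

With real data, the first two columns of the result would play the role of RMM1 and RMM2 after any sign and ordering adjustments.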
save_out_forecast_nc(RMM1, RMM2, RMM1_emean, RMM2_emean, RMM1_obs_cera20c, RMM2_obs_cera20c, eofs_save, MJO_fobs, sv_olr, sv_u200, sv_u850, eof_dict, neofs_save, OLR_cesm_anom_filterd_latmean, svname, U200_cesm_anom, U200_cesm_anom_filterd)

Save forecasted MJO data to a netCDF file and set attribute information.

This function saves forecasted MJO-related data into a netCDF file and assigns attribute information for better metadata representation.

Parameters:
  • RMM1 (numpy.ndarray): Array containing forecasted RMM1 data.
  • RMM2 (numpy.ndarray): Array containing forecasted RMM2 data.
  • RMM1_emean (numpy.ndarray): Array containing forecasted RMM1 ensemble mean data.
  • RMM2_emean (numpy.ndarray): Array containing forecasted RMM2 ensemble mean data.
  • RMM1_obs_cera20c (numpy.ndarray): Array containing observed RMM1 data (CERA-20C).
  • RMM2_obs_cera20c (numpy.ndarray): Array containing observed RMM2 data (CERA-20C).
  • eofs_save (numpy.ndarray): Array containing saved EOFs data.
  • MJO_fobs (xarray.Dataset): Observed MJO data as an xarray Dataset.
  • sv_olr (numpy.ndarray): Array containing saved normalized OLR data.
  • sv_u200 (numpy.ndarray): Array containing saved normalized u200 data.
  • sv_u850 (numpy.ndarray): Array containing saved normalized u850 data.
  • eof_dict (dict): Dictionary containing EOF configurations.
  • neofs_save (int): Number of EOFs to save.
  • OLR_cesm_anom_filterd_latmean (xarray.DataArray): Filtered and latitude-mean OLR data.
  • svname (str): Name of the netCDF file to save.
  • U200_cesm_anom (numpy.ndarray): Array containing u200 anomalies data (CESM2).
  • U200_cesm_anom_filterd (numpy.ndarray): Array containing filtered u200 anomalies data (CESM2).
ProcessForecasts.make_DF_ense(files)

Create a DataFrame with files and their corresponding initialization dates.

Parameters:
files (list): List of file paths.
Returns:
DF (pd.DataFrame): DataFrame containing files and their initialization dates.
WHtools.Create_Driver_Yaml(filename, user, base_dir, use_era5, usr_named_obs, obs_data_loc, forecast_data_loc, forecast_data_name_str, forecast_olr_name, forecast_u200_name, forecast_u850_name, forecast_ensemble_dimension_name, output_plot_loc, output_files_loc, output_files_string, use_forecast_climo, use_observed_climo, regenerate_climo, use_dask_for_climo)

Create a YAML configuration file from the provided parameters.

Parameters:
  • filename (str): The name of the YAML file to be generated.
  • user (str): Your username.
  • base_dir (str): The base directory.
  • use_era5 (bool): Whether to use ERA5 data.
  • usr_named_obs (str): Alternative name for observations file.
  • obs_data_loc (str): Location of observation data.
  • forecast_data_loc (str): Location of forecast data.
  • forecast_data_name_str (str): String pattern for forecast data filenames.
  • forecast_olr_name (str): Name of OLR forecast variable.
  • forecast_u200_name (str): Name of 200mb wind forecast variable.
  • forecast_u850_name (str): Name of 850mb wind forecast variable.
  • forecast_ensemble_dimension_name (str): Name of the ensemble dimension.
  • output_plot_loc (str): Location for output plots.
  • output_files_loc (str): Location for output files.
  • output_files_string (str): String for naming MJO forecast files.
  • use_forecast_climo (bool): Whether to use forecast climatology.
  • use_observed_climo (bool): Whether to use observed climatology.
  • regenerate_climo (bool): Whether to regenerate climatology.
  • use_dask_for_climo (bool): Whether to use Dask for climatology processing.

WHtools.check_forecast_files(for_file_list, yml_usr_info)

Check the forecast files for required variables and ensemble dimension.

Parameters:
  • for_file_list (str): File path of the forecast file to be checked.
  • yml_usr_info (dict): Dictionary containing user settings from YAML file.
Returns:
  • Bingo (bool): True if the required variables are present, False otherwise.
  • DS (xarray.Dataset): Updated dataset with added ‘ensemble’ dimension (if required).
WHtools.check_lat_lon_coords(data)

Check if either “latitude/longitude” or “lat/lon” are in the coordinates of the xarray dataset.

Parameters:
data (xr.Dataset): Input xarray dataset.
Returns:
has_lat_lon_coords (bool): True if either “latitude/longitude” or “lat/lon” are in the coordinates, False otherwise.
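The check amounts to a test on coordinate names. A minimal sketch operating on a plain collection of names follows; the function name is hypothetical, and with an xarray object the names would come from `data.coords`.

```python
def has_lat_lon(coord_names):
    """Return True if the names include latitude/longitude or lat/lon."""
    names = set(coord_names)
    return {"latitude", "longitude"} <= names or {"lat", "lon"} <= names


print(has_lat_lon(["time", "lat", "lon"]))
print(has_lat_lon(["time", "latitude", "lon"]))
```

In this sketch a mixed pair such as `latitude` with `lon` does not count as a match; whether the package treats mixed names the same way is not stated in the documentation.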
WHtools.check_or_create_paths(yml_data)

Check the paths and files required for the forecast data.

Parameters:
yml_data (dict): Dictionary containing information from the YAML file.
Returns:
DS (xarray.Dataset): Xarray dataset of the forecast data.
WHtools.flip_lat_if_necessary(data)

Check the orientation of the latitude dimension in an xarray dataset and flip it if necessary.

Parameters:
data (xr.Dataset): Input xarray dataset.
Returns:
data_flipped (xr.Dataset): Flipped xarray dataset, if necessary.
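A minimal sketch of a latitude flip on NumPy arrays. The documentation does not say which orientation the package expects, so the target used here (latitude ascending, south to north) is an assumption, as are the function name and the (lat, lon) array layout.

```python
import numpy as np


def flip_lat_to_ascending(lat, field):
    """Return (lat, field) with latitude ascending.

    lat: 1-D array of latitudes
    field: array whose first axis corresponds to latitude
    """
    lat = np.asarray(lat)
    field = np.asarray(field)
    if lat[0] > lat[-1]:
        # Reverse the latitude axis of both the coordinate and the data
        return lat[::-1], field[::-1, ...]
    return lat, field


lat_desc = np.array([15.0, 0.0, -15.0])
field_desc = np.array([[1, 2], [3, 4], [5, 6]])
lat_asc, field_asc = flip_lat_to_ascending(lat_desc, field_desc)
print(lat_asc)
```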
WHtools.func1(arg1, arg2)

This function takes two arguments, setting the first to the second.

Parameters:
  • arg1
  • arg2
Returns:

WHtools.interpolate_obs(OBS_DS, lons_forecast)

Interpolate observed data to match forecast longitudes.

Parameters:
  • OBS_DS (xarray.Dataset): Xarray dataset containing the observed data.
  • lons_forecast (array_like): Longitudes of the forecast data.
Returns:
OBS_DS (xarray.Dataset): Interpolated xarray dataset of the observed data.
WHtools.plot_phase_space(ax)

Plots a phase space diagram with annotations and labels.

Parameters:
ax (matplotlib.axes._axes.Axes): The matplotlib axes on which to plot the diagram.
Returns:
matplotlib.axes._axes.Axes: The axes with the plotted phase space diagram.

WHtools.switch_lon_to_0_360(data)

Check if the longitude values in the xarray dataset are in the range -180 to 180 degrees, and switch them to the range 0 to 360 degrees if needed.

Parameters:
data (xr.Dataset or xr.DataArray): Input xarray dataset or data array.
Returns:
data_with_lon_0_360 (xr.Dataset or xr.DataArray): Xarray dataset or data array with longitude values in the range 0 to 360 degrees.
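The longitude conversion can be sketched with NumPy as a modulo followed by a re-sort. The function name and array layout are hypothetical, and re-sorting the data so that longitudes stay ascending is an assumption about what the package does after the conversion.

```python
import numpy as np


def to_0_360(lon, field):
    """Convert longitudes from [-180, 180) to [0, 360) and keep them ascending.

    lon: 1-D array of longitudes
    field: array whose last axis corresponds to longitude
    """
    lon = np.asarray(lon, dtype=float)
    field = np.asarray(field)
    if lon.min() >= 0.0:
        return lon, field  # already in the 0 to 360 convention
    wrapped = np.mod(lon, 360.0)
    order = np.argsort(wrapped)
    return wrapped[order], field[..., order]


lon_in = np.array([-180.0, -90.0, 0.0, 90.0])
field_in = np.array([10, 20, 30, 40])
lon_out, field_out = to_0_360(lon_in, field_in)
print(lon_out, field_out)  # [  0.  90. 180. 270.] [30 40 10 20]
```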