A climate change scenario-building analysis framework, built with xclim/xarray.
xscen
A climate change scenario-building analysis framework, built with Intake-esm catalogs and xarray-based packages such as xclim and xESMF.
For documentation concerning xscen, see: https://xscen.readthedocs.io/en/latest/
Features
Supports workflows with YAML configuration files for better transparency, reproducibility, and long-term backups.
Intake-esm-based catalog to find and manage climate data.
Climate dataset extraction, subsetting, and temporal aggregation.
Calculate missing variables through Intake-esm’s DerivedVariableRegistry.
Regridding with xESMF.
Bias adjustment with xclim.
Installation
Please refer to the installation docs.
Acknowledgments
This package was created with Cookiecutter and the Ouranosinc/cookiecutter-pypackage project template.
History
v0.6.0 (2023-05-04)
Contributors to this version: Trevor James Smith (@Zeitsperre), Juliette Lavoie (@juliettelavoie), Pascal Bourgault (@aulemahal), Gabriel Rondeau-Genesse (@RondeauG).
Announcements
xscen is now offered as a conda package available through Anaconda.org. Refer to the installation documentation for more information. (GH/149, PR/171).
Deprecation: Release 0.6.0 of xscen will be the last version to support xscen.extract.clisops_subset. Use xscen.spatial.subset instead. (PR/182, PR/184).
Deprecation: The argument region, used in multiple functions, has been slightly reformatted. Release 0.6.0 of xscen will be the last version to support the old format. (GH/99, GH/101, PR/184).
New features and enhancements
Support for computing anomalies in compute_deltas. (PR/165).
Add function diagnostics.measures_improvement_2d. (PR/167).
Add function regrid.create_bounds_rotated_pole and automatic use in regrid_dataset and spatial_mean. This is temporary, while we wait for a functioning method in cf_xarray. (PR/174, GH/96).
Add spatial submodule with functions creep_weights and creep_fill for filling NaNs using neighbours. (PR/174).
Allow passing GeoDataFrame instances in spatial_mean’s region argument, not only geospatial file paths. (PR/174).
Allow searching for periods in catalog.search. (GH/123, PR/170).
Allow searching and extracting multiple frequencies for a given variable. (GH/168, PR/170).
New function xs.spatial.subset to replace xs.extract.clisops_subset and add method “sel”. (GH/180, PR/182).
Add long_name attribute to diagnostics. (PR/189).
New utils.standardize_periods to standardize that argument across multiple functions. (GH/87, PR/192).
New coverage_kwargs argument added to search_data_catalogs to allow modifying the default values of subset_file_coverage. (GH/87, PR/192).
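The idea of filling NaNs using neighbours, mentioned above for the spatial submodule, can be illustrated with a minimal standard-library sketch. This is not the algorithm used by creep_weights or creep_fill: the function name fill_nan_with_neighbours, the 8-neighbour window, and the unweighted mean are all assumptions made for illustration.

```python
import math


def fill_nan_with_neighbours(grid):
    """Return a copy of a 2D list where each NaN becomes the mean of its valid 8-neighbours.

    Hypothetical helper for illustration only; it is not part of xscen.
    """
    n_rows, n_cols = len(grid), len(grid[0])
    filled = [row[:] for row in grid]
    for i in range(n_rows):
        for j in range(n_cols):
            if not math.isnan(grid[i][j]):
                continue
            # Collect the valid values in the 3x3 window around (i, j).
            neighbours = [
                grid[a][b]
                for a in range(max(i - 1, 0), min(i + 2, n_rows))
                for b in range(max(j - 1, 0), min(j + 2, n_cols))
                if (a, b) != (i, j) and not math.isnan(grid[a][b])
            ]
            if neighbours:  # leave the NaN in place when no valid neighbour exists
                filled[i][j] = sum(neighbours) / len(neighbours)
    return filled


grid = [
    [1.0, 2.0, 3.0],
    [4.0, float("nan"), 6.0],
    [7.0, 8.0, 9.0],
]
filled = fill_nan_with_neighbours(grid)
```

The input grid is left untouched, and values are always read from the original grid, so a filled cell never influences another cell in the same pass.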
Breaking changes
‘mean’ averaging has been deprecated in spatial_mean. (PR/125).
‘interp_coord’ has been renamed to ‘interp_centroid’ in spatial_mean. (PR/125).
The ‘datasets’ dimension of the output of diagnostics.measures_heatmap is renamed ‘realization’. (PR/167).
_subset_file_coverage was renamed subset_file_coverage and moved to catalog.py to prevent circular imports. (PR/170).
extract_dataset doesn’t fail when a variable is in the dataset, but not in variables_and_freqs. (PR/185).
The argument period, used in multiple functions, is now always a single list, while periods is more flexible. (GH/87, PR/192).
The parameters reference_period and simulation_period of xscen.train and xscen.adjust were renamed period/periods to respect the point above. (GH/87, PR/192).
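The distinction between period (a single [start, end] list) and periods (one or several such lists) can be sketched as follows. This is a hedged illustration, not the code of utils.standardize_periods: the helper name normalize_periods, the conversion to strings, and the error message are assumptions.

```python
def normalize_periods(periods):
    """Return a list of [start, end] pairs from a single pair or a list of pairs.

    Hypothetical helper for illustration only; it is not part of xscen.
    """
    # A flat two-element sequence such as [1981, 2010] is a single period.
    if len(periods) == 2 and not isinstance(periods[0], (list, tuple)):
        periods = [periods]
    normalized = []
    for period in periods:
        if len(period) != 2:
            raise ValueError(f"Each period needs a start and an end, got {period!r}")
        normalized.append([str(period[0]), str(period[1])])
    return normalized


single = normalize_periods([1981, 2010])
several = normalize_periods([[1981, 2010], ["2041", "2070"]])
```

Normalizing to a list of lists up front lets downstream code loop over periods without checking which form the caller used.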
Bug fixes
Forbid pandas v1.5.3 in the environment files, as the Linux conda build breaks the data catalog parser. (GH/161, PR/162).
Only return requested variables when using DataCatalog.to_dataset. (PR/163).
compute_indicators no longer crashes if fewer than 3 timesteps are produced. (PR/125).
xarray is temporarily pinned below v2023.3.0 due to an API-breaking change. (GH/175, PR/173).
xscen.utils.unstack_fill_nan can now handle datasets that have non-dimension coordinates. (GH/156, PR/175).
extract_dataset now skips a simulation much earlier if the frequency doesn’t match. (PR/170).
extract_dataset now correctly tries to extract in reverse timedelta order. (PR/170).
compute_deltas no longer creates all NaN values if the input dataset is in a non-standard calendar. (PR/188).
Internal changes
xscen now manages packaging for PyPI and TestPyPI via GitHub workflows. (PR/159).
Pre-load coordinates in extract.clisops_subset (PR/163).
Minimal documentation for templates. (PR/163).
xscen is now indexed in Zenodo, under the ouranos community of projects. (PR/164).
Better warning messages in _subset_file_coverage when coverage is insufficient. (PR/125).
The top-level Makefile now includes a linkcheck recipe, and the ReadTheDocs configuration no longer reinstalls the llvmlite compiler library. (PR/173).
The checkups on coverage and duplicates can now be skipped in subset_file_coverage. (PR/170).
Changed the ProjectCatalog docstrings to make it more obvious that it needs to be created empty. (GH/99, PR/184).
Added parse_config to creep_fill, creep_weights, and reduce_ensemble (PR/191).
v0.5.0 (2023-02-28)
Contributors to this version: Gabriel Rondeau-Genesse (@RondeauG), Juliette Lavoie (@juliettelavoie), Trevor James Smith (@Zeitsperre), Sarah Gammon (@sg2475962) and Pascal Bourgault (@aulemahal).
New features and enhancements
Possibility of excluding variables read from file from the catalog produced by parse_directory. (PR/107).
New functions extract.subset_warming_level and aggregate.produce_horizon. (PR/93).
Add round_var to xs.clean_up. (PR/93).
New “timeout_cleanup” option for save_to_zarr, which removes variables that were in the process of being written when receiving a TimeoutException. (PR/106).
New scripting.skippable context, allowing the use of CTRL-C to skip code sections. (PR/106).
Possibility of fields with underscores in the patterns of parse_directory. (PR/111).
New utils.show_versions function for printing or writing to file the dependency versions of xscen. (GH/109, PR/112).
Added previously private notebooks to the documentation. (PR/108).
Notebooks are now tested using pytest with nbval. (PR/108).
New restrict_warming_level argument for extract.search_data_catalogs to filter out datasets that are not in the warming level CSV. (GH/105, PR/138).
Set configuration value programmatically through CONFIG.set. (PR/144).
New to_dataset method on DataCatalog. The same as to_dask, but exposing more aggregation options. (PR/147).
New templates folder with one general template. (GH/151, PR/158).
Breaking changes
Functions that are called internally can no longer parse the configuration. (PR/133).
Bug fixes
properties_and_measures no longer casts month coordinates to string. (PR/106).
search_data_catalogs no longer crashes if it finds nothing. (GH/42, PR/92).
Prevented fixed fields from being duplicated during _dispatch_historical_to_future (GH/81, PR/92).
Added missing parse_config to functions in reduce.py (PR/92).
Added deepcopy before skipna is popped in spatial_mean (PR/92).
subset_warming_level now validates that the data exists in the dataset provided (GH/117, PR/119).
Adapt stack_drop_nans for the newest version of xarray (2022.12.0). (GH/122, PR/126).
Fix stack_drop_nans not working if intermediate directories don’t exist (GH/128).
Fixed a crash when compute_indicators produced fixed fields (PR/139).
Internal changes
compute_deltas skips the unstacking step if there is no time dimension and casts object dimensions to string. (PR/9)
Added the “2sem” frequency to the translations CVs. (PR/111).
Skip files we can’t read in parse_directory. (PR/111).
Fixed non-numpy-standard Docstrings. (PR/108).
Added more metadata to package description on PyPI. (PR/108).
Faster search_data_catalogs and extract_dataset through a faster DataCatalog.unique, date parsing and a rewrite of the ensure_correct_time logic. (PR/127).
The search_data_catalogs function now accepts str or pathlib.Path variables (in addition to lists of either data type) for performing catalog lookups. (PR/121).
produce_horizon now supports fixed fields (PR/139).
Rewrite of unstack_dates for better performance with dask arrays. (PR/144).
v0.4.0 (2022-09-28)
Contributors to this version: Gabriel Rondeau-Genesse (@RondeauG), Juliette Lavoie (@juliettelavoie), Trevor James Smith (@Zeitsperre) and Pascal Bourgault (@aulemahal).
New features and enhancements
New functions diagnostics.properties_and_measures, diagnostics.measures_heatmap and diagnostics.measures_improvement. (GH/5, PR/54).
Add argument resample_methods to xs.extract.resample. (GH/57, PR/57)
Added a ReadTheDocs configuration to expose public documentation. (GH/65, PR/66).
xs.utils.stack_drop_nans / xs.utils.unstack_fill_nan will now format the to_file/coords string to add the domain and the shape. (GH/59, PR/67).
New unstack_dates function to “extract” seasons or months from a timeseries. (PR/68).
Better spatial_mean for cases using xESMF and a shapefile with multiple polygons. (PR/68).
Yet more changes to parse_directory (PR/68):
- Better parallelization by merging the finding and name-parsing step in the same dask tree.
- Allow cvs for the variable columns.
- Fix parsing the variable names from datasets.
- Sort the variables in the tuples (for a more consistent output).
In extract_dataset, add option ensure_correct_time to ensure the time coordinate matches the expected freq. For example, monthly values given on the 15th day are moved to the 1st, as expected when asking for “MS”. (GH/53).
In regrid_dataset (PR/68):
- Allow passing skipna to the regridder kwargs.
- Do not fail for any grid mapping problem, including if a grid_mapping attribute mentions a variable that doesn’t exist.
Default email sent to the local user. (PR/68).
Special accelerated pathway for parsing catalogs with all dates within the datetime64[ns] range. (PR/75).
New functions reduce_ensemble and build_reduction_data to support kkz and kmeans clustering. (GH/4, PR/63).
ensemble_stats can now loop through multiple statistics, support functions located in xclim.ensembles._robustness, and supports weighted realizations. (PR/63).
New function ensemble_stats.generate_weights that estimates weights based on simulation metadata. (PR/63).
New function catalog.unstack_id to reverse-engineer IDs. (PR/63).
generate_id now accepts Datasets. (PR/63).
Add rechunk option to properties_and_measures (PR/76).
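The ensure_correct_time behaviour described above (monthly values stamped on the 15th being moved to the 1st when “MS” is requested) can be sketched with the standard library. This is only an illustration of the idea, not xscen's implementation: the helper name snap_to_month_start and the duplicate-month check are assumptions.

```python
from datetime import datetime


def snap_to_month_start(timestamps):
    """Move each timestamp to midnight on the first day of its month.

    Hypothetical helper for illustration only; it is not part of xscen.
    """
    snapped = [
        t.replace(day=1, hour=0, minute=0, second=0, microsecond=0)
        for t in timestamps
    ]
    # Snapping is only safe for monthly data: two values in the same
    # month would collapse onto the same timestamp.
    if len(set(snapped)) != len(snapped):
        raise ValueError("More than one timestamp per month; data is not monthly.")
    return snapped


times = [datetime(2000, 1, 15), datetime(2000, 2, 15), datetime(2000, 3, 15)]
snapped = snap_to_month_start(times)
```

The guard reflects why such a correction should be tied to the expected frequency: moving timestamps is harmless for truly monthly data but would corrupt data at a finer frequency.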
Breaking changes
In ensemble_stats, the statistics argument has been changed and stats_kwargs has been eliminated. (PR/63).
Bug fixes
Internal changes
Default method of xs.extract.resample now depends on frequency. (GH/57, PR/58).
Bugfix for _restrict_by_resolution with CMIP6 datasets (PR/71).
More complete check of coverage in _subset_file_coverage. (GH/70, PR/72)
The code that performs common_attrs_only in ensemble_stats has been moved to clean_up. (PR/63).
Removed the default to_level in clean_up. (PR/63).
xscen now has an official logo. (PR/69).
Use numpy max and min in properties_and_measures (PR/76).
Cast catalog date_start and date_end to “%4Y-%m-%d %H:00” when writing to disk. (GH/83, PR/79)
Skip test of coverage on the sum if the list of selected files is empty. (PR/79)
Added missing CMIP variable names in conversions.yml and added the ability to provide a custom file instead (GH/86, PR/88)
Changed ‘allow_conversion’ and ‘allow_resample’ default to False in search_data_catalogs (GH/86, PR/88)
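The “%4Y-%m-%d %H:00” date format mentioned above zero-pads the year to four digits and truncates the time to the hour. Since a width-padded year directive such as %4Y may not be supported by every platform's strftime, the sketch below builds the same string without it. The helper name format_catalog_date is an assumption and this is not xscen's code.

```python
from datetime import datetime


def format_catalog_date(date):
    """Format a datetime as 'YYYY-MM-DD HH:00' with a zero-padded 4-digit year.

    Hypothetical helper for illustration only; it is not part of xscen.
    """
    # Minutes and seconds are dropped on purpose: the format keeps the hour only.
    return f"{date.year:04d}-{date.month:02d}-{date.day:02d} {date.hour:02d}:00"


early = format_catalog_date(datetime(950, 1, 1, 12, 30))
recent = format_catalog_date(datetime(2022, 9, 28, 6, 0))
```

Padding the year matters for simulations that start before year 1000, where an unpadded year would break both lexical sorting and fixed-width parsing of the catalog columns.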
v0.3.0 (2022-08-23)
Contributors to this version: Gabriel Rondeau-Genesse (@RondeauG), Juliette Lavoie (@juliettelavoie), Trevor James Smith (@Zeitsperre) and Pascal Bourgault (@aulemahal).
New features and enhancements
parse_directory: Fixes to xr_open_kwargs and support for wildcards (*) in the directories. (PR/19).
New function xscen.ensemble.ensemble_stats added. (GH/3, PR/28).
New functions spatial_mean, climatological_mean and deltas added. (GH/4, PR/35).
Add argument intermediate_reg_grids to xscen.regridding.regrid. (GH/34, PR/39).
Add argument moving_yearly_window to xscen.biasadjust.adjust. (PR/39).
Many adjustments to parse_directory: better wildcards (GH/24), allow custom columns, fastpaths for parse_from_ds, and more (PR/30).
Documentation now makes better use of autodoc to generate package index. (PR/41).
periods argument added to compute_indicators to support datasets with jumps in time (PR/35).
Breaking changes
Patterns in parse_directory start at the end of the paths in directories. (PR/30).
Argument extension of parse_directory has been renamed globpattern. (PR/30).
The xscen API and file structure have been significantly refactored. (GH/40, PR/41). The following functions are available from the top level:
adjust, train, ensemble_stats, clisops_subset, dispatch_historical_to_future, extract_dataset, resample, restrict_by_resolution, restrict_multimembers, search_data_catalogs, save_to_netcdf, save_to_zarr, rechunk, compute_indicators, regrid_dataset, and create_mask.
xscen now requires geopandas and shapely (PR/35).
Following a change in intake-esm, xscen now uses “cat:” to prefix the dataset attributes extracted from the catalog. All catalog-generated attributes should now be valid when saving to netCDF. (GH/13, PR/51).
Internal changes
parse_directory: Fixes to xr_open_kwargs. (PR/19).
Fix for indicators removing the ‘time’ dimension. (PR/23).
Security scanning using CodeQL and GitHub Actions is now configured for the repository. (PR/21).
Bumpversion action now configured to automatically increment the version number on each merged pull request. (PR/21).
Add align_on='year' argument when converting calendars in bias adjustment. (PR/39).
GitHub Actions using Ubuntu-22.04 images are now configured to run the testing ensemble using tox-conda. (PR/44).
import xscen smoke test is now run on all pull requests. (PR/44).
Fix for create_mask removing attributes (PR/35).
v0.2.0 (first official release)
Contributors to this version: Gabriel Rondeau-Genesse (@RondeauG), Pascal Bourgault (@aulemahal), Trevor James Smith (@Zeitsperre), Juliette Lavoie (@juliettelavoie).
Announcements
This is the first official release for xscen!
New features and enhancements
Supports workflows with YAML configuration files for better transparency, reproducibility, and long-term backups.
Intake-esm-based catalog to find and manage climate data.
Climate dataset extraction, subsetting, and temporal aggregation.
Calculate missing variables through Intake-esm’s DerivedVariableRegistry.
Regridding with xESMF.
Bias adjustment with xclim.
Breaking changes
N/A
Internal changes
N/A