Core Module¶

core ¶

Core module for effective precipitation calculations using Google Earth Engine.

This module provides the main `EffectivePrecipitation` class for calculating effective precipitation from various climate datasets available on Google Earth Engine.

The module supports multiple effective precipitation methods:

Ensemble: Mean of 6 methods (default)
CROPWAT: FAO CROPWAT method
FAO/AGLW: FAO Dependable Rainfall (80% exceedance)
Fixed Percentage: Simple fixed percentage method
Dependable Rainfall: FAO Dependable Rainfall method
FarmWest: FarmWest method
USDA-SCS: Soil moisture depletion method (requires AWC and ETo)
TAGEM-SuET: Turkish irrigation method (requires ETo)
PCML: Physics-Constrained ML for Western U.S. (pre-computed GEE asset)

Example

from pycropwat import EffectivePrecipitation
ep = EffectivePrecipitation(
    asset_id='ECMWF/ERA5_LAND/MONTHLY_AGGR',
    precip_band='total_precipitation_sum',
    geometry_path='study_area.geojson',
    start_year=2015,
    end_year=2020,
    precip_scale_factor=1000,
    method='ensemble'
)
results = ep.process(output_dir='./output', n_workers=4)

See Also

pycropwat.methods : Individual effective precipitation calculation functions.
pycropwat.analysis : Post-processing and analysis tools.
pycropwat.utils : Utility functions for GEE and file operations.

EffectivePrecipitation ¶

EffectivePrecipitation(asset_id: str, precip_band: str, geometry_path: Optional[Union[str, Path]] = None, start_year: int = None, end_year: int = None, scale: Optional[float] = None, precip_scale_factor: float = 1.0, gee_project: Optional[str] = None, gee_geometry_asset: Optional[str] = None, method: PeffMethod = 'ensemble', method_params: Optional[dict] = None)

Calculate effective precipitation from GEE climate data.

Supports multiple effective precipitation calculation methods: Ensemble (default), CROPWAT, FAO/AGLW, Fixed Percentage, Dependable Rainfall, FarmWest, USDA-SCS (requires AWC and ETo data), TAGEM-SuET (requires ETo data), and PCML.

Parameters:

Name	Type	Description	Default
`asset_id`	`str`	GEE ImageCollection asset ID for precipitation data. Common options: `ECMWF/ERA5_LAND/MONTHLY_AGGR` (ERA5-Land, global, ~11km), `IDAHO_EPSCOR/TERRACLIMATE` (TerraClimate, global, ~4km), `IDAHO_EPSCOR/GRIDMET` (GridMET, CONUS, ~4km), `OREGONSTATE/PRISM/AN81m` (PRISM, CONUS, ~4km), `UCSB-CHG/CHIRPS/DAILY` (CHIRPS, 50°S-50°N, ~5km), `NASA/GPM_L3/IMERG_MONTHLY_V06` (GPM IMERG, global, ~11km).	required
`precip_band`	`str`	Name of the precipitation band in the asset. Examples: ERA5-Land: `total_precipitation_sum`; TerraClimate: `pr`; GridMET: `pr`; PRISM: `ppt`; CHIRPS: `precipitation`; GPM IMERG: `precipitation`.	required
`geometry_path`	`str, Path, or None`	Path to shapefile or GeoJSON file defining the region of interest. Can also be a GEE FeatureCollection asset ID. Set to None if using gee_geometry_asset instead.	`None`
`start_year`	`int`	Start year for processing (inclusive).	`None`
`end_year`	`int`	End year for processing (inclusive).	`None`
`scale`	`float`	Output resolution in meters. If None (default), uses native resolution of the dataset.	`None`
`precip_scale_factor`	`float`	Factor to convert precipitation to mm. Default is 1.0. Common values: ERA5-Land (m to mm) = 1000, TerraClimate = 1.0, GridMET = 1.0.	`1.0`
`gee_project`	`str`	GEE project ID for authentication. Required for cloud-based GEE access.	`None`
`gee_geometry_asset`	`str`	GEE FeatureCollection asset ID for the region of interest. Takes precedence over geometry_path if both are provided.	`None`
`method`	`str`	Effective precipitation calculation method. Default is 'ensemble'. Options: `'ensemble'`: mean of 6 methods (default; requires AWC and ETo); `'cropwat'`: CROPWAT method (FAO standard); `'fao_aglw'`: FAO Dependable Rainfall (80% exceedance); `'fixed_percentage'`: simple fixed percentage method; `'dependable_rainfall'`: FAO Dependable Rainfall method; `'farmwest'`: FarmWest method; `'usda_scs'`: USDA-SCS soil moisture depletion method (requires AWC and ETo data via method_params); `'suet'`: TAGEM-SuET method (Turkish Irrigation Management System; requires ETo data via method_params); `'pcml'`: Physics-Constrained ML (Western U.S. only, Jan 2000 - Sep 2024), which uses the default GEE asset `projects/ee-peff-westus-unmasked/assets/effective_precip_monthly_unmasked`.	`'ensemble'`
`method_params`	`dict`	Additional parameters for the selected method. For `'fixed_percentage'`: `percentage` (float): fraction 0-1, default 0.7. For `'dependable_rainfall'`: `probability` (float): probability level 0.5-0.9, default 0.75. For `'usda_scs'`: `awc_asset` (str): GEE Image asset ID for AWC data, required (U.S.: `projects/openet/soil/ssurgo_AWC_WTA_0to152cm_composite`; global: `projects/sat-io/open-datasets/FAO/HWSD_V2_SMU`); `awc_band` (str): band name for AWC, default `'AWC'`; `eto_asset` (str): GEE ImageCollection asset ID for ETo, required (U.S.: `projects/openet/assets/reference_et/conus/gridmet/monthly/v1`; global: `projects/climate-engine-pro/assets/ce-ag-era5-v2/daily`); `eto_band` (str): band name for ETo, default `'eto'` (U.S. GridMET: `'eto'`; global AgERA5: `'ReferenceET_PenmanMonteith_FAO56'`); `eto_is_daily` (bool): whether ETo is daily, default False (set True for AgERA5 daily data); `eto_scale_factor` (float): scale factor for ETo, default 1.0; `rooting_depth` (float): rooting depth in meters, default 1.0; `mad_factor` (float): Management Allowed Depletion factor (0-1), which controls what fraction of soil water storage is available, default 0.5.	`None`
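The dataset-specific values in the parameter table can be collected into a small lookup so that `asset_id`, `precip_band`, and `precip_scale_factor` stay consistent with each other. The sketch below is illustrative only: `DATASET_PRESETS` and `constructor_kwargs` are hypothetical names, not part of pycropwat, and only the three datasets whose scale factors are listed in the table are included.

```python
from typing import Any, Dict

# Hypothetical presets assembled from the parameter table (not a pycropwat API).
DATASET_PRESETS: Dict[str, Dict[str, Any]] = {
    "era5_land": {
        "asset_id": "ECMWF/ERA5_LAND/MONTHLY_AGGR",
        "precip_band": "total_precipitation_sum",
        "precip_scale_factor": 1000.0,  # ERA5-Land precipitation is in m; convert to mm
    },
    "terraclimate": {
        "asset_id": "IDAHO_EPSCOR/TERRACLIMATE",
        "precip_band": "pr",
        "precip_scale_factor": 1.0,
    },
    "gridmet": {
        "asset_id": "IDAHO_EPSCOR/GRIDMET",
        "precip_band": "pr",
        "precip_scale_factor": 1.0,
    },
}


def constructor_kwargs(dataset: str, **extra: Any) -> Dict[str, Any]:
    """Merge a dataset preset with user-supplied keyword arguments."""
    if dataset not in DATASET_PRESETS:
        raise KeyError(f"No preset for dataset {dataset!r}")
    kwargs = dict(DATASET_PRESETS[dataset])  # copy so the preset is not mutated
    kwargs.update(extra)
    return kwargs


kwargs = constructor_kwargs(
    "era5_land",
    geometry_path="roi.geojson",
    start_year=2015,
    end_year=2020,
)
```

The resulting dictionary could then be unpacked as `EffectivePrecipitation(**kwargs)`.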

Attributes:

Name	Type	Description
`geometry`	`Geometry`	The loaded geometry for the region of interest.
`collection`	`ImageCollection`	The filtered and scaled precipitation image collection.
`bounds`	`list`	Bounding box coordinates of the geometry.

Examples:

Basic usage with Ensemble method (default):

from pycropwat import EffectivePrecipitation
ep = EffectivePrecipitation(
    asset_id='ECMWF/ERA5_LAND/MONTHLY_AGGR',
    precip_band='total_precipitation_sum',
    geometry_path='roi.geojson',
    start_year=2015,
    end_year=2020,
    precip_scale_factor=1000
)
ep.process(output_dir='./output', n_workers=4)

Using GEE FeatureCollection asset:

ep = EffectivePrecipitation(
    asset_id='ECMWF/ERA5_LAND/MONTHLY_AGGR',
    precip_band='total_precipitation_sum',
    gee_geometry_asset='projects/my-project/assets/study_area',
    start_year=2015,
    end_year=2020,
    precip_scale_factor=1000,
    gee_project='my-gee-project'
)

Using FAO/AGLW method:

ep = EffectivePrecipitation(
    asset_id='IDAHO_EPSCOR/TERRACLIMATE',
    precip_band='pr',
    geometry_path='study_area.geojson',
    start_year=2000,
    end_year=2020,
    method='fao_aglw'
)

Using fixed percentage method (80%):

ep = EffectivePrecipitation(
    asset_id='IDAHO_EPSCOR/GRIDMET',
    precip_band='pr',
    geometry_path='farm.geojson',
    start_year=2010,
    end_year=2020,
    method='fixed_percentage',
    method_params={'percentage': 0.8}
)
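As a rough illustration of what `percentage` controls, the sketch below assumes the fixed percentage method simply multiplies precipitation by the given fraction. `fixed_percentage_peff` is a hypothetical stand-in written for this example, not the pycropwat implementation.

```python
def fixed_percentage_peff(pr_mm: float, percentage: float = 0.7) -> float:
    """Assumed form: effective precipitation = percentage * precipitation."""
    if not 0.0 <= percentage <= 1.0:
        raise ValueError("percentage must be a fraction between 0 and 1")
    return pr_mm * percentage


# 50 mm of monthly precipitation with 80% assumed to be effective.
peff = fixed_percentage_peff(50.0, percentage=0.8)
print(f"{peff:.1f} mm")
```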

Using USDA-SCS method with AWC and ETo data:

ep = EffectivePrecipitation(
    asset_id='ECMWF/ERA5_LAND/MONTHLY_AGGR',
    precip_band='total_precipitation_sum',
    geometry_path='arizona.geojson',
    start_year=2015,
    end_year=2020,
    precip_scale_factor=1000,
    method='usda_scs',
    method_params={
        'awc_asset': 'projects/my-project/assets/soil_awc',
        'awc_band': 'AWC',
        'eto_asset': 'IDAHO_EPSCOR/GRIDMET',
        'eto_band': 'eto',
        'eto_is_daily': True,
        'rooting_depth': 1.0
    }
)

See Also

pycropwat.methods : Individual effective precipitation calculation functions.
pycropwat.analysis : Post-processing and analysis tools.

Source code in pycropwat/core.py

def __init__(
    self,
    asset_id: str,
    precip_band: str,
    geometry_path: Optional[Union[str, Path]] = None,
    start_year: int = None,
    end_year: int = None,
    scale: Optional[float] = None,
    precip_scale_factor: float = 1.0,
    gee_project: Optional[str] = None,
    gee_geometry_asset: Optional[str] = None,
    method: PeffMethod = 'ensemble',
    method_params: Optional[dict] = None,
):
    self.asset_id = asset_id
    self.precip_band = precip_band
    self.geometry_path = geometry_path
    self.gee_geometry_asset = gee_geometry_asset
    self.start_year = start_year
    self.end_year = end_year
    self.scale = scale  # None means use native resolution
    self.precip_scale_factor = precip_scale_factor
    self.gee_project = gee_project
    self.method = method
    self.method_params = method_params or {}

    # Get the effective precipitation function
    self._peff_function = get_method_function(method)

    # USDA-SCS specific: cache for AWC data (loaded once)
    self._awc_cache = None

    # Input directory for saving downloaded data (set during process())
    self._input_dir = None

    # Check if this is PCML method (uses single multi-band Image instead of ImageCollection)
    self._is_pcml = (method == 'pcml' or self.precip_band == PCML_DEFAULT_BAND)

    # For PCML, use default asset if placeholder provided
    if self._is_pcml:
        if self.asset_id == 'PLACEHOLDER' or self.asset_id is None:
            self.asset_id = PCML_DEFAULT_ASSET
            logger.info(f"Using default PCML asset: {self.asset_id}")
        self.precip_band = PCML_DEFAULT_BAND

    # Validate that at least one geometry source is provided (not required for PCML)
    if geometry_path is None and gee_geometry_asset is None and not self._is_pcml:
        raise ValueError("Either geometry_path or gee_geometry_asset must be provided")

    # Initialize GEE
    initialize_gee(self.gee_project)

    # For PCML, use the asset's own geometry if no geometry provided
    if self._is_pcml and geometry_path is None and gee_geometry_asset is None:
        # Load PCML image first
        self._pcml_image = ee.Image(self.asset_id)
        # Use predefined Western U.S. bounding box since PCML image geometry is unbounded
        self.geometry = ee.Geometry.Polygon([PCML_WESTERN_US_BOUNDS])
        self.bounds = PCML_WESTERN_US_BOUNDS
        logger.info("Using predefined Western U.S. bounding box for PCML")
    else:
        # Load geometry from GEE asset or local file
        self.geometry = load_geometry(geometry_path, gee_asset=gee_geometry_asset)
        self.bounds = self.geometry.bounds().getInfo()['coordinates'][0]

    # Get date range
    self.start_date, self.end_date = get_date_range(start_year, end_year)

    # Load and filter image collection (or load PCML image)
    self._load_collection()

cropwat_effective_precip `staticmethod` ¶

cropwat_effective_precip(pr: ndarray) -> np.ndarray

Calculate CROPWAT effective precipitation.

Parameters:

Name	Type	Description	Default
`pr`	`ndarray`	Precipitation in mm.	required

Returns:

Type	Description
`ndarray`	Effective precipitation in mm.

Source code in pycropwat/core.py

@staticmethod
def cropwat_effective_precip(pr: np.ndarray) -> np.ndarray:
    """
    Calculate CROPWAT effective precipitation.

    Parameters
    ----------

    pr : np.ndarray
        Precipitation in mm.

    Returns
    -------
    np.ndarray
        Effective precipitation in mm.
    """
    ep = np.where(
        pr <= 250,
        pr * (125 - 0.2 * pr) / 125,
        0.1 * pr + 125
    )
    return ep
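The piecewise formula in the source above can be checked by hand with a scalar version. The sketch below restates the same two branches without NumPy; `cropwat_peff_scalar` is a hypothetical helper name used only for this example.

```python
def cropwat_peff_scalar(pr_mm: float) -> float:
    """Scalar restatement of the two branches used in cropwat_effective_precip."""
    if pr_mm <= 250:
        # Lower branch: the effective fraction shrinks as precipitation grows.
        return pr_mm * (125 - 0.2 * pr_mm) / 125
    # Upper branch: only 10% of precipitation is added on top of 125 mm.
    return 0.1 * pr_mm + 125


for pr in (0.0, 100.0, 250.0, 300.0):
    print(f"{pr:6.1f} mm -> {cropwat_peff_scalar(pr):6.1f} mm")
```

For example, 100 mm of monthly precipitation gives 100 × (125 − 20) / 125 = 84 mm. The two branches agree at the 250 mm threshold (150 mm each), so the function is continuous there.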

process ¶

process(output_dir: Union[str, Path], n_workers: int = 4, months: Optional[List[int]] = None, input_dir: Optional[Union[str, Path]] = None, save_inputs: bool = False) -> List[Tuple[Optional[Path], Optional[Path]]]

Process all months and save effective precipitation rasters.

Downloads precipitation data from Google Earth Engine, calculates effective precipitation using the configured method, and saves results as GeoTIFF files. Uses Dask for parallel processing of multiple months.

Parameters:

Name	Type	Description	Default
`output_dir`	`str or Path`	Directory to save output rasters. Will be created if it doesn't exist.	required
`n_workers`	`int`	Number of parallel workers for Dask. Default is 4. Set to 1 for sequential processing.	`4`
`months`	`list of int`	List of months to process (1-12). If None, processes all months in the date range. Useful for seasonal analyses.	`None`
`input_dir`	`str or Path`	Directory to save downloaded input data (precipitation, AWC, ETo). If None and save_inputs is True, defaults to an `analysis_inputs/<name>` directory next to `output_dir`, where `<name>` is the final component of `output_dir`.	`None`
`save_inputs`	`bool`	Whether to save downloaded input data as GeoTIFF files. Default is False. Useful for debugging or further analysis.	`False`

Returns:

Type	Description
`list of tuple`	List of tuples containing paths to saved files: `(effective_precip_path, effective_precip_fraction_path)`. Returns `(None, None)` for months that failed to process.
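Because failed months are reported as `(None, None)` rather than raised as exceptions, callers may want to separate successes from failures after a run. The sketch below shows one way to do that; `summarize_results` is a hypothetical helper and the sample paths are made up for illustration.

```python
from pathlib import Path
from typing import List, Optional, Tuple

MonthResult = Tuple[Optional[Path], Optional[Path]]


def summarize_results(results: List[MonthResult]) -> Tuple[List[Path], int]:
    """Return the saved effective-precipitation paths and the count of failed months."""
    saved = [peff for peff, _ in results if peff is not None]
    failed = sum(1 for peff, _ in results if peff is None)
    return saved, failed


# Made-up results: two successful months and one failure.
example: List[MonthResult] = [
    (Path("output/effective_precip_2015_01.tif"),
     Path("output/effective_precip_fraction_2015_01.tif")),
    (None, None),
    (Path("output/effective_precip_2015_03.tif"),
     Path("output/effective_precip_fraction_2015_03.tif")),
]

saved, failed = summarize_results(example)
print(f"{len(saved)} months saved, {failed} failed")
```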

Notes

Output files are named:

effective_precip_YYYY_MM.tif - Effective precipitation in mm
effective_precip_fraction_YYYY_MM.tif - Effective/total ratio (non-PCML methods)
effective_precip_fraction_YYYY.tif - Annual (water year) fraction (PCML method only)

For the USDA-SCS method, AWC and ETo data are automatically downloaded and cached for efficiency.
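The naming pattern for the monthly outputs can be used to locate the files for a given month. The sketch below builds the two monthly file names; `expected_monthly_outputs` is a hypothetical helper, and the zero-padded two-digit month is an assumption inferred from the `YYYY_MM` pattern.

```python
from pathlib import Path
from typing import Tuple, Union


def expected_monthly_outputs(
    output_dir: Union[str, Path], year: int, month: int
) -> Tuple[Path, Path]:
    """Build the monthly effective-precipitation and fraction paths."""
    if not 1 <= month <= 12:
        raise ValueError("month must be in 1..12")
    out = Path(output_dir)
    stamp = f"{year:04d}_{month:02d}"  # assumption: month is zero-padded
    return (
        out / f"effective_precip_{stamp}.tif",
        out / f"effective_precip_fraction_{stamp}.tif",
    )


peff_path, frac_path = expected_monthly_outputs("./output", 2015, 6)
print(peff_path.name, frac_path.name)
```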

Examples:

Process all months in parallel:

ep = EffectivePrecipitation(...)
results = ep.process(output_dir='./output', n_workers=8)

Process only summer months:

results = ep.process(
    output_dir='./output',
    months=[6, 7, 8]  # June, July, August
)
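The `months` argument filters the full list of `(year, month)` pairs rather than changing the year range, so each requested month is processed once per year. The sketch below reproduces that filter; `monthly_dates` is a hypothetical stand-in for the library's `get_monthly_dates`, assumed to return every month of every year in the inclusive range.

```python
from typing import List, Tuple


def monthly_dates(start_year: int, end_year: int) -> List[Tuple[int, int]]:
    """Hypothetical stand-in: every (year, month) pair, both years inclusive."""
    return [
        (year, month)
        for year in range(start_year, end_year + 1)
        for month in range(1, 13)
    ]


all_dates = monthly_dates(2015, 2016)
months = [6, 7, 8]  # June, July, August

# Same filter expression as in the process() source.
summer = [(y, m) for y, m in all_dates if m in months]
print(summer)
```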

Save input data for debugging:

results = ep.process(
    output_dir='./output',
    save_inputs=True,
    input_dir='./inputs'
)
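When `save_inputs` is True and `input_dir` is omitted, the source of `process` places the inputs in an `analysis_inputs` folder beside the output directory, inside a subfolder named after the output directory. The sketch below mirrors that choice with `pathlib`; `resolve_input_dir` is a hypothetical helper name used only for this example.

```python
from pathlib import Path
from typing import Optional, Union


def resolve_input_dir(
    output_dir: Union[str, Path],
    input_dir: Optional[Union[str, Path]] = None,
) -> Path:
    """Mirror the directory choice made in process() when save_inputs is True."""
    output_dir = Path(output_dir)
    if input_dir is not None:
        return Path(input_dir)
    # Default: sibling 'analysis_inputs' folder, then the output folder's own name.
    return output_dir.parent / "analysis_inputs" / output_dir.name


default_dir = resolve_input_dir("runs/output")
explicit_dir = resolve_input_dir("runs/output", input_dir="./inputs")
print(default_dir.as_posix(), explicit_dir.as_posix())
```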

See Also

process_sequential: Sequential processing for debugging.

Source code in pycropwat/core.py

def process(
    self,
    output_dir: Union[str, Path],
    n_workers: int = 4,
    months: Optional[List[int]] = None,
    input_dir: Optional[Union[str, Path]] = None,
    save_inputs: bool = False
) -> List[Tuple[Optional[Path], Optional[Path]]]:
    """
    Process all months and save effective precipitation rasters.

    Downloads precipitation data from Google Earth Engine, calculates
    effective precipitation using the configured method, and saves
    results as GeoTIFF files. Uses Dask for parallel processing of
    multiple months.

    Parameters
    ----------

    output_dir : str or Path
        Directory to save output rasters. Will be created if it
        doesn't exist.

    n_workers : int, optional
        Number of parallel workers for Dask. Default is 4.
        Set to 1 for sequential processing.

    months : list of int, optional
        List of months to process (1-12). If None, processes all months
        in the date range. Useful for seasonal analyses.

    input_dir : str or Path, optional
        Directory to save downloaded input data (precipitation, AWC, ETo).
        If None and save_inputs is True, defaults to an
        ``analysis_inputs/<name>`` directory next to ``output_dir``,
        where ``<name>`` is the final component of ``output_dir``.

    save_inputs : bool, optional
        Whether to save downloaded input data as GeoTIFF files.
        Default is False. Useful for debugging or further analysis.

    Returns
    -------
    list of tuple
        List of tuples containing paths to saved files:
        ``(effective_precip_path, effective_precip_fraction_path)``.
        Returns ``(None, None)`` for months that failed to process.

    Notes
    -----
    Output files are named:

    - ``effective_precip_YYYY_MM.tif`` - Effective precipitation in mm
    - ``effective_precip_fraction_YYYY_MM.tif`` - Effective/total ratio (non-PCML methods)
    - ``effective_precip_fraction_YYYY.tif`` - Annual (water year) fraction (PCML method only)

    For the USDA-SCS method, AWC and ETo data are automatically downloaded
    and cached for efficiency.

    Examples
    --------
    Process all months in parallel:

    ```python
    ep = EffectivePrecipitation(...)
    results = ep.process(output_dir='./output', n_workers=8)
    ```

    Process only summer months:

    ```python
    results = ep.process(
        output_dir='./output',
        months=[6, 7, 8]  # June, July, August
    )
    ```

    Save input data for debugging:

    ```python
    results = ep.process(
        output_dir='./output',
        save_inputs=True,
        input_dir='./inputs'
    )
    ```

    See Also
    --------
        process_sequential: Sequential processing for debugging.
    """
    output_dir = Path(output_dir)
    output_dir.mkdir(parents=True, exist_ok=True)

    # Set up input directory for saving downloaded data
    if save_inputs:
        if input_dir is not None:
            self._input_dir = Path(input_dir)
        else:
            # Default: parallel to output_dir in analysis_inputs
            self._input_dir = output_dir.parent / 'analysis_inputs' / output_dir.name
        self._input_dir.mkdir(parents=True, exist_ok=True)
        logger.info(f"Input data will be saved to: {self._input_dir}")
    else:
        self._input_dir = None

    # Generate list of (year, month) to process
    all_dates = get_monthly_dates(self.start_year, self.end_year)

    if months is not None:
        all_dates = [(y, m) for y, m in all_dates if m in months]

    logger.info(f"Processing {len(all_dates)} months with {n_workers} workers")

    # Create delayed tasks
    tasks = [
        delayed(self._process_single_month)(year, month, output_dir)
        for year, month in all_dates
    ]

    # Execute in parallel with progress bar
    with ProgressBar():
        results = compute(*tasks, num_workers=n_workers)

    return list(results)

process_sequential ¶

process_sequential(output_dir: Union[str, Path], months: Optional[List[int]] = None, input_dir: Optional[Union[str, Path]] = None, save_inputs: bool = False) -> List[Tuple[Optional[Path], Optional[Path]]]

Process all months sequentially (useful for debugging).

Same as `process` but without parallel processing. Useful for debugging issues, testing on small datasets, or when GEE rate limits are a concern.

Parameters:

Name	Type	Description	Default
`output_dir`	`str or Path`	Directory to save output rasters. Will be created if it doesn't exist.	required
`months`	`list of int`	List of months to process (1-12). If None, processes all months in the date range.	`None`
`input_dir`	`str or Path`	Directory to save downloaded input data (precipitation, AWC, ETo). If None and save_inputs is True, defaults to an `analysis_inputs/<name>` directory next to `output_dir`, where `<name>` is the final component of `output_dir`.	`None`
`save_inputs`	`bool`	Whether to save downloaded input data. Default is False.	`False`

Returns:

Type	Description
`list of tuple`	List of tuples containing paths to saved files: `(effective_precip_path, effective_precip_fraction_path)`. Returns `(None, None)` for months that failed to process.

Examples:

Debug a single month:

ep = EffectivePrecipitation(...)
results = ep.process_sequential(
    output_dir='./output',
    months=[1]  # Process only January
)

See Also

process: Parallel processing method (recommended for production).

Source code in pycropwat/core.py

def process_sequential(
    self,
    output_dir: Union[str, Path],
    months: Optional[List[int]] = None,
    input_dir: Optional[Union[str, Path]] = None,
    save_inputs: bool = False
) -> List[Tuple[Optional[Path], Optional[Path]]]:
    """
    Process all months sequentially (useful for debugging).

    Same as :meth:`process` but without parallel processing. Useful for
    debugging issues, testing on small datasets, or when GEE rate limits
    are a concern.

    Parameters
    ----------

    output_dir : str or Path
        Directory to save output rasters. Will be created if it
        doesn't exist.

    months : list of int, optional
        List of months to process (1-12). If None, processes all months
        in the date range.

    input_dir : str or Path, optional
        Directory to save downloaded input data (precipitation, AWC, ETo).
        If None and save_inputs is True, defaults to an
        ``analysis_inputs/<name>`` directory next to ``output_dir``,
        where ``<name>`` is the final component of ``output_dir``.

    save_inputs : bool, optional
        Whether to save downloaded input data. Default is False.

    Returns
    -------
    list of tuple
        List of tuples containing paths to saved files:
        ``(effective_precip_path, effective_precip_fraction_path)``.
        Returns ``(None, None)`` for months that failed to process.

    Examples
    --------
    Debug a single month:

    ```python
    ep = EffectivePrecipitation(...)
    results = ep.process_sequential(
        output_dir='./output',
        months=[1]  # Process only January
    )
    ```

    See Also
    --------
        process: Parallel processing method (recommended for production).
    """
    output_dir = Path(output_dir)
    output_dir.mkdir(parents=True, exist_ok=True)

    # Set up input directory for saving downloaded data
    if save_inputs:
        if input_dir is not None:
            self._input_dir = Path(input_dir)
        else:
            # Default: parallel to output_dir in analysis_inputs
            self._input_dir = output_dir.parent / 'analysis_inputs' / output_dir.name
        self._input_dir.mkdir(parents=True, exist_ok=True)
        logger.info(f"Input data will be saved to: {self._input_dir}")
    else:
        self._input_dir = None

    all_dates = get_monthly_dates(self.start_year, self.end_year)

    if months is not None:
        all_dates = [(y, m) for y, m in all_dates if m in months]

    results = []
    for year, month in all_dates:
        result = self._process_single_month(year, month, output_dir)
        results.append(result)

    return results
