water-timeseries-v2🔗
Automated analysis of water timeseries data from satellite imagery and remote sensing sources.
Documentation🔗
📖 Full Documentation: View Documentation
The documentation includes: - Getting started guide - API reference (auto-generated from code) - Usage examples - Tutorial notebooks
Documentation is automatically built and deployed on every push to main using GitHub Actions.
Features🔗
- Dynamic World Handler: Process Dynamic World land cover classifications
- JRC Water Handler: Handle JRC water occurrence and classification data
- Data Normalization: Automatic normalization and scaling of time series
- Breakpoint Detection: Statistical (SimpleBreakpoint) and advanced (RBEAST) methods for detecting water extent changes
- Batch Processing: Efficient processing of multiple spatial entities
- Comprehensive Testing: Full test coverage including breakpoint detection, normalization, and integration tests
Quick Start🔗
Python API🔗
from water_timeseries.dataset import DWDataset
import xarray as xr
# Load data
ds = xr.open_dataset("water_data.nc")
# Process with Dynamic World handler
processor = DWDataset(ds)
# Access time series
water_extent = processor.ds_normalized[processor.water_column]
# Access normalized time series
water_extent = processor.ds_normalized["water"]
Command Line Interface🔗
The package provides a hierarchical CLI tool water-timeseries for running breakpoint detection on water datasets.
Installation🔗
# Using uv (recommended)
uv sync
# Or using pip
pip install .
Basic Usage🔗
# Show help
uv run water-timeseries --help
# Show breakpoint-analysis subcommand help
uv run water-timeseries breakpoint-analysis --help
# Show plot-timeseries subcommand help
uv run water-timeseries plot-timeseries --help
# Run with required arguments
uv run water-timeseries breakpoint-analysis data.zarr output.parquet
# Run with optional parameters
uv run water-timeseries breakpoint-analysis \
data.zarr \
output.parquet \
--chunksize 100 \
--n-jobs 4
Using a Config File🔗
You can also use a YAML configuration file:
uv run water-timeseries breakpoint-analysis --config-file configs/config.yaml
Example config file:
# config.yaml
water_dataset_file: /path/to/data.zarr
output_file: /path/to/output.parquet
# Optional: vector dataset for bbox filtering
vector_dataset_file: /path/to/lakes.parquet
# Bounding box filter (optional)
bbox_west: -160
bbox_east: -155
bbox_north: 68
bbox_south: 66
# Processing options
chunksize: 100
n_jobs: 20
min_chunksize: 10
CLI Options🔗
| Option | Short | Description | Default |
|---|---|---|---|
water_dataset_file |
Path to water dataset (zarr or parquet) | Required* | |
output_file |
Path to output parquet file | Required* | |
--config-file |
Path to config YAML/JSON file | None | |
--vector-dataset-file |
-v |
Path to vector dataset (gpkg, shp, geojson) | None |
--chunksize |
-c |
Number of IDs per chunk | 100 |
--n-jobs |
-j |
Number of parallel jobs (>1 for Ray) | 1 |
--min-chunksize |
-m |
Minimum chunk size | 10 |
--bbox-west |
Minimum longitude (west) | -180 | |
--bbox-south |
Minimum latitude (south) | 60 | |
--bbox-east |
Maximum longitude (east) | 180 | |
--bbox-north |
Maximum latitude (north) | 90 | |
--output-geometry |
Export output with geometries | True | |
--output-geometry-all |
Export output all geometries including non breakpoints | True |
*Can also be provided via config file
Plot Timeseries🔗
Plot time series for a specific lake:
# Plot lake timeseries
uv run water-timeseries plot-timeseries data.zarr --lake-id b7uefy0bvcrc
# Save figure to file
uv run water-timeseries plot-timeseries data.zarr --lake-id b7uefy0bvcrc --output-figure plot.png
# Save only (no popup window)
uv run water-timeseries plot-timeseries data.zarr --lake-id b7uefy0bvcrc --output-figure plot.png --no-show
# Use config file
uv run water-timeseries plot-timeseries --config-file configs/plot_config.yaml
# Plot lake timeseries with break detection
uv run water-timeseries plot-timeseries tests/data/lakes_dw_test.zarr --lake-id b7uefy0bvcrc --output-figure examples/dw_example_b7uefy0bvcrc.png --break-method beast

# Plot lake timeseries with JRC data
uv run water-timeseries plot-timeseries tests/data/lakes_jrc_test.zarr --lake-id b7uefy0bvcrc --output-figure examples/jrc_example_b7uefy0bvcrc.png --break-method beast

Plot options:
| Option | Short | Description | Default |
|---|---|---|---|
water_dataset_file |
Path to water dataset (zarr or netCDF) | Required* | |
--lake-id |
Geohash ID of the lake | Required* | |
--output-figure |
Path to save output figure | None | |
--break-method |
Break method to overlay (beast or simple) | None | |
--no-show |
Don't show popup window, only save if output-figure is provided | False | |
--config-file |
Path to config YAML/JSON file | None |
*Can also be provided via config file
Main Classes🔗
Datasets🔗
- DWDataset: Dynamic World land cover processor
- JRCDataset: JRC water classification processor
Download🔗
- EarthEngineDownloader: Download data from Google Earth Engine
Breakpoints🔗
- SimpleBreakpoint: Statistical breakpoint detection
- BeastBreakpoint: Advanced RBEAST-based detection
Interactive Dashboard🔗
The package includes an interactive Streamlit dashboard for visualizing lake polygons and time series data.
Running the Dashboard🔗
streamlit run src/water_timeseries/dashboard/app.py
Features🔗
- Map Viewer: Interactive map displaying lake polygons from parquet files
- Hover Tooltips: View attributes (id_geohash, Area_start_ha, Area_end_ha, NetChange_ha, NetChange_perc)
- Click Selection: Click on a polygon to select it and view its time series
- Time Series Plot: Automatic visualization of water extent over time
- Automatic Download: If the selected lake's data is not in the cached dataset, it automatically downloads from Google Earth Engine
- Popup View: Click "Open Time Series in Popup" for a larger view
- EE Project Config: Set your Google Earth Engine project in the sidebar
Dashboard UI🔗

Configuration🔗
The dashboard accepts two optional arguments:
| Parameter | Description | Default |
|---|---|---|
data_path |
Path to parquet file with lake polygons | tests/data/lake_polygons.parquet |
zarr_path |
Path to zarr file with cached time series data | tests/data/lakes_dw_test.zarr |
Example custom paths:
from water_timeseries.dashboard.map_viewer import create_app
# With custom paths
create_app(
data_path="/path/to/lakes.parquet",
zarr_path="/path/to/data.zarr"
)
Documentation Links🔗
- Getting Started - Installation and setup guide
- Examples - Usage examples and tutorials
- API Reference - Complete API documentation
Contributing🔗
We welcome contributions! Please ensure you:
1. Add docstrings to new functions and classes (Google style)
2. Update documentation in the docs/ folder
3. Run tests before submitting PRs
License🔗
[Add your license here]
Author🔗
Ingmar Nitze