water-timeseries-v2🔗

Automated analysis of water timeseries data from satellite imagery and remote sensing sources.

Documentation🔗

📖 Full Documentation: View Documentation

The documentation includes: - Getting started guide - API reference (auto-generated from code) - Usage examples - Tutorial notebooks

Documentation is automatically built and deployed on every push to main using GitHub Actions.

Features🔗

Dynamic World Handler: Process Dynamic World land cover classifications
JRC Water Handler: Handle JRC water occurrence and classification data
Data Normalization: Automatic normalization and scaling of time series
Breakpoint Detection: Statistical (SimpleBreakpoint) and advanced (RBEAST) methods for detecting water extent changes
Batch Processing: Efficient processing of multiple spatial entities
Comprehensive Testing: Full test coverage including breakpoint detection, normalization, and integration tests

Quick Start🔗

Python API🔗

from water_timeseries.dataset import DWDataset
import xarray as xr

# Load data
ds = xr.open_dataset("water_data.nc")

# Process with Dynamic World handler
processor = DWDataset(ds)

# Access time series
water_extent = processor.ds_normalized[processor.water_column]

# Access normalized time series
water_extent = processor.ds_normalized["water"]

Command Line Interface🔗

The package provides a hierarchical CLI tool water-timeseries for running breakpoint detection on water datasets.

Installation🔗

# Using uv (recommended)
uv sync

# Or using pip
pip install .

Basic Usage🔗

# Show help
uv run water-timeseries --help

# Show breakpoint-analysis subcommand help
uv run water-timeseries breakpoint-analysis --help

# Show plot-timeseries subcommand help
uv run water-timeseries plot-timeseries --help

# Run with required arguments
uv run water-timeseries breakpoint-analysis data.zarr output.parquet

# Run with optional parameters
uv run water-timeseries breakpoint-analysis \
    data.zarr \
    output.parquet \
    --chunksize 100 \
    --n-jobs 4

Using a Config File🔗

You can also use a YAML configuration file:

uv run water-timeseries breakpoint-analysis --config-file configs/config.yaml

Example config file:

# config.yaml
water_dataset_file: /path/to/data.zarr
output_file: /path/to/output.parquet

# Optional: vector dataset for bbox filtering
vector_dataset_file: /path/to/lakes.parquet

# Bounding box filter (optional)
bbox_west: -160
bbox_east: -155
bbox_north: 68
bbox_south: 66

# Processing options
chunksize: 100
n_jobs: 20
min_chunksize: 10

CLI Options🔗

Option	Short	Description	Default
`water_dataset_file`		Path to water dataset (zarr or parquet)	Required*
`output_file`		Path to output parquet file	Required*
`--config-file`		Path to config YAML/JSON file	None
`--vector-dataset-file`	`-v`	Path to vector dataset (gpkg, shp, geojson)	None
`--chunksize`	`-c`	Number of IDs per chunk	100
`--n-jobs`	`-j`	Number of parallel jobs (>1 for Ray)	1
`--min-chunksize`	`-m`	Minimum chunk size	10
`--bbox-west`		Minimum longitude (west)	-180
`--bbox-south`		Minimum latitude (south)	60
`--bbox-east`		Maximum longitude (east)	180
`--bbox-north`		Maximum latitude (north)	90
`--output-geometry`		Export output with geometries	True
`--output-geometry-all`		Export output all geometries including non breakpoints	True

*Can also be provided via config file

Plot Timeseries🔗

Plot time series for a specific lake:

# Plot lake timeseries
uv run water-timeseries plot-timeseries data.zarr --lake-id b7uefy0bvcrc

# Save figure to file
uv run water-timeseries plot-timeseries data.zarr --lake-id b7uefy0bvcrc --output-figure plot.png

# Save only (no popup window)
uv run water-timeseries plot-timeseries data.zarr --lake-id b7uefy0bvcrc --output-figure plot.png --no-show

# Use config file
uv run water-timeseries plot-timeseries --config-file configs/plot_config.yaml

# Plot lake timeseries with break detection
uv run water-timeseries plot-timeseries tests/data/lakes_dw_test.zarr --lake-id b7uefy0bvcrc --output-figure examples/dw_example_b7uefy0bvcrc.png --break-method beast

Example Timeseries Plot

# Plot lake timeseries with JRC data
uv run water-timeseries plot-timeseries tests/data/lakes_jrc_test.zarr --lake-id b7uefy0bvcrc --output-figure examples/jrc_example_b7uefy0bvcrc.png --break-method beast

Example Timeseries Plot

Plot options:

Option	Description	Default
`water_dataset_file`	Path to water dataset (zarr or netCDF)	Required*
`--lake-id`	Geohash ID of the lake	Required*
`--output-figure`	Path to save output figure	None
`--break-method`	Break method to overlay (beast or simple)	None
`--no-show`	Don't show popup window, only save if output-figure is provided	False
`--config-file`	Path to config YAML/JSON file	None

*Can also be provided via config file

Main Classes🔗

Datasets🔗

DWDataset: Dynamic World land cover processor
JRCDataset: JRC water classification processor

Download🔗

EarthEngineDownloader: Download data from Google Earth Engine

Breakpoints🔗

SimpleBreakpoint: Statistical breakpoint detection
BeastBreakpoint: Advanced RBEAST-based detection

Interactive Dashboard🔗

The package includes an interactive Streamlit dashboard for visualizing lake polygons and time series data.

Running the Dashboard🔗

streamlit run src/water_timeseries/dashboard/app.py

Features🔗

Map Viewer: Interactive map displaying lake polygons from parquet files
Hover Tooltips: View attributes (id_geohash, Area_start_ha, Area_end_ha, NetChange_ha, NetChange_perc)
Click Selection: Click on a polygon to select it and view its time series
Time Series Plot: Automatic visualization of water extent over time
Automatic Download: If the selected lake's data is not in the cached dataset, it automatically downloads from Google Earth Engine
Popup View: Click "Open Time Series in Popup" for a larger view
EE Project Config: Set your Google Earth Engine project in the sidebar

Dashboard UI🔗

Dashboard

Configuration🔗

The dashboard accepts two optional arguments:

Parameter	Description	Default
`data_path`	Path to parquet file with lake polygons	`tests/data/lake_polygons.parquet`
`zarr_path`	Path to zarr file with cached time series data	`tests/data/lakes_dw_test.zarr`

Example custom paths:

from water_timeseries.dashboard.map_viewer import create_app

# With custom paths
create_app(
    data_path="/path/to/lakes.parquet",
    zarr_path="/path/to/data.zarr"
)

Documentation Links🔗

Getting Started - Installation and setup guide
Examples - Usage examples and tutorials
API Reference - Complete API documentation

Contributing🔗

We welcome contributions! Please ensure you: 1. Add docstrings to new functions and classes (Google style) 2. Update documentation in the docs/ folder 3. Run tests before submitting PRs

License🔗

[Add your license here]

Author🔗

Ingmar Nitze