Harden precipitation fetch against partial ACIS responses by michaelmfoley · Pull Request #82 · nesanders/MAenvironmentaldata

michaelmfoley · 2026-06-11T03:26:46Z

Summary

Two defensive fixes to the precipitation pipeline, found while investigating the dry-weather CSO cohort. Neither changes current outputs — see verification below — but both remove latent failure modes that would silently corrupt the rainfall data feeding the dashboard and any dry-weather analysis.

Changes

get_data/get_MA_precipitation.py — The station loop silently dropped any station whose row count didn't match the full-calendar-year date range. ACIS currently pads every station's response to the requested range (filling future/missing days with 'M'), so the guard never fires today — but if ACIS ever returns a short array (partial response, mid-year truncation), the old code would discard those stations wholesale with no warning, silently degrading the statewide average. Now:

edate is capped at today, so the current year doesn't request (or write) future-dated rows. The cached CSV currently carries 228 all-NaN rows for the rest of 2026; those go away on the next fetch.
Each station is reindexed to the requested range, so a partial response contributes its valid days instead of being dropped entirely.
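
The two changes above can be sketched roughly as follows. This is a minimal illustration, not the code in get_MA_precipitation.py: the function names, the `(date, value)` row layout, and the sample rows are all assumptions made for the example. Only the ideas come from the description, namely capping `edate` at today, treating 'M' as missing, and reindexing to the requested range.

```python
import pandas as pd


def capped_date_range(year, today):
    """Daily index for `year`, stopping at `today` when the year is still in progress."""
    sdate = pd.Timestamp(year=year, month=1, day=1)
    edate = min(pd.Timestamp(year=year, month=12, day=31), today)
    return pd.date_range(sdate, edate, freq="D")


def old_guard_keeps_station(rows, requested_index):
    """Old behaviour as described above: keep a station only on an exact length match."""
    return len(rows) == len(requested_index)


def station_series(rows, requested_index):
    """New behaviour: align whatever days came back to the requested range.

    `rows` is a list of (date_string, value_string) pairs; 'M' marks a missing day.
    (Hypothetical layout, chosen for the example.)
    """
    raw = pd.Series(
        [value for _, value in rows],
        index=pd.to_datetime([day for day, _ in rows]),
        dtype=object,
    )
    values = pd.to_numeric(raw, errors="coerce")  # 'M' (or any non-numeric flag) -> NaN
    return values.reindex(requested_index)


today = pd.Timestamp("2026-06-10")
requested = capped_date_range(2026, today)  # Jan 1 through Jun 10, no future rows

# A hypothetical short response: three days instead of the full requested range.
partial_rows = [("2026-01-01", "0.10"), ("2026-01-02", "M"), ("2026-01-03", "0.25")]

kept_by_old_guard = old_guard_keeps_station(partial_rows, requested)
aligned = station_series(partial_rows, requested)
valid_days = int(aligned.notna().sum())
```

Here the old length check would reject the station outright, while the reindexed series keeps its two valid days and leaves every other day as NaN, so a NaN-skipping statewide mean can still use them.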

analysis/EEA_DP_CSO_map.py — Monthly precipitation aggregation used .sum(), which returns 0 (not NaN) for a month with no data, rendering a data gap as "zero rainfall" on the dashboard's monthly volume + rainfall chart. min_count=1 makes such months honest gaps. No currently shipped month is affected (verified all months in the live chart have real values); this guards the display if the precip fetch ever falls behind the CSO data window.
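
A small pandas illustration of the `.sum()` behaviour described above. The series and the monthly grouping are made up for the example; how EEA_DP_CSO_map.py actually groups by month is not shown in this description.

```python
import numpy as np
import pandas as pd

# Hypothetical daily precipitation: April has data, May is entirely missing.
daily = pd.Series(
    [0.2, 0.0, np.nan, np.nan],
    index=pd.to_datetime(["2026-04-10", "2026-04-20", "2026-05-10", "2026-05-20"]),
)

by_month = daily.groupby(daily.index.to_period("M"))

default_sum = by_month.sum()              # all-NaN month sums to 0.0
gap_aware_sum = by_month.sum(min_count=1)  # all-NaN month stays NaN
```

With the default `min_count=0`, May comes out as 0.0 and would plot as a dry month. With `min_count=1` it stays NaN, which a chart can render as a gap.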

Verification (no regression)

Fetched live ACIS data (2026-06-10) and processed the same responses under old and new logic, for 2022–2026:

Daily statewide averages: identical where both are defined (max |diff| = 0.000").
Station coverage: identical per year (the length guard never fires against current ACIS responses).
Statewide-dry CSO cohort (precip_48h < 0.05" across all 17,732 incidents): unchanged at 1,955 events / 1,987 Mgal; zero events flip classification.
Cached CSV values match a fresh fetch to within 0.008" on overlapping 2026 days.
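
The checks above could be expressed along these lines. This is a sketch under assumptions, not the script that was actually run: the helper names and sample values are hypothetical. Only the 0.05" threshold and the "identical where both are defined" criterion come from the verification notes.

```python
import numpy as np
import pandas as pd

DRY_THRESHOLD_IN = 0.05  # statewide-dry cutoff quoted in this PR: precip_48h < 0.05"


def max_abs_diff_where_both_defined(old, new):
    """Largest |old - new| over dates where both series have a value."""
    old, new = old.align(new)  # outer-join the date indexes
    both = old.notna() & new.notna()
    if not both.any():
        return float("nan")
    return float((old[both] - new[both]).abs().max())


def classification_flips(old_precip_48h, new_precip_48h, threshold=DRY_THRESHOLD_IN):
    """Number of incidents whose dry/not-dry label differs between the two runs."""
    old_dry = old_precip_48h < threshold
    new_dry = new_precip_48h < threshold
    return int((old_dry != new_dry).sum())


# Hypothetical data: the old output carries two future-dated NaN rows, the new one does not.
days = pd.date_range("2026-06-08", periods=5, freq="D")
old_daily = pd.Series([0.10, 0.00, 0.32, np.nan, np.nan], index=days)
new_daily = pd.Series([0.10, 0.00, 0.32], index=days[:3])
max_diff = max_abs_diff_where_both_defined(old_daily, new_daily)

# Hypothetical per-incident 48-hour totals under old and new logic.
old_48h = pd.Series([0.00, 0.04, 0.40])
new_48h = old_48h.copy()
flips = classification_flips(old_48h, new_48h)
```

Restricting the comparison to dates where both series are defined keeps the removed future-dated rows from registering as differences.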

Out of scope / follow-up

The larger dry-weather question is methodological, not pipeline: the statewide average smooths localized storms to near-zero, so most "dry" events have substantial operator-reported rainfall at the outfall (MAEEADP_CSO.rainfallData). Redefining the dry cohort using local precipitation is being pursued separately.
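
A toy calculation shows how averaging can hide a localized storm. The station count and rainfall totals are invented for illustration. Only the 0.05" threshold comes from this PR.

```python
from statistics import mean

DRY_THRESHOLD_IN = 0.05  # statewide-dry cutoff quoted in this PR

# Hypothetical 48-hour totals (inches): one station under a storm, 19 stations dry.
station_totals_48h = [0.80] + [0.0] * 19

statewide_avg = mean(station_totals_48h)  # 0.80 / 20, about 0.04
classified_dry = statewide_avg < DRY_THRESHOLD_IN
local_max = max(station_totals_48h)
```

In this made-up case the statewide mean falls below the cutoff, so an event near the wet station would be labelled dry even though that station recorded 0.80".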

Also out of scope: the fetch script only re-fetches from the most recent cached year, so any historical correction would require forcing a full re-fetch.

🤖 Generated with Claude Code

Two defensive fixes to the precipitation pipeline. Verified against live ACIS data (2026-06-10) that neither changes current outputs: daily averages are identical to within rounding, and the statewide-dry CSO cohort (precip_48h < 0.05") is unchanged at 1,955 events / 1,987 Mgal.

1. get_MA_precipitation.py: the station loop silently dropped any station whose row count didn't match the full-calendar-year range. ACIS currently pads every station to the requested range, so the guard never fires today. If ACIS ever returns a short array (partial response, mid-year truncation), however, the old code would silently discard those stations wholesale. Now edate is capped at today and each station is reindexed to the requested range, so partial responses contribute their valid days. Also stops writing future-dated all-NaN rows for the remainder of the current year (the cached CSV carried 228 such rows).

2. EEA_DP_CSO_map.py: monthly aggregation used .sum(), which returns 0 (not NaN) for a month with no precipitation data, rendering a data gap as "zero rainfall" on the dashboard. min_count=1 makes such months honest gaps. No currently shipped month is affected (all months in the live chart have data); this guards the display if the precip fetch ever falls behind the CSO data window.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

michaelmfoley mentioned this pull request Jun 12, 2026

Align rain–discharge windows with ACIS observation-day dating #84

Open

michaelmfoley wants to merge 1 commit into main from fix/precip-partial-year-coverage
