
Align rain–discharge windows with ACIS observation-day dating #84

Open
michaelmfoley wants to merge 1 commit into main from fix/rain-observation-day-alignment


Problem

Most COOP/GHCN stations observe in the morning and attribute the preceding ~24 hours of rain to the observation day, so rain that falls on day d is predominantly recorded in MA_precipitation_daily on day d+1. The shared prior-rain helper in EEA_DP_CSO_map.py compounds this: rolling(2).sum().shift(1) produces a [d−2, d−1] recorded-day window, which both excludes the event day and misses the next-day-dated storm rain entirely.
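
A minimal sketch of the old window on a toy series. The dates and rainfall values are made up for illustration; only the rolling(2).sum().shift(1) expression comes from the helper.

```python
import pandas as pd

# Hypothetical recorded-day series: a storm falls on 2024-01-03 (physical time),
# but a morning-observing station logs it under 2024-01-04.
days = pd.date_range("2024-01-01", periods=6, freq="D")
recorded = pd.Series([0.0, 0.0, 0.0, 1.2, 0.0, 0.0], index=days)

# Old helper logic: 2-day rolling sum, then shifted forward one day.
# At day d this sums recorded days d-2 and d-1.
old_prior_rain = recorded.rolling(2).sum().shift(1)

event_day = pd.Timestamp("2024-01-03")
# The window for the event day covers recorded Jan 1-2, so the storm is invisible.
print(old_prior_rain[event_day])
```

In this toy case the storm only enters the old window on 2024-01-05, two days after the discharge it triggered.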

The result is that storm-triggered discharges systematically appear to have happened in dry weather. Three independent lines of evidence:

  • Of the 1,041 events classified "dry" (statewide 48-hr rain < 0.05", Verified Data Reports), 90% have ≥ 0.05" of statewide rain recorded the day after the event (median 0.30").
  • Operator-reported rainfall (rainfallData) correlates best with statewide precip at day +1 (r = 0.21), compared with day 0 (r = 0.14) and day −1 (r = 0.00). This is the fingerprint of observation-day dating.
  • Correcting the window restores the physically expected rain–discharge relationship (numbers below).
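
The lag comparison in the second bullet can be reproduced along these lines. This is a sketch, not the analysis code: lagged_corr is a hypothetical helper, and it assumes both inputs have already been aggregated to daily series on a shared, gap-free date index.

```python
import pandas as pd

def lagged_corr(operator_rain: pd.Series, statewide: pd.Series, lags=(-1, 0, 1)) -> dict:
    """Pearson r between operator rain on day d and statewide precip recorded on day d+lag."""
    result = {}
    for lag in lags:
        # shift(-lag) moves the value recorded on day d+lag onto index d.
        result[lag] = operator_rain.corr(statewide.shift(-lag))
    return result

# Toy data (made up): statewide records are dated one day after the rain fell.
days = pd.date_range("2024-01-01", periods=10, freq="D")
operator = pd.Series([0.0, 0.5, 0.0, 1.2, 0.1, 0.0, 0.8, 0.0, 0.3, 0.0], index=days)
statewide = operator.shift(1).fillna(0.0)

r_by_lag = lagged_corr(operator, statewide)
best_lag = max(r_by_lag, key=r_by_lag.get)
```

A peak at lag +1 rather than lag 0 is what next-day dating would produce; a peak at lag 0 would argue against it.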

Fix

In _load_rain_and_cso: shift the daily precip series back one day (recorded → physical alignment) and drop the trailing .shift(1), so the window covers the event day and the day before it in physical time.
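
A sketch of the corrected window, assuming a gap-free daily series indexed by recorded date. The function name prior_rain_physical is hypothetical and is not the code in _load_rain_and_cso.

```python
import pandas as pd

def prior_rain_physical(recorded: pd.Series) -> pd.Series:
    """48-hr prior rain in physical time from a recorded-day daily series."""
    # Recorded day d+1 holds rain that mostly fell on day d, so pull it back one day.
    physical = recorded.shift(-1)
    # No trailing shift: the window at day d covers physical days d-1 and d.
    return physical.rolling(2).sum()

# Toy data (made up): storm on 2024-01-03, recorded under 2024-01-04.
days = pd.date_range("2024-01-01", periods=6, freq="D")
recorded = pd.Series([0.0, 0.0, 0.0, 1.2, 0.0, 0.0], index=days)

new_prior_rain = prior_rain_physical(recorded)
event_day = pd.Timestamp("2024-01-03")
print(new_prior_rain[event_day])
```

One consequence of the back-shift is that the most recent day has no value until the following day's record arrives, so the last entry of the window is NaN.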

Also:

  • Relabels the monthly chart rainfall overlay from "Precipitation (48-hr lookback)" to "Precipitation (monthly total)" — it has always plotted monthly sums.
  • Documents the observation-day convention in the dashboard methodology note.

Effect (verified against AMEND.db, 2022-06-30 → 2026-04-19)

| Metric | Before | After |
|---|---|---|
| corr(prior 48-hr rain, daily discharge volume) | 0.01 | 0.58 |
| Spearman (same pair) | 0.08 | 0.61 |
| Fraction of dry days (<0.05") with any discharge | 34% | 11% |
| "Dry-weather" events (statewide 48-hr < 0.05") | 1,041 | 63 |
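
The dry-day fraction in the table can be computed along these lines. This is a sketch, not the code that produced the numbers above; the function name and the two date-indexed daily inputs are assumptions.

```python
import pandas as pd

def dry_day_discharge_fraction(prior_rain: pd.Series, discharge: pd.Series,
                               threshold: float = 0.05) -> float:
    """Share of dry days (prior rain below threshold, in inches) with any discharge."""
    # Keep only dates present in both series.
    prior_rain, discharge = prior_rain.align(discharge, join="inner")
    # NaN prior rain compares False, so days without a rain value are not counted as dry.
    dry = prior_rain < threshold
    return float((discharge[dry] > 0).mean())

# Toy data (made up): three dry days, one of which has a discharge.
days = pd.date_range("2024-01-01", periods=4, freq="D")
prior = pd.Series([0.0, 0.0, 0.5, 0.01], index=days)
volume = pd.Series([0.0, 10.0, 5.0, 0.0], index=days)

fraction = dry_day_discharge_fraction(prior, volume)
```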

Implications for the 2026-04 post

The post's "surprisingly weak rainfall↔discharge relationship" and the dry-weather discharge framing (34% of dry days → NPDES violations) are largely artifacts of this window. The post's charts use frozen MAEEADP_through_2025_* filenames and are not regenerated here — flagging for your call on whether to refresh them and/or revise the post text. The corrected dry cohort (63 events; ~half still showing operator-reported rain, so a defensible infrastructure-failure set is ~30–60 events) still concentrates in Fall River, New Bedford, and Springfield.

Notes

  • The dashboard monthly chart label fix deploys automatically on the next weekly chart CI run; the prior-rain charts (rainfall_discharge_freq/scatter/cdf) are not currently generated by dashboard_charts.py.
  • The one-day shift is an approximation: observation times vary by station (ASOS stations are midnight-dated). Per-station observation-time handling via ACIS metadata, or gridded NOAA AORC, would be more rigorous follow-ups.
  • Independent of #82, "Harden precipitation fetch against partial ACIS responses" (no overlapping lines; merges cleanly in either order).
  • Minor related issue, not fixed here: the ACIS parser maps 'T' (trace, ~4.6% of station-days) to NaN instead of 0.0. Dropping those near-zero values from the average biases the statewide mean slightly wet.
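
For the trace issue, a possible follow-up could look like the sketch below. parse_precip_value is a hypothetical name and does not reflect the existing parser; treating every other non-numeric code as missing is an assumption.

```python
import math

def parse_precip_value(value: str, trace_as: float = 0.0) -> float:
    """Convert one ACIS daily precip string to inches."""
    text = value.strip()
    if text == "T":
        # Trace: measurable rain did not fall, so count it as (near) zero, not missing.
        return trace_as
    try:
        return float(text)
    except ValueError:
        # Assumption: any other non-numeric code is treated as missing.
        return math.nan

values = [parse_precip_value(v) for v in ["0.25", "T", "M"]]
```

Keeping trace days in the denominator as 0.0 avoids averaging only over stations that reported measurable or zero rain.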

🤖 Generated with Claude Code

Most COOP/GHCN stations observe in the morning and attribute the
preceding ~24 hours of rain to the observation day, so rain that falls
on day d is predominantly recorded in MA_precipitation_daily on day
d+1. The prior-rain helper compounded this: rolling(2).sum().shift(1)
yields a [d-2, d-1] recorded window, which both excludes the event day
and misses the next-day-dated storm rain entirely.

Fix: shift the daily series back one day (recorded -> physical
alignment) and drop the trailing shift, so the window covers the event
day and the day before it in physical time.

Effect (verified against AMEND.db, 2022-06-30..2026-04-19, Verified
Data Reports):
- corr(prior 48-hr rain, daily discharge volume): 0.01 -> 0.58
  (Spearman 0.08 -> 0.61)
- fraction of dry days (<0.05") with any discharge: 34% -> 11%
- "dry-weather" events (statewide 48-hr rain < 0.05"): 1,041 -> 63;
  90% of the original 1,041 have >= 0.05" recorded the day after the
  event, and operator-reported rainfall correlates best with statewide
  precip at day +1 - the fingerprint of observation-day dating.

Also relabels the monthly chart rainfall overlay from "48-hr lookback"
to "monthly total" (it has always plotted monthly sums), and documents
the observation-day convention in the dashboard methodology note.

Affected outputs: rainfall_discharge_freq, rainfall_discharge_scatter,
rainfall_cdf charts (currently included only in the 2026-04 post with
frozen filenames; not regenerated here), and the dashboard monthly
chart label (auto-regenerates via weekly CI).

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>