Align rain–discharge windows with ACIS observation-day dating#84
Open
michaelmfoley wants to merge 1 commit into
Most COOP/GHCN stations observe in the morning and attribute the preceding ~24 hours of rain to the observation day, so rain that falls on day d is predominantly recorded in MA_precipitation_daily on day d+1. The prior-rain helper compounded this: rolling(2).sum().shift(1) yields a [d-2, d-1] recorded window, which both excludes the event day and misses the next-day-dated storm rain entirely.

Fix: shift the daily series back one day (recorded -> physical alignment) and drop the trailing shift, so the window covers the event day and the day before it in physical time.

Effect (verified against AMEND.db, 2022-06-30..2026-04-19, Verified Data Reports):
- corr(prior 48-hr rain, daily discharge volume): 0.01 -> 0.58 (Spearman 0.08 -> 0.61)
- fraction of dry days (<0.05") with any discharge: 34% -> 11%
- "dry-weather" events (statewide 48-hr rain < 0.05"): 1,041 -> 63; 90% of the removed events have >= 0.05" recorded the day after the event, and operator-reported rainfall correlates best with statewide precip at day +1 - the fingerprint of observation-day dating.

Also relabels the monthly chart rainfall overlay from "48-hr lookback" to "monthly total" (it has always plotted monthly sums), and documents the observation-day convention in the dashboard methodology note.

Affected outputs: rainfall_discharge_freq, rainfall_discharge_scatter, rainfall_cdf charts (currently included only in the 2026-04 post with frozen filenames; not regenerated here), and the dashboard monthly chart label (auto-regenerates via weekly CI).

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Problem
Most COOP/GHCN stations observe in the morning and attribute the preceding 24 hours of rain to the observation day, so rain that falls on day d is predominantly recorded in MA_precipitation_daily on day d+1. The shared prior-rain helper in EEA_DP_CSO_map.py compounds this: rolling(2).sum().shift(1) produces a [d−2, d−1] recorded-day window, which excludes the event day and misses the next-day-dated storm rain entirely.

The result is that storm-triggered discharges systematically appear to have happened in dry weather. Three independent lines of evidence:
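- Operator-reported rainfall (rainfallData) correlates best with statewide precip at day +1 (r = 0.21) vs. day 0 (0.14) and day −1 (0.00), the fingerprint of observation-day dating.

Fix
In _load_rain_and_cso: shift the daily precip series back one day (recorded → physical alignment) and drop the trailing .shift(1), so the window covers the event day and the day before it in physical time.

Also:
- Relabels the monthly chart rainfall overlay from "48-hr lookback" to "monthly total" (it has always plotted monthly sums).
- Documents the observation-day convention in the dashboard methodology note.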
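The window change described under Fix can be illustrated with a small pandas sketch. This is not the code in _load_rain_and_cso; the series, dates, and variable names below are hypothetical toy values chosen to show how the two expressions behave on a single storm. It assumes a gap-free daily index, so a positional shift(-1) moves each recorded value back exactly one calendar day.

```python
import pandas as pd

# Toy data (hypothetical): 1.2 in of rain physically falls on 2024-06-03,
# but a morning-observing station records it under 2024-06-04.
days = pd.date_range("2024-06-01", "2024-06-06", freq="D")
recorded = pd.Series([0.0, 0.0, 0.0, 1.2, 0.0, 0.0], index=days)

# Old helper: 2-day sum, then a trailing shift, i.e. recorded days [d-2, d-1].
old_window = recorded.rolling(2).sum().shift(1)

# Fix: move each recorded value back one day (recorded -> physical),
# then sum physical days [d-1, d] with no trailing shift.
# Positional shift assumes the daily index has no gaps.
physical = recorded.shift(-1)
new_window = physical.rolling(2).sum()

event_day = pd.Timestamp("2024-06-03")  # day the storm (and discharge) occurred
print("old window:", old_window[event_day])
print("new window:", new_window[event_day])
```

On this toy series the old window reports no rain for the storm day, while the realigned window picks up the 1.2 in that was recorded one day later.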
Effect (verified against AMEND.db, 2022-06-30 → 2026-04-19)
Implications for the 2026-04 post
The post's "surprisingly weak rainfall↔discharge relationship" and the dry-weather discharge framing (34% of dry days → NPDES violations) are largely artifacts of this window. The post's charts use frozen MAEEADP_through_2025_* filenames and are not regenerated here; flagging for your call on whether to refresh them and/or revise the post text. The corrected dry cohort (63 events; ~half still showing operator-reported rain, so a defensible infrastructure-failure set is ~30–60 events) still concentrates in Fall River, New Bedford, and Springfield.

Notes
rainfall_discharge_freq/scatter/cdf) are not currently generated by dashboard_charts.py.
'T' (trace, ~4.6% of station-days) to NaN instead of 0.0, slightly biasing the statewide mean wet.

🤖 Generated with Claude Code
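On the trace-coding note above, a minimal pure-Python sketch of why mapping 'T' to NaN rather than 0.0 can push a mean upward. It assumes the statewide mean skips NaN values; the readings and helper names are hypothetical and are not taken from the repository.

```python
import math

# Hypothetical station readings for one day, in inches; "T" marks a trace.
raw = ["0.00", "T", "0.50", "0.00", "T"]

def parse(value, trace_as):
    """Convert one raw reading to float, mapping trace 'T' to `trace_as`."""
    return trace_as if value == "T" else float(value)

def mean_skipping_nan(values):
    """Mean over non-NaN values (assumed behavior of the statewide mean)."""
    kept = [v for v in values if not math.isnan(v)]
    return sum(kept) / len(kept)

# Trace as NaN: trace stations drop out of the denominator.
mean_trace_nan = mean_skipping_nan([parse(v, math.nan) for v in raw])
# Trace as 0.0: trace stations count as (effectively) dry.
mean_trace_zero = mean_skipping_nan([parse(v, 0.0) for v in raw])

print(mean_trace_nan, mean_trace_zero)
```

With trace readings dropped, the same 0.50 in is averaged over three stations instead of five, so the mean comes out wetter than when traces count as zero.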