2012 · Problem B — How Much Gas Should I Buy This Week?
Time-series forecasting · Decision policy · ARIMA / exponential smoothing · Out-of-sample validation
The prompt, restated
A consumer who drives a standard sedan must decide each Monday whether to top up the tank now (a full fill) or to defer half of the purchase to next Monday (a half fill, paying this week's price for half a tank and next week's price for the other half). Teams are given weekly U.S. retail gasoline prices for 2011 and asked to: (1) Build a model — fitted on 2011 data — that takes the recent price history and recommends "full" or "half" each week, with a clearly stated objective (minimize expected annual fuel cost). (2) Examine how the recommendation changes with the consumer's typical weekly driving distance — a long-haul commuter has different stockout risk than a weekend driver. (3) Build at least one major-city variant (the EIA publishes weekly retail prices for several metro areas) and report whether the rule generalizes. (4) Validate the rule out-of-sample against 2012 weekly prices. (5) Deliver a one-page summary and a letter suitable for publication in a local newspaper that a non-technical reader can act on next Monday.
The framing is small and concrete, but the modeling space is wide — anything from a hand-tuned moving-average rule to ARIMA forecasting to dynamic programming on the two-week state. The winning papers pick the simplest model that beats the obvious benchmarks (always-full, always-half) by a defensible margin and explain why a more complex model isn't worth the added fragility.
Key modeling idea
Let $p_t$ be this week's price per gallon, $G$ the tank capacity (gallons), and $d$ the weekly demand (gallons per week, $d \le G$). The decision is a binary policy $a_t \in \{\text{full}, \text{half}\}$. The cost of "full" this week is $G\, p_t$; the cost of "half" is $\tfrac{G}{2} p_t + \tfrac{G}{2}\, p_{t+1}$ (with the implicit assumption that half a tank covers a week of driving, i.e., $d \le G/2$ — a constraint we revisit when the driving distance is large). The decision rule is therefore
$$\text{buy half if } \mathbb{E}[p_{t+1} \mid \mathcal{F}_t] < p_t,\quad \text{else buy full.}$$
So the modeling problem collapses to a one-week-ahead forecast of the weekly retail gasoline price, plus a feasibility check that half a tank covers expected driving. Candidate forecasters, from simplest to most elaborate: (i) random walk $\hat{p}_{t+1} = p_t$ (no buy-half signal — the benchmark to beat); (ii) one-week momentum $\hat{p}_{t+1} = p_t + \alpha (p_t - p_{t-1})$; (iii) exponential smoothing / Holt's linear trend; (iv) ARIMA(1,1,1) fit on a rolling window (technique 8 — ARIMA); (v) a regression on lagged crude-oil prices.
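The rule and the momentum forecaster (ii) can be sketched in a few lines. This is a minimal illustration under stated assumptions: the function names, the value `alpha=0.5`, and the three prices are made up for the example and are not taken from the problem or from EIA data.

```python
def momentum_forecast(p_prev, p_now, alpha=0.5):
    """One-week momentum: p_hat = p_now + alpha * (p_now - p_prev)."""
    return p_now + alpha * (p_now - p_prev)


def decide(p_now, p_hat, d, G):
    """Return 'half' only if next week looks cheaper and half a tank covers the week."""
    feasible = d <= G / 2
    return "half" if (p_hat < p_now and feasible) else "full"


def realized_cost(action, p_now, p_next, G):
    """Cost of one tank under the chosen action, using the realized next-week price."""
    if action == "half":
        return 0.5 * G * p_now + 0.5 * G * p_next
    return G * p_now


# Illustrative prices in $/gal (made up for this example, not EIA data).
p_prev, p_now, p_next = 3.60, 3.50, 3.45
G, d = 15.0, 7.5

p_hat = momentum_forecast(p_prev, p_now)   # prices fell, so the forecast is below p_now
action = decide(p_now, p_hat, d, G)        # falling prices and d <= G/2 give "half"
cost = realized_cost(action, p_now, p_next, G)
print(action, round(cost, 3))
```

With a random-walk forecast (`p_hat == p_now`) the same `decide` function always returns "full", which is why forecaster (i) doubles as the always-full benchmark.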
Suggested approach
- Step 1 — Pull the data. Download EIA weekly U.S. retail gasoline prices for 2011 (fit) and 2012 (validation), plus at least one major-city series (e.g., Los Angeles, Chicago). Plot, log-difference, and inspect autocorrelation (technique 4 — regression / EDA).
- Step 2 — Pick a forecast. Fit options (i)–(v) on 2011 only, with rolling-origin cross-validation. Report one-week MAE and directional accuracy (fraction of weeks where the sign of $\hat{p}_{t+1} - p_t$ matches reality) (technique 8 — ARIMA / smoothing).
- Step 3 — Map forecast to action. Apply the buy-half-if-cheaper-next-week rule. Tie-break (and noise-protect) with a deadband: only buy half if $\hat{p}_{t+1} - p_t < -\epsilon$ where $\epsilon$ is roughly the forecast MAE.
- Step 4 — Add the driving-distance dimension. Let $d$ be weekly gallons consumed. If $d > G/2$, a "half" purchase forces a re-fill mid-week at this week's price anyway — the rule degrades to always-full. Plot annual cost as a function of $d$ for both rule and benchmarks.
- Step 5 — Out-of-sample test on 2012. Replay the rule weekly through 2012 using only past data. Report annual cost vs. always-full and always-half benchmarks, plus a paired-week confidence interval (technique 10 — bootstrap CI). Repeat for the major-city series.
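The scoring in Step 2 can be sketched as a rolling-origin loop that returns both one-week MAE and directional accuracy. This is a hedged sketch rather than a reference implementation: the helper names, the `warmup=8` default, and the synthetic price series are assumptions for illustration only.

```python
import numpy as np


def rolling_origin_eval(prices, forecaster, warmup=8):
    """One-week-ahead rolling-origin evaluation.

    prices     : 1-D array of weekly prices, oldest first
    forecaster : callable(history) -> forecast for the next week,
                 where history = prices[: t + 1] (strictly causal)
    warmup     : number of initial weeks used only as history
    Returns (MAE, directional accuracy).
    """
    prices = np.asarray(prices, dtype=float)
    errors, hits = [], []
    for t in range(warmup, len(prices) - 1):
        history = prices[: t + 1]
        p_hat = forecaster(history)
        p_now, p_next = prices[t], prices[t + 1]
        errors.append(abs(p_hat - p_next))
        # Direction is scored only in weeks where the price actually moved
        # and the forecaster predicted a move.
        if p_next != p_now and p_hat != p_now:
            hits.append(np.sign(p_hat - p_now) == np.sign(p_next - p_now))
    mae = float(np.mean(errors))
    dir_acc = float(np.mean(hits)) if hits else float("nan")
    return mae, dir_acc


def random_walk(history):
    return history[-1]


def momentum(history, alpha=0.5):
    return history[-1] + alpha * (history[-1] - history[-2])


# Synthetic series for illustration only (a smooth trend plus a cycle).
weeks = np.arange(52)
prices = 3.0 + 0.01 * weeks + 0.2 * np.sin(weeks / 6.0)

for name, f in [("random walk", random_walk), ("momentum", momentum)]:
    mae, acc = rolling_origin_eval(prices, f)
    print(f"{name:12s} MAE={mae:.4f} directional accuracy={acc:.2f}")
```

The random walk never predicts a move, so its directional accuracy is reported as `nan` instead of being scored as a coin flip. On real EIA data the gap between forecasters will be far smaller than on this smooth synthetic series.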
Data sources to consider
| Source | What you get |
|---|---|
| U.S. EIA "Weekly Retail Gasoline and Diesel Prices" (series EMM_EPMR_PTE_NUS_DPG) | National weekly regular retail gasoline prices, 1990–present — the headline dataset |
| EIA major-city weekly series (Los Angeles, Chicago, New York, Houston, Seattle, Boston) | Metro-area variants for the "at least one city" requirement |
| EIA weekly WTI & Brent crude prices | Strong lagged predictor of retail gasoline — useful exogenous regressor |
| Hyndman & Athanasopoulos, Forecasting: Principles and Practice | Reference for ETS, ARIMA, and rolling-origin cross-validation |
| U.S. DOT National Household Travel Survey | Distribution of weekly driving distance — used to defend the $d$ sweep range |
Common pitfalls
- Look-ahead leakage. If your forecaster uses any data from the week it is predicting — including a "future" smoothing kernel — your validation collapses. Use strictly causal rolling-origin fits.
- Beating the wrong benchmark. The honest benchmark is always-full (no decisions at all). Beating always-half is too easy in a rising-price year, because always-half buys half of every tank at next week's higher price.
- Ignoring transaction friction. Two trips to the pump cost more than one — fuel, time, and risk of a higher mid-week price. Add a small per-stop penalty $\tau$ and show the optimal rule is insensitive to it up to a threshold.
- Treating prices as i.i.d. Weekly retail gasoline is heavily autocorrelated (slow pass-through from crude). A random-walk model is fine; an i.i.d. mean-reverting model is wrong.
- Overfitting an ARIMA. On 52 weeks of training data, ARIMA(2,1,2) will look great in sample and collapse out of sample. Prefer ETS or ARIMA(1,1,0) and defend the choice with AIC plus a rolling-window MAE.
- No major-city variant. The prompt asks for at least one. A model fit on national prices that fails on Los Angeles (which has its own refinery-driven shocks) needs explicit acknowledgement.
- Letter buried in jargon. The newspaper-ready piece should fit a fridge magnet: "If pump prices fell this week, buy half a tank; if they rose, buy a full tank. If you drive more than $G/2$ gallons a week, always fill up."
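The look-ahead pitfall can be tested mechanically: overwrite every price after week $t$ with arbitrary values and confirm the forecast for week $t+1$ does not change. The sketch below is one possible check, with hypothetical helper names and a synthetic series; the centered moving average is included only as a deliberately leaky example.

```python
import numpy as np


def is_causal(forecaster, prices, t, trials=20, seed=0):
    """Check that forecaster(prices, t) ignores every observation after week t."""
    rng = np.random.default_rng(seed)
    prices = np.asarray(prices, dtype=float)
    baseline = forecaster(prices, t)
    for _ in range(trials):
        tampered = prices.copy()
        # Overwrite all future weeks with arbitrary values.
        tampered[t + 1:] = rng.uniform(0.0, 10.0, size=len(prices) - t - 1)
        if forecaster(tampered, t) != baseline:
            return False
    return True


def causal_ma(prices, t, window=4):
    """Trailing moving average over weeks t-window+1 .. t."""
    return float(np.mean(prices[max(0, t - window + 1): t + 1]))


def leaky_ma(prices, t, window=2):
    """Centered moving average: deliberately leaks weeks t+1 .. t+window."""
    return float(np.mean(prices[max(0, t - window): t + window + 1]))


prices = np.linspace(3.0, 4.0, 30)  # synthetic, for illustration only
print(is_causal(causal_ma, prices, t=15))  # the trailing average passes
print(is_causal(leaky_ma, prices, t=15))   # the centered average fails
```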
Python sketch
Rolling-origin one-week-ahead forecast with a deadband decision rule, validated against 2012 prices and a major-city series.
import numpy as np
import pandas as pd

# --- Step 1: load weekly retail gasoline prices ($/gal) ---
# columns: week (date), national, los_angeles   [illustrative subset]
df = pd.read_csv("eia_weekly_gas.csv", parse_dates=["week"]).set_index("week")
train = df.loc["2011"]  # fit / tuning year (use it to choose alpha, beta, epsilon)
test = df.loc["2012"]   # out-of-sample validation year

G = 15.0    # tank capacity in gallons
d = 7.5     # weekly driving demand in gallons [illustrative ~225 mi @ 30 mpg]
tau = 0.05  # per-stop friction in $ [illustrative]

# --- Step 2: simple one-week-ahead forecasters ---
def forecast_naive(series, t):
    return series.iloc[t]  # random walk: next week's price equals this week's

def forecast_holt(series, t, alpha=0.6, beta=0.2):
    # causal Holt's linear trend, fit on series[:t+1] only
    s = series.iloc[: t + 1].values
    L, T = s[0], 0.0  # initial level and trend
    for x in s[1:]:
        L_new = alpha * x + (1 - alpha) * (L + T)
        T = beta * (L_new - L) + (1 - beta) * T
        L = L_new
    return L + T  # one-step-ahead forecast

# --- Step 3 + 5: simulate the policy on 2012 with deadband epsilon ---
def annual_cost(series, forecaster, epsilon=0.02):
    s = series.values
    cost = 0.0
    stops = 0
    for t in range(len(s) - 1):  # the last week has no next-week price
        p_now = s[t]
        p_pred = forecaster(series, t)
        if p_pred - p_now < -epsilon and d <= G / 2:
            # buy half now, half next week
            cost += 0.5 * G * p_now + 0.5 * G * s[t + 1]
            stops += 2
        else:
            cost += G * p_now
            stops += 1
    return cost + tau * stops

# always-full benchmark: the random-walk forecast never clears the deadband,
# so the rule never buys half
bench_full = annual_cost(test["national"], forecast_naive)
rule_holt = annual_cost(test["national"], forecast_holt)
print(f"2012 national, always-full: ${bench_full:8.2f}")
print(f"2012 national, Holt rule  : ${rule_holt:8.2f}")
print(f"savings: ${bench_full - rule_holt:.2f} ({(bench_full - rule_holt)/bench_full:.1%})")

# major-city replication
print(f"2012 LA, Holt rule        : ${annual_cost(test['los_angeles'], forecast_holt):8.2f}")
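One way to connect the per-stop friction `tau` in the sketch above to the deadband, using the sketch's own accounting (a half fill makes two stops, a full fill makes one): the half fill pays off only when the expected price saving on the deferred half tank exceeds the cost of the extra stop.

$$\tfrac{G}{2}\,\bigl(p_t - \mathbb{E}[p_{t+1} \mid \mathcal{F}_t]\bigr) > \tau \quad\Longleftrightarrow\quad \mathbb{E}[p_{t+1} \mid \mathcal{F}_t] - p_t < -\frac{2\tau}{G}.$$

With the illustrative values $G = 15$ and $\tau = 0.05$, the friction-implied deadband is $2\tau/G \approx \$0.0067$ per gallon, smaller than the default $\epsilon = 0.02$, so at these values the forecast-noise deadband is the binding one. This is a reading of the sketch's cost model, not a result from the problem statement.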
Sensitivity & validation checklist
- Sweep deadband $\epsilon$ from \$0.00 to \$0.10 — the chosen $\epsilon$ should sit near the forecast MAE; cost should be flat across a wide plateau (a good sign).
- Sweep weekly driving demand $d$ from 2 to 14 gallons — the savings should taper to zero as $d \to G/2$ and turn negative if you ignore the feasibility constraint.
- Block-bootstrap 2012 into 1,000 resampled years and report a 95% CI on annual savings; if the CI crosses zero, the rule is not statistically distinguishable from always-full [illustrative point estimate ~3–6% savings].
- Replay on Los Angeles and one other metro — if the rule loses money on either, the newspaper letter must caveat geography.
- Check directional accuracy of the forecaster — a hit rate near 50% means the policy is gambling; you want ≥ 55% before claiming the rule is real.
- Verify no look-ahead: for each week $t$, recompute the decision with all prices after week $t$ removed (or replaced by arbitrary values) and confirm the decisions are identical.
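The block-bootstrap item in the checklist can be sketched as follows. This is a minimal moving-block bootstrap under stated assumptions: the function name, the 4-week block length, and the synthetic weekly savings are illustrative choices, not values from the problem or from real results.

```python
import numpy as np


def block_bootstrap_ci(weekly_savings, block=4, n_boot=1000, level=0.95, seed=0):
    """Moving-block bootstrap CI for the sum of weekly savings.

    weekly_savings : per-week cost(always-full) - cost(rule), in dollars
    block          : block length in weeks (an assumption; tune to the autocorrelation)
    """
    x = np.asarray(weekly_savings, dtype=float)
    n = len(x)
    rng = np.random.default_rng(seed)
    n_blocks = int(np.ceil(n / block))
    totals = np.empty(n_boot)
    for b in range(n_boot):
        # Draw block start points with replacement, then stitch blocks together.
        starts = rng.integers(0, n - block + 1, size=n_blocks)
        idx = (starts[:, None] + np.arange(block)[None, :]).ravel()[:n]
        totals[b] = x[idx].sum()
    lo, hi = np.quantile(totals, [(1 - level) / 2, (1 + level) / 2])
    return float(lo), float(hi)


# Synthetic weekly savings for illustration only (not real results).
rng = np.random.default_rng(1)
weekly_savings = rng.normal(loc=0.10, scale=0.50, size=52)

lo, hi = block_bootstrap_ci(weekly_savings)
print(f"95% CI on annual savings: [{lo:.2f}, {hi:.2f}] dollars")
print("CI includes zero" if lo <= 0.0 <= hi else "CI excludes zero")
```

Resampling blocks of consecutive weeks, instead of individual weeks, keeps some of the week-to-week autocorrelation in the savings series, which an ordinary i.i.d. bootstrap would destroy.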