USACO 2021 January · Platinum · all three problems
Problem 1 — Sum of Distances
Statement
Bessie has K connected undirected graphs G1, …, GK. Form the tensor-product graph G whose vertices are K-tuples (v1, …, vK); two tuples are adjacent if every coordinate pair (vi, vi') is an edge of Gi. Compute the sum of shortest-path distances in G from (1, 1, …, 1) to every reachable vertex, modulo 109+7.
Constraints
- 2 ≤ K ≤ 5 · 104.
- For each i: Ni ≥ 2, Mi ≥ Ni − 1; Gi is connected.
- ΣNi ≤ 105, ΣMi ≤ 2 · 105.
- Self-loops allowed; no multi-edges.
- Time limit: 2s.
Key idea
A step in G moves every coordinate along an edge of its graph. Run BFS in each Gi from vertex 1 splitting by parity of distance — let eveni[v] / oddi[v] be the shortest even/odd distance from 1 to v in Gi (∞ if unreachable in that parity). In the product G, the shortest distance to (v1, …, vK) has a fixed parity p (parity of any reaching walk length); within parity p it equals maxi dip(v) if every coordinate is reachable with parity p, else infinite. Sum over all tuples and both parities. The sum is decomposable across coordinates by inclusion–exclusion on "which coordinate achieves the max", or by sorting all (parity, vertex) pairs across all graphs and sweeping (the editorial approach).
Complexity target
O((ΣNi + ΣMi) + (ΣNi) log (ΣNi)).
Reference solution (C++)
#include <bits/stdc++.h>
using namespace std;
using ll = long long;
const ll MOD = 1'000'000'007;
const int INF = INT_MAX / 4;
// BFS that returns even/odd shortest distance from src in graph g.
pair<vector<int>, vector<int>> bfsParity(const vector<vector<int>>& g, int src) {
int n = (int)g.size();
vector<int> ev(n, INF), od(n, INF);
ev[src] = 0;
queue<pair<int,int>> q; // (node, parity)
q.push({src, 0});
while (!q.empty()) {
auto [u, par] = q.front(); q.pop();
int d = (par == 0 ? ev[u] : od[u]);
for (int v : g[u]) {
int np = par ^ 1;
int& dd = (np == 0 ? ev[v] : od[v]);
if (d + 1 < dd) { dd = d + 1; q.push({v, np}); }
}
}
return {ev, od};
}
int main() {
ios_base::sync_with_stdio(false);
cin.tie(nullptr);
int K; cin >> K;
// For tuple (v1,...,vK) the distance with parity p is max d_i^p(v_i).
// Strategy: sum over d of d * (#tuples with max == d) per parity, then add.
// Maintain le[i] = # vertices in graph i with d_i^p <= d; tuples with max<=d
// count = prod le[i]. Subtract prev value to get "max == d".
ll answer = 0;
// (Full implementation reads all K graphs once into memory and runs
// the parity BFS per graph, then runs the merged sweep twice -- one
// per parity. Sketch below shows the per-parity inner sweep.)
// -- per-parity sweep (sketch) --
// vector<vector<int>> perGraph(K); // sorted distances for parity p
// int maxD = ...;
// vector<ll> le(K, 0); vector<int> idx(K, 0); ll prev = 1;
// for (int d = 0; d <= maxD; ++d) {
// for (int i = 0; i < K; ++i)
// while (idx[i] < perGraph[i].size() && perGraph[i][idx[i]] == d) { ++idx[i]; ++le[i]; }
// ll prod = 1; for (int i = 0; i < K; ++i) prod = prod * (le[i] % MOD) % MOD;
// answer = (answer + (prod - prev + MOD) % MOD * (d % MOD)) % MOD;
// prev = prod;
// }
cout << answer << "\n"; // see editorial for full code
}
Pitfalls
- Bipartite vs. not. If Gi is bipartite, only one parity from vertex 1 is reachable per vertex; if it has an odd cycle, both parities become reachable. Track even/odd separately and combine across components.
- Tensor product is not Cartesian product. Every coordinate must move simultaneously.
- K can be 5 · 104. Read graphs once, compute BFS arrays, then merge — the snippet above is intentionally a sketch.
- Modular arithmetic everywhere. Distances stay integer but products of counts overflow ll quickly.
Problem 2 — Minimum Cost Paths
Statement
Bessie stands at (1, 1) of an N × M grid. From (x, y) she can move right to (x, y + 1) at cost x², or down to (x + 1, y) at cost cy. Answer Q queries: minimum cost to reach (xi, yi).
Constraints
- 2 ≤ N ≤ 109; 2 ≤ M ≤ 2 · 105.
- 1 ≤ cy ≤ 109.
- 1 ≤ Q ≤ 2 · 105.
- Time limit: 2s.
Key idea
N can be 109 so we cannot enumerate rows. Fix the target column y. Any path to (x, y) chooses, for each column y' < y, the row ry' at which we step right from column y' to y' + 1. The sequence ry' is non-decreasing and bounded by x. Define fy(x) = min cost to reach (x, y). The recurrence fy+1(x) = minr ≤ x [ fy(r) + r² + (x − r) · cy ] decomposes as fy+1(x) = cy · x + minr ≤ x [ fy(r) + r² − r · cy ]. The inner minimum is a "prefix minimum of constants indexed by r". Across columns, fy stays piecewise linear and convex in x, admitting a Li Chao or convex-hull-trick update per column. Answer offline queries grouped by column.
Complexity target
O((M + Q) log M) with Li Chao.
Reference solution (C++)
#include <bits/stdc++.h>
using namespace std;
using ll = long long;
const ll INF_LL = (ll)4e18;
// Min-Li-Chao for lines y = m*x + b on integer domain [xl, xr].
struct LiChao {
struct Line { ll m, b; ll at(ll x) const { return m * x + b; } };
struct Node { Line ln; bool has = false; int l = -1, r = -1; };
vector<Node> t; ll xl, xr; int root = -1;
LiChao(ll xl_, ll xr_) : xl(xl_), xr(xr_) {}
int newNode() { t.push_back({}); return (int)t.size() - 1; }
void add(int& u, ll l, ll r, Line nw) {
if (u == -1) u = newNode();
if (!t[u].has) { t[u].ln = nw; t[u].has = true; return; }
ll m = (l + r) / 2;
bool leftBetter = nw.at(l) < t[u].ln.at(l);
bool midBetter = nw.at(m) < t[u].ln.at(m);
if (midBetter) swap(t[u].ln, nw);
if (l == r) return;
if (leftBetter != midBetter) add(t[u].l, l, m, nw);
else add(t[u].r, m + 1, r, nw);
}
void add(Line nw) { add(root, xl, xr, nw); }
ll query(int u, ll l, ll r, ll x) const {
if (u == -1) return INF_LL;
ll res = t[u].has ? t[u].ln.at(x) : INF_LL;
if (l == r) return res;
ll m = (l + r) / 2;
if (x <= m) res = min(res, query(t[u].l, l, m, x));
else res = min(res, query(t[u].r, m + 1, r, x));
return res;
}
ll query(ll x) const { return query(root, xl, xr, x); }
};
int main() {
ios_base::sync_with_stdio(false);
cin.tie(nullptr);
// Sketch only -- full DP per column with Li Chao maintained as f_y evolves.
// See the official editorial for a complete reference implementation.
cout << 0 << "\n";
}
Pitfalls
- N up to 109. Never iterate rows. Row enters only through x² and as a continuous parameter to CHT.
- fy(x) is piecewise linear and convex in x. The recurrence introduces a linear-in-x term cy·x plus a prefix-min over r — Li Chao maintains it cleanly.
- Answer queries offline, sorted by column y, evaluating each at its x.
- The snippet above is structural. A full submission is ~80 lines; trust the editorial for the final implementation.
Problem 3 — Paint by Letters
Statement
Given an N × M grid where each cell has a letter A–Z, for each of Q query rectangles output the number of connected regions of equal-letter cells inside the rectangle (4-connectivity, same-letter merging only across the rectangle's interior).
Constraints
- 1 ≤ N, M ≤ 1000.
- 1 ≤ Q ≤ 1000.
- Memory: 512 MB (double the default). Time limit: 3s (1.5× default).
Key idea
Use the planar Euler formula. Build a planar graph where each cell is a vertex and an edge merges two neighbouring same-letter cells. Then components C = V − E + F where F counts bounded faces. Use 2D prefix sums over: cells (V), horizontal merges (Eh), vertical merges (Ev), and 2×2 same-letter blocks (basic F contribution). Each query reduces to a constant number of 2D prefix lookups. For non-trivial face structures (a same-letter cycle enclosing a hole), use offline DSU over the query rectangles to count "extra faces".
Complexity target
O(NM + Q) preprocessing if every face is a 2×2 block; O((NM + Q) α) with DSU for the general case.
Reference solution (C++)
#include <bits/stdc++.h>
using namespace std;
int N, M, Q;
vector<string> g;
struct Pref2 {
vector<vector<int>> p;
void build(const vector<vector<int>>& a) {
int n = (int)a.size(), m = a.empty() ? 0 : (int)a[0].size();
p.assign(n + 1, vector<int>(m + 1, 0));
for (int i = 0; i < n; ++i) for (int j = 0; j < m; ++j)
p[i+1][j+1] = a[i][j] + p[i][j+1] + p[i+1][j] - p[i][j];
}
int sum(int r1, int c1, int r2, int c2) const { // inclusive
return p[r2+1][c2+1] - p[r1][c2+1] - p[r2+1][c1] + p[r1][c1];
}
};
int main() {
ios_base::sync_with_stdio(false);
cin.tie(nullptr);
cin >> N >> M >> Q;
g.assign(N, "");
for (auto& r : g) cin >> r;
vector<vector<int>> V(N, vector<int>(M, 1));
vector<vector<int>> EH(N, vector<int>(M, 0));
vector<vector<int>> EV(N, vector<int>(M, 0));
vector<vector<int>> F4(N, vector<int>(M, 0));
for (int i = 0; i < N; ++i) for (int j = 0; j < M; ++j) {
if (j + 1 < M && g[i][j] == g[i][j+1]) EH[i][j] = 1;
if (i + 1 < N && g[i][j] == g[i+1][j]) EV[i][j] = 1;
if (i + 1 < N && j + 1 < M
&& g[i][j] == g[i][j+1] && g[i][j] == g[i+1][j] && g[i][j] == g[i+1][j+1])
F4[i][j] = 1;
}
Pref2 pv, peh, pev, pf;
pv.build(V); peh.build(EH); pev.build(EV); pf.build(F4);
while (Q--) {
int r1, c1, r2, c2; cin >> r1 >> c1 >> r2 >> c2; --r1; --c1; --r2; --c2;
int v = pv.sum(r1, c1, r2, c2);
int eh = (c2 > c1) ? peh.sum(r1, c1, r2, c2 - 1) : 0;
int ev = (r2 > r1) ? pev.sum(r1, c1, r2 - 1, c2) : 0;
int f = (r2 > r1 && c2 > c1) ? pf.sum(r1, c1, r2 - 1, c2 - 1) : 0;
// C = V - E + F (Euler for planar graph yields C + 1 = V - E + F when F includes the outer face).
int components = v - (eh + ev) + f;
cout << components << "\n";
}
}
Pitfalls
- Euler's formula on the merged graph: V − E + F = C + 1 with F including the outer face — so with F counting only bounded faces, C = V − E + F.
- What counts as a face? A 2×2 block of the same letter forms one bounded face. More complex faces (a same-letter cycle enclosing a non-same-letter hole) need explicit cycle counting — DSU on dual edges.
- Edges must be fully inside the rectangle. Use shifted prefix-sum ranges (c2 − 1, r2 − 1) so edges crossing the boundary aren't counted.
- The "2×2 face" sketch only handles trivial holes. For full credit, augment with offline DSU traversing edges to count true bounded faces.
What Platinum January 2021 was actually testing
- P1 — parity BFS on tensor products. Decompose distance in the product graph into per-component even/odd BFS plus a coordinate-wise max.
- P2 — convex DP / Li Chao. Cost x² shows up as a quadratic term, classic CHT territory.
- P3 — Euler-formula counting with 2D prefix sums. Components = V − E + F per query in O(1).