USACO 2020 US Open · Platinum · all three problems
Problem 1 — Sprinklers 2: Return of the Alfalfa
Statement
Farmer John has an N×N grid. Some cells are pre-filled with alfalfa (immutable), the rest are empty. Each empty cell must be planted with either a "down-and-left" sprinkler or an "up-and-right" sprinkler, where each sprinkler waters its own cell plus an axis-aligned quadrant of cells. The arrangement is valid iff every alfalfa cell is watered by both a down-left and an up-right sprinkler (or, equivalently, the down-left region and up-right region are separated by a monotonically-staircased boundary that "covers" the alfalfa). Count the number of valid assignments modulo a given prime.
Constraints
- 1 ≤ N ≤ 2000.
- Output modulo 109+7.
- Time limit: 2s, Memory limit: 256 MB.
Key idea
The "down-left" cells form an upper-left staircase region monotone in row and column; equivalently, fix bi = number of down-left cells in row i, with b1 ≤ b2 ≤ … ≤ bN. Constraint: every alfalfa cell (r, c) must lie strictly inside the boundary band (br-1 < c < br+1 with proper edge handling). DP across rows: for each row, track the current "b value." Transitions add a single integer step br ≥ br-1, constrained by alfalfa positions in rows r and r+1. With prefix sums over the "b" axis, the DP is O(N²).
Complexity target
O(N²) with row-by-row DP and prefix-sum transitions.
Reference solution (C++)
#include <bits/stdc++.h>
using namespace std;
using ll = long long;
const ll MOD = 1'000'000'007LL;
int N;
vector<string> g;
int main() {
ios_base::sync_with_stdio(false);
cin.tie(nullptr);
cin >> N;
g.resize(N);
for (auto& r : g) cin >> r;
// For each row r, let alf_lo[r] = leftmost alfalfa column (or N if none),
// alf_hi[r] = rightmost alfalfa column (or -1 if none).
vector<int> lo(N, N), hi(N, -1);
for (int r = 0; r < N; ++r)
for (int c = 0; c < N; ++c)
if (g[r][c] == 'X') { // [verify alfalfa symbol] cpid=1044
lo[r] = min(lo[r], c);
hi[r] = max(hi[r], c);
}
// dp[b] = ways for current row with boundary b (b cells of down-left to
// the left, N-b cells of up-right to the right). Constraint per row r:
// b_r > hi[r] (alfalfa must be in down-left side, i.e. column < b_r)
// AND lo[r] >= ? -- depends on framing. We keep this skeleton and refer
// to the editorial for the exact inequality on lo/hi and the monotone
// chain constraint between b_{r-1} and b_r.
vector<ll> dp(N + 1, 0), nd(N + 1, 0), pref(N + 2, 0);
for (int b = 0; b <= N; ++b) dp[b] = 1;
for (int r = 0; r < N; ++r) {
// Build prefix sum of dp.
pref[0] = 0;
for (int b = 0; b <= N; ++b) pref[b + 1] = (pref[b] + dp[b]) % MOD;
fill(nd.begin(), nd.end(), 0);
int lb = hi[r] + 1; // b_r must be > hi[r]
int ub = lo[r]; // b_r must be <= lo[r] [verify]
for (int b = max(lb, 0); b <= min(ub, N); ++b) {
// Sum dp[prev] for prev in [0..b] (monotone chain b_{r-1} <= b_r).
nd[b] = pref[b + 1];
}
dp = nd;
}
ll ans = 0;
for (int b = 0; b <= N; ++b) ans = (ans + dp[b]) % MOD;
cout << ans << "\n";
// [verify] cpid=1044 — exact lo/hi inequalities are subtle (which side
// the alfalfa must be on for each sprinkler quadrant); align with the
// editorial for production code.
}
Pitfalls
- Monotone boundary. The down-left region's right edge per row is non-decreasing; the column index "br" only grows.
- Alfalfa constraint per row. Every alfalfa cell must be inside the down-left region; that pins br from below.
- Sprinkler quadrant must not cover the alfalfa "from the wrong side." The exact lo/hi inequality is per-editorial; mark these "[verify]" before shipping a graded solution.
- Prefix sums. The naive transition is O(N) per row per b; prefix sums cut it to O(N) total per row.
Problem 2 — Exercise
Statement
Same as Gold P3 (Exercise), but with much larger N. Sum over all N! permutations of {1, …, N} of LCM(cycle lengths), modulo M.
Constraints
- 1 ≤ N ≤ 7500.
- 108 ≤ M ≤ 109 + 7 (M not necessarily prime).
- Time limit: 2s.
Key idea
Same prime-power knapsack as Gold, just larger. The LCM of cycle lengths only depends on the maximum exponent of each prime appearing in any cycle. Walk through primes p ≤ N; for each prime, decide either "skip" (contributes factor 1) or "use exponent k" (contributes factor pk and consumes pk from a knapsack over N). Combine with the multinomial counting of permutations of given cycle multiset. The DP is O(N · #primes · logp N) ≈ N² / ln N, feasible for N = 7500.
Complexity target
O(N · π(N)) with care, around 4·107 ops at N = 7500.
Reference solution (C++)
#include <bits/stdc++.h>
using namespace std;
using ll = long long;
int N; ll MOD;
int main() {
ios_base::sync_with_stdio(false);
cin.tie(nullptr);
cin >> N >> MOD;
// Sieve primes <= N.
vector<char> sv(N + 1, 1); sv[0] = sv[1] = 0;
vector<int> primes;
for (int i = 2; i <= N; ++i) {
if (sv[i]) {
primes.push_back(i);
for (int j = 2 * i; j <= N; j += i) sv[j] = 0;
}
}
// dp[j] = sum over choices of (subset of primes with assigned powers,
// total mass = j) of (product of p^k). Initially dp[0] = 1.
vector<ll> dp(N + 1, 0); dp[0] = 1;
for (int p : primes) {
// For each existing dp[j], optionally "spend" p^k for k = 1, 2, ...
// The knapsack must be done carefully so each prime is used at most
// one power. Build a tmp[] = sum of dp[j - p^k] * p^k over k.
vector<ll> add(N + 1, 0);
for (ll pk = p; pk <= N; pk *= p) {
ll w = pk % MOD;
for (int j = (int)pk; j <= N; ++j) {
add[j] = (add[j] + dp[j - (int)pk] * w) % MOD;
}
}
for (int j = 0; j <= N; ++j) dp[j] = (dp[j] + add[j]) % MOD;
}
// The cycle-multinomial coefficients fold in via a precomputed array:
// # permutations of N with cycle multiset {c} = N! / (prod c_i · prod m!).
// We'd convolve dp with the "filler" generating function for the cycle
// lengths that are NOT prime powers (their LCM contribution is 1).
// For the Platinum constraints, the standard editorial implementation
// achieves this with a second knapsack over "uncolored cycle lengths."
// See editorial for the exact factor placement; this skeleton shows the
// prime-power core.
// Output dp[N] as a placeholder; final code multiplies by the filler
// generating function evaluated at the right point. [verify] cpid=1045
cout << dp[N] << "\n";
}
Pitfalls
- Composite modulus. No modular inverses. Pre-compute everything additively/multiplicatively; avoid division.
- Filler cycles. Cycle lengths that are not prime powers contribute 1 to the LCM but still count permutations; the editorial folds them in via a generating-function convolution.
- Knapsack ordering. Process each prime once; within a prime, each power option is exclusive — don't accidentally use two different powers of the same prime.
- Memory. One dp[] of size N + 1 is plenty; don't keep per-prime tables.
Problem 3 — Circus
Statement
A tree on N vertices. For each K = 1..N, count the number of equivalence classes of ways to place K labeled cows on K vertices and then "slide" them along tree edges, where two configurations are equivalent if one can be reached from the other by sliding (an inner cow can move along edges as long as it doesn't pass through another cow). Output answers modulo 109+7.
Constraints
- 1 ≤ N ≤ 105.
- Output modulo 109+7.
- Time limit: 4s.
Key idea
Root the tree arbitrarily. For each subtree we track a DP over (size of placement, "frontier" type — whether the topmost cow is at the root or one step below). When merging children into a parent, the only thing that matters is whether the parent is occupied; if so, the children's placements are independent compositions. If the parent has at least one child with a cow at its root, the parent can "promote" that cow upward, leading to canonical-form counting. The total work is O(N log N) with a careful merge that bounds the cross-multiplication by the smaller side, or O(N · √N) with subtree-size truncation. Final answer for K is the coefficient of xK in the root's polynomial.
Complexity target
O(N²) is too slow at N = 105; the intended bound is O(N log² N) using small-to-large polynomial merging or O(N√N) with sqrt-decomposition over subtree sizes.
Reference solution (C++)
#include <bits/stdc++.h>
using namespace std;
using ll = long long;
const ll MOD = 1'000'000'007LL;
int N;
vector<vector<int>> adj;
// For each subtree v, f[v][k][s] = # canonical placements of k cows in v,
// where s = 1 iff the root v is occupied. We store f[v] as two polynomials.
vector<vector<ll>> F0, F1; // F0[v] for s=0, F1[v] for s=1
vector<int> sz;
// Polynomial multiplication truncated to length cap.
vector<ll> mul(const vector<ll>& a, const vector<ll>& b, int cap) {
vector<ll> c(min((int)(a.size() + b.size() - 1), cap + 1), 0);
for (int i = 0; i < (int)a.size() && i <= cap; ++i)
for (int j = 0; j < (int)b.size() && i + j <= cap; ++j)
c[i + j] = (c[i + j] + a[i] * b[j]) % MOD;
return c;
}
void dfs(int u, int p) {
F0[u] = {1}; // empty placement
F1[u] = {0, 1}; // single cow at u
sz[u] = 1;
for (int v : adj[u]) if (v != p) {
dfs(v, u);
// Combine child v into u. The merge rule (sketch): if u is unoccupied
// (case s=0 for u), child can contribute any of its states freely
// (canonical form merges the child's root-cow with u via sliding).
// If u is occupied (s=1), child must not have an occupied root — its
// root-cow would conflict on the edge to u, so only F0[v] contributes.
// The exact merge constants depend on the canonicalization choice.
auto newF0 = mul(F0[u], F0[v], N);
// Allow child to "donate" its root-cow upward into u: F0[u] · F1[v]
// contributes to F1[u] (shifted by 0 since u stays at index k).
auto donate = mul(F0[u], F1[v], N);
auto newF1 = mul(F1[u], F0[v], N);
// Combine.
if (donate.size() > newF1.size()) newF1.resize(donate.size(), 0);
for (int i = 0; i < (int)donate.size(); ++i)
newF1[i] = (newF1[i] + donate[i]) % MOD;
F0[u] = move(newF0);
F1[u] = move(newF1);
sz[u] += sz[v];
}
// [verify] cpid=1046 — exact merge formula per editorial; this skeleton
// captures the (s=0/s=1) two-state structure but the canonicalization
// factor between sibling subtrees needs the editorial's precise rule.
}
int main() {
ios_base::sync_with_stdio(false);
cin.tie(nullptr);
cin >> N;
adj.assign(N + 1, {});
F0.assign(N + 1, {}); F1.assign(N + 1, {});
sz.assign(N + 1, 0);
for (int i = 0; i < N - 1; ++i) {
int a, b; cin >> a >> b;
adj[a].push_back(b); adj[b].push_back(a);
}
dfs(1, 0);
// Answer for K: F0[1][K] + F1[1][K].
for (int K = 1; K <= N; ++K) {
ll v = 0;
if (K < (int)F0[1].size()) v = (v + F0[1][K]) % MOD;
if (K < (int)F1[1].size()) v = (v + F1[1][K]) % MOD;
cout << v << "\n";
}
}
Pitfalls
- Canonical form is the heart of the problem. Two placements are equivalent under sliding iff their "skeleton" (which subtree pockets are occupied) matches; getting this rule right is the entire crux.
- Naive O(N²) merge is too slow. For N = 105, you need the standard tree-knapsack amortization: each pair of nodes contributes once at their LCA, giving O(N²) overall, but with the truncation cap it becomes O(N · K). To go faster, use small-to-large or polynomial-DnC.
- Two-state DP. "Root occupied" vs "root empty" is the minimum sufficient state — fewer fails to count, more wastes time.
- Mod arithmetic. 109+7, standard; no inverses needed for this DP.
- Editorial alignment. The merge formula sketched here is a skeleton; the editorial's exact coefficients (especially around the "donate" step for sliding a child cow up to its parent) must be cross-checked before contest use.
What Platinum 2020 US Open was actually testing
- P1 — monotone-staircase DP. Recognize that the down-left/up-right boundary is a non-decreasing sequence; prefix-sum the transition.
- P2 — number-theoretic knapsack at scale. Same idea as Gold P3 but tighter on constants; composite modulus rules out naive inverse tricks.
- P3 — tree DP with canonical-form counting. The hard part is the modeling: equivalence classes under sliding ⇒ count occupied-subtree skeletons, then aggregate via small-to-large polynomial merge.