Hardness of Approximate Nearest Neighbor Search - Theory
Explaining the conditional hardness of Approximate Nearest Neighbor (ANN) search based on SETH and innovations like Algebraic Geometry (AG) codes.
Seminar Presentation
Hardness of Approximate Nearest Neighbor Search
Aviad Rubinstein — Harvard University, 2018
Presented by: [Your Name]
arXiv:1803.00904
Talk Outline
1. Problem Setup
ANN & Bichromatic Closest Pair
2. Main Results
What does the paper prove?
3. Proof Ingredients
SETH, OVC, Distributed PCP, AG Codes
4. Theorem 3.1 (FULL PROOF)
MA-Communication Protocol
5. Theorem 4.1 (FULL PROOF)
Hardness of Approx. Closest Pair
6. Extensions & Open Questions
SETH / k-SAT
Orthogonal Vectors
Bichromatic Closest Pair
Approx. Nearest Neighbor
Problem Setup
Approximate Nearest Neighbor (ANN)
ANN Search
Bichromatic Closest Pair (BCP)
BCP is easier (offline). Hardness of BCP ⟹ hardness of ANN.
Metrics considered: Euclidean (ℓ₂), Manhattan (ℓ₁), Hamming, Edit distance
Main Results
What Does the Paper Prove?
Theorem 1.1 (Euclidean / Manhattan / Hamming)
Assuming SETH, for every <i>δ</i> > 0 there exists <i>ε</i> > 0 such that computing a (1+<i>ε</i>)-approximation to Bichromatic Closest Pair requires Ω(<i>N</i><sup style="font-size: 0.7em;">2−δ</sup>) time.
i.e., no strongly subquadratic algorithm exists for any fixed approximation factor
Theorem 1.2 (Edit Distance)
Same conclusion holds for edit distance.
Corollary 1.3 (ANN Query Lower Bound)
Assuming SETH, for every <i>δ</i>, <i>c</i> > 0 there exists <i>ε</i> such that no algorithm can preprocess <i>N</i> vectors in O(<i>N</i><sup style="font-size: 0.7em;">c</sup>) time and answer (1+<i>ε</i>)-ANN queries in O(<i>N</i><sup style="font-size: 0.7em;">1−δ</sup>) time.
First conditional hardness result for approximate (not just exact) ANN
Best algorithm: O(<i>N</i><sup style="font-size: 0.7em;">2−Θ̃(<i>ε</i><sup style="font-size: 0.7em;">1/3</sup>)</sup>) [ACW16]
This paper: Ω(<i>N</i><sup style="font-size: 0.7em;">2−δ</sup>) for some <i>ε</i>(<i>δ</i>)
PROOF INGREDIENTS
Four Key Building Blocks
SETH
Strong Exponential Time Hypothesis: k-SAT on n variables requires (2−ε)<sup>n</sup> time for some k.
Our hardness base assumption.
Orthogonal Vectors (OVC)
SETH implies: deciding if two sets A, B ⊂ {0,1}<sup>m</sup> contain an orthogonal pair requires Ω(N<sup>2−δ</sup>) time.
Intermediate stepping stone.
Distributed PCP [ARW17]
Split a PCP proof into 'half-proofs' to avoid PCP blowup. Enables hardness of approximation in P.
Framework from prior work — we improve it.
Algebraic Geometry (AG) Codes
Replace Reed-Solomon with AG codes: constant rate + systematic + polynomial closure.
Key innovation: eliminates log-factor overhead in communication.
SETH
OVC
Dist. PCP + AG Codes
BCP Hardness
Motivation for Distributed PCP
Why Standard PCP Fails Here
<i>k</i>-SAT instance φ on <i>n</i> variables
N ≈ 2<sup>n/2</sup> gadgets
standard reduction
PCP Theorem gives φ' on <i>n' ≫ n</i> variables
construct gadgets
N' ≈ 2<sup>n'/2</sup> ≫ 2<sup>n</sup> gadgets
Running time lower bound becomes meaningless!
Distributed PCP [ARW17]
Construct a <strong style="color: #D48B2A;">'half-proof'</strong> π(α), π(β) for each half-assignment α, β ∈ {0,1}<sup>n/2</sup>
Number of gadgets stays <span style="font-family: 'SFMono-Regular', Consolas, monospace; background: #F7F4EF; padding: 4px 10px; border-radius: 6px; border: 1px solid #EAEAEA; font-weight: 600;">N ≈ 2<sup>n/2</sup></span> — no blowup!
Key insight: distribute the proof generation across the two parties of the reduction.
ALGEBRAIC GEOMETRY CODES
Replacing Reed-Solomon with AG Codes
Property
Reed-Solomon Code
AG Code
Rate
Θ(1/log n)
— BAD
Constant (≥ 0.1)
— GOOD
Distance
Θ(1/log n)
Constant (≥ 0.1)
Systematic
Yes
Polynomial Closure
Reed-Solomon rate = Θ(1/log n)
Merlin's message: O(n/T · log n) bits, Bob's message: O(T log n) bits
Need T = Ω(log n) → Bob sends Ω(log² n) bits → only (1+1/poly(n))-hardness
AG Code rate = Θ(1)
Merlin's message: O(n log T / T) bits, Bob's message: O(T log T) bits
T can be a large CONSTANT → constant-factor hardness gap!
AG codes (Goppa 1981, SAK+01): multivariate rational functions over finite fields — generalize Reed-Solomon while achieving constant rate.
THEOREM 3.1 — FULL PROOF
MA-Communication Protocol for Set Disjointness
For every <i style="font-family: 'Playfair Display', serif; font-weight: 600;">T ∈ [2,m]</i>, there is an efficient MA-communication protocol for Set Disjointness over universe <i style="font-family: 'Playfair Display', serif; font-weight: 600;">[m]</i> where:
<div style="display: flex; flex-direction: column; gap: 14px;"> <div style="display: flex; gap: 15px; align-items: flex-start;"> <span style="font-weight: 700; color: #D48B2A; font-size: 24px; min-width: 25px;">1.</span> <span>Merlin sends Alice <i style="font-family: 'Playfair Display', serif; color: #2C2C2C; font-weight: 600;">O(m log T / T)</i> bits</span> </div> <div style="display: flex; gap: 15px; align-items: flex-start;"> <span style="font-weight: 700; color: #D48B2A; font-size: 24px; min-width: 25px;">2.</span> <span>Alice and Bob toss <i style="font-family: 'Playfair Display', serif; color: #2C2C2C; font-weight: 600;">O(log m)</i> coins</span> </div> <div style="display: flex; gap: 15px; align-items: flex-start;"> <span style="font-weight: 700; color: #D48B2A; font-size: 24px; min-width: 25px;">3.</span> <span>Bob sends Alice <i style="font-family: 'Playfair Display', serif; color: #2C2C2C; font-weight: 600;">O(T log T)</i> bits</span> </div> <div style="display: flex; gap: 15px; align-items: flex-start;"> <span style="font-weight: 700; color: #D48B2A; font-size: 24px; min-width: 25px;">4.</span> <span>Alice returns <span style="color: #2e7d32; font-weight: 600; background-color: rgba(46,125,50,0.1); padding: 2px 8px; border-radius: 4px;">Accept</span> or <span style="color: #c62828; font-weight: 600; background-color: rgba(198,40,40,0.1); padding: 2px 8px; border-radius: 4px;">Reject</span></span> </div> </div>
<span style="font-weight: 600;">Disjoint sets</span> → Alice <span style="color: #2e7d32; font-weight: 600;">always accepts</span>.
<span style="font-weight: 600;">Non-disjoint</span> → Alice accepts with probability ≤ <span style="font-family: 'Playfair Display', serif; font-weight: 600; font-size: 24px;">1/2</span>.
What is Set Disjointness?
<div style="color: #4A4A4A; font-size: 23px; line-height: 1.6; margin-bottom: 20px;"> Alice holds <span style="font-weight: 600; color: #2C2C2C; font-family: 'Playfair Display', serif;">α ⊆ [m]</span>, Bob holds <span style="font-weight: 600; color: #2C2C2C; font-family: 'Playfair Display', serif;">β ⊆ [m]</span>.<br> They want to decide: is <span style="font-weight: 600; color: #D48B2A; background-color: rgba(212, 139, 42, 0.1); padding: 2px 8px; border-radius: 6px; font-family: 'Playfair Display', serif;">α ∩ β = ∅</span> ? </div> <div style="color: #4A4A4A; font-size: 21px; line-height: 1.5; margin-bottom: 25px; padding: 15px; background: #FAFAFA; border-left: 4px solid #7A7A7A; border-radius: 0 8px 8px 0;"> <span style="font-weight: 700; color: #2C2C2C;">MA Protocol</span>: Merlin (prover) sends a hint, then Alice and Bob execute a randomized communication protocol. </div>
<div style="display: flex; align-items: center; justify-content: center; margin-top: auto; padding: 25px; background-color: #FAFAFA; border-radius: 12px; gap: 35px; border: 1px solid #EAEAEA;"> <!-- Merlin --> <div style="display: flex; flex-direction: column; align-items: center; gap: 8px;"> <div style="width: 0; height: 0; border-left: 25px solid transparent; border-right: 25px solid transparent; border-bottom: 45px solid #D48B2A; position: relative;"> <span style="position: absolute; bottom: -50px; left: -9px; font-size: 16px; font-weight: 700; color: #FFF;">M</span> </div> <div style="font-size: 15px; color: #4A4A4A; font-weight: 600; margin-top: 8px;">Merlin</div> </div> <!-- Arrow M->A --> <div style="display: flex; flex-direction: column; align-items: center; margin-bottom: 25px;"> <div style="font-size: 13px; color: #D48B2A; font-weight: 700; letter-spacing: 1px;">HINT</div> <div style="font-size: 24px; color: #D48B2A;">→</div> </div> <!-- Alice --> <div style="display: flex; flex-direction: column; align-items: center; gap: 8px;"> <div style="width: 55px; height: 55px; background-color: #ffffff; border: 3px solid #2C2C2C; border-radius: 50%; display: flex; align-items: center; justify-content: center; font-size: 20px; font-weight: 700; color: #2C2C2C; box-shadow: 0 4px 6px rgba(0,0,0,0.05);">A</div> <div style="font-size: 15px; color: #4A4A4A; font-weight: 600;">Alice</div> </div> <!-- Arrow A<->B --> <div style="display: flex; flex-direction: column; align-items: center; margin-bottom: 25px;"> <div style="font-size: 13px; color: #7A7A7A; font-weight: 700; letter-spacing: 1px;">COMM</div> <div style="font-size: 24px; color: #2C2C2C;">↔</div> </div> <!-- Bob --> <div style="display: flex; flex-direction: column; align-items: center; gap: 8px;"> <div style="width: 55px; height: 55px; background-color: #ffffff; border: 3px solid #2C2C2C; border-radius: 50%; display: flex; align-items: center; justify-content: center; font-size: 20px; font-weight: 700; color: #2C2C2C; box-shadow: 0 4px 6px rgba(0,0,0,0.05);">B</div> <div style="font-size: 15px; color: #4A4A4A; font-weight: 600;">Bob</div> </div> </div>
Why do we care?
This protocol is the core <span style="font-weight: 600; color: #2C2C2C;">communication complexity primitive</span> used to build the hardness reduction for BCP.
<span style="font-weight: 500;">number of gadgets</span> (vectors) we enumerate.
<span style="font-weight: 500;">approximation gap</span> we achieve.
Compare to prior work [AW09]: <span style="font-weight: 600; color: #2C2C2C; font-family: 'Playfair Display', serif;">O(m log n / T)</span> and <span style="font-weight: 600; color: #2C2C2C; font-family: 'Playfair Display', serif;">O(T log² n)</span> — an extra log factor due to Reed-Solomon rate.
THEOREM 3.1 — PROOF (STEP 1: SETUP)
Encoding the Inputs with AG Codes
Step 1
Step 2
Step 3
Key Observation
Theorem 3.1 — Proof (Step 2: The Protocol)
The Four-Step Protocol
Step 1 — Merlin's Message
Merlin sends Alice <span style="font-family: 'Playfair Display', serif; font-style: italic;">μ̂</span> — the alleged encoding of <span style="font-family: 'Playfair Display', serif; font-style: italic;">μ = Σ<sub>t</sub> μ<sup>t</sup></span>
Message size: <span style="font-family: 'Playfair Display', serif; font-style: italic;">n<sub>C</sub> · ⌈log T⌉ = O(m log T / T)</span> bits <span style="color: #10B981; font-weight: bold; margin-left: 4px;">✓</span>
Step 2 — Shared Randomness
Alice and Bob jointly pick a random index <span style="font-family: 'Playfair Display', serif; font-style: italic;">i<sup>*</sup> ∈ [n<sub>C</sub>]</span>
(<span style="font-family: 'Playfair Display', serif; font-style: italic;">O(log m)</span> random coins)
Step 3 — Bob's Message
Bob sends Alice <span style="font-family: 'Playfair Display', serif; font-style: italic;">C(β<sup>t</sup>)<sub>i<sup>*</sup></sub></span> for all <span style="font-family: 'Playfair Display', serif; font-style: italic;">t ∈ [T]</span>
Message size: <span style="font-family: 'Playfair Display', serif; font-style: italic;">T · ⌈log T⌉ = O(T log T)</span> bits <span style="color: #10B981; font-weight: bold; margin-left: 4px;">✓</span>
Step 4 — Alice's Decision
Alice accepts iff ALL of:
<span style="font-family: 'Playfair Display', serif; font-style: italic;">μ̂</span> is a valid codeword in <span style="font-family: 'Playfair Display', serif; font-style: italic;">C'</span> <span style="color: #888888; font-size: 0.9em;">[checks Merlin's honesty]</span>
<span style="font-family: 'Playfair Display', serif; font-style: italic;">μ̂<sub>i<sup>*</sup></sub> = Σ<sub>t</sub> C(α<sup>t</sup>)<sub>i<sup>*</sup></sub> · C(β<sup>t</sup>)<sub>i<sup>*</sup></sub></span> <span style="color: #888888; font-size: 0.9em;">[spot-check consistency]</span>
<span style="font-family: 'Playfair Display', serif; font-style: italic;">μ̂<sub>i</sub> = 0</span> for all <span style="font-family: 'Playfair Display', serif; font-style: italic;">i ∈ [m/T]</span> <span style="color: #888888; font-size: 0.9em;">[checks disjointness]</span>
Total communication:
Merlin → Alice:
O(m log T / T)
Bob → Alice:
O(T log T)
Coins:
O(log m)
Repeat steps 2–4 a constant number of times (in parallel) to boost soundness to 1/2.
THEOREM 3.1 — PROOF (STEP 3: CORRECTNESS)
Completeness & Soundness Analysis
Completeness
Claim: If α ∩ β = ∅, Alice always accepts.
Since sets are disjoint: μ<sup style='font-size: 0.7em'>t</sup><sub style='font-size: 0.7em'>i</sub> = C(α<sup style='font-size: 0.7em'>t</sup>)<sub style='font-size: 0.7em'>i</sub> · C(β<sup style='font-size: 0.7em'>t</sup>)<sub style='font-size: 0.7em'>i</sub> = 0 for all i,t<br><span style='color: #497A49; font-size: 20px; display: inline-block; margin-top: 8px;'>(systematic property: entries on S<sub style='font-size: 0.7em'>n</sub> are in {0,1}, so product of disjoint bits = 0)</span>
By linearity of C': μ = Σ<sub style='font-size: 0.7em'>t</sub> μ<sup style='font-size: 0.7em'>t</sup> is the all-zeros codeword
Merlin sends the true μ̂ = μ = 0
All three checks (a), (b), (c) pass <span style='color: #497A49; font-weight: bold;'>✓</span>
Soundness
Claim: If α ∩ β ≠ ∅, Alice accepts with prob ≤ 1/2.
If Alice accepts with nonzero probability → μ̂ passes codeword check (a)
If Alice accepts with prob > 1 − δ<sub style='font-size: 0.7em'>C'</sub>: μ̂ must equal the true μ<br><span style='color: #BA771B; font-size: 20px; display: inline-block; margin-top: 8px;'>(by distance of C' ≥ 0.1)</span>
But then check (c) implies μ<sub style='font-size: 0.7em'>i</sub> = 0 for all i → sets are disjoint. Contradiction!
So Alice accepts with prob ≤ 1 − δ<sub style='font-size: 0.7em'>C'</sub> ≤ 0.9
After O(1) parallel repetitions of steps 2–4: soundness ≤ 1/2 <span style='color: #BA771B; font-weight: bold;'>✓</span>
<strong style='color: #2C2C2C;'>Key role of AG codes:</strong> The constant rate of C means n<sub style='font-size: 0.7em'>C</sub> = O(m/T) — so Merlin's message is O(m log T / T), not O(m log n / T) as with Reed-Solomon. This is the crucial improvement.
[Proof of Theorem 3.1 complete ✓]
THEOREM 4.1 — FULL PROOF
From OVC to Bichromatic Closest Pair
<strong>Theorem 4.1.</strong> Unless SETH and OVC are false, for any <em>ℓ<sub>p</sub></em> metric (constant <em>p</em>), for every <em>δ > 0</em> there exists <em>ε = ε(δ,p) > 0</em> such that given sets <em>A, B ⊂ {0,1}<sup>d</sup></em> of <em>N</em> vectors (<em>d = O(log N)</em>), computing a <em>(1+ε)</em>-approximation to Bichromatic Closest Pair requires <em>Ω(N<sup>2−δ</sup>)</em> time.
<div>We reduce Orthogonal Vectors (OVC) to (1+<em>ε</em>)-approx BCP.<br><br>Given OV instance (<em>A<sub>OV</sub></em>, <em>B<sub>OV</sub></em>) over {0,1}<sup><em>m</em></sup>:<br><ol style="margin-top: 15px; margin-bottom: 0; padding-left: 28px; display: flex; flex-direction: column; gap: 12px; color: #2C2C2C;"> <li><span style="color: #4A4A4A;">Use Theorem 3.1's MA-protocol with parameter <em>T</em></span></li> <li><span style="color: #4A4A4A;">Encode each OV vector as a BCP vector by 'embedding' the protocol's acceptance behavior</span></li> <li><span style="color: #4A4A4A;">Show: orthogonal pairs → small distance, non-orthogonal → larger distance</span></li> <li><span style="color: #4A4A4A;">The approximation ratio separates the two cases</span></li></ol></div>
<div style="font-family: inherit;"> Set <em>T = O(log(1/ε) / log log(1/ε))</em> so that: <div style="padding-left: 20px; font-family: 'SFMono-Regular', Consolas, monospace; margin: 15px 0; color: #2C2C2C; font-size: 20px;"> T' = 2<sup>O(T log T)</sup> = O(1/ε) </div> <div style="color: #7A7A7A; font-size: 17px; margin-bottom: 20px; padding-left: 20px;">(number of Bob messages)</div> This makes the distance gap exactly a <span style="font-weight: 700; color: #2C2C2C;">(1 + Ω(1/T')) = (1+ε)</span> factor.<br><br> <div style="background-color: #FDF9F3; border-radius: 8px; border-left: 4px solid #D48B2A; padding: 18px 24px; margin-top: 15px;"> <span style="font-weight: 700; color: #D48B2A; text-transform: uppercase; font-size: 15px; letter-spacing: 1.5px;">Size blowup</span><br> <div style="font-family: 'SFMono-Regular', Consolas, monospace; font-size: 18px; color: #2C2C2C; margin-top: 8px;"> N = 2<sup>m · (1/c + O(log² log(1/ε) / log(1/ε)))</sup> vectors </div> </div></div>
THEOREM 4.1 — PROOF (STEP 1: CONSTRUCTION)
Building the BCP Vectors
THEOREM 4.1 — PROOF (STEP 2: DISTANCE ANALYSIS)
From Inner Products to ℓ<sub>p</sub> Distances
Key Calculation — Hamming distance probability:
Pr[a<sub>i,j,b</sub> ≠ b<sub>i,j,b</sub>] = 1/2 + 1/(2T′) − (1/T′) · Pr[Alice accepts on α, β, μ]
<b>Orthogonal (α ⊥ β):</b> there exists μ with Pr[accept]=1, so min<sub>μ</sub> ||a−b||<sup>p</sup> = m(T′−1)
<b>Non-orthogonal (α·β ≠ 0):</b> all μ give Pr[accept] ≤ 1/2, so min<sub>μ</sub> ||a−b||<sup>p</sup> ≥ mT′
Approximation ratio = (T′/(T′−1))<sup>1/p</sup> ≥ 1 + Ω(1/T′) = 1+ε <span style="font-family: sans-serif; margin-left: 8px;">✔</span>
<b>Reduction size:</b> N = 2<sup>m(1/c + O(log²log(1/ε)/log(1/ε)))</sup> → BCP requires Ω(N<sup>2−δ</sup>). Dimension d = O(log N). <span style="color: #2E7D32; font-weight: 600; margin-left: 10px;">[Proof of Theorem 4.1 complete ✔]</span>
- nearest-neighbor
- computational-complexity
- seth
- ag-codes
- pcp-theorem
- algorithms
- theoretical-cs