Made byBobr AI

Hardness of Approximate Nearest Neighbor Search - Theory

Explaining the conditional hardness of Approximate Nearest Neighbor (ANN) search based on SETH and innovations like Algebraic Geometry (AG) codes.

#nearest-neighbor#computational-complexity#seth#ag-codes#pcp-theorem#algorithms#theoretical-cs
Watch
Pitch
Seminar Presentation
Hardness of Approximate Nearest Neighbor Search
Aviad Rubinstein — Harvard University, 2018
Presented by: [Your Name]
arXiv:1803.00904
Abstract Diagram
Made byBobr AI
Talk Outline
1. Problem Setup
ANN & Bichromatic Closest Pair
2. Main Results
What does the paper prove?
3. Proof Ingredients
SETH, OVC, Distributed PCP, AG Codes
4. Theorem 3.1 (FULL PROOF)
MA-Communication Protocol
5. Theorem 4.1 (FULL PROOF)
Hardness of Approx. Closest Pair
6. Extensions & Open Questions
SETH / k-SAT
Orthogonal Vectors
Bichromatic Closest Pair
Approx. Nearest Neighbor
Made byBobr AI
Problem Setup
Approximate Nearest Neighbor (ANN)
ANN Search
Preprocess a set A of N vectors. Given query b, find a* ∈ A such that:
||a*b|| ≈ mina ∈ A ||ab||
Bichromatic Closest Pair (BCP)
Given sets A, B of N vectors, find a* ∈ A, b* ∈ B such that:
||a*b*|| ≈ mina∈A, b∈B ||ab||
BCP is easier (offline). Hardness of BCP ⟹ hardness of ANN.
Metrics considered: Euclidean (ℓ₂), Manhattan (ℓ₁), Hamming, Edit distance
A B a* b*
Made byBobr AI
Main Results
What Does the Paper Prove?
Algorithmic Gap
Best algorithm: O(N2−Θ̃(ε1/3)) [ACW16]
This paper: Ω(N2−δ) for some ε(δ)
Theorem 1.1 (Euclidean / Manhattan / Hamming)
Assuming SETH, for every δ > 0 there exists ε > 0 such that computing a (1+ε)-approximation to Bichromatic Closest Pair requires Ω(N2−δ) time.
i.e., no strongly subquadratic algorithm exists for any fixed approximation factor
Theorem 1.2 (Edit Distance)
Same conclusion holds for edit distance.
Corollary 1.3 (ANN Query Lower Bound)
Assuming SETH, for every δ, c > 0 there exists ε such that no algorithm can preprocess N vectors in O(Nc) time and answer (1+ε)-ANN queries in O(N1−δ) time.
First conditional hardness result for approximate (not just exact) ANN
Made byBobr AI
PROOF INGREDIENTS
Four Key Building Blocks
1
SETH
Strong Exponential Time Hypothesis: k-SAT on n variables requires (2−ε)n time for some k.
Our hardness base assumption.
2
Orthogonal Vectors (OVC)
SETH implies: deciding if two sets A, B ⊂ {0,1}m contain an orthogonal pair requires Ω(N2−δ) time.
Intermediate stepping stone.
3
Distributed PCP [ARW17]
Split a PCP proof into 'half-proofs' to avoid PCP blowup. Enables hardness of approximation in P.
Framework from prior work — we improve it.
4
Algebraic Geometry (AG) Codes
Replace Reed-Solomon with AG codes: constant rate + systematic + polynomial closure.
Key innovation: eliminates log-factor overhead in communication.
SETH
OVC
Dist. PCP + AG Codes
BCP Hardness
Made byBobr AI
Motivation for Distributed PCP
Why Standard PCP Fails Here
k-SAT instance φ on n variables
N ≈ 2n/2 gadgets
standard reduction
PCP Theorem gives φ' on n' ≫ n variables
construct gadgets
N' ≈ 2n'/2 ≫ 2n gadgets
Running time lower bound becomes meaningless!
Distributed PCP [ARW17]
Construct a 'half-proof' π(α), π(β) for each half-assignment α, β ∈ {0,1}n/2
Number of gadgets stays N ≈ 2n/2 — no blowup!
Alice (α)
π(α)
Verifier
Bob (β)
π(β)
Key insight: distribute the proof generation across the two parties of the reduction.
Made byBobr AI
ALGEBRAIC GEOMETRY CODES
Replacing Reed-Solomon with AG Codes
Property Reed-Solomon Code AG Code
Rate Θ(1/log n) — BAD Constant (≥ 0.1) — GOOD
Distance Θ(1/log n) Constant (≥ 0.1)
Systematic Yes Yes
Polynomial Closure Yes Yes
Reed-Solomon rate = Θ(1/log n)
Merlin's message: O(n/T · log n) bits, Bob's message: O(T log n) bits
Need T = Ω(log n) → Bob sends Ω(log² n) bits → only (1+1/poly(n))-hardness
AG Code rate = Θ(1)
Merlin's message: O(n log T / T) bits, Bob's message: O(T log T) bits
T can be a large CONSTANT → constant-factor hardness gap!
AG codes (Goppa 1981, SAK+01): multivariate rational functions over finite fields — generalize Reed-Solomon while achieving constant rate.
Made byBobr AI
THEOREM 3.1 — FULL PROOF
MA-Communication Protocol for Set Disjointness
Theorem 3.1 For every T ∈ [2,m], there is an efficient MA-communication protocol for Set Disjointness over universe [m] where:
1. Merlin sends Alice O(m log T / T) bits
2. Alice and Bob toss O(log m) coins
3. Bob sends Alice O(T log T) bits
4. Alice returns Accept or Reject
Completeness
Disjoint sets → Alice always accepts.
Soundness
Non-disjoint → Alice accepts with probability ≤ 1/2.
ℹ️
What is Set Disjointness?
Alice holds α ⊆ [m], Bob holds β ⊆ [m].
They want to decide: is α ∩ β = ∅ ?
MA Protocol: Merlin (prover) sends a hint, then Alice and Bob execute a randomized communication protocol.
M
Merlin
HINT
A
Alice
COMM
B
Bob
🎯
Why do we care?
This protocol is the core communication complexity primitive used to build the hardness reduction for BCP.
Merlin's message → number of gadgets (vectors) we enumerate.
Bob's message → approximation gap we achieve.
Compare to prior work [AW09]: O(m log n / T) and O(T log² n) — an extra log factor due to Reed-Solomon rate.
Made byBobr AI
THEOREM 3.1 — PROOF (STEP 1: SETUP)
Encoding the Inputs with AG Codes
Step 1
Partition [m] into T disjoint blocks U 1, …, U T, each of size m/T.
Let α t = α ∩ U t and β t = β ∩ U t (Alice & Bob's restrictions to block t).
Step 2
Let C be an AG code over field F with characteristic q ≥ T (from Theorem 2.4).
Parameters: rate ρC ≥ 0.1, distance δC ≥ 0.1, length nC = O(m/T).
Step 3
For each t ∈ [T], encode:
C(α t) = encoding of α t,     C(β t) = encoding of β t
By Polynomial Closure: their entrywise product μ ti = C(α t)i · C(β t)i is a codeword of C′.
Key Observation
Sets are disjoint ⟺ μ ti = 0 for all i ∈ [m/T] and t ∈ [T]
μi = ∑t μ ti = 0 for all i ∈ [m/T]
(since char ≥ T, sum=0 iff all terms=0)
Universe [m]
U 1
α: ● ●
β:
...
U T
α:
β: ○ ○
μ t = C(α t) C(β t)
Entrywise Product
μ = ∑t μ t ∈ C′
By Linearity
Made byBobr AI
Theorem 3.1 — Proof (Step 2: The Protocol)
The Four-Step Protocol
Step 1 — Merlin's Message
Merlin sends Alice μ̂ — the alleged encoding of μ = Σt μt
Message size: nC · ⌈log T⌉ = O(m log T / T) bits
Step 2 — Shared Randomness
Alice and Bob jointly pick a random index i* ∈ [nC]
(O(log m) random coins)
Step 3 — Bob's Message
Bob sends Alice C(βt)i* for all t ∈ [T]
Message size: T · ⌈log T⌉ = O(T log T) bits
Step 4 — Alice's Decision
Alice accepts iff ALL of:
(a)
μ̂ is a valid codeword in C'   [checks Merlin's honesty]
(b)
μ̂i* = Σt C(αt)i* · C(βt)i*   [spot-check consistency]
(c)
μ̂i = 0 for all i ∈ [m/T]   [checks disjointness]
Total communication:
Merlin → Alice:
O(m log T / T)
Bob → Alice:
O(T log T)
Coins:
O(log m)
Repeat steps 2–4 a constant number of times (in parallel) to boost soundness to 1/2.
Made byBobr AI
THEOREM 3.1 — PROOF (STEP 3: CORRECTNESS)
Completeness & Soundness Analysis

Completeness

Claim: If α ∩ β = ∅, Alice always accepts.
  • Since sets are disjoint: μti = C(αt)i · C(βt)i = 0 for all i,t
    (systematic property: entries on Sn are in {0,1}, so product of disjoint bits = 0)
  • By linearity of C': μ = Σt μt is the all-zeros codeword
  • Merlin sends the true μ̂ = μ = 0
  • All three checks (a), (b), (c) pass

Soundness

Claim: If α ∩ β ≠ ∅, Alice accepts with prob ≤ 1/2.
  • If Alice accepts with nonzero probability → μ̂ passes codeword check (a)
  • If Alice accepts with prob > 1 − δC': μ̂ must equal the true μ
    (by distance of C' ≥ 0.1)
  • But then check (c) implies μi = 0 for all i → sets are disjoint. Contradiction!
  • So Alice accepts with prob ≤ 1 − δC' ≤ 0.9
  • After O(1) parallel repetitions of steps 2–4: soundness ≤ 1/2
Key role of AG codes: The constant rate of C means nC = O(m/T) — so Merlin's message is O(m log T / T), not O(m log n / T) as with Reed-Solomon. This is the crucial improvement.
[Proof of Theorem 3.1 complete ✓]
Made byBobr AI
THEOREM 4.1 — FULL PROOF
From OVC to Bichromatic Closest Pair
Theorem 4.1. Unless SETH and OVC are false, for any p metric (constant p), for every δ > 0 there exists ε = ε(δ,p) > 0 such that given sets A, B ⊂ {0,1}d of N vectors (d = O(log N)), computing a (1+ε)-approximation to Bichromatic Closest Pair requires Ω(N2−δ) time.
Proof Strategy (high level)
We reduce Orthogonal Vectors (OVC) to (1+ε)-approx BCP.

Given OV instance (AOV, BOV) over {0,1}m:
  1. Use Theorem 3.1's MA-protocol with parameter T
  2. Encode each OV vector as a BCP vector by 'embedding' the protocol's acceptance behavior
  3. Show: orthogonal pairs → small distance, non-orthogonal → larger distance
  4. The approximation ratio separates the two cases
Parameter Setting
Set T = O(log(1/ε) / log log(1/ε)) so that:
T' = 2O(T log T) = O(1/ε)
(number of Bob messages)
This makes the distance gap exactly a (1 + Ω(1/T')) = (1+ε) factor.

Size blowup
N = 2m · (1/c + O(log² log(1/ε) / log(1/ε))) vectors
OV Instance (Binary)
A: [1, 0, 1, 1, 0]
A: [0, 1, 0, 0, 1]
A: [1, 1, 1, 0, 0]
B: [0, 1, 1, 0, 1]
B: [1, 0, 0, 1, 0]
B: [1, 1, 0, 1, 1]
Orthogonal Pair Highlighted
Thm 3.1 Protocol
reduction
BCP Instance (Reals)
A': [0.12, 0.88, 0.45]
A': [0.55, 0.21, 0.90]
A': [0.93, 0.10, 0.24]
B': [0.44, 0.50, 0.72]
B': [0.57, 0.19, 0.82]
B': [0.81, 0.33, 0.61]
Closest Pair Highlighted
Made byBobr AI
THEOREM 4.1 — PROOF (STEP 1: CONSTRUCTION)
Building the BCP Vectors
Step 1
Start with OV instance (AOV, BOV) over {0,1}m
Step 2
Fix T from Thm 3.1. Let T' = 2O(T log T) = number of distinct Bob messages.
Step 3 — B-vectors
For each β ∈ BOV, build b̃β ∈ {0,1}T'×m:
[b̃β]i,j = 1 iff Bob sends message i on input β with randomness j
→ One row per possible Bob-message, one column per coin flip
Step 4 — A-vectors
For each Merlin-msg μ and α ∈ AOV, build ãμ,α ∈ {0,1}T'×m:
μ,α]i,j = 1 iff Alice accepts on input α, Merlin msg μ, Bob msg i, coins j
Step 5 — Balancing
To equalize Hamming weights: replace ã ∈ {0,1}T'×m with a ∈ {0,1}T'×m×2:
ai,j,1 = ãi,j   &   ai,j,2 = 1 − ãi,j
This ensures every a-vector has exactly T'·m ones.
For b-vectors: bi,j,1 = b̃i,j and bi,j,2 = 0.
β
Bob msgs (i)
coins (j)
1
0
··
1
0
0
··
0
1
1
··
0
×
ãμ,α
1
1
··
0
0
1
··
1
1
0
··
1
Inner product ãμ,α · b̃β = (Pr[Alice accepts]) × m
If α ⊥ β (orthogonal):
∃μ s.t. inner product = m (full acceptance)
If α · β ≠ 0 (not orthogonal):
maxμ inner product ≤ m/2
Made byBobr AI
THEOREM 4.1 — PROOF (STEP 2: DISTANCE ANALYSIS)
From Inner Products to ℓp Distances
Key Calculation — Hamming distance probability:
Pr[ai,j,b ≠ bi,j,b] = 1/2 + 1/(2T′) − (1/T′) · Pr[Alice accepts on α, β, μ]
Distance Gap:
Orthogonal (α ⊥ β): there exists μ with Pr[accept]=1, so minμ ||a−b||p = m(T′−1)
Non-orthogonal (α·β ≠ 0): all μ give Pr[accept] ≤ 1/2, so minμ ||a−b||p ≥ mT′
Approximation ratio = (T′/(T′−1))1/p ≥ 1 + Ω(1/T′) = 1+ε
Reduction size: N = 2m(1/c + O(log²log(1/ε)/log(1/ε))) → BCP requires Ω(N2−δ). Dimension d = O(log N). [Proof of Theorem 4.1 complete ✔]
minμ ||a−b||p
m(T'−1)
mT'
orthogonal
distance
non-orthogonal
distance
(1+ε) factor
Made byBobr AI
Bobr AI

DESIGNER-MADE
PRESENTATION,
GENERATED FROM
YOUR PROMPT

Create your own professional slide deck with real images, data charts, and unique design in under a minute.

Generate For Free

Hardness of Approximate Nearest Neighbor Search - Theory

Explaining the conditional hardness of Approximate Nearest Neighbor (ANN) search based on SETH and innovations like Algebraic Geometry (AG) codes.

Seminar Presentation

Hardness of Approximate Nearest Neighbor Search

Aviad Rubinstein — Harvard University, 2018

Presented by: [Your Name]

arXiv:1803.00904

Talk Outline

1. Problem Setup

ANN & Bichromatic Closest Pair

2. Main Results

What does the paper prove?

3. Proof Ingredients

SETH, OVC, Distributed PCP, AG Codes

4. Theorem 3.1 (FULL PROOF)

MA-Communication Protocol

5. Theorem 4.1 (FULL PROOF)

Hardness of Approx. Closest Pair

6. Extensions & Open Questions

SETH / k-SAT

Orthogonal Vectors

Bichromatic Closest Pair

Approx. Nearest Neighbor

Problem Setup

Approximate Nearest Neighbor (ANN)

ANN Search

Bichromatic Closest Pair (BCP)

BCP is easier (offline). Hardness of BCP ⟹ hardness of ANN.

Metrics considered: Euclidean (ℓ₂), Manhattan (ℓ₁), Hamming, Edit distance

Main Results

What Does the Paper Prove?

Theorem 1.1 (Euclidean / Manhattan / Hamming)

Assuming SETH, for every <i>δ</i> &gt; 0 there exists <i>ε</i> &gt; 0 such that computing a (1+<i>ε</i>)-approximation to Bichromatic Closest Pair requires Ω(<i>N</i><sup style="font-size: 0.7em;">2−δ</sup>) time.

i.e., no strongly subquadratic algorithm exists for any fixed approximation factor

Theorem 1.2 (Edit Distance)

Same conclusion holds for edit distance.

Corollary 1.3 (ANN Query Lower Bound)

Assuming SETH, for every <i>δ</i>, <i>c</i> &gt; 0 there exists <i>ε</i> such that no algorithm can preprocess <i>N</i> vectors in O(<i>N</i><sup style="font-size: 0.7em;">c</sup>) time and answer (1+<i>ε</i>)-ANN queries in O(<i>N</i><sup style="font-size: 0.7em;">1−δ</sup>) time.

First conditional hardness result for approximate (not just exact) ANN

Best algorithm: O(<i>N</i><sup style="font-size: 0.7em;">2−Θ̃(<i>ε</i><sup style="font-size: 0.7em;">1/3</sup>)</sup>) [ACW16]

This paper: Ω(<i>N</i><sup style="font-size: 0.7em;">2−δ</sup>) for some <i>ε</i>(<i>δ</i>)

PROOF INGREDIENTS

Four Key Building Blocks

SETH

Strong Exponential Time Hypothesis: k-SAT on n variables requires (2−ε)<sup>n</sup> time for some k.

Our hardness base assumption.

Orthogonal Vectors (OVC)

SETH implies: deciding if two sets A, B ⊂ {0,1}<sup>m</sup> contain an orthogonal pair requires Ω(N<sup>2−δ</sup>) time.

Intermediate stepping stone.

Distributed PCP [ARW17]

Split a PCP proof into 'half-proofs' to avoid PCP blowup. Enables hardness of approximation in P.

Framework from prior work — we improve it.

Algebraic Geometry (AG) Codes

Replace Reed-Solomon with AG codes: constant rate + systematic + polynomial closure.

Key innovation: eliminates log-factor overhead in communication.

SETH

OVC

Dist. PCP + AG Codes

BCP Hardness

Motivation for Distributed PCP

Why Standard PCP Fails Here

<i>k</i>-SAT instance &phi; on <i>n</i> variables

N &approx; 2<sup>n/2</sup> gadgets

standard reduction

PCP Theorem gives &phi;' on <i>n' &gg; n</i> variables

construct gadgets

N' &approx; 2<sup>n'/2</sup> &gg; 2<sup>n</sup> gadgets

Running time lower bound becomes meaningless!

Distributed PCP [ARW17]

Construct a <strong style="color: #D48B2A;">'half-proof'</strong> &pi;(&alpha;), &pi;(&beta;) for each half-assignment &alpha;, &beta; &isin; {0,1}<sup>n/2</sup>

Number of gadgets stays <span style="font-family: 'SFMono-Regular', Consolas, monospace; background: #F7F4EF; padding: 4px 10px; border-radius: 6px; border: 1px solid #EAEAEA; font-weight: 600;">N &approx; 2<sup>n/2</sup></span> &mdash; no blowup!

Key insight: distribute the proof generation across the two parties of the reduction.

ALGEBRAIC GEOMETRY CODES

Replacing Reed-Solomon with AG Codes

Property

Reed-Solomon Code

AG Code

Rate

&Theta;(1/log n)

&mdash; BAD

Constant (&ge; 0.1)

&mdash; GOOD

Distance

&Theta;(1/log n)

Constant (&ge; 0.1)

Systematic

Yes

Polynomial Closure

Reed-Solomon rate = &Theta;(1/log n)

Merlin's message: O(n/T &middot; log n) bits, Bob's message: O(T log n) bits

Need T = &Omega;(log n) &rarr; Bob sends &Omega;(log&sup2; n) bits &rarr; only (1+1/poly(n))-hardness

AG Code rate = &Theta;(1)

Merlin's message: O(n log T / T) bits, Bob's message: O(T log T) bits

T can be a large CONSTANT &rarr; constant-factor hardness gap!

AG codes (Goppa 1981, SAK+01): multivariate rational functions over finite fields &mdash; generalize Reed-Solomon while achieving constant rate.

THEOREM 3.1 — FULL PROOF

MA-Communication Protocol for Set Disjointness

For every <i style="font-family: 'Playfair Display', serif; font-weight: 600;">T &isin; [2,m]</i>, there is an efficient MA-communication protocol for Set Disjointness over universe <i style="font-family: 'Playfair Display', serif; font-weight: 600;">[m]</i> where:

<div style="display: flex; flex-direction: column; gap: 14px;"> <div style="display: flex; gap: 15px; align-items: flex-start;"> <span style="font-weight: 700; color: #D48B2A; font-size: 24px; min-width: 25px;">1.</span> <span>Merlin sends Alice <i style="font-family: 'Playfair Display', serif; color: #2C2C2C; font-weight: 600;">O(m log T / T)</i> bits</span> </div> <div style="display: flex; gap: 15px; align-items: flex-start;"> <span style="font-weight: 700; color: #D48B2A; font-size: 24px; min-width: 25px;">2.</span> <span>Alice and Bob toss <i style="font-family: 'Playfair Display', serif; color: #2C2C2C; font-weight: 600;">O(log m)</i> coins</span> </div> <div style="display: flex; gap: 15px; align-items: flex-start;"> <span style="font-weight: 700; color: #D48B2A; font-size: 24px; min-width: 25px;">3.</span> <span>Bob sends Alice <i style="font-family: 'Playfair Display', serif; color: #2C2C2C; font-weight: 600;">O(T log T)</i> bits</span> </div> <div style="display: flex; gap: 15px; align-items: flex-start;"> <span style="font-weight: 700; color: #D48B2A; font-size: 24px; min-width: 25px;">4.</span> <span>Alice returns <span style="color: #2e7d32; font-weight: 600; background-color: rgba(46,125,50,0.1); padding: 2px 8px; border-radius: 4px;">Accept</span> or <span style="color: #c62828; font-weight: 600; background-color: rgba(198,40,40,0.1); padding: 2px 8px; border-radius: 4px;">Reject</span></span> </div> </div>

<span style="font-weight: 600;">Disjoint sets</span> &rarr; Alice <span style="color: #2e7d32; font-weight: 600;">always accepts</span>.

<span style="font-weight: 600;">Non-disjoint</span> &rarr; Alice accepts with probability &le; <span style="font-family: 'Playfair Display', serif; font-weight: 600; font-size: 24px;">1/2</span>.

What is Set Disjointness?

<div style="color: #4A4A4A; font-size: 23px; line-height: 1.6; margin-bottom: 20px;"> Alice holds <span style="font-weight: 600; color: #2C2C2C; font-family: 'Playfair Display', serif;">&alpha; &sube; [m]</span>, Bob holds <span style="font-weight: 600; color: #2C2C2C; font-family: 'Playfair Display', serif;">&beta; &sube; [m]</span>.<br> They want to decide: is <span style="font-weight: 600; color: #D48B2A; background-color: rgba(212, 139, 42, 0.1); padding: 2px 8px; border-radius: 6px; font-family: 'Playfair Display', serif;">&alpha; &cap; &beta; = &empty;</span> ? </div> <div style="color: #4A4A4A; font-size: 21px; line-height: 1.5; margin-bottom: 25px; padding: 15px; background: #FAFAFA; border-left: 4px solid #7A7A7A; border-radius: 0 8px 8px 0;"> <span style="font-weight: 700; color: #2C2C2C;">MA Protocol</span>: Merlin (prover) sends a hint, then Alice and Bob execute a randomized communication protocol. </div>

<div style="display: flex; align-items: center; justify-content: center; margin-top: auto; padding: 25px; background-color: #FAFAFA; border-radius: 12px; gap: 35px; border: 1px solid #EAEAEA;"> <!-- Merlin --> <div style="display: flex; flex-direction: column; align-items: center; gap: 8px;"> <div style="width: 0; height: 0; border-left: 25px solid transparent; border-right: 25px solid transparent; border-bottom: 45px solid #D48B2A; position: relative;"> <span style="position: absolute; bottom: -50px; left: -9px; font-size: 16px; font-weight: 700; color: #FFF;">M</span> </div> <div style="font-size: 15px; color: #4A4A4A; font-weight: 600; margin-top: 8px;">Merlin</div> </div> <!-- Arrow M->A --> <div style="display: flex; flex-direction: column; align-items: center; margin-bottom: 25px;"> <div style="font-size: 13px; color: #D48B2A; font-weight: 700; letter-spacing: 1px;">HINT</div> <div style="font-size: 24px; color: #D48B2A;">&rarr;</div> </div> <!-- Alice --> <div style="display: flex; flex-direction: column; align-items: center; gap: 8px;"> <div style="width: 55px; height: 55px; background-color: #ffffff; border: 3px solid #2C2C2C; border-radius: 50%; display: flex; align-items: center; justify-content: center; font-size: 20px; font-weight: 700; color: #2C2C2C; box-shadow: 0 4px 6px rgba(0,0,0,0.05);">A</div> <div style="font-size: 15px; color: #4A4A4A; font-weight: 600;">Alice</div> </div> <!-- Arrow A<->B --> <div style="display: flex; flex-direction: column; align-items: center; margin-bottom: 25px;"> <div style="font-size: 13px; color: #7A7A7A; font-weight: 700; letter-spacing: 1px;">COMM</div> <div style="font-size: 24px; color: #2C2C2C;">&harr;</div> </div> <!-- Bob --> <div style="display: flex; flex-direction: column; align-items: center; gap: 8px;"> <div style="width: 55px; height: 55px; background-color: #ffffff; border: 3px solid #2C2C2C; border-radius: 50%; display: flex; align-items: center; justify-content: center; font-size: 20px; font-weight: 700; color: #2C2C2C; box-shadow: 0 4px 6px rgba(0,0,0,0.05);">B</div> <div style="font-size: 15px; color: #4A4A4A; font-weight: 600;">Bob</div> </div> </div>

Why do we care?

This protocol is the core <span style="font-weight: 600; color: #2C2C2C;">communication complexity primitive</span> used to build the hardness reduction for BCP.

<span style="font-weight: 500;">number of gadgets</span> (vectors) we enumerate.

<span style="font-weight: 500;">approximation gap</span> we achieve.

Compare to prior work [AW09]: <span style="font-weight: 600; color: #2C2C2C; font-family: 'Playfair Display', serif;">O(m log n / T)</span> and <span style="font-weight: 600; color: #2C2C2C; font-family: 'Playfair Display', serif;">O(T log&sup2; n)</span> &mdash; an extra log factor due to Reed-Solomon rate.

THEOREM 3.1 — PROOF (STEP 1: SETUP)

Encoding the Inputs with AG Codes

Step 1

Step 2

Step 3

Key Observation

Theorem 3.1 &mdash; Proof (Step 2: The Protocol)

The Four-Step Protocol

Step 1 &mdash; Merlin's Message

Merlin sends Alice <span style="font-family: 'Playfair Display', serif; font-style: italic;">&mu;&#770;</span> &mdash; the alleged encoding of <span style="font-family: 'Playfair Display', serif; font-style: italic;">&mu; = &Sigma;<sub>t</sub> &mu;<sup>t</sup></span>

Message size: <span style="font-family: 'Playfair Display', serif; font-style: italic;">n<sub>C</sub> &middot; &lceil;log T&rceil; = O(m log T / T)</span> bits <span style="color: #10B981; font-weight: bold; margin-left: 4px;">&#10003;</span>

Step 2 &mdash; Shared Randomness

Alice and Bob jointly pick a random index <span style="font-family: 'Playfair Display', serif; font-style: italic;">i<sup>*</sup> &isin; [n<sub>C</sub>]</span>

(<span style="font-family: 'Playfair Display', serif; font-style: italic;">O(log m)</span> random coins)

Step 3 &mdash; Bob's Message

Bob sends Alice <span style="font-family: 'Playfair Display', serif; font-style: italic;">C(&beta;<sup>t</sup>)<sub>i<sup>*</sup></sub></span> for all <span style="font-family: 'Playfair Display', serif; font-style: italic;">t &isin; [T]</span>

Message size: <span style="font-family: 'Playfair Display', serif; font-style: italic;">T &middot; &lceil;log T&rceil; = O(T log T)</span> bits <span style="color: #10B981; font-weight: bold; margin-left: 4px;">&#10003;</span>

Step 4 &mdash; Alice's Decision

Alice accepts iff ALL of:

<span style="font-family: 'Playfair Display', serif; font-style: italic;">&mu;&#770;</span> is a valid codeword in <span style="font-family: 'Playfair Display', serif; font-style: italic;">C'</span> &nbsp; <span style="color: #888888; font-size: 0.9em;">[checks Merlin's honesty]</span>

<span style="font-family: 'Playfair Display', serif; font-style: italic;">&mu;&#770;<sub>i<sup>*</sup></sub> = &Sigma;<sub>t</sub> C(&alpha;<sup>t</sup>)<sub>i<sup>*</sup></sub> &middot; C(&beta;<sup>t</sup>)<sub>i<sup>*</sup></sub></span> &nbsp; <span style="color: #888888; font-size: 0.9em;">[spot-check consistency]</span>

<span style="font-family: 'Playfair Display', serif; font-style: italic;">&mu;&#770;<sub>i</sub> = 0</span> for all <span style="font-family: 'Playfair Display', serif; font-style: italic;">i &isin; [m/T]</span> &nbsp; <span style="color: #888888; font-size: 0.9em;">[checks disjointness]</span>

Total communication:

Merlin &rarr; Alice:

O(m log T / T)

Bob &rarr; Alice:

O(T log T)

Coins:

O(log m)

Repeat steps 2&ndash;4 a constant number of times (in parallel) to boost soundness to 1/2.

THEOREM 3.1 — PROOF (STEP 3: CORRECTNESS)

Completeness & Soundness Analysis

Completeness

Claim: If α ∩ β = ∅, Alice always accepts.

Since sets are disjoint: μ<sup style='font-size: 0.7em'>t</sup><sub style='font-size: 0.7em'>i</sub> = C(α<sup style='font-size: 0.7em'>t</sup>)<sub style='font-size: 0.7em'>i</sub> · C(β<sup style='font-size: 0.7em'>t</sup>)<sub style='font-size: 0.7em'>i</sub> = 0 for all i,t<br><span style='color: #497A49; font-size: 20px; display: inline-block; margin-top: 8px;'>(systematic property: entries on S<sub style='font-size: 0.7em'>n</sub> are in {0,1}, so product of disjoint bits = 0)</span>

By linearity of C': μ = Σ<sub style='font-size: 0.7em'>t</sub> μ<sup style='font-size: 0.7em'>t</sup> is the all-zeros codeword

Merlin sends the true μ̂ = μ = 0

All three checks (a), (b), (c) pass <span style='color: #497A49; font-weight: bold;'>✓</span>

Soundness

Claim: If α ∩ β ≠ ∅, Alice accepts with prob ≤ 1/2.

If Alice accepts with nonzero probability → μ̂ passes codeword check (a)

If Alice accepts with prob > 1 − δ<sub style='font-size: 0.7em'>C'</sub>: μ̂ must equal the true μ<br><span style='color: #BA771B; font-size: 20px; display: inline-block; margin-top: 8px;'>(by distance of C' ≥ 0.1)</span>

But then check (c) implies μ<sub style='font-size: 0.7em'>i</sub> = 0 for all i → sets are disjoint. Contradiction!

So Alice accepts with prob ≤ 1 − δ<sub style='font-size: 0.7em'>C'</sub> ≤ 0.9

After O(1) parallel repetitions of steps 2–4: soundness ≤ 1/2 <span style='color: #BA771B; font-weight: bold;'>✓</span>

<strong style='color: #2C2C2C;'>Key role of AG codes:</strong> The constant rate of C means n<sub style='font-size: 0.7em'>C</sub> = O(m/T) — so Merlin's message is O(m log T / T), not O(m log n / T) as with Reed-Solomon. This is the crucial improvement.

[Proof of Theorem 3.1 complete ✓]

THEOREM 4.1 — FULL PROOF

From OVC to Bichromatic Closest Pair

<strong>Theorem 4.1.</strong> Unless SETH and OVC are false, for any <em>&ell;<sub>p</sub></em> metric (constant <em>p</em>), for every <em>&delta; &gt; 0</em> there exists <em>&epsilon; = &epsilon;(&delta;,p) &gt; 0</em> such that given sets <em>A, B &sub; {0,1}<sup>d</sup></em> of <em>N</em> vectors (<em>d = O(log N)</em>), computing a <em>(1+&epsilon;)</em>-approximation to Bichromatic Closest Pair requires <em>&Omega;(N<sup>2&minus;&delta;</sup>)</em> time.

<div>We reduce Orthogonal Vectors (OVC) to (1+<em>&epsilon;</em>)-approx BCP.<br><br>Given OV instance (<em>A<sub>OV</sub></em>, <em>B<sub>OV</sub></em>) over {0,1}<sup><em>m</em></sup>:<br><ol style="margin-top: 15px; margin-bottom: 0; padding-left: 28px; display: flex; flex-direction: column; gap: 12px; color: #2C2C2C;"> <li><span style="color: #4A4A4A;">Use Theorem 3.1's MA-protocol with parameter <em>T</em></span></li> <li><span style="color: #4A4A4A;">Encode each OV vector as a BCP vector by 'embedding' the protocol's acceptance behavior</span></li> <li><span style="color: #4A4A4A;">Show: orthogonal pairs &rarr; small distance, non-orthogonal &rarr; larger distance</span></li> <li><span style="color: #4A4A4A;">The approximation ratio separates the two cases</span></li></ol></div>

<div style="font-family: inherit;"> Set <em>T = O(log(1/&epsilon;) / log log(1/&epsilon;))</em> so that: <div style="padding-left: 20px; font-family: 'SFMono-Regular', Consolas, monospace; margin: 15px 0; color: #2C2C2C; font-size: 20px;"> T' = 2<sup>O(T log T)</sup> = O(1/&epsilon;) </div> <div style="color: #7A7A7A; font-size: 17px; margin-bottom: 20px; padding-left: 20px;">(number of Bob messages)</div> This makes the distance gap exactly a <span style="font-weight: 700; color: #2C2C2C;">(1 + &Omega;(1/T')) = (1+&epsilon;)</span> factor.<br><br> <div style="background-color: #FDF9F3; border-radius: 8px; border-left: 4px solid #D48B2A; padding: 18px 24px; margin-top: 15px;"> <span style="font-weight: 700; color: #D48B2A; text-transform: uppercase; font-size: 15px; letter-spacing: 1.5px;">Size blowup</span><br> <div style="font-family: 'SFMono-Regular', Consolas, monospace; font-size: 18px; color: #2C2C2C; margin-top: 8px;"> N = 2<sup>m &middot; (1/c + O(log&sup2; log(1/&epsilon;) / log(1/&epsilon;)))</sup> vectors </div> </div></div>

THEOREM 4.1 — PROOF (STEP 1: CONSTRUCTION)

Building the BCP Vectors

THEOREM 4.1 — PROOF (STEP 2: DISTANCE ANALYSIS)

From Inner Products to ℓ<sub>p</sub> Distances

Key Calculation — Hamming distance probability:

Pr[a<sub>i,j,b</sub> &ne; b<sub>i,j,b</sub>] = 1/2 + 1/(2T&prime;) &minus; (1/T&prime;) &middot; Pr[Alice accepts on &alpha;, &beta;, &mu;]

<b>Orthogonal (&alpha; &perp; &beta;):</b> there exists &mu; with Pr[accept]=1, so min<sub>&mu;</sub> ||a&minus;b||<sup>p</sup> = m(T&prime;&minus;1)

<b>Non-orthogonal (&alpha;&middot;&beta; &ne; 0):</b> all &mu; give Pr[accept] &le; 1/2, so min<sub>&mu;</sub> ||a&minus;b||<sup>p</sup> &ge; mT&prime;

Approximation ratio = (T&prime;/(T&prime;&minus;1))<sup>1/p</sup> &ge; 1 + &Omega;(1/T&prime;) = 1+&epsilon; <span style="font-family: sans-serif; margin-left: 8px;">&#10004;</span>

<b>Reduction size:</b> N = 2<sup>m(1/c + O(log&sup2;log(1/&epsilon;)/log(1/&epsilon;)))</sup> &rarr; BCP requires &Omega;(N<sup>2&minus;&delta;</sup>). Dimension d = O(log N). <span style="color: #2E7D32; font-weight: 600; margin-left: 10px;">[Proof of Theorem 4.1 complete &#10004;]</span>