Summary
The idea of randomized algorithms is to use random numbers to make decisions.
- Monte Carlo algorithms have guaranteed (good) runtime but might give an incorrect answer.
- Las Vegas algorithms might be slow, but are always correct.
Matrix Products
$A$ and $B$ are $n \times n$ matrices. We want to compute their product $C = AB$. The naive algorithm computes dot products between rows of $A$ and columns of $B$, taking $O(n^3)$ time ($n^2$ entries, each requiring $O(n)$ work). Strassen found an $O(n^{\log_2 7}) \approx O(n^{2.81})$ algorithm, with the best known bound today being roughly $O(n^{2.37})$.
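The naive algorithm can be sketched in a few lines (plain Python lists of lists; function name is ours):

```python
# Naive O(n^3) matrix product: one O(n) dot product per entry of C.
def matmul(A, B):
    n = len(A)
    C = [[0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            C[i][j] = sum(A[i][k] * B[k][j] for k in range(n))
    return C
```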
Matrix Product Verification
Given $n \times n$ matrices $A$, $B$, and a claimed product $C$, we want to verify whether $AB = C$. If $AB = C$, then for any vector $v$, $A(Bv) = Cv$. Checking this equality for a single $v$ requires 3 matrix-vector products, with total runtime $O(n^2)$.
Algorithm (mod 2 arithmetic, entries in $\{0, 1\}$):
- Pick a random vector $v \in \{0, 1\}^n$, each entry independent and uniform.
- Compute $A(Bv)$ and $Cv$.
- If $A(Bv) = Cv$, output YES. Else output NO.
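A sketch of this verifier (repeating the check drives the false-YES probability from $\frac{1}{2}$ per trial down to $2^{-\text{trials}}$; the function name and trial count are our choices):

```python
import random

# Freivalds-style verification over GF(2): O(n^2) per trial.
def verify(A, B, C, trials=20):
    n = len(A)
    for _ in range(trials):
        v = [random.randint(0, 1) for _ in range(n)]
        Bv = [sum(B[i][k] * v[k] for k in range(n)) % 2 for i in range(n)]
        ABv = [sum(A[i][k] * Bv[k] for k in range(n)) % 2 for i in range(n)]
        Cv = [sum(C[i][k] * v[k] for k in range(n)) % 2 for i in range(n)]
        if ABv != Cv:
            return False  # witness found: definitely AB != C (mod 2)
    return True  # AB = C (mod 2), up to error probability 2^-trials
```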
Example
For instance, take $A = \begin{pmatrix} 1 & 1 \\ 0 & 1 \end{pmatrix}$, $B = \begin{pmatrix} 1 & 0 \\ 1 & 1 \end{pmatrix}$, $C = \begin{pmatrix} 1 & 1 \\ 1 & 1 \end{pmatrix}$. Here $AB = \begin{pmatrix} 0 & 1 \\ 1 & 1 \end{pmatrix} \neq C$ (mod 2). Pick $v = (0, 1)^T$: $Bv = (0, 1)^T$, $A(Bv) = (1, 1)^T$, $Cv = (1, 1)^T$. They match, so we incorrectly output YES. But pick $v = (1, 0)^T$: $Bv = (1, 1)^T$, $A(Bv) = (0, 1)^T$, $Cv = (1, 1)^T$. They differ, so we correctly output NO.
This is a Monte Carlo algorithm since a fixed $v$ can miss errors. Making $v$ random gives a probabilistic guarantee:
Theorem
If $AB \neq C$, the algorithm outputs NO with probability $\geq \frac{1}{2}$.
Proof
Let $D = AB - C \neq 0$. Pick a row $i$ with some nonzero entry $D_{ij} \neq 0$.
$$\Pr[\text{output NO}] \geq \Pr[(Dv)_i \neq 0].$$
Expanding the dot product:
$$(Dv)_i = \sum_k D_{ik} v_k = D_{ij} v_j + \sum_{k \neq j} D_{ik} v_k.$$
Let $S = \sum_{k \neq j} D_{ik} v_k$ (this is fixed once we choose all $v_k$ for $k \neq j$).
- If $S = 0$: $(Dv)_i \neq 0$ iff $v_j = 1$, which happens with probability $\frac{1}{2}$.
- If $S = 1$: $(Dv)_i \neq 0$ iff $v_j = 0$ (since $1 + 1 \equiv 0$ mod 2), also probability $\frac{1}{2}$.
In either case $\Pr[(Dv)_i \neq 0] = \frac{1}{2}$, so $\Pr[\text{output NO}] \geq \frac{1}{2}$.
Rank Finding
Given a set $A$ of $n$ distinct numbers, find the element $x$ with $\mathrm{rank}(x) = k$, i.e., the $k$-th smallest. The deterministic algorithm works as follows:
- Pick a pivot $p \in A$.
- Compute $L = \{a \in A : a < p\}$ and $G = \{a \in A : a > p\}$. Then $\mathrm{rank}(p) = |L| + 1$.
- If $\mathrm{rank}(p) = k$, return $p$. If $k < \mathrm{rank}(p)$, recurse on $L$. Else find rank $k - \mathrm{rank}(p)$ in $G$.
Quick Select
Paranoid quick select adjusts the above by picking a random pivot and re-picking if it's bad (i.e., $|L| > \frac{3}{4}n$ or $|G| > \frac{3}{4}n$). A random pivot is good with probability $\frac{1}{2}$, so we expect 2 selections to get a good pivot, giving:
$$E[T(n)] \leq E[T(\tfrac{3}{4}n)] + O(n) \implies E[T(n)] = O(n).$$
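A sketch of paranoid quick select under these conventions (1-indexed rank, distinct inputs assumed; names are ours):

```python
import random

# Paranoid quick select: re-pick the pivot until both sides hold
# at most 3n/4 elements, giving expected O(n) total work.
def quickselect(A, k):
    """Return the k-th smallest (1-indexed) of the distinct numbers in A."""
    if len(A) == 1:
        return A[0]
    while True:
        p = random.choice(A)
        L = [a for a in A if a < p]
        G = [a for a in A if a > p]
        if len(L) <= 3 * len(A) / 4 and len(G) <= 3 * len(A) / 4:
            break  # good pivot
    if k == len(L) + 1:
        return p  # rank(p) = |L| + 1 = k
    if k <= len(L):
        return quickselect(L, k)
    return quickselect(G, k - len(L) - 1)
```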
Quick Sort
Quick sort has lower constant-factor overhead and uses less memory than merge sort. The basic algorithm for distinct elements:
- Pick a random pivot $p$.
- Compute $L = \{a : a < p\}$ and $G = \{a : a > p\}$.
- Return $\mathrm{QuickSort}(L) + [p] + \mathrm{QuickSort}(G)$.
We again analyze a paranoid version: re-pick the pivot if $|L| > \frac{3}{4}n$ or $|G| > \frac{3}{4}n$. Expected 2 selections until a good pivot. The recurrence is:
$$T(n) = T(|L|) + T(|G|) + O(n), \quad \text{with } |L|, |G| \leq \tfrac{3}{4}n.$$
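The paranoid version can be sketched as follows (distinct elements assumed; names are ours):

```python
import random

# Paranoid quick sort: reject pivots that leave either side with
# more than 3n/4 elements, so both sides shrink by a constant factor.
def quicksort(A):
    if len(A) <= 1:
        return A
    while True:
        p = random.choice(A)
        L = [a for a in A if a < p]
        G = [a for a in A if a > p]
        if len(L) <= 3 * len(A) / 4 and len(G) <= 3 * len(A) / 4:
            break  # good pivot
    return quicksort(L) + [p] + quicksort(G)
```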
Each level of the recursion tree sums to $O(n)$ work. The shortest root-to-leaf path has length $\log_4 n$ (always taking the $\frac{1}{4}n$ branch) and the longest has length $\log_{4/3} n$ (always taking the $\frac{3}{4}n$ branch). Both are $\Theta(\log n)$, so total work is $O(n \log n)$ in expectation.
High-Probability Bound
We can strengthen “in expectation” to “with high probability” using the union bound and Chernoff bound.
Chernoff bound: for a sum $X$ of independent indicator random variables with mean $\mu = E[X]$ and any $0 < \delta \leq 1$,
$$\Pr[X \geq (1 + \delta)\mu] \leq e^{-\delta^2 \mu / 3}.$$
(The same bound holds for the lower tail $\Pr[X \leq (1 - \delta)\mu]$.)
Let $X_{e,d}$ be an indicator r.v. for picking a bad pivot at recursion depth $d$ for the subarray containing element $e$. Since a pivot is bad with probability $\frac{1}{2}$, we have $E[X_{e,d}] = \frac{1}{2}$.
At recursion depth $d$, let $X = \sum_{d' \leq d} X_{e,d'}$ count the bad pivots so far, with $\mu = E[X] = \frac{d}{2}$. The probability of exceeding the expected number of bad pivots by 20% is
$$\Pr[X \geq 1.2\mu] \leq e^{-(0.2)^2 \mu / 3} = e^{-d/150}.$$
Set $d = c \ln n$ for a sufficiently large constant $c$. Then:
$$\Pr[X \geq 1.2\mu] \leq n^{-c/150} \leq \frac{1}{n^2} \quad \text{for } c \geq 300.$$
If instead $X < 1.2\mu = 0.6d$, then we had more than $0.4d$ good pivots. Each good pivot shrinks the subarray by a factor of at least $\frac{3}{4}$, so the subarray containing $e$ at depth $d$ has size at most $n \left(\frac{3}{4}\right)^{0.4d}$, which drops below 1 for $d = c \ln n$ with $c$ large enough. This means after $d = O(\log n)$ levels, the recursion has terminated.
By the union bound over all $n$ elements, the failure probability is at most $n \cdot \frac{1}{n^2} = \frac{1}{n}$. So with probability $\geq 1 - \frac{1}{n}$, quicksort runs in $O(n \log n)$ time.
Problem
We have a unit square with some subset $S$ of area $A$. We support two operations:
- Randomly sample a point uniformly in the square ($O(1)$ time).
- Check if a point is in $S$ ($O(1)$ time).
a) Give a randomized algorithm to estimate the area of $S$.
b) How many trials $m$ do we need so that our estimate $\hat{A}$ satisfies $|\hat{A} - A| \leq \epsilon A$ with probability $\geq 1 - \delta$?
Solution
a) Sample $m$ points uniformly at random. Let $X_i = 1$ if point $i$ lands in $S$, and $X_i = 0$ otherwise. Then:
$$E\left[\frac{1}{m} \sum_{i=1}^{m} X_i\right] = \frac{1}{m} \sum_{i=1}^{m} \Pr[X_i = 1] = A.$$
So $\hat{A} = \frac{1}{m} \sum_{i=1}^{m} X_i$ is an unbiased estimator of $A$.
b) Each $X_i \sim \mathrm{Bernoulli}(A)$ (since $\Pr[\text{point } i \in S] = A$ for uniform sampling on the unit square), so $X = \sum_{i=1}^{m} X_i$ is binomial with $\mu = E[X] = mA$. We want $|\hat{A} - A| \leq \epsilon A$, i.e. $|X - \mu| \leq \epsilon \mu$. By Chernoff (both tails bounded by $e^{-\epsilon^2 \mu / 3}$):
$$2e^{-\epsilon^2 m A / 3} \leq \delta \iff m \geq \frac{3}{\epsilon^2 A} \ln \frac{2}{\delta}.$$
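As a sanity check, here is a sketch of the part-(a) estimator; the quarter-disk region (area $\pi/4$) is our own example of a set $S$, and the seed is fixed for reproducibility:

```python
import random

# Monte Carlo area estimation: the fraction of uniform samples that
# land in S is an unbiased estimate of its area A.
def estimate_area(in_S, m, seed=0):
    rng = random.Random(seed)
    hits = sum(in_S(rng.random(), rng.random()) for _ in range(m))
    return hits / m

# Example region S (ours): the quarter disk x^2 + y^2 <= 1, area pi/4.
def quarter_disk(x, y):
    return x * x + y * y <= 1.0

A_hat = estimate_area(quarter_disk, 100_000)  # near pi/4 ~ 0.785
```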
Problem - Seating Students
$n$ students arrive one by one to a room with $n$ seats. Each student picks a seat uniformly at random. If it's taken, they try one more random seat. If that's also taken, they need help (and are seated in a free seat). Let $X$ be the number of students that need help.
a) Find $E[X]$. (Hint: $\sum_{i=1}^{n} i^2 = \frac{n(n+1)(2n+1)}{6}$)
b) Prove that $\Pr[X \geq \frac{n}{2}] \leq e^{-n/36}$.
Solution
a) Let $X_i$ be the indicator that student $i$ needs help. When student $i$ arrives, $i - 1$ seats are occupied, so both random picks hit an occupied seat with probability $\left(\frac{i-1}{n}\right)^2$. By linearity of expectation:
$$E[X] = \sum_{i=1}^{n} \left(\frac{i-1}{n}\right)^2 = \frac{1}{n^2} \sum_{j=1}^{n-1} j^2 = \frac{(n-1)n(2n-1)}{6n^2} \approx \frac{n}{3}.$$
b) With $\mu = \frac{n}{3} \geq E[X]$: we want to show $\Pr[X \geq \frac{n}{2}] \leq e^{-n/36}$. The $X_i$ are independent, since $\Pr[X_i = 1] = \left(\frac{i-1}{n}\right)^2$ regardless of which seats happen to be occupied. Since $\frac{n}{2} = (1 + \frac{1}{2}) \cdot \frac{n}{3}$, apply Chernoff with $\delta = \frac{1}{2}$:
$$\Pr[X \geq \tfrac{n}{2}] \leq e^{-(1/2)^2 (n/3) / 3} = e^{-n/36}.$$
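A quick simulation (our own sketch, in which a helped student is then seated in the first free seat) agrees with the closed form for $E[X]$:

```python
import random

# Simulate the seating process: count students whose two random
# picks both hit occupied seats.
def students_needing_help(n, rng):
    occupied = [False] * n
    helped = 0
    for _ in range(n):
        s = rng.randrange(n)           # first random pick
        if occupied[s]:
            s = rng.randrange(n)       # second (and last) random pick
        if occupied[s]:
            helped += 1                # needs help...
            s = occupied.index(False)  # ...and is seated in a free seat
        occupied[s] = True
    return helped

rng = random.Random(0)
n, trials = 30, 2000
avg = sum(students_needing_help(n, rng) for _ in range(trials)) / trials
exact = (n - 1) * n * (2 * n - 1) / (6 * n**2)  # ~ n/3
```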