## Summary
Competitive analysis is concerned with online settings, where inputs arrive one at a time. Offline algorithms see the entire input sequence up front, which makes them inherently more powerful. We will be discussing how to design good algorithms when we don't have access to all the information at once.
To measure efficiency, we compare our online algorithm against the best possible offline algorithm and bound how much worse it can be on every input sequence.
> [!definition]
> A deterministic online algorithm $A$ is $alpha$-competitive if there is a constant $k$ such that, for all request sequences $S$, the cost satisfies $C_A (S) <= alpha dot C_"OPT" (S) + k$, where $C_"OPT" (S)$ is the cost of the optimal offline algorithm on $S$.
## Self-Organizing Lists
Given a list $L$ of $n$ distinct elements, we need to support:
- `Access(x)` - find and return $x$. The cost of `Access(x)` is the rank of $x$, which is simply the index of $x$ in $L$.
> [!example]
>
> | Index   | 1 | 2 | 3 | 4 | 5 |
> | ------- | - | - | - | - | - |
> | Element | A | C | B | E | D |
>
> `Access(B)` has cost 3, since $B$ is at index 3.
Now, if we want to access $D$ a lot in our example, keeping it at the end of the list would be quite inefficient. The key is to find the best way to rearrange the list to serve our accesses.
Let us say that after every access, we can rearrange the list via transpositions of adjacent elements, i.e., swapping two neighbors. We will charge a cost of $1$ for each transposition. So any given `Access(x)` performs the access and then some number of transpositions. We will call the arrangement of the list after the $i$-th access $L_i$, with $L_0$ the initial list.
We will thus rigorously define the cost of serving a request sequence $S = (x_1, x_2, dots)$ as
$$ C(S) = sum_i ("rank"_(L_(i-1)) (x_i) + "number of transpositions at step " i) $$
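To make the cost model concrete, here is a minimal Python sketch of such a list; the class and method names are my own, not from the lecture.

```python
class SelfOrganizingList:
    """Charges rank(x) per access and 1 per adjacent transposition."""

    def __init__(self, elements):
        self.items = list(elements)
        self.total_cost = 0

    def access(self, x):
        rank = self.items.index(x) + 1   # 1-indexed rank of x in the current list
        self.total_cost += rank          # access costs rank_L(x)
        return rank

    def transpose(self, i):
        """Swap the adjacent elements at positions i and i + 1 (0-indexed); costs 1."""
        self.items[i], self.items[i + 1] = self.items[i + 1], self.items[i]
        self.total_cost += 1

# Reproducing the example list [A, C, B, E, D]: Access(B) costs 3.
L = SelfOrganizingList("ACBED")
print(L.access("B"))   # -> 3
```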
The worst-case cost is $Theta(n)$ per access. This might seem bad, but if the best offline algorithm is also expensive, it is still competitive.
Our adversary might want to exploit this data structure by repeatedly querying the elements from the back of the list toward the front (reminder that $S$ is the order of their queries).
If we do no transpositions, this takes $Theta(n)$ per access, i.e., $Theta(n dot |S|)$ in total. We cannot do much better than this, unfortunately, so even the do-nothing algorithm holds up against the optimal offline one on such a sequence.
Now let us say we access the same element $x$ over and over again.
The optimal approach would be to transpose $x$ to the front after the first access. The total cost would be at most $n$ for the first access, plus $n - 1$ transpositions to move $x$ to the front, plus $|S| - 1$ further queries of cost $1$ each. Hence, the optimal cost is $Theta(n + |S|)$. The difficulty is for our online algorithm to "see" this, as it won't know in advance that every query is $x$.
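As a quick arithmetic check of this claim, with illustrative numbers rather than anything from the lecture, compare never rearranging against moving the element to the front after its first access:

```python
n, k = 5, 100   # illustrative: n elements, k accesses to the element at the back

# Never rearrange: every access pays rank n.
cost_do_nothing = k * n

# Move to the front once (what OPT would do): n for the first access,
# n - 1 transpositions, then cost 1 for each of the remaining k - 1 accesses.
cost_move_once = n + (n - 1) + (k - 1)

print(cost_do_nothing, cost_move_once)   # 500 vs 108: Theta(n k) vs Theta(n + k)
```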
## Move-to-Front (MTF)
After `Access(x_i)`, move $x_i$ to the front via $"rank"_(L_(i-1)) (x_i) - 1$ adjacent transpositions, where $L_(i-1)$ is the list before the $i$-th access. The cost of step $i$ is therefore $c_i = 2 dot "rank"_(L_(i-1)) (x_i) - 1$.
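Here is a minimal sketch of MTF under this cost model (the function name is mine): each access at rank $r_i$ pays $r_i$ for the access plus $r_i - 1$ adjacent transpositions, i.e. $c_i = 2r_i - 1$.

```python
def mtf_total_cost(initial, requests):
    """Run Move-to-Front, charging 2 * rank - 1 per access, and return the total."""
    items = list(initial)
    total = 0
    for x in requests:
        rank = items.index(x) + 1              # access cost r_i
        total += 2 * rank - 1                  # plus r_i - 1 adjacent transpositions
        items.insert(0, items.pop(rank - 1))   # move x_i to the front
    return total

print(mtf_total_cost("ACBED", "BBBBB"))   # -> 9: the first access costs 5, the rest 1 each
```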
> [!theorem]
> MTF is 4-competitive.
We will spend the rest of the lecture proving this statement. The key idea is to track how far MTF’s list diverges from OPT’s list at each step, and show the distance stays bounded.
We need some notation.
Let $L_i$ denote MTF's list after access $i$ and $L_i^*$ denote OPT's list after access $i$. Define
$$ r_i = "rank"_(L_(i-1)) (x_i) quad "and" quad r_i^* = "rank"_(L_(i-1)^*) (x_i), $$
where $t_i$ is the number of transpositions OPT does at step $i$.
Hence, $c_i = 2r_i - 1$ and $c_i^* = r_i^* + t_i$. We will solve this with the potential method, using a function $Phi$ such that $Phi_i >= 0$ for all $i$. Recall that the amortized cost of step $i$ is $hat(c)_i = c_i + Phi_i - Phi_(i-1)$ and that $sum_i c_i <= sum_i hat(c)_i$ whenever $Phi_0 = 0$ and $Phi_n >= 0$.
We apply $Phi$ to the pair of lists $(L_i, L_i^*)$. Define an **inversion** between $L_i$ and $L_i^*$ to be a pair $(x, y)$ where $x$ is before $y$ in $L_i$ and $y$ is before $x$ in $L_i^*$.
In other words, taking $(x, y) = (B, D)$, this means B is before D in the MTF list, but after D in the OPT list. We now define our potential function as
$$ Phi_i = 2 dot |{(x, y) : (x, y) "is an inversion between " L_i " and " L_i^*}| $$
Intuitively, tracking inversions gives us a pseudo-distance between the two lists: the number of inversions is the number of adjacent transpositions needed to turn one list into the other. $Phi_0 = 0$ because the two lists start out identical, and $Phi_i >= 0$ for all $i$, so this is a valid potential function.
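As a sanity check, $Phi$ is easy to compute directly (quadratically, for clarity); the helper names below are mine, not from the lecture.

```python
from itertools import combinations

def inversions(L, L_star):
    """Pairs (x, y) with x before y in L but y before x in L_star."""
    pos = {x: i for i, x in enumerate(L_star)}
    return sum(1 for x, y in combinations(L, 2) if pos[x] > pos[y])

def phi(L, L_star):
    return 2 * inversions(L, L_star)

print(phi(list("ACBED"), list("ACBED")))   # identical lists -> 0
print(phi(list("BACED"), list("ACBED")))   # B moved to the front -> 2 * 2 = 4
```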
How does each step change $Phi$?
- `Access(x_i)` - the access itself does not reorder either list, so it does not change $Phi$.
- A transposition by OPT of an adjacent pair $(x, y)$ in $L_i^*$ - assuming $x$ was before $y$ in both lists, the pair becomes an inversion, hence $Delta Phi = +2$. If the pair was an inversion before, then $Delta Phi = -2$. Either way, each of OPT's $t_i$ transpositions changes $Phi$ by at most $+2$.
When MTF moves $x_i$ to the front, it passes over every element that was before $x_i$ in $L_(i-1)$. We classify each element $y != x_i$ by its position relative to $x_i$ in both lists:
- $A_i$ - $y$ is before $x_i$ in $L_(i-1)$ and before $x_i$ in $L_(i-1)^*$. The pair had the same order in both lists; MTF moves $x_i$ ahead of $y$, creating an inversion. $Delta Phi = +2$ per element.
- $B_i$ - $y$ is before $x_i$ in $L_(i-1)$ but after $x_i$ in $L_(i-1)^*$. The pair was already an inversion; MTF moves $x_i$ ahead of $y$, removing it. $Delta Phi = -2$ per element.
- $C_i$ - $y$ is after $x_i$ in $L_(i-1)$ but before $x_i$ in $L_(i-1)^*$. MTF doesn't change the relative order of $x_i$ and $y$. Unaffected.
- $D_i$ - $y$ is after $x_i$ in $L_(i-1)$ and after $x_i$ in $L_(i-1)^*$. Also unaffected.

Notice that $|A_i| + |B_i| = r_i - 1$ and $|A_i| + |C_i| = r_i^* - 1$. Before we do anything, $B_i$ and $C_i$ contain exactly the inversions involving $x_i$. Putting the two effects together, $Delta Phi_i <= 2(|A_i| - |B_i|) + 2t_i$.
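The four sets and the two counting identities are easy to check directly; the sketch below (my own code, not from the lecture) classifies the elements relative to $x_i$ in two illustrative lists.

```python
def classify(x, L, L_star):
    """Split the elements other than x into the sets A, B, C, D."""
    before = set(L[:L.index(x)])                  # before x in MTF's list
    before_star = set(L_star[:L_star.index(x)])   # before x in OPT's list
    A = before & before_star
    B = before - before_star
    C = before_star - before
    D = set(L) - before - before_star - {x}
    return A, B, C, D

L, L_star = list("ACBED"), list("CEBDA")          # illustrative lists, x = B
A, B, C, D = classify("B", L, L_star)
r, r_star = L.index("B") + 1, L_star.index("B") + 1
assert len(A) + len(B) == r - 1                   # |A_i| + |B_i| = r_i - 1
assert len(A) + len(C) == r_star - 1              # |A_i| + |C_i| = r_i^* - 1
```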
Now we consider our amortized cost.
$$ hat(c)_i &= c_i + Phi_i - Phi_(i-1) \
&<= 2r_i - 1 + 2(|A_i| - |B_i| + t_i) & ("def. of" c_i "and" Delta Phi_i) \
&= 2r_i - 1 + 2(|A_i| - (r_i - |A_i| - 1) + t_i) & (|B_i| = r_i - |A_i| - 1) \
&= 4|A_i| + 2t_i + 1 & ("simplify") \
&<= 4(|A_i| + |C_i|) + 4t_i + 1 & (|C_i| >= 0, t_i >= 0) \
&= 4(r_i^* - 1) + 4t_i + 1 & (r_i^* = |A_i| + |C_i| + 1) \
&<= 4(r_i^* + t_i) = 4c_i^* & square $$
We have shown that $hat(c)_i <= 4c_i^*$ for every step. Summing over $i$ and using $Phi_n >= Phi_0 = 0$ gives $C_"MTF" (S) = sum_i c_i <= sum_i hat(c)_i <= 4 dot C_"OPT" (S)$, so MTF is 4-competitive.
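Note that the argument never used the optimality of OPT, only that it is some algorithm maintaining its own list. So we can spot-check the per-step bound $hat(c)_i <= 4c_i^*$ empirically by running MTF against an arbitrary competitor. The sketch below (my own code) uses a competitor that never rearranges, so $t_i = 0$ and $c_i^* = r_i^*$.

```python
import random
from itertools import combinations

def phi(L, L_star):
    """Potential: twice the number of inversions between the two lists."""
    pos = {x: i for i, x in enumerate(L_star)}
    return 2 * sum(1 for x, y in combinations(L, 2) if pos[x] > pos[y])

elements = list("ABCDEFGH")
mtf = list(elements)      # MTF's list
other = list(elements)    # competitor: a static list that never rearranges (t_i = 0)

random.seed(0)
for _ in range(1000):
    x = random.choice(elements)
    phi_before = phi(mtf, other)
    r = mtf.index(x) + 1
    c = 2 * r - 1                     # MTF pays the access plus r - 1 transpositions
    mtf.insert(0, mtf.pop(r - 1))     # move x to the front
    c_star = other.index(x) + 1       # competitor pays only the access
    amortized = c + phi(mtf, other) - phi_before
    assert amortized <= 4 * c_star    # the per-step bound from the proof
print("amortized cost <= 4 * c_i^* held at every step")
```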
## More on Competitive Analysis
In an online setting, you must design the algorithm without knowing the full sequence of events upfront. This is much trickier than the offline setting, where you're given all the information immediately. **Competitive analysis** considers a sequence of requests/inputs and asks that the online algorithm's cost be as close as possible to the optimal offline algorithm's cost.
> [!example] Example - Job Scheduling
>
> We have a sequence of jobs arriving one at a time: $J_1, J_2, dots$. As each one comes in, we must assign it to a machine. The goal is to minimize the **makespan** — the longest total time any machine works.
>
> ![[attachments/job-scheduling-online.svg|500]]
>
> One idea is to greedily assign each job to the machine with the earliest current finish time, i.e., the currently least-loaded machine. We want to show $C_"greedy" <= alpha dot C_"OPT"(R) + "constant"$.
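>
> Here is a minimal sketch of this greedy rule (often called list scheduling), using a heap of machine finish times; the job durations below are made up for illustration:
>
> ```python
> import heapq
>
> def greedy_makespan(jobs, m):
>     """Assign each arriving job to the machine that currently finishes earliest."""
>     loads = [0.0] * m                   # current finish time of each machine
>     heapq.heapify(loads)
>     for duration in jobs:               # jobs arrive one at a time
>         earliest = heapq.heappop(loads)
>         heapq.heappush(loads, earliest + duration)
>     return max(loads)                   # makespan = latest finish time
>
> print(greedy_makespan([2, 3, 7, 1, 4, 6], m=3))   # -> 9 with these durations
> ```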