## Summary


Competitive analysis is concerned with online algorithms, which receive their input one request at a time. Offline algorithms see the entire input in advance, which makes them inherently more powerful. We will be discussing ways to do well when we don't have access to all the information at once.

To define efficiency, we will compare our online algorithm to the best possible offline algorithm and bound how much worse it can be on every input sequence.

> [!definition]
>
> A deterministic online algorithm $A$ is $alpha$-competitive if there is a constant $c$ such that for all request sequences $R$,
> $$ "cost"_A (R) <= alpha dot "cost"_"OPT"(R) + c $$

## Self-Organizing Lists

Given a list $L$ of $n$ distinct elements, we need to support:

  • Access(x) - Find and return $x$. The cost of an access is the rank of $x$, which by definition is the index of $x$ in $L$.

> [!example]
>
> | Index | 1 | 2 | 3 | 4 | 5 |
> | ----- | - | - | - | - | - |
> | $L$   | A | C | B | E | D |
>
> Access(B) has cost 3, since $B$ is at index 3.

Now, if we want to access $D$ a lot in our example, keeping it at the end would be quite inefficient. The key is to find the best way to arrange the list to serve our access pattern.

Let us say that after every access, we can rearrange the list via transpositions, each of which swaps two adjacent elements. We will charge a cost of 1 for each transposition. So after any given Access(x) we perform the access and then do some number of transpositions. We denote the successive arrangements of the list by

$$ L_0, L_1, L_2, dots $$

We will thus rigorously define our cost as

$$ "Cost"_A (R) = sum_(i=1)^k ["Rank"_(L_(i-1))(x_i) + t_i] $$

where $x_i$ is the $i$-th request, $t_i$ is the number of transpositions performed at step $i$, and $k = |R|$.

The worst-case cost is $Omega(n dot |R|)$. This might seem bad, but if the best offline algorithm is also this bad, our algorithm is still competitive.

Our adversary might want to exploit this data structure by querying as follows (reminder that $R$ is the order of their queries):

$$ L = [1, dots, n], quad R = (n, n-1, dots, 1) $$

If we do no transpositions, this costs $Theta(n^2)$. But not even the optimal offline algorithm can do better on this sequence, so our algorithm is good compared to the optimal offline one.
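This cost model is easy to simulate. Below is a minimal sketch in Python; the `serve` helper and its `reorganize` hook are illustrative names of my own, not from the lecture:

```python
def serve(initial, requests, reorganize=None):
    """Play a request sequence on a self-organizing list.

    Each access costs the rank of the element (its 1-based index); the
    optional reorganize(lst, i) hook then mutates the list and returns
    how many adjacent transpositions it used (cost 1 each).
    """
    lst = list(initial)
    total = 0
    for x in requests:
        i = lst.index(x)                 # 0-based position of x
        total += i + 1                   # rank = access cost
        if reorganize is not None:
            total += reorganize(lst, i)  # transposition cost
    return total

# Adversary above: request n, n-1, ..., 1 with no reorganization.
# Ranks are n, n-1, ..., 1, so the total is n(n+1)/2 = Theta(n^2).
cost = serve([1, 2, 3, 4, 5], [5, 4, 3, 2, 1])
print(cost)  # 15
```

The `reorganize` hook is where a policy such as move-to-front would plug in.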

Now let us say we access the same element over and over again:

$$ L = [1, dots, n], quad R = (n, n, dots, n) $$

The optimal approach would be to transpose $n$ to the front. The total cost would be $n$ for the first access, plus $n - 1$ transpositions to move $n$ to the front, plus $n - 1$ queries of cost 1 (taking $|R| = n$). Hence, the optimal solution is $O(n)$. The difficulty is for our online algorithm to "see" this, as it won't know in advance that each query is $n$.
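The arithmetic above, made concrete for a hypothetical $n = 10$ with $|R| = n$ queries (numbers chosen purely for illustration):

```python
n = 10  # list [1, ..., n]; all n queries request the element n

# Never reorganize: every access finds n at rank n.
static_cost = n * n

# Move n to the front after the first access: pay n for that access,
# n - 1 for the transpositions, then 1 for each remaining access.
opt_cost = n + (n - 1) + (n - 1)

print(static_cost, opt_cost)  # 100 28
```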

## Move-to-Front (MTF)

After Access($x_i$), move $x_i$ to the front via adjacent transpositions. Accessing rank $r$ costs $r$, and bringing the element to the front costs another $r - 1$ transpositions, so

$$ "Cost"_"MTF"(R) = sum_(i=1)^k [2 "Rank"_(L_(i-1))(x_i) - 1] $$

where $L_(i-1)$ is the list before the $i$-th access.
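A minimal sketch of MTF with this cost accounting (the `mtf_cost` name is mine):

```python
def mtf_cost(initial, requests):
    """Total MTF cost: accessing the element at rank r costs r, and the
    r - 1 transpositions to bring it to the front cost 1 each, so each
    access contributes 2r - 1."""
    lst = list(initial)
    total = 0
    for x in requests:
        r = lst.index(x) + 1           # rank of x in L_{i-1}
        total += 2 * r - 1
        lst.insert(0, lst.pop(r - 1))  # move x to the front
    return total

# On the repeated-access adversary, MTF pays 2n - 1 once, then 1 per access.
print(mtf_cost([1, 2, 3, 4, 5], [5, 5, 5]))  # 9 + 1 + 1 = 11
```

Note that MTF achieves $O(n + |R|)$ on the repeated-access sequence without knowing the future, which is the behavior the theorem below formalizes.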

> [!theorem]
>
> MTF is 4-competitive.

We will spend the rest of the lecture proving this statement. The key idea is to track how far MTF’s list diverges from OPT’s list at each step, and show the distance stays bounded.

We need some notation. Starred quantities refer to OPT's list $L_i^*$:

$$ r_i = "Rank"_(L_(i-1))(x_i), quad r_i^* = "Rank"_(L_(i-1)^*)(x_i), quad c_i = 2r_i - 1, quad c_i^* = r_i^* + t_i $$

where $t_i$ is the number of transpositions OPT performs at step $i$.

Hence, $"Cost"_"MTF"(R) = sum_i c_i$ and $"Cost"_"OPT"(R) = sum_i c_i^*$. We will solve this with the potential method, using a function $Phi$ such that $Phi_i >= Phi_0$ for all $i$. Recall that $Delta Phi_i = Phi_i - Phi_(i-1)$ and $hat(c)_i = c_i + Delta Phi_i$.

$$ sum_(i=1)^k hat(c)_i = sum_(i=1)^k c_i + sum_(i=1)^k Delta Phi_i = sum_(i=1)^k c_i + Phi_k - Phi_0 >= sum_(i=1)^k c_i $$

We apply $Phi$ to the pair $(L, L^*)$. An inversion between $L_i$ and $L_i^*$ is a pair $(x, y)$ such that $x$ precedes $y$ in $L_i$ but $y$ precedes $x$ in $L_i^*$.

In other words, in our earlier example, $(B, D)$ is an inversion if B is before D in the MTF list but after D in the OPT list. We now define our potential function as

$$ Phi_i = Phi(L_i, L_i^*) = 2 times ("number of inversions between " L_i " and " L_i^*) $$

Intuitively, tracking inversions gives us a pseudo-distance between the two lists: it measures how many adjacent transpositions are needed to turn one into the other. $Phi_0 = 0$ because the two lists start out identical, and $Phi_i >= 0 = Phi_0$ for all $i$, so this is a valid potential function.
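This potential is straightforward to compute directly; a sketch (the `phi` helper is an illustrative name of mine):

```python
from itertools import combinations

def phi(mtf_list, opt_list):
    """Phi = 2 x (number of inversions): pairs ordered one way in
    MTF's list and the other way in OPT's list."""
    pos = {x: i for i, x in enumerate(opt_list)}
    inversions = sum(
        1
        for x, y in combinations(mtf_list, 2)  # x precedes y in mtf_list
        if pos[x] > pos[y]                     # ...but follows y in opt_list
    )
    return 2 * inversions

print(phi([1, 2, 3], [1, 2, 3]))  # 0 -- identical lists
print(phi([1, 2, 3], [3, 2, 1]))  # 6 -- all 3 pairs inverted
```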

  • Access($x_i$) - an access by itself moves nothing in either list, so $Delta Phi = 0$.
  • Transposition of adjacent elements $(x, y)$, in either list - if the pair had the same order in both lists, it becomes an inversion, so $Delta Phi_i = 2$. If it was already an inversion, the transposition removes it, so $Delta Phi_i = -2$.

When MTF moves $x_i$ to the front, it passes over every element $y$ that was before $x_i$ in $L_(i-1)$. We classify each element $y != x_i$ by its position relative to $x_i$ in both lists:

  • $A_i$ - before $x_i$ in $L_(i-1)$ and before $x_i$ in $L_(i-1)^*$. The pair $(y, x_i)$ had the same order in both lists. MTF moves $x_i$ ahead of $y$, creating an inversion. $Delta Phi = +2$ per element.
  • $B_i$ - before $x_i$ in $L_(i-1)$ but after $x_i$ in $L_(i-1)^*$. The pair was already an inversion. MTF moves $x_i$ ahead of $y$, removing it. $Delta Phi = -2$ per element.
  • $C_i$ - after $x_i$ in $L_(i-1)$, before $x_i$ in $L_(i-1)^*$. MTF doesn't change the relative order of $y$ and $x_i$. Unaffected.
  • $D_i$ - after $x_i$ in $L_(i-1)$ and after $x_i$ in $L_(i-1)^*$. Also unaffected.

Notice that $r_i = |A_i| + |B_i| + 1$ and $r_i^* = |A_i| + |C_i| + 1$. Before we do anything, $B_i$ and $C_i$ contain exactly the inversions involving $x_i$.
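As a sanity check, we can compute $A_i, B_i, C_i, D_i$ for a pair of hypothetical lists and verify the rank identities (the `classify` helper and the example lists are mine):

```python
def classify(x, mtf_prev, opt_prev):
    """Split elements y != x into A, B, C, D by whether y precedes x
    in MTF's list L_{i-1} and/or OPT's list L*_{i-1}."""
    before_mtf = set(mtf_prev[:mtf_prev.index(x)])
    before_opt = set(opt_prev[:opt_prev.index(x)])
    others = set(mtf_prev) - {x}
    A = before_mtf & before_opt            # before x in both lists
    B = before_mtf - before_opt            # before in MTF only (inversion)
    C = before_opt - before_mtf            # before in OPT only (inversion)
    D = others - before_mtf - before_opt   # after x in both lists
    return A, B, C, D

mtf, opt = [2, 3, 1, 4], [3, 1, 2, 4]
A, B, C, D = classify(1, mtf, opt)
assert mtf.index(1) + 1 == len(A) + len(B) + 1  # r_i   = |A| + |B| + 1
assert opt.index(1) + 1 == len(A) + len(C) + 1  # r_i^* = |A| + |C| + 1
```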

$$ Delta Phi_(i,"MTF") = 2(|A_i| - |B_i|), quad Delta Phi_(i,"OPT") <= 2t_i, quad Delta Phi_i <= 2(|A_i| - |B_i| + t_i) $$

Now we consider our amortized cost.

$$
hat(c)_i &= c_i + Delta Phi_i \
&<= 2r_i - 1 + 2(|A_i| - |B_i| + t_i) & ("def. of" c_i "and bound on" Delta Phi_i) \
&= 2r_i - 1 + 2(|A_i| - (r_i - |A_i| - 1) + t_i) & (|B_i| = r_i - |A_i| - 1) \
&= 4|A_i| + 2t_i + 1 & ("simplify") \
&<= 4(|A_i| + |C_i|) + 4t_i + 1 & (|C_i| >= 0, t_i >= 0) \
&= 4(r_i^* - 1) + 4t_i + 1 & (r_i^* = |A_i| + |C_i| + 1) \
&<= 4(r_i^* + t_i) = 4c_i^* & square
$$

We have shown $hat(c)_i <= 4c_i^*$. Summing, and using $sum_i c_i <= sum_i hat(c)_i$ from the telescoping argument above, $"Cost"_"MTF"(R) <= 4 dot "Cost"_"OPT"(R)$, so MTF is 4-competitive.

## More on Competitive Analysis
In an online setting, you need to design the algorithm without knowing the full sequence of events upfront. This is much trickier than in offline algorithms, where you're given all the information immediately. **Competitive analysis** considers a sequence of requests/inputs, and you want the online algorithm to compare as well as possible against the optimal offline one.


> [!example] Example - Job Scheduling
> 
> We have a sequence of jobs arriving one at a time: $J_1, J_2, dots$. As each one comes in, we must assign it to a machine. The goal is to minimize the **makespan** — the longest total time any machine works.
>
> ![[attachments/job-scheduling-online.svg|500]]
> 
> One idea is to greedily assign each job to the machine with the earliest current finish time (the least-loaded machine). We want to show $C_"greedy" <= alpha dot C_"OPT"(R) + "constant"$
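This greedy rule is Graham's classical list scheduling; its known guarantee is $alpha = 2 - 1/m$, which we state here as background rather than prove. A minimal sketch (the function name is mine):

```python
import heapq

def greedy_makespan(jobs, m):
    """Assign each arriving job to the machine with the earliest
    finish time (smallest current load); return the makespan."""
    loads = [0] * m                   # min-heap of machine loads
    for j in jobs:
        least = heapq.heappop(loads)  # earliest-finishing machine
        heapq.heappush(loads, least + j)
    return max(loads)

print(greedy_makespan([2, 3, 4, 5], 2))  # 8
```

Each job is placed without any knowledge of the jobs still to come, which is exactly the online constraint described above.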