Chapter 8 problems

ch8-problems
- 8-1
- 8-2
- 8-3
- 8-4
- 8-5
- 8-6
- 8-7

ch8-problems

8-1

a.

There are total $n!$ permutations and therefore $n!$ unique leaves, every permutation is equally likely, and the algorithm is deterministic, thus there are exactly $n!$ leaves labeled by $1/n!$, we labeled the rest by $0$.
b.

Let $d_T(i)$ denote the depth of leaf $i$ in tree $T$, and $i \in T$ denote the leaf $i$ of the tree $T$, we have
\begin{align*} D(T) &=\sum_{i\in T}d_T(i)\\ &=\sum_{i\in LT}d_T(i)+\sum_{i\in RT}d_T(i)\\ &=\sum_{i\in LT}(d_{LT}(i)+1)+\sum_{i\in RT}(d_{RT}(i)+1)\\ &=\sum_{i\in LT}d_{LT}(i)+\sum_{i\in RT}d_{RT}(i)+\sum_{i\in T}1\\ &=D(LT)+D(RT)+k \end{align*}
c.

We have
\begin{align*} d(k) &=min\{D(T)\}\\ &=min\{D(LT)+D(RT)+k\}\\ &=min_{1\leq i\leq k-1}\{D(LT)+D(RT)+k\} &\text{, let $i$ to be the number of leaves in $LT$}\\ &=min_{1\leq i\leq k-1}\{min\{D(LT)\}+min\{D(RT)\}+k\} &\text{, the shapes of the left tree and the right tree are arbitrary}\\ &=min_{1\leq i\leq k-1}\{d(i)+d(k-i)+k\} \end{align*}
d.

Define $f(i) = i\lg i+(k-i)\lg(k-i)$, then we have
\begin{align*} f'(i)&=\frac{\ln i-\ln(k-i)}{\ln 2}\\ f''(i)&=\frac{\frac{1}{i}+\frac{1}{k-i}}{\ln 2} \end{align*}
When $1\leq i\leq k-1$, we have $f''(i)\geq 0$, thus $f(i)$ is a convex function, and we have $f'(1)<0$ and $f'(k-1)>0$, therefore there is a critical point in $1\leq i\leq k-1$, the critical point occurs at $i=k/2$ which $f'(k/2)=0$, i.e., the function $f(i)$ is minimized at $i=k/2$.

Then we use substitution method to prove the assumption $d(k)=\Omega(k\lg k)$, first we guess that $d(k)\geq ck\lg k$ for some constant $c>0$ when $k\geq 1$, then we perform mathematical induction.

Base case: When $k=1$, we have $d(1) \geq 0$.

Inductive step: We have
\begin{align*} d(k) &=min_{1\leq i\leq k-1}\{d(i)+d(k-i)+k\}\\ &\geq min_{1\leq i\leq k-1}\{ci\lg i+c(k-i)\lg(k-i)+k\}\\ &=ck\lg k+(1-c)k\\ &\geq ck\lg k &\text{, when $c\leq 1$} \end{align*}
In conclusion, we have $d(k) = \Omega(k\lg k)$.
e.

There are more than $n!$ leaves in $T_A$, thus we have $D(T_A)\geq d(k)=\Omega(n!\lg(n!))$, and the average case time is $E[D(T_A)/n!]=\Omega(lg(n!))=\Omega(n\lg n)$.
f.

The needed leaves do not reduce from a randomization node to any of its children, therefore if we replace each randomization node with its child which minimizing the external path length, we will get a deterministic comparison sort $A$ whose expected number of comparisons is no more than those made by $B$.

8-2

a.

The counting sort algorithm.

BIT-SORT(A)
    i = 1
    j = A.length
    while j > i
        if A[i] == 0
            i = i + 1
            continue
        if A[j] == 1
            j = j - 1
            continue
        exchange A[i] with A[j]
        i = i + 1
        j = j - 1

c.

The insertion sort algorithm.
d.

The sorting algorithm from part (a) can be used as the intermediate sorting algorithm in RADIX-SORT, because the sorting algorithm must be stable for RADIX-SORT to work, and the algorithm should have linear running time so that RADIX-SORT sorts $n$ records with $b$-bit keys in $O(bn)$ time.

The algorithm is not stable.

COUNTING-SORT-IN-PLACE(A, k)
    let C[0..k] be a new array
    for i = 0 to k
        C[i] = 0
    for j = 1 to A.length
        C[A[j]] = C[A[j]] + 1
    for i = 1 to k
        C[i] = C[i] + C[i - 1]
    j = A.length
    while j > 0
        p = C[A[j]]
        if j > p
            j = j - 1
        else exchange A[j] with A[p]
             C[A[p]] = p - 1

8-3

a.

We group the elements by the number of digits, and then perform radix sort on each group.

Assume there are $k$ groups out of the elements, the $i$th group contains $c_i$ $d_i$-digits elements, then the total running time is
\begin{align*} T(n) &=\Theta(\sum_{i=1}^{k}c_i d_i)\\ &=\Theta(n) \end{align*}
b. MSD radix sort (implementation)

8-4

The brute force algorithm uses $\Theta(n^2)$ comparisons to group the jugs into pairs.

FIND-WATER-JUGS-GROUPS(R, B)
    Let C be a new empty set
    while R is not empty
        i = R.pop()
        for j in B
            if i == j
                C.add((i, j))
                B.remove(j)
    return C

b.

We build the decision tree for the problem, which each permutation of the pairs appears as a reachable leaf, and each comparison generates at most three results.

Let $h$ be the height of the decision tree, and we know there are $n!$ permutations of pairs, thus we have
\begin{align*} h \geq \log_{3}(n!)\\ = \Omega(n\lg n) \end{align*}
The lower bound of the number of comparisons is $\Omega(n\lg n)$.

Let C be a new empty set

FIND-WARTER-JUGS-GROUPS(R, B, C)
    if R is empty
        return
    Let i be a random jug from R
    Let RL be a new empty set
    Let BL be a new empty set
    for j in B
        if i == j
            R.remove(i)
            B.remove(j)
            C.add((i, j))
            for k in R
                if k < j
                    R.remove(k)
                    RL.add(k)
        else if i < j
            B.remove(j)
            BL.add(j)
    FIND-WATER-JUGS-GROUPS(RL, BL, C)
    FIND-WATER-JUGS-GROUPS(R, B, C)

The number of comparisons is $S(n) = S(k) + S(n-k+1) +\Theta(n)$, which $k$ is the number of jugs smaller than the pivot jug, like the quicksort algorithm, the expected value of $S(n)$ is $O(n\lg n)$, the worst-case value of $S(n)$ is $\Theta(n^2)$.

8-5

a.

$1$-sorted means that $(\forall i\in 1,2,...,n-1)[A[i]\leq A[i+1]]$, i.e. normal sorted.
b.

$\{2, 1, 4, 3, 6, 5, 8, 7, 10, 9\}$ is $2$-sorted, but not sorted.
c.

For all $i=1,2,...,n-k$, we have
\begin{align*} \frac{\sum_{j=i}^{i+k-1}A[j]}{k}\leq\frac{\sum_{j=i+1}^{i+k}A[j]}{k} \iff A[i]\leq A[i+k] \end{align*}

The algorithm is as below.

K-SORT(A, k)
    for i = 1 to k
        sort A[i,k+i,2k+i,...,((A.length-1)//k)*k+i] with HEAPSORT

The running time is $O(n\lg(n/k))$.

The algorithm is as below, which MERGE-SORTED-LISTS is an $O(n\lg k)$ running time algorithm to merge $k$ sorted lists into one sorted list. (exercise ch6.5-9)

SORT-K-SORTED-LIST(A, k)
    B = MERGE-SORTED-LISTS([A[i,k+i,2k+i,...,((A.length-1)//k)*k+i]
                           for i = 1 to k], k)
    for i = 1 to A.length
        A[i] = B[i]

f.

We know that to $k$-sort an n-element array, we need to sort $k$ arrays of each $n / k$ elements. The running time is
\begin{align*} T(n) &=k\Omega((n/k)\lg(n/k))\\ &=\Omega(n\lg n) &\text{, $k$ is a constant} \end{align*}

8-6

a.

There are $\binom{2n}{n}$ possible ways.
b.

Build a decision tree of height $h$ for the problem, we have
\begin{align*} h &\geq\lg\binom{2n}{n}\\ &=2n-o(n) &\text{, Stirling's approximation} \end{align*}
c.

If two elements are consecutive in the sorted order, we cannot decide the order of them by comparing them to other elements, and they are from different lists, which means we must compare the two elements.
d.

Let $A, B$ to be the two lists, when
\begin{align*} A[1]\leq B[1]\leq A[2]\leq B[2]\leq\ldots\leq A[i]\leq B[i]\leq\ldots \leq A[n]\leq B[n] \end{align*}
we must compare at least $(\forall i\in[1,n])(A[i],B[i])$ and $(\forall i\in[1,n-1])(B[i],A[i+1])$ to merge the two lists, to generate $2n - 1$ comparisons in total.

8-7

a.

A[q] and A[p] are both wrong placed, thus we have $A[q] > A[p]$, so that $B[p] = 0$ and $B[q] = 1$.
b.

Let $N$ to be the sorting network of algorithm $X$, and we know that the mapping $f: A\to B$ is monotonic, thus we have
\begin{align*} N(B) &=N(f(A))\\ &=f(N(A)) &\text{, not sorted} \end{align*}
Hence the algorithm $X$ fails to sort array $B$ correctly.
c.

The sorting method we are using in the odd steps, do not affect the result after the steps, and the even steps are irrelevant to the values, thus we can treat columnsort as an oblivious compare-exchange algorithm.
d.

After step 1, we have sorted all the columns, which means, each column has at most one $0 \to 1$ transition.

After step 2, we have transposed each column into $r / s$ rows, and at most one of them is dirty, hence we have totally at most $s$ dirty rows.

After step 3, we have sorted all the columns, the array consists the clean rows of $0$s at the top, the clean rows of $1$s at the bottom, and the dirty rows between them, the dirty rows come from the differences between the numbers of $0$s and $1$s of each column, according to the result of step 2, we know the difference is at most $s$, thus the number of the dirty rows is at most $s$.
e.

After step 4, read in column-major order, we have:

The array starts with a clean area of $0$s, which comes from the top clean rows of $0$s after step 3;

The array ends with a clean area of $1$s, which comes from the bottom clean rows of $1$s after step 3;

The array has a dirty area of at most $s^2$ elements in the middle, which comes from the middle at most $s$ dirty rows after step 3.
f.

After step 5, the dirty area of at most $s^2$ elements is placed in the middle in column-major order, and also $r\geq s^2$, thus we have two situations: the dirty area is entirely in one column, or the dirty area spreads from one column to the next column.

After step 6, the dirty area in the first situation is cleaned.

After steps 7-8, the dirty area in the second situation was firstly moved in entirely one column, then cleaned.

In conclusion, steps 5-8 produce a fully sorted $0-1$ output, thus columnsort correctly sorts all $0-1$ inputs, according to the $0-1$ sorting lemma, columnsort correctly sorts all inputs containing arbitrary values.
g.

Suppose $s$ does not divide $r$.

After step 1, we have sorted all the columns, which means, each column has at most one $0 \to 1$ transition.

After step 2, we have at most $s$ dirty rows from the $0 \to 1$ transition, and $s - 1$ dirty rows from the crosses of the adjacent columns, totally at most $2s - 1$ dirty rows.

After step 3, we have sorted all the columns, the array consists the clean rows of $0$s at the top, the clean rows of $1$s at the bottom, and the dirty rows between them, the dirty rows come from the differences between the numbers of $0$s and $1$s of each column, according to the result of step 2, we know the difference is at most $2s - 1$, thus the number of the dirty rows is at most $2s - 1$.

When $s$ does not divide $r$, we need $r\geq 2s(2s-1)$ for columnsort to correctly sort.
h.

We could simply pad the array with some rows filled with zeros, to meet the requirement $r \equiv 0 \pmod{s}$, then perform columnsort, and finally remove the padding zeros to generate the correct result.

The columnsort algorithm could benefit from parallel processing. (implementation)

Chapter 8 problems

Table of Contents

ch8-problems

8-1

8-2

8-3

8-4

8-5

8-6

8-7