Chapter 9 problems
Table of Contents
ch9-problems
9-1
a.
First we use worst-case \(O(n\lg n)\) running time sorting method, such as heapsort or merge sort, to sort the \(n\) numbers, and list the \(i\) largest numbers.
The running time is \(O(n\lg n)\).
b.
First we use BUILD-MAX-HEAP to build a max-priority queue from the \(n\) numbers, which takes \(O(n)\) running time, then we call EXTRACT-MAX \(i\) times, which takes \(O(i\lg n)\) running time.
The running time is \(O(n + i\lg n)\).
c.
We use an order-statistic algorithm to find the \(i\)th largest number, and partition around that number, which takes \(O(n)\) running time, then we use worst-case \(O(n\lg n)\) running time sorting method, such as heapsort or merge sort, to sort the \(i\) largest numbers, which takes \(O(i\lg i)\) running time.
The running time is \(O(n + i\lg i)\).
9-2
a.
Assume \(x_k\) is the median of \(x_1,x_2,...,x_n\), then we have
\begin{align*} \sum_{x_i < x_k}\omega_{i} &=\sum_{i=1}^{\lfloor \frac{n}{2} \rfloor - 1}\frac{1}{n} =\frac{\lfloor \frac{n}{2} \rfloor - 1}{n} <\frac{1}{2}\\ \sum_{x_i > x_k}\omega_{i} &=\sum_{i=\lfloor \frac{n}{2} \rfloor + 1}^{n}\frac{1}{n} =\frac{\lfloor \frac{n}{2} \rfloor + 1}{n} \leq \frac{1}{2} \end{align*}Thus the median is also the weighted median of the \(x_i\) with weights \(\omega_{i} = 1/n\) for \(i=1,2,...,n\).
b.
First we use worst-case \(O(n\lg n)\) running time sorting method, such as heapsort or mergesort, to sort the \(n\) elements by weight, then we traverse the array, accumulate the weights until the summation exceeds \(1/2\), and the current element we arrived is the weighted median. The worst-case running time is \(O(n\lg n)\).
c.
The algorithm is as below
WEIGHTED-MEDIAN(A) return WEIGHTED-MEDIAN-R(A, 0.5) WEIGHTED-MEDIAN-R(A, x) if A.length == 1 return A[1] m = A.length / 2 med = SELECT(A, m) PARTITION(A, med) sum = 0 for i = 1 to m - 1 sum = sum + A[i] if sum < x if sum + med >= x return med return WEIGHTED-MEDIAN-R(A[m + 1..A.length], x - sum - med) return WEIGHTED-MEDIAN-R(A[1..m - 1], x)The worst-case running time is
\begin{align*} T(n) &=T(n/2)+\Theta(n)\\ &=\Theta(n) \end{align*}d.
Let \(p\) to be the weighted median, \(q\) to be an arbitrary point.
If \(q < p\), we have
\begin{align*} |p_i-q|-|p_i-p|&= \begin{cases} q-p &\text{if $p_i < q$}\\ 2p_i-p-q\geq q-p &\text{if $q\leq p_i < p$}\\ p-q &\text{if $p_i\geq p$} \end{cases} \end{align*}then
\begin{align*} \sum_{i=1}^{n}\omega_i d(p_i, q)-\sum_{i=1}^{n}\omega_i d(p_i, p) &=\sum_{i=1}^{n}\omega_i(|p_i-q|-|p_i-p|)\\ &\geq\sum_{p_i < p}\omega_i(q-p)+\sum_{p_i\geq p}\omega_i(p-q)\\ &=(p-q)(\sum_{p_i\geq p}\omega_i-\sum_{p_i < p}\omega_i)\\ &>0 \end{align*}It's similar when \(q > p\), thus the weighted median is the best solution for the 1-dimensional post-office location problem.
e.
The x-coordinate and the y-coordinate are independent, and we have
\begin{align*} \sum_{i=1}^{n}\omega_i d(p,p_i) &=\sum_{i=1}^{n}\omega_i|x_i-x|+\sum_{i=1}^{n}|y_i-y|\\ \end{align*}thus we could find the weighted median of the x-coordinates \(x_m\) and the weighted median of the y-coordinates \(y_m\), then we find the best solution for the 2-dimensional post-office problem \((x_m, y_m)\).
9-3
a.
If \(i\geq n/2\), we simply use SELECT to find the \(i\)th smallest of n elements. The number of comparisons is \(T(n)\).
If \(i < n/2\), we first break the \(n\) elements into \(\lfloor n/2 \rfloor\) pairs, and sort the two elements in each pair, then we compare the pairs base on the smaller element of each pair, to find the \(i\)th smallest pair and partition the pairs on it, then we know the \(i\)th smallest element must be in the \(2i\) elements of the \(i\) smallest pairs, and we use SELECT to find it. The total number of comparisons is \(\lfloor n/2 \rfloor + U_i(\lceil n/2 \rceil) + T(2i)\).
b.
If \(i < n/2\), then
\begin{align*} U_i(n) &=\lfloor n/2\rfloor+U_i(\lceil n/2\rceil)+T(2i)\\ &=\lfloor n/2\rfloor+\lfloor n/4\rfloor+U_i(\lceil n/4\rceil)+T(2i)\\ &=\cdots\\ &=\sum_{k=1}^{\lg(n/i)-1}(\lfloor n/2^k\rfloor+T(2i))+O(T(2i))\\ &=n+O(T(2i)lg(n/i)) \end{align*}c.
If \(i\) is a constant less than \(n/2\), then
\begin{align*} U_i(n) &=n+O(T(2i)lg(n/i))\\ &=n+O(T(2i)\lg{n}-T(2i)\lg{i})\\ &=n+O(\lg n) \end{align*}d.
If \(i = n/k\) for \(k \geq 2\), then
\begin{align*} U_i(n) &=n+O(T(2i)lg(n/i))\\ &=n+O(T(2n/k)\lg k) \end{align*}
9-4
a.
For \(1 \leq i < j \leq n\), we have
\begin{align*} X_{ijk} &=\text{I{$z_i$ is compared with $z_j$ sometime during the execution of the algorithm to find $z_k$}}\\ &=\text{I{$z_i$ or $z_j$ is chosen as the pivot between $z_{i..j}\cup z_{i..k}\cup z_{k..j}$}} \end{align*}The expected value of \(X_{ijk}\) is \(E[X_{ijk}] = \frac{2}{max(j-i+1,k-i+1,k-j+1)}\).
b.
Let \(X_k\) denote the total number of comparisons between elements of array \(A\) when finding \(z_k\), then
\begin{align*} E[X_k] &=\sum_{i=1}^{n-1}\sum_{j=i+1}^{n}\frac{2}{max(j-i+1,k-i+1,k-j+1)}\\ &=\sum_{i=k+1}^{n-1}\sum_{j=i+1}^{n}\frac{2}{j-k+1} +\sum_{i=1}^{k}\sum_{j=k+1}^{n}\frac{2}{j-i+1} +\sum_{i=1}^{k-1}\sum_{j=i+1}^{k}\frac{2}{k-i+1}\\ &\leq 2\Bigg(\sum_{i=1}^{k}\sum_{j=k}^{n}\frac{1}{j-i+1} +\sum_{j=k+1}^{n}\frac{j-k-1}{j-k+1} +\sum_{i=1}^{k-2}\frac{k-i-1}{k-i+1}\Bigg) \end{align*}c.
For the first addend, we have
\begin{align*} \sum_{i=1}^{k}\sum_{j=k}^{n}\frac{1}{j-i+1} &=\sum_{j=k}^{n}\frac{1}{j}+\sum_{j=k}^{n}\frac{1}{j-1}+\cdots +\sum_{j=k}^{n}\frac{1}{j-k+1}\\ &=\sum_{x=1}^{k}1+\sum_{x=k+1}^{n-k}\frac{k}{x} +\sum_{x=n-k+1}^{n}\frac{x}{n-x}\\ &\leq \sum_{x=1}^{n}1\\ &=n \end{align*}Then for the expected value, we have
\begin{align*} E[X_k] &\leq 2\Bigg(\sum_{i=1}^{k}\sum_{j=k}^{n}\frac{1}{j-i+1} +\sum_{j=k+1}^{n}\frac{j-k-1}{j-k+1} +\sum_{i=1}^{k-2}\frac{k-i-1}{k-i+1}\Bigg)\\ &\leq 2\Bigg(n+\sum_{j=k+1}^{n}\frac{j-k-1}{j-k+1} +\sum_{i=1}^{k-2}\frac{k-i-1}{k-i+1}\Bigg)\\ &\leq 2\Bigg(n+\sum_{j=k+1}^{n}1+\sum_{i=1}^{k-2}1\Bigg)\\ &< 4n \end{align*}d.
The total number of comparisons determines the running time of RANDOMIZED-SELECT, thus the expected running time is \(O(n)\).