Interval Sum Counting Algorithm
Problem Set 4
Problem 1
$I = $
These intervals are already sorted by finish time.
- Initially, there are no classrooms allocated, so is placed in .
- Next, is incompatible with , so it is placed in a newly allocated .
- is compatible with and , so it is placed in .
- isn't compatible with or (because of and respectively), so it is placed in a newly allocated
So we have...
As you can see below, the maximum depth is 2, so this algorithm allocated more than max-depth classrooms.
I_1 +-----------------+ +-----+ I_2
I_0 +-----+ +-----------------+ I_3
+-----+-----+-----+-----+-----+
0 1 2 3 4 5
Problem 2
Algorithm
- Let be the (initially empty) set of pairs
- Sort in increasing order to get .
- While there are elements remaining, pick the smallest and largest element (the first and last ), and remove them from .
- Add to
- Once is empty, return .
# O(nlog(n))
def min_max_pairs(A):
A.sort()
l, r = 0, len(A) - 1
res = -infinity
while l < r:
res = max(res, A[l] + A[r])
l += 1
r -= 1
return res
Running Time
Sorting takes . Then, we perform iterations of our while loop, each requiring constant work. This gives us an overall runtime of
Correctness: Greedy Stays Ahead
For notational convenience, let be sorted. Let be my algorithm, and be the optimum, and for the sake of contradiction, suppose chooses pairs differently from , e.g. not of the form as does.
Let be a function that outputs the "choice" of partner for any by , so , and let mean the same for .
Define as...
Given sorted numbers (with being even) , , where and are defined as follows:
Base Case:
: we have , with . There is only one possible pair, so , and holds.
IH: Suppose holds.
IS:
Let be sorted numbers. Remove and to get . Since is still sorted, and has , we know holds.
Therefore, we have $G'{max} \le X'A'$.
Consider an arbitrary pair chosen by on , , with . By our sort order, we have . Now, consider the ways we could swap elements between these two pairs
Case 1:
- , since
- , since
Case 2:
- , since
- , since
In either of the above cases, we are increasing the max of the two sums by swapping elements, and this holds for an arbitrary chosen by , which is at least as good as the optimum. Since we know $X'{max} \ge G'P(k)$ holds.}$, and also that any pairing other than the one picked by () leads to an increased sum, it must be the case that $G_{max} \le X_{max
Problem 3
Lemma (1): For any two trees and with and , if we add the edge between the two trees, we get a new tree .
Proof:
Let be the number of vertices in , and be the number of vertices in . Since they are both trees, we have edges in , and edges in . Connecting and on then results in a graph with vertices, and edges.
Furthermore, both and are connected. Therefore, after adding an the edge between them, this results in a graph that is also connected, since all vertices in can reach all vertices in through this edge, and vice versa.
Finally, since the resulting graph is connected and has edges, it must be a tree.
Lemma (2): For any tree removing an edge results in two trees.
Proof:
Let be an arbitrary tree and be an arbitrary edge in .
Removing the edge from results in two separate connected components, each containing a subset of the vertices from .
The resulting components have no cycles, since was a tree, and is therefore acyclic, and removing an edge cannot create a cycle.
Each component is connected, as every pair of vertices in was connected by a unique path (since is a tree), and removing does not disconnect any other pairs of vertices within each component.
Therefore, the two components formed by cutting on are both trees.
Lemma (3): For any partition of vertices of a tree into two sets and , there must be an edge () between some vertex and .
Proof:
Let be a tree and TAB$.
Let and be arbitrary vertices. Since is a tree, it is connected. Therefore, there must exist a path , so at some point along this path a vertex in must be connected to a vertex in , which is a contradiction.
Main Proof:
Let and be edge disjoint spanning trees over . Consider an arbitrary edge .
Let . Since was a tree, this splits into two connected components , both of which are also trees by (2). We have and .
Since is also a tree, and is therefore connected, by (3), there must exist an edge , such that and . For any two vertices in a tree, there is exactly one path between them, since otherwise there would either be a non-connected vertex (no path), or a cycle (more than one path). We know , since and are edge-disjoint, so letting be the path between and in , choose (where may or may not be ).
Cutting on to get , we have two connected components , , both of which are trees. Now, we can add to , and by (1), we get a tree, since and are both trees, with and .
Similarly, adding to , we also get a tree by (1), since is a tree with , and is a tree with .
Therefore, we have both and are trees.
Problem 5
Call my algorithm .
Correctness
Claim: Given a list of numbers , the number of interval sums in is correctly calculated by , where they are defined as...
i.e. the number of interval sums in the first half of .
i.e. the number of interval sums in the latter half of in .
i.e the number of interval sums that cross over the midpoint of and are in
Proof: By induction
Let be the above claim, i.e. that given numbers and a range , my algorithm returns the number of interval sums on that are between via the following calculation:
Base Case:
Given a single number, there is only one interval sum to consider, i.e. . In this case, the left half of is empty, and the right half consists of a single element. There are no interval sums that cross over the midpoint, so if , then , and otherwise, . This gives us a final answer of . There is only one possible interval to consider, and is set depending on whether it is in , so this is correct.
Given two numbers , we have intervals to check corresponding to , , and . We have that if $I(1, 1) \in 0R = 1I(2, 2) \in 0C = 1I(1, 2) \in 0L, R, C\lbrack l, u \rbrack$.
I.H.: Suppose
I.S.:
Let , and be arbitrary numbers such that . Let , and define , and .
Since , and , we have that and holds. Thus, and , which are the total number of interval sums in and respectively.
Considering the set $U \subset ^2AU$ into two sets, ones that don't cross over the midpoint, , and those that do cross over the midpoint, . Note that , and .
By our inductive hypothesis, we have...
since we considered precisely the intervals in each side of , which are exactly the intervals that don't cross the midpoint, and counted the number of such intervals that have $I(i, j) \in $.
To find $|{ (i, j) : (i, j) \in M \land I(i, j) \in }| = C(i, j)i < m \le jI(m, j)j \ge mI(i, m - 1)i \le m - 1I(m, j) + I(i, m - 1) \in MI(i, j) \in $.
My algorithm does this by first calculating each and sorting them into a list . Then, it iterates through all for , and searches for the lowest and highest index in such that $H_i + I(m, j) \in Ci_{max} - i_{min} + 1iji < m \le jH\lbrack i_{min}, i_{max} \rbrack\lbrack i_{min}, i_{max} \rbrack\lbrack l, u \rbrack$.
Therefore, we have...
And so correctly considers all relevant interval sums, counting the number of interval sums between .
def count_interval_sums_bruteforce(A, I, l, u, low, high):
ans = 0
for i in range(low, high):
for j in range(i, high):
s = I(i, j)
if s <= u and s >= l:
ans += 1
return ans
def count_interval_sums_recursive(A, I, l, u, low, high):
if high - low <= 2:
return count_interval_sums_bruteforce(A, I, l, u, low, high)
mid = low + (high - low)//2
left = count_interval_sums_recursive(A, I, l, u, low, mid)
right = count_interval_sums_recursive(A, I, l, u, mid, high)
total_sum = sum(A[low:high])
left_sum = sum(A[low:mid])
right_sum = total_sum - left_sum
prefix = prefix_sum(A[low:high])
suffix = suffix_sum(A[low:high])
possible_left_halfs = sorted(
[total - prefix[i] - right_sum for i, s in enumerate(suffix[:mid])]
)
cross = 0
for i in range(len(prefix)//2, len(prefix)):
right_half = right_sum - suffix[i]
min_left_half = l - right_half
max_left_half = u - right_half
j = possible_left_halfs.binary_search(min_left_half)
k = possible_left_halfs.binary_search(max_left_half)
cross += k - j + 1
return left + right + cross
def count_interval_sums(A, l, u):
n = len(A)
prefix, suffix = [0] * (n+1), [0] * (n+1)
pre = 0
total = sum(A)
for i in range(n):
prefix[i] = pre
pre += A[i]
suffix[i] = total - pre
def I(i, j):
return total - prefix[i] - suffix[j]
return count_interval_sums_recursive(A, I, l, u, 0, len(A))