1. Overview
In this tutorial, we’ll talk about the problem of finding the median of a matrix with sorted rows.
First, we’ll define the problem and provide an example to explain it. Then, we’ll present two approaches to solve this problem and work through their implementations and time complexity.
2. Defining the Problem
Suppose we have a matrix of sorted rows and columns. We were asked to get the median of that given matrix . (Recall the median is the element that’s located in the middle of a sorted list).
Let’s take a look at the following example for better understanding, given a matrix with sorted rows:
As we can see, the median here is , since if we put all the elements of the matrix in a sorted list , then the element that is located at the middle:
3. Naive Approach
3.1. Main Idea
The main idea of this approach is to put all the elements of the given matrix in a list, then sort it and get the middle element.
Suppose we have a matrix of rows and columns. First, we’ll iterate over it row by row. Then, for each row, we’ll insert all its elements in the predefined list , which will store all the elements of the given matrix.
Second, after having all the elements in the list , we’ll sort this list using any sorting algorithm.
Finally, the median of the given matrix is the element located at the index:
3.2. Algorithm
Let’s take a look at the implementation of the algorithm:
algorithm NaiveApproach(M):
// INPUT
// M = a matrix with R rows and C columns (0-indexed)
// OUTPUT
// The median of matrix M
L <- an empty list
for i <- 0 to R - 1:
for j <- 0 to C - 1:
L.append(M[i, j])
sort(L)
median_pos <- floor( length(L) / 2 )
return L[median_pos]
Initially, we have the function that will return the median of the given matrix . The function will have one parameter: the given matrix .
First, we declare an empty list . Then, we iterate over all the elements of the given matrix , and append them to the list .
Second, we sort the list and get the index of the middle element of the list .
Finally, we return the given matrix median, .
3.3. Complexity
The time complexity of the previous approach is , where is the number of rows of the given matrix , and is the number of columns. The reason behind this complexity is that we’re sorting a list of length , which equals the number of elements in the given matrix.
The space complexity here is . The reason behind this complexity is that we’re storing all the elements of the given matrix in an additional list .
4. Binary Search
4.1. Main Idea
The idea of this approach is that we’ll do a binary search on the value of the median.
Initially, we’ll do a binary search on the median value. The check of this binary search will be to count the number of elements smaller than or equal to the value we’re checking. If the count of these elements is greater than half of the matrix’s elements, then this value could be the median of the given matrix . Otherwise, we want to try more significant values.
In the end, we’ll have the element located in the middle if we sort all the elements of the given matrix.
4.2. Smaller or Equal Values
First of all, let’s take a look at the following algorithm to calculate the number of elements smaller than or equal to a specific value in the given sorted matrix :
algorithm Smaller(M, X):
// INPUT
// M = a matrix with R rows and C columns (0-indexed)
// X = the value to check
// OUTPUT
// count = the number of elements in M that are <= X
count <- 0
for row <- 0 to R - 1:
pos <- C
low <- 0
high <- C - 1
while low <= high:
mid <- (low + high) / 2
if M[row, mid] > X:
pos <- mid
high <- mid - 1
else:
low <- mid + 1
count <- count + pos
return count
The function will have two parameters. The first one is representing the given matrix, while the second one is , which is the value we are checking.
First, we declare , which will store the count of elements smaller than or equal to . Next, we iterate over the rows of the given matrix . Since the rows are sorted, then for each row, we can do a binary search to find the first position that has an element greater than .
Second, we’ll increase the by since it is also the number of elements smaller than or equals in the current row.
Finally, we return , which is the number of elements smaller than or equal to in the given matrix .
4.3. Algorithm
Let’s take a look at the implementation of the algorithm for finding the median element:
algorithm Median(M):
// INPUT
// M = the matrix with R sorted rows and C columns
// OUTPUT
// The median of the matrix M
answer <- infinity
low <- MIN_INT
high <- MAX_INT
while low <= high:
mid <- low + (high - low) / 2
if Smaller(M, mid) >= floor(R * C / 2):
answer <- mid
high <- mid - 1
else:
low <- mid + 1
return answer
Initially, we have the function the same as the previous approach.
First, we declare , which will store the median of the given matrix. Then, we declare the boundary of the binary search, where would be the minimum value possible, and is the maximum.
Second, we’ll get the number of elements smaller than or equal to the . If their count is greater than or equal to the expected position of the median , then the current could be our median. Therefore, we update the value and try to look for a smaller median. Otherwise, the current is not valid, and we try to look for a bigger median.
Finally, we return , which will be the median of the given matrix .
4.4. Complexity
The time complexity of the previous approach is , where is the number of rows of the given matrix , and is the number of columns. The reason behind this complexity is that we’re doing a binary search on the median value, so the logarithm of the length of the maximum range of integer values would be . Then, for each potential median, we’re checking the number of elements less than or equal to it using a binary search algorithm with complexity .
The space complexity here is because we store the value of the median.
5. Conclusion
In this article, we discussed the problem of finding the median of a matrix with sorted rows. We defined the problem and provided an example to explain it. Finally, we provided two approaches to solve it and walked through their implementations and time complexity.