Leetcode #4 Median of Two Sorted Arrays

Posted by blueskyson on February 16, 2021

Description

Given two sorted arrays nums1 and nums2 of size m and n respectively, return the median of the two sorted arrays.

Follow up: The overall run time complexity should be O(log (m+n)).

Example 1:

Input: nums1 = [1,3], nums2 = [2]
Output: 2.00000
Explanation: merged array = [1,2,3] and median is 2.

Example 2:

Input: nums1 = [1,2], nums2 = [3,4]
Output: 2.50000
Explanation: merged array = [1,2,3,4] and median is (2 + 3) / 2 = 2.5.

Example 3:

Input: nums1 = [0,0], nums2 = [0,0]
Output: 0.00000

Example 4:

Input: nums1 = [], nums2 = [1]
Output: 1.00000

Example 5:

Input: nums1 = [2], nums2 = []
Output: 2.00000

Solution 1

排序合併的陣列,再取出中位數,複雜度為 $O(NlogN)$。

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
class Solution {
public:
    double findMedianSortedArrays(vector<int>& nums1, vector<int>& nums2) {
        int length = nums1.size() + nums2.size();
        int *arr = new int[length];
        std::copy(nums1.begin(), nums1.end(), arr);
        std::copy(nums2.begin(), nums2.end(), arr + nums1.size());
        std::sort(arr, arr + length);
        
        int mid = length >> 1;
        double val;
        if (length & 0x1) {
            val = (double)arr[mid];
        } else {
            val = (double)(arr[mid - 1] + arr[mid]) / 2;
        }
        
        delete[] arr;
        return val;
    }
};

Runtime: 50 ms, faster than 60.55% of C++ online submissions for Median of Two Sorted Arrays.

Memory Usage: 89.6 MB, less than 46.04% of C++ online submissions for Median of Two Sorted Arrays.

Solution 2

由小到大搜尋 nums1nums2 的中位數,時間複雜度為 $O(N)$。

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
class Solution {
public:
    double findMedianSortedArrays(vector<int>& nums1, vector<int>& nums2) {
        int length = nums1.size() + nums2.size();        
        int mid = length >> 1;
        int *arr = new int[mid + 1];
        for (int i = 0, x = 0, y = 0; i <= mid; i++) {
            if (x == nums1.size()) {
                arr[i] = nums2[y++];
                continue;
            }
            if (y == nums2.size()) {
                arr[i] = nums1[x++];
                continue;
            }
            
            if (nums1[x] < nums2[y]) {
                arr[i] = nums1[x++];
            } else {
                arr[i] = nums2[y++];
            }
        }
        
        double val;
        if (length & 1) {
            val = (double)arr[mid];
        } else { 
            val = (double)(arr[mid] + arr[mid - 1]) / 2;
        }
        
        delete[] arr;
        return val;
    }
};

Runtime: 41 ms, faster than 74.21% of C++ online submissions for Median of Two Sorted Arrays.

Memory Usage: 89.6 MB, less than 46.04% of C++ online submissions for Median of Two Sorted Arrays.

Solution 3

與 solution 2 的想法類似,但是透過 binary search 來找中位數。先算出中位數所在的位置 middle ,對長度較短的 vector A 設定左邊界 left 和右邊界 right ,然後進行 binary search 。以 tmp = (left + right) / 2 為支點將 A 切為兩半、以 middle - tmp 為支點將 B 切兩半如下範例:

Suppose:
    A:   3   6   6   7   8    
    B:   1   2   4   5   7   8

    total = 11, middle = 11 / 2 = 5

Step 1:
    left = 0, right = A.size() = 5
    tmp = (left + right) / 2 = 2

    Aleft = A[tmp - 1] = A[1]
    Bleft = B[middle - tmp - 1] = B[2]

    A:    3    6    6    7    8
               ^    ^
            Aleft  Aright
    
    B:    1    2    4    5    7    8
                    ^    ^
                 Bleft  Bright

Step 2:
    left = 0, right = 2,
    tmp = (left + right) / 2 = 1

    Aleft = A[tmp - 1] = A[0]
    Bleft = B[middle - tmp - 1] = B[3]

    A:    3    6    6    7    8
          ^    ^
       Aleft  Aright
    
    B:    1    2    4    5    7    8
                         ^    ^
                      Bleft  Bright
Finally:
    median = min(Aright, Bright)

接著我們不斷調整 leftright 使得 Aleft <= BrightBleft <= Aright 便可以得知中位數,複雜度為 $log \ min(m, n)$。

注意 A 必須比 B 短,因為在較短的陣列上,所有位置都是可能的中位數切割位置,但在較長的陣列上,太左或太右的位置不可能進行合法切割。例如,[1] [2 3 4 5 6 7 8],顯然在 23 之間進行切割是不可能的,因為如果以這種方式進行切割,較短的陣列沒有那麼多元素來平衡 [3 4 5 6 7 8] 部分。

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
class Solution {
public:
    double findMedianSortedArrays(vector<int>& nums1, vector<int>& nums2) {
        if(nums1.size() > nums2.size()) {
            return binary_search(nums2, nums1);
        } else {
            return binary_search(nums1, nums2);
        }
    }
    
    // Size of A is always less than that of B
    double binary_search(vector<int>& A, vector<int>& B) {
        int total = A.size() + B.size();
        int middle = total >> 1;
        int left = 0;
        int right = A.size();
        while (left <= right) {
            int tmp = (left + right) >> 1; //partition of A
            int i = tmp - 1;
            int j = middle - tmp - 1;
            
            int Aleft = (i < 0 ? INT_MIN : A[i]);
            int Bleft = (j < 0 ? INT_MIN : B[j]);
            int Aright = (i + 1 >= A.size() ? INT_MAX : A[i + 1]);
            int Bright = (j + 1 >= B.size() ? INT_MAX : B[j + 1]);
            
            if (Aleft <= Bright && Bleft <= Aright) {
                if (total & 0x1) {
                    return min(Aright, Bright);
                } else {
                    return (double)(max(Aleft, Bleft) + min(Aright, Bright)) / 2;
                }
            }
            
            if (Aleft > Bright) {
                right = i + 1;
            } else {
                left = (i <= left ? i + 2 : i + 1);
            }
        }
        return 0;
    }
};

然而實際執行結果卻很差,推測是 branch 過多,也就是 (?:)if 用太多

Runtime: 28 ms, faster than 86.16% of C++ online submissions for Median of Two Sorted Arrays.

Memory Usage: 89 MB, less than 84.97% of C++ online submissions for Median of Two Sorted Arrays.