1. Overview

Finding the largest subarray with a sum of zero is a classic problem that can be tackled efficiently using a HashMap.

In this tutorial, we’ll walk through a step-by-step approach to solving this problem in Java, and we’ll also look at a brute-force approach for comparison.

2. Problem Statement

Given an array of integers, we want to find the length of the largest subarray with a sum of 0.

Input: arr = [4, -3, -6, 5, 1, 6, 8]
Output: 4
Explanation: The subarray from index 0 to index 3, [4, -3, -6, 5], has a sum of 0, and no longer subarray sums to 0.

3. Brute Force Approach

The brute force approach involves checking all possible subarrays to see if their sum is zero and keeping track of the maximum length of such subarrays.

Let’s first look at the implementation and then understand it step by step:

public static int maxLen(int[] arr) {
    int maxLength = 0;
    for (int i = 0; i < arr.length; i++) {
        int sum = 0;
        for (int j = i; j < arr.length; j++) {
            sum += arr[j];
            if (sum == 0) {
                maxLength = Math.max(maxLength, j - i + 1);
            }
        }
    }
    return maxLength;
}

Let’s review this code:

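  • First, we initialize a variable maxLength to 0
  • Then, we use two nested loops to enumerate every subarray: the outer loop fixes the start index i, and the inner loop extends the end index j
  • For each subarray, we maintain its sum by adding arr[j] to a running total instead of recomputing it from scratch
  • If the sum is 0, we update maxLength if the current subarray length, j - i + 1, exceeds it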

Now, let’s discuss the time and space complexity. We use two nested loops, each iterating over the array, leading to a quadratic time complexity. So, the time complexity is O(n^2). Since we used only a few extra variables, the space complexity is O(1).
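To make the behavior concrete, here's a minimal sketch that wraps the same brute-force logic in a class and runs it on the example input plus two edge cases. The class name BruteForceDemo and the extra inputs are our own choices for illustration:

```java
class BruteForceDemo {

    // Same brute-force logic as above: fix a start index i, then extend the end index j
    static int maxLen(int[] arr) {
        int maxLength = 0;
        for (int i = 0; i < arr.length; i++) {
            int sum = 0;
            for (int j = i; j < arr.length; j++) {
                sum += arr[j];
                if (sum == 0) {
                    maxLength = Math.max(maxLength, j - i + 1);
                }
            }
        }
        return maxLength;
    }

    public static void main(String[] args) {
        System.out.println(maxLen(new int[] { 4, -3, -6, 5, 1, 6, 8 })); // 4
        System.out.println(maxLen(new int[] { 1, 2, 3 }));               // 0, no zero-sum subarray
        System.out.println(maxLen(new int[] {}));                        // 0, empty input
    }
}
```

Note that the method returns 0 both for an empty array and for an array without any zero-sum subarray.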

4. Optimized Approach Using HashMap

In this approach, we maintain a cumulative (prefix) sum of the elements as we iterate through the array, and we use a HashMap to store each cumulative sum together with the first index at which it occurs. If the same cumulative sum appears again at a later index, the elements in between add up to nothing, so the subarray that starts right after the earlier index and ends at the current index has a sum of 0. We track the maximum length of such subarrays as we go.
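The idea of repeated cumulative sums can be checked on the example array. The following is a small sketch, not part of the solution itself; the class PrefixSumDemo and its helper methods are hypothetical names we introduce only for illustration:

```java
import java.util.Arrays;

class PrefixSumDemo {

    // prefix[k] holds arr[0] + arr[1] + ... + arr[k]
    static int[] prefixSums(int[] arr) {
        int[] prefix = new int[arr.length];
        int sum = 0;
        for (int k = 0; k < arr.length; k++) {
            sum += arr[k];
            prefix[k] = sum;
        }
        return prefix;
    }

    // Sum of arr[from..to], both ends inclusive, computed directly for comparison
    static int rangeSum(int[] arr, int from, int to) {
        int sum = 0;
        for (int k = from; k <= to; k++) {
            sum += arr[k];
        }
        return sum;
    }

    public static void main(String[] args) {
        int[] arr = { 4, -3, -6, 5, 1, 6, 8 };
        int[] prefix = prefixSums(arr);
        System.out.println(Arrays.toString(prefix)); // [4, 1, -5, 0, 1, 7, 15]

        // prefix[1] and prefix[4] are both 1, so arr[2..4] must sum to zero
        System.out.println(prefix[1] == prefix[4]);  // true
        System.out.println(rangeSum(arr, 2, 4));     // 0
    }
}
```

Here, the cumulative sum 1 occurs at indices 1 and 4, which gives a zero-sum subarray of length 4 - 1 = 3. The cumulative sum 0 at index 3 corresponds to the zero-sum prefix of length 4.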

Let’s first look at the implementation:

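import java.util.HashMap;

public static int maxLenHashMap(int[] arr) {
    // Maps each cumulative sum to the first index at which it occurred
    HashMap<Integer, Integer> map = new HashMap<>();

    int sum = 0;
    int maxLength = 0;

    for (int i = 0; i < arr.length; i++) {
        sum += arr[i];

        // The whole prefix arr[0..i] sums to zero
        if (sum == 0) {
            maxLength = i + 1;
        }

        if (map.containsKey(sum)) {
            // Same sum seen at an earlier index: the elements after it, up to i, sum to zero
            maxLength = Math.max(maxLength, i - map.get(sum));
        } else {
            // Store only the first occurrence so that later matches give the longest span
            map.put(sum, i);
        }
    }
    return maxLength;
}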

Let’s understand this code:

  • First, we initialize a HashMap to store the cumulative sum and its index
  • Then, we initialize two variables to 0: sum for the cumulative sum and maxLength for the length of the longest zero-sum subarray found so far
  • We traverse the array and update the cumulative sum
  • If the cumulative sum is 0, the whole prefix from index 0 to the current index i sums to 0, so we set the maximum length to i + 1
  • If the cumulative sum is already in the HashMap, we calculate the length of the subarray as the current index minus the stored index and update the maximum length if it’s larger than the current maximum
  • If the cumulative sum isn’t in the HashMap, we add it with its index. We never overwrite an existing entry, since the earliest index gives the longest subarray

We’ll now take the example from the problem statement and do a dry run:

[Figure: largest subarray with sum zero using the HashMap approach]
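The dry run can also be followed as text. Here's a sketch that repeats the HashMap logic and prints the state after each iteration. The class name DryRunDemo, the method name maxLenWithTrace, and the output layout are our own assumptions for illustration:

```java
import java.util.HashMap;
import java.util.Map;

class DryRunDemo {

    // Same logic as maxLenHashMap, with one trace line printed per iteration
    static int maxLenWithTrace(int[] arr) {
        Map<Integer, Integer> firstIndexOfSum = new HashMap<>();
        int sum = 0;
        int maxLength = 0;

        for (int i = 0; i < arr.length; i++) {
            sum += arr[i];
            if (sum == 0) {
                maxLength = i + 1;
            }
            if (firstIndexOfSum.containsKey(sum)) {
                maxLength = Math.max(maxLength, i - firstIndexOfSum.get(sum));
            } else {
                firstIndexOfSum.put(sum, i);
            }
            System.out.println("i=" + i + ", arr[i]=" + arr[i] + ", sum=" + sum + ", maxLength=" + maxLength);
        }
        return maxLength;
    }

    public static void main(String[] args) {
        maxLenWithTrace(new int[] { 4, -3, -6, 5, 1, 6, 8 });
        // i=0, arr[i]=4, sum=4, maxLength=0
        // i=1, arr[i]=-3, sum=1, maxLength=0
        // i=2, arr[i]=-6, sum=-5, maxLength=0
        // i=3, arr[i]=5, sum=0, maxLength=4
        // i=4, arr[i]=1, sum=1, maxLength=4
        // i=5, arr[i]=6, sum=7, maxLength=4
        // i=6, arr[i]=8, sum=15, maxLength=4
    }
}
```

At index 3, the cumulative sum reaches 0, so the maximum length becomes 4. At index 4, the sum 1 repeats from index 1, which gives a candidate of length 3 that doesn't beat the current maximum.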

If we look at time and space complexity, we traverse the array once, and each operation with the HashMap (insertion and lookup) is O(1) on average. So, the time complexity is O(n). In the worst case, the HashMap stores all the cumulative sums. So, the space complexity is O(n).
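As a side note, a common variation of this technique seeds the map with the sum 0 at index -1, which removes the separate check for a zero cumulative sum. The sketch below shows that variation; it isn't the implementation discussed above. The names SeededMapDemo and maxLenSeeded are hypothetical, and the use of long for the running sum is our own assumption to guard against int overflow on large values:

```java
import java.util.HashMap;
import java.util.Map;

class SeededMapDemo {

    static int maxLenSeeded(int[] arr) {
        Map<Long, Integer> firstIndexOfSum = new HashMap<>();
        // An empty prefix has sum 0 and conceptually ends at index -1
        firstIndexOfSum.put(0L, -1);

        long sum = 0; // long is an assumption here, to avoid int overflow
        int maxLength = 0;

        for (int i = 0; i < arr.length; i++) {
            sum += arr[i];
            // putIfAbsent returns the existing index, or null if the sum is new
            Integer firstIndex = firstIndexOfSum.putIfAbsent(sum, i);
            if (firstIndex != null) {
                maxLength = Math.max(maxLength, i - firstIndex);
            }
        }
        return maxLength;
    }

    public static void main(String[] args) {
        System.out.println(maxLenSeeded(new int[] { 4, -3, -6, 5, 1, 6, 8 })); // 4
        System.out.println(maxLenSeeded(new int[] { 1, -1, 1, -1 }));          // 4
        System.out.println(maxLenSeeded(new int[] { 0 }));                     // 1
    }
}
```

The time and space complexity stay the same, O(n) each. The seed entry only changes how the zero-sum prefix case is handled.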

5. Comparison

The brute force approach has a time complexity of O(n^2), making it inefficient for large arrays. The optimized approach using a HashMap has a time complexity of O(n), making it much more suitable for large datasets.

The brute force approach uses O(1) space, while the optimized approach uses O(n) space due to the HashMap. The trade-off is between time efficiency and space usage.
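Since both approaches should always return the same result, one way to gain confidence in the optimized version is to compare it against the brute-force version on random inputs. Here's a minimal sketch of such a check. The class CrossCheckDemo, the method names, the seed, and the value range are all assumptions we picked for illustration:

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Random;

class CrossCheckDemo {

    // O(n^2) time, O(1) space
    static int bruteForce(int[] arr) {
        int maxLength = 0;
        for (int i = 0; i < arr.length; i++) {
            int sum = 0;
            for (int j = i; j < arr.length; j++) {
                sum += arr[j];
                if (sum == 0) {
                    maxLength = Math.max(maxLength, j - i + 1);
                }
            }
        }
        return maxLength;
    }

    // O(n) time, O(n) space
    static int withHashMap(int[] arr) {
        Map<Integer, Integer> firstIndexOfSum = new HashMap<>();
        int sum = 0;
        int maxLength = 0;
        for (int i = 0; i < arr.length; i++) {
            sum += arr[i];
            if (sum == 0) {
                maxLength = i + 1;
            }
            if (firstIndexOfSum.containsKey(sum)) {
                maxLength = Math.max(maxLength, i - firstIndexOfSum.get(sum));
            } else {
                firstIndexOfSum.put(sum, i);
            }
        }
        return maxLength;
    }

    // Returns true if both methods agree on every generated array
    static boolean agreeOnRandomInputs(int trials, long seed) {
        Random random = new Random(seed);
        for (int t = 0; t < trials; t++) {
            int[] arr = new int[random.nextInt(50)]; // length in [0, 49]
            for (int k = 0; k < arr.length; k++) {
                arr[k] = random.nextInt(11) - 5;     // values in [-5, 5]
            }
            if (bruteForce(arr) != withHashMap(arr)) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        System.out.println(agreeOnRandomInputs(1000, 42L)); // true
    }
}
```

The small value range makes zero-sum subarrays frequent, so the check exercises the interesting cases rather than mostly returning 0.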

6. Conclusion

In this article, we saw that using a HashMap to track cumulative sums allows us to find the largest subarray with a sum of zero efficiently. This approach ensures that we can solve the problem in linear time, making it scalable for large arrays.

The brute force method, while conceptually simpler, isn’t feasible for large input sizes due to its quadratic time complexity.

As always, the source code of all these examples is available over on GitHub.