1. Problem Statement
Mental Model
Breaking down a complex problem into its most efficient algorithmic primitive.
You are given an n x n binary matrix grid. You are allowed to change at most one 0 to be 1.
Return the size of the largest island in grid after applying this operation.
An island is a 4-directionally connected group of 1s.
2. Approach: Connected Components (DFS)
A brute-force approach would change each 0 to 1 and run a full DFS/BFS ($O(N^4)$). We must optimize this.
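For contrast, here is a minimal sketch of that brute force (class and helper names are this sketch's own, not part of the solution below): flip each `0`, re-measure every island from scratch, then unflip. O(N^2) candidate flips, each costing an O(N^2) full scan, gives O(N^4).

```java
// Hypothetical brute-force sketch: flip each 0, re-measure all islands, unflip.
class BruteForce {
    public static int largestIsland(int[][] grid) {
        int n = grid.length, best = 0;
        boolean hasZero = false;
        for (int i = 0; i < n; i++)
            for (int j = 0; j < n; j++)
                if (grid[i][j] == 0) {
                    hasZero = true;
                    grid[i][j] = 1;                          // flip
                    best = Math.max(best, largest(grid));    // full O(N^2) re-scan
                    grid[i][j] = 0;                          // unflip
                }
        return hasZero ? best : n * n;                       // all 1s: one big island
    }

    // Size of the largest island in the current grid, via fresh DFS over every cell.
    private static int largest(int[][] grid) {
        int n = grid.length, best = 0;
        boolean[][] seen = new boolean[n][n];
        for (int i = 0; i < n; i++)
            for (int j = 0; j < n; j++)
                if (grid[i][j] == 1 && !seen[i][j])
                    best = Math.max(best, area(grid, seen, i, j));
        return best;
    }

    private static int area(int[][] g, boolean[][] seen, int r, int c) {
        if (r < 0 || r >= g.length || c < 0 || c >= g.length || seen[r][c] || g[r][c] == 0) return 0;
        seen[r][c] = true;
        return 1 + area(g, seen, r + 1, c) + area(g, seen, r - 1, c)
                 + area(g, seen, r, c + 1) + area(g, seen, r, c - 1);
    }
}
```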
The "Aha!" Moment
Instead of flipping 0s and checking sizes, we should first measure the existing islands and give them "IDs".
- First Pass (Coloring): Traverse the grid. When a `1` is found, run a DFS to find the area of the island. Give every cell in this island a unique `id` (e.g., 2, 3, 4...) and store `Map<id, area>`.
- Second Pass (Flipping): Traverse the grid looking for `0`s.
  - If you flip a `0` to `1`, it connects all adjacent islands.
  - Check the 4 neighbors. Collect their unique `id`s in a `HashSet` (to avoid double-counting the same island).
  - The new area is `1 + sum(Map.get(id) for each unique id)`.
  - Update `maxArea`.
3. Java Implementation
```java
public int largestIsland(int[][] grid) {
    int n = grid.length;
    Map<Integer, Integer> areaMap = new HashMap<>();
    int islandId = 2; // Start IDs at 2 because 0 and 1 are already used
    int maxArea = 0;

    // 1. Paint islands and store areas
    for (int i = 0; i < n; i++) {
        for (int j = 0; j < n; j++) {
            if (grid[i][j] == 1) {
                int area = dfs(grid, i, j, islandId);
                areaMap.put(islandId, area);
                maxArea = Math.max(maxArea, area); // In case there are no 0s
                islandId++;
            }
        }
    }

    // 2. Try flipping every 0
    int[][] dirs = {{0,1}, {1,0}, {0,-1}, {-1,0}};
    for (int i = 0; i < n; i++) {
        for (int j = 0; j < n; j++) {
            if (grid[i][j] == 0) {
                Set<Integer> neighborIds = new HashSet<>();
                for (int[] d : dirs) {
                    int ni = i + d[0], nj = j + d[1];
                    if (ni >= 0 && ni < n && nj >= 0 && nj < n && grid[ni][nj] >= 2) {
                        neighborIds.add(grid[ni][nj]);
                    }
                }
                int potentialArea = 1; // The flipped 0
                for (int id : neighborIds) {
                    potentialArea += areaMap.get(id);
                }
                maxArea = Math.max(maxArea, potentialArea);
            }
        }
    }
    return maxArea == 0 ? 1 : maxArea; // Handle all 0s case
}

private int dfs(int[][] grid, int r, int c, int id) {
    if (r < 0 || r >= grid.length || c < 0 || c >= grid[0].length || grid[r][c] != 1) return 0;
    grid[r][c] = id;
    return 1 + dfs(grid, r+1, c, id) + dfs(grid, r-1, c, id) +
           dfs(grid, r, c+1, id) + dfs(grid, r, c-1, id);
}
```
4. 5-Minute "Video-Style" Walkthrough
- The Component Identification: By changing the `1`s to an ID like `2` or `3`, the grid itself becomes a map of components. We don't need an extra 2D array.
- The HashSet: Why use a `Set` for neighbor IDs? Because a `0` might be surrounded by `1`s that all belong to the same island. If we don't deduplicate the IDs, we would add the area of the same island multiple times!
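A concrete illustration of that double-counting trap on a toy 2x2 grid (the `paint` helper here mirrors the article's `dfs`; the class and method names are this sketch's own):

```java
// The 0 at (1,1) of {{1,1},{1,0}} touches the SAME island (id 2) on two sides.
class DedupDemo {
    // Returns {naiveSum, dedupedSum} for flipping the 0 at (1,1).
    static int[] compare() {
        int[][] g = {{1, 1}, {1, 0}};
        int area = paint(g, 0, 0, 2);         // one island of area 3, painted with id 2
        // Naive: add the area once per neighboring cell, counting island 2 twice.
        int naive = 1 + area + area;          // up neighbor (0,1) and left neighbor (1,0)
        // Correct: collect neighbor IDs in a Set first, then sum once per unique id.
        java.util.Set<Integer> ids = new java.util.HashSet<>();
        ids.add(g[0][1]);
        ids.add(g[1][0]);                     // same id (2), so the set keeps one copy
        int correct = 1;
        for (int id : ids) correct += area;   // areaMap.get(id) in the real solution
        return new int[]{naive, correct};
    }

    static int paint(int[][] g, int r, int c, int id) {
        if (r < 0 || r >= g.length || c < 0 || c >= g.length || g[r][c] != 1) return 0;
        g[r][c] = id;
        return 1 + paint(g, r + 1, c, id) + paint(g, r - 1, c, id)
                 + paint(g, r, c + 1, id) + paint(g, r, c - 1, id);
    }

    public static void main(String[] args) {
        int[] result = compare();
        System.out.println(result[0] + " vs " + result[1]); // prints "7 vs 4"
    }
}
```

The naive sum claims 7 cells in a 4-cell grid; the deduplicated sum gives the correct 4.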
5. Interview Discussion
- Interviewer: "What is the time complexity?"
- You: "O(N^2). The first pass visits each cell at most a constant number of times (DFS). The second pass visits each cell and checks 4 neighbors. It is strictly linear relative to the grid size."
- Interviewer: "Can this be solved with Union Find (Disjoint Set)?"
- You: "Yes. We can treat all `1`s as nodes and union adjacent `1`s to form components, tracking sizes. The logic for the second pass remains exactly the same."
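A sketch of that Union-Find variant, assuming union by size with path compression (helper names like `find` and `union` are illustrative): cell `(r, c)` becomes node `r * n + c`, and the second pass sums component sizes by root instead of by painted ID.

```java
import java.util.HashSet;
import java.util.Set;

class UnionFindSolution {
    int[] parent, size;

    int find(int x) { return parent[x] == x ? x : (parent[x] = find(parent[x])); }

    void union(int a, int b) {
        int ra = find(a), rb = find(b);
        if (ra == rb) return;
        if (size[ra] < size[rb]) { int t = ra; ra = rb; rb = t; }  // union by size
        parent[rb] = ra;
        size[ra] += size[rb];
    }

    public int largestIsland(int[][] grid) {
        int n = grid.length;
        parent = new int[n * n];
        size = new int[n * n];
        for (int i = 0; i < n * n; i++) { parent[i] = i; size[i] = 1; }
        // Union adjacent 1s; right/down is enough to cover every edge once.
        int[][] half = {{0,1}, {1,0}};
        for (int i = 0; i < n; i++)
            for (int j = 0; j < n; j++)
                if (grid[i][j] == 1)
                    for (int[] d : half) {
                        int ni = i + d[0], nj = j + d[1];
                        if (ni < n && nj < n && grid[ni][nj] == 1) union(i * n + j, ni * n + nj);
                    }
        int best = 0;
        for (int i = 0; i < n; i++)
            for (int j = 0; j < n; j++)
                if (grid[i][j] == 1) best = Math.max(best, size[find(i * n + j)]);
        // Second pass: identical logic, with component roots playing the role of IDs.
        int[][] dirs = {{0,1}, {1,0}, {0,-1}, {-1,0}};
        for (int i = 0; i < n; i++)
            for (int j = 0; j < n; j++)
                if (grid[i][j] == 0) {
                    Set<Integer> roots = new HashSet<>();   // dedup, exactly as before
                    for (int[] d : dirs) {
                        int ni = i + d[0], nj = j + d[1];
                        if (ni >= 0 && ni < n && nj >= 0 && nj < n && grid[ni][nj] == 1)
                            roots.add(find(ni * n + nj));
                    }
                    int area = 1;
                    for (int r : roots) area += size[r];
                    best = Math.max(best, area);
                }
        return best;
    }
}
```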
6. Verbal Interview Script (Staff Tier)
Interviewer: "Walk me through your optimization strategy for this problem."
You: "My primary objective is to avoid the naive brute force, which flips every `0` and re-measures the whole grid for each flip: O(N^2) candidate flips times an O(N^2) scan is O(N^4). The key observation is that the islands themselves never change; only the flipped cell does. So I label each island exactly once in a single DFS pass and cache its area in a HashMap keyed by island ID. After that, evaluating any flip is constant work: look at the four neighbors, deduplicate their IDs, and sum the cached areas. That collapses the total cost to O(N^2). And because the labels are written directly into the grid, there is no extra 2D array, which keeps memory pressure low in a garbage-collected Java environment."
7. Staff-Level Interview Follow-Ups
Once you provide the optimized solution, a senior interviewer at Google or Meta will likely push you further. Here is how to handle the most common follow-ups:
Follow-up 1: "How does this scale to a Distributed System?"
If the grid is too large to fit on a single machine (e.g., billions of cells), we would move from a single-node algorithm to a MapReduce or Spark-based approach. We would shard the grid into tiles, run the labeling pass locally within each tile, and then merge components that touch across tile boundaries, e.g., with a union-find over the border cells, before aggregating the final sizes. This is the same local-aggregate-then-global-merge pattern used in External Merge Sort.
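As a toy, single-process sketch of that local-aggregate-then-merge idea, consider a 1-D strip of cells split into fixed-width shards (all names here are illustrative; a real Spark job would exchange only border labels between workers):

```java
// Each "shard" unions only the edges strictly inside its own range (the local
// phase); a merge phase then unions the edges that straddle shard boundaries.
class ShardMergeDemo {
    static int[] parent, size;

    static int find(int x) { return parent[x] == x ? x : (parent[x] = find(parent[x])); }

    static void union(int a, int b) {
        int ra = find(a), rb = find(b);
        if (ra == rb) return;
        parent[rb] = ra;
        size[ra] += size[rb];
    }

    static int largestComponent(int[] strip, int shardWidth) {
        int n = strip.length;
        parent = new int[n];
        size = new int[n];
        for (int i = 0; i < n; i++) { parent[i] = i; size[i] = 1; }
        // Local phase: each shard handles its internal edges independently.
        for (int start = 0; start < n; start += shardWidth)
            for (int i = start; i < Math.min(start + shardWidth, n) - 1; i++)
                if (strip[i] == 1 && strip[i + 1] == 1) union(i, i + 1);
        // Merge phase: only the edges crossing a shard boundary remain.
        for (int b = shardWidth; b < n; b += shardWidth)
            if (strip[b - 1] == 1 && strip[b] == 1) union(b - 1, b);
        int best = 0;
        for (int i = 0; i < n; i++)
            if (strip[i] == 1) best = Math.max(best, size[find(i)]);
        return best;
    }
}
```

The point of the sketch: the merge phase touches only boundary cells, which is exactly why the distributed version ships so little data between workers.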
Follow-up 2: "What are the Concurrency implications?"
In a multi-threaded Java environment, we must ensure that any shared state (e.g., the area map or a running maximum) is thread-safe. While we could use `synchronized` blocks, a higher-performance approach is atomic primitives (`AtomicInteger`, `LongAdder`) or a `ConcurrentHashMap`. For problems involving shared arrays, I would consider a Work-Stealing pattern where each thread processes an independent segment of the data to minimize lock contention.
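To make the `ConcurrentHashMap` point concrete, a minimal sketch (the frequency map here is a generic stand-in, not part of the island solution): `merge` is atomic per key, so parallel increments lose no updates even without `synchronized`.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;
import java.util.stream.IntStream;

class ConcurrencyDemo {
    // 10_000 parallel increments spread over 4 keys; merge() is atomic per key.
    static ConcurrentMap<Integer, Integer> buildFreq() {
        ConcurrentMap<Integer, Integer> freq = new ConcurrentHashMap<>();
        IntStream.range(0, 10_000).parallel()
                 .forEach(i -> freq.merge(i % 4, 1, Integer::sum));
        return freq;
    }

    public static void main(String[] args) {
        System.out.println(buildFreq().get(0)); // 2500 on every run, no lost updates
    }
}
```

With a plain `HashMap` and `put(get() + 1)` the same loop would race and undercount.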
8. Performance Nuances (The Java Perspective)
- Autoboxing Overhead: When using `HashMap<Integer, Integer>`, Java performs autoboxing, which creates thousands of `Integer` objects on the heap. In a performance-critical system, I would use a primitive-specialized library like fastutil or Trove (e.g., `Int2IntMap`), significantly reducing GC pauses.
- Recursion Depth: Recursive solutions are elegant but risky for deep inputs. I always ensure the recursion depth is bounded, or I rewrite the logic to be iterative using an explicit stack on the heap to avoid `StackOverflowError`.
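Here is a sketch of that iterative rewrite of the paint step, using an explicit `ArrayDeque` on the heap (the class and method names are this sketch's own). Cells are marked on push rather than on pop, so no cell is ever stacked twice:

```java
import java.util.ArrayDeque;
import java.util.Deque;

class IterativePaint {
    // Paints the island containing (sr, sc) with `id` and returns its area.
    // Assumes grid[sr][sc] == 1 on entry, matching how the recursive dfs is called.
    static int paint(int[][] grid, int sr, int sc, int id) {
        int n = grid.length, area = 0;
        Deque<int[]> stack = new ArrayDeque<>();
        stack.push(new int[]{sr, sc});
        grid[sr][sc] = id;                       // mark on push to avoid duplicates
        area++;
        int[][] dirs = {{0,1}, {1,0}, {0,-1}, {-1,0}};
        while (!stack.isEmpty()) {
            int[] cell = stack.pop();
            for (int[] d : dirs) {
                int r = cell[0] + d[0], c = cell[1] + d[1];
                if (r >= 0 && r < n && c >= 0 && c < n && grid[r][c] == 1) {
                    grid[r][c] = id;
                    area++;
                    stack.push(new int[]{r, c});
                }
            }
        }
        return area;
    }
}
```

The stack depth is now bounded by heap size rather than thread stack size, so a snake-shaped island spanning the whole grid no longer risks `StackOverflowError`.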
Key Takeaways
- If you flip a `0` to `1`, it connects all adjacent islands.
- Check the 4 neighbors and collect their unique `id`s in a `HashSet` (to avoid double-counting the same island).
- The new area is `1 + sum(Map.get(id) for each unique id)`.