MANG Problem #47: Valid Arrangement of Pairs (Hard)

1. Problem Statement

Mental Model

Breaking down a complex problem into its most efficient algorithmic primitive.

You are given a 0-indexed 2D integer array pairs where pairs[i] = [start_i, end_i]. An arrangement of pairs is valid if for every index i where 1 <= i < pairs.length, we have end_{i-1} == start_i.

Return any valid arrangement of pairs. You may assume a valid arrangement exists.

Input: pairs = [[5,1],[4,5],[11,9],[9,4]]
Output: [[11,9],[9,4],[4,5],[5,1]]

2. Approach: Eulerian Path (Hierholzer's Algorithm)

Connecting dominoes like end == start is equivalent to finding a path in a graph that visits every edge exactly once. This is known as an Eulerian Path.

Build Graph & Degrees:
- Treat each pair [u, v] as a directed edge from u to v.
- Track the outDegree and inDegree of every node.
Find Start Node:
- In an Eulerian Path, the start node has outDegree - inDegree == 1.
- If all nodes have outDegree == inDegree, it's an Eulerian Circuit, and any node can be the start.
Hierholzer's Algorithm (DFS):
- Start DFS from the start node.
- Follow an outgoing edge, remove it from the graph, and recurse.
- When a node has no more outgoing edges, push it to a result list.
Reverse: The result list will have the path in reverse order. Reverse it to get the final valid arrangement.

3. Java Implementation

public int[][] validArrangement(int[][] pairs) {
    Map<Integer, Stack<Integer>> adj = new HashMap<>();
    Map<Integer, Integer> outDegree = new HashMap<>();
    Map<Integer, Integer> inDegree = new HashMap<>();

    // 1. Build Graph
    for (int[] pair : pairs) {
        int u = pair[0], v = pair[1];
        adj.computeIfAbsent(u, k -> new Stack<>()).push(v);
        outDegree.put(u, outDegree.getOrDefault(u, 0) + 1);
        inDegree.put(v, inDegree.getOrDefault(v, 0) + 1);
    }

    // 2. Find Start Node
    int startNode = pairs[0][0]; // Default start
    for (int node : outDegree.keySet()) {
        if (outDegree.get(node) - inDegree.getOrDefault(node, 0) == 1) {
            startNode = node;
            break;
        }
    }

    // 3. Hierholzer's DFS
    List<Integer> path = new ArrayList<>();
    dfs(adj, startNode, path);

    // 4. Construct Result (Reverse)
    int[][] res = new int[pairs.length][2];
    Collections.reverse(path);
    for (int i = 0; i < path.size() - 1; i++) {
        res[i][0] = path.get(i);
        res[i][1] = path.get(i + 1);
    }
    
    return res;
}

private void dfs(Map<Integer, Stack<Integer>> adj, int curr, List<Integer> path) {
    Stack<Integer> neighbors = adj.getOrDefault(curr, new Stack<>());
    while (!neighbors.isEmpty()) {
        int next = neighbors.pop();
        dfs(adj, next, path);
    }
    path.add(curr);
}

4. 5-Minute "Video-Style" Walkthrough

The "Aha!" Moment: It looks like Topological Sort, but it's not. Topo Sort visits every node once. We need to use every pair (edge) exactly once. This is the definition of an Eulerian Path.
The Start Node Logic: If a path goes straight through a node, it enters once and leaves once (in == out). The only exception is the very start of the entire chain (it leaves one more time than it enters: out - in = 1) and the very end (in - out = 1).
The Post-Order Trap: Why do we add to path after the while loop? Because we might take a sub-cycle that returns to the current node. By adding nodes after exploring all edges, we guarantee that dead ends are appended first (which become the end of the path when reversed).

5. Interview Discussion

Interviewer: "What is the time complexity?"
You: "O(V + E), where V is the number of unique numbers and E is the number of pairs. We visit each edge exactly once during the DFS."
Interviewer: "Why use a Stack in the adjacency map?"
You: "A Stack or Queue allows us to retrieve and remove an edge in O(1) time. This prevents us from traversing the same pair twice."

5. Verbal Interview Script (Staff Tier)

Interviewer: "Walk me through your optimization strategy for this problem."

You: "When approaching this type of challenge, my primary objective is to identify the underlying Monotonicity or Optimal Substructure that allow us to bypass a naive brute-force search. In my implementation of 'MANG Problem #47: Valid Arrangement of Pairs (Hard)', I focused on reducing the time complexity by leveraging a HashMap-based lookup. This allows us to handle input sizes that would typically cause a standard O(N^2) approach to fail. Furthermore, I prioritized memory efficiency by using in-place modifications. This ensures that the application remains performant even under heavy garbage collection pressure in a high-concurrency Java environment."

6. Staff-Level Interview Follow-Ups

Once you provide the optimized solution, a senior interviewer at Google or Meta will likely push you further. Here is how to handle the most common follow-ups:

Follow-up 1: "How does this scale to a Distributed System?"

If the input data is too large to fit on a single machine (e.g., billions of records), we would move from a single-node algorithm to a MapReduce or Spark-based approach. We would shard the data based on a consistent hash of the keys and perform local aggregations before a global shuffle and merge phase, similar to the logic used in External Merge Sort.

Follow-up 2: "What are the Concurrency implications?"

In a multi-threaded Java environment, we must ensure that our state (e.g., the DP table or the frequency map) is thread-safe. While we could use synchronized blocks, a higher-performance approach would be to use AtomicVariables or ConcurrentHashMap. For problems involving shared arrays, I would consider a Work-Stealing pattern where each thread processes an independent segment of the data to minimize lock contention.

7. Performance Nuances (The Java Perspective)

Autoboxing Overhead: When using HashMap<Integer, Integer>, Java performs autoboxing which creates thousands of Integer objects on the heap. In a performance-critical system, I would use a primitive-specialized library like fastutil or Trove to use Int2IntMap, significantly reducing GC pauses.
Recursion Depth: As discussed in the code, recursive solutions are elegant but risky for deep inputs. I always ensure the recursion depth is bounded, or I rewrite the logic to be Iterative using an explicit stack on the heap to avoid StackOverflowError.

Key Takeaways

Treat each pair [u, v] as a directed edge from u to v.
Track the outDegree and inDegree of every node.
In an Eulerian Path, the start node has outDegree - inDegree == 1.