1. Problem Statement
Mental Model
Reduce a compound query (prefix AND suffix) to a single primitive the data structure already handles well: one ordinary Trie prefix lookup.
Design a special dictionary that searches the words in it by a prefix and a suffix.
Implement the WordFilter class:
- `WordFilter(String[] words)` Initializes the object with the `words` in the dictionary.
- `int f(String pref, String suff)` Returns the index of the word in the dictionary which has the prefix `pref` and the suffix `suff`. If there is more than one valid index, return the largest of them. If there is no such word in the dictionary, return `-1`.
2. Approach: The "Wrapped Word" Trie
Searching by prefix is easy with a Trie. Searching by suffix requires a reverse Trie. But searching by BOTH simultaneously is hard.
The "Aha!" Moment
We can combine the suffix and prefix into a single string and insert it into a standard Trie!
For a word like "apple" with index 0, we insert the following variations into the Trie:
`"apple{apple"`, `"pple{apple"`, `"ple{apple"`, `"le{apple"`, `"e{apple"`, `"{apple"`
If the user queries pref = "ap" and suff = "le", we simply search our Trie for the combined string: "le{ap".
- State: Every `TrieNode` stores a `weight` (the index of the word). Since we process words in ascending index order, overwriting `weight` automatically guarantees we store the largest index.
- Separator: We use `{` because its ASCII value (123) is immediately after `z` (122), making the Trie children array size 27.
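The key-construction rule can be sketched in isolation. This is an illustrative helper, not part of the final class; the names `WrappedKeys`, `keysFor`, and `queryKey` are made up for the sketch:

```java
import java.util.ArrayList;
import java.util.List;

// Illustrative helpers showing which keys are inserted per word
// and which key is searched per query. '{' is (char)('z' + 1).
class WrappedKeys {

    // All "suffix{word" keys inserted for one word, including the
    // empty-suffix key "{word".
    static List<String> keysFor(String word) {
        List<String> keys = new ArrayList<>();
        String wrapped = word + "{" + word;
        for (int i = 0; i <= word.length(); i++) {
            keys.add(wrapped.substring(i));
        }
        return keys;
    }

    // The key searched in the Trie for a (pref, suff) query.
    static String queryKey(String pref, String suff) {
        return suff + "{" + pref;
    }
}
```

Note that `queryKey("ap", "le")` yields `"le{ap"`, which is a string prefix of the stored key `"le{apple"`; that is exactly why the Trie walk for the query succeeds.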
3. Java Implementation
class WordFilter {

    class TrieNode {
        TrieNode[] children = new TrieNode[27];
        int weight = -1;
    }

    TrieNode root;

    public WordFilter(String[] words) {
        root = new TrieNode();
        for (int weight = 0; weight < words.length; weight++) {
            // Wrap the word: "apple" becomes "apple{apple"
            String word = words[weight] + "{" + words[weight];
            for (int i = 0; i <= words[weight].length(); i++) {
                TrieNode curr = root;
                curr.weight = weight;
                // Insert the suffix{word substring into the Trie
                for (int j = i; j < word.length(); j++) {
                    int k = word.charAt(j) - 'a';
                    if (curr.children[k] == null) {
                        curr.children[k] = new TrieNode();
                    }
                    curr = curr.children[k];
                    curr.weight = weight;
                }
            }
        }
    }

    public int f(String pref, String suff) {
        TrieNode curr = root;
        String query = suff + "{" + pref;
        for (char c : query.toCharArray()) {
            if (curr.children[c - 'a'] == null) {
                return -1;
            }
            curr = curr.children[c - 'a'];
        }
        return curr.weight;
    }
}
4. 5-Minute "Video-Style" Walkthrough
- The Space-Time Tradeoff: This approach generates $L + 1$ wrapped keys for each word of length $L$, each up to $2L + 1$ characters long, so insertion costs $O(L^2)$ per word. We are trading memory (space) for blazing-fast $O(P + S)$ lookup times, where $P$ and $S$ are the prefix and suffix lengths.
- The Overwrite Logic: The problem asks for the largest index if there are duplicates. Because our loop runs `weight = 0` to `words.length` in ascending order, we blindly update `curr.weight = weight` at every node. The last word to pass through a node leaves its highest index there permanently.
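The overwrite rule can be seen in miniature with a plain `HashMap` standing in for the Trie nodes. This is an illustrative sketch (class name `OverwriteDemo` is made up), not part of the solution:

```java
import java.util.HashMap;
import java.util.Map;

// Inserting in ascending index order and blindly overwriting the stored
// value leaves the largest index at every key -- no max() call needed.
class OverwriteDemo {
    static Map<String, Integer> build(String[] words) {
        Map<String, Integer> lastIndex = new HashMap<>();
        for (int idx = 0; idx < words.length; idx++) {
            lastIndex.put(words[idx], idx); // later (larger) index wins
        }
        return lastIndex;
    }
}
```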
5. Interview Discussion
- Interviewer: "What is the time complexity?"
- You: "Initialization takes $O(N \times L^2)$ where $N$ is the number of words and $L$ is the max word length. The search `f()` takes strictly $O(P + S)$ where $P$ is prefix length and $S$ is suffix length."
- Interviewer: "What if memory is heavily constrained?"
- You: "We could maintain two separate Tries (one forward, one backward). During `f()`, we get a list of valid indices from the prefix Trie and a list of valid indices from the suffix Trie, then find the intersection. It saves memory but makes `f()` much slower."
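A sketch of that two-Trie alternative, under the assumption that each node records the (ascending) indices of the words passing through it; the class name `DualTrieFilter` is illustrative:

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Memory-constrained alternative: a forward Trie keyed by words and a
// backward Trie keyed by reversed words. Each node stores the ascending
// list of word indices passing through it; f() intersects two lists.
class DualTrieFilter {
    static class Node {
        Node[] child = new Node[26];
        List<Integer> indices = new ArrayList<>();
    }

    Node fwd = new Node(), bwd = new Node();

    DualTrieFilter(String[] words) {
        for (int idx = 0; idx < words.length; idx++) {
            insert(fwd, words[idx], idx);
            insert(bwd, new StringBuilder(words[idx]).reverse().toString(), idx);
        }
    }

    private void insert(Node root, String s, int idx) {
        Node curr = root;
        curr.indices.add(idx); // root matches the empty prefix
        for (char c : s.toCharArray()) {
            int k = c - 'a';
            if (curr.child[k] == null) curr.child[k] = new Node();
            curr = curr.child[k];
            curr.indices.add(idx);
        }
    }

    private List<Integer> lookup(Node root, String s) {
        Node curr = root;
        for (char c : s.toCharArray()) {
            curr = curr.child[c - 'a'];
            if (curr == null) return Collections.emptyList();
        }
        return curr.indices;
    }

    int f(String pref, String suff) {
        List<Integer> a = lookup(fwd, pref);
        List<Integer> b = lookup(bwd, new StringBuilder(suff).reverse().toString());
        // Both lists are ascending; walk backwards to find the largest common index.
        int i = a.size() - 1, j = b.size() - 1;
        while (i >= 0 && j >= 0) {
            int x = a.get(i), y = b.get(j);
            if (x == y) return x;
            if (x > y) i--; else j--;
        }
        return -1;
    }
}
```

Memory drops to $O(N \times L)$ entries per Trie instead of $O(N \times L^2)$ wrapped characters, but the intersection step can scan lists of length $O(N)$ per query.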
6. Verbal Interview Script (Staff Tier)
Interviewer: "Walk me through your optimization strategy for this problem."
You: "When approaching this type of challenge, my primary objective is to identify a transformation that reduces a compound query to a primitive the data structure already supports natively. In 'MANG Problem #49: Prefix and Suffix Search (Hard)', the compound query is prefix AND suffix. By inserting every `suffix{word` wrapping into a single Trie, both constraints collapse into one ordinary prefix walk, so each call to `f()` runs in strictly $O(P + S)$. The cost is $O(N \times L^2)$ preprocessing space, which is the right trade when queries dominate builds. I also kept the per-node state to a single primitive `int` weight rather than a list of candidate indices, which minimizes object allocation and garbage-collection pressure in a high-concurrency Java environment."
7. Staff-Level Interview Follow-Ups
Once you provide the optimized solution, a senior interviewer at Google or Meta will likely push you further. Here is how to handle the most common follow-ups:
Follow-up 1: "How does this scale to a Distributed System?"
If the input data is too large to fit on a single machine (e.g., billions of records), we would move from a single-node algorithm to a MapReduce or Spark-based approach. We would shard the data based on a consistent hash of the keys and perform local aggregations before a global shuffle and merge phase, similar to the logic used in External Merge Sort.
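The sharding step above can be sketched as a single-process simulation. This is a hedged illustration (class name `Sharder` and the two methods are made up); a real deployment would use the partitioner of the MapReduce/Spark framework itself:

```java
import java.util.ArrayList;
import java.util.List;

// Minimal sketch of hash-based sharding: words are partitioned across
// shards by a hash of the word, so each shard can build its own local
// index; a query then fans out to all shards and the results are merged.
class Sharder {
    static int shardFor(String word, int numShards) {
        // floorMod guards against negative hashCode values
        return Math.floorMod(word.hashCode(), numShards);
    }

    static List<List<String>> partition(String[] words, int numShards) {
        List<List<String>> shards = new ArrayList<>();
        for (int i = 0; i < numShards; i++) shards.add(new ArrayList<>());
        for (String w : words) shards.get(shardFor(w, numShards)).add(w);
        return shards;
    }
}
```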
Follow-up 2: "What are the Concurrency implications?"
In a multi-threaded Java environment, we must ensure that shared state (here, the Trie and its node weights) is handled safely. The Trie is immutable after construction, so the cleanest design is to build it once, publish it safely (e.g., via a `final` field), and serve concurrent reads with no locking at all. If the dictionary had to accept live updates, we could use `synchronized` blocks, but a higher-performance approach would be atomic variables or `ConcurrentHashMap`. For bulk construction, I would consider a work-stealing pattern where each thread processes an independent segment of the data to minimize lock contention.
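The `ConcurrentHashMap` point can be demonstrated with a small aggregation sketch (class name `ConcurrentCount` is illustrative, and this is a generic pattern rather than part of the WordFilter solution): several threads update one map through the atomic `merge()` operation, with no `synchronized` blocks:

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Threads count character frequencies into a shared ConcurrentHashMap.
// merge() performs an atomic read-modify-write per key, so no explicit
// locking is needed around the update.
class ConcurrentCount {
    static Map<Character, Integer> frequencies(List<String> words) {
        ConcurrentHashMap<Character, Integer> freq = new ConcurrentHashMap<>();
        List<Thread> threads = new ArrayList<>();
        for (String w : words) {
            Thread t = new Thread(() -> {
                for (char c : w.toCharArray()) {
                    freq.merge(c, 1, Integer::sum); // atomic update
                }
            });
            t.start();
            threads.add(t);
        }
        for (Thread t : threads) {
            try { t.join(); } catch (InterruptedException e) { Thread.currentThread().interrupt(); }
        }
        return freq;
    }
}
```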
8. Performance Nuances (The Java Perspective)
- Autoboxing Overhead: When using `HashMap<Integer, Integer>`, Java performs autoboxing, which creates thousands of `Integer` objects on the heap. In a performance-critical system, I would use a primitive-specialized library like fastutil or Trove (e.g., `Int2IntMap`) to significantly reduce GC pauses.
- Recursion Depth: Recursive solutions are elegant but risky for deep inputs. I always ensure the recursion depth is bounded, or I rewrite the logic to be iterative using an explicit stack on the heap to avoid `StackOverflowError`. (The Trie code above is already iterative.)
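When the keys are small dense integers, the autoboxing cost can be avoided without any third-party library at all. A minimal sketch (class name `PrimitiveHistogram` is made up for illustration):

```java
// A plain int[] histogram replaces HashMap<Integer, Integer> with zero
// heap allocation per update: no Integer boxing, no hashing.
class PrimitiveHistogram {
    static int[] letterCounts(String s) {
        int[] counts = new int[26]; // one slot per lowercase letter
        for (char c : s.toCharArray()) {
            counts[c - 'a']++;      // primitive increment, GC-free
        }
        return counts;
    }
}
```

This is the same reasoning behind the 27-slot `children` array in the Trie: a dense primitive array beats a generic map when the key space is tiny and known up front.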
Key Takeaways
- Fold the suffix and prefix into one key, `suff + "{" + pref`, so a single standard Trie answers both constraints with one prefix walk.
- Inserting words in ascending index order and blindly overwriting `weight` makes "return the largest index" automatic.
- The trade is $O(N \times L^2)$ preprocessing for strictly $O(P + S)$ queries; `{` works as the separator because it sits immediately after `z` in ASCII, giving a children array of size 27.