MANG Problem #14: K-th Smallest in Lexicographical Order (Hard)

1. Problem Statement

Mental Model

Treat the integers 1 to n as nodes of a prefix tree, and reason about parent, child, and sibling relationships while traversing it.

Given two integers n and k, return the k-th lexicographically smallest integer in the range [1, n].

Input: n = 13, k = 2
Output: 10
Explanation: The lexicographical order is [1, 10, 11, 12, 13, 2, 3, 4, 5, 6, 7, 8, 9], so the second smallest is 10.
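
To make the definition concrete, here is a brute-force reference that sorts the decimal strings. It is only a sketch for small n (the class name KthLexBruteForce is made up for illustration), and it is mainly useful as an oracle when testing a faster solution.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Hypothetical helper for illustration; not part of the optimized solution.
class KthLexBruteForce {
    // Builds "1".."n" as strings, sorts them lexicographically, and picks the k-th (1-based).
    static int findKthNumber(int n, int k) {
        List<String> numbers = new ArrayList<>();
        for (int i = 1; i <= n; i++) {
            numbers.add(Integer.toString(i));
        }
        Collections.sort(numbers); // String's natural order is lexicographical
        return Integer.parseInt(numbers.get(k - 1));
    }

    public static void main(String[] args) {
        System.out.println(findKthNumber(13, 2)); // prints 10
    }
}
```

This needs O(n) memory and O(n log n) comparisons, so it is not a candidate for large n.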

2. Approach: Denary Tree Traversal

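Instead of generating and sorting all n numbers as strings (which takes $O(n \log n)$ time and is too slow for large n), we treat the numbers as nodes in a 10-ary tree (Denary Tree). Visiting this tree in pre-order yields the numbers in lexicographical order.

The top level holds 1 through 9.
Node 1 has children 10, 11, 12... 19.
Node 10 has children 100, 101... 109.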
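
The tree view can be checked with a small recursive pre-order walk. This is a sketch for illustration only (the class and method names are assumptions, and it visits every number, so it runs in O(n) rather than using the skipping idea):

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper that lists 1..n by walking the 10-ary prefix tree in pre-order.
class DenaryTreePreorder {
    static List<Integer> lexicalOrder(int n) {
        List<Integer> out = new ArrayList<>();
        for (long root = 1; root <= 9; root++) {
            visit(root, n, out);
        }
        return out;
    }

    // Visits the node itself first, then its children prefix*10 + 0..9.
    private static void visit(long prefix, int n, List<Integer> out) {
        if (prefix > n) {
            return; // this node and everything below it is outside [1, n]
        }
        out.add((int) prefix);
        for (int digit = 0; digit <= 9; digit++) {
            visit(prefix * 10 + digit, n, out);
        }
    }

    public static void main(String[] args) {
        // prints [1, 10, 11, 12, 13, 2, 3, 4, 5, 6, 7, 8, 9]
        System.out.println(lexicalOrder(13));
    }
}
```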

Count Nodes: For a current prefix, calculate how many numbers exist in the range [1, n] that start with that prefix.
Navigate (k here is the number of steps still to take; it starts at k - 1 because we already stand on 1):
- If count <= k: The target is outside this subtree. Move to the next sibling (prefix + 1) and subtract count from k.
- If count > k: The target is inside this subtree. Move to the first child (prefix * 10) and decrement k by 1.
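
The counting step can be sketched on its own. The helper below (the names SubtreeCount and countWithPrefix are assumptions for illustration) adds up, level by level, how many values in [1, n] start with a given prefix:

```java
// Hypothetical standalone version of the "Count Nodes" step.
class SubtreeCount {
    // Counts the integers in [1, n] whose decimal form starts with `prefix`.
    static long countWithPrefix(int n, long prefix) {
        long count = 0;
        long first = prefix;    // smallest value with this prefix on the current level
        long next = prefix + 1; // first value that belongs to the next sibling on this level
        while (first <= n) {
            // Values on this level are [first, next), clipped to n.
            count += Math.min((long) n + 1, next) - first;
            first *= 10;
            next *= 10;
        }
        return count;
    }

    public static void main(String[] args) {
        System.out.println(countWithPrefix(13, 1)); // prints 5 (1, 10, 11, 12, 13)
        System.out.println(countWithPrefix(13, 2)); // prints 1 (2)
    }
}
```

For n = 13 and prefix 1, the first level contributes 1 value and the second level contributes 4, giving 5 in total.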

3. Java Implementation

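public int findKthNumber(int n, int k) {
    int curr = 1;
    k = k - 1; // We already stand on the first number (1), so k - 1 steps remain

    while (k > 0) {
        // Size of the subtree rooted at curr: numbers in [1, n] that start with curr.
        // 1L keeps curr + 1 from overflowing int when curr is Integer.MAX_VALUE.
        long steps = countSteps(n, curr, curr + 1L);
        if (steps <= k) {
            curr += 1; // Target is past this subtree: skip it and move to the sibling
            k -= steps;
        } else {
            curr *= 10; // Target is inside this subtree: step down to the first child
            k -= 1;
        }
    }
    return curr;
}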

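// Counts the numbers in [1, n] that fall in [n1, n2) on each level of the tree.
private long countSteps(int n, long n1, long n2) {
    long steps = 0;
    while (n1 <= n) {
        // n + 1L avoids int overflow when n is Integer.MAX_VALUE
        steps += Math.min(n + 1L, n2) - n1;
        n1 *= 10;
        n2 *= 10;
    }
    return steps;
}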
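
To watch the walk in action, the two methods can be wrapped in a class that logs each move. This is a sketch: the class name KthLexDemo and the log messages are assumptions, and the logic mirrors the implementation above with long used for the running state.

```java
// Hypothetical wrapper class for illustration; the methods mirror the implementation above.
class KthLexDemo {
    static int findKthNumber(int n, int k) {
        long curr = 1;
        long remaining = k - 1; // steps still to take after standing on 1
        while (remaining > 0) {
            long steps = countSteps(n, curr, curr + 1);
            if (steps <= remaining) {
                System.out.printf("skip subtree %d (%d numbers)%n", curr, steps);
                curr += 1;
                remaining -= steps;
            } else {
                System.out.printf("descend into %d%n", curr);
                curr *= 10;
                remaining -= 1;
            }
        }
        return (int) curr;
    }

    private static long countSteps(int n, long n1, long n2) {
        long steps = 0;
        while (n1 <= n) {
            steps += Math.min((long) n + 1, n2) - n1;
            n1 *= 10;
            n2 *= 10;
        }
        return steps;
    }

    public static void main(String[] args) {
        System.out.println(findKthNumber(13, 2));   // logs "descend into 1", then prints 10
        System.out.println(findKthNumber(100, 10)); // last line printed is 17
    }
}
```

For n = 13, k = 2, one step remains after standing on 1; the subtree under 1 holds 5 numbers, which is more than 1, so the walk descends to 10 and stops.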

4. 5-Minute "Video-Style" Walkthrough

The "Aha!" Moment: Lexicographical order is just Pre-order Traversal on a 10-ary tree.
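The Skipping Logic: The countSteps function is the hero. It tells us: "How many numbers in [1, n] sit in the subtree between curr and curr + 1?" If that number is at most the remaining k, we don't need to visit any of them. We jump over them all with a single $O(\log n)$ count.
Efficiency: The walk makes $O(\log n)$ moves (at most 9 sibling moves per level, one level per digit), and each move costs one $O(\log n)$ count. By skipping entire subtrees, we achieve $O(\log n \times \log n)$ time and $O(1)$ extra space, which is incredibly fast for $n = 10^9$.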
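
The move count can be checked empirically. The sketch below (the class and method names are assumptions) runs the same walk but returns how many times the outer loop executed; since each level allows at most 9 sibling moves and n = 10^9 has 10 digits, the result should stay below 100.

```java
// Hypothetical instrumented copy of the walk that counts outer-loop iterations.
class KthLexIterations {
    static int outerIterations(int n, int k) {
        long curr = 1;
        long remaining = k - 1;
        int iterations = 0;
        while (remaining > 0) {
            iterations++;
            long steps = count(n, curr);
            if (steps <= remaining) {
                curr += 1;          // sibling move
                remaining -= steps;
            } else {
                curr *= 10;         // child move
                remaining -= 1;
            }
        }
        return iterations;
    }

    // Numbers in [1, n] that start with prefix; one loop pass per tree level.
    private static long count(int n, long prefix) {
        long total = 0;
        long lo = prefix;
        long hi = prefix + 1;
        while (lo <= n) {
            total += Math.min((long) n + 1, hi) - lo;
            lo *= 10;
            hi *= 10;
        }
        return total;
    }

    public static void main(String[] args) {
        // Worst-case style input: the lexicographically last number for n = 10^9.
        System.out.println(outerIterations(1_000_000_000, 1_000_000_000));
    }
}
```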

5. Interview Discussion

Interviewer: "Why use long for n1 and n2?"
You: "Because when we multiply by 10, we could easily exceed Integer.MAX_VALUE before the n1 <= n check."
Interviewer: "What is the tree height?"
You: "The height is $O(\log n)$, which is why this approach is so much better than generating or sorting all numbers."
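
The overflow answer can be illustrated with a tiny experiment. This is a sketch with made-up names; it only shows what multiplying by 10 does in int versus long arithmetic.

```java
// Hypothetical demo of why the level bounds are tracked as long.
class OverflowDemo {
    static int timesTenAsInt(int x) {
        return x * 10;  // int arithmetic wraps around silently
    }

    static long timesTenAsLong(int x) {
        return x * 10L; // widened to long before multiplying
    }

    public static void main(String[] args) {
        int x = 1_000_000_000;
        System.out.println(timesTenAsInt(x));  // prints 1410065408
        System.out.println(timesTenAsLong(x)); // prints 10000000000
    }
}
```

With int bounds and an n such as 2_000_000_000, the wrapped value 1410065408 would still pass the n1 <= n check, so the loop would keep counting numbers from a level that does not exist.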

Verbal Interview Script (Staff Tier)

Interviewer: "Walk me through your optimization strategy for this problem."

You: "My first observation is that sorting all n numbers is unnecessary, because lexicographical order is exactly the pre-order traversal of a 10-ary prefix tree. That structure lets me count how many numbers sit under any prefix in $O(\log n)$ time without visiting them. In my implementation I start at prefix 1: if the subtree under the current prefix holds no more numbers than the steps I still need, I skip it and move to the next sibling; otherwise I descend to its first child. This gives $O(\log^2 n)$ time and $O(1)$ extra space, so it handles $n = 10^9$, where generating or sorting every number would be too slow. The one implementation detail I watch for is overflow when multiplying by 10, which is why the counting helper uses long."

6. Staff-Level Interview Follow-Ups

Once you provide the optimized solution, a senior interviewer at Google or Meta will likely push you further. Here is how to handle the most common follow-ups:

Follow-up 1: "How does this scale to a Distributed System?"

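This particular problem takes only two integers as input and runs in $O(\log^2 n)$ time with $O(1)$ space, so a single machine is enough and there is nothing to shard. If the interviewer changes the problem so that the input data is too large to fit on a single machine (e.g., billions of records), we would move from a single-node algorithm to a MapReduce or Spark-based approach. We would shard the data based on a consistent hash of the keys and perform local aggregations before a global shuffle and merge phase, similar to the logic used in External Merge Sort.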

Follow-up 2: "What are the Concurrency implications?"

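findKthNumber keeps all of its state in local variables (curr and k), so concurrent calls share nothing and need no locking. If a variant introduced shared mutable state in a multi-threaded Java environment (e.g., a shared map of results), we would have to make it thread-safe. While we could use synchronized blocks, a higher-performance approach would be to use atomic classes such as AtomicInteger or AtomicLong, or a ConcurrentHashMap. For problems involving shared arrays, I would consider a Work-Stealing pattern (for example, a ForkJoinPool) where each thread processes an independent segment of the data to minimize lock contention.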

7. Performance Nuances (The Java Perspective)

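Autoboxing Overhead: The solution above uses only primitives (int and long), so it allocates nothing on the heap. In variants that need a HashMap<Integer, Integer>, Java autoboxes keys and values into Integer objects on the heap. In a performance-critical system, I would use a primitive-specialized collection such as fastutil's Int2IntOpenHashMap or Trove's TIntIntHashMap, reducing allocation and GC pauses.
Recursion Depth: The implementation above is iterative, so it cannot overflow the call stack. A recursive pre-order walk of this tree would only recurse as deep as the number of digits in n, but in general recursive solutions are elegant yet risky for deep inputs. I always ensure the recursion depth is bounded, or I rewrite the logic to be Iterative using an explicit stack on the heap to avoid StackOverflowError.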

Key Takeaways

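Lexicographical order of [1, n] is the pre-order traversal of a 10-ary tree: node 1 has children 10, 11, 12... 19, and node 10 has children 100, 101... 109.
countSteps counts the numbers in [1, n] under a prefix level by level in $O(\log n)$, using long to avoid overflow.
If count <= k (steps remaining): The target is outside this subtree. Move to the next sibling (prefix + 1) and subtract count from k. Otherwise, move to the first child (prefix * 10) and decrement k by 1.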