Binary Search Tree In-order Iterator

Problem:  Assume Iterators only need 3 methods written when implemented:

void init(Node root){…}//sets up the Iterator so it can be used
bool hasNext(){…}//true if there is a node available.  false if there is no node available
Node next(){…}//returns next Node following the iterated pattern

Write pseudo code for an Iterator for a Binary Search Tree which will iterate the tree in-order.  You can assume you have a Node, Queue<Node>, List<Node>, and Stack<Node> class already written.

Initial Thoughts:  I actually didn’t know how Iterators worked internally at the time.  I was honest with the interviewer about this.   However, that is part of the problem, so I had to break it down.  First you must figure out how to do an in-order search iteratively, then you must figure out how to refactor it so it works as an Iterator.  Also, I noticed he mentioned I can assume all these data structures exist, but I had asked “do I need to use all of these data structures?”.  The interviewer said “no”.

A very common technique to do an in order search is recursively:

void inOrderSearch(Node root){
    if(root != null){
        inOrderSearch(root.left)
        print(root.value)
        inOrderSearch(root.right)
    }
}

You start by passing in the root Node of the tree, then you recursively go as far left as you can.  This will be the smallest value Node.  You will print that Node, then check if there is anything to the right of that farthest left Node.  If there is something on the right, then do an in-order search on it.  If there isn’t a right Node, then the recursive method will go back to the previous recursive call (which in turn will be the previous in order Node!).  It will keep doing this until the tree is printed in order.  This will help pave a way to do an iterative solution.
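To make the recursive walk concrete, here is a minimal runnable Java sketch.  The Node class, the leaf() helper, and the sample tree are my own assumptions for illustration (the problem only says a Node class exists), and the values are collected into a list instead of printed so the order is easy to check.

```java
import java.util.ArrayList;
import java.util.List;

class RecursiveInOrder {
    // Hypothetical minimal node type; the problem only assumes some Node class exists.
    static class Node {
        int value;
        Node left, right;
        Node(int value, Node left, Node right) {
            this.value = value;
            this.left = left;
            this.right = right;
        }
    }

    // Left subtree first, then the node itself, then the right subtree.
    static void inOrder(Node root, List<Integer> out) {
        if (root != null) {
            inOrder(root.left, out);
            out.add(root.value);
            inOrder(root.right, out);
        }
    }

    static Node leaf(int value) {
        return new Node(value, null, null);
    }

    // Builds a small sample tree (an assumption for this sketch) and walks it.
    static List<Integer> demo() {
        Node root = new Node(5,
                new Node(3, leaf(2), leaf(4)),
                new Node(7, leaf(6), leaf(8)));
        List<Integer> out = new ArrayList<>();
        inOrder(root, out);
        return out;
    }
}
```

Calling RecursiveInOrder.demo() visits the values in ascending order, which is exactly what an in-order walk of a Binary Search Tree should do.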

If you paid attention in class (don’t worry I didn’t either), you can recall that each call to a recursive method pushes a new frame onto the call stack.  The risk with recursive methods is that if the recursion goes too deep, too many frames sit on the call stack at once.  This could cause a stack overflow!  Can we recreate this process with one explicit stack of our own instead of relying on the call stack?  Say we have the following tree:

binary search tree example

And we have a stack that contains nothing but the root Node (5):

[Bottom][5][Top]

Then we must cycle until we find the smallest value Node.  We know the smallest value will always be in the furthest left Node.

bst example step 1

As we cycle down to the left, we should add each Node we pass.  Our stack will look as so:

[Bottom][5, 3, 2][Top]

Now we know the smallest, we can print this value and find the next smallest.  First let’s pop the top Node of the stack.  Then we check if that Node has any right Node.  If so, push the right Node and cycle the left of that right Node (as you cycle, push each Node to the Stack).  If there isn’t a right Node, then we don’t have to do anything.  Now we can print the value of the currently popped Node. Now our stack looks like:

[Bottom][5, 3][Top]

In our case, we don’t have a right Node, so we pop the next Node.

bst example step 2

This currently popped Node does have a right Node.  Let’s push that Node and all its left Nodes (there aren’t any in this case).  This gives us:

[Bottom][5, 4][Top]

We will continue this process until our stack goes empty:

bst example step 3

[Bottom][5][Top]

bst example step 4

[Bottom][7, 6][Top]

bst example step 5

[Bottom][7][Top]

bst example step 6

[Bottom][8][Top]

bst example step 7

[Bottom][][Top]

Now the pseudo code for this stack-iterative in-order search:

void iterativeInOrder(Node root){
    Stack stack;//new Stack<Node>

    //add root and all left nodes until we reach furthest left node
    while(root != null){
        stack.push(root)
        root = root.left
    }

    while(!stack.isEmpty()){
        Node temp = stack.pop()
        print(temp.value)

        //get right, then cycle right’s left nodes
        temp = temp.right
        while(temp != null){
            stack.push(temp)
            temp = temp.left
        }
    }
}

Now we need to fill in the Iterator methods.  This is a matter of dividing the code into separate parts.  Let’s assume we are given a member variable Stack which will be used for all three methods.  Then we can start with the easiest:

bool hasNext(){
    return !stack.isEmpty()
}

Now the more difficult methods.  The init() method will be the first method called to set up the iterator before actually iterating.  In our case, this means we need to set up the stack so that it has the first Node ready to go:

void init(Node root){
    //push root and all left nodes so the smallest Node ends up on top
    while(root != null){
        stack.push(root)
        root = root.left
    }
}

Now we need a method to iterate the tree:

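Node next(){
    //pop only one Node per call; looping here would drain the whole stack
    Node retNode = null
    if(!stack.isEmpty()){
        retNode = stack.pop()

        //push the right Node and all of its left nodes
        Node temp = retNode.right
        while(temp != null){
            stack.push(temp)
            temp = temp.left
        }
    }

    return retNode
}

Notice that this is nearly the same code as before but with three differences.  It’s split in half, the outer while loop becomes an if (each call to next() hands back exactly one Node, and the caller’s hasNext() loop does the looping), and we do a return instead of a print.  This is a fine answer, but I feel like we can do better…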
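For anyone who wants to run the stack version, here is a small Java sketch of it.  The Node class, the insert() helper, and the demo() method are assumptions I added for illustration, and java.util.ArrayDeque stands in for the Stack<Node> class the problem says we may assume.  Each call to next() pops exactly one Node.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;

class StackInOrderIterator {
    // Hypothetical minimal node type for this sketch.
    static class Node {
        int value;
        Node left, right;
        Node(int value) { this.value = value; }
    }

    private final Deque<Node> stack = new ArrayDeque<>();

    // Push the root and every left descendant so the smallest node is on top.
    void init(Node root) {
        pushLeftSpine(root);
    }

    boolean hasNext() {
        return !stack.isEmpty();
    }

    // Pop one node, then queue up the left spine of its right subtree.
    Node next() {
        Node ret = stack.pop();
        pushLeftSpine(ret.right);
        return ret;
    }

    private void pushLeftSpine(Node node) {
        while (node != null) {
            stack.push(node);
            node = node.left;
        }
    }

    // Plain BST insert, only used to build a sample tree.
    static Node insert(Node root, int value) {
        if (root == null) return new Node(value);
        if (value < root.value) root.left = insert(root.left, value);
        else root.right = insert(root.right, value);
        return root;
    }

    // Builds a tree matching the stack traces above and iterates over it.
    static List<Integer> demo() {
        Node root = null;
        for (int v : new int[]{5, 3, 7, 2, 4, 6, 8}) root = insert(root, v);
        StackInOrderIterator it = new StackInOrderIterator();
        it.init(root);
        List<Integer> out = new ArrayList<>();
        while (it.hasNext()) out.add(it.next().value);
        return out;
    }
}
```

The stack never holds more than one Node per level of the tree, so the extra space is proportional to the height of the tree rather than its size.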

Can we avoid using a stack altogether?  Are we being redundant in imitating the stack process?  Why did we use a stack in the first place?  The reason why we used it was because we needed to get back to the Nodes above the current Node.  However, aren’t those just the current Node’s parent, its parent’s parent, and so on?  We can easily avoid using a stack and just keep track of a fixed number of Nodes instead if our Node class has a getParent() method!

Final Solution:  Let’s assume our Node class has a getParent() method.  We could write an Iterator by just keeping track of the current Node, so the only member variable we need is Node currentNode:

Node currentNode = null

void init(Node root){
    //start at the furthest left Node (the smallest value)
    currentNode = root
    while(currentNode != null && currentNode.left != null)
        currentNode = currentNode.left
}

Node next(){
    Node retNode = currentNode

    if(currentNode.right != null){
        //next Node is the furthest left Node of the right sub tree
        currentNode = currentNode.right
        while(currentNode.left != null)
            currentNode = currentNode.left
    }else{
        //no right sub tree: climb while we are a right Node (node.parent.right equals node),
        //because those parents were already returned
        //the first parent we reach from its left side is next; null means we are done
        Node parent = currentNode.getParent()
        while(parent != null && parent.right == currentNode){
            currentNode = parent
            parent = parent.getParent()
        }
        currentNode = parent
    }
    return retNode
}

bool hasNext(){
    return currentNode != null
}

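Although this is a bit more code and a bit more complex, it does avoid using a stack, which saves memory: the iterator keeps a single Node reference instead of up to one Node per level of the tree, at the cost of every Node storing a reference to its parent.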
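Here is a runnable Java sketch of a parent-pointer iterator, under the assumption that every Node stores a parent reference.  The Node class, insert(), and traverse() are hypothetical helpers I added for testing.  The successor rule used here is the standard one: take the furthest left Node of the right sub tree if there is one, otherwise climb while the current Node is a right child and then step to the parent.

```java
import java.util.ArrayList;
import java.util.List;

class ParentPointerIterator {
    // Hypothetical node type with a parent link (an assumption of this sketch).
    static class Node {
        int value;
        Node left, right, parent;
        Node(int value) { this.value = value; }
        Node getParent() { return parent; }
    }

    private Node current;

    // Start at the furthest left node, which holds the smallest value.
    void init(Node root) {
        current = root;
        while (current != null && current.left != null) current = current.left;
    }

    boolean hasNext() {
        return current != null;
    }

    Node next() {
        Node ret = current;
        if (current.right != null) {
            // Successor is the furthest left node of the right subtree.
            current = current.right;
            while (current.left != null) current = current.left;
        } else {
            // Climb past every ancestor we are on the right side of.
            Node parent = current.getParent();
            while (parent != null && parent.right == current) {
                current = parent;
                parent = parent.getParent();
            }
            current = parent; // null once the largest node has been returned
        }
        return ret;
    }

    // Iterative BST insert that also wires up the parent links.
    static Node insert(Node root, int value) {
        Node node = new Node(value);
        if (root == null) return node;
        Node cur = root;
        while (true) {
            if (value < cur.value) {
                if (cur.left == null) { cur.left = node; node.parent = cur; return root; }
                cur = cur.left;
            } else {
                if (cur.right == null) { cur.right = node; node.parent = cur; return root; }
                cur = cur.right;
            }
        }
    }

    // Builds a tree from the values in insertion order and iterates over it.
    static List<Integer> traverse(int... values) {
        Node root = null;
        for (int v : values) root = insert(root, v);
        ParentPointerIterator it = new ParentPointerIterator();
        it.init(root);
        List<Integer> out = new ArrayList<>();
        while (it.hasNext()) out.add(it.next().value);
        return out;
    }
}
```

A tree such as traverse(5, 9, 6, 7) is a useful test case: 7 is a right child with no right child deep inside the root’s right sub tree, yet 9 still has to come after it.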

Lattice Paths

Problem:  Starting in the top left corner of a 2×2 grid, and only being able to move to the right and down, there are exactly 6 routes to the bottom right corner.

How many such routes are there through a 20×20 grid?  (Source)

Initial Thoughts:  There seems to be no obvious way to solve this iteratively (at first sight).  We could create a graph class with nodes that have 4 pointers to other nodes (one per direction), then recursively solve this problem.  However, I really don’t want to have to reinvent the wheel.  Is there a class or data type that can imitate a grid like structure?  A 2D array seems to be a good fit.  What would be the dimensions of this 2D array?  Say we have an MxN grid to replicate, should we have a MxN 2D array?  Each index in the 2D array does not represent each square in the grid, but each point (or node).  Say we have a 2×2 grid, we actually have 3×3 points (or nodes in the graph).  The same is for a 3×3 grid, we actually have 4×4 points (or nodes in the graph).  This means for a MxN grid, we should have a (M+1)x(N+1) 2D array.  Personally, I like making everything an Object, so I came up with a simple class:

public class LatticePath{

    public long[][] nodeGraph;

    public LatticePath(int x, int y){
        nodeGraph = new long[x+1][y+1];//number of nodes….not number of edges
        this.emptyNodeGraph();
    }
    
    public void emptyNodeGraph(){
        for(int i = 0; i < nodeGraph.length; i++)
            for(int j = 0; j < nodeGraph[i].length; j++)
                nodeGraph[i][j] = 0;
    }
    

    //this may seem useless, but it will help us keep track of how often we passed through a “node”
    public void printNodeGraph(){
        for(int i = 0; i < nodeGraph.length; i++){
            for(int j = 0; j < nodeGraph[i].length; j++){
                System.out.print(nodeGraph[i][j]);
                for(int k = 0; k < (15-((nodeGraph[i][j]+"").length())); k++)//use 15 as some large default length space
                    System.out.print(" ");
            }
            System.out.println();
        }
    }

}

Now we must come up with a way to actually calculate the number of lattice paths.  Our starting point is the top left node, and our ending point is the bottom right node. We can only move to the right or move down.  Moving right and moving down is just incrementing one of the indexes in the 2D array.  Basically, we would have something similar to:

recursiveMethod(nodeGraph, x+1, y) + recursiveMethod(nodeGraph, x, y+1);

This way we will increment one to the right and increment one to the bottom.  This will go through all possible paths.  Now we need a base case, so we know when to end the recursion.  Once x and y (the current indexes of the 2D array or the coordinates of the node) are equal to the length of the array minus one, we should return 1.  This returned 1 represents a unique complete lattice path.

if(nodeGraph.length-1 == x && nodeGraph[0].length-1 == y)
        return 1;

However, are we missing any other possible cases?  What if we reach x length before reaching the y length or vice versa?  What does this mean visually?

Image

(Forgive my lack of artistic skills.)

The red arrows are the currently searched paths.  The green circle is the node we have finally reached.  We cannot go any further right (we cannot increment x any further).  However, we can go further down (we can increment y).  We could iteratively add the rest at this point (which would be better for space/time efficiency) or we could make a single recursive call (which is easier to write).  Finally, it will look like:

public static long numberOfLatticePathsRecursive(LatticePath nodeGraph){
    return numberOfLatticePathsRecursiveHelper(nodeGraph.nodeGraph, 0, 0);
}

private static long numberOfLatticePathsRecursiveHelper(long[][] nodeGraph, int x, int y){
    nodeGraph[x][y]++;//count every visit to this node
    if(nodeGraph.length-1 == x && nodeGraph[0].length-1 == y)
        return 1;
    else if(nodeGraph.length-1 == x && nodeGraph[0].length-1 > y)
        return numberOfLatticePathsRecursiveHelper(nodeGraph, x, y+1);
    else if(nodeGraph.length-1 > x && nodeGraph[0].length-1 == y)
        return numberOfLatticePathsRecursiveHelper(nodeGraph, x+1, y);

    return numberOfLatticePathsRecursiveHelper(nodeGraph, x+1, y) + numberOfLatticePathsRecursiveHelper(nodeGraph, x, y+1);
}

Now let’s run some tests:

        int x, y;
        long total;
        LatticePath lp;
       
        for(int i = 1; i <= 20; i++){
            lp = new LatticePath(i,i);
            total = numberOfLatticePathsRecursive(lp);
            System.out.println(i + "*" + i + " lattice path = " + total);
        }

After about a minute, I get the following results:

1*1 lattice path = 2
2*2 lattice path = 6
3*3 lattice path = 20
4*4 lattice path = 70
5*5 lattice path = 252
6*6 lattice path = 924
7*7 lattice path = 3432
8*8 lattice path = 12870
9*9 lattice path = 48620
10*10 lattice path = 184756
11*11 lattice path = 705432
12*12 lattice path = 2704156
13*13 lattice path = 10400600
14*14 lattice path = 40116600
15*15 lattice path = 155117520
16*16 lattice path = 601080390
17*17 lattice path = 2333606220

Now my computer is pushing some weights and attempting to pump out the rest of the results.  Clearly this recursive strategy is not fast enough: the recursion reaches the bottom right node once per path, so the running time grows at least as fast as the answer itself.  Project Euler’s guideline is that a solution should produce the correct answer within about one minute.  I’m sure we could come up with a faster solution by the time this one finishes running.  Let’s re-run this test from 1 to 5 only and print out the 2D array after each run.

for(int i = 1; i <= 5; i++){
    lp = new LatticePath(i,i);
    total = numberOfLatticePathsRecursive(lp);
    lp.printNodeGraph();
    System.out.println(i + "*" + i + " lattice path = " + total);
}

We get the following:

1              1              
1              2              
1*1 lattice path = 2
1              1              1              
1              2              3              
1              3              6              
2*2 lattice path = 6
1              1              1              1              
1              2              3              4              
1              3              6              10             
1              4              10             20             
3*3 lattice path = 20
1              1              1              1              1              
1              2              3              4              5              
1              3              6              10             15             
1              4              10             20             35             
1              5              15             35             70             
4*4 lattice path = 70
1              1              1              1              1              1              
1              2              3              4              5              6              
1              3              6              10             15             21             
1              4              10             20             35             56             
1              5              15             35             70             126            
1              6              21             56             126            252            
5*5 lattice path = 252

Interestingly enough, there seems to be a very familiar pattern here.   It seems to  be replicating Pascal’s Triangle.

Pascal’s Triangle animation
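The printed grids also hint at a direct way to fill in the array: every entry on the top row or left column is 1, and every other entry is the sum of the entry above it and the entry to its left (for example 6 = 3 + 3 and 20 = 10 + 10).  Here is a minimal Java sketch of that idea.  The class and method names are my own, and this is just one way to exploit the pattern, not the solution I ended up with.

```java
class LatticeGridDp {
    // Counts right/down paths through a rows x cols grid by filling a
    // (rows+1) x (cols+1) table of node counts, Pascal's Triangle style.
    static long countPaths(int rows, int cols) {
        long[][] ways = new long[rows + 1][cols + 1];
        for (int i = 0; i <= rows; i++) {
            for (int j = 0; j <= cols; j++) {
                if (i == 0 || j == 0) {
                    ways[i][j] = 1; // only one way along the top or left edge
                } else {
                    ways[i][j] = ways[i - 1][j] + ways[i][j - 1];
                }
            }
        }
        return ways[rows][cols];
    }
}
```

This visits each node once, so a 20×20 grid needs only 441 additions instead of billions of recursive calls.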

Pascal’s Triangle is a triangular array of binomial coefficients.  The binomial coefficient is the number of ways of picking k unordered outcomes from n possibilities:

$$ {}_nC_k = \binom{n}{k} = \frac{n!}{(n-k)!\,k!} $$

Let’s look at all the possible paths of a 2×2 grid:

2x2 grid all paths colored

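In each path through an AxB grid, there are exactly A moves down and B moves right, in some order.  The number of possibilities n is the total number of moves A + B (e.g. 2 + 2), and the k unordered outcomes are the positions in that sequence that hold the down moves (A of them) or, equivalently, the right moves (B of them).  This means our formula should look as so: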

$$ \binom{A+B}{A} = \frac{(A+B)!}{A!\,(A+B-A)!} = \frac{(A+B)!}{B!\,(A+B-B)!} = \binom{A+B}{B} $$
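As a quick sanity check of the formula, here it is worked out for the 2×2 grid from the problem statement (A = B = 2) and written down for the 20×20 grid (A = B = 20).  The 20×20 value is the one the recursive pattern above is heading toward.

```latex
\binom{2+2}{2} = \frac{4!}{2!\,2!} = \frac{24}{2 \cdot 2} = 6
\qquad
\binom{20+20}{20} = \frac{40!}{20!\,20!} = 137846528820
```

The first value matches the 6 routes given in the problem, which is a good sign that n = A + B and k = A is the right way to read the grid.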

Final Solution:  Now we can calculate the number of paths iteratively/mathematically:

    //requires: import java.math.BigInteger;
    public static long numberOfLatticePathsMathematically(LatticePath nodeGraph){
        //note:  the size of the array is for the number of nodes.  not number of edges.
        //this formula is for number of edges (or number of right plus number of down turns)
        long m = nodeGraph.nodeGraph.length - 1;
        long n = nodeGraph.nodeGraph[0].length - 1;
        return binomialCoefficient(m+n, n);
    }

    private static long binomialCoefficient(long a, long b){
        //numbers get too big here so long will not do.  try BigInteger!
        /*
        long x = factorial(a), y = factorial(b), z = factorial(a - b);
        long c = y*z;
        return x/c;
        */

        BigInteger biA = factorial(a), biB = factorial(b), biAB = factorial(a-b);
        BigInteger denom = biB.multiply(biAB);
        return (biA.divide(denom)).longValue();
    }

    private static BigInteger factorial(long v){
        BigInteger retVal = BigInteger.valueOf(1);
        for(long i = v; i > 1; i--)
            retVal = retVal.multiply(BigInteger.valueOf(i));
        return retVal;
    }
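As an aside, the factorials only overflow because they are computed in full before dividing.  Here is a hedged sketch of an alternative that multiplies and divides step by step, so the running value is always itself a binomial coefficient and stays small enough for a long in the 20×20 case.  The class and method names are my own, and it assumes 0 <= k <= n; larger inputs can still overflow.

```java
class BinomialLong {
    // Computes C(n, k) without factorials, assuming 0 <= k <= n.
    // After step i the running value equals C(n - k + i, i), so each
    // division is exact.
    static long binomial(int n, int k) {
        k = Math.min(k, n - k); // C(n, k) == C(n, n - k); use the smaller side
        long result = 1;
        for (int i = 1; i <= k; i++) {
            result = result * (n - k + i) / i;
        }
        return result;
    }
}
```

For a 20×20 grid this would be called as BinomialLong.binomial(40, 20).  The BigInteger version above is still the safer choice when the inputs are not known to be small.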

Now let’s test this out:

        int x, y;
        long total;
        LatticePath lp;
    
        for(int i = 1; i <= 20; i++){
            lp = new LatticePath(i,i);
            total = numberOfLatticePathsMathematically(lp);
            System.out.println(i + "*" + i + " lattice path = " + total);
        }

We get the following results in about a second:

1*1 lattice path = 2
2*2 lattice path = 6
3*3 lattice path = 20
4*4 lattice path = 70
5*5 lattice path = 252
6*6 lattice path = 924
7*7 lattice path = 3432
8*8 lattice path = 12870
9*9 lattice path = 48620
10*10 lattice path = 184756
11*11 lattice path = 705432
12*12 lattice path = 2704156
13*13 lattice path = 10400600
14*14 lattice path = 40116600
15*15 lattice path = 155117520
16*16 lattice path = 601080390
17*17 lattice path = 2333606220
18*18 lattice path = 9075135300
19*19 lattice path = 35345263800
20*20 lattice path = 137846528820

Phew.  That one was a thinker.

Random Number Selector

Problem:   Say a user is entering a stream of numbers (one integer at a time), how can you randomly select one of the numbers so each of the numbers has an equal chance of being selected?

Example:  User enters:

1

5

1

4

2

1 has a 40% chance of being selected, whereas 2, 4, and 5 each have a 20% chance of being selected.

Initial Thoughts:  Use a data structure such as a Multi-Valued HashMap or ArrayList.  In this case, we’ll use ArrayList because we can easily store all the numbers into the list, then randomly select between 0 and size of the ArrayList minus 1.  This will take care of worrying about which number has a greater chance of being selected since there will be more of that same number in the ArrayList.

Here is a relatively quick and simple implementation:

    //requires: import java.util.ArrayList; import java.util.Random; import java.util.Scanner;
    public static void RandomSelectorAttempt1(){

        ArrayList<Integer> list = new ArrayList<Integer>();
        Scanner scanner = new Scanner(System.in);

        System.out.println("Enter a list of integers.  Once complete, type \"DONE\".");
        while(scanner.hasNext()){
            if(scanner.hasNextInt())
                list.add(scanner.nextInt());
            else if(scanner.next().trim().equalsIgnoreCase("DONE"))
                break;
            else
                System.out.println("ERROR:  invalid value.");
        }
        scanner.close();

        if(list.size() > 0){
            //every stored entry is equally likely, so duplicates raise a value's odds
            Random randomGenerator = new Random();
            System.out.println(list.get(randomGenerator.nextInt(list.size())));
        }else{
            System.out.println("No integers entered.");
        }
    }

Now it’s time to look back and see if we can improve this solution.  Say the user enters billions of numbers.  Storing all these numbers in an ArrayList wouldn’t be wise as far as space complexity is concerned.  We could keep a List of uniquely entered numbers and a parallel List which tracks the number of times each unique number has been entered.  This way we could save space on duplicates, but this won’t help us if the user enters unique numbers each time.  Is there a solution that can avoid data structures altogether?  What if we calculate a selected number for every chunk of numbers entered, then calculate a final selected number from the pool of chunk-selected numbers?  This can still lead to issues if the user enters billions of chunks of numbers.  What if we can calculate the selected number each time a number is entered?  We know that each number entered has an equal chance of being selected.  This means for n numbers, each number has a 1/n chance of winning.  After receiving the first number, we can calculate the next entered number’s chance of being selected (1/n, where n counts the numbers entered so far including the new one), and then use a random number generator to roll out a number between 0 and n-1.  If the number generated is n-1 (or 0; consistently use one!), then the current number selected becomes the most recent number entered.  Otherwise the current selection stays as it is.

Imagine if we received the same sample user input as above:

User Input    n    % chance of user input being selected    Currently Selected
1             1    100%                                     1
5             2    50%                                      50% => 1;          50% => 5
1             3    33.3%                                    66.6% => 1,5;      33.3% => 1
4             4    25%                                      75% => 1,5,1;      25% => 4
2             5    20%                                      80% => 1,5,1,4;    20% => 2

The chances of being selected may not be so obvious at first.  The first input has a 100% chance of winning because it is the only number entered.  The second input has a 50% chance of winning because there are only two inputs so far (and each number has an equal chance of being selected).  Now assume we have a winner (we don’t care what exact number the current winner is).  We know that the third input has about a 33.3% chance of winning, whereas the current winner has about 66.6%.  The reason why the current winner has about a 66.6% chance of winning is because it covers the chances of two of the three currently entered numbers.  This applies when we have 4 numbers entered as well.  The first three numbers together have a 75% chance of winning, whereas the fourth number has a 25% chance.
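To see why every input ends up with the same chance, here is a short derivation (my own working, not something from the interview).  The k-th input must win its own roll with probability 1/k and then survive every later roll, where the j-th input replaces it with probability 1/j:

```latex
P(\text{input } k \text{ is the final winner after } n \text{ inputs})
  = \frac{1}{k} \prod_{j=k+1}^{n} \left(1 - \frac{1}{j}\right)
  = \frac{1}{k} \cdot \frac{k}{k+1} \cdot \frac{k+1}{k+2} \cdots \frac{n-1}{n}
  = \frac{1}{k} \cdot \frac{k}{n}
  = \frac{1}{n}
```

The product telescopes, so the result does not depend on k.  With the sample input, each of the 5 entries has a 1/5 chance, and since 1 was entered twice it wins 40% of the time.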

Final Solution:

    //requires: import java.util.Random; import java.util.Scanner;
    public static void RandomSelectorAttempt2(){

        Scanner scanner = new Scanner(System.in);
        int n = 0;
        Integer currentWinner = null;
        Random randomGenerator = new Random();
        int rand;
        String line;

        System.out.println("Enter a list of integers.  Once complete, type \"DONE\".");
        while(scanner.hasNextLine()){
            line = scanner.nextLine().trim();
            if(isInteger(line)){
                n++;//count this number first so it wins with probability 1/n
                rand = randomGenerator.nextInt(n);//uniform in 0..n-1
                System.out.println(n + "th random = " + rand);
                if(rand == n-1)//always true when n == 1, so the first number is always selected
                    currentWinner = Integer.valueOf(line);
            }
            else if(line.equalsIgnoreCase("DONE")){
                break;
            }
            else{
                System.out.println("ERROR:  invalid value.");
            }
        }
        scanner.close();

        if(currentWinner != null)
            System.out.println(currentWinner);
        else
            System.out.println("No integers entered.");
    }

    public static boolean isInteger(String s) {
        try{
              Integer.parseInt(s);
        }catch(NumberFormatException e){
            return false;
        }
       return true;
    }

The first solution had a space complexity of O(n) (n being the number of numbers stored), whereas this new solution has a space complexity of O(1).
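One way to gain some confidence in the O(1) approach is to run it many times over the sample input and look at how often each value wins.  Here is a minimal sketch of such a check.  The class, the method names, and the fixed seed are my own assumptions, and the stream is passed in as an array only to keep the test self-contained; the selection loop itself never looks at more than one value at a time.

```java
import java.util.Random;

class StreamingSelector {
    // Picks one value from the stream so that each entry is equally likely,
    // keeping only the current winner and a count.
    static int select(int[] stream, Random random) {
        if (stream.length == 0) {
            throw new IllegalArgumentException("stream must not be empty");
        }
        int winner = stream[0];
        int n = 0;
        for (int value : stream) {
            n++; // count the new value before rolling
            if (random.nextInt(n) == n - 1) { // true with probability 1/n
                winner = value;
            }
        }
        return winner;
    }

    // Fraction of trials in which the selected value equals target.
    static double frequencyOf(int target, int[] stream, int trials, long seed) {
        Random random = new Random(seed);
        int hits = 0;
        for (int t = 0; t < trials; t++) {
            if (select(stream, random) == target) hits++;
        }
        return (double) hits / trials;
    }
}
```

With the sample input 1, 5, 1, 4, 2 and a large number of trials, the frequency of 1 should land close to 0.4 and the frequency of each other value close to 0.2.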