HackerRank - Search in a 2-D grid

Today, I managed to complete HackerRank's Grid Search albeit with a little difficulty. In a nutshell, the problem requires us to find a 2-D rectangular pattern in a 2-D rectangular grid. All contents of the pattern and the grid are digits from 0-9.

For example:

The given grid is:

1 2 3 4 5 6 7 8 9 0
0 9 8 7 6 5 4 3 2 1
1 1 1 1 1 1 1 1 1 1
1 1 1 1 1 1 1 1 1 1
2 2 2 2 2 2 2 2 2 2

The given pattern is:

8 7 6 5 4 3
1 1 1 1 1 1
1 1 1 1 1 1

It is not difficult to tell that the pattern exists in the center of the grid (highlighted in red):

1 2 3 4 5 6 7 8 9 0
0 9 8 7 6 5 4 3 2 1
1 1 1 1 1 1 1 1 1 1
1 1 1 1 1 1 1 1 1 1
2 2 2 2 2 2 2 2 2 2

The grid's dimensions can be at most 1000 * 1000 and the pattern can be at most as large as the grid itself.

Problem source: here

What would be your algorithm to solve this problem?

On first thought, the naive solution should come to mind: brute-force matching. For every cell in the grid, try to find the pattern in the grid starting at that cell. This solution clearly works in O(n^2*m^2) where n is the number of rows and m is the number of columns in the grid. This would clearly fail as the worse case number of operations is (a constant factor of) 1e12. Strangely, the solutions that I found online managed to pass with this brute force approach.

Is there a better solution?

I thought about this for 1 day, trying to figure out an algorithm that works in quadratic time. That means that we would perform O(n*m) or (a constant factor of) 1e6 at most. However, I soon figured that this was quite an impossible task as a 1-D pattern would already take quadratic time. So my algorithm had to be at least O(n^2*m) or O(n*m^2).

I managed to devise an O(n^2*m) algorithm:

To make things clearer, we will use gi and gj to represent the row and column indexes of the grid and pi and pj to represent those of the pattern.
  • Use initial stage of KMP algorithm to preprocess (build the match array) for each row of the pattern matrix.
  • For each row, pi, in the pattern matrix:
    • Search the entire grid for all of its occurrences using KMP search algorithm.
    • If there is an occurence of the pattern starting at (gi,gj), i.e. (pi,0) matches (gi,gj), (pi,1) matches (gi,gj+1) and so on, check if the cell above it,  (gi-1,gj) was matched with (pi-1,0), in the pattern. We can easily do this using an array of set/hashmap. If there is a match, store (gi,gj) into a set/hashmap (so that pi+1 can "see" this match).
  • If the last set/hashmap in the array is not empty after processing all pi, that implies that we have found the pattern because row pi was found below row pi-1 in the grid for every pi > 0 (when pi = 0, we don't need to check this condition because it does not have any row above itself).
The complexity of this solution depends on the data structure used to track the matched cells in the grid. I used the set data structure from C++ STL. So the complexity of my solution is O(n^2*mlog(n)) which would be O(n^2*m) if I used unordered_set instead (but strangely there was no difference in runtime), which should be much better than the brute force approach. Managed to get AC with this algorithm. Yay!

AC status from HackerRank

Comments

  1. A note to readers: the brute force algorithm works faster for this problem due to the distribution of data in the test cases. I have designed a heavily skewed input that will cause the brute force strategy to run much slower than my efficient algorithm. See the following test case:

    Grid: 1000 * 1000 grid consisting of all 'a' but the last letter is 'b'
    Pattern: 100 * 100 matrix consisting of all 'a' but the last letter is 'b'

    ReplyDelete
  2. Update: I have devised several optimisations for my algorithm. Now it is much faster than the brute force algorithm under any conditions.

    ReplyDelete

Post a Comment

Popular posts from this blog

USACO - Broken Necklace

Efficient learning in Competitive Programming