The idea is to search document images for a particular image patch (of a word) without doing OCR. Available literature for the techniques I am following:
I tried popular feature extraction and matching algorithms such as SURF but have not gotten good results yet. I still need to explore how to use them more effectively and confirm whether they are suitable for this task.
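To make the patch-search task concrete, here is a minimal sketch of a brute-force baseline. It is not the SURF approach mentioned above and not the code in this repository: it assumes the page and the word patch are already binarized into lists of 0/1 rows, and it simply slides the patch over the page and counts differing pixels. The function name `find_patch` and the `max_mismatch` parameter are hypothetical, chosen only for illustration.

```python
def find_patch(page, patch, max_mismatch=0):
    """Return (row, col) top-left positions where `patch` matches `page`.

    `page` and `patch` are lists of rows of 0/1 pixels (an assumed,
    already-binarized representation). A position counts as a match when
    at most `max_mismatch` pixels differ.
    """
    patch_h, patch_w = len(patch), len(patch[0])
    page_h, page_w = len(page), len(page[0])
    hits = []
    for r in range(page_h - patch_h + 1):
        for c in range(page_w - patch_w + 1):
            mismatches = 0
            for dr in range(patch_h):
                for dc in range(patch_w):
                    if page[r + dr][c + dc] != patch[dr][dc]:
                        mismatches += 1
                if mismatches > max_mismatch:
                    break  # too different already, skip the remaining rows
            if mismatches <= max_mismatch:
                hits.append((r, c))
    return hits


# Toy example: a 4x6 "page" containing the same 2x2 shape twice.
page = [
    [0, 0, 0, 0, 0, 0],
    [0, 1, 1, 0, 0, 0],
    [0, 1, 0, 0, 1, 1],
    [0, 0, 0, 0, 1, 0],
]
patch = [
    [1, 1],
    [1, 0],
]
print(find_patch(page, patch))  # [(1, 1), (2, 4)]
```

Exact pixel comparison like this breaks down under scaling, skew, or noise in scanned pages, which is the motivation for trying feature-based matching instead.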
Compilation instructions are given in comments in the source files. One file uses OpenCV SURF features; the other programs depend on Leptonica for all image-related operations.