For text detection we used EAST (An Efficient and Accurate Scene Text Detector).
We simply merge the detected bounding boxes into one larger box, which will most probably contain the title.
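The merging step can be sketched as follows. This is only a minimal illustration: it assumes each detected box is an axis-aligned tuple (x_min, y_min, x_max, y_max) in pixel coordinates, and the function name merge_boxes is hypothetical, not taken from the original code.

```python
def merge_boxes(boxes):
    """Return the smallest axis-aligned box enclosing all input boxes.

    Assumption: each box is (x_min, y_min, x_max, y_max) in pixels.
    Returns None when no boxes were detected.
    """
    if not boxes:
        return None
    x_min = min(b[0] for b in boxes)
    y_min = min(b[1] for b in boxes)
    x_max = max(b[2] for b in boxes)
    y_max = max(b[3] for b in boxes)
    return (x_min, y_min, x_max, y_max)


# Example with two made-up word boxes lying on the same text line.
word_boxes = [(10, 20, 50, 40), (45, 18, 120, 42)]
title_box = merge_boxes(word_boxes)
print(title_box)
```

The merged box can then be used to crop the title region out of the input image.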
We apply EAST only to the title image.
We use Tesseract for text recognition on the image.
The output text:
(THIRD EDITION Textbook of Geotechnical Enoineering)
We applied some simple text processing methods:
-removing punctuation.
-removing single characters from the list of results.
-removing duplicates.
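The three processing steps above could look roughly like this. It is a sketch under assumptions, not the original implementation: the function name clean_tokens is hypothetical, the text is split on whitespace, only ASCII punctuation is stripped, and duplicates are compared case-insensitively while the first spelling is kept.

```python
import string


def clean_tokens(text):
    """Clean raw OCR text and return a list of unique tokens."""
    # 1. Remove punctuation (ASCII punctuation only, an assumption).
    table = str.maketrans("", "", string.punctuation)
    tokens = text.translate(table).split()

    # 2. Remove single characters from the list of results.
    tokens = [t for t in tokens if len(t) > 1]

    # 3. Remove duplicates, keeping the first occurrence and its order.
    #    Comparing in lower case is an assumption of this sketch.
    seen = set()
    unique = []
    for t in tokens:
        key = t.lower()
        if key not in seen:
            seen.add(key)
            unique.append(t)
    return unique


raw = "(THIRD EDITION Textbook of Geotechnical Enoineering)"
print(clean_tokens(raw))
```

Note that these steps do not correct recognition errors such as "Enoineering"; they only clean up the token list.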