Technical Papers Parallel Session-V: Urdu optical character recognition technique using point feature matching; a generic approach
Abstract/Description
The complexity of Urdu fonts makes OCR of newspaper text a subject of active research. An Urdu OCR system is typically tied to a single font size: if it is built for font size 12, it needs a database covering every character and word at size 12, and supporting another size of the same font means building a database for that size as well. An OCR technique should instead be generic, so that font size does not matter. The objective was to create a technique that can be applied to Urdu script at any font size and that tolerates the variation in characters and words caused by the dispersal of ink in Urdu newspaper clippings. In this paper, the authors develop a technique that applies point feature matching to cropped Urdu newspaper clippings set in the Jameel Noori Nastaleeq font and converts them into editable Unicode text.
Keywords
Urdu OCR, Point feature matching, Jameel Noori Nastaleeq, Generic font size, Image processing
Location
C-10, AMAN CED
Session Theme
Technical Papers Parallel Session-V (Information Retrieval)
Session Type
Parallel Technical Session
Session Chair
Dr. Imran Hayee
Start Date
13-12-2015 2:50 PM
End Date
13-12-2015 3:10 PM
Recommended Citation
Khan, W. Q., & Khan, R. Q. (2015). Technical Papers Parallel Session-V: Urdu optical character recognition technique using point feature matching; a generic approach. International Conference on Information and Communication Technologies. Retrieved from https://ir.iba.edu.pk/icict/2015/2015/25