LEADER 02277oam 2200433zu 450
001 9910145148603321
005 20241212215544.0
010 $a9781509089468
010 $a1509089462
035 $a(CKB)1000000000695725
035 $a(SSID)ssj0000451170
035 $a(PQKBManifestationID)12176423
035 $a(PQKBTitleCode)TC0000451170
035 $a(PQKBWorkID)10459110
035 $a(PQKB)11240079
035 $a(NjHacI)991000000000695725
035 $a(EXLCZ)991000000000695725
100 $a20160829d2007 uy
101 0 $aeng
135 $aur|||||||||||
181 $ctxt
182 $cc
183 $acr
200 10$aMachine Learning and Applications; Proceedings: International Conference on Machine Learning and Applications (6th: 2007: Cincinnati, Ohio)
210 31$a[Place of publication not identified]$cIEEE Computer Society Press$d2007
215 $a1 online resource
300 $aBibliographic Level Mode of Issuance: Monograph
311 08$a9780769530697
311 08$a0769530699
330 $aAn optical character recognition (OCR) system with a high recognition rate is challenging to develop. One of the major contributors to OCR errors is smeared characters. Several factors, such as bad scanning quality and a poor binarization technique, lead to the smearing of characters. Typical approaches to character segmentation fall into three major categories: image-based, recognition-based, and holistic-based. Among these approaches, the segmentation path can be linear or non-linear. Our paper proposes a non-linear approach to segment characters on grayscale document images. Our method first determines whether characters are smeared together using general character features. The correct segmentation path is found using a shortest path approach. We achieved a segmentation accuracy of 95% over a set of about 2,000 smeared characters.
606 $aMachine learning$vCongresses
615 0$aMachine learning
676 $a006.31
700 $aWani$b M. Arif$01006177
801 0$bPQKB
906 $aPROCEEDING
912 $a9910145148603321
996 $aMachine Learning and Applications; Proceedings: International Conference on Machine Learning and Applications (6th: 2007: Cincinnati, Ohio)$92314962
997 $aUNINA