The FormAssist: Deep learning methods for converting handwritten forms into digital assets

Srivathshan KS
Saurav Kumar
Shreekanth R
Midhilesh E
Parvej Reja Saleh

Abstract

Customer agreements must comply with statutory and legal requirements, which include that agreements be signed by hand. In India, paper forms are still prevalent in the banking industry. These forms require customers to fill in a template in capital letters and sign by hand to agree to the terms. This creates a challenge for analytical systems, because the data is captured outside the system and takes time to enter the data pipeline. Banking is poised to become digital; however, historical data is still needed to train models for current data applications. This limitation is a known bottleneck in designing data applications for real-time decision making. Optical Character Recognition (OCR) with capabilities comparable to those of a human has still not been achieved, despite decades of painstaking research. Because individual forms are idiosyncratic, researchers in industry and academia have directed their attention towards OCR. This paper presents an efficient model that captures offline handwritten forms and converts them into digital records. The model is based on deep learning methods and shows higher accuracy on our test set of real application forms from selected banks. We experimented with different feature extraction techniques to extract handwritten characters from the forms. Our experiments evolved over time towards a generalized solution and better results. The final model uses the relative positions of the characters to extract them from the forms and Convolutional Neural Networks (CNNs) to predict them. The paper also discusses a serverless architecture for hosting FormAssist as a REST API, with a model calibration feature to accommodate multiple types of forms.
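
As a rough illustration of extracting characters by relative position, the sketch below crops character cells whose coordinates are stored as fractions of the form's width and height, so one template could apply to scans at any resolution. This is not the authors' implementation: the function name `crop_relative_boxes`, the `(x, y, w, h)` box layout, and the template values are all assumptions made for the example. Each cropped cell would then be passed to a character classifier such as a CNN.

```python
import numpy as np

def crop_relative_boxes(form, boxes):
    """Crop character cells from a scanned form.

    form  : 2-D grayscale array of shape (height, width).
    boxes : iterable of (x, y, w, h), each a fraction of the form's
            width or height (hypothetical template format).
    """
    height, width = form.shape
    crops = []
    for x, y, w, h in boxes:
        # Convert fractional coordinates to pixel indices for this scan.
        left, top = int(round(x * width)), int(round(y * height))
        right, bottom = int(round((x + w) * width)), int(round((y + h) * height))
        crops.append(form[top:bottom, left:right])
    return crops

# Hypothetical template: three adjacent character cells on one line.
template = [(0.10 + i * 0.05, 0.20, 0.05, 0.10) for i in range(3)]

scan = np.zeros((400, 800), dtype=np.uint8)  # stand-in for a scanned form
scan[100, 130] = 255                         # a pixel inside the second cell
cells = crop_relative_boxes(scan, template)
```

Storing the template in relative units is one possible way to support several form types: calibrating for a new form would then mean supplying a new list of boxes rather than changing the cropping code.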

Article Details

How to Cite
KS, S., Kumar, S., R, S., E, M., & Saleh, P. R. (2020). The FormAssist: Deep learning methods for converting handwritten forms into digital assets. Probyto Journal of AI Research, 1(01). Retrieved from https://journal.probyto.com/index.php/probyto-ai-research/article/view/11

