Digitization, i.e., the process of converting information into a digital format, brings businesses both opportunities (e.g., increased productivity, disaster recovery, and environmentally friendly solutions) and challenges. One of the main challenges is to accurately classify the numerous scanned documents that customers upload every day as part of routine business processes. For example, banking processes (e.g., applying for a loan) or applications to the government Registry of Births, Deaths, and Marriages (BDM) may involve uploading several documents, such as a driver's license and a passport. Few studies address this challenge as an image classification problem, and although those that exist use various methods, a more accurate model is still required. This study proposes a robust fusion model to identify the type of an identity document accurately. The proposed approach combines two different methods, which classify images based on their visual features and their text features, respectively. A novel model based on statistics and regression is proposed to calculate the confidence level for the feature-based classifier. A fuzzy-mean fusion model is proposed to combine the classifier results according to their confidence scores. The proposed approach has been implemented in Python and experimentally validated on synthetic and real-world datasets. The performance of the proposed model is evaluated using Receiver Operating Characteristic (ROC) curve analysis.
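The idea of combining two classifiers according to their confidence can be illustrated with a minimal sketch. This is not the fuzzy-mean fusion model of the study, whose details are not given in the abstract; it is a plain confidence-weighted mean, shown only to make the general idea concrete. The function name `fuse_scores`, the class labels, and all probability and confidence values are hypothetical.

```python
def fuse_scores(visual_probs, text_probs, visual_conf, text_conf):
    """Confidence-weighted mean of two class-probability lists.

    Illustrative assumption only: a simple weighted average stands in
    for the study's fuzzy-mean fusion, which the abstract does not specify.
    """
    if len(visual_probs) != len(text_probs):
        raise ValueError("both classifiers must score the same classes")

    total = visual_conf + text_conf
    if total <= 0:
        # No usable confidence from either classifier: plain average.
        w_visual = w_text = 0.5
    else:
        w_visual = visual_conf / total
        w_text = text_conf / total

    fused = [w_visual * v + w_text * t
             for v, t in zip(visual_probs, text_probs)]

    # Renormalize so the fused scores sum to 1 when possible.
    norm = sum(fused)
    return [f / norm for f in fused] if norm > 0 else fused


# Hypothetical labels and scores for one scanned document.
labels = ["driver_license", "passport", "other"]
fused = fuse_scores(
    visual_probs=[0.6, 0.3, 0.1],  # visual-feature classifier output
    text_probs=[0.2, 0.7, 0.1],    # text-feature classifier output
    visual_conf=0.5,               # assumed confidence scores
    text_conf=0.9,
)
predicted = labels[fused.index(max(fused))]
print(predicted, [round(f, 3) for f in fused])
```

In this example the text-based classifier has the higher confidence, so its preference for the second class outweighs the visual classifier's preference for the first.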