Artificial neural networks (ANNs) require tremendous amounts of data to train. In classification models, however, many data features are similar across observations, which can increase training time without a significant improvement in performance. We therefore hypothesize that an ANN can be trained more efficiently using a better representative sample. To this end, we propose LAD Improved Iterative Training (LIIT), a novel ANN training approach that uses the large deviations principle to generate and iteratively update training samples in a fast and efficient setting. This is exploratory work with extensive opportunities for future research. The thesis presents this ongoing work with the following contributions: (1) We propose a novel ANN training method, LIIT, based on large deviations theory, in which no additional dimensionality reduction is needed to study high-dimensional data. (2) The LIIT approach uses a Modified Training Sample (MTS) that is generated and iteratively updated using a sampling strategy based on LAD anomaly scores. (3) The MTS is designed to be well representative of the training data by including the most anomalous observations in each class, ensuring that distinct patterns and features are learned from smaller samples. (4) We compare the classification performance of LIIT-trained ANNs with that of traditionally batch-trained counterparts.
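The per-class selection step behind the MTS can be illustrated with a minimal sketch. This is not the thesis's implementation: it assumes the LAD anomaly scores have already been computed elsewhere and are passed in as an array, and the function name `build_mts` and the per-class budget `k` are illustrative choices. The sketch simply keeps the `k` highest-scoring (most anomalous) observations from each class.

```python
import numpy as np

def build_mts(X, y, scores, k):
    """Sketch of a Modified Training Sample (MTS) selection step.

    Keeps the k most anomalous observations per class, where `scores`
    holds a precomputed anomaly score (e.g., a LAD score) for each row
    of X. Higher score = more anomalous.
    """
    selected = []
    for c in np.unique(y):
        cls_idx = np.where(y == c)[0]
        # Sort this class's indices by descending anomaly score.
        order = cls_idx[np.argsort(scores[cls_idx])[::-1]]
        selected.extend(order[:k])
    selected = np.asarray(selected)
    return X[selected], y[selected]
```

In the iterative LIIT setting described above, a step like this would be rerun as scores are updated, so the MTS tracks the currently most informative observations in each class.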