Efficient Neural Net Approaches in Metal Casting Defect Detection

One of the most pressing challenges in the steel manufacturing industry is the identification of surface defects. Early identification of casting defects can help boost performance, including by streamlining production processes. Although deep learning models have helped bridge this gap and automate most of these processes, there is a dire need for lightweight models that can be deployed easily and offer faster inference times. This research proposes a lightweight architecture that is efficient in terms of accuracy and inference time compared with sophisticated pre-trained CNN architectures such as MobileNet, Inception, and ResNet, as well as vision transformers. We experimented with methods that minimize computational requirements, such as depth-wise separable convolutions and a global average pooling (GAP) layer, along with techniques that improve architectural efficiency and with augmentations. Our results indicate that a custom model of 590K parameters with depth-wise separable convolutions outperformed pre-trained architectures such as ResNet and vision transformers in terms of accuracy (81.87%) and comfortably outdid architectures such as ResNet, Inception, and vision transformers in terms of inference time (12 ms). Blurpool outperformed the other techniques, with an accuracy of 83.98%. Augmentations had a paradoxical effect on model performance. We found no direct correlation between the use of depth-wise or 3x3 convolutions and inference time; they did, however, play a direct role in improving model efficiency by enabling the networks to go deeper and by decreasing the number of trainable parameters. Our work shows that custom networks with efficient architectures and faster inference times can be built without relying on pre-trained architectures.
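To make the parameter savings concrete, the following minimal sketch counts trainable parameters for a standard convolution versus a depth-wise separable one, and for a classifier head with and without global average pooling. The layer sizes (64 input channels, 128 output channels, a 7x7 feature map, 2 classes) and the function names are illustrative assumptions, not values or code from the paper, and the sketch does not reproduce the 590K-parameter model.

```python
def conv_params(k, c_in, c_out, bias=True):
    """Trainable parameters of a standard k x k convolution."""
    return k * k * c_in * c_out + (c_out if bias else 0)


def separable_conv_params(k, c_in, c_out, bias=True):
    """Trainable parameters of a depth-wise separable convolution:
    one k x k filter per input channel, then a 1x1 point-wise convolution."""
    depthwise = k * k * c_in + (c_in if bias else 0)
    pointwise = c_in * c_out + (c_out if bias else 0)
    return depthwise + pointwise


def head_params(h, w, channels, classes, use_gap):
    """Trainable parameters of a dense classifier head.
    With GAP each channel collapses to one value; without it the
    whole h x w x channels feature map is flattened."""
    features = channels if use_gap else h * w * channels
    return features * classes + classes


# Hypothetical layer sizes, chosen only for illustration.
standard = conv_params(3, 64, 128)
separable = separable_conv_params(3, 64, 128)
flatten_head = head_params(7, 7, 128, 2, use_gap=False)
gap_head = head_params(7, 7, 128, 2, use_gap=True)

print(f"standard 3x3 conv:  {standard} parameters")
print(f"separable 3x3 conv: {separable} parameters")
print(f"flatten + dense:    {flatten_head} parameters")
print(f"GAP + dense:        {gap_head} parameters")
```

Under these assumed sizes, the separable layer needs roughly one eighth of the parameters of the standard layer. This only illustrates why such layers allow a deeper network within a fixed parameter budget; it says nothing about inference time, which the abstract reports as not directly correlated with the choice of convolution.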
