Underwater automatic target recognition (UATR) has long been a challenging research topic in ocean engineering. Although deep learning has created opportunities for target recognition on land and in the air, deep-learning-based underwater target recognition has lagged behind due to limited sensor performance and the scarcity of trainable data. This letter proposes a framework for learning the visual representation of underwater acoustic imagery, built around a transformer-based style transfer model. The framework replaces the low-level texture features of optical images with the visual features of underwater acoustic imagery while preserving the original high-level semantic content. It can thus fully exploit rich optical image datasets to generate a pseudo-acoustic image dataset, which serves as the initial training samples for an underwater acoustic target recognition model. The experiments use dual-frequency identification sonar (DIDSON) as the underwater acoustic data source and take fish, among the most common marine creatures, as the research subject. Experimental results show that the proposed method generates high-quality, high-fidelity pseudo-acoustic samples, achieves the goal of acoustic data augmentation, and supports research on underwater acoustic-optical image domain transfer.
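The content/style decomposition the abstract describes — swapping low-level texture statistics while preserving high-level semantics — can be illustrated with the classic Gram-matrix formulation of style transfer loss. This is a minimal NumPy sketch of that general idea, not the letter's transformer-based model; the function names and feature shapes are illustrative assumptions.

```python
import numpy as np

def gram_matrix(feats):
    # feats: (channels, h*w) feature map from some encoder.
    # The Gram matrix captures channel co-activation statistics,
    # i.e. low-level texture "style", discarding spatial layout.
    c, n = feats.shape
    return feats @ feats.T / (c * n)

def transfer_losses(gen, content, style):
    # gen/content/style: (channels, h*w) feature maps.
    # Content loss keeps high-level semantics of the optical image;
    # style loss pulls texture statistics toward the acoustic image.
    content_loss = np.mean((gen - content) ** 2)
    style_loss = np.mean((gram_matrix(gen) - gram_matrix(style)) ** 2)
    return content_loss, style_loss

rng = np.random.default_rng(0)
f = rng.standard_normal((8, 16))          # stand-in optical features
s = rng.standard_normal((8, 16))          # stand-in acoustic features
c_loss, s_loss = transfer_losses(f, f, s)
# content loss is zero when generated and content features coincide
```

Minimizing a weighted sum of the two losses over the generated image (or, as in the letter, training a feed-forward transformer to do so) yields pseudo-acoustic samples that keep the optical image's semantics.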