Learning style refers to the manner in which an individual acquires new knowledge. As suggested by the VARK model, humans have different learning preferences, such as visual and auditory, for acquiring and effectively processing information. Inspired by this concept, our work explores the idea of mixed information sharing combined with model compression in the context of Knowledge Distillation (KD) and Mutual Learning (ML). Unlike conventional techniques that share the same type of knowledge with all networks, we propose training individual networks with different forms of information to enhance the learning process. We formulate a combined KD and ML framework with one teacher and two student networks that share or exchange information in the form of predictions and feature maps. Comprehensive experiments on benchmark classification and segmentation datasets demonstrate that, at 15% compression, the ensemble of networks trained with diverse forms of knowledge outperforms conventional techniques both quantitatively and qualitatively.
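To make the mixed-sharing idea concrete, below is a minimal PyTorch sketch of one training step: the teacher passes predictions to one student (response-based KD), feature maps to the other (feature-based KD), and the two students exchange predictions (ML). The function and module names, the assumption that each network returns a `(logits, features)` pair, the `adapter` that projects student features to the teacher's feature dimensions, and the temperature and loss weights are all illustrative assumptions, not the paper's exact formulation.

```python
import torch
import torch.nn.functional as F


def kd_prediction_loss(student_logits, teacher_logits, T=4.0):
    """KL divergence between temperature-softened student and teacher predictions."""
    return F.kl_div(
        F.log_softmax(student_logits / T, dim=1),
        F.softmax(teacher_logits / T, dim=1),
        reduction="batchmean",
    ) * (T * T)


def kd_feature_loss(student_feat, teacher_feat, adapter):
    """MSE between projected student feature maps and teacher feature maps."""
    return F.mse_loss(adapter(student_feat), teacher_feat)


def mixed_sharing_step(teacher, student_a, student_b, adapter, x, y,
                       alpha=0.5, beta=0.5, gamma=0.5):
    # Teacher is frozen: it only provides targets, no gradient flows into it.
    with torch.no_grad():
        t_logits, t_feat = teacher(x)
    a_logits, _ = student_a(x)
    b_logits, b_feat = student_b(x)

    # Supervised cross-entropy for both students on the ground-truth labels.
    ce = F.cross_entropy(a_logits, y) + F.cross_entropy(b_logits, y)

    # Student A learns from the teacher's predictions (response-based KD).
    kd_a = kd_prediction_loss(a_logits, t_logits)
    # Student B learns from the teacher's feature maps (feature-based KD).
    kd_b = kd_feature_loss(b_feat, t_feat, adapter)

    # Students exchange predictions with each other (mutual learning);
    # each peer's output is detached so it acts as a fixed target this step.
    ml = (kd_prediction_loss(a_logits, b_logits.detach())
          + kd_prediction_loss(b_logits, a_logits.detach()))

    return ce + alpha * kd_a + beta * kd_b + gamma * ml
```

In this sketch the ensemble prediction at test time would simply average the two students' softmax outputs; the relative weights `alpha`, `beta`, and `gamma` balancing KD and ML terms would be tuned per dataset.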