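At the start of 2024, the team's paper "Learning from Human Educational Wisdom: A Student-Centered Knowledge Distillation Method" was accepted by IEEE Transactions on Pattern Analysis and Machine Intelligence (CCF A, IF = 23.6), a top international journal in artificial intelligence. A student is the first author and I am the corresponding author. The work mainly uses knowledge distillation to enable lightweight small models to achieve the performance of large cloud-based models on resource-constrained terminal devices such as smartphones, cameras, robots, and unmanned ground and aerial vehicles.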
Abstract: Existing studies on knowledge distillation typically focus on teacher-centered methods, in which the teacher network is trained according to its own standards before transferring the learned knowledge to a student network. However, due to differences in network structure between the teacher and the student, the knowledge learned by the former may not be what the latter needs. Inspired by human educational wisdom, this paper proposes a Student-Centered Distillation (SCD) method that enables the teacher network to adjust its knowledge transfer according to the student network’s needs. We implement SCD based on several principles of human educational wisdom; for example, the teacher network identifies and learns the knowledge desired by the student network on the validation set, and then transfers it to the student through the training set. To address the problems of current knowledge deficiency, hard sample learning, and knowledge forgetting that a student network faces during learning, we introduce Proportional-Integral-Derivative (PID) algorithms from the automation field and improve them so that they effectively identify the knowledge the student network currently requires. Furthermore, we propose a curriculum learning-based fuzzy strategy and apply it to the proposed PID control algorithm, so that the student network in SCD can actively attend to challenging samples after it has acquired certain knowledge. The overall performance of SCD is verified on multiple tasks by comparing it with state-of-the-art methods. Experimental results show that our student-centered distillation method outperforms existing teacher-centered ones.
Keywords: Knowledge distillation, human educational wisdom, student-centered, fuzzy PID, curriculum learning
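To make the two building blocks named in the abstract more concrete, here is a minimal Python sketch. It is not the paper's SCD implementation. It shows (1) the standard temperature-scaled distillation loss commonly used in teacher-centered methods, and (2) a textbook discrete PID signal computed from a student's per-sample error. Reading the proportional, integral, and derivative terms as rough proxies for current deficiency, persistently hard samples, and forgetting is an assumption made here for illustration only; the names `kd_loss`, `PIDSignal`, and the gain values are hypothetical.

```python
import math
from dataclasses import dataclass
from typing import List, Optional


def softmax(logits: List[float], temperature: float = 1.0) -> List[float]:
    """Numerically stable softmax with temperature scaling."""
    scaled = [z / temperature for z in logits]
    peak = max(scaled)
    exps = [math.exp(s - peak) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def kd_loss(student_logits: List[float],
            teacher_logits: List[float],
            temperature: float = 4.0) -> float:
    """Standard distillation loss: T^2 * KL(teacher || student) on softened outputs."""
    p = softmax(teacher_logits, temperature)  # teacher's soft targets
    q = softmax(student_logits, temperature)  # student's soft predictions
    kl = sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0.0)
    return temperature ** 2 * kl


@dataclass
class PIDSignal:
    """Textbook discrete PID over a stream of per-sample student errors.

    Assumed interpretation (not taken from the paper):
      P term -> how wrong the student is right now,
      I term -> how wrong it has been over time (hard samples),
      D term -> whether the error is rising again (forgetting).
    """
    kp: float = 1.0   # hypothetical gains, chosen only for the example
    ki: float = 0.1
    kd: float = 0.5
    integral: float = 0.0
    prev_error: Optional[float] = None

    def update(self, error: float) -> float:
        self.integral += error
        derivative = 0.0 if self.prev_error is None else error - self.prev_error
        self.prev_error = error
        return self.kp * error + self.ki * self.integral + self.kd * derivative


if __name__ == "__main__":
    teacher = [1.0, 2.0, 3.0]
    controller = PIDSignal()
    # Simulated student logits over three evaluation rounds for one sample.
    for student in ([3.0, 2.0, 1.0], [2.0, 2.0, 2.0], [1.0, 2.0, 2.5]):
        error = kd_loss(student, teacher)
        print(f"error={error:.4f}  pid_signal={controller.update(error):.4f}")
```

In a setup like this, the PID output could serve as a per-sample weight that tells the teacher which samples the student currently needs help with. How SCD actually computes and applies such a signal, including the fuzzy curriculum strategy, is described in the paper itself and is not reproduced here.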