
Qingyang Hong, Professor


Email: qyhong@xmu.edu.cn


Professor, School of Informatics, Xiamen University

Vice Director, Speech Information Technical Committee, Chinese Information Processing Society of China

Standing Committee Member, National Conference on Man-Machine Speech Communication (NCMMSC)

Member, Subcommittee on Human Biometric Recognition Applications (SAC/TC100/SC2), National Technical Committee on Security and Protection Standardization


Research Interests

1. Speaker recognition

2. Speech recognition and speech synthesis

3. Large model technologies



Biography

Qingyang Hong is a professor at the School of Informatics, Xiamen University. His main research interests are speech recognition and speaker recognition. He has led three projects funded by the National Natural Science Foundation of China (NSFC) and two Innovation Fund projects of the Ministry of Science and Technology. He founded the Intelligent Speech Laboratory at Xiamen University and led the XMUSPEECH team to first place in two consecutive Oriental Language Recognition (OLR) challenges, and successfully developed speech recognition and synthesis systems for the Southern Min (Minnan) dialect. With extensive industrial R&D experience, he has collaborated with Huawei, HiSilicon, Hisense, TD Tech, and other well-known companies on a large number of intelligent speech projects; the core technologies have been deployed in Huawei smartphones, the Shuo Zan Minnanhua (Speak Southern Min) app, the Shengyun speech transcription service, and judicial, social security, securities, and electric power systems in more than ten provinces and municipalities across China. He is the author of the monograph Speech Recognition: Principles and Applications, which has sold more than 10,000 copies. He released ASV-Subtools, the first open-source speaker verification toolkit in China, supporting both academic research and industrial deployment. He served as chair of the National Symposium on Speaker Recognition Research and Applications (2020-2021) and as Vice Director of the Speech Information Technical Committee of the Chinese Information Processing Society of China. He received the Outstanding Author Award from Publishing House of Electronics Industry and the Outstanding Technical Cooperation Achievement Award from Huawei.



Research Projects (as PI)

■ "Speaker Diarization Based on Graph Structure Modeling," NSFC General Program, 2023.1-2026.12.

■ "Speaker Feature Extraction and Recognition in Complex Scenarios," NSFC General Program, 2019.1-2022.12.

■ "In-Ear Voiceprint Technology Cooperation Project," industry project (Huawei Device), 2018.8-2019.3.

■ "Key Back-End Technologies for Free-Field Voice Interaction," industry project (Huawei Technologies), 2017.12-2018.12.

■ "Key Technologies for Voice Interaction," industry project (Huawei Technologies), 2016.4-2018.12.

■ "Southern Min Intelligent Spoken Dialogue System," industry project, 2017.8-2018.7.

■ "Cross-Channel Speaker Recognition Based on Transfer Learning," NSFC Young Scientists Fund, 2012.1-2014.12.

■ "Cross-Channel Multi-Speaker Recognition Based on GSV-SVM and Transfer Learning," Shenzhen Science and Technology Program, 2011.2-2014.3.

■ "Intelligent Voiceprint Recognition Engine tsVPR2.0," industry project (Xiamen Tiancong), 2012.08-2013.03.

■ "Family-Call Management System Based on Voiceprint Recognition," Innovation Fund of the Ministry of Science and Technology, 2011.2-2013.2.

■ "Intelligent Voiceprint Recognition Engine tsVPR1.0," industry project (Xiamen Tiancong), 2009.11-2010.12.

■ "Embedded Speech Recognition System for Smart Home Appliances," Innovation Fund of the Ministry of Science and Technology, 2007.7-2009.9.

■ "Speech-Controlled Power Socket for Smart Home Appliances," Xiamen Science and Technology Program, 2007.3-2008.12.

■ "Embedded Speech Recognition System," Xiamen University start-up fund, 2005.1-2005.12.



Monographs

Qingyang Hong and Lin Li, Speech Recognition: Principles and Applications, Publishing House of Electronics Industry, 2nd edition, February 2023.


Open-Source Toolkit (Speaker Verification)

ASV-Subtools:  https://github.com/Snowdar/asv-subtools


Selected Publications

■ Hukai Huang, Shenghui Lu, Yahui Shan, He Qu, Fengrun Zhang, Wenhao Guan, Qingyang Hong, Lin Li, “Dynamic Language Group-based MoE: Enhancing Code-Switching Speech Recognition with Hierarchical Routing,” ICASSP2025.

■ Jianglong Yao, Shenghui Lu, Pengyu Ren, Kaidi Wang, Qingyang Hong, Lin Li, "A Noisy Label Filter based on GMM Binary Classification for Speaker Verification," ICASSP2025.

■ Kaidi Wang, Wenhao Guan, Shenghui Lu, Jianglong Yao, Lin Li, Qingyang Hong, "SlimSpeech: Lightweight and Efficient Text-to-Speech with Slim Rectified Flow," ICASSP2025.

■ Jiayan Lin, Shenghui Lu, Hukai Huang, Wenhao Guan, Binbin Xu, Hui Bu, Qingyang Hong, Lin Li, “MinSpeech: A Corpus of Southern Min Dialect for Automatic Speech Recognition,” INTERSPEECH 2024.

■ Wenhao Guan, Kaidi Wang, Wangjin Zhou, Yang Wang, Feng Deng, Hui Wang, Lin Li, Qingyang Hong, Yong Qin, “LAFMA: A Latent Flow Matching Model for Text-to-Audio Generation,” INTERSPEECH 2024.

■ Yishuang Li, Wenhao Guan, Hukai Huang, Shiyu Miao, Qi Su, Lin Li, Qingyang Hong, “Efficient Integrated Features Based on Pre-trained Models for Speaker Verification,” INTERSPEECH 2024.

■ Wenhao Guan, Qi Su, Haodong Zhou, Shiyu Miao, Xingjia Xie, Lin Li, Qingyang Hong, “ReFlow-TTS: A Rectified Flow Model for High-fidelity Text-to-Speech,” ICASSP 2024.

■ Tao Li, Feng Wang, Wenhao Guan, Lingyan Huang, Qingyang Hong, Lin Li, “Improving Multi-speaker ASR with Overlap-aware Encoding and Monotonic Attention,” ICASSP 2024.

■ Longjie Luo, Tao Li, Lin Li, Qingyang Hong, “The XMUSPEECH System for Audio-Visual Target Speaker Extraction in MISP 2023 Challenge,” ICASSP 2024.

■ Yishuang Li, Hukai Huang, Zhicong Chen, Wenhao Guan, Jiayan Lin, Lin Li, Qingyang Hong, “SR-Hubert: an Efficient Pre-trained Model for Speaker Verification,” ICASSP 2024.

■ Wenhao Guan, Yishuang Li, Tao Li, Hukai Huang, Feng Wang, Jiayan Lin, Lingyan Huang, Lin Li, Qingyang Hong, “MM-TTS: Multi-Modal Prompt Based Style Transfer for Expressive Text-to-Speech Synthesis,” Proceedings of the AAAI Conference on Artificial Intelligence, Volume 38, Issue 16, Pages 18117-18125, March 25, 2024.

■ Feng Wang, Lingyan Huang, Tao Li, Qingyang Hong, Lin Li, “Conformer-based Language Embedding with Self-Knowledge Distillation for Spoken Language Identification,” INTERSPEECH 2023.

■ Lingyan Huang, Tao Li, Haodong Zhou, Qingyang Hong, Lin Li, “Cross-Modal Semantic Alignment before Fusion for Two-Pass End-to-End Spoken Language Understanding,” INTERSPEECH 2023.

■ Wenhao Guan, Tao Li, Yishuang Li, Hukai Huang, Qingyang Hong, Lin Li, “Interpretable Style Transfer for Text-to-Speech with ControlVAE and Diffusion Bridge,” INTERSPEECH 2023.

■ Dexin Liao, Tao Jiang, Feng Wang, Lin Li, Qingyang Hong, “Towards A Unified Conformer Structure: from ASR to ASV Task,” ICASSP2023.

■ Qiulin Wang, Wenxuan Hu, Lin Li, Qingyang Hong, “Meta Learning with Adaptive Loss Weight for Low-Resource Speech Recognition,” ICASSP2023.

■ Tao Li, Haodong Zhou, Jie Wang, Qingyang Hong, Lin Li, “The XMU System for Audio-Visual Diarization and Recognition in MISP Challenge 2022,” ICASSP2023.

■ Jie Wang, Zhicong Chen, Haodong Zhou, Lin Li, Qingyang Hong, “Community Detection Graph Convolutional Network for Overlap-Aware Speaker Diarization,” ICASSP2023.

■ Zhicong Chen, Jie Wang, Wenxuan Hu, Lin Li, Qingyang Hong, “Unsupervised Speaker Verification Using Pre-Trained Model and Label Correction,” ICASSP2023.

■ Jie Wang, Yuji Liu, Binling Wang, Yiming Zhi, Song Li, Shipeng Xia, Jiayang Zhang, Feng Tong, Lin Li, Qingyang Hong, “Spatial-aware Speaker Diarization for Multi-channel Multi-party Meeting,” INTERSPEECH 2022.

■ Binling Wang, Feng Wang, Wenxuan Hu, Qiulin Wang, Jing Li, Dong Wang, Lin Li, Qingyang Hong, “Oriental Language Recognition (OLR) 2021: Summary and Analysis,” INTERSPEECH 2022.

■ Fuchuan Tong, Siqi Zheng, Haodong Zhou, Xingjia Xie, Qingyang Hong, Lin Li, “Deep Representation Decomposition for Rate-invariant Speaker Verification,” Odyssey 2022.

■ Lin Li, Fuchuan Tong, Qingyang Hong, “When Speaker Recognition Meets Noisy Labels: Optimizations for Front-Ends and Back-Ends,” IEEE/ACM Transactions on Audio, Speech, and Language Processing, vol. 30, pp. 1586-1599, 2022, doi: 10.1109/TASLP.2022.3169977.

■ Fuchuan Tong, Siqi Zheng, Min Zhang, Binling Wang, Song Li, Yafeng Chen, Hongbin Suo, Lin Li, Qingyang Hong, “Graph Convolutional Network Based Semi-supervised Learning on Multi-Speaker Meeting Data,” ICASSP 2022.

■ Yan Liu, Zheng Li, Lin Li and Qingyang Hong, “Phoneme-aware and Channel-wise Attentive Learning for Text Dependent Speaker Verification,” INTERSPEECH 2021.

■ Fuchuan Tong, Yan Liu, Song Li, Jie Wang, Lin Li and Qingyang Hong, “Automatic Error Correction for Speaker Embedding Learning with Noisy Label,” INTERSPEECH 2021.

■ Zheng Li, Yan Liu, Lin Li and Qingyang Hong, “Additive Phoneme-aware Margin Softmax Loss for Language Recognition,” INTERSPEECH 2021.

■ Song Li, Beibei Ouyang, Fuchuan Tong, Dexin Liao, Lin Li and Qingyang Hong, “Real-time End-to-End Monaural Multi-Speaker Speech Recognition,” INTERSPEECH 2021.

■ Dexin Liao, Jing Li, Yiming Zhi, Song Li, Qingyang Hong, and Lin Li, “An Integrated Framework for Two-pass Personalized Voice Trigger,” INTERSPEECH 2021.

■ Jing Li, Binling Wang, Yiming Zhi, Zheng Li, Lin Li, Qingyang Hong, and Dong Wang, “Oriental Language Recognition (OLR) 2020: Summary and Analysis,” INTERSPEECH 2021.

■ Lin Li, Zheng Li, Yan Liu, Qingyang Hong, “Deep joint learning for language recognition,” Neural Networks, 141 (2021) 72-86.

■ Song Li, Beibei Ouyang, Dexin Liao, Shipeng Xia, Lin Li, Qingyang Hong, “End-to-end Multi-accent Speech Recognition with Unsupervised Accent Modelling,” ICASSP 2021.

■ Song Li, Beibei Ouyang, Lin Li, Qingyang Hong, “Light-TTS: Lightweight Multi-speaker Multi-lingual Text-to-speech,” ICASSP 2021.

■ Fuchuan Tong, Miao Zhao, Jianfeng Zhou, Hao Lu, Zheng Li, Lin Li, Qingyang Hong, “ASV-Subtools: Open Source Toolkit for Automatic Speaker Verification”, ICASSP 2021.

■ Song Li, Beibei Ouyang, Lin Li, Qingyang Hong, “LightSpeech: Lightweight Non-regressive Multi-speaker Text-to-speech,” IEEE Spoken Language Technology Workshop (SLT 2021), Jan 2021, Shenzhen, China.

■ Zheng Li, Miao Zhao, Lin Li, Qingyang Hong, “Multi-feature Learning with Canonical Correlation Analysis Constraint for Text-independent Speaker Verification,” IEEE Spoken Language Technology Workshop (SLT 2021), Jan 2021, Shenzhen, China.

■ Zheng Li, Miao Zhao, Qingyang Hong, Lin Li, Zhiyuan Tang, Dong Wang, Liming Song and Cheng Yang, “AP20-OLR Challenge: Three Tasks and Their Baselines,” APSIPA ASC 2020.

■ Song Li, Lin Li, Qingyang Hong and Lingling Liu, “Improving Transformer-based Speech Recognition With Unsupervised Pre-training and Multi-task Semantic Knowledge Learning,” INTERSPEECH 2020.

■ Tao Jiang, Miao Zhao, Lin Li, Qingyang Hong, “The XMUSPEECH System for Short-Duration Speaker Verification Challenge 2020,” INTERSPEECH 2020.

■ Zheng Li, Miao Zhao, Jing Li, Lin Li, Qingyang Hong, “On the Usage of Multi-feature Integration for Speaker Verification and Language Identification,” INTERSPEECH 2020.

■ Zheng Li, Miao Zhao, Jing Li, Yiming Zhi, Lin Li, Qingyang Hong, “The XMUSPEECH System for AP19-OLR Challenge,” INTERSPEECH 2020.

■ Hao Lu, Jianfeng Zhou, Miao Zhao, Wendian Lei, Qingyang Hong, Lin Li, “XMU-TS Systems for NIST SRE19 CTS Challenge”, ICASSP2020.

■ Jianfeng Zhou, Tao Jiang, Zheng Li, Lin Li, Qingyang Hong, “Deep Speaker Embedding Extraction with Channel-Wise Feature Responses and Additive Supervision Softmax Loss Function”, INTERSPEECH 2019.

■ Rongjin Li, Miao Zhao, Zheng Li, Lin Li, Qingyang Hong, “Anti-Spoofing Speaker Verification System with Multi-Feature Integration and Multi-Task Learning”, INTERSPEECH 2019.

■ Jianfeng Zhou, Tao Jiang, Lin Li, Qingyang Hong, Zhe Wang, Bingyin Xia, “Training Multi-Task Adversarial Network for Extracting Noise-robust Speaker Embedding”, ICASSP 2019.

■ Qingyang Hong, Lin Li, Jun Zhang, Lihong Wan, Huiyang Guo, “Transfer learning for PLDA-based speaker verification”, Speech Communication, 92:90-99, 2017.

■ Qingyang Hong, Lin Li, Jun Zhang, Lihong Wan, Feng Tong, “Transfer Learning for Speaker Verification on Short Utterances”, INTERSPEECH 2016.

■ Qingyang Hong, Jun Zhang, Lin Li, Lihong Wan, Feng Tong, “A Transfer Learning Method for PLDA-based Speaker Verification”, ICASSP 2016.

■ Lin Li, Lihong Wan, Qingyang Hong, Jun Zhang, Ming Li, “Speaker Recognition Based on Modified-Prior PLDA,” The 13th National Conference on Man-Machine Speech Communication (NCMMSC2015), October 2015, Tianjin, China.

■ Qingyang Hong, Lin Li, Ming Li, Ling Huang, Lihong Wan and Jun Zhang, “Modified-prior PLDA and Score Calibration for Duration Mismatch Compensation in Speaker Recognition System”, INTERSPEECH 2015.

■ Weicheng Cai, Ming Li, Lin Li and Qingyang Hong, “Duration Dependent Covariance Regularization in PLDA Modeling for Speaker Verification”, INTERSPEECH 2015.

■ Lin Li, Lihong Wan, Qingyang Hong, Jun Zhang, Ming Li, “Speaker Recognition System Based on Modified-Prior PLDA,” Journal of Tianjin University (Science and Technology), Aug. 2015, pp. 692-696.

■ Hong, Qingyang; Wang, Sheng; Liu, Zhijian, “A robust speaker-adaptive and text-prompted speaker verification system”, Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics), v 8833, p 385-393, 2014.

■ Wanli Chen, Qingyang Hong, Ximin Li, “MM-UBM for Text-Dependent Speaker Recognition”, 2012 Third IEEE/IET International Conference on Audio, Language and Image Processing (ICALIP2012), July 16-18, 2012, Shanghai, China.

■ Q.Y. Hong and S. Kwong, “A Discriminative Training Approach for Text-independent Speaker Recognition”, Signal Processing, 85 (7), July 2005, pp. 1449-1463.

■ Q.Y. Hong and S. Kwong, “A Genetic Classification Method for Speaker Recognition”, Engineering Applications of Artificial Intelligence, 18 (1), February 2005, pp. 13-19.

■ Q.Y. Hong and S. Kwong, Discriminative Training for Speaker Identification, Electronics Letters, 40 (4), February 2004, pp. 280-281.

■ Q.Y. Hong and S. Kwong, “Discriminative Training for Speaker Identification Based on Maximum Model Distance Algorithm”, ICASSP2004.


Patents

[1] Invention patent: A microphone array speech enhancement device that suppresses moving noise (Patent No. ZL 2012 1 0497016.9)

[2] Invention patent: A noise-robust voiceprint recognition device combining spectral subtraction and dynamic time warping (Patent No. ZL 2013 1 0370030.7)

[3] Invention patent: A microphone array speech enhancement device with sound source direction tracking, and its method (Patent No. ZL 2012 1 0320004.9)

[4] Invention patent: A text-prompted voiceprint access control system (Patent No. ZL 2013 1 0294975.5)

[5] Invention patent: A voiceprint-recognition-based method for detecting cheating in satisfaction surveys (Patent No. ZL 2013 1 0754586.6)

[6] Invention patent: A microphone array speech enhancement device adaptable to strong background noise (Patent No. ZL 2016 1 0080236.X)

[7] Invention patent: An acoustic-focusing microphone array device for long-distance sound pickup with phase self-correction (Patent No. ZL 2016 1 0080008.2)

[8] Invention patent: A microphone array speech enhancement device for conditions without a direct sound path (Patent No. ZL 2017 1 0408164.1)

[9] Invention patent: A method for adding punctuation to text based on weighted finite-state transducers (Patent No. ZL 2018 1 1180949.9)

[10] Invention patent: A spoofed speech detection method based on deep neural networks (Patent No. ZL 2019 1 0590712.6)

[11] Invention patent: A method and system for integrating multiple classes of acoustic features based on deep neural networks (Patent No. ZL 2020 1 0073244.8)

[12] Invention patent: A high-quality speech synthesis method based on a rectified flow model (Patent No. ZL 2023 1 1587465.7)