
Qingyang Hong, Ph.D.


Email: qyhong@xmu.edu.cn



Professor, School of Informatics, Xiamen University

Vice Director, Speech Information Technical Committee, Chinese Information Processing Society of China

Standing Committee Member, National Conference on Man-Machine Speech Communication (NCMMSC)

Member, Subcommittee on Human Biometric Recognition Applications, National Technical Committee on Security Protection Standardization (SAC/TC100/SC2)


Research Interests

1. Speaker recognition (voiceprint)

2. Speech recognition and speech synthesis

3. Large model technology



Biography

Qingyang Hong is a professor at the School of Informatics, Xiamen University. His main research interests are speech recognition and speaker recognition. He has been the principal investigator of three National Natural Science Foundation of China (NSFC) projects and two Innovation Fund projects of the Ministry of Science and Technology. He founded the Intelligent Speech Laboratory at Xiamen University, led the XMUSPEECH team to first place in two consecutive editions of the Oriental Language Recognition (OLR) challenge, and developed China's first Minnan (Hokkien) speech synthesis system. He also founded companies including Xiamen Tiancong and has extensive industrial R&D experience, collaborating with well-known companies such as Huawei, HiSilicon, Hisense, and TD Tech on a large number of intelligent speech projects; the core technologies have been deployed in Huawei smartphones, the "Shuo Zan Minnanhua" (Speak Hokkien) app, the Shengyun speech transcription service, and judicial, social security, securities, and electric power systems in more than a dozen provinces and cities across China. He authored the book Speech Recognition: Principles and Applications and released ASV-Subtools, an open-source toolkit for speaker verification, supporting academic research and industrial deployment at home and abroad. He served as chair of the National Symposium on Speaker Recognition Research and Applications (2020-2021) and as vice director of the Speech Information Technical Committee of the Chinese Information Processing Society of China. He received the "Outstanding Author Award" from Publishing House of Electronics Industry and the "Outstanding Technical Cooperation Achievement Award" from Huawei.



Funded Projects

■ "Research on Speaker Diarization Based on Graph Structure Modeling," NSFC General Program, Jan. 2023 – Dec. 2026.

■ "Research on Speaker Feature Extraction and Recognition in Complex Scenarios," NSFC General Program, Jan. 2019 – Dec. 2022.

■ "In-ear Voiceprint Technology Cooperation Project," industry project (Huawei Device), Aug. 2018 – Mar. 2019.

■ "Research on Key Back-end Technologies for Free-field Voice Interaction," industry project (Huawei Technologies), Dec. 2017 – Dec. 2018.

■ "Research on Key Technologies for Voice Interaction," industry project (Huawei Technologies), Apr. 2016 – Dec. 2018.

■ "Minnan (Hokkien) Intelligent Spoken Dialogue System," industry project, Aug. 2017 – Jul. 2018.

■ "Research on Cross-channel Speaker Recognition Based on Transfer Learning," NSFC Young Scientists Fund, Jan. 2012 – Dec. 2014.

■ "Research on Cross-channel Multi-speaker Recognition Based on GSV-SVM and Transfer Learning," Shenzhen Science and Technology Plan project, Feb. 2011 – Mar. 2014.

■ "Intelligent Voiceprint Recognition Engine tsVPR2.0," industry project (Xiamen Tiancong), Aug. 2012 – Mar. 2013.

■ "Family Phone Management System Based on Voiceprint Recognition," Innovation Fund of the Ministry of Science and Technology, Feb. 2011 – Feb. 2013.

■ "Intelligent Voiceprint Recognition Engine tsVPR1.0," industry project (Xiamen Tiancong), Nov. 2009 – Dec. 2010.

■ "Embedded Speech Recognition System for Smart Home Appliances," Innovation Fund of the Ministry of Science and Technology, Jul. 2007 – Sep. 2009.

■ "Speech Recognition Power Socket for Smart Home Appliances," Xiamen Science and Technology Plan, Mar. 2007 – Dec. 2008.

■ "Embedded Speech Recognition System," Xiamen University start-up fund, Jan. 2005 – Dec. 2005.



Monographs

Qingyang Hong and Lin Li, Speech Recognition: Principles and Applications (in Chinese), 2nd edition, Publishing House of Electronics Industry, February 2023.


Open-Source Toolkit (Speaker Verification)

ASV-Subtools:  https://github.com/Snowdar/asv-subtools


Selected Publications

■ Wenhao Guan, Qi Su, Haodong Zhou, Shiyu Miao, Xingjia Xie, Lin Li, Qingyang Hong, “ReFlow-TTS: A Rectified Flow Model for High-fidelity Text-to-Speech,” ICASSP 2024.

■ Tao Li, Feng Wang, Wenhao Guan, Lingyan Huang, Qingyang Hong, Lin Li, “Improving Multi-speaker ASR with Overlap-aware Encoding and Monotonic Attention,” ICASSP 2024.

■ Longjie Luo, Tao Li, Lin Li, Qingyang Hong, “The XMUSPEECH System for Audio-Visual Target Speaker Extraction in MISP 2023 Challenge,” ICASSP 2024.

■ Yishuang Li, Hukai Huang, Zhicong Chen, Wenhao Guan, Jiayan Lin, Lin Li, Qingyang Hong, “SR-Hubert: an Efficient Pre-trained Model for Speaker Verification,” ICASSP 2024.

■ Wenhao Guan, Yishuang Li, Tao Li, Hukai Huang, Feng Wang, Jiayan Lin, Lingyan Huang, Lin Li, Qingyang Hong, “MM-TTS: Multi-Modal Prompt Based Style Transfer for Expressive Text-to-Speech Synthesis,” Proceedings of the AAAI Conference on Artificial Intelligence, Volume 38, Issue 16, Pages 18117-18125, March 25, 2024.

■ Feng Wang, Lingyan Huang, Tao Li, Qingyang Hong, Lin Li, “Conformer-based Language Embedding with Self-Knowledge Distillation for Spoken Language Identification,” INTERSPEECH 2023.

■ Lingyan Huang, Tao Li, Haodong Zhou, Qingyang Hong, Lin Li, “Cross-Modal Semantic Alignment before Fusion for Two-Pass End-to-End Spoken Language Understanding,” INTERSPEECH 2023.

■ Wenhao Guan, Tao Li, Yishuang Li, Hukai Huang, Qingyang Hong, Lin Li, “Interpretable Style Transfer for Text-to-Speech with ControlVAE and Diffusion Bridge,” INTERSPEECH 2023.

■ Dexin Liao, Tao Jiang, Feng Wang, Lin Li, Qingyang Hong, “Towards A Unified Conformer Structure: from ASR to ASV Task,” ICASSP 2023.

■ Qiulin Wang, Wenxuan Hu, Lin Li, Qingyang Hong, “Meta Learning with Adaptive Loss Weight for Low-Resource Speech Recognition,” ICASSP 2023.

■ Tao Li, Haodong Zhou, Jie Wang, Qingyang Hong, Lin Li, “The XMU System for Audio-Visual Diarization and Recognition in MISP Challenge 2022,” ICASSP 2023.

■ Jie Wang, Zhicong Chen, Haodong Zhou, Lin Li, Qingyang Hong, “Community Detection Graph Convolutional Network for Overlap-Aware Speaker Diarization,” ICASSP 2023.

■ Zhicong Chen, Jie Wang, Wenxuan Hu, Lin Li, Qingyang Hong, “Unsupervised Speaker Verification Using Pre-Trained Model and Label Correction,” ICASSP 2023.

■ Jie Wang, Yuji Liu, Binling Wang, Yiming Zhi, Song Li, Shipeng Xia, Jiayang Zhang, Feng Tong, Lin Li, Qingyang Hong, “Spatial-aware Speaker Diarization for Multi-channel Multi-party Meeting,” INTERSPEECH 2022.

■ Binling Wang, Feng Wang, Wenxuan Hu, Qiulin Wang, Jing Li, Dong Wang, Lin Li, Qingyang Hong, “Oriental Language Recognition (OLR) 2021: Summary and Analysis,” INTERSPEECH 2022.

■ Fuchuan Tong, Siqi Zheng, Haodong Zhou, Xingjia Xie, Qingyang Hong, Lin Li, “Deep Representation Decomposition for Rate-invariant Speaker Verification,” Odyssey 2022.

■ Lin Li, Fuchuan Tong, Qingyang Hong, “When Speaker Recognition Meets Noisy Labels: Optimizations for Front-Ends and Back-Ends,” IEEE/ACM Transactions on Audio, Speech, and Language Processing, vol. 30, pp. 1586-1599, 2022, doi: 10.1109/TASLP.2022.3169977.       

■ Fuchuan Tong, Siqi Zheng, Min Zhang, Binling Wang, Song Li, Yafeng Chen, Hongbin Suo, Lin Li, Qingyang Hong, “Graph Convolutional Network Based Semi-supervised Learning on Multi-Speaker Meeting Data,” ICASSP 2022.

■ Yan Liu, Zheng Li, Lin Li and Qingyang Hong, “Phoneme-aware and Channel-wise Attentive Learning for Text Dependent Speaker Verification,” INTERSPEECH 2021.

■ Fuchuan Tong, Yan Liu, Song Li, Jie Wang, Lin Li and Qingyang Hong, “Automatic Error Correction for Speaker Embedding Learning with Noisy Label,” INTERSPEECH 2021.

■ Zheng Li, Yan Liu, Lin Li and Qingyang Hong, “Additive Phoneme-aware Margin Softmax Loss for Language Recognition,” INTERSPEECH 2021.

■ Song Li, Beibei Ouyang, Fuchuan Tong, Dexin Liao, Lin Li and Qingyang Hong, “Real-time End-to-End Monaural Multi-Speaker Speech Recognition,” INTERSPEECH 2021.

■ Dexin Liao, Jing Li, Yiming Zhi, Song Li, Qingyang Hong, and Lin Li, “An Integrated Framework for Two-pass Personalized Voice Trigger,” INTERSPEECH 2021.

■ Jing Li, Binling Wang, Yiming Zhi, Zheng Li, Lin Li, Qingyang Hong, and Dong Wang, “Oriental Language Recognition (OLR) 2020: Summary and Analysis,” INTERSPEECH 2021.

■ Lin Li, Zheng Li, Yan Liu, Qingyang Hong, “Deep joint learning for language recognition,” Neural Networks, 141 (2021) 72-86.

■ Song Li, Beibei Ouyang, Dexin Liao, Shipeng Xia, Lin Li, Qingyang Hong, “End-to-end Multi-accent Speech Recognition with Unsupervised Accent Modelling,” ICASSP 2021.

■ Song Li, Beibei Ouyang, Lin Li, Qingyang Hong, “Light-TTS: Lightweight Multi-speaker Multi-lingual Text-to-speech,” ICASSP 2021.

■ Fuchuan Tong, Miao Zhao, Jianfeng Zhou, Hao Lu, Zheng Li, Lin Li, Qingyang Hong, “ASV-Subtools: Open Source Toolkit for Automatic Speaker Verification”, ICASSP 2021.

■ Song Li, Beibei Ouyang, Lin Li, Qingyang Hong, “LightSpeech: Lightweight Non-regressive Multi-speaker Text-to-speech,” IEEE Spoken Language Technology Workshop (SLT 2021), Jan 2021, Shenzhen, China.

■ Zheng Li, Miao Zhao, Lin Li, Qingyang Hong, “Multi-feature Learning with Canonical Correlation Analysis Constraint for Text-independent Speaker Verification,” IEEE Spoken Language Technology Workshop (SLT 2021), Jan 2021, Shenzhen, China.

■ Zheng Li, Miao Zhao, Qingyang Hong, Lin Li, Zhiyuan Tang, Dong Wang, Liming Song and Cheng Yang, “AP20-OLR Challenge: Three Tasks and Their Baselines,” APSIPA ASC 2020.

■ Song Li, Lin Li, Qingyang Hong and Lingling Liu, “Improving Transformer-based Speech Recognition With Unsupervised Pre-training and Multi-task Semantic Knowledge Learning,” INTERSPEECH 2020.

■ Tao Jiang, Miao Zhao, Lin Li, Qingyang Hong, “The XMUSPEECH System for Short-Duration Speaker Verification Challenge 2020,” INTERSPEECH 2020.

■ Zheng Li, Miao Zhao, Jing Li, Lin Li, Qingyang Hong, “On the Usage of Multi-feature Integration for Speaker Verification and Language Identification,” INTERSPEECH 2020.

■ Zheng Li, Miao Zhao, Jing Li, Yiming Zhi, Lin Li, Qingyang Hong, “The XMUSPEECH System for AP19-OLR Challenge,” INTERSPEECH 2020.

■ Hao Lu, Jianfeng Zhou, Miao Zhao, Wendian Lei, Qingyang Hong, Lin Li, “XMU-TS Systems for NIST SRE19 CTS Challenge”, ICASSP 2020.

■ Jianfeng Zhou, Tao Jiang, Zheng Li, Lin Li, Qingyang Hong, “Deep Speaker Embedding Extraction with Channel-Wise Feature Responses and Additive Supervision Softmax Loss Function”, INTERSPEECH 2019.

■ Rongjin Li, Miao Zhao, Zheng Li, Lin Li, Qingyang Hong, “Anti-Spoofing Speaker Verification System with Multi-Feature Integration and Multi-Task Learning”, INTERSPEECH 2019.

■ Jianfeng Zhou, Tao Jiang, Lin Li, Qingyang Hong, Zhe Wang, Bingyin Xia, “Training Multi-Task Adversarial Network for Extracting Noise-robust Speaker Embedding”, ICASSP 2019.

■ Qingyang Hong, Lin Li, Jun Zhang, Lihong Wan, Huiyang Guo, “Transfer learning for PLDA-based speaker verification”, Speech Communication, 92:90-99, 2017.

■ Qingyang Hong, Lin Li, Jun Zhang, Lihong Wan, Feng Tong, “Transfer Learning for Speaker Verification on Short Utterances”, INTERSPEECH 2016.

■ Qingyang Hong, Jun Zhang, Lin Li, Lihong Wan, Feng Tong, “A Transfer Learning Method for PLDA-based Speaker Verification”, ICASSP 2016.

■ Lin Li, Lihong Wan, Qingyang Hong, Jun Zhang, Ming Li, “Speaker Recognition Based on Modified-prior PLDA” (in Chinese), the 13th National Conference on Man-Machine Speech Communication (NCMMSC 2015), October 2015, Tianjin, China.

■ Qingyang Hong, Lin Li, Ming Li, Ling Huang, Lihong Wan and Jun Zhang, “Modified-prior PLDA and Score Calibration for Duration Mismatch Compensation in Speaker Recognition System”, INTERSPEECH 2015.

■ Weicheng Cai, Ming Li, Lin Li and Qingyang Hong, “Duration Dependent Covariance Regularization in PLDA Modeling for Speaker Verification”, INTERSPEECH 2015.

■ Lin Li, Lihong Wan, Qingyang Hong, Jun Zhang, Ming Li, “Speaker Recognition System Based on Modified-prior PLDA” (in Chinese), Journal of Tianjin University (Science and Technology), Aug. 2015, pp. 692-696.

■ Qingyang Hong, Sheng Wang, Zhijian Liu, “A Robust Speaker-adaptive and Text-prompted Speaker Verification System”, Lecture Notes in Computer Science, vol. 8833, pp. 385-393, 2014.

■ Wanli Chen, Qingyang Hong, Ximin Li, “MM-UBM for Text-Dependent Speaker Recognition”, 2012 Third IEEE/IET International Conference on Audio, Language and Image Processing (ICALIP2012), July 16-18, 2012, Shanghai, China.

■ Q.Y. Hong and S. Kwong, “A Discriminative Training Approach for Text-independent Speaker Recognition”, Signal Processing, 85 (7), July 2005, pp. 1449-1463.

■ Q.Y. Hong and S. Kwong, “A Genetic Classification Method for Speaker Recognition”, Engineering Applications of Artificial Intelligence, 18 (1), February 2005, pp. 13-19.

■ Q.Y. Hong and S. Kwong, “Discriminative Training for Speaker Identification”, Electronics Letters, 40 (4), February 2004, pp. 280-281.

■ Q.Y. Hong and S. Kwong, “Discriminative Training for Speaker Identification Based on Maximum Model Distance Algorithm”, ICASSP 2004.


Invention Patents

[1] Microphone array speech enhancement device capable of suppressing moving noise (Patent No. ZL 2012 1 0497016.9)

[2] Noise-robust voiceprint recognition device combining spectral subtraction and dynamic time warping (Patent No. ZL 2013 1 0370030.7)

[3] Microphone array speech enhancement device with sound source direction tracking, and its method (Patent No. ZL 2012 1 0320004.9)

[4] Text-prompted voiceprint access control system (Patent No. ZL 2013 1 0294975.5)

[5] Method for detecting cheating in satisfaction surveys based on voiceprint recognition (Patent No. ZL 2013 1 0754586.6)

[6] Microphone array speech enhancement device adaptable to strong background noise (Patent No. ZL 2016 1 0080236.X)

[7] Sound-focusing microphone array long-range sound pickup device with phase self-correction (Patent No. ZL 2016 1 0080008.2)

[8] Microphone array speech enhancement device for conditions without direct sound (Patent No. ZL 2017 1 0408164.1)

[9] Method for adding punctuation to text based on weighted finite-state transducers (Patent No. ZL 2018 1 1180949.9)

[10] Spoofed speech detection method based on deep neural networks (Patent No. ZL 2019 1 0590712.6)

[11] Method and system for integrating multiple types of acoustic features based on deep neural networks (Patent No. ZL 2020 1 0073244.8)

[12] High-quality speech synthesis method based on rectified flow models (Patent No. ZL 2023 1 1587465.7)