Prof. Qingyang Hong (洪青阳)
Email: qyhong@xmu.edu.cn
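Professor, School of Informatics, Xiamen University
Deputy Director, Speech Information Technical Committee, Chinese Information Processing Society of China
Standing Committee Member, National Conference on Man-Machine Speech Communication (NCMMSC)
Member, Subcommittee on Applications of Human Biometric Recognition (SAC/TC100/SC2), National Technical Committee for Security and Protection Standardization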
Research Interests
1. Speaker (voiceprint) recognition
2. Speech recognition and speech synthesis
3. Large model technology
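Biography
Qingyang Hong is a professor at the School of Informatics, Xiamen University. His main research interests are speech recognition and speaker recognition. He has led three projects funded by the National Natural Science Foundation of China (NSFC) and two projects funded by the Innovation Fund of the Ministry of Science and Technology. He founded the Xiamen University Intelligent Speech Laboratory and led the XMUSPEECH team to first place in two consecutive Oriental Language Recognition (OLR) challenges. The team has also developed speech recognition and synthesis systems for Southern Min (Minnan). He has extensive industrial R&D experience: in cooperation with well-known companies including Huawei, HiSilicon, Hisense, and TD Tech, he has undertaken numerous intelligent speech projects, and the core technologies have been deployed in Huawei smartphones, the 说咱闽南话 ("Speak Our Minnan") app, the 声云 (Shengyun) speech transcription service, and the judicial, social security, securities, and power systems of more than ten provinces and cities across China. He is the author of the monograph 《语音识别:原理与应用》 (Speech Recognition: Principles and Applications), which has sold more than 10,000 copies. He released ASV-Subtools, the first open-source speaker recognition toolkit in China, to support both academic research and industrial deployment. He served as chair of the 2020-2021 National Symposium on Speaker Recognition Research and Applications and serves as deputy director of the Speech Information Technical Committee of the Chinese Information Processing Society of China. He has received the "Outstanding Author Award" from the Publishing House of Electronics Industry and the "Outstanding Technical Cooperation Achievement Award" from Huawei.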
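Projects Led
■ “Speaker Diarization Based on Graph-Structured Modeling,” NSFC General Program, 2023.1-2026.12.
■ “Speaker Feature Extraction and Recognition in Complex Scenarios,” NSFC General Program, 2019.1-2022.12.
■ “In-Ear Voiceprint Technology Cooperation Project,” industry-sponsored project (Huawei Device), 2018.8-2019.3.
■ “Key Back-End Technologies for Free-Field Voice Interaction,” industry-sponsored project (Huawei Technologies), 2017.12-2018.12.
■ “Key Technologies for Voice Interaction,” industry-sponsored project (Huawei Technologies), 2016.4-2018.12.
■ “Southern Min Intelligent Spoken Dialogue System,” industry-sponsored project, 2017.8-2018.7.
■ “Cross-Channel Speaker Recognition Based on Transfer Learning,” NSFC Young Scientists Fund, 2012.1-2014.12.
■ “Cross-Channel Multi-Speaker Recognition Based on GSV-SVM and Transfer Learning,” Shenzhen Science and Technology Program, 2011.2-2014.3.
■ “Intelligent Voiceprint Recognition Engine tsVPR2.0,” industry-sponsored project (厦门天聪, Xiamen Tiancong), 2012.08-2013.03.
■ “Family-Call Telephone Management System Based on Voiceprint Recognition,” Innovation Fund of the Ministry of Science and Technology, 2011.2-2013.2.
■ “Intelligent Voiceprint Recognition Engine tsVPR1.0,” industry-sponsored project (厦门天聪, Xiamen Tiancong), 2009.11-2010.12.
■ “Embedded Speech Recognition System for Smart Home Appliances,” Innovation Fund of the Ministry of Science and Technology, 2007.7-2009.9.
■ “Speech Recognition Socket for Smart Home Appliances,” Xiamen Science and Technology Program, 2007.3-2008.12.
■ “Embedded Speech Recognition System,” Xiamen University Start-up Fund, 2005.1-2005.12.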
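Monograph
Qingyang Hong and Lin Li, 《语音识别:原理与应用》 (Speech Recognition: Principles and Applications), Publishing House of Electronics Industry, 2nd edition, February 2023.
Open-Source Toolkit (Speaker Recognition)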
ASV-Subtools: https://github.com/Snowdar/asv-subtools
Selected Publications
■ Jiayan Lin, Shenghui Lu, Hukai Huang, Wenhao Guan, Binbin Xu, Hui Bu, Qingyang Hong, Lin Li, “MinSpeech: A Corpus of Southern Min Dialect for Automatic Speech Recognition,” INTERSPEECH 2024.
■ Wenhao Guan, Kaidi Wang, Wangjin Zhou, Yang Wang, Feng Deng, Hui Wang, Lin Li, Qingyang Hong, Yong Qin, “LAFMA: A Latent Flow Matching Model for Text-to-Audio Generation,” INTERSPEECH 2024.
■ Yishuang Li, Wenhao Guan, Hukai Huang, Shiyu Miao, Qi Su, Lin Li, Qingyang Hong, “Efficient Integrated Features Based on Pre-trained Models for Speaker Verification,” INTERSPEECH 2024.
■ Wenhao Guan, Qi Su, Haodong Zhou, Shiyu Miao, Xingjia Xie, Lin Li, Qingyang Hong, “ReFlow-TTS: A Rectified Flow Model for High-fidelity Text-to-Speech,” ICASSP 2024.
■ Tao Li, Feng Wang, Wenhao Guan, Lingyan Huang, Qingyang Hong, Lin Li, “Improving Multi-speaker ASR with Overlap-aware Encoding and Monotonic Attention,” ICASSP 2024.
■ Longjie Luo, Tao Li, Lin Li, Qingyang Hong, “The XMUSPEECH System for Audio-Visual Target Speaker Extraction in MISP 2023 Challenge,” ICASSP 2024.
■ Yishuang Li, Hukai Huang, Zhicong Chen, Wenhao Guan, Jiayan Lin, Lin Li, Qingyang Hong, “SR-Hubert: an Efficient Pre-trained Model for Speaker Verification,” ICASSP 2024.
■ Wenhao Guan, Yishuang Li, Tao Li, Hukai Huang, Feng Wang, Jiayan Lin, Lingyan Huang, Lin Li, Qingyang Hong, “MM-TTS: Multi-Modal Prompt Based Style Transfer for Expressive Text-to-Speech Synthesis,” Proceedings of the AAAI Conference on Artificial Intelligence, Volume 38, Issue 16, Pages 18117-18125, March 25, 2024.
■ Feng Wang, Lingyan Huang, Tao Li, Qingyang Hong, Lin Li, “Conformer-based Language Embedding with Self-Knowledge Distillation for Spoken Language Identification,” INTERSPEECH 2023.
■ Lingyan Huang, Tao Li, Haodong Zhou, Qingyang Hong, Lin Li, “Cross-Modal Semantic Alignment before Fusion for Two-Pass End-to-End Spoken Language Understanding,” INTERSPEECH 2023.
■ Wenhao Guan, Tao Li, Yishuang Li, Hukai Huang, Qingyang Hong, Lin Li, “Interpretable Style Transfer for Text-to-Speech with ControlVAE and Diffusion Bridge,” INTERSPEECH 2023.
■ Dexin Liao, Tao Jiang, Feng Wang, Lin Li, Qingyang Hong, “Towards A Unified Conformer Structure: from ASR to ASV Task,” ICASSP 2023.
■ Qiulin Wang, Wenxuan Hu, Lin Li, Qingyang Hong, “Meta Learning with Adaptive Loss Weight for Low-Resource Speech Recognition,” ICASSP 2023.
■ Tao Li, Haodong Zhou, Jie Wang, Qingyang Hong, Lin Li, “The XMU System for Audio-Visual Diarization and Recognition in MISP Challenge 2022,” ICASSP 2023.
■ Jie Wang, Zhicong Chen, Haodong Zhou, Lin Li, Qingyang Hong, “Community Detection Graph Convolutional Network for Overlap-Aware Speaker Diarization,” ICASSP 2023.
■ Zhicong Chen, Jie Wang, Wenxuan Hu, Lin Li, Qingyang Hong, “Unsupervised Speaker Verification Using Pre-Trained Model and Label Correction,” ICASSP 2023.
■ Jie Wang, Yuji Liu, Binling Wang, Yiming Zhi, Song Li, Shipeng Xia, Jiayang Zhang, Feng Tong, Lin Li, Qingyang Hong, “Spatial-aware Speaker Diarization for Multi-channel Multi-party Meeting,” INTERSPEECH 2022.
■ Binling Wang, Feng Wang, Wenxuan Hu, Qiulin Wang, Jing Li, Dong Wang, Lin Li, Qingyang Hong, “Oriental Language Recognition (OLR) 2021: Summary and Analysis,” INTERSPEECH 2022.
■ Fuchuan Tong, Siqi Zheng, Haodong Zhou, Xingjia Xie, Qingyang Hong, Lin Li, “Deep Representation Decomposition for Rate-invariant Speaker Verification,” Odyssey 2022.
■ Lin Li, Fuchuan Tong, Qingyang Hong, “When Speaker Recognition Meets Noisy Labels: Optimizations for Front-Ends and Back-Ends,” IEEE/ACM Transactions on Audio, Speech, and Language Processing, vol. 30, pp. 1586-1599, 2022, doi: 10.1109/TASLP.2022.3169977.
■ Fuchuan Tong, Siqi Zheng, Min Zhang, Binling Wang, Song Li, Yafeng Chen, Hongbin Suo, Lin Li, Qingyang Hong, “Graph Convolutional Network Based Semi-supervised Learning on Multi-Speaker Meeting Data,” ICASSP 2022.
■ Yan Liu, Zheng Li, Lin Li and Qingyang Hong, “Phoneme-aware and Channel-wise Attentive Learning for Text Dependent Speaker Verification,” INTERSPEECH 2021.
■ Fuchuan Tong, Yan Liu, Song Li, Jie Wang, Lin Li and Qingyang Hong, “Automatic Error Correction for Speaker Embedding Learning with Noisy Label,” INTERSPEECH 2021.
■ Zheng Li, Yan Liu, Lin Li and Qingyang Hong, “Additive Phoneme-aware Margin Softmax Loss for Language Recognition,” INTERSPEECH 2021.
■ Song Li, Beibei Ouyang, Fuchuan Tong, Dexin Liao, Lin Li and Qingyang Hong, “Real-time End-to-End Monaural Multi-Speaker Speech Recognition,” INTERSPEECH 2021.
■ Dexin Liao, Jing Li, Yiming Zhi, Song Li, Qingyang Hong, and Lin Li, “An Integrated Framework for Two-pass Personalized Voice Trigger,” INTERSPEECH 2021.
■ Jing Li, Binling Wang, Yiming Zhi, Zheng Li, Lin Li, Qingyang Hong, and Dong Wang, “Oriental Language Recognition (OLR) 2020: Summary and Analysis,” INTERSPEECH 2021.
■ Lin Li, Zheng Li, Yan Liu, Qingyang Hong, “Deep joint learning for language recognition,” Neural Networks, 141 (2021) 72-86.
■ Song Li, Beibei Ouyang, Dexin Liao, Shipeng Xia, Lin Li, Qingyang Hong, “End-to-end Multi-accent Speech Recognition with Unsupervised Accent Modelling,” ICASSP 2021.
■ Song Li, Beibei Ouyang, Lin Li, Qingyang Hong, “Light-TTS: Lightweight Multi-speaker Multi-lingual Text-to-speech,” ICASSP 2021.
■ Fuchuan Tong, Miao Zhao, Jianfeng Zhou, Hao Lu, Zheng Li, Lin Li, Qingyang Hong, “ASV-Subtools: Open Source Toolkit for Automatic Speaker Verification”, ICASSP 2021.
■ Song Li, Beibei Ouyang, Lin Li, Qingyang Hong, “LightSpeech: Lightweight Non-autoregressive Multi-speaker Text-to-speech,” IEEE Spoken Language Technology Workshop (SLT 2021), Jan 2021, Shenzhen, China.
■ Zheng Li, Miao Zhao, Lin Li, Qingyang Hong, “Multi-feature Learning with Canonical Correlation Analysis Constraint for Text-independent Speaker Verification,” IEEE Spoken Language Technology Workshop (SLT 2021), Jan 2021, Shenzhen, China.
■ Zheng Li, Miao Zhao, Qingyang Hong, Lin Li, Zhiyuan Tang, Dong Wang, Liming Song and Cheng Yang, “AP20-OLR Challenge: Three Tasks and Their Baselines,” APSIPA ASC 2020.
■ Song Li, Lin Li, Qingyang Hong and Lingling Liu, “Improving Transformer-based Speech Recognition With Unsupervised Pre-training and Multi-task Semantic Knowledge Learning,” INTERSPEECH 2020.
■ Tao Jiang, Miao Zhao, Lin Li, Qingyang Hong, “The XMUSPEECH System for Short-Duration Speaker Verification Challenge 2020,” INTERSPEECH 2020.
■ Zheng Li, Miao Zhao, Jing Li, Lin Li, Qingyang Hong, “On the Usage of Multi-feature Integration for Speaker Verification and Language Identification,” INTERSPEECH 2020.
■ Zheng Li, Miao Zhao, Jing Li, Yiming Zhi, Lin Li, Qingyang Hong, “The XMUSPEECH System for AP19-OLR Challenge,” INTERSPEECH 2020.
■ Hao Lu, Jianfeng Zhou, Miao Zhao, Wendian Lei, Qingyang Hong, Lin Li, “XMU-TS Systems for NIST SRE19 CTS Challenge”, ICASSP 2020.
■ Jianfeng Zhou, Tao Jiang, Zheng Li, Lin Li, Qingyang Hong, “Deep Speaker Embedding Extraction with Channel-Wise Feature Responses and Additive Supervision Softmax Loss Function”, INTERSPEECH 2019.
■ Rongjin Li, Miao Zhao, Zheng Li, Lin Li, Qingyang Hong, “Anti-Spoofing Speaker Verification System with Multi-Feature Integration and Multi-Task Learning”, INTERSPEECH 2019.
■ Jianfeng Zhou, Tao Jiang, Lin Li, Qingyang Hong, Zhe Wang, Bingyin Xia, “Training Multi-Task Adversarial Network for Extracting Noise-robust Speaker Embedding”, ICASSP 2019.
■ Qingyang Hong, Lin Li, Jun Zhang, Lihong Wan, Huiyang Guo, “Transfer learning for PLDA-based speaker verification”, Speech Communication, 92:90-99, 2017.
■ Qingyang Hong, Lin Li, Jun Zhang, Lihong Wan, Feng Tong, “Transfer Learning for Speaker Verification on Short Utterances”, INTERSPEECH 2016.
■ Qingyang Hong, Jun Zhang, Lin Li, Lihong Wan, Feng Tong, “A Transfer Learning Method for PLDA-based Speaker Verification”, ICASSP 2016.
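■ Lin Li, Lihong Wan, Qingyang Hong, Jun Zhang, Ming Li, “Speaker Recognition Technology Based on Probability-Modified PLDA” (in Chinese), 13th National Conference on Man-Machine Speech Communication (NCMMSC2015), October 2015, Tianjin, China.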
■ Qingyang Hong, Lin Li, Ming Li, Ling Huang, Lihong Wan and Jun Zhang, “Modified-prior PLDA and Score Calibration for Duration Mismatch Compensation in Speaker Recognition System”, INTERSPEECH 2015.
■ Weicheng Cai, Ming Li, Lin Li and Qingyang Hong, “Duration Dependent Covariance Regularization in PLDA Modeling for Speaker Verification”, INTERSPEECH 2015.
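■ Lin Li, Lihong Wan, Qingyang Hong, Jun Zhang, Ming Li, “Speaker Recognition System Based on Probability-Modified PLDA” (in Chinese), Journal of Tianjin University (Science and Technology), 2015.8, pp. 692-696.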
■ Hong, Qingyang; Wang, Sheng; Liu, Zhijian, “A robust speaker-adaptive and text-prompted speaker verification system”, Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics), v 8833, p 385-393, 2014.
■ Wanli Chen, Qingyang Hong, Ximin Li, “MM-UBM for Text-Dependent Speaker Recognition”, 2012 Third IEEE/IET International Conference on Audio, Language and Image Processing (ICALIP2012), July 16-18, 2012, Shanghai, China.
■ Q.Y. Hong and S. Kwong, “A Discriminative Training Approach for Text-independent Speaker Recognition”, Signal Processing, 85 (7), July 2005, pp. 1449-1463.
■ Q.Y. Hong and S. Kwong, “A Genetic Classification Method for Speaker Recognition”, Engineering Applications of Artificial Intelligence, 18 (1), February 2005, pp. 13-19.
■ Q.Y. Hong and S. Kwong, “Discriminative Training for Speaker Identification”, Electronics Letters, 40 (4), February 2004, pp. 280-281.
■ Q.Y. Hong and S. Kwong, “Discriminative Training for Speaker Identification Based on Maximum Model Distance Algorithm”, ICASSP 2004.
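Invention Patents
[1] Invention patent: A microphone-array speech enhancement device capable of suppressing moving noise (Patent No.: ZL 2012 1 0497016.9)
[2] Invention patent: A noise-robust voiceprint recognition device that jointly applies spectral subtraction and dynamic time warping (Patent No.: ZL 2013 1 0370030.7)
[3] Invention patent: A microphone-array speech enhancement device with sound-source direction tracking, and its method (Patent No.: ZL 2012 1 0320004.9)
[4] Invention patent: A text-prompted voiceprint access control system (Patent No.: ZL 2013 1 0294975.5)
[5] Invention patent: A method for detecting cheating in satisfaction surveys based on voiceprint recognition (Patent No.: ZL 2013 1 0754586.6)
[6] Invention patent: A microphone-array speech enhancement device adaptable to strong background noise (Patent No.: ZL 2016 1 0080236.X)
[7] Invention patent: A sound-focusing microphone-array long-distance sound pickup device with phase self-calibration (Patent No.: ZL 2016 1 0080008.2)
[8] Invention patent: A microphone-array speech enhancement device for conditions without direct sound (Patent No.: ZL 2017 1 0408164.1)
[9] Invention patent: A method for adding punctuation to text based on weighted finite-state transducers (Patent No.: ZL 2018 1 1180949.9)
[10] Invention patent: A spoofed-speech detection method based on deep neural networks (Patent No.: ZL 2019 1 0590712.6)
[11] Invention patent: A method and system for integrating multiple types of acoustic features based on deep neural networks (Patent No.: ZL 2020 1 0073244.8)
[12] Invention patent: A high-quality speech synthesis method based on a rectified flow model (Patent No.: ZL 2023 1 1587465.7)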