Publications
2025
- UCSC at SemEval-2025 Task 3: Context, Models and Prompt Optimization for Automated Hallucination Detection in LLM Output
  Sicong Huang, Jincheng He, Shiyuan Huang, Karthik Raja Anandan, Arkajyoti Chakraborty, and Ian Lane
  In Proceedings of the 19th International Workshop on Semantic Evaluation (SemEval-2025), Jul 2025
  Received Best System Description Award
Hallucinations pose a significant challenge for large language models when answering knowledge-intensive queries. As LLMs become more widely adopted, it is crucial not only to detect whether hallucinations occur but also to pinpoint where they arise. SemEval-2025 Task 3, Mu-SHROOM: Multilingual Shared-task on Hallucinations and Related Observable Overgeneration Mistakes, is a recent effort in this direction. This paper describes our solution to the shared task. We propose a framework that first retrieves relevant context, then identifies false content in the answer, and finally maps it back to spans in the output. The process is further enhanced by automatically optimizing prompts. Our system achieves the highest overall performance, ranking #1 in average position across all languages.
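The three-stage framework from the abstract can be sketched as follows. The helper names and the toy keyword/substring matching are illustrative assumptions, not the paper's actual retrieval or verification models:

```python
def retrieve_context(question, knowledge_base):
    """Stage 1: keep passages sharing a keyword with the question (toy retriever)."""
    words = set(question.lower().split())
    return [p for p in knowledge_base if words & set(p.lower().split())]

def identify_false_claims(answer_sentences, context):
    """Stage 2: flag sentences not supported by the context (toy substring check)."""
    supported = " ".join(context).lower()
    return [s for s in answer_sentences if s.lower() not in supported]

def map_to_spans(answer, false_claims):
    """Stage 3: map each flagged claim back to a character span in the answer."""
    spans = []
    for claim in false_claims:
        start = answer.find(claim)
        if start != -1:
            spans.append((start, start + len(claim)))
    return spans

# Toy walk-through of the pipeline on a fabricated example.
kb = ["paris is the capital of france."]
answer = "Paris is the capital of France. It has 90 million residents."
sentences = ["Paris is the capital of France.", "It has 90 million residents."]
context = retrieve_context("What is the capital of France?", kb)
spans = map_to_spans(answer, identify_false_claims(sentences, context))
```

The character-level span output matches the Mu-SHROOM task format, where systems report which regions of the model output are hallucinated rather than a single binary label.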
@inproceedings{huang-etal-2025-ucsc,
  title = {{UCSC} at {S}em{E}val-2025 Task 3: Context, Models and Prompt Optimization for Automated Hallucination Detection in {LLM} Output},
  author = {Huang, Sicong and He, Jincheng and Huang, Shiyuan and Anandan, Karthik Raja and Chakraborty, Arkajyoti and Lane, Ian},
  editor = {Rosenthal, Sara and Ros{\'a}, Aiala and Ghosh, Debanjan and Zampieri, Marcos},
  booktitle = {Proceedings of the 19th International Workshop on Semantic Evaluation (SemEval-2025)},
  month = jul,
  year = {2025},
  address = {Vienna, Austria},
  publisher = {Association for Computational Linguistics},
  url = {https://aclanthology.org/2025.semeval-1.257/},
  pages = {1981--1992},
  isbn = {979-8-89176-273-2},
  note = {Received Best System Description Award}
}
2022
- Single-Channel Target Speaker Separation Using Joint Training with Target Speaker's Pitch Information
  Jincheng He, Yuanyuan Bao, Na Xu, Hongfeng Li, Shicong Li, Linzhang Wang, Fei Xiang, and Ming Li
  In Proc. The Speaker and Language Recognition Workshop (Odyssey 2022), Jul 2022
Despite the great progress achieved on the target speaker separation (TSS) task, we are still seeking robust ways to improve performance that are independent of the model architecture and the training loss. Pitch extraction plays an important role in many applications such as speech enhancement and speech separation, but it is a challenging task when there are multiple speakers in the same utterance. In this paper, we explore whether target speaker pitch extraction is possible and how the extracted target pitch can help improve TSS performance. A target pitch extraction model is built and incorporated into different TSS models using two strategies: concatenation and joint training. Experimental results on the LibriSpeech dataset show that both training strategies bring significant improvements to the TSS task, even when the precision of the target pitch extraction module is not high.
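The two strategies from the abstract can be sketched minimally. The feature dimensions, the fixed loss weight, and the helper names are illustrative assumptions, not the paper's actual front end or hyperparameters:

```python
def concat_pitch_features(spectral_frames, pitch_frames):
    """Concatenation strategy: append per-frame target-pitch features
    to the spectral features fed into the separation model."""
    if len(spectral_frames) != len(pitch_frames):
        raise ValueError("frame counts must match")
    return [spec + pitch for spec, pitch in zip(spectral_frames, pitch_frames)]

def joint_loss(separation_loss, pitch_loss, weight=0.1):
    """Joint-training strategy: optimize a weighted sum of the TSS loss
    and the pitch-extraction loss (the weight here is an assumed value)."""
    return separation_loss + weight * pitch_loss

# Two frames of 3-dim spectral features plus a 1-dim pitch value per frame.
spectral = [[0.1, 0.2, 0.3], [0.4, 0.5, 0.6]]
pitch = [[220.0], [221.5]]
augmented = concat_pitch_features(spectral, pitch)
```

Concatenation conditions the separator on pitch at the input, while joint training shares gradients between the two tasks; the abstract reports that both help even when the pitch estimates are imprecise.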
@inproceedings{he22_odyssey,
  author = {He, Jincheng and Bao, Yuanyuan and Xu, Na and Li, Hongfeng and Li, Shicong and Wang, Linzhang and Xiang, Fei and Li, Ming},
  title = {{Single-Channel Target Speaker Separation Using Joint Training with Target Speaker's Pitch Information}},
  year = {2022},
  booktitle = {Proc. The Speaker and Language Recognition Workshop (Odyssey 2022)},
  pages = {301--305},
  doi = {10.21437/Odyssey.2022-42},
}
2021
- Lightweight Dual-channel Target Speaker Separation for Mobile Voice Communication
  Yuanyuan Bao, Yanze Xu, Na Xu, Wenjing Yang, Hongfeng Li, Shicong Li, Yongtao Jia, Fei Xiang, Jincheng He, and Ming Li
  arXiv preprint 2106.02934, Jul 2021
Nowadays, there is a strong need to deploy the target speaker separation (TSS) model on mobile devices under tight constraints on model size and computational complexity. To better perform TSS for mobile voice communication, we first build a dual-channel dataset, LibriPhone, based on a specific scenario. Specifically, to better mimic the real-world scenario, instead of simulating from a single-channel dataset, LibriPhone is made by simultaneously replaying pairs of utterances from LibriSpeech through two professional artificial heads and recording with two built-in microphones of a mobile phone. Then, we propose a lightweight time-frequency domain separation model, LSTM-Former, which is based on the LSTM framework with a scale-invariant signal-to-noise ratio (SI-SNR) loss. In the experiments on LibriPhone, we explore the dual-channel LSTM-Former model and a single-channel version trained on a randomly selected single channel of LibriPhone. Experimental results show that the dual-channel LSTM-Former outperforms the single-channel LSTM-Former by a relative 25%. This work provides a feasible solution for the TSS task on mobile devices; replaying and recording multiple data sources in real application scenarios to obtain dual-channel real data can help the lightweight model achieve higher performance.
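The SI-SNR loss mentioned in the abstract has a standard definition: the estimate is projected onto the reference to get a target component, and the ratio of target energy to residual energy is expressed in dB. The sketch below is that textbook definition on plain Python lists, not the authors' training code:

```python
import math

def si_snr(estimate, target, eps=1e-8):
    """Scale-invariant signal-to-noise ratio (SI-SNR) in dB."""
    # Zero-mean both signals.
    e = [x - sum(estimate) / len(estimate) for x in estimate]
    t = [x - sum(target) / len(target) for x in target]
    # Project the estimate onto the target: s_target = (<e, t> / ||t||^2) * t
    dot = sum(a * b for a, b in zip(e, t))
    t_energy = sum(a * a for a in t) + eps
    s_target = [dot / t_energy * a for a in t]
    # Everything not explained by the target is treated as noise.
    e_noise = [a - b for a, b in zip(e, s_target)]
    num = sum(a * a for a in s_target)
    den = sum(a * a for a in e_noise) + eps
    return 10 * math.log10(num / den + eps)

target = [1.0, -1.0, 0.5, -0.5]       # toy reference signal
estimate = [1.1, -0.9, 0.5, -0.5]     # toy noisy estimate
```

Because the target component is obtained by projection, rescaling the estimate leaves the score unchanged, which prevents the model from gaming the loss by simply changing output amplitude.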
@misc{bao2021lightweight,
  title = {Lightweight Dual-channel Target Speaker Separation for Mobile Voice Communication},
  author = {Bao, Yuanyuan and Xu, Yanze and Xu, Na and Yang, Wenjing and Li, Hongfeng and Li, Shicong and Jia, Yongtao and Xiang, Fei and He, Jincheng and Li, Ming},
  year = {2021},
  eprint = {2106.02934},
  archiveprefix = {arXiv},
  primaryclass = {cs.SD},
}