HAI Research Laboratory [Sakti-Lab]

過去の論文一覧・ Past publications (before October 2021)
招待講演・ Invited Talks
著書 / 書籍の一章・ Scientific Book / Chapter in Scientific Book
査読付き学術論文誌・ Peer-reviewed Scientific Journals
査読付き国際会議論文・ Peer-reviewed International Conferences
国内会議論文・ Domestic Conferences

招待講演・Invited Talks

[Keynote speaker] S. Sakti, "Towards Language Technology for All: A Zero-Resourced S2ST Approach for Unknown, Unpaired, and Untranscribed Languages" [A joint work with L.-T. Nguyen (JAIST, Japan)] The India International Symposium on Acoustics (IISA), Gurugram, India, Oct 31st, 2025 [Rais Ahmad Memorial Lecture Award]
[Invited speaker] S. Sakti, "Language Technologies for All Mother Languages: Opportunities and Challenges", The International Mother Language Day (IMLD), Paris, Feb 20th, 2025
[Invited speaker] S. Sakti, "Machine Speech Chain: Modeling Human Speech Perception and Production with Auditory Feedback Mechanism" [A joint work with A. Tjandra, J. Effendi, S. Novitasari, S. Nakamura (NAIST, Japan)], The NUS Computer Science Research Week, Singapore, Jan 10th, 2025
[Keynote speaker] S. Sakti, "Machine Speech Chain: From Human Auditory Feedback Principles to Language Technology Empowering Indigenous Communities" [A joint work with M. Heck, A. Tjandra, J. Effendi, S. Novitasari, S. Nakamura (NAIST, Japan), J. Mariani (LIMSI), C. Soria (CNR-ILC), M. Melero (BSC)], The 21st International Conference on Natural Language Processing (ICON), India, Dec 21st, 2024
[Invited speaker] S. Sakti, "Leveraging the Foundational Speech Chain Models to Empower Low-Resource Languages" [A joint work with A. Tjandra, J. Effendi, S. Novitasari, S. Nakamura (NAIST, Japan)], The IX International Scientific Conference “Modern Problems of Applied Mathematics and Information Technologies Al-Khwarizmi 2024”, Uzbekistan, Oct 22nd, 2024
[Keynote speaker] S. Sakti, "Machine Speech Chain: A Deep Learning Approach for Modeling Human Speech Perception and Production with Auditory Feedback Mechanism for Low-Resource Languages" [A joint work with A. Tjandra, J. Effendi, S. Novitasari, S. Nakamura (NAIST, Japan)], The 11th International Conference on Computer, Control, Informatics and its Applications (IC3INA), Indonesia, Oct 10th, 2024
[Invited speaker] S. Sakti, "Language Technology for All: Leveraging Foundational Speech Models to Empower Low-Resource Languages" [A joint work with M. Heck, A. Tjandra, J. Effendi, S. Novitasari, S. Nakamura (NAIST, Japan), J. Mariani (LIMSI), K. Choukri (ELRA), C. Soria (CNR-ILC), M. Melero (BSC)], The INTERSPEECH Workshop on Synthetic Data’s Transformative Role in Foundational Speech Models (SynData4GenAI 2024), Greece, Aug 31st, 2024
[Keynote speaker] S. Sakti, "Communicative Intelligent Systems towards Society 5.0", “Sarasehan Nasional Pendidikan Tinggi Informatika dan Pemberian Tribute kepada Penggagas dan Pendidik Senior Teknik Informatika ITB”, Indonesia, Feb 2nd, 2023
[Invited speaker] S. Sakti, "Language Technology for All: From the indigenous community perspectives" [A joint work with J. Mariani (LIMSI), K. Choukri (ELRA), C. Soria (CNR-ILC), M. Melero (BSC)], "Data, Technologies and Benchmarks for the Spoken Languages of the World" Meeting, IEEE SLT, China, Jan 13th, 2023
[Keynote speaker] S. Sakti, "Language Technology for All: From the technology and indigenous community perspectives" [A joint work with M. Heck, A. Tjandra, J. Effendi, S. Novitasari, S. Nakamura (NAIST, Japan), J. Mariani (LIMSI), K. Choukri (ELRA), C. Soria (CNR-ILC), M. Melero (BSC)], the 25th Conference of the Oriental COCOSDA, Vietnam, Nov 25th, 2022
[Invited panelist] O. Scharenborg (TU Delft, Netherlands), E. Ahn (U. Washington, USA), G. Anumanchipalli (UC Berkeley, USA), S. Sakti (JAIST, Japan), Moderator: A. Black, "Data Collection, Bias, and Ethical Concerns in Speech Processing," Speech for Social Good - INTERSPEECH Satellite Workshop, [Virtual], September 24th, 2022
[Invited speaker] S. Sakti, "Semi-supervised Learning for Low-resource Multilingual and Multimodal Speech Processing with Machine Speech Chain" [A joint work with A. Tjandra, J. Effendi, S. Novitasari, S. Nakayama, T. Yanagita, S. Nakamura (NAIST/RIKEN AIP, Japan)], HiTZ Language Technology Webinar, Basque, May 5th, 2022
[Invited speaker] S. Sakti, "Self-Adaptive Machine Speech Chain in Noisy Environment" [A joint work with A. Tjandra, J. Effendi, S. Novitasari, S. Nakamura (NAIST/RIKEN AIP, Japan)], The AAAI workshop on Self-supervised Learning for Audio and Speech Processing, [Virtual], Feb 28th, 2022
[Invited speaker] S. Sakti, "Machine Speech Chain: A Deep Learning Approach for Modeling Human Speech Perception and Production with Auditory Feedback Mechanism" [A joint work with A. Tjandra, J. Effendi, S. Novitasari, S. Nakamura (NAIST/RIKEN AIP, Japan)], The ITB Seminar, Dec 24th, 2021
[Keynote speaker] S. Sakti, "Machine Speech Chain: A Deep Learning Approach for Training and Inference through Feedback Loop" [A joint work with A. Tjandra, J. Effendi, S. Novitasari, S. Nakamura (NAIST/RIKEN AIP, Japan)], the IEEE Automatic Speech Recognition and Understanding Workshop (ASRU), Cartagena, Colombia, Dec 15th, 2021
[Keynote speaker] S. Sakti, "Listening while Speaking and Visualizing: A Semi-supervised Approach with Multimodal Machine Speech Chain" [A joint work with A. Tjandra, J. Effendi, S. Novitasari, S. Nakamura (NAIST/RIKEN AIP, Japan)], the SoCS International Seminar, Dec 10th, 2021
[Keynote speaker] S. Sakti, "Listening while Speaking and Visualizing: A Semi-supervised Approach with Multimodal Machine Speech Chain" [A joint work with A. Tjandra, J. Effendi, S. Novitasari, S. Nakamura (NAIST/RIKEN AIP, Japan)], the International Conference of Artificial Intelligence and Speech Technology (AIST), Nov 13th, 2021

著書 / 書籍の一章・ Scientific Book / Chapter in Scientific Book

S. Asai, K. Yoshino, S. Shinagawa, S. Sakti, S. Nakamura, "Eliciting Cooperative Persuasive Dialogue by Multimodal Emotional Robot," In: Stoyanchev, S., Ultes, S., Li, H. (eds) Conversational AI for Natural Human-Centric Interaction. Lecture Notes in Electrical Engineering, vol 943. Springer, Singapore, Nov 2022 [PDF]
S. Sakti, "Language technology impact on linguistic diversity". In Book: "State of the art of indigenous languages in research: a collection of selected research papers," In the framework of the International Decade of Indigenous Languages (2022-2032), UNESCO Open Access Repository, pp. 341-348, May 2022 [PDF]
L.-P. Morency, S. Sakti, B.W. Schuller, S. Ultes, "Multimodal Machine Learning for Social Interaction with Ageing Individuals". In Book: J. Miehle, W. Minker, E. André, K. Yoshino (eds), "Multimodal Agents for Ageing and Multicultural Societies," Springer, Singapore, pp. 61–70, Oct. 2021 [PDF]

査読付き学術論文誌・ Peer-reviewed Scientific Journals

Y. Hirano, M. Nguyen, K. Azuma, J.M. Saragih, S. Sakti, "Toward fast meeting transcription: NAIST system for CHiME-8 NOTSOFAR-1 task and its analysis", Computer Speech & Language, Vol. 95, pp. 1 - 13, Jan 2026 [PDF]
C. Tran, S. Sakti, "SAM Translator: Self-Paced Learning And Mixture-of-Experts for Cross-Lingual Text-to-Speech Translation", IEEE Access, Vol. 14, pp. 2742 - 2752, Dec 2025 [PDF]
R.F. Widiaputri, A. Purwarianti, S. Sakti, "Structural Ambiguity Resolution in Indonesian–English Speech-to-Text Translation by Utilizing Prosodic Information", IEEE Access, Vol. 14, pp. 778 - 791, Dec 2025 [PDF]
Y. Ko, R. Fukuda, Y. Nishikawa, Y. Kano, K. Sudoh, S. Sakti, S. Nakamura, "End-to-end Simultaneous Speech Translation with Style Tags using Human Simultaneous Interpretation Data", Journal of Natural Language Processing, Vol.32, No.2, Jun 2025 [PDF]
C. Tran, C.-M. Luong, S. Sakti, "Zero-Shot Cross-Lingual Text-to-Speech With Style-Enhanced Normalization and Auditory Feedback Training Mechanism", IEEE Transactions on Audio, Speech and Language Processing (TASLP), Vol. 33, pp. 1479 - 1492, Mar 2025 [PDF]
L.-T. Nguyen, S. Sakti, "ZeST: A Zero-Resourced Speech-to-Speech Translation Approach for Unknown, Unpaired, and Untranscribed Languages", IEEE Access, Vol. 13, pp. 8638 - 8648, Jan 2025 [PDF]
K. Furukawa, T. Kishiyama, S. Nakamura, S. Sakti, "Applying Syntax-Prosody Mapping Hypothesis and Boundary-Driven Theory to Neural Sequence-to-Sequence Speech Synthesis", IEEE Access, Vol. 12, pp. 160896 - 160917, Oct 2024 [PDF]
Y. Ko, K. Sudoh, S. Sakti and S. Nakamura, "Neural End-to-end Speech Translation Leveraged by ASR Posterior Distribution", IEICE Transactions on Information and Systems, Vol. E107-D, No. 10, pp. 1322 - 1331, Oct 2024 [PDF]
B. Putra, K. Azizah, C.-O. Mawalim, I.-A. Hanif, S. Sakti, C.-W. Leong, S. Okada, "MAG-BERT-ARL for Fair Automated Video Interview Assessment", IEEE Access, Vol. 12, pp. 145188 - 145205, Oct 2024 [PDF]
T. Yanagita, S. Sakti, S. Nakamura, "Japanese Neural Incremental Text-to-Speech Synthesis Framework With an Accent Phrase Input", IEEE Access, Vol. 11, pp. 22355 - 22363, Mar 2023 [PDF]
S. Novitasari, S. Sakti, S. Nakamura, "A Machine Speech Chain Approach for Dynamically Adaptive Lombard TTS in Static and Dynamic Noise Environments", IEEE/ACM Transactions on Audio, Speech, and Language Processing, Vol. 30, pp. 2673-2688, Aug 2022 [PDF]
F. Yang, Z. Wang, Y. Wu, S. Sakti, S. Nakamura, "Tackling multiple object tracking with complicated motions — Re-designing the integration of motion and appearance", Image and Vision Computing, Vol. 124, Aug 2022 [PDF] [Based on our winner solutions of the CVPR 2020 WAD MOT Challenge and the CVPR 2020 MOTS Challenge]
柳田智也, サクティサクリアニ, 中村哲, "日本語逐次音声合成における合成単位", 情報処理学会論文誌, Vol. 63, No. 4, pp. 1149-1158, Apr. 2022 [PDF]
B. Wu, S. Sakti, J. Zhang, S. Nakamura, "Modeling Unsupervised Empirical Adaptation by DPGMM and DPGMM-RNN Hybrid Model to Extract Perceptual Features for Low-Resource ASR", IEEE/ACM Transactions on Audio, Speech, and Language Processing (TASLP), Vol. 30, pp. 901-916, Feb 2022 [PDF]
S. Novitasari, S. Sakti, S. Nakamura, "Neural Incremental Speech Recognition Toward Real-Time Machine Speech Translation", IEICE Transactions on Information and Systems, E104.D (12), pp. 2195-2208, Dec 2021 [PDF]

査読付き国際会議論文・ Peer-reviewed International Conferences

W. Zhou, B.T. Atmaja, S. Sakti, "Granular Control of Nonverbal Expressions for Achieving Natural Emotional Text-to-Speech System", AAAI Workshop on Audio-Centric AI (Audio-AAAI), Jan 2026 [Best Poster Presentation Runner-Up]
C. Tran, S. Sakti, "Toward High-Quality Cross-lingual Text-to-Speech Synthesis In Low-Resource Scenarios", AAAI Workshop on Audio-Centric AI (Audio-AAAI), Jan 2026
R. Yagi, W. Zhou, H. Hu, Y. Hirano, S. Sakti, "Tuning Tone With Age: Adapting Dialogue Response Generation Based on LLMs and Self-Supervised Speaker Age Estimation", O-COCOSDA, Nov 2025 [PDF]
B.T. Atmaja, T. Shirai, S. Sakti, "Measuring Emotion Preservation in Expressive Speech-to-Speech Translation", O-COCOSDA, Nov 2025 [PDF]
W. Zhou, B.T. Atmaja, S. Sakti, "Toward Natural Emotional Text-to-Speech System with Fine-Grained Non-Verbal Expression Control", O-COCOSDA, Nov 2025 [PDF] [Best paper award]
S.A. Shaquille, D.P. Lestari, S. Sakti, "Stage-Wise Acoustic-Linguistic Fine-Tuning for Overlapped Speech Recognition: Does Ordering Matter?", O-COCOSDA, Nov 2025 [PDF] [Best paper award finalist]
Y. Ko, K. Sudoh, S. Nakamura, S. Sakti, "Bridging Disfluent to Fluent in Speech Translation: Effective Tagging and Fine-Tuning Strategies", O-COCOSDA, Nov 2025 [PDF]
M.A. Aye Aung, W. Pa Pa, S. Sakti, "Joining Diarization and Multi-Speaker Automatic Speech Recognition with Overlap Handling for Long Conversations", O-COCOSDA, Nov 2025 [PDF]
L. Fazry, K. Azizah, D. Tanaya, A. Purwarianti, D. Lestari, S. Sakti, "HifiDiff: Two Stream Diffusion Models for High Fidelity Speech Generation of Unseen Languages", O-COCOSDA, Nov 2025 [PDF]
F. Mehmood and S. Sakti, "Exploring the Relationship Between Ostracism and Communication Apprehension in Human-Robot Interaction", The 13th International Conference on Human-Agent Interaction (HAI), pp. 442-444, Nov 2025 [PDF]
A.D. Prasetyo, B.T. Atmaja, D. Arifianto, S. Sakti, "A Comparison of Solicited and Longitudinal Cough Sounds for Tuberculosis Detection", Asia Pacific Signal and Information Processing Association Annual Summit and Conference (APSIPA ASC), pp. 1657-1662, Oct 2025 [PDF]
B.T. Atmaja, S. Sakti, "Dementia Prediction From Speech Signal Using Optimized Prosodic Features", Asia Pacific Signal and Information Processing Association Annual Summit and Conference (APSIPA ASC), pp. 718-723, Oct 2025 [PDF]
B.A. Titalim, F. Mehmood, S. Sakti, "Rethinking Robust ASR Strategies: Can Textual In-Context Learning Improve Acoustic Robustness?", Asia Pacific Signal and Information Processing Association Annual Summit and Conference (APSIPA ASC), pp. 2541-2546, Oct 2025 [PDF] [Best paper award finalist]
J.M. Saragih, F. Mehmood, S. Sakti, "Beyond One-Shot Dubbing: Leveraging N-Best Translation and Prompted Paraphrasing with Synchrony Aware Re-Ranking", Asia Pacific Signal and Information Processing Association Annual Summit and Conference (APSIPA ASC), pp. 1146-1151, Oct 2025 [PDF]
H. Tan, R.-F. Widiaputri, J.-M. Saragih, Y. Ko, K. Sudoh, S. Nakamura and S. Sakti, "NAIST Simultaneous Speech Translation System for IWSLT 2025", IWSLT, pp. 369–378, Jul 2025 [PDF]
R.-F. Widiaputri, H. Tan, J.-M. Saragih, Y. Ko, K. Sudoh, S. Nakamura and S. Sakti, "NAIST Offline Speech Translation System for IWSLT 2025", IWSLT, pp. 360–368, Jul 2025 [PDF]
F. Mehmood and S. Sakti, "Behavioral Interdependence: A Mediator of Ostracism-Aggression Relationship in Human-Robot Interaction", The 21st IEEE International Conference on Advanced Robotics and its Social Impacts (ARSO), pp. 53-59, Jul 2025 [PDF]
F. Mehmood and S. Sakti, "Impact of Gender and Group Size on Right to Speak and Peer Pressure in Human–Robot Interaction", The 21st IEEE International Conference on Advanced Robotics and its Social Impacts (ARSO), pp. 86-92, Jul 2025 [PDF]
F. Mehmood and S. Sakti, "Role of Social Treatment and Mode of Communication in Shaping Social Acceptance", The 17th International Conference on Human System Interaction (HSI), pp. 1-8, Jul 2025 [PDF]
H. Watanabe, A.-S. Ihara, M. Okada, S. Sakti, M. Tachimori, E. Mizukami, and Y. Naruse, "Automated Classification of Non-Clinical Depressive States Based on EEG during Listening to Natural Speech", The 47th Annual International Conference of the IEEE Engineering in Medicine and Biology Society (EMBC), to appear, Jul 2025
M.-R. Ridha, S. Hasegawa, S. Sakti, "Toward Visual Pronunciation Learning: A Speech-to-Articulatory Animation Pipeline Leveraging wav2vec 2.0 and rtMRI Landmarks", ICASSP, Apr 2025 [PDF]
M. Nguyen, S. Hasegawa, S. Sakti, "Enhancing Unsupervised Acoustic Word Embedding with Visual-Grounded Speech Model and Novel Word-level ABX Evaluation Schemes", ICASSP, Apr 2025 [PDF]
C. Tran, S. Sakti, "From Pixels to Voice: A Simple and Efficient End-to-End Spoken Image Description Approach via Vision Codec Language Models", ICASSP, Apr 2025 [PDF]
R.N. Abdjul, D.P. Lestari, A. Purwarianti, C.O. Mawalim, S. Sakti, M. Unoki, "Indonesian Speech Content De-Identification in Low Resource Transcripts", The Second Workshop in South East Asian Language Processing (SEALP), COLING Satellite Workshop, Jan 2025 [PDF]
P.-A. Hafiz, C.-O. Mawalim, D. P. Lestari, S. Sakti, M. Unoki, "Anomalous Machine Sound Detection Based on Time Domain Gammatone Spectrogram Feature and IDNN Model", APSIPA ASC, Dec 2024 [PDF]
M.-H. Rafsanjani, C.-O. Mawalim, D. P. Lestari, S. Sakti, M. Unoki, "Unsupervised Anomalous Sound Detection Using Timbral and Human Voice Disorder-Related Acoustic Features", APSIPA ASC, Dec 2024 [PDF]
A. Adila, D.-P. Lestari, A. Purwarianti, D. Tanaya, K. Azizah, S. Sakti, "Enhancing Indonesian Automatic Speech Recognition: Evaluating Multilingual Models with Diverse Speech Variabilities", Oriental COCOSDA, Oct 2024 [PDF]
A.-A. Handoyo, C. Tran, D.-P. Lestari, S. Sakti, "Indonesian-English Code-Switching Speech Synthesizer Utilizing Multilingual STEN-TTS and BERT LID", Oriental COCOSDA, Oct 2024 [PDF]
Z. Zhang, S. Sakti, "A Feedback-driven Self-improvement Strategy And Emotion-aware Vocoder For Emotional Voice Conversion", Oriental COCOSDA, Oct 2024 [PDF] [Best paper award]
G. Tyndall, K. Azizah, D. Tanaya, A. Purwarianti, D. P. Lestari, S. Sakti, "Continual Learning in Machine Speech Chain Using Gradient Episodic Memory", Oriental COCOSDA, Oct 2024 [PDF]
D.-U. Dewangga, D. Puji, A. Purwarianti, D. Tanaya, K. Azizah, S. Sakti, "An Evaluation of Neural Vocoder-based Voice Cloning System for Dysphonia Speech Disorder", Oriental COCOSDA, Oct 2024 [PDF]
I.-P. Amin, H. Tan, K. Azizah, S. Sakti, "Chunk Size Scheduling for Optimizing The Quality-latency Trade-off in Simultaneous Speech Translation", Oriental COCOSDA, Oct 2024 [PDF]
H. Tan, S. Sakti, "Contrastive Feedback Mechanism for Simultaneous Speech Translation", INTERSPEECH, Sept 2024 [PDF]
R. Hartanto, S. Sakti, K. Shinoda, "MSDET: Multitask Speaker Separation and Direction-of-Arrival Estimation Training", INTERSPEECH, Sept 2024 [PDF]
Y. Hirano, M. Nguyen, K. Azuma, J.-M. Saragih, S. Sakti, "The NAIST System for the CHiME-8 NOTSOFAR-1 Task", International Workshop on Speech Processing in Everyday Environments (CHiME), INTERSPEECH Satellite Workshop, Sept 2024 [PDF]
A.-P. Naufal, D.-P. Lestari, A. Purwarianti, K. Azizah, D. Tanaya, S. Sakti, "Machine Speech Chain with Emotion Recognition", ICAICTA, Sept 2024 [PDF]
Y. Ko, R. Fukuda, Y. Nishikawa, Y. Kano, T. Yanagita, K. Doi, M. Makinae, H. Tan, M. Sakai, S. Sakti, K. Sudoh, S. Nakamura, "NAIST Simultaneous Speech Translation System for IWSLT 2024", IWSLT, Aug 2024 [PDF]
M.-R. Ridha, S. Sakti, "Refining rtMRI Landmark-Based Vocal Tract Contour Labels with FCN-Based Smoothing and Point-to-Curve Projection", LREC-COLING, May 2024 [PDF]
R. V. M. Tazakka, D. Lestari, A. Purwarianti, D. Tanaya, K. Azizah, S. Sakti, "Indonesian-English Code-Switching Speech Recognition Using the Machine Speech Chain Based Semi-Supervised Learning", SIGUL, LREC-COLING Satellite Workshop, May 2024 [PDF]
H.-T. Nguyen, S. Sakti, "Multilingual Self-supervised Visually Grounded Speech Models", SIGUL, LREC-COLING Satellite Workshop, May 2024 [PDF]
S. Sakti, B.A. Titalim, "Leveraging the Multilingual Indonesian Ethnic Languages Dataset in Self-supervised Model for Low-resource ASR Task", ASRU, Dec 2023 [PDF]
R.F. Widiaputri, A. Purwarianti, D. Lestari, K. Azizah, D. Tanaya, S. Sakti, "Speech Recognition and Meaning Interpretation: Towards Disambiguation of Structurally Ambiguous Spoken Utterances in Indonesian", EMNLP, pp. 16813–16824, Dec 2023 [PDF]
B. Hartanti, D. Tanaya, K. Azizah, D. Lestari, A. Purwarianti, S. Sakti, "Generating Speech with Prosodic Prominence based on SSL-Visually Grounded Models", Oriental COCOSDA, Dec 2023 [PDF]
H. Xi and S. Sakti, "Exploring Difficulties Encountered by Professional Interpreters in Japanese-to-English and English-to-Japanese Simultaneous Translation", Oriental COCOSDA, Dec 2023 [PDF]
C. Tran, C.M. Luong, S. Sakti, "STEN-TTS: Improving Zero-shot Cross-Lingual Transfer for Multi-Lingual TTS with Style-Enhanced Normalization Diffusion Framework", INTERSPEECH, pp. 4464-4468, Aug 2023 [PDF]
S. Takahashi, S. Sakti, "Unsupervised Learning of Discrete Latent Representations with Data-Adaptive Dimensionality from Continuous Speech Streams", INTERSPEECH, pp. 416-420, Aug 2023 [PDF]
T.D. Tran, S. Sakti, "Low-Resource Japanese-English Speech-to-Text Translation Leveraging Speech-Text Unified-model Representation Learning", INTERSPEECH Satellite Workshop - the ELRA/ISCA Special Interest Group on Under-resourced Languages (SIGUL), pp. 78-82, Aug 2023 [PDF]
L.T. Nguyen, S. Sakti, "VGSAlign: Bilingual Speech Alignment of Unpaired and Untranscribed Languages using Self-Supervised Visually Grounded Speech Models", INTERSPEECH Satellite Workshop - the ELRA/ISCA Special Interest Group on Under-resourced Languages (SIGUL), pp. 53-57, Aug 2023 [PDF]
R. Fukuda, Y. Nishikawa, Y. Kano, Y. Ko, T. Yanagita, K. Doi, M. Makinae, S. Sakti, K. Sudoh, S. Nakamura, "NAIST Simultaneous Speech-to-speech Translation System for IWSLT 2023", IWSLT, pp. 330-340, Jul 2023 [PDF]
S. Cahyawijaya, H. Lovenia, A.F. Aji, G.I. Winata, B. Wilie, F. Koto, R. Mahendra, C. Wibisono, A. Romadhony, K. Vincentio, J. Santoso, D. Moeljadi, C. Wirawan, F. Hudi, M.S. Wicaksono, I.H. Parmonangan, I. Alfina, I.F. Putra, S. Rahmadani, Y. Oenang, A.A. Septiandri, J. Jaya, K. Dhole, A.A. Suryani, R.A. Putri, D. Su, K. Stevens, M.N. Nityasya, M.F. Adilazuarda, R. Ignatius, R. Diandaru, V. Ghifari, T. Yu, W. Dai, Y. Xu, D. Damapuspita, H.A. Wibowo, C. Tho, I.M. Karo, T.N. Fatyanosa, Z. Ji, G. Neubig, T. Baldwin, S. Ruder, P. Fung, H. Sujaini, S. Sakti, A. Purwarianti, "NusaCrowd: Open Source Initiative for Indonesian NLP Resources", ACL Findings, pp. 13745-13818, Jul 2023 [PDF]
J. Chen, S. Sakti, "An Isotropy Analysis for Self-supervised Acoustic Unit Embeddings on the Zero Resource Speech Challenge 2021 Framework", IEEE ICASSP, Jun 2023 [PDF]
S. Novitasari, S. Sakti, S. Nakamura, "Self-adaptive Incremental Machine Speech Chain for Lombard TTS with High-granularity ASR Feedback in Dynamic Noise Condition", IEEE ICASSP, Jun 2023 [PDF]
H. Qi, S. Novitasari, A. Tjandra, S. Sakti, S. Nakamura, "SpeeChain: A Speech Toolkit for Large-Scale Machine Speech Chain," arXiv preprint arXiv:2301.02966, Jan 2023 [PDF]
R. Chevi, R.E. Prasojo, A.F. Aji, A. Tjandra, S. Sakti, "Nix-TTS: Lightweight and End-to-End Text-to-Speech via Module-wise Distillation", IEEE SLT, Jan 2023 [PDF]
H. Qi, S. Novitasari, S. Sakti, S. Nakamura, "Improved Consistency Training for Semi-Supervised Sequence-to-Sequence ASR via Speech Chain Reconstruction and Self-Transcribing", INTERSPEECH, pp. 3413-3417, Sep 2022 [PDF]
R. Fukuda, Y. Ko, Y. Kano, K. Doi, H. Tokuyama, S. Sakti, K. Sudoh, S. Nakamura, "NAIST Simultaneous Speech-to-Text Translation System for IWSLT 2022", International Conference on Spoken Language Translation (IWSLT), pp. 286-292, May 2022 [PDF]
S. Asai, K. Yoshino, S. Shinagawa, S. Sakti, S. Nakamura, "Eliciting Cooperative Persuasive Dialogue by Multimodal Emotional Robot", International Workshop on Spoken Dialogue Systems Technology (IWSDS), Nov 2021 [PDF]
R. Fukuda, S. Novitasari, Y. Oka, Y. Kano, Y. Yano, Y. Ko, H. Tokuyama, K. Doi, T. Yanagita, S. Sakti, K. Sudoh, S. Nakamura, "Simultaneous Speech-to-speech Translation System with Transformer-based Incremental ASR, MT, and TTS", Oriental COCOSDA, pp. 186-192, Nov 2021 [PDF]
N. Kaiki, S. Sakti, S. Nakamura, "Using Local Phrase Dependency Structure Information in Neural Sequence-to-sequence Speech Synthesis", Oriental COCOSDA, pp. 206-211, Nov 2021 [PDF]
N. Tachimori, S. Sakti, S. Nakamura, "Multi-Encoder Sequential Attention Network for Context-Aware Speech Recognition in Japanese Dialog Conversation", Oriental COCOSDA, pp. 1-6, Nov 2021 [PDF] [Best paper award]

国内会議論文・ Domestic Conferences

中村佳登, メフムードファイサル, サクティサクリアニ, "日英コードスイッチングが社会的なヒューマンロボットインタラクションに及ぼす影響", SIG-SLUD, Mar 2025 [PDF]
久保田なつみ, サクティサクリアニ, "音声翻訳フレームワークによる吃音音声の自動音声認識に対する課題への取り組み", SIG-SLUD, Mar 2025 [PDF]
髙橋舜, 金崎朝子, 須田仁志, サクティサクリアニ, "音声信号から文字記号を創り出す―深層ベイズに基づく教師なし表現学習によるアプローチ―", NLP, Mar 2025 [PDF]
胡尤佳, 須藤克仁, 中村哲, サクティサクリアニ, "音声認識出力の曖昧性を考慮したMulti-task End-to-end音声翻訳と曖昧性の高い音声入力に対する頑健性の分析", NLP, Mar 2025 [PDF]
R. F. Widiaputri, A. Purwarianti, D. P. Lestari, K. Azizah, D. Tanaya, S. Sakti, "Disambiguating Ambiguous Indonesian Utterances with ASR and Meaning Interpretation", ASJ Spring Meeting, Mar 2025
R. Hartanto, S. Sakti, K. Shinoda, "Multitask Training of Multi-channel Speaker Separation and Room Acoustic Parameter Estimation", ASJ Spring Meeting, Mar 2025
東翔, サクティサクリアニ, "中間 CTC 目標を活用した多言語 ASR におけるコードスイッチングの向上", ASJ Spring Meeting, Mar 2025
Y. Wang, S. Sakti, "Flow Matchingによる周波数領域でのフローマッチングを用いた高速ニューラルボコーダー", ASJ Spring Meeting, Mar 2025
H. Tan, S. Sakti, "Improving Simultaneous Speech Translation with a Contrastive Feedback Mechanism", ASJ Spring Meeting, Mar 2025
Y. Hirano, M. Nguyen, K. Azuma, J. M. Saragih, S. Sakti, "The NAIST System for the CHiME-8 Distant Meeting Transcription Challenge", ASJ Spring Meeting, Mar 2025
安藤宏祐, 平野雄太, 佐藤颯空, サクティサクリアニ, "音声認識誤りが ChatGPT の翻訳に与える影響の調査", ASJ Spring Meeting, Mar 2025
佐藤颯空, サクティサクリアニ, "拡散モデルベース DNN 音声合成のバックボーンに着目した軽量化とカーネル形状変化の影響", ASJ Spring Meeting, Mar 2025
S. Sakti, B. A. Titalim, "Investigation of Cross-Lingual Mismatch in Low-resource ASR for Indonesian Ethnic Languages", ASJ Spring Meeting, Mar 2024
C. Tran, C.-M. Luong, S. Sakti, "Maintaining Personal Styles in Multilingual TTS with STEN Approach in Diffusion Framework", ASJ Spring Meeting, Mar 2024
R. Hartanto, S. Sakti, K. Shinoda, "Multitask Learning of Speaker Separation and Direction-of-Arrival Estimation", ASJ Spring Meeting, Mar 2024
Z. Zhang, S. Sakti, "Non-Parallel Limited Data Emotion Voice Conversion with Variance Adapter and Non-Autoregressive Decoder", ASJ Spring Meeting, Mar 2024
S. Takahashi, S. Sakti, "Deep Sequential Generative Modeling for Unsupervised Learning of Linguistic Representations from Speech Streams", ASJ Spring Meeting, Mar 2024
H. Xi, S. Sakti, "Perceived Challenges in Simultaneous Japanese-English Translation", ASJ Spring Meeting, Mar 2024
L.-T. Nguyen, S. Sakti, "Utilizing Self-Supervised Visually Grounded Speech Models for Aligning Unpaired and Untranscribed Bilingual Speech", ASJ Spring Meeting, Mar 2024
M. Liu, S. Sakti, "Generating Textual Prosody based on ASR", ASJ Spring Meeting, Mar 2024
J. Effendi, S. Sakti, S. Nakamura, "Cyclic Partially-aligned Transformer for Visually Connected Speech-to-text Mapping", ASJ Spring Meeting, Mar 2023
多谷邦彦, サクティサクリアニ, 藤原修治, 中村哲, "X-vector を用いた日本語電話音声に対するテキスト独立型話者照合システムの検討", 日本音響学会誌, 79巻1号, pp.18-25, Dec. 2022 [PDF]
S. Novitasari, S. Sakti, S. Nakamura, "Improving Intelligibility of Synthesized Speech in Noisy Condition with Dynamically Adaptive Machine Speech Chain", 情報処理学会音声言語情報処理研究会 SIG-SLP, Dec. 2021 [PDF]
