HAI-Lab Publications

          





招待講演 ・ Invited Talks
  1. [Invited speaker] S. Sakti, "Language Technologies for All Mother Languages: Opportunities and Challenges", The International Mother Language Day (IMLD), Paris, Feb 20th, 2025
  2. [Invited speaker] S. Sakti, "Machine Speech Chain: Modeling Human Speech Perception and Production with Auditory Feedback Mechanism" [A joint work with A. Tjandra, J. Effendi, S. Novitasari, S. Nakamura (NAIST, Japan)], The NUS Computer Science Research Week, Singapore, Jan 10th, 2025
  3. [Keynote speaker] S. Sakti, "Machine Speech Chain: From Human Auditory Feedback Principles to Language Technology Empowering Indigenous Communities" [A joint work with M. Heck, A. Tjandra, J. Effendi, S. Novitasari, S. Nakamura (NAIST, Japan), J. Mariani (LIMSI), C. Soria (CNR-ILC), M. Malero (BSC)], The 21st International Conference on Natural Language Processing (ICON), India, Dec 21st, 2024
  4. [Invited speaker] S. Sakti, "Leveraging the Foundational Speech Chain Models to Empower Low-Resource Languages" [A joint work with A. Tjandra, J. Effendi, S. Novitasari, S. Nakamura (NAIST, Japan)], The IX International Scientific Conference “Modern Problems of Applied Mathematics and Information Technologies Al-Khwarizmi 2024”, Uzbekistan, Oct 22nd, 2024
  5. [Keynote speaker] S. Sakti, "Machine Speech Chain: A Deep Learning Approach for Modeling Human Speech Perception and Production with Auditory Feedback Mechanism for Low-Resource Languages" [A joint work with A. Tjandra, J. Effendi, S. Novitasari, S. Nakamura (NAIST, Japan)], The 11th International Conference on Computer, Control, Informatics and its Applications (IC3INA), Indonesia, Oct 10th, 2024
  6. [Invited speaker] S. Sakti, "Language Technology for All: Leveraging Foundational Speech Models to Empower Low-Resource Languages" [A joint work with M. Heck, A. Tjandra, J. Effendi, S. Novitasari, S. Nakamura (NAIST, Japan), J. Mariani (LIMSI), K. Choukri (ELRA), C. Soria (CNR-ILC), M. Malero (BSC)], The INTERSPEECH Workshop on Synthetic Data’s Transformative Role in Foundational Speech Models (SynData4GenAI 2024), Greece, Aug 31st, 2024
  7. [Keynote speaker] S. Sakti, "Communicative Intelligent Systems towards Society 5.0," Sarasehan Nasional Pendidikan Tinggi Informatika dan Pemberian Tribute kepada Penggagas dan Pendidik Senior Teknik Informatika ITB, Indonesia, Feb 2nd, 2023
  8. [Invited speaker] S. Sakti, "Language Technology for All: From the indigenous community perspectives" [A joint work with J. Mariani (LIMSI), K. Choukri (ELRA), C. Soria (CNR-ILC), M. Malero (BSC)], "Data, Technologies and Benchmarks for the Spoken Languages of the World" Meeting, IEEE SLT, China, Jan 13th, 2023
  9. [Keynote speaker] S. Sakti, "Language Technology for All: From the technology and indigenous community perspectives" [A joint work with M. Heck, A. Tjandra, J. Effendi, S. Novitasari, S. Nakamura (NAIST, Japan), J. Mariani (LIMSI), K. Choukri (ELRA), C. Soria (CNR-ILC), M. Malero (BSC)], the 25th Conference of the Oriental COCOSDA, Vietnam, Nov 25th, 2022
  10. [Invited panelist] O. Scharenborg (TU Delft, Netherlands), E. Ahn (U. Washington, USA), G. Anumanchipalli (UC Berkeley, USA), S. Sakti (JAIST, Japan), Moderator: A. Black, "Data Collection, Bias, and Ethical Concerns in Speech Processing," Speech for Social Good - INTERSPEECH Satellite Workshop, [Virtual], September 24th, 2022
  11. [Invited speaker] S. Sakti, "Semi-supervised Learning for Low-resource Multilingual and Multimodal Speech Processing with Machine Speech Chain" [A joint work with A. Tjandra, J. Effendi, S. Novitasari, S. Nakayama, T. Yanagita, S. Nakamura (NAIST/RIKEN AIP, Japan)], HiTZ Language Technology Webinar, Basque, May 5th, 2022
  12. [Invited speaker] S. Sakti, "Self-Adaptive Machine Speech Chain in Noisy Environment" [A joint work with A. Tjandra, J. Effendi, S. Novitasari, S. Nakamura (NAIST/RIKEN AIP, Japan)], The AAAI workshop on Self-supervised Learning for Audio and Speech Processing, [Virtual], Feb 28th, 2022
  13. [Invited speaker] S. Sakti, "Machine Speech Chain: A Deep Learning Approach for Modeling Human Speech Perception and Production with Auditory Feedback Mechanism" [A joint work with A. Tjandra, J. Effendi, S. Novitasari, S. Nakamura (NAIST/RIKEN AIP, Japan)], The ITB Seminar, Dec 24th, 2021
  14. [Keynote speaker] S. Sakti, "Machine Speech Chain: A Deep Learning Approach for Training and Inference through Feedback Loop" [A joint work with A. Tjandra, J. Effendi, S. Novitasari, S. Nakamura (NAIST/RIKEN AIP, Japan)], the IEEE Automatic Speech Recognition and Understanding Workshop (ASRU), Cartagena, Colombia, Dec 15th, 2021
  15. [Keynote speaker] S. Sakti, "Listening while Speaking and Visualizing: A Semi-supervised Approach with Multimodal Machine Speech Chain" [A joint work with A. Tjandra, J. Effendi, S. Novitasari, S. Nakamura (NAIST/RIKEN AIP, Japan)], the SoCS International Seminar, Dec 10th, 2021
  16. [Keynote speaker] S. Sakti, "Listening while Speaking and Visualizing: A Semi-supervised Approach with Multimodal Machine Speech Chain" [A joint work with A. Tjandra, J. Effendi, S. Novitasari, S. Nakamura (NAIST/RIKEN AIP, Japan)], the International Conference of Artificial Intelligence and Speech Technology (AIST), Nov 13th, 2021


著書 / 書籍の一章 ・ Scientific Book / Chapter in Scientific Book
  1. S. Asai, K. Yoshino, S. Shinagawa, S. Sakti, S. Nakamura, "Eliciting Cooperative Persuasive Dialogue by Multimodal Emotional Robot," In: Stoyanchev, S., Ultes, S., Li, H. (eds) Conversational AI for Natural Human-Centric Interaction. Lecture Notes in Electrical Engineering, vol 943. Springer, Singapore, Nov 2022 [PDF]
  2. S. Sakti, "Language technology impact on linguistic diversity". In Book: "State of the art of indigenous languages in research: a collection of selected research papers," In the framework of the International Decade of Indigenous Languages (2022-2032), UNESCO Open Access Repository, pp. 341-348, May 2022 [PDF]
  3. L.-P. Morency, S. Sakti, B.W. Schuller, S. Ultes, "Multimodal Machine Learning for Social Interaction with Ageing Individuals". In Book: J. Miehle, W. Minker, E. André, K. Yoshino (eds), "Multimodal Agents for Ageing and Multicultural Societies," Springer, Singapore, pp. 61–70, Oct. 2021 [PDF]


査読付き学術論文誌 ・ Peer-reviewed Scientific Journals
  1. Y. Ko, R. Fukuda, Y. Nishikawa, Y. Kano, K. Sudoh, S. Sakti, S. Nakamura, "End-to-end Simultaneous Speech Translation with Style Tags using Human Simultaneous Interpretation Data", Journal of Natural Language Processing, Vol.32, No.2, Jun 2025
  2. C. Tran, C.-M. Luong, S. Sakti, "Zero-Shot Cross-Lingual Text-to-Speech With Style-Enhanced Normalization and Auditory Feedback Training Mechanism", IEEE Transactions on Audio, Speech and Language Processing (TASLP), Vol. 33, pp. 1479 - 1492, Mar 2025 [PDF]
  3. L.-T. Nguyen, S. Sakti, "ZeST: A Zero-Resourced Speech-to-Speech Translation Approach for Unknown, Unpaired, and Untranscribed Languages", IEEE Access, Vol. 13, pp. 8638 - 8648, Jan 2025 [PDF]
  4. K. Furukawa, T. Kishiyama, S. Nakamura, S. Sakti, "Applying Syntax-Prosody Mapping Hypothesis and Boundary-Driven Theory to Neural Sequence-to-Sequence Speech Synthesis", IEEE Access, Vol. 12, pp. 160896 - 160917, Oct 2024 [PDF]
  5. Y. Ko, K. Sudoh, S. Sakti and S. Nakamura, "Neural End-to-end Speech Translation Leveraged by ASR Posterior Distribution", IEICE Transactions on Information and Systems, Vol. E107-D, No. 10, pp. 1322 - 1331, Oct 2024 [PDF]
  6. B. Putra, K. Azizah, C.-O. Mawalim, I.-A. Hanif, S. Sakti, C.-W. Leong, S. Okada, "MAG-BERT-ARL for Fair Automated Video Interview Assessment", IEEE Access, Vol. 12, pp. 145188 - 145205, Oct 2024 [PDF]
  7. T. Yanagita, S. Sakti, S. Nakamura, "Japanese Neural Incremental Text-to-Speech Synthesis Framework With an Accent Phrase Input", IEEE Access, Vol. 11, pp. 22355 - 22363, Mar 2023 [PDF]
  8. S. Novitasari, S. Sakti, S. Nakamura, "A Machine Speech Chain Approach for Dynamically Adaptive Lombard TTS in Static and Dynamic Noise Environments", IEEE/ACM Transactions on Audio, Speech, and Language Processing, Vol. 30, pp. 2673-2688, Aug 2022 [PDF]
  9. F. Yang, Z. Wang, Y. Wu, S. Sakti, S. Nakamura, "Tackling multiple object tracking with complicated motions — Re-designing the integration of motion and appearance", Image and Vision Computing, Vol. 124, Aug 2022 [PDF] [Based on our winning solutions of the CVPR 2020 WAD MOT Challenge and the CVPR 2020 MOTS Challenge]
  10. 柳田 智也, サクティ サクリアニ, 中村 哲, "日本語逐次音声合成における合成単位" [Synthesis Units in Japanese Incremental Speech Synthesis], 情報処理学会論文誌 (IPSJ Journal), Vol. 63, No. 4, pp. 1149-1158, Apr. 2022 [PDF]
  11. B. Wu, S. Sakti, J. Zhang, S. Nakamura, "Modeling Unsupervised Empirical Adaptation by DPGMM and DPGMM-RNN Hybrid Model to Extract Perceptual Features for Low-Resource ASR", IEEE/ACM Transactions on Audio, Speech, and Language Processing (TASLP), Vol. 30, pp. 901-916, Feb 2022 [PDF]
  12. S. Novitasari, S. Sakti, S. Nakamura, "Neural Incremental Speech Recognition Toward Real-Time Machine Speech Translation", IEICE Transactions on Information and Systems, Vol. E104-D, No. 12, pp. 2195-2208, Dec 2021 [PDF]


査読付き国際会議論文 ・ Peer-reviewed International Conferences
  1. H. Tan, R.-F. Widiaputri, J.-M. Saragih, Y. Ko, K. Sudoh, S. Nakamura and S. Sakti, "NAIST Simultaneous Speech Translation System for IWSLT 2025", IWSLT, to appear, Jul 2025
  2. R.-F. Widiaputri, H. Tan, J.-M. Saragih, Y. Ko, K. Sudoh, S. Nakamura and S. Sakti, "NAIST Offline Speech Translation System for IWSLT 2025", IWSLT, to appear, Jul 2025
  3. F. Mehmood and S. Sakti, "Behavioral Interdependence: A Mediator of Ostracism-Aggression Relationship in Human-Robot Interaction", The 21st IEEE International Conference on Advanced Robotics and its Social Impacts (ARSO), to appear, Jul 2025
  4. F. Mehmood and S. Sakti, "Impact of Gender and Group Size on Right to Speak and Peer Pressure in Human–Robot Interaction", The 21st IEEE International Conference on Advanced Robotics and its Social Impacts (ARSO), to appear, Jul 2025
  5. F. Mehmood and S. Sakti, "Role of Social Treatment and Mode of Communication in Shaping Social Acceptance", The 17th International Conference on Human System Interaction (HSI), to appear, Jul 2025
  6. H. Watanabe, A.-S. Ihara, M. Okada, S. Sakti, M. Tachimori, E. Mizukami, and Y. Naruse, "Automated Classification of Non-Clinical Depressive States Based on EEG during Listening to Natural Speech", The 47th Annual International Conference of the IEEE Engineering in Medicine and Biology Society (EMBC), to appear, Jul 2025
  7. M.-R. Ridha, S. Hasegawa, S. Sakti, "Toward Visual Pronunciation Learning: A Speech-to-Articulatory Animation Pipeline Leveraging wav2vec 2.0 and rtMRI Landmarks", ICASSP, Apr 2025 [PDF]
  8. M. Nguyen, S. Hasegawa, S. Sakti, "Enhancing Unsupervised Acoustic Word Embedding with Visual-Grounded Speech Model and Novel Word-level ABX Evaluation Schemes", ICASSP, Apr 2025 [PDF]
  9. C. Tran, S. Sakti, "From Pixels to Voice: A Simple and Efficient End-to-End Spoken Image Description Approach via Vision Codec Language Models", ICASSP, Apr 2025 [PDF]
  10. R.N. Abdjul, D.P. Lestari, A. Purwarianti, C.O. Mawalim, S. Sakti, M. Unoki, "Indonesian Speech Content De-Identification in Low Resource Transcripts", The Second Workshop in South East Asian Language Processing (SEALP), COLING Satellite Workshop, Jan 2025 [PDF]
  11. P.-A. Hafiz, C.-O. Mawalim, D. P. Lestari, S. Sakti, M. Unoki, "Anomalous Machine Sound Detection Based on Time Domain Gammatone Spectrogram Feature and IDNN Model", APSIPA ASC, Dec 2024 [PDF]
  12. M.-H. Rafsanjani, C.-O. Mawalim, D. P. Lestari, S. Sakti, M. Unoki, "Unsupervised Anomalous Sound Detection Using Timbral and Human Voice Disorder-Related Acoustic Features", APSIPA ASC, Dec 2024 [PDF]
  13. A. Adila, D.-P. Lestari, A. Purwarianti, D. Tanaya, K. Azizah, S. Sakti, "Enhancing Indonesian Automatic Speech Recognition: Evaluating Multilingual Models with Diverse Speech Variabilities", Oriental COCOSDA, Oct 2024 [PDF]
  14. A.-A. Handoyo, C. Tran, D.-P. Lestari, S. Sakti, "Indonesian-English Code-Switching Speech Synthesizer Utilizing Multilingual STEN-TTS and BERT LID", Oriental COCOSDA, Oct 2024 [PDF]
  15. Z. Zhang, S. Sakti, "A Feedback-driven Self-improvement Strategy And Emotion-aware Vocoder For Emotional Voice Conversion", Oriental COCOSDA, Oct 2024 [PDF] [Best paper award]
  16. G. Tyndall, K. Azizah, D. Tanaya, A. Purwarianti, D. P. Lestari, S. Sakti, "Continual Learning in Machine Speech Chain Using Gradient Episodic Memory", Oriental COCOSDA, Oct 2024 [PDF]
  17. D.-U. Dewangga, D. Puji, A. Purwarianti, D. Tanaya, K. Azizah, S. Sakti, "An Evaluation of Neural Vocoder-based Voice Cloning System for Dysphonia Speech Disorder", Oriental COCOSDA, Oct 2024 [PDF]
  18. I.-P. Amin, H. Tan, K. Azizah, S. Sakti, "Chunk Size Scheduling for Optimizing The Quality-latency Trade-off in Simultaneous Speech Translation", Oriental COCOSDA, Oct 2024 [PDF]
  19. H. Tan, S. Sakti, "Contrastive Feedback Mechanism for Simultaneous Speech Translation", INTERSPEECH, Sept 2024 [PDF]
  20. R. Hartanto, S. Sakti, K. Shinoda, "MSDET: Multitask Speaker Separation and Direction-of-Arrival Estimation Training", INTERSPEECH, Sept 2024 [PDF]
  21. Y. Hirano, M. Nguyen, K. Azuma, J.-M. Saragih, S. Sakti, "The NAIST System for the CHiME-8 NOTSOFAR-1 Task", International Workshop on Speech Processing in Everyday Environments (CHiME), INTERSPEECH Satellite Workshop, Sept 2024 [PDF]
  22. A.-P. Naufal, D.-P. Lestari, A. Purwarianti, K. Azizah, D. Tanaya, S. Sakti, "Machine Speech Chain with Emotion Recognition", ICAICTA, Sept 2024 [PDF]
  23. Y. Ko, R. Fukuda, Y. Nishikawa, Y. Kano, T. Yanagita, K. Doi, M. Makinae, H. Tan, M. Sakai, S. Sakti, K. Sudoh, S. Nakamura, "NAIST Simultaneous Speech Translation System for IWSLT 2024", IWSLT, Aug 2024 [PDF]
  24. M.-R. Ridha, S. Sakti, "Refining rtMRI Landmark-Based Vocal Tract Contour Labels with FCN-Based Smoothing and Point-to-Curve Projection", LREC-COLING, May 2024 [PDF]
  25. R. V. M. Tazakka, D. Lestari, A. Purwarianti, D. Tanaya, K. Azizah, S. Sakti, "Indonesian-English Code-Switching Speech Recognition Using the Machine Speech Chain Based Semi-Supervised Learning", SIGUL, LREC-COLING Satellite Workshop, May 2024 [PDF]
  26. H.-T. Nguyen, S. Sakti, "Multilingual Self-supervised Visually Grounded Speech Models", SIGUL, LREC-COLING Satellite Workshop, May 2024 [PDF]
  27. S. Sakti, B.A. Titalim, "Leveraging the Multilingual Indonesian Ethnic Languages Dataset in Self-supervised Model for Low-resource ASR Task", ASRU, Dec 2023 [PDF]
  28. R.F. Widiaputri, A. Purwarianti, D. Lestari, K. Azizah, D. Tanaya, S. Sakti, "Speech Recognition and Meaning Interpretation: Towards Disambiguation of Structurally Ambiguous Spoken Utterances in Indonesian", EMNLP, pp. 16813–16824, Dec 2023 [PDF]
  29. B. Hartanti, D. Tanaya, K. Azizah, D. Lestari, A. Purwarianti, S. Sakti, "Generating Speech with Prosodic Prominence based on SSL-Visually Grounded Models", Oriental COCOSDA, Dec 2023 [PDF]
  30. H. Xi and S. Sakti, "Exploring Difficulties Encountered by Professional Interpreters in Japanese-to-English and English-to-Japanese Simultaneous Translation", Oriental COCOSDA, Dec 2023 [PDF]
  31. C. Tran, C.M. Luong, S. Sakti, "STEN-TTS: Improving Zero-shot Cross-Lingual Transfer for Multi-Lingual TTS with Style-Enhanced Normalization Diffusion Framework", INTERSPEECH, pp. 4464-4468, Aug 2023 [PDF]
  32. S. Takahashi, S. Sakti, "Unsupervised Learning of Discrete Latent Representations with Data-Adaptive Dimensionality from Continuous Speech Streams", INTERSPEECH, pp. 416-420, Aug 2023 [PDF]
  33. T.D. Tran, S. Sakti, "Low-Resource Japanese-English Speech-to-Text Translation Leveraging Speech-Text Unified-model Representation Learning", INTERSPEECH Satellite Workshop - the ELRA/ISCA Special Interest Group on Under-resourced Languages (SIGUL), pp. 78-82, Aug 2023 [PDF]
  34. L.T. Nguyen, S. Sakti, "VGSAlign: Bilingual Speech Alignment of Unpaired and Untranscribed Languages using Self-Supervised Visually Grounded Speech Models", INTERSPEECH Satellite Workshop - the ELRA/ISCA Special Interest Group on Under-resourced Languages (SIGUL), pp. 53-57, Aug 2023 [PDF]
  35. R. Fukuda, Y. Nishikawa, Y. Kano, Y. Ko, T. Yanagita, K. Doi, M. Makinae, S. Sakti, K. Sudoh, S. Nakamura, "NAIST Simultaneous Speech-to-speech Translation System for IWSLT 2023", IWSLT, pp. 330-340, Jul 2023 [PDF]
  36. S. Cahyawijaya, H. Lovenia, A.F. Aji, G.I. Winata, B. Wilie, F. Koto, R. Mahendra, C. Wibisono, A. Romadhony, K. Vincentio, J. Santoso, D. Moeljadi, C. Wirawan, F. Hudi, M.S. Wicaksono, I.H. Parmonangan, I. Alfina, I.F. Putra, S. Rahmadani, Y. Oenang, A.A. Septiandri, J. Jaya, K. Dhole, A.A. Suryani, R.A. Putri, D. Su, K. Stevens, M.N. Nityasya, M.F. Adilazuarda, R. Ignatius, R. Diandaru, V. Ghifari, T. Yu, W. Dai, Y. Xu, D. Damapuspita, H.A. Wibowo, C. Tho, I.M. Karo, T.N. Fatyanosa, Z. Ji, G. Neubig, T. Baldwin, S. Ruder, P. Fung, H. Sujaini, S. Sakti, A. Purwarianti, "NusaCrowd: Open Source Initiative for Indonesian NLP Resources", ACL Findings, pp. 13745-13818, Jul 2023 [PDF]
  37. J. Chen, S. Sakti, "An Isotropy Analysis for Self-supervised Acoustic Unit Embeddings on the Zero Resource Speech Challenge 2021 Framework", IEEE ICASSP, Jun 2023 [PDF]
  38. S. Novitasari, S. Sakti, S. Nakamura, "Self-adaptive Incremental Machine Speech Chain for Lombard TTS with High-granularity ASR Feedback in Dynamic Noise Condition", IEEE ICASSP, Jun 2023 [PDF]
  39. H. Qi, S. Novitasari, A. Tjandra, S. Sakti, S. Nakamura, "SpeeChain: A Speech Toolkit for Large-Scale Machine Speech Chain," arXiv preprint arXiv:2301.02966, Jan 2023 [PDF]
  40. R. Chevi, R.E. Prasojo, A.F. Aji, A. Tjandra, S. Sakti, "Nix-TTS: Lightweight and End-to-End Text-to-Speech via Module-wise Distillation", IEEE SLT, Jan 2023 [PDF]
  41. H. Qi, S. Novitasari, S. Sakti, S. Nakamura, "Improved Consistency Training for Semi-Supervised Sequence-to-Sequence ASR via Speech Chain Reconstruction and Self-Transcribing", INTERSPEECH, pp. 3413-3417, Sep 2022 [PDF]
  42. R. Fukuda, Y. Ko, Y. Kano, K. Doi, H. Tokuyama, S. Sakti, K. Sudoh, S. Nakamura, "NAIST Simultaneous Speech-to-Text Translation System for IWSLT 2022", International Conference on Spoken Language Translation (IWSLT), pp. 286-292, May 2022 [PDF]
  43. S. Asai, K. Yoshino, S. Shinagawa, S. Sakti, S. Nakamura, "Eliciting Cooperative Persuasive Dialogue by Multimodal Emotional Robot", International Workshop on Spoken Dialogue Systems Technology (IWSDS), Nov 2021 [PDF]
  44. R. Fukuda, S. Novitasari, Y. Oka, Y. Kano, Y. Yano, Y. Ko, H. Tokuyama, K. Doi, T. Yanagita, S. Sakti, K. Sudoh, S. Nakamura, "Simultaneous Speech-to-speech Translation System with Transformer-based Incremental ASR, MT, and TTS", Oriental COCOSDA, pp. 186-192, Nov 2021 [PDF]
  45. N. Kaiki, S. Sakti, S. Nakamura, "Using Local Phrase Dependency Structure Information in Neural Sequence-to-sequence Speech Synthesis", Oriental COCOSDA, pp. 206-211, Nov 2021 [PDF]
  46. N. Tachimori, S. Sakti, S. Nakamura, "Multi-Encoder Sequential Attention Network for Context-Aware Speech Recognition in Japanese Dialog Conversation", Oriental COCOSDA, pp. 1-6, Nov 2021 [PDF] [Best paper award]


国内会議論文 ・ Domestic Conferences
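  1. 中村 佳登, メフムード ファイサル, サクティ サクリアニ, "日英コードスイッチングが社会的なヒューマンロボットインタラクションに及ぼす影響" [The Effect of Japanese-English Code-Switching on Social Human-Robot Interaction], SIG-SLUD, Mar 2025 [PDF]
  2. 久保田 なつみ, サクティ サクリアニ, "音声翻訳フレームワークによる吃音音声の自動音声認識に対する課題への取り組み" [Addressing the Challenges of Automatic Speech Recognition for Stuttered Speech with a Speech Translation Framework], SIG-SLUD, Mar 2025 [PDF]
  3. 髙橋 舜, 金崎 朝子, 須田 仁志, サクティ サクリアニ, "音声信号から文字記号を創り出す―深層ベイズに基づく教師なし表現学習によるアプローチ―" [Creating Written Symbols from Speech Signals: An Approach Using Unsupervised Representation Learning Based on Deep Bayesian Methods], NLP, Mar 2025 [PDF]
  4. 胡 尤佳, 須藤 克仁, 中村 哲, サクティ サクリアニ, "音声認識出力の曖昧性を考慮したMulti-task End-to-end音声翻訳と曖昧性の高い音声入力に対する頑健性の分析" [Multi-task End-to-end Speech Translation Considering the Ambiguity of Speech Recognition Output, and an Analysis of Robustness to Highly Ambiguous Speech Input], NLP, Mar 2025 [PDF]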
  5. R. F. Widiaputri, A. Purwarianti, D. P. Lestari, K. Azizah, D. Tanaya, S. Sakti, "Disambiguating Ambiguous Indonesian Utterances with ASR and Meaning Interpretation", ASJ Spring Meeting, Mar 2025
  6. R. Hartanto, S. Sakti, K. Shinoda, "Multitask Training of Multi-channel Speaker Separation and Room Acoustic Parameter Estimation", ASJ Spring Meeting, Mar 2025
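  7. 東 翔, サクティ サクリアニ, "中間 CTC 目標を活用した多言語 ASR におけるコードスイッチングの向上" [Improving Code-Switching in Multilingual ASR Using Intermediate CTC Objectives], ASJ Spring Meeting, Mar 2025
  8. Y. Wang, S. Sakti, "Flow Matchingによる周波数領域でのフローマッチングを用いた高速ニューラルボコーダー" [A Fast Neural Vocoder Using Flow Matching in the Frequency Domain], ASJ Spring Meeting, Mar 2025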
  9. H. Tan, S. Sakti, "Improving Simultaneous Speech Translation with a Contrastive Feedback Mechanism", ASJ Spring Meeting, Mar 2025
  10. Y. Hirano, M. Nguyen, K. Azuma, J. M. Saragih, S. Sakti, "The NAIST System for the CHiME-8 Distant Meeting Transcription Challenge", ASJ Spring Meeting, Mar 2025
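  11. 安藤 宏祐, 平野 雄太, 佐藤 颯空, サクティ サクリアニ, "音声認識誤りが ChatGPT の翻訳に与える影響の調査" [An Investigation of the Effect of Speech Recognition Errors on ChatGPT Translation], ASJ Spring Meeting, Mar 2025
  12. 佐藤 颯空, サクティ サクリアニ, "拡散モデルベース DNN 音声合成のバックボーンに着目した軽量化とカーネル形状変化の影響" [Backbone-Focused Model Compression for Diffusion-Model-Based DNN Speech Synthesis and the Effect of Kernel Shape Changes], ASJ Spring Meeting, Mar 2025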
  13. S. Sakti, B. A. Titalim, "Investigation of Cross-Lingual Mismatch in Low-resource ASR for Indonesian Ethnic Languages", ASJ Spring Meeting, Mar 2024
  14. C. Tran, C.-M. Luong, S. Sakti, "Maintaining Personal Styles in Multilingual TTS with STEN Approach in Diffusion Framework", ASJ Spring Meeting, Mar 2024
  15. R. Hartanto, S. Sakti, K. Shinoda, "Multitask Learning of Speaker Separation and Direction-of-Arrival Estimation", ASJ Spring Meeting, Mar 2024
  16. Z. Zhang, S. Sakti, "Non-Parallel Limited Data Emotion Voice Conversion with Variance Adapter and Non-Autoregressive Decoder", ASJ Spring Meeting, Mar 2024
  17. S. Takahashi, S. Sakti, "Deep Sequential Generative Modeling for Unsupervised Learning of Linguistic Representations from Speech Streams", ASJ Spring Meeting, Mar 2024
  18. H. Xi, S. Sakti, "Perceived Challenges in Simultaneous Japanese-English Translation", ASJ Spring Meeting, Mar 2024
  19. L.-T. Nguyen, S. Sakti, "Utilizing Self-Supervised Visually Grounded Speech Models for Aligning Unpaired and Untranscribed Bilingual Speech", ASJ Spring Meeting, Mar 2024
  20. M. Liu, S. Sakti, "Generating Textual Prosody based on ASR", ASJ Spring Meeting, Mar 2024
  21. J. Effendi, S. Sakti, S. Nakamura, "Cyclic Partially-aligned Transformer for Visually Connected Speech-to-text Mapping", The ASJ Spring Meeting, Mar 2023
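  22. 多谷 邦彦, サクティ サクリアニ, 藤原 修治, 中村 哲, "X-vector を用いた日本語電話音声に対するテキスト独立型話者照合システムの検討" [A Study of a Text-Independent Speaker Verification System for Japanese Telephone Speech Using X-vectors], 日本音響学会誌 (Journal of the Acoustical Society of Japan), Vol. 79, No. 1, pp. 18-25, Dec. 2022 [PDF]
  23. S. Novitasari, S. Sakti, S. Nakamura, "Improving Intelligibility of Synthesized Speech in Noisy Condition with Dynamically Adaptive Machine Speech Chain", 情報処理学会 音声言語情報処理研究会 (IPSJ SIG-SLP), Dec. 2021 [PDF]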