過去の論文一覧 ・ Past publications (before October 2021)
招待講演 ・ Invited Talks
-
[Invited speaker] S. Sakti, "Language Technologies for All Mother Languages: Opportunities and Challenges",
The International Mother Language Day (IMLD), Paris, Feb 20th, 2025
-
[Invited speaker] S. Sakti, "Machine Speech Chain: Modeling Human Speech Perception and Production
with Auditory Feedback Mechanism"
[A joint work with A. Tjandra, J. Effendi, S.
Novitasari, S. Nakamura (NAIST, Japan)], The NUS Computer Science Research Week, Singapore, Jan 10th, 2025
-
[Keynote speaker] S. Sakti, "Machine Speech Chain: From Human Auditory Feedback Principles to
Language Technology Empowering Indigenous Communities"
[A joint work
with M. Heck, A. Tjandra, J. Effendi, S. Novitasari, S. Nakamura
(NAIST, Japan), J. Mariani (LIMSI), C. Soria
(CNR-ILC), M. Melero (BSC)], The 21st International Conference on Natural Language Processing (ICON), India,
Dec 21st, 2024
-
[Invited speaker] S. Sakti, "Leveraging the Foundational Speech Chain Models
to Empower Low-Resource Languages"
[A joint work with A. Tjandra, J. Effendi, S.
Novitasari, S. Nakamura (NAIST, Japan)], The IX International Scientific Conference
“Modern Problems of Applied Mathematics and Information Technologies Al-Khwarizmi 2024”, Uzbekistan, Oct 22nd, 2024
-
[Keynote speaker] S. Sakti, "Machine Speech Chain: A Deep Learning Approach for
Modeling Human Speech Perception and Production with
Auditory Feedback Mechanism for Low-Resource Languages"
[A joint work with A. Tjandra, J. Effendi, S.
Novitasari, S. Nakamura (NAIST, Japan)], The 11th International Conference on
Computer, Control, Informatics and its Applications (IC3INA), Indonesia, Oct 10th, 2024
-
[Invited speaker] S. Sakti, "Language Technology for All:
Leveraging Foundational Speech Models
to Empower Low-Resource Languages" [A joint work
with M. Heck, A. Tjandra, J. Effendi, S. Novitasari, S. Nakamura
(NAIST, Japan), J. Mariani (LIMSI), K. Choukri (ELRA), C. Soria
(CNR-ILC), M. Melero (BSC)], The INTERSPEECH Workshop on Synthetic Data’s
Transformative Role in Foundational Speech Models (SynData4GenAI 2024), Greece,
Aug 31st, 2024
-
[Keynote speaker] S. Sakti, "Communicative Intelligent Systems
towards Society 5.0," “Sarasehan Nasional Pendidikan Tinggi
Informatika dan Pemberian Tribute kepada Penggagas dan Pendidik
Senior Teknik Informatika ITB”, Indonesia, Feb 2nd, 2023
-
[Invited speaker] S. Sakti, "Language Technology for All: From the
indigenous community perspectives" [A joint work with J. Mariani
(LIMSI), K. Choukri (ELRA), C. Soria (CNR-ILC), M. Melero (BSC)],
"Data, Technologies and Benchmarks for the Spoken Languages of the
World" Meeting, IEEE SLT, China, Jan 13th, 2023
-
[Keynote speaker] S. Sakti, "Language Technology for All: From the
technology and indigenous community perspectives" [A joint work
with M. Heck, A. Tjandra, J. Effendi, S. Novitasari, S. Nakamura
(NAIST, Japan), J. Mariani (LIMSI), K. Choukri (ELRA), C. Soria
(CNR-ILC), M. Melero (BSC)], the 25th Conference of the Oriental
COCOSDA, Vietnam, Nov 25th, 2022
-
[Invited panelist] O. Scharenborg (TU Delft, Netherlands), E. Ahn
(U. Washington, USA), G. Anumanchipalli (UC Berkeley, USA), S.
Sakti (JAIST, Japan), Moderator: A. Black, "Data Collection, Bias,
and Ethical Concerns in Speech Processing," Speech for Social Good -
INTERSPEECH Satellite Workshop, [Virtual], September 24th, 2022
-
[Invited speaker] S. Sakti, "Semi-supervised Learning for
Low-resource Multilingual and Multimodal Speech Processing with
Machine Speech Chain" [A joint work with A. Tjandra, J. Effendi,
S. Novitasari, S. Nakayama, T. Yanagita, S. Nakamura (NAIST/RIKEN
AIP, Japan)], HiTZ Language Technology Webinar, Basque, May 5th, 2022
-
[Invited speaker] S. Sakti, "Self-Adaptive Machine Speech Chain in
Noisy Environment" [A joint work with A. Tjandra, J. Effendi, S.
Novitasari, S. Nakamura (NAIST/RIKEN AIP, Japan)], The AAAI
workshop on Self-supervised Learning for Audio and Speech
Processing, [Virtual], Feb 28th, 2022
-
[Invited speaker] S. Sakti, "Machine Speech Chain: A Deep Learning
Approach for Modeling Human Speech Perception and Production with
Auditory Feedback Mechanism" [A joint work with A. Tjandra, J.
Effendi, S. Novitasari, S. Nakamura (NAIST/RIKEN AIP, Japan)], The
ITB Seminar, Dec 24th, 2021
-
[Keynote speaker] S. Sakti, "Machine Speech Chain: A Deep Learning
Approach for Training and Inference through Feedback Loop" [A
joint work with A. Tjandra, J. Effendi, S. Novitasari, S. Nakamura
(NAIST/RIKEN AIP, Japan)], the IEEE Automatic Speech Recognition
and Understanding Workshop (ASRU), Cartagena, Colombia, Dec 15th,
2021
-
[Keynote speaker] S. Sakti, "Listening while Speaking and
Visualizing: A Semi-supervised Approach with Multimodal Machine
Speech Chain" [A joint work with A. Tjandra, J. Effendi, S.
Novitasari, S. Nakamura (NAIST/RIKEN AIP, Japan)], the SoCS
International Seminar, Dec 10th, 2021
-
[Keynote speaker] S. Sakti, "Listening while Speaking and
Visualizing: A Semi-supervised Approach with Multimodal Machine
Speech Chain" [A joint work with A. Tjandra, J. Effendi, S.
Novitasari, S. Nakamura (NAIST/RIKEN AIP, Japan)], the
International Conference of Artificial Intelligence and Speech
Technology (AIST), Nov 13th, 2021
著書 / 書籍の一章 ・ Scientific Book / Chapter in Scientific Book
-
S. Asai, K. Yoshino, S. Shinagawa, S. Sakti, S. Nakamura,
"Eliciting Cooperative Persuasive Dialogue by Multimodal Emotional
Robot," In: Stoyanchev, S., Ultes, S., Li, H. (eds) Conversational
AI for Natural Human-Centric Interaction. Lecture Notes in
Electrical Engineering, vol 943. Springer, Singapore, Nov 2022
[PDF]
-
S. Sakti, "Language technology impact on linguistic diversity". In
Book: "State of the art of indigenous languages in research: a
collection of selected research papers," In the framework of the
International Decade of Indigenous Languages (2022-2032), UNESCO
Open Access Repository, pp. 341-348, May 2022
[PDF]
-
L.-P. Morency, S. Sakti, B.W. Schuller, S. Ultes, "Multimodal
Machine Learning for Social Interaction with Ageing Individuals".
In Book: J. Miehle, W. Minker, E. André, K. Yoshino (eds),
"Multimodal Agents for Ageing and Multicultural Societies,"
Springer, Singapore, pp. 61–70, Oct. 2021
[PDF]
査読付き学術論文誌 ・ Peer-reviewed Scientific Journals
-
Y. Ko, R. Fukuda, Y. Nishikawa, Y. Kano, K. Sudoh, S. Sakti, S. Nakamura, "End-to-end Simultaneous Speech Translation with Style Tags
using Human Simultaneous Interpretation Data", Journal of Natural Language Processing, Vol.32, No.2, Jun 2025
-
C. Tran, C.-M. Luong, S. Sakti, "Zero-Shot Cross-Lingual Text-to-Speech With Style-Enhanced Normalization and Auditory Feedback
Training Mechanism", IEEE Transactions on Audio, Speech and Language Processing (TASLP), Vol. 33, pp. 1479 - 1492, Mar 2025
[PDF]
-
L.-T. Nguyen, S. Sakti, "ZeST: A Zero-Resourced Speech-to-Speech Translation Approach for Unknown, Unpaired, and Untranscribed
Languages",
IEEE Access, Vol. 13, pp. 8638 - 8648, Jan 2025
[PDF]
-
K. Furukawa, T. Kishiyama, S. Nakamura, S. Sakti, "Applying Syntax-Prosody Mapping Hypothesis and Boundary-Driven
Theory to Neural Sequence-to-Sequence Speech Synthesis",
IEEE Access, Vol. 12, pp. 160896 - 160917, Oct 2024
[PDF]
-
Y. Ko, K. Sudoh, S. Sakti and S. Nakamura, "Neural End-to-end Speech Translation Leveraged by ASR Posterior Distribution",
IEICE Transactions on Information and Systems, Vol. E107-D, No. 10, pp. 1322 - 1331, Oct 2024
[PDF]
-
B. Putra, K. Azizah, C.-O. Mawalim, I.-A. Hanif, S. Sakti, C.-W. Leong, S. Okada, "MAG-BERT-ARL for Fair Automated Video Interview
Assessment", IEEE Access, Vol. 12, pp. 145188 - 145205, Oct 2024 [PDF]
-
T. Yanagita, S. Sakti, S. Nakamura, "Japanese Neural Incremental
Text-to-Speech Synthesis Framework With an Accent Phrase Input",
IEEE Access, Vol. 11, pp. 22355 - 22363, Mar 2023
[PDF]
-
S. Novitasari, S. Sakti, S. Nakamura, "A Machine Speech Chain
Approach for Dynamically Adaptive Lombard TTS in Static and
Dynamic Noise Environments", IEEE/ACM Transactions on Audio,
Speech, and Language Processing, Vol. 30, pp. 2673-2688, Aug 2022
[PDF]
-
F. Yang, Z. Wang, Y. Wu, S. Sakti, S. Nakamura, "Tackling multiple
object tracking with complicated motions — Re-designing the
integration of motion and appearance", Image and Vision Computing,
Vol. 124, Aug 2022
[PDF]
[Based on our winning solutions for the CVPR 2020 WAD MOT
Challenge and the CVPR 2020 MOTS Challenge]
-
柳田 智也, サクティ サクリアニ, 中村 哲,
"日本語逐次音声合成における合成単位", 情報処理学会論文誌, Vol. 63,
No. 4, pp. 1149-1158, Apr. 2022
[PDF]
-
B. Wu, S. Sakti, J. Zhang, S. Nakamura, "Modeling Unsupervised
Empirical Adaptation by DPGMM and DPGMM-RNN Hybrid Model to
Extract Perceptual Features for Low-Resource ASR", IEEE/ACM
Transactions on Audio, Speech, and Language Processing (TASLP),
Vol. 30, pp. 901-916, Feb 2022
[PDF]
-
S. Novitasari, S. Sakti, S. Nakamura, "Neural Incremental Speech
Recognition Toward Real-Time Machine Speech Translation", IEICE
Transactions on Information and Systems, Vol. E104-D, No. 12, pp.
2195-2208, Dec 2021
[PDF]
査読付き国際会議論文 ・ Peer-reviewed International Conferences
-
H. Tan, R.-F. Widiaputri, J.-M. Saragih, Y. Ko, K. Sudoh, S. Nakamura and S. Sakti,
"NAIST Simultaneous Speech Translation System for IWSLT 2025", IWSLT, to appear, Jul 2025
-
R.-F. Widiaputri, H. Tan, J.-M. Saragih, Y. Ko, K. Sudoh, S. Nakamura and S. Sakti,
"NAIST Offline Speech Translation System for IWSLT 2025", IWSLT, to appear, Jul 2025
-
F. Mehmood and S. Sakti,
"Behavioral Interdependence: A Mediator of Ostracism-Aggression Relationship in Human-Robot Interaction",
The 21st IEEE International Conference on Advanced Robotics and its Social Impacts (ARSO), to appear, Jul 2025
-
F. Mehmood and S. Sakti,
"Impact of Gender and Group Size on Right to Speak and Peer Pressure in Human–Robot Interaction",
The 21st IEEE International Conference on Advanced Robotics and its Social Impacts (ARSO), to appear, Jul 2025
-
F. Mehmood and S. Sakti,
"Role of Social Treatment and Mode of Communication in Shaping Social Acceptance",
The 17th International Conference on Human System Interaction (HSI), to appear, Jul 2025
-
H. Watanabe, A.-S. Ihara, M. Okada, S. Sakti, M. Tachimori, E. Mizukami, and Y. Naruse,
"Automated Classification of Non-Clinical Depressive States Based on EEG during Listening to Natural Speech",
The 47th Annual International Conference of the IEEE Engineering in Medicine and Biology Society (EMBC), to appear, Jul 2025
-
M.-R. Ridha, S. Hasegawa, S. Sakti,
"Toward Visual Pronunciation Learning: A Speech-to-Articulatory Animation Pipeline Leveraging wav2vec 2.0
and rtMRI Landmarks", ICASSP, Apr 2025
[PDF]
-
M. Nguyen, S. Hasegawa, S. Sakti,
"Enhancing Unsupervised Acoustic Word Embedding with Visual-Grounded Speech Model and Novel Word-level ABX
Evaluation Schemes", ICASSP, Apr 2025
[PDF]
-
C. Tran, S. Sakti,
"From Pixels to Voice: A Simple and Efficient End-to-End Spoken Image Description Approach via Vision Codec
Language Models", ICASSP, Apr 2025
[PDF]
-
R.N. Abdjul, D.P. Lestari, A. Purwarianti, C.O. Mawalim, S. Sakti, M. Unoki,
"Indonesian Speech Content De-Identification in Low Resource Transcripts",
The Second Workshop in South East Asian Language Processing (SEALP), COLING Satellite Workshop, Jan 2025
[PDF]
-
P.-A. Hafiz, C.-O. Mawalim, D. P. Lestari, S. Sakti, M. Unoki,
"Anomalous Machine Sound Detection Based on Time Domain Gammatone
Spectrogram Feature and IDNN Model", APSIPA ASC, Dec 2024
[PDF]
-
M.-H. Rafsanjani, C.-O. Mawalim, D. P. Lestari, S. Sakti, M.
Unoki, "Unsupervised Anomalous Sound Detection Using Timbral and
Human Voice Disorder-Related Acoustic Features", APSIPA ASC, Dec
2024
[PDF]
-
A. Adila, D.-P. Lestari, A. Purwarianti, D. Tanaya, K. Azizah, S. Sakti,
"Enhancing Indonesian Automatic Speech Recognition: Evaluating
Multilingual Models with Diverse Speech Variabilities", Oriental
COCOSDA, Oct 2024
[PDF]
-
A.-A. Handoyo, C. Tran, D.-P. Lestari, S. Sakti,
"Indonesian-English Code-Switching Speech Synthesizer Utilizing
Multilingual STEN-TTS and BERT LID", Oriental COCOSDA, Oct 2024
[PDF]
-
Z. Zhang, S. Sakti, "A Feedback-driven Self-improvement Strategy
And Emotion-aware Vocoder For Emotional Voice Conversion",
Oriental COCOSDA, Oct 2024
[PDF]
[Best paper award]
-
G. Tyndall, K. Azizah, D. Tanaya, A. Purwarianti, D. P. Lestari,
S. Sakti, "Continual Learning in Machine Speech Chain Using
Gradient Episodic Memory", Oriental COCOSDA, Oct 2024
[PDF]
-
D.-U. Dewangga, D. Puji, A. Purwarianti, D. Tanaya, K. Azizah, S.
Sakti, "An Evaluation of Neural Vocoder-based Voice Cloning System
for Dysphonia Speech Disorder", Oriental COCOSDA, Oct 2024
[PDF]
-
I.-P. Amin, H. Tan, K. Azizah, S. Sakti, "Chunk Size Scheduling
for Optimizing The Quality-latency Trade-off in Simultaneous
Speech Translation", Oriental COCOSDA, Oct 2024
[PDF]
-
H. Tan, S. Sakti, "Contrastive Feedback Mechanism for Simultaneous
Speech Translation", INTERSPEECH, Sept 2024
[PDF]
-
R. Hartanto, S. Sakti, K. Shinoda, "MSDET: Multitask Speaker
Separation and Direction-of-Arrival Estimation Training",
INTERSPEECH, Sept 2024
[PDF]
-
Y. Hirano, M. Nguyen, K. Azuma, J.-M. Saragih, S. Sakti, "The
NAIST System for the CHiME-8 NOTSOFAR-1 Task", International Workshop on Speech Processing in Everyday Environments (CHiME),
INTERSPEECH Satellite Workshop, Sept 2024
[PDF]
-
A.-P. Naufal, D.-P. Lestari, A. Purwarianti, K. Azizah, D. Tanaya,
S. Sakti, "Machine Speech Chain with Emotion Recognition",
ICAICTA, Sept 2024
[PDF]
-
Y. Ko, R. Fukuda, Y. Nishikawa, Y. Kano, T. Yanagita, K. Doi, M.
Makinae, H. Tan, M. Sakai, S. Sakti, K. Sudoh, S. Nakamura, "NAIST
Simultaneous Speech Translation System for IWSLT 2024", IWSLT, Aug
2024
[PDF]
-
M.-R. Ridha, S. Sakti, "Refining rtMRI Landmark-Based Vocal Tract
Contour Labels with FCN-Based Smoothing and Point-to-Curve
Projection", LREC-COLING, May 2024
[PDF]
-
R. V. M. Tazakka, D. Lestari, A. Purwarianti, D. Tanaya, K.
Azizah, S. Sakti, "Indonesian-English Code-Switching Speech
Recognition Using the Machine Speech Chain Based Semi-Supervised
Learning", SIGUL, LREC-COLING Satellite Workshop, May 2024
[PDF]
-
H.-T. Nguyen, S. Sakti, "Multilingual Self-supervised Visually
Grounded Speech Models", SIGUL, LREC-COLING Satellite Workshop, May 2024
[PDF]
-
S. Sakti, B.A. Titalim, "Leveraging the Multilingual Indonesian
Ethnic Languages Dataset in Self-supervised Model for Low-resource
ASR Task", ASRU, Dec 2023
[PDF]
-
R.F. Widiaputri, A. Purwarianti, D. Lestari, K. Azizah, D. Tanaya,
S. Sakti, "Speech Recognition and Meaning Interpretation: Towards
Disambiguation of Structurally Ambiguous Spoken Utterances in
Indonesian", EMNLP, pp. 16813–16824, Dec 2023
[PDF]
-
B. Hartanti, D. Tanaya, K. Azizah, D. Lestari, A. Purwarianti, S.
Sakti, "Generating Speech with Prosodic Prominence based on
SSL-Visually Grounded Models", Oriental COCOSDA, Dec 2023
[PDF]
-
H. Xi and S. Sakti, "Exploring Difficulties Encountered by
Professional Interpreters in Japanese-to-English and
English-to-Japanese Simultaneous Translation", Oriental COCOSDA,
Dec 2023
[PDF]
-
C. Tran, C.M. Luong, S. Sakti, "STEN-TTS: Improving Zero-shot
Cross-Lingual Transfer for Multi-Lingual TTS with Style-Enhanced
Normalization Diffusion Framework", INTERSPEECH, pp. 4464-4468,
Aug 2023
[PDF]
-
S. Takahashi, S. Sakti, "Unsupervised Learning of Discrete Latent
Representations with Data-Adaptive Dimensionality from Continuous
Speech Streams", INTERSPEECH, pp. 416-420, Aug 2023
[PDF]
-
T.D. Tran, S. Sakti, "Low-Resource Japanese-English Speech-to-Text
Translation Leveraging Speech-Text Unified-model Representation
Learning", INTERSPEECH Satellite Workshop - the ELRA/ISCA Special
Interest Group on Under-resourced Languages (SIGUL), pp. 78-82,
Aug 2023
[PDF]
-
L.T. Nguyen, S. Sakti, "VGSAlign: Bilingual Speech Alignment of
Unpaired and Untranscribed Languages using Self-Supervised
Visually Grounded Speech Models", INTERSPEECH Satellite Workshop -
the ELRA/ISCA Special Interest Group on Under-resourced Languages
(SIGUL), pp. 53-57, Aug 2023
[PDF]
-
R. Fukuda, Y. Nishikawa, Y. Kano, Y. Ko, T. Yanagita, K. Doi, M.
Makinae, S. Sakti, K. Sudoh, S. Nakamura, "NAIST Simultaneous
Speech-to-speech Translation System for IWSLT 2023", IWSLT, pp.
330-340, Jul 2023
[PDF]
-
S. Cahyawijaya, H. Lovenia, A.F. Aji, G.I. Winata, B. Wilie, F.
Koto, R. Mahendra, C. Wibisono, A. Romadhony, K. Vincentio, J.
Santoso, D. Moeljadi, C. Wirawan, F. Hudi, M.S. Wicaksono, I.H.
Parmonangan, I. Alfina, I.F. Putra, S. Rahmadani, Y. Oenang, A.A.
Septiandri, J. Jaya, K. Dhole, A.A. Suryani, R.A. Putri, D. Su, K.
Stevens, M.N. Nityasya, M.F. Adilazuarda, R. Ignatius, R.
Diandaru, V. Ghifari, T. Yu, W. Dai, Y. Xu, D. Damapuspita, H.A.
Wibowo, C. Tho, I.M. Karo, T.N. Fatyanosa, Z. Ji, G. Neubig, T.
Baldwin, S. Ruder, P. Fung, H. Sujaini, S. Sakti, A. Purwarianti,
"NusaCrowd: Open Source Initiative for Indonesian NLP Resources",
ACL Findings, pp. 13745-13818, Jul 2023
[PDF]
-
J. Chen, S. Sakti, "An Isotropy Analysis for Self-supervised
Acoustic Unit Embeddings on the Zero Resource Speech Challenge
2021 Framework", IEEE ICASSP, Jun 2023
[PDF]
-
S. Novitasari, S. Sakti, S. Nakamura, "Self-adaptive Incremental
Machine Speech Chain for Lombard TTS with High-granularity ASR
Feedback in Dynamic Noise Condition", IEEE ICASSP, Jun 2023
[PDF]
-
H. Qi, S. Novitasari, A. Tjandra, S. Sakti, S. Nakamura,
"SpeeChain: A Speech Toolkit for Large-Scale Machine Speech
Chain," arXiv preprint arXiv:2301.02966, Jan 2023
[PDF]
-
R. Chevi, R.E. Prasojo, A.F. Aji, A. Tjandra, S. Sakti, "Nix-TTS:
Lightweight and End-to-End Text-to-Speech via Module-wise
Distillation", IEEE SLT, Jan 2023
[PDF]
-
H. Qi, S. Novitasari, S. Sakti, S. Nakamura, "Improved Consistency
Training for Semi-Supervised Sequence-to-Sequence ASR via Speech
Chain Reconstruction and Self-Transcribing", INTERSPEECH, pp.
3413-3417, Sep 2022
[PDF]
-
R. Fukuda, Y. Ko, Y. Kano, K. Doi, H. Tokuyama, S. Sakti, K.
Sudoh, S. Nakamura, "NAIST Simultaneous Speech-to-Text Translation
System for IWSLT 2022", International Conference on Spoken
Language Translation (IWSLT), pp. 286-292, May 2022
[PDF]
-
S. Asai, K. Yoshino, S. Shinagawa, S. Sakti, S. Nakamura,
"Eliciting Cooperative Persuasive Dialogue by Multimodal Emotional
Robot", International Workshop on Spoken Dialogue Systems
Technology (IWSDS), Nov 2021
[PDF]
-
R. Fukuda, S. Novitasari, Y. Oka, Y. Kano, Y. Yano, Y. Ko, H.
Tokuyama, K. Doi, T. Yanagita, S. Sakti, K. Sudoh, S. Nakamura,
"Simultaneous Speech-to-speech Translation System with
Transformer-based Incremental ASR, MT, and TTS", Oriental COCOSDA,
pp. 186-192, Nov 2021
[PDF]
-
N. Kaiki, S. Sakti, S. Nakamura, "Using Local Phrase Dependency
Structure Information in Neural Sequence-to-sequence Speech
Synthesis", Oriental COCOSDA, pp. 206-211, Nov 2021
[PDF]
-
N. Tachimori, S. Sakti, S. Nakamura, "Multi-Encoder Sequential
Attention Network for Context-Aware Speech Recognition in Japanese
Dialog Conversation", Oriental COCOSDA, pp. 1-6, Nov 2021
[PDF]
[Best paper award]
国内会議論文 ・ Domestic Conferences
-
中村 佳登, メフムード ファイサル, サクティ サクリアニ,
"日英コードスイッチングが社会的なヒューマンロボットインタラクションに及ぼす影響", SIG-SLUD,
Mar 2025
[PDF]
-
久保田 なつみ, サクティ サクリアニ,
"音声翻訳フレームワークによる吃音音声の自動音声認識に対する課題への取り組み", SIG-SLUD,
Mar 2025
[PDF]
-
髙橋 舜, 金崎 朝子, 須田 仁志, サクティ サクリアニ,
"音声信号から文字記号を創り出す―深層ベイズに基づく教師なし表現学習によるアプローチ―", NLP,
Mar 2025
[PDF]
-
胡 尤佳, 須藤 克仁, 中村 哲, サクティ サクリアニ,
"音声認識出力の曖昧性を考慮したMulti-task
End-to-end音声翻訳と曖昧性の高い音声入力に対する頑健性の分析", NLP, Mar 2025
[PDF]
-
R. F. Widiaputri, A. Purwarianti, D. P. Lestari, K. Azizah, D.
Tanaya, S. Sakti, "Disambiguating Ambiguous Indonesian Utterances
with ASR and Meaning Interpretation", ASJ Spring Meeting, Mar
2025
-
R. Hartanto, S. Sakti, K. Shinoda, "Multitask Training of
Multi-channel Speaker Separation and Room Acoustic Parameter
Estimation", ASJ Spring Meeting, Mar 2025
-
東 翔, サクティ サクリアニ, "中間 CTC 目標を活用した多言語 ASR
におけるコードスイッチングの向上", ASJ Spring Meeting, Mar
2025
-
Y. Wang, S. Sakti, "Flow
Matchingによる周波数領域でのフローマッチングを用いた高速ニューラルボコーダー",
ASJ Spring Meeting, Mar 2025
-
H. Tan, S. Sakti, "Improving Simultaneous Speech Translation
with a Contrastive Feedback Mechanism", ASJ Spring Meeting,
Mar 2025
-
Y. Hirano, M. Nguyen, K. Azuma, J. M. Saragih, S. Sakti, "The
NAIST System for the CHiME-8 Distant Meeting Transcription
Challenge", ASJ Spring Meeting, Mar 2025
-
安藤 宏祐, 平野 雄太, 佐藤 颯空, サクティ サクリアニ, "音声認識誤りが
ChatGPT の翻訳に与える影響の調査", ASJ Spring Meeting, Mar 2025
-
佐藤 颯空, サクティ サクリアニ, "拡散モデルベース DNN
音声合成のバックボーンに着目した軽量化とカーネル形状変化の影響",
ASJ Spring Meeting, Mar 2025
-
S. Sakti, B. A. Titalim, "Investigation of Cross-Lingual Mismatch
in Low-resource ASR for Indonesian Ethnic Languages", ASJ Spring Meeting, Mar 2024
-
C. Tran, C.-M. Luong, S. Sakti, "Maintaining Personal Styles in
Multilingual TTS with STEN Approach in Diffusion Framework", ASJ Spring Meeting, Mar 2024
-
R. Hartanto, S. Sakti, K. Shinoda, "Multitask Learning of Speaker
Separation and Direction-of-Arrival Estimation", ASJ Spring
Meeting, Mar 2024
-
Z. Zhang, S. Sakti, "Non-Parallel Limited Data Emotion Voice
Conversion with Variance Adapter and Non-Autoregressive Decoder",
ASJ Spring Meeting, Mar 2024
-
S. Takahashi, S. Sakti, "Deep Sequential Generative Modeling for
Unsupervised Learning of Linguistic Representations from Speech
Streams", ASJ Spring Meeting, Mar 2024
-
H. Xi, S. Sakti, "Perceived Challenges in Simultaneous
Japanese-English Translation", ASJ Spring Meeting, Mar 2024
-
L.-T. Nguyen, S. Sakti, "Utilizing Self-Supervised Visually
Grounded Speech Models for Aligning Unpaired and Untranscribed
Bilingual Speech", ASJ Spring Meeting, Mar 2024
-
M. Liu, S. Sakti, "Generating Textual Prosody based on ASR",
ASJ Spring Meeting, Mar 2024
-
J. Effendi, S. Sakti, S. Nakamura, "Cyclic Partially-aligned
Transformer for Visually Connected Speech-to-text Mapping",
ASJ Spring Meeting, Mar 2023
-
多谷 邦彦, サクティ サクリアニ, 藤原 修治, 中村 哲, "X-vector
を用いた日本語電話音声に対するテキスト独立型話者照合システムの検討",
日本音響学会誌, 79巻1号, pp.18-25, Dec. 2022
[PDF]
-
S. Novitasari, S. Sakti, S. Nakamura, "Improving Intelligibility
of Synthesized Speech in Noisy Condition with Dynamically Adaptive
Machine Speech Chain", 情報処理学会 音声言語情報処理研究会
SIG-SLP, Dec. 2021
[PDF]