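Within artificial intelligence, we conduct research and development on technologies that handle text data, speech and audio data, measurement data, and similar data. We also pursue research and development on the transparency and fairness of artificial intelligence, aiming for AI technology that society can trust. We are deepening technologies such as machine learning, deep learning, pattern recognition, knowledge processing, inference, and explainable AI, while also applying them in real-world settings.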
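Natural language processing (building and applying large language models, natural language inference, argument structure analysis), speech recognition (acoustic/language model adaptation, End-to-End, diarization, speech enhancement/separation, speaker verification, use of Kaldi/ESPnet), acoustic recognition (anomalous sound detection, scene classification, caption generation), signal processing and machine learning (sparse modeling, signal reconstruction, machine learning for state estimation/prediction), dialogue agents, risk inference, knowledge learning, explainable AI, trustworthy AI (diagnosing and improving explainability, transparency, fairness, robustness, and related properties)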
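Application examples: summarization systems, text information extraction, knowledge graphs, dialogue analysis, chatbots, advanced RPA, contact center speech transcription, meeting minutes creation, automated voice response, failure sign diagnosis, maintenance knowledge support, credit screening, emergency medical demand forecasting, automated product inspection, and others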
Our English-language publications from 2021 through January 2024 are listed below.
Shota Horiguchi, Shinji Watanabe, Paola Garcia, Yuki Takashima, Yohei Kawaguchi, "Online Neural Diarization of Unlimited Numbers of Speakers," IEEE/ACM Transactions on Audio, Speech, and Language Processing, 2023.
Shota Horiguchi, Yuki Takashima, Shinji Watanabe, Paola Garcia, "Mutual Learning of Single- and Multi-Channel End-to-End Neural Diarization," SLT 2022.
Kota Dohi, Tomoya Nishida, Harsh Purohit, Ryo Tanabe, Takashi Endo, Masaaki Yamamoto, Yuki Nikaido, Yohei Kawaguchi, "MIMII DG: Sound Dataset for Malfunctioning Industrial Machine Investigation and Inspection for Domain Generalization Task," DCASE 2022.
Kota Dohi, Keisuke Imoto, Noboru Harada, Daisuke Niizumi, Yuma Koizumi, Tomoya Nishida, Harsh Purohit, Ryo Tanabe, Takashi Endo, Masaaki Yamamoto, Yohei Kawaguchi, "Description and Discussion on DCASE 2022 Challenge Task 2: Unsupervised Anomalous Sound Detection for Machine Condition Monitoring Applying Domain Generalization Techniques," DCASE 2022.
Yuki Takashima, Shota Horiguchi, Shinji Watanabe, Paola Garcia, Yohei Kawaguchi, "Updating Only Encoders Prevents Catastrophic Forgetting of End-to-End ASR Models," INTERSPEECH 2022.
Harsh Purohit, Masaaki Yamamoto, Takashi Endo, Yohei Kawaguchi, "Hierarchical Conditional Variational Autoencoder Based Acoustic Anomaly Detection," EUSIPCO 2022.
Kota Dohi, Takashi Endo, Yohei Kawaguchi, "Disentangling Physical Parameters for Anomalous Sound Detection Under Domain Shifts," EUSIPCO 2022.
Tomoya Nishida, Kota Dohi, Takashi Endo, Masaaki Yamamoto, Yohei Kawaguchi, "Anomalous Sound Detection Based on Machine Activity Detection," EUSIPCO 2022.
Natsuo Yamashita, Shota Horiguchi, Takeshi Homma, "Improving the Naturalness of Simulated Conversations for End-to-End Neural Diarization," Odyssey 2022.
Shota Horiguchi, Yusuke Fujita, Shinji Watanabe, Yawen Xue, Paola Garcia, "Encoder-Decoder Based Attractors for End-to-End Neural Diarization," IEEE/ACM Transactions on Audio, Speech, and Language Processing, 2022.
Y. Okamoto, S. Horiguchi, M. Yamamoto, K. Imoto, and Y. Kawaguchi, "Environmental Sound Extraction Using Onomatopoeia," in Proc. IEEE ICASSP, 2022.
S. Horiguchi, Y. Takashima, P. Garcia, S. Watanabe, and Y. Kawaguchi, "Multi-Channel End-to-End Neural Diarization with Distributed Microphones," in Proc. IEEE ICASSP, 2022.
T. Homma, Q. Sun, T. Fujioka, R. Takawaki, E. Ankyu, K. Nagamatsu, D. Sugawara, and E. T. Harada, "Emotional Speech Synthesis for Companion Robot to Imitate Professional Caregiver Speech," arXiv preprint, 2021.
Y. Kawaguchi, K. Imoto, Y. Koizumi, N. Harada, D. Niizumi, K. Dohi, R. Tanabe, H. Purohit, and T. Endo, "Description and Discussion on DCASE 2021 Challenge Task 2," in Proc. DCASE, 2021.
S. Horiguchi, S. Watanabe, P. Garcia, Y. Xue, Y. Takashima, and Y. Kawaguchi, "Towards Neural Diarization for Unlimited Numbers of Speakers Using Global and Local Attractors," in Proc. IEEE ASRU, 2021.
S. Horiguchi, Y. Fujita, S. Watanabe, Y. Xue, and P. Garcia, "Encoder-Decoder Based Attractor Calculation for End-to-End Neural Diarization," arXiv preprint, 2021.
R. Tanabe, H. Purohit, K. Dohi, T. Endo, Y. Nikaido, T. Nakamura, and Y. Kawaguchi, "MIMII DUE: Sound Dataset for Malfunctioning Industrial Machine Investigation and Inspection with Domain Shifts due to Changes in Operational and Environmental Conditions," in Proc. IEEE WASPAA, 2021.
A. Yamaguchi, G. Morio, H. Ozaki, K. Yokote, and K. Nagamatsu, "Team Hitachi @ AutoMin 2021: Reference-free Automatic Minuting Pipeline with Argument Structure Construction over Topic-based Summarization," in Proc. AutoMin, 2021.
K. Ito, T. Fujioka, Q. Sun, and K. Nagamatsu, "Audio-Visual Speech Emotion Recognition by Disentangling Emotion and Identity Attributes," in Proc. INTERSPEECH, 2021.
Y. Takashima, Y. Fujita, S. Horiguchi, S. Watanabe, P. Garcia, and K. Nagamatsu, "Semi-Supervised Training with Pseudo-Labeling for End-to-End Neural Diarization," in Proc. INTERSPEECH, 2021.
S. Horiguchi, N. Yalta, P. Garcia, Y. Takashima, Y. Xue, D. Raj, Z. Huang, Y. Fujita, S. Watanabe, and S. Khudanpur, "The Hitachi-JHU DIHARD III System: Competitive End-to-End Neural Diarization and X-Vector Clustering Systems Combined by DOVER-Lap," The Third DIHARD Speech Diarization Challenge, 2021. (2nd place in all the tasks)
A. I. Adiba, T. Homma, and T. Miyoshi, "Towards Immediate Backchannel Generation Using Attention-Based Early Prediction Model," in Proc. IEEE ICASSP, 2021.
K. Dohi, T. Endo, H. Purohit, R. Tanabe, and Y. Kawaguchi, "Flow-Based Self-Supervised Density Estimation for Anomalous Sound Detection," in Proc. IEEE ICASSP, 2021.
K. Ito, M. Yamamoto, and K. Nagamatsu, "Audio-Visual Speech Enhancement Method Conditioned on the Lip Motion and Speaker Discriminative Embeddings," in Proc. IEEE ICASSP, 2021.
S. Horiguchi, P. Garcia, Y. Fujita, S. Watanabe, and K. Nagamatsu, "End-to-End Speaker Diarization as Post-Processing," in Proc. IEEE ICASSP, 2021.
H. Ozaki, G. Morio, T. Morishita, and T. Miyoshi, "Project-Then-Transfer: Effective Two-Stage Cross-Lingual Transfer for Semantic Dependency Parsing," in Proc. EACL, 2021.
G. Morio*, H. Ozaki*, Y. Koreeda, T. Morishita, and T. Miyoshi, "i-Parser: Interactive Parser Development Kit for Natural Language Processing," in Proc. AAAI, 2021. (*Equal contribution)
S. Horiguchi, Y. Fujita, and K. Nagamatsu, "Block-Online Guided Source Separation," in Proc. IEEE SLT, 2021.
Y. Takashima, Y. Fujita, S. Watanabe, S. Horiguchi, P. Garcia, and K. Nagamatsu, "End-to-End Speaker Diarization Conditioned on Speech Activity and Overlap Detection," in Proc. IEEE SLT, 2021.
Y. Xue, S. Horiguchi, Y. Fujita, S. Watanabe, P. Garcia, and K. Nagamatsu, "Online End-to-End Neural Diarization with Speaker-Tracing Buffer," in Proc. IEEE SLT, 2021.
M. Mase, A. B. Owen, and B. B. Seiler, "Cohort Shapley values for algorithmic fairness," arXiv preprint, 2021.
B. B. Seiler, M. Mase, and A. B. Owen, "What makes you unique?," arXiv preprint, 2021.
M. Hamamoto and M. Egi, "Model-agnostic Ensemble-based Explanation Correction Leveraging Rashomon Effect," in Proc. IEEE SSCI, 2021.
H. Namba and M. Egi, "Piecewise Simplification Approach for Accurate and Understandable Model," in Proc. IEEE SSCI, 2021.
Y. Koreeda and C. Manning, "ContractNLI: A Dataset for Document-level Natural Language Inference for Contracts," in Proc. Findings of the Association for Computational Linguistics: EMNLP 2021, EMNLP, 2021.
Gaku Morio, Hiroaki Ozaki, Terufumi Morishita, Kohsuke Yanai, "End-to-end Argument Mining with Cross-corpora Multi-task Learning," Transactions of the Association for Computational Linguistics, 2022.
Terufumi Morishita, Gaku Morio, Shota Horiguchi, Hiroaki Ozaki, Nobuo Nukaga, "Rethinking Fano’s Inequality in Ensemble Learning," in Proc. ICML 2022.
Naokazu Uchida, Takeshi Homma, Makoto Iwayama, Yasuhiro Sogawa, "Reducing Offensive Replies in Open Domain Dialogue System," in Proc. INTERSPEECH 2022.
Amalia Istiqlali Adiba, Takeshi Homma, Yasuhiro Sogawa, "Unsupervised Domain Adaptation on Question-Answering System with Conversation Data," in Proc. SIGDIAL 2022.
Masaki Hamamoto, Hiroyuki Namba, Masashi Egi, "Ensemble-Based Method for Correcting Global Explanation of Prediction Model," in IEICE Transactions on Information and Systems 2023.
Benjamin B. Seiler, Masayoshi Mase, Art B. Owen, "What makes you unique?," in Electronic Journal of Statistics 2023.
Y. Tsuchiya, Y. Mori, and M. Egi, "Explainable Reinforcement Learning Based on Q-Value Decomposition by Expected State Transitions," Proceedings of the AAAI 2023 Spring Symposium on Challenges Requiring the Combination of Machine Learning and Knowledge Engineering (AAAI-MAKE 2023), 2023.
Y. Tsuchiya and M. Hamamoto, "Explanation Framework for Optimization-Based Scheduling: Evaluating Contributions of Constraints and Parameters by Shapley Values," ICAPS 2023 Workshop Human-Aware and Explainable Planning (HAXP), 2023.
Masayoshi Mase, Art B. Owen, and Benjamin B. Seiler, "Variable Importance Without Impossible Data," Annual Review of Statistics and Its Application, 2023.
N. Hama, Masayoshi Mase, Art B. Owen, "Deletion and Insertion Tests in Regression Models," Journal of Machine Learning Research (JMLR), 2023.
T. Nishida, T. Endo, and Y. Kawaguchi, "Zero-Shot Domain Adaptation of Anomalous Samples for Semi-Supervised Anomaly Detection," ICASSP, 2023
T. Morishita, G. Morio, A. Yamaguchi, and Y. Sogawa, "Learning Deductive Reasoning from Synthetic Corpus based on Formal Logic," ICML, 2023
A. Ito, S. Horiguchi, "Spoofing Attacker Also Benefits from Self-Supervised Pretrained Model," INTERSPEECH, 2023
K. Shimonishi, K. Dohi, and Y. Kawaguchi, "Anomalous Sound Detection Based on Sound Separation," INTERSPEECH, 2023
T. Okamoto, K. Shimonishi, K. Imoto, K. Dohi, S. Horiguchi, and Y. Kawaguchi, "CAPTDURE: Captioned Sound Dataset of Individual Sources," INTERSPEECH, 2023
M. Tsunokake, A. Yamaguchi, Y. Koreeda, H. Ozaki, and Y. Sogawa, "Hitachi at SemEval-2023 Task 4: Exploring Various Task Formulations Reveals the Importance of Description Texts on Human Values," SemEval, 2023
Y. Koreeda, K. Yokote, H. Ozaki, A. Yamaguchi, M. Tsunokake, and Y. Sogawa, "Hitachi at SemEval-2023 Task 3: Exploring Cross-lingual Multi-task Strategies for Genre and Framing Detection in Online News," SemEval, 2023
T. Sasazawa, T. Morishita, H. Ozaki, O. Imaichi, and Y. Sogawa, "Controlling Keywords and Their Positions in Text Generation," INLG, 2023
T. Fujii, K. Shibata, A. Yamaguchi, T. Morishita, and Y. Sogawa, "How do different tokenizers perform on downstream tasks in scriptio continua languages?: A case study in Japanese," ACL Student Research Workshop, 2023
A. Yamaguchi, H. Ozaki, T. Morishita, G. Morio, and Y. Sogawa, "How Does the Task Complexity of Masked Pretraining Objectives Affect Downstream Performance?," ACL, 2023
Y. Koreeda, T. Morishita, O. Imaichi, and Y. Sogawa, "LARCH: Large Language Model-based Automatic Readme Creation with Heuristics," CIKM, 2023
T. V. Ho, S. Horiguchi, S. Watanabe, P. Garcia, and T. Sumiyoshi, "Synthetic Data Augmentation for ASR with Domain Filtering," APSIPA ASC, 2023
T. Morishita, Y. Koreeda, A. Yamaguchi, G. Morio, O. Imaichi, and Y. Sogawa, "CHICOT: A Developer-Assistance Toolkit for Code Search with High-Level Contextual Information," AAAI, 2024
S. Horiguchi, K. Dohi, and Y. Kawaguchi, "Streaming Active Learning for Regression Problems Using Regression via Classification," ICASSP, 2024
For the latest publication list, please see the page below.
https://hitachi-speech.github.io/