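We conduct research and development on AI that serves as a point of contact with people and society, including language, speech, acoustic, and time-series signal processing, reliable AI, and knowledge acquisition. By deepening theoretical inquiry into multimodal data, we aim to open new horizons for generative AI and to create solutions that are both practical and innovative.
Generative AI (business-specific LLMs, AI agents, multimodal (audio/figures and tables/language/time-series signals, etc.)), natural language processing (knowledge extraction and structuring for LLMs, dialogue and argument structure analysis, natural language inference, fact-checking), dialogue agents, reinforcement learning, risk reasoning, knowledge learning, trustworthy AI (diagnosis and improvement of explainability, transparency, fairness, robustness, etc.), Agentic AI (reliability, self-improvement, memory management), general-purpose foundation models for speech/audio/time series (speech: full-duplex spoken dialogue, Speech-to-Speech; audio: anomalous sound detection, Audio QA and Reasoning, Text-to-Audio; time series: forecasting/reconstruction/detection, Signal QA and Reasoning, Text-to-Signal)
Application examples: frontline worker support, contact center support (chatbots, interactive voice response), maintenance work support, document intelligence (automated analysis of contracts and reports), service quality evaluation and VOC (voice of the customer) analysis, intellectual property information analysis, predictive failure diagnosis
Publications in English from 2021 through December 2025 are listed below.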
Shota Horiguchi, Shinji Watanabe, Paola Garcia, Yuki Takashima, Yohei Kawaguchi, "Online Neural Diarization of Unlimited Numbers of Speakers," IEEE/ACM Transactions on Audio, Speech, and Language Processing, 2023.
Shota Horiguchi, Yuki Takashima, Shinji Watanabe, Paola Garcia, "Mutual Learning of Single- and Multi-Channel End-to-End Neural Diarization," SLT 2022.
Kota Dohi, Tomoya Nishida, Harsh Purohit, Ryo Tanabe, Takashi Endo, Masaaki Yamamoto, Yuki Nikaido, Yohei Kawaguchi, "MIMII DG: Sound Dataset for Malfunctioning Industrial Machine Investigation and Inspection for Domain Generalization Task," DCASE 2022.
Kota Dohi, Keisuke Imoto, Noboru Harada, Daisuke Niizumi, Yuma Koizumi, Tomoya Nishida, Harsh Purohit, Ryo Tanabe, Takashi Endo, Masaaki Yamamoto, Yohei Kawaguchi, "Description and Discussion on DCASE 2022 Challenge Task 2: Unsupervised Anomalous Sound Detection for Machine Condition Monitoring Applying Domain Generalization Techniques," DCASE 2022.
Yuki Takashima, Shota Horiguchi, Shinji Watanabe, Paola Garcia, Yohei Kawaguchi, "Updating Only Encoders Prevents Catastrophic Forgetting of End-to-End ASR Models," INTERSPEECH 2022.
Harsh Purohit, Masaaki Yamamoto, Takashi Endo, Yohei Kawaguchi, "Hierarchical Conditional Variational Autoencoder Based Acoustic Anomaly Detection," EUSIPCO 2022.
Kota Dohi, Takashi Endo, Yohei Kawaguchi, "Disentangling Physical Parameters for Anomalous Sound Detection Under Domain Shifts," EUSIPCO 2022.
Tomoya Nishida, Kota Dohi, Takashi Endo, Masaaki Yamamoto, Yohei Kawaguchi, "Anomalous Sound Detection Based on Machine Activity Detection," EUSIPCO 2022.
Natsuo Yamashita, Shota Horiguchi, Takeshi Homma, "Improving the Naturalness of Simulated Conversations for End-to-End Neural Diarization," Odyssey 2022.
Shota Horiguchi, Yusuke Fujita, Shinji Watanabe, Yawen Xue, Paola Garcia, "Encoder-Decoder Based Attractors for End-to-End Neural Diarization," IEEE/ACM Transactions on Audio, Speech, and Language Processing, 2022.
Y. Okamoto, S. Horiguchi, M. Yamamoto, K. Imoto, and Y. Kawaguchi, "Environmental Sound Extraction Using Onomatopoeia," in Proc. IEEE ICASSP, 2022.
S. Horiguchi, Y. Takashima, P. Garcia, S. Watanabe, and Y. Kawaguchi, "Multi-Channel End-to-End Neural Diarization with Distributed Microphones," in Proc. IEEE ICASSP, 2022.
T. Homma, Q. Sun, T. Fujioka, R. Takawaki, E. Ankyu, K. Nagamatsu, D. Sugawara, and E. T. Harada, "Emotional Speech Synthesis for Companion Robot to Imitate Professional Caregiver Speech," arXiv preprint, 2021.
Y. Kawaguchi, K. Imoto, Y. Koizumi, N. Harada, D. Niizumi, K. Dohi, R. Tanabe, H. Purohit, and T. Endo, "Description and Discussion on DCASE 2021 Challenge Task 2," in Proc. DCASE, 2021.
S. Horiguchi, S. Watanabe, P. Garcia, Y. Xue, Y. Takashima, and Y. Kawaguchi, "Towards Neural Diarization for Unlimited Numbers of Speakers Using Global and Local Attractors," in Proc. IEEE ASRU, 2021.
R. Tanabe, H. Purohit, K. Dohi, T. Endo, Y. Nikaido, T. Nakamura, and Y. Kawaguchi, "MIMII DUE: Sound Dataset for Malfunctioning Industrial Machine Investigation and Inspection with Domain Shifts due to Changes in Operational and Environmental Conditions," in Proc. IEEE WASPAA, 2021.
A. Yamaguchi, G. Morio, H. Ozaki, K. Yokote, and K. Nagamatsu, "Team Hitachi @ AutoMin 2021: Reference-free Automatic Minuting Pipeline with Argument Structure Construction over Topic-based Summarization," in Proc. AutoMin, 2021.
K. Ito, T. Fujioka, Q. Sun, and K. Nagamatsu, "Audio-Visual Speech Emotion Recognition by Disentangling Emotion and Identity Attributes," in Proc. INTERSPEECH, 2021.
Y. Takashima, Y. Fujita, S. Horiguchi, S. Watanabe, P. Garcia, and K. Nagamatsu, "Semi-Supervised Training with Pseudo-Labeling for End-to-End Neural Diarization," in Proc. INTERSPEECH, 2021.
S. Horiguchi, N. Yalta, P. Garcia, Y. Takashima, Y. Xue, D. Raj, Z. Huang, Y. Fujita, S. Watanabe, and S. Khudanpur, "The Hitachi-JHU DIHARD III System: Competitive End-to-End Neural Diarization and X-Vector Clustering Systems Combined by DOVER-Lap," The Third DIHARD Speech Diarization Challenge, 2021. (2nd place in all the tasks)
A. I. Adiba, T. Homma, and T. Miyoshi, "Towards Immediate Backchannel Generation Using Attention-Based Early Prediction Model" in Proc. IEEE ICASSP, 2021.
K. Dohi, T. Endo, H. Purohit, R. Tanabe, and Y. Kawaguchi, "Flow-Based Self-Supervised Density Estimation for Anomalous Sound Detection" in Proc. IEEE ICASSP, 2021.
K. Ito, M. Yamamoto, and K. Nagamatsu, "Audio-Visual Speech Enhancement Method Conditioned on the Lip Motion and Speaker Discriminative Embeddings" in Proc. IEEE ICASSP, 2021.
S. Horiguchi, P. Garcia, Y. Fujita, S. Watanabe, and K. Nagamatsu, "End-to-End Speaker Diarization as Post-Processing" in Proc. IEEE ICASSP, 2021.
H. Ozaki, G. Morio, T. Morishita, and T. Miyoshi, "Project-Then-Transfer: Effective Two-Stage Cross-Lingual Transfer for Semantic Dependency Parsing" in Proc. EACL, 2021.
G. Morio*, H. Ozaki*, Y. Koreeda, T. Morishita, and T. Miyoshi, "i-Parser: Interactive Parser Development Kit for Natural Language Processing" in Proc. AAAI 2021. (*Equal contribution).
S. Horiguchi, Y. Fujita, and K. Nagamatsu, "Block-Online Guided Source Separation" in Proc. IEEE SLT, 2021.
Y. Takashima, Y. Fujita, S. Watanabe, S. Horiguchi, P. Garcia, and K. Nagamatsu, "End-to-End Speaker Diarization Conditioned on Speech Activity and Overlap Detection" IEEE SLT, 2021.
Y. Xue, S. Horiguchi, Y. Fujita, S. Watanabe, P. Garcia, and K. Nagamatsu, "Online End-to-End Neural Diarization with Speaker-Tracing Buffer" IEEE SLT, 2021.
M. Mase, A. B. Owen, and B. B. Seiler, "Cohort Shapley values for algorithmic fairness" arXiv preprint, 2021.
B. B. Seiler, M. Mase, and A. B. Owen, "What makes you unique?" arXiv preprint, 2021.
M. Hamamoto and M. Egi, "Model-agnostic Ensemble-based Explanation Correction Leveraging Rashomon Effect" in Proc. IEEE SSCI, 2021.
H. Namba and M. Egi, "Piecewise Simplification Approach for Accurate and Understandable Model," in Proc. IEEE SSCI, 2021.
Y. Koreeda and C. Manning, "ContractNLI: A Dataset for Document-level Natural Language Inference for Contracts," in Proc. Findings of the Association for Computational Linguistics: EMNLP 2021, EMNLP, 2021.
Gaku Morio, Hiroaki Ozaki, Terufumi Morishita, Kohsuke Yanai, "End-to-end Argument Mining with Cross-corpora Multi-task Learning," Transactions of the Association for Computational Linguistics, 2022.
Terufumi Morishita, Gaku Morio, Shota Horiguchi, Hiroaki Ozaki, Nobuo Nukaga, "Rethinking Fano’s Inequality in Ensemble Learning," in Proc. ICML 2022.
Naokazu Uchida, Takeshi Homma, Makoto Iwayama, Yasuhiro Sogawa, "Reducing Offensive Replies in Open Domain Dialogue System," in Proc. INTERSPEECH 2022.
Amalia Istiqlali Adiba, Takeshi Homma, Yasuhiro Sogawa, "Unsupervised Domain Adaptation on Question-Answering System with Conversation Data," in Proc. SIGDIAL 2022.
Masaki Hamamoto, Hiroyuki Namba, Masashi Egi, "Ensemble-Based Method for Correcting Global Explanation of Prediction Model," in IEICE Transactions on Information and Systems 2023.
Benjamin B. Seiler, Masayoshi Mase, Art B. Owen, "What makes you unique?," in Electronic Journal of Statistics 2023.
Y. Tsuchiya, Y. Mori, and M. Egi, "Explainable Reinforcement Learning Based on Q-Value Decomposition by Expected State Transitions," Proceedings of the AAAI 2023 Spring Symposium on Challenges Requiring the Combination of Machine Learning and Knowledge Engineering (AAAI-MAKE 2023), 2023.
Y. Tsuchiya and M. Hamamoto, "Explanation Framework for Optimization-Based Scheduling: Evaluating Contributions of Constraints and Parameters by Shapley Values," ICAPS 2023 Workshop Human-Aware and Explainable Planning (HAXP), 2023.
Masayoshi Mase, Art B. Owen, and Benjamin B. Seiler, "Variable Importance Without Impossible Data," Annual Review of Statistics and Its Application, 2023.
N. Hama, Masayoshi Mase, Art B. Owen, "Deletion and Insertion Tests in Regression Models," Journal of Machine Learning Research (JMLR), 2023.
T. Nishida, T. Endo, and Y. Kawaguchi, "Zero-Shot Domain Adaptation of Anomalous Samples for Semi-Supervised Anomaly Detection," ICASSP, 2023
T. Morishita, G. Morio, A. Yamaguchi, and Y. Sogawa, "Learning Deductive Reasoning from Synthetic Corpus based on Formal Logic," ICML, 2023
A. Ito, S. Horiguchi, "Spoofing Attacker Also Benefits from Self-Supervised Pretrained Model," INTERSPEECH, 2023
K. Shimonishi, K. Dohi, and Y. Kawaguchi, "Anomalous Sound Detection Based on Sound Separation," INTERSPEECH, 2023
Y. Okamoto, K. Shimonishi, K. Imoto, K. Dohi, S. Horiguchi, and Y. Kawaguchi, "CAPTDURE: Captioned Sound Dataset of Individual Sources," INTERSPEECH, 2023
M. Tsunokake, A. Yamaguchi, Y. Koreeda, H. Ozaki, and Y. Sogawa, "Hitachi at SemEval-2023 Task 4: Exploring Various Task Formulations Reveals the Importance of Description Texts on Human Values," SemEval, 2023
Y. Koreeda, K. Yokote, H. Ozaki, A. Yamaguchi, M. Tsunokake, and Y. Sogawa, "Hitachi at SemEval-2023 Task 3: Exploring Cross-lingual Multi-task Strategies for Genre and Framing Detection in Online News," SemEval, 2023
T. Sasazawa, T. Morishita, H. Ozaki, O. Imaichi, and Y. Sogawa, "Controlling Keywords and Their Positions in Text Generation," INLG, 2023
T. Fujii, K. Shibata, A. Yamaguchi, T. Morishita, and Y. Sogawa, "How do different tokenizers perform on downstream tasks in scriptio continua languages?: A case study in Japanese," ACL Student Research Workshop, 2023
A. Yamaguchi, H. Ozaki, T. Morishita, G. Morio, and Y. Sogawa, "How Does the Task Complexity of Masked Pretraining Objectives Affect Downstream Performance?," ACL, 2023
Y. Koreeda, T. Morishita, O. Imaichi, and Y. Sogawa, "LARCH: Large Language Model-based Automatic Readme Creation with Heuristics," CIKM, 2023
T. V. Ho, S. Horiguchi, S. Watanabe, P. Garcia, and T. Sumiyoshi, "Synthetic Data Augmentation for ASR with Domain Filtering," APSIPA ASC, 2023
T. Morishita, Y. Koreeda, A. Yamaguchi, G. Morio, O. Imaichi, and Y. Sogawa, "CHICOT: A Developer-Assistance Toolkit for Code Search with High-Level Contextual Information," AAAI, 2024
S. Horiguchi, K. Dohi, and Y. Kawaguchi, "Streaming Active Learning for Regression Problems Using Regression via Classification," ICASSP, 2024
K. Dohi and Y. Kawaguchi, "Distributed Collaborative Anomalous Sound Detection by Embedding Sharing," EUSIPCO, 2024
T. Vu Ho, K. Dohi, and Y. Kawaguchi, "Stream-based Active Learning for Streaming Anomalous Sound Detection in Machine Condition Monitoring", INTERSPEECH, 2024
T. Morishita, A. Yamaguchi, G. Morio, H. Tomonari, O. Imaichi, and Y. Sogawa, "JFLD: A Japanese Benchmark for Deductive Reasoning based on Formal Logic", LREC-COLING, 2024
T. Morishita, G. Morio, A. Yamaguchi, and Y. Sogawa, "Enhancing Reasoning Capabilities of LLMs via Principled Synthetic Logic Corpus", NeurIPS, 2024
R. Nagase, T. Sumiyoshi, N. Yamashita, K. Dohi, and Y. Kawaguchi, "Can We Estimate Purchase Intention Based on Zero-shot Speech Emotion Recognition?," in Proc. APSIPA ASC, 2024.
K. Dohi, A. Ito, H. Purohit, T. Nishida, T. Endo, and Y. Kawaguchi, "Domain-Independent Automatic Generation of Descriptive Texts for Time-Series Data," in Proc. ICASSP, 2025.
H. Purohit, T. Nishida, K. Dohi, T. Endo, and Y. Kawaguchi, "MIMII-Gen: Generative Modeling Approach for Simulated Evaluation of Anomalous Sound Detection System," in Proc. EUSIPCO, 2025.
N. Yamashita, M. Yamamoto, and Y. Kawaguchi, "End-to-End Integration of Speech Emotion Recognition and Voice Activity Detection with a Self-Supervised Model for Noise Robustness," in Proc. APSIPA ASC, 2025.
R. Ogura, T. Nishida, and Y. Kawaguchi, "Retrieval-Augmented Difference Captioning to Explain Unsupervised Anomalous Sound Detection," in Proc. APSIPA ASC, 2025.
T. Nishida, H. Purohit, K. Dohi, T. Endo, and Y. Kawaguchi, "Timbre-Based Anomaly Explanation without Anomalous Training Data," in Proc. EUSIPCO, 2025.
A. Ito, K. Dohi, and Y. Kawaguchi, "CLaSP: Learning Concepts for Time-Series Signals from Natural Language Supervision," in Proc. EUSIPCO, 2025.
H. Ozaki, N. Tanahashi, N. Masuda, K. Yamada, M. Kato, and N. Isagawa, "Analytical Methodology and a Simulator for ESG-Financial Indicators Based on Causal Hypothesis Graph," STAI, 2024
Y. Tsuchiya, M. Mase, H. Matsuba, "Diverse Source-Aware Context Retrieval for Multi-Hop Queries: LLM-Powered Keyword Filtering and Cross-Source Ranking," AIxB, 2025.
D. Murata, T. Uezato, Y. Matsuda, K. Inata, S. Kujiraoka, M. Matsumoto, "Data-augmentation Technique using Helpful AI-generated Images for Training", ICMLA, 2025.
H. Matsuzaki, I. Karube, J. Hirayama, "Early Detection of Coordinated Online Community Using Graph Neural Networks", ASONAM workshop, 2025
A. Shibata, T. Gunji, M. Tsuda, T. Endo, K. Dohi, T. Nishida, S. Nomoto, "Automatic Inspection Based on Switch Sounds of Electric Point Machines," in Proc. ASPECT, 2025.
T. Nishida, N. Harada, D. Niizumi, D. Albertini, R. Sannino, S. Pradolini, F. Augusti, K. Imoto, K. Dohi, H. Purohit, T. Endo, Y. Kawaguchi, "Description and Discussion on DCASE 2025 Challenge Task 2: First-shot Unsupervised Anomalous Sound Detection for Machine Condition Monitoring," in Proc. DCASE, 2025.
H. Purohit, T. Nishida, K. Dohi, T. Endo, Y. Kawaguchi, "MIMII-Agent: Leveraging LLMs with Function Calling for Relative Evaluation of Anomalous Sound Detection," in Proc. DCASE, 2025.
T.V. Ho, H. Kokubo, M. Yamamoto, Y. Kawaguchi, "Model-free Speculative Decoding for Transformer-based ASR with Token Map Drafting," in Proc. EUSIPCO, 2025.
N. Yamashita, M. Yamamoto, H. Kokubo, Y. Kawaguchi, "LLM-based Generative Error Correction for Rare Words with Synthetic Data and Phonetic Context," in Proc. INTERSPEECH, 2025.
T. Nishida, N. Harada, D. Niizumi, D. Albertini, R. Sannino, S. Pradolini, F. Augusti, K. Imoto, K. Dohi, H. Purohit, T. Endo, Y. Kawaguchi, "Description and Discussion on DCASE 2024 Challenge Task 2: First-Shot Unsupervised Anomalous Sound Detection for Machine Condition Monitoring," in Proc. DCASE, 2024.
K. Dohi, K. Imoto, N. Harada, D. Niizumi, Y. Koizumi, T. Nishida, H. Purohit, R. Tanabe, T. Endo, Y. Kawaguchi, "Description and Discussion on DCASE 2023 Challenge Task 2: First-Shot Unsupervised Anomalous Sound Detection for Machine Condition Monitoring," in Proc. DCASE, 2023.
Y. Xue, M. Tsunokake, Y. Koreeda, E. Amin, T. Sumiyoshi, Y. Sogawa, "Exploring Fine-Tuning Methods for Agentic LLMs in Domain-Specific Microdomains," AIxB, 2025.
G. Morio, R. Hall, D. Stampa, C. D. Manning, P. Henderson, "A Multimodal Benchmark for Framing of Oil & Gas Advertising and Potential Greenwashing Detection," NeurIPS, 2025.
M. Tsunokake, Y. Koreeda, H. Tomonari, K. Nagatsuka, Y. Sogawa, "Is Micro-Domain Adaptive Pre-Training Effective for Real-World Operations? Multi-Step Evaluation Reveals Potential and Bottlenecks," in Proc. EACL, 2026.
For the latest publication list, please see the following page.
https://hitachi-speech.github.io/
Announcement, "Hitachi's anomalous sound detection solution receives the 29th Technical Development Award of the Acoustical Society of Japan,"
https://www.facebook.com/hitachi.it/posts/%E6%97%A5%E7%AB%8B%E3%81%AE%E7%95%B0%E9%9F%B3%E6%A4%9C%E7%9F%A5%E3%82%BD%E3%83%AA%E3%83%A5%E3%83%BC%E3%82%B7%E3%83%A7%E3%83%B3%E3%81%8C%E7%AC%AC29%E5%9B%9E-%E6%97%A5%E6%9C%AC%E9%9F%B3%E9%9F%BF%E5%AD%A6%E4%BC%9A%E6%8A%80%E8%A1%93%E9%96%8B%E7%99%BA%E8%B3%9E%E3%82%92%E5%8F%97%E8%B3%9E%E3%81%97%E3%81%BE%E3%81%97%E3%81%9F%E7%95%B0%E9%9F%B3%E6%A4%9C%E7%9F%A5%E3%82%BD%E3%83%AA%E3%83%A5%E3%83%BC%E3%82%B7%E3%83%A7%E3%83%B3%E3%81%AF%E6%97%A5%E7%AB%8B%E3%81%AE%E8%87%AA%E7%A4%BE%E5%B7%A5%E5%A0%B4%E3%81%A7%E3%81%AE%E5%AE%9F%E7%B8%BE%E3%83%8E%E3%82%A6%E3%83%8F%E3%82%A6%E3%82%92%E3%82%82%E3%81%A8%E3%81%AB%E5%AE%9F%E7%94%A8%E5%8C%96%E3%81%97%E3%81%9F%E3%83%9E%E3%82%A4%E3%82%AF%E6%A9%9F%E8%83%BD%E6%90%AD/4000020563392835/
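FY2023 Annual Conference Excellence Award of the Japanese Society for Artificial Intelligence (JSAI), "README Generation Based on Large Language Models and Heuristics"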
https://www.ai-gakkai.or.jp/about/award/jsai_award-conf/
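FY2025 Annual Conference Excellence Award of the Japanese Society for Artificial Intelligence (JSAI), "EconGrowthAgent: Macroeconomic Simulation Based on LLM Agents and Economic Growth Theory"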
https://www.ai-gakkai.or.jp/about/award/jsai_award-conf/
Qiita Zine, "XAI and NLP: Why does Hitachi continue joint research in AI with world-leading Stanford University?,"
https://zine.qiita.com/interview/202112-hitachi/
Qiita Zine, "Exploring the R&D mindset of Hitachi members in the increasingly prominent field of speech processing technology,"
https://zine.qiita.com/interview/202111-hitachi-5/
News release, "Launch of the 'Speech-to-Text Cloud Service,' which uses voice data to help improve customer experience,"
https://www.hitachi.co.jp/New/cnews/month/2021/10/1012.html
News release, "Mitsubishi UFJ Morgan Stanley Securities introduces a customer interaction monitoring system using speech recognition and AI,"
https://www.hitachi.co.jp/New/cnews/month/2021/10/1001.html
Qiita Zine, "Taking risks is what makes a researcher: the mindset of Hitachi researchers who keep delivering results in acoustics and image recognition,"
https://zine.qiita.com/interview/202103-hitachi/
News release, "Formulation of 'AI Ethics Principles' for the Social Innovation Business,"
https://www.hitachi.co.jp/New/cnews/month/2021/02/0222.html
Qiita Zine, "Is AI a black box? Hitachi takes on social issues using XAI, which explains the basis for decisions,"
https://zine.qiita.com/interview/202102-hitachi/
Qiita Zine, "That's why speech is fascinating! How Hitachi builds new services that visualize human emotions,"
https://zine.qiita.com/interview/202207-hitachi-2/
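Kyoso-no-Mori Webinar, "What perspective is essential for humans and AI to co-evolve? | Kyoso-no-Mori Webinar No. 17, 'Social Implementation of Cyber Systems and Its Challenges,' Program 3: 'Toward a Society Where Humans and AI Co-evolve',"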
https://linkingsociety.hitachi.co.jp/_ct/17664780
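Research Topics, "Development of a fundamental technology that automatically creates training data to strengthen the logical reasoning capabilities of generative AI",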
https://rd.hitachi.co.jp/_ct/17736579
Front-runners in generative AI utilization, "Taking generative AI to the next stage through advanced RAG",
https://deh.hitachi.co.jp/_ct/17733925
Hitachi Industrial AI blog, "Mapping industrial time-series to natural language: Making sensor data readable and searchable in plain English"
https://rd.hitachi.com/_ct/17805833