Hello, I’m Zhiyu Guo, a PhD student at Nara Institute of Science and Technology, supervised by Taro Watanabe and Hidetaka Kamigaito. My research focuses on large language models (LLMs), with a particular emphasis on improving inference efficiency through sparsity-based methods. I am currently seeking job opportunities in Japan.
📝 Publications
- Attention Score is not All You Need for Token Importance Indicator in KV Cache Reduction: Value Also Matters, Zhiyu Guo, Hidetaka Kamigaito, Taro Watanabe, Proceedings of EMNLP 2024
- Dependency-Aware Semi-Structured Sparsity of GLU Variants in Large Language Models, Zhiyu Guo, Hidetaka Kamigaito, Taro Watanabe, TMLR 2025
- Document-level Neural Machine Translation Using BERT as Context Encoder, Zhiyu Guo, Minh Le Nguyen, AACL 2020 Student Research Workshop
🎖 Honors and Awards
- 2020.02, Kaggle TensorFlow 2.0 Question Answering: Silver Medal (solo participation), rank 20/1233.
- 2020.02, Kaggle Google QUEST Q&A Labeling: Silver Medal, rank 52/1571.
- 2019.09, JAIST Master's Program Scholarship (Top 10%).
📖 Education
- 2021.04 - now, PhD student, Nara Institute of Science and Technology, Japan.
- 2018.10 - 2021.03, Master's program, Japan Advanced Institute of Science and Technology, Japan.
- 2014.09 - 2018.06, China University of Mining and Technology, China.
💬 Invited Talks
- 2024.03, Dependency-Aware Semi-Structured Sparsity of GLU Variants in Large Language Models, Tokyo AI Community.
💻 Internships
- 2024.06 - now, Sparticle, Japan.