Hello, I’m Zhiyu Guo, a PhD student at Nara Institute of Science and Technology, supervised by Taro Watanabe and Hidetaka Kamigaito. My research focuses on large language models (LLMs), with a particular emphasis on improving inference efficiency through sparsity-based methods. I am currently seeking job opportunities in Japan.
📝 Publications
- Attention Score is not All You Need for Token Importance Indicator in KV Cache Reduction: Value Also Matters, Zhiyu Guo, Hidetaka Kamigaito, Taro Watanabe, Proceedings of EMNLP 2024
- Dependency-Aware Semi-Structured Sparsity of GLU Variants in Large Language Models, Zhiyu Guo, Hidetaka Kamigaito, Taro Watanabe, TMLR 2025
- Document-level Neural Machine Translation Using BERT as Context Encoder, Zhiyu Guo, Minh Le Nguyen, AACL 2020 Student Research Workshop
🎖 Honors and Awards
- 2020.02, Kaggle TensorFlow 2.0 Question Answering: Silver Medal (solo participation), rank 20/1233.
- 2020.02, Kaggle Google QUEST Q&A Labeling: Silver Medal, rank 52/1571.
- 2019.09, JAIST Master's Program Scholarship (Top 10%).
📖 Education
- 2021.04 - now, PhD student, Nara Institute of Science and Technology, Japan.
- 2018.10 - 2021.03, Master's program, Japan Advanced Institute of Science and Technology, Japan.
- 2014.09 - 2018.06, China University of Mining and Technology, China.
💬 Invited Talks
- 2024.03, Dependency-Aware Semi-Structured Sparsity of GLU Variants in Large Language Models, Tokyo AI Community.
💻 Internships
- 2024.06 - now, Sparticle, Japan.