The surveys below look like useful references.
(What kind of people manage to write surveys, I wonder... They must be really diligent...)
https://arxiv.org/abs/2507.19595
Efficient Attention Mechanisms for Large Language Models: A Survey (submitted 25 Jul 2025 (v1); last revised 7 Aug 2025 (v2))
https://arxiv.org/abs/2412.19442
A Survey on Large Language Model Acceleration based on KV Cache Management (submitted 27 Dec 2024 (v1); last revised 30 Jul 2025 (v3))