Minsoo Kim

minsoo2333 [at] hanyang.ac.kr


Hi! I am Minsoo Kim, a Machine Learning Researcher on the MIND team at Apple. I received my Ph.D. from the AI Hardware & Algorithm Lab at Hanyang University, advised by Professor Jungwook Choi. Here is my CV.

My core research centers on efficient algorithms for generative language models, with a focus on real-world applications of Large Language Models (LLMs) and Large Multimodal Models. I address key efficiency challenges in these models, including long-context processing, efficient retrieval mechanisms, and inference acceleration.

News

Apr 2026 1 paper accepted and attending @ ICML 26 🇰🇷
Mar 2026 Starting as an ML Researcher at Apple
Sep 2025 1 paper accepted and attending @ NeurIPS 25 🇺🇸
Mar 2025 Starting as a Ph.D. Intern at Apple

Selected Publications

Click to view the full list of publications.

  1. ICML 2026
    EpiCache: Episodic KV Cache Management for Long-Term Conversation on Resource-Constrained Environments
    Minsoo Kim, Arnav Kundu, Han-Byul Kim, Richa Dixit, and Minsik Cho
    Forty-Third International Conference on Machine Learning (ICML), 2026
  2. NeurIPS 2025
    InfiniPot-V: Memory-Constrained KV Cache Compression for Streaming Video Understanding
    Minsoo Kim, Kyuhong Shim, Jungwook Choi, and Simyung Chang
    The Thirty-Ninth Annual Conference on Neural Information Processing Systems (NeurIPS), 2025
  3. EMNLP 2024
    InfiniPot: Infinite Context Processing on Memory-Constrained LLMs
    Minsoo Kim, Kyuhong Shim, Jungwook Choi, and Simyung Chang
    Proceedings of the 2024 Conference on Empirical Methods in Natural Language Processing (EMNLP), 2024
  4. ACL 2024
    RA-LoRA: Rank-Adaptive Parameter-Efficient Fine-Tuning for Accurate 2-bit Quantized Large Language Models
    Minsoo Kim, Sihwa Lee, Wonyong Sung, and Jungwook Choi
    Findings of the Association for Computational Linguistics: ACL 2024 (ACL), 2024
  5. NeurIPS 2023
    Token-Scaled Logit Distillation for Ternary Weight Generative Language Models
    Minsoo Kim, Sihwa Lee, Janghwan Lee, Suk-Jin Hong, Du-Seong Chang, and 2 more authors
    The Thirty-Seventh Annual Conference on Neural Information Processing Systems (NeurIPS), 2023
  6. EMNLP 2023
    Enhancing Computation Efficiency in Large Language Models through Weight and Activation Quantization
    Janghwan Lee*, Minsoo Kim*, Seungcheol Baek, Seokjoong Hwang, Wonyong Sung, and 1 more author
    Proceedings of the 2023 Conference on Empirical Methods in Natural Language Processing (EMNLP), 2023
  7. EACL 2023
    Teacher Intervention: Improving Convergence of Quantization Aware Training for Ultra-Low Precision Transformers
    Minsoo Kim, Kyuhong Shim, Seongmin Park, Wonyong Sung, and Jungwook Choi
    Proceedings of the 17th Conference of the European Chapter of the Association for Computational Linguistics (EACL), 2023
  8. EMNLP 2022
    Understanding and Improving Knowledge Distillation for Quantization Aware Training of Large Transformer Encoders
    Minsoo Kim, Sihwa Lee, Suk-Jin Hong, Du-Seong Chang, and Jungwook Choi
    Proceedings of the 2022 Conference on Empirical Methods in Natural Language Processing (EMNLP), 2022