Toward faithful evaluation and alignment of long-context language models.
Yekyung Kim

I am a third-year Ph.D. student at the University of Maryland, CLIP Lab, advised by Mohit Iyyer. My research in NLP asks how to evaluate and align language models as everything gets long — across long-context understanding, long-form generation, and long-horizon problem-solving. I began my Ph.D. at UMass NLP and later transferred to UMD along with my advisor.

Evaluating faithfulness & factuality — in long-context understanding (FABLES, OneRuler) and long-form generation (VeriScore), including recent work on argument collapse.

Aligning language models — post-training with synthetic data for instruction following (BLEUBERI) and compositional reasoning (ongoing work).

Before my Ph.D., I worked at Hyundai Motor Group and LG Electronics as a research engineer. I was selected as a specialist in AI and conducted research at CMU LTI as a visiting scientist mentored by Jaime Carbonell.

news

Jun 2026 Started interning at the Document Intelligence Lab, Adobe (primary mentor: Joe Barrow).
Sep 2025 BLEUBERI was accepted to NeurIPS 2025!
Jul 2025 OneRuler was accepted to COLM 2025!
Sep 2024 VeriScore was accepted to EMNLP Findings 2024.
May 2024 FABLES was accepted to COLM 2024.

publications

  1. BLEUBERI
    BLEUBERI: BLEU is a Surprisingly Effective Reward for Instruction Following
    Yapei Chang, Yekyung Kim, Michael Krumdick, Amir Zadeh, Chuan Li, Chris Tanner, and Mohit Iyyer
    NeurIPS 2025
  2. OneRuler
    One Ruler to Measure Them All: Benchmarking Multilingual Long-Context Language Models
    Yekyung Kim, Jenna Russell, Marzena Karpinska, and Mohit Iyyer
    COLM 2025
  3. VeriScore
    VeriScore: Evaluating the Factuality of Verifiable Claims in Long-Form Text Generation
    Yixiao Song, Yekyung Kim, and Mohit Iyyer
    EMNLP Findings 2024
  4. FABLES
    FABLES: Evaluating Faithfulness and Content Selection in Book-Length Summarization
    Yekyung Kim, Yapei Chang, Marzena Karpinska, Aparna Garimella, Varun Manjunatha, Kyle Lo, Tanya Goyal, and Mohit Iyyer
    COLM 2024
  5. Safe street crossing
    Is It Safe to Cross? Interpretable Risk Assessment with GPT-4V for Safety-Aware Street Crossing
    Hochul Hwang, Sunjae Kwon, Yekyung Kim, and Donghyun Kim
    Ubiquitous Robots (UR) 2024
  6. LINDA
    LINDA: Unsupervised Learning to Interpolate in Natural Language Processing
    Yekyung Kim, Seohyeong Jeong, and Kyunghyun Cho
    arXiv 2021
  7. InfoVerse
    InfoVerse: A Universal Framework for Dataset Characterization with Multidimensional Meta-information
    Jaehyung Kim, Yekyung Kim, Karin de Langis, Jinwoo Shin, and Dongyeop Kang
    ACL 2023
  8. Meta-Crafting
    Meta-Crafting: Improved Detection of Out-of-Distributed Texts via Crafting Metadata Space
    Ryan Koo, Yekyung Kim, Dongyeop Kang, and Jaehyung Kim
    AAAI 2024 (Student Abstract & Poster)
  9. Deep Active Learning
    Deep Active Learning for Sequence Labeling Based on Diversity and Uncertainty in Gradient
    Yekyung Kim
    Life-long Learning for Spoken Language Systems Workshop @ AACL 2021
  10. Korean NER
    Learning Sub-Character Level Representation for Korean Named Entity Recognition
    Yejin Kim and Yekyung Kim (equal contribution)
    FLAIRS 2020
  11. Nowplaying the Future Billboard
    #Nowplaying the Future Billboard: Mining Music Listening Behaviors of Twitter Users for Hit Song Prediction
    Yekyung Kim, Bongwon Suh, and Kyogu Lee
    SoMeRA Workshop @ SIGIR 2014
  12. Visual Analytics of Tweets
    A Visual Analytics Approach to Summarizing Tweets
    Ramik Sadana, Yekyung Kim, Bongwon Suh, and Eunyee Koh
    Industry Day @ SIGIR 2014

industry projects

  1. Airstar airport robot
    Airstar — Incheon Airport Robot
    LG Electronics
  2. Hyundai in-car AI assistant
    AI Assistant for Cars
    Hyundai Motor Group
  3. LG ThinQ chatbot
    Chatbot for Home Appliances
    LG Electronics