My research asks: how do we move LLMs beyond pattern matching toward genuine understanding?

I think of it like a Taylor expansion. Prompt engineering gives us a first-order approximation: linear workflows. RL and fine-tuning add second-order terms: reasoning chains. But true creativity lives in the higher-order terms.
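One way to write the analogy out explicitly (the labels under each term are my own illustrative mapping, not a formal claim):

```latex
f(x) \;\approx\;
\underbrace{f(a)}_{\text{memorized base}}
\;+\; \underbrace{f'(a)\,(x-a)}_{\text{1st order: prompting}}
\;+\; \underbrace{\tfrac{f''(a)}{2}\,(x-a)^2}_{\text{2nd order: RL / reasoning}}
\;+\; \underbrace{\cdots}_{\text{higher order: mental models}}
```

Each successive term refines the approximation of the target function; the bet is that current training regimes stop at low-order terms.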

My work focuses on mental models: structured ways of understanding that could give LLMs higher-order capabilities. Current focus: agentic deep research. End goal: machines that genuinely predict, not merely recall.

Approximating Intelligence

[Figure: approximating a target f(x), starting from f(x) ≈ a₀. 1st order: Prompt Engineering; 2nd order: RL / Reasoning; higher order: Mental Models.]


Temporal Leakage in Search-Engine Date-Filtered Web Retrieval

71% of date-filtered queries return post-cutoff data

arXiv preprint (under review for ACL 2026) · Yuxuan Wang et al.