We Are Not Losing Our Voice to AI
People are scared that AI is making the internet flat and boring. They worry it's creating a sea
Hot Takes from Ilya's Interview
Ilya Sutskever, on Dwarkesh Podcast:
But when people do RL training, they do need to think. They say, “Okay, we
The Art of the Productive Failure
I'm starting to believe the most valuable runs are the ones that fail catastrophically. The runs where the
The Frontend Verification Gap for AI Agents
Agentic coding loops are a marvel for backend work. An LLM can write some code, run a test to verify
2025 Thoughts
✦ September
Learnings from Cursor Tab
Cursor's "Tab RL" blog:
To use policy gradient methods to improve Tab, we defined a reward
Confidence != Wisdom
The single most dangerous design flaw in today's language models isn't that they hallucinate; it'
LLM's Data
We obsess over model architecture and FLOPS, but the most important component of any new language model is the one
The "AI Button"
The new trend of every app getting an "AI button" feels like a solution in search of a
The Browser Window
There seems to be a strange desperation in the air with AI companies trying to buy web browsers. It’s
Benchmark Matters
A recent realization I've had is that the most consequential design document in modern AI research isn't the
Type 1 / 2 Hard Problems
An underdeveloped meta-skill in AI research is distinguishing between "Type 1 Hard" and "Type 2 Hard"
Keynote / Experiments
It's that time of year again—keynote season. We're about to see a firehose of new