
Philosophy

How I Think About AI

Working on frontier AI systems has shaped how I think about technology, responsibility, and the future. These are the principles that guide my work.

Core Principles

Safety is not a feature; it's the foundation

Building powerful AI without safety considerations is like building a rocket without thinking about where it lands. The goal isn't to slow down progress - it's to ensure we're progressing in a direction worth going.

Interpretability is not optional

I believe we shouldn't deploy systems we don't understand. Mechanistic interpretability isn't just academically interesting - it's how we build trust, catch failures, and ensure alignment. Black boxes are tech debt we can't afford.

Rigor over speed (usually)

In safety research, a poorly designed experiment can miss critical failure modes. I've learned to slow down, triple-check the experimental setup, and question my assumptions. The 30 minutes you spend reviewing your methodology saves the 30 hours you'd spend wondering why the model fooled your evaluation.

Collaborate like the stakes are high

Because they are. AI development is too important for hero culture. I believe in open discussion, honest disagreement, and the humility to change my mind when someone has a better idea. The best ideas rarely come from one person.

What I Believe

On AI Safety

  • ā—AI alignment is a real, technical problem - not just philosophy
  • ā—Current systems are not aligned by default; they're aligned by effort
  • ā—The time to solve safety is before we need it, not after
  • ā—Interpretability research is undervalued relative to capabilities research
  • ā—We should be building AI that helps us understand AI

On Research Engineering

  • ā—Good infrastructure is invisible; bad infrastructure is all you can see
  • ā—Reproducibility is non-negotiable, even when it's painful
  • ā—The best code is the code you don't have to write
  • ā—Documentation is a gift to your future self
  • ā—If you can't explain it simply, you don't understand it well enough

On Work & Life

  • ā—Sustainable pace beats burnout heroics every time
  • ā—The best debugging happens after a good night's sleep
  • ā—Mentorship scales better than individual contribution
  • ā—Imposter syndrome is universal; confidence is learned
  • ā—Taking breaks is not slacking - it's maintenance

Questions I'm Thinking About

These are the questions that keep me up at night. If you have thoughts on any of them, I'd love to talk.

“How do we ensure AI systems remain aligned as they become more capable?”
“What's the right balance between open research and responsible deployment?”
“How do we build interpretability tools that scale with model size?”
“What does 'understanding' a neural network actually mean?”
“How do we train models that are genuinely helpful without being sycophantic?”
“What organizational structures best support AI safety research?”

Note: These views are my own and don't necessarily represent the positions of my employer. I'm always learning and happy to be proven wrong.