/ai-ml
Posts touching AI/ML.
11 posts
- February 28, 2026 4 min
What AI changes about engineering leadership
Code review matters more than code writing. Architectural taste matters more than implementation speed. Evaluating AI-generated code is the new critical skill.
- /ai-ml
- /engineering-leadership
- /hiring
- /decision-making
- /blog
- February 9, 2026 4 min
Opus 4.6 has a million-token context window and I am still not sure what to do with it
Feeding entire codebases into a single prompt for architecture review produced mixed results. The practical sweet spots are narrower than the capability suggests.
- /ai-ml
- /engineering-leadership
- /decision-making
- /learning
- December 8, 2025 5 min
S3 Vectors at re:Invent made me reconsider our entire RAG architecture
2 billion vectors per index at 90% cost reduction. Our pgvector pipeline still makes sense, but the calculus for startups without millisecond latency needs is changing.
- /aws
- /ai-ml
- /architecture
- /financeops
- /learning
- December 1, 2025 5 min
Opus 4.5 is the first AI model I trust to refactor production code unsupervised
Claude Opus 4.5 refactored a 500-line TypeScript module, maintained all tests, and passed review without modification. The bottleneck has shifted from writing to reviewing.
- /ai-ml
- /engineering-leadership
- /typescript
- /blog
- September 10, 2025 4 min
Sonnet 4.5 replaced our first-pass code review and nobody complained
AI handles style violations and missing error handling. Human reviewers focus on architecture and business logic. Review turnaround dropped from 24 hours to 4.
- /ai-ml
- /engineering-leadership
- /ci-cd
- /blog
- August 11, 2025 4 min
GPT-5 shipped and my team asked if we still need junior engineers
The answer is yes, but for reasons that forced us to articulate what junior engineers actually contribute beyond lines of code.
- /engineering-leadership
- /ai-ml
- /hiring
- /decision-making
- /blog
- May 26, 2025 5 min
Claude Opus 4 and Sonnet 4: the week AI coding tools stopped being novelties and became infrastructure
When Claude Opus 4 hit 72.5% on SWE-bench and solved a TypeScript generics issue that had stumped our team, the conversation shifted from "should we use AI" to "how do we integrate it."
- /ai-ml
- /typescript
- /architecture
- /engineering-leadership
- /blog
- April 21, 2025 4 min
OpenAI o3 and o4-mini: reasoning models are getting good enough to replace junior code review
o3 makes 20% fewer major errors than o1, and o4-mini makes reasoning affordable for CI pipelines. AI review caught a financial calculation rounding error that three humans missed.
- /ai-ml
- /ci-cd
- /typescript
- /engineering-leadership
- /blog
- February 17, 2025 5 min
Claude 3.7 Sonnet's extended thinking and what it means for code review at a small team
Extended thinking mode changed how I approach code review on a team too small for dedicated reviewers. Step-by-step deliberation catches subtle type issues that fast-pass models miss.
- /ai-ml
- /ci-cd
- /engineering-leadership
- /typescript
- /blog
- January 27, 2025 5 min
DeepSeek R1 and the moment I realized open-source AI would change how we build internal tools
DeepSeek R1 shipped as a 671B open-source model matching GPT-4o benchmarks for under $6M training cost. Self-hostable reasoning models change the calculus for regulated fintech.
- /ai-ml
- /architecture
- /fintech
- /decision-making
- /blog
- October 23, 2024 5 min
Claude 3.5 Sonnet v2 and the week I mass-refactored our codebase with an AI pair programmer
I used Claude 3.5 Sonnet v2 to refactor our error handling layer, migrate 80 test files, and generate TypeScript types from our OpenAPI spec. This is a workflow journal.
- /typescript
- /architecture
- /ai-ml
- /startup-life
- /blog