When Boris Cherny—the creator of Claude Code—was asked for his single most important tip, he didn't mention prompting techniques or model selection.
He said: "Give Claude a way to verify its work."
That's it. That one practice, according to him, improves output quality by 2-3x.
Most developers skip this step. They prompt, review, fix, repeat. It's exhausting. Verification-first development flips this around: you set up the feedback loop before you start, then let the AI iterate until it gets things right.
Here's how to implement this in your workflow.
Why Verification Changes Everything
Without verification, AI coding is a guessing game. Claude generates code, you read it, you decide if it looks right, you run it, you find the bugs, you go back and forth.
With verification, Claude generates code, runs it against your criteria, sees what failed, fixes it, and repeats—often without needing your input at all.
The shift is fundamental: you move from being a reviewer to being a system designer. Your job isn't to catch every bug. Your job is to build the system that catches bugs automatically.
The Verification Hierarchy
Not all verification is equal. Here's how different methods stack up:
Level 1: Test Suites (Baseline)
If you have tests, Claude can run them.
## In your CLAUDE.md:
Before completing any task:
1. Run `pnpm test` to verify changes don't break existing functionality
2. If tests fail, fix the issues before marking complete
3. Add new tests for new functionality
This is table stakes. If you don't have tests, Claude is flying blind.
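Even one focused test gives Claude something concrete to run. A minimal sketch, assuming Vitest and a hypothetical `formatPrice` helper:

// tests/formatPrice.test.ts
// Assumes Vitest and a hypothetical formatPrice(cents: number): string helper
import { describe, expect, it } from "vitest";
import { formatPrice } from "../src/formatPrice";

describe("formatPrice", () => {
  it("formats whole dollar amounts", () => {
    expect(formatPrice(1000)).toBe("$10.00");
  });

  it("rejects negative amounts", () => {
    expect(() => formatPrice(-1)).toThrow();
  });
});

Once something like this exists, "run `pnpm test`" in your CLAUDE.md actually means something.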
Level 2: Type Checking and Linting
Static analysis catches entire categories of errors before runtime.
// In your Claude Code hooks:
{
  "hooks": {
    "PostToolUse": [{
      "matcher": "Write|Edit",
      "hooks": [{
        "type": "command",
        "command": "pnpm typecheck && pnpm lint || true"
      }]
    }]
  }
}
Now every time Claude writes or edits a file, it immediately sees type errors and lint violations. It can fix them in the same session.
Level 3: Browser Testing
For frontend work, nothing beats actually seeing the result.
Claude Code can open a browser, navigate to your app, and check if things work. Boris's team uses this extensively:
## In your CLAUDE.md:
For UI changes:
1. Run the dev server: `pnpm dev`
2. Open Chrome and navigate to the affected page
3. Verify the change looks correct visually
4. Test any interactive elements
5. Check browser console for errors
This catches the issues that pass tests but look broken to users.
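If you want the console-error step to be repeatable rather than ad hoc, it can be scripted. A rough sketch with Playwright, assuming it's installed and your dev server runs on localhost:3000:

// e2e/smoke.spec.ts
// Assumes @playwright/test and a dev server already running on localhost:3000
import { expect, test } from "@playwright/test";

test("page renders without console errors", async ({ page }) => {
  const errors: string[] = [];
  page.on("console", (msg) => {
    if (msg.type() === "error") errors.push(msg.text());
  });

  await page.goto("http://localhost:3000");
  await expect(page.getByRole("heading").first()).toBeVisible();
  expect(errors).toEqual([]);
});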
Level 4: End-to-End Verification
The highest level: automated flows that verify the entire user journey.
## Subagent: verify-app
Before any PR is ready for review:
1. Start the application
2. Run through the critical user paths:
   - User can sign up
   - User can log in
   - User can complete core action
   - User can log out
3. Check for console errors at each step
4. Verify no visual regressions
Boris uses a dedicated subagent for this. It's the final gate before anything ships.
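The same checklist maps naturally onto an automated end-to-end test that a subagent (or you) can run. A sketch with Playwright, using hypothetical routes, labels, and test credentials:

// e2e/critical-path.spec.ts
// Hypothetical routes, form labels, and test credentials
import { expect, test } from "@playwright/test";

test("user can sign up, reach the dashboard, and log out", async ({ page }) => {
  await page.goto("http://localhost:3000/signup");
  await page.getByLabel("Email").fill("test@example.com");
  await page.getByLabel("Password").fill("correct-horse-battery");
  await page.getByRole("button", { name: "Sign up" }).click();
  await expect(page).toHaveURL(/dashboard/);

  await page.getByRole("button", { name: "Log out" }).click();
  await expect(page).toHaveURL(/login/);
});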
Setting Up Stop Hooks
Stop hooks run when Claude completes a task. They're perfect for verification.
{
  "hooks": {
    "Stop": [{
      "hooks": [{
        "type": "command",
        "command": "pnpm test && pnpm typecheck && echo '✓ All checks passed'"
      }]
    }]
  }
}
Now Claude can't mark something "done" without passing your quality gates.
For more complex verification:
{
  "hooks": {
    "Stop": [{
      "hooks": [{
        "type": "command",
        "command": "./scripts/verify.sh"
      }]
    }]
  }
}
Your verify.sh script can run whatever checks make sense for your project:
#!/bin/bash
echo "Running verification suite..."
echo "1. Type checking..."
pnpm typecheck || exit 1
echo "2. Linting..."
pnpm lint || exit 1
echo "3. Unit tests..."
pnpm test || exit 1
echo "4. Build check..."
pnpm build || exit 1
echo "✓ All verification passed"
Verification by Project Type
Web Applications
## Verification Requirements
For any UI change:
- [ ] Component renders without errors
- [ ] No TypeScript errors
- [ ] No console warnings
- [ ] Responsive on mobile viewport
- [ ] Accessible (keyboard navigation works)
For any API change:
- [ ] Endpoint returns expected data
- [ ] Error cases return proper error responses
- [ ] No breaking changes to existing consumers
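For the API items, a small request-level test keeps the checklist honest. A sketch assuming Vitest, a dev server on localhost:3000, and a hypothetical `/api/login` endpoint:

// tests/api/login.test.ts
// Assumes Vitest, a running dev server, and a hypothetical /api/login route
import { expect, it } from "vitest";

it("returns a proper error response for an invalid payload", async () => {
  const res = await fetch("http://localhost:3000/api/login", {
    method: "POST",
    headers: { "Content-Type": "application/json" },
    body: JSON.stringify({}), // missing email and password
  });

  expect(res.status).toBe(400);
  const body = await res.json();
  expect(body.error).toBeTruthy();
});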
CLI Tools
## Verification Requirements
For any command change:
- [ ] Help text is accurate
- [ ] Command runs without errors on valid input
- [ ] Command fails gracefully on invalid input
- [ ] Exit codes are correct (0 for success, non-zero for failure)
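Exit-code behavior is easy to pin down with a couple of process-level tests. A sketch assuming Vitest and a hypothetical `./bin/mycli` entry point:

// tests/cli.test.ts
// Assumes Vitest and a hypothetical ./bin/mycli executable
import { spawnSync } from "node:child_process";
import { expect, it } from "vitest";

it("exits 0 and prints help on --help", () => {
  const result = spawnSync("./bin/mycli", ["--help"], { encoding: "utf8" });
  expect(result.status).toBe(0);
  expect(result.stdout).toContain("Usage");
});

it("exits non-zero on an unknown flag", () => {
  const result = spawnSync("./bin/mycli", ["--does-not-exist"], { encoding: "utf8" });
  expect(result.status).not.toBe(0);
});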
Libraries/Packages
## Verification Requirements
For any public API change:
- [ ] All tests pass
- [ ] TypeScript types are correct
- [ ] Documentation is updated
- [ ] No breaking changes (or version bump if breaking)
- [ ] Bundle size hasn't increased significantly
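The bundle-size item can be a hard gate instead of a judgment call. A sketch of a check script, assuming the build output lands at `dist/index.js` and an arbitrary 20 KB gzipped budget:

// scripts/check-bundle-size.ts
// Hypothetical output path and size budget; adjust both to your package
import { readFileSync } from "node:fs";
import { gzipSync } from "node:zlib";

const BUDGET_BYTES = 20_000;

const bundle = readFileSync("dist/index.js");
const gzipped = gzipSync(bundle).byteLength;

if (gzipped > BUDGET_BYTES) {
  console.error(`Bundle is ${gzipped} bytes gzipped, over the ${BUDGET_BYTES}-byte budget`);
  process.exit(1);
}

console.log(`Bundle size OK: ${gzipped} bytes gzipped`);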
The Verification Prompt Pattern
When you don't have automated verification, you can build it into your prompts:
Instead of:
"Add a login form to the homepage"
Try:
"Add a login form to the homepage. After implementing:
- Run the dev server and verify the form renders
- Test that form validation shows errors for empty fields
- Test that form submission calls the auth endpoint
- Check the browser console for any errors
- Tell me what you verified and any issues found"
This forces Claude into a verification mindset even without automated tooling.
Common Verification Mistakes
Mistake 1: No Verification at All
The most common failure mode: developers trust that the code "looks right" and ship it.
Fix: Start with test suites. Even basic tests are better than none.
Mistake 2: Verification That's Too Slow
If your test suite takes 10 minutes, Claude won't run it often enough.
Fix: Create a fast verification path for development:
# Fast checks (run constantly)
pnpm typecheck && pnpm lint
# Full checks (run before commit)
pnpm test && pnpm build
Mistake 3: Verification Without Feedback
Running tests isn't enough if Claude doesn't see the output.
Fix: Make sure test output is visible to Claude. Don't suppress errors. Let failures be loud.
Mistake 4: Only Testing Happy Paths
Your verification only checks that things work when used correctly.
Fix: Include error case verification:
## Error Handling Verification
For any new feature:
- Test with missing required fields
- Test with invalid data types
- Test with network failures (if applicable)
- Test with unauthorized users
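In code, that can be as small as a few tests aimed squarely at the unhappy paths. A sketch assuming Vitest and a hypothetical `validateSignup` function that throws on bad input:

// tests/validateSignup.test.ts
// Assumes Vitest and a hypothetical validateSignup(input: unknown) that throws on invalid payloads
import { expect, it } from "vitest";
import { validateSignup } from "../src/validateSignup";

it("rejects a payload with missing required fields", () => {
  expect(() => validateSignup({ email: "user@example.com" })).toThrow(); // no password
});

it("rejects invalid data types", () => {
  expect(() => validateSignup({ email: 42, password: "hunter2" })).toThrow();
});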
Building Verification Into Your Workflow
Daily Development
- Start each session by confirming tests pass
- Run fast checks (typecheck, lint) after each change
- Run full test suite before committing
Code Review
- Before approving, run the verification suite
- Check that new code has corresponding tests
- Verify edge cases are covered
Before Shipping
- Run end-to-end verification
- Test in staging environment
- Verify rollback plan exists
The Investment Pays Off
Setting up verification takes time. Writing tests takes time. Configuring hooks takes time.
But here's the math:
- Without verification: You review every line Claude writes. You catch bugs manually. You fix things Claude breaks. Hours per day.
- With verification: Claude catches its own mistakes. You review only what passes checks. You fix only what automation misses. Minutes per day.
Boris runs 10-15 Claude sessions in parallel. That's only possible because he doesn't manually verify each one. The verification system does it for him.
Start Small
You don't need perfect verification coverage to start.
This week:
- Add a basic test for one critical function
- Set up a PostToolUse hook for linting
- Add verification instructions to your CLAUDE.md
This month:
- Expand test coverage to core functionality
- Add a Stop hook that runs tests before completion
- Document verification requirements for different task types
This quarter:
- Full test coverage for critical paths
- End-to-end verification subagent
- Verification built into every workflow
The best verification system is the one you actually use. Start with something simple and grow it over time.
What verification practices have made the biggest difference in your AI-assisted development? I'm collecting examples from different stacks and workflows.