
The human + agent software team wins

Every engineering leader I know is wrestling with the same question: how do software teams change as models get better at writing code? My team ships the next generation of models through Anthropic's APIs, the models powering the agents that are great at software development, and we're navigating the same shift as everyone else.

Models today can write code, review it, debug it, iterate on it. Everyday people can one-shot basic apps. As a result, when a team operating software at scale leverages agents really well, the bottleneck shifts to everything around the code itself: deciding what to build, getting alignment, review cycles, deployment, monitoring.

This is changing what it takes for software companies to win. I think the ones that pull ahead will be the ones that understand what humans are irreplaceable for, set up agents to do everything else, and create a flywheel between the two. Let me start with the humans, because getting this right is what matters most.

What humans are irreplaceable for

Here is what I think humans will always be uniquely necessary for, regardless of how great agents become.

The motivation to solve problems

There is no loss function for "what should this company care about."

An agent can analyze your market, identify gaps, rank opportunities by expected value, and write a convincing strategy for any of them. But the best product decisions aren't the highest expected value play. They're bets rooted in conviction about a future that isn't in the data yet. You can't optimize for the objective because the objective is the thing you're choosing.

A person who says "actually we were wrong, let's pivot" is exercising judgment that can't be delegated. There's no ground truth to optimize against. Someone has to author the goal, update it as the world changes, and create the activation energy to go after it. That requires an entity with real stakes.

Taste

Taste isn't just about knowing what good looks like. Models can pattern-match that from training data. Taste is originating a new standard of good that doesn't exist yet.

When my best API designers review a proposed API shape and something feels wrong about it, that feeling comes from having integrated with hundreds of APIs and developed an acute sense of what creates friction and what creates delight. It's a perspective about what developers should want, sometimes before they know they want it.

Here's why this matters for competition: when every company consults the same models to make product decisions, they converge on the same designs. The companies that differentiate are the ones where a human says "I believe this is better" and pushes the whole product in a direction the model wouldn't have chosen. Models can reflect human taste back at us and even combine existing ideas in interesting ways. But the frontier of what "good" means is still set by humans.

Trust that drives growth

Agents can draft emails, prepare for meetings, and chat pretty convincingly. But trust is built on mutual vulnerability. A customer trusts you because you showed up when things went sideways, because you made a commitment you could have broken. An agent can't do that because it has nothing at stake.

The most successful B2B software businesses have built a level of trust that leads their customers to bring them the fundamental problems blocking their entire business. That kind of candor only happens between humans who have built real relationships. And it provides signal that shapes a roadmap around what customers actually need rather than what looks like the highest expected value.

This means the companies with the deepest human relationships will make the best product decisions, because they have access to information nobody else has. You can deploy the best agents in the world, but if your competitors have more trust with their customers, they'll know things about the market that you don't. They'll build the right thing while you build the obvious thing.

Leveraging agents today

If motivation, taste, and trust are where humans are irreplaceable, then the goal is clear: make agents handle everything else so your best people spend their time on those three things. The best agent setups I've seen today succeed by treating agents the way we treat humans on high-performing teams. This sounds obvious, but only a few teams are getting it right at this stage.

Run agents exclusively in self-verifying loops. LLMs are not one-shotting solutions to hard problems. You need to tell a coding agent to make a plan, write the code, run the tests, check the output, and keep iterating until everything actually passes. This is dramatically more effective than expecting a clean result on the first try, and it saves you from the doom loop of prompting further until it works. People dismiss this kind of scaffolding as not being "AGI-pilled," but it's exactly what makes imperfect entities effective. It's why humans self-review a document before sharing it. Set the bar for done, give the agent the tools to verify against it, and don't let it stop until it gets there.
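The loop above can be sketched in a few lines. This is a minimal illustration, not a real harness: `propose_fix` stands in for a call to a coding agent, and `run_tests` stands in for your actual test suite; both are hypothetical stubs included only so the loop runs end to end.

```python
def run_tests(code: str) -> tuple[bool, str]:
    """Stand-in verifier: 'passes' once the code handles the empty-list case."""
    if "if not xs" in code:
        return True, "all tests passed"
    return False, "FAIL: test_empty_input (IndexError on [])"

def propose_fix(code: str, feedback: str) -> str:
    """Stand-in agent: in practice, an LLM call that sees the failing output."""
    if "IndexError" in feedback:
        return "def head(xs):\n    if not xs:\n        return None\n    return xs[0]"
    return code

def self_verifying_loop(code: str, max_iters: int = 5) -> tuple[str, bool]:
    """Write, run the checks, and keep iterating until they actually pass."""
    for _ in range(max_iters):
        ok, feedback = run_tests(code)
        if ok:
            return code, True   # the verifier, not the agent, decides "done"
        code = propose_fix(code, feedback)
    return code, False          # surface the failure rather than stop silently

final_code, passed = self_verifying_loop("def head(xs):\n    return xs[0]")
```

The key design choice is that the exit condition belongs to the verifier, not the agent: the loop only ends when the bar for done is met or the iteration budget runs out.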

Build agents that check each other. One agent writes code, another reviews it with a different lens. One proposes a technical design, another attacks it from the perspective of a developer seeing it for the first time. During an incident, multiple agents investigate in parallel with different hypotheses, the same way your best on-call engineers think of things each other missed. This isn't overhead; it's how you catch what any single agent misses.
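One way to structure this cross-checking is to fan a single output across several reviewers, each with a different lens, and merge their findings. A minimal sketch, where each reviewer function is a hypothetical stand-in for a separate LLM call with its own prompt:

```python
def security_reviewer(design: str) -> list[str]:
    """Lens 1: looks for security smells in the design text."""
    return ["plaintext credential in config"] if "plaintext password" in design else []

def new_developer_reviewer(design: str) -> list[str]:
    """Lens 2: attacks the design as someone seeing it for the first time."""
    return ["unexplained acronym: TLA"] if "TLA" in design else []

REVIEWERS = [security_reviewer, new_developer_reviewer]

def cross_check(design: str) -> list[str]:
    """Collect findings from every lens; an empty list means all reviewers pass."""
    findings: list[str] = []
    for review in REVIEWERS:
        findings.extend(review(design))
    return findings

issues = cross_check("Store the plaintext password next to the TLA endpoint")
```

In a real setup the lenses would be distinct prompts (or distinct agents) rather than keyword checks, but the shape is the same: one writer, many independent reviewers, merged findings.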

Give agents memory and context. Humans are effective partly because they accumulate organizational knowledge over time. Your best team members know where the bodies are buried, which parts of the codebase are fragile, what was tried before and why it failed. Most agents today start from scratch every session. To really make them successful, you have to give them the right knowledge at the right time. Think about why a senior engineer is so much more effective than a new hire with the same raw ability. The senior engineer knows "we tried this approach six months ago and it failed for a subtle reason." That accumulated context is worth more than any amount of raw coding skill. Agents need the same thing, and giving it to them is one of the highest-leverage investments you can make.
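Mechanically, "giving the agent the right knowledge at the right time" can be as simple as a persistent store of lessons that gets retrieved and prepended to each task. The sketch below uses naive keyword retrieval and illustrative names (`TeamMemory`, `build_prompt`); a production system would use embeddings or a real retrieval layer.

```python
from dataclasses import dataclass, field

@dataclass
class TeamMemory:
    """Accumulated organizational knowledge, keyed by topic."""
    lessons: dict[str, str] = field(default_factory=dict)

    def record(self, topic: str, lesson: str) -> None:
        self.lessons[topic] = lesson

    def relevant(self, task: str) -> list[str]:
        # Naive keyword match; real systems would use semantic retrieval.
        return [v for k, v in self.lessons.items() if k in task.lower()]

def build_prompt(memory: TeamMemory, task: str) -> str:
    """Prepend known pitfalls so the agent doesn't start from scratch."""
    context = memory.relevant(task)
    preamble = "\n".join(f"Known pitfall: {c}" for c in context)
    return f"{preamble}\n\nTask: {task}" if context else f"Task: {task}"

memory = TeamMemory()
memory.record(
    "retries",
    "We tried client-side retries six months ago; it caused thundering herds.",
)
prompt = build_prompt(memory, "Add retries to the billing client")
```

The point is the write path as much as the read path: after every task, the agent (or a human) records what was learned, so the next session starts with the senior engineer's context instead of the new hire's.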

Set all of this up well and you can compress a project that takes a large group of engineers several months down to a couple of engineers shipping in days or weeks, leaving your humans time to do what they do best. My team is sprinting to provide more solutions in this space. If you've got ideas, I'd love to hear from you.

Leveraging agents in the future

Let’s play this out as models and infrastructure get better. We’ll all need to rethink how we build and run systems from first principles. The patterns we use today (PRDs, design docs, sprints, code review, on-call rotations, team boundaries) exist because we were solving for human constraints. As models and infrastructure improve, it will no longer make sense to replicate these patterns with agents assisting.

In the fully agentified version of a software company, maybe there are no separate PRDs and technical designs. Instead, there is a single evolving spec that captures what we're building and how, continuously updated as agents learn from implementation. Maybe there are no milestones or task breakdowns, because agents hold the whole problem and update their state as they learn.

Maybe code doesn't even move through pull requests. The PR lifecycle came about because humans wrote code in isolation and then needed other humans to verify it before production. Future agents will likely write, verify, and integrate continuously. Changes will flow through comprehensive automated testing rather than blocking on a reviewer.

Production incidents might (hopefully!) be rare and self-resolving. Systems built by agents are tested more exhaustively (the testing pyramid no longer really applies), documented more thoroughly, and designed more cleanly. When something does break, agents will likely detect the anomaly, correlate signals, diagnose the root cause, and ship a fix faster than a human could ack the page.

In this world, the operational unit shifts from a team that ships projects to a domain with an army of agents converging towards an always-evolving spec. The companies that win will reinvent how teams work.

The winning human-agent flywheel

The most agentified version of a software company takes a fundamentally different shape. A small number of humans who decide what problems matter, bring taste to how they're solved, and build trusting relationships. Agents do everything else.

The compounding will work like this: fewer humans focused on the right things means faster decisions, which means tighter feedback loops, which means better products, which means deeper customer relationships, which means better signal, which means better decisions. The whole flywheel accelerates when you get the human-agent boundary right.

The companies that figure this out first will have a structural advantage that compounds. Not because they have better models, but because they've designed the right interface between human judgment and agentic execution. They keep humans exactly where humans are irreplaceable and automate everything else.

The path to getting there isn't about waiting for smarter models. It's about building the right agentic setups now. Treat agents like teammates, not tools. Give them memory, give them domain expertise, and make them check each other's work. Then find the humans with the best judgment, taste, and relationships. Hold on to them. If you're going to win, you'll need them.