What Agentic Engineering Means for Product Managers

Andrej Karpathy's agentic engineering framework redefines the bottleneck in AI-assisted development — and it runs directly through the product management function.

About the author

Shawn Livermore

Sr. Consultant, Product Perfect

Senior Consultant and best-selling author with over 24 years of industry experience leading high-volume custom software implementations and driving large tech migrations for Fortune 500 clients.

Andrej Karpathy, presenting at Sequoia Capital's AI Ascent 2026 conference this spring, drew a line that has been circulating through engineering and product circles since. On one side: vibe coding — the casual practice of describing what you want to an AI model and accepting the output. On the other: what he called "agentic engineering," the professional discipline of directing multiple AI agents through complex tasks while actively preserving correctness, security, and quality.

The distinction matters, but not primarily for the reasons engineering leaders are discussing. The deeper implication runs through the product management function, and most organizations have not noticed it yet.

stateDiagram-v2
direction TB
state "Vibe coding" as VC
state "Agentic engineering" as AE
state "Spec debt accumulates" as SD
state "High-quality output" as HQ
[*] --> VC
VC --> AE : Spec discipline and eval loops added
VC --> SD : Complexity grows without discipline
SD --> AE : Failures force structure
AE --> HQ : Understanding is the bottleneck
HQ --> [*]

Vibe coding ends where production systems begin

Karpathy named December 2025 as an inflection point. Models crossed a quality threshold where error rates dropped enough that the correction loop — the constant back-and-forth that made early AI-assisted coding feel like managing an unreliable junior — effectively disappeared. What he can now do with AI agents, he says, he could not do reliably even a year ago.

But higher model quality did not make the discipline of direction less important. It made it more important. The agentic engineer, in Karpathy's framing, does not simply prompt and accept. The engineer designs specifications, supervises agent plans, inspects diffs, writes evaluation tests, manages permissions, and holds a quality bar throughout. The AI generates; the engineer directs, reviews, and decides what meets the bar.

Product commentator Jeff Gothelf made the observation plainly: what Karpathy described as agentic engineering maps almost exactly to skilled product management. Clear problem definition. Precise specifications of intent and outcome. Acceptance criteria tight enough to be tested. The ability to say "this is not what we asked for" and explain why. These are not new engineering skills. They are the skills product managers have always been responsible for developing, and have too often left in a form too loose to be useful.

Product managers: your specifications now bound engineering output

The practical implication is direct. When AI agents are doing the bulk of code generation, the quality of what ships is directly bounded by the quality of the direction they receive. In most organizations, that direction flows from the product function.

Teams have learned, often deliberately, to write user stories loosely. The reasoning made sense: leave room for engineering judgment, for technical discovery, for the creativity that emerges when engineers are not over-constrained by requirements. That approach works when engineers interpret ambiguity through professional experience and fill in gaps. It works poorly when the "engineer" is an agent that interprets instructions literally and produces output at speed.

What product managers need to invest in now:

Problem definitions that hold under scrutiny. A vague problem definition produces a technically sound implementation of the wrong thing. In an agentic workflow, this compounds across a sprint before review catches it. The spec review that identifies misalignment is worth more per hour than almost any other product work.

Acceptance criteria as evaluation inputs. Karpathy talks about evaluation tests as a core part of agentic engineering discipline. The product manager's "definition of done" now needs to be specific enough to run as a test, not just a checklist. If it cannot be operationalized, it will not constrain agent output.

The review gate as a quality function, not a formality. As model quality improves, teams will be tempted to compress review steps. The volume and velocity of agent-generated output mean that any systematic misalignment compounds quickly. Review is where direction quality is verified.

The teams that move fastest with agentic tooling will be the ones whose product managers have already developed these habits.
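To make "specific enough to run as a test" concrete, here is a minimal sketch in Python. Everything in it is a hypothetical illustration rather than anything Karpathy or a particular tool prescribes: the discount feature, the `Criterion` type, the `evaluate` helper, and both candidate implementations are assumptions made up for this example. Each acceptance criterion pairs a sentence a product manager might write with a check that can be run against whatever an agent produces.

```python
from dataclasses import dataclass
from typing import Callable, List

# Hypothetical feature under specification: apply a percentage discount
# to an order total expressed in whole cents.
DiscountFn = Callable[[int, int], int]


@dataclass(frozen=True)
class Criterion:
    """One acceptance criterion: a readable statement plus a runnable check."""
    statement: str
    check: Callable[[DiscountFn], bool]


def evaluate(criteria: List[Criterion], implementation: DiscountFn) -> List[str]:
    """Return the statements of every criterion the implementation fails."""
    return [c.statement for c in criteria if not c.check(implementation)]


# The "definition of done", written so that each line can be executed.
CRITERIA = [
    Criterion("A 20% discount on 5000 cents yields 4000 cents",
              lambda f: f(5000, 20) == 4000),
    Criterion("A 0% discount leaves the total unchanged",
              lambda f: f(5000, 0) == 5000),
    Criterion("A discount above 100% never produces a negative total",
              lambda f: f(5000, 150) == 0),
]


def literal_candidate(total_cents: int, percent: int) -> int:
    """What a literal reading of 'apply a percentage discount' produces."""
    return total_cents * (100 - percent) // 100


def clamped_candidate(total_cents: int, percent: int) -> int:
    """Same calculation, with the floor at zero that the third criterion demands."""
    return max(0, total_cents * (100 - percent) // 100)


literal_failures = evaluate(CRITERIA, literal_candidate)
clamped_failures = evaluate(CRITERIA, clamped_candidate)
```

In this sketch `literal_failures` contains only the third statement, and `clamped_failures` is empty. The literal candidate is a reasonable reading of a loose spec. It fails only because the edge case was written down as a check instead of being left to someone's judgment.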

Executives: the delivery constraint just moved

The most discussed executive concern about AI in software development is workforce impact. The more immediate question is where the delivery constraint moves.

Karpathy's observation is that "understanding becomes the bottleneck." When code generation is fast and cheap, the limiting factor is the quality of direction entering the pipeline. Organizations that have invested in rigorous product management — clear problem definitions, precise outcome metrics, tight specification discipline — will see disproportionate returns from agentic tooling. Organizations that have not will find that AI accelerates their existing confusion.

Concrete implications:

Hiring criteria need to shift. Evaluating engineers purely on coding output, or product managers purely on stakeholder management, misses the new premium. The ability to direct complex agent workflows — precise specs, plan oversight, quality judgment — matters for both functions now. It is worth testing for directly.

Evaluation infrastructure is not optional. Agentic engineering requires test frameworks, evaluation loops, and quality review tooling — not just AI tool licenses. Teams that skip this infrastructure discover the debt when a fast-moving agent pipeline produces a large volume of well-executed wrong output.

The upside is real but unevenly distributed. Karpathy suggested the productivity multiple for teams that master agentic engineering may far exceed the traditional "10x" benchmark. That upside accrues to teams with clear direction and tight quality loops. It does not accrue evenly to teams that add tools without adding discipline.

Our take

A logistics platform client came to us with a clear brief: rebuild the UI. The presentation layer was the problem as the stakeholder understood it. Discovery revealed something different. The underlying data model forced warehouse operators to duplicate work the system should have automated. A redesigned UI would have shipped without touching the real problem — and would have done so faster and at higher fidelity than the previous system, making the underlying issue harder to see.

Agentic engineering has this same failure mode, scaled up. When AI agents can generate and ship code faster than any team historically moved, the quality of the initial direction is no longer a soft upstream input — it becomes the rate limiter of the entire delivery pipeline. A poorly specified goal, handed to capable agents, produces a large volume of technically correct implementations of the wrong thing.

The product managers who will have the most leverage in this environment are not the ones learning to use AI tools. They are the ones who have already learned to define problems precisely, write acceptance criteria that function as tests, and maintain a quality bar that connects to user outcomes. That work was always central to the product function. It just became the hard constraint on engineering throughput.

Frequently Asked Questions

What is the difference between vibe coding and agentic engineering?

Vibe coding, a term Andrej Karpathy coined in early 2025, describes the casual practice of describing what you want to an AI model and accepting the output — low-friction, high-speed, and suited to prototypes. Agentic engineering, the framework he articulated at Sequoia AI Ascent 2026, is the professional discipline that comes after: designing specifications, supervising agent plans, inspecting code diffs, writing evaluation tests, and holding a quality bar through the full delivery process. The distinction is not about tool sophistication. It is about whether someone is directing the system or just prompting it. Vibe coding raises the floor for what anyone can build; agentic engineering maintains the ceiling for what professional teams should ship.

How should product managers adapt their specifications for agentic engineering workflows?

The core shift is from specifications that leave room for engineering interpretation to specifications precise enough to function as evaluation criteria. In traditional development, engineers fill in gaps between requirements using professional judgment. In an agentic workflow, agents interpret instructions literally and produce output at volume — so any imprecision in the spec produces a technically correct implementation of the wrong thing, faster than ever. Product managers who develop tight problem definitions, acceptance criteria that can be operationalized as tests, and clear quality bars connected to user outcomes will find they have more leverage in agentic development pipelines, not less. Loose specs are the primary failure mode.

Does agentic engineering change executive decisions about engineering headcount?

Agentic engineering changes what the productivity constraint is, which in turn should change how headcount decisions are made. When code generation is fast and cheap, the limiting factor is the quality of direction entering the pipeline, not the number of people available to write code. Organizations that cut headcount on the basis of individual-level productivity gains, without addressing the direction and oversight functions, are likely to find that the gap between what agents produce and what users need widens. The skill that needs to be hired and retained is not just the ability to build, but the ability to direct, evaluate, and hold a quality bar, in both engineering and product roles.
