When the AI Writes the Code: Accountability, Skill Erosion, and the Ethics of Vibe Coding

The Hard Truth
A founder builds an MVP in a weekend — three thousand lines of Python, none of which she has read. The auth flow works. The payments page works. Customers arrive. Six months later, a researcher finds her database exposing authentication tokens she never knew existed. She did not write the bug. She does not know it is there. Whose code is it?
This is no longer a thought experiment. Vibe coding (Andrej Karpathy’s phrase for “fully giving in to the vibes” and writing software through natural-language prompts, treating the code itself as something you can “forget exists”) has moved from a 2025 social-media coinage into a workflow that, per Trend Micro, generates roughly 46% of new code on GitHub. The research arriving alongside that adoption tells a more uncomfortable story than the productivity numbers do.
The Question Under the Velocity
What are the ethical concerns of relying on vibe coding for production software? Stated like that, the question sounds abstract. It is not. An audit of 5,600 deployed applications, reported by Trend Micro, found roughly 2,000 critical vulnerabilities, around 400 exposed secrets, and 175 exposures of personally identifiable information that included medical and payment data. Wiz documented a single misconfigured database, generated through prompt-driven development, that exposed 1.5 million authentication tokens and 35,000 email addresses.
The Georgia Tech Vibe Security Radar, via Trend Micro, recorded CVEs traced to AI-generated code rising from 6 in January 2026 to 35 by March. The CodeRabbit study of 470 open-source pull requests, per Wikipedia, found that AI co-authored merges contained 1.7 times more “major” issues and 2.74 times more security vulnerabilities than their human-only equivalents.
The systems are not broken. They produce code that compiles, runs, and serves real users. They simply produce it with a defect density earlier practices would have refused — inside a workflow that, by design, discourages the close reading that would have caught the defect.
What the Honest Case for Vibe Coding Actually Says
The strongest argument is not “developers are obsolete.” Building software has always been gated by an asymmetry: a founder with an idea and no syntax fluency could not test it cheaply, and a senior engineer could not duplicate themselves at will. Vibe coding collapses that asymmetry. In the Y Combinator W2025 cohort, per Wikipedia, a quarter of startups had codebases that were roughly 95% AI-generated. Those startups exist, learn from customers, and surface insights the world would otherwise not have heard.
For experienced engineers, the case is subtler. The pro-tier tools (Cursor, Claude Code, Windsurf) automate the parts of the work that were never the interesting parts: boilerplate, refactors, framework changes. AI code migration workflows compress weeks of mechanical translation into hours, letting senior developers reclaim time for the design and judgment calls they entered the profession to make.
The argument continues that the security problems are growing pains. Linters will adapt. Enterprise platforms will add guardrails. By the time the EU AI Act’s high-risk obligations land on August 2, 2026, per Latham & Watkins, hygiene will have caught up. This position has been right about other technology transitions before.
The Assumption Hidden Inside “Just Read the Diff”
The defense rests on an assumption nearly everyone states but few examine: that the developer remains the meaningful reviewer of the generated code. The record makes that assumption hard to hold.
The Sonar State of Code Developer Survey 2026, summarized by InfoQ, found that while 96% of developers do not fully trust AI-generated code, only 48% always verify it before committing. Veracode’s secure-versus-insecure-choice study, via Trend Micro, found that AI models select the insecure implementation 45% of the time when both options are available. Georgetown CSET, also via Trend Micro, observed cross-site scripting vulnerabilities in 86% of code samples across five major language models.
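To make the “secure versus insecure choice” concrete, here is a minimal sketch of the kind of cross-site scripting defect those studies count. It is not drawn from the Veracode or CSET test suites; the function names and the payload are hypothetical, and the only library call used is html.escape from the Python standard library. Both versions run and render a comment, which is why a reviewer skimming the diff could accept either.

```python
import html


def render_comment_unsafe(author: str, body: str) -> str:
    # Hypothetical insecure variant: user input is interpolated
    # directly into markup, so any tags in it reach the browser intact.
    return f"<p><b>{author}</b>: {body}</p>"


def render_comment_escaped(author: str, body: str) -> str:
    # Hypothetical secure variant: html.escape converts <, >, & and
    # quotes into entities before the text is placed in the markup.
    return f"<p><b>{html.escape(author)}</b>: {html.escape(body)}</p>"


# Illustrative payload, assumed for this example only.
payload = "<script>alert(1)</script>"

unsafe = render_comment_unsafe("mallory", payload)
safe = render_comment_escaped("mallory", payload)

print(unsafe)
print(safe)
```

The two functions differ by a single call. That is the scale at which the choice between the secure and insecure implementation is usually made, and it is easy to miss when the output looks correct for ordinary input.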
The harder finding is about the reviewers themselves. Shen and Tamkin’s February 2026 paper on AI’s impact on skill formation, published on arXiv and summarized by Anthropic Research, ran a controlled trial with 51 participants learning a new Python library. The group using AI assistance scored 17% lower on comprehension quizzes than the group working unaided. Anthropic Research separately documented the productivity paradox: experienced developers using AI were 19% slower at completing real tasks while subjectively feeling 20% faster. AI CERTs reported a sharper split — developers who used AI for conceptual inquiry scored at or above 65% on skill assessments, while those who delegated heavily scored below 40%.
The assumption is that the developer is the last line of defense. The data says the developer is becoming worse at being that line of defense in proportion to how much they rely on the tool that is also generating the defects.
A Profession That Forgot Its Apprenticeship
There is a useful historical parallel. Every skilled trade that survived industrialization built an apprenticeship — a long, deliberate period during which a junior practitioner did the foundational work badly, then less badly, then competently, under the eye of someone who had done it themselves. The boredom was load-bearing. It was where intuition compiled from repetition.
Software’s apprenticeship was always informal, but it existed. Junior developers learned by writing the auth flow themselves, getting it wrong, having a senior review the diff, and absorbing why one approach was load-bearing and another was theatre. That loop is being quietly replaced: junior developer prompts, AI generates, junior developer accepts. A generation is learning to evaluate code it could not have written and could not now write from scratch.
What corporations once solved with vicarious liability (the supervisor responsible for the employee, the company for the supervisor, the insurer for residual risk) has no equivalent in the AI-mediated workflow. UBI Interactive reports that under GDPR, data controllers remain responsible for AI-generated code regardless of who or what produced it; “the AI did it” is not a defense. The EU AI Act, per Latham & Watkins, allows penalties of up to €35 million or 7% of worldwide turnover. Legal infrastructure has decided the human is liable. The technical workflow has decided the human cannot meaningfully review what they will be liable for.
Where This Argument Lands
Thesis: Producing code a developer would not have written, cannot now write, and does not have time to review, and then releasing it into systems that touch real people, transfers editorial authority from a human professional to a workflow whose accountability chain we have not yet designed.
The argument grows sharper, not weaker, as the tools improve. As Model Context Protocol and similar integration layers let AI assistants reach further into codebases, build pipelines, and live data, the surface where the developer might have intervened gets thinner. Aikido’s State of AI in Security & Development 2026, via Really Good Computer Support, found that one in five organizations had suffered a major breach linked to AI-generated code, and roughly 70% had identified AI-introduced vulnerabilities in their stack. These are organizations that already employ security teams. The current tooling is operating inside the existing accountability scaffolding and breaking through it anyway.
Questions Worth Sitting With
There are no clean prescriptions here, only better starting questions. When a developer accepts a suggestion they did not fully read, what is the institutional record of that acceptance — is it logged, is it auditable, does it survive the next refactor? When a startup raises a Series A on a codebase 95% generated, who on the team can defend the architecture under questioning from an enterprise security review? When an apprentice generation grows up evaluating output it cannot itself produce, who teaches the next generation what wrong feels like?
These are governance questions wearing engineering clothes. Treating them as engineering — as something the next IDE release will solve — is how the responsibility chain stays diffuse.
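A record format does not settle who is accountable, but it can show what “logged” and “auditable” might mean for the first question above. The sketch below is one possible shape, not an existing tool or standard: AcceptanceRecord, append_record, verify_chain, and every field name are hypothetical. Each record stores a hash of the accepted diff, how many lines the reviewer says they read, and the hash of the previous record, so a later edit to an earlier entry is detectable.

```python
import hashlib
import json
from dataclasses import asdict, dataclass, replace

GENESIS = "0" * 64  # placeholder "previous hash" for the first record


@dataclass(frozen=True)
class AcceptanceRecord:
    # All fields are illustrative assumptions, not an established schema.
    reviewer: str
    file_path: str
    diff_sha256: str
    lines_reviewed: int
    lines_total: int
    prev_hash: str

    def record_hash(self) -> str:
        payload = json.dumps(asdict(self), sort_keys=True).encode("utf-8")
        return hashlib.sha256(payload).hexdigest()


def append_record(log, reviewer, file_path, diff_text, lines_reviewed):
    """Append a record that is chained to the hash of the previous one."""
    prev = log[-1].record_hash() if log else GENESIS
    record = AcceptanceRecord(
        reviewer=reviewer,
        file_path=file_path,
        diff_sha256=hashlib.sha256(diff_text.encode("utf-8")).hexdigest(),
        lines_reviewed=lines_reviewed,
        lines_total=len(diff_text.splitlines()),
        prev_hash=prev,
    )
    log.append(record)
    return record


def verify_chain(log) -> bool:
    """Return True if every record points at the hash of its predecessor.

    Note: an edit to the final record is only detectable if the head
    hash is also stored somewhere outside the log.
    """
    prev = GENESIS
    for record in log:
        if record.prev_hash != prev:
            return False
        prev = record.record_hash()
    return True


log = []
append_record(log, "dev@example.com", "auth/session.py",
              "+token = make_token()\n+store(token)\n", lines_reviewed=0)
append_record(log, "dev@example.com", "billing/pay.py",
              "+charge(card)\n", lines_reviewed=1)

intact = verify_chain(log)

# Simulate someone later rewriting history to claim the diff was read.
tampered = [replace(log[0], lines_reviewed=2)] + log[1:]
detected = not verify_chain(tampered)

print(intact, detected)
```

The first entry records a two-line diff accepted with zero lines reviewed. Whether such a record should exist, who may read it, and what follows from it remain the governance questions; the code only shows that the record itself is cheap to keep.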
Where This Argument Is Most Vulnerable
The case rests on a claim that the gap between adoption and accountability is not closing fast enough. That could be wrong. Static analysis tooling for AI-generated code is improving. Enterprise platforms are adding policy layers. The EU AI Act, the NIST AI Risk Management Framework, and emerging professional standards may, over the next two years, supply the structured oversight the field is missing. If liability law, insurance markets, and developer practice converge on a sustainable equilibrium before the next major breach class becomes routine, this essay will look too pessimistic. The honest position is that it is too early to know whether the institutional response will catch up before the velocity does.
The Question That Remains
If a developer accepts code they did not write and cannot review, released into systems that affect people they will never meet, under a legal regime that holds them responsible and a workflow that designs them out of the loop — whose decision was that, exactly? Until the profession can answer without flinching, every accepted AI commit is a quiet bet that nothing will go wrong on a day when nobody was reading.
Disclaimer
This article is for educational purposes only and does not constitute professional advice. Consult qualified professionals for decisions in your specific situation.
Ethically, Alan.
AI-assisted content, human-reviewed. Images AI-generated.