MONA explainer 10 min read

What Is AI Code Migration and How LLM Agents Translate Languages and Modernize Legacy Codebases

ELI5

AI code migration uses LLM agents and rule-based tools to translate old code into modern languages or frameworks—turning COBOL into Java, or upgrading a decade-old app—while trying to keep its behavior identical.

A COBOL system somewhere is running payroll for millions of people right now. It compiles. It runs. And no one currently employed can safely change a line of it, because the engineers who understood it retired years ago. That is the problem AI code migration exists to solve—and the interesting part is not that a machine can rewrite the code, but how, because two very different kinds of machine both claim they can.

The tempting mental model is simple: paste the old code into a chatbot, ask for the new language, ship the result. The output even looks right—fluent, idiomatic, properly indented. That fluency is exactly the trap. A language model optimizes for plausible tokens, not preserved behavior. The code it produces is grammatically perfect and occasionally, silently, wrong.

Not translation. Reconstruction under constraint.

Two Machines, Two Definitions of “Correct”

Before any tool rewrites a line, it has to decide what “the same program” even means. There are two answers to that question, and they split the entire field in half.

What is AI code migration?

AI code migration is the use of LLM agents and automated transformation tools to translate code between languages and frameworks—COBOL, VB6, or PL/SQL into Java, C#, or Python—and to modernize legacy codebases, including framework version upgrades (Addepto). But that single label hides two fundamentally different mechanisms, and confusing them is the most common mistake teams make.

The first mechanism is deterministic. Codemod tools parse your source into an Abstract Syntax Tree (a structured representation of the code’s grammar), modify specific nodes according to fixed rules, then regenerate the source. Meta’s jscodeshift does exactly this, wrapping a library called recast so the regenerated code keeps its original formatting (jscodeshift GitHub). Given the same input, it produces the same output every time, so a rule can be reviewed and tested once and then trusted on every file it touches.
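The parse, modify, regenerate cycle can be sketched with Python’s standard `ast` module. This is a minimal illustration, not how jscodeshift is implemented: the function names `compute_tax` and `calculate_tax` are made up for the example, and unlike recast, `ast.unparse` does not preserve the original formatting or comments.

```python
import ast


class RenameCall(ast.NodeTransformer):
    """Fixed rule: rewrite calls to one function name into calls to another."""

    def __init__(self, old_name: str, new_name: str) -> None:
        self.old_name = old_name
        self.new_name = new_name

    def visit_Call(self, node: ast.Call) -> ast.AST:
        self.generic_visit(node)  # handle nested calls first
        if isinstance(node.func, ast.Name) and node.func.id == self.old_name:
            node.func = ast.Name(id=self.new_name, ctx=ast.Load())
        return node


def apply_codemod(source: str, old_name: str, new_name: str) -> str:
    tree = ast.parse(source)                           # 1. parse into a tree
    tree = RenameCall(old_name, new_name).visit(tree)  # 2. modify matching nodes
    ast.fix_missing_locations(tree)
    return ast.unparse(tree)                           # 3. regenerate source


# Hypothetical legacy snippet used only for illustration.
legacy = "total = compute_tax(amount) + compute_tax(bonus)\n"
migrated = apply_codemod(legacy, "compute_tax", "calculate_tax")
print(migrated)
```

Running the same codemod twice on the same input yields the same string, which is the property the deterministic camp is selling.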

The second mechanism is probabilistic. An LLM agent reads the old code, predicts the most likely modern equivalent token by token, and generates something that resembles a correct translation. It can handle ambiguity and undocumented intent that a rigid rule could never anticipate. But it offers no guarantee—only a high-probability guess wearing the syntax of certainty.

The whole discipline lives in the tension between those two. Deterministic tools are trustworthy but brittle; they only know the patterns someone wrote a rule for. Probabilistic agents are flexible but unverifiable on their own. The serious systems combine them.

Inside the Translation Loop

Watch a modern migration agent work and you will notice it does not behave like a one-shot translator. It behaves like an engineer debugging in a tight feedback cycle—and that loop is where the reliability comes from.

How does AI-assisted code migration and framework upgrading actually work?

Start with the deterministic end, because it sets the standard the agents are trying to reach. OpenRewrite, maintained by Moderne, does not operate on a plain syntax tree at all. It builds a Lossless Semantic Tree—a representation that is both type-attributed (it knows that customer is a Customer, resolved through the type system) and format-preserving (it remembers your exact whitespace and comments). Transformations are applied as composable “recipes” (OpenRewrite Docs). Because the tree carries type information, a recipe can reason about what the code means, not just what it says—and because the migration is rule-driven, the result is reproducible.

That distinction between a plain syntax tree and a semantic tree is not pedantic. It is the difference between a translator who knows only grammar and one who also knows the dictionary.
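A toy version of that difference, sketched in Python rather than OpenRewrite’s Java: the transformer below renames a method only when the receiver is annotated as a particular type. The type lookup here is a deliberately crude assumption (it reads variable annotations from the same file), and every name in it (`Customer`, `Report`, `get_name`, `full_name`) is hypothetical. A real Lossless Semantic Tree resolves types through the compiler’s type system.

```python
import ast


class TypeAwareRename(ast.NodeTransformer):
    """Rename obj.old_method() only when obj is annotated as target_type."""

    def __init__(self, target_type: str, old_method: str, new_method: str) -> None:
        self.target_type = target_type
        self.old_method = old_method
        self.new_method = new_method
        self.var_types: dict[str, str] = {}  # variable name -> annotated type

    def visit_AnnAssign(self, node: ast.AnnAssign) -> ast.AST:
        # Record annotations such as `customer: Customer = ...`.
        if isinstance(node.target, ast.Name) and isinstance(node.annotation, ast.Name):
            self.var_types[node.target.id] = node.annotation.id
        self.generic_visit(node)
        return node

    def visit_Call(self, node: ast.Call) -> ast.AST:
        self.generic_visit(node)
        func = node.func
        if (
            isinstance(func, ast.Attribute)
            and func.attr == self.old_method
            and isinstance(func.value, ast.Name)
            and self.var_types.get(func.value.id) == self.target_type
        ):
            func.attr = self.new_method
        return node


source = (
    "customer: Customer = load_customer()\n"
    "report: Report = load_report()\n"
    "customer.get_name()\n"
    "report.get_name()\n"
)
tree = TypeAwareRename("Customer", "get_name", "full_name").visit(ast.parse(source))
migrated = ast.unparse(tree)
print(migrated)
```

A grammar-only rule matching `.get_name()` would have rewritten both calls. With type information, only the `Customer` call changes and `report.get_name()` is left alone.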

The agentic end works differently. Amazon Q Code Transformation (now surfaced under the AWS Transform brand) upgrades Java 8 and 11 Maven projects to Java 17 by auto-generating a transformation plan, updating dependencies, and refactoring deprecated code (Amazon Q Developer Docs). For the .NET world, the same family migrates Windows-bound .NET Framework applications to cross-platform .NET and produces a Linux compatibility readiness report. The capability is agentic and is described as improving with each execution, though its exact general-availability scope is best treated as a moving target.

The frontier is research like LegacyTranslate, a multi-agent method that splits the work across three specialized agents (an Initial Translation agent, an API Grounding agent, and a Refinement agent) and was applied to roughly 2.5 million lines of PL/SQL translated to Java (arXiv). The structure matters more than the numbers. One agent drafts, one checks the draft against real API contracts, one repairs the mismatches. This is the same plan-and-reason loop a human follows: write, test, fix.
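The draft, ground, repair division of labor can be sketched as three callables wired in sequence. This is not LegacyTranslate’s implementation: the stubs below stand in for LLM agents, and `KNOWN_API`, `SUGGESTED_FIX`, and the `repository.*` names are hypothetical placeholders chosen so the example runs on its own.

```python
import re
from typing import Callable

# Hypothetical target-API surface the grounding step checks against.
KNOWN_API = {"repository.find_all", "repository.save"}
# Hypothetical repair table standing in for the refinement agent's judgment.
SUGGESTED_FIX = {"repository.fetch_everything": "repository.find_all"}


def stub_translate(source: str) -> str:
    """Stand-in for the translation agent: a fluent draft with an invented call."""
    return "rows = repository.fetch_everything()"


def stub_ground(code: str) -> list[str]:
    """Stand-in for the grounding agent: list calls that are not in KNOWN_API."""
    calls = re.findall(r"(\w+\.\w+)\(", code)
    return [call for call in calls if call not in KNOWN_API]


def stub_refine(code: str, issues: list[str]) -> str:
    """Stand-in for the refinement agent: repair each flagged call."""
    for issue in issues:
        code = code.replace(issue, SUGGESTED_FIX.get(issue, issue))
    return code


def run_pipeline(
    source: str,
    translate: Callable[[str], str],
    ground: Callable[[str], list[str]],
    refine: Callable[[str, list[str]], str],
) -> tuple[str, list[str]]:
    draft = translate(source)         # agent 1 drafts
    issues = ground(draft)            # agent 2 checks against real contracts
    if issues:
        draft = refine(draft, issues) # agent 3 repairs the mismatches
    return draft, ground(draft)       # re-check so leftovers stay visible


final_code, remaining = run_pipeline(
    "SELECT * FROM rows;", stub_translate, stub_ground, stub_refine
)
print(final_code, remaining)
```

The design point is the final re-check: the pipeline reports what it could not ground instead of assuming the repair worked.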

What makes the loop close at all is the environment. An agent that can only read code is guessing. An agent that can run the build, read the compiler errors, and execute the test suite is iterating—each failed test becomes a signal pointing toward the fix. Connecting the model to those tools is increasingly handled by the Model Context Protocol, an open standard for wiring LLM applications to external tools and data; its current specification is dated November 25, 2025, and Anthropic donated it to the Linux Foundation’s Agentic AI Foundation in December 2025 (Model Context Protocol spec).
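A minimal sketch of that feedback loop, under loud assumptions: the "model" is a stub that returns a fixed candidate, the "build" is Python’s `exec` on a trusted string, and `net_pay` and its test cases are invented for the example. A real agent would run the project’s actual build and test commands in a sandbox and pass the output back to an LLM.

```python
from typing import Callable

# Hypothetical behavioral tests: (arguments, expected result).
CASES = [((1000, 200), 800), ((500, 0), 500)]


def run_tests(candidate_source: str) -> list[str]:
    """Build the candidate and return failure messages (empty list means green)."""
    namespace: dict = {}
    try:
        exec(candidate_source, namespace)  # stand-in for a real build step
    except Exception as exc:
        return [f"build error: {exc!r}"]
    failures = []
    for args, expected in CASES:
        try:
            actual = namespace["net_pay"](*args)
        except Exception as exc:
            failures.append(f"net_pay{args} raised {exc!r}")
            continue
        if actual != expected:
            failures.append(f"net_pay{args} returned {actual}, expected {expected}")
    return failures


def migrate_with_feedback(
    propose: Callable[[list[str]], str], max_rounds: int = 5
) -> tuple[str, int]:
    """Ask for a candidate, test it, and feed failures back until tests pass."""
    feedback: list[str] = []
    for round_number in range(1, max_rounds + 1):
        candidate = propose(feedback)
        feedback = run_tests(candidate)
        if not feedback:
            return candidate, round_number
    raise RuntimeError(f"no passing candidate after {max_rounds} rounds: {feedback}")


BUGGY = "def net_pay(gross, deductions):\n    return gross + deductions\n"
FIXED = "def net_pay(gross, deductions):\n    return gross - deductions\n"


def stub_propose(feedback: list[str]) -> str:
    """Stand-in for the LLM: a wrong first draft, corrected once it sees failures."""
    return FIXED if feedback else BUGGY


candidate, rounds = migrate_with_feedback(stub_propose)
print(rounds)
```

The loop has a hard round limit and raises when it runs out, so a migration that never converges surfaces as an error rather than as a shipped guess.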

The Pipeline, Part by Part

A production migration is never a single model call. It is an assembly line, and each station does one job the next one depends on.

What are the parts of an AI code migration pipeline?

Most pipelines, deterministic or agentic, share five stages:

| Stage | What it does | Deterministic version | Agentic version |
| --- | --- | --- | --- |
| Ingestion & parsing | Turn source text into a structured tree | AST or Lossless Semantic Tree | Same tree, plus natural-language context |
| Transformation | Rewrite the tree into the target form | Fixed recipes / codemods | LLM proposes the rewrite |
| Grounding | Anchor the rewrite to real constraints | Type system, recipe preconditions | API contracts, tests, retrieval over the codebase |
| Validation | Prove the result behaves | Compile + test suite | Compile + test suite, fed back to the agent |
| Review | Human approves the change | Pull request | Pull request |

The grounding stage is where probabilistic and deterministic approaches quietly converge. A deterministic recipe grounds itself in the type system; it physically cannot apply a transformation whose preconditions are unmet. An agent grounds itself by retrieving the actual function signatures it must call and by running the tests—the verification a rule gets for free, the agent has to earn at runtime.

Notice what the agent is really doing here. It is not recalling a memorized translation; it is conditioning its next-token predictions on the type signatures, the error messages, and the test results placed in its context. Change what you put in front of it, and you change the probability distribution over what it generates next.

The validation stage is the one nobody can skip. Without a test suite, an agentic migration is a confident assertion with no proof. With one, every red test is a gradient the agent can descend.
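One common way to build that proof when the legacy system has few tests is to record what the old code does and replay it against the new code. The sketch below assumes a hypothetical `late_fee` rule and uses a Python function as a stand-in for the legacy system; in practice the recorded outputs would come from running the real legacy program.

```python
from typing import Callable


def legacy_late_fee(days_overdue: int) -> int:
    """Stand-in for legacy behavior: 5 per day overdue, capped at 50."""
    if days_overdue <= 0:
        return 0
    return min(days_overdue * 5, 50)


def migrated_late_fee(days_overdue: int) -> int:
    """Candidate migration that preserves the cap."""
    return min(max(days_overdue, 0) * 5, 50)


def migrated_without_cap(days_overdue: int) -> int:
    """Candidate migration that looks plausible but drops the cap."""
    return max(days_overdue, 0) * 5


def record_golden(legacy: Callable[[int], int], inputs: list[int]) -> dict[int, int]:
    """Capture the legacy output for each input as the reference behavior."""
    return {value: legacy(value) for value in inputs}


def check_against_golden(
    migrated: Callable[[int], int], golden: dict[int, int]
) -> list[str]:
    """Return one message per input where the migrated code diverges."""
    return [
        f"input {value}: legacy {expected}, migrated {migrated(value)}"
        for value, expected in golden.items()
        if migrated(value) != expected
    ]


# Inputs chosen to straddle the boundaries: negative, zero, at the cap, past it.
golden = record_golden(legacy_late_fee, [-1, 0, 1, 10, 11, 30])
print(check_against_golden(migrated_late_fee, golden))
print(check_against_golden(migrated_without_cap, golden))
```

Each divergence message is exactly the kind of concrete signal an agent can act on, and the choice of inputs decides how much of the behavior is actually pinned down.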

The five stages most AI code migration pipelines share, from parsing the source tree to human review.

What the Split Predicts

Once you see migration as deterministic-versus-probabilistic with a verification loop between them, you can predict where each approach wins and where it fails.

  • If the transformation is mechanical and well-specified (Java 8-to-17 deprecation fixes, a known framework upgrade), expect deterministic recipes to beat free-form LLM translation on cost, speed, and trust. Structured tasks are also where agentic upgraders post their best numbers: an internal Amazon team reported upgrading 1,000 production applications from Java 8 to 17 in two days, averaging about ten minutes per app, using the agentic upgrader (AWS Blog). That is a vendor benchmark on a highly structured task, not a guarantee for arbitrary code.
  • If the source carries undocumented business logic with no clean rule, expect an LLM agent to handle the ambiguity that a recipe cannot—provided you can verify the output.
  • If your legacy code has strong test coverage, expect agentic migration to converge, because each failing test gives the agent something concrete to fix.

Rule of thumb: if a behavior is checkable by a test, an agent can iterate toward it; if it is not, you are trusting probability and calling it migration.

When it breaks: the dangerous failure mode is the migration that compiles and passes a weak test suite while silently changing behavior at the edges, a hallucination expressed as valid code. Deterministic tools fail loudly when they meet a pattern with no recipe; agents fail quietly when they meet a case no test covers. The quiet failure is the expensive one.
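A concrete instance of the quiet failure, sketched in Python. The assumption here is a legacy routine that rounds half up to whole cents, as many financial systems do; the "migrated" version uses the built-in `round`, which rounds ties to the nearest even digit. Both function names are hypothetical.

```python
from decimal import Decimal, ROUND_HALF_UP


def legacy_round_cents(amount: Decimal) -> Decimal:
    """Stand-in for the legacy rule: round half up to whole cents."""
    return amount.quantize(Decimal("0.01"), rounding=ROUND_HALF_UP)


def migrated_round_cents(amount: Decimal) -> Decimal:
    """Plausible-looking migration: built-in round() rounds ties to even."""
    return round(amount, 2)


# A weak test suite: no input lands exactly on a half-cent tie.
weak_cases = [Decimal("10.004"), Decimal("10.006"), Decimal("99.99")]
weak_suite_passes = all(
    legacy_round_cents(value) == migrated_round_cents(value) for value in weak_cases
)

# The edge the weak suite never exercised: an exact tie.
edge = Decimal("0.125")
print(weak_suite_passes)           # the weak suite is green
print(legacy_round_cents(edge))    # half up
print(migrated_round_cents(edge))  # half to even
```

The migrated function compiles, reads naturally, and passes every case in the weak suite, yet disagrees with the legacy rule by a cent on exact ties.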

The Deterministic Comeback Nobody Predicted

There is a counterintuitive lesson buried in this. The most reliable parts of AI code migration are often the least AI-driven. OpenRewrite’s semantic tree and a well-written test suite do more for correctness than a larger model does, because they convert “probably right” into “provably right.” The frontier work—LegacyTranslate, environment-in-the-loop research—is not trying to replace deterministic checks with smarter models. It is wrapping the probabilistic translator inside a deterministic cage, so that the model’s flexibility gets the rule-based system’s guarantees. The future of migration is not the agent or the compiler. It is the loop between them.

The Data Says

AI code migration is two mechanisms sharing one name: deterministic tools that transform a typed tree by fixed rules, and probabilistic agents that predict modern code and verify it by running it. The reliable systems do not pick one—they ground the agent’s guesses in tests, types, and tooling until the guess becomes checkable. Migration quality tracks verification coverage far more than model size.

AI-assisted content, human-reviewed. Images AI-generated.