MAX guide 13 min read May 25, 2026 Updated July 9, 2026

Using AI to Translate Python 2 to Python 3 and Convert COBOL to Java in 2026

Specification-first framework for AI code migration across COBOL to Java, Python 2 to 3, and React legacy systems

TL;DR

AI doesn’t replace the migration plan. It executes the parts you specify. Split your code into what rewrites mechanically, what needs AI judgment, and what a human must verify.
The only contract that matters is semantic equivalence — same input, same output. A clean compile tells you nothing about behavior.
Run deterministic AST tools first, AI agents second, human review on business-critical paths last.

A bank’s overnight batch job ran in COBOL for thirty years. An AI tool converted it to Java in an afternoon. It compiled. It passed the smoke test. Three weeks later, interest on a few thousand accounts was off by a cent — because the migration quietly changed how decimals rounded, and nobody had said it shouldn’t. The code wasn’t wrong. The specification was missing.

Before You Start

You’ll need:

An AI migration tool matched to your stack: for COBOL, AWS Transform for mainframe or IBM watsonx Code Assistant for Z; for Python 2 to 3, LibCST or an AI assistant; for React, the Codemod Registry
A working understanding of AI code migration and of how an abstract syntax tree (AST) transform rewrites code without running it
A test oracle: the legacy system’s real inputs and the outputs it actually produces

This guide teaches you: how to decompose a migration so the mechanical parts, the AI-judgment parts, and the human-review parts each go to the tool that can actually handle them.

The Migration That Compiled and Lied

Here’s the failure mode I see on nearly every legacy project. Someone points an AI at a repository, types “convert this to Java,” and gets back code that compiles. Green checkmark. Ship it. The AI optimized for the thing it could see — syntax — and guessed at the thing it couldn’t: what the program was supposed to do.

Compiling is not migrating. A compile proves the syntax parses. It says nothing about whether the new code produces the same answer on the same input. That gap is where currency rounding drifts, where date math slips a day, where an EBCDIC-encoded field turns into mojibake. The build went green on Friday. On Monday, a downstream report didn’t reconcile because an assumption nobody wrote down had changed.
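To make the encoding case concrete, here is a minimal sketch using Python's built-in cp037 codec. It assumes the legacy field was written in EBCDIC code page 037; your mainframe may use a different code page, and the field contents are illustrative.

```python
# Five bytes as they might sit in a mainframe record: "HELLO" in EBCDIC.
# Code page 037 is an assumption; confirm the one your system actually uses.
ebcdic_field = bytes([0xC8, 0xC5, 0xD3, 0xD3, 0xD6])

# Correct: decode with the code page the data was written in.
decoded_correctly = ebcdic_field.decode("cp037")

# Wrong but silent: reading the same bytes as Latin-1 raises no error.
decoded_wrongly = ebcdic_field.decode("latin-1")

print(decoded_correctly)  # HELLO
print(decoded_wrongly)    # accented capitals instead of HELLO, and no exception
```

Nothing fails in the second decode. The program keeps running with the wrong text, which is why encoding belongs in the contract rather than in the tool's defaults.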

Step 1: Sort Your Code Into Three Buckets

Before any tool touches a line, decompose the migration by transformation type. Not by file. Not by module. By how much judgment each change requires. Most teams skip this and hand the AI one undifferentiated pile — that’s the original sin.

Your migration has three buckets, not one pile:

Mechanical — syntax and API changes a deterministic transform can rewrite reproducibly. Python 2’s print statement becoming print(). A COBOL PERFORM loop becoming a Java for. These run through an AST rewrite the same way every time, and you can review the diff.
Idiomatic — restructuring that needs judgment. COBOL paragraphs collapsing into Java methods. A React class component’s lifecycle untangling into hooks. There’s no single correct output here, which is exactly why it needs AI or a human, not a fixed rule.
Business-critical — logic that must produce byte-identical results. Interest calculations, tax rounding, encoding boundaries. This bucket doesn’t get “migrated.” It gets preserved and verified.

The Architect’s Rule: If you can’t say which bucket a file belongs in, you don’t understand it well enough to let an AI rewrite it.
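One way to keep the bucket map honest is to store it as data and check it against the file list. The sketch below is an assumption about how you might do that; the file names and helper functions are hypothetical.

```python
from enum import Enum


class Bucket(Enum):
    MECHANICAL = "mechanical"
    IDIOMATIC = "idiomatic"
    BUSINESS_CRITICAL = "business-critical"


# Hypothetical file names, for illustration only.
bucket_map = {
    "reports/format_output.py": Bucket.MECHANICAL,
    "billing/invoice_flow.py": Bucket.IDIOMATIC,
    "billing/interest_calc.py": Bucket.BUSINESS_CRITICAL,
}


def unclassified(all_files, mapping):
    """Return files with no bucket yet; these are not ready for any tool."""
    return sorted(f for f in all_files if f not in mapping)


def needs_human_review(mapping):
    """Return every file in the business-critical bucket."""
    return sorted(f for f, b in mapping.items() if b is Bucket.BUSINESS_CRITICAL)


all_files = list(bucket_map) + ["legacy/batch_close.py"]
print(unclassified(all_files, bucket_map))
print(needs_human_review(bucket_map))
```

A non-empty first list is the Architect's Rule firing: those files wait until someone can say which bucket they belong in.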

Step 2: Write the Equivalence Contract

The AI will guess at everything you leave unstated — confidently and fast. So state it. The equivalence contract is the spec that turns “convert this” into “convert this, preserving exactly these behaviors.”

Context checklist:

Source language and exact version — Python 2.7, not “Python 2.” Your specific COBOL dialect, not “COBOL.”
Target version — Java 21, Python 3.13, whichever you’re committing to.
Behavior-preservation rules — what counts as the same output: same input, same output, down to rounding mode and field width.
Edge cases listed explicitly — decimal rounding, null handling, character encoding (EBCDIC to UTF-8 is a classic silent corruptor), integer overflow when a fixed-width target meets an arbitrary-precision source.
The test oracle — the real production inputs and known-good outputs you’ll measure against.
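The checklist can also live as a small structured record, so an empty item is detectable before the AI gets to guess it. This is a sketch, not a standard format; the class, field names, and values are all hypothetical.

```python
from dataclasses import dataclass, fields


@dataclass(frozen=True)
class EquivalenceContract:
    source_version: str    # e.g. "Python 2.7"
    target_version: str    # e.g. "Python 3.13"
    rounding_mode: str     # e.g. "ROUND_HALF_UP"
    source_encoding: str   # e.g. "cp037"
    target_encoding: str   # e.g. "utf-8"
    oracle_path: str       # where the golden-master fixtures live
    edge_cases: tuple = ()


def missing_items(contract):
    """Names of fields left empty; each one is something the AI would guess."""
    return [f.name for f in fields(contract) if not getattr(contract, f.name)]


contract = EquivalenceContract(
    source_version="Python 2.7",
    target_version="Python 3.13",
    rounding_mode="",  # deliberately left blank to show the check
    source_encoding="cp037",
    target_encoding="utf-8",
    oracle_path="fixtures/golden",
    edge_cases=("leap-year dates", "empty fields"),
)
print(missing_items(contract))
```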

The Spec Test: If your contract doesn’t pin the rounding mode, the AI picks one. It will choose whatever its training favors, and your financial totals will drift by fractions of a cent that compound into a reconciliation failure nobody can trace.
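Here is what an unpinned rounding mode looks like in practice, sketched with Python's decimal module. The amount is illustrative. The same trap sits inside a Python 2 to 3 migration: Python 2's round() sent exact halves away from zero, while Python 3's round() sends them to the nearest even value.

```python
from decimal import Decimal, ROUND_HALF_EVEN, ROUND_HALF_UP

CENT = Decimal("0.01")
amount = Decimal("0.125")  # exactly halfway between 0.12 and 0.13

half_up = amount.quantize(CENT, rounding=ROUND_HALF_UP)
half_even = amount.quantize(CENT, rounding=ROUND_HALF_EVEN)

print(half_up)    # 0.13
print(half_even)  # 0.12

# Python 3's built-in round() sends exact halves to the nearest even value.
print(round(2.5))  # 2
print(round(3.5))  # 4
```

Both results are defensible. Only one of them matches what the legacy system did, and only the contract can say which.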

Step 3: Run Deterministic First, AI Second

Build order matters more in migration than in greenfield work, because every layer constrains the next. The rule: deterministic first, AI second, humans last.

Migration order:

Deterministic AST transforms first — because they’re reproducible and the diff is reviewable. On the JVM, OpenRewrite runs LST-based recipes that rewrite code the same way every run; rewrite-core is at 8.83.0 with more than 5,000 community recipes under Apache 2.0, per OpenRewrite Docs. For running those recipes across thousands of repositories at once, Moderne is the commercial platform built on the same engine. One caution: OpenRewrite is JVM-focused — Java version bumps, Spring Boot, JUnit migrations. It is not a Python 2-to-3 tool. For Python, reach for LibCST or parso, or an AI assistant working against your contract.
AI agents second — for the idiomatic bucket the rules can’t touch. For COBOL, AWS Transform for mainframe is an agentic system that analyzes, decomposes, and plans the work in waves, refactoring COBOL and JCL into Java while preserving business logic, per AWS Transform; it reached general availability in May 2025. IBM watsonx Code Assistant for Z converts COBOL and PL/I to Java using a multi-agent setup and the Granite model, per IBM Docs (version 2.8.20). On the JVM-upgrade side, the agentic Amazon Q Code Transformation handles Java-version jumps rather than COBOL. These agents increasingly expose their tools over the Model Context Protocol so they can pull repository context on demand — and since that spec is moving fast (the 2025-11-25 revision is current, with the next due later in 2026, per MCP Spec), pin to a known version rather than “latest.” For React class-to-hooks work, the Codemod ecosystem carries the idiomatic load.
Human review last — every file in the business-critical bucket. Not because humans are faster, but because this is where silent corruption hides, and a person who knows the domain is the only reviewer who’ll catch a rounding change that still compiles.
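To show what "deterministic" means in the first step, here is a minimal sketch of an AST rewrite using Python's standard-library ast module as a stand-in for a tool like LibCST. It renames dict.iteritems() calls to items(). Two assumptions to note: the input must already parse under Python 3 grammar, and ast.unparse (Python 3.9+) drops comments and original formatting, which is one reason dedicated tools exist. The class and function names are hypothetical.

```python
import ast


class IterItemsToItems(ast.NodeTransformer):
    """Rewrite `x.iteritems()` to `x.items()`, a Python 2 to 3 API rename."""

    def visit_Call(self, node):
        self.generic_visit(node)  # handle nested calls first
        func = node.func
        if (
            isinstance(func, ast.Attribute)
            and func.attr == "iteritems"
            and not node.args
            and not node.keywords
        ):
            func.attr = "items"
        return node


def migrate(source):
    """Parse, rewrite, and regenerate source text. Same input, same output."""
    tree = IterItemsToItems().visit(ast.parse(source))
    return ast.unparse(tree)


legacy = "for key, value in totals.iteritems():\n    report(key, value)\n"
migrated = migrate(legacy)
print(migrated)
```

The rule matches on the method name alone, so it assumes every iteritems() receiver is a dict. That kind of assumption is worth writing down next to the recipe, because the diff is reviewable but the rule does not know your types.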

Step 4: Prove Semantic Equivalence

Now verify the output is correct. Not by reading the first few files and nodding — a clean compile is not a passing test. You prove equivalence by running the new code against the same inputs as the old and diffing the results.

Validation checklist:

Characterization tests (golden master) — capture real production inputs and the legacy outputs, then assert the new code matches. Failure looks like: a diff on a real input the smoke test never exercised.
Auto-generated equivalence tests — some agents do this for you. IBM watsonx Code Assistant for Z auto-generates unit tests aimed at semantic equivalence, per IBM Docs. Failure looks like: a generated test that passes on the happy path but was never given a boundary value.
Edge-case battery — the list from your contract, run deliberately: max currency value, leap-year dates, empty fields, encoding boundaries. Failure looks like: an off-by-one at exactly the boundary you forgot to specify.
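A golden-master replay can be very small. The sketch below assumes you have recorded inputs paired with legacy outputs; the interest functions, rates, and case data are hypothetical stand-ins for your migrated code and your oracle.

```python
from decimal import Decimal, ROUND_HALF_EVEN, ROUND_HALF_UP

# Hypothetical golden master: recorded inputs paired with legacy outputs.
GOLDEN_MASTER = [
    {"input": {"balance": "1000.00", "rate": "0.0125"}, "expected": "12.50"},
    {"input": {"balance": "10.00", "rate": "0.0125"}, "expected": "0.13"},
    {"input": {"balance": "0.00", "rate": "0.0125"}, "expected": "0.00"},
]


def interest(balance, rate, rounding):
    """Compute interest to the cent under an explicit rounding mode."""
    raw = Decimal(balance) * Decimal(rate)
    return str(raw.quantize(Decimal("0.01"), rounding=rounding))


def migrated_interest(balance, rate):
    """Stand-in for migrated code that kept the legacy rounding."""
    return interest(balance, rate, ROUND_HALF_UP)


def drifted_interest(balance, rate):
    """Stand-in for migrated code where the rounding mode quietly changed."""
    return interest(balance, rate, ROUND_HALF_EVEN)


def replay(cases, func):
    """Run every recorded input through func and return the mismatches."""
    mismatches = []
    for case in cases:
        actual = func(**case["input"])
        if actual != case["expected"]:
            mismatches.append({**case, "actual": actual})
    return mismatches


print(replay(GOLDEN_MASTER, migrated_interest))  # []
print(replay(GOLDEN_MASTER, drifted_interest))   # one mismatch, on the 10.00 case
```

Both versions would compile and both pass the first and third cases. Only the input that lands exactly on a half-cent boundary exposes the drift, which is why the oracle needs real inputs rather than a smoke test.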

Vendors will tell you this collapses timelines — AWS frames mainframe modernization as moving from “years to months,” IBM frames it as “minutes not months.” Treat those as vendor claims, not benchmarks. The agent can be fast and the work can still be wrong if you skip the equivalence proof.

[Figure: The migration stack. Mechanical transforms run first, AI agents handle ambiguity, humans guard business-critical logic.]

Tool status notes:
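Python 2to3 / lib2to3: Removed in Python 3.13 (deprecated in 3.11), per Python Docs. Do not build a migration on it, because it no longer ships with modern Python. Use LibCST or parso, or an AI assistant working against an equivalence contract. One caveat: LibCST parses Python 3 grammar, so Python 2-only syntax such as the print statement has to be converted before LibCST can read the file.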
jscodeshift: No active maintainers at Meta and a large open-issue backlog; the Codemod team now drives it, with ast-grep (jssg) emerging as the modern successor. Usable, but don’t expect upstream fixes — keep your own validation around it.

Common Pitfalls

What You Did	Why AI Failed	The Fix
Pointed an AI at the whole repo: “convert to Java”	Too many concerns; it optimized for compiling, not behavior	Sort into mechanical / idiomatic / business-critical first
Didn’t specify the source version	AI assumed a modern dialect; legacy quirks got dropped	State exact versions on both sides of the migration
Trusted “it compiles” as done	Compilation proves syntax, not equivalence	Run characterization tests on real production inputs
Reached for 2to3 on Python	The tool was removed in Python 3.13	Use LibCST or parso, or an AI assistant with a contract
Shipped a React class-to-hooks codemod output	Codemods can’t untangle complex lifecycle or `this` logic	Treat the output as a draft; review state and effects by hand

Pro Tip

The most reusable artifact from any migration isn’t the new code — it’s the characterization test suite you build to prove equivalence. Write the golden-master tests against the legacy system before you migrate a line. They become your oracle for every tool you try, deterministic or AI, and they outlive the migration: the next time someone touches that code, the tests already define what “correct” means.
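Recording the golden master can be as simple as running the legacy code over captured inputs and saving the pairs to a file. This sketch assumes the legacy logic is callable from Python, which often is not true for a mainframe job; there you would capture the inputs and outputs from production runs instead. The function names, fee rule, and file name are hypothetical.

```python
import json
import tempfile
from pathlib import Path


def legacy_fee(amount_cents):
    """Stand-in for the legacy system: a 3% fee, truncated to whole cents."""
    return amount_cents * 3 // 100


def record_golden_master(func, inputs, path):
    """Run func on each input and save the input/output pairs as JSON."""
    cases = [{"input": value, "expected": func(value)} for value in inputs]
    path.write_text(json.dumps(cases, indent=2))
    return cases


def load_golden_master(path):
    """Read the recorded cases back for replay against any candidate."""
    return json.loads(path.read_text())


with tempfile.TemporaryDirectory() as tmp:
    fixture = Path(tmp) / "fee_golden_master.json"
    recorded = record_golden_master(legacy_fee, [0, 99, 100, 12345], fixture)
    reloaded = load_golden_master(fixture)

print(reloaded)
```

The fixture file is plain data with no dependency on any migration tool, so it can be replayed against whichever candidate you try next.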

Frequently Asked Questions

Q: How do you use AI to translate code from one programming language to another? A: Pin the source and target versions, then split the job: a deterministic AST tool handles mechanical syntax, an AI agent handles idiomatic restructuring, and a human verifies business logic. One watch-out that deserves its own line in the contract: type systems rarely map one-to-one, so specify how unsupported types should be represented (a Python arbitrary-precision int in a fixed-width target) before the AI guesses for you.
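The integer case is easy to sketch. Python ints grow without limit, while a 32-bit signed integer such as Java's int wraps around silently on overflow. The helpers below are hypothetical and simulate that wraparound so you can see which recorded values would not survive the move.

```python
INT32_MIN, INT32_MAX = -2**31, 2**31 - 1


def fits_int32(value):
    """True if value can be stored in a 32-bit signed integer unchanged."""
    return INT32_MIN <= value <= INT32_MAX


def wrap_int32(value):
    """Simulate two's-complement wraparound of a 32-bit signed integer."""
    return (value - INT32_MIN) % 2**32 + INT32_MIN


total_cents = 2_147_483_647 + 1  # fine in Python, one past the 32-bit maximum

print(fits_int32(total_cents))  # False
print(wrap_int32(total_cents))  # -2147483648
```

Running fits_int32 over the values in your test oracle is a cheap way to decide, before migration, which fields need a wider or arbitrary-precision type in the target.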

Q: How do you use AI agents to convert legacy COBOL mainframe code to Java? A: AWS Transform for mainframe and IBM watsonx Code Assistant for Z both decompose COBOL and JCL and refactor to Java while preserving business logic. The detail that trips teams up: AWS Transform for mainframe is region-limited, available in US East (N. Virginia) and Europe (Frankfurt) at GA, so confirm your data-residency rules allow it before you plan a migration wave.

Q: How do you migrate a React class component codebase to hooks with AI? A: The Codemod Registry’s class-to-function-component recipe handles state, lifecycle, refs, and context with optional AI review; react-declassify is a heuristic, no-LLM alternative. Neither reliably converts tangled lifecycle chains or heavy `this` references, so treat the output as a first draft and migrate components with dense componentDidUpdate logic by hand.

Your Spec Artifact

By the end of this guide, you should have:

A three-bucket migration map — which files are mechanical, which need AI judgment, which are business-critical
An equivalence contract — source and target versions, behavior-preservation rules, and an explicit edge-case list
A characterization test suite that defines “correct” using real production inputs, independent of any tool

Your Implementation Prompt

Drop this into your AI migration agent or coding tool (AWS Transform, watsonx Code Assistant for Z, Claude Code, Cursor) at the start of a migration. Fill every bracket with your own values — each one maps to a checklist item from Step 2. The prompt forces the tool to plan before it translates.

You are helping me migrate [source language + exact version, e.g.,
COBOL, IBM Enterprise dialect] to [target language + version, e.g.,
Java 21]. Do not translate yet. Work these steps with me.

1. MAP: Read the codebase at [path/repo] and sort every module into
   three buckets:
   - Mechanical: syntax/API changes a deterministic AST transform handles
   - Idiomatic: restructuring needing judgment (e.g., [COBOL paragraphs
     -> methods / class lifecycle -> hooks])
   - Business-critical: logic that must produce identical output,
     especially [currency rounding / date math / EBCDIC encoding]

2. CONTRACT: For each business-critical module, state the equivalence
   rules:
   - Inputs/outputs that must match exactly: [list]
   - Edge cases that must not change: [rounding mode / null handling /
     encoding / overflow behavior]

3. SEQUENCE: Propose a build order — deterministic transforms first,
   AI restructuring second, every business-critical module flagged for
   human review last.

4. VALIDATE: For each module, generate characterization tests comparing
   new output against [legacy system / golden-master fixtures] using
   these real inputs: [sample inputs].

Output the bucket map and the test plan first. Wait for my approval
before generating any migrated code.

Ship It

You now have a way to think about migration that doesn’t depend on which vendor you pick. Decompose by transformation type, write the equivalence contract, run deterministic before AI, and prove it with characterization tests. The tools will change — the agents will get faster, the recipe libraries will grow. The framework holds.

Aha Moments

MONA

What MAX calls the equivalence contract is, underneath, a statement about behavior rather than text. Two programs can share no source at all and still be equivalent — or share nearly all of it and diverge on a single rounding boundary. An AST transform operates on the structure of code, which is why mechanical rewrites are safe: structure maps cleanly. The idiomatic bucket is harder precisely because the same behavior has many valid structural expressions, and choosing among them requires a model of intent the parser never had. That’s the real reason migration resists full automation. Syntax is decidable. Intent is inferred. The three-bucket split is a way of routing each change to the level of reasoning it actually demands.

DAN

The strategic shift MAX is pointing at is that migration stopped being a multi-year capital project and became a workflow. When agentic tools can decompose a COBOL estate and plan it in waves, the bottleneck moves from translation to verification — and the teams that win are the ones who built their test oracle first. I’d push his point further: the characterization suite isn’t just a safety net, it’s the asset. It’s what lets you swap a migration vendor without starting over. Lock yourself to a tool and you’ve traded a legacy dependency for a newer one. Own the equivalence tests and you stay free to move.

ALAN

Both of you are circling the thing that worries me. MAX puts human review on the business-critical bucket, and MONA explains why intent can’t be parsed — but in a decades-old COBOL system, who still knows the intent? The author retired. The spec is the code. When an agent migrates logic nobody fully understands and the tests pass, we’ve proven the new code matches the old behavior, including the bugs and the undocumented assumptions we never meant to keep. Speed makes that easier to wave through. So if the only surviving record of what a system should do is the system itself, what exactly are we preserving when we migrate — the intent, or just the output we never questioned?

AI-assisted content, human-reviewed. Images AI-generated.