Using AI to Translate Python 2 to Python 3 and Convert COBOL to Java in 2026

TL;DR
- AI doesn’t replace the migration plan. It executes the parts you specify. Split your code into what rewrites mechanically, what needs AI judgment, and what a human must verify.
- The only contract that matters is semantic equivalence — same input, same output. A clean compile tells you nothing about behavior.
- Run deterministic AST tools first, AI agents second, human review on business-critical paths last.
A bank’s overnight batch job ran in COBOL for thirty years. An AI tool converted it to Java in an afternoon. It compiled. It passed the smoke test. Three weeks later, interest on a few thousand accounts was off by a cent — because the migration quietly changed how decimals rounded, and nobody had said it shouldn’t. The code wasn’t wrong. The specification was missing.
Before You Start
You’ll need:
- An AI migration tool matched to your stack — AWS Transform for mainframe or IBM watsonx Code Assistant for Z for COBOL, LibCST or an AI assistant for Python 2 to 3, the Codemod Registry for React
- A working understanding of AI Code Migration and how an Abstract Syntax Tree transform rewrites code without running it
- A test oracle: the legacy system’s real inputs and the outputs it actually produces
This guide teaches you: how to decompose a migration so the mechanical parts, the AI-judgment parts, and the human-review parts each go to the tool that can actually handle them.
The Migration That Compiled and Lied
Here’s the failure mode I see on nearly every legacy project. Someone points an AI at a repository, types “convert this to Java,” and gets back code that compiles. Green checkmark. Ship it. The AI optimized for the thing it could see — syntax — and guessed at the thing it couldn’t: what the program was supposed to do.
Compiling is not migrating. A compile proves the syntax parses. It says nothing about whether the new code produces the same answer on the same input. That gap is where currency rounding drifts, where date math slips a day, where an EBCDIC-encoded field turns into mojibake. The build went green on Friday. On Monday, a downstream report didn’t reconcile because an assumption nobody wrote down had changed.
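A small sketch of that gap, using the Python 2 to 3 case. Both lines below are valid syntax in either version, yet they return different answers. The `legacy_round` helper is a hypothetical name introduced here to reproduce the old tie-breaking rule; it is an illustration, not part of any migration tool.

```python
from decimal import Decimal, ROUND_HALF_UP


def legacy_round(value: str) -> int:
    """Round to the nearest integer with ties going away from zero,
    which is how Python 2's round() treated ties."""
    return int(Decimal(value).quantize(Decimal("1"), rounding=ROUND_HALF_UP))


# Python 3's built-in round() sends ties to the even neighbour.
print(round(2.5))           # 2
print(legacy_round("2.5"))  # 3

# Python 2 evaluated 7 / 2 on ints as integer division; Python 3 returns a float.
print(7 / 2)                # 3.5
print(7 // 2)               # 3, the Python 2 result for ints
```

Nothing here fails to parse or run. Only a comparison against the legacy output would reveal the change.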
Step 1: Sort Your Code Into Three Buckets
Before any tool touches a line, decompose the migration by transformation type. Not by file. Not by module. By how much judgment each change requires. Most teams skip this and hand the AI one undifferentiated pile — that’s the original sin.
Your migration has three layers — three buckets, not one pile:
- Mechanical — syntax and API changes a deterministic transform can rewrite reproducibly. Python 2’s print statement becoming print(). A COBOL PERFORM loop becoming a Java for loop. These run through an AST rewrite the same way every time, and you can review the diff.
- Idiomatic — restructuring that needs judgment. COBOL paragraphs collapsing into Java methods. A React class component’s lifecycle untangling into hooks. There’s no single correct output here, which is exactly why it needs AI or a human, not a fixed rule.
- Business-critical — logic that must produce byte-identical results. Interest calculations, tax rounding, encoding boundaries. This bucket doesn’t get “migrated.” It gets preserved and verified.
The Architect’s Rule: If you can’t say which bucket a file belongs in, you don’t understand it well enough to let an AI rewrite it.
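One way to make the bucket map concrete is a plain lookup table you keep in the repository. This is a minimal sketch: the file paths, the `Bucket` enum, and both helper functions are hypothetical names chosen for illustration, and a real map comes from reading the code, not from a script.

```python
from enum import Enum


class Bucket(Enum):
    MECHANICAL = "mechanical"
    IDIOMATIC = "idiomatic"
    BUSINESS_CRITICAL = "business-critical"


# Hypothetical file names, filled in by people who have read the code.
migration_map: dict[str, Bucket] = {
    "reports/print_header.py": Bucket.MECHANICAL,
    "batch/job_flow.py": Bucket.IDIOMATIC,
    "ledger/interest.py": Bucket.BUSINESS_CRITICAL,
}


def unclassified(all_files: list[str], mapping: dict[str, Bucket]) -> list[str]:
    """Files nobody has sorted yet, so not ready for an AI rewrite."""
    return sorted(path for path in all_files if path not in mapping)


def needs_human_review(mapping: dict[str, Bucket]) -> list[str]:
    """Every business-critical file gets a human reviewer."""
    return sorted(
        path for path, bucket in mapping.items()
        if bucket is Bucket.BUSINESS_CRITICAL
    )


print(unclassified(["ledger/interest.py", "util/dates.py"], migration_map))
print(needs_human_review(migration_map))
```

An empty `unclassified` result is a reasonable gate before any tool runs.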
Step 2: Write the Equivalence Contract
The AI will guess at everything you leave unstated — confidently and fast. So state it. The equivalence contract is the spec that turns “convert this” into “convert this, preserving exactly these behaviors.”
Context checklist:
- Source language and exact version — Python 2.7, not “Python 2.” Your specific COBOL dialect, not “COBOL.”
- Target version — Java 21, Python 3.13, whichever you’re committing to.
- Behavior-preservation rules — what counts as the same output: same input, same output, down to rounding mode and field width.
- Edge cases listed explicitly — decimal rounding, null handling, character encoding (EBCDIC to UTF-8 is a classic silent corruptor), integer overflow when a fixed-width target meets an arbitrary-precision source.
- The test oracle — the real production inputs and known-good outputs you’ll measure against.
The Spec Test: If your contract doesn’t pin the rounding mode, the AI picks one. It will choose whatever its training favors, and your financial totals will drift by fractions of a cent that compound into a reconciliation failure nobody can trace.
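The checklist can live as a small data structure, so an unpinned item fails loudly before any tool guesses. The sketch below is one possible shape, not a standard format: `EquivalenceContract`, its field names, and `round_to_cents` are all assumptions made for illustration. The rounding constants come from Python's standard `decimal` module.

```python
from dataclasses import dataclass, fields
from decimal import Decimal, ROUND_HALF_EVEN, ROUND_HALF_UP
from typing import Optional


@dataclass(frozen=True)
class EquivalenceContract:
    # Every field defaults to None, meaning "nobody has decided yet".
    source_version: Optional[str] = None
    target_version: Optional[str] = None
    rounding_mode: Optional[str] = None
    source_encoding: Optional[str] = None
    null_handling: Optional[str] = None
    overflow_behavior: Optional[str] = None

    def unpinned(self) -> list[str]:
        """Names of the items the contract still leaves to guesswork."""
        return [f.name for f in fields(self) if getattr(self, f.name) is None]


def round_to_cents(amount: Decimal, contract: EquivalenceContract) -> Decimal:
    """Round using the contract's mode, refusing to pick one silently."""
    if contract.rounding_mode is None:
        raise ValueError("rounding mode is not pinned in the contract")
    return amount.quantize(Decimal("0.01"), rounding=contract.rounding_mode)


legacy = EquivalenceContract(rounding_mode=ROUND_HALF_UP)
guessed = EquivalenceContract(rounding_mode=ROUND_HALF_EVEN)

print(round_to_cents(Decimal("0.125"), legacy))   # 0.13
print(round_to_cents(Decimal("0.125"), guessed))  # 0.12
print(legacy.unpinned())
```

The two results differ by one cent on the same input, which is the drift the Spec Test describes.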
Step 3: Run Deterministic First, AI Second
Build order matters more in migration than in greenfield work, because every layer constrains the next. The rule: deterministic first, AI second, humans last.
Migration order:
- Deterministic AST transforms first — because they’re reproducible and the diff is reviewable. On the JVM, OpenRewrite runs LST-based recipes that rewrite code the same way every run; rewrite-core is at 8.83.0 with more than 5,000 community recipes under Apache 2.0, per OpenRewrite Docs. For running those recipes across thousands of repositories at once, Moderne is the commercial platform built on the same engine. One caution: OpenRewrite is JVM-focused — Java version bumps, Spring Boot, JUnit migrations. It is not a Python 2-to-3 tool. For Python, reach for LibCST or parso, or an AI assistant working against your contract.
- AI agents second — for the idiomatic bucket the rules can’t touch. For COBOL, AWS Transform for mainframe is an agentic system that analyzes, decomposes, and plans the work in waves, refactoring COBOL and JCL into Java while preserving business logic, per AWS Transform; it reached general availability in May 2025. IBM watsonx Code Assistant for Z converts COBOL and PL/I to Java using a multi-agent setup and the Granite model, per IBM Docs (version 2.8.20). On the JVM-upgrade side, the agentic Amazon Q Code Transformation handles Java-version jumps rather than COBOL. These agents increasingly expose their tools over the Model Context Protocol so they can pull repository context on demand — and since that spec is moving fast (the 2025-11-25 revision is current, with the next due later in 2026, per MCP Spec), pin to a known version rather than “latest.” For React class-to-hooks work, the Codemod ecosystem carries the idiomatic load.
- Human review last — every file in the business-critical bucket. Not because humans are faster, but because this is where silent corruption hides, and a person who knows the domain is the only reviewer who’ll catch a rounding change that still compiles.
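To show what "deterministic" means for the first step in that order, here is a minimal sketch using Python's standard `ast` module, not LibCST or OpenRewrite. It rewrites the Python 2 idiom `d.has_key(k)` into `k in d`. The class and function names are hypothetical. Note the trade-off: `ast.unparse` (Python 3.9+) drops comments and original formatting, which is one reason a concrete-syntax-tree tool such as LibCST is usually the better fit on real code.

```python
import ast


class HasKeyRewriter(ast.NodeTransformer):
    """Rewrite `d.has_key(k)` (a Python 2 only method) to `k in d`."""

    def visit_Call(self, node: ast.Call) -> ast.AST:
        self.generic_visit(node)  # rewrite nested calls first
        is_has_key = (
            isinstance(node.func, ast.Attribute)
            and node.func.attr == "has_key"
            and len(node.args) == 1
            and not node.keywords
        )
        if not is_has_key:
            return node
        membership = ast.Compare(
            left=node.args[0],
            ops=[ast.In()],
            comparators=[node.func.value],
        )
        return ast.copy_location(membership, node)


def rewrite_has_key(source: str) -> str:
    """Parse, transform, and emit source. Same input gives the same output."""
    tree = HasKeyRewriter().visit(ast.parse(source))
    return ast.unparse(ast.fix_missing_locations(tree))


print(rewrite_has_key("x = d.has_key(k)"))  # x = k in d
```

Because the rule is fixed, two runs over the same file produce the same diff, which is what makes the result reviewable.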
Step 4: Prove Semantic Equivalence
Now verify the output is correct. Not by reading the first few files and nodding — a clean compile is not a passing test. You prove equivalence by running the new code against the same inputs as the old and diffing the results.
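A minimal sketch of that diff, assuming the legacy outputs have already been captured as (input, expected output) pairs. The interest rate, the sample balances, and the function names are invented for illustration. The migrated function deliberately uses a different rounding mode than the legacy figures assume.

```python
from decimal import Decimal, ROUND_HALF_EVEN
from typing import Callable, Iterable


def diff_against_golden(
    migrated: Callable[[str], str],
    golden: Iterable[tuple[str, str]],
) -> list[tuple[str, str, str]]:
    """Return (input, expected, actual) for every case where outputs differ."""
    mismatches = []
    for given, expected in golden:
        actual = migrated(given)
        if actual != expected:
            mismatches.append((given, expected, actual))
    return mismatches


def migrated_interest(balance: str) -> str:
    # Hypothetical migrated code: 1.25% interest, rounded half-even.
    interest = Decimal(balance) * Decimal("0.0125")
    return str(interest.quantize(Decimal("0.01"), rounding=ROUND_HALF_EVEN))


# Hypothetical legacy outputs, which rounded ties upward.
golden_master = [("100.00", "1.25"), ("130.00", "1.63")]

print(diff_against_golden(migrated_interest, golden_master))
# [('130.00', '1.63', '1.62')]
```

The first case passes and would satisfy a smoke test. The second is the one-cent drift that only a real input exposes.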
Validation checklist:
- Characterization tests (golden master) — capture real production inputs and the legacy outputs, then assert the new code matches. Failure looks like: a diff on a real input the smoke test never exercised.
- Auto-generated equivalence tests — some agents do this for you. IBM watsonx Code Assistant for Z auto-generates unit tests aimed at semantic equivalence, per IBM Docs. Failure looks like: a generated test that passes on the happy path but was never given a boundary value.
- Edge-case battery — the list from your contract, run deliberately: max currency value, leap-year dates, empty fields, encoding boundaries. Failure looks like: an off-by-one at exactly the boundary you forgot to specify.
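An edge-case battery can start as a named table of checks. The sketch below is illustrative only: it assumes the source data uses EBCDIC code page 037 (your system may use another), it simulates a 32-bit signed target integer, and every name in it is hypothetical. In a real battery each check would call the migrated code rather than these helpers.

```python
import calendar


def decode_ebcdic(raw: bytes) -> str:
    """Decode EBCDIC bytes. Assumption: code page 037."""
    return raw.decode("cp037")


def wrap_int32(value: int) -> int:
    """Simulate two's-complement wraparound of a 32-bit signed integer."""
    return (value + 2**31) % 2**32 - 2**31


EDGE_CASES = {
    "ebcdic_hello": lambda: decode_ebcdic(bytes.fromhex("c8c5d3d3d6")) == "HELLO",
    "empty_field": lambda: decode_ebcdic(b"") == "",
    "int32_overflow": lambda: wrap_int32(2**31) == -(2**31),
    "century_not_leap": lambda: not calendar.isleap(1900),
    "leap_century": lambda: calendar.isleap(2000),
}


def run_battery(cases: dict) -> list[str]:
    """Return the names of the edge cases that fail."""
    return [name for name, check in cases.items() if not check()]


print(run_battery(EDGE_CASES))  # an empty list means every case held
```

Naming each case keeps the failure report readable: you see which boundary broke, not just that something did.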
Vendors will tell you this collapses timelines — AWS frames mainframe modernization as moving from “years to months,” IBM frames it as “minutes not months.” Treat those as vendor claims, not benchmarks. The agent can be fast and the work can still be wrong if you skip the equivalence proof.

Tool status notes:
- Python 2to3 / lib2to3: Removed in Python 3.13 (deprecated in 3.11), per Python Docs. Do not build a migration on it — it no longer ships with modern Python. Use LibCST or parso, or an AI assistant working against an equivalence contract.
- jscodeshift: No active maintainers at Meta and a large open-issue backlog; the Codemod team now drives it, with ast-grep (jssg) emerging as the modern successor. Usable, but don’t expect upstream fixes — keep your own validation around it.
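If you are unsure whether an environment still ships lib2to3, a quick check avoids building on it by accident. This is a small sketch using the standard `importlib` machinery; the function name is hypothetical.

```python
import importlib.util
import sys


def lib2to3_available() -> bool:
    """True if this interpreter can still locate the lib2to3 package."""
    return importlib.util.find_spec("lib2to3") is not None


print(sys.version_info[:2], lib2to3_available())
```

Some distributions package lib2to3 separately, so the result can vary even on older Python versions.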
Common Pitfalls
| What You Did | Why AI Failed | The Fix |
|---|---|---|
| Pointed an AI at the whole repo: “convert to Java” | Too many concerns; it optimized for compiling, not behavior | Sort into mechanical / idiomatic / business-critical first |
| Didn’t specify the source version | AI assumed a modern dialect; legacy quirks got dropped | State exact versions on both sides of the migration |
| Trusted “it compiles” as done | Compilation proves syntax, not equivalence | Run characterization tests on real production inputs |
| Reached for 2to3 on Python | The tool was removed in Python 3.13 | Use LibCST or parso, or an AI assistant with a contract |
| Shipped a React class-to-hooks codemod output | Codemods can’t untangle complex lifecycle logic or code that leans heavily on the this keyword | Treat the output as a draft; review state and effects by hand |
Pro Tip
The most reusable artifact from any migration isn’t the new code — it’s the characterization test suite you build to prove equivalence. Write the golden-master tests against the legacy system before you migrate a line. They become your oracle for every tool you try, deterministic or AI, and they outlive the migration: the next time someone touches that code, the tests already define what “correct” means.
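A minimal sketch of capturing that oracle before the migration starts. It assumes you can call the legacy system per input and that inputs and outputs are strings; `legacy_stub` is a hypothetical stand-in for the real job, and the JSON layout is just one reasonable choice.

```python
import json
import tempfile
from pathlib import Path
from typing import Callable, Iterable


def record_golden_master(
    legacy: Callable[[str], str], inputs: Iterable[str], path: Path
) -> int:
    """Run the legacy system on each input and save the input/output pairs."""
    cases = [{"input": item, "expected": legacy(item)} for item in inputs]
    path.write_text(json.dumps(cases, indent=2), encoding="utf-8")
    return len(cases)


def load_golden_master(path: Path) -> list[tuple[str, str]]:
    """Load the recorded pairs for replay against any migrated version."""
    cases = json.loads(path.read_text(encoding="utf-8"))
    return [(case["input"], case["expected"]) for case in cases]


def legacy_stub(record: str) -> str:
    # Hypothetical stand-in for invoking the real legacy job.
    return record.strip().upper()


with tempfile.TemporaryDirectory() as tmp:
    fixture = Path(tmp) / "golden_master.json"
    count = record_golden_master(legacy_stub, ["  acct-001 ", "acct-002"], fixture)
    replayed = load_golden_master(fixture)

print(count, replayed)
```

Because the fixture file is independent of any tool, the same pairs can judge a deterministic recipe, an AI agent, or a hand rewrite.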
Frequently Asked Questions
Q: How to use AI to translate code from one programming language to another?
A: Pin the source and target versions, then split the job — a deterministic AST tool handles mechanical syntax, an AI agent handles idiomatic restructuring, and a human verifies business logic. One watch-out the steps above skip: type systems rarely map one-to-one, so specify how unsupported types should be represented (a Python arbitrary-precision int in a fixed-width target) before the AI guesses for you.
Q: How to use AI agents to convert legacy COBOL mainframe code to Java?
A: AWS Transform for mainframe and IBM watsonx Code Assistant for Z both decompose COBOL and JCL and refactor to Java while preserving business logic. The detail that trips teams up: AWS Transform for mainframe is region-limited — US East (N. Virginia) and Europe (Frankfurt) at GA — so confirm your data-residency rules allow it before you plan a migration wave.
Q: How to migrate a React class component codebase to hooks with AI?
A: The Codemod Registry’s class-to-function-component recipe handles state, lifecycle, refs, and context with optional AI review; react-declassify is a heuristic, no-LLM alternative. Neither reliably converts tangled lifecycle chains or components that lean heavily on the this keyword, so treat the output as a first draft and migrate components with dense componentDidUpdate logic by hand.
Your Spec Artifact
By the end of this guide, you should have:
- A three-bucket migration map — which files are mechanical, which need AI judgment, which are business-critical
- An equivalence contract — source and target versions, behavior-preservation rules, and an explicit edge-case list
- A characterization test suite that defines “correct” using real production inputs, independent of any tool
Your Implementation Prompt
Drop this into your AI migration agent or coding tool (AWS Transform, watsonx Code Assistant for Z, Claude Code, Cursor) at the start of a migration. Fill every bracket with your own values — each one maps to a checklist item from Step 2. The prompt forces the tool to plan before it translates.
You are helping me migrate [source language + exact version, e.g.,
COBOL, IBM Enterprise dialect] to [target language + version, e.g.,
Java 21]. Do not translate yet. Work these steps with me.
1. MAP: Read the codebase at [path/repo] and sort every module into
three buckets:
- Mechanical: syntax/API changes a deterministic AST transform handles
- Idiomatic: restructuring needing judgment (e.g., [COBOL paragraphs
-> methods / class lifecycle -> hooks])
- Business-critical: logic that must produce identical output,
especially [currency rounding / date math / EBCDIC encoding]
2. CONTRACT: For each business-critical module, state the equivalence
rules:
- Inputs/outputs that must match exactly: [list]
- Edge cases that must not change: [rounding mode / null handling /
encoding / overflow behavior]
3. SEQUENCE: Propose a build order — deterministic transforms first,
AI restructuring second, every business-critical module flagged for
human review last.
4. VALIDATE: For each module, generate characterization tests comparing
new output against [legacy system / golden-master fixtures] using
these real inputs: [sample inputs].
Output the bucket map and the test plan first. Wait for my approval
before generating any migrated code.
Ship It
You now have a way to think about migration that doesn’t depend on which vendor you pick. Decompose by transformation type, write the equivalence contract, run deterministic before AI, and prove it with characterization tests. The tools will change — the agents will get faster, the recipe libraries will grow. The framework holds.
AI-assisted content, human-reviewed. Images AI-generated.