Airbnb recently completed our first large-scale, LLM-driven code migration, updating nearly 3.5K React component test files from Enzyme to use React Testing Library (RTL) instead. We'd originally estimated this would take 1.5 years of engineering time to do by hand, but, using a combination of frontier models and robust automation, we finished the entire migration in just 6 weeks.
In this blog post, we'll highlight the unique challenges we faced migrating from Enzyme to RTL, how LLMs excel at solving this particular type of challenge, and how we structured our migration tooling to run an LLM-driven migration at scale.
In 2020, Airbnb adopted React Testing Library (RTL) for all new React component test development, marking our first steps away from Enzyme. Although Enzyme had served us well since 2015, it was designed for earlier versions of React, and the framework's deep access to component internals no longer aligned with modern React testing practices.
However, because of the fundamental differences between these frameworks, we couldn't simply swap one out for the other (read more about the differences here). We also couldn't just delete the Enzyme files, as analysis showed this would create significant gaps in our code coverage. To complete this migration, we needed an automated way to refactor test files from Enzyme to RTL while preserving the intent of the original tests and their code coverage.
In mid-2023, an Airbnb hackathon team demonstrated that large language models could successfully convert hundreds of Enzyme files to RTL in just a few days.
Building on this promising result, in 2024 we developed a scalable pipeline for an LLM-driven migration. We broke the migration into discrete, per-file steps that we could parallelize, added configurable retry loops, and significantly expanded our prompts with additional context. Finally, we performed breadth-first prompt tuning for the long tail of complex files.
We started by breaking the migration down into a series of automated validation and refactor steps. Think of it like a production pipeline: each file moves through stages of validation, and when a check fails, we bring in the LLM to fix it.
We modeled this flow as a state machine, moving the file to the next state only after validation at the previous state passed:
This step-based approach provided a solid foundation for our automation pipeline. It enabled us to track progress, improve failure rates for specific steps, and rerun files or steps when needed. The step-based approach also made it straightforward to run migrations on hundreds of files concurrently, which was important both for quickly migrating simple files and for chipping away at the long tail of files later in the migration.
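The per-file state machine described above can be sketched as follows. This is a minimal illustration: the state names and the `advance` helper are assumptions for the sake of the example, not Airbnb's actual code.

```typescript
// Minimal sketch of the per-file migration state machine.
// A file only advances when validation for its current state passes;
// on failure it stays put and a retry is triggered instead.
const states = [
  "enzyme",       // original, unmigrated test file
  "refactored",   // LLM has produced an RTL version
  "lint_passed",  // passes lint checks
  "types_passed", // passes the type check
  "tests_passed", // RTL tests run green
] as const;

type MigrationState = (typeof states)[number];

function advance(current: MigrationState, validationPassed: boolean): MigrationState {
  const i = states.indexOf(current);
  if (!validationPassed || i === states.length - 1) return current;
  return states[i + 1];
}
```

Keeping the state explicit per file is what makes it cheap to rerun a single step, or hundreds of files at once, without losing track of progress.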
Early in the migration, we experimented with different prompt engineering strategies to improve our per-file migration success rate. However, building on the stepped approach, we found the most effective route to better outcomes was simply brute force: retry steps multiple times until they passed or we reached a limit. We updated our steps to use dynamic prompts for each retry, giving the validation errors and the most recent version of the file to the LLM, and built a loop runner that ran each step up to a configurable number of attempts.
With this simple retry loop, we found we could successfully migrate a large share of our simple-to-medium complexity test files, with some finishing successfully after a few retries, and most within 10 attempts.
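The retry loop described above might look something like this. It is a sketch under stated assumptions: the function shapes and the `StepResult` type are illustrative, not the actual loop runner.

```typescript
// Illustrative retry loop runner: rerun a step up to maxAttempts times,
// feeding the newest validation errors back into the next attempt so the
// dynamic prompt always reflects the latest failure.
interface StepResult {
  ok: boolean;
  errors: string[];
}

function runStepWithRetries(
  runStep: (previousErrors: string[]) => StepResult,
  maxAttempts = 10,
): boolean {
  let errors: string[] = [];
  for (let attempt = 1; attempt <= maxAttempts; attempt++) {
    // Each retry rebuilds the prompt from the latest file and latest errors.
    const result = runStep(errors);
    if (result.ok) return true;
    errors = result.errors;
  }
  return false; // attempt budget exhausted; flag for manual follow-up
}
```

The key detail is that each attempt sees the previous attempt's validation errors, so retries are not blind reruns but progressively better-informed ones.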
For test files up to a certain complexity, simply increasing our retry attempts worked well. However, to handle files with intricate test state setups or excessive indirection, we found the best approach was to push as much relevant context as possible into our prompts.
By the end of the migration, our prompts had expanded to anywhere between 40,000 and 100,000 tokens, pulling in as many as 50 related files, a whole host of manually written few-shot examples, as well as examples of existing, well-written, passing test files from within the same project.
Each prompt included:
- The source code of the component under test
- The test file we were migrating
- Validation failures for the step
- Related tests from the same directory (maintaining team-specific patterns)
- General migration guidelines and common solutions
Here's how that looked in practice (significantly trimmed down for readability):
// Code example shows a trimmed down version of a prompt
// including the raw source code from related files, imports,
// examples, the component source itself, and the test file to migrate.
const prompt = [
  'Convert this Enzyme test to React Testing Library:',
  `SIBLING TESTS:\n${siblingTestFilesSourceCode}`,
  `RTL EXAMPLES:\n${reactTestingLibraryExamples}`,
  `IMPORTS:\n${nearestImportSourceCode}`,
  `COMPONENT SOURCE:\n${componentFileSourceCode}`,
  `TEST TO MIGRATE:\n${testFileSourceCode}`,
].join('\n\n');
This rich-context approach proved highly effective for these more complex files: the LLM could better understand team-specific patterns, common testing approaches, and the overall architecture of the codebase.
We should note that, although we did some prompt engineering at this step, the main success driver we observed was choosing the right related files (finding nearby files, good example files from the same project, filtering the dependencies down to files relevant to the component, and so on), rather than getting the prompt wording perfect.
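A rough sketch of that file-selection step is below. The heuristics shown (same-directory siblings that already import RTL, capped at a limit) are illustrative assumptions, not the actual filters we used.

```typescript
// Illustrative related-file selection: prefer sibling test files from the
// same directory that already use RTL, since they carry team-specific
// patterns, and cap the result to keep the prompt within budget.
interface ProjectFile {
  path: string;
  source: string;
}

function pickSiblingRtlTests(
  testFilePath: string,
  files: ProjectFile[],
  limit = 50,
): string[] {
  const dir = testFilePath.slice(0, testFilePath.lastIndexOf("/"));
  return files
    .filter(
      (f) =>
        f.path !== testFilePath &&
        f.path.startsWith(dir + "/") && // same directory keeps team patterns
        f.source.includes("@testing-library/react"), // already migrated to RTL
    )
    .slice(0, limit)
    .map((f) => f.path);
}
```

Heuristics like these matter more than prompt wording because they determine which examples the model actually imitates.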