Poor Meta. Technical debt and user training made its exabyte-scale data migration tricky

Here's one from the "welcome to the real world, kids, we have no sympathy for your plight" files: social media giant Meta's engineering team has bemoaned the complexity of migrating from legacy technology.

In a Thursday post detailing migration of exabyte-scale data stores to new schemas, a quartet of Meta software engineers offered the following insight into their work.

Fellas, we're going to let you in on a secret: everyone gets technical debt, and everyone has trouble educating users about new systems.

Meta is not special. It is not a beautiful and unique snowflake. It is the same sort of decaying collection of cobbled-together tech that every other organisation accrues over time.

In this case, the decrepit tech was "numerous heterogeneous services, such as warehouse data storage and various real-time systems, that make up Meta's data platform - all exchanging large amounts of data among themselves as they communicate via service APIs."

As Meta detailed in December 2022, those systems struggled to scale as the data-harvesting giant built more AI workloads that needed to access data from diverse sources.

Improved data logging and serialization were the answer, so that data could describe itself more effectively and therefore be more easily ingested by diverse applications.

Meta built a system called "Tulip" to sort that out, and was chuffed that the formats it used required 40 percent to 85 percent fewer bytes and 50 percent to 90 percent fewer CPU cycles.

As Meta's Thursday post explains, Tulip may have been top tech but making it work was hard, not least because the social media giant employed over 30,000 logging schemas.

Across the four-year effort to adopt Tulip, Meta engineers found that some data couldn't be easily ingested or converted, or that doing so was computationally expensive. Some tools designed to ease migration created problems as they ran, so engineers built rate limiters to stop issues snowballing.
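
For readers wondering what such a rate limiter might look like, here is a minimal sketch. Meta's post does not say how its limiters were implemented, so the token-bucket approach and every name below (`TokenBucket`, `try_acquire`, `migrate`) are illustrative assumptions rather than Meta's code. The idea is simply that a migration tool may only touch as many schemas per second as the bucket allows, and anything over the limit is deferred to a later pass.

```python
import time


class TokenBucket:
    """Hypothetical limiter: about `rate` operations per second, bursts up to `capacity`."""

    def __init__(self, rate, capacity, clock=time.monotonic):
        self.rate = float(rate)
        self.capacity = float(capacity)
        self.clock = clock              # injectable so the limiter can be tested
        self.tokens = float(capacity)   # start with a full bucket
        self.last = clock()

    def try_acquire(self, cost=1.0):
        """Take `cost` tokens if available; return False instead of blocking."""
        now = self.clock()
        # Refill in proportion to elapsed time, never beyond capacity.
        self.tokens = min(self.capacity, self.tokens + (now - self.last) * self.rate)
        self.last = now
        if self.tokens >= cost:
            self.tokens -= cost
            return True
        return False


def migrate(schemas, convert, limiter):
    """Convert each schema the limiter admits; return the names that were deferred."""
    deferred = []
    for name in schemas:
        if limiter.try_acquire():
            convert(name)
        else:
            deferred.append(name)  # retry on a later pass rather than pile on load
    return deferred
```

Calling `migrate(names, convert_one, TokenBucket(rate=5, capacity=10))` would convert at most ten schemas in the first burst and hand back the rest for a later run, which is one way to keep a misbehaving tool from snowballing.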

And then there were those pesky users, whose role in planting Tulip in Meta's tech garden necessitated the creation of a migration guide, an instructional video, and a support team.

"Making huge bets such as the transformation of serialization formats across the entire data platform is challenging in the short term, but it offers long-term benefits and leads to evolution over time," the post winds up.

"Designing and architecting solutions that are cognizant of both the technical as well as nontechnical aspects of performing a migration at this scale are important for success," the post adds. "We hope that we have been able to provide a glimpse of the challenges we faced and solutions we used during this process."

Meta's four engineers have probably offered useful insights for those who face similar data-wrangling challenges. The rest of you who have lived through legacy migrations? Maybe less so.

And for everyone else, the insight here is that Meta has become more efficient at wielding exabytes of data. Much of it gathered from, and about, you. ®
