Wednesday, April 8, 2026

Open Libraries for Accelerated Data Processing Enhance A/B Testing for Snap


The features on social media apps like Snapchat evolve nearly as fast as what’s trending. To keep pace, its parent company Snap has adopted open data processing libraries from NVIDIA on Google Cloud services to boost development.

Every new feature rolled out to Snapchat’s more than 940 million monthly active users goes through a set of controlled experiments before it’s launched. During this A/B testing cycle, the development team studies different variables with a subset of users, measuring nearly 6,000 metrics that analyze engagement, app performance and monetization.
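As a rough illustration of what comparing one metric across two groups of users can look like, here is a minimal sketch using Welch's t statistic. This is a generic textbook comparison, not Snap's methodology, which the article does not describe; the function name and sample values are made up for the example.

```python
import math
import statistics


def welch_t(control, treatment):
    """Return (difference in means, Welch's t statistic) for two samples."""
    mean_c = statistics.fmean(control)
    mean_t = statistics.fmean(treatment)
    # Sample variances (n - 1 denominator).
    var_c = statistics.variance(control)
    var_t = statistics.variance(treatment)
    # Standard error of the difference, without assuming equal variances.
    std_err = math.sqrt(var_c / len(control) + var_t / len(treatment))
    diff = mean_t - mean_c
    return diff, diff / std_err


# Hypothetical per-user values of a single engagement metric (invented numbers).
control = [4.0, 5.0, 6.0, 5.0, 5.0]
treatment = [6.0, 7.0, 8.0, 7.0, 7.0]
diff, t_stat = welch_t(control, treatment)
print(f"difference in means: {diff:.2f}, t statistic: {t_stat:.2f}")
```

A real experimentation pipeline would repeat a comparison like this across thousands of metrics and far larger samples, which is where the data processing cost comes from.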

Snap runs hundreds of these experiments each month, processing over 10 petabytes of data within a three-hour window each morning using the Apache Spark distributed framework. By adopting Apache Spark accelerated by NVIDIA cuDF, the company is boosting these data processing workloads on NVIDIA GPUs to achieve 4x speedups in runtime with the same number of machines, providing a cost-effective path to scale.
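To put the morning window in perspective, the figures above imply a sustained aggregate throughput that can be worked out directly. The sketch below assumes decimal petabytes and gigabytes (the article does not specify) and treats "over 10 petabytes" as exactly 10, so the result is a lower bound.

```python
PETABYTE = 10**15  # assumption: decimal petabyte; the article does not say PB vs. PiB
GIGABYTE = 10**9


def required_throughput_gb_per_s(data_bytes, window_hours):
    """Average aggregate throughput needed to finish within the window."""
    seconds = window_hours * 3600
    return data_bytes / seconds / GIGABYTE


throughput = required_throughput_gb_per_s(10 * PETABYTE, 3)
print(f"{throughput:.1f} GB/s sustained across the cluster")
```

That works out to roughly 926 GB/s averaged over the three hours, spread across every machine in the job.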

By pairing NVIDIA’s GPU-optimized software, including NVIDIA CUDA-X libraries, with Google’s infrastructure management services such as Google Kubernetes Engine, Snap is harnessing a full-stack platform for data processing at scale.

“Experimentation is at the core of our company. Changing our data infrastructure from CPUs to GPUs allows us to efficiently scale this experimentation to more features, more metrics and more users over time,” said Prudhvi Vatala, senior engineering manager at Snap. “The more experiments we’re able to run, the more innovative experiences we can deliver for Snapchat users.”

A Sustainable Approach to Scale

Snapchat fans frequently see new features in the app, from arrival notifications to AI-generated stickers, but Snap is also continuously rolling out behind-the-scenes updates such as performance optimizations and compatibility updates for new operating system versions.

The A/B testing for all these new features now runs on cuDF, which allows developers to run existing Apache Spark applications on NVIDIA GPUs with no code changes for easy deployment. The open library for accelerated data processing builds on the power of the NVIDIA cuDF GPU DataFrame library while scaling it for the Apache Spark distributed computing framework.
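"No code changes" generally means the acceleration is switched on through Spark configuration rather than by editing the job itself. The sketch below assembles a spark-submit command that does this, assuming the accelerator in question is the RAPIDS Accelerator plugin for Apache Spark. The configuration keys are the standard Spark and plugin settings, but the jar filename, the job script name and the GPU amounts are hypothetical placeholders, not values from Snap's deployment.

```python
def gpu_spark_submit_args(plugin_jar, job_script):
    """Build a spark-submit command that enables the GPU SQL plugin via config only."""
    conf = {
        # Load the accelerator as a Spark plugin; the job code is untouched.
        "spark.plugins": "com.nvidia.spark.SQLPlugin",
        "spark.rapids.sql.enabled": "true",
        # Illustrative resource settings: one GPU per executor, shared by four tasks.
        "spark.executor.resource.gpu.amount": "1",
        "spark.task.resource.gpu.amount": "0.25",
    }
    args = ["spark-submit", "--jars", plugin_jar]
    for key, value in conf.items():
        args += ["--conf", f"{key}={value}"]
    return args + [job_script]


# Hypothetical jar and script names, for illustration only.
args = gpu_spark_submit_args("rapids-4-spark.jar", "ab_test_job.py")
print(" ".join(args))
```

The same job script can be submitted with or without these settings, which is what makes a side-by-side CPU and GPU comparison straightforward.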

With this migration, the team has realized 76% daily cost savings using NVIDIA GPUs on Google Kubernetes Engine compared with CPU-only workflows, based on Snap internal data collected between January 1 and February 28.

“We were projecting an ambitious roadmap to scale up experimentation that would have blown up our computing costs based on our existing infrastructure,” Vatala said. “Switching to GPU-accelerated pipelines with cuDF gave us a way to flatten the scaling curve, and the results were great.”

To support workload migration, the team also harnessed the cuDF suite of microservices that automatically qualify, test, configure and optimize Spark workloads for GPU acceleration at scale.

Working with NVIDIA experts, the Snap team optimized its pipelines on Google Cloud’s G2 virtual machines powered by NVIDIA L4 GPUs so that they required just 2,100 GPUs running concurrently, versus the initial projection that around 5,500 GPUs would need to run concurrently, according to data Snap collected between January 1 and March 13.
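The size of that reduction follows from the two GPU counts quoted above. A quick calculation, treating the "around 5,500" projection as exact:

```python
def relative_reduction(projected, actual):
    """Fraction by which the actual figure falls below the projection."""
    return (projected - actual) / projected


gpu_reduction = relative_reduction(projected=5500, actual=2100)
print(f"{gpu_reduction:.0%} fewer concurrent GPUs than projected")
```

In other words, the optimized pipelines needed roughly 62% fewer concurrent GPUs than first estimated.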

“When I saw the results of the initial experiments, they were pretty crazy. We saw much greater cost savings than we had expected,” said Joshua Sambasivam, a backend engineer on the A/B testing team. “The Spark accelerator is a perfect fit for our workloads.”

Looking ahead, the Snap team plans to integrate the Spark accelerator beyond the A/B team to a broader range of production workloads.

“We didn’t realize we were sitting on this gold mine,” Vatala said. “We’ve so far migrated our two largest pipelines, but there’s a lot of opportunity ahead.”

Learn more by tuning into Vatala’s session at NVIDIA GTC, taking place Tuesday, March 17 at 1 p.m. PT.

Read more about NVIDIA cuDF and get started with GPU acceleration for Apache Spark.

Main image above courtesy of Snap, depicting an A/B test of its Maps feature.
