Thursday, February 5, 2026

Senior Director of Million-Dollar Regexes – O’Reilly

The following article originally appeared on Medium and is being republished here with the author’s permission.

Don’t get me wrong, I’m up all night using these tools.

But I also sense we’re heading for an expensive hangover. The other day, a colleague told me about a new proposal to route a million documents a day through a system that identifies and removes Social Security numbers.

I joked that this was going to be a “million-dollar regular expression.”

Run the math on the “naïve” implementation with full GPT-5 and it’s eye-watering: a million messages a day at ~50K characters each works out to around 12.5 billion tokens daily, or $15,000 a day at current pricing. That’s nearly $6 million a year to check for Social Security numbers. Even if you migrate to GPT-5 Nano, you still spend about $230,000 a year.
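For anyone who wants to check that arithmetic, here is a rough sketch. The 4-characters-per-token ratio is what the 12.5 billion figure implies. The per-million-token rates are assumptions chosen because they roughly reproduce the numbers above; they are not quoted prices, and output tokens are ignored entirely.

```python
# Back-of-the-envelope cost estimate for the LLM-based redaction proposal.
# CHARS_PER_TOKEN and ASSUMED_RATES are illustrative assumptions, not quoted pricing.

DOCS_PER_DAY = 1_000_000
CHARS_PER_DOC = 50_000
CHARS_PER_TOKEN = 4  # assumed rule of thumb; implied by 50B chars -> 12.5B tokens


def daily_tokens(docs=DOCS_PER_DAY, chars=CHARS_PER_DOC, chars_per_token=CHARS_PER_TOKEN):
    """Input tokens processed per day."""
    return docs * chars / chars_per_token


def daily_cost(usd_per_million_tokens):
    """Daily input-token cost in USD at the given rate."""
    return daily_tokens() / 1_000_000 * usd_per_million_tokens


def yearly_cost(usd_per_million_tokens):
    """Yearly input-token cost in USD, assuming 365 days of traffic."""
    return daily_cost(usd_per_million_tokens) * 365


# Hypothetical input rates in USD per million tokens (assumptions).
ASSUMED_RATES = {"full model": 1.25, "nano model": 0.05}

for name, rate in ASSUMED_RATES.items():
    print(
        f"{name}: {daily_tokens():,.0f} tokens/day, "
        f"${daily_cost(rate):,.0f}/day, ${yearly_cost(rate):,.0f}/year"
    )
```

Under these assumptions the full model lands a little above $15,000 a day and the small model in the low hundreds of thousands of dollars a year, which is the same ballpark as the figures in the text.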

That’s a win. You “saved” $5.77 million a year…

How about running this code for a million documents a day? How much would this cost:

import re; s = re.sub(r"\b\d{3}[- ]?\d{2}[- ]?\d{4}\b", "[REDACTED]", s)
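The one-liner can be unpacked into a small function that is easier to test. This is only a sketch: the name `redact_ssns` and the sample text are made up for illustration, and the pattern is deliberately simple, so it will also redact any standalone nine-digit number that merely looks like a Social Security number.

```python
import re

# Three digits, two digits, four digits, with an optional hyphen or space
# between the groups, bounded by word boundaries on both sides.
SSN_PATTERN = re.compile(r"\b\d{3}[- ]?\d{2}[- ]?\d{4}\b")


def redact_ssns(text: str) -> str:
    """Replace anything shaped like an SSN with a placeholder (hypothetical helper)."""
    return SSN_PATTERN.sub("[REDACTED]", text)


# Made-up sample data: two SSN-shaped strings and one ten-digit order number.
sample = "Applicant SSN: 123-45-6789, alternate 987 65 4321, order #1234567890."
print(redact_ssns(sample))
# Applicant SSN: [REDACTED], alternate [REDACTED], order #1234567890.
```

The ten-digit order number survives because the trailing word boundary cannot fall between two digits.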

A plain old EC2 instance could handle this… A single EC2 instance (something like an m1.small at 30 bucks a month) could churn through the same workload with a regex and cost you a few hundred dollars a year.
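A claim like that is cheap to sanity-check on whatever machine you have. The sketch below times the regex against synthetic 50K-character documents and compares the result with the roughly 11.6 documents per second that a million documents a day requires. The document generator, the function names, and the sample size are all assumptions for illustration; real throughput depends on the hardware and on I/O, which this ignores.

```python
import re
import time

SSN_PATTERN = re.compile(r"\b\d{3}[- ]?\d{2}[- ]?\d{4}\b")

# A million documents spread evenly over the 86,400 seconds in a day.
REQUIRED_DOCS_PER_SECOND = 1_000_000 / 86_400


def make_document(n_chars=50_000):
    """Build a synthetic document of about n_chars with one SSN-shaped string (made-up data)."""
    filler = "Lorem ipsum dolor sit amet. "
    body = filler * (n_chars // len(filler))
    return body + "SSN 123-45-6789"


def measure_docs_per_second(n_docs=200):
    """Time in-memory redaction of n_docs identical documents; no disk or network I/O."""
    doc = make_document()
    start = time.perf_counter()
    for _ in range(n_docs):
        SSN_PATTERN.sub("[REDACTED]", doc)
    elapsed = time.perf_counter() - start
    return n_docs / elapsed


rate = measure_docs_per_second()
print(f"measured: {rate:,.0f} docs/s, required: {REQUIRED_DOCS_PER_SECOND:.1f} docs/s")
```

If the measured rate comfortably exceeds the required one on a small instance, the remaining cost is mostly moving the documents around, not matching them.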

Which means that in practice, companies will be calling people like me in a year saying, “We’re burning a million dollars to do something that should cost a fraction of that. Can you fix it?”

From $15,000/day to $0.96/day. I do think we’re about to see a lot of companies realize that a thinking model connected to an MCP server is far more expensive than just paying someone to write a bash script. Starting now, you’ll be able to make a career out of un-LLM-ifying applications.
