How much does it cost to process 5 million leads per month?

The Claude Code portion (Stages 1, 2, 5, 6) costs a few hundred dollars per month in API tokens. The enrichment stage (Stage 4) is by far the most expensive, running several thousand per month depending on provider rates and how many leads need enrichment. Domain health checks (Stage 3) are essentially free since DNS lookups have no per-query cost. Total pipeline cost is significantly less than what we'd spend on manual labor or per-row SaaS tools at this volume.

Can I build this pipeline without Claude Code?

Yes. Everything Claude Code does can be done with Python scripts, a dedup database, and manual classification. The question is speed and maintenance. Our Python pipeline took 3 engineers to maintain and processed 5K rows per second. The Claude Code pipeline is maintained by 1 person and processes 272K rows per second. If you're processing under 50K leads per month, manual processes or basic scripts might be fine. Above that, the time savings compound fast.

What happens when classification is wrong?

Wrong classification means a lead gets the wrong messaging angle. It's not catastrophic (the lead still gets a relevant email, just not the most relevant one), but it reduces reply rates. We track classification accuracy by reviewing a random sample of 200 leads per week. Current accuracy is around 89% for primary situation tags. Leads that get misclassified usually get tagged with a related but suboptimal situation (e.g., "post-fundraise" when the real situation is "competitive displacement after fundraise"). We use these errors to improve the classification rules quarterly.

How do you handle GDPR and data privacy in the pipeline?

All processing happens locally. Lead data never leaves our infrastructure during Stages 1, 2, 5, and 6 (Claude Code processes files on our machines, not in the cloud). Stage 3 sends only domain names for DNS lookups (no personal data). Stage 4 (enrichment) uses GDPR-compliant providers. We maintain suppression lists per client and honor opt-out requests across all campaigns within 24 hours. Cross-client dedup also prevents someone who opted out of Client A's campaigns from being contacted by Client B.

What data providers do you use for Stage 4 enrichment?

We use a waterfall approach with multiple providers because no single provider has complete coverage. The specific providers rotate based on accuracy testing we run quarterly. What matters more than which provider you use is the waterfall architecture itself. Check Provider A first (highest accuracy for your market). If no result, try Provider B. Then C. Then verify all found emails. This approach typically gets us to 85% to 92% valid email coverage on any given list, compared to 60% to 70% from a single provider.

B2B Prospecting

How We Process 5 Million Leads Per Month Using Claude Code

March 11, 202614 min read

Mitchell Keller

Founder & CEO, LeadGrow · Managed 1,626+ cold email campaigns. 4.1% average reply rate. Booked 2,230+ meetings in 2025.

TL;DR

**We process 5 million+ leads per month across 11 clients.** The pipeline handles CSV merge, deduplication, domain health checks, waterfall enrichment, situation classification, ICP scoring, and campaign assignment. All automated.
**Claude Code + Bun scripts process at 272K rows per second for classification tasks.** The same work in Clay takes 27 hours for 1M rows. Traditional ETL tools break on messy, inconsistent CSV formats. Claude Code handles the chaos.
**The pipeline catches problems before they cost money.** Invalid domains, duplicate contacts across clients, misclassified job titles, and dead companies get filtered before a single email is sent. That filtering is why our bounce rate stays under 2% across 1,626+ campaigns.

By Mitchell Keller, Founder & CEO, LeadGrow. Managed 1,626+ cold email campaigns. 4.1% average reply rate. 2,230+ meetings booked in 2025.

The lead processing problem nobody talks about

Every cold email guide focuses on the same things. Copy. Subject lines. Follow up sequences. Deliverability. These matter. But they assume your lead data is clean, accurate, and properly segmented.

It almost never is.

When you buy a list from Apollo, ZoomInfo, or any data provider, you're getting raw material. Job titles are inconsistent ("VP Sales" vs "Vice President of Sales" vs "VP, Revenue"). Email addresses are partially verified at best. Company data is months old. Domains are sometimes parked, sometimes dead, sometimes catch-all servers that accept everything and bounce later.

Send to a dirty list and three things happen. Your bounce rate spikes (killing sender reputation). Your reply rate tanks (because half the list isn't in your ICP). And you burn through addressable market that you can't contact again without looking like spam.

We learned this the hard way in our first year. Now we process every lead through a 6-stage pipeline before it touches a campaign. That pipeline processes 5 million+ leads per month across our 11 client accounts.

The 6-stage pipeline

Every lead that enters our system passes through these stages in order. No exceptions.

Stage	What It Does	Speed	Failure = Reject
1. Merge and Normalize	Combine CSVs, standardize columns, fix formatting	272K rows/sec	No (fix and continue)
2. Deduplicate	Remove dupes within list and across all active campaigns	150K rows/sec	Yes (duplicate = skip)
3. Domain Health	MX records, catch-all detection, parked domain check	Network-bound (~15 min for 100K)	Yes (red domains = reject)
4. Waterfall Enrichment	Find/verify emails, fill missing company data	Provider-dependent	Yes (no valid email = reject)
5. Situation Classification	Assign buying situation tags based on signals	272K rows/sec	No (unclassified = manual review)
6. ICP Scoring and Campaign Assignment	Score fit, assign to correct campaign and sequence	272K rows/sec	Yes (below threshold = hold)

Stages 1, 2, 5, and 6 run on Claude Code + Bun scripts locally. Stage 3 is a custom Bun script that parallelizes DNS lookups. Stage 4 uses Clay and waterfall enrichment providers via API.

Let me walk through each one.

Stage 1: Merge and normalize

Lead data comes from everywhere. Apollo exports. ZoomInfo downloads. LinkedIn Sales Navigator lists. Custom scrapes. Client CRM exports. Each source formats data differently.

A typical week might include:

3 Apollo exports with different column names ("Company Name" vs "Organization" vs "Account")
1 LinkedIn export with connection data mixed into the lead fields
2 custom scrapes from industry databases with non-standard formatting
1 client CRM dump with 47 columns, 40 of which we don't need

Claude Code handles this because it understands context. When you tell it "merge these 7 CSVs into one clean list with columns: first_name, last_name, email, company, title, domain, linkedin_url," it figures out the column mapping from each source automatically.

The normalization step is where most manual processes fail. Job title normalization alone has hundreds of variations. "VP of Sales," "Vice President Sales," "VP, Sales and Marketing," "Sales VP," "Vice President, Revenue." These all need to map to standardized categories so our targeting logic works downstream.

Claude Code generates a Bun script that handles title normalization using pattern matching and fuzzy logic. The script runs at 272K rows per second because it's local computation (no API calls, no network latency). For a 500K row merge, the entire stage takes about 2 minutes.

What gets fixed in this stage

Column name standardization across sources
Job title normalization to canonical categories
Company name cleanup (removing Inc., LLC, Ltd. variations for matching)
Phone number formatting
URL cleanup (adding https://, removing trailing slashes)
Empty row removal
Character encoding issues (UTF-8 normalization)

Stage 2: Deduplicate

Deduplication sounds simple. Remove rows with the same email address. Done.

It's not that simple when you're running campaigns for 11 clients simultaneously.

We deduplicate at three levels:

Within the current list. Data providers sell the same contact under multiple entries (different scraped profiles, different data snapshots). A 50K list from Apollo might have 2K to 5K internal duplicates.
Across the client's active campaigns. If a contact is already in an active sequence for this client, sending them a different campaign creates a terrible experience. "Didn't I just get an email from you about something else?"
Across all clients. If Client A is already emailing a contact and Client B wants to email the same person, that's a conflict. The contact gets two cold emails from companies using the same infrastructure, which is a deliverability risk and a brand risk.

Level 3 is the one most agencies skip. They don't have the systems to check across clients because each client's data lives in a separate tool. Our pipeline checks every new lead against every active campaign across all 11 clients.

Claude Code handles this by maintaining a deduplication index. Every email that enters any campaign gets logged. New leads are checked against this index before moving to Stage 3. The check runs at 150K rows per second using a Bun script with an in-memory hash map.

Typical dedup results for a 100K row import:

Internal duplicates removed: 3K to 8K (3% to 8%)
Cross-campaign duplicates flagged: 1K to 5K (1% to 5%)
Cross-client conflicts flagged: 200 to 1K (0.2% to 1%)

That means 5% to 14% of a typical list gets filtered before we even check if the data is valid. Without this stage, those contacts get double-emailed or cross-contaminated. Both are campaign killers.

Stage 3: Domain health

Sending to a dead domain wastes volume and hurts sender reputation. Every bounced email is a signal to mail servers that you're not sending to people who want your mail.

Our domain health check runs three tests per domain:

MX record lookup. Does the domain have mail servers configured? No MX record means nobody can receive email at that domain. Immediate reject.
Catch-all detection. Does the domain accept mail for any address? Catch-all domains accept "[email protected]" without bouncing. This means the specific email address we have might not exist. It just won't bounce. Send to enough non-existent addresses at catch-all domains and your sender reputation drops.
Parked domain check. Is the domain actually being used by a real company, or is it parked/for sale? Parked domains sometimes still have MX records from previous owners.

Each domain gets classified:

Green: Valid MX, not catch-all, active company. Safe to send.
Yellow: Catch-all domain. Send with caution, lower priority, smaller volume.
Red: No MX, parked, or known problematic. Do not send.

The DNS lookups are network-bound, but Claude Code generates a Bun script that runs 50 to 100 concurrent lookups. A 100K unique domain list processes in about 15 minutes. The script caches results so repeated domains (500 employees at the same company = 1 domain check, not 500) don't waste lookups.

Typical health check results:

Green: 70% to 80%
Yellow: 10% to 15%
Red: 8% to 15%

That 8% to 15% red rate means roughly 1 in 10 contacts on any purchased list would bounce or land in a dead inbox. Filter those out and your bounce rate stays under 2% (ours averages 1.4% across all campaigns).

Stage 4: Waterfall enrichment

This is the one stage that doesn't run primarily on Claude Code. Enrichment requires external data providers, and each provider has different coverage, accuracy, and pricing.

Our waterfall enrichment approach:

Primary provider checks the email first (highest accuracy, highest cost)
If no result, secondary provider tries (good accuracy, lower cost)
If still no result, tertiary provider tries (broadest coverage, lower accuracy)
All found emails run through verification (is this a real, active inbox?)

Claude Code orchestrates this waterfall by generating the API calls, managing rate limits, and consolidating results. But the actual enrichment happens in Clay and via direct API calls to providers. For a deeper look at this approach, see our guide to waterfall enrichment.

We also enrich company-level data at this stage. Revenue estimates, employee count, tech stack, recent funding, hiring activity. This data feeds into the situation classification in Stage 5.

Enrichment is the slowest and most expensive stage. For 100K leads, it takes 4 to 8 hours and costs vary significantly by provider. The key insight: everything we do in Stages 1 through 3 reduces the number of leads that need enrichment. If we filter 15% of a 100K list before enrichment, we save the enrichment cost on 15,000 leads.

Stage 5: Situation classification

This is our secret weapon. Every other stage is hygiene. This stage is intelligence.

Situation classification takes the enriched lead data and assigns buying situation tags. These tags determine which campaign, sequence, and messaging angle each lead receives.

Example situations we classify for a typical B2B SaaS client:

Post-fundraise growth pressure: Company raised in last 6 months, hiring aggressively, likely under pressure to hit new growth targets
Founder-led sales ceiling: Company under 50 employees, no dedicated sales team, founder still in every deal
Failed outbound attempt: Previously used an outbound agency or tool (detectable via tech stack data or job posting patterns), likely skeptical but also proven they want outbound to work
Competitive displacement: Using a competitor product, showing signs of dissatisfaction (G2 reviews, support forum posts, job postings mentioning migration)
Expansion trigger: New office, new market entry, new product launch indicating they need to reach new audiences

Claude Code classifies leads into these situations by analyzing the enriched data holistically. It's not simple rule-based logic ("if raised Series B, tag as post-fundraise"). It considers multiple signals together and weights them.

A company that raised 8 months ago, has been hiring SDRs for 3 months, but recently posted a VP of Sales job might be classified as "outbound investment not working" rather than "post-fundraise growth." The combination of signals tells a different story than any individual signal.

Claude Code handles this classification at 272K rows per second because we pre-generate the classification logic as a Bun script. The script applies weighted rules that Claude Code designed based on our historical data about which signal combinations correlate with the highest reply rates.

Output: every lead gets 1 to 3 situation tags, a confidence score, and a recommended messaging angle. Leads with low confidence scores get flagged for manual review.

Stage 6: ICP scoring and campaign assignment

The final stage scores each lead against the client's ICP definition and assigns it to the right campaign.

ICP scoring factors:

Company fit: Industry, size, revenue, growth stage (weighted 40%)
Contact fit: Title, seniority, department, decision-making authority (weighted 30%)
Situation fit: How well their situation matches our strongest messaging angles (weighted 20%)
Data quality: How complete and verified is their contact information (weighted 10%)

Leads scoring above the threshold (varies by client, typically 70+) get assigned to an active campaign. Leads scoring 50 to 70 go into a nurture queue. Leads below 50 get held (not rejected permanently, but not contacted until we have better data or a new angle).

Campaign assignment considers:

Which situation tag matches which active campaign
Current volume per campaign (don't overload a campaign that's already at capacity)
Client-specific rules (some clients want enterprise leads in Campaign A, mid-market in Campaign B)
Time-sensitive factors (event-based campaigns need leads before the event, not after)

The scoring and assignment runs at 272K rows per second. For a typical weekly batch of 200K leads across all clients, the entire Stage 6 takes under a minute.

Why Claude Code instead of traditional ETL

We tried building this pipeline with traditional tools. Python scripts with pandas. Airflow for orchestration. dbt for transformations. Custom SQL for deduplication.

Three problems killed that approach:

Schema drift. Lead data formats change constantly. Apollo updates their export format. A new data provider uses different column names. A client sends a CRM export with custom fields. Traditional ETL pipelines break when the schema changes. Claude Code adapts because it reads the data and figures out the mapping.
Edge cases everywhere. Job titles have infinite variations. Company names have legal suffixes, DBA names, parent/subsidiary relationships. Address formatting differs by country. Traditional scripts need explicit handling for every edge case. Claude Code handles edge cases by understanding context.
Iteration speed. When we want to add a new classification rule or change the scoring weights, a traditional pipeline requires code changes, testing, deployment. With Claude Code, we describe what we want and it updates the processing script in minutes.

The speed difference is real. Our traditional Python pipeline processed about 5K rows per second with pandas. Claude Code's generated Bun scripts hit 272K rows per second. That's a 54x improvement, and it matters when you're processing millions of leads per month.

The results

Since building this pipeline:

Bounce rate: Dropped from 4.2% to 1.4% average across all campaigns
Reply rate: 4.1% average (industry average is 1% to 3%)
Meetings booked: 2,230+ in 2025
Processing time: A full 500K lead batch goes from raw CSV to campaign-ready in under 2 hours (including enrichment wait time). Previously took 2 to 3 days.
Cross-client conflicts: Zero since implementing cross-client dedup (previously happened 2 to 3 times per month)

The pipeline isn't magic. It's just fast, consistent data processing applied to every lead before we send. The 4.1% reply rate comes from clean data, accurate targeting, and situation-specific messaging. The pipeline makes all three possible at scale. For the full tooling breakdown, see our outbound sales tech stack guide.

Frequently Asked Questions

b2b-prospecting

Want us to run this playbook for you?

Book a strategy call and we'll show you how these frameworks apply to your business.

Book Strategy Call →