Cyberax AI Playbook
cyberax.com
How-to · Communications & Customer Work

Outbound prospecting research at SDR scale

If your sales team sends 30+ outbound emails per rep per day, the math has shifted under you. This is the research pipeline that pulls company news, hiring signals, recent funding, and tech-stack changes per prospect, then generates first-touch outreach that actually references something specific. The architecture, the data sources, and the deliverability discipline that doesn't burn your sending domain.

At a glance — last verified May 2026
  • Problem solved · Run high-volume outbound prospecting with per-prospect research — company news, hiring signals, recent funding, tech-stack changes — and generate first-touch outreach that references something genuinely specific, at a volume that previously required templated outreach
  • Best for · SDR teams, outbound-focused RevOps, founders running their own outbound, agencies running prospecting for clients
  • Tools · Claude, GPT-4o, Clay, Apollo, ZoomInfo, BuiltWith, PhantomBuster, LinkedIn Sales Navigator
  • Difficulty · Advanced
  • Cost · $1–$5 per prospect researched (multi-source enrichment + LLM), or $100–$500/seat/month bundled with Clay / Apollo / Outreach
  • Time to set up · 2–4 weeks for the research pipeline; 1–2 months including personalisation generation and deliverability tuning

If you run a sales-development team, the outbound math hasn’t worked since around 2022. Reply rates on templated outreach dropped from “respectable single digits” to “fractions of a percent” as inboxes filled with the same Apollo-templated pitch.

The honest sales-leader response was to lean into research — find a real signal per prospect, write a message that referenced it specifically, send one targeted email instead of 200 templated ones. The equally honest observation that followed: it works, but the manual version doesn't scale past 10–15 prospects per rep per day, which is too low to keep an SDR pipeline producing.

The fix is a research pipeline. Pull the signals automatically (company news, recent hiring, funding events, tech-stack changes, conference attendance, leadership changes). Let the model identify which signal is most relevant for your specific value proposition. Generate first-touch outreach that references the signal in a way that lands. Volume goes back up, personalisation stays real, reply rate stays defensible.

The rest of this guide covers the production version — data sources, orchestration, personalisation generation, and the deliverability discipline that keeps the sending domain healthy at scale.

When to use

Where this fits — and where it doesn't

Use this if you have an SDR motion sending 30+ outbound emails per rep per day, your ideal customer profile has identifiable signals (companies hiring for X role, companies that just raised Y round, companies using Z technology), and your current reply rates have decayed to the point that the math is breaking. Common fits: B2B SaaS sales orgs with defined ICP, services firms with clear trigger events, agencies running outbound-as-a-service.

Don’t use this if your ICP is too broad to define signals against (you’re not sure what makes a “good” prospect), you’re in a category where outbound is fundamentally low-leverage (some commoditised products), or you don’t have the deliverability infrastructure to support high-volume sending (warmup, multiple sending domains, monitoring). For the last case, fix the infrastructure first — the research pipeline produces personalised emails that still fail if the domain is on a spam list.

Prerequisites

What you'll need before starting

  • A defined ICP with concrete signal types — “VP of Engineering at SaaS companies between 50 and 500 employees that just raised Series B in the last 90 days” beats “tech companies.” The signals are what the pipeline searches for.
  • Data sources for the signals you care about: Clay, Apollo, ZoomInfo, BuiltWith, Crunchbase, LinkedIn Sales Navigator, news APIs. Each signal type lives in a different source; multi-source enrichment is the norm.
  • Sales engagement platform (Outreach, Salesloft, Apollo) for the send mechanics, with multiple sending domains warmed up for the volume you plan to send.
  • A model API key with web-search or browse capability for the long-tail signals not in structured databases. Claude, GPT, and Gemini all have versions of this.
  • Brand-voice and ICP-message samples — what your best prospecting emails look like and what value props resonate with each segment of your ICP.
The solution

Six steps from prospect list to personalised first touch

  1. Define the signal taxonomy — what counts as a buying trigger for your ICP

    Map signals to your value prop. If you sell DevTools to engineering teams, signals might be: hiring a new VP Eng, growing engineering headcount more than 30% in the last quarter, recently funded (cash to spend), known competitor whose product they currently use. If you sell to marketing, signals are different: new CMO, just acquired another company, recent product launch. Lock the signal taxonomy — 5–8 signal types is enough — before building the data pipeline.
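
A locked taxonomy is easiest to enforce as a small data structure that the rest of the pipeline imports. A minimal sketch below — the signal names, descriptions, and weights are illustrative assumptions for a DevTools vendor, not a recommended set:

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class SignalType:
    name: str           # stable identifier used for tagging and reporting
    description: str    # what the enrichment step should look for
    base_weight: float  # prior on value-prop fit, tuned quarterly (step 6)

# Hypothetical taxonomy: 5 signal types, within the 5-8 range suggested above.
TAXONOMY = [
    SignalType("new_vp_eng", "New VP of Engineering hired in last 90 days", 0.9),
    SignalType("eng_headcount_growth", "Engineering headcount +30% last quarter", 0.7),
    SignalType("recent_funding", "Series A-C raised in last 90 days", 0.8),
    SignalType("competitor_in_stack", "Known competitor detected in tech stack", 0.85),
    SignalType("leadership_change", "C-level change announced", 0.5),
]

def lookup(name: str) -> SignalType:
    """Resolve a signal-type name back to its definition."""
    return next(s for s in TAXONOMY if s.name == name)
```

Keeping the taxonomy frozen in one place means the scoring, tagging, and reporting steps all reference the same names, which is what makes the quarterly tuning loop in step 6 possible.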

  2. Build multi-source enrichment per prospect

    For each prospect, run the data pulls: company news (from a news API or web search), hiring signals (LinkedIn Sales Navigator, job-board postings, Clay’s job-posting tracker), funding (Crunchbase, PitchBook), tech-stack changes (BuiltWith, HG Insights), leadership changes (LinkedIn, news). Each source returns structured data; the orchestration tool (Clay, Apollo) is usually where this happens. Budget for tool subscriptions; data is the input-quality lever.
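
The fan-out shape matters more than the vendor list: sources fail independently, and a timeout on one pull shouldn't drop the prospect. A sketch under stated assumptions — the fetcher names and return shapes are stand-ins, not real vendor SDK signatures:

```python
from typing import Callable

# Stand-in fetchers; in production each wraps a vendor call
# (news API, Crunchbase, BuiltWith, etc.).
def fetch_news(domain):       return [{"type": "company_news", "headline": f"{domain} launches v2"}]
def fetch_funding(domain):    return [{"type": "recent_funding", "round": "Series B"}]
def fetch_tech_stack(domain): raise TimeoutError("vendor timeout")  # sources do fail

SOURCES: dict[str, Callable] = {
    "news": fetch_news,
    "funding": fetch_funding,
    "tech_stack": fetch_tech_stack,
}

def enrich(domain: str) -> dict:
    """Run every source for one prospect; record per-source errors instead of aborting."""
    record = {"domain": domain, "signals": [], "errors": {}}
    for source, fetch in SOURCES.items():
        try:
            record["signals"].extend(fetch(domain))
        except Exception as exc:
            record["errors"][source] = str(exc)  # surfaced for monitoring, not fatal
    return record
```

The errors dict is the monitoring hook: a source that starts failing across many prospects is a data-quality incident, not a per-prospect problem.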

  3. Score and select the best signal per prospect

    Most prospects will have multiple signals; the strongest one is what the outreach should reference. Score each signal by recency (more recent = stronger), specificity (a named person or product = stronger than a generic trend), and value-prop fit (a signal that directly suggests your product’s value = stronger). Use the model to score and select; the output is one signal-plus-context per prospect that becomes the personalisation anchor.
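
The three scoring dimensions above can be sketched as a weighted sum. The weights, the 90-day recency horizon, and the field names are assumptions to tune against reply data; in production the fit score often comes from an LLM call, while here it's a precomputed field:

```python
from datetime import date

# Illustrative weights; tune against reply data (step 6).
WEIGHTS = {"recency": 0.4, "specificity": 0.3, "fit": 0.3}

def recency_score(signal_date: date, today: date, horizon_days: int = 90) -> float:
    """Linear decay from 1.0 (today) to 0.0 (horizon or older)."""
    age = (today - signal_date).days
    return max(0.0, 1.0 - age / horizon_days)

def score(signal: dict, today: date) -> float:
    return (WEIGHTS["recency"] * recency_score(signal["date"], today)
            + WEIGHTS["specificity"] * signal["specificity"]  # 1.0 = named person/product
            + WEIGHTS["fit"] * signal["fit"])                 # value-prop relevance

def best_signal(signals: list[dict], today: date) -> dict:
    """The personalisation anchor: highest-scoring signal for the prospect."""
    return max(signals, key=lambda s: score(s, today))
```

A fresh, well-fitting funding signal will beat an older but more specific hiring signal under these weights — which is exactly the kind of trade-off the step 6 loop should confirm or overturn with real reply data.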

  4. Generate the first-touch outreach — reference the signal specifically

    The email should reference the specific signal (not “I saw your company is growing” — “I saw you brought on Pat as VP Eng last month — congrats”), tie it to your value prop in one sentence, and ask one specific question. Avoid generic-personalisation phrasing (“I was impressed by your recent…”), which reads as templated even when the underlying signal is real. Keep total length under 100 words; the longer the message, the more it pattern-matches as templated outbound.
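
The length cap and the banned-phrase rule are mechanical enough to enforce as a post-generation guardrail rather than trusting the prompt alone. A minimal sketch — the banned-phrase list is illustrative and should be grown from your own reply data:

```python
import re

# Generic-personalisation phrasing that reads as templated (illustrative list).
BANNED = [
    r"i was impressed by",
    r"i noticed your recent",
    r"it looks like your company",
    r"hope this email finds you",
]

def check_draft(body: str, max_words: int = 100) -> list[str]:
    """Return a list of violations; an empty list means the draft passes."""
    problems = []
    if len(body.split()) > max_words:
        problems.append(f"over {max_words} words")
    lowered = body.lower()
    for pattern in BANNED:
        if re.search(pattern, lowered):
            problems.append(f"banned phrase: {pattern}")
    return problems
```

Drafts that fail go back for regeneration with the violation named in the prompt; drafts that pass go to the sequencer. Cheap to run on every email, and it catches the exact phrasing the FAQ below identifies as the AI tell.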

  5. Apply deliverability discipline — multiple domains, send-rate limits, content checks

    High-volume outbound burns domains fast. Use multiple sending domains warmed up appropriately; rotate sends across domains; cap per-domain daily send volume; monitor inbox placement weekly. The content side: avoid links in the first email, no attachments, no images, plain-text formatting. The most personalised email in the world hits zero replies from the spam folder.

  6. Track signal-to-reply correlation — tune the taxonomy

    Log which signals correlate with replies, meetings booked, and pipeline created. Some signals will outperform — funding events typically beat hiring signals; leadership-change signals work for some products and not others. Tune the signal taxonomy quarterly: keep the high-performing signals, drop the low-performing ones, test new ones. Without this loop, the pipeline runs but the relevance flat-lines; with it, signal quality compounds over quarters.
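
The tuning loop only works if every send is tagged with the signal it referenced. The rollup itself is trivial; a minimal sketch with illustrative log rows (the reply rates below are made-up inputs, not benchmarks):

```python
from collections import Counter

def reply_rate_by_signal(log: list[dict]) -> dict[str, float]:
    """log rows: {"signal": str, "replied": bool} -> reply rate per signal type."""
    sent, replied = Counter(), Counter()
    for row in log:
        sent[row["signal"]] += 1
        replied[row["signal"]] += row["replied"]  # bool counts as 0/1
    return {sig: replied[sig] / sent[sig] for sig in sent}

# Illustrative quarter of tagged sends: funding outperforms leadership changes.
log = (
    [{"signal": "recent_funding", "replied": True}] * 9
    + [{"signal": "recent_funding", "replied": False}] * 91
    + [{"signal": "leadership_change", "replied": True}] * 2
    + [{"signal": "leadership_change", "replied": False}] * 98
)
rates = reply_rate_by_signal(log)
```

Extend the same rollup to meetings booked and pipeline created, and the quarterly prune of the taxonomy becomes a report, not a debate.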

The numbers

What it costs and what to expect

Per-prospect research cost (multi-source enrichment + LLM) $1–$5 per prospect at typical depth
Orchestration platform cost (Clay, Apollo, Outreach with research bundle) $100–$500 per seat per month
Prospects researchable per SDR per day 50–200 with full pipeline, vs 10–15 manual
Reply rate — templated outbound (baseline) Under 2% typical; trending lower
Reply rate — signal-personalised outbound 5–12% typical with well-tuned pipelines
Meeting-booked rate (replies that convert) 20–40% typical — varies by signal type and message quality
Deliverability rate with proper hygiene 90–98% inbox placement on warm domains
Domain burnout time without rotation discipline 6–12 weeks before reputation degrades materially
Time to first working pipeline 2–4 weeks including data-source integrations
Time to fully tuned (signals + voice + deliverability) 1–2 months

The reply-rate multiple is the headline. The deliverability commitment is the most-often-underestimated cost — research-driven outbound that burns a domain is worse than templated outbound on a healthy domain.

Alternatives

Other ways to solve this

Bundled outbound platforms (Clay, Apollo, Outreach with research features). Increasingly bundle research and personalisation alongside engagement. Right answer for most teams — the platforms handle data integration and engagement at once. Trade-off: per-seat cost adds up at scale, less control over signal weighting.

Pure manual research with templated send. SDRs do the research themselves and send through a templated tool. High quality per prospect; doesn’t scale past 10–15 per rep per day. Pairs well with the AI pipeline for the tier of prospects worth full manual research; the AI pipeline handles the bulk.

LinkedIn-centric outbound (Sales Nav + LinkedIn messaging). Different channel, different mechanics. LinkedIn outreach has different deliverability dynamics; the research pattern is similar. Some teams run LinkedIn-first with email as a follow-up channel.

ABM with marketing-led outreach (6sense, Demandbase). Account-level rather than contact-level approach. Targets companies through display ads, content, and intent data before sales outreach. Complement rather than alternative to outbound prospecting; the two layers compose for some companies.

What's next

Related work

For the follow-up email pipeline that comes after first-touch reply, see Sales follow-up sequences with CRM context. For the call-analysis pipeline that feeds insights back into prospecting strategy, see Sales-call coaching at scale. For the CRM-hygiene pipeline that keeps prospect data clean, see CRM data hygiene at scale. For the AI-tells problem in generated content, see First-draft marketing copy without the AI tells.

Common questions

FAQ

How is this different from what Clay or Apollo's built-in AI does?

Functionally similar — Clay especially is built for exactly this pattern. The build-vs-buy decision depends on volume, signal customisation needs, and integration complexity. For most teams, Clay or Apollo is the faster path; custom builds make sense at large volumes or when the signal taxonomy doesn't fit what the platforms support.

What about the legal side of pulling all this data per prospect?

Publicly available data (LinkedIn profiles, news, funding announcements) is legally fine in most jurisdictions. Web scraping has more nuance — terms of service, regional regulations (GDPR, CCPA). Use the structured-data vendors (Clay, ZoomInfo, Apollo) rather than scraping where possible; the vendors handle the compliance layer. For EU prospects specifically, GDPR requirements apply and the data-handling discipline is stricter.

How do we prevent prospects from realising it's AI-personalised?

Better signals plus better voice. The AI tell isn't that the signal is real (good outbound has always involved research); the tell is the generic-personalisation phrasing ("I noticed your recent...", "It looks like your company is..."). Voice guardrails kill these phrases; specific signal references with specific language read as genuinely researched. Detection is mostly about phrase patterns, not about the underlying personalisation depth.

What about multi-touch sequences after the first email?

The research pipeline produces the first-touch personalisation; follow-up emails in the sequence reference the same signal plus any new ones that emerge. Don't templatise the follow-ups; the value of the research is sustained across the sequence, not just in email one. See sales follow-up sequences with CRM context for the follow-up generation pattern.

How do we measure if specific signals are actually working?

Tag every outbound email with the signal it referenced. Track reply rate, meeting-booked rate, and pipeline-created rate per signal type. After a quarter, the data tells you which signals are converting and which are noise. The taxonomy gets pruned and expanded based on real performance, not vendor pitches.


Change history (1 entry)
  • 2026-05-13 Initial publication.