Step 2 in depth

Public data. Twice a day. Traced to source.

Once your ICP and trigger thesis are defined, the signal monitor runs. Twice daily, it scans the open web for the specific events that mean your buyer is in-market — and lands the result, with citations, in your dashboard before you've finished your morning coffee.

Each run, end to end

The pipeline.

Twice a day, at 8 AM and 4 PM Eastern, the monitor wakes up and runs the same six steps. You don't see the machinery; you see the curated queue it produces. Here's what happens between cron tick and dashboard refresh (a sketch of the loop in code follows the list).

  1. Load recent signals as dedup state. The last 14 days of captured signal URLs are loaded so the same story doesn't surface twice.
  2. Issue 8–18 structured web search queries. Each query is a precise question — written during Step 1 to match your trigger thesis — executed through Anthropic Claude's web_search tool against the major public search indexes.
  3. Filter raw results. Drop anything matching the dedup list. Drop anonymous mentions ("a Fortune 500 financial services firm"). Drop pure vendor-marketing pieces that don't name a real customer.
  4. Classify into a bucket. Each remaining hit is mapped to one of your buying-cycle buckets (A, B, C, D — see below) and assigned a priority: HIGH, MEDIUM, or LOW.
  5. Score against your ICP. The account behind the signal is scored on your ICP rubric — industry fit, size band, region — to assign a Tier: S, 1, 2, 3, or 4.
  6. Write to the dashboard. The signal lands with: company name, headline, bucket, priority, tier, prioritization reason, and 1–3 verified public-source URLs. Every row is one click from its origin.
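
In code, the loop is small. Below is a minimal sketch of one run, with every pipeline-specific function (run_query, classify, score_icp, write_row) passed in as a placeholder; the names are assumptions for illustration, not the production internals.

    from dataclasses import dataclass, field

    @dataclass
    class Hit:
        """One raw web-search result (field names are illustrative)."""
        company: str            # empty string if the story names no company
        headline: str
        url: str
        source_urls: list = field(default_factory=list)  # 1-3 verified URLs

    def run_monitor(queries, seen_urls, run_query, classify, score_icp, write_row):
        """One cron tick: search -> filter -> classify -> score -> write.
        seen_urls is step 1: signal URLs captured in the last 14 days."""
        # Step 2: issue the 8-18 structured queries written during Step 1.
        raw_hits = [hit for q in queries for hit in run_query(q)]

        for hit in raw_hits:
            # Step 3: drop duplicates and anonymous mentions.
            if hit.url in seen_urls or not hit.company:
                continue
            # Step 4: map to a buying-cycle bucket and assign a priority.
            bucket, priority = classify(hit)
            # Step 5: score the account on the ICP rubric (tier S/1/2/3/4).
            tier = score_icp(hit.company)
            # Step 6: land the row, with its citations, in the dashboard.
            write_row(hit, bucket=bucket, priority=priority, tier=tier)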
The bucket framework

Buying-cycle stages, mapped to signal types.

Not every signal means the same thing. A company that just hired a Chief AI Officer is at a very different stage than one that just published an MCP server on GitHub. The bucket framework — built with you during Step 1 — assigns each signal a place in the buying journey so the dashboard surfaces them in priority order.

Here's what one client's partial bucket framework looks like in production (anonymized example from a current engagement targeting enterprise AI deployment teams):

Bucket A: Active deployment
The company is rolling out the specific technology your product secures, governs, or extends. Highest priority — they have a live runtime problem.

Bucket B: Platform-building
The company is building or publishing related internal infrastructure (MCP servers, agent platforms, governance toolkits). Self-identified buyers.

Bucket C: Hiring signals
The company is hiring roles that imply they're about to build or operate what you offer. A leading indicator that often precedes deployment by weeks or months.

Bucket D: Strategic commitment
The earliest stage. CAIO appointments, AI Center of Excellence launches, earnings-call AI commitments. Get in before the specs get written.

Bucket D is the strategic sweet spot. Getting in front of a buyer before they pick their tool stack means your wedge can shape the requirements, not fit into an existing spec sheet.
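
One natural way to hand a framework like this to the classifier is as plain configuration. The sketch below encodes the anonymized example above; the structure and key names are assumptions, since every client's framework is built fresh during Step 1.

    # The anonymized bucket framework above, encoded as classifier config.
    # Key and field names are assumptions for illustration.
    BUCKETS = {
        "A": {"stage": "Active deployment",
              "looks_like": "rolling out the technology your product secures"},
        "B": {"stage": "Platform-building",
              "looks_like": "publishing MCP servers, agent platforms, toolkits"},
        "C": {"stage": "Hiring signals",
              "looks_like": "hiring roles that imply a build-out is coming"},
        "D": {"stage": "Strategic commitment",
              "looks_like": "CAIO hire, AI CoE launch, earnings-call commitment"},
    }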

What the queries look like

Precise, not just "AI rollout".

Each bucket gets its own set of web search queries, written precisely enough that they return signal — not noise. A representative sample from the same client engagement:

Bucket A — example queries

"rolled out Cursor" OR "deployed Cursor" OR "Cursor company-wide" "Claude Code" enterprise rollout OR deployment OR "company-wide" "GitHub Copilot Enterprise" rollout OR deployment OR "engineering organization" Windsurf OR Cognition Devin enterprise rollout OR adoption announcement

Bucket D — example queries

"Chief AI Officer" OR "Head of AI" appointed OR named OR hired OR joins "AI Center of Excellence" launched OR announced OR established OR "AI CoE" "Microsoft 365 Copilot" enterprise rollout OR "all employees" OR "company-wide" earnings call "AI productivity" OR "AI cost takeout" OR "AI efficiency" "AI transformation" program OR initiative OR strategy announced

These aren't generic. They're tailored to your ICP, your wedge, and the actual language buyers and analysts use when this signal appears in public. We refine them quarterly based on which queries are producing high-signal hits and which are returning noise.
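
For the curious, here's roughly what executing one Bucket D query looks like against Anthropic's Messages API with the web_search tool enabled. Treat it as a sketch: the model string and max_uses value are placeholders, and the production prompt is more structured than this.

    import anthropic

    client = anthropic.Anthropic()  # reads ANTHROPIC_API_KEY from the environment

    QUERY = '"Chief AI Officer" OR "Head of AI" appointed OR named OR hired OR joins'

    response = client.messages.create(
        model="claude-sonnet-4-5",  # placeholder; pin whichever model you run
        max_tokens=2048,
        tools=[{
            "type": "web_search_20250305",  # Anthropic's server-side search tool
            "name": "web_search",
            "max_uses": 5,
        }],
        messages=[{
            "role": "user",
            "content": f"Search the web for: {QUERY}\n"
                       "For each hit, return the named company, the headline, "
                       "and the source URL. Skip anonymous mentions.",
        }],
    )

    # Results arrive as cited content blocks in response.content; each
    # citation carries the URL that becomes the signal's verified source link.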

Where signals actually come from

Source categories — and where each one shines.

Public data isn't one thing. It's nine distinct kinds of data, each with its own latency, fidelity, and noise profile. Knowing which source catches which signal is most of the methodology. Examples by source type:

  • Vendor customer pages: Anthropic, Microsoft, OpenAI, Cursor customer-story pages naming enterprise adopters by name.
  • Corporate engineering blogs: first-party posts from Cloudflare, Stripe, Spotify, Netflix Tech — the company tells you what they're doing in their own words.
  • Press releases: Business Wire, PR Newswire, corporate newsrooms — formal announcements that hit the wire on a known schedule.
  • Trade press: TechCrunch, The Information, CIO Dive, Insurance Journal, regional business journals. Where the rumor becomes a story.
  • Earnings transcripts: The Motley Fool, Seeking Alpha, Stock Titan, company IR pages. Where executives say what they're committing to.
  • SEC filings: 10-K, 8-K, proxy statements via SEC EDGAR. Where the legal language reveals strategic intent before the press release does.
  • Job boards (public indexed): Greenhouse, Lever, Ashby, Workday public job pages, indexed via search. Leading indicators — they hire before they buy.
  • GitHub repositories: public repos open-sourced by enterprise engineering teams. Tells you what they're actually building.
  • Conference talks: AI Engineer Summit, KubeCon, vendor user-group case studies. Practitioners talking shop.
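
Because each category behaves differently, it helps the classifier to know which kind of source a URL came from. A minimal sketch, assuming a hand-maintained domain map; the domains below are seeded from the examples above, and the production mapping is larger.

    from urllib.parse import urlparse

    # Domain -> source category, seeded from the list above (illustrative only).
    SOURCE_CATEGORIES = {
        "businesswire.com": "Press release",
        "prnewswire.com":   "Press release",
        "techcrunch.com":   "Trade press",
        "fool.com":         "Earnings transcript",
        "seekingalpha.com": "Earnings transcript",
        "sec.gov":          "SEC filing",
        "github.com":       "GitHub repository",
    }

    def source_category(url: str) -> str:
        host = urlparse(url).netloc.lower().removeprefix("www.")
        return SOURCE_CATEGORIES.get(host, "Corporate blog / other")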
Data integrity

Every signal traces back to its source.

The credibility of the entire pipeline depends on what the LLM is not allowed to do. The classifier ships with a strict ruleset, sketched in code after this list:

  • No invented numbers. If employee count or revenue can't be confirmed from a public source, the signal goes to the Watch tier — not guessed into Large or Mid.
  • No anonymous mentions. "A Fortune 500 financial services firm" gets dropped during filtering. Every retained signal names a specific buying organization.
  • No vendor-marketing puff. Roundup articles ("10 best AI coding tools") get dropped unless they reference a real, named enterprise customer.
  • Verified URLs only. Every signal ships with 1–3 source URLs. Click any signal in the dashboard to land on the underlying public article.
  • 14-day URL deduplication. The same news story doesn't appear twice in your dashboard. If a new outlet covers a previously-captured story, we keep the existing record.
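
Concretely, most of the ruleset reduces to cheap checks applied to every hit before anything reaches the LLM's judgment. A sketch, with illustrative field names (company, is_roundup, names_customer) and a deliberately simplified anonymous-mention pattern:

    import re

    # One illustrative anonymous-mention pattern; the production list is longer.
    ANONYMOUS = re.compile(
        r"\ba\s+(Fortune 500|large|leading|major)\b[^.]*"
        r"\b(firm|company|enterprise)\b",
        re.IGNORECASE,
    )

    def keep_signal(hit, seen_urls) -> bool:
        """Return True only if a hit survives every rule above."""
        if hit.url in seen_urls:                      # 14-day URL dedup
            return False
        if not hit.company or ANONYMOUS.search(hit.headline):
            return False                              # no anonymous mentions
        if hit.is_roundup and not hit.names_customer:
            return False                              # no vendor-marketing puff
        if not 1 <= len(hit.source_urls) <= 3:
            return False                              # verified URLs only
        return True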

If we can't show you the source, the signal doesn't make the dashboard. There are no inferences passed off as facts.

What this system does NOT do

The constraints that make the output trustworthy.

Some of the most common shortcuts in outbound prospecting create legal exposure, ethical compromise, or both. We don't take them. Specifically:

  • No web scraping. Every hit is a public URL returned by a major search index. We don't bypass paywalls or scrape authenticated content.
  • No LinkedIn scraping or API misuse. Job postings surface only via publicly indexed pages — never via LinkedIn's authenticated graph.
  • No private intelligence feeds in the monitor. ZoomInfo, Apollo, Cognism, Crunchbase API — none are queried by the trigger monitor itself. (Clay enrichment downstream uses Clay's own waterfall providers, but that's a separate pipeline with its own consent posture.)
  • No third-party email scrapers in this layer. Contact discovery happens in a clearly-separated enrichment pipeline.
  • No fixed publication subscriptions. The system doesn't poll specific RSS feeds. Source diversity is a function of what the search index returns on any given run — which means you're not over-indexed on any single outlet's editorial slant.
Cadence & output

Transparent operations.

  • Cadence: 2×/day, 8 AM & 4 PM ET
  • Volume: 5–10 accounts surfaced per day
  • Latency: <24h from publication to dashboard

Most major announcements land in your dashboard the same day they break — usually within the next cron cycle. Niche conference talks or regional trade-press pieces take longer (multi-day) because of indexing delays. The 14-day dedup window means you'll never see the same story twice; you'll see it once, when it first hits.

Cadence is configurable

Twice daily is the default. For high-volume ICPs we run hourly. For long-cycle enterprise plays we run once daily. The trade-off is API cost and signal freshness — we calibrate to your buying cycle.
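
Expressed as cron schedules (evaluated in Eastern time), the three standard cadences look like this; the labels are illustrative:

    # The three standard cadences as cron expressions, evaluated in
    # America/New_York. Labels are illustrative.
    CADENCES = {
        "default":     "0 8,16 * * *",  # twice daily: 8 AM and 4 PM
        "high_volume": "0 * * * *",     # hourly, for high-volume ICPs
        "long_cycle":  "0 8 * * *",     # once daily, for long-cycle plays
    }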

Want to see your trigger framework in action?

Step 2 is where your hypothesis becomes a working signal feed. If you've got the trigger thesis, we'll show you what landing in your dashboard looks like.

Book a 30-minute intro →