§ 01 — Blueprint

Composable CDP Architecture on GCP

A modular, cloud-native Customer Data Platform built on Google Cloud — combining BigQuery as the unified data warehouse, Pub/Sub for real-time event streaming, Dataflow for transformation, and Cloud Functions for activation — with full consent, identity, and governance layers.

[Architecture diagram: five layers, left to right]
- INGESTION: Web SDK (GA4 · Tag Manager · Tealium · sGTM), Mobile SDK (iOS · Android · Flutter · React Native), CRM/ERP (Salesforce · SAP · HubSpot · Dynamics), Offline (POS · Call Centre · Email · Direct Mail), Streaming (IoT · Kafka · Clickstream · Webhook), 3P Data (LiveRamp · Acxiom · Experian · Dun & B), Ad Platforms (Google · Meta · Amazon · SA360 · DV360)
- STREAM: Cloud Pub/Sub (real-time event bus, 10M+ msgs/sec), Dataflow (Apache Beam transforms: enrich · dedupe · validate), Cloud Functions (consent check · rules · schema validation), BigQuery ML (propensity · LTV scoring · audience segmentation)
- STORAGE: BigQuery (event tables · profile store · audience · audit tables), Firestore (real-time profile lookup · session state · consent), Identity Graph (golden record · cross-device · household ID · collision rules), Cloud Storage (raw event archive · model artefacts · exports)
- GOVERN: Data Catalog (schema registry · lineage · PII classification · tags), Consent Engine (GDPR · PECR · TCF 2.2), Access Control (IAM · VPC-SC · CMEK), Egress Throttle (rate limits · DLP · audit log), Blocking Rules (suppression · DSR · quarantine)
- ACTIVATE: Google Ads (Customer Match · RLSA), Meta Ads (CAPI · Custom Audiences), Amazon DSP (AMC · Sponsored Ads), SA360 (Floodlight · bid adjust), moEngage (push · email · in-app), Amplitude (cohorts · experiments), AppsFlyer (MMP · SKAN · PBA)
🔵 GCP Services Used
BigQuery · Pub/Sub · Dataflow · Cloud Functions · Firestore · Cloud Run · GCS · Secret Manager · Data Catalog · DLP API · Cloud IAM · VPC-SC · CMEK · Vertex AI · Cloud Composer
📐 Architecture Principles
Warehouse-Native
BigQuery is the system of record — not a copy of data
Composable Layers
Each layer is independently replaceable (no vendor lock-in per layer)
Privacy First
Consent, blocking, and DLP enforced before egress — not after
⚡ SLA Targets
Stream event ingestion: < 200ms
Profile unification: < 2s
Audience export: < 4hrs
DSR deletion: < 72hrs
API egress P99: < 500ms
§ 02 — Integration

Offline → Online Signal Bridging

Transform in-store, call-centre, and direct mail interactions into addressable digital signals — using deterministic ID matching, BigQuery computed events, and real-time profile enrichment via Firestore.

[Data-flow diagram]
POS / Store (transaction_id · email), Call Centre (phone · crm_id), Email Opens (hashed_email · campaign), Direct Mail (postcode · ref_code)
→ BigQuery offline_events (raw · normalised · SHA-256 hashed IDs)
→ ID Match (email → gaid/idfa, phone → device, Firestore lookup)
→ Golden Record (merged profile, offline_score=HIGH, purchase_intent=0.87)
→ Google Ads Customer Match · Meta CAPI Offline Conversions · moEngage Triggered Push
ℹ️ All PII (email, phone, postcode) is SHA-256 hashed at source before entering BigQuery. Raw PII lives only in Cloud KMS-encrypted Secret Manager references used during match. Match keys are never stored in plaintext.
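Because hashing happens at the edge in multiple producers, the normalisation rules have to be byte-identical everywhere. A minimal sketch of a shared helper (function names are illustrative, not part of the platform):

```python
import hashlib

def normalise_email(raw: str) -> str:
    """Lowercase and trim: the same rules every SDK and batch job must apply."""
    return raw.strip().lower()

def hash_identity(raw: str) -> str:
    """SHA-256 hex digest of the normalised value, as stored in BigQuery."""
    return hashlib.sha256(normalise_email(raw).encode("utf-8")).hexdigest()
```

A single stray space or uppercase character yields a completely different digest, which surfaces downstream only as a silently lower match rate.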

🛒 POS Transaction — Raw Offline Event

BigQuery: cdp_raw.offline_events

-- Arriving from POS system via Cloud Pub/Sub
{
  "event_id": "evt_pos_20240317_0091234",
  "event_type": "purchase_offline",
  "event_ts": "2024-03-17T14:23:11Z",
  "source": "store_manchester_001",
  "store_id": "STR_MAN_001",
  "transaction_id": "TXN_2024_443920",
  "revenue_gbp": 149.99,
  "items": [
    { "sku": "PROD_SHOE_NK_001", "qty": 1, "price": 149.99 }
  ],
  "identity": {
    "email_sha256": "a3d9c...f84e2",
    "phone_sha256": "b7f12...c93a1",
    "loyalty_id": "LYL_0092837"
  },
  "enriched": false,
  "consent_flags": { "marketing": true, "analytics": true }
}

✅ Enriched Profile — Post Identity Match

Firestore: profiles/{uid}

{
  "uid": "usr_8f2a9c_unified",
  "created_at": "2023-09-01T09:00:00Z",
  "updated_at": "2024-03-17T14:23:15Z",
  "identity_keys": {
    "email_sha256": "a3d9c...f84e2",
    "gaid": "4f3a2b1c-...",
    "idfa": "E621E1F8-...",
    "loyalty_id": "LYL_0092837",
    "crm_id": "SF_CONTACT_00X9"
  },
  "offline_signals": {
    "last_store_visit": "2024-03-17",
    "total_offline_ltv": 1247.82,
    "purchase_frequency": 3.2,
    "preferred_store": "manchester"
  },
  "scores": {
    "churn_risk": 0.12,
    "purchase_intent": 0.87,
    "ltv_percentile": 94
  },
  "audiences": [
    "high_value_offline",
    "footwear_buyer",
    "q1_active",
    "win_back_exclude"
  ]
}
🔄 BigQuery Computed Events — SQL Logic
-- Scheduled daily via Cloud Composer
CREATE OR REPLACE TABLE cdp.computed_offline_online_bridge AS
WITH matched AS (
  SELECT
    o.event_id,
    o.event_ts,
    o.revenue_gbp,
    p.uid AS unified_uid,
    p.gaid,                              -- resolved from Firestore
    p.idfa,
    'purchase_offline' AS online_event_type,
    o.store_id,
    o.items
  FROM cdp_raw.offline_events o
  JOIN cdp.identity_map p
    ON o.identity.email_sha256 = p.email_sha256
    OR o.identity.loyalty_id = p.loyalty_id
  WHERE o.enriched = false
    AND o.consent_flags.analytics = true
)
SELECT *, CURRENT_TIMESTAMP() AS processed_at
FROM matched;
⚙️ Cloud Function — Real-time Enrichment Trigger
# Triggered by Pub/Sub message (Eventarc CloudEvent)
import base64
import json

import functions_framework
from google.cloud import firestore, bigquery

@functions_framework.cloud_event
def enrich_offline_event(cloud_event):
    # Pub/Sub payloads arrive base64-encoded inside the CloudEvent envelope
    data = json.loads(base64.b64decode(cloud_event.data["message"]["data"]))
    email_hash = data['identity']['email_sha256']

    # 1. Lookup identity in Firestore
    db = firestore.Client()
    profile = db.collection('profiles') \
        .where('identity_keys.email_sha256', '==', email_hash) \
        .limit(1).get()

    if profile:
        uid = profile[0].to_dict()['uid']

        # 2. Write enriched event to BQ
        bq = bigquery.Client()
        row = {**data, "unified_uid": uid, "enriched": True}
        bq.insert_rows_json("cdp.enriched_offline_events", [row])

        # 3. Update profile last_seen
        profile[0].reference.update({
            "offline_signals.last_store_visit": data["event_ts"]
        })
⚠️ Match Rate Reality: Expect 55–75% deterministic match rate for email-hashed offline events. Below that threshold, probabilistic matching via postcode + purchase pattern clustering (BigQuery ML KMEANS) is used — flagged as match_confidence=PROBABILISTIC and excluded from PII-sensitive activations.
Use Case | Offline Signal | Online Channel | Logic | Platform
In-store Win-Back | Last purchase > 90 days | Paid Social | Exclude recent buyers, target lapsed | Meta CAPI
Cross-sell Footwear → Apparel | footwear_buyer=true | Display / Search | Bid +40% for apparel queries | SA360 + GAds
Offline LTV → Digital ROAS | ltv_percentile > 80 | Customer Match | High-value seed audience for lookalike | Google Ads
Call Centre Recovery | complaint_logged=true | In-App Message | Send apology + voucher within 2hrs | moEngage
Post-Purchase NPS | purchase within 24hrs | Email / Push | Trigger satisfaction survey | moEngage
Direct Mail Responders | ref_code scanned | Search + Display | Suppress DM audience; shift budget online | SA360
Loyalty Tier Upgrade | cumulative_ltv > £500 | Push Notification | Platinum tier promotion | moEngage + Amplitude
§ 03 — Activation

Data Connectors & Egress

Activate BigQuery audiences and Firestore computed events into the full stack of paid media, analytics, and engagement platforms — with consent gating, rate limiting, and schema validation on every outbound pipe.

🔵
Google Ads
Paid Search · Display · YouTube
LIVE · BATCH
Methods: Customer Match v2 API, RLSA, Display & Video 360 Audience Sharing

Auth: OAuth2 service account + Google Ads API token

Rate Limit: 100 req/min · 1M rows/upload

Latency: 6–24hrs match propagation

Consent: Requires explicit marketing consent + EEA consent mode v2

PII: SHA-256 email, phone, address only
🔷
Meta Ads (CAPI)
Facebook · Instagram · Threads
LIVE · BATCH
Methods: Conversions API (CAPI) server-side, Custom Audiences

Auth: System User access token (never user token)

Rate Limit: 200 events/sec per pixel · 10K users/batch

Dedup: event_id + event_source_url required

Consent: data_processing_options must be set for LDU

Limitation: 7-day attribution window max for iOS14+
🟠
Amazon DSP / AMC
DSP · Sponsored · AMC
BATCH
Methods: Amazon Marketing Cloud (SQL) + DSP audience upload via S3

Auth: IAM role with AMC dataset access

Rate Limit: 1 BQ export/day · max 10M rows

Latency: 24–48hrs

Limitation: AMC data cannot be used to suppress audiences in real-time — batch only

Format: SHA-256 email → S3 → AMC match
🔴
Search Ads 360
Enterprise Search Management
LIVE · BATCH
Methods: Floodlight Activities, Audience Lists, Bid Adjustments via API v0

Auth: Service account + DV360 linking

BQ Native: SA360 → BigQuery export via Google Ads Data Hub

Latency: Real-time floodlight · 4–6hrs audience

Consent: Inherits Google Ads consent mode signals
💜
moEngage
Multi-channel Engagement
LIVE · STREAM
Methods: Events API v1, User Attribute API, Segment Sync

Auth: App ID + Secret Key (stored in Secret Manager)

Rate Limit: 500 events/sec · 2K users/bulk call

Trigger: Pub/Sub → Cloud Run → moEngage Events API

Channels: Push, Email, SMS, In-App, WhatsApp

Limitation: No native BQ connector — requires Cloud Function bridge
🟡
Amplitude
Product Analytics · Cohorts
LIVE · BATCH
Methods: HTTP API v2, Identify API, Cohort Sync via Amplitude Data

BQ Native: Amplitude → BQ export (one-way); BQ → Amplitude via HTTP

Rate Limit: 30 events/sec per key

Dedup: insert_id required to prevent duplicates

Limitation: Cohort sync latency 15–30 min; no real-time push from BQ
🟢
AppsFlyer
MMP · SKAN · PBA
LIVE · BATCH
Methods: S2S Events API, People-Based Attribution, Data Locker → GCS

Auth: Dev key per app + S2S auth token

BQ: Data Locker writes raw attribution to GCS → BQ external table

Constraint: DSR deletion must go via AppsFlyer API — CDP cannot directly delete AF data

SKAN: Postbacks received via callback URL → Cloud Function → BQ
📊
BigQuery (Internal)
Computed Offline Events
STREAM · BATCH
Pattern: BQ scheduled query → audience table → export trigger

Firestore: BQ → Dataflow → Firestore for real-time profile lookup

Rate: Streaming inserts: 1GB/s sustained

Cost: Storage: £0.016/GB/mo · Query: £4.50/TB

Limitation: BQ streaming inserts not instantly queryable (30-90s delay)

Connector Egress Architecture

[Egress flow diagram]
BigQuery (audience tables · scheduled queries)
→ Egress Gate (consent check · rate throttle · schema validate)
→ Cloud Run connector services (per-destination retry + DLQ)
→ Google Ads Customer Match · Meta CAPI S2S events · SA360 Floodlight · moEngage Events API · AppsFlyer S2S + Data Locker
→ Audit trail (BQ: egress_audit_log · Cloud Audit Logs · Cloud Monitoring)
📋 Egress Throttle Config
# Cloud Run connector config (per destination)
egress_config:
  google_ads:
    max_rps: 10
    daily_row_limit: 1_000_000
    retry_attempts: 3
    backoff_strategy: exponential
    dlq_topic: cdp-egress-dlq-gads
  meta_capi:
    max_events_per_sec: 200
    batch_size: 50
    dedup_window_hrs: 48
    retry_on_codes: [429, 500, 503]
  moengage:
    max_rps: 500
    bulk_limit: 2000
    timeout_ms: 5000
    circuit_breaker: enabled
  appsflyer:
    s2s_rps: 100
    data_locker_sync: daily_02:00_UTC
    skan_postback_validation: strict
⚠️ Meta CAPI: Always send event_id as SHA-256(session_id + event_name + timestamp). Without this, dedup at Meta fails silently — you will see inflated conversion counts with no error.
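The construction above can be sketched in a few lines. The helper name is hypothetical, but the recipe (SHA-256 over session_id + event_name + timestamp) is the one the warning describes:

```python
import hashlib

def capi_event_id(session_id: str, event_name: str, timestamp: str) -> str:
    """Deterministic dedup key for Meta CAPI. The browser pixel and the
    server-side event must derive it identically, or Meta cannot collapse
    the duplicate pair."""
    raw = f"{session_id}{event_name}{timestamp}"
    return hashlib.sha256(raw.encode("utf-8")).hexdigest()
```

Because the inputs are the same on both sides of the send, the key is stable across retries, which is exactly what destination-level dedup needs.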
🚨 AppsFlyer DSR: When a user submits a deletion request, you MUST call the AppsFlyer GDPR API directly. Your CDP cannot delete data from AppsFlyer's servers — only AF can. SLA: 72hrs.
SA360 + BQ Native: Use Ads Data Hub (ADH) as the privacy-safe clean room — compute audiences in BigQuery through ADH, then push back to SA360 without exposing raw user IDs.
ℹ️ Amplitude Dedup: Always populate insert_id with event_id from your schema. Amplitude de-duplicates on this field within a 7-day window — without it, retry storms create phantom events.
§ 04 — Identity

Identity Resolution & Collision Strategy

Deterministic and probabilistic matching across web, mobile, offline, and CRM identity spaces — with a full collision resolution framework covering merge, split, override, and quarantine strategies.


A walkthrough of how Marcus gets identified and cross-device stitched in real time — starting from a tablet session, then progressing through a desktop login. Every step maps to a live Firestore write, sGTM decision, and CDP profile update.

Step 1 — Existing Mobile Session
Marcus previously signed in on his tablet. Firestore already holds document Profile_M1 linking Tablet_iPad7 to his authenticated account ID marcus_id_447. One device, one profile, already stitched.
Marcus's Tablet
📱
Device ID: Tablet_iPad7
User ID: marcus_id_447
master_id: Profile_M1
State: Identified ✓
sGTM + CDP Engine
Server-Side Tag Manager · Cloud Function
🗄️
Firestore
Document: Profile_M1
EXISTS
aliases: [
Tablet_iPad7 // mobile device
marcus_id_447 // auth account
Desktop_Win11 // new device ← arrayUnion
]
master_id: "Profile_M1"
Marcus's Desktop
💻
Device ID: Desktop_Win11
User ID: null
master_id: null
State: Anonymous
Confidence Tiers:
- HIGH — Deterministic (email, phone, CRM, loyalty)
- MEDIUM — Device ID (GAID, IDFA, IDFV)
- MEDIUM-LOW — On-Device Signals (fingerprint, SDK)
- LOW — Probabilistic (cookie, postcode)
- HOUSEHOLD — Address + Email + Phone cluster
[Identity graph diagram: golden record uid usr_8f2a9c_unified, fed by]
- Deterministic (HIGH): email SHA-256 (a3d9c...f84e2), phone SHA-256 (b7f12...c93a1), loyalty ID (LYL_0092837), CRM ID (SF_CONTACT_00X9)
- Device IDs (MEDIUM): GAID (4f3a2b1c-9d8e-...cd), IDFA (E621E1F8-C36C-...F8), IDFV (A1B2C3D4-...7890)
- Probabilistic (LOW): fingerprint hash (canvas + audio + font), first-party cookie (_ga=2.123...789)
- On-device SDK signals: screen resolution (1920×1080 · DPR 2.0), OS version (iOS 17.2 · Android 14), network carrier (EE UK · Vodafone · O2), session cadence / TZ offset, user agent (Chrome 120 · WebKit), language / locale (en-GB · Europe/London)
- Household profile: household ID (HH_SW1A_2AA_0091), primary email as household anchor, hashed postcode as cluster key, linked member UIDs (parent · spouse · child)
- Firestore: real-time deterministic feed, master_id → aliases[ ], sub-10ms profile lookup

Collision Scenarios

Collision Type | Trigger | Resolution
Shared Device | Same GAID, 2+ email logins within 2hrs | SPLIT into separate profiles; GAID marked shared
Merged Households | Same postcode + loyalty_id collision | PARENT/CHILD household graph created
Data Conflict | BQ and CRM disagree on email | PRIORITY RULE: CRM > offline > web (configurable)
Ghost Profile | Cookie profile never matched deterministically | QUARANTINE after 90 days inactivity
ID Theft Signal | Same email, 5+ different devices in 1hr | FREEZE + alert + manual review queue
Consent Mismatch | Profile A consents, Profile B (same person) opts out | OPT-OUT WINS always — merged profile inherits lowest consent
Late-Arriving Offline | Offline event arrives 7 days after online session | RETROACTIVE MERGE — re-process historical audiences
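The OPT-OUT WINS rule from the table reduces to a logical AND per consent purpose when two profiles merge. A minimal sketch (helper name illustrative):

```python
def merge_consent(a: dict, b: dict) -> dict:
    """Opt-out wins: the merged profile inherits the lowest consent per
    purpose. A purpose absent from either side is treated as not consented."""
    return {
        purpose: a.get(purpose, False) and b.get(purpose, False)
        for purpose in set(a) | set(b)
    }
```

Treating a missing flag as False is the conservative choice: a merge can only ever narrow consent, never widen it.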

Resolution Decision Tree

NEW EVENT ARRIVES (with identity signals)
├─ Deterministic match? (email · phone · loyalty_id)
│   ├─ YES → COLLISION CHECK (shared device? consent conflict?)
│   │   ├─ clean → MERGE — update Golden Record
│   │   └─ collision → RESOLVE + MERGE
│   └─ NO → Probabilistic? (device · postcode · behaviour)
│       ├─ YES → PROBABILISTIC LINK (match_confidence=LOW)
│       └─ NO → ANONYMOUS PROFILE
ℹ️ Every merge, split, or collision resolution action writes an immutable snapshot to cdp.merge_audit_log. The rollback system reads this log to reconstruct any prior profile state — point-in-time recovery to within 1 second of any event.

Rollback Architecture

Snapshot-on-Write
Before every merge/split, the current profile state is serialised to cdp.profile_snapshots (BQ + GCS backup). This is the rollback source of truth — not a diff, a full copy.
Audit Log Replay
The merge audit log is append-only and immutable (Cloud Storage WORM bucket). Any sequence of merge events can be replayed or reversed chronologically using the rollback_identity Cloud Function.
Fallback Triggers
Automatic rollback triggers: fraud_freeze event (auto-reverts last 3 merges), False Positive match score detected (<0.65 on review), DSR-linked merge (must undo before deletion), manual override via admin UI.
Downstream Re-sync
After rollback, all connected destinations (Google Ads, Meta, moEngage) receive a delete_user + re-add signal with the corrected profile. Audiences are refreshed within the next scheduled export window (max 4hrs).
Split Rollback
If a "shared device" split was incorrect (e.g., same person using two browsers), the two child profiles can be re-merged using the original parent snapshot, restoring all event history and audience memberships.

🔁 Rollback Cloud Function

# rollback_identity.py — Cloud Function
import uuid
from datetime import datetime

from google.cloud import bigquery, firestore

def rollback_identity(request):
    data = request.get_json()
    uid = data['uid']
    target = data.get('rollback_to_ts')   # ISO timestamp
    reason = data['reason']               # fraud|false_pos|dsr|manual

    bq = bigquery.Client()
    db = firestore.Client()

    # 1. Fetch snapshot closest to target timestamp
    #    (parameterised query — never interpolate request input into SQL)
    query = """
        SELECT snapshot_json, snapshot_ts
        FROM cdp.profile_snapshots
        WHERE uid = @uid AND snapshot_ts <= @target
        ORDER BY snapshot_ts DESC
        LIMIT 1
    """
    job = bq.query(query, job_config=bigquery.QueryJobConfig(
        query_parameters=[
            bigquery.ScalarQueryParameter('uid', 'STRING', uid),
            bigquery.ScalarQueryParameter('target', 'TIMESTAMP', target),
        ]))
    rows = list(job.result())
    if not rows:
        return {'error': 'no snapshot found'}, 404
    snapshot = rows[0]['snapshot_json']

    # 2. Restore profile in Firestore (full overwrite, not a merge)
    db.collection('profiles').document(uid).set(snapshot, merge=False)

    # 3. Mark downstream for re-sync
    bq.insert_rows_json('cdp.rollback_queue', [{
        'uid': uid,
        'reason': reason,
        'rolled_back_to': target,
        'queued_at': datetime.utcnow().isoformat(),
        'destinations': ['google_ads', 'meta_capi', 'moengage', 'amplitude'],
        'status': 'PENDING'
    }])

    # 4. Log audit entry
    bq.insert_rows_json('cdp.merge_audit_log', [{
        'audit_id': str(uuid.uuid4()),
        'action_type': 'ROLLBACK',
        'source_uid': uid,
        'resolution': reason,
        'event_ts': datetime.utcnow().isoformat()
    }])

    return {'status': 'rolled_back', 'uid': uid}, 200

Rollback Scenario Matrix

Scenario | Trigger | Rollback Method | Scope | SLA | Auto?
Wrong Merge — False Positive | match_confidence < 0.65 on review | Restore last snapshot; split UIDs back | Both merged profiles | 1 hr | Semi-auto
Fraud Freeze Triggered | 5+ devices in 1hr anomaly | Auto-revert last 3 merge operations | Target profile only | 5 min | Auto
Incorrect Shared Device Split | Manual admin review | Re-merge child UIDs from parent snapshot | Child profiles + events | 4 hrs | Manual
Consent Override Error | Opt-out cascade applied to wrong UID | Restore consent flags from snapshot; re-add to audiences | Consent fields only | 30 min | Auto
DSR Merge Pre-Deletion | DSR received for a merged profile | Roll back merge to identify original UID scope; then delete only that scope | Pre-merge UIDs | 72 hrs | Semi-auto
Retroactive Merge — Wrong Offline Event | Source system corrects offline transaction | Remove enrichment; undo BQ computed event; re-run match | Enriched events only | 24 hrs | Manual
Schema Version Mismatch | New schema breaks profile shape | Revert identity_map schema; restore from BQ snapshot | Entire profile table | 2 hrs | Semi-auto
🚨 Critical Constraint: Rollback does NOT undo data already delivered to ad platforms. Once an email hash has been uploaded to Google Ads Customer Match or Meta Custom Audiences, the downstream platform must be notified separately via its own removal API. The CDP rollback system queues these removal calls automatically, but propagation takes 6–48hrs depending on the destination.
BigQuery: cdp.identity_map

-- Identity Graph Table Schema
CREATE TABLE cdp.identity_map (
  uid               STRING NOT NULL,     -- synthetic UUID
  created_at        TIMESTAMP NOT NULL,
  updated_at        TIMESTAMP NOT NULL,

  -- Deterministic keys
  email_sha256      STRING,
  phone_sha256      STRING,
  loyalty_id        STRING,
  crm_id            STRING,
  subscriber_id     STRING,

  -- Device identifiers
  gaid              STRING,              -- Android Ad ID
  idfa              STRING,              -- iOS Ad ID
  idfv              STRING,              -- iOS Vendor ID
  web_cookie_id     STRING,              -- First-party cookie

  -- Match metadata
  match_confidence  STRING,              -- HIGH | MEDIUM | LOW
  match_method      STRING,              -- DETERMINISTIC | PROBABILISTIC
  matched_keys      ARRAY<STRING>,
  is_household      BOOL DEFAULT false,
  parent_uid        STRING,              -- for household graphs
  is_frozen         BOOL DEFAULT false,  -- fraud flag

  -- Consent (lowest wins on merge)
  consent_marketing BOOL NOT NULL,
  consent_analytics BOOL NOT NULL,
  consent_updated   TIMESTAMP
)
PARTITION BY DATE(created_at)
CLUSTER BY email_sha256, loyalty_id;
BigQuery: cdp.merge_audit_log

-- Every merge/split action recorded immutably
CREATE TABLE cdp.merge_audit_log (
  audit_id         STRING NOT NULL,
  event_ts         TIMESTAMP NOT NULL,
  action_type      STRING,          -- MERGE | SPLIT | FREEZE | QUARANTINE
                                    -- | CONSENT_OVERRIDE | RETROACTIVE_MERGE
  source_uid       STRING,          -- before state
  target_uid       STRING,          -- after state
  triggering_event STRING,          -- event_id that caused the merge
  match_keys_used  ARRAY<STRING>,
  confidence_score FLOAT64,

  -- Collision resolution
  collision_type   STRING,          -- SHARED_DEVICE | CONSENT_MISMATCH ...
  resolution       STRING,          -- which rule was applied
  consent_before   JSON,
  consent_after    JSON,

  -- Accountability
  triggered_by     STRING,          -- service account / job name
  reviewable       BOOL,            -- requires human sign-off?
  reviewed_by      STRING
)
PARTITION BY DATE(event_ts)
OPTIONS (require_partition_filter = true);
⚠️ Known Limitations
Apple ATT / iOS 14.5+
IDFA is opt-in only. Expect <30% IDFA availability on iOS. Fall back to IDFV + email for cross-device matching.
Third-Party Cookie Deprecation
Chrome is phasing out third-party cookies (3PC). Web probabilistic matching degrades significantly. A first-party data strategy is mandatory.
Hashed Email Match Rates
Cross-platform SHA-256 match requires SAME normalisation (lowercase, trimmed). Even minor formatting differences cause match failures — 5–15% loss common.
Shared Family Devices
Single GAID/IDFA used by multiple family members is a structural problem. No perfect solution — use household flagging and exclude from personalisation targeting.
BQ Streaming Insert Delay
BQ streaming inserts are not immediately queryable — 30–90 second delay. Identity merge jobs reading from BQ must account for this lag.
✅ Best Practices
Normalise Before Hashing
Always lowercase + strip whitespace before SHA-256. Store normalisation function in Cloud Functions (single source of truth).
Never Store Raw PII in BQ Alongside Ad Identifiers
PII (email, phone) and GAID/IDFA must not coexist in the same column grouping — use column-level encryption with Cloud KMS.
Opt-Out Cascade
When a profile opts out, cascade deletion/suppression to ALL linked device IDs and ALL downstream activation destinations simultaneously.
Confidence Score Routing
Only route DETERMINISTIC matches to regulatory-sensitive destinations (Google Ads Customer Match). Use PROBABILISTIC only for internal analytics.
Immutable Audit Log
Every merge event logged to merge_audit_log with require_partition_filter=true. Never update or delete — WORM pattern for regulatory compliance.
§ 05 — Schema & Governance

Event Schema, Deduplication & Governance Rules

The complete event contract — from JSON schema validation and deduplication logic to blocking rules, access control, and egress throttle configuration.


📋 Master CDP Event JSON Schema

{
  "$schema": "https://json-schema.org/draft/2020-12/schema",
  "$id": "cdp-event-v3.json",
  "title": "CDP Universal Event",
  "type": "object",
  "required": ["event_id", "event_type", "event_ts", "source", "consent"],
  "properties": {
    "event_id": {
      "type": "string",
      "pattern": "^evt_[a-z]{3,10}_\\d{8}_\\d{7}$",
      "description": "Globally unique. Used for dedup."
    },
    "event_type": {
      "type": "string",
      "enum": ["page_view", "add_to_cart", "purchase", "purchase_offline",
               "identify", "sign_up", "login", "app_open", "push_click",
               "call_centre_contact", "store_visit", "email_open",
               "email_click", "refund", "product_view", "search", "custom"]
    },
    "event_ts": {
      "type": "string",
      "format": "date-time",
      "description": "ISO 8601 UTC. Client time only."
    },
    "source": {
      "type": "object",
      "required": ["channel", "platform"],
      "properties": {
        "channel": { "enum": ["web", "ios", "android", "offline", "api", "email"] },
        "platform": { "type": "string" },
        "store_id": { "type": "string" }
      }
    },
    "identity": {
      "type": "object",
      "minProperties": 1,
      "properties": {
        "uid": { "type": "string" },
        "email_sha256": { "type": "string", "pattern": "^[a-f0-9]{64}$" },
        "phone_sha256": { "type": "string", "pattern": "^[a-f0-9]{64}$" },
        "gaid": { "type": "string", "format": "uuid" },
        "loyalty_id": { "type": "string" }
      }
    },
    "consent": {
      "type": "object",
      "required": ["analytics", "marketing", "version"],
      "properties": {
        "analytics": { "type": "boolean" },
        "marketing": { "type": "boolean" },
        "version": { "type": "string", "description": "TCF consent string v2" },
        "updated_at": { "type": "string", "format": "date-time" }
      }
    },
    "properties": {
      "type": "object",
      "additionalProperties": true,
      "maxProperties": 50
    },
    "revenue": {
      "type": "object",
      "properties": {
        "amount": { "type": "number", "minimum": 0 },
        "currency": { "type": "string", "pattern": "^[A-Z]{3}$" },
        "tax": { "type": "number" }
      }
    }
  },
  "additionalProperties": false
}

✅ Valid Event Example

{
  "event_id": "evt_web_20240317_0091234",
  "event_type": "purchase",
  "event_ts": "2024-03-17T14:23:11.432Z",
  "source": { "channel": "web", "platform": "brand.co.uk" },
  "identity": {
    "uid": "usr_8f2a9c_unified",
    "email_sha256": "a3d9c...f84e2",
    "gaid": "4f3a2b1c-9d8e-..."
  },
  "consent": {
    "analytics": true,
    "marketing": true,
    "version": "CPdqkAAPdqkAAMA...",
    "updated_at": "2024-03-17T13:00:00Z"
  },
  "revenue": { "amount": 149.99, "currency": "GBP", "tax": 24.99 },
  "properties": {
    "product_ids": ["PROD_SHOE_NK_001"],
    "category": "footwear",
    "quantity": 1,
    "coupon": "SPRING20"
  }
}
⚠️ Schema is enforced by Cloud Function before Pub/Sub publish. Invalid events are routed to cdp-schema-violations topic — never dropped silently. Dead-letter queue retains for 7 days.
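A drastically simplified sketch of that gate's decision, checking only the required top-level keys and the event_id pattern (the real Cloud Function validates against the full JSON Schema; names here are illustrative):

```python
import re

REQUIRED = {"event_id", "event_type", "event_ts", "source", "consent"}
EVENT_ID_RE = re.compile(r"^evt_[a-z]{3,10}_\d{8}_\d{7}$")

def gate(event: dict) -> tuple:
    """Return (ok, reason). Rejected events are routed to the violations
    topic with the reason attached — never dropped silently."""
    missing = REQUIRED - event.keys()
    if missing:
        return False, f"missing: {sorted(missing)}"
    if not EVENT_ID_RE.match(event["event_id"]):
        return False, "bad event_id format"
    return True, "ok"
```

Returning a machine-readable reason is what makes the dead-letter topic useful: the producer can fix and resubmit.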
🔄 Deduplication Strategy
Layer 1 — Pub/Sub Message Dedup
Pub/Sub exactly-once delivery within 10 min window using message_id. Protects against SDK double-sends on network retry.
Layer 2 — Dataflow Window Dedup
Apache Beam Fixed Window (5 min) + GroupByKey(event_id). Collapses duplicates arriving within the same processing window.
Layer 3 — BigQuery Merge Dedup
Hourly scheduled MERGE statement using event_id as the unique key. Ensures idempotency even if Dataflow delivers twice.
Layer 4 — Destination-Level Dedup
Meta: event_id field. Amplitude: insert_id. Google Ads: upload job idempotency key. Each destination handles its own window.
BigQuery: Hourly Dedup Merge

-- Runs hourly via Cloud Composer
MERGE cdp.events_clean t
USING (
  -- Keep the most recently ingested copy of each event_id
  SELECT * EXCEPT (rn)
  FROM (
    SELECT
      e.*,
      _PARTITIONTIME AS ingest_partition_ts,
      ROW_NUMBER() OVER (
        PARTITION BY event_id
        ORDER BY _PARTITIONTIME DESC
      ) AS rn
    FROM cdp.events_raw e
    WHERE _PARTITIONTIME >= TIMESTAMP_SUB(CURRENT_TIMESTAMP(), INTERVAL 2 HOUR)
  )
  WHERE rn = 1
) src
ON t.event_id = src.event_id
WHEN NOT MATCHED THEN
  INSERT ROW
WHEN MATCHED AND src.ingest_partition_ts > t.ingest_partition_ts THEN
  UPDATE SET enriched = src.enriched;

-- Dedup stats tracked in a monitoring table, e.g.:
-- INSERT INTO cdp.dedup_stats
-- SELECT CURRENT_TIMESTAMP(), raw.cnt, raw.cnt - clean.cnt AS duplicates_removed
-- FROM (SELECT COUNT(*) AS cnt FROM cdp.events_raw) raw
-- CROSS JOIN (SELECT COUNT(*) AS cnt FROM cdp.events_clean) clean;
ℹ️ Expected Duplicate Rate: Web SDK: 2–5% (network retry). Mobile SDK: 1–3%. Offline batch: <1% (controlled upload). Call Centre: 0.5% (manual entry errors). Total dedup savings: typically 3–6% of raw event volume.
Rule Name | Trigger Condition | Action | Scope | Reversible
opt_out_block | consent.marketing = false | Block all marketing egress | Profile + all linked devices | Yes
dsr_suppression | DSR request received | Immediately suppress from all audiences; queue deletion | All destinations + BQ | No
fraud_freeze | 5+ device IDs in 1hr OR velocity anomaly | Freeze profile; alert security team | Profile-level | Manual review
minor_block | age_verified = false OR age < 18 | Block all personalised advertising egress | All marketing platforms | Yes (on verification)
pii_in_event_block | DLP API detects raw PII in event body | Quarantine event; alert data team | Single event | After remediation
schema_violation_block | JSON schema validation fails | Route to DLQ; never ingest to main stream | Single event | Fix + resubmit
geo_restriction | user_country in sanctioned list | Block all processing; alert compliance | Profile + events | Compliance only
egress_rate_exceeded | Connector exceeds configured RPS | Queue event; apply exponential backoff | Connector-level | Auto-resolves
consent_mode_downgrade | Consent signal degrades (e.g., CMP update) | Retroactively remove from active audiences | Profile + downstream | On re-consent
🚨 Critical: All blocking rules are enforced at the Egress Gate (Cloud Run service) BEFORE data leaves GCP. Blocking at the destination level (e.g., excluding from an audience in Google Ads) is NOT sufficient — data would still have left your environment. Block at source.
🔐 IAM Role Matrix
Role | BQ Access | Firestore | Pub/Sub | Secrets
cdp-ingestion-sa | Writer (raw only) | Writer | Publisher | None
cdp-dataflow-sa | Reader + Writer (clean) | Reader | Subscriber | Viewer
cdp-connector-sa | Reader (audiences) | Reader | None | Accessor
cdp-analyst-sa | Reader (no PII tables) | None | None | None
cdp-admin-sa | Full (audit only) | Full | Full | Full
🔒 Column-Level Security on PII Tables
# Policy tags are defined in a Data Catalog taxonomy ("cdp-taxonomy")
# and attached to PII columns through the table schema
# (the policy-tag resource path below is illustrative):
#
#   schema.json (fragment)
#   {
#     "name": "email_sha256", "type": "STRING",
#     "policyTags": { "names": [
#       "projects/cdp-prod/locations/eu/taxonomies/cdp-taxonomy/policyTags/pii_email"
#     ]}
#   }
#
bq update cdp-prod:cdp.identity_map schema.json

# Only roles with the Fine-Grained Reader role see unmasked values;
# analysts get the masked default view:
#   email_sha256 → "****sha256****"

# VPC Service Controls perimeter prevents BQ data leaving the org boundary
gcloud access-context-manager perimeters create cdp-perimeter \
  --resources=projects/cdp-prod \
  --restricted-services=bigquery.googleapis.com
⚡ Egress Throttle Architecture
Token Bucket Algorithm
Each connector has a token bucket with configurable fill rate. Requests consume tokens; empty bucket → queue with exponential backoff (cap: 32s).
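The token bucket described above, reduced to its core arithmetic. This is a sketch with an injected clock for testability; the production service wraps something like it per connector:

```python
class TokenBucket:
    """Per-connector token bucket: fill_rate tokens/sec, burst up to capacity."""

    def __init__(self, fill_rate, capacity):
        self.fill_rate = fill_rate
        self.capacity = capacity
        self.tokens = capacity   # start full so a burst is allowed immediately
        self.last = 0.0

    def allow(self, now):
        """Consume one token if available; otherwise the caller queues
        the request with exponential backoff (cap: 32s)."""
        # Refill proportionally to elapsed time, capped at capacity
        self.tokens = min(self.capacity,
                          self.tokens + (now - self.last) * self.fill_rate)
        self.last = now
        if self.tokens >= 1.0:
            self.tokens -= 1.0
            return True
        return False
```

Capacity controls burst size while fill_rate controls sustained throughput, which is why the two are configured separately per destination.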
Circuit Breaker
If a destination returns 5xx for >30 consecutive requests, the circuit opens for 60s. All events during open circuit go to DLQ for replay.
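A matching sketch of the breaker's state machine, with time injected for testability. The threshold and recovery values mirror the text; the class shape is an assumption, not the production implementation:

```python
class CircuitBreaker:
    """Open after `threshold` consecutive failures; probe again after `recovery_s`."""

    def __init__(self, threshold=30, recovery_s=60.0):
        self.threshold = threshold
        self.recovery_s = recovery_s
        self.failures = 0
        self.opened_at = None   # None = closed

    def allow(self, now):
        if self.opened_at is None:
            return True
        if now - self.opened_at >= self.recovery_s:
            # Half-open: let one request probe the destination
            self.opened_at = None
            self.failures = 0
            return True
        return False            # open: route the event to the DLQ instead

    def record(self, success, now):
        if success:
            self.failures = 0
            return
        self.failures += 1
        if self.failures >= self.threshold:
            self.opened_at = now
```

While the circuit is open, events flow to the per-destination DLQ and are replayed once the probe succeeds.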
Priority Queuing
DSR suppressions: P0 (immediate). Real-time triggers (abandon cart, push): P1. Batch audiences: P2. Historical backfill: P3.
Dead Letter Queue
Failed events go to cdp-egress-dlq-{destination} Pub/Sub topic. Replayed after human review or automated retry rules.
Cloud Run: egress-throttle.py from ratelimit import limits, sleep_and_retry from circuitbreaker import circuit import google.cloud.pubsub_v1 as pubsub LIMITS = { "google_ads": { "rps": 10, "dlq": "cdp-dlq-gads" }, "meta_capi": { "rps": 200, "dlq": "cdp-dlq-meta" }, "moengage": { "rps": 500, "dlq": "cdp-dlq-moe" }, "appsflyer": { "rps": 100, "dlq": "cdp-dlq-af" }, } @sleep_and_retry @limits(calls=LIMITS[dest]["rps"], period=1) @circuit(failure_threshold=30, recovery_timeout=60) def send_to_destination(dest, payload): try: response = connector_map[dest].send(payload) log_egress_audit(dest, payload["event_id"], "SUCCESS", response.status_code) return response except Exception as e: route_to_dlq(LIMITS[dest]["dlq"], payload, str(e)) raise def log_egress_audit(dest, event_id, status, code): bq.insert_rows_json("cdp.egress_audit_log", [{ "destination": dest, "event_id": event_id, "status": status, "http_code": code, "logged_at": datetime.utcnow().isoformat() }])
§ 06 — Prerequisites & Constraints

Prerequisites, Limitations & Compliance

Everything you need in place before day 1 — GCP configuration, team capabilities, data quality thresholds, and regulatory prerequisites.

✅ Technical Prerequisites

Requirement | Minimum Spec | Status Check
GCP Project | Org-level Billing Account + VPC | Required
BigQuery Dataset | Multi-region EU, CMEK enabled | Required
Cloud Pub/Sub | 3 topics minimum (raw, clean, DLQ) | Required
Firestore | Native mode, europe-west2 region | Required
Cloud KMS | Key ring for CMEK + PII column encryption | Required
VPC Service Controls | Perimeter around BQ + Firestore | Strongly Recommended
Secret Manager | All API keys (never env vars) | Required
Cloud Composer (Airflow) | v2.x for BQ job orchestration | Required
Data Catalog | Schema registry + PII tagging | Strongly Recommended
GCP Budget Alerts | Set at 80% + 100% of monthly limit | Best Practice

⚖️ Legal & Compliance Prerequisites

Requirement | Detail | Owner
GDPR Lawful Basis | Consent (Art. 6(1)(a)) or Legitimate Interest documented per purpose | DPO
PECR Compliance | UK: soft opt-in for email; hard opt-in for cookies | Legal
TCF 2.2 CMP | IAB-registered CMP integrated with consent signal flow | MarTech
Data Processing Agreement | DPA with GCP, each connector vendor, each data supplier | Legal
ROPA (Records of Processing) | Every data flow documented in Article 30 register | DPO
DSR Process | 72hr response SLA; tested quarterly | Engineering
EEA Data Transfer Mechanism | SCCs or Binding Corporate Rules for non-EU destinations | Legal
Retention Policy | Raw events: 90 days; Profiles: active + 24 months; Audit: 7 years | Data Governance
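The raw-event retention row above can be enforced declaratively: set partition expiration on the event table so BigQuery drops old partitions itself instead of relying on a scheduled DELETE job. The table name is an assumption; `partition_expiration_days` is the standard BigQuery table option:

```python
def retention_ddl(table: str, days: int) -> str:
    """DDL that sets partition expiration so BigQuery automatically
    drops partitions older than `days` (90 days per the policy above)."""
    return (
        f"ALTER TABLE `{table}` "
        f"SET OPTIONS (partition_expiration_days = {days})"
    )

# e.g. client.query(retention_ddl("cdp.raw_events", 90))
```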

⚠️ Known Platform Limitations

BigQuery Limitations
Streaming Insert Delay
30–90 sec before streaming inserts are queryable. Do not use BQ streaming for real-time lookup — use Firestore.
Storage Costs at Scale
At 10B events/day, active storage costs ~£140/month. Enable long-term storage automatically after 90 days (60% cheaper).
DML & Storage-Tier Resets
MERGE/UPDATE against a partition resets its 90-day long-term storage clock back to active pricing, and rows still in the streaming buffer cannot be modified at all. Always partition event tables by ingestion date and scope DML to recent partitions.
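A minimal sketch of the recommended table shape, partitioned by ingestion date and clustered for profile lookups. The column list is illustrative, not the blueprint's actual schema:

```python
def event_table_ddl(table: str) -> str:
    """Ingestion-date partitioned, clustered event table per the
    guidance above; columns are assumptions for illustration."""
    return f"""
    CREATE TABLE IF NOT EXISTS `{table}` (
      event_id STRING NOT NULL,
      uid STRING,
      event_name STRING,
      payload JSON,
      ingested_at TIMESTAMP
    )
    PARTITION BY DATE(ingested_at)
    CLUSTER BY uid, event_name
    """
```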
Identity Graph Limitations
iOS ATT — IDFA Availability
<30% opt-in rate. SKAN 4.0 coarse conversion values only for non-consented users.
Hash Normalisation Drift
If email normalisation differs between SDK, CRM, and ad platform — match rate drops 5–15%. Central normalisation function mandatory.
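The mandatory central normalisation function might look like the sketch below. The Gmail dot-stripping rule follows the common ad-platform hashing convention; verify it against each destination's matching spec before adopting it:

```python
import hashlib

def normalise_email(raw: str) -> str:
    """One normalisation shared by SDK, CRM, and egress paths,
    so every system hashes the same canonical string."""
    e = raw.strip().lower()
    local, _, domain = e.partition("@")
    # Gmail ignores dots in the local part; most match specs strip them.
    if domain in ("gmail.com", "googlemail.com"):
        local = local.replace(".", "")
    return f"{local}@{domain}"

def email_sha256(raw: str) -> str:
    return hashlib.sha256(normalise_email(raw).encode("utf-8")).hexdigest()
```

Because the same function runs everywhere, " Jane.Doe@Gmail.com " from the web SDK and "janedoe@gmail.com" from the CRM hash identically, which is exactly the drift this control removes.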
Cross-Device False Positives
Probabilistic matching: 8–12% false positive rate in shared device environments. Threshold-gate: only merge if score ≥ 0.80.
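The threshold gate above can be expressed as a single merge predicate. The stricter 0.90 bar for shared devices is an assumption added for illustration; only the 0.80 floor comes from the text:

```python
def should_merge(match_score: float, deterministic: bool, shared_device: bool) -> bool:
    """Deterministic links (login, hashed email) always merge.
    Probabilistic links merge only at score >= 0.80, with a stricter
    assumed bar on shared devices (household tablets, kiosk browsers)
    where false positives concentrate."""
    if deterministic:
        return True
    threshold = 0.90 if shared_device else 0.80
    return match_score >= threshold
```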
Connector Limitations
Google Ads Match Rate
Customer Match typically 40–60% match rate on hashed email. Requires 1K minimum users per list.
Meta CAPI Latency
Server-side events processed with 5–15 min delay vs browser events. May affect real-time bidding and optimisation signals.
AppsFlyer DSR Constraint
Your CDP cannot delete data from AppsFlyer's servers. You must call AF's GDPR API. This is AF's limitation, not a CDP design issue.

💰 Cost Model (Indicative)

£0.016 per GB/month BQ storage
£4.50 per TB BQ query
£0.04 per 1M Pub/Sub msgs
£0.06 per 1M Firestore reads
ℹ️ At 1M daily active profiles and 50M events/day: estimated GCP bill ~£3,200–£5,800/month depending on query patterns. Enable BQ slot commitments (flex or annual) above 10TB/day query volume for 40% cost reduction. Use Firestore TTL to expire session data after 30 days.
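A back-of-envelope estimator using only the four headline unit prices above. It deliberately ignores Dataflow, Cloud Run, network egress, and free tiers, so treat the output as a floor, not a bill:

```python
# Unit prices from the cost model above (GBP, indicative).
PRICE = {
    "bq_storage_per_gb_month": 0.016,
    "bq_query_per_tb": 4.50,
    "pubsub_per_million_msgs": 0.04,
    "firestore_per_million_reads": 0.06,
}

def monthly_estimate(storage_gb: float, query_tb: float,
                     pubsub_millions: float, firestore_read_millions: float) -> float:
    """Sum the four metered line items for one month."""
    total = (
        storage_gb * PRICE["bq_storage_per_gb_month"]
        + query_tb * PRICE["bq_query_per_tb"]
        + pubsub_millions * PRICE["pubsub_per_million_msgs"]
        + firestore_read_millions * PRICE["firestore_per_million_reads"]
    )
    return round(total, 2)
```

For example, 1 TB of active storage, 100 TB queried, 1.5B Pub/Sub messages, and 300M Firestore reads lands around £544 before the unmetered services.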
§ 07 — Use Cases

Data Use Cases & Activation Playbooks

Concrete, end-to-end activation playbooks showing exactly which data, which BigQuery queries, which connectors, and which consent requirements apply — per use case.

🎯 UC-01: High-Value Lookalike Expansion
Google Ads Meta Offline + Online
Signal
Users with ltv_percentile ≥ 90 AND offline_purchase_count ≥ 3 AND consent.marketing = true
BQ Query
SELECT email_sha256 FROM cdp.unified_profiles WHERE ltv_percentile >= 90 AND offline_signals.purchase_count >= 3 AND consent_marketing = true
Activation
Export to Google Ads Customer Match + Meta Custom Audience as seed for 2% Lookalike. Refresh weekly.
Expected Size
~2–5% of total profiles. Min 1,000 required for Google Ads, 100 for Meta.
Consent Required
Explicit marketing consent + TCF v2 purpose 1, 3, 4.
🛒 UC-02: Abandoned Cart Recovery — Cross-Channel
moEngage Meta CAPI Real-time
Signal
add_to_cart event fired, NO purchase event within 60 min from same uid
Logic
Dataflow tumbling window (60 min). If no matching purchase, trigger Cloud Function to push to moEngage and Meta CAPI.
Sequence
T+60min: Push notification (moEngage). T+2hrs: Email reminder. T+24hrs: Paid retargeting (Meta).
Suppression
If purchase happens during recovery sequence, immediately suppress remaining touches via blocking rule.
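The abandonment signal itself reduces to a 60-minute window check. The Dataflow job implements this over streaming windows; the pure-Python sketch below shows the same logic batch-style, with a simplified event tuple shape assumed for illustration:

```python
from datetime import datetime, timedelta

def abandoned_carts(events, window=timedelta(minutes=60)):
    """events: iterable of (uid, event_name, timestamp).
    Returns uids whose first add_to_cart saw no purchase within `window`,
    mirroring the tumbling-window trigger described above."""
    carts, purchases = {}, {}
    for uid, name, ts in events:
        if name == "add_to_cart":
            carts.setdefault(uid, ts)  # keep the earliest cart event
        elif name == "purchase" and uid not in purchases:
            purchases[uid] = ts
    return [
        uid for uid, cart_ts in carts.items()
        if uid not in purchases or purchases[uid] - cart_ts > window
    ]
```

In production the suppression rule fires the other way round: a purchase arriving mid-sequence removes the uid from every remaining touch.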
📊 UC-03: Offline-to-Online Attribution
SA360 Google Ads Measurement
Signal
Offline purchase within 30 days of a paid search click (gclid captured at web session)
BQ Join
JOIN web_sessions (gclid) with offline_events (email_sha256) via identity_map. Attribute revenue to campaign.
Activation
Send offline conversion via Google Ads Offline Conversion Import (gclid + revenue). SA360 Floodlight enhanced conversions.
Impact
Typical 15–35% uplift in measured ROAS when offline conversions correctly attributed.
🔄 UC-04: Churn Prevention — Predictive Cohort
moEngage Amplitude ML-Driven
Signal
Vertex AI model score: churn_risk ≥ 0.70. No purchase or app_open in last 28 days. Previously high engagement.
Model
BigQuery ML (BQML) LOGISTIC_REG model trained on 12-month event history. Weekly retraining via Cloud Composer.
Activation
Personalised push notification with dynamic discount (tier-based: 10% for gold, 20% for platinum). In-app message on next open.
Measurement
Holdout group (10%) tracked via Amplitude experiment. Report via Amplitude chart + BQ analysis.
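The weekly Composer task can regenerate the model with a single BQML DDL statement. The feature table, label column, and date filter below are assumptions standing in for the real training view:

```python
def churn_model_ddl(model: str = "cdp.churn_model",
                    source: str = "cdp.training_features") -> str:
    """BQML logistic regression per the spec above; `churned` label
    and feature table are illustrative assumptions."""
    return f"""
    CREATE OR REPLACE MODEL `{model}`
    OPTIONS (model_type = 'LOGISTIC_REG', input_label_cols = ['churned']) AS
    SELECT * FROM `{source}`
    WHERE feature_date >= DATE_SUB(CURRENT_DATE(), INTERVAL 12 MONTH)
    """

# Composer task body: client.query(churn_model_ddl()).result()
```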
📱 UC-05: Mobile Attribution Clean Room
AppsFlyer Amazon AMC SKAN 4.0
Signal
AppsFlyer Data Locker → GCS → BQ external table. SKAN postbacks → Cloud Function → BQ.
Logic
Join AF attribution data with CDP profiles (using IDFV + hashed email where ATT consented). Compute campaign-level incremental LTV.
Clean Room
Amazon Marketing Cloud SQL query: join AMC impressions with CDP email-matched converters. Overlap analysis without ID sharing.
Limitation
SKAN values are coarse (0–63). Probabilistic revenue attribution only for opted-out iOS users.
🔍 UC-06: Search Bid Optimisation via Audience
SA360 Google Ads RLSA
Signal
Audiences: high_value (ltv ≥ £500), recent_purchaser (purchase ≤ 30d), lapsed_buyer (90-180d no purchase).
Bid Logic
high_value: +40% bid adjustment. recent_purchaser: -20% (reduce waste). lapsed_buyer: +25% (re-engage intent).
Activation
BQ audience export → Google Ads RLSA list. SA360 bulk bid modifier upload via API. Refreshed daily.
Consent
Requires Consent Mode v2. Signals passed via gtag with ad_personalization and analytics_storage consent.

📡 Paid Channel & Owned Channel Activation

💎 UC-07: Value-Based Bidding (VBB)
Google Ads SA360 DV360
Concept
Instead of bidding for conversion volume, bid proportionally to the predicted lifetime value of each user — letting Smart Bidding maximise ROAS against value, not just click count.
Signal
Vertex AI LTV model produces predicted_90d_ltv per uid. This score is passed as conversion value at the time of the conversion event (not a flat £1).
BQ Logic
SELECT uid, gclid, ROUND(predicted_90d_ltv * ltv_weight_factor, 2) AS conv_value FROM cdp.scored_profiles WHERE predicted_90d_ltv > 0 AND consent_marketing = true
Activation
Cloud Function sends gclid + conv_value via Google Ads Offline Conversion Import API. SA360 Enhanced Conversions handle the web path; server-side via CAPI for Meta (event_value field).
Tiers
Platinum LTV (>£1,000): value=1.0x. Gold (£500–£999): value=0.6x. Silver (£100–£499): value=0.3x. Suppress below £100 threshold — don't waste Smart Bidding budget on low-value signals.
Consent
Requires Google Consent Mode v2 ad_user_data=GRANTED. Modelled conversions kick in for non-consented users — do not suppress the conversion ping, just omit PII.
Limitation
Smart Bidding needs min 30 value-weighted conversions/month to stabilise. Cold start: use flat conversion value for first 30 days, then switch to LTV-weighted.
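The tiering and suppression rules above collapse into one value function. Treating exactly £1,000 as Platinum is an assumption, since the stated bands leave £999–£1,000 ambiguous:

```python
def conv_value(predicted_ltv: float):
    """Conversion value for Offline Conversion Import per the tiers above.
    Returns None to suppress the upload for sub-£100 users so Smart
    Bidding never optimises toward low-value signals."""
    if predicted_ltv >= 1000:      # Platinum
        weight = 1.0
    elif predicted_ltv >= 500:     # Gold
        weight = 0.6
    elif predicted_ltv >= 100:     # Silver
        weight = 0.3
    else:                          # below threshold: suppress
        return None
    return round(predicted_ltv * weight, 2)
```

During the 30-day cold start, swap this out for a flat value as noted above, then switch the export query to the weighted version.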
🚫 UC-08: Suppression Across Paid Channels
Google Ads Meta SA360 Amazon DSP
Purpose
Stop spending money advertising to people who have already converted, opted out, been flagged as fraud, or are in a loyalty hold. Suppression is the highest-ROI use case in any CDP.
Suppression Segments
recent_converter — purchase ≤ 14 days
opted_out — consent.marketing = false
active_crm — open support ticket / complaint
loyalty_platinum — no need to bid up, they're already retained
dsr_requested — immediate, always P0
fraud_flagged — frozen profiles
BQ Query
SELECT email_sha256 FROM cdp.unified_profiles WHERE (last_purchase_date >= DATE_SUB(CURRENT_DATE(), INTERVAL 14 DAY) OR consent_marketing = false OR is_frozen = true OR dsr_active = true) AND consent_analytics = true
Activation
Google Ads: Upload as "exclusion" Customer Match list. Meta: Custom Audience with "Exclude" toggle in campaign. SA360: Negative audience modifier via API. DV360: Exclusion list at insertion order level. Amazon DSP: Suppression list upload via S3.
Refresh Cadence
DSR suppressions: real-time (<15 min via P0 egress path). Opted-out: 4-hourly sync. Post-purchase: daily. All others: daily. Always-on — never let suppression lists go stale beyond 24 hrs.
Expected Saving
Typical 8–18% reduction in wasted paid media spend. At £500K/month budget, that's £40K–£90K saved monthly.
📬 UC-09: Owned Channel Remarketing — Email · SMS · Push
moEngage Email SMS Push In-App WhatsApp
Concept
Use CDP audience segments to power personalised, sequenced owned-channel remarketing — triggered from BigQuery computed signals or real-time Pub/Sub events into moEngage via Cloud Run bridge.
Segment Examples
browse_abandoner_24h post_purchase_7d
winback_90d loyalty_tier_upgraded
price_drop_wishlist low_stock_viewed
birthday_minus7d anniversary_offer
Channel Priority (Sequencing)
T+0: In-App Message (highest open rate, zero cost)
T+1hr: Push Notification (if app installed + opted in)
T+4hr: Email (universal fallback; consent: soft opt-in UK)
T+24hr: SMS (highest urgency signals only; hard opt-in)
T+48hr: WhatsApp (if WhatsApp Business API connected)
Stop sequence on any conversion event
BQ → moEngage Bridge
SELECT uid, email_sha256, push_token, phone_sha256, segment_name, personalisation_payload FROM cdp.audience_export WHERE segment_name IN ('browse_abandoner_24h','winback_90d') AND consent_marketing = true AND moEngage_opted_in = true
Personalisation Payload
Each record includes personalisation_payload JSON: last browsed product, price, stock level, recommended alternatives (Vertex AI), loyalty points balance, first name — injected into moEngage template via Liquid tags.
Suppression from Owned Channels
Always check consent_email = true, consent_sms = true, push_opted_in = true per channel individually — one consent does NOT cover all channels. SMS requires prior opt-in under PECR; double opt-in is the recommended way to evidence it.
Frequency Capping
Global cap: max 3 marketing messages/7 days per uid across all owned channels combined. Tracked in Firestore profiles/{uid}/msg_frequency. Cap enforced by Cloud Function before any moEngage API call.
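The Cloud Function gate described above reduces to a pure check over the uid's recent send timestamps (read from `profiles/{uid}/msg_frequency` in Firestore before any moEngage call):

```python
from datetime import datetime, timedelta

def under_frequency_cap(send_times, now, cap=3, window_days=7):
    """True if another marketing message may be sent: fewer than `cap`
    sends across all owned channels within the trailing window."""
    cutoff = now - timedelta(days=window_days)
    recent = [t for t in send_times if t >= cutoff]
    return len(recent) < cap

# Gate pattern: if under_frequency_cap(times, now): call moEngage, append now.
```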
Expected Metrics
Email open rate: 28–42% (vs 18% batch avg). Push click rate: 12–22%. SMS response: 8–15%. In-app engagement: 35–55%. Revenue/send (personalised): 2.4–4.1x vs generic broadcast.

📈 Expected Business Outcomes

Use Case | KPI | Typical Uplift | Time to Value | Risk
Lookalike Expansion | CPM efficiency, ROAS | +20–40% ROAS | 4–8 weeks | Low
Abandoned Cart Recovery | Recovery rate, incremental revenue | 12–18% cart recovery | 2–4 weeks | Low
Offline Attribution | True ROAS, measured conversions | +15–35% measured ROAS | 6–12 weeks | Medium
Churn Prevention | Retention rate, LTV | 8–15% churn reduction | 8–16 weeks | Medium
Mobile Attribution Clean Room | CPI accuracy, incremental installs | +25% attribution accuracy | 10–20 weeks | High
Value-Based Bidding | ROAS, CPA efficiency vs LTV | +25–45% ROAS vs flat-bid | 6–10 weeks | Medium
Suppression — Paid | Wasted spend reduction | 8–18% budget saving | 1–2 weeks | Low
Owned Channel Remarketing | Revenue/send, unsubscribe rate | 2.4–4.1x revenue vs broadcast | 3–6 weeks | Low
§ 08 — Operations

Ops Dashboard — Live CDP Health

Real-time error rates, pipeline trend lines, egress health, and profile lookup — all driven from BigQuery monitoring tables and Firestore live reads. Refreshes every 1.2 seconds.

Ingestion Error Rate
0.0%
↓ improving
Events / sec (live)
12,481
→ stable
ID Match Rate
71.3%
↑ +0.2% vs yesterday
Egress Success
99.2%
↑ all connectors healthy
DLQ Depth
247
↑ Meta CAPI backpressure
Profile Merges Today
8,934
→ within normal range
Blocking Events (24h)
1,892
↓ vs 2,110 yesterday
DSR Queue
3
↓ all within SLA
👤
usr_8f2a9c_unified
Created 2023-09-01 · Last seen 2024-03-17 · Match: DETERMINISTIC · HIGH
Identity Keys
Offline Signals
Scores & Consent
Event Ingestion Rate (60s window)
Events/sec
Errors/sec
Identity Match Rate Trend (60s)
Match %
Collision/sec
Egress Connector Health (60s)
Success
DLQ
Suppression & Blocking Events (60s)
Blocked
DSR Actions
📡 Error Rate Gauge — Ingestion Pipeline
0.3%
Error Rate
Schema violations: 0.1%
Consent blocked: 0.15%
DLQ failures: 0.05%
✅ Below 1% SLA threshold
🔴 Live Error Stream
🔌 Connector Live Status
Connector | Status | Success Rate | Avg Latency | DLQ Depth | Last Send | Rate Limit
§ AUDIENCE INTELLIGENCE — Live via Platform APIs

Audience Match Rate & Active Use Monitor

Real-time audience health across Google Ads, Meta, Amazon DSP, and SA360 — match rates, active usage, staleness flags, and ROI signal pulled programmatically from each platform's API. Data refreshes every 4 hours. Min 1,000 matched users required for platform disclosure (privacy threshold).

ℹ️ Technically feasible: Google Ads API exposes user_list.size_for_display/search + match rate per offline_user_data_job. Meta Marketing API returns approximate_count on every Custom Audience. Amazon DSP API returns audience size with privacy-threshold approximation. SA360 inherits from Google Ads user_list + surfaces bid modifier performance per audience. All four APIs require OAuth service account tokens with ads_management / userlist.read scopes stored in Secret Manager. Note: Google Ads API Customer Match upload capability migrates to Data Manager API — April 2026 deadline for active tokens.
Total Audiences Active
24
across 4 platforms
Avg Match Rate
61.4%
↑ +2.1% vs last week
Stale Audiences
3
↑ not refreshed >7d
Actively Delivering
19
impressions this week
Total Matched Users
4.2M
across all platforms
📊 Audience Match Rate & Activity Detail
Auto-refresh every 4hrs
Platform | Audience Name | CDP Segment | Uploaded | Matched | Match Rate | Search Size | Display Size | Last Refreshed | Status | Delivering? | ROI Signal
Match Rate by Platform — 30-day Trend
Google Ads
Meta
Amazon
SA360
🔬 API Feasibility Matrix
Platform | Match Rate API | List Size API | Active Use API | Staleness Flag | Constraint
🔵 Google Ads | ✓ offline_user_data_job | ✓ user_list resource | ✓ ad_group_audience_view | ✓ REFRESH recommendation | Data Manager API migration Apr 2026
🔷 Meta | ✓ approximate_count | ✓ Custom Audience API | ✓ Insights API | Partial — no native flag | Min 1,000 users for disclosure
🟠 Amazon DSP | ✓ DSP Audiences API | ✓ Approximated count | ✓ AMC + DSP Reporting | Partial — manual check | Privacy threshold; batch-only match
🔴 SA360 | Via Google Ads API | ✓ Inherits from GAds | ✓ Bid modifier + Floodlight | ✓ Via GAds recommendation | No raw match rate in SA360 API directly
What "Actively Used" means per platform
Google Ads: Impressions > 0 in last 7d from ad_group_audience_view
Meta: Reach > 0 in Insights API; audience delivery_status = active
Amazon DSP: Line item targeting this audience with >0 impressions (DSP Reporting API)
SA360: Bid modifier applied + Floodlight conversion credit in last 14d
⚠️ Staleness & Refresh Recommendations
Platform | Audience | Last Refresh | Days Stale | Impact | Recommended Action
⚠️ Staleness effect: Google Ads match rate degrades ~3–5% per week of inactivity as matched users change email/phone. Meta audiences expire after 30 days without refresh — they show 0 reach but no error. Amazon DSP audiences don't auto-expire but match quality degrades with churn. Always refresh high-value suppression lists within 24hrs — stale suppressions waste paid media budget on converted users.
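The staleness rules above can be folded into one classifier feeding the Recommended Action column. The status labels and the 7-day generic threshold (from the "not refreshed >7d" card earlier in this section) are assumptions:

```python
def staleness_status(platform: str, days_since_refresh: int,
                     is_suppression: bool = False) -> str:
    """Classify audience freshness per the rules above: suppression lists
    get the 24-hour bar, Meta audiences expire silently at 30 days, and
    anything past a week is flagged stale."""
    if is_suppression and days_since_refresh >= 1:
        return "REFRESH_NOW"
    if platform == "meta" and days_since_refresh > 30:
        return "EXPIRED"
    if days_since_refresh > 7:
        return "STALE"
    return "FRESH"
```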