Attribution Modeling with AI
Use AI to build and compare multi-touch attribution models. Includes SQL queries, practical prompts, and a framework for choosing the right attribution approach.
Why Attribution Is Still Broken (And How AI Helps)
Attribution modeling has been a pain point for marketing analysts since forever. You know the scenario: your CMO asks 'which channel is driving revenue?' and the honest answer is 'it depends on which model we use.' First-touch says it's SEO. Last-touch says it's paid search. The CEO saw an ad on LinkedIn and thinks brand is doing all the work.
AI doesn't magically solve attribution, but it does three things that make your life significantly easier: it can process more touchpoint data than you could manually, it can build and compare multiple models simultaneously, and it can help you explain the tradeoffs to stakeholders in plain language.
The Five Attribution Models You Need to Know
- First-touch: 100% credit to the first interaction. Great for understanding awareness channels.
- Last-touch: 100% credit to the final interaction before conversion. What most platforms default to.
- Linear: Equal credit to every touchpoint. Fair but not very insightful.
- Time-decay: More credit to touchpoints closer to conversion. Usually the best rule-based option.
- Data-driven: AI/ML determines credit based on actual conversion patterns. The gold standard.
SQL for Multi-Touch Attribution
Here's a practical SQL query that builds a multi-touch attribution table from your event data. This works with most data warehouses (BigQuery, Snowflake, Redshift).
WITH touchpoints AS (
SELECT
user_id,
session_id,
channel,
timestamp,
conversion_value,
ROW_NUMBER() OVER (PARTITION BY user_id ORDER BY timestamp ASC) AS touch_order,
COUNT(*) OVER (PARTITION BY user_id) AS total_touches,
FIRST_VALUE(timestamp) OVER (PARTITION BY user_id ORDER BY timestamp ASC) AS first_touch_ts,
LAST_VALUE(timestamp) OVER (
PARTITION BY user_id ORDER BY timestamp ASC
ROWS BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING
) AS last_touch_ts
FROM marketing_touchpoints
WHERE user_id IN (SELECT DISTINCT user_id FROM conversions)
),
attribution AS (
SELECT
user_id,
channel,
touch_order,
total_touches,
conversion_value,
-- First-touch
CASE WHEN touch_order = 1 THEN conversion_value ELSE 0 END AS first_touch_credit,
-- Last-touch
CASE WHEN touch_order = total_touches THEN conversion_value ELSE 0 END AS last_touch_credit,
-- Linear
conversion_value / total_touches AS linear_credit,
-- Time-decay (half-life of 7 days)
conversion_value * POW(0.5, DATEDIFF(day, timestamp, last_touch_ts) / 7.0)
/ SUM(POW(0.5, DATEDIFF(day, timestamp, last_touch_ts) / 7.0))
OVER (PARTITION BY user_id) AS time_decay_credit
FROM touchpoints
)
SELECT
channel,
ROUND(SUM(first_touch_credit), 2) AS first_touch_revenue,
ROUND(SUM(last_touch_credit), 2) AS last_touch_revenue,
ROUND(SUM(linear_credit), 2) AS linear_revenue,
ROUND(SUM(time_decay_credit), 2) AS time_decay_revenue
FROM attribution
GROUP BY channel
ORDER BY time_decay_revenue DESC;Get AI to recommend the right attribution model for your specific business
I have marketing touchpoint data with these columns: user_id, channel (organic_search, paid_search, social, email, direct, referral), timestamp, and session_id. I also have a conversions table with user_id, conversion_date, and revenue. Help me build a comparison of attribution models. For each channel, show me first-touch, last-touch, linear, and time-decay attribution. Then recommend which model is most appropriate for our B2B SaaS with a 45-day average sales cycle. Explain your reasoning like I'm presenting this to my VP of Marketing.
Building Data-Driven Attribution with AI
Data-driven attribution used to require a data science team. Now you can get AI to help you build a simplified version using Shapley values or Markov chains. You won't need to understand the math deeply — but you do need to understand what the output means.
Build a Markov chain attribution model without needing data science expertise
I want to build a data-driven attribution model using Markov chain analysis. I have a CSV with columns: user_id, touchpoint_sequence (comma-separated channels in order), and converted (0 or 1). Write me a Python script that calculates the removal effect for each channel and assigns attribution credit proportionally. Include visualization of the results. Keep the code simple enough that I can explain the approach to my team.
Manual Workflow
With AI
Presenting Attribution Results to Stakeholders
The biggest challenge with attribution isn't the math — it's getting stakeholders to agree on a model. Here's the framework: present the range across models, highlight where models agree (high-confidence insights), and focus the discussion on where they disagree (the interesting strategic questions).
Run the multi-touch attribution SQL on your own data and use AI to interpret the results
Here are my attribution results across four models for each channel: [paste your results]. For each channel, tell me: (1) Where do models agree? (2) Where do they disagree and why? (3) What's the strategic implication? (4) Which model should I recommend to my CMO and how should I frame it? Our average deal cycle is [X] days and we're a [B2B/B2C/DTC] company.
Get weekly job alerts
Curated marketing analytics roles — delivered every Monday.