Learning PathLesson 2 of 7 · Advanced Analytics
Advanced Analytics · Lesson 2 of 7advanced12 min read

Attribution Modeling with AI

Use AI to build and compare multi-touch attribution models. Includes SQL queries, practical prompts, and a framework for choosing the right attribution approach.

Why Attribution Is Still Broken (And How AI Helps)

Attribution modeling has been a pain point for marketing analysts since forever. You know the scenario: your CMO asks 'which channel is driving revenue?' and the honest answer is 'it depends on which model we use.' First-touch says it's SEO. Last-touch says it's paid search. The CEO saw an ad on LinkedIn and thinks brand is doing all the work.

AI doesn't magically solve attribution, but it does three things that make your life significantly easier: it can process more touchpoint data than you could manually, it can build and compare multiple models simultaneously, and it can help you explain the tradeoffs to stakeholders in plain language.

The Five Attribution Models You Need to Know

  • First-touch: 100% credit to the first interaction. Great for understanding awareness channels.
  • Last-touch: 100% credit to the final interaction before conversion. What most platforms default to.
  • Linear: Equal credit to every touchpoint. Fair but not very insightful.
  • Time-decay: More credit to touchpoints closer to conversion. Usually the best rule-based option.
  • Data-driven: AI/ML determines credit based on actual conversion patterns. The gold standard.

SQL for Multi-Touch Attribution

Here's a practical SQL query that builds a multi-touch attribution table from your event data. This works with most data warehouses (BigQuery, Snowflake, Redshift).

multi_touch_attribution.sqlsql
WITH touchpoints AS (
  SELECT
    user_id,
    session_id,
    channel,
    timestamp,
    conversion_value,
    ROW_NUMBER() OVER (PARTITION BY user_id ORDER BY timestamp ASC) AS touch_order,
    COUNT(*) OVER (PARTITION BY user_id) AS total_touches,
    FIRST_VALUE(timestamp) OVER (PARTITION BY user_id ORDER BY timestamp ASC) AS first_touch_ts,
    LAST_VALUE(timestamp) OVER (
      PARTITION BY user_id ORDER BY timestamp ASC
      ROWS BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING
    ) AS last_touch_ts
  FROM marketing_touchpoints
  WHERE user_id IN (SELECT DISTINCT user_id FROM conversions)
),
attribution AS (
  SELECT
    user_id,
    channel,
    touch_order,
    total_touches,
    conversion_value,
    -- First-touch
    CASE WHEN touch_order = 1 THEN conversion_value ELSE 0 END AS first_touch_credit,
    -- Last-touch
    CASE WHEN touch_order = total_touches THEN conversion_value ELSE 0 END AS last_touch_credit,
    -- Linear
    conversion_value / total_touches AS linear_credit,
    -- Time-decay (half-life of 7 days)
    conversion_value * POW(0.5, DATEDIFF(day, timestamp, last_touch_ts) / 7.0)
      / SUM(POW(0.5, DATEDIFF(day, timestamp, last_touch_ts) / 7.0))
        OVER (PARTITION BY user_id) AS time_decay_credit
  FROM touchpoints
)
SELECT
  channel,
  ROUND(SUM(first_touch_credit), 2) AS first_touch_revenue,
  ROUND(SUM(last_touch_credit), 2) AS last_touch_revenue,
  ROUND(SUM(linear_credit), 2) AS linear_revenue,
  ROUND(SUM(time_decay_credit), 2) AS time_decay_revenue
FROM attribution
GROUP BY channel
ORDER BY time_decay_revenue DESC;
Prompt Example
any

Get AI to recommend the right attribution model for your specific business

I have marketing touchpoint data with these columns: user_id, channel (organic_search, paid_search, social, email, direct, referral), timestamp, and session_id. I also have a conversions table with user_id, conversion_date, and revenue. 

Help me build a comparison of attribution models. For each channel, show me first-touch, last-touch, linear, and time-decay attribution. Then recommend which model is most appropriate for our B2B SaaS with a 45-day average sales cycle. Explain your reasoning like I'm presenting this to my VP of Marketing.

Building Data-Driven Attribution with AI

Data-driven attribution used to require a data science team. Now you can get AI to help you build a simplified version using Shapley values or Markov chains. You won't need to understand the math deeply — but you do need to understand what the output means.

Prompt Example
claude

Build a Markov chain attribution model without needing data science expertise

I want to build a data-driven attribution model using Markov chain analysis. I have a CSV with columns: user_id, touchpoint_sequence (comma-separated channels in order), and converted (0 or 1). Write me a Python script that calculates the removal effect for each channel and assigns attribution credit proportionally. Include visualization of the results. Keep the code simple enough that I can explain the approach to my team.
Watch Out
Data-driven attribution is only as good as your tracking data. If you have significant gaps in your touchpoint data (and you almost certainly do), a well-chosen rule-based model like time-decay can actually outperform a data-driven model trained on incomplete data.

Manual Workflow

Running separate SQL queries for each attribution model, manually building comparison spreadsheets, spending days on a single attribution analysis.

With AI

AI generates all model comparisons in one pass, explains the differences in stakeholder-friendly language, and recommends the best approach based on your sales cycle and data quality.
Time saved: 2-3 days per attribution analysis

Presenting Attribution Results to Stakeholders

The biggest challenge with attribution isn't the math — it's getting stakeholders to agree on a model. Here's the framework: present the range across models, highlight where models agree (high-confidence insights), and focus the discussion on where they disagree (the interesting strategic questions).

Try It Yourself

Run the multi-touch attribution SQL on your own data and use AI to interpret the results

Here are my attribution results across four models for each channel: [paste your results]. For each channel, tell me: (1) Where do models agree? (2) Where do they disagree and why? (3) What's the strategic implication? (4) Which model should I recommend to my CMO and how should I frame it? Our average deal cycle is [X] days and we're a [B2B/B2C/DTC] company.
Pro Tip
Pro move: present attribution as a range rather than a single number. Instead of saying 'organic search drove $500K,' say 'organic search drove between $350K (last-touch) and $620K (first-touch), with our recommended time-decay model at $480K.' This builds more trust with stakeholders.

Get weekly job alerts

Curated marketing analytics roles — delivered every Monday.