Sequence models apply the premise that there is an implicit ‘worth’ to when an interaction occurs in a customer’s research journey. The complexity of these models varies, but two main strands of approach are:
- How distant was an interaction from the conversion point?
- Does interaction “A” have more or less worth when preceding, or following, another interaction “B”?
The first strand is essentially one of recency: greater importance is placed on interactions nearer either the end or the start of a customer journey. A recall consideration is implied within these models, along the lines of: “Did an impression served two weeks ago really have any bearing on a customer converting now?” And conversely, “Was the original impression a key component in putting the customer on their current track?”
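To make the recency idea concrete, here is a minimal sketch of a time-decay weighting. Everything here is illustrative: the half-life of one week and the example journey are assumptions, not values taken from any of the platforms discussed.

```python
import math

def time_decay_weights(hours_before_conversion, half_life_hours=168.0):
    """Weight each touchpoint by recency: a touchpoint loses half its
    credit every `half_life_hours` (7 days here, an illustrative default)."""
    raw = [2.0 ** (-h / half_life_hours) for h in hours_before_conversion]
    total = sum(raw)
    return [w / total for w in raw]  # normalise so the credit sums to 1

# Hypothetical journey: touchpoints 336h (two weeks), 72h and 2h before converting.
weights = time_decay_weights([336, 72, 2])
```

The two-week-old touchpoint ends up with a small fraction of the credit, which is exactly the “did that impression really have any bearing?” intuition expressed as a formula.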
I find the second strand an intriguing proposal, if only for the creativity I’ve seen in examples of manually created rules. Individual cases of imaginative thinking aside, there is merit to the principle that certain combinations of campaigns outperform others, and certainly outperform isolated activity. Likewise, we may also assume that an awareness campaign occurring after a user has visited your website several times is of limited value.
Both these approaches are offered in limited fashion within the Google reporting platforms. The downside is that it is nigh on impossible to see exactly what weights are being applied to your data, and there is little in the way of guidance, or any testing mechanism, to check whether these are even remotely close to fitting your data accurately.
In practice, I would suggest that these models are likely a poor fit to your data and should be taken with a hefty pinch of salt. Even assuming (for example) the time decay model is a fair reflection of the overall effect, applying it uniformly to every channel is nonsensical. The brand recall effect of a Search click from a week ago will, in the vast majority of cases, be stronger than that of a Display impression from the same time.
In addition, these models are dropped on top of only successful paths; there is no consideration of ‘failure’ in their calculation. For any given simple model, suppose we see a couple of paths like this:
Then, given our available information, we would probably end up giving the search click and the display impression equal worth in the above two paths. After all, why not? There’s nothing in our data to suggest one is better than the other. Simple frequency of occurrence might just be down to user exposure, volume of spend, or some campaign aggregation effect.
If, however, you now introduce these new facts:
Then suddenly there’s an indication that the display impression may not share the same influence on conversion as the search click, and given that the user may well have converted without the impression, our impression should only receive a fraction of the value we would previously have awarded.
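The effect of including failures can be sketched with some entirely hypothetical path data. The channel names and outcomes below are invented for illustration; the point is only that a channel’s apparent worth collapses once its non-converting appearances are counted.

```python
# Hypothetical paths (channel sequences) with outcomes: 1 = converted, 0 = not.
paths = [
    (["display", "search"], 1),
    (["search", "display"], 1),
    (["display"], 0),
    (["display"], 0),
    (["display"], 0),
    (["search"], 1),
]

def conversion_rate(channel, paths):
    """Share of paths containing `channel` that ended in a conversion."""
    touched = [converted for steps, converted in paths if channel in steps]
    return sum(touched) / len(touched)

print(conversion_rate("search", paths))   # 1.0
print(conversion_rate("display", paths))  # 0.4
```

Looking only at the two successful paths, search and display appear interchangeable; adding the three failed display-only paths makes the asymmetry obvious.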
It’s clear then that a more accurate approach is to include, if not all, then at least a large sample of failed paths, so as to mitigate an overly generous attribution. By hand, it would be impossible to calculate the fractional weights we should apply to each campaign’s interactions and, within each campaign, to each position in the path, and so we turn to the power of computers to solve this algorithmically.
Solver algorithms sift through the vast quantities of path data, adjusting a series of variables that represent the positional weight of each activity. A ‘goodness of fit’ is measured by calculating the difference between each individual case’s predicted outcome (a fraction between 0 and 1) and the actual outcome (0 = not converted, 1 = converted).
The algorithm then iterates to minimise this difference across the entire data set; once a given time has elapsed, or a set number of iterations has passed with no improvement, the best solution is returned. We apply these final weights to give us the share of conversions for each campaign grouping.
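A toy version of this fitting loop can be written in a few lines. This is a sketch, not anyone’s production solver: one weight per channel (ignoring position for brevity), a sigmoid to keep predictions between 0 and 1, and plain gradient descent minimising the squared difference between prediction and outcome. The path data is again invented.

```python
import math

def fit_weights(paths, lr=0.5, iters=2000):
    """Fit one weight per channel so that sigmoid(sum of weights on a path)
    approximates the 0/1 outcome, via gradient descent on squared error."""
    channels = sorted({c for steps, _ in paths for c in steps})
    w = {c: 0.0 for c in channels}
    for _ in range(iters):
        grad = {c: 0.0 for c in channels}
        for steps, outcome in paths:
            z = sum(w[c] for c in steps)
            p = 1.0 / (1.0 + math.exp(-z))   # predicted outcome, 0..1
            err = p - outcome
            for c in steps:
                grad[c] += 2 * err * p * (1 - p)  # d(err^2)/dw
        for c in channels:
            w[c] -= lr * grad[c] / len(paths)
    return w

# Hypothetical data: search appears only on converting paths,
# display mostly on failing ones.
paths = [
    (["display", "search"], 1),
    (["search"], 1),
    (["display"], 0),
    (["display"], 0),
]
weights = fit_weights(paths)
```

Because the failed paths are in the training data, the solver drives the display weight below zero while search stays positive, which is the fractional down-weighting described above falling out of the optimisation automatically.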
All very clever.
However, for all the complexity of this approach, the solver simply returns the best solution to our pre-supposed model, which is itself a human construct. It is our assumption that positional values decay over time, and that this decay follows a power series, an exponential, or a simple 1/n decay in the number of steps n or hours t. If we’ve chosen a poor model to begin with, then we’ve merely achieved a very precise level of wrongness. For this reason, my preference is to avoid decay-curve-only modelling.
(Worth a view here is this presentation: https://www.youtube.com/watch?v=AZtLZn34IuY which brings together decay modelling with logistic regression.)