eeghor / mta

Multi-Touch Attribution

Python 100.00%

marketing markov-model shapley probabilistic-models

mta's Introduction

mta

Multi-Touch Attribution. Find out which channels contribute most to user conversion.

Models

This package contains implementations of the following Multi-Touch Attribution models:

Shapley
Markov
So-called Simple Probabilistic Model by Shao and Li
Bagged Logistic Regression by Shao and Li
Additive Hazard (Survival)

In addition, some popular heuristic “models” are included, specifically

First Touch
Linear
Last Touch
Time Decay
Position Based
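
To make the simplest of these heuristics concrete, here is a minimal sketch of First Touch, Last Touch and Linear attribution using their textbook definitions. The function name `heuristic_attribution` and the `(channels, conversions)` input format are assumptions made for illustration; they are not necessarily how this package implements the heuristics.

```python
from collections import defaultdict


def heuristic_attribution(paths, rule):
    """Split conversions across channels with a simple heuristic.

    paths: list of (channels, conversions) pairs (hypothetical input format).
    rule:  "first", "last" or "linear".
    """
    credit = defaultdict(float)
    for channels, conversions in paths:
        if not channels:
            continue
        if rule == "first":
            # all credit goes to the first touchpoint
            credit[channels[0]] += conversions
        elif rule == "last":
            # all credit goes to the last touchpoint
            credit[channels[-1]] += conversions
        elif rule == "linear":
            # credit is split equally across every touchpoint
            share = conversions / len(channels)
            for channel in channels:
                credit[channel] += share
        else:
            raise ValueError(f"unknown rule: {rule}")
    return dict(credit)


paths = [(["alpha", "beta", "gamma"], 3), (["beta", "alpha"], 2)]
first = heuristic_attribution(paths, "first")
last = heuristic_attribution(paths, "last")
linear = heuristic_attribution(paths, "linear")
```

Time Decay and Position Based follow the same pattern but need extra parameters (a decay rate, or the weights given to the first and last touchpoints), so they are left out of this sketch.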

Included Data

The package comes with the same test data set as the R package ChannelAttribution. It contains 10,000 rows of customer journeys across 12 channels: alpha, beta, delta, epsilon, eta, gamma, iota, kappa, lambda, mi, theta and zeta.

These are conversion aggregations by path. Suppose there’s a path (customer journey)

a > b > c

with total_conversions equal to 2 and total_null equal to 5. This means that we recorded 2 customer journeys

a > b > c > (conversion)

and 5 customer journeys

a > b > c > (null)
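
The aggregation above can be undone with a few lines of Python. This is only a sketch: the helper name `expand_path` and the `" > "` separator are assumptions based on the example path, not part of the package's API.

```python
def expand_path(path, total_conversions, total_null, sep=" > "):
    """Turn one aggregated row back into individual journeys."""
    channels = path.split(sep)
    # one journey ending in (conversion) per recorded conversion
    converted = [channels + ["(conversion)"] for _ in range(total_conversions)]
    # one journey ending in (null) per recorded non-conversion
    lost = [channels + ["(null)"] for _ in range(total_null)]
    return converted + lost


journeys = expand_path("a > b > c", total_conversions=2, total_null=5)
```

For the example row this yields 7 journeys in total, 2 ending in (conversion) and 5 ending in (null).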

There’s an option to generate timestamp data if you want to use the Additive Hazard model (the only model that explicitly incorporates exposure times).

References

Nisar and Yeung (2015) - Purchase Conversions and Attribution Modeling in Online Advertising: An Empirical Investigation
Shao and Li (2011) - Data-driven Multi-touch Attribution Models
Dalessandro et al (2012) - Causally Motivated Attribution for online Advertising
Cano-Berlanga et al (2017) - Attribution models and the Cooperative Game Theory
Ren et al (2018) - Learning Multi-touch Conversion Attribution with Dual-attention Mechanisms for Online Advertising
Zhang et al (2014) - Multi-Touch Attribution in Online Advertising with Survival Theory
Geyik et al (2014) - Multi-Touch Attribution Based Budget Allocation in Online Advertising

mta's Issues

'ABCMeta' object is not subscriptable

Hey Eeghor,

Hope you're doing well!

First, thank you for this package; it has been really useful for our company. However, since your last commits, we get an error when importing the library:
TypeError: 'ABCMeta' object is not subscriptable

Do you know how we can handle this error? The code is running on Google Colab with Python 3.7.12.

Thank you in advance,
Thibault
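
A note on the likely cause, offered as a guess since the offending line is not shown here: on Python versions before 3.9, subscripting an abstract base class from collections.abc (for example collections.abc.Sequence[str]) raises exactly this TypeError, while the generic aliases in the typing module work on 3.7. The function below is a hypothetical example, not code from the package; it only shows the annotation style that stays importable on older interpreters.

```python
from typing import Dict, Sequence


def count_channels(paths: Sequence[str], sep: str = " > ") -> Dict[str, int]:
    """Count how often each channel appears across path strings.

    Annotated with typing.Sequence / typing.Dict, which can be
    subscripted on Python 3.7, unlike the collections.abc classes.
    """
    counts: Dict[str, int] = {}
    for path in paths:
        for channel in path.split(sep):
            counts[channel] = counts.get(channel, 0) + 1
    return counts


counts = count_channels(["a > b", "a"])
```

If changing the annotations is not an option, running on Python 3.9 or later should also avoid the error, assuming this is indeed the cause.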

How can these models be validated?

I saw that you reference articles in the README and comments, for example http://wnzhang.net/share/rtb-papers/data-conv-att.pdf.
How can you validate an attribution model like this?
The authors suggest that "a good MTA model should have a high degree of accuracy in correctly classifying a user as positive (with a conversion action) or negative (without a conversion action)".
What does this mean in practice? Attribution models look at all purchases, and attribute those purchases to different channels. They do not predict number of purchases. How do these authors treat these interpretive models as predictive models for this form of validation?
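
One possible reading of the quoted criterion, sketched below under loud assumptions: fit a model on part of the journeys, use it to score held-out journeys as converting or not, and report classification accuracy. The scorer here (average per-channel conversion rate with a 0.5 threshold) is a toy stand-in chosen for brevity. It is not the procedure from the paper and not something this package provides, and all function names are hypothetical.

```python
from collections import defaultdict


def fit_channel_rates(journeys):
    """Estimate P(conversion | channel was seen) from (channels, converted) pairs."""
    positives = defaultdict(int)
    totals = defaultdict(int)
    for channels, converted in journeys:
        for channel in set(channels):
            totals[channel] += 1
            positives[channel] += int(converted)
    return {channel: positives[channel] / totals[channel] for channel in totals}


def predict(rates, channels, threshold=0.5):
    """Classify a journey as converting if its mean channel rate reaches the threshold."""
    known = [rates[c] for c in set(channels) if c in rates]
    if not known:
        return False
    return sum(known) / len(known) >= threshold


def accuracy(rates, journeys, threshold=0.5):
    """Share of journeys whose predicted label matches the observed one."""
    hits = sum(predict(rates, ch, threshold) == conv for ch, conv in journeys)
    return hits / len(journeys)


# toy data, made up for illustration only
train = [(["a", "b"], True), (["a"], True), (["b", "c"], False),
         (["c"], False), (["a", "c"], True), (["b"], False)]
held_out = [(["a"], True), (["c"], False), (["a", "b"], True), (["b", "c"], False)]

rates = fit_channel_rates(train)
holdout_accuracy = accuracy(rates, held_out)
```

The point is only that the same fitted quantities used for attribution can also score journeys, which is what makes a predictive check possible.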

Goodness of fit/ model accuracy

Which way would be best suited to compare the models? I'm looking for a way to compare all models and use only the best-performing one, i.e. the one with the highest model accuracy. I was thinking of using AIC to compare the dynamic models.
Is there a way to calculate the model fit within the functions, or would you recommend calculating it manually?

lists_ and flags_ in logistic_regression

Hi,
Maybe I haven't understood the entire thing, but in the method logistic_regression, we have:

    # expand the original data set into a feature matrix
    lists_ = []
    flags_ = []

    for i, row in enumerate(self.data.itertuples()):

        for _ in range(row.total_conversions):
            lists_.append({c: 1 for c in row.path})
            flags_.append(1)

        for _ in range(row.total_null):
            lists_.append({c: 1 for c in row.path})
            flags_.append(0)

When I do the inspection with:

print(lists_[:10], end = '\n')
print(flags_[:10], end = '\n')

I got:

# lists_[:10]
[{'e': 1, 't': 1, 'a': 1, ' ': 1, '>': 1, 'i': 1, 'o': 1, 'l': 1, 'p': 1, 'h': 1}, 
{'e': 1, 't': 1, 'a': 1, ' ': 1, '>': 1, 'i': 1, 'o': 1, 'l': 1, 'p': 1, 'h': 1}, 
{'e': 1, 't': 1, 'a': 1, ' ': 1, '>': 1, 'i': 1, 'o': 1, 'l': 1, 'p': 1, 'h': 1}, 
{'e': 1, 't': 1, 'a': 1, ' ': 1, '>': 1, 'i': 1, 'o': 1, 'l': 1, 'p': 1, 'h': 1}, 
{'i': 1, 'o': 1, 't': 1, 'a': 1, ' ': 1, '>': 1}, 
{'i': 1, 'o': 1, 't': 1, 'a': 1, ' ': 1, '>': 1}, 
{'i': 1, 'o': 1, 't': 1, 'a': 1, ' ': 1, '>': 1}, 
{'i': 1, 'o': 1, 't': 1, 'a': 1, ' ': 1, '>': 1}, 
{'i': 1, 'o': 1, 't': 1, 'a': 1, ' ': 1, '>': 1}, 
{'i': 1, 'o': 1, 't': 1, 'a': 1, ' ': 1, '>': 1}]

# flags_[:10]
[1, 0, 0, 0, 1, 1, 0, 0, 0, 0]

Finally by calling show(), I got:

linreg
a  0.449081
g  0.003764
h  0.454263
k  0.085471
n  0.003918
t  0.000376
z  0.003128

I am wondering, shouldn't I get something like:

linreg
alpha  0.449081
beta  0.003764
...

rather than splitting each channel name, such as alpha, into the characters a, l, p, h, etc.?

Thank you.
Yours,
Xiaotian
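
The output above is what you get when a path is iterated as a string, since iterating a string yields single characters. A minimal sketch of a possible fix, assuming row.path is a string of channel names joined by ">" (as the printed keys suggest); the helper name `path_to_features` is hypothetical:

```python
def path_to_features(path, sep=">"):
    """Map a path string to {channel_name: 1}, one key per channel."""
    return {
        channel.strip(): 1
        for channel in path.split(sep)
        if channel.strip()
    }


# iterating the string directly yields characters, including ' ' and '>'
buggy = {c: 1 for c in "alpha > beta"}

# splitting on the separator first yields whole channel names
fixed = path_to_features("alpha > beta")
```

With this change, `{c: 1 for c in row.path}` would become `path_to_features(row.path)`. If the path has already been split into a list elsewhere in the pipeline, the original comprehension is fine as it is.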

ROI implementation

Hi, very good job! I saw that you implemented ROI. How did you choose which touchpoints belong to which 'abc' channels? Was it random? For example, why is 'alpha' on channel 'b'?

show(self) function issue in mta.py

Hi,
In mta.py, there is a function:

    def show(self):
        """
        show simulation results
        """
        res = pd.DataFrame.from_dict(mta.attribution)
        print(res)

in which the line
res = pd.DataFrame.from_dict(mta.attribution)
should actually be
res = pd.DataFrame.from_dict(self.attribution)

Yours,
Xiaotian
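
To illustrate why this matters, here is a small stand-alone sketch. The class `Attribution` and its two methods are hypothetical and use a plain dict instead of pandas; they only mimic the pattern in question. A method that refers to a module-level name such as `mta` works only if an object with exactly that name happens to exist, whereas `self` always refers to the instance the method was called on.

```python
class Attribution:
    def __init__(self, attribution):
        self.attribution = attribution

    def show_buggy(self):
        # relies on a global variable literally named `mta` being defined
        return mta.attribution  # noqa: F821

    def show_fixed(self):
        # always reads from the instance the method was called on
        return self.attribution


# the instance is deliberately not named `mta`
model = Attribution({"linear": {"alpha": 0.6, "beta": 0.4}})

try:
    model.show_buggy()
    buggy_error = None
except NameError as exc:
    buggy_error = exc

fixed_result = model.show_fixed()
```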

Conversion value calculation and Total Null value

Hi,
You did an amazing job implementing MTA.
However, I don't understand how the total conversion value and the total null value are calculated.

Could you please explain how they are computed, or share any equations?

Thank you.
Divyesh
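
Going by the README's description of the included data, total_conversions and total_null are simply counts per path: the number of recorded journeys along that path that ended in a conversion, and the number that did not. A minimal sketch of that aggregation follows; the function name `aggregate` and the `(channels, converted)` input format are assumptions for illustration. A monetary conversion value, if your data has one, would need a per-journey revenue figure that is not modelled here.

```python
from collections import Counter


def aggregate(journeys):
    """Count converting and non-converting journeys for each distinct path.

    journeys: list of (channels, converted) pairs (hypothetical input format).
    """
    conversions = Counter()
    nulls = Counter()
    for channels, converted in journeys:
        path = " > ".join(channels)
        if converted:
            conversions[path] += 1
        else:
            nulls[path] += 1
    # Counter returns 0 for paths it has never seen
    return [
        {"path": p, "total_conversions": conversions[p], "total_null": nulls[p]}
        for p in sorted(set(conversions) | set(nulls))
    ]


journeys = (
    [(["a", "b", "c"], True)] * 2
    + [(["a", "b", "c"], False)] * 5
    + [(["a"], False)]
)
rows = aggregate(journeys)
```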