How Infusing Marketing with Data Science Can Lead to Better Marketing Decisions
By Faisal Wasswa

Faisal Wasswa is Lead Marketing Data Scientist at NatWest Group. He holds a BSc in Economics from Brunel University London and an MSc in Financial Economics from Birkbeck, University of London. Faisal’s specialisms include Econometrics, Python, R, Eviews, and SAS.
Here, Faisal explains how data science can optimise a company’s marketing strategy, making it more cost-efficient and targeted. He examines how data science has the power to measure each facet of marketing, from competitor analysis to media spend:

Why?
In the 1920s, John Wanamaker was famously quoted as saying that “half the money I spend on advertising is wasted; the trouble is I don’t know which half”. More recently, in 2013, Martin Sorrell (former CEO of WPP – the largest ad agency holding company in the UK) stated that his clients were wasting 15-25% of their advertising budgets. He just didn’t know which 15-25%.

Without knowing what works and what doesn’t, how can we optimise our spending and get the most out of every pound, euro, or dollar? Not knowing can lead to wasted advertising spend, as mentioned by Wanamaker and Sorrell. For FTSE 100 companies and brands, the average annual advertising spend is £100 million, which, at Sorrell’s estimate of 15-25%, implies £15-£25 million wasted per brand each year.

Market mix modeling (known colloquially as MMM) is an interdisciplinary field that combines concepts from Economics, Econometrics (Statistics), Marketing, and Data Science to address this challenge. A market mix modeler can answer:

- How is each media activity or campaign performing? The model provides a return on investment (ROI) for each marketing activity used. This is the backward lens.
- How should media spend be allocated across multiple media channels? A scenario-setting application (an Optimiser App) lets the marketeer forecast the best allocation and so optimise the impact of their marketing spend. This is the forward lens.
- What is the impact of my pricing (brand and competitor) – i.e., what is the price elasticity, and depending on the type of model done, how has this evolved over time?
- What is the impact of distribution?
- What is the impact of seasonality? Can I leverage seasonality and by how much?

How?

The background
A customer’s conversion journey is complex and holistic, and is influenced by several factors such as the economy, seasonal events, competitors, and our own actions as a brand. Market mix models condense this complex real world into a mathematical model. To effectively measure the impact of marketing efforts, a model needs to capture these touchpoints.

A robust market mix model includes all the drivers. It is this inclusion of disparate and diverse datasets which makes market mix models a golden source for evaluating the effectiveness of marketing spend.

The mathematical models

The model, in its simplest form, is a linear regression over time (a time-series regression) where a Key Performance Indicator (KPI) is explained by a base value and several drivers. It goes beyond correlation and aims to uncover causality, allowing us to infer the relationship between each driver and the KPI. In mathematical terms, the KPI is referred to as the ‘dependent’ or ‘explained’ variable, as it is determined by the drivers. The drivers, on the other hand, are referred to as the ‘independent’ or ‘explanatory’ variables, as they explain changes in the KPI and are not influenced by other variables or the KPI within the model. In the equation, the KPI is represented as Y and the explanatory variables as X. The model is typically estimated with ordinary least squares (OLS) linear regression to analyse the causal relationship between the drivers and the KPI.
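
As a minimal sketch of the regression described above, the model can be written as Y_t = base + b_1·X_1,t + ... + b_k·X_k,t + error_t and estimated by OLS. The example below solves the OLS normal equations in plain Python on made-up weekly data. The driver names (a TV audience metric and a price index) and every number are illustrative assumptions, not data or code from the models discussed in this article.

```python
def solve_linear_system(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    # Augment the matrix so row operations apply to both sides at once.
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    # Back-substitution on the upper-triangular system.
    x = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - tail) / m[i][i]
    return x


def fit_ols(drivers, kpi):
    """Return [base, b_1, ..., b_k] minimising the sum of squared errors."""
    rows = [[1.0] + list(obs) for obs in drivers]  # leading 1.0 is the base term
    k = len(rows[0])
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * y for r, y in zip(rows, kpi)) for i in range(k)]
    return solve_linear_system(xtx, xty)


# Hypothetical weekly observations: (TV audience metric, price index).
drivers = [(10, 1.0), (0, 1.2), (25, 0.9), (5, 1.1), (40, 1.0), (15, 1.3)]
# The KPI is generated exactly as 100 + 2*tv - 30*price, so OLS should
# recover those three numbers up to floating-point error.
kpi = [100 + 2 * tv - 30 * price for tv, price in drivers]

base, beta_tv, beta_price = fit_ols(drivers, kpi)
print(round(base, 3), round(beta_tv, 3), round(beta_price, 3))
```

In practice a statistics package would normally do the estimation; the hand-rolled solver is only there to keep the sketch dependency-free.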

Based on Binet & Field’s research on the long and short-term effects of media advertising, we developed two types of models to better understand its impact. One type of model focuses on estimating the short-term effects of marketing, while the other type estimates the long-term effects of media advertising. This approach allows us to provide insights into what works in the short term (three months or less) and what works in the long term (up to three years).

A natural question is whether to use digital attribution instead, since it requires less historical data and is faster to produce. Bluntly, that would be the wrong question. This isn’t a zero-sum game – it’s not one or the other, it’s both. MMM and digital attribution should coexist in a system, where digital attribution guides the operational performance (the day to day) and MMM provides the strategic guidance. The one caveat to be mindful of is the impact that privacy regulations will have on the feasibility of multi-touch or single-touch attribution going forward.

The type of data we include

To produce robust models, it is imperative that the data for each driver or variable spans a minimum of three years, preferably at a weekly level. The rationale behind this ‘magic number’ is best exemplified by seasonality. If our sales or key performance indicator (KPI) metric is influenced by events such as Christmas, having three instances of this event in our market mix models allows us to gauge the impact more accurately. For instance, if we observe two instances with positive effects and one with negative, we can deduce that, on average, Christmas has a positive impact on our KPI, and vice versa. As time series regression is centered on the estimation of the average impact of each activity, having fewer than three years of data can reduce the robustness of the models.

The data we collect for our models falls into three buckets:

● The first covers data that looks at the state of the wider economy, i.e., economic performance metrics (inflation, GDP, unemployment), COVID, seasonality (Christmas, Easter, Summer holidays) and temperature.

● The second covers data that answers what we are doing, i.e., what is our price, what is our promotion strategy and activation, what is our distribution and what is our marketing activity.

● The third covers data that captures what our competitors are doing, i.e., competitor pricing or competitor marketing.

First bucket (what is happening in the economy)

We collect and compile data on a weekly basis, which includes information on seasonality such as Christmas and Easter, macroeconomic data such as interest rates and unemployment rates, as well as data on the impact of COVID. Some of this data is readily available through various Python packages, while other data can be accessed from publicly-available websites and directly integrated into our modeling pipeline.
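
A seasonality flag of the kind described above can be built with the standard library alone. The sketch below is an assumption about how such a variable might be constructed, not the pipeline used here: it lays out three years of weekly dates (weeks are assumed to start on a Monday) and marks the week that contains 25 December.

```python
from datetime import date, timedelta


def weekly_calendar(start, weeks):
    """Return the Monday that starts each of `weeks` consecutive weeks."""
    first_monday = start - timedelta(days=start.weekday())
    return [first_monday + timedelta(weeks=i) for i in range(weeks)]


def christmas_flag(week_start):
    """Return 1 if the week beginning on `week_start` contains 25 December."""
    # 25 December always falls in the same calendar year as the Monday
    # of the week that contains it, so week_start.year is the right year.
    christmas = date(week_start.year, 12, 25)
    week_end = week_start + timedelta(days=6)
    return int(week_start <= christmas <= week_end)


# Three years of weekly data, matching the minimum history discussed earlier.
weeks = weekly_calendar(date(2021, 1, 4), weeks=156)
flags = [christmas_flag(w) for w in weeks]

# Each flagged week becomes one observed "instance" of the Christmas effect.
christmas_weeks = [w for w, f in zip(weeks, flags) if f]
print(christmas_weeks)
```

The same pattern extends to other fixed-date events; movable events such as Easter would need a date lookup, which is left out here.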

Second bucket (what are we doing) – the 4Ps

In marketing, the 4Ps (Price, Product, Place, Promotions) are essential for effective promotion. Similarly, our models must include these data assets to accurately capture the KPI drivers.

● Price (pricing and product promotions – for banking this is interest rates, switcher offers (£125 if a customer switches), etc.)

● Product (how good a product is – the type of current accounts we offer, the type of mortgages we offer; any changes to the product need to be accounted for in the models)

● Place (distribution, for example branch openings and closures, regions of the UK that get the same product)

● Promotions (this is the advertising/marketing element such as ad media spend). This includes all our marketing activity: above the line (ATL), below the line (BTL), and digital. It is crucial that we gather both media spend and audience metrics when collecting data. The audience engagement metrics are used within the models to assess incrementality, while media spend is used for the ROI calculations. This is because, at the heart of our market mix models, we are assessing the relationship between the public and the take-up of our KPI. For example, for TV, we would use Television Ratings (TVRs), for radio we would use Gross Rating Points (GRPs), and so forth. Although given different names, each media format will have data metrics that indicate how much attention a particular campaign or activity received. In most cases, we can access this data via our media agencies or connect directly to the platforms to extract the data via application programming interfaces (APIs).
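
To illustrate how the two kinds of media data fit together, the sketch below computes an ROI for a single channel. It assumes a common convention that the article does not spell out: the incremental KPI is the fitted coefficient multiplied by the audience metric, this is converted to money with a value per unit, and the result is divided by spend. The function name, the coefficient, the value per unit and all figures are hypothetical.

```python
def channel_roi(coefficient, audience_metric, value_per_unit, spend):
    """Incremental value attributed to a channel divided by its total spend."""
    # Incrementality comes from the audience metric, not from spend.
    incremental_units = sum(coefficient * a for a in audience_metric)
    incremental_value = incremental_units * value_per_unit
    # Spend only enters in the denominator of the ROI.
    return incremental_value / sum(spend)


tv_ratings = [120.0, 0.0, 80.0, 200.0]            # hypothetical weekly TVRs
tv_spend = [60_000.0, 0.0, 40_000.0, 100_000.0]   # hypothetical weekly spend (£)

roi = channel_roi(
    coefficient=2.5,        # hypothetical: KPI units gained per TVR
    audience_metric=tv_ratings,
    value_per_unit=300.0,   # hypothetical: £ value of one KPI unit
    spend=tv_spend,
)
print(roi)  # 1.5, i.e. £1.50 of value per £1 spent in this made-up case
```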

Third bucket (what are competitors doing)

Equally crucial are the actions of our competitors. This becomes even more vital in the case of banking services, where the product offerings are highly homogeneous. The banking sector in the UK is known for its intense competition, with each competitor employing their own pricing, product, and marketing strategies to secure a larger share of the market. Exclusion of this data could potentially lead to omitted variable bias.

IN PRACTICE

Extract, Transform and Load (ETL)

We automated our Extract, Transform, and Load (ETL) process through the collaborative efforts of our skilled Data Engineers and Data Scientists. This has streamlined the extraction of data from disparate sources, as well as the cleaning, enrichment and standardisation of the data, making it ready for modeling. With automation, we can efficiently handle large volumes of data, resulting in time and resource savings. Our automated ETL process ensures that the data is clean, consistent, and analysis-ready, leaving our Data Scientists to focus on building models and generating actionable insights.

Quality Assurance (QA)

Ensuring the accuracy and reliability of data is crucial in any data pipeline. To achieve this, we leverage data science techniques to automate the process of quality assurance (QA). First, we implement checks to verify that the processed data matches our expected results, such as comparing the sum of raw input to the sum of raw output. Additionally, we use visual cues and exploratory data analysis (EDA) techniques to thoroughly inspect the data. Python offers several modules that expedite the creation of PowerPoints from the data pipeline, enabling us to easily share snapshots of data assets with internal and external stakeholders. This early engagement with stakeholders also helps solidify the insights they can expect from the models.
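
The sum check mentioned above can be sketched as a small reconciliation test. The example assumes a hypothetical transformation that aggregates raw spend rows to weekly totals; the row layout, column names and function names are illustrative only.

```python
import math


def aggregate_weekly(rows, column):
    """Hypothetical transform step: total `column` per week label."""
    totals = {}
    for row in rows:
        totals[row["week"]] = totals.get(row["week"], 0.0) + row[column]
    return [{"week": week, column: value} for week, value in sorted(totals.items())]


def reconcile_totals(raw_rows, processed_rows, column, rel_tol=1e-9):
    """Check that a column sums to the same total before and after processing."""
    raw_total = sum(row[column] for row in raw_rows)
    processed_total = sum(row[column] for row in processed_rows)
    matches = math.isclose(raw_total, processed_total, rel_tol=rel_tol)
    return matches, raw_total, processed_total


raw = [
    {"week": "2023-W01", "spend": 100.0},
    {"week": "2023-W01", "spend": 250.0},
    {"week": "2023-W02", "spend": 75.5},
]
processed = aggregate_weekly(raw, "spend")

ok, raw_total, processed_total = reconcile_totals(raw, processed, "spend")
if not ok:
    raise ValueError(f"spend mismatch: raw={raw_total}, processed={processed_total}")
```

A check like this catches rows dropped or duplicated during joins and aggregations, though it will not catch errors that happen to cancel out.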

In collaboration with our data engineers, we’ve developed data products that serve as reusable assets across the bank. Once the data is prepared, it is made accessible through our data visualisation platform, enabling a transparent culture that enhances the quality assurance process. This openness to the wider bank increases the chances of identifying and correcting data errors, as more eyes can review the data.

Data transformations

A prerequisite within market mix models is the transformation of our media data metrics to account for the nonlinear relationship between media advertising and our Key Performance Indicators (KPIs), as observed in real-life scenarios. In other words, in most, if not all, media channels, there is a saturation point where each additional pound we spend will have a diminishing impact on our KPI.

This phenomenon can be likened to the concept of diminishing returns in economics, where the media activity we engage in reaches a point of diminishing effectiveness. As an illustrative example, I live in Rugby, a town famous for the sport, but with a relatively modest population. If I were to place a press ad in the Rugby Advertiser, a publication that primarily circulates to the local population, I would eventually reach a point where the number of additional customers generated by the ad would plateau, regardless of the amount of money invested.
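
The saturation effect can be illustrated with a simple curve. The article does not state which functional form is used, so the negative exponential below, along with its `ceiling` and `scale` parameters and the spend figures, is an assumption chosen purely for illustration.

```python
import math


def saturating_response(spend, ceiling, scale):
    """Response rises with spend but flattens out as it approaches `ceiling`."""
    return ceiling * (1.0 - math.exp(-spend / scale))


spend_levels = [0, 1_000, 2_000, 3_000, 4_000]  # hypothetical spend (£)
responses = [saturating_response(s, ceiling=500.0, scale=1_500.0) for s in spend_levels]

# Gain from each extra £1,000: every step adds less than the one before it.
marginal_gains = [after - before for before, after in zip(responses, responses[1:])]
for spend, gain in zip(spend_levels[1:], marginal_gains):
    print(f"up to £{spend}: +{gain:.1f} KPI units")
```

Here `ceiling` plays the role of the plateau in the Rugby Advertiser example: no amount of spend pushes the response above it.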

An additional aspect that needs to be incorporated into our media metrics is the concept of memorability/adstock/decay carryover. This captures the lasting impression and presence that creative advertisements leave in the subconscious mind, also known as mental availability, even after they have stopped airing or been activated. This transformation allows us to better understand the true impact and effectiveness of our advertising beyond their initial airing or launch.
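
One widely used way to express this carryover is geometric adstock, where each week retains a fixed fraction of the previous week's accumulated effect. The article does not say which adstock form is used, so the sketch below and its `decay` parameter are an assumption.

```python
def geometric_adstock(series, decay):
    """Carry a fraction `decay` of last week's accumulated effect into this week."""
    carried = 0.0
    adstocked = []
    for value in series:
        carried = value + decay * carried
        adstocked.append(carried)
    return adstocked


# A single burst of 100 rating points followed by three weeks off air.
print(geometric_adstock([100, 0, 0, 0], decay=0.5))  # [100.0, 50.0, 25.0, 12.5]
```

A higher `decay` means the campaign is remembered for longer; a `decay` of 0 reduces to the raw series.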

Modelling

In producing our modelling framework, we prioritised speed, reproducibility, consistency (to reduce the biases of individual Data Scientists), and governance. To this end, we developed a customised Python-based auto modeler tool (AMT) that generates market mix models based on predefined benchmarks. With thresholds in place, the AMT allows us to iterate through a wide range of adstock and diminishing returns parameters for media activity to identify the optimal model. Even with modest compute power, we can iterate through 10,000 models within five minutes, enhancing the robustness of the final model. This capability provides valuable insights to stakeholders, helping them understand the long-term memorability of marketing activities or campaigns, as well as identifying the optimal diminishing returns for each media channel.
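
The AMT itself is not shown in this article, so the following is only a toy illustration of the idea of iterating over adstock and diminishing returns parameters. It assumes geometric adstock, a negative exponential saturation curve, a single media driver, and selection by the lowest sum of squared errors; the real tool's benchmarks and thresholds are not reproduced. The KPI is synthetic and built with a known decay of 0.5 and scale of 100, so the search has a right answer to find.

```python
import math
from itertools import product


def adstock(series, decay):
    """Geometric carryover of past media activity."""
    carried, out = 0.0, []
    for value in series:
        carried = value + decay * carried
        out.append(carried)
    return out


def saturate(series, scale):
    """Diminishing returns: maps activity onto a 0-to-1 response."""
    return [1.0 - math.exp(-x / scale) for x in series]


def fit_sse(x, y):
    """Fit y = a + b*x by least squares; return (sum of squared errors, a, b)."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    b = sxy / sxx
    a = mean_y - b * mean_x
    sse = sum((yi - a - b * xi) ** 2 for xi, yi in zip(x, y))
    return sse, a, b


media = [0, 120, 80, 0, 0, 200, 150, 0, 60, 0, 0, 90]  # hypothetical weekly ratings
kpi = [50 + 40 * v for v in saturate(adstock(media, 0.5), 100.0)]  # synthetic KPI

decays = [0.1, 0.3, 0.5, 0.7, 0.9]
scales = [50.0, 100.0, 200.0]

results = []
for decay, scale in product(decays, scales):
    transformed = saturate(adstock(media, decay), scale)
    sse, base, beta = fit_sse(transformed, kpi)
    results.append((sse, decay, scale))

best_sse, best_decay, best_scale = min(results)  # lowest error wins
print(best_decay, best_scale)
```

This grid has 15 candidates; widening the grids and adding channels is what pushes the count into the thousands.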

Validation of models

The validation process involves adhering to standard econometric principles to ensure model interpretability and accuracy. We use well-known metrics such as R-Squared (R2 – coefficient of determination) and MAPE (mean absolute percentage error) to assess model performance. The R-Squared (R2) helps us understand the proportion of variance in the dependent variable explained by the model, while MAPE calculates the average percentage difference between predicted and actual values. These metrics provide a quantitative measure of the model’s predictive performance.
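
Both metrics are short enough to write out directly. The sketch below uses the standard definitions on made-up actual and predicted values; the numbers are purely illustrative.

```python
def r_squared(actual, predicted):
    """Share of the variance in `actual` that the predictions explain."""
    mean_actual = sum(actual) / len(actual)
    ss_res = sum((a - p) ** 2 for a, p in zip(actual, predicted))
    ss_tot = sum((a - mean_actual) ** 2 for a in actual)
    return 1.0 - ss_res / ss_tot


def mape(actual, predicted):
    """Mean absolute percentage error; undefined if any actual value is zero."""
    errors = [abs((a - p) / a) for a, p in zip(actual, predicted)]
    return 100.0 * sum(errors) / len(errors)


actual = [100, 200, 300, 400]       # hypothetical weekly KPI
predicted = [110, 190, 310, 380]    # hypothetical model output

print(round(r_squared(actual, predicted), 3))  # 0.986
print(round(mape(actual, predicted), 2))       # 5.83 (per cent)
```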

For each driver, we assess its statistical significance by examining its t-statistic and p-value. This helps us determine if its estimated coefficient is statistically different from zero (statistically significant). We layer on business context to evaluate the magnitude and sign of its coefficient, to ensure these align with the expected relationship between variables.
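
For a single driver, the t-statistic is the estimated coefficient divided by its standard error. The sketch below works this through for a one-driver regression on made-up data. The p-value is left out because it needs the t distribution from a statistics package; as a rough guide, an absolute t-statistic above about 2 corresponds to significance at the 5% level in reasonably large samples.

```python
import math


def slope_t_statistic(x, y):
    """Fit y = a + b*x; return (b, standard error of b, t-statistic of b)."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    b = sxy / sxx
    a = mean_y - b * mean_x
    sse = sum((yi - a - b * xi) ** 2 for xi, yi in zip(x, y))
    residual_variance = sse / (n - 2)  # two estimated parameters: a and b
    se_b = math.sqrt(residual_variance / sxx)
    return b, se_b, b / se_b


driver = list(range(1, 11))  # hypothetical driver values
# Hypothetical KPI: roughly 2 units per unit of driver, plus alternating noise.
kpi = [2 * x + (1 if i % 2 == 0 else -1) for i, x in enumerate(driver)]

coefficient, std_error, t_stat = slope_t_statistic(driver, kpi)
print(round(coefficient, 3), round(std_error, 3), round(t_stat, 1))
```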

In assessing the ROIs, we benchmark the results against previous outcomes and compare them with industry standards. We also check that all key drivers are accounted for; if any are missing, we investigate further to ensure there is a clear rationale. Additionally, for the estimated diminishing returns of each media channel, we examine saturation points to ensure that these make sense.

An essential final stage of the validation process is peer review. Our Data Scientists review each other’s models’ methodology, assumptions, and results to identify potential biases or limitations. This process enhances the credibility of our findings and ensures that the models are reliable and accurate for informed decision-making.

Overall, by combining quantitative metrics and qualitative assessments, we incorporate econometric best practices, business context and peer review to ensure that the robustness of the models is maximised.

Initial presentation of models

As we actively collaborate with our business stakeholders throughout the entire modelling phase, our philosophy is to share initial models with them at the earliest opportunity. This approach increases transparency and gives stakeholders an opportunity to pose the important questions they would like answered, furthering our commitment to delivering actionable insights that address their needs.

Equipped with insights and contextual knowledge obtained from stakeholders, our models go through a final modeling phase, where variables are included or excluded based on their relevance. This approach combines human expertise with machine learning, resulting in modeling outcomes that make sense and provide insights that are relevant to the business.

Final presentation of models

In the final stage, after the econometric models have been created, the results are output in two formats: PowerPoint presentations (decks) and data visualisation dashboards. Output of the decks is automated via Python scripts as much as possible. These include automated charts, tables, and key statements, providing a timestamped snapshot of the outputs and insights that can be easily shared with internal and external stakeholders. One drawback, however, is that they are not dynamic. To address this, we also provide results in a data visualisation dashboard. This allows our internal stakeholders to navigate and explore the results in more detail, providing them with a self-serve option and access to the granular data behind the tables and charts. With the advancement of AI Language Models, we aim to further automate the generation of insights in the decks, increasing the efficiency of this process even further.

DYNAMIC AND DATA-DRIVEN: THE POWER OF MARKET MIX MODELS

One of the key advantages of market mix models is that they empower stakeholders to forecast and optimise their marketing media budgets. At the click of a button, marketers gain valuable insights, allowing them to make data-driven decisions on how to allocate media spend across different channels. To facilitate this process, we developed an application that leverages our web development skills and uses a Platform-as-a-Service (PaaS) to host our optimisation models (embedded with the response curves, or diminishing returns curves, estimated from the market mix models).

This approach involves solving a constrained non-linear optimisation problem. Currently we use Python packages to find the optimal solution in scenarios such as:

1. When the budget controller has allocated £X million for the next year and we need to determine the most effective allocation of this spend across media channels to optimise revenue.

2. When the budget for the next year has already been set but an additional £Y million is available for spend, and we need to identify the optimal allocation for maximum impact.

3. When there is a desire to spend an additional £Z million in TV advertising, but the current spend is already £T million, and a cost-benefit analysis is needed to understand the potential impact of the additional spend.
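
The first two scenarios above can be sketched without a solver. The example below swaps the optimisation packages mentioned earlier for a simple greedy rule: the budget is handed out in fixed steps, each step going to the channel with the highest marginal gain. When every response curve shows diminishing returns (is concave), this is optimal on the step grid; with per-channel limits or other constraints, a proper solver would be the usual choice. The curve shape, channel parameters and budgets (in £ thousands) are all hypothetical.

```python
import math


def response(spend, ceiling, scale):
    """Hypothetical diminishing returns curve: revenue rises towards `ceiling`."""
    return ceiling * (1.0 - math.exp(-spend / scale))


def total_response(spend, channels):
    """Total revenue across channels for a given spend plan."""
    return sum(response(spend[name], *channels[name]) for name in channels)


def allocate(channels, budget, step, current=None):
    """Greedily hand out `budget` in chunks of `step` on top of `current` spend."""
    spend = dict(current) if current else {name: 0 for name in channels}
    for _ in range(budget // step):

        def marginal_gain(name):
            ceiling, scale = channels[name]
            before = response(spend[name], ceiling, scale)
            return response(spend[name] + step, ceiling, scale) - before

        best = max(channels, key=marginal_gain)
        spend[best] += step
    return spend


# Hypothetical (ceiling, scale) per channel, all in £ thousands.
channels = {"tv": (500.0, 4000.0), "radio": (200.0, 1500.0), "digital": (300.0, 2500.0)}

# Scenario 1: allocate a fresh budget of £10m across channels.
plan = allocate(channels, budget=10_000, step=100)

# Scenario 2: an extra £2m becomes available on top of the existing plan.
top_up = allocate(channels, budget=2_000, step=100, current=plan)

print(plan, round(total_response(plan, channels), 1))
print(top_up, round(total_response(top_up, channels), 1))
```

The third scenario is a direct comparison of `total_response` before and after adding the extra TV spend to an existing plan.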

Our living, breathing optimisation apps enable us to constantly adapt and optimise our media strategies, ensuring that our marketing efforts are data-driven and yield the best possible outcomes. This helps us drive wasted advertising spend towards 0%.