Serving Email Metric Predictions

Chen Song | 5/23/2022

Table of Contents

1. Data
2. Model & Algorithm
3. Use Cases
4. Constraints & Limitations
5. Conclusions
6. Future Considerations


Finding the ideal length for an email campaign message can have a significant impact on the target audience’s engagement rates. By using the most effective character counts, clients of email service providers can achieve higher audience engagement. Here, we address the problem by constructing a machine learning model that provides in-session, real-time ML predictive analytics on the key features within an email campaign editor (in this case, the text editor) and serves email-based metrics on the character counts for a particular type of email in a particular industry.

The model was trained on real-world proprietary email data and important email campaign features. Heuristic or synthetic variables were incorporated into the training data generation using statistical methodologies. The character count and engagement metrics are then served within milliseconds, inside the workflow of an email text editor, and the model derives the optimal recommended character length based on the selected target variable. The independent variables include industry type, campaign type, the sentiment of the email, and the number of links in the email.
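To make the inputs concrete, here is a minimal sketch of what a single draft's feature record might look like. The class and field names are illustrative assumptions, not the production schema:

```python
from dataclasses import dataclass

@dataclass
class CampaignInputs:
    """Illustrative input record for the character count model.
    Field names are assumptions, not the production API."""
    industry: str       # e.g. "real_estate" (one of the 9 industries)
    campaign_type: str  # e.g. "promotional", "transactional", "webinar"
    sentiment: str      # one of the 8 tone labels
    url_cnt: int        # number of links in the email body
    body_text: str      # raw email content, later embedded

draft = CampaignInputs("real_estate", "promotional", "friendly", 3,
                       "Open house this Saturday ...")
```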

I. Data

1.1. Data Source & Data Set

As there are no publicly available datasets directly related to email content across industries and target variables (open rate, click-through rate, unsubscribe rate, bounce rate, etc.), we collected email campaigns from 9 different industries: academic and education, energy, entertainment, finance and banking, healthcare, hospitality, real estate, retail, and software and technology. We also hand-labeled these emails with the campaign type, the sentiment of the email, and the number of links in the email as additional features. This is a Loxz Digital proprietary dataset used to train models in-house.

For the target variable, we utilize the 2022 email marketing benchmarks from Campaign Monitor. To fit the benchmarks to our email campaign dataset, we use statistical methodologies to create distributions based on the benchmarks and then normalize those distributions.

1.2 Feature engineering

Along with the word embedding vectors of email content text bodies, we extract several other features from the email campaigns:

  • URL_cnt: the number of links in the email campaign.
  • Tone_sentiment: the tone of the email campaign. To generate the tones, we utilize a pre-trained tiny BERT model and add a dense layer to the transformed vectors to generate the tone labels (Figure 1). A softmax dense layer is applied to the output vectors to produce 8 different tones: analytical, casual, confident, friendly, joyful, optimistic, respectful, and urgent.
  • The dependent variables are open rate, click-through rate, unsubscribe rate, and bounce rate. Users select the dependent variable they want to optimize, and the model makes its prediction based on that selection. The benchmarks come from the Campaign Monitor website. Each model we prepare for an ESP will have its own baseline, so it’s important to keep that in mind. To fit the benchmarks into the training dataset, we generate n random numbers from a standard normal distribution (where n is the number of emails in a particular industry) and then normalize them according to the domain knowledge of experts on our team.
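The benchmark-fitting step above can be sketched as follows: draw n standard-normal values, then rescale them around an industry benchmark. The function name and the specific benchmark mean/spread are illustrative assumptions, not figures from the Campaign Monitor report:

```python
import numpy as np

rng = np.random.default_rng(42)

def simulate_engagement_rates(n, benchmark_mean, benchmark_std):
    """Draw n standard-normal samples and rescale them so the simulated
    rates center on the industry benchmark (illustrative sketch)."""
    z = rng.standard_normal(n)                  # n draws from N(0, 1)
    rates = benchmark_mean + benchmark_std * z  # shift/scale to benchmark
    return np.clip(rates, 0.0, 1.0)             # keep rates in [0, 1]

# Hypothetical industry benchmark: 21% open rate with a 5-point spread.
open_rates = simulate_engagement_rates(500, 0.21, 0.05)
```

In practice the final normalization would be tuned by domain experts, as described above, rather than by a fixed clip.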

Figure 1. Feature engineering of sentiment tones
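The tone head described above — a dense layer plus softmax over 8 tone classes applied to a pooled BERT embedding — can be sketched in plain NumPy. The embedding and weights here are random stand-ins, since the pre-trained tiny BERT model and learned weights are not reproduced:

```python
import numpy as np

TONES = ["analytical", "casual", "confident", "friendly",
         "joyful", "optimistic", "respectful", "urgent"]

def softmax(x):
    e = np.exp(x - x.max())  # subtract max for numerical stability
    return e / e.sum()

def tone_head(pooled_embedding, W, b):
    """Dense layer + softmax over the 8 tone classes, applied to the
    pooled BERT embedding of an email body."""
    logits = pooled_embedding @ W + b
    return softmax(logits)

rng = np.random.default_rng(0)
emb = rng.standard_normal(128)  # stand-in for a tiny-BERT pooled vector
W = rng.standard_normal((128, len(TONES))) * 0.02  # untrained stand-in weights
b = np.zeros(len(TONES))
probs = tone_head(emb, W, b)
tone = TONES[int(np.argmax(probs))]  # predicted tone label
```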

II. Model & Algorithm

2.1. Model Development

Predicting the character count that yields the highest engagement rate for an email campaign is a regression problem. The algorithm we use in this model is a tree-based regressor: the random forest regressor.
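A minimal training sketch with scikit-learn's `RandomForestRegressor` is shown below. The feature matrix and target are synthetic stand-ins for the proprietary dataset (the real features include word embeddings and tone labels), so the shapes and value ranges are assumptions:

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor

rng = np.random.default_rng(7)
n = 400
# Synthetic stand-ins for the real features:
char_count = rng.integers(50, 2000, n)   # email length in characters
url_cnt = rng.integers(0, 10, n)         # links in the email
industry = rng.integers(0, 9, n)         # encoded industry (9 industries)
X = np.column_stack([char_count, url_cnt, industry])

# Toy open rate that peaks at a mid-range character count (assumption).
y = (0.25 - 1e-8 * (char_count - 900) ** 2
     + 0.005 * url_cnt + rng.normal(0, 0.01, n))

model = RandomForestRegressor(n_estimators=100, random_state=0)
model.fit(X, y)
```

A held-out split and hyperparameter search would be used in practice; this only shows the fit/predict shape of the approach.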

After the campaign engineer writes a draft of the email, she selects the industry her email is associated with, as well as the type of campaign being run (e.g., promotional, transactional, webinar). The algorithm takes these values as inputs, along with the email word embeddings, sentiment tones, and URL counts extracted from the email contents.

Once the inputs have been completed, running the model provides a real-time output (Figure 2), in milliseconds, of the expected customer response or engagement rate based on the current character count for a particular industry and campaign type. Character counts and engagement rates will differ for a webinar-based email as opposed to a promotional one. Further, a promotional email for a real estate company may require more characters for higher engagement than one for a software or technology company. Why guess? That’s what the model is for. Additionally, the model will suggest alternate character counts that are predicted to improve the dependent variable of the user’s choice.
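The "suggest alternate character counts" step can be sketched as a scan over candidate lengths, scoring each with the trained regressor and returning the best. Here a toy response curve stands in for the real model, and the function names and candidate range are assumptions:

```python
import numpy as np

def predict_engagement(char_count):
    """Stand-in for the trained regressor: a toy response curve that
    peaks around 900 characters (illustrative only)."""
    return 0.25 - 1e-8 * (np.asarray(char_count) - 900) ** 2

def recommend_char_count(current_count, lo=100, hi=2000, step=50):
    """Score the draft's current length, then scan alternative counts
    and return the one with the best predicted engagement."""
    candidates = np.arange(lo, hi + 1, step)
    preds = predict_engagement(candidates)
    best = int(candidates[np.argmax(preds)])
    return {"current_rate": float(predict_engagement(current_count)),
            "recommended_count": best,
            "recommended_rate": float(preds.max())}

out = recommend_char_count(1500)  # e.g. a 1,500-character draft
```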

Figure 2. Example of model output

2.2 Model Assumption & Validation

Our model generates predictions of the engagement rate, as well as the optimized character counts for the particular industry and campaign type that the user selects. For the open rate prediction to perform normally, the email campaign must not be automatically marked as “Opened” when it is delivered to recipients. Notably, iOS can automatically mark any received email as “Opened”.

We apply different regression models and use the R-square score as a validation metric to evaluate model performance; this score is also rendered within the predictions served to the given endpoint. The R-square score, also known as the coefficient of determination, is a statistical measure of how well the model explains the variation in the target variable. In our case, it provides insight into model performance on the data. The R-square score typically ranges from 0 to 1, and the higher the score, the better the model performance.
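The validation metric can be computed directly from its definition, R² = 1 − SS_res / SS_tot:

```python
import numpy as np

def r_square(y_true, y_pred):
    """Coefficient of determination: the share of target variance
    explained by the model's predictions."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    ss_res = np.sum((y_true - y_pred) ** 2)          # residual sum of squares
    ss_tot = np.sum((y_true - y_true.mean()) ** 2)   # total sum of squares
    return 1.0 - ss_res / ss_tot

# A perfect prediction scores 1.0; always predicting the mean scores 0.0.
```

This matches scikit-learn's `r2_score`, which could be used directly instead.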

The results of the model tests show that the decision tree regressor performs best in predicting the character counts and the engagement rates, with an R-square score of 0.958. The evaluation scores are displayed in Table 1.

Model (Email Character Count) | R-square Score
Linear Regression             | 0.44
Polynomial (degree=2)         | 0.52
Polynomial (degree=3)         | 0.60
DecisionTreeRegressor         | 0.958

Table 1. R-square scores

III. Use Cases

The end users are email campaign marketing teams or campaign engineers whose goal is to increase email engagement — open rate, click-through rate, revenue per email, and perhaps deliverability — and to decrease unsubscribe rate and bounce rate. These target variables can differ widely based on the campaign engineer’s choosing. There are over 25 different email campaign metrics, so the campaign engineer has options. The model is also flexible enough to take additional dependent variables as inputs: if the campaign engineer wants to optimize for revenue per email, she would get a different set of predictive engagement metrics along with an accuracy score.

Our model provides real-time predictions of the email engagement rate, along with alternative character counts that carry a higher predicted engagement rate. This is a RealTimeML use case: inferring higher engagement rates before the campaign marketing team clicks send. Keep in mind that pipeline health is key to rendering predictions with low latency.

Once the user completes a draft of an email, the model will calculate and serve the predicted user response to the length of that email in real time. This allows the user to quickly and accurately know if readers will find the email too long or too short, before sending it out. Since the predictions are served in real time, the user can run many drafts through the models in the Loxz family of email recommendations to tune their emails to their audience without delay, improving engagement rates.

IV. Constraints & Limitations

Currently, the dependent variables in the training dataset are simulated and calculated from the benchmarks using distribution techniques. In real-world cases, the distribution of engagement rates may vary. However, as new data is ingested via streaming or batch processes, we can retrain the model on the new data, serve inferences on the new dataset, and consistently monitor the model to manage data, feature, or concept drift. In addition, the benchmarks we currently use for model training are from 2022, so they need to be updated annually to keep the model current. As a service provider, Loxz is able to customize the model as customer needs change, and as new data comes in, the benchmarks will be replaced with clients' data.

V. Conclusions

As the model was trained on real-world campaign emails, the effectiveness of our predictions and recommendations will be directly correlated with the conversion rates of the alternative character lengths.

The model is constructed to predict the customer response to the character count of an email campaign, for the purpose of optimizing campaign email character counts. We showed the model to have high accuracy, which indicates it can make useful predictions for users. We also showed that the algorithm serves predictions in less than half a second, which makes this project eligible for real-time applications. The character count model aims to provide the best character counts in an email to help the clients of ESPs get the highest engagement in their email campaigns.

VI. Future Considerations

Integrating or blending the character count model with the sentiment analysis model will provide our users with a more efficient and cohesive solution. In the future, we plan to combine these two models and also introduce data from social media and e-commerce engagement to optimize for higher engagement rates. For example, for different character counts in an email campaign, the model will suggest different sentiment tones to achieve the highest engagement rate. It has yet to be experimented with, but combining two or more email optimization models could have a dramatic effect on engagement rates. For example, combining the character count model with the image optimization model could lead to a much higher engagement rate than character count or sentiment alone. By introducing engagement data from social media posts, we could potentially determine an image that ensures optimal engagement rates and serve click-through predictions in real time for the campaign engineer to interpret. A campaign engineer utilizing three or more optimized models in real time may encounter inference latency; this has yet to be examined. The current model portfolio includes, but is not limited to:

• Character Count Model
• Sentiment Analysis Model
• Survey Incentive Model
• Product Discount Model
• Numeric Link Model
• Call To Action Model
• Image-Based Optimization Model
• Data Augmentation Model
• Send Time Optimization Model
• Inactive Subscriber Optimization Model

One could potentially discern that a product discount model combined with an image optimization model could allow for higher engagement rates; this has yet to be determined. For now, we have a baseline set for the Send Time Optimization model, and we are testing ways for this model to outperform the benchmark open and click-through rates for email.

The model’s computational efficiency is reduced by the practice of training a new model each time a prediction is called for. Predictions could be served more quickly by pre-training models instead, eliminating the need to construct another baseline model on every request.
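One simple way to realize this is a per-key model cache: train once on the first request, then reuse. This is a minimal sketch under that assumption; the cache key and training hook are hypothetical:

```python
_MODEL_CACHE = {}

def get_model(industry, train_fn):
    """Return a cached model for this industry, training it only on
    the first request instead of on every prediction call."""
    if industry not in _MODEL_CACHE:
        _MODEL_CACHE[industry] = train_fn()
    return _MODEL_CACHE[industry]

calls = []

def train_stub():
    # Stand-in for an expensive training run (hypothetical).
    calls.append(1)
    return {"trained": True}

m1 = get_model("retail", train_stub)
m2 = get_model("retail", train_stub)  # served from cache; no retrain
```

A production version would also handle cache invalidation when new data arrives, tying in with the retraining discussed under Constraints & Limitations.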

Currently, the model is based on the 2022 email campaign benchmarks from Campaign Monitor; we will look to integrate clients' data into the model and customize it as their needs evolve. It is also worth noting that the current model portfolio has been developed entirely internally; we have chosen not to introduce any external models. All of these models were developed, trained, and tested by Loxz Digital Group, Inc.