Significance calculation for AOV and ARPV

November 06, 2024 09:49

T-test

A t-test is a widely used method for assessing the statistical significance of a difference between two groups (e.g., a control group and a test group). In A/B testing, it assesses whether the observed difference in a metric such as AOV (average order value) or ARPV (average revenue per visitor) is likely due to the changes made or could have occurred by random chance.

Why Use a T-Test for AOV and ARPV in A/B Testing?

• For AOV: The t-test for AOV allows you to see if changes have affected the average spending per transaction. A significant increase in AOV indicates that customers are spending more per order, which can boost revenue even if the conversion rate remains unchanged.

• For ARPV: The t-test for ARPV is valuable because it captures both conversion rate and order value. A significant difference in ARPV implies that one variant is more effective at generating revenue per visitor. This metric provides a comprehensive view of how changes affect the overall revenue potential of each visitor.

Steps to determine whether the means of two groups differ significantly, using a t-test and the Welch–Satterthwaite equation

1. Collect Data for each group:

Get all order values from each variation.

2. Calculate Means for each group:

ARPV = total revenue / total visitors
AOV = total revenue / transactions
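As a minimal sketch of the two formulas above, the following Python computes both means for one variation. The function names and the sample data are hypothetical and chosen only for illustration. The last lines rest on an assumption not stated above: that each converting visitor placed exactly one order, so every other visitor counts as an observation with zero revenue.

```python
def aov(order_values):
    """Average order value: total revenue / transactions."""
    return sum(order_values) / len(order_values)


def arpv(order_values, total_visitors):
    """Average revenue per visitor: total revenue / total visitors."""
    return sum(order_values) / total_visitors


# Hypothetical data for one variation: 4 orders placed by 10 visitors.
orders = [20.0, 35.0, 50.0, 15.0]
visitors = 10

print(aov(orders))             # 30.0
print(arpv(orders, visitors))  # 12.0

# Assumption: each converting visitor placed one order, so every other
# visitor contributes an observation of 0 revenue. The mean of these
# per-visitor observations equals ARPV.
per_visitor = orders + [0.0] * (visitors - len(orders))
print(sum(per_visitor) / len(per_visitor))  # 12.0
```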

3. Sample Variances for each group

Variance measures how much the data within each group being compared are spread out around their respective means. It quantifies the variability or dispersion of the sample data. In a t-test, variance is crucial because it is used to calculate the standard error of the difference between the two sample means.

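$$s^2 = \frac{1}{n - 1} \sum_{i=1}^{n} (x_i - \bar{x})^2$$

Where:

$\bar{x}$ is the sample mean.

$x_i$ represents each individual observation.

$n - 1$ is the degrees of freedom (with $n$ the number of observations in the group), accounting for the estimation of the mean from the sample.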
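The sample variance described above can be sketched in a few lines of Python. The function name and the data are hypothetical. The result is cross-checked against `statistics.variance` from the standard library, which also divides by n - 1.

```python
import statistics


def sample_variance(values):
    """Sum of squared deviations from the sample mean, divided by n - 1."""
    n = len(values)
    mean = sum(values) / n
    return sum((x - mean) ** 2 for x in values) / (n - 1)


# Hypothetical order values for one variation.
orders = [20.0, 35.0, 50.0, 15.0]

print(sample_variance(orders))      # 250.0
print(statistics.variance(orders))  # 250.0
```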

4. Determine the Sample Sizes:

Count the number of observations in each group to get $n_1$ and $n_2$.

5. Calculate the Standard Error of the Difference

For each group, divide the sample variance by the sample size. Add these two results together, then take the square root of the sum to get the standard error:

$$SE = \sqrt{\frac{s_1^2}{n_1} + \frac{s_2^2}{n_2}}$$
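A short Python sketch of this step follows. The function name and the input numbers are hypothetical, picked so the arithmetic is easy to follow by hand.

```python
import math


def standard_error(var1, n1, var2, n2):
    """Standard error of the difference between two sample means."""
    return math.sqrt(var1 / n1 + var2 / n2)


# Hypothetical inputs: 900 / 100 = 9 and 1600 / 100 = 16, and sqrt(9 + 16) = 5.
print(standard_error(900.0, 100, 1600.0, 100))  # 5.0
```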

6. Compute the t-Statistic:

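The t-statistic is a numerical value that quantifies the difference between the sample means relative to the variability in the data, that is, the difference in means divided by the standard error of that difference:

$$t = \frac{\bar{x}_1 - \bar{x}_2}{SE}$$

It is used to determine whether the observed difference between two groups is statistically significant or likely due to random chance.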
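The calculation can be sketched as follows, assuming the means, sample variances, and sample sizes from the earlier steps are already available. The function name and the numbers are hypothetical.

```python
import math


def t_statistic(mean1, var1, n1, mean2, var2, n2):
    """Difference in sample means divided by its standard error."""
    se = math.sqrt(var1 / n1 + var2 / n2)
    return (mean1 - mean2) / se


# Hypothetical groups: the means differ by 10 and the standard error is 5.
print(t_statistic(60.0, 900.0, 100, 50.0, 1600.0, 100))  # 2.0
```

The sign of the result follows the order of the groups: swapping them flips the sign but not the magnitude.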

7. Approximate the Degrees of Freedom:

Degrees of Freedom refer to the number of independent values or quantities that can vary in the analysis without violating any constraints. Essentially, it represents the amount of information available to estimate parameters like the variance.

Degrees of freedom are used to find the critical t-values from the t-distribution table, which are essential for determining statistical significance.

Higher degrees of freedom result in narrower confidence intervals, indicating more precise estimates of the population parameter.

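We use the Welch–Satterthwaite equation to estimate the degrees of freedom:

$$\nu \approx \frac{\left(\frac{s_1^2}{n_1} + \frac{s_2^2}{n_2}\right)^2}{\frac{(s_1^2 / n_1)^2}{n_1 - 1} + \frac{(s_2^2 / n_2)^2}{n_2 - 1}}$$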
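As an illustration, the Welch–Satterthwaite approximation can be written directly from the sample variances and sample sizes. The function name and the inputs below are hypothetical.

```python
def welch_df(var1, n1, var2, n2):
    """Welch-Satterthwaite approximation of the degrees of freedom."""
    a = var1 / n1  # variance of the first sample mean
    b = var2 / n2  # variance of the second sample mean
    return (a + b) ** 2 / (a ** 2 / (n1 - 1) + b ** 2 / (n2 - 1))


# Hypothetical inputs with unequal variances.
print(welch_df(900.0, 100, 1600.0, 100))  # about 183.6

# With equal variances and equal sizes the result is n1 + n2 - 2.
print(welch_df(900.0, 100, 900.0, 100))   # about 198
```

The result is generally not a whole number, and it always lies between the smaller of n1 - 1 and n2 - 1 and the pooled value n1 + n2 - 2.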

8. Determine the p-Value:

The p-value in a t-test is a probability measure that helps determine the statistical significance of the observed difference between two group means. It quantifies the likelihood of obtaining a test statistic as extreme as, or more extreme than, the one observed, assuming that the null hypothesis is true (i.e., there is no actual difference between the group means).

We use the t-statistic and the degrees of freedom to find the p-value from the t-distribution.

A p-value below the significance threshold (e.g., 0.05) indicates that there is a statistically significant difference between the groups.
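The sketch below shows one way to obtain a two-sided p-value from the t-statistic and the degrees of freedom using only the Python standard library. It integrates the density of the t-distribution numerically with Simpson's rule. This is an illustrative approach, not necessarily how any particular tool computes it, and the function names are hypothetical.

```python
import math


def t_pdf(x, df):
    """Density of Student's t-distribution with df degrees of freedom."""
    log_norm = (math.lgamma((df + 1) / 2) - math.lgamma(df / 2)
                - 0.5 * math.log(df * math.pi))
    return math.exp(log_norm - (df + 1) / 2 * math.log1p(x * x / df))


def two_sided_p_value(t_stat, df, steps=2000):
    """1 minus the area under the density between -|t| and |t|.

    Uses Simpson's rule, so `steps` must be even. Very small p-values
    lose precision because they are computed as 1 minus a number near 1.
    """
    upper = abs(t_stat)
    if upper == 0.0:
        return 1.0
    h = upper / steps
    total = t_pdf(0.0, df) + t_pdf(upper, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    central_area = 2 * total * h / 3  # the density is symmetric around 0
    return max(0.0, 1.0 - central_area)


# Hypothetical result: t = 2.0 with about 183.6 degrees of freedom.
p = two_sided_p_value(2.0, 183.6)
print(p, p < 0.05)  # a little under 0.05, so significant at that threshold
```

If SciPy is available, `scipy.stats.ttest_ind(a, b, equal_var=False)` runs Welch's t-test on two samples and returns the t-statistic and p-value directly.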

Results for a Two-Tailed Test - Interpreting the Confidence Interval

Use the confidence interval to assess not just whether a difference exists but also the plausible range of values for that difference, which helps in judging practical significance. The interval is for the difference Group 1 mean minus Group 2 mean.

• Entirely positive interval: Group 1’s mean is higher than Group 2’s mean.
• Entirely negative interval: Group 2’s mean is higher than Group 1’s mean.
• Interval includes zero: the true difference could be zero, so there is no statistically significant difference between the group means at the chosen confidence level.
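A confidence interval for the difference in means takes the form (mean 1 - mean 2) ± t* × SE, where t* is the critical value of the t-distribution for the chosen confidence level. The sketch below is a self-contained, standard-library illustration. It finds t* by bisection on a numerically integrated p-value, which is an assumption made for this example rather than a description of any specific implementation. All function names and inputs are hypothetical.

```python
import math


def t_pdf(x, df):
    """Density of Student's t-distribution with df degrees of freedom."""
    log_norm = (math.lgamma((df + 1) / 2) - math.lgamma(df / 2)
                - 0.5 * math.log(df * math.pi))
    return math.exp(log_norm - (df + 1) / 2 * math.log1p(x * x / df))


def two_sided_p_value(t_stat, df, steps=2000):
    """1 minus the area between -|t| and |t| (Simpson's rule, even steps)."""
    upper = abs(t_stat)
    if upper == 0.0:
        return 1.0
    h = upper / steps
    total = t_pdf(0.0, df) + t_pdf(upper, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    return max(0.0, 1.0 - 2 * total * h / 3)


def t_critical(df, alpha=0.05):
    """Value t* whose two-sided tail probability equals alpha (bisection)."""
    lo, hi = 0.0, 1.0
    while two_sided_p_value(hi, df) > alpha:  # widen until t* is bracketed
        hi *= 2
    for _ in range(60):
        mid = (lo + hi) / 2
        if two_sided_p_value(mid, df) > alpha:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2


def confidence_interval(mean1, var1, n1, mean2, var2, n2, alpha=0.05):
    """Interval for mean1 - mean2 using Welch's standard error and df."""
    a, b = var1 / n1, var2 / n2
    se = math.sqrt(a + b)
    df = (a + b) ** 2 / (a ** 2 / (n1 - 1) + b ** 2 / (n2 - 1))
    margin = t_critical(df, alpha) * se
    diff = mean1 - mean2
    return diff - margin, diff + margin


# Hypothetical groups: difference of 10 with a standard error of 5.
low, high = confidence_interval(60.0, 900.0, 100, 50.0, 1600.0, 100)
print(low, high)  # both bounds are positive: Group 1's mean is higher
```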

Why Use the Welch–Satterthwaite Equation?

The Welch–Satterthwaite equation is used with Welch's t-test, which does not assume equal variances between the two groups. The equation calculates the effective degrees of freedom for the test, adjusting for differences in sample variances and sizes. Combined with a standard error that is not pooled across the groups, this gives a more accurate p-value when the assumption of equal variances (homoscedasticity) is violated, and therefore more reliable inference when comparing the means of two independent samples with unequal variances.
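To illustrate the adjustment, the sketch below compares the Welch–Satterthwaite degrees of freedom with n1 + n2 - 2, the value used by the classic equal-variance t-test. The function name and the numbers are hypothetical: a small, high-variance group set against a large, low-variance one.

```python
def welch_df(var1, n1, var2, n2):
    """Welch-Satterthwaite approximation of the degrees of freedom."""
    a = var1 / n1
    b = var2 / n2
    return (a + b) ** 2 / (a ** 2 / (n1 - 1) + b ** 2 / (n2 - 1))


# Hypothetical groups: 20 noisy observations versus 200 stable ones.
n1, var1 = 20, 400.0
n2, var2 = 200, 25.0

pooled = n1 + n2 - 2                   # equal-variance assumption
welch = welch_df(var1, n1, var2, n2)   # no equal-variance assumption

print(pooled)  # 218
print(welch)   # about 19.2, close to n1 - 1
```

Here the uncertainty is dominated by the small, noisy group, so the effective degrees of freedom fall to roughly that group's own n1 - 1 instead of the much larger pooled value.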