Five ways to reduce variance in A/B testing

Marton Trencseni - Sun 19 September 2021 • Tagged with ab-testing, variance, stratification, cuped

I use toy Monte Carlo simulations to demonstrate five ways to reduce variance in A/B testing: increasing the sample size, moving toward a more even split, reducing variance in the metric definition, stratification, and CUPED.

Historic lift

Continue reading
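
As a quick illustration of the first lever, a toy Monte Carlo sketch of my own (the conversion rates and sample sizes below are made-up stand-ins, not the post's numbers) shows the spread of the measured lift shrinking roughly as 1/sqrt(n):

```python
import random
import statistics

def lift_spread(n, n_trials=500, p_a=0.10, p_b=0.12, seed=42):
    """Standard deviation of the measured conversion lift (B - A)
    across many simulated A/B tests with n users per arm."""
    rng = random.Random(seed)
    lifts = []
    for _ in range(n_trials):
        conv_a = sum(rng.random() < p_a for _ in range(n))
        conv_b = sum(rng.random() < p_b for _ in range(n))
        lifts.append(conv_b / n - conv_a / n)
    return statistics.stdev(lifts)

# 10x more samples per arm shrinks the spread by roughly sqrt(10) ~ 3.2x
sd_small = lift_spread(n=200)
sd_large = lift_spread(n=2000)
```
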

Correlations, seasonality, lift and CUPED

Marton Trencseni - Sun 05 September 2021 • Tagged with ab-testing, cuped

In this final blog post about CUPED, I address some remaining questions, such as: is correlation between "before" and "after" data the same as seasonality?

Continue reading

A/A testing and false positives with CUPED

Marton Trencseni - Sun 15 August 2021 • Tagged with ab-testing, cuped

I use Monte Carlo simulations of A/A tests to demonstrate how Data Scientists can incorrectly skew lift and p-values if they pick and choose between reporting traditional and CUPED results after the experiment has concluded.

Historic lift

Continue reading
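
A hedged sketch of the effect (my own toy model, not the post's code): under the null, p-values are uniform, so picking the smaller of two p-values after the fact inflates the false positive rate. Modeling the two analyses as independent is an idealization — traditional and CUPED p-values are correlated, so the real inflation is smaller — but the direction of the effect is the same.

```python
import random

rng = random.Random(3)
trials = 100_000
alpha = 0.05

# Under the null hypothesis (A/A test), p-values are uniform on [0, 1],
# so a single pre-registered analysis is significant ~5% of the time.
single = sum(rng.random() < alpha for _ in range(trials)) / trials

# Cherry-picking the smaller of two p-values after the experiment
# (modeled here as independent) inflates that to 1 - 0.95^2 ~ 9.75%.
picked = sum(min(rng.random(), rng.random()) < alpha for _ in range(trials)) / trials
```
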

Reducing variance in conversion A/B testing with CUPED

Marton Trencseni - Sat 07 August 2021 • Tagged with ab-testing, cuped

I use Monte Carlo simulations of conversion A/B tests to demonstrate how CUPED reduces measurement variance in conversion experiments.

Historic lift

Continue reading

Reducing variance in A/B testing with CUPED

Marton Trencseni - Sat 31 July 2021 • Tagged with ab-testing, cuped

I use Monte Carlo simulations of A/B tests to demonstrate CUPED, a method that uses historic "before" data to reduce the variance of the measured treatment lift.

Historic lift

Continue reading
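
The core CUPED adjustment is simple enough to sketch in a few lines. This is my hedged reimplementation of the standard formula, not the post's notebook: subtract from each user's "after" metric the part that is linearly predictable from their "before" metric.

```python
import random
import statistics

def cuped_adjust(before, after):
    """CUPED adjustment: adjusted_i = after_i - theta * (before_i - mean(before)),
    where theta = cov(before, after) / var(before)."""
    mb = statistics.fmean(before)
    ma = statistics.fmean(after)
    cov = statistics.fmean((b - mb) * (a - ma) for b, a in zip(before, after))
    var = statistics.fmean((b - mb) ** 2 for b in before)
    theta = cov / var
    return [a - theta * (b - mb) for b, a in zip(before, after)]

rng = random.Random(0)
before = [rng.gauss(100, 10) for _ in range(10_000)]
# 'after' is correlated with 'before' plus independent noise
after = [0.7 * b + rng.gauss(0, 5) for b in before]
adjusted = cuped_adjust(before, after)
# the variance drops because the part of 'after' explained by
# 'before' is removed, while the mean is left unchanged
```
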

A/B testing and the historic lift paradox [redacted]

Marton Trencseni - Sun 25 July 2021 • Tagged with redacted

I show how looking at historic "before" values in A/B testing can lead to an apparent paradox.

Historic lift

Continue reading

The Full Stack Data Scientist

Marton Trencseni - Fri 23 July 2021 • Tagged with data, fallacies

What are the core skills a data scientist needs to sustainably achieve bottom-line impact, without blocking on external help from other roles?

Data Scientist

Continue reading

Comparing NeuralProphet and Prophet for timeseries forecasting

Marton Trencseni - Tue 20 July 2021 • Tagged with modeling, timeseries, prophet, neuralprophet

I compare Prophet and NeuralProphet performance using a toy forecasting benchmark.

NeuralProphet

Continue reading

Timeseries forecasting with Prophet

Marton Trencseni - Sun 18 July 2021 • Tagged with modeling, timeseries, prophet

Prophet is an easy-to-use timeseries forecasting library by Facebook.

Prophet

Continue reading

YOLO object detection architecture

Marton Trencseni - Sat 10 July 2021 • Tagged with yolo, yolov5, vision, object detection

I discuss the YOLO neural network architecture for object detection.

YOLO architecture

Continue reading

YOLOv5 object detection experiments

Marton Trencseni - Fri 02 July 2021 • Tagged with yolo, yolov5, vision, object detection

I run object detection experiments with pre-trained YOLOv5 models.

YOLO object detection example

Continue reading

Predicting party affiliation of US politicians using fasttext

Marton Trencseni - Sun 20 June 2021 • Tagged with statistics, trump, politics, fasttext, twitter

I train a fasttext classifier on 1.2M data points to predict US politicians' party affiliations from their Twitter messages.

Trump Schiff

Continue reading

Random digits and Benford's law

Marton Trencseni - Sat 29 May 2021 • Tagged with statistics

I explore the distribution of leading digits of random and non-random numbers from receipts, verifying Benford's law of first-digit distribution.

Early stopping

Continue reading
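
Benford's law predicts P(first digit = d) = log10(1 + 1/d). A small sketch of my own (using products of random factors rather than the post's receipt data) shows how multiplicative processes converge to this distribution:

```python
import math
import random

def first_digit(x):
    """Leading decimal digit of a positive number."""
    x = abs(x)
    while x >= 10:
        x /= 10
    while x < 1:
        x *= 10
    return int(x)

# Benford's law: P(first digit = d) = log10(1 + 1/d), so '1' leads ~30.1%
benford = {d: math.log10(1 + 1 / d) for d in range(1, 10)}

# Products of many random factors are approximately Benford-distributed
rng = random.Random(1)
samples = [math.prod(rng.uniform(1, 10) for _ in range(20)) for _ in range(20_000)]
counts = {d: 0 for d in range(1, 10)}
for x in samples:
    counts[first_digit(x)] += 1
freqs = {d: counts[d] / len(samples) for d in range(1, 10)}
```
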

Tricks vs implementation in coding interviews

Marton Trencseni - Sat 22 May 2021 • Tagged with hiring, interviewing

Thinking up tricky solutions in 3-5 minutes is not a requirement in a work setting; there are usually days or weeks for that. But implementing an idea, once the idea is there, should be straightforward for a good programmer.

Daily Coding Problem

Continue reading

10 ways to iterate from 0 to 1 with deciles

Marton Trencseni - Fri 14 May 2021 • Tagged with mlflow, tracking

What's the best way to iterate from 0 to 1 in steps of 0.1 in Python, and what are the potential pitfalls?

Iterating from 0 to 1 in steps of 0.1

Continue reading
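
The classic pitfall, in a minimal sketch of my own: 0.1 has no exact binary floating point representation, so repeatedly adding it drifts, while deriving each decile from an integer index stays exact.

```python
# Naive accumulation: 0.1 is not exactly representable in binary
# floating point, so the running sum drifts.
naive = []
x = 0.0
while x <= 1.0:
    naive.append(x)
    x += 0.1
# ten additions of 0.1 give 0.9999999999999999, not 1.0

# Robust: derive each value from an integer index instead
deciles = [i / 10 for i in range(11)]
```
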

Sometimes brute forcing just works

Marton Trencseni - Thu 06 May 2021 • Tagged with python

I describe a real-world use case where a simple brute-force search worked really well, making more sophisticated Machine Learning unnecessary.

Sample receipt

Continue reading

Building intuition for p-values and statistical significance

Marton Trencseni - Sun 25 April 2021 • Tagged with ab-testing

This is the transcript of a talk I gave on experimentation and A/B testing, aimed at giving the audience an intuitive understanding of p-values and statistical significance.

Coin flip

Continue reading
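
The intuition can be made concrete with the exact binomial tail (a sketch of mine, not the talk's slides): the p-value is the chance that a fair coin produces a result at least as extreme as the one observed.

```python
from math import comb

def p_value(k, n):
    """One-sided p-value: probability of k or more heads in n flips
    of a fair coin, from the exact binomial distribution."""
    return sum(comb(n, i) for i in range(k, n + 1)) / 2 ** n

# 60 heads in 100 flips: unlikely (~2.8%) under a fair coin,
# so significant at the usual 5% threshold
p_60 = p_value(60, 100)
# 55 heads in 100 flips: quite plausible (~18%) -- not significant
p_55 = p_value(55, 100)
```
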

Random numbers, the natural logarithm and higher dimensional simplexes

Marton Trencseni - Sat 17 April 2021 • Tagged with bayesian, ab-test

The base $e$ of the natural logarithm shows up in an unexpected place. Let's derive why!

Simplex

Continue reading
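
One classic appearance of e — which I believe is the result the post derives via simplex volumes, though treat that as my assumption — is that the expected number of uniform [0, 1] draws needed for their running sum to exceed 1 is exactly e:

```python
import math
import random

def draws_until_sum_exceeds_one(rng):
    """Count uniform [0, 1] draws until their running sum exceeds 1."""
    total, n = 0.0, 0
    while total <= 1.0:
        total += rng.random()
        n += 1
    return n

rng = random.Random(7)
trials = 100_000
avg = sum(draws_until_sum_exceeds_one(rng) for _ in range(trials)) / trials
# avg converges to e = 2.71828...; the probability that more than n
# draws are needed is 1/n!, the volume of the standard n-simplex
```
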

Classification accuracy of quantized Autoencoders with Pytorch and MNIST

Marton Trencseni - Fri 09 April 2021 • Tagged with python, pytorch, cnn, torchvision, mnist, autoencoder

I measure how the classification accuracy of a quantized Autoencoder neural network varies with the number of encoding bits on MNIST digits.

Classifier accuracy on quantized Autoencoder output after quantization

Continue reading
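
The quantization step itself is easy to sketch without Pytorch. Below is a minimal uniform quantizer of my own, standing in for the post's autoencoder latent code: fewer bits means coarser levels and larger reconstruction error.

```python
def quantize(values, bits):
    """Uniformly quantize floats to 2**bits evenly spaced levels
    spanning [min(values), max(values)]."""
    lo, hi = min(values), max(values)
    levels = 2 ** bits - 1
    step = (hi - lo) / levels
    return [lo + round((v - lo) / step) * step for v in values]

latent = [i / 10 for i in range(11)]  # stand-in for an encoder's output
coarse = quantize(latent, bits=1)     # only the two extreme levels survive
fine = quantize(latent, bits=3)       # 8 levels, much smaller error
err_coarse = max(abs(q - v) for q, v in zip(coarse, latent))
err_fine = max(abs(q - v) for q, v in zip(fine, latent))
```
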

Investigating information storage in quantized Autoencoders with Pytorch and MNIST

Marton Trencseni - Sun 04 April 2021 • Tagged with python, pytorch, cnn, torchvision, mnist, autoencoder

I investigate how much information an Autoencoder neural network encodes for MNIST digits.

Pytorch Autoencoder loss with encoding dimension and quantization bits

Continue reading