Early stopping in A/B testing

Marton Trencseni - Thu 05 March 2020 • Tagged with ab-testing

The increased false positive rate due to early stopping is a beautiful nuance of statistical testing. It is equivalent to running at an overall higher alpha. Data scientists need to be aware of this phenomenon so they can control it and keep their organizations honest about their experimental results.

Early stopping
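
A minimal sketch of the effect, not the code from the post: it simulates A/A experiments (both groups share the same true conversion rate) and counts how often a two-proportion z-test crosses the significance threshold, first with a single look at the end and then with interim peeks. The function names, the 10% conversion rate, the sample size and the number of peeks are all assumptions chosen for illustration.

```python
import math
import random

def z_stat(conv_a, conv_b, n):
    """Two-proportion z statistic for two groups of n users each."""
    p_pool = (conv_a + conv_b) / (2 * n)
    se = math.sqrt(2 * p_pool * (1 - p_pool) / n)
    return 0.0 if se == 0 else (conv_b / n - conv_a / n) / se

def false_positive_rate(num_peeks, n_total=2000, p=0.1, z_crit=1.96,
                        num_experiments=500, seed=0):
    """Fraction of A/A experiments declared significant at any peek."""
    rng = random.Random(seed)
    peek_every = n_total // num_peeks
    hits = 0
    for _ in range(num_experiments):
        conv_a = conv_b = 0
        for i in range(1, n_total + 1):
            # Both groups convert with the same probability: any "win" is false.
            conv_a += rng.random() < p
            conv_b += rng.random() < p
            if i % peek_every == 0 and abs(z_stat(conv_a, conv_b, i)) > z_crit:
                hits += 1  # stop early at the first significant peek
                break
    return hits / num_experiments

fpr_single = false_positive_rate(num_peeks=1)
fpr_peeking = false_positive_rate(num_peeks=10)
print(f"one look at the end: {fpr_single:.3f}")
print(f"ten interim looks:   {fpr_peeking:.3f}")
```

The single-look rate should land near the nominal alpha of 0.05, while the peeking rate comes out noticeably higher, which is the sense in which early stopping runs at a higher overall alpha.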

A/B testing and Fisher's exact test

Marton Trencseni - Tue 03 March 2020 • Tagged with ab-testing

Fisher’s exact test computes the p value directly, whereas the Chi-squared test only approximates it, so Fisher’s test does not rely on the Central Limit Theorem to hold.

Fisher's test, Fisher Monte Carlo and Chi-squared test p values
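
As a rough illustration, the two p values can be computed side by side for a 2x2 table using only the standard library. This is a sketch under assumptions: the function names are mine, the two-sided Fisher p value sums all tables that are no more likely than the observed one, and the Chi-squared version omits the continuity correction. The counts are made up.

```python
import math

def fisher_exact_p(a, b, c, d):
    """Two-sided Fisher p value for the 2x2 table [[a, b], [c, d]]."""
    row1, row2, col1, n = a + b, c + d, a + c, a + b + c + d

    def prob(x):
        # Hypergeometric probability of x in the top-left cell, margins fixed.
        return math.comb(row1, x) * math.comb(row2, col1 - x) / math.comb(n, col1)

    lo, hi = max(0, col1 - row2), min(row1, col1)
    probs = [prob(x) for x in range(lo, hi + 1)]
    p_obs = prob(a)
    # Sum every table at most as likely as the observed one (with a tie tolerance).
    return sum(q for q in probs if q <= p_obs * (1 + 1e-9))

def chi2_p(a, b, c, d):
    """Chi-squared p value with 1 degree of freedom, no continuity correction."""
    n = a + b + c + d
    stat = n * (a * d - b * c) ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))
    return math.erfc(math.sqrt(stat / 2))

# Rows: funnel A, funnel B. Columns: converted, did not convert.
for table in [(3, 1, 1, 3), (100, 900, 130, 870)]:
    print(table, f"Fisher={fisher_exact_p(*table):.4f}", f"Chi2={chi2_p(*table):.4f}")
```

On the tiny table the two values differ substantially, while on the larger one they move much closer together, as expected when the normal approximation behind the Chi-squared test becomes reasonable.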

A/B testing and the Chi-squared test

Marton Trencseni - Fri 28 February 2020 • Tagged with ab-testing

In an earlier post, I wrote about A/B testing conversion data with the Z-test. The Chi-squared test is a more general test for conversion data, because it can work with multiple conversion events and multiple funnels being tested (A/B/C/D/..).

Chi-squared distribution
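
To make the "multiple funnels" point concrete, here is a small sketch of the Pearson Chi-squared statistic for three funnels and one conversion event. The counts are invented, and the closed-form p value used here holds only for exactly 2 degrees of freedom; a general implementation would need the full Chi-squared survival function.

```python
import math

def chi2_stat(table):
    """Pearson chi-squared statistic for a table of observed counts."""
    row_sums = [sum(row) for row in table]
    col_sums = [sum(col) for col in zip(*table)]
    total = sum(row_sums)
    stat = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            # Expected count if conversion were independent of the funnel.
            expected = row_sums[i] * col_sums[j] / total
            stat += (observed - expected) ** 2 / expected
    return stat

# Rows: funnels A, B, C. Columns: converted, did not convert (made-up counts).
table = [[100, 900], [120, 880], [150, 850]]
stat = chi2_stat(table)
dof = (len(table) - 1) * (len(table[0]) - 1)

# Special case: with exactly 2 degrees of freedom the survival function is exp(-x/2).
p_value = math.exp(-stat / 2)
print(f"statistic={stat:.3f}, dof={dof}, p={p_value:.4f}")
```

Adding a fourth funnel or a second conversion event only adds rows or columns to the table; the statistic is computed the same way, with more degrees of freedom.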

A/B testing and the t-test

Marton Trencseni - Sun 23 February 2020 • Tagged with ab-testing

The t-test is better than the z-test for timespent A/B tests, because it explicitly models the uncertainty of the variance due to sampling. Using Monte Carlo simulations, I show that around N=100 the t-test becomes practically identical to the z-test.

Normal distribution vs t-distribution
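
One way to see why the variance uncertainty matters is to apply the z-test threshold to a statistic built from the sample variance and measure the false positive rate at different sample sizes. The sketch below is my own illustration, not the simulation from the post, and it assumes normally distributed data with identical groups; real timespent data would look different.

```python
import math
import random

def mean_and_var(xs):
    """Sample mean and unbiased sample variance."""
    m = sum(xs) / len(xs)
    return m, sum((x - m) ** 2 for x in xs) / (len(xs) - 1)

def z_false_positive_rate(n, num_experiments=2000, z_crit=1.96, seed=0):
    """False positive rate when the z threshold is used with estimated variances."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(num_experiments):
        # A/A setup: both groups come from the same distribution.
        mean_a, var_a = mean_and_var([rng.gauss(0, 1) for _ in range(n)])
        mean_b, var_b = mean_and_var([rng.gauss(0, 1) for _ in range(n)])
        stat = (mean_b - mean_a) / math.sqrt(var_a / n + var_b / n)
        hits += abs(stat) > z_crit
    return hits / num_experiments

rates = {n: z_false_positive_rate(n) for n in (3, 10, 100)}
for n, rate in rates.items():
    print(f"N={n:>3}: false positive rate with z threshold = {rate:.3f}")
```

At very small N the z threshold rejects too often because the statistic really follows a heavier-tailed t-distribution; by N=100 the rate should sit close to the nominal 0.05.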

A/B testing and the Z-test

Marton Trencseni - Sat 15 February 2020 • Tagged with ab-testing

I discuss the Z-test for A/B testing and show how to compute parameters such as sample size from first principles. I use Monte Carlo simulations to validate significance level and statistical power, and visualize parameter scaling behaviour.

Conversion difference vs N
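
For a feel of how the required sample size scales with the conversion difference, here is a sketch of the textbook two-proportion formula. It assumes a two-sided test and equal group sizes, which may not match the exact setup in the post, and the function name and example numbers are mine.

```python
import math
from statistics import NormalDist

def sample_size_per_group(p_base, lift, alpha=0.05, power=0.8):
    """Users needed per group to detect an absolute lift in conversion rate."""
    p_test = p_base + lift
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)  # two-sided significance
    z_beta = NormalDist().inv_cdf(power)
    variance = p_base * (1 - p_base) + p_test * (1 - p_test)
    return math.ceil((z_alpha + z_beta) ** 2 * variance / lift ** 2)

n_one_point = sample_size_per_group(p_base=0.10, lift=0.01)
n_half_point = sample_size_per_group(p_base=0.10, lift=0.005)
print(f"lift of 1.0 point: {n_one_point} per group")
print(f"lift of 0.5 point: {n_half_point} per group")
```

Because the lift enters as a squared term in the denominator, halving the detectable difference roughly quadruples the required N.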

Beyond the Central Limit Theorem

Marton Trencseni - Thu 06 February 2020 • Tagged with data, ab testing, statistics

In the previous post, I talked about the importance of the Central Limit Theorem (CLT) to A/B testing. Here we will explore cases when we cannot rely on the CLT to hold.

Running mean for Cauchy distribution
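
A small sketch of the running-mean comparison, assuming a standard normal and a standard Cauchy distribution (the latter sampled through the inverse CDF). The helper names are hypothetical. It reports the largest single-step jump of the running mean in the second half of the sample, where a well-behaved mean should barely move.

```python
import math
import random

def running_means(draw, n, seed=0):
    """Running mean after each of n draws from the given sampler."""
    rng = random.Random(seed)
    total, means = 0.0, []
    for i in range(1, n + 1):
        total += draw(rng)
        means.append(total / i)
    return means

def draw_normal(rng):
    return rng.gauss(0, 1)

def draw_cauchy(rng):
    # Inverse CDF of the standard Cauchy distribution.
    return math.tan(math.pi * (rng.random() - 0.5))

def max_late_jump(means):
    """Largest change between consecutive running means in the second half."""
    half = len(means) // 2
    return max(abs(means[i] - means[i - 1]) for i in range(half, len(means)))

n = 10_000
normal_means = running_means(draw_normal, n)
cauchy_means = running_means(draw_cauchy, n)
print(f"normal: final mean={normal_means[-1]:.4f}, max late jump={max_late_jump(normal_means):.5f}")
print(f"cauchy: final mean={cauchy_means[-1]:.4f}, max late jump={max_late_jump(cauchy_means):.5f}")
```

The normal running mean settles down, while single extreme Cauchy draws keep kicking the running mean around no matter how many samples have been collected.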

A/B testing and the Central Limit Theorem

Marton Trencseni - Wed 05 February 2020 • Tagged with data, ab testing, statistics

When working with hypothesis testing, descriptions of the statistical methods often include normality assumptions. For example, the Wikipedia page for the z-test starts like this: "A Z-test is any statistical test for which the distribution of the test statistic under the null hypothesis can be approximated by a normal distribution". What does this mean? How do I know it’s a valid assumption for my data?

Normal distribution from uniform
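
The CLT claim can be checked numerically with a short sketch: average batches of uniform random numbers and compare the spread of those averages with what a normal distribution predicts. The batch size of 30 and the number of batches are arbitrary choices for illustration.

```python
import math
import random

rng = random.Random(0)
n, num_means = 30, 5000

# Each entry is the mean of n draws from Uniform(0, 1).
means = [sum(rng.random() for _ in range(n)) / n for _ in range(num_means)]

grand_mean = sum(means) / num_means
observed_std = math.sqrt(sum((m - grand_mean) ** 2 for m in means) / (num_means - 1))

# Uniform(0, 1) has variance 1/12, so the CLT predicts std = sqrt(1 / (12 n)).
predicted_std = math.sqrt(1 / (12 * n))

# For a normal distribution, about 95% of values lie within 1.96 std of the mean.
within = sum(abs(m - 0.5) <= 1.96 * predicted_std for m in means) / num_means

print(f"observed std={observed_std:.4f}, predicted std={predicted_std:.4f}")
print(f"fraction within 1.96 std: {within:.3f}")
```

Even though a single uniform draw looks nothing like a bell curve, the sample means already behave very much like a normal distribution at this batch size.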

Optimizing waits in Airflow

Marton Trencseni - Sat 01 February 2020 • Tagged with data, airflow, python

Sometimes I get to put on my Data Engineering hat for a few days. I enjoy this because I like to move up and down the Data Science stack and I try to keep myself sharp technically. Recently I was able to spend a few days optimizing our Airflow ETL for speed.

Airflow DAG

SQL best practices for Data Scientists and Analysts

Marton Trencseni - Sun 26 January 2020 • Tagged with data, programming, sql

My list of SQL best practices for Data Scientists and Analysts, or, how I personally write SQL code. I picked this up at Facebook, and later improved it at Fetchr.

SQL code

How I write SQL code

Marton Trencseni - Fri 24 January 2020 • Tagged with data, programming, sql

This is a simple post about SQL code formatting. Most of this comes from my time as a Data Engineer at Facebook.

SQL code

Small team planning

Marton Trencseni - Fri 10 January 2020 • Tagged with planning, teams, goaling

I’ve worked at 5-10 different organizations, most of them startups or startuppy companies. I’ve done a lot of planning in small teams, and also taken part in company-wide leadership planning. Here I will describe what has worked well for me in small team settings, focusing on time estimation.

Reaching the peak

Personal goaling

Marton Trencseni - Sun 22 December 2019 • Tagged with self help, goaling

The meta-goal of goaling is to stretch yourself to achieve more, and to feel good about what you’ve achieved. Whatever happened this year, it’s always possible to achieve a lot more and feel better about yourself next year. To hijack a Feynman quote, there is plenty of room at the top.

2019 running

Pytorch in 2019

Marton Trencseni - Thu 12 December 2019 • Tagged with pytorch

2019 was another big year for Pytorch, one of the most popular Deep Learning libraries out there. Pytorch has become the de facto deep learning library used for research thanks to its dynamic graph model, which allows fast model experimentation. It’s also become production ready, with support for mobile and infrastructure tooling such as Tensorboard.

Pytorch Google Trends 2019

Warren Buffett style fundamental metrics of long-term company performance

Marton Trencseni - Mon 02 December 2019 • Tagged with investing, stocks, warren buffett

I look at some fundamental charts of Apple, Activision Blizzard and Intel.

AAPL shareholder wealth curve

Calibration curves for delivery prediction with Scikit-Learn

Marton Trencseni - Thu 21 November 2019 • Tagged with machine, learning, fetchr, skl, calibration

I show calibration curves for four different binary classification Scikit-Learn models we built for delivery prediction at Fetchr, trained using real-world data: LogisticRegression, DecisionTree, RandomForest and GradientBoosting.

Logistic regression calibration curve
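
For readers unfamiliar with calibration curves, the idea can be sketched in a few lines: bin the predicted probabilities and compare each bin's average prediction with the observed fraction of positives. Scikit-Learn provides `sklearn.calibration.calibration_curve` for this; the version below is a hand-rolled stand-in with a hypothetical name, run on synthetic data rather than the Fetchr delivery data.

```python
import random

def binned_calibration(y_true, y_prob, n_bins=10):
    """Per non-empty bin: (mean predicted probability, observed positive rate)."""
    bins = [[] for _ in range(n_bins)]
    for y, p in zip(y_true, y_prob):
        # Equal-width bins on [0, 1]; p == 1.0 goes into the last bin.
        bins[min(int(p * n_bins), n_bins - 1)].append((y, p))
    return [
        (sum(p for _, p in b) / len(b), sum(y for y, _ in b) / len(b))
        for b in bins
        if b
    ]

# Synthetic, perfectly calibrated "model": outcomes are drawn with the
# predicted probability, so the curve should hug the diagonal.
rng = random.Random(0)
y_prob = [rng.random() for _ in range(20_000)]
y_true = [int(rng.random() < p) for p in y_prob]

curve = binned_calibration(y_true, y_prob)
for predicted, observed in curve:
    print(f"predicted={predicted:.2f}  observed={observed:.2f}")
```

A model whose points fall below the diagonal at high predicted probabilities is overconfident, and one whose points fall above it is underconfident.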

Using simulated self-play to solve all OpenAI Gym classic control problems with Pytorch

Marton Trencseni - Thu 14 November 2019 • Tagged with python, pytorch, reinforcement, learning, openai, gym

I use simulated self-play by ranking episodes by summed reward. Game outcomes are divided in two by cutting at the median: winners are assigned +1 rewards and losers are assigned -1 rewards, as in games such as Go and Chess. Unlike the naive policy gradient descent used in previous posts, this version solves all OpenAI classic control problems, albeit slowly.

OpenAI mountaincar
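
The median cut itself is simple to sketch. The function name and the sample returns below are hypothetical, and treating episodes that tie with the median as losers is my assumption; the post may handle ties differently.

```python
import statistics

def self_play_rewards(episode_returns):
    """Map each episode's summed reward to +1 (winner) or -1 (loser)."""
    median = statistics.median(episode_returns)
    # Assumption: episodes exactly at the median count as losers.
    return [1.0 if r > median else -1.0 for r in episode_returns]

returns = [10, 200, 35, 120, 18, 90]  # made-up summed rewards per episode
rewards = self_play_rewards(returns)
print(rewards)
```

Because the rewards depend only on the ranking within a batch, the same training signal works regardless of the reward scale of the underlying environment.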

Applying policy gradient to OpenAI Gym classic control problems with Pytorch

Marton Trencseni - Tue 12 November 2019 • Tagged with python, pytorch, reinforcement, learning, openai, gym

I try to generalize the policy gradient algorithm as introduced earlier to solve all the OpenAI classic control problems. It works for CartPole and Acrobot, but not for Pendulum and MountainCar environments.

OpenAI classic control environments

Machine Learning at Fetchr

Marton Trencseni - Tue 29 October 2019 • Tagged with machine, learning, fetchr, skl

Opportunities for automating, optimizing and enabling processes with ML at a delivery company such as Fetchr are plentiful. We put three families of ML models into production, covering three areas: Scheduling, Notifications and Operational choice.

Operational choice

Solving the CartPole Reinforcement Learning problem with Pytorch

Marton Trencseni - Tue 22 October 2019 • Tagged with python, pytorch, reinforcement, learning, openai, gym, cartpole

The CartPole problem is the Hello World of Reinforcement Learning, originally described in 1983 by Barto, Sutton and Anderson. The environment is a pole balanced on a cart. CartPole is one of the environments in OpenAI Gym, so we don't have to code up the physics. Here I walk through a simple solution using Pytorch.

Cartpole animation

Metrics Atlas

Marton Trencseni - Thu 29 August 2019 • Tagged with data, fetchr

The idea is simple: write a document which helps new and existing people—both managers and individual contributors—get an objective, metrics-based picture of the business. This is helpful when new people join, when people start working in new segments of the business, and to understand other parts of the company.

Metrics atlas

