Conditional Probabilities and Simpson's Paradox

Marton Trencseni - Sun 11 June 2023 • Tagged with probability, statistics, simpsons, paradox

I give examples of "unintuitive" conditional probabilities and discuss Simpson's paradox.

Fair coin from biased coin

Marton Trencseni - Tue 22 March 2022 • Tagged with probability, statistics

Given a biased coin, construct a fair coin.
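One well-known way to do this is the von Neumann trick: flip the biased coin twice, keep the first flip if the two differ, and start over if they match. Whether the post uses this construction is an assumption; the sketch below only illustrates the idea, and the names `biased_flip` and `fair_flip` and the bias of 0.8 are made up for the example.

```python
import random


def biased_flip(p_heads, rng):
    """Return True (heads) with probability p_heads."""
    return rng.random() < p_heads


def fair_flip(p_heads, rng):
    """Von Neumann trick: flip twice, retry until the two flips differ.

    Heads-then-tails and tails-then-heads both have probability
    p_heads * (1 - p_heads), so returning the first flip is fair.
    Assumes 0 < p_heads < 1 and independent flips.
    """
    while True:
        first = biased_flip(p_heads, rng)
        second = biased_flip(p_heads, rng)
        if first != second:
            return first


rng = random.Random(0)
n = 100_000
heads = sum(fair_flip(0.8, rng) for _ in range(n))
fair_fraction = heads / n
print(f"fraction of heads from the derived coin: {fair_fraction:.3f}")
```

The price of fairness is extra flips: with a heads probability of 0.8, a pair of flips differs only 32% of the time, so each fair flip costs a bit over six biased flips on average.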

The German tank problem in World War II

Marton Trencseni - Sat 12 March 2022 • Tagged with statistics, war

I run Monte Carlo simulations to show the frequentist solution to the German tank problem.
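As a rough illustration of what such a simulation can look like, the sketch below uses the standard frequentist estimator m + m/k - 1, where m is the largest observed serial number and k is the sample size. The true tank count, sample size, and number of runs are arbitrary values chosen for the example, not figures from the post.

```python
import random


def estimate_total(serials):
    """Frequentist (minimum-variance unbiased) estimate of the total count.

    m is the largest serial number seen, k the number of serials observed.
    """
    k = len(serials)
    m = max(serials)
    return m + m / k - 1


rng = random.Random(0)
true_n, k, runs = 1000, 10, 20_000  # assumed values for illustration

# Each run: capture k distinct tanks with serials 1..true_n, then estimate.
estimates = [
    estimate_total(rng.sample(range(1, true_n + 1), k))
    for _ in range(runs)
]
mean_estimate = sum(estimates) / runs
print(f"true N = {true_n}, mean estimate = {mean_estimate:.1f}")
```

Averaged over many runs, the estimate should land close to the true count, which is what "unbiased" means here; any single estimate can still be off by a wide margin when k is small.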


Predicting party affiliation of US politicians using fasttext

Marton Trencseni - Sun 20 June 2021 • Tagged with statistics, trump, politics, fasttext, twitter

I train a fasttext classifier on 1.2M data points to predict US politicians' party affiliations from their Twitter messages.

Random digits and Benford's law

Marton Trencseni - Sat 29 May 2021 • Tagged with statistics

The post explores the distribution of digits of random and non-random numbers from receipts, verifying Benford's law of first digit distribution.
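For reference, Benford's law says the first digit d appears with probability log10(1 + 1/d). The sketch below computes those probabilities and compares them with the first digits of the powers of 2, a standard stand-in data set chosen here for illustration; it is not the receipt data the post analyzes.

```python
import math
from collections import Counter


def benford_probability(d):
    """Expected share of numbers whose first digit is d (1..9)."""
    return math.log10(1 + 1 / d)


expected = {d: benford_probability(d) for d in range(1, 10)}

# Stand-in data (an assumption for this example): first digits of 2**1 .. 2**1000.
first_digits = [int(str(2 ** n)[0]) for n in range(1, 1001)]
counts = Counter(first_digits)
observed = {d: counts[d] / len(first_digits) for d in range(1, 10)}

for d in range(1, 10):
    print(f"{d}: expected {expected[d]:.3f}, observed {observed[d]:.3f}")
```

Uniformly random digits would give each first digit a share of about 11%, so a data set that puts roughly 30% of its mass on the digit 1 is visibly different from that baseline.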

Beyond the Central Limit Theorem

Marton Trencseni - Thu 06 February 2020 • Tagged with data, ab testing, statistics

In the previous post, I talked about the importance of the Central Limit Theorem (CLT) to A/B testing. Here we will explore cases when we cannot rely on the CLT to hold.

A/B testing and the Central Limit Theorem

Marton Trencseni - Wed 05 February 2020 • Tagged with data, ab testing, statistics

When working with hypothesis testing, the descriptions of the statistical methods often include normality assumptions. For example, the Wikipedia page for the z-test starts like this: "A Z-test is any statistical test for which the distribution of the test statistic under the null hypothesis can be approximated by a normal distribution". What does this mean? How do I know it’s a valid assumption for my data?

Beat the averages

Marton Trencseni - Sat 07 July 2018 • Tagged with statistics, data

When working with averages, we have to be careful. Pitfalls lurk that can pollute our statistics and the results we report.

Beautiful A/B testing

Marton Trencseni - Sun 05 June 2016 • Tagged with ab-testing, strata, statistics, data

I gave this talk at the O’Reilly Strata Conference London in June 2016, mostly based on what I learned at Prezi from 2012 to 2016.
