Entropy in Data Science

Marton Trencseni - Sun 24 October 2021 • Tagged with entropy

I discuss four uses of entropy in Data Science: (i) cross entropy as a loss function for training neural network classifiers, (ii) entropy as a splitting criterion for building decision trees, (iii) entropy for evaluating clustering algorithms, and (iv) entropy for understanding relationships in tabular data.

Information gain

Cross entropy, joint entropy, conditional entropy and relative entropy

Marton Trencseni - Sat 09 October 2021 • Tagged with entropy, cross-entropy, joint-entropy, conditional-entropy, relative-entropy, kullback–leibler-diverence

What's the difference between cross entropy, joint entropy, conditional entropy and relative entropy?
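As a rough illustration of how two of these quantities relate, the sketch below computes entropy, cross entropy and relative entropy (Kullback–Leibler divergence) for two small discrete distributions and checks the standard identity H(p, q) = H(p) + D_KL(p || q). The function names and the example distributions `p` and `q` are made up for illustration and are not taken from the post; the code assumes `q` is nonzero wherever `p` is nonzero.

```python
import math


def entropy(p):
    """Shannon entropy H(p) in bits of a discrete distribution."""
    return -sum(pi * math.log2(pi) for pi in p if pi > 0)


def cross_entropy(p, q):
    """Cross entropy H(p, q) in bits; assumes q[i] > 0 wherever p[i] > 0."""
    return -sum(pi * math.log2(qi) for pi, qi in zip(p, q) if pi > 0)


def kl_divergence(p, q):
    """Relative entropy D_KL(p || q) in bits; same assumption on q."""
    return sum(pi * math.log2(pi / qi) for pi, qi in zip(p, q) if pi > 0)


# Hypothetical example distributions over two outcomes.
p = [0.5, 0.5]
q = [0.9, 0.1]

# Cross entropy splits into the entropy of p plus the divergence from p to q.
gap = cross_entropy(p, q) - (entropy(p) + kl_divergence(p, q))
```

Here `gap` should be zero up to floating-point error, and `kl_divergence(p, p)` is exactly zero, which is one way to see relative entropy as the extra cost of using `q` in place of the true distribution `p`.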

Information gain

What's the entropy of a fair coin toss?

Marton Trencseni - Sat 25 September 2021 • Tagged with entropy, interviews, cross-entropy, physics

What's the entropy of a fair coin toss? What if the coin almost always returns Heads? My recruiter reports that very few candidates can answer these entropy-related DS screening questions.
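For reference, both questions can be checked numerically with the Shannon entropy of a two-outcome distribution. This is a minimal sketch: the function name `bernoulli_entropy` and the choice of 0.99 as "almost always Heads" are assumptions made here for illustration.

```python
import math


def bernoulli_entropy(p: float) -> float:
    """Shannon entropy, in bits, of a coin that lands Heads with probability p."""
    if p in (0.0, 1.0):
        # A certain outcome carries no information; also avoids log2(0).
        return 0.0
    return -(p * math.log2(p) + (1 - p) * math.log2(1 - p))


fair = bernoulli_entropy(0.5)     # fair coin
biased = bernoulli_entropy(0.99)  # hypothetical "almost always Heads" coin

print(f"fair coin:   {fair:.4f} bits")
print(f"biased coin: {biased:.4f} bits")
```

The fair coin comes out at exactly 1 bit, while the heavily biased coin is close to 0 bits, since its outcome is almost never a surprise.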

Entropy

Exploring prior beliefs with MCMC

Marton Trencseni - Sat 06 July 2019 • Tagged with python, math, pymc3

I use PyMC3 to solve the food delivery toy problem and explore some alternative priors.

PyMC3 traceplot()

Food deliveries, Bayes and Computational Statistics

Marton Trencseni - Sat 22 June 2019 • Tagged with python, math, fetchr

I was grabbing a burger at Shake Shack, Mall of the Emirates in Dubai, when I noticed this notebook on the counter. The staff is using it to track food deliveries, and each service (Carriage, Talabat, UberEats, Deliveroo) has its own column with the order numbers. Let's assume this is the only page for the day, and ask ourselves: given this data, what is the probability that UberEats is the most popular food delivery service?

Shake Shack food deliveries

The Collatz conjecture

Marton Trencseni - Sun 02 June 2019 • Tagged with python, math

The Collatz conjecture is a conjecture in mathematics that concerns a sequence defined as follows: start with any positive integer n. Then each term is obtained from the previous term as follows: if the previous term is even, the next term is one half of the previous term. If the previous term is odd, the next term is 3 times the previous term plus 1. The conjecture is that no matter what the starting value of n is, the sequence will always reach 1.
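The rule described above can be sketched in a few lines of Python. The function name `collatz_sequence` is chosen here for illustration and is not necessarily what the post uses. Note that the loop only terminates if the sequence actually reaches 1, which is exactly what the conjecture claims but has not been proven in general.

```python
def collatz_sequence(n: int) -> list[int]:
    """Return the Collatz sequence starting at n and ending at the first 1."""
    if n < 1:
        raise ValueError("n must be a positive integer")
    sequence = [n]
    while n != 1:
        # Even: halve the term. Odd: multiply by 3 and add 1.
        n = n // 2 if n % 2 == 0 else 3 * n + 1
        sequence.append(n)
    return sequence


print(collatz_sequence(6))  # [6, 3, 10, 5, 16, 8, 4, 2, 1]
```

Starting from 6, the sequence reaches 1 after 8 steps.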

Collatz
