2024 Data Outlook

Marton Trencseni - Sun 11 February 2024 • Tagged with outlook, 2024, datahub, capm

It is the beginning of the year, a good time to reflect on the previous year and make plans for the year ahead. I wrote this document for my team members in January 2024 to kick off the year. This is an abridged version with sensitive content removed.


Continue reading

Introduction to Marketing Mix Modeling

Marton Trencseni - Sun 23 July 2023 • Tagged with mmm, marketing, mixed, model, lightweight_mmm, google, python

I describe the concept of Marketing Mix Modeling using Google's LightweightMMM library.

MMM attribution

Continue reading

Real-world experiments I: 5 Lessons from Google, Bing, Netflix and Alibaba

Marton Trencseni - Sun 18 June 2023 • Tagged with ab-testing

I discuss five lessons from large-scale experiments conducted by Google, Bing, Netflix and Alibaba: Kohavi's 1 out of 3 rule, Google's 41 shades of blue, Bing's unexpected big win, Alibaba's personalization experiment and Netflix's movie image personalization.


Continue reading

Conditional Probabilities and Simpson's Paradox

Marton Trencseni - Sun 11 June 2023 • Tagged with probability, statistics, simpsons, paradox

I give examples of "unintuitive" conditional probabilities and discuss Simpson's paradox.
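To make the paradox concrete, here is a minimal sketch in Python. The numbers are the widely cited kidney stone treatment figures, used purely as an illustration; they are an assumption of this sketch and are not taken from the post itself. Treatment A has the higher success rate within each stone-size group, yet Treatment B has the higher success rate once the groups are pooled.

```python
# (successes, patients) per treatment and stone size.
# Illustrative figures only; not data from the article.
data = {
    "A": {"small": (81, 87), "large": (192, 263)},
    "B": {"small": (234, 270), "large": (55, 80)},
}


def rate(successes, total):
    """Success rate for one group."""
    return successes / total


def overall(treatment):
    """Success rate after pooling all groups of a treatment."""
    successes = sum(s for s, _ in data[treatment].values())
    patients = sum(n for _, n in data[treatment].values())
    return rate(successes, patients)


# A beats B inside every subgroup...
a_wins_each_group = all(
    rate(*data["A"][g]) > rate(*data["B"][g]) for g in ("small", "large")
)
# ...but B beats A on the pooled data.
b_wins_overall = overall("B") > overall("A")

print(a_wins_each_group, b_wins_overall)
```

The reversal arises because the two treatments were applied to very different mixes of small and large stones, so the pooled rates are weighted averages with different weights.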

Simpson's paradox

Continue reading

Common patterns in technical interviewing

Marton Trencseni - Sat 01 October 2022 • Tagged with interviewing

I will attempt to enumerate all the categories of questions commonly asked in technical interview loops, and share my experience with them.


Continue reading

More Data Scientists should learn SQL

Marton Trencseni - Sun 29 May 2022 • Tagged with data, sql

In my experience, many Data Scientists struggle to write SQL queries in interviews.


Continue reading

100 articles

Marton Trencseni - Mon 18 October 2021 • Tagged with meta

A review of, and reflection on, the first 100 articles written on Bytepawn.


Continue reading

The Full Stack Data Scientist

Marton Trencseni - Fri 23 July 2021 • Tagged with data, fallacies

What are the core skills a data scientist needs to sustainably achieve bottom-line impact, without blocking on external help from other roles?

Data Scientist

Continue reading

YOLO object detection architecture

Marton Trencseni - Sat 10 July 2021 • Tagged with yolo, yolov5, vision, object detection

I discuss the YOLO neural network architecture for object detection.

YOLO architecture

Continue reading

YOLOv5 object detection experiments

Marton Trencseni - Fri 02 July 2021 • Tagged with yolo, yolov5, vision, object detection

I run object detection experiments with pre-trained YOLOv5 models.

YOLO object detection example

Continue reading

Predicting party affiliation of US politicians using fasttext

Marton Trencseni - Sun 20 June 2021 • Tagged with statistics, trump, politics, fasttext, twitter

I train a fasttext classifier on 1.2M data points to predict US politicians' party affiliations from their Twitter messages.

Trump Schiff

Continue reading

Random digits and Benford's law

Marton Trencseni - Sat 29 May 2021 • Tagged with statistics

The post explores the distribution of digits of random and non-random numbers from receipts, verifying Benford's law of first digit distribution.
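Benford's law says that the leading digit $d$ occurs with probability $\log_{10}(1 + 1/d)$. The sketch below computes these probabilities and compares them with the leading digits of the first 1000 powers of 2, a sequence known to follow the law closely. The powers of 2 are a stand-in chosen for this illustration; the receipt data from the post is not reproduced here.

```python
import math
from collections import Counter

# Expected first-digit probabilities under Benford's law.
benford = {d: math.log10(1 + 1 / d) for d in range(1, 10)}


def first_digit(n):
    """Leading decimal digit of a positive integer."""
    return int(str(n)[0])


# Stand-in sample (an assumption of this sketch): 2**1 ... 2**1000.
sample = [2 ** k for k in range(1, 1001)]
counts = Counter(first_digit(n) for n in sample)
observed = {d: counts[d] / len(sample) for d in range(1, 10)}

for d in range(1, 10):
    print(d, round(benford[d], 3), round(observed[d], 3))
```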

Early stopping

Continue reading

10 ways to iterate from 0 to 1 with deciles

Marton Trencseni - Fri 14 May 2021 • Tagged with mlflow, tracking

What's the best way to iterate from 0 to 1 in steps of 0.1 in Python, and what are the potential pitfalls?
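As a quick illustration of the main pitfall, here is a minimal sketch of three approaches; it is not the list of ten from the post, only an assumption about the kind of comparison involved. Repeatedly adding the float `0.1` accumulates rounding error, while stepping over integers and dividing once per step avoids the drift.

```python
from decimal import Decimal

# Pitfall: repeated float addition accumulates rounding error.
naive = []
x = 0.0
for _ in range(11):
    naive.append(x)
    x += 0.1

# Safer: iterate over integers and divide once per step.
by_division = [i / 10 for i in range(11)]

# Exact decimal arithmetic, if exact tenths matter.
exact = [Decimal(i) / Decimal(10) for i in range(11)]

print(naive[-1] == 1.0)        # the accumulated value misses 1.0
print(by_division[-1] == 1.0)  # the divided value hits 1.0
```

The integer-based version also makes the number of steps explicit, so the loop cannot run one iteration too many or too few because of a floating point comparison.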

Iterating from 0 to 1 in steps of 0.1

Continue reading

Building intuition for p-values and statistical significance

Marton Trencseni - Sun 25 April 2021 • Tagged with ab-testing

This is the transcript of a talk I gave on experimentation and A/B testing to give the audience an intuitive understanding of p-values and statistical significance.

Coin flip

Continue reading

Random numbers, the natural logarithm and higher dimensional simplexes

Marton Trencseni - Sat 17 April 2021 • Tagged with bayesian, ab-test

The base $e$ of the natural logarithm shows up in an unexpected place. Let's derive why!


Continue reading

Building a Pytorch Autoencoder for MNIST digits

Marton Trencseni - Thu 18 March 2021 • Tagged with pytorch, autoencoder, mnist

I build an Autoencoder network to categorize MNIST digits in Pytorch.

Conversion difference vs N

Continue reading

Training a Pytorch Wasserstein MNIST GAN on Google Colab

Marton Trencseni - Wed 03 March 2021 • Tagged with python, pytorch, torchvision, mnist, gan

I train a Pytorch Wasserstein MNIST GAN on Google Colab to generate beautiful MNIST digits.

Wasserstein GAN Generated MNIST digits

Continue reading

Training a Pytorch Lightning MNIST GAN on Google Colab

Marton Trencseni - Sat 20 February 2021 • Tagged with python, pytorch, gan, mnist, google-colab

I explore MNIST digits generated by a Generative Adversarial Network trained on Google Colab using Pytorch Lightning.

Softmax GAN after 5 epochs, 100 samples.

Continue reading

Automatic MLFlow logging for Pytorch

Marton Trencseni - Sun 24 January 2021 • Tagged with mlflow, tracking

I explore the automatic logging capabilities of MLFlow for Pytorch.

MLFlow Pytorch loss example.

Continue reading

Automatic MLFlow logging for Scikit Learn

Marton Trencseni - Fri 15 January 2021 • Tagged with mlflow, tracking

I explore the automatic logging capabilities of MLFlow for Scikit Learn. In the process I found a bug in MLFlow, reported it and wrote a pull request to fix it.

MLFlow scatter plot.

Continue reading