Bytepawn - Marton Trencseni

Optimizing waits in Airflow

Marton Trencseni - Sat 01 February 2020 • Tagged with data, airflow, python

Sometimes I get to put on my Data Engineering hat for a few days. I enjoy this because I like to move up and down the Data Science stack and I try to keep myself sharp technically. Recently I was able to spend a few days optimizing our Airflow ETL for speed.

Airflow DAG

Using simulated self-play to solve all OpenAI Gym classic control problems with Pytorch

Marton Trencseni - Thu 14 November 2019 • Tagged with python, pytorch, reinforcement, learning, openai, gym

I use simulated self-play by ranking episodes by summed reward. Game outcomes are divided in two by cutting at the median, winners are assigned +1 rewards, losers are assigned -1 rewards, like in games like Go and Chess. Unlike naive policy gradient descent used in previous posts, this version solves all OpenAI classic control problems, albeit slowly.

OpenAI mountaincar

Applying policy gradient to OpenAI Gym classic control problems with Pytorch

Marton Trencseni - Tue 12 November 2019 • Tagged with python, pytorch, reinforcement, learning, openai, gym

I try to generalize the policy gradient algorithm as introduced earlier to solve all the OpenAI classic control problems. It works for CartPole and Acrobot, but not for Pendulum and MountainCar environments.

OpenAI classic control environments

Solving the CartPole Reinforcement Learning problem with Pytorch

Marton Trencseni - Tue 22 October 2019 • Tagged with python, pytorch, reinforcement, learning, openai, gym, cartpole

The CartPole problem is the Hello World of Reinforcement Learning, originally described in 1985 by Sutton et al. The environment is a pole balanced on a cart. CartPole is one of the environments in OpenAI Gym, so we don't have to code up the physics. Here I walk through a simple solution using Pytorch.

Cartpole animation

Playing Go with supervised learning in Pytorch

Marton Trencseni - Sun 25 August 2019 • Tagged with python, pytorch, cnn, go

Using historic gameplay between strong Go players as training data, a CNN model is built to predict good Go moves on a standard 19x19 Go board.

Go prediction sample

Exploring prior beliefs with MCMC

Marton Trencseni - Sat 06 July 2019 • Tagged with python, math, pymc3

I use PyMC3 to solve the food delivery toy problem and explore some alternative priors.

PyMC3 traceplot()

Food deliveries, Bayes and Computational Statistics

Marton Trencseni - Sat 22 June 2019 • Tagged with python, math, fetchr

I was grabbing a burger at Shake Shack, Mall of the Emirates in Dubai, when I noticed this notebook on the counter. The staff is using it to track food deliveries and each service (Carriage, Talabat, UberEats, Deliveroo) has its own column with the order numbers. Let's assume this is the only page for the day, and ask ourselves: given this data, what is the probability that UberEats is the most popular food delivery service?.

Shake shack food deliveries

The Collatz conjecture

Marton Trencseni - Sun 02 June 2019 • Tagged with python, math

The Collatz conjecture is a conjecture in mathematics that concerns a sequence defined as follows: start with any positive integer n. Then each term is obtained from the previous term as follows: if the previous term is even, the next term is one half the previous term. If the previous term is odd, the next term is 3 times the previous term plus 1. The conjecture is that no matter what value of n, the sequence will always reach 1.

Collatz

MNIST pixel attacks with Pytorch

Marton Trencseni - Sat 01 June 2019 • Tagged with python, pytorch, cnn, torchvision, mnist, skl

It’s easy to build a CNN that does well on MNIST digit classification. How easy is it to break it, to distort the images and cause the model to misclassify?

MNIST attack accuracy

Solving CIFAR-10 with Pytorch and SKL

Marton Trencseni - Tue 14 May 2019 • Tagged with python, pytorch, cnn, torchvision, cifar, skl

CIFAR-10 is a classic image recognition problem, consisting of 60,000 32x32 pixel RGB images (50,000 for training and 10,000 for testing) in 10 categories: plane, car, bird, cat, deer, dog, frog, horse, ship, truck. Convolutional Neural Networks (CNN) do really well on CIFAR-10, achieving 99%+ accuracy. The Pytorch distribution includes an example CNN for solving CIFAR-10, at 45% accuracy. I will use that and merge it with a Tensorflow example implementation to achieve 75%. We use torchvision to avoid downloading and data wrangling the datasets. Like in the MNIST example, I use Scikit-Learn to calculate goodness metrics and plots.

CIFAR examples

Solving MNIST with Pytorch and SKL

Marton Trencseni - Thu 02 May 2019 • Tagged with python, pytorch, cnn, torchvision, mnist, skl

MNIST is a classic image recognition problem, specifically digit recognition. It contains 70,000 28x28 pixel grayscale images of hand-written, labeled images, 60,000 for training and 10,000 for testing. Convolutional Neural Networks (CNN) do really well on MNIST, achieving 99%+ accuracy. The Pytorch distribution includes a 4-layer CNN for solving MNIST. Here I will unpack and go through this example. We use torchvision to avoid downloading and data wrangling the datasets. Finally, instead of calculating performance metrics of the model by hand, I will extract results in a format so we can use SciKit-Learn's rich library of metrics.

MNIST example digits

rxe: literate and composable regular expressions

Marton Trencseni - Sat 02 March 2019 • Tagged with python

rxe is a thin wrapper around Python's re module. The various rxe functions are wrappers around corresponding re patterns. For example, rxe.digit().one_or_more('a').whitespace() corresponds to \da+\s. Because rxe uses parentheses but wants to avoid unnamed groups, the internal (equivalent) representation is actually \d(?:a)+\s. This pattern can always be retrieved with get_pattern().

rxe example code

Newer Posts