<?xml version="1.0" encoding="utf-8"?>
<feed xmlns="http://www.w3.org/2005/Atom"><title>Bytepawn - Marton Trencseni</title><link href="https://bytepawn.com/" rel="alternate"></link><link href="https://bytepawn.com/feeds/all.atom.xml" rel="self"></link><id>https://bytepawn.com/</id><updated>2024-02-11T00:00:00+01:00</updated><entry><title>2024 Data Outlook</title><link href="https://bytepawn.com/2024-data-outlook.html" rel="alternate"></link><published>2024-02-11T00:00:00+01:00</published><updated>2024-02-11T00:00:00+01:00</updated><author><name>Marton Trencseni</name></author><id>tag:bytepawn.com,2024-02-11:/2024-data-outlook.html</id><summary type="html">&lt;p&gt;It is the beginning of the year — a good time to reflect on the previous year and make plans for the year ahead. I wrote this document for my team members in 2024 January to kick off the year. This is an abridged version with sensitive content removed. &lt;br/&gt;&lt;br/&gt; &lt;img src="/images/outlook-2024-1.jpg" alt="." style="width: 400px;"/&gt;&lt;/p&gt;</summary><content type="html">&lt;h2&gt;Introduction&lt;/h2&gt;
&lt;p&gt;It is the beginning of the year — a good time to reflect on the previous year and make plans for the year ahead. This document is an attempt at that, focusing on the latter, since our 2023 performance reviews covered the previous year.&lt;/p&gt;
&lt;p&gt;I invite you to do the same for yourself — think about what happened to the industry in 2023, where it's going in 2024, and what that means for you. What do you want to learn, whether at the company or outside? What do you want to "import" into the company, whether into an existing project or as a new project? What does a good 2024, the year of AI, look like in your area, in your Business Unit? What do you want to do more of, and what do you want to do less of — both at work and outside of work?&lt;/p&gt;
&lt;p&gt;2024 will be a great year for data.&lt;/p&gt;
&lt;p&gt;It will be what we make of it.&lt;/p&gt;
&lt;p&gt;&lt;em&gt;It will be what you make of it.&lt;/em&gt;&lt;/p&gt;
&lt;p&gt;&lt;img src="/images/outlook-2024-1.jpg" alt="." style="width: 400px;"/&gt;&lt;/p&gt;
&lt;h2&gt;Industry opportunities&lt;/h2&gt;
&lt;h2&gt;Data Science and AI&lt;/h2&gt;
&lt;p&gt;I started studying at the Technical University in Budapest in 1999 for a Computer Science major. Around the third year we had to take the mandatory &lt;em&gt;Artificial Intelligence&lt;/em&gt; course, taught by the Polish-Hungarian professor Dobrowiecki Tadeusz. We were using the dreadful Russell-Norvig &lt;em&gt;Artificial Intelligence&lt;/em&gt; book, but even worse, the Hungarian translation, by Tadeusz, himself not a native speaker.&lt;/p&gt;
&lt;p&gt;&lt;img src="/images/outlook-2024-5.png" alt="." style="width: 100px;"/&gt;&lt;/p&gt;
&lt;p&gt;I didn't get much out of this course, but I do remember some things:&lt;/p&gt;
&lt;ol&gt;
&lt;li&gt;Learning that neural networks are very limited: a neural network — even with multiple "hidden" layers — is essentially a linear transform, a large matrix that maps vectors into vectors&lt;/li&gt;
&lt;li&gt;.. but, no linear map can implement even XOR — this, of course, is true, but only if you don't have activation functions in there. Once you add activation functions, you introduce nonlinearity, and the network can "even" learn XOR 🙂&lt;/li&gt;
&lt;li&gt;AI springs and winters: Tadeusz explained to us that optimism about the role of AI comes and then goes, with long periods of AI winters in between. &lt;em&gt;"This time it's real!"&lt;/em&gt; has been said many times, but so far it always ended up being unreal.&lt;/li&gt;
&lt;li&gt;The first AI winter was triggered by the 1969 book Perceptrons, famously written by Marvin Minsky and Seymour Papert, using — among others — the above XOR argument to discourage work in AI.&lt;/li&gt;
&lt;/ol&gt;
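&lt;p&gt;The XOR point is easy to verify numerically: a purely linear model cannot fit XOR, but a tiny network with a single ReLU hidden layer computes it exactly. The sketch below (numpy, with hand-picked weights rather than trained ones) shows both sides of the argument.&lt;/p&gt;

```python
import numpy as np

X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]], dtype=float)
y = np.array([0, 1, 1, 0], dtype=float)  # XOR truth table

# Best linear fit y ~ w.x + b: least squares over [x1, x2, 1]
A = np.hstack([X, np.ones((4, 1))])
coef, *_ = np.linalg.lstsq(A, y, rcond=None)
linear_pred = A @ coef
print("linear predictions:", linear_pred)  # all 0.5 -- a linear map can't separate XOR

# One hidden ReLU layer suffices: XOR(a, b) = ReLU(a+b) - 2*ReLU(a+b-1)
relu = lambda z: np.maximum(z, 0.0)
W1 = np.array([[1.0, 1.0], [1.0, 1.0]])   # both hidden units sum the inputs
b1 = np.array([0.0, -1.0])                # second unit fires only when a+b = 2
w2 = np.array([1.0, -2.0])
nn_pred = relu(X @ W1 + b1) @ w2
print("network predictions:", nn_pred)    # exactly [0, 1, 1, 0]
```

The best a linear model can do on XOR is predict 0.5 for every input; the two-unit ReLU network reproduces the truth table exactly.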
&lt;p&gt;In 2024, after ~50 years of research, we know that the great man (Minsky) was wrong. We now understand the importance of activation functions, stochastic gradient descent, batch learning, dropout, different cost functions, regularization, we have invented transformer architectures, and so on. I pay $20 per month for my OpenAI subscription and use ChatGPT on a weekly basis, and I can't wait for further commoditization, so the monthly price drops to $5 and eventually $0. The opening image was drawn by ChatGPT4 (after numerous re-prompts by yours truly, because it kept drawing male figures only). Even &lt;a href="https://en.wikipedia.org/wiki/John_Carmack"&gt;John Carmack&lt;/a&gt; (creator of Wolfenstein, Doom, Quake, then CTO of Oculus) is working on AI these days. &lt;em&gt;"This time it's real!"&lt;/em&gt;&lt;/p&gt;
&lt;p&gt;The transformer neural network (and training) architectures driving Generative AI are interesting technologies, and I don't think we should (or can afford to) ignore them. In 2024, having a good grasp of how and why they work is part of the forward-thinking (and intellectual) Data Scientist's (or ML Engineer's, or any data person's) area of interest. Also, there are hundreds of startups trying to commercialize these technologies (Word+GenAI, Powerpoint+GenAI, Jira+GenAI, IDE+GenAI, etc). We need to keep our eyes open for possible application areas within our company.&lt;/p&gt;
&lt;p&gt;Personally, I have spent a fair amount of time in the past learning about and training neural networks (in the pre-generative era); in 2024 I plan to invest time into this area again, eg. playing around with &lt;a href="https://github.com/openlm-research/open_llama"&gt;Llama&lt;/a&gt;s, vector databases and &lt;a href="https://github.com/langchain-ai/langchain"&gt;langchain&lt;/a&gt;.&lt;/p&gt;
&lt;p&gt;But it's not just the latest and greatest Generative AI that is exciting in 2024. The investment and excitement into (neural network based and/or generative) AI spills over into adjacent technical areas of traditional Data Science and Machine Learning (and Data Engineering), so we have &lt;a href="https://github.com/unit8co/darts"&gt;new forecasting libraries&lt;/a&gt;, &lt;a href="https://ai.meta.com/blog/dino-v2-computer-vision-self-supervised-learning/"&gt;new vision models&lt;/a&gt;, better implementations in SKL and LGBM, and tons of technologies to try out on the MLOps and tooling front — literally 100s of tools and libraries.&lt;/p&gt;
&lt;h2&gt;Data Engineering and Platforms&lt;/h2&gt;
&lt;p&gt;This is one of the most exciting areas for us in 2024! When we built our organization's Data Strategy for 2024, our data platform and the data engineering work was the centerpiece of the plan!&lt;/p&gt;
&lt;p&gt;There's so much to do, I don't have to sell it — this stuff sells itself. In 2024, we want to:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;detach from "corporate IT" and build our own way-of-working (eg. with devboxes)&lt;/li&gt;
&lt;li&gt;significantly improve our developer experience&lt;/li&gt;
&lt;li&gt;replace Vertica with a managed DWH solution (such as Snowflake)&lt;/li&gt;
&lt;li&gt;stop building N DWHs and just have N=1, ours&lt;/li&gt;
&lt;li&gt;..and thus, ingest from a lot more source operational databases&lt;/li&gt;
&lt;li&gt;move more of our ingestion to be real-time (vs batch)&lt;/li&gt;
&lt;li&gt;build out a set of core datasets&lt;/li&gt;
&lt;li&gt;build a new, thin ETL framework on top of Airflow&lt;ul&gt;
&lt;li&gt;.. and open source it!&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;li&gt;significantly improve our pipeline and dataset monitoring and alerting&lt;/li&gt;
&lt;li&gt;significantly improve the way we deploy, track and update models in production (MLOps)&lt;/li&gt;
&lt;li&gt;sunset significant portions of the Data Science infra, as we move to a unified platform&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;That's 10+ items, each of them non-trivial!&lt;/p&gt;
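&lt;p&gt;For a sense of what "a thin ETL framework on top of Airflow" could mean, here is a minimal sketch — pure Python, no Airflow dependency, and all names are hypothetical. Tasks register themselves with their upstream dependencies; a real version would compile this structure into an Airflow DAG instead of running it directly.&lt;/p&gt;

```python
from collections import defaultdict

class Pipeline:
    """Tiny task registry; a real version would compile this into an Airflow DAG."""
    def __init__(self, name):
        self.name = name
        self.tasks = {}                 # task name -> callable
        self.deps = defaultdict(list)   # task name -> upstream task names

    def task(self, depends_on=()):
        def register(fn):
            self.tasks[fn.__name__] = fn
            self.deps[fn.__name__] = list(depends_on)
            return fn
        return register

    def run(self):
        """Run tasks in dependency order, passing upstream results as arguments."""
        results = {}
        def visit(name):
            if name not in results:
                upstream = [visit(u) for u in self.deps[name]]
                results[name] = self.tasks[name](*upstream)
            return results[name]
        for name in list(self.tasks):
            visit(name)
        return results

pipeline = Pipeline("daily_revenue")

@pipeline.task()
def extract():
    # stand-in for pulling rows from a source operational database
    return [("2024-01-01", 100), ("2024-01-02", 120)]

@pipeline.task(depends_on=["extract"])
def load(rows):
    # stand-in for writing to the DWH; returns row count
    return len(rows)

print(pipeline.run())
```

The appeal of keeping the framework this thin is that pipeline definitions stay plain Python, so the same declarations can be unit-tested locally and rendered into Airflow operators in production.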
&lt;p&gt;Some technologies that I want to look at in 2024:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;&lt;a href="https://dagshub.com/"&gt;DagsHub&lt;/a&gt;: GitHub for machine learning&lt;/li&gt;
&lt;li&gt;&lt;a href="https://github.com/openlm-research/open_llama"&gt;OpenLlama&lt;/a&gt;: let's make our own LLM&lt;/li&gt;
&lt;li&gt;&lt;a href="https://github.com/jetpack-io/devbox"&gt;DevBox&lt;/a&gt;: open source dev box management&lt;/li&gt;
&lt;li&gt;&lt;a href="https://airflow.apache.org/blog/airflow-2.8.0/"&gt;Airflow 2.8&lt;/a&gt; and beyond&lt;ul&gt;
&lt;li&gt;I think we're getting a lot of bang for our buck from Airflow&lt;/li&gt;
&lt;li&gt;I ❤️ Airflow, I've been using it since 2017 (!)&lt;/li&gt;
&lt;li&gt;.. but, I wonder what's next?&lt;/li&gt;
&lt;li&gt;Today we use Airflow for ML pipelines, but it seems inefficient&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;li&gt;&lt;a href="https://datahub.io/"&gt;DataHub&lt;/a&gt;: we have already deployed it, we need more adoption \&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;&lt;img src="/images/outlook-2024-6.png" alt="." style="width: 100px;"/&gt;&lt;/p&gt;
&lt;h2&gt;AI products&lt;/h2&gt;
&lt;p&gt;Traditionally most of our product work is adding Data Science to an existing product: e.g. we have our loyalty app, it displays offers [anyway], can we rank them better? We have our cinema app, it shows movie recommendations [anyway], can we rank them better? We send out marketing emails [anyway], can we personalize them better? To be clear, there is no problem with the [anyway], in the end it's impact that matters! Having said that, it's interesting to ponder whether the new AI technologies available in 2024 enable us to do new things, ship entirely new products or experiences [for the company]?&lt;/p&gt;
&lt;p&gt;For example, I've been thinking that Amazon has Alexa, which, among other functionalities, allows one to order products from Amazon by voice. It would certainly be easier to build something like Alexa today, given that voice recognition and intent recognition are more commoditized now.&lt;/p&gt;
&lt;p&gt;Having said that, some of us have seen the difficulties in execution around previous AI apps, and we know it's hard to find engagement with Business Units over the Generative AI prototypes we built previously, so AI products are a very long shot, but still something we should think about.&lt;/p&gt;
&lt;h2&gt;Finance: an opportunity?&lt;/h2&gt;
&lt;p&gt;When I was getting my Physics degree at the Eotvos University in Budapest, I took a class on quantitative finance. The context is that in the US, in the 80s and 90s, (some) physicists started to work in the finance industry, on Wall Street, as quants, building models [to find arbitrage opportunities]. In today's lingo they were "Data Scientists in Finance". At a high level, they were (and are) doing the same thing as we do as Data Scientists: trying to build forecasting and decision support models based off data to make more money, using mostly the same tools that we use (simple math, statistical tests, linear regression, decision trees, etc). This phenomenon became so widespread that one such physicist quant called Emanuel Derman even wrote a book about it titled &lt;a href="https://emanuelderman.com/books/my-life-as-a-quant/"&gt;My Life as a Quant&lt;/a&gt;, which I read as a University student. And we had a few such quant physicists in Budapest too, working at our local banks (and Morgan Stanley, which has an office in Budapest), so eventually they also taught a course at the University. Ever since then I've been interested in understanding how quantitative finance works, following this area from a distance. Somewhat related, I've also been reading a lot of investing related books in the last 10 years, many by or about Warren Buffett, James Simons, George Soros, Ray Dalio, etc. &lt;em&gt;What does this have to do with our work as Data Scientists?&lt;/em&gt;&lt;/p&gt;
&lt;p&gt;One thing I've been pondering is the success we have had working with the Financial Planning &amp;amp; Analysis (FP&amp;amp;A) teams. Over the last 12 to 18 months we proved that we (Data Scientists) are able to make significantly better (less biased), more objective forecasts than Finance teams. We are also able to better create what-if scenarios to model and understand the impacts of events such as pandemics, boycotts and wars. We are now not just forecasting revenues, but have also started to work on forecasting costs.&lt;/p&gt;
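&lt;p&gt;"Less biased" is measurable: once actuals are in, any forecast can be scored on signed bias (does it systematically over- or under-shoot?) and on absolute error. A minimal sketch with made-up numbers, where the "finance" and "data science" series are purely illustrative:&lt;/p&gt;

```python
import numpy as np

actuals = np.array([100, 110, 105, 120, 130, 125], dtype=float)     # made-up monthly revenue
finance_fc = np.array([115, 120, 118, 130, 142, 140], dtype=float)  # an optimistic plan
ds_fc = np.array([102, 108, 107, 118, 131, 127], dtype=float)       # a model forecast

def bias(forecast, actual):
    """Mean signed error as a fraction of actuals: positive means over-forecasting."""
    return np.mean((forecast - actual) / actual)

def mape(forecast, actual):
    """Mean absolute percentage error."""
    return np.mean(np.abs(forecast - actual) / actual)

for name, fc in [("finance", finance_fc), ("data science", ds_fc)]:
    print(f"{name}: bias={bias(fc, actuals):+.1%}, MAPE={mape(fc, actuals):.1%}")
```

The point of the two metrics together: a forecast can have small absolute errors yet still lean consistently in one direction, and it's the systematic lean that budget decisions inherit.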
&lt;p&gt;Over the winter break I was reading the excellent book &lt;a href="https://www.amazon.com/Pursuit-Perfect-Portfolio-Insights-Pioneers/dp/B099Y6RPBM"&gt;In Pursuit of the Perfect Portfolio&lt;/a&gt;, about luminaries and their work in the Finance industry: Markowitz, Sharpe, Fama, Black, Scholes, etc. They invented modern portfolio theory with the idea of modeling risk vs return, covariance and diversification, CAPM and the efficient frontier, the efficient market hypothesis, the Black-Scholes option pricing formula, etc. This is all stuff I learned at University 20 years ago. When I learned about these topics, and most of the time when I read about them, it was in the context of trading: you are investing/trading, where do you put your (or your hedge fund's) money, how much do you diversify, etc. One thing that surprised me was that many of the original papers addressed these topics in the context of &lt;em&gt;corporate finance&lt;/em&gt;.&lt;/p&gt;
&lt;p&gt;&lt;img src="/images/outlook-2024-3.png" alt="." style="width: 400px;"/&gt;&lt;/p&gt;
&lt;p&gt;So I began thinking: currently the role of data (and data science) is to improve execution when the company decides to do X. But what about helping to decide what X to do and not to do, ie. where to &lt;em&gt;invest&lt;/em&gt; money, or not to invest, and how to balance this portfolio? In other words, given where and how we can invest money [+]:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;what are the expected returns?&lt;ul&gt;
&lt;li&gt;what is the best way to approximate returns?&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;li&gt;what are the risks and covariances?&lt;ul&gt;
&lt;li&gt;what is the best way to approximate risk and covariance?&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;li&gt;in the language of CAPM, what is the efficient frontier for our company?&lt;/li&gt;
&lt;/ul&gt;
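&lt;p&gt;These questions can be prototyped with the standard Markowitz machinery: given (assumed) expected returns and a covariance matrix for the company's candidate investments, sample random allocations and trace out the risk/return cloud whose upper edge is the efficient frontier. A sketch with entirely made-up numbers for four hypothetical investment areas:&lt;/p&gt;

```python
import numpy as np

rng = np.random.default_rng(42)

# Hypothetical inputs: expected annual returns and covariance of 4 investment areas
mu = np.array([0.08, 0.12, 0.05, 0.15])
cov = np.array([
    [0.04, 0.01, 0.00, 0.02],
    [0.01, 0.09, 0.00, 0.03],
    [0.00, 0.00, 0.01, 0.00],
    [0.02, 0.03, 0.00, 0.16],
])

# Monte Carlo over random long-only budget allocations (weights sum to 1)
n = 10_000
w = rng.dirichlet(np.ones(4), size=n)
rets = w @ mu                                         # expected portfolio returns
vols = np.sqrt(np.einsum('ij,jk,ik->i', w, cov, w))   # portfolio volatilities
sharpe = rets / vols                                  # risk-free rate assumed 0

best = np.argmax(sharpe)
print(f"best Sharpe: {sharpe[best]:.2f}, return: {rets[best]:.1%}, vol: {vols[best]:.1%}")
```

In the corporate-finance reading, the weights are next year's budget split across initiatives, and the hard part is not this arithmetic but estimating `mu` and `cov` — which is exactly the forecasting work described above.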
&lt;p&gt;In practice this may work out to be very similar to the Monte Carlo / variance project that we worked on previously, where they supported decisions by creating a distribution of outcomes (vs point estimates), which reportedly had significant impact.&lt;/p&gt;
&lt;p&gt;[+] This is the exercise we play every year Q4 called budgeting: we ask for money.&lt;/p&gt;
&lt;h2&gt;Personal Development&lt;/h2&gt;
&lt;p&gt;One of my 2023 OKRs for Data Science was &lt;em&gt;"Build a credible career path for Data Scientists"&lt;/em&gt;, where I scored 0. I want to change this 0 to a 1 in 2024, also for Data Engineers and Platform Engineers.&lt;/p&gt;
&lt;p&gt;The fundamental problem is that we are a small organization, with 12 Data Scientists (6 Dubai, 6 Remote), and even fewer Data Engineers (2 Dubai, 2 Remote) and Platform Engineers (4 Dubai). We have a very small population per function, and it's hard to establish statistical baselines/calibrations at each level, especially with most people working alone (esp. DS) and having to be able to "swim" alone in the "stormy waters", irrespective of level. At tech companies, with 100+ populations per function, the people making the promotion decisions are of the same function, but ranked higher (the people deciding about the promotion of a Data Engineer are the TL/Data Engineering Manager, Director DE, VP DE, CTO, all technical folks). For us this is not the case — of the roughly 5-6 people who need to agree to a promotion, most will be non-technical. So the reason for promotion needs to be something that non-technical people understand. Examples that seem to work:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;Alice is a Data Scientist, but is also leading a team of 4 Data Engineers&lt;/li&gt;
&lt;li&gt;Bob is a Data Scientist, we have a big PAIN [+] in area X, and Bob solved it&lt;/li&gt;
&lt;li&gt;Alice led project X to completion &lt;em&gt;and project X was highly visible to executives&lt;/em&gt;&lt;/li&gt;
&lt;li&gt;Bob is critical to keep the lights on, otherwise PAIN&lt;/li&gt;
&lt;li&gt;Alice is &lt;em&gt;already&lt;/em&gt; managing 5 people of the same function&lt;/li&gt;
&lt;li&gt;Bob had X million AED impact this year&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;Realistically, we need at least 2 of these for a strong promotion case.&lt;/p&gt;
&lt;p&gt;[+] PAIN needs to be something that non-technical management understands, like escalations&lt;/p&gt;
&lt;p&gt;&lt;img src="/images/outlook-2024-2.png" alt="." style="width: 400px;"/&gt;&lt;/p&gt;
&lt;ol&gt;
&lt;li&gt;We need to create &lt;strong&gt;personalized plans&lt;/strong&gt; for everybody that attempt to create these situations.&lt;/li&gt;
&lt;li&gt;We will create the plan by understanding the strengths and weaknesses of everybody, and where they want to improve, along dimensions like in the spider chart above.&lt;/li&gt;
&lt;li&gt;It's the person's job to remember this plan and work on it continuously!&lt;/li&gt;
&lt;li&gt;We need to accept that there's an element of luck, in the sense that the environment also plays a role, and we don't control all of our environment.&lt;/li&gt;
&lt;li&gt;&lt;em&gt;We create our own luck, and, luck favors the prepared.&lt;/em&gt; Eg. if there's an opportunity to work on a high-impact, high-visibility project, it's likely we will pick somebody who is already performing at a high level, because that's the right choice for the organization. So even if you feel you're currently not in a lucky situation, you need to run at high performance, so when the opportunity comes, you're in a good position to leverage it.&lt;/li&gt;
&lt;/ol&gt;
&lt;h2&gt;Job security and layoffs&lt;/h2&gt;
&lt;p&gt;We are not planning layoffs — this has not been on my mind for many months. The reason I added this section is that — to my surprise — some tech companies started doing layoffs in the second week of January: Google, Twitch, Amazon, Discord, etc.&lt;/p&gt;
&lt;p&gt;As I just said, we are not planning layoffs. Also, even if that were the case, we're already a very barebones team, so I personally am not worried. Having said that, in 2024, like in every year, the best strategy to hedge the risk of job loss (for you, me and everybody) is to:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;be a high performer&lt;/li&gt;
&lt;li&gt;work on your visibility, so people know you're a high performer (!)&lt;/li&gt;
&lt;li&gt;"be so good they can't ignore you", so in case it comes to it, you quickly get a new job&lt;/li&gt;
&lt;li&gt;outside of work, invest time and energy into thinking about your personal finances and investments — the best job is the one you don't really need 🙂&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;&lt;img src="/images/outlook-2024-4.jpg" alt="." style="width: 500px;"/&gt;&lt;/p&gt;
&lt;p&gt;The part in quotes is a reference to Cal Newport's excellent book &lt;a href="https://www.amazon.ae/Good-They-Cant-Ignore-You/dp/1455509124"&gt;So Good They Can't Ignore You&lt;/a&gt; (originally a quote from comedian Steve Martin), which I read many years ago, and I found to be true:&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;Georgetown University professor Cal Newport debunks the long-held belief that follow your passion is good advice, and sets out on a quest to discover the reality of how people end up loving their careers. Not only are pre-existing passions rare and have little to do with how most people end up loving their work, but a focus on passion over skill can be dangerous, leading to anxiety and chronic job hopping. Spending time with organic farmers, venture capitalists, screenwriters, freelance computer programmers, and others who admitted to deriving great satisfaction from their work, Newport uncovers the strategies they used and the pitfalls they avoided in developing their compelling careers. Cal reveals that matching your job to a pre-existing passion does not matter. Passion comes after you put in the hard work to become excellent at something valuable, not before.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;In other words, look out for areas where you have an opportunity to excel, work hard, invest a lot of time to become a master in that area, and then enjoy the fruits of your work. For example, if you're working on forecasting projects, you should become the master of forecasting and financial data science, even if you had no &lt;em&gt;a priori&lt;/em&gt; "passion" for the topic back when you started working on it 2 years ago. Then, once you're the undisputed master at the organization, you'll enjoy it, and maybe it will feel like you had passion all along!&lt;/p&gt;
&lt;h2&gt;Books&lt;/h2&gt;
&lt;p&gt;In no particular order, here are some books relevant to 2024 (that I have read or have bought and plan to read):&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;&lt;a href="https://www.oreilly.com/library/view/experimentation-for-engineers/9781617298158/"&gt;Experimentation for Engineers&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://www.oreilly.com/library/view/reliable-machine-learning/9781098106218/"&gt;Reliable Machine Learning&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://www.amazon.ae/Information-Theory-Inference-Learning-Algorithms/dp/0521642981"&gt;Information Theory, Inference and Learning Algorithms&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://www.amazon.com/Making-Databases-Work-Pragmatic-Stonebraker/dp/1947487167"&gt;Making Databases Work&lt;/a&gt;&lt;ul&gt;
&lt;li&gt;collection of Stonebraker papers&lt;/li&gt;
&lt;li&gt;he's the father (along with Jim Gray) of much of the database technology we use today&lt;/li&gt;
&lt;li&gt;started Ingres, which then became Postgres&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;li&gt;&lt;a href="https://www.oreilly.com/library/view/the-staff-engineers/9781098118723/"&gt;The Staff Engineer's Path&lt;/a&gt;&lt;ul&gt;
&lt;li&gt;Pradeep is reading this&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;li&gt;&lt;a href="https://mlstory.org/"&gt;Patterns, Predictions, and Actions&lt;/a&gt;&lt;ul&gt;
&lt;li&gt;I've already read this, highly recommended&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;li&gt;&lt;a href="https://www.amazon.com/Financial-Econometrics-Problems-Methods-Princeton-ebook/dp/B07SQRJMXC"&gt;Financial Econometrics&lt;/a&gt;&lt;ul&gt;
&lt;li&gt;because I mentioned Markowitz and CAPM&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;li&gt;&lt;a href="https://www.oreilly.com/library/view/beautiful-c-30/9780137647767/"&gt;Beautiful C++&lt;/a&gt;&lt;ul&gt;
&lt;li&gt;I've already read this, highly recommended&lt;/li&gt;
&lt;li&gt;as a former C++ programmer, I'm always looking for ways to bring back C++ into my life 🙂&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;li&gt;&lt;a href="https://www.oreilly.com/library/view/machine-learning-design/9781098115777/"&gt;Machine Learning Design Patterns&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://www.lulu.com/shop/fran%C3%A7ois-fleuret/the-little-book-of-deep-learning"&gt;The Little Book of Deep Learning&lt;/a&gt; (recommended by the legendary John Carmack)&lt;/li&gt;
&lt;li&gt;&lt;a href="https://www.bishopbook.com/"&gt;Deep Learning: Foundations and Concepts&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://mitpress.mit.edu/9780262048644/understanding-deep-learning/"&gt;Understanding Deep Learning&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://www.thescienceofdeeplearning.org/"&gt;The Science of Deep Learning&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://efficientlinux.com/"&gt;Efficient Linux at the Command Line&lt;/a&gt;&lt;ul&gt;
&lt;li&gt;Mohit is reading this&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;li&gt;&lt;a href="https://www.amazon.com/System-Design-Interview-insiders-Second/dp/B08CMF2CQF"&gt;System Design Interview&lt;/a&gt; Volume 1 and &lt;a href="https://www.amazon.com/System-Design-Interview-Insiders-Guide/dp/1736049119"&gt;Volume 2&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://www.amazon.com/Machine-Learning-System-Design-Interview/dp/1736049127"&gt;Machine Learning System Design Interview&lt;/a&gt;&lt;ul&gt;
&lt;li&gt;I've read all 3, they are great&lt;/li&gt;
&lt;li&gt;One of the best ways you can learn how other companies do things is to see what they ask on interviews&lt;/li&gt;
&lt;li&gt;Here, the author has done this work and asked lots of people at big tech companies, and compiled it into these books!&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;li&gt;&lt;a href="https://realpython.com/products/cpython-internals-book/"&gt;CPython Internals&lt;/a&gt;&lt;ul&gt;
&lt;li&gt;I read this, I didn't get much out of it&lt;/li&gt;
&lt;li&gt;CPython mostly works as you'd expect if you know the basics about compilers and interpreters&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;li&gt;&lt;a href="https://www.amazon.ae/Good-They-Cant-Ignore-You/dp/1455509124"&gt;So Good They Can't Ignore You&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://emanuelderman.com/books/my-life-as-a-quant/"&gt;My Life as a Quant&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;Note: I definitely buy more books than I can read..&lt;/p&gt;
&lt;h2&gt;Personal reflections&lt;/h2&gt;
&lt;p&gt;In 2024 I plan to make some minor modifications to how I manage myself:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;be more rigorous with keeping a daily log&lt;/li&gt;
&lt;li&gt;write weekly updates&lt;/li&gt;
&lt;li&gt;defend my time better&lt;ul&gt;
&lt;li&gt;I think I spent a lot of time in 2023 in pointless meetings, which ultimately didn't add any value for the company or me&lt;/li&gt;
&lt;li&gt;eg. reduce time spent in 1v1s, but increase the effectiveness: I continue to strongly believe in 1v1s, but my hypothesis is that a lot of my hours spent in 1v1s in 2023 were not effective&lt;/li&gt;
&lt;li&gt;.. but, 1v1s are not the main culprit, it's just "too many pointless meetings"&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;li&gt;write more&lt;ul&gt;
&lt;li&gt;eg. documents like this&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;li&gt;in the last couple of months I've been thinking a lot about the productivity of people like Elon Musk — why is it that some people are 100x or 1000x more productive than me?&lt;ul&gt;
&lt;li&gt;of course billionaires who own multiple companies and tell 1000s of people what to work on have significantly higher leverage, so there is no real mystery here 🙂&lt;/li&gt;
&lt;li&gt;there are lots of self-help books written, with advice like "wake up at 5am", "write lists", etc.&lt;ul&gt;
&lt;li&gt;this is actually good advice, I do a lot of things like this (but far from perfect), and I think things like this &lt;em&gt;can&lt;/em&gt; 2x your productivity&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;li&gt;the one thing I notice when reading stories about highly successful people is that they have the ability to be demanding (vs always just being nice)&lt;ul&gt;
&lt;li&gt;a little bit like when I said in Q1 &lt;em&gt;"everybody owes me a Generative AI prototype"&lt;/em&gt;&lt;/li&gt;
&lt;li&gt;maybe I should be more like that&lt;ul&gt;
&lt;li&gt;with myself and others&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;li&gt;stretch goal: in terms of the way we run, we are essentially a remote team: what should our tooling look like, given that? eg. simple things like vacation tracking or who-is-in-the-office are still not solved&lt;/li&gt;
&lt;/ul&gt;</content><category term="outlook"></category><category term="2024"></category><category term="datahub"></category><category term="capm"></category></entry><entry><title>How to achieve high performance, and, ratings</title><link href="https://bytepawn.com/how-to-achieve-high-performance-and-ratings.html" rel="alternate"></link><published>2024-01-21T00:00:00+01:00</published><updated>2024-01-21T00:00:00+01:00</updated><author><name>Marton Trencseni</name></author><id>tag:bytepawn.com,2024-01-21:/how-to-achieve-high-performance-and-ratings.html</id><summary type="html">&lt;p&gt;I wrote this document for my team members — Data Scientists and Data Engineers — to help them do a better job on their annual performance reviews.&lt;br/&gt;&lt;br/&gt; &lt;img src="/images/performance-lion2.jpg" alt="Watts-Strogatz" style="width: 400px;"/&gt;&lt;/p&gt;</summary><content type="html">&lt;p&gt;&lt;em&gt;I wrote this document for my team members — Data Scientists and Data Engineers — to help them do a better job on their annual performance reviews.&lt;/em&gt;&lt;/p&gt;
&lt;p&gt;&lt;img src="/images/performance-lion2.jpg" alt="Watts-Strogatz" style="width: 400px;"/&gt;&lt;/p&gt;
&lt;h2&gt;Introduction&lt;/h2&gt;
&lt;p&gt;The purpose of this brief note is to remind us what the key to getting good performance (and good ratings) is. I write this because, after looking through 10+ reviews this week, I think it's a good idea — we have room for improvement.&lt;/p&gt;
&lt;p&gt;The key to getting good performance ratings is to consciously invest time, energy and attention to all steps of the &lt;strong&gt;performance journey&lt;/strong&gt;. The steps are, roughly:&lt;/p&gt;
&lt;ol&gt;
&lt;li&gt;Set performance goals and OKRs &amp;nbsp; 📝&lt;/li&gt;
&lt;li&gt;Throughout the year, work hard and focus on impact &amp;nbsp; 📈&lt;/li&gt;
&lt;li&gt;Visibility &amp;nbsp; 👀&lt;/li&gt;
&lt;li&gt;Fill out self-review and score OKRs &amp;nbsp; 🖋️&lt;/li&gt;
&lt;li&gt;Manager speaks to employee and rates employee&lt;/li&gt;
&lt;li&gt;Calibration&lt;/li&gt;
&lt;/ol&gt;
&lt;p&gt;I will concentrate on the first 4 below.&lt;/p&gt;
&lt;h2&gt;Set performance goals and OKRs&lt;/h2&gt;
&lt;p&gt;Setting strong, ambitious goals, so ambitious that you feel you cannot hit all of them, is key here. If you set yourself a few, easy goals, even if you hit them, your manager will say, with all honesty, &lt;em&gt;"great job, this is baseline performance (Meets Expectation)"&lt;/em&gt;. If you don't set strong, ambitious key results on a certain objective, your best chance of going beyond the baseline is probably to hope for an external piece of work "forcing" you to go beyond that, even though it was not part of the OKRs. But why would you do that? Plan for excellence, don't wait or hope for it to happen!&lt;/p&gt;
&lt;p&gt;Most large organizations have generic objectives that don't change much year to year. A good rule of thumb is to have 5-10 Key Results (KRs) per Objective. Even for objectives like Company Values you can enumerate plans/ideas for how you will exceed expectations! Make them ambitious: when you set them, you should feel a bit overwhelmed, like &lt;em&gt;"I'll be happy if I hit half of this"&lt;/em&gt;. I feel like that every time we build team OKRs, and in the end we usually end up hitting 70-80%, because a year is actually a long time, we're all hard workers, and often things are not as hard as they seem.&lt;/p&gt;
&lt;p&gt;Personally I'm a big fan of making goals and OKRs. I also do it personally. Every year, in the first couple of weeks in January I build my annual goals (not OKRs) in a Google doc. Usually it ends up being 2-3 pages long. I group them under headings like Reading, Writing, Professional, Fitness, Relax, Money. You can check a &lt;a href="https://docs.google.com/document/d/1BGlDYlS3pqs9DuLQP-msz_CyPu-6C9P_eP-RofcAcog/edit"&gt;sample here&lt;/a&gt;.&lt;/p&gt;
&lt;p&gt;OKRs, or more broadly, goals, are not something we do because the company is asking us to. We'd do OKRs and goaling anyway. As you can see, although I don't follow the OKR structure specifically, I do something similar personally also, and have for a long time. The point of goaling is to think through what we want to do, to make a plan, and then write down what "good" looks like in terms of outcomes. An additional point about "goaling" is the dopamine hit that we get when we accomplish a key result or goal item.&lt;/p&gt;
&lt;p&gt;We are data people and love metrics, so I won't sell &lt;a href="https://en.wikipedia.org/wiki/SMART_criteria"&gt;SMART goaling&lt;/a&gt;, just give a brief overview below. Here are the five parts of SMART:&lt;/p&gt;
&lt;ol&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;Specific&lt;/strong&gt;: Goals should be clear and specific, so you know exactly what you're aiming for. This involves answering the "W" questions: What do you want to accomplish? Why is this goal important? Who is involved? Where is it located? Which resources are needed?&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;Measurable&lt;/strong&gt;: A goal must have criteria for measuring progress. This helps to stay on track and meet deadlines. Ask questions like: How much? How many? How will I know when it is accomplished?&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;Achievable&lt;/strong&gt;: The goal should be realistic and attainable to be successful. This means you have to figure out how to accomplish it and whether it's possible, given the current constraints such as time and resources.&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;Relevant&lt;/strong&gt;: The goal should matter to you and align with other relevant goals. It should come at the right time and fit with other efforts or needs. Ask: Does this seem worthwhile? Is this the right time? Does this match our other efforts/needs?&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;Time-Bound&lt;/strong&gt;: Every goal needs a target date, so there's a deadline to focus on. This part of the SMART goal criteria helps to prevent everyday tasks from taking priority over longer-term goals. Ask: When can I achieve this goal? What can I do six months from now? What can I do six weeks from now? What can I do today?&lt;/p&gt;
&lt;/li&gt;
&lt;/ol&gt;
&lt;p&gt;In my experience, when making annual goals, most of the time it's not realistic to hit all 5 components of SMART, with &lt;strong&gt;Specific&lt;/strong&gt; and &lt;strong&gt;Measurable&lt;/strong&gt; being the trickiest. My approach is not to obsess too much over this, but to write something that is clear to you and your manager; later on you can come back and improve the wording. I don't believe in hard-locked OKRs; minor refinements to definitions are fine!&lt;/p&gt;
&lt;p&gt;Are you doing a good job on this? Most of us can improve:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;have you thought through what you want to achieve this year?&lt;ul&gt;
&lt;li&gt;also on Value objectives?&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;li&gt;do you have 5-10 KRs per Objective?&lt;/li&gt;
&lt;li&gt;did you invest time and mental energy to make your goals as SMART as possible?&lt;/li&gt;
&lt;li&gt;have you taken multiple stabs at goaling, sleeping on it between iterations?&lt;/li&gt;
&lt;li&gt;have the OKRs made you stretch and improve, or are they just about doing the baseline?&lt;/li&gt;
&lt;li&gt;bonus: have you built personal goals?&lt;/li&gt;
&lt;/ul&gt;
&lt;h2&gt;Work hard and focus on impact&lt;/h2&gt;
&lt;p&gt;This will be the shortest section.&lt;/p&gt;
&lt;p&gt;We don't have a problem here.&lt;/p&gt;
&lt;p&gt;Most of us work hard and accomplish good outcomes for our organization.&lt;/p&gt;
&lt;p&gt;But remember: if you do a "good job" throughout the year, that's the baseline, so that will be "Meets Expectations". The expectation, towards all of us (your manager included), is that we do a good job. The baseline is not that we don't work at all, so that merely doing our job already counts as exceeding. The reason I point this out is that sometimes, on self-review, a person will essentially list out the baseline activities of their job and then rate themselves "Exceeds" or "Exceptional".&lt;/p&gt;
&lt;p&gt;The easiest way to not exceed is to forget about the performance journey and OKRs throughout the year. I see this happening a lot, with people scrambling in December to hit as many key results as possible. The smart thing to do is to create a system — with your manager — to continuously keep the OKRs in mind and work on them. For example, if your KR is to &lt;em&gt;"fix 24 tech debt tickets"&lt;/em&gt;, it's definitely easier to do 2 per month than to try to do as many as possible in December!&lt;/p&gt;
&lt;h2&gt;Visibility&lt;/h2&gt;
&lt;p&gt;&lt;em&gt;If a tree falls in a forest, and there’s no one around to hear it, does it make a sound?&lt;/em&gt;&lt;/p&gt;
&lt;p&gt;The physics answer is, &lt;em&gt;of course it does&lt;/em&gt;. But a company is a set of people, and people are not like physics! At any organization, &lt;em&gt;you need to make sure that people know about all the good work you're doing! No exceptions.&lt;/em&gt;&lt;/p&gt;
&lt;p&gt;You may think, &lt;em&gt;"But why? I just want to do a good job, and I want to be recognized for it. Isn't it enough if you know? This is unfair, it wasn't on my JD."&lt;/em&gt;&lt;/p&gt;
&lt;p&gt;The situation is simple: in any organization beyond a trivial size (the cut-off is somewhere around 100-150, &lt;a href="https://en.wikipedia.org/wiki/Dunbar%27s_number"&gt;the Dunbar number&lt;/a&gt;), there will be some sort of "quorum" or "committee" that will ultimately evaluate you. So, it's not just your manager that needs to know — you have to have &lt;strong&gt;social proof&lt;/strong&gt;. You and your manager can write down the list of these 5-10 people! Your best bet at getting good outcomes based on your work is if these people:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;know who you are&lt;/li&gt;
&lt;li&gt;know what you did&lt;/li&gt;
&lt;li&gt;think you did a great job that had a positive impact for the company&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;You may think, &lt;em&gt;"But Marton, isn't it your job to make sure I get recognized by these people?"&lt;/em&gt;. In a way, yes, it's my job. But imagine Alice and Bob, both did a great job, and Alice worked on her visibility, while Bob relied on his manager for it. However, Bob's manager is actually responsible for providing this visibility for 20+ people, and is not able to remember everything that happened with 20 people across 12 months. Who has better chances of getting recognized? If senior management is already aware about Alice and her work, they are already &lt;a href="https://en.wikipedia.org/wiki/Priming_(psychology)"&gt;primed&lt;/a&gt; before they come to the calibration meeting.&lt;/p&gt;
&lt;p&gt;&lt;em&gt;There is every reason in the world to work on your visibility, and no reason against it.&lt;/em&gt; Working on your visibility is not "cheating". Also, your manager will personally be grateful to you, because they will have an easier time getting you good outcomes.&lt;/p&gt;
&lt;p&gt;It's also worth noting, as I said, that this is not specific to our company. Take promotions at Facebook, for example: it has a very different structure, but it was the same story at a high level. Getting promoted was very hard: you had to present a case to a Promotion Committee, and it was much more likely to get approved if the members knew who you were and knew what you did, so it was an "obvious case". Because it was hard, people worked very hard on improving their visibility.&lt;/p&gt;
&lt;h2&gt;Fill out self-review and score OKRs&lt;/h2&gt;
&lt;p&gt;This week, as I was doing everybody's reviews, my starting point was to read everybody's self-review. I was surprised that many people chose to not fill out the self-review at all, or not put much energy into it! In these cases I sent it back and asked the person to fill it out and put effort into it.&lt;/p&gt;
&lt;p&gt;The self-review is essentially the last part of visibility. Not putting in the effort to present your case here is a huge wasted opportunity! As a manager, I can tell you how I do reviews: I open up the system, and see what you wrote (text), and then see if it supports your rating. Of course, I also try to remember all the things that happened in the year, the good things that you did. But in general, I'm assuming that if something is worth mentioning, you will mention it in the self-review — because it's your job to do so at this stage, and also your self-interest!&lt;/p&gt;
&lt;p&gt;A good self-review looks like this:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;it lists out all the good things you did this year — be concrete, not generic&lt;/li&gt;
&lt;li&gt;it is specific about what you did (vs the team that you worked on)&lt;ul&gt;
&lt;li&gt;use "I" instead of "we"!&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;li&gt;it is specific about the impact (eg. 10M USD)&lt;ul&gt;
&lt;li&gt;concrete numbers are better than estimates&lt;/li&gt;
&lt;li&gt;bottom-of-funnel numbers (eg. net profit) are better than top-of-funnel numbers (eg. income)&lt;/li&gt;
&lt;li&gt;numbers other than USD numbers are fine (eg. # of techtalks, # of interviews, # pageviews, etc)&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;li&gt;it's fine to practice some salesmanship, but be concrete&lt;/li&gt;
&lt;li&gt;this is the time (along with generic within-the-year visibility) to be bold&lt;/li&gt;
&lt;li&gt;your text should support your rating — if you rate yourself "Exceeds" or "Exceptional" for an objective, don't put down bullet points and sentences that essentially say &lt;em&gt;"This is my baseline job and I did it."&lt;/em&gt; Your comments need to support why you were beyond the baseline.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;if you don't take OKRs and performance reviews seriously, don't expect your manager to take you seriously!&lt;/strong&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;h2&gt;A note on calibrations&lt;/h2&gt;
&lt;p&gt;The way calibrations work is the following: Corporate HR sets a budget for the percentage of people that can be within each bucket (Meets, Exceeds, Exceptional). The input to calibration is all the people in the business unit, with the scores their managers set.&lt;/p&gt;
&lt;p&gt;Every year, there are way too many people in the Exceeds and Exceptional buckets, so calibration is about discussing and figuring out "who to move down". Say there are too many people in Exceeds: then Alice would have to explain to the group why she rated Bob 3.8 Exceeds. All the other managers explain too, and by the end of the meeting (or meetings), the managers have to agree on who, upon hearing their manager’s explanation, is not really Exceeds but Meets Expectations. In a way, it's grading on a curve.&lt;/p&gt;
&lt;p&gt;This is why visibility is so important. If I'm explaining why Bob is Exceeds, and nobody in the calibration meeting even knows who Bob is, it's much tougher to make that argument. For Exceptional, a lot of people had better know who Bob is and what he did!&lt;/p&gt;
&lt;p&gt;As I mentioned above, calibration is not per function or per department, it's across the business unit.&lt;/p&gt;
&lt;h2&gt;A note on promotions&lt;/h2&gt;
&lt;p&gt;I think everybody knows this, but I will say it just to be safe. There is never any guarantee of a given rating, even if we do a great job. For example, during Covid, everybody got a default "Meets" rating, with no increment and no bonus.&lt;/p&gt;
&lt;p&gt;This applies also for promotions. In my experience, at our company, promotions are "needs"-based. So the most likely way to get promoted is if there's a slot that the company needs to fill. It has happened, but not so often, that people get promoted simply based on a performance plus level argument (Bob did great, he is currently at level L, but he has been consistently performing at level L+1).&lt;/p&gt;</content><category term="okr"></category><category term="goaling"></category><category term="performance"></category><category term="people"></category><category term="management"></category></entry><entry><title>Wordcount III: Beating the system `wc`</title><link href="https://bytepawn.com/beating-the-system-wc.html" rel="alternate"></link><published>2023-09-29T00:00:00+02:00</published><updated>2023-09-29T00:00:00+02:00</updated><author><name>Marton Trencseni</name></author><id>tag:bytepawn.com,2023-09-29:/beating-the-system-wc.html</id><summary type="html">&lt;p&gt;In this follow-up to the previous article about writing a C++ version of the Unix command-line utility &lt;code&gt;wc&lt;/code&gt;, I make some modifications to beat my system &lt;code&gt;wc&lt;/code&gt; in performance tests.&lt;br/&gt;&lt;br/&gt; &lt;img src="/images/wcpp3.png" alt="wc" style="width: 400px;"/&gt;&lt;/p&gt;</summary><content type="html">&lt;h2&gt;Introduction&lt;/h2&gt;
&lt;p&gt;In the previous two articles I re-implemented &lt;code&gt;wc&lt;/code&gt; in modern C++ as a toy exercise:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;&lt;a href="https://bytepawn.com/implementing-the-unix-command-line-tool-wc-in-modern-cpp.html"&gt;Wordcount I: Implementing the Unix command-line tool &lt;code&gt;wc&lt;/code&gt; in modern C++&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://bytepawn.com/making-the-wc-state-machines-generic-cpp.html"&gt;Wordcount II: Introducing a cleaner C++ class hierarchy to &lt;code&gt;wc&lt;/code&gt;&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;I had three reasons to write a third version.&lt;/p&gt;
&lt;p&gt;First, my core architectural takeaway at the end of the second article was that, since &lt;code&gt;wc&lt;/code&gt; is a fully specified tool that is not going to see new features introduced, a lot of the usual C++ abstractions are not necessary. For example, the code doesn't need to follow the Open–closed principle and be &lt;em&gt;open for extension, but closed for modification&lt;/em&gt;. So the complexity introduced in the second version isn't good complexity.&lt;/p&gt;
&lt;p&gt;Second, I found that the second (and also the first) implementation loses time compared to my system &lt;code&gt;wc&lt;/code&gt; in the following &lt;code&gt;for_each&lt;/code&gt; loop (the same holds for other ways of iterating a C++ container):&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;span class="kt"&gt;unsigned&lt;/span&gt; &lt;span class="nf"&gt;process_block&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
    &lt;span class="k"&gt;const&lt;/span&gt; &lt;span class="kt"&gt;char&lt;/span&gt;&lt;span class="o"&gt;*&lt;/span&gt; &lt;span class="n"&gt;cp&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="kt"&gt;unsigned&lt;/span&gt; &lt;span class="n"&gt;n&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;stream_counter_vector_t&lt;/span&gt;&lt;span class="o"&gt;&amp;amp;&lt;/span&gt; &lt;span class="n"&gt;counters&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="kt"&gt;bool&lt;/span&gt; &lt;span class="n"&gt;eof&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="kt"&gt;long&lt;/span&gt; &lt;span class="n"&gt;remaining&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;n&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
    &lt;span class="n"&gt;std&lt;/span&gt;&lt;span class="o"&gt;::&lt;/span&gt;&lt;span class="kt"&gt;mbstate_t&lt;/span&gt; &lt;span class="n"&gt;ps&lt;/span&gt;&lt;span class="p"&gt;{};&lt;/span&gt;
    &lt;span class="k"&gt;while&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;remaining&lt;/span&gt; &lt;span class="o"&gt;&amp;gt;&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="c1"&gt;// ... code omitted&lt;/span&gt;
        &lt;span class="n"&gt;std&lt;/span&gt;&lt;span class="o"&gt;::&lt;/span&gt;&lt;span class="n"&gt;for_each&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;counters&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;begin&lt;/span&gt;&lt;span class="p"&gt;(),&lt;/span&gt; &lt;span class="n"&gt;counters&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;end&lt;/span&gt;&lt;span class="p"&gt;(),&lt;/span&gt;                    &lt;span class="o"&gt;&amp;lt;-----&lt;/span&gt;
            &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="o"&gt;&amp;amp;&lt;/span&gt;&lt;span class="p"&gt;](&lt;/span&gt;&lt;span class="k"&gt;auto&lt;/span&gt;&lt;span class="o"&gt;&amp;amp;&lt;/span&gt; &lt;span class="n"&gt;c&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="n"&gt;c&lt;/span&gt;&lt;span class="o"&gt;-&amp;gt;&lt;/span&gt;&lt;span class="n"&gt;process_wchar&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;wc&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;num_bytes&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;error&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt; &lt;span class="p"&gt;});&lt;/span&gt;     &lt;span class="o"&gt;&amp;lt;-----&lt;/span&gt;
        &lt;span class="c1"&gt;// ... code omitted&lt;/span&gt;
    &lt;span class="p"&gt;}&lt;/span&gt;
    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;remaining&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/pre&gt;&lt;/div&gt;


&lt;p&gt;Third, I found that the system &lt;code&gt;wc&lt;/code&gt; has a trick up its sleeve when doing line counting. Instead of processing each character and checking whether it's a &lt;code&gt;\n&lt;/code&gt;, it uses the &lt;a href="https://en.cppreference.com/w/c/string/byte/memchr"&gt;&lt;code&gt;memchr()&lt;/code&gt; family of functions&lt;/a&gt;. &lt;code&gt;memchr()&lt;/code&gt; takes a buffer of characters and finds the first occurrence of a given character, in this case &lt;code&gt;\n&lt;/code&gt;, and is much faster than doing the same in a C loop.&lt;/p&gt;
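&lt;p&gt;As a minimal sketch of the idea (my own illustration, not the actual system &lt;code&gt;wc&lt;/code&gt; source), counting newlines with &lt;code&gt;memchr()&lt;/code&gt; looks like this:&lt;/p&gt;

```cpp
#include <cstring>

// Count '\n' occurrences in a buffer by repeatedly calling memchr(),
// which scans with optimized library code instead of a byte-by-byte loop.
long count_newlines(const char* buf, unsigned long n) {
    long lines = 0;
    const char* cur = buf;
    const char* end = buf + n;
    while (cur < end) {
        const void* hit = memchr(cur, '\n', end - cur);
        if (hit == nullptr)
            break;                                   // no more newlines
        ++lines;
        cur = static_cast<const char*>(hit) + 1;     // resume after the hit
    }
    return lines;
}
```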
&lt;h2&gt;Simpler classes and code&lt;/h2&gt;
&lt;p&gt;To keep this implementation simple I used a very simple base &lt;code&gt;counter&lt;/code&gt; class, and derived all other implementation classes from this:&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;span class="k"&gt;class&lt;/span&gt; &lt;span class="nc"&gt;counter_t&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="k"&gt;public&lt;/span&gt;&lt;span class="o"&gt;:&lt;/span&gt;
    &lt;span class="kt"&gt;void&lt;/span&gt; &lt;span class="n"&gt;reset&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="n"&gt;count&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt; &lt;span class="n"&gt;active&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nb"&gt;true&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt; &lt;span class="p"&gt;}&lt;/span&gt;
    &lt;span class="kt"&gt;bool&lt;/span&gt; &lt;span class="n"&gt;active&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nb"&gt;false&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
    &lt;span class="kt"&gt;long&lt;/span&gt; &lt;span class="n"&gt;count&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;span class="p"&gt;};&lt;/span&gt;
&lt;/pre&gt;&lt;/div&gt;


&lt;p&gt;For example, the class counting bytes is now as simple as:&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;span class="n"&gt;lass&lt;/span&gt; &lt;span class="nl"&gt;byte_counter_t&lt;/span&gt; &lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="k"&gt;public&lt;/span&gt; &lt;span class="n"&gt;counter_t&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="k"&gt;public&lt;/span&gt;&lt;span class="o"&gt;:&lt;/span&gt;
    &lt;span class="kt"&gt;void&lt;/span&gt; &lt;span class="n"&gt;process_file&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="k"&gt;const&lt;/span&gt; &lt;span class="n"&gt;std&lt;/span&gt;&lt;span class="o"&gt;::&lt;/span&gt;&lt;span class="n"&gt;string&lt;/span&gt; &lt;span class="n"&gt;fname&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="k"&gt;struct&lt;/span&gt; &lt;span class="n"&gt;stat&lt;/span&gt; &lt;span class="n"&gt;st&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
        &lt;span class="n"&gt;stat&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;fname&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;c_str&lt;/span&gt;&lt;span class="p"&gt;(),&lt;/span&gt; &lt;span class="o"&gt;&amp;amp;&lt;/span&gt;&lt;span class="n"&gt;st&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
        &lt;span class="n"&gt;count&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;st&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;st_size&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
    &lt;span class="p"&gt;}&lt;/span&gt;
    &lt;span class="kt"&gt;void&lt;/span&gt; &lt;span class="n"&gt;process_wchar&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="k"&gt;const&lt;/span&gt; &lt;span class="kt"&gt;wchar_t&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="kt"&gt;unsigned&lt;/span&gt; &lt;span class="n"&gt;num_bytes&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="kt"&gt;bool&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="n"&gt;count&lt;/span&gt; &lt;span class="o"&gt;+=&lt;/span&gt; &lt;span class="n"&gt;num_bytes&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
    &lt;span class="p"&gt;}&lt;/span&gt;
&lt;span class="p"&gt;};&lt;/span&gt;
&lt;/pre&gt;&lt;/div&gt;


&lt;p&gt;Since I realized extensibility is not required here, the class itself stores whether the count should happen or not. The &lt;code&gt;active&lt;/code&gt; flag is flipped based on the command-line parameters, and checked in the internal loops. This means that the dynamic &lt;code&gt;for&lt;/code&gt; loop is now replaced with compiled &lt;code&gt;if&lt;/code&gt;s:&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;span class="kt"&gt;void&lt;/span&gt; &lt;span class="nf"&gt;process_wchar&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="k"&gt;const&lt;/span&gt; &lt;span class="kt"&gt;wchar_t&lt;/span&gt; &lt;span class="n"&gt;wc&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="kt"&gt;unsigned&lt;/span&gt; &lt;span class="n"&gt;num_bytes&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="kt"&gt;bool&lt;/span&gt; &lt;span class="n"&gt;error&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;byte_counter&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;active&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
        &lt;span class="n"&gt;byte_counter&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;process_wchar&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;wc&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;num_bytes&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;error&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
    &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;char_counter&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;active&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
        &lt;span class="n"&gt;char_counter&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;process_wchar&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;wc&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;num_bytes&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;error&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
    &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;line_counter&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;active&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
        &lt;span class="n"&gt;line_counter&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;process_wchar&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;wc&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;num_bytes&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;error&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
    &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;longest_line_counter&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;active&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
        &lt;span class="n"&gt;longest_line_counter&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;process_wchar&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;wc&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;num_bytes&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;error&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
    &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;word_counter&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;active&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
        &lt;span class="n"&gt;word_counter&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;process_wchar&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;wc&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;num_bytes&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;error&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/pre&gt;&lt;/div&gt;


&lt;p&gt;As I will show shortly, this yields much faster code. Obviously this is repeated code that goes against the DRY principle, but here it's permissible.&lt;/p&gt;
&lt;p&gt;Lastly, the use of &lt;code&gt;memchr()&lt;/code&gt; in &lt;code&gt;line_counter&lt;/code&gt;:&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;span class="kt"&gt;unsigned&lt;/span&gt; &lt;span class="nf"&gt;process_block_lines_only&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
    &lt;span class="k"&gt;const&lt;/span&gt; &lt;span class="kt"&gt;char&lt;/span&gt;&lt;span class="o"&gt;*&lt;/span&gt; &lt;span class="n"&gt;cp&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="kt"&gt;unsigned&lt;/span&gt; &lt;span class="n"&gt;n&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="kt"&gt;bool&lt;/span&gt; &lt;span class="n"&gt;eof&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="kt"&gt;void&lt;/span&gt;&lt;span class="o"&gt;*&lt;/span&gt; &lt;span class="n"&gt;cur&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;static_cast&lt;/span&gt;&lt;span class="o"&gt;&amp;lt;&lt;/span&gt;&lt;span class="kt"&gt;void&lt;/span&gt;&lt;span class="o"&gt;*&amp;gt;&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="k"&gt;const_cast&lt;/span&gt;&lt;span class="o"&gt;&amp;lt;&lt;/span&gt;&lt;span class="kt"&gt;char&lt;/span&gt;&lt;span class="o"&gt;*&amp;gt;&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;cp&lt;/span&gt;&lt;span class="p"&gt;));&lt;/span&gt;
    &lt;span class="k"&gt;while&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;cur&lt;/span&gt; &lt;span class="o"&gt;!=&lt;/span&gt; &lt;span class="k"&gt;nullptr&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="k"&gt;auto&lt;/span&gt; &lt;span class="n"&gt;remaining&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;n&lt;/span&gt; &lt;span class="o"&gt;-&lt;/span&gt; &lt;span class="p"&gt;((&lt;/span&gt;&lt;span class="kt"&gt;char&lt;/span&gt;&lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;&lt;span class="n"&gt;cur&lt;/span&gt; &lt;span class="o"&gt;-&lt;/span&gt; &lt;span class="n"&gt;cp&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
        &lt;span class="n"&gt;cur&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;memchr&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="k"&gt;static_cast&lt;/span&gt;&lt;span class="o"&gt;&amp;lt;&lt;/span&gt;&lt;span class="kt"&gt;void&lt;/span&gt;&lt;span class="o"&gt;*&amp;gt;&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;cur&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt; &lt;span class="sc"&gt;&amp;#39;\n&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;remaining&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
        &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;cur&lt;/span&gt; &lt;span class="o"&gt;!=&lt;/span&gt; &lt;span class="k"&gt;nullptr&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
            &lt;span class="n"&gt;line_counter&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;count&lt;/span&gt;&lt;span class="o"&gt;++&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
            &lt;span class="n"&gt;cur&lt;/span&gt;&lt;span class="o"&gt;++&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
        &lt;span class="p"&gt;}&lt;/span&gt;
    &lt;span class="p"&gt;}&lt;/span&gt;
    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/pre&gt;&lt;/div&gt;


&lt;h2&gt;Headers&lt;/h2&gt;
&lt;p&gt;To further clean up the code, I factored the command-line parsing and tabulation code into separate headers, &lt;code&gt;clargs.hpp&lt;/code&gt; and &lt;code&gt;tabular.hpp&lt;/code&gt;. &lt;a href="https://github.com/mtrencseni/wcpp"&gt;The complete code is up on Github&lt;/a&gt;.&lt;/p&gt;
&lt;h2&gt;Performance&lt;/h2&gt;
&lt;p&gt;With the above changes and optimizations, the code now runs about 10% faster than my system &lt;code&gt;wc&lt;/code&gt; when running all 5 counters in file mode:&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;$ &lt;span class="nb"&gt;time&lt;/span&gt; wc -c -m -l -w -L playground/MNIST_data/t10k-images-idx3-ubyte.gz
   &lt;span class="m"&gt;6121&lt;/span&gt;   &lt;span class="m"&gt;34669&lt;/span&gt;  &lt;span class="m"&gt;874589&lt;/span&gt; &lt;span class="m"&gt;1648877&lt;/span&gt;     &lt;span class="m"&gt;402&lt;/span&gt; playground/MNIST_data/t10k-images-idx3-ubyte.gz

real    0m0.101s
user    0m0.100s
sys     0m0.000s

$ &lt;span class="nb"&gt;time&lt;/span&gt; ./wcpp_v3 -c -m -l -w -L playground/MNIST_data/t10k-images-idx3-ubyte.gz
    &lt;span class="m"&gt;6121&lt;/span&gt;   &lt;span class="m"&gt;34669&lt;/span&gt;  &lt;span class="m"&gt;874589&lt;/span&gt; &lt;span class="m"&gt;1648877&lt;/span&gt;     &lt;span class="m"&gt;402&lt;/span&gt; playground/MNIST_data/t10k-images-idx3-ubyte.gz

real    0m0.089s
user    0m0.088s
sys     0m0.000s
&lt;/pre&gt;&lt;/div&gt;


&lt;p&gt;Results are similar when reading from &lt;code&gt;stdin&lt;/code&gt;:&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;$ &lt;span class="nb"&gt;time&lt;/span&gt; cat playground/MNIST_data/t10k-images-idx3-ubyte.gz &lt;span class="p"&gt;|&lt;/span&gt; wc -c -m -l -w -L
   &lt;span class="m"&gt;6121&lt;/span&gt;   &lt;span class="m"&gt;34669&lt;/span&gt;  &lt;span class="m"&gt;874589&lt;/span&gt; &lt;span class="m"&gt;1648877&lt;/span&gt;     &lt;span class="m"&gt;402&lt;/span&gt;

real    0m0.101s
user    0m0.104s
sys     0m0.000s

$ &lt;span class="nb"&gt;time&lt;/span&gt; cat playground/MNIST_data/t10k-images-idx3-ubyte.gz &lt;span class="p"&gt;|&lt;/span&gt; ./wcpp_v3 -c -m -l -w -L
    &lt;span class="m"&gt;6121&lt;/span&gt;   &lt;span class="m"&gt;34669&lt;/span&gt;  &lt;span class="m"&gt;874589&lt;/span&gt; &lt;span class="m"&gt;1648877&lt;/span&gt;     &lt;span class="m"&gt;402&lt;/span&gt;

real    0m0.091s
user    0m0.080s
sys     0m0.008s
&lt;/pre&gt;&lt;/div&gt;


&lt;h2&gt;Conclusion&lt;/h2&gt;
&lt;p&gt;This was a fun toy exercise so far. One final optimization possibility would be to use multi-threading when processing multiple files.&lt;/p&gt;</content><category term="c++"></category><category term="wc"></category></entry><entry><title>Wordcount II: Introducing a cleaner C++ class hierarchy to `wc`</title><link href="https://bytepawn.com/making-the-wc-state-machines-generic-cpp.html" rel="alternate"></link><published>2023-09-23T00:00:00+02:00</published><updated>2023-09-23T00:00:00+02:00</updated><author><name>Marton Trencseni</name></author><id>tag:bytepawn.com,2023-09-23:/making-the-wc-state-machines-generic-cpp.html</id><summary type="html">&lt;p&gt;In this follow-up to the previous article about writing a C++ version of the Unix command-line utility &lt;code&gt;wc&lt;/code&gt;, I make the class structure more complicated to keep separate concerns and functionality in different C++ classes. The result ends up being significantly more complex than the original, but does not make the overall program easier to understand or modify.&lt;br/&gt;&lt;br/&gt; &lt;img src="/images/wc2.png" alt="wc" style="width: 400px;"/&gt;&lt;/p&gt;</summary><content type="html">&lt;h2&gt;Introduction&lt;/h2&gt;
&lt;p&gt;In the &lt;a href="https://bytepawn.com/implementing-the-unix-command-line-tool-wc-in-modern-cpp.html"&gt;previous article&lt;/a&gt;, I wrote a modern C++ version of the Unix command-line utility &lt;code&gt;wc&lt;/code&gt;. The basic architecture of the program was a collection of state machines, one for each type of count that the utility supports:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;bytes&lt;/li&gt;
&lt;li&gt;characters (Unicode or ASCII)&lt;/li&gt;
&lt;li&gt;words&lt;/li&gt;
&lt;li&gt;lines&lt;/li&gt;
&lt;li&gt;the longest line&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;Each counter was implemented as a class that derives from and implements &lt;code&gt;class counter&lt;/code&gt;:&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;span class="k"&gt;class&lt;/span&gt; &lt;span class="nc"&gt;counter&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="k"&gt;public&lt;/span&gt;&lt;span class="o"&gt;:&lt;/span&gt;
    &lt;span class="k"&gt;virtual&lt;/span&gt; &lt;span class="kt"&gt;bool&lt;/span&gt; &lt;span class="n"&gt;has_optimization&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt; &lt;span class="k"&gt;const&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="nb"&gt;false&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt; &lt;span class="p"&gt;}&lt;/span&gt;
    &lt;span class="k"&gt;virtual&lt;/span&gt; &lt;span class="kt"&gt;void&lt;/span&gt; &lt;span class="n"&gt;optimized_count&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="k"&gt;const&lt;/span&gt; &lt;span class="n"&gt;std&lt;/span&gt;&lt;span class="o"&gt;::&lt;/span&gt;&lt;span class="n"&gt;string&lt;/span&gt; &lt;span class="n"&gt;fname&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="p"&gt;}&lt;/span&gt;
    &lt;span class="k"&gt;virtual&lt;/span&gt; &lt;span class="kt"&gt;void&lt;/span&gt; &lt;span class="n"&gt;process_wchar&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="k"&gt;const&lt;/span&gt; &lt;span class="kt"&gt;wchar_t&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="kt"&gt;unsigned&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="kt"&gt;bool&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
    &lt;span class="kt"&gt;unsigned&lt;/span&gt; &lt;span class="kt"&gt;long&lt;/span&gt; &lt;span class="nf"&gt;get_count&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt; &lt;span class="k"&gt;const&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;count&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt; &lt;span class="p"&gt;}&lt;/span&gt;
    &lt;span class="kt"&gt;void&lt;/span&gt; &lt;span class="nf"&gt;add&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="kt"&gt;unsigned&lt;/span&gt; &lt;span class="kt"&gt;long&lt;/span&gt; &lt;span class="n"&gt;i&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="n"&gt;count&lt;/span&gt; &lt;span class="o"&gt;+=&lt;/span&gt; &lt;span class="n"&gt;i&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt; &lt;span class="p"&gt;}&lt;/span&gt;
    &lt;span class="kt"&gt;void&lt;/span&gt; &lt;span class="nf"&gt;clear&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="n"&gt;count&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt; &lt;span class="p"&gt;}&lt;/span&gt;
    &lt;span class="k"&gt;protected&lt;/span&gt;&lt;span class="o"&gt;:&lt;/span&gt;
    &lt;span class="kt"&gt;unsigned&lt;/span&gt; &lt;span class="kt"&gt;long&lt;/span&gt; &lt;span class="n"&gt;count&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;span class="p"&gt;};&lt;/span&gt;
&lt;/pre&gt;&lt;/div&gt;


&lt;p&gt;The core state machine logic is implemented in &lt;code&gt;process_wchar()&lt;/code&gt;. In the case of eg. byte counting this is trivial:&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;span class="kt"&gt;void&lt;/span&gt; &lt;span class="nf"&gt;process_wchar&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="k"&gt;const&lt;/span&gt; &lt;span class="kt"&gt;wchar_t&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="kt"&gt;unsigned&lt;/span&gt; &lt;span class="n"&gt;num_bytes&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="kt"&gt;bool&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="n"&gt;count&lt;/span&gt; &lt;span class="o"&gt;+=&lt;/span&gt; &lt;span class="n"&gt;num_bytes&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/pre&gt;&lt;/div&gt;


&lt;p&gt;In the case of words it's a real edge-triggered state machine:&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;span class="kt"&gt;void&lt;/span&gt; &lt;span class="nf"&gt;process_wchar&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="k"&gt;const&lt;/span&gt; &lt;span class="kt"&gt;wchar_t&lt;/span&gt; &lt;span class="n"&gt;wc&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="kt"&gt;unsigned&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="kt"&gt;bool&lt;/span&gt; &lt;span class="n"&gt;error&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;error&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
        &lt;span class="k"&gt;return&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
    &lt;span class="kt"&gt;bool&lt;/span&gt; &lt;span class="n"&gt;whitespace&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;is_word_sep&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;wc&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
    &lt;span class="kt"&gt;bool&lt;/span&gt; &lt;span class="n"&gt;printable&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;iswprint&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;wc&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
    &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;in_word&lt;/span&gt; &lt;span class="o"&gt;&amp;amp;&amp;amp;&lt;/span&gt; &lt;span class="n"&gt;whitespace&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="n"&gt;in_word&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nb"&gt;false&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
    &lt;span class="p"&gt;}&lt;/span&gt; &lt;span class="k"&gt;else&lt;/span&gt; &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="o"&gt;!&lt;/span&gt;&lt;span class="n"&gt;in_word&lt;/span&gt; &lt;span class="o"&gt;&amp;amp;&amp;amp;&lt;/span&gt; &lt;span class="n"&gt;printable&lt;/span&gt; &lt;span class="o"&gt;&amp;amp;&amp;amp;&lt;/span&gt; &lt;span class="o"&gt;!&lt;/span&gt;&lt;span class="n"&gt;whitespace&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="n"&gt;in_word&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nb"&gt;true&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
        &lt;span class="n"&gt;count&lt;/span&gt;&lt;span class="o"&gt;++&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
    &lt;span class="p"&gt;}&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/pre&gt;&lt;/div&gt;


&lt;p&gt;The source code for the &lt;em&gt;previous version&lt;/em&gt; is &lt;a href="https://github.com/mtrencseni/playground/blob/master/wcpp_v1.cpp"&gt;here&lt;/a&gt;.&lt;/p&gt;
&lt;h2&gt;Optimized counts&lt;/h2&gt;
&lt;p&gt;However, for some counts, there are more optimized ways to get the desired result. The simplest example, and the one I implemented in the previous post, is byte counting. If the user of &lt;code&gt;wc&lt;/code&gt; just wants to know the byte count by typing &lt;code&gt;wc -c&lt;/code&gt; or &lt;code&gt;wc --bytes&lt;/code&gt; and the input is a file (not &lt;code&gt;stdin&lt;/code&gt;), then we can just use &lt;code&gt;fstat()&lt;/code&gt; and ask the file system for the size in bytes, and return it — no need to actually open the file and count the bytes.&lt;/p&gt;
&lt;p&gt;This is what the &lt;code&gt;has_optimization()&lt;/code&gt; and &lt;code&gt;optimized_count()&lt;/code&gt; functions in &lt;code&gt;class counter&lt;/code&gt; are for. But, I was unhappy with this, since it mixes multiple things in one class:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;querying whether the counter supports optimized counting&lt;/li&gt;
&lt;li&gt;performing the file-based optimized count&lt;/li&gt;
&lt;li&gt;performing the character-wise state machine-based count&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;Also, even classes that don't support optimized counting will contain (empty) functions for this, derived from the base class.&lt;/p&gt;
&lt;h2&gt;Abstractions&lt;/h2&gt;
&lt;p&gt;To address this, I started to create separate classes to encapsulate separate concepts. In general there are counters, which can return an &lt;code&gt;unsigned long count&lt;/code&gt;; in our case, a counter is either a file system-based one (eg. &lt;code&gt;fstat()&lt;/code&gt; for byte counting) or a stream-based one (the state machines):&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;span class="k"&gt;class&lt;/span&gt; &lt;span class="nc"&gt;counter&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="k"&gt;public&lt;/span&gt;&lt;span class="o"&gt;:&lt;/span&gt;
    &lt;span class="kt"&gt;unsigned&lt;/span&gt; &lt;span class="kt"&gt;long&lt;/span&gt; &lt;span class="n"&gt;get_count&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt; &lt;span class="k"&gt;const&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;count&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt; &lt;span class="p"&gt;}&lt;/span&gt;
    &lt;span class="k"&gt;protected&lt;/span&gt;&lt;span class="o"&gt;:&lt;/span&gt;
    &lt;span class="kt"&gt;unsigned&lt;/span&gt; &lt;span class="kt"&gt;long&lt;/span&gt; &lt;span class="n"&gt;count&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;span class="p"&gt;};&lt;/span&gt;

&lt;span class="k"&gt;class&lt;/span&gt; &lt;span class="nc"&gt;filesystem_counter&lt;/span&gt; &lt;span class="o"&gt;:&lt;/span&gt; &lt;span class="k"&gt;public&lt;/span&gt; &lt;span class="n"&gt;counter&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="k"&gt;public&lt;/span&gt;&lt;span class="o"&gt;:&lt;/span&gt;
    &lt;span class="k"&gt;virtual&lt;/span&gt; &lt;span class="kt"&gt;void&lt;/span&gt; &lt;span class="n"&gt;process_file&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="k"&gt;const&lt;/span&gt; &lt;span class="n"&gt;std&lt;/span&gt;&lt;span class="o"&gt;::&lt;/span&gt;&lt;span class="n"&gt;string&lt;/span&gt; &lt;span class="n"&gt;fname&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;span class="p"&gt;};&lt;/span&gt;

&lt;span class="k"&gt;class&lt;/span&gt; &lt;span class="nc"&gt;stream_counter&lt;/span&gt; &lt;span class="o"&gt;:&lt;/span&gt; &lt;span class="k"&gt;public&lt;/span&gt; &lt;span class="n"&gt;counter&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="k"&gt;public&lt;/span&gt;&lt;span class="o"&gt;:&lt;/span&gt;
    &lt;span class="k"&gt;virtual&lt;/span&gt; &lt;span class="kt"&gt;void&lt;/span&gt; &lt;span class="n"&gt;process_wchar&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="k"&gt;const&lt;/span&gt; &lt;span class="kt"&gt;wchar_t&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="k"&gt;const&lt;/span&gt; &lt;span class="kt"&gt;unsigned&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="k"&gt;const&lt;/span&gt; &lt;span class="kt"&gt;bool&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;span class="p"&gt;};&lt;/span&gt;
&lt;/pre&gt;&lt;/div&gt;


&lt;p&gt;With this structure, the &lt;code&gt;fstat()&lt;/code&gt; case is nicely separated from other concerns:&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;span class="k"&gt;class&lt;/span&gt; &lt;span class="nc"&gt;byte_filesystem_counter&lt;/span&gt; &lt;span class="o"&gt;:&lt;/span&gt; &lt;span class="k"&gt;public&lt;/span&gt; &lt;span class="n"&gt;filesystem_counter&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="k"&gt;public&lt;/span&gt;&lt;span class="o"&gt;:&lt;/span&gt;
    &lt;span class="kt"&gt;void&lt;/span&gt; &lt;span class="n"&gt;process_file&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="k"&gt;const&lt;/span&gt; &lt;span class="n"&gt;std&lt;/span&gt;&lt;span class="o"&gt;::&lt;/span&gt;&lt;span class="n"&gt;string&lt;/span&gt; &lt;span class="n"&gt;fname&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="k"&gt;struct&lt;/span&gt; &lt;span class="n"&gt;stat&lt;/span&gt; &lt;span class="n"&gt;st&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
        &lt;span class="n"&gt;stat&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;fname&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;c_str&lt;/span&gt;&lt;span class="p"&gt;(),&lt;/span&gt; &lt;span class="o"&gt;&amp;amp;&lt;/span&gt;&lt;span class="n"&gt;st&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
        &lt;span class="n"&gt;count&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;st&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;st_size&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
    &lt;span class="p"&gt;}&lt;/span&gt;
&lt;span class="p"&gt;};&lt;/span&gt;
&lt;/pre&gt;&lt;/div&gt;


&lt;p&gt;Then there is the concept of deciding which kind of counting is supported for a given count, and constructing the supported counter (with sensible default implementations):&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;span class="k"&gt;class&lt;/span&gt; &lt;span class="nc"&gt;counter_policy&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="k"&gt;public&lt;/span&gt;&lt;span class="o"&gt;:&lt;/span&gt;
    &lt;span class="k"&gt;virtual&lt;/span&gt; &lt;span class="kt"&gt;bool&lt;/span&gt; &lt;span class="n"&gt;supports_file_based&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt; &lt;span class="k"&gt;const&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="nb"&gt;false&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
    &lt;span class="p"&gt;}&lt;/span&gt;
    &lt;span class="k"&gt;virtual&lt;/span&gt; &lt;span class="n"&gt;filesystem_counter_t&lt;/span&gt; &lt;span class="n"&gt;new_filesystem_counter&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt; &lt;span class="k"&gt;const&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="k"&gt;nullptr&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
    &lt;span class="p"&gt;}&lt;/span&gt;
    &lt;span class="k"&gt;virtual&lt;/span&gt; &lt;span class="n"&gt;stream_counter_t&lt;/span&gt; &lt;span class="n"&gt;new_stream_counter&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt; &lt;span class="k"&gt;const&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;span class="p"&gt;};&lt;/span&gt;
&lt;/pre&gt;&lt;/div&gt;
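&lt;p&gt;The &lt;code&gt;filesystem_counter_t&lt;/code&gt; and &lt;code&gt;stream_counter_t&lt;/code&gt; types are not defined in the listings here; judging by the &lt;code&gt;std::make_unique&lt;/code&gt; calls further down, they are presumably owning smart pointer aliases. A minimal sketch of that assumption (with stand-in classes, not the actual sources):&lt;/p&gt;

```cpp
#include <cassert>
#include <memory>

// Minimal stand-ins for the counter classes defined earlier in the article.
class filesystem_counter { public: virtual ~filesystem_counter() = default; };
class stream_counter { public: virtual ~stream_counter() = default; };

// Presumed aliases: counters are handed around as owning smart pointers,
// which matches the std::make_unique<...>() calls in the policy classes.
using filesystem_counter_t = std::unique_ptr<filesystem_counter>;
using stream_counter_t = std::unique_ptr<stream_counter>;
```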


&lt;p&gt;To implement byte counting, in addition to &lt;code&gt;class byte_filesystem_counter&lt;/code&gt; shown above, we need two more classes:&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;span class="k"&gt;class&lt;/span&gt; &lt;span class="nc"&gt;byte_stream_counter&lt;/span&gt; &lt;span class="o"&gt;:&lt;/span&gt; &lt;span class="k"&gt;public&lt;/span&gt; &lt;span class="n"&gt;stream_counter&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="k"&gt;public&lt;/span&gt;&lt;span class="o"&gt;:&lt;/span&gt;
    &lt;span class="kt"&gt;void&lt;/span&gt; &lt;span class="n"&gt;process_wchar&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="k"&gt;const&lt;/span&gt; &lt;span class="kt"&gt;wchar_t&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="k"&gt;const&lt;/span&gt; &lt;span class="kt"&gt;unsigned&lt;/span&gt; &lt;span class="n"&gt;num_bytes&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="k"&gt;const&lt;/span&gt; &lt;span class="kt"&gt;bool&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="n"&gt;count&lt;/span&gt; &lt;span class="o"&gt;+=&lt;/span&gt; &lt;span class="n"&gt;num_bytes&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
    &lt;span class="p"&gt;}&lt;/span&gt;
&lt;span class="p"&gt;};&lt;/span&gt;

&lt;span class="k"&gt;class&lt;/span&gt; &lt;span class="nc"&gt;byte_counter_policy&lt;/span&gt; &lt;span class="o"&gt;:&lt;/span&gt; &lt;span class="k"&gt;public&lt;/span&gt; &lt;span class="n"&gt;counter_policy&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="k"&gt;public&lt;/span&gt;&lt;span class="o"&gt;:&lt;/span&gt;
    &lt;span class="k"&gt;virtual&lt;/span&gt; &lt;span class="kt"&gt;bool&lt;/span&gt; &lt;span class="n"&gt;supports_file_based&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt; &lt;span class="k"&gt;const&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="nb"&gt;true&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
    &lt;span class="p"&gt;}&lt;/span&gt;
    &lt;span class="k"&gt;virtual&lt;/span&gt; &lt;span class="n"&gt;filesystem_counter_t&lt;/span&gt; &lt;span class="n"&gt;new_filesystem_counter&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt; &lt;span class="k"&gt;const&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;std&lt;/span&gt;&lt;span class="o"&gt;::&lt;/span&gt;&lt;span class="n"&gt;make_unique&lt;/span&gt;&lt;span class="o"&gt;&amp;lt;&lt;/span&gt;&lt;span class="n"&gt;byte_filesystem_counter&lt;/span&gt;&lt;span class="o"&gt;&amp;gt;&lt;/span&gt;&lt;span class="p"&gt;();&lt;/span&gt;
    &lt;span class="p"&gt;}&lt;/span&gt;
    &lt;span class="k"&gt;virtual&lt;/span&gt; &lt;span class="n"&gt;stream_counter_t&lt;/span&gt; &lt;span class="n"&gt;new_stream_counter&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt; &lt;span class="k"&gt;const&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;std&lt;/span&gt;&lt;span class="o"&gt;::&lt;/span&gt;&lt;span class="n"&gt;make_unique&lt;/span&gt;&lt;span class="o"&gt;&amp;lt;&lt;/span&gt;&lt;span class="n"&gt;byte_stream_counter&lt;/span&gt;&lt;span class="o"&gt;&amp;gt;&lt;/span&gt;&lt;span class="p"&gt;();&lt;/span&gt;
    &lt;span class="p"&gt;}&lt;/span&gt;
&lt;span class="p"&gt;};&lt;/span&gt;
&lt;/pre&gt;&lt;/div&gt;


&lt;p&gt;This is quite clean, but it has drawbacks. After parsing the command-line arguments, I construct an &lt;code&gt;std::vector&amp;lt;counter_policy&amp;gt;&lt;/code&gt; corresponding to what needs to be counted, and eventually an &lt;code&gt;std::vector&lt;/code&gt; of counters. This is where the problems start: an &lt;code&gt;std::vector&amp;lt;derived_class&amp;gt;&lt;/code&gt; cannot be used as an &lt;code&gt;std::vector&amp;lt;base_class&amp;gt;&lt;/code&gt;, because even though every instance of &lt;code&gt;derived_class&lt;/code&gt; is also a &lt;code&gt;base_class&lt;/code&gt;, the same is not true for container types such as &lt;code&gt;std::vector&lt;/code&gt;. So, if I create a vector of &lt;code&gt;counter&lt;/code&gt; and pass it to a function which runs the stream-based state machine logic, that function needs to &lt;code&gt;static_cast&amp;lt;&amp;gt;&lt;/code&gt; each element to &lt;code&gt;stream_counter&lt;/code&gt; as it iterates through the vector, which is ugly and not type safe. It also means I cannot mix different types of counters in the vector, since I would not know how to cast each element (was it a stream-based or a file-based one?). Alternatively, I can decide up-front, based on the command-line arguments, to instantiate either all stream-based or all file-based counters: in this case there are no ugly casts, but a different problem appears. The functions that display the results as a table do not care how the counts were produced, only about the numbers; they just want to call &lt;code&gt;counter::get_count()&lt;/code&gt; on an &lt;code&gt;std::vector&amp;lt;counter&amp;gt;&lt;/code&gt;, irrespective of whether each count was stream-based or file-based.&lt;/p&gt;
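&lt;p&gt;To illustrate the container covariance limitation described above, here is a minimal, self-contained sketch (with hypothetical class names, not the actual sources): a vector of derived smart pointers does not convert to a vector of base smart pointers, so the elements have to be moved over one by one.&lt;/p&gt;

```cpp
#include <cassert>
#include <memory>
#include <vector>

// Hypothetical stand-ins for the article's counter classes.
struct counter_base {
    virtual ~counter_base() = default;
    unsigned long get_count() const { return count; }
protected:
    unsigned long count = 0;
};

struct byte_counter : counter_base {
    void add_bytes(unsigned long n) { count += n; }
};

// std::vector<std::unique_ptr<byte_counter>> is a distinct type from
// std::vector<std::unique_ptr<counter_base>>; this would NOT compile:
//     std::vector<std::unique_ptr<counter_base>>& v = ds;
// Instead, each element has to be upcast (here: moved) individually:
std::vector<std::unique_ptr<counter_base>>
to_base(std::vector<std::unique_ptr<byte_counter>>& derived) {
    std::vector<std::unique_ptr<counter_base>> base;
    for (auto& d : derived)
        base.push_back(std::move(d)); // unique_ptr<derived> -> unique_ptr<base>
    return base;
}
```

Note that after the element-wise move the source vector holds null pointers, which is one more reason to extract plain counts rather than convert the counters themselves.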
&lt;p&gt;In the end, I decided that instead of passing &lt;code&gt;counter&lt;/code&gt;s to the display functions, it is easier to extract the counts into an &lt;code&gt;std::vector&amp;lt;unsigned long&amp;gt;&lt;/code&gt; and pass that. I still need two different extraction functions, one per counter type, but that can be handled reasonably with a template:&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;span class="k"&gt;template&lt;/span&gt; &lt;span class="o"&gt;&amp;lt;&lt;/span&gt;&lt;span class="k"&gt;typename&lt;/span&gt; &lt;span class="n"&gt;T&lt;/span&gt;&lt;span class="o"&gt;&amp;gt;&lt;/span&gt;
&lt;span class="n"&gt;count_vector_t&lt;/span&gt; &lt;span class="n"&gt;to_counts&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="k"&gt;const&lt;/span&gt; &lt;span class="n"&gt;std&lt;/span&gt;&lt;span class="o"&gt;::&lt;/span&gt;&lt;span class="n"&gt;vector&lt;/span&gt;&lt;span class="o"&gt;&amp;lt;&lt;/span&gt;&lt;span class="n"&gt;T&lt;/span&gt;&lt;span class="o"&gt;&amp;gt;&amp;amp;&lt;/span&gt; &lt;span class="n"&gt;counters&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="n"&gt;count_vector_t&lt;/span&gt; &lt;span class="n"&gt;counts&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
    &lt;span class="n"&gt;std&lt;/span&gt;&lt;span class="o"&gt;::&lt;/span&gt;&lt;span class="n"&gt;for_each&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;counters&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;begin&lt;/span&gt;&lt;span class="p"&gt;(),&lt;/span&gt; &lt;span class="n"&gt;counters&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;end&lt;/span&gt;&lt;span class="p"&gt;(),&lt;/span&gt;
            &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="o"&gt;&amp;amp;&lt;/span&gt;&lt;span class="p"&gt;](&lt;/span&gt;&lt;span class="k"&gt;auto&lt;/span&gt;&lt;span class="o"&gt;&amp;amp;&lt;/span&gt; &lt;span class="n"&gt;c&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="n"&gt;counts&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;push_back&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;c&lt;/span&gt;&lt;span class="o"&gt;-&amp;gt;&lt;/span&gt;&lt;span class="n"&gt;get_count&lt;/span&gt;&lt;span class="p"&gt;());&lt;/span&gt; &lt;span class="p"&gt;});&lt;/span&gt;
    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;counts&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/pre&gt;&lt;/div&gt;


&lt;p&gt;So, for example, the function which handles counting things on &lt;code&gt;stdin&lt;/code&gt; looks like this:&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;span class="n"&gt;table_t&lt;/span&gt; &lt;span class="nf"&gt;process_stdin&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="k"&gt;const&lt;/span&gt; &lt;span class="n"&gt;cl_args_t&lt;/span&gt;&lt;span class="o"&gt;&amp;amp;&lt;/span&gt; &lt;span class="n"&gt;cl_args&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="n"&gt;table_t&lt;/span&gt; &lt;span class="n"&gt;table&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
    &lt;span class="k"&gt;auto&lt;/span&gt; &lt;span class="n"&gt;policies&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;policies_from_arguments&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;cl_args&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
    &lt;span class="n"&gt;stream_counter_vector_t&lt;/span&gt; &lt;span class="n"&gt;counters&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
    &lt;span class="n"&gt;stream_counters_from_policies&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;policies&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;counters&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
    &lt;span class="n"&gt;process_stream&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="o"&gt;&amp;amp;&lt;/span&gt;&lt;span class="n"&gt;std&lt;/span&gt;&lt;span class="o"&gt;::&lt;/span&gt;&lt;span class="n"&gt;cin&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;counters&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
    &lt;span class="k"&gt;auto&lt;/span&gt; &lt;span class="n"&gt;counts&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;to_counts&lt;/span&gt;&lt;span class="o"&gt;&amp;lt;&lt;/span&gt;&lt;span class="n"&gt;stream_counter_t&lt;/span&gt;&lt;span class="o"&gt;&amp;gt;&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;counters&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
    &lt;span class="n"&gt;table&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;push_back&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;tabulate&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;counts&lt;/span&gt;&lt;span class="p"&gt;));&lt;/span&gt;
    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;table&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/pre&gt;&lt;/div&gt;


&lt;p&gt;The function counting things per file has two branches: one for when everything can be counted in a file-based way, and one for when the input has to be processed stream-wise, character by character:&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;span class="n"&gt;count_vector_t&lt;/span&gt; &lt;span class="nf"&gt;process_file&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="k"&gt;const&lt;/span&gt; &lt;span class="n"&gt;std&lt;/span&gt;&lt;span class="o"&gt;::&lt;/span&gt;&lt;span class="n"&gt;string&lt;/span&gt; &lt;span class="n"&gt;fname&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="k"&gt;const&lt;/span&gt; &lt;span class="n"&gt;cl_args_t&lt;/span&gt;&lt;span class="o"&gt;&amp;amp;&lt;/span&gt; &lt;span class="n"&gt;cl_args&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="n"&gt;count_vector_t&lt;/span&gt; &lt;span class="n"&gt;counts&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
    &lt;span class="k"&gt;auto&lt;/span&gt; &lt;span class="n"&gt;policies&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;policies_from_arguments&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;cl_args&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
    &lt;span class="k"&gt;auto&lt;/span&gt; &lt;span class="n"&gt;all_sfb&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;std&lt;/span&gt;&lt;span class="o"&gt;::&lt;/span&gt;&lt;span class="n"&gt;all_of&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
        &lt;span class="n"&gt;policies&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;begin&lt;/span&gt;&lt;span class="p"&gt;(),&lt;/span&gt;
        &lt;span class="n"&gt;policies&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;end&lt;/span&gt;&lt;span class="p"&gt;(),&lt;/span&gt;
        &lt;span class="p"&gt;[](&lt;/span&gt;&lt;span class="k"&gt;auto&lt;/span&gt;&lt;span class="o"&gt;&amp;amp;&lt;/span&gt; &lt;span class="n"&gt;policy&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;policy&lt;/span&gt;&lt;span class="o"&gt;-&amp;gt;&lt;/span&gt;&lt;span class="n"&gt;supports_file_based&lt;/span&gt;&lt;span class="p"&gt;();&lt;/span&gt; &lt;span class="p"&gt;}&lt;/span&gt;
        &lt;span class="p"&gt;);&lt;/span&gt;
    &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;all_sfb&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="n"&gt;filesystem_counter_vector_t&lt;/span&gt; &lt;span class="n"&gt;counters&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
        &lt;span class="n"&gt;filesystem_counters_from_policies&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;policies&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;counters&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
        &lt;span class="n"&gt;std&lt;/span&gt;&lt;span class="o"&gt;::&lt;/span&gt;&lt;span class="n"&gt;for_each&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;counters&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;begin&lt;/span&gt;&lt;span class="p"&gt;(),&lt;/span&gt; &lt;span class="n"&gt;counters&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;end&lt;/span&gt;&lt;span class="p"&gt;(),&lt;/span&gt;
            &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="o"&gt;&amp;amp;&lt;/span&gt;&lt;span class="p"&gt;](&lt;/span&gt;&lt;span class="k"&gt;auto&lt;/span&gt;&lt;span class="o"&gt;&amp;amp;&lt;/span&gt; &lt;span class="n"&gt;counter&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
                &lt;span class="n"&gt;counter&lt;/span&gt;&lt;span class="o"&gt;-&amp;gt;&lt;/span&gt;&lt;span class="n"&gt;process_file&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;fname&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
            &lt;span class="p"&gt;});&lt;/span&gt;
        &lt;span class="n"&gt;counts&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;to_counts&lt;/span&gt;&lt;span class="o"&gt;&amp;lt;&lt;/span&gt;&lt;span class="n"&gt;filesystem_counter_t&lt;/span&gt;&lt;span class="o"&gt;&amp;gt;&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;counters&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
    &lt;span class="p"&gt;}&lt;/span&gt; &lt;span class="k"&gt;else&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="n"&gt;stream_counter_vector_t&lt;/span&gt; &lt;span class="n"&gt;counters&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
        &lt;span class="n"&gt;stream_counters_from_policies&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;policies&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;counters&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
        &lt;span class="n"&gt;std&lt;/span&gt;&lt;span class="o"&gt;::&lt;/span&gt;&lt;span class="n"&gt;ifstream&lt;/span&gt; &lt;span class="n"&gt;f&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;fname&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
        &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;f&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;is_open&lt;/span&gt;&lt;span class="p"&gt;())&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
            &lt;span class="n"&gt;process_stream&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="o"&gt;&amp;amp;&lt;/span&gt;&lt;span class="n"&gt;f&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;counters&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
        &lt;span class="p"&gt;}&lt;/span&gt; &lt;span class="k"&gt;else&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
            &lt;span class="n"&gt;std&lt;/span&gt;&lt;span class="o"&gt;::&lt;/span&gt;&lt;span class="n"&gt;cerr&lt;/span&gt; &lt;span class="o"&gt;&amp;lt;&amp;lt;&lt;/span&gt; &lt;span class="s"&gt;&amp;quot;cannot read from file: &amp;quot;&lt;/span&gt; &lt;span class="o"&gt;&amp;lt;&amp;lt;&lt;/span&gt; &lt;span class="n"&gt;fname&lt;/span&gt; &lt;span class="o"&gt;&amp;lt;&amp;lt;&lt;/span&gt; &lt;span class="n"&gt;std&lt;/span&gt;&lt;span class="o"&gt;::&lt;/span&gt;&lt;span class="n"&gt;endl&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
        &lt;span class="p"&gt;}&lt;/span&gt;
        &lt;span class="n"&gt;counts&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;to_counts&lt;/span&gt;&lt;span class="o"&gt;&amp;lt;&lt;/span&gt;&lt;span class="n"&gt;stream_counter_t&lt;/span&gt;&lt;span class="o"&gt;&amp;gt;&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;counters&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
    &lt;span class="p"&gt;}&lt;/span&gt;
    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;counts&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/pre&gt;&lt;/div&gt;


&lt;p&gt;This second version of &lt;code&gt;wc&lt;/code&gt;, with more abstracted classes, is &lt;a href="https://github.com/mtrencseni/playground/blob/master/wcpp_v2.cpp"&gt;here&lt;/a&gt;.&lt;/p&gt;
&lt;h2&gt;Complexity&lt;/h2&gt;
&lt;p&gt;Unfortunately, overall there is a hefty price to pay for these additional abstractions. The original version was 469 lines of code, this one is 567, so it's roughly 100 extra lines of code, or about 20% more. Worse, I don't think it's cleaner, clearer, or easier to modify (in terms of lines of code to be touched). This is a case of too much complexity. A telltale sign is the long list of forward declarations and &lt;code&gt;typedef&lt;/code&gt;s I introduced to keep things somewhat readable:&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;span class="k"&gt;class&lt;/span&gt; &lt;span class="nc"&gt;counter&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;span class="k"&gt;class&lt;/span&gt; &lt;span class="nc"&gt;counter_policy&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;span class="k"&gt;class&lt;/span&gt; &lt;span class="nc"&gt;stream_counter&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;span class="k"&gt;class&lt;/span&gt; &lt;span class="nc"&gt;filesystem_counter&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;

&lt;span class="k"&gt;typedef&lt;/span&gt; &lt;span class="n"&gt;std&lt;/span&gt;&lt;span class="o"&gt;::&lt;/span&gt;&lt;span class="n"&gt;unique_ptr&lt;/span&gt;&lt;span class="o"&gt;&amp;lt;&lt;/span&gt;&lt;span class="n"&gt;counter&lt;/span&gt;&lt;span class="o"&gt;&amp;gt;&lt;/span&gt; &lt;span class="n"&gt;counter_t&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;span class="k"&gt;typedef&lt;/span&gt; &lt;span class="n"&gt;std&lt;/span&gt;&lt;span class="o"&gt;::&lt;/span&gt;&lt;span class="n"&gt;unique_ptr&lt;/span&gt;&lt;span class="o"&gt;&amp;lt;&lt;/span&gt;&lt;span class="n"&gt;counter_policy&lt;/span&gt;&lt;span class="o"&gt;&amp;gt;&lt;/span&gt; &lt;span class="n"&gt;counter_policy_t&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;span class="k"&gt;typedef&lt;/span&gt; &lt;span class="n"&gt;std&lt;/span&gt;&lt;span class="o"&gt;::&lt;/span&gt;&lt;span class="n"&gt;unique_ptr&lt;/span&gt;&lt;span class="o"&gt;&amp;lt;&lt;/span&gt;&lt;span class="n"&gt;stream_counter&lt;/span&gt;&lt;span class="o"&gt;&amp;gt;&lt;/span&gt; &lt;span class="n"&gt;stream_counter_t&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;span class="k"&gt;typedef&lt;/span&gt; &lt;span class="n"&gt;std&lt;/span&gt;&lt;span class="o"&gt;::&lt;/span&gt;&lt;span class="n"&gt;unique_ptr&lt;/span&gt;&lt;span class="o"&gt;&amp;lt;&lt;/span&gt;&lt;span class="n"&gt;filesystem_counter&lt;/span&gt;&lt;span class="o"&gt;&amp;gt;&lt;/span&gt; &lt;span class="n"&gt;filesystem_counter_t&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;span class="k"&gt;typedef&lt;/span&gt; &lt;span class="n"&gt;std&lt;/span&gt;&lt;span class="o"&gt;::&lt;/span&gt;&lt;span class="n"&gt;vector&lt;/span&gt;&lt;span class="o"&gt;&amp;lt;&lt;/span&gt;&lt;span class="n"&gt;counter_policy_t&lt;/span&gt;&lt;span class="o"&gt;&amp;gt;&lt;/span&gt; &lt;span class="n"&gt;policy_vector_t&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;span class="k"&gt;typedef&lt;/span&gt; &lt;span class="n"&gt;std&lt;/span&gt;&lt;span class="o"&gt;::&lt;/span&gt;&lt;span class="n"&gt;vector&lt;/span&gt;&lt;span class="o"&gt;&amp;lt;&lt;/span&gt;&lt;span class="n"&gt;counter_t&lt;/span&gt;&lt;span class="o"&gt;&amp;gt;&lt;/span&gt; &lt;span class="n"&gt;counter_vector_t&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;span class="k"&gt;typedef&lt;/span&gt; &lt;span class="n"&gt;std&lt;/span&gt;&lt;span class="o"&gt;::&lt;/span&gt;&lt;span class="n"&gt;vector&lt;/span&gt;&lt;span class="o"&gt;&amp;lt;&lt;/span&gt;&lt;span class="n"&gt;stream_counter_t&lt;/span&gt;&lt;span class="o"&gt;&amp;gt;&lt;/span&gt; &lt;span class="n"&gt;stream_counter_vector_t&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;span class="k"&gt;typedef&lt;/span&gt; &lt;span class="n"&gt;std&lt;/span&gt;&lt;span class="o"&gt;::&lt;/span&gt;&lt;span class="n"&gt;vector&lt;/span&gt;&lt;span class="o"&gt;&amp;lt;&lt;/span&gt;&lt;span class="n"&gt;filesystem_counter_t&lt;/span&gt;&lt;span class="o"&gt;&amp;gt;&lt;/span&gt; &lt;span class="n"&gt;filesystem_counter_vector_t&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;span class="k"&gt;typedef&lt;/span&gt; &lt;span class="n"&gt;std&lt;/span&gt;&lt;span class="o"&gt;::&lt;/span&gt;&lt;span class="n"&gt;vector&lt;/span&gt;&lt;span class="o"&gt;&amp;lt;&lt;/span&gt;&lt;span class="kt"&gt;unsigned&lt;/span&gt; &lt;span class="kt"&gt;long&lt;/span&gt;&lt;span class="o"&gt;&amp;gt;&lt;/span&gt; &lt;span class="n"&gt;count_vector_t&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;

&lt;span class="c1"&gt;// .. actual class definitions start here&lt;/span&gt;
&lt;/pre&gt;&lt;/div&gt;


&lt;p&gt;The only "win" in all this was the realization that it's cleaner to just pass &lt;code&gt;std::vector&amp;lt;unsigned long&amp;gt;&lt;/code&gt;s to the display functions, so they can be completely independent of however I decide to implement counters.&lt;/p&gt;
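&lt;p&gt;That decoupling can be sketched as follows (the function and type names below are illustrative, not the actual ones from the repo): the display side takes only a vector of plain counts, so any counter implementation can feed it.&lt;/p&gt;

```cpp
#include <cassert>
#include <iomanip>
#include <ostream>
#include <sstream>
#include <string>
#include <vector>

typedef std::vector<unsigned long> count_vector_t;

// The display function depends only on plain unsigned longs, not on any
// counter class: whatever produces the counts can change freely.
void display_counts(std::ostream& os, const count_vector_t& counts,
                    const std::string& fname) {
    for (auto c : counts)
        os << std::setw(8) << c;  // right-align each count in 8 columns
    os << ' ' << fname << '\n';
}
```

&lt;p&gt;With this split, switching between stream-based and filesystem-based counters requires no changes on the display side.&lt;/p&gt;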
&lt;h2&gt;Performance&lt;/h2&gt;
&lt;p&gt;This version of &lt;code&gt;wc&lt;/code&gt;, compared to my previous version, has a different code structure, but identical runtime performance. At this point it started to bother me that despite all the time I invested, the performance, while not too far off from my system &lt;code&gt;wc&lt;/code&gt;, is still about 10% slower. So I did some profiling, and found the following: in the case that's relevant for benchmarking (the stream-based state machine branch), I lose time in the following &lt;code&gt;for_each&lt;/code&gt; loop (this is true for both my previous and the current implementation):&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;span class="kt"&gt;unsigned&lt;/span&gt; &lt;span class="nf"&gt;process_block&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
    &lt;span class="k"&gt;const&lt;/span&gt; &lt;span class="kt"&gt;char&lt;/span&gt;&lt;span class="o"&gt;*&lt;/span&gt; &lt;span class="n"&gt;cp&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="kt"&gt;unsigned&lt;/span&gt; &lt;span class="n"&gt;n&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;stream_counter_vector_t&lt;/span&gt;&lt;span class="o"&gt;&amp;amp;&lt;/span&gt; &lt;span class="n"&gt;counters&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="kt"&gt;bool&lt;/span&gt; &lt;span class="n"&gt;eof&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="kt"&gt;long&lt;/span&gt; &lt;span class="n"&gt;remaining&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;n&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
    &lt;span class="n"&gt;std&lt;/span&gt;&lt;span class="o"&gt;::&lt;/span&gt;&lt;span class="kt"&gt;mbstate_t&lt;/span&gt; &lt;span class="n"&gt;ps&lt;/span&gt;&lt;span class="p"&gt;{};&lt;/span&gt;
    &lt;span class="k"&gt;while&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;remaining&lt;/span&gt; &lt;span class="o"&gt;&amp;gt;&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="c1"&gt;// ... code omitted&lt;/span&gt;
        &lt;span class="n"&gt;std&lt;/span&gt;&lt;span class="o"&gt;::&lt;/span&gt;&lt;span class="n"&gt;for_each&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;counters&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;begin&lt;/span&gt;&lt;span class="p"&gt;(),&lt;/span&gt; &lt;span class="n"&gt;counters&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;end&lt;/span&gt;&lt;span class="p"&gt;(),&lt;/span&gt;                    &lt;span class="o"&gt;&amp;lt;-----&lt;/span&gt;
            &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="o"&gt;&amp;amp;&lt;/span&gt;&lt;span class="p"&gt;](&lt;/span&gt;&lt;span class="k"&gt;auto&lt;/span&gt;&lt;span class="o"&gt;&amp;amp;&lt;/span&gt; &lt;span class="n"&gt;c&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="n"&gt;c&lt;/span&gt;&lt;span class="o"&gt;-&amp;gt;&lt;/span&gt;&lt;span class="n"&gt;process_wchar&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;wc&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;num_bytes&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;error&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt; &lt;span class="p"&gt;});&lt;/span&gt;     &lt;span class="o"&gt;&amp;lt;-----&lt;/span&gt;
        &lt;span class="c1"&gt;// ... code omitted&lt;/span&gt;
    &lt;span class="p"&gt;}&lt;/span&gt;
    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;remaining&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/pre&gt;&lt;/div&gt;


&lt;p&gt;I tried multiple different ways of iterating; all had identical performance.&lt;/p&gt;
&lt;p&gt;This was quite a blow, since the whole point of implementing &lt;code&gt;wc&lt;/code&gt; in C++ is to abstract the state machines into their own classes, instantiate only the state machines we need based on the command-line arguments, and perform only those counts. This approach, irrespective of the abstractions introduced in this post, does yield significantly cleaner code than the original &lt;a href="https://github.com/coreutils/coreutils/blob/master/src/wc.c"&gt;C version in &lt;code&gt;coreutils&lt;/code&gt;&lt;/a&gt;, which is one giant &lt;code&gt;while&lt;/code&gt; loop with tens of variables over hundreds of lines of code, i.e. all the state machines are spilled into one another, resulting in ugly, hard-to-follow, not-obviously-correct code.&lt;/p&gt;
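&lt;p&gt;The overhead can be isolated in a standalone sketch (illustrative code, not the actual classes from my implementation): a per-character call through a base-class pointer is an indirect call that the compiler generally cannot inline, while the same call on a concrete type inlines into the loop.&lt;/p&gt;

```cpp
#include <cassert>
#include <memory>
#include <vector>

// Minimal stand-in for the counter hierarchy (hypothetical names):
// one virtual call per character.
struct counter_base {
    virtual void process_char(char c) = 0;
    virtual ~counter_base() = default;
};

struct line_counter : counter_base {
    unsigned long lines = 0;
    void process_char(char c) override { if (c == '\n') ++lines; }
};

// Dispatch through a base pointer: the indirect call is generally not
// inlined, so every character pays the virtual-call overhead.
unsigned long count_virtual(const std::vector<char>& buf) {
    std::unique_ptr<counter_base> c = std::make_unique<line_counter>();
    for (char ch : buf)
        c->process_char(ch);
    return static_cast<line_counter&>(*c).lines;
}

// Direct call on the concrete type: trivially inlined into the loop.
unsigned long count_direct(const std::vector<char>& buf) {
    line_counter c;
    for (char ch : buf)
        c.process_char(ch);
    return c.lines;
}
```

&lt;p&gt;Both functions compute the same counts; the difference only shows up in the generated code, in the per-character cost of the hot loop.&lt;/p&gt;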
&lt;h2&gt;Open-closed principle&lt;/h2&gt;
&lt;p&gt;After I wrote this version and decided in the end that it's over-engineered and the abstractions are not worth it, I realized that many otherwise good software engineering principles don't apply to &lt;code&gt;wc&lt;/code&gt;. For example, the &lt;strong&gt;S&lt;/strong&gt; and &lt;strong&gt;O&lt;/strong&gt; in &lt;strong&gt;SOLID&lt;/strong&gt; stand for (from &lt;a href="https://en.wikipedia.org/wiki/SOLID"&gt;Wikipedia&lt;/a&gt;):&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;The &lt;strong&gt;Single-responsibility principle&lt;/strong&gt;: There should never be more than one reason for a class to change. In other words, every class should have only one responsibility.&lt;/li&gt;
&lt;li&gt;The &lt;strong&gt;Open–closed principle&lt;/strong&gt;: Software entities should be open for extension, but closed for modification.&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;Both cases are about what happens when code &lt;em&gt;changes&lt;/em&gt; to incorporate features in the future. But &lt;code&gt;wc&lt;/code&gt; is not like that: &lt;code&gt;wc&lt;/code&gt; has been around with minimal changes since 1971, so more than 50 years! It's one of the few exceptions where you &lt;em&gt;don't&lt;/em&gt; have to anticipate future changes to the code.&lt;/p&gt;
&lt;h2&gt;Conclusion&lt;/h2&gt;
&lt;p&gt;In the next post, I will show a 3rd variation of my C++ &lt;code&gt;wc&lt;/code&gt; implementation, with a reasonable trade-off between C++ complexity, readability and run-time performance, based on the above considerations — this time with the explicit goal of beating my system &lt;code&gt;wc&lt;/code&gt; in performance tests.&lt;/p&gt;</content><category term="c++"></category><category term="wc"></category></entry><entry><title>Wordcount I: Implementing the Unix command-line tool `wc` in modern C++</title><link href="https://bytepawn.com/implementing-the-unix-command-line-tool-wc-in-modern-cpp.html" rel="alternate"></link><published>2023-09-10T00:00:00+02:00</published><updated>2023-09-10T00:00:00+02:00</updated><author><name>Marton Trencseni</name></author><id>tag:bytepawn.com,2023-09-10:/implementing-the-unix-command-line-tool-wc-in-modern-cpp.html</id><summary type="html">&lt;p&gt;After reading the excellent book Beautiful C++ about the language's latest features, I implement the Unix command-line tool &lt;code&gt;wc&lt;/code&gt; in modern C++.&lt;br/&gt;&lt;br/&gt; &lt;img src="https://m.media-amazon.com/images/I/51ByxdWleSL.jpg" alt="Beautiful C++" style="width: 400px;"/&gt;&lt;/p&gt;</summary><content type="html">&lt;h2&gt;Introduction&lt;/h2&gt;
&lt;p&gt;Like many programmers, I started programming on the Commodore 64, in BASIC, at age 6-7. A few years later, in elementary school, I graduated to Pascal, using the excellent &lt;a href="https://en.wikipedia.org/wiki/Turbo_Pascal"&gt;Turbo Pascal&lt;/a&gt; compiler. Later in High School I learned C and C++ at about the same time. I remember being 16 and going on a trip with my family to Yosemite Park, where I chose my then-favorite &lt;a href="https://www.amazon.com/Teach-Yourself-ANSI-Days-Sams/dp/0672308878"&gt;Teach Yourself ANSI C++ in 21 Days&lt;/a&gt; book over going hiking with my parents.&lt;/p&gt;
&lt;p&gt;&lt;img src="https://m.media-amazon.com/images/I/7166WSX9EPL._AC_UF1000,1000_QL80_.gif" alt="Teach Yourself ANSI C++ In 21 Days" style="width: 400px;"/&gt;&lt;/p&gt;
&lt;p&gt;I finished University in 2004 (Computer Science, Budapest University of Technology) and I got a job as a junior C++ programmer at the company &lt;a href="https://en.wikipedia.org/wiki/Graphisoft"&gt;Graphisoft&lt;/a&gt;, working on the architectural CAD product ArchiCAD. Later I had a startup called Scalien, where we wrote a distributed key-value database called &lt;a href="https://github.com/scalien/scaliendb"&gt;ScalienDB (code on Github)&lt;/a&gt; in C++.&lt;/p&gt;
&lt;p&gt;In all this reminiscing it's worth noting that I was not a good programmer initially. I was definitely not (and am still not) a natural talent like &lt;a href="https://en.wikipedia.org/wiki/John_Carmack"&gt;John Carmack&lt;/a&gt;, identifying problems worth solving and writing elegant and highly efficient solutions that millions of people still marvel at 20 years later (&lt;a href="https://github.com/id-Software/Quake"&gt;Quake source code on Github&lt;/a&gt;). In retrospect I would say I became a good programmer towards the end of my time at Graphisoft, and I wrote mostly good code at Scalien. However, this was in 2009 onwards, so, before the onset of modern C++. Back then my position was that the best way to use the C++ language is as a C-with-objects. Specifically, I did not enjoy using the Standard Template Library (STL), so in ScalienDB we &lt;a href="https://github.com/scalien/scaliendb/tree/master/src/System/Containers"&gt;rolled our own container classes&lt;/a&gt;, which is considered a big no-no per the C++ Core Guidelines (see below).&lt;/p&gt;
&lt;p&gt;After Scalien I pivoted to working in the data field, where I was able to leverage my second University degree as a Physicist, which I earned while working at Graphisoft. However, this also meant leaving C++ behind, as the &lt;em&gt;lingua franca&lt;/em&gt; of data in the last 10 years became Python, and SQL for querying relational databases.&lt;/p&gt;
&lt;p&gt;Lately I realized I miss writing pedal-to-the-metal shotgun-to-the-foot C++ code. Part of this is the lively progress the language has made, with the C++11, C++14, C++17 and latest C++20 standards introducing exciting new features into my favorite programming language. So a few months ago I ordered a bunch of books on modern C++ development, among them the latest edition of &lt;a href="https://www.amazon.com/Tour-C-Bjarne-Stroustrup/dp/0136816487"&gt;Bjarne Stroustrup's book&lt;/a&gt; (of which I owned an earlier edition before I gave it to my University library in 2015), and started to devour them.&lt;/p&gt;
&lt;p&gt;Over the last couple of weeks I started by reading the excellent &lt;a href="https://www.amazon.com/Beautiful-Core-Guidelines-Writing-Clean/dp/0137647840"&gt;Davidson &amp;amp; Gregory book Beautiful C++&lt;/a&gt;; to my surprise it's one of the best technical books I've read in a long time, discussing hard-earned programming wisdom in parallel with new language features. The book is built on top of the &lt;a href="https://isocpp.github.io/CppCoreGuidelines/CppCoreGuidelines"&gt;C++ Core Guidelines&lt;/a&gt;, a set of rules for writing good modern C++ code, edited by Bjarne Stroustrup and Herb Sutter.&lt;/p&gt;
&lt;p&gt;&lt;img src="https://m.media-amazon.com/images/I/51ByxdWleSL.jpg" alt="Beautiful C++" style="width: 400px;"/&gt;&lt;/p&gt;
&lt;p&gt;After reading the book cover-to-cover I had a strong urge to go and use all the new features and write some modern C++ code. I didn't want to write purely toy code; I wanted to approximate solving a real-world problem, but I also have limited time, so anything big was out of the question. I settled on re-implementing, to a satisfying (to me) degree, the standard Unix command-line tool &lt;code&gt;wc&lt;/code&gt;, most commonly used to count lines in a file via &lt;code&gt;wc -l &amp;lt;file&amp;gt;&lt;/code&gt;.&lt;/p&gt;
&lt;p&gt;The out of the box &lt;code&gt;wc --help&lt;/code&gt; output on my Debian devbox is:&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;$ wc --help
Usage: wc &lt;span class="o"&gt;[&lt;/span&gt;OPTION&lt;span class="o"&gt;]&lt;/span&gt;... &lt;span class="o"&gt;[&lt;/span&gt;FILE&lt;span class="o"&gt;]&lt;/span&gt;...
  or:  wc &lt;span class="o"&gt;[&lt;/span&gt;OPTION&lt;span class="o"&gt;]&lt;/span&gt;... --files0-from&lt;span class="o"&gt;=&lt;/span&gt;F
Print newline, word, and byte counts &lt;span class="k"&gt;for&lt;/span&gt; each FILE, and a total line &lt;span class="k"&gt;if&lt;/span&gt;
more than one FILE is specified.  A word is a non-zero-length sequence of
characters delimited by white space.

With no FILE, or when FILE is -, &lt;span class="nb"&gt;read&lt;/span&gt; standard input.

The options below may be used to &lt;span class="k"&gt;select&lt;/span&gt; which counts are printed, always in
the following order: newline, word, character, byte, maximum line length.
  -c, --bytes            print the byte counts
  -m, --chars            print the character counts
  -l, --lines            print the newline counts
      --files0-from&lt;span class="o"&gt;=&lt;/span&gt;F    &lt;span class="nb"&gt;read&lt;/span&gt; input from the files specified by
                           NUL-terminated names in file F&lt;span class="p"&gt;;&lt;/span&gt;
                           If F is - &lt;span class="k"&gt;then&lt;/span&gt; &lt;span class="nb"&gt;read&lt;/span&gt; names from standard input
  -L, --max-line-length  print the maximum display width
  -w, --words            print the word counts
      --help     display this &lt;span class="nb"&gt;help&lt;/span&gt; and &lt;span class="nb"&gt;exit&lt;/span&gt;
      --version  output version information and &lt;span class="nb"&gt;exit&lt;/span&gt;

GNU coreutils online help: &amp;lt;http://www.gnu.org/software/coreutils/&amp;gt;
Full documentation at: &amp;lt;http://www.gnu.org/software/coreutils/wc&amp;gt;
or available locally via: info &lt;span class="s1"&gt;&amp;#39;(coreutils) wc invocation&amp;#39;&lt;/span&gt;
&lt;/pre&gt;&lt;/div&gt;


&lt;p&gt;I set off to re-implement all 5 basic modes of &lt;code&gt;wc&lt;/code&gt; in C++, while trying to maintain compatibility with the standard C implementation to a reasonable degree.&lt;/p&gt;
&lt;p&gt;The &lt;a href="https://github.com/mtrencseni/playground/blob/master/wcpp_v1.cpp"&gt;full source code of my C++ implementation is up on Github&lt;/a&gt;. You can compile like:&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;$ g++ -o wcpp -O3 wcpp_v1.cpp
&lt;/pre&gt;&lt;/div&gt;


&lt;p&gt;The C implementation I most often peeked at to understand classic behaviour is &lt;a href="https://github.com/coreutils/coreutils/blob/master/src/wc.c"&gt;in the coreutils Github repo&lt;/a&gt;.&lt;/p&gt;
&lt;h2&gt;Command line arguments&lt;/h2&gt;
&lt;p&gt;&lt;code&gt;wc&lt;/code&gt; only has a few command line arguments, but each has a &lt;code&gt;-&lt;/code&gt; and a &lt;code&gt;--&lt;/code&gt; way to specify it. Additionally, &lt;code&gt;wc&lt;/code&gt; supports passing in &lt;code&gt;-xyz&lt;/code&gt;, which is equivalent to passing in &lt;code&gt;-x -y -z&lt;/code&gt;, as well as the special &lt;code&gt;key=value&lt;/code&gt; style argument &lt;code&gt;--files0-from=F&lt;/code&gt;. To my surprise, there is no C++ standard library facility for parsing command line arguments, and even &lt;a href="https://theboostcpplibraries.com/boost.program_options"&gt;&lt;code&gt;boost::program_options&lt;/code&gt;&lt;/a&gt; does not support this &lt;code&gt;-xyz&lt;/code&gt; behaviour. So I wrote a few helper functions utilizing the STL to implement this:&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;span class="k"&gt;struct&lt;/span&gt; &lt;span class="n"&gt;cl_args_t&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="n"&gt;std&lt;/span&gt;&lt;span class="o"&gt;::&lt;/span&gt;&lt;span class="n"&gt;set&lt;/span&gt;&lt;span class="o"&gt;&amp;lt;&lt;/span&gt;&lt;span class="n"&gt;std&lt;/span&gt;&lt;span class="o"&gt;::&lt;/span&gt;&lt;span class="n"&gt;string&lt;/span&gt;&lt;span class="o"&gt;&amp;gt;&lt;/span&gt; &lt;span class="n"&gt;flags&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
    &lt;span class="n"&gt;std&lt;/span&gt;&lt;span class="o"&gt;::&lt;/span&gt;&lt;span class="n"&gt;map&lt;/span&gt;&lt;span class="o"&gt;&amp;lt;&lt;/span&gt;&lt;span class="n"&gt;std&lt;/span&gt;&lt;span class="o"&gt;::&lt;/span&gt;&lt;span class="n"&gt;string&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;std&lt;/span&gt;&lt;span class="o"&gt;::&lt;/span&gt;&lt;span class="n"&gt;string&lt;/span&gt;&lt;span class="o"&gt;&amp;gt;&lt;/span&gt; &lt;span class="n"&gt;key_values&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
    &lt;span class="n"&gt;std&lt;/span&gt;&lt;span class="o"&gt;::&lt;/span&gt;&lt;span class="n"&gt;vector&lt;/span&gt;&lt;span class="o"&gt;&amp;lt;&lt;/span&gt;&lt;span class="n"&gt;std&lt;/span&gt;&lt;span class="o"&gt;::&lt;/span&gt;&lt;span class="n"&gt;string&lt;/span&gt;&lt;span class="o"&gt;&amp;gt;&lt;/span&gt; &lt;span class="n"&gt;filename_args&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;span class="p"&gt;};&lt;/span&gt;

&lt;span class="n"&gt;cl_args_t&lt;/span&gt; &lt;span class="nf"&gt;get_cl_args&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="kt"&gt;int&lt;/span&gt; &lt;span class="n"&gt;argc&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="kt"&gt;char&lt;/span&gt;&lt;span class="o"&gt;**&lt;/span&gt; &lt;span class="n"&gt;argv&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="n"&gt;cl_args_t&lt;/span&gt; &lt;span class="n"&gt;cl_args&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
    &lt;span class="k"&gt;for&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="k"&gt;auto&lt;/span&gt; &lt;span class="n"&gt;i&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt; &lt;span class="n"&gt;i&lt;/span&gt; &lt;span class="o"&gt;&amp;lt;&lt;/span&gt; &lt;span class="n"&gt;argc&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt; &lt;span class="n"&gt;i&lt;/span&gt;&lt;span class="o"&gt;++&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="k"&gt;auto&lt;/span&gt; &lt;span class="n"&gt;arg&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;std&lt;/span&gt;&lt;span class="o"&gt;::&lt;/span&gt;&lt;span class="n"&gt;string&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;argv&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;i&lt;/span&gt;&lt;span class="p"&gt;]);&lt;/span&gt;
        &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;arg&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="o"&gt;==&lt;/span&gt; &lt;span class="sc"&gt;&amp;#39;-&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
            &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;cl_args&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;filename_args&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;size&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt; &lt;span class="o"&gt;&amp;gt;&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
                &lt;span class="n"&gt;std&lt;/span&gt;&lt;span class="o"&gt;::&lt;/span&gt;&lt;span class="n"&gt;cerr&lt;/span&gt; &lt;span class="o"&gt;&amp;lt;&amp;lt;&lt;/span&gt; &lt;span class="s"&gt;&amp;quot;error: switches must precede filename arguments&amp;quot;&lt;/span&gt;
                &lt;span class="o"&gt;&amp;lt;&amp;lt;&lt;/span&gt; &lt;span class="n"&gt;std&lt;/span&gt;&lt;span class="o"&gt;::&lt;/span&gt;&lt;span class="n"&gt;endl&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
            &lt;span class="p"&gt;}&lt;/span&gt;
            &lt;span class="k"&gt;auto&lt;/span&gt; &lt;span class="n"&gt;eqi&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;arg&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;find&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sc"&gt;&amp;#39;=&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
            &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;eqi&lt;/span&gt; &lt;span class="o"&gt;==&lt;/span&gt; &lt;span class="n"&gt;std&lt;/span&gt;&lt;span class="o"&gt;::&lt;/span&gt;&lt;span class="n"&gt;string&lt;/span&gt;&lt;span class="o"&gt;::&lt;/span&gt;&lt;span class="n"&gt;npos&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
                &lt;span class="c1"&gt;// no &amp;#39;=&amp;#39; in arg&lt;/span&gt;
                &lt;span class="n"&gt;cl_args&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;flags&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;insert&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;arg&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
            &lt;span class="p"&gt;}&lt;/span&gt; &lt;span class="k"&gt;else&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
                &lt;span class="k"&gt;auto&lt;/span&gt; &lt;span class="n"&gt;k&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;arg&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;substr&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;eqi&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
                &lt;span class="k"&gt;auto&lt;/span&gt; &lt;span class="n"&gt;v&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;arg&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;substr&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;eqi&lt;/span&gt;&lt;span class="o"&gt;+&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;arg&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;length&lt;/span&gt;&lt;span class="p"&gt;());&lt;/span&gt;
                &lt;span class="n"&gt;cl_args&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;key_values&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;k&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;v&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
            &lt;span class="p"&gt;}&lt;/span&gt;
        &lt;span class="p"&gt;}&lt;/span&gt; &lt;span class="k"&gt;else&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
            &lt;span class="n"&gt;cl_args&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;filename_args&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;push_back&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;arg&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt; 
        &lt;span class="p"&gt;}&lt;/span&gt;
    &lt;span class="p"&gt;}&lt;/span&gt;
    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;cl_args&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;

&lt;span class="kt"&gt;void&lt;/span&gt; &lt;span class="nf"&gt;normalize_flags&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;cl_args_t&lt;/span&gt;&lt;span class="o"&gt;&amp;amp;&lt;/span&gt; &lt;span class="n"&gt;cl_args&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="c1"&gt;// convert eg. -lm to -l -m&lt;/span&gt;
    &lt;span class="n"&gt;std&lt;/span&gt;&lt;span class="o"&gt;::&lt;/span&gt;&lt;span class="n"&gt;set&lt;/span&gt;&lt;span class="o"&gt;&amp;lt;&lt;/span&gt;&lt;span class="n"&gt;std&lt;/span&gt;&lt;span class="o"&gt;::&lt;/span&gt;&lt;span class="n"&gt;string&lt;/span&gt;&lt;span class="o"&gt;&amp;gt;&lt;/span&gt; &lt;span class="n"&gt;new_flags&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
    &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="k"&gt;auto&lt;/span&gt; &lt;span class="nl"&gt;f&lt;/span&gt; &lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;cl_args&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;flags&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;f&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="o"&gt;==&lt;/span&gt; &lt;span class="sc"&gt;&amp;#39;-&amp;#39;&lt;/span&gt; &lt;span class="o"&gt;&amp;amp;&amp;amp;&lt;/span&gt; &lt;span class="n"&gt;f&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="o"&gt;!=&lt;/span&gt; &lt;span class="sc"&gt;&amp;#39;-&amp;#39;&lt;/span&gt; &lt;span class="o"&gt;&amp;amp;&amp;amp;&lt;/span&gt; &lt;span class="n"&gt;f&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;length&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt; &lt;span class="o"&gt;&amp;gt;&lt;/span&gt; &lt;span class="mi"&gt;2&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
            &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="k"&gt;auto&lt;/span&gt; &lt;span class="n"&gt;i&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt; &lt;span class="n"&gt;i&lt;/span&gt; &lt;span class="o"&gt;&amp;lt;&lt;/span&gt; &lt;span class="n"&gt;f&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;length&lt;/span&gt;&lt;span class="p"&gt;();&lt;/span&gt; &lt;span class="n"&gt;i&lt;/span&gt;&lt;span class="o"&gt;++&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
                &lt;span class="n"&gt;std&lt;/span&gt;&lt;span class="o"&gt;::&lt;/span&gt;&lt;span class="n"&gt;string&lt;/span&gt; &lt;span class="n"&gt;new_f&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;std&lt;/span&gt;&lt;span class="o"&gt;::&lt;/span&gt;&lt;span class="n"&gt;string&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s"&gt;&amp;quot;-&amp;quot;&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;+&lt;/span&gt; &lt;span class="n"&gt;f&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;i&lt;/span&gt;&lt;span class="p"&gt;];&lt;/span&gt;
                &lt;span class="n"&gt;new_flags&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;insert&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;new_f&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
            &lt;span class="p"&gt;}&lt;/span&gt;
        &lt;span class="p"&gt;}&lt;/span&gt; &lt;span class="k"&gt;else&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
            &lt;span class="n"&gt;new_flags&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;insert&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;f&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
        &lt;span class="p"&gt;}&lt;/span&gt;
    &lt;span class="p"&gt;}&lt;/span&gt;
    &lt;span class="n"&gt;cl_args&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;flags&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;new_flags&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;

&lt;span class="kt"&gt;void&lt;/span&gt; &lt;span class="nf"&gt;check_bad_args&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="k"&gt;const&lt;/span&gt; &lt;span class="n"&gt;cl_args_t&lt;/span&gt;&lt;span class="o"&gt;&amp;amp;&lt;/span&gt; &lt;span class="n"&gt;cl_args&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="n"&gt;std&lt;/span&gt;&lt;span class="o"&gt;::&lt;/span&gt;&lt;span class="n"&gt;set&lt;/span&gt;&lt;span class="o"&gt;&amp;lt;&lt;/span&gt;&lt;span class="n"&gt;std&lt;/span&gt;&lt;span class="o"&gt;::&lt;/span&gt;&lt;span class="n"&gt;string&lt;/span&gt;&lt;span class="o"&gt;&amp;gt;&lt;/span&gt; &lt;span class="n"&gt;accepted_flags&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="s"&gt;&amp;quot;-l&amp;quot;&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s"&gt;&amp;quot;--lines&amp;quot;&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="s"&gt;&amp;quot;-w&amp;quot;&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s"&gt;&amp;quot;--words&amp;quot;&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="s"&gt;&amp;quot;-m&amp;quot;&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s"&gt;&amp;quot;--chars&amp;quot;&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="s"&gt;&amp;quot;-c&amp;quot;&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s"&gt;&amp;quot;--bytes&amp;quot;&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="s"&gt;&amp;quot;-L&amp;quot;&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s"&gt;&amp;quot;--max-line-length&amp;quot;&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
              &lt;span class="s"&gt;&amp;quot;--help&amp;quot;&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
              &lt;span class="s"&gt;&amp;quot;--version&amp;quot;&lt;/span&gt;
    &lt;span class="p"&gt;};&lt;/span&gt;
    &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="k"&gt;auto&lt;/span&gt; &lt;span class="nl"&gt;f&lt;/span&gt; &lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;cl_args&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;flags&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="o"&gt;!&lt;/span&gt;&lt;span class="n"&gt;accepted_flags&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;count&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;f&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
            &lt;span class="n"&gt;std&lt;/span&gt;&lt;span class="o"&gt;::&lt;/span&gt;&lt;span class="n"&gt;cerr&lt;/span&gt; &lt;span class="o"&gt;&amp;lt;&amp;lt;&lt;/span&gt; &lt;span class="s"&gt;&amp;quot;invalid option: &amp;quot;&lt;/span&gt; &lt;span class="o"&gt;&amp;lt;&amp;lt;&lt;/span&gt; &lt;span class="n"&gt;f&lt;/span&gt; &lt;span class="o"&gt;&amp;lt;&amp;lt;&lt;/span&gt; &lt;span class="n"&gt;std&lt;/span&gt;&lt;span class="o"&gt;::&lt;/span&gt;&lt;span class="n"&gt;endl&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
            &lt;span class="n"&gt;exit&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
        &lt;span class="p"&gt;}&lt;/span&gt;
    &lt;span class="p"&gt;}&lt;/span&gt;
    &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="k"&gt;auto&lt;/span&gt; &lt;span class="nl"&gt;k&lt;/span&gt; &lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;cl_args&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;key_values&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="n"&gt;std&lt;/span&gt;&lt;span class="o"&gt;::&lt;/span&gt;&lt;span class="n"&gt;string&lt;/span&gt; &lt;span class="n"&gt;must_be&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="s"&gt;&amp;quot;--files0-from&amp;quot;&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
        &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;k&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;first&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;substr&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;must_be&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;length&lt;/span&gt;&lt;span class="p"&gt;())&lt;/span&gt; &lt;span class="o"&gt;!=&lt;/span&gt; &lt;span class="n"&gt;must_be&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
            &lt;span class="n"&gt;std&lt;/span&gt;&lt;span class="o"&gt;::&lt;/span&gt;&lt;span class="n"&gt;cerr&lt;/span&gt; &lt;span class="o"&gt;&amp;lt;&amp;lt;&lt;/span&gt; &lt;span class="s"&gt;&amp;quot;invalid option: &amp;quot;&lt;/span&gt; &lt;span class="o"&gt;&amp;lt;&amp;lt;&lt;/span&gt; &lt;span class="n"&gt;k&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;first&lt;/span&gt; &lt;span class="o"&gt;&amp;lt;&amp;lt;&lt;/span&gt; &lt;span class="n"&gt;std&lt;/span&gt;&lt;span class="o"&gt;::&lt;/span&gt;&lt;span class="n"&gt;endl&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
            &lt;span class="n"&gt;exit&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
        &lt;span class="p"&gt;}&lt;/span&gt;
    &lt;span class="p"&gt;}&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/pre&gt;&lt;/div&gt;


&lt;h2&gt;Counting architecture&lt;/h2&gt;
&lt;p&gt;&lt;code&gt;wc&lt;/code&gt; supports counting 5 quantities:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;bytes&lt;/li&gt;
&lt;li&gt;characters&lt;/li&gt;
&lt;li&gt;words&lt;/li&gt;
&lt;li&gt;lines&lt;/li&gt;
&lt;li&gt;the length of the longest line&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;To support this, I defined an abstract base class called &lt;code&gt;counter&lt;/code&gt;, and 5 child classes defining each counting logic:&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;span class="k"&gt;class&lt;/span&gt; &lt;span class="nc"&gt;counter&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="k"&gt;public&lt;/span&gt;&lt;span class="o"&gt;:&lt;/span&gt;
    &lt;span class="k"&gt;virtual&lt;/span&gt; &lt;span class="kt"&gt;bool&lt;/span&gt; &lt;span class="n"&gt;has_optimization&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt; &lt;span class="k"&gt;const&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="nb"&gt;false&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt; &lt;span class="p"&gt;}&lt;/span&gt;
    &lt;span class="k"&gt;virtual&lt;/span&gt; &lt;span class="kt"&gt;void&lt;/span&gt; &lt;span class="n"&gt;optimized_count&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="k"&gt;const&lt;/span&gt; &lt;span class="n"&gt;std&lt;/span&gt;&lt;span class="o"&gt;::&lt;/span&gt;&lt;span class="n"&gt;string&lt;/span&gt; &lt;span class="n"&gt;fname&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="p"&gt;}&lt;/span&gt;
    &lt;span class="k"&gt;virtual&lt;/span&gt; &lt;span class="kt"&gt;void&lt;/span&gt; &lt;span class="n"&gt;process_wchar&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="k"&gt;const&lt;/span&gt; &lt;span class="kt"&gt;wchar_t&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="kt"&gt;unsigned&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="kt"&gt;bool&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
    &lt;span class="kt"&gt;unsigned&lt;/span&gt; &lt;span class="kt"&gt;long&lt;/span&gt; &lt;span class="nf"&gt;get_count&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt; &lt;span class="k"&gt;const&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;count&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt; &lt;span class="p"&gt;}&lt;/span&gt;
    &lt;span class="kt"&gt;void&lt;/span&gt; &lt;span class="nf"&gt;add&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="kt"&gt;unsigned&lt;/span&gt; &lt;span class="kt"&gt;long&lt;/span&gt; &lt;span class="n"&gt;i&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="n"&gt;count&lt;/span&gt; &lt;span class="o"&gt;+=&lt;/span&gt; &lt;span class="n"&gt;i&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt; &lt;span class="p"&gt;}&lt;/span&gt;
    &lt;span class="kt"&gt;void&lt;/span&gt; &lt;span class="nf"&gt;clear&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="n"&gt;count&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt; &lt;span class="p"&gt;}&lt;/span&gt;
    &lt;span class="k"&gt;protected&lt;/span&gt;&lt;span class="o"&gt;:&lt;/span&gt;
    &lt;span class="kt"&gt;unsigned&lt;/span&gt; &lt;span class="kt"&gt;long&lt;/span&gt; &lt;span class="n"&gt;count&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;span class="p"&gt;};&lt;/span&gt;
&lt;/pre&gt;&lt;/div&gt;


&lt;p&gt;Essentially, each counter is a state machine: each invocation of &lt;code&gt;process_wchar(const wchar_t wc, unsigned size, bool error)&lt;/code&gt; advances the state machine with a newly read wide character &lt;code&gt;wc&lt;/code&gt; that is &lt;code&gt;size&lt;/code&gt; bytes long; if the current bytes do not form a valid character per the current locale, then &lt;code&gt;error&lt;/code&gt; is set and 1 byte is consumed. &lt;code&gt;has_optimization()&lt;/code&gt; and &lt;code&gt;optimized_count()&lt;/code&gt; are a special path to support &lt;code&gt;wc -c&lt;/code&gt; byte counting: in this case we don't have to actually count the bytes, we can just use &lt;code&gt;stat()&lt;/code&gt; to ask the operating system what the size of the file is — assuming we are processing files and not &lt;code&gt;stdin&lt;/code&gt;.&lt;/p&gt;
&lt;p&gt;As an example, here is the implementation for byte counting, the most trivial of the 5:&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;span class="k"&gt;class&lt;/span&gt; &lt;span class="nc"&gt;byte_counter&lt;/span&gt; &lt;span class="o"&gt;:&lt;/span&gt; &lt;span class="k"&gt;public&lt;/span&gt; &lt;span class="n"&gt;counter&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="k"&gt;public&lt;/span&gt;&lt;span class="o"&gt;:&lt;/span&gt;
    &lt;span class="kt"&gt;bool&lt;/span&gt; &lt;span class="n"&gt;has_optimization&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt; &lt;span class="k"&gt;const&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="nb"&gt;true&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt; &lt;span class="p"&gt;}&lt;/span&gt;
    &lt;span class="k"&gt;virtual&lt;/span&gt; &lt;span class="kt"&gt;void&lt;/span&gt; &lt;span class="n"&gt;optimized_count&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="k"&gt;const&lt;/span&gt; &lt;span class="n"&gt;std&lt;/span&gt;&lt;span class="o"&gt;::&lt;/span&gt;&lt;span class="n"&gt;string&lt;/span&gt; &lt;span class="n"&gt;fname&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="k"&gt;struct&lt;/span&gt; &lt;span class="n"&gt;stat&lt;/span&gt; &lt;span class="n"&gt;st&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
        &lt;span class="n"&gt;stat&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;fname&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;c_str&lt;/span&gt;&lt;span class="p"&gt;(),&lt;/span&gt; &lt;span class="o"&gt;&amp;amp;&lt;/span&gt;&lt;span class="n"&gt;st&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
        &lt;span class="n"&gt;count&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;st&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;st_size&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
    &lt;span class="p"&gt;}&lt;/span&gt;
    &lt;span class="kt"&gt;void&lt;/span&gt; &lt;span class="n"&gt;process_wchar&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="k"&gt;const&lt;/span&gt; &lt;span class="kt"&gt;wchar_t&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="kt"&gt;unsigned&lt;/span&gt; &lt;span class="n"&gt;num_bytes&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="kt"&gt;bool&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="n"&gt;count&lt;/span&gt; &lt;span class="o"&gt;+=&lt;/span&gt; &lt;span class="n"&gt;num_bytes&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
    &lt;span class="p"&gt;}&lt;/span&gt;
&lt;span class="p"&gt;};&lt;/span&gt;
&lt;/pre&gt;&lt;/div&gt;


&lt;h2&gt;Handling wide characters&lt;/h2&gt;
&lt;p&gt;Although I've been using &lt;code&gt;wc&lt;/code&gt; for 25 years, I had never used it to count characters or words, so I was surprised to learn that a significant fraction of the complexity comes from differentiating between bytes and (non-ASCII) characters. Essentially, in Unix a locale can be set, and &lt;code&gt;wc&lt;/code&gt; honors this setting. On my Debian box, running &lt;code&gt;locale&lt;/code&gt; shows:&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;$ locale
&lt;span class="nv"&gt;LANG&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;en_US.utf8
&lt;span class="nv"&gt;LANGUAGE&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;
&lt;span class="nv"&gt;LC_CTYPE&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="s2"&gt;&amp;quot;en_US.utf8&amp;quot;&lt;/span&gt;
&lt;span class="nv"&gt;LC_NUMERIC&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="s2"&gt;&amp;quot;en_US.utf8&amp;quot;&lt;/span&gt;
&lt;span class="nv"&gt;LC_TIME&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="s2"&gt;&amp;quot;en_US.utf8&amp;quot;&lt;/span&gt;
&lt;span class="nv"&gt;LC_COLLATE&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="s2"&gt;&amp;quot;en_US.utf8&amp;quot;&lt;/span&gt;
&lt;span class="nv"&gt;LC_MONETARY&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="s2"&gt;&amp;quot;en_US.utf8&amp;quot;&lt;/span&gt;
&lt;span class="nv"&gt;LC_MESSAGES&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="s2"&gt;&amp;quot;en_US.utf8&amp;quot;&lt;/span&gt;
&lt;span class="nv"&gt;LC_PAPER&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="s2"&gt;&amp;quot;en_US.utf8&amp;quot;&lt;/span&gt;
&lt;span class="nv"&gt;LC_NAME&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="s2"&gt;&amp;quot;en_US.utf8&amp;quot;&lt;/span&gt;
&lt;span class="nv"&gt;LC_ADDRESS&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="s2"&gt;&amp;quot;en_US.utf8&amp;quot;&lt;/span&gt;
&lt;span class="nv"&gt;LC_TELEPHONE&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="s2"&gt;&amp;quot;en_US.utf8&amp;quot;&lt;/span&gt;
&lt;span class="nv"&gt;LC_MEASUREMENT&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="s2"&gt;&amp;quot;en_US.utf8&amp;quot;&lt;/span&gt;
&lt;span class="nv"&gt;LC_IDENTIFICATION&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="s2"&gt;&amp;quot;en_US.utf8&amp;quot;&lt;/span&gt;
&lt;span class="nv"&gt;LC_ALL&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;
&lt;/pre&gt;&lt;/div&gt;


&lt;p&gt;This means that &lt;code&gt;wc&lt;/code&gt; interprets bytes as UTF-8 characters. So a valid multi-byte UTF-8 sequence is counted as 1 character by &lt;code&gt;wc&lt;/code&gt;, and the word counting logic honors this too. However, if we set the locale to &lt;code&gt;C&lt;/code&gt;, we switch to &lt;code&gt;chars=bytes&lt;/code&gt; mode, in which case &lt;code&gt;wc -c&lt;/code&gt; and &lt;code&gt;wc -m&lt;/code&gt; always output the same number; you can try this with &lt;code&gt;LC_ALL="C" wc -c -m -l -w -L &amp;lt;file&amp;gt;&lt;/code&gt;.&lt;/p&gt;
&lt;p&gt;However, not all byte sequences are valid UTF-8 (or whatever else the locale is set to), so invalid sequences are skipped one byte at a time, and a new attempt is made to read a valid character at the next offset. The workhorse here is the C function &lt;code&gt;mbrtowc()&lt;/code&gt;, which attempts to read a character per the current locale and returns it, or signals an error. This logic lives in my &lt;code&gt;process_block()&lt;/code&gt; function, which, after reading a block of bytes, repeatedly tries to extract wide characters. The function is quite C-like, and its core logic is similar to the standard &lt;code&gt;wc&lt;/code&gt; implementations:&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;span class="kt"&gt;unsigned&lt;/span&gt; &lt;span class="nf"&gt;process_block&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
    &lt;span class="k"&gt;const&lt;/span&gt; &lt;span class="kt"&gt;char&lt;/span&gt;&lt;span class="o"&gt;*&lt;/span&gt; &lt;span class="n"&gt;cp&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="kt"&gt;unsigned&lt;/span&gt; &lt;span class="n"&gt;n&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;counter_vector&lt;/span&gt;&lt;span class="o"&gt;&amp;amp;&lt;/span&gt; &lt;span class="n"&gt;counters&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="kt"&gt;bool&lt;/span&gt; &lt;span class="n"&gt;eof&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="kt"&gt;long&lt;/span&gt; &lt;span class="n"&gt;remaining&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;n&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
    &lt;span class="n"&gt;std&lt;/span&gt;&lt;span class="o"&gt;::&lt;/span&gt;&lt;span class="kt"&gt;mbstate_t&lt;/span&gt; &lt;span class="n"&gt;ps&lt;/span&gt;&lt;span class="p"&gt;{};&lt;/span&gt;
    &lt;span class="k"&gt;while&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;remaining&lt;/span&gt; &lt;span class="o"&gt;&amp;gt;&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="kt"&gt;wchar_t&lt;/span&gt; &lt;span class="n"&gt;wc&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
        &lt;span class="kt"&gt;long&lt;/span&gt; &lt;span class="n"&gt;num_bytes&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
        &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;is_basic&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="n"&gt;cp&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
            &lt;span class="n"&gt;wc&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="n"&gt;cp&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
            &lt;span class="n"&gt;num_bytes&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
        &lt;span class="p"&gt;}&lt;/span&gt; &lt;span class="k"&gt;else&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
            &lt;span class="n"&gt;num_bytes&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;mbrtowc&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="o"&gt;&amp;amp;&lt;/span&gt;&lt;span class="n"&gt;wc&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;cp&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;remaining&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="o"&gt;&amp;amp;&lt;/span&gt;&lt;span class="n"&gt;ps&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
        &lt;span class="p"&gt;}&lt;/span&gt;
        &lt;span class="kt"&gt;bool&lt;/span&gt; &lt;span class="n"&gt;error&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nb"&gt;false&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
        &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;num_bytes&lt;/span&gt; &lt;span class="o"&gt;==&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
            &lt;span class="c1"&gt;// encountered null character&lt;/span&gt;
            &lt;span class="n"&gt;num_bytes&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
        &lt;span class="p"&gt;}&lt;/span&gt; &lt;span class="k"&gt;else&lt;/span&gt; &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;num_bytes&lt;/span&gt; &lt;span class="o"&gt;==&lt;/span&gt; &lt;span class="k"&gt;static_cast&lt;/span&gt;&lt;span class="o"&gt;&amp;lt;&lt;/span&gt;&lt;span class="n"&gt;std&lt;/span&gt;&lt;span class="o"&gt;::&lt;/span&gt;&lt;span class="kt"&gt;size_t&lt;/span&gt;&lt;span class="o"&gt;&amp;gt;&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="o"&gt;-&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
            &lt;span class="c1"&gt;// encountered bad wide character&lt;/span&gt;
            &lt;span class="n"&gt;num_bytes&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
            &lt;span class="n"&gt;error&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nb"&gt;true&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
        &lt;span class="p"&gt;}&lt;/span&gt; &lt;span class="k"&gt;else&lt;/span&gt; &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;num_bytes&lt;/span&gt; &lt;span class="o"&gt;==&lt;/span&gt; &lt;span class="k"&gt;static_cast&lt;/span&gt;&lt;span class="o"&gt;&amp;lt;&lt;/span&gt;&lt;span class="n"&gt;std&lt;/span&gt;&lt;span class="o"&gt;::&lt;/span&gt;&lt;span class="kt"&gt;size_t&lt;/span&gt;&lt;span class="o"&gt;&amp;gt;&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="o"&gt;-&lt;/span&gt;&lt;span class="mi"&gt;2&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
            &lt;span class="c1"&gt;// encountered incomplete wide character, get more bytes if possible&lt;/span&gt;
            &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="o"&gt;!&lt;/span&gt;&lt;span class="n"&gt;eof&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
                &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;remaining&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
            &lt;span class="n"&gt;num_bytes&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;remaining&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
            &lt;span class="n"&gt;error&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nb"&gt;true&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
        &lt;span class="p"&gt;}&lt;/span&gt; &lt;span class="k"&gt;else&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
            &lt;span class="c1"&gt;// success, read a wchar_t&lt;/span&gt;
        &lt;span class="p"&gt;}&lt;/span&gt;
        &lt;span class="n"&gt;std&lt;/span&gt;&lt;span class="o"&gt;::&lt;/span&gt;&lt;span class="n"&gt;for_each&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;counters&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;begin&lt;/span&gt;&lt;span class="p"&gt;(),&lt;/span&gt; &lt;span class="n"&gt;counters&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;end&lt;/span&gt;&lt;span class="p"&gt;(),&lt;/span&gt;
            &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="o"&gt;&amp;amp;&lt;/span&gt;&lt;span class="p"&gt;](&lt;/span&gt;&lt;span class="k"&gt;auto&lt;/span&gt;&lt;span class="o"&gt;&amp;amp;&lt;/span&gt; &lt;span class="n"&gt;c&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="n"&gt;c&lt;/span&gt;&lt;span class="o"&gt;-&amp;gt;&lt;/span&gt;&lt;span class="n"&gt;process_wchar&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;wc&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;num_bytes&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;error&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt; &lt;span class="p"&gt;});&lt;/span&gt;
        &lt;span class="n"&gt;cp&lt;/span&gt; &lt;span class="o"&gt;+=&lt;/span&gt; &lt;span class="n"&gt;num_bytes&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
        &lt;span class="n"&gt;remaining&lt;/span&gt; &lt;span class="o"&gt;-=&lt;/span&gt; &lt;span class="n"&gt;num_bytes&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
    &lt;span class="p"&gt;}&lt;/span&gt;
    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;remaining&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/pre&gt;&lt;/div&gt;


&lt;h2&gt;Bugs&lt;/h2&gt;
&lt;p&gt;Getting the locale-specific conversion of bytes to characters right took at least half of all development time. I used large zipped files as tests, since these contain random bytes, some of which form valid UTF-8 sequences and many of which do not, which exercised many branches of my code. In the end, I got everything to match except one case: &lt;code&gt;U+2029&lt;/code&gt;, the &lt;a href="https://www.compart.com/en/unicode/U+2029"&gt;Unicode paragraph separator character&lt;/a&gt;. My implementation calls &lt;code&gt;std::iswspace(wc)&lt;/code&gt;, which considers this character white-space, so it separates words and triggers an extra word count. However, my Debian system's &lt;code&gt;wc&lt;/code&gt; does not consider this character to be a word separator. I reduced the difference to a test file containing &lt;code&gt;aU+2029a&lt;/code&gt; (5 bytes long, as &lt;code&gt;U+2029&lt;/code&gt; is represented in 3 bytes in UTF-8), on which the system &lt;code&gt;wc -w&lt;/code&gt; returns 1 word, but my implementation counts 2 words (&lt;a href="https://github.com/mtrencseni/playground/blob/master/u2029_test"&gt;test file is here&lt;/a&gt;). I believe mine is correct, so I left this difference in.&lt;/p&gt;
&lt;h2&gt;Performance&lt;/h2&gt;
&lt;p&gt;I tested the performance by running all 5 counters on fairly large zip files repeatedly and using &lt;code&gt;time&lt;/code&gt;. To my surprise, without any additional optimizations, just by compiling with &lt;code&gt;-O3&lt;/code&gt;, my C++ implementation is only about 10% slower than my system's highly optimized C version from 20-30 years ago:&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;$ &lt;span class="nb"&gt;time&lt;/span&gt; wc -c -m -l -w -L playground/MNIST_data/train-images-idx3-ubyte.gz
  &lt;span class="m"&gt;36468&lt;/span&gt;  &lt;span class="m"&gt;207566&lt;/span&gt; &lt;span class="m"&gt;5259183&lt;/span&gt; &lt;span class="m"&gt;9912422&lt;/span&gt;     &lt;span class="m"&gt;448&lt;/span&gt; playground/MNIST_data/train-images-idx3-ubyte.gz

real    0m0.560s
user    0m0.548s
sys     0m0.008s
$ &lt;span class="nb"&gt;time&lt;/span&gt; ./wcpp -c -m -l -w -L playground/MNIST_data/train-images-idx3-ubyte.gz
   &lt;span class="m"&gt;36468&lt;/span&gt;  &lt;span class="m"&gt;207567&lt;/span&gt; &lt;span class="m"&gt;5259183&lt;/span&gt; &lt;span class="m"&gt;9912422&lt;/span&gt;     &lt;span class="m"&gt;448&lt;/span&gt; playground/MNIST_data/train-images-idx3-ubyte.gz

real    0m0.619s
user    0m0.620s
sys     0m0.000s
&lt;/pre&gt;&lt;/div&gt;


&lt;p&gt;Note the difference in word counts due to the issue mentioned above.&lt;/p&gt;
&lt;h2&gt;Conclusion&lt;/h2&gt;
&lt;p&gt;I had a lot of fun writing &lt;code&gt;wcpp.cpp&lt;/code&gt; and learned a lot about Unix, &lt;code&gt;locale&lt;/code&gt;, wide character handling, and newer features of C++ such as &lt;code&gt;auto&lt;/code&gt; and &lt;code&gt;for_each&lt;/code&gt;. I'm sure there are still plenty of optimization possibilities in my C++ language usage, and I'm happy to take feedback. I plan to continue down this path and, as time allows, implement other classic Unix tools such as &lt;code&gt;grep&lt;/code&gt; or &lt;code&gt;find&lt;/code&gt;.&lt;/p&gt;</content><category term="c++"></category><category term="wc"></category></entry><entry><title>Better be first 99% of the time than second 100% of the time</title><link href="https://bytepawn.com/better-be-first-99-percent-of-the-time-than-second-100-percent-of-the-time.html" rel="alternate"></link><published>2023-08-25T00:00:00+02:00</published><updated>2023-08-25T00:00:00+02:00</updated><author><name>Marton Trencseni</name></author><id>tag:bytepawn.com,2023-08-25:/better-be-first-99-percent-of-the-time-than-second-100-percent-of-the-time.html</id><summary type="html">&lt;p&gt;A review of Donald MacKenzie's book &lt;em&gt;Trading at the Speed of Light&lt;/em&gt;, which gives an excellent history of and inside peek into the world of High Frequency Trading, or HFT.&lt;br/&gt;&lt;br/&gt; &lt;img src="/images/trading-at-the-speed-of-light.jpg" alt="Trading at the Speed of Light" style="width: 400px;"/&gt;&lt;/p&gt;</summary><content type="html">&lt;h2&gt;Introduction&lt;/h2&gt;
&lt;p&gt;This post is a review of &lt;a href="https://en.wikipedia.org/wiki/Donald_Angus_MacKenzie"&gt;Donald MacKenzie's&lt;/a&gt; book &lt;a href="https://www.amazon.com/Trading-Speed-Light-Algorithms-Transforming/dp/0691211388"&gt;Trading at the Speed of Light&lt;/a&gt;, which gives an excellent history of and inside peek into the world of High Frequency Trading, or HFT. Donald MacKenzie is a Professor of Sociology at the University of Edinburgh, Scotland; his main focus is the social studies of finance.&lt;/p&gt;
&lt;p&gt;I've been interested in HFT for 20 years, and found this book interesting and a pleasure to read — highly recommended!&lt;/p&gt;
&lt;p&gt;I &lt;a href="https://docs.google.com/document/d/1EXPJJFgokvTjelAMKSKU_2Zpr3EGX9pF7BG60eHUK6E/edit#heading=h.ndkaentqzvps"&gt;took notes&lt;/a&gt; on the book and then used &lt;a href="https://chat.openai.com/share/cf1598f0-1fe2-4c47-abbb-e8cb6c54ee16"&gt;ChatGPT to generate most of the text&lt;/a&gt; in this article.&lt;/p&gt;
&lt;p&gt;&lt;img src="/images/donald-mackenzie.jpg" alt="Trading at the Speed of Light" style="width: 600px;"/&gt;&lt;/p&gt;
&lt;h2&gt;The Evolution of Trading Systems&lt;/h2&gt;
&lt;p&gt;The book delves deep into the history of trading systems, tracing the journey from pit trading to electronic platforms. Pit trading involved &lt;strong&gt;open outcry&lt;/strong&gt;, where traders shouted orders and clerks recorded them in physical order books. Electronic trading emerged alongside it, but the transition was far from smooth. It required overcoming cultural, technical, and regulatory hurdles, such as exchanges resisting algorithmic trading and early HFT firms resorting to robotic arms to punch keys, since modern electronic APIs were not available.&lt;/p&gt;
&lt;p&gt;Though HFT generates estimated revenues of \$3-5 billion in the U.S., it's a high-cost, slim-margin game. Even a large HFT firm making about \$1 billion in revenue faces extensive costs, making profitability precarious. The difference between success and failure often comes down to fractions of a cent per trade.&lt;/p&gt;
&lt;p&gt;The book shows that today's HFT landscape is a culmination of history, physics, computation, politics, market structures and laws. Despite its efficiency, &lt;strong&gt;HFT has not significantly lowered the cost of financial transactions&lt;/strong&gt; for retail investors. Regulatory bodies like the SEC and CFTC play vital roles but are often navigating uncharted waters.&lt;/p&gt;
&lt;p&gt;&lt;img src="/images/pit-trading.jpg" alt="Pit trading" style="width: 600px;"/&gt;&lt;/p&gt;
&lt;h2&gt;Early Pioneers&lt;/h2&gt;
&lt;p&gt;One of the first electronic exchanges was &lt;a href="https://en.wikipedia.org/wiki/Island_ECN"&gt;Island ECN&lt;/a&gt;, introduced around 1998. Unlike its predecessors like &lt;a href="https://en.wikipedia.org/wiki/Instinet"&gt;Instinet&lt;/a&gt;, Island had an algorithmic matching engine. Early HFT firms like Automated Trading Desk (ATD) were among the first to realize the importance of milliseconds and started to move their servers closer to Island's data centers to gain a speed advantage. &lt;a href="https://www.reuters.com/article/us-citigroup-automatedtradingdesk-idUSN0233485120070702"&gt;ATD was bought by Citigroup&lt;/a&gt; for \$680M in 2007. For a list of HFT firms today, &lt;a href="https://medium.com/automation-generation/15-well-known-high-frequency-trading-firms-f45292c56d05"&gt;see here&lt;/a&gt;.&lt;/p&gt;
&lt;h2&gt;Financial Evolution and Market Politics&lt;/h2&gt;
&lt;p&gt;The evolution of trading systems was politically fraught. Each step of automation led to job losses, inciting resistance from incumbents. For example, share prices initially &lt;strong&gt;quantized&lt;/strong&gt; to 1/8 dollars to ensure market makers could maintain a wider bid-ask spread, thus securing profits: market makers offered to buy a stock at \$100, and offered to sell at \$100+1/8 (or more), since that was the minimum quantum of the market. When this changed to finer-grained price units of \$0.01, legacy exchanges and market makers lost business to more modern competitors.&lt;/p&gt;
&lt;p&gt;Market makers create liquidity for other market participants by always having buy and sell orders in the order book. Thus, if a seller $S$ wants to sell at time $T$, and a buyer $B$ wants to buy at some later time $T'$, they don't have to wait for each other: the market maker buys from $S$ at time $T$ and later sells to $B$ at time $T'$, for a "small profit" related to the bid-ask spread, which itself is a function of the market quantum.&lt;/p&gt;
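&lt;p&gt;To make the spread arithmetic concrete, a toy calculation (illustrative numbers of mine, not from the book):&lt;/p&gt;

```python
from fractions import Fraction

def market_maker_profit(shares, tick):
    # the maker buys from S at the bid and later sells to B at bid + tick,
    # so the per-share profit is at least one tick (the market quantum)
    return shares * tick

eighth = market_maker_profit(1000, Fraction(1, 8))    # 125 dollars per 1000 shares
penny = market_maker_profit(1000, Fraction(1, 100))   # 10 dollars per 1000 shares
print(eighth, penny)
```

Moving from 1/8-dollar ticks to penny ticks cut the guaranteed spread income by a factor of 12.5, which is exactly why incumbent market makers resisted the change.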
&lt;p&gt;&lt;img src="/images/order-book.jpg" alt="Order book" style="width: 600px;"/&gt;&lt;/p&gt;
&lt;h2&gt;The Order Book and Matching Engine&lt;/h2&gt;
&lt;p&gt;A critical component of trading systems is the &lt;strong&gt;matching engine&lt;/strong&gt;, responsible for pairing buy and sell orders. The evolution has led from human-operated matching engines to fully algorithmic systems. With algorithmic matching, an &lt;strong&gt;open order book&lt;/strong&gt; became a possibility, allowing market participants to view buy and sell orders at various prices. An open order book could either be anonymous or not, affecting strategies and politics among trading entities.&lt;/p&gt;
&lt;p&gt;The non-anonymity of some open order books presents a unique challenge for HFT firms. They need to be cautious not to make &lt;strong&gt;too much money&lt;/strong&gt; off big firms like big banks. Banks, having the leverage to influence exchange policies, can block trades if they suspect an HFT firm is disproportionately benefiting from them. Therefore, HFT firms often limit their gains from a single identifiable firm.&lt;/p&gt;
&lt;h2&gt;High-Frequency Signals and Strategies&lt;/h2&gt;
&lt;p&gt;HFT relies heavily on signals like &lt;strong&gt;futures lead&lt;/strong&gt;, where price changes in futures markets (the Chicago Mercantile Exchange, located in Chicago) often precede similar movements in the underlying asset markets (NYSE and NASDAQ, located in New York). Other signals include order book dynamics, market fragmentation, and correlated share price movement. HFT algorithms trade primarily with each other, as well as with execution algorithms that break down large institutional orders into smaller child orders.&lt;/p&gt;
&lt;p&gt;HFT activities are concentrated in about 25 global data centers, with four major ones in the U.S., an &lt;strong&gt;Equities Triangle&lt;/strong&gt; in New Jersey consisting of Nasdaq, NYSE, and NY4/5, and the Chicago Mercantile Exchange. Connections between these data centers, such as CME-NYC, become lifelines for HFT firms, who pay a premium for the fastest possible links.&lt;/p&gt;
&lt;p&gt;&lt;img src="/images/cme-nyse.jpg" alt="CME NYSE Geodesic" style="width: 600px;"/&gt;&lt;/p&gt;
&lt;h2&gt;Hugging the Geodesic&lt;/h2&gt;
&lt;p&gt;Connectivity has always been at the core of HFT. The first fiber optic Gold Line from the CME datacenter to New Jersey was a significant milestone, but companies like Spread Networks invested hundreds of millions to lay cables that more closely "hugged the geodesic" to reduce latency. Spread Networks pushed the limit to 6.65ms, a significant improvement from the previous 8ms but still above the Einstein limit of 3.94ms. AB Services revolutionized the game by introducing microwave links that could perform at 3.98ms, &lt;strong&gt;nearly meeting the theoretical limit&lt;/strong&gt;.&lt;/p&gt;
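&lt;p&gt;The "Einstein limit" is just great-circle distance divided by the speed of light. A back-of-the-envelope check (the ~1,181 km Chicago to New Jersey distance is an approximation):&lt;/p&gt;

```python
C = 299_792.458   # speed of light in vacuum, km/s
DIST = 1_181      # approximate Chicago to New Jersey great-circle distance, km

one_way_ms = DIST / C * 1000
print(f'{one_way_ms:.2f} ms')  # ~3.94 ms, the one-way Einstein limit
# light in fiber is ~1.5x slower (refractive index ~1.5), so even a perfectly
# straight cable could not beat ~5.9 ms; Spread Networks achieved 6.65 ms
print(f'{DIST / (C / 1.5) * 1000:.2f} ms')
```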
&lt;p&gt;Laying these microwave links wasn't easy. For example, one of the main challenges was crossing 80 km across Lake Michigan. The solution was to shoot signals from skyscraper tops across the lake. Many firms switched to higher frequencies, up to 23 GHz, which are more vulnerable to environmental conditions like fog or rain. Yet, in HFT, it's "better to be first 99% of the time than second 100% of the time."&lt;/p&gt;
&lt;p&gt;As HFT operations expanded, the race extended within cities like New Jersey. In these densely populated areas, millimeter wave technology replaced microwaves due to bandwidth requirements. AOptics introduced a dual-mode millimeter plus laser technology, each with its susceptibilities balanced by the other, to combat issues like rain and fog.&lt;/p&gt;
&lt;p&gt;&lt;img src="/images/microwave-tower.jpg" alt="Microwave tower" style="width: 600px;"/&gt;&lt;/p&gt;
&lt;h2&gt;The Physicality of Computers in HFT&lt;/h2&gt;
&lt;p&gt;CME's datacenter in Chicago serves as a linchpin in the complex web of HFT. In 2012, the facility moved from its original location at Cermak, which was originally a printing plant, to a more advanced, separate facility. Cermak could handle 100 MW of electricity, all eventually transformed into heat — a concern when power density is a crucial metric in datacenter operations. Security is stringent, with most doors protected by biometrics.&lt;/p&gt;
&lt;p&gt;In the world of HFT, even the physical attributes of a computer become crucial. Programmers must consider the &lt;strong&gt;physics of computation&lt;/strong&gt;, focusing on elements like wires, photons and electrons, instructions, and data packets, to minimize latency. This brings in concepts from physics, emphasizing the literal speed of light as a limiting factor in trade execution times. Initially, firms relied on C++ programming for a speed advantage, but the bar has been raised significantly. To remain competitive today, HFT operations must be executed on Field-Programmable Gate Arrays (FPGAs). This has brought down response time to as low as &lt;strong&gt;42 nanoseconds&lt;/strong&gt;.&lt;/p&gt;
&lt;p&gt;Two interesting strategies that try to beat others to be first are scout orders and speculative triggering. Small &lt;strong&gt;scout orders&lt;/strong&gt; (a small loss worth taking) are used to detect price changes microseconds before the competition, and then trade on them. &lt;strong&gt;Speculative triggering&lt;/strong&gt; is a technique where HFT firms employ custom packet processing in FPGAs. This approach preemptively guesses the correct trading action based on the incoming bits and initiates the sending of response bits even before the full incoming packet has been processed. If the guess turns out to be incorrect, the final bits of the response checksum are deliberately altered to ensure that the exchange's network stack discards the packet.&lt;/p&gt;
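&lt;p&gt;The checksum trick can be sketched in plain Python (a conceptual toy: real implementations do this in FPGA logic on raw Ethernet frames, and &lt;code&gt;zlib.crc32&lt;/code&gt; stands in here for the frame check sequence):&lt;/p&gt;

```python
import zlib

def build_response(payload: bytes, guess_correct: bool) -> bytes:
    # The FPGA starts streaming the response before the incoming packet is
    # fully parsed. The trailing checksum is the last thing on the wire, so a
    # wrong guess can still be "aborted" by corrupting it: the receiving
    # network stack drops frames whose checksum does not match.
    crc = zlib.crc32(payload)
    if not guess_correct:
        crc ^= 0xFFFFFFFF  # deliberately invalidate the checksum
    return payload + crc.to_bytes(4, 'big')

def receiver_accepts(frame: bytes) -> bool:
    payload, crc = frame[:-4], int.from_bytes(frame[-4:], 'big')
    return zlib.crc32(payload) == crc

order = b'BUY 100 ES'
print(receiver_accepts(build_response(order, guess_correct=True)))   # True
print(receiver_accepts(build_response(order, guess_correct=False)))  # False
```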
&lt;p&gt;&lt;img src="/images/fpga.jpg" alt="FPGA" style="width: 600px;"/&gt;&lt;/p&gt;
&lt;h2&gt;Market Structure, Regulation, and Effects&lt;/h2&gt;
&lt;p&gt;The datacenter houses key components such as trading firms' systems, the exchange's order gateway, and matching engine. When this engine finds a matching pair of orders from two firms, it sends a &lt;strong&gt;fill&lt;/strong&gt; message. If there's no match, a &lt;strong&gt;confirm&lt;/strong&gt; message is returned. Trading firms also have the flexibility to send &lt;strong&gt;cancel&lt;/strong&gt; or &lt;strong&gt;modify&lt;/strong&gt; messages to their existing orders.&lt;/p&gt;
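&lt;p&gt;A minimal sketch of this message flow (a toy price-time matching engine of my own, not any real exchange's logic):&lt;/p&gt;

```python
import collections

# toy matching engine: an incoming order either crosses a resting order
# (a "fill" message to both firms) or rests in the book (a "confirm" message)
book = {'buy': collections.deque(), 'sell': collections.deque()}

def crosses(side, incoming, resting):
    # a buy crosses a resting sell at or below its price, and vice versa
    return incoming >= resting if side == 'buy' else resting >= incoming

def submit(side, price, firm):
    other = 'sell' if side == 'buy' else 'buy'
    if book[other] and crosses(side, price, book[other][0][0]):
        _, resting_firm = book[other].popleft()
        return [('fill', firm), ('fill', resting_firm)]  # matched: fill to both
    book[side].append((price, firm))
    return [('confirm', firm)]                           # no match: order rests

print(submit('sell', 100.25, 'HFT-A'))  # [('confirm', 'HFT-A')]
print(submit('buy', 100.25, 'BANK-B'))  # fills against HFT-A's resting offer
```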
&lt;p&gt;Trading in HFT is a "cat and mouse game," where firms try to decode the logic behind other trading algorithms to anticipate their moves and trap them for profit. This involves offline analysis of massive amounts of anonymized market data to statistically identify algorithmic behaviors. One possible strategy is &lt;strong&gt;market impact trading&lt;/strong&gt;, where the algorithm itself places orders that influence market prices. Firms anticipate the reactions of other algorithms to these price moves and exploit them before canceling their original orders. &lt;strong&gt;Sweeping&lt;/strong&gt; involves selling against existing bids to lower prices, causing smaller firms to liquidate their holdings, which are then purchased by the sweeper at the lower price. Finally, &lt;strong&gt;spoofing&lt;/strong&gt; entails sending large sell offers above the market price, along with a smaller buy order at a lower price. This temporary imbalance drives the price down, enabling the buy order to succeed; the original spoof orders are then canceled, and the market readjusts. These strategies often blur ethical lines and are subject to regulatory scrutiny, as they can manipulate market dynamics to the advantage of HFT firms.&lt;/p&gt;
&lt;p&gt;&lt;img src="/images/spoofing.jpg" alt="Spoofing" style="width: 600px;"/&gt;&lt;/p&gt;
&lt;p&gt;Exchanges have attempted to level the playing field by introducing protective mechanisms like a minimum quote lifespan of 250ms to counteract spoofing. Large market participants often have the ability to lobby exchanges to block entities that engage in predatory trading activities against them.&lt;/p&gt;
&lt;h2&gt;Conclusion&lt;/h2&gt;
&lt;p&gt;The world of high-frequency trading is a high-stakes game that involves heavy investment in technology, innovation, and data analysis. From the architecture of data centers to the microsecond battles fought in the trading algorithms, HFT is a realm where the pursuit of speed and efficiency is relentless, and the financial implications are enormous.&lt;/p&gt;
&lt;p&gt;The layers of complexity and the speed at which changes occur in this field underscore the need for ongoing scrutiny, ethical considerations, and potentially, further regulatory oversight. It's a world where microseconds can mean the difference between millions gained or lost, and where the technological arms race shows no signs of slowing down.&lt;/p&gt;</content><category term="book"></category><category term="hft"></category><category term="trading"></category><category term="fpga"></category></entry><entry><title>Introduction to Marketing Mix Modeling</title><link href="https://bytepawn.com/introduction-to-marketing-mixed-modeling.html" rel="alternate"></link><published>2023-07-23T00:00:00+02:00</published><updated>2023-07-23T00:00:00+02:00</updated><author><name>Marton Trencseni</name></author><id>tag:bytepawn.com,2023-07-23:/introduction-to-marketing-mixed-modeling.html</id><summary type="html">&lt;p&gt;I describe the concept of Marketing Mix Modeling using Google's LightweightMMM library.&lt;br/&gt;&lt;br/&gt;&lt;img src="/images/mmm_1.png" alt="MMM attribution" style="width: 600px;"/&gt;&lt;/p&gt;</summary><content type="html">&lt;h2&gt;Introduction&lt;/h2&gt;
&lt;p&gt;Imagine a Data Scientist asked to evaluate the &lt;strong&gt;effectiveness of an online marketing campaign&lt;/strong&gt;. The &lt;strong&gt;gold standard&lt;/strong&gt; in this situation is to scope the campaign as a &lt;a href="https://en.wikipedia.org/wiki/Randomized_controlled_trial"&gt;Randomized Controlled Trial (RCT)&lt;/a&gt;, also known as an &lt;a href="https://bytepawn.com/tag/ab-testing.html"&gt;A/B test&lt;/a&gt;:&lt;/p&gt;
&lt;ol&gt;
&lt;li&gt;Create a control and a test group which are statistically equal by selecting a large enough sample size, stratification, CUPED, etc; see my earlier post &lt;a href="https://bytepawn.com/five-ways-to-reduce-variance-in-ab-testing.html"&gt;Five ways to reduce variance in A/B testing&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;Send the marketing campaign to the test group, and don't send it to control. Note that control may still receive other marketing campaign from the company, but not the one being tested.&lt;/li&gt;
&lt;li&gt;Compare the Overall Evaluation Metric (OEM), usually sales, between control and test, and report a % lift (and potentially extra sales).&lt;/li&gt;
&lt;/ol&gt;
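&lt;p&gt;Step 3 is simple arithmetic; a minimal sketch with made-up numbers:&lt;/p&gt;

```python
# % lift of the Overall Evaluation Metric (here: sales per member) in test vs control
def lift(test_sales, test_n, control_sales, control_n):
    test_rate = test_sales / test_n
    control_rate = control_sales / control_n
    return (test_rate / control_rate - 1) * 100

# hypothetical campaign with 50k members per group
print(f'{lift(1_260_000, 50_000, 1_200_000, 50_000):.1f}% lift')  # 5.0% lift
```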
&lt;p&gt;This approach works great if we want to evaluate a single campaign, and if the Data Scientist is involved before the campaign runs, so she can re-scope it as an A/B test. But what if the question is about the overall effectiveness of marketing (not just online, but also TV, radio, print, billboards, etc.), and over a longer period of time? In this case, there are still two options for our Data Scientist:&lt;/p&gt;
&lt;ol&gt;
&lt;li&gt;For the online marketing component, the Data Scientist can recommend to scope it as a large, year-long test. This is likely to get strong pushback from the business, since not sending marketing to even 10% of customers (the control group) could negatively impact sales and cause the company to miss targets. Another difficulty may be technically implementing a "block" for the control group in a large, complex organization (with many business units and divisions spread across multiple countries).&lt;/li&gt;
&lt;li&gt;The above approach of a large, year-long test doesn't work for offline marketing such as TV, radio, print, billboards, etc. because here randomization is not possible. We cannot tell people &lt;em&gt;"you're in our control group, close your eyes and ears if you see or hear one of our ads".&lt;/em&gt; Also, in an RCT, the experimental units are not supposed to know which variant or group they are in, since that in itself may affect their behaviour and bias the results. The only way to test this would be if the Data Scientist would have access to a parallel universe where the company doesn't do any marketing for a year (but just this year), and then compare the OEM...&lt;/li&gt;
&lt;/ol&gt;
&lt;p&gt;So, without access to parallel universes, what can the Data Scientist do? The industry answer is &lt;a href="https://en.wikipedia.org/wiki/Marketing_mix_modeling"&gt;Marketing Mix Modeling (MMM)&lt;/a&gt;:&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;Marketing mix modeling (MMM) is statistical analysis such as multivariate regressions on sales and marketing time series data to estimate the impact of various marketing tactics (marketing mix) on sales and then forecast the impact of future sets of tactics. It is often used to optimize advertising mix and promotional tactics with respect to sales revenue or profit.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;The answer is to take ad spend data, promotions data, competitor ad spend data, competitor promotions, revenues, macro trends, etc. (everything we can get) and look for correlations, i.e. build a model that explains historic sales as the sum of:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;a seasonal baseline (the sales the model believes we would have without any marketing)&lt;/li&gt;
&lt;li&gt;marketing activities (ad campaigns, pricing, promotions)&lt;/li&gt;
&lt;li&gt;competitor effects (competitor's campaigns, pricings, promotions)&lt;/li&gt;
&lt;li&gt;macro influences (economy, exchange rates, Covid-19 lockdowns)&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;Some points worth noting:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;the most difficult task is likely collecting reliable and accurate input data, both about the company's own marketing and sales, as well as competitors'&lt;/li&gt;
&lt;li&gt;once some input data is available, we can use existing techniques and libraries (see below) to quickly get a model and answers to basic questions&lt;/li&gt;
&lt;li&gt;MMM is typically run on weekly-level observations&lt;/li&gt;
&lt;li&gt;extra time and money should be spent on getting more and better quality data, not building more complicated models!&lt;/li&gt;
&lt;li&gt;if the model is built on low quality data then the results will be directional at best&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;if our inputs are garbage, so will our conclusions — Garbage In, Garbage Out (GIGO)&lt;/strong&gt;&lt;/li&gt;
&lt;/ul&gt;
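&lt;p&gt;A deliberately tiny version of this idea: decomposing sales into a baseline plus per-channel effects with an ordinary least-squares fit on synthetic data. Real MMM libraries use Bayesian models with saturation and lag transforms; this sketch only shows the decomposition:&lt;/p&gt;

```python
import numpy as np

rng = np.random.default_rng(0)
weeks = 104                                 # MMM typically uses weekly data
tv = rng.uniform(0, 100, weeks)             # weekly TV spend
online = rng.uniform(0, 100, weeks)         # weekly online spend
# synthetic ground truth: baseline 500, TV effect 2.0, online effect 3.5
sales = 500 + 2.0 * tv + 3.5 * online + rng.normal(0, 10, weeks)

# regress sales on an intercept (the baseline) and the two channels
X = np.column_stack([np.ones(weeks), tv, online])
coef, *_ = np.linalg.lstsq(X, sales, rcond=None)
print(np.round(coef, 1))  # recovers roughly [500, 2.0, 3.5]
```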
&lt;p&gt;&lt;img src="/images/mmm_0.png" alt="MMM" style="width: 600px;"/&gt;&lt;/p&gt;
&lt;h2&gt;The Marketer&lt;/h2&gt;
&lt;p&gt;From the perspective of the user — the Marketer — the objective is to understand how different marketing activities contribute to the outcome and then &lt;strong&gt;use that knowledge to optimize the allocation of marketing resources&lt;/strong&gt;. MMM is particularly useful in multi-channel marketing strategies where various marketing tactics work in unison. These can range from traditional methods like TV, radio, print ads, to modern digital campaigns such as social media, email, and content marketing. By understanding the contribution of each of these elements, a marketer can make informed decisions on where to invest, what to optimize, and where to cut back.&lt;/p&gt;
&lt;p&gt;The concept of &lt;strong&gt;MMM can be likened to a chef perfecting a recipe&lt;/strong&gt;. Each marketing input represents an ingredient in the recipe. Just as a chef adjusts the ingredients to perfect the taste, a marketing analyst adjusts the marketing inputs to optimize sales. By evaluating historical data, MMM helps identify the effectiveness of each marketing input, providing insights on how to allocate budgets effectively and forecast future results.&lt;/p&gt;
&lt;h2&gt;Core MMM inputs&lt;/h2&gt;
&lt;p&gt;At the heart of a Marketing Mix Model are a variety of inputs that capture the full spectrum of marketing activities and external factors influencing sales or market share:&lt;/p&gt;
&lt;ol&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;Marketing variables&lt;/strong&gt;: These are the different marketing efforts put forth by a company across various channels. This can include spending on television ads, online marketing, radio, print, social media, direct mail, email campaigns, SEO, SEM, PR, promotions, sponsorships, and any other marketing channel the company is utilizing.&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;Sales data&lt;/strong&gt;: This is the dependent variable that the model attempts to explain. It could be total sales, market share, or any other key performance indicator (KPI) that the organization uses to measure success.&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;Economic indicators&lt;/strong&gt;: This includes factors like inflation rate, unemployment rate, GDP growth, and consumer sentiment, which can impact consumer buying behavior and in turn affect sales. &lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;Competitive information&lt;/strong&gt;: Details about competitor activities, such as their marketing spends, pricing changes, product launches, promotions, can influence your sales and should be included in the model.&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;Seasonality and trend variables&lt;/strong&gt;: These capture predictable fluctuations in sales that are due to the time of year (like holiday seasons) or broader market trends. &lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;Pricing and distribution information&lt;/strong&gt;: Details on the pricing strategy and the distribution reach (number of stores, online presence, etc.) also play a critical role in influencing sales.&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;Product variables&lt;/strong&gt;: Changes in product characteristics or the introduction of new products can impact sales. This can include changes in product features, packaging, branding, or the introduction of new SKUs.&lt;/p&gt;
&lt;/li&gt;
&lt;/ol&gt;
&lt;p&gt;These inputs help quantify the impact of each marketing activity on sales, allowing for more efficient allocation of marketing resources. By adjusting these inputs, companies can predict the potential impact of different marketing strategies on their overall sales performance.&lt;/p&gt;
&lt;h2&gt;Core MMM outputs&lt;/h2&gt;
&lt;p&gt;The Marketing Mix Model generates several important outputs that aid in understanding the performance and impact of marketing efforts:&lt;/p&gt;
&lt;ol&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;Contribution by marketing input&lt;/strong&gt;: This output quantifies the impact of each marketing channel on the target variable (e.g., sales or market share). This provides a breakdown of how much each marketing activity contributed to the total sales.&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;Return on Investment (ROI)&lt;/strong&gt;: ROI measures the profitability of each marketing activity by comparing the amount of profit generated to the cost of the activity. This allows marketers to understand which activities are most profitable.&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;Elasticities&lt;/strong&gt;: This is a measure of how sensitive the target variable (e.g., sales) is to a change in a marketing input. For example, if the elasticity of sales with respect to TV advertising is 0.8, it means that a 1% increase in TV advertising spend would lead to a 0.8% increase in sales, all else being equal.&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;Baseline sales and incremental sales&lt;/strong&gt;: Baseline sales are the sales that would have been achieved without any marketing activity, while incremental sales are the additional sales gained due to marketing efforts.&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;Synergies&lt;/strong&gt;: Marketing activities often interact with each other, and the model can measure these synergies. For instance, a TV ad campaign may make an online campaign more effective, and the model would quantify this effect.&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;What-if scenarios/forecasts&lt;/strong&gt;: Using the model, marketers can predict the impact of future marketing plans on sales. These scenarios can provide insights on how changes in the marketing mix can affect outcomes.&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;Optimal spending levels&lt;/strong&gt;: Based on the ROI, the model can suggest the optimal level of spending for each marketing channel to maximize profit or sales.&lt;/p&gt;
&lt;/li&gt;
&lt;/ol&gt;
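&lt;p&gt;The elasticity output is easy to make concrete with toy numbers (a linear approximation, valid for small spend changes):&lt;/p&gt;

```python
# elasticity: % change in sales per 1% change in a marketing input
def projected_sales(current_sales, elasticity, spend_change_pct):
    return current_sales * (1 + elasticity * spend_change_pct / 100)

# if TV elasticity is 0.8, a 10% bump in TV spend on $1M of sales
# yields an 0.8 * 10% = 8% lift, i.e. ~$1,080,000
print(projected_sales(1_000_000, 0.8, 10))
```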
&lt;p&gt;These outputs enable marketers to evaluate their past marketing efforts and guide future investment decisions. They can also help identify areas of improvement, and optimize the marketing mix for better performance.&lt;/p&gt;
&lt;h2&gt;Google's MMM library&lt;/h2&gt;
&lt;p&gt;To make MMM more concrete, let's look at Google's open-source &lt;a href="https://github.com/google/lightweight_mmm"&gt;Lightweight MMM&lt;/a&gt; (LMMM) library. LMMM fits a Bayesian additive model to the data in an attempt to decompose the target variable (e.g. sales) into a baseline (including a trend and seasonality) and the effect of marketing channels:&lt;/p&gt;
&lt;p&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; $ kpi = \alpha + trend + seasonality + media\ channels + other\ factors $&lt;/p&gt;
&lt;p&gt;Where $kpi$ is typically the volume or value of sales per time period, $\alpha$ is the model intercept, $trend$ is a flexible non-linear function that captures trends in the data, $seasonality$ is a sinusoidal function with configurable parameters that flexibly captures seasonal trends, $media\ channels$ is a matrix of different media channel activity (typically impressions or costs per time period) which receives transformations depending on the model used (see Media Saturation and Lagging section) and $other\ factors$ is a matrix of other factors that could influence sales. The full model is explained in the &lt;a href="https://lightweight-mmm.readthedocs.io/en/latest/models.html"&gt;model documentation&lt;/a&gt;.&lt;/p&gt;
&lt;p&gt;The below code runs LMMM with synthetic sample data (&lt;a href="https://github.com/google/lightweight_mmm/blob/main/examples/simple_end_to_end_demo.ipynb"&gt;the code is from the LMMM Github repo&lt;/a&gt;):&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="nn"&gt;jax.numpy&lt;/span&gt; &lt;span class="kn"&gt;as&lt;/span&gt; &lt;span class="nn"&gt;jnp&lt;/span&gt;
&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="nn"&gt;numpyro&lt;/span&gt;
&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="nn"&gt;lightweight_mmm&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;lightweight_mmm&lt;/span&gt;
&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="nn"&gt;lightweight_mmm&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;optimize_media&lt;/span&gt;
&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="nn"&gt;lightweight_mmm&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;plot&lt;/span&gt;
&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="nn"&gt;lightweight_mmm&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;preprocessing&lt;/span&gt;
&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="nn"&gt;lightweight_mmm&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;utils&lt;/span&gt;

&lt;span class="n"&gt;SEED&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;105&lt;/span&gt;
&lt;span class="n"&gt;data_size&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;104&lt;/span&gt; &lt;span class="o"&gt;+&lt;/span&gt; &lt;span class="mi"&gt;13&lt;/span&gt;
&lt;span class="n"&gt;n_media_channels&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;3&lt;/span&gt;
&lt;span class="n"&gt;n_extra_features&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt;
&lt;span class="n"&gt;number_warmup&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mi"&gt;1000&lt;/span&gt;
&lt;span class="n"&gt;number_samples&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mi"&gt;1000&lt;/span&gt;

&lt;span class="n"&gt;media_data&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;extra_features&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;target&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;costs&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;utils&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;simulate_dummy_data&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
    &lt;span class="n"&gt;data_size&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;data_size&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;n_media_channels&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;n_media_channels&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;n_extra_features&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;n_extra_features&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;


&lt;span class="n"&gt;split_point&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;data_size&lt;/span&gt; &lt;span class="o"&gt;-&lt;/span&gt; &lt;span class="mi"&gt;13&lt;/span&gt;
&lt;span class="n"&gt;media_data_train&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;media_data&lt;/span&gt;&lt;span class="p"&gt;[:&lt;/span&gt;&lt;span class="n"&gt;split_point&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="o"&gt;...&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;
&lt;span class="n"&gt;media_data_test&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;media_data&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;split_point&lt;/span&gt;&lt;span class="p"&gt;:,&lt;/span&gt; &lt;span class="o"&gt;...&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;
&lt;span class="n"&gt;extra_features_train&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;extra_features&lt;/span&gt;&lt;span class="p"&gt;[:&lt;/span&gt;&lt;span class="n"&gt;split_point&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="o"&gt;...&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;
&lt;span class="n"&gt;extra_features_test&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;extra_features&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;split_point&lt;/span&gt;&lt;span class="p"&gt;:,&lt;/span&gt; &lt;span class="o"&gt;...&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;
&lt;span class="n"&gt;target_train&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;target&lt;/span&gt;&lt;span class="p"&gt;[:&lt;/span&gt;&lt;span class="n"&gt;split_point&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;
&lt;/pre&gt;&lt;/div&gt;
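&lt;p&gt;To make the shapes concrete, here is a quick numpy sketch (not using LMMM itself; the random numbers are just stand-ins for what &lt;code&gt;simulate_dummy_data()&lt;/code&gt; returns) of the arrays involved:&lt;/p&gt;

```python
import numpy as np

# Stand-in for utils.simulate_dummy_data(): random weekly series with the
# same shapes LMMM expects, with time along axis 0.
data_size, n_media_channels, n_extra_features = 104 + 13, 3, 1
rng = np.random.default_rng(105)

media_data = rng.uniform(0, 100, size=(data_size, n_media_channels))  # spend per channel per week
extra_features = rng.normal(size=(data_size, n_extra_features))       # e.g. an inflation index
target = rng.uniform(1000, 2000, size=data_size)                      # e.g. weekly sales
costs = media_data.sum(axis=0)                                        # total spend per channel

print(media_data.shape, extra_features.shape, target.shape, costs.shape)
```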


&lt;p&gt;Here, we have 3 media channels and 1 extra feature. An extra feature is just an additional timeseries that we think could help the model; it could be something like inflation or a Covid-19 factor. Each of these is a timeseries of &lt;code&gt;float&lt;/code&gt;s, with weekly or daily granularity. We can then fit the model:&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;span class="n"&gt;media_scaler&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;preprocessing&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;CustomScaler&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;divide_operation&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;jnp&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;mean&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="n"&gt;extra_features_scaler&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;preprocessing&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;CustomScaler&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;divide_operation&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;jnp&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;mean&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="n"&gt;target_scaler&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;preprocessing&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;CustomScaler&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;divide_operation&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;jnp&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;mean&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="n"&gt;cost_scaler&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;preprocessing&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;CustomScaler&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;divide_operation&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;jnp&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;mean&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;multiply_by&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mf"&gt;0.15&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="n"&gt;media_data_train&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;media_scaler&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;fit_transform&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;media_data_train&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="n"&gt;extra_features_train&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;extra_features_scaler&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;fit_transform&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;extra_features_train&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="n"&gt;target_train&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;target_scaler&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;fit_transform&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;target_train&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="n"&gt;costs&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;cost_scaler&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;fit_transform&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;costs&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="n"&gt;mmm&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;lightweight_mmm&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;LightweightMMM&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;model_name&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="s2"&gt;&amp;quot;carryover&amp;quot;&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="n"&gt;mmm&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;fit&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
    &lt;span class="n"&gt;media&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;media_data_train&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;media_prior&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;costs&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;target&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;target_train&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;extra_features&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;extra_features_train&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;number_warmup&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;number_warmup&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;number_samples&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;number_samples&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;seed&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;SEED&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/pre&gt;&lt;/div&gt;
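&lt;p&gt;Note the &lt;code&gt;model_name="carryover"&lt;/code&gt; argument: it selects a carryover (adstock) media transform, which lets this week's spend keep affecting sales in future weeks. As a rough illustration of the idea only (LMMM's actual transform is a parameterized convolution whose parameters are learned during fitting), a minimal geometric-decay version looks like this:&lt;/p&gt;

```python
import numpy as np

def carryover(spend, retention=0.5):
    """Geometric adstock: this week's effective media pressure is this week's
    spend plus a decayed fraction of last week's effect. A toy sketch of the
    idea, not LMMM's actual carryover transform."""
    effect = np.zeros(len(spend))
    for t in range(len(spend)):
        effect[t] = spend[t] + (retention * effect[t - 1] if t else 0.0)
    return effect

# A single burst of spend decays geometrically: 100, 50, 25, 12.5
print(carryover(np.array([100.0, 0.0, 0.0, 0.0])))
```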


&lt;p&gt;Now we can see how LMMM explains our target variable (e.g. sales) using a generic baseline and the contributions from our marketing channels (3 channels in this example), i.e. how sales is decomposed into a baseline plus 3 additional factors, one per marketing channel:&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;span class="n"&gt;media_contribution&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;roi_hat&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;mmm&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;get_posterior_metrics&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
    &lt;span class="n"&gt;target_scaler&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;target_scaler&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;cost_scaler&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;cost_scaler&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="n"&gt;plot&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;plot_media_baseline_contribution_area_plot&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;media_mix_model&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;mmm&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;target_scaler&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;target_scaler&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;fig_size&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;30&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="mi"&gt;10&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;
&lt;/pre&gt;&lt;/div&gt;


&lt;p&gt;&lt;img src="/images/mmm_1.png" alt="MMM attribution" style="width: 800px;"/&gt;&lt;/p&gt;
&lt;p&gt;Next, what % of sales comes from each marketing channel (unfortunately the colors don't match the chart above):&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;span class="n"&gt;plot&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;plot_bars_media_metrics&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;metric&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;media_contribution&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;metric_name&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="s2"&gt;&amp;quot;Media Contribution Percentage&amp;quot;&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/pre&gt;&lt;/div&gt;


&lt;p&gt;&lt;img src="/images/mmm_3.png" alt="MMM attribution" style="width: 400px;"/&gt;&lt;/p&gt;
&lt;p&gt;Each marketing channel's return on investment (ROI), i.e. its effectiveness:&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;span class="n"&gt;plot&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;plot_bars_media_metrics&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;metric&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;roi_hat&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;metric_name&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="s2"&gt;&amp;quot;ROI hat&amp;quot;&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/pre&gt;&lt;/div&gt;


&lt;p&gt;&lt;img src="/images/mmm_2.png" alt="MMM attribution" style="width: 400px;"/&gt;&lt;/p&gt;
&lt;p&gt;Last, the response curves for the marketing channels, i.e. how their effectiveness varies with spend:&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;span class="n"&gt;plot&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;plot_response_curves&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;media_mix_model&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;mmm&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;target_scaler&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;target_scaler&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;seed&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;SEED&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/pre&gt;&lt;/div&gt;


&lt;p&gt;&lt;img src="/images/mmm_4.png" alt="MMM attribution" style="width: 400px;"/&gt;&lt;/p&gt;
&lt;p&gt;In this synthetic example, the response is linear per channel, so every additional \$ spent on marketing gets us the same additional \$ in sales, although with a different ROI (slope) per channel.&lt;/p&gt;
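&lt;p&gt;A toy check of this reading, with made-up ROIs: if a channel's response is linear, fitting a straight line to its response curve recovers the channel's ROI as the slope:&lt;/p&gt;

```python
import numpy as np

# Made-up per-channel ROIs: each extra $1 of spend returns this many $ of sales.
rois = np.array([1.2, 0.8, 2.5])
spend = np.linspace(0, 100, 50)

# Linear response curves: sales_i(spend) = roi_i * spend.
responses = np.outer(rois, spend)

# The fitted slope of each curve equals the channel's ROI: 1.2, 0.8, 2.5
slopes = np.array([np.polyfit(spend, r, 1)[0] for r in responses])
print(np.round(slopes, 2))
```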
&lt;h2&gt;Conclusion&lt;/h2&gt;
&lt;p&gt;Google's LMMM library does a lot more, both in terms of handling inputs and outputs and in terms of modeling, but for this introduction I will stop here. In the next post I will look at articles evaluating the accuracy of modeling advertising effectiveness.&lt;/p&gt;
&lt;p&gt;Other good sources on MMM:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;&lt;a href="https://github.com/facebookexperimental/Robyn"&gt;Meta: Robyn MMM library&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://facebookexperimental.github.io/Robyn/docs/analysts-guide-to-MMM/"&gt;Meta: Analyst's guide to MMM&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://static.googleusercontent.com/media/research.google.com/en//pubs/archive/45998.pdf"&gt;Google: Challenges And Opportunities In Media Mix Modeling (paper)&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://medium.com/towards-data-science/media-mix-modeling-how-to-measure-the-effectiveness-of-advertising-with-python-lightweightmmm-b6d7de110ae6"&gt;TorwardsDataScience: How to Measure the Effectiveness of Advertising with Python &amp;amp; LMMM&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://getrecast.com/blog/"&gt;Recast blog (MMM vendor)&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://sellforte.com/en/blog/"&gt;Sellforte blog (MMM vendor)&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;</content><category term="mmm"></category><category term="marketing"></category><category term="mixed"></category><category term="model"></category><category term="lightweight_mmm"></category><category term="google"></category><category term="python"></category></entry><entry><title>Leadership models V: The Hero's Journey</title><link href="https://bytepawn.com/leadership-models-the-heros-journey.html" rel="alternate"></link><published>2023-06-23T00:00:00+02:00</published><updated>2023-06-23T00:00:00+02:00</updated><author><name>Marton Trencseni</name></author><id>tag:bytepawn.com,2023-06-23:/leadership-models-the-heros-journey.html</id><summary type="html">&lt;p&gt;The Hero's Journey, or Monomyth, is a narrative pattern identified by scholar Joseph Campbell that appears across a wide range of cultures and eras, and is also a useful mental model in Leadership.&lt;br/&gt;&lt;br/&gt; &lt;img src="https://writerswrite.co.za/wp-content/uploads/2017/02/heros_journey.png" alt="Peter principle" style="width: 400px;"/&gt;&lt;/p&gt;</summary><content type="html">&lt;h2&gt;Introduction&lt;/h2&gt;
&lt;p&gt;This is the fifth article in my series on useful mental models in leadership and self-management:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;&lt;a href="https://bytepawn.com/leadership-models-iceberg-model-six-thinking-hats-trust-equation-circle-of-influence-ooda-loop.html#leadership-models-iceberg-model-six-thinking-hats-trust-equation-circle-of-influence-ooda-loop"&gt;Leadership models I: Iceberg Model, Six Thinking Hats, Trust Equation, Circle of influence, OODA Loop&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://bytepawn.com/leadership-models-growth-mindset-eisenhower-matrix-tuckman-model-cynefin-framework-scarf-model.html#leadership-models-growth-mindset-eisenhower-matrix-tuckman-model-cynefin-framework-scarf-model"&gt;Leadership models II: Growth Mindset, Eisenhower Matrix, Tuckman Model, Cynefin Framework, SCARF Model&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://bytepawn.com/leadership-models-first-principles-thinking.html#leadership-models-first-principles-thinking"&gt;Leadership models III: First Principles Thinking&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://bytepawn.com/leadership-models-pareto-principle-rumsfeld-matrix-servant-leadership.html"&gt;Leadership models IV: the Pareto Principle, the Peter Principle, the Rumsfeld Matrix, Servant Leadership and Pygmalion Effect, and Goleman's Emotional Intelligence Model&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;The &lt;a href="https://en.wikipedia.org/wiki/Hero%27s_journey"&gt;Hero's Journey&lt;/a&gt;, or &lt;strong&gt;Monomyth&lt;/strong&gt;, is a narrative pattern identified by scholar &lt;a href="https://en.wikipedia.org/wiki/Joseph_Campbell"&gt;Joseph Campbell&lt;/a&gt; that appears across a wide range of cultures and eras. At its heart, the Hero's Journey is a story of transformation, depicting the protagonist's path from the familiar to the unknown, through trials and revelations, and ultimately back to their world, forever changed. It's a blueprint for storytelling that not only speaks to our shared human experiences of growth, learning, and self-discovery, but also offers a profound metaphor for personal and professional development.&lt;/p&gt;
&lt;p&gt;&lt;img src="https://encrypted-tbn3.gstatic.com/images?q=tbn:ANd9GcQOijupa6Y8b_KvMFspbAHjAorIBqs5FFYu5BHVDJVKqokzRN9Q" alt="Joseph Campbell" style="width: 400px;"/&gt;&lt;/p&gt;
&lt;h2&gt;Call to Adventure&lt;/h2&gt;
&lt;p&gt;The Hero's Journey is a narrative structure common to countless stories across time and culture. This &lt;strong&gt;universal blueprint&lt;/strong&gt; traces the &lt;strong&gt;path of a hero&lt;/strong&gt; as they depart from the ordinary world, venture into a realm of fantastic wonders and daunting challenges, and &lt;strong&gt;return transformed&lt;/strong&gt;, bearing gifts for their society.&lt;/p&gt;
&lt;p&gt;From ancient mythological epics to modern cinematic blockbusters, the Hero's Journey pervades our collective storytelling. Its stages can be seen in the narrative arc of Gilgamesh, in the adventures of Odysseus, and in the celebrated saga of a young farm boy on Tatooine named &lt;strong&gt;Luke Skywalker&lt;/strong&gt;. The enduring appeal of this narrative pattern has been ascribed to its deep resonance with our shared human experience.&lt;/p&gt;
&lt;p&gt;Despite the fantastical settings and extraordinary events that often characterize these stories, the Hero's Journey &lt;strong&gt;reflects our own journeys through life&lt;/strong&gt;. It’s not just a plot device, but a &lt;strong&gt;metaphor for personal transformation, growth, and self-discovery&lt;/strong&gt;. As we accompany heroes like Luke Skywalker on their quests, we vicariously confront our own fears, overcome our own obstacles, and discover our own latent potential. The Hero's Journey, then, is more than just a story. It's a journey of the self, towards the self, a narrative pattern that echoes our universal human quest for meaning, purpose, and identity.&lt;/p&gt;
&lt;h2&gt;Mental model&lt;/h2&gt;
&lt;p&gt;The Hero's Journey, or the Monomyth, describes a universal pattern found in many narratives around the world. In relation to leadership, the Hero's Journey can serve as a useful model for leaders and managers in several ways:&lt;/p&gt;
&lt;ol&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;Self-understanding:&lt;/strong&gt; Leaders can compare their own personal and professional development to the stages of the Hero's Journey. Each stage can correspond to various aspects of a leader's journey, such as facing new challenges (Call to Adventure), hesitation in taking up new responsibilities (Refusal of the Call), acquiring new skills and knowledge, and so on. This can help leaders gain a better understanding of their own growth and development.&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;Development of empathy and understanding:&lt;/strong&gt; By viewing their team members' journeys through the lens of the Hero's Journey, leaders can gain insights into the personal and professional challenges that their team members may be facing. This can lead to more empathetic and effective leadership.&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;Vision and Strategy Development:&lt;/strong&gt; The Hero's Journey provides a narrative structure that leaders can use to craft their organizational vision and strategy. For instance, the Call to Adventure could represent the need for organizational change, while the Road of Trials could symbolize the challenges that the organization needs to overcome to achieve its goals.&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;Change Management:&lt;/strong&gt; The Hero's Journey can serve as a model for managing change. Each stage of the journey involves changes and challenges that need to be overcome, providing a useful analogy for the process of implementing organizational change.&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;Motivation and Inspiration:&lt;/strong&gt; The Hero's Journey, with its narrative of overcoming challenges and achieving a worthy goal, can serve as a powerful source of motivation and inspiration for both leaders and their team members.&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;Coaching and Mentoring:&lt;/strong&gt; The journey model can serve as a guide for coaching and mentoring, helping leaders to identify the stages of their team members' personal and professional development and provide appropriate support and guidance at each stage.&lt;/p&gt;
&lt;/li&gt;
&lt;/ol&gt;
&lt;p&gt;In sum, Campbell's Hero's Journey offers a universal narrative structure that can provide a rich source of insights and inspiration for leaders and managers.&lt;/p&gt;
&lt;h2&gt;Hero's Journey&lt;/h2&gt;
&lt;p&gt;Joseph Campbell's Hero's Journey is typically divided into three main sections: Departure, Adventure and Resolution. Each of these sections contains various stages:&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Episode I: Departure&lt;/strong&gt;&lt;/p&gt;
&lt;ol&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;Ordinary World&lt;/strong&gt;: This is the hero's normal world, where we learn crucial details about our hero, their true nature, capabilities, and outlook on life. This anchors the hero as a relatable, sympathetic character.&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;Call to Adventure&lt;/strong&gt;: The adventure begins with the disruption of the comfort of the hero's ordinary world. This is a challenge or quest that the hero is compelled to undertake.&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;Refusal of the Call&lt;/strong&gt;: Although the hero may be eager to accept the quest, at this stage, they will have fears that need overcoming. This refusal is often because of fear, duty, or insecurity.&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;Meeting with the Mentor&lt;/strong&gt;: The hero comes across a seasoned traveler of the worlds who gives them training, equipment, or advice that will help on the journey.&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;Crossing the Threshold&lt;/strong&gt;: This is the turning point where the hero commits wholeheartedly to the adventure and finally steps into the unknown, marking the end of the Departure phase.&lt;/p&gt;
&lt;/li&gt;
&lt;/ol&gt;
&lt;p&gt;&lt;strong&gt;Episode II: Adventure&lt;/strong&gt;&lt;/p&gt;
&lt;ol&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;Tests, Allies, Enemies&lt;/strong&gt;: Now in a world of problems and challenges, the hero is tested, makes allies or enemies, and learns the rules of this new world.&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;Approach to the Innermost Cave&lt;/strong&gt;: The hero arrives at the edge of a dangerous place, often deep underground, where the object of the quest is hidden.&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;Ordeal&lt;/strong&gt;: The hero faces the greatest challenge yet, with the potential for death. This is a critical moment in the hero's journey, and everything is at stake.&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;Reward&lt;/strong&gt;: After facing death, the hero achieves the quest's goal or part of it, taking possession of the treasure or the reward that they've been seeking.&lt;/p&gt;
&lt;/li&gt;
&lt;/ol&gt;
&lt;p&gt;&lt;strong&gt;Episode III: Resolution&lt;/strong&gt;&lt;/p&gt;
&lt;ol&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;Road Back&lt;/strong&gt;: The hero is driven to complete the adventure, yet must deal with the consequences of confronting the dark forces in the ordeal.&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;Resurrection&lt;/strong&gt;: This is the climax, where the hero has their final and most dangerous encounter with death. The hero must use everything they've learned to overcome their most difficult challenge.&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;Return with Special Knowledge&lt;/strong&gt;: The hero returns to their original world, but the hero's journey has changed them. This change often manifests as a newfound knowledge or insight that the hero can now use to improve their world. This marks the resolution of the journey, where the hero emerges transformed, symbolizing their mastery of life.&lt;/p&gt;
&lt;/li&gt;
&lt;/ol&gt;
&lt;p&gt;By following this template, you can easily construct a captivating narrative that engages audiences due to its familiarity, emotional resonance, and satisfying resolution.&lt;/p&gt;
&lt;p&gt;&lt;img src="https://writerswrite.co.za/wp-content/uploads/2017/02/heros_journey.png" alt="Peter principle" style="width: 600px;"/&gt;&lt;/p&gt;
&lt;h2&gt;Luke Skywalker&lt;/h2&gt;
&lt;p&gt;In the annals of contemporary cinema, there are few stories as universally recognizable as the saga of Luke Skywalker in the Star Wars franchise. Not only has the character become a cultural icon, but his journey also serves as a textbook example of Joseph Campbell's "Hero's Journey." Skywalker's journey from a young farm boy on Tatooine to the savior of the galaxy follows Campbell's model almost to a tee, providing an intriguing case study of this universal narrative structure.&lt;/p&gt;
&lt;p&gt;&lt;img src="https://tvline.com/wp-content/uploads/2023/03/star-wars-r5-d4-mandalorian.jpg" alt="Luke Skywalker" style="width: 600px;"/&gt;&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Ordinary World&lt;/strong&gt;: Luke begins his journey on the desert planet of Tatooine. He lives a mundane life as a farm boy, longing for adventure.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Call to Adventure&lt;/strong&gt;: When Luke stumbles upon two droids carrying secret plans that can help rebel forces defeat the oppressive Galactic Empire, he realizes there is a world beyond his desert home that requires his help.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Refusal of the Call&lt;/strong&gt;: Initially, Luke hesitates to join the Rebellion. He feels he can't leave his family and their farm, despite his desire for adventure.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Meeting with the Mentor&lt;/strong&gt;: Luke meets Obi-Wan Kenobi, a Jedi Knight who begins to teach him about the Force and his Jedi lineage, providing guidance and the lightsaber that belonged to his father.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Crossing the Threshold&lt;/strong&gt;: Luke's threshold moment comes when his aunt and uncle are killed by the Empire. With nothing left for him on Tatooine, he decides to join Obi-Wan on his mission to deliver the droid R2-D2 and the Death Star plans to the rebel forces.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Tests, Allies, Enemies&lt;/strong&gt;: Luke undergoes several tests, forms alliances, and faces enemies. He learns to pilot a spacecraft, meets Han Solo and Princess Leia, and faces conflicts with the forces of the Empire, including the menacing Darth Vader.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Approach to the Innermost Cave&lt;/strong&gt;: The innermost cave for Luke is the Death Star, where he embarks on a rescue mission to save Princess Leia.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Ordeal&lt;/strong&gt;: His ordeal is the climactic Battle of Yavin, where Luke, relying on his connection to the Force, has to destroy the Death Star.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Reward&lt;/strong&gt;: With the Death Star destroyed, Luke and his friends are celebrated as heroes. Luke has proven himself a worthy Jedi and a formidable force against the Empire.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Road Back&lt;/strong&gt;: The road back is the ongoing battle against the Empire. Although the Death Star is destroyed, the Empire remains a powerful threat.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Resurrection&lt;/strong&gt;: Luke faces a significant ordeal in "The Empire Strikes Back" when he battles Darth Vader and learns the truth about his lineage. It's a form of death and rebirth for his character as he struggles with this truth and continues his Jedi training.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Return with Special Knowledge&lt;/strong&gt;: By "Return of the Jedi", Luke has evolved significantly. He is now a skilled Jedi Knight and has come to terms with the revelation that Darth Vader is his father. He possesses the special knowledge of the Force and uses it to confront the Emperor, leading to the ultimate defeat of the Empire. He brings hope back to the galaxy, symbolizing his complete transformation from farm boy to galactic hero.&lt;/p&gt;
&lt;h2&gt;Leadership&lt;/h2&gt;
&lt;p&gt;In the realm of &lt;strong&gt;Leadership&lt;/strong&gt;, the Hero's Journey can serve as an invaluable roadmap for personal development and transformation. The leader's "Ordinary World" could be their initial state of managing without much strategic thought or direction. The "Call to Adventure" represents the realization that more effective leadership is required, and "Refusal of the Call" shows the initial resistance to adopting new leadership styles or methods. "Meeting with the Mentor" could be a period of learning and development, perhaps through mentorship or leadership training. As the leader "Crosses the Threshold", they begin to apply what they've learned, leading their team with renewed vision and skills. The trials they face and allies they gather represent the everyday challenges of leadership and the team members who support them.&lt;/p&gt;
&lt;p&gt;When it comes to &lt;strong&gt;Team Management&lt;/strong&gt;, the Hero's Journey provides an apt metaphor for team development and goal achievement. The "Ordinary World" is the team's initial state, while the "Call to Adventure" represents the team's project or mission. As the team moves through the stages of the journey, they face various tests, in the form of project hurdles and deadlines, and make allies, in the form of supportive team members or other teams within the organization. The "Approach to the Innermost Cave", "Ordeal", and "Reward" stages reflect the process of tackling the project's most significant challenges and ultimately completing the project successfully. The "Road Back", "Resurrection", and "Return with Special Knowledge" stages can represent the team's review of the project, their learning from any mistakes made, and the application of these lessons to future projects.&lt;/p&gt;
&lt;p&gt;In driving &lt;strong&gt;Change Management&lt;/strong&gt;, the Hero's Journey can illustrate the process of implementing and managing change. The "Ordinary World" represents the initial state of the organization, while the "Call to Adventure" is the realization that change is necessary. The "Refusal of the Call" symbolizes initial resistance to change, while "Meeting with the Mentor" might represent consultation with change management experts or gathering information about how to manage the change. As the organization "Crosses the Threshold", they officially initiate the change. The ensuing stages, such as "Tests/Allies/Enemies", "Approach the Innermost Cave", and "Ordeal", symbolize the challenges encountered during the change process, while "Reward" represents the successful implementation of the change. Finally, "The Road Back", "Resurrection", and "Return with Special Knowledge" signify the process of solidifying the change, dealing with any lingering issues, and applying the lessons learned to future change initiatives.&lt;/p&gt;
&lt;p&gt;&lt;img src="https://assets1.ignimgs.com/thumbs/userUploaded/2020/1/7/chewbacca-medalblogroll-1578434736380.jpg" alt="Peter principle" style="width: 600px;"/&gt;&lt;/p&gt;
&lt;h2&gt;Conclusion&lt;/h2&gt;
&lt;p&gt;The Hero's Journey, as expounded by Joseph Campbell, remains a compelling framework for understanding not only the structure of countless stories from around the world, but also our personal and professional journeys. Whether you're a leader seeking to inspire your team, an educator shaping young minds, or a creative professional striving to craft a captivating narrative, the stages of the Hero's Journey can provide profound insights. It’s a timeless blueprint for transformation, highlighting the challenges we face, the mentors and helpers we encounter, and the ultimate boon we seek in our quest for growth and self-realization. It speaks to the enduring human capacity for change, underscoring that the path to becoming a hero lies within each one of us.&lt;/p&gt;</content><category term="leadership"></category><category term="mental"></category><category term="models"></category><category term="first-principles"></category></entry><entry><title>Real-world experiments I: 5 Lessons from Google, Bing, Netflix and Alibaba</title><link href="https://bytepawn.com/experimentation-lessons-google-bing-netflix-alibaba.html" rel="alternate"></link><published>2023-06-18T00:00:00+02:00</published><updated>2023-06-18T00:00:00+02:00</updated><author><name>Marton Trencseni</name></author><id>tag:bytepawn.com,2023-06-18:/experimentation-lessons-google-bing-netflix-alibaba.html</id><summary type="html">&lt;p&gt;I discuss five lessons from large-scale experiments conducted by Google, Bing, Netflix and Alibaba: Kohavi's 1 out of 3 rule, Google's 41 shades of blue, Bing's unexpected big win, Alibaba's personalization experiment and Netflix' movie image personalization.&lt;br/&gt;&lt;br/&gt; &lt;img src="/images/netflix-movie-art.png" alt="Netflix" style="width: 600px;"/&gt;&lt;/p&gt;</summary><content type="html">&lt;h2&gt;Introduction&lt;/h2&gt;
&lt;p&gt;I have written &lt;a href="https://bytepawn.com/tag/ab-testing.html"&gt;more than 20 articles on A/B testing on Bytepawn&lt;/a&gt;, but most of them are about the technical, statistical aspects. In this post, I will discuss five lessons from large-scale experiments conducted by Google, Bing, Netflix and Alibaba. Each story briefly describes a specific experiment conducted by these companies and the consequent lessons learned. From Ronny Kohavi's enlightening "1 out of 3 rule" at Microsoft that reshapes the understanding of failure in the realm of innovation, to Google's seemingly trivial yet significantly impactful experiment on the '41 shades of blue', these narratives underline how experimentation often leads to surprising revelations and success. Similarly, Alibaba's large-scale randomized field experiment and Netflix's personalized movie image testing provide evidence for the importance of personalization. These stories demonstrate the power of A/B testing and can serve as inspiration for practitioners.&lt;/p&gt;
&lt;h2&gt;Kohavi's 1 out of 3 rule&lt;/h2&gt;
&lt;p&gt;Ronny Kohavi, formerly Vice President of Analysis &amp;amp; Experimentation at Bing and author of the excellent book &lt;a href="https://www.amazon.com/Trustworthy-Online-Controlled-Experiments-Practical/dp/1108724264"&gt;Trustworthy Online Controlled Experiments&lt;/a&gt;, famously writes in &lt;a href="https://exp-platform.com/Documents/ExP_DMCaseStudies.pdf"&gt;Online Experimentation at Microsoft&lt;/a&gt;: "When we first shared some of the above statistics at Microsoft, many people dismissed them. Now that we have run many experiments, we can report that Microsoft is no different. Evaluating well-designed and executed experiments that were designed to improve a key metric, &lt;strong&gt;only about one-third were successful at improving the key metric!&lt;/strong&gt;"&lt;/p&gt;
&lt;p&gt;In a later presentation titled &lt;a href="https://exp-platform.com/Documents/2015-08OnlineControlledExperimentsKDDKeynoteNR.pdf"&gt;Lessons from Running A/B/n Tests for 12 years&lt;/a&gt;, he states that 1/3 of ideas were positive and statistically significant, 1/3 of ideas were flat (no statistically significant difference), and 1/3 of ideas were negative and statistically significant. &lt;strong&gt;At Bing, the success rate is lower; the low success rate has been documented many times across multiple companies.&lt;/strong&gt;&lt;/p&gt;
&lt;p&gt;&lt;img src="/images/kohavi-1of3.jpg" alt="Kohavi" style="width: 600px;"/&gt;&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Lesson: most of your ideas will not work. This is learning, not failure.&lt;/strong&gt;&lt;/p&gt;
&lt;h2&gt;Google's 41 shades of blue&lt;/h2&gt;
&lt;p&gt;The 2014 Guardian article titled &lt;a href="https://www.theguardian.com/technology/2014/feb/05/why-google-engineers-designers"&gt;Why Google has 200m reasons to put engineers over designers&lt;/a&gt; describes the following experiment at Google. &lt;/p&gt;
&lt;p&gt;Roughly half a decade ago, Google initiated its venture into Gmail advertising. As was the case with their search engine, ads were presented as small blue links leading to various websites. However, it was observed that the shade of blue used in these two distinct products varied when linked to advertisements. Traditionally, the choice of color in such a scenario would have been determined by a chief designer or a marketing director, in what might be referred to as 'the hippo approach'. But in the data-driven world, Google decided to approach this differently.&lt;/p&gt;
&lt;p&gt;&lt;img src="/images/google-41-blues.png" alt="Google" style="width: 600px;"/&gt;&lt;/p&gt;
&lt;p&gt;They conducted a series of '1%' experiments, exposing one percent of users to a specific shade of blue, with each test incorporating a different hue. To ensure thoroughness, forty-one different variations of blue were tested in this manner. The objective was to discern the most popular shade among users, measured by the frequency of clicks. The results led to the discovery that a marginally purpler blue was more click-friendly than a slightly greener shade. This seemingly minor adjustment, given the magnitude of Google's business operations, had a significant financial impact: an &lt;strong&gt;additional $200 million per year in ad revenue&lt;/strong&gt;.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Lesson: if you have the sample size, don't test 2 or 3 variants, test a lot of them to find the optimum.&lt;/strong&gt;&lt;/p&gt;
&lt;h2&gt;Bing's unexpected big win&lt;/h2&gt;
&lt;p&gt;In Chapter 1 of &lt;a href="https://www.amazon.com/Trustworthy-Online-Controlled-Experiments-Practical/dp/1108724264"&gt;Trustworthy Online Controlled Experiments&lt;/a&gt;, Kohavi opens with the following experiment, also described in the HBR article titled &lt;a href="https://hbr.org/2017/09/the-surprising-power-of-online-experiments"&gt;The Surprising Power of Online Experiments&lt;/a&gt;:&lt;/p&gt;
&lt;p&gt;In the year 2012, an innovative idea emerged from within the Bing team at Microsoft. An employee proposed a change in the way ad headlines were displayed on the search engine. It was a &lt;strong&gt;straightforward implementation that wouldn't take more than a few days of engineering effort&lt;/strong&gt;. However, being one amongst a sea of proposed ideas, it was &lt;strong&gt;deemed low priority&lt;/strong&gt; by program managers. As a result, the idea remained untouched for over six months. It wasn't until an engineer, noting the minor cost of writing the necessary code, decided to launch an A/B test to evaluate the potential impact of this seemingly inconsequential change.&lt;/p&gt;
&lt;p&gt;&lt;img src="/images/bing-inline-ads-results.jpg" alt="Bing" style="width: 600px;"/&gt;&lt;/p&gt;
&lt;p&gt;Within a matter of hours, the newly introduced headline variation started generating revenue at an unusually high rate, tripping the "too good to be true" alarm. While such alerts often indicate a glitch, in this case, it pointed towards a genuine increase in revenue. Analysis revealed a staggering 12% increase in revenue, translating to &lt;strong&gt;an annual boost of over $100 million in the United States alone&lt;/strong&gt;, without any negative impact on key user-experience metrics. Despite being one of the most lucrative ideas in Bing's history, its value remained unrecognized until this experiment. This serves as a humbling reminder of the challenge in assessing the potential of new ideas and underscores the benefits of possessing the ability to run numerous inexpensive tests concurrently, a practice more businesses are beginning to appreciate.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Lesson: you don't know which experiment will yield big results.&lt;/strong&gt;&lt;/p&gt;
&lt;h2&gt;Alibaba's personalization experiment&lt;/h2&gt;
&lt;p&gt;In the paper &lt;a href="https://papers.ssrn.com/sol3/papers.cfm?abstract_id=3962157"&gt;The Value of Personal Data in Internet Commerce&lt;/a&gt; researchers from Alibaba share surprising results.&lt;/p&gt;
&lt;p&gt;In collaboration with Alibaba's E-commerce platform, a large-scale randomized field experiment was conducted involving 555,800 customers. Upon opening the Alibaba app, these users were exposed to the homepage recommendation. Half of the user base was randomly selected and assigned to a treatment group. For these users, &lt;strong&gt;personal data like demographics, past clicks, and purchase behaviors were not employed in the product recommendations on the homepage&lt;/strong&gt;. The other half, constituting the control group, experienced the standard product recommendation process, which utilized their personal data. The stakes of this experiment were significant, given the reliance of the platform on homepage recommendations to match its vast user base with countless merchants and considering that customers spend a substantial amount of time interacting with recommended products.&lt;/p&gt;
&lt;p&gt;The experiment illuminated the impact of personal data regulation on user engagement and transaction outcomes. The findings revealed a notable increase in product recommendation concentration and a significant decrease in the alignment of product recommendations with customer preferences in the treatment group compared to the control group. Moreover, recommendations made without personal data led to a drastic drop in matching outcomes, as measured by both customer engagement and market transactions. There was an immediate 75% decrease in customers' click-through rate (CTR) on the recommended products, and a 33% reduction in homepage browsing as measured by product views (PV). The &lt;strong&gt;combined effect resulted in a substantial 81% decline in customer purchases (gross merchandise volume, GMV)&lt;/strong&gt; facilitated by homepage recommendations.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Lesson: personalized recommendations are worth high double-digit % lifts.&lt;/strong&gt;&lt;/p&gt;
&lt;h2&gt;Netflix' movie image personalization&lt;/h2&gt;
&lt;p&gt;In their blog post titled &lt;a href="https://netflixtechblog.com/selecting-the-best-artwork-for-videos-through-a-b-testing-f6155c4595f6"&gt;Selecting the best artwork for videos through A/B testing&lt;/a&gt; Netflix describes how the artwork or images associated with movies or television shows significantly influence a user's decision on what to watch. To provide the best personalized user experience, &lt;strong&gt;Netflix decided to conduct an A/B test on its artwork&lt;/strong&gt; in different regions to understand the kind of images that resonated with its diverse user base.&lt;/p&gt;
&lt;p&gt;One of the more notable findings was that images featuring expressive facial emotions of the main character had more appeal. For example, in one A/B test, Netflix used different thumbnails for the movie "Good Will Hunting." The artwork with Robin Williams showing a strong emotional expression drove more views than artwork with other characters or less expressive faces. Netflix didn't release the specific numeric outcomes of these tests, but confidently stated that this approach was effective in increasing engagement.&lt;/p&gt;
&lt;p&gt;&lt;img src="/images/netflix-movie-art.png" alt="Netflix" style="width: 600px;"/&gt;&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Following these A/B tests, Netflix has implemented a dynamic system that personalizes the artwork shown to individual users&lt;/strong&gt; based on their viewing history and preferences. For instance, a user who watches a lot of romantic movies might see artwork emphasizing a love story, while a user who frequently watches action films might see artwork highlighting an action scene. This A/B testing has allowed Netflix to optimize user engagement, personalize content more effectively, and drive longer viewing sessions.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Lesson: personalization and experimentation opportunities lurk in unexpected areas.&lt;/strong&gt;&lt;/p&gt;</content><category term="ab-testing"></category></entry><entry><title>Conditional Probabilities and Simpson's Paradox</title><link href="https://bytepawn.com/conditional-probabilities-and-simpsons-paradox.html" rel="alternate"></link><published>2023-06-11T00:00:00+02:00</published><updated>2023-06-11T00:00:00+02:00</updated><author><name>Marton Trencseni</name></author><id>tag:bytepawn.com,2023-06-11:/conditional-probabilities-and-simpsons-paradox.html</id><summary type="html">&lt;p&gt;I give examples of "unintuitive" conditional probabilities and discuss Simpson's paradox.&lt;br/&gt;&lt;br/&gt; &lt;img src="/images/simpsons-3.jpg" alt="Simpson's paradox" style="width: 800px;"/&gt;&lt;/p&gt;</summary><content type="html">&lt;h2&gt;Introduction&lt;/h2&gt;
&lt;p&gt;Recently I was reading the excellent book &lt;a href="https://www.amazon.com/Patterns-Predictions-Actions-Foundations-Learning/dp/069123373X"&gt;Patterns, Predictions and Actions&lt;/a&gt;, which mentions the &lt;a href="https://en.wikipedia.org/wiki/Simpson%27s_paradox"&gt;Simpson's paradox&lt;/a&gt; on page 176, in the chapter on Causality. I've encountered Simpson's paradox before, but like any good paradox, I have to take a step back, think things through and convince myself I understand what's going on each time. This time I decided to write it down to make it more sticky!&lt;/p&gt;
&lt;h2&gt;University admissions&lt;/h2&gt;
&lt;p&gt;&lt;a href="https://en.wikipedia.org/wiki/Simpson%27s_paradox"&gt;Simpson's paradox&lt;/a&gt; is usually explained using the toy example of University admissions and gender bias. Suppose there is a University, and there are $N=1000$ applicants, 650 are admitted, leading to an admission rate of 65%. So far so good.&lt;/p&gt;
&lt;p&gt;&lt;img src="/images/simpsons-1.jpg" alt="Simpson's paradox" style="width: 250px;"/&gt;&lt;/p&gt;
&lt;p&gt;Now, suppose that of the $N=1000$ applicants, half are male and half are female. Out of 500 males, 400 were admitted, so the admission rate was 80%. Out of 500 females, 250 were admitted, so the admission rate was 50%. Note that the numbers add up: $ 400+250=650 $ total admissions.&lt;/p&gt;
&lt;p&gt;&lt;img src="/images/simpsons-2.jpg" alt="Simpson's paradox" style="width: 500px;"/&gt;&lt;/p&gt;
&lt;p&gt;The differing rates of admission may lead us to think that there is gender bias, and the University is discriminating against females. A χ² hypothesis test tells us that the difference in acceptance rates between the two branches (male and female) is statistically significant at $ p &amp;lt; 0.001 $.&lt;/p&gt;
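&lt;p&gt;This is easy to verify; a minimal Python sketch of my own (not part of the original analysis) computes the χ² statistic for this 2×2 table from scratch:&lt;/p&gt;

```python
# Chi-squared test of independence for the admissions table.
# Rows: male, female; columns: admitted, rejected.
table = [[400, 100],   # 500 males, 80% admitted
         [250, 250]]   # 500 females, 50% admitted

row_totals = [sum(row) for row in table]
col_totals = [sum(col) for col in zip(*table)]
n = sum(row_totals)

def expected(i, j):
    # Expected count under the null hypothesis of independence.
    return row_totals[i] * col_totals[j] / n

chi2 = sum((table[i][j] - expected(i, j)) ** 2 / expected(i, j)
           for i in range(2) for j in range(2))

print(round(chi2, 1))  # 98.9
```

&lt;p&gt;The statistic of roughly 98.9 is far beyond the critical value of roughly 10.83 for the χ² distribution with one degree of freedom at $ p = 0.001 $, hence the significance claim above.&lt;/p&gt;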
&lt;p&gt;Can we stop here? Let's dig deeper: when students apply for University, they actually apply to department, and it's the department that accepts or rejects them. For simplicity, let's assume that each student only applies to one department, and that there are 2 departments, A and B. Simpson's paradox can now manifest itself.&lt;/p&gt;
&lt;h2&gt;No discrimination&lt;/h2&gt;
&lt;p&gt;First, let's look at an example where, once we look at the per-department admissions, we can convince ourselves that there is no discrimination:&lt;/p&gt;
&lt;p&gt;&lt;img src="/images/simpsons-3.jpg" alt="Simpson's paradox" style="width: 800px;"/&gt;&lt;/p&gt;
&lt;p&gt;Here, department A has a 90% acceptance rate across both males and females, and department B has a 40% acceptance rate across both males and females. Clearly, there is no discrimination at acceptance. The differing acceptance rate at the male-female level is because 400 out of 500 females apply to the more competitive B department, but only 100 out of 500 males do so.&lt;/p&gt;
&lt;h2&gt;Reversed discrimination&lt;/h2&gt;
&lt;p&gt;By playing around with the numbers, it's easy to create a situation where the perceived discrimination is reversed (changed numbers in red):&lt;/p&gt;
&lt;p&gt;&lt;img src="/images/simpsons-4.jpg" alt="Simpson's paradox" style="width: 800px;"/&gt;&lt;/p&gt;
&lt;p&gt;In this toy example, at the male-female level males have a higher acceptance rate (70% vs 50%), but at the department level, females actually have higher acceptance rates (90% vs 80% for department A and 40% vs 30% for department B). The root cause is the same: even though females are actually more successful at the department level, more of them apply to the more competitive department, leading to lower overall acceptance rates. Note that, the same way it would have been erroneous to conclude discrimination without understanding the admission process and seeing the per-department data, it may be equally erroneous to conclude that there is reverse discrimination at the department level; perhaps there are other factors, at a deeper level, at play that lead to this outcome.&lt;/p&gt;
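&lt;p&gt;The reversal can be checked numerically. In the sketch below I reconstruct applicant and admit counts consistent with the rates above (the exact counts are my assumption; only the rates appear in the tables):&lt;/p&gt;

```python
# (applied, admitted) per (gender, department), chosen so that
# males are 70% overall, females 50% overall, yet females have
# the higher rate inside both departments.
data = {
    ('M', 'A'): (400, 320), ('M', 'B'): (100, 30),
    ('F', 'A'): (100, 90),  ('F', 'B'): (400, 160),
}

def rate(gender, dept=None):
    # Acceptance rate for a gender, optionally within one department.
    items = [(app, adm) for (g, d), (app, adm) in data.items()
             if g == gender and (dept is None or d == dept)]
    applied = sum(app for app, _ in items)
    admitted = sum(adm for _, adm in items)
    return admitted / applied

# Aggregate: males look favored...
assert rate('M') == 0.7 and rate('F') == 0.5
# ...but per department, females have the higher rate in both:
assert rate('F', 'A') == 0.9 and rate('M', 'A') == 0.8
assert rate('F', 'B') == 0.4 and rate('M', 'B') == 0.3
```

&lt;p&gt;The aggregate and per-department rates point in opposite directions, which is exactly Simpson's paradox.&lt;/p&gt;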
&lt;h2&gt;Same rates&lt;/h2&gt;
&lt;p&gt;For completeness, we can of course also have a situation where the per-department acceptance rates happen to match the University level (changed numbers in red):&lt;/p&gt;
&lt;p&gt;&lt;img src="/images/simpsons-5.jpg" alt="Simpson's paradox" style="width: 800px;"/&gt;&lt;/p&gt;
&lt;h2&gt;Conditional probability&lt;/h2&gt;
&lt;p&gt;The "root cause" of these scenarion is simple: in general, there is no relationship between $ P(X=x) $ and $ P(X=x|Y=y) $. All three of the relations equals, less-than or greater-than are possible, and the same goes for $P(X=x|Y=y)$ and $P(X=x|Y=y,Z=z)$.&lt;/p&gt;
&lt;p&gt;In our example, if we define $A$ as admission, $G$ as gender and $D$ as department, and treat the admission rate as a probability, then $P(A=a), P(A=a|G=g)$ and $P(A=a|G=g,D=d)$, as the above examples show, can be in any relation with each other.&lt;/p&gt;
&lt;h2&gt;Observation, not experiment&lt;/h2&gt;
&lt;p&gt;A novice Data Scientist may get confused and wonder: &lt;em&gt;if I run an A/B test and detect a statistically significant difference, how do I know there is no Simpson's paradox lurking in the background?&lt;/em&gt; If you're a Data Scientist, I recommend you pause and try to answer this for yourself!&lt;/p&gt;
&lt;p&gt;First of all, the example we described above was observational: we didn't interact with the system (applicants or departments), we simply observed the outcomes. In an A/B test (Randomized Controlled Experiment), we conduct an experiment, which means we interact with the system:&lt;/p&gt;
&lt;ol&gt;
&lt;li&gt;We randomly split our population into Treatment (A) and Control (B)&lt;/li&gt;
&lt;li&gt;We treat units in Treatment, but leave units in Control alone.&lt;/li&gt;
&lt;/ol&gt;
&lt;p&gt;The whole point of an A/B test is that with enough sample size $N$ (or good stratification), the two populations in A and B will be: &lt;/p&gt;
&lt;ol&gt;
&lt;li&gt;the same&lt;/li&gt;
&lt;li&gt;the same as the overall population&lt;/li&gt;
&lt;/ol&gt;
&lt;p&gt;This means that Simpson's paradox cannot occur (other than by random variation, which can be limited by the aforementioned methods), because sub-population sizes will be the same! I.e., if M and F were two branches of an A/B test (which they are not), then the number of people applying to the two departments would be roughly the same in both branches; that's the whole point of randomization!&lt;/p&gt;
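&lt;p&gt;A quick simulation illustrates the point. Assuming, hypothetically, that our 1,000 applicants were split into two branches at random, the department mix in the two branches matches up to sampling noise:&lt;/p&gt;

```python
import random

random.seed(42)  # for reproducibility

# 1000 applicants with the department mix from the toy example:
# 500 apply to department A, 500 to department B.
applicants = ['A'] * 500 + ['B'] * 500
random.shuffle(applicants)

# Randomly split into two branches, as an A/B test would.
treatment, control = applicants[:500], applicants[500:]

# Each branch contains ~250 department-A applicants; the deviation is
# pure sampling noise (std ~ sqrt(500 * 0.5 * 0.5) ~ 11), so neither
# branch can be systematically skewed toward the competitive department.
print(treatment.count('A'), control.count('A'))
```

&lt;p&gt;With stratification on department, the counts can be made exactly equal instead of equal in expectation.&lt;/p&gt;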
&lt;h2&gt;Conclusion&lt;/h2&gt;
&lt;p&gt;One important conclusion is that a statistical significance test didn't help here; all it told us was that the two rates are in fact different (the normal distributions of the means are far apart), but in general a statistical test does not tell us about the reasons two means are different. It is an error of (statistical and logical) reasoning to say &lt;em&gt;"I think X and Y are different because of Z"&lt;/em&gt;, then use a statistical significance test to establish that &lt;em&gt;"X and Y are indeed different"&lt;/em&gt;, and then conclude, &lt;em&gt;"therefore Z"&lt;/em&gt;.&lt;/p&gt;
&lt;p&gt;Another suspicion is that Simpson's paradox is more likely to be overlooked if the rates we see reinforce a social construct or stereotype. For example, it's common today to look for gender bias and discrimination, so it's easier to draw erroneous conclusions around this topic. In other words, we can construct an identical toy example with different words, and the result is less surprising: &lt;em&gt;Ants of colony A have lower lifespans than ants of colony B, even though warrior and harvester ants of both colonies have equal survival probabilities, because colony A has more warrior ants, which have lower lifespans than harvester ants.&lt;/em&gt; This, to me, is less surprising and less counterintuitive than the original University toy example with genders.&lt;/p&gt;
&lt;p&gt;In an earlier post, I tested GPT-3's spatial reasoning abilities:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;&lt;a href="https://bytepawn.com/testing-gpt-3-spatial-reasoning-and-comprehension.html"&gt;Testing GPT-3 spatial reasoning and comprehension&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;While on some tasks it did fairly well, I suspect that was due to the questions being similar to training data. It did very poorly on the "people in a room" task: it couldn't answer a single question correctly, so my overall conclusion was that GPT-3 had not developed a sense for spatial reasoning. Let's see how GPT-4 (ChatGPT-4) does.&lt;/p&gt;
&lt;h2&gt;People in a room&lt;/h2&gt;
&lt;p&gt;&lt;span style="background-color: #fefebe"&gt;Q: 
There is square room.&lt;br/&gt;
Alice is standing in the north-west corner.&lt;br/&gt;
Bob is standing in the south-west corner.&lt;br/&gt;
Charlie is standing in the south-east corrner.&lt;br/&gt;
David is standing in the north-east corner.&lt;br/&gt;
Ed is standing in the center of the room, looking at Alice.&lt;br/&gt;
How many people are there in the room?&lt;br/&gt;
&lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;span style="background-color: #acdf87"&gt;A: There are five people in the room: Alice, Bob, Charlie, David, and Ed.&lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;img src="/images/square-room.jpg" alt="GPT-3" style="width: 400px;"/&gt;&lt;/p&gt;
&lt;p&gt;&lt;span style="background-color: #fefebe"&gt;Q: Who is standing to the &lt;b&gt;left&lt;/b&gt; of Ed?&lt;br/&gt; &lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;span style="background-color: #acdf87"&gt;A: If Ed is facing Alice who is in the north-west corner, then his left would be towards the west direction. Therefore, Bob, who is standing in the south-west corner, is to the left of Ed.&lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;span style="background-color: #fefebe"&gt;Q: Who is standing to the &lt;b&gt;right&lt;/b&gt; of Ed?&lt;br/&gt;&lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;span style="background-color: #acdf87"&gt;A: If Ed is facing Alice who is in the north-west corner, then his right would be towards the north direction. Therefore, David, who is standing in the north-east corner, is to the right of Ed.&lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;span style="background-color: #fefebe"&gt;Q: Who is standing &lt;b&gt;behind&lt;/b&gt; Ed?&lt;br/&gt;&lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;span style="background-color: #acdf87"&gt;A: If Ed is facing Alice who is in the north-west corner, then behind him would be the south direction. Therefore, Charlie, who is standing in the south-east corner, is behind Ed.&lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;em&gt;GPT-4 got all of them right. Very impressive! Let's spice it up:&lt;/em&gt;&lt;/p&gt;
&lt;p&gt;&lt;span style="background-color: #fefebe"&gt;Q: Now everybody goes around clockwise 90 degrees. Who is now in the north-west corner?&lt;br/&gt;
&lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;span style="background-color: #acdf87"&gt;A: If everyone in the room moves 90 degrees clockwise, they would move as follows:&lt;br/&gt;
• Alice, originally in the north-west corner, would move to the north-east corner.&lt;br/&gt;
• Bob, originally in the south-west corner, would move to the north-west corner.&lt;br/&gt;
• Charlie, originally in the south-east corner, would move to the south-west corner.&lt;br/&gt;
• David, originally in the north-east corner, would move to the south-east corner.&lt;br/&gt;
• Ed, originally in the center, would remain in the center (assuming he also turns 90 degrees clockwise).&lt;br/&gt;
Therefore, Bob is now in the north-west corner.&lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;em&gt;Correct!&lt;/em&gt;&lt;/p&gt;
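&lt;p&gt;The rotation is easy to verify mechanically; here is a quick sketch of my own, mapping each corner to the next one clockwise:&lt;/p&gt;

```python
# 90-degree clockwise rotation: each corner maps to the next
# corner clockwise (NW -> NE -> SE -> SW -> NW).
clockwise = {'NW': 'NE', 'NE': 'SE', 'SE': 'SW', 'SW': 'NW'}
positions = {'Alice': 'NW', 'Bob': 'SW', 'Charlie': 'SE', 'David': 'NE'}

rotated = {who: clockwise[corner] for who, corner in positions.items()}
in_nw = [who for who, corner in rotated.items() if corner == 'NW']
print(in_nw)  # ['Bob']
```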
&lt;h2&gt;Clock faces&lt;/h2&gt;
&lt;p&gt;Can GPT-4 reason about the hands of a clock?&lt;/p&gt;
&lt;p&gt;&lt;span style="background-color: #fefebe"&gt;Q: It is half past eleven. What number is the minute hand of the clock pointing at?&lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;span style="background-color: #acdf87"&gt;A: The minute hand of a clock points to "6" when it is half past the hour. This is because a traditional analog clock is divided into 12 sections, and each section corresponds to 5 minutes. Therefore, when 30 minutes have passed, the minute hand points to the 6th section.&lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;em&gt;Note: correct.&lt;/em&gt;&lt;/p&gt;
&lt;p&gt;&lt;span style="background-color: #fefebe"&gt;Q: I have a watch. The hour hand is pointing to 11. The minute hand is at a right angle to it, on the right side. What number is the minute hand pointing at?&lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;span style="background-color: #acdf87"&gt;A: If the hour hand is pointing to 11, and the minute hand is at a right angle to the hour hand on the right side, the minute hand is pointing to 2. This corresponds to 10 minutes past the hour. Therefore, the time would be approximately 11:10.&lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;em&gt;Note: correct.&lt;/em&gt;&lt;/p&gt;
&lt;p&gt;&lt;span style="background-color: #fefebe"&gt;Q: I have a watch. The hour hand is pointing to 11:30. The minute hand is at a right angle to it, on the right side. What number is the minute hand pointing at?&lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;span style="background-color: #acdf87"&gt;A: If the hour hand is pointing to 11:30, it means it is halfway between 11 and 12. If the minute hand is at a right angle to the hour hand on the right side, the minute hand is pointing to approximately 3. This corresponds to 15 minutes past the hour. Therefore, the time would be approximately 11:15.&lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;em&gt;Note: not completely correct, but close; the minute hand is pointing at 2:30, the time is 11:12:30&lt;/em&gt;&lt;/p&gt;
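&lt;p&gt;For reference, the arithmetic behind my note, as a small sketch:&lt;/p&gt;

```python
# The hour hand halfway between 11 and 12 sits at 11.5 * 30 = 345
# degrees (each hour mark spans 30 degrees). A right angle clockwise
# ("on the right side") puts the minute hand at (345 + 90) % 360 = 75.
hour_angle = 11.5 * 30                  # 345.0 degrees
minute_angle = (hour_angle + 90) % 360  # 75.0 degrees

dial_number = minute_angle / 30  # each dial number spans 30 degrees
minutes = minute_angle / 6       # minute hand moves 6 degrees/minute

print(dial_number, minutes)  # 2.5 12.5 -> pointing at "2:30", time 11:12:30
```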
&lt;p&gt;Here GPT-3 also did fairly well on the first questions, but not on the later ones.&lt;/p&gt;
&lt;h2&gt;Triangles&lt;/h2&gt;
&lt;p&gt;&lt;span style="background-color: #fefebe"&gt;Q: We have a triangle. One of the angles is 70 degrees, another is 80 degrees. What is the third angle?&lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;span style="background-color: #acdf87"&gt;A: To find the third angle of a triangle, we can use the fact that the sum of the angles in any triangle is always 180 degrees. Let's denote the third angle as x. We know that one angle is 70 degrees, another is 80 degrees, and the sum of all three angles is 180 degrees. We can set up the equation:&lt;br/&gt;
70 + 80 + x = 180&lt;br/&gt;
Simplifying the equation, we have:&lt;br/&gt;
150 + x = 180&lt;br/&gt;
To isolate x, we can subtract 150 from both sides:&lt;br/&gt;
x = 180 - 150&lt;br/&gt;
x = 30&lt;br/&gt;
Therefore, the third angle is 30 degrees.&lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;em&gt;Note: correct. It's impressive how GPT-4 derives the answer; unlike with GPT-3, where it just stated the answer, and I wasn't sure whether it's seen this exact question in the training data.&lt;/em&gt;&lt;/p&gt;
&lt;p&gt;&lt;span style="background-color: #fefebe"&gt;Q: We have a triangle. One of the angles is 70.51 degrees, another is 80.49 degrees. What is the third angle?&lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;span style="background-color: #acdf87"&gt;A: To find the third angle of a triangle when two angles are known, we can use the fact that the sum of the angles in any triangle is always 180 degrees. Let's denote the third angle as x. We know that one angle is 70.51 degrees, another is 80.49 degrees, and the sum of all three angles is 180 degrees. We can set up the equation:&lt;br/&gt;
70.51 + 80.49 + x = 180&lt;br/&gt;
Simplifying the equation, we have:&lt;br/&gt;
151 + x = 180&lt;br/&gt;
To isolate x, we can subtract 151 from both sides:&lt;br/&gt;
x = 180 - 151&lt;br/&gt;
x = 29&lt;br/&gt;
Therefore, the third angle is 29 degrees.&lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;em&gt;Note: correct.&lt;/em&gt;&lt;/p&gt;
&lt;p&gt;&lt;span style="background-color: #fefebe"&gt;Q: We have two identical right-angled triangles. We snap them together along their longest edges. What is the resulting shape?&lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;span style="background-color: #acdf87"&gt;A: When two identical right-angled triangles are snapped together along their longest edges, the resulting shape is a rectangle.&lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;em&gt;Note: correct.&lt;/em&gt;&lt;/p&gt;
&lt;p&gt;&lt;span style="background-color: #fefebe"&gt;Q: We have a very small triangle on a very large sphere. What is the sum of its angles, in degrees?&lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;span style="background-color: #acdf87"&gt;A: The sum of the angles of a triangle is always 180 degrees, regardless of the size or shape of the triangle. This property holds true for triangles on a sphere as well. Therefore, the sum of the angles of a very small triangle on a very large sphere would still be 180 degrees.&lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;em&gt;Note: partially correct; if the triangle is very small, the surface of the sphere is locally flat, so we can approximate with Euclidean geometry.&lt;/em&gt;&lt;/p&gt;
&lt;p&gt;&lt;span style="background-color: #fefebe"&gt;Q: We have a triangle on a sphere. What is the sum of its angles, in degrees?&lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;span style="background-color: #acdf87"&gt;A: On a sphere, the sum of the angles of a triangle is greater than 180 degrees. This property is known as the spherical excess or Gauss-Bonnet theorem. The formula for calculating the spherical excess (E) of a triangle on a sphere with angles A, B, and C (measured in radians) is:&lt;br/&gt;
E = A + B + C - π&lt;br/&gt;
To convert the angles from degrees to radians, we need to multiply them by π/180.&lt;br/&gt;
Let's say the angles of the triangle are A, B, and C (in degrees). The sum of the angles on the sphere is:&lt;br/&gt;
A + B + C - 180&lt;br/&gt;
Note that if the angles are given in radians, the sum of the angles would be:&lt;br/&gt;
A + B + C - π&lt;br/&gt;
So, in short, the sum of the angles of a triangle on a sphere is greater than 180 degrees (or π radians).&lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;em&gt;Note: correct; GPT-3 got this wrong.&lt;/em&gt;&lt;/p&gt;
&lt;p&gt;&lt;span style="background-color: #fefebe"&gt;Q: We have a triangle on a sphere. The sum of its angles is 200 degrees. We scale up the sphere by 10x. What is the sum of the triangle's angles now?&lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;span style="background-color: #acdf87"&gt;A: When you scale up the sphere by 10x, the shape and proportions of the triangle on the sphere will also scale accordingly. However, the sum of the angles of the triangle will remain the same, regardless of the size of the sphere. Therefore, even after scaling up the sphere by 10x, the sum of the triangle's angles will still be 200 degrees.&lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;em&gt;Note: correct; GPT-3 got this wrong.&lt;/em&gt;&lt;/p&gt;
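&lt;p&gt;The scaling answer follows from Girard's theorem: a spherical triangle's angle sum exceeds 180 degrees by the spherical excess E = A / R^2, where A is the triangle's area and R the sphere's radius. Scaling the sphere (and the triangle with it) by 10x multiplies both A and R^2 by 100, so the excess, and with it the angle sum, is unchanged. A minimal Python check of both answers (the 200-degree triangle is the illustrative value from the prompt):&lt;/p&gt;

```python
import math

def angle_sum_deg(area, radius):
    """Angle sum of a spherical triangle via Girard's theorem:
    excess E = area / radius^2 (in radians), angle sum = pi + E."""
    excess = area / radius ** 2
    return math.degrees(math.pi + excess)

# A triangle whose angle sum is 200 degrees has 20 degrees of excess;
# on a unit sphere the excess (in radians) equals the area.
area = math.radians(20)
print(angle_sum_deg(area, 1.0))         # 200 degrees
# Scale the sphere by 10x: area grows 100x, R^2 grows 100x
print(angle_sum_deg(area * 100, 10.0))  # still 200 degrees
```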
&lt;p&gt;Unlike GPT-3, GPT-4 is able to correctly reason about triangles on a sphere. Very impressive!&lt;/p&gt;
&lt;h2&gt;Areas and volumes&lt;/h2&gt;
&lt;p&gt;&lt;span style="background-color: #fefebe"&gt;Q: We have a white cube, each side is 1 unit long. We paint it red using 1 liter of paint.&lt;br/&gt;
We now have a white cube, each side is 2 units long. How many liters of paint do we need to paint it red?&lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;span style="background-color: #acdf87"&gt;A: The amount of paint needed is proportional to the surface area of the cube. The surface area of a cube is given by 6 times the square of the edge length (since a cube has six identical square faces).&lt;br/&gt;
1. For the cube with an edge length of 1 unit, the surface area is 6 * (1)^2 = 6 square units.&lt;br/&gt;
2. For the cube with an edge length of 2 units, the surface area is 6 * (2)^2 = 24 square units.&lt;br/&gt;
If it takes 1 liter of paint to cover 6 square units, then it should take 24/6 = 4 liters of paint to cover 24 square units. Therefore, you'd need 4 liters of paint to paint the larger cube.&lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;em&gt;Note: correct.&lt;/em&gt;&lt;/p&gt;
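&lt;p&gt;GPT-4's reasoning here is the standard square-cube scaling argument: paint tracks surface area, which grows with the square of the edge length. A quick sketch, assuming (as in the question) that 1 liter covers the unit cube's 6 square units:&lt;/p&gt;

```python
def paint_liters(edge):
    """Liters needed to paint a cube with the given edge length,
    assuming 1 liter covers the unit cube's 6 square units."""
    return 6 * edge ** 2 / 6.0

print(paint_liters(1))  # 1.0
print(paint_liters(2))  # 4.0
print(paint_liters(3))  # 9.0
```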
&lt;p&gt;&lt;span style="background-color: #fefebe"&gt;Q: We have a square. We cut up the square into pieces and rearrange the pieces to form a really long rectangle. Which has bigger area, the original square, or the really long rectangle?&lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;span style="background-color: #acdf87"&gt;A: The area of a shape remains the same regardless of how its pieces are rearranged. When you cut up a square and rearrange its pieces to form a rectangle, the total area remains the same. That's because area is a measure of the amount of "space" inside the shape, and cutting and rearranging the pieces doesn't create or remove any of this space. So the original square and the really long rectangle have exactly the same area.&lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;em&gt;Note: correct; GPT-3 got this wrong.&lt;/em&gt;&lt;/p&gt;
&lt;p&gt;&lt;span style="background-color: #fefebe"&gt;Q: There is a cube, and a sphere inside the cube. Which has bigger volume?&lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;span style="background-color: #acdf87"&gt;A: The volume of the cube is larger than the volume of the sphere inside it, provided the sphere is inscribed in the cube (meaning the diameter of the sphere is equal to the side of the cube).&lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;em&gt;Note: correct.&lt;/em&gt;&lt;/p&gt;
&lt;p&gt;&lt;span style="background-color: #fefebe"&gt;Q: The area of a unit square in 2D is 1, the area of the contained circle is Pi/4.&lt;br/&gt;
The volume of a unit cube in 3D is 1, the volume of the contained sphere is Pi/6.&lt;br/&gt;
What is the volume of the sphere contained in a unit cube in 4D?&lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;span style="background-color: #acdf87"&gt;A: In 4D, a "unit cube" would be a hypercube (also known as a tesseract) with edges of length 1, and a "sphere" would be a hypersphere. The general formula for the volume of a hypersphere in n dimensions (also known as an n-sphere), with a radius r, is:&lt;br/&gt;
V = (pi^(n/2) / Gamma(n/2 + 1)) * r^n&lt;br/&gt;
where Gamma is the gamma function. The gamma function of n is equal to (n-1)! for positive integers.&lt;br/&gt;
For the unit hypercube (a cube in 4D), each side length is 1, so the "radius" of the inscribed hypersphere is 0.5 (since the diameter is 1).&lt;br/&gt;
Let's substitute n=4 and r=0.5 into the formula:&lt;br/&gt;
V = (pi^(4/2) / Gamma(4/2 + 1)) * (0.5)^4&lt;br/&gt;
This simplifies to:&lt;br/&gt;
V = (pi^2 / 3) * (1/16)&lt;br/&gt;
Further simplifying gives:&lt;br/&gt;
V = pi^2 / 48 ≈ 0.06544984695&lt;br/&gt;
So, the volume of the hypersphere inscribed in a unit hypercube in 4D is approximately 0.06544984695.&lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;em&gt;Note: incorrect, the correct answer would be $\frac{\pi^2}{32}$.&lt;/em&gt;&lt;/p&gt;
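&lt;p&gt;The formula GPT-4 quotes is right; the slip is in evaluating Gamma(4/2 + 1) = Gamma(3) = 2! = 2, not 3, which gives pi^2/32 rather than pi^2/48. This is easy to verify with Python's math.gamma:&lt;/p&gt;

```python
import math

def inscribed_ball_volume(n):
    """Volume of the ball of radius 0.5 inscribed in the unit n-cube:
    V = pi^(n/2) / Gamma(n/2 + 1) * r^n, with r = 0.5."""
    r = 0.5
    return math.pi ** (n / 2) / math.gamma(n / 2 + 1) * r ** n

print(inscribed_ball_volume(2))  # pi/4  ~ 0.785, matching the prompt
print(inscribed_ball_volume(3))  # pi/6  ~ 0.524, matching the prompt
print(inscribed_ball_volume(4))  # pi^2/32 ~ 0.308
print(math.pi ** 2 / 32)         # same value
```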
&lt;p&gt;&lt;span style="background-color: #fefebe"&gt;Q: In higher dimensions what is the ratio of the volumes of a cube and the sphere contained in the cube? Do not give an exact mathematical answer, give a short qualitative answer.&lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;span style="background-color: #acdf87"&gt;A: As the number of dimensions increases, the volume of the sphere inscribed in a cube occupies a smaller and smaller proportion of the volume of the cube. This happens because the corners of the cube, which are outside the sphere, contribute more volume in higher dimensions. The more dimensions you have, the more "corner" there is relative to the total volume, and so the sphere takes up less of the space. In practical terms, the higher the dimension, the less "efficient" a sphere becomes at filling the space within a cube.&lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;em&gt;Note: correct.&lt;/em&gt;&lt;/p&gt;
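&lt;p&gt;GPT-4's qualitative answer checks out numerically: using the same n-ball volume formula, the inscribed sphere's share of the unit cube collapses toward zero as the dimension grows (a sketch):&lt;/p&gt;

```python
import math

def ball_share(n):
    """Fraction of the unit n-cube's volume taken by its inscribed ball."""
    return math.pi ** (n / 2) / math.gamma(n / 2 + 1) * 0.5 ** n

for n in (2, 3, 5, 10, 20):
    print(n, ball_share(n))  # the share shrinks rapidly with n
```

&lt;p&gt;At n = 2 the share is pi/4, about 0.785; by n = 10 it has already dropped below 0.3%.&lt;/p&gt;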
&lt;h2&gt;ASCII art&lt;/h2&gt;
&lt;p&gt;Can GPT-4 read or write ASCII art? This is the question that originally triggered my exploration into spatial comprehension. Let's see:&lt;/p&gt;
&lt;p&gt;&lt;span style="background-color: #fefebe"&gt;Q: What is the letter in the following ASCII art?&lt;br/&gt;&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;██╗░░██╗
██║░░██║
███████║
██╔══██║
██║░░██║
╚═╝░░╚═╝
&lt;/pre&gt;&lt;/div&gt;


&lt;p&gt;&lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;span style="background-color: #acdf87"&gt;A: The ASCII art you've posted represents the capital letter "A".&lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;em&gt;Note: incorrect.&lt;/em&gt;&lt;/p&gt;
&lt;p&gt;&lt;span style="background-color: #fefebe"&gt;Q: What is the text in the following ASCII art?&lt;br/&gt;&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;██╗░░██╗███████╗██╗░░░░░██╗░░░░░░█████╗░
██║░░██║██╔════╝██║░░░░░██║░░░░░██╔══██╗
███████║█████╗░░██║░░░░░██║░░░░░██║░░██║
██╔══██║██╔══╝░░██║░░░░░██║░░░░░██║░░██║
██║░░██║███████╗███████╗███████╗╚█████╔╝
╚═╝░░╚═╝╚══════╝╚══════╝╚══════╝░╚════╝░
&lt;/pre&gt;&lt;/div&gt;


&lt;p&gt;&lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;span style="background-color: #acdf87"&gt;A: The ASCII art you've posted represents the word "GREAT".&lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;em&gt;Note: incorrect.&lt;/em&gt;&lt;/p&gt;
&lt;p&gt;&lt;span style="background-color: #fefebe"&gt;Q: What is the text in the following ASCII art?&lt;br/&gt;&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;███╗░░░███╗░█████╗░██████╗░████████╗░█████╗░███╗░░██╗
████╗░████║██╔══██╗██╔══██╗╚══██╔══╝██╔══██╗████╗░██║
██╔████╔██║███████║██████╔╝░░░██║░░░██║░░██║██╔██╗██║
██║╚██╔╝██║██╔══██║██╔══██╗░░░██║░░░██║░░██║██║╚████║
██║░╚═╝░██║██║░░██║██║░░██║░░░██║░░░╚█████╔╝██║░╚███║
╚═╝░░░░░╚═╝╚═╝░░╚═╝╚═╝░░╚═╝░░░╚═╝░░░░╚════╝░╚═╝░░╚══╝
&lt;/pre&gt;&lt;/div&gt;


&lt;p&gt;&lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;span style="background-color: #acdf87"&gt;A: The ASCII art you've posted represents the word "FUTURE".&lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;em&gt;Note: incorrect.&lt;/em&gt;&lt;/p&gt;
&lt;p&gt;&lt;span style="background-color: #fefebe"&gt;Q: Generate ASCII art for the letter H.&lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;span style="background-color: #acdf87"&gt;A: Sure, here is a simple ASCII art representation of the letter "H":&lt;br/&gt;&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;H   H
H   H
HHHHH
H   H
H   H
&lt;/pre&gt;&lt;/div&gt;


&lt;p&gt;&lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;em&gt;Note: correct.&lt;/em&gt;&lt;/p&gt;
&lt;p&gt;&lt;span style="background-color: #fefebe"&gt;Q: Generate ASCII art for the word HELLO.&lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;span style="background-color: #acdf87"&gt;A: Sure, here's a simple ASCII representation of the word "HELLO":&lt;br/&gt;&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;H   H  EEEEE  L     L      OOO  
H   H  E      L     L     O   O 
HHHHH  EEEE   L     L     O   O 
H   H  E      L     L     O   O 
H   H  EEEEE  LLLLL LLLLL  OOO  
&lt;/pre&gt;&lt;/div&gt;


&lt;p&gt;&lt;span style="background-color: #acdf87"&gt;This is a simple form of ASCII art where each letter is 5 units tall and about 5 units wide (not counting spaces between letters). More complex or stylized versions could be created with different shapes or patterns.&lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;em&gt;Note: correct. Let's see if it can follow my ASCII style.&lt;/em&gt;&lt;/p&gt;
&lt;p&gt;&lt;span style="background-color: #fefebe"&gt;Q: Generate ASCII art for the word HELLO, in the style I used above.&lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;span style="background-color: #acdf87"&gt;A: Here's the ASCII representation of the word "HELLO" similar to the style you provided:&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;██╗░░██╗███████╗██╗░░░██╗███████╗
██║░░██║██╔════╝██║░░░██║██╔════╝
███████║█████╗░░╚██╗░██╔╝█████╗░░
██╔══██║██╔══╝░░░╚████╔╝░██╔══╝░░
██║░░██║███████╗░░╚██╔╝░░███████╗
╚═╝░░╚═╝╚══════╝░░░╚═╝░░░╚══════╝
&lt;/pre&gt;&lt;/div&gt;


&lt;p&gt;&lt;span style="background-color: #acdf87"&gt;Each letter is represented in a style similar to the style you provided. Please note that ASCII art can vary widely in complexity and style!&lt;/span&gt;
&lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;em&gt;Note: incorrect, but it got the HE right!&lt;/em&gt;&lt;/p&gt;
&lt;h2&gt;Conclusion&lt;/h2&gt;
&lt;p&gt;Based on these experiments, GPT-4 is a lot better at spatial reasoning than GPT-3. If these models keep advancing at this pace, GPT-5 or 6 may well be flawless at these tasks.&lt;/p&gt;</content><category term="gpt"></category><category term="gpt-4"></category></entry><entry><title>Leadership models IV: the Pareto Principle, the Peter Principle, the Rumsfeld Matrix, Servant Leadership and Pygmalion Effect, and Goleman's Emotional Intelligence Model</title><link href="https://bytepawn.com/leadership-models-pareto-principle-rumsfeld-matrix-servant-leadership.html" rel="alternate"></link><published>2023-05-19T00:00:00+02:00</published><updated>2023-05-19T00:00:00+02:00</updated><author><name>Marton Trencseni</name></author><id>tag:bytepawn.com,2023-05-19:/leadership-models-pareto-principle-rumsfeld-matrix-servant-leadership.html</id><summary type="html">&lt;p&gt;In this post, I describe the Pareto Principle, the Peter Principle, the Rumsfeld Matrix, Servant Leadership and Pygmalion Effect, and finally Goleman's Emotional Intelligence Model. &lt;br/&gt;&lt;br/&gt; &lt;img src="/images/peter-principle.jpg" alt="Peter principle" style="width: 200px;"/&gt;&lt;/p&gt;</summary><content type="html">&lt;h2&gt;Introduction&lt;/h2&gt;
&lt;p&gt;This is the fourth article in my series on useful mental models in leadership and self-management:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;&lt;a href="https://bytepawn.com/leadership-models-iceberg-model-six-thinking-hats-trust-equation-circle-of-influence-ooda-loop.html#leadership-models-iceberg-model-six-thinking-hats-trust-equation-circle-of-influence-ooda-loop"&gt;Leadership models I: Iceberg Model, Six Thinking Hats, Trust Equation, Circle of influence, OODA Loop&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://bytepawn.com/leadership-models-growth-mindset-eisenhower-matrix-tuckman-model-cynefin-framework-scarf-model.html#leadership-models-growth-mindset-eisenhower-matrix-tuckman-model-cynefin-framework-scarf-model"&gt;Leadership models II: Growth Mindset, Eisenhower Matrix, Tuckman Model, Cynefin Framework, SCARF Model&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://bytepawn.com/leadership-models-first-principles-thinking.html#leadership-models-first-principles-thinking"&gt;Leadership models III: First Principles Thinking&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;In this post, I will describe the Pareto Principle, the Peter Principle, the Rumsfeld Matrix, Servant Leadership and Pygmalion Effect, and finally Goleman's Emotional Intelligence Model.&lt;/p&gt;
&lt;h2&gt;Pareto Principle&lt;/h2&gt;
&lt;p&gt;The &lt;strong&gt;Pareto Principle&lt;/strong&gt;, also known as the &lt;strong&gt;80/20 Rule&lt;/strong&gt;, was introduced by Italian economist Vilfredo Pareto in the late 19th century. Pareto observed that roughly 80% of the land in Italy was owned by 20% of the population. He developed the principle after observing that 20% of the pea pods in his garden contained 80% of the peas.&lt;/p&gt;
&lt;p&gt;The principle has since been applied to a variety of fields, including business and productivity, where it is often interpreted as follows: &lt;/p&gt;
&lt;ol&gt;
&lt;li&gt;80% of outcomes (outputs) come from 20% of causes (inputs).&lt;/li&gt;
&lt;li&gt;80% of results come from 20% of effort.&lt;/li&gt;
&lt;li&gt;80% of a company's profits come from 20% of its customers or products.&lt;/li&gt;
&lt;/ol&gt;
&lt;p&gt;&lt;img src="/images/pareto.png" alt="Pareto" style="width: 600px;"/&gt;&lt;/p&gt;
&lt;p&gt;It's important to note that the 80/20 ratio is not fixed; the distribution could just as well be 90/10 or 70/30, depending on the situation. The key idea is that a minority of causes, inputs, or effort often leads to a majority of the results, outputs, or rewards. Some practical applications of the Pareto Principle:&lt;/p&gt;
&lt;ol&gt;
&lt;li&gt;&lt;strong&gt;Business:&lt;/strong&gt; In terms of customers, a common rule of thumb is that 80% of a company's revenue comes from 20% of its customers. Therefore, it can be beneficial to focus on the needs and preferences of that 20%.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Productivity:&lt;/strong&gt; When it comes to personal productivity, the Pareto Principle suggests that 20% of our work activities will account for 80% of our results. Therefore, identifying and focusing on these high-impact tasks can significantly improve productivity.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Quality control:&lt;/strong&gt; The Pareto Principle is also used in quality control, where it's known as the Pareto Chart. This tool suggests that by resolving the 20% of the issues causing 80% of the problems, overall quality can be significantly improved.&lt;/li&gt;
&lt;/ol&gt;
&lt;p&gt;The Pareto Principle encourages individuals and organizations to identify and focus on the inputs that lead to the majority of their results, thereby enhancing efficiency and effectiveness.&lt;/p&gt;
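&lt;p&gt;The concentration is easy to measure on real data: sort contributions in descending order and compute the share accounted for by the top 20% of items. A minimal sketch (the per-customer revenue figures are made up, purely for illustration):&lt;/p&gt;

```python
def top_share(values, fraction=0.2):
    """Share of the total contributed by the top `fraction` of items."""
    ordered = sorted(values, reverse=True)
    k = max(1, int(len(ordered) * fraction))
    return sum(ordered[:k]) / sum(ordered)

# Hypothetical revenue per customer, heavily skewed:
revenues = [1000, 800, 50, 40, 30, 25, 20, 15, 10, 10]
print(f"Top 20% of customers bring {top_share(revenues):.0%} of revenue")
# -> Top 20% of customers bring 90% of revenue
```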
&lt;h2&gt;Peter Principle&lt;/h2&gt;
&lt;p&gt;The &lt;strong&gt;Peter Principle&lt;/strong&gt; is a concept in management theory formulated by Dr. Laurence J. Peter in his 1969 book, "The Peter Principle." This principle suggests that in a hierarchical organization, employees tend to rise to their level of incompetence. Here are its core tenets:&lt;/p&gt;
&lt;ol&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;Promotion Based on Current Performance&lt;/strong&gt;: The principle asserts that employees in a hierarchy are typically promoted based on their performance in their current role, rather than their potential to perform in the promoted role. This is based on the assumption that performance in one role directly translates to potential performance in a higher role.&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;Rise to the Level of Incompetence&lt;/strong&gt;: As employees are promoted based on their performance, they will continue to rise through the ranks until they reach a position where they no longer perform competently. At this level, they do not have the skills or abilities to perform effectively, and thus their incompetence becomes evident.&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;Stagnation in Incompetent Positions&lt;/strong&gt;: Once an employee has risen to a level of incompetence, they typically stay in that position. This is because their poor performance in the new role prevents further promotions, but it's not poor enough to warrant demotion or dismissal. &lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;Organizational Inefficiency&lt;/strong&gt;: The accumulation of employees in roles in which they are incompetent can lead to inefficiencies and ineffectiveness within the organization. &lt;/p&gt;
&lt;/li&gt;
&lt;/ol&gt;
&lt;p&gt;The Peter Principle is a reminder for organizations to consider potential capability and suitability for the future role when making promotion decisions, not just performance in a current role.&lt;/p&gt;
&lt;p&gt;&lt;img src="/images/peter-principle.jpg" alt="Peter principle" style="width: 400px;"/&gt;&lt;/p&gt;
&lt;h2&gt;Rumsfeld Matrix&lt;/h2&gt;
&lt;p&gt;The &lt;strong&gt;Rumsfeld Matrix&lt;/strong&gt; was popularized by former United States Secretary of Defense, Donald Rumsfeld, during a news briefing in 2002, although the concept has roots in the field of epistemology. The Rumsfeld Matrix consists of four categories of knowledge:&lt;/p&gt;
&lt;ol&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;Known Knowns&lt;/strong&gt;: These are things we know that we know. They represent our existing knowledge, facts, or information we are aware of and understand.&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;Known Unknowns&lt;/strong&gt;: These are things we know that we don't know. They represent the gaps in our knowledge, areas where we are aware that we lack information or understanding.&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;Unknown Knowns&lt;/strong&gt;: These are things we don't know that we know. They represent tacit knowledge or information we possess but may not be consciously aware of, such as unconscious biases, ingrained habits, or learned behaviors.&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;Unknown Unknowns&lt;/strong&gt;: These are things we don't know that we don't know. They represent the areas of complete ignorance, where we are not even aware of our lack of knowledge or understanding.&lt;/p&gt;
&lt;/li&gt;
&lt;/ol&gt;
&lt;p&gt;The purpose of the Rumsfeld Matrix is to encourage individuals and organizations to consider and explore the different aspects of knowledge, helping them to identify blind spots, ask better questions, and make more informed decisions.&lt;/p&gt;
&lt;p&gt;&lt;img src="/images/rumsfeld-matrix.jpg" alt="Rumsfeld matrix" style="width: 600px;"/&gt;&lt;/p&gt;
&lt;h2&gt;Servant Leadership&lt;/h2&gt;
&lt;p&gt;&lt;strong&gt;Servant Leadership&lt;/strong&gt; is a leadership philosophy in which the primary goal of the leader is to serve others. This is starkly different from traditional leadership, where the leader's main focus is the thriving of their company or organization. A Servant Leader shares power, puts the needs of employees first, and helps people develop and perform as highly as possible. &lt;/p&gt;
&lt;p&gt;This concept was first coined by Robert K. Greenleaf in his 1970 essay, "The Servant as Leader." Greenleaf suggested that the best leaders were servants first, and the key tools for a servant leader are listening, empathy, healing, awareness, persuasion, conceptualization, foresight, stewardship, and commitment to the growth of people and building community.&lt;/p&gt;
&lt;p&gt;Here are the core tenets of Servant Leadership:&lt;/p&gt;
&lt;ol&gt;
&lt;li&gt;&lt;strong&gt;Listening&lt;/strong&gt;: Leaders have traditionally been valued for their communication and decision-making skills. Although these are also important skills for the servant-leader, they need to reinforce these with a deep commitment to listening intently to others.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Empathy&lt;/strong&gt;: The servant-leader strives to understand and empathize with others. Workers may be considered not only as employees, but also as people who have complex personal lives.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Healing&lt;/strong&gt;: Servant-leaders recognize that they also have an opportunity to help make whole those with whom they come in contact.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Awareness&lt;/strong&gt;: General awareness, and especially self-awareness, strengthens the servant-leader.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Persuasion&lt;/strong&gt;: Servant-leaders rely on persuasion, rather than positional authority in making decisions.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Conceptualization&lt;/strong&gt;: Servant-leaders nurture their abilities to dream great dreams.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Foresight&lt;/strong&gt;: Foresight is a characteristic that enables servant-leaders to understand lessons from the past, the realities of the present, and the likely consequence of a decision in the future.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Stewardship&lt;/strong&gt;: Servant leadership, like stewardship, assumes first and foremost a commitment to serving the needs of others.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Commitment to the Growth of People&lt;/strong&gt;: Servant-leaders believe that people have an intrinsic value beyond their tangible contributions as workers.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Building Community&lt;/strong&gt;: The servant-leader senses that much has been lost in recent human history as a result of the shift from local communities to large institutions as the primary shaper of human lives.&lt;/li&gt;
&lt;/ol&gt;
&lt;p&gt;The overall goal of Servant Leadership is to build a more just, caring and sustainable world.&lt;/p&gt;
&lt;p&gt;&lt;img src="/images/servant-leadership.png" alt="Servant leadership" style="width: 600px;"/&gt;&lt;/p&gt;
&lt;h2&gt;Pygmalion Effect&lt;/h2&gt;
&lt;p&gt;The &lt;strong&gt;Pygmalion Effect&lt;/strong&gt; is a psychological principle that describes how our expectations about the abilities of others can influence their actual performance. It was first identified by psychologist Robert Rosenthal and school principal Lenore Jacobson in the late 1960s.&lt;/p&gt;
&lt;p&gt;Here are the main aspects of the Pygmalion Effect:&lt;/p&gt;
&lt;ol&gt;
&lt;li&gt;&lt;strong&gt;Expectations Influence Performance&lt;/strong&gt;: If you have high expectations of someone, they are likely to perform better. Conversely, if you have low expectations of someone, they are likely to perform poorly. This happens because our expectations subtly communicate what we think the other person is capable of, affecting their self-confidence and effort.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Self-fulfilling Prophecy&lt;/strong&gt;: The Pygmalion Effect is a type of self-fulfilling prophecy. When we believe someone will behave in a certain way, we may treat them in ways that induce them to behave as expected, thereby confirming our initial expectations.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Application in Leadership and Education&lt;/strong&gt;: The Pygmalion Effect is especially relevant in leadership and educational settings. Leaders or teachers who believe in their team members' or students' abilities to succeed can inspire those individuals to meet those expectations. However, if a leader or teacher has low expectations, it may negatively affect performance.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Importance of Positive Expectations&lt;/strong&gt;: The Pygmalion Effect underscores the power of positive expectations. By showing others that you believe in their potential, you can motivate them to reach their goals, improving individual and collective performance.&lt;/li&gt;
&lt;/ol&gt;
&lt;p&gt;In summary, the Pygmalion Effect demonstrates the power of expectation and the role it plays in performance. It's an important concept to understand for anyone in a leadership, teaching, or mentoring position.&lt;/p&gt;
&lt;h2&gt;Goleman's Emotional Intelligence Model&lt;/h2&gt;
&lt;p&gt;&lt;strong&gt;Goleman's Emotional Intelligence Model&lt;/strong&gt; was proposed by psychologist and science journalist Daniel Goleman in his 1995 book "Emotional Intelligence: Why It Can Matter More Than IQ." The model outlines five key components of emotional intelligence that are crucial to leadership, personal development, and interpersonal effectiveness. &lt;/p&gt;
&lt;p&gt;Here are the core components of Goleman's model:&lt;/p&gt;
&lt;ol&gt;
&lt;li&gt;&lt;strong&gt;Self-Awareness&lt;/strong&gt;: This refers to the ability to recognize and understand your own emotions, strengths, weaknesses, values, and drives, as well as their impact on others. Self-awareness also includes recognizing your own emotional reactions to people or situations around you.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Self-Regulation (or Self-Management)&lt;/strong&gt;: This is about managing your internal states, impulses, and resources. This includes being able to control or redirect disruptive emotions and adapting to changing circumstances. Essentially, it’s about managing your emotions and keeping them in check.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Motivation&lt;/strong&gt;: This refers to being driven to achieve for the sake of achievement. Motivation includes your personal drive to improve and achieve, commitment to your goals, initiative, or readiness to act on opportunities, and optimism and persistence in pursuing goals, even if things don’t go as planned.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Empathy&lt;/strong&gt;: This is about considering other people's feelings, especially when making decisions. It involves understanding the emotional makeup of other people and treating them according to their emotional reactions. &lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Social Skills (or Relationship Management)&lt;/strong&gt;: This refers to managing relationships to move people in the desired direction. Social skills involve the ability to influence, to communicate clearly, to inspire and influence others, to manage conflict, to foster cooperation, and to build and maintain good personal relationships.&lt;/li&gt;
&lt;/ol&gt;
&lt;p&gt;Goleman’s model underlines that emotional intelligence is as important, if not more important, than cognitive abilities for personal and professional success, especially in leadership roles.&lt;/p&gt;
&lt;h2&gt;Conclusion&lt;/h2&gt;
&lt;p&gt;This concludes my series on mental models useful in Leadership.&lt;/p&gt;</content><category term="leadership"></category><category term="mental"></category><category term="models"></category><category term="first-principles"></category></entry><entry><title>Leadership models III: First Principles Thinking</title><link href="https://bytepawn.com/leadership-models-first-principles-thinking.html" rel="alternate"></link><published>2023-05-16T00:00:00+02:00</published><updated>2023-05-16T00:00:00+02:00</updated><author><name>Marton Trencseni</name></author><id>tag:bytepawn.com,2023-05-16:/leadership-models-first-principles-thinking.html</id><summary type="html">&lt;p&gt;I describe Elon Musk's First Principles Thinking model through several examples: SpaceX, Warren Buffet's Berkshire, Google's organization design, Python's language design, Random Forests and Convolutional Neural Networks. &lt;br/&gt;&lt;br/&gt; &lt;img src="/images/first-principles-thinking.jpg" alt="Elon Musk First Principles Thinking" style="width: 400px;"/&gt;&lt;/p&gt;</summary><content type="html">&lt;h2&gt;Introduction&lt;/h2&gt;
&lt;p&gt;This is the third article in my series on useful mental models in leadership and self-management:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;&lt;a href="https://bytepawn.com/leadership-models-iceberg-model-six-thinking-hats-trust-equation-circle-of-influence-ooda-loop.html#leadership-models-iceberg-model-six-thinking-hats-trust-equation-circle-of-influence-ooda-loop"&gt;Leadership models I: Iceberg Model, Six Thinking Hats, Trust Equation, Circle of influence, OODA Loop&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://bytepawn.com/leadership-models-growth-mindset-eisenhower-matrix-tuckman-model-cynefin-framework-scarf-model.html#leadership-models-growth-mindset-eisenhower-matrix-tuckman-model-cynefin-framework-scarf-model"&gt;Leadership models II: Growth Mindset, Eisenhower Matrix, Tuckman Model, Cynefin Framework, SCARF Model&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;I dedicate a full post to &lt;strong&gt;First Principles Thinking&lt;/strong&gt;, popularized by Elon Musk, CEO of SpaceX and Tesla, who has credited First Principles Thinking as a key factor in his ability to create innovative solutions and businesses. He contrasts it with reasoning by analogy, or taking what's already given and making incremental improvements.&lt;/p&gt;
&lt;h2&gt;First Principles Thinking&lt;/h2&gt;
&lt;p&gt;First Principles Thinking is a problem-solving method that involves breaking down complex problems into their most basic, fundamental elements and then reasoning up from there. The idea is to shed assumptions and received wisdom in order to understand the foundational principles at the heart of the problem:&lt;/p&gt;
&lt;ol&gt;
&lt;li&gt;Identify and Define Your Assumptions: Start by identifying what you think you know about a given problem or situation. These are your assumptions or beliefs that may or may not be true.&lt;/li&gt;
&lt;li&gt;Break Down the Problem into Its Fundamental Principles: Once you have identified your assumptions, start breaking down the problem into its basic, fundamental parts. These are the truths or principles that are so fundamental that they cannot be broken down further.&lt;/li&gt;
&lt;li&gt;Create New Solutions from Scratch: Using these fundamental principles, start to build up a solution from scratch. This is where the innovative solutions can come from, as you're no longer constrained by existing methods or conventional wisdom.&lt;/li&gt;
&lt;/ol&gt;
&lt;p&gt;For example, when Elon Musk was faced with the high cost of space travel, he didn't just accept the established "fact" that rockets are expensive. Instead, he broke the problem down to its first principles, asking why a rocket is expensive and what materials make up a rocket. He discovered that the materials cost of a rocket was about 2% of the typical price. This insight led him to the idea of building cost-effective rockets, a key factor in SpaceX's success.&lt;/p&gt;
&lt;p&gt;&lt;img src="/images/first-principles-thinking.jpg" alt="Elon Musk First Principles Thinking" style="width: 600px;"/&gt;&lt;/p&gt;
&lt;p&gt;First Principles Thinking can be challenging as it forces you to abandon preconceived ideas and assumptions. However, it can also lead to innovative solutions that would have been missed with traditional problem-solving methods.&lt;/p&gt;
&lt;p&gt;Below are illustrations of First Principles Thinking:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;SpaceX and Elon Musk&lt;/li&gt;
&lt;li&gt;Berkshire Hathaway and Warren Buffett&lt;/li&gt;
&lt;li&gt;Google's organization design&lt;/li&gt;
&lt;li&gt;Python&lt;/li&gt;
&lt;li&gt;Random Forests&lt;/li&gt;
&lt;li&gt;Convolutional Neural Networks&lt;/li&gt;
&lt;/ul&gt;
&lt;h2&gt;SpaceX&lt;/h2&gt;
&lt;p&gt;One of the most well-known examples of First Principles Thinking applied by Elon Musk is his approach to &lt;strong&gt;building reusable rockets at SpaceX.&lt;/strong&gt;&lt;/p&gt;
&lt;p&gt;Traditionally, rockets have been single-use. After delivering their payload, they either burn up in the Earth's atmosphere or fall back to Earth and land in the ocean, rendering them unusable for future launches. This method was considered a "given" in the industry. It contributed significantly to the high cost of space travel, as a new rocket had to be built for every launch.&lt;/p&gt;
&lt;p&gt;Musk, however, didn't accept this as a given. Instead, he applied First Principles Thinking. He started by questioning the fundamental premise: Why are rockets single-use? Is it a fundamental law of physics that they have to be? It turned out that it wasn't.&lt;/p&gt;
&lt;p&gt;Next, he broke down the cost of a rocket into its fundamental principles. He found that the raw materials that make up a rocket only account for about 2% of the final cost of a rocket. This meant that the significant cost was coming from the manufacturing and assembly of these materials into a rocket. If a rocket could be reused, then the cost per launch could be dramatically reduced.&lt;/p&gt;
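&lt;p&gt;The economics of reuse can be sketched with a back-of-the-envelope calculation. The numbers below are illustrative assumptions, not SpaceX's actual figures:&lt;/p&gt;

```python
# Illustrative, assumed numbers: not SpaceX's actual costs
build_cost = 60e6                   # cost to manufacture one rocket
materials_cost = 0.02 * build_cost  # raw materials are only ~2% of that
refurb_per_flight = 2e6             # assumed refurbishment cost per reuse

def cost_per_launch(n_flights):
    """Amortize the build cost over n flights, plus per-flight refurbishment."""
    return build_cost / n_flights + refurb_per_flight

print(cost_per_launch(1))   # expendable: 62000000.0
print(cost_per_launch(10))  # reused 10 times: 8000000.0
```

&lt;p&gt;Even with a modest reuse count, the amortized build cost drops by nearly an order of magnitude, which is the core of the first-principles argument for reusability.&lt;/p&gt;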
&lt;p&gt;From these first principles, Musk and his team at SpaceX began developing rockets that can return to Earth and land vertically, allowing them to be reused for future launches. Despite significant skepticism and several early failures, SpaceX has successfully landed multiple rockets back on Earth and reused them, significantly reducing the cost of each individual launch and revolutionizing the space industry in the process. This example illustrates how First Principles Thinking can challenge established norms and lead to innovative solutions, even in industries that have been operating the same way for decades.&lt;/p&gt;
&lt;h2&gt;Berkshire&lt;/h2&gt;
&lt;p&gt;&lt;strong&gt;Warren Buffett and Charlie Munger, the leaders of Berkshire Hathaway&lt;/strong&gt;, are well known for their application of First Principles Thinking in their investment strategies. They have developed a distinctive approach to investing, which involves careful analysis of a company's fundamentals rather than following market trends or short-term fluctuations.&lt;/p&gt;
&lt;p&gt;A fundamental principle of their investing philosophy is the concept of &lt;strong&gt;"intrinsic value"&lt;/strong&gt;. Instead of focusing on external factors like market sentiment or speculative trends, they look at the underlying value of a business. They assess factors such as the company's earnings, the quality of its management, its competitive position, and its potential for growth. If the market price of a company is significantly below its intrinsic value, they see it as a good investment opportunity.&lt;/p&gt;
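&lt;p&gt;One common way to make "intrinsic value" concrete is a discounted-cash-flow (DCF) estimate. The sketch below uses hypothetical cash flows and a textbook Gordon-growth terminal value; it illustrates the concept, not Berkshire's actual valuation method:&lt;/p&gt;

```python
def intrinsic_value(cash_flows, discount_rate, terminal_growth):
    """Present value of projected cash flows plus a Gordon-growth terminal value."""
    pv = sum(cf / (1 + discount_rate) ** t
             for t, cf in enumerate(cash_flows, start=1))
    terminal = cash_flows[-1] * (1 + terminal_growth) / (discount_rate - terminal_growth)
    return pv + terminal / (1 + discount_rate) ** len(cash_flows)

# Hypothetical company: $100M cash flow growing 5%/year for 5 years,
# 10% discount rate, 2% perpetual growth afterwards
flows = [100 * 1.05 ** t for t in range(1, 6)]
value = intrinsic_value(flows, discount_rate=0.10, terminal_growth=0.02)
print(round(value))  # roughly 1446 ($M)
```

&lt;p&gt;The investment decision then follows from comparing this estimate to the market price, ideally buying only well below it, with a margin of safety.&lt;/p&gt;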
&lt;p&gt;This approach requires deep understanding and analysis of a business – its operations, its industry, its competitors, and its risks. This is in contrast to some other investment approaches that focus more on technical analysis of stock price movements or macroeconomic trends.&lt;/p&gt;
&lt;p&gt;Another example of their First Principles Thinking is their focus on &lt;strong&gt;"circle of competence"&lt;/strong&gt;. They believe in investing only in businesses that they understand thoroughly. This means they often avoid hot trends or sectors they are not familiar with, even if these are popular with other investors.&lt;/p&gt;
&lt;p&gt;&lt;img src="/images/circle-of-competence.jpg" alt="Warren Buffett Circle of Competence" style="width: 600px;"/&gt;&lt;/p&gt;
&lt;p&gt;Their First Principles approach to investing has helped Buffett and Munger achieve consistent returns over the long term and build Berkshire Hathaway into one of the largest and most successful investment companies in the world. Their success shows how First Principles Thinking can lead to superior results, even in fields like investing where there is a lot of uncertainty and many different approaches can be successful.&lt;/p&gt;
&lt;h2&gt;Google&lt;/h2&gt;
&lt;p&gt;Google's unique approach to organizational design and management can be seen as an application of First Principles Thinking. Instead of accepting traditional management structures and practices as given, they questioned the fundamental principles of how to organize and manage a company for innovation and productivity.&lt;/p&gt;
&lt;ol&gt;
&lt;li&gt;&lt;strong&gt;20% time policy:&lt;/strong&gt; traditionally, companies expect employees to dedicate 100% of their work time to their assigned tasks and projects. Google, however, applied First Principles Thinking to this issue. They started with the premise that creative and innovative ideas can come from anyone in the organization, and that employees may have interests and abilities that aren't being used in their regular work. From this, they developed the "20% Time" policy, which allows engineers to spend 20% of their time working on any project they choose. The idea is that by giving employees time and freedom to explore their own ideas, they can foster innovation and creativity. This policy has led to the development of some of Google's most successful products, including Gmail, Google News, and AdSense.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Hiring and team formation:&lt;/strong&gt; Instead of simply looking at a candidate's credentials and experience, Google looks at attributes like problem-solving ability, conscientiousness, and comfort with ambiguity. They recognize that the challenges they face are often unique and require innovative thinking, so they value these qualities over specific experience or expertise. They also pay attention to team dynamics, understanding that the success of a team depends not just on the skills of its members, but on their ability to work together effectively.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Objectives and Key Results (OKRs):&lt;/strong&gt; Google has adopted and popularized the use of OKRs, a goal-setting system that requires every team and individual in the company to set ambitious goals and measurable key results each quarter. The OKRs are publicly available to all employees in the organization. This practice promotes transparency, alignment, and focus throughout the organization, ensuring everyone is working towards the same strategic objectives. The public nature of OKRs also fosters a culture of accountability and open communication. Rather than accepting traditional goal-setting or performance management systems, Google's use of OKRs reflects a First Principles approach to driving organizational performance.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Flattened Hierarchy and Openness:&lt;/strong&gt; Google has questioned the traditional hierarchical structure found in many organizations. Instead of a rigid hierarchy, Google promotes a more flattened organizational structure, where the distance between executives and employees is minimized. This is evident in practices such as 'TGIF' meetings (now called 'TGiT' since they moved to Thursdays), where founders Larry Page and Sergey Brin, and later Sundar Pichai, held weekly all-hands meetings where they shared updates and fielded questions, both in person and submitted online, from Googlers around the world. This level of transparency and open communication is unusual in large corporations but reflects Google's First Principles approach to leadership and management.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Focus on Data:&lt;/strong&gt; Google is known for its data-driven decision-making culture. They apply First Principles Thinking in this area by challenging the idea that business decisions are best made based on experience or intuition. Instead, they argue that decisions should be based on data and rigorous analysis. This can be seen in their "People Analytics" function, which applies data analysis methods to HR and management issues. For example, they used data analysis to develop their "Project Oxygen", where they identified the key behaviors of effective managers at Google, challenging preconceived notions of what makes a good manager.&lt;/li&gt;
&lt;/ol&gt;
&lt;p&gt;&lt;img src="/images/google-20percent.jpg" alt="Google 20% percent" style="width: 600px;"/&gt;&lt;/p&gt;
&lt;p&gt;These examples illustrate how Google applies First Principles Thinking to various aspects of its organizational design and management practices, questioning traditional assumptions and developing innovative, effective practices based on fundamental principles.&lt;/p&gt;
&lt;h2&gt;Python&lt;/h2&gt;
&lt;p&gt;Python was created in the late 1980s by Guido van Rossum. At the time, many programming languages were designed with a focus on machine efficiency, resulting in code that could be difficult to write and even more difficult to read. Syntax was often complex and verbose, and different tasks could require vastly different conventions or paradigms. This made many languages difficult to learn and code difficult to maintain.&lt;/p&gt;
&lt;p&gt;Rather than accepting these characteristics as a given, van Rossum took a First Principles approach when designing Python. He started with the premise that &lt;strong&gt;code is read more often than it is written&lt;/strong&gt;, and therefore readability counts. This led to a focus on simplicity and clarity in Python's design. Python's syntax is straightforward and consistent, making it easier to read, write, and learn than many other languages.&lt;/p&gt;
&lt;p&gt;Another first principle was that &lt;strong&gt;there should be one, and preferably only one, obvious way to do it&lt;/strong&gt;. This contrasts with languages like Perl, which proudly proclaims that there is more than one way to do it. This principle guided the design of Python's features and syntax, resulting in a language where different programmers will often write very similar code to perform the same task, which greatly aids readability and maintainability.&lt;/p&gt;
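&lt;p&gt;A small illustration of the "one obvious way" principle: for common tasks, Python has a single idiomatic form that most programmers converge on:&lt;/p&gt;

```python
names = ["Ada", "Grace", "Edsger"]

# Iterating with an index: enumerate() is the one obvious way
for i, name in enumerate(names):
    print(i, name)

# Swapping two variables: tuple unpacking, no temporary needed
a, b = 1, 2
a, b = b, a
assert (a, b) == (2, 1)

# Building a derived list: a comprehension reads as one declarative line
squares = [n * n for n in range(5)]
assert squares == [0, 1, 4, 9, 16]
```

&lt;p&gt;Because almost everyone writes these the same way, reading someone else's Python tends to feel familiar, which is exactly the readability payoff van Rossum was after.&lt;/p&gt;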
&lt;p&gt;These principles have guided Python's development for over three decades, and have helped it become one of the most popular programming languages in the world. This demonstrates how First Principles Thinking can be applied to software design, leading to innovative solutions that challenge the status quo.&lt;/p&gt;
&lt;h2&gt;Random Forests&lt;/h2&gt;
&lt;p&gt;Before the development of ensemble methods like Random Forest, decision trees were a popular method for both classification and regression tasks. They were easy to understand and interpret, but they &lt;strong&gt;often suffered from high variance&lt;/strong&gt;, meaning that they were &lt;strong&gt;prone to overfitting&lt;/strong&gt; the training data and performed poorly on unseen data.&lt;/p&gt;
&lt;p&gt;Instead of trying to tweak or adjust decision trees to make them more robust, Leo Breiman, the inventor of the Random Forest algorithm, went back to first principles. He realized that the high variance was a fundamental issue with decision trees, and that the solution was not to fix individual trees but to change the way they were used.&lt;/p&gt;
&lt;p&gt;This led him to the &lt;strong&gt;concept of bagging&lt;/strong&gt; (bootstrap aggregating), where multiple decision trees are trained on different subsets of the data, and their predictions are aggregated to produce a final result. This approach effectively reduces the variance without increasing the bias, making the model more robust and reliable.&lt;/p&gt;
&lt;p&gt;Random Forests added another layer to this concept: feature randomness. In addition to training each tree on a different subset of the data, Random Forests also use a subset of features at each split in the decision tree. This further increases the diversity of the individual trees in the forest, making the model even more robust.&lt;/p&gt;
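&lt;p&gt;Both ideas, bootstrap sampling and per-split feature randomness, can be sketched in a few lines. The "trees" below are single-split decision stumps to keep the example short; this illustrates the mechanism, not Breiman's full algorithm:&lt;/p&gt;

```python
import numpy as np

rng = np.random.default_rng(0)

def bootstrap_sample(X, y, rng):
    """Bagging step: draw n rows with replacement."""
    idx = rng.integers(0, len(X), size=len(X))
    return X[idx], y[idx]

def fit_stump(X, y, n_sub_features, rng):
    """Feature-randomness step: consider only a random subset of features,
    then keep the best single-feature threshold split among them."""
    feats = rng.choice(X.shape[1], size=n_sub_features, replace=False)
    best = None
    for f in feats:
        t = X[:, f].mean()
        left, right = y[X[:, f] <= t], y[X[:, f] > t]
        ll = int(left.mean() > 0.5) if len(left) else 0    # majority label, left side
        rl = int(right.mean() > 0.5) if len(right) else 0  # majority label, right side
        err = (np.where(X[:, f] <= t, ll, rl) != y).mean()
        if best is None or err < best[0]:
            best = (err, f, t, ll, rl)
    _, f, t, ll, rl = best
    return lambda Z, f=f, t=t, ll=ll, rl=rl: np.where(Z[:, f] <= t, ll, rl)

def forest_predict(stumps, X):
    """Aggregation step: majority vote across the ensemble."""
    votes = np.stack([s(X) for s in stumps])
    return (votes.mean(axis=0) >= 0.5).astype(int)

# Toy data: the class depends on feature 0 only
X = rng.normal(size=(200, 5))
y = (X[:, 0] > 0).astype(int)

stumps = []
for _ in range(25):
    Xb, yb = bootstrap_sample(X, y, rng)
    stumps.append(fit_stump(Xb, yb, n_sub_features=2, rng=rng))

accuracy = (forest_predict(stumps, X) == y).mean()
print("training accuracy:", accuracy)
```

&lt;p&gt;Each individual stump is weak, but the vote over many decorrelated stumps tends to be substantially more accurate, which is exactly the variance reduction Breiman was after.&lt;/p&gt;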
&lt;p&gt;&lt;img src="/images/random-forest-architecture.jpg" alt="Random Forest architecture" style="width: 600px;"/&gt;&lt;/p&gt;
&lt;p&gt;The development of Random Forests has had a significant impact on the field of machine learning. They are one of the most powerful and widely used machine learning algorithms today, and their development has inspired a range of other ensemble methods. This shows how First Principles Thinking can lead to major breakthroughs and innovations.&lt;/p&gt;
&lt;h2&gt;Convolutional Neural Networks&lt;/h2&gt;
&lt;p&gt;First Principles Thinking in data science, machine learning, and AI involves starting from foundational truths about data and algorithms and building up from there. This can lead to more effective models and methods of analysis. Consider the development of convolutional neural networks (CNNs) for image recognition as an example.&lt;/p&gt;
&lt;p&gt;In the early days of AI and machine learning, image recognition was a significant challenge. Traditional algorithms struggled with changes in orientation, size, and lighting conditions in images. The common approach was to engineer features manually, which was very time-consuming and not particularly effective.&lt;/p&gt;
&lt;p&gt;Instead of trying to incrementally improve these methods, researchers applied First Principles Thinking. They started by considering the fundamental nature of the problem: recognizing patterns in pixel data that are invariant to scale, orientation, and other transformations.&lt;/p&gt;
&lt;p&gt;Researchers, notably Yann LeCun and his team, drew inspiration from &lt;strong&gt;the human visual system&lt;/strong&gt;, where &lt;strong&gt;different neurons specialize in recognizing different patterns&lt;/strong&gt; and these patterns are combined to understand the whole image. This led them to design a completely new kind of neural network architecture, the Convolutional Neural Network, which incorporates convolutional layers that can automatically learn a variety of spatial hierarchies of features, effectively learning from the fundamental features of the image data.&lt;/p&gt;
&lt;p&gt;&lt;img src="/images/cnn-patterns.png" alt="CNN patterns" style="width: 600px;"/&gt;&lt;/p&gt;
&lt;p&gt;The convolutional layer uses a set of learnable filters. Each filter is spatially small but extends through the full depth of the input volume. During the forward pass, each filter is convolved across the width and height of the input volume, computing the dot product between the entries of the filter and the input and producing a 2-dimensional activation map of that filter. As a result, the network learns filters that activate when they see some type of visual feature: in the first layer, simple features such as an edge of some orientation or a blotch of some color.&lt;/p&gt;
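&lt;p&gt;The sliding dot product described above can be written out directly. The sketch below convolves a toy image with a hand-coded vertical-edge filter; in a real CNN the filter weights are learned rather than hand-coded:&lt;/p&gt;

```python
import numpy as np

def conv2d(image, kernel):
    """Valid 2-D convolution (cross-correlation), stride 1, no padding:
    slide the kernel over the image, taking a dot product at each position."""
    H, W = image.shape
    kh, kw = kernel.shape
    out = np.zeros((H - kh + 1, W - kw + 1))
    for i in range(out.shape[0]):
        for j in range(out.shape[1]):
            out[i, j] = np.sum(image[i:i + kh, j:j + kw] * kernel)
    return out

# A Sobel-style filter that responds to vertical edges
edge_filter = np.array([[1.0, 0.0, -1.0],
                        [2.0, 0.0, -2.0],
                        [1.0, 0.0, -1.0]])

# Toy 6x6 image: dark left half, bright right half, i.e. one vertical edge
img = np.zeros((6, 6))
img[:, 3:] = 1.0

activation = conv2d(img, edge_filter)
print(activation)  # nonzero only in the columns straddling the edge
```

&lt;p&gt;The activation map is zero over the flat regions and nonzero only where the edge is: the "filter activates on a visual feature" behavior described above.&lt;/p&gt;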
&lt;p&gt;CNNs revolutionized the field of image recognition and now form the basis for most image recognition technology, including facial recognition and self-driving cars. This is a great example of how First Principles Thinking can lead to significant breakthroughs in technology and AI.&lt;/p&gt;
&lt;h2&gt;Conclusion&lt;/h2&gt;
&lt;p&gt;First Principles Thinking is a problem-solving approach that involves breaking down complex problems into their fundamental, self-evident truths and then reasoning up from these base truths to create new understanding or solutions. This method encourages innovative thinking by challenging existing assumptions and conventions, and building new solutions from the ground up.&lt;/p&gt;</content><category term="leadership"></category><category term="mental"></category><category term="models"></category><category term="first-principles"></category></entry><entry><title>Leadership models II: Growth Mindset, Eisenhower Matrix, Tuckman Model, Cynefin Framework, SCARF Model</title><link href="https://bytepawn.com/leadership-models-growth-mindset-eisenhower-matrix-tuckman-model-cynefin-framework-scarf-model.html" rel="alternate"></link><published>2023-05-14T00:00:00+02:00</published><updated>2023-05-14T00:00:00+02:00</updated><author><name>Marton Trencseni</name></author><id>tag:bytepawn.com,2023-05-14:/leadership-models-growth-mindset-eisenhower-matrix-tuckman-model-cynefin-framework-scarf-model.html</id><summary type="html">&lt;p&gt;I describe the following mental models useful in leadership and self-management: the Growth Mindset, Eisenhower Matrix, Tuckman Model, Cynefin Framework, SCARF Model. &lt;br/&gt;&lt;br/&gt; &lt;img src="/images/growth-mindset.png" alt="Trust equation" style="width: 400px;"/&gt;&lt;/p&gt;</summary><content type="html">&lt;h2&gt;Introduction&lt;/h2&gt;
&lt;p&gt;This year I'm attending a multi-day leadership course that spans 6 months. One of the things the coaches talk about is various mental models useful in leadership and self-management. These are some of the mental models mentioned in the course.&lt;/p&gt;
&lt;h2&gt;Growth Mindset&lt;/h2&gt;
&lt;p&gt;The &lt;strong&gt;Growth Mindset&lt;/strong&gt; is a concept introduced by psychologist Carol Dweck in her book "Mindset: The New Psychology of Success." It is centered around the belief that intelligence, abilities, and talents can be developed through effort, learning, and persistence. The Growth Mindset contrasts with the Fixed Mindset, which assumes that these traits are innate and unchangeable. The core tenets of the Growth Mindset are:&lt;/p&gt;
&lt;ol&gt;
&lt;li&gt;Embrace challenges: People with a growth mindset see challenges as opportunities for growth and development. They are more likely to take risks, try new things, and persist in the face of difficulties.&lt;/li&gt;
&lt;li&gt;Value effort: In a growth mindset, effort is seen as a critical factor in achieving success. Hard work, dedication, and practice are valued as essential components of improvement and growth.&lt;/li&gt;
&lt;li&gt;Learn from criticism and feedback: Individuals with a growth mindset are open to constructive criticism and feedback, viewing it as an opportunity to learn and improve their skills and understanding.&lt;/li&gt;
&lt;li&gt;Be inspired by others' success: Instead of feeling threatened by the success of others, people with a growth mindset are inspired by it. They see the achievements of others as evidence that they too can improve and succeed.&lt;/li&gt;
&lt;li&gt;Focus on continuous learning: A growth mindset prioritizes learning and self-improvement over simply proving oneself or maintaining a specific image. People with a growth mindset are lifelong learners, always looking for ways to expand their knowledge and abilities.&lt;/li&gt;
&lt;li&gt;Develop resilience and persistence: Individuals with a growth mindset are more likely to persevere in the face of setbacks and obstacles. They understand that failure is a natural part of the learning process and an opportunity to learn, adapt, and grow.&lt;/li&gt;
&lt;/ol&gt;
&lt;p&gt;&lt;img src="/images/growth-mindset.png" alt="Growth Mindset" style="width: 600px;"/&gt;&lt;/p&gt;
&lt;p&gt;Cultivating a growth mindset can lead to increased motivation, higher achievement, and greater overall satisfaction in both personal and professional aspects of life. By embracing the principles of the growth mindset, individuals can unlock their full potential and foster a love of learning and self-improvement.&lt;/p&gt;
&lt;h2&gt;Eisenhower Matrix&lt;/h2&gt;
&lt;p&gt;The &lt;strong&gt;Eisenhower Matrix&lt;/strong&gt; is a time management tool that helps individuals prioritize tasks by urgency and importance. The tool is named after Dwight D. Eisenhower, the 34th President of the United States, who is credited with saying, &lt;em&gt;"I have two kinds of problems, the urgent and the important. The urgent are not important, and the important are never urgent."&lt;/em&gt;&lt;/p&gt;
&lt;p&gt;The matrix consists of a 2x2 grid, creating four quadrants:&lt;/p&gt;
&lt;ol&gt;
&lt;li&gt;&lt;strong&gt;Quadrant I - Important and Urgent:&lt;/strong&gt; These are tasks that need to be done immediately. They often involve critical issues that carry significant consequences if not addressed promptly. Examples could include crises, deadlines, or problems that need immediate attention.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Quadrant II - Important but Not Urgent:&lt;/strong&gt; These are tasks that contribute to long-term goals and values but do not need to be done immediately. This quadrant often involves strategic planning, relationship building, personal growth, and preventive activities. Spending more time on these tasks can reduce the frequency of urgent issues arising.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Quadrant III - Not Important but Urgent:&lt;/strong&gt; These tasks require immediate attention but do not contribute significantly to achieving long-term goals or values. They often involve interruptions or distractions, such as unimportant emails, phone calls, or meetings. These tasks can often be delegated to others.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Quadrant IV - Not Important and Not Urgent:&lt;/strong&gt; These tasks do not contribute to long-term goals and do not require immediate attention. They often involve time-wasting activities or tasks that offer little to no value. These tasks should be minimized or eliminated.&lt;/li&gt;
&lt;/ol&gt;
&lt;p&gt;&lt;img src="/images/eisenhower-matrix.png" alt="Eisenhower matrix" style="width: 600px;"/&gt;&lt;/p&gt;
&lt;p&gt;The core tenets of the Eisenhower Matrix include:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;&lt;strong&gt;Distinction between Urgency and Importance:&lt;/strong&gt; Recognizing that urgent tasks are not always important, and important tasks are not always urgent, is key to effective use of the matrix.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Prioritization:&lt;/strong&gt; The matrix helps individuals prioritize tasks based on their importance and urgency, leading to more effective time management and productivity.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Delegation:&lt;/strong&gt; Tasks that are urgent but not important can often be delegated to others, freeing up time for more important activities.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Elimination:&lt;/strong&gt; Unimportant and non-urgent tasks should be eliminated as they consume time and resources without contributing to long-term goals or values.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Focus on Quadrant II:&lt;/strong&gt; The goal is to maximize time spent on Quadrant II activities, which contributes most to long-term success and can prevent issues from becoming urgent crises.&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;By using the Eisenhower Matrix, individuals can manage their time more effectively, focusing on what truly matters and reducing stress and burnout.&lt;/p&gt;
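&lt;p&gt;The four quadrants boil down to a simple decision rule, sketched here as a small function (the task names and action labels are illustrative):&lt;/p&gt;

```python
def eisenhower(important: bool, urgent: bool) -> str:
    """Map a task's (important, urgent) flags to the recommended action."""
    if important and urgent:
        return "Q1: do it now"
    if important:
        return "Q2: schedule it"
    if urgent:
        return "Q3: delegate it"
    return "Q4: eliminate it"

tasks = {
    "production outage": (True, True),
    "quarterly planning": (True, False),
    "recurring status email": (False, True),
    "doomscrolling": (False, False),
}
for task, (imp, urg) in tasks.items():
    print(f"{task}: {eisenhower(imp, urg)}")
```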
&lt;h2&gt;Tuckman Model&lt;/h2&gt;
&lt;p&gt;The &lt;strong&gt;Tuckman Model&lt;/strong&gt;, also known as Tuckman's Stages of Group Development, is a framework proposed by psychologist Bruce Tuckman in 1965 to describe the path that most teams follow on their way to high performance. The model outlines four distinct stages:&lt;/p&gt;
&lt;ol&gt;
&lt;li&gt;&lt;strong&gt;Forming:&lt;/strong&gt; The initial stage of team development where team members are polite and positive, but also anxious as they haven't fully understood what work the team will do. There is typically a focus on defining the team's goals, structure and leadership, with team members starting to take on roles and responsibilities.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Storming:&lt;/strong&gt; As team members start to work together, they begin to push against the boundaries established in the forming stage. This can result in conflict within the team as individuals jockey for position, resist control by group leaders, and start voicing differing opinions about the team's direction. Despite its name, the "storming" phase isn't always negative: it can also be a time of healthy, constructive debate.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Norming:&lt;/strong&gt; Gradually, the team moves into the norming stage, where people start to resolve the conflicts that surfaced during the storming stage. The team becomes more aware of their collective goals, and they begin to appreciate each other's skills and experience. Norms and roles are established, and the team becomes more cohesive and functional.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Performing:&lt;/strong&gt; The team reaches the performing stage when hard work leads, without friction, to the achievement of the team's goals. The structures and processes established in the earlier stages support this well. The leader can now delegate much of the work and concentrate on developing team members. This stage is often characterized by high levels of trust, autonomy, and interdependence.&lt;/li&gt;
&lt;li&gt;Later, Tuckman added a fifth stage, &lt;strong&gt;Adjourning&lt;/strong&gt; (or "Mourning"), where the team completes the work and moves on from the project. This can be a challenging stage as team members may have developed strong bonds and may feel a sense of loss upon the project's completion.&lt;/li&gt;
&lt;/ol&gt;
&lt;p&gt;&lt;img src="/images/tuckman.jpg" alt="Eisenhower matrix" style="width: 600px;"/&gt;&lt;/p&gt;
&lt;p&gt;The core tenets of the Tuckman Model are:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;&lt;strong&gt;Sequential Development:&lt;/strong&gt; Teams tend to develop in stages, each of which presents unique challenges and opportunities for growth and development.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Conflict as a Natural Part of Development:&lt;/strong&gt; Conflict isn't necessarily a bad thing, and can be a natural part of the team development process.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;The Role of Leadership:&lt;/strong&gt; The leader's role may change at each stage, from giving direction in the forming stage to fostering independence and competence in the performing stage.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;The Importance of Clear Goals and Roles:&lt;/strong&gt; Clear goals and well-defined roles can facilitate the team's progress through each stage.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Transition and Closure:&lt;/strong&gt; Teams should acknowledge and plan for the end of the project or task, helping team members transition and bring closure to the experience.&lt;/li&gt;
&lt;/ul&gt;
&lt;h2&gt;Cynefin Framework&lt;/h2&gt;
&lt;p&gt;The &lt;strong&gt;Cynefin Framework&lt;/strong&gt;, developed by Dave Snowden, is a decision-making model that helps leaders and organizations navigate the complexity of different problems and determine the most appropriate approach for each situation. The framework categorizes problems or contexts into five domains, each with its unique characteristics and recommended strategies:&lt;/p&gt;
&lt;ol&gt;
&lt;li&gt;&lt;strong&gt;Simple:&lt;/strong&gt; In this domain, problems have clear cause-and-effect relationships and can be solved using established best practices. The recommended approach is to sense the situation, categorize the problem, and then respond using known solutions.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Complicated:&lt;/strong&gt; Problems in this domain have multiple possible solutions, and cause-and-effect relationships can be determined with expert analysis. The recommended approach is to sense the situation, analyze the problem using expert knowledge, and then respond with the most suitable solution.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Complex:&lt;/strong&gt; In this domain, cause-and-effect relationships can only be understood in retrospect due to the unpredictable nature of the problem. The recommended approach is to probe the situation by conducting experiments, sense the emerging patterns, and then respond by adapting to the evolving conditions.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Chaotic:&lt;/strong&gt; Problems in this domain are characterized by high levels of uncertainty, with no apparent cause-and-effect relationships. The recommended approach is to act immediately to stabilize the situation, sense the new conditions, and then respond by adapting or creating novel solutions.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Disorder:&lt;/strong&gt; This domain represents situations where it is unclear which of the other four domains applies. The objective in this context is to gather more information, break the problem down into smaller components, and then assign each component to the appropriate domain for further action.&lt;/li&gt;
&lt;/ol&gt;
&lt;p&gt;&lt;img src="/images/cynefin.png" alt="Cynefin framework" style="width: 600px;"/&gt;&lt;/p&gt;
&lt;p&gt;By understanding the nature of a problem and determining which domain it falls into, leaders can use the Cynefin Framework to choose the most appropriate decision-making approach and improve their ability to address complex and unpredictable challenges.&lt;/p&gt;
&lt;h2&gt;SCARF Model&lt;/h2&gt;
&lt;p&gt;The &lt;strong&gt;SCARF Model&lt;/strong&gt; is a social cognitive neuroscience model devised by Dr. David Rock. SCARF stands for Status, Certainty, Autonomy, Relatedness, and Fairness. These are the five key domains that influence our behavior in social situations:&lt;/p&gt;
&lt;ol&gt;
&lt;li&gt;&lt;strong&gt;Status:&lt;/strong&gt; This refers to relative importance to others. Status is about where you rank in relation to those around you. Threats to status, such as feeling inferior or being disrespected, can be incredibly powerful in driving behavior.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Certainty:&lt;/strong&gt; This pertains to our ability to predict the future. The brain craves certainty to make accurate predictions about the world. Uncertainty provokes a threat response and can therefore be stressful and draining.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Autonomy:&lt;/strong&gt; This concerns the sense of control over events. Having a sense of autonomy, the feeling that you have some control over your environment, is crucial for our sense of wellbeing. A lack of autonomy can lead to stress, which impairs performance and morale.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Relatedness:&lt;/strong&gt; This involves the sense of safety with others, of being 'in' or 'out' of a social group. It reflects our need to feel connected to others. When we perceive someone as being in our 'in-group', we work more collaboratively and cooperatively with them.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Fairness:&lt;/strong&gt; This reflects the perception of fair exchanges between people. The perception of unfairness can provoke a strong threat response and can significantly impact workplace performance and engagement.&lt;/li&gt;
&lt;/ol&gt;
&lt;p&gt;These concepts are all drawn from neuroscience research which suggests that the brain treats many social threats and rewards with the same intensity as physical threats and rewards. For instance, a perceived threat to one's status activates the same brain networks as a threat to one's life. By understanding these triggers, leaders and managers can learn to minimize threats and maximize rewards to help their teams be more effective, engaged, and collaborative.&lt;/p&gt;
&lt;p&gt;&lt;img src="/images/scarf.png" alt="Scarf model" style="width: 600px;"/&gt;&lt;/p&gt;</content><category term="leadership"></category><category term="mental"></category><category term="models"></category><category term="growth-mindset"></category><category term="eisenhower-matrix"></category><category term="tuckman"></category><category term="cynefin"></category><category term="scarf"></category></entry><entry><title>Leadership models I: Iceberg Model, Six Thinking Hats, Trust Equation, Circle of influence, OODA Loop</title><link href="https://bytepawn.com/leadership-models-iceberg-model-six-thinking-hats-trust-equation-circle-of-influence-ooda-loop.html" rel="alternate"></link><published>2023-05-12T00:00:00+02:00</published><updated>2023-05-12T00:00:00+02:00</updated><author><name>Marton Trencseni</name></author><id>tag:bytepawn.com,2023-05-12:/leadership-models-iceberg-model-six-thinking-hats-trust-equation-circle-of-influence-ooda-loop.html</id><summary type="html">&lt;p&gt;I describe the following mental models useful in leadership and self-management: Iceberg Model, Six Thinking Hats, Trust Equation, Circle of Influence and OODA Loop. &lt;br/&gt;&lt;br/&gt; &lt;img src="/images/trust-equation.jpg" alt="Trust equation" style="width: 400px;"/&gt;&lt;/p&gt;</summary><content type="html">&lt;h2&gt;Introduction&lt;/h2&gt;
&lt;p&gt;This year I'm attending a leadership course consisting of multi-day sessions spread over 6 months. One of the things the coaches talk about is various mental models useful in leadership and self-management. These are some of the mental models mentioned in the course.&lt;/p&gt;
&lt;h2&gt;Iceberg Model&lt;/h2&gt;
&lt;p&gt;The organizational &lt;strong&gt;Iceberg Model&lt;/strong&gt; is used to understand organizational culture and change management. It emphasizes the visible and tangible aspects of an organization (above the water) and the more subtle, intangible, and often unconscious elements (below the water) that contribute to an organization's overall culture and performance.&lt;/p&gt;
&lt;p&gt;Above the water:&lt;/p&gt;
&lt;ol&gt;
&lt;li&gt;&lt;strong&gt;Organizational Structure:&lt;/strong&gt; The formal hierarchy, roles, and reporting relationships within an organization.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Processes:&lt;/strong&gt; The systems, procedures, and protocols that govern how work is carried out, including decision-making, communication, and problem-solving.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Skills:&lt;/strong&gt; The abilities, competencies, and expertise of individuals within the organization.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Behaviors:&lt;/strong&gt; The observable actions and interactions of individuals and teams, reflecting how people conduct themselves in the workplace.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Vision:&lt;/strong&gt; The long-term direction, goals, and aspirations of the organization.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Strategy:&lt;/strong&gt; The plan or approach to achieving the organization's vision, including the allocation of resources, priorities, and milestones.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Stated Values:&lt;/strong&gt; The explicit principles, beliefs, or ethical guidelines that the organization claims to prioritize and uphold.&lt;/li&gt;
&lt;/ol&gt;
&lt;p&gt;Below the water:&lt;/p&gt;
&lt;ol&gt;
&lt;li&gt;&lt;strong&gt;Feelings:&lt;/strong&gt; The emotions, moods, and attitudes of individuals within the organization, which can influence behavior, decision-making, and overall performance.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Norms:&lt;/strong&gt; The informal, unwritten rules or expectations that govern behavior and interactions within the organization.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Values:&lt;/strong&gt; The deeply-held beliefs and priorities that influence decision-making, actions, and attitudes, often operating at an unconscious level.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Unwritten Rules:&lt;/strong&gt; The implicit guidelines, habits, or customs that influence behavior and decision-making within the organization, even though they may not be formally documented or acknowledged.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Shared Assumptions:&lt;/strong&gt; The commonly-held beliefs or premises that individuals within the organization take for granted, which can shape their perceptions, actions, and interpretations of events.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Symbols:&lt;/strong&gt; The visual, verbal, or physical representations that convey meaning, values, or identity within the organization, such as logos, jargon, or office layouts.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Stories:&lt;/strong&gt; The narratives, anecdotes, or legends that circulate within the organization, reflecting its history, culture, and values, and helping to shape its collective identity.&lt;/li&gt;
&lt;/ol&gt;
&lt;p&gt;&lt;img src="/images/iceberg.jpg" alt="Iceberg" style="width: 400px;"/&gt;&lt;/p&gt;
&lt;p&gt;The organizational Iceberg Model highlights the importance of considering both the overt, visible aspects of an organization and the underlying, often hidden elements that influence its culture and effectiveness. By addressing both the "above water" and "below water" components, leaders can develop a more comprehensive understanding of their organization and implement more targeted and sustainable change initiatives.&lt;/p&gt;
&lt;h2&gt;Six Thinking Hats&lt;/h2&gt;
&lt;p&gt;The &lt;strong&gt;Six Thinking Hats&lt;/strong&gt; is a parallel thinking process and decision-making tool developed by Dr. Edward de Bono. The model encourages individuals or teams to approach problems or discussions from six different perspectives by "wearing" different metaphorical hats. Each hat represents a specific mode of thinking, and the purpose is to facilitate more efficient, creative, and comprehensive problem-solving, decision-making, and communication. The six hats are:&lt;/p&gt;
&lt;ol&gt;
&lt;li&gt;&lt;strong&gt;Blue Hat:&lt;/strong&gt; Focused on managing the thinking process. When wearing the blue hat, participants oversee the overall discussion, ensuring that the group stays on track, follows the thinking process, and reaches a conclusion or decision.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;White Hat:&lt;/strong&gt; Focused on data and information. When wearing the white hat, participants gather, analyze, and share facts, figures, and objective information related to the topic.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Red Hat:&lt;/strong&gt; Focused on emotions and intuition. When wearing the red hat, participants express their feelings, emotions, and gut reactions to the issue without the need for justification or rationalization.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Green Hat:&lt;/strong&gt; Focused on creativity and innovation. When wearing the green hat, participants generate new ideas, alternative solutions, and innovative approaches, encouraging lateral thinking and brainstorming.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Yellow Hat:&lt;/strong&gt; Focused on optimism and benefits. When wearing the yellow hat, participants explore the positive aspects, advantages, and opportunities associated with an idea or proposal, considering the best-case scenarios and potential value.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Black Hat:&lt;/strong&gt; Focused on critical thinking and potential problems. When wearing the black hat, participants assess potential risks, challenges, and obstacles, identifying why a proposed solution may not work or what could go wrong.&lt;/li&gt;
&lt;/ol&gt;
&lt;p&gt;&lt;img src="/images/six-thinking-hats.jpg" alt="Six thinking hats" style="width: 400px;"/&gt;&lt;/p&gt;
&lt;p&gt;By using the Six Thinking Hats approach, individuals and teams can explore different aspects of a problem, reduce conflict and biases, and make more balanced and well-informed decisions.&lt;/p&gt;
&lt;h2&gt;Trust Equation&lt;/h2&gt;
&lt;p&gt;The &lt;strong&gt;Trust Equation&lt;/strong&gt;, developed by Charles H. Green, is a model that defines the key components of trustworthiness in professional relationships. It highlights four elements that contribute to building and maintaining trust between individuals. The Trust Equation is represented as follows:&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Trustworthiness = (Credibility + Reliability + Intimacy) / Self-Orientation&lt;/strong&gt;&lt;/p&gt;
&lt;ol&gt;
&lt;li&gt;&lt;strong&gt;Credibility:&lt;/strong&gt; This component is about the perception of your expertise, knowledge, and competence in a given area. Credibility is established through demonstrating your skills, providing accurate information, and offering well-informed opinions.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Reliability:&lt;/strong&gt; This component refers to the consistency of your actions and the extent to which you can be counted on to deliver on your promises. Being dependable, punctual, and accountable for your commitments contributes to building reliability.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Intimacy:&lt;/strong&gt; This component relates to the emotional safety and closeness that others feel when they interact with you. Establishing intimacy requires active listening, empathy, and the ability to create a comfortable environment where others feel safe to share their thoughts and feelings.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Self-Orientation:&lt;/strong&gt; This component represents the extent to which you prioritize your own interests over those of others. High self-orientation can undermine trust, as it suggests that you may act in a self-serving manner. Lowering self-orientation involves demonstrating genuine concern for others' needs and focusing on building mutually beneficial relationships.&lt;/li&gt;
&lt;/ol&gt;
&lt;p&gt;&lt;img src="/images/trust-equation.jpg" alt="Trust equation" style="width: 400px;"/&gt;&lt;/p&gt;
&lt;p&gt;The Trust Equation provides a framework for understanding and developing trust in professional relationships. By enhancing credibility, reliability, and intimacy while reducing self-orientation, individuals can improve their trustworthiness and foster stronger, more productive relationships with colleagues, clients, and stakeholders.&lt;/p&gt;
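&lt;p&gt;As a toy illustration (my own sketch, not part of Green's model), the equation can be computed directly; note how lowering Self-Orientation moves the score more than raising any single numerator component:&lt;/p&gt;

```python
def trustworthiness(credibility, reliability, intimacy, self_orientation):
    # Trust Equation: (C + R + I) / S, with all four scored on the same scale, e.g. 1-10.
    return (credibility + reliability + intimacy) / self_orientation

# Same numerator, different self-orientation:
print(trustworthiness(7, 8, 6, 3))  # 7.0
print(trustworthiness(7, 8, 6, 1))  # 21.0
```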
&lt;h2&gt;Circle of Influence&lt;/h2&gt;
&lt;p&gt;The &lt;strong&gt;Circle of Influence&lt;/strong&gt; and &lt;strong&gt;Circle of Concern&lt;/strong&gt; concept is a mental model popularized by Stephen Covey. It is a tool designed to help individuals focus their time and energy on the areas where they can make the most impact, leading to increased effectiveness and personal growth. The model consists of two components:&lt;/p&gt;
&lt;ol&gt;
&lt;li&gt;&lt;strong&gt;Circle of Concern:&lt;/strong&gt; This circle encompasses all the things that an individual may be worried or concerned about, but over which they have little or no control. Examples include global events, economic conditions, or the actions and decisions of other people. Focusing too much on the Circle of Concern can lead to feelings of helplessness and frustration, as it involves situations where the individual cannot directly influence the outcome.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Circle of Influence:&lt;/strong&gt; This circle represents the things that an individual can directly affect or control, such as their own behavior, decisions, and reactions. Examples include personal habits, work performance, or relationships with others. Focusing on the Circle of Influence empowers individuals to take responsibility for their actions and proactively address the challenges they face.&lt;/li&gt;
&lt;/ol&gt;
&lt;p&gt;&lt;img src="/images/circle-of-influence.jpg" alt="Circle of Influence" style="width: 600px;"/&gt;&lt;/p&gt;
&lt;p&gt;The key to using the Circle of Influence and Circle of Concern concept effectively is to concentrate on the areas within one's Circle of Influence, rather than dwelling on the concerns that are beyond their control. By doing so, individuals can make a tangible impact on their lives and may even expand their Circle of Influence over time. This approach helps to cultivate a proactive mindset, leading to increased productivity, effectiveness, and personal satisfaction.&lt;/p&gt;
&lt;h2&gt;OODA Loop&lt;/h2&gt;
&lt;p&gt;The &lt;strong&gt;OODA Loop (Observe, Orient, Decide, Act)&lt;/strong&gt; is a decision-making model developed by U.S. Air Force Colonel John Boyd. It provides a framework for understanding and responding to rapidly changing situations, particularly in competitive or adversarial environments. The OODA Loop emphasizes the importance of agility, adaptability, and quick decision-making. The model consists of four components:&lt;/p&gt;
&lt;ol&gt;
&lt;li&gt;&lt;strong&gt;Observe:&lt;/strong&gt; The first step involves gathering data and information about the current situation, including the actions of competitors or adversaries, changes in the environment, and any other relevant factors. This step requires maintaining situational awareness and staying up-to-date with the latest developments.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Orient:&lt;/strong&gt; In this step, you analyze the collected data to understand the context, identify patterns or trends, and evaluate the implications for your goals or objectives. This may involve updating your mental models, considering various perspectives, and recognizing your own biases and assumptions.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Decide:&lt;/strong&gt; Based on your understanding of the situation, you develop a plan or make a decision about the best course of action. This step requires weighing different options, evaluating their potential risks and rewards, and choosing the most appropriate response given the available information.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Act:&lt;/strong&gt; Once a decision has been made, you implement the chosen course of action as quickly and effectively as possible. This may involve coordinating with team members, allocating resources, or executing specific tasks or maneuvers.&lt;/li&gt;
&lt;/ol&gt;
&lt;p&gt;&lt;img src="/images/ooda.jpg" alt="OODA Loop" style="width: 600px;"/&gt;&lt;/p&gt;
&lt;p&gt;After completing the Act step, the OODA Loop continues as you return to the Observe phase, monitoring the effects of your actions and gathering new information about the evolving situation. This iterative process allows for continuous learning, adaptation, and improvement, enabling individuals or organizations to respond effectively to dynamic and uncertain environments.&lt;/p&gt;</content><category term="leadership"></category><category term="mental"></category><category term="models"></category><category term="iceberg"></category><category term="thinking-hats"></category><category term="trust-equation"></category><category term="ooda"></category></entry><entry><title>GPT News Poet: silly AI poems based on today's news</title><link href="https://bytepawn.com/gpt-news-poet-silly-ai-poems-based-on-todays-news.html" rel="alternate"></link><published>2023-05-07T00:00:00+02:00</published><updated>2023-05-07T00:00:00+02:00</updated><author><name>Marton Trencseni</name></author><id>tag:bytepawn.com,2023-05-07:/gpt-news-poet-silly-ai-poems-based-on-todays-news.html</id><summary type="html">&lt;p&gt;I show how I used GNews and the OpenAI API to build &lt;a href="https://gptnewspoet.bytepawn.com"&gt;GPT News Poet&lt;/a&gt;, a fun toy site showing silly AI poems based on today's news. &lt;br/&gt;&lt;br/&gt; &lt;img src="/images/gpt-news-poet-example-short.jpg" alt="GPT News Poet example" style="width: 400px;"/&gt;&lt;/p&gt;</summary><content type="html">&lt;h2&gt;Introduction&lt;/h2&gt;
&lt;p&gt;I think one of the coolest (and most useless) uses of &lt;a href="https://en.wikipedia.org/wiki/Large_language_model"&gt;Large Language Models&lt;/a&gt; is generating poetry. This resonates strongly with me, because I could never write poetry of any quality, but I do appreciate silly 4-liners. &lt;a href="https://gptnewspoet.bytepawn.com"&gt;It's live here!&lt;/a&gt;&lt;/p&gt;
&lt;p&gt;&lt;img src="/images/gpt-news-poet-example-long.jpg" alt="ROC curve" style="width: 600px;"/&gt;&lt;/p&gt;
&lt;p&gt;&lt;a href="https://github.com/mtrencseni/gpt-news-poet"&gt;The code is up on Github.&lt;/a&gt;&lt;/p&gt;
&lt;h2&gt;News&lt;/h2&gt;
&lt;p&gt;There are multiple sites to get news from. For this toy project I did not want to scrape; I wanted a site with a nice API that already returns structured JSON. &lt;a href="https://gnews.io/docs/v4?python#top-headlines-endpoint"&gt;GNews&lt;/a&gt; (not Google News) is exactly that: it's free up to 100 requests/day, and this project only needs 1 request/day.&lt;/p&gt;
&lt;p&gt;The endpoint is:&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;https://gnews.io/api/v4/top-headlines?category={category}&amp;amp;lang=en&amp;amp;country=us&amp;amp;max=10&amp;amp;apikey={apikey}
&lt;/pre&gt;&lt;/div&gt;


&lt;p&gt;It returns a list of JSON dictionaries that looks like this:&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;
    &lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="nt"&gt;&amp;quot;title&amp;quot;&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="s2"&gt;&amp;quot;Loaf-size mission to improve hurricane forecasting is ready to launch&amp;quot;&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="nt"&gt;&amp;quot;description&amp;quot;&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="s2"&gt;&amp;quot;A new NASA mission called TROPICS, designed to improve hurricane forecasting, is ready to launch ahead of the June 1 arrival of the 2023 Atlantic hurricane season.&amp;quot;&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="nt"&gt;&amp;quot;content&amp;quot;&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="s2"&gt;&amp;quot;Sign up for CNN’s Wonder Theory science newsletter. Explore the universe with news on fascinating discoveries, scientific advancements and more.\nCNN —\nA new mission designed to improve hurricane forecasting is ready to launch, just ahead of the June ... [3592 chars]&amp;quot;&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="nt"&gt;&amp;quot;url&amp;quot;&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="s2"&gt;&amp;quot;https://www.cnn.com/2023/05/07/world/nasa-tropics-mission-launch-scn/index.html&amp;quot;&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="nt"&gt;&amp;quot;image&amp;quot;&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="s2"&gt;&amp;quot;https://media.cnn.com/api/v1/images/stellar/prod/230428155358-01-nasa-tropics-mission-042023.jpg?c=16x9&amp;amp;q=w_800,c_fill&amp;quot;&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="nt"&gt;&amp;quot;publishedAt&amp;quot;&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="s2"&gt;&amp;quot;2023-05-07T10:53:00Z&amp;quot;&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="nt"&gt;&amp;quot;source&amp;quot;&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="nt"&gt;&amp;quot;name&amp;quot;&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="s2"&gt;&amp;quot;CNN&amp;quot;&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="nt"&gt;&amp;quot;url&amp;quot;&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="s2"&gt;&amp;quot;https://www.cnn.com&amp;quot;&lt;/span&gt;
        &lt;span class="p"&gt;},&lt;/span&gt;
    &lt;span class="p"&gt;},&lt;/span&gt;
    &lt;span class="err"&gt;...&lt;/span&gt;
&lt;span class="p"&gt;,&lt;/span&gt;
&lt;/pre&gt;&lt;/div&gt;
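&lt;p&gt;Only the &lt;code&gt;title&lt;/code&gt; and &lt;code&gt;description&lt;/code&gt; fields matter for the poems; a minimal sketch, using a hypothetical article dict that mirrors the shape above, of the text the prompt is built from:&lt;/p&gt;

```python
# Hypothetical article dict mirroring the GNews response shape above.
article = {
    "title": "Loaf-size mission to improve hurricane forecasting is ready to launch",
    "description": "A new NASA mission called TROPICS is ready to launch.",
    "source": {"name": "CNN", "url": "https://www.cnn.com"},
}

# Build the text the poem prompt wraps: title + description; the long
# "content" field is not sent to the model.
articles_text = f"{article['title']}. {article['description']}"
prompt = f"Write a witty 4 line poem about the following news: {articles_text}"
print(prompt)
```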


&lt;h2&gt;GPT-3&lt;/h2&gt;
&lt;p&gt;ChatGPT-4 generates significantly better quality poems than GPT-3, but there is no official API for it yet. So I used GPT-3 for this toy project, specifically &lt;code&gt;text-davinci-003&lt;/code&gt;. The prompt I use is very simple and mostly works:&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;Write a witty 4 line poem about the following news: {articles_text}
&lt;/pre&gt;&lt;/div&gt;


&lt;p&gt;Since I'm a lazy programmer in the age of AI, I used ChatGPT-4 to generate the Python code to access GPT-3 over the OpenAI API. I chose to save the generated poems into the JSON returned above, and then dump the whole thing to disk. Generating the HTML for the website is a second, completely separate step.&lt;/p&gt;
&lt;p&gt;The complete code for the first step of the pipeline, which downloads the news and generates the poems, is less than 50 LOC:&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="nn"&gt;json&lt;/span&gt;
&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="nn"&gt;openai&lt;/span&gt;
&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="nn"&gt;urllib.request&lt;/span&gt;
&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="nn"&gt;urllib.parse&lt;/span&gt;
&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="nn"&gt;datetime&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;datetime&lt;/span&gt;
&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="nn"&gt;secrets&lt;/span&gt;

&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;gnews_top_news&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;category&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;general&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;q&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="bp"&gt;None&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
    &lt;span class="n"&gt;apikey&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;secrets&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;gnews_apikey&lt;/span&gt;
    &lt;span class="n"&gt;url&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;f&lt;/span&gt;&lt;span class="s2"&gt;&amp;quot;https://gnews.io/api/v4/top-headlines?category={category}&amp;amp;lang=en&amp;amp;country=us&amp;amp;max=10&amp;amp;apikey={apikey}&amp;quot;&lt;/span&gt;
    &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;q&lt;/span&gt; &lt;span class="ow"&gt;is&lt;/span&gt; &lt;span class="ow"&gt;not&lt;/span&gt; &lt;span class="bp"&gt;None&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
        &lt;span class="n"&gt;url&lt;/span&gt; &lt;span class="o"&gt;+=&lt;/span&gt; &lt;span class="s2"&gt;&amp;quot;&amp;amp;&amp;quot;&lt;/span&gt; &lt;span class="o"&gt;+&lt;/span&gt; &lt;span class="n"&gt;urllib&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;parse&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;urlencode&lt;/span&gt;&lt;span class="p"&gt;({&lt;/span&gt;&lt;span class="s2"&gt;&amp;quot;q&amp;quot;&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;q&lt;/span&gt;&lt;span class="p"&gt;})&lt;/span&gt;
    &lt;span class="k"&gt;with&lt;/span&gt; &lt;span class="n"&gt;urllib&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;request&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;urlopen&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;url&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;as&lt;/span&gt; &lt;span class="n"&gt;response&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
        &lt;span class="n"&gt;data&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;json&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;loads&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;response&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;read&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;decode&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s2"&gt;&amp;quot;utf-8&amp;quot;&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;
        &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;data&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="s2"&gt;&amp;quot;articles&amp;quot;&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;

&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;query_gpt_35&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;prompt&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;model&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="s2"&gt;&amp;quot;text-davinci-003&amp;quot;&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;max_tokens&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mi"&gt;4000&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
    &lt;span class="n"&gt;openai&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;api_key&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;secrets&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;openai_apikey&lt;/span&gt;
    &lt;span class="n"&gt;response&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;openai&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Completion&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;create&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
        &lt;span class="n"&gt;engine&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;model&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="n"&gt;prompt&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;prompt&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; 
        &lt;span class="n"&gt;max_tokens&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;max_tokens&lt;/span&gt;&lt;span class="o"&gt;-&lt;/span&gt;&lt;span class="nb"&gt;len&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;prompt&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt;
        &lt;span class="n"&gt;n&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="n"&gt;stop&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="bp"&gt;None&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="n"&gt;temperature&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mf"&gt;0.8&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="n"&gt;generated_text&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;response&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;choices&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;text&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;strip&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;generated_text&lt;/span&gt;

&lt;span class="n"&gt;articles&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;gnews_top_news&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
&lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;i&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;a&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="nb"&gt;enumerate&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;articles&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
    &lt;span class="n"&gt;articles_text&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;f&lt;/span&gt;&lt;span class="s2"&gt;&amp;quot;{a[&amp;#39;title&amp;#39;]}. {a[&amp;#39;description&amp;#39;]}&amp;quot;&lt;/span&gt;
    &lt;span class="n"&gt;poet_prompt&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;f&lt;/span&gt;&lt;span class="s2"&gt;&amp;quot;Write a witty 4 line poem about the following news: {articles_text}&amp;quot;&lt;/span&gt;
    &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;_&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="nb"&gt;range&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;5&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
        &lt;span class="n"&gt;response&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;query_gpt_35&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;poet_prompt&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
        &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="nb"&gt;len&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;response&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;split&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s2"&gt;&amp;quot;&lt;/span&gt;&lt;span class="se"&gt;\n&lt;/span&gt;&lt;span class="s2"&gt;&amp;quot;&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt; &lt;span class="o"&gt;==&lt;/span&gt; &lt;span class="mi"&gt;4&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
            &lt;span class="k"&gt;break&lt;/span&gt;
    &lt;span class="n"&gt;a&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;poem&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;response&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;strip&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;

&lt;span class="n"&gt;directory&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="s2"&gt;&amp;quot;articles/&amp;quot;&lt;/span&gt;
&lt;span class="n"&gt;filename&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;f&lt;/span&gt;&lt;span class="s2"&gt;&amp;quot;articles-{datetime.now().strftime(&amp;#39;%Y-%m-&lt;/span&gt;&lt;span class="si"&gt;%d&lt;/span&gt;&lt;span class="s2"&gt;&amp;#39;)}.json&amp;quot;&lt;/span&gt;
&lt;span class="k"&gt;with&lt;/span&gt; &lt;span class="nb"&gt;open&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;directory&lt;/span&gt; &lt;span class="o"&gt;+&lt;/span&gt; &lt;span class="n"&gt;filename&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s2"&gt;&amp;quot;w&amp;quot;&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;as&lt;/span&gt; &lt;span class="n"&gt;f&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
    &lt;span class="n"&gt;json&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;dump&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;articles&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;f&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/pre&gt;&lt;/div&gt;


&lt;h2&gt;Frontend&lt;/h2&gt;
&lt;p&gt;At this point I just need a web front-end. Initially I was planning to write a Flask app which reads the above JSON dynamically and converts it to HTML, but then I realized it's easier to just generate static HTML with Jinja templates and serve that up. I again used ChatGPT-4 to generate skeleton HTML+JS+CSS code, which needed heavy editing to actually make it work. The Python code to generate the HTML is quite short:&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="nn"&gt;os&lt;/span&gt;
&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="nn"&gt;re&lt;/span&gt;
&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="nn"&gt;json&lt;/span&gt;
&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="nn"&gt;datetime&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;date&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;datetime&lt;/span&gt;
&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="nn"&gt;jinja2&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;Environment&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;FileSystemLoader&lt;/span&gt;

&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;get_latest_articles&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;directory&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="s2"&gt;&amp;quot;/home/mtrencseni/gpt-news-poet/articles/&amp;quot;&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
    &lt;span class="n"&gt;pattern&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;re&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;compile&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sa"&gt;r&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;articles-\d{4}-\d{2}-\d{2}\.json&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;    
    &lt;span class="n"&gt;files&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;f&lt;/span&gt; &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;f&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;os&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;listdir&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;directory&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;pattern&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;match&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;f&lt;/span&gt;&lt;span class="p"&gt;)]&lt;/span&gt;
    &lt;span class="n"&gt;files&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;sort&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;reverse&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="bp"&gt;True&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="n"&gt;latest_file_path&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;os&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;path&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;join&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;directory&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;files&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;])&lt;/span&gt;
    &lt;span class="k"&gt;with&lt;/span&gt; &lt;span class="nb"&gt;open&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;latest_file_path&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s1"&gt;&amp;#39;r&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;as&lt;/span&gt; &lt;span class="n"&gt;f&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
        &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;json&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;load&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;f&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="n"&gt;env&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;Environment&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;loader&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;FileSystemLoader&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;templates&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;
&lt;span class="n"&gt;html&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;env&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;get_template&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;index.jinja.html&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;render&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
    &lt;span class="n"&gt;articles&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;get_latest_articles&lt;/span&gt;&lt;span class="p"&gt;(),&lt;/span&gt;
    &lt;span class="n"&gt;dt&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;date&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;today&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;strftime&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s2"&gt;&amp;quot;%A, %B &lt;/span&gt;&lt;span class="si"&gt;%d&lt;/span&gt;&lt;span class="s2"&gt;, %Y&amp;quot;&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="n"&gt;directory&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="s2"&gt;&amp;quot;www/archive/&amp;quot;&lt;/span&gt;
&lt;span class="n"&gt;filename&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;datetime&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;now&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;strftime&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;%Y-%m-&lt;/span&gt;&lt;span class="si"&gt;%d&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;+&lt;/span&gt; &lt;span class="s2"&gt;&amp;quot;.html&amp;quot;&lt;/span&gt;
&lt;span class="k"&gt;with&lt;/span&gt; &lt;span class="nb"&gt;open&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;directory&lt;/span&gt; &lt;span class="o"&gt;+&lt;/span&gt; &lt;span class="n"&gt;filename&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s2"&gt;&amp;quot;w&amp;quot;&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;as&lt;/span&gt; &lt;span class="n"&gt;f&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
    &lt;span class="n"&gt;f&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;write&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;html&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="n"&gt;directory&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="s2"&gt;&amp;quot;www/&amp;quot;&lt;/span&gt;
&lt;span class="n"&gt;filename&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="s2"&gt;&amp;quot;index.html&amp;quot;&lt;/span&gt;
&lt;span class="k"&gt;with&lt;/span&gt; &lt;span class="nb"&gt;open&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;directory&lt;/span&gt; &lt;span class="o"&gt;+&lt;/span&gt; &lt;span class="n"&gt;filename&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s2"&gt;&amp;quot;w&amp;quot;&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;as&lt;/span&gt; &lt;span class="n"&gt;f&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
    &lt;span class="n"&gt;f&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;write&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;html&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/pre&gt;&lt;/div&gt;
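&lt;p&gt;A detail worth noting in &lt;code&gt;get_latest_articles()&lt;/code&gt;: sorting the filenames as plain strings works only because the embedded dates are zero-padded ISO dates (YYYY-MM-DD), which sort lexicographically in chronological order. A quick sketch:&lt;/p&gt;

```python
# Zero-padded ISO dates sort correctly as strings, so reverse-sorting
# the filenames puts the most recent articles file first.
files = ['articles-2023-04-09.json', 'articles-2023-12-01.json', 'articles-2023-04-10.json']
files.sort(reverse=True)
print(files[0])  # articles-2023-12-01.json
```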


&lt;p&gt;Other files:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;&lt;a href="https://github.com/mtrencseni/gpt-news-poet/blob/main/templates/index.jinja.html"&gt;Jinja template for the HTML&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://github.com/mtrencseni/gpt-news-poet/blob/main/www/style.css"&gt;CSS styles&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://github.com/mtrencseni/gpt-news-poet/blob/main/www/script.js"&gt;Javascript&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;h2&gt;Crontab&lt;/h2&gt;
&lt;p&gt;First, I wrote a quick shell script to run the above pipeline:&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;span class="ch"&gt;#!/usr/bin/env bash&lt;/span&gt;

venv/bin/python3 generate_articles.py
venv/bin/python3 generate_html.py
&lt;/pre&gt;&lt;/div&gt;


&lt;p&gt;I then put a line in my &lt;code&gt;crontab&lt;/code&gt; so the cron daemon runs the pipeline every day at 9 AM New York time (13:00 UTC):&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;0 13 * * * cd /home/mtrencseni/gpt-news-poet &amp;amp;&amp;amp; /home/mtrencseni/generate.sh
&lt;/pre&gt;&lt;/div&gt;
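&lt;p&gt;One caveat: New York's UTC offset shifts with daylight saving time, so a fixed 13:00 UTC cron line fires at 9 AM Eastern only during EDT; in winter (EST) it fires at 8 AM local time. A quick check in Python (assuming Python 3.9+ for &lt;code&gt;zoneinfo&lt;/code&gt;):&lt;/p&gt;

```python
from datetime import datetime, timezone
from zoneinfo import ZoneInfo  # standard library since Python 3.9

# 13:00 UTC on a summer date is 9 AM in New York (EDT, UTC-4)
summer = datetime(2023, 7, 1, 13, 0, tzinfo=timezone.utc)
print(summer.astimezone(ZoneInfo("America/New_York")).hour)  # 9

# The same cron schedule in winter (EST, UTC-5) fires at 8 AM local time
winter = datetime(2023, 1, 1, 13, 0, tzinfo=timezone.utc)
print(winter.astimezone(ZoneInfo("America/New_York")).hour)  # 8
```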


&lt;h2&gt;Conclusion&lt;/h2&gt;
&lt;p&gt;I think it's very cool that in 2023 you can generate reasonable-quality poetry from a cronjob. Once GPT-4 becomes available over the OpenAI API, the quality of the poems should get even better!&lt;/p&gt;
&lt;p&gt;&lt;img src="/images/gpt-news-poet-example-short.jpg" alt="ROC curve" style="width: 600px;"/&gt;&lt;/p&gt;</content><category term="gpt"></category><category term="python"></category><category term="gnews"></category><category term="ai"></category></entry><entry><title>Can ChatGPT write the C++20 async message queue codes from the previous posts?</title><link href="https://bytepawn.com/can-chatgpt-write-the-cpp-async-message-queue-codes-from-the-previous-posts.html" rel="alternate"></link><published>2023-04-15T00:00:00+02:00</published><updated>2023-04-15T00:00:00+02:00</updated><author><name>Marton Trencseni</name></author><id>tag:bytepawn.com,2023-04-15:/can-chatgpt-write-the-cpp-async-message-queue-codes-from-the-previous-posts.html</id><summary type="html">&lt;p&gt;I try to get ChatGPT to write the codes in the previous posts. It's able to write the basic message queue skeleton, but it cannot implement more complicated features such as delivery semantics with caching; also, the code crashes. C++ programmers do not (yet) need to fear for their jobs.&lt;br/&gt;&lt;br/&gt; &lt;img src="/images/cpp-hands.jpg" alt="." style="width: 300px;"/&gt;&lt;/p&gt;</summary><content type="html">&lt;h2&gt;Introduction&lt;/h2&gt;
&lt;p&gt;In the previous articles, I wrote a simple async message queue server in C++20:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;&lt;a href="https://bytepawn.com/writing-a-simple-cpp-async-message-queue-server.html"&gt;Writing a simple C++20 async message queue server - Part I&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://bytepawn.com/writing-a-simple-cpp-async-message-queue-server-part-ii.html"&gt;Writing a simple C++20 async message queue server - Part II&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;I went through the same steps with Python; there, I concluded by trying to get ChatGPT-3.5 to write the code:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;&lt;a href="https://bytepawn.com/writing-a-simple-python-async-message-queue-server.html"&gt;Writing a simple Python async message queue server - Part I&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://bytepawn.com/writing-a-simple-python-async-message-queue-server-part-ii.html"&gt;Writing a simple Python async message queue server - Part II&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://bytepawn.com/can-chatgpt-write-the-python-async-message-queue-codes-from-the-previous-posts.html"&gt;Can ChatGPT write the Python async message queue codes from the previous posts?&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;In the Python case, ChatGPT-3.5 was able to produce working code, but it wasn't logically correct: for example, it mixed up delivery semantics during caching.&lt;/p&gt;
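&lt;p&gt;To make the "delivery semantics" concrete: with &lt;code&gt;delivery=all&lt;/code&gt; every subscriber of the topic should receive the message, while with &lt;code&gt;delivery=one&lt;/code&gt; only a single subscriber should. A minimal Python sketch of the routing rule (names are hypothetical, not the actual server code):&lt;/p&gt;

```python
def dispatch(message, subscribers):
    """Return which subscribers should receive the message, per its delivery field."""
    delivery = message.get('delivery', 'all')
    if delivery == 'all':
        return list(subscribers)      # fan out to every subscriber
    elif delivery == 'one' and subscribers:
        return [subscribers[0]]       # exactly one subscriber gets it
    return []

subs = ['client_a', 'client_b']
print(dispatch({'delivery': 'all'}, subs))  # ['client_a', 'client_b']
print(dispatch({'delivery': 'one'}, subs))  # ['client_a']
```

&lt;p&gt;The subtlety ChatGPT-3.5 got wrong is that this choice also has to be respected consistently when replaying from the cache, not just on the initial send.&lt;/p&gt;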
&lt;p&gt;In this post, I will try the same with C++20. This time, I will use ChatGPT-4.&lt;/p&gt;
&lt;p&gt;&lt;img src="/images/cpp-hands.jpg" alt="." style="width: 400px;"/&gt;&lt;/p&gt;
&lt;h2&gt;First version&lt;/h2&gt;
&lt;p&gt;Prompt:&lt;/p&gt;
&lt;p&gt;&lt;span style="background-color: #fefebe"&gt;
Write a simple, asynchronous, event-driven, in-memory message queue TCP/IP server in C++20 using the &lt;code&gt;asio&lt;/code&gt; library. The server should accept the listening port as a command line argument. Cache 100 messages per topic. Here is the message format for subscribing to a topic:
&lt;/span&gt;&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;{&amp;#39;command&amp;#39;:&amp;#39;subscribe&amp;#39;, &amp;#39;topic&amp;#39;:&amp;#39;foo&amp;#39;}
{&amp;#39;command&amp;#39;:&amp;#39;subscribe&amp;#39;, &amp;#39;topic&amp;#39;:&amp;#39;foo&amp;#39;, &amp;#39;last_seen&amp;#39;: &amp;#39;5&amp;#39;}
&lt;/pre&gt;&lt;/div&gt;


&lt;p&gt;&lt;span style="background-color: #fefebe"&gt;
If &lt;code&gt;last_seen&lt;/code&gt; is specified, only replay cached messages with higher id. Sent messages look like this:
&lt;/span&gt;&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;{&amp;#39;command&amp;#39;:&amp;#39;send&amp;#39;, &amp;#39;topic&amp;#39;:&amp;#39;foo&amp;#39;, &amp;#39;msg&amp;#39;:&amp;#39;hello&amp;#39;, &amp;#39;delivery&amp;#39;:&amp;#39;all&amp;#39;}
{&amp;#39;command&amp;#39;:&amp;#39;send&amp;#39;, &amp;#39;topic&amp;#39;:&amp;#39;foo&amp;#39;, &amp;#39;msg&amp;#39;:&amp;#39;world&amp;#39;, &amp;#39;delivery&amp;#39;:&amp;#39;one&amp;#39;}
&lt;/pre&gt;&lt;/div&gt;


&lt;p&gt;&lt;span style="background-color: #fefebe"&gt;
When the message is forwarded to subscribers, also insert an &lt;code&gt;id&lt;/code&gt; field into the message, so subscriber clients can maintain their own &lt;code&gt;last_seen&lt;/code&gt;.
&lt;/span&gt;&lt;/p&gt;
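&lt;p&gt;The replay rule in the prompt boils down to a one-line filter: on subscribe, send every cached message whose &lt;code&gt;id&lt;/code&gt; is greater than the client's &lt;code&gt;last_seen&lt;/code&gt;, or all cached messages if &lt;code&gt;last_seen&lt;/code&gt; was not given. A Python sketch of that logic (hypothetical names, not the generated C++):&lt;/p&gt;

```python
def replay(cache, last_seen=None):
    """Return cached messages a re-subscribing client still needs to see."""
    if last_seen is None:
        return cache                              # no last_seen: replay everything
    return [m for m in cache if m['id'] > last_seen]

cache = [{'id': i, 'msg': f'm{i}'} for i in range(1, 8)]
print([m['id'] for m in replay(cache, last_seen=5)])  # [6, 7]
```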
&lt;p&gt;ChatGPT's code:&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;span class="cp"&gt;#include&lt;/span&gt; &lt;span class="cpf"&gt;&amp;lt;asio.hpp&amp;gt;&lt;/span&gt;&lt;span class="cp"&gt;&lt;/span&gt;
&lt;span class="cp"&gt;#include&lt;/span&gt; &lt;span class="cpf"&gt;&amp;lt;iostream&amp;gt;&lt;/span&gt;&lt;span class="cp"&gt;&lt;/span&gt;
&lt;span class="cp"&gt;#include&lt;/span&gt; &lt;span class="cpf"&gt;&amp;lt;unordered_map&amp;gt;&lt;/span&gt;&lt;span class="cp"&gt;&lt;/span&gt;
&lt;span class="cp"&gt;#include&lt;/span&gt; &lt;span class="cpf"&gt;&amp;lt;vector&amp;gt;&lt;/span&gt;&lt;span class="cp"&gt;&lt;/span&gt;
&lt;span class="cp"&gt;#include&lt;/span&gt; &lt;span class="cpf"&gt;&amp;lt;deque&amp;gt;&lt;/span&gt;&lt;span class="cp"&gt;&lt;/span&gt;
&lt;span class="cp"&gt;#include&lt;/span&gt; &lt;span class="cpf"&gt;&amp;lt;string&amp;gt;&lt;/span&gt;&lt;span class="cp"&gt;&lt;/span&gt;
&lt;span class="cp"&gt;#include&lt;/span&gt; &lt;span class="cpf"&gt;&amp;lt;nlohmann/json.hpp&amp;gt;&lt;/span&gt;&lt;span class="cp"&gt;&lt;/span&gt;

&lt;span class="k"&gt;using&lt;/span&gt; &lt;span class="n"&gt;asio&lt;/span&gt;&lt;span class="o"&gt;::&lt;/span&gt;&lt;span class="n"&gt;ip&lt;/span&gt;&lt;span class="o"&gt;::&lt;/span&gt;&lt;span class="n"&gt;tcp&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;span class="k"&gt;using&lt;/span&gt; &lt;span class="n"&gt;json&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;nlohmann&lt;/span&gt;&lt;span class="o"&gt;::&lt;/span&gt;&lt;span class="n"&gt;json&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;

&lt;span class="k"&gt;struct&lt;/span&gt; &lt;span class="n"&gt;Subscriber&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="n"&gt;tcp&lt;/span&gt;&lt;span class="o"&gt;::&lt;/span&gt;&lt;span class="n"&gt;socket&lt;/span&gt; &lt;span class="n"&gt;socket&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
    &lt;span class="n"&gt;std&lt;/span&gt;&lt;span class="o"&gt;::&lt;/span&gt;&lt;span class="n"&gt;string&lt;/span&gt; &lt;span class="n"&gt;last_seen&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;

    &lt;span class="n"&gt;Subscriber&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;asio&lt;/span&gt;&lt;span class="o"&gt;::&lt;/span&gt;&lt;span class="n"&gt;io_context&lt;/span&gt;&lt;span class="o"&gt;&amp;amp;&lt;/span&gt; &lt;span class="n"&gt;io_context&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
        &lt;span class="o"&gt;:&lt;/span&gt; &lt;span class="n"&gt;socket&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;io_context&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{}&lt;/span&gt;
&lt;span class="p"&gt;};&lt;/span&gt;

&lt;span class="k"&gt;class&lt;/span&gt; &lt;span class="nc"&gt;MessageQueueServer&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
&lt;span class="k"&gt;public&lt;/span&gt;&lt;span class="o"&gt;:&lt;/span&gt;
    &lt;span class="n"&gt;MessageQueueServer&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;asio&lt;/span&gt;&lt;span class="o"&gt;::&lt;/span&gt;&lt;span class="n"&gt;io_context&lt;/span&gt;&lt;span class="o"&gt;&amp;amp;&lt;/span&gt; &lt;span class="n"&gt;io_context&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="kt"&gt;int&lt;/span&gt; &lt;span class="n"&gt;port&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
        &lt;span class="o"&gt;:&lt;/span&gt; &lt;span class="n"&gt;acceptor_&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;io_context&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;tcp&lt;/span&gt;&lt;span class="o"&gt;::&lt;/span&gt;&lt;span class="n"&gt;endpoint&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;tcp&lt;/span&gt;&lt;span class="o"&gt;::&lt;/span&gt;&lt;span class="n"&gt;v4&lt;/span&gt;&lt;span class="p"&gt;(),&lt;/span&gt; &lt;span class="n"&gt;port&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="n"&gt;do_accept&lt;/span&gt;&lt;span class="p"&gt;();&lt;/span&gt;
    &lt;span class="p"&gt;}&lt;/span&gt;

&lt;span class="k"&gt;private&lt;/span&gt;&lt;span class="o"&gt;:&lt;/span&gt;
    &lt;span class="kt"&gt;void&lt;/span&gt; &lt;span class="n"&gt;do_accept&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="n"&gt;acceptor_&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;async_accept&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
            &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="k"&gt;this&lt;/span&gt;&lt;span class="p"&gt;](&lt;/span&gt;&lt;span class="n"&gt;std&lt;/span&gt;&lt;span class="o"&gt;::&lt;/span&gt;&lt;span class="n"&gt;error_code&lt;/span&gt; &lt;span class="n"&gt;ec&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;tcp&lt;/span&gt;&lt;span class="o"&gt;::&lt;/span&gt;&lt;span class="n"&gt;socket&lt;/span&gt; &lt;span class="n"&gt;socket&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
                &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="o"&gt;!&lt;/span&gt;&lt;span class="n"&gt;ec&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
                    &lt;span class="k"&gt;auto&lt;/span&gt; &lt;span class="n"&gt;subscriber&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;std&lt;/span&gt;&lt;span class="o"&gt;::&lt;/span&gt;&lt;span class="n"&gt;make_shared&lt;/span&gt;&lt;span class="o"&gt;&amp;lt;&lt;/span&gt;&lt;span class="n"&gt;Subscriber&lt;/span&gt;&lt;span class="o"&gt;&amp;gt;&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;std&lt;/span&gt;&lt;span class="o"&gt;::&lt;/span&gt;&lt;span class="n"&gt;move&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;socket&lt;/span&gt;&lt;span class="p"&gt;));&lt;/span&gt;
                    &lt;span class="n"&gt;read_command&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;subscriber&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
                &lt;span class="p"&gt;}&lt;/span&gt;
                &lt;span class="n"&gt;do_accept&lt;/span&gt;&lt;span class="p"&gt;();&lt;/span&gt;
            &lt;span class="p"&gt;});&lt;/span&gt;
    &lt;span class="p"&gt;}&lt;/span&gt;

    &lt;span class="kt"&gt;void&lt;/span&gt; &lt;span class="n"&gt;read_command&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;std&lt;/span&gt;&lt;span class="o"&gt;::&lt;/span&gt;&lt;span class="n"&gt;shared_ptr&lt;/span&gt;&lt;span class="o"&gt;&amp;lt;&lt;/span&gt;&lt;span class="n"&gt;Subscriber&lt;/span&gt;&lt;span class="o"&gt;&amp;gt;&lt;/span&gt; &lt;span class="n"&gt;subscriber&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="n"&gt;asio&lt;/span&gt;&lt;span class="o"&gt;::&lt;/span&gt;&lt;span class="n"&gt;async_read_until&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;subscriber&lt;/span&gt;&lt;span class="o"&gt;-&amp;gt;&lt;/span&gt;&lt;span class="n"&gt;socket&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;asio&lt;/span&gt;&lt;span class="o"&gt;::&lt;/span&gt;&lt;span class="n"&gt;dynamic_buffer&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;buffer_&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt;
                               &lt;span class="sc"&gt;&amp;#39;\n&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
                               &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="k"&gt;this&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;subscriber&lt;/span&gt;&lt;span class="p"&gt;](&lt;/span&gt;&lt;span class="n"&gt;std&lt;/span&gt;&lt;span class="o"&gt;::&lt;/span&gt;&lt;span class="n"&gt;error_code&lt;/span&gt; &lt;span class="n"&gt;ec&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;std&lt;/span&gt;&lt;span class="o"&gt;::&lt;/span&gt;&lt;span class="kt"&gt;size_t&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
                                   &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="o"&gt;!&lt;/span&gt;&lt;span class="n"&gt;ec&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
                                       &lt;span class="n"&gt;std&lt;/span&gt;&lt;span class="o"&gt;::&lt;/span&gt;&lt;span class="n"&gt;istream&lt;/span&gt; &lt;span class="n"&gt;is&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="o"&gt;&amp;amp;&lt;/span&gt;&lt;span class="n"&gt;buffer_&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
                                       &lt;span class="n"&gt;json&lt;/span&gt; &lt;span class="n"&gt;command&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
                                       &lt;span class="n"&gt;is&lt;/span&gt; &lt;span class="o"&gt;&amp;gt;&amp;gt;&lt;/span&gt; &lt;span class="n"&gt;command&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
                                       &lt;span class="n"&gt;buffer_&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;consume&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;buffer_&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;size&lt;/span&gt;&lt;span class="p"&gt;());&lt;/span&gt;

                                       &lt;span class="n"&gt;handle_command&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;subscriber&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;command&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
                                       &lt;span class="n"&gt;read_command&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;subscriber&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
                                   &lt;span class="p"&gt;}&lt;/span&gt;
                               &lt;span class="p"&gt;});&lt;/span&gt;
    &lt;span class="p"&gt;}&lt;/span&gt;

    &lt;span class="kt"&gt;void&lt;/span&gt; &lt;span class="n"&gt;handle_command&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;std&lt;/span&gt;&lt;span class="o"&gt;::&lt;/span&gt;&lt;span class="n"&gt;shared_ptr&lt;/span&gt;&lt;span class="o"&gt;&amp;lt;&lt;/span&gt;&lt;span class="n"&gt;Subscriber&lt;/span&gt;&lt;span class="o"&gt;&amp;gt;&lt;/span&gt; &lt;span class="n"&gt;subscriber&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="k"&gt;const&lt;/span&gt; &lt;span class="n"&gt;json&lt;/span&gt;&lt;span class="o"&gt;&amp;amp;&lt;/span&gt; &lt;span class="n"&gt;command&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="n"&gt;std&lt;/span&gt;&lt;span class="o"&gt;::&lt;/span&gt;&lt;span class="n"&gt;string&lt;/span&gt; &lt;span class="n"&gt;cmd&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;command&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="s"&gt;&amp;quot;command&amp;quot;&lt;/span&gt;&lt;span class="p"&gt;];&lt;/span&gt;
        &lt;span class="n"&gt;std&lt;/span&gt;&lt;span class="o"&gt;::&lt;/span&gt;&lt;span class="n"&gt;string&lt;/span&gt; &lt;span class="n"&gt;topic&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;command&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="s"&gt;&amp;quot;topic&amp;quot;&lt;/span&gt;&lt;span class="p"&gt;];&lt;/span&gt;

        &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;cmd&lt;/span&gt; &lt;span class="o"&gt;==&lt;/span&gt; &lt;span class="s"&gt;&amp;quot;subscribe&amp;quot;&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
            &lt;span class="n"&gt;subscribers_&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;topic&lt;/span&gt;&lt;span class="p"&gt;].&lt;/span&gt;&lt;span class="n"&gt;push_back&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;subscriber&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
            &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;command&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;contains&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s"&gt;&amp;quot;last_seen&amp;quot;&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
                &lt;span class="n"&gt;subscriber&lt;/span&gt;&lt;span class="o"&gt;-&amp;gt;&lt;/span&gt;&lt;span class="n"&gt;last_seen&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;command&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="s"&gt;&amp;quot;last_seen&amp;quot;&lt;/span&gt;&lt;span class="p"&gt;];&lt;/span&gt;
            &lt;span class="p"&gt;}&lt;/span&gt;
            &lt;span class="n"&gt;replay_messages&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;subscriber&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;topic&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
        &lt;span class="p"&gt;}&lt;/span&gt; &lt;span class="k"&gt;else&lt;/span&gt; &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;cmd&lt;/span&gt; &lt;span class="o"&gt;==&lt;/span&gt; &lt;span class="s"&gt;&amp;quot;send&amp;quot;&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
            &lt;span class="n"&gt;std&lt;/span&gt;&lt;span class="o"&gt;::&lt;/span&gt;&lt;span class="n"&gt;string&lt;/span&gt; &lt;span class="n"&gt;msg&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;command&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="s"&gt;&amp;quot;msg&amp;quot;&lt;/span&gt;&lt;span class="p"&gt;];&lt;/span&gt;
            &lt;span class="n"&gt;std&lt;/span&gt;&lt;span class="o"&gt;::&lt;/span&gt;&lt;span class="n"&gt;string&lt;/span&gt; &lt;span class="n"&gt;delivery&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;command&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="s"&gt;&amp;quot;delivery&amp;quot;&lt;/span&gt;&lt;span class="p"&gt;];&lt;/span&gt;
            &lt;span class="n"&gt;cache_message&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;topic&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;command&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;

            &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;delivery&lt;/span&gt; &lt;span class="o"&gt;==&lt;/span&gt; &lt;span class="s"&gt;&amp;quot;all&amp;quot;&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
                &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="k"&gt;const&lt;/span&gt; &lt;span class="k"&gt;auto&lt;/span&gt;&lt;span class="o"&gt;&amp;amp;&lt;/span&gt; &lt;span class="nl"&gt;sub&lt;/span&gt; &lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;subscribers_&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;topic&lt;/span&gt;&lt;span class="p"&gt;])&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
                    &lt;span class="n"&gt;send_message&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;sub&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;command&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
                &lt;span class="p"&gt;}&lt;/span&gt;
            &lt;span class="p"&gt;}&lt;/span&gt; &lt;span class="k"&gt;else&lt;/span&gt; &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;delivery&lt;/span&gt; &lt;span class="o"&gt;==&lt;/span&gt; &lt;span class="s"&gt;&amp;quot;one&amp;quot;&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
                &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="o"&gt;!&lt;/span&gt;&lt;span class="n"&gt;subscribers_&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;topic&lt;/span&gt;&lt;span class="p"&gt;].&lt;/span&gt;&lt;span class="n"&gt;empty&lt;/span&gt;&lt;span class="p"&gt;())&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
                    &lt;span class="n"&gt;send_message&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;subscribers_&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;topic&lt;/span&gt;&lt;span class="p"&gt;].&lt;/span&gt;&lt;span class="n"&gt;front&lt;/span&gt;&lt;span class="p"&gt;(),&lt;/span&gt; &lt;span class="n"&gt;command&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
                &lt;span class="p"&gt;}&lt;/span&gt;
            &lt;span class="p"&gt;}&lt;/span&gt;
        &lt;span class="p"&gt;}&lt;/span&gt;
    &lt;span class="p"&gt;}&lt;/span&gt;

    &lt;span class="kt"&gt;void&lt;/span&gt; &lt;span class="n"&gt;cache_message&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="k"&gt;const&lt;/span&gt; &lt;span class="n"&gt;std&lt;/span&gt;&lt;span class="o"&gt;::&lt;/span&gt;&lt;span class="n"&gt;string&lt;/span&gt;&lt;span class="o"&gt;&amp;amp;&lt;/span&gt; &lt;span class="n"&gt;topic&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="k"&gt;const&lt;/span&gt; &lt;span class="n"&gt;json&lt;/span&gt;&lt;span class="o"&gt;&amp;amp;&lt;/span&gt; &lt;span class="n"&gt;command&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="k"&gt;auto&lt;/span&gt;&lt;span class="o"&gt;&amp;amp;&lt;/span&gt; &lt;span class="n"&gt;messages&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;message_cache_&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;topic&lt;/span&gt;&lt;span class="p"&gt;];&lt;/span&gt;
        &lt;span class="n"&gt;json&lt;/span&gt; &lt;span class="n"&gt;cached_msg&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;command&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
        &lt;span class="n"&gt;cached_msg&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="s"&gt;&amp;quot;id&amp;quot;&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;next_message_id_&lt;/span&gt;&lt;span class="o"&gt;++&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
        &lt;span class="n"&gt;messages&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;push_back&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;cached_msg&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;

        &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;messages&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;size&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt; &lt;span class="o"&gt;&amp;gt;&lt;/span&gt; &lt;span class="n"&gt;max_messages_per_topic_&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
            &lt;span class="n"&gt;messages&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;pop_front&lt;/span&gt;&lt;span class="p"&gt;();&lt;/span&gt;
        &lt;span class="p"&gt;}&lt;/span&gt;
    &lt;span class="p"&gt;}&lt;/span&gt;

    &lt;span class="kt"&gt;void&lt;/span&gt; &lt;span class="n"&gt;replay_messages&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;std&lt;/span&gt;&lt;span class="o"&gt;::&lt;/span&gt;&lt;span class="n"&gt;shared_ptr&lt;/span&gt;&lt;span class="o"&gt;&amp;lt;&lt;/span&gt;&lt;span class="n"&gt;Subscriber&lt;/span&gt;&lt;span class="o"&gt;&amp;gt;&lt;/span&gt; &lt;span class="n"&gt;subscriber&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
                         &lt;span class="k"&gt;const&lt;/span&gt; &lt;span class="n"&gt;std&lt;/span&gt;&lt;span class="o"&gt;::&lt;/span&gt;&lt;span class="n"&gt;string&lt;/span&gt;&lt;span class="o"&gt;&amp;amp;&lt;/span&gt; &lt;span class="n"&gt;topic&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="k"&gt;const&lt;/span&gt; &lt;span class="k"&gt;auto&lt;/span&gt;&lt;span class="o"&gt;&amp;amp;&lt;/span&gt; &lt;span class="nl"&gt;msg&lt;/span&gt; &lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;message_cache_&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;topic&lt;/span&gt;&lt;span class="p"&gt;])&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
            &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;subscriber&lt;/span&gt;&lt;span class="o"&gt;-&amp;gt;&lt;/span&gt;&lt;span class="n"&gt;last_seen&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;empty&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt; &lt;span class="o"&gt;||&lt;/span&gt; &lt;span class="n"&gt;msg&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="s"&gt;&amp;quot;id&amp;quot;&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="o"&gt;&amp;gt;&lt;/span&gt; &lt;span class="n"&gt;subscriber&lt;/span&gt;&lt;span class="o"&gt;-&amp;gt;&lt;/span&gt;&lt;span class="n"&gt;last_seen&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
                &lt;span class="n"&gt;send_message&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;subscriber&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;msg&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
            &lt;span class="p"&gt;}&lt;/span&gt;
        &lt;span class="p"&gt;}&lt;/span&gt;
    &lt;span class="p"&gt;}&lt;/span&gt;

    &lt;span class="kt"&gt;void&lt;/span&gt; &lt;span class="n"&gt;send_message&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;std&lt;/span&gt;&lt;span class="o"&gt;::&lt;/span&gt;&lt;span class="n"&gt;shared_ptr&lt;/span&gt;&lt;span class="o"&gt;&amp;lt;&lt;/span&gt;&lt;span class="n"&gt;Subscriber&lt;/span&gt;&lt;span class="o"&gt;&amp;gt;&lt;/span&gt; &lt;span class="n"&gt;subscriber&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="k"&gt;const&lt;/span&gt; &lt;span class="n"&gt;json&lt;/span&gt;&lt;span class="o"&gt;&amp;amp;&lt;/span&gt; &lt;span class="n"&gt;message&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="n"&gt;std&lt;/span&gt;&lt;span class="o"&gt;::&lt;/span&gt;&lt;span class="n"&gt;string&lt;/span&gt; &lt;span class="n"&gt;msg&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;message&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;dump&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt; &lt;span class="o"&gt;+&lt;/span&gt; &lt;span class="s"&gt;&amp;quot;&lt;/span&gt;&lt;span class="se"&gt;\n&lt;/span&gt;&lt;span class="s"&gt;&amp;quot;&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
        &lt;span class="n"&gt;asio&lt;/span&gt;&lt;span class="o"&gt;::&lt;/span&gt;&lt;span class="n"&gt;async_write&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;subscriber&lt;/span&gt;&lt;span class="o"&gt;-&amp;gt;&lt;/span&gt;&lt;span class="n"&gt;socket&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;asio&lt;/span&gt;&lt;span class="o"&gt;::&lt;/span&gt;&lt;span class="n"&gt;buffer&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;msg&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt;
                          &lt;span class="p"&gt;[](&lt;/span&gt;&lt;span class="n"&gt;std&lt;/span&gt;&lt;span class="o"&gt;::&lt;/span&gt;&lt;span class="n"&gt;error_code&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;std&lt;/span&gt;&lt;span class="o"&gt;::&lt;/span&gt;&lt;span class="kt"&gt;size_t&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{});&lt;/span&gt;
    &lt;span class="p"&gt;}&lt;/span&gt;

&lt;span class="k"&gt;private&lt;/span&gt;&lt;span class="o"&gt;:&lt;/span&gt;
    &lt;span class="n"&gt;tcp&lt;/span&gt;&lt;span class="o"&gt;::&lt;/span&gt;&lt;span class="n"&gt;acceptor&lt;/span&gt; &lt;span class="n"&gt;acceptor_&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
    &lt;span class="n"&gt;std&lt;/span&gt;&lt;span class="o"&gt;::&lt;/span&gt;&lt;span class="n"&gt;unordered_map&lt;/span&gt;&lt;span class="o"&gt;&amp;lt;&lt;/span&gt;&lt;span class="n"&gt;std&lt;/span&gt;&lt;span class="o"&gt;::&lt;/span&gt;&lt;span class="n"&gt;string&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;std&lt;/span&gt;&lt;span class="o"&gt;::&lt;/span&gt;&lt;span class="n"&gt;vector&lt;/span&gt;&lt;span class="o"&gt;&amp;lt;&lt;/span&gt;&lt;span class="n"&gt;std&lt;/span&gt;&lt;span class="o"&gt;::&lt;/span&gt;&lt;span class="n"&gt;shared_ptr&lt;/span&gt;&lt;span class="o"&gt;&amp;lt;&lt;/span&gt;&lt;span class="n"&gt;Subscriber&lt;/span&gt;&lt;span class="o"&gt;&amp;gt;&amp;gt;&amp;gt;&lt;/span&gt; &lt;span class="n"&gt;subscribers_&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
    &lt;span class="n"&gt;std&lt;/span&gt;&lt;span class="o"&gt;::&lt;/span&gt;&lt;span class="n"&gt;unordered_map&lt;/span&gt;&lt;span class="o"&gt;&amp;lt;&lt;/span&gt;&lt;span class="n"&gt;std&lt;/span&gt;&lt;span class="o"&gt;::&lt;/span&gt;&lt;span class="n"&gt;string&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;std&lt;/span&gt;&lt;span class="o"&gt;::&lt;/span&gt;&lt;span class="n"&gt;deque&lt;/span&gt;&lt;span class="o"&gt;&amp;lt;&lt;/span&gt;&lt;span class="n"&gt;json&lt;/span&gt;&lt;span class="o"&gt;&amp;gt;&amp;gt;&lt;/span&gt; &lt;span class="n"&gt;message_cache_&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
    &lt;span class="n"&gt;std&lt;/span&gt;&lt;span class="o"&gt;::&lt;/span&gt;&lt;span class="kt"&gt;size_t&lt;/span&gt; &lt;span class="n"&gt;next_message_id_&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
    &lt;span class="k"&gt;const&lt;/span&gt; &lt;span class="n"&gt;std&lt;/span&gt;&lt;span class="o"&gt;::&lt;/span&gt;&lt;span class="kt"&gt;size_t&lt;/span&gt; &lt;span class="n"&gt;max_messages_per_topic_&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;100&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
    &lt;span class="n"&gt;asio&lt;/span&gt;&lt;span class="o"&gt;::&lt;/span&gt;&lt;span class="n"&gt;streambuf&lt;/span&gt; &lt;span class="n"&gt;buffer_&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;span class="p"&gt;};&lt;/span&gt;

&lt;span class="kt"&gt;int&lt;/span&gt; &lt;span class="nf"&gt;main&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="kt"&gt;int&lt;/span&gt; &lt;span class="n"&gt;argc&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="kt"&gt;char&lt;/span&gt;&lt;span class="o"&gt;*&lt;/span&gt; &lt;span class="n"&gt;argv&lt;/span&gt;&lt;span class="p"&gt;[])&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;argc&lt;/span&gt; &lt;span class="o"&gt;!=&lt;/span&gt; &lt;span class="mi"&gt;2&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="n"&gt;std&lt;/span&gt;&lt;span class="o"&gt;::&lt;/span&gt;&lt;span class="n"&gt;cerr&lt;/span&gt; &lt;span class="o"&gt;&amp;lt;&amp;lt;&lt;/span&gt; &lt;span class="s"&gt;&amp;quot;Usage: message_queue_server &amp;lt;port&amp;gt;&lt;/span&gt;&lt;span class="se"&gt;\n&lt;/span&gt;&lt;span class="s"&gt;&amp;quot;&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
        &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
    &lt;span class="p"&gt;}&lt;/span&gt;

    &lt;span class="kt"&gt;int&lt;/span&gt; &lt;span class="n"&gt;port&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;std&lt;/span&gt;&lt;span class="o"&gt;::&lt;/span&gt;&lt;span class="n"&gt;stoi&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;argv&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;]);&lt;/span&gt;

    &lt;span class="k"&gt;try&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="n"&gt;asio&lt;/span&gt;&lt;span class="o"&gt;::&lt;/span&gt;&lt;span class="n"&gt;io_context&lt;/span&gt; &lt;span class="n"&gt;io_context&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
        &lt;span class="n"&gt;MessageQueueServer&lt;/span&gt; &lt;span class="n"&gt;server&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;io_context&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;port&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
        &lt;span class="n"&gt;io_context&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;run&lt;/span&gt;&lt;span class="p"&gt;();&lt;/span&gt;
    &lt;span class="p"&gt;}&lt;/span&gt; &lt;span class="k"&gt;catch&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;std&lt;/span&gt;&lt;span class="o"&gt;::&lt;/span&gt;&lt;span class="n"&gt;exception&lt;/span&gt;&lt;span class="o"&gt;&amp;amp;&lt;/span&gt; &lt;span class="n"&gt;e&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="n"&gt;std&lt;/span&gt;&lt;span class="o"&gt;::&lt;/span&gt;&lt;span class="n"&gt;cerr&lt;/span&gt; &lt;span class="o"&gt;&amp;lt;&amp;lt;&lt;/span&gt; &lt;span class="s"&gt;&amp;quot;Exception: &amp;quot;&lt;/span&gt; &lt;span class="o"&gt;&amp;lt;&amp;lt;&lt;/span&gt; &lt;span class="n"&gt;e&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;what&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt; &lt;span class="o"&gt;&amp;lt;&amp;lt;&lt;/span&gt; &lt;span class="s"&gt;&amp;quot;&lt;/span&gt;&lt;span class="se"&gt;\n&lt;/span&gt;&lt;span class="s"&gt;&amp;quot;&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
    &lt;span class="p"&gt;}&lt;/span&gt;

    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/pre&gt;&lt;/div&gt;


&lt;h2&gt;Compiling ChatGPT's code&lt;/h2&gt;
&lt;p&gt;The code does not compile, so I asked ChatGPT for help:&lt;/p&gt;
&lt;p&gt;&lt;span style="background-color: #fefebe"&gt;
I'm trying to compile. In &lt;code&gt;read_command()&lt;/code&gt;, I get this compile-time error:
&lt;/span&gt;&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;no instance of overloaded function &amp;quot;asio::dynamic_buffer&amp;quot; matches the argument list
&lt;/pre&gt;&lt;/div&gt;


&lt;p&gt;ChatGPT gives me a new version:&lt;/p&gt;
&lt;p&gt;I apologize for the confusion. To resolve this error, you should change the &lt;code&gt;asio::dynamic_buffer&lt;/code&gt; function call to &lt;code&gt;asio::dynamic_buffer(buffer_.data())&lt;/code&gt;. Here's the updated &lt;code&gt;read_command()&lt;/code&gt; function:&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;span class="kt"&gt;void&lt;/span&gt; &lt;span class="nf"&gt;read_command&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;std&lt;/span&gt;&lt;span class="o"&gt;::&lt;/span&gt;&lt;span class="n"&gt;shared_ptr&lt;/span&gt;&lt;span class="o"&gt;&amp;lt;&lt;/span&gt;&lt;span class="n"&gt;Subscriber&lt;/span&gt;&lt;span class="o"&gt;&amp;gt;&lt;/span&gt; &lt;span class="n"&gt;subscriber&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="n"&gt;asio&lt;/span&gt;&lt;span class="o"&gt;::&lt;/span&gt;&lt;span class="n"&gt;async_read_until&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;subscriber&lt;/span&gt;&lt;span class="o"&gt;-&amp;gt;&lt;/span&gt;&lt;span class="n"&gt;socket&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;asio&lt;/span&gt;&lt;span class="o"&gt;::&lt;/span&gt;&lt;span class="n"&gt;dynamic_buffer&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;buffer_&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;data&lt;/span&gt;&lt;span class="p"&gt;()),&lt;/span&gt;
                           &lt;span class="sc"&gt;&amp;#39;\n&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
                           &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="k"&gt;this&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;subscriber&lt;/span&gt;&lt;span class="p"&gt;](&lt;/span&gt;&lt;span class="n"&gt;std&lt;/span&gt;&lt;span class="o"&gt;::&lt;/span&gt;&lt;span class="n"&gt;error_code&lt;/span&gt; &lt;span class="n"&gt;ec&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;std&lt;/span&gt;&lt;span class="o"&gt;::&lt;/span&gt;&lt;span class="kt"&gt;size_t&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
                               &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="o"&gt;!&lt;/span&gt;&lt;span class="n"&gt;ec&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
                                   &lt;span class="n"&gt;std&lt;/span&gt;&lt;span class="o"&gt;::&lt;/span&gt;&lt;span class="n"&gt;istream&lt;/span&gt; &lt;span class="n"&gt;is&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="o"&gt;&amp;amp;&lt;/span&gt;&lt;span class="n"&gt;buffer_&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
                                   &lt;span class="n"&gt;json&lt;/span&gt; &lt;span class="n"&gt;command&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
                                   &lt;span class="n"&gt;is&lt;/span&gt; &lt;span class="o"&gt;&amp;gt;&amp;gt;&lt;/span&gt; &lt;span class="n"&gt;command&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
                                   &lt;span class="n"&gt;buffer_&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;consume&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;buffer_&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;size&lt;/span&gt;&lt;span class="p"&gt;());&lt;/span&gt;

                                   &lt;span class="n"&gt;handle_command&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;subscriber&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;command&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
                                   &lt;span class="n"&gt;read_command&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;subscriber&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
                               &lt;span class="p"&gt;}&lt;/span&gt;
                           &lt;span class="p"&gt;});&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/pre&gt;&lt;/div&gt;


&lt;p&gt;This still does not compile, so I ask again. ChatGPT responds:&lt;/p&gt;
&lt;p&gt;I apologize for the confusion earlier. I made a mistake in the suggestion. We should use &lt;code&gt;asio::streambuf&lt;/code&gt; instead of &lt;code&gt;asio::dynamic_buffer&lt;/code&gt;. Here's the updated &lt;code&gt;read_command()&lt;/code&gt; function:&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;span class="kt"&gt;void&lt;/span&gt; &lt;span class="nf"&gt;read_command&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;std&lt;/span&gt;&lt;span class="o"&gt;::&lt;/span&gt;&lt;span class="n"&gt;shared_ptr&lt;/span&gt;&lt;span class="o"&gt;&amp;lt;&lt;/span&gt;&lt;span class="n"&gt;Subscriber&lt;/span&gt;&lt;span class="o"&gt;&amp;gt;&lt;/span&gt; &lt;span class="n"&gt;subscriber&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="n"&gt;asio&lt;/span&gt;&lt;span class="o"&gt;::&lt;/span&gt;&lt;span class="n"&gt;async_read_until&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;subscriber&lt;/span&gt;&lt;span class="o"&gt;-&amp;gt;&lt;/span&gt;&lt;span class="n"&gt;socket&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;buffer_&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sc"&gt;&amp;#39;\n&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
                           &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="k"&gt;this&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;subscriber&lt;/span&gt;&lt;span class="p"&gt;](&lt;/span&gt;&lt;span class="n"&gt;std&lt;/span&gt;&lt;span class="o"&gt;::&lt;/span&gt;&lt;span class="n"&gt;error_code&lt;/span&gt; &lt;span class="n"&gt;ec&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;std&lt;/span&gt;&lt;span class="o"&gt;::&lt;/span&gt;&lt;span class="kt"&gt;size_t&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
                               &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="o"&gt;!&lt;/span&gt;&lt;span class="n"&gt;ec&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
                                   &lt;span class="n"&gt;std&lt;/span&gt;&lt;span class="o"&gt;::&lt;/span&gt;&lt;span class="n"&gt;istream&lt;/span&gt; &lt;span class="n"&gt;is&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="o"&gt;&amp;amp;&lt;/span&gt;&lt;span class="n"&gt;buffer_&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
                                   &lt;span class="n"&gt;json&lt;/span&gt; &lt;span class="n"&gt;command&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
                                   &lt;span class="n"&gt;is&lt;/span&gt; &lt;span class="o"&gt;&amp;gt;&amp;gt;&lt;/span&gt; &lt;span class="n"&gt;command&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
                                   &lt;span class="n"&gt;buffer_&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;consume&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;buffer_&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;size&lt;/span&gt;&lt;span class="p"&gt;());&lt;/span&gt;

                                   &lt;span class="n"&gt;handle_command&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;subscriber&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;command&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
                                   &lt;span class="n"&gt;read_command&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;subscriber&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
                               &lt;span class="p"&gt;}&lt;/span&gt;
                           &lt;span class="p"&gt;});&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/pre&gt;&lt;/div&gt;


&lt;p&gt;&lt;span style="background-color: #fefebe"&gt;
Now I get this:
&lt;/span&gt;&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&amp;#39;Subscriber::Subscriber&amp;#39;: no overloaded function could convert all the argument types
&lt;/pre&gt;&lt;/div&gt;


&lt;p&gt;ChatGPT again gives me new code:&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;span class="k"&gt;struct&lt;/span&gt; &lt;span class="n"&gt;Subscriber&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="n"&gt;tcp&lt;/span&gt;&lt;span class="o"&gt;::&lt;/span&gt;&lt;span class="n"&gt;socket&lt;/span&gt; &lt;span class="n"&gt;socket&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
    &lt;span class="n"&gt;std&lt;/span&gt;&lt;span class="o"&gt;::&lt;/span&gt;&lt;span class="n"&gt;string&lt;/span&gt; &lt;span class="n"&gt;last_seen&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;

    &lt;span class="k"&gt;explicit&lt;/span&gt; &lt;span class="nf"&gt;Subscriber&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;tcp&lt;/span&gt;&lt;span class="o"&gt;::&lt;/span&gt;&lt;span class="n"&gt;socket&lt;/span&gt;&lt;span class="o"&gt;&amp;amp;&amp;amp;&lt;/span&gt; &lt;span class="n"&gt;socket&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
        &lt;span class="o"&gt;:&lt;/span&gt; &lt;span class="n"&gt;socket&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;std&lt;/span&gt;&lt;span class="o"&gt;::&lt;/span&gt;&lt;span class="n"&gt;move&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;socket&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt; &lt;span class="p"&gt;{}&lt;/span&gt;
&lt;span class="p"&gt;};&lt;/span&gt;

&lt;span class="c1"&gt;// ...&lt;/span&gt;

&lt;span class="k"&gt;class&lt;/span&gt; &lt;span class="nc"&gt;MessageQueueServer&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
&lt;span class="k"&gt;public&lt;/span&gt;&lt;span class="o"&gt;:&lt;/span&gt;
    &lt;span class="n"&gt;MessageQueueServer&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;asio&lt;/span&gt;&lt;span class="o"&gt;::&lt;/span&gt;&lt;span class="n"&gt;io_context&lt;/span&gt;&lt;span class="o"&gt;&amp;amp;&lt;/span&gt; &lt;span class="n"&gt;io_context&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="kt"&gt;int&lt;/span&gt; &lt;span class="n"&gt;port&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
        &lt;span class="o"&gt;:&lt;/span&gt; &lt;span class="n"&gt;acceptor_&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;io_context&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;tcp&lt;/span&gt;&lt;span class="o"&gt;::&lt;/span&gt;&lt;span class="n"&gt;endpoint&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;tcp&lt;/span&gt;&lt;span class="o"&gt;::&lt;/span&gt;&lt;span class="n"&gt;v4&lt;/span&gt;&lt;span class="p"&gt;(),&lt;/span&gt; &lt;span class="n"&gt;port&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="n"&gt;do_accept&lt;/span&gt;&lt;span class="p"&gt;();&lt;/span&gt;
    &lt;span class="p"&gt;}&lt;/span&gt;

&lt;span class="c1"&gt;// ...&lt;/span&gt;
&lt;/pre&gt;&lt;/div&gt;


&lt;p&gt;Putting everything together, the code now compiles! I put &lt;code&gt;json.hpp&lt;/code&gt; next to the &lt;code&gt;.cpp&lt;/code&gt; file from the &lt;a href="https://github.com/nlohmann/json/tree/develop/single_include/nlohmann"&gt;json C++ library&lt;/a&gt; and change that one &lt;code&gt;#include&lt;/code&gt; line to:&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;span class="cp"&gt;#include&lt;/span&gt; &lt;span class="cpf"&gt;&amp;quot;json.hpp&amp;quot;&lt;/span&gt;&lt;span class="cp"&gt;&lt;/span&gt;
&lt;/pre&gt;&lt;/div&gt;


&lt;p&gt;The fact that ChatGPT was able to guide me through these compile-time errors to get a program that compiles is quite impressive. Looking at the code, there are some obvious problems, but let's try to run it anyway. It used JSON, which means double-quotes &lt;code&gt;"&lt;/code&gt; for the messages rather than single quotes &lt;code&gt;'&lt;/code&gt; as I specified in the prompts, but let's ignore that.&lt;/p&gt;
&lt;h2&gt;Running ChatGPT's code&lt;/h2&gt;
&lt;p&gt;After compiling the code, I ran it, and connected 2 clients. The first subscribes:&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;{&amp;quot;command&amp;quot;:&amp;quot;subscribe&amp;quot;, &amp;quot;topic&amp;quot;:&amp;quot;foo&amp;quot;}
&lt;/pre&gt;&lt;/div&gt;


&lt;p&gt;A second client sends a message:&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;{&amp;quot;command&amp;quot;:&amp;quot;send&amp;quot;, &amp;quot;topic&amp;quot;:&amp;quot;foo&amp;quot;, &amp;quot;msg&amp;quot;:&amp;quot;0&amp;quot;, &amp;quot;delivery&amp;quot;:&amp;quot;one&amp;quot;}
&lt;/pre&gt;&lt;/div&gt;


&lt;p&gt;The message is delivered, and the server crashes immediately. None of our own code is in the stack trace; the program is performing an async &lt;code&gt;send()&lt;/code&gt; operation:&lt;/p&gt;
&lt;p&gt;&lt;img src="/images/cpp_callstack.jpg" alt="." style="width: 600px;"/&gt;&lt;/p&gt;
&lt;p&gt;I tell ChatGPT what happened, and it responds:&lt;/p&gt;
&lt;p&gt;The issue might be related to the lifetime of the &lt;code&gt;buffer_&lt;/code&gt;. Since &lt;code&gt;buffer_&lt;/code&gt; is a member variable of the &lt;code&gt;MessageQueueServer&lt;/code&gt; class, it's shared between all connected clients. The ideal approach would be to have a separate buffer for each client. We can achieve this by adding a buffer to the &lt;code&gt;Subscriber&lt;/code&gt; struct. Here's the updated &lt;code&gt;Subscriber&lt;/code&gt; struct and the corresponding changes in the &lt;code&gt;MessageQueueServer&lt;/code&gt; class:&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;span class="k"&gt;struct&lt;/span&gt; &lt;span class="n"&gt;Subscriber&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="n"&gt;tcp&lt;/span&gt;&lt;span class="o"&gt;::&lt;/span&gt;&lt;span class="n"&gt;socket&lt;/span&gt; &lt;span class="n"&gt;socket&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
    &lt;span class="n"&gt;asio&lt;/span&gt;&lt;span class="o"&gt;::&lt;/span&gt;&lt;span class="n"&gt;streambuf&lt;/span&gt; &lt;span class="n"&gt;buffer&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
    &lt;span class="n"&gt;std&lt;/span&gt;&lt;span class="o"&gt;::&lt;/span&gt;&lt;span class="n"&gt;string&lt;/span&gt; &lt;span class="n"&gt;last_seen&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;

    &lt;span class="k"&gt;explicit&lt;/span&gt; &lt;span class="nf"&gt;Subscriber&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;tcp&lt;/span&gt;&lt;span class="o"&gt;::&lt;/span&gt;&lt;span class="n"&gt;socket&lt;/span&gt;&lt;span class="o"&gt;&amp;amp;&amp;amp;&lt;/span&gt; &lt;span class="n"&gt;socket&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
        &lt;span class="o"&gt;:&lt;/span&gt; &lt;span class="n"&gt;socket&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;std&lt;/span&gt;&lt;span class="o"&gt;::&lt;/span&gt;&lt;span class="n"&gt;move&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;socket&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt; &lt;span class="p"&gt;{}&lt;/span&gt;
&lt;span class="p"&gt;};&lt;/span&gt;

&lt;span class="c1"&gt;// ...&lt;/span&gt;

&lt;span class="k"&gt;class&lt;/span&gt; &lt;span class="nc"&gt;MessageQueueServer&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
&lt;span class="k"&gt;public&lt;/span&gt;&lt;span class="o"&gt;:&lt;/span&gt;
    &lt;span class="n"&gt;MessageQueueServer&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;asio&lt;/span&gt;&lt;span class="o"&gt;::&lt;/span&gt;&lt;span class="n"&gt;io_context&lt;/span&gt;&lt;span class="o"&gt;&amp;amp;&lt;/span&gt; &lt;span class="n"&gt;io_context&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="kt"&gt;int&lt;/span&gt; &lt;span class="n"&gt;port&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
        &lt;span class="o"&gt;:&lt;/span&gt; &lt;span class="n"&gt;acceptor_&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;io_context&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;tcp&lt;/span&gt;&lt;span class="o"&gt;::&lt;/span&gt;&lt;span class="n"&gt;endpoint&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;tcp&lt;/span&gt;&lt;span class="o"&gt;::&lt;/span&gt;&lt;span class="n"&gt;v4&lt;/span&gt;&lt;span class="p"&gt;(),&lt;/span&gt; &lt;span class="n"&gt;port&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="n"&gt;do_accept&lt;/span&gt;&lt;span class="p"&gt;();&lt;/span&gt;
    &lt;span class="p"&gt;}&lt;/span&gt;

&lt;span class="c1"&gt;// ...&lt;/span&gt;

&lt;span class="kt"&gt;void&lt;/span&gt; &lt;span class="n"&gt;read_command&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;std&lt;/span&gt;&lt;span class="o"&gt;::&lt;/span&gt;&lt;span class="n"&gt;shared_ptr&lt;/span&gt;&lt;span class="o"&gt;&amp;lt;&lt;/span&gt;&lt;span class="n"&gt;Subscriber&lt;/span&gt;&lt;span class="o"&gt;&amp;gt;&lt;/span&gt; &lt;span class="n"&gt;subscriber&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="n"&gt;asio&lt;/span&gt;&lt;span class="o"&gt;::&lt;/span&gt;&lt;span class="n"&gt;async_read_until&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;subscriber&lt;/span&gt;&lt;span class="o"&gt;-&amp;gt;&lt;/span&gt;&lt;span class="n"&gt;socket&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;subscriber&lt;/span&gt;&lt;span class="o"&gt;-&amp;gt;&lt;/span&gt;&lt;span class="n"&gt;buffer&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sc"&gt;&amp;#39;\n&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
                           &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="k"&gt;this&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;subscriber&lt;/span&gt;&lt;span class="p"&gt;](&lt;/span&gt;&lt;span class="n"&gt;std&lt;/span&gt;&lt;span class="o"&gt;::&lt;/span&gt;&lt;span class="n"&gt;error_code&lt;/span&gt; &lt;span class="n"&gt;ec&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;std&lt;/span&gt;&lt;span class="o"&gt;::&lt;/span&gt;&lt;span class="kt"&gt;size_t&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
                               &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="o"&gt;!&lt;/span&gt;&lt;span class="n"&gt;ec&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
                                   &lt;span class="n"&gt;std&lt;/span&gt;&lt;span class="o"&gt;::&lt;/span&gt;&lt;span class="n"&gt;istream&lt;/span&gt; &lt;span class="n"&gt;is&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="o"&gt;&amp;amp;&lt;/span&gt;&lt;span class="n"&gt;subscriber&lt;/span&gt;&lt;span class="o"&gt;-&amp;gt;&lt;/span&gt;&lt;span class="n"&gt;buffer&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
                                   &lt;span class="n"&gt;json&lt;/span&gt; &lt;span class="n"&gt;command&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
                                   &lt;span class="n"&gt;is&lt;/span&gt; &lt;span class="o"&gt;&amp;gt;&amp;gt;&lt;/span&gt; &lt;span class="n"&gt;command&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
                                   &lt;span class="n"&gt;subscriber&lt;/span&gt;&lt;span class="o"&gt;-&amp;gt;&lt;/span&gt;&lt;span class="n"&gt;buffer&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;consume&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;subscriber&lt;/span&gt;&lt;span class="o"&gt;-&amp;gt;&lt;/span&gt;&lt;span class="n"&gt;buffer&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;size&lt;/span&gt;&lt;span class="p"&gt;());&lt;/span&gt;

                                   &lt;span class="n"&gt;handle_command&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;subscriber&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;command&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
                                   &lt;span class="n"&gt;read_command&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;subscriber&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
                               &lt;span class="p"&gt;}&lt;/span&gt;
                           &lt;span class="p"&gt;});&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/pre&gt;&lt;/div&gt;


&lt;p&gt;The good news is, the program still compiles with these changes. Unfortunately, it still crashes. The &lt;a href="https://gist.github.com/mtrencseni/956bded2f9475cd222e85a5cceee690f"&gt;final ChatGPT .cpp file is here&lt;/a&gt;.&lt;/p&gt;
&lt;h2&gt;Logical bugs&lt;/h2&gt;
&lt;p&gt;Putting aside the buffer management bug which is causing the server to crash, there are other problems as well:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;in the example, I used single quotes &lt;code&gt;'&lt;/code&gt;, not double quotes &lt;code&gt;"&lt;/code&gt;, which is not valid JSON; in my own implementation in the previous posts, I avoided including a large library by implementing (with the help of ChatGPT) a ~10 line parser function for flat string dictionaries&lt;/li&gt;
&lt;li&gt;caching and &lt;code&gt;delivery&lt;/code&gt; are buggy: &lt;code&gt;handle_command()&lt;/code&gt; calls &lt;code&gt;cache_message()&lt;/code&gt; irrespective of delivery semantics, so if a message has &lt;code&gt;delivery=one&lt;/code&gt;, it can get delivered to a single recipient now, but also again later from the cache&lt;/li&gt;
&lt;li&gt;in &lt;code&gt;replay_messages()&lt;/code&gt; it does not implement &lt;code&gt;delivery&lt;/code&gt; semantics at all&lt;/li&gt;
&lt;li&gt;only cached messages receive an &lt;code&gt;id&lt;/code&gt; field, messages sent as they are received do not contain this field; this also means clients that receive non-cached messages cannot maintain their &lt;code&gt;last_seen&lt;/code&gt; counters&lt;/li&gt;
&lt;li&gt;the &lt;code&gt;last_seen&lt;/code&gt; field is stored as an &lt;code&gt;std::string&lt;/code&gt;, and is then compared to the &lt;code&gt;id&lt;/code&gt; lexicographically (instead of as &lt;code&gt;int&lt;/code&gt;s)&lt;/li&gt;
&lt;li&gt;there are probably numerous other bugs, including memory management issues that would cause further crashes...&lt;/li&gt;
&lt;/ul&gt;
&lt;h2&gt;Conclusion&lt;/h2&gt;
&lt;p&gt;Overall, I think it's quite impressive that ChatGPT-4 can write this approximate solution. In C++ it's much harder to get a correct program (due to memory management, types, compiling), and this shows. But beyond language issues, as described above, ChatGPT also produced numerous logical bugs. In the end, the code is like what a student of C++ would produce with lots of help from Stack Overflow and Google: along the right lines, but with lots of problems. Since this was just a ~100 line toy problem, and ChatGPT couldn't handle it, C++ programmers do not (yet) need to fear for their jobs!&lt;/p&gt;</content><category term="cpp"></category><category term="async"></category><category term="message"></category><category term="queue"></category><category term="chatgpt"></category></entry><entry><title>Writing a simple C++20 async message queue server - Part II</title><link href="https://bytepawn.com/writing-a-simple-cpp-async-message-queue-server-part-ii.html" rel="alternate"></link><published>2023-04-08T00:00:00+02:00</published><updated>2023-04-08T00:00:00+02:00</updated><author><name>Marton Trencseni</name></author><id>tag:bytepawn.com,2023-04-08:/writing-a-simple-cpp-async-message-queue-server-part-ii.html</id><summary type="html">&lt;p&gt;I write a somewhat more complicated, but still relatively simple async message queue server in modern C++20.&lt;br/&gt;&lt;br/&gt; &lt;img src="/images/cpp_coroutines.png" alt="." style="width: 400px;"/&gt;&lt;/p&gt;</summary><content type="html">&lt;h2&gt;Introduction&lt;/h2&gt;
&lt;p&gt;In 2023, I want to get back into playful systems programming. My plan is to write a simple &lt;a href="https://en.wikipedia.org/wiki/Message_queue"&gt;message queue (MQ)&lt;/a&gt;, and for my own education write the same toy implementation in modern Python, modern C++ and Rust. To make things interesting, I will use async / &lt;a href="https://en.wikipedia.org/wiki/Event-driven_programming"&gt;event-driven&lt;/a&gt; programming and avoid &lt;a href="https://en.wikipedia.org/wiki/Multithreading_(computer_architecture)"&gt;multi-threading&lt;/a&gt;.&lt;/p&gt;
&lt;p&gt;In the first 3 posts, I have written Python implementations, and played around with ChatGPT-3.5 to see if it can write code of this (not too great) complexity:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;&lt;a href="https://bytepawn.com/writing-a-simple-python-async-message-queue-server.html"&gt;Writing a simple Python async message queue server - Part I&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://bytepawn.com/writing-a-simple-python-async-message-queue-server-part-ii.html"&gt;Writing a simple Python async message queue server - Part II&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://bytepawn.com/can-chatgpt-write-the-python-async-message-queue-codes-from-the-previous-posts.html"&gt;Can ChatGPT write the Python async message queue codes from the previous posts?&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;&lt;img src="/images/cpp_coroutines.png" alt="." style="width: 400px;"/&gt;&lt;/p&gt;
&lt;p&gt;In the previous post, I wrote the simple version of the message queue in C++:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;&lt;a href="https://bytepawn.com/writing-a-simple-cpp-async-message-queue-server.html"&gt;Writing a simple C++20 async message queue server - Part I&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;In this article, similar to Part II in the Python series, I will add features:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;cache messages&lt;/li&gt;
&lt;li&gt;specify delivery semantics (&lt;code&gt;all&lt;/code&gt; or &lt;code&gt;one&lt;/code&gt;)&lt;/li&gt;
&lt;li&gt;specify &lt;code&gt;last_seen&lt;/code&gt; in &lt;code&gt;subscribe&lt;/code&gt; message to pick up where the client left off&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;Additionally, even short C++ code like this is better factored into multiple files. Unlike Python code, where types can be omitted, in C++ types must be declared, which makes code that mixes different levels of abstraction noisy and hard to read. In this case, networking code such as &lt;code&gt;tcp::socket&lt;/code&gt;, co-routine and &lt;code&gt;asio::&lt;/code&gt; usage is best kept separate. To achieve this, the code is split into 3 files: &lt;code&gt;Utils.hpp&lt;/code&gt; for utility functions, &lt;code&gt;MessageQueue.hpp&lt;/code&gt; for the MQ logic, and &lt;code&gt;Main.cpp&lt;/code&gt; for the main program and the networking.&lt;/p&gt;
&lt;p&gt;A good (but old) book which also covers the topic of physical design (breaking programs into files) is &lt;a href="https://www.amazon.com/Large-Scale-Software-Design-John-Lakos/dp/0201633620"&gt;Large-Scale C++ Software Design&lt;/a&gt; by John Lakos; recommended reading.&lt;/p&gt;
&lt;p&gt;The code shown in this article is up on Github:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;&lt;a href="https://gist.github.com/mtrencseni/cb6f268e3c65033ddca78ffb80d3578c"&gt;Utils.hpp&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://gist.github.com/mtrencseni/bc0f5dad14cd3c5c65fb78e66a228e9b"&gt;MessageQueue.hpp&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://gist.github.com/mtrencseni/3356b8724ad6e40128909a07bbd10d9a"&gt;Main.cpp&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;h2&gt;Utils&lt;/h2&gt;
&lt;p&gt;I will omit the function bodies to save space, as they are very similar or identical to the functions in the previous post, with a few additions:&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;span class="cp"&gt;#ifndef __UTILS_HPP__&lt;/span&gt;
&lt;span class="cp"&gt;#define __UTILS_HPP__&lt;/span&gt;

&lt;span class="cp"&gt;#include&lt;/span&gt; &lt;span class="cpf"&gt;&amp;lt;map&amp;gt;&lt;/span&gt;&lt;span class="cp"&gt;&lt;/span&gt;
&lt;span class="p"&gt;...&lt;/span&gt;
&lt;span class="k"&gt;using&lt;/span&gt; &lt;span class="k"&gt;namespace&lt;/span&gt; &lt;span class="n"&gt;std&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;

&lt;span class="kt"&gt;void&lt;/span&gt; &lt;span class="nf"&gt;trim&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;string&lt;/span&gt;&lt;span class="o"&gt;&amp;amp;&lt;/span&gt; &lt;span class="n"&gt;str&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="p"&gt;...&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;

&lt;span class="kt"&gt;bool&lt;/span&gt; &lt;span class="nf"&gt;startswith&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;string_view&lt;/span&gt; &lt;span class="n"&gt;s&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;string_view&lt;/span&gt; &lt;span class="n"&gt;prefix&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="p"&gt;...&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;

&lt;span class="n"&gt;vector&lt;/span&gt;&lt;span class="o"&gt;&amp;lt;&lt;/span&gt;&lt;span class="n"&gt;string&lt;/span&gt;&lt;span class="o"&gt;&amp;gt;&lt;/span&gt; &lt;span class="n"&gt;split&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="k"&gt;const&lt;/span&gt; &lt;span class="n"&gt;string&lt;/span&gt;&lt;span class="o"&gt;&amp;amp;&lt;/span&gt; &lt;span class="n"&gt;s&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="kt"&gt;char&lt;/span&gt; &lt;span class="n"&gt;delim&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="p"&gt;...&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;

&lt;span class="k"&gt;typedef&lt;/span&gt; &lt;span class="n"&gt;map&lt;/span&gt;&lt;span class="o"&gt;&amp;lt;&lt;/span&gt;&lt;span class="n"&gt;string&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;string&lt;/span&gt;&lt;span class="o"&gt;&amp;gt;&lt;/span&gt; &lt;span class="n"&gt;Dict&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;span class="n"&gt;Dict&lt;/span&gt; &lt;span class="nf"&gt;parse_dict&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="k"&gt;const&lt;/span&gt; &lt;span class="n"&gt;string&lt;/span&gt;&lt;span class="o"&gt;&amp;amp;&lt;/span&gt; &lt;span class="n"&gt;dict_str&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="p"&gt;...&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;

&lt;span class="n"&gt;string&lt;/span&gt; &lt;span class="nf"&gt;serialize_dict&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="k"&gt;const&lt;/span&gt; &lt;span class="n"&gt;Dict&lt;/span&gt;&lt;span class="o"&gt;&amp;amp;&lt;/span&gt; &lt;span class="n"&gt;d&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="p"&gt;...&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;

&lt;span class="cp"&gt;#endif&lt;/span&gt;
&lt;/pre&gt;&lt;/div&gt;


&lt;p&gt;The most interesting thing about these functions is that they are simple enough that ChatGPT-4 can write flawless implementations using modern C++20!&lt;/p&gt;
&lt;h2&gt;MessageQueue&lt;/h2&gt;
&lt;p&gt;The MessageQueue implementation starts by declaring a &lt;code&gt;Client&lt;/code&gt; class to contain the specifics of our TCP/IP networking. The main &lt;code&gt;MessageQueue&lt;/code&gt; class will just use this &lt;code&gt;Client&lt;/code&gt;, ignoring the specifics of I/O:&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;span class="k"&gt;class&lt;/span&gt; &lt;span class="nc"&gt;Client&lt;/span&gt;
&lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="n"&gt;tcp&lt;/span&gt;&lt;span class="o"&gt;::&lt;/span&gt;&lt;span class="n"&gt;socket&lt;/span&gt;&lt;span class="o"&gt;&amp;amp;&lt;/span&gt; &lt;span class="n"&gt;socket&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;

&lt;span class="k"&gt;public&lt;/span&gt;&lt;span class="o"&gt;:&lt;/span&gt;
    &lt;span class="n"&gt;Client&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;tcp&lt;/span&gt;&lt;span class="o"&gt;::&lt;/span&gt;&lt;span class="n"&gt;socket&lt;/span&gt;&lt;span class="o"&gt;&amp;amp;&lt;/span&gt; &lt;span class="n"&gt;s&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;:&lt;/span&gt; &lt;span class="n"&gt;socket&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;s&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{}&lt;/span&gt;

    &lt;span class="kt"&gt;void&lt;/span&gt; &lt;span class="n"&gt;Write&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="k"&gt;const&lt;/span&gt; &lt;span class="n"&gt;Dict&lt;/span&gt;&lt;span class="o"&gt;&amp;amp;&lt;/span&gt; &lt;span class="n"&gt;msg&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="k"&gt;auto&lt;/span&gt; &lt;span class="n"&gt;s&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;serialize_dict&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;msg&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;+&lt;/span&gt; &lt;span class="s"&gt;&amp;quot;&lt;/span&gt;&lt;span class="se"&gt;\r\n&lt;/span&gt;&lt;span class="s"&gt;&amp;quot;&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
        &lt;span class="n"&gt;asio&lt;/span&gt;&lt;span class="o"&gt;::&lt;/span&gt;&lt;span class="n"&gt;write&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;socket&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;asio&lt;/span&gt;&lt;span class="o"&gt;::&lt;/span&gt;&lt;span class="n"&gt;buffer&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;s&lt;/span&gt;&lt;span class="p"&gt;));&lt;/span&gt;
    &lt;span class="p"&gt;}&lt;/span&gt;
&lt;span class="p"&gt;};&lt;/span&gt;
&lt;/pre&gt;&lt;/div&gt;


&lt;p&gt;One thing to note here is that unlike in the previous post, here we're using synchronous &lt;code&gt;asio::write()&lt;/code&gt;. This is because in practice even this &lt;code&gt;asio::write()&lt;/code&gt; will almost never block: the call just hands the buffer over to the OS, which performs the actual write at a later time. The bonus is that this way we keep all &lt;code&gt;co_await&lt;/code&gt;s out of the rest of the &lt;code&gt;MessageQueue&lt;/code&gt; class, so it doesn't even know that this is an asynchronous program with co-routines! Now the main &lt;code&gt;MessageQueue&lt;/code&gt; class:&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;span class="k"&gt;class&lt;/span&gt; &lt;span class="nc"&gt;MessageQueue&lt;/span&gt;
&lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="k"&gt;const&lt;/span&gt; &lt;span class="kt"&gt;unsigned&lt;/span&gt; &lt;span class="kt"&gt;short&lt;/span&gt; &lt;span class="n"&gt;cache_length&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;100&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
    &lt;span class="n"&gt;unordered_map&lt;/span&gt;&lt;span class="o"&gt;&amp;lt;&lt;/span&gt;&lt;span class="n"&gt;string&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;unordered_set&lt;/span&gt;&lt;span class="o"&gt;&amp;lt;&lt;/span&gt;&lt;span class="n"&gt;Client&lt;/span&gt;&lt;span class="o"&gt;*&amp;gt;&amp;gt;&lt;/span&gt; &lt;span class="n"&gt;subscribers&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
    &lt;span class="n"&gt;unordered_map&lt;/span&gt;&lt;span class="o"&gt;&amp;lt;&lt;/span&gt;&lt;span class="n"&gt;Client&lt;/span&gt;&lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;unordered_set&lt;/span&gt;&lt;span class="o"&gt;&amp;lt;&lt;/span&gt;&lt;span class="n"&gt;string&lt;/span&gt;&lt;span class="o"&gt;&amp;gt;&amp;gt;&lt;/span&gt; &lt;span class="n"&gt;subscriptions&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
    &lt;span class="n"&gt;unordered_map&lt;/span&gt;&lt;span class="o"&gt;&amp;lt;&lt;/span&gt;&lt;span class="n"&gt;string&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;deque&lt;/span&gt;&lt;span class="o"&gt;&amp;lt;&lt;/span&gt;&lt;span class="n"&gt;Dict&lt;/span&gt;&lt;span class="o"&gt;&amp;gt;&amp;gt;&lt;/span&gt; &lt;span class="n"&gt;cache&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
    &lt;span class="n"&gt;unordered_map&lt;/span&gt;&lt;span class="o"&gt;&amp;lt;&lt;/span&gt;&lt;span class="n"&gt;string&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="kt"&gt;long&lt;/span&gt;&lt;span class="o"&gt;&amp;gt;&lt;/span&gt; &lt;span class="n"&gt;max_index&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;

&lt;span class="k"&gt;public&lt;/span&gt;&lt;span class="o"&gt;:&lt;/span&gt;
    &lt;span class="kt"&gt;void&lt;/span&gt; &lt;span class="n"&gt;OnConnect&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;Client&lt;/span&gt;&lt;span class="o"&gt;&amp;amp;&lt;/span&gt; &lt;span class="n"&gt;client&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="p"&gt;}&lt;/span&gt;

    &lt;span class="kt"&gt;void&lt;/span&gt; &lt;span class="n"&gt;OnDisconnect&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;Client&lt;/span&gt;&lt;span class="o"&gt;&amp;amp;&lt;/span&gt; &lt;span class="n"&gt;client&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="k"&gt;const&lt;/span&gt; &lt;span class="n"&gt;string&lt;/span&gt;&lt;span class="o"&gt;&amp;amp;&lt;/span&gt; &lt;span class="nl"&gt;topic&lt;/span&gt; &lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;subscriptions&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="o"&gt;&amp;amp;&lt;/span&gt;&lt;span class="n"&gt;client&lt;/span&gt;&lt;span class="p"&gt;])&lt;/span&gt;
            &lt;span class="n"&gt;subscribers&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;topic&lt;/span&gt;&lt;span class="p"&gt;].&lt;/span&gt;&lt;span class="n"&gt;erase&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="o"&gt;&amp;amp;&lt;/span&gt;&lt;span class="n"&gt;client&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
    &lt;span class="p"&gt;}&lt;/span&gt;

    &lt;span class="kt"&gt;void&lt;/span&gt; &lt;span class="n"&gt;OnMessage&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;Client&lt;/span&gt;&lt;span class="o"&gt;&amp;amp;&lt;/span&gt; &lt;span class="n"&gt;client&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;string&lt;/span&gt;&lt;span class="o"&gt;&amp;amp;&lt;/span&gt; &lt;span class="n"&gt;line&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="n"&gt;Dict&lt;/span&gt; &lt;span class="n"&gt;msg&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;parse_dict&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;line&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;

        &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;msg&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="s"&gt;&amp;quot;command&amp;quot;&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="o"&gt;==&lt;/span&gt; &lt;span class="s"&gt;&amp;quot;subscribe&amp;quot;&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
            &lt;span class="n"&gt;OnSubscribe&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;client&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;msg&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
        &lt;span class="k"&gt;else&lt;/span&gt;
            &lt;span class="nf"&gt;OnSend&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;client&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;msg&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
    &lt;span class="p"&gt;}&lt;/span&gt;

    &lt;span class="kt"&gt;void&lt;/span&gt; &lt;span class="n"&gt;OnSubscribe&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;Client&lt;/span&gt;&lt;span class="o"&gt;&amp;amp;&lt;/span&gt; &lt;span class="n"&gt;client&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;Dict&lt;/span&gt;&lt;span class="o"&gt;&amp;amp;&lt;/span&gt; &lt;span class="n"&gt;msg&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="kt"&gt;long&lt;/span&gt; &lt;span class="n"&gt;last_seen&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
        &lt;span class="n"&gt;subscribers&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;msg&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="s"&gt;&amp;quot;topic&amp;quot;&lt;/span&gt;&lt;span class="p"&gt;]].&lt;/span&gt;&lt;span class="n"&gt;insert&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="o"&gt;&amp;amp;&lt;/span&gt;&lt;span class="n"&gt;client&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
        &lt;span class="n"&gt;subscriptions&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="o"&gt;&amp;amp;&lt;/span&gt;&lt;span class="n"&gt;client&lt;/span&gt;&lt;span class="p"&gt;].&lt;/span&gt;&lt;span class="n"&gt;insert&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;msg&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="s"&gt;&amp;quot;topic&amp;quot;&lt;/span&gt;&lt;span class="p"&gt;]);&lt;/span&gt;
        &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;msg&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="s"&gt;&amp;quot;last_seen&amp;quot;&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="o"&gt;==&lt;/span&gt; &lt;span class="s"&gt;&amp;quot;&amp;quot;&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
            &lt;span class="n"&gt;last_seen&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="o"&gt;-&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
        &lt;span class="k"&gt;else&lt;/span&gt;
            &lt;span class="n"&gt;last_seen&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;stoi&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;msg&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="s"&gt;&amp;quot;last_seen&amp;quot;&lt;/span&gt;&lt;span class="p"&gt;]);&lt;/span&gt;
        &lt;span class="n"&gt;CachePlayback&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;client&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;msg&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;last_seen&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
    &lt;span class="p"&gt;}&lt;/span&gt;

    &lt;span class="kt"&gt;void&lt;/span&gt; &lt;span class="n"&gt;OnSend&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;Client&lt;/span&gt;&lt;span class="o"&gt;&amp;amp;&lt;/span&gt; &lt;span class="n"&gt;client&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;Dict&lt;/span&gt;&lt;span class="o"&gt;&amp;amp;&lt;/span&gt; &lt;span class="n"&gt;msg&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="n"&gt;AddIndex&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;msg&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
        &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;msg&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="s"&gt;&amp;quot;delivery&amp;quot;&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="o"&gt;==&lt;/span&gt; &lt;span class="s"&gt;&amp;quot;&amp;quot;&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
            &lt;span class="n"&gt;msg&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="s"&gt;&amp;quot;delivery&amp;quot;&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="s"&gt;&amp;quot;all&amp;quot;&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
        &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="k"&gt;auto&lt;/span&gt;&lt;span class="o"&gt;&amp;amp;&lt;/span&gt; &lt;span class="nl"&gt;client&lt;/span&gt; &lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;subscribers&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;msg&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="s"&gt;&amp;quot;topic&amp;quot;&lt;/span&gt;&lt;span class="p"&gt;]])&lt;/span&gt;
        &lt;span class="p"&gt;{&lt;/span&gt;
            &lt;span class="n"&gt;client&lt;/span&gt;&lt;span class="o"&gt;-&amp;gt;&lt;/span&gt;&lt;span class="n"&gt;Write&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;msg&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
            &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;msg&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="s"&gt;&amp;quot;delivery&amp;quot;&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="o"&gt;==&lt;/span&gt; &lt;span class="s"&gt;&amp;quot;one&amp;quot;&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
                &lt;span class="k"&gt;break&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
        &lt;span class="p"&gt;}&lt;/span&gt;
        &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="p"&gt;((&lt;/span&gt;&lt;span class="n"&gt;subscribers&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;msg&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="s"&gt;&amp;quot;topic&amp;quot;&lt;/span&gt;&lt;span class="p"&gt;]].&lt;/span&gt;&lt;span class="n"&gt;size&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt; &lt;span class="o"&gt;==&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;||&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;msg&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="s"&gt;&amp;quot;delivery&amp;quot;&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="o"&gt;==&lt;/span&gt; &lt;span class="s"&gt;&amp;quot;all&amp;quot;&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;
            &lt;span class="n"&gt;CachePush&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;msg&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
    &lt;span class="p"&gt;}&lt;/span&gt;

&lt;span class="k"&gt;private&lt;/span&gt;&lt;span class="o"&gt;:&lt;/span&gt;
    &lt;span class="kt"&gt;void&lt;/span&gt; &lt;span class="n"&gt;AddIndex&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;Dict&lt;/span&gt;&lt;span class="o"&gt;&amp;amp;&lt;/span&gt; &lt;span class="n"&gt;msg&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="n"&gt;msg&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="s"&gt;&amp;quot;index&amp;quot;&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;to_string&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;max_index&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;msg&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="s"&gt;&amp;quot;topic&amp;quot;&lt;/span&gt;&lt;span class="p"&gt;]]);&lt;/span&gt;
        &lt;span class="n"&gt;max_index&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;msg&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="s"&gt;&amp;quot;topic&amp;quot;&lt;/span&gt;&lt;span class="p"&gt;]]&lt;/span&gt; &lt;span class="o"&gt;+=&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
    &lt;span class="p"&gt;}&lt;/span&gt;

    &lt;span class="kt"&gt;void&lt;/span&gt; &lt;span class="n"&gt;CachePush&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;Dict&lt;/span&gt;&lt;span class="o"&gt;&amp;amp;&lt;/span&gt; &lt;span class="n"&gt;msg&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="n"&gt;cache&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;msg&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="s"&gt;&amp;quot;topic&amp;quot;&lt;/span&gt;&lt;span class="p"&gt;]].&lt;/span&gt;&lt;span class="n"&gt;push_back&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;msg&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
        &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;cache&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;msg&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="s"&gt;&amp;quot;topic&amp;quot;&lt;/span&gt;&lt;span class="p"&gt;]].&lt;/span&gt;&lt;span class="n"&gt;size&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt; &lt;span class="o"&gt;&amp;gt;&lt;/span&gt; &lt;span class="n"&gt;cache_length&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
            &lt;span class="n"&gt;cache&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;msg&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="s"&gt;&amp;quot;topic&amp;quot;&lt;/span&gt;&lt;span class="p"&gt;]].&lt;/span&gt;&lt;span class="n"&gt;pop_front&lt;/span&gt;&lt;span class="p"&gt;();&lt;/span&gt;
    &lt;span class="p"&gt;}&lt;/span&gt;

    &lt;span class="kt"&gt;void&lt;/span&gt; &lt;span class="n"&gt;CachePlayback&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;Client&lt;/span&gt;&lt;span class="o"&gt;&amp;amp;&lt;/span&gt; &lt;span class="n"&gt;client&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;Dict&lt;/span&gt;&lt;span class="o"&gt;&amp;amp;&lt;/span&gt; &lt;span class="n"&gt;msg&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="kt"&gt;long&lt;/span&gt; &lt;span class="n"&gt;last_seen&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="kt"&gt;bool&lt;/span&gt; &lt;span class="n"&gt;reCache&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nb"&gt;false&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
        &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="k"&gt;auto&lt;/span&gt;&lt;span class="o"&gt;&amp;amp;&lt;/span&gt; &lt;span class="nl"&gt;cached_msg&lt;/span&gt; &lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;cache&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;msg&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="s"&gt;&amp;quot;topic&amp;quot;&lt;/span&gt;&lt;span class="p"&gt;]])&lt;/span&gt;
        &lt;span class="p"&gt;{&lt;/span&gt;
            &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;stoi&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;cached_msg&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="s"&gt;&amp;quot;index&amp;quot;&lt;/span&gt;&lt;span class="p"&gt;])&lt;/span&gt; &lt;span class="o"&gt;&amp;gt;&lt;/span&gt; &lt;span class="n"&gt;last_seen&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
            &lt;span class="p"&gt;{&lt;/span&gt;
                &lt;span class="n"&gt;client&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Write&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;cached_msg&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
                &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;cached_msg&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="s"&gt;&amp;quot;delivery&amp;quot;&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="o"&gt;==&lt;/span&gt; &lt;span class="s"&gt;&amp;quot;one&amp;quot;&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
                    &lt;span class="n"&gt;reCache&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nb"&gt;true&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
            &lt;span class="p"&gt;}&lt;/span&gt;
        &lt;span class="p"&gt;}&lt;/span&gt;
        &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;reCache&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
        &lt;span class="p"&gt;{&lt;/span&gt;
            &lt;span class="n"&gt;deque&lt;/span&gt;&lt;span class="o"&gt;&amp;lt;&lt;/span&gt;&lt;span class="n"&gt;Dict&lt;/span&gt;&lt;span class="o"&gt;&amp;gt;&lt;/span&gt; &lt;span class="n"&gt;newCache&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
            &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="k"&gt;auto&lt;/span&gt;&lt;span class="o"&gt;&amp;amp;&lt;/span&gt; &lt;span class="nl"&gt;cached_msg&lt;/span&gt; &lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;cache&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;msg&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="s"&gt;&amp;quot;topic&amp;quot;&lt;/span&gt;&lt;span class="p"&gt;]])&lt;/span&gt;
                &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="p"&gt;((&lt;/span&gt;&lt;span class="n"&gt;stoi&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;cached_msg&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="s"&gt;&amp;quot;index&amp;quot;&lt;/span&gt;&lt;span class="p"&gt;])&lt;/span&gt; &lt;span class="o"&gt;&amp;lt;=&lt;/span&gt; &lt;span class="n"&gt;last_seen&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;||&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;cached_msg&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="s"&gt;&amp;quot;delivery&amp;quot;&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="o"&gt;==&lt;/span&gt; &lt;span class="s"&gt;&amp;quot;all&amp;quot;&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;
                    &lt;span class="n"&gt;newCache&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;push_back&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;cached_msg&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
            &lt;span class="n"&gt;cache&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;msg&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="s"&gt;&amp;quot;topic&amp;quot;&lt;/span&gt;&lt;span class="p"&gt;]]&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;newCache&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
        &lt;span class="p"&gt;}&lt;/span&gt;
    &lt;span class="p"&gt;}&lt;/span&gt;
&lt;span class="p"&gt;};&lt;/span&gt;
&lt;/pre&gt;&lt;/div&gt;


&lt;p&gt;The code is very similar to the previous Python code, except it's nicely factored into functions.&lt;/p&gt;
&lt;h2&gt;Main&lt;/h2&gt;
&lt;p&gt;The code in the main file is very similar to the previous version, so I will just show the function that uses the new &lt;code&gt;MessageQueue&lt;/code&gt; class. In this setup, the only things the main code knows are that each line is a message (no further formatting is assumed here), and that it needs to call &lt;code&gt;MessageQueue::OnConnect()&lt;/code&gt; (which is actually unused), &lt;code&gt;MessageQueue::OnMessage()&lt;/code&gt; and &lt;code&gt;MessageQueue::OnDisconnect()&lt;/code&gt;. As before, the code does not handle all errors, to keep things clean and readable.&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;span class="k"&gt;static&lt;/span&gt; &lt;span class="n"&gt;MessageQueue&lt;/span&gt; &lt;span class="n"&gt;mq&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;

&lt;span class="n"&gt;awaitable&lt;/span&gt;&lt;span class="o"&gt;&amp;lt;&lt;/span&gt;&lt;span class="kt"&gt;void&lt;/span&gt;&lt;span class="o"&gt;&amp;gt;&lt;/span&gt; &lt;span class="n"&gt;session&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;tcp&lt;/span&gt;&lt;span class="o"&gt;::&lt;/span&gt;&lt;span class="n"&gt;socket&lt;/span&gt; &lt;span class="n"&gt;socket&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="k"&gt;try&lt;/span&gt;
    &lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="n"&gt;Dict&lt;/span&gt; &lt;span class="n"&gt;msg&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt; 
        &lt;span class="n"&gt;string&lt;/span&gt; &lt;span class="n"&gt;data&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
        &lt;span class="n"&gt;Client&lt;/span&gt; &lt;span class="nf"&gt;client&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;socket&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;

        &lt;span class="n"&gt;cout&lt;/span&gt; &lt;span class="o"&gt;&amp;lt;&amp;lt;&lt;/span&gt; &lt;span class="s"&gt;&amp;quot;Client connected: &amp;quot;&lt;/span&gt; &lt;span class="o"&gt;&amp;lt;&amp;lt;&lt;/span&gt; &lt;span class="n"&gt;socket&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;remote_endpoint&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt; &lt;span class="o"&gt;&amp;lt;&amp;lt;&lt;/span&gt; &lt;span class="sc"&gt;&amp;#39;\n&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
        &lt;span class="n"&gt;mq&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;OnConnect&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;client&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
        &lt;span class="k"&gt;while&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;data&lt;/span&gt; &lt;span class="o"&gt;!=&lt;/span&gt; &lt;span class="s"&gt;&amp;quot;quit&amp;quot;&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
        &lt;span class="p"&gt;{&lt;/span&gt;
            &lt;span class="n"&gt;data&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;clear&lt;/span&gt;&lt;span class="p"&gt;();&lt;/span&gt;
            &lt;span class="n"&gt;co_await&lt;/span&gt; &lt;span class="n"&gt;asio&lt;/span&gt;&lt;span class="o"&gt;::&lt;/span&gt;&lt;span class="n"&gt;async_read_until&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;socket&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;asio&lt;/span&gt;&lt;span class="o"&gt;::&lt;/span&gt;&lt;span class="n"&gt;dynamic_buffer&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;data&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt; &lt;span class="sc"&gt;&amp;#39;\n&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;use_awaitable&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt; &lt;span class="c1"&gt;// line-by-line reading&lt;/span&gt;
            &lt;span class="n"&gt;trim&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;data&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
            &lt;span class="k"&gt;auto&lt;/span&gt; &lt;span class="n"&gt;lines&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;split&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;data&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sc"&gt;&amp;#39;\n&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
            &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="k"&gt;auto&lt;/span&gt;&lt;span class="o"&gt;&amp;amp;&lt;/span&gt; &lt;span class="nl"&gt;line&lt;/span&gt; &lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;lines&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
            &lt;span class="p"&gt;{&lt;/span&gt;
                &lt;span class="n"&gt;trim&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;line&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
                &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;line&lt;/span&gt; &lt;span class="o"&gt;!=&lt;/span&gt; &lt;span class="s"&gt;&amp;quot;&amp;quot;&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
                    &lt;span class="n"&gt;cout&lt;/span&gt; &lt;span class="o"&gt;&amp;lt;&amp;lt;&lt;/span&gt; &lt;span class="s"&gt;&amp;quot;Received: &amp;quot;&lt;/span&gt; &lt;span class="o"&gt;&amp;lt;&amp;lt;&lt;/span&gt; &lt;span class="n"&gt;line&lt;/span&gt; &lt;span class="o"&gt;&amp;lt;&amp;lt;&lt;/span&gt; &lt;span class="n"&gt;endl&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
                    &lt;span class="n"&gt;mq&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;OnMessage&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;client&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;line&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
                &lt;span class="p"&gt;}&lt;/span&gt;
            &lt;span class="p"&gt;}&lt;/span&gt;
        &lt;span class="p"&gt;}&lt;/span&gt;
        &lt;span class="n"&gt;mq&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;OnDisconnect&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;client&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
        &lt;span class="n"&gt;cout&lt;/span&gt; &lt;span class="o"&gt;&amp;lt;&amp;lt;&lt;/span&gt; &lt;span class="s"&gt;&amp;quot;Client disconnected: &amp;quot;&lt;/span&gt; &lt;span class="o"&gt;&amp;lt;&amp;lt;&lt;/span&gt; &lt;span class="n"&gt;socket&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;remote_endpoint&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt; &lt;span class="o"&gt;&amp;lt;&amp;lt;&lt;/span&gt; &lt;span class="sc"&gt;&amp;#39;\n&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
    &lt;span class="p"&gt;}&lt;/span&gt;
    &lt;span class="k"&gt;catch&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="k"&gt;const&lt;/span&gt; &lt;span class="n"&gt;std&lt;/span&gt;&lt;span class="o"&gt;::&lt;/span&gt;&lt;span class="n"&gt;exception&lt;/span&gt;&lt;span class="o"&gt;&amp;amp;&lt;/span&gt; &lt;span class="n"&gt;e&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="n"&gt;cout&lt;/span&gt; &lt;span class="o"&gt;&amp;lt;&amp;lt;&lt;/span&gt; &lt;span class="s"&gt;&amp;quot;Exception in session: &amp;quot;&lt;/span&gt; &lt;span class="o"&gt;&amp;lt;&amp;lt;&lt;/span&gt; &lt;span class="n"&gt;e&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;what&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt; &lt;span class="o"&gt;&amp;lt;&amp;lt;&lt;/span&gt; &lt;span class="sc"&gt;&amp;#39;\n&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
    &lt;span class="p"&gt;}&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;

&lt;span class="n"&gt;awaitable&lt;/span&gt;&lt;span class="o"&gt;&amp;lt;&lt;/span&gt;&lt;span class="kt"&gt;void&lt;/span&gt;&lt;span class="o"&gt;&amp;gt;&lt;/span&gt; &lt;span class="n"&gt;listener&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;io_context&lt;/span&gt;&lt;span class="o"&gt;&amp;amp;&lt;/span&gt; &lt;span class="n"&gt;ctx&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="kt"&gt;unsigned&lt;/span&gt; &lt;span class="kt"&gt;short&lt;/span&gt; &lt;span class="n"&gt;port&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="p"&gt;...&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;

&lt;span class="kt"&gt;int&lt;/span&gt; &lt;span class="n"&gt;main&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="kt"&gt;int&lt;/span&gt; &lt;span class="n"&gt;argc&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="kt"&gt;char&lt;/span&gt;&lt;span class="o"&gt;*&lt;/span&gt; &lt;span class="n"&gt;argv&lt;/span&gt;&lt;span class="p"&gt;[])&lt;/span&gt;
&lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="p"&gt;...&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/pre&gt;&lt;/div&gt;


&lt;p&gt;To compile this code, you will need the &lt;a href="https://think-async.com/Asio/"&gt;ASIO library&lt;/a&gt;. You can test it by running it in a terminal, like so:&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;$ ./AsyncEchoServer &lt;span class="m"&gt;7777&lt;/span&gt;
&lt;/pre&gt;&lt;/div&gt;


&lt;p&gt;Then, in a second terminal, connect to it:&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;$ telnet
&amp;gt; open localhost &lt;span class="m"&gt;7777&lt;/span&gt;
&lt;span class="o"&gt;{&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;command&amp;#39;&lt;/span&gt;:&lt;span class="s1"&gt;&amp;#39;send&amp;#39;&lt;/span&gt;, &lt;span class="s1"&gt;&amp;#39;topic&amp;#39;&lt;/span&gt;:&lt;span class="s1"&gt;&amp;#39;foo&amp;#39;&lt;/span&gt;, &lt;span class="s1"&gt;&amp;#39;msg&amp;#39;&lt;/span&gt;:&lt;span class="s1"&gt;&amp;#39;0&amp;#39;&lt;/span&gt;, &lt;span class="s1"&gt;&amp;#39;delivery&amp;#39;&lt;/span&gt;:&lt;span class="s1"&gt;&amp;#39;one&amp;#39;&lt;/span&gt;&lt;span class="o"&gt;}&lt;/span&gt;
&lt;span class="o"&gt;{&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;command&amp;#39;&lt;/span&gt;:&lt;span class="s1"&gt;&amp;#39;send&amp;#39;&lt;/span&gt;, &lt;span class="s1"&gt;&amp;#39;topic&amp;#39;&lt;/span&gt;:&lt;span class="s1"&gt;&amp;#39;foo&amp;#39;&lt;/span&gt;, &lt;span class="s1"&gt;&amp;#39;msg&amp;#39;&lt;/span&gt;:&lt;span class="s1"&gt;&amp;#39;1&amp;#39;&lt;/span&gt;, &lt;span class="s1"&gt;&amp;#39;delivery&amp;#39;&lt;/span&gt;:&lt;span class="s1"&gt;&amp;#39;all&amp;#39;&lt;/span&gt;&lt;span class="o"&gt;}&lt;/span&gt;
&lt;span class="o"&gt;{&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;command&amp;#39;&lt;/span&gt;:&lt;span class="s1"&gt;&amp;#39;send&amp;#39;&lt;/span&gt;, &lt;span class="s1"&gt;&amp;#39;topic&amp;#39;&lt;/span&gt;:&lt;span class="s1"&gt;&amp;#39;foo&amp;#39;&lt;/span&gt;, &lt;span class="s1"&gt;&amp;#39;msg&amp;#39;&lt;/span&gt;:&lt;span class="s1"&gt;&amp;#39;2&amp;#39;&lt;/span&gt;, &lt;span class="s1"&gt;&amp;#39;delivery&amp;#39;&lt;/span&gt;:&lt;span class="s1"&gt;&amp;#39;all&amp;#39;&lt;/span&gt;&lt;span class="o"&gt;}&lt;/span&gt;
&lt;span class="o"&gt;{&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;command&amp;#39;&lt;/span&gt;:&lt;span class="s1"&gt;&amp;#39;send&amp;#39;&lt;/span&gt;, &lt;span class="s1"&gt;&amp;#39;topic&amp;#39;&lt;/span&gt;:&lt;span class="s1"&gt;&amp;#39;foo&amp;#39;&lt;/span&gt;, &lt;span class="s1"&gt;&amp;#39;msg&amp;#39;&lt;/span&gt;:&lt;span class="s1"&gt;&amp;#39;3&amp;#39;&lt;/span&gt;, &lt;span class="s1"&gt;&amp;#39;delivery&amp;#39;&lt;/span&gt;:&lt;span class="s1"&gt;&amp;#39;one&amp;#39;&lt;/span&gt;&lt;span class="o"&gt;}&lt;/span&gt;
&lt;span class="o"&gt;{&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;command&amp;#39;&lt;/span&gt;:&lt;span class="s1"&gt;&amp;#39;send&amp;#39;&lt;/span&gt;, &lt;span class="s1"&gt;&amp;#39;topic&amp;#39;&lt;/span&gt;:&lt;span class="s1"&gt;&amp;#39;foo&amp;#39;&lt;/span&gt;, &lt;span class="s1"&gt;&amp;#39;msg&amp;#39;&lt;/span&gt;:&lt;span class="s1"&gt;&amp;#39;4&amp;#39;&lt;/span&gt;, &lt;span class="s1"&gt;&amp;#39;delivery&amp;#39;&lt;/span&gt;:&lt;span class="s1"&gt;&amp;#39;one&amp;#39;&lt;/span&gt;&lt;span class="o"&gt;}&lt;/span&gt;
&lt;span class="o"&gt;{&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;command&amp;#39;&lt;/span&gt;:&lt;span class="s1"&gt;&amp;#39;send&amp;#39;&lt;/span&gt;, &lt;span class="s1"&gt;&amp;#39;topic&amp;#39;&lt;/span&gt;:&lt;span class="s1"&gt;&amp;#39;foo&amp;#39;&lt;/span&gt;, &lt;span class="s1"&gt;&amp;#39;msg&amp;#39;&lt;/span&gt;:&lt;span class="s1"&gt;&amp;#39;5&amp;#39;&lt;/span&gt;, &lt;span class="s1"&gt;&amp;#39;delivery&amp;#39;&lt;/span&gt;:&lt;span class="s1"&gt;&amp;#39;all&amp;#39;&lt;/span&gt;&lt;span class="o"&gt;}&lt;/span&gt;
&lt;span class="o"&gt;{&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;command&amp;#39;&lt;/span&gt;:&lt;span class="s1"&gt;&amp;#39;send&amp;#39;&lt;/span&gt;, &lt;span class="s1"&gt;&amp;#39;topic&amp;#39;&lt;/span&gt;:&lt;span class="s1"&gt;&amp;#39;foo&amp;#39;&lt;/span&gt;, &lt;span class="s1"&gt;&amp;#39;msg&amp;#39;&lt;/span&gt;:&lt;span class="s1"&gt;&amp;#39;6&amp;#39;&lt;/span&gt;, &lt;span class="s1"&gt;&amp;#39;delivery&amp;#39;&lt;/span&gt;:&lt;span class="s1"&gt;&amp;#39;all&amp;#39;&lt;/span&gt;&lt;span class="o"&gt;}&lt;/span&gt;
&lt;span class="o"&gt;{&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;command&amp;#39;&lt;/span&gt;:&lt;span class="s1"&gt;&amp;#39;send&amp;#39;&lt;/span&gt;, &lt;span class="s1"&gt;&amp;#39;topic&amp;#39;&lt;/span&gt;:&lt;span class="s1"&gt;&amp;#39;foo&amp;#39;&lt;/span&gt;, &lt;span class="s1"&gt;&amp;#39;msg&amp;#39;&lt;/span&gt;:&lt;span class="s1"&gt;&amp;#39;7&amp;#39;&lt;/span&gt;, &lt;span class="s1"&gt;&amp;#39;delivery&amp;#39;&lt;/span&gt;:&lt;span class="s1"&gt;&amp;#39;one&amp;#39;&lt;/span&gt;&lt;span class="o"&gt;}&lt;/span&gt;
&lt;span class="o"&gt;{&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;command&amp;#39;&lt;/span&gt;:&lt;span class="s1"&gt;&amp;#39;send&amp;#39;&lt;/span&gt;, &lt;span class="s1"&gt;&amp;#39;topic&amp;#39;&lt;/span&gt;:&lt;span class="s1"&gt;&amp;#39;foo&amp;#39;&lt;/span&gt;, &lt;span class="s1"&gt;&amp;#39;msg&amp;#39;&lt;/span&gt;:&lt;span class="s1"&gt;&amp;#39;8&amp;#39;&lt;/span&gt;, &lt;span class="s1"&gt;&amp;#39;delivery&amp;#39;&lt;/span&gt;:&lt;span class="s1"&gt;&amp;#39;all&amp;#39;&lt;/span&gt;&lt;span class="o"&gt;}&lt;/span&gt;
&lt;/pre&gt;&lt;/div&gt;


&lt;p&gt;Then, in a third terminal:&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;$ telnet
&amp;gt; open localhost &lt;span class="m"&gt;7777&lt;/span&gt;
&lt;span class="o"&gt;{&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;command&amp;#39;&lt;/span&gt;:&lt;span class="s1"&gt;&amp;#39;subscribe&amp;#39;&lt;/span&gt;, &lt;span class="s1"&gt;&amp;#39;topic&amp;#39;&lt;/span&gt;:&lt;span class="s1"&gt;&amp;#39;foo&amp;#39;&lt;/span&gt;&lt;span class="o"&gt;}&lt;/span&gt;
quit
&amp;gt; open localhost &lt;span class="m"&gt;7777&lt;/span&gt;
&lt;span class="o"&gt;{&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;command&amp;#39;&lt;/span&gt;:&lt;span class="s1"&gt;&amp;#39;subscribe&amp;#39;&lt;/span&gt;, &lt;span class="s1"&gt;&amp;#39;topic&amp;#39;&lt;/span&gt;:&lt;span class="s1"&gt;&amp;#39;foo&amp;#39;&lt;/span&gt;, &lt;span class="s1"&gt;&amp;#39;last_seen&amp;#39;&lt;/span&gt;: &lt;span class="s1"&gt;&amp;#39;5&amp;#39;&lt;/span&gt;&lt;span class="o"&gt;}&lt;/span&gt;
quit
&lt;/pre&gt;&lt;/div&gt;


&lt;p&gt;The replayed messages will appear in the third terminal: after reconnecting with &lt;code&gt;last_seen&lt;/code&gt; set to 5, the cached messages with a higher index are played back to the subscriber.&lt;/p&gt;
&lt;h2&gt;Conclusion&lt;/h2&gt;
&lt;p&gt;With C++20, the code ended up being significantly more complicated than Python. In terms of LOC, it's about 5x, due to having to include utility functions, &lt;code&gt;#include&lt;/code&gt;s, &lt;code&gt;namespace&lt;/code&gt;s, &lt;code&gt;class declarations&lt;/code&gt;, and because the code is factored into files and classes, a must in C++. The actual logic is identical and equally readable. In the third part of this series, I will write the message queue in Rust.&lt;/p&gt;</content><category term="cpp"></category><category term="async"></category><category term="message"></category><category term="queue"></category></entry><entry><title>Writing a simple C++20 async message queue server - Part I</title><link href="https://bytepawn.com/writing-a-simple-cpp-async-message-queue-server.html" rel="alternate"></link><published>2023-04-02T00:00:00+02:00</published><updated>2023-04-02T00:00:00+02:00</updated><author><name>Marton Trencseni</name></author><id>tag:bytepawn.com,2023-04-02:/writing-a-simple-cpp-async-message-queue-server.html</id><summary type="html">&lt;p&gt;I write a simple, bi-directional async message queue server in modern C++20.&lt;br/&gt;&lt;br/&gt; &lt;img src="/images/cpp_coroutines.png" alt="." style="width: 400px;"/&gt;&lt;/p&gt;</summary><content type="html">&lt;h2&gt;Introduction&lt;/h2&gt;
&lt;p&gt;In 2023, I want to get back into playful systems programming. My plan is to write a simple &lt;a href="https://en.wikipedia.org/wiki/Message_queue"&gt;message queue (MQ)&lt;/a&gt;, and for my own education write the same toy implementation in modern Python, modern C++ and Rust. To make things interesting, I will use async / &lt;a href="https://en.wikipedia.org/wiki/Event-driven_programming"&gt;event-driven&lt;/a&gt; programming and avoid &lt;a href="https://en.wikipedia.org/wiki/Multithreading_(computer_architecture)"&gt;multi-threading&lt;/a&gt;.&lt;/p&gt;
&lt;p&gt;In the previous three posts, I wrote Python implementations, and played around with ChatGPT-3.5 to see whether it can write code of this (not too great) complexity:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;&lt;a href="https://bytepawn.com/writing-a-simple-python-async-message-queue-server.html"&gt;Writing a simple Python async message queue server - Part I&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://bytepawn.com/writing-a-simple-python-async-message-queue-server-part-ii.html"&gt;Writing a simple Python async message queue server - Part II&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://bytepawn.com/can-chatgpt-write-the-python-async-message-queue-codes-from-the-previous-posts.html"&gt;Can ChatGPT write the Python async message queue codes from the previous posts?&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;&lt;img src="/images/cpp_coroutines.png" alt="." style="width: 400px;"/&gt;&lt;/p&gt;
&lt;p&gt;In this article, I will write code similar to the first article: first I will write a simple TCP echo server, and then move on to a MQ server — but this time in modern C++. Similarly to the Python code, I will use co-routines, and event-driven async programming. With modern &lt;a href="https://en.wikipedia.org/wiki/C%2B%2B20"&gt;C++20&lt;/a&gt;, apart from helper functions, the main code looks very similar to the Python version; the implementations match almost line by line.&lt;/p&gt;
&lt;p&gt;The code shown in this article is up on Github:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;&lt;a href="https://gist.github.com/mtrencseni/1bfc597e050126e5c65449b2d6806dd2"&gt;TCPEchoServer.cpp&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://gist.github.com/mtrencseni/29bfaf9b261167365773c6044a2fa321"&gt;SimpleAsyncMessageQueue.cpp&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;h2&gt;Simple async echo server&lt;/h2&gt;
&lt;p&gt;Let's start with writing the simplest possible skeleton server, which listens on a port, accepts TCP connections, reads incoming bytes, and echoes back whatever was sent:&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;span class="cp"&gt;#include&lt;/span&gt; &lt;span class="cpf"&gt;&amp;lt;iostream&amp;gt;&lt;/span&gt;&lt;span class="cp"&gt;&lt;/span&gt;
&lt;span class="cp"&gt;#include&lt;/span&gt; &lt;span class="cpf"&gt;&amp;lt;string&amp;gt;&lt;/span&gt;&lt;span class="cp"&gt;&lt;/span&gt;
&lt;span class="cp"&gt;#include&lt;/span&gt; &lt;span class="cpf"&gt;&amp;lt;asio.hpp&amp;gt;&lt;/span&gt;&lt;span class="cp"&gt;&lt;/span&gt;
&lt;span class="cp"&gt;#include&lt;/span&gt; &lt;span class="cpf"&gt;&amp;lt;asio/io_context.hpp&amp;gt;&lt;/span&gt;&lt;span class="cp"&gt;&lt;/span&gt;

&lt;span class="k"&gt;using&lt;/span&gt; &lt;span class="n"&gt;asio&lt;/span&gt;&lt;span class="o"&gt;::&lt;/span&gt;&lt;span class="n"&gt;awaitable&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;span class="k"&gt;using&lt;/span&gt; &lt;span class="n"&gt;asio&lt;/span&gt;&lt;span class="o"&gt;::&lt;/span&gt;&lt;span class="n"&gt;co_spawn&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;span class="k"&gt;using&lt;/span&gt; &lt;span class="n"&gt;asio&lt;/span&gt;&lt;span class="o"&gt;::&lt;/span&gt;&lt;span class="n"&gt;detached&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;span class="k"&gt;using&lt;/span&gt; &lt;span class="n"&gt;asio&lt;/span&gt;&lt;span class="o"&gt;::&lt;/span&gt;&lt;span class="n"&gt;ip&lt;/span&gt;&lt;span class="o"&gt;::&lt;/span&gt;&lt;span class="n"&gt;tcp&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;span class="k"&gt;using&lt;/span&gt; &lt;span class="n"&gt;asio&lt;/span&gt;&lt;span class="o"&gt;::&lt;/span&gt;&lt;span class="n"&gt;use_awaitable&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;span class="k"&gt;using&lt;/span&gt; &lt;span class="n"&gt;asio&lt;/span&gt;&lt;span class="o"&gt;::&lt;/span&gt;&lt;span class="n"&gt;io_context&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;span class="k"&gt;using&lt;/span&gt; &lt;span class="k"&gt;namespace&lt;/span&gt; &lt;span class="n"&gt;std&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;

&lt;span class="n"&gt;awaitable&lt;/span&gt;&lt;span class="o"&gt;&amp;lt;&lt;/span&gt;&lt;span class="kt"&gt;void&lt;/span&gt;&lt;span class="o"&gt;&amp;gt;&lt;/span&gt; &lt;span class="n"&gt;session&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;tcp&lt;/span&gt;&lt;span class="o"&gt;::&lt;/span&gt;&lt;span class="n"&gt;socket&lt;/span&gt; &lt;span class="n"&gt;socket&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="n"&gt;string&lt;/span&gt; &lt;span class="n"&gt;data&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
    &lt;span class="n"&gt;cout&lt;/span&gt; &lt;span class="o"&gt;&amp;lt;&amp;lt;&lt;/span&gt; &lt;span class="s"&gt;&amp;quot;Client connected: &amp;quot;&lt;/span&gt; &lt;span class="o"&gt;&amp;lt;&amp;lt;&lt;/span&gt; &lt;span class="n"&gt;socket&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;remote_endpoint&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt; &lt;span class="o"&gt;&amp;lt;&amp;lt;&lt;/span&gt; &lt;span class="sc"&gt;&amp;#39;\n&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
    &lt;span class="k"&gt;while&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;data&lt;/span&gt; &lt;span class="o"&gt;!=&lt;/span&gt; &lt;span class="s"&gt;&amp;quot;quit&amp;quot;&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="n"&gt;data&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;clear&lt;/span&gt;&lt;span class="p"&gt;();&lt;/span&gt;
        &lt;span class="n"&gt;co_await&lt;/span&gt; &lt;span class="n"&gt;asio&lt;/span&gt;&lt;span class="o"&gt;::&lt;/span&gt;&lt;span class="n"&gt;async_read_until&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;socket&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;asio&lt;/span&gt;&lt;span class="o"&gt;::&lt;/span&gt;&lt;span class="n"&gt;dynamic_buffer&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;data&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt; &lt;span class="sc"&gt;&amp;#39;\n&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;use_awaitable&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt; &lt;span class="c1"&gt;// line-by-line reading&lt;/span&gt;
        &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;data&lt;/span&gt; &lt;span class="o"&gt;!=&lt;/span&gt; &lt;span class="s"&gt;&amp;quot;&amp;quot;&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
            &lt;span class="n"&gt;co_await&lt;/span&gt; &lt;span class="n"&gt;asio&lt;/span&gt;&lt;span class="o"&gt;::&lt;/span&gt;&lt;span class="n"&gt;async_write&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;socket&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;asio&lt;/span&gt;&lt;span class="o"&gt;::&lt;/span&gt;&lt;span class="n"&gt;buffer&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;data&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt; &lt;span class="n"&gt;use_awaitable&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
    &lt;span class="p"&gt;}&lt;/span&gt;
    &lt;span class="n"&gt;cout&lt;/span&gt; &lt;span class="o"&gt;&amp;lt;&amp;lt;&lt;/span&gt; &lt;span class="s"&gt;&amp;quot;Client disconnected: &amp;quot;&lt;/span&gt; &lt;span class="o"&gt;&amp;lt;&amp;lt;&lt;/span&gt; &lt;span class="n"&gt;socket&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;remote_endpoint&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt; &lt;span class="o"&gt;&amp;lt;&amp;lt;&lt;/span&gt; &lt;span class="sc"&gt;&amp;#39;\n&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;

&lt;span class="n"&gt;awaitable&lt;/span&gt;&lt;span class="o"&gt;&amp;lt;&lt;/span&gt;&lt;span class="kt"&gt;void&lt;/span&gt;&lt;span class="o"&gt;&amp;gt;&lt;/span&gt; &lt;span class="n"&gt;listener&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;io_context&lt;/span&gt;&lt;span class="o"&gt;&amp;amp;&lt;/span&gt; &lt;span class="n"&gt;ctx&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="kt"&gt;unsigned&lt;/span&gt; &lt;span class="kt"&gt;short&lt;/span&gt; &lt;span class="n"&gt;port&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="n"&gt;tcp&lt;/span&gt;&lt;span class="o"&gt;::&lt;/span&gt;&lt;span class="n"&gt;acceptor&lt;/span&gt; &lt;span class="n"&gt;acceptor&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;ctx&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="n"&gt;tcp&lt;/span&gt;&lt;span class="o"&gt;::&lt;/span&gt;&lt;span class="n"&gt;v4&lt;/span&gt;&lt;span class="p"&gt;(),&lt;/span&gt; &lt;span class="n"&gt;port&lt;/span&gt; &lt;span class="p"&gt;});&lt;/span&gt;
    &lt;span class="n"&gt;cout&lt;/span&gt; &lt;span class="o"&gt;&amp;lt;&amp;lt;&lt;/span&gt; &lt;span class="s"&gt;&amp;quot;Server listening on port &amp;quot;&lt;/span&gt; &lt;span class="o"&gt;&amp;lt;&amp;lt;&lt;/span&gt; &lt;span class="n"&gt;port&lt;/span&gt; &lt;span class="o"&gt;&amp;lt;&amp;lt;&lt;/span&gt; &lt;span class="s"&gt;&amp;quot;...&amp;quot;&lt;/span&gt; &lt;span class="o"&gt;&amp;lt;&amp;lt;&lt;/span&gt; &lt;span class="n"&gt;endl&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
    &lt;span class="k"&gt;while&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nb"&gt;true&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="n"&gt;tcp&lt;/span&gt;&lt;span class="o"&gt;::&lt;/span&gt;&lt;span class="n"&gt;socket&lt;/span&gt; &lt;span class="n"&gt;socket&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;co_await&lt;/span&gt; &lt;span class="n"&gt;acceptor&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;async_accept&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;use_awaitable&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
        &lt;span class="n"&gt;co_spawn&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;ctx&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;session&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;move&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;socket&lt;/span&gt;&lt;span class="p"&gt;)),&lt;/span&gt; &lt;span class="n"&gt;detached&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
    &lt;span class="p"&gt;}&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;

&lt;span class="kt"&gt;int&lt;/span&gt; &lt;span class="n"&gt;main&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="kt"&gt;int&lt;/span&gt; &lt;span class="n"&gt;argc&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="kt"&gt;char&lt;/span&gt;&lt;span class="o"&gt;*&lt;/span&gt; &lt;span class="n"&gt;argv&lt;/span&gt;&lt;span class="p"&gt;[])&lt;/span&gt;
&lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;argc&lt;/span&gt; &lt;span class="o"&gt;&amp;lt;&lt;/span&gt; &lt;span class="mi"&gt;2&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="n"&gt;cerr&lt;/span&gt; &lt;span class="o"&gt;&amp;lt;&amp;lt;&lt;/span&gt; &lt;span class="s"&gt;&amp;quot;Usage: &amp;quot;&lt;/span&gt; &lt;span class="o"&gt;&amp;lt;&amp;lt;&lt;/span&gt; &lt;span class="n"&gt;argv&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="o"&gt;&amp;lt;&amp;lt;&lt;/span&gt; &lt;span class="s"&gt;&amp;quot; &amp;lt;port&amp;gt;&amp;quot;&lt;/span&gt; &lt;span class="o"&gt;&amp;lt;&amp;lt;&lt;/span&gt; &lt;span class="n"&gt;endl&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
        &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
    &lt;span class="p"&gt;}&lt;/span&gt;

    &lt;span class="n"&gt;io_context&lt;/span&gt; &lt;span class="n"&gt;ctx&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
    &lt;span class="n"&gt;string&lt;/span&gt; &lt;span class="n"&gt;arg&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;argv&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;];&lt;/span&gt;
    &lt;span class="kt"&gt;size_t&lt;/span&gt; &lt;span class="n"&gt;pos&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
    &lt;span class="kt"&gt;unsigned&lt;/span&gt; &lt;span class="kt"&gt;short&lt;/span&gt; &lt;span class="n"&gt;port&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;stoi&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;arg&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="o"&gt;&amp;amp;&lt;/span&gt;&lt;span class="n"&gt;pos&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;

    &lt;span class="n"&gt;asio&lt;/span&gt;&lt;span class="o"&gt;::&lt;/span&gt;&lt;span class="n"&gt;signal_set&lt;/span&gt; &lt;span class="n"&gt;signals&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;ctx&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;SIGINT&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;SIGTERM&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
    &lt;span class="n"&gt;signals&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;async_wait&lt;/span&gt;&lt;span class="p"&gt;([&lt;/span&gt;&lt;span class="o"&gt;&amp;amp;&lt;/span&gt;&lt;span class="p"&gt;](&lt;/span&gt;&lt;span class="k"&gt;auto&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="k"&gt;auto&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="n"&gt;ctx&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;stop&lt;/span&gt;&lt;span class="p"&gt;();&lt;/span&gt; &lt;span class="p"&gt;});&lt;/span&gt;
    &lt;span class="k"&gt;auto&lt;/span&gt; &lt;span class="n"&gt;listen&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;listener&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;ctx&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;port&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
    &lt;span class="n"&gt;co_spawn&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;ctx&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;move&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;listen&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt; &lt;span class="n"&gt;asio&lt;/span&gt;&lt;span class="o"&gt;::&lt;/span&gt;&lt;span class="n"&gt;detached&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
    &lt;span class="n"&gt;ctx&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;run&lt;/span&gt;&lt;span class="p"&gt;();&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/pre&gt;&lt;/div&gt;


&lt;p&gt;The code is straightforward, assuming one has read the previous Python code. It uses the &lt;a href="https://think-async.com/Asio/asio-1.22.0/doc/asio/overview/core/cpp20_coroutines.html"&gt;ASIO library&lt;/a&gt;, which is built on top of the &lt;a href="https://en.cppreference.com/w/cpp/language/coroutines#:~:text=A%20coroutine%20is%20a%20function,stored%20separately%20from%20the%20stack."&gt;C++20 language feature of co-routines&lt;/a&gt;, and adds asynchronous I/O functionality. It starts the co-routine loop by &lt;code&gt;co_spawn&lt;/code&gt;ing the &lt;code&gt;listener()&lt;/code&gt; co-routine, which in turn listens on the given port, and spawns a &lt;code&gt;session()&lt;/code&gt; co-routine for each new connected client. For each client, we use &lt;code&gt;async_read_until()&lt;/code&gt; to read a new line of text, and then write it back with &lt;code&gt;async_write()&lt;/code&gt;. The client can disconnect by typing &lt;code&gt;quit&lt;/code&gt;. Note that I skipped proper error handling here (which could be achieved by using &lt;code&gt;try/catch&lt;/code&gt; blocks) to make the code more readable.&lt;/p&gt;
&lt;p&gt;Note that co-routines are a language feature in C++20; &lt;code&gt;co_await&lt;/code&gt; is a language keyword. &lt;code&gt;co_spawn()&lt;/code&gt; is not a language feature, it's a function within ASIO that internally constructs awaitable objects and then uses &lt;code&gt;co_await&lt;/code&gt; to trigger the co-routines.&lt;/p&gt;
&lt;p&gt;To compile this code, you will need the &lt;a href="https://think-async.com/Asio/"&gt;ASIO library&lt;/a&gt;. You can test it by running it in a terminal, like:&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;$ ./TCPEchoServer &lt;span class="m"&gt;7777&lt;/span&gt;
&lt;/pre&gt;&lt;/div&gt;


&lt;p&gt;Then, in another terminal, connect to it:&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;$ telnet
&amp;gt; open localhost &lt;span class="m"&gt;7777&lt;/span&gt;
foo
foo
bar
bar
&lt;/pre&gt;&lt;/div&gt;


&lt;h2&gt;Simple async message queue in C++20&lt;/h2&gt;
&lt;p&gt;The above skeleton can be extended to a simple bi-directional message queue. Bi-directional means that clients can both subscribe to topics and receive messages sent to that topic, as well as send messages to arbitrary topics. The server accepts two kinds of structured messages, &lt;code&gt;subscribe&lt;/code&gt; and &lt;code&gt;send&lt;/code&gt;.&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;{&amp;#39;command&amp;#39;:&amp;#39;subscribe&amp;#39;, &amp;#39;topic&amp;#39;:&amp;#39;foo&amp;#39;              }
{&amp;#39;command&amp;#39;:&amp;#39;send&amp;#39;,      &amp;#39;topic&amp;#39;:&amp;#39;foo&amp;#39;, &amp;#39;msg&amp;#39;:&amp;#39;blah&amp;#39;}
&lt;/pre&gt;&lt;/div&gt;


&lt;p&gt;The semantics are simple: after a &lt;code&gt;subscribe&lt;/code&gt;, the client receives every message that other clients subsequently &lt;code&gt;send&lt;/code&gt; to that topic. In this simple implementation, all subscribers of a topic get the message.&lt;/p&gt;
&lt;p&gt;The core logic is barely longer than the echo server's, since the problem is very similar: we just have to maintain a set of &lt;code&gt;tcp::socket&lt;/code&gt;s for each topic, and when we receive a message for a topic, go through those sockets and send the message. Still, unlike the Python implementation, which came in just a bit longer than the echo server, the C++20 version is a fair amount longer, because we need to write some helper functions to keep the main code clean. For example, parsing a dictionary like the ones we use was "free" in Python thanks to the built-in &lt;code&gt;ast.literal_eval()&lt;/code&gt;; in C++ we have to write this function ourselves.&lt;/p&gt;
&lt;p&gt;Let's see the boilerplate and the helper functions first:&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;span class="cp"&gt;#include&lt;/span&gt; &lt;span class="cpf"&gt;&amp;lt;map&amp;gt;&lt;/span&gt;&lt;span class="cp"&gt;&lt;/span&gt;
&lt;span class="cp"&gt;#include&lt;/span&gt; &lt;span class="cpf"&gt;&amp;lt;regex&amp;gt;&lt;/span&gt;&lt;span class="cp"&gt;&lt;/span&gt;
&lt;span class="cp"&gt;#include&lt;/span&gt; &lt;span class="cpf"&gt;&amp;lt;iostream&amp;gt;&lt;/span&gt;&lt;span class="cp"&gt;&lt;/span&gt;
&lt;span class="cp"&gt;#include&lt;/span&gt; &lt;span class="cpf"&gt;&amp;lt;sstream&amp;gt;&lt;/span&gt;&lt;span class="cp"&gt;&lt;/span&gt;
&lt;span class="cp"&gt;#include&lt;/span&gt; &lt;span class="cpf"&gt;&amp;lt;vector&amp;gt;&lt;/span&gt;&lt;span class="cp"&gt;&lt;/span&gt;
&lt;span class="cp"&gt;#include&lt;/span&gt; &lt;span class="cpf"&gt;&amp;lt;string&amp;gt;&lt;/span&gt;&lt;span class="cp"&gt;&lt;/span&gt;
&lt;span class="cp"&gt;#include&lt;/span&gt; &lt;span class="cpf"&gt;&amp;lt;unordered_map&amp;gt;&lt;/span&gt;&lt;span class="cp"&gt;&lt;/span&gt;
&lt;span class="cp"&gt;#include&lt;/span&gt; &lt;span class="cpf"&gt;&amp;lt;unordered_set&amp;gt;&lt;/span&gt;&lt;span class="cp"&gt;&lt;/span&gt;
&lt;span class="cp"&gt;#include&lt;/span&gt; &lt;span class="cpf"&gt;&amp;lt;algorithm&amp;gt;&lt;/span&gt;&lt;span class="cp"&gt;&lt;/span&gt;
&lt;span class="cp"&gt;#include&lt;/span&gt; &lt;span class="cpf"&gt;&amp;lt;cctype&amp;gt;&lt;/span&gt;&lt;span class="cp"&gt;&lt;/span&gt;
&lt;span class="cp"&gt;#include&lt;/span&gt; &lt;span class="cpf"&gt;&amp;lt;asio.hpp&amp;gt;&lt;/span&gt;&lt;span class="cp"&gt;&lt;/span&gt;
&lt;span class="cp"&gt;#include&lt;/span&gt; &lt;span class="cpf"&gt;&amp;lt;asio/io_context.hpp&amp;gt;&lt;/span&gt;&lt;span class="cp"&gt;&lt;/span&gt;
&lt;span class="cp"&gt;#include&lt;/span&gt; &lt;span class="cpf"&gt;&amp;lt;signal.h&amp;gt;&lt;/span&gt;&lt;span class="cp"&gt;&lt;/span&gt;

&lt;span class="k"&gt;using&lt;/span&gt; &lt;span class="n"&gt;asio&lt;/span&gt;&lt;span class="o"&gt;::&lt;/span&gt;&lt;span class="n"&gt;awaitable&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;span class="k"&gt;using&lt;/span&gt; &lt;span class="n"&gt;asio&lt;/span&gt;&lt;span class="o"&gt;::&lt;/span&gt;&lt;span class="n"&gt;co_spawn&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;span class="k"&gt;using&lt;/span&gt; &lt;span class="n"&gt;asio&lt;/span&gt;&lt;span class="o"&gt;::&lt;/span&gt;&lt;span class="n"&gt;detached&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;span class="k"&gt;using&lt;/span&gt; &lt;span class="n"&gt;asio&lt;/span&gt;&lt;span class="o"&gt;::&lt;/span&gt;&lt;span class="n"&gt;ip&lt;/span&gt;&lt;span class="o"&gt;::&lt;/span&gt;&lt;span class="n"&gt;tcp&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;span class="k"&gt;using&lt;/span&gt; &lt;span class="n"&gt;asio&lt;/span&gt;&lt;span class="o"&gt;::&lt;/span&gt;&lt;span class="n"&gt;use_awaitable&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;span class="k"&gt;using&lt;/span&gt; &lt;span class="n"&gt;asio&lt;/span&gt;&lt;span class="o"&gt;::&lt;/span&gt;&lt;span class="n"&gt;io_context&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;span class="k"&gt;using&lt;/span&gt; &lt;span class="k"&gt;namespace&lt;/span&gt; &lt;span class="n"&gt;std&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;

&lt;span class="kt"&gt;void&lt;/span&gt; &lt;span class="nf"&gt;trim&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;string&lt;/span&gt;&lt;span class="o"&gt;&amp;amp;&lt;/span&gt; &lt;span class="n"&gt;str&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="k"&gt;auto&lt;/span&gt; &lt;span class="n"&gt;is_space&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[](&lt;/span&gt;&lt;span class="kt"&gt;char&lt;/span&gt; &lt;span class="n"&gt;c&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;std&lt;/span&gt;&lt;span class="o"&gt;::&lt;/span&gt;&lt;span class="n"&gt;isspace&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="k"&gt;static_cast&lt;/span&gt;&lt;span class="o"&gt;&amp;lt;&lt;/span&gt;&lt;span class="kt"&gt;unsigned&lt;/span&gt; &lt;span class="kt"&gt;char&lt;/span&gt;&lt;span class="o"&gt;&amp;gt;&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;c&lt;/span&gt;&lt;span class="p"&gt;));&lt;/span&gt; &lt;span class="p"&gt;};&lt;/span&gt;
    &lt;span class="n"&gt;str&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;erase&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;str&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;begin&lt;/span&gt;&lt;span class="p"&gt;(),&lt;/span&gt; &lt;span class="n"&gt;std&lt;/span&gt;&lt;span class="o"&gt;::&lt;/span&gt;&lt;span class="n"&gt;find_if&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;str&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;begin&lt;/span&gt;&lt;span class="p"&gt;(),&lt;/span&gt; &lt;span class="n"&gt;str&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;end&lt;/span&gt;&lt;span class="p"&gt;(),&lt;/span&gt; &lt;span class="n"&gt;std&lt;/span&gt;&lt;span class="o"&gt;::&lt;/span&gt;&lt;span class="n"&gt;not_fn&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;is_space&lt;/span&gt;&lt;span class="p"&gt;)));&lt;/span&gt;
    &lt;span class="n"&gt;str&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;erase&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;std&lt;/span&gt;&lt;span class="o"&gt;::&lt;/span&gt;&lt;span class="n"&gt;find_if&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;str&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;rbegin&lt;/span&gt;&lt;span class="p"&gt;(),&lt;/span&gt; &lt;span class="n"&gt;str&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;rend&lt;/span&gt;&lt;span class="p"&gt;(),&lt;/span&gt; &lt;span class="n"&gt;std&lt;/span&gt;&lt;span class="o"&gt;::&lt;/span&gt;&lt;span class="n"&gt;not_fn&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;is_space&lt;/span&gt;&lt;span class="p"&gt;)).&lt;/span&gt;&lt;span class="n"&gt;base&lt;/span&gt;&lt;span class="p"&gt;(),&lt;/span&gt; &lt;span class="n"&gt;str&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;end&lt;/span&gt;&lt;span class="p"&gt;());&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;

&lt;span class="kt"&gt;bool&lt;/span&gt; &lt;span class="nf"&gt;startswith&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;string_view&lt;/span&gt; &lt;span class="n"&gt;s&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;string_view&lt;/span&gt; &lt;span class="n"&gt;prefix&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;s&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;find&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;prefix&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;==&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;

&lt;span class="n"&gt;vector&lt;/span&gt;&lt;span class="o"&gt;&amp;lt;&lt;/span&gt;&lt;span class="n"&gt;string&lt;/span&gt;&lt;span class="o"&gt;&amp;gt;&lt;/span&gt; &lt;span class="n"&gt;split&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="k"&gt;const&lt;/span&gt; &lt;span class="n"&gt;string&lt;/span&gt;&lt;span class="o"&gt;&amp;amp;&lt;/span&gt; &lt;span class="n"&gt;s&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="kt"&gt;char&lt;/span&gt; &lt;span class="n"&gt;delim&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="n"&gt;vector&lt;/span&gt;&lt;span class="o"&gt;&amp;lt;&lt;/span&gt;&lt;span class="n"&gt;string&lt;/span&gt;&lt;span class="o"&gt;&amp;gt;&lt;/span&gt; &lt;span class="n"&gt;result&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
    &lt;span class="n"&gt;stringstream&lt;/span&gt; &lt;span class="nf"&gt;ss&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;s&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
    &lt;span class="n"&gt;string&lt;/span&gt; &lt;span class="n"&gt;item&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;

    &lt;span class="k"&gt;while&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;getline&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;ss&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;item&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;delim&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;
        &lt;span class="n"&gt;result&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;push_back&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;item&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;

    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;result&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;

&lt;span class="k"&gt;typedef&lt;/span&gt; &lt;span class="n"&gt;map&lt;/span&gt;&lt;span class="o"&gt;&amp;lt;&lt;/span&gt;&lt;span class="n"&gt;string&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;string&lt;/span&gt;&lt;span class="o"&gt;&amp;gt;&lt;/span&gt; &lt;span class="n"&gt;Dict&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;span class="n"&gt;Dict&lt;/span&gt; &lt;span class="nf"&gt;parse_dict&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="k"&gt;const&lt;/span&gt; &lt;span class="n"&gt;string&lt;/span&gt;&lt;span class="o"&gt;&amp;amp;&lt;/span&gt; &lt;span class="n"&gt;dict_str&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="n"&gt;Dict&lt;/span&gt; &lt;span class="n"&gt;result&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
    &lt;span class="n"&gt;regex&lt;/span&gt; &lt;span class="n"&gt;dict_pattern&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sa"&gt;R&lt;/span&gt;&lt;span class="s"&gt;&amp;quot;&lt;/span&gt;&lt;span class="dl"&gt;(&lt;/span&gt;&lt;span class="s"&gt;\s*&amp;#39;([^&amp;#39;]+)&amp;#39;\s*:\s*&amp;#39;([^&amp;#39;]+)&amp;#39;\s*&lt;/span&gt;&lt;span class="dl"&gt;)&lt;/span&gt;&lt;span class="s"&gt;&amp;quot;&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
    &lt;span class="n"&gt;smatch&lt;/span&gt; &lt;span class="n"&gt;matches&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;

    &lt;span class="k"&gt;auto&lt;/span&gt; &lt;span class="n"&gt;search_start&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;dict_str&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;begin&lt;/span&gt;&lt;span class="p"&gt;();&lt;/span&gt;
    &lt;span class="k"&gt;while&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;regex_search&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;search_start&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;dict_str&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;end&lt;/span&gt;&lt;span class="p"&gt;(),&lt;/span&gt; &lt;span class="n"&gt;matches&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;dict_pattern&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="n"&gt;result&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;matches&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;].&lt;/span&gt;&lt;span class="n"&gt;str&lt;/span&gt;&lt;span class="p"&gt;()]&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;matches&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="mi"&gt;2&lt;/span&gt;&lt;span class="p"&gt;].&lt;/span&gt;&lt;span class="n"&gt;str&lt;/span&gt;&lt;span class="p"&gt;();&lt;/span&gt;
        &lt;span class="n"&gt;search_start&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;matches&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;].&lt;/span&gt;&lt;span class="n"&gt;second&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
    &lt;span class="p"&gt;}&lt;/span&gt;

    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;result&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/pre&gt;&lt;/div&gt;


&lt;p&gt;We implement four helper functions to keep the main code clean and similar to the Python version. &lt;code&gt;void trim(string&amp;amp; str)&lt;/code&gt; removes whitespace from both ends of a string. &lt;code&gt;bool startswith(string_view s, string_view prefix)&lt;/code&gt; returns &lt;code&gt;true&lt;/code&gt; if &lt;code&gt;s&lt;/code&gt; starts with &lt;code&gt;prefix&lt;/code&gt;. &lt;code&gt;vector&amp;lt;string&amp;gt; split(const string&amp;amp; s, char delim)&lt;/code&gt; splits &lt;code&gt;s&lt;/code&gt; into a vector of strings on &lt;code&gt;delim&lt;/code&gt;. &lt;code&gt;Dict parse_dict(const string&amp;amp; dict_str)&lt;/code&gt; returns a map of strings to strings based on the contents of &lt;code&gt;dict_str&lt;/code&gt;, assuming it is flat and well-formed; &lt;code&gt;Dict&lt;/code&gt; is typedef'd to &lt;code&gt;map&amp;lt;string, string&amp;gt;&lt;/code&gt;.&lt;/p&gt;
&lt;p&gt;Now the main code:&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;span class="n"&gt;awaitable&lt;/span&gt;&lt;span class="o"&gt;&amp;lt;&lt;/span&gt;&lt;span class="kt"&gt;void&lt;/span&gt;&lt;span class="o"&gt;&amp;gt;&lt;/span&gt; &lt;span class="n"&gt;session&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;tcp&lt;/span&gt;&lt;span class="o"&gt;::&lt;/span&gt;&lt;span class="n"&gt;socket&lt;/span&gt; &lt;span class="n"&gt;socket&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="k"&gt;try&lt;/span&gt;
    &lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="n"&gt;string&lt;/span&gt; &lt;span class="n"&gt;line&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
        &lt;span class="k"&gt;static&lt;/span&gt; &lt;span class="n"&gt;unordered_map&lt;/span&gt;&lt;span class="o"&gt;&amp;lt;&lt;/span&gt;&lt;span class="n"&gt;string&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;unordered_set&lt;/span&gt;&lt;span class="o"&gt;&amp;lt;&lt;/span&gt; &lt;span class="n"&gt;tcp&lt;/span&gt;&lt;span class="o"&gt;::&lt;/span&gt;&lt;span class="n"&gt;socket&lt;/span&gt;&lt;span class="o"&gt;*&lt;/span&gt; &lt;span class="o"&gt;&amp;gt;&amp;gt;&lt;/span&gt; &lt;span class="n"&gt;subscribers&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
        &lt;span class="n"&gt;unordered_set&lt;/span&gt;&lt;span class="o"&gt;&amp;lt;&lt;/span&gt;&lt;span class="n"&gt;string&lt;/span&gt;&lt;span class="o"&gt;&amp;gt;&lt;/span&gt; &lt;span class="n"&gt;subscriptions&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
        &lt;span class="n"&gt;Dict&lt;/span&gt; &lt;span class="n"&gt;mesg&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
        &lt;span class="n"&gt;cout&lt;/span&gt; &lt;span class="o"&gt;&amp;lt;&amp;lt;&lt;/span&gt; &lt;span class="s"&gt;&amp;quot;Client connected: &amp;quot;&lt;/span&gt; &lt;span class="o"&gt;&amp;lt;&amp;lt;&lt;/span&gt; &lt;span class="n"&gt;socket&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;remote_endpoint&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt; &lt;span class="o"&gt;&amp;lt;&amp;lt;&lt;/span&gt; &lt;span class="sc"&gt;&amp;#39;\n&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
        &lt;span class="k"&gt;while&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;line&lt;/span&gt; &lt;span class="o"&gt;!=&lt;/span&gt; &lt;span class="s"&gt;&amp;quot;quit&amp;quot;&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
        &lt;span class="p"&gt;{&lt;/span&gt;
            &lt;span class="n"&gt;line&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;clear&lt;/span&gt;&lt;span class="p"&gt;();&lt;/span&gt;
            &lt;span class="n"&gt;co_await&lt;/span&gt; &lt;span class="n"&gt;asio&lt;/span&gt;&lt;span class="o"&gt;::&lt;/span&gt;&lt;span class="n"&gt;async_read_until&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;socket&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;asio&lt;/span&gt;&lt;span class="o"&gt;::&lt;/span&gt;&lt;span class="n"&gt;dynamic_buffer&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;line&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt; &lt;span class="sc"&gt;&amp;#39;\n&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;use_awaitable&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt; &lt;span class="c1"&gt;// line-by-line reading&lt;/span&gt;
            &lt;span class="n"&gt;mesg&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;parse_dict&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;line&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
            &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;mesg&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="s"&gt;&amp;quot;command&amp;quot;&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="o"&gt;==&lt;/span&gt; &lt;span class="s"&gt;&amp;quot;subscribe&amp;quot;&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
            &lt;span class="p"&gt;{&lt;/span&gt;
                &lt;span class="n"&gt;subscribers&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;mesg&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="s"&gt;&amp;quot;topic&amp;quot;&lt;/span&gt;&lt;span class="p"&gt;]].&lt;/span&gt;&lt;span class="n"&gt;insert&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="o"&gt;&amp;amp;&lt;/span&gt;&lt;span class="n"&gt;socket&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
                &lt;span class="n"&gt;subscriptions&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;insert&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;mesg&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="s"&gt;&amp;quot;topic&amp;quot;&lt;/span&gt;&lt;span class="p"&gt;]);&lt;/span&gt;
            &lt;span class="p"&gt;}&lt;/span&gt;
            &lt;span class="k"&gt;else&lt;/span&gt; &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;mesg&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="s"&gt;&amp;quot;command&amp;quot;&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="o"&gt;==&lt;/span&gt; &lt;span class="s"&gt;&amp;quot;send&amp;quot;&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
            &lt;span class="p"&gt;{&lt;/span&gt;
                &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="k"&gt;auto&lt;/span&gt;&lt;span class="o"&gt;&amp;amp;&lt;/span&gt; &lt;span class="nl"&gt;subscriber&lt;/span&gt; &lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;subscribers&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;mesg&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="s"&gt;&amp;quot;topic&amp;quot;&lt;/span&gt;&lt;span class="p"&gt;]])&lt;/span&gt;
                    &lt;span class="n"&gt;co_await&lt;/span&gt; &lt;span class="n"&gt;asio&lt;/span&gt;&lt;span class="o"&gt;::&lt;/span&gt;&lt;span class="n"&gt;async_write&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="n"&gt;subscriber&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;asio&lt;/span&gt;&lt;span class="o"&gt;::&lt;/span&gt;&lt;span class="n"&gt;buffer&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;line&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt; &lt;span class="n"&gt;use_awaitable&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
            &lt;span class="p"&gt;}&lt;/span&gt;
        &lt;span class="p"&gt;}&lt;/span&gt;
        &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="k"&gt;const&lt;/span&gt; &lt;span class="n"&gt;string&lt;/span&gt;&lt;span class="o"&gt;&amp;amp;&lt;/span&gt; &lt;span class="nl"&gt;topic&lt;/span&gt; &lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;subscriptions&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
            &lt;span class="n"&gt;subscribers&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;topic&lt;/span&gt;&lt;span class="p"&gt;].&lt;/span&gt;&lt;span class="n"&gt;erase&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="o"&gt;&amp;amp;&lt;/span&gt;&lt;span class="n"&gt;socket&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
        &lt;span class="n"&gt;cout&lt;/span&gt; &lt;span class="o"&gt;&amp;lt;&amp;lt;&lt;/span&gt; &lt;span class="s"&gt;&amp;quot;Client disconnected: &amp;quot;&lt;/span&gt; &lt;span class="o"&gt;&amp;lt;&amp;lt;&lt;/span&gt; &lt;span class="n"&gt;socket&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;remote_endpoint&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt; &lt;span class="o"&gt;&amp;lt;&amp;lt;&lt;/span&gt; &lt;span class="sc"&gt;&amp;#39;\n&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
    &lt;span class="p"&gt;}&lt;/span&gt;
    &lt;span class="k"&gt;catch&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="k"&gt;const&lt;/span&gt; &lt;span class="n"&gt;std&lt;/span&gt;&lt;span class="o"&gt;::&lt;/span&gt;&lt;span class="n"&gt;exception&lt;/span&gt;&lt;span class="o"&gt;&amp;amp;&lt;/span&gt; &lt;span class="n"&gt;e&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="n"&gt;cout&lt;/span&gt; &lt;span class="o"&gt;&amp;lt;&amp;lt;&lt;/span&gt; &lt;span class="s"&gt;&amp;quot;Exception in session: &amp;quot;&lt;/span&gt; &lt;span class="o"&gt;&amp;lt;&amp;lt;&lt;/span&gt; &lt;span class="n"&gt;e&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;what&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt; &lt;span class="o"&gt;&amp;lt;&amp;lt;&lt;/span&gt; &lt;span class="sc"&gt;&amp;#39;\n&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
    &lt;span class="p"&gt;}&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;

&lt;span class="n"&gt;awaitable&lt;/span&gt;&lt;span class="o"&gt;&amp;lt;&lt;/span&gt;&lt;span class="kt"&gt;void&lt;/span&gt;&lt;span class="o"&gt;&amp;gt;&lt;/span&gt; &lt;span class="n"&gt;listener&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;io_context&lt;/span&gt;&lt;span class="o"&gt;&amp;amp;&lt;/span&gt; &lt;span class="n"&gt;ctx&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="kt"&gt;unsigned&lt;/span&gt; &lt;span class="kt"&gt;short&lt;/span&gt; &lt;span class="n"&gt;port&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="n"&gt;tcp&lt;/span&gt;&lt;span class="o"&gt;::&lt;/span&gt;&lt;span class="n"&gt;acceptor&lt;/span&gt; &lt;span class="n"&gt;acceptor&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;ctx&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="n"&gt;tcp&lt;/span&gt;&lt;span class="o"&gt;::&lt;/span&gt;&lt;span class="n"&gt;v4&lt;/span&gt;&lt;span class="p"&gt;(),&lt;/span&gt; &lt;span class="n"&gt;port&lt;/span&gt; &lt;span class="p"&gt;});&lt;/span&gt;
    &lt;span class="n"&gt;cout&lt;/span&gt; &lt;span class="o"&gt;&amp;lt;&amp;lt;&lt;/span&gt; &lt;span class="s"&gt;&amp;quot;Server listening on port &amp;quot;&lt;/span&gt; &lt;span class="o"&gt;&amp;lt;&amp;lt;&lt;/span&gt; &lt;span class="n"&gt;port&lt;/span&gt; &lt;span class="o"&gt;&amp;lt;&amp;lt;&lt;/span&gt; &lt;span class="s"&gt;&amp;quot;...&amp;quot;&lt;/span&gt; &lt;span class="o"&gt;&amp;lt;&amp;lt;&lt;/span&gt; &lt;span class="n"&gt;endl&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
    &lt;span class="k"&gt;while&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nb"&gt;true&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="n"&gt;tcp&lt;/span&gt;&lt;span class="o"&gt;::&lt;/span&gt;&lt;span class="n"&gt;socket&lt;/span&gt; &lt;span class="n"&gt;socket&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;co_await&lt;/span&gt; &lt;span class="n"&gt;acceptor&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;async_accept&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;use_awaitable&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
        &lt;span class="n"&gt;co_spawn&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;ctx&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;session&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;move&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;socket&lt;/span&gt;&lt;span class="p"&gt;)),&lt;/span&gt; &lt;span class="n"&gt;detached&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
    &lt;span class="p"&gt;}&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;

&lt;span class="kt"&gt;int&lt;/span&gt; &lt;span class="n"&gt;main&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="kt"&gt;int&lt;/span&gt; &lt;span class="n"&gt;argc&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="kt"&gt;char&lt;/span&gt;&lt;span class="o"&gt;*&lt;/span&gt; &lt;span class="n"&gt;argv&lt;/span&gt;&lt;span class="p"&gt;[])&lt;/span&gt;
&lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;argc&lt;/span&gt; &lt;span class="o"&gt;&amp;lt;&lt;/span&gt; &lt;span class="mi"&gt;2&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="n"&gt;cerr&lt;/span&gt; &lt;span class="o"&gt;&amp;lt;&amp;lt;&lt;/span&gt; &lt;span class="s"&gt;&amp;quot;Usage: &amp;quot;&lt;/span&gt; &lt;span class="o"&gt;&amp;lt;&amp;lt;&lt;/span&gt; &lt;span class="n"&gt;argv&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="o"&gt;&amp;lt;&amp;lt;&lt;/span&gt; &lt;span class="s"&gt;&amp;quot; &amp;lt;port&amp;gt;&amp;quot;&lt;/span&gt; &lt;span class="o"&gt;&amp;lt;&amp;lt;&lt;/span&gt; &lt;span class="n"&gt;endl&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
        &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
    &lt;span class="p"&gt;}&lt;/span&gt;

    &lt;span class="n"&gt;io_context&lt;/span&gt; &lt;span class="n"&gt;ctx&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
    &lt;span class="n"&gt;string&lt;/span&gt; &lt;span class="n"&gt;arg&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;argv&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;];&lt;/span&gt;
    &lt;span class="kt"&gt;size_t&lt;/span&gt; &lt;span class="n"&gt;pos&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
    &lt;span class="kt"&gt;unsigned&lt;/span&gt; &lt;span class="kt"&gt;short&lt;/span&gt; &lt;span class="n"&gt;port&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;stoi&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;arg&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="o"&gt;&amp;amp;&lt;/span&gt;&lt;span class="n"&gt;pos&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;

    &lt;span class="n"&gt;asio&lt;/span&gt;&lt;span class="o"&gt;::&lt;/span&gt;&lt;span class="n"&gt;signal_set&lt;/span&gt; &lt;span class="n"&gt;signals&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;ctx&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;SIGINT&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;SIGTERM&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
    &lt;span class="n"&gt;signals&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;async_wait&lt;/span&gt;&lt;span class="p"&gt;([&lt;/span&gt;&lt;span class="o"&gt;&amp;amp;&lt;/span&gt;&lt;span class="p"&gt;](&lt;/span&gt;&lt;span class="k"&gt;auto&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="k"&gt;auto&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="n"&gt;ctx&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;stop&lt;/span&gt;&lt;span class="p"&gt;();&lt;/span&gt; &lt;span class="p"&gt;});&lt;/span&gt;
    &lt;span class="k"&gt;auto&lt;/span&gt; &lt;span class="n"&gt;listen&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;listener&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;ctx&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;port&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
    &lt;span class="n"&gt;co_spawn&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;ctx&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;move&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;listen&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt; &lt;span class="n"&gt;asio&lt;/span&gt;&lt;span class="o"&gt;::&lt;/span&gt;&lt;span class="n"&gt;detached&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
    &lt;span class="n"&gt;ctx&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;run&lt;/span&gt;&lt;span class="p"&gt;();&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/pre&gt;&lt;/div&gt;


&lt;p&gt;The only thing that changed compared to the echo server is the &lt;code&gt;session()&lt;/code&gt; function, and the logic is identical to the Python version. First, the dictionary is parsed. If the client is subscribing, a pointer to the client's &lt;code&gt;tcp::socket&lt;/code&gt; object is inserted into &lt;code&gt;subscribers&lt;/code&gt;. If the client is sending a message, we iterate over the appropriate &lt;code&gt;tcp::socket&lt;/code&gt; pointers in &lt;code&gt;subscribers&lt;/code&gt; and write back the original message with &lt;code&gt;async_write()&lt;/code&gt;. As before, I skipped proper error handling to keep the code readable.&lt;/p&gt;
&lt;p&gt;You can test it by running it in a terminal, like:&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;$ ./SimpleAsyncMessageQueue &lt;span class="m"&gt;7777&lt;/span&gt;
&lt;/pre&gt;&lt;/div&gt;


&lt;p&gt;Then, in a second terminal, connect to it:&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;$ telnet
&amp;gt; open localhost &lt;span class="m"&gt;7777&lt;/span&gt;
&lt;span class="o"&gt;{&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;command&amp;#39;&lt;/span&gt;:&lt;span class="s1"&gt;&amp;#39;subscribe&amp;#39;&lt;/span&gt;, &lt;span class="s1"&gt;&amp;#39;topic&amp;#39;&lt;/span&gt;:&lt;span class="s1"&gt;&amp;#39;foo&amp;#39;&lt;/span&gt;&lt;span class="o"&gt;}&lt;/span&gt;
&lt;/pre&gt;&lt;/div&gt;


&lt;p&gt;Then, in a third terminal:&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;$ telnet
&amp;gt; open localhost &lt;span class="m"&gt;7777&lt;/span&gt;
&lt;span class="o"&gt;{&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;command&amp;#39;&lt;/span&gt;:&lt;span class="s1"&gt;&amp;#39;send&amp;#39;&lt;/span&gt;, &lt;span class="s1"&gt;&amp;#39;topic&amp;#39;&lt;/span&gt;:&lt;span class="s1"&gt;&amp;#39;foo&amp;#39;&lt;/span&gt;, &lt;span class="s1"&gt;&amp;#39;msg&amp;#39;&lt;/span&gt;:&lt;span class="s1"&gt;&amp;#39;blah&amp;#39;&lt;/span&gt;&lt;span class="o"&gt;}&lt;/span&gt;
&lt;/pre&gt;&lt;/div&gt;


&lt;p&gt;The message will appear in the second terminal!&lt;/p&gt;
&lt;h2&gt;ChatGPT&lt;/h2&gt;
&lt;p&gt;Many years ago I was a full-time, professional C++ programmer. In the meantime, a lot has happened in the language, and I've become rusty, so I used a combination of Stack Overflow and ChatGPT-4 to get back up to speed. I had ChatGPT generate an initial version of the TCP server using coroutines and ASIO. This was 95% right, but it didn't compile, so I fixed it by hand. From then on, I wrote the MQ implementation without AI help, though initially I wrote a version that accepted plain-text commands instead of the dictionary format, since I didn't want to write the parsing code. Then I had ChatGPT write the &lt;code&gt;parse_dict()&lt;/code&gt; function, and it produced a correct version. Then I asked it to review my complete code, and it came back with 7 suggestions, of which 5-6 were good, and I implemented them. Mostly it suggested more idiomatic versions of helper functions like &lt;code&gt;trim()&lt;/code&gt; and &lt;code&gt;startswith()&lt;/code&gt; that use STL features to make them shorter and faster. Overall, ChatGPT-4 was a big help, and I recommend that experts (who can validate the responses) use it.&lt;/p&gt;
&lt;h2&gt;Conclusion&lt;/h2&gt;
&lt;p&gt;In the next article, I will add some features to this simple skeleton.&lt;/p&gt;</content><category term="cpp"></category><category term="async"></category><category term="message"></category><category term="queue"></category></entry><entry><title>Can ChatGPT write the Python async message queue codes from the previous posts?</title><link href="https://bytepawn.com/can-chatgpt-write-the-python-async-message-queue-codes-from-the-previous-posts.html" rel="alternate"></link><published>2023-03-05T00:00:00+01:00</published><updated>2023-03-05T00:00:00+01:00</updated><author><name>Marton Trencseni</name></author><id>tag:bytepawn.com,2023-03-05:/can-chatgpt-write-the-python-async-message-queue-codes-from-the-previous-posts.html</id><summary type="html">&lt;p&gt;I try to get ChatGPT to write the codes in the previous posts. It's able to write the basic message queue skeleton, but cannot implement more complicated features such as delivery semantics with caching.&lt;br/&gt;&lt;br/&gt; &lt;img src="/images/an-ai-writing-python-code.jpg" alt="." style="width: 400px;"/&gt;&lt;/p&gt;</summary><content type="html">&lt;h2&gt;Introduction&lt;/h2&gt;
&lt;p&gt;In the previous articles, I wrote simple async message queue servers in Python:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;&lt;a href="https://bytepawn.com/writing-a-simple-python-async-message-queue-server.html"&gt;Writing a simple Python async message queue server - Part I&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://bytepawn.com/writing-a-simple-python-async-message-queue-server-part-ii.html"&gt;Writing a simple Python async message queue server - Part II&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;In the first one, I wrote a very simple message queue using the &lt;code&gt;asyncio&lt;/code&gt; library. In the second, I added a few features to it, such as &lt;code&gt;delivery=one|all&lt;/code&gt; semantics, caching, message sequencing and &lt;code&gt;last_seen&lt;/code&gt; specified by the subscriber. The whole thing remained very simple at less than 100 lines of code.&lt;/p&gt;
&lt;p&gt;With the advent of ChatGPT and its ability to write code, it's interesting to check whether the AI could write these simple programs. Personally, I think anything ChatGPT can write is a linear combination of what's already out there on the public Internet, so it's not worth a blog post. Having said that, I do not claim the opposite: just because ChatGPT cannot reproduce something does not mean it's worth writing down.&lt;/p&gt;
&lt;p&gt;I also tried using OpenAI's Codex, but it produced poorer results than ChatGPT, so I'm not showing that.&lt;/p&gt;
&lt;p&gt;&lt;img src="/images/an-ai-writing-python-code.jpg" alt="." style="width: 600px;"/&gt;&lt;/p&gt;
&lt;p&gt;Note: the cover image is from DALL-E, the prompt was &lt;code&gt;an ai writing python code&lt;/code&gt;.&lt;/p&gt;
&lt;h2&gt;Simple echo server&lt;/h2&gt;
&lt;p&gt;Prompt:&lt;/p&gt;
&lt;p&gt;&lt;span style="background-color: #fefebe"&gt;
Write a simple TCP echo server in Python using the asyncio library. It should bind to localhost, but take the listening port as a command-line argument.
&lt;/span&gt;&lt;/p&gt;
&lt;p&gt;ChatGPT's code:&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="nn"&gt;asyncio&lt;/span&gt;
&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="nn"&gt;argparse&lt;/span&gt;

&lt;span class="n"&gt;async&lt;/span&gt; &lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;handle_echo&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;reader&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;writer&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
    &lt;span class="k"&gt;while&lt;/span&gt; &lt;span class="bp"&gt;True&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
        &lt;span class="n"&gt;data&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;await&lt;/span&gt; &lt;span class="n"&gt;reader&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;read&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;1024&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
        &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="ow"&gt;not&lt;/span&gt; &lt;span class="n"&gt;data&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
            &lt;span class="k"&gt;break&lt;/span&gt;
        &lt;span class="n"&gt;writer&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;write&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;data&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
        &lt;span class="n"&gt;await&lt;/span&gt; &lt;span class="n"&gt;writer&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;drain&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
    &lt;span class="n"&gt;writer&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;close&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;

&lt;span class="n"&gt;async&lt;/span&gt; &lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;main&lt;/span&gt;&lt;span class="p"&gt;():&lt;/span&gt;
    &lt;span class="n"&gt;parser&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;argparse&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;ArgumentParser&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;description&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;TCP echo server&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="n"&gt;parser&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;add_argument&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;port&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nb"&gt;type&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="nb"&gt;int&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;help&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;listening port&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="n"&gt;args&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;parser&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;parse_args&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;

    &lt;span class="n"&gt;server&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;await&lt;/span&gt; &lt;span class="n"&gt;asyncio&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;start_server&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;handle_echo&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s1"&gt;&amp;#39;127.0.0.1&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;args&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;port&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="n"&gt;async&lt;/span&gt; &lt;span class="k"&gt;with&lt;/span&gt; &lt;span class="n"&gt;server&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
        &lt;span class="n"&gt;await&lt;/span&gt; &lt;span class="n"&gt;server&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;serve_forever&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;

&lt;span class="n"&gt;asyncio&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;run&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;main&lt;/span&gt;&lt;span class="p"&gt;())&lt;/span&gt;
&lt;/pre&gt;&lt;/div&gt;


&lt;p&gt;This code runs and works! It's a great start, and it shows what everybody knows by now: ChatGPT is super-useful for getting started with a topic.&lt;/p&gt;
&lt;h2&gt;Simple message queue&lt;/h2&gt;
&lt;p&gt;Prompt:&lt;/p&gt;
&lt;p&gt;&lt;span style="background-color: #fefebe"&gt;
Write a simple TCP messsage queue server in Python using the asyncio library that accepts the listening port as a command-like argument. The messages are Python dicts followed by newlines. It should accept subscribe messages like {'command':'subscribe', 'topic':'foo'} and send messages like {'command':'send', 'topic':'foo', 'msg':'blah'}. There is no need to cache, just send messages to currentlly connected subscribers.
&lt;/span&gt;&lt;/p&gt;
&lt;p&gt;ChatGPT's code:&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="nn"&gt;asyncio&lt;/span&gt;
&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="nn"&gt;json&lt;/span&gt;

&lt;span class="k"&gt;class&lt;/span&gt; &lt;span class="nc"&gt;MessageQueueServer&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
    &lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="fm"&gt;__init__&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="bp"&gt;self&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
        &lt;span class="bp"&gt;self&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;subscriptions&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;{}&lt;/span&gt;
        &lt;span class="bp"&gt;self&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;loop&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;asyncio&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;get_event_loop&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;

    &lt;span class="n"&gt;async&lt;/span&gt; &lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;handle_client&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="bp"&gt;self&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;reader&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;writer&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
        &lt;span class="k"&gt;while&lt;/span&gt; &lt;span class="bp"&gt;True&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
            &lt;span class="n"&gt;data&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;await&lt;/span&gt; &lt;span class="n"&gt;reader&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;readline&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
            &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="ow"&gt;not&lt;/span&gt; &lt;span class="n"&gt;data&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
                &lt;span class="k"&gt;break&lt;/span&gt;
            &lt;span class="n"&gt;message&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;json&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;loads&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;data&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;decode&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;strip&lt;/span&gt;&lt;span class="p"&gt;())&lt;/span&gt;
            &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;message&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;command&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="o"&gt;==&lt;/span&gt; &lt;span class="s1"&gt;&amp;#39;subscribe&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
                &lt;span class="n"&gt;topic&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;message&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;topic&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;
                &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;topic&lt;/span&gt; &lt;span class="ow"&gt;not&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="bp"&gt;self&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;subscriptions&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
                    &lt;span class="bp"&gt;self&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;subscriptions&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;topic&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[]&lt;/span&gt;
                &lt;span class="bp"&gt;self&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;subscriptions&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;topic&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;append&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;writer&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
            &lt;span class="k"&gt;elif&lt;/span&gt; &lt;span class="n"&gt;message&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;command&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="o"&gt;==&lt;/span&gt; &lt;span class="s1"&gt;&amp;#39;send&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
                &lt;span class="n"&gt;topic&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;message&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;topic&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;
                &lt;span class="n"&gt;msg&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;message&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;msg&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;
                &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;topic&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="bp"&gt;self&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;subscriptions&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
                    &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;subscriber&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="bp"&gt;self&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;subscriptions&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;topic&lt;/span&gt;&lt;span class="p"&gt;]:&lt;/span&gt;
                        &lt;span class="n"&gt;subscriber&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;write&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;json&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;dumps&lt;/span&gt;&lt;span class="p"&gt;({&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;msg&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;msg&lt;/span&gt;&lt;span class="p"&gt;})&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;encode&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt; &lt;span class="o"&gt;+&lt;/span&gt; &lt;span class="sa"&gt;b&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;&lt;/span&gt;&lt;span class="se"&gt;\n&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
                        &lt;span class="n"&gt;await&lt;/span&gt; &lt;span class="n"&gt;subscriber&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;drain&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;

    &lt;span class="n"&gt;async&lt;/span&gt; &lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;start_server&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="bp"&gt;self&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;port&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
        &lt;span class="n"&gt;server&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;await&lt;/span&gt; &lt;span class="n"&gt;asyncio&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;start_server&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="bp"&gt;self&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;handle_client&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s1"&gt;&amp;#39;localhost&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;port&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
        &lt;span class="n"&gt;async&lt;/span&gt; &lt;span class="k"&gt;with&lt;/span&gt; &lt;span class="n"&gt;server&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
            &lt;span class="n"&gt;await&lt;/span&gt; &lt;span class="n"&gt;server&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;serve_forever&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;

&lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="vm"&gt;__name__&lt;/span&gt; &lt;span class="o"&gt;==&lt;/span&gt; &lt;span class="s1"&gt;&amp;#39;__main__&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
    &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="nn"&gt;argparse&lt;/span&gt;
    &lt;span class="n"&gt;parser&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;argparse&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;ArgumentParser&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
    &lt;span class="n"&gt;parser&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;add_argument&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;port&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nb"&gt;type&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="nb"&gt;int&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="n"&gt;args&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;parser&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;parse_args&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
    &lt;span class="n"&gt;server&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;MessageQueueServer&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
    &lt;span class="n"&gt;server&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;loop&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;run_until_complete&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;server&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;start_server&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;args&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;port&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;
&lt;/pre&gt;&lt;/div&gt;


&lt;p&gt;It kind of works, but it makes the sort of mistake a novice programmer would: it assumes the structures are JSON and uses &lt;code&gt;json.loads()&lt;/code&gt;. This will not parse the messages as I specified them, since JSON requires double quotes, not single quotes. Otherwise, this is a working message queue built with &lt;code&gt;asyncio&lt;/code&gt;!&lt;/p&gt;
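&lt;p&gt;For reference, the standard library's &lt;code&gt;ast.literal_eval()&lt;/code&gt; does accept single-quoted Python dict literals, so one way the parsing could be fixed (my suggestion, not part of ChatGPT's output) is a minimal sketch like this:&lt;/p&gt;

```python
import ast
import json

# Messages in this protocol are single-quoted Python dict literals.
line = "{'command':'subscribe', 'topic':'foo'}"

# json.loads() rejects single quotes, since JSON requires double quotes.
try:
    json.loads(line)
except json.JSONDecodeError:
    print('json.loads() failed')

# ast.literal_eval() safely evaluates Python literals, so it parses this.
message = ast.literal_eval(line)
print(message['command'])  # subscribe
```

&lt;p&gt;Unlike &lt;code&gt;eval()&lt;/code&gt;, &lt;code&gt;ast.literal_eval()&lt;/code&gt; only evaluates literal expressions, so a malicious client cannot use it to execute arbitrary code.&lt;/p&gt;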
&lt;h2&gt;Delivery&lt;/h2&gt;
&lt;p&gt;Prompt:&lt;/p&gt;
&lt;p&gt;&lt;span style="background-color: #fefebe"&gt;
Now, let's add a 'delivery' field to the send command, it can either be 'one' or 'all'. In the case of 'one', only one connected subscriber of that topic gets the message, in the case of 'all', all connected subscribers of that topic get the message.
&lt;/span&gt;&lt;/p&gt;
&lt;p&gt;ChatGPT's code:&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="nn"&gt;asyncio&lt;/span&gt;
&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="nn"&gt;json&lt;/span&gt;

&lt;span class="k"&gt;class&lt;/span&gt; &lt;span class="nc"&gt;MessageQueueServer&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
    &lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="fm"&gt;__init__&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="bp"&gt;self&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
        &lt;span class="bp"&gt;self&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;subscriptions&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;{}&lt;/span&gt;
        &lt;span class="bp"&gt;self&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;loop&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;asyncio&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;get_event_loop&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;

    &lt;span class="n"&gt;async&lt;/span&gt; &lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;handle_client&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="bp"&gt;self&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;reader&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;writer&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
        &lt;span class="k"&gt;while&lt;/span&gt; &lt;span class="bp"&gt;True&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
            &lt;span class="n"&gt;data&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;await&lt;/span&gt; &lt;span class="n"&gt;reader&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;readline&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
            &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="ow"&gt;not&lt;/span&gt; &lt;span class="n"&gt;data&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
                &lt;span class="k"&gt;break&lt;/span&gt;
            &lt;span class="n"&gt;message&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;json&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;loads&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;data&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;decode&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;strip&lt;/span&gt;&lt;span class="p"&gt;())&lt;/span&gt;
            &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;message&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;command&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="o"&gt;==&lt;/span&gt; &lt;span class="s1"&gt;&amp;#39;subscribe&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
                &lt;span class="n"&gt;topic&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;message&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;topic&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;
                &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;topic&lt;/span&gt; &lt;span class="ow"&gt;not&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="bp"&gt;self&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;subscriptions&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
                    &lt;span class="bp"&gt;self&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;subscriptions&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;topic&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[]&lt;/span&gt;
                &lt;span class="bp"&gt;self&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;subscriptions&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;topic&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;append&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;writer&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
            &lt;span class="k"&gt;elif&lt;/span&gt; &lt;span class="n"&gt;message&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;command&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="o"&gt;==&lt;/span&gt; &lt;span class="s1"&gt;&amp;#39;send&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
                &lt;span class="n"&gt;topic&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;message&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;topic&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;
                &lt;span class="n"&gt;msg&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;message&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;msg&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;
                &lt;span class="n"&gt;delivery&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;message&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;get&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;delivery&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s1"&gt;&amp;#39;all&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
                &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;topic&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="bp"&gt;self&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;subscriptions&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
                    &lt;span class="n"&gt;subscribers&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="bp"&gt;self&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;subscriptions&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;topic&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;
                    &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;delivery&lt;/span&gt; &lt;span class="o"&gt;==&lt;/span&gt; &lt;span class="s1"&gt;&amp;#39;one&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
                        &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;subscribers&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
                            &lt;span class="n"&gt;subscriber&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;subscribers&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;pop&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
                            &lt;span class="n"&gt;subscriber&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;write&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;json&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;dumps&lt;/span&gt;&lt;span class="p"&gt;({&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;msg&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;msg&lt;/span&gt;&lt;span class="p"&gt;})&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;encode&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt; &lt;span class="o"&gt;+&lt;/span&gt; &lt;span class="sa"&gt;b&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;&lt;/span&gt;&lt;span class="se"&gt;\n&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
                            &lt;span class="n"&gt;await&lt;/span&gt; &lt;span class="n"&gt;subscriber&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;drain&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
                            &lt;span class="n"&gt;subscribers&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;append&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;subscriber&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
                    &lt;span class="k"&gt;else&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
                        &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;subscriber&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;subscribers&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
                            &lt;span class="n"&gt;subscriber&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;write&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;json&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;dumps&lt;/span&gt;&lt;span class="p"&gt;({&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;msg&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;msg&lt;/span&gt;&lt;span class="p"&gt;})&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;encode&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt; &lt;span class="o"&gt;+&lt;/span&gt; &lt;span class="sa"&gt;b&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;&lt;/span&gt;&lt;span class="se"&gt;\n&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
                            &lt;span class="n"&gt;await&lt;/span&gt; &lt;span class="n"&gt;subscriber&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;drain&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;

    &lt;span class="n"&gt;async&lt;/span&gt; &lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;start_server&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="bp"&gt;self&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;port&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
        &lt;span class="n"&gt;server&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;await&lt;/span&gt; &lt;span class="n"&gt;asyncio&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;start_server&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="bp"&gt;self&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;handle_client&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s1"&gt;&amp;#39;localhost&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;port&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
        &lt;span class="n"&gt;async&lt;/span&gt; &lt;span class="k"&gt;with&lt;/span&gt; &lt;span class="n"&gt;server&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
            &lt;span class="n"&gt;await&lt;/span&gt; &lt;span class="n"&gt;server&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;serve_forever&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;

&lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="vm"&gt;__name__&lt;/span&gt; &lt;span class="o"&gt;==&lt;/span&gt; &lt;span class="s1"&gt;&amp;#39;__main__&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
    &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="nn"&gt;argparse&lt;/span&gt;
    &lt;span class="n"&gt;parser&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;argparse&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;ArgumentParser&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
    &lt;span class="n"&gt;parser&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;add_argument&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;port&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nb"&gt;type&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="nb"&gt;int&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="n"&gt;args&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;parser&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;parse_args&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
    &lt;span class="n"&gt;server&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;MessageQueueServer&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
    &lt;span class="n"&gt;server&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;loop&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;run_until_complete&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;server&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;start_server&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;args&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;port&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;
&lt;/pre&gt;&lt;/div&gt;


&lt;p&gt;It works, but it's very ugly. On the &lt;code&gt;one&lt;/code&gt; branch, it pops a subscriber off the front of the list, delivers to it, and then appends it back at the end: a clumsy way of rotating through subscribers round-robin.&lt;/p&gt;
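&lt;p&gt;The pop-then-append is a round-robin rotation in disguise: serve the head of the list, then move it to the tail. As a minimal sketch of the same intent (a hypothetical refactoring, not ChatGPT's output), assuming the subscriber list were kept in a &lt;code&gt;collections.deque&lt;/code&gt;:&lt;/p&gt;

```python
from collections import deque

def next_subscriber(subscribers):
    # Serve the current head, then rotate left by one so the former
    # head moves to the tail. Equivalent to pop(0) + append(), but the
    # round-robin intent is explicit, and deque.rotate() is O(1).
    subscriber = subscribers[0]
    subscribers.rotate(-1)
    return subscriber

subscribers = deque(['a', 'b', 'c'])  # stand-ins for StreamWriters
# first call serves 'a' and leaves the deque as ['b', 'c', 'a']
```

&lt;p&gt;In the server, the deque would replace the plain list stored in &lt;code&gt;self.subscriptions[topic]&lt;/code&gt;.&lt;/p&gt;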
&lt;h2&gt;Caching&lt;/h2&gt;
&lt;p&gt;Prompt:&lt;/p&gt;
&lt;p&gt;&lt;span style="background-color: #fefebe"&gt;
Now, let's add simple in-memory caching to it. Let's cache 100 messages per topic. When a new subscriber connects, we can send them whatever we have cached for that topic. Be careful to maintain the delivery semantics, ie. messages with delivery=one should only be delivered to 1 subscriber.
&lt;/span&gt;&lt;/p&gt;
&lt;p&gt;ChatGPT's code:&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="nn"&gt;asyncio&lt;/span&gt;
&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="nn"&gt;json&lt;/span&gt;
&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="nn"&gt;collections&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;deque&lt;/span&gt;

&lt;span class="k"&gt;class&lt;/span&gt; &lt;span class="nc"&gt;MessageQueueServer&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
    &lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="fm"&gt;__init__&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="bp"&gt;self&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
        &lt;span class="bp"&gt;self&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;subscriptions&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;{}&lt;/span&gt;
        &lt;span class="bp"&gt;self&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;caches&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;{}&lt;/span&gt;
        &lt;span class="bp"&gt;self&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;loop&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;asyncio&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;get_event_loop&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;

    &lt;span class="n"&gt;async&lt;/span&gt; &lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;handle_client&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="bp"&gt;self&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;reader&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;writer&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
        &lt;span class="k"&gt;while&lt;/span&gt; &lt;span class="bp"&gt;True&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
            &lt;span class="n"&gt;data&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;await&lt;/span&gt; &lt;span class="n"&gt;reader&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;readline&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
            &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="ow"&gt;not&lt;/span&gt; &lt;span class="n"&gt;data&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
                &lt;span class="k"&gt;break&lt;/span&gt;
            &lt;span class="n"&gt;message&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;json&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;loads&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;data&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;decode&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;strip&lt;/span&gt;&lt;span class="p"&gt;())&lt;/span&gt;
            &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;message&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;command&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="o"&gt;==&lt;/span&gt; &lt;span class="s1"&gt;&amp;#39;subscribe&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
                &lt;span class="n"&gt;topic&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;message&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;topic&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;
                &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;topic&lt;/span&gt; &lt;span class="ow"&gt;not&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="bp"&gt;self&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;subscriptions&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
                    &lt;span class="bp"&gt;self&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;subscriptions&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;topic&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[]&lt;/span&gt;
                &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;topic&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="bp"&gt;self&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;caches&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
                    &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;cached_msg&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="bp"&gt;self&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;caches&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;topic&lt;/span&gt;&lt;span class="p"&gt;]:&lt;/span&gt;
                        &lt;span class="n"&gt;writer&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;write&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;json&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;dumps&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;cached_msg&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;encode&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt; &lt;span class="o"&gt;+&lt;/span&gt; &lt;span class="sa"&gt;b&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;&lt;/span&gt;&lt;span class="se"&gt;\n&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
                        &lt;span class="n"&gt;await&lt;/span&gt; &lt;span class="n"&gt;writer&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;drain&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
                &lt;span class="bp"&gt;self&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;subscriptions&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;topic&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;append&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;writer&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
            &lt;span class="k"&gt;elif&lt;/span&gt; &lt;span class="n"&gt;message&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;command&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="o"&gt;==&lt;/span&gt; &lt;span class="s1"&gt;&amp;#39;send&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
                &lt;span class="n"&gt;topic&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;message&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;topic&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;
                &lt;span class="n"&gt;msg&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;message&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;msg&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;
                &lt;span class="n"&gt;delivery&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;message&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;get&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;delivery&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s1"&gt;&amp;#39;all&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
                &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;topic&lt;/span&gt; &lt;span class="ow"&gt;not&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="bp"&gt;self&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;caches&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
                    &lt;span class="bp"&gt;self&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;caches&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;topic&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;deque&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;maxlen&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mi"&gt;100&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
                &lt;span class="bp"&gt;self&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;caches&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;topic&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;append&lt;/span&gt;&lt;span class="p"&gt;({&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;msg&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;msg&lt;/span&gt;&lt;span class="p"&gt;})&lt;/span&gt;
                &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;topic&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="bp"&gt;self&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;subscriptions&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
                    &lt;span class="n"&gt;subscribers&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="bp"&gt;self&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;subscriptions&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;topic&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;
                    &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;delivery&lt;/span&gt; &lt;span class="o"&gt;==&lt;/span&gt; &lt;span class="s1"&gt;&amp;#39;one&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
                        &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;subscribers&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
                            &lt;span class="n"&gt;subscriber&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;subscribers&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;pop&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
                            &lt;span class="n"&gt;subscriber&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;write&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;json&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;dumps&lt;/span&gt;&lt;span class="p"&gt;({&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;msg&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;msg&lt;/span&gt;&lt;span class="p"&gt;})&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;encode&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt; &lt;span class="o"&gt;+&lt;/span&gt; &lt;span class="sa"&gt;b&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;&lt;/span&gt;&lt;span class="se"&gt;\n&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
                            &lt;span class="n"&gt;await&lt;/span&gt; &lt;span class="n"&gt;subscriber&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;drain&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
                            &lt;span class="n"&gt;subscribers&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;append&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;subscriber&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
                    &lt;span class="k"&gt;else&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
                        &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;subscriber&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;subscribers&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
                            &lt;span class="n"&gt;subscriber&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;write&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;json&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;dumps&lt;/span&gt;&lt;span class="p"&gt;({&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;msg&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;msg&lt;/span&gt;&lt;span class="p"&gt;})&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;encode&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt; &lt;span class="o"&gt;+&lt;/span&gt; &lt;span class="sa"&gt;b&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;&lt;/span&gt;&lt;span class="se"&gt;\n&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
                            &lt;span class="n"&gt;await&lt;/span&gt; &lt;span class="n"&gt;subscriber&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;drain&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;

    &lt;span class="n"&gt;async&lt;/span&gt; &lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;start_server&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="bp"&gt;self&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;port&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
        &lt;span class="n"&gt;server&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;await&lt;/span&gt; &lt;span class="n"&gt;asyncio&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;start_server&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="bp"&gt;self&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;handle_client&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s1"&gt;&amp;#39;localhost&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;port&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
        &lt;span class="n"&gt;async&lt;/span&gt; &lt;span class="k"&gt;with&lt;/span&gt; &lt;span class="n"&gt;server&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
            &lt;span class="n"&gt;await&lt;/span&gt; &lt;span class="n"&gt;server&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;serve_forever&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;

&lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="vm"&gt;__name__&lt;/span&gt; &lt;span class="o"&gt;==&lt;/span&gt; &lt;span class="s1"&gt;&amp;#39;__main__&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
    &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="nn"&gt;argparse&lt;/span&gt;
    &lt;span class="n"&gt;parser&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;argparse&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;ArgumentParser&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
    &lt;span class="n"&gt;parser&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;add_argument&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;port&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nb"&gt;type&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="nb"&gt;int&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="n"&gt;args&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;parser&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;parse_args&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
    &lt;span class="n"&gt;server&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;MessageQueueServer&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
    &lt;span class="n"&gt;server&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;loop&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;run_until_complete&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;server&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;start_server&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;args&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;port&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;
&lt;/pre&gt;&lt;/div&gt;


&lt;p&gt;ChatGPT is able to add the caching logic, but is not able to maintain the &lt;code&gt;delivery=one&lt;/code&gt; semantics. Once a message is cached, it's sent to all subscribers who connect thereafter.&lt;/p&gt;
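&lt;p&gt;One way to repair this, sketched here as my own assumption rather than anything ChatGPT produced, is to store the delivery mode alongside each cached message and consume &lt;code&gt;delivery=one&lt;/code&gt; entries the first time they are replayed, so no later subscriber sees them again:&lt;/p&gt;

```python
from collections import deque

def replay_cache(cache, send):
    # Replay cached messages to one newly connected subscriber.
    # Entries with delivery='all' stay in the cache for future
    # subscribers; entries with delivery='one' are consumed here,
    # preserving the deliver-to-exactly-one-subscriber semantics.
    remaining = deque(maxlen=cache.maxlen)
    while cache:
        entry = cache.popleft()
        send(entry['msg'])
        if entry.get('delivery', 'all') == 'all':
            remaining.append(entry)
    cache.extend(remaining)
```

&lt;p&gt;This requires the &lt;code&gt;send&lt;/code&gt; branch to cache &lt;code&gt;{'msg': msg, 'delivery': delivery}&lt;/code&gt; instead of just &lt;code&gt;{'msg': msg}&lt;/code&gt;, which is exactly the information ChatGPT's version throws away.&lt;/p&gt;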
&lt;p&gt;Prompt:&lt;/p&gt;
&lt;p&gt;&lt;span style="background-color: #fefebe"&gt;
The way you wrote it, messages with delivery of one will be sent to multiple subscribers who connect after the message was cached. Fix it.
&lt;/span&gt;&lt;/p&gt;
&lt;p&gt;ChatGPT's code:&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="nn"&gt;asyncio&lt;/span&gt;
&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="nn"&gt;json&lt;/span&gt;
&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="nn"&gt;collections&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;deque&lt;/span&gt;

&lt;span class="k"&gt;class&lt;/span&gt; &lt;span class="nc"&gt;MessageQueueServer&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
    &lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="fm"&gt;__init__&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="bp"&gt;self&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
        &lt;span class="bp"&gt;self&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;subscriptions&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;{}&lt;/span&gt;
        &lt;span class="bp"&gt;self&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;caches&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;{}&lt;/span&gt;
        &lt;span class="bp"&gt;self&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;loop&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;asyncio&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;get_event_loop&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;

    &lt;span class="n"&gt;async&lt;/span&gt; &lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;handle_client&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="bp"&gt;self&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;reader&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;writer&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
        &lt;span class="k"&gt;while&lt;/span&gt; &lt;span class="bp"&gt;True&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
            &lt;span class="n"&gt;data&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;await&lt;/span&gt; &lt;span class="n"&gt;reader&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;readline&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
            &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="ow"&gt;not&lt;/span&gt; &lt;span class="n"&gt;data&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
                &lt;span class="k"&gt;break&lt;/span&gt;
            &lt;span class="n"&gt;message&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;json&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;loads&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;data&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;decode&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;strip&lt;/span&gt;&lt;span class="p"&gt;())&lt;/span&gt;
            &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;message&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;command&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="o"&gt;==&lt;/span&gt; &lt;span class="s1"&gt;&amp;#39;subscribe&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
                &lt;span class="n"&gt;topic&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;message&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;topic&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;
                &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;topic&lt;/span&gt; &lt;span class="ow"&gt;not&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="bp"&gt;self&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;subscriptions&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
                    &lt;span class="bp"&gt;self&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;subscriptions&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;topic&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[]&lt;/span&gt;
                &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;topic&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="bp"&gt;self&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;caches&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
                    &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;cached_msg&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="bp"&gt;self&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;caches&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;topic&lt;/span&gt;&lt;span class="p"&gt;]:&lt;/span&gt;
                        &lt;span class="n"&gt;writer&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;write&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;json&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;dumps&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;cached_msg&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;encode&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt; &lt;span class="o"&gt;+&lt;/span&gt; &lt;span class="sa"&gt;b&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;&lt;/span&gt;&lt;span class="se"&gt;\n&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
                        &lt;span class="n"&gt;await&lt;/span&gt; &lt;span class="n"&gt;writer&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;drain&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
                &lt;span class="bp"&gt;self&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;subscriptions&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;topic&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;append&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;writer&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
            &lt;span class="k"&gt;elif&lt;/span&gt; &lt;span class="n"&gt;message&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;command&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="o"&gt;==&lt;/span&gt; &lt;span class="s1"&gt;&amp;#39;send&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
                &lt;span class="n"&gt;topic&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;message&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;topic&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;
                &lt;span class="n"&gt;msg&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;message&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;msg&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;
                &lt;span class="n"&gt;delivery&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;message&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;get&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;delivery&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s1"&gt;&amp;#39;all&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
                &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;topic&lt;/span&gt; &lt;span class="ow"&gt;not&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="bp"&gt;self&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;caches&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
                    &lt;span class="bp"&gt;self&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;caches&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;topic&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;deque&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;maxlen&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mi"&gt;100&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
                &lt;span class="bp"&gt;self&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;caches&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;topic&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;append&lt;/span&gt;&lt;span class="p"&gt;({&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;msg&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;msg&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s1"&gt;&amp;#39;delivery&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;delivery&lt;/span&gt;&lt;span class="p"&gt;})&lt;/span&gt;
                &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;topic&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="bp"&gt;self&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;subscriptions&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
                    &lt;span class="n"&gt;subscribers&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="bp"&gt;self&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;subscriptions&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;topic&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;
                    &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;delivery&lt;/span&gt; &lt;span class="o"&gt;==&lt;/span&gt; &lt;span class="s1"&gt;&amp;#39;one&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
                        &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;i&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;subscriber&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="nb"&gt;enumerate&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;subscribers&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
                            &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;i&lt;/span&gt; &lt;span class="o"&gt;==&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
                                &lt;span class="n"&gt;subscriber&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;write&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;json&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;dumps&lt;/span&gt;&lt;span class="p"&gt;({&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;msg&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;msg&lt;/span&gt;&lt;span class="p"&gt;})&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;encode&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt; &lt;span class="o"&gt;+&lt;/span&gt; &lt;span class="sa"&gt;b&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;&lt;/span&gt;&lt;span class="se"&gt;\n&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
                                &lt;span class="n"&gt;await&lt;/span&gt; &lt;span class="n"&gt;subscriber&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;drain&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
                            &lt;span class="k"&gt;else&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
                                &lt;span class="bp"&gt;self&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;caches&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;topic&lt;/span&gt;&lt;span class="p"&gt;][&lt;/span&gt;&lt;span class="o"&gt;-&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;][&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;delivery&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="s1"&gt;&amp;#39;all&amp;#39;&lt;/span&gt;
                    &lt;span class="k"&gt;else&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
                        &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;subscriber&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;subscribers&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
                            &lt;span class="n"&gt;subscriber&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;write&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;json&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;dumps&lt;/span&gt;&lt;span class="p"&gt;({&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;msg&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;msg&lt;/span&gt;&lt;span class="p"&gt;})&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;encode&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt; &lt;span class="o"&gt;+&lt;/span&gt; &lt;span class="sa"&gt;b&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;&lt;/span&gt;&lt;span class="se"&gt;\n&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
                            &lt;span class="n"&gt;await&lt;/span&gt; &lt;span class="n"&gt;subscriber&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;drain&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;

    &lt;span class="n"&gt;async&lt;/span&gt; &lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;start_server&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="bp"&gt;self&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;port&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
        &lt;span class="n"&gt;server&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;await&lt;/span&gt; &lt;span class="n"&gt;asyncio&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;start_server&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="bp"&gt;self&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;handle_client&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s1"&gt;&amp;#39;localhost&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;port&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
        &lt;span class="n"&gt;async&lt;/span&gt; &lt;span class="k"&gt;with&lt;/span&gt; &lt;span class="n"&gt;server&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
            &lt;span class="n"&gt;await&lt;/span&gt; &lt;span class="n"&gt;server&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;serve_forever&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;

&lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="vm"&gt;__name__&lt;/span&gt; &lt;span class="o"&gt;==&lt;/span&gt; &lt;span class="s1"&gt;&amp;#39;__main__&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
    &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="nn"&gt;argparse&lt;/span&gt;
    &lt;span class="n"&gt;parser&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;argparse&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;ArgumentParser&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
    &lt;span class="n"&gt;parser&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;add_argument&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;port&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nb"&gt;type&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="nb"&gt;int&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="n"&gt;args&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;parser&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;parse_args&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
    &lt;span class="n"&gt;server&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;MessageQueueServer&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
    &lt;span class="n"&gt;server&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;loop&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;run_until_complete&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;server&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;start_server&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;args&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;port&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;
&lt;/pre&gt;&lt;/div&gt;


&lt;p&gt;It's not able to produce correct code; this problem statement is too complex for ChatGPT!&lt;/p&gt;
&lt;h2&gt;Conclusion&lt;/h2&gt;
&lt;p&gt;In this scenario, ChatGPT is like a very smart search engine. Before ChatGPT, to get a skeleton version of an &lt;code&gt;asyncio&lt;/code&gt; server, you had to look at the documentation and/or Stack Overflow. Now you still have to do that, but you can get a working skeleton from ChatGPT. As these examples show, ChatGPT won't get you much farther, but this is already impressive. And who knows how much better newer models that have 10x or 100x as many parameters and are trained on more code will do. Having said that, for the time being I suspect we programmers still don't have to fear for our jobs, and we still have to read the documentation.&lt;/p&gt;</content><category term="python"></category><category term="async"></category><category term="message"></category><category term="queue"></category><category term="chatgpt"></category></entry><entry><title>Writing a simple Python async message queue server - Part II</title><link href="https://bytepawn.com/writing-a-simple-python-async-message-queue-server-part-ii.html" rel="alternate"></link><published>2023-03-02T00:00:00+01:00</published><updated>2023-03-02T00:00:00+01:00</updated><author><name>Marton Trencseni</name></author><id>tag:bytepawn.com,2023-03-02:/writing-a-simple-python-async-message-queue-server-part-ii.html</id><summary type="html">&lt;p&gt;I write a somewhat more complicated, but still relatively simple async message queue server in Python.&lt;br/&gt;&lt;br/&gt; &lt;img src="/images/async_echo.png" alt="." style="width: 400px;"/&gt;&lt;/p&gt;</summary><content type="html">&lt;h2&gt;Introduction&lt;/h2&gt;
&lt;p&gt;In the &lt;a href="https://bytepawn.com/writing-a-simple-python-async-message-queue-server.html"&gt;previous article&lt;/a&gt;, I wrote a simple async message queue server in Python. Let's add features to it!&lt;/p&gt;
&lt;h2&gt;Delivery&lt;/h2&gt;
&lt;p&gt;Let's add a feature so the sender can specify whether messages get delivered to &lt;code&gt;all&lt;/code&gt; subscribers or just &lt;code&gt;one&lt;/code&gt;. So we extend the message with a &lt;code&gt;delivery&lt;/code&gt; field, which can be &lt;code&gt;all&lt;/code&gt; or &lt;code&gt;one&lt;/code&gt;:&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;{&amp;#39;command&amp;#39;:&amp;#39;send&amp;#39;, &amp;#39;topic&amp;#39;:&amp;#39;foo&amp;#39;, &amp;#39;msg&amp;#39;:&amp;#39;blah&amp;#39;, delivery:&amp;#39;all&amp;#39;}
{&amp;#39;command&amp;#39;:&amp;#39;send&amp;#39;, &amp;#39;topic&amp;#39;:&amp;#39;foo&amp;#39;, &amp;#39;msg&amp;#39;:&amp;#39;blah&amp;#39;, delivery:&amp;#39;one&amp;#39;}
&lt;/pre&gt;&lt;/div&gt;
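Note that the server parses each incoming line with `ast.literal_eval`, so every message must be a valid Python literal, with all dict keys quoted. A quick check of the parsing step:

```python
import ast

# one message per line; ast.literal_eval requires a valid Python literal,
# so dict keys like 'delivery' must be quoted
line = "{'command':'send', 'topic':'foo', 'msg':'blah', 'delivery':'one'}\n"
cmd = ast.literal_eval(line)
assert cmd['command'] == 'send' and cmd['delivery'] == 'one'
```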


&lt;p&gt;The implementation is simple: in the case of &lt;code&gt;one&lt;/code&gt;, we select a random writer from the topic's list in &lt;code&gt;topics&lt;/code&gt; to receive the message; in the case of &lt;code&gt;all&lt;/code&gt;, we send it to every writer; in both cases, we watch for the empty-list case:&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="nn"&gt;sys&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt; &lt;span class="nn"&gt;asyncio&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt; &lt;span class="nn"&gt;ast&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt; &lt;span class="nn"&gt;random&lt;/span&gt;
&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="nn"&gt;collections&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;defaultdict&lt;/span&gt;

&lt;span class="n"&gt;topics&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;defaultdict&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="k"&gt;lambda&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;[])&lt;/span&gt;

&lt;span class="n"&gt;async&lt;/span&gt; &lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;handle_client&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;reader&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;writer&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
    &lt;span class="k"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;New client connected...&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="n"&gt;line&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nb"&gt;str&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
    &lt;span class="k"&gt;while&lt;/span&gt; &lt;span class="n"&gt;line&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;strip&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt; &lt;span class="o"&gt;!=&lt;/span&gt; &lt;span class="s1"&gt;&amp;#39;quit&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
        &lt;span class="n"&gt;line&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;await&lt;/span&gt; &lt;span class="n"&gt;reader&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;readline&lt;/span&gt;&lt;span class="p"&gt;())&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;decode&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;utf8&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
        &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;line&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;strip&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt; &lt;span class="o"&gt;==&lt;/span&gt; &lt;span class="s1"&gt;&amp;#39;&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="k"&gt;continue&lt;/span&gt;
        &lt;span class="k"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;f&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;Received: {line.strip()}&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
        &lt;span class="n"&gt;cmd&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;ast&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;literal_eval&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;line&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
        &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;cmd&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;command&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="o"&gt;==&lt;/span&gt; &lt;span class="s1"&gt;&amp;#39;subscribe&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
            &lt;span class="n"&gt;topics&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;cmd&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;topic&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;]]&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;append&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;writer&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
        &lt;span class="k"&gt;elif&lt;/span&gt; &lt;span class="n"&gt;cmd&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;command&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="o"&gt;==&lt;/span&gt; &lt;span class="s1"&gt;&amp;#39;send&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
            &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;cmd&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;delivery&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="o"&gt;==&lt;/span&gt; &lt;span class="s1"&gt;&amp;#39;all&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
                &lt;span class="n"&gt;writers&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;topics&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;cmd&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;topic&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;]]&lt;/span&gt;
            &lt;span class="k"&gt;else&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="c1"&gt;# if delivery == &amp;#39;one&amp;#39;:&lt;/span&gt;
                &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="nb"&gt;len&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;topics&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;cmd&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;topic&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;]])&lt;/span&gt; &lt;span class="o"&gt;==&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
                    &lt;span class="n"&gt;writers&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[]&lt;/span&gt;
                &lt;span class="k"&gt;else&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
                    &lt;span class="n"&gt;which&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;random&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;randint&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nb"&gt;len&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;topics&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;cmd&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;topic&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;]])&lt;/span&gt;&lt;span class="o"&gt;-&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
                    &lt;span class="n"&gt;writers&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;topics&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;cmd&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;topic&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;]][&lt;/span&gt;&lt;span class="n"&gt;which&lt;/span&gt;&lt;span class="p"&gt;]]&lt;/span&gt;
            &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;w&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;writers&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
                &lt;span class="n"&gt;w&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;write&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;line&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;encode&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;utf8&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;
    &lt;span class="n"&gt;writer&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;close&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
    &lt;span class="k"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;Client disconnected...&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="n"&gt;async&lt;/span&gt; &lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;run_server&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;host&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;port&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
    &lt;span class="n"&gt;server&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;await&lt;/span&gt; &lt;span class="n"&gt;asyncio&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;start_server&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;handle_client&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;host&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;port&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="k"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;f&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;Listening on {host}:{port}...&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="n"&gt;async&lt;/span&gt; &lt;span class="k"&gt;with&lt;/span&gt; &lt;span class="n"&gt;server&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
        &lt;span class="n"&gt;await&lt;/span&gt; &lt;span class="n"&gt;server&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;serve_forever&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;

&lt;span class="n"&gt;asyncio&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;run&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;run_server&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;host&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;localhost&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;port&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="nb"&gt;int&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;sys&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;argv&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;])))&lt;/span&gt;
&lt;/pre&gt;&lt;/div&gt;
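The delivery semantics can be sanity-checked end-to-end with a small in-process test. The following is a sketch, not the article's code: it restates the server loop in condensed form, binds to port 0 so the OS picks a free port, subscribes two clients, and checks that an `all` message reaches both:

```python
import asyncio, ast, random
from collections import defaultdict

topics = defaultdict(list)

async def handle_client(reader, writer):
    # condensed restatement of the server loop: one Python-literal dict per line
    while True:
        line = (await reader.readline()).decode('utf8')
        if line.strip() in ('', 'quit'):
            break
        cmd = ast.literal_eval(line)
        if cmd['command'] == 'subscribe':
            topics[cmd['topic']].append(writer)
        elif cmd['command'] == 'send':
            subs = topics[cmd['topic']]
            # 'one' picks a single random subscriber, 'all' broadcasts to every subscriber
            targets = [random.choice(subs)] if cmd['delivery'] == 'one' and subs else subs
            for w in targets:
                w.write(line.encode('utf8'))
                await w.drain()
    writer.close()

async def main():
    # port 0 lets the OS pick a free port
    server = await asyncio.start_server(handle_client, 'localhost', 0)
    port = server.sockets[0].getsockname()[1]
    async with server:
        # two subscribers on topic 'foo'
        conns = [await asyncio.open_connection('localhost', port) for _ in range(2)]
        for _, w in conns:
            w.write(b"{'command':'subscribe', 'topic':'foo'}\n")
            await w.drain()
        await asyncio.sleep(0.1)  # give the server time to register the subscriptions
        _, sender = await asyncio.open_connection('localhost', port)
        sender.write(b"{'command':'send', 'topic':'foo', 'msg':'hi', 'delivery':'all'}\n")
        await sender.drain()
        # with 'all' delivery, every subscriber should receive the message
        got = [await asyncio.wait_for(r.readline(), timeout=2) for r, _ in conns]
        sender.close()
        for _, w in conns:
            w.close()
        return got

received = asyncio.run(main())
assert all(b"'msg':'hi'" in line for line in received)
```

Changing the `delivery` field to `one` in the sketch above should make exactly one of the two reads succeed, which is the behavior the implementation aims for.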


&lt;h2&gt;Caching&lt;/h2&gt;
&lt;p&gt;Now, let's add a feature where the server will cache up to 100 messages per topic, while honoring the &lt;code&gt;delivery&lt;/code&gt; semantics, i.e. messages with &lt;code&gt;delivery&lt;/code&gt; of &lt;code&gt;one&lt;/code&gt; will never be sent to more than one client.&lt;/p&gt;
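A bounded per-topic cache falls out of `collections.deque` with `maxlen`: once the deque is full, appending silently evicts the oldest entry, so the cache never grows past the limit. A quick demonstration with a small `maxlen`:

```python
from collections import deque

# a deque with maxlen drops the oldest element once it is full
cache = deque(maxlen=3)
for i in range(5):
    cache.append(i)

assert list(cache) == [2, 3, 4]  # 0 and 1 were evicted
```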
&lt;p&gt;The implementation is still simple: if an incoming message has &lt;code&gt;all&lt;/code&gt; delivery, just push it into the topic's cache. If it has &lt;code&gt;one&lt;/code&gt; delivery, only push it if there are no current subscribers. On the other hand, if a new subscriber connects, send them all cached messages for that topic, and then drop any messages that had &lt;code&gt;one&lt;/code&gt; delivery:&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="nn"&gt;sys&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt; &lt;span class="nn"&gt;asyncio&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt; &lt;span class="nn"&gt;ast&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt; &lt;span class="nn"&gt;random&lt;/span&gt;
&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="nn"&gt;collections&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;defaultdict&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;deque&lt;/span&gt;

&lt;span class="n"&gt;CACHE_LENGTH&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;100&lt;/span&gt;

&lt;span class="n"&gt;topics&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;defaultdict&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="k"&gt;lambda&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;[])&lt;/span&gt;
&lt;span class="n"&gt;caches&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;defaultdict&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="k"&gt;lambda&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;deque&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;maxlen&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;CACHE_LENGTH&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;

&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;send_cached&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;writer&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;topic&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
    &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;line&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;caches&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;topic&lt;/span&gt;&lt;span class="p"&gt;]:&lt;/span&gt;
        &lt;span class="n"&gt;writer&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;write&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;line&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;encode&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;utf8&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;
    &lt;span class="c1"&gt;# construct new cache, removing elements which had delivery of one&lt;/span&gt;
    &lt;span class="c1"&gt;# since we just delivered those&lt;/span&gt;
    &lt;span class="n"&gt;new_cache&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;deque&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;maxlen&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;CACHE_LENGTH&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;line&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;caches&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;topic&lt;/span&gt;&lt;span class="p"&gt;]:&lt;/span&gt;
        &lt;span class="n"&gt;cmd&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;ast&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;literal_eval&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;line&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
        &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;cmd&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;delivery&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="o"&gt;==&lt;/span&gt; &lt;span class="s1"&gt;&amp;#39;all&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
            &lt;span class="n"&gt;new_cache&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;append&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;line&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="n"&gt;caches&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;topic&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;new_cache&lt;/span&gt;

&lt;span class="n"&gt;async&lt;/span&gt; &lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;handle_client&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;reader&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;writer&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
    &lt;span class="k"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;New client connected...&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="n"&gt;line&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nb"&gt;str&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
    &lt;span class="k"&gt;while&lt;/span&gt; &lt;span class="n"&gt;line&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;strip&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt; &lt;span class="o"&gt;!=&lt;/span&gt; &lt;span class="s1"&gt;&amp;#39;quit&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
        &lt;span class="n"&gt;line&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;await&lt;/span&gt; &lt;span class="n"&gt;reader&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;readline&lt;/span&gt;&lt;span class="p"&gt;())&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;decode&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;utf8&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
        &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;line&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;strip&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt; &lt;span class="o"&gt;==&lt;/span&gt; &lt;span class="s1"&gt;&amp;#39;&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="k"&gt;continue&lt;/span&gt;
        &lt;span class="k"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;f&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;Received: {line.strip()}&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
        &lt;span class="n"&gt;cmd&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;ast&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;literal_eval&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;line&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
        &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;cmd&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;command&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="o"&gt;==&lt;/span&gt; &lt;span class="s1"&gt;&amp;#39;subscribe&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
            &lt;span class="n"&gt;topics&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;cmd&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;topic&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;]]&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;append&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;writer&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
            &lt;span class="n"&gt;send_cached&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;writer&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;cmd&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;topic&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;])&lt;/span&gt; &lt;span class="c1"&gt;# if there are any cached messages to send, send it&lt;/span&gt;
        &lt;span class="k"&gt;elif&lt;/span&gt; &lt;span class="n"&gt;cmd&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;command&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="o"&gt;==&lt;/span&gt; &lt;span class="s1"&gt;&amp;#39;send&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
            &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;cmd&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;delivery&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="o"&gt;==&lt;/span&gt; &lt;span class="s1"&gt;&amp;#39;all&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
                &lt;span class="n"&gt;writers&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;topics&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;cmd&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;topic&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;]]&lt;/span&gt;
                &lt;span class="n"&gt;caches&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;cmd&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;topic&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;]]&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;append&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;line&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="c1"&gt;# cache it&lt;/span&gt;
            &lt;span class="k"&gt;else&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="c1"&gt;# if delivery == &amp;#39;one&amp;#39;:&lt;/span&gt;
                &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="nb"&gt;len&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;topics&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;cmd&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;topic&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;]])&lt;/span&gt; &lt;span class="o"&gt;==&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
                    &lt;span class="n"&gt;writers&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[]&lt;/span&gt;
                    &lt;span class="n"&gt;caches&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;cmd&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;topic&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;]]&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;append&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;line&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="c1"&gt;# cache it&lt;/span&gt;
                &lt;span class="k"&gt;else&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
                    &lt;span class="n"&gt;which&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;random&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;randint&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nb"&gt;len&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;topics&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;cmd&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;topic&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;]])&lt;/span&gt;&lt;span class="o"&gt;-&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
                    &lt;span class="n"&gt;writers&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;topics&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;cmd&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;topic&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;]][&lt;/span&gt;&lt;span class="n"&gt;which&lt;/span&gt;&lt;span class="p"&gt;]]&lt;/span&gt;
                    &lt;span class="c1"&gt;# no need to cache it&lt;/span&gt;
            &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;w&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;writers&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
                &lt;span class="n"&gt;w&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;write&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;line&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;encode&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;utf8&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;

    &lt;span class="n"&gt;writer&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;close&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
    &lt;span class="k"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;Client disconnected...&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="n"&gt;async&lt;/span&gt; &lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;run_server&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;host&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;port&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
    &lt;span class="n"&gt;server&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;await&lt;/span&gt; &lt;span class="n"&gt;asyncio&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;start_server&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;handle_client&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;host&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;port&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="k"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;f&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;Listening on {host}:{port}...&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="n"&gt;async&lt;/span&gt; &lt;span class="k"&gt;with&lt;/span&gt; &lt;span class="n"&gt;server&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
        &lt;span class="n"&gt;await&lt;/span&gt; &lt;span class="n"&gt;server&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;serve_forever&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;

&lt;span class="n"&gt;asyncio&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;run&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;run_server&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;host&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;localhost&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;port&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="nb"&gt;int&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;sys&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;argv&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;])))&lt;/span&gt;
&lt;/pre&gt;&lt;/div&gt;


&lt;h2&gt;Last seen semantics&lt;/h2&gt;
&lt;p&gt;Let's add support for clients (re)connecting and starting to read from a topic where they left off, by specifying a &lt;code&gt;last_seen&lt;/code&gt; index in the subscribe message, like:&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;{&amp;#39;command&amp;#39;:&amp;#39;subscribe&amp;#39;, &amp;#39;topic&amp;#39;:&amp;#39;foo&amp;#39;, &amp;#39;last_seen&amp;#39;: &amp;#39;123&amp;#39;}
&lt;/pre&gt;&lt;/div&gt;


&lt;p&gt;To support this, we have to:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;add an index to each message (starting at 0), which increases sequentially&lt;/li&gt;
&lt;li&gt;the index is stored in the command itself, so it now makes more sense to cache the parsed &lt;code&gt;cmd&lt;/code&gt; instead of the flat &lt;code&gt;line&lt;/code&gt;&lt;/li&gt;
&lt;li&gt;this index is added to the &lt;code&gt;cmd&lt;/code&gt; structure and sent out to subscribers&lt;/li&gt;
&lt;li&gt;when a client subscribes and specifies &lt;code&gt;last_seen&lt;/code&gt; (default -1), only send out cached messages with a higher index&lt;/li&gt;
&lt;/ul&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="nn"&gt;sys&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt; &lt;span class="nn"&gt;asyncio&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt; &lt;span class="nn"&gt;ast&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt; &lt;span class="nn"&gt;random&lt;/span&gt;
&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="nn"&gt;collections&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;defaultdict&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;deque&lt;/span&gt;

&lt;span class="n"&gt;CACHE_LENGTH&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;100&lt;/span&gt;

&lt;span class="n"&gt;topics&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;defaultdict&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="k"&gt;lambda&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;[])&lt;/span&gt;
&lt;span class="n"&gt;caches&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;defaultdict&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="k"&gt;lambda&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;deque&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;maxlen&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;CACHE_LENGTH&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;
&lt;span class="n"&gt;indexs&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;defaultdict&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="k"&gt;lambda&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;handle_subscribe&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;cmd&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;writer&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
    &lt;span class="n"&gt;topics&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;cmd&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;topic&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;]]&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;append&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;writer&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="n"&gt;last_seen&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nb"&gt;int&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;cmd&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;last_seen&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;])&lt;/span&gt; &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="s1"&gt;&amp;#39;last_seen&amp;#39;&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;cmd&lt;/span&gt; &lt;span class="k"&gt;else&lt;/span&gt; &lt;span class="o"&gt;-&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;
    &lt;span class="n"&gt;send_cached&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;writer&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;cmd&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;topic&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt; &lt;span class="n"&gt;last_seen&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="c1"&gt;# if there are any cached messages to send, send it&lt;/span&gt;

&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;send_cached&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;writer&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;topic&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;last_seen&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
    &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;cmd&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;caches&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;topic&lt;/span&gt;&lt;span class="p"&gt;]:&lt;/span&gt;
        &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;cmd&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;index&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="o"&gt;&amp;gt;&lt;/span&gt; &lt;span class="n"&gt;last_seen&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
            &lt;span class="n"&gt;writer&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;write&lt;/span&gt;&lt;span class="p"&gt;((&lt;/span&gt;&lt;span class="nb"&gt;str&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;cmd&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;&lt;span class="o"&gt;+&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;&lt;/span&gt;&lt;span class="se"&gt;\n&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;encode&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;utf8&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;
    &lt;span class="n"&gt;new_cache&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;deque&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;maxlen&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;CACHE_LENGTH&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;cmd&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;caches&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;topic&lt;/span&gt;&lt;span class="p"&gt;]:&lt;/span&gt;
        &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;cmd&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;index&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="o"&gt;&amp;lt;=&lt;/span&gt; &lt;span class="n"&gt;last_seen&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
            &lt;span class="n"&gt;new_cache&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;append&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;cmd&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
        &lt;span class="k"&gt;else&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
            &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;cmd&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;delivery&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="o"&gt;==&lt;/span&gt; &lt;span class="s1"&gt;&amp;#39;all&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
                &lt;span class="n"&gt;new_cache&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;append&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;cmd&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="n"&gt;caches&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;topic&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;new_cache&lt;/span&gt;

&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;handle_send&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;cmd&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
    &lt;span class="n"&gt;cmd&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;index&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;indexs&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;cmd&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;topic&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;]]&lt;/span&gt;
    &lt;span class="n"&gt;indexs&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;cmd&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;topic&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;]]&lt;/span&gt; &lt;span class="o"&gt;+=&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt;
    &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;cmd&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;delivery&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="o"&gt;==&lt;/span&gt; &lt;span class="s1"&gt;&amp;#39;all&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
        &lt;span class="n"&gt;writers&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;topics&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;cmd&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;topic&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;]]&lt;/span&gt;
        &lt;span class="n"&gt;caches&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;cmd&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;topic&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;]]&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;append&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;cmd&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="c1"&gt;# cache it&lt;/span&gt;
    &lt;span class="k"&gt;else&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="c1"&gt;# if delivery == &amp;#39;one&amp;#39;:&lt;/span&gt;
        &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="nb"&gt;len&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;topics&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;cmd&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;topic&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;]])&lt;/span&gt; &lt;span class="o"&gt;==&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
            &lt;span class="n"&gt;writers&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[]&lt;/span&gt;
            &lt;span class="n"&gt;caches&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;cmd&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;topic&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;]]&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;append&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;cmd&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="c1"&gt;# cache it&lt;/span&gt;
        &lt;span class="k"&gt;else&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
            &lt;span class="n"&gt;which&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;random&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;randint&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nb"&gt;len&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;topics&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;cmd&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;topic&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;]])&lt;/span&gt;&lt;span class="o"&gt;-&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
            &lt;span class="n"&gt;writers&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;topics&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;cmd&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;topic&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;]][&lt;/span&gt;&lt;span class="n"&gt;which&lt;/span&gt;&lt;span class="p"&gt;]]&lt;/span&gt;
            &lt;span class="c1"&gt;# no need to cache it&lt;/span&gt;
    &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;w&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;writers&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
        &lt;span class="n"&gt;w&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;write&lt;/span&gt;&lt;span class="p"&gt;((&lt;/span&gt;&lt;span class="nb"&gt;str&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;cmd&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;&lt;span class="o"&gt;+&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;&lt;/span&gt;&lt;span class="se"&gt;\n&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;encode&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;utf8&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;

&lt;span class="n"&gt;async&lt;/span&gt; &lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;handle_client&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;reader&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;writer&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
    &lt;span class="k"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;New client connected...&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="n"&gt;line&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nb"&gt;str&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
    &lt;span class="k"&gt;while&lt;/span&gt; &lt;span class="n"&gt;line&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;strip&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt; &lt;span class="o"&gt;!=&lt;/span&gt; &lt;span class="s1"&gt;&amp;#39;quit&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
        &lt;span class="n"&gt;line&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;await&lt;/span&gt; &lt;span class="n"&gt;reader&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;readline&lt;/span&gt;&lt;span class="p"&gt;())&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;decode&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;utf8&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
        &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;line&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;strip&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt; &lt;span class="o"&gt;==&lt;/span&gt; &lt;span class="s1"&gt;&amp;#39;&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="k"&gt;continue&lt;/span&gt;
        &lt;span class="k"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;f&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;Received: {line.strip()}&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
        &lt;span class="n"&gt;cmd&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;ast&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;literal_eval&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;line&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
        &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;cmd&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;command&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="o"&gt;==&lt;/span&gt; &lt;span class="s1"&gt;&amp;#39;subscribe&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
            &lt;span class="n"&gt;handle_subscribe&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;cmd&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;writer&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
        &lt;span class="k"&gt;elif&lt;/span&gt; &lt;span class="n"&gt;cmd&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;command&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="o"&gt;==&lt;/span&gt; &lt;span class="s1"&gt;&amp;#39;send&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
            &lt;span class="n"&gt;handle_send&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;cmd&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="n"&gt;writer&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;close&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
    &lt;span class="k"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;Client disconnected...&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="n"&gt;async&lt;/span&gt; &lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;run_server&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;host&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;port&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
    &lt;span class="n"&gt;server&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;await&lt;/span&gt; &lt;span class="n"&gt;asyncio&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;start_server&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;handle_client&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;host&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;port&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="k"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;f&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;Listening on {host}:{port}...&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="n"&gt;async&lt;/span&gt; &lt;span class="k"&gt;with&lt;/span&gt; &lt;span class="n"&gt;server&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
        &lt;span class="n"&gt;await&lt;/span&gt; &lt;span class="n"&gt;server&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;serve_forever&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;

&lt;span class="n"&gt;asyncio&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;run&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;run_server&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;host&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;localhost&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;port&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="nb"&gt;int&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;sys&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;argv&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;])))&lt;/span&gt;
&lt;/pre&gt;&lt;/div&gt;
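&lt;p&gt;To make the reconnect flow concrete, here is a minimal client-side sketch (this is not part of the server code above; the &lt;code&gt;subscribe_msg&lt;/code&gt; helper and the printed handling are my own illustration, matching the wire format of one Python dict literal per line). The client tracks the highest &lt;code&gt;index&lt;/code&gt; it has received and re-subscribes with it after a reconnect:&lt;/p&gt;

```python
import ast, asyncio

def subscribe_msg(topic, last_seen):
    # wire format matches the server: one Python dict literal per line
    return str({'command': 'subscribe', 'topic': topic, 'last_seen': str(last_seen)}) + '\n'

async def run_client(host, port, topic):
    last_seen = -1  # highest message index received so far
    reader, writer = await asyncio.open_connection(host, port)
    # re-subscribe from where we left off; the server replays newer cached messages
    writer.write(subscribe_msg(topic, last_seen).encode('utf8'))
    await writer.drain()
    while True:
        line = (await reader.readline()).decode('utf8')
        if line.strip() == '':
            break  # server closed the connection
        cmd = ast.literal_eval(line)
        last_seen = max(last_seen, cmd['index'])
        print(cmd)
```

&lt;p&gt;Persisting &lt;code&gt;last_seen&lt;/code&gt; to disk between runs would let a client resume across restarts without re-reading old messages.&lt;/p&gt;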


&lt;h2&gt;Handling client disconnects&lt;/h2&gt;
&lt;p&gt;Up to this point, the code did not handle client disconnects: &lt;code&gt;writer&lt;/code&gt;s were added to topic lists but never removed. If a subscriber disconnects, its stale writer stays in the topic list, and the next time a message is sent on that topic, writing to the stale writer raises an exception. Handling this is easy:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;when a client disconnects, an exception is thrown&lt;/li&gt;
&lt;li&gt;we maintain a reverse lookup dictionary so we know which writer was added to which topic list&lt;/li&gt;
&lt;/ul&gt;
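&lt;p&gt;In outline, the cleanup could look like the sketch below (the variable names match the code that follows, but the exact exception types and control flow here are my assumption, not the post's final implementation):&lt;/p&gt;

```python
from collections import defaultdict

topics = defaultdict(lambda: [])          # topic name to list of subscriber writers
topics_reverse = defaultdict(lambda: [])  # writer to list of topics it subscribed to

def unsubscribe(writer):
    # remove a disconnected writer from every topic list it was added to,
    # using the reverse lookup so we avoid scanning all topics
    for topic in topics_reverse[writer]:
        topics[topic].remove(writer)
    del topics_reverse[writer]

async def handle_client(reader, writer):
    try:
        pass  # read and dispatch commands, as before
    except (ConnectionResetError, BrokenPipeError):
        pass  # the client went away mid-read or mid-write
    finally:
        unsubscribe(writer)  # drop the stale writer so future sends do not fail
        writer.close()
```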
&lt;p&gt;I will show the basic logic using the simplest message queue implementation:&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="nn"&gt;sys&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt; &lt;span class="nn"&gt;asyncio&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt; &lt;span class="nn"&gt;ast&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt; &lt;span class="nn"&gt;random&lt;/span&gt;
&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="nn"&gt;collections&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;defaultdict&lt;/span&gt;

&lt;span class="n"&gt;topics&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;defaultdict&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="k"&gt;lambda&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;[])&lt;/span&gt;
&lt;span class="n"&gt;topics_reverse&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;defaultdict&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="k"&gt;lambda&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;[])&lt;/span&gt;

&lt;span class="n"&gt;async&lt;/span&gt; &lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;handle_client&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;reader&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;writer&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
    &lt;span class="k"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;New client connected...&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="k"&gt;try&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
        &lt;span class="n"&gt;line&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nb"&gt;str&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
        &lt;span class="k"&gt;while&lt;/span&gt; &lt;span class="n"&gt;line&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;strip&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt; &lt;span class="o"&gt;!=&lt;/span&gt; &lt;span class="s1"&gt;&amp;#39;quit&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
            &lt;span class="n"&gt;line&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;await&lt;/span&gt; &lt;span class="n"&gt;reader&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;readline&lt;/span&gt;&lt;span class="p"&gt;())&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;decode&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;utf8&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
            &lt;span class="k"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;f&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;Received: {line.strip()}&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
            &lt;span class="n"&gt;cmd&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;ast&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;literal_eval&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;line&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
            &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;cmd&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;command&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="o"&gt;==&lt;/span&gt; &lt;span class="s1"&gt;&amp;#39;subscribe&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
                &lt;span class="n"&gt;topics&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;cmd&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;topic&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;]]&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;append&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;writer&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
                &lt;span class="n"&gt;topics_reverse&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;writer&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;append&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;cmd&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;topic&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;])&lt;/span&gt;
            &lt;span class="k"&gt;elif&lt;/span&gt; &lt;span class="n"&gt;cmd&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;command&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="o"&gt;==&lt;/span&gt; &lt;span class="s1"&gt;&amp;#39;send&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
                &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;cmd&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;delivery&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="o"&gt;==&lt;/span&gt; &lt;span class="s1"&gt;&amp;#39;all&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
                    &lt;span class="n"&gt;writers&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;topics&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;cmd&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;topic&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;]]&lt;/span&gt;
                &lt;span class="k"&gt;else&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="c1"&gt;# if delivery == &amp;#39;one&amp;#39;:&lt;/span&gt;
                    &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="nb"&gt;len&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;topics&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;cmd&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;topic&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;]])&lt;/span&gt; &lt;span class="o"&gt;==&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
                        &lt;span class="n"&gt;writers&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[]&lt;/span&gt;
                    &lt;span class="k"&gt;else&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
                        &lt;span class="n"&gt;which&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;random&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;randint&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nb"&gt;len&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;topics&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;cmd&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;topic&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;]])&lt;/span&gt;&lt;span class="o"&gt;-&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
                        &lt;span class="n"&gt;writers&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;topics&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;cmd&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;topic&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;]][&lt;/span&gt;&lt;span class="n"&gt;which&lt;/span&gt;&lt;span class="p"&gt;]]&lt;/span&gt;
                &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;w&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;writers&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
                    &lt;span class="n"&gt;w&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;write&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;line&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;encode&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;utf8&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;
    &lt;span class="k"&gt;except&lt;/span&gt; &lt;span class="ne"&gt;Exception&lt;/span&gt; &lt;span class="k"&gt;as&lt;/span&gt; &lt;span class="n"&gt;e&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
        &lt;span class="k"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;e&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;writer&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;topics_reverse&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
        &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;topic&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;topics_reverse&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;writer&lt;/span&gt;&lt;span class="p"&gt;]:&lt;/span&gt;
            &lt;span class="n"&gt;topics&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;topic&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;remove&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;writer&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
            &lt;span class="k"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;f&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;Removing writer from topic {topic}&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
        &lt;span class="k"&gt;del&lt;/span&gt; &lt;span class="n"&gt;topics_reverse&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;writer&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;
    &lt;span class="n"&gt;writer&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;close&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
    &lt;span class="k"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;Client disconnected...&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="n"&gt;async&lt;/span&gt; &lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;run_server&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;host&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;port&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
    &lt;span class="n"&gt;server&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;await&lt;/span&gt; &lt;span class="n"&gt;asyncio&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;start_server&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;handle_client&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;host&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;port&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="k"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;f&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;Listening on {host}:{port}...&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="n"&gt;async&lt;/span&gt; &lt;span class="k"&gt;with&lt;/span&gt; &lt;span class="n"&gt;server&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
        &lt;span class="n"&gt;await&lt;/span&gt; &lt;span class="n"&gt;server&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;serve_forever&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;

&lt;span class="n"&gt;asyncio&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;run&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;run_server&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;host&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;localhost&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;port&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="nb"&gt;int&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;sys&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;argv&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;])))&lt;/span&gt;
&lt;/pre&gt;&lt;/div&gt;


&lt;h2&gt;Conclusion&lt;/h2&gt;
&lt;p&gt;Of course this is not ready for production use, but it is quite surprising how much functionality fits into a 67-line async message queue implementation that is bi-directional and supports caching and indexing.&lt;/p&gt;</content><category term="python"></category><category term="async"></category><category term="message"></category><category term="queue"></category></entry><entry><title>Writing a simple Python async message queue server - Part I</title><link href="https://bytepawn.com/writing-a-simple-python-async-message-queue-server.html" rel="alternate"></link><published>2023-02-27T00:00:00+01:00</published><updated>2023-02-27T00:00:00+01:00</updated><author><name>Marton Trencseni</name></author><id>tag:bytepawn.com,2023-02-27:/writing-a-simple-python-async-message-queue-server.html</id><summary type="html">&lt;p&gt;I write a simple, bi-directional async message queue server in Python.&lt;br/&gt;&lt;br/&gt; &lt;img src="/images/async_echo.png" alt="." style="width: 400px;"/&gt;&lt;/p&gt;</summary><content type="html">&lt;h2&gt;Introduction&lt;/h2&gt;
&lt;p&gt;In 2023, I want to get back into playful systems programming. My plan is to write a simple &lt;a href="https://en.wikipedia.org/wiki/Message_queue"&gt;message queue (MQ)&lt;/a&gt;, and for my own education write the same toy implementation in modern Python, modern C++ and Rust. To make things interesting, I will use async / &lt;a href="https://en.wikipedia.org/wiki/Event-driven_programming"&gt;event-driven&lt;/a&gt; programming and avoid &lt;a href="https://en.wikipedia.org/wiki/Multithreading_(computer_architecture)"&gt;multi-threading&lt;/a&gt;. I have lots of experience in the first two, but this will be my first time writing Rust.&lt;/p&gt;
&lt;p&gt;In this first article, I will explore how to write a simple Python MQ server using the Python &lt;a href="https://docs.python.org/3/library/asyncio.html"&gt;&lt;code&gt;asyncio&lt;/code&gt; library&lt;/a&gt;. &lt;code&gt;asyncio&lt;/code&gt; allows programmers to write code as if it were multi-threaded, but the run-time is actually using &lt;a href="https://en.wikipedia.org/wiki/Coroutine"&gt;co-routines&lt;/a&gt; and switching between them at special &lt;code&gt;await&lt;/code&gt; points. Also, the library has special data structures that allow &lt;a href="https://en.wikipedia.org/wiki/Synchronization_(computer_science)"&gt;synchronization&lt;/a&gt; between co-routines without having to use synchronization primitives such as &lt;a href="https://en.wikipedia.org/wiki/Lock_(computer_science)"&gt;locks&lt;/a&gt;.&lt;/p&gt;
&lt;h2&gt;Simple async echo server&lt;/h2&gt;
&lt;p&gt;Let's start with writing the simplest possible skeleton server, which listens on a port, accepts TCP connections, reads incoming bytes, and echoes back whatever was sent:&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="nn"&gt;sys&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt; &lt;span class="nn"&gt;asyncio&lt;/span&gt;

&lt;span class="n"&gt;async&lt;/span&gt; &lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;handle_client&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;reader&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;writer&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
    &lt;span class="k"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;New client connected...&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="n"&gt;line&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nb"&gt;str&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
    &lt;span class="k"&gt;while&lt;/span&gt; &lt;span class="n"&gt;line&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;strip&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt; &lt;span class="o"&gt;!=&lt;/span&gt; &lt;span class="s1"&gt;&amp;#39;quit&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
        &lt;span class="n"&gt;line&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;await&lt;/span&gt; &lt;span class="n"&gt;reader&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;readline&lt;/span&gt;&lt;span class="p"&gt;())&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;decode&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;utf8&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
        &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;line&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;strip&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt; &lt;span class="o"&gt;==&lt;/span&gt; &lt;span class="s1"&gt;&amp;#39;&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="k"&gt;continue&lt;/span&gt;
        &lt;span class="k"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;f&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;Received: {line.strip()}&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
        &lt;span class="n"&gt;writer&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;write&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;line&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;encode&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;utf8&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;
    &lt;span class="n"&gt;writer&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;close&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
    &lt;span class="k"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;Client disconnected...&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="n"&gt;async&lt;/span&gt; &lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;run_server&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;host&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;port&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
    &lt;span class="n"&gt;server&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;await&lt;/span&gt; &lt;span class="n"&gt;asyncio&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;start_server&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;handle_client&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;host&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;port&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="k"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;f&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;Listening on {host}:{port}...&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="n"&gt;async&lt;/span&gt; &lt;span class="k"&gt;with&lt;/span&gt; &lt;span class="n"&gt;server&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
        &lt;span class="n"&gt;await&lt;/span&gt; &lt;span class="n"&gt;server&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;serve_forever&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;

&lt;span class="n"&gt;asyncio&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;run&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;run_server&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;host&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;localhost&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;port&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="nb"&gt;int&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;sys&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;argv&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;])))&lt;/span&gt;
&lt;/pre&gt;&lt;/div&gt;


&lt;p&gt;You can test it by running it in a terminal, like:&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;$ python async_echo.py &lt;span class="m"&gt;7777&lt;/span&gt;
&lt;/pre&gt;&lt;/div&gt;


&lt;p&gt;Then, in another terminal, connect to it:&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;$ telnet
&amp;gt; open localhost &lt;span class="m"&gt;7777&lt;/span&gt;
foo
foo
bar
bar
&lt;/pre&gt;&lt;/div&gt;


&lt;p&gt;&lt;code&gt;asyncio.run()&lt;/code&gt; is the top-level entry-point in an async Python program; it's like &lt;code&gt;main()&lt;/code&gt;. We pass in our own &lt;code&gt;run_server()&lt;/code&gt; function, which kicks off the main event loop. This then uses &lt;code&gt;asyncio.start_server()&lt;/code&gt;, which sets up a TCP server listening on the specified port, and fires off a new instance of &lt;code&gt;handle_client()&lt;/code&gt; for every new incoming connection. We can pretend each &lt;code&gt;handle_client()&lt;/code&gt; is a new thread, but in fact the run-time is single-threaded and uses co-routines. &lt;code&gt;server.serve_forever()&lt;/code&gt; continues the main event loop; this is where the &lt;code&gt;handle_client()&lt;/code&gt; calls will come from.&lt;/p&gt;
&lt;p&gt;Interacting with the client connection is simple. &lt;code&gt;handle_client()&lt;/code&gt; receives a &lt;code&gt;reader&lt;/code&gt; and a &lt;code&gt;writer&lt;/code&gt;, and in these simple examples we use &lt;code&gt;reader.readline()&lt;/code&gt; and &lt;code&gt;writer.write()&lt;/code&gt; to receive and send data.&lt;/p&gt;
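&lt;p&gt;The same reader/writer pair is available on the client side via &lt;code&gt;asyncio.open_connection()&lt;/code&gt;. As a minimal self-contained sketch (not part of the original server code), the following starts a one-shot echo server on an OS-assigned port and round-trips a single line through it in one process:&lt;/p&gt;

```python
import asyncio

async def handle_echo(reader, writer):
    # Echo a single line back to the client, then close.
    line = await reader.readline()
    writer.write(line)
    await writer.drain()
    writer.close()

async def main():
    # Start a throwaway echo server on an OS-assigned port (port 0).
    server = await asyncio.start_server(handle_echo, '127.0.0.1', 0)
    port = server.sockets[0].getsockname()[1]
    # The client side gets the same (reader, writer) pair
    # from asyncio.open_connection().
    reader, writer = await asyncio.open_connection('127.0.0.1', port)
    writer.write(b'hello\n')
    await writer.drain()
    echoed = await reader.readline()
    writer.close()
    server.close()
    await server.wait_closed()
    return echoed

result = asyncio.run(main())
print(result)  # b'hello\n'
```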
&lt;p&gt;What about the funny &lt;code&gt;async&lt;/code&gt; and &lt;code&gt;await&lt;/code&gt; keywords? Essentially, multiple copies of &lt;code&gt;handle_client()&lt;/code&gt; will be running, but not at the same time. The Python runtime will potentially switch between these co-routines whenever one of them gets to an &lt;code&gt;await&lt;/code&gt; statement. As an example, assume a client connects and &lt;code&gt;handle_client()&lt;/code&gt; starts running. It will get to the &lt;code&gt;line = (await reader.readline()).decode('utf8')&lt;/code&gt; line, which has an &lt;code&gt;await&lt;/code&gt;. At this point, this co-routine is waiting for input, and the data may take arbitrarily long to arrive. Meanwhile, another client may connect; in this case the waiting co-routine stays paused, and the main co-routine &lt;code&gt;run_server()&lt;/code&gt; runs again to handle the new connection and spawn another &lt;code&gt;handle_client()&lt;/code&gt;. Now we have 1 server listening/awaiting and 2 clients reading/awaiting; whichever event happens first, that co-routine resumes next. So, &lt;code&gt;async&lt;/code&gt; tells Python that the function is a co-routine (anything with &lt;code&gt;await&lt;/code&gt; inside is a co-routine), and &lt;code&gt;await&lt;/code&gt; tells Python that the function being called is a (blocking) co-routine.&lt;/p&gt;
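&lt;p&gt;This switching behavior can be observed directly. In the sketch below (the &lt;code&gt;worker&lt;/code&gt; name and &lt;code&gt;order&lt;/code&gt; list are just for illustration), two co-routines interleave exactly at their &lt;code&gt;await&lt;/code&gt; points:&lt;/p&gt;

```python
import asyncio

order = []

async def worker(name):
    order.append(f'{name} start')
    # A co-routine can only be suspended at an await point;
    # sleep(0) yields control back to the event loop.
    await asyncio.sleep(0)
    order.append(f'{name} end')

async def main():
    # gather() schedules both workers on the single-threaded event loop.
    await asyncio.gather(worker('A'), worker('B'))

asyncio.run(main())
print(order)  # ['A start', 'B start', 'A end', 'B end']
```

&lt;p&gt;Each worker runs until its &lt;code&gt;await&lt;/code&gt;, at which point the other gets a turn; without the &lt;code&gt;await&lt;/code&gt;, each worker would run to completion before the other started.&lt;/p&gt;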
&lt;h2&gt;Simple async message queue in Python&lt;/h2&gt;
&lt;p&gt;The above skeleton can be extended to a simple uni-directional message queue. It accepts two kinds of structured messages, &lt;code&gt;subscribe&lt;/code&gt; and &lt;code&gt;send&lt;/code&gt;.&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;{&amp;#39;command&amp;#39;:&amp;#39;subscribe&amp;#39;, &amp;#39;topic&amp;#39;:&amp;#39;foo&amp;#39;              }
{&amp;#39;command&amp;#39;:&amp;#39;send&amp;#39;,      &amp;#39;topic&amp;#39;:&amp;#39;foo&amp;#39;, &amp;#39;msg&amp;#39;:&amp;#39;blah&amp;#39;}
&lt;/pre&gt;&lt;/div&gt;


&lt;p&gt;The semantics are simple: after a &lt;code&gt;subscribe&lt;/code&gt;, the client receives all messages that other clients subsequently send to that topic; messages sent before the subscription are not delivered. In this simple implementation, all subscribers get the message.&lt;/p&gt;
&lt;p&gt;The implementation is barely longer than the echo server, since the problem is very similar: we just have to maintain a list of &lt;code&gt;writer&lt;/code&gt;s for each topic, and when we receive a message for a topic, we go through those writers and send the message.&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="nn"&gt;sys&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt; &lt;span class="nn"&gt;asyncio&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt; &lt;span class="nn"&gt;ast&lt;/span&gt;
&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="nn"&gt;collections&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;defaultdict&lt;/span&gt;

&lt;span class="n"&gt;topics&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;defaultdict&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="k"&gt;lambda&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;[])&lt;/span&gt;

&lt;span class="n"&gt;async&lt;/span&gt; &lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;handle_client&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;reader&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;writer&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
    &lt;span class="k"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;New client connected...&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="n"&gt;line&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nb"&gt;str&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
    &lt;span class="k"&gt;while&lt;/span&gt; &lt;span class="n"&gt;line&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;strip&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt; &lt;span class="o"&gt;!=&lt;/span&gt; &lt;span class="s1"&gt;&amp;#39;quit&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
        &lt;span class="n"&gt;line&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;await&lt;/span&gt; &lt;span class="n"&gt;reader&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;readline&lt;/span&gt;&lt;span class="p"&gt;())&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;decode&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;utf8&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
        &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;line&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;strip&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt; &lt;span class="o"&gt;==&lt;/span&gt; &lt;span class="s1"&gt;&amp;#39;&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="k"&gt;continue&lt;/span&gt;
        &lt;span class="k"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;f&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;Received: {line.strip()}&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
        &lt;span class="n"&gt;cmd&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;ast&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;literal_eval&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;line&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
        &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;cmd&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;command&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="o"&gt;==&lt;/span&gt; &lt;span class="s1"&gt;&amp;#39;subscribe&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
            &lt;span class="n"&gt;topics&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;cmd&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;topic&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;]]&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;append&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;writer&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
        &lt;span class="k"&gt;elif&lt;/span&gt; &lt;span class="n"&gt;cmd&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;command&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="o"&gt;==&lt;/span&gt; &lt;span class="s1"&gt;&amp;#39;send&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
            &lt;span class="n"&gt;writers&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;topics&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;cmd&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;topic&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;]]&lt;/span&gt;
            &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;w&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;writers&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
                &lt;span class="n"&gt;w&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;write&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;line&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;encode&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;utf8&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;
    &lt;span class="n"&gt;writer&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;close&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
    &lt;span class="k"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;Client disconnected...&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="n"&gt;async&lt;/span&gt; &lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;run_server&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;host&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;port&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
    &lt;span class="n"&gt;server&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;await&lt;/span&gt; &lt;span class="n"&gt;asyncio&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;start_server&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;handle_client&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;host&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;port&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="k"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;f&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;Listening on {host}:{port}...&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="n"&gt;async&lt;/span&gt; &lt;span class="k"&gt;with&lt;/span&gt; &lt;span class="n"&gt;server&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
        &lt;span class="n"&gt;await&lt;/span&gt; &lt;span class="n"&gt;server&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;serve_forever&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;

&lt;span class="n"&gt;asyncio&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;run&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;run_server&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;host&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;localhost&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;port&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="nb"&gt;int&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;sys&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;argv&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;])))&lt;/span&gt;
&lt;/pre&gt;&lt;/div&gt;


&lt;p&gt;You can test it by running it in a terminal, like:&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;$ python async_unidir.py &lt;span class="m"&gt;7777&lt;/span&gt;
&lt;/pre&gt;&lt;/div&gt;


&lt;p&gt;Then, in a second terminal, connect to it:&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;$ telnet
&amp;gt; open localhost &lt;span class="m"&gt;7777&lt;/span&gt;
&lt;span class="o"&gt;{&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;command&amp;#39;&lt;/span&gt;:&lt;span class="s1"&gt;&amp;#39;subscribe&amp;#39;&lt;/span&gt;, &lt;span class="s1"&gt;&amp;#39;topic&amp;#39;&lt;/span&gt;:&lt;span class="s1"&gt;&amp;#39;foo&amp;#39;&lt;/span&gt;&lt;span class="o"&gt;}&lt;/span&gt;
&lt;/pre&gt;&lt;/div&gt;


&lt;p&gt;Then, in a third terminal:&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;$ telnet
&amp;gt; open localhost &lt;span class="m"&gt;7777&lt;/span&gt;
&lt;span class="o"&gt;{&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;command&amp;#39;&lt;/span&gt;:&lt;span class="s1"&gt;&amp;#39;send&amp;#39;&lt;/span&gt;, &lt;span class="s1"&gt;&amp;#39;topic&amp;#39;&lt;/span&gt;:&lt;span class="s1"&gt;&amp;#39;foo&amp;#39;&lt;/span&gt;, &lt;span class="s1"&gt;&amp;#39;msg&amp;#39;&lt;/span&gt;:&lt;span class="s1"&gt;&amp;#39;blah&amp;#39;&lt;/span&gt;&lt;span class="o"&gt;}&lt;/span&gt;
&lt;/pre&gt;&lt;/div&gt;


&lt;p&gt;The message will appear in the second terminal!&lt;/p&gt;
&lt;p&gt;Note that in traditional multi-threaded programming, the queue objects would have to be protected by locks to avoid corrupting the data structure. With Python &lt;code&gt;async&lt;/code&gt;, this is not an issue: there is no way to corrupt the internal state of the queues (or the &lt;code&gt;topics&lt;/code&gt; hashmap), since the program is actually single-threaded and only switches coroutines at &lt;code&gt;await&lt;/code&gt; points.&lt;/p&gt;
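&lt;p&gt;To illustrate this point, here is a standalone sketch (separate from the server above): two producer coroutines mutate a shared dict and push to a shared &lt;code&gt;asyncio.Queue&lt;/code&gt; with no locks at all, and the counts still come out right, because task switches only happen at &lt;code&gt;await&lt;/code&gt; points.&lt;/p&gt;

```python
import asyncio

# Standalone sketch (not code from the server above): two producers share a
# plain dict and an asyncio.Queue without any locks. The event loop runs in a
# single thread and only switches tasks at await points, so the
# read-modify-write on counts below can never be interleaved mid-operation.

async def producer(name, queue, counts):
    for i in range(3):
        counts[name] = counts.get(name, 0) + 1  # no lock needed
        await queue.put((name, i))              # potential switch point
        await asyncio.sleep(0)                  # explicitly yield to other tasks

async def consumer(queue, n):
    received = []
    for _ in range(n):
        received.append(await queue.get())
    return received

async def main():
    queue = asyncio.Queue()
    counts = {}
    _, _, received = await asyncio.gather(
        producer('a', queue, counts),
        producer('b', queue, counts),
        consumer(queue, 6),
    )
    return counts, received

counts, received = asyncio.run(main())
print(counts)         # {'a': 3, 'b': 3}
print(len(received))  # 6
```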
&lt;h2&gt;Conclusion&lt;/h2&gt;
&lt;p&gt;In the next article, I will add some features to this simple skeleton.&lt;/p&gt;</content><category term="python"></category><category term="async"></category><category term="message"></category><category term="queue"></category></entry><entry><title>Lifelong purchases</title><link href="https://bytepawn.com/lifelong-purchases.html" rel="alternate"></link><published>2023-02-05T00:00:00+01:00</published><updated>2023-02-05T00:00:00+01:00</updated><author><name>Marton Trencseni</name></author><id>tag:bytepawn.com,2023-02-05:/lifelong-purchases.html</id><summary type="html">&lt;p&gt;What are some of my buy-it-for-life purchases?&lt;br/&gt;&lt;br/&gt; &lt;img src="/images/tm-notebook.jpg" alt="." style="width: 400px;"/&gt;&lt;/p&gt;</summary><content type="html">&lt;h2&gt;Introduction&lt;/h2&gt;
&lt;p&gt;Recently there was an Ask HN on Hacker News entitled &lt;a href="https://news.ycombinator.com/item?id=34378254"&gt;Ask HN: What are some of your buy-it-for-life purchases?&lt;/a&gt; I really enjoyed the question and gave my own answer, which I'm reproducing here.&lt;/p&gt;
&lt;h2&gt;My lifelong purchases&lt;/h2&gt;
&lt;p&gt;I don't really know whether these will be lifelong items, but their quality suggests they could be.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Furniture.&lt;/strong&gt; Solid wood furniture like that from &lt;a href="https://www.crateandbarrel.me"&gt;Crate &amp;amp; Barrel&lt;/a&gt;. Our current apartment has a lot of C&amp;amp;B wood furniture, since my company owns the store and I get 40% off.&lt;/p&gt;
&lt;p&gt;&lt;img src="/images/cratebarrel.jpg" alt="." style="width: 600px;"/&gt;&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Knives.&lt;/strong&gt; High quality knives like &lt;a href="https://winklerknives.com/"&gt;Winkler&lt;/a&gt;, &lt;a href="https://www.boker.de/en/fixed-blade-knives"&gt;Böker&lt;/a&gt; and &lt;a href="https://www.victorinox.com/fr/en/Products/Swiss-Army-Knives/c/SAK"&gt;Victorinox&lt;/a&gt;. Also, local knifemakers like &lt;a href="https://www.kesvilag.hu/toth-laszlo-kesek.html"&gt;Toth Laszlo&lt;/a&gt; in Hungary make beautiful knives that will last forever.&lt;/p&gt;
&lt;p&gt;&lt;img src="/images/knives.jpg" alt="." style="width: 600px;"/&gt;&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Pens.&lt;/strong&gt; High-quality metal pens and pencils, like the ones &lt;a href="https://www.rotring.com/pens-pencils/pencils/rotring-600-mechanical-pencil-1/SAP_1904443.html"&gt;Rotring&lt;/a&gt; makes.&lt;/p&gt;
&lt;p&gt;&lt;img src="/images/pens.jpg" alt="." style="width: 600px;"/&gt;&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Chess sets.&lt;/strong&gt; A high-quality ebony staunton chess set, like the ones sold on &lt;a href="https://www.chessbazaar.com"&gt;Chessbazaar&lt;/a&gt;.&lt;/p&gt;
&lt;p&gt;&lt;img src="/images/chess-staunton.jpg" alt="." style="width: 600px;"/&gt;&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Notebooks.&lt;/strong&gt; High quality leather stuff like notebook sleeves, for example from &lt;a href="https://ryokobags.com/products/hudson-leather-notebook-cover"&gt;Ryoko&lt;/a&gt;. &lt;/p&gt;
&lt;p&gt;&lt;img src="/images/tm-notebook.jpg" alt="." style="width: 600px;"/&gt;&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Watches.&lt;/strong&gt; I'm not an expert, but I don't see a reason for a watch (expensive or affordable) to break, if you wear it reasonably. I have cheap Casio and Seiko watches from 20-30 years ago that are just fine. Good not-so-expensive (compared to Rolex) brands are Casio, Seiko, Tissot, Victorinox, Porsche Design. I also enjoy buying fashion watches like Skagen, About Vintage and Sternglas; watch experts look down on these, but I think they're beautiful.&lt;/p&gt;
&lt;p&gt;&lt;img src="/images/watches.jpg" alt="." style="width: 600px;"/&gt;&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Bikes.&lt;/strong&gt; Almost any modern bike will last you a very long time. Carbon frames possibly won't last a lifetime, but steel ones will. Get a mechanical full &lt;a href="https://www.shimano.com/en/100th/history/products/41.php"&gt;Ultegra groupset&lt;/a&gt;; they are awesome. (The electronic version is also awesome, but I'm not sure it's for-life.) If you want an all-rounder, get a cyclocross bike: you can use it like a road bike and also go off-road a bit. These days I prefer &lt;a href="https://www.bianchi.com/"&gt;Bianchi&lt;/a&gt;, &lt;a href="https://pinarello.com/"&gt;Pinarello&lt;/a&gt; and &lt;a href="https://cervelo.com"&gt;Cervelo&lt;/a&gt; in terms of shape and color, but other brands like &lt;a href="https://trekbikes.com"&gt;Trek&lt;/a&gt;, &lt;a href="https://www.bmc-switzerland.com/"&gt;BMC&lt;/a&gt; and &lt;a href="http://www.cube.eu/"&gt;Cube&lt;/a&gt; are also great. I currently own a Bianchi Oltre XR4 road bike and a Cervelo P5 time-trial bike.&lt;/p&gt;
&lt;p&gt;&lt;img src="/images/my-bianchi.jpg" alt="." style="width: 600px;"/&gt;&lt;/p&gt;
&lt;p&gt;&lt;img src="/images/my-cervelo.jpg" alt="." style="width: 600px;"/&gt;&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Books.&lt;/strong&gt; Most print books will last my lifetime, in my estimation. Specifically, I predict that physical books in my possession will last me longer than DRM'd books. I expect that the platforms (e.g. Kindle) and my personal storage discipline (of PDFs) will not last as long as physical books on my shelf.&lt;/p&gt;
&lt;p&gt;&lt;img src="/images/books.jpg" alt="." style="width: 600px;"/&gt;&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Die-cast toys.&lt;/strong&gt; Die-cast toys like the ones made by &lt;a href="http://www.bburago.com/"&gt;Bburago&lt;/a&gt; are virtually indestructible.&lt;/p&gt;
&lt;p&gt;&lt;img src="/images/die-cast-cars.jpg" alt="." style="width: 600px;"/&gt;&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Speakers.&lt;/strong&gt; High-quality speakers like &lt;a href="https://www.tnt-audio.com/casse/klipsch-rf3_e.html"&gt;Klipsch RF-3s&lt;/a&gt;. Mine are back in Hungary, so I'm using a stock photo.&lt;/p&gt;
&lt;p&gt;&lt;img src="/images/rf3s.jpg" alt="." style="width: 600px;"/&gt;&lt;/p&gt;</content><category term="lifelong"></category><category term="purchase"></category><category term="victorinox"></category></entry><entry><title>Crypto V: On NFTs</title><link href="https://bytepawn.com/my-thoughts-on-nfts.html" rel="alternate"></link><published>2023-01-27T00:00:00+01:00</published><updated>2023-01-27T00:00:00+01:00</updated><author><name>Marton Trencseni</name></author><id>tag:bytepawn.com,2023-01-27:/my-thoughts-on-nfts.html</id><summary type="html">&lt;p&gt;I argue that virtual goods such as in-game skins make more sense in a centralized walled-garden approach, and NFTs tracked and traded on blockchains are legally problematic.&lt;br/&gt;&lt;br/&gt; &lt;img src="/images/bayc6.jpg" alt="BTC transaction" style="width: 400px;"/&gt;&lt;/p&gt;</summary><content type="html">&lt;h2&gt;Introduction&lt;/h2&gt;
&lt;p&gt;This is the fifth post in a series on crypto. Previous articles:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;&lt;a href="https://bytepawn.com/my-thoughts-on-crypto-explained-using-ftx-tokens.html"&gt;Crypto I: On cryptocurrencies, explained using FTX Tokens&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://bytepawn.com/my-thoughts-on-fiat-and-crypto-currencies.html"&gt;Crypto II: On fiat currencies vs. crypto currencies&lt;/a&gt; &lt;/li&gt;
&lt;li&gt;&lt;a href="https://bytepawn.com/my-thoughts-on-crypto-advantages-such-as-anonymity-and-irreversibility.html"&gt;Crypto III: On perceived crypto advantages such as anonymity and irreversibility&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://bytepawn.com/my-thoughts-on-crypto-vs-stock-investing.html"&gt;Crypto IV: On stock investing vs. crypto investing&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;Here I will talk about NFTs. First, some definitions from the &lt;a href="https://en.wikipedia.org/wiki/Non-fungible_token"&gt;Wikipedia entry for Non-fungible tokens&lt;/a&gt;:&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;A non-fungible token (NFT) is a unique digital identifier that cannot be copied, substituted, or subdivided, that is recorded in a blockchain, and that is used to certify authenticity and ownership. The ownership of an NFT is recorded in the blockchain and can be transferred by the owner, allowing NFTs to be sold and traded. NFTs can be created by anybody, and require few or no coding skills to create. NFTs typically contain references to digital files such as photos, videos, and audio. Because NFTs are uniquely identifiable assets, they differ from cryptocurrencies, which are fungible.&lt;/p&gt;
&lt;p&gt;Proponents of NFTs claim that NFTs provide a public certificate of authenticity or proof of ownership, but the legal rights conveyed by an NFT can be uncertain. The ownership of an NFT as defined by the blockchain has no inherent legal meaning and does not necessarily grant copyright, intellectual property rights, or other legal rights over its associated digital file. An NFT does not restrict the sharing or copying of its associated digital file and does not prevent the creation of NFTs that reference identical files.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;h2&gt;Virtual goods on blockchains&lt;/h2&gt;
&lt;p&gt;Fundamentally there is no problem with paying money for virtual goods. For example, software is also a virtual good, but a useful one, because it can be run and presumably some non-virtual value can be extracted (for example, a lawyer uses Microsoft Word to write court documents, gets paid, and buys food). Movies and games have value beyond their ownership, since they provide entertainment.&lt;/p&gt;
&lt;p&gt;Virtual goods were common inside games, and gamers paid money for them long before NFTs. A typical example is a skin for your character in a game. An in-game skin can still be argued to provide non-virtual value if it provides entertainment for the gamer purchasing it.&lt;/p&gt;
&lt;p&gt;&lt;img src="/images/gameskin.jpg" alt="BTC transaction" style="width: 600px;"/&gt;&lt;/p&gt;
&lt;p&gt;However, for a digital good such as an in-game skin to be useful, it has to be tied to an online game that is controlled by a publisher; the publisher has to write and support the software that displays the skins. Games themselves are centralized online services, so it doesn't make sense for in-game content to be on a blockchain — and in fact, most of it is not on a blockchain.&lt;/p&gt;
&lt;p&gt;The types of virtual goods that are traded on a blockchain, such as bored ape images, are useless. There is no value apart from the image itself, which can be viewed without owning the associated NFT anyway.&lt;/p&gt;
&lt;p&gt;&lt;img src="https://media.newyorker.com/photos/61016c1c7a2a603b3075c7b8/master/w_2560%2Cc_limit/chayka-boredapeclub.jpg" alt="BTC transaction" style="width: 600px;"/&gt;&lt;/p&gt;
&lt;h2&gt;Legal problems&lt;/h2&gt;
&lt;p&gt;A lot of the content in this section is from LegalEagle's excellent video about NFTs. I highly recommend watching it if you're interested in NFTs.&lt;/p&gt;
&lt;iframe style="margin-top: 20px; margin-bottom: 20px; margin-left: 15%;" width="560" height="315" src="https://www.youtube.com/embed/C6aeL83z_9Y" title="YouTube video player" frameborder="0" allow="accelerometer; autoplay; clipboard-write; encrypted-media; gyroscope; picture-in-picture; web-share" allowfullscreen&gt;&lt;/iframe&gt;

&lt;p&gt;One of the problems is around the concept of privity: a legal term for when two parties have some sort of linking legal relationship. The problem with NFTs is that it's not clear what the contract is between the initial minter and the Nth owner (or even between the Nth owner and the N+1th owner). For example, if you buy a &lt;a href="https://boredapeyachtclub.com/"&gt;Bored Ape Yacht Club (BAYC)&lt;/a&gt; NFT on the Ethereum blockchain, what is the legal contract between you and &lt;a href="https://yuga.com/"&gt;Yuga Labs&lt;/a&gt;, the minter of the NFT? Do any terms and conditions travel with the NFT, or which do? What if the image pointed to in the NFT disappears from the Internet? What if the Ethereum blockchain (not run by BAYC) disappears? What if the blockchain forks and somebody else owns the token on the other chain?&lt;/p&gt;
&lt;p&gt;A good counter-example is &lt;a href="https://nbatopshot.com/"&gt;NBA Topshots&lt;/a&gt;, a walled-garden NFT platform run by the NBA, where you can buy "collectible" moments from NBA games. To buy anything you have to have an account, and you always have a legal contract with the NBA. However, in this walled-garden scenario, there is no need for a blockchain or the concept of an NFT! The whole thing could work just as well as a plain old web-app — in reality, that is what this is, with a useless blockchain in the backend.&lt;/p&gt;
&lt;p&gt;It's good to remember that NFTs (and any language associated with them, whether in terms &amp;amp; conditions or elsewhere) do not supplant copyright law. If you buy an NFT, you may or may not get the copyright; for example, NBA Topshots does not give you the copyright to the video, only a (very limited) license — you cannot do anything you want with the video; in fact, there is very little you can do with it (the NBA changed its language from "owning" to "scoring" or "collecting" an NFT). In the case of the apes, BAYC holds the IP centrally and only gives a license, which is actually held off-chain. In general, with digital art, if you don't have the copyright, you effectively have nothing [that is protectable under the law]. What is scarce about digital art is the copyright, but this is usually not part of the NFT — which defeats the whole point of NFTs!&lt;/p&gt;
&lt;p&gt;In general, NFTs do not change or supplant established law, and if NFTs run into established law, the law takes precedence. A good example is when Quentin Tarantino &lt;a href="https://tarantinonfts.com/"&gt;created NFTs relating to his movies&lt;/a&gt;, and Miramax, who owns most of the rights, sued him in court. It doesn't matter what's on the blockchain: a court (or settlement) decision takes precedence. Similar to this, sometimes the idea of tracking and trading real estate ownership on the blockchain comes up, but there is already an established title system, which takes precedence. If the centralized records show that X is the owner, then X is the owner.&lt;/p&gt;
&lt;p&gt;It's also not clear how a downstream purchaser can enforce terms against the original issuer or previous owners. Imagine a band creating an NFT that grants meet &amp;amp; greet privileges with the band. Do all owners get this privilege? If the NFT is traded once a day, does the band have to meet a new NFT owner every day? Every hour? Who's responsible if the blockchain breaks or forks? If we need intermediaries like OpenSea to clarify terms, give terms &amp;amp; conditions, etc., then it's a walled-garden anyway (it is), and why do we need a blockchain? We can already do these things off-chain.&lt;/p&gt;
&lt;p&gt;In conclusion, web3 only makes things worse and more complicated.&lt;/p&gt;
&lt;h2&gt;Conclusion&lt;/h2&gt;
&lt;p&gt;To conclude, I will quote Devin from &lt;a href="https://www.youtube.com/channel/UCpa-Zb0ZcQjTCPP1Dx_1M8Q"&gt;LegalEagle&lt;/a&gt;:&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;Programmers are bad at planning for contingencies, and don't understand contract law. Lawyers exist for a reason.&lt;/p&gt;
&lt;/blockquote&gt;</content><category term="crypto"></category><category term="bitcoin"></category><category term="btc"></category><category term="ftx"></category><category term="ftt"></category></entry><entry><title>Crypto IV: On stock investing vs. crypto investing</title><link href="https://bytepawn.com/my-thoughts-on-crypto-vs-stock-investing.html" rel="alternate"></link><published>2023-01-21T00:00:00+01:00</published><updated>2023-01-21T00:00:00+01:00</updated><author><name>Marton Trencseni</name></author><id>tag:bytepawn.com,2023-01-21:/my-thoughts-on-crypto-vs-stock-investing.html</id><summary type="html">&lt;p&gt;I argue that a cryptocurrency like Bitcoin has a weird property: the more you buy of it, the less valuable it becomes.&lt;br/&gt;&lt;br/&gt; &lt;img src="https://e3.365dm.com/22/10/2048x1152/skynews-twitter-elon-musk_5944670.jpg" alt="Elon Musk Twitter" style="width: 400px;"/&gt;&lt;/p&gt;</summary><content type="html">&lt;h2&gt;Introduction&lt;/h2&gt;
&lt;p&gt;In the last 3 posts I argued against the real-world usefulness of cryptocurrencies and pointed to their Ponzi-like nature. One counter-argument of a sort could be that regular stock investing is also Ponzi-like; after all, if I buy a piece of Microsoft stock (&lt;code&gt;MSFT&lt;/code&gt;) at \$1, and then sell it at \$2, somebody now has to enter the market at \$2 to buy it from me. So for that person to make money and sell at \$3, somebody else has to enter the market... and so on. So it sounds like from an investor perspective, there's no difference in the basic dynamic — or is there?&lt;/p&gt;
&lt;p&gt;All articles in this series:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;&lt;a href="https://bytepawn.com/my-thoughts-on-crypto-explained-using-ftx-tokens.html"&gt;Crypto I: On cryptocurrencies, explained using FTX Tokens&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://bytepawn.com/my-thoughts-on-fiat-and-crypto-currencies.html"&gt;Crypto II: On fiat currencies vs. crypto currencies&lt;/a&gt; &lt;/li&gt;
&lt;li&gt;&lt;a href="https://bytepawn.com/my-thoughts-on-crypto-advantages-such-as-anonymity-and-irreversibility.html"&gt;Crypto III: On perceived crypto advantages such as anonymity and irreversibility&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://bytepawn.com/my-thoughts-on-crypto-vs-stock-investing.html"&gt;Crypto IV: On stock investing vs. crypto investing&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://bytepawn.com/my-thoughts-on-nfts.html"&gt;Crypto V: On NFTs&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;h2&gt;Profitability&lt;/h2&gt;
&lt;p&gt;I would say that for most investors, from their own perspective, there is no major first-order difference. You buy a stock or a crypto-something for \$1 and you would like to sell it for \$2 in the future; what happens to the underlying asset is secondary, as long as the price goes up.&lt;/p&gt;
&lt;p&gt;But I would argue that there is a big difference. When buying MSFT stock, you are buying fractional ownership in an economic machine that generates value — profits. With stock, you can always run the following thought experiment: if you buy up 100% of the stock, you now own and control the company. From now on, you can keep 100% of the annual profits. You can hire and fire people, change the company's product strategy, enter new markets or leave markets, and potentially make even more profits. The point is, with stock, you buy fractional control of a company that is (hopefully) generating profits. A good example of this, at the time of writing, is Elon Musk buying 100% of Twitter, taking over the company and taking it private.&lt;/p&gt;
&lt;p&gt;&lt;img src="https://e3.365dm.com/22/10/2048x1152/skynews-twitter-elon-musk_5944670.jpg" alt="Elon Musk Twitter" style="width: 600px;"/&gt;&lt;/p&gt;
&lt;p&gt;With cryptocurrencies, this is not true. There is no underlying company or other asset. If you buy 100% of a cryptocurrency, there is nothing that you can control, there are no profits to keep.&lt;/p&gt;
&lt;p&gt;Okay, but what about companies whose stock has a positive USD value but which are not profitable? For example, Uber is in its 14th year and is not profitable, but you can buy (and sell) a piece of Uber stock for about \$30 at the time of writing. Some investors are buying this stock because they believe eventually Uber will be profitable; for these investors, the above thought experiment still works, it's "just" that after buying 100% of the company, they have to wait some time to start collecting profits. This sort of belief of course also plays an important role for already profitable companies; different investors have different ideas (fantasies) about future profitability of their portfolio companies. Other investors are of course speculative, irrespective of the profitability of the underlying asset.&lt;/p&gt;
&lt;h2&gt;Fractional ownership vs market cap&lt;/h2&gt;
&lt;p&gt;The previous thought experiment gets even more interesting when applied to cryptocurrencies. First, let's go back to the stock example. The price of a stock multiplied by the number of shares outstanding is the market cap(italization) of a company (note that not all stock may be available for purchase on open stock markets). In our thought experiment, if we buy 100% of the stock, does it change the market cap of a company? The market cap is related to the (future) profitability of the company, so fundamentally it is not affected by ownership changes.&lt;/p&gt;
&lt;p&gt;However, I would argue this is not the case with a cryptocurrency like Bitcoin. Bitcoin also has a market cap calculated in a similar fashion as above. However, if you buy 100% of Bitcoin, I would argue the market cap would actually be \$0, because:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;there is no underlying asset, so there is no value coming from there&lt;/li&gt;
&lt;li&gt;if a single person controls all of a currency, it's useless as a currency, so it can't be worth anything&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;As a result, a cryptocurrency like Bitcoin has a weird property: the more you buy of it, the less valuable it becomes!&lt;/p&gt;</content><category term="crypto"></category><category term="bitcoin"></category><category term="btc"></category><category term="ftx"></category><category term="ftt"></category></entry><entry><title>Crypto III: On perceived crypto advantages such as anonymity and irreversibility</title><link href="https://bytepawn.com/my-thoughts-on-crypto-advantages-such-as-anonymity-and-irreversibility.html" rel="alternate"></link><published>2023-01-20T00:00:00+01:00</published><updated>2023-01-20T00:00:00+01:00</updated><author><name>Marton Trencseni</name></author><id>tag:bytepawn.com,2023-01-20:/my-thoughts-on-crypto-advantages-such-as-anonymity-and-irreversibility.html</id><summary type="html">&lt;p&gt;I argue that anonymity, irreversibility and decentralization, while interesting technical features that are implemented in a fascinating way using cryptographic primitives, are not practical for real-world use.&lt;br/&gt;&lt;br/&gt; &lt;img src="/images/btc-transaction.png" alt="BTC transaction" style="width: 400px;"/&gt;&lt;/p&gt;</summary><content type="html">&lt;h2&gt;Introduction&lt;/h2&gt;
&lt;p&gt;This is the third in a series on cryptocurrencies. All articles in this series:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;&lt;a href="https://bytepawn.com/my-thoughts-on-crypto-explained-using-ftx-tokens.html"&gt;Crypto I: On cryptocurrencies, explained using FTX Tokens&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://bytepawn.com/my-thoughts-on-fiat-and-crypto-currencies.html"&gt;Crypto II: On fiat currencies vs. crypto currencies&lt;/a&gt; &lt;/li&gt;
&lt;li&gt;&lt;a href="https://bytepawn.com/my-thoughts-on-crypto-advantages-such-as-anonymity-and-irreversibility.html"&gt;Crypto III: On perceived crypto advantages such as anonymity and irreversibility&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://bytepawn.com/my-thoughts-on-crypto-vs-stock-investing.html"&gt;Crypto IV: On stock investing vs. crypto investing&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://bytepawn.com/my-thoughts-on-nfts.html"&gt;Crypto V: On NFTs&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;Here I will argue that anonymity, irreversibility and decentralization, while interesting technical features that are implemented in a fascinating way using cryptographic primitives, are not practical for real-world use.&lt;/p&gt;
&lt;p&gt;A simple Google search for &lt;a href="https://www.google.com/search?q=properties+of+cryptocurrencies"&gt;properties of cryptocurrencies&lt;/a&gt; shows many results where key properties are summarized. This &lt;a href="https://medium.com/the-capital/key-characteristics-of-cryptocurrency-and-why-do-they-matter-to-you-5f33e483a40f"&gt;Medium blog&lt;/a&gt; lists the following:&lt;/p&gt;
&lt;ol&gt;
&lt;li&gt;Decentralized (no central authority)&lt;/li&gt;
&lt;li&gt;Irreversible and immutable (cannot be undone)&lt;/li&gt;
&lt;li&gt;Anonymous&lt;/li&gt;
&lt;li&gt;Limited supply and scarcity&lt;/li&gt;
&lt;/ol&gt;
&lt;p&gt;Here's a longer list from &lt;a href="https://conotoxia.com/cryptocurrencies/what-are-cryptocurrencies/cryptocurrency-features"&gt;another page&lt;/a&gt;:&lt;/p&gt;
&lt;ol&gt;
&lt;li&gt;Anonymity&lt;/li&gt;
&lt;li&gt;No intermediary or supervisory bodies&lt;/li&gt;
&lt;li&gt;Security&lt;/li&gt;
&lt;li&gt;No centralization&lt;/li&gt;
&lt;li&gt;Sending cryptocurrencies&lt;/li&gt;
&lt;li&gt;Irreversible transactions&lt;/li&gt;
&lt;li&gt;Fast development&lt;/li&gt;
&lt;/ol&gt;
&lt;p&gt;Here I will talk about &lt;strong&gt;Anonymity&lt;/strong&gt;, &lt;strong&gt;Irreversibility&lt;/strong&gt; and &lt;strong&gt;Decentralization&lt;/strong&gt;.&lt;/p&gt;
&lt;h2&gt;Anonymity&lt;/h2&gt;
&lt;p&gt;Anonymity refers to the fact that crypto transactions happen between wallets; wallets are like accounts, but don't carry any additional meta information, such as name, phone number or address. A blockchain is a public database of transactions between wallets, so there are many websites that display it, for the Bitcoin blockchain, Ethereum, etc. Here is an &lt;a href="https://www.blockchain.com/explorer/transactions/btc/bc3aca038129d383d8ab7fa4b489093a118e179b55761440dbf04ccd27ffdb8e"&gt;example transaction from Bitcoin&lt;/a&gt;:&lt;/p&gt;
&lt;p&gt;&lt;img src="/images/btc-transaction.png" alt="BTC transaction" style="width: 600px;"/&gt;&lt;/p&gt;
&lt;p&gt;The point here is that the sender and receiver are identified by long unreadable strings. In the above, the sender is &lt;a href="https://www.blockchain.com/explorer/addresses/btc/bc1q9yn6zdkjjlh0z5y6sqpdvwq7pwkeh5r0ka28ad"&gt;bc1q9yn6zdkjjlh0z5y6sqpdvwq7pwkeh5r0ka28ad&lt;/a&gt;. Clicking on the link reveals all of the wallet's transactions (the blockchain itself is public, since it's decentralized).&lt;/p&gt;
&lt;p&gt;The problems can be categorized into 2 areas, technical and social.&lt;/p&gt;
&lt;p&gt;Technically, one of the problems is that once a wallet's owner is revealed, then all past transactions involving that wallet can also be tied to the owner. So anonymity on the Blockchain is an all-or-nothing deal. If your identity gets revealed in connection with even one transaction, then suddenly your entire transaction history, including amounts and partner wallets, is also revealed. Suppose, for example, you tell a friend your wallet address &lt;code&gt;bc1q9yn6zdkjjlh0z5y6sqpdvwq7pwkeh5r0ka28ad&lt;/code&gt; and that person sends you 1 BTC; then 5 years later the FBI asks her to reveal the identities of her Bitcoin transaction partners, and she reveals that it's you; now the police will trivially see your complete transaction history just by visiting the above link, even if you never set foot in the US.&lt;/p&gt;
&lt;p&gt;The main social problem is that anonymity is just not a sustainable feature. Democratic governments have legitimate reasons to know the financial (and non-financial) transactions of their citizens, for example to collect taxes or track criminals. Also, there is no technological solution to this; a nation state can simply outlaw anonymous transactions, and citizens must comply, otherwise men with guns will take them away. A common use-case for Blockchains that I hear is tracking the ownership of physical objects such as cars or land. In this case again, anonymity is a non-starter: the government needs to know who each wallet belongs to, so it can issue fines for speeding or identify the person to contact if a highway needs to be built on top of a piece of land.&lt;/p&gt;
&lt;h2&gt;Irreversibility&lt;/h2&gt;
&lt;p&gt;Irreversibility in the context of blockchains means that once a transaction is committed to the Blockchain, from a technical standpoint, it cannot be undone. This is due to the decentralized nature of the Blockchain; simply put, no single actor or node can go in and do it. Having said that, transactions are reversible in the sense that the receiver can choose to send back whatever she received to the original sender — but there is no way (technically) to &lt;em&gt;force&lt;/em&gt; her.&lt;/p&gt;
&lt;p&gt;The main problem here is social — sometimes you want to reverse a transaction, for example because a mistake was made and somebody typed extra zeros! It's worth pointing out that legally, if A accidentally sends too much money to B (eg. 1,000 BTC instead of 1,000 USD), B cannot just keep it.&lt;/p&gt;
&lt;p&gt;A good example is when &lt;a href="https://www.theguardian.com/technology/2022/aug/31/crypto-com-accidentally-transfers-10-5-million-australian-woman-melbourne-cryptocurrency"&gt;a crypto exchange accidentally sent $10m to an Australian woman&lt;/a&gt;, and then sued her to get the money back. I guess in these cases having a centralized government that can force transactions to be reversed is not so bad. It'd be great if we had software that supports this! Note that in this case, the only reason suing was even possible was that the exchange sent this money to one of its own users, whose identity was known. If this was a regular on-chain transaction to a wallet that is not managed by the exchange, undoing the mistake would be much harder or impossible.&lt;/p&gt;
&lt;p&gt;Other, similar cases:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;&lt;a href="https://www.theverge.com/2022/8/31/23330458/crypto-dot-com-typo-mistake-refund-audit"&gt;Crypto.com mistakenly sent a customer \$7.2 million instead of a $68 refund&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://www.theverge.com/2022/11/13/23456044/crypto-dot-com-accidentally-sent-400-million-wrong-recipent-ethereum"&gt;Crypto.com sent \$400 million to the wrong recipient, but got it back this time&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://edition.cnn.com/2022/11/12/business/ftx-hack/index.html"&gt;FTX investigating a possible \$473 million hack&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://www.quora.com/I-think-someone-sent-me-2-041-Bitcoins-accidentally-Is-there-a-way-they-can-reverse-it-I-want-to-keep-the-Bitcoins"&gt;Quora: I think someone sent me 2.041 Bitcoins accidentally. Is there a way they can reverse it? I want to keep the Bitcoins&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://www.quora.com/Someone-I-dont-know-has-randomly-sent-me-18-500-Bitcoins-Is-there-a-way-for-him-to-chargeback-Should-I-call-the-cops"&gt;Quora: Someone I don't know has randomly sent me 18,500 Bitcoins. Is there a way for him to chargeback? Should I call the cops?&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;h2&gt;Decentralization&lt;/h2&gt;
&lt;p&gt;The positive arguments for decentralization are: no one person or entity controls the Blockchain, so it can't be frozen (the way a bank can freeze your assets) or tampered with. One example is that central banks, which control the supply of fiat money, can "print" new money, hence creating inflation. The argument is that, since there is no central bank with cryptocurrencies, there can be no inflation, at least by a single entity's decision to create new money.&lt;/p&gt;
&lt;p&gt;There are multiple practical problems with this reasoning. As argued above, sometimes it's actually quite useful if an "administrator" can reverse transactions, for example because the amount field contains a clerical mistake, or the destination address is non-existent and the cryptocurrencies are now lost.&lt;/p&gt;
&lt;p&gt;Another problem is, as argued in the &lt;a href="https://bytepawn.com/my-thoughts-on-fiat-and-crypto-currencies.html"&gt;second article&lt;/a&gt;, inflation or not, the exchange rate of cryptocurrencies like BTC varies wildly, including in the inflationary direction (getting less valuable compared to USD). To a user of the currency, it doesn't really matter what the cause is if their currency buys 3x fewer real-life goods in a year's time.&lt;/p&gt;
&lt;p&gt;&lt;img src="/images/bigmac_btc2.jpg" alt="BTC transaction" style="width: 600px;"/&gt;&lt;/p&gt;
&lt;p&gt;Finally, as shown in the &lt;a href="https://bytepawn.com/my-thoughts-on-crypto-explained-using-ftx-tokens.html"&gt;first article&lt;/a&gt;, for many cryptocurrencies [implemented as smart contracts on Ethereum, such as FTT and many stablecoins], it's not true that they are not inflationary. In these cases, when the initial coin is created, a large amount of reserves is also created, held by the creator, usually a company. This, in effect, creates control and leverage to cause inflation: if the reserves of a stablecoin start to be circulated without backing USDs, that causes the "stablecoins" to actually be worth less than the intended exchange rate (usually 1 USD), similar to inflation.&lt;/p&gt;
&lt;h2&gt;Conclusion&lt;/h2&gt;
&lt;p&gt;I have argued that anonymity, irreversibility and decentralization, while interesting technical features that are implemented in a fascinating way using cryptographic primitives, are not practical for real-world use.&lt;/p&gt;
&lt;p&gt;In the &lt;a href="https://bytepawn.com/my-thoughts-on-crypto-explained-using-ftx-tokens.html#my-thoughts-on-crypto-explained-using-ftx-tokens"&gt;previous article&lt;/a&gt;, I discussed some aspects of cryptocurrencies through the example of FTX Tokens. Here I will compare cryptocurrencies to traditional fiat currencies.&lt;/p&gt;
&lt;p&gt;All articles in this series:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;&lt;a href="https://bytepawn.com/my-thoughts-on-crypto-explained-using-ftx-tokens.html"&gt;Crypto I: On cryptocurrencies, explained using FTX Tokens&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://bytepawn.com/my-thoughts-on-fiat-and-crypto-currencies.html"&gt;Crypto II: On fiat currencies vs. crypto currencies&lt;/a&gt; &lt;/li&gt;
&lt;li&gt;&lt;a href="https://bytepawn.com/my-thoughts-on-crypto-advantages-such-as-anonymity-and-irreversibility.html"&gt;Crypto III: On perceivd crypto advantages such as anonymity and irreversibility&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://bytepawn.com/my-thoughts-on-crypto-vs-stock-investing.html"&gt;Crypto IV: On stock investing vs. crypto investing&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://bytepawn.com/my-thoughts-on-nfts.html"&gt;Crypto V: On NFTs&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;h2&gt;Bartering, money and the Big Mac index&lt;/h2&gt;
&lt;p&gt;One of the core functions of money is to replace a barter system. In a barter system, if A makes apples and B makes bananas, they can agree on an apple/banana exchange rate, and trade between each other — this is called bartering. A third actor in the economy, C, produces cucumbers. Suppose it so happens that A wants B's bananas, B wants C's cucumbers and C wants A's apples (a directed cyclic graph) — in this scenario pairwise trades are not possible and the economy is blocked. It would be nice if they could use some intermediate asset to trade and unblock the economy. In the real world, we use fiat money such as USD for this: A, B and C set their prices in USD, and assuming everybody has sufficient reserves of USD, they can trade, even if they don't want each other's goods that day.&lt;/p&gt;
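&lt;p&gt;The blocked cycle above can be made concrete in a few lines of Python. This is a toy sketch with made-up function names: pairwise barter requires a mutual want (a 2-cycle), while a common currency lets any cycle of wants clear:&lt;/p&gt;

```python
# Toy model: each producer wants exactly one other producer's goods.
# Pairwise barter works only if two actors want each other's goods;
# with money, each actor simply buys from the producer they want.

wants = {"A": "B", "B": "C", "C": "A"}  # A wants B's goods, etc.

def barter_possible(wants):
    """True if any two actors mutually want each other's goods (a 2-cycle)."""
    return any(wants.get(wanted) == producer for producer, wanted in wants.items())

def clears_with_money(wants, balances, price=1):
    """Each actor buys the goods they want for `price` units of money."""
    for buyer, seller in wants.items():
        if balances[buyer] < price:
            return False
        balances[buyer] -= price
        balances[seller] += price
    return True

balances = {"A": 1, "B": 1, "C": 1}
print(barter_possible(wants))               # False: no mutual wants, economy blocked
print(clears_with_money(wants, balances))   # True: money unblocks the cycle
```

&lt;p&gt;Note that after the cycle clears, everybody's money balance is unchanged: the currency only acted as an in-flight intermediary.&lt;/p&gt;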
&lt;p&gt;For a currency to be acceptable to the participants of the economy, it has to be somewhat stable. Suppose D is a Data Scientist, and gets her salary in the common currency, let's say it's 100 USD. For USD to be a useful currency, in the sense that D accepts it as payment, she has to be able to buy basic goods, such as apples, bananas, cucumbers — and Big Macs at a relatively stable price.&lt;/p&gt;
&lt;p&gt;The Big Mac Index has been maintained by The Economist since 1986, and tracks the price of a Big Mac hamburger across time in various countries; the price is recorded twice a year. Let's see what the price of a Big Mac looks like in USD and BTC, for the past 5 years (10 datapoints):&lt;/p&gt;
&lt;p&gt;&lt;img src="/images/bigmac_btc2.jpg" alt="Price of a Big Mac in BTC and USD" style="width: 600px;"/&gt;&lt;/p&gt;
&lt;p&gt;The image shows that getting a (relatively) fixed salary in USD and then buying basic goods is a safe bet. The hi/lo ratio on the USD chart is 1.12: this is the inflationary effect, 12% over these 5 years. However, the hi/lo ratio on the BTC chart is 17.5, which shows that getting a salary in BTC, or exchanging goods in BTC, is not a safe bet. Your BTC salary, or the BTC you receive for your goods, could be worth significantly less in a couple of months or years.&lt;/p&gt;
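&lt;p&gt;The hi/lo ratio used here is simply the highest price in the series divided by the lowest. A minimal sketch; the USD prices below are illustrative placeholders, not the actual Big Mac Index datapoints:&lt;/p&gt;

```python
# Hi/lo ratio of a price series: highest price divided by lowest price.
# The prices below are illustrative placeholders, not the actual
# Big Mac Index datapoints.

def hi_lo_ratio(prices):
    return max(prices) / min(prices)

usd_prices = [5.06, 5.28, 5.51, 5.58, 5.66, 5.71, 5.66, 5.81, 5.15, 5.36]
print(round(hi_lo_ratio(usd_prices), 2))   # 1.15
```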
&lt;h2&gt;BTC/USD&lt;/h2&gt;
&lt;p&gt;But wait, the value of BTC actually went up! In the case of USD, the hi/lo is 1.12, and the hi is at the tail end, meaning 1 USD buys fewer Big Macs than previously. With BTC, it's the other way around: 1 BTC buys 17.5x more Big Macs than 5 years ago! So does this invalidate the arguments above?&lt;/p&gt;
&lt;p&gt;No, it does not. If you look at the Big Mac price in BTC, yes, overall it's "favorable" to BTC holders in this snapshot, but there were also times when the price of a Big Mac in BTC went up 3-4x in a year. The problem is, with a speculative asset, wide swings in the positive direction are always eventually matched by wide swings in the negative direction — and if this affects your livelihood, such as being able to buy basic goods with your salary, that is unacceptable.&lt;/p&gt;
&lt;p&gt;This shows what everybody knows anyway — cryptocurrencies like BTC are not a useful currency for trading goods or getting salaries, because they are speculative.&lt;/p&gt;
&lt;p&gt;&lt;img src="/images/btc_usd_all.jpg" alt="BTC/USD" style="width: 600px;"/&gt;&lt;/p&gt;
&lt;h2&gt;Stablecoins&lt;/h2&gt;
&lt;p&gt;Okay, but BTC is just one type of cryptocurrency, there are 1000s of others. Maybe another one is better? Most cryptocurrencies suffer from the same problem as BTC, they are speculative, so they are not a useful currency for trading goods or getting salaries.&lt;/p&gt;
&lt;p&gt;One exception to this could be stablecoins. However, stablecoins are not &lt;em&gt;really&lt;/em&gt; useful in the above scenario, since stablecoins assume the presence of a fiat currency like USD to which they are pegged — so the actors in the economy might as well use USD. You could say that that is right, but cryptocurrencies have other desirable &lt;em&gt;technical&lt;/em&gt; advantages, like anonymity, irreversibility, etc. Regarding these perceived advantages, let's see how stablecoins work out in practice. This portion is repeated from the &lt;a href="https://bytepawn.com/my-thoughts-on-crypto-explained-using-ftx-tokens.html#my-thoughts-on-crypto-explained-using-ftx-tokens"&gt;previous article&lt;/a&gt;.&lt;/p&gt;
&lt;p&gt;A &lt;a href="https://en.wikipedia.org/wiki/Stablecoin"&gt;stablecoin&lt;/a&gt; is a token whose value is pegged to a traditional fiat currency. Most stablecoins are pegged to \$1 USD. The most popular stablecoins at the time of writing are:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;&lt;a href="https://coinmarketcap.com/currencies/tether/"&gt;Tether (USDT)&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://coinmarketcap.com/currencies/usd-coin/"&gt;USD Coin (USDC)&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://coinmarketcap.com/currencies/binance-usd/"&gt;Binance USD (BUSD)&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;The way stablecoins are supposed to work is roughly like this: a company releases a new stablecoin, let's say it's pegged to \$1. The company mints a large amount of the stablecoins for itself (reserve), and then sells some of them to users for \$1 (in circulation). So users give the company \$1 and get 1 stablecoin in exchange. The company &lt;strong&gt;guarantees that it holds on to the \$1 in its bank accounts&lt;/strong&gt;, ie. it doesn't squander it away. Users can then transact with their stablecoins, enjoying the benefits of cryptocurrencies. If they ever need USD, for example because they want to buy bread, or they need to pay their US taxes, they can sell their stablecoins back to the minting company and get real dollars back. Sounds great and is a very useful concept in the crypto world.&lt;/p&gt;
&lt;p&gt;So, in this case, is there new value created? If everybody plays their part (the price of the token remains \$1 and the company holds on to the reserves), users &lt;strong&gt;can safely pretend that 1 stablecoin is worth \$1&lt;/strong&gt;. So in this sense, user value is created, but the &lt;strong&gt;stablecoin itself has no value detached from the reserves&lt;/strong&gt;. In reality, the company has \$1 and a liability of \$1, which nets to 0. And the user has a stablecoin, which is the other side of that liability. So the net value in the system is still \$1 (not \$2). For this reason, &lt;strong&gt;the minting company cannot pretend that its reserve stablecoins are worth real dollars&lt;/strong&gt;.&lt;/p&gt;
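&lt;p&gt;The accounting above can be written down as a toy ledger. This is a sketch of the idealized, fully backed scheme only; the class and method names are my own:&lt;/p&gt;

```python
# Toy ledger for a fully backed stablecoin issuer, per the accounting
# above: every coin sold into circulation is matched by $1 of reserves,
# and unsold (reserve) tokens carry no backing and no value.

class StablecoinIssuer:
    def __init__(self, minted):
        self.unsold_tokens = minted   # reserve tokens, unbacked
        self.circulating = 0
        self.usd_reserves = 0.0

    def sell(self, n):
        """A user buys n coins for $n; the issuer keeps the dollars as backing."""
        self.unsold_tokens -= n
        self.circulating += n
        self.usd_reserves += n * 1.0

    def redeem(self, n):
        """A user returns n coins and gets $n of real dollars back."""
        self.circulating -= n
        self.unsold_tokens += n
        self.usd_reserves -= n * 1.0

    def fully_backed(self):
        return self.usd_reserves >= self.circulating * 1.0

issuer = StablecoinIssuer(minted=1_000_000)
issuer.sell(100)
print(issuer.fully_backed())   # True: $100 of reserves backs 100 circulating coins
issuer.redeem(40)
print(issuer.usd_reserves)     # 60.0
```

&lt;p&gt;At every step the issuer's dollars equal its circulating liability, which nets to zero: no value is created by minting.&lt;/p&gt;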
&lt;p&gt;I came up with the following analogy to stablecoins: imagine there is a cult of finance people called Purists, living in a village called Puretown. Purists really like the feel and smell of freshly printed USDs. So they go to the bank, get 1 million \$1 bills, and bring it home to marvel at it. But they don't want to use it in circulation, because then it won't be as nice and fresh. So they photocopy each one, and agree that in their village they will use the photocopied bills to pay each other, while the Mayor holds on to the real USD bills in a safe at city hall. And, if they ever need a real dollar bill to transact outside their village (ecosystem), they can go back to the Mayor of Puretown, who holds the real USD reserves, give him a photocopied bill, and get back a real USD bill. We can agree that no value was created (they didn't magically make \$2M out of \$1M), we just have a bunch of cultists using monopoly money in their own ecosystem!&lt;/p&gt;
&lt;p&gt;Also, it's not true that the reserves of the stablecoin (the non-circulating tokens that are still with the company) are now worth real dollars. Those are unbacked tokens, worth nothing.&lt;/p&gt;
&lt;p&gt;&lt;img src="/images/crypto-venn.png" alt="Crypto venn diagram" style="width: 400px;"/&gt;&lt;/p&gt;
&lt;p&gt;The problem with stablecoins always comes from the Psychology circle in the Venn-diagram above. There are two (related) failure modes:&lt;/p&gt;
&lt;ol&gt;
&lt;li&gt;The minting company's executives cannot resist and will (secretly) give themselves some of the stablecoins, &lt;em&gt;without&lt;/em&gt; putting in the equivalent backing USDs — they want to make money out of thin air. There are now unbacked stablecoins in the system, so in reality each coin is now worth less, say \$0.99 instead of \$1.00. But they attempt to keep this a secret, so everybody believes and/or pretends the coin is worth \$1 (the exchange rate charts still show \$1), and withdraws per that exchange rate when getting back real dollars. Eventually the company runs out of money. The collapse can be triggered when somebody suspects or finds the unbacked trades, tweets about it, and a bank run happens.&lt;/li&gt;
&lt;li&gt;The minting company is supposed to hold on to the USD reserves, but the company's executives cannot resist and start making risky investment bets with the money, with the hope of turning the reserves into more money (and keeping the difference). Eventually one of the trades doesn't work out, and the reserves are now under water compared to the stablecoins in circulation. As in the previous failure case, a stablecoin is now worth less, say \$0.99 instead of \$1.00. But they attempt to keep this a secret, so everybody believes and/or pretends the coin is worth \$1... same story.&lt;/li&gt;
&lt;/ol&gt;
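&lt;p&gt;Both failure modes end the same way: reserves fall below the circulating supply. A toy calculation of what a coin is then really worth, assuming pro-rata redemption (my own simplification):&lt;/p&gt;

```python
# Both failure modes reduce the backing ratio below 1, so the true
# value of a coin drops under $1 even while the quoted price stays
# at $1. Pro-rata value if every holder tried to redeem at once:

def true_value_per_coin(usd_reserves, circulating):
    """Value of one coin under pro-rata redemption, capped at the $1 peg."""
    if circulating == 0:
        return 1.0
    return min(1.0, usd_reserves / circulating)

# Failure mode 1: executives quietly issue coins without backing them.
print(true_value_per_coin(usd_reserves=99.0, circulating=100))   # 0.99

# Failure mode 2: a risky bet loses 20% of the reserves.
print(true_value_per_coin(usd_reserves=80.0, circulating=100))   # 0.8
```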
&lt;p&gt;Note: there are also algorithmic stablecoins, which do not have any backing USDs at all. These are pure fantasy disguised as complex algorithmic mumbo-jumbo. The most famous of these was Terra, which got "de-pegged" in 2022:&lt;/p&gt;
&lt;p&gt;&lt;img src="/images/terra-usd.jpg" alt="Terra USD" style="width: 600px;"/&gt;&lt;/p&gt;
&lt;p&gt;So, in practice, stablecoins have also not proven to be a useful currency for trading goods or getting salaries.&lt;/p&gt;
&lt;p&gt;So far I have argued that:&lt;/p&gt;
&lt;ol&gt;
&lt;li&gt;In practice, BTC is not a useful currency for trading goods or getting salaries, fiat currencies like the USD are.&lt;/li&gt;
&lt;li&gt;Since stablecoins assume the presence of the USD, the actors in the economy might as well use USD.&lt;/li&gt;
&lt;li&gt;In practice, stablecoins have proven to be hotbeds for fraud.&lt;/li&gt;
&lt;/ol&gt;
&lt;h2&gt;Laws and taxes&lt;/h2&gt;
&lt;p&gt;The fundamental difference between fiat money and cryptocurrencies is that fiat money's stability is backed by men with guns. Fiat money is issued by nation states; nation states have laws and collect taxes; if you don't obey the laws, or don't pay your taxes, men with guns will take your money and/or freedom away; there is no opt-out. Note: in some countries, you can opt-out by leaving the country; in others, such as the US, you have to send taxes home even if you're abroad.&lt;/p&gt;
&lt;p&gt;This stabilizes nation states' fiat currencies in 3 ways:&lt;/p&gt;
&lt;ol&gt;
&lt;li&gt;Nation states levy taxes on their citizens and corporations, which must be paid in their fiat currency, thus creating demand for it.&lt;/li&gt;
&lt;li&gt;Nation states issue laws that their own fiat currency must be accepted as payment in their territories.&lt;/li&gt;
&lt;li&gt;Nation states issue laws that salaries must be paid in their own fiat currency.&lt;/li&gt;
&lt;/ol&gt;
&lt;p&gt;Because of these factors, there is no scenario where all participants lose hope in the fiat currency, stop using it, causing the currency to be useless/worthless — at least not without a major regime change or (civil) war, which also changes the loyalty of men with guns. But this happens all the time with crypto (multiple times a year in the US and western countries), much more frequently than wars. The USD has been around since 1792, the GBP since 1707.&lt;/p&gt;
&lt;p&gt;Ironically, many cryptocurrency users are trying to avoid laws and taxes, which are the exact things which would make their currency stable.&lt;/p&gt;
&lt;h2&gt;Real uses for BTC&lt;/h2&gt;
&lt;p&gt;Having said the above, there have been cases when cryptocurrencies have provided real (non-speculative) value for people. A good example is a mismanaged country where citizens' money in the bank (whether denominated in the nation's currency or USD) is frozen and eventually lost. In these cases, one of the options to get one's money out is to convert it to BTC, and move it abroad.&lt;/p&gt;
&lt;p&gt;However, I would argue that this is a temporary use-case. Nation states (and their money) fail much less frequently than cryptocurrencies, and even the longest-lived cryptocurrency, BTC, had a 17.5x hi/lo ratio in the last 5 years — so it can act as a good technical vehicle for moving money out of a troubled country for a short period of time, but it is not a stable store of value or a useful currency.&lt;/p&gt;
&lt;h2&gt;Greater fool theory&lt;/h2&gt;
&lt;p&gt;At the end of the day, both fiat and cryptocurrencies have no intrinsic value — they depend on the participants of the economy agreeing to use them for exchange, at certain exchange rates for goods.&lt;/p&gt;
&lt;p&gt;Fiat money has attributed value because a government declares it legal tender and enforces its use through laws and taxes, backed by men with guns; cryptocurrencies are purely belief based. And history shows the former is stronger than the latter.&lt;/p&gt;
&lt;p&gt;Because cryptocurrencies are not useful as actual everyday money, their users (try to) use them as a speculative store of value, which is essentially a &lt;a href="https://en.wikipedia.org/wiki/Ponzi_scheme"&gt;Ponzi scheme&lt;/a&gt; or &lt;a href="https://en.wikipedia.org/wiki/Greater_fool_theory"&gt;greater fool theory&lt;/a&gt;. The only reason it makes sense for me to buy BTC at \$100 is if I believe there will be a greater fool who will buy it at \$101 from me. For her, it only makes sense to buy from me at \$101 if she believes there will be an even greater fool at \$102, and so on. Any profit made on cryptocurrencies is someone else's loss, since no economic value is created. Note that this is not the case for stocks, where the underlying asset, the company, is creating value through its revenues and profits.&lt;/p&gt;
&lt;h2&gt;Conclusion&lt;/h2&gt;
&lt;p&gt;In the next and final article on cryptocurrencies, I will talk about the perceived technical advantages of crypto.&lt;/p&gt;</content><category term="crypto"></category><category term="bitcoin"></category><category term="btc"></category><category term="fiat"></category></entry><entry><title>Crypto I: On cryptocurrencies, explained using FTX Tokens</title><link href="https://bytepawn.com/my-thoughts-on-crypto-explained-using-ftx-tokens.html" rel="alternate"></link><published>2022-12-24T00:00:00+01:00</published><updated>2022-12-24T00:00:00+01:00</updated><author><name>Marton Trencseni</name></author><id>tag:bytepawn.com,2022-12-24:/my-thoughts-on-crypto-explained-using-ftx-tokens.html</id><summary type="html">&lt;p&gt;The crypto space is a fascinating intersection of technology, economics and human psychology. However, I remain skeptical of the value of crypto tokens and NFTs. Here I explain my thought process using FTX Tokens as an example.&lt;br/&gt;&lt;br/&gt; &lt;img src="/images/crypto-venn.png" alt="Crypto venn diagram" style="width: 400px;"/&gt;&lt;/p&gt;</summary><content type="html">&lt;h2&gt;Introduction&lt;/h2&gt;
&lt;p&gt;Disclaimer: I have never bought, sold or owned crypto. I have no skin in the game, I just have a semi-informed opinion. If you're technical and into crypto, you may not find anything interesting here. If you find any bugs in my thinking, send comments to &lt;code&gt;mtrencseni-at-gmail-dot-com&lt;/code&gt;.&lt;/p&gt;
&lt;p&gt;I've been interested in the crypto space since the mid 2010s. It's a fascinating &lt;strong&gt;intersection of technology, economics and human psychology&lt;/strong&gt;.&lt;/p&gt;
&lt;p&gt;&lt;img src="/images/crypto-venn.png" alt="Crypto venn diagram" style="width: 400px;"/&gt;&lt;/p&gt;
&lt;p&gt;My stance towards crypto, at a very high level is:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;The use of cryptographic primitives (old and new) to create a blockchain of transactions is creative and fascinating technology.&lt;/li&gt;
&lt;li&gt;I am not aware of any useful, currently existing real-world applications of blockchain [that cannot be better accomplished with a plain-vanilla centralized database].&lt;/li&gt;
&lt;li&gt;There may be useful applications of crypto technologies that we have not thought of yet.&lt;/li&gt;
&lt;li&gt;I believe all crypto currencies (including Bitcoin) are worth \$0.&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;A good starting point to understand cryptographic primitives is one of the old(er) O'Reilly books on the topic, &lt;a href="https://github.com/bitcoinbook/bitcoinbook"&gt;Mastering Bitcoin&lt;/a&gt; (free to read) by &lt;a href="https://aantonop.com/"&gt;Andreas Antonopoulos&lt;/a&gt;. It has simple 2-line Python examples which directly demonstrate how cryptographic primitives are used in Blockchains. I read this many years ago and it was a great way to understand the technological building blocks of Blockchains and Bitcoin — highly recommended. The same author also has a second O'Reilly book on the topic, &lt;a href="https://github.com/ethereumbook/ethereumbook"&gt;Mastering Ethereum&lt;/a&gt; (also free to read).&lt;/p&gt;
&lt;p&gt;&lt;img src="/images/mastering-bitcoin.jpg" alt="Mastering bitcoin" style="width: 300px;"/&gt;&lt;/p&gt;
&lt;p&gt;All articles in this series:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;&lt;a href="https://bytepawn.com/my-thoughts-on-crypto-explained-using-ftx-tokens.html"&gt;Crypto I: On cryptocurrencies, explained using FTX Tokens&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://bytepawn.com/my-thoughts-on-fiat-and-crypto-currencies.html"&gt;Crypto II: On fiat currencies vs. crypto currencies&lt;/a&gt; &lt;/li&gt;
&lt;li&gt;&lt;a href="https://bytepawn.com/my-thoughts-on-crypto-advantages-such-as-anonymity-and-irreversibility.html"&gt;Crypto III: On perceivd crypto advantages such as anonymity and irreversibility&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://bytepawn.com/my-thoughts-on-crypto-vs-stock-investing.html"&gt;Crypto IV: On stock investing vs. crypto investing&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://bytepawn.com/my-thoughts-on-nfts.html"&gt;Crypto V: On NFTs&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;h2&gt;Nature of money&lt;/h2&gt;
&lt;p&gt;I find thinking and reading about the nature of money fascinating. In the last 10 years, it's been impossible to think about this topic without pondering the value of crypto tokens and NFTs. At the time of writing, we have just witnessed the collapse of the crypto exchange FTX, so to make it interesting, I will use the FTX exchange token &lt;a href="https://coinmarketcap.com/currencies/ftx-token/"&gt;FTX Token&lt;/a&gt; — abbreviated &lt;code&gt;FTT&lt;/code&gt; — as an example. However, my argumentation is independent of the collapse of FTX; it equally applies to still-running tokens such as Binance's &lt;a href="https://coinmarketcap.com/currencies/bnb/"&gt;Binance Coin&lt;/a&gt; as well.&lt;/p&gt;
&lt;p&gt;FTT, like many other coins, is implemented as an &lt;a href="https://docs.openzeppelin.com/contracts/3.x/erc20"&gt;ERC20 smart contract&lt;/a&gt; on top of Ethereum. The &lt;a href="https://github.com/ftexchange/ftt/blob/master/contracts/FTT.sol"&gt;Github repo for FTT is here&lt;/a&gt;. It's essentially boilerplate, with the actual "implementation" of the token in the 11-line file &lt;code&gt;ftt/contracts/ftt.sol&lt;/code&gt;:&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;span class="n"&gt;pragma&lt;/span&gt; &lt;span class="n"&gt;solidity&lt;/span&gt; &lt;span class="o"&gt;^&lt;/span&gt;&lt;span class="mf"&gt;0.5&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;

&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="s1"&gt;&amp;#39;openzeppelin-solidity/contracts/token/ERC20/ERC20.sol&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="s1"&gt;&amp;#39;openzeppelin-solidity/contracts/token/ERC20/ERC20Detailed.sol&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="s2"&gt;&amp;quot;openzeppelin-solidity/contracts/token/ERC20/ERC20Burnable.sol&amp;quot;&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;

&lt;span class="n"&gt;contract&lt;/span&gt; &lt;span class="n"&gt;FTT&lt;/span&gt; &lt;span class="ow"&gt;is&lt;/span&gt; &lt;span class="n"&gt;ERC20&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;ERC20Detailed&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;ERC20Burnable&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="n"&gt;constructor&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt; &lt;span class="n"&gt;ERC20Detailed&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;FTT&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s1"&gt;&amp;#39;FTX Token&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;18&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="n"&gt;public&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="n"&gt;_mint&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;msg&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;sender&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;350&lt;/span&gt;&lt;span class="n"&gt;_000_000&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt; &lt;span class="mi"&gt;10&lt;/span&gt; &lt;span class="o"&gt;**&lt;/span&gt; &lt;span class="mi"&gt;18&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
    &lt;span class="p"&gt;}&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/pre&gt;&lt;/div&gt;


&lt;p&gt;The smart contract is declaring a new token &lt;code&gt;FTX Token&lt;/code&gt;, abbreviated &lt;code&gt;FTT&lt;/code&gt;, with 18 decimals of accuracy and 350M tokens minted on creation. It's good to remember that &lt;em&gt;no USD valuation is set or implied here&lt;/em&gt;. The 18 digit accuracy means that you can send 0.000000000000000001 FTT between wallets (17 zeros after the decimal), but you cannot send 0.0000000000000000001 (18 zeros after the decimal). The &lt;code&gt;350_000_000 * 10 ** 18&lt;/code&gt; bit is a bit misleading: is it minting 350M tokens or 350M x $10^{18}$ tokens? Internally, these tokens use integer arithmetic, so technically there are 350M x $10^{18}$ indivisible units. By convention 1 FTT token is $10^{18}$ units, and this determines what fractions of FTTs can be transferred between wallets.&lt;/p&gt;
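&lt;p&gt;The integer-unit convention can be sketched in a few lines of Python. This is illustrative; &lt;code&gt;to_units&lt;/code&gt; is my own helper, not part of any ERC20 library:&lt;/p&gt;

```python
from fractions import Fraction

# The ERC20 convention described above: balances are stored as integers
# in the smallest unit, and "1 FTT" is by convention 10**18 of those
# units. Amounts finer than one unit cannot be represented.

DECIMALS = 18
UNITS_PER_TOKEN = 10 ** DECIMALS

def to_units(amount):
    """Convert a human-readable token amount (given as a string) to integer units."""
    units = Fraction(amount) * UNITS_PER_TOKEN
    if units.denominator != 1:
        raise ValueError("amount is finer than the smallest transferable unit")
    return int(units)

print(to_units("1"))                       # 1000000000000000000
print(to_units("0.000000000000000001"))    # 1, the smallest transferable amount
```

&lt;p&gt;Passing an amount with 18 zeros after the decimal, such as &lt;code&gt;"0.0000000000000000001"&lt;/code&gt;, raises an error, because it falls below one indivisible unit.&lt;/p&gt;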
&lt;p&gt;I think it's good to look at this and remember: &lt;strong&gt;no value is being created here&lt;/strong&gt;.&lt;/p&gt;
&lt;p&gt;So who gets the 350M FTTs? Unsurprisingly, the wallet who executes this initial smart contract on Ethereum, &lt;code&gt;msg.sender&lt;/code&gt; in the code. We can look at the initial transaction from &lt;code&gt;2019-04-21 13:21:27 UTC&lt;/code&gt; on the Ethereum blockchain to see who it is:&lt;/p&gt;
&lt;p&gt;&lt;a href="https://etherscan.io/tx/0x048906570650e3c282e42832de167a0270c7890c77192362a0bff0f5bf415d3b"&gt;https://etherscan.io/tx/0x048906570650e3c282e42832de167a0270c7890c77192362a0bff0f5bf415d3b&lt;/a&gt;&lt;/p&gt;
&lt;p&gt;&lt;img src="/images/etherscan-ftt-genesis-tx.jpg" alt="FTT genesis transaction" style="width: 400px;"/&gt;&lt;/p&gt;
&lt;p&gt;This transaction shows up technically as coming from the NULL address &lt;code&gt;0x0000000000000000000000000000000000000000&lt;/code&gt; and going to the destination address &lt;code&gt;0x772589e99bc9c54dd40acb7d73f88ccbc9d9cf47 (FTX: Deployer)&lt;/code&gt;. This is completely fine, this is just the technical act of printing 350M monopoly crypto money and calling it &lt;code&gt;FTT&lt;/code&gt;. Note that there is a fee (called gas) of using the Ethereum blockchain and network, 0.000000036 Ether for this transaction. This initial transaction contains the smart contract that we saw on Github, with the base classes &lt;code&gt;ERC20&lt;/code&gt;, &lt;code&gt;ERC20Detailed&lt;/code&gt; and &lt;code&gt;ERC20Burnable&lt;/code&gt; unwrapped:&lt;/p&gt;
&lt;p&gt;&lt;a href="https://etherscan.io/address/0x50d1c9771902476076ecfc8b2a83ad6b9355a4c9#contracts"&gt;https://etherscan.io/address/0x50d1c9771902476076ecfc8b2a83ad6b9355a4c9#contracts&lt;/a&gt;&lt;/p&gt;
&lt;h2&gt;Market cap&lt;/h2&gt;
&lt;p&gt;There are a lot of crypto sites that track the historic and actual USD exchange rate or ticker price (and various other metrics, such as trading volume and "market cap") of crypto tokens, including &lt;code&gt;etherscan.io&lt;/code&gt; itself. Here's a &lt;a href="https://etherscan.io/token/0x50d1c9771902476076ecfc8b2a83ad6b9355a4c9#tokenAnalytics"&gt;chart for the historic FTT/USD exchange rate&lt;/a&gt;:&lt;/p&gt;
&lt;p&gt;&lt;img src="/images/ftt-historical-price-usd.png" alt="FTT genesis transaction" style="width: 400px;"/&gt;&lt;/p&gt;
&lt;p&gt;It starts off around \$1, briefly reaches a high of \$80, for a long time it moves between \$20 and \$40, and eventually crashes back to \$1 when FTX collapsed. Most of these tracker sites also display a "market cap" for coins, which for FTT would be the 350M tokens we saw in the initial transaction, times this USD rate. So initially it was around \$350M, and briefly it touched \$80 x 350M = \$28B, that's 28 billion USD.&lt;/p&gt;
&lt;p&gt;Note: I don't quite understand why the exchange rate is not \$0 right now. There is still trading volume, so even though FTX has crashed, some people and/or bots are still buying and selling &lt;code&gt;FTT&lt;/code&gt;s. Since &lt;code&gt;FTT&lt;/code&gt; is a smart contract running on Ethereum, this is certainly technically possible irrespective of the fate of FTX the company.&lt;/p&gt;
&lt;h2&gt;Value creation (or lack therof)&lt;/h2&gt;
&lt;p&gt;So here's the million dollar question: &lt;em&gt;how do you go from writing an 11 line smart contract, which actually only has 2 significant lines, to a \$28B valuation?&lt;/em&gt;&lt;/p&gt;
&lt;p&gt;In my opinion, the simple answer is: &lt;strong&gt;you don't&lt;/strong&gt;. FTT was always worth \$0, the value of these coins is just a collective delusion.&lt;/p&gt;
&lt;p&gt;Let's use FTT as a specific example. FTT was the so-called &lt;a href="https://www.coindesk.com/learn/what-is-an-exchange-token/"&gt;Exchange Token&lt;/a&gt; for FTX. Crypto exchanges are a place to exchange fiat money like USD into crypto tokens like Bitcoin or FTT, and buy/sell between pairs of crypto tokens. Let's use the example that is most often cited as a real-world use-case (value creation) of crypto: fast hassle-free sending of money between Alice in country A and Bob in country B. Let's assume that with regular banking this is slow and cumbersome, burdened by old computer systems and KYC regulations. Let's also assume that FTX is present in both countries A and B and has bank accounts in both countries. Both Alice and Bob set up an FTX account. Bob is a Data Scientist and works for Alice, so at the end of the month Alice wants to send Bob his salary. Note that Bob wants to buy food with his \$1,000 salary, which is not possible with crypto, so he needs to get his salary in his local currency. With an exchange like FTX, the solution is:&lt;/p&gt;
&lt;ol&gt;
&lt;li&gt;Alice transfers \$1,000 to FTX&lt;/li&gt;
&lt;li&gt;Irrespective of the current FTT/USD exchange rate (let's say it's \$2) Alice buys FTT; so she ends up with 500 FTT tokens (so now Alice owns 500 of the 350M FTT tokens)&lt;/li&gt;
&lt;li&gt;Alice sends the 500 FTTs to Bob (Bob's wallet)&lt;/li&gt;
&lt;li&gt;Bob sells the FTTs on FTX (to FTX or another user) to end up with \$999 USD&lt;/li&gt;
&lt;li&gt;Bob transfers the \$999 to his local account, in the process converting it to his local currency&lt;/li&gt;
&lt;li&gt;Bob buys a burger and eats it&lt;/li&gt;
&lt;/ol&gt;
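&lt;p&gt;The six steps above can be sketched as a toy simulation. The \$1 total fee is a made-up number, and charging it in one lump at Bob's sell step is my own simplifying assumption:&lt;/p&gt;

```python
# Toy simulation of the Alice to Bob salary transfer via FTX.
ftt_usd_rate = 2.0  # assumed FTT/USD exchange rate at transfer time
fee_usd = 1.0       # total fees charged by FTX, a made-up number

alice_usd = 1000.0                 # Alice's salary budget for Bob
ftx_usd = alice_usd                # 1. Alice transfers $1,000 to FTX
ftt = alice_usd / ftt_usd_rate     # 2. she buys FTT: 500 tokens at $2
bob_ftt = ftt                      # 3. she sends the 500 FTT to Bob
bob_usd = bob_ftt * ftt_usd_rate - fee_usd  # 4.-5. Bob sells and withdraws
print(bob_usd)  # 999.0
```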
&lt;p&gt;Let's look at the following questions:&lt;/p&gt;
&lt;ol&gt;
&lt;li&gt;Why did Alice use FTT? Why not another token like Bitcoin?&lt;/li&gt;
&lt;li&gt;Why did Bob end up with \$999 instead of \$1,000?&lt;/li&gt;
&lt;li&gt;What is the actual value FTX/FTT has provided here?&lt;/li&gt;
&lt;li&gt;Doesn't the above successful chain prove that FTT was worth \$2?&lt;/li&gt;
&lt;/ol&gt;
&lt;p&gt;&lt;strong&gt;Why did Alice use FTT?&lt;/strong&gt; Why not another token like Bitcoin? Per my understanding, this is where Exchange Tokens are advantageous. FTX charges a fee for using its systems, the same way a traditional bank does, but less — this is completely fine. FTX tells customers that if they buy/sell FTT instead of other tokens (like Bitcoin), it will charge them lower fees. &lt;strong&gt;This is how FTX creates demand for its tokens.&lt;/strong&gt; It's also worth noting that this is pretty rational from Alice and Bob's perspective. In this use-case they are not speculating on the value of FTT, it is simply an in-flight tool to transfer money. Depending on the speed of the transaction, the \$1,000 will only spend a few minutes of its lifetime as FTT, then it gets converted back to USD by Bob, so the risk of FTT crashing is essentially zero.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Why did Bob end up with \$999 instead of \$1,000?&lt;/strong&gt; This is a minor point, so I'll be brief. It's because at various points in the transaction chain FTX charges a fee (to Alice and Bob), the same way as a traditional financial institution would — this is completely fine. In this use-case, the promise of crypto over traditional banking is that lower fees will be charged and money moves faster. In my example, fees totaled \$1; I just made this up.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;What is the actual value FTX/FTT was providing here?&lt;/strong&gt; In this use-case, in my opinion, FTX is providing the exact same value as &lt;a href="https://wise.com"&gt;Wise&lt;/a&gt; (formerly Transferwise): allowing Alice and Bob to move \$1,000 between banks in different countries, by first moving it to FTX's account in country A, and then moving it out of FTX's account in country B, for a fee — the same way Wise would charge a fee. In my opinion, that's it — the FTT token is no more important than the rows in a traditional relational database that Wise might use in its operations.&lt;/p&gt;
&lt;p&gt;Note that what I write here does not imply that this is a useful real-world application of crypto, since the same real-world user value (transferring money between A and B) can be accomplished without a token, as demonstrated by Wise millions of times a day.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Doesn't the above successful chain of transactions prove that FTT was worth \$2?&lt;/strong&gt; This is a key question. In my opinion, the answer is no. Let's run the following thought experiment. Let's assume the above transaction chain is the first one that is run on FTX, shortly after the initial 350M coins were minted. Hopefully we can agree that the minting process itself, those 2 lines of code, did not create value out of thin air, so the 350M FTT tokens are worth \$0. So what's the next step? Initially, FTX has \$0 in their bank account, plus they hold 350M FTTs, and Alice has \$1,000 in her bank account. The combined system has a value of \$1,000. After step 2, FTX has Alice's \$1,000 in their bank account, Alice has \$0 and 500 FTT tokens. Hopefully we can agree that no new value was created out of thin air in the first 2 steps, so by conservation of value, the combined system still has a value of \$1,000 — which means the FTTs have to be worth \$0. In my opinion, repeating these types of transactions millions of times does not change that.&lt;/p&gt;
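&lt;p&gt;The conservation-of-value argument can be made concrete with a few lines of bookkeeping (fees are ignored here to keep the accounting clean):&lt;/p&gt;

```python
# Track real USD in the combined system before and after step 2.
# Minting created tokens, not dollars, so total USD must be conserved.
ftx_usd, ftx_ftt = 0.0, 350_000_000  # after minting: no dollars, all tokens
alice_usd, alice_ftt = 1000.0, 0

# Step 2: Alice buys 500 FTT at $2 each.
alice_usd -= 1000.0
ftx_usd += 1000.0
alice_ftt += 500
ftx_ftt -= 500

total_usd = ftx_usd + alice_usd
print(total_usd)  # 1000.0, so no new dollars appeared, and the FTTs net to $0
```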
&lt;h2&gt;Fake money&lt;/h2&gt;
&lt;p&gt;So why do exchanges like FTX and Binance introduce their own Exchange Tokens? This is where the &lt;strong&gt;psychology&lt;/strong&gt; circle of the Venn diagram comes into play, in my opinion. Many people ("crypto investors") would not agree with me on the above; they would in fact say that many, many transactions of the above type show that FTT has non-zero value — let's say it's \$1. But then, to these people, FTX can sell their remaining (of the 350M) FTT tokens for real money! From the perspective of FTX, if there are people who believe this (and there are), then they have just created money out of thin air: they ran a few lines of smart contract on Ethereum, set up an exchange, and over and above the valuation of FTX itself (the company), they're now sitting on a bunch of monopoly money (FTT tokens) that can be sold for real money.&lt;/p&gt;
&lt;p&gt;Note that in the above example, most of what I described is not illegal. This is not why (or, not only why) FTX founders and executives are headed to jail right now. What I described above is completely standard in the crypto world. You can replace FTX with Binance and &lt;code&gt;FTT&lt;/code&gt; with &lt;code&gt;Binance Coin&lt;/code&gt;; it's the exact same thing. This sort of scheme also exists outside of exchanges, where something that possibly has value (a company like FTX, a game like &lt;a href="https://coinedition.com/coffeezilla-exposes-cryptozoo-play-to-earn-game-that-lost-millions/"&gt;CryptoZoo&lt;/a&gt;, an influencer's marketing reach, like Logan Paul's) is "connected" to a coin by marketing, and "crypto investors" for some reason "assign" or "connect" the potentially real-world value of the original entity (the company's success/revenue/profits/valuation, the game's success/revenue/profits, the influencer's reach) to the token, and hence believe that the tokens are now worth real USDs — even though there is no contractual connection between the two!&lt;/p&gt;
&lt;p&gt;To reflect on an aspect of the failure of FTX, and part of the reason the founders and executives are headed to jail: Even better, FTX can set up a sister company, let's call it Alameda, and Alameda can start buying and selling FTTs at increasingly higher prices! This way the USD exchange rates can climb all the way to \$80! "Crypto investors" who bought at \$1 or \$10 feel like they're making a killing because their FTX account is showing their holdings are worth a lot more USDs than they paid for it! They can even sell their FTTs at this higher price, and withdraw the USDs to their bank account, and end up with real money gains! This works only up to a point though; since no value is created in the system, if people keep selling FTTs and withdrawing USDs at a higher average price than they bought, eventually the FTX bank account will run out of money and fail. This is called a &lt;a href="https://en.wikipedia.org/wiki/Ponzi_scheme"&gt;Ponzi scheme&lt;/a&gt;.&lt;/p&gt;
&lt;p&gt;&lt;img src="/images/sbf-arrested.jpg" alt="SBF" style="width: 400px;"/&gt;&lt;/p&gt;
&lt;h2&gt;The value of FTX (the company) vs the value of FTT (the token)&lt;/h2&gt;
&lt;p&gt;Going back to my salary transfer example, I said that this is essentially what Wise (formerly Transferwise) does. Wise is not a crypto company, however it's publicly traded, and currently has a market cap of \$5B. So it stands to reason that if FTX (the company) was doing something similar, wasn't FTX (the company) also valuable? I would say, yes!&lt;/p&gt;
&lt;p&gt;I would agree that FTX the crypto exchange, for enabling certain useful use-cases for users, and being able to collect fees for that (just like Wise), should have a non-zero company valuation — let's say this number is \$1B. But this has nothing to do with the FTT token's value (or lack thereof). The valuation of FTX is captured by the shares of FTX, which are held by the founders and VC investors — not by FTT token holders! There was no written or unwritten contractual connection between the valuation of FTX and FTT tokens. The proof is in the smart contract code of FTT, which was shown above! It's possible that "crypto investors" made a connection in their head, but there never was a contractual obligation backing that up.&lt;/p&gt;
&lt;h2&gt;Didn't FTX guarantee a (minimum) value of FTT?&lt;/h2&gt;
&lt;p&gt;No. If you look at the &lt;a href="https://webcache.googleusercontent.com/search?q=cache:OdWgk9aa0nQJ:https://help.ftx.com/hc/en-us/articles/360024788391-FTX-Terms-of-Service&amp;amp;cd=1&amp;amp;hl=en&amp;amp;ct=clnk&amp;amp;gl=ae"&gt;cached copy of the Terms of Service&lt;/a&gt;, with reference to FTT, it only says:&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;FTT is the exchange token of the FTX ecosystem.  FTT is not being offered in the United States or to U.S. persons.&lt;br/&gt;&lt;br/&gt;
...&lt;br/&gt;&lt;br/&gt;
Note that, among other things, FTX does not guarantee the right to trade.  While FTX will take all reasonable methods to ensure that accounts do not receive any unwanted fills outside of our standard liquidation procedures, risk engine, and the terms of service, FTX does not as a matter of principle have any obligation to honor people's desires for trades to have occurred that didn't.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;The &lt;a href="https://webcache.googleusercontent.com/search?q=cache:ZnIcQRdEC5QJ:https://help.ftx.com/hc/en-us/articles/360027645972-FTX-Token-FTT-FAQ&amp;amp;cd=2&amp;amp;hl=en&amp;amp;ct=clnk&amp;amp;gl=ae"&gt;FTT FAQ (cached)&lt;/a&gt; also does not mention any guarantees.&lt;/p&gt;
&lt;p&gt;So, you can try to sell your FTTs (or any other crypto token) on the FTX exchange, but if there are no buyers, FTX did &lt;strong&gt;not&lt;/strong&gt; guarantee that they will be the buyer of last resort at some minimum exchange rate, say \$1. That's the whole point: they wanted to keep the real money for themselves! I believe this is true for almost all token minting: in the end, when all is done, the "founders" want to be left with some of the (real) USDs.&lt;/p&gt;
&lt;h2&gt;What about stablecoins?&lt;/h2&gt;
&lt;p&gt;A &lt;a href="https://en.wikipedia.org/wiki/Stablecoin"&gt;stablecoin&lt;/a&gt; is a token whose value is pegged to a traditional fiat currency. Most stablecoins are pegged to \$1 USD. FTT was not a stablecoin. The most popular stablecoins at the time of writing are:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;&lt;a href="https://coinmarketcap.com/currencies/tether/"&gt;Tether (USDT)&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://coinmarketcap.com/currencies/usd-coin/"&gt;USD Coin (USDC)&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://coinmarketcap.com/currencies/binance-usd/"&gt;Binance USD (BUSD)&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;The way stablecoins are supposed to work is roughly like this: a company releases a new stablecoin, let's say it's pegged to \$1. Technically it's the same spiel as above with FTT: the company mints a large amount of the stablecoins for itself (reserve), and then sells some of them to users for \$1 (in circulation). So users give the company \$1 and get 1 stablecoin in exchange. The company guarantees that it holds on to the \$1 in its bank accounts, ie. it doesn't squander it away. Users can then transact with their stablecoins, enjoying the benefits of cryptocurrencies. If they ever need USD, for example because they want to buy bread, or they need to pay their US taxes, they can sell their stablecoins back to the minting company and get real dollars back. Sounds great and is a very useful concept in the crypto world.&lt;/p&gt;
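&lt;p&gt;The intended mechanics can be sketched as a mint/redeem model with one invariant: circulating coins are fully backed by USD reserves. This is my own simplified sketch, with no fees and a hypothetical issuer:&lt;/p&gt;

```python
# Minimal model of a fully-backed $1 stablecoin issuer.
class StablecoinIssuer:
    def __init__(self):
        self.usd_reserve = 0.0  # real dollars held against circulation
        self.circulating = 0.0  # coins sold to users at $1 each

    def buy(self, usd):
        # user pays USD, gets coins 1:1; the dollars go into the reserve
        self.usd_reserve += usd
        self.circulating += usd

    def redeem(self, coins):
        # user returns coins, gets the backing USD back
        self.usd_reserve -= coins
        self.circulating -= coins

    def backing_shortfall(self):
        # dollars missing vs. the "fully backed" invariant; 0.0 when healthy
        return max(0.0, self.circulating - self.usd_reserve)

issuer = StablecoinIssuer()
issuer.buy(100.0)
issuer.redeem(40.0)
print(issuer.backing_shortfall())  # 0.0, as long as nobody touches the reserve
```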
&lt;p&gt;So, in this case, is there new value created? If everybody plays their part (the price of the token remains \$1 and the company holds on to the reserves), users &lt;strong&gt;can safely pretend that 1 stablecoin is worth \$1&lt;/strong&gt;. So in this sense, user value is created, but the &lt;strong&gt;stablecoin itself has no value detached from the reserves&lt;/strong&gt;. In reality, the company has \$1 and a liability of \$1, which nets to \$0. And the user has a stablecoin, which is the other side of that liability. So the net value in the system is still \$1 (not \$2). For this reason, &lt;strong&gt;the minting company cannot pretend that its reserve stablecoins are worth real dollars&lt;/strong&gt;, which is exactly what FTX did.&lt;/p&gt;
&lt;p&gt;I came up with the following analogy to stablecoins: imagine there is a cult of finance people called Purists, living in a village called Puretown. Purists really like the feel and smell of freshly printed USDs. So they go to the bank, get 1 million \$1 bills, and bring it home to marvel at it. But they don't want to use it in circulation, because then it won't be as nice and fresh. So they photocopy each one, and agree that in their village they will use the photocopied bills to pay each other, while the Mayor holds on to the real USD bills in a safe at city hall. And, if they ever need a real dollar bill to transact outside their village (ecosystem), they can go back to the Mayor of Puretown, who holds the real USD reserves, give him a photocopied bill, and get back a real USD bill. We can agree that no value was created (they didn't magically make \$2M out of \$1M), we just have a bunch of cultists using monopoly money in their own ecosystem!&lt;/p&gt;
&lt;p&gt;Also, it's not true that the reserves of the stablecoin (the non-circulating tokens that are still with the company) are now worth real dollars. Those are unbacked tokens, worth nothing.&lt;/p&gt;
&lt;p&gt;The problem with stablecoins always comes from the Psychology circle in the Venn-diagram. There are two (related) failure modes:&lt;/p&gt;
&lt;ol&gt;
&lt;li&gt;The minting company's executives cannot resist and will (secretly) give themselves some of the stablecoins, &lt;em&gt;without&lt;/em&gt; putting in the equivalent backing USDs — they want to make money out of thin air. There are now unbacked stablecoins in the system, so in reality each coin is now worth less, say \$0.99 instead of \$1.00. But they attempt to keep this a secret, so everybody believes and/or pretends the coin is worth \$1 (the exchange rate charts still show \$1), and withdraws per that exchange rate when getting back real dollars. Eventually the company runs out of money. The collapse is typically triggered when somebody suspects or discovers the unbacked trades, tweets about it, and a bank run happens.&lt;/li&gt;
&lt;li&gt;The minting company is supposed to hold on to the USD reserves, but the company's executives cannot resist and start making risky investment bets with the money, with the hope of turning the reserves into more money (and keeping the difference). Eventually one of the trades doesn't work out, and the reserves are now under water compared to the stablecoins in circulation. As in the previous failure case, a stablecoin is now worth less, say \$0.99 instead of \$1.00. But they attempt to keep this a secret, so everybody believes and/or pretends the coin is worth \$1... same story.&lt;/li&gt;
&lt;/ol&gt;
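&lt;p&gt;Both failure modes reduce to the same arithmetic: unbacked coins (or lost reserves) dilute the effective backing per coin. A sketch with illustrative numbers:&lt;/p&gt;

```python
# Effective USD backing per circulating stablecoin.
def backing_per_coin(usd_reserve, circulating):
    return usd_reserve / circulating

print(backing_per_coin(100.0, 100.0))  # 1.0: fully backed
print(backing_per_coin(100.0, 101.0))  # ~0.99: one unbacked coin was minted
print(backing_per_coin(99.0, 100.0))   # 0.99: a risky bet lost $1 of reserves
```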
&lt;p&gt;Note: there are also algorithmic stablecoins, which do not have any backing USDs at all. These are pure fantasy disguised as complex algorithmic mumbo-jumbo. The most famous of these was Terra, which got "de-pegged" this year:&lt;/p&gt;
&lt;p&gt;&lt;img src="/images/terra-usd.jpg" alt="Terra USD" style="width: 400px;"/&gt;&lt;/p&gt;
&lt;h2&gt;Why is everything shown in USD?&lt;/h2&gt;
&lt;p&gt;One thing I always chuckle at is that all the crypto coins are always tracked per the USD exchange rate — that seems to be the topline metric. This is ironic, as crypto was supposed to be a better alternative to fiat money, and by now we have thousands of coins/tokens floating around — yet everything is always measured in USDs. I suspect that deep down the smart players know that the coins are worthless...&lt;/p&gt;
&lt;h2&gt;Would I buy BTC at \$1?&lt;/h2&gt;
&lt;p&gt;The last thought experiment for this article is the following. The BTC/USD exchange rate right now is \$16,821. Suppose somebody offers to sell me BTC at \$1. To make it even more interesting, let's suppose this offer is coming from &lt;a href="https://en.wikipedia.org/wiki/Satoshi_Nakamoto"&gt;Satoshi&lt;/a&gt;, the original creator of Bitcoin, whose (untouched) wallet from 10+ years ago still holds about 1M Bitcoins. Would I take it, immediately re-sell it at \$16,821, and make \$16,820 in profit? Or would I stick to what I argued in this article and not buy — because why would I pay \$1 for something that I believe is worth \$0?&lt;/p&gt;
&lt;p&gt;Fundamentally I believe that buying Bitcoin is pure speculation. There is no value being created, so any money being made when anybody sells their BTC for USD is somebody else's USD (without additional value being created in the process for either party). So my win is your loss, and vice versa. So in the above thought experiment, if I buy Satoshi's BTC for \$1, with previously untouched ("unbacked") BTC being introduced into the system, and then sell it immediately at \$16,821 to Alice, then overall, I'm taking \$16,820 off the collective gambling stakes — other people's money.&lt;/p&gt;
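&lt;p&gt;The zero-sum claim can be stated as a simple invariant: summed over all participants, USD gains and losses from the trades cancel. A sketch of the thought experiment, ignoring fees:&lt;/p&gt;

```python
# Each BTC/USD trade just moves dollars between two parties, so summed
# over all participants the gains and losses cancel to zero.
trades = [
    # (buyer, seller, usd_paid)
    ('me',    'satoshi', 1.0),      # I buy 1 BTC from Satoshi at $1
    ('alice', 'me',      16821.0),  # Alice buys it from me at $16,821
]
gains = {}
for buyer, seller, usd in trades:
    gains[seller] = gains.get(seller, 0.0) + usd
    gains[buyer] = gains.get(buyer, 0.0) - usd

print(gains['me'])          # 16820.0: my profit
print(sum(gains.values()))  # 0.0: it all came from the other participants
```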
&lt;p&gt;So in my opinion, this is like saying: there's a bunch of people playing poker, the pool money is already very high. Satoshi shows me a sure way to beat the other players and take some money off the table, risk-free — would I do it? Let's suppose there are just 2 other poker players, and I know them well, they are my friends. Would I do it? How about if there are a million faceless poker players, and the games are being conducted over the Internet?&lt;/p&gt;
&lt;p&gt;&lt;img src="/images/casino.jpg" alt="Casino" style="width: 400px;"/&gt;&lt;/p&gt;
&lt;p&gt;The proof is in the pudding, so I don't know what I would do. But fundamentally I'm against taking other people's money without a fair exchange of value.&lt;/p&gt;
&lt;h2&gt;Conclusion&lt;/h2&gt;
&lt;p&gt;Crypto is a beautiful, but unregulated and hence dangerous intersection of technology, economics and human psychology. I often use thought experiments to try to remove smoke and unnecessary complications when trying to understand the crypto world. Here I shared some of these, using FTX's FTT token as an example.&lt;/p&gt;
&lt;p&gt;I plan to write follow-up posts, where I will:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;Argue why I think most use-cases for smart contracts aren't practical in the real world — including tokens, NFTs and others.&lt;/li&gt;
&lt;li&gt;Argue why I think that buying and selling traditional stock (such as Google or Microsoft stock) is fundamentally different than buying and selling tokens and NFTs.&lt;/li&gt;
&lt;li&gt;Explain why I think some of the crypto community's criticism of fiat money, central banks and regulation are actually a good thing.&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;See &lt;a href="https://bytepawn.com/my-thoughts-on-fiat-and-crypto-currencies.html#my-thoughts-on-fiat-and-crypto-currencies"&gt;next article on fiat money vs. cryptocurrencies&lt;/a&gt;.&lt;/p&gt;</content><category term="crypto"></category><category term="bitcoin"></category><category term="btc"></category><category term="ftx"></category><category term="ftt"></category></entry><entry><title>Finding similar API functions between Pytorch and Tensorflow with Doc2Vec</title><link href="https://bytepawn.com/finding-similar-api-functions-between-pytorch-and-tensorflow-with-doc2vec.html" rel="alternate"></link><published>2022-12-21T00:00:00+01:00</published><updated>2022-12-21T00:00:00+01:00</updated><author><name>Marton Trencseni</name></author><id>tag:bytepawn.com,2022-12-21:/finding-similar-api-functions-between-pytorch-and-tensorflow-with-doc2vec.html</id><summary type="html">&lt;p&gt;I use Doc2Vec to try to find pairs of similar API functions between Pytorch and Tensorflow.&lt;br/&gt;&lt;br/&gt; &lt;img src="/images/tf_pt.png" alt="Tensorflow Pytorch" style="width: 300px;"/&gt;&lt;/p&gt;</summary><content type="html">&lt;h2&gt;Introduction&lt;/h2&gt;
&lt;p&gt;In a series of previous posts I used &lt;a href="https://arxiv.org/abs/1405.4053v2"&gt;Doc2Vec&lt;/a&gt; to add recommendations to this blog, which are now live (scroll to the bottom of any page, it's a blue box). These previous posts were:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;&lt;a href="https://bytepawn.com/similar-posts-recommendation-with-doc2vec.html"&gt;Similar posts recommendation with Doc2Vec - Part I&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://bytepawn.com/similar-posts-recommendation-with-doc2vec-part-ii.html"&gt;Similar posts recommendation with Doc2Vec - Part II&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://bytepawn.com/similar-posts-recommendation-with-doc2vec-part-iii.html"&gt;Similar posts recommendation with Doc2Vec - Part III&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;I wanted to see how Doc2Vec performs out-of-the-box comparing pages from different domains, ie. pages that have a different structure. Since I don't have a labeled data set for this, I was thinking of some domain where there are obvious similarities, and I could manually check the quality of the results. It occurred to me that &lt;a href="https://pytorch.org/"&gt;Pytorch&lt;/a&gt; and &lt;a href="https://tensorflow.org"&gt;Tensorflow&lt;/a&gt; are similar Deep Learning libraries, so I could use Doc2Vec to compute similarities between their API doc pages, and see if it finds obvious "pairs". By pairs I mean that both will have a library function for eg. &lt;a href="https://en.wikipedia.org/wiki/Cross_entropy"&gt;Cross Entropy&lt;/a&gt;, and so on. Let's see how it goes!&lt;/p&gt;
&lt;p&gt;&lt;img src="/images/tf_pt.png" alt="Tensorflow Pytorch" style="width: 400px;"/&gt;&lt;/p&gt;
&lt;p&gt;&lt;a href="https://github.com/mtrencseni/playground/blob/master/Finding%20similar%20API%20functions%20between%20Pytorch%20and%20Tensorflow%20with%20Doc2Vec.ipynb"&gt;The code is up on Github.&lt;/a&gt;&lt;/p&gt;
&lt;h2&gt;Crawling&lt;/h2&gt;
&lt;p&gt;First, let's download the API docs. Initially I tried to use &lt;a href="https://scrapy.org/"&gt;scrapy&lt;/a&gt; for this, but after a few hours of usage, I grew disappointed and abandoned it, for the following reasons:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;it does not (seem to) have a good default auto-crawl, I needed to specifically tell it what links to crawl&lt;/li&gt;
&lt;li&gt;it does not (seem to) have good default document extraction, you're on your own with eg. &lt;a href="https://github.com/wention/BeautifulSoup4"&gt;BeautifulSoup&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;it does not (seem to) have good default error-handling, eg. a &lt;code&gt;javascript:&lt;/code&gt; or &lt;code&gt;mailto:&lt;/code&gt; link crashes it&lt;/li&gt;
&lt;li&gt;it uses a multiprocessing library which does not allow multiple crawls/restarts when used from ipython on Windows; I had to restart the whole python kernel on every new crawl&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;After a few hours of not getting much bang for my buck, I realized I was better off writing a simple crawler loop myself with &lt;code&gt;requests&lt;/code&gt;, &lt;code&gt;urllib&lt;/code&gt; and &lt;code&gt;BeautifulSoup&lt;/code&gt;. At least for such a simple use-case, I was right. My solution is more robust when used from ipython, simpler, and about the same amount of code as the scrapy driver class:&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;extract_text_and_links&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;html&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
    &lt;span class="n"&gt;soup&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;BeautifulSoup&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;html&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s1"&gt;&amp;#39;html.parser&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="n"&gt;text&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="s1"&gt;&amp;#39; &amp;#39;&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;join&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;soup&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;find&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;article&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;text&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;split&lt;/span&gt;&lt;span class="p"&gt;())&lt;/span&gt;
    &lt;span class="n"&gt;links&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;link&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;href&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;link&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;soup&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;find_all&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;a&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;href&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="bp"&gt;True&lt;/span&gt;&lt;span class="p"&gt;)]&lt;/span&gt;
    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;text&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;links&lt;/span&gt;

&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;link_prefix&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;link&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
    &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="s1"&gt;&amp;#39;#&amp;#39;&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;link&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
        &lt;span class="n"&gt;link&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;link&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;split&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;#&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;)[&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;
    &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="s1"&gt;&amp;#39;?&amp;#39;&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;link&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
        &lt;span class="n"&gt;link&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;link&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;split&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;?&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;)[&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;
    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;link&lt;/span&gt;

&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;resolve_links&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;base_url&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;links&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="nb"&gt;set&lt;/span&gt;&lt;span class="p"&gt;([&lt;/span&gt;&lt;span class="n"&gt;link_prefix&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;urljoin&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;base_url&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;link&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt; &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;link&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;links&lt;/span&gt;&lt;span class="p"&gt;])&lt;/span&gt;

&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;filter_links&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;base_domain&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;links&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;link&lt;/span&gt; &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;link&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;links&lt;/span&gt; &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;link&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;startswith&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;https://&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="ow"&gt;and&lt;/span&gt; &lt;span class="n"&gt;base_domain&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;link&lt;/span&gt; &lt;span class="ow"&gt;and&lt;/span&gt; &lt;span class="s1"&gt;&amp;#39;.pdf&amp;#39;&lt;/span&gt; &lt;span class="ow"&gt;not&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;link&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;

&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;crawl_and_save&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;base_url&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;num_pages&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
    &lt;span class="k"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;f&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;Crawling {base_url}...&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="n"&gt;base_domain&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;base_url&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;split&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;/&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;)[&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;
    &lt;span class="n"&gt;urls_queued&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;urls_crawled&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;saved_pages&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;https://&amp;#39;&lt;/span&gt; &lt;span class="o"&gt;+&lt;/span&gt; &lt;span class="n"&gt;base_url&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt; &lt;span class="nb"&gt;set&lt;/span&gt;&lt;span class="p"&gt;(),&lt;/span&gt; &lt;span class="p"&gt;{}&lt;/span&gt;
    &lt;span class="k"&gt;while&lt;/span&gt; &lt;span class="nb"&gt;len&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;urls_queued&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;&amp;gt;&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt; &lt;span class="ow"&gt;and&lt;/span&gt; &lt;span class="nb"&gt;len&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;urls_crawled&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;&amp;lt;&lt;/span&gt; &lt;span class="n"&gt;num_pages&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
        &lt;span class="n"&gt;url&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;urls_queued&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;pop&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
        &lt;span class="n"&gt;urls_crawled&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;add&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;url&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
        &lt;span class="k"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;f&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;Fetching {url}&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
        &lt;span class="k"&gt;try&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
            &lt;span class="n"&gt;html&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;requests&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;get&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;url&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;text&lt;/span&gt;
            &lt;span class="n"&gt;text&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;links&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;extract_text_and_links&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;html&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
            &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;base_url&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;url&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
                &lt;span class="n"&gt;saved_pages&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;url&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;text&lt;/span&gt;
            &lt;span class="n"&gt;links&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;resolve_links&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;url&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;links&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
            &lt;span class="n"&gt;links&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;filter_links&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;base_url&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;links&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
            &lt;span class="n"&gt;links&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;link&lt;/span&gt; &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;link&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;links&lt;/span&gt; &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;link&lt;/span&gt; &lt;span class="ow"&gt;not&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="nb"&gt;set&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;urls_queued&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="ow"&gt;and&lt;/span&gt; &lt;span class="n"&gt;link&lt;/span&gt; &lt;span class="ow"&gt;not&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;urls_crawled&lt;/span&gt;&lt;span class="p"&gt;)]&lt;/span&gt;
            &lt;span class="n"&gt;urls_queued&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;extend&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;links&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
        &lt;span class="k"&gt;except&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
            &lt;span class="k"&gt;pass&lt;/span&gt;
    &lt;span class="k"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;f&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;Done!&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="k"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;f&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;Crawled {len(urls_crawled)} total pages, saved {len(urls_crawled)} target pages&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="k"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;f&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;Total content extracted: {int(sum([len(v)/1000 for v in saved_pages.values()]))} kbytes&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;urls_crawled&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;saved_pages&lt;/span&gt;
&lt;/pre&gt;&lt;/div&gt;
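&lt;p&gt;&lt;em&gt;The crawl loop above is a plain breadth-first traversal: a FIFO queue of URLs still to fetch, and a set of URLs already crawled. The same pattern in isolation, over a made-up toy link graph instead of live HTTP fetches:&lt;/em&gt;&lt;/p&gt;

```python
from collections import deque

# Breadth-first traversal mirroring crawl_and_save: a FIFO queue
# plus a visited set; the dict stands in for HTTP fetches.
graph = {'a': ['b', 'c'], 'b': ['c', 'd'], 'c': [], 'd': ['a']}
queue, visited, order = deque(['a']), set(), []
while queue and len(visited) < 10:   # 10 plays the role of num_pages
    node = queue.popleft()
    if node in visited:
        continue
    visited.add(node)
    order.append(node)
    # enqueue only links we have not seen yet
    queue.extend(n for n in graph[node] if n not in visited)
print(order)  # ['a', 'b', 'c', 'd']
```

&lt;p&gt;&lt;em&gt;Using &lt;code&gt;collections.deque&lt;/code&gt; here avoids the O(n) cost of &lt;code&gt;list.pop(0)&lt;/code&gt; that the crawler pays on each iteration.&lt;/em&gt;&lt;/p&gt;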


&lt;p&gt;With this I can now crawl both API docs:&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;span class="n"&gt;_&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;tf_saved_pages&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;crawl_and_save&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
    &lt;span class="n"&gt;base_url&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;tensorflow.org/api_docs/python/tf&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;num_pages&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mi"&gt;5000&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="n"&gt;_&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;pt_saved_pages&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;crawl_and_save&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
    &lt;span class="n"&gt;base_url&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;pytorch.org/docs/stable&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;num_pages&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mi"&gt;5000&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/pre&gt;&lt;/div&gt;


&lt;p&gt;The output looks something like:&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;Crawling tensorflow.org/api_docs/python/tf...
Fetching https://tensorflow.org/api_docs/python/tf
...
Fetching https://www.tensorflow.org/api_docs/python/tf/raw_ops/WriteScalarSummary
Fetching https://www.tensorflow.org/api_docs/python/tf/image/stateless_random_saturation
Done!
Crawled 5000 total pages, saved 5000 target pages
Total content extracted: 14570 kbytes

Crawling pytorch.org/docs/stable...
Fetching https://pytorch.org/docs/stable
...
Fetching https://pytorch.org/docs/stable/_modules/torch/ao/nn/qat/dynamic/modules/www.lfprojects.org/policies/
Fetching https://pytorch.org/docs/stable/_modules/torch/ao/nn/qat/dynamic/modules/www.linuxfoundation.org/policies/
Done!
Crawled 3912 total pages, saved 3912 target pages
Total content extracted: 7921 kbytes
&lt;/pre&gt;&lt;/div&gt;


&lt;p&gt;&lt;em&gt;Note: at &lt;code&gt;num_pages=5000&lt;/code&gt;, the TensorFlow crawl used up the full 5,000-page budget, so some TensorFlow pages were left uncrawled. The PyTorch crawl stopped at 3,912 pages, so it covered the entire documentation.&lt;/em&gt;&lt;/p&gt;
&lt;p&gt;The next step is to merge the saved pages, and similar to how we did it in the previous Doc2Vec post, build the model:&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;span class="n"&gt;pages&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="o"&gt;**&lt;/span&gt;&lt;span class="n"&gt;tf_saved_pages&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="o"&gt;**&lt;/span&gt;&lt;span class="n"&gt;pt_saved_pages&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;
&lt;span class="n"&gt;tagged_posts&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="n"&gt;url&lt;/span&gt; &lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;TaggedDocument&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;word_tokenize&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;text&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;idx&lt;/span&gt;&lt;span class="p"&gt;])&lt;/span&gt; &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;idx&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;url&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;text&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="nb"&gt;enumerate&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;pages&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;items&lt;/span&gt;&lt;span class="p"&gt;())}&lt;/span&gt;
&lt;span class="n"&gt;idx_lookup&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="n"&gt;idx&lt;/span&gt; &lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;url&lt;/span&gt; &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;idx&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;url&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="nb"&gt;enumerate&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;pages&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;keys&lt;/span&gt;&lt;span class="p"&gt;())}&lt;/span&gt;
&lt;span class="n"&gt;model&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;Doc2Vec&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;tagged_posts&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;values&lt;/span&gt;&lt;span class="p"&gt;(),&lt;/span&gt; &lt;span class="n"&gt;vector_size&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mi"&gt;100&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;alpha&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mf"&gt;0.025&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;min_count&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;workers&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mi"&gt;16&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;epochs&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mi"&gt;100&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/pre&gt;&lt;/div&gt;
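&lt;p&gt;&lt;em&gt;Note that &lt;code&gt;idx_lookup&lt;/code&gt; agrees with the tags in &lt;code&gt;tagged_posts&lt;/code&gt; only because Python dicts preserve insertion order (guaranteed since 3.7), so enumerating &lt;code&gt;pages.items()&lt;/code&gt; and &lt;code&gt;pages.keys()&lt;/code&gt; assigns the same index to each URL. A minimal illustration (the toy dicts are made up):&lt;/em&gt;&lt;/p&gt;

```python
# Python dicts preserve insertion order (3.7+), so enumerating
# .items() and .keys() assigns the same index to every key.
a = {'x': 1, 'y': 2}
b = {'y': 3, 'z': 4}
merged = {**a, **b}  # later dict wins on duplicate keys: y -> 3
idx_by_url = {k: i for i, (k, _) in enumerate(merged.items())}
url_by_idx = {i: k for i, k in enumerate(merged.keys())}
print(all(url_by_idx[i] == k for k, i in idx_by_url.items()))  # True
```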


&lt;h2&gt;Similarity extraction&lt;/h2&gt;
&lt;p&gt;First, let's re-use the &lt;code&gt;similar_pages()&lt;/code&gt; function from the previous article and do a consistency check:&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;similar_pages&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;which&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;n&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mi"&gt;3&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
    &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="ow"&gt;not&lt;/span&gt; &lt;span class="nb"&gt;isinstance&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;which&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nb"&gt;str&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
        &lt;span class="n"&gt;which&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;idx_lookup&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;which&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;
    &lt;span class="c1"&gt;# at this point which is the url&lt;/span&gt;
    &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;n&lt;/span&gt; &lt;span class="o"&gt;==&lt;/span&gt; &lt;span class="s1"&gt;&amp;#39;all&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
        &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;model&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;dv&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;most_similar&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;positive&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;model&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;infer_vector&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;tagged_posts&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;which&lt;/span&gt;&lt;span class="p"&gt;][&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;])],&lt;/span&gt; &lt;span class="n"&gt;topn&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="bp"&gt;None&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="n"&gt;results&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;model&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;dv&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;most_similar&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;positive&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;model&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;infer_vector&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;tagged_posts&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;which&lt;/span&gt;&lt;span class="p"&gt;][&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;])],&lt;/span&gt; &lt;span class="n"&gt;topn&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="nb"&gt;len&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;pages&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;
    &lt;span class="n"&gt;results&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[(&lt;/span&gt;&lt;span class="n"&gt;idx_lookup&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;idx&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt; &lt;span class="n"&gt;f&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;{score:.3f}&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;idx&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;score&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;results&lt;/span&gt;
        &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;idx&lt;/span&gt; &lt;span class="o"&gt;!=&lt;/span&gt; &lt;span class="n"&gt;tagged_posts&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;which&lt;/span&gt;&lt;span class="p"&gt;][&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;][&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="ow"&gt;and&lt;/span&gt; &lt;span class="s1"&gt;&amp;#39;www&amp;#39;&lt;/span&gt; &lt;span class="ow"&gt;not&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;idx_lookup&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;idx&lt;/span&gt;&lt;span class="p"&gt;]]&lt;/span&gt;
    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;results&lt;/span&gt;&lt;span class="p"&gt;[:&lt;/span&gt;&lt;span class="n"&gt;n&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;
&lt;/pre&gt;&lt;/div&gt;


&lt;p&gt;Check:&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;span class="n"&gt;similar_pages&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;https://tensorflow.org/api_docs/python/tf/linalg/adjoint&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;n&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mi"&gt;10&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/pre&gt;&lt;/div&gt;


&lt;p&gt;Returns:&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;span class="p"&gt;[(&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;https://tensorflow.org/api_docs/python/tf/raw_ops/BatchMatrixSolve&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s1"&gt;&amp;#39;0.661&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt;
 &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;https://tensorflow.org/api_docs/python/tf/linalg/det&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;               &lt;span class="s1"&gt;&amp;#39;0.651&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt;
 &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;https://tensorflow.org/api_docs/python/tf/linalg/logdet&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;            &lt;span class="s1"&gt;&amp;#39;0.640&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;)]&lt;/span&gt;
&lt;/pre&gt;&lt;/div&gt;


&lt;p&gt;This looks reasonable. &lt;code&gt;adjoint&lt;/code&gt; is a matrix operation, and it returns related matrix operations. Let's look at another one:&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;span class="n"&gt;similar_pages&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;https://tensorflow.org/api_docs/python/tf/keras/losses/BinaryCrossentropy&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/pre&gt;&lt;/div&gt;


&lt;p&gt;Returns:&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;span class="p"&gt;[(&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;https://tensorflow.org/api_docs/python/tf/keras/losses/BinaryFocalCrossentropy&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;         &lt;span class="s1"&gt;&amp;#39;0.885&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt;
 &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;https://tensorflow.org/api_docs/python/tf/keras/losses/SparseCategoricalCrossentropy&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;   &lt;span class="s1"&gt;&amp;#39;0.826&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt;
 &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;https://tensorflow.org/api_docs/python/tf/keras/losses/CosineSimilarity&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;                &lt;span class="s1"&gt;&amp;#39;0.820&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;)]&lt;/span&gt;
&lt;/pre&gt;&lt;/div&gt;


&lt;p&gt;This looks reasonable. So &lt;code&gt;similar_pages()&lt;/code&gt; returns similar pages, from the same API docs, as expected.&lt;/p&gt;
&lt;p&gt;The next step is to write a simple function which returns the top &lt;code&gt;n&lt;/code&gt; pages from the &lt;em&gt;other&lt;/em&gt; API docs, given a query URL:&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;similar_pages_cross&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;which&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;n&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mi"&gt;3&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
    &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="ow"&gt;not&lt;/span&gt; &lt;span class="nb"&gt;isinstance&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;which&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nb"&gt;str&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
        &lt;span class="n"&gt;which&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;idx_lookup&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;which&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;
    &lt;span class="c1"&gt;# at this point which is the url&lt;/span&gt;
    &lt;span class="n"&gt;results&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;model&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;dv&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;most_similar&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;positive&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;model&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;infer_vector&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;tagged_posts&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;which&lt;/span&gt;&lt;span class="p"&gt;][&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;])],&lt;/span&gt; &lt;span class="n"&gt;topn&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="nb"&gt;len&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;pages&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;
    &lt;span class="n"&gt;exclude&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="s1"&gt;&amp;#39;tensorflow.org&amp;#39;&lt;/span&gt; &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="s1"&gt;&amp;#39;tensorflow.org&amp;#39;&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;which&lt;/span&gt; &lt;span class="k"&gt;else&lt;/span&gt; &lt;span class="s1"&gt;&amp;#39;pytorch.org&amp;#39;&lt;/span&gt;
    &lt;span class="n"&gt;results&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[(&lt;/span&gt;&lt;span class="n"&gt;idx_lookup&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;idx&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt; &lt;span class="n"&gt;f&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;{score:.3f}&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;idx&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;score&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;results&lt;/span&gt;
        &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;idx&lt;/span&gt; &lt;span class="o"&gt;!=&lt;/span&gt; &lt;span class="n"&gt;tagged_posts&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;which&lt;/span&gt;&lt;span class="p"&gt;][&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;][&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="ow"&gt;and&lt;/span&gt; &lt;span class="n"&gt;exclude&lt;/span&gt; &lt;span class="ow"&gt;not&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;idx_lookup&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;idx&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="ow"&gt;and&lt;/span&gt; &lt;span class="s1"&gt;&amp;#39;www&amp;#39;&lt;/span&gt; &lt;span class="ow"&gt;not&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;idx_lookup&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;idx&lt;/span&gt;&lt;span class="p"&gt;]]&lt;/span&gt;
    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;results&lt;/span&gt;&lt;span class="p"&gt;[:&lt;/span&gt;&lt;span class="n"&gt;n&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;
&lt;/pre&gt;&lt;/div&gt;


&lt;p&gt;Let's see:&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;span class="n"&gt;similar_pages_cross&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;https://tensorflow.org/api_docs/python/tf/linalg/adjoint&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/pre&gt;&lt;/div&gt;


&lt;p&gt;Returns:&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;span class="p"&gt;[(&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;https://pytorch.org/docs/stable/generated/torch.Tensor.tril_.html&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;          &lt;span class="s1"&gt;&amp;#39;0.581&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt;
 &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;https://pytorch.org/docs/stable/_modules/torch/_C/_distributed_c10d.html&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;   &lt;span class="s1"&gt;&amp;#39;0.581&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt;
 &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;https://pytorch.org/docs/stable/_modules/torch/testing/_creation.html&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;      &lt;span class="s1"&gt;&amp;#39;0.578&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;)]&lt;/span&gt;
&lt;/pre&gt;&lt;/div&gt;


&lt;p&gt;This does not look good; I would have expected &lt;code&gt;https://pytorch.org/docs/stable/generated/torch.adjoint.html&lt;/code&gt; to show up in the list.&lt;/p&gt;
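&lt;p&gt;&lt;em&gt;For context, the adjoint is just the conjugate transpose of a matrix, an operation both frameworks document, which is why I expected the two pages to pair up. A quick numpy illustration (the example matrix is made up):&lt;/em&gt;&lt;/p&gt;

```python
import numpy as np

# The adjoint of a matrix is its conjugate transpose; this is the
# operation tf.linalg.adjoint and torch.adjoint both describe.
A = np.array([[1 + 2j, 3 + 0j],
              [0 + 4j, 5 + 0j]])
adjoint = A.conj().T
print(adjoint)
```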
&lt;p&gt;Let's try with BCE:&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;span class="n"&gt;similar_pages_cross&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;https://tensorflow.org/api_docs/python/tf/keras/losses/BinaryCrossentropy&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/pre&gt;&lt;/div&gt;


&lt;p&gt;Returns:&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;[(&amp;#39;https://pytorch.org/docs/stable/generated/torch.nn.functional.binary_cross_entropy_with_logits.html&amp;#39;,    &amp;#39;0.559&amp;#39;),
 (&amp;#39;https://pytorch.org/docs/stable/generated/torch.Tensor.logit_.html&amp;#39;,                                     &amp;#39;0.536&amp;#39;),
 (&amp;#39;https://pytorch.org/docs/stable/generated/torch.nn.HuberLoss.html&amp;#39;,                                      &amp;#39;0.526&amp;#39;)]
&lt;/pre&gt;&lt;/div&gt;


&lt;p&gt;This looks reasonable.&lt;/p&gt;
&lt;h2&gt;Conclusion&lt;/h2&gt;
&lt;p&gt;After playing around with the results more, my conclusion is that the top recommendations from the &lt;em&gt;other&lt;/em&gt; API docs are &lt;strong&gt;not always what I'd intuitively expect&lt;/strong&gt;. I.e., just as the counterpart of &lt;code&gt;adjoint&lt;/code&gt; was not found, Doc2Vec does not reliably identify the matching or similar API call in the other API docs. Based on this very limited experiment, I suspect that this simple version would &lt;strong&gt;not be good enough for production use&lt;/strong&gt;, i.e. to give recommendations to a programmer coming from one API and trying to use the other.&lt;/p&gt;</content><category term="similarity"></category><category term="python"></category><category term="word2vec"></category><category term="doc2vec"></category><category term="pytorch"></category><category term="tensorflow"></category></entry><entry><title>Testing GPT-3 spatial reasoning and comprehension</title><link href="https://bytepawn.com/testing-gpt-3-spatial-reasoning-and-comprehension.html" rel="alternate"></link><published>2022-12-17T00:00:00+01:00</published><updated>2022-12-17T00:00:00+01:00</updated><author><name>Marton Trencseni</name></author><id>tag:bytepawn.com,2022-12-17:/testing-gpt-3-spatial-reasoning-and-comprehension.html</id><summary type="html">&lt;p&gt;I run experiments to determine whether, or to what degree, GPT-3 has developed a comprehension of spatial relationships.&lt;br/&gt;&lt;br/&gt; &lt;img src="/images/square-room.jpg" alt="GPT-3" style="width: 400px;"/&gt;&lt;/p&gt;</summary><content type="html">&lt;h2&gt;Introduction&lt;/h2&gt;
&lt;p&gt;In recent months I've played around with OpenAI's GPT-3 and tested it in various domains I work in (have worked in):&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;&lt;a href="https://bytepawn.com/how-good-an-astrophysicist-is-gpt-3.html#how-good-an-astrophysicist-is-gpt-3"&gt;How good an astro physicist is GPT-3?&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://bytepawn.com/how-good-a-particle-physicist-is-gpt-3.html#how-good-a-particle-physicist-is-gpt-3"&gt;How good a particle physicist is GPT-3?&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://bytepawn.com/how-good-a-data-scientist-is-gpt-3.html#how-good-a-data-scientist-is-gpt-3"&gt;How good a Data Scientist is GPT-3? - Part I&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://bytepawn.com/how-good-a-data-scientist-is-gpt-3-part-ii.html#how-good-a-data-scientist-is-gpt-3-part-ii"&gt;How good a Data Scientist is GPT-3? - Part II&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;My core conclusion in these experiments was that the AI does &lt;strong&gt;surprisingly well&lt;/strong&gt; in these domains, with &lt;strong&gt;big caveats&lt;/strong&gt;: it sounds equally confident in right and wrong answers. But, in retrospect, now that we're used to the idea of GPT-3, perhaps this makes sense: GPT-3 is trained on Wikipedia articles and other knowledge bases, and, as the system demonstrates, a lot of these questions can be answered just on a statistical pattern-matching / text-completion basis.&lt;/p&gt;
&lt;p&gt;What I was wondering recently was whether, or to what degree, GPT-3 has developed an understanding of something removed from textual knowledge: &lt;strong&gt;spatial reasoning and comprehension&lt;/strong&gt;. Going into these experiments, I did not know what to expect. GPT-3 had done well in the other domains and surprised me, so my baseline expectation was that I would have a similar experience here: GPT-3 will do surprisingly well.&lt;/p&gt;
&lt;p&gt;&lt;em&gt;You&lt;/em&gt; can also talk to GPT-3, just &lt;a href="https://openai.com/api/"&gt;create an account on OpenAI and go to the sandbox!&lt;/a&gt;&lt;/p&gt;
&lt;h2&gt;People in a room&lt;/h2&gt;
&lt;p&gt;The simplest test I could think of is to place people in the 4 corners of a room, put a person in the middle, and ask who is to the left/right of, or behind, the person in the middle. Let's start with an easy question:&lt;/p&gt;
&lt;p&gt;&lt;span style="background-color: #fefebe"&gt;Q: 
There is square room.&lt;br/&gt;
Alice is standing in the north-west corner.&lt;br/&gt;
Bob is standing in the south-west corner.&lt;br/&gt;
Charlie is standing in the south-east corrner.&lt;br/&gt;
David is standing in the north-east corner.&lt;br/&gt;
Ed is standing in the center of the room, looking at Alice.&lt;br/&gt;
How many people are there in the room?&lt;br/&gt;
&lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;span style="background-color: #acdf87"&gt;A: Five people are in the room: Alice, Bob, Charlie, David, and Ed.&lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;img src="/images/square-room.jpg" alt="GPT-3" style="width: 400px;"/&gt;&lt;/p&gt;
&lt;p&gt;&lt;span style="background-color: #fefebe"&gt;Q: 
There is square room.&lt;br/&gt;
...&lt;br/&gt;
Who is standing to the &lt;b&gt;left&lt;/b&gt; of Ed?&lt;br/&gt;
&lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;span style="background-color: #acdf87"&gt;A: Alice is standing to the left of Ed.&lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;em&gt;Note: incorrect, Bob is standing to the left of Ed.&lt;/em&gt;&lt;/p&gt;
&lt;p&gt;&lt;span style="background-color: #fefebe"&gt;Q: 
There is square room.&lt;br/&gt;
...&lt;br/&gt;
Who is standing to the &lt;b&gt;right&lt;/b&gt; of Ed?&lt;br/&gt;
&lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;span style="background-color: #acdf87"&gt;A: Bob is standing to the right of Ed.&lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;em&gt;Note: incorrect, David is standing to the right of Ed.&lt;/em&gt;&lt;/p&gt;
&lt;p&gt;&lt;span style="background-color: #fefebe"&gt;Q: 
There is square room.&lt;br/&gt;
...&lt;br/&gt;
Who is standing &lt;b&gt;behind&lt;/b&gt; Ed?&lt;br/&gt;
&lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;span style="background-color: #acdf87"&gt;A: Nobody is standing behind Ed.&lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;em&gt;Note: incorrect, Charlie is standing behind Ed.&lt;/em&gt;&lt;/p&gt;
&lt;p&gt;Clearly GPT-3 cannot handle this task. Let's try a variation:&lt;/p&gt;
&lt;p&gt;&lt;span style="background-color: #fefebe"&gt;Q: 
There is square room.&lt;br/&gt;
...&lt;br/&gt;
Everybody goes around clockwise 90 degrees.&lt;br/&gt;
Who is now in the north-west corner?&lt;br/&gt;
&lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;span style="background-color: #acdf87"&gt;A: Charlie is now in the north-west corner.&lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;em&gt;Note: incorrect, Bob is now in the north-west corner.&lt;/em&gt;&lt;/p&gt;
&lt;p&gt;Perhaps the problem is with using north-south-west-east terminology. Let's simplify:&lt;/p&gt;
&lt;p&gt;&lt;span style="background-color: #fefebe"&gt;Q: 
There is square room.&lt;br/&gt;
Alice is standing in the upper left corner.&lt;br/&gt;
Bob is standing in the lower left corner.&lt;br/&gt;
Charlie is standing in the lower right corner.&lt;br/&gt;
David is standing in the upper right corner.&lt;br/&gt;
Ed is standing in the center of the room, facing Alice.&lt;br/&gt;
Who is standing to the &lt;b&gt;left&lt;/b&gt; of Ed?&lt;br/&gt;
&lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;span style="background-color: #acdf87"&gt;A: Bob is standing to the left of Ed.&lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;em&gt;Note: correct!&lt;/em&gt;&lt;/p&gt;
&lt;p&gt;&lt;span style="background-color: #fefebe"&gt;Q: 
There is square room.&lt;br/&gt;
...&lt;br/&gt;
Who is standing to the &lt;b&gt;right&lt;/b&gt; of Ed?&lt;br/&gt;
&lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;span style="background-color: #acdf87"&gt;A: Charlie is standing to the right of Ed.&lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;em&gt;Note: incorrect, David is standing to the right of Ed.&lt;/em&gt;&lt;/p&gt;
&lt;p&gt;&lt;span style="background-color: #fefebe"&gt;Q: 
There is square room.&lt;br/&gt;
...&lt;br/&gt;
Who is standing &lt;b&gt;behind&lt;/b&gt; Ed?&lt;br/&gt;
&lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;span style="background-color: #acdf87"&gt;A: Nobody is standing behind Ed.&lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;em&gt;Note: incorrect, Charlie is standing behind Ed.&lt;/em&gt;&lt;/p&gt;
&lt;p&gt;The conclusion is clear: &lt;strong&gt;GPT-3 cannot handle this spatial comprehension task.&lt;/strong&gt;&lt;/p&gt;
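&lt;p&gt;For reference, the ground truth for these room puzzles can be checked with a short script. This is my own sketch, not part of the experiment: the corners are grid coordinates, Ed faces Alice, and left/right/ahead/behind follow from cross and dot products with his facing vector:&lt;/p&gt;

```python
# Ground truth for the square-room puzzles (sketch, not part of the experiment).
# Coordinates: x grows east, y grows north; the room is the unit square.
positions = {
    'Alice':   (0, 1),  # north-west / upper left
    'Bob':     (0, 0),  # south-west / lower left
    'Charlie': (1, 0),  # south-east / lower right
    'David':   (1, 1),  # north-east / upper right
}
ed = (0.5, 0.5)
facing = (-1, 1)  # Ed looks at Alice, i.e. toward the north-west

def direction(name):
    """Where is `name` relative to Ed: (left/right, ahead/behind)?"""
    dx, dy = positions[name][0] - ed[0], positions[name][1] - ed[1]
    cross = facing[0] * dy - facing[1] * dx  # positive: to Ed's left
    dot = facing[0] * dx + facing[1] * dy    # positive: ahead of Ed
    side = 'left' if cross > 0 else ('' if cross == 0 else 'right')
    depth = 'ahead' if dot > 0 else ('' if dot == 0 else 'behind')
    return side, depth

def rotate_cw(p, c=(0.5, 0.5)):
    """Move a point 90 degrees clockwise around the room's center."""
    return (c[0] + (p[1] - c[1]), c[1] - (p[0] - c[0]))
```

&lt;p&gt;With this, direction('Bob') gives "left", direction('David') gives "right", direction('Charlie') gives "behind", and rotate_cw((0, 0)) moves Bob from the south-west to the north-west corner, which are exactly the answers GPT-3 got wrong above.&lt;/p&gt;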
&lt;h2&gt;Clock faces&lt;/h2&gt;
&lt;p&gt;Can GPT-3 reason about the hands of a clock?&lt;/p&gt;
&lt;p&gt;&lt;span style="background-color: #fefebe"&gt;Q: It is half past eleven. What number is the minute hand of the clock pointing at?&lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;span style="background-color: #acdf87"&gt;A: The minute hand of the clock is pointing at 30.&lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;em&gt;Note: correct.&lt;/em&gt;&lt;/p&gt;
&lt;p&gt;&lt;span style="background-color: #fefebe"&gt;Q: I have a watch. The hour hand is pointing to 11. The minute hand is at a right angle to it, on the right side. What number is the minute hand pointing at?&lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;span style="background-color: #acdf87"&gt;A: The minute hand is pointing at 2.&lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;em&gt;Note: correct.&lt;/em&gt;&lt;/p&gt;
&lt;p&gt;&lt;span style="background-color: #fefebe"&gt;Q: I have a watch. The hour hand is pointing to 11:30. The minute hand is at a right angle to it, on the right side. What number is the minute hand pointing at?&lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;span style="background-color: #acdf87"&gt;A: The minute hand is pointing at 8.&lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;em&gt;Note: incorrect, the minute hand is pointing halfway between 2 and 3.&lt;/em&gt;&lt;/p&gt;
&lt;p&gt;Let's try the same question, phrased differently:&lt;/p&gt;
&lt;p&gt;&lt;span style="background-color: #fefebe"&gt;Q: I have a watch. The hour hand is pointing halfway between 11 and 12. The minute hand is at a right angle to it, on the right side. What number is the minute hand pointing at?&lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;span style="background-color: #acdf87"&gt;A: The minute hand is pointing to the number 8.&lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;em&gt;Note: incorrect again.&lt;/em&gt;&lt;/p&gt;
&lt;p&gt;&lt;img src="/images/clock.jpg" alt="GPT-3" style="width: 400px;"/&gt;&lt;/p&gt;
&lt;p&gt;This is a mixed bag. GPT-3 can answer the simpler questions, probably because similar ones occur in the flat text it was trained on, but beyond those it fails.&lt;/p&gt;
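&lt;p&gt;The right-angle clock questions reduce to modular arithmetic on the dial. A sketch of my own (not part of the experiment), measuring angles clockwise from 12:&lt;/p&gt;

```python
# Dial position of the minute hand, 90 degrees clockwise ("on the right
# side") of the hour hand. Angles are measured clockwise from 12.
def minute_hand_dial(hour_hand_dial):
    """hour_hand_dial: where the hour hand points on the dial
    (11 means at 11, 11.5 means halfway between 11 and 12).
    Returns the dial position of the minute hand."""
    hour_angle = (hour_hand_dial % 12) * 30.0    # 360 degrees / 12 numbers
    minute_angle = (hour_angle + 90.0) % 360.0   # right angle, to the right
    return minute_angle / 30.0                   # back to dial numbers
```

&lt;p&gt;minute_hand_dial(11) returns 2.0, which GPT-3 got right; minute_hand_dial(11.5) returns 2.5, i.e. halfway between 2 and 3, which it got wrong.&lt;/p&gt;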
&lt;h2&gt;Triangles&lt;/h2&gt;
&lt;p&gt;&lt;span style="background-color: #fefebe"&gt;Q: We have a triangle. One of the angles is 70 degrees, another is 80 degrees. What is the third angle?&lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;span style="background-color: #acdf87"&gt;A: The third angle is 30 degrees.&lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;em&gt;Note: correct. This is so simple, it may even have occurred with these exact numbers in the training data. Also, it's known that GPT-3 can do simple arithmetic.&lt;/em&gt;&lt;/p&gt;
&lt;p&gt;&lt;span style="background-color: #fefebe"&gt;Q: We have a triangle. One of the angles is 70.51 degrees, another is 80.49 degrees. What is the third angle?&lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;span style="background-color: #acdf87"&gt;A: The third angle is 29 degrees.&lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;em&gt;Note: correct.&lt;/em&gt;&lt;/p&gt;
&lt;p&gt;&lt;span style="background-color: #fefebe"&gt;Q: We have two identical right-angled triangles. We snap them together along their longest edges. What is the resulting shape?&lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;span style="background-color: #acdf87"&gt;A: The resulting shape is a square.&lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;em&gt;Note: only correct if the two legs of the triangle are equal; in general, two identical right triangles joined along their hypotenuses form a rectangle, not a square.&lt;/em&gt;&lt;/p&gt;
&lt;p&gt;&lt;span style="background-color: #fefebe"&gt;Q: We have a very small triangle on a very large sphere. What is the sum of its angles, in degrees?&lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;span style="background-color: #acdf87"&gt;A: 180 degrees&lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;em&gt;Note: correct, if the triangle is very small, the surface of the sphere is locally flat, so we can approximate with Euclidean geometry.&lt;/em&gt;&lt;/p&gt;
&lt;p&gt;&lt;span style="background-color: #fefebe"&gt;Q: We have a triangle on a sphere. What is the sum of its angles, in degrees?&lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;span style="background-color: #acdf87"&gt;A: The sum of the angles of a triangle on a sphere is always 180°.&lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;em&gt;Note: incorrect. On a sphere, the sum of the angles of a triangle is between 180 and 540 degrees.&lt;/em&gt;&lt;/p&gt;
&lt;p&gt;&lt;span style="background-color: #fefebe"&gt;Q: We have a triangle on  a sphere. The sum of it's angles is 200 degrees. We scale up the sphere by 10x. What is the sum of the triangle's angles now?&lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;span style="background-color: #acdf87"&gt;A: The sum of the triangle's angles now is 2000 degrees.&lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;em&gt;Note: incorrect. The angles don't change, since the triangle scales together with the sphere, so the sum remains 200 degrees.&lt;/em&gt;&lt;/p&gt;
&lt;p&gt;This is similar to the clock face thread: GPT-3 falls apart quickly beyond the simplest questions.&lt;/p&gt;
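&lt;p&gt;The spherical triangle facts above follow from Girard's theorem: the angle sum exceeds 180 degrees by the spherical excess, the triangle's area divided by the squared radius. A quick sketch of my own to illustrate why scaling the sphere (and the triangle with it) leaves the angle sum unchanged:&lt;/p&gt;

```python
import math

def angle_sum_degrees(area, radius):
    """Girard's theorem: the angle sum of a spherical triangle is
    180 degrees plus the spherical excess area / radius^2 (in radians)."""
    excess = area / radius ** 2
    return 180.0 + math.degrees(excess)

# A triangle with a 200 degree angle sum on the unit sphere:
area = math.radians(20)  # excess of 20 degrees, expressed in radians

# Scale the sphere 10x: the radius grows 10x, the triangle's area 100x,
# so the excess, and hence the angle sum, is unchanged.
print(angle_sum_degrees(area, 1.0), angle_sum_degrees(100 * area, 10.0))
```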
&lt;h2&gt;Areas and volumes&lt;/h2&gt;
&lt;p&gt;&lt;span style="background-color: #fefebe"&gt;Q: We have a white cube, each side is 1 unit long. We paint it red using 1 liter of paint.&lt;br/&gt;
We now have a white cube, each side is 2 units long. How many liters of paint do we need to paint it red?&lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;span style="background-color: #acdf87"&gt;A: We would need 4 liters of paint to paint the cube red. This is because each side of the cube is now twice as long, so the total surface area of the cube has been multiplied by four.&lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;em&gt;Note: correct.&lt;/em&gt;&lt;/p&gt;
&lt;p&gt;&lt;span style="background-color: #fefebe"&gt;Q: We have a square. We cut up the square into pieces and rearrange the pieces to form a really long rectangle. Which has bigger area, the original square, or the really long rectangle?&lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;span style="background-color: #acdf87"&gt;A: The original square has a bigger area than the really long rectangle.&lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;em&gt;Note: incorrect. Area remains the same.&lt;/em&gt;&lt;/p&gt;
&lt;p&gt;&lt;span style="background-color: #fefebe"&gt;Q: There is a cube, and a sphere inside the cube. Which has bigger volume?&lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;span style="background-color: #acdf87"&gt;A: The cube has a larger volume.&lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;em&gt;Note: correct.&lt;/em&gt;&lt;/p&gt;
&lt;p&gt;&lt;span style="background-color: #fefebe"&gt;Q: The area of a unit square in 2D is 1, the area of the contained circle is Pi/4.&lt;br/&gt;
The volume of a unit cube in 3D is 1, the volume of the contained sphere is Pi/6.&lt;br/&gt;
What is the volume of the sphere contained in a unit cube in 4D?&lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;span style="background-color: #acdf87"&gt;A: The volume of the sphere contained in a unit cube in 4D is 2π/15.&lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;em&gt;Note: incorrect, the correct answer would be $\frac{\pi^2}{32}$.&lt;/em&gt;&lt;/p&gt;
&lt;p&gt;&lt;span style="background-color: #fefebe"&gt;Q: In higher dimensions what is the ratio of the volumes of a cube and the sphere contained in the cube??&lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;span style="background-color: #acdf87"&gt;A: The ratio of the volumes of a cube and the sphere contained in the cube is 8 : π2.&lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;em&gt;Note: not a good answer. A good answer is that in higher dimensions the ratio of the volume of the contained sphere to the volume of the cube goes to zero. In other words, in higher dimensions — against our intuition — the sphere becomes arbitrarily small relative to the cube.&lt;/em&gt;&lt;/p&gt;
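&lt;p&gt;The numbers above follow from the standard formula for the volume of an n-ball, $V_n(r) = \frac{\pi^{n/2}}{\Gamma(n/2+1)} r^n$; the ball inscribed in the unit cube has radius 1/2. A short sketch of my own:&lt;/p&gt;

```python
import math

def inscribed_ball_volume(n):
    """Volume of the n-ball of radius 1/2 inscribed in the unit n-cube:
    V_n(r) = pi^(n/2) / Gamma(n/2 + 1) * r^n, with r = 1/2."""
    return math.pi ** (n / 2) / math.gamma(n / 2 + 1) * 0.5 ** n

for n in (2, 3, 4, 10, 20):
    print(n, inscribed_ball_volume(n))
```

&lt;p&gt;This reproduces $\frac{\pi}{4}$ for $n=2$, $\frac{\pi}{6}$ for $n=3$, $\frac{\pi^2}{32}$ for $n=4$, and shows the sphere's share of the cube's unit volume collapsing toward zero as $n$ grows.&lt;/p&gt;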
&lt;h2&gt;Random point processes&lt;/h2&gt;
&lt;p&gt;Can GPT-3 reason about random points in a square?&lt;/p&gt;
&lt;p&gt;&lt;span style="background-color: #fefebe"&gt;Q: We have a unit square. We randomly place points within the unit square, each point is independent from the rest, and it's position distribution is uniform in the square. What percent of the points falls within the square?&lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;span style="background-color: #acdf87"&gt;A: 100%. All points placed within the unit square will fall within the square.&lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;em&gt;Note: correct.&lt;/em&gt;&lt;/p&gt;
&lt;p&gt;&lt;span style="background-color: #fefebe"&gt;Q: We have a unit square. We randomly place points within the unit square, each point is independent from the rest, and it's position distribution is uniform in the square. We draw a diagonal from the top left corner to the lower right corner. What percent of points falls below this diagonal?&lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;span style="background-color: #acdf87"&gt;A: The answer is 50%. Since the points are placed randomly and independently, and the position distribution is uniform, an equal number of points will fall to the left and right of the diagonal line. Therefore, the percentage of points below the diagonal will be 50%.&lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;em&gt;Note: correct.&lt;/em&gt;&lt;/p&gt;
&lt;p&gt;&lt;span style="background-color: #fefebe"&gt;Q: We have a unit square. We randomly place points within the unit square, each point is independent from the rest, and it's position distribution is uniform in the square. What percent of points falls within the circle contained in the square?&lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;span style="background-color: #acdf87"&gt;A: The percentage of points that fall within the circle contained in the square is approximately 78.54%.&lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;em&gt;Note: correct, $\frac{\pi}{4} = 0.7854$.&lt;/em&gt;&lt;/p&gt;
&lt;p&gt;To my surprise, after the previous topics, GPT-3 got all these correct. My suspicion is that these questions are close enough to flat text knowledge covered in Wikipedia articles (and similar training documents).&lt;/p&gt;
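&lt;p&gt;These answers are also easy to verify with a quick Monte Carlo simulation (my own sketch, not something GPT-3 produced):&lt;/p&gt;

```python
import random

def fraction_in_inscribed_circle(n_points=100_000, seed=42):
    """Estimate the fraction of uniform random points in the unit square
    that fall inside the inscribed circle (center (0.5, 0.5), radius 0.5).
    The expected value is pi/4, approximately 0.7854."""
    rng = random.Random(seed)
    inside = 0
    for _ in range(n_points):
        x, y = rng.random(), rng.random()
        if (x - 0.5) ** 2 + (y - 0.5) ** 2 <= 0.25:
            inside += 1
    return inside / n_points
```

&lt;p&gt;With 100,000 points the estimate comes out close to $\frac{\pi}{4} = 0.7854$.&lt;/p&gt;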
&lt;h2&gt;ASCII art&lt;/h2&gt;
&lt;p&gt;Can GPT-3 read or write ASCII art? This is the question that originally triggered my exploration into spatial comprehension. Let's see:&lt;/p&gt;
&lt;p&gt;&lt;span style="background-color: #fefebe"&gt;Q: What is the letter in the following ASCII art?&lt;br/&gt;&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;██╗░░██╗
██║░░██║
███████║
██╔══██║
██║░░██║
╚═╝░░╚═╝
&lt;/pre&gt;&lt;/div&gt;


&lt;p&gt;&lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;span style="background-color: #acdf87"&gt;A: The letter is "E".&lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;em&gt;Note: incorrect.&lt;/em&gt;&lt;/p&gt;
&lt;p&gt;&lt;span style="background-color: #fefebe"&gt;Q: What is the text in the following ASCII art?&lt;br/&gt;&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;██╗░░██╗███████╗██╗░░░░░██╗░░░░░░█████╗░
██║░░██║██╔════╝██║░░░░░██║░░░░░██╔══██╗
███████║█████╗░░██║░░░░░██║░░░░░██║░░██║
██╔══██║██╔══╝░░██║░░░░░██║░░░░░██║░░██║
██║░░██║███████╗███████╗███████╗╚█████╔╝
╚═╝░░╚═╝╚══════╝╚══════╝╚══════╝░╚════╝░
&lt;/pre&gt;&lt;/div&gt;


&lt;p&gt;&lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;span style="background-color: #acdf87"&gt;A: Welcome!&lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;em&gt;Note: incorrect, but "Welcome" is eerily close to "Hello". I wonder what's going on here.&lt;/em&gt;&lt;/p&gt;
&lt;p&gt;&lt;span style="background-color: #fefebe"&gt;Q: What is the text in the following ASCII art?&lt;br/&gt;&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;███╗░░░███╗░█████╗░██████╗░████████╗░█████╗░███╗░░██╗
████╗░████║██╔══██╗██╔══██╗╚══██╔══╝██╔══██╗████╗░██║
██╔████╔██║███████║██████╔╝░░░██║░░░██║░░██║██╔██╗██║
██║╚██╔╝██║██╔══██║██╔══██╗░░░██║░░░██║░░██║██║╚████║
██║░╚═╝░██║██║░░██║██║░░██║░░░██║░░░╚█████╔╝██║░╚███║
╚═╝░░░░░╚═╝╚═╝░░╚═╝╚═╝░░╚═╝░░░╚═╝░░░░╚════╝░╚═╝░░╚══╝
&lt;/pre&gt;&lt;/div&gt;


&lt;p&gt;&lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;span style="background-color: #acdf87"&gt;A: No text&lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;em&gt;Note: incorrect.&lt;/em&gt;&lt;/p&gt;
&lt;p&gt;&lt;span style="background-color: #fefebe"&gt;Q: Generate ASCII art for the letter H.&lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;span style="background-color: #acdf87"&gt;A: &lt;br/&gt;&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;  ____ 
 |    |
 |    |
 |    |
 |____|
&lt;/pre&gt;&lt;/div&gt;


&lt;p&gt;&lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;em&gt;Note: incorrect.&lt;/em&gt;&lt;/p&gt;
&lt;p&gt;&lt;span style="background-color: #fefebe"&gt;Q: Generate ASCII art for the word HELLO.&lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;span style="background-color: #acdf87"&gt;A: &lt;br/&gt;&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;██╗   ██╗██╗ ██████╗ ███████╗    ██╗      ██████╗ ██████╗ 
██║   ██║██║██╔════╝ ██╔════╝    ██║     ██╔═══██╗██╔══██╗
██║   ██║██║██║  ███╗███████╗    ██║     ██║   ██║██║  ██║
╚██╗ ██╔╝██║██║   ██║╚════██║    ██║     ██║   ██║██║  ██║
 ╚████╔╝ ██║╚██████╔╝███████║    ███████╗╚██████╔╝██████╔╝
  ╚═══╝  ╚═╝ ╚════                                        
&lt;/pre&gt;&lt;/div&gt;


&lt;p&gt;&lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;em&gt;Note: incorrect, but it's impressive that it's able to generate some ASCII art with letters.&lt;/em&gt;&lt;/p&gt;
&lt;h2&gt;Conclusion&lt;/h2&gt;
&lt;p&gt;Based on these experiments, my conclusion is that during its training GPT-3 has not developed the ability to reason spatially. However, based on the recent stream of models released that can &lt;a href="https://openai.com/dall-e-2/"&gt;paint (DALL-E)&lt;/a&gt;, &lt;a href="https://openai.com/blog/openai-codex/"&gt;code (Codex)&lt;/a&gt;, &lt;a href="https://chat.openai.com/auth/login"&gt;chat (ChatGPT)&lt;/a&gt;, and so on, I assume that given enough training data it would be possible to train models that perform well on spatial reasoning tasks.&lt;/p&gt;</content><category term="gpt"></category><category term="gpt-3"></category></entry><entry><title>Fred Brooks' The Mythical Man-Month</title><link href="https://bytepawn.com/fred-brooks-the-mythical-man-month.html" rel="alternate"></link><published>2022-12-15T00:00:00+01:00</published><updated>2022-12-15T00:00:00+01:00</updated><author><name>Marton Trencseni</name></author><id>tag:bytepawn.com,2022-12-15:/fred-brooks-the-mythical-man-month.html</id><summary type="html">&lt;p&gt;I reflect on the core points of Fred Brooks' seminal book, The Mythical Man-Month, that I often recall and apply in my daily work.&lt;br/&gt;&lt;br/&gt; &lt;img src="/images/brooks.jpg" alt="The Mythical Man-Month" style="width: 300px;"/&gt;&lt;/p&gt;</summary><content type="html">&lt;h2&gt;Introduction&lt;/h2&gt;
&lt;p&gt;In November of this year &lt;a href="https://en.wikipedia.org/wiki/Fred_Brooks"&gt;Frederick P. Brooks&lt;/a&gt; passed away. I know Brooks from two of his books, both of which I highly recommend:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;&lt;a href="https://www.amazon.com/Mythical-Man-Month-Software-Engineering-Anniversary/dp/0201835959"&gt;The Mythical Man-Month&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://www.amazon.com/Design-Essays-Computer-Scientist/dp/0201362988"&gt;The Design of Design&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;In this post I reflect on The Mythical Man-Month (usually abbreviated MMM), and the impact it had on me. But first, who was Fred Brooks?&lt;/p&gt;
&lt;p&gt;From his Wikipedia page:&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;Frederick Phillips Brooks Jr. (April 19, 1931 – November 17, 2022) was an American computer architect, software engineer, and computer scientist, best known for managing the development of IBM's System/360 family of computers and the OS/360 software support package, then later writing candidly about the process in his seminal book The Mythical Man-Month. In 1976, Brooks was elected as a member into the National Academy of Engineering for "contributions to computer system design and the development of academic programs in computer sciences". Brooks received many awards, including the National Medal of Technology in 1985 and the Turing Award in 1999.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;&lt;img src="/images/brooks.jpg" alt="The Mythical Man-Month" style="width: 300px;"/&gt;&lt;/p&gt;
&lt;p&gt;The original edition of MMM was published in 1975 as a collection of essays. They are a record of Fred Brooks' insights gained from the &lt;a href="https://en.wikipedia.org/wiki/OS/360_and_successors"&gt;OS/360&lt;/a&gt; software engineering project. OS/360 was the operating system for the &lt;a href="https://en.wikipedia.org/wiki/IBM_System/360"&gt;System/360&lt;/a&gt;, IBM's flagship mainframe computer in 1960s and 70s.&lt;/p&gt;
&lt;p&gt;&lt;img src="/images/system-360.jpg" alt="The Mythical Man-Month" style="width: 300px;"/&gt;&lt;/p&gt;
&lt;p&gt;The first time I read the book was in the early 2000s, when I was working my first job as a (junior) C/C++ Developer at &lt;a href="https://en.wikipedia.org/wiki/Graphisoft"&gt;Graphisoft&lt;/a&gt;. I was writing code for Archicad versions 7-10, among the first versions to have the new C++ codebase, versus the older versions' C codebase. Fun fact, Archicad is currently on version 26. What struck me first and foremost was that Brooks identified some of the core challenges and fallacies as far back as 1975, working on one of the &lt;em&gt;first&lt;/em&gt; large-scale software engineering efforts! Second, throughout my career I noticed that I repeatedly hit the same issues as described by Brooks, at different companies, on different technology stacks. It seemed to me — and still seems — that our industry is continuously repeating these mistakes.&lt;/p&gt;
&lt;p&gt;I will not try to repeat all the points in the book. The book is very well written, broken into bite-sized essays, and deserves to be read in its original form. As a crutch, the &lt;a href="https://en.wikipedia.org/wiki/The_Mythical_Man-Month"&gt;Wikipedia page of the book&lt;/a&gt; has a good one-paragraph summary of each essay. I will reflect only on the core points that I often recall and actually apply in my own work. The points I make below are not necessarily the original points by Brooks, but my interpretation of them, and how I have "filled in the content" over the past 15 years.&lt;/p&gt;
&lt;h2&gt;The mythical man-month&lt;/h2&gt;
&lt;p&gt;Brooks' point here is that adding people to a project does not improve velocity after a break-even saturation point. The basic pattern is this: an estimate is made that a project takes 120 man-months. A naive manager comes along and says, &lt;em&gt;"Great, let's get 20 software engineers, and we will be done in 120/20 = 6 months!"&lt;/em&gt; In reality, it is not true that &lt;code&gt;&amp;lt;man-month-estimate&amp;gt; = &amp;lt;number-of-people&amp;gt; x &amp;lt;months-taken&amp;gt;&lt;/code&gt;. The reasons are many, I will enumerate a few:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;With $n$ people on the project, there are potentially $n(n-1)/2$ communication edges; in general, communication scales like $O(n^2)$, unless special care is taken in the org to avoid this. Beyond some break-even point, adding new people to the project does not make it faster; in fact, it makes it slower.&lt;/li&gt;
&lt;li&gt;Product Managers and Software Engineers need a fixed amount of learning time to explore the domain and to identify the right solution. Adding more people does not reduce this fixed initial time, in fact it makes it longer, since more time will be wasted with too many people trying to contribute to the design.&lt;/li&gt;
&lt;li&gt;With too many people on the project, more of the work goes to the bottom 50% (or the bottom 80%); this work needs more revision (potentially rewrites); usually there is an optimal number of senior engineers that should work on a project, perhaps aided by a few less senior contributors.&lt;/li&gt;
&lt;/ul&gt;
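&lt;p&gt;A toy illustration of the first point (my own, not from the book): the number of potential communication channels grows quadratically, so a 20-person team has 19 times as many channels as a 5-person team, not 4 times:&lt;/p&gt;

```python
def communication_edges(n):
    """Potential pairwise communication channels in a team of n people."""
    return n * (n - 1) // 2

for n in (2, 5, 10, 20):
    print(n, communication_edges(n))
```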
&lt;p&gt;&lt;img src="/images/dilbert-mmm.jpg" alt="The Mythical Man-Month" style="width: 500px;"/&gt;&lt;/p&gt;
&lt;h2&gt;No silver bullet&lt;/h2&gt;
&lt;p&gt;Often, some new technology — a silver bullet — is identified that project managers believe will significantly increase productivity. Such cases exist, but they are very rare. (A clear case of a silver bullet in my experience is AWS.) For example, on one product I worked on, it was mistakenly believed that converting the codebase from C to C++ would improve developer productivity and product stability significantly — but it did not, in fact both decreased, because senior C programmers were converted to novice C++ programmers; and even putting that aside, I believe there is no step-function jump in productivity going from C to C++. I believe that functional programming languages like Haskell were also believed to be silver bullets, but did not end up changing the game. (Having said that, functional programming patterns added to traditional programming languages like C++, C# and Python are a clear win.)&lt;/p&gt;
&lt;p&gt;&lt;img src="/images/werewolf.jpg" alt="The Mythical Man-Month" style="width: 400px;"/&gt;&lt;/p&gt;
&lt;h2&gt;The second-system effect&lt;/h2&gt;
&lt;p&gt;The second-system effect happens once a software engineering team has successfully shipped and stabilized the first version of a system, and is thinking of the next step. The first version is mature and in users' hands, but in the process inevitable architectural trade-offs were made and hacks made it into the codebase. The team now wants to do a &lt;strong&gt;rewrite&lt;/strong&gt;, a clean system design incorporating all the lessons learned from the first system. Often the second system will become over-architected, will take significantly longer than planned, and in the end will still include (a different set of) architectural trade-offs and hacks. As this becomes clear, the team loses some of its motivation, and development slows. This is the second-system effect. In fact, many second systems end up being delayed (or abandoned altogether), and development on the first version is resumed, at least for the time being.&lt;/p&gt;
&lt;p&gt;&lt;img src="/images/victorinox.jpg" alt="The Mythical Man-Month" style="width: 400px;"/&gt;&lt;/p&gt;
&lt;h2&gt;Surgical team&lt;/h2&gt;
&lt;p&gt;The core insight in this essay is that not all software engineers are created equal, and the team structure should reflect this. Some engineers are more senior, have more domain experience and a knack for software architecture. The other main point is that software architecture needs to be coherent and consistent, so a minimum number of people should design it to avoid unnecessary complexity. The idea then is to model the software engineering team on a surgical team, with the surgeon making the cuts, and various other team members playing various support roles. So, have a lead programmer come up with the core architecture and write most of the core code, with others playing a support function (writing non-core sub-systems, tooling, documentation, etc). Although it breaks the metaphor, I always add the idea of a "co-pilot" to the main "pilot" (the surgeon); the co-pilot should always have full context and can take over if the pilot leaves. The way I run teams today is heavily influenced by this idea.&lt;/p&gt;
&lt;p&gt;&lt;img src="/images/surgical-team.jpg" alt="The Mythical Man-Month" style="width: 300px;"/&gt;&lt;/p&gt;
&lt;h2&gt;Tooling teams&lt;/h2&gt;
&lt;p&gt;Following on from the idea of a surgical team, Brooks realized that this pattern can also be applied at the organization level. It's amazing that he had this insight in the 1960s, but the idea of tooling teams and DevUX only took off at tech companies in the 2010s. And non-tech companies still don't have tooling teams!&lt;/p&gt;
&lt;p&gt;&lt;img src="/images/blacksmith.jpg" alt="The Mythical Man-Month" style="width: 300px;"/&gt;&lt;/p&gt;
&lt;h2&gt;10x developer&lt;/h2&gt;
&lt;p&gt;Brooks mentions that the difference in productivity between the best and the worst developers can be as much as 10x. This pops up periodically on Internet forums such as &lt;a href="https://www.google.com/search?q=hacker+news+10x+developer"&gt;Hacker News&lt;/a&gt;. Personally, I believe in this concept. I have worked with many people where I was a 10x developer compared to them, simply because they were working on the wrong problems, or made naive architectural choices, or they blocked for days on problems that I was able to solve much faster (eg. how to resolve a memory leak or a server process crashing). But I have also worked with developers who were 10x compared to me, especially in my early days! In a way, this is what hiring is all about: trying to keep the gap between your best and your worst engineers as tight as possible, and not letting the 10x-slower developers in!&lt;/p&gt;
&lt;p&gt;&lt;img src="/images/10x-developer.jpg" alt="The Mythical Man-Month" style="width: 300px;"/&gt;&lt;/p&gt;
&lt;h2&gt;Conclusion&lt;/h2&gt;
&lt;p&gt;I hope the article gave a reasonable account of the ways Brooks' book &lt;a href="https://www.amazon.com/Mythical-Man-Month-Software-Engineering-Anniversary/dp/0201835959"&gt;The Mythical Man-Month&lt;/a&gt; influenced me. If you have not, read the book!&lt;/p&gt;
&lt;p&gt;&lt;img src="/images/swing.jpg" alt="The Mythical Man-Month" style="width: 500px;"/&gt;&lt;/p&gt;</content><category term="book"></category><category term="management"></category><category term="engineering"></category></entry><entry><title>How good an astrophysicist is GPT-3?</title><link href="https://bytepawn.com/how-good-an-astrophysicist-is-gpt-3.html" rel="alternate"></link><published>2022-12-12T00:00:00+01:00</published><updated>2022-12-12T00:00:00+01:00</updated><author><name>Marton Trencseni</name></author><id>tag:bytepawn.com,2022-12-12:/how-good-an-astrophysicist-is-gpt-3.html</id><summary type="html">&lt;p&gt;Here I will show a "conversation" with GPT-3 to gauge how good an astrophysicist — or an illusion of an astrophysicist — it is. I will focus on cosmology questions, because that's most interesting part of the field.&lt;br/&gt;&lt;br/&gt; &lt;img src="/images/robot-astro.jpg" alt="GPT-3" style="width: 400px;"/&gt;&lt;/p&gt;</summary><content type="html">&lt;h2&gt;Introduction&lt;/h2&gt;
&lt;p&gt;Recently I have been playing around with OpenAI's &lt;a href="https://en.wikipedia.org/wiki/GPT-3"&gt;GPT-3&lt;/a&gt; and I am very impressed by it. It reminds me of the famous Arthur C. Clarke quote, &lt;em&gt;“Any sufficiently advanced technology is indistinguishable from magic.”&lt;/em&gt; If you don't know what GPT-3 is, this is a &lt;a href="https://towardsdatascience.com/understanding-gpt-3-in-5-minutes-7fe35c3a1e52"&gt;good introduction with a lot of links&lt;/a&gt;.&lt;/p&gt;
&lt;p&gt;&lt;img src="/images/robot-astro.jpg" alt="GPT-3" style="width: 400px;"/&gt;&lt;/p&gt;
&lt;p&gt;&lt;em&gt;You&lt;/em&gt; can also talk to GPT-3, &lt;a href="https://openai.com/api/"&gt;just create an account on OpenAI and go to the sandbox!&lt;/a&gt;&lt;/p&gt;
&lt;p&gt;Here I will show a "conversation" with GPT-3 to gauge how good an astrophysicist — or an illusion of an astrophysicist — it is. Mostly these are questions one might ask a student in an exam situation. The questions are independent of one another, so they work with a context-free system such as GPT-3. By context-free I mean that GPT-3 does not maintain conversation state; each question and answer is independent from the AI's perspective.&lt;/p&gt;
&lt;p&gt;GPT-3 is trained on large amounts of text downloaded from the Internet. So I expect it to do well on questions where there are high-quality answers on the Internet, and it "just" has to return them, in correct English. For questions that require more reasoning and combining facts (from the question and training data), I expect less impressive answers.&lt;/p&gt;
&lt;p&gt;Also see these previous Bytepawn posts on GPT-3:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;&lt;a href="https://bytepawn.com/how-good-a-particle-physicist-is-gpt-3.html#how-good-a-particle-physicist-is-gpt-3"&gt;How good a particle physicist is GPT-3?&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://bytepawn.com/how-good-a-data-scientist-is-gpt-3.html#how-good-a-data-scientist-is-gpt-3"&gt;How good a Data Scientist is GPT-3? - Part I&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://bytepawn.com/how-good-a-data-scientist-is-gpt-3-part-ii.html#how-good-a-data-scientist-is-gpt-3-part-ii"&gt;How good a Data Scientist is GPT-3? - Part II&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;h2&gt;Basics&lt;/h2&gt;
&lt;p&gt;&lt;span style="background-color: #fefebe"&gt;Q: In cosmology, what is the experimental evidence for the non-zero value of the cosmological constant lambda?&lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;span style="background-color: #acdf87"&gt;A: The experimental evidence for the non-zero value of the cosmological constant lambda comes from several observations, including:
1. The observed accelerated expansion of the universe, which was first detected in 1998 when observations of Type Ia supernovae showed that the universe was expanding at an accelerating rate.&lt;br/&gt;
2. The Cosmic Microwave Background (CMB), which shows that the universe is dominated by a component that behaves like a cosmological constant.&lt;br/&gt;
3. Large-scale structure observations, which indicate that the universe is dominated by a component that behaves like a cosmological constant.&lt;br/&gt;
4. Observations of distant supernovae, which indicate that the expansion of the universe is accelerating.&lt;br/&gt;
5. Observations of the anisotropies in the CMB, which indicate that the universe is flat, consistent with the presence of a cosmological constant.&lt;/span&gt;&lt;br/&gt;&lt;/p&gt;
&lt;p&gt;&lt;em&gt;Comment: this is a good answer, but not great. The first and the fourth points are the same. Also, there were observations prior to 1998, although with larger error margins, going back all the way to Hubble. Regarding the last point, it's not clear what the sentence is implying; in itself a flat Universe does not require non-zero lambda.&lt;/em&gt;&lt;/p&gt;
&lt;p&gt;&lt;span style="background-color: #fefebe"&gt;Q: In cosmology, what is the difference between a flat, open and closed cosmology?&lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;span style="background-color: #acdf87"&gt;A: A flat cosmology is a model of the universe in which the total energy density is equal to the critical density, and the spatial curvature of the universe is zero. This type of universe will expand forever, and its expansion will eventually slow down as its energy density decreases.
&lt;br/&gt;&lt;br/&gt;
An open cosmology is a model of the universe in which the total energy density is less than the critical density, and the spatial curvature of the universe is negative. This type of universe will expand forever and its expansion will eventually accelerate as its energy density decreases.
&lt;br/&gt;&lt;br/&gt;
A closed cosmology is a model of the universe in which the total energy density is greater than the critical density, and the spatial curvature of the universe is positive. This type of universe will eventually stop expanding and will eventually collapse in on itself, or recollapse.&lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;em&gt;Comment: this is a good answer.&lt;/em&gt;&lt;/p&gt;
&lt;p&gt;&lt;span style="background-color: #fefebe"&gt;Q: In cosmology, what is the difference between a flat, spherical and hyperbolic universe?&lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;span style="background-color: #acdf87"&gt;A: A flat universe is one where the average density of matter is just enough to make the universe flat, meaning that if you draw a triangle on a flat two-dimensional surface, the sum of the angles of the triangle will be 180 degrees. A spherical universe is one where the curvature of space-time is positive, meaning that if you draw a triangle on a spherical two-dimensional surface, the sum of the angles of the triangle will be greater than 180 degrees. A hyperbolic universe is one where the curvature of space-time is negative, meaning that if you draw a triangle on a hyperbolic two-dimensional surface, the sum of the angles of the triangle will be less than 180 degrees.&lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;em&gt;Comment: this is mostly a good answer; the geometric explanation is spot-on. I don't like the first sentence, although it is often phrased like this: "... average density makes the Universe flat ...". The global topology of the Universe is what it is, it is one of the three, and it cannot change (as the average density changes). E.g. if the Universe started out globally flat (or spherical), then it always remains globally flat (or spherical); it cannot go from one to the other. Local curvature, of course, can change in the presence of matter...&lt;/em&gt;&lt;/p&gt;
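The triangle test described in the answer is easy to check numerically. As a minimal sketch (my own illustration, not from the original Q&amp;A), here is the angle sum of a geodesic triangle on the unit sphere, with one vertex at the pole and two on the equator:

```python
import numpy as np

def angle_at(p, q, r):
    # angle of the spherical triangle at vertex p, between geodesics p->q and p->r,
    # computed from the tangent vectors of the two geodesics at p
    tq = q - np.dot(p, q) * p
    tr = r - np.dot(p, r) * p
    tq, tr = tq / np.linalg.norm(tq), tr / np.linalg.norm(tr)
    return np.degrees(np.arccos(np.clip(np.dot(tq, tr), -1.0, 1.0)))

# north pole and two points on the equator, 90 degrees apart
A = np.array([0.0, 0.0, 1.0])
B = np.array([1.0, 0.0, 0.0])
C = np.array([0.0, 1.0, 0.0])

total = angle_at(A, B, C) + angle_at(B, A, C) + angle_at(C, A, B)
print(total)  # 270.0 degrees: greater than 180, as expected for positive curvature
```

On a flat plane the same sum would be exactly 180 degrees; on a hyperbolic surface, less.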
&lt;p&gt;&lt;span style="background-color: #fefebe"&gt;Q: In cosmology, is it possible to have globally spherical Universe that is expanding forever?&lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;span style="background-color: #acdf87"&gt;A: Yes, it is possible to have a globally spherical Universe that is expanding forever. This type of Universe is known as a "closed" or "bounded" Universe. In a closed Universe, gravity eventually stops the expansion, and the Universe begins to contract until it reaches a maximum size and then expands again. This cycle of expansion and contraction is known as the "Big Bounce" model.&lt;/p&gt;
&lt;p&gt;&lt;em&gt;Comment: this is a confused answer. In the first sentence it says it is possible, but then it says that eventually expansion stops. Per my understanding, in a globally spherical (closed) Universe, you can get infinite expansion if you introduce a large enough cosmological constant Λ (just as in the flat case). Let's ask a follow-up question...&lt;/em&gt;&lt;/p&gt;
&lt;p&gt;&lt;span style="background-color: #fefebe"&gt;Q: In cosmology, is it possible to have globally flat Universe that is expanding forever?&lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;span style="background-color: #acdf87"&gt;A: Yes, it is possible to have a globally flat Universe that is expanding forever. This is known as a "flat Lambda-CDM universe" and is the most widely accepted model of the universe. It is based on the assumption that the universe is spatially flat (i.e. it has zero curvature), and that its expansion is driven by a cosmological constant (Lambda) and cold dark matter (CDM). This model predicts that the universe will expand forever, but the rate of expansion will gradually slow down.&lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;em&gt;Comment: great answer, since this is standard Wikipedia-level knowledge. Now, let's get back to probing it with curvature questions...&lt;/em&gt;&lt;/p&gt;
&lt;h2&gt;Topology, curvature&lt;/h2&gt;
&lt;p&gt;&lt;span style="background-color: #fefebe"&gt;Q: In cosmology, what is difference between global and local curvature? Are the two always the same?&lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;span style="background-color: #acdf87"&gt;A: Global curvature refers to the overall curvature of the universe, based on its average density. Local curvature, on the other hand, refers to the curvature of space-time at any given point in the universe. The two are not always the same, as local curvature can be affected by the presence of matter or energy at that point, while global curvature is determined by the average density of the universe.&lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;em&gt;Comment: great answer, except the last sentence. Per my understanding, we use measurements of the average density to learn the global topology of the Universe, but it's not determined by the average density (which is changing, anyway). The global topology is what it is, and (probably) does not change. Let's see what it thinks on that topic...&lt;/em&gt;&lt;/p&gt;
&lt;p&gt;&lt;span style="background-color: #fefebe"&gt;Q: In cosmology, can a Universe jump between flat, spherical and hyperbolic global curvatures as matter density changes?&lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;span style="background-color: #acdf87"&gt;A: No, a Universe cannot jump between flat, spherical, and hyperbolic global curvatures as matter density changes. In the standard cosmological model, the Universe is thought to have a constant global curvature, and the amount of curvature is determined by the average density of matter in the Universe. If the matter density is greater than a certain value, the Universe will have a spherical curvature. If the matter density is less than that value, the Universe will have a hyperbolic curvature. If the matter density is exactly equal to that value, the Universe will have a flat curvature. The curvature of the Universe will not change as the matter density changes.&lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;em&gt;Comment: great answer!&lt;/em&gt;&lt;/p&gt;
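The "certain value" the answer refers to is the critical density, ρ_c = 3H²/(8πG). A quick sketch of the arithmetic (my own illustration; the H0 value is a round assumption, not a measurement from the post), classifying the inferred geometry from the density ratio Ω = ρ/ρ_c:

```python
import math

G = 6.674e-11                  # gravitational constant, m^3 kg^-1 s^-2
H0 = 70 * 1000 / 3.086e22      # 70 km/s/Mpc converted to 1/s

# critical density separating the three geometries
rho_crit = 3 * H0 ** 2 / (8 * math.pi * G)
print(f"critical density: {rho_crit:.2e} kg/m^3")  # on the order of 1e-26 kg/m^3

def curvature(omega):
    # Omega = rho / rho_crit determines which global geometry we infer
    if math.isclose(omega, 1.0):
        return "flat"
    return "spherical (closed)" if omega > 1 else "hyperbolic (open)"

print(curvature(1.0), curvature(1.1), curvature(0.3))
```

The classification is fixed: as the answer says, the Universe does not hop between the three cases as the density dilutes, because Ω evolves but stays on the same side of 1.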
&lt;p&gt;&lt;span style="background-color: #fefebe"&gt;Q: In cosmology, does the global curvature of the Universe ever change?&lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;span style="background-color: #acdf87"&gt;A: No, the global curvature of the Universe is assumed to be constant and does not change over time.&lt;/span&gt;&lt;/p&gt;
&lt;p&gt;Let's ask some questions about the initial singularity in the Big Bang model:&lt;/p&gt;
&lt;p&gt;&lt;span style="background-color: #fefebe"&gt;Q: In cosmology, assuming the Universe is flat and infinite, in what sense was there a singularity at zero time. How does an infinite flat Universe become infinitely small?&lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;span style="background-color: #acdf87"&gt;A: The singularity at zero time in an infinite flat Universe is a mathematical concept rather than a physical one. It is a point in the mathematical description of the Universe where the density of matter, space and time become infinite and the laws of physics break down. A flat and infinite Universe cannot become infinitely small, since it is already infinitely large. The singularity at zero time is simply a mathematical artifact that arises from trying to describe an infinite Universe in finite terms.&lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;em&gt;Comment: great answer!&lt;/em&gt;&lt;/p&gt;
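One way to make this concrete: in FLRW coordinates, the proper distance between two comoving observers is the scale factor a(t) times a fixed comoving separation. As a(t) → 0 toward the singularity, every finite proper distance shrinks to zero, yet the comoving grid, and hence the spatial extent, remains infinite. A toy illustration (the matter-dominated a(t) ∝ t^(2/3) form is an assumption for the sketch, not taken from the Q&amp;A):

```python
# proper distance between two comoving observers: d(t) = a(t) * chi
comoving_separation = 100.0  # arbitrary comoving units; the grid itself is infinite

def scale_factor(t):
    # matter-dominated toy model, normalized so a(1) = 1
    return t ** (2 / 3)

for t in [1.0, 0.1, 0.001]:
    print(t, scale_factor(t) * comoving_separation)
# proper distances -> 0 as t -> 0, but the number of comoving points is unchanged
```

This is the sense in which "everything was infinitely close together" at t = 0, without the Universe ever being finite in extent.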
&lt;h2&gt;Isotropy&lt;/h2&gt;
&lt;p&gt;&lt;span style="background-color: #fefebe"&gt;Q: In cosmology, can a periodically infinite flat Universe be isotropic?&lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;span style="background-color: #acdf87"&gt;A: Yes, a periodically infinite flat Universe can be isotropic. This means that the Universe has the same properties in all directions, and so the physical laws that govern it are the same no matter which direction an object is travelling.&lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;em&gt;Comment: it can be locally isotropic, but not globally. Let's explore this further...&lt;/em&gt;&lt;/p&gt;
&lt;p&gt;&lt;span style="background-color: #fefebe"&gt;Q: In cosmology, what is the difference between local isotropy and global isotropy?&lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;span style="background-color: #acdf87"&gt;A: Local isotropy is the idea that the universe looks the same in all directions when observed on a local scale. This means that when looking at a small region of the universe, it appears to be the same in all directions. Global isotropy, on the other hand, is the idea that the universe looks the same in all directions when observed on a global scale. This means that when looking at the entire universe, it appears to be the same in all directions.&lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;span style="background-color: #fefebe"&gt;Q: In cosmology, is a periodic flat Universe, ie. picture a large box with the sides connected, globally isotropic?&lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;span style="background-color: #acdf87"&gt;A: No, a periodic flat Universe is not globally isotropic. A periodic flat Universe is a flat universe with periodic boundary conditions, meaning that the universe is periodic in the spatial dimensions. This means that the universe appears to be finite in size, and because of this it is not globally isotropic.&lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;em&gt;Comment: this is correct. Imagine a box with infinite copies of it placed in a closely touching grid: the x, y and z directions of the box are special, because an astronaut traveling along them can get back to her homeworld in less time than by traveling in any other direction.&lt;/em&gt;&lt;/p&gt;
&lt;h2&gt;Clustering&lt;/h2&gt;
&lt;p&gt;&lt;span style="background-color: #fefebe"&gt;Q: In cosmology, how would you explain Baryon Acoustic Oscillations to a layperson?&lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;span style="background-color: #acdf87"&gt;A: Baryon Acoustic Oscillations are a type of pattern in the universe that are caused by sound waves traveling through the early universe. They can be seen in the distribution of galaxies in the sky today, which appear to be slightly lumpy, like ripples in a pond. The sound waves that caused these patterns were created shortly after the Big Bang, and allow us to study the universe's beginnings.&lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;em&gt;Comment: this is a good answer, as are the following in this section.&lt;/em&gt;&lt;/p&gt;
&lt;p&gt;&lt;span style="background-color: #fefebe"&gt;Q: In cosmology, what is the role of clustering and Baryon Acoustic Oscillations in determining cosmological parameters?&lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;span style="background-color: #acdf87"&gt;A: Clustering and Baryon Acoustic Oscillations (BAO) play an important role in determining cosmological parameters, such as the matter density, cosmological constant and Hubble constant. Clustering is used to measure the distribution of matter in the universe and to constrain the values of cosmological parameters, while BAO allow us to measure the angular diameter distance to galaxies and to determine the expansion rate of the universe and its age. Together, these two measurements can be used to constrain the values of cosmological parameters and test theories like the Big Bang model.&lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;span style="background-color: #fefebe"&gt;Q: In cosmology, what is acoustic about Baryon Acoustic Oscillations?&lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;span style="background-color: #acdf87"&gt;A: Baryon Acoustic Oscillations (BAO) are a phenomenon in cosmology where the distribution of matter in the universe is characterized by a series of peaks and valleys. This pattern is caused by sound waves traveling through the primordial plasma of the early universe. These sound waves created regions of higher and lower density, which eventually became galaxies, clusters of galaxies, and other large-scale structure in the universe. The BAO signature is a powerful tool for measuring the expansion of the universe and constraining cosmological parameters.&lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;span style="background-color: #fefebe"&gt;Q: In cosmology, how are Baryon Acoustic Oscillations measured?&lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;span style="background-color: #acdf87"&gt;A: Baryon Acoustic Oscillations (BAOs) are measured by looking at the correlation function of galaxies in the Universe. This correlation function is measured using data from surveys such as the Sloan Digital Sky Survey (SDSS). The correlation function describes how the galaxies are distributed in the Universe and allows us to measure the scale of the BAOs. This scale is then used to measure the distance to the galaxies and thus the expansion rate of the Universe. This measurement of BAOs has been used to measure the properties of dark energy, which has helped to confirm the current cosmological model.&lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;span style="background-color: #fefebe"&gt;Q: In cosmology, what is the difference between the two-point mass correlation function and the three-point mass correlation function?&lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;span style="background-color: #acdf87"&gt;A: The two-point mass correlation function measures the probability of finding two galaxies at a given distance from one another. It provides information about the clustering of galaxies on large scales. The three-point mass correlation function measures the probability of finding three galaxies at a given distance from one another. This can provide information about the clustering of galaxies on smaller scales, as well as the strength of interactions between them.&lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;span style="background-color: #fefebe"&gt;Q: In cosmology, what is the role of clustering and Baryon Acoustic Oscillations in determining cosmological parameters? What is the additional accuracy gained vs just looking at Planck CMB data?&lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;span style="background-color: #acdf87"&gt;A: Clustering and Baryon Acoustic Oscillations (BAO) are powerful methods used to measure cosmological parameters. Clustering refers to the spatial patterns of galaxies and other matter in the universe, and BAO refer to the ripples in the density of matter that were created by sound waves in the early universe. By measuring these features, it is possible to obtain precise information about the evolution of the universe.
&lt;br/&gt;&lt;br/&gt;
Clustering and BAO provide an additional level of accuracy compared to Planck CMB data, because they measure different aspects of the universe. Planck CMB data measures the Cosmic Microwave Background radiation, which is a snapshot of the universe at a very early stage, whereas clustering and BAO measure the spatial patterns of galaxies and other matter in the universe at a much later stage. This additional information allows us to determine cosmological parameters with more accuracy.&lt;/span&gt;&lt;/p&gt;
&lt;h2&gt;Conclusion&lt;/h2&gt;
&lt;p&gt;Overall, most of the GPT-3 answers are okay, which is very impressive. On some of the trickier questions, the answer is a bit confused. This is the overall danger of GPT-3: if the reader is not an expert, they can't tell whether they're getting a good answer (in this case, about half the time) or reasonable-sounding nonsense.&lt;/p&gt;</content><category term="gpt"></category><category term="gpt-3"></category></entry><entry><title>Similar posts recommendation with Doc2Vec - Part III</title><link href="https://bytepawn.com/similar-posts-recommendation-with-doc2vec-part-iii.html" rel="alternate"></link><published>2022-12-10T00:00:00+01:00</published><updated>2022-12-10T00:00:00+01:00</updated><author><name>Marton Trencseni</name></author><id>tag:bytepawn.com,2022-12-10:/similar-posts-recommendation-with-doc2vec-part-iii.html</id><summary type="html">&lt;p&gt;In the previous posts, I used the Doc2Vec neural network architecture to compute the similarities between my blog posts, and explored the quality of the scores. In this final post, I show how I added the final Articles You May Like recommendation sections to the blog — it's live! &lt;br/&gt;&lt;br/&gt; &lt;img src="/images/ayml.png" alt="." style="width: 400px;"/&gt;&lt;/p&gt;</summary><content type="html">&lt;h2&gt;Introduction&lt;/h2&gt;
&lt;p&gt;In the previous posts, I used the Doc2Vec neural network architecture to compute the similarities between my blog posts, and explored the quality of the scores:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;&lt;a href="https://bytepawn.com/similar-posts-recommendation-with-doc2vec.html"&gt;Part I: Computing the scores with Doc2Vec using the gensim library&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://bytepawn.com/similar-posts-recommendation-with-doc2vec-part-ii.html"&gt;Part II: Exploring the results with heatmaps and graphs&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;In this final post, I show how I added the final &lt;strong&gt;Articles You May Like&lt;/strong&gt; recommendation sections to the blog. It's live: you can see it if you scroll down to the bottom of this page (or any other page). Nothing sophisticated happens in this post, just a Python script and some minimal HTML/CSS/Javascript.&lt;/p&gt;
&lt;h2&gt;Approach&lt;/h2&gt;
&lt;p&gt;When thinking about how to go about this, I identified three options:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;Compute recommendations when the static blog is being generated and emit the recommendations as part of the statically rendered HTML&lt;/li&gt;
&lt;li&gt;Compute recommendations as a static &lt;code&gt;.js&lt;/code&gt; file which is loaded by the browser, and the recommendations are shown with some Javascript code modifying the DOM&lt;/li&gt;
&lt;li&gt;Dynamically request the recommendations for each article over an API, and show the recommendations with some Javascript code modifying the DOM&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;I decided against the first one because I didn't want to slow down my edit/publish/check workflow: generating the recommendations takes about a minute on the server hosting the blog, while the static blog generation is just a few seconds. The third one is unnecessarily complicated, so that left me with the second option: emit a static &lt;code&gt;.js&lt;/code&gt; file, and write some Javascript code to show the recommendations.&lt;/p&gt;
&lt;h2&gt;Emitting the recommendations&lt;/h2&gt;
&lt;p&gt;First I created a &lt;code&gt;venv&lt;/code&gt; on my server with all the needed Python libraries installed, &lt;code&gt;pip install numpy networkx nltk gensim&lt;/code&gt;. I also wanted to use the same Python version (3.9) as on my other computers, so I had to install it, which took a bit of fiddling because on my server's somewhat dated distro I had to build from source.&lt;/p&gt;
&lt;p&gt;Once I had the &lt;code&gt;venv&lt;/code&gt; set up, I converted my earlier code to a stand-alone script &lt;code&gt;recommend.py&lt;/code&gt;, which emits the &lt;code&gt;.js&lt;/code&gt; file with the recommendations:&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="nn"&gt;os&lt;/span&gt;
&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="nn"&gt;json&lt;/span&gt;
&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="nn"&gt;numpy&lt;/span&gt; &lt;span class="kn"&gt;as&lt;/span&gt; &lt;span class="nn"&gt;np&lt;/span&gt;
&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="nn"&gt;nltk.tokenize&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;word_tokenize&lt;/span&gt;
&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="nn"&gt;gensim.models.doc2vec&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;Doc2Vec&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;TaggedDocument&lt;/span&gt;

&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;build_post_struct&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;lines&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
    &lt;span class="n"&gt;slug&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nb"&gt;next&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;line&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="nb"&gt;len&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;slug:&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;):]&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;strip&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt; &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;line&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;lines&lt;/span&gt;&lt;span class="p"&gt;[:&lt;/span&gt;&lt;span class="mi"&gt;10&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;line&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;lower&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;startswith&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;slug:&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;
    &lt;span class="n"&gt;title&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nb"&gt;next&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;line&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="nb"&gt;len&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;title:&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;):]&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;strip&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt; &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;line&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;lines&lt;/span&gt;&lt;span class="p"&gt;[:&lt;/span&gt;&lt;span class="mi"&gt;10&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;line&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;lower&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;startswith&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;title:&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;
    &lt;span class="n"&gt;date&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nb"&gt;next&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;line&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="nb"&gt;len&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;date:&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;):]&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;strip&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt; &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;line&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;lines&lt;/span&gt;&lt;span class="p"&gt;[:&lt;/span&gt;&lt;span class="mi"&gt;10&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;line&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;lower&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;startswith&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;date:&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;
    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;slug&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;slug&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s1"&gt;&amp;#39;title&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;title&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s1"&gt;&amp;#39;date&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;date&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s1"&gt;&amp;#39;contents&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="s1"&gt;&amp;#39;&lt;/span&gt;&lt;span class="se"&gt;\n&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;join&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;lines&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="mi"&gt;10&lt;/span&gt;&lt;span class="p"&gt;:])&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;lower&lt;/span&gt;&lt;span class="p"&gt;()}&lt;/span&gt;

&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;similar_posts&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;idx&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;n&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mi"&gt;3&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
    &lt;span class="n"&gt;results&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;model&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;dv&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;most_similar&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;positive&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;model&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;infer_vector&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;tagged_posts&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;idx&lt;/span&gt;&lt;span class="p"&gt;][&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;])],&lt;/span&gt; &lt;span class="n"&gt;topn&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;n&lt;/span&gt;&lt;span class="o"&gt;+&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="n"&gt;results&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;i&lt;/span&gt; &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;i&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;_&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;results&lt;/span&gt; &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;idx&lt;/span&gt; &lt;span class="o"&gt;!=&lt;/span&gt; &lt;span class="n"&gt;i&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;
    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;results&lt;/span&gt;&lt;span class="p"&gt;[:&lt;/span&gt;&lt;span class="n"&gt;n&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;

&lt;span class="n"&gt;BLOG_DIR&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="s2"&gt;&amp;quot;content/&amp;quot;&lt;/span&gt;
&lt;span class="n"&gt;paths&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;f&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;{BLOG_DIR}/{f}&amp;#39;&lt;/span&gt; &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;f&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;os&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;listdir&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;BLOG_DIR&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;f&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;lower&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;endswith&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s2"&gt;&amp;quot;.md&amp;quot;&lt;/span&gt;&lt;span class="p"&gt;)]&lt;/span&gt;
&lt;span class="n"&gt;posts&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;build_post_struct&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nb"&gt;open&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;path&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;encoding&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="s2"&gt;&amp;quot;utf8&amp;quot;&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;read&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;splitlines&lt;/span&gt;&lt;span class="p"&gt;())&lt;/span&gt; &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;path&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;paths&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;
&lt;span class="n"&gt;tagged_posts&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="n"&gt;idx&lt;/span&gt; &lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;TaggedDocument&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;word_tokenize&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;post&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;contents&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;]),&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;idx&lt;/span&gt;&lt;span class="p"&gt;])&lt;/span&gt; &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;idx&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;post&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="nb"&gt;enumerate&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;posts&lt;/span&gt;&lt;span class="p"&gt;)}&lt;/span&gt;
&lt;span class="n"&gt;model&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;Doc2Vec&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;tagged_posts&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;values&lt;/span&gt;&lt;span class="p"&gt;(),&lt;/span&gt; &lt;span class="n"&gt;vector_size&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mi"&gt;100&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;alpha&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mf"&gt;0.025&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;min_count&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;workers&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mi"&gt;4&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;epochs&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mi"&gt;100&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="n"&gt;posts&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="n"&gt;idx&lt;/span&gt; &lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;post&lt;/span&gt; &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;idx&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;post&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="nb"&gt;enumerate&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;posts&lt;/span&gt;&lt;span class="p"&gt;)}&lt;/span&gt;
&lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;idx&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;post&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;posts&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;items&lt;/span&gt;&lt;span class="p"&gt;():&lt;/span&gt;
    &lt;span class="k"&gt;del&lt;/span&gt; &lt;span class="n"&gt;post&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;contents&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;
    &lt;span class="n"&gt;post&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;recommendations&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;similar_posts&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;idx&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;n&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mi"&gt;3&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="n"&gt;json_file&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;f&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;recommendations = {json.dumps(posts)};&amp;#39;&lt;/span&gt; 

&lt;span class="k"&gt;with&lt;/span&gt; &lt;span class="nb"&gt;open&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s2"&gt;&amp;quot;flex/static/js/recommendations.js&amp;quot;&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s2"&gt;&amp;quot;w&amp;quot;&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;as&lt;/span&gt; &lt;span class="n"&gt;f&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
    &lt;span class="n"&gt;f&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;write&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;json_file&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/pre&gt;&lt;/div&gt;


&lt;p&gt;The output is &lt;a href="https://bytepawn.com//theme/js/recommendations.js"&gt;live here&lt;/a&gt;. With some pretty-printing, it looks like:&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;span class="nx"&gt;recommendations&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="s2"&gt;&amp;quot;0&amp;quot;&lt;/span&gt;&lt;span class="o"&gt;:&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="s2"&gt;&amp;quot;slug&amp;quot;&lt;/span&gt;&lt;span class="o"&gt;:&lt;/span&gt; &lt;span class="s2"&gt;&amp;quot;mnist-pixel-attacks-with-pytorch&amp;quot;&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="s2"&gt;&amp;quot;title&amp;quot;&lt;/span&gt;&lt;span class="o"&gt;:&lt;/span&gt; &lt;span class="s2"&gt;&amp;quot;MNIST pixel attacks with Pytorch&amp;quot;&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="s2"&gt;&amp;quot;date&amp;quot;&lt;/span&gt;&lt;span class="o"&gt;:&lt;/span&gt; &lt;span class="s2"&gt;&amp;quot;2019-06-01&amp;quot;&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="s2"&gt;&amp;quot;recommendations&amp;quot;&lt;/span&gt;&lt;span class="o"&gt;:&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;
      &lt;span class="mi"&gt;87&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
      &lt;span class="mi"&gt;117&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
      &lt;span class="mi"&gt;102&lt;/span&gt;
    &lt;span class="p"&gt;]&lt;/span&gt;
  &lt;span class="p"&gt;},&lt;/span&gt;
  &lt;span class="s2"&gt;&amp;quot;1&amp;quot;&lt;/span&gt;&lt;span class="o"&gt;:&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="s2"&gt;&amp;quot;slug&amp;quot;&lt;/span&gt;&lt;span class="o"&gt;:&lt;/span&gt; &lt;span class="s2"&gt;&amp;quot;five-ways-to-reduce-variance-in-ab-testing&amp;quot;&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="s2"&gt;&amp;quot;title&amp;quot;&lt;/span&gt;&lt;span class="o"&gt;:&lt;/span&gt; &lt;span class="s2"&gt;&amp;quot;Five ways to reduce variance in A/B testing&amp;quot;&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="s2"&gt;&amp;quot;date&amp;quot;&lt;/span&gt;&lt;span class="o"&gt;:&lt;/span&gt; &lt;span class="s2"&gt;&amp;quot;2021-09-19&amp;quot;&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="s2"&gt;&amp;quot;recommendations&amp;quot;&lt;/span&gt;&lt;span class="o"&gt;:&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;
      &lt;span class="mi"&gt;85&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
      &lt;span class="mi"&gt;5&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
      &lt;span class="mi"&gt;54&lt;/span&gt;
    &lt;span class="p"&gt;]&lt;/span&gt;
  &lt;span class="p"&gt;},&lt;/span&gt;
  &lt;span class="p"&gt;...&lt;/span&gt;
&lt;span class="p"&gt;};&lt;/span&gt;
&lt;/pre&gt;&lt;/div&gt;
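&lt;p&gt;As a sanity check, the generated file can be parsed back and validated: every recommended index should refer to an existing post. A minimal sketch, with hypothetical helper names (&lt;code&gt;parse_recommendations_js&lt;/code&gt;, &lt;code&gt;check_recommendations&lt;/code&gt;) that are not part of the actual build:&lt;/p&gt;

```python
import json

def parse_recommendations_js(text):
    # Strip the "recommendations = " prefix and the trailing ";"
    # to recover the plain JSON payload.
    payload = text.strip().removeprefix('recommendations = ').rstrip(';')
    return json.loads(payload)

def check_recommendations(posts):
    # Every recommended index must refer to an existing post;
    # json.dumps turned the integer keys into strings.
    for idx, post in posts.items():
        for rec in post['recommendations']:
            assert str(rec) in posts, f'post {idx} recommends missing post {rec}'

# A small sample in the same shape as the generated file.
sample = '''recommendations = {"0": {"slug": "a", "title": "A", "date": "2019-06-01", "recommendations": [1]},
"1": {"slug": "b", "title": "B", "date": "2021-09-19", "recommendations": [0]}};'''

posts = parse_recommendations_js(sample)
check_recommendations(posts)
print(len(posts))
```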


&lt;h2&gt;Makefile&lt;/h2&gt;
&lt;p&gt;Then I added a new target to the blog's &lt;code&gt;Makefile&lt;/code&gt;:&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;span class="err"&gt;...&lt;/span&gt;
&lt;span class="nf"&gt;clean&lt;/span&gt;&lt;span class="o"&gt;:&lt;/span&gt;
    &lt;span class="o"&gt;[&lt;/span&gt; ! -d &lt;span class="k"&gt;$(&lt;/span&gt;OUTPUTDIR&lt;span class="k"&gt;)&lt;/span&gt; &lt;span class="o"&gt;]&lt;/span&gt; &lt;span class="o"&gt;||&lt;/span&gt; rm -rf &lt;span class="k"&gt;$(&lt;/span&gt;OUTPUTDIR&lt;span class="k"&gt;)&lt;/span&gt;

&lt;span class="nf"&gt;output&lt;/span&gt;&lt;span class="o"&gt;:&lt;/span&gt; &lt;span class="k"&gt;$(&lt;/span&gt;&lt;span class="nv"&gt;INPUTDIR&lt;/span&gt;&lt;span class="k"&gt;)&lt;/span&gt;/* *.&lt;span class="n"&gt;py&lt;/span&gt; &lt;span class="n"&gt;Makefile&lt;/span&gt;
    @&lt;span class="k"&gt;$(&lt;/span&gt;PELICAN&lt;span class="k"&gt;)&lt;/span&gt; &lt;span class="k"&gt;$(&lt;/span&gt;INPUTDIR&lt;span class="k"&gt;)&lt;/span&gt; -o &lt;span class="k"&gt;$(&lt;/span&gt;OUTPUTDIR&lt;span class="k"&gt;)&lt;/span&gt; -s &lt;span class="k"&gt;$(&lt;/span&gt;PUBLISHCONF&lt;span class="k"&gt;)&lt;/span&gt; &lt;span class="k"&gt;$(&lt;/span&gt;PELICANOPTS&lt;span class="k"&gt;)&lt;/span&gt;
  ...

&lt;span class="nf"&gt;clone&lt;/span&gt;&lt;span class="o"&gt;:&lt;/span&gt;
    git clone git@github.com:mtrencseni/mtrencseni.github.io.git

&lt;span class="nf"&gt;recommend&lt;/span&gt;&lt;span class="o"&gt;:&lt;/span&gt;               &lt;span class="c"&gt;# &amp;lt;----------&lt;/span&gt;
    @./recommend.py        &lt;span class="c1"&gt;# &lt;/span&gt;

&lt;span class="nf"&gt;publish&lt;/span&gt;&lt;span class="o"&gt;:&lt;/span&gt; &lt;span class="n"&gt;output&lt;/span&gt;
    @cp -R &lt;span class="k"&gt;$(&lt;/span&gt;OUTPUTDIR&lt;span class="k"&gt;)&lt;/span&gt;/* /var/www/bytepawn.com/
&lt;span class="err"&gt;...&lt;/span&gt;
&lt;/pre&gt;&lt;/div&gt;


&lt;p&gt;So my modified workflow is now:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;continue to use &lt;code&gt;make publish&lt;/code&gt; to publish new and/or changed articles to the blog&lt;/li&gt;
&lt;li&gt;once an article is final, update the recommendations with &lt;code&gt;make recommend&lt;/code&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;h2&gt;Javascript&lt;/h2&gt;
&lt;p&gt;First, load the &lt;code&gt;recommendations.js&lt;/code&gt; file on each article's page:&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;span class="p"&gt;&amp;lt;&lt;/span&gt;&lt;span class="nt"&gt;script&lt;/span&gt; &lt;span class="na"&gt;src&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="s"&gt;&amp;quot;recommendations.js&amp;quot;&lt;/span&gt;&lt;span class="p"&gt;&amp;gt;&amp;lt;/&lt;/span&gt;&lt;span class="nt"&gt;script&lt;/span&gt;&lt;span class="p"&gt;&amp;gt;&lt;/span&gt;
&lt;/pre&gt;&lt;/div&gt;


&lt;p&gt;Next, create a &lt;code&gt;&amp;lt;div&amp;gt;&lt;/code&gt; that will contain the recommendations:&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;span class="p"&gt;&amp;lt;&lt;/span&gt;&lt;span class="nt"&gt;div&lt;/span&gt; &lt;span class="na"&gt;id&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="s"&gt;&amp;quot;similar_articles&amp;quot;&lt;/span&gt; &lt;span class="na"&gt;style&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="s"&gt;&amp;quot;display:none; ...&amp;quot;&lt;/span&gt;&lt;span class="p"&gt;&amp;gt;&amp;lt;/&lt;/span&gt;&lt;span class="nt"&gt;div&lt;/span&gt;&lt;span class="p"&gt;&amp;gt;&lt;/span&gt;
&lt;/pre&gt;&lt;/div&gt;


&lt;p&gt;Finally, the Javascript to render the recommendations:&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;span class="kd"&gt;var&lt;/span&gt; &lt;span class="nx"&gt;body&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nb"&gt;document&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;getElementsByTagName&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s2"&gt;&amp;quot;body&amp;quot;&lt;/span&gt;&lt;span class="p"&gt;)[&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;];&lt;/span&gt;
&lt;span class="nb"&gt;window&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;onload&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="kd"&gt;function&lt;/span&gt; &lt;span class="p"&gt;()&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="nx"&gt;show_recommendations&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s2"&gt;&amp;quot;{{ article.slug }}&amp;quot;&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt; &lt;span class="p"&gt;}&lt;/span&gt;
&lt;span class="kd"&gt;function&lt;/span&gt; &lt;span class="nx"&gt;get_post_by_slug&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;slug&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="kr"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;k&lt;/span&gt; &lt;span class="k"&gt;in&lt;/span&gt; &lt;span class="nx"&gt;recommendations&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;recommendations&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="nx"&gt;k&lt;/span&gt;&lt;span class="p"&gt;][&lt;/span&gt;&lt;span class="s2"&gt;&amp;quot;slug&amp;quot;&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="o"&gt;==&lt;/span&gt; &lt;span class="nx"&gt;slug&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
      &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="nx"&gt;recommendations&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="nx"&gt;k&lt;/span&gt;&lt;span class="p"&gt;];&lt;/span&gt;
  &lt;span class="p"&gt;}&lt;/span&gt;
  &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="kc"&gt;null&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;span class="kd"&gt;function&lt;/span&gt; &lt;span class="nx"&gt;get_post_by_idx&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;idx&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="nx"&gt;recommendations&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="nx"&gt;idx&lt;/span&gt;&lt;span class="p"&gt;];&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;span class="kd"&gt;function&lt;/span&gt; &lt;span class="nx"&gt;show_recommendations&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;slug&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="k"&gt;typeof&lt;/span&gt; &lt;span class="nx"&gt;recommendations&lt;/span&gt; &lt;span class="o"&gt;==&lt;/span&gt; &lt;span class="s1"&gt;&amp;#39;undefined&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="k"&gt;return&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
  &lt;span class="nx"&gt;post&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nx"&gt;get_post_by_slug&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;slug&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
  &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;post&lt;/span&gt; &lt;span class="o"&gt;==&lt;/span&gt; &lt;span class="kc"&gt;null&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="k"&gt;return&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
  &lt;span class="kd"&gt;var&lt;/span&gt; &lt;span class="nx"&gt;div_similar_articles&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nb"&gt;document&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;getElementById&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s2"&gt;&amp;quot;similar_articles&amp;quot;&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
  &lt;span class="nx"&gt;div_similar_articles&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;innerHTML&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="s2"&gt;&amp;quot;&amp;lt;div style=\&amp;quot;margin-bottom:5px;\&amp;quot;&amp;gt;&amp;lt;b&amp;gt;Other Articles You May Like:&amp;lt;/b&amp;gt;&amp;lt;/div&amp;gt;&amp;quot;&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
  &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="kr"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;idx&lt;/span&gt; &lt;span class="k"&gt;of&lt;/span&gt; &lt;span class="nx"&gt;post&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="s2"&gt;&amp;quot;recommendations&amp;quot;&lt;/span&gt;&lt;span class="p"&gt;])&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="nx"&gt;rp&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nx"&gt;get_post_by_idx&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;idx&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="nx"&gt;url&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="s2"&gt;&amp;quot;https://bytepawn.com/&amp;quot;&lt;/span&gt; &lt;span class="o"&gt;+&lt;/span&gt; &lt;span class="nx"&gt;rp&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="s2"&gt;&amp;quot;slug&amp;quot;&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;&lt;span class="o"&gt;+&lt;/span&gt; &lt;span class="s2"&gt;&amp;quot;.html&amp;quot;&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
    &lt;span class="nx"&gt;title&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nx"&gt;rp&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="s2"&gt;&amp;quot;title&amp;quot;&lt;/span&gt;&lt;span class="p"&gt;];&lt;/span&gt;
    &lt;span class="nx"&gt;year&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nx"&gt;rp&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="s2"&gt;&amp;quot;date&amp;quot;&lt;/span&gt;&lt;span class="p"&gt;].&lt;/span&gt;&lt;span class="nx"&gt;slice&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;4&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
    &lt;span class="nx"&gt;div_similar_articles&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;innerHTML&lt;/span&gt; &lt;span class="o"&gt;+=&lt;/span&gt; &lt;span class="s2"&gt;&amp;quot;&amp;lt;li&amp;gt;&amp;lt;a href=\&amp;quot;&amp;quot;&lt;/span&gt; &lt;span class="o"&gt;+&lt;/span&gt; &lt;span class="nx"&gt;url&lt;/span&gt; &lt;span class="o"&gt;+&lt;/span&gt; &lt;span class="s2"&gt;&amp;quot;\&amp;quot;&amp;gt;&amp;quot;&lt;/span&gt; &lt;span class="o"&gt;+&lt;/span&gt; &lt;span class="nx"&gt;title&lt;/span&gt; &lt;span class="o"&gt;+&lt;/span&gt; &lt;span class="s2"&gt;&amp;quot; (&amp;quot;&lt;/span&gt; &lt;span class="o"&gt;+&lt;/span&gt; &lt;span class="nx"&gt;year&lt;/span&gt; &lt;span class="o"&gt;+&lt;/span&gt; &lt;span class="s2"&gt;&amp;quot;)&amp;lt;/a&amp;gt;&amp;quot;&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
  &lt;span class="p"&gt;}&lt;/span&gt;
  &lt;span class="nx"&gt;div_similar_articles&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;style&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;display&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="s2"&gt;&amp;quot;block&amp;quot;&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;span class="p"&gt;};&lt;/span&gt;
&lt;/pre&gt;&lt;/div&gt;


&lt;p&gt;Note the &lt;code&gt;if (typeof recommendations == 'undefined')&lt;/code&gt; and &lt;code&gt;if (post == null)&lt;/code&gt; checks: if the recommendations file is missing, or the current page is not part of the recommendations (because I forgot to run &lt;code&gt;make recommend&lt;/code&gt;), the page does not break, and the &lt;code&gt;&amp;lt;div&amp;gt;&lt;/code&gt; remains hidden in its initial &lt;code&gt;style="display:none"&lt;/code&gt; state.&lt;/p&gt;
&lt;h2&gt;Result&lt;/h2&gt;
&lt;p&gt;You can see the result on this page, just below this line, below the tags. Enjoy!&lt;/p&gt;</content><category term="similarity"></category><category term="python"></category><category term="gensim"></category><category term="word2vec"></category><category term="doc2vec"></category><category term="pyml"></category></entry><entry><title>Similar posts recommendation with Doc2Vec - Part II</title><link href="https://bytepawn.com/similar-posts-recommendation-with-doc2vec-part-ii.html" rel="alternate"></link><published>2022-12-04T00:00:00+01:00</published><updated>2022-12-04T00:00:00+01:00</updated><author><name>Marton Trencseni</name></author><id>tag:bytepawn.com,2022-12-04:/similar-posts-recommendation-with-doc2vec-part-ii.html</id><summary type="html">&lt;p&gt;In the previous post, I used the Doc2Vec neural network architecture to compute the similarities between my blog posts. In this second post I investigate the results further by examining clusters in graphs.&lt;br/&gt;&lt;br/&gt; &lt;img src="/images/doc2vec_graph_1.png" alt="." style="width: 400px;"/&gt;&lt;/p&gt;</summary><content type="html">&lt;h2&gt;Introduction&lt;/h2&gt;
&lt;p&gt;In &lt;a href="https://bytepawn.com/similar-posts-recommendation-with-doc2vec.html#similar-posts-recommendation-with-doc2vec"&gt;the previous post&lt;/a&gt;, I used the Doc2Vec neural network architecture to compute the similarities between my blog posts. Using these, I am planning to add &lt;strong&gt;Articles You May Like&lt;/strong&gt; recommendations to Bytepawn (this blog), based on the semantic similarity of my blog posts. In this second post I investigate the results further. The &lt;a href="https://github.com/mtrencseni/playground/blob/master/Similar%20posts%20recommendation%20with%20Doc2Vec.ipynb"&gt;code is up on Github&lt;/a&gt;.&lt;/p&gt;
&lt;h2&gt;Similarity matrix&lt;/h2&gt;
&lt;p&gt;The previous post left off by showing the raw similarity matrix:&lt;/p&gt;
&lt;p&gt;&lt;img src="/images/doc2vec_similarity_matrix.png" alt="Word2vec" style="width: 500px;"/&gt;&lt;/p&gt;
&lt;p&gt;I also noted the (red) square of high similarity scores in the above plot, from &lt;code&gt;id=84&lt;/code&gt; to &lt;code&gt;id=89&lt;/code&gt;. That's the cluster of &lt;a href="https://bytepawn.com/tag/spin.html"&gt;articles about Probabilistic spin glasses&lt;/a&gt;, which all have the same filename prefix (so their integer ids are sequential, and they occupy adjacent rows and columns in the matrix), and have very high similarity scores:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;&lt;a href="https://bytepawn.com/probabilistic-spin-glass.html#probabilistic-spin-glass"&gt;Probabilistic spin glass - Part I&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://bytepawn.com/probabilistic-spin-glass-part-ii.html#probabilistic-spin-glass-part-ii"&gt;Probabilistic spin glass - Part II&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://bytepawn.com/probabilistic-spin-glass-part-iii.html#probabilistic-spin-glass-part-iii"&gt;Probabilistic spin glass - Part III&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://bytepawn.com/probabilistic-spin-glass-part-iv.html#probabilistic-spin-glass-part-iv"&gt;Probabilistic spin glass - Part IV&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://bytepawn.com/probabilistic-spin-glass-part-v.html#probabilistic-spin-glass-part-v"&gt;Probabilistic spin glass - Part V&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://bytepawn.com/probabilistic-spin-glass-conclusion.html#probabilistic-spin-glass-conclusion"&gt;Probabilistic spin glass - Conclusion&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;
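&lt;p&gt;The similarity matrix itself was computed in Part I from the trained document vectors. As a reminder, pairwise cosine similarity can be sketched in pure Python; here synthetic random vectors stand in for the real Doc2Vec embeddings (e.g. &lt;code&gt;model.dv[idx]&lt;/code&gt; in gensim 4.x):&lt;/p&gt;

```python
import math
import random

def cosine(u, v):
    # Cosine similarity between two equal-length vectors.
    dot = sum(a * b for a, b in zip(u, v))
    nu = math.sqrt(sum(a * a for a in u))
    nv = math.sqrt(sum(b * b for b in v))
    return dot / (nu * nv)

def similarity_matrix(vectors):
    # Symmetric matrix of pairwise similarities, with 1.0 on the diagonal.
    return [[cosine(u, v) for v in vectors] for u in vectors]

# Synthetic 100-dimensional "document vectors" standing in for the
# trained Doc2Vec embeddings.
random.seed(0)
vectors = [[random.gauss(0, 1) for _ in range(100)] for _ in range(5)]
m = similarity_matrix(vectors)
print(len(m), len(m[0]))
```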
&lt;h2&gt;Clusters&lt;/h2&gt;
&lt;p&gt;The matrix visualization doesn't work well: it only shows clusters whose members happen to be sequentially ordered, like the one above. We need to look at the matrix as a graph and find connected sub-graphs of similar articles. Plotting the entire graph is pointless, as it would have, for &amp;nbsp; $N=141$ &amp;nbsp; posts, &amp;nbsp; $N^2=19\,881$ &amp;nbsp; edges — nothing would be discernible. We need to reduce the number of edges. After some playing around, I found that keeping only similarities &amp;nbsp; $ s(a, b) &amp;gt; 0.55 $  &amp;nbsp; works well:&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;span class="n"&gt;similarity_matrix_truncated&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[[&lt;/span&gt;&lt;span class="n"&gt;y&lt;/span&gt; &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;y&lt;/span&gt; &lt;span class="o"&gt;&amp;gt;&lt;/span&gt; &lt;span class="mf"&gt;0.55&lt;/span&gt; &lt;span class="k"&gt;else&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt; &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;y&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;x&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;x&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;similarity_matrix&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;
&lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;i&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="nb"&gt;range&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nb"&gt;len&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;similarity_matrix_truncated&lt;/span&gt;&lt;span class="p"&gt;)):&lt;/span&gt;
    &lt;span class="n"&gt;similarity_matrix_truncated&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;i&lt;/span&gt;&lt;span class="p"&gt;][&lt;/span&gt;&lt;span class="n"&gt;i&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;
&lt;span class="n"&gt;plot_matrix&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;similarity_matrix_truncated&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/pre&gt;&lt;/div&gt;


&lt;p&gt;The code also zeros out the diagonal elements, to avoid plotting self-loops on the graph. First, let's see what this truncated matrix looks like as a heatmap:&lt;/p&gt;
&lt;p&gt;&lt;img src="/images/doc2vec_similarity_matrix_truncated_1.png" alt="Word2vec" style="width: 500px;"/&gt;&lt;/p&gt;
&lt;p&gt;Looks promising. Now let's plot the graph:&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="nn"&gt;pandas&lt;/span&gt; &lt;span class="kn"&gt;as&lt;/span&gt; &lt;span class="nn"&gt;pd&lt;/span&gt;
&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="nn"&gt;networkx&lt;/span&gt; &lt;span class="kn"&gt;as&lt;/span&gt; &lt;span class="nn"&gt;nx&lt;/span&gt;
&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="nn"&gt;matplotlib.pyplot&lt;/span&gt; &lt;span class="kn"&gt;as&lt;/span&gt; &lt;span class="nn"&gt;plt&lt;/span&gt;

&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;plot_graph&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;m&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
    &lt;span class="n"&gt;labels&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nb"&gt;list&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nb"&gt;range&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nb"&gt;len&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;m&lt;/span&gt;&lt;span class="p"&gt;)))&lt;/span&gt;
    &lt;span class="n"&gt;df&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;pd&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;DataFrame&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;m&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;index&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;labels&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;columns&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;labels&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="n"&gt;G&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;nx&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;from_pandas_adjacency&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;df&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="n"&gt;plt&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;figure&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;figsize&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;15&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;15&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;
    &lt;span class="n"&gt;nx&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;draw_networkx&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;G&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;nx&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;spring_layout&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;G&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt; &lt;span class="n"&gt;node_size&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;arrows&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="bp"&gt;False&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;edge_color&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;lightgray&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;font_size&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mi"&gt;8&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="n"&gt;plt&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;show&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;

&lt;span class="n"&gt;plot_graph&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;similarity_matrix_truncated&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/pre&gt;&lt;/div&gt;


&lt;p&gt;Output:&lt;/p&gt;
&lt;p&gt;&lt;img src="/images/doc2vec_graph_1.png" alt="Word2vec" style="width: 800px;"/&gt;&lt;/p&gt;
&lt;p&gt;By reading off the ids for some of the clusters, we can check what they are:&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;print([
        [idx_lookup[63], idx_lookup[64]],
        [idx_lookup[3],  idx_lookup[9]],
        [idx_lookup[37], idx_lookup[47], idx_lookup[48], idx_lookup[106], idx_lookup[133]],
])
&lt;/pre&gt;&lt;/div&gt;


&lt;p&gt;Output:&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;
    &lt;span class="c1"&gt;# good cluster, these 2 articles are about GPT-3&lt;/span&gt;
    &lt;span class="p"&gt;[&lt;/span&gt;
        &lt;span class="s1"&gt;&amp;#39;how-good-a-data-scientist-is-gpt-3&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="s1"&gt;&amp;#39;how-good-a-particle-physicist-is-gpt-3&amp;#39;&lt;/span&gt;
    &lt;span class="p"&gt;],&lt;/span&gt;
    &lt;span class="c1"&gt;# good cluster, these 2 articles are about A/B testing on social networks&lt;/span&gt;
    &lt;span class="p"&gt;[&lt;/span&gt;
        &lt;span class="s1"&gt;&amp;#39;ab-testing-and-network-effects&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="s1"&gt;&amp;#39;ab-testing-on-social-networks&amp;#39;&lt;/span&gt;
    &lt;span class="p"&gt;],&lt;/span&gt;
    &lt;span class="c1"&gt;# good cluster, these 5 articles are about entropy&lt;/span&gt;
    &lt;span class="p"&gt;[&lt;/span&gt;
        &lt;span class="s1"&gt;&amp;#39;cross-entropy-joint-entropy-conditional-entropy-and-relative-entropy&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="s1"&gt;&amp;#39;entropy-in-data-science&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="s1"&gt;&amp;#39;entropy-of-an-ideal-gas-with-coarse-graining&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="s1"&gt;&amp;#39;the-physical-sackur-tetrode-entropy-of-an-ideal-gas&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="s1"&gt;&amp;#39;what-is-the-entropy-of-a-fair-coin-toss&amp;#39;&lt;/span&gt;
    &lt;span class="p"&gt;]&lt;/span&gt;
&lt;span class="p"&gt;]&lt;/span&gt;
&lt;/pre&gt;&lt;/div&gt;
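&lt;p&gt;Here &lt;code&gt;idx_lookup&lt;/code&gt; maps the integer post ids back to slugs; it comes from the previous post's code. A minimal sketch of how such a lookup can be built, with a hypothetical stand-in &lt;code&gt;posts&lt;/code&gt; list in place of the real post metadata:&lt;/p&gt;

```python
# A hypothetical stand-in for the real post list; in the notebook,
# posts are read from the blog's markdown sources.
posts = [
    {'slug': 'how-good-a-data-scientist-is-gpt-3'},
    {'slug': 'how-good-a-particle-physicist-is-gpt-3'},
]

# Map each post's integer id back to its slug, so cluster members
# can be printed in human-readable form.
idx_lookup = {idx: post['slug'] for idx, post in enumerate(posts)}
print(idx_lookup[0])
```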


&lt;p&gt;The results continue to look promising. Instead of checking these clusters manually, let's automate it:&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;clusters&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;m&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
    &lt;span class="n"&gt;connected_subgraphs&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nb"&gt;set&lt;/span&gt;&lt;span class="p"&gt;([&lt;/span&gt;&lt;span class="nb"&gt;frozenset&lt;/span&gt;&lt;span class="p"&gt;([&lt;/span&gt;&lt;span class="n"&gt;i&lt;/span&gt;&lt;span class="p"&gt;])&lt;/span&gt; &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;i&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="nb"&gt;list&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nb"&gt;range&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nb"&gt;len&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;m&lt;/span&gt;&lt;span class="p"&gt;)))])&lt;/span&gt;
    &lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;update_subgraphs&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;i&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;j&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
        &lt;span class="n"&gt;join_targets&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;g&lt;/span&gt; &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;g&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;connected_subgraphs&lt;/span&gt; &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;i&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;g&lt;/span&gt; &lt;span class="ow"&gt;or&lt;/span&gt; &lt;span class="n"&gt;j&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;g&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;
        &lt;span class="n"&gt;new_subgraph&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nb"&gt;frozenset&lt;/span&gt;&lt;span class="p"&gt;({&lt;/span&gt;&lt;span class="n"&gt;i&lt;/span&gt; &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;g&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;join_targets&lt;/span&gt; &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;i&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;g&lt;/span&gt;&lt;span class="p"&gt;})&lt;/span&gt;
        &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;g&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;join_targets&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
            &lt;span class="n"&gt;connected_subgraphs&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;remove&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;g&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
        &lt;span class="n"&gt;connected_subgraphs&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;add&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;new_subgraph&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;i&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="nb"&gt;range&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nb"&gt;len&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;m&lt;/span&gt;&lt;span class="p"&gt;)):&lt;/span&gt;
        &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;j&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="nb"&gt;range&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nb"&gt;len&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;m&lt;/span&gt;&lt;span class="p"&gt;)):&lt;/span&gt;
            &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;m&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;i&lt;/span&gt;&lt;span class="p"&gt;][&lt;/span&gt;&lt;span class="n"&gt;j&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="o"&gt;&amp;gt;&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
                &lt;span class="n"&gt;update_subgraphs&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;i&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;j&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;connected_subgraphs&lt;/span&gt;

&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;print_clusters&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;connected_subgraphs&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
    &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;g&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;connected_subgraphs&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
        &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="nb"&gt;len&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;g&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;==&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
            &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;i&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;g&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
                &lt;span class="k"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;f&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;Singleton: {idx_lookup[i]}&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;g&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;connected_subgraphs&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
        &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="nb"&gt;len&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;g&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;&amp;gt;&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
            &lt;span class="k"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;Cluster:&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
            &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;i&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;g&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
                &lt;span class="k"&gt;print&lt;/span&gt;&lt;span class="p"&gt;({&lt;/span&gt;&lt;span class="n"&gt;idx_lookup&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;i&lt;/span&gt;&lt;span class="p"&gt;]})&lt;/span&gt;
&lt;/pre&gt;&lt;/div&gt;
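&lt;p&gt;The merging logic above is essentially incremental connected-components. As a sanity check (not part of the original post), the same logic can be exercised on a hand-built toy adjacency matrix:&lt;/p&gt;

```python
# Self-contained restatement of the clusters() logic above, run on a toy
# 4-node adjacency matrix: node 0 connects to node 1, node 2 connects to
# node 3, and there are no other edges.
def clusters(m):
    # start with each node in its own singleton subgraph
    connected_subgraphs = set(frozenset([i]) for i in range(len(m)))
    def update_subgraphs(i, j):
        # merge every subgraph that contains i or j into one
        join_targets = [g for g in connected_subgraphs if i in g or j in g]
        new_subgraph = frozenset(k for g in join_targets for k in g)
        for g in join_targets:
            connected_subgraphs.remove(g)
        connected_subgraphs.add(new_subgraph)
    for i in range(len(m)):
        for j in range(len(m)):
            if m[i][j] > 0:
                update_subgraphs(i, j)
    return connected_subgraphs

toy = [
    [0, 1, 0, 0],
    [1, 0, 0, 0],
    [0, 0, 0, 1],
    [0, 0, 1, 0],
]
components = clusters(toy)
# yields two components, one holding nodes 0 and 1, the other nodes 2 and 3
```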


&lt;p&gt;And now let's print them:&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;span class="n"&gt;c&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;clusters&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;similarity_matrix_truncated&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="n"&gt;print_clusters&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;c&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/pre&gt;&lt;/div&gt;


&lt;p&gt;Output:&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;span class="n"&gt;Singleton&lt;/span&gt;&lt;span class="o"&gt;:&lt;/span&gt; &lt;span class="n"&gt;understanding&lt;/span&gt;&lt;span class="o"&gt;-&lt;/span&gt;&lt;span class="n"&gt;facebook&lt;/span&gt;&lt;span class="o"&gt;-&lt;/span&gt;&lt;span class="n"&gt;planout&lt;/span&gt;&lt;span class="o"&gt;-&lt;/span&gt;&lt;span class="n"&gt;ab&lt;/span&gt;&lt;span class="o"&gt;-&lt;/span&gt;&lt;span class="n"&gt;testing&lt;/span&gt;&lt;span class="o"&gt;-&lt;/span&gt;&lt;span class="n"&gt;framework&lt;/span&gt;
&lt;span class="n"&gt;Singleton&lt;/span&gt;&lt;span class="o"&gt;:&lt;/span&gt; &lt;span class="n"&gt;ab&lt;/span&gt;&lt;span class="o"&gt;-&lt;/span&gt;&lt;span class="n"&gt;tests&lt;/span&gt;&lt;span class="o"&gt;-&lt;/span&gt;&lt;span class="n"&gt;moving&lt;/span&gt;&lt;span class="o"&gt;-&lt;/span&gt;&lt;span class="n"&gt;fast&lt;/span&gt;&lt;span class="o"&gt;-&lt;/span&gt;&lt;span class="n"&gt;vs&lt;/span&gt;&lt;span class="o"&gt;-&lt;/span&gt;&lt;span class="n"&gt;being&lt;/span&gt;&lt;span class="o"&gt;-&lt;/span&gt;&lt;span class="n"&gt;sure&lt;/span&gt;
&lt;span class="n"&gt;Singleton&lt;/span&gt;&lt;span class="o"&gt;:&lt;/span&gt; &lt;span class="n"&gt;the&lt;/span&gt;&lt;span class="o"&gt;-&lt;/span&gt;&lt;span class="n"&gt;culture&lt;/span&gt;&lt;span class="o"&gt;-&lt;/span&gt;&lt;span class="n"&gt;map&lt;/span&gt;
&lt;span class="n"&gt;Singleton&lt;/span&gt;&lt;span class="o"&gt;:&lt;/span&gt; &lt;span class="n"&gt;more&lt;/span&gt;&lt;span class="o"&gt;-&lt;/span&gt;&lt;span class="n"&gt;data&lt;/span&gt;&lt;span class="o"&gt;-&lt;/span&gt;&lt;span class="n"&gt;scientists&lt;/span&gt;&lt;span class="o"&gt;-&lt;/span&gt;&lt;span class="n"&gt;should&lt;/span&gt;&lt;span class="o"&gt;-&lt;/span&gt;&lt;span class="n"&gt;learn&lt;/span&gt;&lt;span class="o"&gt;-&lt;/span&gt;&lt;span class="n"&gt;sql&lt;/span&gt;
&lt;span class="o"&gt;...&lt;/span&gt;
&lt;span class="n"&gt;Cluster&lt;/span&gt;&lt;span class="o"&gt;:&lt;/span&gt;
&lt;span class="o"&gt;{&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;aa-testing-and-false-positives-with-cuped&amp;#39;&lt;/span&gt;&lt;span class="o"&gt;}&lt;/span&gt;
&lt;span class="o"&gt;{&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;correlations-seasonality-lift-and-cuped&amp;#39;&lt;/span&gt;&lt;span class="o"&gt;}&lt;/span&gt;
&lt;span class="o"&gt;{&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;reducing-variance-in-ab-testing-with-cuped&amp;#39;&lt;/span&gt;&lt;span class="o"&gt;}&lt;/span&gt;
&lt;span class="o"&gt;{&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;reducing-variance-in-conversion-ab-testing-with-cuped&amp;#39;&lt;/span&gt;&lt;span class="o"&gt;}&lt;/span&gt;
&lt;span class="o"&gt;{&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;ab-testing-and-multi-armed-bandits&amp;#39;&lt;/span&gt;&lt;span class="o"&gt;}&lt;/span&gt;
&lt;span class="o"&gt;{&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;multi-armed-bandits-and-false-positives&amp;#39;&lt;/span&gt;&lt;span class="o"&gt;}&lt;/span&gt;
&lt;span class="o"&gt;{&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;ab-testing-and-the-historic-lift-paradox&amp;#39;&lt;/span&gt;&lt;span class="o"&gt;}&lt;/span&gt;
&lt;span class="o"&gt;{&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;five-ways-to-reduce-variance-in-ab-testing&amp;#39;&lt;/span&gt;&lt;span class="o"&gt;}&lt;/span&gt;
&lt;span class="n"&gt;Cluster&lt;/span&gt;&lt;span class="o"&gt;:&lt;/span&gt;
&lt;span class="o"&gt;{&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;how-good-a-particle-physicist-is-gpt-3&amp;#39;&lt;/span&gt;&lt;span class="o"&gt;}&lt;/span&gt;
&lt;span class="o"&gt;{&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;how-good-a-data-scientist-is-gpt-3&amp;#39;&lt;/span&gt;&lt;span class="o"&gt;}&lt;/span&gt;
&lt;span class="n"&gt;Cluster&lt;/span&gt;&lt;span class="o"&gt;:&lt;/span&gt;
&lt;span class="o"&gt;{&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;exploring-prior-beliefs-with-mcmc&amp;#39;&lt;/span&gt;&lt;span class="o"&gt;}&lt;/span&gt;
&lt;span class="o"&gt;{&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;food-deliveries-bayes-and-computational-statistics&amp;#39;&lt;/span&gt;&lt;span class="o"&gt;}&lt;/span&gt;
&lt;span class="n"&gt;Cluster&lt;/span&gt;&lt;span class="o"&gt;:&lt;/span&gt;
&lt;span class="o"&gt;{&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;solving-openai-gym-classic-control-problems-with-pytorch&amp;#39;&lt;/span&gt;&lt;span class="o"&gt;}&lt;/span&gt;
&lt;span class="o"&gt;{&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;using-simulated-self-play-to-solve-all-openai-gym-classic-control-problems-with-pytorch&amp;#39;&lt;/span&gt;&lt;span class="o"&gt;}&lt;/span&gt;
&lt;span class="o"&gt;{&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;solving-the-cartpole-reinforcement-learning-problem-with-pytorch&amp;#39;&lt;/span&gt;&lt;span class="o"&gt;}&lt;/span&gt;
&lt;span class="o"&gt;...&lt;/span&gt;
&lt;/pre&gt;&lt;/div&gt;


&lt;p&gt;It still looks good! However, this approach isn't the most representative: many posts are left as singletons, because all their similarity weights are below the cutoff $ s(a, b) &amp;gt; 0.55 $. In practice, the recommendation engine would still return the top $n$ recommendations for them. A better approach is to keep the top $n$ post recommendations (edges) for each post (node), irrespective of weight. Let's start with $n=3$:&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;span class="c1"&gt;# another approach, to avoid singletons:&lt;/span&gt;
&lt;span class="c1"&gt;# for each post, get the top n=3 most similar&lt;/span&gt;
&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;similarity_matrix_top_n&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;n&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mi"&gt;3&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
    &lt;span class="n"&gt;m&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[[&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt; &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;_&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="nb"&gt;range&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nb"&gt;len&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;posts&lt;/span&gt;&lt;span class="p"&gt;))]&lt;/span&gt; &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;_&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="nb"&gt;range&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nb"&gt;len&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;posts&lt;/span&gt;&lt;span class="p"&gt;))]&lt;/span&gt;
    &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;i&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="nb"&gt;list&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nb"&gt;range&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nb"&gt;len&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;m&lt;/span&gt;&lt;span class="p"&gt;))):&lt;/span&gt;
        &lt;span class="n"&gt;sp&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;similar_posts&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;i&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
        &lt;span class="n"&gt;idxs&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;tagged_posts&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;p&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;]][&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;][&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;p&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;sp&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;
        &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;k&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;j&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="nb"&gt;enumerate&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;idxs&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
            &lt;span class="n"&gt;m&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;i&lt;/span&gt;&lt;span class="p"&gt;][&lt;/span&gt;&lt;span class="n"&gt;j&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;sp&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;k&lt;/span&gt;&lt;span class="p"&gt;][&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;
    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;m&lt;/span&gt;

&lt;span class="n"&gt;similarity_matrix_truncated&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;similarity_matrix_top_n&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;n&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mi"&gt;3&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="n"&gt;plot_matrix&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;similarity_matrix_truncated&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="n"&gt;plot_graph&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;similarity_matrix_truncated&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/pre&gt;&lt;/div&gt;
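&lt;p&gt;The function above depends on the &lt;code&gt;similar_posts()&lt;/code&gt; helper and the &lt;code&gt;tagged_posts&lt;/code&gt; lookup from earlier in the post. As a self-contained sketch of the same top-$n$ truncation idea, here is a variant that operates directly on a dense similarity matrix (the &lt;code&gt;truncate_top_n&lt;/code&gt; name and the toy matrix are mine, not from the post):&lt;/p&gt;

```python
# Keep only each row's top n off-diagonal similarity scores,
# zeroing out everything else.
def truncate_top_n(sim, n=3):
    size = len(sim)
    out = [[0.0] * size for _ in range(size)]
    for i in range(size):
        # rank the other posts by similarity to post i, highest first
        others = sorted((j for j in range(size) if j != i),
                        key=lambda j: sim[i][j], reverse=True)
        for j in others[:n]:
            out[i][j] = sim[i][j]
    return out

sim = [
    [1.0, 0.9, 0.2, 0.1],
    [0.9, 1.0, 0.3, 0.4],
    [0.2, 0.3, 1.0, 0.8],
    [0.1, 0.4, 0.8, 1.0],
]
truncated = truncate_top_n(sim, n=1)
# row 0 keeps only its strongest neighbor, post 1
```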


&lt;p&gt;The heatmap:&lt;/p&gt;
&lt;p&gt;&lt;img src="/images/doc2vec_similarity_matrix_truncated_2.png" alt="Word2vec" style="width: 500px;"/&gt;&lt;/p&gt;
&lt;p&gt;The graph:&lt;/p&gt;
&lt;p&gt;&lt;img src="/images/doc2vec_graph_2.png" alt="Word2vec" style="width: 800px;"/&gt;&lt;/p&gt;
&lt;p&gt;The graph is now fully connected, which makes sense. The challenge now is that it's again hard to make sense of the adjacencies. Let's re-run, but only keep the top $n=2$:&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;span class="n"&gt;similarity_matrix_truncated&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;similarity_matrix_top_n&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;n&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mi"&gt;2&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="n"&gt;plot_graph&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;similarity_matrix_truncated&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/pre&gt;&lt;/div&gt;


&lt;p&gt;&lt;img src="/images/doc2vec_graph_3.png" alt="Word2vec" style="width: 800px;"/&gt;&lt;/p&gt;
&lt;p&gt;Most of the graph is still one big connected cluster; let's try again with $n=1$:&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;span class="n"&gt;similarity_matrix_truncated&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;similarity_matrix_top_n&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;n&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="n"&gt;plot_graph&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;similarity_matrix_truncated&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/pre&gt;&lt;/div&gt;


&lt;p&gt;&lt;img src="/images/doc2vec_graph_4.png" alt="Word2vec" style="width: 800px;"/&gt;&lt;/p&gt;
&lt;p&gt;This now looks interesting. There are reasonably sized clusters, and all posts are in a cluster. Let's look at them explicitly:&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;span class="n"&gt;c&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;clusters&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;similarity_matrix_truncated&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="n"&gt;print_clusters&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;c&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/pre&gt;&lt;/div&gt;


&lt;p&gt;Output (note that there are no singletons):&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;Cluster:
{&amp;#39;building-a-toy-python-dataclass-decorator&amp;#39;}
{&amp;#39;rxe-literate-and-composable-regular-expressions&amp;#39;}
{&amp;#39;building-a-simple-python-enum-class-part-ii&amp;#39;}
{&amp;#39;building-a-simple-python-enum-class&amp;#39;}
Cluster:
{&amp;#39;automating-a-call-center-with-machine-learning&amp;#39;}
{&amp;#39;backtraced-growth-accounting&amp;#39;}
{&amp;#39;warehouse-locations-with-kmeans&amp;#39;}
{&amp;#39;machine-learning-at-fetchr&amp;#39;}
{&amp;#39;calibration-curves-for-delivery-prediction-with-scikit-learn&amp;#39;}
{&amp;#39;arabic-name-classification-with-scikit-learn-and-pytorch&amp;#39;}
Cluster:
{&amp;#39;the-german-tank-problem-in-world-war-ii&amp;#39;}
{&amp;#39;beyond-the-central-limit-theorem&amp;#39;}
{&amp;#39;ab-testing-and-the-ttest&amp;#39;}
{&amp;#39;ab-testing-and-the-ztest&amp;#39;}
{&amp;#39;ab-testing-and-the-central-limit-theorem&amp;#39;}
...
&lt;/pre&gt;&lt;/div&gt;


&lt;p&gt;Note that I'm not implying here that I would only use the top $n=1$ recommendation. This is just to see what kind of clusters are formed by the similarity scores, and whether they make sense — it's verification.&lt;/p&gt;
&lt;h2&gt;Conclusion&lt;/h2&gt;
&lt;p&gt;The similarities look good — I see no reason to invest any more time into this part. In Part III, I will add the recommendations to the website!&lt;/p&gt;</content><category term="similarity"></category><category term="python"></category><category term="gensim"></category><category term="word2vec"></category><category term="doc2vec"></category><category term="pyml"></category></entry><entry><title>Similar posts recommendation with Doc2Vec - Part I</title><link href="https://bytepawn.com/similar-posts-recommendation-with-doc2vec.html" rel="alternate"></link><published>2022-12-03T00:00:00+01:00</published><updated>2022-12-03T00:00:00+01:00</updated><author><name>Marton Trencseni</name></author><id>tag:bytepawn.com,2022-12-03:/similar-posts-recommendation-with-doc2vec.html</id><summary type="html">&lt;p&gt;One of the things I learned at Facebook is the power of recommendations. Examples are People You May Know (PYMK), Groups You May Like (GYML) and Pages You May Like (PYML). Inspired by these, I am planning to add an Articles You May Like widget to Bytepawn, based on the semantic similarity of blog posts. I use the Doc2Vec neural network architecture to compute the similarity between my blog posts, and return the top 3 recommendations for each page.&lt;br/&gt;&lt;br/&gt; &lt;img src="/images/doc2vec_similarity_matrix.png" alt="." style="width: 400px;"/&gt;&lt;/p&gt;</summary><content type="html">&lt;h2&gt;Introduction&lt;/h2&gt;
&lt;p&gt;One of the things I learned at Facebook is the power of recommendations. Examples are &lt;strong&gt;People You May Know (PYMK), Groups You May Like (GYML)&lt;/strong&gt; and &lt;strong&gt;Pages You May Like (PYML)&lt;/strong&gt;. These products (little boxes) are all over Facebook and are one of the core drivers of engagement. For example, here's a snapshot of my current &lt;strong&gt;People You May Know&lt;/strong&gt; recommendation (&lt;a href="https://facebook.com/friends"&gt;check yours here&lt;/a&gt;):&lt;/p&gt;
&lt;p&gt;&lt;img src="/images/pymk_facebook.jpg" alt="People You May Know from Facebook" style="width: 500px;"/&gt;&lt;/p&gt;
&lt;p&gt;Inspired by these, I am planning to add &lt;strong&gt;Articles You May Like&lt;/strong&gt; recommendations to Bytepawn (this blog), based on the semantic similarity of my blog posts. I use the &lt;a href="https://arxiv.org/abs/1405.4053v2"&gt;Doc2Vec&lt;/a&gt; neural network architecture from the &lt;a href="https://github.com/RaRe-Technologies/gensim"&gt;gensim&lt;/a&gt; library to compute the similarity between my blog posts, and return the top 3 recommendations for each page. &lt;a href="https://github.com/mtrencseni/playground/blob/master/Similar%20posts%20recommendation%20with%20Doc2Vec%20-%20Part%20I.ipynb"&gt;The code is up on Github.&lt;/a&gt;&lt;/p&gt;
&lt;h2&gt;Extracting post contents&lt;/h2&gt;
&lt;p&gt;The first step is easy: for each post, I want to get an &lt;code&gt;id&lt;/code&gt; and the &lt;code&gt;contents&lt;/code&gt; of the post. Bytepawn is based on &lt;a href="https://getpelican.com/"&gt;Pelican&lt;/a&gt;, a static blog generator, which means each post is a Markdown file, structured like:&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;span class="n"&gt;Title&lt;/span&gt;&lt;span class="o"&gt;:&lt;/span&gt; &lt;span class="n"&gt;Similar&lt;/span&gt; &lt;span class="n"&gt;posts&lt;/span&gt; &lt;span class="n"&gt;recommendation&lt;/span&gt; &lt;span class="k"&gt;with&lt;/span&gt; &lt;span class="n"&gt;Doc2Vec&lt;/span&gt; &lt;span class="o"&gt;-&lt;/span&gt; &lt;span class="n"&gt;Part&lt;/span&gt; &lt;span class="n"&gt;I&lt;/span&gt;
&lt;span class="n"&gt;Date&lt;/span&gt;&lt;span class="o"&gt;:&lt;/span&gt; &lt;span class="mi"&gt;2022&lt;/span&gt;&lt;span class="o"&gt;-&lt;/span&gt;&lt;span class="mi"&gt;12&lt;/span&gt;&lt;span class="o"&gt;-&lt;/span&gt;&lt;span class="mi"&gt;03&lt;/span&gt;
&lt;span class="n"&gt;Modified&lt;/span&gt;&lt;span class="o"&gt;:&lt;/span&gt; &lt;span class="mi"&gt;2022&lt;/span&gt;&lt;span class="o"&gt;-&lt;/span&gt;&lt;span class="mi"&gt;12&lt;/span&gt;&lt;span class="o"&gt;-&lt;/span&gt;&lt;span class="mi"&gt;03&lt;/span&gt;
&lt;span class="n"&gt;Category&lt;/span&gt;&lt;span class="o"&gt;:&lt;/span&gt; &lt;span class="n"&gt;machine&lt;/span&gt;&lt;span class="o"&gt;-&lt;/span&gt;&lt;span class="n"&gt;learning&lt;/span&gt;
&lt;span class="n"&gt;Tags&lt;/span&gt;&lt;span class="o"&gt;:&lt;/span&gt; &lt;span class="n"&gt;similarity&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt; &lt;span class="n"&gt;python&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt; &lt;span class="n"&gt;gensim&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt; &lt;span class="n"&gt;word2vec&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt; &lt;span class="n"&gt;doc2vec&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt; &lt;span class="n"&gt;pyml&lt;/span&gt;
&lt;span class="n"&gt;Slug&lt;/span&gt;&lt;span class="o"&gt;:&lt;/span&gt; &lt;span class="n"&gt;similar&lt;/span&gt;&lt;span class="o"&gt;-&lt;/span&gt;&lt;span class="n"&gt;posts&lt;/span&gt;&lt;span class="o"&gt;-&lt;/span&gt;&lt;span class="n"&gt;recommendation&lt;/span&gt;&lt;span class="o"&gt;-&lt;/span&gt;&lt;span class="k"&gt;with&lt;/span&gt;&lt;span class="o"&gt;-&lt;/span&gt;&lt;span class="n"&gt;doc2vec&lt;/span&gt;
&lt;span class="n"&gt;Authors&lt;/span&gt;&lt;span class="o"&gt;:&lt;/span&gt; &lt;span class="n"&gt;Marton&lt;/span&gt; &lt;span class="n"&gt;Trencseni&lt;/span&gt;
&lt;span class="n"&gt;Summary&lt;/span&gt;&lt;span class="o"&gt;:&lt;/span&gt; &lt;span class="n"&gt;One&lt;/span&gt; &lt;span class="n"&gt;of&lt;/span&gt; &lt;span class="n"&gt;the&lt;/span&gt; &lt;span class="n"&gt;things&lt;/span&gt; &lt;span class="n"&gt;I&lt;/span&gt; &lt;span class="n"&gt;learned&lt;/span&gt; &lt;span class="n"&gt;at&lt;/span&gt; &lt;span class="n"&gt;Facebook&lt;/span&gt; &lt;span class="k"&gt;is&lt;/span&gt; &lt;span class="n"&gt;the&lt;/span&gt; &lt;span class="n"&gt;power&lt;/span&gt; &lt;span class="n"&gt;of&lt;/span&gt; &lt;span class="n"&gt;recommendations&lt;/span&gt;&lt;span class="o"&gt;...&lt;/span&gt;
&lt;span class="n"&gt;Image&lt;/span&gt;&lt;span class="o"&gt;:&lt;/span&gt; &lt;span class="sr"&gt;/images/&lt;/span&gt;&lt;span class="n"&gt;doc2vec_similarity_matrix&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;png&lt;/span&gt;

&lt;span class="n"&gt;Introduction&lt;/span&gt;
&lt;span class="o"&gt;------------&lt;/span&gt;
&lt;span class="n"&gt;One&lt;/span&gt; &lt;span class="n"&gt;of&lt;/span&gt; &lt;span class="n"&gt;the&lt;/span&gt; &lt;span class="n"&gt;things&lt;/span&gt; &lt;span class="n"&gt;I&lt;/span&gt; &lt;span class="n"&gt;learned&lt;/span&gt; &lt;span class="n"&gt;at&lt;/span&gt; &lt;span class="n"&gt;Facebook&lt;/span&gt; &lt;span class="k"&gt;is&lt;/span&gt; &lt;span class="n"&gt;the&lt;/span&gt; &lt;span class="n"&gt;power&lt;/span&gt; &lt;span class="n"&gt;of&lt;/span&gt; &lt;span class="n"&gt;recommendations&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;
&lt;span class="o"&gt;...&lt;/span&gt;
&lt;/pre&gt;&lt;/div&gt;


&lt;p&gt;Parsing these files is trivial, using the slugs as ids:&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="nn"&gt;os&lt;/span&gt;

&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;build_post_struct&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;lines&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
    &lt;span class="n"&gt;slug&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nb"&gt;next&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;line&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="nb"&gt;len&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;slug:&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;):]&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;strip&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt; &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;line&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;lines&lt;/span&gt;&lt;span class="p"&gt;[:&lt;/span&gt;&lt;span class="mi"&gt;10&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;line&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;lower&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;startswith&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;slug:&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;
    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;slug&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;slug&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s1"&gt;&amp;#39;contents&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="s1"&gt;&amp;#39;&lt;/span&gt;&lt;span class="se"&gt;\n&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;join&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;lines&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="mi"&gt;10&lt;/span&gt;&lt;span class="p"&gt;:])&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;lower&lt;/span&gt;&lt;span class="p"&gt;()}&lt;/span&gt;

&lt;span class="n"&gt;BLOG_DIR&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="s2"&gt;&amp;quot;d:/code/blog/content&amp;quot;&lt;/span&gt;
&lt;span class="n"&gt;paths&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;f&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;{BLOG_DIR}/{f}&amp;#39;&lt;/span&gt; &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;f&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;os&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;listdir&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;BLOG_DIR&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;f&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;lower&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;endswith&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s2"&gt;&amp;quot;.md&amp;quot;&lt;/span&gt;&lt;span class="p"&gt;)]&lt;/span&gt;
&lt;span class="n"&gt;posts&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;build_post_struct&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nb"&gt;open&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;path&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;encoding&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="s2"&gt;&amp;quot;utf8&amp;quot;&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;read&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;splitlines&lt;/span&gt;&lt;span class="p"&gt;())&lt;/span&gt; &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;path&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;paths&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;
&lt;/pre&gt;&lt;/div&gt;
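&lt;p&gt;To illustrate the parsing step on a made-up post (the sample header values below are invented for this example):&lt;/p&gt;

```python
# Toy demonstration of build_post_struct() from above; the metadata block
# mirrors the Pelican header format shown earlier.
def build_post_struct(lines):
    # the slug is assumed to appear in the first 10 lines of the file
    slug = next(line[len('slug:'):].strip() for line in lines[:10]
                if line.lower().startswith('slug:'))
    return {'slug': slug, 'contents': '\n'.join(lines[10:]).lower()}

sample = '''Title: An Example Post
Date: 2022-12-03
Modified: 2022-12-03
Category: machine-learning
Tags: example
Slug: an-example-post
Authors: Marton Trencseni
Summary: A made-up post.
Image: /images/example.png

Introduction
------------
Body text here.'''

post = build_post_struct(sample.splitlines())
print(post['slug'])  # an-example-post
```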


&lt;p&gt;Given this post structure, building a model for the similarity recommendation is shockingly simple: it takes just 3 lines of Python:&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="nn"&gt;nltk.tokenize&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;word_tokenize&lt;/span&gt;
&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="nn"&gt;gensim.models.doc2vec&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;Doc2Vec&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;TaggedDocument&lt;/span&gt;

&lt;span class="n"&gt;tagged_posts&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="n"&gt;post&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;slug&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;TaggedDocument&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;word_tokenize&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;post&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;contents&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;]),&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;idx&lt;/span&gt;&lt;span class="p"&gt;])&lt;/span&gt; &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;idx&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;post&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="nb"&gt;enumerate&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;posts&lt;/span&gt;&lt;span class="p"&gt;)}&lt;/span&gt;
&lt;span class="n"&gt;idx_lookup&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="n"&gt;idx&lt;/span&gt; &lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;post&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;slug&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;idx&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;post&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="nb"&gt;enumerate&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;posts&lt;/span&gt;&lt;span class="p"&gt;)}&lt;/span&gt;
&lt;span class="n"&gt;model&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;Doc2Vec&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;tagged_posts&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;values&lt;/span&gt;&lt;span class="p"&gt;(),&lt;/span&gt; &lt;span class="n"&gt;vector_size&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mi"&gt;100&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;alpha&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mf"&gt;0.025&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;min_count&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;workers&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mi"&gt;16&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;epochs&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mi"&gt;100&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/pre&gt;&lt;/div&gt;


&lt;p&gt;In a few seconds the model converges and is ready to use.&lt;/p&gt;
&lt;h2&gt;Doc2Vec&lt;/h2&gt;
&lt;p&gt;What's actually going on in those three lines of code? Three library functions are called: &lt;code&gt;word_tokenize()&lt;/code&gt;, &lt;code&gt;TaggedDocument()&lt;/code&gt; and &lt;code&gt;Doc2Vec()&lt;/code&gt;.&lt;/p&gt;
&lt;p&gt;&lt;code&gt;word_tokenize()&lt;/code&gt; is simple: it just breaks the text into words:&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;span class="n"&gt;word_tokenize&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;hello world from dubai&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="o"&gt;&amp;gt;&lt;/span&gt;  &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;hello&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s1"&gt;&amp;#39;world&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s1"&gt;&amp;#39;from&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s1"&gt;&amp;#39;dubai&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;
&lt;/pre&gt;&lt;/div&gt;


&lt;p&gt;Next, the &lt;code&gt;TaggedDocument()&lt;/code&gt; constructor simply stores the tokenized text together with a numeric id that identifies the document:&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;span class="n"&gt;TaggedDocument&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;word_tokenize&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;hello world from dubai&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;])&lt;/span&gt;
&lt;span class="o"&gt;&amp;gt;&lt;/span&gt; &lt;span class="n"&gt;TaggedDocument&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;words&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;hello&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s1"&gt;&amp;#39;world&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s1"&gt;&amp;#39;from&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s1"&gt;&amp;#39;dubai&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt; &lt;span class="n"&gt;tags&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;])&lt;/span&gt;
&lt;/pre&gt;&lt;/div&gt;


&lt;p&gt;Other than tokenization, nothing useful has happened yet. The actual model building happens in the &lt;code&gt;Doc2Vec()&lt;/code&gt; constructor. This is an implementation of the neural network architecture described by &lt;a href="https://research.google/people/QuocLe/"&gt;Le&lt;/a&gt; and &lt;a href="https://en.wikipedia.org/wiki/Tom%C3%A1%C5%A1_Mikolov"&gt;Mikolov&lt;/a&gt; (both at Google at the time) in &lt;a href="https://arxiv.org/abs/1405.4053v2"&gt;Distributed Representations of Sentences and Documents&lt;/a&gt;. Doc2Vec is very similar to Word2vec, also invented by Mikolov.&lt;/p&gt;
&lt;p&gt;In Word2vec, a neural network is trained to predict the next word in a set of documents. For example, if the input length is 3, &lt;code&gt;['hello', 'world', 'from'] -&amp;gt; 'dubai'&lt;/code&gt; could be one training point. The basic Word2vec architecture, from the paper:&lt;/p&gt;
&lt;p&gt;&lt;img src="/images/word2vec.png" alt="Word2vec" style="width: 500px;"/&gt;&lt;/p&gt;
&lt;p&gt;The network is trained with stochastic gradient descent, and once it's good enough, the hidden layer vector for each &lt;strong&gt;word&lt;/strong&gt; is extracted.&lt;/p&gt;
&lt;p&gt;Doc2Vec is a minor modification of Word2vec, where an additional &lt;code&gt;document_id&lt;/code&gt; is fed to the network:&lt;/p&gt;
&lt;p&gt;&lt;img src="/images/doc2vec.png" alt="Word2vec" style="width: 500px;"/&gt;&lt;/p&gt;
&lt;p&gt;Note: this is the &lt;code&gt;document_id&lt;/code&gt; that's &lt;code&gt;0&lt;/code&gt; in the &lt;code&gt;TaggedDocument()&lt;/code&gt; constructor in the previous example.&lt;/p&gt;
&lt;p&gt;So the network has to learn word embeddings (as in Word2vec) that are shared across documents, while the document-specific context needed to pick the right next word has to be learnt (and encoded) into the document vector. To be clear: the vector for the word &lt;code&gt;hello&lt;/code&gt; is the same across all documents. As with Word2vec, the network is trained with stochastic gradient descent, and once it's good enough, the hidden layer vector for each &lt;strong&gt;document&lt;/strong&gt; is extracted.&lt;/p&gt;
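&lt;p&gt;To make the architecture concrete, here is a toy numpy sketch of the PV-DM style forward pass described above. This is purely illustrative: all sizes and names are made up, and the real gensim implementation uses negative sampling and other optimizations.&lt;/p&gt;

```python
import numpy as np

# Toy sketch of the Doc2Vec (PV-DM) forward pass: predict the next word from
# a few context words plus a per-document vector. All sizes are made up.
vocab_size, num_docs, dim = 50, 4, 8
rng = np.random.default_rng(0)
W = rng.normal(size=(vocab_size, dim))  # word vectors, shared across documents
D = rng.normal(size=(num_docs, dim))    # one vector per document
U = rng.normal(size=(dim, vocab_size))  # output (softmax) weights

def predict_next_word(context_word_ids, doc_id):
    # average the context word vectors and add the document vector
    h = W[context_word_ids].mean(axis=0) + D[doc_id]
    logits = h @ U
    e = np.exp(logits - logits.max())   # numerically stable softmax
    return e / e.sum()

probs = predict_next_word([3, 17, 42], doc_id=1)
print(probs.shape)  # a probability distribution over the vocabulary
```

&lt;p&gt;Training adjusts &lt;code&gt;W&lt;/code&gt;, &lt;code&gt;D&lt;/code&gt; and &lt;code&gt;U&lt;/code&gt; with stochastic gradient descent; the rows of &lt;code&gt;D&lt;/code&gt; are the document vectors that get extracted at the end.&lt;/p&gt;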
&lt;h2&gt;Finding similar documents&lt;/h2&gt;
&lt;p&gt;At this point, we can use the model to find similar documents. I wrote a helper function which will be useful down the line:&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;similar_posts&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;which&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;n&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mi"&gt;3&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
    &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="ow"&gt;not&lt;/span&gt; &lt;span class="nb"&gt;isinstance&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;which&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nb"&gt;str&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
        &lt;span class="n"&gt;which&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;idx_lookup&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;which&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;
    &lt;span class="c1"&gt;# at this point which is the slug&lt;/span&gt;
    &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;n&lt;/span&gt; &lt;span class="o"&gt;==&lt;/span&gt; &lt;span class="s1"&gt;&amp;#39;all&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
        &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;model&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;dv&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;most_similar&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;positive&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;model&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;infer_vector&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;tagged_posts&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;which&lt;/span&gt;&lt;span class="p"&gt;][&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;])],&lt;/span&gt; &lt;span class="n"&gt;topn&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="bp"&gt;None&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="n"&gt;results&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;model&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;dv&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;most_similar&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;positive&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;model&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;infer_vector&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;tagged_posts&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;which&lt;/span&gt;&lt;span class="p"&gt;][&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;])],&lt;/span&gt; &lt;span class="n"&gt;topn&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;n&lt;/span&gt;&lt;span class="o"&gt;+&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="n"&gt;results&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[(&lt;/span&gt;&lt;span class="n"&gt;idx_lookup&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;idx&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt; &lt;span class="n"&gt;score&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;idx&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;score&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;results&lt;/span&gt; &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;idx&lt;/span&gt; &lt;span class="o"&gt;!=&lt;/span&gt; &lt;span class="n"&gt;tagged_posts&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;which&lt;/span&gt;&lt;span class="p"&gt;][&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;][&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;]]&lt;/span&gt;
    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;results&lt;/span&gt;&lt;span class="p"&gt;[:&lt;/span&gt;&lt;span class="n"&gt;n&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;
&lt;/pre&gt;&lt;/div&gt;


&lt;p&gt;To keep things readable, I identify documents with slugs (like &lt;code&gt;similar-posts-recommendation-with-doc2vec&lt;/code&gt;), but Doc2Vec needs integers, so there's some code to handle the back-and-forth. Apart from this record-keeping, the function relies on Doc2Vec's &lt;code&gt;most_similar()&lt;/code&gt;, which takes a document vector and returns the &lt;code&gt;n&lt;/code&gt; most similar documents, using &lt;a href="https://en.wikipedia.org/wiki/Cosine_similarity"&gt;cosine similarity&lt;/a&gt; (by default). It does this by computing the cosine similarity (a scalar) between the query vector and every document vector, sorting the scores, and returning the top &lt;code&gt;n&lt;/code&gt;. Let's see how it works:&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;span class="n"&gt;similar_posts&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;investigating-information-storage-in-quantized-autoencoders-with-pytorch-and-mnist&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/pre&gt;&lt;/div&gt;


&lt;p&gt;Returns:&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;
    &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;classification-accuracy-of-quantized-autoencoders-with-pytorch-and-mnist&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;  &lt;span class="mf"&gt;0.8489375114440918&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt;
    &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;building-a-pytorch-autoencoder-for-mnist-digits&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;                           &lt;span class="mf"&gt;0.6877075433731079&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt;
    &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;automatic-mlflow-logging-for-pytorch&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;                                      &lt;span class="mf"&gt;0.5672802925109863&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="p"&gt;]&lt;/span&gt;
&lt;/pre&gt;&lt;/div&gt;


&lt;p&gt;Another one:&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;similar_posts(&amp;#39;python-decorator-patterns&amp;#39;)
&lt;/pre&gt;&lt;/div&gt;


&lt;p&gt;Returns:&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;
    &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;python-decorators-for-data-scientists&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;      &lt;span class="mf"&gt;0.6736602783203125&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt;
    &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;building-a-toy-python-dataclass-decorator&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;  &lt;span class="mf"&gt;0.6735719442367554&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt;
    &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;building-a-simple-python-enum-class&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;        &lt;span class="mf"&gt;0.6158166527748108&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="p"&gt;]&lt;/span&gt;
&lt;/pre&gt;&lt;/div&gt;


&lt;p&gt;And, another one:&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;span class="n"&gt;similar_posts&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;five-ways-to-reduce-variance-in-ab-testing&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/pre&gt;&lt;/div&gt;


&lt;p&gt;Returns:&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;
    &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;reducing-variance-in-ab-testing-with-cuped&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;  &lt;span class="mf"&gt;0.5842434763908386&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt;
    &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;correlations-seasonality-lift-and-cuped&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;     &lt;span class="mf"&gt;0.5610129833221436&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt;
    &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;ab-testing-and-the-historic-lift-paradox&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;    &lt;span class="mf"&gt;0.5379805564880371&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="p"&gt;]&lt;/span&gt;
&lt;/pre&gt;&lt;/div&gt;


&lt;p&gt;These results look pretty good!&lt;/p&gt;
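&lt;p&gt;Under the hood, the ranking in &lt;code&gt;most_similar()&lt;/code&gt; boils down to cosine similarity. A minimal sketch of that computation (a toy reimplementation for illustration, not gensim's actual code):&lt;/p&gt;

```python
import numpy as np

def cosine_similarity(a, b):
    # cosine of the angle between two vectors; 1.0 means the same direction
    return np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b))

def most_similar_sketch(query, doc_vectors, topn=3):
    # score every document against the query, sort descending, keep the top n
    scores = [(idx, cosine_similarity(query, v)) for idx, v in enumerate(doc_vectors)]
    scores.sort(key=lambda pair: -pair[1])
    return scores[:topn]

docs = np.array([[1.0, 0.0], [0.9, 0.1], [0.0, 1.0]])
print(most_similar_sketch(np.array([1.0, 0.0]), docs, topn=2))
```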
&lt;h2&gt;Similarity matrix&lt;/h2&gt;
&lt;p&gt;As a next step, I wanted to get a feel for how the similarity scores vary between posts. There are &amp;nbsp; $N=141$ posts, &amp;nbsp; so there are &amp;nbsp; $N^2=19\,881$ &amp;nbsp; similarity scores between pairs of posts, of which &amp;nbsp; $N(N-1)/2=9\,870$ &amp;nbsp; are non-trivial, since &amp;nbsp; $s(a, b) = s(b, a)$ &amp;nbsp; and &amp;nbsp; $s(a, a) = 1$, &amp;nbsp; where &amp;nbsp; $s(a, b)$ &amp;nbsp; is the similarity score between posts $a$ and $b$. That's a lot of numbers, so I plot the matrix as a heatmap. First, a helper function:&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="nn"&gt;matplotlib.pyplot&lt;/span&gt; &lt;span class="kn"&gt;as&lt;/span&gt; &lt;span class="nn"&gt;plt&lt;/span&gt;

&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;plot_matrix&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;m&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
    &lt;span class="n"&gt;plt&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;imshow&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;m&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;cmap&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;YlOrRd&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;interpolation&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;nearest&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="n"&gt;plt&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;colorbar&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
    &lt;span class="n"&gt;plt&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;show&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
&lt;/pre&gt;&lt;/div&gt;


&lt;p&gt;Now, let's compute the actual similarity matrix and plot it:&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;span class="n"&gt;similarity_matrix&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;similar_posts&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;idx&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;n&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;all&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;idx&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;idx_lookup&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;keys&lt;/span&gt;&lt;span class="p"&gt;()]&lt;/span&gt;
&lt;span class="n"&gt;plot_matrix&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;similarity_matrix&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/pre&gt;&lt;/div&gt;


&lt;p&gt;Output:&lt;/p&gt;
&lt;p&gt;&lt;img src="/images/doc2vec_similarity_matrix.png" alt="Word2vec" style="width: 500px;"/&gt;&lt;/p&gt;
&lt;p&gt;Note the (red) square of high similarity scores in the above plot, from &lt;code&gt;id=84&lt;/code&gt; to &lt;code&gt;id=89&lt;/code&gt;. That's the cluster of &lt;a href="https://bytepawn.com/tag/spin.html"&gt;articles about Probabilistic spin glasses&lt;/a&gt;, which all have the same filename prefix (so their integer ids are sequential, so they are sequential rows and columns in the matrix), and have very high similarity scores:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;&lt;a href="https://bytepawn.com/probabilistic-spin-glass.html#probabilistic-spin-glass"&gt;Probabilistic spin glass - Part I&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://bytepawn.com/probabilistic-spin-glass-part-ii.html#probabilistic-spin-glass-part-ii"&gt;Probabilistic spin glass - Part II&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://bytepawn.com/probabilistic-spin-glass-part-iii.html#probabilistic-spin-glass-part-iii"&gt;Probabilistic spin glass - Part III&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://bytepawn.com/probabilistic-spin-glass-part-iv.html#probabilistic-spin-glass-part-iv"&gt;Probabilistic spin glass - Part IV&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://bytepawn.com/probabilistic-spin-glass-part-v.html#probabilistic-spin-glass-part-v"&gt;Probabilistic spin glass - Part V&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://bytepawn.com/probabilistic-spin-glass-conclusion.html#probabilistic-spin-glass-conclusion"&gt;Probabilistic spin glass - Conclusion&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;The fact that this cluster is recognizable is a good sign, but the results need more investigation to validate the model.&lt;/p&gt;
&lt;h2&gt;Conclusion&lt;/h2&gt;
&lt;p&gt;In &lt;a href="https://bytepawn.com/similar-posts-recommendation-with-doc2vec-part-ii.html#similar-posts-recommendation-with-doc2vec-part-ii"&gt;Part II&lt;/a&gt;, I will continue to investigate the quality of results.&lt;/p&gt;</content><category term="similarity"></category><category term="python"></category><category term="gensim"></category><category term="word2vec"></category><category term="doc2vec"></category><category term="pyml"></category></entry><entry><title>Ask HN: Data Scientists, what libraries do you use for timeseries forecasting?</title><link href="https://bytepawn.com/ask-hn-data-scientists-what-libraries-do-you-use-for-timeseries-forecasting.html" rel="alternate"></link><published>2022-11-30T00:00:00+01:00</published><updated>2022-11-30T00:00:00+01:00</updated><author><name>Marton Trencseni</name></author><id>tag:bytepawn.com,2022-11-30:/ask-hn-data-scientists-what-libraries-do-you-use-for-timeseries-forecasting.html</id><summary type="html">&lt;p&gt;One of the most common Data Science tasks in a business setting is timeseries forecasting. I was curious what methods and libraries other Data Scientists use, so I posted an "Ask HN" on Hacker News. The post generated 89 comments, most of them high-quality. This is my summary of the discussion.&lt;br/&gt;&lt;br/&gt; &lt;img src="/images/ts-forecasting.png" alt="." style="width: 400px;"/&gt;&lt;/p&gt;</summary><content type="html">&lt;h2&gt;Introduction&lt;/h2&gt;
&lt;p&gt;One of the most common Data Science tasks in a business setting is &lt;strong&gt;timeseries forecasting&lt;/strong&gt;. Examples include:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;given N years of historic daily sales, build a forecast for next year's daily sales&lt;/li&gt;
&lt;li&gt;given N years of historic Daily Active Users (DAU), build a DAU forecast for the remainder of the year&lt;/li&gt;
&lt;li&gt;given N years of historic hourly transaction counts per retail store, build a forecast for next month's hourly transaction count per store&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;I was curious what methods and libraries other Data Scientists use, so I posted an &lt;a href="https://news.ycombinator.com/item?id=33447976"&gt;"Ask HN" on Hacker News: Data Scientists, what libraries do you use for timeseries forecasting?&lt;/a&gt; I expected zero to moderate engagement, I would have been happy with 10 answers. In the end, the post generated 89 comments, most of them high-quality. This is my summary of the discussion.&lt;/p&gt;
&lt;p&gt;&lt;img src="/images/ts-forecasting.png" style="width: 600px;"/&gt;&lt;/p&gt;
&lt;h2&gt;Hacker News comments&lt;/h2&gt;
&lt;p&gt;Below is my enumeration of the main points made in the comments.&lt;/p&gt;
&lt;p&gt;There were a number of recommendations for &lt;a href="https://github.com/unit8co/darts"&gt;Darts&lt;/a&gt;, which is a Python super-library for forecasting, most notably by user &lt;a href="https://news.ycombinator.com/user?id=hrzn"&gt;hrzn&lt;/a&gt;, the library's creator. By "super-library", I mean that it implements its own models, but also wraps existing libraries (such as &lt;a href="https://github.com/facebook/prophet"&gt;Prophet&lt;/a&gt;). I was not aware of Darts; I definitely plan to invest time experimenting with it.&lt;/p&gt;
&lt;p&gt;User &lt;a href="https://news.ycombinator.com/user?id=isoprophlex"&gt;isoprophlex&lt;/a&gt; suggests reframing the problem as a classical regression problem, and using &lt;a href="https://github.com/dmlc/xgboost"&gt;XGBoost&lt;/a&gt; or &lt;a href="https://github.com/microsoft/LightGBM"&gt;LightGBM&lt;/a&gt;:&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;As an example, imagine you want to calculate only a single sample into the future. Say furthermore that you have six input timeseries sampled hourly, and you don't expect meaningful correlation beyond 48h old samples. You create 6x48 input features, take the single target value that you want to predict as output, and feed this into your run of the mill gradient boosted tree. The above gives you a less complex approach than reaching for bespoke time-series stuff; I've personally have had success doing something like this. If your regressor does not support multiple outputs, you can always wrap it in sklearns MultiOutputRegressor (or optionally RegressorChain; check it out). This is useful if, in the above example, you are not looking to predict only the next sample, but maybe the next 12 samples.&lt;/p&gt;
&lt;/blockquote&gt;
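&lt;p&gt;The lag-feature idea in the quote above can be sketched in a few lines. This toy example uses a synthetic hourly series and plain numpy least squares in place of a gradient boosted tree, just to show the feature construction:&lt;/p&gt;

```python
import numpy as np

# Toy version of the lag-feature idea: turn a single hourly series into a
# supervised regression problem, with the previous `lags` samples as features.
# A real setup would use XGBoost or LightGBM; least squares keeps this short.
rng = np.random.default_rng(42)
t = np.arange(500)
series = np.sin(2 * np.pi * t / 24) + 0.1 * rng.normal(size=t.size)

lags = 48
X = np.array([series[i : i + lags] for i in range(series.size - lags)])
y = series[lags:]  # target: the sample right after each 48-hour window

coef, *_ = np.linalg.lstsq(X, y, rcond=None)
pred = X @ coef
print(np.mean((pred - y) ** 2))  # in-sample mean squared error
```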
&lt;p&gt;In a response, user &lt;a href="https://news.ycombinator.com/user?id=em500"&gt;em500&lt;/a&gt; points out:&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;[Prophet] is mostly just regression though, with features for trend, yearly and weekly periodicity (smoothed a bit using trigonometric regressors), and holiday features. The only non-standard linear regression part is that it includes a flexible piecewise linear trend, with regularization to select where the trend is allowed to change.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;To my surprise, there are quite a few comments critical of Prophet. In my experience at work (where we do forecasting of business timeseries, see the examples in the introduction) Prophet does a good job. Prophet supports holidays, external regressors, growth trends, changepoints, and just the right seasonalities out of the box. In the majority of cases I don't feel the need for another library. Back to the criticism, user &lt;a href="https://news.ycombinator.com/user?id=dxbydt"&gt;dxbydt&lt;/a&gt; writes:&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;Peter Cotton has atleast a dozen very credible studies/results on Prophet vs other timeseries libraries. Before committing to prophet, please check out a few of these (all over linkedin). His tone is acerbic because he believes prophet is suboptimal &amp;amp; makes poor forecasts compared to the other contenders. That said, you can ignore the tone, just download the packages &amp;amp; test out the scenarios for yourself. I personally will not use Prophet. Like most stat tools in the python ecosystem, it is super easy to deploy &amp;amp; code up, but often inaccurate if you actually care about the results. ofcourse, if its some sales prediction forecast where everything’s pretty much made up &amp;amp; data is sparse/unverifiable, then Prophet ftw.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;&lt;em&gt;Note: I don't understand why he/she says sales forecasting is "made up".&lt;/em&gt;&lt;/p&gt;
&lt;p&gt;User &lt;a href="https://news.ycombinator.com/user?id=qsort"&gt;qsort&lt;/a&gt; writes:&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;[Prophet is] not as good as they'd have you believe, but it's fast and analysts can play with it to some extent.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;Another interesting comment from user &lt;a href="https://news.ycombinator.com/user?id=tfehring"&gt;tfehring&lt;/a&gt;:&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;For cases that Prophet doesn't cover I recommend bsts, which is much more flexible and powerful. Anything too complicated for bsts, I'll typically implement in Stan.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;What's interesting here is that Prophet uses &lt;a href="https://mc-stan.org/"&gt;Stan&lt;/a&gt; internally to &lt;a href="https://github.com/facebook/prophet/blob/main/python/stan/prophet.stan"&gt;make its forecast&lt;/a&gt;.&lt;/p&gt;
&lt;p&gt;Multiple users mention ensemble methods, i.e. building multiple forecasts with different models and combining them. The most interesting comment came from user &lt;a href="https://news.ycombinator.com/user?id=d4rti"&gt;d4rti&lt;/a&gt;:&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;I did some anomaly detection work, in business transactions, and found the best way was to create a sort of ensemble model, where we applied all the models, and kept any anomalies, then used simple rules to only alert on 'interesting' anomalies, like: 2-3 anomalies in a row, high deviation from expected, multiple models detected anomaly, to improve signal vs noise.&lt;/p&gt;
&lt;/blockquote&gt;
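&lt;p&gt;A minimal sketch of this kind of rule-based ensemble, with two hypothetical detectors and a made-up "both must agree" rule (not the commenter's actual system):&lt;/p&gt;

```python
import numpy as np

def zscore_anomalies(x, threshold=3.0):
    # flag points more than `threshold` standard deviations from the mean
    z = (x - x.mean()) / x.std()
    return np.abs(z) > threshold

def iqr_anomalies(x, k=1.5):
    # flag points outside the Tukey fences
    q1, q3 = np.percentile(x, [25, 75])
    lo, hi = q1 - k * (q3 - q1), q3 + k * (q3 - q1)
    return np.logical_or(x > hi, lo > x)

rng = np.random.default_rng(0)
x = rng.normal(size=1000)
x[500] = 10.0  # inject an obvious anomaly

votes = zscore_anomalies(x).astype(int) + iqr_anomalies(x).astype(int)
alerts = np.flatnonzero(votes >= 2)  # alert only when both detectors agree
print(alerts)
```

&lt;p&gt;The "2-3 anomalies in a row" and "high deviation from expected" rules from the comment would slot in as additional filters on &lt;code&gt;alerts&lt;/code&gt;.&lt;/p&gt;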
&lt;p&gt;User &lt;a href="https://news.ycombinator.com/user?id=jll29"&gt;jll29&lt;/a&gt; is of many who mentions neural network models:&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;Former Reuters Research Director here. When modeling time series, you will want a model that is sensitive both to short term and longer term movements. In other words, a Long Term Short Term Memory (LSTM).&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;And finally, a mention for classical ARIMA from user &lt;a href="https://news.ycombinator.com/user?id=crimsoneer"&gt;crimsoneer&lt;/a&gt;:&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;For time series, classical methods (ARIMA etc) still continue to perform very well for most problems.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;h2&gt;Methods and libraries mentioned&lt;/h2&gt;
&lt;p&gt;A list of methods/libraries mentioned in this thread (excluding R libraries):&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;&lt;a href="https://en.wikipedia.org/wiki/Autoregressive_integrated_moving_average"&gt;ARIMA (method)&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://en.wikipedia.org/wiki/Exponential_smoothing"&gt;Exponential smoothing (method)&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://github.com/unit8co/darts"&gt;Darts&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://github.com/facebook/prophet"&gt;Prophet&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://mc-stan.org/"&gt;Stan&lt;/a&gt;: a statistical library used internally by Prophet&lt;/li&gt;
&lt;li&gt;&lt;a href="https://towardsdatascience.com/multi-step-time-series-forecasting-with-xgboost-65d6820bec39"&gt;XGBoost&lt;/a&gt;: how to use XGBoost's regressor to build a timeseries forecast&lt;/li&gt;
&lt;li&gt;&lt;a href="https://towardsdatascience.com/multi-step-time-series-forecasting-with-arima-lightgbm-and-prophet-cc9e3f95dfb0"&gt;LightGBM&lt;/a&gt;: how to use LightGBM's regressor to build a timeseries forecast&lt;/li&gt;
&lt;li&gt;&lt;a href="https://github.com/jdb78/pytorch-forecasting"&gt;PyTorch-forecasting&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://github.com/alkaline-ml/pmdarima"&gt;Pmdarima&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://towardsdatascience.com/state-space-model-and-kalman-filter-for-time-series-prediction-basic-structural-dynamic-linear-2421d7b49fa6"&gt;State Space Model and Kalman Filters&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://pypi.org/project/pybsts/"&gt;Pybsts&lt;/a&gt;: stands for Python Bayesian Structural Time Series&lt;/li&gt;
&lt;li&gt;&lt;a href="https://github.com/sktime/sktime"&gt;Sktime&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://github.com/blue-yonder/tsfresh"&gt;Tsfresh&lt;/a&gt;:  timeseries feature extraction&lt;/li&gt;
&lt;li&gt;&lt;a href="https://www.statsmodels.org/dev/examples/notebooks/generated/statespace_forecasting.html"&gt;statsmodels&lt;/a&gt; for timeseries forecasting&lt;/li&gt;
&lt;/ul&gt;
&lt;h2&gt;Conclusion&lt;/h2&gt;
&lt;p&gt;The comments are very interesting and informative. I plan to look into &lt;strong&gt;Darts, Pytorch-forecasting&lt;/strong&gt; and some of the other methods mentioned here.&lt;/p&gt;</content><category term="timeseries"></category><category term="prophet"></category><category term="darts"></category><category term="python"></category></entry><entry><title>Estimating mathematical constants with Monte Carlo simulations</title><link href="https://bytepawn.com/estimating-famous-mathematical-constants-with-monte-carlo-simulations.html" rel="alternate"></link><published>2022-10-09T00:00:00+02:00</published><updated>2022-10-09T00:00:00+02:00</updated><author><name>Marton Trencseni</name></author><id>tag:bytepawn.com,2022-10-09:/estimating-famous-mathematical-constants-with-monte-carlo-simulations.html</id><summary type="html">&lt;p&gt;I use simple Monte Carlo simulations to estimate some mathematical constants: √2, ϕ, e and π. &lt;br/&gt;&lt;br/&gt; &lt;img src="/images/1perx_e.png" alt="." style="width: 300px;"/&gt;&lt;/p&gt;</summary><content type="html">&lt;h2&gt;Introduction&lt;/h2&gt;
&lt;p&gt;Over the past years I've written many blog posts in which I used Monte Carlo simulations to compute a quantity or illustrate a concept. A Monte Carlo simulation is simply a simulation involving random numbers. It is usually used when directly computing a quantity is not feasible, but the quantity can be estimated efficiently by some random method. Typically, a Monte Carlo estimate becomes more accurate the longer the simulation runs.&lt;/p&gt;
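&lt;p&gt;As a quick illustration of this last point (a sketch of mine, not from any of the posts below): estimating the mean of a uniform random variable, whose true value is 0.5, gets more accurate as the sample count grows.&lt;/p&gt;

```python
from random import random, seed

def mc_mean(n):
    # Monte Carlo estimate of E[U] for U ~ Uniform(0, 1); the true value is 0.5
    return sum(random() for _ in range(n)) / n

seed(0)  # fixed seed so the run is reproducible
small, large = mc_mean(100), mc_mean(1000 * 1000)
print('error at N=100:       %.5f' % abs(small - 0.5))
print('error at N=1,000,000: %.5f' % abs(large - 0.5))
```

&lt;p&gt;With 10,000x more samples, the error typically shrinks by roughly a factor of 100, consistent with the usual $1/\sqrt{N}$ Monte Carlo convergence.&lt;/p&gt;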
&lt;p&gt;Past articles using Monte Carlo simulations:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;&lt;a href="https://bytepawn.com/tag/wordle.html"&gt;The series of articles on solving Wordle&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://bytepawn.com/tag/ab-testing.html"&gt;Almost all articles about A/B testing&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://bytepawn.com/fair-coin-from-biased-coin.html"&gt;Constructing a fair coin from biased coin&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://bytepawn.com/the-german-tank-problem-in-world-war-ii.html"&gt;Solving the german tank problem in World War II&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://bytepawn.com/tag/spin.html"&gt;Understanding properties of spin glasses&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://bytepawn.com/random-digits-and-benfords-law.html"&gt;Verifying Benford's law&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://bytepawn.com/food-deliveries-bayes-and-computational-statistics.html"&gt;Estimating the most popular food delivery service in Dubai&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://bytepawn.com/mnist-pixel-attacks-with-pytorch.html"&gt;Testing the robustness of image classification networks&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;Here I will use simple (&amp;lt;10 LOC) Monte Carlo simulations to estimate famous mathematical constants: $\sqrt{2}, \phi, e$ and $\pi$. &lt;a href="https://github.com/mtrencseni/playground/blob/master/Estimating%20mathematical%20constants%20with%20Monte%20Carlo%20simulations.ipynb"&gt;The code is on Github.&lt;/a&gt;&lt;/p&gt;
&lt;h2&gt;Estimate $\sqrt{2}$&lt;/h2&gt;
&lt;p&gt;$\sqrt{2}$ is the number $x$ such that $ x^2 = 2 $. We can get an estimate by randomly generating numbers close to the root, sorting them, and stopping at the first one whose square exceeds 2:&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="nn"&gt;random&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;random&lt;/span&gt;
&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="nn"&gt;math&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;pi&lt;/span&gt;

&lt;span class="n"&gt;N&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;1000&lt;/span&gt;&lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="mi"&gt;1000&lt;/span&gt;
&lt;span class="n"&gt;xs&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nb"&gt;sorted&lt;/span&gt;&lt;span class="p"&gt;([(&lt;/span&gt;&lt;span class="mf"&gt;1.4&lt;/span&gt; &lt;span class="o"&gt;+&lt;/span&gt; &lt;span class="n"&gt;random&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;&lt;span class="o"&gt;/&lt;/span&gt;&lt;span class="mi"&gt;10&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;_&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="nb"&gt;range&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;N&lt;/span&gt;&lt;span class="p"&gt;)])&lt;/span&gt;
&lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;x&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;xs&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
    &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;x&lt;/span&gt;&lt;span class="o"&gt;**&lt;/span&gt;&lt;span class="mi"&gt;2&lt;/span&gt; &lt;span class="o"&gt;&amp;gt;&lt;/span&gt; &lt;span class="mi"&gt;2&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="k"&gt;break&lt;/span&gt;

&lt;span class="k"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;actual:   &lt;/span&gt;&lt;span class="si"&gt;%.5f&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;&lt;/span&gt; &lt;span class="o"&gt;%&lt;/span&gt; &lt;span class="n"&gt;sqrt&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;2&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;
&lt;span class="k"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;estimate: &lt;/span&gt;&lt;span class="si"&gt;%.5f&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;&lt;/span&gt; &lt;span class="o"&gt;%&lt;/span&gt; &lt;span class="n"&gt;x&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/pre&gt;&lt;/div&gt;


&lt;p&gt;Output:&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;span class="n"&gt;actual&lt;/span&gt;&lt;span class="o"&gt;:&lt;/span&gt;   &lt;span class="mf"&gt;1.41421&lt;/span&gt;
&lt;span class="n"&gt;estimate&lt;/span&gt;&lt;span class="o"&gt;:&lt;/span&gt; &lt;span class="mf"&gt;1.41421&lt;/span&gt;
&lt;/pre&gt;&lt;/div&gt;


&lt;p&gt;Note: this is surprisingly simple, but quite inefficient. There is no advantage to randomization here: it would work just as well to start at 1.4, increase by some small $\epsilon$ on each step, and stop once the square exceeds 2, without using a list. The step size could even be made dynamic.&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;span class="c1"&gt;# estimate sqrt(2) without randomization&lt;/span&gt;

&lt;span class="n"&gt;eps&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mf"&gt;0.00000001&lt;/span&gt;
&lt;span class="n"&gt;x&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mf"&gt;1.4&lt;/span&gt;
&lt;span class="k"&gt;while&lt;/span&gt; &lt;span class="bp"&gt;True&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
    &lt;span class="n"&gt;x&lt;/span&gt; &lt;span class="o"&gt;+=&lt;/span&gt; &lt;span class="n"&gt;eps&lt;/span&gt;
    &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;x&lt;/span&gt;&lt;span class="o"&gt;**&lt;/span&gt;&lt;span class="mi"&gt;2&lt;/span&gt; &lt;span class="o"&gt;&amp;gt;&lt;/span&gt; &lt;span class="mi"&gt;2&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
        &lt;span class="k"&gt;break&lt;/span&gt;
&lt;span class="k"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;actual:   &lt;/span&gt;&lt;span class="si"&gt;%.7f&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;&lt;/span&gt; &lt;span class="o"&gt;%&lt;/span&gt; &lt;span class="n"&gt;sqrt&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;2&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;
&lt;span class="k"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;estimate: &lt;/span&gt;&lt;span class="si"&gt;%.7f&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;&lt;/span&gt; &lt;span class="o"&gt;%&lt;/span&gt; &lt;span class="n"&gt;x&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/pre&gt;&lt;/div&gt;


&lt;p&gt;Output:&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;span class="n"&gt;actual&lt;/span&gt;&lt;span class="o"&gt;:&lt;/span&gt;   &lt;span class="mf"&gt;1.4142136&lt;/span&gt;
&lt;span class="n"&gt;estimate&lt;/span&gt;&lt;span class="o"&gt;:&lt;/span&gt; &lt;span class="mf"&gt;1.4142136&lt;/span&gt;
&lt;/pre&gt;&lt;/div&gt;

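&lt;p&gt;One way to make the step size dynamic, as noted above, is bisection: keep an interval known to contain $\sqrt{2}$ and halve it on every step (a sketch of the idea, not code from the original post):&lt;/p&gt;

```python
from math import sqrt

# bisection: [lo, hi] always brackets sqrt(2), and shrinks by half on each step
lo, hi = 1.0, 2.0
while hi - lo > 1e-10:
    mid = (lo + hi) / 2
    if mid**2 > 2:
        hi = mid   # sqrt(2) is in the lower half
    else:
        lo = mid   # sqrt(2) is in the upper half

print('actual:   %.7f' % sqrt(2))
print('estimate: %.7f' % lo)
```

&lt;p&gt;This reaches 10-digit accuracy in about 34 halvings, versus the roughly 1.4 million $\epsilon$-steps of the fixed-step loop above.&lt;/p&gt;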

&lt;h2&gt;Estimating the golden ratio $\phi$&lt;/h2&gt;
&lt;p&gt;The golden ratio is the number $ \phi = \frac{a}{b} $ such that $ \frac{a+b}{a} = \frac{a}{b} $ and $ a &amp;gt; b $. Estimating $ \phi $ is straightforward: generate random pairs $a$ and $b$ with $ a &amp;gt; b $, compute $ \frac{ \frac{a+b}{a} }{ \frac{a}{b} } $, and for the pair where this ratio is closest to 1, return $ \phi = \frac{a}{b} $:&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="nn"&gt;random&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;random&lt;/span&gt;
&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="nn"&gt;scipy.constants&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;golden&lt;/span&gt;

&lt;span class="n"&gt;N&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;10&lt;/span&gt;&lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="mi"&gt;1000&lt;/span&gt;&lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="mi"&gt;1000&lt;/span&gt;
&lt;span class="n"&gt;gs&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[]&lt;/span&gt;
&lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;_&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="nb"&gt;range&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;N&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
    &lt;span class="n"&gt;a&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;b&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;random&lt;/span&gt;&lt;span class="p"&gt;(),&lt;/span&gt; &lt;span class="n"&gt;random&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
    &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;a&lt;/span&gt; &lt;span class="o"&gt;&amp;lt;&lt;/span&gt; &lt;span class="n"&gt;b&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="k"&gt;continue&lt;/span&gt;
    &lt;span class="n"&gt;l&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;r&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;a&lt;/span&gt; &lt;span class="o"&gt;+&lt;/span&gt; &lt;span class="n"&gt;b&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;/&lt;/span&gt; &lt;span class="n"&gt;a&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;a&lt;/span&gt; &lt;span class="o"&gt;/&lt;/span&gt; &lt;span class="n"&gt;b&lt;/span&gt;
    &lt;span class="n"&gt;gs&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;append&lt;/span&gt;&lt;span class="p"&gt;((&lt;/span&gt;&lt;span class="n"&gt;l&lt;/span&gt; &lt;span class="o"&gt;/&lt;/span&gt; &lt;span class="n"&gt;r&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;r&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;
&lt;span class="n"&gt;gs&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;sort&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
&lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;g&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;gs&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
    &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;g&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="o"&gt;&amp;gt;&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="k"&gt;break&lt;/span&gt;
&lt;span class="k"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;actual:   &lt;/span&gt;&lt;span class="si"&gt;%.5f&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;&lt;/span&gt; &lt;span class="o"&gt;%&lt;/span&gt; &lt;span class="n"&gt;golden&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="k"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;estimate: &lt;/span&gt;&lt;span class="si"&gt;%.5f&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;&lt;/span&gt; &lt;span class="o"&gt;%&lt;/span&gt; &lt;span class="n"&gt;g&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;])&lt;/span&gt;
&lt;/pre&gt;&lt;/div&gt;


&lt;p&gt;Output:&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;span class="n"&gt;actual&lt;/span&gt;&lt;span class="o"&gt;:&lt;/span&gt;   &lt;span class="mf"&gt;1.61803&lt;/span&gt;
&lt;span class="n"&gt;estimate&lt;/span&gt;&lt;span class="o"&gt;:&lt;/span&gt; &lt;span class="mf"&gt;1.61803&lt;/span&gt;
&lt;/pre&gt;&lt;/div&gt;
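&lt;p&gt;A deterministic alternative (my addition, not in the original post): with $ \phi = \frac{a}{b} $, the defining equation $ \frac{a+b}{a} = \frac{a}{b} $ becomes the fixed-point equation $ \phi = 1 + \frac{1}{\phi} $, which converges quickly under iteration:&lt;/p&gt;

```python
# iterate the fixed-point equation phi = 1 + 1/phi, starting from an arbitrary guess
phi = 1.0
for _ in range(100):
    phi = 1 + 1 / phi

print('estimate: %.5f' % phi)
```

&lt;p&gt;The successive iterates are ratios of consecutive Fibonacci numbers, so the estimate converges to $\phi$ geometrically.&lt;/p&gt;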


&lt;h2&gt;Estimating $e$&lt;/h2&gt;
&lt;p&gt;One of the possible definitions of $e$ is: $e$ is the number such that the integral of the function $ \frac{1}{x} $ from 1 to $e$ is 1.&lt;/p&gt;
&lt;p&gt;&lt;img src="/images/1perx_e.png" alt="." style="width: 300px;"/&gt;&lt;/p&gt;
&lt;p&gt;We can randomly generate numbers between 1 and $M$, where $M$ just needs to be bigger than $e$ (the code below uses $M=4$, i.e. $1 + 3 \cdot \mathrm{random()}$), order them, and estimate the integral up to each random point. When the integral crosses from less than 1 to more than 1, stop:&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="nn"&gt;random&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;random&lt;/span&gt;
&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="nn"&gt;math&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;e&lt;/span&gt;

&lt;span class="n"&gt;N&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;Ni&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;10&lt;/span&gt;&lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="mi"&gt;1000&lt;/span&gt;&lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="mi"&gt;1000&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;
&lt;span class="n"&gt;ps&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nb"&gt;sorted&lt;/span&gt;&lt;span class="p"&gt;([(&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt; &lt;span class="o"&gt;+&lt;/span&gt; &lt;span class="mi"&gt;3&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt; &lt;span class="n"&gt;random&lt;/span&gt;&lt;span class="p"&gt;(),&lt;/span&gt; &lt;span class="n"&gt;random&lt;/span&gt;&lt;span class="p"&gt;())&lt;/span&gt; &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;_&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="nb"&gt;range&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;N&lt;/span&gt;&lt;span class="p"&gt;)])&lt;/span&gt;
&lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;i&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;p&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="nb"&gt;enumerate&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;ps&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
    &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;p&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="o"&gt;&amp;lt;&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt; &lt;span class="o"&gt;/&lt;/span&gt; &lt;span class="n"&gt;p&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;]:&lt;/span&gt; &lt;span class="n"&gt;Ni&lt;/span&gt; &lt;span class="o"&gt;+=&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt;
    &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;p&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;&lt;span class="o"&gt;-&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt; &lt;span class="n"&gt;Ni&lt;/span&gt; &lt;span class="o"&gt;/&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;i&lt;/span&gt;&lt;span class="o"&gt;+&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;&amp;gt;&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="k"&gt;break&lt;/span&gt;

&lt;span class="k"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;actual:   &lt;/span&gt;&lt;span class="si"&gt;%.5f&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;&lt;/span&gt; &lt;span class="o"&gt;%&lt;/span&gt; &lt;span class="n"&gt;e&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="k"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;estimate: &lt;/span&gt;&lt;span class="si"&gt;%.5f&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;&lt;/span&gt; &lt;span class="o"&gt;%&lt;/span&gt; &lt;span class="n"&gt;p&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;])&lt;/span&gt;
&lt;/pre&gt;&lt;/div&gt;


&lt;p&gt;Output:&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;span class="n"&gt;actual&lt;/span&gt;&lt;span class="o"&gt;:&lt;/span&gt;   &lt;span class="mf"&gt;2.71828&lt;/span&gt;
&lt;span class="n"&gt;estimate&lt;/span&gt;&lt;span class="o"&gt;:&lt;/span&gt; &lt;span class="mf"&gt;2.71896&lt;/span&gt;
&lt;/pre&gt;&lt;/div&gt;


&lt;p&gt;Here we also don't benefit from randomization: we can just start at 1, increase by some small $\epsilon$ on each step, and stop once the integral rises above 1, without using a list:&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;span class="n"&gt;eps&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mf"&gt;0.00000001&lt;/span&gt;
&lt;span class="n"&gt;x&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt;
&lt;span class="n"&gt;integral&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;
&lt;span class="k"&gt;while&lt;/span&gt; &lt;span class="bp"&gt;True&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
    &lt;span class="n"&gt;x&lt;/span&gt; &lt;span class="o"&gt;+=&lt;/span&gt; &lt;span class="n"&gt;eps&lt;/span&gt;
    &lt;span class="n"&gt;integral&lt;/span&gt; &lt;span class="o"&gt;+=&lt;/span&gt; &lt;span class="n"&gt;eps&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="o"&gt;/&lt;/span&gt;&lt;span class="n"&gt;x&lt;/span&gt;
    &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;integral&lt;/span&gt; &lt;span class="o"&gt;&amp;gt;&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
        &lt;span class="k"&gt;break&lt;/span&gt;
&lt;span class="k"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;actual:   &lt;/span&gt;&lt;span class="si"&gt;%.5f&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;&lt;/span&gt; &lt;span class="o"&gt;%&lt;/span&gt; &lt;span class="n"&gt;e&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="k"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;estimate: &lt;/span&gt;&lt;span class="si"&gt;%.7f&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;&lt;/span&gt; &lt;span class="o"&gt;%&lt;/span&gt; &lt;span class="n"&gt;x&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/pre&gt;&lt;/div&gt;


&lt;p&gt;Output:&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;span class="n"&gt;actual&lt;/span&gt;&lt;span class="o"&gt;:&lt;/span&gt;   &lt;span class="mf"&gt;2.7182818&lt;/span&gt;
&lt;span class="n"&gt;estimate&lt;/span&gt;&lt;span class="o"&gt;:&lt;/span&gt; &lt;span class="mf"&gt;2.7182818&lt;/span&gt;
&lt;/pre&gt;&lt;/div&gt;
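&lt;p&gt;Another well-known Monte Carlo route to $e$, different from the one used in this post: the expected number of uniform $[0, 1)$ draws needed for their running sum to exceed 1 is exactly $e$ (a sketch of that classic result, my addition):&lt;/p&gt;

```python
from random import random

def draws_to_exceed_one():
    # count Uniform(0, 1) draws until their running sum exceeds 1
    total, count = 0.0, 0
    while total <= 1:
        total += random()
        count += 1
    return count

N = 1000 * 1000
estimate = sum(draws_to_exceed_one() for _ in range(N)) / N
print('estimate: %.5f' % estimate)
```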


&lt;h2&gt;Estimating π&lt;/h2&gt;
&lt;p&gt;Estimating $\pi$ is easiest using the formula for the area of a circle, $r^2 \pi$. We can randomly generate points $(x, y)$ in the square between $(-1, -1)$ and $(1, 1)$. The ratio of the number of points such that $d^2 = x^2 + y^2 &amp;lt; r^2 = 1 $, i.e. the points that fall within the circle of radius $r=1$, to the total number of points is equal to the ratio of the area of the circle $r^2 \pi$ to the area of the square $4r^2$. The code below samples only the first quadrant, which by symmetry yields the same ratio:&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="nn"&gt;random&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;random&lt;/span&gt;
&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="nn"&gt;math&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;pi&lt;/span&gt;

&lt;span class="n"&gt;N&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;10&lt;/span&gt;&lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="mi"&gt;1000&lt;/span&gt;&lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="mi"&gt;1000&lt;/span&gt;
&lt;span class="n"&gt;Nc&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nb"&gt;sum&lt;/span&gt;&lt;span class="p"&gt;([&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt; &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;_&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="nb"&gt;range&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;N&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;random&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;&lt;span class="o"&gt;**&lt;/span&gt;&lt;span class="mi"&gt;2&lt;/span&gt; &lt;span class="o"&gt;+&lt;/span&gt; &lt;span class="n"&gt;random&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;&lt;span class="o"&gt;**&lt;/span&gt;&lt;span class="mi"&gt;2&lt;/span&gt; &lt;span class="o"&gt;&amp;lt;&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;])&lt;/span&gt;

&lt;span class="k"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;actual:   &lt;/span&gt;&lt;span class="si"&gt;%.5f&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;&lt;/span&gt; &lt;span class="o"&gt;%&lt;/span&gt; &lt;span class="n"&gt;pi&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="k"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;estimate: &lt;/span&gt;&lt;span class="si"&gt;%.5f&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;&lt;/span&gt; &lt;span class="o"&gt;%&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;4&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt; &lt;span class="n"&gt;Nc&lt;/span&gt; &lt;span class="o"&gt;/&lt;/span&gt; &lt;span class="n"&gt;N&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;
&lt;/pre&gt;&lt;/div&gt;


&lt;p&gt;Output:&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;span class="n"&gt;actual&lt;/span&gt;&lt;span class="o"&gt;:&lt;/span&gt;   &lt;span class="mf"&gt;3.1412&lt;/span&gt;
&lt;span class="n"&gt;estimate&lt;/span&gt;&lt;span class="o"&gt;:&lt;/span&gt; &lt;span class="mf"&gt;3.1416&lt;/span&gt;
&lt;/pre&gt;&lt;/div&gt;
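&lt;p&gt;The accuracy of this estimate improves like $1/\sqrt{N}$, as is typical for Monte Carlo; a small sketch (my addition) showing the estimate tighten as $N$ grows:&lt;/p&gt;

```python
from random import random
from math import pi

def estimate_pi(n):
    # fraction of random points in the unit square falling inside the quarter circle
    inside = sum(1 for _ in range(n) if random()**2 + random()**2 < 1)
    return 4 * inside / n

for n in (1000, 100 * 1000, 1000 * 1000):
    est = estimate_pi(n)
    print('N=%8d  estimate: %.5f  error: %.5f' % (n, est, abs(est - pi)))
```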


&lt;p&gt;As before, we can make it arbitrarily precise by using a grid:&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;span class="n"&gt;eps&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mf"&gt;0.0001&lt;/span&gt;
&lt;span class="n"&gt;x&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;y&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;
&lt;span class="n"&gt;Nc&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;N&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;
&lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;_&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="nb"&gt;range&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;floor&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="o"&gt;/&lt;/span&gt;&lt;span class="n"&gt;eps&lt;/span&gt;&lt;span class="p"&gt;)):&lt;/span&gt;
    &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;_&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="nb"&gt;range&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;floor&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="o"&gt;/&lt;/span&gt;&lt;span class="n"&gt;eps&lt;/span&gt;&lt;span class="p"&gt;)):&lt;/span&gt;
        &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;x&lt;/span&gt;&lt;span class="o"&gt;**&lt;/span&gt;&lt;span class="mi"&gt;2&lt;/span&gt; &lt;span class="o"&gt;+&lt;/span&gt; &lt;span class="n"&gt;y&lt;/span&gt;&lt;span class="o"&gt;**&lt;/span&gt;&lt;span class="mi"&gt;2&lt;/span&gt; &lt;span class="o"&gt;&amp;lt;&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;Nc&lt;/span&gt; &lt;span class="o"&gt;+=&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt;
        &lt;span class="n"&gt;N&lt;/span&gt; &lt;span class="o"&gt;+=&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt;
        &lt;span class="n"&gt;y&lt;/span&gt; &lt;span class="o"&gt;+=&lt;/span&gt; &lt;span class="n"&gt;eps&lt;/span&gt;
    &lt;span class="n"&gt;x&lt;/span&gt; &lt;span class="o"&gt;+=&lt;/span&gt; &lt;span class="n"&gt;eps&lt;/span&gt;
    &lt;span class="n"&gt;y&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;
&lt;span class="k"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;actual:   &lt;/span&gt;&lt;span class="si"&gt;%.5f&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;&lt;/span&gt; &lt;span class="o"&gt;%&lt;/span&gt; &lt;span class="n"&gt;pi&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="k"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;estimate: &lt;/span&gt;&lt;span class="si"&gt;%.5f&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;&lt;/span&gt; &lt;span class="o"&gt;%&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;4&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt; &lt;span class="n"&gt;Nc&lt;/span&gt; &lt;span class="o"&gt;/&lt;/span&gt; &lt;span class="n"&gt;N&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;
&lt;/pre&gt;&lt;/div&gt;


&lt;p&gt;Output:&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;span class="n"&gt;actual&lt;/span&gt;&lt;span class="o"&gt;:&lt;/span&gt;   &lt;span class="mf"&gt;3.14159&lt;/span&gt;
&lt;span class="n"&gt;estimate&lt;/span&gt;&lt;span class="o"&gt;:&lt;/span&gt; &lt;span class="mf"&gt;3.14199&lt;/span&gt;
&lt;/pre&gt;&lt;/div&gt;


&lt;h2&gt;Conclusion&lt;/h2&gt;
&lt;p&gt;Originally I wrote this article to demonstrate how very simple Monte Carlo simulations can be used to arrive at reasonable estimates of these mathematical constants. This is true, but after I wrote the MC code, I realized that in most cases the approach can be re-formulated as a more efficient non-random grid search/integration. The grid methods are faster, use less (constant) storage, and the desired accuracy can be selected up front.&lt;/p&gt;</content><category term="monte-carlo"></category><category term="simulation"></category><category term="math"></category></entry><entry><title>Common patterns in technical interviewing</title><link href="https://bytepawn.com/common-patterns-in-technical-interviewing.html" rel="alternate"></link><published>2022-10-01T00:00:00+02:00</published><updated>2022-10-01T00:00:00+02:00</updated><author><name>Marton Trencseni</name></author><id>tag:bytepawn.com,2022-10-01:/common-patterns-in-technical-interviewing.html</id><summary type="html">&lt;p&gt;I will attempt to enumerate all the categories of questions commonly asked in technical interview loops, and my experience with them. &lt;br/&gt;&lt;br/&gt; &lt;img src="/images/interview-patterns.jpeg" alt="." style="width: 400px;"/&gt;&lt;/p&gt;</summary><content type="html">&lt;h2&gt;Introduction&lt;/h2&gt;
&lt;p&gt;Interviewing, both as an interviewee and interviewer, is part of everyday life in the tech industry. Median tenures are 2-4 years, and it takes multiple interview loops to switch jobs, so I estimate that at any given time at least half of all tech workers are doing one or the other. On the hiring side, typical hiring funnel conversion rates are 5-20%. (Of course, on average, the two sides' conversion rates must be exactly equal.) Even when not looking for a job, I believe in ABI - Always Be Interviewing: to be aware of opportunities and market rates, and to stay sharp.&lt;/p&gt;
&lt;p&gt;Interviewing is also a great learning opportunity: companies tend to compress a lot of what they think is important into interview loops. &lt;em&gt;"They asked X, which I don't know much about; is this a blindspot for me? Why are we not doing X at my current job?"&lt;/em&gt;&lt;/p&gt;
&lt;p&gt;As a hiring manager, I conduct about 5-10 interviews per month, and I have been doing so for the past 10+ years, so I think about these topics a lot. Although, per the above, this does not make me special.&lt;/p&gt;
&lt;p&gt;Here, I will attempt to enumerate all the categories of questions commonly asked in technical interview loops, and my experience with them.&lt;/p&gt;
&lt;p&gt;&lt;img src="/images/interview-patterns.jpeg" alt="." style="width: 400px;"/&gt;&lt;/p&gt;
&lt;h2&gt;Experience&lt;/h2&gt;
&lt;p&gt;A typical experience question is: &lt;em&gt;"Walk me through a project you worked on, what you did specifically on the project, and what the end result and impact of your work were."&lt;/em&gt; This is a great way for the candidate to showcase their past work and explain how they went over and beyond their role to accomplish something awesome. However, there are several problems with experience-based questions:&lt;/p&gt;
&lt;ol&gt;
&lt;li&gt;Unfortunately, there is no way to verify what is being said. Even honest people can have selective memory and will bias the story in their own favor in an interview situation. It's also worth realizing that for many people, a job like Software Engineer at Google (in Silicon Valley or London) is a life-changing opportunity; life-changing not just for them, but for their family.&lt;/li&gt;
&lt;li&gt;Candidates coming from lower tier companies (or coming from countries with a lower presence of tech companies) may have the right skills for the job, but may not have sexy projects under their belt to talk about. But experience questions discriminate against them, and increase pressure to inflate, which can then backfire.&lt;/li&gt;
&lt;/ol&gt;
&lt;p&gt;Because of the above reasons, I don't ask any experience-based questions on my interview loops. &lt;/p&gt;
&lt;h2&gt;Baseline knowledge&lt;/h2&gt;
&lt;p&gt;The idea here is to test the candidate's baseline knowledge of a topic. For example, if the candidate has Python on their CV and the role involves Python programming, implementing 2-3 line Python functions using &lt;code&gt;if/elif/else&lt;/code&gt;, &lt;code&gt;return&lt;/code&gt;, &lt;code&gt;yield&lt;/code&gt;, &lt;code&gt;for&lt;/code&gt; loops and basic Python functions like &lt;code&gt;len()&lt;/code&gt; or &lt;code&gt;range()&lt;/code&gt; demonstrates baseline Python knowledge. A non-coding example is asking the candidate to explain the difference between pass-by-value and pass-by-reference, and when each happens in Python.&lt;/p&gt;
&lt;p&gt;Although baseline knowledge questions seem too simple to be useful, they are not. The trick is to ask many of them and cover breadth. I find that the right set of baseline knowledge questions gives the strongest signal in technical interview loops.&lt;/p&gt;
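&lt;p&gt;To make this concrete, here is the kind of 2-3 line question I have in mind (an illustrative example of mine, not a question quoted from any actual loop): write a generator that yields every other element of a list.&lt;/p&gt;

```python
def every_other(xs):
    # baseline check of for, range with a step, yield and indexing
    for i in range(0, len(xs), 2):
        yield xs[i]

print(list(every_other([1, 2, 3, 4, 5])))  # prints [1, 3, 5]
```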
&lt;h2&gt;Deep knowledge&lt;/h2&gt;
&lt;p&gt;Asking deep knowledge questions makes sense for more senior roles. An example coding task would be to implement a disk-persistent key-value store that supports &lt;code&gt;get()&lt;/code&gt;, &lt;code&gt;set()&lt;/code&gt; and iteration, with lookups faster than a linear scan. A non-coding example is asking about less common edge cases, tricky decorators, type hints, or how Python runtimes differ.&lt;/p&gt;
&lt;p&gt;It's worth noting that a senior candidate can be a Senior Generalist (has worked in many environments, can get up to speed quickly, can solve a wide variety of problems) vs a Senior Specialist (has been coding in the same environment for 10 years, knows it inside-out). In my experience, big tech companies prefer to hire and grow Senior Specialists: at 10k+ headcount and with deep budgets they can afford to hire the best domain experts, so there's less need for Senior Generalists.&lt;/p&gt;
&lt;p&gt;As an example, I am a Senior Generalist: I have worked as a C++ programmer, as a Data Engineer, as a Data Scientist, I have managed all 3; but big tech doesn't care. If I interview for a Data Engineering role, I don't get any extra points for the other stuff.&lt;/p&gt;
&lt;h2&gt;Memory recall&lt;/h2&gt;
&lt;p&gt;These kinds of questions ask the candidate to recall something that can be looked up. For example, a problem is given which can obviously be reduced to finding the shortest path in a directed weighted graph. Now, the question is whether the candidate can recall eg. &lt;a href="https://en.wikipedia.org/wiki/Floyd%E2%80%93Warshall_algorithm"&gt;Floyd's algorithm&lt;/a&gt; (and implement it).&lt;/p&gt;
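&lt;p&gt;For reference, this is roughly the recall being tested, a textbook Floyd-Warshall implementation (a sketch, with my own variable names):&lt;/p&gt;

```python
def floyd_warshall(n, edges):
    """All-pairs shortest paths in a directed weighted graph.

    n: number of vertices, labeled 0..n-1
    edges: list of (u, v, weight) tuples
    Returns dist, where dist[i][j] is the shortest path length i -> j.
    """
    INF = float("inf")
    dist = [[INF] * n for _ in range(n)]
    for i in range(n):
        dist[i][i] = 0
    for u, v, w in edges:
        dist[u][v] = min(dist[u][v], w)
    # Allow each vertex k in turn as an intermediate hop.
    for k in range(n):
        for i in range(n):
            for j in range(n):
                if dist[i][k] + dist[k][j] < dist[i][j]:
                    dist[i][j] = dist[i][k] + dist[k][j]
    return dist
```

&lt;p&gt;The code itself is trivial once you remember the triple loop; that memory is exactly what is being tested.&lt;/p&gt;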
&lt;p&gt;I find these questions completely useless. Also, these questions heavily bias towards (i) candidates who have recently gone to school, and/or (ii) candidates who invest time to study for the interview loop. In my opinion, studying for the interview loop is not a problem, as it shows dedication. But the question is not meant to test (i) and (ii), it is meant to evaluate technical competence, which it does not.&lt;/p&gt;
&lt;h2&gt;Tricky question&lt;/h2&gt;
&lt;p&gt;Big tech companies have at various times used tricky questions in interview loops. A famous example is this Amazon question: There are two poles, each 30 meters tall. A 40 meter rope hangs between the tops of the poles. The lowest point of the rope is 10 meters from the ground. How far apart are the poles? The trick is to realize that if the lowest point of the rope is 10 meters from the ground, and the poles are 30 meters tall, then the rope descends 20 meters and comes back up 20 meters, ie. 2x20m=40m. Since the rope is exactly 40 meters long, there is no slack in the rope at all, it must go straight down and straight back up, so the 2 poles must be right next to each other, 0 meters apart.&lt;/p&gt;
&lt;p&gt;At Facebook, there are a lot of tricky coding questions that are variations of the &lt;a href="https://en.wikipedia.org/wiki/Knapsack_problem"&gt;Knapsack problem&lt;/a&gt;. If you know that Facebook interviews often have a question that is a knapsack, and do a few practice examples, it's easier to spot and solve; even though it's irrelevant to actual coding at Facebook.&lt;/p&gt;
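&lt;p&gt;For illustration, the classic 0/1 knapsack dynamic program that such questions reduce to (a generic sketch, not an actual Facebook question):&lt;/p&gt;

```python
def knapsack(capacity, items):
    """0/1 knapsack: items is a list of (weight, value) pairs.

    dp[c] holds the best total value achievable with weight budget c.
    """
    dp = [0] * (capacity + 1)
    for weight, value in items:
        # Go through capacities downward so each item is used at most once.
        for c in range(capacity, weight - 1, -1):
            dp[c] = max(dp[c], dp[c - weight] + value)
    return dp[capacity]
```

&lt;p&gt;If you have practiced this once, spotting a disguised knapsack takes seconds; if you haven't, it can eat the whole interview.&lt;/p&gt;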
&lt;p&gt;I find these questions completely useless. Most candidates are nervous and stressed during the interview, potentially with high blood pressure, which impairs their cognitive abilities. Any question which assumes a clear head will get low performance from people who tend to worry, even though this is unrelated to on-the-job performance (where people in general are not nervous and stressed).&lt;/p&gt;
&lt;h2&gt;Time challenge&lt;/h2&gt;
&lt;p&gt;When I was working at Facebook, one summer another senior engineer and I were tasked with building a simple interview loop for Engineering interns. We made the loop a coding loop, where candidates have to live-code relatively simple Python functions (2-5 lines each). There were hundreds of applications, which the recruiters narrowed down to 20. So we came up with a list of 10 coding questions for a 60 minute interview. We felt 10 questions was more than enough: we didn't expect candidates to get through more than 4-5 in the time allotted, but we wanted to make sure we wouldn't run out of questions.&lt;/p&gt;
&lt;p&gt;In the end there were multiple candidates who were easily able to complete all 10 questions within the 60 minute limit. It turned out that many of the intern candidates were competitive coders, so they were able to simply write out the solutions, without having to think much about them. We interviewers, although we were senior engineers, had never done competitive coding, so this was a surprise. The fastest candidate was able to solve all 10 in just 25 minutes. I remember thinking there was no way I could solve all 10 in 25 minutes...&lt;/p&gt;
&lt;p&gt;Although this was not the plan from the beginning, we ended up ranking the candidates by completion time, and hired the 2 candidates (fortunately, the top 2 spots went to one man and one woman) who got all 10 right and were the fastest. (In reality there were 2 interviews, one for Python and one for SQL, and we combined the rankings.)&lt;/p&gt;
&lt;p&gt;I would never do this on purpose (and have never repeated it), since in real life there is no time pressure when doing technical work.&lt;/p&gt;
&lt;h2&gt;Explicit pressure&lt;/h2&gt;
&lt;p&gt;In this scenario, explicit pressure is put on the candidate throughout the interview. Note that in the previous scenario, we evaluated candidates based on time, but we didn't put pressure on them during the interview.&lt;/p&gt;
&lt;p&gt;Pressure can be created by nudging candidates to move fast(er), or by challenging their responses. When I was interviewing DE and DS candidates at Facebook, I had to follow a script, and as an interviewer I was expected to get through the entire script in 60 minutes (even if it meant not getting a response for a part). The script, counting sub-items, consisted of 15-20 points. So as part of interview calibration, I was taught that if a candidate hasn't reached a milestone in a question by a certain time, they have to be nudged, and eventually helped or somehow pushed along, since the entire script must be finished in the time allotted. The rationale was that different parts of the interview gave us signals on different attributes, and the hiring committee needs all of these to make the call. If an interviewer didn't get through the script and didn't get all the signals, sometimes an additional (8th or 9th) interview would be scheduled with the candidate.&lt;/p&gt;
&lt;p&gt;Pressure naturally occurs with other types of questions: deep knowledge questions (cannot remember decorator syntax), memory recall questions (cannot remember the algorithm) and trick questions (cannot figure it out). For the reasons given before, I don't use (or like) this pattern; in fact, I do the opposite: I try to minimize pressure on the candidate as much as possible to get a realistic reading.&lt;/p&gt;
&lt;h2&gt;Bug/feature on existing code&lt;/h2&gt;
&lt;p&gt;Sometimes companies show the candidate a piece of code that is very close to what they would work on day-to-day, eg. a piece of Django code for a Python backend position at a company that uses Django. In the real world, most often you are not writing brand new code, but instead modifying existing code. So in this scenario the candidate takes an existing piece of code, which may have multiple bugs/issues to fix, and/or adds new features to it. Although I don't use this type of question, I think it is a good way to gauge candidates on a variety of factors.&lt;/p&gt;
&lt;p&gt;An interesting interview detail is whether the candidate is allowed to use Google (this applies to straightforward coding questions too). In my experience, most interview loops ask candidates not to use Google, and instead to ask the interviewer if they have a question. In recent times I have experimented with asking candidates to share their screen when coding, and telling them it's okay to google — after all, we all constantly do it on the job. Interestingly, I found that this gives me valuable signal, but most often it's not in the candidate's favour: seeing how they google something (and Chrome/Google revealing past queries) communicates a lot. For example, does somebody google &lt;code&gt;[how do I keep the first n characters of a string in python]&lt;/code&gt; or &lt;code&gt;[python string prefix]&lt;/code&gt;? (To my surprise, many technical people don't know how to google effectively.)&lt;/p&gt;
&lt;h2&gt;Situational&lt;/h2&gt;
&lt;p&gt;Situational questions are no-code versions of the previous type: &lt;em&gt;"It's Thursday 7p, you get an alert that the number of logins per minute on your SaaS has dropped significantly. What do you do?"&lt;/em&gt;&lt;/p&gt;
&lt;p&gt;Situational questions are excellent for eg. senior backend, SRE or production engineer roles. Surprisingly, these questions are also great on Product Manager loops. The tricky thing is to get the amount and direction of guidance and nudging right. The candidate may not know what the interviewer is looking for, and needs help to talk about the right topic at the right depth. For example, does the interviewer care in depth how the candidate sets up the crisis response team on Slack? Or does the interviewer want the candidate to talk more about which logs to check on the servers? A candidate may have the right depth in the desired portion, but without good prompting, she may misjudge the interviewer and concentrate on another portion.&lt;/p&gt;
&lt;h2&gt;Homework&lt;/h2&gt;
&lt;p&gt;This interview pattern is to ask the candidate to solve a 2-4 hour problem at home. I have been involved (on both sides) in multiple interview loops of this kind. On the hiring side, in all cases we eventually stopped giving homework assignments, because we had too many senior candidates who just didn't want to invest the time. Usually they didn't say so, they just never sent a solution, they silently churned.&lt;/p&gt;
&lt;p&gt;The problem is, a good candidate, the kind you want to hire, will never spend just 2-4 hours on a 2-4 hour problem. They will go overboard and will (want to) spend 20-40 hours on it. However, senior candidates often have a demanding full-time job and potentially a family, so finding 20-40 hours over a week's period is tough.&lt;/p&gt;
&lt;h2&gt;Work with us for real&lt;/h2&gt;
&lt;p&gt;This is the best way to gauge a candidate: just have them come to the (remote) office, and work with us for a week as part of the interview loop. This sounds crazy, but at one startup I worked at, we did this for many years. That's also how I got the job: by working at the company for a full week, and presenting what I did on Friday afternoon. I was coming off my previous failed startup and was unemployed, so I had the time. Technically, the candidate was paid a competitive salary for the week as a contractor. It goes without saying that this gives the maximum amount of accurate signal on the person, since the future team gets to see the full package, from the time the person arrives at the office: coding, having lunch, presenting, everything. The co-founders of the startup believed strongly in this system, but as the company matured it was first reduced to 3 days, then 1 day, then totally abandoned because of legal risks (and because many senior candidates were not willing to take a week off).&lt;/p&gt;
&lt;h2&gt;Conclusion&lt;/h2&gt;
&lt;p&gt;The goal in this first article on interviewing was to lay out (most of) the question types used in technical interviewing. I gave my own opinion, but I'm not looking to convince anyone. My opinion is based on what I've seen work, and what I feel is fair when I'm getting interviewed. Different roles at different companies require different interview loops.&lt;/p&gt;</content><category term="interviewing"></category></entry><entry><title>How good a particle physicist is GPT-3?</title><link href="https://bytepawn.com/how-good-a-particle-physicist-is-gpt-3.html" rel="alternate"></link><published>2022-09-24T00:00:00+02:00</published><updated>2022-09-24T00:00:00+02:00</updated><author><name>Marton Trencseni</name></author><id>tag:bytepawn.com,2022-09-24:/how-good-a-particle-physicist-is-gpt-3.html</id><summary type="html">&lt;p&gt;Here I will show a "conversation" with GPT-3 to gauge how good a particle physicist — or an illusion of a particle physicist — it is.&lt;br/&gt;&lt;br/&gt; &lt;img src="/images/robot-einstein.jpg" alt="GPT-3" style="width: 400px;"/&gt;&lt;/p&gt;</summary><content type="html">&lt;p&gt;Recently I have been playing around with OpenAI's &lt;a href="https://en.wikipedia.org/wiki/GPT-3"&gt;GPT-3&lt;/a&gt; and I am very impressed by it. It reminds me of the famous Arthur C. Clarke quote, &lt;em&gt;“Any sufficiently advanced technology is indistinguishable from magic.”&lt;/em&gt; If you don't know what GPT-3 is, this is a &lt;a href="https://towardsdatascience.com/understanding-gpt-3-in-5-minutes-7fe35c3a1e52"&gt;good introduction with a lot of links&lt;/a&gt;.&lt;/p&gt;
&lt;p&gt;Recently, a &lt;a href="https://www.youtube.com/watch?v=kgCUn4fQTsc&amp;amp;t=1s"&gt;Google engineer has made the claim&lt;/a&gt; that one of Google's proprietary internal AI systems called &lt;a href="https://blog.google/technology/ai/lamda/"&gt;LaMDA&lt;/a&gt; is sentient. A more useful perspective on such advanced statistical AI systems is to say that they are an extremely useful, wonderful illusion.&lt;/p&gt;
&lt;p&gt;&lt;img src="/images/robot-einstein.jpg" alt="GPT-3" style="width: 400px;"/&gt;&lt;/p&gt;
&lt;p&gt;&lt;em&gt;You&lt;/em&gt; can also talk to GPT-3, &lt;a href="https://openai.com/api/"&gt;just create an account on OpenAI and go to the sandbox!&lt;/a&gt;&lt;/p&gt;
&lt;p&gt;Here I will show a "conversation" with GPT-3 to gauge how good a particle physicist — or an illusion of a particle physicist — it is. Mostly these are questions one might ask a student in an exam situation. The questions are independent of one another, so they work with a context-free system such as GPT-3. By context-free I mean that GPT-3 does not maintain conversation state: each question and answer is independent from the AI's perspective.&lt;/p&gt;
&lt;p&gt;GPT-3 is trained on large amounts of text downloaded from the Internet. So I expect it to do well on questions where there are high-quality answers on the Internet, and it "just" has to return them in well-formed English. For questions that require more reasoning and combining facts (from the question and training data) I expect to get less impressive answers.&lt;/p&gt;
&lt;p&gt;Also see these previous Bytepawn posts on GPT-3:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;&lt;a href="https://bytepawn.com/how-good-a-data-scientist-is-gpt-3.html#how-good-a-data-scientist-is-gpt-3"&gt;How good a Data Scientist is GPT-3? - Part I&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://bytepawn.com/how-good-a-data-scientist-is-gpt-3-part-ii.html#how-good-a-data-scientist-is-gpt-3-part-ii"&gt;How good a Data Scientist is GPT-3? - Part II&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;h2&gt;Q&amp;amp;A with GPT-3&lt;/h2&gt;
&lt;p&gt;&lt;span style="background-color: #fefebe"&gt;Q: The Standard Model is a gauge theory with a U(1) x SU(2) x SU(3) symmetry. Is this SU(2) symmetry the same as the SU(2) symmetry of a fermion's spin?&lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;span style="background-color: #acdf87"&gt;A: No. The SU(2) symmetry of a fermion's spin is an internal symmetry, while the SU(2) symmetry of the Standard Model is a gauge symmetry.&lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;em&gt;This is a reasonably good answer.&lt;/em&gt;&lt;/p&gt;
&lt;p&gt;&lt;span style="background-color: #fefebe"&gt;Q: In particle physics, why is the weak force SU(2) and not U(2)? Why is the strong force SU(3) and not U(3)?&lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;span style="background-color: #acdf87"&gt;A: The U(1) subgroup of SU(2) is not a symmetry of the weak force, and the U(1) subgroup of SU(3) is not a symmetry of the strong force.&lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;em&gt;Here, GPT-3 is not answering the question, since it's talking about U(1). A better answer would be:&lt;/em&gt;&lt;/p&gt;
&lt;p&gt;The matrix representation of U(n) has dimensionality $n^2$ (ie. $n^2$ linearly independent generator matrices), whereas SU(n) has dimensionality $n^2-1$. Each linearly independent generator corresponds to a gauge boson. So a gauge theory that is U(n) has $n^2$ gauge bosons, and a gauge theory that is SU(n) has $n^2-1$. (For $n=1$, SU(1) is trivial, which is why electromagnetism is U(1).) It is an experimental fact that:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;electromagnetism has 1 gauge boson (the photon), so it's U(1)&lt;/li&gt;
&lt;li&gt;the weak force has 3 gauge bosons (W+, W- and Z), so it's SU(2)&lt;/li&gt;
&lt;li&gt;the strong force has 8 gauge bosons (8 gluons), so it's SU(3)&lt;/li&gt;
&lt;/ul&gt;
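&lt;p&gt;The counting behind this list is just the dimension formula applied to each group:&lt;/p&gt;

```latex
\begin{aligned}
\dim U(1)  &= 1^2     = 1 &&\rightarrow \text{1 photon} \\
\dim SU(2) &= 2^2 - 1 = 3 &&\rightarrow W^+,\, W^-,\, Z \\
\dim SU(3) &= 3^2 - 1 = 8 &&\rightarrow \text{8 gluons}
\end{aligned}
```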
&lt;p&gt;&lt;span style="background-color: #fefebe"&gt;Q: In particle physics, why are we always using special unitary groups, like U(1), SU(2) and SU(3)?&lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;span style="background-color: #acdf87"&gt;A: There are many reasons for this. One reason is that special unitary groups are Lie groups, which means they are continuous groups with a well-defined algebra. This algebra can be used to construct representations of the group, which are useful for doing calculations in physics.&lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;span style="background-color: #acdf87"&gt;Another reason is that the special unitary groups are subgroups of the unitary group U(N), which is the group of all unitary matrices with determinant 1. The unitary group is important in physics because it is the group of all symmetry transformations that preserve the inner product of a vector space. The special unitary groups are important because they are the only subgroups of the unitary group that preserve the inner product up to a phase.&lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;span style="background-color: #acdf87"&gt;The special unitary groups are also important because they arise naturally in many physical situations. For example, the group SU(2) arises in the study of spin 1/2 particles, and the group SU(3) arises in the study of particles with color charge.&lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;em&gt;This is a reasonably good answer. On unitary operators, a better explanation would be:&lt;/em&gt;&lt;/p&gt;
&lt;p&gt;Quantum mechanics (and QFT) is a probabilistic theory. When performing measurements, there are no certain outcomes, only probabilities. Furthermore, the "weights" associated with each outcome must be a complex number (complex amplitude), not a real number, because this is how we get our theory to reproduce the results of interference experiments (such as the two-slit experiment). So, the state of a quantum system is represented by a wave function, which consists of complex amplitudes, one for each possible measurement outcome. The absolute squares of these amplitudes must sum to 1 — thus, we can imagine it like a unit vector in complex space. When we apply an operator to this state (eg. the time evolution operator in the Schrodinger equation, or a gate in quantum computing, or a vertex in QFT), we want to preserve the property that the absolute squares sum to 1 — in other words, we want the operators to rotate states around the unit sphere in complex space. Such operators are called unitary operators.&lt;/p&gt;
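&lt;p&gt;As a quick numerical illustration of the "rotating around the unit sphere" picture (a sketch using numpy, with an arbitrarily chosen 2x2 unitary matrix):&lt;/p&gt;

```python
import numpy as np

# An arbitrary 2x2 unitary matrix: its columns are orthonormal under
# the complex inner product, so U^dagger U = I.
theta = 0.3
U = np.array([[np.cos(theta),      -np.sin(theta)],
              [1j * np.sin(theta),  1j * np.cos(theta)]])
assert np.allclose(U.conj().T @ U, np.eye(2))

# A state: two complex amplitudes whose absolute squares sum to 1.
psi = np.array([0.6, 0.8j])

# Applying the unitary operator preserves the total probability.
psi2 = U @ psi
total_before = np.vdot(psi, psi).real   # |0.6|^2 + |0.8|^2 = 1.0
total_after = np.vdot(psi2, psi2).real  # still 1.0
```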
&lt;p&gt;&lt;span style="background-color: #fefebe"&gt;Q: In particle physics, why does the strong force have 8 gauge bosons? There are 3 color charges, so why isn't there 3 gluons, one for each color charge?&lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;span style="background-color: #acdf87"&gt;A: The strong force has 8 gauge bosons because of the 3 color charges. The 3 gluons are the 3 color charges, so the strong force has 8 gauge bosons.&lt;/a&gt;&lt;/p&gt;
&lt;p&gt;&lt;em&gt;The answer is self-contradictory: in the first sentence it talks about 8 gauge bosons (the gauge bosons of the strong force are the gluons), but then in the second sentence it talks about 3 gluons. A good answer would be:&lt;/em&gt;&lt;/p&gt;
&lt;p&gt;It is in general not true that there is one gauge boson for each charge. For the electromagnetic theory it happens to be true, but it's an accident. It's also not true for the weak force, where there are 3 gauge bosons, but not 3 weak charges. Also note that not all gauge bosons carry charge; examples: (i) QED's photon does not carry electric charge (or any charge), (ii) the weak force's Z boson does not carry any charge, (iii) the W boson, although it's not part of QED (electron-proton-photon processes), carries an electric charge.&lt;/p&gt;
&lt;p&gt;&lt;span style="background-color: #fefebe"&gt;Q: What is the Hilbert space in Quantum mechanics?&lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;span style="background-color: #acdf87"&gt;A: The Hilbert space in quantum mechanics is the space of all possible states that a quantum system can be in.&lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;em&gt;The answer is not terrible, but could be more useful, like:&lt;/em&gt;&lt;/p&gt;
&lt;p&gt;The complex vector space (with an inner product) where states describing a quantum mechanical system live (see above) is called a Hilbert space.&lt;/p&gt;
&lt;p&gt;&lt;span style="background-color: #fefebe"&gt;Q: In particle physics, what is the difference between helicity and chirality?&lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;span style="background-color: #acdf87"&gt;A: Helicity is a measure of the projection of the spin of a particle onto its momentum. Chirality is a measure of the handedness of a particle.&lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;em&gt;Good answer!&lt;/em&gt;&lt;/p&gt;
&lt;p&gt;&lt;span style="background-color: #fefebe"&gt;Q: In Quantum Field Theory, what is a current? What kind of currents are there?&lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;span style="background-color: #acdf87"&gt;A: In quantum field theory, a current is a conserved quantity associated with a symmetry of the Lagrangian. There are three types of currents in quantum field theory: vector currents, axial-vector currents, and tensor currents.&lt;span&gt;&lt;/p&gt;
&lt;p&gt;&lt;em&gt;Good answer!&lt;/em&gt;&lt;/p&gt;
&lt;p&gt;&lt;span style="background-color: #fefebe"&gt;Q: In particle physics, what is the difference between the Cabbibo angle and the Kobayashi-Maskawa angle?&lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;span style="background-color: #acdf87"&gt;A: The Cabbibo angle is a parameter in the Standard Model of particle physics that governs the strength of the weak interaction between quarks. The Kobayashi-Maskawa angle is a parameter in the Standard Model that governs the strength of the weak interaction between quarks and leptons.&lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;em&gt;This is a bit of a trick question. There is such a thing as a Cabibbo angle, and Kobayashi and Maskawa are closely related to it, but there is no such thing as a Kobayashi-Maskawa angle; instead there is the Cabibbo-Kobayashi-Maskawa or CKM matrix. The CKM matrix describes the probability of a transition from one quark flavour to another, but not between quarks and leptons.&lt;/em&gt;&lt;/p&gt;
&lt;p&gt;&lt;span style="background-color: #fefebe"&gt;Q: What is the difference between the weak mixing angle and the Weinberg angle?&lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;span style="background-color: #acdf87"&gt;A: The weak mixing angle is a parameter in the Standard Model of particle physics that determines the strength of the weak force. The Weinberg angle is a parameter that determines the strength of the weak force in the electroweak theory.&lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;em&gt;This is a trick question — they are the same.&lt;/em&gt;&lt;/p&gt;
&lt;p&gt;&lt;span style="background-color: #fefebe"&gt;Q: In particle physics, for what kind of theories are Feynman-diagrams not useful?&lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;span style="background-color: #acdf87"&gt;A: In general, Feynman diagrams are not useful for theories that do not have a perturbative expansion. For example, in a theory with strong interactions, such as quantum chromodynamics (QCD), the Feynman diagrams become very complicated and are not very helpful in understanding the behavior of the theory.&lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;em&gt;Great answer!&lt;/em&gt;&lt;/p&gt;
&lt;h2&gt;Conclusion&lt;/h2&gt;
&lt;p&gt;Overall, 8 of the 10 GPT-3 answers are okay, which is very impressive. On some of the trickier questions, the answer is good English but incorrect or meaningless. This is the overall danger of GPT-3: if the reader is not an expert, s/he can't tell if they're getting a good answer (in this case, about half the time) or reasonable-sounding nonsense.&lt;/p&gt;</content><category term="gpt"></category><category term="gpt-3"></category></entry><entry><title>Data Science Culture Doc</title><link href="https://bytepawn.com/data-science-culture-doc.html" rel="alternate"></link><published>2022-09-23T00:00:00+02:00</published><updated>2022-09-23T00:00:00+02:00</updated><author><name>Marton Trencseni</name></author><id>tag:bytepawn.com,2022-09-23:/data-science-culture-doc.html</id><summary type="html">&lt;p&gt;I wrote a Culture Doc for the Data Science team I lead.&lt;br/&gt;&lt;br/&gt; &lt;img src="/images/maf-ds-culture1.jpg" alt="Culture doc" style="width: 400px;"/&gt;&lt;/p&gt;</summary><content type="html">&lt;h2&gt;Introduction&lt;/h2&gt;
&lt;p&gt;I wrote a Culture Doc for the Data Science team I lead. It consists of 24 values (or slogans) that are a good representation of what we believe, what we want to achieve, and how we want to get there.&lt;/p&gt;
&lt;p&gt;It's no secret that a lot of the values are taken from other companies' culture docs (mainly Facebook and Amazon), with some re-wording; some of them are original.&lt;/p&gt;
&lt;p&gt;The best way to read it is on the &lt;a href="/images/maf-culture.html"&gt;Data Science Culture mini-site&lt;/a&gt;, and I have also reproduced the document below.&lt;/p&gt;
&lt;p&gt;Related Bytepawn posts worth checking out:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;&lt;a href="https://bytepawn.com/culture-docs-facebook-netflix-and-valve.html#culture-docs-facebook-netflix-and-valve"&gt;Culture Docs: Facebook, Netflix and Valve&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://bytepawn.com/working-backwards.html#working-backwards"&gt;Working Backwards&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://bytepawn.com/the-best-parts-of-jeff-bezos-invent-and-wander.html"&gt;The best parts of Invent and Wander: the Collected Writings of Jeff Bezos&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;h2&gt;Data Science Culture Doc&lt;/h2&gt;
&lt;p&gt;The goal of this document is to solidify our culture by writing it down. The goal is not to create it — you cannot create culture by writing it down. Culture is people's existing everyday behaviour and communication, and which of those is rewarded by the organization.&lt;/p&gt;
&lt;p&gt;Some of the phrasing we shamelessly stole from other companies' culture docs. As long as we're also living these values, it's fine. And &lt;em&gt;Great artists copy&lt;/em&gt; is one of our values.&lt;/p&gt;
&lt;h2&gt;Mission&lt;/h2&gt;
&lt;p&gt;&lt;b&gt;Focus on impact.&lt;/b&gt;&lt;br/&gt;
Jeff Bezos has said "customers are always beautifully, wonderfully dissatisfied, even when they report being happy and business is great". We believe that in the long-term Data Science can make customers more satisfied, make customers like our products more and in the process improve our revenues significantly.&lt;/p&gt;
&lt;p&gt;&lt;img src="/images/maf-ds-culture1.jpg" alt="." style="width: 300px;"/&gt;&lt;/p&gt;
&lt;p&gt;&lt;b&gt;Data is the new oil.&lt;/b&gt;&lt;br/&gt;
We have billions of rows of data about millions of customers. As long as our customers are receiving static emails about irrelevant offerings, can't find what they're looking for in our stores or struggle with widgets they never use in our apps — there is oil in the ground.&lt;/p&gt;
&lt;p&gt;&lt;img src="/images/maf-ds-culture2.jpg"alt="." style="width: 300px;"/&gt;&lt;/p&gt;
&lt;p&gt;&lt;b&gt;Our goal is not to get people to like us.&lt;/b&gt;&lt;br/&gt;
Our goal is to get customers to like our company's products. We are not our users. We build products that benefit the customer in the end.&lt;/p&gt;
&lt;p&gt;&lt;img src="/images/maf-ds-culture3.jpg" alt="." style="width: 300px;"/&gt;&lt;/p&gt;
&lt;p&gt;&lt;b&gt;We ❤️ experiments.&lt;/b&gt;&lt;br/&gt;
To invent you have to experiment, and if you know in advance that it’s going to work, it’s not an experiment.&lt;/p&gt;
&lt;p&gt;&lt;img src="/images/maf-ds-culture4.jpg" alt="." style="width: 300px;"/&gt;&lt;/p&gt;
&lt;p&gt;&lt;b&gt;Customers are more than data.&lt;/b&gt;&lt;br/&gt;
Build products around and for people, not data. We must never forget that data is a means to an end, and the "end" is to create better experiences for people.&lt;/p&gt;
&lt;p&gt;&lt;img src="/images/maf-ds-culture5.jpg" alt="." style="width: 300px;"/&gt;&lt;/p&gt;
&lt;p&gt;&lt;b&gt;Data wins arguments. Data enables us to win.&lt;/b&gt;&lt;br/&gt;
Data beats talking. Data beats slides. Data beats ties. Data beats being popular. Many of the important decisions we make can be made with data. A team making rational, high-quality decisions based on data and experiments will light a bonfire.&lt;/p&gt;
&lt;p&gt;&lt;img src="/images/maf-ds-culture6.jpg" alt="." style="width: 300px;"/&gt;&lt;/p&gt;
&lt;p&gt;&lt;b&gt;If we don't create the next version of our services, somebody else will.&lt;/b&gt;&lt;br/&gt;
We face great competition in the marketplace. Our team also faces great competition internally.&lt;/p&gt;
&lt;p&gt;&lt;img src="/images/maf-ds-culture7.jpg" alt="." style="width: 300px;"/&gt;&lt;/p&gt;
&lt;p&gt;&lt;b&gt;Do things that feel right. Measure beyond numbers.&lt;/b&gt;&lt;br/&gt;
It is dangerous to think everything can be measured, because it leads us to believe everything that cannot be measured isn't real. Emotions, feelings are real. People are not deterministic black boxes, nor independent, identically distributed random variables. People think, talk and feel. Building a sustainable brand means winning the trust, hearts and minds of people, not just their compulsive clicks (and buys).&lt;/p&gt;
&lt;p&gt;&lt;img src="/images/maf-ds-culture8.jpg" alt="." style="width: 300px;"/&gt;&lt;/p&gt;
&lt;h2&gt;Tactics&lt;/h2&gt;
&lt;p&gt;&lt;b&gt;Great artists copy.&lt;/b&gt;&lt;br/&gt;
Steve Jobs brought the computer mouse to the mass market by copying the idea from Xerox PARC. We live with open eyes and an open mind and adopt good ideas we see elsewhere.&lt;/p&gt;
&lt;p&gt;&lt;img src="/images/maf-ds-culture9.jpg" alt="." style="width: 300px;"/&gt;&lt;/p&gt;
&lt;p&gt;&lt;b&gt;Do more with less.&lt;/b&gt;&lt;br/&gt;
Constraints breed resourcefulness, self-sufficiency and invention. We use simple but powerful tools to get our work done. There are no extra points for growing headcount or budget.&lt;/p&gt;
&lt;p&gt;&lt;img src="/images/maf-ds-culture10.jpg" alt="." style="width: 300px;"/&gt;&lt;/p&gt;
&lt;p&gt;&lt;b&gt;There is no point in having a 5 year plan.&lt;/b&gt;&lt;br/&gt;
Our industry is changing at breakneck speed. We should have a 6-month plan, and know where we want to be in 5 years. And every 6 months, we take another look at where we are.&lt;/p&gt;
&lt;p&gt;&lt;img src="/images/maf-ds-culture11.jpg" alt="." style="width: 300px;"/&gt;&lt;/p&gt;
&lt;p&gt;&lt;b&gt;Ruthless prioritization.&lt;/b&gt;&lt;br/&gt;
Picking the right problem is more important than being able to solve problems. Focus on solving big problems. Focus on helping the most customers.&lt;/p&gt;
&lt;p&gt;&lt;img src="/images/maf-ds-culture12.jpg" alt="." style="width: 300px;"/&gt;&lt;/p&gt;
&lt;p&gt;&lt;b&gt;Stay focused and keep shipping.&lt;/b&gt;&lt;br/&gt;
Your product doesn't exist until it ships. And if it doesn't exist, it cannot have an impact. It doesn't matter how great your idea is if nobody is using it.&lt;/p&gt;
&lt;p&gt;&lt;img src="/images/maf-ds-culture13.jpg" alt="." style="width: 300px;"/&gt;&lt;/p&gt;
&lt;p&gt;&lt;b&gt;The quick shall inherit the earth.&lt;/b&gt;&lt;br/&gt;
Fast is better than slow. Fast means it's out in the real world. That means fast can learn from experience while slow can only theorize.&lt;/p&gt;
&lt;p&gt;&lt;img src="/images/maf-ds-culture14.jpg" alt="." style="width: 300px;"/&gt;&lt;/p&gt;
&lt;p&gt;&lt;b&gt;We treat reversible decisions as experiments.&lt;/b&gt;&lt;br/&gt;
Some decisions are consequential and irreversible or nearly irreversible – one-way doors – and these decisions must be made methodically, carefully, slowly, with great deliberation and consultation (for example, picking the location of a new Carrefour hypermarket). But most decisions aren’t like that – they are changeable, reversible – they’re two-way doors (for example, the way we do code reviews). We don’t have to live with the consequences for that long. We can reopen the door and go back through.&lt;/p&gt;
&lt;p&gt;&lt;img src="/images/maf-ds-culture15.jpg" alt="." style="width: 300px;"/&gt;&lt;/p&gt;
&lt;p&gt;&lt;b&gt;Done is better than perfect.&lt;/b&gt;&lt;br/&gt;
Building a good enough product and shipping in 3 months is almost always better than waiting 12 months to release the first version. The learning and feedback from users in those 9 months will enable us to build a much better product in the end.&lt;/p&gt;
&lt;p&gt;&lt;img src="/images/maf-ds-culture16.jpg" alt="." style="width: 300px;"/&gt;&lt;/p&gt;
&lt;h2&gt;Team&lt;/h2&gt;
&lt;p&gt;&lt;b&gt;Hire and develop the best.&lt;/b&gt;&lt;br/&gt;
We maintain a high bar and hire the best people to solve hard problems. You're good at what you do. The person in your team is also good at what they do. Insist on high standards. Talk to others. Don't pay too much attention to seniority and rank.&lt;/p&gt;
&lt;p&gt;&lt;img src="/images/maf-ds-culture17.jpg" alt="." style="width: 300px;"/&gt;&lt;/p&gt;
&lt;p&gt;&lt;b&gt;Clear, concise and consistent writing.&lt;/b&gt;&lt;br/&gt;
Good writing gets ideas noticed. It’s not true that only ideas matter. If your writing is sloppy, people will think you are the same. They won’t care about your message, and they won’t work with you.&lt;/p&gt;
&lt;p&gt;&lt;img src="/images/maf-ds-culture18.jpg" alt="." style="width: 300px;"/&gt;&lt;/p&gt;
&lt;p&gt;&lt;b&gt;We are distributed.&lt;/b&gt;&lt;br/&gt;
We are a distributed team working from 5+ countries. We must be able to communicate efficiently online using asynchronous channels, while also creating opportunities to come together regularly and socialize. We prefer informal written communication (chat) to meetings and emails.&lt;/p&gt;
&lt;p&gt;&lt;img src="/images/maf-ds-culture19.jpg" alt="." style="width: 300px;"/&gt;&lt;/p&gt;
&lt;p&gt;&lt;b&gt;Always be learning and teaching.&lt;/b&gt;&lt;br/&gt;
We are some of the luckiest humans to ever be alive. We work in a high-demand field at the intersection of some of the most intellectually stimulating disciplines, and this field is continuously improving and re-inventing itself. We must always be reading and learning to keep up, and help others to learn and keep up. But this is not a problem for us, since it's great fun.&lt;/p&gt;
&lt;p&gt;&lt;img src="/images/maf-ds-culture20.jpg" alt="." style="width: 300px;"/&gt;&lt;/p&gt;
&lt;p&gt;&lt;b&gt;We ❤️ failures.&lt;/b&gt;&lt;br/&gt;
When something goes wrong, we don't blame each other, we remain patient, but quickly fix the problem. We do a post-mortem, identify root causes and look for systemic improvements. We do this for both social and technical issues.&lt;/p&gt;
&lt;p&gt;&lt;img src="/images/maf-ds-culture21.jpg" alt="." style="width: 300px;"/&gt;&lt;/p&gt;
&lt;p&gt;&lt;b&gt;We are not good at everything.&lt;/b&gt;&lt;br/&gt;
We don't have to be good at everything. For example, we are not good at Objective-C development or writing marketing copy, and we can be honest about it.&lt;/p&gt;
&lt;p&gt;&lt;img src="/images/maf-ds-culture22.jpg" alt="." style="width: 300px;"/&gt;&lt;/p&gt;
&lt;p&gt;&lt;b&gt;We trust each other.&lt;/b&gt;&lt;br/&gt;
We listen attentively, speak candidly, and treat others respectfully. We are vocally self-critical, even when doing so is awkward or embarrassing. We prefer transparency over office politics and information silos.&lt;/p&gt;
&lt;p&gt;&lt;img src="/images/maf-ds-culture23.jpg" alt="." style="width: 300px;"/&gt;&lt;/p&gt;
&lt;p&gt;&lt;b&gt;A team of humans.&lt;/b&gt;&lt;br/&gt;
We come together every day to solve interesting problems, not to go skiing. We are a beautifully diverse team from many cultures, countries, religions, languages and family backgrounds. We respect each other, we help each other and we have fun while we solve our interesting problems. We serve each other and other teams.&lt;/p&gt;
&lt;p&gt;&lt;img src="/images/maf-ds-culture24.jpg" alt="." style="width: 300px;"/&gt;&lt;/p&gt;</content><category term="culture"></category></entry><entry><title>How good a Data Scientist is GPT-3? - Part II</title><link href="https://bytepawn.com/how-good-a-data-scientist-is-gpt-3-part-ii.html" rel="alternate"></link><published>2022-09-03T00:00:00+02:00</published><updated>2022-09-03T00:00:00+02:00</updated><author><name>Marton Trencseni</name></author><id>tag:bytepawn.com,2022-09-03:/how-good-a-data-scientist-is-gpt-3-part-ii.html</id><summary type="html">&lt;p&gt;I have further "conversations" with GPT-3, this time asking more difficult questions about real-world Data Science projects I have personally worked on.&lt;br/&gt;&lt;br/&gt; &lt;img src="/images/gpt-3-other.jpg" alt="GPT-3" style="width: 400px;"/&gt;&lt;/p&gt;</summary><content type="html">&lt;p&gt;In the &lt;a href="https://bytepawn.com/how-good-a-data-scientist-is-gpt-3.html#how-good-a-data-scientist-is-gpt-3"&gt;previous blog post&lt;/a&gt;, I asked &lt;a href="https://en.wikipedia.org/wiki/GPT-3"&gt;GPT-3&lt;/a&gt; interview-level Data Science questions to gauge "how good a Data Scientist" it is. Here I continue this line of questioning, with questions that are about real life Data Science work I've done at previous jobs. These are still posed in the style of interview questions, but are a bit closer to how a Data Scientist might want to get help from GPT-3 on real Data Science work.&lt;/p&gt;
&lt;p&gt;&lt;img src="/images/gpt-3-other.jpg" alt="GPT-3" style="width: 400px;"/&gt;&lt;/p&gt;
&lt;p&gt;&lt;em&gt;You&lt;/em&gt; can also talk to GPT-3, &lt;a href="https://openai.com/api/"&gt;just create an account on OpenAI and go to the sandbox!&lt;/a&gt;&lt;/p&gt;
&lt;h2&gt;Product metrics&lt;/h2&gt;
&lt;p&gt;&lt;span style="background-color: #fefebe"&gt;Q: Imagine you're a Product Analyst or Data Scientist working with a Product Manager at a social networking company like Facebook or Twitter. What kind of product metrics would you recommend the Product Manager look at?&lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;span style="background-color: #acdf87"&gt;A: There are many potential product metrics that a Product Manager could look at when working with a social networking company. Some examples of potential metrics could include:
&lt;br/&gt;
- Number of active users&lt;br/&gt;
- Number of daily/weekly/monthly active users&lt;br/&gt;
- User engagement levels (e.g. time spent on site, number of page views, etc.)&lt;br/&gt;
- Number of new users&lt;br/&gt;
- Number of referrals&lt;br/&gt;
- Churn rate&lt;br/&gt;
- Demographics of users (e.g. age, gender, location, etc.)&lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;em&gt;Note: this is not a bad starting list. Having said that, demographics aren't metrics, but it's no big deal.&lt;/em&gt;&lt;/p&gt;
&lt;p&gt;&lt;span style="background-color: #fefebe"&gt;Q: Imagine you're a Product Analyst or Data Scientist working with a Product Manager at a social networking company like Facebook or Twitter. Give some examples of values of product metrics that might indicate a healthy or unhealthy product.&lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;span style="background-color: #acdf87"&gt;A: Some examples of values of product metrics that might indicate a healthy product are:
&lt;br/&gt;
- The number of people using the product on a daily basis&lt;br/&gt;
- The number of people using the product on a monthly basis&lt;br/&gt;
- The number of people using the product on a yearly basis&lt;br/&gt;
&lt;br/&gt;
Some examples of values of product metrics that might indicate an unhealthy product are:&lt;br/&gt;
- The number of people who have stopped using the product&lt;br/&gt;
- The number of people who have uninstalled the product&lt;br/&gt;
- The number of people who have given the product a low rating&lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;em&gt;Note: the first list is essentially worthless, the second one is good. Let's try the same question, but with more leading:&lt;/em&gt;&lt;/p&gt;
&lt;p&gt;&lt;span style="background-color: #fefebe"&gt;Q: Imagine you're a Product Analyst or Data Scientist working with a Product Manager at a social networking company like Facebook or Twitter. Give some examples of values of product metrics that might indicate a healthy or unhealthy product. For example, if DAU to MAU ratio is 1 in 30, that might indicate a product that is not sticky enough.&lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;span style="background-color: #acdf87"&gt;A: A healthy product metric might be something like DAU to MAU ratio being 1 in 10. This would indicate that the product is being used frequently enough. An unhealthy product metric might be something like a DAU to MAU ratio of 1 in 30. This would indicate that the product is not sticky enough and people are not using it as frequently as they should be.&lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;em&gt;Note: the answer is worthless mumbo-jumbo.&lt;/em&gt;&lt;/p&gt;
&lt;p&gt;&lt;span style="background-color: #fefebe"&gt;Q: Imagine you're a Product Analyst or Data Scientist working with a Product Manager. Your team owns a mobile app, imagine something like Twitter or Facebook. You are designing the Monthly Active User of MAU metric, which you will compute on a rolling lookback window basis. In other words, every day the metric counts the number of unique users in the last X days. You have a choice of picking X, you're options are 28 days, 30 days, 31 days. Which one would you pick and why?&lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;span style="background-color: #acdf87"&gt;A: 31 days. This is the longest month, so it will capture the most data.&lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;em&gt;Note: in a Product Analytics setting, usually the right number to pick is 28. Let's try the same question, but with more leading:&lt;/em&gt;&lt;/p&gt;
&lt;p&gt;&lt;span style="background-color: #fefebe"&gt;Q: Imagine you're a Product Analyst or Data Scientist working with a Product Manager. Your team owns a mobile app, imagine something like Twitter or Facebook. You are designing the Monthly Active User of MAU metric, which you will compute on a rolling lookback window basis. In other words, every day the metric counts the number of unique users in the last X days. You have a choice of picking X, you're options are 28 days, 30 days, 31 days. Why would we want to pick 28 over the others?&lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;span style="background-color: #acdf87"&gt;A: There are a few reasons why we might want to pick 28 days over the other options:
&lt;br/&gt;
1. 28 days is a multiple of 7, which means that the metric will always count the same number of weekends. This can be helpful if we want to compare MAU week over week and don't want the metric to be skewed by differences in the number of weekends.&lt;br/&gt;
2. 28 days is a common month length, so it will be easy for people to understand and compare the MAU metric to other metrics that are also computed on a monthly basis.&lt;br/&gt;
3. 31 days is the longest possible month length, so choosing 28 days instead will help to reduce noise in the data.&lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;em&gt;Note: 1. is the expected answer here, and the reason we usually use a 28 day MAU definition internally. 2 and 3 are false/mumbo-jumbo.&lt;/em&gt;&lt;/p&gt;
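&lt;p&gt;&lt;em&gt;The weekday argument in 1. is easy to verify, and a rolling-window MAU is simple to compute directly. Below is a minimal sketch in plain Python (the event data and the helper names are made up for illustration): any 28-day window contains exactly 4 of each weekday, so it always covers 8 weekend days, while 30- and 31-day windows vary.&lt;/em&gt;&lt;/p&gt;

```python
from datetime import date, timedelta

def rolling_mau(events, as_of, window=28):
    # events: iterable of (date, user_id) pairs from some activity log (made up here).
    # A user counts as active if they appear in the `window` days ending on `as_of`.
    return len({user for day, user in events if (as_of - day).days in range(window)})

def weekend_days(as_of, window):
    # How many Saturdays and Sundays fall in the window ending on `as_of`?
    start = as_of - timedelta(days=window - 1)
    return sum(1 for i in range(window) if (start + timedelta(days=i)).weekday() in (5, 6))

# A 28-day window always contains exactly 4 of each weekday (8 weekend days),
# so week-over-week MAU comparisons are not skewed by weekend counts.
```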
&lt;h2&gt;Checkout counters&lt;/h2&gt;
&lt;p&gt;&lt;span style="background-color: #fefebe"&gt;Q: Imagine you are a Data Scientist.
You want to build a model to recommend how many checkout counters should be open at a hypermarket, for upcoming days, in one hour slots.
How would you go about it?&lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;span style="background-color: #acdf87"&gt;A: I would first look at the data to see if there are any patterns that can be observed. For example, are there more people shopping on certain days of the week or at certain times of the day? I would also look at the data to see how many people are typically checking out at the same time and see if there are any bottlenecks. Once I have a good understanding of the data, I would build a model that takes into account the number of people shopping and the number of people checking out at the same time.&lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;span style="background-color: #fefebe"&gt;Q: Imagine you are a Data Scientist.
You want to build a model to recommend how many checkout counters should be open at a hypermarket, for upcoming days, in one hour slots.
How would you build the model, what parts would your model pipeline consist of?&lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;span style="background-color: #acdf87"&gt;A: The model pipeline would consist of the following parts:
&lt;br/&gt;
1. Data cleaning and preparation: This would involve cleaning the data set to remove any invalid or missing data points.&lt;br/&gt;
2. Data exploration: This would involve exploring the data set to identify any patterns or trends.&lt;br/&gt;
3. Model training: This would involve training a model to predict the number of checkout counters that should be open for upcoming days, in one hour slots.&lt;br/&gt;
4. Model evaluation: This would involve evaluating the trained model to see how accurate it is in predicting the number of checkout counters that should be open for upcoming days, in one hour slots.&lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;em&gt;Note: the answer is not incorrect, but very generic. Also, what it describes are phases of a Data Scientist's work, but not the stages of a model pipeline.&lt;/em&gt;&lt;/p&gt;
&lt;p&gt;&lt;span style="background-color: #fefebe"&gt;Q: Imagine you are a Data Scientist.
You want to build a model to recommend how many checkout counters should be open at a hypermarket, for upcoming days, in one hour slots.
As a first step, you need to forecast the number of future checkouts, given historic data going back to 2018. How would you do this, given that we also had Covid-19 happening starting in 2020 May?&lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;span style="background-color: #acdf87"&gt;A: There are a few ways to approach this problem. One way would be to build a time series model that takes into account the effects of Covid-19 on the number of checkouts. Another way would be to build a model that predicts the number of checkouts based on a variety of factors, including the time of day, the day of the week, and whether or not there is a Covid-19 pandemic.&lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;em&gt;Note: while not incorrect, the answer contains no useful information.&lt;/em&gt;&lt;/p&gt;
&lt;p&gt;&lt;span style="background-color: #fefebe"&gt;Q: Imagine you are a Data Scientist.
You want to build a model to recommend how many checkout counters should be open at a hypermarket, for upcoming days, in one hour slots.
As a first step, you need to forecast the number of future checkouts, given historic data going back to 2018. How would you do this, given that we also had Covid-19 happening starting in 2020 May? Specifically, assume you're using Python and Scikit Learn.&lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;span style="background-color: #acdf87"&gt;A: There are many ways to approach this problem, but one possible approach would be to use a time series forecasting model such as SARIMAX. This model can take into account seasonality and trends in the data, as well as exogenous variables such as Covid-19.&lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;span style="background-color: #fefebe"&gt;Q: Imagine you are a Data Scientist. You want to build a model to recommend how many checkout counters should be open at a hypermarket, for upcoming days, in one hour slots. In this problem, how would you use an M/M/1 model from queueing theory?&lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;span style="background-color: #acdf87"&gt;A: If you were using an M/M/1 model from queueing theory, you would need to know the arrival rate of customers and the service rate of the checkout counters. With this information, you would be able to calculate the average number of customers in the system, the average time a customer spends in the system, and the average number of customers in the queue. From this, you would be able to recommend how many checkout counters should be open in order to minimize the average time a customer spends in the system.&lt;/span&gt;&lt;/p&gt;
&lt;h2&gt;Receipt scanning&lt;/h2&gt;
&lt;p&gt;&lt;span style="background-color: #fefebe"&gt;Q: Imagine you're a data scientist. We want to build a model to parse out store, date, time and total amount from a picture of a receipt. How would you go about it?&lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;span style="background-color: #acdf87"&gt;A: There are a few ways to tackle this problem:
&lt;br/&gt;
1. Use Optical Character Recognition (OCR) to extract the text from the image. This can be a difficult problem on its own, but there are many commercial and open-source OCR libraries available.&lt;br/&gt;
2. Use a trained machine learning model to classify the text in the image. This would require a lot of training data, but could be more accurate than OCR.&lt;br/&gt;
3. Use a combination of OCR and heuristics to parse out the relevant information. For example, you could use OCR to extract all the text from the image and then use heuristics to look for patterns that match store names, dates, times, and total amounts.&lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;span style="background-color: #fefebe"&gt;Q: Imagine you're a data scientist. We want to build a model to parse out store, date, time and total amount from a picture of a receipt. Suppose you have an OCR program that can extract the text in the image. How would you use it to build this model?&lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;span style="background-color: #acdf87"&gt;A: There are a few ways to build this model. One way would be to use a regular expression to parse out the store, date, time and total amount from the text. Another way would be to use a machine learning algorithm to learn the patterns in the data and then predict the store, date, time and total amount from new data.&lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;em&gt;Note: the fact that it recommends regular expressions is very cool. When I was working on this problem, the baseline version I was improving on, built by a previous team, was using regular expressions!&lt;/em&gt;&lt;/p&gt;
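&lt;p&gt;&lt;em&gt;A regex baseline over OCR output can be sketched in a few lines (the receipt text and patterns below are illustrative heuristics, not the production baseline mentioned above):&lt;/em&gt;&lt;/p&gt;

```python
import re

# Hypothetical OCR output for a receipt; layout and field formats are made up.
ocr_text = """CARREFOUR CITY
2022-08-14 18:42
MILK 1L        3.50
BREAD          2.25
TOTAL          5.75
"""

store_name = ocr_text.splitlines()[0]                           # heuristic: first line is the store
date_str = re.search(r"\d{4}-\d{2}-\d{2}", ocr_text).group()    # ISO-style date
time_str = re.search(r"\d{2}:\d{2}", ocr_text).group()          # HH:MM time
total_amount = float(re.search(r"TOTAL\s+(\d+\.\d{2})", ocr_text).group(1))
```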
&lt;p&gt;&lt;span style="background-color: #fefebe"&gt;Q: Imagine you're a data scientist. We want to build a model to parse out store, date, time and total amount from a picture of a receipt. Suppose you have an OCR program that can extract the text in the image. What kind of Machine Learning libraries would you use here?&lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;span style="background-color: #acdf87"&gt;A: There are many ways to approach this problem, but some common libraries for machine learning include TensorFlow, scikit-learn, and Keras.&lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;em&gt;Note: since it recommends scikit-learn, let's ask about it.&lt;/em&gt;&lt;/p&gt;
&lt;p&gt;&lt;span style="background-color: #fefebe"&gt;Q: Imagine you're a data scientist. We want to build a model to parse out store, date, time and total amount from a picture of a receipt. Suppose you have an OCR program that can extract the text in the image. Assuming you're using scikit-learn to solve this problem, which models would you use, and how?&lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;span style="background-color: #acdf87"&gt;A: There are a few ways to approach this problem. One way would be to use a regex model to extract the relevant information from the text. Another way would be to use a series of scikit-learn models to extract the store, date, time, and total amount separately.&lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;em&gt;Note: the answer is not specific to scikit-learn.&lt;/em&gt;&lt;/p&gt;
&lt;p&gt;&lt;span style="background-color: #fefebe"&gt;Q: Imagine you're a data scientist. We want to build a model to parse out store, date, time and total amount from a picture of a receipt. Suppose you have an OCR program that can extract bounding boxes and the text in each bounding box from the image. How would you use this to solve the problem?&lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;span style="background-color: #acdf87"&gt;A: There are a few different ways to solve this problem. One way would be to use the OCR program to extract the text in each bounding box, and then use a regular expression to parse out the store, date, time and total amount from the text. Another way would be to use the OCR program to extract the bounding boxes, and then use a computer vision algorithm to parse out the store, date, time and total amount from the image.&lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;em&gt;Note: the answer is not very useful. A more useful answer would be about using the positions and alignments of the bounding boxes to predict which contains useful information, or adding up individual amounts to increase confidence about total amount.&lt;/em&gt;&lt;/p&gt;
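&lt;p&gt;&lt;em&gt;The bounding-box heuristics hinted at above can be sketched like this (the box coordinates, texts and helper names are made up for illustration): amounts share a right-aligned column, and the line items summing to the candidate total increases confidence in the parse.&lt;/em&gt;&lt;/p&gt;

```python
# Hypothetical (box, text) output from an OCR engine; box = (x, y, width, height).
boxes = [
    ((10, 10, 80, 12), "MILK 1L"),  ((150, 10, 40, 12), "3.50"),
    ((10, 30, 80, 12), "BREAD"),    ((150, 30, 40, 12), "2.25"),
    ((10, 50, 80, 12), "TOTAL"),    ((150, 50, 40, 12), "5.75"),
]

def parse_amount(text):
    # Return the text as a float if it looks like an amount, else None.
    try:
        return float(text)
    except ValueError:
        return None

# Keep boxes whose text parses as a number, then check column alignment.
amounts = [(box, parse_amount(text)) for box, text in boxes if parse_amount(text) is not None]
right_edges = {x + w for (x, y, w, h), _ in amounts}
aligned = len(right_edges) == 1                 # amounts form one right-aligned column

values = [v for _, v in amounts]                # in top-to-bottom reading order
total_value = values[-1]                        # heuristic: total is the bottom-most amount
items_sum = round(sum(values[:-1]), 2)
confident = aligned and items_sum == total_value  # cross-check: items add up to the total
```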
&lt;h2&gt;Conclusion&lt;/h2&gt;
&lt;p&gt;In this round, GPT-3 did significantly worse than on the previous, more straightforward interview questions. Based on these sessions, I don't think it's valuable as an assistant for Data Scientists in real world settings.&lt;/p&gt;</content><category term="gpt"></category><category term="gpt-3"></category></entry><entry><title>How good a Data Scientist is GPT-3?</title><link href="https://bytepawn.com/how-good-a-data-scientist-is-gpt-3.html" rel="alternate"></link><published>2022-07-31T00:00:00+02:00</published><updated>2022-07-31T00:00:00+02:00</updated><author><name>Marton Trencseni</name></author><id>tag:bytepawn.com,2022-07-31:/how-good-a-data-scientist-is-gpt-3.html</id><summary type="html">&lt;p&gt;Recently I have been playing around with OpenAI's GPT-3 and I am very impressed by it. It reminds me of the famous Arthur C. Clarke quote, &lt;em&gt;“Any sufficiently advanced technology is indistinguishable from magic.”&lt;/em&gt; Here I will show a "conversation" with GPT-3 to gauge how good a Data Scientist — or an illusion of a Data Scientist — it is.&lt;br/&gt;&lt;br/&gt; &lt;img src="/images/gpt-3.jpg" alt="GPT-3" style="width: 400px;"/&gt;&lt;/p&gt;</summary><content type="html">&lt;p&gt;Recently I have been playing around with OpenAI's &lt;a href="https://en.wikipedia.org/wiki/GPT-3"&gt;GPT-3&lt;/a&gt; and I am very impressed by it. It reminds me of the famous Arthur C. Clarke quote, &lt;em&gt;“Any sufficiently advanced technology is indistinguishable from magic.”&lt;/em&gt; If you don't know what GPT-3 is, this is a &lt;a href="https://towardsdatascience.com/understanding-gpt-3-in-5-minutes-7fe35c3a1e52"&gt;good introduction with a lot of links&lt;/a&gt;.&lt;/p&gt;
&lt;p&gt;Recently, a &lt;a href="https://www.youtube.com/watch?v=kgCUn4fQTsc&amp;amp;t=1s"&gt;Google engineer has made the claim&lt;/a&gt; that one of Google's proprietary internal AI systems called &lt;a href="https://blog.google/technology/ai/lamda/"&gt;LaMDA&lt;/a&gt; is sentient. A more useful perspective on such advanced statistical AI systems is to say that they are an extremely useful, wonderful illusion.&lt;/p&gt;
&lt;p&gt;&lt;img src="/images/gpt-3.jpg" alt="GPT-3" style="width: 400px;"/&gt;&lt;/p&gt;
&lt;p&gt;&lt;em&gt;You&lt;/em&gt; can also talk to GPT-3, &lt;a href="https://openai.com/api/"&gt;just create an account on OpenAI and go to the sandbox!&lt;/a&gt;&lt;/p&gt;
&lt;p&gt;Here I will show a "conversation" with GPT-3 to gauge how good a Data Scientist — or an illusion of a Data Scientist — it is. Mostly these are questions I might ask a candidate in a Data Science interview. The questions are independent of one another, so they work with a context-free system such as GPT-3. By context-free I mean that GPT-3 does not maintain conversation state; each question and answer is independent from the AI's perspective.&lt;/p&gt;
&lt;p&gt;GPT-3 is trained on large amounts of text downloaded from the Internet. So I expect it to do well on questions for which high-quality answers already exist on the Internet, and it "just" has to return them in correct English. For questions that require more reasoning and combining facts (from the question and the training data), I expect to get less impressive answers.&lt;/p&gt;
&lt;h2&gt;Culture&lt;/h2&gt;
&lt;p&gt;&lt;span style="background-color: #fefebe"&gt;Q: What does a good Data Scientist look like?&lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;span style="background-color: #acdf87"&gt;A: There is no one-size-fits-all answer to this question, as the ideal data scientist depends on the specific needs of the organization. However, some key qualities that a good data scientist should possess include strong analytical and problem-solving skills, the ability to think creatively, and experience with data-driven decision making. Additionally, a good data scientist should be able to effectively communicate their findings to both technical and non-technical audiences.&lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;em&gt;Note: This was an easy question because there are lots of sentences like this on the Internet, and the same applies to the following questions. The answer is still impressive and creates a strong illusion, because the quality of the answer is so strong, and fits the structure of the question.&lt;/em&gt;&lt;/p&gt;
&lt;p&gt;&lt;span style="background-color: #fefebe"&gt;Q: What's the difference between a Data Scientist and a Data Engineer?&lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;span style="background-color: #acdf87"&gt;A: A data engineer is responsible for designing, building, and maintaining the data infrastructure that a company uses to collect, store, and process data. A data scientist is responsible for analyzing and extracting insights from data.&lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;span style="background-color: #fefebe"&gt;Q: What's the difference between a Data Scientist and a Product Analyst?&lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;span style="background-color: #acdf87"&gt;A: A data scientist is someone who is responsible for extracting meaning from data. A product analyst is someone who is responsible for analyzing a product and determining its feasibility.&lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;span style="background-color: #fefebe"&gt;Q: What's the difference between a Data Scientist and an ML Engineer?&lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;span style="background-color: #acdf87"&gt;A: A Data Scientist is someone who is responsible for extracting insights from data. An ML Engineer is someone who is responsible for building and deploying machine learning models.&lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;span style="background-color: #fefebe"&gt;Q: In a tech company working on a SaaS product, what's a good ratio of Data Scientists and Product Managers?&lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;span style="background-color: #acdf87"&gt;A: There is no definitive answer to this question since it can vary depending on the specific needs of the company and product. However, a good rule of thumb is to have at least one Data Scientist for every two Product Managers.&lt;span&gt;&lt;/p&gt;
&lt;p&gt;&lt;span style="background-color: #fefebe"&gt;Q: In a tech company working on a SaaS product, what's a good ratio of Data Scientists, Data Engineers, Software Engineers and Product Managers?&lt;spam&gt;&lt;/p&gt;
&lt;p&gt;&lt;span style="background-color: #acdf87"&gt;A: There is no definitive answer to this question, as it will vary depending on the specific needs of the company and product. However, a good general ratio to aim for is 1 Data Scientist, 1 Data Engineer, 2 Software Engineers, and 1 Product Manager. This will ensure that there is enough expertise and manpower dedicated to each area, while still allowing for some overlap and collaboration between teams.&lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;em&gt;Note: the AI is contradicting itself; in the first question, it recommended a 1:2 DS:PM ratio, while here it's recommending a 1:1 ratio. I'm not sure where the 1:1 is coming from; in my experience at tech companies, a 1:1 ratio would be too many DEs.&lt;/em&gt;&lt;/p&gt;
&lt;h2&gt;Coding&lt;/h2&gt;
&lt;p&gt;&lt;span style="background-color: #fefebe"&gt;Q: Given a list li, write a function &lt;code&gt;top_sum(li, n)&lt;/code&gt; that returns the sum of the top &lt;code&gt;n&lt;/code&gt; elements: &lt;code&gt;top_sum(li, n)&lt;/code&gt;. Eg. &lt;code&gt;top_sum(li=[5, 1, 5, 2, 0], n=3)&lt;/code&gt; returns &lt;code&gt;12&lt;/code&gt;.&lt;span&gt;&lt;/p&gt;
&lt;p&gt;&lt;span style="background-color: #acdf87"&gt;A:&lt;/span&gt;&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;top_sum&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;li&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;n&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
  &lt;span class="n"&gt;sorted_list&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nb"&gt;sorted&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;li&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;reverse&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="bp"&gt;True&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
  &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="nb"&gt;sum&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;sorted_list&lt;/span&gt;&lt;span class="p"&gt;[:&lt;/span&gt;&lt;span class="n"&gt;n&lt;/span&gt;&lt;span class="p"&gt;])&lt;/span&gt;
&lt;/pre&gt;&lt;/div&gt;


&lt;p&gt;&lt;span style="background-color: #fefebe"&gt;Q: Given two lists &lt;code&gt;keys&lt;/code&gt; and &lt;code&gt;values&lt;/code&gt; of identical length containing key-value pairs, write a generator function &lt;code&gt;yield_sorted_kvs(keys, values)&lt;/code&gt; which yields key-value pairs from the two lists, in ascending key order. Note that the input list keys may not be ordered. Eg. &lt;code&gt;yield_sorted_kvs(keys=[3, 1, 2], values=[‘a’, None, 5])&lt;/code&gt; yields &lt;code&gt;(1, None) .. (2, 5) .. (3, ‘a’)&lt;/code&gt;.&lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;span style="background-color: #acdf87"&gt;A:&lt;/span&gt;&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;yield_sorted_kvs&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;keys&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;values&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
    &lt;span class="n"&gt;d&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="n"&gt;k&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;v&lt;/span&gt; &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;k&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;v&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="nb"&gt;zip&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;keys&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;values&lt;/span&gt;&lt;span class="p"&gt;)}&lt;/span&gt;
    &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;k&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="nb"&gt;sorted&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;d&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;keys&lt;/span&gt;&lt;span class="p"&gt;()):&lt;/span&gt;
        &lt;span class="k"&gt;yield&lt;/span&gt; &lt;span class="n"&gt;k&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;d&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;k&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;
&lt;/pre&gt;&lt;/div&gt;
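&lt;p&gt;&lt;em&gt;The generated function checks out on the example from the question; a quick sanity check (the driver code below is mine, not GPT-3's):&lt;/em&gt;&lt;/p&gt;

```python
def yield_sorted_kvs(keys, values):
    # Pair up the two lists into a dict, then yield in ascending key order.
    d = {k: v for k, v in zip(keys, values)}
    for k in sorted(d.keys()):
        yield k, d[k]

# The example from the question: keys are unordered on input.
print(list(yield_sorted_kvs(keys=[3, 1, 2], values=['a', None, 5])))
# → [(1, None), (2, 5), (3, 'a')]
```

&lt;p&gt;&lt;em&gt;One caveat: since it round-trips through a dict, duplicate keys silently collapse to the last value; a &lt;code&gt;sorted(zip(keys, values), key=lambda kv: kv[0])&lt;/code&gt; variant would keep duplicates.&lt;/em&gt;&lt;/p&gt;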


&lt;p&gt;&lt;em&gt;Note: GPT-3 nailed both, even though these exact &lt;code&gt;top_sum()&lt;/code&gt; and &lt;code&gt;yield_sorted_kvs()&lt;/code&gt; functions are not on the Internet. Also note that GPT-3 is not specifically trained to write code; being able to implement small functions like this is just an "accidental side-effect". OpenAI is working on a model called &lt;a href="https://openai.com/blog/openai-codex/"&gt;OpenAI Codex&lt;/a&gt;, specifically to write code.&lt;/em&gt;&lt;/p&gt;
&lt;p&gt;&lt;span style="background-color: #fefebe"&gt;Q: Can you show me some Python code that you consider beautiful?&lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;span style="background-color: #acdf87"&gt;A:&lt;/span&gt;&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;main&lt;/span&gt;&lt;span class="p"&gt;():&lt;/span&gt;
    &lt;span class="k"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s2"&gt;&amp;quot;Hello, world!&amp;quot;&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="vm"&gt;__name__&lt;/span&gt; &lt;span class="o"&gt;==&lt;/span&gt; &lt;span class="s2"&gt;&amp;quot;__main__&amp;quot;&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
    &lt;span class="n"&gt;main&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
&lt;/pre&gt;&lt;/div&gt;


&lt;p&gt;&lt;em&gt;Note: This is a very weak answer; it breaks the illusion. Let's try again.&lt;/em&gt;&lt;/p&gt;
&lt;p&gt;&lt;span style="background-color: #fefebe"&gt;Q: Can you show me some Python code that you consider beautiful, that implements a famous algorithm of your choosing?&lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;span style="background-color: #acdf87"&gt;A:&lt;/span&gt;&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;fft&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;x&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
    &lt;span class="n"&gt;N&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nb"&gt;len&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;x&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;N&lt;/span&gt; &lt;span class="o"&gt;&amp;lt;=&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;x&lt;/span&gt;
    &lt;span class="n"&gt;even&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;fft&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;x&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;::&lt;/span&gt;&lt;span class="mi"&gt;2&lt;/span&gt;&lt;span class="p"&gt;])&lt;/span&gt;
    &lt;span class="n"&gt;odd&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt;  &lt;span class="n"&gt;fft&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;x&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;::&lt;/span&gt;&lt;span class="mi"&gt;2&lt;/span&gt;&lt;span class="p"&gt;])&lt;/span&gt;
    &lt;span class="n"&gt;T&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;exp&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="o"&gt;-&lt;/span&gt;&lt;span class="mi"&gt;2j&lt;/span&gt;&lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="n"&gt;pi&lt;/span&gt;&lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="n"&gt;k&lt;/span&gt;&lt;span class="o"&gt;/&lt;/span&gt;&lt;span class="n"&gt;N&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;&lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="n"&gt;odd&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;k&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;k&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="nb"&gt;range&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;N&lt;/span&gt;&lt;span class="o"&gt;//&lt;/span&gt;&lt;span class="mi"&gt;2&lt;/span&gt;&lt;span class="p"&gt;)]&lt;/span&gt;
    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;even&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;k&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="o"&gt;+&lt;/span&gt; &lt;span class="n"&gt;T&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;k&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;k&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="nb"&gt;range&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;N&lt;/span&gt;&lt;span class="o"&gt;//&lt;/span&gt;&lt;span class="mi"&gt;2&lt;/span&gt;&lt;span class="p"&gt;)]&lt;/span&gt; &lt;span class="o"&gt;+&lt;/span&gt; \
           &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;even&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;k&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="o"&gt;-&lt;/span&gt; &lt;span class="n"&gt;T&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;k&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;k&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="nb"&gt;range&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;N&lt;/span&gt;&lt;span class="o"&gt;//&lt;/span&gt;&lt;span class="mi"&gt;2&lt;/span&gt;&lt;span class="p"&gt;)]&lt;/span&gt;
&lt;/pre&gt;&lt;/div&gt;
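&lt;p&gt;&lt;em&gt;As printed, the snippet won't run on its own; it's missing the imports for &lt;code&gt;exp&lt;/code&gt; and &lt;code&gt;pi&lt;/code&gt;. A self-contained version (assuming &lt;code&gt;cmath&lt;/code&gt;, as in the Stackoverflow original):&lt;/em&gt;&lt;/p&gt;

```python
from cmath import exp, pi

def fft(x):
    # Recursive radix-2 Cooley-Tukey FFT; len(x) must be a power of two.
    N = len(x)
    if N <= 1:
        return x
    even = fft(x[0::2])
    odd = fft(x[1::2])
    # Twiddle factors applied to the odd half.
    T = [exp(-2j * pi * k / N) * odd[k] for k in range(N // 2)]
    return [even[k] + T[k] for k in range(N // 2)] + \
           [even[k] - T[k] for k in range(N // 2)]

# Sanity check: the DFT of a unit impulse is flat (all ones).
print(fft([1, 0, 0, 0]))
```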


&lt;p&gt;&lt;em&gt;Note: This is a great answer, but it's copy/paste from this &lt;a href="https://stackoverflow.com/questions/30433481/fft-using-recursive-python-function"&gt;Stackoverflow answer&lt;/a&gt;. It would be interesting to know why the AI picked FFT, and this particular implementation!&lt;/em&gt;&lt;/p&gt;
&lt;p&gt;&lt;span style="background-color: #fefebe"&gt;Q: Can you show me some SQL code that you consider beautiful?&lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;span style="background-color: #acdf87"&gt;A: This is a difficult question since beauty is subjective. However, here is an example of SQL code that is well written and easy to read:&lt;/span&gt;&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;span class="k"&gt;SELECT&lt;/span&gt;
    &lt;span class="n"&gt;first_name&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;last_name&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;email&lt;/span&gt;
&lt;span class="k"&gt;FROM&lt;/span&gt;
    &lt;span class="n"&gt;users&lt;/span&gt;
&lt;span class="k"&gt;WHERE&lt;/span&gt;
    &lt;span class="n"&gt;active&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt;
&lt;span class="k"&gt;ORDER&lt;/span&gt; &lt;span class="k"&gt;BY&lt;/span&gt;
    &lt;span class="n"&gt;last_name&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;first_name&lt;/span&gt;
&lt;/pre&gt;&lt;/div&gt;


&lt;p&gt;&lt;em&gt;Note: this is actually a great answer. I agree with GPT-3 that the only way to interpret "beautiful" in the context of SQL is formatting, and this is nicely formatted SQL code. &lt;a href="https://bytepawn.com/how-i-write-sql-code.html"&gt;I follow the same formatting.&lt;/a&gt; But the illusion would work better if the sample query were a bit longer:&lt;/em&gt;&lt;/p&gt;
&lt;p&gt;&lt;span style="background-color: #fefebe"&gt;Q: Can you show me some SQL code that you consider beautiful, that involves use of the &lt;code&gt;WITH&lt;/code&gt; clause and also a &lt;code&gt;JOIN&lt;/code&gt;?&lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;span style="background-color: #acdf87"&gt;A:&lt;/span&gt;&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;span class="k"&gt;WITH&lt;/span&gt; &lt;span class="n"&gt;user_sessions&lt;/span&gt; &lt;span class="k"&gt;AS&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;
  &lt;span class="k"&gt;SELECT&lt;/span&gt; &lt;span class="n"&gt;user_id&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;session_id&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;start_time&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;end_time&lt;/span&gt;
  &lt;span class="k"&gt;FROM&lt;/span&gt; &lt;span class="n"&gt;sessions&lt;/span&gt;
  &lt;span class="k"&gt;WHERE&lt;/span&gt; &lt;span class="n"&gt;user_id&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt;
&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="k"&gt;SELECT&lt;/span&gt; &lt;span class="n"&gt;u&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;username&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;s&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;session_id&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;s&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;start_time&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;s&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;end_time&lt;/span&gt;
&lt;span class="k"&gt;FROM&lt;/span&gt; &lt;span class="n"&gt;users&lt;/span&gt; &lt;span class="n"&gt;u&lt;/span&gt;
&lt;span class="k"&gt;JOIN&lt;/span&gt; &lt;span class="n"&gt;user_sessions&lt;/span&gt; &lt;span class="n"&gt;s&lt;/span&gt; &lt;span class="k"&gt;ON&lt;/span&gt; &lt;span class="n"&gt;u&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;user_id&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;s&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;user_id&lt;/span&gt;
&lt;/pre&gt;&lt;/div&gt;


&lt;p&gt;&lt;em&gt;Note: here the illusion breaks, because now it's using a different style of formatting!&lt;/em&gt;&lt;/p&gt;
&lt;h2&gt;A/B testing&lt;/h2&gt;
&lt;p&gt;&lt;span style="background-color: #fefebe"&gt;Q: Let's say we're running an A/B test with two variants. How should I split traffic to maximize the probability of getting a statistically significant result?&lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;span style="background-color: #acdf87"&gt;A: The traffic should be split evenly between the variants.&lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;em&gt;Note: a lot of people don't know this in interviews.&lt;/em&gt;&lt;/p&gt;
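&lt;p&gt;&lt;em&gt;The underlying reason: the standard error of the difference between the two groups scales like the square root of (1/n_A + 1/n_B), which is minimized by an even split. A quick sketch (the sample sizes are illustrative, not from the post):&lt;/em&gt;&lt;/p&gt;

```python
from math import sqrt

def se_of_difference(n_total, split_a, sigma=1.0):
    # Standard error of (mean_B - mean_A) when a fraction split_a of
    # n_total users goes to A and the rest goes to B.
    n_a = n_total * split_a
    n_b = n_total - n_a
    return sigma * sqrt(1 / n_a + 1 / n_b)

for split in [0.5, 0.8, 0.9]:
    print(split, round(se_of_difference(10_000, split), 4))
# the 50/50 split gives the smallest standard error, hence the most power
```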
&lt;p&gt;&lt;span style="background-color: #fefebe"&gt;Q: Let's say we're running an A/B test with two variants. If we split traffic evenly between A and B, we maximize the probability of getting a statistically significant result. However, in real life usually traffic is not split evenly. Why?&lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;span style="background-color: #acdf87"&gt;A: There are many reasons why traffic might not be split evenly in a real-life A/B test. For example, if the variants are assigned randomly, there is a chance that one variant will be assigned to more users than the other. In addition, some users may be more likely to use one variant over the other, or one variant may be more popular than the other.&lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;em&gt;Note: this one is surprising. The answer I was expecting was around managing the risk of trying a new version. Let's see if we can lead it.&lt;/em&gt;&lt;/p&gt;
&lt;p&gt;&lt;span style="background-color: #fefebe"&gt;Q: Let's say we're running an A/B test with two variants. If we split traffic evenly between A and B, we maximize the probability of getting a statistically significant result. However, in real life A/B tests are usually not configured to split traffic evenly. In most A/B tests, the new variant gets 10 or 20% of traffic. Why?&lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;span style="background-color: #acdf87"&gt;A: The reason is that if the new variant is worse than the old variant, we want to minimize the number of users who experience a worse experience. If the new variant is better than the old variant, we want to quickly find out and roll it out to all users. So in most A/B tests, the new variant gets a small amount of traffic at first. If the results look promising, the new variant gets more traffic. If the results are not promising, the new variant gets less traffic.&lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;em&gt;Note: by &lt;a href="https://medium.com/intuitionmachine/the-subtle-art-of-priming-gpt-3-bc12f67c9777"&gt;priming GPT-3&lt;/a&gt; in the question, we were able to direct it to the expected answer.&lt;/em&gt;&lt;/p&gt;
&lt;p&gt;&lt;span style="background-color: #fefebe"&gt;Q: Let's say we're running an A/B test with two variants of an email. The evaluation metric is CTR. One branch has a CTR of 4%, the other branch has a CTR of 91%. What do you think?&lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;span style="background-color: #acdf87"&gt;A: There is a significant difference in CTR between the two variants of the email. The branch with a CTR of 91% is likely to be the more successful email.&lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;em&gt;Note: while technically correct, this is not a strong answer. A strong(er) answer would be to say that this is an unlikely result, almost certainly due to a bug or measurement error. &lt;a href="https://en.wikipedia.org/wiki/Twyman%27s_law"&gt;See Twyman's law.&lt;/a&gt;&lt;/em&gt;&lt;/p&gt;
&lt;p&gt;&lt;span style="background-color: #fefebe"&gt;Q: Can you explain to me what a p-value is in A/B testing?&lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;span style="background-color: #acdf87"&gt;A: A p-value is the probability that the results of a hypothesis test are due to chance. A low p-value indicates that the results are not due to chance, and are therefore significant.&lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;span style="background-color: #fefebe"&gt;Q: Can you explain to me what the difference between frequentist and bayesian A/B testing is?&lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;span style="background-color: #acdf87"&gt;A: Frequentist A/B testing is based on the idea of hypothesis testing, where we test a null hypothesis (usually that there is no difference between the two groups) against an alternative hypothesis (usually that there is a difference between the two groups). We then calculate a p-value, which is the probability of seeing our data if the null hypothesis were true. If the p-value is less than a certain threshold (usually 0.05), we reject the null hypothesis and conclude that there is a difference between the two groups.
&lt;br/&gt;&lt;br/&gt;
Bayesian A/B testing is based on the idea of posterior inference, where we use the data to calculate the posterior probability of the null hypothesis being true. We then compare this posterior probability to a prior probability, which represents our beliefs about the null hypothesis before we saw the data. If the posterior probability is greater than the prior probability, we conclude that the null hypothesis is more likely to be true.
&lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;em&gt;Note: the last 2 questions were simple, but the illusion is still impressive!&lt;/em&gt;&lt;/p&gt;
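&lt;p&gt;&lt;em&gt;In practice, the Bayesian version is often implemented by sampling the two Beta posteriors and estimating P(B beats A); a minimal sketch with uniform Beta(1, 1) priors (my example, not GPT-3's):&lt;/em&gt;&lt;/p&gt;

```python
import random

def prob_b_beats_a(clicks_a, n_a, clicks_b, n_b, draws=100_000, seed=0):
    # Beta(1, 1) uniform priors updated with the observed clicks; estimate
    # P(rate_B > rate_A) by Monte Carlo over the two posteriors.
    rng = random.Random(seed)
    wins = 0
    for _ in range(draws):
        rate_a = rng.betavariate(1 + clicks_a, 1 + n_a - clicks_a)
        rate_b = rng.betavariate(1 + clicks_b, 1 + n_b - clicks_b)
        wins += rate_b > rate_a
    return wins / draws

# 10.0% vs 13.0% CTR on 1,000 emails each:
print(prob_b_beats_a(100, 1000, 130, 1000))
```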
&lt;p&gt;&lt;span style="background-color: #fefebe"&gt;Q: Can you recommend a good webpage to me that explains the difference between frequentist and bayesian A/B testing?&lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;span style="background-color: #acdf87"&gt;A: There is a lot of information out there on this topic. However, this webpage provides a clear explanation of the differences between frequentist and bayesian A/B testing: https://www.evanmiller.org/bayesian-ab-testing.html&lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;em&gt;Note: here the illusion is strong again, I would also recommend Evan's blog on this topic!&lt;/em&gt;&lt;/p&gt;
&lt;p&gt;&lt;span style="background-color: #fefebe"&gt;Q: In the context of A/B testing, can you explain to me how &lt;a href="https://bytepawn.com/tag/cuped.html"&gt;CUPED&lt;/a&gt; uses historical data?&lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;span style="background-color: #acdf87"&gt;A: CUPED uses historical data to help identify which variation of a given test is performing better. By looking at past performance, CUPED can more accurately predict which variation is more likely to lead to a successful outcome for the current test. This helps to ensure that the results of the test are more reliable and accurate.&lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;em&gt;Note: this is a very generic answer. It's interesting to note that originally I had a typo in my question, I wrote "used" instead of "uses". This totally tripped up GPT-3 and yielded a very generic answer:&lt;/em&gt;&lt;/p&gt;
&lt;p&gt;&lt;span style="background-color: #fefebe"&gt;Q: In the context of A/B testing, can you explain to me how CUPED &lt;span style="background-color: red"&gt;used&lt;/span&gt; historical data?&lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;span style="background-color: #acdf87"&gt;A: CUPED used historical data to analyze the conversion rates for each variation of their A/B test. They then used this data to determine which variation was more effective in converting users.&lt;/span&gt;&lt;/p&gt;
&lt;h2&gt;Conclusion&lt;/h2&gt;
&lt;p&gt;Overall I find GPT-3 to be very impressive. It performs better as a question answering bot than many human Data Scientists I interview. Although this AI is not yet good enough to take away any technical contributor's job, it's not hard to imagine systems that are trained more specifically for a domain (such as Data Science and programming) and are potentially 10-100x bigger.&lt;/p&gt;
&lt;p&gt;I will continue this post with Part II, where I will ask it modeling questions.&lt;/p&gt;</content><category term="gpt"></category><category term="gpt-3"></category></entry><entry><title>The Elements of Typographic Style</title><link href="https://bytepawn.com/the-elements-of-typographic-style.html" rel="alternate"></link><published>2022-07-23T00:00:00+02:00</published><updated>2022-07-23T00:00:00+02:00</updated><author><name>Marton Trencseni</name></author><id>tag:bytepawn.com,2022-07-23:/the-elements-of-typographic-style.html</id><summary type="html">&lt;p&gt;The Elements of Typographic Style by Robert Bringhurst is the most beautiful book I've ever held in my hand. This stunning piece of readable art shows Bringhurst's love for the craft of design, typography and writing, and his mastery of these subjects, a result of his life-long devotion to them. I am not a professional typographer, but I enjoyed glancing at, reading and appreciating every page of this book.&lt;br/&gt;&lt;br/&gt; &lt;img src="/images/teots_cover.jpg" alt="The Elements of Typographic Style" style="width: 300px;"/&gt;&lt;/p&gt;</summary><content type="html">&lt;p&gt;The Elements of Typographic Style by &lt;a href="https://en.wikipedia.org/wiki/Robert_Bringhurst"&gt;Robert Bringhurst&lt;/a&gt; is the most beautiful book I've ever held in my hand. This stunning piece of readable art shows Bringhurst's love for the craft of design, typography and writing, and his mastery of these subjects, a result of his life-long devotion to them. I am not a professional typographer, but I enjoyed glancing at, reading and appreciating every page of this book. &lt;a href="https://www.amazon.com/Elements-Typographic-Style-Version-Anniversary/dp/0881792128/"&gt;I recommend you order your copy right now.&lt;/a&gt; I recommend the softcover version.&lt;/p&gt;
&lt;p&gt;&lt;img src="/images/teots_cover.jpg" alt="The Elments of Typographic Style" style="width: 800px;"/&gt;&lt;/p&gt;
&lt;h2&gt;Sample pages&lt;/h2&gt;
&lt;p&gt;To whet your appetite, here are some sample pages from the book. These are from a scan of the previous edition; the actual book is much nicer due to the higher resolution of print (about 5-10x higher than your screen).&lt;/p&gt;
&lt;p&gt;&lt;img src="/images/teots_1.jpg" alt="The Elments of Typographic Style" style="width: 800px;"/&gt;&lt;/p&gt;
&lt;p&gt;&lt;img src="/images/teots_2.jpg" alt="The Elments of Typographic Style" style="width: 800px;"/&gt;&lt;/p&gt;
&lt;p&gt;&lt;img src="/images/teots_3.jpg" alt="The Elments of Typographic Style" style="width: 800px;"/&gt;&lt;/p&gt;
&lt;p&gt;&lt;img src="/images/teots_4.jpg" alt="The Elments of Typographic Style" style="width: 800px;"/&gt;&lt;/p&gt;
&lt;p&gt;&lt;img src="/images/teots_5.jpg" alt="The Elments of Typographic Style" style="width: 800px;"/&gt;&lt;/p&gt;
&lt;p&gt;&lt;img src="/images/teots_6.jpg" alt="The Elments of Typographic Style" style="width: 800px;"/&gt;&lt;/p&gt;
&lt;p&gt;&lt;img src="/images/teots_7.jpg" alt="The Elments of Typographic Style" style="width: 800px;"/&gt;&lt;/p&gt;
&lt;p&gt;&lt;img src="/images/teots_8.jpg" alt="The Elments of Typographic Style" style="width: 800px;"/&gt;&lt;/p&gt;
&lt;h2&gt;Typography&lt;/h2&gt;
&lt;p&gt;The text face is &lt;a href="https://fonts.adobe.com/fonts/minion"&gt;Minion Pro&lt;/a&gt;, designed by &lt;a href="https://en.wikipedia.org/wiki/Robert_Slimbach"&gt;Robert Slimbach&lt;/a&gt;.&lt;br/&gt;
The captions are set in &lt;a href="https://fonts.adobe.com/fonts/ff-scala-sans"&gt;Scala Sans&lt;/a&gt;, designed by &lt;a href="https://en.wikipedia.org/wiki/Martin_Majoor"&gt;Martin Majoor&lt;/a&gt;.&lt;br/&gt;
The paper is Glatfelter Laid, archival quality and acid-free.&lt;/p&gt;
&lt;h2&gt;Quotes&lt;/h2&gt;
&lt;p&gt;Some of my favorite quotes from the book, which apply not just to typography.&lt;/p&gt;
&lt;p&gt;&amp;nbsp;&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;By all means break the rules, and break them beautifully, deliberately and well.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;&amp;nbsp;&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;Using what there is to best advantage almost always means using less than what is available.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;&amp;nbsp;&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;Consistency is one of the forms of beauty. Contrast is another.&lt;/p&gt;
&lt;p&gt;Choose [your library of faces] slowly and well. Stay with your first choices long enough to learn their virtues and limitations before you move on.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;&amp;nbsp;&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;With type as with philosophy, music and food, it is better to have a little of the best than to be swamped with the derivative, the careless, the routine.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;&amp;nbsp;&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;Every alphabet is a culture.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;&amp;nbsp;&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;On mathematics in typography: The mathematics are not here to impose drudgery upon anyone. On the contrary, they are here entirely for pleasure. They are here for the pleasure of those who like to examine what they are doing, or what they might do or have already done, perhaps in the hope of doing it still better.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;&amp;nbsp;&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;Don't restate the obvious [regarding the use of running headers in books].&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;&amp;nbsp;&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;Architects build perfectly proportioned kitchens, living rooms and bedrooms in which their clients will make, among other things, a mess. Typographers likewise build perfectly proportioned pages, then distort them on demand. The text takes precedence over the purity of the design.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;&amp;nbsp;&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;The state of the art has more by far to do with the knowledge and skill of its practitioners than with the subtleties of their tools, but tools can constrain that skill or set it free. The limitations of the tools are therefore also of some interest.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;&amp;nbsp;&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;Like a forest or a garden or a field, an honest page of letters can absorb – and will repay – all the attention it is given. Much type now, however, is delivered to computer screens. The best computer monitors now sold have barely adequate resolution (220 dpi: roughly a third the current norm for laser printers and less than a tenth of the norm for professional offset printing). When texts disintegrate into pixels, the eye goes looking for distraction, which the screen is all too able to provide... The underlying problem is that the screen mimics the sky instead of the earth. Note: a flagship phone in 2022 has approximately 500 pixels per inch. The 4K monitor I'm typing on has about 200 pixels per inch.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;&amp;nbsp;&lt;/p&gt;</content><category term="book"></category><category term="typography"></category></entry><entry><title>The Culture Map</title><link href="https://bytepawn.com/the-culture-map.html" rel="alternate"></link><published>2022-07-22T00:00:00+02:00</published><updated>2022-07-22T00:00:00+02:00</updated><author><name>Marton Trencseni</name></author><id>tag:bytepawn.com,2022-07-22:/the-culture-map.html</id><summary type="html">&lt;p&gt;The Culture Map by Erin Meyer is a system of 8 scales which can be used to determine how cultures vary along a spectrum. The scales can be used to analyse one culture relative to another and decode how culture influences your international collaborations. I find the considerations in the book helpful irrespective of cultural background; the 8 scales can be applied to individuals as well, irrespective of where they are from.&lt;br/&gt;&lt;br/&gt; &lt;img src="/images/culture_map_cover.jpg" alt="The Culture Map" style="width: 200px;"/&gt;&lt;/p&gt;</summary><content type="html">&lt;p&gt;The Culture Map by Erin Meyer is a great business management book on how to communicate and manage people from different cultures, or just people with different ways of working. These are my notes on the book, which I recommend you &lt;a href="https://www.amazon.com/Culture-Map-Breaking-Invisible-Boundaries/dp/1610392507"&gt;buy right now&lt;/a&gt;.&lt;/p&gt;
&lt;p&gt;&lt;img src="/images/culture_map_cover.jpg" alt="The Culture Map" style="width: 300px;"/&gt;&lt;/p&gt;
&lt;h2&gt;Introduction&lt;/h2&gt;
&lt;p&gt;The Culture Map by &lt;a href="https://en.wikipedia.org/wiki/Erin_Meyer"&gt;Erin Meyer&lt;/a&gt; is a system of 8 scales which can be used to determine how cultures vary along a spectrum. The scales can be used to analyse one culture relative to another and decode how culture influences your international collaborations. If you take the example of an Israeli executive who has been appointed to manage a newly purchased factory in Russia, and compare where both countries are on the scale, you can see where difficulties could arise.&lt;/p&gt;
&lt;p&gt;&lt;img src="/images/culture_map_1.jpg" alt="An example of a culture map" style="width: 400px;"/&gt;&lt;/p&gt;
&lt;p&gt;The author's website has a non-free &lt;a href="https://erinmeyer.com/tools/culture-map-premium/"&gt;Culture Mapping Tool&lt;/a&gt; that draws the above plot for the countries selected, i.e. the country composition of your team(s).&lt;/p&gt;
&lt;p&gt;Let's now look at the 8 dimensions considered in the book.&lt;/p&gt;
&lt;h2&gt;Communicating&lt;/h2&gt;
&lt;p&gt;&lt;strong&gt;Low context:&lt;/strong&gt; say it like it is; precise, simple, clear, intended to be taken at face value.&lt;br/&gt;
&lt;strong&gt;High context:&lt;/strong&gt; messages are nuanced, implied, need to be "decoded" or read between the lines.&lt;/p&gt;
&lt;p&gt;&lt;img src="/images/culture_map_2.jpg" alt="Culture map" style="width: 400px;"/&gt;&lt;/p&gt;
&lt;p&gt;Americans are the most explicit or low-context culture there is (low-context meaning their conversation assumes relatively little intuitive understanding). This is not surprising for a young country composed of immigrants that prides itself on straight-talking. Japan and other East Asian countries represent the other extreme.&lt;/p&gt;
&lt;p&gt;Meyer offers strategies for negotiating these differences, but the most basic solution, as with all scales discussed in the book, is simply to be aware. Thus Americans in Japan should pay attention to what's not being said; while Japanese in America should brace themselves for direct language.&lt;/p&gt;
&lt;h2&gt;Evaluating&lt;/h2&gt;
&lt;p&gt;How directly (negative) feedback is given. Cultures that are direct wrt communication are not always direct wrt feedback, e.g. the US.&lt;/p&gt;
&lt;p&gt;&lt;img src="/images/culture_map_3.jpg" alt="Culture map" style="width: 400px;"/&gt;&lt;/p&gt;
&lt;p&gt;Americans may be very explicit communicators, but they are in the middle of the spectrum when it comes to giving negative feedback — as anyone who has been to an American school knows. Israelis, Russians, and Dutch are among the most direct when it comes to negative feedback. Japanese are among the most indirect.&lt;/p&gt;
&lt;h2&gt;Persuading&lt;/h2&gt;
&lt;p&gt;&lt;strong&gt;Concept first:&lt;/strong&gt; like in a Prussian school, first learn the theory, then apply it. In a business setting, this means first give theoretical arguments, then get to the actual business application.&lt;br/&gt;
&lt;strong&gt;Application first:&lt;/strong&gt; focus on solving the problem at hand, and worry about the theory later. In a business setting, this means starting with the proposed action item, and then explaining why it's a good idea.&lt;/p&gt;
&lt;p&gt;&lt;img src="/images/culture_map_4.jpg" alt="Culture map" style="width: 400px;"/&gt;&lt;/p&gt;
&lt;p&gt;Some cultures, notably the French and Italians, tend toward deductive arguments, focusing on theories and complex concepts before presenting a fact, statement, or opinion. Others, notably Anglo-Saxon cultures, tend toward inductive arguments, focusing first on practical application before moving to theory.&lt;/p&gt;
&lt;p&gt;This trait shows up in everything from how people give presentations or lead meetings to how they write emails.&lt;/p&gt;
&lt;h2&gt;Leading&lt;/h2&gt;
&lt;p&gt;&lt;strong&gt;Egalitarian:&lt;/strong&gt; the manager is a member of the team, a servant to the team. Decisions are based with more involvement from the team. Flat structure, titles are not that important internally.&lt;br/&gt;
&lt;strong&gt;Hierarchical:&lt;/strong&gt; the manager is a boss, and tells team members what to do—and team members expect to be told what to do. Status and titles are important.&lt;/p&gt;
&lt;p&gt;&lt;img src="/images/culture_map_5.jpg" alt="Culture map" style="width: 400px;"/&gt;&lt;/p&gt;
&lt;p&gt;"In Denmark, it is understood that the managing director is one of the guys, just two small steps up from the janitor," a Danish executive told Meyer. This represents one extreme in attitudes toward leadership. On the other side of the spectrum in countries like Japan and Korea, however, the ideal boss should stand far above the workers at the top of a hierarchy. America's outlook on leadership falls somewhere in the middle.&lt;/p&gt;
&lt;h2&gt;Deciding&lt;/h2&gt;
&lt;p&gt;&lt;strong&gt;Consensual:&lt;/strong&gt; team makes decisions together, as a quorum.&lt;br/&gt;
&lt;strong&gt;Top-down:&lt;/strong&gt; the boss makes decisions and lets everybody know.&lt;/p&gt;
&lt;p&gt;&lt;img src="/images/culture_map_6.jpg" alt="Culture map" style="width: 400px;"/&gt;&lt;/p&gt;
&lt;p&gt;How organizations make decisions relates closely to how they view leadership, but with some important differences.&lt;/p&gt;
&lt;p&gt;Notably, while Japan has a very hierarchical leadership system, it has a very consensual decision-making system. This is the famous ringi system, which involves building consensus at a lower level before bringing a proposal to a higher level, thus enabling broad corporate consensus.&lt;/p&gt;
&lt;p&gt;Note: it's possible to have (a) an egalitarian team, where the manager doesn't tell people exactly what to do. It's also possible to have a hierarchical structure with either (b) consensual decision making or (c) top-down decision making. In case (a), decision making on most issues is delegated to team members because of trust. In (b), decisions are made together, but then the manager tells everybody what to do. In (c), the manager decides on his own and then tells everybody what to do.&lt;/p&gt;
&lt;h2&gt;Trusting&lt;/h2&gt;
&lt;p&gt;&lt;strong&gt;Task-based:&lt;/strong&gt; business relationships are just that, and do not require or assume personal friendship. Business partners may not trust each other (completely), but believe that contracts and the legal system create boundary conditions that channel the other party into doing their part, and protect them if they don't.&lt;br/&gt;
&lt;strong&gt;Relationship-based:&lt;/strong&gt; business relationships are only established with or after creating a personal connection and some personal trust, like getting to know each other, having lunch or dinner, going out for drinks, or in some cultures, getting connected to each other through trusted parties.&lt;/p&gt;
&lt;p&gt;&lt;img src="/images/culture_map_7.jpg" alt="Culture map" style="width: 400px;"/&gt;&lt;/p&gt;
&lt;p&gt;In some cultures, notably America, people don't worry so much about trusting each other because they trust their legal system to enforce contracts, and so business negotiations focus on what's practical.&lt;/p&gt;
&lt;p&gt;In others, including many emerging market economies but also to a lesser extent Western Europe, personal relationships are much more important, in part because people don't trust their legal system to enforce contracts.&lt;/p&gt;
&lt;h2&gt;Disagreeing&lt;/h2&gt;
&lt;p&gt;&lt;strong&gt;Confrontational:&lt;/strong&gt; open confrontation/disagreement is okay, and won't hurt the professional/personal relationship.&lt;br/&gt;
&lt;strong&gt;Non-confrontational:&lt;/strong&gt; open confrontation/disagreement is seen as unproductive and/or impolite.&lt;/p&gt;
&lt;p&gt;&lt;img src="/images/culture_map_8.jpg" alt="Culture map" style="width: 400px;"/&gt;&lt;/p&gt;
&lt;p&gt;Some cultures embrace confrontation while others avoid it. This scale looks a lot like the scale showing the directness of negative feedback, though with some differences, such as Sweden being further to the left (direct) on negative feedback and further to the right (avoiding confrontation) on disagreeing.&lt;/p&gt;
&lt;h2&gt;Scheduling&lt;/h2&gt;
&lt;p&gt;&lt;strong&gt;Linear time:&lt;/strong&gt; 10AM means 10AM, a few minutes late is okay. Time estimates are assumed to be somewhat reliable and taken seriously.&lt;br/&gt;
&lt;strong&gt;Flexible time:&lt;/strong&gt; 10AM can mean anything from 10AM to 12PM. Appointment timings and time estimates should not be taken too seriously.&lt;/p&gt;
&lt;p&gt;&lt;img src="/images/culture_map_9.jpg" alt="Culture map" style="width: 400px;"/&gt;&lt;/p&gt;
&lt;p&gt;That different cultures treat time differently is one of the most common observations for anyone working or even traveling abroad. On one extreme you've got the exceedingly precise Germans and Swiss; Americans fall relatively close to this end of the spectrum; Western Europeans and Latin Americans tend to be more flexible; Africa, the Middle East, and India are extremely flexible.&lt;/p&gt;</content><category term="book"></category><category term="management"></category><category term="culture"></category></entry><entry><title>Working Backwards</title><link href="https://bytepawn.com/working-backwards.html" rel="alternate"></link><published>2022-07-10T00:00:00+02:00</published><updated>2022-07-10T00:00:00+02:00</updated><author><name>Marton Trencseni</name></author><id>tag:bytepawn.com,2022-07-10:/working-backwards.html</id><summary type="html">&lt;p&gt;Amazon established a set of principles and mechanisms, enabling the company to grow from a single founder to several hundred thousand employees while remaining stubbornly true to its mission of obsessing over customers to create long-term shareholder value.&lt;br/&gt;&lt;br/&gt; &lt;img src="/images/working_backwards1.jpg" alt="Working Backwards" style="width: 200px;"/&gt;&lt;/p&gt;</summary><content type="html">&lt;p&gt;This is a great business management book. These are my notes of the book, which I recommend you should &lt;a href="https://www.amazon.com/Working-Backwards-Insights-Stories-Secrets-ebook/dp/B08BYCQBZN"&gt;buy right now&lt;/a&gt;.&lt;/p&gt;
&lt;h2&gt;Leadership principles and mechanisms&lt;/h2&gt;
&lt;p&gt;Amazon established a set of principles and mechanisms, enabling the company to grow from a single founder to several hundred thousand employees while remaining stubbornly true to its mission of obsessing over customers to create long-term shareholder value. Per Jeff Bezos:&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;You can write down your corporate culture, but when you do so, you’re discovering it, uncovering it—not creating it.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;&lt;img src="/images/working_backwards1.jpg" alt="Working Backwards" style="width: 250px;"/&gt;&lt;/p&gt;
&lt;p&gt;Amazon’s 14 Leadership Principles:&lt;/p&gt;
&lt;ol&gt;
&lt;li&gt;&lt;strong&gt;Customer Obsession:&lt;/strong&gt; Leaders start with the customer and work backwards. They work vigorously to earn and keep customer trust. Although leaders pay attention to competitors, they obsess over customers.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Ownership:&lt;/strong&gt; Leaders are owners. They think long term and don’t sacrifice long-term value for short-term results. They act on behalf of the entire company, beyond just their own team. They never say “that’s not my job.”&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Invent and Simplify:&lt;/strong&gt; Leaders expect and require innovation and invention from their teams and always find ways to simplify. They are externally aware, look for new ideas from everywhere, and are not limited by “not invented here”. Because we do new things, we accept that we may be misunderstood for long periods of time.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Leaders Are Right, A Lot:&lt;/strong&gt; Leaders are right a lot. They have strong judgement and good instincts.  They seek diverse perspectives and work to disconfirm their beliefs.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Learn and Be Curious:&lt;/strong&gt; Leaders are never done learning and always seek to improve themselves. They are curious about new possibilities and act to explore them.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Hire and Develop the Best:&lt;/strong&gt; Leaders raise the performance bar with every hire and promotion. They recognise people with exceptional talent and willingly move them throughout the organisation. Leaders develop leaders and are serious about their role in coaching others.  We work on behalf of our people to invent mechanisms for development like Career Choice.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Insist on the Highest Standards:&lt;/strong&gt; Leaders have relentlessly high standards – many people may think these standards are unreasonably high. Leaders are continually raising the bar and driving their teams to deliver high quality products, services and processes. Leaders ensure that defects do not get sent down the line and that problems are fixed so they stay fixed.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Think Big:&lt;/strong&gt; Thinking small is a self-fulfilling prophecy. Leaders create and communicate a bold direction that inspires results. They think differently and look around corners for ways to serve customers.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Bias for Action:&lt;/strong&gt; Speed matters in business. Many decisions and actions are reversible and do not need extensive study. We value calculated risk taking.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Frugality:&lt;/strong&gt; Accomplish more with less. Constraints breed resourcefulness, self-sufficiency and invention.  There are no extra points for growing headcount, budget size or fixed expense.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Earn Trust:&lt;/strong&gt; Leaders listen attentively, speak candidly, and treat others respectfully. They are vocally self-critical, even when doing so is awkward or embarrassing.  Leaders do not believe their or their team’s body odour smells of perfume.  They benchmark themselves and their teams against the best.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Dive Deep:&lt;/strong&gt; Leaders operate at all levels, stay connected to the details, audit frequently, and are sceptical when metrics and anecdote differ. No task is beneath them.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Have Backbone; Disagree and Commit:&lt;/strong&gt; Leaders are obligated to respectfully challenge decisions when they disagree, even when doing so is uncomfortable or exhausting. Leaders have conviction and are tenacious. They do not compromise for the sake of social cohesion. Once a decision is determined, they commit wholly.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Deliver Results:&lt;/strong&gt; Leaders focus on the key inputs for their business and deliver them with the right quality and in a timely fashion. Despite setbacks, they rise to the occasion and never compromise.&lt;/li&gt;
&lt;/ol&gt;
&lt;p&gt;Good intentions don't work; mechanisms do. No company can rely on good intentions. Amazon has put in place mechanisms to ensure that the Leadership Principles translate into action. Three foundational mechanisms:&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Annual planning process:&lt;/strong&gt; starts previous year summer and requires 1-2 months of managers' time. Executive team comes up with high level goals like "Grow revenue from $10M to $15M", which are then cascaded down the orgs. All goals are SMART goals. Each org builds its own goals and operating plan, which includes:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;assessment of past performance, including goals achieved, goals missed, and lessons learned&lt;/li&gt;
&lt;li&gt;key initiatives for the coming year&lt;/li&gt;
&lt;li&gt;detailed income statement&lt;/li&gt;
&lt;li&gt;requests and justifications for resources (hires, marketing spend, equipment, etc)&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;After the year completes, in January, a second phase adjusts the goals and plans for what happened in Q4. Once this narrative (goals and operating plan) is locked, changing it requires executive team approval.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Executive team's goals process:&lt;/strong&gt; certain goals from orgs are selected by the executive team and elevated to company goal status, plus they come up with their own. Executive goals can number in the 100s at Amazon! These are mainly input-focused metrics that measure certain activities teams are performing during the year. E.g. "add 1,000 products to Store X" or "99.99% of API calls to service X are complete within 10ms". Executive team goals are tracked by the Finance dept and reviewed quarterly (red, yellow, green).&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Compensation plan:&lt;/strong&gt; reinforces long-term thinking. The bulk of an employee's compensation is always stock (vs base income or cash bonus) to avoid misalignment: (i) rewarding short-term goals at the expense of the long term and (ii) rewarding localized department achievements irrespective of whether they're good for the company.&lt;/p&gt;
&lt;h2&gt;Hiring: Bar raiser process&lt;/h2&gt;
&lt;p&gt;When you consider the potential positive and negative impacts of an important hire, not to mention the precious time dedicated to it, it is shocking to consider how little rigor and analysis most companies put into their hiring process.&lt;/p&gt;
&lt;p&gt;Bad hiring comes from:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;personal bias: surround yourself with people similar to you&lt;/li&gt;
&lt;li&gt;hiring urgency: filling headcount just to look good&lt;/li&gt;
&lt;li&gt;relying on gut feeling, "I like him/her"&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;Good hiring: scalable, repeatable, formal process for consistently making appropriate and successful hiring decisions. Easy to teach to new people, has feedback loop to ensure continuous improvement.&lt;/p&gt;
&lt;p&gt;Bar raisers: group of "senior" interviewers who must be involved in all hiring, and &lt;strong&gt;have veto power&lt;/strong&gt; over the hiring manager. The bar raiser must come from outside of the team that is hiring.&lt;/p&gt;
&lt;p&gt;Amazon hiring process:&lt;/p&gt;
&lt;ol&gt;
&lt;li&gt;&lt;strong&gt;Job description:&lt;/strong&gt; You cannot hire the right person for the job if you don’t have a clearly defined job description. This is an essential point-of-reference for interviewers. A good description must be specific and focused.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Resume Review:&lt;/strong&gt; Resumes should meet the requirements spelled out in the job description.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Phone Screen:&lt;/strong&gt; 45-60 minute call with the hiring manager (once resumes are screened). Based on this preliminary data, the hiring manager decides whether they would be inclined to hire the candidate. If so, the candidate is invited to the next step.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;In-house interview:&lt;/strong&gt; No participant is more than one level below the candidate’s position. The pool of interviewers should not include the prospective boss.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Written Feedback:&lt;/strong&gt; Interviewers generate detailed notes as close to a verbatim record as possible. Written feedback must be thorough, detailed, and specific. The report should be written shortly after completion of the interview. Oral feedback is unacceptable. Written feedback includes hiring recommendation based on four possible options: strongly inclined to hire, inclined to hire, not inclined to hire, or strongly not inclined to hire. Interviewers do not discuss their feedback until submitting their own report (to avoid bias).&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Debrief/Hiring Meeting:&lt;/strong&gt; After written feedback is submitted, the interview team meets to debrief and make the hiring decision. Team reviews interview feedback and has a chance to change their vote based on the cumulative information.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Reference check&lt;/strong&gt;&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Offer&lt;/strong&gt;&lt;/li&gt;
&lt;/ol&gt;
&lt;p&gt;Every new hire should “raise the bar,” that is, be better in one important way (or more) than the other members of the team they join.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Behavioral interviewing:&lt;/strong&gt; The process of evaluating a candidate’s past behavior and their compatibility with Amazon’s Leadership Principles. Each interviewer is assigned one or more of the 14 Principles to focus on in their interview (in order to learn how specifically the candidate aligns with the assigned principle). Questions are mapped to assigned principles.&lt;/p&gt;
&lt;p&gt;Example: “Can you give me an example of a time when your team proposed to launch a new product or initiative and you pushed back on their plan because you didn’t think it was good enough?”&lt;/p&gt;
&lt;p&gt;STAR questions (Situation, Task, Action, Result):&lt;/p&gt;
&lt;ol&gt;
&lt;li&gt;What was the &lt;strong&gt;s&lt;/strong&gt;ituation?&lt;/li&gt;
&lt;li&gt;What were you &lt;strong&gt;t&lt;/strong&gt;asked with?&lt;/li&gt;
&lt;li&gt;What &lt;strong&gt;a&lt;/strong&gt;ctions did you take?&lt;/li&gt;
&lt;li&gt;What was the &lt;strong&gt;r&lt;/strong&gt;esult?&lt;/li&gt;
&lt;/ol&gt;
&lt;h2&gt;Organizing: Separable, Single-threaded leadership&lt;/h2&gt;
&lt;p&gt;Velocity, a measure of speed and direction, is critical for a business. With all other things being equal, the organization that moves faster will innovate more, simply because it will be able to conduct a higher number of experiments per unit of time.&lt;/p&gt;
&lt;p&gt;Dependencies can slow down innovation and the rate at which corporate teams operate. When a team cannot operate independently, its progress slows to whatever or wherever the gating dependency resides. As organizations become more interdependent and complex, an inordinate amount of time is spent on internal communication and coordination.&lt;/p&gt;
&lt;p&gt;Too much of any kind of dependency not only slows down the pace of innovation but also creates a dispiriting second-order effect: disempowered teams. Amazon determined that improving coordination and communication doesn’t resolve the problem of dependencies. Amazon instead looked to eliminate dependencies. Per Jeff Bezos: Eliminate communication rather than encourage it in order to make Amazon a place where builders can build.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Single-Threaded Leadership:&lt;/strong&gt; is Amazon’s mechanism whereby individuals are responsible for single, focused initiatives. These leaders have specialized responsibilities rather than a broad set of responsibilities. They run teams that are largely autonomous.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Two-Pizza Teams:&lt;/strong&gt; An early Amazon approach to creating more autonomous teams. Teams were no bigger than the number of people who could be fed by two large pizzas (no more than 10 people). Amazon’s software architecture was highly influenced by this approach (e.g. modular APIs for services and data).&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Autonomous teams are built for speed.&lt;/strong&gt; Autonomous teams should have a clearly defined purpose, clearly defined boundaries, and useful metrics for tracking progress. It’s up to the team to figure out the specifics of how they will achieve their goal. The most successful teams made initial investments in removing dependencies and building infrastructure and instrumentation before adding new features (aka they laid the groundwork for future innovation).&lt;/p&gt;
&lt;p&gt;Per Jeff Bezos: Most decisions should probably be made with somewhere around 70% of the information you wish you had. If you wait for 90%, in most cases, you’re probably being slow...if you’re good at course correcting, being wrong may be less costly than you think, whereas being slow is going to be expensive for sure. Be stubborn on the vision but flexible on the details.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Single-Threaded Leaders:&lt;/strong&gt; Leaders whose focus is to get a specific job done. They don’t work on anything else. Separable, single-threaded teams have fewer organizational dependencies than conventional teams. &lt;strong&gt;The best way to fail at inventing something is by making it somebody’s part-time job.&lt;/strong&gt;&lt;/p&gt;
&lt;h2&gt;Communication: Narratives and the 6-pager&lt;/h2&gt;
&lt;p&gt;Amazon relies far more on the written word to develop and communicate ideas than most companies, and this difference makes for a huge competitive advantage. Unlike most companies, &lt;strong&gt;Amazon does not  use PowerPoint&lt;/strong&gt; (or any kind of presentation software). Better informed people make higher-quality decisions, and can deliver better, more detailed feedback.&lt;/p&gt;
&lt;p&gt;Written narratives are the primary communication tool used for proposals, plans, and process documentation. The company uses two main types of narrative:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;&lt;strong&gt;The six-pager:&lt;/strong&gt; A document used to describe, review, or propose an idea, process or business.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;The PR/FAQ:&lt;/strong&gt; A document used to develop and iterate on product ideas.&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;Per Jeff Bezos: The reason writing a good 4 page memo is harder than writing a 20 page PowerPoint is because the narrative structure of a good memo forces better thought and better understanding of what’s more important than what, and how things are related. PowerPoint-style presentations somehow give permission to gloss over ideas, flatten out any sense of relative importance, and ignore the interconnectedness of ideas.&lt;/p&gt;
&lt;p&gt;Benefits of an effective six-pager:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;Six pages is optimal for a 60 minute meeting, where participants spend the first 20 minutes reading the document individually (in silence, while taking notes).&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Document contains all essential information.&lt;/strong&gt; Participants can review presentation, uninterrupted, in its entirety. The document is standalone and doesn’t require a presenter to fill in the blanks.&lt;/li&gt;
&lt;li&gt;The document is portable and scalable. It can be circulated easily and read by anyone at any time.&lt;/li&gt;
&lt;li&gt;Anyone can edit or add notes to the document.&lt;/li&gt;
&lt;li&gt;The document serves as its own record.&lt;/li&gt;
&lt;li&gt;Written narratives contain 7-9 times the information density of a PowerPoint.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;People read three times faster than the typical presenter can talk.&lt;/strong&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;A complete narrative should also anticipate the likely objections, concerns, and alternate points of view that we expect our team to deliver. Writers will be forced to anticipate smart questions, reasonable objections, even common misunderstandings—and to address them proactively in their narrative document.&lt;/p&gt;
&lt;p&gt;The six-pager can be used to explore any argument or idea you want to present to a group of people—an investment, a potential acquisition, a new product or feature, a monthly or quarterly business update, an operating plan, or even an idea on how to improve the food at the company cafeteria.&lt;/p&gt;
&lt;p&gt;Meeting format:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;&lt;strong&gt;Presenter does not verbally repeat the document.&lt;/strong&gt;&lt;/li&gt;
&lt;li&gt;20 minutes of reading the narrative.&lt;/li&gt;
&lt;li&gt;40 minutes of discussion.&lt;/li&gt;
&lt;li&gt;Someone (not the presenter) should be charged with taking notes on behalf of the audience.&lt;/li&gt;
&lt;/ul&gt;
&lt;h2&gt;Start with the desired customer experience&lt;/h2&gt;
&lt;p&gt;Amazon's approach to product and business development is to start by defining the customer experience. Once the desired experience is defined, Amazon works backwards from there to figure out how to realize said experience.&lt;/p&gt;
&lt;p&gt;The PR/FAQ is a document that imagines a product release has occurred. The employee writes a press release that details the product, its features, pricing, rationale, and more. The FAQ portion answers questions that both customers and outside observers will have, as well as questions that internal peers and managers might have.&lt;/p&gt;
&lt;p&gt;Most companies work forward: leaders define the product and try to generate consumer interest. This is the classic “a solution in search of a problem” mentality. PR/FAQs are less expensive product experiments. Teams can determine whether a project is worthwhile or not without spending precious R&amp;amp;D money in advance.&lt;/p&gt;
&lt;p&gt;Basic format (follows the 6-page rule for 1-hour meetings):&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;1 page press release.&lt;/li&gt;
&lt;li&gt;5 pages or fewer for frequently asked questions.&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;Press release format:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;Heading (one sentence): “Blue Corp. announces the launch of Melinda, the smart mailbox.”&lt;/li&gt;
&lt;li&gt;Subheading (one sentence) describing the benefits: “Melinda is the physical mailbox designed to securely receive and keep safe all your e-commerce and grocery deliveries.”&lt;/li&gt;
&lt;li&gt;Summary paragraph: Give a more detailed summary of the product and benefit.&lt;/li&gt;
&lt;li&gt;Problem paragraph: Describe the problem the service solves from the point of view of the customer.&lt;/li&gt;
&lt;li&gt;Solution paragraph: Describe how the product solves the customer problem.&lt;/li&gt;
&lt;li&gt;Quotes and getting started paragraph: Describe how a customer can obtain the product and how much it will cost. Include one quote from a company spokesperson and one quote from a hypothetical customer.&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;Sample FAQ questions:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;How many consumers have this need or problem?&lt;/li&gt;
&lt;li&gt;How big is the need?&lt;/li&gt;
&lt;li&gt;How many consumers are willing to spend money to solve this problem? How much money will they spend?&lt;/li&gt;
&lt;li&gt;What are the unit economics of the device?&lt;/li&gt;
&lt;li&gt;What is the rationale for the price point?&lt;/li&gt;
&lt;li&gt;What is the initial up-front investment required to build the product?&lt;/li&gt;
&lt;li&gt;If 3rd party adoption is needed, how will they be induced to use the product?&lt;/li&gt;
&lt;li&gt;Are there 3rd party technologies required?&lt;/li&gt;
&lt;li&gt;What are the challenging engineering problems to solve?&lt;/li&gt;
&lt;li&gt;What are the customer UI issues?&lt;/li&gt;
&lt;li&gt;How can we manage the risk of the up-front investment needed?&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;Most PR/FAQs never make it past the exploration stage. This is a feature not a bug. Rejecting projects via the PR/FAQ process is far less expensive than actually developing failed products.&lt;/p&gt;
&lt;h2&gt;Metrics: Manage your inputs, not your outputs&lt;/h2&gt;
&lt;p&gt;Focus on the controllable input metrics, the activities you directly control, which ultimately affect output metrics such as share price. Output metrics show results. Input metrics provide guidance.&lt;/p&gt;
&lt;p&gt;&lt;img src="/images/amazon-flywheel2.png" alt="Working Backwards" style="width: 250px;"/&gt;&lt;/p&gt;
&lt;p&gt;DMAIC (Six Sigma process improvement method) is as follows:&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Define&lt;/strong&gt; the metrics you want to measure. Amazon tracks things like selection, price, and convenience. These input metrics drive output metrics like orders, revenue, and profit. Amazon uses the “flywheel” concept from Jim Collins’ Good to Great. “You inject energy into any one element, or all of them, the flywheel spins faster.” The goal is to find the things that matter the most (but the process of identifying the optimal inputs is iterative).&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Measure:&lt;/strong&gt;  Data must be collected and presented in a usable format. Align the metrics with the customer experience. Make sure you are able to regularly audit the metrics.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Analyze:&lt;/strong&gt; The objective in this stage is separating signals from noise, and then identifying and addressing root causes. When Amazon teams come across a surprise or a perplexing problem with the data they are relentless until they discover the root cause. Use the “Five Whys” method developed by Toyota. Keep drilling down with “why” questions which may sit layers down a chain of probing questions.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Improve:&lt;/strong&gt; This step can only happen once sufficient investments in the first three stages have occurred. Example: If you reach a weekly 95% in-stock rate for products, figure out how to get to a 98% rate.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Control:&lt;/strong&gt; Ensure that processes are operating as expected and maintain performance levels. This step is also a chance to identify opportunities for automation.&lt;/p&gt;
&lt;p&gt;The &lt;strong&gt;Weekly Business Review (WBR)&lt;/strong&gt; is a tactical operational meeting to analyze performance trends of the prior week. At Amazon, it was not the time to discuss new strategies, project updates, or upcoming product releases. Amazon’s internal reporting that allows for effective input management. Unlike the Six-pager or PR/FAQ, this data package is heavy with graphs, tables, charts, and visuals. The document may include explanatory notes for the metrics. The deck represents a data-driven, end-to-end view of the business. &lt;strong&gt;Emerging patterns are a key point of focus.&lt;/strong&gt; Graph plot results against comparable prior periods, this is critical for trend identification. A consistent format is maintained for ease of use and efficient interpretation. Anecdotes and exception reporting are woven into the deck. &lt;strong&gt;Highlight unusual situations or developments.&lt;/strong&gt;&lt;/p&gt;</content><category term="amazon"></category><category term="book"></category><category term="management"></category><category term="writing"></category></entry><entry><title>HBR Guide to Better Business Writing</title><link href="https://bytepawn.com/hbr-guide-to-better-business-writing.html" rel="alternate"></link><published>2022-07-08T00:00:00+02:00</published><updated>2022-07-08T00:00:00+02:00</updated><author><name>Marton Trencseni</name></author><id>tag:bytepawn.com,2022-07-08:/hbr-guide-to-better-business-writing.html</id><summary type="html">&lt;p&gt;If your writing is sloppy and artless people will think you are the same. They won’t care about your message, they won’t do business with you. It’s not true that only ideas matter. Good writing gets ideas noticed.&lt;br/&gt;&lt;br/&gt; &lt;img src="/images/hbr_bbw.png" alt="HBR Guide to Better Business Writing" style="width: 200px;"/&gt;&lt;/p&gt;</summary><content type="html">&lt;p&gt;This is one of my favorite practical advice books. 
I read it first in 2015, and multiple times since then. These are my notes of the book, which I recommend you should &lt;a href="https://www.amazon.com/HBR-Guide-Better-Business-Writing-ebook/dp/B00B0YPJ0G"&gt;buy right now&lt;/a&gt;.&lt;/p&gt;
&lt;h2&gt;Introduction&lt;/h2&gt;
&lt;p&gt;If your writing is sloppy and artless people will think you are the same. They won’t care about your message, they won’t do business with you. It’s not true that only ideas matter. &lt;strong&gt;Good writing gets ideas noticed.&lt;/strong&gt; Those who write poorly create barriers between themselves and their readers. Those who write well connect with their readers, open their minds, and achieve goals. Don’t waste your reader’s time. If you’re in business and you’re writing, you’re a professional writer.&lt;/p&gt;
&lt;p&gt;Qualities of good writing:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;an intense focus on your reasons for writing, and on your reader’s needs&lt;/li&gt;
&lt;li&gt;preference for the simplest words possible&lt;/li&gt;
&lt;li&gt;feel for natural idioms&lt;/li&gt;
&lt;li&gt;appreciation for the right words in the right places&lt;/li&gt;
&lt;li&gt;an ear for tone&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;&lt;img src="/images/hbr_bbw.png" alt="HBR Guide to Better Business Writing" style="width: 250px;"/&gt;&lt;/p&gt;
&lt;h2&gt;Know why you’re writing&lt;/h2&gt;
&lt;p&gt;Say clearly what your issue is and what you are trying to accomplish. With every sentence, ask yourself whether you’re advancing that cause. Consider your audience and purpose before starting to write, and let these guide what you say and how you say it. &lt;strong&gt;Plainly state the issue you’re addressing and what you hope to achieve.&lt;/strong&gt; Keep your goal in mind: don’t undermine your efforts with hostile or inappropriate tone.&lt;/p&gt;
&lt;h2&gt;Understand your readers&lt;/h2&gt;
&lt;p&gt;&lt;strong&gt;Respect your readers’ time constraints. They are very busy.&lt;/strong&gt; They have little sense of duty to read what you put in front of them. If they don’t get your point quickly, they’ll leave. At the slightest need to struggle to understand you, they’ll stop trying—and think less of you.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Prove quickly that you have something valuable to say—valuable to your readers.&lt;/strong&gt; Waste no time in saying it. Write with such clarity and efficiency that reading your material is easy, even enjoyable. Use tone that makes you likable.&lt;/p&gt;
&lt;p&gt;Tailor your message to your audience. A good trick for connecting with a large audience is to write for a particular reader. Warren Buffett recommends:&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;When writing Berkshire Hathaway’s annual report, I pretend that I’m talking to my sisters. I have no trouble picturing them: though highly intelligent, they are not experts in accounting or finance. They will understand plain English, but jargon may puzzle them. My goal is simply to give them the information I would wish them to supply me if our positions were reversed. To succeed, I don’t need to be Shakespeare; I must, though, have a sincere desire to inform.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;Recap:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;Understand that your readers have no time to waste: get to the point quickly and clearly to ensure that your message gets read.&lt;/li&gt;
&lt;li&gt;Use a tone appropriate for your audience.&lt;/li&gt;
&lt;li&gt;Emphasize the items most important to your readers. If they can easily see how your message is relevant to them, they will be more likely to read it and respond.&lt;/li&gt;
&lt;li&gt;Choose an intelligent, nonspecialist member of your audience to write for and focus on writing for that person. Your message will be more accessible and persuasive to all your readers as a result.&lt;/li&gt;
&lt;/ul&gt;
&lt;h2&gt;Divide the writing process into four separate tasks (MACJ)&lt;/h2&gt;
&lt;ol&gt;
&lt;li&gt;The &lt;strong&gt;Madman&lt;/strong&gt; gathers material and generates ideas.&lt;/li&gt;
&lt;li&gt;The &lt;strong&gt;Architect&lt;/strong&gt; organizes information by drawing up an outline, however simple.&lt;/li&gt;
&lt;li&gt;The &lt;strong&gt;Carpenter&lt;/strong&gt; puts your thoughts into words, laying out sentences and paragraphs by following the Architect’s plan.&lt;/li&gt;
&lt;li&gt;The &lt;strong&gt;Judge&lt;/strong&gt; is your quality-control character, polishing the expression throughout—everything from tightening the language to correcting grammar and punctuation.&lt;/li&gt;
&lt;/ol&gt;
&lt;h2&gt;Three main points (Architect)&lt;/h2&gt;
&lt;p&gt;When writing a short piece like an email, &lt;strong&gt;write down your three main points first in full sentences&lt;/strong&gt;. People can’t hold more than 3 things in their heads. Spell out your logic as clearly as you can. Force yourself to think through your reasons.&lt;/p&gt;
&lt;p&gt;Recap:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;Find your focus by first generating a list of topics to cover.&lt;/li&gt;
&lt;li&gt;Develop these raw ideas into full sentences and categorize your main points in sets of threes.&lt;/li&gt;
&lt;li&gt;Arrange these sets in a logical order, keeping your reader’s needs in mind.&lt;/li&gt;
&lt;/ul&gt;
&lt;h2&gt;Write in full—rapidly (Carpenter)&lt;/h2&gt;
&lt;p&gt;&lt;strong&gt;Write the first draft as quickly as you can.&lt;/strong&gt; Once you’ve written your three main points so that you know where you’re going, you’re in Carpenter mode. Write as quickly as possible. Your sentences will be shorter than they otherwise would be, and your idioms more natural. If there’s a painful part of writing, it’s doing the first draft. When you shorten the duration, it’s not as painful.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Speed writing:&lt;/strong&gt; to prevent premature fussing, write against the clock. Allow yourself 5 or 10 minutes to draft each section. Don’t edit. This is not yet the time to let the Judge in. If you find yourself stumped, move on to a different section you’re more comfortable with and come back to the problem once you’ve found your flow.&lt;/p&gt;
&lt;h2&gt;Improve what you’ve written (Judge)&lt;/h2&gt;
&lt;p&gt;Once you have written a complete draft, you’ll revise first and then edit. Revising is a reconsideration of what you’re saying as a whole, and where you’re saying it. It’s rethinking the floor plan. Editing is more a matter of fine-tuning sentences and paragraphs. You need to allow time for both. On the one hand, don’t let some neurotic obsession with perfectionism delay important projects. On the other hand, don’t rashly send things out without proper vetting and improvement.&lt;/p&gt;
&lt;p&gt;Revising:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;Have I been utterly truthful?&lt;/li&gt;
&lt;li&gt;Have I said all that I need to say?&lt;/li&gt;
&lt;li&gt;Have I been appropriately diplomatic and fair?&lt;/li&gt;
&lt;li&gt;Do I have three parts—an opener, a middle, and a closer?&lt;/li&gt;
&lt;li&gt;In my opener, have I made my points quickly and clearly?&lt;/li&gt;
&lt;li&gt;In the middle, have I proved my points with specifics?&lt;/li&gt;
&lt;li&gt;Is my closer consistent with the rest, expressed freshly?&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;Editing:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;Where can I save some words?&lt;/li&gt;
&lt;li&gt;Is there a better way of phrasing this idea?&lt;/li&gt;
&lt;li&gt;Is my meaning unmistakable?&lt;/li&gt;
&lt;li&gt;Can I make it more interesting?&lt;/li&gt;
&lt;li&gt;Is the expression relaxed but refined?&lt;/li&gt;
&lt;li&gt;Does one sentence glide into the next, without discontinuities?&lt;/li&gt;
&lt;/ul&gt;
&lt;h2&gt;Be relentlessly clear&lt;/h2&gt;
&lt;p&gt;Clarity can be a double-edged sword. When you’re forthright enough to take a position or recommend a course of action, you’re sticking your neck out. &lt;strong&gt;People who don’t want to commit make their writing muddy.&lt;/strong&gt; Perhaps they’re trying to leave room for their views to evolve as events unfold. Or perhaps they’re hoping they can later claim credit for good results and deny responsibility for bad ones.
The fact is, though, that many readers will perceive them not as savvy wait-and-see participants but as spineless herd-followers who are slow to see opportunities within their reach. So clean up the mud.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Adopt the reader’s perspective:&lt;/strong&gt; always judge clarity from the reader’s standpoint—not your own. Your goal should be to write so unmistakably that your readers can’t possibly misunderstand or misinterpret.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Keep your language simple:&lt;/strong&gt; simplicity breeds clarity.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Show, don’t tell:&lt;/strong&gt; be specific enough that you lead your readers to draw their own conclusions (that match yours), as opposed to simply expressing your opinions without support and hoping people will buy them. Not: &lt;em&gt;“he was a bad boss”&lt;/em&gt;. Instead: &lt;em&gt;“he got a promotion based on his assistant’s detailed reports, but then—despite the company’s record profits—denied that assistant even routine cost-of-living raises”&lt;/em&gt;. A short, vague sentence like &lt;em&gt;“he was a bad boss”&lt;/em&gt; may register in the readers’ mind—but only as a personal impression that’s potentially biased. It’s credible only if its source (you) is credible. &lt;strong&gt;Concrete business writing is persuasive because it’s evidence-based, clear and memorable.&lt;/strong&gt; When you supply meaningful, objective details you’re sharing information, not just your opinion.&lt;/p&gt;
&lt;p&gt;Recap:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;Put yourself in the reader’s shoes to assess your clarity. Better yet, see whether a colleague can accurately summarize the main points of your draft from a quick read-through.&lt;/li&gt;
&lt;li&gt;Phrase your ideas as plainly and briefly as possible, aiming for an average sentence length of 20 or fewer words.&lt;/li&gt;
&lt;li&gt;Pave your readers’ way with concrete details. Don’t try to push them there with abstract assertions.&lt;/li&gt;
&lt;li&gt;Cultivate your letter writing to improve your writing skills more generally.&lt;/li&gt;
&lt;/ul&gt;
&lt;h2&gt;Learn to summarize—accurately&lt;/h2&gt;
&lt;p&gt;A good summary is focused and specific—and it’s at the beginning of your document so readers don’t have to dig. It gets to the point. It lays the foundation for what’s to follow. There’s no holding back on the crucial information.&lt;/p&gt;
&lt;p&gt;People often assume that shorter is better when it comes to summaries. But &lt;strong&gt;brevity without substance is worthless&lt;/strong&gt;. Never say more than the occasion demands—but never say less, either. Adopt the reader’s perspective: fill in as much information as it takes to get people up to speed.&lt;/p&gt;
&lt;p&gt;Recap:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;Summarize the vital information at the beginning of the document.&lt;/li&gt;
&lt;li&gt;Summarize each section with a sentence that addresses “the five Ws” (who, what, when, where, why) and how—and use these sentences to build your general summary.&lt;/li&gt;
&lt;li&gt;Provide only the information the reader needs to understand the issue—no more and no less.&lt;/li&gt;
&lt;/ul&gt;
&lt;h2&gt;Use chronology when giving a factual account&lt;/h2&gt;
&lt;p&gt;Create a chronology of relevant events to organize the narrative. Don’t jump into the middle. Once you’ve laid out a chronology of events, drafting the email becomes much easier—just a matter of stringing the events together and closing with your request (in the book’s example, asking to meet with Sarah before next Tuesday’s meeting). But avoid rote repetition of unimportant details like dates.&lt;/p&gt;
&lt;h2&gt;Be a stickler for continuity&lt;/h2&gt;
&lt;p&gt;Smooth writing consists of a &lt;strong&gt;sequence of well-joined sentences and paragraphs&lt;/strong&gt;, not a mere collection of them. This smooth sequencing requires good planning and skill in handling transitions, or links that help readers follow your train of thought.
Good subheads &lt;strong&gt;steer the reader from one idea to the next&lt;/strong&gt;. Types of connections:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;establish a time sequence: then, at that point, as soon as...&lt;/li&gt;
&lt;li&gt;establish place: there, up front, way back...&lt;/li&gt;
&lt;li&gt;add a point: and, or, further...&lt;/li&gt;
&lt;li&gt;underscore a point: after all, indeed, more important...&lt;/li&gt;
&lt;li&gt;concede a point: although, admittedly...&lt;/li&gt;
&lt;li&gt;return to a point: even so, nevertheless, still...&lt;/li&gt;
&lt;li&gt;give an example: for example...&lt;/li&gt;
&lt;li&gt;provide a reason: because, thus, since, therefore...&lt;/li&gt;
&lt;li&gt;set up contrast: but, yet, conversely...&lt;/li&gt;
&lt;li&gt;set up a conclusion: so, finally, to sum up...&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;No matter how smooth your transitions are between sentences and paragraphs, time-pressed readers will zone out if you place a solid wall of text in front of them. Break up your document with some signposts to lead people from section to section and let them quickly locate parts.
Make your subheads as consistent as you can.&lt;/p&gt;
&lt;p&gt;Recap:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;Use well-placed transitional phrases to guide the reader to your next idea and indicate its relationship to what came before.&lt;/li&gt;
&lt;li&gt;Break up documents with concise, descriptive subheads to increase readability and help readers quickly locate the information most important to them.&lt;/li&gt;
&lt;li&gt;Use a “summary” subhead to point your readers to the document’s highlights.&lt;/li&gt;
&lt;li&gt;Use consistent style and parallel syntax in your subheads to reinforce the document’s logical and rhetorical cohesion.&lt;/li&gt;
&lt;/ul&gt;
&lt;h2&gt;Learn the basics of correct grammar&lt;/h2&gt;
&lt;p&gt;Your readers may see your language—especially your use of your native language—as a reflection of your competence. &lt;strong&gt;Make lots of mistakes and you’ll come across as uneducated and uninformed.&lt;/strong&gt;  People will hesitate to trust your recommendation to launch a resource-intensive project, for example, or to buy goods or services. They may think you don’t know what you’re talking about.&lt;/p&gt;
&lt;h2&gt;Don’t anesthetize your readers&lt;/h2&gt;
&lt;p&gt;The best conversationalists and lecturers, no matter how obscure their topic, make it fascinating. They avoid trite expressions. They &lt;strong&gt;use strong, simple words&lt;/strong&gt;.&lt;/p&gt;
&lt;p&gt;Recap:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;Don’t overuse “I”. Use “we”, “our”, “you”, and “your” instead to add a personal touch and appeal to your reader.&lt;/li&gt;
&lt;li&gt;Avoid stuffiness by overcoming any fear you might have of contractions.&lt;/li&gt;
&lt;li&gt;For clearer, more straightforward writing, prefer active voice—unless the passive in a particular context sounds more natural.&lt;/li&gt;
&lt;li&gt;Vary the length and structure of your sentences.&lt;/li&gt;
&lt;li&gt;Make the reader’s job easier by avoiding acronyms when you can.&lt;/li&gt;
&lt;/ul&gt;
&lt;h2&gt;Watch your tone&lt;/h2&gt;
&lt;p&gt;Striking the right tone takes work—but it’s critical to the success of your business documents. If you sound likable and professional, people will want to work with you and respond to you. So adopt a relaxed tone, as if speaking directly to the recipient of your document.
Avoid hyperformality: what do you think of colleagues who say or write “how may I be of assistance?” instead of “how may I help you?” When they choose overblown words over everyday equivalents, don’t they strike you as pompous?&lt;/p&gt;
&lt;p&gt;Too much formality will spoil your style. Keep your writing down to earth and achieve a personal touch by:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;Writing your message more or less as you’d say it, but without all the casualisms (“like” and “you know”).&lt;/li&gt;
&lt;li&gt;Including courtesies such as “thank you”, “we’re happy to”, and “we appreciate”.&lt;/li&gt;
&lt;li&gt;Using the names of the people you’re writing about.&lt;/li&gt;
&lt;li&gt;Using personal pronouns (“you”, “he”, “she”—not “the reader”, “the applicant”).&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;&lt;strong&gt;Be collegial:&lt;/strong&gt; you’ll have better luck delivering most kinds of messages, even tough ones, if you approach people collegially. Imagine that everything you write will be paraded before a jury in a lawsuit.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Drop the sarcasm:&lt;/strong&gt; sarcasm expresses contempt and superiority. It doesn’t shame people into compliance. Rather, it’s a surefire way of irritating and alienating them.&lt;/p&gt;
&lt;p&gt;Recap:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;Arrive at a relaxed but professional tone by writing your message as if you were speaking to the recipient in person.&lt;/li&gt;
&lt;li&gt;Refer to people by name, use personal pronouns as you naturally would, and shun fancy substitutes for everyday words.&lt;/li&gt;
&lt;li&gt;Always use your best judgement and a collegial tone in composing your messages, even if the content isn’t positive. You’ll get better responses and keep yourself out of trouble.&lt;/li&gt;
&lt;li&gt;Adopt a tone appropriate to your relationship with the recipient.&lt;/li&gt;
&lt;li&gt;Never use sarcasm in professional messages. It will result in a step away from—not toward—your desired outcome.&lt;/li&gt;
&lt;/ul&gt;
&lt;h2&gt;Writing emails&lt;/h2&gt;
&lt;p&gt;Be as direct as possible while maintaining a polite tone. &lt;strong&gt;Come to the point of your email within the first two or three sentences.&lt;/strong&gt;&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Keep emails brief.&lt;/strong&gt; Restrict yourself to one screen’s worth of text and keep the message tight and focused so your readers get the point fast. Write a concise subject line that tells your readers why you’re writing and what it means to them. If they need to act on your message, make that clear in the subject line.&lt;/p&gt;
&lt;p&gt;Diligently adhere to standard writing conventions—even when typing with your thumbs on a handheld device. Don’t reflexively click “Reply All”. Pick your recipients!&lt;/p&gt;
&lt;h2&gt;A checklist for the four stages of writing (MACJ)&lt;/h2&gt;
&lt;p&gt;&lt;strong&gt;Madman (prepare):&lt;/strong&gt;&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;Consider why you’re writing: What’s moved you to write? What’s the assignment? What do you hope to achieve?&lt;/li&gt;
&lt;li&gt;Think about who your readers are and what they need to know.&lt;/li&gt;
&lt;li&gt;Figure out how much time you have, and work out a rough schedule for gathering ideas and material, outlining, preparing a draft, and revising.&lt;/li&gt;
&lt;li&gt;Research with imagination and gusto. Take notes on relevant information.&lt;/li&gt;
&lt;li&gt;Push yourself to be creative. Don’t be content with obvious ideas that just anyone would think of.&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;&lt;strong&gt;Architect (skeleton):&lt;/strong&gt;&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;Write down your three main points in complete sentences—with as much specificity as you can.&lt;/li&gt;
&lt;li&gt;Consider the best order of the three main points and reorganize them if necessary.&lt;/li&gt;
&lt;li&gt;Decide how to open and conclude the document.&lt;/li&gt;
&lt;li&gt;Think about what visual aids might be helpful in conveying your ideas.&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;&lt;strong&gt;Carpenter (draft):&lt;/strong&gt;&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;If possible, turn away from all distractions. Silence your phone and your computer alerts, and find an hour of solitude. You’ll be writing.&lt;/li&gt;
&lt;li&gt;Use your three-point outline as a guide.&lt;/li&gt;
&lt;li&gt;Start writing paragraphs that support the point you find easiest to start with—then move to the other points.&lt;/li&gt;
&lt;li&gt;Write swiftly without stopping to edit or polish.&lt;/li&gt;
&lt;li&gt;Try to write a full section in one sitting. If you must get up in the middle of a section, start the next sentence with a few words and then leave.&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;&lt;strong&gt;Judge (improve and iterate):&lt;/strong&gt;&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;Immediately after completing your draft, read it through with the idea of amplifying ideas here and there.&lt;/li&gt;
&lt;li&gt;Then let it cool off—overnight, if you can, or for a few minutes if you’re working under an urgent deadline.&lt;/li&gt;
&lt;li&gt;When you return to your draft, consider it from the audience’s perspective. Will it be clear to everyone who looks at it, or does it require inside knowledge? Is it concise, or does it waste words and time?&lt;/li&gt;
&lt;li&gt;Identify the draft’s two biggest flaws and try to fix them.&lt;/li&gt;
&lt;li&gt;Ask yourself:&lt;ul&gt;
&lt;li&gt;Is anything essential missing?&lt;/li&gt;
&lt;li&gt;Are important points stressed?&lt;/li&gt;
&lt;li&gt;Is the meaning of each sentence clear and accurate?&lt;/li&gt;
&lt;li&gt;Are my transitions smooth?&lt;/li&gt;
&lt;li&gt;What can I trim without sacrificing important content?&lt;/li&gt;
&lt;li&gt;Are there any vague passages I can sharpen with specific facts?&lt;/li&gt;
&lt;li&gt;Are there boring passages I can word more vividly?&lt;/li&gt;
&lt;li&gt;Can I improve the phrasing?&lt;/li&gt;
&lt;li&gt;Can I improve the punctuation?&lt;/li&gt;
&lt;li&gt;Are there any typos?&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;/ul&gt;</content><category term="hbr"></category><category term="book"></category><category term="management"></category><category term="writing"></category></entry><entry><title>Culture Docs: Facebook, Netflix and Valve</title><link href="https://bytepawn.com/culture-docs-facebook-netflix-and-valve.html" rel="alternate"></link><published>2022-06-18T00:00:00+02:00</published><updated>2022-06-18T00:00:00+02:00</updated><author><name>Marton Trencseni</name></author><id>tag:bytepawn.com,2022-06-18:/culture-docs-facebook-netflix-and-valve.html</id><summary type="html">&lt;p&gt;Many companies have some sort of "Culture Doc", a booklet or similar, which explains to new joiners what the company is about. I received Facebook's "Little Red Booklet" when I joined in 2016 February, and I was amazed how good it was. Recently I was researching other companies' Culture Docs, and found a version of Netflix's and Valve's online. It's interesting to compare and contrast what these different companies choose to put in their Culture Doc. Facebook's Culture Doc is very mission and execution oriented and serious, Netflix is analytical and HR-focused, and Valve's is a lighthearted explanation of how the company works.&lt;br/&gt;&lt;br/&gt; &lt;img src="https://v1.benbarry.com/images/portfolio/600/benbarry-000152.jpg" alt="Move fast and break things" style="width: 200px;"/&gt;&lt;/p&gt;</summary><content type="html">&lt;h2&gt;Introduction&lt;/h2&gt;
&lt;p&gt;Many companies have some sort of "Culture Doc", a booklet or similar, which explains to new joiners what the company is about. I received Facebook's "Little Red Booklet" when I joined in 2016 February, and I was amazed how good it was. Recently I was researching other companies' Culture Docs, and found a version of Netflix's and Valve's online. It's interesting to compare and contrast what these different companies choose to put in their Culture Doc.&lt;/p&gt;
&lt;p&gt;Facebook's Culture Doc is very mission and execution oriented and serious, Netflix is analytical and HR-focused, and Valve's is a lighthearted explanation of how the company works.&lt;/p&gt;
&lt;p&gt;&lt;em&gt;Disclaimer: I worked at Facebook in 2016-17. I currently own &lt;a href="https://www.google.com/search?q=meta+stock"&gt;META stock&lt;/a&gt;.&lt;/em&gt;&lt;/p&gt;
&lt;h2&gt;Facebook&lt;/h2&gt;
&lt;p&gt;Facebook is the best company I ever worked at, mostly because of its awesome mission- and impact-oriented culture. This is reflected in the Culture Doc, which is easily the best of the three — but, I'm definitely biased. There are many principles (such as &lt;strong&gt;Move Fast and Break Things&lt;/strong&gt;) and mental models (such as the &lt;strong&gt;Perception vs Reality&lt;/strong&gt; plot, which I can't find online unfortunately) that I learned at Facebook and still refer to on a daily basis.&lt;/p&gt;
&lt;p&gt;Facebook's Culture Doc is produced by the &lt;strong&gt;Facebook Analog Research Laboratory&lt;/strong&gt;, a little team within the company responsible for making posters and booklets and other analog materials for employees and the office. Fortunately, &lt;a href="https://v1.benbarry.com/"&gt;Ben Barry&lt;/a&gt;, who created the booklet, has put &lt;a href="https://v1.benbarry.com/project/facebooks-book"&gt;a few scans online&lt;/a&gt;. All the images below are from &lt;a href="https://v1.benbarry.com/"&gt;his site&lt;/a&gt;.&lt;/p&gt;
&lt;p&gt;Thinking along the &lt;em&gt;Vision—Mission—Strategy—Tactics—Execution&lt;/em&gt; spectrum, Facebook's Culture Doc focuses on &lt;strong&gt;Mission&lt;/strong&gt; and &lt;strong&gt;Execution&lt;/strong&gt;.&lt;/p&gt;
&lt;p&gt;&lt;img src="https://v1.benbarry.com/images/portfolio/600/benbarry-000152.jpg" alt="Facebook" style="width: 400px;"/&gt;&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;Facebook was originally not created to be a company.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;&lt;img src="https://v1.benbarry.com/images/portfolio/1200/benbarry-000324.jpg" alt="Facebook" style="width: 600px;"/&gt;&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;Changing how people communicate will always change the world.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;&lt;img src="https://v1.benbarry.com/images/portfolio/1200/benbarry-000325.jpg" alt="Facebook" style="width: 600px;"/&gt;&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;What happens when everyone can put their message in front of a lot of people?&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;&lt;img src="https://v1.benbarry.com/images/portfolio/1200/benbarry-000326.jpg" alt="Facebook" style="width: 600px;"/&gt;&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;Each Facebook engineer is responsible for approximately 1,000,000 people using Facebook. (My old edition says 433,000...)&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;&lt;img src="https://v1.benbarry.com/images/portfolio/1200/benbarry-000327.jpg" alt="Facebook" style="width: 600px;"/&gt;&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;There is no point in having a 5-year plan in this industry.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;&lt;img src="https://v1.benbarry.com/images/portfolio/1200/benbarry-000328.jpg" alt="Facebook" style="width: 600px;"/&gt;&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;Hacking can be playful — as long as it works.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;&lt;img src="https://v1.benbarry.com/images/portfolio/1200/benbarry-000329.jpg" alt="Facebook" style="width: 600px;"/&gt;&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;Greatness and comfort rarely coexist.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;&lt;img src="https://v1.benbarry.com/images/portfolio/1200/benbarry-000330.jpg" alt="Facebook" style="width: 600px;"/&gt;&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;Neither snow, nor rain, nor heat, nor gloom of night keeps these hackers from the swift completion of their code.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;&lt;img src="https://v1.benbarry.com/images/portfolio/1200/benbarry-000331.jpg" alt="Facebook" style="width: 600px;"/&gt;&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;The quick shall inherit the Earth.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;&lt;img src="https://v1.benbarry.com/images/portfolio/1200/benbarry-000332.jpg" alt="Facebook" style="width: 600px;"/&gt;&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;When you don't realize what you can't do, you can do some pretty cool stuff.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;&lt;img src="https://v1.benbarry.com/images/portfolio/1200/benbarry-000333.jpg" alt="Facebook" style="width: 600px;"/&gt;&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;Remember, people don't use Facebook because they like us. They use it because they like their friends. 👍&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;&lt;img src="https://v1.benbarry.com/images/portfolio/1200/benbarry-000334.jpg" alt="Facebook" style="width: 600px;"/&gt;&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;We don't build services to make money. We make money to build better services.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;&lt;img src="https://v1.benbarry.com/images/portfolio/1200/benbarry-000335.jpg" alt="Facebook" style="width: 600px;"/&gt;&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;If we don't create the thing that kills Facebook, someone else will. #metaverse&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;Unfortunately most pages of the booklet are not available scanned, but in some cases the same message is available in scanned poster form:&lt;/p&gt;
&lt;p&gt;&lt;img src="https://v1.benbarry.com/images/portfolio/1200/benbarry-000427.jpg" alt="Facebook" style="width: 400px;"/&gt;&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;Move fast and break things.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;&lt;img src="https://v1.benbarry.com/images/portfolio/600/benbarry-000426.jpg" alt="Facebook" style="width: 400px;"/&gt;&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;Done is better than perfect.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;&lt;img src="https://v1.benbarry.com/images/portfolio/600/benbarry-000425.jpg" alt="Facebook" style="width: 400px;"/&gt;&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;What would you do if you weren't afraid?&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;&lt;img src="https://v1.benbarry.com/images/portfolio/600/benbarry-000424.jpg" alt="Facebook" style="width: 400px;"/&gt;&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;Fail harder.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;&lt;img src="https://v1.benbarry.com/images/portfolio/600/benbarry-000180.jpg" alt="Facebook" style="width: 600px;"/&gt;&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;The back still says the name of a different technology company, one that came before us, left as a reminder that if we fail, someday someone might replace us.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;Other good sources of information about Facebook culture:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;&lt;a href="https://techcrunch.com/2012/02/01/facebook-ipo-letter/"&gt;Founder's Letter, 2012&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://about.fb.com/news/2021/10/founders-letter/"&gt;Founder's Letter, 2021&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://www.amazon.com/Facebook-Inside-Story-Steven-Levy/dp/0735213151"&gt;Facebook: The Inside Story (book)&lt;/a&gt; - by Steven Levy, highly recommended&lt;/li&gt;
&lt;li&gt;&lt;a href="https://bytepawn.com/hack-hhvm-second-system-effect.html"&gt;Hack, HHVM and avoiding the Second-system effect&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;h2&gt;Netflix&lt;/h2&gt;
&lt;p&gt;The Netflix Culture Doc started out as a &lt;a href="/images/netflix_culture.pdf"&gt;deck of 100+ slides&lt;/a&gt;, but over time it has become &lt;strong&gt;a minisite on the official Netflix website&lt;/strong&gt;: &lt;a href="https://jobs.netflix.com/culture"&gt;jobs.netflix.com/culture&lt;/a&gt;.&lt;/p&gt;
&lt;p&gt;&lt;img src="/images/netflix_culture_website.jpg" alt="Netflix" style="width: 600px;"/&gt;&lt;/p&gt;
&lt;p&gt;Unlike Facebook's, which focuses on mission and execution, this feels more like a set of HR principles. However, there are many gems. Here are some of my favorites (from the 2012 version):&lt;/p&gt;
&lt;p&gt;&lt;img src="/images/netflix_cd_1.PNG" alt="Netflix" style="width: 600px;"/&gt;&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;The real company values, as opposed to the nice-sounding values, are shown by who gets rewarded, promoted, or let go.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;&lt;img src="/images/netflix_cd_2.PNG" alt="Netflix" style="width: 600px;"/&gt;&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;Netflix wants to work with people who embody these 9 values: judgement, communication, impact, curiosity, innovation, courage, passion, honesty, selflessness.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;&lt;img src="/images/netflix_cd_3.PNG" alt="Netflix" style="width: 600px;"/&gt;&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;Increase talent density as company grows and continue to run informally.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;&lt;img src="/images/netflix_cd_4.PNG" alt="Netflix" style="width: 600px;"/&gt;&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;Context, not control: If you want to build a ship, don't drum up the people to gather wood, divide the work, and give orders. Instead, teach them to yearn for the vast and endless sea.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;&lt;img src="/images/netflix_cd_5.PNG" alt="Netflix" style="width: 600px;"/&gt;&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;Highly aligned, loosely coupled.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;&lt;img src="/images/netflix_cd_6.PNG" alt="Netflix" style="width: 600px;"/&gt;&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;Annual compensation review is market based.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;h2&gt;Valve&lt;/h2&gt;
&lt;p&gt;Valve's &lt;a href="https://www.valvesoftware.com/en/publications"&gt;Culture Doc is up on their website&lt;/a&gt;, with translations into 8 languages. It dates from 2012; I'm not sure whether there is a more up-to-date version internally. Called the &lt;em&gt;Handbook for New Employees&lt;/em&gt;, it's a &lt;strong&gt;lighthearted explanation of how the company works and its history&lt;/strong&gt;, with some HR principles sprinkled in. Whereas Facebook's Culture Doc is mission-oriented and serious, and Netflix's is analytical (with curves and intersections) and HR-focused, Valve's is by far the most informal of the three.&lt;/p&gt;
&lt;p&gt;&lt;img src="/images/valve_cd_1.jpg" alt="Valve" style="width: 600px;"/&gt;&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;Valve is flat. It's our shorthand way of saying that we don't have any management, and nobody "reports to" anybody else.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;&lt;img src="/images/valve_cd_2.JPG" alt="Valve" style="width: 600px;"/&gt;&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;You should always be considering where you could move yourself to be more valuable.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;&lt;img src="/images/valve_cd_4.JPG" alt="Valve" style="width: 600px;"/&gt;&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;Valve's customers are who you're serving. Do what's right for them.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;&lt;img src="/images/valve_cd_5.JPG" alt="Valve" style="width: 600px;"/&gt;&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;Cabals are really just multidisciplinary project teams.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;&lt;img src="/images/valve_cd_6.JPG" alt="Valve" style="width: 600px;"/&gt;&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;My comment: it seems that being in the office and physically working together was a really important part of Valve's culture pre-Covid. Also see this &lt;a href="https://www.thegamer.com/gabe-newell-valve-remote-work-productivity/"&gt;article with Gabe Newell&lt;/a&gt;, Valve's President, on how Covid affected Valve.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;&lt;img src="/images/valve_cd_7.JPG" alt="Valve" style="width: 600px;"/&gt;&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;Nobody has ever been fired at Valve for making a mistake.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;&lt;img src="/images/valve_cd_8.JPG" alt="Valve" style="width: 600px;"/&gt;&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;Talk to someone in a {meeting, elevator, kitchen, bathroom}.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;&lt;img src="/images/valve_cd_9.JPG" alt="Valve" style="width: 600px;"/&gt;&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;Stack-ranking.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;&lt;img src="/images/valve_cd_10.JPG" alt="Valve" style="width: 600px;"/&gt;&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;Hiring is the most important thing in the Universe.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;&lt;img src="/images/valve_cd_11.JPG" alt="Valve" style="width: 600px;"/&gt;&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;T-shaped employee model.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;&lt;img src="/images/valve_cd_12.JPG" alt="Valve" style="width: 600px;"/&gt;&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;What is Valve not good at?&lt;/p&gt;
&lt;/blockquote&gt;
&lt;h2&gt;Similar books&lt;/h2&gt;
&lt;p&gt;(+) denotes books I've read and can whole-heartedly recommend.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;About successful companies:&lt;/strong&gt;&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;&lt;a href="https://www.amazon.com/Facebook-Inside-Story-Steven-Levy/dp/0735213151"&gt;Facebook: The Inside Story (+)&lt;/a&gt; - by Steven Levy&lt;/li&gt;
&lt;li&gt;&lt;a href="https://www.amazon.com/Hack-HHVM-Programming-Productivity-Breaking/dp/1491920874"&gt;Hack and HHVM: Programming Productivity Without Breaking Things (+)&lt;/a&gt; - by Owen Yamauchi, how Facebook rewrote the PHP website without rewriting the PHP website&lt;/li&gt;
&lt;li&gt;&lt;a href="https://www.amazon.com/Working-Backwards-Insights-Stories-Secrets-ebook/dp/B08GKYJVL1"&gt;Working Backwards: Insights, Stories, and Secrets from Inside Amazon (+)&lt;/a&gt; - by Colin Bryar and Bill Carr&lt;/li&gt;
&lt;li&gt;&lt;a href="https://www.amazon.com/Invent-Wander-Collected-Writings-Introduction-ebook/dp/B08BCCT6MW"&gt;Invent and Wander: The Collected Writings of Jeff Bezos (+)&lt;/a&gt; - by Jeff Bezos&lt;/li&gt;
&lt;li&gt;&lt;a href="https://www.amazon.com/Masters-Doom-Created-Transformed-Culture-ebook/dp/B000FBFNL0"&gt;Masters of Doom: How Two Guys Created an Empire and Transformed Pop Culture (+)&lt;/a&gt; - David Kushner&lt;/li&gt;
&lt;li&gt;&lt;a href="https://www.amazon.com/How-Google-Works-Eric-Schmidt/dp/1455582328"&gt;How Google Works&lt;/a&gt; - by Eric Schmidt&lt;/li&gt;
&lt;li&gt;&lt;a href="https://www.amazon.com/No-Rules-Netflix-Culture-Reinvention-ebook/dp/B082261K3F"&gt;No Rules Rules: Netflix and the Culture of Reinvention&lt;/a&gt; - by Reed Hastings&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;Also see &lt;a href="https://www.aboutamazon.com/about-us/leadership-principles"&gt;Amazon's official leadership principles&lt;/a&gt; and &lt;a href="https://bytepawn.com/the-best-parts-of-jeff-bezos-invent-and-wander.html"&gt;my blog post about the best parts of the Invent and Wander book&lt;/a&gt;.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Anti-patterns:&lt;/strong&gt;&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;&lt;a href="https://www.amazon.com/Bad-Blood-Secrets-Silicon-Startup-ebook/dp/B07BW911F7"&gt;Bad Blood: Secrets and Lies in a Silicon Valley Startup (+)&lt;/a&gt; - by John Carreyrou&lt;/li&gt;
&lt;li&gt;&lt;a href="https://www.amazon.com/Billion-Dollar-Loser-Epic-WeWork-ebook/dp/B08CB5763M"&gt;Billion Dollar Loser: The Epic Rise and Fall of WeWork (+)&lt;/a&gt;- by Reeves Wiedeman&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;Also see my &lt;a href="https://bytepawn.com/the-company-is-all-hot-air.html"&gt;blog post about Theranos and WeWork.&lt;/a&gt;&lt;/p&gt;
&lt;h2&gt;Conclusion&lt;/h2&gt;
&lt;p&gt;&lt;strong&gt;Every company should have a Culture Doc and publish it!&lt;/strong&gt;&lt;/p&gt;
&lt;p&gt;PS: Coincidentally my friend &lt;a href="https://www.linkedin.com/in/mardack/"&gt;Jack Mardack's&lt;/a&gt; company &lt;a href="https://www.oysterhr.com/"&gt;Oyster&lt;/a&gt; just &lt;a href="https://oysterhr.notion.site/Our-Mission-Vision-Values-ac1d679589fc4ec9a4973ee8f289f484"&gt;opensourced their Culture Doc&lt;/a&gt; today!&lt;/p&gt;</content><category term="culture"></category><category term="facebook"></category><category term="netflix"></category><category term="valve"></category></entry><entry><title>More Data Scientists should learn SQL</title><link href="https://bytepawn.com/more-data-scientists-should-learn-sql.html" rel="alternate"></link><published>2022-05-29T00:00:00+02:00</published><updated>2022-05-29T00:00:00+02:00</updated><author><name>Marton Trencseni</name></author><id>tag:bytepawn.com,2022-05-29:/more-data-scientists-should-learn-sql.html</id><summary type="html">&lt;p&gt;In my experience, many Data Scientists struggle to write SQL queries in interviews.&lt;br/&gt;&lt;br/&gt;&lt;img src="/images/sql-img.jpg" alt="SQL" style="width: 300px;"/&gt;&lt;/p&gt;</summary><content type="html">&lt;h2&gt;Introduction&lt;/h2&gt;
&lt;p&gt;Our Data Scientist interview is a multi-stage process. As the hiring manager, my interview is somewhere in the middle. The candidates who make it this far have had their CVs screened to make sure SQL is on it, and had to answer a few SQL screening questions, like whether they know how many rows are produced when using &lt;code&gt;UNION ALL&lt;/code&gt;. &lt;strong&gt;Still, most candidates I talk to struggle to write a relatively simple SQL query. &lt;/strong&gt;&lt;/p&gt;
&lt;p&gt;It's worth pointing out that, although we always ask SQL questions, this is only about 10-20% of the interview process. We also ask candidates to write 2-3 line Python functions, and ask them about metrics, stats, modeling/ML, and maybe even some Linux.&lt;/p&gt;
&lt;h2&gt;A relatively simple SQL interview question&lt;/h2&gt;
&lt;p&gt;For this post, I wrote a slightly modified version of our interview question, but it's the same structure and complexity. Let's say there is table &lt;code&gt;transactions&lt;/code&gt;, with columns:&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;transactions:
- id BIGINT
- customer_id BIGINT
- ts TIMESTAMP
- amount_usd DECIMAL
&lt;/pre&gt;&lt;/div&gt;


&lt;p&gt;&lt;strong&gt;Question: how many customers spent at least $1000 in all months of the year 2021?&lt;/strong&gt;&lt;/p&gt;
&lt;p&gt;The solution we're looking for:&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;span class="k"&gt;WITH&lt;/span&gt;
&lt;span class="n"&gt;monthly_spends&lt;/span&gt; &lt;span class="k"&gt;AS&lt;/span&gt;
&lt;span class="p"&gt;(&lt;/span&gt;
    &lt;span class="k"&gt;SELECT&lt;/span&gt;
        &lt;span class="n"&gt;customer_id&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="k"&gt;MONTH&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;ts&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;AS&lt;/span&gt; &lt;span class="k"&gt;month&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="k"&gt;SUM&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;amount_usd&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="k"&gt;FROM&lt;/span&gt;
        &lt;span class="n"&gt;transactions&lt;/span&gt;
    &lt;span class="k"&gt;WHERE&lt;/span&gt;
        &lt;span class="k"&gt;YEAR&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;ts&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;2021&lt;/span&gt;
    &lt;span class="k"&gt;GROUP&lt;/span&gt; &lt;span class="k"&gt;BY&lt;/span&gt;
        &lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;2&lt;/span&gt;
    &lt;span class="k"&gt;HAVING&lt;/span&gt;
        &lt;span class="k"&gt;SUM&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;amount_usd&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;&amp;gt;=&lt;/span&gt; &lt;span class="mi"&gt;1000&lt;/span&gt;
&lt;span class="p"&gt;),&lt;/span&gt;
&lt;span class="n"&gt;qualifying_users&lt;/span&gt; &lt;span class="k"&gt;AS&lt;/span&gt;
&lt;span class="p"&gt;(&lt;/span&gt;
    &lt;span class="k"&gt;SELECT&lt;/span&gt;
        &lt;span class="n"&gt;customer_id&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="k"&gt;COUNT&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="k"&gt;FROM&lt;/span&gt;
        &lt;span class="n"&gt;monthly_spends&lt;/span&gt;
    &lt;span class="k"&gt;GROUP&lt;/span&gt; &lt;span class="k"&gt;BY&lt;/span&gt;
        &lt;span class="mi"&gt;1&lt;/span&gt;
    &lt;span class="k"&gt;HAVING&lt;/span&gt;
        &lt;span class="k"&gt;COUNT&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;12&lt;/span&gt;
&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="k"&gt;SELECT&lt;/span&gt; &lt;span class="k"&gt;COUNT&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="n"&gt;FORM&lt;/span&gt; &lt;span class="n"&gt;qualifying_users&lt;/span&gt;
&lt;/pre&gt;&lt;/div&gt;


&lt;p&gt;Extracting metrics from a Vertica DWH is a daily task for Data Scientists on our current team (and was at my previous 3 jobs); it's not uncommon for us to write SQL queries longer than 100 lines, some longer than 1,000 lines. So expecting a candidate to be able to use &lt;code&gt;GROUP BY, HAVING, COUNT(), SUM()&lt;/code&gt; is definitely not unfair; it's what we do every day.&lt;/p&gt;
&lt;p&gt;First, I make sure they understand the question. I spend extra time stressing the &lt;em&gt;"all months of the year"&lt;/em&gt; part; here English-as-a-second-language is sometimes a factor. So I usually repeat this requirement phrased in 2-3 different ways (in English), to make sure the candidate got it.&lt;/p&gt;
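&lt;p&gt;The solution above can be sanity-checked end-to-end with Python's built-in &lt;code&gt;sqlite3&lt;/code&gt;. This is an illustrative sketch: SQLite has no &lt;code&gt;MONTH()&lt;/code&gt; or &lt;code&gt;YEAR()&lt;/code&gt;, so &lt;code&gt;strftime()&lt;/code&gt; stands in for them, and the sample rows are made up.&lt;/p&gt;

```python
import sqlite3

# Toy in-memory database; the schema mirrors the transactions table above.
con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE transactions (id INTEGER, customer_id INTEGER, ts TEXT, amount_usd REAL)")

# Customer 1 spends $1500 in every month of 2021, customer 2 only in January.
rows = [(m, 1, f"2021-{m:02d}-15", 1500.0) for m in range(1, 13)]
rows.append((13, 2, "2021-01-15", 1500.0))
con.executemany("INSERT INTO transactions VALUES (?, ?, ?, ?)", rows)

query = """
WITH
monthly_spends AS
(
    SELECT customer_id, strftime('%m', ts) AS month, SUM(amount_usd)
    FROM transactions
    WHERE strftime('%Y', ts) = '2021'
    GROUP BY 1, 2
    HAVING SUM(amount_usd) >= 1000
),
qualifying_users AS
(
    SELECT customer_id, COUNT(*)
    FROM monthly_spends
    GROUP BY 1
    HAVING COUNT(*) = 12
)
SELECT COUNT(*) FROM qualifying_users
"""
result = con.execute(query).fetchone()[0]
print(result)  # -> 1
```

&lt;p&gt;Only customer 1 spends at least $1000 in all 12 months, so the query returns 1.&lt;/p&gt;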
&lt;h2&gt;What to look for&lt;/h2&gt;
&lt;p&gt;Interviewing is nerve-wracking, so I expect candidates to perform worse than they would under normal, real-life conditions. It's the same for me. For this reason, I tend to help them and ignore minor errors/omissions. For example, people tend to not remember &lt;code&gt;MONTH()&lt;/code&gt; and &lt;code&gt;YEAR()&lt;/code&gt; and maybe try to use &lt;code&gt;DATE_PART()&lt;/code&gt;, but then struggle with the syntax. Or they forget to actually write out the &lt;code&gt;GROUP BY&lt;/code&gt;, even though clearly that's what they're doing. In cases like this I help them, and don't count it against them. Or, if they write the correct query but forget the &lt;code&gt;WHERE YEAR(ts) = 2021&lt;/code&gt; part, I remind them and also don't count it against them. Also, most problems can be solved in several different ways, so I'll take anything that is roughly right: eg. in the above query, you can write &lt;code&gt;HAVING ...&lt;/code&gt; like I did, or move that to the subsequent &lt;code&gt;SELECT&lt;/code&gt; as a &lt;code&gt;WHERE&lt;/code&gt; condition.&lt;/p&gt;
&lt;p&gt;What we are looking for is: can candidates think in SQL and formulate the solution as an SQL query? Can they break down the question into a series of &lt;code&gt;SELECT&lt;/code&gt;s which progressively get closer to the answer we are looking for? This is where most people fail — to my surprise. This tells me that these candidates haven't yet figured out that knowing how to think in SQL and write queries more complicated than a single &lt;code&gt;SELECT &amp;lt;columns&amp;gt; FROM &amp;lt;table&amp;gt; WHERE &amp;lt;condition&amp;gt;&lt;/code&gt; is worth it.&lt;/p&gt;
&lt;p&gt;In our interview process, depending on the position and the person, a candidate can still get hired if they can't solve the above SQL, but they have to be really good at something else. Also, if the candidate is not a junior, I would be skeptical about how they made it this far without learning to write SQL, simply because having to write SQL has been a daily part of my last 4 jobs in this field. In my current job I'm a hands-on manager running a 20-person team, but I still have about 10,000 lines of SQL in my drafts, not counting code that I've committed to Github.&lt;/p&gt;
&lt;h2&gt;Indentation&lt;/h2&gt;
&lt;p&gt;Another thing I notice is that very few candidates (i) indent their SQL code (they're writing it on their own laptop in the interview) or (ii) use &lt;code&gt;WITH&lt;/code&gt;; most write subqueries instead. For my brain, indenting is a major factor in keeping track of what's going on in the &lt;code&gt;SELECT&lt;/code&gt; and being able to scan it to make sure it's good, while using &lt;code&gt;WITH&lt;/code&gt; allows me to think sequentially, and also keeps the code sequential. Using sub-queries requires more indentation and reverses the order of the &lt;code&gt;SELECT&lt;/code&gt;s; to me it's a mental overhead that hinders clarity.&lt;/p&gt;
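&lt;p&gt;To make the contrast concrete, here is a toy sketch (SQLite via Python's &lt;code&gt;sqlite3&lt;/code&gt;, with made-up rows) of the same logic written both ways: the &lt;code&gt;WITH&lt;/code&gt; version reads top-to-bottom, the subquery version reads inside-out, and both return the same count.&lt;/p&gt;

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE transactions (id INTEGER, customer_id INTEGER, ts TEXT, amount_usd REAL)")
con.executemany("INSERT INTO transactions VALUES (?, ?, ?, ?)",
                [(m, 1, f"2021-{m:02d}-01", 1500.0) for m in range(1, 13)])

# Sequential style: each WITH step builds on the previous one, top to bottom.
with_style = """
WITH monthly_spends AS (
    SELECT customer_id
    FROM transactions
    WHERE strftime('%Y', ts) = '2021'
    GROUP BY customer_id, strftime('%m', ts)
    HAVING SUM(amount_usd) >= 1000
),
qualifying_users AS (
    SELECT customer_id FROM monthly_spends GROUP BY 1 HAVING COUNT(*) = 12
)
SELECT COUNT(*) FROM qualifying_users
"""

# Nested style: the first logical step is the innermost, so it reads inside-out.
subquery_style = """
SELECT COUNT(*) FROM (
    SELECT customer_id FROM (
        SELECT customer_id
        FROM transactions
        WHERE strftime('%Y', ts) = '2021'
        GROUP BY customer_id, strftime('%m', ts)
        HAVING SUM(amount_usd) >= 1000
    )
    GROUP BY customer_id
    HAVING COUNT(*) = 12
)
"""

with_result = con.execute(with_style).fetchone()[0]
subquery_result = con.execute(subquery_style).fetchone()[0]
print(with_result, subquery_result)  # -> 1 1
```

&lt;p&gt;Both queries are equivalent; the difference is purely in how easily a reader can follow the steps.&lt;/p&gt;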
&lt;p&gt;See this past article on &lt;a href="https://bytepawn.com/how-i-write-sql-code.html"&gt;How I write SQL code&lt;/a&gt;.&lt;/p&gt;
&lt;h2&gt;Why SQL gets overlooked&lt;/h2&gt;
&lt;p&gt;There is no one answer to this, because Data Scientists come from many different educational backgrounds and companies.&lt;/p&gt;
&lt;p&gt;People coming from a Software Engineering background may miss out on SQL because we (I also have a Comp.Sci. degree) spend a lot of time learning and coding in imperative languages like C, C++, Java, Python. In the last 10 years there has been a lot of buzz around Functional Programming (FP) and languages like Haskell and F#, and a lot of good functional patterns made it into mainstream programming languages and practice. But there's not a lot of buzz around Declarative Programming and languages like Prolog and SQL, and I don't think it's taught widely at universities (I was lucky enough to learn Prolog and SQL at university). In a declarative language like SQL we declare what results we want, and we let the runtime (query optimizer and execution engine) figure out the best way to get it.&lt;/p&gt;
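&lt;p&gt;A toy sketch of the difference: the imperative version spells out &lt;em&gt;how&lt;/em&gt; to aggregate, step by step, while the declarative version only states &lt;em&gt;what&lt;/em&gt; we want and lets the engine choose the execution plan (SQLite here, standing in for a real DWH; the rows are made up).&lt;/p&gt;

```python
import sqlite3
from collections import defaultdict

rows = [("a", 10), ("a", 5), ("b", 7)]

# Imperative: an explicit loop and accumulator describe how to compute totals.
totals = defaultdict(int)
for label, amount in rows:
    totals[label] += amount

# Declarative: state the desired result; the engine decides how to compute it.
con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE t (label TEXT, amount INTEGER)")
con.executemany("INSERT INTO t VALUES (?, ?)", rows)
sql_totals = dict(con.execute("SELECT label, SUM(amount) FROM t GROUP BY label"))

print(dict(totals) == sql_totals)  # -> True, both are {'a': 15, 'b': 7}
```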
&lt;p&gt;People coming from a math or science background probably invest their time learning Python and all the interesting libraries and frameworks that we have today. After all, it's hard to get yourself to invest time in SQL if you can spend the same time playing around with Reinforcement Learning in PyTorch.&lt;/p&gt;
&lt;p&gt;Lastly, there are companies where Data Science teams haven't figured this out yet, so the candidates haven't learned it through osmosis. In these teams, Data Scientists block waiting for other people to write SQL for them, to extract data from a DWH and receive it in a flat file.&lt;/p&gt;
&lt;h2&gt;Why SQL is worth knowing&lt;/h2&gt;
&lt;p&gt;&lt;strong&gt;1. "SQL is just another language, like Python or Java!"&lt;/strong&gt; ... and you can't expect everybody to know all programming languages. I strongly disagree with this framing. SQL has been around since the 1970s; it's the de facto standard for getting data out of a (relational) database system. Even database systems that start out without SQL support usually end up with some sort of SQL support, because it's just so damn useful. In an analytics/DS/ML setting, the Data Warehouse (DWH) will almost always be queried with SQL (or something resembling SQL, like HiveQL). This even tends to happen if the underlying storage is not strictly relational, like when running a Presto engine on flat files stored on S3. Yes, there are ML projects that don't deal with relational data, like image, video and audio processing, but in my experience (i) in real life, there are always projects that do, and (ii) even something like image recognition will usually have additional input (eg. when recognizing faces on an image, it's worth knowing who was tagged on the image, what the location and time was, etc, and these additional fields will come from a DWH).&lt;/p&gt;
&lt;p&gt;SQL is not just another language like Python or Java, and there is no sign that it will go away anytime soon, including humans directly writing SQL.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;2. "I usually just get the data out and do it in Python!"&lt;/strong&gt; This breaks if there is too much data "to get out", which can happen even at medium-sized companies. Also, it's very inefficient and slow, since all the data has to be copied out from the database disks, over a network, to a laptop or devserver. Usually, the combined program to get the data out and do the processing is harder to maintain and more error-prone than a flat SQL query like the one above. And the biggest reason: if you do it in SQL, things like &lt;code&gt;JOIN&lt;/code&gt;s and &lt;code&gt;GROUP BY&lt;/code&gt;s will be executed by the query optimizer, taking into account indexes, storage patterns, statistics, etc. It will be a lot faster, since (i) only the relevant data needs to be read from disk, (ii) the execution engine is highly optimized C/C++/Java/Assembly code, and (iii) instead of all the data getting transferred out of the database, only the results are.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;3. "Data Analysts / Data Engineers write SQL queries for me!"&lt;/strong&gt; At many companies, there are separate roles that do most of the SQL. But, in my experience, even at companies like this (like Facebook), Data Scientists need to and should write SQL:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;it's unreasonable to block and wait for somebody to write a 20-50 line SQL which takes 5-10 minutes to write&lt;/li&gt;
&lt;li&gt;the Data Engineering person or team may be busy or have long turnaround times&lt;/li&gt;
&lt;li&gt;at this or a future job, there may be no other team/person to do this&lt;/li&gt;
&lt;li&gt;the other team/person may make a mistake, which could invalidate the Data Scientist's results&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;If my current team of Data Scientists blocked on the Data Engineering team to write SQL queries, our productivity/impact would be at least 10x less than it is today.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;4. It's just &lt;code&gt;SELECT&lt;/code&gt;s!&lt;/strong&gt; Data Scientists mostly just have to write &lt;code&gt;SELECT&lt;/code&gt;s (versus designing tables, choosing indexes, writing upserts, etc.), which is a relatively small part of SQL overall!&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;5. Writing SQL absurdly increases the value of Data Scientists!&lt;/strong&gt; In my experience, if a Data Scientist or a team of Data Scientists writes their own SQL, they will become experts on the data and metrics. Add to this the Data Scientist's objectivity, understanding of statistics, and ability to put models on top. Since they write their own SQL, they don't block on other teams, which means they can answer incoming questions very quickly. Eg. "the Finance team came up with this forecast for Q4, but it may be biased by company OKRs, what do you think?" Get the relevant metrics out with SQL, play around with them, put a Prophet model on top with some external regressors, and make a better forecast than the Finance team with a 1-day turnaround. Although this is not the core job of a Data Science team, it makes us absurdly valuable to the company, which means good salaries, good raises, no layoffs, plus headcount, etc.&lt;/p&gt;
&lt;h2&gt;Conclusion&lt;/h2&gt;
&lt;p&gt;If you don't know SQL, learn it! 😀&lt;/p&gt;</content><category term="data"></category><category term="sql"></category></entry><entry><title>Useful Python decorators for Data Scientists</title><link href="https://bytepawn.com/python-decorators-for-data-scientists.html" rel="alternate"></link><published>2022-05-22T00:00:00+02:00</published><updated>2022-05-22T00:00:00+02:00</updated><author><name>Marton Trencseni</name></author><id>tag:bytepawn.com,2022-05-22:/python-decorators-for-data-scientists.html</id><summary type="html">&lt;p&gt;I show toy implementations of Python decorator patterns that may be useful for Data Scientists.&lt;br/&gt;&lt;br/&gt;&lt;img src="/images/decorators-ds.jpg" alt="Python decorators" style="width: 300px;"/&gt;&lt;/p&gt;</summary><content type="html">&lt;h2&gt;Introduction&lt;/h2&gt;
&lt;p&gt;In this post, I will show some &lt;code&gt;@decorators&lt;/code&gt; that may be useful for Data Scientists. It may also be useful to revisit previous Bytepawn posts on decorators:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;&lt;a href="https://bytepawn.com/building-a-toy-python-dataclass-decorator.html"&gt;Building a toy Python @dataclass decorator&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://bytepawn.com/python-decorator-patterns.html"&gt;Python decorator patterns&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://bytepawn.com/tag/python.html"&gt;all Bytepawn posts tagged with python&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;The &lt;a href="https://github.com/mtrencseni/playground/blob/master/Decorators%20for%20Data%20Scientists.ipynb"&gt;ipython notebook is up on Github&lt;/a&gt;.&lt;/p&gt;
&lt;h2&gt;&lt;code&gt;@parallel&lt;/code&gt;&lt;/h2&gt;
&lt;p&gt;Let's assume I write a really inefficient way to find primes:&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="nn"&gt;sympy&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;isprime&lt;/span&gt;

&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;generate_primes&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;domain&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;int&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mi"&gt;1000&lt;/span&gt;&lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="mi"&gt;1000&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;num_attempts&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;int&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mi"&gt;1000&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;-&amp;gt;&lt;/span&gt; &lt;span class="nb"&gt;list&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="nb"&gt;int&lt;/span&gt;&lt;span class="p"&gt;]:&lt;/span&gt;
    &lt;span class="n"&gt;primes&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;set&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="nb"&gt;int&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nb"&gt;set&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
    &lt;span class="n"&gt;seed&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;time&lt;/span&gt;&lt;span class="p"&gt;())&lt;/span&gt;
    &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;_&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="nb"&gt;range&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;num_attempts&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
        &lt;span class="n"&gt;candidate&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;int&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;randint&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;4&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;domain&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
        &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;isprime&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;candidate&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
            &lt;span class="n"&gt;primes&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;add&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;candidate&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="nb"&gt;sorted&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;primes&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="k"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nb"&gt;len&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;generate_primes&lt;/span&gt;&lt;span class="p"&gt;()))&lt;/span&gt;
&lt;/pre&gt;&lt;/div&gt;


&lt;p&gt;Outputs something like:&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;88
&lt;/pre&gt;&lt;/div&gt;


&lt;p&gt;Then I realize that I could get a "free" speedup if I run the original &lt;code&gt;generate_primes()&lt;/code&gt; on all my CPU threads in parallel. This is a pretty common situation, so it makes sense to define a &lt;code&gt;@parallel&lt;/code&gt; decorator:&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;parallel&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;func&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="bp"&gt;None&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;args&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="p"&gt;(),&lt;/span&gt; &lt;span class="n"&gt;merge_func&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="k"&gt;lambda&lt;/span&gt; &lt;span class="n"&gt;x&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="n"&gt;x&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;parallelism&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;cpu_count&lt;/span&gt;&lt;span class="p"&gt;()):&lt;/span&gt;
    &lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;decorator&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;func&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;Callable&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
        &lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;inner&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="n"&gt;args&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="o"&gt;**&lt;/span&gt;&lt;span class="n"&gt;kwargs&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
            &lt;span class="n"&gt;results&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;Parallel&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;n_jobs&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;parallelism&lt;/span&gt;&lt;span class="p"&gt;)(&lt;/span&gt;&lt;span class="n"&gt;delayed&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;func&lt;/span&gt;&lt;span class="p"&gt;)(&lt;/span&gt;&lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="n"&gt;args&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="o"&gt;**&lt;/span&gt;&lt;span class="n"&gt;kwargs&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;i&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="nb"&gt;range&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;parallelism&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;
            &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;merge_func&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;results&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
        &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;inner&lt;/span&gt;
    &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;func&lt;/span&gt; &lt;span class="ow"&gt;is&lt;/span&gt; &lt;span class="bp"&gt;None&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
        &lt;span class="c1"&gt;# decorator was used like @parallel(...)&lt;/span&gt;
        &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;decorator&lt;/span&gt;
    &lt;span class="k"&gt;else&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
        &lt;span class="c1"&gt;# decorator was used like @parallel, without parens&lt;/span&gt;
        &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;decorator&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;func&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/pre&gt;&lt;/div&gt;


&lt;p&gt;With this, with one line we can parallelize our function:&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;span class="nd"&gt;@parallel&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;merge_func&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="k"&gt;lambda&lt;/span&gt; &lt;span class="n"&gt;li&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;sorted&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nb"&gt;set&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;chain&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="n"&gt;li&lt;/span&gt;&lt;span class="p"&gt;))))&lt;/span&gt;
&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;generate_primes&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="o"&gt;...&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt; &lt;span class="c1"&gt;# same signature, nothing changes&lt;/span&gt;
    &lt;span class="o"&gt;...&lt;/span&gt; &lt;span class="c1"&gt;# same code, nothing changes&lt;/span&gt;

&lt;span class="k"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nb"&gt;len&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;generate_primes&lt;/span&gt;&lt;span class="p"&gt;()))&lt;/span&gt;
&lt;/pre&gt;&lt;/div&gt;


&lt;p&gt;Outputs something like:&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;1281
&lt;/pre&gt;&lt;/div&gt;


&lt;p&gt;In my case, my Macbook has 8 cores, 16 threads (&lt;code&gt;cpu_count()&lt;/code&gt; is 16), so I generated roughly 16x as many primes. Notes:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;The only overhead is having to define a &lt;code&gt;merge_func&lt;/code&gt;, which merges the results of the different runs of the function into one result, to hide the parallelism from outside callers of the decorated function (&lt;code&gt;generate_primes()&lt;/code&gt; in this case). In this toy example, I just merge the lists and make sure the primes are unique by using &lt;code&gt;set()&lt;/code&gt;.&lt;/li&gt;
&lt;li&gt;There are many Python libraries and approaches (eg. threads vs processes) to achieve parallelism. This example uses &lt;a href="https://joblib.readthedocs.io/en/latest/generated/joblib.Parallel.html"&gt;process parallelism with &lt;code&gt;joblib.Parallel()&lt;/code&gt;&lt;/a&gt;, which works well on Darwin + python3 + ipython and avoids locking on the &lt;a href="https://realpython.com/python-gil/"&gt;Python Global Interpreter Lock (GIL)&lt;/a&gt;.&lt;/li&gt;
&lt;/ul&gt;
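&lt;p&gt;As a sketch of the thread-based alternative, the same decorator pattern can be built with only the standard library's &lt;code&gt;concurrent.futures&lt;/code&gt; (unlike the &lt;code&gt;joblib&lt;/code&gt; process version, threads will not sidestep the GIL for CPU-bound work; the &lt;code&gt;roll_dice()&lt;/code&gt; example below is made up):&lt;/p&gt;

```python
from concurrent.futures import ThreadPoolExecutor
from itertools import chain
from os import cpu_count
from random import randint

def parallel(merge_func=lambda x: x, parallelism=cpu_count() or 4):
    # Parenthesized-use-only variant of the decorator, built on threads.
    def decorator(func):
        def inner(*args, **kwargs):
            with ThreadPoolExecutor(max_workers=parallelism) as executor:
                futures = [executor.submit(func, *args, **kwargs) for _ in range(parallelism)]
                results = [f.result() for f in futures]
            return merge_func(results)
        return inner
    return decorator

@parallel(merge_func=lambda li: sorted(set(chain(*li))))
def roll_dice(num_rolls: int = 100) -> list[int]:
    return [randint(1, 6) for _ in range(num_rolls)]

print(roll_dice())  # very likely [1, 2, 3, 4, 5, 6]
```

&lt;p&gt;This variant only supports the parenthesized &lt;code&gt;@parallel(...)&lt;/code&gt; form, which keeps the sketch short.&lt;/p&gt;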
&lt;h2&gt;&lt;code&gt;@production&lt;/code&gt;&lt;/h2&gt;
&lt;p&gt;Sometimes we write a big complicated pipeline, with extra steps which we only want to run in certain environments. Eg. do something in our local dev environment, but not in production, or vice versa. It'd be nice to be able to decorate functions and get them to only run in certain environments, and do nothing elsewhere.&lt;/p&gt;
&lt;p&gt;One way to achieve this is with a few simple decorators: &lt;code&gt;@production&lt;/code&gt; for code we only want to run on prod, &lt;code&gt;@development&lt;/code&gt; for code we only want to run in dev; we can even introduce an &lt;code&gt;@inactive&lt;/code&gt; that turns the function off altogether. The benefit of this approach is that the deployment history and current state are tracked in code/Github. Also, we can make these changes in one line, leading to cleaner commits; e.g. &lt;code&gt;@inactive&lt;/code&gt; is cleaner than a big commit where an entire block of code is commented out.&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;span class="n"&gt;production_servers&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="o"&gt;...&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;

&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;production&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;func&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;Callable&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
    &lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;inner&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="n"&gt;args&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="o"&gt;**&lt;/span&gt;&lt;span class="n"&gt;kwargs&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
        &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;gethostname&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;production_servers&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
            &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;func&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="n"&gt;args&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="o"&gt;**&lt;/span&gt;&lt;span class="n"&gt;kwargs&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
        &lt;span class="k"&gt;else&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
            &lt;span class="k"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;This host is not a production server, skipping function decorated with @production...&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;inner&lt;/span&gt;

&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;development&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;func&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;Callable&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
    &lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;inner&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="n"&gt;args&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="o"&gt;**&lt;/span&gt;&lt;span class="n"&gt;kwargs&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
        &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;gethostname&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt; &lt;span class="ow"&gt;not&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;production_servers&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
            &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;func&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="n"&gt;args&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="o"&gt;**&lt;/span&gt;&lt;span class="n"&gt;kwargs&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
        &lt;span class="k"&gt;else&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
            &lt;span class="k"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;This host is a production server, skipping function decorated with @development...&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;inner&lt;/span&gt;

&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;inactive&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;func&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;Callable&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
    &lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;inner&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="n"&gt;args&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="o"&gt;**&lt;/span&gt;&lt;span class="n"&gt;kwargs&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
        &lt;span class="k"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;Skipping function decorated with @inactive...&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;inner&lt;/span&gt;

&lt;span class="nd"&gt;@production&lt;/span&gt;
&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;foo&lt;/span&gt;&lt;span class="p"&gt;():&lt;/span&gt;
    &lt;span class="k"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;Running in production, touching databases!&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="n"&gt;foo&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;

&lt;span class="nd"&gt;@development&lt;/span&gt;
&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;foo&lt;/span&gt;&lt;span class="p"&gt;():&lt;/span&gt;
    &lt;span class="k"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;Running in production, touching databases!&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="n"&gt;foo&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;

&lt;span class="nd"&gt;@inactive&lt;/span&gt;
&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;foo&lt;/span&gt;&lt;span class="p"&gt;():&lt;/span&gt;
    &lt;span class="k"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;Running in production, touching databases!&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="n"&gt;foo&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
&lt;/pre&gt;&lt;/div&gt;


&lt;p&gt;Output:&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;Running in production, touching databases!
This host is a production server, skipping function decorated with @development...
Skipping function decorated with @inactive...
&lt;/pre&gt;&lt;/div&gt;


&lt;p&gt;This idea can be adapted to other frameworks/environments.&lt;/p&gt;
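&lt;p&gt;For example, one way to adapt it (a hypothetical sketch, not from the article's codebase) is to gate on an environment variable instead of the hostname, which is handy in containerized setups where hostnames are ephemeral:&lt;/p&gt;

```python
import os
from functools import wraps
from typing import Callable

def only_in(env: str) -> Callable:
    # Gate execution on the APP_ENV environment variable (hypothetical name)
    # instead of the hostname.
    def decorator(func: Callable) -> Callable:
        @wraps(func)
        def inner(*args, **kwargs):
            if os.environ.get('APP_ENV', 'development') == env:
                return func(*args, **kwargs)
            print(f'APP_ENV is not {env!r}, skipping {func.__name__}...')
        return inner
    return decorator

os.environ['APP_ENV'] = 'production'

@only_in('production')
def touch_databases():
    return 'touched'

@only_in('development')
def seed_fixtures():
    return 'seeded'

print(touch_databases())  # runs, prints 'touched'
print(seed_fixtures())    # skipped, prints None
```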
&lt;h2&gt;&lt;code&gt;@deployable&lt;/code&gt;&lt;/h2&gt;
&lt;p&gt;At my current work, we use &lt;a href="https://airflow.apache.org/"&gt;Airflow&lt;/a&gt; for ETL/data pipelines. We have a rich library of helper functions which internally construct the appropriate DAG, so users (Data Scientists) don't have to worry about it.&lt;/p&gt;
&lt;p&gt;The most commonly used one is &lt;code&gt;dag_vertica_create_table_as()&lt;/code&gt;, which runs a &lt;code&gt;SELECT&lt;/code&gt; on our Vertica DWH and dumps the result into a table every night:&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;span class="n"&gt;dag&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;dag_vertica_create_table_as&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
    &lt;span class="n"&gt;table&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;my_aggregate_table&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;owner&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;Marton Trencseni (marton.trencseni@maf.ae)&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;schedule_interval&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;@daily&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="o"&gt;...&lt;/span&gt;
    &lt;span class="n"&gt;select&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="s2"&gt;&amp;quot;&amp;quot;&amp;quot;&lt;/span&gt;
&lt;span class="s2"&gt;    SELECT&lt;/span&gt;
&lt;span class="s2"&gt;        ...&lt;/span&gt;
&lt;span class="s2"&gt;    FROM&lt;/span&gt;
&lt;span class="s2"&gt;        ...&lt;/span&gt;
&lt;span class="s2"&gt;    &amp;quot;&amp;quot;&amp;quot;&lt;/span&gt;
&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/pre&gt;&lt;/div&gt;


&lt;p&gt;This then becomes a query on the DWH, roughly like:&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;span class="k"&gt;CREATE&lt;/span&gt; &lt;span class="k"&gt;TABLE&lt;/span&gt; &lt;span class="n"&gt;my_aggregate_table&lt;/span&gt; &lt;span class="k"&gt;AS&lt;/span&gt;
&lt;span class="k"&gt;SELECT&lt;/span&gt; &lt;span class="p"&gt;...&lt;/span&gt;
&lt;/pre&gt;&lt;/div&gt;


&lt;p&gt;In reality it's more complicated: we first run the query for today, and conditionally delete yesterday's table if today's was successfully created. This conditional logic (and some other accidental complexity specific to our environment, such as having to issue &lt;code&gt;GRANT&lt;/code&gt;s) results in the DAG having 9 steps, but that detail is beyond the scope of this article.&lt;/p&gt;
&lt;p&gt;Over the last 2 years we have created almost 500 DAGs, so we scaled up our Airflow EC2 instances and introduced separate development and production environments. It'd be nice to have a way to tag DAGs by whether they should run on dev or prod, track this in the code/Github, and use the same mechanism to make sure DAGs don't accidentally run in the wrong environment.&lt;/p&gt;
&lt;p&gt;There are about 10 similar convenience functions, such as &lt;code&gt;dag_vertica_create_or_replace_view_as()&lt;/code&gt; and &lt;code&gt;dag_vertica_train_predict_model()&lt;/code&gt;, and we'd like all &lt;strong&gt;calls&lt;/strong&gt; of these &lt;code&gt;dag_xxx()&lt;/code&gt; functions to be switchable between production and development (or skippable everywhere).&lt;/p&gt;
&lt;p&gt;However, the &lt;code&gt;@production&lt;/code&gt; and &lt;code&gt;@development&lt;/code&gt; decorators from the previous section won't work here, because we don't want to switch &lt;code&gt;dag_vertica_create_table_as()&lt;/code&gt; to never run in one of the environments. We want to be able to set this per invocation, and have the feature in all of our &lt;code&gt;dag_xxxx()&lt;/code&gt; functions, without having to copy/paste code. What we want is to add a &lt;code&gt;deploy&lt;/code&gt; parameter to all of our &lt;code&gt;dag_xxxx()&lt;/code&gt; functions (with a good default), so we can just add this parameter in our DAGs for added safety. We can achieve this with the &lt;code&gt;@deployable&lt;/code&gt; decorator:&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;deployable&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;func&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
    &lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;inner&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="n"&gt;args&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="o"&gt;**&lt;/span&gt;&lt;span class="n"&gt;kwargs&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
        &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="s1"&gt;&amp;#39;deploy&amp;#39;&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;kwargs&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
            &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;kwargs&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;deploy&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;lower&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;production&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s1"&gt;&amp;#39;prod&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="ow"&gt;and&lt;/span&gt; &lt;span class="n"&gt;gethostname&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt; &lt;span class="ow"&gt;not&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;production_servers&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
                &lt;span class="k"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;This host is not a production server, skipping...&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
                &lt;span class="k"&gt;return&lt;/span&gt;
            &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;kwargs&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;deploy&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;lower&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;development&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s1"&gt;&amp;#39;dev&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="ow"&gt;and&lt;/span&gt; &lt;span class="n"&gt;gethostname&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt; &lt;span class="ow"&gt;not&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;development_servers&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
                &lt;span class="k"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;This host is not a development server, skipping...&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
                &lt;span class="k"&gt;return&lt;/span&gt;
            &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;kwargs&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;deploy&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;lower&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;skip&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s1"&gt;&amp;#39;none&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;]:&lt;/span&gt;
                &lt;span class="k"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;Skipping...&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
                &lt;span class="k"&gt;return&lt;/span&gt;
            &lt;span class="k"&gt;del&lt;/span&gt; &lt;span class="n"&gt;kwargs&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;deploy&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="c1"&gt;# to avoid func() throwing an unexpected keyword exception&lt;/span&gt;
        &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;func&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="n"&gt;args&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="o"&gt;**&lt;/span&gt;&lt;span class="n"&gt;kwargs&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;inner&lt;/span&gt;
&lt;/pre&gt;&lt;/div&gt;


&lt;p&gt;Then we can add the decorator to our function definitions (1 line added for each):&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;span class="nd"&gt;@deployable&lt;/span&gt;
&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;dag_vertica_create_table_as&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="o"&gt;...&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt; &lt;span class="c1"&gt;# same signature, nothing changes&lt;/span&gt;
    &lt;span class="o"&gt;...&lt;/span&gt; &lt;span class="c1"&gt;# same code, nothing changes&lt;/span&gt;

&lt;span class="nd"&gt;@deployable&lt;/span&gt;
&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;dag_vertica_create_or_replace_view_as&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="o"&gt;...&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt; &lt;span class="c1"&gt;# same signature, nothing changes&lt;/span&gt;
    &lt;span class="o"&gt;...&lt;/span&gt; &lt;span class="c1"&gt;# same code, nothing changes&lt;/span&gt;

&lt;span class="nd"&gt;@deployable&lt;/span&gt;
&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;dag_vertica_train_predict_model&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="o"&gt;...&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt; &lt;span class="c1"&gt;# same signature, nothing changes&lt;/span&gt;
    &lt;span class="o"&gt;...&lt;/span&gt; &lt;span class="c1"&gt;# same code, nothing changes&lt;/span&gt;
&lt;/pre&gt;&lt;/div&gt;


&lt;p&gt;If we stop here, nothing happens and we don't break anything. However, now we can go to the DAG files where we use these functions and add 1 line:&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;span class="n"&gt;dag&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;dag_vertica_create_table_as&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
    &lt;span class="n"&gt;deploy&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;development&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="c1"&gt;# the function will return None on production&lt;/span&gt;
    &lt;span class="o"&gt;...&lt;/span&gt;
&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/pre&gt;&lt;/div&gt;
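&lt;p&gt;To see the whole mechanism end to end, here is a condensed, self-contained sketch of the same &lt;code&gt;@deployable&lt;/code&gt; logic; &lt;code&gt;gethostname()&lt;/code&gt; is stubbed out and the server lists and &lt;code&gt;dag_example()&lt;/code&gt; are made up so the example runs anywhere:&lt;/p&gt;

```python
from functools import wraps

production_servers = ['prod-1']   # made-up hostnames for illustration
development_servers = ['dev-1']

def gethostname():
    # Stand-in for socket.gethostname() so the example is self-contained.
    return 'dev-1'

def deployable(func):
    @wraps(func)
    def inner(*args, **kwargs):
        # pop() removes 'deploy' so func() never sees an unexpected keyword
        deploy = kwargs.pop('deploy', None)
        if deploy is not None:
            if deploy.lower() in ['production', 'prod'] and gethostname() not in production_servers:
                print('This host is not a production server, skipping...')
                return
            if deploy.lower() in ['development', 'dev'] and gethostname() not in development_servers:
                print('This host is not a development server, skipping...')
                return
            if deploy.lower() in ['skip', 'none']:
                print('Skipping...')
                return
        return func(*args, **kwargs)
    return inner

@deployable
def dag_example(table):
    return f'created {table}'

print(dag_example(table='t1', deploy='development'))  # runs on dev-1
print(dag_example(table='t2', deploy='production'))   # skipped, prints None
```

Note that calls without a &lt;code&gt;deploy&lt;/code&gt; argument behave exactly as before, which is what makes the rollout non-breaking.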


&lt;h2&gt;&lt;code&gt;@redirect&lt;/code&gt; (stdout)&lt;/h2&gt;
&lt;p&gt;Sometimes we write a big function that also calls other code, and all sorts of messages get &lt;code&gt;print()&lt;/code&gt;ed. Or we may be chasing a bug with a bunch of &lt;code&gt;print()&lt;/code&gt;s and want line numbers added to the printouts so it's easier to refer to them. In these cases, &lt;code&gt;@redirect&lt;/code&gt; may be useful. This decorator redirects &lt;code&gt;print()&lt;/code&gt; standard output to our own line-by-line printer, and we can do whatever we'd like with it (including throwing it away):&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;redirect&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;func&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="bp"&gt;None&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;line_print&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;Callable&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="bp"&gt;None&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
    &lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;decorator&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;func&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;Callable&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
        &lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;inner&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="n"&gt;args&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="o"&gt;**&lt;/span&gt;&lt;span class="n"&gt;kwargs&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
            &lt;span class="k"&gt;with&lt;/span&gt; &lt;span class="n"&gt;StringIO&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt; &lt;span class="k"&gt;as&lt;/span&gt; &lt;span class="n"&gt;buf&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;redirect_stdout&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;buf&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
                &lt;span class="n"&gt;func&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="n"&gt;args&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="o"&gt;**&lt;/span&gt;&lt;span class="n"&gt;kwargs&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
                &lt;span class="n"&gt;output&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;buf&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;getvalue&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
            &lt;span class="n"&gt;lines&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;output&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;splitlines&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
            &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;line_print&lt;/span&gt; &lt;span class="ow"&gt;is&lt;/span&gt; &lt;span class="ow"&gt;not&lt;/span&gt; &lt;span class="bp"&gt;None&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
                &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;line&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;lines&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
                    &lt;span class="n"&gt;line_print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;line&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
            &lt;span class="k"&gt;else&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
                &lt;span class="n"&gt;width&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;floor&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;log&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nb"&gt;len&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;lines&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt; &lt;span class="mi"&gt;10&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt; &lt;span class="o"&gt;+&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt;
                &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;i&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;line&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="nb"&gt;enumerate&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;lines&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
                    &lt;span class="n"&gt;i&lt;/span&gt; &lt;span class="o"&gt;+=&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt;
                    &lt;span class="k"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;f&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;{i:0{width}}: {line}&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
        &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;inner&lt;/span&gt;
    &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;func&lt;/span&gt; &lt;span class="ow"&gt;is&lt;/span&gt; &lt;span class="bp"&gt;None&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
        &lt;span class="c1"&gt;# decorator was used like @redirect(...)&lt;/span&gt;
        &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;decorator&lt;/span&gt;
    &lt;span class="k"&gt;else&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
        &lt;span class="c1"&gt;# decorator was used like @redirect, without parens&lt;/span&gt;
        &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;decorator&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;func&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/pre&gt;&lt;/div&gt;


&lt;p&gt;If we use &lt;code&gt;redirect()&lt;/code&gt; without specifying an explicit &lt;code&gt;line_print()&lt;/code&gt; function, it will print the lines, but with line numbers added:&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;span class="nd"&gt;@redirect&lt;/span&gt;
&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;print_lines&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;num_lines&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
    &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;i&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="nb"&gt;range&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;num_lines&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
        &lt;span class="k"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;f&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;Line #{i+1}&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="n"&gt;print_lines&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;10&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/pre&gt;&lt;/div&gt;


&lt;p&gt;Output:&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;span class="mi"&gt;01&lt;/span&gt;&lt;span class="o"&gt;:&lt;/span&gt; &lt;span class="n"&gt;Line&lt;/span&gt; &lt;span class="err"&gt;#&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;
&lt;span class="mi"&gt;02&lt;/span&gt;&lt;span class="o"&gt;:&lt;/span&gt; &lt;span class="n"&gt;Line&lt;/span&gt; &lt;span class="err"&gt;#&lt;/span&gt;&lt;span class="mi"&gt;2&lt;/span&gt;
&lt;span class="mi"&gt;03&lt;/span&gt;&lt;span class="o"&gt;:&lt;/span&gt; &lt;span class="n"&gt;Line&lt;/span&gt; &lt;span class="err"&gt;#&lt;/span&gt;&lt;span class="mi"&gt;3&lt;/span&gt;
&lt;span class="mi"&gt;04&lt;/span&gt;&lt;span class="o"&gt;:&lt;/span&gt; &lt;span class="n"&gt;Line&lt;/span&gt; &lt;span class="err"&gt;#&lt;/span&gt;&lt;span class="mi"&gt;4&lt;/span&gt;
&lt;span class="mi"&gt;05&lt;/span&gt;&lt;span class="o"&gt;:&lt;/span&gt; &lt;span class="n"&gt;Line&lt;/span&gt; &lt;span class="err"&gt;#&lt;/span&gt;&lt;span class="mi"&gt;5&lt;/span&gt;
&lt;span class="mi"&gt;06&lt;/span&gt;&lt;span class="o"&gt;:&lt;/span&gt; &lt;span class="n"&gt;Line&lt;/span&gt; &lt;span class="err"&gt;#&lt;/span&gt;&lt;span class="mi"&gt;6&lt;/span&gt;
&lt;span class="mi"&gt;07&lt;/span&gt;&lt;span class="o"&gt;:&lt;/span&gt; &lt;span class="n"&gt;Line&lt;/span&gt; &lt;span class="err"&gt;#&lt;/span&gt;&lt;span class="mi"&gt;7&lt;/span&gt;
&lt;span class="mi"&gt;08&lt;/span&gt;&lt;span class="o"&gt;:&lt;/span&gt; &lt;span class="n"&gt;Line&lt;/span&gt; &lt;span class="err"&gt;#&lt;/span&gt;&lt;span class="mi"&gt;8&lt;/span&gt;
&lt;span class="mi"&gt;09&lt;/span&gt;&lt;span class="o"&gt;:&lt;/span&gt; &lt;span class="n"&gt;Line&lt;/span&gt; &lt;span class="err"&gt;#&lt;/span&gt;&lt;span class="mi"&gt;9&lt;/span&gt;
&lt;span class="mi"&gt;10&lt;/span&gt;&lt;span class="o"&gt;:&lt;/span&gt; &lt;span class="n"&gt;Line&lt;/span&gt; &lt;span class="err"&gt;#&lt;/span&gt;&lt;span class="mi"&gt;10&lt;/span&gt;
&lt;/pre&gt;&lt;/div&gt;


&lt;p&gt;If we want to save all printed text to a variable, we can also achieve that:&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;span class="n"&gt;lines&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[]&lt;/span&gt;
&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;save_lines&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;line&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
    &lt;span class="n"&gt;lines&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;append&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;line&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="nd"&gt;@redirect&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;line_print&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;save_lines&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;print_lines&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;num_lines&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
    &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;i&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="nb"&gt;range&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;num_lines&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
        &lt;span class="k"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;f&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;Line #{i+1}&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="n"&gt;print_lines&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;3&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="k"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;lines&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/pre&gt;&lt;/div&gt;


&lt;p&gt;Output:&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;[&amp;#39;Line #1&amp;#39;, &amp;#39;Line #2&amp;#39;, &amp;#39;Line #3&amp;#39;]
&lt;/pre&gt;&lt;/div&gt;


&lt;p&gt;The actual heavy lifting of &lt;a href="https://docs.python.org/3/library/contextlib.html#contextlib.redirect_stdout"&gt;redirecting stdout is done by &lt;code&gt;contextlib.redirect_stdout&lt;/code&gt;&lt;/a&gt;, as shown in &lt;a href="https://stackoverflow.com/questions/1218933/can-i-redirect-the-stdout-into-some-sort-of-string-buffer"&gt;this StackOverflow thread&lt;/a&gt;.&lt;/p&gt;
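&lt;p&gt;If you only need the capture and not the decorator machinery, &lt;code&gt;contextlib.redirect_stdout&lt;/code&gt; can be used directly:&lt;/p&gt;

```python
from contextlib import redirect_stdout
from io import StringIO

# Everything print()ed inside the with-block goes into the buffer,
# not to the terminal.
buf = StringIO()
with redirect_stdout(buf):
    print('hello')
    print('world')

captured = buf.getvalue().splitlines()
print(captured)  # prints ['hello', 'world']
```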
&lt;h2&gt;&lt;code&gt;@stacktrace&lt;/code&gt;&lt;/h2&gt;
&lt;p&gt;The next decorator pattern is &lt;code&gt;@stacktrace&lt;/code&gt;, which emits useful messages when functions are called and values are returned from functions:&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;stacktrace&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;func&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="bp"&gt;None&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;exclude_files&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;anaconda&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;]):&lt;/span&gt;
    &lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;tracer_func&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;frame&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;event&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;arg&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
        &lt;span class="n"&gt;co&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;frame&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;f_code&lt;/span&gt;
        &lt;span class="n"&gt;func_name&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;co&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;co_name&lt;/span&gt;
        &lt;span class="n"&gt;caller_filename&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;frame&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;f_back&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;f_code&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;co_filename&lt;/span&gt;
        &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;func_name&lt;/span&gt; &lt;span class="o"&gt;==&lt;/span&gt; &lt;span class="s1"&gt;&amp;#39;write&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
            &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="c1"&gt;# ignore write() calls from print statements&lt;/span&gt;
        &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="nb"&gt;file&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;exclude_files&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
            &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="nb"&gt;file&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;caller_filename&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
                &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="c1"&gt;# ignore in ipython notebooks&lt;/span&gt;
        &lt;span class="n"&gt;args&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nb"&gt;str&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nb"&gt;tuple&lt;/span&gt;&lt;span class="p"&gt;([&lt;/span&gt;&lt;span class="n"&gt;frame&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;f_locals&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;arg&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;arg&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;frame&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;f_code&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;co_varnames&lt;/span&gt;&lt;span class="p"&gt;]))&lt;/span&gt;
        &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;args&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;endswith&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;,)&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
            &lt;span class="n"&gt;args&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;args&lt;/span&gt;&lt;span class="p"&gt;[:&lt;/span&gt;&lt;span class="o"&gt;-&lt;/span&gt;&lt;span class="mi"&gt;2&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="o"&gt;+&lt;/span&gt; &lt;span class="s1"&gt;&amp;#39;)&amp;#39;&lt;/span&gt;
        &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;event&lt;/span&gt; &lt;span class="o"&gt;==&lt;/span&gt; &lt;span class="s1"&gt;&amp;#39;call&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
            &lt;span class="k"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;f&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;--&amp;gt; Executing: {func_name}{args}&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
            &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;tracer_func&lt;/span&gt;
        &lt;span class="k"&gt;elif&lt;/span&gt; &lt;span class="n"&gt;event&lt;/span&gt; &lt;span class="o"&gt;==&lt;/span&gt; &lt;span class="s1"&gt;&amp;#39;return&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
            &lt;span class="k"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;f&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;--&amp;gt; Returning: {func_name}{args} -&amp;gt; {repr(arg)}&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
        &lt;span class="k"&gt;return&lt;/span&gt;
    &lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;decorator&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;func&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;Callable&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
        &lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;inner&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="n"&gt;args&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="o"&gt;**&lt;/span&gt;&lt;span class="n"&gt;kwargs&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
            &lt;span class="n"&gt;settrace&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;tracer_func&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
            &lt;span class="n"&gt;func&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="n"&gt;args&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="o"&gt;**&lt;/span&gt;&lt;span class="n"&gt;kwargs&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
            &lt;span class="n"&gt;settrace&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="bp"&gt;None&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
        &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;inner&lt;/span&gt;
    &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;func&lt;/span&gt; &lt;span class="ow"&gt;is&lt;/span&gt; &lt;span class="bp"&gt;None&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
        &lt;span class="c1"&gt;# decorator was used like @stacktrace(...)&lt;/span&gt;
        &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;decorator&lt;/span&gt;
    &lt;span class="k"&gt;else&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
        &lt;span class="c1"&gt;# decorator was used like @stacktrace, without parens&lt;/span&gt;
        &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;decorator&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;func&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/pre&gt;&lt;/div&gt;


&lt;p&gt;With this, we can decorate the topmost function where we want tracing to start, and we get useful output about the calls and returns beneath it:&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;def b():
    print(&amp;#39;...&amp;#39;)

@stacktrace
def a(arg):
    print(arg)
    b()
    return &amp;#39;world&amp;#39;

a(&amp;#39;foo&amp;#39;)
&lt;/pre&gt;&lt;/div&gt;


&lt;p&gt;Output:&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;--&amp;gt; Executing: a(&amp;#39;foo&amp;#39;)
foo
--&amp;gt; Executing: b()
...
--&amp;gt; Returning: b() -&amp;gt; None
--&amp;gt; Returning: a(&amp;#39;foo&amp;#39;) -&amp;gt; &amp;#39;world&amp;#39;
&lt;/pre&gt;&lt;/div&gt;


&lt;p&gt;The only trick here is hiding the parts of the callstack which are not interesting. In my case, I'm running this code in ipython on Anaconda, so I hide callstack entries where the code lives in a file that has &lt;code&gt;anaconda&lt;/code&gt; in its path (otherwise the snippet above would produce 50-100 useless callstack entries). This is accomplished by the &lt;code&gt;exclude_files&lt;/code&gt; parameter of the decorator.&lt;/p&gt;
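&lt;p&gt;The filtering idea can be demonstrated in isolation. Below is a sketch (not the decorator above; the names and the recording list are illustrative) that installs a trace function with &lt;code&gt;sys.settrace&lt;/code&gt; and inspects the caller's &lt;code&gt;co_filename&lt;/code&gt; the same way:&lt;/p&gt;

```python
import sys

calls = []

def tracer(frame, event, arg):
    if event == 'call':
        # same filtering idea as exclude_files: look at the caller's filename
        caller = frame.f_back.f_code.co_filename if frame.f_back else ''
        if 'anaconda' not in caller:
            calls.append(frame.f_code.co_name)
    return None

def helper():
    return 42

sys.settrace(tracer)
helper()
sys.settrace(None)
# calls now includes 'helper'
```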
&lt;h2&gt;&lt;code&gt;@traceclass&lt;/code&gt;&lt;/h2&gt;
&lt;p&gt;Similarly to the above, we can define a decorator &lt;code&gt;@traceclass&lt;/code&gt; to use with classes, to get traces of their member functions' execution. This was included in &lt;a href="https://bytepawn.com/python-decorator-patterns.html"&gt;the previous decorator post&lt;/a&gt;; there it was just called &lt;code&gt;@trace&lt;/code&gt; and had a bug (since fixed in the original post). The decorator:&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;traceclass&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="bp"&gt;cls&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;type&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
    &lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;make_traced&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="bp"&gt;cls&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;type&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;method_name&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;str&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;method&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;Callable&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
        &lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;traced_method&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="n"&gt;args&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="o"&gt;**&lt;/span&gt;&lt;span class="n"&gt;kwargs&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
            &lt;span class="k"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;f&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;--&amp;gt; Executing: {cls.__name__}::{method_name}()&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
            &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;method&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="n"&gt;args&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="o"&gt;**&lt;/span&gt;&lt;span class="n"&gt;kwargs&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
        &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;traced_method&lt;/span&gt;
    &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;name&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="bp"&gt;cls&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="vm"&gt;__dict__&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;keys&lt;/span&gt;&lt;span class="p"&gt;():&lt;/span&gt;
        &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="nb"&gt;callable&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nb"&gt;getattr&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="bp"&gt;cls&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;name&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt; &lt;span class="ow"&gt;and&lt;/span&gt; &lt;span class="n"&gt;name&lt;/span&gt; &lt;span class="o"&gt;!=&lt;/span&gt; &lt;span class="s1"&gt;&amp;#39;__class__&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
            &lt;span class="nb"&gt;setattr&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="bp"&gt;cls&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;name&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;make_traced&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="bp"&gt;cls&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;name&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nb"&gt;getattr&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="bp"&gt;cls&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;name&lt;/span&gt;&lt;span class="p"&gt;)))&lt;/span&gt;
    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="bp"&gt;cls&lt;/span&gt;
&lt;/pre&gt;&lt;/div&gt;


&lt;p&gt;We can use it like:&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;span class="nd"&gt;@traceclass&lt;/span&gt;
&lt;span class="k"&gt;class&lt;/span&gt; &lt;span class="nc"&gt;Foo&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
    &lt;span class="n"&gt;i&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;int&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;
    &lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="fm"&gt;__init__&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="bp"&gt;self&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;i&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;int&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
        &lt;span class="bp"&gt;self&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;i&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;i&lt;/span&gt;
    &lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;increment&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="bp"&gt;self&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
        &lt;span class="bp"&gt;self&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;i&lt;/span&gt; &lt;span class="o"&gt;+=&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt;
    &lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="fm"&gt;__str__&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="bp"&gt;self&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
        &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;f&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;This is a {self.__class__.__name__} object with i = {self.i}&amp;#39;&lt;/span&gt;

&lt;span class="n"&gt;f1&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;Foo&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
&lt;span class="n"&gt;f2&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;Foo&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;4&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="n"&gt;f1&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;increment&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
&lt;span class="k"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;f1&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="k"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;f2&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/pre&gt;&lt;/div&gt;


&lt;p&gt;Output:&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;--&amp;gt; Executing: Foo::__init__()
--&amp;gt; Executing: Foo::__init__()
--&amp;gt; Executing: Foo::increment()
--&amp;gt; Executing: Foo::__str__()
This is a Foo object with i = 1
--&amp;gt; Executing: Foo::__str__()
This is a Foo object with i = 4
&lt;/pre&gt;&lt;/div&gt;
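&lt;p&gt;One caveat, not handled by the decorator above: a plain wrapper function replaces the wrapped method's metadata, so e.g. &lt;code&gt;Foo.increment.__name__&lt;/code&gt; becomes &lt;code&gt;traced_method&lt;/code&gt;. The standard fix is &lt;code&gt;functools.wraps&lt;/code&gt;; a minimal sketch with illustrative names:&lt;/p&gt;

```python
import functools

def make_traced(method):
    @functools.wraps(method)  # copy __name__, __doc__, etc. onto the wrapper
    def traced(*args, **kwargs):
        return method(*args, **kwargs)
    return traced

def increment(self):
    self.i += 1

wrapped = make_traced(increment)
# wrapped.__name__ == 'increment' thanks to functools.wraps
```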


&lt;h2&gt;Conclusion&lt;/h2&gt;
&lt;p&gt;In Python, functions are first-class citizens, and decorators are powerful syntactic sugar exploiting this to give programmers a seemingly "magic" way to construct useful compositions of functions and classes. These are 5 decorators that may be useful specifically for Data Scientists working in ipython notebooks.&lt;/p&gt;
&lt;p&gt;Thanks &lt;a href="https://github.com/zsol"&gt;Zsolt&lt;/a&gt; for bugfixes and improvement suggestions.&lt;/p&gt;</content><category term="python"></category><category term="decorators"></category></entry><entry><title>Building a toy Python @dataclass decorator</title><link href="https://bytepawn.com/building-a-toy-python-dataclass-decorator.html" rel="alternate"></link><published>2022-05-12T00:00:00+02:00</published><updated>2022-05-12T00:00:00+02:00</updated><author><name>Marton Trencseni</name></author><id>tag:bytepawn.com,2022-05-12:/building-a-toy-python-dataclass-decorator.html</id><summary type="html">&lt;p&gt;I write a toy implementation of Python's &lt;code&gt;@dataclass&lt;/code&gt; decorator to improve my Python fu and learn more about decorators and metaprogramming.&lt;br/&gt;&lt;br/&gt;&lt;img src="/images/dataclasses.png" alt="Python enum" style="width: 200px;"/&gt;&lt;/p&gt;</summary><content type="html">&lt;h2&gt;Introduction&lt;/h2&gt;
&lt;p&gt;Following the previous article on a &lt;a href="https://bytepawn.com/tag/enum.html"&gt;toy implementation of a Python &lt;code&gt;Enum&lt;/code&gt; class&lt;/a&gt; and the article on &lt;a href="https://bytepawn.com/python-decorator-patterns.html#python-decorator-patterns"&gt;Python decorator patterns&lt;/a&gt;, this time I will write a toy implementation of the built-in &lt;code&gt;@dataclass&lt;/code&gt; decorator. The &lt;a href="https://docs.python.org/3/library/dataclasses.html"&gt;official documentation for dataclasses is here&lt;/a&gt;. The &lt;a href="https://github.com/mtrencseni/playground/blob/master/Dataclass.ipynb"&gt;ipython notebook is up on Github&lt;/a&gt;.&lt;/p&gt;
&lt;h2&gt;&lt;code&gt;@dataclass&lt;/code&gt; features&lt;/h2&gt;
&lt;p&gt;&lt;code&gt;@dataclass&lt;/code&gt; is a very useful feature of the Python standard library. It's a simple way to declare a class with typed variables, and get useful helper functions added to the class "for free":&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="nn"&gt;dataclasses&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt;

&lt;span class="nd"&gt;@dataclass&lt;/span&gt;
&lt;span class="k"&gt;class&lt;/span&gt; &lt;span class="nc"&gt;Fraction&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
    &lt;span class="n"&gt;numerator&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;int&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;
    &lt;span class="n"&gt;denominator&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;int&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt;
&lt;/pre&gt;&lt;/div&gt;


&lt;p&gt;With &lt;code&gt;@dataclass&lt;/code&gt;, we get free constructors for the member attributes. There are 3 ways to get a new &lt;code&gt;Fraction&lt;/code&gt; object:&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;span class="n"&gt;f&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;Fraction&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt; &lt;span class="c1"&gt;# defaults, same as Fraction(0, 1)&lt;/span&gt;
&lt;span class="n"&gt;f&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;Fraction&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;2&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="n"&gt;f&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;Fraction&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;numerator&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;denominator&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mi"&gt;2&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/pre&gt;&lt;/div&gt;


&lt;p&gt;Note that the second and third would not work without &lt;code&gt;@dataclass&lt;/code&gt;; we would get &lt;code&gt;TypeError: Fraction() takes no arguments&lt;/code&gt;. We also get free equality checks, comparing each of the member attributes, like:&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;span class="n"&gt;f&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;Fraction&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;2&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="n"&gt;g&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;Fraction&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;2&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="n"&gt;f&lt;/span&gt; &lt;span class="o"&gt;==&lt;/span&gt; &lt;span class="n"&gt;g&lt;/span&gt; &lt;span class="c1"&gt;# True&lt;/span&gt;
&lt;/pre&gt;&lt;/div&gt;


&lt;p&gt;Note that without &lt;code&gt;@dataclass&lt;/code&gt;, the &lt;code&gt;==&lt;/code&gt; equality would be &lt;code&gt;False&lt;/code&gt;, since plain classes fall back to comparing object identity. We also get a useful string representation:&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;span class="k"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;Fraction&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;2&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt; &lt;span class="c1"&gt;# prints: Fraction(numerator=1, denominator=2)&lt;/span&gt;
&lt;/pre&gt;&lt;/div&gt;


&lt;p&gt;By specifying &lt;code&gt;order=True&lt;/code&gt; in the &lt;code&gt;@dataclass&lt;/code&gt; decorator, we also get free &lt;code&gt;&amp;lt;, &amp;lt;=, &amp;gt;, &amp;gt;=&lt;/code&gt; comparisons:&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;span class="nd"&gt;@dataclass&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;order&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="bp"&gt;True&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="k"&gt;class&lt;/span&gt; &lt;span class="nc"&gt;Fraction&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
    &lt;span class="n"&gt;numerator&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;int&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;
    &lt;span class="n"&gt;denominator&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;int&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt;

&lt;span class="n"&gt;f&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;Fraction&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;2&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="n"&gt;g&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;Fraction&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;2&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;3&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="n"&gt;f&lt;/span&gt; &lt;span class="o"&gt;&amp;lt;=&lt;/span&gt; &lt;span class="n"&gt;g&lt;/span&gt; &lt;span class="c1"&gt;# True&lt;/span&gt;
&lt;/pre&gt;&lt;/div&gt;


&lt;p&gt;The free comparison functions aren't very useful for the &lt;code&gt;Fraction&lt;/code&gt; example: the generated ordering compares objects as tuples of their fields, so eg. &lt;code&gt;f &amp;lt; g&lt;/code&gt; is the same as &lt;code&gt;(f.numerator, f.denominator) &amp;lt; (g.numerator, g.denominator)&lt;/code&gt;, a lexicographic comparison, which mathematically isn't the right expression for fractions.&lt;/p&gt;
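&lt;p&gt;To make the mismatch concrete, here is a quick check: the generated ordering says &lt;code&gt;Fraction(1, 2)&lt;/code&gt; comes before &lt;code&gt;Fraction(2, 8)&lt;/code&gt;, even though mathematically 1/2 is greater than 2/8:&lt;/p&gt;

```python
from dataclasses import dataclass

@dataclass(order=True)
class Fraction:
    numerator: int = 0
    denominator: int = 1

# order=True compares the fields as a tuple, lexicographically
assert (Fraction(1, 2) < Fraction(2, 8)) == ((1, 2) < (2, 8))
# ...but mathematically 1/2 = 0.5 is greater than 2/8 = 0.25
assert Fraction(1, 2) < Fraction(2, 8)
```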
&lt;p&gt;By specifying &lt;code&gt;frozen=True&lt;/code&gt; in the decorator arguments, we get read-only objects, which are useful in eg. multithreading:&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;span class="nd"&gt;@dataclass&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;frozen&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="bp"&gt;True&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="k"&gt;class&lt;/span&gt; &lt;span class="nc"&gt;Fraction&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
    &lt;span class="n"&gt;numerator&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;int&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;
    &lt;span class="n"&gt;denominator&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;int&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt;

&lt;span class="n"&gt;f&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;Fraction&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;2&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="n"&gt;f&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;numerator&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;2&lt;/span&gt; &lt;span class="c1"&gt;# FrozenInstanceError: cannot assign to field &amp;#39;numerator&amp;#39;&lt;/span&gt;
&lt;/pre&gt;&lt;/div&gt;


&lt;p&gt;Finally, the &lt;code&gt;dataclasses&lt;/code&gt; module has 2 useful top-level functions, &lt;code&gt;asdict()&lt;/code&gt; and &lt;code&gt;astuple()&lt;/code&gt;:&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;span class="n"&gt;asdict&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;Fraction&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;2&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;  &lt;span class="c1"&gt;# {&amp;#39;numerator&amp;#39;: 1, &amp;#39;denominator&amp;#39;: 2}&lt;/span&gt;
&lt;span class="n"&gt;astuple&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;Fraction&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;2&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt; &lt;span class="c1"&gt;# (1, 2)&lt;/span&gt;
&lt;/pre&gt;&lt;/div&gt;


&lt;p&gt;The Python standard library &lt;code&gt;@dataclass&lt;/code&gt; has a lot more features, but these are some of the most frequently used ones. Let's practice our Python fu and create a toy implementation of &lt;code&gt;@dataclass&lt;/code&gt;. As seen before, the general skeleton will be:&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;dataclass&lt;/span&gt;&lt;span class="p"&gt;():&lt;/span&gt;
    &lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;decorator&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="bp"&gt;cls&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
        &lt;span class="c1"&gt;# add useful features to cls&lt;/span&gt;
        &lt;span class="o"&gt;...&lt;/span&gt;
        &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="bp"&gt;cls&lt;/span&gt;
    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;decorator&lt;/span&gt;
&lt;/pre&gt;&lt;/div&gt;


&lt;h2&gt;Finding the annotated member attributes&lt;/h2&gt;
&lt;p&gt;First we have to locate the annotated attributes declared by the user in the class object. The trick is telling them apart from the built-in functions and attributes, and from any user-defined functions. We can look at &lt;code&gt;cls.__dict__&lt;/code&gt; and &lt;code&gt;dir(cls)&lt;/code&gt;:&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;span class="k"&gt;class&lt;/span&gt; &lt;span class="nc"&gt;Fraction&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
    &lt;span class="n"&gt;numerator&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;int&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;
    &lt;span class="n"&gt;denominator&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;int&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt;
    &lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;mul&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="bp"&gt;self&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;x&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;int&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
        &lt;span class="bp"&gt;self&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;numerator&lt;/span&gt; &lt;span class="o"&gt;*=&lt;/span&gt; &lt;span class="n"&gt;x&lt;/span&gt;

&lt;span class="k"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;Fraction&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="vm"&gt;__dict__&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="k"&gt;print&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
&lt;span class="k"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nb"&gt;dir&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;Fraction&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;
&lt;/pre&gt;&lt;/div&gt;


&lt;p&gt;Output:&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;{&amp;#39;__module__&amp;#39;: &amp;#39;__main__&amp;#39;, &amp;#39;__annotations__&amp;#39;: {&amp;#39;numerator&amp;#39;: &amp;lt;class &amp;#39;int&amp;#39;&amp;gt;, &amp;#39;denominator&amp;#39;: &amp;lt;class &amp;#39;int&amp;#39;&amp;gt;}, &amp;#39;numerator&amp;#39;: 0, &amp;#39;denominator&amp;#39;: 1, &amp;#39;__dict__&amp;#39;: &amp;lt;attribute &amp;#39;__dict__&amp;#39; of &amp;#39;Fraction&amp;#39; objects&amp;gt;, &amp;#39;__weakref__&amp;#39;: &amp;lt;attribute &amp;#39;__weakref__&amp;#39; of &amp;#39;Fraction&amp;#39; objects&amp;gt;, &amp;#39;__doc__&amp;#39;: None}

[&amp;#39;__annotations__&amp;#39;, &amp;#39;__class__&amp;#39;, &amp;#39;__delattr__&amp;#39;, &amp;#39;__dict__&amp;#39;, &amp;#39;__dir__&amp;#39;, &amp;#39;__doc__&amp;#39;, &amp;#39;__eq__&amp;#39;, &amp;#39;__format__&amp;#39;, &amp;#39;__ge__&amp;#39;, &amp;#39;__getattribute__&amp;#39;, &amp;#39;__gt__&amp;#39;, &amp;#39;__hash__&amp;#39;, &amp;#39;__init__&amp;#39;, &amp;#39;__init_subclass__&amp;#39;, &amp;#39;__le__&amp;#39;, &amp;#39;__lt__&amp;#39;, &amp;#39;__module__&amp;#39;, &amp;#39;__ne__&amp;#39;, &amp;#39;__new__&amp;#39;, &amp;#39;__reduce__&amp;#39;, &amp;#39;__reduce_ex__&amp;#39;, &amp;#39;__repr__&amp;#39;, &amp;#39;__setattr__&amp;#39;, &amp;#39;__sizeof__&amp;#39;, &amp;#39;__str__&amp;#39;, &amp;#39;__subclasshook__&amp;#39;, &amp;#39;__weakref__&amp;#39;, &amp;#39;denominator&amp;#39;, &amp;#39;numerator&amp;#39;]
&lt;/pre&gt;&lt;/div&gt;


&lt;p&gt;Long story short, the right place to look is &lt;code&gt;cls.__annotations__&lt;/code&gt;:&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;span class="k"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;Fraction&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;__annotations__&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/pre&gt;&lt;/div&gt;


&lt;p&gt;Output:&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;{&amp;#39;numerator&amp;#39;: &amp;lt;class &amp;#39;int&amp;#39;&amp;gt;, &amp;#39;denominator&amp;#39;: &amp;lt;class &amp;#39;int&amp;#39;&amp;gt;}
&lt;/pre&gt;&lt;/div&gt;
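&lt;p&gt;A quick check that this cleanly separates the data attributes from methods: &lt;code&gt;__annotations__&lt;/code&gt; only contains the annotated class variables, so a method like &lt;code&gt;mul()&lt;/code&gt; does not show up even though its parameters are annotated:&lt;/p&gt;

```python
class Fraction:
    numerator: int = 0
    denominator: int = 1
    def mul(self, x: int):
        self.numerator *= x

# only the annotated class variables appear; method annotations live on
# the method object itself (Fraction.mul.__annotations__), not on the class
assert list(Fraction.__annotations__) == ['numerator', 'denominator']
```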


&lt;h2&gt;Building the constructor&lt;/h2&gt;
&lt;p&gt;Now that we've found the attributes, let's build the constructor:&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;dataclass&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="o"&gt;...&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
    &lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;decorator&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="bp"&gt;cls&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
        &lt;span class="bp"&gt;cls&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;_attribs&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="bp"&gt;cls&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;__annotations__&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;keys&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
        &lt;span class="c1"&gt;# define initializer, unless defined by the user&lt;/span&gt;
        &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="s1"&gt;&amp;#39;__init__&amp;#39;&lt;/span&gt; &lt;span class="ow"&gt;not&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="bp"&gt;cls&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="vm"&gt;__dict__&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
            &lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="fm"&gt;__init__&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="bp"&gt;self&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="n"&gt;args&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="o"&gt;**&lt;/span&gt;&lt;span class="n"&gt;kwargs&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
                &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="nb"&gt;len&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;args&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;&amp;gt;&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
                    &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;attrib&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;arg&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="nb"&gt;zip&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="bp"&gt;self&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="vm"&gt;__class__&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;_attribs&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;args&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
                        &lt;span class="c1"&gt;# avoid our own __setattr__ in case it&amp;#39;s a frozen dataclass:&lt;/span&gt;
                        &lt;span class="nb"&gt;object&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="fm"&gt;__setattr__&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="bp"&gt;self&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;attrib&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;arg&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;                       
                &lt;span class="k"&gt;elif&lt;/span&gt; &lt;span class="nb"&gt;len&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;kwargs&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;&amp;gt;&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
                    &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;attrib&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="bp"&gt;self&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="vm"&gt;__class__&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;_attribs&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
                        &lt;span class="c1"&gt;# avoid our own __setattr__ in case it&amp;#39;s a frozen dataclass:&lt;/span&gt;
                        &lt;span class="nb"&gt;object&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="fm"&gt;__setattr__&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="bp"&gt;self&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;attrib&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;kwargs&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;attrib&lt;/span&gt;&lt;span class="p"&gt;])&lt;/span&gt;
            &lt;span class="bp"&gt;cls&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="fm"&gt;__init__&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="fm"&gt;__init__&lt;/span&gt;
        &lt;span class="o"&gt;...&lt;/span&gt;
        &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="bp"&gt;cls&lt;/span&gt;
&lt;/pre&gt;&lt;/div&gt;


&lt;p&gt;First we save the attribute keys in &lt;code&gt;cls._attribs&lt;/code&gt;, so we can conveniently access them in the subsequent functions.&lt;/p&gt;
&lt;p&gt;We check if the user explicitly defined their own constructor with &lt;code&gt;if '__init__' not in cls.__dict__&lt;/code&gt;. If not, we define the function and assign it with &lt;code&gt;cls.__init__ = __init__&lt;/code&gt; at the end. The constructor function itself takes &lt;code&gt;*args&lt;/code&gt; and &lt;code&gt;**kwargs&lt;/code&gt;. If the user calls it like &lt;code&gt;Fraction(1, 2)&lt;/code&gt;, then &lt;code&gt;args&lt;/code&gt; will be &lt;code&gt;(1, 2)&lt;/code&gt; and &lt;code&gt;kwargs&lt;/code&gt; will be an empty dict &lt;code&gt;{}&lt;/code&gt;. If the user calls it like &lt;code&gt;Fraction(numerator=1, denominator=2)&lt;/code&gt;, then &lt;code&gt;args&lt;/code&gt; will be &lt;code&gt;()&lt;/code&gt; and &lt;code&gt;kwargs&lt;/code&gt; will be a dict like &lt;code&gt;{'numerator': 1, 'denominator': 2}&lt;/code&gt;. So we just need to take these and assign them to the appropriate member variables with &lt;code&gt;object.__setattr__()&lt;/code&gt;, to avoid going through &lt;code&gt;cls.__setattr__()&lt;/code&gt; in case this is a frozen dataclass (see later).&lt;/p&gt;
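<p>As a sanity check, the standard library's <code>dataclasses.dataclass</code> accepts the same two calling conventions our toy <code>__init__</code> handles, so both spellings construct equal objects:</p>

```python
from dataclasses import dataclass

@dataclass
class Fraction:
    numerator: int
    denominator: int

# positional call: inside our toy __init__, args == (1, 2), kwargs == {}
a = Fraction(1, 2)
# keyword call: args == (), kwargs == {'numerator': 1, 'denominator': 2}
b = Fraction(numerator=1, denominator=2)
assert a == b
```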
&lt;h2&gt;String conversion&lt;/h2&gt;
&lt;p&gt;String conversion is relatively straightforward:&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;    &lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;decorator&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="bp"&gt;cls&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
        &lt;span class="bp"&gt;cls&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;_attribs&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="bp"&gt;cls&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;__annotations__&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;keys&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
        &lt;span class="o"&gt;...&lt;/span&gt;
        &lt;span class="c1"&gt;# define string conversion, unless defined by th user&lt;/span&gt;
        &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="s1"&gt;&amp;#39;__str__&amp;#39;&lt;/span&gt; &lt;span class="ow"&gt;not&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="bp"&gt;cls&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="vm"&gt;__dict__&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
            &lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="fm"&gt;__str__&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="bp"&gt;self&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
                &lt;span class="n"&gt;kv_tuples&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[(&lt;/span&gt;&lt;span class="n"&gt;attrib&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nb"&gt;getattr&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="bp"&gt;self&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;attrib&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt; &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;attrib&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="bp"&gt;self&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="vm"&gt;__class__&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;_attribs&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;
                &lt;span class="n"&gt;kv_str&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="s1"&gt;&amp;#39;, &amp;#39;&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;join&lt;/span&gt;&lt;span class="p"&gt;([&lt;/span&gt;&lt;span class="n"&gt;f&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;{k}={v}&amp;#39;&lt;/span&gt; &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;k&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;v&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;kv_tuples&lt;/span&gt;&lt;span class="p"&gt;])&lt;/span&gt;
                &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;f&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;{self.__class__.__name__}({kv_str})&amp;#39;&lt;/span&gt;
            &lt;span class="bp"&gt;cls&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="fm"&gt;__str__&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="fm"&gt;__str__&lt;/span&gt;
&lt;/pre&gt;&lt;/div&gt;
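<p>The generated format matches what the standard-library decorator produces; there, the generated method is <code>__repr__()</code>, which <code>str()</code> falls back to when no <code>__str__()</code> is defined:</p>

```python
from dataclasses import dataclass

@dataclass
class Fraction:
    numerator: int
    denominator: int

# the generated __repr__ prints ClassName(field=value, ...)
assert str(Fraction(1, 2)) == 'Fraction(numerator=1, denominator=2)'
```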


&lt;h2&gt;Equality and comparators&lt;/h2&gt;
&lt;p&gt;Next, let's write equality and the comparison operators. These can be switched on and off by passing the appropriate arguments to the decorator, so we have to check the corresponding flags:&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;dataclass&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="o"&gt;...&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="o"&gt;**&lt;/span&gt;&lt;span class="n"&gt;kwargs&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
    &lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;decorator&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="bp"&gt;cls&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
        &lt;span class="o"&gt;...&lt;/span&gt;
        &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;kwargs&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;get&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;eq&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="bp"&gt;True&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
            &lt;span class="c1"&gt;# define ==, unless defined by the user&lt;/span&gt;
            &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="s1"&gt;&amp;#39;__eq__&amp;#39;&lt;/span&gt; &lt;span class="ow"&gt;not&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="bp"&gt;cls&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="vm"&gt;__dict__&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
                &lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="fm"&gt;__eq__&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="bp"&gt;self&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;other&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
                    &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;attrib&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="bp"&gt;self&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="vm"&gt;__class__&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;_attribs&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
                        &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="nb"&gt;getattr&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="bp"&gt;self&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;attrib&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;!=&lt;/span&gt; &lt;span class="nb"&gt;getattr&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;other&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;attrib&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
                            &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="bp"&gt;False&lt;/span&gt;
                    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="bp"&gt;True&lt;/span&gt;            
                &lt;span class="bp"&gt;cls&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="fm"&gt;__eq__&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="fm"&gt;__eq__&lt;/span&gt;
&lt;/pre&gt;&lt;/div&gt;


&lt;p&gt;The operators &lt;code&gt;&amp;lt;, &amp;lt;=, &amp;gt;, &amp;gt;=&lt;/code&gt; follow the same logic; here we show just the first:&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;        &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;kwargs&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;get&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;order&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="bp"&gt;False&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
            &lt;span class="c1"&gt;# define &amp;lt;, &amp;lt;=, &amp;gt;, &amp;gt;=, unless defined by the user&lt;/span&gt;
            &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="s1"&gt;&amp;#39;__lt__&amp;#39;&lt;/span&gt; &lt;span class="ow"&gt;not&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="bp"&gt;cls&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="vm"&gt;__dict__&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
                &lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="fm"&gt;__lt__&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="bp"&gt;self&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;other&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
                    &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;attrib&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="bp"&gt;self&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="vm"&gt;__class__&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;_attribs&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
                        &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="nb"&gt;getattr&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="bp"&gt;self&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;attrib&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;&amp;gt;=&lt;/span&gt; &lt;span class="nb"&gt;getattr&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;other&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;attrib&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
                            &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="bp"&gt;False&lt;/span&gt;
                    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="bp"&gt;True&lt;/span&gt;            
                &lt;span class="bp"&gt;cls&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="fm"&gt;__lt__&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="fm"&gt;__lt__&lt;/span&gt;
        &lt;span class="o"&gt;...&lt;/span&gt;
&lt;/pre&gt;&lt;/div&gt;
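<p>One subtlety: this per-field check implements a product order (every field must be strictly smaller), whereas the standard-library decorator compares the fields as tuples, i.e. lexicographically. The two disagree when the fields point in different directions:</p>

```python
from dataclasses import dataclass

@dataclass(order=True)
class Fraction:
    numerator: int
    denominator: int

# stdlib: (1, 3) < (2, 2) is True, decided by the first field alone;
# the toy __lt__ above would return False here, since 3 >= 2
assert Fraction(1, 3) < Fraction(2, 2)
```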


&lt;h2&gt;Frozen&lt;/h2&gt;
&lt;p&gt;If &lt;code&gt;frozen=True&lt;/code&gt; is passed to the decorator, we have to disallow assignment to attributes, i.e. &lt;code&gt;f.numerator = 1&lt;/code&gt; should raise an exception. When we write &lt;code&gt;f.numerator = 1&lt;/code&gt;, internally this becomes &lt;code&gt;Fraction.__setattr__(f, 'numerator', 1)&lt;/code&gt;. So to disallow this, we just have to define our own &lt;code&gt;__setattr__()&lt;/code&gt;:&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;        &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;kwargs&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;get&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;frozen&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="bp"&gt;False&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
            &lt;span class="c1"&gt;# don&amp;#39;t allow changing attributes&lt;/span&gt;
            &lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="fm"&gt;__setattr__&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="bp"&gt;self&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;attrib&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;value&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
                &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;attrib&lt;/span&gt; &lt;span class="ow"&gt;not&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="bp"&gt;self&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="vm"&gt;__class__&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;_attribs&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
                    &lt;span class="nb"&gt;setattr&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="bp"&gt;self&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;attrib&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;value&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
                &lt;span class="k"&gt;else&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
                    &lt;span class="k"&gt;raise&lt;/span&gt; &lt;span class="ne"&gt;AttributeError&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;f&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;dataclass is frozen, cannot assign to field &lt;/span&gt;&lt;span class="se"&gt;\&amp;#39;&lt;/span&gt;&lt;span class="s1"&gt;{attrib}&lt;/span&gt;&lt;span class="se"&gt;\&amp;#39;&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
            &lt;span class="bp"&gt;cls&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="fm"&gt;__setattr__&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="fm"&gt;__setattr__&lt;/span&gt;
&lt;/pre&gt;&lt;/div&gt;
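<p>The standard-library decorator enforces the same guard; there, assignment to a field of a frozen instance raises <code>FrozenInstanceError</code>, a subclass of <code>AttributeError</code>:</p>

```python
from dataclasses import dataclass, FrozenInstanceError

@dataclass(frozen=True)
class Fraction:
    numerator: int
    denominator: int

f = Fraction(1, 2)
try:
    f.numerator = 3   # goes through the generated __setattr__
    raised = False
except FrozenInstanceError:
    raised = True
assert raised
assert f.numerator == 1   # the instance is unchanged
```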


&lt;h2&gt;&lt;code&gt;@dataclass&lt;/code&gt; vs &lt;code&gt;@dataclass(...)&lt;/code&gt;&lt;/h2&gt;
&lt;p&gt;We need one final trick. When using this decorator, we want to allow users to write both &lt;code&gt;@dataclass&lt;/code&gt; and &lt;code&gt;@dataclass(...)&lt;/code&gt;. In the first case, the class after the decorator gets passed directly to the &lt;code&gt;dataclass&lt;/code&gt; function. In the second case, &lt;code&gt;dataclass()&lt;/code&gt; is first called with the arguments (without the following class), and that must return a function which accepts the class. We want to support both behaviours in a single decorator function. The solution is quite simple:&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;dataclass&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="bp"&gt;cls&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="bp"&gt;None&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="o"&gt;**&lt;/span&gt;&lt;span class="n"&gt;kwargs&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
    &lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;decorator&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="bp"&gt;cls&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
        &lt;span class="bp"&gt;cls&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;_attribs&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="bp"&gt;cls&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;__annotations__&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;keys&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
        &lt;span class="o"&gt;...&lt;/span&gt;
        &lt;span class="c1"&gt;# this is where the code shown above is&lt;/span&gt;
        &lt;span class="o"&gt;...&lt;/span&gt;
        &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="bp"&gt;cls&lt;/span&gt;
    &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="bp"&gt;cls&lt;/span&gt; &lt;span class="ow"&gt;is&lt;/span&gt; &lt;span class="bp"&gt;None&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
        &lt;span class="c1"&gt;# decorator was used like @dataclass(...)&lt;/span&gt;
        &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;decorator&lt;/span&gt;
    &lt;span class="k"&gt;else&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
        &lt;span class="c1"&gt;# decorator was used like @dataclass, without parens&lt;/span&gt;
        &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;decorator&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="bp"&gt;cls&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/pre&gt;&lt;/div&gt;
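<p>The same <code>cls=None</code> trick works for any class decorator. Here is a minimal stand-alone sketch; the decorator name <code>tag</code> and the <code>_label</code> attribute are made up for illustration:</p>

```python
def tag(cls=None, **kwargs):
    label = kwargs.get('label', 'default')
    def decorator(cls):
        cls._label = label
        return cls
    if cls is None:
        # used as @tag(...): called with kwargs only,
        # so return the function that will accept the class
        return decorator
    # used as @tag without parens: cls is the class itself
    return decorator(cls)

@tag
class A:
    pass

@tag(label='special')
class B:
    pass

assert A._label == 'default'
assert B._label == 'special'
```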


&lt;h2&gt;Complete toy implementation&lt;/h2&gt;
&lt;p&gt;That's it! The complete toy implementation, including &lt;code&gt;asdict()&lt;/code&gt; and &lt;code&gt;astuple()&lt;/code&gt;, is:&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;asdict&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;x&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="n"&gt;a&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="nb"&gt;getattr&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;x&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;a&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;a&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;x&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="vm"&gt;__class__&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;_attribs&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;

&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;astuple&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;x&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="nb"&gt;tuple&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nb"&gt;getattr&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;x&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;a&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;a&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;x&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="vm"&gt;__class__&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;_attribs&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;dataclass&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="bp"&gt;cls&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="bp"&gt;None&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="o"&gt;**&lt;/span&gt;&lt;span class="n"&gt;kwargs&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
    &lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;decorator&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="bp"&gt;cls&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
        &lt;span class="bp"&gt;cls&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;_attribs&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="bp"&gt;cls&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;__annotations__&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;keys&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
        &lt;span class="c1"&gt;# define initializer, unless defined by the user&lt;/span&gt;
        &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="s1"&gt;&amp;#39;__init__&amp;#39;&lt;/span&gt; &lt;span class="ow"&gt;not&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="bp"&gt;cls&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="vm"&gt;__dict__&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
            &lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="fm"&gt;__init__&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="bp"&gt;self&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="n"&gt;args&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="o"&gt;**&lt;/span&gt;&lt;span class="n"&gt;kwargs&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
                &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="nb"&gt;len&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;args&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;&amp;gt;&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
                    &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;attrib&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;arg&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="nb"&gt;zip&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="bp"&gt;self&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="vm"&gt;__class__&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;_attribs&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;args&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
                        &lt;span class="c1"&gt;# avoid our own __setattr__ in case it&amp;#39;s a frozen dataclass:&lt;/span&gt;
                        &lt;span class="nb"&gt;object&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="fm"&gt;__setattr__&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="bp"&gt;self&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;attrib&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;arg&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;                       
                &lt;span class="k"&gt;elif&lt;/span&gt; &lt;span class="nb"&gt;len&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;kwargs&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;&amp;gt;&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
                    &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;attrib&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="bp"&gt;self&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="vm"&gt;__class__&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;_attribs&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
                        &lt;span class="c1"&gt;# avoid our own __setattr__ in case it&amp;#39;s a frozen dataclass:&lt;/span&gt;
                        &lt;span class="nb"&gt;object&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="fm"&gt;__setattr__&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="bp"&gt;self&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;attrib&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;kwargs&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;attrib&lt;/span&gt;&lt;span class="p"&gt;])&lt;/span&gt;
            &lt;span class="bp"&gt;cls&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="fm"&gt;__init__&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="fm"&gt;__init__&lt;/span&gt;
        &lt;span class="c1"&gt;# define string conversion, unless defined by th user&lt;/span&gt;
        &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="s1"&gt;&amp;#39;__str__&amp;#39;&lt;/span&gt; &lt;span class="ow"&gt;not&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="bp"&gt;cls&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="vm"&gt;__dict__&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
            &lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="fm"&gt;__str__&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="bp"&gt;self&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
                &lt;span class="n"&gt;kv_tuples&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[(&lt;/span&gt;&lt;span class="n"&gt;attrib&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nb"&gt;getattr&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="bp"&gt;self&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;attrib&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt; &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;attrib&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="bp"&gt;self&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="vm"&gt;__class__&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;_attribs&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;
                &lt;span class="n"&gt;kv_str&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="s1"&gt;&amp;#39;, &amp;#39;&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;join&lt;/span&gt;&lt;span class="p"&gt;([&lt;/span&gt;&lt;span class="n"&gt;f&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;{k}={v}&amp;#39;&lt;/span&gt; &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;k&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;v&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;kv_tuples&lt;/span&gt;&lt;span class="p"&gt;])&lt;/span&gt;
                &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;f&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;{self.__class__.__name__}({kv_str})&amp;#39;&lt;/span&gt;
            &lt;span class="bp"&gt;cls&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="fm"&gt;__str__&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="fm"&gt;__str__&lt;/span&gt;
        &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;kwargs&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;get&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;eq&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="bp"&gt;True&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
            &lt;span class="c1"&gt;# define ==, unless defined by the user&lt;/span&gt;
            &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="s1"&gt;&amp;#39;__eq__&amp;#39;&lt;/span&gt; &lt;span class="ow"&gt;not&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="bp"&gt;cls&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="vm"&gt;__dict__&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
                &lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="fm"&gt;__eq__&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="bp"&gt;self&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;other&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
                    &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;attrib&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="bp"&gt;self&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="vm"&gt;__class__&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;_attribs&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
                        &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="nb"&gt;getattr&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="bp"&gt;self&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;attrib&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;!=&lt;/span&gt; &lt;span class="nb"&gt;getattr&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;other&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;attrib&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
                            &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="bp"&gt;False&lt;/span&gt;
                    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="bp"&gt;True&lt;/span&gt;            
                &lt;span class="bp"&gt;cls&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="fm"&gt;__eq__&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="fm"&gt;__eq__&lt;/span&gt;
        &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;kwargs&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;get&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;order&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="bp"&gt;False&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
            &lt;span class="c1"&gt;# define &amp;lt;, &amp;lt;=, &amp;gt;, &amp;gt;=, unless defined by the user&lt;/span&gt;
            &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="s1"&gt;&amp;#39;__lt__&amp;#39;&lt;/span&gt; &lt;span class="ow"&gt;not&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="bp"&gt;cls&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="vm"&gt;__dict__&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
                &lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="fm"&gt;__lt__&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="bp"&gt;self&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;other&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
                    &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;attrib&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="bp"&gt;self&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="vm"&gt;__class__&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;_attribs&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
                        &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="nb"&gt;getattr&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="bp"&gt;self&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;attrib&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;&amp;gt;=&lt;/span&gt; &lt;span class="nb"&gt;getattr&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;other&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;attrib&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
                            &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="bp"&gt;False&lt;/span&gt;
                    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="bp"&gt;True&lt;/span&gt;            
                &lt;span class="bp"&gt;cls&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="fm"&gt;__lt__&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="fm"&gt;__lt__&lt;/span&gt;
            &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="s1"&gt;&amp;#39;__le__&amp;#39;&lt;/span&gt; &lt;span class="ow"&gt;not&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="bp"&gt;cls&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="vm"&gt;__dict__&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
                &lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="fm"&gt;__le__&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="bp"&gt;self&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;other&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
                    &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;attrib&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="bp"&gt;self&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="vm"&gt;__class__&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;_attribs&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
                        &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="nb"&gt;getattr&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="bp"&gt;self&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;attrib&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;&amp;gt;&lt;/span&gt; &lt;span class="nb"&gt;getattr&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;other&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;attrib&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
                            &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="bp"&gt;False&lt;/span&gt;
                    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="bp"&gt;True&lt;/span&gt;            
                &lt;span class="bp"&gt;cls&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="fm"&gt;__le__&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="fm"&gt;__le__&lt;/span&gt;
            &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="s1"&gt;&amp;#39;__gt__&amp;#39;&lt;/span&gt; &lt;span class="ow"&gt;not&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="bp"&gt;cls&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="vm"&gt;__dict__&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
                &lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="fm"&gt;__gt__&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="bp"&gt;self&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;other&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
                    &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;attrib&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="bp"&gt;self&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="vm"&gt;__class__&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;_attribs&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
                        &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="nb"&gt;getattr&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="bp"&gt;self&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;attrib&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;&amp;lt;=&lt;/span&gt; &lt;span class="nb"&gt;getattr&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;other&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;attrib&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
                            &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="bp"&gt;False&lt;/span&gt;
                    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="bp"&gt;True&lt;/span&gt;
                &lt;span class="bp"&gt;cls&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="fm"&gt;__gt__&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="fm"&gt;__gt__&lt;/span&gt;
            &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="s1"&gt;&amp;#39;__ge__&amp;#39;&lt;/span&gt; &lt;span class="ow"&gt;not&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="bp"&gt;cls&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="vm"&gt;__dict__&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
                &lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="fm"&gt;__ge__&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="bp"&gt;self&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;other&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
                    &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;attrib&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="bp"&gt;self&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="vm"&gt;__class__&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;_attribs&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
                        &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="nb"&gt;getattr&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="bp"&gt;self&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;attrib&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;&amp;lt;&lt;/span&gt; &lt;span class="nb"&gt;getattr&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;other&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;attrib&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
                            &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="bp"&gt;False&lt;/span&gt;
                    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="bp"&gt;True&lt;/span&gt;
                &lt;span class="bp"&gt;cls&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="fm"&gt;__ge__&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="fm"&gt;__ge__&lt;/span&gt;
        &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;kwargs&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;get&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;frozen&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="bp"&gt;False&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
            &lt;span class="c1"&gt;# don&amp;#39;t allow changing attributes&lt;/span&gt;
            &lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="fm"&gt;__setattr__&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="bp"&gt;self&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;attrib&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;value&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
                &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;attrib&lt;/span&gt; &lt;span class="ow"&gt;not&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="bp"&gt;self&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="vm"&gt;__class__&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;_attribs&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
                    &lt;span class="nb"&gt;setattr&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="bp"&gt;self&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;attrib&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;value&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
                &lt;span class="k"&gt;else&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
                    &lt;span class="k"&gt;raise&lt;/span&gt; &lt;span class="ne"&gt;AttributeError&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;f&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;dataclass is frozen, cannot assign to field &lt;/span&gt;&lt;span class="se"&gt;\&amp;#39;&lt;/span&gt;&lt;span class="s1"&gt;{attrib}&lt;/span&gt;&lt;span class="se"&gt;\&amp;#39;&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
            &lt;span class="bp"&gt;cls&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="fm"&gt;__setattr__&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="fm"&gt;__setattr__&lt;/span&gt;
        &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="bp"&gt;cls&lt;/span&gt;
    &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="bp"&gt;cls&lt;/span&gt; &lt;span class="ow"&gt;is&lt;/span&gt; &lt;span class="bp"&gt;None&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
        &lt;span class="c1"&gt;# decorator was used like @dataclass(...)&lt;/span&gt;
        &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;decorator&lt;/span&gt;
    &lt;span class="k"&gt;else&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
        &lt;span class="c1"&gt;# decorator was used like @dataclass, without parens&lt;/span&gt;
        &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;decorator&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="bp"&gt;cls&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/pre&gt;&lt;/div&gt;


&lt;p&gt;With this toy implementation, the examples shown at the beginning work just like the Python standard library &lt;code&gt;@dataclass&lt;/code&gt;, whose &lt;a href="https://github.com/python/cpython/blob/main/Lib/dataclasses.py"&gt;implementation is in &lt;code&gt;dataclasses.py&lt;/code&gt;&lt;/a&gt; (it's 1488 lines of code).&lt;/p&gt;
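&lt;p&gt;As a quick cross-check of the behavior the toy version imitates, here is the standard library &lt;code&gt;@dataclass&lt;/code&gt; doing the same three things; the &lt;code&gt;Point&lt;/code&gt; class is just an illustrative example:&lt;/p&gt;

```python
from dataclasses import dataclass, FrozenInstanceError

@dataclass(order=True, frozen=True)
class Point:
    x: int = 0
    y: int = 0

p1, p2 = Point(1, 2), Point(1, 3)
print(p1 == p2)          # eq: field-by-field comparison, prints False
print(sorted([p2, p1]))  # order: comparison operators are generated
try:
    p1.x = 42            # frozen: assignment raises
except FrozenInstanceError as e:
    print(f'cannot assign: {e}')
```

&lt;p&gt;One difference to keep in mind: the stdlib compares the fields as a tuple, i.e. lexicographically, whereas the toy's &lt;code&gt;__lt__&lt;/code&gt; above requires every field of &lt;code&gt;self&lt;/code&gt; to be less than the corresponding field of &lt;code&gt;other&lt;/code&gt;.&lt;/p&gt;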
&lt;h2&gt;Conclusion&lt;/h2&gt;
&lt;p&gt;Similar to the &lt;a href="https://bytepawn.com/tag/python.html"&gt;previous articles&lt;/a&gt;, this is a good way to improve one's Python fu.&lt;/p&gt;</content><category term="python"></category><category term="dataclass"></category><category term="decorator"></category></entry><entry><title>Python decorator patterns</title><link href="https://bytepawn.com/python-decorator-patterns.html" rel="alternate"></link><published>2022-05-08T00:00:00+02:00</published><updated>2022-05-08T00:00:00+02:00</updated><author><name>Marton Trencseni</name></author><id>tag:bytepawn.com,2022-05-08:/python-decorator-patterns.html</id><summary type="html">&lt;p&gt;I show toy implementations of Python decorator patterns such as &lt;code&gt;@measure&lt;/code&gt;, &lt;code&gt;@repeat&lt;/code&gt;, &lt;code&gt;@trace&lt;/code&gt;, &lt;code&gt;@count&lt;/code&gt;, &lt;code&gt;@singleton&lt;/code&gt;, and &lt;code&gt;@app.route&lt;/code&gt; (made famous by Flask).&lt;br/&gt;&lt;br/&gt;&lt;img src="/images/decorators.jpg" alt="Python enum" style="width: 300px;"/&gt;&lt;/p&gt;</summary><content type="html">&lt;h2&gt;Introduction&lt;/h2&gt;
&lt;p&gt;In Python, functions are first-class citizens: they can be passed to other functions, returned from functions, and created on the fly. Let's see an example:&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;span class="c1"&gt;# define a function on-the-fly&lt;/span&gt;
&lt;span class="n"&gt;pow2&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;lambda&lt;/span&gt; &lt;span class="n"&gt;x&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;x&lt;/span&gt;&lt;span class="o"&gt;**&lt;/span&gt;&lt;span class="mi"&gt;2&lt;/span&gt;
&lt;span class="k"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;pow2&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;2&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;

&lt;span class="c1"&gt;# take a function as a parameter&lt;/span&gt;
&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;print_twice&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;func&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;Callable&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;arg&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;Any&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
    &lt;span class="k"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;func&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;arg&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;
    &lt;span class="k"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;func&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;arg&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;
&lt;span class="n"&gt;print_twice&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;pow2&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;3&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="c1"&gt;# take a function as a parameter and return a new function&lt;/span&gt;
&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;hello&lt;/span&gt;&lt;span class="p"&gt;():&lt;/span&gt;
    &lt;span class="k"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;Hello world!&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;loop&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;func&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;Callable&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;n&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;int&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
    &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;_&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="nb"&gt;range&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;n&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
        &lt;span class="n"&gt;func&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
&lt;span class="n"&gt;loop&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;hello&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;3&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/pre&gt;&lt;/div&gt;


&lt;p&gt;Output:&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;4
9
9
Hello world!
Hello world!
Hello world!
&lt;/pre&gt;&lt;/div&gt;


&lt;p&gt;&lt;strong&gt;Decorators in Python are syntactic sugar for passing functions to functions and returning a new function.&lt;/strong&gt; Let's see how this works and how we can put it to use in practice. The &lt;a href="https://github.com/mtrencseni/playground/blob/master/Decorators.ipynb"&gt;code for this article is on Github&lt;/a&gt;.&lt;/p&gt;
&lt;h2&gt;&lt;code&gt;@measure&lt;/code&gt;: decorator functions without arguments&lt;/h2&gt;
&lt;p&gt;Let's take a useful example: measuring how long it takes to execute a function. Ideally, we could simply annotate an existing function and get "free" measurements. Let's look at the following two functions:&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="nn"&gt;timeit&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;default_timer&lt;/span&gt; &lt;span class="k"&gt;as&lt;/span&gt; &lt;span class="n"&gt;timer&lt;/span&gt;
&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="nn"&gt;time&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;sleep&lt;/span&gt;

&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;measure&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;func&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;Callable&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
    &lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;inner&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="n"&gt;args&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="o"&gt;**&lt;/span&gt;&lt;span class="n"&gt;kwargs&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
        &lt;span class="k"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;f&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;---&amp;gt; Calling {func.__name__}()&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
        &lt;span class="n"&gt;start&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;timer&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
        &lt;span class="n"&gt;func&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="n"&gt;args&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="o"&gt;**&lt;/span&gt;&lt;span class="n"&gt;kwargs&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
        &lt;span class="n"&gt;elapsed_sec&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;timer&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt; &lt;span class="o"&gt;-&lt;/span&gt; &lt;span class="n"&gt;start&lt;/span&gt;
        &lt;span class="k"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;f&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;---&amp;gt; Done {func.__name__}(): {elapsed_sec:.3f} secs&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;inner&lt;/span&gt;

&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;sleeper&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;seconds&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;int&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
    &lt;span class="k"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;Going to sleep...&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="n"&gt;sleep&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;seconds&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="k"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;Done!&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/pre&gt;&lt;/div&gt;


&lt;p&gt;&lt;code&gt;measure()&lt;/code&gt; is a function which takes a function &lt;code&gt;func()&lt;/code&gt; as an argument, and returns a function &lt;code&gt;inner()&lt;/code&gt; declared on the inside. &lt;code&gt;inner()&lt;/code&gt; takes whatever arguments are passed in and passes them along to &lt;code&gt;func()&lt;/code&gt;, but wraps this call in a few lines of code to measure and print the elapsed time in seconds. &lt;code&gt;sleeper()&lt;/code&gt; is a test function which explicitly sleeps for a while so we can measure it.&lt;/p&gt;
&lt;p&gt;Given these, we can construct a measured &lt;code&gt;sleeper()&lt;/code&gt; function like:&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;span class="n"&gt;measured_sleeper&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;measure&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;sleeper&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="n"&gt;measured_sleeper&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;3&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/pre&gt;&lt;/div&gt;


&lt;p&gt;Output:&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;---&amp;gt; Calling sleeper()
Going to sleep...
Done!
---&amp;gt; Done sleeper(): 3.000 secs
&lt;/pre&gt;&lt;/div&gt;


&lt;p&gt;This works, but if we're already using &lt;code&gt;sleeper()&lt;/code&gt; in a bunch of places, we'd have to replace all those calls with &lt;code&gt;measured_sleeper()&lt;/code&gt;. Instead, we can:&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;span class="n"&gt;sleeper&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;measure&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;sleeper&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/pre&gt;&lt;/div&gt;


&lt;p&gt;Here we are rebinding the &lt;code&gt;sleeper&lt;/code&gt; name in the current scope to point to the measured version of the original &lt;code&gt;sleeper()&lt;/code&gt; function. This is exactly the same thing as putting the &lt;code&gt;@&lt;/code&gt; decorator in front of the function declaration:&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;span class="nd"&gt;@measure&lt;/span&gt;
&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;sleeper&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;seconds&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;int&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
    &lt;span class="k"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;Going to sleep...&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="n"&gt;sleep&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;seconds&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="k"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;Done!&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/pre&gt;&lt;/div&gt;


&lt;p&gt;So &lt;code&gt;@decorators&lt;/code&gt; are just syntactic sugar for passing a newly defined function to an existing decorator function, which returns a new function, and having the original function name point to this new function!&lt;/p&gt;
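&lt;p&gt;One caveat worth knowing about, although not needed for the toy examples here: after the rebinding, &lt;code&gt;sleeper.__name__&lt;/code&gt; is &lt;code&gt;'inner'&lt;/code&gt;, which confuses debuggers and logs. The standard library's &lt;code&gt;functools.wraps&lt;/code&gt; copies the original function's metadata onto the wrapper. A minimal sketch; the &lt;code&gt;measured&lt;/code&gt; name is made up for illustration:&lt;/p&gt;

```python
from functools import wraps
from typing import Callable

def measured(func: Callable):
    @wraps(func)  # copy __name__, __doc__, etc. from func onto inner
    def inner(*args, **kwargs):
        return func(*args, **kwargs)
    return inner

@measured
def sleeper(seconds: int = 0):
    """Sleep for the given number of seconds."""

print(sleeper.__name__)  # prints 'sleeper', not 'inner'
```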
&lt;h2&gt;&lt;code&gt;@repeat&lt;/code&gt;: parameterized decorator function&lt;/h2&gt;
&lt;p&gt;In the above example we took an existing function &lt;code&gt;sleeper()&lt;/code&gt; and decorated it with a function-taking-and-returning-a-function &lt;code&gt;measure()&lt;/code&gt;, i.e. a &lt;code&gt;@decorator&lt;/code&gt;. What if we want to pass arguments to the decorator function itself? For example, imagine we have a function, and we want to repeat it &lt;code&gt;n&lt;/code&gt; times. To accomplish this, we just have to add one more level of function nesting:&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;repeat&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;n&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;int&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
    &lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;decorator&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;func&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;Callable&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
        &lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;inner&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="n"&gt;args&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="o"&gt;**&lt;/span&gt;&lt;span class="n"&gt;kwargs&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
            &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;_&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="nb"&gt;range&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;n&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
                &lt;span class="n"&gt;func&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="n"&gt;args&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="o"&gt;**&lt;/span&gt;&lt;span class="n"&gt;kwargs&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
        &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;inner&lt;/span&gt;
    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;decorator&lt;/span&gt;

&lt;span class="nd"&gt;@repeat&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;n&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mi"&gt;3&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;hello&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;name&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;str&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
    &lt;span class="k"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;f&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;Hello {name}!&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="n"&gt;hello&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;world&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/pre&gt;&lt;/div&gt;


&lt;p&gt;Output:&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;Hello world!
Hello world!
Hello world!
&lt;/pre&gt;&lt;/div&gt;
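&lt;p&gt;It's worth spelling out what the parameterized form desugars to: &lt;code&gt;@repeat(n=3)&lt;/code&gt; first evaluates &lt;code&gt;repeat(n=3)&lt;/code&gt;, which returns &lt;code&gt;decorator&lt;/code&gt;, and that is then applied to &lt;code&gt;hello&lt;/code&gt;. A sketch, with &lt;code&gt;repeat()&lt;/code&gt; reproduced from above so it runs standalone:&lt;/p&gt;

```python
from typing import Callable

def repeat(n: int = 1):
    def decorator(func: Callable):
        def inner(*args, **kwargs):
            for _ in range(n):
                func(*args, **kwargs)
        return inner
    return decorator

def hello(name: str):
    print(f'Hello {name}!')

# @repeat(n=3) is equivalent to these two explicit calls:
hello = repeat(n=3)(hello)
hello('world')  # prints 'Hello world!' three times
```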


&lt;h2&gt;&lt;code&gt;@trace&lt;/code&gt;: Decorating a class with a function&lt;/h2&gt;
&lt;p&gt;We can also decorate classes, not just functions. As an example, assume we have an existing class &lt;code&gt;Foo&lt;/code&gt;, and we would like to trace it, i.e. get a &lt;code&gt;print()&lt;/code&gt; each time a method is called, without having to manually change each method. So we'd like to be able to put &lt;code&gt;@trace&lt;/code&gt; before the class definition and get this functionality for free, like:&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;span class="nd"&gt;@trace&lt;/span&gt;
&lt;span class="k"&gt;class&lt;/span&gt; &lt;span class="nc"&gt;Foo&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
    &lt;span class="n"&gt;i&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;int&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;
    &lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="fm"&gt;__init__&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="bp"&gt;self&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;i&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;int&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
        &lt;span class="bp"&gt;self&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;i&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;i&lt;/span&gt;
    &lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;increment&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="bp"&gt;self&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
        &lt;span class="bp"&gt;self&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;i&lt;/span&gt; &lt;span class="o"&gt;+=&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt;
    &lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="fm"&gt;__str__&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="bp"&gt;self&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
        &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;f&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;This is a {self.__class__.__name__} object with i = {self.i}&amp;#39;&lt;/span&gt;
&lt;/pre&gt;&lt;/div&gt;


&lt;p&gt;What does &lt;code&gt;trace()&lt;/code&gt; look like? It must accept a &lt;code&gt;cls&lt;/code&gt; argument (the newly defined class, &lt;code&gt;Foo&lt;/code&gt; in our case), and return a new/modified class (with added tracing):&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;trace&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="bp"&gt;cls&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;type&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
    &lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;make_traced&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="bp"&gt;cls&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;type&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;method_name&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;str&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;method&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;Callable&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
        &lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;traced_method&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="n"&gt;args&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="o"&gt;**&lt;/span&gt;&lt;span class="n"&gt;kwargs&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
            &lt;span class="k"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;f&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;Executing {cls.__name__}::{method_name}...&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
            &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;method&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="n"&gt;args&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="o"&gt;**&lt;/span&gt;&lt;span class="n"&gt;kwargs&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
        &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;traced_method&lt;/span&gt;
    &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;name&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="bp"&gt;cls&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="vm"&gt;__dict__&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;keys&lt;/span&gt;&lt;span class="p"&gt;():&lt;/span&gt;
        &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="nb"&gt;callable&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nb"&gt;getattr&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="bp"&gt;cls&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;name&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt; &lt;span class="ow"&gt;and&lt;/span&gt; &lt;span class="n"&gt;name&lt;/span&gt; &lt;span class="o"&gt;!=&lt;/span&gt; &lt;span class="s1"&gt;&amp;#39;__class__&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
            &lt;span class="nb"&gt;setattr&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="bp"&gt;cls&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;name&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;make_traced&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="bp"&gt;cls&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;name&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nb"&gt;getattr&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="bp"&gt;cls&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;name&lt;/span&gt;&lt;span class="p"&gt;)))&lt;/span&gt;
    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="bp"&gt;cls&lt;/span&gt;
&lt;/pre&gt;&lt;/div&gt;


&lt;p&gt;The implementation is quite straightforward. We go through all the names in &lt;code&gt;cls.__dict__.keys()&lt;/code&gt; and replace each callable attribute with a wrapped method, which we manufacture with the inner &lt;code&gt;make_traced()&lt;/code&gt; function. It works:&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;span class="n"&gt;f1&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;Foo&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
&lt;span class="n"&gt;f2&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;Foo&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;4&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="n"&gt;f1&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;increment&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
&lt;span class="k"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;f1&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="k"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;f2&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/pre&gt;&lt;/div&gt;


&lt;p&gt;Output:&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;Executing Foo::__init__...
Executing Foo::__init__...
Executing Foo::increment...
Executing Foo::__str__...
This is a Foo object with i = 1
Executing Foo::__str__...
This is a Foo object with i = 4
&lt;/pre&gt;&lt;/div&gt;
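&lt;p&gt;One refinement worth noting: the wrapped methods above lose their original &lt;code&gt;__name__&lt;/code&gt; and docstring, since callers now see &lt;code&gt;traced_method&lt;/code&gt;. A minimal sketch of the same idea using &lt;code&gt;functools.wraps&lt;/code&gt; to preserve that metadata (the &lt;code&gt;greet&lt;/code&gt; example class is just illustrative):&lt;/p&gt;

```python
import functools

def trace(cls: type):
    def make_traced(cls, method_name, method):
        @functools.wraps(method)  # copy __name__, __doc__, etc. from the original
        def traced_method(*args, **kwargs):
            print(f'Executing {cls.__name__}::{method_name}...')
            return method(*args, **kwargs)
        return traced_method
    for name in list(cls.__dict__.keys()):  # list() so we can setattr while iterating
        if callable(getattr(cls, name)) and name != '__class__':
            setattr(cls, name, make_traced(cls, name, getattr(cls, name)))
    return cls

@trace
class Foo:
    def greet(self):
        """Say hello."""
        return 'hello'

print(Foo.greet.__name__)  # prints: greet (not traced_method)
```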


&lt;h2&gt;&lt;code&gt;@singleton&lt;/code&gt;: The singleton pattern&lt;/h2&gt;
&lt;p&gt;A second example of decorating a class with a function is implementing the common &lt;a href="https://en.wikipedia.org/wiki/Singleton_pattern"&gt;singleton pattern&lt;/a&gt;:&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;In software engineering, the singleton pattern is a software design pattern that restricts the instantiation of a class to one "single" instance. This is useful when exactly one object is needed to coordinate actions across the system.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;Our implementation as a Python decorator &lt;code&gt;@singleton&lt;/code&gt;:&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;singleton&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="bp"&gt;cls&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;type&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
    &lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;__new__singleton&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="bp"&gt;cls&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;type&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="n"&gt;args&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="o"&gt;**&lt;/span&gt;&lt;span class="n"&gt;kwargs&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
        &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="ow"&gt;not&lt;/span&gt; &lt;span class="nb"&gt;hasattr&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="bp"&gt;cls&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s1"&gt;&amp;#39;__singleton&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
            &lt;span class="bp"&gt;cls&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;__singleton&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nb"&gt;object&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="fm"&gt;__new__&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="bp"&gt;cls&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="c1"&gt;# type: ignore&lt;/span&gt;
        &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="bp"&gt;cls&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;__singleton&lt;/span&gt;                    &lt;span class="c1"&gt;# type: ignore&lt;/span&gt;
    &lt;span class="bp"&gt;cls&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="fm"&gt;__new__&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;__new__singleton&lt;/span&gt;                &lt;span class="c1"&gt;# type: ignore&lt;/span&gt;
    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="bp"&gt;cls&lt;/span&gt;
&lt;/pre&gt;&lt;/div&gt;


&lt;p&gt;As &lt;a href="https://bytepawn.com/tag/enum.html"&gt;mentioned in the &lt;code&gt;Enum&lt;/code&gt; articles&lt;/a&gt;, the &lt;code&gt;__new__()&lt;/code&gt; class method is called to construct new objects, before &lt;code&gt;__init__()&lt;/code&gt; is called on the newly created instance to initialize it. So, to get singleton behaviour, we just need to override &lt;code&gt;__new__()&lt;/code&gt; to always return a single instance. Let's test it:&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;span class="nd"&gt;@singleton&lt;/span&gt;
&lt;span class="k"&gt;class&lt;/span&gt; &lt;span class="nc"&gt;Foo&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
    &lt;span class="n"&gt;i&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;int&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;
    &lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="fm"&gt;__init__&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="bp"&gt;self&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;i&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;int&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
        &lt;span class="bp"&gt;self&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;i&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;i&lt;/span&gt;
    &lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;increment&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="bp"&gt;self&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
        &lt;span class="bp"&gt;self&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;i&lt;/span&gt; &lt;span class="o"&gt;+=&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt;
    &lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="fm"&gt;__str__&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="bp"&gt;self&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
        &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;f&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;This is a {self.__class__.__name__} object with i = {self.i}&amp;#39;&lt;/span&gt;

&lt;span class="nd"&gt;@singleton&lt;/span&gt;
&lt;span class="k"&gt;class&lt;/span&gt; &lt;span class="nc"&gt;Bar&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
    &lt;span class="n"&gt;i&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;int&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;
    &lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="fm"&gt;__init__&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="bp"&gt;self&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;i&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;int&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
        &lt;span class="bp"&gt;self&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;i&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;i&lt;/span&gt;
    &lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;increment&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="bp"&gt;self&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
        &lt;span class="bp"&gt;self&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;i&lt;/span&gt; &lt;span class="o"&gt;+=&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt;
    &lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="fm"&gt;__str__&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="bp"&gt;self&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
        &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;f&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;This is a {self.__class__.__name__} object with i = {self.i}&amp;#39;&lt;/span&gt;

&lt;span class="n"&gt;f1&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;Foo&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
&lt;span class="n"&gt;f2&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;Foo&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;4&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="n"&gt;f1&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;increment&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
&lt;span class="n"&gt;b1&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;Bar&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;9&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="k"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;f1&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="k"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;f2&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="k"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;b1&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="k"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;f1&lt;/span&gt; &lt;span class="ow"&gt;is&lt;/span&gt; &lt;span class="n"&gt;f2&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="k"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;f1&lt;/span&gt; &lt;span class="ow"&gt;is&lt;/span&gt; &lt;span class="n"&gt;b1&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/pre&gt;&lt;/div&gt;


&lt;p&gt;Output:&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;This is a Foo object with i = 5
This is a Foo object with i = 5
This is a Bar object with i = 9
True
False
&lt;/pre&gt;&lt;/div&gt;
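&lt;p&gt;One subtlety visible in the output: &lt;code&gt;@singleton&lt;/code&gt; only overrides &lt;code&gt;__new__()&lt;/code&gt;, so &lt;code&gt;__init__()&lt;/code&gt; still runs on every constructor call and re-initializes the shared instance. That is why both &lt;code&gt;Foo&lt;/code&gt; prints show &lt;code&gt;i = 5&lt;/code&gt;: &lt;code&gt;Foo(4)&lt;/code&gt; reset the singleton's &lt;code&gt;i&lt;/code&gt; to 4 before &lt;code&gt;increment()&lt;/code&gt; bumped it to 5. A minimal standalone demonstration (the &lt;code&gt;Config&lt;/code&gt; class and &lt;code&gt;_singleton&lt;/code&gt; attribute name are illustrative):&lt;/p&gt;

```python
def singleton(cls: type):
    # Cache the single instance on the class; __init__ is deliberately untouched.
    def __new__singleton(cls, *args, **kwargs):
        if not hasattr(cls, '_singleton'):
            cls._singleton = object.__new__(cls)
        return cls._singleton
    cls.__new__ = __new__singleton
    return cls

@singleton
class Config:
    def __init__(self, value=0):
        self.value = value  # runs on *every* Config(...) call

c1 = Config(1)
c2 = Config(2)  # same object as c1, but __init__ re-ran
print(c1 is c2, c1.value)  # prints: True 2
```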


&lt;h2&gt;&lt;code&gt;@Count&lt;/code&gt;: Decorating a class with a class&lt;/h2&gt;
&lt;p&gt;The reason the above code works is that in Python, class declarations are really just syntactic sugar for a function call that constructs a new &lt;code&gt;type&lt;/code&gt; object. For example, the class &lt;code&gt;Foo&lt;/code&gt; declared above can also be defined programmatically, like so:&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;def make_class(name):
    cls = type(name, (), {})
    setattr(cls, &amp;#39;i&amp;#39;, 0)
    def __init__(self, i): self.i = i
    setattr(cls, &amp;#39;__init__&amp;#39;, __init__)
    def increment(self): self.i += 1
    setattr(cls, &amp;#39;increment&amp;#39;, increment)
    def __str__(self): return f&amp;#39;This is a {self.__class__.__name__} object with i = {self.i}&amp;#39;
    setattr(cls, &amp;#39;__str__&amp;#39;, __str__)
    return cls

Foo = make_class(&amp;#39;Foo&amp;#39;)
&lt;/pre&gt;&lt;/div&gt;
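&lt;p&gt;For completeness, the same class can also be built in a single &lt;code&gt;type()&lt;/code&gt; call by passing the attribute dictionary directly; the lambdas below are just a compact stand-in for the &lt;code&gt;def&lt;/code&gt;s above (a sketch):&lt;/p&gt;

```python
# type(name, bases, namespace): the namespace dict becomes the class __dict__.
Foo = type('Foo', (), {
    'i': 0,
    '__init__': lambda self, i=0: setattr(self, 'i', i),
    'increment': lambda self: setattr(self, 'i', self.i + 1),
    '__str__': lambda self: f'This is a {self.__class__.__name__} object with i = {self.i}',
})

f = Foo(3)
f.increment()
print(f)  # prints: This is a Foo object with i = 4
```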


&lt;p&gt;But if that's the case, then we can decorate not just a function with a function or a class with a function, but also a class with a class. Let's see an example of this with the &lt;code&gt;@Count&lt;/code&gt; pattern, where we want to count the number of instances created. We have an existing class, and we'd like to be able to just put &lt;code&gt;@Count&lt;/code&gt; before the class definition and get a "free" count of instances created, which we can then access through the decorator class &lt;code&gt;Count&lt;/code&gt;. The solution:&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;span class="k"&gt;class&lt;/span&gt; &lt;span class="nc"&gt;Count&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
    &lt;span class="n"&gt;instances&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;DefaultDict&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="nb"&gt;str&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nb"&gt;int&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;defaultdict&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nb"&gt;int&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="c1"&gt;# we will use this as a class instance&lt;/span&gt;
    &lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="fm"&gt;__call__&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="bp"&gt;self&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="bp"&gt;cls&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt; &lt;span class="c1"&gt;# here cls is either Foo or Bar&lt;/span&gt;
        &lt;span class="k"&gt;class&lt;/span&gt; &lt;span class="nc"&gt;Counted&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="bp"&gt;cls&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt; &lt;span class="c1"&gt;# here cls is either Foo or Bar&lt;/span&gt;
            &lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="fm"&gt;__new__&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="bp"&gt;cls&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;type&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="n"&gt;args&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="o"&gt;**&lt;/span&gt;&lt;span class="n"&gt;kwargs&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt; &lt;span class="c1"&gt;# here cls is Counted&lt;/span&gt;
                &lt;span class="n"&gt;Count&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;instances&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="bp"&gt;cls&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="vm"&gt;__bases__&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="vm"&gt;__name__&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="o"&gt;+=&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt;
                &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="nb"&gt;super&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="fm"&gt;__new__&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="bp"&gt;cls&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="c1"&gt;# type: ignore&lt;/span&gt;
        &lt;span class="n"&gt;Counted&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="vm"&gt;__name__&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="bp"&gt;cls&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="vm"&gt;__name__&lt;/span&gt;
        &lt;span class="c1"&gt;# without this ^ , self.__class__.__name__ would&lt;/span&gt;
        &lt;span class="c1"&gt;# be &amp;#39;Counted&amp;#39; in the __str__() functions below&lt;/span&gt;
        &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;Counted&lt;/span&gt;
&lt;/pre&gt;&lt;/div&gt;


&lt;p&gt;The trick is that when a class is decorated with &lt;code&gt;Count&lt;/code&gt;, its &lt;code&gt;__call__()&lt;/code&gt; method is invoked by the runtime, and the class is passed in as &lt;code&gt;cls&lt;/code&gt;. Inside, we construct a new class &lt;code&gt;Counted&lt;/code&gt;, which has &lt;code&gt;cls&lt;/code&gt; as its parent but overrides &lt;code&gt;__new__()&lt;/code&gt; to increment a counter in the &lt;code&gt;Count&lt;/code&gt; class variable &lt;code&gt;instances&lt;/code&gt; (and otherwise creates a new instance and returns it). The newly constructed &lt;code&gt;Counted&lt;/code&gt; class (whose name is overridden) is then returned, and replaces the originally defined class. Let's see it in action:&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;span class="nd"&gt;@Count&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
&lt;span class="k"&gt;class&lt;/span&gt; &lt;span class="nc"&gt;Foo&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
    &lt;span class="n"&gt;i&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;int&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;
    &lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="fm"&gt;__init__&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="bp"&gt;self&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;i&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;int&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
        &lt;span class="bp"&gt;self&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;i&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;i&lt;/span&gt;
    &lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;increment&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="bp"&gt;self&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
        &lt;span class="bp"&gt;self&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;i&lt;/span&gt; &lt;span class="o"&gt;+=&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt;
    &lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="fm"&gt;__str__&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="bp"&gt;self&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
        &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;f&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;This is a {self.__class__.__name__} object with i = {self.i}&amp;#39;&lt;/span&gt;
&lt;span class="nd"&gt;@Count&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
&lt;span class="k"&gt;class&lt;/span&gt; &lt;span class="nc"&gt;Bar&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
    &lt;span class="n"&gt;i&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;int&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;
    &lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="fm"&gt;__init__&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="bp"&gt;self&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;i&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;int&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
        &lt;span class="bp"&gt;self&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;i&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;i&lt;/span&gt;
    &lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;increment&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="bp"&gt;self&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
        &lt;span class="bp"&gt;self&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;i&lt;/span&gt; &lt;span class="o"&gt;+=&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt;
    &lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="fm"&gt;__str__&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="bp"&gt;self&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
        &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;f&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;This is a {self.__class__.__name__} object with i = {self.i}&amp;#39;&lt;/span&gt;

&lt;span class="n"&gt;f1&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;Foo&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
&lt;span class="n"&gt;f2&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;Foo&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;6&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="n"&gt;f2&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;increment&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
&lt;span class="n"&gt;b1&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;Bar&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;9&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="k"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;f1&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="k"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;f2&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="k"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;b1&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;class_name&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;num_instances&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;Count&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;instances&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;items&lt;/span&gt;&lt;span class="p"&gt;():&lt;/span&gt;
    &lt;span class="k"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;f&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;{class_name} -&amp;gt; {num_instances}&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/pre&gt;&lt;/div&gt;


&lt;p&gt;Output:&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;span class="n"&gt;This&lt;/span&gt; &lt;span class="ow"&gt;is&lt;/span&gt; &lt;span class="n"&gt;a&lt;/span&gt; &lt;span class="n"&gt;Foo&lt;/span&gt; &lt;span class="nb"&gt;object&lt;/span&gt; &lt;span class="k"&gt;with&lt;/span&gt; &lt;span class="n"&gt;i&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;
&lt;span class="n"&gt;This&lt;/span&gt; &lt;span class="ow"&gt;is&lt;/span&gt; &lt;span class="n"&gt;a&lt;/span&gt; &lt;span class="n"&gt;Foo&lt;/span&gt; &lt;span class="nb"&gt;object&lt;/span&gt; &lt;span class="k"&gt;with&lt;/span&gt; &lt;span class="n"&gt;i&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;7&lt;/span&gt;
&lt;span class="n"&gt;This&lt;/span&gt; &lt;span class="ow"&gt;is&lt;/span&gt; &lt;span class="n"&gt;a&lt;/span&gt; &lt;span class="n"&gt;Bar&lt;/span&gt; &lt;span class="nb"&gt;object&lt;/span&gt; &lt;span class="k"&gt;with&lt;/span&gt; &lt;span class="n"&gt;i&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;9&lt;/span&gt;
&lt;span class="n"&gt;Foo&lt;/span&gt; &lt;span class="o"&gt;-&amp;gt;&lt;/span&gt; &lt;span class="mi"&gt;2&lt;/span&gt;
&lt;span class="n"&gt;Bar&lt;/span&gt; &lt;span class="o"&gt;-&amp;gt;&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt;
&lt;/pre&gt;&lt;/div&gt;
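&lt;p&gt;For comparison, the same counting can also be implemented with a plain function decorator that captures the original class in a closure, which avoids the &lt;code&gt;__bases__[0]&lt;/code&gt; lookup. A sketch (the module-level &lt;code&gt;instance_counts&lt;/code&gt; dict is illustrative):&lt;/p&gt;

```python
from collections import defaultdict

instance_counts = defaultdict(int)  # illustrative module-level counter

def count(cls: type):
    class Counted(cls):
        def __new__(inner_cls, *args, **kwargs):
            instance_counts[cls.__name__] += 1  # cls is captured by the closure
            return super().__new__(inner_cls)
    Counted.__name__ = cls.__name__
    return Counted

@count
class Foo:
    def __init__(self, i=0):
        self.i = i

f1 = Foo()
f2 = Foo(6)
print(instance_counts['Foo'])  # prints: 2
```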


&lt;h2&gt;&lt;code&gt;@app.route&lt;/code&gt;: Building a Flask-like application object by decorating functions&lt;/h2&gt;
&lt;p&gt;Finally, many of us have &lt;a href="https://www.digitalocean.com/community/tutorials/how-to-make-a-web-application-using-flask-in-python-3"&gt;used Flask&lt;/a&gt;, and have written HTTP handler functions along the lines of:&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="nn"&gt;flask&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;Flask&lt;/span&gt;

&lt;span class="n"&gt;app&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;Flask&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="vm"&gt;__name__&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="nd"&gt;@app.route&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;/&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;hello&lt;/span&gt;&lt;span class="p"&gt;():&lt;/span&gt;
    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="s1"&gt;&amp;#39;Hello, World!&amp;#39;&lt;/span&gt;
&lt;/pre&gt;&lt;/div&gt;


&lt;p&gt;This is yet another creative use of the decorator pattern. Here we're building up an &lt;code&gt;app&lt;/code&gt; object by adding our custom handler functions, but we don't have to define our own class derived from &lt;code&gt;Flask&lt;/code&gt;; we just write flat functions and decorate them. This functionality is straightforward to duplicate as a toy &lt;code&gt;Router&lt;/code&gt; class:&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;span class="k"&gt;class&lt;/span&gt; &lt;span class="nc"&gt;Router&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
    &lt;span class="n"&gt;routes&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;dict&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="nb"&gt;str&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;Callable&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;{}&lt;/span&gt;

    &lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;route&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="bp"&gt;self&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;prefix&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;str&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
        &lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;decorator&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;func&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;Callable&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
            &lt;span class="bp"&gt;self&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;routes&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;prefix&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;func&lt;/span&gt;
        &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;decorator&lt;/span&gt;

    &lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;default_handler&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="bp"&gt;self&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;path&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
        &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;f&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;404 (path was {path})&amp;#39;&lt;/span&gt;

    &lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;handle_request&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="bp"&gt;self&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;path&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
        &lt;span class="n"&gt;longest_match&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;handler_func&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="bp"&gt;None&lt;/span&gt;
        &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;prefix&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;func&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="bp"&gt;self&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;routes&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;items&lt;/span&gt;&lt;span class="p"&gt;():&lt;/span&gt;
            &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;path&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;startswith&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;prefix&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="ow"&gt;and&lt;/span&gt; &lt;span class="nb"&gt;len&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;prefix&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;&amp;gt;&lt;/span&gt; &lt;span class="n"&gt;longest_match&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
                &lt;span class="n"&gt;longest_match&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;handler_func&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nb"&gt;len&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;prefix&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt; &lt;span class="n"&gt;func&lt;/span&gt;
        &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;handler_func&lt;/span&gt; &lt;span class="ow"&gt;is&lt;/span&gt; &lt;span class="bp"&gt;None&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
            &lt;span class="n"&gt;handler_func&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="bp"&gt;self&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;default_handler&lt;/span&gt;
        &lt;span class="k"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;f&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;Response: {handler_func(path)}&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/pre&gt;&lt;/div&gt;


&lt;p&gt;The only trick here is that &lt;code&gt;Router::route()&lt;/code&gt; acts like a decorator and returns a function. Example usage:&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;span class="n"&gt;app&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;Router&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;

&lt;span class="nd"&gt;@app.route&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;/&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;hello&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;_&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="s1"&gt;&amp;#39;Hello to my server!&amp;#39;&lt;/span&gt;

&lt;span class="nd"&gt;@app.route&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;/version&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;version&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;_&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="s1"&gt;&amp;#39;Version 0.1&amp;#39;&lt;/span&gt;

&lt;span class="n"&gt;app&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;handle_request&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;/&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="n"&gt;app&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;handle_request&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;/version&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="n"&gt;app&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;handle_request&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;does-not-exist&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/pre&gt;&lt;/div&gt;


&lt;p&gt;Output:&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;span class="n"&gt;Response&lt;/span&gt;&lt;span class="o"&gt;:&lt;/span&gt; &lt;span class="n"&gt;Hello&lt;/span&gt; &lt;span class="n"&gt;to&lt;/span&gt; &lt;span class="n"&gt;my&lt;/span&gt; &lt;span class="n"&gt;server&lt;/span&gt;&lt;span class="o"&gt;!&lt;/span&gt;
&lt;span class="n"&gt;Response&lt;/span&gt;&lt;span class="o"&gt;:&lt;/span&gt; &lt;span class="n"&gt;Version&lt;/span&gt; &lt;span class="mf"&gt;0.1&lt;/span&gt;
&lt;span class="n"&gt;Response&lt;/span&gt;&lt;span class="o"&gt;:&lt;/span&gt; &lt;span class="mi"&gt;404&lt;/span&gt; &lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="n"&gt;path&lt;/span&gt; &lt;span class="n"&gt;was&lt;/span&gt; &lt;span class="n"&gt;does&lt;/span&gt;&lt;span class="o"&gt;-&lt;/span&gt;&lt;span class="n"&gt;not&lt;/span&gt;&lt;span class="o"&gt;-&lt;/span&gt;&lt;span class="n"&gt;exist&lt;/span&gt;&lt;span class="o"&gt;)&lt;/span&gt;
&lt;/pre&gt;&lt;/div&gt;


&lt;h2&gt;&lt;code&gt;@decorator&lt;/code&gt; vs &lt;code&gt;@decorator()&lt;/code&gt;&lt;/h2&gt;
&lt;p&gt;In the &lt;code&gt;@measure&lt;/code&gt; example, we wrote:&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;span class="nd"&gt;@measure&lt;/span&gt;
&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;sleeper&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;seconds&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;int&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
    &lt;span class="o"&gt;...&lt;/span&gt;
&lt;/pre&gt;&lt;/div&gt;


&lt;p&gt;Could we also write &lt;code&gt;@measure()&lt;/code&gt; before the &lt;code&gt;def&lt;/code&gt;? No! We would get an error:&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;measure() missing 1 required positional argument: &amp;#39;func&amp;#39;
&lt;/pre&gt;&lt;/div&gt;


&lt;p&gt;But, in the &lt;code&gt;app.route()&lt;/code&gt; example, we do write the &lt;code&gt;()&lt;/code&gt; parentheses. The situation is simple: roughly speaking, &lt;code&gt;@decorator def func&lt;/code&gt; gets replaced by &lt;code&gt;func = decorator(func)&lt;/code&gt;. If we write &lt;code&gt;@decorator() def func&lt;/code&gt;, it gets replaced by &lt;code&gt;func = decorator()(func)&lt;/code&gt;. So in the latter case, &lt;code&gt;decorator()&lt;/code&gt; is run, and it needs to return a function which accepts a function as an argument, and returns a function. This is how all the examples where the decorator takes an argument are structured.&lt;/p&gt;
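&lt;p&gt;A minimal sketch of this mechanic (using a hypothetical &lt;code&gt;repeat()&lt;/code&gt; factory, not code from the article): the outer call runs first and returns the actual decorator, which then receives the function.&lt;/p&gt;

```python
import functools

# Hypothetical decorator *factory*: repeat(times=3) runs first and
# returns the real decorator, so @repeat(times=3) over greet means
# greet = repeat(times=3)(greet).
def repeat(times: int = 2):
    def decorator(func):
        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            result = None
            for _ in range(times):
                result = func(*args, **kwargs)
            return result
        return wrapper
    return decorator

@repeat(times=3)
def greet(name):
    return f'Hello, {name}!'

print(greet('world'))  # prints: Hello, world! (greet ran 3 times)
```

&lt;p&gt;Writing bare &lt;code&gt;@repeat&lt;/code&gt; here would fail the same way &lt;code&gt;@measure()&lt;/code&gt; did above, just in the other direction: the function would be passed to &lt;code&gt;repeat()&lt;/code&gt; as its &lt;code&gt;times&lt;/code&gt; argument.&lt;/p&gt;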
&lt;h2&gt;Conclusion&lt;/h2&gt;
&lt;p&gt;In Python, functions are first-class citizens, and decorators are powerful syntactic sugar exploiting this to give programmers a seemingly "magic" way to construct useful compositions of functions and classes. This is an important language feature that sets Python apart from traditional OOP languages like C++ and Java, where achieving such functionality requires more code, or more complex templated code. This dynamic nature of Python creates more runtime overhead compared to a language like C++, but it makes the code easier to write and comprehend. This is a win for programmers and projects; in most real-world software engineering efforts runtime performance is not a bottleneck.&lt;/p&gt;
&lt;p&gt;Thanks &lt;a href="https://github.com/zsol"&gt;Zsolt&lt;/a&gt; for bugfixes and improvement suggestions.&lt;/p&gt;</content><category term="python"></category><category term="decorators"></category></entry><entry><title>Building a toy Python Enum class - Part II</title><link href="https://bytepawn.com/building-a-simple-python-enum-class-part-ii.html" rel="alternate"></link><published>2022-05-05T00:00:00+02:00</published><updated>2022-05-05T00:00:00+02:00</updated><author><name>Marton Trencseni</name></author><id>tag:bytepawn.com,2022-05-05:/building-a-simple-python-enum-class-part-ii.html</id><summary type="html">&lt;p&gt;I extend my previous toy implementation of Python's &lt;code&gt;Enum&lt;/code&gt; class to add more features.&lt;br/&gt;&lt;br/&gt;&lt;img src="/images/getattribute.png" alt="Python enum" style="width: 300px;"/&gt;&lt;/p&gt;</summary><content type="html">&lt;h2&gt;Introduction&lt;/h2&gt;
&lt;p&gt;In the &lt;a href="https://bytepawn.com/building-a-simple-python-enum-class.html#building-a-simple-python-enum-class"&gt;previous article&lt;/a&gt; I started to write a toy implementation of Python's &lt;code&gt;Enum&lt;/code&gt; class. Here I will continue and add more features. The posts are an exercise in how to use Python's language features to build an easy-to-use interface, in this case a class which resembles old-school C++ enums.&lt;/p&gt;
&lt;p&gt;The &lt;a href="https://github.com/mtrencseni/playground/blob/master/Enums%20-%20Part%20II.ipynb"&gt;code for this post&lt;/a&gt; is up on Github. You should also check the &lt;a href="https://github.com/python/cpython/blob/main/Lib/enum.py"&gt;official cpython implementation of &lt;code&gt;Enum&lt;/code&gt; here on Github&lt;/a&gt; (it's 2018 lines of code).&lt;/p&gt;
&lt;h2&gt;Leaner auto()&lt;/h2&gt;
&lt;p&gt;A friend pointed out that the way I implemented &lt;code&gt;auto()&lt;/code&gt; is quite inefficient and unnecessary! It can be simplified to:&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;span class="k"&gt;class&lt;/span&gt; &lt;span class="nc"&gt;auto&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
    &lt;span class="k"&gt;pass&lt;/span&gt;
&lt;/pre&gt;&lt;/div&gt;


&lt;p&gt;Also, we can change our code slightly so users can write either &lt;code&gt;auto()&lt;/code&gt; or just &lt;code&gt;auto&lt;/code&gt; when using this feature. In one case we will find an object of type &lt;code&gt;auto&lt;/code&gt;, in the other case we will find the type &lt;code&gt;auto&lt;/code&gt; itself:&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;span class="k"&gt;class&lt;/span&gt; &lt;span class="nc"&gt;EnumMeta&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nb"&gt;type&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
    &lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="fm"&gt;__new__&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;metacls&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="bp"&gt;cls&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;bases&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;classdict&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="o"&gt;**&lt;/span&gt;&lt;span class="n"&gt;kwds&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
        &lt;span class="n"&gt;enumerations&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="n"&gt;x&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;y&lt;/span&gt; &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;x&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;y&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;classdict&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;items&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt; &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="ow"&gt;not&lt;/span&gt; &lt;span class="n"&gt;x&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;startswith&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;__&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;)}&lt;/span&gt;
        &lt;span class="c1"&gt;# handle auto()&lt;/span&gt;
        &lt;span class="n"&gt;next_value&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt;
        &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;k&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;v&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;enumerations&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;items&lt;/span&gt;&lt;span class="p"&gt;():&lt;/span&gt;
            &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="nb"&gt;type&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;v&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="ow"&gt;is&lt;/span&gt; &lt;span class="n"&gt;auto&lt;/span&gt; &lt;span class="ow"&gt;or&lt;/span&gt; &lt;span class="n"&gt;v&lt;/span&gt; &lt;span class="ow"&gt;is&lt;/span&gt; &lt;span class="n"&gt;auto&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="c1"&gt;# &amp;lt;--------------&lt;/span&gt;
                &lt;span class="n"&gt;enumerations&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;k&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;next_value&lt;/span&gt;
                &lt;span class="n"&gt;next_value&lt;/span&gt; &lt;span class="o"&gt;+=&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt;
            &lt;span class="k"&gt;else&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
                &lt;span class="n"&gt;next_value&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;v&lt;/span&gt; &lt;span class="o"&gt;+&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt;
        &lt;span class="n"&gt;enum&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nb"&gt;super&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="fm"&gt;__new__&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;metacls&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="bp"&gt;cls&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;bases&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;classdict&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="o"&gt;**&lt;/span&gt;&lt;span class="n"&gt;kwds&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
        &lt;span class="n"&gt;enum&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;_enumerations&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;enumerations&lt;/span&gt;
        &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;enum&lt;/span&gt;
    &lt;span class="o"&gt;...&lt;/span&gt;

&lt;span class="k"&gt;class&lt;/span&gt; &lt;span class="nc"&gt;Color&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;Enum&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
    &lt;span class="n"&gt;RED&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;int&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;auto&lt;/span&gt;
    &lt;span class="n"&gt;GREEN&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;int&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;auto&lt;/span&gt;
    &lt;span class="n"&gt;BLUE&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;int&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;auto&lt;/span&gt;
&lt;/pre&gt;&lt;/div&gt;


&lt;h2&gt;String representations&lt;/h2&gt;
&lt;p&gt;We want both &lt;code&gt;Enum&lt;/code&gt; classes (such as &lt;code&gt;Color&lt;/code&gt;) and &lt;code&gt;Enum&lt;/code&gt; objects (such as &lt;code&gt;Color(1)&lt;/code&gt;) to have nice string representations, like with the standard library &lt;code&gt;Enum&lt;/code&gt;. This is easy:&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;span class="k"&gt;class&lt;/span&gt; &lt;span class="nc"&gt;EnumMeta&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nb"&gt;type&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
    &lt;span class="o"&gt;...&lt;/span&gt;
    &lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="fm"&gt;__str__&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="bp"&gt;cls&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
        &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;f&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;&amp;lt;enum &lt;/span&gt;&lt;span class="se"&gt;\&amp;#39;&lt;/span&gt;&lt;span class="s1"&gt;{cls.__name__}&lt;/span&gt;&lt;span class="se"&gt;\&amp;#39;&lt;/span&gt;&lt;span class="s1"&gt;&amp;gt;&amp;#39;&lt;/span&gt;
    &lt;span class="o"&gt;...&lt;/span&gt;

&lt;span class="k"&gt;class&lt;/span&gt; &lt;span class="nc"&gt;Enum&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;metaclass&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;EnumMeta&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
    &lt;span class="o"&gt;...&lt;/span&gt;
    &lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="fm"&gt;__str__&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="bp"&gt;self&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
        &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="s2"&gt;&amp;quot;&lt;/span&gt;&lt;span class="si"&gt;%s&lt;/span&gt;&lt;span class="s2"&gt;.&lt;/span&gt;&lt;span class="si"&gt;%s&lt;/span&gt;&lt;span class="s2"&gt;&amp;quot;&lt;/span&gt; &lt;span class="o"&gt;%&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="bp"&gt;self&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="vm"&gt;__class__&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="vm"&gt;__name__&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="bp"&gt;self&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;__key&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="o"&gt;...&lt;/span&gt;
&lt;/pre&gt;&lt;/div&gt;


&lt;p&gt;Note that with &lt;code&gt;EnumMeta&lt;/code&gt;, the first argument is &lt;code&gt;cls&lt;/code&gt; (not &lt;code&gt;self&lt;/code&gt;), which is a &lt;code&gt;type&lt;/code&gt; object. This gets called when the &lt;code&gt;Color&lt;/code&gt; class itself gets printed out:&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;span class="k"&gt;class&lt;/span&gt; &lt;span class="nc"&gt;Color&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;Enum&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
    &lt;span class="n"&gt;RED&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;int&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;auto&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
    &lt;span class="n"&gt;GREEN&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;int&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;auto&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
    &lt;span class="n"&gt;BLUE&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;int&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;auto&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;

&lt;span class="k"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;Color&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;          &lt;span class="c1"&gt;# __str__() in EnumMeta&lt;/span&gt;
&lt;span class="k"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nb"&gt;type&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;Color&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;)))&lt;/span&gt; &lt;span class="c1"&gt;# __str__() in EnumMeta&lt;/span&gt;
&lt;span class="k"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;Color&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;       &lt;span class="c1"&gt;# __str__() in Enum&lt;/span&gt;
&lt;/pre&gt;&lt;/div&gt;


&lt;p&gt;Output:&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;span class="o"&gt;&amp;lt;&lt;/span&gt;&lt;span class="n"&gt;enum&lt;/span&gt; &lt;span class="s1"&gt;&amp;#39;Color&amp;#39;&lt;/span&gt;&lt;span class="o"&gt;&amp;gt;&lt;/span&gt;
&lt;span class="o"&gt;&amp;lt;&lt;/span&gt;&lt;span class="n"&gt;enum&lt;/span&gt; &lt;span class="s1"&gt;&amp;#39;Color&amp;#39;&lt;/span&gt;&lt;span class="o"&gt;&amp;gt;&lt;/span&gt;
&lt;span class="n"&gt;Color&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;RED&lt;/span&gt;
&lt;/pre&gt;&lt;/div&gt;


&lt;p&gt;Pretty!&lt;/p&gt;
&lt;h2&gt;Accessors&lt;/h2&gt;
&lt;p&gt;Next, let's make it so &lt;code&gt;Color['RED']&lt;/code&gt; and &lt;code&gt;Color.RED&lt;/code&gt; return an appropriate color object. The first one, &lt;code&gt;Color['RED']&lt;/code&gt;, is quite simple: when we use &lt;code&gt;[]&lt;/code&gt; on an object, the class' &lt;code&gt;__getitem__()&lt;/code&gt; is called. For &lt;code&gt;Color['RED']&lt;/code&gt;, the object is the class itself, and its type is &lt;code&gt;EnumMeta&lt;/code&gt;. So all we have to do is:&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;span class="k"&gt;class&lt;/span&gt; &lt;span class="nc"&gt;EnumMeta&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nb"&gt;type&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
    &lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="fm"&gt;__getitem__&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="bp"&gt;cls&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;key&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
        &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="bp"&gt;cls&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="bp"&gt;cls&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;_enumerations&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;key&lt;/span&gt;&lt;span class="p"&gt;])&lt;/span&gt;
&lt;/pre&gt;&lt;/div&gt;


&lt;p&gt;Making &lt;code&gt;Color.RED&lt;/code&gt; work is a bit trickier. The basic idea is simple: when we use the dot &lt;code&gt;.&lt;/code&gt; on an object, the class' &lt;code&gt;__getattr__()&lt;/code&gt; and/or &lt;code&gt;__getattribute__()&lt;/code&gt; is called. To understand the difference between these two, &lt;a href="https://medium.com/@satishgoda/python-attribute-access-using-getattr-and-getattribute-6401f7425ce6"&gt;check this post&lt;/a&gt;. &lt;code&gt;__getattr__()&lt;/code&gt; is called only if the attribute is not found by normal lookup, allowing the user to return a computed value. In our case, &lt;code&gt;Color.RED&lt;/code&gt; is defined, so we have to override &lt;code&gt;__getattribute__()&lt;/code&gt;, which is always called by Python when accessing an attribute with the dot operator. So, it's natural to think that a copy/paste version of the above &lt;code&gt;__getitem__()&lt;/code&gt; will work here:&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;span class="k"&gt;class&lt;/span&gt; &lt;span class="nc"&gt;EnumMeta&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nb"&gt;type&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
    &lt;span class="o"&gt;...&lt;/span&gt;
    &lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="fm"&gt;__getattribute__&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="bp"&gt;cls&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;key&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
        &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="bp"&gt;cls&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="bp"&gt;cls&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;_enumerations&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;key&lt;/span&gt;&lt;span class="p"&gt;])&lt;/span&gt; &lt;span class="c1"&gt;# recursively call itself, kills the kernel!&lt;/span&gt;
&lt;/pre&gt;&lt;/div&gt;


&lt;p&gt;The problem is that we use the dot operator all over the place, including the &lt;code&gt;EnumMeta&lt;/code&gt; constructor:&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;span class="k"&gt;enum&lt;/span&gt;.&lt;span class="n"&gt;_enumerations&lt;/span&gt; = &lt;span class="n"&gt;enumerations&lt;/span&gt;
&lt;/pre&gt;&lt;/div&gt;


&lt;p&gt;What we have to do is to explicitly use Python's built-in &lt;code&gt;__getattribute__()&lt;/code&gt; function, like this:&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;span class="k"&gt;class&lt;/span&gt; &lt;span class="nc"&gt;EnumMeta&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nb"&gt;type&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
    &lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="fm"&gt;__getattribute__&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="bp"&gt;cls&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;key&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
        &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;key&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;startswith&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;_&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
            &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="nb"&gt;object&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="fm"&gt;__getattribute__&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="bp"&gt;cls&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;key&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
        &lt;span class="k"&gt;else&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
            &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="bp"&gt;cls&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nb"&gt;object&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="fm"&gt;__getattribute__&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="bp"&gt;cls&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s1"&gt;&amp;#39;_enumerations&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;)[&lt;/span&gt;&lt;span class="n"&gt;key&lt;/span&gt;&lt;span class="p"&gt;])&lt;/span&gt;
    &lt;span class="o"&gt;...&lt;/span&gt;
&lt;/pre&gt;&lt;/div&gt;


&lt;p&gt;What this does is: if we are trying to access an attribute which starts with an underscore, such as &lt;code&gt;__class__&lt;/code&gt; or our own &lt;code&gt;_enumerations&lt;/code&gt;, it routes the access through the usual channel of built-in Python attribute look-up. But in other cases, such as &lt;code&gt;Color.RED&lt;/code&gt;, it returns a new object constructed on the fly: &lt;code&gt;cls(...)&lt;/code&gt; in the last line would be &lt;code&gt;Color(...)&lt;/code&gt;, &lt;code&gt;object.__getattribute__()&lt;/code&gt; is used to avoid recursively entering this function, and &lt;code&gt;_enumerations[key]&lt;/code&gt; is our own helper dictionary where we store the &lt;code&gt;Enum&lt;/code&gt;'s cases. In the case of &lt;code&gt;Color.RED&lt;/code&gt;, &lt;code&gt;key&lt;/code&gt; is &lt;code&gt;'RED'&lt;/code&gt;, &lt;code&gt;_enumerations[key]&lt;/code&gt; is &lt;code&gt;1&lt;/code&gt;, so the whole thing becomes &lt;code&gt;return Color(1)&lt;/code&gt;.&lt;/p&gt;
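&lt;p&gt;The difference between the two hooks can be seen in a standalone toy example (not part of the &lt;code&gt;Enum&lt;/code&gt; code): &lt;code&gt;__getattribute__()&lt;/code&gt; runs on every dot access, while &lt;code&gt;__getattr__()&lt;/code&gt; only runs when normal lookup fails.&lt;/p&gt;

```python
# Toy class illustrating when each attribute hook fires.
class Demo:
    existing = 42

    def __getattribute__(self, name):
        # called for *every* attribute access; delegate to object's
        # implementation to avoid infinite recursion
        return object.__getattribute__(self, name)

    def __getattr__(self, name):
        # called only if __getattribute__ raised AttributeError,
        # i.e. the attribute was not found by normal lookup
        return f'computed {name}'

d = Demo()
print(d.existing)  # 42, found by normal lookup via __getattribute__
print(d.missing)   # 'computed missing', falls back to __getattr__
```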
&lt;h2&gt;Membership and equality&lt;/h2&gt;
&lt;p&gt;Another useful feature of the standard library &lt;code&gt;Enum&lt;/code&gt; is the ability to check &lt;code&gt;Color.RED in Color&lt;/code&gt;, and of course &lt;code&gt;Color.RED == Color.GREEN&lt;/code&gt; checks. These are easy:&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;span class="k"&gt;class&lt;/span&gt; &lt;span class="nc"&gt;EnumMeta&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nb"&gt;type&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
    &lt;span class="o"&gt;...&lt;/span&gt;
    &lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="fm"&gt;__contains__&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="bp"&gt;cls&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;other&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
        &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="nb"&gt;type&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;other&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;==&lt;/span&gt; &lt;span class="bp"&gt;cls&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
            &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="bp"&gt;True&lt;/span&gt;
        &lt;span class="k"&gt;else&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
            &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="bp"&gt;False&lt;/span&gt;  

&lt;span class="k"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;Color&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;RED&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;Color&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;    &lt;span class="c1"&gt;# True&lt;/span&gt;
&lt;span class="k"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;Color&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;RED&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;Color&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;       &lt;span class="c1"&gt;# True&lt;/span&gt;
&lt;span class="k"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;Color&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;Color&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;        &lt;span class="c1"&gt;# True&lt;/span&gt;
&lt;span class="k"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;Color&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;RED&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;WeekendDay&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;  &lt;span class="c1"&gt;# False&lt;/span&gt;
&lt;span class="k"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;Color&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;BLUE&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;WeekendDay&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="c1"&gt;# False&lt;/span&gt;
&lt;/pre&gt;&lt;/div&gt;


&lt;p&gt;Equality checks:&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;span class="k"&gt;class&lt;/span&gt; &lt;span class="nc"&gt;Enum&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;metaclass&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;EnumMeta&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
    &lt;span class="o"&gt;...&lt;/span&gt;
    &lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="fm"&gt;__eq__&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="bp"&gt;self&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;other&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
        &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="nb"&gt;type&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="bp"&gt;self&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;!=&lt;/span&gt; &lt;span class="nb"&gt;type&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;other&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
            &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="bp"&gt;False&lt;/span&gt;
        &lt;span class="k"&gt;else&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
            &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="bp"&gt;self&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;__key&lt;/span&gt; &lt;span class="o"&gt;==&lt;/span&gt; &lt;span class="n"&gt;other&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;__key&lt;/span&gt; &lt;span class="ow"&gt;and&lt;/span&gt; &lt;span class="bp"&gt;self&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;__value&lt;/span&gt; &lt;span class="o"&gt;==&lt;/span&gt; &lt;span class="n"&gt;other&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;__value&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="k"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;Color&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;RED&lt;/span&gt; &lt;span class="o"&gt;==&lt;/span&gt; &lt;span class="n"&gt;Color&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;               &lt;span class="c1"&gt;# False&lt;/span&gt;
&lt;span class="k"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;Color&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;RED&lt;/span&gt; &lt;span class="o"&gt;==&lt;/span&gt; &lt;span class="n"&gt;Color&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;RED&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;           &lt;span class="c1"&gt;# True&lt;/span&gt;
&lt;span class="k"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;Color&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;RED&lt;/span&gt; &lt;span class="o"&gt;==&lt;/span&gt; &lt;span class="n"&gt;Color&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;RED&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;])&lt;/span&gt;        &lt;span class="c1"&gt;# True&lt;/span&gt;
&lt;span class="k"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;Color&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;RED&lt;/span&gt; &lt;span class="o"&gt;==&lt;/span&gt; &lt;span class="n"&gt;Color&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;            &lt;span class="c1"&gt;# True&lt;/span&gt;
&lt;span class="k"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;Color&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;RED&lt;/span&gt; &lt;span class="o"&gt;==&lt;/span&gt; &lt;span class="n"&gt;Color&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;2&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;            &lt;span class="c1"&gt;# False&lt;/span&gt;
&lt;span class="k"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;Color&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;RED&lt;/span&gt; &lt;span class="o"&gt;==&lt;/span&gt; &lt;span class="n"&gt;WeekendDay&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;SATURDAY&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="c1"&gt;# False&lt;/span&gt;
&lt;/pre&gt;&lt;/div&gt;


&lt;h2&gt;Static objects&lt;/h2&gt;
&lt;p&gt;There is an issue with the implementation so far:&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;span class="k"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;Color&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;RED&lt;/span&gt; &lt;span class="ow"&gt;is&lt;/span&gt; &lt;span class="n"&gt;Color&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;RED&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;    &lt;span class="c1"&gt;# False&lt;/span&gt;
&lt;span class="k"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;Color&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;RED&lt;/span&gt; &lt;span class="ow"&gt;is&lt;/span&gt; &lt;span class="n"&gt;Color&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;Red&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;])&lt;/span&gt; &lt;span class="c1"&gt;# False&lt;/span&gt;
&lt;span class="k"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;Color&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;RED&lt;/span&gt; &lt;span class="ow"&gt;is&lt;/span&gt; &lt;span class="n"&gt;Color&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;     &lt;span class="c1"&gt;# False&lt;/span&gt;
&lt;/pre&gt;&lt;/div&gt;


&lt;p&gt;At least the first one should be &lt;code&gt;True&lt;/code&gt;, and probably all of them (in the standard library version, they're all &lt;code&gt;True&lt;/code&gt;). Unlike equality, Python's &lt;code&gt;is&lt;/code&gt; checks whether two variables refer to the same object. But in our implementation so far, each of these calls returns a new object (by doing &lt;code&gt;cls(...)&lt;/code&gt;). This breaks the identity checks above, and is also wasteful. Let's fix this.&lt;/p&gt;
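&lt;p&gt;As a standalone refresher (the &lt;code&gt;Pair&lt;/code&gt; class below is hypothetical, not part of our &lt;code&gt;Enum&lt;/code&gt; code), here is equality versus identity in plain Python:&lt;/p&gt;

```python
# Hypothetical Pair class: '==' calls __eq__, 'is' compares object identity.
class Pair:
    def __init__(self, key, value):
        self.key = key
        self.value = value

    def __eq__(self, other):
        # structural equality: compare contents, not identity
        return (self.key, self.value) == (other.key, other.value)

a = Pair('RED', 1)
b = Pair('RED', 1)
print(a == b)  # True: equal contents
print(a is b)  # False: two distinct objects
```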
&lt;p&gt;Let's create one instance for each of the &lt;code&gt;Enum&lt;/code&gt; cases (&lt;code&gt;RED, GREEN, BLUE&lt;/code&gt; in the example) up front, and always return references to these objects, transparently, so the user doesn't notice. The first step is relatively easy: we create these static instances when the class is defined:&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;span class="k"&gt;class&lt;/span&gt; &lt;span class="nc"&gt;EnumMeta&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nb"&gt;type&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
    &lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="fm"&gt;__new__&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;metacls&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="bp"&gt;cls&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;bases&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;classdict&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="o"&gt;**&lt;/span&gt;&lt;span class="n"&gt;kwds&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
        &lt;span class="o"&gt;...&lt;/span&gt;
        &lt;span class="c1"&gt;# make &amp;quot;static&amp;quot; instances of each enumeration object&lt;/span&gt;
        &lt;span class="n"&gt;enum&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;_instances&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="n"&gt;k&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;object&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="fm"&gt;__new__&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;enum&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;k&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;v&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;enumerations&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;items&lt;/span&gt;&lt;span class="p"&gt;()}&lt;/span&gt;
        &lt;span class="c1"&gt;# initialize static instances&lt;/span&gt;
        &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;k&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;instance&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;enum&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;_instances&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;items&lt;/span&gt;&lt;span class="p"&gt;():&lt;/span&gt;
            &lt;span class="n"&gt;instance&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="fm"&gt;__init__&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;enumerations&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;k&lt;/span&gt;&lt;span class="p"&gt;])&lt;/span&gt;
        &lt;span class="o"&gt;...&lt;/span&gt;
&lt;/pre&gt;&lt;/div&gt;
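&lt;p&gt;The allocate-then-initialize pattern above can be sketched in isolation (the &lt;code&gt;Point&lt;/code&gt; class is hypothetical): &lt;code&gt;object.__new__(cls)&lt;/code&gt; allocates a bare instance without running &lt;code&gt;__init__()&lt;/code&gt;, which we then call explicitly, like the metaclass does:&lt;/p&gt;

```python
# Hypothetical Point class illustrating object.__new__ + explicit __init__.
class Point:
    def __init__(self, x):
        self.x = x

p = object.__new__(Point)   # allocates the instance, __init__ does not run
print(hasattr(p, 'x'))      # False: not initialized yet
p.__init__(42)              # initialize explicitly
print(p.x)                  # 42
```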


&lt;p&gt;Now we have to change our code to always return these saved instances. In our toy &lt;code&gt;Enum&lt;/code&gt; class, there are several ways to get an &lt;code&gt;Enum&lt;/code&gt; object such as &lt;code&gt;Color&lt;/code&gt;: &lt;code&gt;Color.RED, Color['RED'], Color(1)&lt;/code&gt;. Making the first two return the saved instances is easy, because we already explicitly control what happens in those cases with our &lt;code&gt;__getitem__()&lt;/code&gt; and &lt;code&gt;__getattribute__()&lt;/code&gt; implementations. But what about when the user explicitly calls the constructor, like &lt;code&gt;Color(1)&lt;/code&gt;, which our own functions also currently do when returning &lt;code&gt;Enum&lt;/code&gt; objects? If we get the constructor to return the static instances, we're done.&lt;/p&gt;
&lt;p&gt;In Python, when the user calls the constructor, two things happen. First, the class's &lt;code&gt;__new__()&lt;/code&gt; is called to construct the object (so this has a return value), and then &lt;code&gt;__init__()&lt;/code&gt; is called to initialize the already created object, which is conventionally called &lt;code&gt;self&lt;/code&gt; (this has no return value, as &lt;code&gt;self&lt;/code&gt; is already a given). Clearly here we have to write our custom &lt;code&gt;__new__()&lt;/code&gt;, and also avoid unnecessary duplicate initialization in &lt;code&gt;__init__()&lt;/code&gt;. The cleanest way to make this work that I have found is to override the newly created &lt;code&gt;Enum&lt;/code&gt;'s &lt;code&gt;__new__()&lt;/code&gt; when the type is being created:&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;span class="k"&gt;class&lt;/span&gt; &lt;span class="nc"&gt;EnumMeta&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nb"&gt;type&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
    &lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="fm"&gt;__new__&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;metacls&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="bp"&gt;cls&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;bases&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;classdict&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="o"&gt;**&lt;/span&gt;&lt;span class="n"&gt;kwds&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
        &lt;span class="o"&gt;...&lt;/span&gt;
        &lt;span class="c1"&gt;# overwrite the new Enum&amp;#39;s __new__() so that is returns the static instances&lt;/span&gt;
        &lt;span class="n"&gt;enum&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="fm"&gt;__new__&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;lambda&lt;/span&gt; &lt;span class="bp"&gt;cls&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;value&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;enum&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;_instances&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;reverse_enumerations&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;value&lt;/span&gt;&lt;span class="p"&gt;]]&lt;/span&gt;
        &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;enum&lt;/span&gt;
&lt;/pre&gt;&lt;/div&gt;
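&lt;p&gt;The same construction-interception trick can be sketched without a metaclass (the &lt;code&gt;Interned&lt;/code&gt; class and its &lt;code&gt;_cache&lt;/code&gt; are hypothetical names): because &lt;code&gt;__new__()&lt;/code&gt; has a return value, it can hand back a saved instance instead of a fresh one:&lt;/p&gt;

```python
# Hypothetical Interned class: __new__ returns saved instances.
class Interned:
    _cache = {}

    def __new__(cls, value):
        # return the one saved instance for this value, creating it on first use
        if value not in cls._cache:
            cls._cache[value] = super().__new__(cls)
        return cls._cache[value]

    def __init__(self, value):
        self.value = value  # note: this re-runs on every construction

a = Interned(1)
b = Interned(1)
print(a is b)  # True: the constructor returned the saved instance
```

&lt;p&gt;Note that &lt;code&gt;__init__()&lt;/code&gt; still runs on every call, even when &lt;code&gt;__new__()&lt;/code&gt; returns a saved instance.&lt;/p&gt;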


&lt;p&gt;Short circuiting the &lt;code&gt;__init__()&lt;/code&gt; is an easy further optimization:&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;span class="k"&gt;class&lt;/span&gt; &lt;span class="nc"&gt;Enum&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;metaclass&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;EnumMeta&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;    
    &lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="fm"&gt;__init__&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="bp"&gt;self&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;value&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
        &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="nb"&gt;hasattr&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="bp"&gt;self&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s1"&gt;&amp;#39;_Enum__key&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="ow"&gt;and&lt;/span&gt; &lt;span class="nb"&gt;hasattr&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="bp"&gt;self&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s1"&gt;&amp;#39;_Enum__value&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
            &lt;span class="k"&gt;return&lt;/span&gt;
        &lt;span class="o"&gt;...&lt;/span&gt;
&lt;/pre&gt;&lt;/div&gt;
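&lt;p&gt;The guard is needed because &lt;code&gt;__init__()&lt;/code&gt; runs on every construction, even when &lt;code&gt;__new__()&lt;/code&gt; returns an existing object. A minimal sketch of the same idea (the &lt;code&gt;Cached&lt;/code&gt; class is hypothetical):&lt;/p&gt;

```python
# Hypothetical Cached class: __new__ returns one shared instance,
# __init__ short-circuits if the instance is already initialized.
class Cached:
    _instance = None

    def __new__(cls):
        if cls._instance is None:
            cls._instance = super().__new__(cls)
        return cls._instance

    def __init__(self):
        if hasattr(self, 'count'):
            return  # already initialized: skip re-initialization
        self.count = 1

a = Cached()
b = Cached()
print(a is b)   # True: one shared instance
print(b.count)  # 1: the __init__ body only ran once
```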


&lt;p&gt;The attributes are named &lt;code&gt;_Enum__key&lt;/code&gt; because of Python's name mangling of "private" class members.&lt;/p&gt;
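&lt;p&gt;A quick sketch of the mangling (the &lt;code&gt;Box&lt;/code&gt; class is hypothetical): inside a class body, &lt;code&gt;self.__key&lt;/code&gt; is rewritten by the compiler to &lt;code&gt;self._ClassName__key&lt;/code&gt;, so that is the name &lt;code&gt;hasattr()&lt;/code&gt; must check:&lt;/p&gt;

```python
# Hypothetical Box class demonstrating private name mangling.
class Box:
    def __init__(self):
        self.__key = 'RED'  # stored as self._Box__key

b = Box()
print(hasattr(b, '_Box__key'))  # True: the mangled name exists
print(hasattr(b, '__key'))      # False: no mangling outside the class body
```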
&lt;p&gt;By inserting some &lt;code&gt;print()&lt;/code&gt; statements, we can verify that our code works as intended:&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;span class="k"&gt;class&lt;/span&gt; &lt;span class="nc"&gt;Color&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;Enum&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
    &lt;span class="n"&gt;RED&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;int&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;auto&lt;/span&gt;
    &lt;span class="n"&gt;GREEN&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;int&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;auto&lt;/span&gt;
    &lt;span class="n"&gt;BLUE&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;int&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;auto&lt;/span&gt;

&lt;span class="k"&gt;class&lt;/span&gt; &lt;span class="nc"&gt;WeekendDay&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;Enum&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
    &lt;span class="n"&gt;SATURDAY&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;int&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;auto&lt;/span&gt;
    &lt;span class="n"&gt;SUNDAY&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;int&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;auto&lt;/span&gt;

&lt;span class="k"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;Color&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;RED&lt;/span&gt; &lt;span class="o"&gt;==&lt;/span&gt; &lt;span class="n"&gt;Color&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;               &lt;span class="c1"&gt;# False&lt;/span&gt;
&lt;span class="k"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;Color&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;RED&lt;/span&gt; &lt;span class="o"&gt;==&lt;/span&gt; &lt;span class="n"&gt;Color&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;RED&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;           &lt;span class="c1"&gt;# True&lt;/span&gt;
&lt;span class="k"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;Color&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;RED&lt;/span&gt; &lt;span class="ow"&gt;is&lt;/span&gt; &lt;span class="n"&gt;Color&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;RED&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;           &lt;span class="c1"&gt;# True&lt;/span&gt;
&lt;span class="k"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;Color&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;RED&lt;/span&gt; &lt;span class="o"&gt;==&lt;/span&gt; &lt;span class="n"&gt;Color&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;RED&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;])&lt;/span&gt;        &lt;span class="c1"&gt;# True&lt;/span&gt;
&lt;span class="k"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;Color&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;RED&lt;/span&gt; &lt;span class="o"&gt;==&lt;/span&gt; &lt;span class="n"&gt;Color&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;            &lt;span class="c1"&gt;# True&lt;/span&gt;
&lt;span class="k"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;Color&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;RED&lt;/span&gt; &lt;span class="o"&gt;==&lt;/span&gt; &lt;span class="n"&gt;Color&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;2&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;            &lt;span class="c1"&gt;# False&lt;/span&gt;
&lt;span class="k"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;Color&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;RED&lt;/span&gt; &lt;span class="o"&gt;==&lt;/span&gt; &lt;span class="n"&gt;WeekendDay&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;SATURDAY&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="c1"&gt;# False&lt;/span&gt;
&lt;span class="k"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;Color&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;RED&lt;/span&gt; &lt;span class="ow"&gt;is&lt;/span&gt; &lt;span class="n"&gt;Color&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;RED&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;           &lt;span class="c1"&gt;# True&lt;/span&gt;
&lt;span class="k"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;Color&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;RED&lt;/span&gt; &lt;span class="ow"&gt;is&lt;/span&gt; &lt;span class="n"&gt;Color&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;RED&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;])&lt;/span&gt;        &lt;span class="c1"&gt;# True&lt;/span&gt;
&lt;span class="k"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;Color&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;RED&lt;/span&gt; &lt;span class="ow"&gt;is&lt;/span&gt; &lt;span class="n"&gt;Color&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;            &lt;span class="c1"&gt;# True&lt;/span&gt;
&lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;_&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="nb"&gt;range&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;100&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
    &lt;span class="n"&gt;c&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;Color&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;3&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="n"&gt;d&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;WeekendDay&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;2&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/pre&gt;&lt;/div&gt;


&lt;p&gt;Output:&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;Constructing new type Enum
Constructing new type Color
Initializing new &amp;lt;enum &amp;#39;Color&amp;#39;&amp;gt;
Initializing new &amp;lt;enum &amp;#39;Color&amp;#39;&amp;gt;
Initializing new &amp;lt;enum &amp;#39;Color&amp;#39;&amp;gt;
Constructing new type WeekendDay
Initializing new &amp;lt;enum &amp;#39;WeekendDay&amp;#39;&amp;gt;
Initializing new &amp;lt;enum &amp;#39;WeekendDay&amp;#39;&amp;gt;
False
True
True
True
True
False
False
True
True
True
&lt;/pre&gt;&lt;/div&gt;


&lt;p&gt;First the &lt;code&gt;Enum&lt;/code&gt; type itself is constructed, then &lt;code&gt;Color&lt;/code&gt;, then the 3 &lt;code&gt;Color&lt;/code&gt; instances are created; then the &lt;code&gt;WeekendDay&lt;/code&gt; type is constructed and the 2 &lt;code&gt;WeekendDay&lt;/code&gt; instances are created. The test code itself, including the 100-iteration loop, doesn't create any more objects!&lt;/p&gt;
&lt;h2&gt;Final version&lt;/h2&gt;
&lt;p&gt;The final version is 71 lines of code:&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;span class="k"&gt;class&lt;/span&gt; &lt;span class="nc"&gt;auto&lt;/span&gt;&lt;span class="p"&gt;():&lt;/span&gt;
    &lt;span class="k"&gt;pass&lt;/span&gt;

&lt;span class="k"&gt;class&lt;/span&gt; &lt;span class="nc"&gt;EnumMeta&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nb"&gt;type&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
    &lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="fm"&gt;__new__&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;metacls&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="bp"&gt;cls&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;bases&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;classdict&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="o"&gt;**&lt;/span&gt;&lt;span class="n"&gt;kwds&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
        &lt;span class="n"&gt;enumerations&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="n"&gt;x&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;y&lt;/span&gt; &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;x&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;y&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;classdict&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;items&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt; &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="ow"&gt;not&lt;/span&gt; &lt;span class="n"&gt;x&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;startswith&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;__&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;)}&lt;/span&gt;
        &lt;span class="c1"&gt;# handle auto() and auto&lt;/span&gt;
        &lt;span class="n"&gt;next_value&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt;
        &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;k&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;v&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;enumerations&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;items&lt;/span&gt;&lt;span class="p"&gt;():&lt;/span&gt;
            &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="nb"&gt;type&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;v&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="ow"&gt;is&lt;/span&gt; &lt;span class="n"&gt;auto&lt;/span&gt; &lt;span class="ow"&gt;or&lt;/span&gt; &lt;span class="n"&gt;v&lt;/span&gt; &lt;span class="ow"&gt;is&lt;/span&gt; &lt;span class="n"&gt;auto&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
                &lt;span class="n"&gt;enumerations&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;k&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;next_value&lt;/span&gt;
                &lt;span class="n"&gt;next_value&lt;/span&gt; &lt;span class="o"&gt;+=&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt;
            &lt;span class="k"&gt;else&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
                &lt;span class="n"&gt;next_value&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;v&lt;/span&gt; &lt;span class="o"&gt;+&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt;
        &lt;span class="n"&gt;enum&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nb"&gt;super&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="fm"&gt;__new__&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;metacls&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="bp"&gt;cls&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;bases&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;classdict&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="o"&gt;**&lt;/span&gt;&lt;span class="n"&gt;kwds&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
        &lt;span class="n"&gt;enum&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;_enumerations&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;enumerations&lt;/span&gt;
        &lt;span class="n"&gt;reverse_enumerations&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="n"&gt;y&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;x&lt;/span&gt; &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;x&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;y&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;enumerations&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;items&lt;/span&gt;&lt;span class="p"&gt;()}&lt;/span&gt;
        &lt;span class="c1"&gt;# make &amp;quot;static&amp;quot; instances of each enumeration object&lt;/span&gt;
        &lt;span class="n"&gt;enum&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;_instances&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="n"&gt;k&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;object&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="fm"&gt;__new__&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;enum&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;k&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;v&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;enumerations&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;items&lt;/span&gt;&lt;span class="p"&gt;()}&lt;/span&gt;
        &lt;span class="c1"&gt;# initialize static instances&lt;/span&gt;
        &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;k&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;instance&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;enum&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;_instances&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;items&lt;/span&gt;&lt;span class="p"&gt;():&lt;/span&gt;
            &lt;span class="n"&gt;instance&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="fm"&gt;__init__&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;enumerations&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;k&lt;/span&gt;&lt;span class="p"&gt;])&lt;/span&gt;
        &lt;span class="c1"&gt;# overwrite the new Enum&amp;#39;s __new__() so that is returns the static instances&lt;/span&gt;
        &lt;span class="n"&gt;enum&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="fm"&gt;__new__&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;lambda&lt;/span&gt; &lt;span class="bp"&gt;cls&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;value&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;enum&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;_instances&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;reverse_enumerations&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;value&lt;/span&gt;&lt;span class="p"&gt;]]&lt;/span&gt;
        &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;enum&lt;/span&gt;

    &lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="fm"&gt;__str__&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="bp"&gt;cls&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
        &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;f&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;&amp;lt;enum &lt;/span&gt;&lt;span class="se"&gt;\&amp;#39;&lt;/span&gt;&lt;span class="s1"&gt;{cls.__name__}&lt;/span&gt;&lt;span class="se"&gt;\&amp;#39;&lt;/span&gt;&lt;span class="s1"&gt;&amp;gt;&amp;#39;&lt;/span&gt;

    &lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="fm"&gt;__len__&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="bp"&gt;cls&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
        &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="nb"&gt;len&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="bp"&gt;cls&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;_enumerations&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

    &lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="fm"&gt;__iter__&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="bp"&gt;cls&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
        &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="bp"&gt;cls&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;value&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;value&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="bp"&gt;cls&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;_enumerations&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;values&lt;/span&gt;&lt;span class="p"&gt;())&lt;/span&gt;

    &lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="fm"&gt;__getitem__&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="bp"&gt;cls&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;key&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
        &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="bp"&gt;cls&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="bp"&gt;cls&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;_enumerations&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;key&lt;/span&gt;&lt;span class="p"&gt;])&lt;/span&gt;

    &lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="fm"&gt;__getattribute__&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="bp"&gt;cls&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;key&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
        &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;key&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;startswith&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;_&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
            &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="nb"&gt;object&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="fm"&gt;__getattribute__&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="bp"&gt;cls&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;key&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
        &lt;span class="k"&gt;else&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
            &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="bp"&gt;cls&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nb"&gt;object&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="fm"&gt;__getattribute__&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="bp"&gt;cls&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s1"&gt;&amp;#39;_enumerations&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;)[&lt;/span&gt;&lt;span class="n"&gt;key&lt;/span&gt;&lt;span class="p"&gt;])&lt;/span&gt;

    &lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="fm"&gt;__contains__&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="bp"&gt;cls&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;other&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
        &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="nb"&gt;type&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;other&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;==&lt;/span&gt; &lt;span class="bp"&gt;cls&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
            &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="bp"&gt;True&lt;/span&gt;
        &lt;span class="k"&gt;else&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
            &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="bp"&gt;False&lt;/span&gt;

&lt;span class="k"&gt;class&lt;/span&gt; &lt;span class="nc"&gt;Enum&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;metaclass&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;EnumMeta&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;    
    &lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="fm"&gt;__init__&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="bp"&gt;self&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;value&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
        &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="nb"&gt;hasattr&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="bp"&gt;self&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s1"&gt;&amp;#39;_Enum__key&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="ow"&gt;and&lt;/span&gt; &lt;span class="nb"&gt;hasattr&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="bp"&gt;self&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s1"&gt;&amp;#39;_Enum__value&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
            &lt;span class="k"&gt;return&lt;/span&gt;
        &lt;span class="c1"&gt;# make sure the passed in value is a valid enumeration value&lt;/span&gt;
        &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;value&lt;/span&gt; &lt;span class="ow"&gt;not&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="bp"&gt;self&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="vm"&gt;__class__&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;_enumerations&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;values&lt;/span&gt;&lt;span class="p"&gt;():&lt;/span&gt;
            &lt;span class="k"&gt;raise&lt;/span&gt; &lt;span class="ne"&gt;ValueError&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;f&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;{value} is not a valid {self.__class__.__name__}&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
        &lt;span class="c1"&gt;# save the actual enumeration value&lt;/span&gt;
        &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;k&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;v&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="bp"&gt;self&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="vm"&gt;__class__&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;_enumerations&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;items&lt;/span&gt;&lt;span class="p"&gt;():&lt;/span&gt;
            &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;v&lt;/span&gt; &lt;span class="o"&gt;==&lt;/span&gt; &lt;span class="n"&gt;value&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
                &lt;span class="bp"&gt;self&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;__key&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;k&lt;/span&gt;
                &lt;span class="bp"&gt;self&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;__value&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;v&lt;/span&gt;

    &lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="fm"&gt;__str__&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="bp"&gt;self&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
        &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="s2"&gt;&amp;quot;&lt;/span&gt;&lt;span class="si"&gt;%s&lt;/span&gt;&lt;span class="s2"&gt;.&lt;/span&gt;&lt;span class="si"&gt;%s&lt;/span&gt;&lt;span class="s2"&gt;&amp;quot;&lt;/span&gt; &lt;span class="o"&gt;%&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="bp"&gt;self&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="vm"&gt;__class__&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="vm"&gt;__name__&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="bp"&gt;self&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;__key&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

    &lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="fm"&gt;__eq__&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="bp"&gt;self&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;other&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
        &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="nb"&gt;type&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="bp"&gt;self&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;!=&lt;/span&gt; &lt;span class="nb"&gt;type&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;other&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
            &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="bp"&gt;False&lt;/span&gt;
        &lt;span class="k"&gt;else&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
            &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="bp"&gt;self&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;__key&lt;/span&gt; &lt;span class="o"&gt;==&lt;/span&gt; &lt;span class="n"&gt;other&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;__key&lt;/span&gt; &lt;span class="ow"&gt;and&lt;/span&gt; &lt;span class="bp"&gt;self&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;__value&lt;/span&gt; &lt;span class="o"&gt;==&lt;/span&gt; &lt;span class="n"&gt;other&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;__value&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/pre&gt;&lt;/div&gt;


&lt;p&gt;You should also check the &lt;a href="https://github.com/python/cpython/blob/main/Lib/enum.py"&gt;official cpython implementation of &lt;code&gt;Enum&lt;/code&gt; on Github&lt;/a&gt; (it's 2018 lines of code).&lt;/p&gt;
&lt;h2&gt;Conclusion&lt;/h2&gt;
&lt;p&gt;This was a great exercise to sharpen my Python Fu. Highly recommended! Thanks &lt;a href="https://github.com/zsol"&gt;Zsolt&lt;/a&gt; for bugfixes and improvement suggestions.&lt;/p&gt;</content><category term="python"></category><category term="enum"></category></entry><entry><title>Building a toy Python Enum class - Part I</title><link href="https://bytepawn.com/building-a-simple-python-enum-class.html" rel="alternate"></link><published>2022-05-03T00:00:00+02:00</published><updated>2022-05-03T00:00:00+02:00</updated><author><name>Marton Trencseni</name></author><id>tag:bytepawn.com,2022-05-03:/building-a-simple-python-enum-class.html</id><summary type="html">&lt;p&gt;I write a toy implementation of Python's &lt;code&gt;Enum&lt;/code&gt; class to learn about Python metaclasses.&lt;br/&gt;&lt;br/&gt;&lt;img src="/images/enum2.png" alt="Python enum" style="width: 300px;"/&gt;&lt;/p&gt;</summary><content type="html">&lt;h2&gt;Introduction&lt;/h2&gt;
&lt;p&gt;Enumerations, or &lt;code&gt;enum&lt;/code&gt;s for short, are part of the core language in traditional statically typed languages like C, C++ and Java. For example, in C++ we can write:&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;span class="k"&gt;enum&lt;/span&gt; &lt;span class="n"&gt;Color&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="n"&gt;RED&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;GREEN&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;BLUE&lt;/span&gt; &lt;span class="p"&gt;};&lt;/span&gt;
&lt;span class="n"&gt;Color&lt;/span&gt; &lt;span class="n"&gt;r&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;RED&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;

&lt;span class="k"&gt;switch&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;r&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="k"&gt;case&lt;/span&gt; &lt;span class="nl"&gt;RED&lt;/span&gt;  &lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;std&lt;/span&gt;&lt;span class="o"&gt;::&lt;/span&gt;&lt;span class="n"&gt;cout&lt;/span&gt; &lt;span class="o"&gt;&amp;lt;&amp;lt;&lt;/span&gt; &lt;span class="s"&gt;&amp;quot;red&lt;/span&gt;&lt;span class="se"&gt;\n&lt;/span&gt;&lt;span class="s"&gt;&amp;quot;&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;   &lt;span class="k"&gt;break&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
    &lt;span class="k"&gt;case&lt;/span&gt; &lt;span class="nl"&gt;GREEN&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;std&lt;/span&gt;&lt;span class="o"&gt;::&lt;/span&gt;&lt;span class="n"&gt;cout&lt;/span&gt; &lt;span class="o"&gt;&amp;lt;&amp;lt;&lt;/span&gt; &lt;span class="s"&gt;&amp;quot;green&lt;/span&gt;&lt;span class="se"&gt;\n&lt;/span&gt;&lt;span class="s"&gt;&amp;quot;&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt; &lt;span class="k"&gt;break&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
    &lt;span class="k"&gt;case&lt;/span&gt; &lt;span class="nl"&gt;BLUE&lt;/span&gt; &lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;std&lt;/span&gt;&lt;span class="o"&gt;::&lt;/span&gt;&lt;span class="n"&gt;cout&lt;/span&gt; &lt;span class="o"&gt;&amp;lt;&amp;lt;&lt;/span&gt; &lt;span class="s"&gt;&amp;quot;blue&lt;/span&gt;&lt;span class="se"&gt;\n&lt;/span&gt;&lt;span class="s"&gt;&amp;quot;&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;  &lt;span class="k"&gt;break&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/pre&gt;&lt;/div&gt;


&lt;p&gt;When the C++ compiler sees an enum definition, it allocates a special memory structure wide enough to hold the possible values (in the above case, there are 3 possible values, so 2 bits would be enough, but the compiler would actually allocate at least 1 byte, usually 4 bytes). Since C++ is statically typed, if we accidentally write a case statement like &lt;code&gt;case PURPLE : std::cout &amp;lt;&amp;lt; "purple\n";  break;&lt;/code&gt; we would get an error from the compiler, since it knows that &lt;code&gt;PURPLE&lt;/code&gt; is not a valid/possible value of &lt;code&gt;Color&lt;/code&gt;, as defined above.&lt;/p&gt;
&lt;p&gt;Python is a very different language from C++: &lt;strong&gt;enumerations are not part of the core Python language&lt;/strong&gt;, unlike say tuples. In Python, &lt;code&gt;Enum&lt;/code&gt;s (upper case) are part of the standard library, implemented in Python code, using classes in a tricky way. For example, here is &lt;a href="https://github.com/python/cpython/blob/main/Lib/enum.py"&gt;&lt;code&gt;enum.py&lt;/code&gt; (link)&lt;/a&gt; from &lt;a href="https://github.com/python/cpython"&gt;&lt;code&gt;cpython&lt;/code&gt;&lt;/a&gt;, and on line 1077 you can read the (very tricky) implementation of &lt;code&gt;Enum&lt;/code&gt;s. It looks something like:&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;span class="k"&gt;class&lt;/span&gt; &lt;span class="nc"&gt;Enum&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;metaclass&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;EnumType&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
    &lt;span class="o"&gt;...&lt;/span&gt;
&lt;/pre&gt;&lt;/div&gt;


&lt;p&gt;As a &lt;a href="https://github.com/scalien/scaliendb"&gt;former C++ programmer&lt;/a&gt;, I find this weird, cool and intriguing. So I decided to practice my Python Fu and write my own &lt;strong&gt;toy implementation of Python's &lt;code&gt;Enum&lt;/code&gt; class&lt;/strong&gt;, specifically for &lt;code&gt;int&lt;/code&gt; values. Emphasis on "toy"; the full source code for &lt;a href="https://github.com/python/cpython/blob/main/Lib/enum.py"&gt;&lt;code&gt;enum.py&lt;/code&gt;&lt;/a&gt; linked above is 2018 lines of code (!), which also includes related classes such as &lt;code&gt;IntEnum&lt;/code&gt;, &lt;code&gt;StrEnum&lt;/code&gt;, &lt;code&gt;Flag&lt;/code&gt;, etc.&lt;/p&gt;
&lt;p&gt;&lt;a href="https://github.com/mtrencseni/playground/blob/master/Enums%20-%20Part%20I.ipynb"&gt;The ipython notebook is up on Github.&lt;/a&gt;&lt;/p&gt;
&lt;h2&gt;Enumerations in Python&lt;/h2&gt;
&lt;p&gt;In Python, to use standard library &lt;code&gt;Enum&lt;/code&gt;s we first have to import the class from the &lt;code&gt;enum&lt;/code&gt; module, like:&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="nn"&gt;enum&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;Enum&lt;/span&gt;

&lt;span class="k"&gt;class&lt;/span&gt; &lt;span class="nc"&gt;Color&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;Enum&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
    &lt;span class="n"&gt;RED&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;int&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt;
    &lt;span class="n"&gt;GREEN&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;int&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;2&lt;/span&gt;
    &lt;span class="n"&gt;BLUE&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;int&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;3&lt;/span&gt;
&lt;/pre&gt;&lt;/div&gt;


&lt;p&gt;Alternatively, we can skip giving values by hand, and use the &lt;code&gt;auto()&lt;/code&gt; magic function:&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="nn"&gt;enum&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;Enum&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;auto&lt;/span&gt;

&lt;span class="k"&gt;class&lt;/span&gt; &lt;span class="nc"&gt;Color&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;Enum&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
    &lt;span class="n"&gt;RED&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;int&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;auto&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
    &lt;span class="n"&gt;GREEN&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;int&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;auto&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
    &lt;span class="n"&gt;BLUE&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;int&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;auto&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
&lt;/pre&gt;&lt;/div&gt;


&lt;p&gt;The magic &lt;code&gt;auto()&lt;/code&gt; function starts numbering at &lt;code&gt;1&lt;/code&gt;, so the above two are equivalent.&lt;/p&gt;
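&lt;p&gt;We can verify this equivalence directly against the standard library (note this checks stdlib behavior, not the toy implementation we build below):&lt;/p&gt;

```python
from enum import Enum, auto

class Color(Enum):
    RED: int = auto()
    GREEN: int = auto()
    BLUE: int = auto()

# auto() numbers the members 1, 2, 3 in declaration order,
# matching the explicit-value definition above
assert [c.value for c in Color] == [1, 2, 3]
assert Color.RED == Color(1)
```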
&lt;p&gt;There are various ways to create enums:&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;span class="n"&gt;c1&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;Color&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="n"&gt;c2&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;Color&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;Color&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;RED&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="n"&gt;c3&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;Color&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;Color&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;RED&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;])&lt;/span&gt;
&lt;span class="k"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nb"&gt;type&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;Color&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt; &lt;span class="nb"&gt;type&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;c1&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt; &lt;span class="nb"&gt;type&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;Color&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;RED&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt; &lt;span class="nb"&gt;type&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;Color&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;RED&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;]))&lt;/span&gt;
&lt;/pre&gt;&lt;/div&gt;


&lt;p&gt;Output:&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&amp;lt;class &amp;#39;enum.EnumMeta&amp;#39;&amp;gt; &amp;lt;enum &amp;#39;Color&amp;#39;&amp;gt; &amp;lt;enum &amp;#39;Color&amp;#39;&amp;gt; &amp;lt;enum &amp;#39;Color&amp;#39;&amp;gt;
&lt;/pre&gt;&lt;/div&gt;


&lt;p&gt;Equality tests:&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;span class="n"&gt;Color&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;==&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt;             &lt;span class="c1"&gt;# false&lt;/span&gt;
&lt;span class="n"&gt;Color&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;RED&lt;/span&gt; &lt;span class="o"&gt;==&lt;/span&gt; &lt;span class="n"&gt;Color&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;     &lt;span class="c1"&gt;# true&lt;/span&gt;
&lt;span class="n"&gt;Color&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;RED&lt;/span&gt; &lt;span class="o"&gt;==&lt;/span&gt; &lt;span class="n"&gt;Color&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;RED&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="c1"&gt;# true&lt;/span&gt;
&lt;/pre&gt;&lt;/div&gt;


&lt;p&gt;An &lt;code&gt;Enum&lt;/code&gt; itself (not an instance) has a useful iteration interface (this sort of convenience does not exist in C++):&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;span class="k"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nb"&gt;len&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;Color&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt; &lt;span class="c1"&gt;# 3&lt;/span&gt;
&lt;span class="c1"&gt;# can iterate:&lt;/span&gt;
&lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;c&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;Color&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
    &lt;span class="k"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;c&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/pre&gt;&lt;/div&gt;


&lt;p&gt;Output:&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;3
Color.RED
Color.GREEN
Color.BLUE
&lt;/pre&gt;&lt;/div&gt;


&lt;p&gt;Examples of things that don't work with enums:&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;span class="n"&gt;Color&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;4&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;      &lt;span class="c1"&gt;# ValueError: 4 is not a valid Color&lt;/span&gt;
&lt;span class="n"&gt;Color&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;RED&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;  &lt;span class="c1"&gt;# ValueError: &amp;#39;RED&amp;#39; is not a valid Color&lt;/span&gt;
&lt;span class="n"&gt;c1&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;Color&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="c1"&gt;# okay&lt;/span&gt;
&lt;span class="n"&gt;c1&lt;/span&gt; &lt;span class="o"&gt;+=&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt;       &lt;span class="c1"&gt;# TypeError: unsupported operand type(s) for +=: &amp;#39;Color&amp;#39; and &amp;#39;int&amp;#39;&lt;/span&gt;
&lt;/pre&gt;&lt;/div&gt;
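&lt;p&gt;These failure modes are easy to check programmatically; a quick sketch against the standard library &lt;code&gt;Enum&lt;/code&gt;:&lt;/p&gt;

```python
from enum import Enum

class Color(Enum):
    RED = 1
    GREEN = 2
    BLUE = 3

# constructing from an unknown value raises ValueError
try:
    Color(4)
    raised = False
except ValueError:
    raised = True
assert raised

# plain Enum members do not support arithmetic
try:
    Color(1) + 1
    raised = False
except TypeError:
    raised = True
assert raised
```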


&lt;h2&gt;Metaclasses in Python&lt;/h2&gt;
&lt;p&gt;So, as an exercise in Python meta-programming, let's write a simple &lt;code&gt;Enum&lt;/code&gt; class that accomplishes the above. In the code above, notice that &lt;code&gt;type(Color)&lt;/code&gt; is &lt;code&gt;&amp;lt;class 'enum.EnumMeta'&amp;gt;&lt;/code&gt;; this is a big clue. Python has a feature called metaclasses, which is a way to construct classes the way we construct objects. In Python, classes are in fact just objects of type &lt;code&gt;type&lt;/code&gt;:&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;span class="k"&gt;class&lt;/span&gt; &lt;span class="nc"&gt;Color&lt;/span&gt;&lt;span class="p"&gt;():&lt;/span&gt;     &lt;span class="c1"&gt;# not deriving from Enum&lt;/span&gt;
    &lt;span class="n"&gt;RED&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;int&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt;
    &lt;span class="n"&gt;GREEN&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;int&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;2&lt;/span&gt;
    &lt;span class="n"&gt;BLUE&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;int&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;3&lt;/span&gt;

&lt;span class="k"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nb"&gt;type&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;Color&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;
&lt;/pre&gt;&lt;/div&gt;


&lt;p&gt;Output:&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&amp;lt;class &amp;#39;type&amp;#39;&amp;gt;
&lt;/pre&gt;&lt;/div&gt;


&lt;p&gt;But then, when we printed the type for &lt;code&gt;Color(Enum)&lt;/code&gt;, we got &lt;code&gt;&amp;lt;class 'enum.EnumMeta'&amp;gt;&lt;/code&gt;, so what's going on? Let's check:&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="nn"&gt;enum&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;EnumMeta&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;Enum&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;auto&lt;/span&gt;

&lt;span class="k"&gt;class&lt;/span&gt; &lt;span class="nc"&gt;Color&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;Enum&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
    &lt;span class="n"&gt;RED&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;int&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt;
    &lt;span class="n"&gt;GREEN&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;int&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;2&lt;/span&gt;
    &lt;span class="n"&gt;BLUE&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;int&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;3&lt;/span&gt;

&lt;span class="k"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nb"&gt;type&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;Color&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt; &lt;span class="nb"&gt;type&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;Enum&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt; &lt;span class="nb"&gt;type&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;EnumMeta&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;
&lt;/pre&gt;&lt;/div&gt;


&lt;p&gt;Output:&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&amp;lt;class &amp;#39;enum.EnumMeta&amp;#39;&amp;gt; &amp;lt;class &amp;#39;enum.EnumMeta&amp;#39;&amp;gt; &amp;lt;class &amp;#39;type&amp;#39;&amp;gt;
&lt;/pre&gt;&lt;/div&gt;


&lt;p&gt;So both &lt;code&gt;Enum&lt;/code&gt; and &lt;code&gt;Color&lt;/code&gt; are of type &lt;code&gt;EnumMeta&lt;/code&gt;, and &lt;code&gt;EnumMeta&lt;/code&gt; itself is of type &lt;code&gt;type&lt;/code&gt;. We will follow this pattern in our own implementation of &lt;code&gt;Enum&lt;/code&gt;.&lt;/p&gt;
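&lt;p&gt;This chain is easy to verify with a few &lt;code&gt;type()&lt;/code&gt; checks (&lt;code&gt;EnumMeta&lt;/code&gt; is importable from the &lt;code&gt;enum&lt;/code&gt; module; in newer Python versions it is an alias of &lt;code&gt;EnumType&lt;/code&gt;):&lt;/p&gt;

```python
from enum import Enum, EnumMeta

class Color(Enum):
    RED = 1

# Color and Enum are instances of the metaclass EnumMeta...
assert type(Color) is EnumMeta
assert type(Enum) is EnumMeta
# ...and EnumMeta itself is an instance of type
assert type(EnumMeta) is type
# since EnumMeta derives from type, Color is itself a class
assert isinstance(Color, type)
```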
&lt;p&gt;In Python, with metaclasses, we can create classes that are of our own metaclasses' type, and &lt;strong&gt;we can have "constructor" code run when the class is defined&lt;/strong&gt; (not when instances are created). Let's see an example:&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;span class="k"&gt;class&lt;/span&gt; &lt;span class="nc"&gt;Enum&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nb"&gt;type&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
    &lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="fm"&gt;__new__&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;metacls&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="bp"&gt;cls&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;bases&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;classdict&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="o"&gt;**&lt;/span&gt;&lt;span class="n"&gt;kwds&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
        &lt;span class="k"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;f&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;Defining new Enum type {cls}:&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
        &lt;span class="k"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;f&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;- metacls = {metacls}&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
        &lt;span class="k"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;f&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;- bases = {bases}&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
        &lt;span class="k"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;f&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;- classdict = {classdict}&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="k"&gt;class&lt;/span&gt; &lt;span class="nc"&gt;Color&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;metaclass&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;Enum&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt; &lt;span class="c1"&gt;# specifying our own Enum metaclass&lt;/span&gt;
    &lt;span class="n"&gt;RED&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;int&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt;
    &lt;span class="n"&gt;GREEN&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;int&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;2&lt;/span&gt;
    &lt;span class="n"&gt;BLUE&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;int&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;3&lt;/span&gt;
&lt;/pre&gt;&lt;/div&gt;


&lt;p&gt;Output:&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;Defining new Enum type Color:
- metacls = &amp;lt;class &amp;#39;__main__.Enum&amp;#39;&amp;gt;
- bases = ()
- classdict = {&amp;#39;__module__&amp;#39;: &amp;#39;__main__&amp;#39;, &amp;#39;__qualname__&amp;#39;: &amp;#39;Color&amp;#39;,
  &amp;#39;__annotations__&amp;#39;: {&amp;#39;RED&amp;#39;: &amp;lt;class &amp;#39;int&amp;#39;&amp;gt;, &amp;#39;GREEN&amp;#39;: &amp;lt;class &amp;#39;int&amp;#39;&amp;gt;,
  &amp;#39;BLUE&amp;#39;: &amp;lt;class &amp;#39;int&amp;#39;&amp;gt;}, &amp;#39;RED&amp;#39;: 1, &amp;#39;GREEN&amp;#39;: 2, &amp;#39;BLUE&amp;#39;: 3}
&lt;/pre&gt;&lt;/div&gt;


&lt;p&gt;&lt;strong&gt;Note that the code in &lt;code&gt;__new__()&lt;/code&gt; ran when we declared &lt;code&gt;Color&lt;/code&gt;! We never created an instance of &lt;code&gt;Color&lt;/code&gt; in the snippet above!&lt;/strong&gt; This is where we start our toy implementation, and this is also how the standard library &lt;code&gt;Enum&lt;/code&gt; works: by constructing a special class when a class deriving from &lt;code&gt;Enum&lt;/code&gt; is declared. Here we are actually not deriving but metaclassing; we will fix this later.&lt;/p&gt;
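&lt;p&gt;The underlying mechanism: a &lt;code&gt;class&lt;/code&gt; statement is essentially sugar for a call to &lt;code&gt;type(name, bases, classdict)&lt;/code&gt; (or to the metaclass, if one is given), and that call is exactly what a metaclass's &lt;code&gt;__new__()&lt;/code&gt; intercepts. A minimal sketch:&lt;/p&gt;

```python
# creating a class dynamically, without a class statement:
# this is the call a metaclass's __new__() hooks into
Color = type('Color', (), {'RED': 1, 'GREEN': 2, 'BLUE': 3})

# the result is an ordinary class with ordinary attributes
assert type(Color) is type
assert Color.RED == 1
assert Color.BLUE == 3
```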
&lt;h2&gt;Basic &lt;code&gt;Enum&lt;/code&gt; functionality&lt;/h2&gt;
&lt;p&gt;The example above was just an illustration, a &lt;code&gt;Color&lt;/code&gt; defined like this is not useful. To accomplish the standard library functionality, we will use a chain of classes similar to &lt;a href="https://github.com/python/cpython/blob/main/Lib/enum.py"&gt;&lt;code&gt;enum.py&lt;/code&gt;&lt;/a&gt;:&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;span class="k"&gt;class&lt;/span&gt; &lt;span class="nc"&gt;EnumMeta&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nb"&gt;type&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
    &lt;span class="o"&gt;...&lt;/span&gt;

&lt;span class="k"&gt;class&lt;/span&gt; &lt;span class="nc"&gt;Enum&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;metaclass&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;EnumMeta&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
    &lt;span class="o"&gt;...&lt;/span&gt;

&lt;span class="k"&gt;class&lt;/span&gt; &lt;span class="nc"&gt;Color&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;Enum&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
    &lt;span class="o"&gt;...&lt;/span&gt;
&lt;/pre&gt;&lt;/div&gt;


&lt;p&gt;This way, when we define an &lt;code&gt;Enum&lt;/code&gt; like &lt;code&gt;Color&lt;/code&gt;, it derives from &lt;code&gt;Enum&lt;/code&gt;, which is of metaclass &lt;code&gt;EnumMeta&lt;/code&gt;, so &lt;strong&gt;we can use both the inheritance and metaclassing features of Python&lt;/strong&gt;. As a first order of business, in &lt;code&gt;EnumMeta&lt;/code&gt;'s &lt;code&gt;__new__()&lt;/code&gt;, let's go through the enumerations defined by the user and save them into a dictionary. As seen in the output above, these are available in the passed-in &lt;code&gt;classdict&lt;/code&gt; object:&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;span class="kr"&gt;class&lt;/span&gt; &lt;span class="nx"&gt;EnumMeta&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;type&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;&lt;span class="o"&gt;:&lt;/span&gt;
    &lt;span class="nx"&gt;def&lt;/span&gt; &lt;span class="nx"&gt;__new__&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;metacls&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;cls&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;bases&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;classdict&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="o"&gt;**&lt;/span&gt;&lt;span class="nx"&gt;kwds&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;&lt;span class="o"&gt;:&lt;/span&gt;
        &lt;span class="nx"&gt;enumerations&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="nx"&gt;x&lt;/span&gt;: &lt;span class="kt"&gt;y&lt;/span&gt; &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="nx"&gt;x&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;y&lt;/span&gt; &lt;span class="k"&gt;in&lt;/span&gt; &lt;span class="nx"&gt;classdict&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;items&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt; &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="nx"&gt;not&lt;/span&gt; &lt;span class="nx"&gt;x&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;startswith&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;__&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;)}&lt;/span&gt;
        &lt;span class="kr"&gt;enum&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="kr"&gt;super&lt;/span&gt;&lt;span class="p"&gt;().&lt;/span&gt;&lt;span class="nx"&gt;__new__&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;metacls&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;cls&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;bases&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;classdict&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="o"&gt;**&lt;/span&gt;&lt;span class="nx"&gt;kwds&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
        &lt;span class="kr"&gt;enum&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;_enumerations&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nx"&gt;enumerations&lt;/span&gt;
        &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="kr"&gt;enum&lt;/span&gt;

&lt;span class="kr"&gt;class&lt;/span&gt; &lt;span class="nx"&gt;Enum&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;metaclass&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="nx"&gt;EnumMeta&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;&lt;span class="o"&gt;:&lt;/span&gt;
    &lt;span class="nx"&gt;pass&lt;/span&gt;

&lt;span class="kr"&gt;class&lt;/span&gt; &lt;span class="nx"&gt;Color&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;Enum&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;&lt;span class="o"&gt;:&lt;/span&gt; &lt;span class="err"&gt;#&lt;/span&gt; &lt;span class="nx"&gt;specifying&lt;/span&gt; &lt;span class="nx"&gt;our&lt;/span&gt; &lt;span class="nx"&gt;own&lt;/span&gt; &lt;span class="nx"&gt;Enum&lt;/span&gt; &lt;span class="nx"&gt;metaclass&lt;/span&gt;
    &lt;span class="nx"&gt;RED&lt;/span&gt;: &lt;span class="kt"&gt;int&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt;
    &lt;span class="nx"&gt;GREEN&lt;/span&gt;: &lt;span class="kt"&gt;int&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;2&lt;/span&gt;
    &lt;span class="nx"&gt;BLUE&lt;/span&gt;: &lt;span class="kt"&gt;int&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;3&lt;/span&gt;
&lt;/pre&gt;&lt;/div&gt;


&lt;p&gt;Now let's check how this works:&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;print(Color._enumerations)                     # {&amp;#39;RED&amp;#39;: 1, &amp;#39;GREEN&amp;#39;: 2, &amp;#39;BLUE&amp;#39;: 3}
print(type(Color), type(Enum), type(EnumMeta)) # ...
Color(1)                                       # TypeError: Color() takes no arguments
&lt;/pre&gt;&lt;/div&gt;


&lt;p&gt;The last call to the &lt;code&gt;Color()&lt;/code&gt; constructor will fail, because the default constructor in Python does not take arguments. Let's fix this in the &lt;code&gt;Enum&lt;/code&gt; base class:&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;span class="k"&gt;class&lt;/span&gt; &lt;span class="nc"&gt;Enum&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;metaclass&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;EnumMeta&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
    &lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="fm"&gt;__init__&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="bp"&gt;self&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;value&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
        &lt;span class="c1"&gt;# make sure the passed in value is a valid enumeration value&lt;/span&gt;
        &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;value&lt;/span&gt; &lt;span class="ow"&gt;not&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="bp"&gt;self&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="vm"&gt;__class__&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;_enumerations&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;values&lt;/span&gt;&lt;span class="p"&gt;():&lt;/span&gt;
            &lt;span class="k"&gt;raise&lt;/span&gt; &lt;span class="ne"&gt;ValueError&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;f&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;{value} is not a valid {self.__class__.__name__}&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
        &lt;span class="c1"&gt;# save the actual enumeration value&lt;/span&gt;
        &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;k&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;v&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="bp"&gt;self&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="vm"&gt;__class__&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;_enumerations&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;items&lt;/span&gt;&lt;span class="p"&gt;():&lt;/span&gt;
            &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;v&lt;/span&gt; &lt;span class="o"&gt;==&lt;/span&gt; &lt;span class="n"&gt;value&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
                &lt;span class="bp"&gt;self&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;__key&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;k&lt;/span&gt;
                &lt;span class="bp"&gt;self&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;__value&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;v&lt;/span&gt;
&lt;/pre&gt;&lt;/div&gt;


&lt;p&gt;Now we can try again:&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;span class="n"&gt;Color&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="c1"&gt;# okay&lt;/span&gt;
&lt;span class="n"&gt;Color&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;4&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="c1"&gt;# ValueError: 4 is not a valid Color&lt;/span&gt;
&lt;/pre&gt;&lt;/div&gt;


&lt;p&gt;We have provided our own automatic constructor, which accepts exactly the values that the user-defined &lt;code&gt;Enum&lt;/code&gt; declares. We can also pick some low-hanging fruit and get string- and iteration-related functionality working:&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;span class="k"&gt;class&lt;/span&gt; &lt;span class="nc"&gt;EnumMeta&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nb"&gt;type&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
    &lt;span class="o"&gt;...&lt;/span&gt;

    &lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="fm"&gt;__len__&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="bp"&gt;cls&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
        &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="nb"&gt;len&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="bp"&gt;cls&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;_enumerations&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

    &lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="fm"&gt;__iter__&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="bp"&gt;cls&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
        &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="bp"&gt;cls&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;value&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;value&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="bp"&gt;cls&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;_enumerations&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;values&lt;/span&gt;&lt;span class="p"&gt;())&lt;/span&gt;

&lt;span class="k"&gt;class&lt;/span&gt; &lt;span class="nc"&gt;Enum&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;metaclass&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;EnumMeta&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
    &lt;span class="o"&gt;...&lt;/span&gt;

    &lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="fm"&gt;__str__&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="bp"&gt;self&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
        &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="s2"&gt;&amp;quot;&lt;/span&gt;&lt;span class="si"&gt;%s&lt;/span&gt;&lt;span class="s2"&gt;.&lt;/span&gt;&lt;span class="si"&gt;%s&lt;/span&gt;&lt;span class="s2"&gt;&amp;quot;&lt;/span&gt; &lt;span class="o"&gt;%&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="bp"&gt;self&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="vm"&gt;__class__&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="vm"&gt;__name__&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="bp"&gt;self&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;__key&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="k"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;Color&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;
&lt;span class="k"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nb"&gt;len&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;Color&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;
&lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;c&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;Color&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
    &lt;span class="k"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;c&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/pre&gt;&lt;/div&gt;


&lt;p&gt;Output:&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;Color.RED
3
Color.RED
Color.GREEN
Color.BLUE
&lt;/pre&gt;&lt;/div&gt;


&lt;h2&gt;Adding &lt;code&gt;auto()&lt;/code&gt;&lt;/h2&gt;
&lt;p&gt;An easy feature to add is the magic &lt;code&gt;auto()&lt;/code&gt; function, which lets us avoid writing out the &lt;code&gt;int&lt;/code&gt; values; they are assigned automatically, incrementing in definition order. &lt;code&gt;auto()&lt;/code&gt; is just a function that runs when the class is defined, so it needs to return something; the results can then be "cleaned up" in the metaclass's &lt;code&gt;__new__()&lt;/code&gt; function. We could use &lt;code&gt;None&lt;/code&gt; or &lt;code&gt;-999&lt;/code&gt; as &lt;code&gt;auto()&lt;/code&gt;'s return value, but that would conflict with the user using the same value in their own &lt;code&gt;Enum&lt;/code&gt;s, so let's create a class just for this:&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;span class="k"&gt;class&lt;/span&gt; &lt;span class="nc"&gt;_Auto&lt;/span&gt;&lt;span class="p"&gt;():&lt;/span&gt;
    &lt;span class="k"&gt;pass&lt;/span&gt;

&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;auto&lt;/span&gt;&lt;span class="p"&gt;():&lt;/span&gt;
    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;_Auto&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
&lt;/pre&gt;&lt;/div&gt;


&lt;p&gt;Then:&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;span class="k"&gt;class&lt;/span&gt; &lt;span class="nc"&gt;EnumMeta&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nb"&gt;type&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
    &lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="fm"&gt;__new__&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;metacls&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="bp"&gt;cls&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;bases&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;classdict&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="o"&gt;**&lt;/span&gt;&lt;span class="n"&gt;kwds&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
        &lt;span class="n"&gt;enumerations&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="n"&gt;x&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;y&lt;/span&gt; &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;x&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;y&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;classdict&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;items&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt; &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="ow"&gt;not&lt;/span&gt; &lt;span class="n"&gt;x&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;startswith&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;__&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;)}&lt;/span&gt;
        &lt;span class="c1"&gt;# handle auto()&lt;/span&gt;
        &lt;span class="n"&gt;next_value&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt;
        &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;k&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;v&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;enumerations&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;items&lt;/span&gt;&lt;span class="p"&gt;():&lt;/span&gt;
            &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="nb"&gt;type&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;v&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;!=&lt;/span&gt; &lt;span class="n"&gt;_Auto&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="c1"&gt;# if auto() was used, v will be an _Auto&lt;/span&gt;
                &lt;span class="n"&gt;next_value&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;v&lt;/span&gt; &lt;span class="o"&gt;+&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt;
            &lt;span class="k"&gt;else&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
                &lt;span class="n"&gt;enumerations&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;k&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;next_value&lt;/span&gt;
                &lt;span class="n"&gt;next_value&lt;/span&gt; &lt;span class="o"&gt;+=&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt;
        &lt;span class="n"&gt;enum&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nb"&gt;super&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="fm"&gt;__new__&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;metacls&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="bp"&gt;cls&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;bases&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;classdict&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="o"&gt;**&lt;/span&gt;&lt;span class="n"&gt;kwds&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
        &lt;span class="n"&gt;enum&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;_enumerations&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;enumerations&lt;/span&gt;
        &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;enum&lt;/span&gt;
    &lt;span class="o"&gt;...&lt;/span&gt;
&lt;/pre&gt;&lt;/div&gt;


&lt;p&gt;Now we can do:&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;span class="k"&gt;class&lt;/span&gt; &lt;span class="nc"&gt;Color&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;Enum&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
    &lt;span class="n"&gt;RED&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;int&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;auto&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
    &lt;span class="n"&gt;GREEN&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;int&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;auto&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
    &lt;span class="n"&gt;BLUE&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;int&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;auto&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
&lt;/pre&gt;&lt;/div&gt;


&lt;p&gt;The values will be replaced with &lt;code&gt;1, 2, 3&lt;/code&gt;.&lt;/p&gt;
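&lt;p&gt;To confirm, here is a condensed, self-contained copy of the classes so far (without the validity-checking &lt;code&gt;__init__&lt;/code&gt;), plus a hypothetical &lt;code&gt;Status&lt;/code&gt; class — not from the text above — that mixes an explicit value with &lt;code&gt;auto()&lt;/code&gt;:&lt;/p&gt;

```python
# Condensed copy of the toy Enum machinery above, to check auto() replacement.
class _Auto:
    pass

def auto():
    return _Auto()

class EnumMeta(type):
    def __new__(metacls, cls, bases, classdict, **kwds):
        enumerations = {x: y for x, y in classdict.items() if not x.startswith('__')}
        next_value = 1
        for k, v in enumerations.items():
            if type(v) != _Auto:
                next_value = v + 1            # explicit value: continue counting from it
            else:
                enumerations[k] = next_value  # auto(): substitute the running counter
                next_value += 1
        enum = super().__new__(metacls, cls, bases, classdict, **kwds)
        enum._enumerations = enumerations
        return enum

class Enum(metaclass=EnumMeta):
    pass

class Color(Enum):
    RED: int = auto()
    GREEN: int = auto()
    BLUE: int = auto()

class Status(Enum):          # hypothetical example, not from the post
    OK: int = 10
    WARN: int = auto()       # continues from the explicit value: 11
    FAIL: int = auto()       # 12

print(Color._enumerations)   # {'RED': 1, 'GREEN': 2, 'BLUE': 3}
print(Status._enumerations)  # {'OK': 10, 'WARN': 11, 'FAIL': 12}
```

Note that a bare annotation without an assignment (e.g. <code>RED: int</code> alone) would only land in <code>__annotations__</code>, which the comprehension filters out, so each member needs an actual assigned value.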
&lt;p&gt;The final version so far:&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;span class="k"&gt;class&lt;/span&gt; &lt;span class="nc"&gt;_Auto&lt;/span&gt;&lt;span class="p"&gt;():&lt;/span&gt;
    &lt;span class="k"&gt;pass&lt;/span&gt;

&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;auto&lt;/span&gt;&lt;span class="p"&gt;():&lt;/span&gt;
    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;_Auto&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;

&lt;span class="k"&gt;class&lt;/span&gt; &lt;span class="nc"&gt;EnumMeta&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nb"&gt;type&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
    &lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="fm"&gt;__new__&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;metacls&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="bp"&gt;cls&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;bases&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;classdict&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="o"&gt;**&lt;/span&gt;&lt;span class="n"&gt;kwds&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
        &lt;span class="n"&gt;enumerations&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="n"&gt;x&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;y&lt;/span&gt; &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;x&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;y&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;classdict&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;items&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt; &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="ow"&gt;not&lt;/span&gt; &lt;span class="n"&gt;x&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;startswith&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;__&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;)}&lt;/span&gt;
        &lt;span class="c1"&gt;# handle auto()&lt;/span&gt;
        &lt;span class="n"&gt;next_value&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt;
        &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;k&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;v&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;enumerations&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;items&lt;/span&gt;&lt;span class="p"&gt;():&lt;/span&gt;
            &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="nb"&gt;type&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;v&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;!=&lt;/span&gt; &lt;span class="n"&gt;_Auto&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
                &lt;span class="n"&gt;next_value&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;v&lt;/span&gt; &lt;span class="o"&gt;+&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt;
            &lt;span class="k"&gt;else&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
                &lt;span class="n"&gt;enumerations&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;k&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;next_value&lt;/span&gt;
                &lt;span class="n"&gt;next_value&lt;/span&gt; &lt;span class="o"&gt;+=&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt;
        &lt;span class="n"&gt;enum&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nb"&gt;super&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="fm"&gt;__new__&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;metacls&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="bp"&gt;cls&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;bases&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;classdict&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="o"&gt;**&lt;/span&gt;&lt;span class="n"&gt;kwds&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
        &lt;span class="n"&gt;enum&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;_enumerations&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;enumerations&lt;/span&gt;
        &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;enum&lt;/span&gt;

    &lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="fm"&gt;__len__&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="bp"&gt;cls&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
        &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="nb"&gt;len&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="bp"&gt;cls&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;_enumerations&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

    &lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="fm"&gt;__iter__&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="bp"&gt;cls&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
        &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="bp"&gt;cls&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;value&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;value&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="bp"&gt;cls&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;_enumerations&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;values&lt;/span&gt;&lt;span class="p"&gt;())&lt;/span&gt;

&lt;span class="k"&gt;class&lt;/span&gt; &lt;span class="nc"&gt;Enum&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;metaclass&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;EnumMeta&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
    &lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="fm"&gt;__init__&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="bp"&gt;self&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;value&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
        &lt;span class="c1"&gt;# make sure the passed in value is a valid enumeration value&lt;/span&gt;
        &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;value&lt;/span&gt; &lt;span class="ow"&gt;not&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="bp"&gt;self&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="vm"&gt;__class__&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;_enumerations&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;values&lt;/span&gt;&lt;span class="p"&gt;():&lt;/span&gt;
            &lt;span class="k"&gt;raise&lt;/span&gt; &lt;span class="ne"&gt;ValueError&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;f&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;{value} is not a valid {self.__class__.__name__}&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
        &lt;span class="c1"&gt;# save the actual enumeration value&lt;/span&gt;
        &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;k&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;v&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="bp"&gt;self&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="vm"&gt;__class__&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;_enumerations&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;items&lt;/span&gt;&lt;span class="p"&gt;():&lt;/span&gt;
            &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;v&lt;/span&gt; &lt;span class="o"&gt;==&lt;/span&gt; &lt;span class="n"&gt;value&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
                &lt;span class="bp"&gt;self&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;__key&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;k&lt;/span&gt;
                &lt;span class="bp"&gt;self&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;__value&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;v&lt;/span&gt;

    &lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="fm"&gt;__str__&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="bp"&gt;self&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
        &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="s2"&gt;&amp;quot;&lt;/span&gt;&lt;span class="si"&gt;%s&lt;/span&gt;&lt;span class="s2"&gt;.&lt;/span&gt;&lt;span class="si"&gt;%s&lt;/span&gt;&lt;span class="s2"&gt;&amp;quot;&lt;/span&gt; &lt;span class="o"&gt;%&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="bp"&gt;self&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="vm"&gt;__class__&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="vm"&gt;__name__&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="bp"&gt;self&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;__key&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/pre&gt;&lt;/div&gt;


&lt;h2&gt;Conclusion&lt;/h2&gt;
&lt;p&gt;This is good progress, but some things are still missing: &lt;code&gt;Color['RED']&lt;/code&gt; doesn't work, &lt;code&gt;Color.RED&lt;/code&gt; returns an &lt;code&gt;int&lt;/code&gt;, equality doesn't work, etc. In the next part I will add more features to this toy class to cover the most commonly used functionality of the standard library &lt;code&gt;Enum&lt;/code&gt;.&lt;/p&gt;</content><category term="python"></category><category term="enum"></category></entry><entry><title>"Over 70% of all Porsche vehicles ever built are still on the road today"</title><link href="https://bytepawn.com/porsche-70.html" rel="alternate"></link><published>2022-04-30T00:00:00+02:00</published><updated>2022-04-30T00:00:00+02:00</updated><author><name>Marton Trencseni</name></author><id>tag:bytepawn.com,2022-04-30:/porsche-70.html</id><summary type="html">&lt;p&gt;Porsche proudly advertises &lt;em&gt;"Over 70% of all Porsche vehicles ever built are still on the road today"&lt;/em&gt;. Is this a testament of the quality and longevity of Porsche cars, or simply a result of the brand switching from niche sportscar manufacturing to mass production around the year 1999?&lt;br/&gt;&lt;br/&gt; &lt;img src="/images/porsche-70.jpg" alt="Over 70% of all Porsche vehicles ever built are still on the road today" style="width: 400px;"/&gt;&lt;/p&gt;</summary><content type="html">&lt;h2&gt;Introduction&lt;/h2&gt;
&lt;p&gt;In November 2021 I went to a Porsche event in Dubai called &lt;a href="https://newsroom.porsche.com/en/2021/scene-passion/porsche-icons-festival-dubai-26565.html"&gt;Icons of Porsche&lt;/a&gt;. It was a very cool (free) event, where hundreds of iconic Porsche cars were on display, some from local owners and many transported from &lt;a href="https://www.porsche.com/international/aboutporsche/porschemuseum/"&gt;the official Porsche museum in Stuttgart&lt;/a&gt; just for the occasion! Near the entrance I snapped this picture:&lt;/p&gt;
&lt;p&gt;&lt;img src="/images/porsche-70.jpg" alt="Over 70% of all Porsche vehicles ever built are still on the road today" style="width: 600px;"/&gt;&lt;/p&gt;
&lt;p&gt;This has been tingling my curiosity ever since. &lt;strong&gt;Is this a testament of the quality and longevity of Porsche cars, or simply a result of the brand switching from niche sportscar manufacturing to mass production around the year 1999?&lt;/strong&gt;&lt;/p&gt;
&lt;p&gt;For those who don't know much about Porsches, the brand exploded when the company started to manufacture less expensive and more practical cars around 2000:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;&lt;a href="https://en.wikipedia.org/wiki/Porsche_Boxster/Cayman"&gt;Cayman/Boxster (2 seater)&lt;/a&gt;, introduced in 1996&lt;/li&gt;
&lt;li&gt;&lt;a href="https://en.wikipedia.org/wiki/Porsche_Cayenne"&gt;Cayenne (big SUV)&lt;/a&gt;, introduced in 2003&lt;/li&gt;
&lt;li&gt;&lt;a href="https://en.wikipedia.org/wiki/Porsche_Panamera"&gt;Panamera (wagon)&lt;/a&gt;, introduced in 2009&lt;/li&gt;
&lt;li&gt;&lt;a href="https://en.wikipedia.org/wiki/Porsche_Macan"&gt;Macan (small SUV)&lt;/a&gt;, introduced in 2014&lt;/li&gt;
&lt;li&gt;&lt;a href="https://en.wikipedia.org/wiki/Porsche_Taycan"&gt;Taycan (electric)&lt;/a&gt;, introduced in 2019&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;The iconic &lt;a href="https://en.wikipedia.org/wiki/Porsche_911"&gt;Porsche 911&lt;/a&gt; also had a significant switch in 1998. Models before this were &lt;a href="https://en.wikipedia.org/wiki/Air-cooled_engine"&gt;air-cooled&lt;/a&gt;, models after were &lt;a href="https://en.wikipedia.org/wiki/Internal_combustion_engine_cooling#Liquid_cooling"&gt;water-cooled&lt;/a&gt;. The air-cooled 911s are much more of a collector's item than the later generations. Also, the first water-cooled 911, the &lt;a href="https://en.wikipedia.org/wiki/Porsche_996"&gt;996&lt;/a&gt;, had the famous fried-egg shaped headlights, which many enthusiasts (including me) find ugly. The 996 and later models are not considered collector's items, while the previous generations, including the last air-cooled &lt;a href="https://en.wikipedia.org/wiki/Porsche_993"&gt;993&lt;/a&gt;, are. In the original &lt;a href="https://www.imdb.com/title/tt0112442/"&gt;Bad Boys movie&lt;/a&gt; (with Will Smith and Martin Lawrence), Will Smith's character famously drove a black 993-generation 911 Turbo. That exact car happened to be on display at the Mall of the Emirates a few years ago, where I snapped this picture:&lt;/p&gt;
&lt;p&gt;&lt;img src="/images/porsche-bad-boys.jpg" alt="Porsche Bad Boys" style="width: 600px;"/&gt;&lt;/p&gt;
&lt;p&gt;So the question is: &lt;em&gt;is the seemingly high 70% statistic because a lot of Cayenne/Macans were built in the last 15 years, and these cars simply haven't had time yet to die? Or do Porsche cars in fact have a relatively long lifespan?&lt;/em&gt;&lt;/p&gt;
&lt;p&gt;&lt;a href="https://github.com/mtrencseni/playground/blob/master/Porsche.ipynb"&gt;The ipython notebook is up on Github.&lt;/a&gt;&lt;/p&gt;
&lt;h2&gt;Porsche production timeseries&lt;/h2&gt;
&lt;p&gt;&lt;a href="https://www.stuttcars.com/porsche-sales-production-numbers/"&gt;This page&lt;/a&gt; has annual production numbers from 1998 to 2021. Scrolling down there is also a break-down by model, this shows that the sportscar segment (911, Cayman) hasn't grown much, the growth is coming from the "practical" segment (Cayenne, Macan, Panamera).&lt;/p&gt;
&lt;p&gt;&lt;img src="/images/porsche-annual-by-model.png" alt="." style="width: 600px;"/&gt;&lt;/p&gt;
&lt;p&gt;&lt;a href="https://press.porsche.com/prod/presse_pag/PressResources.nsf/jumppage/unternehmen-pcna-history?OpenDocument"&gt;This official Porsche page&lt;/a&gt; has some scattered annual and cumulative production numbers for the preceeding 50 years. These are:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;&lt;strong&gt;1947&lt;/strong&gt;: 0 produced&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;1948&lt;/strong&gt;: 52 produced that year&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;1956&lt;/strong&gt;: 10,000 total produced up to this year&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;1963&lt;/strong&gt;: 11,000 produced that year&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;1969&lt;/strong&gt;: 14,000 produced that year&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;1977&lt;/strong&gt;: 300,000 total produced up to this year&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;1996&lt;/strong&gt;: 1,000,000 total produced up to this year&lt;/li&gt;
&lt;/ul&gt;
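&lt;p&gt;For illustration, these scattered data points can be encoded as two lists: known annual values and cumulative constraints. The variable names here are hypothetical, chosen just for this sketch:&lt;/p&gt;

```python
# Hypothetical encoding of the data points listed above:
# f_points holds (year, cars produced that year),
# c_points holds (year, total cars produced up to that year).
f_points = [(1947, 0), (1948, 52), (1963, 11_000), (1969, 14_000)]
c_points = [(1956, 10_000), (1977, 300_000), (1996, 1_000_000)]

# sanity checks: both series sorted by year, cumulative totals increasing
assert all(a[0] < b[0] for a, b in zip(f_points, f_points[1:]))
assert all(a[1] < b[1] for a, b in zip(c_points, c_points[1:]))
```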
&lt;p&gt;I wrote a quick function to compute a piecewise linear fit given these constraints (it's not a simple fit, since the "integral" is also involved):&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;span class="n"&gt;Number&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;Union&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="nb"&gt;float&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nb"&gt;int&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;
&lt;span class="n"&gt;TimeSeries&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;Sequence&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="nb"&gt;tuple&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="nb"&gt;int&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;Number&lt;/span&gt;&lt;span class="p"&gt;]]&lt;/span&gt; &lt;span class="c1"&gt;# type alias&lt;/span&gt;

&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;piecewise_linear_fit&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;f&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;TimeSeries&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;c&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;TimeSeries&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;-&amp;gt;&lt;/span&gt; &lt;span class="n"&gt;TimeSeries&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
    &lt;span class="c1"&gt;# compute a piecewise linear fit to f, also taking&lt;/span&gt;
    &lt;span class="c1"&gt;# into account the cumulative sum constraints in c&lt;/span&gt;
    &lt;span class="n"&gt;r&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;f&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;]]&lt;/span&gt;
    &lt;span class="n"&gt;fi&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;ci&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;
    &lt;span class="k"&gt;while&lt;/span&gt; &lt;span class="bp"&gt;True&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
        &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;ci&lt;/span&gt; &lt;span class="o"&gt;&amp;gt;=&lt;/span&gt; &lt;span class="nb"&gt;len&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;c&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="ow"&gt;and&lt;/span&gt; &lt;span class="n"&gt;fi&lt;/span&gt; &lt;span class="o"&gt;&amp;gt;=&lt;/span&gt; &lt;span class="nb"&gt;len&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;f&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
            &lt;span class="k"&gt;break&lt;/span&gt;
        &lt;span class="k"&gt;elif&lt;/span&gt; &lt;span class="n"&gt;ci&lt;/span&gt; &lt;span class="o"&gt;&amp;gt;=&lt;/span&gt; &lt;span class="nb"&gt;len&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;c&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="ow"&gt;or&lt;/span&gt; &lt;span class="n"&gt;f&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;fi&lt;/span&gt;&lt;span class="p"&gt;][&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="o"&gt;&amp;lt;=&lt;/span&gt; &lt;span class="n"&gt;c&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;ci&lt;/span&gt;&lt;span class="p"&gt;][&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;]:&lt;/span&gt;
            &lt;span class="c1"&gt;# the next closest data point is in f, use that to get the slope&lt;/span&gt;
            &lt;span class="n"&gt;slope&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;f&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;fi&lt;/span&gt;&lt;span class="p"&gt;][&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="o"&gt;-&lt;/span&gt; &lt;span class="n"&gt;r&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="o"&gt;-&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;][&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;])&lt;/span&gt; &lt;span class="o"&gt;/&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;f&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;fi&lt;/span&gt;&lt;span class="p"&gt;][&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="o"&gt;-&lt;/span&gt; &lt;span class="n"&gt;r&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="o"&gt;-&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;][&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;])&lt;/span&gt;
            &lt;span class="n"&gt;to&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;f&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;fi&lt;/span&gt;&lt;span class="p"&gt;][&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;
            &lt;span class="n"&gt;fi&lt;/span&gt; &lt;span class="o"&gt;+=&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt;
        &lt;span class="k"&gt;elif&lt;/span&gt; &lt;span class="n"&gt;fi&lt;/span&gt; &lt;span class="o"&gt;&amp;gt;=&lt;/span&gt; &lt;span class="nb"&gt;len&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;f&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="ow"&gt;or&lt;/span&gt; &lt;span class="n"&gt;f&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;fi&lt;/span&gt;&lt;span class="p"&gt;][&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="o"&gt;&amp;gt;&lt;/span&gt; &lt;span class="n"&gt;c&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;ci&lt;/span&gt;&lt;span class="p"&gt;][&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;]:&lt;/span&gt;
            &lt;span class="c1"&gt;# the next closest data point is in c, use that to get the slope&lt;/span&gt;
            &lt;span class="n"&gt;ts&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nb"&gt;range&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;c&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;ci&lt;/span&gt;&lt;span class="p"&gt;][&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="o"&gt;-&lt;/span&gt; &lt;span class="n"&gt;r&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="o"&gt;-&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;][&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="o"&gt;+&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
            &lt;span class="n"&gt;slope&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;c&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;ci&lt;/span&gt;&lt;span class="p"&gt;][&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="o"&gt;-&lt;/span&gt; &lt;span class="nb"&gt;sum&lt;/span&gt;&lt;span class="p"&gt;([&lt;/span&gt;&lt;span class="n"&gt;t&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;t&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;r&lt;/span&gt;&lt;span class="p"&gt;])&lt;/span&gt; &lt;span class="o"&gt;-&lt;/span&gt; &lt;span class="n"&gt;r&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="o"&gt;-&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;][&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;&lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="nb"&gt;len&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;ts&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt; &lt;span class="o"&gt;/&lt;/span&gt; &lt;span class="nb"&gt;sum&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;ts&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
            &lt;span class="n"&gt;to&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;c&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;ci&lt;/span&gt;&lt;span class="p"&gt;][&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;
            &lt;span class="n"&gt;ci&lt;/span&gt; &lt;span class="o"&gt;+=&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt;
        &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;x_0&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;y_0&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;r&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="o"&gt;-&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;
        &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;x&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="nb"&gt;range&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;r&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="o"&gt;-&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;][&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;&lt;span class="o"&gt;+&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;to&lt;/span&gt;&lt;span class="o"&gt;+&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
            &lt;span class="n"&gt;r&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;append&lt;/span&gt;&lt;span class="p"&gt;((&lt;/span&gt;&lt;span class="n"&gt;x&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;y_0&lt;/span&gt; &lt;span class="o"&gt;+&lt;/span&gt; &lt;span class="n"&gt;slope&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;x&lt;/span&gt; &lt;span class="o"&gt;-&lt;/span&gt; &lt;span class="n"&gt;x_0&lt;/span&gt;&lt;span class="p"&gt;)))&lt;/span&gt;
    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;r&lt;/span&gt;
&lt;/pre&gt;&lt;/div&gt;
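&lt;p&gt;The trickiest branch is the cumulative-constraint one: the slope is chosen so that the sum of the interpolated annual values hits the cumulative target. Here is a toy sanity check of that slope formula, with made-up numbers:&lt;/p&gt;

```python
# Pick a slope, build the linear ramp it implies, then confirm that the
# slope formula used in the cumulative branch recovers it from the
# cumulative target alone.
y0, true_slope, n_years = 1_000, 250, 10   # anchor value, slope, gap length
ramp = [y0 + true_slope * t for t in range(1, n_years + 1)]
cum_so_far = y0                            # production accounted for so far
cum_target = cum_so_far + sum(ramp)        # the cumulative constraint value
ts = range(1, n_years + 1)
slope = (cum_target - cum_so_far - y0 * len(ts)) / sum(ts)
assert slope == true_slope
```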


&lt;p&gt;Note that since we're doing a simple survival analysis, the shape of the curve in the 20th century won't matter much, and the last cumulative constraint is at 1996 anyway.&lt;/p&gt;
&lt;p&gt;Running this, we get the following fit for the annual production curve from 1948 to 2021:&lt;/p&gt;
&lt;p&gt;&lt;img src="/images/porsche-annual-fit.png" alt="." style="width: 600px;"/&gt;&lt;/p&gt;
&lt;p&gt;Same, but showing cumulatives:&lt;/p&gt;
&lt;p&gt;&lt;img src="/images/porsche-cumulative-fit.png" alt="." style="width: 600px;"/&gt;&lt;/p&gt;
&lt;h2&gt;Naive survival&lt;/h2&gt;
&lt;p&gt;First, let's do a naive survival model. Let's assume that cars are manufactured, are alive for &lt;code&gt;lifespan&lt;/code&gt; years, and then they die:&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;ratio_alive&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;f&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;list&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;lifespan&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;int&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;-&amp;gt;&lt;/span&gt; &lt;span class="nb"&gt;float&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
    &lt;span class="n"&gt;total&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nb"&gt;sum&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;f&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="n"&gt;alive&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nb"&gt;sum&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;f&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="o"&gt;-&lt;/span&gt;&lt;span class="n"&gt;lifespan&lt;/span&gt;&lt;span class="p"&gt;:])&lt;/span&gt;
    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;alive&lt;/span&gt;&lt;span class="o"&gt;/&lt;/span&gt;&lt;span class="n"&gt;total&lt;/span&gt;

&lt;span class="n"&gt;ds&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[]&lt;/span&gt;
&lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;lifespan&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="nb"&gt;range&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;10&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;25&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
    &lt;span class="n"&gt;r&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;ratio_alive&lt;/span&gt;&lt;span class="p"&gt;([&lt;/span&gt;&lt;span class="n"&gt;x&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;x&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;f&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt; &lt;span class="n"&gt;lifespan&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="n"&gt;ds&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;append&lt;/span&gt;&lt;span class="p"&gt;((&lt;/span&gt;&lt;span class="n"&gt;lifespan&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;r&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;
    &lt;span class="k"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;f&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;{lifespan} -&amp;gt; {r:.3f}&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/pre&gt;&lt;/div&gt;


&lt;p&gt;Output:&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;10 -&amp;gt; 0.519
11 -&amp;gt; 0.545
12 -&amp;gt; 0.567
13 -&amp;gt; 0.585
14 -&amp;gt; 0.603
15 -&amp;gt; 0.625
16 -&amp;gt; 0.647
17 -&amp;gt; 0.668
18 -&amp;gt; 0.688
19 -&amp;gt; 0.705
20 -&amp;gt; 0.720
21 -&amp;gt; 0.733
22 -&amp;gt; 0.745
23 -&amp;gt; 0.756
24 -&amp;gt; 0.766
&lt;/pre&gt;&lt;/div&gt;


&lt;p&gt;&lt;img src="/images/porsche-naive-lifespan.png" alt="." style="width: 600px;"/&gt;&lt;/p&gt;
&lt;p&gt;We can read off the output that in this simple model, we have to &lt;strong&gt;assume a lifespan of 19 years to get a 70% survival rate for today&lt;/strong&gt;.&lt;/p&gt;
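&lt;p&gt;The monotone relationship in the table above is easy to sanity-check: with a positive production series, extending the assumed lifespan can only add cars to the "alive" set. A quick check on a toy series (the growth rate is made up):&lt;/p&gt;

```python
def ratio_alive(f, lifespan):
    # fraction of all cars ever produced that are at most `lifespan` years old
    return sum(f[-lifespan:]) / sum(f)

toy = [100 * 1.06 ** i for i in range(74)]  # hypothetical 1948..2021 series
ratios = [ratio_alive(toy, lifespan) for lifespan in range(10, 25)]
# a longer assumed lifespan always yields a higher survival ratio
assert all(a < b for a, b in zip(ratios, ratios[1:]))
```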
&lt;h2&gt;Monte Carlo survival&lt;/h2&gt;
&lt;p&gt;The above gives a good initial indicator, but it's too naive even for a naive model. Even if we read it as suggesting an average lifespan of 19 years, the estimate is biased. Old cars (e.g. produced in 1975) will mostly have died by 2021, even if their individual lifespans deviate from the average. But recent cars can probabilistically die early, which drives down the survival rate. I.e. if a car produced in 1975 dies at 5 years, it doesn't affect our survival rate in 2021, but if a car produced in 2015 dies "too early" in 2020, it does. So let's model the lifespan of the cars as a normal distribution centered around an average lifespan:&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;span class="n"&gt;avg_lifespan&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;sigma&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;20&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;5&lt;/span&gt; &lt;span class="c1"&gt;# years&lt;/span&gt;
&lt;span class="n"&gt;total&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;num_alive&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;
&lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;year&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;num_produced&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;f&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
    &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;_&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="nb"&gt;range&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nb"&gt;round&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;num_produced&lt;/span&gt;&lt;span class="p"&gt;)):&lt;/span&gt;
        &lt;span class="n"&gt;expiry_year&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nb"&gt;round&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;year&lt;/span&gt; &lt;span class="o"&gt;+&lt;/span&gt; &lt;span class="nb"&gt;max&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;normal&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;avg_lifespan&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;sigma&lt;/span&gt;&lt;span class="p"&gt;)))&lt;/span&gt;
        &lt;span class="n"&gt;total&lt;/span&gt; &lt;span class="o"&gt;+=&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt;
        &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;expiry_year&lt;/span&gt; &lt;span class="o"&gt;&amp;gt;=&lt;/span&gt; &lt;span class="mi"&gt;2022&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
            &lt;span class="n"&gt;num_alive&lt;/span&gt; &lt;span class="o"&gt;+=&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt;
&lt;span class="n"&gt;survival_ratio&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;num_alive&lt;/span&gt;&lt;span class="o"&gt;/&lt;/span&gt;&lt;span class="n"&gt;total&lt;/span&gt;
&lt;span class="k"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;f&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;Assuming avg_lifespan = {avg_lifespan} years and sigma = {sigma}, survival % = {survival_ratio*100:.1f}&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/pre&gt;&lt;/div&gt;


&lt;p&gt;Output:&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;Assuming avg_lifespan = 20 years and sigma = 5 years, survival % = 70.4
&lt;/pre&gt;&lt;/div&gt;


&lt;p&gt;So introducing a normal distribution doesn't change the result much. It's worth noting that an increasing sigma with a fixed average lifespan yields a lower survival %: if we assume a 10-year sigma, to get 70% we'd have to assume an average lifespan of 22 years.&lt;/p&gt;
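&lt;p&gt;The sigma effect can also be checked analytically, without Monte Carlo: ignoring the truncation of lifespans at zero, the expected survival fraction is a production-weighted average of normal tail probabilities. A sketch on a toy exponentially growing production series (the series and growth rate are made up):&lt;/p&gt;

```python
from statistics import NormalDist

def survival_pct(production, avg_lifespan, sigma, now=2021):
    # expected % of cars still alive: a car built in `year` survives
    # with probability P(lifespan > now - year) under a normal lifespan
    d = NormalDist(avg_lifespan, sigma)
    total = sum(n for _, n in production)
    alive = sum(n * (1 - d.cdf(now - year)) for year, n in production)
    return 100 * alive / total

toy = [(1948 + i, 100 * 1.06 ** i) for i in range(74)]  # hypothetical series
# with a fixed 20-year average lifespan, a larger sigma lowers the survival %
assert survival_pct(toy, 20, 10) < survival_pct(toy, 20, 5)
```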
&lt;p&gt;We can also try to divide the cars into &lt;strong&gt;classics&lt;/strong&gt; vs &lt;strong&gt;mass-produced&lt;/strong&gt;, by assuming cars built before 1999 are classics and cars built after are mass-produced, and using two separate normal distributions for their lifespans. Here we can play with the assumption that classic cars get a lot of care and love, don't get driven much, and hence have very long lifespans, while the mass-produced cars have shorter lifespans:&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;span class="c1"&gt;# avg_lifespan_classics, sigma_classics = 33, 10 # years&lt;/span&gt;
&lt;span class="c1"&gt;# avg_lifespan_massprod, sigma_massprod = 15, 5  # years&lt;/span&gt;
&lt;span class="n"&gt;avg_lifespan_classics&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;sigma_classics&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;25&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;10&lt;/span&gt; &lt;span class="c1"&gt;# years&lt;/span&gt;
&lt;span class="n"&gt;avg_lifespan_massprod&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;sigma_massprod&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;17&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;5&lt;/span&gt;  &lt;span class="c1"&gt;# years&lt;/span&gt;
&lt;span class="n"&gt;classics_year_end&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;1999&lt;/span&gt;
&lt;span class="n"&gt;total&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;num_alive&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;
&lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;year&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;num_produced&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;f&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
    &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;_&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="nb"&gt;range&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nb"&gt;round&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;num_produced&lt;/span&gt;&lt;span class="p"&gt;)):&lt;/span&gt;
        &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;year&lt;/span&gt; &lt;span class="o"&gt;&amp;lt;&lt;/span&gt; &lt;span class="n"&gt;classics_year_end&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
            &lt;span class="n"&gt;expiry_year&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nb"&gt;round&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;year&lt;/span&gt; &lt;span class="o"&gt;+&lt;/span&gt; &lt;span class="nb"&gt;max&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;normal&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;avg_lifespan_classics&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;sigma_classics&lt;/span&gt;&lt;span class="p"&gt;)))&lt;/span&gt;
        &lt;span class="k"&gt;else&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
            &lt;span class="n"&gt;expiry_year&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nb"&gt;round&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;year&lt;/span&gt; &lt;span class="o"&gt;+&lt;/span&gt; &lt;span class="nb"&gt;max&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;normal&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;avg_lifespan_massprod&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;sigma_massprod&lt;/span&gt;&lt;span class="p"&gt;)))&lt;/span&gt;
        &lt;span class="n"&gt;total&lt;/span&gt; &lt;span class="o"&gt;+=&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt;
        &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;expiry_year&lt;/span&gt; &lt;span class="o"&gt;&amp;gt;=&lt;/span&gt; &lt;span class="mi"&gt;2022&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
            &lt;span class="n"&gt;num_alive&lt;/span&gt; &lt;span class="o"&gt;+=&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt;
&lt;span class="k"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;f&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;Assuming avg_lifespan_classics = {avg_lifespan_classics} years and sigma_classics = {sigma_classics} years&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="k"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;f&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;Assuming avg_lifespan_massprod = {avg_lifespan_massprod} years and sigma_massprod = {sigma_massprod} years&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="k"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;f&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;Survival % = {survival_ratio*100:.1f}&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/pre&gt;&lt;/div&gt;


&lt;p&gt;Output:&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;Assuming avg_lifespan_classics = 25 years and sigma_classics = 10 years
Assuming avg_lifespan_massprod = 17 years and sigma_massprod = 5 years
Survival % = 70.4
&lt;/pre&gt;&lt;/div&gt;


&lt;p&gt;Of course there are a lot of parameter choices for the four parameters (2 averages, 2 sigmas) that yield a 70% survival rate. But taking all 3 versions into account, it suggests that &lt;strong&gt;modern Porsches have a 15-20 year average lifespan&lt;/strong&gt;, which does sound pretty good. &lt;a href="https://www.google.com/search?q=average+car+lifespan&amp;amp;oq=average+car+lifespan"&gt;Google suggests that the average car's lifespan is 8 years&lt;/a&gt;.&lt;/p&gt;
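&lt;p&gt;To see how underdetermined the two-population model is, we can sweep the two average lifespans with the sigmas fixed and collect the combinations that land near 70%. This sketch uses the closed-form normal survival probability instead of Monte Carlo, on a made-up exponentially growing production series:&lt;/p&gt;

```python
from statistics import NormalDist

def two_pop_survival_pct(production, mu_c, sig_c, mu_m, sig_m,
                         split=1999, now=2021):
    # classics (year < split) and mass-produced cars get separate
    # normal lifespan distributions; return the expected survival %
    classics, massprod = NormalDist(mu_c, sig_c), NormalDist(mu_m, sig_m)
    total = alive = 0.0
    for year, n in production:
        d = classics if year < split else massprod
        total += n
        alive += n * (1 - d.cdf(now - year))
    return 100 * alive / total

toy = [(1948 + i, 100 * 1.06 ** i) for i in range(74)]  # hypothetical series
# sweep the two average lifespans: several combinations land near 70%
hits = [(mu_c, mu_m)
        for mu_c in range(20, 41, 5) for mu_m in range(12, 25, 2)
        if abs(two_pop_survival_pct(toy, mu_c, 10, mu_m, 5) - 70) < 2]
assert len(hits) >= 2
```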
&lt;h2&gt;Conclusion&lt;/h2&gt;
&lt;p&gt;The analysis is quite naive, but because all 3 models were in roughly the same range, I think the conclusion of modern Porsches having a 15-20 year average lifespan is directionally sound. Having said that, some caveats and comments:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;Lifespan in itself is not just a property of the car/brand, it's also a function of the owners. Many Porsche owners love their cars and take good care of them, which increases their lifespan (compared to, say, a Suzuki). Also, for similar reasons, many Porsches don't get driven as many miles as cars of other brands, which reduces opportunities for the car to die.&lt;/li&gt;
&lt;li&gt;Even if a car brand has good reliability and owners take care of their cars, if the brand has been mass-producing cars for a long time, its survival rate will be significantly lower. For example, the &lt;a href="https://en.wikipedia.org/wiki/Ford_Motor_Company"&gt;Ford Motor Company&lt;/a&gt; has been mass-producing cars since 1901.&lt;/li&gt;
&lt;li&gt;Lifespan is not necessarily related to low fault rates. In my experience, Porsche cars have just as many issues (or more) as other similar brands (BMW, Land Rover/Jaguar). Official statistics disagree: some suggest Porsches are reliable, others suggest they're below average.&lt;/li&gt;
&lt;li&gt;Here lifespan-in-years was the primary input metric, but a better analysis would also take into account lifespan-in-miles.&lt;/li&gt;
&lt;li&gt;I don't know how realistic a normal distribution is for modeling lifespan; I didn't research how cars die. The sigmas don't seem to matter much.&lt;/li&gt;
&lt;li&gt;A per-model analysis (e.g. Macan, Cayenne, 911) doesn't seem worth it, since even the classics vs. mass-produced split already had too many parameters.&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;&lt;img src="/images/porsche-random.jpg" alt="A picture of a random 2017 Porsche 911 Turbo" style="width: 600px;"/&gt;&lt;/p&gt;</content><category term="porsche"></category><category term="timeseries"></category><category term="fit"></category><category term="survival"></category></entry><entry><title>"The company is all hot air"</title><link href="https://bytepawn.com/the-company-is-all-hot-air.html" rel="alternate"></link><published>2022-04-26T00:00:00+02:00</published><updated>2022-04-26T00:00:00+02:00</updated><author><name>Marton Trencseni</name></author><id>tag:bytepawn.com,2022-04-26:/the-company-is-all-hot-air.html</id><summary type="html">&lt;p&gt;Theranos, WeWork and the startup hustle.&lt;br/&gt;&lt;br/&gt; &lt;img src="/images/theranos-wework.jpg" alt="Bayes vs z-test" style="width: 400px;"/&gt;&lt;/p&gt;</summary><content type="html">&lt;h2&gt;Theranos&lt;/h2&gt;
&lt;p&gt;In 2013, user &lt;a href="https://news.ycombinator.com/user?id=car"&gt;car&lt;/a&gt; posted a link titled &lt;a href="https://news.ycombinator.com/item?id=6349349"&gt;Theranos&lt;/a&gt; to Hacker News, pointing to the company's website. A few minutes later a throwaway user account, &lt;a href="https://news.ycombinator.com/user?id=medman77"&gt;medman77&lt;/a&gt;, was created and posted the following &lt;a href="https://news.ycombinator.com/threads?id=medman77"&gt;comment&lt;/a&gt;:&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;"&lt;b&gt;The company is all hot air.&lt;/b&gt; They have a board full of retired military figure heads that have no experience in medical devices or retail services. Additionally, they do not have any products to show. Look at their patents. They are all very general and broad. There has been NO FDA CLEARANCE for anything they are doing, which raises legal questions. Speaking of legal, search for lawsuits they are involved in. Their core technology is not even theirs. They stole it from someone else."&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;Two years later, &lt;a href="https://en.wikipedia.org/wiki/John_Carreyrou"&gt;John Carreyrou&lt;/a&gt; published his first article on Theranos in the Wall Street Journal, &lt;a href="https://www.wsj.com/articles/theranos-has-struggled-with-blood-tests-1444881901"&gt;Hot Startup Theranos Has Struggled With Its Blood-Test Technology&lt;/a&gt;. This was the first blow to the house of cards that was Theranos, a company that raised \$1B and was valued at \$9B at its peak. The company was dissolved on September 4, 2018, 15 years after its founding. Since then, CEO &lt;a href="https://en.wikipedia.org/wiki/Elizabeth_Holmes"&gt;Elizabeth Holmes&lt;/a&gt; and COO &lt;a href="https://en.wikipedia.org/wiki/Sunny_Balwani"&gt;Sunny Balwani&lt;/a&gt; were charged with wire fraud and conspiracy, with Holmes being found guilty on four counts in January 2022. Her sentencing is set for late 2022. Balwani's trial began in March 2022.&lt;/p&gt;
&lt;p&gt;&lt;img src="/images/elizabeth-holmes-cover.jpg" alt="Elizabeth Holmes" style="width: 600px;"/&gt;&lt;/p&gt;
&lt;p&gt;I first learned about Theranos sometime in 2018 when John Carreyrou's excellent book &lt;a href="https://www.amazon.com/Bad-Blood-Secrets-Silicon-Startup/dp/0525431993/ref=tmm_pap_swatch_0?_encoding=UTF8&amp;amp;qid=1650787301&amp;amp;sr=1-1"&gt;Bad Blood&lt;/a&gt; popped up in my Amazon recommendations. I bought the book, read it, and I've been hooked on the story ever since. In case you haven't heard about Theranos, there are so many great accounts, I will not repeat the story here. It's a fascinating story, full of patterns you can learn and watch out for in real life. The best sources are:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;&lt;a href="https://www.amazon.com/Bad-Blood-Secrets-Silicon-Startup/dp/0525431993"&gt;Bad blood (book)&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://www.imdb.com/title/tt10166622/"&gt;The dropout (series)&lt;/a&gt; is an 8 episode Hulu series about Theranos starring Amanda Seyfried as Elizabeth Holmes. The series is done so well, it's worth watching even if you've read the book.&lt;/li&gt;
&lt;li&gt;&lt;a href="https://www.youtube.com/channel/UCnpOlzpWAjg03o8sOd6K1xw"&gt;Coffee and Cults (Youtube)&lt;/a&gt; is a Youtube channel which focuses on Theranos and WeWork content&lt;/li&gt;
&lt;li&gt;&lt;a href="https://en.wikipedia.org/wiki/Theranos"&gt;Theranos&lt;/a&gt; and &lt;a href="https://en.wikipedia.org/wiki/Elizabeth_Holmes"&gt;Elizabeth Holmes&lt;/a&gt; on Wikipedia&lt;/li&gt;
&lt;/ul&gt;
&lt;h2&gt;WeWork&lt;/h2&gt;
&lt;p&gt;The WeWork story is similar, but not identical to Theranos. In the case of Theranos, the CxOs misled investors by telling them they had a working machine that could run multiple tests on a drop of blood, when they didn't. Furthermore, their lab returned unreliable test results to millions of patients. &lt;strong&gt;In the case of WeWork, it was a case of grand visions sold to investors leading to outsized valuations, accompanied by a huge burn rate.&lt;/strong&gt; WeWork raised a total of $21B, the peak valuation was $47B, but the current, post-bubble valuation is approximately $5B. At its peak, the company was reportedly burning $100M per week. What's remarkable is that when WeWork crashed and 90% of (virtual) shareholder value was destroyed, the two co-founders Adam Neumann and Miguel McKelvey still walked away with a combined $2B in cash. Former CEO Adam Neumann was never sued; in fact, it was he who sued investor SoftBank to get the full amount of his final payout. Although WeWork's valuation tanked and Adam was fired, the company still exists, was led through Covid times by new management, and their co-working spaces still operate today.&lt;/p&gt;
&lt;p&gt;&lt;img src="/images/adam-neumann.jpg" alt="Adam Neumann" style="width: 600px;"/&gt;&lt;/p&gt;
&lt;p&gt;Similarly to Theranos, this is also a fascinating story worth diving into:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;&lt;a href="https://www.imdb.com/title/tt11188154/"&gt;WeWork (documentary)&lt;/a&gt; is a 90-minute Hulu documentary, definitely worth watching&lt;/li&gt;
&lt;li&gt;&lt;a href="https://www.imdb.com/title/tt12005128/?ref_=nv_sr_srsg_0"&gt;WeCrashed (series)&lt;/a&gt; is an 8 episode Apple series about WeWork, starring Jared Leto and Anne Hathaway&lt;/li&gt;
&lt;li&gt;&lt;a href="https://www.youtube.com/channel/UCnpOlzpWAjg03o8sOd6K1xw"&gt;Coffee and Cults (Youtube)&lt;/a&gt; is a Youtube channel which focuses on Theranos and WeWork content&lt;/li&gt;
&lt;li&gt;&lt;a href="https://www.amazon.com/Cult-We-Neumann-Startup-Delusion/dp/0593237137"&gt;The Cult of We (book)&lt;/a&gt; - I haven't read it yet, so cannot make a recommendation&lt;/li&gt;
&lt;li&gt;&lt;a href="https://www.amazon.com/Billion-Dollar-Loser-Spectacular-Neumann/dp/0316461334"&gt;Billion Dollar Loser (book)&lt;/a&gt; - I haven't read it yet, so cannot make a recommendation&lt;/li&gt;
&lt;li&gt;&lt;a href="https://en.wikipedia.org/wiki/WeWork"&gt;WeWork&lt;/a&gt; and &lt;a href="https://en.wikipedia.org/wiki/Adam_Neumann"&gt;Adam Neumann&lt;/a&gt; on Wikipedia&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;In case you're wondering how Adam Neumann convinced his investors that re-selling office space is a $47B business, watch this &lt;a href="https://www.youtube.com/watch?v=Dgp-CM-gQik"&gt;40-minute post-crash interview with him&lt;/a&gt;. He's insanely likeable, well-spoken, and a master of selling and public relations.&lt;/p&gt;
&lt;h2&gt;The startup hustle&lt;/h2&gt;
&lt;p&gt;The word "hustler" has many meanings:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;&lt;em&gt;a person who is hard-working, passionate and determined to succeed&lt;/em&gt; (positive)&lt;/li&gt;
&lt;li&gt;&lt;em&gt;a person who is good at selling&lt;/em&gt; (positive)&lt;/li&gt;
&lt;li&gt;&lt;em&gt;a person who achieves their goals by unethical means&lt;/em&gt; (negative)&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;&lt;strong&gt;Investors want co-founders to be the first two, but not the last.&lt;/strong&gt; Determination is important because startups are hard, and being the CEO of your own startup is the hardest job you'll ever have. Selling is important because along the way you'll have to convince a lot of people that you're &lt;em&gt;right&lt;/em&gt;: investors, employees, customers, the press.&lt;/p&gt;
&lt;p&gt;&lt;img src="/images/hustle.jpg" alt="Hustle" style="width: 600px;"/&gt;&lt;/p&gt;
&lt;p&gt;What's interesting is the gray area where these three definitions intersect, &lt;em&gt;when founders sell their vision really hard&lt;/em&gt;. When selling a vision and creating a 5-year sales forecast, where is the line between ethical, if extreme, optimism and unethical fabrication?&lt;/p&gt;
&lt;h2&gt;Conclusion&lt;/h2&gt;
&lt;p&gt;Even in my limited experience of working at ~10 companies in my life, out of which 5-6 were startups (4 failed, including &lt;a href="https://github.com/scalien/scaliendb"&gt;my own&lt;/a&gt;), I've seen many cases where the co-founders sold their vision hard and didn't fully succeed, and investors didn't get the returns they hoped for. So, this is actually quite common in startups. VCs know that co-founders are selling their vision hard, and take this into account. They don't expect all investments to succeed. They assume only 1 in $N$ will be a home-run, and they don't really care why the other $N-1$ failed (as long as there was no fraud). The only company I ever worked at where the founders fully achieved their goals (and more) was Facebook, and companies like Facebook should be considered an outlier — which is exactly what investors are trying to find.&lt;/p&gt;
&lt;p&gt;Finally, it's worth noting that good investors also pick bad apples, see &lt;a href="https://en.wikipedia.org/wiki/UBiome"&gt;uBiome&lt;/a&gt;, which was &lt;a href="https://www.ycombinator.com/"&gt;YCombinator&lt;/a&gt; and &lt;a href="https://a16z.com"&gt;Andreessen Horowitz&lt;/a&gt; backed, roughly in the same space as Theranos. According to Wikipedia:&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;In 2021, the Securities and Exchange Commission charged two of the cofounders (Richman and Apte) with defrauding investors.&lt;/p&gt;
&lt;/blockquote&gt;</content><category term="startups"></category><category term="theranos"></category><category term="wework"></category><category term="scalien"></category></entry><entry><title>Python types for Data Scientists - Part III</title><link href="https://bytepawn.com/python-types-for-data-scientists-part-iii.html" rel="alternate"></link><published>2022-04-22T00:00:00+02:00</published><updated>2022-04-22T00:00:00+02:00</updated><author><name>Marton Trencseni</name></author><id>tag:bytepawn.com,2022-04-22:/python-types-for-data-scientists-part-iii.html</id><summary type="html">&lt;p&gt;I show slightly more advanced aspects of type checking in Python for Data Scientists.&lt;br/&gt;&lt;br/&gt; &lt;img src="/images/mypy.png" alt="Mypy" style="width: 200px;"/&gt;&lt;/p&gt;</summary><content type="html">&lt;h2&gt;Introduction&lt;/h2&gt;
&lt;p&gt;In the &lt;a href="https://bytepawn.com/python-types-for-data-scientists.html#python-types-for-data-scientists"&gt;first post&lt;/a&gt; I showed how to get started using Python static type checking in ipython notebooks. The &lt;a href="https://bytepawn.com/python-types-for-data-scientists-part-ii.html#python-types-for-data-scientists-part-ii"&gt;second post&lt;/a&gt; looked at slightly more advanced uses of typing to further increase the safety and readability of code. Here I will continue, and look at some aspects of type hinting:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;type check errors and runtime errors&lt;/li&gt;
&lt;li&gt;where type hints don't work&lt;/li&gt;
&lt;li&gt;Abstract Base Classes vs. Protocols&lt;/li&gt;
&lt;li&gt;types for class variables vs instance variables&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;The &lt;a href="https://github.com/mtrencseni/playground/blob/master/Types%20-%20Part%203.ipynb"&gt;ipython notebook is up on Github&lt;/a&gt;. The best reference is the &lt;a href="https://docs.python.org/3/library/typing.html"&gt;official Python documentation of the &lt;code&gt;typing&lt;/code&gt; module&lt;/a&gt;.&lt;/p&gt;
&lt;p&gt;&lt;img src="/images/mypy.png" alt="Mypy" style="width: 300px;"/&gt;&lt;/p&gt;
&lt;h2&gt;Type check errors and runtime errors&lt;/h2&gt;
&lt;p&gt;It's important to remember that in Python, &lt;strong&gt;type hints are optional and ignored (not enforced) by the Python runtime&lt;/strong&gt;. Type hints are interpreted by external programs. In these examples, I use &lt;a href="https://pypi.org/project/nb-mypy/"&gt;nb_mypy&lt;/a&gt;, which actually runs &lt;code&gt;mypy&lt;/code&gt; to do type checking. Then, &lt;em&gt;irrespective of the result of type checking,&lt;/em&gt; the regular Python runtime runs (and ignores all type hints). In other contexts, such as an IDE, the IDE runs type checks in the background, shows errors, and uses the type hint information for code completion.&lt;/p&gt;
&lt;p&gt;Given the roots of Python, I find this to be a good trade-off to introduce typing and get 80% of the benefits.&lt;/p&gt;
&lt;p&gt;But it leads to some weird behaviour, which cannot be changed with nb_mypy: even if there is a type check error in the current cell, the code in the cell is still run after the type check completes. This leads to some confusing outputs. For example:&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;foo&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;i&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;int&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;-&amp;gt;&lt;/span&gt; &lt;span class="bp"&gt;None&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
    &lt;span class="k"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;i&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="n"&gt;foo&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s2"&gt;&amp;quot;hello&amp;quot;&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/pre&gt;&lt;/div&gt;


&lt;p&gt;Output:&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;span class="n"&gt;error&lt;/span&gt;&lt;span class="o"&gt;:&lt;/span&gt; &lt;span class="n"&gt;Argument&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt; &lt;span class="n"&gt;to&lt;/span&gt; &lt;span class="s2"&gt;&amp;quot;foo&amp;quot;&lt;/span&gt; &lt;span class="n"&gt;has&lt;/span&gt; &lt;span class="n"&gt;incompatible&lt;/span&gt; &lt;span class="n"&gt;type&lt;/span&gt; &lt;span class="s2"&gt;&amp;quot;str&amp;quot;&lt;/span&gt;&lt;span class="o"&gt;;&lt;/span&gt; &lt;span class="n"&gt;expected&lt;/span&gt; &lt;span class="s2"&gt;&amp;quot;int&amp;quot;&lt;/span&gt;
&lt;span class="n"&gt;hello&lt;/span&gt;
&lt;/pre&gt;&lt;/div&gt;


&lt;p&gt;First the nb_mypy type checker runs, finds the type error and prints it, but then the code is executed anyway. And since the Python runtime ignores all type hints, the code runs just fine, since to Python the function &lt;code&gt;foo()&lt;/code&gt; is equivalent to:&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;foo&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;i&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
    &lt;span class="k"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;i&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/pre&gt;&lt;/div&gt;
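&lt;p&gt;If you do want the hints enforced at runtime, you have to do it yourself (or use a library built for this, such as &lt;code&gt;typeguard&lt;/code&gt; or &lt;code&gt;pydantic&lt;/code&gt;). As a minimal sketch, assuming only simple, non-generic hints like &lt;code&gt;int&lt;/code&gt;, a hypothetical &lt;code&gt;enforce_types&lt;/code&gt; decorator (not part of any library, just an illustration) could read the hints with &lt;code&gt;typing.get_type_hints()&lt;/code&gt; and check the arguments on every call:&lt;/p&gt;

```python
import inspect
import functools
from typing import get_type_hints

def enforce_types(func):
    # Hypothetical decorator for illustration: checks argument types
    # against the function's hints at call time, raising TypeError on
    # mismatch. Only handles plain classes, not generics like list[int].
    hints = get_type_hints(func)
    sig = inspect.signature(func)
    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        bound = sig.bind(*args, **kwargs)
        for name, value in bound.arguments.items():
            expected = hints.get(name)
            if isinstance(expected, type) and not isinstance(value, expected):
                raise TypeError(f'{name} must be {expected.__name__}, '
                                f'got {type(value).__name__}')
        return func(*args, **kwargs)
    return wrapper

@enforce_types
def foo(i: int) -> None:
    print(i)

foo(1)          # fine, prints 1
# foo("hello")  # would now raise TypeError at runtime
```

&lt;p&gt;This is essentially the check mypy performs statically, moved to call time; the difference is that the error now only fires on code paths that actually execute.&lt;/p&gt;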


&lt;h2&gt;Where type hints don't work&lt;/h2&gt;
&lt;p&gt;There are some cases where writing type hints does not work as we'd expect. The big ones are &lt;code&gt;for&lt;/code&gt; and &lt;code&gt;while&lt;/code&gt; loops:&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;i&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;int&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="nb"&gt;range&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;3&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
    &lt;span class="k"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;i&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/pre&gt;&lt;/div&gt;


&lt;p&gt;Output:&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;span class="n"&gt;SyntaxError&lt;/span&gt;&lt;span class="o"&gt;:&lt;/span&gt; &lt;span class="n"&gt;invalid&lt;/span&gt; &lt;span class="n"&gt;syntax&lt;/span&gt;
&lt;/pre&gt;&lt;/div&gt;


&lt;p&gt;This is not a type check error; it's a syntax error. We cannot put the type hint for &lt;code&gt;i&lt;/code&gt; in the &lt;code&gt;for&lt;/code&gt; loop itself. It has to go before the loop:&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;span class="n"&gt;i&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;int&lt;/span&gt;
&lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;i&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="nb"&gt;range&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;3&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
    &lt;span class="k"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;i&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/pre&gt;&lt;/div&gt;


&lt;p&gt;However, at least in such simple cases, I would just skip the type hint, since it's quite ugly. It's not actually required: the type checker can infer the &lt;code&gt;int&lt;/code&gt; type from &lt;code&gt;range()&lt;/code&gt;, so the following will throw a type check error:&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;f&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;s&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;str&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;-&amp;gt;&lt;/span&gt; &lt;span class="bp"&gt;None&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
    &lt;span class="k"&gt;pass&lt;/span&gt;

&lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;i&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="nb"&gt;range&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;3&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt; &lt;span class="c1"&gt;# okay&lt;/span&gt;
    &lt;span class="n"&gt;f&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;i&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;           &lt;span class="c1"&gt;# not okay&lt;/span&gt;
&lt;/pre&gt;&lt;/div&gt;


&lt;p&gt;Output:&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;span class="n"&gt;error&lt;/span&gt;&lt;span class="o"&gt;:&lt;/span&gt; &lt;span class="n"&gt;Argument&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt; &lt;span class="n"&gt;to&lt;/span&gt; &lt;span class="s2"&gt;&amp;quot;f&amp;quot;&lt;/span&gt; &lt;span class="n"&gt;has&lt;/span&gt; &lt;span class="n"&gt;incompatible&lt;/span&gt; &lt;span class="n"&gt;type&lt;/span&gt; &lt;span class="s2"&gt;&amp;quot;int&amp;quot;&lt;/span&gt;&lt;span class="o"&gt;;&lt;/span&gt; &lt;span class="n"&gt;expected&lt;/span&gt; &lt;span class="s2"&gt;&amp;quot;str&amp;quot;&lt;/span&gt;
&lt;/pre&gt;&lt;/div&gt;
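&lt;p&gt;Loops are not the only such place. A &lt;code&gt;lambda&lt;/code&gt;'s parameters cannot carry annotations either; the workaround is again to put the hint on a variable, this time using &lt;code&gt;Callable&lt;/code&gt;. A small sketch:&lt;/p&gt;

```python
from typing import Callable

# Annotating the lambda's parameter directly is a SyntaxError, just
# like annotating the loop variable above -- so annotate the variable
# holding the lambda instead:
square: Callable[[int], int] = lambda x: x * x

print(square(3))  # 9
```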


&lt;h2&gt;Abstract Base Classes vs. Protocols&lt;/h2&gt;
&lt;p&gt;In the previous post, there was the example of declaring a &lt;code&gt;Protocol&lt;/code&gt; for addability:&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;span class="k"&gt;class&lt;/span&gt; &lt;span class="nc"&gt;Addable&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;Protocol&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
    &lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="fm"&gt;__add__&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="bp"&gt;self&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;other&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt; &lt;span class="c1"&gt;# anything that declares __add__() can stand in for an Addable&lt;/span&gt;
        &lt;span class="k"&gt;raise&lt;/span&gt; &lt;span class="ne"&gt;NotImplementedError&lt;/span&gt;

&lt;span class="n"&gt;T&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;TypeVar&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;T&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;bound&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;Addable&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;add&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;a&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;T&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;b&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;T&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
    &lt;span class="k"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nb"&gt;type&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;a&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt; &lt;span class="nb"&gt;type&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;b&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;
    &lt;span class="k"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;a&lt;/span&gt;&lt;span class="o"&gt;+&lt;/span&gt;&lt;span class="n"&gt;b&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="c1"&gt;# checks for __add__(), uses __str__()&lt;/span&gt;

&lt;span class="n"&gt;add&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nb"&gt;int&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt; &lt;span class="nb"&gt;int&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;2&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt; &lt;span class="c1"&gt;# okay&lt;/span&gt;
&lt;/pre&gt;&lt;/div&gt;


&lt;p&gt;In this example, what we pass to &lt;code&gt;add()&lt;/code&gt; needs to declare an &lt;code&gt;__add__()&lt;/code&gt;. So if we define our own class &lt;code&gt;MyInt&lt;/code&gt; like this:&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;span class="k"&gt;class&lt;/span&gt; &lt;span class="nc"&gt;MyInt&lt;/span&gt;&lt;span class="p"&gt;():&lt;/span&gt;
    &lt;span class="n"&gt;num&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;int&lt;/span&gt;
    &lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="fm"&gt;__init__&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="bp"&gt;self&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;num&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;int&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
        &lt;span class="bp"&gt;self&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;num&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;num&lt;/span&gt;

&lt;span class="n"&gt;add&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;MyInt&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt; &lt;span class="n"&gt;MyInt&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;2&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt; &lt;span class="c1"&gt;# not okay&lt;/span&gt;
&lt;/pre&gt;&lt;/div&gt;


&lt;p&gt;Output:&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;span class="n"&gt;error&lt;/span&gt;&lt;span class="o"&gt;:&lt;/span&gt; &lt;span class="n"&gt;Value&lt;/span&gt; &lt;span class="n"&gt;of&lt;/span&gt; &lt;span class="n"&gt;type&lt;/span&gt; &lt;span class="n"&gt;variable&lt;/span&gt; &lt;span class="s2"&gt;&amp;quot;T&amp;quot;&lt;/span&gt; &lt;span class="n"&gt;of&lt;/span&gt; &lt;span class="s2"&gt;&amp;quot;add&amp;quot;&lt;/span&gt; &lt;span class="n"&gt;cannot&lt;/span&gt; &lt;span class="n"&gt;be&lt;/span&gt; &lt;span class="s2"&gt;&amp;quot;MyInt&amp;quot;&lt;/span&gt;
&lt;/pre&gt;&lt;/div&gt;


&lt;p&gt;We can fix this by implementing &lt;code&gt;__add__()&lt;/code&gt; in &lt;code&gt;MyInt&lt;/code&gt;:&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;span class="k"&gt;class&lt;/span&gt; &lt;span class="nc"&gt;MyInt&lt;/span&gt;&lt;span class="p"&gt;():&lt;/span&gt; &lt;span class="c1"&gt;# note that MyInt does not inherit Addable&lt;/span&gt;
    &lt;span class="n"&gt;num&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;int&lt;/span&gt;
    &lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="fm"&gt;__init__&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="bp"&gt;self&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;num&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;int&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
        &lt;span class="bp"&gt;self&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;num&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;num&lt;/span&gt;
    &lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="fm"&gt;__add__&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="bp"&gt;self&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;other&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
        &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;MyInt&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="bp"&gt;self&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;num&lt;/span&gt; &lt;span class="o"&gt;+&lt;/span&gt; &lt;span class="n"&gt;other&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;num&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="fm"&gt;__str__&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="bp"&gt;self&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt; &lt;span class="c1"&gt;# so print() works&lt;/span&gt;
        &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="nb"&gt;str&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="bp"&gt;self&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;num&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="n"&gt;T&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;TypeVar&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;T&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;bound&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;Addable&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;add&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;a&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;T&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;b&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;T&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
    &lt;span class="k"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nb"&gt;type&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;a&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt; &lt;span class="nb"&gt;type&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;b&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;
    &lt;span class="k"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;a&lt;/span&gt;&lt;span class="o"&gt;+&lt;/span&gt;&lt;span class="n"&gt;b&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="c1"&gt;# checks for __add__(), uses __str__()&lt;/span&gt;

&lt;span class="n"&gt;add&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;MyInt&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt; &lt;span class="n"&gt;MyInt&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;2&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt; &lt;span class="c1"&gt;# okay&lt;/span&gt;
&lt;span class="n"&gt;add&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nb"&gt;int&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt; &lt;span class="nb"&gt;int&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;2&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;     &lt;span class="c1"&gt;# okay&lt;/span&gt;
&lt;/pre&gt;&lt;/div&gt;


&lt;p&gt;Output:&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&amp;lt;class &amp;#39;__main__.MyInt&amp;#39;&amp;gt; &amp;lt;class &amp;#39;__main__.MyInt&amp;#39;&amp;gt;
3
&amp;lt;class &amp;#39;int&amp;#39;&amp;gt; &amp;lt;class &amp;#39;int&amp;#39;&amp;gt;
3
&lt;/pre&gt;&lt;/div&gt;


&lt;p&gt;This works, even though &lt;code&gt;MyInt&lt;/code&gt; &lt;em&gt;does not mention&lt;/em&gt; &lt;code&gt;Addable&lt;/code&gt; in the class declaration at all!&lt;/p&gt;
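&lt;p&gt;Note that by default this structural matching only happens at type check time; calling &lt;code&gt;isinstance()&lt;/code&gt; against a plain &lt;code&gt;Protocol&lt;/code&gt; raises a &lt;code&gt;TypeError&lt;/code&gt; at runtime. Marking the protocol with &lt;code&gt;@runtime_checkable&lt;/code&gt; makes &lt;code&gt;isinstance()&lt;/code&gt; work, with the caveat that it only checks that the methods exist, not their signatures. A sketch:&lt;/p&gt;

```python
from typing import Protocol, runtime_checkable

@runtime_checkable
class Addable(Protocol):
    def __add__(self, other): ...

class MyInt:
    def __init__(self, num: int):
        self.num = num
    def __add__(self, other):
        return MyInt(self.num + other.num)

# MyInt matches structurally, without ever mentioning Addable:
print(isinstance(MyInt(1), Addable))  # True
print(isinstance(object(), Addable))  # False: object has no __add__
```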
&lt;p&gt;What happens if we go back to the &lt;code&gt;Addable&lt;/code&gt; declaration and change &lt;code&gt;Protocol&lt;/code&gt; to &lt;code&gt;ABC&lt;/code&gt;, like:&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;span class="k"&gt;class&lt;/span&gt; &lt;span class="nc"&gt;Addable&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;ABC&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
    &lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="fm"&gt;__add__&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="bp"&gt;self&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;other&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
        &lt;span class="k"&gt;raise&lt;/span&gt; &lt;span class="ne"&gt;NotImplementedError&lt;/span&gt;

&lt;span class="o"&gt;...&lt;/span&gt;

&lt;span class="n"&gt;add&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;MyInt&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt; &lt;span class="n"&gt;MyInt&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;2&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt; &lt;span class="c1"&gt;# not okay&lt;/span&gt;
&lt;span class="n"&gt;add&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nb"&gt;int&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt; &lt;span class="nb"&gt;int&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;2&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;     &lt;span class="c1"&gt;# not okay&lt;/span&gt;
&lt;/pre&gt;&lt;/div&gt;


&lt;p&gt;We get a type error from both lines:&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;span class="n"&gt;error&lt;/span&gt;&lt;span class="o"&gt;:&lt;/span&gt; &lt;span class="n"&gt;Value&lt;/span&gt; &lt;span class="n"&gt;of&lt;/span&gt; &lt;span class="n"&gt;type&lt;/span&gt; &lt;span class="n"&gt;variable&lt;/span&gt; &lt;span class="s2"&gt;&amp;quot;T&amp;quot;&lt;/span&gt; &lt;span class="n"&gt;of&lt;/span&gt; &lt;span class="s2"&gt;&amp;quot;add&amp;quot;&lt;/span&gt; &lt;span class="n"&gt;cannot&lt;/span&gt; &lt;span class="n"&gt;be&lt;/span&gt; &lt;span class="s2"&gt;&amp;quot;MyInt&amp;quot;&lt;/span&gt;
&lt;span class="n"&gt;error&lt;/span&gt;&lt;span class="o"&gt;:&lt;/span&gt; &lt;span class="n"&gt;Value&lt;/span&gt; &lt;span class="n"&gt;of&lt;/span&gt; &lt;span class="n"&gt;type&lt;/span&gt; &lt;span class="n"&gt;variable&lt;/span&gt; &lt;span class="s2"&gt;&amp;quot;T&amp;quot;&lt;/span&gt; &lt;span class="n"&gt;of&lt;/span&gt; &lt;span class="s2"&gt;&amp;quot;add&amp;quot;&lt;/span&gt; &lt;span class="n"&gt;cannot&lt;/span&gt; &lt;span class="n"&gt;be&lt;/span&gt; &lt;span class="s2"&gt;&amp;quot;int&amp;quot;&lt;/span&gt;
&lt;/pre&gt;&lt;/div&gt;


&lt;p&gt;Neither &lt;code&gt;MyInt&lt;/code&gt; nor &lt;code&gt;int&lt;/code&gt; can stand in for an &lt;code&gt;Addable&lt;/code&gt; if it's an &lt;code&gt;ABC&lt;/code&gt;. Only classes that derive from an abstract base class can stand in for it.&lt;/p&gt;
&lt;p&gt;Let's change &lt;code&gt;MyInt&lt;/code&gt; to derive from &lt;code&gt;Addable&lt;/code&gt;:&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;span class="k"&gt;class&lt;/span&gt; &lt;span class="nc"&gt;MyInt&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;Addable&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
    &lt;span class="o"&gt;....&lt;/span&gt;

&lt;span class="n"&gt;add&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;MyInt&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt; &lt;span class="n"&gt;MyInt&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;2&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt; &lt;span class="c1"&gt;# okay, MyInt derives from Addable&lt;/span&gt;
&lt;span class="n"&gt;add&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nb"&gt;int&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt; &lt;span class="nb"&gt;int&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;2&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;     &lt;span class="c1"&gt;# not okay&lt;/span&gt;
&lt;/pre&gt;&lt;/div&gt;


&lt;p&gt;Output:&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;span class="n"&gt;error&lt;/span&gt;&lt;span class="o"&gt;:&lt;/span&gt; &lt;span class="n"&gt;Value&lt;/span&gt; &lt;span class="n"&gt;of&lt;/span&gt; &lt;span class="n"&gt;type&lt;/span&gt; &lt;span class="n"&gt;variable&lt;/span&gt; &lt;span class="s2"&gt;&amp;quot;T&amp;quot;&lt;/span&gt; &lt;span class="n"&gt;of&lt;/span&gt; &lt;span class="s2"&gt;&amp;quot;add&amp;quot;&lt;/span&gt; &lt;span class="n"&gt;cannot&lt;/span&gt; &lt;span class="n"&gt;be&lt;/span&gt; &lt;span class="s2"&gt;&amp;quot;int&amp;quot;&lt;/span&gt;
&lt;/pre&gt;&lt;/div&gt;


&lt;p&gt;&lt;code&gt;MyInt&lt;/code&gt; is now fine, &lt;code&gt;int&lt;/code&gt; still cannot stand in for an &lt;code&gt;Addable&lt;/code&gt;. This shows the difference between &lt;code&gt;Protocol&lt;/code&gt; and &lt;code&gt;ABC&lt;/code&gt;. With &lt;code&gt;Protocol&lt;/code&gt;, anything that implements the declared functions can stand in for that type, &lt;em&gt;irrespective of inheritance&lt;/em&gt;. With &lt;code&gt;ABC&lt;/code&gt;, only types that inherit from the base class (in the example above, &lt;code&gt;MyInt&lt;/code&gt; inherits from &lt;code&gt;Addable&lt;/code&gt;) can stand in for that type.&lt;/p&gt;
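&lt;p&gt;There is a middle ground on the &lt;code&gt;ABC&lt;/code&gt; side: an abstract base class can declare &lt;em&gt;virtual subclasses&lt;/em&gt; with &lt;code&gt;register()&lt;/code&gt;, which makes &lt;code&gt;isinstance()&lt;/code&gt; and &lt;code&gt;issubclass()&lt;/code&gt; succeed without inheritance. Note, however, that to my understanding mypy does not take &lt;code&gt;register()&lt;/code&gt; into account, so this only helps at runtime. A sketch:&lt;/p&gt;

```python
from abc import ABC

class Addable(ABC):
    def __add__(self, other):
        raise NotImplementedError

class MyInt:
    def __init__(self, num: int):
        self.num = num
    def __add__(self, other):
        return MyInt(self.num + other.num)

# register() makes MyInt a virtual subclass of Addable at runtime,
# with no inheritance in the class declaration:
Addable.register(MyInt)
print(issubclass(MyInt, Addable))     # True
print(isinstance(MyInt(1), Addable))  # True
```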
&lt;h2&gt;Types for class variables vs instance variables&lt;/h2&gt;
&lt;p&gt;Let's try this code:&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;span class="k"&gt;class&lt;/span&gt; &lt;span class="nc"&gt;Foo&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
    &lt;span class="n"&gt;num&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;int&lt;/span&gt; &lt;span class="c1"&gt;# num is NOT a class variable&lt;/span&gt;
    &lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="fm"&gt;__init__&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="bp"&gt;self&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;num&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;int&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
        &lt;span class="bp"&gt;self&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;num&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;num&lt;/span&gt;

&lt;span class="n"&gt;f&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;Foo&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="k"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;f&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;num&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;        &lt;span class="c1"&gt;# prints 1&lt;/span&gt;
&lt;span class="n"&gt;g&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;Foo&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;2&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="k"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;f&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;num&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;g&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;num&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="c1"&gt;# prints 1 2&lt;/span&gt;
&lt;span class="k"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;Foo&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;num&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;      &lt;span class="c1"&gt;# AttributeError: type object &amp;#39;Foo&amp;#39; has no attribute &amp;#39;num&amp;#39;&lt;/span&gt;
&lt;/pre&gt;&lt;/div&gt;


&lt;p&gt;Output:&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;1
1 2
AttributeError: type object &amp;#39;Foo&amp;#39; has no attribute &amp;#39;num&amp;#39;
&lt;/pre&gt;&lt;/div&gt;


&lt;p&gt;Here, we declare the &lt;code&gt;num&lt;/code&gt; instance variable of class &lt;code&gt;Foo&lt;/code&gt; to be of type &lt;code&gt;int&lt;/code&gt;. We create two instances of &lt;code&gt;Foo&lt;/code&gt;, and we see that each of them carries a separate &lt;code&gt;num&lt;/code&gt;. Then we try to access the &lt;code&gt;Foo.num&lt;/code&gt; class variable, and we get an &lt;code&gt;AttributeError&lt;/code&gt;, because it doesn't exist.&lt;/p&gt;
&lt;p&gt;Let's make one minor modification to the code and assign an initial value to &lt;code&gt;num&lt;/code&gt;:&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;span class="k"&gt;class&lt;/span&gt; &lt;span class="nc"&gt;Foo&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
    &lt;span class="n"&gt;num&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;int&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt; &lt;span class="c1"&gt;# num is now a class variable&lt;/span&gt;
    &lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="fm"&gt;__init__&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="bp"&gt;self&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;num&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;int&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
        &lt;span class="bp"&gt;self&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;num&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;num&lt;/span&gt;

&lt;span class="n"&gt;f&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;Foo&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="k"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;f&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;num&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;          &lt;span class="c1"&gt;# prints 1&lt;/span&gt;
&lt;span class="n"&gt;g&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;Foo&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;2&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="k"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;f&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;num&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;g&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;num&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;   &lt;span class="c1"&gt;# prints 1 2&lt;/span&gt;
&lt;span class="k"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;Foo&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;num&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;        &lt;span class="c1"&gt;# prints 0&lt;/span&gt;
&lt;/pre&gt;&lt;/div&gt;


&lt;p&gt;This one change creates &lt;code&gt;num&lt;/code&gt; as a class variable, which can be accessed. Note that both the class variable and the instance variable carry the &lt;code&gt;int&lt;/code&gt; type:&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;span class="n"&gt;f&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;Foo&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="k"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;Foo&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;num&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;f&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;num&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="c1"&gt;# prints 0 1&lt;/span&gt;
&lt;span class="n"&gt;f&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;num&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="s2"&gt;&amp;quot;hello&amp;quot;&lt;/span&gt;       &lt;span class="c1"&gt;# not okay&lt;/span&gt;
&lt;span class="n"&gt;Foo&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;num&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="s2"&gt;&amp;quot;world&amp;quot;&lt;/span&gt;     &lt;span class="c1"&gt;# not okay&lt;/span&gt;
&lt;/pre&gt;&lt;/div&gt;


&lt;p&gt;Output:&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;span class="n"&gt;error&lt;/span&gt;&lt;span class="o"&gt;:&lt;/span&gt; &lt;span class="n"&gt;Incompatible&lt;/span&gt; &lt;span class="n"&gt;types&lt;/span&gt; &lt;span class="k"&gt;in&lt;/span&gt; &lt;span class="n"&gt;assignment&lt;/span&gt; &lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="n"&gt;expression&lt;/span&gt; &lt;span class="n"&gt;has&lt;/span&gt; &lt;span class="n"&gt;type&lt;/span&gt; &lt;span class="s2"&gt;&amp;quot;str&amp;quot;&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt; &lt;span class="n"&gt;variable&lt;/span&gt; &lt;span class="n"&gt;has&lt;/span&gt; &lt;span class="n"&gt;type&lt;/span&gt; &lt;span class="s2"&gt;&amp;quot;int&amp;quot;&lt;/span&gt;&lt;span class="o"&gt;)&lt;/span&gt;
&lt;span class="n"&gt;error&lt;/span&gt;&lt;span class="o"&gt;:&lt;/span&gt; &lt;span class="n"&gt;Incompatible&lt;/span&gt; &lt;span class="n"&gt;types&lt;/span&gt; &lt;span class="k"&gt;in&lt;/span&gt; &lt;span class="n"&gt;assignment&lt;/span&gt; &lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="n"&gt;expression&lt;/span&gt; &lt;span class="n"&gt;has&lt;/span&gt; &lt;span class="n"&gt;type&lt;/span&gt; &lt;span class="s2"&gt;&amp;quot;str&amp;quot;&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt; &lt;span class="n"&gt;variable&lt;/span&gt; &lt;span class="n"&gt;has&lt;/span&gt; &lt;span class="n"&gt;type&lt;/span&gt; &lt;span class="s2"&gt;&amp;quot;int&amp;quot;&lt;/span&gt;&lt;span class="o"&gt;)&lt;/span&gt;
&lt;/pre&gt;&lt;/div&gt;
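&lt;p&gt;Relatedly, if we want the annotation itself to say that a variable is class-level, the &lt;code&gt;typing&lt;/code&gt; module provides &lt;code&gt;ClassVar&lt;/code&gt;. A minimal sketch, with an invented &lt;code&gt;Counter&lt;/code&gt; class for illustration:&lt;/p&gt;

```python
from typing import ClassVar

class Counter:
    count: ClassVar[int] = 0  # explicitly a class variable, shared by all instances
    name: str                 # an instance variable

    def __init__(self, name: str) -> None:
        self.name = name
        Counter.count += 1    # mutate via the class; `self.count = ...` would be flagged by mypy

a = Counter('a')
b = Counter('b')
print(Counter.count)  # prints 2
```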


&lt;h2&gt;Conclusion&lt;/h2&gt;
&lt;p&gt;This post concludes this short series on Python typing for Data Scientists. I think the jury is still out on whether type hints are worth it in Data Science code (which tends to be short, linear and less structured than application software), but it's good to know that type hints exist and how they work.&lt;/p&gt;</content><category term="python"></category><category term="types"></category></entry><entry><title>Python types for Data Scientists - Part II</title><link href="https://bytepawn.com/python-types-for-data-scientists-part-ii.html" rel="alternate"></link><published>2022-04-17T00:00:00+02:00</published><updated>2022-04-17T00:00:00+02:00</updated><author><name>Marton Trencseni</name></author><id>tag:bytepawn.com,2022-04-17:/python-types-for-data-scientists-part-ii.html</id><summary type="html">&lt;p&gt;I show slightly more advanced uses of type checking in Python.&lt;br/&gt;&lt;br/&gt; &lt;img src="/images/python-hero.jpeg" alt="Python snake" style="width: 200px;"/&gt;&lt;/p&gt;</summary><content type="html">&lt;h2&gt;Introduction&lt;/h2&gt;
&lt;p&gt;In the &lt;a href="https://bytepawn.com/python-types-for-data-scientists.html#python-types-for-data-scientists"&gt;previous post&lt;/a&gt; I showed how to get started using Python static type checking in ipython notebooks. Here I will look at slightly more advanced uses of typing to further increase the safety and readability of code. The &lt;a href="https://github.com/mtrencseni/playground/blob/master/Types%20-%20Part%202.ipynb"&gt;ipython notebook is up on Github&lt;/a&gt;. The best reference is the &lt;a href="https://docs.python.org/3/library/typing.html"&gt;official Python documentation of the &lt;code&gt;typing&lt;/code&gt; module&lt;/a&gt;.&lt;/p&gt;
&lt;p&gt;&lt;img src="/images/python-hero.jpeg" alt="Python snake" style="width: 300px;"/&gt;&lt;/p&gt;
&lt;h2&gt;&lt;code&gt;Optional&lt;/code&gt; types and &lt;code&gt;Union&lt;/code&gt;&lt;/h2&gt;
&lt;p&gt;Sometimes we want to declare that something can be of a certain type, or &lt;code&gt;None&lt;/code&gt;. Imagine we don't know about &lt;code&gt;numpy.random.random_sample&lt;/code&gt; and we're writing a function &lt;code&gt;randoms()&lt;/code&gt; to return a random &lt;code&gt;list[float]&lt;/code&gt; of length &lt;code&gt;num&lt;/code&gt;:&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;randoms&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;num&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;int&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;-&amp;gt;&lt;/span&gt; &lt;span class="nb"&gt;list&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="nb"&gt;float&lt;/span&gt;&lt;span class="p"&gt;]:&lt;/span&gt;
    &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;num&lt;/span&gt; &lt;span class="o"&gt;&amp;gt;=&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
        &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;random&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt; &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;_&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="nb"&gt;range&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;num&lt;/span&gt;&lt;span class="p"&gt;)]&lt;/span&gt;
    &lt;span class="k"&gt;else&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
        &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="bp"&gt;None&lt;/span&gt; &lt;span class="c1"&gt;# not okay, NoneType is not list[float]&lt;/span&gt;
&lt;/pre&gt;&lt;/div&gt;


&lt;p&gt;We want to be good programmers and return &lt;code&gt;None&lt;/code&gt; if a negative value for &lt;code&gt;num&lt;/code&gt; is passed in, but &lt;code&gt;None&lt;/code&gt; is not a &lt;code&gt;list[float]&lt;/code&gt;, so this won't work:&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;span class="n"&gt;error&lt;/span&gt;&lt;span class="o"&gt;:&lt;/span&gt; &lt;span class="n"&gt;Incompatible&lt;/span&gt; &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;value&lt;/span&gt; &lt;span class="n"&gt;type&lt;/span&gt; &lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="n"&gt;got&lt;/span&gt; &lt;span class="s2"&gt;&amp;quot;None&amp;quot;&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt; &lt;span class="n"&gt;expected&lt;/span&gt; &lt;span class="s2"&gt;&amp;quot;List[float]&amp;quot;&lt;/span&gt;&lt;span class="o"&gt;)&lt;/span&gt;
&lt;/pre&gt;&lt;/div&gt;


&lt;p&gt;This is what &lt;code&gt;Optional[T]&lt;/code&gt; is for: it declares that the type will be &lt;code&gt;T&lt;/code&gt; or &lt;code&gt;None&lt;/code&gt;:&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;randoms&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;num&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;int&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;-&amp;gt;&lt;/span&gt; &lt;span class="n"&gt;Optional&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="nb"&gt;list&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="nb"&gt;float&lt;/span&gt;&lt;span class="p"&gt;]]:&lt;/span&gt;
    &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;num&lt;/span&gt; &lt;span class="o"&gt;&amp;gt;=&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
        &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;random&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt; &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;_&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="nb"&gt;range&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;num&lt;/span&gt;&lt;span class="p"&gt;)]&lt;/span&gt;
    &lt;span class="k"&gt;else&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
        &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="bp"&gt;None&lt;/span&gt; &lt;span class="c1"&gt;# okay, return type is Optional[...]&lt;/span&gt;
&lt;/pre&gt;&lt;/div&gt;


&lt;p&gt;The same can also be achieved by using &lt;code&gt;Union[]&lt;/code&gt;:&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;randoms&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;num&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;int&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;-&amp;gt;&lt;/span&gt; &lt;span class="n"&gt;Union&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="nb"&gt;list&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="nb"&gt;float&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt; &lt;span class="bp"&gt;None&lt;/span&gt;&lt;span class="p"&gt;]:&lt;/span&gt;
    &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;num&lt;/span&gt; &lt;span class="o"&gt;&amp;gt;=&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
        &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;random&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt; &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;_&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="nb"&gt;range&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;num&lt;/span&gt;&lt;span class="p"&gt;)]&lt;/span&gt;
    &lt;span class="k"&gt;else&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
        &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="bp"&gt;None&lt;/span&gt; &lt;span class="c1"&gt;# okay, return type is Union[..., None]&lt;/span&gt;
&lt;/pre&gt;&lt;/div&gt;


&lt;p&gt;Note that &lt;code&gt;None&lt;/code&gt; as a type hint is a special case and is replaced by &lt;code&gt;type(None)&lt;/code&gt; by Python.&lt;/p&gt;
&lt;p&gt;What if we're a different kind of programmer, and we want to raise an exception instead of returning &lt;code&gt;None&lt;/code&gt;, like:&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;randoms&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;num&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;int&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;-&amp;gt;&lt;/span&gt; &lt;span class="nb"&gt;list&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="nb"&gt;float&lt;/span&gt;&lt;span class="p"&gt;]:&lt;/span&gt;
    &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;num&lt;/span&gt; &lt;span class="o"&gt;&amp;gt;=&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
        &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;random&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt; &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;_&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="nb"&gt;range&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;num&lt;/span&gt;&lt;span class="p"&gt;)]&lt;/span&gt;
    &lt;span class="k"&gt;else&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
        &lt;span class="k"&gt;raise&lt;/span&gt; &lt;span class="ne"&gt;ValueError&lt;/span&gt; &lt;span class="c1"&gt;# okay, there is no typed way to communicate raised exceptions&lt;/span&gt;
&lt;/pre&gt;&lt;/div&gt;


&lt;p&gt;This is okay, in Python there is no typed way to communicate raised exceptions.&lt;/p&gt;
&lt;p&gt;Finally, what if we want to return just a &lt;code&gt;float&lt;/code&gt; if the user asks for a single random number, and &lt;code&gt;None&lt;/code&gt; on negative input? &lt;code&gt;Union&lt;/code&gt; is the solution:&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;randoms&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;num&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;int&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;-&amp;gt;&lt;/span&gt; &lt;span class="n"&gt;Union&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="nb"&gt;float&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nb"&gt;list&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="nb"&gt;float&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt; &lt;span class="bp"&gt;None&lt;/span&gt;&lt;span class="p"&gt;]:&lt;/span&gt;
    &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;num&lt;/span&gt; &lt;span class="o"&gt;==&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
        &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;random&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;                       &lt;span class="c1"&gt;# float&lt;/span&gt;
    &lt;span class="k"&gt;elif&lt;/span&gt; &lt;span class="n"&gt;num&lt;/span&gt; &lt;span class="o"&gt;&amp;gt;=&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
        &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;random&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt; &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;_&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="nb"&gt;range&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;num&lt;/span&gt;&lt;span class="p"&gt;)]&lt;/span&gt; &lt;span class="c1"&gt;# list[float]&lt;/span&gt;
    &lt;span class="k"&gt;else&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
        &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="bp"&gt;None&lt;/span&gt;                           &lt;span class="c1"&gt;# None&lt;/span&gt;
&lt;/pre&gt;&lt;/div&gt;


&lt;p&gt;Note that as of Python 3.10, &lt;code&gt;Union[X, Y]&lt;/code&gt; can be written as &lt;code&gt;X | Y&lt;/code&gt;, but this does not work yet on Python 3.9:&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;randoms&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;num&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;int&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;-&amp;gt;&lt;/span&gt; &lt;span class="nb"&gt;float&lt;/span&gt; &lt;span class="o"&gt;|&lt;/span&gt; &lt;span class="nb"&gt;list&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="nb"&gt;float&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="o"&gt;|&lt;/span&gt; &lt;span class="bp"&gt;None&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
    &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;num&lt;/span&gt; &lt;span class="o"&gt;==&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
        &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;random&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;                       &lt;span class="c1"&gt;# float&lt;/span&gt;
    &lt;span class="k"&gt;elif&lt;/span&gt; &lt;span class="n"&gt;num&lt;/span&gt; &lt;span class="o"&gt;&amp;gt;=&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
        &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;random&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt; &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;_&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="nb"&gt;range&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;num&lt;/span&gt;&lt;span class="p"&gt;)]&lt;/span&gt; &lt;span class="c1"&gt;# list[float]&lt;/span&gt;
    &lt;span class="k"&gt;else&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
        &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="bp"&gt;None&lt;/span&gt;                           &lt;span class="c1"&gt;# None&lt;/span&gt;
&lt;/pre&gt;&lt;/div&gt;


&lt;h2&gt;Type aliases and &lt;code&gt;NewType&lt;/code&gt;&lt;/h2&gt;
&lt;p&gt;Suppose we are building a library for machine learning and we are using &lt;code&gt;list[float]&lt;/code&gt; for feature vectors. One way we can communicate this to the user of our library is by giving the arguments of our functions names like &lt;code&gt;feature_vector&lt;/code&gt;. We can also accomplish this in our typing by declaring an alias for &lt;code&gt;list[float]&lt;/code&gt;:&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;span class="n"&gt;FeatureVector&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nb"&gt;list&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="nb"&gt;float&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="c1"&gt;# type alias&lt;/span&gt;
&lt;/pre&gt;&lt;/div&gt;


&lt;p&gt;We can now write &lt;code&gt;FeatureVector&lt;/code&gt; interchangeably with &lt;code&gt;list[float]&lt;/code&gt;:&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;predict&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;fv&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;FeatureVector&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;-&amp;gt;&lt;/span&gt; &lt;span class="nb"&gt;float&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;random&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;

&lt;span class="n"&gt;fv&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;list&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="nb"&gt;float&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="mf"&gt;0.1&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mf"&gt;0.2&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mf"&gt;0.3&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;
&lt;span class="n"&gt;predict&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;fv&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="c1"&gt;# okay&lt;/span&gt;
&lt;span class="n"&gt;fv&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;FeatureVector&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="mf"&gt;0.1&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mf"&gt;0.2&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mf"&gt;0.3&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;
&lt;span class="n"&gt;predict&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;fv&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="c1"&gt;# okay&lt;/span&gt;
&lt;/pre&gt;&lt;/div&gt;


&lt;p&gt;Suppose we want to declare our types like with type aliases, but be more strict: we only want to accept &lt;code&gt;list[float]&lt;/code&gt;s that were explicitly declared to be &lt;code&gt;FeatureVector&lt;/code&gt;s. We can achieve this by using &lt;code&gt;NewType&lt;/code&gt;:&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;span class="n"&gt;FeatureVector&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;NewType&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;FeatureVector&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nb"&gt;list&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="nb"&gt;float&lt;/span&gt;&lt;span class="p"&gt;])&lt;/span&gt;
&lt;span class="c1"&gt;# all FeatureVectors are list[float], but not all list[float] are FeatureVectors&lt;/span&gt;

&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;predict&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;fv&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;FeatureVector&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;-&amp;gt;&lt;/span&gt; &lt;span class="nb"&gt;float&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;random&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;

&lt;span class="n"&gt;fv&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;list&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="nb"&gt;float&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="mf"&gt;0.1&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mf"&gt;0.2&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mf"&gt;0.3&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;
&lt;span class="n"&gt;predict&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;fv&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="c1"&gt;# not okay&lt;/span&gt;
&lt;span class="n"&gt;fv&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;FeatureVector&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;FeatureVector&lt;/span&gt;&lt;span class="p"&gt;([&lt;/span&gt;&lt;span class="mf"&gt;0.1&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mf"&gt;0.2&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mf"&gt;0.3&lt;/span&gt;&lt;span class="p"&gt;])&lt;/span&gt; &lt;span class="c1"&gt;# explicit cast&lt;/span&gt;
&lt;span class="n"&gt;predict&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;fv&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="c1"&gt;# okay&lt;/span&gt;
&lt;/pre&gt;&lt;/div&gt;


&lt;p&gt;Output:&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;span class="n"&gt;error&lt;/span&gt;&lt;span class="o"&gt;:&lt;/span&gt; &lt;span class="n"&gt;Argument&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt; &lt;span class="n"&gt;to&lt;/span&gt; &lt;span class="s2"&gt;&amp;quot;predict&amp;quot;&lt;/span&gt; &lt;span class="n"&gt;has&lt;/span&gt; &lt;span class="n"&gt;incompatible&lt;/span&gt; &lt;span class="n"&gt;type&lt;/span&gt; &lt;span class="s2"&gt;&amp;quot;List[float]&amp;quot;&lt;/span&gt;&lt;span class="o"&gt;;&lt;/span&gt; &lt;span class="n"&gt;expected&lt;/span&gt; &lt;span class="s2"&gt;&amp;quot;FeatureVector&amp;quot;&lt;/span&gt;
&lt;/pre&gt;&lt;/div&gt;


&lt;p&gt;In the above example, all &lt;code&gt;FeatureVector&lt;/code&gt;s are &lt;code&gt;list[float]&lt;/code&gt;, but not all &lt;code&gt;list[float]&lt;/code&gt; are &lt;code&gt;FeatureVector&lt;/code&gt;s. So any function that accepts a &lt;code&gt;list[float]&lt;/code&gt; will accept a &lt;code&gt;FeatureVector&lt;/code&gt;, but not the other way around.&lt;/p&gt;
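&lt;p&gt;The direction that does work can be sketched as follows (the &lt;code&gt;norm1&lt;/code&gt; helper is hypothetical, invented for illustration): a function annotated to take a plain &lt;code&gt;list[float]&lt;/code&gt; happily accepts a &lt;code&gt;FeatureVector&lt;/code&gt;:&lt;/p&gt;

```python
from typing import NewType

FeatureVector = NewType('FeatureVector', list[float])

def norm1(values: list[float]) -> float:
    # accepts any list[float], and therefore also a FeatureVector
    return sum(abs(v) for v in values)

fv = FeatureVector([0.1, -0.2, 0.3])
print(norm1(fv))  # okay for the type checker: FeatureVector is a subtype of list[float]
```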
&lt;h2&gt;Generics with &lt;code&gt;TypeVar&lt;/code&gt;&lt;/h2&gt;
&lt;p&gt;Suppose we want to write a function &lt;code&gt;first()&lt;/code&gt; which returns the first element of a list, and we want to declare that the list contains things of type &lt;code&gt;T&lt;/code&gt;, and the return type will be the same type &lt;code&gt;T&lt;/code&gt;. We can accomplish this with a &lt;code&gt;TypeVar&lt;/code&gt;:&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;span class="n"&gt;T&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;TypeVar&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;T&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="c1"&gt;# declare type variable T to be used&lt;/span&gt;
&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;first&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;li&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;list&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;T&lt;/span&gt;&lt;span class="p"&gt;])&lt;/span&gt; &lt;span class="o"&gt;-&amp;gt;&lt;/span&gt; &lt;span class="n"&gt;T&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;li&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;
&lt;/pre&gt;&lt;/div&gt;


&lt;p&gt;We can also mix &lt;code&gt;TypeVar&lt;/code&gt;s with &lt;code&gt;Optional&lt;/code&gt; to make &lt;code&gt;first()&lt;/code&gt; more useful:&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;span class="n"&gt;T&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;TypeVar&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;T&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="c1"&gt;# declare type variable T to be used&lt;/span&gt;
&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;first&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;li&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;list&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;T&lt;/span&gt;&lt;span class="p"&gt;])&lt;/span&gt; &lt;span class="o"&gt;-&amp;gt;&lt;/span&gt; &lt;span class="n"&gt;Optional&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;T&lt;/span&gt;&lt;span class="p"&gt;]:&lt;/span&gt;
    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;li&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="nb"&gt;len&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;li&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;&amp;gt;&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt; &lt;span class="k"&gt;else&lt;/span&gt; &lt;span class="bp"&gt;None&lt;/span&gt; &lt;span class="c1"&gt;# okay&lt;/span&gt;
&lt;/pre&gt;&lt;/div&gt;
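&lt;p&gt;Callers of this version then have to handle the &lt;code&gt;None&lt;/code&gt; case, which the type checker can verify via narrowing — a sketch:&lt;/p&gt;

```python
from typing import Optional, TypeVar

T = TypeVar('T')

def first(li: list[T]) -> Optional[T]:
    return li[0] if len(li) > 0 else None

x = first([])        # returns None instead of raising IndexError
y = first([10, 20])
if y is not None:    # the checker narrows Optional[int] to int here
    print(y + 1)
```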


&lt;p&gt;Note that we cannot bind a &lt;code&gt;TypeVar&lt;/code&gt; by usage. In the example below, we cannot bind &lt;code&gt;T&lt;/code&gt; to be &lt;code&gt;str&lt;/code&gt; (there is no "type solver"); the type checker reports an error:&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;span class="n"&gt;T&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;TypeVar&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;T&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="c1"&gt;# declare type variable T to be used&lt;/span&gt;
&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;first&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;li&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;list&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;T&lt;/span&gt;&lt;span class="p"&gt;])&lt;/span&gt; &lt;span class="o"&gt;-&amp;gt;&lt;/span&gt; &lt;span class="n"&gt;T&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="s2"&gt;&amp;quot;hello&amp;quot;&lt;/span&gt; &lt;span class="c1"&gt;# not okay&lt;/span&gt;
&lt;/pre&gt;&lt;/div&gt;


&lt;p&gt;The error is:&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;span class="n"&gt;error&lt;/span&gt;&lt;span class="o"&gt;:&lt;/span&gt; &lt;span class="n"&gt;Incompatible&lt;/span&gt; &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;value&lt;/span&gt; &lt;span class="n"&gt;type&lt;/span&gt; &lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="n"&gt;got&lt;/span&gt; &lt;span class="s2"&gt;&amp;quot;str&amp;quot;&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt; &lt;span class="n"&gt;expected&lt;/span&gt; &lt;span class="s2"&gt;&amp;quot;T&amp;quot;&lt;/span&gt;&lt;span class="o"&gt;)&lt;/span&gt;
&lt;/pre&gt;&lt;/div&gt;
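&lt;p&gt;If the function really always returns a &lt;code&gt;str&lt;/code&gt;, the fix is to annotate the return type as &lt;code&gt;str&lt;/code&gt; directly — &lt;code&gt;TypeVar&lt;/code&gt;s are only for types that vary with the call site. A sketch of the fixed variant:&lt;/p&gt;

```python
def first(li: list) -> str:  # hypothetical variant: the return type is fixed, no TypeVar needed
    return 'hello'

print(first([1, 2, 3]))
```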


&lt;h2&gt;Protocols&lt;/h2&gt;
&lt;p&gt;Let's look at another example, where we want to &lt;code&gt;add&lt;/code&gt; two things:&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;span class="n"&gt;T&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;TypeVar&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;T&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="c1"&gt;# declare type variable T to be used&lt;/span&gt;
&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;add&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;a&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;T&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;b&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;T&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;-&amp;gt;&lt;/span&gt; &lt;span class="n"&gt;T&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;a&lt;/span&gt;&lt;span class="o"&gt;+&lt;/span&gt;&lt;span class="n"&gt;b&lt;/span&gt; &lt;span class="c1"&gt;# checks for __add__()&lt;/span&gt;
&lt;/pre&gt;&lt;/div&gt;


&lt;p&gt;This results in a type error, because the type checker doesn't know whether &lt;code&gt;T&lt;/code&gt; implements &lt;code&gt;__add__()&lt;/code&gt;:&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;span class="n"&gt;error&lt;/span&gt;&lt;span class="o"&gt;:&lt;/span&gt; &lt;span class="n"&gt;Unsupported&lt;/span&gt; &lt;span class="n"&gt;left&lt;/span&gt; &lt;span class="n"&gt;operand&lt;/span&gt; &lt;span class="n"&gt;type&lt;/span&gt; &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="o"&gt;+&lt;/span&gt; &lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="s2"&gt;&amp;quot;T&amp;quot;&lt;/span&gt;&lt;span class="o"&gt;)&lt;/span&gt;
&lt;/pre&gt;&lt;/div&gt;


&lt;p&gt;To achieve the desired typing, we have to use &lt;code&gt;Protocol&lt;/code&gt;s:&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;span class="k"&gt;class&lt;/span&gt; &lt;span class="nc"&gt;Addable&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;Protocol&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
  &lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="fm"&gt;__add__&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="bp"&gt;self&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;other&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt; &lt;span class="c1"&gt;# anything that declares __add__() can stand in for an Addable&lt;/span&gt;
    &lt;span class="k"&gt;raise&lt;/span&gt; &lt;span class="ne"&gt;NotImplementedError&lt;/span&gt;
&lt;/pre&gt;&lt;/div&gt;


&lt;p&gt;Here we are declaring a class &lt;code&gt;Addable&lt;/code&gt; using &lt;code&gt;typing.Protocol&lt;/code&gt;, which declares &lt;code&gt;__add__()&lt;/code&gt;. 
Anything that declares &lt;code&gt;__add__()&lt;/code&gt; can stand in for an &lt;code&gt;Addable&lt;/code&gt;, even if it's not descended from &lt;code&gt;Addable&lt;/code&gt;. For example, an &lt;code&gt;int&lt;/code&gt; is an &lt;code&gt;Addable&lt;/code&gt;. Examples:&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;add&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;a&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;Addable&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;b&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;Addable&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;-&amp;gt;&lt;/span&gt; &lt;span class="n"&gt;Addable&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
    &lt;span class="k"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nb"&gt;type&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;a&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt; &lt;span class="nb"&gt;type&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;b&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;
    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;a&lt;/span&gt;&lt;span class="o"&gt;+&lt;/span&gt;&lt;span class="n"&gt;b&lt;/span&gt; &lt;span class="c1"&gt;# checks for __add__()&lt;/span&gt;

&lt;span class="n"&gt;add&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nb"&gt;str&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt; &lt;span class="nb"&gt;str&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;2&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;   &lt;span class="c1"&gt;# okay&lt;/span&gt;
&lt;span class="n"&gt;add&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nb"&gt;int&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt; &lt;span class="nb"&gt;int&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;2&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;   &lt;span class="c1"&gt;# okay&lt;/span&gt;
&lt;span class="n"&gt;add&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nb"&gt;int&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt; &lt;span class="nb"&gt;float&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;2&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt; &lt;span class="c1"&gt;# okay&lt;/span&gt;
&lt;span class="n"&gt;add&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nb"&gt;int&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt; &lt;span class="nb"&gt;str&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;2&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;   &lt;span class="c1"&gt;# not a typecheck error, but a runtime error&lt;/span&gt;
&lt;/pre&gt;&lt;/div&gt;


&lt;p&gt;Output:&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&amp;lt;class &amp;#39;str&amp;#39;&amp;gt; &amp;lt;class &amp;#39;str&amp;#39;&amp;gt;
&amp;lt;class &amp;#39;int&amp;#39;&amp;gt; &amp;lt;class &amp;#39;int&amp;#39;&amp;gt;
&amp;lt;class &amp;#39;int&amp;#39;&amp;gt; &amp;lt;class &amp;#39;float&amp;#39;&amp;gt;
&amp;lt;class &amp;#39;int&amp;#39;&amp;gt; &amp;lt;class &amp;#39;str&amp;#39;&amp;gt;
TypeError: unsupported operand type(s) for +: &amp;#39;int&amp;#39; and &amp;#39;str&amp;#39; # coming from the last add()
&lt;/pre&gt;&lt;/div&gt;


&lt;p&gt;All 4 of these pass the type checks, because &lt;code&gt;str&lt;/code&gt;, &lt;code&gt;int&lt;/code&gt; and &lt;code&gt;float&lt;/code&gt; are all &lt;code&gt;Addable&lt;/code&gt;, since they have &lt;code&gt;__add__()&lt;/code&gt;. The last one will raise a run-time exception, since &lt;code&gt;+&lt;/code&gt; doesn't work implicitly for &lt;code&gt;int&lt;/code&gt; and &lt;code&gt;str&lt;/code&gt;. Note that this is a runtime exception coming from running the code, not a type error — the type checker did not raise any errors.&lt;/p&gt;
&lt;p&gt;There are 2 ways we can think about this mini-problem:&lt;/p&gt;
&lt;ol&gt;
&lt;li&gt;We want to allow adding of 2 different types (eg. &lt;code&gt;int&lt;/code&gt; and &lt;code&gt;float&lt;/code&gt;), but only if it makes sense. We want the type checker to raise an error for cases when a runtime exception would be raised (eg. adding &lt;code&gt;int&lt;/code&gt; and &lt;code&gt;str&lt;/code&gt;).&lt;/li&gt;
&lt;li&gt;We only want to allow adding of exactly the same types, eg. &lt;code&gt;int, int&lt;/code&gt;, &lt;code&gt;float, float&lt;/code&gt;, &lt;code&gt;str, str&lt;/code&gt;. We will see that this is not achievable in Python with generic types.&lt;/li&gt;
&lt;/ol&gt;
&lt;p&gt;Let's look at another version of this, where we declare a &lt;code&gt;TypeVar&lt;/code&gt; and bind it to be &lt;code&gt;Addable&lt;/code&gt;:&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;span class="n"&gt;T&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;TypeVar&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;T&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;bound&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;Addable&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;add&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;a&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;T&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;b&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;T&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;-&amp;gt;&lt;/span&gt; &lt;span class="n"&gt;T&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
    &lt;span class="k"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nb"&gt;type&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;a&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt; &lt;span class="nb"&gt;type&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;b&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;
    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;a&lt;/span&gt;&lt;span class="o"&gt;+&lt;/span&gt;&lt;span class="n"&gt;b&lt;/span&gt; &lt;span class="c1"&gt;# checks for __add__()&lt;/span&gt;

&lt;span class="n"&gt;add&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nb"&gt;str&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt; &lt;span class="nb"&gt;str&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;2&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;   &lt;span class="c1"&gt;# okay&lt;/span&gt;
&lt;span class="n"&gt;add&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nb"&gt;int&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt; &lt;span class="nb"&gt;int&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;2&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;   &lt;span class="c1"&gt;# okay&lt;/span&gt;
&lt;span class="n"&gt;add&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nb"&gt;int&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt; &lt;span class="nb"&gt;float&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;2&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt; &lt;span class="c1"&gt;# okay&lt;/span&gt;
&lt;span class="n"&gt;add&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nb"&gt;int&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt; &lt;span class="nb"&gt;str&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;2&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;  &lt;span class="c1"&gt;# typecheck error and runtime error&lt;/span&gt;
&lt;/pre&gt;&lt;/div&gt;


&lt;p&gt;Here, the last line raises a type check error and a runtime error:&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;error: Value of type variable &amp;quot;T&amp;quot; of &amp;quot;add&amp;quot; cannot be &amp;quot;object&amp;quot;  # typecheck error coming from the last add()
...
TypeError: unsupported operand type(s) for +: &amp;#39;int&amp;#39; and &amp;#39;str&amp;#39;  # runtime error coming from the last add()
&lt;/pre&gt;&lt;/div&gt;


&lt;p&gt;Note that the third &lt;code&gt;int, float&lt;/code&gt; version still runs fine. So this version implements case 1. above, where different types can still be passed, as long as addition makes sense for them.&lt;/p&gt;
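&lt;p&gt;The bound works structurally for user-defined classes too: any class with an &lt;code&gt;__add__()&lt;/code&gt; satisfies &lt;code&gt;bound=Addable&lt;/code&gt;, without inheriting from it. A sketch (the &lt;code&gt;Vec&lt;/code&gt; class is made up for illustration):&lt;/p&gt;

```python
from typing import Protocol, TypeVar

class Addable(Protocol):
    def __add__(self, other): ...

T = TypeVar('T', bound=Addable)

def add(a: T, b: T) -> T:
    return a + b

class Vec:  # hypothetical class; note it does not inherit from Addable
    def __init__(self, x, y):
        self.x, self.y = x, y
    def __add__(self, other):
        return Vec(self.x + other.x, self.y + other.y)

v = add(Vec(1, 2), Vec(3, 4))  # okay: Vec has __add__(), so it satisfies the bound
print(v.x, v.y)
```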
&lt;p&gt;One last attempt could be to use a constrained &lt;code&gt;TypeVar&lt;/code&gt;, where we limit ourselves to certain types that can stand in for &lt;code&gt;T&lt;/code&gt;. But as before, the type checker doesn't force the two instances of &lt;code&gt;T&lt;/code&gt; to be the same type:&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;span class="n"&gt;T&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;TypeVar&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;T&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nb"&gt;int&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nb"&gt;float&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nb"&gt;str&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;add&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;a&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;T&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;b&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;T&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;-&amp;gt;&lt;/span&gt; &lt;span class="n"&gt;T&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;a&lt;/span&gt;&lt;span class="o"&gt;+&lt;/span&gt;&lt;span class="n"&gt;b&lt;/span&gt; &lt;span class="c1"&gt;# checks for __add__()&lt;/span&gt;

&lt;span class="n"&gt;add&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nb"&gt;int&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt; &lt;span class="nb"&gt;float&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;2&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt; &lt;span class="c1"&gt;# okay&lt;/span&gt;
&lt;/pre&gt;&lt;/div&gt;


&lt;p&gt;It turns out &lt;a href="https://stackoverflow.com/questions/40033093/how-can-i-ensure-that-arguments-have-same-type-without-listing-the-types-explici"&gt;we cannot use binding to get case 2. above&lt;/a&gt;, ie. to force the type checker to make sure that the &lt;code&gt;a&lt;/code&gt; and &lt;code&gt;b&lt;/code&gt; arguments are actually the same type in &lt;code&gt;add(a, b)&lt;/code&gt;.&lt;/p&gt;
&lt;h2&gt;Conclusion&lt;/h2&gt;
&lt;p&gt;In the next article I will look at more uses of protocols and abstract base classes.&lt;/p&gt;</content><category term="python"></category><category term="types"></category></entry><entry><title>Python types for Data Scientists - Part I</title><link href="https://bytepawn.com/python-types-for-data-scientists.html" rel="alternate"></link><published>2022-04-08T00:00:00+02:00</published><updated>2022-04-08T00:00:00+02:00</updated><author><name>Marton Trencseni</name></author><id>tag:bytepawn.com,2022-04-08:/python-types-for-data-scientists.html</id><summary type="html">&lt;p&gt;I show how to use basic type hints and get type checking working in ipython notebooks.&lt;br/&gt;&lt;br/&gt; &lt;img src="/images/python-data-types.jpeg" alt="Python types for Data Scientists" style="width: 200px;"/&gt;&lt;/p&gt;</summary><content type="html">&lt;h2&gt;Introduction&lt;/h2&gt;
&lt;p&gt;Python supports types, and since 3.5, type annotations and type checking. Let's see:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;how to turn on type checking in ipython notebooks&lt;/li&gt;
&lt;li&gt;how to write type hints&lt;/li&gt;
&lt;li&gt;some simple examples&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;The &lt;a href="https://docs.python.org/3/library/typing.html"&gt;official documentation is here&lt;/a&gt;. The &lt;a href="https://github.com/mtrencseni/playground/blob/master/Types.ipynb"&gt;ipython notebook is up on Github&lt;/a&gt;.&lt;/p&gt;
&lt;p&gt;&lt;img src="https://pynative.com/wp-content/uploads/2021/02/python-data-types.jpg" alt="Python data types." style="width: 300px;"/&gt;&lt;/p&gt;
&lt;h2&gt;Type and type expressions&lt;/h2&gt;
&lt;p&gt;Let's use the built-in &lt;code&gt;type&lt;/code&gt; function to get the types of some common Python expressions:&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;span class="nb"&gt;type&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;42&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt; &lt;span class="nb"&gt;type&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nb"&gt;int&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;42&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;)),&lt;/span&gt; &lt;span class="nb"&gt;type&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;hello&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt; &lt;span class="nb"&gt;type&lt;/span&gt;&lt;span class="p"&gt;([]),&lt;/span&gt; &lt;span class="nb"&gt;type&lt;/span&gt;&lt;span class="p"&gt;({}),&lt;/span&gt; &lt;span class="nb"&gt;type&lt;/span&gt;&lt;span class="p"&gt;({&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;2&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;3&lt;/span&gt;&lt;span class="p"&gt;}),&lt;/span&gt; &lt;span class="nb"&gt;type&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="bp"&gt;None&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/pre&gt;&lt;/div&gt;


&lt;p&gt;Output:&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nb"&gt;int&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nb"&gt;int&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nb"&gt;str&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nb"&gt;list&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nb"&gt;dict&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nb"&gt;set&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;NoneType&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/pre&gt;&lt;/div&gt;


&lt;p&gt;Let's get more abstract:&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;span class="nb"&gt;type&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nb"&gt;int&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt; &lt;span class="nb"&gt;type&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nb"&gt;type&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;42&lt;/span&gt;&lt;span class="p"&gt;)),&lt;/span&gt; &lt;span class="nb"&gt;type&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nb"&gt;str&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt; &lt;span class="nb"&gt;type&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;42&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;==&lt;/span&gt; &lt;span class="nb"&gt;type&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nb"&gt;int&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt; &lt;span class="nb"&gt;type&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nb"&gt;int&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;==&lt;/span&gt; &lt;span class="nb"&gt;type&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nb"&gt;str&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/pre&gt;&lt;/div&gt;


&lt;p&gt;Output:&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nb"&gt;type&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nb"&gt;type&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nb"&gt;type&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="bp"&gt;False&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="bp"&gt;True&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/pre&gt;&lt;/div&gt;
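&lt;p&gt;In other words, classes are themselves objects whose type is the metaclass &lt;code&gt;type&lt;/code&gt; — which is its own type:&lt;/p&gt;

```python
# types are themselves objects; their type is the metaclass `type`,
# and `type` is its own type
a = type(42) is int
b = type(int) is type
c = type(type) is type
print(a, b, c)
```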


&lt;h2&gt;Dynamic vs static typing&lt;/h2&gt;
&lt;p&gt;Python, by default, is dynamically typed. This means that variables can start off as an &lt;code&gt;int&lt;/code&gt;, and become a &lt;code&gt;str&lt;/code&gt;, a &lt;code&gt;list&lt;/code&gt; or a &lt;code&gt;NoneType&lt;/code&gt; later:&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;span class="n"&gt;age&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;42&lt;/span&gt;
&lt;span class="n"&gt;age&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="s1"&gt;&amp;#39;hello&amp;#39;&lt;/span&gt;
&lt;span class="n"&gt;age&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="mi"&gt;42&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s1"&gt;&amp;#39;hello&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;
&lt;span class="n"&gt;age&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="bp"&gt;None&lt;/span&gt;
&lt;/pre&gt;&lt;/div&gt;


&lt;p&gt;However, we can move towards static typing by annotating the variable with a type:&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;span class="n"&gt;age&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;int&lt;/span&gt;
&lt;/pre&gt;&lt;/div&gt;


&lt;p&gt;By default, Python treats the &lt;code&gt;: int&lt;/code&gt; as an annotation and ignores it:&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;span class="n"&gt;age&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;int&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;42&lt;/span&gt;
&lt;span class="n"&gt;age&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="s1"&gt;&amp;#39;hello&amp;#39;&lt;/span&gt;       &lt;span class="c1"&gt;# no problem since type checking is off&lt;/span&gt;
&lt;span class="n"&gt;age&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="mi"&gt;42&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s1"&gt;&amp;#39;hello&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="c1"&gt;# no problem since type checking is off&lt;/span&gt;
&lt;span class="n"&gt;age&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="bp"&gt;None&lt;/span&gt;          &lt;span class="c1"&gt;# no problem since type checking is off&lt;/span&gt;
&lt;/pre&gt;&lt;/div&gt;


&lt;p&gt;We can load a static type checking extension so that the type annotations are enforced. First we need to &lt;code&gt;pip install nb_mypy&lt;/code&gt;, then we can enable type checking in ipython notebooks:&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;span class="o"&gt;%&lt;/span&gt;&lt;span class="n"&gt;load_ext&lt;/span&gt; &lt;span class="n"&gt;nb_mypy&lt;/span&gt;
&lt;span class="o"&gt;%&lt;/span&gt;&lt;span class="n"&gt;nb_mypy&lt;/span&gt; &lt;span class="n"&gt;On&lt;/span&gt;
&lt;/pre&gt;&lt;/div&gt;


&lt;p&gt;Once we do this, type checking runs each time a cell is executed:&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;span class="n"&gt;age&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="nb"&gt;int&lt;/span&gt;
&lt;span class="n"&gt;age&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;42&lt;/span&gt;            &lt;span class="c1"&gt;# okay&lt;/span&gt;
&lt;span class="n"&gt;age&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="s1"&gt;&amp;#39;hello&amp;#39;&lt;/span&gt;       &lt;span class="c1"&gt;# not okay&lt;/span&gt;
&lt;span class="n"&gt;age&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="mi"&gt;42&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s1"&gt;&amp;#39;hello&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="c1"&gt;# not okay&lt;/span&gt;
&lt;span class="n"&gt;age&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="bp"&gt;None&lt;/span&gt;          &lt;span class="c1"&gt;# not okay&lt;/span&gt;
&lt;/pre&gt;&lt;/div&gt;


&lt;p&gt;The output will be:&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;span class="n"&gt;error&lt;/span&gt;&lt;span class="o"&gt;:&lt;/span&gt; &lt;span class="n"&gt;Incompatible&lt;/span&gt; &lt;span class="n"&gt;types&lt;/span&gt; &lt;span class="k"&gt;in&lt;/span&gt; &lt;span class="n"&gt;assignment&lt;/span&gt; &lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="n"&gt;expression&lt;/span&gt; &lt;span class="n"&gt;has&lt;/span&gt; &lt;span class="n"&gt;type&lt;/span&gt; &lt;span class="s2"&gt;&amp;quot;str&amp;quot;&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt; &lt;span class="n"&gt;variable&lt;/span&gt; &lt;span class="n"&gt;has&lt;/span&gt; &lt;span class="n"&gt;type&lt;/span&gt; &lt;span class="s2"&gt;&amp;quot;int&amp;quot;&lt;/span&gt;&lt;span class="o"&gt;)&lt;/span&gt;
&lt;span class="n"&gt;error&lt;/span&gt;&lt;span class="o"&gt;:&lt;/span&gt; &lt;span class="n"&gt;Incompatible&lt;/span&gt; &lt;span class="n"&gt;types&lt;/span&gt; &lt;span class="k"&gt;in&lt;/span&gt; &lt;span class="n"&gt;assignment&lt;/span&gt; &lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="n"&gt;expression&lt;/span&gt; &lt;span class="n"&gt;has&lt;/span&gt; &lt;span class="n"&gt;type&lt;/span&gt; &lt;span class="s2"&gt;&amp;quot;List[object]&amp;quot;&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt; &lt;span class="n"&gt;variable&lt;/span&gt; &lt;span class="n"&gt;has&lt;/span&gt; &lt;span class="n"&gt;type&lt;/span&gt; &lt;span class="s2"&gt;&amp;quot;int&amp;quot;&lt;/span&gt;&lt;span class="o"&gt;)&lt;/span&gt;
&lt;span class="n"&gt;error&lt;/span&gt;&lt;span class="o"&gt;:&lt;/span&gt; &lt;span class="n"&gt;Incompatible&lt;/span&gt; &lt;span class="n"&gt;types&lt;/span&gt; &lt;span class="k"&gt;in&lt;/span&gt; &lt;span class="n"&gt;assignment&lt;/span&gt; &lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="n"&gt;expression&lt;/span&gt; &lt;span class="n"&gt;has&lt;/span&gt; &lt;span class="n"&gt;type&lt;/span&gt; &lt;span class="s2"&gt;&amp;quot;None&amp;quot;&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt; &lt;span class="n"&gt;variable&lt;/span&gt; &lt;span class="n"&gt;has&lt;/span&gt; &lt;span class="n"&gt;type&lt;/span&gt; &lt;span class="s2"&gt;&amp;quot;int&amp;quot;&lt;/span&gt;&lt;span class="o"&gt;)&lt;/span&gt;
&lt;/pre&gt;&lt;/div&gt;


&lt;h2&gt;Container types&lt;/h2&gt;
&lt;p&gt;Containers can also be typed:&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;span class="n"&gt;l&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;list&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="nb"&gt;int&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;
&lt;span class="n"&gt;l&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;2&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;3&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;    &lt;span class="c1"&gt;# okay&lt;/span&gt;
&lt;span class="n"&gt;l&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt;            &lt;span class="c1"&gt;# not okay!&lt;/span&gt;
&lt;span class="n"&gt;l&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s2"&gt;&amp;quot;hello&amp;quot;&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="c1"&gt;# not okay!&lt;/span&gt;
&lt;/pre&gt;&lt;/div&gt;


&lt;p&gt;Output:&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;span class="n"&gt;error&lt;/span&gt;&lt;span class="o"&gt;:&lt;/span&gt; &lt;span class="n"&gt;Incompatible&lt;/span&gt; &lt;span class="n"&gt;types&lt;/span&gt; &lt;span class="k"&gt;in&lt;/span&gt; &lt;span class="n"&gt;assignment&lt;/span&gt; &lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="n"&gt;expression&lt;/span&gt; &lt;span class="n"&gt;has&lt;/span&gt; &lt;span class="n"&gt;type&lt;/span&gt; &lt;span class="s2"&gt;&amp;quot;int&amp;quot;&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt; &lt;span class="n"&gt;variable&lt;/span&gt; &lt;span class="n"&gt;has&lt;/span&gt; &lt;span class="n"&gt;type&lt;/span&gt; &lt;span class="s2"&gt;&amp;quot;List[int]&amp;quot;&lt;/span&gt;&lt;span class="o"&gt;)&lt;/span&gt;
&lt;span class="n"&gt;error&lt;/span&gt;&lt;span class="o"&gt;:&lt;/span&gt; &lt;span class="n"&gt;List&lt;/span&gt; &lt;span class="n"&gt;item&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt; &lt;span class="n"&gt;has&lt;/span&gt; &lt;span class="n"&gt;incompatible&lt;/span&gt; &lt;span class="n"&gt;type&lt;/span&gt; &lt;span class="s2"&gt;&amp;quot;str&amp;quot;&lt;/span&gt;&lt;span class="o"&gt;;&lt;/span&gt; &lt;span class="n"&gt;expected&lt;/span&gt; &lt;span class="s2"&gt;&amp;quot;int&amp;quot;&lt;/span&gt;
&lt;/pre&gt;&lt;/div&gt;


&lt;p&gt;Or:&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;span class="n"&gt;l&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;list&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="nb"&gt;int&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[]&lt;/span&gt;
&lt;span class="n"&gt;l&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;append&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;       &lt;span class="c1"&gt;# okay&lt;/span&gt;
&lt;span class="n"&gt;l&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;append&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s2"&gt;&amp;quot;hello&amp;quot;&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="c1"&gt;# not okay!&lt;/span&gt;
&lt;/pre&gt;&lt;/div&gt;


&lt;p&gt;Output:&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;span class="n"&gt;error&lt;/span&gt;&lt;span class="o"&gt;:&lt;/span&gt; &lt;span class="n"&gt;Argument&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt; &lt;span class="n"&gt;to&lt;/span&gt; &lt;span class="s2"&gt;&amp;quot;append&amp;quot;&lt;/span&gt; &lt;span class="n"&gt;of&lt;/span&gt; &lt;span class="s2"&gt;&amp;quot;list&amp;quot;&lt;/span&gt; &lt;span class="n"&gt;has&lt;/span&gt; &lt;span class="n"&gt;incompatible&lt;/span&gt; &lt;span class="n"&gt;type&lt;/span&gt; &lt;span class="s2"&gt;&amp;quot;str&amp;quot;&lt;/span&gt;&lt;span class="o"&gt;;&lt;/span&gt; &lt;span class="n"&gt;expected&lt;/span&gt; &lt;span class="s2"&gt;&amp;quot;int&amp;quot;&lt;/span&gt;
&lt;/pre&gt;&lt;/div&gt;


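&lt;p&gt;A side note, as a minimal sketch assuming default mypy settings: even without an explicit annotation, mypy infers the list's element type from the initializer, so the annotation above mostly documents intent:&lt;/p&gt;

```python
# without an annotation, mypy infers m: list[int] from the initializer
m = [1, 2, 3]
m.append(4)        # okay
m.append("hello")  # not okay: flagged by mypy, though it runs at runtime
```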
&lt;h2&gt;Functions&lt;/h2&gt;
&lt;p&gt;Functions can also carry type annotations:&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;f&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;i&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="nb"&gt;int&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;-&amp;gt;&lt;/span&gt; &lt;span class="nb"&gt;list&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="nb"&gt;str&lt;/span&gt;&lt;span class="p"&gt;]:&lt;/span&gt;
    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;hello&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;

&lt;span class="n"&gt;f&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;42&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;          &lt;span class="c1"&gt;# okay&lt;/span&gt;
&lt;span class="n"&gt;f&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;hello&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;     &lt;span class="c1"&gt;# not okay&lt;/span&gt;
&lt;span class="n"&gt;r&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;int&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;f&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;42&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="c1"&gt;# not okay&lt;/span&gt;
&lt;/pre&gt;&lt;/div&gt;


&lt;p&gt;Output:&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;span class="n"&gt;error&lt;/span&gt;&lt;span class="o"&gt;:&lt;/span&gt; &lt;span class="n"&gt;Argument&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt; &lt;span class="n"&gt;to&lt;/span&gt; &lt;span class="s2"&gt;&amp;quot;f&amp;quot;&lt;/span&gt; &lt;span class="n"&gt;has&lt;/span&gt; &lt;span class="n"&gt;incompatible&lt;/span&gt; &lt;span class="n"&gt;type&lt;/span&gt; &lt;span class="s2"&gt;&amp;quot;str&amp;quot;&lt;/span&gt;&lt;span class="o"&gt;;&lt;/span&gt; &lt;span class="n"&gt;expected&lt;/span&gt; &lt;span class="s2"&gt;&amp;quot;int&amp;quot;&lt;/span&gt;
&lt;span class="n"&gt;error&lt;/span&gt;&lt;span class="o"&gt;:&lt;/span&gt; &lt;span class="n"&gt;Incompatible&lt;/span&gt; &lt;span class="n"&gt;types&lt;/span&gt; &lt;span class="k"&gt;in&lt;/span&gt; &lt;span class="n"&gt;assignment&lt;/span&gt; &lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="n"&gt;expression&lt;/span&gt; &lt;span class="n"&gt;has&lt;/span&gt; &lt;span class="n"&gt;type&lt;/span&gt; &lt;span class="s2"&gt;&amp;quot;List[str]&amp;quot;&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt; &lt;span class="n"&gt;variable&lt;/span&gt; &lt;span class="n"&gt;has&lt;/span&gt; &lt;span class="n"&gt;type&lt;/span&gt; &lt;span class="s2"&gt;&amp;quot;int&amp;quot;&lt;/span&gt;&lt;span class="o"&gt;)&lt;/span&gt;
&lt;/pre&gt;&lt;/div&gt;

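&lt;p&gt;Default parameter values are checked against the annotations too; a small sketch (the &lt;code&gt;repeat&lt;/code&gt; function is a made-up example):&lt;/p&gt;

```python
# the default value of n must itself be an int, and the return
# expression must match list[str]
def repeat(s: str, n: int = 2) -> list[str]:
    return [s] * n

repeat("hi")     # okay, returns ["hi", "hi"]
repeat("hi", 3)  # okay
```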

&lt;h2&gt;Classes&lt;/h2&gt;
&lt;p&gt;Class member variables:&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;span class="k"&gt;class&lt;/span&gt; &lt;span class="nc"&gt;A&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
    &lt;span class="n"&gt;memberVariable&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;int&lt;/span&gt;
    &lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="fm"&gt;__init__&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="bp"&gt;self&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;i&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="nb"&gt;int&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;-&amp;gt;&lt;/span&gt; &lt;span class="bp"&gt;None&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
        &lt;span class="bp"&gt;self&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;memberVariable&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;i&lt;/span&gt;
    &lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;increase&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="bp"&gt;self&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;-&amp;gt;&lt;/span&gt; &lt;span class="nb"&gt;int&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
        &lt;span class="bp"&gt;self&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;memberVariable&lt;/span&gt; &lt;span class="o"&gt;+=&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt;
        &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="bp"&gt;self&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;memberVariable&lt;/span&gt;
&lt;/pre&gt;&lt;/div&gt;


&lt;p&gt;Then:&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;span class="n"&gt;a&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;A&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;42&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;           &lt;span class="c1"&gt;# okay&lt;/span&gt;
&lt;span class="n"&gt;a&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;increase&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;        &lt;span class="c1"&gt;# okay&lt;/span&gt;
&lt;span class="n"&gt;a&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;A&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;hello&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;      &lt;span class="c1"&gt;# not okay&lt;/span&gt;
&lt;span class="bp"&gt;None&lt;/span&gt; &lt;span class="o"&gt;+&lt;/span&gt; &lt;span class="n"&gt;a&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;increase&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt; &lt;span class="c1"&gt;# not okay&lt;/span&gt;
&lt;/pre&gt;&lt;/div&gt;


&lt;p&gt;Output:&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;span class="n"&gt;error&lt;/span&gt;&lt;span class="o"&gt;:&lt;/span&gt; &lt;span class="n"&gt;Argument&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt; &lt;span class="n"&gt;to&lt;/span&gt; &lt;span class="s2"&gt;&amp;quot;A&amp;quot;&lt;/span&gt; &lt;span class="n"&gt;has&lt;/span&gt; &lt;span class="n"&gt;incompatible&lt;/span&gt; &lt;span class="n"&gt;type&lt;/span&gt; &lt;span class="s2"&gt;&amp;quot;str&amp;quot;&lt;/span&gt;&lt;span class="o"&gt;;&lt;/span&gt; &lt;span class="n"&gt;expected&lt;/span&gt; &lt;span class="s2"&gt;&amp;quot;int&amp;quot;&lt;/span&gt;
&lt;span class="n"&gt;error&lt;/span&gt;&lt;span class="o"&gt;:&lt;/span&gt; &lt;span class="n"&gt;Unsupported&lt;/span&gt; &lt;span class="n"&gt;operand&lt;/span&gt; &lt;span class="n"&gt;types&lt;/span&gt; &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="o"&gt;+&lt;/span&gt; &lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="s2"&gt;&amp;quot;None&amp;quot;&lt;/span&gt; &lt;span class="n"&gt;and&lt;/span&gt; &lt;span class="s2"&gt;&amp;quot;int&amp;quot;&lt;/span&gt;&lt;span class="o"&gt;)&lt;/span&gt;
&lt;/pre&gt;&lt;/div&gt;

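&lt;p&gt;User-defined classes also work as annotations in their own right; a minimal sketch (the &lt;code&gt;bump_twice&lt;/code&gt; helper is a made-up example):&lt;/p&gt;

```python
class A:
    def __init__(self, i: int) -> None:
        self.memberVariable: int = i
    def increase(self) -> int:
        self.memberVariable += 1
        return self.memberVariable

# mypy checks that the argument is an A, and that the sum of the
# two int return values matches the declared int return type
def bump_twice(a: A) -> int:
    return a.increase() + a.increase()

bump_twice(A(0))   # okay, returns 3
```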

&lt;h2&gt;Conclusion&lt;/h2&gt;
&lt;p&gt;In the next part, I will look at more complicated type expressions, such as &lt;code&gt;Optional&lt;/code&gt;.&lt;/p&gt;</content><category term="python"></category><category term="types"></category></entry><entry><title>Solving 5 algorithmic interview questions</title><link href="https://bytepawn.com/solving-5-algorithmic-interview-questions.html" rel="alternate"></link><published>2022-03-26T00:00:00+01:00</published><updated>2022-03-26T00:00:00+01:00</updated><author><name>Marton Trencseni</name></author><id>tag:bytepawn.com,2022-03-26:/solving-5-algorithmic-interview-questions.html</id><summary type="html">&lt;p&gt;Recently I was considering whether to introduce some CS style algorithmic interview questions into our Data Science hiring loop, since having an understanding of algorithms and data structures can be useful for Data Scientists. Not having done this soft of interview for a few years I picked up my copy of &lt;a href="https://www.amazon.com/Daily-Coding-Problem-exceptionally-interviews/dp/1793296634"&gt;Daily Coding Problem&lt;/a&gt; and starting solving a few problems to refresh my feeling for what it feels like as a candidate, and whether it would give us any useful signals.&lt;br/&gt;&lt;br/&gt; &lt;img src="https://d1w7fb2mkkr3kw.cloudfront.net/assets/images/book/lrg/9781/7932/9781793296634.jpg" alt="Daily coding problem" style="width: 200px;"/&gt;&lt;/p&gt;</summary><content type="html">&lt;h2&gt;Introduction&lt;/h2&gt;
&lt;p&gt;Recently I was considering whether to introduce some CS-style algorithmic interview questions into our Data Science hiring loop, since an understanding of algorithms and data structures can be useful for Data Scientists. Not having done this sort of interview for a few years, I picked up my copy of &lt;a href="https://www.amazon.com/Daily-Coding-Problem-exceptionally-interviews/dp/1793296634"&gt;Daily Coding Problem&lt;/a&gt; and started solving a few problems to refresh my sense of what it feels like as a candidate, and to see whether it would give us any useful signals.&lt;/p&gt;
&lt;p&gt;&lt;img src="https://d1w7fb2mkkr3kw.cloudfront.net/assets/images/book/lrg/9781/7932/9781793296634.jpg" alt="Daily coding problem" style="width: 200px;"/&gt;&lt;/p&gt;
&lt;h2&gt;Arrays #1&lt;/h2&gt;
&lt;p&gt;Given an input array &lt;code&gt;li&lt;/code&gt; of numbers, return an array where the ith element is the product of all elements of &lt;code&gt;li&lt;/code&gt; except the ith.&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;span class="c1"&gt;# easy, with division&lt;/span&gt;
&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;simple&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;li&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
    &lt;span class="n"&gt;product&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nb"&gt;reduce&lt;/span&gt;&lt;span class="p"&gt;((&lt;/span&gt;&lt;span class="k"&gt;lambda&lt;/span&gt; &lt;span class="n"&gt;x&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;y&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;x&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt; &lt;span class="n"&gt;y&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt; &lt;span class="n"&gt;li&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="nb"&gt;int&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;product&lt;/span&gt;&lt;span class="o"&gt;/&lt;/span&gt;&lt;span class="n"&gt;i&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;i&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;li&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;

&lt;span class="c1"&gt;# tricky, without divisions&lt;/span&gt;
&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;tricky&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;li&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
    &lt;span class="n"&gt;befs&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;deepcopy&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;li&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;i&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="nb"&gt;range&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nb"&gt;len&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;li&lt;/span&gt;&lt;span class="p"&gt;)):&lt;/span&gt;
        &lt;span class="n"&gt;befs&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;i&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;befs&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;i&lt;/span&gt;&lt;span class="o"&gt;-&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt; &lt;span class="n"&gt;li&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;i&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;
    &lt;span class="n"&gt;afts&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;deepcopy&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;li&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;i&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="nb"&gt;range&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nb"&gt;len&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;li&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;&lt;span class="o"&gt;-&lt;/span&gt;&lt;span class="mi"&gt;2&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="o"&gt;-&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="o"&gt;-&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
        &lt;span class="n"&gt;afts&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;i&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;afts&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;i&lt;/span&gt;&lt;span class="o"&gt;+&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt; &lt;span class="n"&gt;li&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;i&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;
    &lt;span class="n"&gt;result&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;befs&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;i&lt;/span&gt;&lt;span class="o"&gt;-&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;&lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="n"&gt;afts&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;i&lt;/span&gt;&lt;span class="o"&gt;+&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;i&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="nb"&gt;range&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nb"&gt;len&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;li&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;&lt;span class="o"&gt;-&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;)]&lt;/span&gt;
    &lt;span class="n"&gt;result&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;insert&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;afts&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;])&lt;/span&gt;
    &lt;span class="n"&gt;result&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;append&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;befs&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="nb"&gt;len&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;li&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;&lt;span class="o"&gt;-&lt;/span&gt;&lt;span class="mi"&gt;2&lt;/span&gt;&lt;span class="p"&gt;])&lt;/span&gt;
    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;result&lt;/span&gt;

&lt;span class="n"&gt;N&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;randint&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;5&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;10&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="n"&gt;li&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;randint&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;10&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;_&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="nb"&gt;range&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;N&lt;/span&gt;&lt;span class="p"&gt;)]&lt;/span&gt;
&lt;span class="k"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;f&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;Input: {li}&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="n"&gt;result&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;simple&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;li&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="k"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;f&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;Result: {result}&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="n"&gt;result&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;tricky&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;li&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="k"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;f&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;Result: {result}&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/pre&gt;&lt;/div&gt;


&lt;p&gt;Yields:&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;span class="n"&gt;Input&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="mi"&gt;2&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;5&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;5&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;6&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;5&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;8&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;7&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;9&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;7&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;
&lt;span class="n"&gt;Result&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="mi"&gt;2646000&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;1058400&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;5292000&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;1058400&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;882000&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;1058400&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;661500&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;756000&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;588000&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;756000&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;
&lt;span class="n"&gt;Result&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="mi"&gt;2646000&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;1058400&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;5292000&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;1058400&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;882000&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;1058400&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;661500&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;756000&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;588000&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;756000&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;
&lt;/pre&gt;&lt;/div&gt;


&lt;p&gt;Quick and dirty large scale test:&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;span class="n"&gt;num_tests&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;1000&lt;/span&gt;&lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="mi"&gt;1000&lt;/span&gt;
&lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;_&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="nb"&gt;range&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;num_tests&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
    &lt;span class="n"&gt;N&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;randint&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;2&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;10&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="n"&gt;li&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;randint&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;10&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;_&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="nb"&gt;range&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;N&lt;/span&gt;&lt;span class="p"&gt;)]&lt;/span&gt;
    &lt;span class="k"&gt;assert&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;simple&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;li&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;==&lt;/span&gt; &lt;span class="n"&gt;tricky&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;li&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;
&lt;span class="k"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;f&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;{num_tests} random tests passed&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/pre&gt;&lt;/div&gt;

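&lt;p&gt;The same prefix/suffix-product idea can be written more compactly with &lt;code&gt;itertools.accumulate&lt;/code&gt;; a sketch of my own, not from the book:&lt;/p&gt;

```python
from itertools import accumulate
from operator import mul

def compact(li):
    # prefix[i] = product of li[:i], suffix[i] = product of li[i+1:]
    prefix = [1] + list(accumulate(li[:-1], mul))
    suffix = list(accumulate(reversed(li[1:]), mul))[::-1] + [1]
    return [p * s for p, s in zip(prefix, suffix)]

compact([2, 3, 4])  # returns [12, 8, 6]
```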

&lt;h2&gt;Arrays #2&lt;/h2&gt;
&lt;p&gt;Given an input array &lt;code&gt;li&lt;/code&gt; of ints, what is the minimum range that needs to be sorted? I.e., elements outside of this range are already in their correct, sorted position in &lt;code&gt;li&lt;/code&gt;.&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;span class="c1"&gt;# easy, sort and compare&lt;/span&gt;
&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;simple&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;li&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
    &lt;span class="n"&gt;sd&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nb"&gt;sorted&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;li&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="n"&gt;eq&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;i&lt;/span&gt; &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;i&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="nb"&gt;range&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;N&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;li&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;i&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="o"&gt;!=&lt;/span&gt; &lt;span class="n"&gt;sd&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;i&lt;/span&gt;&lt;span class="p"&gt;]]&lt;/span&gt;
    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;eq&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt; &lt;span class="n"&gt;eq&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="o"&gt;-&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;])&lt;/span&gt; &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="nb"&gt;len&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;eq&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;&amp;gt;&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt; &lt;span class="k"&gt;else&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;N&lt;/span&gt;&lt;span class="o"&gt;-&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="c1"&gt;# tricky&lt;/span&gt;
&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;tricky&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;li&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
    &lt;span class="n"&gt;mins&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;deepcopy&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;li&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;i&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="nb"&gt;range&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nb"&gt;len&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;li&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;&lt;span class="o"&gt;-&lt;/span&gt;&lt;span class="mi"&gt;2&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="o"&gt;-&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="o"&gt;-&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
        &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;mins&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;i&lt;/span&gt;&lt;span class="o"&gt;+&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="o"&gt;&amp;lt;&lt;/span&gt; &lt;span class="n"&gt;mins&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;i&lt;/span&gt;&lt;span class="p"&gt;]:&lt;/span&gt;
            &lt;span class="n"&gt;mins&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;i&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;mins&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;i&lt;/span&gt;&lt;span class="o"&gt;+&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;
    &lt;span class="n"&gt;maxs&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;deepcopy&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;li&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;i&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="nb"&gt;range&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nb"&gt;len&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;li&lt;/span&gt;&lt;span class="p"&gt;)):&lt;/span&gt;
        &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;maxs&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;i&lt;/span&gt;&lt;span class="o"&gt;-&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="o"&gt;&amp;gt;&lt;/span&gt; &lt;span class="n"&gt;maxs&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;i&lt;/span&gt;&lt;span class="p"&gt;]:&lt;/span&gt;
            &lt;span class="n"&gt;maxs&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;i&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;maxs&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;i&lt;/span&gt;&lt;span class="o"&gt;-&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;
    &lt;span class="n"&gt;mins&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;i&lt;/span&gt; &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;i&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="nb"&gt;range&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nb"&gt;len&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;li&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt; &lt;span class="k"&gt;if&lt;/span&gt;
            &lt;span class="n"&gt;mins&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;i&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="o"&gt;!=&lt;/span&gt; &lt;span class="n"&gt;li&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;i&lt;/span&gt;&lt;span class="p"&gt;]]&lt;/span&gt;
    &lt;span class="n"&gt;maxs&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;i&lt;/span&gt; &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;i&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="nb"&gt;range&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nb"&gt;len&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;li&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt; &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;maxs&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;i&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="o"&gt;!=&lt;/span&gt; &lt;span class="n"&gt;li&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;i&lt;/span&gt;&lt;span class="p"&gt;]]&lt;/span&gt;
    &lt;span class="n"&gt;mn&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;mins&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="nb"&gt;len&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;mins&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;&amp;gt;&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt; &lt;span class="k"&gt;else&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;
    &lt;span class="n"&gt;mx&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;maxs&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="o"&gt;-&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="nb"&gt;len&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;mins&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;&amp;gt;&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt; &lt;span class="k"&gt;else&lt;/span&gt; &lt;span class="nb"&gt;len&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;li&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;&lt;span class="o"&gt;-&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;
    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;mn&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;mx&lt;/span&gt;
&lt;/pre&gt;&lt;/div&gt;


&lt;p&gt;Yields:&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;span class="n"&gt;Input&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="mi"&gt;2&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;4&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;10&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;3&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;8&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;7&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;
&lt;span class="n"&gt;Range&lt;/span&gt; &lt;span class="n"&gt;to&lt;/span&gt; &lt;span class="n"&gt;be&lt;/span&gt; &lt;span class="nb"&gt;sorted&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;6&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;
&lt;span class="n"&gt;Range&lt;/span&gt; &lt;span class="n"&gt;to&lt;/span&gt; &lt;span class="n"&gt;be&lt;/span&gt; &lt;span class="nb"&gt;sorted&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;6&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;
&lt;/pre&gt;&lt;/div&gt;


&lt;p&gt;Quick and dirty large-scale test:&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;span class="n"&gt;num_tests&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;1000&lt;/span&gt;&lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="mi"&gt;1000&lt;/span&gt;
&lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;_&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="nb"&gt;range&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;num_tests&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
    &lt;span class="n"&gt;N&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;randint&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;10&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="n"&gt;li&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;randint&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;10&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;_&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="nb"&gt;range&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;N&lt;/span&gt;&lt;span class="p"&gt;)]&lt;/span&gt;
    &lt;span class="k"&gt;assert&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;simple&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;li&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;==&lt;/span&gt; &lt;span class="n"&gt;tricky&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;li&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;
&lt;span class="k"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;f&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;{num_tests} random tests passed&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/pre&gt;&lt;/div&gt;
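&lt;p&gt;The running-extrema scan above can be cross-checked against an even shorter baseline (a sketch, not code from the post; it mirrors &lt;code&gt;tricky&lt;/code&gt;'s convention of returning &lt;code&gt;(0, len(li)-1)&lt;/code&gt; for an already-sorted input): sort a copy and take the first and last positions where it disagrees with the original.&lt;/p&gt;

```python
# Baseline sketch: the range to be sorted spans the first and last
# positions where the list disagrees with its sorted copy.
# Falls back to (0, len(li)-1) for already-sorted input.
def unsorted_range(li):
    sl = sorted(li)
    diff = [i for i in range(len(li)) if li[i] != sl[i]]
    return (diff[0], diff[-1]) if diff else (0, len(li) - 1)

print(unsorted_range([2, 4, 1, 10, 3, 8, 7]))  # (0, 6)
```

&lt;p&gt;Sorting costs O(n log n) versus the O(n) two-pass scan, but it is a handy oracle for randomized testing.&lt;/p&gt;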


&lt;h2&gt;Arrays #3&lt;/h2&gt;
&lt;p&gt;Given an array of integers &lt;code&gt;li&lt;/code&gt;, what is the largest contiguous sub-array sum?&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;simple&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;li&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
    &lt;span class="n"&gt;max_sum&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;max_range&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="p"&gt;[]&lt;/span&gt;
    &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;mn&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="nb"&gt;range&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nb"&gt;len&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;li&lt;/span&gt;&lt;span class="p"&gt;)):&lt;/span&gt;
        &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;mx&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="nb"&gt;range&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;mn&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nb"&gt;len&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;li&lt;/span&gt;&lt;span class="p"&gt;)):&lt;/span&gt;
            &lt;span class="n"&gt;s&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nb"&gt;sum&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;li&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;mn&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="n"&gt;mx&lt;/span&gt;&lt;span class="o"&gt;+&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;])&lt;/span&gt;
            &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;s&lt;/span&gt; &lt;span class="o"&gt;&amp;gt;&lt;/span&gt; &lt;span class="n"&gt;max_sum&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
                &lt;span class="n"&gt;max_sum&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;s&lt;/span&gt;
                &lt;span class="n"&gt;max_range&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;li&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;mn&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="n"&gt;mx&lt;/span&gt;&lt;span class="o"&gt;+&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;
    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="nb"&gt;sum&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;max_range&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;tricky&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;li&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
    &lt;span class="n"&gt;best&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;mx&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;
    &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;i&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="nb"&gt;range&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nb"&gt;len&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;li&lt;/span&gt;&lt;span class="p"&gt;)):&lt;/span&gt;
        &lt;span class="n"&gt;mx&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nb"&gt;max&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;li&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;i&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt; &lt;span class="n"&gt;mx&lt;/span&gt;&lt;span class="o"&gt;+&lt;/span&gt;&lt;span class="n"&gt;li&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;i&lt;/span&gt;&lt;span class="p"&gt;])&lt;/span&gt;
        &lt;span class="n"&gt;best&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nb"&gt;max&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;best&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;mx&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;best&lt;/span&gt;

&lt;span class="n"&gt;N&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;randint&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;10&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="n"&gt;li&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;randint&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="o"&gt;-&lt;/span&gt;&lt;span class="mi"&gt;10&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;10&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;_&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="nb"&gt;range&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;N&lt;/span&gt;&lt;span class="p"&gt;)]&lt;/span&gt;
&lt;span class="k"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;f&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;Input: {li}&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="n"&gt;max_range&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;simple&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;li&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="k"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;f&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;Max sum: {max_range}&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="n"&gt;max_range&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;tricky&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;li&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="k"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;f&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;Max sum: {max_range}&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/pre&gt;&lt;/div&gt;


&lt;p&gt;Yields:&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;Input: [7, 7, -10, 9, -1, -2, 10, 10, -9]
Max sum: 30
Max sum: 30
&lt;/pre&gt;&lt;/div&gt;


&lt;p&gt;Quick and dirty large-scale test:&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;span class="n"&gt;num_tests&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;1000&lt;/span&gt;&lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="mi"&gt;1000&lt;/span&gt;
&lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;_&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="nb"&gt;range&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;num_tests&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
    &lt;span class="n"&gt;N&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;randint&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;10&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="n"&gt;li&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;randint&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="o"&gt;-&lt;/span&gt;&lt;span class="mi"&gt;10&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;10&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;_&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="nb"&gt;range&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;N&lt;/span&gt;&lt;span class="p"&gt;)]&lt;/span&gt;
    &lt;span class="k"&gt;assert&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;simple&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;li&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;==&lt;/span&gt; &lt;span class="n"&gt;tricky&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;li&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;
&lt;span class="k"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;f&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;{num_tests} random tests passed&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/pre&gt;&lt;/div&gt;
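&lt;p&gt;The &lt;code&gt;tricky&lt;/code&gt; scan is Kadane's algorithm: &lt;code&gt;mx&lt;/code&gt; holds the best sum of a sub-array &lt;em&gt;ending&lt;/em&gt; at the current index, and &lt;code&gt;best&lt;/code&gt; the best sum seen anywhere. A sketch of an extension (names hypothetical, not from the post) that also recovers the winning range:&lt;/p&gt;

```python
# Kadane's algorithm, extended to also report the [start, end] indices
# of the best sub-array; mx is the best sum of a sub-array ending at i.
def kadane_with_range(li):
    best, mx = 0, 0
    best_range, start = [], 0
    for i, x in enumerate(li):
        if x > mx + x:              # extending is worse than restarting at i
            mx, start = x, i
        else:
            mx += x
        if mx > best:
            best, best_range = mx, [start, i]
    return best, best_range

print(kadane_with_range([7, 7, -10, 9, -1, -2, 10, 10, -9]))  # (30, [0, 7])
```

&lt;p&gt;Like &lt;code&gt;simple&lt;/code&gt; and &lt;code&gt;tricky&lt;/code&gt; above, an all-negative input yields sum 0 with an empty range, i.e. the empty sub-array wins.&lt;/p&gt;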


&lt;h2&gt;Strings #1&lt;/h2&gt;
&lt;p&gt;Given a word &lt;code&gt;w&lt;/code&gt; and a string &lt;code&gt;s&lt;/code&gt;, find the starting locations in &lt;code&gt;s&lt;/code&gt; of substrings that are anagrams of &lt;code&gt;w&lt;/code&gt; (anagram = same length, same letters, different order).&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;span class="c1"&gt;# simple, continuously re-compute the per-character sums&lt;/span&gt;
&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;simple&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;w&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;s&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;i&lt;/span&gt; &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;i&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="nb"&gt;range&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nb"&gt;len&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;s&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt; &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;Counter&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;w&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;==&lt;/span&gt; &lt;span class="n"&gt;Counter&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;s&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;i&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="n"&gt;i&lt;/span&gt;&lt;span class="o"&gt;+&lt;/span&gt;&lt;span class="nb"&gt;len&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;w&lt;/span&gt;&lt;span class="p"&gt;)])]&lt;/span&gt;

&lt;span class="c1"&gt;# tricky, maintain the running sum&lt;/span&gt;
&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;tricky&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;w&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;s&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
    &lt;span class="n"&gt;ref&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;defaultdict&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nb"&gt;int&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;c&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;w&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;ref&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;c&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="o"&gt;+=&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt;
    &lt;span class="n"&gt;inds&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;run&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[],&lt;/span&gt; &lt;span class="n"&gt;deepcopy&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;ref&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;i&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="nb"&gt;range&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nb"&gt;len&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;s&lt;/span&gt;&lt;span class="p"&gt;)):&lt;/span&gt;
        &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;i&lt;/span&gt; &lt;span class="o"&gt;&amp;gt;=&lt;/span&gt; &lt;span class="nb"&gt;len&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;w&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
            &lt;span class="n"&gt;run&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;s&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;i&lt;/span&gt;&lt;span class="o"&gt;-&lt;/span&gt;&lt;span class="nb"&gt;len&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;w&lt;/span&gt;&lt;span class="p"&gt;)]]&lt;/span&gt; &lt;span class="o"&gt;+=&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt;
        &lt;span class="n"&gt;run&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;s&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;i&lt;/span&gt;&lt;span class="p"&gt;]]&lt;/span&gt; &lt;span class="o"&gt;-=&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt;
        &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="nb"&gt;sum&lt;/span&gt;&lt;span class="p"&gt;([&lt;/span&gt;&lt;span class="nb"&gt;abs&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;x&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;x&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;run&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;values&lt;/span&gt;&lt;span class="p"&gt;()])&lt;/span&gt; &lt;span class="o"&gt;==&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
            &lt;span class="n"&gt;inds&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;append&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;i&lt;/span&gt;&lt;span class="o"&gt;-&lt;/span&gt;&lt;span class="nb"&gt;len&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;w&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;&lt;span class="o"&gt;+&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;inds&lt;/span&gt;

&lt;span class="n"&gt;w&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;s&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="s1"&gt;&amp;#39;bb&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s1"&gt;&amp;#39;abbxbbbaabb&amp;#39;&lt;/span&gt;
&lt;span class="k"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;f&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;w: {w}, s: {s}&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="n"&gt;result&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;simple&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;w&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;s&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="k"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;f&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;Result: {result}&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="n"&gt;result&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;tricky&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;w&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;s&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="k"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;f&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;Result: {result}&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/pre&gt;&lt;/div&gt;


&lt;p&gt;Yields:&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;span class="n"&gt;w&lt;/span&gt;&lt;span class="o"&gt;:&lt;/span&gt; &lt;span class="n"&gt;bb&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt; &lt;span class="n"&gt;s&lt;/span&gt;&lt;span class="o"&gt;:&lt;/span&gt; &lt;span class="n"&gt;abbxbbbaabb&lt;/span&gt;
&lt;span class="n"&gt;Result&lt;/span&gt;&lt;span class="o"&gt;:&lt;/span&gt; &lt;span class="o"&gt;[&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;4&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;5&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;9&lt;/span&gt;&lt;span class="o"&gt;]&lt;/span&gt;
&lt;span class="n"&gt;Result&lt;/span&gt;&lt;span class="o"&gt;:&lt;/span&gt; &lt;span class="o"&gt;[&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;4&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;5&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;9&lt;/span&gt;&lt;span class="o"&gt;]&lt;/span&gt;
&lt;/pre&gt;&lt;/div&gt;


&lt;p&gt;Quick and dirty large-scale test:&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;span class="n"&gt;num_tests&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;1000&lt;/span&gt;&lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="mi"&gt;1000&lt;/span&gt;
&lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;_&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="nb"&gt;range&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;num_tests&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
    &lt;span class="n"&gt;w&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nb"&gt;str&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;randint&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;999&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;
    &lt;span class="n"&gt;s&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nb"&gt;str&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;randint&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;1000&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;9999999999&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;
    &lt;span class="k"&gt;assert&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;simple&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;w&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;s&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;==&lt;/span&gt; &lt;span class="n"&gt;tricky&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;w&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;s&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;
&lt;span class="k"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;f&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;{num_tests} random tests passed&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/pre&gt;&lt;/div&gt;
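&lt;p&gt;In &lt;code&gt;tricky&lt;/code&gt; above, the &lt;code&gt;sum([abs(x) ...])&lt;/code&gt; check makes every window step cost O(|alphabet|). A common refinement (a sketch, not from the post) also tracks how many characters currently have a non-zero balance, so each step is O(1):&lt;/p&gt;

```python
from collections import defaultdict

# Sliding-window anagram search: need[c] > 0 means the window still lacks
# c, need[c] < 0 means a surplus; mismatched counts non-zero balances,
# so the match test at each step is O(1) instead of a full dict sum.
def anagram_positions(w, s):
    need = defaultdict(int)
    for c in w:
        need[c] += 1
    mismatched = sum(1 for v in need.values() if v != 0)
    inds = []
    for i, c in enumerate(s):
        if need[c] == 0: mismatched += 1   # character entering the window
        need[c] -= 1
        if need[c] == 0: mismatched -= 1
        if i >= len(w):                    # character leaving the window
            out = s[i - len(w)]
            if need[out] == 0: mismatched += 1
            need[out] += 1
            if need[out] == 0: mismatched -= 1
        if i >= len(w) - 1 and mismatched == 0:
            inds.append(i - len(w) + 1)
    return inds

print(anagram_positions('bb', 'abbxbbbaabb'))  # [1, 4, 5, 9]
```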


&lt;h2&gt;Strings #2&lt;/h2&gt;
&lt;p&gt;Given a list of words &lt;code&gt;wl&lt;/code&gt;, find all pairs of words such that their concatenation is a palindrome (e.g. &lt;code&gt;'ab'+'a'&lt;/code&gt; = &lt;code&gt;'aba'&lt;/code&gt; is a palindrome).&lt;/p&gt;
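&lt;p&gt;Concretely, on a small illustrative word list (my example, not from the post), brute-forcing every ordered pair looks like this:&lt;/p&gt;

```python
from itertools import permutations

# Check every ordered pair (i, j), i != j: concatenate and test whether
# the result reads the same reversed. With wl = ['ab', 'a', 'ba']:
# 'ab'+'a' = 'aba', 'ab'+'ba' = 'abba', 'a'+'ba' = 'aba',
# 'ba'+'ab' = 'baab' are all palindromes.
wl = ['ab', 'a', 'ba']
pairs = [(i, j) for i, j in permutations(range(len(wl)), 2)
         if (wl[i] + wl[j]) == (wl[i] + wl[j])[::-1]]
print(pairs)  # [(0, 1), (0, 2), (1, 2), (2, 0)]
```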
&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;reverse&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;s&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;s&lt;/span&gt;&lt;span class="p"&gt;[::&lt;/span&gt;&lt;span class="o"&gt;-&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;

&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;is_palindrome&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;s&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;s&lt;/span&gt; &lt;span class="o"&gt;==&lt;/span&gt; &lt;span class="n"&gt;reverse&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;s&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;simple&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;li&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="nb"&gt;sorted&lt;/span&gt;&lt;span class="p"&gt;([(&lt;/span&gt;&lt;span class="n"&gt;i&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;j&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;i&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;j&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;permutations&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nb"&gt;range&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nb"&gt;len&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;li&lt;/span&gt;&lt;span class="p"&gt;)),&lt;/span&gt; &lt;span class="mi"&gt;2&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;is_palindrome&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;li&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;i&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="o"&gt;+&lt;/span&gt; &lt;span class="n"&gt;li&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;j&lt;/span&gt;&lt;span class="p"&gt;])])&lt;/span&gt;

&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;tricky&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;wl&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
    &lt;span class="n"&gt;d&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="n"&gt;w&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="n"&gt;i&lt;/span&gt; &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;i&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;w&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="nb"&gt;enumerate&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;wl&lt;/span&gt;&lt;span class="p"&gt;)}&lt;/span&gt;
    &lt;span class="n"&gt;inds&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[]&lt;/span&gt;
    &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;i&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;s&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="nb"&gt;enumerate&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;wl&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
        &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;j&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="nb"&gt;range&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nb"&gt;len&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;s&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;&lt;span class="o"&gt;+&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
            &lt;span class="n"&gt;pre&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;s&lt;/span&gt;&lt;span class="p"&gt;[:&lt;/span&gt;&lt;span class="n"&gt;j&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;
            &lt;span class="n"&gt;suf&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;s&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;j&lt;/span&gt;&lt;span class="p"&gt;:]&lt;/span&gt;
            &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;reverse&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;pre&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;d&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;keys&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt; &lt;span class="ow"&gt;and&lt;/span&gt; &lt;span class="n"&gt;is_palindrome&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;suf&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="ow"&gt;and&lt;/span&gt; &lt;span class="n"&gt;i&lt;/span&gt; &lt;span class="o"&gt;!=&lt;/span&gt; &lt;span class="n"&gt;d&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;reverse&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;pre&lt;/span&gt;&lt;span class="p"&gt;)]:&lt;/span&gt;
                &lt;span class="n"&gt;inds&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;append&lt;/span&gt;&lt;span class="p"&gt;((&lt;/span&gt;&lt;span class="n"&gt;i&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;d&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;reverse&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;pre&lt;/span&gt;&lt;span class="p"&gt;)]))&lt;/span&gt;
            &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;reverse&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;suf&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;d&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;keys&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt; &lt;span class="ow"&gt;and&lt;/span&gt; &lt;span class="n"&gt;is_palindrome&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;pre&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="ow"&gt;and&lt;/span&gt; &lt;span class="n"&gt;i&lt;/span&gt; &lt;span class="o"&gt;!=&lt;/span&gt; &lt;span class="n"&gt;d&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;reverse&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;suf&lt;/span&gt;&lt;span class="p"&gt;)]:&lt;/span&gt;
                &lt;span class="n"&gt;inds&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;append&lt;/span&gt;&lt;span class="p"&gt;((&lt;/span&gt;&lt;span class="n"&gt;d&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;reverse&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;suf&lt;/span&gt;&lt;span class="p"&gt;)],&lt;/span&gt; &lt;span class="n"&gt;i&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;
    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="nb"&gt;sorted&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nb"&gt;list&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nb"&gt;set&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;inds&lt;/span&gt;&lt;span class="p"&gt;)))&lt;/span&gt;

&lt;span class="n"&gt;wl&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;ab&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s1"&gt;&amp;#39;bba&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s1"&gt;&amp;#39;xx&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s1"&gt;&amp;#39;x&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s1"&gt;&amp;#39;bab&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s1"&gt;&amp;#39;a&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;
&lt;span class="k"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;f&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;Input: {wl}&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="n"&gt;result&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;simple&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;wl&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="k"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;f&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;Result sum: {result}&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="n"&gt;result&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;tricky&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;wl&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="k"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;f&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;Result sum: {result}&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/pre&gt;&lt;/div&gt;


&lt;p&gt;Yields:&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;Input: [&amp;#39;ab&amp;#39;, &amp;#39;bba&amp;#39;, &amp;#39;xx&amp;#39;, &amp;#39;x&amp;#39;, &amp;#39;bab&amp;#39;, &amp;#39;a&amp;#39;]
Result sum: [(0, 1), (0, 5), (2, 3), (3, 2), (4, 0), (5, 1)]
Result sum: [(0, 1), (0, 5), (2, 3), (3, 2), (4, 0), (5, 1)]
&lt;/pre&gt;&lt;/div&gt;


&lt;p&gt;A quick-and-dirty large-scale test:&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;span class="n"&gt;num_tests&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;1000&lt;/span&gt;&lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="mi"&gt;1000&lt;/span&gt;
&lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;_&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="nb"&gt;range&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;num_tests&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
    &lt;span class="n"&gt;l&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;randint&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;5&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;10&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="n"&gt;wl&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="nb"&gt;str&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;randint&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;9999&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt; &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;_&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="nb"&gt;range&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;l&lt;/span&gt;&lt;span class="p"&gt;)]&lt;/span&gt;
    &lt;span class="k"&gt;assert&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;simple&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;wl&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;==&lt;/span&gt; &lt;span class="n"&gt;tricky&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;wl&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;
&lt;span class="k"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;f&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;{num_tests} random tests passed&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/pre&gt;&lt;/div&gt;


&lt;h2&gt;Conclusion&lt;/h2&gt;
&lt;p&gt;My opinion, formed over many Software/Data Engineering interview loops (both as a candidate and as an interviewer), is that I don't like these types of questions, because:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;Candidates who tend to stress in such situations (like me: I get high blood pressure) will do much worse than they would in real life&lt;/li&gt;
&lt;li&gt;The specific questions/problems are not relevant to the job (whether SWE or DS)&lt;/li&gt;
&lt;li&gt;These interview situations are a competition against time, but being able to solve a 5-10 line programming puzzle in 10 minutes is like saying: "we want to hire people who can run fast, so they will spend less time walking to and from lunch, so they'll have more time left to work"&lt;/li&gt;
&lt;li&gt;These sorts of problems usually come from a question pool (like a book), and the questions tend to be alike, so they can be prepared for. Candidates can prepare for such interviews by solving a lot of similar problems, and if they're lucky, they'll have already seen the problem asked in the interview, or it will follow a usual pattern (e.g. use a hashmap). These sorts of questions favor more junior candidates, who studied algorithms in school more recently, and who have more time to prepare (no family, hungrier).&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;Realistically, in a live interview situation, I would probably fail 3-5 of these 5 questions [with a 10-15 minute limit].&lt;/p&gt;
&lt;p&gt;I think the flaw is expecting candidates to come up with the trickier, more efficient solution. For Data Scientists, more signal can be gained from asking them to come up with the simple, less efficient solution (and instead asking them to make it as Pythonic as possible). Data Scientists tend to use libraries (and SQL query engines) to do the heavy lifting; almost all the code they write is essentially "driver code", so being able to understand the structure of the problem and write a clean Python solution is good enough.&lt;/p&gt;</content><category term="interview"></category><category term="python"></category></entry><entry><title>Fair coin from biased coin</title><link href="https://bytepawn.com/fair-coin-from-biased-coin.html" rel="alternate"></link><published>2022-03-22T00:00:00+01:00</published><updated>2022-03-22T00:00:00+01:00</updated><author><name>Marton Trencseni</name></author><id>tag:bytepawn.com,2022-03-22:/fair-coin-from-biased-coin.html</id><summary type="html">&lt;p&gt;Given a biased coin, construct a fair coin.&lt;br/&gt;&lt;br/&gt; &lt;img src="https://www.epic-escapes.games/wp-content/uploads/2020/10/eduardo-soares-LR0kDki_tPo-unsplash-scaled-e1603378645986-1400x568.jpg" alt="Fair coin from biased coin" style="width: 400px;"/&gt;&lt;/p&gt;</summary><content type="html">&lt;h2&gt;Introduction&lt;/h2&gt;
&lt;p&gt;Recently I saw a &lt;a href="https://news.ycombinator.com/item?id=30671354"&gt;fun puzzle on Hacker News&lt;/a&gt;: &lt;strong&gt;given a biased coin, construct a fair coin.&lt;/strong&gt; I.e. you're allowed to toss a biased coin repeatedly and record the sequence of results (like &lt;code&gt;001...1&lt;/code&gt;), and the goal is to return 0/1 with 50%-50% probability, i.e. to use the biased coin to simulate a fair coin. I quickly scanned the article and comments to get some ideas, but I wanted to work through the solution(s) myself. &lt;a href="https://github.com/mtrencseni/playground/blob/master/Fair%20coin%20from%20biased%20coin.ipynb"&gt;The code is up on Github.&lt;/a&gt;&lt;/p&gt;
&lt;p&gt;Let's assume the biased coin comes up &lt;code&gt;1&lt;/code&gt; with $p$ probability and &lt;code&gt;0&lt;/code&gt; with $q=1-p$ probability.&lt;/p&gt;
&lt;p&gt;First, some helper functions:&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;coin_flip&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;bias&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mf"&gt;0.5&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
    &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;random&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt; &lt;span class="o"&gt;&amp;lt;&lt;/span&gt; &lt;span class="n"&gt;bias&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
        &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt;
    &lt;span class="k"&gt;else&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
        &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;

&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;make_biased_coin&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;bias&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="bp"&gt;None&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
    &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;bias&lt;/span&gt; &lt;span class="ow"&gt;is&lt;/span&gt; &lt;span class="bp"&gt;None&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
        &lt;span class="n"&gt;bias&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;random&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
    &lt;span class="k"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;f&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;Bias = {bias:.3f}&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="k"&gt;lambda&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;coin_flip&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;bias&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/pre&gt;&lt;/div&gt;


&lt;h2&gt;von Neumann's probabilistic solution&lt;/h2&gt;
&lt;p&gt;&lt;a href="https://en.wikipedia.org/wiki/Fair_coin#:~:text=a%20Bernoulli%20process.-,Fair%20results%20from%20a%20biased%20coin,start%20over%2C%20forgetting%20both%20results."&gt;The most elegant solution is from von Neumann&lt;/a&gt;: toss the biased coin twice. The 4 possible outcomes:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;&lt;code&gt;00&lt;/code&gt;: probability $P=q^2$&lt;/li&gt;
&lt;li&gt;&lt;code&gt;11&lt;/code&gt;: probability $P=p^2$&lt;/li&gt;
&lt;li&gt;&lt;code&gt;01&lt;/code&gt;: probability $P=pq$&lt;/li&gt;
&lt;li&gt;&lt;code&gt;10&lt;/code&gt;: probability $P=pq$&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;Observe that the last 2 probabilities are the same. Algorithm: if the result is &lt;code&gt;00&lt;/code&gt; or &lt;code&gt;11&lt;/code&gt;, repeat the procedure; if the result is &lt;code&gt;01&lt;/code&gt;, return &lt;code&gt;1&lt;/code&gt;; if the result is &lt;code&gt;10&lt;/code&gt;, return &lt;code&gt;0&lt;/code&gt;.&lt;/p&gt;
&lt;p&gt;In code:&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;probabilistic&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;biased_coin&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
    &lt;span class="k"&gt;while&lt;/span&gt; &lt;span class="bp"&gt;True&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
        &lt;span class="n"&gt;s&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;w&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;biased_coin&lt;/span&gt;&lt;span class="p"&gt;(),&lt;/span&gt; &lt;span class="n"&gt;biased_coin&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
        &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;s&lt;/span&gt; &lt;span class="o"&gt;==&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt; &lt;span class="ow"&gt;and&lt;/span&gt; &lt;span class="n"&gt;w&lt;/span&gt; &lt;span class="o"&gt;==&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
            &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt;
        &lt;span class="k"&gt;elif&lt;/span&gt; &lt;span class="n"&gt;s&lt;/span&gt; &lt;span class="o"&gt;==&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt; &lt;span class="ow"&gt;and&lt;/span&gt; &lt;span class="n"&gt;w&lt;/span&gt; &lt;span class="o"&gt;==&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
            &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;
&lt;/pre&gt;&lt;/div&gt;
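&lt;p&gt;A quick note on efficiency: each round of two flips terminates with probability $2pq$, so the number of rounds is geometrically distributed, and the expected number of biased flips per fair bit is $1/(pq)$; for $p=0.2$ that is $6.25$. A minimal self-contained sketch to check this (re-declaring stand-ins for the helpers above, and adding a flip counter to the procedure):&lt;/p&gt;

```python
from random import random

# stand-in for the coin_flip helper defined earlier
def coin_flip(bias=0.5):
    return 1 if random() < bias else 0

def probabilistic_counting(biased_coin):
    # von Neumann's procedure, also counting the biased flips used
    flips = 0
    while True:
        s, w = biased_coin(), biased_coin()
        flips += 2
        if (s, w) == (0, 1):
            return 1, flips
        elif (s, w) == (1, 0):
            return 0, flips

p = 0.2
results = [probabilistic_counting(lambda: coin_flip(p)) for _ in range(100_000)]
mean_bit = sum(bit for bit, _ in results) / len(results)
mean_flips = sum(f for _, f in results) / len(results)
print(f'mean bit = {mean_bit:.3f}')      # close to 0.500
print(f'mean flips = {mean_flips:.2f}')  # close to 1/(p*q) = 6.25
```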


&lt;h2&gt;Statistical solution&lt;/h2&gt;
&lt;p&gt;I wanted to come up with a solution of my own for this puzzle. I thought: first, let's measure (estimate) the bias of the biased coin, and then somehow correct for it. Let's flip the coin $N/2$ times first and take the mean of the flips; call this $\hat{p_1}$, our first estimate of $p$. Then, do another $N/2$ flips (for a total of $N$ flips) to get $\hat{p_2}$, our second estimate of $p$. We can pretend that the first measurement gets the true value of $p$, and that the second measurement then varies around $p$ in a symmetric, Gaussian-like way, so $P(\hat{p_2} &amp;gt; \hat{p_1}) = P(\hat{p_2} &amp;lt; \hat{p_1})$.&lt;/p&gt;
&lt;p&gt;What if we get unlucky, and $\hat{p_2} = \hat{p_1}$? We repeat the process and try again. In code:&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;statistical&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;biased_coin&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;N&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
    &lt;span class="k"&gt;while&lt;/span&gt; &lt;span class="bp"&gt;True&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
        &lt;span class="n"&gt;num_flips&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nb"&gt;int&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;N&lt;/span&gt;&lt;span class="o"&gt;/&lt;/span&gt;&lt;span class="mf"&gt;2.0&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
        &lt;span class="n"&gt;s&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nb"&gt;sum&lt;/span&gt;&lt;span class="p"&gt;([&lt;/span&gt;&lt;span class="n"&gt;biased_coin&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt; &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;_&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="nb"&gt;range&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;num_flips&lt;/span&gt;&lt;span class="p"&gt;)])&lt;/span&gt;&lt;span class="o"&gt;/&lt;/span&gt;&lt;span class="n"&gt;num_flips&lt;/span&gt;
        &lt;span class="n"&gt;num_flips&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;N&lt;/span&gt; &lt;span class="o"&gt;-&lt;/span&gt; &lt;span class="n"&gt;num_flips&lt;/span&gt;
        &lt;span class="n"&gt;w&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nb"&gt;sum&lt;/span&gt;&lt;span class="p"&gt;([&lt;/span&gt;&lt;span class="n"&gt;biased_coin&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt; &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;_&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="nb"&gt;range&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;num_flips&lt;/span&gt;&lt;span class="p"&gt;)])&lt;/span&gt;&lt;span class="o"&gt;/&lt;/span&gt;&lt;span class="n"&gt;num_flips&lt;/span&gt;
        &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;s&lt;/span&gt; &lt;span class="o"&gt;&amp;lt;&lt;/span&gt; &lt;span class="n"&gt;w&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
            &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt;
        &lt;span class="k"&gt;elif&lt;/span&gt; &lt;span class="n"&gt;s&lt;/span&gt; &lt;span class="o"&gt;&amp;gt;&lt;/span&gt; &lt;span class="n"&gt;w&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
            &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;
&lt;/pre&gt;&lt;/div&gt;


&lt;h2&gt;Permutations solution&lt;/h2&gt;
&lt;p&gt;This is based on a comment I saw on Hacker News. Flip the biased coin $N$ times to get an $N$ long sequence of bits. Count the number of 1s in the sequence, let's say it's $k$. The probability of getting this specific sequence is $P=p^k q^{N-k}$. We notice that all sequences with $k$ 1s have this probability. So if we can somehow enumerate all such sequences, we can order them lexicographically, and we can divide them into two sets of equal cardinality: pick the middle bit sequence, and take the ones smaller than the middle and the ones bigger than the middle bit sequence.&lt;/p&gt;
&lt;p&gt;Implementing this is trivial with one more restriction: let's only consider bit sequences that are not equal to their reverse (i.e. if the bit sequence is &lt;code&gt;01011&lt;/code&gt; then its reverse is &lt;code&gt;11010&lt;/code&gt;). In this case each bit sequence has exactly one pair, its reverse, and we can create a trivial ordering between these pairs. If we get unlucky, and the bit sequence is the same as its reverse, we repeat the process and try again. In code:&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;permutations&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;biased_coin&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;N&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
    &lt;span class="k"&gt;while&lt;/span&gt; &lt;span class="bp"&gt;True&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
        &lt;span class="n"&gt;inst&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="s1"&gt;&amp;#39;&amp;#39;&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;join&lt;/span&gt;&lt;span class="p"&gt;([&lt;/span&gt;&lt;span class="nb"&gt;str&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;biased_coin&lt;/span&gt;&lt;span class="p"&gt;())&lt;/span&gt; &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;_&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="nb"&gt;range&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;N&lt;/span&gt;&lt;span class="p"&gt;)])&lt;/span&gt;
        &lt;span class="n"&gt;revr&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;inst&lt;/span&gt;&lt;span class="p"&gt;[::&lt;/span&gt;&lt;span class="o"&gt;-&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;
        &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;inst&lt;/span&gt; &lt;span class="o"&gt;&amp;gt;&lt;/span&gt; &lt;span class="n"&gt;revr&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
            &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt;
        &lt;span class="k"&gt;elif&lt;/span&gt; &lt;span class="n"&gt;inst&lt;/span&gt; &lt;span class="o"&gt;&amp;lt;&lt;/span&gt; &lt;span class="n"&gt;revr&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
            &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;
&lt;/pre&gt;&lt;/div&gt;


&lt;h2&gt;All 3 solutions are the same&lt;/h2&gt;
&lt;p&gt;For the second and third solutions, what should $N$ be? For the statistical solution, I first thought that the higher $N$, the more accurately it approximates a fair coin. But this is not the case. All of the solutions are exactly fair at every $N$ tested; intuitively, each procedure only ever returns on one of a pair of equally likely mirror-image outcomes:&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;experiment&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;num_flips&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;num_experiments&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;coin_type&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;coin_func&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
    &lt;span class="n"&gt;start&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;time&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
    &lt;span class="n"&gt;means&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;mean&lt;/span&gt;&lt;span class="p"&gt;([&lt;/span&gt;&lt;span class="n"&gt;coin_func&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt; &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;_&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="nb"&gt;range&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;num_flips&lt;/span&gt;&lt;span class="p"&gt;)])&lt;/span&gt; &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;_&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="nb"&gt;range&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;num_experiments&lt;/span&gt;&lt;span class="p"&gt;)]&lt;/span&gt;
    &lt;span class="n"&gt;m&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;s&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;mean&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;means&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt; &lt;span class="n"&gt;stdev&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;means&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="n"&gt;elapsed&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;time&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt; &lt;span class="o"&gt;-&lt;/span&gt; &lt;span class="n"&gt;start&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt; &lt;span class="mi"&gt;1000&lt;/span&gt;
    &lt;span class="k"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;f&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;{coin_type}: mean = {m:.3f}, stdev = {s:.3f}, elapsed = {elapsed:.0f} msec&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="n"&gt;num_experiments&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;1000&lt;/span&gt;
&lt;span class="n"&gt;num_flips&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;1000&lt;/span&gt;
&lt;span class="n"&gt;biased_coin&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;make_biased_coin&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mf"&gt;0.2&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="c1"&gt;# fix bias at 0.2 for measurement&lt;/span&gt;

&lt;span class="n"&gt;experiment&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;num_flips&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;num_flips&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;num_experiments&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;num_experiments&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;coin_type&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;Fair coin&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
           &lt;span class="n"&gt;coin_func&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="k"&gt;lambda&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;coin_flip&lt;/span&gt;&lt;span class="p"&gt;())&lt;/span&gt;

&lt;span class="n"&gt;experiment&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;num_flips&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;num_flips&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;num_experiments&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;num_experiments&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
           &lt;span class="n"&gt;coin_type&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;f&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;Probabilistic coin&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
           &lt;span class="n"&gt;coin_func&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="k"&gt;lambda&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;probabilistic&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;biased_coin&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;

&lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;N&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="mi"&gt;2&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;50&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;100&lt;/span&gt;&lt;span class="p"&gt;]:&lt;/span&gt;
    &lt;span class="n"&gt;experiment&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;num_flips&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;num_flips&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;num_experiments&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;num_experiments&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
               &lt;span class="n"&gt;coin_type&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;f&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;Statistical coin (N={N})&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
               &lt;span class="n"&gt;coin_func&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="k"&gt;lambda&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;statistical&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;biased_coin&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;N&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;N&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;

&lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;N&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="mi"&gt;2&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;50&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;100&lt;/span&gt;&lt;span class="p"&gt;]:&lt;/span&gt;
    &lt;span class="n"&gt;experiment&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;num_flips&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;num_flips&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;num_experiments&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;num_experiments&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
               &lt;span class="n"&gt;coin_type&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;f&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;Permutations coin (N={N})&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
               &lt;span class="n"&gt;coin_func&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="k"&gt;lambda&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;permutations&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;biased_coin&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;N&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;N&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;
&lt;/pre&gt;&lt;/div&gt;


&lt;p&gt;Yields something like:&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;Bias = 0.200
Fair coin:                 mean = 0.501, stdev = 0.016, elapsed = 629   msec
Probabilistic coin:        mean = 0.500, stdev = 0.016, elapsed = 1815  msec
Statistical coin (N=2):    mean = 0.500, stdev = 0.016, elapsed = 5293  msec
Statistical coin (N=50):   mean = 0.500, stdev = 0.016, elapsed = 13198 msec
Statistical coin (N=100):  mean = 0.500, stdev = 0.015, elapsed = 24275 msec
Permutations coin (N=2):   mean = 0.500, stdev = 0.016, elapsed = 4569  msec
Permutations coin (N=50):  mean = 0.499, stdev = 0.016, elapsed = 18403 msec
Permutations coin (N=100): mean = 0.499, stdev = 0.016, elapsed = 35645 msec
&lt;/pre&gt;&lt;/div&gt;


&lt;p&gt;Given that, we should run them at the minimal $N$, which is $N=2$ coin flips. And if we think about it, the statistical and permutations-based solutions trivially reduce to the von Neumann solution at $N=2$ coin flips:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;statistical: flip the coin once, it's either 0 or 1, so the first mean $\hat{p_1}$ is either 0 or 1. Flip it a second time to get $\hat{p_2}$. The conditions $\hat{p_1} &amp;lt; \hat{p_2}$ and $\hat{p_1} &amp;gt; \hat{p_2}$ reduce to the &lt;code&gt;01&lt;/code&gt; and &lt;code&gt;10&lt;/code&gt; cases (probability $P=pq$), and we flip again in the &lt;code&gt;00&lt;/code&gt; and &lt;code&gt;11&lt;/code&gt; cases.&lt;/li&gt;
&lt;li&gt;permutations: there are 4 possible outcomes of 2 flips. &lt;code&gt;00&lt;/code&gt; and &lt;code&gt;11&lt;/code&gt; are their own reverse, so we flip again; in the other two cases, &lt;code&gt;01&lt;/code&gt; and &lt;code&gt;10&lt;/code&gt;, we return 0 or 1 (probability $P=pq$).&lt;/li&gt;
&lt;/ul&gt;
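&lt;p&gt;The von Neumann procedure that both $N=2$ cases reduce to can be sketched in a few lines. This is a minimal standalone sketch: the &lt;code&gt;biased_coin&lt;/code&gt; helper and the 0.2 bias here are stand-ins for the definitions used earlier, not the original post's code.&lt;/p&gt;

```python
import random

def biased_coin(p=0.2):
    # stand-in biased coin: returns 1 with probability p, else 0
    return 1 if random.random() < p else 0

def von_neumann_coin(coin=biased_coin):
    # flip twice: 01 -> return 0, 10 -> return 1, 00/11 -> flip again
    while True:
        a, b = coin(), coin()
        if a != b:
            return a
```

&lt;p&gt;Since $P(01) = P(10) = pq$, the two returned outcomes are equally likely regardless of the bias $p$.&lt;/p&gt;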
&lt;h2&gt;Conclusion&lt;/h2&gt;
&lt;p&gt;I had great fun thinking through this trivial puzzle.&lt;/p&gt;</content><category term="probability"></category><category term="statistics"></category></entry><entry><title>The German tank problem in World War II</title><link href="https://bytepawn.com/the-german-tank-problem-in-world-war-ii.html" rel="alternate"></link><published>2022-03-12T00:00:00+01:00</published><updated>2022-03-12T00:00:00+01:00</updated><author><name>Marton Trencseni</name></author><id>tag:bytepawn.com,2022-03-12:/the-german-tank-problem-in-world-war-ii.html</id><summary type="html">&lt;p&gt;I run Monte Carlo simulations to show the frequentist solution to the German tank problem. &lt;br/&gt;&lt;br/&gt;&lt;img src="/images/german_tank_1.png" alt="." style="width: 300px;"/&gt;&lt;/p&gt;</summary><content type="html">&lt;h2&gt;Introduction&lt;/h2&gt;
&lt;p&gt;One of the challenges the Allies faced in World War II was knowing how many tanks the Germans had. Mathematically, the problem is the following: assume $N$ tanks are produced and numbered sequentially: $1, 2, 3, ..., N$. $k &amp;lt; N$ tanks are destroyed, and their serial numbers are read off by soldiers and sent to a central intelligence agency. Based on these collected serial numbers, what is our best estimate for $N$? For the estimate to work, we assume that the destroyed tanks are a random sample from all the tanks, which may or may not be true in real life.&lt;/p&gt;
&lt;p&gt;Per &lt;a href="https://en.wikipedia.org/wiki/German_tank_problem"&gt;Wikipedia&lt;/a&gt;:&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;In the statistical theory of estimation, the German tank problem consists of estimating the maximum of a discrete uniform distribution from sampling without replacement. In simple terms, suppose there exists an unknown number of items which are sequentially numbered from 1 to N. A random sample of these items is taken and their sequence numbers observed; the problem is to estimate N from these observed numbers.&lt;/p&gt;
&lt;p&gt;The problem can be approached using either frequentist inference or Bayesian inference, leading to different results. Estimating the population maximum based on a single sample yields divergent results, whereas estimation based on multiple samples is a practical estimation question whose answer is simple (especially in the frequentist setting) but not obvious (especially in the Bayesian setting).&lt;/p&gt;
&lt;p&gt;The problem is named after its historical application by Allied forces in World War II to the estimation of the monthly rate of German tank production from very limited data. This exploited the manufacturing practice of assigning and attaching ascending sequences of serial numbers to tank components (chassis, gearbox, engine, wheels), with some of the tanks eventually being captured in battle by Allied forces.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;h2&gt;Monte Carlo simulation&lt;/h2&gt;
&lt;p&gt;The solution is simple: we just have to order the serial numbers and compute the average gap between them. Then, we take the maximum and add the average gap. The intuition can be illustrated with a very small sample size. Imagine there are 100 tanks produced, and 3 are destroyed. If we randomly select 3 numbers and order them, then on average the smallest will be ~25, the middle one ~50, and the largest ~75, so the average gap will be ~25. The estimate for the number of tanks is $75+25=100$.&lt;/p&gt;
&lt;p&gt;We can check our thinking with a simple Monte Carlo simulation:&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;span class="n"&gt;num_tanks&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;100&lt;/span&gt;
&lt;span class="n"&gt;num_destroyed&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;3&lt;/span&gt;
&lt;span class="n"&gt;num_experiments&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;1000&lt;/span&gt;&lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="mi"&gt;1000&lt;/span&gt;

&lt;span class="n"&gt;destroyed_tanks&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[]&lt;/span&gt;
&lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;_&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="nb"&gt;range&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;num_experiments&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
    &lt;span class="n"&gt;destroyed_tanks&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;append&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nb"&gt;sorted&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;sample&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nb"&gt;range&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;num_tanks&lt;/span&gt;&lt;span class="o"&gt;+&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt; &lt;span class="n"&gt;num_destroyed&lt;/span&gt;&lt;span class="p"&gt;)))&lt;/span&gt;

&lt;span class="k"&gt;print&lt;/span&gt;&lt;span class="p"&gt;([&lt;/span&gt;&lt;span class="n"&gt;mean&lt;/span&gt;&lt;span class="p"&gt;([&lt;/span&gt;&lt;span class="n"&gt;s&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;i&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;s&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;destroyed_tanks&lt;/span&gt;&lt;span class="p"&gt;])&lt;/span&gt; &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;i&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="nb"&gt;range&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;num_destroyed&lt;/span&gt;&lt;span class="p"&gt;)])&lt;/span&gt;
&lt;span class="n"&gt;estimate&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;mean&lt;/span&gt;&lt;span class="p"&gt;([&lt;/span&gt;&lt;span class="nb"&gt;max&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;s&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;s&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;destroyed_tanks&lt;/span&gt;&lt;span class="p"&gt;])&lt;/span&gt; &lt;span class="o"&gt;+&lt;/span&gt; &lt;span class="n"&gt;mean&lt;/span&gt;&lt;span class="p"&gt;([(&lt;/span&gt;&lt;span class="nb"&gt;max&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;s&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;&lt;span class="o"&gt;-&lt;/span&gt;&lt;span class="nb"&gt;min&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;s&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;&lt;span class="o"&gt;/&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nb"&gt;len&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;s&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;&lt;span class="o"&gt;-&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;s&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;destroyed_tanks&lt;/span&gt;&lt;span class="p"&gt;])&lt;/span&gt;
&lt;span class="k"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;Average estimate for num_tanks = &lt;/span&gt;&lt;span class="si"&gt;%.2f&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;&lt;/span&gt; &lt;span class="o"&gt;%&lt;/span&gt; &lt;span class="n"&gt;estimate&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/pre&gt;&lt;/div&gt;


&lt;p&gt;Prints something like:&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;span class="k"&gt;[25.287, 50.517, 75.771]&lt;/span&gt;
&lt;span class="na"&gt;Average estimate for num_tanks&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="s"&gt;101.01&lt;/span&gt;
&lt;/pre&gt;&lt;/div&gt;


&lt;p&gt;There is a small bug in our thinking. We expected the sorted averages to be &lt;code&gt;[25, 50, 75]&lt;/code&gt;, but they are approx. &lt;code&gt;[25.25, 50.50, 75.75]&lt;/code&gt;, the gap is 25.25 instead of 25, and the estimate is off by 1 (101 instead of 100). Why?&lt;/p&gt;
&lt;p&gt;This is quite counter-intuitive (to me); it happens because we're generating &lt;code&gt;int&lt;/code&gt;s and not &lt;code&gt;float&lt;/code&gt;s. Our intuition does work if we switch to &lt;code&gt;float&lt;/code&gt;s:&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;span class="n"&gt;num_tanks&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;100&lt;/span&gt;
&lt;span class="n"&gt;num_destroyed&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;3&lt;/span&gt;
&lt;span class="n"&gt;num_experiments&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;1000&lt;/span&gt;&lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="mi"&gt;1000&lt;/span&gt;

&lt;span class="n"&gt;destroyed_tanks&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[]&lt;/span&gt;
&lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;_&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="nb"&gt;range&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;num_experiments&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
    &lt;span class="n"&gt;destroyed_tanks&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;append&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nb"&gt;sorted&lt;/span&gt;&lt;span class="p"&gt;([&lt;/span&gt;&lt;span class="n"&gt;uniform&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;num_tanks&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;_&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="nb"&gt;range&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;num_destroyed&lt;/span&gt;&lt;span class="p"&gt;)]))&lt;/span&gt;

&lt;span class="k"&gt;print&lt;/span&gt;&lt;span class="p"&gt;([&lt;/span&gt;&lt;span class="n"&gt;mean&lt;/span&gt;&lt;span class="p"&gt;([&lt;/span&gt;&lt;span class="n"&gt;s&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;i&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;s&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;destroyed_tanks&lt;/span&gt;&lt;span class="p"&gt;])&lt;/span&gt; &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;i&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="nb"&gt;range&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;num_destroyed&lt;/span&gt;&lt;span class="p"&gt;)])&lt;/span&gt;
&lt;span class="n"&gt;estimate&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;mean&lt;/span&gt;&lt;span class="p"&gt;([&lt;/span&gt;&lt;span class="nb"&gt;max&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;s&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;s&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;destroyed_tanks&lt;/span&gt;&lt;span class="p"&gt;])&lt;/span&gt; &lt;span class="o"&gt;+&lt;/span&gt; &lt;span class="n"&gt;mean&lt;/span&gt;&lt;span class="p"&gt;([(&lt;/span&gt;&lt;span class="nb"&gt;max&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;s&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;&lt;span class="o"&gt;-&lt;/span&gt;&lt;span class="nb"&gt;min&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;s&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;&lt;span class="o"&gt;/&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nb"&gt;len&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;s&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;&lt;span class="o"&gt;-&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;s&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;destroyed_tanks&lt;/span&gt;&lt;span class="p"&gt;])&lt;/span&gt;
&lt;span class="k"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;Average estimate for num_tanks = &lt;/span&gt;&lt;span class="si"&gt;%.2f&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;&lt;/span&gt; &lt;span class="o"&gt;%&lt;/span&gt; &lt;span class="n"&gt;estimate&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/pre&gt;&lt;/div&gt;


&lt;p&gt;Prints something like:&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;span class="k"&gt;[25.748, 50.498, 75.236]&lt;/span&gt;
&lt;span class="na"&gt;Average estimate for num_tanks&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="s"&gt;99.98&lt;/span&gt;
&lt;/pre&gt;&lt;/div&gt;


&lt;p&gt;This matches our intuition: 50.5 is the midpoint of the [1, 100] interval we're uniformly sampling, 25.75 is the midpoint between 1 and 50.5, and 75.25 is the midpoint between 50.5 and 100. Another way to see that the "quantization" from using integers matters is to think of the trivial example of 3 tanks where all 3 are destroyed: in this case the 3 numbers will be [1, 2, 3], and clearly 1 is not the midpoint between 1 and 2.&lt;/p&gt;
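&lt;p&gt;The quarter offsets in the integer case also have a clean closed form: when drawing $k$ serial numbers without replacement from $1..N$, the expected value of the $i$-th smallest is $i(N+1)/(k+1)$, which for $N=100, k=3$ gives exactly 25.25, 50.5 and 75.75, matching the simulation. We can verify the formula by exhaustive enumeration at a small $N$ (a quick sketch, not from the original post):&lt;/p&gt;

```python
from itertools import combinations
from statistics import mean

N, k = 10, 3
# all possible sets of destroyed serial numbers; combinations() yields sorted tuples
samples = list(combinations(range(1, N + 1), k))
# exact mean of the i-th smallest serial number over all samples
observed = [mean(s[i] for s in samples) for i in range(k)]
expected = [i * (N + 1) / (k + 1) for i in range(1, k + 1)]
print(observed)  # [2.75, 5.5, 8.25]
print(expected)  # [2.75, 5.5, 8.25]
```

&lt;p&gt;So the order statistics split $[0, N+1]$, not $[1, N]$, into $k+1$ equal gaps, which is exactly where the extra 0.25 comes from.&lt;/p&gt;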
&lt;p&gt;In any case, the actual formula for the minimum-variance unbiased estimate of the number of tanks is:&lt;/p&gt;
&lt;p&gt;$N = M + M/k - 1$&lt;/p&gt;
&lt;p&gt;where $M$ is the maximum destroyed tank serial number and $k$ is the number of destroyed tanks. In code:&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;span class="n"&gt;num_tanks&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;100&lt;/span&gt;
&lt;span class="n"&gt;num_destroyed&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;3&lt;/span&gt;
&lt;span class="n"&gt;num_experiments&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;100&lt;/span&gt;&lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="mi"&gt;1000&lt;/span&gt;

&lt;span class="n"&gt;destroyed_tanks&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[]&lt;/span&gt;
&lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;_&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="nb"&gt;range&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;num_experiments&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
    &lt;span class="n"&gt;destroyed_tanks&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;append&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;sample&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nb"&gt;range&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;num_tanks&lt;/span&gt;&lt;span class="o"&gt;+&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt; &lt;span class="n"&gt;num_destroyed&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;

&lt;span class="n"&gt;estimate&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;mean&lt;/span&gt;&lt;span class="p"&gt;([&lt;/span&gt;&lt;span class="nb"&gt;max&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;s&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;+&lt;/span&gt; &lt;span class="nb"&gt;max&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;s&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;&lt;span class="o"&gt;/&lt;/span&gt;&lt;span class="nb"&gt;len&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;s&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;-&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt; &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;s&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;destroyed_tanks&lt;/span&gt;&lt;span class="p"&gt;])&lt;/span&gt;
&lt;span class="k"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;Average estimate for num_tanks = &lt;/span&gt;&lt;span class="si"&gt;%.2f&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;&lt;/span&gt; &lt;span class="o"&gt;%&lt;/span&gt; &lt;span class="n"&gt;estimate&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/pre&gt;&lt;/div&gt;


&lt;p&gt;Prints something like:&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;Average estimate for num_tanks = 100.01
&lt;/pre&gt;&lt;/div&gt;


&lt;p&gt;Let's see how good this estimate is, and plot the standard deviation of for different $(N, k)$ combinations:&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;span class="n"&gt;num_tanks&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="mi"&gt;100&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;1000&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;10&lt;/span&gt;&lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="mi"&gt;1000&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;
&lt;span class="n"&gt;num_destroyed_ratio&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="mf"&gt;0.02&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mf"&gt;0.05&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mf"&gt;0.1&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mf"&gt;0.2&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;
&lt;span class="n"&gt;num_experiments&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;10&lt;/span&gt;&lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="mi"&gt;1000&lt;/span&gt;

&lt;span class="n"&gt;destroyed_tanks_dict&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;defaultdict&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="k"&gt;lambda&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;[])&lt;/span&gt;
&lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;nt&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;num_tanks&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
    &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;num_destroyed&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="nb"&gt;int&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;r&lt;/span&gt;&lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="n"&gt;nt&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;r&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;num_destroyed_ratio&lt;/span&gt;&lt;span class="p"&gt;]:&lt;/span&gt;
        &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;_&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="nb"&gt;range&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;num_experiments&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
            &lt;span class="n"&gt;destroyed_tanks_dict&lt;/span&gt;&lt;span class="p"&gt;[(&lt;/span&gt;&lt;span class="n"&gt;nt&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;num_destroyed&lt;/span&gt;&lt;span class="p"&gt;)]&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;append&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;sample&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nb"&gt;range&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;nt&lt;/span&gt;&lt;span class="o"&gt;+&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt; &lt;span class="n"&gt;num_destroyed&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;

&lt;span class="n"&gt;stdvs_dict&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;defaultdict&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="k"&gt;lambda&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;[])&lt;/span&gt;
&lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;nt&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;num_destroyed&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt; &lt;span class="n"&gt;destroyed_tanks&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;destroyed_tanks_dict&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;items&lt;/span&gt;&lt;span class="p"&gt;():&lt;/span&gt;
    &lt;span class="n"&gt;s&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;stdev&lt;/span&gt;&lt;span class="p"&gt;([&lt;/span&gt;&lt;span class="nb"&gt;max&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;s&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;+&lt;/span&gt; &lt;span class="nb"&gt;max&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;s&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;&lt;span class="o"&gt;/&lt;/span&gt;&lt;span class="nb"&gt;len&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;s&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;-&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt; &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;s&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;destroyed_tanks&lt;/span&gt;&lt;span class="p"&gt;])&lt;/span&gt;
    &lt;span class="n"&gt;stdvs_dict&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;nt&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;append&lt;/span&gt;&lt;span class="p"&gt;((&lt;/span&gt;&lt;span class="n"&gt;num_destroyed&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;s&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;

&lt;span class="n"&gt;plt&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;figure&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;figsize&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;10&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="mi"&gt;5&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;
&lt;span class="n"&gt;plt&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;xlabel&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;k as a ratio of N&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="n"&gt;plt&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;ylabel&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;stdev as ratio of N&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="n"&gt;legends&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[]&lt;/span&gt;
&lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;nt&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;stdvs&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;stdvs_dict&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;items&lt;/span&gt;&lt;span class="p"&gt;():&lt;/span&gt;
    &lt;span class="n"&gt;plt&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;plot&lt;/span&gt;&lt;span class="p"&gt;([&lt;/span&gt;&lt;span class="n"&gt;x&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;&lt;span class="o"&gt;/&lt;/span&gt;&lt;span class="n"&gt;nt&lt;/span&gt; &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;x&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;stdvs&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;x&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;&lt;span class="o"&gt;/&lt;/span&gt;&lt;span class="n"&gt;nt&lt;/span&gt; &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;x&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;stdvs&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt; &lt;span class="n"&gt;marker&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;o&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="n"&gt;legends&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;append&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;f&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;N = {nt}&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="n"&gt;plt&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;legend&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;legends&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="n"&gt;plt&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;ylim&lt;/span&gt;&lt;span class="p"&gt;((&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mf"&gt;0.5&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;
&lt;span class="n"&gt;plt&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;show&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
&lt;/pre&gt;&lt;/div&gt;


&lt;p&gt;Shows:&lt;/p&gt;
&lt;p&gt;&lt;img src="/images/german_tank_1.png" alt="German tank problem" style="width: 600px;"/&gt;&lt;/p&gt;
&lt;p&gt;This shows that even at $N=1000$, if you are able to observe about 10% of the serial numbers, you can expect a very good estimate: you will essentially be on target to within a few % when estimating $N$!&lt;/p&gt;
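&lt;p&gt;As a quick numeric spot check of this claim (a hedged sketch, separate from the plotting code above): at $N=1000$ and $k=100$, the relative standard deviation of the estimate comes out around 1%:&lt;/p&gt;

```python
from random import sample, seed
from statistics import stdev

seed(1)
N, k, num_experiments = 1000, 100, 2000
estimates = []
for _ in range(num_experiments):
    s = sample(range(1, N + 1), k)   # serial numbers of destroyed tanks
    m = max(s)
    estimates.append(m + m / k - 1)  # N-hat = M + M/k - 1

print(stdev(estimates) / N)  # roughly 0.01, i.e. ~1% of N
```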
&lt;h2&gt;Conclusion&lt;/h2&gt;
&lt;p&gt;From the &lt;a href="https://en.wikipedia.org/wiki/German_tank_problem"&gt;Wikipedia article&lt;/a&gt;:&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;According to conventional Allied intelligence estimates, the Germans were producing around 1,400 tanks a month between June 1940 and September 1942. Applying the formula below to the serial numbers of captured tanks, the number was calculated to be 246 a month. After the war, captured German production figures from the ministry of Albert Speer showed the actual number to be 245.&lt;/p&gt;
&lt;/blockquote&gt;</content><category term="statistics"></category><category term="war"></category></entry><entry><title>Delightfully cynical half-truths about organizations</title><link href="https://bytepawn.com/delightfully-cynical-half-truths-about-organizations.html" rel="alternate"></link><published>2022-02-12T00:00:00+01:00</published><updated>2022-02-12T00:00:00+01:00</updated><author><name>Marton Trencseni</name></author><id>tag:bytepawn.com,2022-02-12:/delightfully-cynical-half-truths-about-organizations.html</id><summary type="html">&lt;p&gt;Five delightfully cynical half-truths about organizations: the Peter principle, the Dilbert principle, the Gervais principle, Negative selection and the Dunning-Kruger effect. &lt;br/&gt;&lt;br/&gt;&lt;img src="/images/dilbert1.jpg" alt="." style="width: 300px;"/&gt;&lt;/p&gt;</summary><content type="html">&lt;h2&gt;Introduction&lt;/h2&gt;
&lt;p&gt;The author is in a (mid-level) leadership position.&lt;/p&gt;
&lt;p&gt;&lt;img src="/images/dilbert1.jpg" alt="Dilbert comic" style="width: 500px;"/&gt;&lt;/p&gt;
&lt;h2&gt;Peter principle&lt;/h2&gt;
&lt;p&gt;&lt;a href="https://en.wikipedia.org/wiki/Peter_principle"&gt;Wikipedia&lt;/a&gt;: As long as a worker is good at their job, they will keep getting promoted, until they reach the first "level" where they're incompetent. Here they get stuck, and are no longer promoted. The corollary is that all higher level jobs in an organization is occupied by people who are incompetent at their jobs.&lt;/p&gt;
&lt;h2&gt;Dilbert principle&lt;/h2&gt;
&lt;p&gt;&lt;a href="https://en.wikipedia.org/wiki/Dilbert_principle"&gt;Wikipedia&lt;/a&gt;: Companies tend to systematically promote incompetent employees to management &lt;em&gt;to get them out of the workflow&lt;/em&gt;.&lt;/p&gt;
&lt;h2&gt;Gervais principle&lt;/h2&gt;
&lt;p&gt;&lt;a href="https://www.ribbonfarm.com/2009/10/07/the-gervais-principle-or-the-office-according-to-the-office/"&gt;Ribbonfarm&lt;/a&gt;: Organizations are made up of Losers at the bottom, Clueless in the middle and Sociopaths (eg. CEOs and other CxOs) at the top. Sociopaths, to keep things running, promote over-performing Losers into middle-management, and to protect their own job, groom under-performing Losers into Sociopaths. The Clueless are the ones who lack the competence to circulate freely through the job market (unlike Sociopaths and Losers), and have long tenures at the organization.&lt;/p&gt;
&lt;h2&gt;Negative selection&lt;/h2&gt;
&lt;p&gt;&lt;a href="https://en.wikipedia.org/wiki/Negative_selection_(politics)"&gt;Wikipedia&lt;/a&gt;: The person on the top of the hierarchy, wishing to remain in power forever, chooses his associates with the prime criterion of incompetence – they must not be competent enough to remove him from power. Since subordinates often mimic their leader, these associates do the same with those below them in the hierarchy, and the hierarchy becomes progressively filled over time with more and more incompetent people.&lt;/p&gt;
&lt;h2&gt;Dunning-Kruger effect&lt;/h2&gt;
&lt;p&gt;&lt;a href="https://en.wikipedia.org/wiki/Dunning%E2%80%93Kruger_effect"&gt;Wikipedia&lt;/a&gt;: The cognitive bias whereby people with low ability at a task overestimate their ability. Some researchers also include in their definition the opposite effect for high performers, their tendency to underestimate their skills.&lt;/p&gt;
&lt;h2&gt;Conclusion&lt;/h2&gt;
&lt;p&gt;The first four posit different mechanisms (promotion based on true merit, organizational-defensive promotion and self-defensive promotion) that result in incompetent people getting promoted to leadership positions. The last one, the Dunning-Kruger effect, says that if promotion decisions are based on people's demeanor and self-promotion, then the less qualified will be promoted. These are half-truths, but there is some truth in them. What do high-performance organizations do to avoid such problems?&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;At a high level, all these problems (apart from the Dunning-Kruger effect) have their roots in misaligned incentives: the employee's (whether management or individual contributor) and the organization's long-term interests are not aligned. A good tool for such alignment is stock options. However, there are very few organizations on a growth curve where this influences employees to do the right thing at scale. For this to work, the majority of employees have to believe that doing "the right thing" will cause their (current) stock to rise by more than the benefits (higher salary, more stock) they can hope to gain by doing "the wrong thing". Also see &lt;a href="https://en.wikipedia.org/wiki/Tragedy_of_the_commons"&gt;Tragedy of the Commons&lt;/a&gt;.&lt;/li&gt;
&lt;li&gt;Another root cause is poor hiring. This is especially tricky when the organization is hiring for a role that is not within the core competency of the organization, such as when a non-tech company hires for technical roles, or when a tech startup hires for finance or sales roles. There are various ways around this, such as getting recommendations from investors, but in general this is not a solved problem in industry, even though this is a very costly mistake. If an incompetent manager is hired, then this person will build an incompetent organization. Once this happens, it takes years to rebuild, and this can even bring down the entire company before it's resolved.&lt;/li&gt;
&lt;li&gt;Avoiding the Peter principle: instead of promoting to level $L+1$ when the person does a good job at level $L$, promote when the person has already demonstrated competence at the $L+1$ level for some time. There are companies that do this, eg. Facebook did this in 2016-17 when I worked there. However, I found this an unacceptable HR policy, because asking me to overperform at the next level for 18 months without compensation just to get promoted isn't fair. And this assumes that the employee's manager does not change, since a new manager will often not credit past performance. Another alternative is to promote based on performance at the $L$ level, but then demote if the person does not perform at the $L+1$ level, and have a culture that accepts this and doesn't see it as a major personal failure. This is also hard to accomplish at scale, since many people just aren't like this personally, won't accept demotion, and will just start looking for a new job. The industry solution seems to be just that: (i) promotions based on performance at the $L$ level, (ii) people leave their jobs for another $L+1$ level job if they don't perform at the $L+1$ level, (iii) but if they really are incompetent at this level, they will bounce around jobs and eventually slide down to lower tier companies.&lt;/li&gt;
&lt;li&gt;Avoiding the Dilbert principle: some companies employ up-or-out management: either get promoted or get managed out. At other companies, it needs to be understood that promotion does not automatically happen every X years (on either the IC or Management track). This is fundamentally a people management problem: people managers need to set realistic expectations, and if the person's performance isn't good enough for promotion then promotion should not be used as a carrot on a stick.&lt;/li&gt;
&lt;li&gt;Negative selection: the best counter-measure is to try to align the management's long-term incentive with the company's.&lt;/li&gt;
&lt;li&gt;Dunning-Kruger effect: organizations can avoid this by deploying tools such as &lt;a href="https://en.wikipedia.org/wiki/OKR"&gt;OKRs&lt;/a&gt; with &lt;a href="https://en.wikipedia.org/wiki/SMART_criteria"&gt;SMART&lt;/a&gt; outcomes across the organization hierarchy. This increases the probability that employees will be judged based on the objective outcomes they deliver, instead of their ability to market and promote themselves.&lt;/li&gt;
&lt;/ul&gt;</content><category term="peter-principle"></category><category term="gervais-principle"></category><category term="dunning-kruger"></category><category term="dilbert"></category></entry><entry><title>Probabilistic spin glass - Conclusion</title><link href="https://bytepawn.com/probabilistic-spin-glass-conclusion.html" rel="alternate"></link><published>2022-01-31T00:00:00+01:00</published><updated>2022-01-31T00:00:00+01:00</updated><author><name>Marton Trencseni</name></author><id>tag:bytepawn.com,2022-01-31:/probabilistic-spin-glass-conclusion.html</id><summary type="html">&lt;p&gt;I summarize the 5 previous posts on probabilistic spin glasses. &lt;br/&gt;&lt;br/&gt;&lt;img src="/images/prob_spin_glass_21.png" alt="." style="width: 300px;"/&gt;&lt;/p&gt;</summary><content type="html">&lt;h2&gt;Introduction&lt;/h2&gt;
&lt;p&gt;I wrote 5 posts on probabilistic spin glasses:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;&lt;a href="https://bytepawn.com/probabilistic-spin-glass.html#probabilistic-spin-glass"&gt;Part I&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://bytepawn.com/probabilistic-spin-glass-part-ii.html#probabilistic-spin-glass-part-ii"&gt;Part II&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://bytepawn.com/probabilistic-spin-glass-part-iii.html#probabilistic-spin-glass-part-iii"&gt;Part III&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://bytepawn.com/probabilistic-spin-glass-part-iv.html#probabilistic-spin-glass-part-iv"&gt;Part IV&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://bytepawn.com/probabilistic-spin-glass-part-v.html#probabilistic-spin-glass-part-v"&gt;Part V&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;Here I will summarize my main findings and misses, and reflect on what I still don't have a good feel for.&lt;/p&gt;
&lt;h2&gt;Motivation&lt;/h2&gt;
&lt;p&gt;My motivation for studying these systems was to discuss a toy model similar to spin glasses, but without having to introduce the &lt;a href="https://en.wikipedia.org/wiki/Hamiltonian_mechanics"&gt;Hamiltonian&lt;/a&gt;, so non-physicist readers can also follow along. I'm not actually sure how well this "probabilistic spin glass" resembles spin glasses studied by physicists. (I stopped being a real physicist 10+ years ago.)&lt;/p&gt;
&lt;p&gt;The idea was to take an $N \times N$ sized spin system, ie. each cell has two states, ↑ and ↓, or 0 and 1. If each spin is independent of its neighbours, then the whole system is just a list of independent random variables, and the geometry doesn't matter. To make it interesting, I made the spins dependent on the neighbours: for spin $s$ and neighbour $n$: $P(s=↑ | n=↑) = P(s=↓ | n=↓) = p$. So the spins align with probability $p$ and are opposite with probability $1-p$.&lt;/p&gt;
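&lt;p&gt;To make the alignment rule concrete, here is a minimal Python sketch of one way to generate such a grid, assuming a simple row-by-row construction where each spin is conditioned on its left neighbour (or on the spin above it, in the first column); the actual construction in Part I may differ in its details:&lt;/p&gt;

```python
import random

def sample_spin(neighbour, p):
    # align with the neighbour with probability p, oppose with 1-p
    return random.choices([neighbour, 1 - neighbour], weights=[p, 1 - p])[0]

def generate_grid(N, p):
    # naive row-by-row construction: each spin is conditioned on its
    # left neighbour (or on the spin above it, in the first column)
    grid = [[0] * N for _ in range(N)]
    grid[0][0] = random.randint(0, 1)
    for i in range(N):
        for j in range(N):
            if i == 0 and j == 0:
                continue
            neighbour = grid[i][j - 1] if j else grid[i - 1][j]
            grid[i][j] = sample_spin(neighbour, p)
    return grid
```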
&lt;p&gt;The questions then are:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;given $N$ and $p$, what is the probability of each possible $N \times N$ grid, of which there are $2^{N \times N}$?&lt;/li&gt;
&lt;li&gt;given $N$ and $p$, how do we write a Monte Carlo simulation so that each grid out of the $2^{N \times N}$ options is generated with the correct probability?&lt;/li&gt;
&lt;li&gt;how do we verify our generation procedure?&lt;/li&gt;
&lt;li&gt;what can we say about the entropy of the grid [ensemble]?&lt;/li&gt;
&lt;li&gt;if we start with some grid, can we create a dynamic update rule given $N$ and $p$?&lt;/li&gt;
&lt;li&gt;what happens to the system if we apply the above update rule?&lt;/li&gt;
&lt;/ul&gt;
&lt;h2&gt;Symmetry&lt;/h2&gt;
&lt;p&gt;In part I, I wrote a MC method to generate grids given $N$ and $p$. They looked right, passed the smell test:&lt;/p&gt;
&lt;p&gt;&lt;img src="/images/prob_spin_glass_22.png" alt="Probabilistic spin glass" style="width: 500px;"/&gt;&lt;/p&gt;
&lt;p&gt;However, later I noticed that the construction method is not symmetric. Ie. there are grids that are "the same" (grid A and B are the same if you can go from one to the other by some combination of flipping all bits at once, flipping the grid up-down or left-right, or rotating the grid in either direction), but they were not generated with equal frequency.&lt;/p&gt;
&lt;p&gt;I generated a large number of grids using the (broken) method and noticed that the frequencies don't match:&lt;/p&gt;
&lt;p&gt;&lt;img src="/images/prob_spin_glass_p2_1_bad_probs.png" alt="Probabilistic spin glass" style="width: 300px;"/&gt;&lt;/p&gt;
&lt;p&gt;This was a pretty big miss. But fixing this was relatively easy in the MC method: I just randomly applied a combination of the symmetry operations, and that fixed the frequencies:&lt;/p&gt;
&lt;p&gt;&lt;img src="/images/prob_spin_glass_p2_2_good_probs.png" alt="Probabilistic spin glass" style="width: 300px;"/&gt;&lt;/p&gt;
&lt;p&gt;At this point I wanted to be sure, so I verified that my construction process yields the correct $p$ probabilities within the grid (it does), and I used a MC simulation to plot the calibration curve:&lt;/p&gt;
&lt;p&gt;&lt;img src="/images/prob_spin_glass_p2_4_calibration.png" alt="Probabilistic spin glass" style="width: 300px;"/&gt;&lt;/p&gt;
&lt;h2&gt;Direct probabilities and entropy&lt;/h2&gt;
&lt;p&gt;I also wanted to plot the entropy as it varies with $N$ and $p$. In part I, I naively used the MC method, and generated $100 \times 2^{N \times N}$ grids in the hope of getting enough samples for meaningful frequencies. Then in part II I realized that I can just compute the direct probability of each possible grid, and from that the exact entropy. Interestingly, the plot is the same; only high probability configurations seem to contribute:&lt;/p&gt;
&lt;p&gt;&lt;img src="/images/prob_spin_glass_p2_5.png" alt="Probabilistic spin glass" style="width: 300px;"/&gt;&lt;/p&gt;
&lt;p&gt;This is interesting because it's the same shape as a random coin toss (ie. a $1 \times 1$ spin glass). If somebody asked me how to "scale up" a random coin toss to $N &amp;gt; 2$ outcomes while maintaining the concave shape of the entropy curve, I would be in trouble, but this is it.&lt;/p&gt;
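&lt;p&gt;As an illustration, the exact entropy of tiny grids can be computed by brute-force enumeration. This is a sketch assuming a simplified construction model (first spin uniform, each subsequent spin conditioned on its left/top neighbour); the direct probabilities in Part II may be computed differently:&lt;/p&gt;

```python
import itertools
import math

def grid_probability(grid, p):
    # probability of one grid under a simplified construction model:
    # the first spin is uniform, every other spin is conditioned on
    # its left neighbour (or the spin above it, in the first column)
    N = len(grid)
    prob = 0.5
    for i in range(N):
        for j in range(N):
            if i == 0 and j == 0:
                continue
            neighbour = grid[i][j - 1] if j else grid[i - 1][j]
            prob *= p if grid[i][j] == neighbour else 1 - p
    return prob

def exact_entropy(N, p):
    # enumerate all 2^(N*N) grids and sum -P * log2(P)
    total = 0.0
    for bits in itertools.product([0, 1], repeat=N * N):
        grid = [list(bits[i * N:(i + 1) * N]) for i in range(N)]
        P = grid_probability(grid, p)
        if P:
            total -= P * math.log2(P)
    return total
```

At $p=0.5$ every conditional is a fair coin, so the entropy is the maximal $N \times N$ bits; at $p=1$ only the two uniform grids survive, giving 1 bit.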
&lt;h2&gt;Dynamic behaviour&lt;/h2&gt;
&lt;p&gt;In part III, I wanted to see what happens if I take a spin glass (grid) and start changing the spins one-by-one, honoring the $p$ probability. To see what's going on, I rendered the process as a GIF:&lt;/p&gt;
&lt;p&gt;&lt;img src="/images/spin_glass_0.900.gif" alt="Probabilistic spin glass" style="width: 500px;"/&gt;&lt;/p&gt;
&lt;p&gt;Naively, by looking at the evolution, I assumed that if I generate a "typical" $(N,p)$ grid, and start evolving it, it will remain "close to the original", it will remain a "typical" $(N,p)$ grid. I believe this is naive for a number of reasons:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;there are typical grids for a given $(N,p)$ because they have a relatively high probability, but all $2^{N \times N}$ possible grids have non-zero probability&lt;/li&gt;
&lt;li&gt;in the same spirit, any evolution history has non-zero probability, including the one where all spins "spontaneously" go to all ↑ or all ↓, or into a nice chessboard pattern, irrespective of the current configuration. So the above assumptions, which I put into quotes, are quite wishy-washy and hand-wavy, and in fact I believe they are wrong (see below).&lt;/li&gt;
&lt;/ul&gt;
&lt;h2&gt;Fraction curve&lt;/h2&gt;
&lt;p&gt;I also plotted the fraction curve in part III; this shows the majority fraction $f$ of aligned spins (whichever direction there are more of, ↑ or ↓). This is interesting because it shows a non-linear relationship between $p$ and $f$:&lt;/p&gt;
&lt;p&gt;&lt;img src="/images/prob_spin_glass_p3_fractions_curve.png" alt="Probabilistic spin glass" style="width: 500px;"/&gt;&lt;/p&gt;
&lt;p&gt;However, I completely missed this plot's $N$-dependence in part III. In part IV I realized this:&lt;/p&gt;
&lt;p&gt;&lt;img src="/images/spin_glass_p4_2.png" alt="Probabilistic spin glass" style="width: 500px;"/&gt;&lt;/p&gt;
&lt;p&gt;The more spins there are (the larger the grid), the harder it is (the lower the probability) for more spins to align. Essentially, as $N$ gets larger, although islands of aligned spins form, eventually by chance the islands end, and an opposite spin island forms. Since the system is very large, and both spin directions are equally favored, overall the system will have roughly equal ↑ and ↓, and from far away, it will look like random noise (like TV noise). For large $N$, $p$ has to be very close to 1 for the majority fractions to move away from 0.5! This was a major insight for me, which came quite late, in part IV.&lt;/p&gt;
&lt;p&gt;However, in retrospect it is quite obvious. For example, drawing $500 \times 500$ grids, even the high $p$ ones are close to random noise. Contrast this with the low $N$ plots at the beginning.&lt;/p&gt;
&lt;p&gt;&lt;img src="/images/spin_glass_p4_1_500.png" alt="Probabilistic spin glass" style="width: 500px;"/&gt;&lt;/p&gt;
&lt;h2&gt;Attractors&lt;/h2&gt;
&lt;p&gt;In part V, I did a more thorough, but still simulation-based, investigation of the $p$- and $N$-dependence of the dynamic behaviour. Plotting the saturation trajectories for the different sized grids, for different $p$s (y-axis is % of majority-aligned spins, x-axis is frames/steps, each trajectory is a line, 100 lines for 100 trajectories):&lt;/p&gt;
&lt;p&gt;$10 \times 10$:&lt;/p&gt;
&lt;p&gt;&lt;img src="/images/spin_glass_p5_1.png" alt="Probabilistic spin glass" style="width: 800px;"/&gt;&lt;/p&gt;
&lt;p&gt;$20 \times 20$:&lt;/p&gt;
&lt;p&gt;&lt;img src="/images/spin_glass_p5_2.png" alt="Probabilistic spin glass" style="width: 800px;"/&gt;&lt;/p&gt;
&lt;p&gt;$50 \times 50$:&lt;/p&gt;
&lt;p&gt;&lt;img src="/images/spin_glass_p5_3.png" alt="Probabilistic spin glass" style="width: 800px;"/&gt;&lt;/p&gt;
&lt;p&gt;$100 \times 100$:&lt;/p&gt;
&lt;p&gt;&lt;img src="/images/spin_glass_p5_4.png" alt="Probabilistic spin glass" style="width: 800px;"/&gt;&lt;/p&gt;
&lt;p&gt;Plotting the saturation ratio (of the 100 trajectories), for different sized grids, at different $p$s:&lt;/p&gt;
&lt;p&gt;&lt;img src="/images/spin_glass_p5_5.png" alt="Probabilistic spin glass" style="width: 500px;"/&gt;&lt;/p&gt;
&lt;p&gt;This suggests:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;higher $p$ leads to higher likelihood of saturation&lt;/li&gt;
&lt;li&gt;higher $N$ makes it harder to saturate, ie. for a higher $N$ spin glass $p$ also has to be higher to saturate&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;However, this part of the investigation is inconclusive:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;I suspect that majority fraction $f=0.5$ and $f=1$ attractors are "sticky", ie. it's hard to get away from them. For low $p$s, it's hard to get away from $f=0.5$ and it seems it practically "never" happens, then at some critical $p$ it becomes "possible" or "likely", and in those cases I believe it creeps up to $f=1$, where it gets stuck (sticky ceiling)&lt;/li&gt;
&lt;li&gt;as I wrote earlier, probabilistically every configuration can go into any other configuration (assuming $0 &amp;lt; p &amp;lt; 1$), but I don't know how to express the apparent stickiness of the $f$ ceiling&lt;/li&gt;
&lt;li&gt;it's also not clear to me whether the critical $p$, if it exists, is the same for all grid sizes, and I just have to run the simulations longer to see the same behaviour, or it differs&lt;/li&gt;
&lt;/ul&gt;
&lt;h2&gt;Conclusion&lt;/h2&gt;
&lt;p&gt;This has been an interesting investigation, conducted purely out of intellectual curiosity. I may revisit it once I have some new ideas, especially on how to quantitatively classify the long-term dynamic behaviour and its potential attractors.&lt;/p&gt;</content><category term="entropy"></category><category term="physics"></category><category term="spin"></category><category term="glass"></category></entry><entry><title>Optimal coverage for Wordle with Monte Carlo methods - Part III</title><link href="https://bytepawn.com/optimal-coverage-for-wordle-with-monte-carlo-methods-part-iii.html" rel="alternate"></link><published>2022-01-22T00:00:00+01:00</published><updated>2022-01-22T00:00:00+01:00</updated><author><name>Marton Trencseni</name></author><id>tag:bytepawn.com,2022-01-22:/optimal-coverage-for-wordle-with-monte-carlo-methods-part-iii.html</id><summary type="html">&lt;p&gt;I present a simple Monte Carlo solution which finds a 25-letter-unique Wordle wordlist in about 10 minutes. &lt;br/&gt;&lt;br/&gt;&lt;img src="/images/wordle_3.PNG" alt="Wordle" style="width: 300px;"/&gt;&lt;/p&gt;</summary><content type="html">&lt;h2&gt;Introduction&lt;/h2&gt;
&lt;p&gt;In the &lt;a href="https://bytepawn.com/optimal-coverage-for-wordle-with-monte-carlo-methods-part-ii.html"&gt;previous article&lt;/a&gt;, we saw that a relatively simple Monte Carlo method can find a lot of 24-letter-unique wordlists, about 1 per second running at 16x parallelism. Here I use a crucial insight about the English language to make the search more efficient, and find a 25-letter-unique Wordle wordlist. The &lt;a href="https://github.com/mtrencseni/playground/blob/master/Wordle%20vowels%20MT.ipynb"&gt;IPython notebook is up on GitHub&lt;/a&gt;.&lt;/p&gt;
&lt;h2&gt;Vowels&lt;/h2&gt;
&lt;p&gt;The key insight came from my friend and colleague &lt;a href="https://rkakodker.medium.com/"&gt;Rameez Kakodker&lt;/a&gt;, who introduced me to Wordle. We were discussing methods for solving this problem, when he said that vowels are likely to duplicate. This led me to the following idea: to get a 25-letter-unique solution, we need 5 words with 25 unique letters. But if a word has two (or more) vowels, like &lt;code&gt;BRAVE&lt;/code&gt;, it is very unlikely that this word would be in such a wordlist, because it contains both &lt;code&gt;A&lt;/code&gt; and &lt;code&gt;E&lt;/code&gt;. There are only 5+1 vowels in English: &lt;code&gt;AIEUO&lt;/code&gt;+&lt;code&gt;Y&lt;/code&gt;, so ignoring &lt;code&gt;Y&lt;/code&gt;, if a word uses up 2 vowels, like &lt;code&gt;BRAVE&lt;/code&gt;, then we won't have enough vowels left for all 5 words.&lt;/p&gt;
&lt;p&gt;So the idea is:&lt;/p&gt;
&lt;ol&gt;
&lt;li&gt;Like before, prune the Wordle wordlist, and keep only words which have 5 unique letters.&lt;/li&gt;
&lt;li&gt;Further prune the wordlist, and keep only words that have exactly 1 vowel. So eg. &lt;code&gt;BRAVE&lt;/code&gt; and &lt;code&gt;ALIEN&lt;/code&gt; are pruned, but &lt;code&gt;STICK&lt;/code&gt; is kept.&lt;/li&gt;
&lt;li&gt;When searching for the 5 words, always pick 5 words (from the above pruned words) which each contain one of the 5 vowels &lt;code&gt;AIEUO&lt;/code&gt;. In other words, treat the 5-word list like 5 slots, and always pick an &lt;code&gt;A&lt;/code&gt;-word for the first slot, an &lt;code&gt;I&lt;/code&gt;-word for the second slot, and so on, eg. &lt;code&gt;['bawty', 'crimp', 'fjeld', 'vughy', 'zonks']&lt;/code&gt;&lt;/li&gt;
&lt;/ol&gt;
&lt;p&gt;Step 1 and 2 in code:&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;span class="n"&gt;vowels&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nb"&gt;sorted&lt;/span&gt;&lt;span class="p"&gt;([&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;a&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s1"&gt;&amp;#39;e&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s1"&gt;&amp;#39;i&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s1"&gt;&amp;#39;o&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s1"&gt;&amp;#39;u&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;])&lt;/span&gt;

&lt;span class="n"&gt;ltw&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;defaultdict&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nb"&gt;set&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="c1"&gt;# letter-to-word index&lt;/span&gt;
&lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;word&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;words&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
    &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="nb"&gt;len&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nb"&gt;set&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nb"&gt;list&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;word&lt;/span&gt;&lt;span class="p"&gt;)))&lt;/span&gt; &lt;span class="o"&gt;==&lt;/span&gt; &lt;span class="nb"&gt;len&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;word&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt; &lt;span class="c1"&gt;# make sure all 5 letters are unique&lt;/span&gt;
        &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="nb"&gt;len&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nb"&gt;set&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;vowels&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;&amp;amp;&lt;/span&gt; &lt;span class="nb"&gt;set&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nb"&gt;list&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;word&lt;/span&gt;&lt;span class="p"&gt;)))&lt;/span&gt; &lt;span class="o"&gt;==&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="c1"&gt;# make sure there is exactly 1 vowel in the word&lt;/span&gt;
            &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;letter&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;word&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
                &lt;span class="n"&gt;ltw&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;letter&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;add&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;word&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="nb"&gt;len&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;words&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt; &lt;span class="nb"&gt;sum&lt;/span&gt;&lt;span class="p"&gt;([&lt;/span&gt;&lt;span class="nb"&gt;len&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;v&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;_&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;v&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;ltw&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;items&lt;/span&gt;&lt;span class="p"&gt;()])&lt;/span&gt;
&lt;/pre&gt;&lt;/div&gt;


&lt;p&gt;For step 3, we will use the same method as before: randomly generate wordlists, and if we find one that is promising (22 or more unique letters), then stick to that and try a few variations:&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;solve_wordle&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;num_tests&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mi"&gt;1000&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;num_improve_attempts&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mi"&gt;1000&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;random_seed&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
    &lt;span class="n"&gt;random&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;seed&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;random_seed&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt; &lt;span class="mi"&gt;131071&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="n"&gt;solutions&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[]&lt;/span&gt;
    &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;_&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="nb"&gt;range&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;num_tests&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
        &lt;span class="n"&gt;wordlist&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;random&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;sample&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;ltw&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;vowel&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;)[&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;vowel&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;vowels&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;
        &lt;span class="n"&gt;letters&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nb"&gt;set&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nb"&gt;list&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;&amp;#39;&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;join&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;wordlist&lt;/span&gt;&lt;span class="p"&gt;)))&lt;/span&gt;
        &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="nb"&gt;len&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;letters&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;&amp;gt;=&lt;/span&gt; &lt;span class="mi"&gt;22&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
            &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;_&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="nb"&gt;range&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;num_improve_attempts&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
                &lt;span class="n"&gt;v&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;random&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;randint&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nb"&gt;len&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;vowels&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;-&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="c1"&gt;# pick out one of the vowels randomly&lt;/span&gt;
                &lt;span class="n"&gt;word&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;random&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;sample&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;ltw&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;vowels&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;v&lt;/span&gt;&lt;span class="p"&gt;]],&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;)[&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="c1"&gt;# pick a new word which contains this vowel&lt;/span&gt;
                &lt;span class="n"&gt;new_wordlist&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;deepcopy&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;wordlist&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
                &lt;span class="n"&gt;new_wordlist&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;v&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;word&lt;/span&gt;
                &lt;span class="n"&gt;new_letters&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nb"&gt;set&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nb"&gt;list&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;&amp;#39;&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;join&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;new_wordlist&lt;/span&gt;&lt;span class="p"&gt;)))&lt;/span&gt;
                &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="nb"&gt;len&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;new_letters&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;&amp;gt;&lt;/span&gt; &lt;span class="nb"&gt;len&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;letters&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
                    &lt;span class="n"&gt;wordlist&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;letters&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;new_wordlist&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;new_letters&lt;/span&gt;
            &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="nb"&gt;len&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;letters&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;&amp;gt;=&lt;/span&gt; &lt;span class="mi"&gt;24&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
                &lt;span class="k"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nb"&gt;len&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;letters&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt; &lt;span class="n"&gt;wordlist&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
                &lt;span class="n"&gt;solutions&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;append&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;wordlist&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;solutions&lt;/span&gt;
&lt;/pre&gt;&lt;/div&gt;


&lt;p&gt;The rest of the code is as before, using &lt;code&gt;joblib&lt;/code&gt; to run in parallel:&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;flatten_list&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;li&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;item&lt;/span&gt; &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;sublist&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;li&lt;/span&gt; &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;item&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;sublist&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;

&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;clean_solutions&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;solutions&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;t&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;split&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;,&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;t&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="nb"&gt;sorted&lt;/span&gt;&lt;span class="p"&gt;({&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;,&amp;#39;&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;join&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nb"&gt;sorted&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;c&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt; &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;c&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;solutions&lt;/span&gt;&lt;span class="p"&gt;})]&lt;/span&gt;

&lt;span class="c1"&gt;# single-threaded:&lt;/span&gt;
&lt;span class="c1"&gt;# solve_wordle(num_tests=10*1000*1000, num_improve_attempts=1*1000*1000)&lt;/span&gt;

&lt;span class="c1"&gt;# multi-threaded:&lt;/span&gt;
&lt;span class="n"&gt;n_jobs&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;16&lt;/span&gt;
&lt;span class="n"&gt;num_tests&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;10&lt;/span&gt;&lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="mi"&gt;1000&lt;/span&gt;&lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="mi"&gt;1000&lt;/span&gt;
&lt;span class="n"&gt;num_improve_attempts&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;100&lt;/span&gt;&lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="mi"&gt;1000&lt;/span&gt;
&lt;span class="n"&gt;solutions&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;Parallel&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;n_jobs&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;n_jobs&lt;/span&gt;&lt;span class="p"&gt;)(&lt;/span&gt;&lt;span class="n"&gt;delayed&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;solve_wordle&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
                                    &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;num_tests&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;num_tests&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
                                     &lt;span class="n"&gt;num_improve_attempts&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;num_improve_attempts&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
                                     &lt;span class="n"&gt;random_seed&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;i&lt;/span&gt;&lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="n"&gt;random&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;randint&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;100&lt;/span&gt;&lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="mi"&gt;1000&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;
                                    &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;i&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="nb"&gt;range&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;n_jobs&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;
&lt;span class="n"&gt;solutions&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;clean_solutions&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;flatten_list&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;solutions&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;
&lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;wordlist&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;solutions&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
    &lt;span class="n"&gt;num_unique_letters&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nb"&gt;len&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nb"&gt;set&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nb"&gt;list&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;&amp;#39;&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;join&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;wordlist&lt;/span&gt;&lt;span class="p"&gt;))))&lt;/span&gt;
    &lt;span class="k"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;f&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;{num_unique_letters}-unique-letter solution: {wordlist}&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;num_unique_letters&lt;/span&gt; &lt;span class="o"&gt;==&lt;/span&gt; &lt;span class="mi"&gt;25&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
        &lt;span class="k"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;*** JACKPOT! ***&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="k"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;f&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;Found {len(solutions)} solutions...&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/pre&gt;&lt;/div&gt;


&lt;p&gt;On my 12-core AMD system, after running it for 13 minutes, I got:&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;24-unique-letter solution: [&amp;#39;backs&amp;#39;, &amp;#39;flint&amp;#39;, &amp;#39;grews&amp;#39;, &amp;#39;jumpy&amp;#39;, &amp;#39;vozhd&amp;#39;]
24-unique-letter solution: [&amp;#39;balky&amp;#39;, &amp;#39;cinqs&amp;#39;, &amp;#39;grump&amp;#39;, &amp;#39;vozhd&amp;#39;, &amp;#39;wheft&amp;#39;]
...
24-unique-letter solution: [&amp;#39;brick&amp;#39;, &amp;#39;flews&amp;#39;, &amp;#39;jumpy&amp;#39;, &amp;#39;thanx&amp;#39;, &amp;#39;vozhd&amp;#39;]
25-unique-letter solution: [&amp;#39;brick&amp;#39;, &amp;#39;glent&amp;#39;, &amp;#39;jumpy&amp;#39;, &amp;#39;vozhd&amp;#39;, &amp;#39;waqfs&amp;#39;]
*** JACKPOT! ***
24-unique-letter solution: [&amp;#39;brick&amp;#39;, &amp;#39;gulpy&amp;#39;, &amp;#39;meynt&amp;#39;, &amp;#39;vozhd&amp;#39;, &amp;#39;waqfs&amp;#39;]
24-unique-letter solution: [&amp;#39;bring&amp;#39;, &amp;#39;chomp&amp;#39;, &amp;#39;junky&amp;#39;, &amp;#39;veldt&amp;#39;, &amp;#39;waqfs&amp;#39;]
...
24-unique-letter solution: [&amp;#39;jumby&amp;#39;, &amp;#39;knelt&amp;#39;, &amp;#39;pyric&amp;#39;, &amp;#39;vozhd&amp;#39;, &amp;#39;waqfs&amp;#39;]
24-unique-letter solution: [&amp;#39;jumby&amp;#39;, &amp;#39;prong&amp;#39;, &amp;#39;veldt&amp;#39;, &amp;#39;wakfs&amp;#39;, &amp;#39;zilch&amp;#39;]
Found 249 solutions...
&lt;/pre&gt;&lt;/div&gt;


&lt;h2&gt;Jackpot!&lt;/h2&gt;
&lt;p&gt;The following wordlist has 25 unique letters: &lt;code&gt;['brick', 'glent', 'jumpy', 'vozhd', 'waqfs']&lt;/code&gt;. The English alphabet has 26 letters; the one missing letter is &lt;code&gt;x&lt;/code&gt;. Playing these 5 words in Wordle will usually reveal most of the solution's letters, and probably a few positions. For example, for today's Wordle:&lt;/p&gt;
&lt;p&gt;&lt;img src="/images/wordle_3.PNG" alt="World" style="width: 300px;"/&gt;&lt;/p&gt;
&lt;p&gt;So, without any thinking, we already know all 5 letters (&lt;code&gt;ICENW&lt;/code&gt;), and 2 positions (&lt;code&gt;W**C*&lt;/code&gt;). The word is &lt;code&gt;WINCE&lt;/code&gt;:&lt;/p&gt;
&lt;p&gt;&lt;img src="/images/wordle_4.PNG" alt="World" style="width: 300px;"/&gt;&lt;/p&gt;
&lt;h2&gt;Words&lt;/h2&gt;
&lt;p&gt;3 of the 5 words are unknown to me, and 2 of them are not really English:&lt;/p&gt;
&lt;ol&gt;
&lt;li&gt;glent: &lt;em&gt;to move quickly, especially in an oblique direction&lt;/em&gt;&lt;/li&gt;
&lt;li&gt;vozhd: &lt;em&gt;a Russian leader (the word is Russian, and means "to lead")&lt;/em&gt;&lt;/li&gt;
&lt;li&gt;waqfs: &lt;em&gt;plural of waqf (probably from Arabic), a Muslim religious or charitable foundation created by an endowed trust fund&lt;/em&gt;&lt;/li&gt;
&lt;/ol&gt;
&lt;h2&gt;Conclusion&lt;/h2&gt;
&lt;p&gt;I find it pleasing that hard problems can be solved on a modern desktop computer by brute-forcing with some minimal insights and minor optimizations.&lt;/p&gt;</content><category term="wordle"></category><category term="monte-carlo"></category></entry><entry><title>Optimal coverage for Wordle with Monte Carlo methods - Part II</title><link href="https://bytepawn.com/optimal-coverage-for-wordle-with-monte-carlo-methods-part-ii.html" rel="alternate"></link><published>2022-01-21T00:00:00+01:00</published><updated>2022-01-21T00:00:00+01:00</updated><author><name>Marton Trencseni</name></author><id>tag:bytepawn.com,2022-01-21:/optimal-coverage-for-wordle-with-monte-carlo-methods-part-ii.html</id><summary type="html">&lt;p&gt;I improve on the previous brute-force Monte Carlo approach for attacking the Wordle coverage problem. &lt;br/&gt;&lt;br/&gt;&lt;img src="/images/wordle_2.PNG" alt="Wordle" style="width: 300px;"/&gt;&lt;/p&gt;</summary><content type="html">&lt;h2&gt;Introduction&lt;/h2&gt;
&lt;p&gt;In the &lt;a href="https://bytepawn.com/optimal-coverage-for-wordle-with-monte-carlo-methods.html"&gt;previous article&lt;/a&gt;, we saw that the simplest brute-force approach, when run in parallel, yields 21-22-unique-letter solutions to Wordle. Let's look at simple ways to get 23-24-unique-letter solutions. The &lt;a href="https://github.com/mtrencseni/playground/blob/master/Wordle%20Fully%20MT.ipynb"&gt;IPython notebook is up on GitHub&lt;/a&gt;.&lt;/p&gt;
&lt;h2&gt;Prune the dictionary&lt;/h2&gt;
&lt;p&gt;If we want to get a lot of unique letters, we need unique letters in the dictionary words. For example, &lt;code&gt;HELLO&lt;/code&gt; is wasteful in the sense that it already has a duplicate. So let's only keep words where all 5 letters are unique. While we do this, let's build a letter-to-word index &lt;code&gt;ltw&lt;/code&gt;, which maps each letter to the set of words that contain it:&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;span class="n"&gt;ltw&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;defaultdict&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nb"&gt;set&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="c1"&gt;# letter-to-word index&lt;/span&gt;
&lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;word&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;words&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
    &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="nb"&gt;len&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nb"&gt;set&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nb"&gt;list&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;word&lt;/span&gt;&lt;span class="p"&gt;)))&lt;/span&gt; &lt;span class="o"&gt;==&lt;/span&gt; &lt;span class="nb"&gt;len&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;word&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt; &lt;span class="c1"&gt;# make sure all 5 letters are unique&lt;/span&gt;
        &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;letter&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;word&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
            &lt;span class="n"&gt;ltw&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;letter&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;add&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;word&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/pre&gt;&lt;/div&gt;
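&lt;p&gt;On a toy word list, the pruning and the index behave like this (a sketch; the real code above runs over the full Wordle dictionary):&lt;/p&gt;

```python
from collections import defaultdict

words = ['hello', 'brick', 'raise']  # toy stand-in for the Wordle dictionary
ltw = defaultdict(set)               # letter-to-word index
for word in words:
    if len(set(word)) == len(word):  # keep only words whose 5 letters are all unique
        for letter in word:
            ltw[letter].add(word)

print('hello' in ltw['h'])  # False: pruned, 'hello' has a duplicate 'l'
print(sorted(ltw['r']))     # ['brick', 'raise']
```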


&lt;h2&gt;Two-stage Monte-Carlo&lt;/h2&gt;
&lt;p&gt;Let's take the following approach:&lt;/p&gt;
&lt;ol&gt;
&lt;li&gt;Select 5 unique random letters.&lt;/li&gt;
&lt;li&gt;For each letter, take a random word to generate a 5-word candidate wordlist.&lt;/li&gt;
&lt;li&gt;If the number of unique letters is less than 22, go to 1.&lt;/li&gt;
&lt;li&gt;Else, this looks promising, go to stage 2, then go to 1.&lt;/li&gt;
&lt;li&gt;Stage 2: Given the promising wordlist, try to improve it. Select a letter which is not currently in the wordlist, and replace one of the words with a word that contains this missing letter. If the new wordlist is better than the old, continue stage 2 with that. Perform this stage 2 search a million (or more) times.&lt;/li&gt;
&lt;/ol&gt;
&lt;p&gt;In code:&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;solve_wordle&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;num_tests&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;random_seed&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;num_loops&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
    &lt;span class="c1"&gt;# stage 1:&lt;/span&gt;
    &lt;span class="n"&gt;seed_solutions&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;generate_seed_solutions&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;num_tests&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;random_seed&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="c1"&gt;# stage 2:&lt;/span&gt;
    &lt;span class="n"&gt;solutions&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;run_improve_loops&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;deepcopy&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;seed_solutions&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt; &lt;span class="n"&gt;num_loops&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;solutions&lt;/span&gt;

&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;generate_seed_solutions&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;num_tests&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;random_seed&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
    &lt;span class="n"&gt;random&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;seed&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;random_seed&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt; &lt;span class="mi"&gt;131071&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="n"&gt;seed_solutions&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[]&lt;/span&gt;
    &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;_&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="nb"&gt;range&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;num_tests&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
        &lt;span class="n"&gt;random_letters&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;random&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;sample&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;ltw&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;keys&lt;/span&gt;&lt;span class="p"&gt;(),&lt;/span&gt; &lt;span class="mi"&gt;5&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="c1"&gt;# 5 random letters&lt;/span&gt;
        &lt;span class="n"&gt;wordlist&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nb"&gt;sorted&lt;/span&gt;&lt;span class="p"&gt;([&lt;/span&gt;&lt;span class="n"&gt;random&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;sample&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;ltw&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;letter&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;)[&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;letter&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;random_letters&lt;/span&gt;&lt;span class="p"&gt;])&lt;/span&gt;
        &lt;span class="n"&gt;letters&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nb"&gt;set&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nb"&gt;list&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;&amp;#39;&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;join&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;wordlist&lt;/span&gt;&lt;span class="p"&gt;)))&lt;/span&gt;
        &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="nb"&gt;len&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;letters&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;&amp;gt;=&lt;/span&gt; &lt;span class="mi"&gt;22&lt;/span&gt; &lt;span class="ow"&gt;and&lt;/span&gt; &lt;span class="n"&gt;wordlist&lt;/span&gt; &lt;span class="ow"&gt;not&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;seed_solutions&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
            &lt;span class="n"&gt;seed_solutions&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;append&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;wordlist&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;seed_solutions&lt;/span&gt;

&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;run_improve_loops&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;seed_solutions&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;num_loops&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
    &lt;span class="n"&gt;loop_wordlist&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;solutions&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;seed_solutions&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="p"&gt;[]&lt;/span&gt;
    &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;i&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="nb"&gt;range&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;num_loops&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
        &lt;span class="n"&gt;additionals&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[]&lt;/span&gt;
        &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;j&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;wordlist&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="nb"&gt;enumerate&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;loop_wordlist&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
            &lt;span class="n"&gt;new_wordlists&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;sub_solutions&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;generate_wordlists&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;wordlist&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
            &lt;span class="n"&gt;additionals&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;extend&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;new_wordlists&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
            &lt;span class="n"&gt;solutions&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;extend&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;sub_solutions&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
        &lt;span class="n"&gt;loop_wordlist&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;clean_solutions&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;additionals&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;clean_solutions&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;solutions&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;generate_wordlists&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;wordlist&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
    &lt;span class="n"&gt;new_wordlists&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;solutions&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[],&lt;/span&gt; &lt;span class="p"&gt;[]&lt;/span&gt;
    &lt;span class="n"&gt;all_letters&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nb"&gt;set&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nb"&gt;list&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;string&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;ascii_lowercase&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;
    &lt;span class="n"&gt;letters&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nb"&gt;set&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nb"&gt;list&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;&amp;#39;&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;join&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;wordlist&lt;/span&gt;&lt;span class="p"&gt;)))&lt;/span&gt;
    &lt;span class="n"&gt;remaining_letters&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;all_letters&lt;/span&gt; &lt;span class="o"&gt;-&lt;/span&gt; &lt;span class="n"&gt;letters&lt;/span&gt;
    &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;letter&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;remaining_letters&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
        &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;new_word&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;ltw&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;letter&lt;/span&gt;&lt;span class="p"&gt;]:&lt;/span&gt;
            &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;old_word&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;wordlist&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
                &lt;span class="n"&gt;new_wordlist&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nb"&gt;sorted&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nb"&gt;list&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nb"&gt;set&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;wordlist&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;-&lt;/span&gt; &lt;span class="nb"&gt;set&lt;/span&gt;&lt;span class="p"&gt;([&lt;/span&gt;&lt;span class="n"&gt;old_word&lt;/span&gt;&lt;span class="p"&gt;])&lt;/span&gt; &lt;span class="o"&gt;|&lt;/span&gt; &lt;span class="nb"&gt;set&lt;/span&gt;&lt;span class="p"&gt;([&lt;/span&gt;&lt;span class="n"&gt;new_word&lt;/span&gt;&lt;span class="p"&gt;])))&lt;/span&gt;
                &lt;span class="n"&gt;new_letters&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nb"&gt;set&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nb"&gt;list&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;&amp;#39;&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;join&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;new_wordlist&lt;/span&gt;&lt;span class="p"&gt;)))&lt;/span&gt;
                &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="nb"&gt;len&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;new_letters&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;&amp;gt;=&lt;/span&gt; &lt;span class="nb"&gt;len&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;letters&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
                    &lt;span class="n"&gt;new_wordlists&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;append&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;new_wordlist&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
                &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="nb"&gt;len&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;new_letters&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;&amp;gt;=&lt;/span&gt; &lt;span class="mi"&gt;24&lt;/span&gt; &lt;span class="ow"&gt;and&lt;/span&gt; &lt;span class="n"&gt;new_wordlist&lt;/span&gt; &lt;span class="ow"&gt;not&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;solutions&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
                    &lt;span class="n"&gt;solutions&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;append&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;new_wordlist&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;new_wordlists&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;solutions&lt;/span&gt;
&lt;/pre&gt;&lt;/div&gt;


&lt;p&gt;We can run this independently in parallel:&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;span class="n"&gt;n_jobs&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;16&lt;/span&gt;
&lt;span class="n"&gt;num_tests&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;10&lt;/span&gt;&lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="mi"&gt;1000&lt;/span&gt;&lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="mi"&gt;1000&lt;/span&gt;
&lt;span class="n"&gt;solutions&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;Parallel&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;n_jobs&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;n_jobs&lt;/span&gt;&lt;span class="p"&gt;)(&lt;/span&gt;&lt;span class="n"&gt;delayed&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;solve_wordle&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
                        &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;num_tests&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;num_tests&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;random_seed&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;i&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;num_loops&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mi"&gt;2&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
                        &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;i&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="nb"&gt;range&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;n_jobs&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;
&lt;span class="n"&gt;solutions&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;clean_solutions&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;flatten_list&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;solutions&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;
&lt;/pre&gt;&lt;/div&gt;


&lt;p&gt;This yields, in 33 minutes of runtime:&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;24-unique-letter solution: [&amp;#39;ablet&amp;#39;, &amp;#39;crunk&amp;#39;, &amp;#39;jimpy&amp;#39;, &amp;#39;vozhd&amp;#39;, &amp;#39;waqfs&amp;#39;]
24-unique-letter solution: [&amp;#39;absit&amp;#39;, &amp;#39;fling&amp;#39;, &amp;#39;jumpy&amp;#39;, &amp;#39;vozhd&amp;#39;, &amp;#39;wreck&amp;#39;]
...
24-unique-letter solution: [&amp;#39;jumps&amp;#39;, &amp;#39;qubit&amp;#39;, &amp;#39;vozhd&amp;#39;, &amp;#39;wreck&amp;#39;, &amp;#39;xylan&amp;#39;]
24-unique-letter solution: [&amp;#39;micky&amp;#39;, &amp;#39;pelfs&amp;#39;, &amp;#39;twang&amp;#39;, &amp;#39;urbex&amp;#39;, &amp;#39;vozhd&amp;#39;]
Found 1863 solutions...
&lt;/pre&gt;&lt;/div&gt;


&lt;p&gt;So this approach yields about one 24-letter-unique solution per second.&lt;/p&gt;
&lt;h2&gt;Conclusion&lt;/h2&gt;
&lt;p&gt;In the next article, I will show a Monte-Carlo approach which finds the jackpot, 25-unique-letter solution(s) in a few minutes.&lt;/p&gt;</content><category term="wordle"></category><category term="monte-carlo"></category></entry><entry><title>Optimal coverage for Wordle with Monte Carlo methods - Part I</title><link href="https://bytepawn.com/optimal-coverage-for-wordle-with-monte-carlo-methods.html" rel="alternate"></link><published>2022-01-19T00:00:00+01:00</published><updated>2022-01-19T00:00:00+01:00</updated><author><name>Marton Trencseni</name></author><id>tag:bytepawn.com,2022-01-19:/optimal-coverage-for-wordle-with-monte-carlo-methods.html</id><summary type="html">&lt;p&gt;I show a simple brute-force Monte Carlo approach for attacking the Wordle coverage problem. &lt;br/&gt;&lt;br/&gt;&lt;img src="/images/wordle_1.PNG" alt="Wordle" style="width: 300px;"/&gt;&lt;/p&gt;</summary><content type="html">&lt;h2&gt;Introduction&lt;/h2&gt;
&lt;p&gt;&lt;a href="https://www.powerlanguage.co.uk/wordle/"&gt;Wordle&lt;/a&gt; is a simple 5x5 word guessing game. It's free, a new puzzle is posted every day. You have to guess an unknown 5 letter word, like &lt;code&gt;PROXY&lt;/code&gt;. You have 5 guesses, and after each guess, the game tells you if you got one of the letters, but it's not at the right position (brown), or it's the right letter in the right position (green). Only words from the Wordle dictionary are accepted as guesses, ie. &lt;code&gt;HELLO&lt;/code&gt; and &lt;code&gt;WORLD&lt;/code&gt; are valid guesses, but &lt;code&gt;ABCDE&lt;/code&gt; is not an accepted guess.&lt;/p&gt;
&lt;p&gt;&lt;img src="/images/wordle_1.PNG" alt="World" style="width: 300px;"/&gt;&lt;/p&gt;
&lt;p&gt;The list of Wordle dictionary words is known, it's a set of 12,972 words.&lt;/p&gt;
&lt;h2&gt;Optimal coverage challenge&lt;/h2&gt;
&lt;p&gt;One strategy is to analyze the set of words and the frequencies of letters, and given what we know so far, pick the word [from the Wordle dictionary] to guess that maximizes the average information gain of the next guess, assuming that the target word is randomly selected [from the Wordle dictionary]. One of the inputs to this is an assumption about how the player values knowing the correct position of the letters (green) versus just knowing the set of letters making up the word (brown).&lt;/p&gt;
&lt;p&gt;There is a coincidence (?) in the structure of the game: the English alphabet has 26 letters, and we can guess a total of 25 characters. If we can find 5 words which share no identical characters, then we can always use these 5 words as our 5 guesses, and assuming the target word has N unique letters, we will always know at least N-1 of them. E.g. if the word is &lt;code&gt;HELLO&lt;/code&gt;, and after our 5 guesses we know that &lt;code&gt;H&lt;/code&gt; &lt;code&gt;E&lt;/code&gt; &lt;code&gt;L&lt;/code&gt; and &lt;code&gt;O&lt;/code&gt; are in the word, then:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;either the word has a duplicate letter, one of those 4,&lt;/li&gt;
&lt;li&gt;or the word has a fifth unique letter, the 26th letter that was not included in our 5 guesses (which covered 25).&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;We don't know which of the above is the case, but most of the time we can assume a reasonable player will be able to guess the target word at this point, also taking into account that on average a few letters' positions will be known.&lt;/p&gt;
&lt;p&gt;So the question is: from the Wordle dictionary, can we find 5 words such that all 25 letters are unique? If not, what is the maximum achievable unique-count?&lt;/p&gt;
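&lt;p&gt;Before trying anything clever, it's worth checking that exhaustive enumeration is out of the question:&lt;/p&gt;

```python
import math

num_words = 12972  # size of the Wordle dictionary
print(math.comb(num_words, 5))  # on the order of 10**18 combinations of 5 words
```

&lt;p&gt;So we cannot simply try every combination, hence the Monte Carlo approach below.&lt;/p&gt;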
&lt;h2&gt;Monte Carlo approach&lt;/h2&gt;
&lt;p&gt;The simplest Monte Carlo approach is to just randomly pick 5 words from the dictionary, and check how many unique letters we have:&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;span class="n"&gt;words&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="s2"&gt;&amp;quot;cigar&amp;quot;&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s2"&gt;&amp;quot;rebut&amp;quot;&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s2"&gt;&amp;quot;sissy&amp;quot;&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="o"&gt;...&lt;/span&gt; &lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="c1"&gt;# the Wordle dictionary&lt;/span&gt;
&lt;span class="n"&gt;num_tests&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;1000&lt;/span&gt;&lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="mi"&gt;1000&lt;/span&gt;
&lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;_&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="nb"&gt;range&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;num_tests&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
    &lt;span class="n"&gt;wordlist&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;random&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;sample&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;words&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;5&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="c1"&gt;# 5 random words&lt;/span&gt;
    &lt;span class="n"&gt;letters&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nb"&gt;set&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nb"&gt;list&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;&amp;#39;&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;join&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;wordlist&lt;/span&gt;&lt;span class="p"&gt;)))&lt;/span&gt;
    &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="nb"&gt;len&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;letters&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;&amp;gt;=&lt;/span&gt; &lt;span class="mi"&gt;21&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
        &lt;span class="k"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nb"&gt;len&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;letters&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt; &lt;span class="n"&gt;wordlist&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/pre&gt;&lt;/div&gt;


&lt;p&gt;It will print something like:&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;21 [&amp;#39;lotsa&amp;#39;, &amp;#39;finks&amp;#39;, &amp;#39;judge&amp;#39;, &amp;#39;macho&amp;#39;, &amp;#39;warby&amp;#39;]
21 [&amp;#39;muzak&amp;#39;, &amp;#39;vales&amp;#39;, &amp;#39;letch&amp;#39;, &amp;#39;bilgy&amp;#39;, &amp;#39;fjord&amp;#39;]
21 [&amp;#39;fagot&amp;#39;, &amp;#39;cushy&amp;#39;, &amp;#39;pawks&amp;#39;, &amp;#39;melon&amp;#39;, &amp;#39;diver&amp;#39;]
21 [&amp;#39;miter&amp;#39;, &amp;#39;munch&amp;#39;, &amp;#39;pawky&amp;#39;, &amp;#39;bodge&amp;#39;, &amp;#39;lifes&amp;#39;]
22 [&amp;#39;kight&amp;#39;, &amp;#39;flyby&amp;#39;, &amp;#39;roven&amp;#39;, &amp;#39;clasp&amp;#39;, &amp;#39;dwaum&amp;#39;]
&lt;/pre&gt;&lt;/div&gt;


&lt;p&gt;Let's try to be a little smarter.&lt;/p&gt;
&lt;h2&gt;Parallelization&lt;/h2&gt;
&lt;p&gt;The simplest, still brute-force improvement is to run the same thing, but in parallel. I have a 12-core (24-thread) CPU, so I can run with 16x parallelism without impacting the usability of my computer:&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;solve_wordle&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;num_tests&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mi"&gt;1000&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;random_seed&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
    &lt;span class="n"&gt;random&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;seed&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;random_seed&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt; &lt;span class="mi"&gt;131071&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="n"&gt;solutions&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[]&lt;/span&gt;
    &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;_&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="nb"&gt;range&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;num_tests&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
        &lt;span class="n"&gt;wordlist&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;random&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;sample&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;words&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;5&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="c1"&gt;# 5 random words&lt;/span&gt;
        &lt;span class="n"&gt;letters&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nb"&gt;set&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nb"&gt;list&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;&amp;#39;&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;join&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;wordlist&lt;/span&gt;&lt;span class="p"&gt;)))&lt;/span&gt;
        &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="nb"&gt;len&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;letters&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;&amp;gt;=&lt;/span&gt; &lt;span class="mi"&gt;21&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
            &lt;span class="n"&gt;solutions&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;append&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;wordlist&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;solutions&lt;/span&gt;

&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;flatten_list&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;li&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;item&lt;/span&gt; &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;sublist&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;li&lt;/span&gt; &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;item&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;sublist&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;

&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;clean_solutions&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;solutions&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;t&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;split&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;,&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;t&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="nb"&gt;sorted&lt;/span&gt;&lt;span class="p"&gt;({&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;,&amp;#39;&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;join&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nb"&gt;sorted&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;c&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt; &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;c&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;solutions&lt;/span&gt;&lt;span class="p"&gt;})]&lt;/span&gt;

&lt;span class="n"&gt;n_jobs&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;16&lt;/span&gt;
&lt;span class="n"&gt;num_tests&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="mi"&gt;1000&lt;/span&gt;&lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="mi"&gt;1000&lt;/span&gt;
&lt;span class="n"&gt;solutions&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;Parallel&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;n_jobs&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;n_jobs&lt;/span&gt;&lt;span class="p"&gt;)(&lt;/span&gt;&lt;span class="n"&gt;delayed&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;solve_wordle&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
                        &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;num_tests&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;num_tests&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;random_seed&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;i&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
                        &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;i&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="nb"&gt;range&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;n_jobs&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;
&lt;span class="n"&gt;solutions&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;clean_solutions&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;flatten_list&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;solutions&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;
&lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;wordlist&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;solutions&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
    &lt;span class="n"&gt;num_unique_letters&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nb"&gt;len&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nb"&gt;set&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nb"&gt;list&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;&amp;#39;&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;join&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;wordlist&lt;/span&gt;&lt;span class="p"&gt;))))&lt;/span&gt;
    &lt;span class="k"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;f&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;{num_unique_letters}-unique-letter solution: {wordlist}&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;num_unique_letters&lt;/span&gt; &lt;span class="o"&gt;==&lt;/span&gt; &lt;span class="mi"&gt;25&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
        &lt;span class="k"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;*** JACKPOT! ***&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="k"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;f&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;Found {len(solutions)} solutions...&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/pre&gt;&lt;/div&gt;


&lt;p&gt;&lt;a href="https://github.com/mtrencseni/playground/blob/master/Wordle%20-%20Part%201.ipynb"&gt;The ipython notebook is on Github.&lt;/a&gt; It prints something like:&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;21-unique-letter solution: [&amp;#39;admix&amp;#39;, &amp;#39;itchy&amp;#39;, &amp;#39;ovels&amp;#39;, &amp;#39;tupik&amp;#39;, &amp;#39;wrung&amp;#39;]
21-unique-letter solution: [&amp;#39;adzed&amp;#39;, &amp;#39;boxty&amp;#39;, &amp;#39;brugh&amp;#39;, &amp;#39;vinyl&amp;#39;, &amp;#39;wacks&amp;#39;]
 ...
21-unique-letter solution: [&amp;#39;fyces&amp;#39;, &amp;#39;goban&amp;#39;, &amp;#39;lurve&amp;#39;, &amp;#39;mewed&amp;#39;, &amp;#39;piths&amp;#39;]
21-unique-letter solution: [&amp;#39;light&amp;#39;, &amp;#39;packs&amp;#39;, &amp;#39;vends&amp;#39;, &amp;#39;wauff&amp;#39;, &amp;#39;womby&amp;#39;]
Found 47 solutions...
&lt;/pre&gt;&lt;/div&gt;


&lt;h2&gt;Conclusion&lt;/h2&gt;
&lt;p&gt;In the next part, I will show an improved Monte Carlo approach, which finds about one 24-unique-letter solution every second.&lt;/p&gt;</content><category term="wordle"></category><category term="monte-carlo"></category></entry><entry><title>Probabilistic spin glass - Part V</title><link href="https://bytepawn.com/probabilistic-spin-glass-part-v.html" rel="alternate"></link><published>2022-01-06T00:00:00+01:00</published><updated>2022-01-06T00:00:00+01:00</updated><author><name>Marton Trencseni</name></author><id>tag:bytepawn.com,2022-01-06:/probabilistic-spin-glass-part-v.html</id><summary type="html">&lt;p&gt;I use Monte Carlo simulations to explore the dynamic behaviour of probabilistic spin glasses, specifically how saturation scales with $p$ and $N$. &lt;br/&gt;&lt;br/&gt;&lt;img src="/images/spin_glass_p5_3.png" alt="." style="width: 300px;"/&gt;&lt;/p&gt;</summary><content type="html">&lt;h2&gt;Introduction&lt;/h2&gt;
&lt;p&gt;In the previous articles I looked at various properties of probabilistic spin glasses by simulating ensembles of many samples and computing various statistics, while in the case of entropy I computed probabilities directly. Then I let grids evolve dynamically over "time" by changing spins one by one. Most recently, in Part IV, I ran simulations to understand the ensemble scaling behaviour for large spin glasses:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;&lt;a href="https://bytepawn.com/probabilistic-spin-glass.html"&gt;Part I&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://bytepawn.com/probabilistic-spin-glass-part-ii.html"&gt;Part II&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://bytepawn.com/probabilistic-spin-glass-part-iii.html"&gt;Part III&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://bytepawn.com/probabilistic-spin-glass-part-iv.html"&gt;Part IV&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;Now I will revisit the dynamic behaviour and see how the saturation behaviour scales with $p$ and $N$. The &lt;a href="https://github.com/mtrencseni/playground/blob/master/Probabilistic%20spin%20glass%20-%20Part%205.ipynb"&gt;ipython notebook is on Github&lt;/a&gt;.&lt;/p&gt;
&lt;h2&gt;Saturation behaviour&lt;/h2&gt;
&lt;p&gt;Let's take a periodic $N \times N$ spin glass with $P(s=1|n=1)=p$ and let it evolve over time. One evolution step is taking a random spin, ignoring its current state, and based on the surrounding four states probabilistically setting a new state. Let's define a &lt;code&gt;frame&lt;/code&gt; to be $N \times N$ such steps, so that on average each spin gets one chance to change per frame.&lt;/p&gt;
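&lt;p&gt;As a minimal sketch of this dynamic (assuming a 0/1 NumPy grid and, for illustration only, an update rule that agrees with the four-neighbour majority with probability $p$; the notebook's exact rule may differ), one step and one frame look like:&lt;/p&gt;

```python
import numpy as np

def step(grid, p, rng):
    # One evolution step: pick a random spin and, ignoring its current
    # state, set it probabilistically from its four periodic neighbours.
    # Illustrative rule: agree with the neighbourhood majority with probability p.
    N = grid.shape[0]
    i, j = rng.integers(0, N, size=2)
    neighbours = [grid[(i - 1) % N, j], grid[(i + 1) % N, j],
                  grid[i, (j - 1) % N], grid[i, (j + 1) % N]]
    majority = 1 if sum(neighbours) >= 2 else 0  # ties go to 1
    grid[i, j] = majority if p > rng.random() else 1 - majority

def frame(grid, p, rng):
    # a frame is N x N steps, so each spin gets one chance on average
    for _ in range(grid.size):
        step(grid, p, rng)

rng = np.random.default_rng(0)
grid = rng.integers(0, 2, size=(10, 10))  # starting_p = 0.5 random noise
for _ in range(100):
    frame(grid, p=0.9, rng=rng)
```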
&lt;p&gt;What I am interested in here is the saturation behaviour of the spin glass. Saturation is when the majority of the spins align in one direction, whether that direction is ↑ or ↓. Since we're running probabilistic simulations, the ratio of aligned spins will not be exactly 1.0; we define a saturation threshold, above which we say that "essentially" all spins are aligned. For the smallest considered $10 \times 10$ spin glass let's use &lt;code&gt;saturation_limit=0.90&lt;/code&gt;, for larger ones &lt;code&gt;saturation_limit=0.95&lt;/code&gt;.&lt;/p&gt;
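&lt;p&gt;Measuring saturation itself is a one-liner; a minimal sketch, assuming a 0/1 NumPy grid:&lt;/p&gt;

```python
import numpy as np

def majority_ratio(grid):
    # fraction of spins aligned with the majority direction, in [0.5, 1.0]
    m = np.mean(grid)
    return max(m, 1 - m)

def is_saturated(grid, saturation_limit):
    return majority_ratio(grid) > saturation_limit

grid = np.ones((10, 10))
grid[0, 0] = 0                                   # 99 of 100 spins aligned
print(majority_ratio(grid))                      # 0.99
print(is_saturated(grid, saturation_limit=0.9))  # True
```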
&lt;p&gt;Let's take an $N \times N$ spin glass, and let it evolve for 1000 frames, using different $p$s. At each frame, let's plot the saturation, ie. the majority spin ratio. Let's repeat this for 100 different trajectories, and record the ratio of trajectories where the spin glass saturated (this will be shown in the plot titles like &lt;code&gt;sat=0.8&lt;/code&gt;, meaning 80% of trajectories saturated, ie. reached the &lt;code&gt;saturation_limit&lt;/code&gt; of aligned spins). The code for this:&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;get_saturation&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;N&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;p0&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;starting_ps&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;ps&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;num_frames&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
                   &lt;span class="n"&gt;n_jobs&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;num_trajectories_per_job&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;saturation_limit&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;draw_plots&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="bp"&gt;True&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
    &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;draw_plots&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;fig&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;axs&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;plt&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;subplots&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nb"&gt;len&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;ps&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt; &lt;span class="n"&gt;figsize&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;20&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;4&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;
    &lt;span class="n"&gt;results&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[]&lt;/span&gt;
    &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;i&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;starting_p&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;p&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="nb"&gt;enumerate&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nb"&gt;zip&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;starting_ps&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;ps&lt;/span&gt;&lt;span class="p"&gt;)):&lt;/span&gt;
        &lt;span class="n"&gt;trajectories&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;Parallel&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;n_jobs&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;n_jobs&lt;/span&gt;&lt;span class="p"&gt;)(&lt;/span&gt;&lt;span class="n"&gt;delayed&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;get_trajectories_periodic&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
                                &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;rows&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;N&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;cols&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;N&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;p0&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;p0&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;starting_p&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;p&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;p&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;p&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
                                 &lt;span class="n"&gt;num_trajectories&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;num_trajectories_per_job&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
                                 &lt;span class="n"&gt;num_frames&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;num_frames&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;random_seed&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;i&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
                                &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;i&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="nb"&gt;range&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;n_jobs&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;
        &lt;span class="n"&gt;trajectories&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nb"&gt;list&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;itertools&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;chain&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="n"&gt;trajectories&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;
        &lt;span class="n"&gt;saturation&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;np&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;mean&lt;/span&gt;&lt;span class="p"&gt;([&lt;/span&gt;&lt;span class="n"&gt;np&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;mean&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;t&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="o"&gt;-&lt;/span&gt;&lt;span class="mi"&gt;10&lt;/span&gt;&lt;span class="p"&gt;:])&lt;/span&gt; &lt;span class="o"&gt;&amp;gt;&lt;/span&gt; &lt;span class="n"&gt;saturation_limit&lt;/span&gt; &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;t&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;trajectories&lt;/span&gt;&lt;span class="p"&gt;])&lt;/span&gt;
        &lt;span class="n"&gt;results&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;append&lt;/span&gt;&lt;span class="p"&gt;((&lt;/span&gt;&lt;span class="n"&gt;N&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;p0&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;starting_p&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;p&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;saturation&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;trajectories&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;
        &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;trajectory&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;trajectories&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
            &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;draw_plots&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;axs&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;i&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;plot&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;trajectory&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
        &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;draw_plots&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;axs&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;i&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;set_ylim&lt;/span&gt;&lt;span class="p"&gt;((&lt;/span&gt;&lt;span class="mf"&gt;0.49&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mf"&gt;1.0&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;
        &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;draw_plots&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;axs&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;i&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;set_title&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;f&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;N={N}, start_p={starting_p:.3f}, p={p:.3f}, sat={saturation:.2f}&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;draw_plots&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;plt&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;show&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;results&lt;/span&gt;
&lt;/pre&gt;&lt;/div&gt;


&lt;p&gt;Note the use of &lt;code&gt;Parallel()&lt;/code&gt; to use all available cores for the Monte Carlo simulation. I ran this with 20x parallelism for a larger sample size.&lt;/p&gt;
&lt;p&gt;We can run this for different sized spin glasses. In each of the trajectories, let's start with a random noise &lt;code&gt;starting_p=0.5&lt;/code&gt; spin glass:&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;span class="c1"&gt;# fixed starting_p=0.5&lt;/span&gt;
&lt;span class="n"&gt;p0&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;starting_p&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;num_frames&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;n_jobs&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;num_trajectories_per_job&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mf"&gt;0.5&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mf"&gt;0.5&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;1000&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;20&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;5&lt;/span&gt;
&lt;span class="n"&gt;ps&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="mf"&gt;0.50&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mf"&gt;0.75&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mf"&gt;0.90&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mf"&gt;0.975&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;
&lt;span class="n"&gt;Ns&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="mi"&gt;10&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;20&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;50&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;100&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;
&lt;span class="n"&gt;saturation_limits&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="mf"&gt;0.9&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mf"&gt;0.95&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mf"&gt;0.95&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mf"&gt;0.95&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;
&lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;N&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;saturation_limit&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="nb"&gt;zip&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;Ns&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;saturation_limits&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
    &lt;span class="n"&gt;r&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;get_saturation&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;N&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;p0&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;starting_p&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt; &lt;span class="nb"&gt;len&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;ps&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt; &lt;span class="n"&gt;ps&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;num_frames&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
                         &lt;span class="n"&gt;n_jobs&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;num_trajectories_per_job&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;saturation_limit&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="n"&gt;results&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;extend&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;r&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/pre&gt;&lt;/div&gt;


&lt;p&gt;Plotting the saturation trajectories for the $10 \times 10$ case, for different $p$s (y-axis is % of majority aligned spins, x-axis is frame/steps, each trajectory is a line, 100 lines for 100 trajectories):&lt;/p&gt;
&lt;p&gt;&lt;img src="/images/spin_glass_p5_1.png" alt="Probabilistic spin glass" style="width: 800px;"/&gt;&lt;/p&gt;
&lt;p&gt;This shows that for the $10 \times 10$ spin glass (notice the &lt;code&gt;sat&lt;/code&gt; in the plot titles):&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;at $p=0.5$, none of the trajectories saturate within 1000 frames (first, left-most plot)&lt;/li&gt;
&lt;li&gt;at $p=0.75$, 67% of trajectories saturate within 1000 frames (second plot)&lt;/li&gt;
&lt;li&gt;at $p=0.9$ and higher, all trajectories saturate&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;Repeating this for the $20 \times 20$ spin glass:&lt;/p&gt;
&lt;p&gt;&lt;img src="/images/spin_glass_p5_2.png" alt="Probabilistic spin glass" style="width: 800px;"/&gt;&lt;/p&gt;
&lt;p&gt;$50 \times 50$:&lt;/p&gt;
&lt;p&gt;&lt;img src="/images/spin_glass_p5_3.png" alt="Probabilistic spin glass" style="width: 800px;"/&gt;&lt;/p&gt;
&lt;p&gt;$100 \times 100$:&lt;/p&gt;
&lt;p&gt;&lt;img src="/images/spin_glass_p5_4.png" alt="Probabilistic spin glass" style="width: 800px;"/&gt;&lt;/p&gt;
&lt;p&gt;What this suggests is that:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;Higher $p$ leads to higher likelihood of saturation.&lt;/li&gt;
&lt;li&gt;Higher $N$ makes it harder to saturate, ie. for a higher $N$ spin glass $p$ also has to be higher to saturate.&lt;/li&gt;
&lt;li&gt;The saturation ceiling is sticky. For all cases considered, it seems that once a system saturates, it's very hard to escape, ie. come back down from the saturation ceiling; of the trajectories shown, there is not a single case where this happens.&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;I will not show it here, but the notebook shows it: this behaviour is not dependent on &lt;code&gt;starting_p&lt;/code&gt;, the $p$ used to generate the initial grid in the trajectories. Whether &lt;code&gt;starting_p=p&lt;/code&gt; or &lt;code&gt;starting_p=0.5&lt;/code&gt; (random noise), the behaviour is the same (after 1000 frames).&lt;/p&gt;
&lt;p&gt;Let's stop looking at individual trajectories, and plot the saturation ratio (of the 100 trajectories), for different sized grids, at different $p$s:&lt;/p&gt;
&lt;p&gt;&lt;img src="/images/spin_glass_p5_5.png" alt="Probabilistic spin glass" style="width: 500px;"/&gt;&lt;/p&gt;
&lt;p&gt;This confirms what we saw on the trajectory plots:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;higher $p$ leads to higher likelihood of saturation&lt;/li&gt;
&lt;li&gt;higher $N$ makes it harder to saturate, ie. for a higher $N$ spin glass $p$ also has to be higher to saturate&lt;/li&gt;
&lt;/ul&gt;
&lt;h2&gt;Conclusion&lt;/h2&gt;
&lt;p&gt;The lower $N$ curves suggest that there is a critical $p$, above which trajectories are very likely to saturate. I suspect that this is the case for larger spin glasses too, but 1000 frames is not enough to see this; however, this is not clear, since the definition of a frame is $N$-dependent. Subsequent experiments to run:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;zoom in on the critical $p$ at various grid sizes&lt;/li&gt;
&lt;li&gt;see if the critical "floor to ceiling" behaviour is there for large $N$, at higher frame counts&lt;/li&gt;
&lt;li&gt;as the frame count goes to infinity, would all spin glasses at all $p$s saturate eventually?&lt;/li&gt;
&lt;/ul&gt;</content><category term="entropy"></category><category term="physics"></category><category term="spin"></category><category term="glass"></category></entry><entry><title>Probabilistic spin glass - Part IV</title><link href="https://bytepawn.com/probabilistic-spin-glass-part-iv.html" rel="alternate"></link><published>2021-12-31T00:00:00+01:00</published><updated>2021-12-31T00:00:00+01:00</updated><author><name>Marton Trencseni</name></author><id>tag:bytepawn.com,2021-12-31:/probabilistic-spin-glass-part-iv.html</id><summary type="html">&lt;p&gt;This is a continuation of the previous articles on probabilistic spin glasses. I run simulations to understand the scaling behaviour for large spin glasses. &lt;br/&gt;&lt;br/&gt;&lt;img src="/images/spin_glass_p4_2.png" alt="." style="width: 300px;"/&gt;&lt;/p&gt;</summary><content type="html">&lt;h2&gt;Introduction&lt;/h2&gt;
&lt;p&gt;In the previous articles (&lt;a href="https://bytepawn.com/probabilistic-spin-glass.html"&gt;Part I&lt;/a&gt;, &lt;a href="https://bytepawn.com/probabilistic-spin-glass-part-ii.html"&gt;Part II&lt;/a&gt;, &lt;a href="https://bytepawn.com/probabilistic-spin-glass-part-iii.html"&gt;Part III&lt;/a&gt;)  I looked at various properties of probabilistic spin glasses by simulating ensembles of many samples and computing various statistics, while in the case of entropy I computed probabilities directly. Then I let grids evolve over "time" by changing spins one by one.&lt;/p&gt;
&lt;p&gt;In this final article I will run simulations to understand the scaling behaviour for large spin glasses. The &lt;a href="https://github.com/mtrencseni/playground/blob/master/Probabilistic%20spin%20glass%20-%20Part%204.ipynb"&gt;ipython notebook is up on Github&lt;/a&gt;.&lt;/p&gt;
&lt;h2&gt;Scaling behaviour&lt;/h2&gt;
&lt;p&gt;Let's see what different size spin glasses look like at different $P(s=1|n=1)=p$ values. Here are $50 \times 50$ grids:&lt;/p&gt;
&lt;p&gt;&lt;img src="/images/spin_glass_p4_1_50.png" alt="Probabilistic spin glass" style="width: 800px;"/&gt;&lt;/p&gt;
&lt;p&gt;Here are $500 \times 500$ grids, so each of these grids is $10 \times 10 = 100$ times larger than the previous ones:&lt;/p&gt;
&lt;p&gt;&lt;img src="/images/spin_glass_p4_1_500.png" alt="Probabilistic spin glass" style="width: 800px;"/&gt;&lt;/p&gt;
&lt;p&gt;These look quite different! In the $500 \times 500$ case, the $p=0.7$ and even the $p=0.8$ grids look like random noise, whereas for the smaller grids, the deviation from random noise appears at much lower $p$.&lt;/p&gt;
&lt;p&gt;This behaviour is quite interesting. Each $50 \times 50$ patch of the larger $500 \times 500$ grid, if we were to zoom in, looks about the same as the corresponding smaller grid. But a larger grid offers more chances for the spins to mis-align: even if there is a block of ↑ spins, eventually a block of ↓ spins will appear. This also happens on a smaller grid, but there is less space for it, so the smaller grid is more likely to show a pattern. On a large enough grid, in the zoomed-out view, these blocks look like random specks of noise.&lt;/p&gt;
&lt;p&gt;We can quantify the above by drawing the majority fraction of the spins. This is just the fraction of spins that are ↑ or ↓, whichever is larger. In the previous article, we already looked at this, together with the standard deviation of this fraction:&lt;/p&gt;
&lt;p&gt;&lt;img src="/images/prob_spin_glass_p3_fractions_curve.png" alt="Probabilistic spin glass" style="width: 500px;"/&gt;&lt;/p&gt;
&lt;p&gt;Let's draw the same curve, but for different grid sizes, looking at the average and the standard deviation separately, using our usual Monte Carlo measurement method:&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;span class="c1"&gt;# Monte Carlo measurement&lt;/span&gt;
&lt;span class="n"&gt;p0&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;num_samples&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mf"&gt;0.5&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;100&lt;/span&gt;
&lt;span class="n"&gt;measurements&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[]&lt;/span&gt;
&lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;p&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="mf"&gt;0.1&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mf"&gt;0.2&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mf"&gt;0.3&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mf"&gt;0.4&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mf"&gt;0.5&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mf"&gt;0.6&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mf"&gt;0.7&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mf"&gt;0.8&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mf"&gt;0.9&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mf"&gt;0.95&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mf"&gt;0.975&lt;/span&gt;&lt;span class="p"&gt;]:&lt;/span&gt;
    &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;N&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="mi"&gt;50&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;100&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;200&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;500&lt;/span&gt;&lt;span class="p"&gt;]:&lt;/span&gt;
        &lt;span class="n"&gt;rs&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;fraction&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;create_grid_symmetric&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;rows&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;N&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;cols&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;N&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;p0&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;p0&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;p&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;p&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt; &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;_&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="nb"&gt;range&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;num_samples&lt;/span&gt;&lt;span class="p"&gt;)]&lt;/span&gt;
        &lt;span class="n"&gt;measurements&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;append&lt;/span&gt;&lt;span class="p"&gt;((&lt;/span&gt;&lt;span class="n"&gt;N&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;p&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;np&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;mean&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;rs&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt; &lt;span class="n"&gt;np&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;std&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;rs&lt;/span&gt;&lt;span class="p"&gt;)))&lt;/span&gt;
&lt;span class="c1"&gt;# plot&lt;/span&gt;
&lt;span class="n"&gt;fig&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;axs&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;plt&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;subplots&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;2&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;figsize&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;15&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;6&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;
&lt;span class="n"&gt;Ns&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nb"&gt;sorted&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nb"&gt;list&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nb"&gt;set&lt;/span&gt;&lt;span class="p"&gt;([&lt;/span&gt;&lt;span class="n"&gt;x&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;x&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;measurements&lt;/span&gt;&lt;span class="p"&gt;])))&lt;/span&gt;
&lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;N&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;Ns&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
    &lt;span class="n"&gt;pts&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nb"&gt;sorted&lt;/span&gt;&lt;span class="p"&gt;([(&lt;/span&gt;&lt;span class="n"&gt;x&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt; &lt;span class="n"&gt;x&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="mi"&gt;2&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt; &lt;span class="n"&gt;x&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="mi"&gt;3&lt;/span&gt;&lt;span class="p"&gt;])&lt;/span&gt; &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;x&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;measurements&lt;/span&gt; &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;x&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="o"&gt;==&lt;/span&gt; &lt;span class="n"&gt;N&lt;/span&gt;&lt;span class="p"&gt;])&lt;/span&gt;
    &lt;span class="n"&gt;axs&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;plot&lt;/span&gt;&lt;span class="p"&gt;([&lt;/span&gt;&lt;span class="n"&gt;x&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;x&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;pts&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;x&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;x&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;pts&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt; &lt;span class="n"&gt;marker&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;o&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="n"&gt;axs&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;set_xlabel&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;p&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="n"&gt;axs&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;set_ylabel&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;majority fraction&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="n"&gt;axs&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;plot&lt;/span&gt;&lt;span class="p"&gt;([&lt;/span&gt;&lt;span class="n"&gt;x&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;x&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;pts&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;x&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="mi"&gt;2&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;x&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;pts&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt; &lt;span class="n"&gt;marker&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;o&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="n"&gt;axs&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;set_xlabel&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;p&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="n"&gt;axs&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;set_ylabel&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;stdev. of majority fraction&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="n"&gt;axs&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;legend&lt;/span&gt;&lt;span class="p"&gt;([&lt;/span&gt;&lt;span class="n"&gt;f&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;{N}x{N}&amp;#39;&lt;/span&gt; &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;N&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;Ns&lt;/span&gt;&lt;span class="p"&gt;])&lt;/span&gt;
&lt;/pre&gt;&lt;/div&gt;


&lt;p&gt;Yields:&lt;/p&gt;
&lt;p&gt;&lt;img src="/images/spin_glass_p4_2.png" alt="Probabilistic spin glass" style="width: 800px;"/&gt;&lt;/p&gt;
&lt;p&gt;This confirms the above. As the grid gets larger, the (mean) majority fraction gets pushed down, as does its standard deviation. So in the limiting case of an infinitely large grid, the majority fraction curve would get pushed down to 0.5, jumping to 1 only at $p=1$ (if $p=1$, all the spins have to align).&lt;/p&gt;
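&lt;p&gt;This squeeze toward 0.5 can be seen even without the spin-glass machinery: for fully uncorrelated spins (the random-noise limit), the majority fraction of an $N \times N$ grid concentrates at 0.5 as $N$ grows. A toy sketch illustrating only the uncorrelated limit, not the correlated glass:&lt;/p&gt;

```python
import numpy as np

rng = np.random.default_rng(42)

# For each grid size, sample 200 random-noise grids and record the
# mean and stdev of the majority fraction; both shrink toward 0.5 / 0
# as N grows, which is the squeeze seen in the plot above.
results = {}
for N in [10, 50, 200]:
    fs = []
    for _ in range(200):
        grid = rng.integers(0, 2, size=(N, N))
        f = grid.mean()
        fs.append(max(f, 1.0 - f))
    results[N] = (float(np.mean(fs)), float(np.std(fs)))
    print(N, round(results[N][0], 3), round(results[N][1], 4))
```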
&lt;h2&gt;Conclusion&lt;/h2&gt;
&lt;p&gt;In retrospect, the above scaling behaviour makes sense, but I did not expect it: I thought the majority fraction would follow the same curve at all grid sizes.&lt;/p&gt;
&lt;p&gt;In the previous articles (&lt;a href="https://bytepawn.com/probabilistic-spin-glass.html"&gt;Part I&lt;/a&gt;, &lt;a href="https://bytepawn.com/probabilistic-spin-glass-part-ii.html"&gt;Part II&lt;/a&gt;), I looked at various properties of probabilistic spin glasses by simulating ensembles of many samples and computing various statistics, while in the case of entropy I computed probabilities directly. Here I will take a different route: start with an initial grid, let it evolve over "time" by changing spins one by one, and see how it behaves. The &lt;a href="https://github.com/mtrencseni/playground/blob/master/Probabilistic%20spin%20glass%20-%20Part%203.ipynb"&gt;ipython notebook is on Github&lt;/a&gt;.&lt;/p&gt;
&lt;h2&gt;Evolution rule&lt;/h2&gt;
&lt;p&gt;Let's see what happens if we apply a dynamical probabilistic evolution operator to a probabilistic spin glass. The approach we will follow is simple:&lt;/p&gt;
&lt;ol&gt;
&lt;li&gt;Generate an initial grid per the (symmetrized) method described in the previous articles.&lt;/li&gt;
&lt;li&gt;Pick a random (non-edge) spin, and given its four neighbours, pick a new alignment with the appropriate probability.&lt;/li&gt;
&lt;/ol&gt;
&lt;p&gt;First, we have to arrive at the evolution probabilities: for a spin glass defined by $P(s=1 | n=1)=p$, what is the correct evolution probability given four already-set neighbours? Instead of actually deriving the probability (which seems non-trivial to me), I cheat: I generate a large number of spin glasses and count frequencies to arrive at the probabilities. Since for 4 neighbours there are only $2^4=16$ possibilities, it's relatively easy to get good statistics.&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;make_star_conditional&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;rows&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;cols&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;p0&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;p&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;num_simulations&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mi"&gt;1000&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
    &lt;span class="n"&gt;joint_frequencies&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;total&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;defaultdict&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nb"&gt;int&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;
    &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;simulation&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="nb"&gt;range&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;num_simulations&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
        &lt;span class="n"&gt;pct&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nb"&gt;int&lt;/span&gt;&lt;span class="p"&gt;((&lt;/span&gt;&lt;span class="n"&gt;simulation&lt;/span&gt; &lt;span class="o"&gt;/&lt;/span&gt; &lt;span class="n"&gt;num_simulations&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt; &lt;span class="mi"&gt;10000&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;/&lt;/span&gt; &lt;span class="mi"&gt;100&lt;/span&gt;
        &lt;span class="k"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;f&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;Computing conditionals for the {rows}x{cols} spin glass, progress {pct}% &amp;#39;&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;end&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;&lt;/span&gt;&lt;span class="se"&gt;\r&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
        &lt;span class="n"&gt;sys&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;stdout&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;flush&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
        &lt;span class="n"&gt;grid&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;create_grid&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;rows&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;cols&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;p0&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;p&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
        &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;i&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="nb"&gt;range&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;rows&lt;/span&gt;&lt;span class="o"&gt;-&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
            &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;j&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="nb"&gt;range&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;cols&lt;/span&gt;&lt;span class="o"&gt;-&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
                &lt;span class="n"&gt;pattern&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="s1"&gt;&amp;#39;&lt;/span&gt;&lt;span class="si"&gt;%d%d%d%d%d&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;&lt;/span&gt; &lt;span class="o"&gt;%&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;grid&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;i&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;j&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt; &lt;span class="n"&gt;grid&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;i&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;j&lt;/span&gt;&lt;span class="o"&gt;-&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt; &lt;span class="n"&gt;grid&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;i&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;j&lt;/span&gt;&lt;span class="o"&gt;+&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt; &lt;span class="n"&gt;grid&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;i&lt;/span&gt;&lt;span class="o"&gt;-&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;j&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt; &lt;span class="n"&gt;grid&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;i&lt;/span&gt;&lt;span class="o"&gt;+&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;j&lt;/span&gt;&lt;span class="p"&gt;])&lt;/span&gt;
                &lt;span class="n"&gt;joint_frequencies&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;pattern&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="o"&gt;+=&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt;
                &lt;span class="n"&gt;total&lt;/span&gt; &lt;span class="o"&gt;+=&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt;
    &lt;span class="n"&gt;joint_probabilities&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;{}&lt;/span&gt; &lt;span class="c1"&gt;# joint_probabilities&lt;/span&gt;
    &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;pattern&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;freq&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;joint_frequencies&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;items&lt;/span&gt;&lt;span class="p"&gt;():&lt;/span&gt;
        &lt;span class="n"&gt;joint_probabilities&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;pattern&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;freq&lt;/span&gt;&lt;span class="o"&gt;/&lt;/span&gt;&lt;span class="n"&gt;total&lt;/span&gt;
    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="k"&gt;lambda&lt;/span&gt; &lt;span class="n"&gt;up&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;down&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;left&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;right&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="o"&gt;...&lt;/span&gt;
&lt;/pre&gt;&lt;/div&gt;
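&lt;p&gt;The returned lambda is elided above. A sketch of one way to complete it (&lt;code&gt;make_conditional&lt;/code&gt; and the key layout are my reconstruction from the counting loop, not the notebook's code): the conditional probability of the center spin being 1 is the ratio of the two joint probabilities that agree on the neighbours.&lt;/p&gt;

```python
import random

def make_conditional(joint_probabilities):
    # P(center=1 | neighbours) = P(1, nbrs) / (P(0, nbrs) + P(1, nbrs));
    # pattern keys follow the counting loop: center, left, right, up, down
    def sample(up, down, left, right):
        p1 = joint_probabilities.get('1%d%d%d%d' % (left, right, up, down), 0.0)
        p0 = joint_probabilities.get('0%d%d%d%d' % (left, right, up, down), 0.0)
        if p0 + p1 == 0:
            return random.randint(0, 1)  # pattern never observed: coin flip
        return 1 if random.random() < p1 / (p0 + p1) else 0
    return sample

# toy joint distribution: with all-1 neighbours, center is 1 with prob 0.8
sampler = make_conditional({'11111': 0.4, '01111': 0.1})
random.seed(0)
draws = [sampler(1, 1, 1, 1) for _ in range(10000)]
print(sum(draws) / len(draws))  # close to 0.8
```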


&lt;h2&gt;Visualization&lt;/h2&gt;
&lt;p&gt;Now we can run this probabilistic evolution and, every &lt;code&gt;steps_per_frame&lt;/code&gt; steps, capture a frame for the animation:&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;save_animation&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;filename&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;rows&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;cols&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;p0&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;starting_p&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;p&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;seconds&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mi"&gt;100&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;frames_per_second&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mi"&gt;5&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;steps_per_frame&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mi"&gt;6&lt;/span&gt;&lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="mi"&gt;1000&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
    &lt;span class="n"&gt;conditional_set_four_neighbours&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;make_star_conditional&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;rows&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;cols&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;p0&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;p&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="n"&gt;num_steps&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;seconds&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt; &lt;span class="n"&gt;frames_per_second&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt; &lt;span class="n"&gt;steps_per_frame&lt;/span&gt;
    &lt;span class="n"&gt;grid&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;frames&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;create_grid&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;rows&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;cols&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;p0&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;starting_p&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt; &lt;span class="p"&gt;[]&lt;/span&gt;
    &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;step&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="nb"&gt;range&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;num_steps&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
        &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;step&lt;/span&gt; &lt;span class="o"&gt;%&lt;/span&gt; &lt;span class="n"&gt;steps_per_frame&lt;/span&gt; &lt;span class="o"&gt;==&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
            &lt;span class="n"&gt;frames&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;append&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;grid&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;copy&lt;/span&gt;&lt;span class="p"&gt;())&lt;/span&gt;
        &lt;span class="n"&gt;i&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;j&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;randint&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;rows&lt;/span&gt;&lt;span class="o"&gt;-&lt;/span&gt;&lt;span class="mi"&gt;2&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt; &lt;span class="n"&gt;randint&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;cols&lt;/span&gt;&lt;span class="o"&gt;-&lt;/span&gt;&lt;span class="mi"&gt;2&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
        &lt;span class="n"&gt;grid&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;i&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;j&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;conditional_set_four_neighbours&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;grid&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;i&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;j&lt;/span&gt;&lt;span class="o"&gt;-&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt; &lt;span class="n"&gt;grid&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;i&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;j&lt;/span&gt;&lt;span class="o"&gt;+&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt; &lt;span class="n"&gt;grid&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;i&lt;/span&gt;&lt;span class="o"&gt;-&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;j&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt; &lt;span class="n"&gt;grid&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;i&lt;/span&gt;&lt;span class="o"&gt;+&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;j&lt;/span&gt;&lt;span class="p"&gt;])&lt;/span&gt;
    &lt;span class="n"&gt;fig&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;plt&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;figure&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
    &lt;span class="n"&gt;fig&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;suptitle&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;f&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;starting_p={starting_p:.2f}, p={p:0.2f}&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="n"&gt;im&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;plt&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;imshow&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;frames&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt; &lt;span class="n"&gt;cmap&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;Greys&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;vmin&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;vmax&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="n"&gt;plt&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;axis&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;off&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;animate_func&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;i&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
        &lt;span class="n"&gt;im&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;set_array&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;frames&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;i&lt;/span&gt;&lt;span class="p"&gt;])&lt;/span&gt;
        &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;im&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;
    &lt;span class="n"&gt;anim&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;animation&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;FuncAnimation&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;fig&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;animate_func&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="n"&gt;frames&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;seconds&lt;/span&gt;&lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="n"&gt;frames_per_second&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="n"&gt;interval&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mi"&gt;1000&lt;/span&gt;&lt;span class="o"&gt;/&lt;/span&gt;&lt;span class="n"&gt;frames_per_second&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="n"&gt;anim&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;save&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;filename&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;fps&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;frames_per_second&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/pre&gt;&lt;/div&gt;


&lt;p&gt;Previously, spin glasses had 4 parameters: &lt;code&gt;rows, cols, p0, p&lt;/code&gt;. Here there is a fifth one, &lt;code&gt;starting_p&lt;/code&gt;, in case we want the initial starting grid to be generated with a different &lt;code&gt;p&lt;/code&gt; than the one used in the evolution steps. This is interesting because it lets us check whether, and how quickly, the system forgets its initial configuration.&lt;/p&gt;
&lt;p&gt;This is what a &lt;code&gt;rows=50, cols=50, p0=0.5, starting_p=0.9, p=0.9&lt;/code&gt; spin glass looks like:&lt;/p&gt;
&lt;p&gt;&lt;img src="/images/spin_glass_0.900.gif" alt="Probabilistic spin glass" style="width: 500px;"/&gt;&lt;/p&gt;
&lt;p&gt;And here is the same, but with &lt;code&gt;starting_p=0.5&lt;/code&gt;, so it starts from random noise:&lt;/p&gt;
&lt;p&gt;&lt;img src="/images/spin_glass_from_noise_0.900.gif" alt="Probabilistic spin glass" style="width: 500px;"/&gt;&lt;/p&gt;
&lt;p&gt;Note: the spins at the edges of the grid are not changed in the simulation.&lt;/p&gt;
&lt;p&gt;The main takeaways:&lt;/p&gt;
&lt;ol&gt;
&lt;li&gt;In the first simulation, the overall pattern doesn't change much. This is because at &lt;code&gt;p=0.9&lt;/code&gt;, if all 4 neighbours have the same spin alignment, the spin is very likely to align with them. So the spins that tend to change are the ones on the edges of the patterns, where 1 or 2 of the neighbours are not aligned.&lt;/li&gt;
&lt;li&gt;In the second simulation, with &lt;code&gt;starting_p=0.5&lt;/code&gt;, we start out with a &lt;code&gt;p=0.5&lt;/code&gt; grid, which is just random noise. But since we're evolving with the &lt;code&gt;p=0.9&lt;/code&gt; probabilities, a typical &lt;code&gt;p=0.9&lt;/code&gt; patterned grid very quickly forms out of the original randomness, and from then on it behaves as described in the previous point.&lt;/li&gt;
&lt;/ol&gt;
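&lt;p&gt;To see why aligned neighbourhoods are so sticky, consider the conditional probability on a star graph: if $k$ of the 4 neighbours are up, and each neighbour independently agrees with the centre spin with probability $p$ (with a symmetric $p_0=0.5$ prior), Bayes gives $P(s=1 \mid k) = p^k (1-p)^{4-k} / (p^k (1-p)^{4-k} + (1-p)^k p^{4-k})$. The sketch below computes this; it is an illustrative reconstruction and may differ in details from the &lt;code&gt;make_star_conditional&lt;/code&gt; used in the notebook:&lt;/p&gt;

```python
# Conditional probability that the centre spin is up, given k of its 4
# neighbours are up, on a star graph with pairwise alignment probability p
# and a symmetric (p0 = 0.5) prior. Illustrative reconstruction, not
# necessarily identical to make_star_conditional from the notebook.
def p_up_given_k(p, k, n=4):
    up = p**k * (1 - p)**(n - k)     # likelihood of the neighbours if centre is up
    down = (1 - p)**k * p**(n - k)   # likelihood if centre is down
    return up / (up + down)

for k in range(5):
    print(f'k={k}: P(up) = {p_up_given_k(0.9, k):.5f}')
```

&lt;p&gt;With &lt;code&gt;p=0.9&lt;/code&gt;, a fully aligned neighbourhood ($k=4$) keeps the spin aligned with probability $6561/6562 \approx 0.9998$, while a 2-2 split gives exactly $0.5$, which is why changes concentrate on pattern boundaries.&lt;/p&gt;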
&lt;p&gt;Visualizations of other parameters:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;&lt;a href='/images/spin_glass_0.500.gif'&gt;&lt;code&gt;rows=50, cols=50, p0=0.5, starting_p=0.5, p=0.5&lt;/code&gt;&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href='/images/spin_glass_0.600.gif'&gt;&lt;code&gt;rows=50, cols=50, p0=0.5, starting_p=0.6, p=0.6&lt;/code&gt;&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href='/images/spin_glass_0.700.gif'&gt;&lt;code&gt;rows=50, cols=50, p0=0.5, starting_p=0.7, p=0.7&lt;/code&gt;&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href='/images/spin_glass_0.800.gif'&gt;&lt;code&gt;rows=50, cols=50, p0=0.5, starting_p=0.8, p=0.8&lt;/code&gt;&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href='/images/spin_glass_0.900.gif'&gt;&lt;code&gt;rows=50, cols=50, p0=0.5, starting_p=0.9, p=0.9&lt;/code&gt;&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href='/images/spin_glass_0.950.gif'&gt;&lt;code&gt;rows=50, cols=50, p0=0.5, starting_p=0.95, p=0.95&lt;/code&gt;&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href='/images/spin_glass_0.975.gif'&gt;&lt;code&gt;rows=50, cols=50, p0=0.5, starting_p=0.975, p=0.975&lt;/code&gt;&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;Same, but starting from &lt;code&gt;starting_p=0.5&lt;/code&gt; noise:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;&lt;a href='/images/spin_glass_from_noise_0.500.gif'&gt;&lt;code&gt;rows=50, cols=50, p0=0.5, starting_p=0.5, p=0.5&lt;/code&gt;&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href='/images/spin_glass_from_noise_0.600.gif'&gt;&lt;code&gt;rows=50, cols=50, p0=0.5, starting_p=0.5, p=0.6&lt;/code&gt;&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href='/images/spin_glass_from_noise_0.700.gif'&gt;&lt;code&gt;rows=50, cols=50, p0=0.5, starting_p=0.5, p=0.7&lt;/code&gt;&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href='/images/spin_glass_from_noise_0.800.gif'&gt;&lt;code&gt;rows=50, cols=50, p0=0.5, starting_p=0.5, p=0.8&lt;/code&gt;&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href='/images/spin_glass_from_noise_0.900.gif'&gt;&lt;code&gt;rows=50, cols=50, p0=0.5, starting_p=0.5, p=0.9&lt;/code&gt;&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href='/images/spin_glass_from_noise_0.950.gif'&gt;&lt;code&gt;rows=50, cols=50, p0=0.5, starting_p=0.5, p=0.95&lt;/code&gt;&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href='/images/spin_glass_from_noise_0.975.gif'&gt;&lt;code&gt;rows=50, cols=50, p0=0.5, starting_p=0.5, p=0.975&lt;/code&gt;&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;h2&gt;Convergence behaviour&lt;/h2&gt;
&lt;p&gt;We can check the convergence behaviour more systematically, i.e. how quickly the system forgets &lt;code&gt;starting_p&lt;/code&gt;. Let's look at multiple trajectories with different &lt;code&gt;starting_p&lt;/code&gt;s, and use the fraction of spins aligned with the majority as the measure of order (this is the y-axis):&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;fraction&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;grid&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
    &lt;span class="n"&gt;r&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;np&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;sum&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;grid&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;&lt;span class="o"&gt;/&lt;/span&gt;&lt;span class="n"&gt;grid&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;size&lt;/span&gt;
    &lt;span class="n"&gt;r&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nb"&gt;max&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;r&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="o"&gt;-&lt;/span&gt;&lt;span class="n"&gt;r&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;r&lt;/span&gt;

&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;get_trajectories&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;rows&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;cols&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;p0&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;starting_p&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;p&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;num_trajectories&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mi"&gt;3&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;seconds&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mi"&gt;100&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;frames_per_second&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;5&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;steps_per_frame&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mi"&gt;6&lt;/span&gt;&lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="mi"&gt;1000&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
    &lt;span class="n"&gt;cache_key&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nb"&gt;str&lt;/span&gt;&lt;span class="p"&gt;((&lt;/span&gt;&lt;span class="n"&gt;rows&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;cols&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;p0&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;p&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;
    &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;cache_key&lt;/span&gt; &lt;span class="ow"&gt;not&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;conditional_set_four_neighbours_cached&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
        &lt;span class="n"&gt;conditional_set_four_neighbours_cached&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;cache_key&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;make_star_conditional&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;rows&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;cols&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;p0&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;p&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="n"&gt;conditional_set_four_neighbours&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;conditional_set_four_neighbours_cached&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;cache_key&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;
    &lt;span class="n"&gt;num_steps&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;seconds&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt; &lt;span class="n"&gt;frames_per_second&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt; &lt;span class="n"&gt;steps_per_frame&lt;/span&gt;
    &lt;span class="n"&gt;trajectories&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[]&lt;/span&gt;
    &lt;span class="n"&gt;starting_grid&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;create_grid&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;rows&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;cols&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;p0&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;starting_p&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;t&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="nb"&gt;range&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;num_trajectories&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
        &lt;span class="n"&gt;fs&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[]&lt;/span&gt;
        &lt;span class="n"&gt;grid&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;starting_grid&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;copy&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
        &lt;span class="n"&gt;fs&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;append&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;fraction&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;grid&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;
        &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;step&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="nb"&gt;range&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;num_steps&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
            &lt;span class="n"&gt;i&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;j&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;randint&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;rows&lt;/span&gt;&lt;span class="o"&gt;-&lt;/span&gt;&lt;span class="mi"&gt;2&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt; &lt;span class="n"&gt;randint&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;cols&lt;/span&gt;&lt;span class="o"&gt;-&lt;/span&gt;&lt;span class="mi"&gt;2&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
            &lt;span class="n"&gt;grid&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;i&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;j&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;conditional_set_four_neighbours&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;grid&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;i&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;j&lt;/span&gt;&lt;span class="o"&gt;-&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt; &lt;span class="n"&gt;grid&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;i&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;j&lt;/span&gt;&lt;span class="o"&gt;+&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt; &lt;span class="n"&gt;grid&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;i&lt;/span&gt;&lt;span class="o"&gt;-&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;j&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt; &lt;span class="n"&gt;grid&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;i&lt;/span&gt;&lt;span class="o"&gt;+&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;j&lt;/span&gt;&lt;span class="p"&gt;])&lt;/span&gt;
            &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;step&lt;/span&gt; &lt;span class="o"&gt;%&lt;/span&gt; &lt;span class="n"&gt;steps_per_frame&lt;/span&gt; &lt;span class="o"&gt;==&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
                &lt;span class="n"&gt;fs&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;append&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;fraction&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;grid&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;
                &lt;span class="n"&gt;pct&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nb"&gt;int&lt;/span&gt;&lt;span class="p"&gt;((&lt;/span&gt;&lt;span class="n"&gt;step&lt;/span&gt; &lt;span class="o"&gt;/&lt;/span&gt; &lt;span class="n"&gt;num_steps&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt; &lt;span class="mi"&gt;10000&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;/&lt;/span&gt; &lt;span class="mi"&gt;100&lt;/span&gt;
                &lt;span class="k"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;f&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;Doing trajectory {t+1}/{num_trajectories} on the {rows}x{cols} spin glass, progress {pct}% &amp;#39;&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;end&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;&lt;/span&gt;&lt;span class="se"&gt;\r&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
                &lt;span class="n"&gt;sys&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;stdout&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;flush&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
        &lt;span class="n"&gt;trajectories&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;append&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;fs&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;trajectories&lt;/span&gt;

&lt;span class="n"&gt;rows&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;cols&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;p0&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;50&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;50&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mf"&gt;0.5&lt;/span&gt;
&lt;span class="n"&gt;mts&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[]&lt;/span&gt;
&lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;p&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="mf"&gt;0.5&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mf"&gt;0.6&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mf"&gt;0.7&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mf"&gt;0.8&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mf"&gt;0.9&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mf"&gt;0.95&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mf"&gt;0.975&lt;/span&gt;&lt;span class="p"&gt;]:&lt;/span&gt;
    &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;starting_p&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="mf"&gt;0.5&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mf"&gt;0.6&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mf"&gt;0.7&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mf"&gt;0.8&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mf"&gt;0.9&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mf"&gt;0.95&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mf"&gt;0.975&lt;/span&gt;&lt;span class="p"&gt;]:&lt;/span&gt;
        &lt;span class="n"&gt;trajectories&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;get_trajectories&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
            &lt;span class="n"&gt;rows&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;rows&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;cols&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;cols&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;p0&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;p0&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;starting_p&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;starting_p&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;p&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;p&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;num_trajectories&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
        &lt;span class="n"&gt;mts&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;append&lt;/span&gt;&lt;span class="p"&gt;((&lt;/span&gt;&lt;span class="n"&gt;p0&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;starting_p&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;p&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;trajectories&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;]))&lt;/span&gt;
&lt;/pre&gt;&lt;/div&gt;


&lt;p&gt;Yields:&lt;/p&gt;
&lt;p&gt;&lt;img src="/images/prob_spin_glass_p3_fractions.png" alt="Probabilistic spin glass" style="width: 500px;"/&gt;&lt;/p&gt;
&lt;p&gt;What this suggests is that the starting grid doesn't matter: the initial condition is quickly washed out, and the grid behaves as if it had always been running at its evolutionary &lt;code&gt;p&lt;/code&gt;.&lt;/p&gt;
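&lt;p&gt;This washing-out is the same phenomenon as a Markov chain forgetting its initial distribution. As a toy 2-state analogue of a single spin (an analogy, not the actual spin glass dynamics): if the state is kept with probability $p=0.9$ and flipped with probability $0.1$ at each step, any two starting distributions converge to each other geometrically, at rate $|2p-1|$ per step:&lt;/p&gt;

```python
import numpy as np

# Toy 2-state Markov chain: keep the state with probability p, flip with 1-p.
# Illustrates, by analogy, how quickly an initial condition is forgotten.
p = 0.9
T = np.array([[p, 1 - p],
              [1 - p, p]])  # transition matrix

d1 = np.array([1.0, 0.0])   # start fully "up"
d2 = np.array([0.0, 1.0])   # start fully "down"
for step in range(50):
    d1, d2 = d1 @ T, d2 @ T

# total variation distance between the two evolved distributions
tv = 0.5 * np.abs(d1 - d2).sum()
print(f'TV distance after 50 steps: {tv:.2e}')  # shrinks as (2p-1)**50
```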
&lt;p&gt;But why does the fraction differ so much between simulation runs at higher &lt;code&gt;p&lt;/code&gt;? To make sense of this, let's look at the fraction curve from Part I again, but this time also plot the standard deviation, specifically for $50 \times 50$ grids:&lt;/p&gt;
&lt;p&gt;&lt;img src="/images/prob_spin_glass_p3_fractions_curve.png" alt="Probabilistic spin glass" style="width: 500px;"/&gt;&lt;/p&gt;
&lt;p&gt;This is in agreement with what we see in the simulation runs: at &lt;code&gt;p&lt;/code&gt; higher than 0.7, the standard deviation of the majority spin alignment is quite high, so there is a lot of deviation from the average fraction.&lt;/p&gt;
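&lt;p&gt;The growth of the standard deviation with &lt;code&gt;p&lt;/code&gt; can be reproduced even on a 1D chain, where a spin system with pairwise alignment probability &lt;code&gt;p&lt;/code&gt; is trivial to sample exactly. This is only a 1D analogue of the 2D grids above, not the same system, but it shows the same effect: at high &lt;code&gt;p&lt;/code&gt; the majority fraction fluctuates strongly between samples:&lt;/p&gt;

```python
import numpy as np

rng = np.random.default_rng(0)

def chain_majority_fraction(n, p):
    """Sample a length-n spin chain where each spin equals the previous one
    with probability p, and return the majority fraction."""
    flips = rng.random(n - 1) >= p  # True where the chain flips
    spins = np.concatenate(([rng.random() < 0.5], flips)).cumsum() % 2
    r = spins.mean()
    return max(r, 1 - r)

def frac_stats(p, n=2500, samples=500):
    fs = [chain_majority_fraction(n, p) for _ in range(samples)]
    return np.mean(fs), np.std(fs)

for p in [0.5, 0.9, 0.975]:
    mean, std = frac_stats(p)
    print(f'p={p}: mean={mean:.3f}, std={std:.3f}')
```

&lt;p&gt;At &lt;code&gt;p=0.5&lt;/code&gt; the majority fraction is tightly concentrated near 0.5; at &lt;code&gt;p=0.975&lt;/code&gt; the long aligned runs make the fraction vary widely from sample to sample.&lt;/p&gt;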
&lt;h2&gt;Conclusion&lt;/h2&gt;
&lt;p&gt;In the next and final piece I will make the system periodic and look at how behaviour changes with grid size.&lt;/p&gt;</content><category term="entropy"></category><category term="physics"></category><category term="spin"></category><category term="glass"></category></entry><entry><title>Probabilistic spin glass - Part II</title><link href="https://bytepawn.com/probabilistic-spin-glass-part-ii.html" rel="alternate"></link><published>2021-12-18T00:00:00+01:00</published><updated>2021-12-18T00:00:00+01:00</updated><author><name>Marton Trencseni</name></author><id>tag:bytepawn.com,2021-12-18:/probabilistic-spin-glass-part-ii.html</id><summary type="html">&lt;p&gt;This is a continuation of the previous article on probabilistic spin glasses, with improvements to the simulation code and improved entropy computation. &lt;br/&gt;&lt;br/&gt;&lt;img src="/images/prob_spin_glass_p2_5.png" alt="." style="width: 300px;"/&gt;&lt;/p&gt;</summary><content type="html">&lt;h2&gt;Introduction&lt;/h2&gt;
&lt;p&gt;This is a continuation of the &lt;a href="https://bytepawn.com/probabilistic-spin-glass.html"&gt;previous article&lt;/a&gt; on probabilistic spin glasses, with improvements to the simulation code and improved entropy computation. The &lt;a href="https://github.com/mtrencseni/playground/blob/master/Probabilistic%20spin%20glass%20-%20Part%202.ipynb"&gt;ipython notebook is up on Github&lt;/a&gt;.&lt;/p&gt;
&lt;h2&gt;Symmetry&lt;/h2&gt;
&lt;p&gt;In the &lt;a href="https://bytepawn.com/probabilistic-spin-glass.html"&gt;previous article&lt;/a&gt; I generated probabilistic spin glasses such that neighbouring spins are aligned with probability $p$. The code started in the top left corner: it generated the first spin with probability $p_0$, then used $P(s=1 | n=1) = p$ to generate the rest of the first row. The first element of the second row is again trivial to generate using $p$; after that, each spin has 2 already-generated neighbours, so I derived the conditional formula for 2 neighbours. To make sure the construction is correct, I ran Monte Carlo simulations: I generated a large number of spin glasses and verified that neighbouring spins are aligned with probability $p$.&lt;/p&gt;
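&lt;p&gt;The first-row part of this construction can be sketched in a few lines. Assuming the simple chain rule described above (first spin with probability $p_0$, each subsequent spin equal to its left neighbour with probability $p$; the helper name is hypothetical), a Monte Carlo check confirms that adjacent spins end up aligned with frequency $p$:&lt;/p&gt;

```python
import numpy as np

rng = np.random.default_rng(42)

def first_row(cols, p0, p):
    """Generate one row: the first spin is 1 with probability p0, each
    subsequent spin equals its left neighbour with probability p.
    (Hypothetical helper, sketching the first-row construction.)"""
    row = np.empty(cols, dtype=int)
    row[0] = rng.random() < p0
    same = rng.random(cols - 1) < p
    for j in range(1, cols):
        row[j] = row[j - 1] if same[j - 1] else 1 - row[j - 1]
    return row

# Monte Carlo check: measure how often adjacent spins are aligned
p = 0.9
aligned = total = 0
for _ in range(500):
    row = first_row(1000, p0=0.5, p=p)
    aligned += (row[1:] == row[:-1]).sum()
    total += len(row) - 1
print(f'empirical alignment: {aligned/total:.4f}')  # close to p = 0.9
```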
&lt;p&gt;However, the construction still has a shortcoming. We would expect spin glasses that are "the same" to be generated with equal probability. I.e. if X and Y are spin glasses, and X is an up-down mirror, a left-right mirror, or (for square grids) a 90° rotation of Y, or an arbitrary combination of these 3, then X and Y are really "the same", and should be generated with the same probability in a Monte Carlo ensemble.&lt;/p&gt;
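&lt;p&gt;The set of grids that count as "the same" is the orbit of the grid under these symmetries: the flips plus, for square grids, the 90° rotations (together, the dihedral group $D_4$). A small numpy sketch (this helper is illustrative, it does not appear in the notebook):&lt;/p&gt;

```python
import numpy as np

def symmetry_orbit(grid):
    """Return the distinct variants of a square grid under left-right /
    up-down flips and 90-degree rotations (the dihedral group D4)."""
    variants = []
    g = np.array(grid)
    for k in range(4):                 # the 4 rotations
        r = np.rot90(g, k)
        variants.append(r)
        variants.append(np.fliplr(r))  # the 4 reflected rotations
    # deduplicate via a canonical string form
    return {str(v.tolist()) for v in variants}

orbit = symmetry_orbit([[1, 0], [0, 0]])
print(len(orbit))  # the single "up" spin can sit in any of the 4 corners
```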
&lt;p&gt;Let's check this for the code from the previous article. Let's generate a large number of grids, and count the frequencies:&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;span class="n"&gt;num_simulations&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;10&lt;/span&gt;&lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="mi"&gt;1000&lt;/span&gt;&lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="mi"&gt;1000&lt;/span&gt;
&lt;span class="n"&gt;rows&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;cols&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;p0&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;p&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;d&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;2&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;2&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mf"&gt;0.5&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mf"&gt;0.9&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;4&lt;/span&gt;
&lt;span class="n"&gt;frequencies&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;Counter&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
&lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;i&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="nb"&gt;range&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;num_simulations&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
    &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;i&lt;/span&gt; &lt;span class="o"&gt;%&lt;/span&gt; &lt;span class="mi"&gt;100&lt;/span&gt; &lt;span class="o"&gt;==&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
        &lt;span class="n"&gt;pct&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nb"&gt;int&lt;/span&gt;&lt;span class="p"&gt;((&lt;/span&gt;&lt;span class="n"&gt;i&lt;/span&gt; &lt;span class="o"&gt;/&lt;/span&gt; &lt;span class="n"&gt;num_simulations&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt; &lt;span class="mi"&gt;10000&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;/&lt;/span&gt; &lt;span class="mi"&gt;100&lt;/span&gt;
        &lt;span class="k"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;f&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;Computing grid probabilities for the {rows}x{cols} spin glass, progress {pct}% &amp;#39;&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;end&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;&lt;/span&gt;&lt;span class="se"&gt;\r&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
        &lt;span class="n"&gt;sys&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;stdout&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;flush&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;    
    &lt;span class="n"&gt;grid&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;create_grid&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;rows&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;cols&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;p0&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;p&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="n"&gt;frequencies&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;json&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;dumps&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;grid&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;tolist&lt;/span&gt;&lt;span class="p"&gt;())]&lt;/span&gt; &lt;span class="o"&gt;+=&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt;

&lt;span class="n"&gt;fig&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;axs&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;plt&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;subplots&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;d&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;d&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;figsize&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;10&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;10&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;
&lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;i&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;grid&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;f&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="nb"&gt;enumerate&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;frequencies&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;most_common&lt;/span&gt;&lt;span class="p"&gt;()):&lt;/span&gt;
    &lt;span class="n"&gt;ax&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;axs&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;i&lt;/span&gt; &lt;span class="o"&gt;//&lt;/span&gt; &lt;span class="n"&gt;d&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;i&lt;/span&gt; &lt;span class="o"&gt;%&lt;/span&gt; &lt;span class="n"&gt;d&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;
    &lt;span class="n"&gt;ax&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;imshow&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;np&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;array&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;json&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;loads&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;grid&lt;/span&gt;&lt;span class="p"&gt;)),&lt;/span&gt; &lt;span class="n"&gt;cmap&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;hot&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;vmin&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;vmax&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="n"&gt;ax&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;axes&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;get_xaxis&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;set_visible&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="bp"&gt;False&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="n"&gt;ax&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;axes&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;get_yaxis&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;set_visible&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="bp"&gt;False&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="n"&gt;ax&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;set_title&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;f&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;p=&lt;/span&gt;&lt;span class="si"&gt;%0.4f&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;&lt;/span&gt; &lt;span class="o"&gt;%&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;f&lt;/span&gt;&lt;span class="o"&gt;/&lt;/span&gt;&lt;span class="n"&gt;num_simulations&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;
&lt;span class="n"&gt;plt&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;show&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
&lt;/pre&gt;&lt;/div&gt;


&lt;p&gt;Yields:&lt;/p&gt;
&lt;p&gt;&lt;img src="/images/prob_spin_glass_p2_1_bad_probs.png" alt="Probabilistic spin glass" style="width: 300px;"/&gt;&lt;/p&gt;
&lt;p&gt;The symmetry does not hold! The first grid in the second row and the first grid in the last row are mirror images (a left-right flip), yet their probabilities differ! There are two ways to fix the construction:&lt;/p&gt;
&lt;ol&gt;
&lt;li&gt;When generating the grid, randomly pick the starting corner (top or bottom, left or right) and the starting direction (horizontal or vertical). Implementing this yields hard-to-read code.&lt;/li&gt;
&lt;li&gt;Generate the grid using the simple code from the previous article, then randomly apply one of the 8 symmetry transformations.&lt;/li&gt;
&lt;/ol&gt;
&lt;p&gt;Let's do the second:&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;create_grid&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;rows&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;cols&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;p0&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;p&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
    &lt;span class="c1"&gt;# ...&lt;/span&gt;
    &lt;span class="c1"&gt;# code from previous article generates grid&lt;/span&gt;
    &lt;span class="c1"&gt;# ...&lt;/span&gt;
    &lt;span class="c1"&gt;# make the ensemble symmetric&lt;/span&gt;
    &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;random&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt; &lt;span class="o"&gt;&amp;lt;&lt;/span&gt; &lt;span class="mf"&gt;0.5&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
        &lt;span class="n"&gt;grid&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;np&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;fliplr&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;grid&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;random&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt; &lt;span class="o"&gt;&amp;lt;&lt;/span&gt; &lt;span class="mf"&gt;0.5&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
        &lt;span class="n"&gt;grid&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;np&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;flipud&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;grid&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;rows&lt;/span&gt; &lt;span class="o"&gt;==&lt;/span&gt; &lt;span class="n"&gt;cols&lt;/span&gt; &lt;span class="ow"&gt;and&lt;/span&gt; &lt;span class="n"&gt;random&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt; &lt;span class="o"&gt;&amp;lt;&lt;/span&gt; &lt;span class="mf"&gt;0.5&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
        &lt;span class="n"&gt;grid&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;np&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;rot90&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;grid&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;grid&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;astype&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nb"&gt;int&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/pre&gt;&lt;/div&gt;


&lt;p&gt;After this, grids in the same symmetry group have equal probabilities:&lt;/p&gt;
&lt;p&gt;&lt;img src="/images/prob_spin_glass_p2_2_good_probs.png" alt="Probabilistic spin glass" style="width: 300px;"/&gt;&lt;/p&gt;
&lt;h2&gt;Probabilities instead of Statistics&lt;/h2&gt;
&lt;p&gt;In the above example, I generated 10,000,000 grids to get a good sample size. But this is not really required. For an $ N \times K $ spin glass, there are $ 2^{N \times K} $ up-down spin combinations. We can turn the construction process on its head: instead of using the probabilities to generate random grids, we can enumerate all possible grids and compute the probability of generating each one. The basic logic is very similar to the Monte Carlo simulation:&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;spin_glass_probability_inner&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;grid&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;p0&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;p&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
    &lt;span class="n"&gt;rows&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;cols&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;grid&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;shape&lt;/span&gt;
    &lt;span class="n"&gt;cp2&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;calculate_conditional_probs_for_1&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;p0&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;p&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="n"&gt;grid_probability&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt;
    &lt;span class="c1"&gt;# first element&lt;/span&gt;
    &lt;span class="n"&gt;grid_probability&lt;/span&gt; &lt;span class="o"&gt;*=&lt;/span&gt; &lt;span class="n"&gt;cp&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;grid&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt; &lt;span class="n"&gt;p0&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="c1"&gt;# first row&lt;/span&gt;
    &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;j&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="nb"&gt;range&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;cols&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
        &lt;span class="n"&gt;grid_probability&lt;/span&gt; &lt;span class="o"&gt;*=&lt;/span&gt; &lt;span class="n"&gt;cp&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;grid&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;j&lt;/span&gt;&lt;span class="o"&gt;-&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="o"&gt;==&lt;/span&gt; &lt;span class="n"&gt;grid&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;j&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt; &lt;span class="n"&gt;p&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="c1"&gt;# remaining rows&lt;/span&gt;
    &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;i&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="nb"&gt;range&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;rows&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
        &lt;span class="n"&gt;grid_probability&lt;/span&gt; &lt;span class="o"&gt;*=&lt;/span&gt; &lt;span class="n"&gt;cp&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;grid&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;i&lt;/span&gt;&lt;span class="o"&gt;-&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="o"&gt;==&lt;/span&gt; &lt;span class="n"&gt;grid&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;i&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt; &lt;span class="n"&gt;p&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
        &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;j&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="nb"&gt;range&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;cols&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
            &lt;span class="n"&gt;grid_probability&lt;/span&gt; &lt;span class="o"&gt;*=&lt;/span&gt; &lt;span class="n"&gt;cp&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;grid&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;i&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;j&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt; &lt;span class="n"&gt;cp2&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;&lt;/span&gt;&lt;span class="si"&gt;%d%d&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;&lt;/span&gt; &lt;span class="o"&gt;%&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;grid&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;i&lt;/span&gt;&lt;span class="o"&gt;-&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;j&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt; &lt;span class="n"&gt;grid&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;i&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;j&lt;/span&gt;&lt;span class="o"&gt;-&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;])])&lt;/span&gt;   
    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;grid_probability&lt;/span&gt;
&lt;/pre&gt;&lt;/div&gt;


&lt;p&gt;However, this has the same symmetry problem; let's fix that:&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;spin_glass_probability&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;grid&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;p0&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;p&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
    &lt;span class="n"&gt;grid_probability&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;
    &lt;span class="c1"&gt;# create all possible symmetry transformations of grid&lt;/span&gt;
    &lt;span class="c1"&gt;# all combinations of left-right, up-down mirroring and 90 degree rotation&lt;/span&gt;
    &lt;span class="c1"&gt;# total of 2^3 = 8 transformations, including identity, ie. no transformation&lt;/span&gt;
    &lt;span class="c1"&gt;# the symmetric probability of the grid is the average probability&lt;/span&gt;
    &lt;span class="n"&gt;grid_probability&lt;/span&gt; &lt;span class="o"&gt;+=&lt;/span&gt; &lt;span class="n"&gt;spin_glass_probability_inner&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;grid&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;p0&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;p&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="n"&gt;grid_probability&lt;/span&gt; &lt;span class="o"&gt;+=&lt;/span&gt; &lt;span class="n"&gt;spin_glass_probability_inner&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;np&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;fliplr&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;grid&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt; &lt;span class="n"&gt;p0&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;p&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="n"&gt;grid_probability&lt;/span&gt; &lt;span class="o"&gt;+=&lt;/span&gt; &lt;span class="n"&gt;spin_glass_probability_inner&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;np&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;flipud&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;grid&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt; &lt;span class="n"&gt;p0&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;p&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="n"&gt;grid_probability&lt;/span&gt; &lt;span class="o"&gt;+=&lt;/span&gt; &lt;span class="n"&gt;spin_glass_probability_inner&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;np&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;fliplr&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;np&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;flipud&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;grid&lt;/span&gt;&lt;span class="p"&gt;)),&lt;/span&gt; &lt;span class="n"&gt;p0&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;p&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="n"&gt;normalization&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;4&lt;/span&gt;
    &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;rows&lt;/span&gt; &lt;span class="o"&gt;==&lt;/span&gt; &lt;span class="n"&gt;cols&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
        &lt;span class="n"&gt;grid_probability&lt;/span&gt; &lt;span class="o"&gt;+=&lt;/span&gt; &lt;span class="n"&gt;spin_glass_probability_inner&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;np&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;rot90&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;grid&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt; &lt;span class="n"&gt;p0&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;p&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
        &lt;span class="n"&gt;grid_probability&lt;/span&gt; &lt;span class="o"&gt;+=&lt;/span&gt; &lt;span class="n"&gt;spin_glass_probability_inner&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;np&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;fliplr&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;np&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;rot90&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;grid&lt;/span&gt;&lt;span class="p"&gt;)),&lt;/span&gt; &lt;span class="n"&gt;p0&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;p&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
        &lt;span class="n"&gt;grid_probability&lt;/span&gt; &lt;span class="o"&gt;+=&lt;/span&gt; &lt;span class="n"&gt;spin_glass_probability_inner&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;np&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;flipud&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;np&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;rot90&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;grid&lt;/span&gt;&lt;span class="p"&gt;)),&lt;/span&gt; &lt;span class="n"&gt;p0&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;p&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
        &lt;span class="n"&gt;grid_probability&lt;/span&gt; &lt;span class="o"&gt;+=&lt;/span&gt; &lt;span class="n"&gt;spin_glass_probability_inner&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;np&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;fliplr&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;np&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;flipud&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;np&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;rot90&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;grid&lt;/span&gt;&lt;span class="p"&gt;))),&lt;/span&gt; &lt;span class="n"&gt;p0&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;p&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
        &lt;span class="n"&gt;normalization&lt;/span&gt; &lt;span class="o"&gt;+=&lt;/span&gt; &lt;span class="mi"&gt;4&lt;/span&gt;
    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;grid_probability&lt;/span&gt; &lt;span class="o"&gt;/&lt;/span&gt; &lt;span class="n"&gt;normalization&lt;/span&gt;
    &lt;span class="c1"&gt;#&lt;/span&gt;
    &lt;span class="c1"&gt;# without honoring symmetries:&lt;/span&gt;
    &lt;span class="c1"&gt;# return spin_glass_probability_inner(grid, p0, p)&lt;/span&gt;
&lt;/pre&gt;&lt;/div&gt;
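&lt;p&gt;The averaging in spin_glass_probability() is a general symmetrization trick: summing any function over the 8 transformations (4 for non-square grids) and dividing by their count yields a function that is invariant under flips and rotations. A minimal sketch, with a made-up, deliberately non-symmetric score() standing in for spin_glass_probability_inner():&lt;/p&gt;

```python
import numpy as np

def score(grid):
    # made-up stand-in: position-dependent weights, so NOT symmetric under flips
    rows, cols = grid.shape
    weights = np.arange(rows * cols).reshape(rows, cols)
    return float((grid * weights).sum())

def symmetric_score(grid):
    # average score() over all 8 symmetry transformations (4 if the grid
    # is not square, where the 90 degree rotation does not apply)
    bases = (grid, np.rot90(grid)) if grid.shape[0] == grid.shape[1] else (grid,)
    total, n = 0.0, 0
    for base in bases:
        for g in (base, np.fliplr(base), np.flipud(base), np.fliplr(np.flipud(base))):
            total += score(g)
            n += 1
    return total / n

grid = np.array([[1, 0], [0, 0]])
print(score(grid), score(np.fliplr(grid)))                      # differ: 0.0 vs 1.0
print(symmetric_score(grid), symmetric_score(np.fliplr(grid)))  # equal: 1.5 and 1.5
```

&lt;p&gt;The raw function assigns different values to a grid and its mirror image, but the symmetrized version assigns the same value to every member of the orbit.&lt;/p&gt;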


&lt;p&gt;We can now compare this to the Monte Carlo simulation to verify that we compute the same probabilities for each $2 \times 2$ grid:&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;span class="n"&gt;rows&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;cols&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;p0&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;p&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;d&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;2&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;2&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mf"&gt;0.5&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mf"&gt;0.9&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;4&lt;/span&gt;
&lt;span class="n"&gt;probabilities&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;Counter&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
&lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;li&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="nb"&gt;list&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;itertools&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;product&lt;/span&gt;&lt;span class="p"&gt;([&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt; &lt;span class="n"&gt;repeat&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;rows&lt;/span&gt;&lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="n"&gt;cols&lt;/span&gt;&lt;span class="p"&gt;)):&lt;/span&gt;
    &lt;span class="n"&gt;grid&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;np&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;asarray&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;li&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;reshape&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;rows&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;cols&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;astype&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nb"&gt;int&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="n"&gt;grid_probability&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;spin_glass_probability&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;grid&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;p0&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;p&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="n"&gt;probabilities&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;json&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;dumps&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;grid&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;tolist&lt;/span&gt;&lt;span class="p"&gt;())]&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;grid_probability&lt;/span&gt;

&lt;span class="n"&gt;fig&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;axs&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;plt&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;subplots&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;d&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;d&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;figsize&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;10&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;10&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;
&lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;i&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;grid&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;grid_probability&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="nb"&gt;enumerate&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;probabilities&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;most_common&lt;/span&gt;&lt;span class="p"&gt;()):&lt;/span&gt;
    &lt;span class="n"&gt;ax&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;axs&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;i&lt;/span&gt; &lt;span class="o"&gt;//&lt;/span&gt; &lt;span class="n"&gt;d&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;i&lt;/span&gt; &lt;span class="o"&gt;%&lt;/span&gt; &lt;span class="n"&gt;d&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;
    &lt;span class="n"&gt;ax&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;imshow&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;np&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;array&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;json&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;loads&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;grid&lt;/span&gt;&lt;span class="p"&gt;)),&lt;/span&gt; &lt;span class="n"&gt;cmap&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;hot&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;vmin&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;vmax&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="n"&gt;ax&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;axes&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;get_xaxis&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;set_visible&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="bp"&gt;False&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="n"&gt;ax&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;axes&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;get_yaxis&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;set_visible&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="bp"&gt;False&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="n"&gt;ax&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;set_title&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;f&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;p=&lt;/span&gt;&lt;span class="si"&gt;%0.4f&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;&lt;/span&gt; &lt;span class="o"&gt;%&lt;/span&gt; &lt;span class="n"&gt;grid_probability&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="n"&gt;plt&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;show&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
&lt;/pre&gt;&lt;/div&gt;


&lt;p&gt;&lt;img src="/images/prob_spin_glass_p2_3_good_probs.png" alt="Probabilistic spin glass" style="width: 300px;"/&gt;&lt;/p&gt;
&lt;p&gt;With this approach we don't waste CPU time generating 10,000,000 samples. The Monte Carlo version took 203 seconds; this took just a few milliseconds!&lt;/p&gt;
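&lt;p&gt;Whether enumeration beats sampling depends on the grid size alone: enumeration visits $2^{N \times K}$ grids, while Monte Carlo uses a fixed number of samples. A quick back-of-the-envelope sketch:&lt;/p&gt;

```python
# Compare the work of exhaustive enumeration (2^(N*K) grid evaluations)
# against a fixed Monte Carlo sample budget, for square N x N grids.
num_simulations = 10_000_000  # the sample count used above
for n in (2, 3, 4, 5, 10, 100):
    enum_count = 2 ** (n * n)
    cheaper = 'enumerate' if enum_count < num_simulations else 'Monte Carlo'
    print(f'{n}x{n}: 2^{n * n} grids ({len(str(enum_count))} digits) -> {cheaper}')
```

&lt;p&gt;Already at 5x5 the enumeration count exceeds the 10,000,000 samples used above, and at 100x100 the count is a 3,011-digit number.&lt;/p&gt;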
&lt;p&gt;There is still value in the Monte Carlo method, since computing all $ 2^{N \times K} $ combinations is not feasible for large grids, e.g. a $ 100 \times 100 $ spin glass has $2^{10,000}$ spin combinations. By the time we went through all of them, the Sun would have exploded and the Universe ended. Here the MC method, which naturally concentrates samples on the high-probability grids, is still preferable. Let's use MC to draw the calibration curve for a $ 100 \times 100 $ grid using 100 samples, to make sure the generated grids obey the desired probabilities:&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;span class="c1"&gt;# consistency check&lt;/span&gt;
&lt;span class="n"&gt;num_simulations&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;100&lt;/span&gt;
&lt;span class="n"&gt;rows&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;cols&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;p0&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;100&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;100&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mf"&gt;0.5&lt;/span&gt;
&lt;span class="n"&gt;ps&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[]&lt;/span&gt;
&lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;p&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;ZERO&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mf"&gt;0.1&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mf"&gt;0.2&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mf"&gt;0.3&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mf"&gt;0.4&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mf"&gt;0.5&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mf"&gt;0.6&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mf"&gt;0.7&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mf"&gt;0.8&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mf"&gt;0.9&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;ONE&lt;/span&gt;&lt;span class="p"&gt;]:&lt;/span&gt;
    &lt;span class="n"&gt;same&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;total&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;
    &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;_&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="nb"&gt;range&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;num_simulations&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
        &lt;span class="n"&gt;grid&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;create_grid&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;rows&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;cols&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;p0&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;p&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
        &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;i&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="nb"&gt;range&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;rows&lt;/span&gt;&lt;span class="o"&gt;-&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
            &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;j&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="nb"&gt;range&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;cols&lt;/span&gt;&lt;span class="o"&gt;-&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
                &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;grid&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;i&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;j&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="o"&gt;==&lt;/span&gt; &lt;span class="n"&gt;grid&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;i&lt;/span&gt;&lt;span class="o"&gt;+&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;j&lt;/span&gt;&lt;span class="p"&gt;]:&lt;/span&gt;
                    &lt;span class="n"&gt;same&lt;/span&gt; &lt;span class="o"&gt;+=&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt;
                &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;grid&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;i&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;j&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="o"&gt;==&lt;/span&gt; &lt;span class="n"&gt;grid&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;i&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;j&lt;/span&gt;&lt;span class="o"&gt;+&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;]:&lt;/span&gt;
                    &lt;span class="n"&gt;same&lt;/span&gt; &lt;span class="o"&gt;+=&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt;
                &lt;span class="n"&gt;total&lt;/span&gt; &lt;span class="o"&gt;+=&lt;/span&gt; &lt;span class="mi"&gt;2&lt;/span&gt;
    &lt;span class="n"&gt;measured_p&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;same&lt;/span&gt;&lt;span class="o"&gt;/&lt;/span&gt;&lt;span class="n"&gt;total&lt;/span&gt;
    &lt;span class="n"&gt;ps&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;append&lt;/span&gt;&lt;span class="p"&gt;((&lt;/span&gt;&lt;span class="n"&gt;p&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;measured_p&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;

&lt;span class="n"&gt;plt&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;plot&lt;/span&gt;&lt;span class="p"&gt;([&lt;/span&gt;&lt;span class="n"&gt;x&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;x&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;ps&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;x&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;x&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;ps&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt; &lt;span class="n"&gt;marker&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;o&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="n"&gt;plt&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;xlabel&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;actual p&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="n"&gt;plt&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;ylabel&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;measured p&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/pre&gt;&lt;/div&gt;


&lt;p&gt;Yields:&lt;/p&gt;
&lt;p&gt;&lt;img src="/images/prob_spin_glass_p2_4_calibration.png" alt="Probabilistic spin glass" style="width: 300px;"/&gt;&lt;/p&gt;
&lt;h2&gt;Entropy&lt;/h2&gt;
&lt;p&gt;Next, let's see how entropy varies with grid size for square grids. Since we can directly compute the probability of each grid, we enumerate the grids, get the probabilities, and use the standard formula for entropy:&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;spin_glass_entropy&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;rows&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;cols&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;p0&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;p&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;base&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mi"&gt;2&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
    &lt;span class="n"&gt;entropy&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;
    &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;i&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;li&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="nb"&gt;enumerate&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nb"&gt;list&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;itertools&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;product&lt;/span&gt;&lt;span class="p"&gt;([&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt; &lt;span class="n"&gt;repeat&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;rows&lt;/span&gt;&lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="n"&gt;cols&lt;/span&gt;&lt;span class="p"&gt;))):&lt;/span&gt;
        &lt;span class="n"&gt;grid&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;np&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;asarray&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;li&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;reshape&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;rows&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;cols&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;astype&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nb"&gt;int&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
        &lt;span class="n"&gt;grid_probability&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;spin_glass_probability&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;grid&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;p0&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;p&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
        &lt;span class="n"&gt;entropy&lt;/span&gt; &lt;span class="o"&gt;-=&lt;/span&gt; &lt;span class="n"&gt;grid_probability&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt; &lt;span class="n"&gt;log&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;grid_probability&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;base&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;entropy&lt;/span&gt;

&lt;span class="n"&gt;grid_entropies&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[]&lt;/span&gt;
&lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;n&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;2&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;3&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;4&lt;/span&gt;&lt;span class="p"&gt;]:&lt;/span&gt;
    &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;p&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;ZERO&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mf"&gt;0.1&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mf"&gt;0.2&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mf"&gt;0.3&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mf"&gt;0.4&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mf"&gt;0.5&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mf"&gt;0.6&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mf"&gt;0.7&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mf"&gt;0.8&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mf"&gt;0.9&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;ONE&lt;/span&gt;&lt;span class="p"&gt;]:&lt;/span&gt;
        &lt;span class="n"&gt;h&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;spin_glass_entropy&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;rows&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;n&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;cols&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;n&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;p0&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;p0&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;p&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;p&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
        &lt;span class="n"&gt;grid_entropies&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;append&lt;/span&gt;&lt;span class="p"&gt;((&lt;/span&gt;&lt;span class="n"&gt;n&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;p&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;h&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;

&lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;n&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="nb"&gt;sorted&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nb"&gt;list&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nb"&gt;set&lt;/span&gt;&lt;span class="p"&gt;([&lt;/span&gt;&lt;span class="n"&gt;x&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;x&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;grid_entropies&lt;/span&gt;&lt;span class="p"&gt;]))):&lt;/span&gt;
    &lt;span class="n"&gt;xs&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;x&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;x&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;grid_entropies&lt;/span&gt; &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;x&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="o"&gt;==&lt;/span&gt; &lt;span class="n"&gt;n&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;
    &lt;span class="n"&gt;ys&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;x&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="mi"&gt;2&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;x&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;grid_entropies&lt;/span&gt; &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;x&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="o"&gt;==&lt;/span&gt; &lt;span class="n"&gt;n&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;
    &lt;span class="n"&gt;plt&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;plot&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;xs&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;ys&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;marker&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;o&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="n"&gt;plt&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;xlabel&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;p&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="n"&gt;plt&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;ylabel&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;entropy (bits)&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="n"&gt;plt&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;legend&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nb"&gt;sorted&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nb"&gt;list&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nb"&gt;set&lt;/span&gt;&lt;span class="p"&gt;([&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;&lt;/span&gt;&lt;span class="si"&gt;%d&lt;/span&gt;&lt;span class="s1"&gt; x &lt;/span&gt;&lt;span class="si"&gt;%d&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;&lt;/span&gt; &lt;span class="o"&gt;%&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;x&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt; &lt;span class="n"&gt;x&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;])&lt;/span&gt; &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;x&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;grid_entropies&lt;/span&gt;&lt;span class="p"&gt;]))))&lt;/span&gt;
&lt;span class="n"&gt;plt&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;ylim&lt;/span&gt;&lt;span class="p"&gt;((&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;17&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;
&lt;/pre&gt;&lt;/div&gt;


&lt;p&gt;Yields:&lt;/p&gt;
&lt;p&gt;&lt;img src="/images/prob_spin_glass_p2_5.png" alt="Probabilistic spin glass" style="width: 300px;"/&gt;&lt;/p&gt;
&lt;p&gt;We can plot the same data another way, drawing lines of identical probability $p$:&lt;/p&gt;
&lt;p&gt;&lt;img src="/images/prob_spin_glass_p2_6.png" alt="Probabilistic spin glass" style="width: 300px;"/&gt;&lt;/p&gt;
&lt;p&gt;For a given grid size, the entropy $H(p)$ is concave. For a given probability $p$, the entropy $H(N)$ is convex and scales like $O(N^2)$ for an $N \times N$ spin glass (for $p=0.5$, the entropy is exactly $H(N)=N^2$ bits).&lt;/p&gt;
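&lt;p&gt;The $p=0.5$ case is easy to verify directly (a self-contained sketch, not from the original notebook): at $p=0.5$ each spin is a fair coin flip independent of its neighbours, so all $2^{N^2}$ grids are equally likely and the entropy is exactly $N^2$ bits:&lt;/p&gt;

```python
from math import log2

# At p = 0.5 the grid distribution is uniform over all 2**(n*n)
# configurations, so the entropy is exactly n**2 bits.
for n in [1, 2, 3]:
    num_grids = 2 ** (n * n)
    p_grid = 1 / num_grids                          # uniform distribution
    entropy = -sum(p_grid * log2(p_grid) for _ in range(num_grids))
    print(n, entropy)
```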
&lt;h2&gt;Conclusion&lt;/h2&gt;
&lt;p&gt;The main lesson here was that it's not always necessary to perform Monte Carlo simulations. Sometimes it's possible to enumerate all possibilities with the appropriate probabilities. When computing the entropy, this is preferred, since importance sampling does not necessarily yield a good approximation of entropy.&lt;/p&gt;</content><category term="entropy"></category><category term="physics"></category><category term="spin"></category><category term="glass"></category></entry><entry><title>Probabilistic spin glass - Part I</title><link href="https://bytepawn.com/probabilistic-spin-glass.html" rel="alternate"></link><published>2021-12-11T00:00:00+01:00</published><updated>2021-12-11T00:00:00+01:00</updated><author><name>Marton Trencseni</name></author><id>tag:bytepawn.com,2021-12-11:/probabilistic-spin-glass.html</id><summary type="html">&lt;p&gt;I run Monte Carlo simulations on probabilistic spin glasses, a simple mathematical model of magnetized matter with short range interactions. I use entropy to characterize the model's order-disorder transition. &lt;br/&gt;&lt;br/&gt;&lt;img src="/images/prob_spin_glass_21.png" alt="." style="width: 300px;"/&gt;&lt;/p&gt;</summary><content type="html">&lt;h2&gt;Introduction&lt;/h2&gt;
&lt;p&gt;In the &lt;a href="https://bytepawn.com/tag/entropy.html"&gt;previous articles&lt;/a&gt; I talked about entropy, and as an example of entropy in Physics, I &lt;a href="https://bytepawn.com/the-physical-sackur-tetrode-entropy-of-an-ideal-gas.html"&gt;derived the Sackur-Tetrode entropy of a monatomic ideal gas&lt;/a&gt;. Here I will discuss what I call probabilistic &lt;a href="https://en.wikipedia.org/wiki/Spin_glass"&gt;spin glass&lt;/a&gt;, a simple mathematical model of magnetized matter with short range interactions. In this article I introduce the concept and run some Monte Carlo simulations to get a feel for the model, and use entropy to characterize its order-disorder transition. &lt;a href="https://github.com/mtrencseni/playground/blob/master/Probabilistic%20spin%20glass.ipynb"&gt;The code is up on Github&lt;/a&gt;.&lt;/p&gt;
&lt;h2&gt;Probabilistic spin glass&lt;/h2&gt;
&lt;p&gt;Let's take an $N \times N$ sized spin system, i.e. each cell has two states, ↑ and ↓, or 0 and 1. If each spin is independent of its neighbours, then the whole system is just a list of independent random variables, and the geometry doesn't matter. To make it interesting, let's make the spins dependent on their neighbours: for spin $s$ and neighbour $n$: $P(s=↑ | n=↑) = P(s=↓ | n=↓) = p$. So the spins align with probability $p$ and are opposite with probability $1-p$.&lt;/p&gt;
&lt;h2&gt;Monte Carlo simulations&lt;/h2&gt;
&lt;p&gt;Let's take the above and run Monte Carlo simulations to see what happens as we vary $p$. We will take an array of size $N \times N$ and start filling out the spins according to the probabilities above. First we have to introduce an additional probability $p_0$, which is the probability of the first (let's say top left) spin being ↑. Once we set the first spin, we can set the rest of the spins in the first row, each conditioned on its left neighbour. From the second row onwards we run into a difficulty: the spins in the row above have already been set, so each new spin (except the first in its row) has two already-set neighbours. We need to compute the conditional probability $P(s=↑ | n_1=., n_2=.)$ given the two neighbours $n_1$ and $n_2$, in a way that is consistent with the basic definition of $p$. Using &lt;a href="https://en.wikipedia.org/wiki/Bayes%27_theorem"&gt;Bayes' theorem&lt;/a&gt;, for the specific case of $P(s=↑ | n_1=↑, n_2=↑)$:&lt;/p&gt;
&lt;p&gt;$P(s=↑ | n_1=↑, n_2=↑) = \frac{P(s=↑, n_1=↑, n_2=↑)}{P(s=↑, n_1=↑, n_2=↑) + P(s=↓, n_1=↑, n_2=↑)} $&lt;/p&gt;
&lt;p&gt;The joint triplet probabilities are easy to compute: we can think of setting $s$ first with probability $p_0$, and then each of the two neighbours, aligned with probability $p$ and opposite with probability $1-p$:&lt;/p&gt;
&lt;p&gt;$P(s=↑, n_1=↑, n_2=↑) = p_0 p^2$&lt;/p&gt;
&lt;p&gt;Expressed as Python code, &lt;code&gt;calculate_conditional_probs_for_1()&lt;/code&gt; returns the conditional probabilities $P(s=↑ | n_1=., n_2=.)$:&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;calculate_conditional_probs_for_1&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;p0&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;p&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
    &lt;span class="n"&gt;q&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt; &lt;span class="o"&gt;-&lt;/span&gt; &lt;span class="n"&gt;p&lt;/span&gt;
    &lt;span class="n"&gt;joint_probs&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="s1"&gt;&amp;#39;000&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;p0&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt; &lt;span class="n"&gt;p&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt; &lt;span class="n"&gt;p&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="s1"&gt;&amp;#39;111&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;p0&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt; &lt;span class="n"&gt;p&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt; &lt;span class="n"&gt;p&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="s1"&gt;&amp;#39;010&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;p0&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt; &lt;span class="n"&gt;q&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt; &lt;span class="n"&gt;q&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="s1"&gt;&amp;#39;101&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;p0&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt; &lt;span class="n"&gt;q&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt; &lt;span class="n"&gt;q&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="s1"&gt;&amp;#39;110&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;p0&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt; &lt;span class="n"&gt;p&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt; &lt;span class="n"&gt;q&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="s1"&gt;&amp;#39;011&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;p0&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt; &lt;span class="n"&gt;p&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt; &lt;span class="n"&gt;q&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="s1"&gt;&amp;#39;001&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;p0&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt; &lt;span class="n"&gt;p&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt; &lt;span class="n"&gt;q&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="s1"&gt;&amp;#39;100&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;p0&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt; &lt;span class="n"&gt;p&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt; &lt;span class="n"&gt;q&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="p"&gt;}&lt;/span&gt;
    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="s1"&gt;&amp;#39;00&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;joint_probs&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;010&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;&lt;span class="o"&gt;/&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;joint_probs&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;010&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;&lt;span class="o"&gt;+&lt;/span&gt;&lt;span class="n"&gt;joint_probs&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;000&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;]),&lt;/span&gt;
        &lt;span class="s1"&gt;&amp;#39;01&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;joint_probs&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;011&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;&lt;span class="o"&gt;/&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;joint_probs&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;011&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;&lt;span class="o"&gt;+&lt;/span&gt;&lt;span class="n"&gt;joint_probs&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;001&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;]),&lt;/span&gt;
        &lt;span class="s1"&gt;&amp;#39;10&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;joint_probs&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;110&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;&lt;span class="o"&gt;/&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;joint_probs&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;110&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;&lt;span class="o"&gt;+&lt;/span&gt;&lt;span class="n"&gt;joint_probs&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;100&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;]),&lt;/span&gt;
        &lt;span class="s1"&gt;&amp;#39;11&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;joint_probs&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;111&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;&lt;span class="o"&gt;/&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;joint_probs&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;111&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;&lt;span class="o"&gt;+&lt;/span&gt;&lt;span class="n"&gt;joint_probs&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;101&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;]),&lt;/span&gt;
    &lt;span class="p"&gt;}&lt;/span&gt;
&lt;/pre&gt;&lt;/div&gt;
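&lt;p&gt;A numeric spot-check of the conditionals (re-implemented below so the snippet is self-contained; it mirrors the dictionary above rather than the notebook's own code): the $p_0$ factors cancel in each ratio, so for example $P(s=↑ | n_1=↑, n_2=↑) = p^2 / (p^2 + q^2)$, and with one aligned and one opposite neighbour the conditional is exactly $1/2$, as symmetry demands:&lt;/p&gt;

```python
# Mirrors the article's calculate_conditional_probs_for_1(), kept
# self-contained for this spot-check.
def conditional_probs_for_1(p0, p):
    q = 1 - p
    joint = {
        '000': p0 * p * p, '111': p0 * p * p,
        '010': p0 * q * q, '101': p0 * q * q,
        '110': p0 * p * q, '011': p0 * p * q,
        '001': p0 * p * q, '100': p0 * p * q,
    }
    return {
        '00': joint['010'] / (joint['010'] + joint['000']),
        '01': joint['011'] / (joint['011'] + joint['001']),
        '10': joint['110'] / (joint['110'] + joint['100']),
        '11': joint['111'] / (joint['111'] + joint['101']),
    }

probs = conditional_probs_for_1(p0=0.5, p=0.9)
print(round(probs['11'], 4))   # p**2 / (p**2 + q**2) = 0.81 / 0.82
print(probs['01'])             # one aligned, one opposite neighbour
```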


&lt;p&gt;The rest of the system:&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;make_unconditional_set&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;p0&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="k"&gt;lambda&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;random&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt; &lt;span class="o"&gt;&amp;lt;&lt;/span&gt; &lt;span class="n"&gt;p0&lt;/span&gt;

&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;make_conditional_set_one_neighbour&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;p&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="k"&gt;lambda&lt;/span&gt; &lt;span class="n"&gt;n&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;n&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;random&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt; &lt;span class="o"&gt;&amp;lt;&lt;/span&gt; &lt;span class="n"&gt;p&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;else&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt; &lt;span class="o"&gt;-&lt;/span&gt; &lt;span class="n"&gt;n&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;make_conditional_set_two_neighbours&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;p0&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;p&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
    &lt;span class="n"&gt;conditional_probs_for_1&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;calculate_conditional_probs_for_1&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;p0&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;p&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="k"&gt;lambda&lt;/span&gt; &lt;span class="n"&gt;n1&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;n2&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;random&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt; &lt;span class="o"&gt;&amp;lt;&lt;/span&gt; &lt;span class="n"&gt;conditional_probs_for_1&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="nb"&gt;str&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nb"&gt;int&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;n1&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;&lt;span class="o"&gt;+&lt;/span&gt;&lt;span class="nb"&gt;str&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nb"&gt;int&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;n2&lt;/span&gt;&lt;span class="p"&gt;))]&lt;/span&gt;

&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;create_grid&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;rows&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;cols&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;p0&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;p&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
    &lt;span class="n"&gt;unconditional_set&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;make_unconditional_set&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;p0&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="n"&gt;conditional_set_one_neighbour&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;make_conditional_set_one_neighbour&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;p&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="n"&gt;conditional_set_two_neighbours&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;make_conditional_set_two_neighbours&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;p0&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;p&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="n"&gt;grid&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;np&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;zeros&lt;/span&gt;&lt;span class="p"&gt;([&lt;/span&gt;&lt;span class="n"&gt;rows&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;cols&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt; &lt;span class="n"&gt;dtype&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;np&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;bool&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;    
    &lt;span class="c1"&gt;# first element&lt;/span&gt;
    &lt;span class="n"&gt;grid&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;unconditional_set&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
    &lt;span class="c1"&gt;# first row&lt;/span&gt;
    &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;j&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="nb"&gt;range&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;cols&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
        &lt;span class="n"&gt;grid&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;j&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;conditional_set_one_neighbour&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;grid&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;j&lt;/span&gt;&lt;span class="o"&gt;-&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;])&lt;/span&gt;
    &lt;span class="c1"&gt;# remaining rows&lt;/span&gt;
    &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;i&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="nb"&gt;range&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;rows&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
        &lt;span class="n"&gt;grid&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;i&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;conditional_set_one_neighbour&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;grid&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;i&lt;/span&gt;&lt;span class="o"&gt;-&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;])&lt;/span&gt;
        &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;j&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="nb"&gt;range&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;cols&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
            &lt;span class="n"&gt;grid&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;i&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;j&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;conditional_set_two_neighbours&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;grid&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;i&lt;/span&gt;&lt;span class="o"&gt;-&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;j&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt; &lt;span class="n"&gt;grid&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;i&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;j&lt;/span&gt;&lt;span class="o"&gt;-&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;])&lt;/span&gt;
    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;grid&lt;/span&gt;

&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;show&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;grid&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
    &lt;span class="n"&gt;plt&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;imshow&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;grid&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;cmap&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;hot&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;interpolation&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;nearest&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="n"&gt;plt&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;show&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
&lt;/pre&gt;&lt;/div&gt;


&lt;p&gt;Let's run it with $p_0=0.5$ and $p=0.7$ and see a realization:&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;show(create_grid(rows=10, cols=10, p0=0.5, p=0.7))
&lt;/pre&gt;&lt;/div&gt;


&lt;p&gt;&lt;img src="/images/prob_spin_glass_1.png" alt="Probabilistic spin glass" style="width: 300px;"/&gt;&lt;/p&gt;
&lt;h2&gt;Validation&lt;/h2&gt;
&lt;p&gt;First, let's make sure the math and logic are correct. Let's simulate a large $100 \times 100$ spin glass with the above rules, then count all neighbouring pairs, and make sure that the original conditional probability holds:&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;span class="c1"&gt;# consistency check&lt;/span&gt;
&lt;span class="n"&gt;num_simulations&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;100&lt;/span&gt;
&lt;span class="n"&gt;rows&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;cols&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;p0&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;p&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;100&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;100&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mf"&gt;0.5&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mf"&gt;0.70&lt;/span&gt;
&lt;span class="n"&gt;same&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;total&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;
&lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;_&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="nb"&gt;range&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;num_simulations&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
    &lt;span class="n"&gt;grid&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;create_grid&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;rows&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;cols&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;p0&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;p&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;i&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="nb"&gt;range&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;rows&lt;/span&gt;&lt;span class="o"&gt;-&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
        &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;j&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="nb"&gt;range&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;cols&lt;/span&gt;&lt;span class="o"&gt;-&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
            &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;grid&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;i&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;j&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="o"&gt;==&lt;/span&gt; &lt;span class="n"&gt;grid&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;i&lt;/span&gt;&lt;span class="o"&gt;+&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;j&lt;/span&gt;&lt;span class="p"&gt;]:&lt;/span&gt;
                &lt;span class="n"&gt;same&lt;/span&gt; &lt;span class="o"&gt;+=&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt;
            &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;grid&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;i&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;j&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="o"&gt;==&lt;/span&gt; &lt;span class="n"&gt;grid&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;i&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;j&lt;/span&gt;&lt;span class="o"&gt;+&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;]:&lt;/span&gt;
                &lt;span class="n"&gt;same&lt;/span&gt; &lt;span class="o"&gt;+=&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt;
            &lt;span class="n"&gt;total&lt;/span&gt; &lt;span class="o"&gt;+=&lt;/span&gt; &lt;span class="mi"&gt;2&lt;/span&gt;
&lt;span class="n"&gt;measured_p&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;same&lt;/span&gt;&lt;span class="o"&gt;/&lt;/span&gt;&lt;span class="n"&gt;total&lt;/span&gt;
&lt;span class="k"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;f&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;Ran {num_simulations} Monte Carlo simulations for grid size {rows}x{cols} with p={p}, measured p was {measured_p:.2}&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/pre&gt;&lt;/div&gt;


&lt;p&gt;Prints:&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;Ran 100 Monte Carlo simulations for grid size 100x100 with p=0.7, measured p was 0.7
&lt;/pre&gt;&lt;/div&gt;


&lt;h2&gt;Disorder and domains&lt;/h2&gt;
&lt;p&gt;First, let's look at the special case of $p=0$. In this case the spin glass always looks like this (or its inversion):&lt;/p&gt;
&lt;p&gt;&lt;img src="/images/prob_spin_glass_3.png" alt="Probabilistic spin glass" style="width: 300px;"/&gt;&lt;/p&gt;
&lt;p&gt;This is because $p=0$ forces each spin to be the inverse of its neighbours.&lt;/p&gt;
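&lt;p&gt;This deterministic structure can be sketched directly, without sampling: with $p=0$, $p_0$ only decides the colour of the top-left cell, and the parity of $i+j$ determines everything else. A minimal illustration (the helper name is mine, not from the post):&lt;/p&gt;

```python
import numpy as np

def chessboard(rows, cols, first_spin):
    # with p=0 every spin is the inverse of its up/left neighbour,
    # so grid[i, j] is fixed by the first spin and the parity of i+j
    grid = np.zeros((rows, cols), dtype=int)
    for i in range(rows):
        for j in range(cols):
            grid[i, j] = (first_spin + i + j) % 2
    return grid

grid = chessboard(4, 4, first_spin=1)
# every horizontal and vertical neighbour pair is anti-aligned
assert all(grid[i, j] != grid[i, j+1] for i in range(4) for j in range(3))
assert all(grid[i, j] != grid[i+1, j] for i in range(3) for j in range(4))
```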
&lt;p&gt;Let's look at what happens as we vary $p$ from 0.1 to 1.0:&lt;/p&gt;
&lt;p&gt;&lt;img src="/images/prob_spin_glass_21.png" alt="Probabilistic spin glass" style="width: 500px;"/&gt;&lt;/p&gt;
&lt;p&gt;The same, on $100 \times 100$ spin glass systems:&lt;/p&gt;
&lt;p&gt;&lt;img src="/images/prob_spin_glass_22.png" alt="Probabilistic spin glass" style="width: 500px;"/&gt;&lt;/p&gt;
&lt;p&gt;As $p$ increases towards $\frac{1}{2}$, disorder increases. Note that at $p=0.5$ neighboring spins are independent. At $p &amp;gt; 0.5$ domains form, because spins prefer to align with each other, but occasionally a random inversion occurs and a patch is formed.&lt;/p&gt;
&lt;p&gt;Let's check how the fraction $f$ of aligned spins varies with $p$:&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;span class="n"&gt;rows&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;cols&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;p0&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;100&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;100&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mf"&gt;0.5&lt;/span&gt;
&lt;span class="n"&gt;num_samples&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;100&lt;/span&gt;
&lt;span class="n"&gt;pairs&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[]&lt;/span&gt;
&lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;p&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="mf"&gt;0.001&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mf"&gt;0.1&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mf"&gt;0.2&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mf"&gt;0.3&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mf"&gt;0.4&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mf"&gt;0.5&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mf"&gt;0.6&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mf"&gt;0.7&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mf"&gt;0.8&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mf"&gt;0.9&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mf"&gt;0.95&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mf"&gt;0.975&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mf"&gt;0.999&lt;/span&gt;&lt;span class="p"&gt;]:&lt;/span&gt;
    &lt;span class="n"&gt;rs&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[]&lt;/span&gt;
    &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;_&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="nb"&gt;range&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;num_samples&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
        &lt;span class="n"&gt;grid&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;create_grid&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;rows&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;cols&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;p0&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;p&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
        &lt;span class="n"&gt;r&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;np&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;sum&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;grid&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;&lt;span class="o"&gt;/&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;rows&lt;/span&gt;&lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="n"&gt;cols&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
        &lt;span class="n"&gt;r&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nb"&gt;max&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;r&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="o"&gt;-&lt;/span&gt;&lt;span class="n"&gt;r&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
        &lt;span class="n"&gt;rs&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;append&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;r&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="n"&gt;pairs&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;append&lt;/span&gt;&lt;span class="p"&gt;((&lt;/span&gt;&lt;span class="n"&gt;p&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;np&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;mean&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;rs&lt;/span&gt;&lt;span class="p"&gt;)))&lt;/span&gt;
&lt;span class="n"&gt;plt&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;plot&lt;/span&gt;&lt;span class="p"&gt;([&lt;/span&gt;&lt;span class="n"&gt;x&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;x&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;pairs&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;x&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;x&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;pairs&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt; &lt;span class="n"&gt;marker&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;o&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="n"&gt;plt&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;xlabel&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;p&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="n"&gt;plt&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;ylabel&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;f&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="n"&gt;plt&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;ylim&lt;/span&gt;&lt;span class="p"&gt;((&lt;/span&gt;&lt;span class="mf"&gt;0.48&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mf"&gt;1.0&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;
&lt;/pre&gt;&lt;/div&gt;


&lt;p&gt;&lt;img src="/images/prob_spin_glass_41.png" alt="Probabilistic spin glass" style="width: 300px;"/&gt;&lt;/p&gt;
&lt;p&gt;Perhaps unintuitively, the dependence is not linear. The plot has two parts:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;for $0 \leq p \leq \frac{1}{2}$, $f=\frac{1}{2}$. In this region, on average half of the spins are ↑ and half are ↓&lt;ul&gt;
&lt;li&gt;at $p=0$, the spins are completely ordered like a chess board&lt;/li&gt;
&lt;li&gt;at $p=\frac{1}{2}$, the spins are completely independent&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;li&gt;at $p &amp;gt; \frac{1}{2}$, one of the spin directions starts to dominate as $f$ breaks away from $\frac{1}{2}$, but the breakaway is very slow&lt;ul&gt;
&lt;li&gt;$p$ has to get very close to 1 for $f$ to approach 1&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;/ul&gt;
&lt;h2&gt;Disorder and entropy&lt;/h2&gt;
&lt;p&gt;$f$ is constant between $0 \leq p \leq \frac{1}{2}$, but visually we can see that the system goes from order (chessboard) to complete disorder (independent spins). We can use entropy to characterize the order-disorder property of the model. Let's do brute-force Monte Carlo simulations to compute the entropy of a small $4 \times 4$ spin glass:&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;entropy&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;frequencies&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;base&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mi"&gt;2&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="o"&gt;-&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt; &lt;span class="nb"&gt;sum&lt;/span&gt;&lt;span class="p"&gt;([&lt;/span&gt;&lt;span class="n"&gt;f&lt;/span&gt;&lt;span class="o"&gt;/&lt;/span&gt;&lt;span class="nb"&gt;sum&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;frequencies&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt; &lt;span class="n"&gt;log&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;f&lt;/span&gt;&lt;span class="o"&gt;/&lt;/span&gt;&lt;span class="nb"&gt;sum&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;frequencies&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt; &lt;span class="n"&gt;base&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;f&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;frequencies&lt;/span&gt;&lt;span class="p"&gt;])&lt;/span&gt;

&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;spin_glass_entropy&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;rows&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;cols&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;p0&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;p&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;num_samples&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
    &lt;span class="n"&gt;frequencies&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;Counter&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
    &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;_&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="nb"&gt;range&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;num_samples&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
        &lt;span class="n"&gt;grid&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;create_grid&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;rows&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;cols&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;p0&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;p&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;    
        &lt;span class="n"&gt;frequencies&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="nb"&gt;str&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;np&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;packbits&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;grid&lt;/span&gt;&lt;span class="p"&gt;))]&lt;/span&gt; &lt;span class="o"&gt;+=&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt;
    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;entropy&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;frequencies&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;values&lt;/span&gt;&lt;span class="p"&gt;())&lt;/span&gt;

&lt;span class="n"&gt;pairs&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[]&lt;/span&gt;
&lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;p&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="mf"&gt;0.001&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mf"&gt;0.1&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mf"&gt;0.2&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mf"&gt;0.3&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mf"&gt;0.4&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mf"&gt;0.5&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mf"&gt;0.6&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mf"&gt;0.7&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mf"&gt;0.8&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mf"&gt;0.9&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mf"&gt;0.999&lt;/span&gt;&lt;span class="p"&gt;]:&lt;/span&gt;
    &lt;span class="n"&gt;h&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;spin_glass_entropy&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;rows&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mi"&gt;4&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;cols&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mi"&gt;4&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;p0&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mf"&gt;0.5&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;p&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;p&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;num_samples&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mi"&gt;100&lt;/span&gt;&lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="mi"&gt;1000&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="n"&gt;pairs&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;append&lt;/span&gt;&lt;span class="p"&gt;((&lt;/span&gt;&lt;span class="n"&gt;p&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;h&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;
&lt;span class="n"&gt;plt&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;plot&lt;/span&gt;&lt;span class="p"&gt;([&lt;/span&gt;&lt;span class="n"&gt;x&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;x&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;pairs&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;x&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;x&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;pairs&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt; &lt;span class="n"&gt;marker&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;o&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="n"&gt;plt&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;xlabel&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;p&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="n"&gt;plt&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;ylabel&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;entropy (bits)&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="n"&gt;plt&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;ylim&lt;/span&gt;&lt;span class="p"&gt;((&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;18&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;
&lt;/pre&gt;&lt;/div&gt;


&lt;p&gt;&lt;img src="/images/prob_spin_glass_5.png" alt="Probabilistic spin glass" style="width: 300px;"/&gt;&lt;/p&gt;
&lt;p&gt;&lt;em&gt;Note: the entropy shown on the y-axis is specific to this $4 \times 4$ spin glass, it would be higher for a larger system.&lt;/em&gt;&lt;/p&gt;
&lt;p&gt;The entropy plot above confirms our intuition: at $p=0$, the entropy is just 1 bit, which comes from $p_0=\frac{1}{2}$: there are two (inverted) ways to get a chessboard. It's also 1 bit at $p=1$, where the entire spin glass is either all ↑ or all ↓. In the middle, at $p=\frac{1}{2}$, the entropy is 16 bits, since $4 \times 4 = 16$ and all the spins are independent.&lt;/p&gt;
&lt;p&gt;The concave shape of the entropy plot is the same as that of a $p$-biased coin with entropy function $H(p) = -p \log p - (1-p) \log (1-p)$, so with respect to entropy the spin glass behaves like a biased coin, where the biasing factor $p$ is the correlation strength between neighboring spins.&lt;/p&gt;
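&lt;p&gt;The binary entropy function is easy to compute directly; a quick sketch to check the shape claims (maximal 1 bit at $p=\frac{1}{2}$, symmetric around it, zero at the endpoints):&lt;/p&gt;

```python
from math import log2

def binary_entropy(p):
    # H(p) = -p*log2(p) - (1-p)*log2(1-p), defined as 0 at the endpoints
    if p in (0.0, 1.0):
        return 0.0
    return -p * log2(p) - (1 - p) * log2(1 - p)

assert binary_entropy(0.5) == 1.0                    # maximal at p = 1/2
assert binary_entropy(0.25) == binary_entropy(0.75)  # symmetric around 1/2
assert binary_entropy(0.0) == 0.0                    # no uncertainty at the endpoints
```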
&lt;h2&gt;Conclusion&lt;/h2&gt;
&lt;p&gt;Spin glasses are widely studied models of magnetism in physics. Here I used the toy probabilistic spin glass model to avoid having to introduce too much physical formalism (e.g. the Hamiltonian). We saw how the behaviour of the spin glass strongly depends on $p$, the correlation probability, and that entropy captures the order-disorder aspect of the system that we observe visually.&lt;/p&gt;</content><category term="entropy"></category><category term="physics"></category><category term="spin"></category><category term="glass"></category></entry><entry><title>The physical Sackur-Tetrode entropy of an ideal gas</title><link href="https://bytepawn.com/the-physical-sackur-tetrode-entropy-of-an-ideal-gas.html" rel="alternate"></link><published>2021-11-29T00:00:00+01:00</published><updated>2021-11-29T00:00:00+01:00</updated><author><name>Marton Trencseni</name></author><id>tag:bytepawn.com,2021-11-29:/the-physical-sackur-tetrode-entropy-of-an-ideal-gas.html</id><summary type="html">&lt;p&gt;I derive the Sackur-Tetrode equation for the entropy of a monatomic ideal gas. &lt;br/&gt;&lt;br/&gt;&lt;img src="/images/sackur.gif" alt="." style="width: 300px;"/&gt;&lt;/p&gt;</summary><content type="html">&lt;h2&gt;Introduction&lt;/h2&gt;
&lt;p&gt;In the &lt;a href="https://bytepawn.com/entropy-of-an-ideal-gas-with-coarse-graining.html#entropy-of-an-ideal-gas-with-coarse-graining"&gt;previous article&lt;/a&gt;, I derived a coarse-grained formula for the entropy of an ideal gas, taking into account only the uncertainty in the positions of the particles. The derived expression for entropy cannot be physical, since it depends on how we as observers choose to coarse grain the volume of the gas ($M$ bins). Also, it doesn't take into account the velocities of the particles. The next step, taken in this article, is to make physical arguments and compute an expression for $W$ that takes the velocities into account, parametrized by the kinetic energy of the system, and to derive the &lt;a href="https://en.wikipedia.org/wiki/Sackur%E2%80%93Tetrode_equation"&gt;Sackur–Tetrode equation&lt;/a&gt; for a monatomic ideal gas of indistinguishable particles. Once we have that, we can also derive the ideal gas equation of state $pV = NkT$.&lt;/p&gt;
&lt;h2&gt;Macroscopic state&lt;/h2&gt;
&lt;p&gt;When defining the entropy of an ideal gas, we first need to define some of its macroscopic properties. For example, are we talking about 1 liter of gas or 2 liters? What is the density of the gas? The parameters we need to define are:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;$V$ volume&lt;/li&gt;
&lt;li&gt;$N$ number of particles (specifying the number of particles $N$ is the same as specifying density $\rho$, since $\rho = N/V$)&lt;/li&gt;
&lt;li&gt;$E$ total kinetic energy of the particles making up the gas&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;The first two make sense, but what about the third? We know from real-world experience that at the human, macroscopic scale, objects have a temperature. We will see that the total kinetic energy of the particles is related to the temperature of the gas (slower movement is colder, faster movement is hotter).&lt;/p&gt;
&lt;h2&gt;Quantum mechanics&lt;/h2&gt;
&lt;p&gt;In classical physics, both position $x$ and velocity $v$ can be measured at arbitrary accuracy. So, in a thought experiment, we could measure $x$ and $v$ at time $t_0$, and we would know everything about the system at any other time. All we have to do is run a simulation, either backward or forward in time, and we could always tell $x_i(t)$ and $v_i(t)$ for all particles $i=1...N$. So, where does the uncertainty come from? Does it come from our unwillingness or inability to measure all the positions and velocities? If that were the case, entropy would be an observer or measurement dependent quantity, similar to the expression derived in the previous article.&lt;/p&gt;
&lt;p&gt;The way out is to remember &lt;a href="https://en.wikipedia.org/wiki/Uncertainty_principle"&gt;Heisenberg's uncertainty principle&lt;/a&gt; from quantum mechanics, which states that although a particle's position &lt;em&gt;or&lt;/em&gt; velocity (momentum) can each be measured to arbitrary precision, &lt;strong&gt;both cannot be at the same time&lt;/strong&gt;. There is a joint uncertainty in the measurement:&lt;/p&gt;
&lt;p&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;
$\Delta x_i \Delta p_i \ge h$&lt;/p&gt;
&lt;p&gt;Here $x_i$ is the position of the $i$th particle, $p_i=m v_i$ is the $i$th particle's momentum, $m$ is the common particle mass, and $h$ is a physical constant called &lt;a href="https://en.wikipedia.org/wiki/Planck_constant"&gt;Planck constant&lt;/a&gt;. In other words, the uncertainty principle introduces a natural coarse graining on the combined $(x, p)$ space!&lt;/p&gt;
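&lt;p&gt;To get a feeling for how fine this natural coarse graining is, here is a rough numeric sketch; the length and momentum scales below are illustrative assumptions, not values from the article:&lt;/p&gt;

```python
# back-of-the-envelope count of h-sized cells for a single degree of freedom
h = 6.626e-34    # Planck constant in J*s
delta_x = 1e-3   # particle confined to about a millimetre (assumption)
delta_p = 1e-24  # a typical thermal momentum spread in kg*m/s (assumption)
num_cells = (delta_x * delta_p) / h
print(f'roughly {num_cells:.2g} cells')
```

&lt;p&gt;Even at these small scales there are millions of cells per degree of freedom, which is why the coarse graining is invisible classically.&lt;/p&gt;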
&lt;h2&gt;Phase space&lt;/h2&gt;
&lt;p&gt;For an ideal gas, the microscopic state corresponds to specifying the positions $x_i$ and momenta $p_i$ of all $i=1...N$ particles. For our treatment, we will take all of these and build one large phase vector $(x_1, p_1, x_2, p_2, ... x_N, p_N)$. Here each $x$ and $p$ is itself a Cartesian 3D vector $(x, y, z)$, so this vector lives in the $6N$-dimensional so-called &lt;a href="https://en.wikipedia.org/wiki/Phase_space"&gt;phase space&lt;/a&gt;. Each point in phase space defines all the positions and momenta of the particles making up the gas and corresponds to a possible microscopic state of the system.&lt;/p&gt;
&lt;p&gt;However, we must put additional constraints on the allowable points in phase space:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;the positions $x_i$ have to fall within the volume of the gas&lt;/li&gt;
&lt;li&gt;the kinetic energies of the particles have to add up to the total kinetic energy $E$ of the gas&lt;/li&gt;
&lt;/ul&gt;
&lt;h2&gt;Marbles, bins and indistinguishable particles&lt;/h2&gt;
&lt;p&gt;In the previous article, we did not use a phase space formalism. Instead, we assumed we coarse grain the available volume into $M$ bins, and computed the number of ways $W(M, N)$ to put $N$ particles into the $M$ bins. In the phase space formalism, each point (and after quantization into $h$-sided hypercubes, each hypercube) in phase space already corresponds to a complete specification of the system. In other words, in this formalism, we are not computing $M$, the number of bins, we are directly computing $W$, the number of microscopic states!&lt;/p&gt;
&lt;p&gt;How do we take into account the indistinguishability of particles in the phase space formalism? If we pick a phase vector (or surrounding $h$-sided hypercube), where $(x_1, p_1)=A$ and $(x_2, p_2)=B$, that is really the same phase vector as when $(x_1, p_1)=B$ and $(x_2, p_2)=A$, since we can't tell particles 1 and 2 apart. So in the phase space formalism, if we have indistinguishable particles, we are double (multi) counting microscopic states, so we have to divide $W_{phase}$ by some factor to get the real $W$. What is the factor? We have to use some physical intuition and hand-waving. Because $h$ is a very small number, dividing the phase space into $h$-sided hypercubes yields a very large number of hypercubes, &lt;em&gt;even compared to the number of particles $N$,&lt;/em&gt; so the phase space will have a lot more hypercubes than $N$. This means that most of the time, no two particles will be in the same hypercube, i.e. we can ignore cases like $(x_1, p_1)=A$ and $(x_2, p_2)=A$ (where $=A$ denotes "falls into hypercube A"). In that case, if all $N$ particles are in different hypercubes, the degree of overcounting is just $N!$, the number of permutations of $N$ distinct letters.&lt;/p&gt;
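&lt;p&gt;The claim that the overcounting factor is almost exactly $N!$ rests on collisions between particles in the same hypercube being rare. A small birthday-problem sketch, with illustrative cell counts:&lt;/p&gt;

```python
from math import exp

def collision_probability(n_particles, n_cells):
    # birthday-problem approximation for the chance that at least two of
    # n_particles land in the same cell, assuming uniform independent placement
    return 1.0 - exp(-n_particles * (n_particles - 1) / (2.0 * n_cells))

p_few = collision_probability(100, 10**4)    # comparable scales: collisions likely
p_many = collision_probability(100, 10**12)  # far more cells than particles: negligible
print(f'{p_few:.3f} vs {p_many:.3g}')
```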
&lt;h2&gt;Sackur-Tetrode equation&lt;/h2&gt;
&lt;p&gt;Based on the above considerations, we can now derive the physical entropy of a monatomic ideal gas. Let's assume the gas is in a cube of volume $V=L^3$, and the total kinetic energy of particles is $E= \frac{m}{2} \sum_i v_i^2 $. Let's switch to using momenta $p_i = m v_i$ and $v_i^2 = \frac{1}{m^2} p_i^2$, then the energy in terms of momenta is $E = \frac{1}{2m} \sum_i p_i^2$. So the phase space $(x_1, p_1, x_2, p_2, ..., x_N, p_N)$ is constrained to the manifold:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;$ 0 &amp;lt; x_i &amp;lt; L$ — this constraint for the $x_i$s defines a hypercube of volume $V_{hyper}=V^N$, of dimensionality $3N$ (note that each $x_i$ here has 3 internal coordinates $(x, y, z)$)&lt;/li&gt;
&lt;li&gt;$E = \frac{1}{2m} \sum_i p_i^2$ — this constraint for the $p_i$s defines the $(3N-1)$-dimensional shell of a $3N$-dimensional hypersphere of radius $r = \sqrt{2mE}$ (note that each $p_i$ here has 3 internal coordinates $(p_x, p_y, p_z)$). This &lt;a href="https://en.wikipedia.org/wiki/N-sphere#Volume_and_surface_area"&gt;shell has area&lt;/a&gt; $ A_{hyper} = \frac{2 \pi ^ {\frac{3N}{2}} (2mE)^{\frac{3N-1}{2}} }{\Gamma(\frac{3N}{2})} $&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;In lower dimensions, the Cartesian product of a circle (with circumference $C$) and a square (with area $A$) is a generalized cylinder whose wall measure is $C A$. The same happens in higher dimensions: a hypersphere with shell area $A_{hyper}$ and a hypercube with volume $V_{hyper}$ form a hypercylinder with wall measure $ A_{hyper} V_{hyper} $. So the phase space volume $ P = V_{hyper} A_{hyper} $ is:&lt;/p&gt;
&lt;p&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;
$ P = V^N \frac{2 \pi ^ {\frac{3N}{2}} (2mE)^{\frac{3N-1}{2}} }{\Gamma(\frac{3N}{2})} $&lt;/p&gt;
&lt;p&gt;Here $\Gamma(x)$ is the &lt;a href="https://en.wikipedia.org/wiki/Gamma_function"&gt;Gamma function&lt;/a&gt;. Let's assume that $N$ is even, so $\frac{3N}{2}$ is an integer, in which case $\Gamma(x)=(x-1)!$. With this:&lt;/p&gt;
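&lt;p&gt;The identity $\Gamma(x)=(x-1)!$ for positive integer $x$ is easy to check numerically with Python's math.gamma (a quick sketch):&lt;/p&gt;

```python
from math import gamma, factorial, isclose

# Gamma(x) = (x-1)! when x is a positive integer
for n in range(1, 15):
    assert isclose(gamma(n), factorial(n - 1))
print(gamma(5))  # 24.0 (= 4!)
```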
&lt;p&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;
$ P = V^N \frac{2 \pi ^ {\frac{3N}{2}} (2mE)^{\frac{3N-1}{2}} }{(\frac{3N}{2}-1)!} $&lt;/p&gt;
&lt;p&gt;Now we face a problem, or problems. The phase space itself is $6N$-dimensional ($N$ particles, each with 3 position and 3 momentum coordinates), but the manifold is actually a "surface" of $6N-1$ dimensions, because the momentum constraint defined a $(3N-1)$-dimensional &lt;em&gt;shell&lt;/em&gt; of a $3N$-dimensional hypersphere. To turn the area into a volume, we need to give the shell some thickness, so we will assume that the shell has an infinitesimal thickness $\Delta p$. Since $E=\frac{m}{2} v^2=\frac{1}{2m}p^2$, we have $p^2=2mE$ and $p(E)=\sqrt{2mE}$. This means that $dp = \sqrt{\frac{m}{2E}} dE$, so the thickness is $ \Delta p = \sqrt{\frac{m}{2E}} \Delta E $. This term will be dropped in an approximation later, so its exact form is not that important.&lt;/p&gt;
&lt;p&gt;At this point, the fixed phase space volume is:&lt;/p&gt;
&lt;p&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;
$ P = V^N \frac{2 \pi ^ {\frac{3N}{2}} (2mE)^{\frac{3N-1}{2}} }{(\frac{3N}{2}-1)!} \sqrt{\frac{m}{2E}} \Delta E $&lt;/p&gt;
&lt;p&gt;Next, we use the uncertainty principle to coarse-grain this volume into hypercubes. Each $x-p$ pair (3 per particle, for $(x, y, z)$ physical 3D space) contributes an $h$, so the hypercubes have volume $h^{3N}$. We divide $P$ by this to get $W_{phase}$:&lt;/p&gt;
&lt;p&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;
$ W_{phase} = \frac{P}{h^{3N}} = \frac{V^N}{h^{3N}} \frac{2 \pi ^ {\frac{3N}{2}} (2mE)^{\frac{3N-1}{2}} }{(\frac{3N}{2}-1)!} \sqrt{\frac{m}{2E}} \Delta E $&lt;/p&gt;
&lt;p&gt;Now we just need to divide by the factor $N!$ to take into account the indistinguishability of particles and get rid of the overcounting, to get the final expression:&lt;/p&gt;
&lt;p&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;
$ W = \frac{W_{phase}}{N!} = \frac{V^N}{N! h^{3N}} \frac{2 \pi ^ {\frac{3N}{2}} (2mE)^{\frac{3N-1}{2}} }{(\frac{3N}{2}-1)!} \sqrt{\frac{m}{2E}} \Delta E $&lt;/p&gt;
&lt;p&gt;The entropy $S$ is then (in units of $k$, the Boltzmann constant):&lt;/p&gt;
&lt;p&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;
$ \frac{S}{k} = ln[W] = ln[ \frac{V^N}{N! h^{3N}} \frac{2 \pi ^ {\frac{3N}{2}} (2mE)^{\frac{3N-1}{2}} }{(\frac{3N}{2}-1)!} \sqrt{\frac{m}{2E}} \Delta E ] $&lt;/p&gt;
&lt;p&gt;This looks quite complicated, but can be handled quickly. The trick is to use the properties of the logarithm ($ln[ab]=ln[a]+ln[b]$, $ln[\frac{a}{b}]=ln[a]-ln[b]$ and $ln[a^k]=k ln[a]$) and Stirling's approximation $ln[n!] \approx n ln[n]-n$ to create a long list of log-terms. Then, put it in the form $ln[W]=N A + B$, where $A$ collects all the terms that have an $N$ multiplier outside of the logarithm. In the $A$ part, approximate $ln[\frac{3N-1}{2}] \approx ln[\frac{3N}{2}]$; the entire $B$ part is negligible compared to the $N A$ part with its large $N$ multiplier, so we are only left with the $N A$ part (I did it with pen and paper). This leaves us with the famous Sackur-Tetrode equation for the entropy of a monatomic ideal gas:&lt;/p&gt;
&lt;p&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;
$\frac{S}{kN} = ln[\frac{V}{N}(\frac{4 \pi m E}{3 h^2 N})^\frac{3}{2}] + \frac{5}{2}$&lt;/p&gt;
&lt;p&gt;Note that the entropy is a function $S(V, N, E)$, and the Planck constant $h$ also appears.&lt;/p&gt;
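&lt;p&gt;As a numerical sanity check, we can plug real numbers into the Sackur-Tetrode equation. The sketch below assumes helium at 300 K and 1 atm, taking $E = \frac{3}{2}NkT$ and $\frac{V}{N} = \frac{kT}{P}$ from the ideal gas law; the result, $\frac{S}{kN} \approx 15.2$, is close to the measured entropy of helium:&lt;/p&gt;

```python
from math import log, pi

k = 1.380649e-23    # Boltzmann constant, J/K
h = 6.62607015e-34  # Planck constant, J*s
m = 6.6464731e-27   # mass of a helium atom, kg

T, P = 300.0, 101325.0   # room temperature (K), atmospheric pressure (Pa)
V_per_N = k * T / P      # volume per particle, from PV = NkT
E_per_N = 1.5 * k * T    # average kinetic energy per particle

# S/(kN) = ln[(V/N) * (4*pi*m*(E/N) / (3*h^2))^(3/2)] + 5/2
S_per_kN = log(V_per_N * (4 * pi * m * E_per_N / (3 * h**2))**1.5) + 2.5
print(round(S_per_kN, 1))  # ~15.2
```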
&lt;h2&gt;Entropy is extensive&lt;/h2&gt;
&lt;p&gt;The above expression is obviously extensive. If we double our system, which takes $V \rightarrow 2V, N \rightarrow 2N, E \rightarrow 2E$, then due to the $\frac{V}{N}$ and $\frac{E}{N}$ terms the logarithm's value does not change, while the $N$ outside introduces a factor of 2, so $S(2V, 2N, 2E) = 2S(V, N, E)$.&lt;/p&gt;
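&lt;p&gt;We can confirm extensivity numerically. A minimal sketch, using arbitrary (unphysical) values for $m$, $h$ and the state variables, since extensivity does not depend on their values:&lt;/p&gt;

```python
from math import log, pi, isclose

def sackur_tetrode(V, N, E, m=1.0, h=1.0):
    # S/k = N * (ln[(V/N) * (4*pi*m*E / (3*h^2*N))^(3/2)] + 5/2)
    return N * (log((V / N) * (4 * pi * m * E / (3 * h**2 * N))**1.5) + 2.5)

s = sackur_tetrode(V=1.0, N=100.0, E=50.0)
s_doubled = sackur_tetrode(V=2.0, N=200.0, E=100.0)
assert isclose(s_doubled, 2 * s)  # S(2V, 2N, 2E) = 2 S(V, N, E)
```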
&lt;h2&gt;Temperature, pressure and the equation of state&lt;/h2&gt;
&lt;p&gt;For a closed system, the &lt;a href="https://en.wikipedia.org/wiki/First_law_of_thermodynamics#State_functional_formulation_for_infinitesimal_processes"&gt;1st law of thermodynamics&lt;/a&gt; for an infinitesimal process where N does not change can be written as:&lt;/p&gt;
&lt;p&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;
$dE = T dS - P dV$&lt;/p&gt;
&lt;p&gt;Now let's take the Sackur-Tetrode equation, and rewrite it like:&lt;/p&gt;
&lt;p&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;
$ S = k N ( ln[V] + \frac{3}{2} ln[E] ) + f(N) $&lt;/p&gt;
&lt;p&gt;We kept just the $V$ and $E$ terms, and put all the rest into the function $f$, which is only a function of $N$. Now take the differential:&lt;/p&gt;
&lt;p&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;
$ dS = k N ( \frac{1}{V} dV + \frac{3}{2E} dE ) + f'(N) dN$&lt;/p&gt;
&lt;p&gt;Since we're dealing with a process where $N$ does not change, $dN = 0$, so we can ignore the last term. We can rewrite the remaining terms like:&lt;/p&gt;
&lt;p&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;
$ dE = \frac{2E}{3kN} dS - \frac{2E}{3V} dV $&lt;/p&gt;
&lt;p&gt;Comparing this with the 1st law, we can read off the expressions for temperature $T=\frac{2E}{3kN}$ and pressure $P=\frac{2E}{3V}$.&lt;/p&gt;
&lt;p&gt;Multiplying $P$ by $V$ yields:&lt;/p&gt;
&lt;p&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;
$PV = \frac{2E}{3}$&lt;/p&gt;
&lt;p&gt;Multiplying $T$ by $N k$ yields:&lt;/p&gt;
&lt;p&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;
$NkT = \frac{2E}{3}$&lt;/p&gt;
&lt;p&gt;We can read off the famous ideal gas equation of state $PV=NkT$. We derived this by computing the entropy from physical considerations and using the 1st law of thermodynamics, which for our purposes effectively defined $T$ and $P$.&lt;/p&gt;
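&lt;p&gt;A quick numerical cross-check of the read-off, with arbitrary illustrative values for $E$, $N$ and $V$ (a sketch):&lt;/p&gt;

```python
from math import isclose

k = 1.380649e-23  # Boltzmann constant, J/K

E, N, V = 6.0e-21, 1.0e22, 1.0e-3  # arbitrary energy, particle count, volume
T = 2 * E / (3 * k * N)            # temperature read off from the 1st law
P = 2 * E / (3 * V)                # pressure read off from the 1st law
assert isclose(P * V, N * k * T)   # the ideal gas equation of state PV = NkT
```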
&lt;h2&gt;$h$ but no $c$&lt;/h2&gt;
&lt;p&gt;In Physics, when a formula contains the Planck constant $h$, it means that some aspect of quantum mechanics was taken into account. The Sackur-Tetrode equation contains $h$ because we used the uncertainty principle from quantum mechanics to quantize the phase space. Similarly, if a formula contains the &lt;a href="https://en.wikipedia.org/wiki/Speed_of_light"&gt;speed of light&lt;/a&gt; $c$, it means that special relativity is taken into account. The Sackur-Tetrode equation does &lt;strong&gt;not&lt;/strong&gt; contain $c$, so it is not relativistically correct. There is no constraint on the maximum speed any given particle can attain, there is just the global constraint $E = \frac{1}{2m} \sum_i p_i^2$, so a microscopic state where one of the particles has a very high speed exceeding $c$ is allowed, even though it is not physically possible.&lt;/p&gt;
&lt;h2&gt;Conclusion&lt;/h2&gt;
&lt;p&gt;The article used the example of the physical entropy of a monatomic ideal gas to show the similarities and differences between entropy as it appears in physics and &lt;a href="https://bytepawn.com/cross-entropy-joint-entropy-conditional-entropy-and-relative-entropy.html"&gt;math&lt;/a&gt; (&lt;a href="https://bytepawn.com/entropy-in-data-science.html"&gt;data science&lt;/a&gt;). In math, we have well-defined random variables and a simple definition of entropy on top of that. In physics, the concept of entropy is more nuanced, and involves lots of idealizations and approximations (eg. we silently assumed all microscopic states we considered have the same probability).&lt;/p&gt;</content><category term="entropy"></category><category term="physics"></category></entry><entry><title>Entropy of an ideal gas with coarse-graining</title><link href="https://bytepawn.com/entropy-of-an-ideal-gas-with-coarse-graining.html" rel="alternate"></link><published>2021-11-19T00:00:00+01:00</published><updated>2021-11-19T00:00:00+01:00</updated><author><name>Marton Trencseni</name></author><id>tag:bytepawn.com,2021-11-19:/entropy-of-an-ideal-gas-with-coarse-graining.html</id><summary type="html">&lt;p&gt;I show the first steps of how to arrive at a definition of entropy for a monatomic ideal gas modeled as hard billiard balls. &lt;br/&gt;&lt;br/&gt;&lt;img src="/images/ideal_gas_cg_4.png" alt="." style="width: 300px;"/&gt;&lt;/p&gt;</summary><content type="html">&lt;h2&gt;Introduction&lt;/h2&gt;
&lt;p&gt;An &lt;a href="https://en.wikipedia.org/wiki/Ideal_gas"&gt;ideal gas&lt;/a&gt; is a model of a real physical gas where we make some simplifications. We model the gas as hard billiard balls, bouncing off the walls of the container and each other in perfectly &lt;a href="https://en.wikipedia.org/wiki/Elastic_collision"&gt;elastic collisions&lt;/a&gt;. In classical physics, momentum is always conserved; in elastic collisions, kinetic energy is also conserved (ie. kinetic energy never gets transferred to some other form of energy [1]). Here I will consider a monatomic ideal gas, ie. all the gas is made up of one type of billiard ball.&lt;/p&gt;
&lt;p&gt;&lt;em&gt;[1] In &lt;a href="https://en.wikipedia.org/wiki/Inelastic_collision"&gt;inelastic collisions&lt;/a&gt;, kinetic energy is lost as energy is converted to heat energy. This wouldn't make sense here, since we are using the billiard balls at the micro level to model heat and temperature at the macro level.&lt;/em&gt;&lt;/p&gt;
&lt;p&gt;Entropy is a function of the probabilities of a random variable's outcomes. If we assume that all $N$ outcomes are equally likely, so $ p_i = p = 1/N $, then the expression for the entropy becomes $ H = log[N] $.&lt;/p&gt;
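&lt;p&gt;For example, with $N$ equally likely outcomes, the general formula $H = -\sum_i p_i log[p_i]$ collapses to $log[N]$ (a quick sketch):&lt;/p&gt;

```python
from math import log, isclose

N = 8
ps = [1 / N] * N                     # uniform distribution over N outcomes
H = -sum(p * log(p, 2) for p in ps)  # Shannon entropy, in bits
assert isclose(H, log(N, 2))         # H = log2(8) = 3 bits
```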
&lt;p&gt;Let's start with a naive (and wrong) thought experiment to get to entropy. Let's assume we have a container, with 1 particle in it. The particle has some velocity and is bouncing around the container. Let's compute &lt;strong&gt;the entropy due to the position&lt;/strong&gt; of the particle.&lt;/p&gt;
&lt;p&gt;Let's start with the simplest possible coarse-graining, and split the container into 2 bins (A and B), so the ball is either in A or B. With this coarse-graining, the entropy of the system is 1 bit. Now, if we imagine 2 of these systems sitting next to each other, but separated by a thin wall, the entropy will be 2 bits. As long as the wall is there, the systems behave like independent random variables, so the joint entropy is H(X, Y) = H(X) + H(Y). What if we now remove the wall? We will have 2 balls in the system, and, keeping the physical size of the coarse-graining, there are now 4 bins. In total, we can put 2 identical balls into 4 bins $4+3+2+1=10$ ways, so the entropy is $log_2[10]=3.3$ bits. So the entropy of the combined system with the wall removed is greater than the sum of the entropies of the individual systems, because the balls can now occupy just the left side or just the right side of the system, a state that was not possible before.&lt;/p&gt;
&lt;p&gt;From a physical point of view, this is where our thought experiment fails. If we put two identical containers of &lt;a href="https://en.wikipedia.org/wiki/Monatomic_gas"&gt;monatomic gas&lt;/a&gt; (identical atoms, pressure, temperature, density) next to each other, remove the barrier, and let them mix, then effectively nothing happens, so &lt;a href="https://en.wikipedia.org/wiki/Intensive_and_extensive_properties"&gt;intensive properties&lt;/a&gt; (like pressure, temperature, chemical potential) remain the same and &lt;a href="https://en.wikipedia.org/wiki/Intensive_and_extensive_properties"&gt;extensive properties&lt;/a&gt; (like volume, entropy and particle number) are added (they are additive). In the above example, we'd expect the combined system to have 2 bits of entropy (at this level of coarse-graining).&lt;/p&gt;
&lt;p&gt;The problem is simple: the ideal gas model of thermodynamics only works if $N$, the number of particles is very large, and $N=1$ is not large enough. So let's consider a lot of particles.&lt;/p&gt;
&lt;h2&gt;Notation&lt;/h2&gt;
&lt;p&gt;In physics, when we deal with entropy, there are some minor differences in notation:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;we use $S$ instead of $H$ to denote entropy&lt;/li&gt;
&lt;li&gt;after &lt;a href="https://en.wikipedia.org/wiki/Ludwig_Boltzmann"&gt;Ludwig Boltzmann&lt;/a&gt;, we use $W$ instead of $N$ to denote the total number of possible outcomes (W as in &lt;em&gt;"ways things can happen"&lt;/em&gt;)&lt;/li&gt;
&lt;li&gt;we use $ln=log_e$ instead of $log_2$&lt;/li&gt;
&lt;/ul&gt;
&lt;h2&gt;Marbles and bins&lt;/h2&gt;
&lt;p&gt;Let's generalize the above naive approach, and imagine $N$ particles and a coarse-graining of the volume so that we have $M$ bins of equal volume. Given $N$ indistinguishable marbles, in how many ways $W$ can we put them into $M$ distinguishable bins? The &lt;a href="https://math.stackexchange.com/questions/192670/n-unlabelled-balls-in-m-labeled-buckets"&gt;answer&lt;/a&gt;, using combinatorics, is:&lt;/p&gt;
&lt;p&gt;$ W(N, M) = \frac{(N + M - 1)!}{N!(M-1)!} $&lt;/p&gt;
&lt;p&gt;You can double-check on paper that in the case of the above toy model, $W(2, 4) = 10$.&lt;/p&gt;
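&lt;p&gt;Instead of paper, we can also double-check by enumerating the placements with Python's itertools (a minimal sketch):&lt;/p&gt;

```python
from itertools import combinations_with_replacement
from math import factorial

def W(N, M):
    # number of ways to put N indistinguishable marbles into M bins
    return factorial(N + M - 1) // (factorial(N) * factorial(M - 1))

# each placement of indistinguishable marbles is a multiset of bin indices
placements = list(combinations_with_replacement(range(4), 2))
assert len(placements) == 10 == W(2, 4)
```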
&lt;p&gt;In a physical system, assuming all microscopic states are equally likely, entropy scales like $ S \approx log[W] $. With this, we can revisit the above situation of putting two containers of ideal gas next to each other. The question is:&lt;/p&gt;
&lt;p&gt;$ S(W(2N, 2M)) \stackrel{?}{=} 2 S(W(N, M)) $&lt;/p&gt;
&lt;h2&gt;Stirling's approximation&lt;/h2&gt;
&lt;p&gt;The last tool we need is &lt;a href="https://en.wikipedia.org/wiki/Stirling%27s_approximation"&gt;Stirling's approximation&lt;/a&gt;. For large $N$:&lt;/p&gt;
&lt;p&gt;$ ln[N!] \approx N ln[N] - N $&lt;/p&gt;
&lt;p&gt;Using basic properties of the logarithm ($log[ab] = log[a] + log[b]$ and $log[\frac{a}{b}] = log[a] - log[b]$):&lt;/p&gt;
&lt;p&gt;$ S(W(2N, 2M)) = ln[\frac{(2N + 2M - 1)!}{(2N)!(2M-1)!}] = ln[(2N + 2M - 1)!] - ln[(2N)!] - ln[(2M-1)!] $.&lt;/p&gt;
&lt;p&gt;Then, using Stirling's approximation:&lt;/p&gt;
&lt;p&gt;$ S(W(2N, 2M)) = (2N + 2M - 1) ln[2N + 2M - 1] - (2N + 2M - 1) - 2N ln[2N] + 2N - (2M-1) ln[2M-1] + (2M-1) $&lt;/p&gt;
&lt;p&gt;The non-log parts cancel out, so:&lt;/p&gt;
&lt;p&gt;$ S(W(2N, 2M)) = (2N + 2M - 1) ln[2N + 2M - 1] - 2N ln[2N] - (2M-1) ln[2M-1] $.&lt;/p&gt;
&lt;p&gt;For large N, we can approximate $ (N-1) ln[N-1] \approx N ln[N] $, so:&lt;/p&gt;
&lt;p&gt;$ S(W(2N, 2M)) \approx (2N + 2M) ln[2(N + M)] - 2N ln[2N] - 2M ln[2M] $&lt;/p&gt;
&lt;p&gt;Now, we can break the logs out, like $ln[2N] = ln[2] + ln[N]$. The terms multiplied with $ln[2]$ just cancel out, so:&lt;/p&gt;
&lt;p&gt;$ S(W(2N, 2M)) \approx 2(N + M) ln[N + M] - 2N ln[N] - 2M ln[M] = 2 ((N + M) ln[N + M] - N ln[N] - M ln[M]) \approx 2 S(W(N, M))$.&lt;/p&gt;
&lt;p&gt;So, for large N, the coarse-grained entropies &lt;em&gt;are&lt;/em&gt; additive (approximately)! If we consider something like a liter of gas, it will have $ N \approx 10^{22} $, so these approximations hold to very high precision.&lt;/p&gt;
&lt;h2&gt;Numerical checks&lt;/h2&gt;
&lt;p&gt;The code shown below is &lt;a href="https://github.com/mtrencseni/playground/blob/master/Entropy%20of%20an%20ideal%20gas%20with%20coarse-graining.ipynb"&gt;up on Github&lt;/a&gt;.
 First, let's check Stirling's approximation for large N: $ ln[N!] \approx N ln[N] - N $&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;span class="c1"&gt;# numerical verification of Stirling&amp;#39;s approximation for large N&lt;/span&gt;

&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;log_factorial&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;N&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
    &lt;span class="c1"&gt;# compute log(N!) as a sum, so we can compute if for large N&lt;/span&gt;
    &lt;span class="c1"&gt;# using the identity log(a*b) = log(a) + log(b)&lt;/span&gt;
    &lt;span class="c1"&gt;# this is not an approximation, aside from floating point issues&lt;/span&gt;
    &lt;span class="nb"&gt;sum&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;
    &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;i&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="nb"&gt;range&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;N&lt;/span&gt;&lt;span class="o"&gt;+&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
        &lt;span class="nb"&gt;sum&lt;/span&gt; &lt;span class="o"&gt;+=&lt;/span&gt; &lt;span class="n"&gt;log&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;i&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="nb"&gt;sum&lt;/span&gt;

&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;stirling&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;N&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;N&lt;/span&gt;&lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="n"&gt;log&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;N&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;&lt;span class="o"&gt;-&lt;/span&gt;&lt;span class="n"&gt;N&lt;/span&gt;

&lt;span class="n"&gt;xs&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;ys&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[],&lt;/span&gt; &lt;span class="p"&gt;[]&lt;/span&gt;
&lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;N&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="nb"&gt;range&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;10&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;1000&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;10&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
    &lt;span class="n"&gt;xs&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;append&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;N&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="n"&gt;ys&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;append&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;stirling&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;N&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;&lt;span class="o"&gt;/&lt;/span&gt;&lt;span class="n"&gt;log_factorial&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;N&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;
&lt;span class="n"&gt;plot&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;xs&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;ys&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;title&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;Ratio of log[N!] to its Stirling-approximation&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;xlabel&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;N&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/pre&gt;&lt;/div&gt;


&lt;p&gt;&lt;img src="/images/ideal_gas_cg_1.png" alt="Ideal gas coarse-graining" style="width: 400px;"/&gt;&lt;/p&gt;
&lt;p&gt;Next, let's verify that $ (N-1) ln[N-1] \approx N ln[N] $:&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;span class="c1"&gt;# numerical verification that (N-1)*log(N-1) ~ N*log(N) for large N&lt;/span&gt;

&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;f&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;N&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;N&lt;/span&gt;&lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="n"&gt;log&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;N&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="n"&gt;xs&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;ys&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[],&lt;/span&gt; &lt;span class="p"&gt;[]&lt;/span&gt;
&lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;N&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="nb"&gt;range&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;10&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;1000&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;10&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
    &lt;span class="n"&gt;xs&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;append&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;N&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="n"&gt;ys&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;append&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;f&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;N&lt;/span&gt;&lt;span class="o"&gt;-&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;&lt;span class="o"&gt;/&lt;/span&gt;&lt;span class="n"&gt;f&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;N&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;
&lt;span class="n"&gt;plot&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;xs&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;ys&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;title&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;The ratio f(N-1)/f(N) for f(N)=N*log(N)&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;xlabel&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;N&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/pre&gt;&lt;/div&gt;


&lt;p&gt;&lt;img src="/images/ideal_gas_cg_2.png" alt="Ideal gas coarse-graining" style="width: 400px;"/&gt;&lt;/p&gt;
&lt;p&gt;Next, let's verify that the exact entropy S and its Stirling approximation agree for large N:&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;span class="c1"&gt;# numerical verification that the exact entropy S and its&lt;/span&gt;
&lt;span class="c1"&gt;# Stirling-approximated are the same for large N&lt;/span&gt;

&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;W&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;N&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;M&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;factorial&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;N&lt;/span&gt;&lt;span class="o"&gt;+&lt;/span&gt;&lt;span class="n"&gt;M&lt;/span&gt;&lt;span class="o"&gt;-&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;/&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;factorial&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;N&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;&lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="n"&gt;factorial&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;M&lt;/span&gt;&lt;span class="o"&gt;-&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;

&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;S&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;N&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;M&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;log&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;W&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;N&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="n"&gt;M&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;

&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;S_stirling&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;N&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;M&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;N&lt;/span&gt;&lt;span class="o"&gt;+&lt;/span&gt;&lt;span class="n"&gt;M&lt;/span&gt;&lt;span class="o"&gt;-&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;&lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="n"&gt;log&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;N&lt;/span&gt;&lt;span class="o"&gt;+&lt;/span&gt;&lt;span class="n"&gt;M&lt;/span&gt;&lt;span class="o"&gt;-&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;-&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;M&lt;/span&gt;&lt;span class="o"&gt;-&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;&lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="n"&gt;log&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;M&lt;/span&gt;&lt;span class="o"&gt;-&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;-&lt;/span&gt; &lt;span class="n"&gt;N&lt;/span&gt;&lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="n"&gt;log&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;N&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="n"&gt;xs&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;ys&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[],&lt;/span&gt; &lt;span class="p"&gt;[]&lt;/span&gt;
&lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;N&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="nb"&gt;range&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;10&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;500&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;10&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
    &lt;span class="n"&gt;xs&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;append&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;N&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="n"&gt;ys&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;append&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;S&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;N&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;N&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;&lt;span class="o"&gt;/&lt;/span&gt;&lt;span class="n"&gt;S_stirling&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;N&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;N&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;
&lt;span class="n"&gt;plot&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;xs&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;ys&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;title&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;The ratio the exact entropy and its Stirling-approximation&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;xlabel&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;N&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/pre&gt;&lt;/div&gt;


&lt;p&gt;&lt;img src="/images/ideal_gas_cg_3.png" alt="Ideal gas coarse-graining" style="width: 400px;"/&gt;&lt;/p&gt;
&lt;p&gt;Finally, let's verify our final result that the entropy $S$ is extensive:&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;span class="c1"&gt;# numerical verification that entropy is extensive&lt;/span&gt;

&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;S_stirling&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;N&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;M&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;N&lt;/span&gt;&lt;span class="o"&gt;+&lt;/span&gt;&lt;span class="n"&gt;M&lt;/span&gt;&lt;span class="o"&gt;-&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;&lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="n"&gt;log&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;N&lt;/span&gt;&lt;span class="o"&gt;+&lt;/span&gt;&lt;span class="n"&gt;M&lt;/span&gt;&lt;span class="o"&gt;-&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;-&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;M&lt;/span&gt;&lt;span class="o"&gt;-&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;&lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="n"&gt;log&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;M&lt;/span&gt;&lt;span class="o"&gt;-&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;-&lt;/span&gt; &lt;span class="n"&gt;N&lt;/span&gt;&lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="n"&gt;log&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;N&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="n"&gt;xs&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;ys&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[],&lt;/span&gt; &lt;span class="p"&gt;[]&lt;/span&gt;
&lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;N&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="nb"&gt;range&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;3&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;500&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;10&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
    &lt;span class="n"&gt;xs&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;append&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;N&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="n"&gt;ys&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;append&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;2&lt;/span&gt;&lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="n"&gt;S_stirling&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;N&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;N&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;/&lt;/span&gt; &lt;span class="n"&gt;S_stirling&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;2&lt;/span&gt;&lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="n"&gt;N&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;2&lt;/span&gt;&lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="n"&gt;N&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;
&lt;span class="n"&gt;plot&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;xs&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;ys&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;title&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;The ratio 2*S(N, N) / S(2*N, 2*N) showing that entropy is extensive&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;xlabel&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;N&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/pre&gt;&lt;/div&gt;


&lt;p&gt;&lt;img src="/images/ideal_gas_cg_4.png" alt="Ideal gas coarse-graining" style="width: 400px;"/&gt;&lt;/p&gt;
&lt;h2&gt;Next steps&lt;/h2&gt;
&lt;p&gt;The above formula for entropy cannot be physical, since it depends on how we decide to coarse-grain the volume of the gas ($M$). Also, it doesn't take into account the velocity of the particles. The next step, coming in the next article, is to make physical arguments and compute an expression for $W$ that takes into account the velocities, parametrize it using the kinetic energy of the system, and derive the &lt;a href="https://en.wikipedia.org/wiki/Sackur%E2%80%93Tetrode_equation"&gt;Sackur–Tetrode equation&lt;/a&gt;. Once we have that, we can also derive the ideal gas equation of state $pV = NkT$.&lt;/p&gt;</content><category term="entropy"></category><category term="physics"></category></entry><entry><title>WeToddle retrospective</title><link href="https://bytepawn.com/wetoddle-retrospective.html" rel="alternate"></link><published>2021-10-29T00:00:00+02:00</published><updated>2021-10-29T00:00:00+02:00</updated><author><name>Marton Trencseni</name></author><id>tag:bytepawn.com,2021-10-29:/wetoddle-retrospective.html</id><summary type="html">&lt;p&gt;The idea behind WeToddle came from the Baby Fanclub group we have on Messenger, which has most of our family in it. It turns out some ex-Facebook people had a similar idea in 2019, raised $3M, spent 2 years on it, and then gave up because it didn’t go anywhere (presumably).&lt;br/&gt;&lt;br/&gt; &lt;img src="/images/wetoddle_android.png" alt="WeToddle Android" style="width: 400px;"/&gt;&lt;/p&gt;</summary><content type="html">&lt;h2&gt;The idea&lt;/h2&gt;
&lt;p&gt;The idea behind WeToddle came from the &lt;strong&gt;Baby Fanclub&lt;/strong&gt; group we have on Messenger, which has most of our family in it. It’s essentially a mini-social network centered around The Baby. I noticed that some of our older parents effectively use it as The Social Network, and start posting content unrelated to the baby in there. Many friends confirmed they have something similar going on, with similar behaviour.&lt;/p&gt;
&lt;p&gt;&lt;em&gt;I thought, shouldn’t this be its own app?&lt;/em&gt;&lt;/p&gt;
&lt;p&gt;I didn’t do any research, I just wanted to write some code and see what happens.&lt;/p&gt;
&lt;p&gt;&lt;img src="/images/baba_fanclub.png" alt="Baba fanclub" style="width: 200px;"/&gt;&lt;/p&gt;
&lt;h2&gt;Code&lt;/h2&gt;
&lt;p&gt;I started writing code for a generic groups app (authentication, posts, pictures, comments, likes). Writing the server-side code was easy and fun. Writing the client-side code was not fun at all, but I got a simple Android app working. One of the things I learned is that most of the pain is in client-side development. And I was not even doing iOS, just Android. Towards the end I found &lt;a href="https://flutter.dev/"&gt;Flutter&lt;/a&gt;, which potentially takes some of this pain away, but I didn't try it.&lt;/p&gt;
&lt;p&gt;The first commit is from May 21, the last one from August 11. I started working on it a bit earlier, probably May 11. So I spent 3 calendar months on this. There are 23 commits; I spent around 100 work hours on this (\$10k billable assuming \$100/hr), mostly during vacation days and weekends. In terms of actual cost, there was none: I didn’t spend any extra money on this.&lt;/p&gt;
&lt;p&gt;&lt;img src="/images/wetoddle_android.png" alt="WeToddle Android" style="width: 500px;"/&gt;&lt;/p&gt;
&lt;h2&gt;Cocoon&lt;/h2&gt;
&lt;p&gt;Then I accidentally saw this article: &lt;a href="https://techcrunch.com/2021/08/24/substack-acqui-hires-team-behind-subscription-social-app-cocoon/"&gt;Substack acqui-hires team behind subscription social app Cocoon&lt;/a&gt;. It turns out some ex-Facebook people had a similar idea in 2019, raised $3M, spent 2 years on it, and then gave up because it didn’t go anywhere (presumably).&lt;/p&gt;
&lt;p&gt;Reading Cocoon’s copy and looking at their app, they had the same exact motivation and built the exact same app I had in mind. So &lt;strong&gt;I immediately knew this meant I shouldn’t invest any more time into this&lt;/strong&gt;.&lt;/p&gt;
&lt;p&gt;Following the links on Techcrunch, this is the announcement from 2019: &lt;a href="https://techcrunch.com/2019/11/26/cocoons-social-app-for-close-friends-gets-vc-backing-to-chase-paths-dream/"&gt;Cocoon’s social app for close friends gets VC backing to chase Path’s dream&lt;/a&gt;. Per Techcrunch:&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;You may have heard the pitch before, &lt;strong&gt;Facebook&lt;/strong&gt;, Twitter and Instagram &lt;strong&gt;aren’t homes for your real friends anymore&lt;/strong&gt; because they’re too big, too commercial and too influencer-y, the result is that &lt;strong&gt;your most important relationships&lt;/strong&gt; have been relegated to the lowest common denominator tool on your phones: &lt;strong&gt;your texting app&lt;/strong&gt;. Cocoon, a startup from a couple of &lt;strong&gt;ex-Facebook employees&lt;/strong&gt; that &lt;strong&gt;went through YC&lt;/strong&gt; earlier this year, is hoping to create the dedicated software that you use for that most important group chat in your life. The iOS-only app is a bit of a cross between Life360, Slack and Path.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;Later they also released an Android app. They even supported voice calls! Here are some screenshots from their &lt;a href="https://play.google.com/store/apps/details?id=com.glasswing.cocoon&amp;amp;hl=en&amp;amp;gl=US"&gt;Google Play page&lt;/a&gt;:&lt;/p&gt;
&lt;p&gt;&lt;img src="/images/cocoon.png" alt="Cocoon" style="width: 500px;"/&gt;&lt;/p&gt;
&lt;p&gt;Cocoon was a YC startup. From their &lt;a href="https://www.ycombinator.com/companies/cocoon"&gt;fact sheet&lt;/a&gt;:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;Co-founders:&lt;ul&gt;
&lt;li&gt;&lt;a href="https://www.linkedin.com/in/alex-cornell-5449a99/"&gt;Alex Cornell&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://www.linkedin.com/in/sachinmonga/"&gt;Sachin Monga&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;li&gt;When: 2019 winter batch&lt;/li&gt;
&lt;li&gt;Location: San Francisco&lt;/li&gt;
&lt;li&gt;Employees: 6&lt;/li&gt;
&lt;li&gt;Tagline: &lt;em&gt;Cocoon actively brings your most important groups closer together. It provides a private space to gather, the tools you need to keep each other close, and helps you keep writing your ongoing story as a unit.&lt;/em&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;h2&gt;Conclusion&lt;/h2&gt;
&lt;p&gt;I didn’t do any research, I just wanted to write some code and see what happens. However, &lt;strong&gt;once I figured out the idea doesn’t work, I stopped writing code&lt;/strong&gt;. So even for a toy project &lt;strong&gt;I need to believe it could end up as something successful&lt;/strong&gt;, otherwise I don’t continue writing code, even as a self-education project.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;I didn’t do enough research on existing startups&lt;/strong&gt; when I started to work on this. If I had found Cocoon in May, even though they had not given up at that point, I would probably not have started working on it at all, since:&lt;/p&gt;
&lt;ol&gt;
&lt;li&gt;Somebody with a very similar background (ex-Facebook), $3m funding and a 2 year headstart is already working on this idea. Great, no problem! I would have run with another idea. &lt;/li&gt;
&lt;li&gt;I would have seen that Cocoon was not showing signs of success.&lt;/li&gt;
&lt;/ol&gt;
&lt;p&gt;&lt;strong&gt;Still, this was a success.&lt;/strong&gt; I &lt;em&gt;only&lt;/em&gt; spent 3 months / 50-100 hours on this, and I learned that this idea doesn’t work (at least today, with the set of ideas I had, which were a 100% match with Cocoon). Meanwhile, I learned a lot about client-side app development.&lt;/p&gt;
&lt;p&gt;&lt;img src="/images/failfast.png" alt="Cocoon" style="width: 300px;"/&gt;&lt;/p&gt;
&lt;p&gt;My #1 learning is that I need to spend more time researching existing solutions, even if I’m just doing it for fun (as a coding exercise). I should probably “learn” how to search on Techcrunch and Crunchbase more effectively. The other, less valuable learning is technical: I spent a lot of time blocking on app development. I can move very fast on the server side, but I repeatedly hit brick walls on the client side and could not move fast. &lt;strong&gt;I need to find a way so I can quickly get crappy versions of an app out, in a few weeks.&lt;/strong&gt;&lt;/p&gt;
&lt;p&gt;I don’t exactly know why Cocoon gave up, but I can take an educated guess: like me, they thought that making the app free would result in the same business model (and eventually, product) as Facebook, but then what’s the point? So they tried to charge users $5/month. I assume the problem is, Facebook/Messenger/Whatsapp are really polished products, everybody is already there, so &lt;strong&gt;the value proposition&lt;/strong&gt; of a few, &lt;strong&gt;relatively minor differentiating features&lt;/strong&gt; is just &lt;strong&gt;not enough&lt;/strong&gt; to make people use another app and even pay for it.&lt;/p&gt;</content><category term="startups"></category><category term="cocoon"></category><category term="facebook"></category></entry><entry><title>Entropy in Data Science</title><link href="https://bytepawn.com/entropy-in-data-science.html" rel="alternate"></link><published>2021-10-24T00:00:00+02:00</published><updated>2021-10-24T00:00:00+02:00</updated><author><name>Marton Trencseni</name></author><id>tag:bytepawn.com,2021-10-24:/entropy-in-data-science.html</id><summary type="html">&lt;p&gt;I discuss 4 uses of entropy in Data Science: (i) cross entropy as a loss function for training neural network classifiers (ii) entropy as a splitting criterion for building decision trees (iii) entropy for evaluating clustering algorithms (iv) entropy for understanding relationships in tabular data. &lt;br/&gt;&lt;br/&gt;&lt;img src="/images/decision_tree_split.png" alt="Information gain" style="width: 300px;"/&gt;&lt;/p&gt;</summary><content type="html">&lt;h2&gt;Introduction&lt;/h2&gt;
&lt;p&gt;I first wrote about entropy in &lt;a href="https://bytepawn.com/what-is-the-entropy-of-a-fair-coin-toss.html"&gt;What's the entropy of a fair coin toss?
&lt;/a&gt; followed by &lt;a href="https://bytepawn.com/cross-entropy-joint-entropy-conditional-entropy-and-relative-entropy.html"&gt;Cross entropy, joint entropy, conditional entropy and relative entropy&lt;/a&gt;, which explained these different entropy-related concepts. Now I will discuss 4 uses of entropy in Data Science:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;Cross entropy as a loss function for training neural network classifiers&lt;/li&gt;
&lt;li&gt;Entropy as a splitting criterion for building decision trees&lt;/li&gt;
&lt;li&gt;Entropy for evaluating clustering algorithms&lt;/li&gt;
&lt;li&gt;Entropy for understanding relationships in tabular data&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;&lt;a href="https://github.com/mtrencseni/playground/blob/master/Entropy%20in%20clustering.ipynb"&gt;The ipython notebook is here.&lt;/a&gt;&lt;/p&gt;
&lt;h2&gt;Cross entropy as a loss function&lt;/h2&gt;
&lt;p&gt;When building a supervised neural network classifier, &lt;a href="https://en.wikipedia.org/wiki/Cross_entropy"&gt;cross entropy&lt;/a&gt; or &lt;a href="https://en.wikipedia.org/wiki/Kullback%E2%80%93Leibler_divergence"&gt;relative entropy&lt;/a&gt; (usually called Kullback–Leibler divergence in Deep Learning frameworks) are common loss functions. For example, suppose we're building an &lt;a href="https://en.wikipedia.org/wiki/MNIST_database"&gt;MNIST digit&lt;/a&gt; classifier, so there are 10 output classes, corresponding to digits 0..9. Since there are 10 classes, the neural network has 10 outputs. When the network is being trained, for each input image we know the actual digit it's showing, which we encode in a vector like $v_{actual}=[0,0,..,1]$. Since the last number is 1, this is a digit 9. Then we run the input image through the neural network, and read out the outputs for the 10 classes, which may look like $v_{predicted}=[0.1, 0, 0, ... 0, 0.1, 0.8]$. To make sure the output probabilities sum to 1, we always put a &lt;a href="https://en.wikipedia.org/wiki/Softmax_function"&gt;softmax layer&lt;/a&gt; as the last layer of the network. So in this case, the cross entropy loss is $H(p=actual, q=predicted) = -\sum_i p_i log [ q_i ] = - log [ 0.8 ] $. In this expression the only $p_i$ that's non-zero is the one for the actual class, and for that $p_i = 1$, so cross entropy loss really only contains that one term. Note that during actual training, cross entropy loss is computed (summed) for an entire minibatch, ie. multiple training points.&lt;/p&gt;
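&lt;p&gt;To make the arithmetic concrete, here is a minimal sketch (plain Python, not from the original post) computing the cross entropy loss for exactly this example, using the natural logarithm:&lt;/p&gt;

```python
import math

def cross_entropy(p, q):
    # H(p, q) = -sum_i p_i * log(q_i); terms with p_i == 0 contribute nothing
    return -sum(pi * math.log(qi) for pi, qi in zip(p, q) if pi != 0)

v_actual    = [0, 0, 0, 0, 0, 0, 0, 0, 0, 1]        # one-hot: the image shows a 9
v_predicted = [0.1, 0, 0, 0, 0, 0, 0, 0, 0.1, 0.8]  # softmax outputs of the network

loss = cross_entropy(v_actual, v_predicted)
print(loss)  # -log(0.8), about 0.223
```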
&lt;p&gt;It would be tempting to say that cross entropy loss doesn't care what probabilities the model assigns to the incorrect classes, but this is misleading. As mentioned before, to get the probabilities of the output layer to sum to 1, the last layer is always a softmax, and each of the output probabilities depends on all the outputs of the previous layer. In the MNIST case, we have 10 outputs $z_i$, which are then softmax'd like $q_i = \frac{e^{z_i}}{\sum_j e^{z_j}}$; the denominator contains all the $z_j$s, so $q_i = q_i(z_0 ... z_9)$.&lt;/p&gt;
&lt;p&gt;What's the difference between cross entropy and relative entropy as a loss function? In neural networks, the training method is usually some form of gradient descent, which means that the loss function is differentiated by the weights of the neural network. Per the previous article, relative entropy and cross entropy are related like $ D_{KL}(actual | predicted) = H(actual, predicted) - H(actual) $. Here, $H(actual)$ is a constant with respect to the network weights [but it changes for each mini-batch], so its derivative wrt the network weights is 0. Because of this, although the two loss &lt;em&gt;values&lt;/em&gt; are different, their derivatives wrt network weights are identical ( $ \frac{\partial D_{KL}(actual | predicted)}{\partial w_i} = \frac{\partial H(actual, predicted)}{\partial w_i} $ because $ \frac{\partial H(actual)}{\partial w_i} = 0$ ), so their usage is equivalent, since the computed gradients propagated back into the network will be exactly the same!&lt;/p&gt;
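&lt;p&gt;The identity $D_{KL} = H(p, q) - H(p)$ is easy to check numerically. A sketch with made-up distributions (not from the original post); note that for a one-hot $p$ we have $H(p) = 0$, so there the two loss values even coincide:&lt;/p&gt;

```python
import math

def entropy(p):
    return -sum(pi * math.log(pi) for pi in p if pi != 0)

def cross_entropy(p, q):
    return -sum(pi * math.log(qi) for pi, qi in zip(p, q) if pi != 0)

def kl_divergence(p, q):
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi != 0)

p = [0.1, 0.2, 0.7]  # "actual" distribution (made-up numbers)
q = [0.2, 0.3, 0.5]  # "predicted" distribution (made-up numbers)

# the two loss values differ by the constant H(p)
print(kl_divergence(p, q))               # about 0.085
print(cross_entropy(p, q) - entropy(p))  # same value
```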
&lt;p&gt;See the Bytepawn article &lt;a href="https://bytepawn.com/solving-mnist-with-pytorch-and-skl.html"&gt;Solving MNIST with Pytorch and SKL&lt;/a&gt; for an example Pytorch deep neural network classifier using cross entropy.&lt;/p&gt;
&lt;p&gt;&lt;img src="/images/mnist-example.png" alt="MNIST digits" style="width: 400px;"/&gt;&lt;/p&gt;
&lt;h2&gt;Entropy in decision tree classifiers&lt;/h2&gt;
&lt;p&gt;When building a decision tree, the algorithm starts with just one root node which contains all samples. Then, at each step, leaf nodes are split into 2 child nodes until a stopping condition is met. Given a set $S$ of samples in the current node to be split, what is the criterion to divide it into two children $S_L$ and $S_R$? If we look at &lt;a href="https://scikit-learn.org/stable/modules/generated/sklearn.tree.DecisionTreeClassifier.html"&gt;Scikit Learn's &lt;code&gt;DecisionTreeClassifier&lt;/code&gt;&lt;/a&gt;, the first parameter is called &lt;code&gt;criterion&lt;/code&gt; and can either be &lt;code&gt;gini&lt;/code&gt; or &lt;code&gt;entropy&lt;/code&gt;. Let's look at &lt;code&gt;entropy&lt;/code&gt;: in this case, the algorithm will try to split to maximize Information Gain, which is defined as $I = H_{parent} - H_{children}$. Here $H_{parent}$ is the entropy of all samples in the parent node, where the probabilities are just the class frequencies. So if the parent node contains 4 samples for class A and 6 samples for class B, then $H_{parent} = H([4,6]) = -\frac{4}{10}log\frac{4}{10} - \frac{6}{10}log\frac{6}{10} = 0.97$ bits. Suppose the algorithm is evaluating a split where the left side contains $[A,B]=[3, 1]$ samples and the right side contains $[A,B]=[1, 5]$ samples. So $H_{left} = H([3, 1]) = -\frac{3}{4}log\frac{3}{4} - \frac{1}{4}log\frac{1}{4} = 0.81$ bits. For the right side, $H_{right} = H([1, 5]) = -\frac{1}{6}log\frac{1}{6} - \frac{5}{6}log\frac{5}{6} = 0.65$ bits. Then, $H_{children}$ is the sum of these two entropies, weighted by sample size: $H_{children} = \frac{4}{10} H_{left} + \frac{6}{10} H_{right} = 0.71$ bits. So the information gain is $I = H_{parent} - H_{children} = 0.26$ bits. The algorithm looks at multiple possible splits, and selects the one with the highest information gain. 
Note that if it finds a way to put all As into the left node and all Bs into the right node (or vice versa), then the child entropies will be 0; this is the maximum achievable $I_{max}=H_{parent}$, the best possible outcome of the split. Note that &lt;a href="https://en.wikipedia.org/wiki/Decision_tree_learning#Gini_impurity"&gt;Gini impurity is also related to the concept of entropy&lt;/a&gt;.&lt;/p&gt;
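&lt;p&gt;The numbers in the worked example above can be reproduced with a few lines of Python (a sketch, not code from the original post):&lt;/p&gt;

```python
from math import log2

def entropy(counts):
    # entropy in bits of a node, given per-class sample counts
    total = sum(counts)
    return -sum(c/total * log2(c/total) for c in counts if c != 0)

h_parent = entropy([4, 6])                          # parent: 4 of class A, 6 of class B
h_left, h_right = entropy([3, 1]), entropy([1, 5])  # candidate split: [3, 1] and [1, 5]
h_children = 4/10 * h_left + 6/10 * h_right         # child entropies weighted by sample size
information_gain = h_parent - h_children
print(round(h_parent, 2), round(h_left, 2), round(h_right, 2), round(information_gain, 2))
# 0.97 0.81 0.65 0.26
```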
&lt;p&gt;&lt;img src="/images/decision_tree_split.png" alt="Decision tree split" style="width: 400px;"/&gt;&lt;/p&gt;
&lt;h2&gt;Entropy in clustering&lt;/h2&gt;
&lt;p&gt;Suppose we want to evaluate some unsupervised clustering algorithms, or the same algorithm with different feature vectors. We have some labeled data, so we can compare the cluster classification to the real classes. So we have for each sample point $i$ a feature vector $x_i$, actual class $A_i$ and the clustering algorithm's cluster identity ie. the modeled class $M_i$. Note that the $M_i$s will not be labeled, since a clustering algorithm just says that a subset of points belong to a cluster, but it doesn't know what that cluster is; there is no name given to it. The $A_i$s are usually labeled, like car, boat, airplane. Because the $A_i$s and $M_i$s are different, cross entropy and relative entropy don't make sense here, since those assume that the two random variables take on the same values.&lt;/p&gt;
&lt;p&gt;One approach is to take each cluster (so set of points where $M_i$ is the same), and calculate the entropy using the $A_i$s. We want this to be low, since that corresponds to a cluster with lots of points with the same actual class. Do this for each cluster, and then take the weighted sum of per-cluster entropies. What I described is just the conditional entropy $H(A|M)$! The downside of using conditional entropy is that it can be gamed by creating a lot of little clusters; in the worst case, each point is its own cluster, and all per-cluster entropies are 0. This is best controlled by limiting the number of clusters the clustering algorithm is allowed to create to the number of actual classes.&lt;/p&gt;
&lt;p&gt;A better alternative is to use &lt;a href="https://en.wikipedia.org/wiki/Mutual_information"&gt;mutual information&lt;/a&gt; $I(A, M)$, which will be zero if $A$ and $M$ are completely independent (worst case), and maximal if $A$ and $M$ completely determine each other, ie. if there is an exactly equal number of classes, and knowing a modeled class $M_i$ always corresponds to the same $A_i$, and vice versa. For more, see &lt;a href="https://nlp.stanford.edu/IR-book/html/htmledition/evaluation-of-clustering-1.html"&gt;this article&lt;/a&gt;. &lt;/p&gt;
&lt;p&gt;The implementation of the above clustering metrics is straightforward in Python. To demonstrate, let's use the standard &lt;a href="https://en.wikipedia.org/wiki/Iris_flower_data_set"&gt;Iris flower dataset&lt;/a&gt;, which has 4 float numbers describing the length and the width of the sepals and petals in centimeters, and a label for the type of iris (setosa, versicolor, virginica). Let's use &lt;a href="https://en.wikipedia.org/wiki/K-means_clustering"&gt;k-means clustering&lt;/a&gt; from Scikit Learn and have it return 3 clusters. Then, let's compute both the conditional entropy and information gain between the modeled and actual classes. Let's see what happens if we run the clustering with just 1, 2, 3 or all 4 of the available feature vector elements:&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="nn"&gt;math&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;log&lt;/span&gt;
&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="nn"&gt;sklearn&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;datasets&lt;/span&gt;
&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="nn"&gt;sklearn.cluster&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;KMeans&lt;/span&gt;
&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="nn"&gt;collections&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;defaultdict&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;Counter&lt;/span&gt;

&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;iris_clusters&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;use_columns&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="bp"&gt;None&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
    &lt;span class="n"&gt;iris&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;datasets&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;load_iris&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
    &lt;span class="n"&gt;model&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;KMeans&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;n_clusters&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="nb"&gt;len&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;iris&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;target_names&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;
    &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;use_columns&lt;/span&gt; &lt;span class="ow"&gt;is&lt;/span&gt; &lt;span class="bp"&gt;None&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
        &lt;span class="n"&gt;data&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;iris&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;data&lt;/span&gt;
    &lt;span class="k"&gt;else&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
        &lt;span class="n"&gt;data&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;iris&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;data&lt;/span&gt;&lt;span class="p"&gt;[:,&lt;/span&gt; &lt;span class="n"&gt;use_columns&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;
    &lt;span class="n"&gt;model&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;fit&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;data&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;iris&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;target&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;model&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;predict&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;data&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;entropy&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;frequencies&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;base&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mi"&gt;2&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="o"&gt;-&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt; &lt;span class="nb"&gt;sum&lt;/span&gt;&lt;span class="p"&gt;([&lt;/span&gt;&lt;span class="n"&gt;f&lt;/span&gt;&lt;span class="o"&gt;/&lt;/span&gt;&lt;span class="nb"&gt;sum&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;frequencies&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt; &lt;span class="n"&gt;log&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;f&lt;/span&gt;&lt;span class="o"&gt;/&lt;/span&gt;&lt;span class="nb"&gt;sum&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;frequencies&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt; &lt;span class="n"&gt;base&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;f&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;frequencies&lt;/span&gt;&lt;span class="p"&gt;])&lt;/span&gt;

&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;conditional_entropy&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;y&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;x&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;base&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mi"&gt;2&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt; &lt;span class="c1"&gt;# computes H(Y|X)&lt;/span&gt;
    &lt;span class="n"&gt;clusters&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;defaultdict&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="k"&gt;lambda&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;[])&lt;/span&gt;
    &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;yy&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;xx&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="nb"&gt;zip&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;y&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;x&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
        &lt;span class="n"&gt;clusters&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;xx&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;append&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;yy&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="nb"&gt;sum&lt;/span&gt;&lt;span class="p"&gt;([&lt;/span&gt;&lt;span class="nb"&gt;len&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;cluster&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;&lt;span class="o"&gt;/&lt;/span&gt;&lt;span class="nb"&gt;len&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;x&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt; &lt;span class="n"&gt;entropy&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;Counter&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;cluster&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;values&lt;/span&gt;&lt;span class="p"&gt;(),&lt;/span&gt; &lt;span class="n"&gt;base&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;cluster&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;clusters&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;values&lt;/span&gt;&lt;span class="p"&gt;()])&lt;/span&gt;

&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;information_gain&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;p&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;q&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;base&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mi"&gt;2&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
    &lt;span class="k"&gt;return&lt;/span&gt;     &lt;span class="n"&gt;entropy&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;Counter&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;p&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;values&lt;/span&gt;&lt;span class="p"&gt;(),&lt;/span&gt; &lt;span class="n"&gt;base&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;-&lt;/span&gt; &lt;span class="n"&gt;conditional_entropy&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;p&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;q&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;base&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="c1"&gt;# same as  entropy(Counter(q).values()) - conditional_entropy(q, p)&lt;/span&gt;

&lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;use_columns&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="p"&gt;[[&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;2&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;2&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;3&lt;/span&gt;&lt;span class="p"&gt;]]:&lt;/span&gt;
    &lt;span class="n"&gt;actuals&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;modeled&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;iris_clusters&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;use_columns&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="k"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;Using &lt;/span&gt;&lt;span class="si"&gt;%d&lt;/span&gt;&lt;span class="s1"&gt; columns in feature vector, conditional entropy = &lt;/span&gt;&lt;span class="si"&gt;%.2f&lt;/span&gt;&lt;span class="s1"&gt; bits, information gain = &lt;/span&gt;&lt;span class="si"&gt;%.2f&lt;/span&gt;&lt;span class="s1"&gt; bits&amp;#39;&lt;/span&gt; &lt;span class="o"&gt;%&lt;/span&gt; 
          &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nb"&gt;len&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;use_columns&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt; &lt;span class="n"&gt;conditional_entropy&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;actuals&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;modeled&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt; &lt;span class="n"&gt;information_gain&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;actuals&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;modeled&lt;/span&gt;&lt;span class="p"&gt;)))&lt;/span&gt;
&lt;/pre&gt;&lt;/div&gt;


&lt;p&gt;Prints:&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;Using 1 columns in feature vector, conditional entropy = 0.99 bits, information gain = 0.60 bits
Using 2 columns in feature vector, conditional entropy = 0.56 bits, information gain = 1.02 bits
Using 3 columns in feature vector, conditional entropy = 0.44 bits, information gain = 1.14 bits
Using 4 columns in feature vector, conditional entropy = 0.39 bits, information gain = 1.19 bits
&lt;/pre&gt;&lt;/div&gt;


&lt;p&gt;The result is as expected: allowing the clustering algorithm to see more of the feature vector yields lower conditional entropy (better) and higher information gain (better).&lt;/p&gt;
&lt;h2&gt;Entropy for understanding columnar data&lt;/h2&gt;
&lt;p&gt;We can use entropy to see how much information is contained in a column. For example, suppose we have a table &lt;code&gt;users&lt;/code&gt;, which has a column &lt;code&gt;country&lt;/code&gt;. We know that most of our users are located in our home country, but there may be others. We can compute the per-column entropy with a bit of SQL:&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;span class="k"&gt;WITH&lt;/span&gt;
&lt;span class="n"&gt;counts&lt;/span&gt; &lt;span class="k"&gt;AS&lt;/span&gt;
&lt;span class="p"&gt;(&lt;/span&gt;
    &lt;span class="k"&gt;SELECT&lt;/span&gt;
        &lt;span class="n"&gt;country&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="k"&gt;COUNT&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;AS&lt;/span&gt; &lt;span class="n"&gt;cnt&lt;/span&gt;
    &lt;span class="k"&gt;FROM&lt;/span&gt;
        &lt;span class="n"&gt;users&lt;/span&gt;
    &lt;span class="k"&gt;WHERE&lt;/span&gt;
        &lt;span class="n"&gt;country&lt;/span&gt; &lt;span class="k"&gt;IS&lt;/span&gt; &lt;span class="k"&gt;NOT&lt;/span&gt; &lt;span class="k"&gt;NULL&lt;/span&gt;
    &lt;span class="k"&gt;GROUP&lt;/span&gt; &lt;span class="k"&gt;BY&lt;/span&gt; &lt;span class="k"&gt;GROUPING&lt;/span&gt; &lt;span class="k"&gt;SETS&lt;/span&gt;
        &lt;span class="p"&gt;((),&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;country&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;
    &lt;span class="c1"&gt;-- there will be one row with country = NULL, corresponding to the overall counts&lt;/span&gt;
    &lt;span class="c1"&gt;-- the rest of the rows are the country-wise counts&lt;/span&gt;
&lt;span class="p"&gt;),&lt;/span&gt;
&lt;span class="n"&gt;probs&lt;/span&gt; &lt;span class="k"&gt;AS&lt;/span&gt;
&lt;span class="p"&gt;(&lt;/span&gt;
    &lt;span class="k"&gt;SELECT&lt;/span&gt;
        &lt;span class="k"&gt;g&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;country&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="k"&gt;CAST&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="k"&gt;g&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;cnt&lt;/span&gt; &lt;span class="k"&gt;AS&lt;/span&gt; &lt;span class="nb"&gt;FLOAT&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;/&lt;/span&gt; &lt;span class="n"&gt;a&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;cnt&lt;/span&gt; &lt;span class="k"&gt;AS&lt;/span&gt; &lt;span class="n"&gt;p&lt;/span&gt;
    &lt;span class="k"&gt;FROM&lt;/span&gt;
        &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="k"&gt;SELECT&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt; &lt;span class="k"&gt;FROM&lt;/span&gt; &lt;span class="n"&gt;counts&lt;/span&gt; &lt;span class="k"&gt;WHERE&lt;/span&gt; &lt;span class="n"&gt;country&lt;/span&gt; &lt;span class="k"&gt;IS&lt;/span&gt; &lt;span class="k"&gt;NOT&lt;/span&gt; &lt;span class="k"&gt;NULL&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;g&lt;/span&gt;
    &lt;span class="k"&gt;CROSS&lt;/span&gt; &lt;span class="k"&gt;JOIN&lt;/span&gt;
        &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="k"&gt;SELECT&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt; &lt;span class="k"&gt;FROM&lt;/span&gt; &lt;span class="n"&gt;counts&lt;/span&gt; &lt;span class="k"&gt;WHERE&lt;/span&gt; &lt;span class="n"&gt;country&lt;/span&gt; &lt;span class="k"&gt;IS&lt;/span&gt; &lt;span class="k"&gt;NULL&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="n"&gt;a&lt;/span&gt;
&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="k"&gt;SELECT&lt;/span&gt;
    &lt;span class="o"&gt;-&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt; &lt;span class="k"&gt;SUM&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;p&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt; &lt;span class="n"&gt;log&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;2&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;p&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt; &lt;span class="k"&gt;AS&lt;/span&gt; &lt;span class="n"&gt;entropy_bits&lt;/span&gt;
&lt;span class="k"&gt;FROM&lt;/span&gt;
    &lt;span class="n"&gt;probs&lt;/span&gt;
&lt;/pre&gt;&lt;/div&gt;


&lt;p&gt;This snippet can be extended to handle more columns; the simplest method is to concatenate columns like &lt;code&gt;'(' || col1 || ',' || col2 || ... ')'&lt;/code&gt;. Note that if &lt;code&gt;entropy_bits&lt;/code&gt; is 0, it means all values in the column are the same. Interestingly, the entropy (in bits) we compute is also the number of bits a database engine would need to efficiently encode these values in a row-wise manner.&lt;/p&gt;
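&lt;p&gt;The same computation can be sketched in plain Python. This is a minimal, standard-library-only sketch mirroring the SQL above; the function name and the sample values are made up for illustration:&lt;/p&gt;

```python
from collections import Counter
from math import log2

def column_entropy(values):
    # Drop NULLs (None), mirroring the WHERE country IS NOT NULL filter above.
    values = [v for v in values if v is not None]
    total = len(values)
    # p(v) = count(v) / total;  H = -sum over v of p(v) * log2 p(v)
    probs = [cnt / total for cnt in Counter(values).values()]
    return sum(-p * log2(p) for p in probs)

# All values identical: zero entropy, nothing to encode.
print(column_entropy(['AE'] * 10))              # 0.0
# Two equally likely countries: exactly 1 bit.
print(column_entropy(['AE'] * 5 + ['SA'] * 5))  # 1.0
# Multiple columns: concatenate the values, as described above.
rows = [('AE', 'en_us'), ('AE', 'ar_ae'), ('SA', 'ar_sa'), ('SA', 'ar_sa')]
print(column_entropy(['(' + c + ',' + l + ')' for c, l in rows]))  # 1.5
```

&lt;p&gt;This is handy for sanity-checking the SQL result against a small extract of the table.&lt;/p&gt;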
&lt;p&gt;We can also check if another column contains additional information. Suppose there is another column in &lt;code&gt;users&lt;/code&gt; called &lt;code&gt;locale&lt;/code&gt;, which is the language locale of their browser, like &lt;code&gt;en_us&lt;/code&gt; or &lt;code&gt;fr_fr&lt;/code&gt;. For this we can use conditional entropy &lt;code&gt;H(locale|country)&lt;/code&gt;:&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;span class="k"&gt;WITH&lt;/span&gt;
&lt;span class="n"&gt;counts&lt;/span&gt; &lt;span class="k"&gt;AS&lt;/span&gt;
&lt;span class="p"&gt;(&lt;/span&gt;
    &lt;span class="k"&gt;SELECT&lt;/span&gt;
        &lt;span class="n"&gt;country&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="n"&gt;locale&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="k"&gt;COUNT&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;AS&lt;/span&gt; &lt;span class="n"&gt;cnt&lt;/span&gt;
    &lt;span class="k"&gt;FROM&lt;/span&gt;
        &lt;span class="n"&gt;users&lt;/span&gt;
    &lt;span class="k"&gt;WHERE&lt;/span&gt;
        &lt;span class="n"&gt;country&lt;/span&gt; &lt;span class="k"&gt;IS&lt;/span&gt; &lt;span class="k"&gt;NOT&lt;/span&gt; &lt;span class="k"&gt;NULL&lt;/span&gt; &lt;span class="k"&gt;AND&lt;/span&gt; &lt;span class="n"&gt;locale&lt;/span&gt; &lt;span class="k"&gt;IS&lt;/span&gt; &lt;span class="k"&gt;NOT&lt;/span&gt; &lt;span class="k"&gt;NULL&lt;/span&gt;
    &lt;span class="k"&gt;GROUP&lt;/span&gt; &lt;span class="k"&gt;BY&lt;/span&gt; &lt;span class="k"&gt;GROUPING&lt;/span&gt; &lt;span class="k"&gt;SETS&lt;/span&gt;
        &lt;span class="p"&gt;((),&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;country&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;country&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;locale&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;
&lt;span class="p"&gt;),&lt;/span&gt;
&lt;span class="n"&gt;marginal_probs&lt;/span&gt; &lt;span class="k"&gt;AS&lt;/span&gt;
&lt;span class="p"&gt;(&lt;/span&gt;
    &lt;span class="k"&gt;SELECT&lt;/span&gt;
        &lt;span class="k"&gt;g&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;country&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="k"&gt;CAST&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="k"&gt;g&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;cnt&lt;/span&gt; &lt;span class="k"&gt;AS&lt;/span&gt; &lt;span class="nb"&gt;FLOAT&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;/&lt;/span&gt; &lt;span class="n"&gt;a&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;cnt&lt;/span&gt; &lt;span class="k"&gt;AS&lt;/span&gt; &lt;span class="n"&gt;p&lt;/span&gt;
    &lt;span class="k"&gt;FROM&lt;/span&gt;
        &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="k"&gt;SELECT&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt; &lt;span class="k"&gt;FROM&lt;/span&gt; &lt;span class="n"&gt;counts&lt;/span&gt; &lt;span class="k"&gt;WHERE&lt;/span&gt; &lt;span class="n"&gt;country&lt;/span&gt; &lt;span class="k"&gt;IS&lt;/span&gt; &lt;span class="k"&gt;NOT&lt;/span&gt; &lt;span class="k"&gt;NULL&lt;/span&gt; &lt;span class="k"&gt;AND&lt;/span&gt; &lt;span class="n"&gt;locale&lt;/span&gt; &lt;span class="k"&gt;IS&lt;/span&gt; &lt;span class="k"&gt;NULL&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;g&lt;/span&gt;
    &lt;span class="k"&gt;CROSS&lt;/span&gt; &lt;span class="k"&gt;JOIN&lt;/span&gt;
        &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="k"&gt;SELECT&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt; &lt;span class="k"&gt;FROM&lt;/span&gt; &lt;span class="n"&gt;counts&lt;/span&gt; &lt;span class="k"&gt;WHERE&lt;/span&gt; &lt;span class="n"&gt;country&lt;/span&gt; &lt;span class="k"&gt;IS&lt;/span&gt; &lt;span class="k"&gt;NULL&lt;/span&gt; &lt;span class="k"&gt;AND&lt;/span&gt; &lt;span class="n"&gt;locale&lt;/span&gt; &lt;span class="k"&gt;IS&lt;/span&gt; &lt;span class="k"&gt;NULL&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="n"&gt;a&lt;/span&gt;
&lt;span class="p"&gt;),&lt;/span&gt;
&lt;span class="n"&gt;joint_probs&lt;/span&gt; &lt;span class="k"&gt;AS&lt;/span&gt;
&lt;span class="p"&gt;(&lt;/span&gt;
    &lt;span class="k"&gt;SELECT&lt;/span&gt;
        &lt;span class="k"&gt;g&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;country&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="k"&gt;g&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;locale&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="k"&gt;CAST&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="k"&gt;g&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;cnt&lt;/span&gt; &lt;span class="k"&gt;AS&lt;/span&gt; &lt;span class="nb"&gt;FLOAT&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;/&lt;/span&gt; &lt;span class="n"&gt;a&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;cnt&lt;/span&gt; &lt;span class="k"&gt;AS&lt;/span&gt; &lt;span class="n"&gt;p&lt;/span&gt;
    &lt;span class="k"&gt;FROM&lt;/span&gt;
        &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="k"&gt;SELECT&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt; &lt;span class="k"&gt;FROM&lt;/span&gt; &lt;span class="n"&gt;counts&lt;/span&gt; &lt;span class="k"&gt;WHERE&lt;/span&gt; &lt;span class="n"&gt;country&lt;/span&gt; &lt;span class="k"&gt;IS&lt;/span&gt; &lt;span class="k"&gt;NOT&lt;/span&gt; &lt;span class="k"&gt;NULL&lt;/span&gt; &lt;span class="k"&gt;AND&lt;/span&gt; &lt;span class="n"&gt;locale&lt;/span&gt; &lt;span class="k"&gt;IS&lt;/span&gt; &lt;span class="k"&gt;NOT&lt;/span&gt; &lt;span class="k"&gt;NULL&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;g&lt;/span&gt;
    &lt;span class="k"&gt;CROSS&lt;/span&gt; &lt;span class="k"&gt;JOIN&lt;/span&gt;
        &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="k"&gt;SELECT&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt; &lt;span class="k"&gt;FROM&lt;/span&gt; &lt;span class="n"&gt;counts&lt;/span&gt; &lt;span class="k"&gt;WHERE&lt;/span&gt; &lt;span class="n"&gt;country&lt;/span&gt; &lt;span class="k"&gt;IS&lt;/span&gt; &lt;span class="k"&gt;NULL&lt;/span&gt; &lt;span class="k"&gt;AND&lt;/span&gt; &lt;span class="n"&gt;locale&lt;/span&gt; &lt;span class="k"&gt;IS&lt;/span&gt; &lt;span class="k"&gt;NULL&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="n"&gt;a&lt;/span&gt;
&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="k"&gt;SELECT&lt;/span&gt;
    &lt;span class="o"&gt;-&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt; &lt;span class="k"&gt;SUM&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;xy&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;p&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt; &lt;span class="n"&gt;log&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;2&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;xy&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;p&lt;/span&gt;&lt;span class="o"&gt;/&lt;/span&gt;&lt;span class="n"&gt;x&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;p&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;AS&lt;/span&gt; &lt;span class="n"&gt;entropy_bits&lt;/span&gt;
&lt;span class="k"&gt;FROM&lt;/span&gt;
    &lt;span class="n"&gt;marginal_probs&lt;/span&gt; &lt;span class="k"&gt;AS&lt;/span&gt; &lt;span class="n"&gt;x&lt;/span&gt;
&lt;span class="k"&gt;INNER&lt;/span&gt; &lt;span class="k"&gt;JOIN&lt;/span&gt;
    &lt;span class="n"&gt;joint_probs&lt;/span&gt; &lt;span class="k"&gt;AS&lt;/span&gt; &lt;span class="n"&gt;xy&lt;/span&gt;
&lt;span class="k"&gt;USING&lt;/span&gt;
    &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;country&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/pre&gt;&lt;/div&gt;


&lt;p&gt;For example, if &lt;code&gt;country -&amp;gt; locale&lt;/code&gt; is a deterministic mapping, ie. &lt;code&gt;USA&lt;/code&gt; always maps to &lt;code&gt;en_us&lt;/code&gt;, then conditional entropy will be 0. As in earlier examples, information gain also makes sense for tabular data.&lt;/p&gt;
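&lt;p&gt;The same check can be sketched in a few lines of standard-library Python. This computes H(locale|country) from (country, locale) pairs; the toy rows and the function name are invented for illustration:&lt;/p&gt;

```python
from collections import Counter
from math import log2

def conditional_entropy(pairs):
    # H(Y|X) = -sum over (x, y) of p(x, y) * log2( p(x, y) / p(x) )
    n = len(pairs)
    joint = Counter(pairs)                   # counts of (country, locale)
    marginal = Counter(x for x, _ in pairs)  # counts of country alone
    return sum(-(cnt / n) * log2((cnt / n) / (marginal[x] / n))
               for (x, _), cnt in joint.items())

# Deterministic mapping country -> locale: conditional entropy is 0.
print(conditional_entropy([('USA', 'en_us'), ('USA', 'en_us'), ('FRA', 'fr_fr')]))  # 0.0

# If USA users split evenly between two locales, knowing the country
# still leaves uncertainty about the locale, so H(locale|country) is positive.
print(conditional_entropy([('USA', 'en_us'), ('USA', 'es_us'), ('FRA', 'fr_fr')]))  # 2/3 bits
```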
&lt;h2&gt;Conclusion&lt;/h2&gt;
&lt;p&gt;The above examples show that entropy is a common and useful concept in Data Science. I plan to write a follow-up article about interesting examples of entropy in Physics.&lt;/p&gt;</content><category term="entropy"></category></entry><entry><title>100 articles</title><link href="https://bytepawn.com/100-articles.html" rel="alternate"></link><published>2021-10-18T00:00:00+02:00</published><updated>2021-10-18T00:00:00+02:00</updated><author><name>Marton Trencseni</name></author><id>tag:bytepawn.com,2021-10-18:/100-articles.html</id><summary type="html">&lt;p&gt;A review and introspect on the first 100 articles written on Bytepawn. &lt;br/&gt;&lt;br/&gt;&lt;img src="/images/100.png" alt="100" style="width: 300px;"/&gt;&lt;/p&gt;</summary><content type="html">&lt;h2&gt;Introduction&lt;/h2&gt;
&lt;p&gt;The previous article, &lt;a href="https://bytepawn.com/cross-entropy-joint-entropy-conditional-entropy-and-relative-entropy.html"&gt;Cross entropy, joint entropy, conditional entropy and relative entropy&lt;/a&gt; was the 💯th post on Bytepawn! I took a few hours and reviewed my writing so far.&lt;/p&gt;
&lt;p&gt;&lt;img src="/images/100.png" alt="100" style="width: 200px;"/&gt;&lt;/p&gt;
&lt;h2&gt;First article&lt;/h2&gt;
&lt;p&gt;I wrote the first article, &lt;a href="https://bytepawn.com/cargo-cult-data.html"&gt;Cargo Cult Data&lt;/a&gt;, in 2015 January because I saw that many companies are not getting the most out of their data and data teams due to organizational and cultural discipline issues. At that time, the issues I saw were flaky logs, no standardized metrics, or constant and pointless arguments around data (eg. correlation vs causation, trustworthiness). The topic of cargo culting is still relevant, but today I would write a different article. Since then I've worked at a lot more companies, and what I called "cargo culting" is the way things are at most companies — the only exception I've seen is Facebook (the company). Since then Data Science and AI have exploded and moved the topic of conversation away from traditional analytics topics like logging and A/B testing. DS/ML/AI have their own set of problems which are not best described by the "cargo cult" analogy. The final reason I would write a different article today, being almost 7 years wiser, is that today I believe it's not constructive to be confrontational on such issues. It's better to educate people, leading by showing, which is why later I wrote so many posts on data and A/B testing related topics here.&lt;/p&gt;
&lt;p&gt;&lt;img src="/images/cargo_cult_data.jpg" alt="Cargo cult data" style="width: 400px;"/&gt;&lt;/p&gt;
&lt;h2&gt;Airflow&lt;/h2&gt;
&lt;p&gt;When I left my job at Prezi I took a month off before joining Facebook (the company), and I spent some time reviewing open source ETL tools. At Prezi we spent a lot of time building hand-rolled ETL tools: one in bash, a second one in Haskell and a third one in golang. By the end of my tenure I realized this was crazy; there must be (or should be) a good enough open source solution. I reviewed &lt;a href="https://bytepawn.com/airflow.html"&gt;Airflow&lt;/a&gt;, &lt;a href="https://bytepawn.com/luigi.html"&gt;Luigi&lt;/a&gt; and &lt;a href="https://bytepawn.com/pinball.html"&gt;Pinball&lt;/a&gt; and finally wrote a &lt;a href="https://bytepawn.com/luigi-airflow-pinball.html"&gt;3-way comparison piece&lt;/a&gt;. To this day, this gets a lot of daily long-tail hits — people are still curious about what ETL tool to use. My takeaway then was that Airflow is the way forward. Since then Airflow has become the de-facto open source ETL tool. At Facebook I didn't use Airflow; we had something called Dataswarm, but I learned that Airflow is actually based on Dataswarm (written by an ex-Facebook engineer who left the company). So after leaving Facebook, I've set up Airflow and made it the standard ETL tool at all subsequent jobs, also teaching team members how to use it (building an Airflow job is always part of onboarding). Over the years I've also written a &lt;a href="https://bytepawn.com/tag/airflow.html"&gt;few more articles on how we use Airflow in production&lt;/a&gt;. The biggest "trick" — which I hope every team figures out — is that we always write helper functions that internally construct DAGs, so my less technical Data Scientist teammates can just ignore the technicalities and use a template for building DAGs. We currently use Airflow for both SQL transform jobs as well as ML model building (for which it is less ideal), and I plan to continue to use Airflow in the future.&lt;/p&gt;
&lt;p&gt;&lt;img src="/images/airflow_code_example.png" alt="100" style="width: 300px;"/&gt;&lt;/p&gt;
&lt;h2&gt;Fetchr&lt;/h2&gt;
&lt;p&gt;My first job in Dubai, after I left Facebook and London, was at a last mile delivery startup called Fetchr. One of the upsides about working at Fetchr was that, unlike at Facebook, I could write about work related topics freely; nobody cared. I wrote &lt;a href="https://bytepawn.com/tag/fetchr.html"&gt;about 10 articles about interesting Data Science work we did at Fetchr&lt;/a&gt;. I'm especially fond of the article &lt;a href="https://bytepawn.com/automating-a-call-center-with-machine-learning.html"&gt;Automating a Call Center with Machine Learning&lt;/a&gt;; this service is probably the most impactful piece of software I've ever written: it ended up saving about $3-5m/yr for Fetchr, which was about 3-5% of expenses at that time. It's been 2 years since I left; &lt;a href="https://www.bloomberg.com/news/articles/2021-10-06/top-backer-of-dubai-app-fetchr-warns-startup-faces-liquidation"&gt;unfortunately Fetchr is not doing too well&lt;/a&gt;.&lt;/p&gt;
&lt;p&gt;&lt;img src="/images/ml-share3.png" alt="Share of ML scheduled versus Call center scheduled deliveries" style="width: 500px;"/&gt;&lt;/p&gt;
&lt;h2&gt;Pytorch&lt;/h2&gt;
&lt;p&gt;The following are all true for the Deep (and Reinforcement) Learning revolution: (i) the productization opportunities are genuinely exciting (ii) everybody's talking about it (iii) it's very interesting technically (iv) yet, very few companies are doing it in production. Big tech does use DL in production, but in my experience almost all other companies don't, because 80-90% of tabular business problems can be solved by SQL, scripting and dashboarding. For the remaining 10-20% there is some benefit in using Scikit-Learn models (tree-based models, gradient boosting, etc) or building FBProphet forecasts (or similar libraries like XGBoost or LGBM). Whenever I actually use a neural network at work, it's usually when I'm building a classifier, and I'm playing around with Scikit-Learn's &lt;a href="https://scikit-learn.org/stable/modules/generated/sklearn.linear_model.LogisticRegression.html"&gt;LogisticRegression&lt;/a&gt;, &lt;a href="https://scikit-learn.org/stable/modules/generated/sklearn.tree.DecisionTreeClassifier.html"&gt;DecisionTreeClassifier&lt;/a&gt;, &lt;a href="https://scikit-learn.org/stable/modules/generated/sklearn.ensemble.RandomForestClassifier.html"&gt;RandomForestClassifier&lt;/a&gt;, &lt;a href="https://scikit-learn.org/stable/modules/generated/sklearn.ensemble.GradientBoostingClassifier.html"&gt;GradientBoostingClassifier&lt;/a&gt;, etc. models and I also try and see how &lt;a href="https://scikit-learn.org/stable/modules/generated/sklearn.neural_network.MLPClassifier.html"&gt;MLPClassifier&lt;/a&gt; performs with 2-3 hidden layers. Usually I decide against it, since performance is either not superior or within 1-2% of the second best, but training is slow and convergence is not always deterministic.&lt;/p&gt;
&lt;p&gt;I still had to learn Deep Learning! It's so interesting, exciting, and you first have to know what it is before you can make a good call whether to (not) use it for a given problem. So I used this blog to drive my learning, writing an article on each mini-project I built. I cast my vote for Pytorch (vs Tensorflow), because of its native integration with Python, which makes learning, experimentation and debugging easy (vs the Tensorflow runtime model). Some of the articles took a lot of time (also runtime), up to 20-40 hours sometimes. All in all I wrote &lt;a href="https://bytepawn.com/tag/pytorch.html"&gt;18 posts tagged pytorch&lt;/a&gt;. The approach worked: I now know the basics (and a bit more) about Deep (and Reinforcement) Learning. My favorite articles are the ones where I go beyond getting something to work, such as &lt;a href="https://bytepawn.com/mnist-pixel-attacks-with-pytorch.html"&gt;seeing how many pixels I have to change on MNIST digits to break a deep neural network classifier&lt;/a&gt; or the one where &lt;a href="https://bytepawn.com/classification-accuracy-of-quantized-autoencoders-with-pytorch-and-mnist.html"&gt;I play around with the information storage capacity of autoencoders&lt;/a&gt;.&lt;/p&gt;
&lt;p&gt;&lt;img src="/images/mnist-four.png" alt="MNIST pixel attacks" style="width: 300px;"/&gt;&lt;/p&gt;
&lt;h2&gt;Craftsmanship&lt;/h2&gt;
&lt;p&gt;I love thinking and writing about craftsmanship. Craftsmanship is about the little things, such as indenting SQL code properly, formatting numbers and charts so they're readable, or writing readable regular expressions. Articles from this theme:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;&lt;a href="https://bytepawn.com/beat-the-averages.html#beat-the-averages"&gt;Beat the averages&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://bytepawn.com/rxe-literate-and-composable-regular-expressions.html#rxe-literate-and-composable-regular-expressions"&gt;Literate and composable regular expressions&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://bytepawn.com/metrics-atlas.html#metrics-atlas"&gt;Metrics Atlas&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://bytepawn.com/calibration-curves-for-delivery-prediction-with-scikit-learn.html#calibration-curves-for-delivery-prediction-with-scikit-learn"&gt;Calibration curves for delivery prediction with Scikit-Learn&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://bytepawn.com/how-i-write-sql-code.html#how-i-write-sql-code"&gt;How I write SQL code&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://bytepawn.com/sql-best-practices-for-data-scientists-and-analysts.html#sql-best-practices-for-data-scientists-and-analysts"&gt;SQL best practices for Data Scientists and Analysts&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://bytepawn.com/validation-checks-for-ab-tests.html#validation-checks-for-ab-tests"&gt;Validation checks for A/B tests&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://bytepawn.com/effective-data-visualization-part-1-categorical-data.html#effective-data-visualization-part-1-categorical-data"&gt;Effective Data Visualization Part 1: Categorical data&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://bytepawn.com/effective-data-visualization-part-2-formatting-numbers.html#effective-data-visualization-part-2-formatting-numbers"&gt;Effective Data Visualization Part 2: Formatting numbers&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://bytepawn.com/effective-data-visualization-part-3-line-charts.html#effective-data-visualization-part-3-line-charts"&gt;Effective Data Visualization Part 3: Line charts and stacked area charts&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;h2&gt;A/B testing&lt;/h2&gt;
&lt;p&gt;The most frequent topic I wrote about is A/B testing: there are &lt;a href="https://bytepawn.com/tag/ab-testing.html"&gt;20 articles tagged A/B testing&lt;/a&gt;. In my experience, writing clean SQL code and understanding the concepts of A/B testing are the two most important skills in Data Science and Analytics — but very few people are clear on them. 4 out of 5 people I interview cannot explain to me what a p-value is. This is why I wrote articles such as &lt;a href="https://bytepawn.com/building-intuition-p-values-statistical-significance.html"&gt;Building intuition for p-values and statistical significance&lt;/a&gt;. I had a lot of fun with running Monte Carlo simulations of A/B testing on random graphs (&lt;a href="https://bytepawn.com/ab-testing-on-social-networks.html"&gt;A/B testing on social networks&lt;/a&gt; and &lt;a href="https://bytepawn.com/ab-testing-and-network-effects.html"&gt;A/B testing and network effects&lt;/a&gt;).&lt;/p&gt;
&lt;p&gt;A/B testing was also my greatest blunder on Bytepawn. In &lt;a href="https://bytepawn.com/ab-testing-and-the-historic-lift-paradox.html"&gt;this article&lt;/a&gt; I incorrectly claimed that it's a fallacy to look at historic (before the A/B test) data. This post made it to the Hacker News front page (most don't), and people called me out on this incorrect claim and pointed me to something called CUPED. This was painful, because in all my A/B testing articles I run Monte Carlo simulations to check my thinking, and here I got lazy and did not. This was a good reminder for me that although A/B testing seems like an easy subject, it's easy to mislead yourself, even for somebody who spends a lot of time thinking about it. It's always better to check your thinking with simple Monte Carlo simulations. But, this way I &lt;a href="https://bytepawn.com/tag/cuped.html"&gt;learned about CUPED, and wrote 5 articles on it&lt;/a&gt; — with Monte Carlo simulations, of course!&lt;/p&gt;
&lt;p&gt;&lt;img src="/images/coin_toss.jpg" alt="Coin toss" style="width: 300px;"/&gt;&lt;/p&gt;
&lt;h2&gt;Pelican&lt;/h2&gt;
&lt;p&gt;&lt;a href="https://blog.getpelican.com/"&gt;Pelican&lt;/a&gt; is the static site generator I use for Bytepawn. &lt;a href="https://bytepawn.com/how-to-make-a-blog-like-this.html#how-to-make-a-blog-like-this"&gt;I wrote an article about it&lt;/a&gt;, what's changed since then is now I host it myself (with nginx and Let's Encrypt) to avoid the ~10 second delays when changing content with Github Pages. The combination of writing articles in Markdown, not worrying too much about formatting and Pelican has worked well in the past 5 years, I don't plan to change it. I don't want to switch to something like Medium, because this way I have and feel more ownership for the blog.&lt;/p&gt;
&lt;h2&gt;Conclusion&lt;/h2&gt;
&lt;p&gt;Writing the first 100 articles has been great fun; I hope and plan to write another 100! I write this blog for fun: it's not monetized, and I don't regularly push the links to broadcast-social-media like Hacker News, only to my local-social-media (my Facebook and Linkedin). It used to get around 50-100 readers a day; this year it went up to 100-150 per day, probably because I write regularly: one of my yearly goals is to write 40 blog posts in 2021 (I'm on track). Much more is not doable next to work and family, but I'll try to push myself to write 50 in 2022, and it'd be nice to have ~250 readers a day. A big question to myself is whether I should start writing articles about Physics — time will tell!&lt;/p&gt;
&lt;p&gt;Onward to the next 💯!&lt;/p&gt;</content><category term="meta"></category></entry><entry><title>Cross entropy, joint entropy, conditional entropy and relative entropy</title><link href="https://bytepawn.com/cross-entropy-joint-entropy-conditional-entropy-and-relative-entropy.html" rel="alternate"></link><published>2021-10-09T00:00:00+02:00</published><updated>2021-10-09T00:00:00+02:00</updated><author><name>Marton Trencseni</name></author><id>tag:bytepawn.com,2021-10-09:/cross-entropy-joint-entropy-conditional-entropy-and-relative-entropy.html</id><summary type="html">&lt;p&gt;What's the difference between cross entropy, joint entropy, conditional entropy and relative entropy? &lt;br/&gt;&lt;br/&gt;&lt;img src="/images/inf_gain.png" alt="Information gain" style="width: 300px;"/&gt;&lt;/p&gt;</summary><content type="html">&lt;h2&gt;Introduction&lt;/h2&gt;
&lt;p&gt;In the &lt;a href="https://bytepawn.com/what-is-the-entropy-of-a-fair-coin-toss.html"&gt;previous article I discussed entropy&lt;/a&gt;. As a follow-up, I revisit more advanced variations of entropy and related concepts:&lt;/p&gt;
&lt;ol&gt;
&lt;li&gt;&lt;a href="https://en.wikipedia.org/wiki/Joint_entropy"&gt;Joint entropy&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://en.wikipedia.org/wiki/Conditional_entropy"&gt;Conditional entropy&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://en.wikipedia.org/wiki/Cross_entropy"&gt;Cross entropy&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://en.wikipedia.org/wiki/Kullback%E2%80%93Leibler_divergence"&gt;Relative entropy&lt;/a&gt; (also known as Kullback–Leibler divergence)&lt;/li&gt;
&lt;li&gt;&lt;a href="https://en.wikipedia.org/wiki/Mutual_information"&gt;Mutual information&lt;/a&gt; (also known as Information gain)&lt;/li&gt;
&lt;/ol&gt;
&lt;p&gt;Everything I cover here is &lt;strong&gt;elementary information theory&lt;/strong&gt;, mostly found in the first chapter of the classic &lt;a href="https://www.amazon.com/Elements-Information-Theory-Telecommunications-Processing/dp/0471241954"&gt;Cover, Thomas: Elements of Information Theory&lt;/a&gt; or the wikipedia pages linked above.&lt;/p&gt;
&lt;h2&gt;Entropy&lt;/h2&gt;
&lt;p&gt;Entropy is the amount of uncertainty of a random variable, expressed in bits. Imagine Alice has a random variable and she needs to communicate the outcome over a digital binary channel to Bob. What is a good encoding that minimizes the average number of bits she sends? In the &lt;a href="https://bytepawn.com/what-is-the-entropy-of-a-fair-coin-toss.html"&gt;previous article I discussed the case of a fair coin&lt;/a&gt;, so let's make it a bit more complicated here, and use a 4-sided dice, ie. a tetrahedron. Let's say it's a fair tetrahedron, so each side comes up with $ p = \frac{1}{4} $ probability.&lt;/p&gt;
&lt;p&gt;&lt;img src="/images/tetrahedron_dice.jpg" alt="4 sided dice" style="width: 200px;"/&gt;&lt;/p&gt;
&lt;p&gt;There are 4 outcomes, so she needs to use 2 bits:&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;1 -&amp;gt; 00
2 -&amp;gt; 01
3 -&amp;gt; 10
4 -&amp;gt; 11
&lt;/pre&gt;&lt;/div&gt;


&lt;p&gt;Since all outcomes are equally likely, there is no reason for her to deviate from this trivial encoding. So, on average she will use 2 bits per message.&lt;/p&gt;
&lt;p&gt;What if one of the sides is more likely to come up? Suppose the distribution is $ p = [\frac{1}{2}, \frac{1}{6}, \frac{1}{6}, \frac{1}{6}] $. In this case, it makes sense to use less bits for the first side, since it will come up more, so she sends less bits on average. Eg. she can try:&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;1 -&amp;gt; 0
2 -&amp;gt; 10
3 -&amp;gt; 110
4 -&amp;gt; 111
&lt;/pre&gt;&lt;/div&gt;


&lt;p&gt;With this encoding, on average she's sending $[\frac{1}{2}, \frac{1}{6}, \frac{1}{6}, \frac{1}{6}] \times [1, 2, 3, 3] = \frac{11}{6} = 1.83$ bits per message.&lt;/p&gt;
&lt;p&gt;The formula for entropy is $H(X) = - \sum_i p(x_i) log[p(x_i)]$. For the fair tetrahedron, $H(X)=2$ bits, which means that the trivial encoding is optimal. For the biased tetrahedron above, $H(X)=1.79$ bits, which is less than the 1.83 bits per message of the encoding above. This means that there are better encodings: we can still save about 0.04 bits per message. To do this, we need to do block encoding, ie. encode several outcomes together, like:&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;11 -&amp;gt; 0
12 -&amp;gt; ...
13 -&amp;gt; ...
14 -&amp;gt; ...
21 -&amp;gt; ...
...
&lt;/pre&gt;&lt;/div&gt;


&lt;p&gt;By creating larger and larger block encodings, we can arbitrarily approach the theoretical limit of $H(X)$, but we can never do better than that.&lt;/p&gt;
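&lt;p&gt;These numbers are easy to verify in a few lines of Python (the &lt;code&gt;entropy&lt;/code&gt; helper below is my own, not from the post):&lt;/p&gt;

```python
import math

def entropy(p):
    # H(X) = -sum_i p_i * log2(p_i), skipping zero-probability outcomes
    return -sum(pi * math.log2(pi) for pi in p if pi > 0)

fair   = [1/4, 1/4, 1/4, 1/4]
biased = [1/2, 1/6, 1/6, 1/6]
code_lengths = [1, 2, 3, 3]  # lengths of the prefix code 0, 10, 110, 111

# average bits per message with the prefix code, under the biased distribution
avg_bits = sum(pi * li for pi, li in zip(biased, code_lengths))

print(entropy(fair))    # 2.0, the trivial 2-bit encoding is optimal
print(avg_bits)         # 1.833..., what the prefix code achieves
print(entropy(biased))  # 1.79..., the theoretical limit
```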
&lt;p&gt;The formula $H(X) = - \sum_i p(x_i) log[p(x_i)]$ for entropy can be made intuitive: the number of bits we should use to encode an outcome with probability $p(x_i)$ is of length $ - log[p(x_i)] $ bits. Eg. for $ p(x_i) = \frac{1}{2} $ this is 1 bit, for $ p(x_i)= \frac{1}{4} $ this is 2 bits. Considering the examples above, this is intuitive. And the $p(x_i)$ multiplier is just how often this outcome will occur. So it makes sense that this is the average number of bits sent per message.&lt;/p&gt;
&lt;p&gt;Also note that the entropy of a random variable that has just one outcome is 0: we don't need any bits if we already know what the outcome is going to be.&lt;/p&gt;
&lt;h2&gt;Joint entropy&lt;/h2&gt;
&lt;p&gt;Just as there is probability $p(x)$, $p(y)$ and joint probability $p(x, y)$, we can define entropy $H(X)$ and joint entropy $H(X, Y)$.&lt;/p&gt;
&lt;p&gt;$ H(X, Y) = - \sum_{ij} p(x_i, y_j)log[p(x_i, y_j)]$&lt;/p&gt;
&lt;p&gt;If $X$ and $Y$ are independent, then $H(X, Y) = H(X) + H(Y)$. This is easy to see, because then $p(x, y) = p(x) p(y)$, so:&lt;/p&gt;
&lt;p&gt;$ H(X, Y) = - \sum_{ij} p(x_i, y_j)log[p(x_i, y_j)] = - \sum_{ij} p(x_i)p(y_j)log[p(x_i)p(y_j)] $&lt;/p&gt;
&lt;p&gt;Now we use $log(ab)=log(a)+log(b)$:&lt;/p&gt;
&lt;p&gt;$ H(X, Y) = - \sum_{ij} p(x_i, y_j)log[p(x_i, y_j)] = - \sum_{ij} p(x_i)p(y_j)[log[p(x_i)] + log[p(y_j)]] $&lt;/p&gt;
&lt;p&gt;$ H(X, Y) = - \sum_j p(y_j) \sum_i p(x_i)log[p(x_i)] - \sum_i p(x_i) \sum_j p(y_j)log[p(y_j)] $&lt;/p&gt;
&lt;p&gt;$ H(X, Y) = \sum_j p(y_j) H(X) + \sum_i p(x_i) H(Y) = H(X) \sum_j p(y_j) + H(Y) \sum_i p(x_i) = H(X) + H(Y) $.&lt;/p&gt;
&lt;p&gt;If $X$ and $Y$ are not independent, then:&lt;/p&gt;
&lt;p&gt;$ H(X, Y) &amp;lt; H(X) + H(Y) $.&lt;/p&gt;
&lt;p&gt;A simple example of the above is the 4 sided dice. Suppose $X$ is the random variable for each side coming up (4 outcomes, each with probability $\frac{1}{4}$), and $Y$ is the random variable for even or odd numbered sides coming up (2 outcomes, like a coin toss), &lt;em&gt;referring to the same toss&lt;/em&gt;. $X$ and $Y$ are not independent, eg. $P(X=1,Y=even)=0$ but $P(X=1)=\frac{1}{4}$ and $P(Y=even)=\frac{1}{2}$. Obviously, $H(X)=2 $ bits and $H(Y)=1 $ bit, but $H(X, Y)=H(X)=2 $ bits, because knowing $X$ already tells us everything about $Y$.&lt;/p&gt;
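&lt;p&gt;This example is easy to check numerically; a small sketch (the joint distribution dictionary below is just the one described above, names are mine):&lt;/p&gt;

```python
import math

def entropy(p):
    # H = -sum p_i * log2(p_i) over nonzero probabilities
    return -sum(pi * math.log2(pi) for pi in p if pi > 0)

# joint p(x, y): X is the side (1..4), Y is the parity of the same toss,
# so only the consistent (side, parity) pairs have nonzero probability
joint = {(1, 'odd'): 1/4, (2, 'even'): 1/4, (3, 'odd'): 1/4, (4, 'even'): 1/4}

H_XY = entropy(joint.values())
print(H_XY)  # 2.0 = H(X), less than H(X) + H(Y) = 3
```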
&lt;h2&gt;Conditional entropy&lt;/h2&gt;
&lt;p&gt;Suppose we have two random variables $X$ and $Y$. Suppose that we know the outcome of $X$, and the question is, how much entropy is "left" in $Y$? Another mental model: suppose we have Alice and Bob communicating over a digital channel. Alice needs to encode the outcome of $Y$ in bits and send it over to Bob (many times), but there is a second random variable $X$, whose outcome both Alice and Bob have access to. For example, Alice has a thermometer ($Y$) and is encoding and sending the reading to Bob who is a mile away, but both Alice and Bob experience the same weather ($X$). Can Alice use less bits to encode $Y$?&lt;/p&gt;
&lt;p&gt;We define conditional entropy like:&lt;/p&gt;
&lt;p&gt;$ H(Y | X) = \sum_i p(x_i) H(Y | X = x_i) $&lt;/p&gt;
&lt;p&gt;We go through all the values $X$ can take, calculate the entropy of $H(Y | X = x_i)$ of $Y$, and we average this over the outcomes of $X$. Note that this is similar to the formula for &lt;a href="https://en.wikipedia.org/wiki/Conditional_expectation"&gt;conditional expectation&lt;/a&gt;. $H(Y | X = x_i)$ is just the entropy over the conditional probabilities:&lt;/p&gt;
&lt;p&gt;$ H(Y | X = x_i) = - \sum_j p(y_j | x_i) log[p(y_j | x_i)] $.&lt;/p&gt;
&lt;p&gt;Since $ p(y_j | x_i) p(x_i) = p(y_j, x_i) $ conditional entropy can also be written as:&lt;/p&gt;
&lt;p&gt;$ H(Y | X) = - \sum_{ij} p(y_j, x_i) log[p(y_j | x_i)] = - \sum_{ij} p(y_j, x_i) log\frac{p(y_j, x_i)}{p(x_i)} $.&lt;/p&gt;
&lt;p&gt;Let's look at conditional entropy in the above example of the tetrahedron. $H(Y | X)$ is easy, if we know $X$, the outcome of the toss, then we don't need any additional bits to communicate whether it's even or odd, so $H(Y | X) = 0$ bits. $H(X | Y)$ is also easy, if we know whether the side landed on an even or odd number, in each case we have 2 options left (if it landed odd, it can be 1 or 3, if it landed even, it can be 2 or 4), all have the same $p = \frac{1}{2}$, so it's like a coin toss, so $H(X | Y) = 1$ bits.&lt;/p&gt;
&lt;p&gt;If $X$ and $Y$ are independent, $ H(Y | X) = H(Y) $, since knowing $X$ doesn't tell us anything about $Y$.&lt;/p&gt;
&lt;p&gt;Finally, for all random variables: $ H(Y | X) = H(X, Y) - H(X)$ (chain rule). We can check this again with the tetrahedron example: $ H(Y | X) = H(X, Y) - H(X) = 2 - 2 = 0 $ and $ H(X | Y) = H(X, Y) - H(Y) = 2 - 1 = 1 $.&lt;/p&gt;
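&lt;p&gt;The chain rule can be verified on the tetrahedron example in a few lines (a sketch; helper and variable names are mine):&lt;/p&gt;

```python
import math

def entropy(p):
    return -sum(pi * math.log2(pi) for pi in p if pi > 0)

# same joint distribution as the tetrahedron example: side and its parity
joint = {(1, 'odd'): 1/4, (2, 'even'): 1/4, (3, 'odd'): 1/4, (4, 'even'): 1/4}
p_x = {x: 1/4 for x in [1, 2, 3, 4]}
p_y = {'odd': 1/2, 'even': 1/2}

H_XY = entropy(joint.values())
H_X, H_Y = entropy(p_x.values()), entropy(p_y.values())

# chain rule: H(Y|X) = H(X,Y) - H(X) and H(X|Y) = H(X,Y) - H(Y)
print(H_XY - H_X)  # 0.0 bits: knowing the side determines the parity
print(H_XY - H_Y)  # 1.0 bit: knowing the parity leaves a coin toss
```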
&lt;h2&gt;Cross entropy&lt;/h2&gt;
&lt;p&gt;Suppose Alice and Bob are communicating again, but this time there is no out-of-band communication. Alice wants to encode the outcomes of a random variable $X$ and send them to Bob, with the least number of average bits per outcome. However, let's assume Alice has incomplete or incorrect information about $X$: she mistakenly thinks that the distribution is that of another random variable $Y$, and she constructs her encoding per $Y$. How many bits will Alice use on average?&lt;/p&gt;
&lt;p&gt;Let's look at the tetrahedron again, but this time it's uneven; let the probabilities be $X \sim [\frac{4}{8}, \frac{2}{8}, \frac{1}{8}, \frac{1}{8}]$. It would make sense for Alice to encode it with a prefix code like [0, 10, 110, 111], to save bits on the most likely outcomes.&lt;/p&gt;
&lt;p&gt;However, if she mistakenly thinks that the probabilities are $Y \sim [\frac{1}{8}, \frac{1}{8}, \frac{2}{8}, \frac{4}{8}]$, she will choose the encoding [110, 111, 10, 0]. She will send 3 bits in the most common case and 1 bit in the least common case.&lt;/p&gt;
&lt;p&gt;Let $p_i$ be the probabilities for $X$, $q_i$ be the probabilities for $Y$, then the cross entropy is:&lt;/p&gt;
&lt;p&gt;$ H(p, q) = - \sum_i p_i log[q_i] $.&lt;/p&gt;
&lt;p&gt;Note that this is not symmetric, ie. in general $ H(p, q) = - \sum_i p_i log[q_i] \neq H(q, p) = - \sum_i q_i log[p_i] $. Obviously, if $p=q$, then $H(p, q) = H(p) = H(q)$, ie. if Alice has the right distribution, she can construct an encoding that is optimal and approaches the entropy.&lt;/p&gt;
&lt;p&gt;The definition is very intuitive: as mentioned for regular entropy, if an outcome has probability $p_i$, it's best to encode it with a message of length $-log[p_i]$ bits. For example, if $p_i=\frac{1}{2}$, then $-log[p_i]=1$ bit, and in that case we should encode that outcome as a 0 (or 1). This outcome occurs a fraction $p_i$ of the time, so it contributes $-p_i log[p_i]$ bits on average. However, if Alice believes the probabilities are $q_i$, she will pick an encoding where this outcome is encoded as $-log[q_i]$ bits, but in reality it occurs a fraction $p_i$ of the time, so it contributes $-p_i log[q_i]$ bits on average.&lt;/p&gt;
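&lt;p&gt;To make this concrete, a small sketch computing the cross entropy for the tetrahedron example above (the &lt;code&gt;cross_entropy&lt;/code&gt; helper is mine):&lt;/p&gt;

```python
import math

def cross_entropy(p, q):
    # H(p, q) = -sum_i p_i * log2(q_i): average bits Alice uses when she
    # builds her code for q, but outcomes actually arrive per p
    return -sum(pi * math.log2(qi) for pi, qi in zip(p, q))

p = [4/8, 2/8, 1/8, 1/8]  # true distribution
q = [1/8, 1/8, 2/8, 4/8]  # Alice's mistaken belief

print(cross_entropy(p, p))  # 1.75 bits, which is just the entropy H(p)
print(cross_entropy(p, q))  # 2.625 bits
```

&lt;p&gt;Note that $H(p, q) = 2.625$ bits is exactly the average length of Alice's mistaken code [110, 111, 10, 0] under the true distribution: $\frac{4}{8} \cdot 3 + \frac{2}{8} \cdot 3 + \frac{1}{8} \cdot 2 + \frac{1}{8} \cdot 1 = \frac{21}{8}$.&lt;/p&gt;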
&lt;h2&gt;Relative entropy (Kullback–Leibler divergence)&lt;/h2&gt;
&lt;p&gt;In the above example, Alice thought that events arrive per $Y \sim q$, but in reality the distribution is $X \sim p$. She uses $ H(p, q) $ bits on average to communicate her outcomes, instead of using the $ H(p) $ bits she'd use if she knew the correct distribution. The difference between the two is called relative entropy, also known as Kullback–Leibler divergence:&lt;/p&gt;
&lt;p&gt;$ D_{KL}(p | q) = H(p, q) - H(p) $.&lt;/p&gt;
&lt;p&gt;Writing out the two definitions for $H(p, q)$ and $H(p)$, we get:&lt;/p&gt;
&lt;p&gt;$ D_{KL}(p | q) = H(p, q) - H(p) = - \sum_i p_i log \frac{q_i}{p_i} $.&lt;/p&gt;
&lt;p&gt;So, relative entropy $ D_{KL}(p | q) $ is the extra bits that Alice wastes. Like cross entropy, relative entropy is also not symmetric.&lt;/p&gt;
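&lt;p&gt;A small sketch computing the relative entropy for the tetrahedron example (the &lt;code&gt;kl_divergence&lt;/code&gt; helper is mine):&lt;/p&gt;

```python
import math

def kl_divergence(p, q):
    # D_KL(p | q) = sum_i p_i * log2(p_i / q_i), the bits Alice wastes
    return sum(pi * math.log2(pi / qi) for pi, qi in zip(p, q) if pi > 0)

p = [4/8, 2/8, 1/8, 1/8]  # true distribution
q = [1/8, 1/8, 2/8, 4/8]  # Alice's mistaken belief

print(kl_divergence(p, q))  # 0.875 = H(p, q) - H(p) = 2.625 - 1.75
```

&lt;p&gt;In this particular example $D_{KL}(q | p)$ happens to also be 0.875 bits, because $q$ is just $p$ reversed; in general the two directions differ.&lt;/p&gt;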
&lt;h2&gt;Mutual information (Information gain)&lt;/h2&gt;
&lt;p&gt;Mutual information is a measure of the mutual dependence between the two variables. More specifically, it quantifies the "amount of information" obtained about one random variable by observing the other random variable. For two random variables $X$ and $Y$, the mutual information is the relative entropy between their joint distribution and the product of their individual distributions:&lt;/p&gt;
&lt;p&gt;$ I(X, Y) = - \sum p(x, y) log \frac{p(x)p(y)}{p(x, y)} $&lt;/p&gt;
&lt;p&gt;If you compare this to the relative entropy formula above, it's the same with $p = p(x, y)$ and $q = p(x)p(y)$. It follows trivially from the definition that mutual information is symmetric, $I(X, Y) = I(Y, X)$. What if $X$ and $Y$ are independent? In that case, $ I(X, Y) = 0 $ because $ p(x, y) = p(x) p(y) $ and $ log[1] = 0 $. If $X$ and $Y$ completely determine one another, eg. $ X = Y + 1 $, then one contains all the information about the other, so $ I(X, Y) = H(X) = H(Y) $. This is because in such a case, certain $p(x, y)$ combinations will be non-zero (eg. when $ x = y + 1 $), and for these non-zero cases $ p(x,y) = p(x) = p(y) $, so $ I(X, Y) = - \sum p(x) log[p(x)] = H(X) $.&lt;/p&gt;
&lt;p&gt;The following image explains the relationship between entropy, conditional entropy, joint entropy and mutual information. &lt;/p&gt;
&lt;p&gt;&lt;img src="/images/inf_gain.png" alt="Relationship between entropy, conditional entropy, joint entropy and mutual information" style="width: 400px;"/&gt;&lt;/p&gt;
&lt;p&gt;So, for example $ H(X, Y) = H(X|Y) + I(X,Y) + H(Y|X) $.&lt;/p&gt;
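&lt;p&gt;We can verify these relationships on the tetrahedron example (a sketch; the joint distribution is the one from the joint entropy section, names are mine):&lt;/p&gt;

```python
import math

def entropy(p):
    return -sum(pi * math.log2(pi) for pi in p if pi > 0)

# tetrahedron example: X is the side, Y is its parity on the same toss
joint = {(1, 'odd'): 1/4, (2, 'even'): 1/4, (3, 'odd'): 1/4, (4, 'even'): 1/4}
p_x = {x: 1/4 for x in [1, 2, 3, 4]}
p_y = {'odd': 1/2, 'even': 1/2}

# I(X, Y) = sum p(x,y) * log2[ p(x,y) / (p(x) p(y)) ]
I = sum(pxy * math.log2(pxy / (p_x[x] * p_y[y]))
        for (x, y), pxy in joint.items())

print(I)  # 1.0 = H(Y): the side tells us everything about the parity
# consistent with the diagram: H(X,Y) = H(X|Y) + I(X,Y) + H(Y|X) = 1 + 1 + 0
```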
&lt;p&gt;Unlike the quantities above, mutual information does not have as direct an interpretation in the Alice-and-Bob coding picture used in this post.&lt;/p&gt;
&lt;h2&gt;Conclusion&lt;/h2&gt;
&lt;p&gt;Everything I cover here is introductory information theory, mostly found in the first chapter of the classic &lt;a href="https://www.amazon.com/Elements-Information-Theory-Telecommunications-Processing/dp/0471241954"&gt;Cover, Thomas: Elements of Information Theory&lt;/a&gt; or the wikipedia pages linked above. I plan to write a follow-up post to give examples of using these metrics in Data Science and Machine Learning.&lt;/p&gt;</content><category term="entropy"></category><category term="cross-entropy"></category><category term="joint-entropy"></category><category term="conditional-entropy"></category><category term="relative-entropy"></category><category term="kullback–leibler-diverence"></category></entry><entry><title>What's the entropy of a fair coin toss?</title><link href="https://bytepawn.com/what-is-the-entropy-of-a-fair-coin-toss.html" rel="alternate"></link><published>2021-09-25T00:00:00+02:00</published><updated>2021-09-25T00:00:00+02:00</updated><author><name>Marton Trencseni</name></author><id>tag:bytepawn.com,2021-09-25:/what-is-the-entropy-of-a-fair-coin-toss.html</id><summary type="html">&lt;p&gt;What's the entropy of a fair coin toss? What if the coin almost always returns Heads? My recruiter reports that very few candidates can answer these entropy related DS screening questions. &lt;br/&gt;&lt;br/&gt;&lt;img src="/images/entropy3.png" alt="Entropy" style="width: 300px;"/&gt;&lt;/p&gt;</summary><content type="html">&lt;h2&gt;Introduction&lt;/h2&gt;
&lt;p&gt;I use a set of 20 &lt;em&gt;screening questions&lt;/em&gt; for Data Science interviews. The questions are asked by my (non-technical) recruiter on the phone, so I wrote them so they have objective answers, like yes/no/0/1/42.&lt;/p&gt;
&lt;p&gt;One of the questions goes like this:&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Q:&lt;/strong&gt; What is the entropy of a fair coin toss?&lt;br/&gt;
&lt;strong&gt;A:&lt;/strong&gt; 1 bit&lt;/p&gt;
&lt;p&gt;Then there is a follow-up:&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Q:&lt;/strong&gt; What is the entropy of a coin that is almost always heads?&lt;br/&gt;
&lt;strong&gt;A:&lt;/strong&gt; 0 bits (or approximately 0 bits)&lt;/p&gt;
&lt;p&gt;My recruiter reports that from a sample size of ~50 candidates, very few can answer these entropy related questions. So, why did I think these are good and relevant screening questions?&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;Entropy is a common and useful concept in Data Science. Our current &lt;code&gt;datascience-main&lt;/code&gt; git repo has 7 occurrences of the string &lt;code&gt;entropy&lt;/code&gt; in our own code (not library code).&lt;/li&gt;
&lt;li&gt;I learned about entropy in multiple classes during my Comp.Sci. degree (20 years ago).&lt;/li&gt;
&lt;li&gt;I learned about entropy in multiple classes during my Physics degree (20 years ago).&lt;/li&gt;
&lt;li&gt;I find entropy to be one of the most intellectually pleasing concepts, partially because it occurs across a wide array of disciplines.&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;So I'm writing this post to remind and excite ourselves about entropy.&lt;/p&gt;
&lt;h2&gt;What is entropy?&lt;/h2&gt;
&lt;p&gt;Per the Wikipedia page on &lt;a href="https://en.wikipedia.org/wiki/Entropy"&gt;Entropy&lt;/a&gt;:&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;Entropy is a &lt;strong&gt;scientific concept&lt;/strong&gt;, as well as a &lt;strong&gt;measurable physical property&lt;/strong&gt; that is most commonly associated with a &lt;strong&gt;state of disorder, randomness, or uncertainty&lt;/strong&gt;. The term and the concept are used in diverse fields, from classical thermodynamics, where it was first recognized, to the microscopic description of nature in statistical physics, and to the principles of information theory.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;If you then jump over to &lt;a href="https://en.wikipedia.org/wiki/Entropy_(information_theory)"&gt;Entropy (information theory)&lt;/a&gt;, it even has the answers:&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;.. and to communicate the outcome of a coin flip (2 possible values) will require an average of at most 1 bit (&lt;strong&gt;exactly 1 bit for a fair coin&lt;/strong&gt;).&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;and&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;.. a double-headed coin that never comes up tails, or a double-tailed coin that never results in a head. Then there is no uncertainty. &lt;strong&gt;The entropy is zero&lt;/strong&gt;: each toss of the coin delivers no new information as the outcome of each coin toss is always certain.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;I'm not sure if I knew this when I wrote the questions, but the opening Wikipedia page has the answers.&lt;/p&gt;
&lt;h2&gt;What is the entropy of a fair coin toss?&lt;/h2&gt;
&lt;p&gt;Let's look at two ways to get the answers:&lt;/p&gt;
&lt;p&gt;(1) Use the formula for information entropy (Shannon entropy), which is $ H(X) = - \sum_i P(X=i) log_2 P(X=i) $. For a coin where the probability of H is p, the shape of the entropy function is:&lt;/p&gt;
&lt;p&gt;&lt;img src="/images/entropy3.png" alt="Entropy" style="width: 400px;"/&gt;&lt;/p&gt;
&lt;p&gt;For a fair coin, we have two outcomes, both have $ P(X=H)=P(X=T)=\frac{1}{2} $, so $ H(X) = - [ \frac{1}{2} log_2\frac{1}{2} + \frac{1}{2} log_2\frac{1}{2} ] = 1 $. Since we're using base 2 logarithm, the result is in &lt;strong&gt;bits&lt;/strong&gt;. For the coin that is almost always H or almost always T, both terms are approximately zero, so the entropy is approximately 0.&lt;/p&gt;
&lt;p&gt;(2) Use intuition: the most intuitive way to think about entropy is this. Suppose you have to transmit messages using 0 and 1 bits. The number of message types is the number of outcomes of the random variable $X$. So, for a random coin toss, there are 2 messages, H and T. Then, there is a daemon who randomly generates these messages according to the probabilities of $X$. The question is, what is the average minimum number of bits per message we can use to transmit messages coming from $X$? The answer is just $H(X)$, the entropy. So for a fair coin toss, we can encode H as 0, T as 1, and since both occur with equal chance, we can't get more efficient than that, so we need 1 bit. And if the coin always lands on either H or T, there is no randomness, we don't need to transmit anything, the other side already knows.&lt;/p&gt;
&lt;p&gt;Let's strengthen our intuition by first deconstructing it and reconstructing it: the above 2 cases (1 bit and 0 bit entropy) are edge-cases, let's look at the "middle." What if H occurs $\frac{3}{4}$ of the time, and T occurs $\frac{1}{4}$ of the time? Per the formula above, $H(X) = 0.81 $. How can we go below 1 bit to transfer two possible outcomes? For the case of the "almost always H" coin, the answer was zero bits, but that was kind of cheating, because we don't need to send anything at all. How do we send 0.81 bits?&lt;/p&gt;
&lt;p&gt;The answer is, we can encode multiple outcomes: since H is more likely than T, it makes sense to use less bits to encode HH than TT or even HT. We can do the following:&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;HH -&amp;gt; P(HH) = 9/16 -&amp;gt; encode as 0   (length = 1 bit )
HT -&amp;gt; P(HT) = 3/16 -&amp;gt; encode as 10  (length = 2 bits)
TH -&amp;gt; P(TH) = 3/16 -&amp;gt; encode as 110 (length = 3 bits)
TT -&amp;gt; P(TT) = 1/16 -&amp;gt; encode as 111 (length = 3 bits)
&lt;/pre&gt;&lt;/div&gt;


&lt;p&gt;This code is a prefix code, which means that if $C_i$ is a codeword (like &lt;code&gt;0&lt;/code&gt; or &lt;code&gt;10&lt;/code&gt; in the above example), then $C_i$ never occurs as a prefix of another codeword $C_j$ (eg. above &lt;code&gt;0&lt;/code&gt; is a code, and none of the other codes start with &lt;code&gt;0&lt;/code&gt;, &lt;code&gt;10&lt;/code&gt; is also a code, and none of the others start with &lt;code&gt;10&lt;/code&gt;). So on the receiver end, if this code is used, it is always possible to unambiguously decode the series of 0s and 1s and get back the series of Hs and Ts. With the above code, we're using an average of $ 1\frac{9}{16} + 2\frac{3}{16} + 3\frac{3}{16} + 3\frac{1}{16}  = \frac{27}{16} = 1.69 $ bits to transfer 2 messages, so 0.84 bits per message. So on average we're below 1 bit, although this coding is still about 0.03 bits higher than the entropy itself. We can further approach the theoretical limit set by the entropy by encoding longer sequences (eg. &lt;code&gt;HHH&lt;/code&gt;, &lt;code&gt;HHHH&lt;/code&gt;, etc). Note that this is not &lt;a href="https://en.wikipedia.org/wiki/Huffman_coding"&gt;Huffman-coding&lt;/a&gt; on single tosses; Huffman-coding would just assign &lt;code&gt;0&lt;/code&gt; and &lt;code&gt;1&lt;/code&gt; to H and T.&lt;/p&gt;
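&lt;p&gt;A few lines of Python confirm these numbers (a sketch; names are mine):&lt;/p&gt;

```python
import math

# biased coin: P(H) = 3/4, P(T) = 1/4
p_H, p_T = 3/4, 1/4
H = -(p_H * math.log2(p_H) + p_T * math.log2(p_T))

# prefix code over pairs of tosses, as above
code = {'HH': '0', 'HT': '10', 'TH': '110', 'TT': '111'}
prob = {'HH': 9/16, 'HT': 3/16, 'TH': 3/16, 'TT': 1/16}

avg_bits_per_pair = sum(prob[m] * len(code[m]) for m in code)
print(H)                      # about 0.811 bits, the entropy
print(avg_bits_per_pair / 2)  # about 0.844 bits per toss, slightly above H
```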
&lt;h2&gt;Entropy in Data Science&lt;/h2&gt;
&lt;p&gt;I will mention two areas where entropy pops up in everyday Data Science work:&lt;/p&gt;
&lt;p&gt;(1) Splitting nodes during Decision Tree building: The first parameter for Scikit-learn's &lt;code&gt;DecisionTreeClassifier&lt;/code&gt; is:&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;criterion {gini, entropy}, default=gini
&amp;gt; The function to measure the quality of a split. Supported criteria are
&amp;gt; gini for the Gini impurity and entropy for the information gain.
&lt;/pre&gt;&lt;/div&gt;


&lt;p&gt;One of the ways to build Decision Trees is to look for the split whose 2 child nodes have, on average, the lowest entropy compared to the parent node. So, if for example the parent node has 10 data points labeled &lt;code&gt;HHHHHTTTTT&lt;/code&gt; and splitting along attribute A yields two nodes with labels &lt;code&gt;HHTTT&lt;/code&gt; and &lt;code&gt;HHHTT&lt;/code&gt;, while splitting along attribute B yields two nodes with labels &lt;code&gt;HTTTT&lt;/code&gt; and &lt;code&gt;THHHH&lt;/code&gt;, then the second one is better, because it has lower average entropy (and hence higher information gain). &lt;a href="https://towardsdatascience.com/entropy-how-decision-trees-make-decisions-2946b9c18c8"&gt;See this post on TowardsDataScience for a full example&lt;/a&gt;.&lt;/p&gt;
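&lt;p&gt;Here is a quick sketch of the entropy calculation for this example (the helper functions are mine, not Scikit-learn's):&lt;/p&gt;

```python
import math

def entropy(labels):
    # Shannon entropy (in bits) of the empirical label distribution
    n = len(labels)
    counts = [labels.count(l) for l in set(labels)]
    return -sum(c/n * math.log2(c/n) for c in counts)

def avg_child_entropy(children):
    # entropy of each child node, weighted by its share of the data points
    n = sum(len(c) for c in children)
    return sum(len(c)/n * entropy(c) for c in children)

parent = 'HHHHHTTTTT'
split_A = ['HHTTT', 'HHHTT']
split_B = ['HTTTT', 'THHHH']

print(entropy(parent))             # 1.0
print(avg_child_entropy(split_A))  # about 0.971, barely better than the parent
print(avg_child_entropy(split_B))  # about 0.722, lower entropy, higher gain
```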
&lt;p&gt;(2) Cross-entropy as a loss function when training Deep Neural Networks: From the Wikipedia page for &lt;a href="https://en.wikipedia.org/wiki/Cross_entropy"&gt;Cross entropy&lt;/a&gt;:&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;In information theory, the cross-entropy between two probability distributions &lt;strong&gt;p&lt;/strong&gt; and &lt;strong&gt;q&lt;/strong&gt; over the same underlying set of events measures the average number of bits needed to identify an event drawn from the set if a coding scheme used for the set is optimized for an estimated probability distribution &lt;strong&gt;q&lt;/strong&gt;, rather than the true distribution &lt;strong&gt;p&lt;/strong&gt;... Cross-entropy can be used to define a loss function in machine learning and optimization. The true probability &lt;strong&gt;p&lt;/strong&gt; is the true label, and the given distribution &lt;strong&gt;q&lt;/strong&gt; is the predicted value of the current model.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;Not to be confused with:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;&lt;a href="https://en.wikipedia.org/wiki/Joint_entropy"&gt;Joint entropy&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://en.wikipedia.org/wiki/Conditional_entropy"&gt;Conditional entropy&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://en.wikipedia.org/wiki/Kullback%E2%80%93Leibler_divergence"&gt;Kullback–Leibler divergence&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://en.wikipedia.org/wiki/Mutual_information"&gt;Mutual information&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;h2&gt;Entropy in Physics&lt;/h2&gt;
&lt;p&gt;Entropy was originally invented/discovered in Physics in the 1800s, before Shannon re-discovered it for Information Theory 100 years later. Today it pops up all over physics: Thermodynamics, Statistical mechanics, Quantum physics, even in General relativity for black holes. The most famous occurrence is the &lt;a href="https://en.wikipedia.org/wiki/Second_law_of_thermodynamics"&gt;Second law of thermodynamics&lt;/a&gt;:&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;The second law of thermodynamics establishes the concept of entropy as a physical property of a thermodynamic system. Entropy predicts the direction of spontaneous processes, and determines whether they are irreversible or impossible, despite obeying the requirement of conservation of energy, which is established in the first law of thermodynamics. The second law may be formulated by the observation that &lt;strong&gt;the entropy of isolated systems left to spontaneous evolution cannot decrease&lt;/strong&gt;, as they always arrive at a state of thermodynamic equilibrium, where &lt;strong&gt;the entropy is highest&lt;/strong&gt;. If all processes in the system are reversible, the entropy is constant. &lt;strong&gt;An increase in entropy accounts for the irreversibility of natural processes, often referred to in the concept of the arrow of time.&lt;/strong&gt;&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;The classic explanation goes like this: suppose we have a cylinder split in half with a thin divider. The left side has gas in it, the right side has no gas, it's a vacuum. If we remove the divider, the gas will quickly fill up the whole cylinder. The configuration with the gas constrained to the left side has lower entropy, since certain possibilities (eg. positions of molecules on the right side) are not possible; this is like the coin that is heads $\frac{3}{4}$ of the time. The final spread out gas has higher entropy; this is like the fair coin. On a microscopic level, gas molecules can be thought of as billiard balls constantly colliding with each other and the wall. Under certain conditions we can assume that the collisions are elastic, which means everything is reversible. This means that if we record a video of the gas, and then play it backwards, all the microscopic collisions that we see (happening backwards) are allowed by physics to happen [even if time is going forward]. So, if somebody shows us microscopic, zoomed in video footage of this gas, we can't tell if the tape is playing forward or backwards. However, if we are allowed to see the whole tape zoomed out (macroscopic), and we see that the gas is going from a low-entropy state to a high-entropy state, we know the tape is playing forward, while if we see the gas going from a high-entropy state to a low-entropy state, we know that the tape is playing backwards.&lt;/p&gt;
&lt;p&gt;A stylized but more every-day version goes like this: imagine all the air molecules (oxygen, nitrogen, etc) in the closed room you're sitting in. Why doesn't all the gas spontaneously collect in the bottom left corner of your room [which would lead you to suffocate]? The other direction does happen: if we somehow collect all the air in one corner, and then let it go, it will spread out, but the opposite never occurs. This is essentially what the second law is saying: the possible set of configurations of air molecules in just one corner of the room has lower entropy than if the gas is spread out all over the room. So, spreading out naturally occurs all the time, while the reverse never happens. You can actually compute this by dividing the room into grid cells, placing the air molecules into cells (and discretizing velocities similarly), counting configurations, and approximating the entropy ratio.&lt;/p&gt;
&lt;p&gt;Note: a lovely intellectual rabbit hole to go down: what about gravity? &lt;a href="https://physics.stackexchange.com/questions/244621/entropy-and-gravity"&gt;Gravity tends to pull things together, so does gravity decrease entropy?&lt;/a&gt;&lt;/p&gt;
&lt;p&gt;See also:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;&lt;a href="https://en.wikipedia.org/wiki/Arrow_of_time"&gt;Arrow of time&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://en.wikipedia.org/wiki/Entropy_(arrow_of_time)#Cosmology"&gt;Entropy in Cosmology&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://en.wikipedia.org/wiki/Black_hole_thermodynamics"&gt;Black hole thermodynamics&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;h2&gt;Conclusion&lt;/h2&gt;
&lt;p&gt;Hopefully this post is a good reminder that entropy is a useful concept, or enough to get people interested. A good short refresher written by physicist Edward Witten is &lt;a href="https://arxiv.org/abs/1805.11965"&gt;A Mini-Introduction To Information Theory&lt;/a&gt;. If you need a textbook refresher on entropy and information theory, the standard reference is: &lt;a href="https://www.amazon.com/Elements-Information-Theory-Telecommunications-Processing/dp/0471241954"&gt;Cover, Thomas: Elements of Information Theory&lt;/a&gt;. I plan to write a follow-up post about Joint entropy, Conditional entropy, Kullback–Leibler divergence and Mutual information.&lt;/p&gt;</content><category term="entropy"></category><category term="interviews"></category><category term="cross-entropy"></category><category term="physics"></category></entry><entry><title>Five ways to reduce variance in A/B testing</title><link href="https://bytepawn.com/five-ways-to-reduce-variance-in-ab-testing.html" rel="alternate"></link><published>2021-09-19T00:00:00+02:00</published><updated>2021-09-19T00:00:00+02:00</updated><author><name>Marton Trencseni</name></author><id>tag:bytepawn.com,2021-09-19:/five-ways-to-reduce-variance-in-ab-testing.html</id><summary type="html">&lt;p&gt;I use toy Monte Carlo simulations to demonstrate 5 ways to reduce variance in A/B testing: increase sample size, move towards a more even split, reduce variance in the metric definition, stratification and CUPED.&lt;br/&gt;&lt;br/&gt; &lt;img src="/images/cuped2.png" alt="Historic lift" style="width: 400px;"/&gt;&lt;/p&gt;</summary><content type="html">&lt;h2&gt;Introduction&lt;/h2&gt;
&lt;p&gt;When performing A/B testing, we're measuring the mean of a metric (such as spend or conversion) on two distinct subsets, and then compare the means to each other. Because of the Central Limit Theorem, the measurement of the two means follows a normal distribution, as does the difference of the two (if A and B are normal, so is A - B).&lt;/p&gt;
&lt;p&gt;Relevant Wikipedia articles:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;&lt;a href="https://en.wikipedia.org/wiki/Central_limit_theorem"&gt;Central Limit Theorem&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://en.wikipedia.org/wiki/Sum_of_normally_distributed_random_variables"&gt;Sum of normally distributed random variables&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;Relevant Bytepawn articles:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;&lt;a href="https://bytepawn.com/ab-testing-and-the-central-limit-theorem.html"&gt;A/B testing and the Central Limit Theorem&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://bytepawn.com/beyond-the-central-limit-theorem.html"&gt;Beyond the Central Limit Theorem&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;Per the above, the variance of the lift (defined here as the difference) is $ s^2 = s_A^2 + s_B^2 $, where $s_A^2$ and $s_B^2$ are the variances of the A and B measurements, respectively, and $ s_i^2 = \frac{ \sigma^2 }{ N_i } $. So $ s^2 \approx \frac{\sigma^2}{N} $, where $\sigma^2$ is the variance of the metric itself over the entire population.&lt;/p&gt;
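&lt;p&gt;This additivity is easy to sanity-check with a quick simulation (a sketch; the constants below are arbitrary, chosen just for illustration):&lt;/p&gt;

```python
import numpy as np

# check numerically that var(lift) = sigma^2/N_A + sigma^2/N_B
sigma, N_A, N_B, num_trials = 10.0, 1000, 1000, 20000
rng = np.random.default_rng(0)

# simulate many A/B measurements and collect the measured lifts
lifts = [rng.normal(0, sigma, N_B).mean() - rng.normal(0, sigma, N_A).mean()
         for _ in range(num_trials)]

print(np.var(lifts))                # close to the theoretical value below
print(sigma**2/N_A + sigma**2/N_B)  # 0.2
```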
&lt;p&gt;I will use toy Monte Carlo simulations to illustrate 5 ways to reduce variance in A/B testing:&lt;/p&gt;
&lt;ol&gt;
&lt;li&gt;increase sample size&lt;/li&gt;
&lt;li&gt;move towards an even split&lt;/li&gt;
&lt;li&gt;reduce variance in the metric definition&lt;/li&gt;
&lt;li&gt;stratification&lt;/li&gt;
&lt;li&gt;CUPED&lt;/li&gt;
&lt;/ol&gt;
&lt;p&gt;The &lt;a href="https://github.com/mtrencseni/playground/blob/master/Reducing%20variance.ipynb"&gt;ipython notebook for this post is up on Github&lt;/a&gt;.&lt;/p&gt;
&lt;h2&gt;Increase sample size&lt;/h2&gt;
&lt;p&gt;The simplest way to reduce variance is to collect more samples, if possible. Every 4x increase in $N$ halves $s$. In code:&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;get_AB_samples&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;mu&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;sigma&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;treatment_lift&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;N&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
    &lt;span class="n"&gt;A&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nb"&gt;list&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;normal&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;loc&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;mu&lt;/span&gt;                 &lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;scale&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;sigma&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;size&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;N&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;
    &lt;span class="n"&gt;B&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nb"&gt;list&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;normal&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;loc&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;mu&lt;/span&gt; &lt;span class="o"&gt;+&lt;/span&gt; &lt;span class="n"&gt;treatment_lift&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;scale&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;sigma&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;size&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;N&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;
    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;A&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;B&lt;/span&gt;

&lt;span class="n"&gt;N&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;1000&lt;/span&gt;
&lt;span class="n"&gt;N_multiplier&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;4&lt;/span&gt;
&lt;span class="n"&gt;mu&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;100&lt;/span&gt;
&lt;span class="n"&gt;sigma&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;10&lt;/span&gt;
&lt;span class="n"&gt;treatment_lift&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;2&lt;/span&gt;
&lt;span class="n"&gt;num_simulations&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;1000&lt;/span&gt;

&lt;span class="k"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;Simulating &lt;/span&gt;&lt;span class="si"&gt;%s&lt;/span&gt;&lt;span class="s1"&gt; A/B tests, true treatment lift is &lt;/span&gt;&lt;span class="si"&gt;%d&lt;/span&gt;&lt;span class="s1"&gt;...&amp;#39;&lt;/span&gt; &lt;span class="o"&gt;%&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;num_simulations&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;treatment_lift&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;

&lt;span class="n"&gt;n1_lifts&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;n4_lifts&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[],&lt;/span&gt; &lt;span class="p"&gt;[]&lt;/span&gt;
&lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;i&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="nb"&gt;range&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;num_simulations&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
    &lt;span class="k"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;&lt;/span&gt;&lt;span class="si"&gt;%d&lt;/span&gt;&lt;span class="s1"&gt;/&lt;/span&gt;&lt;span class="si"&gt;%d&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;&lt;/span&gt; &lt;span class="o"&gt;%&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;i&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;num_simulations&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt; &lt;span class="n"&gt;end&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;&lt;/span&gt;&lt;span class="se"&gt;\r&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="n"&gt;A&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;B&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;get_AB_samples&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;mu&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;sigma&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;treatment_lift&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;N&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="n"&gt;n1_lifts&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;append&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;lift&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;A&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;B&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;
    &lt;span class="n"&gt;A&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;B&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;get_AB_samples&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;mu&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;sigma&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;treatment_lift&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;N_multiplier&lt;/span&gt;&lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="n"&gt;N&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="n"&gt;n4_lifts&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;append&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;lift&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;A&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;B&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;

&lt;span class="k"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;N samples  A/B testing, mean lift = &lt;/span&gt;&lt;span class="si"&gt;%.2f&lt;/span&gt;&lt;span class="s1"&gt;, variance of lift = &lt;/span&gt;&lt;span class="si"&gt;%.2f&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;&lt;/span&gt; &lt;span class="o"&gt;%&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;mean&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;n1_lifts&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt; &lt;span class="n"&gt;cov&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;n1_lifts&lt;/span&gt;&lt;span class="p"&gt;)))&lt;/span&gt;
&lt;span class="k"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;4N samples A/B testing, mean lift = &lt;/span&gt;&lt;span class="si"&gt;%.2f&lt;/span&gt;&lt;span class="s1"&gt;, variance of lift = &lt;/span&gt;&lt;span class="si"&gt;%.2f&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;&lt;/span&gt; &lt;span class="o"&gt;%&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;mean&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;n4_lifts&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt; &lt;span class="n"&gt;cov&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;n4_lifts&lt;/span&gt;&lt;span class="p"&gt;)))&lt;/span&gt;
&lt;span class="k"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;Raio of lift variance = &lt;/span&gt;&lt;span class="si"&gt;%.2f&lt;/span&gt;&lt;span class="s1"&gt; (expected = &lt;/span&gt;&lt;span class="si"&gt;%.2f&lt;/span&gt;&lt;span class="s1"&gt;)&amp;#39;&lt;/span&gt; &lt;span class="o"&gt;%&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;cov&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;n4_lifts&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;&lt;span class="o"&gt;/&lt;/span&gt;&lt;span class="n"&gt;cov&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;n1_lifts&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="o"&gt;/&lt;/span&gt;&lt;span class="n"&gt;N_multiplier&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;

&lt;span class="n"&gt;bins&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;linspace&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="o"&gt;-&lt;/span&gt;&lt;span class="mi"&gt;2&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;6&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;100&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="n"&gt;plt&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;figure&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;figsize&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;14&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;7&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;
&lt;span class="n"&gt;plt&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;hist&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;n1_lifts&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;bins&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;alpha&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mf"&gt;0.5&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;label&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;N samples&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="n"&gt;plt&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;hist&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;n4_lifts&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;bins&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;alpha&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mf"&gt;0.5&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;label&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;f&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;{N_multiplier}N samples&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="n"&gt;plt&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;xlabel&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;lift&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="n"&gt;plt&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;ylabel&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;count&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="n"&gt;plt&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;legend&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;loc&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;upper right&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="n"&gt;plt&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;title&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;lift histogram&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="n"&gt;plt&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;show&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
&lt;/pre&gt;&lt;/div&gt;


&lt;p&gt;Prints:&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;Simulating 1000 A/B tests, true treatment lift is 2...
N samples  A/B testing, mean lift = 2.00, variance of lift = 0.18
4N samples A/B testing, mean lift = 2.01, variance of lift = 0.05
Ratio of lift variance = 0.26 (expected = 0.25)
&lt;/pre&gt;&lt;/div&gt;


&lt;p&gt;&lt;img src="/images/5ways_1.png" alt="5 ways to reduce variance in A/B testing" style="width: 800px;"/&gt;&lt;/p&gt;
&lt;p&gt;Note: in real life, collecting more samples is not always possible, since the number of available samples may be limited, or there may be a deadline by which we want to conclude the experiment. Also, in many situations, the relationship between the length of time $t$ that the experiment runs and the sample size $N$ is sub-linear, because over time users tend to return. In other words, if we run an experiment for 2 weeks and get $2N$ users, running it for another 2 weeks will probably yield fewer than $2N$ additional users.&lt;/p&gt;
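The sub-linear relationship between runtime and unique-user count can be illustrated with a toy model of returning users (the pool size and daily activity rate below are made-up numbers, not from the post):

```python
import numpy as np

# Toy model: a fixed pool of users, each showing up on any given day
# with probability p_daily. Cumulative unique users grows sub-linearly.
rng = np.random.default_rng(0)
pool, p_daily, days = 100_000, 0.05, 28
seen = np.zeros(pool, dtype=bool)
uniques = []
for _ in range(days):
    seen |= rng.random(pool) < p_daily  # mark today's visitors as seen
    uniques.append(int(seen.sum()))

first_2_weeks = uniques[13]
second_2_weeks = uniques[27] - uniques[13]
print(f'first 2 weeks: {first_2_weeks} uniques, next 2 weeks add: {second_2_weeks}')
```

In this model, doubling the runtime from 2 to 4 weeks adds noticeably fewer unique users than the first 2 weeks did, because many of the later visitors have already been counted.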
&lt;h2&gt;Move towards an even split&lt;/h2&gt;
&lt;p&gt;If we have $N$ samples available for A/B testing, one question is what percentage to put into A (control) and what percentage into B (treatment). There are almost always risk-management concerns, but it's worth knowing that for the best lift measurement (lowest variance), the optimal split is the even 50%-50% split, and a 20%-80% split is better than a 10%-90% one. We can see this with simple math. Suppose our $N$ total samples are split into $pN$ and $(1-p)N$ for the A and B variants; then the variance is:&lt;/p&gt;
&lt;p&gt;$ s^2 = \frac{ \sigma^2 }{ pN } + \frac{ \sigma^2 }{ (1-p)N } = \frac{ \sigma^2 }{ N } ( \frac{1}{p} + \frac{1}{1-p} ) $.&lt;/p&gt;
&lt;p&gt;So, $ s^2(p) = C ( \frac{1}{p} + \frac{1}{1-p} ) $ with $ C = \frac{ \sigma^2 }{ N } $, which has a minimum at $p=\frac{1}{2}$:&lt;/p&gt;
&lt;p&gt;&lt;img src="/images/5ways_plot.png" alt="5 ways to reduce variance in A/B testing" style="width: 400px;"/&gt;&lt;/p&gt;
&lt;p&gt;The further the split is from even, the higher the variance compared to the optimal case:&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;get_AB_samples&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;mu&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;sigma&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;treatment_lift&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;N_A&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;N_B&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
    &lt;span class="n"&gt;A&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nb"&gt;list&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;normal&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;loc&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;mu&lt;/span&gt;                 &lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;scale&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;sigma&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;size&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;N_A&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;
    &lt;span class="n"&gt;B&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nb"&gt;list&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;normal&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;loc&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;mu&lt;/span&gt; &lt;span class="o"&gt;+&lt;/span&gt; &lt;span class="n"&gt;treatment_lift&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;scale&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;sigma&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;size&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;N_B&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;
    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;A&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;B&lt;/span&gt;

&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;expected_ratio&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;p&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;q&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
    &lt;span class="n"&gt;top&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="o"&gt;/&lt;/span&gt;&lt;span class="n"&gt;p&lt;/span&gt; &lt;span class="o"&gt;+&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="o"&gt;/&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="o"&gt;-&lt;/span&gt;&lt;span class="n"&gt;p&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="n"&gt;bot&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="o"&gt;/&lt;/span&gt;&lt;span class="n"&gt;q&lt;/span&gt; &lt;span class="o"&gt;+&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="o"&gt;/&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="o"&gt;-&lt;/span&gt;&lt;span class="n"&gt;q&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;top&lt;/span&gt;&lt;span class="o"&gt;/&lt;/span&gt;&lt;span class="n"&gt;bot&lt;/span&gt;

&lt;span class="n"&gt;N&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;4000&lt;/span&gt;
&lt;span class="n"&gt;mu&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;100&lt;/span&gt;
&lt;span class="n"&gt;sigma&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;10&lt;/span&gt;
&lt;span class="n"&gt;treatment_lift&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;2&lt;/span&gt;
&lt;span class="n"&gt;even_ratio&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mf"&gt;0.5&lt;/span&gt;
&lt;span class="n"&gt;uneven_ratio&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mf"&gt;0.9&lt;/span&gt;
&lt;span class="n"&gt;num_simulations&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;1000&lt;/span&gt;

&lt;span class="k"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;Simulating &lt;/span&gt;&lt;span class="si"&gt;%s&lt;/span&gt;&lt;span class="s1"&gt; A/B tests, true treatment lift is &lt;/span&gt;&lt;span class="si"&gt;%d&lt;/span&gt;&lt;span class="s1"&gt;...&amp;#39;&lt;/span&gt; &lt;span class="o"&gt;%&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;num_simulations&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;treatment_lift&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;

&lt;span class="n"&gt;even_lifts&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;uneven_lifts&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[],&lt;/span&gt; &lt;span class="p"&gt;[]&lt;/span&gt;
&lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;i&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="nb"&gt;range&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;num_simulations&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
    &lt;span class="k"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;&lt;/span&gt;&lt;span class="si"&gt;%d&lt;/span&gt;&lt;span class="s1"&gt;/&lt;/span&gt;&lt;span class="si"&gt;%d&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;&lt;/span&gt; &lt;span class="o"&gt;%&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;i&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;num_simulations&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt; &lt;span class="n"&gt;end&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;&lt;/span&gt;&lt;span class="se"&gt;\r&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="n"&gt;A&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;B&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;get_AB_samples&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;mu&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;sigma&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;treatment_lift&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nb"&gt;int&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;N&lt;/span&gt;&lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="n"&gt;even_ratio&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt; &lt;span class="nb"&gt;int&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;N&lt;/span&gt;&lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="o"&gt;-&lt;/span&gt;&lt;span class="n"&gt;even_ratio&lt;/span&gt;&lt;span class="p"&gt;)))&lt;/span&gt;
    &lt;span class="n"&gt;even_lifts&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;append&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;lift&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;A&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;B&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;
    &lt;span class="n"&gt;A&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;B&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;get_AB_samples&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;mu&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;sigma&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;treatment_lift&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nb"&gt;int&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;N&lt;/span&gt;&lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="n"&gt;uneven_ratio&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt; &lt;span class="nb"&gt;int&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;N&lt;/span&gt;&lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="o"&gt;-&lt;/span&gt;&lt;span class="n"&gt;uneven_ratio&lt;/span&gt;&lt;span class="p"&gt;)))&lt;/span&gt;
    &lt;span class="n"&gt;uneven_lifts&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;append&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;lift&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;A&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;B&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;

&lt;span class="k"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;Even split   A/B testing, mean lift = &lt;/span&gt;&lt;span class="si"&gt;%.2f&lt;/span&gt;&lt;span class="s1"&gt;, variance of lift = &lt;/span&gt;&lt;span class="si"&gt;%.2f&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;&lt;/span&gt; &lt;span class="o"&gt;%&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;mean&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;even_lifts&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt;   &lt;span class="n"&gt;cov&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;even_lifts&lt;/span&gt;&lt;span class="p"&gt;)))&lt;/span&gt;
&lt;span class="k"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;Uneven split A/B testing, mean lift = &lt;/span&gt;&lt;span class="si"&gt;%.2f&lt;/span&gt;&lt;span class="s1"&gt;, variance of lift = &lt;/span&gt;&lt;span class="si"&gt;%.2f&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;&lt;/span&gt; &lt;span class="o"&gt;%&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;mean&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;uneven_lifts&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt; &lt;span class="n"&gt;cov&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;uneven_lifts&lt;/span&gt;&lt;span class="p"&gt;)))&lt;/span&gt;
&lt;span class="k"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;Raio of lift variance = &lt;/span&gt;&lt;span class="si"&gt;%.2f&lt;/span&gt;&lt;span class="s1"&gt; (expected = &lt;/span&gt;&lt;span class="si"&gt;%.2f&lt;/span&gt;&lt;span class="s1"&gt;)&amp;#39;&lt;/span&gt; &lt;span class="o"&gt;%&lt;/span&gt; &lt;span class="p"&gt;((&lt;/span&gt;&lt;span class="n"&gt;cov&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;even_lifts&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;&lt;span class="o"&gt;/&lt;/span&gt;&lt;span class="n"&gt;cov&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;uneven_lifts&lt;/span&gt;&lt;span class="p"&gt;)),&lt;/span&gt; &lt;span class="n"&gt;expected_ratio&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;even_ratio&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;uneven_ratio&lt;/span&gt;&lt;span class="p"&gt;)))&lt;/span&gt;
&lt;/pre&gt;&lt;/div&gt;


&lt;p&gt;Prints:&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;Simulating 1000 A/B tests, true treatment lift is 2...
Even split   A/B testing, mean lift = 1.98, variance of lift = 0.10
Uneven split A/B testing, mean lift = 1.98, variance of lift = 0.29
Ratio of lift variance = 0.36 (expected = 0.36)
&lt;/pre&gt;&lt;/div&gt;


&lt;p&gt;&lt;img src="/images/5ways_2.png" alt="5 ways to reduce variance in A/B testing" style="width: 800px;"/&gt;&lt;/p&gt;
&lt;h2&gt;Reduce variance in the metric definition&lt;/h2&gt;
&lt;p&gt;In the equation $ s^2 = \frac{ \sigma^2 }{ N } $, the previous two methods were increasing the sample size $N$ to reduce $s^2$. Another way to reduce $s^2$ is to reduce the variance $ \sigma^2 $ of the underlying metric itself. I will mention two approaches here:&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;1. &lt;a href="https://en.wikipedia.org/wiki/Winsorizing"&gt;Winsorizing&lt;/a&gt;, i.e. cutting or normalizing outliers.&lt;/strong&gt; For example, if the underlying metric is spend per head, there may be outliers, such as high-value B2B customers among the overall population of B2C customers, or bots appearing to spend a lot of time on a site compared to real users. If we cut off these values at some $\sigma$ (technically, we can either discard them or set them to the highest acceptable value), we will get a better measurement of our "real" population:&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;get_AB_samples&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;scale&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;treatment_lift&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;N&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
    &lt;span class="n"&gt;A&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nb"&gt;list&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;exponential&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;scale&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;scale&lt;/span&gt;                 &lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;size&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;N&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;
    &lt;span class="n"&gt;B&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nb"&gt;list&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;exponential&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;scale&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;scale&lt;/span&gt; &lt;span class="o"&gt;+&lt;/span&gt; &lt;span class="n"&gt;treatment_lift&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;size&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;N&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;
    &lt;span class="c1"&gt;# add outliers&lt;/span&gt;
    &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;i&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="nb"&gt;range&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nb"&gt;int&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;N&lt;/span&gt;&lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="mf"&gt;0.001&lt;/span&gt;&lt;span class="p"&gt;)):&lt;/span&gt;
        &lt;span class="n"&gt;A&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;append&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;random&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;&lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="n"&gt;scale&lt;/span&gt;&lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="mi"&gt;100&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
        &lt;span class="n"&gt;B&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;append&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;random&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;&lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="n"&gt;scale&lt;/span&gt;&lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="mi"&gt;100&lt;/span&gt; &lt;span class="o"&gt;+&lt;/span&gt; &lt;span class="n"&gt;treatment_lift&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;A&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;B&lt;/span&gt;

&lt;span class="n"&gt;N&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;10&lt;/span&gt;&lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="mi"&gt;1000&lt;/span&gt;
&lt;span class="n"&gt;scale&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;100&lt;/span&gt;
&lt;span class="n"&gt;treatment_lift&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;2&lt;/span&gt;
&lt;span class="n"&gt;num_simulations&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;1000&lt;/span&gt;

&lt;span class="n"&gt;orig_lifts&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;trunc_lifts&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[],&lt;/span&gt; &lt;span class="p"&gt;[]&lt;/span&gt;
&lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;i&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="nb"&gt;range&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;num_simulations&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
    &lt;span class="k"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;&lt;/span&gt;&lt;span class="si"&gt;%d&lt;/span&gt;&lt;span class="s1"&gt;/&lt;/span&gt;&lt;span class="si"&gt;%d&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;&lt;/span&gt; &lt;span class="o"&gt;%&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;i&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;num_simulations&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt; &lt;span class="n"&gt;end&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;&lt;/span&gt;&lt;span class="se"&gt;\r&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="n"&gt;A&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;B&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;get_AB_samples&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;scale&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;treatment_lift&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;N&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="n"&gt;orig_lifts&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;append&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;lift&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;A&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;B&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;
    &lt;span class="n"&gt;A&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;B&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;x&lt;/span&gt; &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;x&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;A&lt;/span&gt; &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;x&lt;/span&gt; &lt;span class="o"&gt;&amp;lt;&lt;/span&gt; &lt;span class="mi"&gt;5&lt;/span&gt;&lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="n"&gt;scale&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;x&lt;/span&gt; &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;x&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;B&lt;/span&gt; &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;x&lt;/span&gt; &lt;span class="o"&gt;&amp;lt;&lt;/span&gt; &lt;span class="mi"&gt;5&lt;/span&gt;&lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="n"&gt;scale&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;
    &lt;span class="n"&gt;trunc_lifts&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;append&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;lift&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;A&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;B&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;
&lt;/pre&gt;&lt;/div&gt;


&lt;p&gt;&lt;img src="/images/5ways_3.png" alt="5 ways to reduce variance in A/B testing" style="width: 800px;"/&gt;&lt;/p&gt;
&lt;p&gt;Another method is to measure the median instead of the mean. Note that with these approaches we are changing the definition of the metric itself. So, for example, Finance's "spend per head" definition may no longer match the winsorized or median-based definition of the experimentation framework!&lt;/p&gt;
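&lt;p&gt;As a minimal sketch of winsorization (illustrative only, not part of the original simulation code; it uses SciPy's winsorize): instead of dropping outliers, we clip the top 1% of values down to the 99th-percentile value before taking the mean.&lt;/p&gt;

```python
# Illustrative winsorization sketch (not from the original post):
# clip the top 1% of values instead of removing them.
import numpy as np
from scipy.stats.mstats import winsorize

np.random.seed(0)
samples = np.random.exponential(scale=100, size=10_000)
clipped = winsorize(samples, limits=[0, 0.01])  # no lower clipping, top 1% clipped

print('raw mean:       ', np.mean(samples))
print('winsorized mean:', np.mean(clipped))
print('median:         ', np.median(samples))
```

&lt;p&gt;The winsorized mean is less sensitive to a handful of extreme values, at the cost of redefining the metric, as noted above.&lt;/p&gt;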
&lt;p&gt;&lt;strong&gt;2. Measuring a metric further down the funnel.&lt;/strong&gt; Suppose we are measuring a metric such as spend per head. We can split this metric into two parts:&lt;/p&gt;
&lt;p&gt;(spend per head, overall) = (ratio of users who spend more than zero, conversion) x (spend per head of those who spend more than zero, conditional)&lt;/p&gt;
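&lt;p&gt;As a quick sanity check of this identity, here is a toy example (not from the original post) with 10,000 users, of whom 1,000 spend $50 each:&lt;/p&gt;

```python
# Toy check: overall mean = conversion rate x conditional mean.
spends = [0.0] * 9_000 + [50.0] * 1_000   # 10,000 users, 1,000 spenders at $50

overall = sum(spends) / len(spends)            # spend per head, overall
spenders = [s for s in spends if s > 0]
conversion = len(spenders) / len(spends)       # ratio of users who spend
conditional = sum(spenders) / len(spenders)    # spend per head among spenders

print(overall, conversion * conditional)  # both are 5.0
```

&lt;p&gt;Because only ~10% of users spend, the zeros dominate the variance of the overall metric; splitting it into the two terms is what the simulation measures.&lt;/p&gt;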
&lt;p&gt;In real life, the conversion ratio is usually low, less than 10%. In such cases, measuring the two terms (the conversion and the conditional) separately yields lower variance:&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;get_AB_samples&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;p&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;p_lift&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;mu&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;sigma&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;treatment_lift&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;N_A&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;N_B&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
    &lt;span class="n"&gt;A&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nb"&gt;list&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;normal&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;loc&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;mu&lt;/span&gt;                 &lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;scale&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;sigma&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;size&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;N_A&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;
    &lt;span class="n"&gt;B&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nb"&gt;list&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;normal&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;loc&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;mu&lt;/span&gt; &lt;span class="o"&gt;+&lt;/span&gt; &lt;span class="n"&gt;treatment_lift&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;scale&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;sigma&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;size&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;N_B&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;
    &lt;span class="n"&gt;A&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;x&lt;/span&gt; &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;random&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt; &lt;span class="o"&gt;&amp;lt;&lt;/span&gt; &lt;span class="n"&gt;p&lt;/span&gt;        &lt;span class="k"&gt;else&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt; &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;x&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;A&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;
    &lt;span class="n"&gt;B&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;x&lt;/span&gt; &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;random&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt; &lt;span class="o"&gt;&amp;lt;&lt;/span&gt; &lt;span class="n"&gt;p&lt;/span&gt;&lt;span class="o"&gt;+&lt;/span&gt;&lt;span class="n"&gt;p_lift&lt;/span&gt; &lt;span class="k"&gt;else&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt; &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;x&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;B&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;
    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;A&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;B&lt;/span&gt;

&lt;span class="n"&gt;N&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;10&lt;/span&gt;&lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="mi"&gt;1000&lt;/span&gt;
&lt;span class="n"&gt;p&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mf"&gt;0.1&lt;/span&gt;
&lt;span class="n"&gt;p_lift&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mf"&gt;0.01&lt;/span&gt;
&lt;span class="n"&gt;mu&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;100&lt;/span&gt;
&lt;span class="n"&gt;sigma&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;25&lt;/span&gt;
&lt;span class="n"&gt;treatment_lift&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;2&lt;/span&gt;
&lt;span class="n"&gt;num_simulations&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;1000&lt;/span&gt;

&lt;span class="k"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;Simulating &lt;/span&gt;&lt;span class="si"&gt;%s&lt;/span&gt;&lt;span class="s1"&gt; A/B tests, true treatment lift is &lt;/span&gt;&lt;span class="si"&gt;%d&lt;/span&gt;&lt;span class="s1"&gt;...&amp;#39;&lt;/span&gt; &lt;span class="o"&gt;%&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;num_simulations&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;treatment_lift&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;

&lt;span class="n"&gt;cont_lifts&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;cond_lifts&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;conv_lifts&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[],&lt;/span&gt; &lt;span class="p"&gt;[],&lt;/span&gt; &lt;span class="p"&gt;[]&lt;/span&gt;
&lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;i&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="nb"&gt;range&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;num_simulations&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
    &lt;span class="k"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;&lt;/span&gt;&lt;span class="si"&gt;%d&lt;/span&gt;&lt;span class="s1"&gt;/&lt;/span&gt;&lt;span class="si"&gt;%d&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;&lt;/span&gt; &lt;span class="o"&gt;%&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;i&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;num_simulations&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt; &lt;span class="n"&gt;end&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;&lt;/span&gt;&lt;span class="se"&gt;\r&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="n"&gt;A&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;B&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;get_AB_samples&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;p&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;p_lift&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;mu&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;sigma&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;treatment_lift&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nb"&gt;int&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;N&lt;/span&gt;&lt;span class="o"&gt;/&lt;/span&gt;&lt;span class="mi"&gt;2&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt; &lt;span class="nb"&gt;int&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;N&lt;/span&gt;&lt;span class="o"&gt;/&lt;/span&gt;&lt;span class="mi"&gt;2&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;
    &lt;span class="n"&gt;cont_lifts&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;append&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;lift&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;A&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;B&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;&lt;span class="o"&gt;/&lt;/span&gt;&lt;span class="n"&gt;p&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="n"&gt;A_&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;B_&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;x&lt;/span&gt; &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;x&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;A&lt;/span&gt; &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;x&lt;/span&gt; &lt;span class="o"&gt;&amp;gt;&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;x&lt;/span&gt; &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;x&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;B&lt;/span&gt; &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;x&lt;/span&gt; &lt;span class="o"&gt;&amp;gt;&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;
    &lt;span class="n"&gt;cond_lifts&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;append&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;lift&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;A_&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;B_&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;
    &lt;span class="n"&gt;A_&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;B_&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt; &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;x&lt;/span&gt; &lt;span class="o"&gt;&amp;gt;&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt; &lt;span class="k"&gt;else&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt; &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;x&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;A&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt; &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;x&lt;/span&gt; &lt;span class="o"&gt;&amp;gt;&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt; &lt;span class="k"&gt;else&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt; &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;x&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;B&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;
    &lt;span class="n"&gt;conv_lifts&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;append&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;lift&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;A_&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;B_&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;

&lt;span class="n"&gt;bins&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;linspace&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="o"&gt;-&lt;/span&gt;&lt;span class="mi"&gt;8&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;12&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;100&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="n"&gt;plt&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;figure&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;figsize&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;14&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;7&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;
&lt;span class="n"&gt;plt&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;hist&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;cont_lifts&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;bins&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;alpha&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mf"&gt;0.5&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;label&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;overall&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="n"&gt;plt&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;hist&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;cond_lifts&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;bins&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;alpha&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mf"&gt;0.5&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;label&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;conditional&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="n"&gt;plt&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;xlabel&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;lift&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="n"&gt;plt&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;ylabel&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;count&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="n"&gt;plt&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;legend&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;loc&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;upper right&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="n"&gt;plt&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;title&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;lift histogram&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="n"&gt;plt&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;show&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;

&lt;span class="n"&gt;bins&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;linspace&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="o"&gt;-&lt;/span&gt;&lt;span class="mf"&gt;0.1&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mf"&gt;0.1&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;100&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="n"&gt;plt&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;figure&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;figsize&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;14&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;7&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;
&lt;span class="n"&gt;plt&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;hist&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;conv_lifts&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;bins&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;alpha&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mf"&gt;0.5&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;label&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;conversion&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="n"&gt;plt&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;xlabel&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;lift&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="n"&gt;plt&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;ylabel&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;count&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="n"&gt;plt&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;legend&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;loc&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;upper right&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="n"&gt;plt&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;title&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;lift histogram&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="n"&gt;plt&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;show&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
&lt;/pre&gt;&lt;/div&gt;


&lt;p&gt;&lt;img src="/images/5ways_4a.png" alt="5 ways to reduce variance in A/B testing" style="width: 800px;"/&gt;&lt;/p&gt;
&lt;p&gt;&lt;img src="/images/5ways_4b.png" alt="5 ways to reduce variance in A/B testing" style="width: 800px;"/&gt;&lt;/p&gt;
&lt;h2&gt;Stratification&lt;/h2&gt;
&lt;p&gt;&lt;a href="https://en.wikipedia.org/wiki/Stratified_sampling"&gt;Stratification&lt;/a&gt; lowers variance by making sure that each sub-population is sampled according to its share of the overall population. As a toy example, assume there are two sub-populations, M and F, with different distributions. For simplicity, let's assume the population is exactly 50% M and 50% F. In an A/B test, if we randomly sample the overall population and assign users to A and B, we will get &lt;em&gt;approximately&lt;/em&gt; 50% Ms and 50% Fs in both A and B, but not exactly. This small difference adds to our measurement variance. However, if we make sure that both A and B have &lt;strong&gt;exactly&lt;/strong&gt; 50% Ms and 50% Fs, the variance of the measurement is decreased:&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;get_AB_samples_str&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
    &lt;span class="n"&gt;pop1_mu&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;pop1_sigma&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;pop2_mu&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;pop2_sigma&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;pop1_ratio&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;treatment_lift&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;N&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
&lt;span class="p"&gt;):&lt;/span&gt;
    &lt;span class="n"&gt;A&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt;  &lt;span class="nb"&gt;list&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;normal&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;loc&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;pop1_mu&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;scale&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;pop1_sigma&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;size&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="nb"&gt;int&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;N&lt;/span&gt;&lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="n"&gt;pop1_ratio&lt;/span&gt;&lt;span class="p"&gt;)))&lt;/span&gt;
    &lt;span class="n"&gt;A&lt;/span&gt; &lt;span class="o"&gt;+=&lt;/span&gt; &lt;span class="nb"&gt;list&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;normal&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;loc&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;pop2_mu&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;scale&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;pop2_sigma&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;size&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="nb"&gt;int&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;N&lt;/span&gt;&lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="o"&gt;-&lt;/span&gt;&lt;span class="n"&gt;pop1_ratio&lt;/span&gt;&lt;span class="p"&gt;))))&lt;/span&gt;
    &lt;span class="n"&gt;B&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt;  &lt;span class="nb"&gt;list&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;normal&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;loc&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;pop1_mu&lt;/span&gt; &lt;span class="o"&gt;+&lt;/span&gt; &lt;span class="n"&gt;treatment_lift&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;scale&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;pop1_sigma&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;size&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="nb"&gt;int&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;N&lt;/span&gt;&lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="n"&gt;pop1_ratio&lt;/span&gt;&lt;span class="p"&gt;)))&lt;/span&gt;
    &lt;span class="n"&gt;B&lt;/span&gt; &lt;span class="o"&gt;+=&lt;/span&gt; &lt;span class="nb"&gt;list&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;normal&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;loc&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;pop2_mu&lt;/span&gt; &lt;span class="o"&gt;+&lt;/span&gt; &lt;span class="n"&gt;treatment_lift&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;scale&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;pop2_sigma&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;size&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="nb"&gt;int&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;N&lt;/span&gt;&lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="o"&gt;-&lt;/span&gt;&lt;span class="n"&gt;pop1_ratio&lt;/span&gt;&lt;span class="p"&gt;))))&lt;/span&gt;
    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;A&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;B&lt;/span&gt;

&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;get_AB_samples_rnd&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
    &lt;span class="n"&gt;pop1_mu&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;pop1_sigma&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;pop2_mu&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;pop2_sigma&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;pop1_ratio&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;treatment_lift&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;N&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
&lt;span class="p"&gt;):&lt;/span&gt;
    &lt;span class="n"&gt;A&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[]&lt;/span&gt;
    &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;_&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="nb"&gt;range&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;N&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
        &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;random&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt; &lt;span class="o"&gt;&amp;lt;&lt;/span&gt; &lt;span class="n"&gt;pop1_ratio&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
            &lt;span class="n"&gt;A&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;append&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;normal&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;loc&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;pop1_mu&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;scale&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;pop1_sigma&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;
        &lt;span class="k"&gt;else&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
            &lt;span class="n"&gt;A&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;append&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;normal&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;loc&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;pop2_mu&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;scale&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;pop2_sigma&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;
    &lt;span class="n"&gt;B&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[]&lt;/span&gt;
    &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;_&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="nb"&gt;range&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;N&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
        &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;random&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt; &lt;span class="o"&gt;&amp;lt;&lt;/span&gt; &lt;span class="n"&gt;pop1_ratio&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
            &lt;span class="n"&gt;B&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;append&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;normal&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;loc&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;pop1_mu&lt;/span&gt; &lt;span class="o"&gt;+&lt;/span&gt; &lt;span class="n"&gt;treatment_lift&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;scale&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;pop1_sigma&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;
        &lt;span class="k"&gt;else&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
            &lt;span class="n"&gt;B&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;append&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;normal&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;loc&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;pop2_mu&lt;/span&gt; &lt;span class="o"&gt;+&lt;/span&gt; &lt;span class="n"&gt;treatment_lift&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;scale&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;pop2_sigma&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;
    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;A&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;B&lt;/span&gt;


&lt;span class="n"&gt;N&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;4000&lt;/span&gt;
&lt;span class="n"&gt;pop1_ratio&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mf"&gt;0.5&lt;/span&gt;
&lt;span class="n"&gt;pop1_mu&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;100&lt;/span&gt;
&lt;span class="n"&gt;pop2_mu&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;200&lt;/span&gt;
&lt;span class="n"&gt;pop1_sigma&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;pop2_sigma&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;10&lt;/span&gt;
&lt;span class="n"&gt;treatment_lift&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;2&lt;/span&gt;
&lt;span class="n"&gt;num_simulations&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;1000&lt;/span&gt;

&lt;span class="k"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;Simulating &lt;/span&gt;&lt;span class="si"&gt;%s&lt;/span&gt;&lt;span class="s1"&gt; A/B tests, true treatment lift is &lt;/span&gt;&lt;span class="si"&gt;%d&lt;/span&gt;&lt;span class="s1"&gt;...&amp;#39;&lt;/span&gt; &lt;span class="o"&gt;%&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;num_simulations&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;treatment_lift&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;

&lt;span class="n"&gt;str_lifts&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;rnd_lifts&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[],&lt;/span&gt; &lt;span class="p"&gt;[]&lt;/span&gt;
&lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;i&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="nb"&gt;range&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;num_simulations&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
    &lt;span class="k"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;&lt;/span&gt;&lt;span class="si"&gt;%d&lt;/span&gt;&lt;span class="s1"&gt;/&lt;/span&gt;&lt;span class="si"&gt;%d&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;&lt;/span&gt; &lt;span class="o"&gt;%&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;i&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;num_simulations&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt; &lt;span class="n"&gt;end&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;&lt;/span&gt;&lt;span class="se"&gt;\r&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="n"&gt;A&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;B&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;get_AB_samples_str&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;pop1_mu&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;pop1_sigma&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;pop2_mu&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;pop2_sigma&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;pop1_ratio&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;treatment_lift&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;N&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="n"&gt;str_lifts&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;append&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;lift&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;A&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;B&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;
    &lt;span class="n"&gt;A&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;B&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;get_AB_samples_rnd&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;pop1_mu&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;pop1_sigma&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;pop2_mu&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;pop2_sigma&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;pop1_ratio&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;treatment_lift&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;N&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="n"&gt;rnd_lifts&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;append&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;lift&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;A&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;B&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;

&lt;span class="k"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;Stratified sampling   A/B testing, mean lift = &lt;/span&gt;&lt;span class="si"&gt;%.2f&lt;/span&gt;&lt;span class="s1"&gt;, variance of lift = &lt;/span&gt;&lt;span class="si"&gt;%.2f&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;&lt;/span&gt; &lt;span class="o"&gt;%&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;mean&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;str_lifts&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt; &lt;span class="n"&gt;cov&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;str_lifts&lt;/span&gt;&lt;span class="p"&gt;)))&lt;/span&gt;
&lt;span class="k"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;Random sampling       A/B testing, mean lift = &lt;/span&gt;&lt;span class="si"&gt;%.2f&lt;/span&gt;&lt;span class="s1"&gt;, variance of lift = &lt;/span&gt;&lt;span class="si"&gt;%.2f&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;&lt;/span&gt; &lt;span class="o"&gt;%&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;mean&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;rnd_lifts&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt; &lt;span class="n"&gt;cov&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;rnd_lifts&lt;/span&gt;&lt;span class="p"&gt;)))&lt;/span&gt;
&lt;span class="k"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;Raio of lift variance = &lt;/span&gt;&lt;span class="si"&gt;%.2f&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;&lt;/span&gt; &lt;span class="o"&gt;%&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;cov&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;str_lifts&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;&lt;span class="o"&gt;/&lt;/span&gt;&lt;span class="n"&gt;cov&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;rnd_lifts&lt;/span&gt;&lt;span class="p"&gt;)))&lt;/span&gt;
&lt;/pre&gt;&lt;/div&gt;


&lt;p&gt;Prints:&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;Simulating 1000 A/B tests, true treatment lift is 2...
Stratified sampling   A/B testing, mean lift = 2.01, variance of lift = 0.05
Random sampling       A/B testing, mean lift = 1.97, variance of lift = 1.28
Ratio of lift variance = 0.04
&lt;/pre&gt;&lt;/div&gt;


&lt;p&gt;&lt;img src="/images/5ways_5.png" alt="5 ways to reduce variance in A/B testing" style="width: 800px;"/&gt;&lt;/p&gt;
&lt;p&gt;If we cannot control the sampling of the population, we can also re-weight samples after data collection is over. Stratification and CUPED are closely related; see sections 3.1 and 3.3 of the &lt;a href="https://www.exp-platform.com/Documents/2013-02-CUPED-ImprovingSensitivityOfControlledExperiments.pdf"&gt;CUPED paper&lt;/a&gt;.&lt;/p&gt;
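&lt;p&gt;A minimal sketch of the re-weighting idea (the helper name &lt;code&gt;post_stratified_mean&lt;/code&gt; and the toy parameters are my own, not from the post): instead of using the realized sample shares, each stratum's mean is weighted by its known population share:&lt;/p&gt;

```python
import numpy as np

def post_stratified_mean(values, strata, population_shares):
    """Re-weight per-stratum sample means by their known population shares."""
    values, strata = np.asarray(values), np.asarray(strata)
    return sum(share * values[strata == s].mean()
               for s, share in population_shares.items())

# Toy data: two sub-populations with means ~100 and ~200, each 50% of the
# true population, but stratum 1 is over-sampled in this particular sample.
rng = np.random.default_rng(0)
strata = (rng.random(4000) < 0.6).astype(int)  # realized shares != 50/50
values = np.where(strata == 1, rng.normal(200, 10, 4000),
                  rng.normal(100, 10, 4000))
est = post_stratified_mean(values, strata, {0: 0.5, 1: 0.5})
naive = values.mean()  # biased towards 200, since stratum 1 is over-sampled
```

&lt;p&gt;The naive mean is pulled towards the over-sampled stratum, while the re-weighted estimate recovers the population mean of 150.&lt;/p&gt;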
&lt;h2&gt;CUPED&lt;/h2&gt;
&lt;p&gt;CUPED is a way to reduce variance in A/B testing if past historic values of the metric are correlated with the current values we measure in the experiment. In other words, CUPED works if, e.g., high spenders before the experiment also tend to be high spenders during the experiment, and the same for low spenders. I have written four articles on CUPED, so I will just link to those:&lt;/p&gt;
&lt;ol&gt;
&lt;li&gt;&lt;a href="https://bytepawn.com/reducing-variance-in-ab-testing-with-cuped.html"&gt;Reducing variance in A/B testing with CUPED&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://bytepawn.com/reducing-variance-in-conversion-ab-testing-with-cuped.htm"&gt;Reducing variance in conversion A/B testing with CUPED&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://bytepawn.com/aa-testing-and-false-positives-with-cuped.html"&gt;A/A testing and false positives with CUPED&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://bytepawn.com/correlations-seasonality-lift-and-cuped.html"&gt;Correlations, seasonality, lift and CUPED&lt;/a&gt;&lt;/li&gt;
&lt;/ol&gt;
&lt;h2&gt;Conclusion&lt;/h2&gt;
&lt;p&gt;The post used toy examples to show some of the fundamental methods and levers to reduce variance in experiments. It's worth noting that I used Monte Carlo simulations over many experiments (typically 1,000) to show the decrease in variance, by visualizing the spread of the lift histogram. The point is, in one specific experiment evaluation, the decrease is not visible, since one measurement does not have variance-of-lift by itself (it's just one specific lift we measure). The rewards of lower variance in A/B testing are reaped in the long term, by having more accurate measurements, making better decisions and releasing better products more quickly.&lt;/p&gt;</content><category term="ab-testing"></category><category term="variance"></category><category term="stratification"></category><category term="cuped"></category></entry><entry><title>Correlations, seasonality, lift and CUPED</title><link href="https://bytepawn.com/correlations-seasonality-lift-and-cuped.html" rel="alternate"></link><published>2021-09-05T00:00:00+02:00</published><updated>2021-09-05T00:00:00+02:00</updated><author><name>Marton Trencseni</name></author><id>tag:bytepawn.com,2021-09-05:/correlations-seasonality-lift-and-cuped.html</id><summary type="html">&lt;p&gt;In this final blog post about CUPED, I will address some questions about CUPED, such as, &lt;em&gt;is correlation between "before" and "after" the same as seasonality?&lt;/em&gt;&lt;br/&gt;&lt;br/&gt; &lt;img src="/images/aa_cuped_3.png" alt="." style="width: 400px;"/&gt;&lt;/p&gt;</summary><content type="html">&lt;h2&gt;Introduction&lt;/h2&gt;
&lt;p&gt;This is the fourth post about CUPED. The previous three were:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;&lt;a href="https://bytepawn.com/reducing-variance-in-ab-testing-with-cuped.html"&gt;Reducing variance in A/B testing with CUPED&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://bytepawn.com/reducing-variance-in-conversion-ab-testing-with-cuped.html"&gt;Reducing variance in conversion A/B testing with CUPED&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://bytepawn.com/aa-testing-and-false-positives-with-cuped.html"&gt;A/A testing and false positives with CUPED&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;In this final post, I will address some questions relating to CUPED:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;what if some units are missing "before" data?&lt;/li&gt;
&lt;li&gt;what if "before" and "after" is not correlated?&lt;/li&gt;
&lt;li&gt;what if "before" and "after" is correlated, but the treatment lift is 0?&lt;/li&gt;
&lt;li&gt;what if our measurement is not continuous (like USD spend per user), but binary conversion data?&lt;/li&gt;
&lt;li&gt;what if we run an A/B test, and evaluate using both traditional and CUPED methodology, and pick the "more favorable"?&lt;/li&gt;
&lt;li&gt;is correlation between "before" and "after" the same as seasonality?&lt;/li&gt;
&lt;li&gt;what are other ways to reduce variance in A/B testing?&lt;/li&gt;
&lt;/ul&gt;
&lt;h2&gt;What if some units are missing "before" data?&lt;/h2&gt;
&lt;p&gt;With CUPED, we can reduce the variance of our A/B test measurement, assuming we have "before" data, and it is correlated with the data collected during the experiment, which I referred to as "after" data in the last posts. &lt;strong&gt;But what if for some experimental units, the "before" data is missing?&lt;/strong&gt; For example, we can imagine that the experiment unit is users, and for some users, we are missing the "before" data. In this case, all we have to do is: &lt;strong&gt;use the un-adjusted "after" metric&lt;/strong&gt;. So, including this modification in the "CUPED recipe" from the &lt;a href="https://bytepawn.com/reducing-variance-in-ab-testing-with-cuped.html"&gt;first post&lt;/a&gt;:&lt;/p&gt;
&lt;p&gt;Let $Y_i$ be the $i$th user's spend in the "after" period, and $X_i$ be their spend in the "before" period, both for A and B combined. We compute an adjusted "after" spend $Y'_i$ with the following process:&lt;/p&gt;
&lt;ol&gt;
&lt;li&gt;Compute the covariance $cov(X, Y)$ of X and Y.&lt;/li&gt;
&lt;li&gt;Compute the variance $var(X)$ of X.&lt;/li&gt;
&lt;li&gt;Compute the mean $\mu_X$ of X.&lt;/li&gt;
&lt;li&gt;For the $i$th user, if $X_i$ is missing: $ Y'_i := Y_i $.&lt;/li&gt;
&lt;li&gt;For the $i$th user, if $X_i$ is not missing: $ Y'_i := Y_i - (X_i - \mu_X) \frac{cov(X, Y)}{var(X)} $.&lt;/li&gt;
&lt;li&gt;Evaluate the A/B test using $Y'$ instead of $Y$.&lt;/li&gt;
&lt;/ol&gt;
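&lt;p&gt;The recipe above can be sketched as follows (the function name &lt;code&gt;cuped_adjust&lt;/code&gt; and the toy data are my own; missing "before" values are represented as &lt;code&gt;None&lt;/code&gt;):&lt;/p&gt;

```python
import numpy as np

def cuped_adjust(before, after):
    """CUPED-adjust 'after' values following the recipe above; units whose
    'before' value is missing (None) keep their raw 'after' value."""
    observed = [(x, y) for x, y in zip(before, after) if x is not None]
    X = np.array([x for x, _ in observed])
    Y = np.array([y for _, y in observed])
    theta = np.cov(X, Y)[0, 1] / np.var(X, ddof=1)  # cov(X, Y) / var(X)
    mu_X = X.mean()
    return np.array([y if x is None else y - (x - mu_X) * theta
                     for x, y in zip(before, after)])

# Toy usage: correlated before/after spend, with 10% of "before" missing.
rng = np.random.default_rng(1)
after = rng.normal(100, 50, 1000)
before = [None if i % 10 == 0 else x + rng.normal(0, 20)
          for i, x in enumerate(after)]
adjusted = cuped_adjust(before, after)
```

&lt;p&gt;The variance reduction still applies overall, it is just smaller on the units that had to be left unadjusted.&lt;/p&gt;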
&lt;h2&gt;What if "before" and "after" are not correlated?&lt;/h2&gt;
&lt;p&gt;&lt;strong&gt;If "before" and "after" are not correlated, and we use CUPED, are we making a mistake? No.&lt;/strong&gt; This question was answered in the first &lt;a href="https://bytepawn.com/reducing-variance-in-ab-testing-with-cuped.html"&gt;first post&lt;/a&gt;. If there is no correlation, then in the above equation $ cov(X, Y) = 0 $, so $ Y_i' = Y_i $. In actual experiment instances $ cov(X, Y) $ will not be exactly 0, but will be a small number, so $ cov(X, Y) \approx 0 $, so $ Y_i' \approx Y_i $. Running Monte Carlo simulations of many experiments showed that over many experiments this is equivalent to traditional A/B testing.&lt;/p&gt;
&lt;p&gt;&lt;img src="/images/cuped6.png" alt="CUPED" style="width: 800px;"/&gt;&lt;/p&gt;
&lt;h2&gt;What if "before" and "after" is correlated, but the treatment lift is 0?&lt;/h2&gt;
&lt;p&gt;This was explored in the &lt;a href="https://bytepawn.com/aa-testing-and-false-positives-with-cuped.html"&gt;third post&lt;/a&gt;. If the treatment lift is 0, then in an A/B test the null hypothesis is actually true, and we're really running an A/A test. In this case, &lt;strong&gt;we still get the benefit of CUPED, we're just getting a more reliable measurement of the 0 lift&lt;/strong&gt;. CUPED doesn't care what the lift is, it reduces the variance of the lift measurement in all cases, if there is correlation. See the plot below of an A/A test, where the variance is reduced for 0 lift (the histogram is centered on 0):&lt;/p&gt;
&lt;p&gt;&lt;img src="/images/aa_cuped_1.png" alt="CUPED" style="width: 800px;"/&gt;&lt;/p&gt;
&lt;h2&gt;What if our measurement is not continuous, but binary conversion data?&lt;/h2&gt;
&lt;p&gt;This was explored in the &lt;a href="https://bytepawn.com/reducing-variance-in-conversion-ab-testing-with-cuped.html"&gt;second post&lt;/a&gt; for the binary 0/1 conversion case, and &lt;strong&gt;it is not a problem, CUPED can be applied&lt;/strong&gt;. But it is somewhat counter-intuitive, since the CUPED-adjusted values are no longer 0 and 1, but become continuous. Since the CUPED formula has 2 variables, before and after, and in the binary case each can take on the values 0 and 1, there are 4 possible CUPED-adjusted values. In one experiment realization from that post, these values were:&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;Possible mappings:
(before=1, after=0) -&amp;gt; adjusted=-0.722
(before=0, after=1) -&amp;gt; adjusted=1.081
(before=0, after=0) -&amp;gt; adjusted=0.081
(before=1, after=1) -&amp;gt; adjusted=0.278
&lt;/pre&gt;&lt;/div&gt;


&lt;p&gt;This illustrates that the binary data loses its binary nature with CUPED.&lt;/p&gt;
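&lt;p&gt;A small sketch of this effect, using my own toy parameters rather than the ones from that post: with binary before/after data, the CUPED-adjusted values collapse onto exactly four continuous values, one per (before, after) combination:&lt;/p&gt;

```python
import numpy as np

# Toy binary conversion data: converting "before" makes converting
# "after" more likely, so before and after are correlated.
rng = np.random.default_rng(0)
N = 10_000
before = (rng.random(N) < 0.3).astype(float)
after = np.where(before == 1.0,
                 rng.random(N) < 0.6,
                 rng.random(N) < 0.2).astype(float)

theta = np.cov(before, after)[0, 1] / np.var(before, ddof=1)
adjusted = after - (before - before.mean()) * theta

# Each (before, after) pair in {0,1} x {0,1} maps to one adjusted value,
# so exactly four distinct continuous values appear.
distinct = np.unique(np.round(adjusted, 9))
```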
&lt;h2&gt;What if we run an A/B test, and pick the "more favorable" outcome?&lt;/h2&gt;
&lt;p&gt;What if we run an A/B test, evaluate using both traditional (without using "before" data) and CUPED A/B testing, and pick the more favorable outcome, the one with higher lift / lower p-value? This was explored in the &lt;a href="https://bytepawn.com/aa-testing-and-false-positives-with-cuped.html"&gt;third post&lt;/a&gt;, and was shown to be a fallacy. We showed with Monte Carlo simulations that this behavior &lt;strong&gt;skews the lift distribution to the right and overestimates the lift&lt;/strong&gt;:&lt;/p&gt;
&lt;p&gt;&lt;img src="/images/aa_cuped_3.png" alt="CUPED" style="width: 800px;"/&gt;&lt;/p&gt;
&lt;p&gt;Note that this error is similar to other "p-hacking" fallacies:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;peeking at experiment results before the experiment finishes (and stopping early in case of favorable results)...&lt;/li&gt;
&lt;li&gt;evaluating multiple metrics (and using the most favorable one to report results)...&lt;/li&gt;
&lt;li&gt;checking multiple subsets like countries for favorable results...&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;... without controlling the p-value for multiple checks.&lt;/p&gt;
&lt;h2&gt;Is correlation between "before" and "after" the same as seasonality?&lt;/h2&gt;
&lt;p&gt;&lt;strong&gt;No, they are different things.&lt;/strong&gt; In the &lt;a href="https://bytepawn.com/reducing-variance-in-ab-testing-with-cuped.html"&gt;first post&lt;/a&gt;, I used this toy model to generate correlated "before" and "after" data:&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;def get_AB_samples(before_mean, before_sigma, eps_sigma, treatment_lift, N):
    A_before = list(normal(loc=before_mean, scale=before_sigma, size=N))
    B_before = list(normal(loc=before_mean, scale=before_sigma, size=N))
    A_after  = [x + normal(loc=0, scale=eps_sigma)                  for x in A_before]
    B_after  = [x + normal(loc=0, scale=eps_sigma) + treatment_lift for x in B_before]
    return A_before, B_before, A_after, B_after
&lt;/pre&gt;&lt;/div&gt;


&lt;p&gt;Notice that here, &lt;code&gt;A_before&lt;/code&gt; and &lt;code&gt;A_after&lt;/code&gt; both have the same mean, &lt;code&gt;before_mean&lt;/code&gt;, since &lt;code&gt;A_after&lt;/code&gt; is just &lt;code&gt;A_before&lt;/code&gt; with some (symmetric) noise added (the same is true for the Bs if &lt;code&gt;treatment_lift = 0&lt;/code&gt;). So here we have a correlated experiment, but the means are the same before and after, there is no seasonal change. Now look at this:&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;def get_AB_samples(before_mean, before_sigma, seasonal_lift, treatment_lift, N):
    A_before = list(normal(loc=before_mean, scale=before_sigma, size=N))
    B_before = list(normal(loc=before_mean, scale=before_sigma, size=N))
    A_after  = list(normal(loc=before_mean + seasonal_lift,                  scale=before_sigma, size=N))
    B_after  = list(normal(loc=before_mean + seasonal_lift + treatment_lift, scale=before_sigma, size=N))
    return A_before, B_before, A_after, B_after
&lt;/pre&gt;&lt;/div&gt;


&lt;p&gt;In this setup, &lt;code&gt;A_after&lt;/code&gt; has a &lt;code&gt;seasonal_lift&lt;/code&gt; compared to &lt;code&gt;A_before&lt;/code&gt; (and B additionally has an optional &lt;code&gt;treatment_lift&lt;/code&gt;), but "before" and "after" are uncorrelated, since the "after" values are fresh, independent draws.&lt;/p&gt;
&lt;p&gt;These examples show that seasonality and correlation are not the same thing in A/B testing. Seasonality is a shift in the mean between before and after; correlation is a per-unit relationship between the before and after values. All 8 combinations of &lt;strong&gt;{no correlation, correlation} x {no seasonal lift, seasonal lift} x {no treatment lift, treatment lift}&lt;/strong&gt; are possible and occur in real life. CUPED only helps in reducing variance if there is correlation.&lt;/p&gt;
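&lt;p&gt;A quick numerical check of the two generators above (parameter values are my own): the first produces strongly correlated before/after data with no mean shift, the second an uncorrelated seasonal shift:&lt;/p&gt;

```python
import numpy as np

rng = np.random.default_rng(0)
N, mu, sigma, eps_sigma, seasonal_lift = 10_000, 100, 50, 20, 30

# Generator 1: after = before + noise -> correlated, no mean shift.
before1 = rng.normal(mu, sigma, N)
after1 = before1 + rng.normal(0, eps_sigma, N)

# Generator 2: fresh draws with a shifted mean -> seasonal, uncorrelated.
before2 = rng.normal(mu, sigma, N)
after2 = rng.normal(mu + seasonal_lift, sigma, N)

r1 = np.corrcoef(before1, after1)[0, 1]  # strongly positive
r2 = np.corrcoef(before2, after2)[0, 1]  # near zero
```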
&lt;h2&gt;What are other ways to reduce the variance of the lift measurement?&lt;/h2&gt;
&lt;p&gt;In these Monte Carlo simulations, I used the difference in means for lift:&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;def lift(A, B):
    return mean(B) - mean(A)
&lt;/pre&gt;&lt;/div&gt;


&lt;p&gt;Because of the &lt;a href="https://en.wikipedia.org/wiki/Central_limit_theorem"&gt;Central Limit Theorem&lt;/a&gt;, the distribution of the error of the mean measurement is normal, and the difference of two normals is also normal. This is the bell curve we've been seeing on all the Monte Carlo histograms. The standard deviation of this normal is called the &lt;a href="https://en.wikipedia.org/wiki/Standard_error"&gt;standard error&lt;/a&gt;, and its formula is $ s = \frac{\sigma}{\sqrt{N}} $. The important part here is the $ \sqrt{N} $, which says that for every 4x increase in sample size, the standard error of the lift goes down 2x. So another way to reduce the variance of lift measurements is to collect more samples, if possible.&lt;/p&gt;
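&lt;p&gt;A quick Monte Carlo sketch of the $ \sqrt{N} $ scaling (the helper name and parameters are my own): quadrupling the per-group sample size roughly halves the standard error of the measured lift:&lt;/p&gt;

```python
import numpy as np

def lift_std(N, sigma=10, num_simulations=2000, seed=0):
    """Empirical standard error of lift = mean(B) - mean(A) over many
    simulated A/A tests with per-group sample size N."""
    rng = np.random.default_rng(seed)
    lifts = [rng.normal(0, sigma, N).mean() - rng.normal(0, sigma, N).mean()
             for _ in range(num_simulations)]
    return float(np.std(lifts))

s_1x, s_4x = lift_std(N=1000), lift_std(N=4000)
# Theory: std of the lift is sigma * sqrt(2 / N), so 4x N should halve it.
```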
&lt;p&gt;The other important method for reducing variance is &lt;a href="https://en.wikipedia.org/wiki/Stratified_sampling"&gt;stratification&lt;/a&gt;, where we identify sub-populations of our overall population and calibrate our sampling accordingly, or re-weight sub-population metrics after sampling.&lt;/p&gt;
&lt;h2&gt;Conclusion&lt;/h2&gt;
&lt;p&gt;This is the fourth and final post on CUPED. In the future I plan to write about stratification, another method to reduce variance in A/B testing.&lt;/p&gt;</content><category term="ab-testing"></category><category term="cuped"></category></entry><entry><title>A/A testing and false positives with CUPED</title><link href="https://bytepawn.com/aa-testing-and-false-positives-with-cuped.html" rel="alternate"></link><published>2021-08-15T00:00:00+02:00</published><updated>2021-08-15T00:00:00+02:00</updated><author><name>Marton Trencseni</name></author><id>tag:bytepawn.com,2021-08-15:/aa-testing-and-false-positives-with-cuped.html</id><summary type="html">&lt;p&gt;I use Monte Carlo simulations of A/A tests to demonstrate how Data Scientists can incorrectly skew lift and p-values if they pick-and-choose between reporting traditional and CUPED results after the experiment has concluded.&lt;br/&gt;&lt;br/&gt; &lt;img src="/images/aa_cuped_2.png" alt="Historic lift" style="width: 400px;"/&gt;&lt;/p&gt;</summary><content type="html">&lt;h2&gt;Introduction&lt;/h2&gt;
&lt;p&gt;In the previous posts, &lt;a href="https://bytepawn.com/reducing-variance-in-ab-testing-with-cuped.html"&gt;Reducing variance in A/B testing with CUPED&lt;/a&gt; and &lt;a href="https://bytepawn.com/reducing-variance-in-conversion-ab-testing-with-cuped.html"&gt;Reducing variance in conversion A/B testing with CUPED&lt;/a&gt;, I ran Monte Carlo simulations to get a feel for how CUPED works in continuous (like $ spend per customer) and binary (conversion) experiments. In both cases, as long as the historic "before" data and the experiment's "after" data are correlated, CUPED yields lower variance measurements compared to traditional A/B testing, where only "after" data is used. One of the interesting aspects of CUPED is that although it reduces variance on average, there is no such guarantee for each individual experiment outcome. We saw that for a given experiment outcome, the statistics computed from the transformed CUPED variables can be worse than those from traditional A/B testing using just the "after" data:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;the lift computed with CUPED can be lower (or higher), irrespective of what the true lift is&lt;/li&gt;
&lt;li&gt;the p-value computed with CUPED can be lower (or higher), irrespective of what the true lift is&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;In real life, Data Scientists are often under pressure to achieve positive results. This poses the potential danger of hacking the experiment: a Data Scientist computes outcomes with both CUPED and traditional A/B testing and reports the more favorable result, i.e. the one with higher lift and lower p-value. Here I will run Monte Carlo simulations to show (to myself and the reader) that such practice results in incorrect lift measurements and incorrect p-values.&lt;/p&gt;
&lt;h2&gt;TLDR&lt;/h2&gt;
&lt;p&gt;The TLDR of this post: as Data Scientists, we have to pick whether we use traditional A/B testing evaluation (using just "after" data) or CUPED &lt;strong&gt;before we evaluate the experiment data&lt;/strong&gt;, preferably before we even run the experiment. The best case is if there is an experimentation platform which makes this choice for us. As demonstrated below, evaluating both and reporting the more favorable result is not statistically sound.&lt;/p&gt;
&lt;p&gt;&lt;a href="https://github.com/mtrencseni/playground/blob/master/CUPED%20experiment%20hacking.ipynb"&gt;The ipython notebook is up on Github.&lt;/a&gt;&lt;/p&gt;
&lt;h2&gt;Correlated A/A tests&lt;/h2&gt;
&lt;p&gt;First, let's run an A/A test where "before" and "after" are correlated; A/A means there is no difference between A and B, i.e. the true lift is 0. I will use the same code as in previous posts, setting &lt;code&gt;treatment_lift=0&lt;/code&gt;:&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;span class="n"&gt;N&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;1000&lt;/span&gt;
&lt;span class="n"&gt;before_mean&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;100&lt;/span&gt; 
&lt;span class="n"&gt;before_sigma&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;50&lt;/span&gt;
&lt;span class="n"&gt;eps_sigma&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;20&lt;/span&gt;
&lt;span class="n"&gt;treatment_lift&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;
&lt;span class="n"&gt;num_simulations&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;1000&lt;/span&gt;
&lt;span class="o"&gt;...&lt;/span&gt;
&lt;/pre&gt;&lt;/div&gt;


&lt;p&gt;Prints:&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;Simulating 1000 A/B tests, true treatment lift is 0...
Traditional    A/B testing, mean lift = 0.00, variance of lift = 6.09
CUPED adjusted A/B testing, mean lift = 0.03, variance of lift = 0.78
CUPED lift variance / traditional lift variance = 0.13 (expected = 0.14)
&lt;/pre&gt;&lt;/div&gt;


&lt;p&gt;We see that both traditional and CUPED correctly estimate the true lift of 0, but CUPED has lower variance. Plotting the histograms, we see the familiar shape of a narrower CUPED, but this time centered on 0 lift:&lt;/p&gt;
&lt;p&gt;&lt;img src="/images/aa_cuped_1.png" alt="CUPED" style="width: 800px;"/&gt;&lt;/p&gt;
&lt;h2&gt;Lift hacking&lt;/h2&gt;
&lt;p&gt;Let's simulate our Data Scientist under pressure in code: at the end of each experiment, we pick and report the higher lift (between traditional and CUPED):&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;span class="n"&gt;N&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;1000&lt;/span&gt;
&lt;span class="n"&gt;before_mean&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;100&lt;/span&gt; 
&lt;span class="n"&gt;before_sigma&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;50&lt;/span&gt;
&lt;span class="n"&gt;eps_sigma&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;20&lt;/span&gt;
&lt;span class="n"&gt;treatment_lift&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;
&lt;span class="n"&gt;num_simulations&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;10&lt;/span&gt;&lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="mi"&gt;1000&lt;/span&gt;

&lt;span class="k"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;Simulating &lt;/span&gt;&lt;span class="si"&gt;%s&lt;/span&gt;&lt;span class="s1"&gt; A/B tests, true treatment lift is &lt;/span&gt;&lt;span class="si"&gt;%d&lt;/span&gt;&lt;span class="s1"&gt;...&amp;#39;&lt;/span&gt; &lt;span class="o"&gt;%&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;num_simulations&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;treatment_lift&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;

&lt;span class="n"&gt;traditional_lifts&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;adjusted_lifts&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;hacked_lifts&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[],&lt;/span&gt; &lt;span class="p"&gt;[],&lt;/span&gt; &lt;span class="p"&gt;[]&lt;/span&gt;
&lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;i&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="nb"&gt;range&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;num_simulations&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
    &lt;span class="k"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;&lt;/span&gt;&lt;span class="si"&gt;%d&lt;/span&gt;&lt;span class="s1"&gt;/&lt;/span&gt;&lt;span class="si"&gt;%d&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;&lt;/span&gt; &lt;span class="o"&gt;%&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;i&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;num_simulations&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt; &lt;span class="n"&gt;end&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;&lt;/span&gt;&lt;span class="se"&gt;\r&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="n"&gt;A_before&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;B_before&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;A_after&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;B_after&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;get_AB_samples&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;before_mean&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;before_sigma&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;eps_sigma&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;treatment_lift&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;N&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="n"&gt;A_after_adjusted&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;B_after_adjusted&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;get_cuped_adjusted&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;A_before&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;B_before&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;A_after&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;B_after&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="n"&gt;traditional_lifts&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;append&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;lift&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;A_after&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;B_after&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;
    &lt;span class="n"&gt;adjusted_lifts&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;append&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;lift&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;A_after_adjusted&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;B_after_adjusted&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;
    &lt;span class="n"&gt;hacked_lifts&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;append&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nb"&gt;max&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
        &lt;span class="n"&gt;lift&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;A_after&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;B_after&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt;
        &lt;span class="n"&gt;lift&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;A_after_adjusted&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;B_after_adjusted&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="p"&gt;))&lt;/span&gt;

&lt;span class="k"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;Traditional    A/B testing, mean lift = &lt;/span&gt;&lt;span class="si"&gt;%.2f&lt;/span&gt;&lt;span class="s1"&gt;, variance of lift = &lt;/span&gt;&lt;span class="si"&gt;%.2f&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;&lt;/span&gt; &lt;span class="o"&gt;%&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;mean&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;traditional_lifts&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt; &lt;span class="n"&gt;cov&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;traditional_lifts&lt;/span&gt;&lt;span class="p"&gt;)))&lt;/span&gt;
&lt;span class="k"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;CUPED adjusted A/B testing, mean lift = &lt;/span&gt;&lt;span class="si"&gt;%.2f&lt;/span&gt;&lt;span class="s1"&gt;, variance of lift = &lt;/span&gt;&lt;span class="si"&gt;%.2f&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;&lt;/span&gt; &lt;span class="o"&gt;%&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;mean&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;adjusted_lifts&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt; &lt;span class="n"&gt;cov&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;adjusted_lifts&lt;/span&gt;&lt;span class="p"&gt;)))&lt;/span&gt;
&lt;span class="k"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;Hacked         A/B testing, mean lift = &lt;/span&gt;&lt;span class="si"&gt;%.2f&lt;/span&gt;&lt;span class="s1"&gt;, variance of lift = &lt;/span&gt;&lt;span class="si"&gt;%.2f&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;&lt;/span&gt; &lt;span class="o"&gt;%&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;mean&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;hacked_lifts&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt; &lt;span class="n"&gt;cov&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;hacked_lifts&lt;/span&gt;&lt;span class="p"&gt;)))&lt;/span&gt;
&lt;/pre&gt;&lt;/div&gt;


&lt;p&gt;Prints:&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;Simulating 10000 A/B tests, true treatment lift is 0...
Traditional    A/B testing, mean lift = 0.00, variance of lift = 5.82
CUPED adjusted A/B testing, mean lift = 0.00, variance of lift = 0.80
Hacked         A/B testing, mean lift = 0.89, variance of lift = 2.50
&lt;/pre&gt;&lt;/div&gt;


&lt;p&gt;We can see that:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;traditional and CUPED correctly estimate a mean lift of 0&lt;/li&gt;
&lt;li&gt;the hacked result overestimates the lift&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;We can plot all three on a histogram:&lt;/p&gt;
&lt;p&gt;&lt;img src="/images/aa_cuped_2.png" alt="CUPED" style="width: 800px;"/&gt;&lt;/p&gt;
&lt;p&gt;It's a bit hard to see, so let's look at just the "hacked" histogram:&lt;/p&gt;
&lt;p&gt;&lt;img src="/images/aa_cuped_3.png" alt="CUPED" style="width: 800px;"/&gt;&lt;/p&gt;
&lt;p&gt;The histogram peaks at 0, but it is skewed towards positive values, so on average our imaginary Data Scientist overestimates the lift.&lt;/p&gt;
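&lt;p&gt;The positive skew is exactly what we'd expect from always reporting the larger of two correlated, zero-mean estimates. A minimal sketch, independent of the CUPED machinery; the correlation of 0.6 and the unit variances are illustrative assumptions:&lt;/p&gt;

```python
import numpy as np

rng = np.random.default_rng(0)
num_simulations = 100_000

# Two correlated, zero-mean lift estimates per experiment, standing in
# for the traditional and CUPED lifts of an A/A test. The correlation
# of 0.6 and the unit variances are illustrative assumptions.
cov = [[1.0, 0.6], [0.6, 1.0]]
lifts = rng.multivariate_normal([0.0, 0.0], cov, size=num_simulations)

hacked = lifts.max(axis=1)  # always report the larger of the two

print('mean of a single estimate: %.3f' % lifts[:, 0].mean())
print('mean of the max:           %.3f' % hacked.mean())
```

&lt;p&gt;For standard normal estimates with correlation $\rho$, the expected maximum is $\sqrt{(1-\rho)/\pi}$, strictly positive whenever the two estimates are not perfectly correlated. This is why the hacked mean lift is biased upwards even though each individual estimate is unbiased.&lt;/p&gt;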
&lt;h2&gt;False positive rate&lt;/h2&gt;
&lt;p&gt;We can now repeat the above logic, but also record the p-value: our imaginary Data Scientist picks traditional or CUPED adjusted, depending on which has the higher lift, and records that variant's p-value. Let's assume they use a critical p-value of &lt;code&gt;p_crit = 0.05&lt;/code&gt;: if the recorded p-value is below it, they reject the null hypothesis that A and B are the same and accept the alternative hypothesis that B is better. Since A and B are actually the same (&lt;code&gt;treatment_lift = 0&lt;/code&gt;, so these are A/A tests), every such rejection is a false positive.&lt;/p&gt;
&lt;p&gt;Let's remember what the critical p-value means: it is the probability that we incorrectly reject the null hypothesis and accept the alternative hypothesis, even though the null hypothesis is true. The null hypothesis here is that A and B are the same, i.e. an A/A test, which is exactly what we're simulating. So by setting &lt;code&gt;p_crit = 0.05&lt;/code&gt;, we are saying we accept a false positive rate (FPR) of 0.05. If our statistical methodology is sound, repeating the experiment many times (&lt;code&gt;num_simulations&lt;/code&gt;) should yield a false positive rate of about 0.05. Let's see what happens.&lt;/p&gt;
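&lt;p&gt;Before adding the hacking, it's worth sanity checking the baseline claim in isolation: a sound test procedure, run on A/A data, rejects at a rate of about &lt;code&gt;p_crit&lt;/code&gt;. A minimal self-contained sketch; the z-test &lt;code&gt;p_value()&lt;/code&gt; helper and the parameters here are illustrative stand-ins, not the notebook's own:&lt;/p&gt;

```python
import math, random

random.seed(1)

def p_value(A, B):
    # Two-sided two-sample z-test; a reasonable stand-in for a t-test
    # at these large sample sizes.
    mA, mB = sum(A) / len(A), sum(B) / len(B)
    vA = sum((x - mA)**2 for x in A) / (len(A) - 1)
    vB = sum((x - mB)**2 for x in B) / (len(B) - 1)
    z = (mB - mA) / math.sqrt(vA / len(A) + vB / len(B))
    return math.erfc(abs(z) / math.sqrt(2))

N, num_simulations, p_crit = 1000, 1000, 0.05
fps = 0
for _ in range(num_simulations):
    # A/A test: both samples come from the same distribution
    A = [random.gauss(100, 20) for _ in range(N)]
    B = [random.gauss(100, 20) for _ in range(N)]
    if p_crit > p_value(A, B):
        fps += 1

print('FPR = %.3f' % (fps / num_simulations))
```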
&lt;p&gt;In code:&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;span class="n"&gt;N&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;1000&lt;/span&gt;
&lt;span class="n"&gt;before_mean&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;100&lt;/span&gt; 
&lt;span class="n"&gt;before_sigma&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;50&lt;/span&gt;
&lt;span class="n"&gt;eps_sigma&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;20&lt;/span&gt;
&lt;span class="n"&gt;treatment_lift&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;
&lt;span class="n"&gt;num_simulations&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;10&lt;/span&gt;&lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="mi"&gt;1000&lt;/span&gt;
&lt;span class="n"&gt;p_crit&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mf"&gt;0.05&lt;/span&gt;
&lt;span class="n"&gt;traditional_fps&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;cuped_fps&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;hacked_fps&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;

&lt;span class="k"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;Simulating &lt;/span&gt;&lt;span class="si"&gt;%s&lt;/span&gt;&lt;span class="s1"&gt; A/B tests, true treatment lift is &lt;/span&gt;&lt;span class="si"&gt;%d&lt;/span&gt;&lt;span class="s1"&gt;...&amp;#39;&lt;/span&gt; &lt;span class="o"&gt;%&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;num_simulations&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;treatment_lift&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;

&lt;span class="n"&gt;traditional_lifts&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;adjusted_lifts&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[],&lt;/span&gt; &lt;span class="p"&gt;[]&lt;/span&gt;
&lt;span class="n"&gt;traditional_pvalues&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;adjusted_pvalues&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[],&lt;/span&gt; &lt;span class="p"&gt;[]&lt;/span&gt;
&lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;i&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="nb"&gt;range&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;num_simulations&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
    &lt;span class="k"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;&lt;/span&gt;&lt;span class="si"&gt;%d&lt;/span&gt;&lt;span class="s1"&gt;/&lt;/span&gt;&lt;span class="si"&gt;%d&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;&lt;/span&gt; &lt;span class="o"&gt;%&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;i&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;num_simulations&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt; &lt;span class="n"&gt;end&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;&lt;/span&gt;&lt;span class="se"&gt;\r&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="n"&gt;A_before&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;B_before&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;A_after&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;B_after&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;get_AB_samples&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;before_mean&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;before_sigma&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;eps_sigma&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;treatment_lift&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;N&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="n"&gt;A_after_adjusted&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;B_after_adjusted&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;get_cuped_adjusted&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;A_before&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;B_before&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;A_after&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;B_after&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="n"&gt;adjusted_pvalue&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;p_value&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;A_after_adjusted&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;B_after_adjusted&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;p_value&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;A_after&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;B_after&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;&amp;lt;&lt;/span&gt; &lt;span class="n"&gt;p_crit&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
        &lt;span class="n"&gt;traditional_fps&lt;/span&gt; &lt;span class="o"&gt;+=&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt;
    &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;p_value&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;A_after_adjusted&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;B_after_adjusted&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;&amp;lt;&lt;/span&gt; &lt;span class="n"&gt;p_crit&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
        &lt;span class="n"&gt;cuped_fps&lt;/span&gt; &lt;span class="o"&gt;+=&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt;
    &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;lift&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;A_after&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;B_after&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;&amp;lt;&lt;/span&gt; &lt;span class="n"&gt;lift&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;A_after_adjusted&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;B_after_adjusted&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
        &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;p_value&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;A_after&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;B_after&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;&amp;lt;&lt;/span&gt; &lt;span class="n"&gt;p_crit&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
            &lt;span class="n"&gt;hacked_fps&lt;/span&gt; &lt;span class="o"&gt;+=&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt;
    &lt;span class="k"&gt;else&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
        &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;p_value&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;A_after_adjusted&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;B_after_adjusted&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;&amp;lt;&lt;/span&gt; &lt;span class="n"&gt;p_crit&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
            &lt;span class="n"&gt;hacked_fps&lt;/span&gt; &lt;span class="o"&gt;+=&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt;

&lt;span class="k"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;False positive rate (expected: &lt;/span&gt;&lt;span class="si"&gt;%.3f&lt;/span&gt;&lt;span class="s1"&gt;):&amp;#39;&lt;/span&gt; &lt;span class="o"&gt;%&lt;/span&gt; &lt;span class="n"&gt;p_crit&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="k"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;Traditional: &lt;/span&gt;&lt;span class="si"&gt;%.3f&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;&lt;/span&gt; &lt;span class="o"&gt;%&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;traditional_fps&lt;/span&gt;&lt;span class="o"&gt;/&lt;/span&gt;&lt;span class="n"&gt;num_simulations&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;
&lt;span class="k"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;CUPED:       &lt;/span&gt;&lt;span class="si"&gt;%.3f&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;&lt;/span&gt; &lt;span class="o"&gt;%&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;cuped_fps&lt;/span&gt;&lt;span class="o"&gt;/&lt;/span&gt;&lt;span class="n"&gt;num_simulations&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;
&lt;span class="k"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;Hacked:      &lt;/span&gt;&lt;span class="si"&gt;%.3f&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;&lt;/span&gt; &lt;span class="o"&gt;%&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;hacked_fps&lt;/span&gt;&lt;span class="o"&gt;/&lt;/span&gt;&lt;span class="n"&gt;num_simulations&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;
&lt;/pre&gt;&lt;/div&gt;


&lt;p&gt;Prints:&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;Simulating 10000 A/B tests, true treatment lift is 0...
False positive rate (expected: 0.050):
Traditional: 0.049
CUPED:       0.047
Hacked:      0.073
&lt;/pre&gt;&lt;/div&gt;


&lt;p&gt;We see that both traditional and CUPED come in around the expected 0.05. However, when we hack the evaluation and pick and choose, the FPR is significantly higher. &lt;strong&gt;This demonstrates that picking and choosing is not statistically sound: we both overestimate the lift and inflate the false positive rate!&lt;/strong&gt;&lt;/p&gt;
&lt;h2&gt;No correlation&lt;/h2&gt;
&lt;p&gt;In the above case, we were running A/A tests on data where the "before" and "after" was correlated. What happens if we do the same when there's no correlation? Recall that the transformation equation for CUPED is:&lt;/p&gt;
&lt;p&gt;$ Y'_i = Y_i - (X_i - \mu_X) \frac{\mathrm{cov}(X, Y)}{\mathrm{var}(X)} $&lt;/p&gt;
&lt;p&gt;If there is no correlation, $\mathrm{cov}(X, Y) \approx 0$, so $Y'_i \approx Y_i$. I write $\approx$ for approximate equality, because even if there is no true correlation, the sample $\mathrm{cov}(X, Y)$ won't be exactly 0; it will be some small number, like 0.001. Still, the transformed $Y'_i$ values will be very close to the original $Y_i$ values, so the lifts (traditional vs. CUPED) will also be very close, as will the p-values. The pick and choose hacking will not "work" here, since the two choices are almost identical. We can see this if we run many experiments and, for each one, plot both the traditional and the CUPED computed lift. It's a tight fit to the $y=x$ line:&lt;/p&gt;
&lt;p&gt;&lt;img src="/images/aa_cuped_4.png" alt="CUPED" style="width: 800px;"/&gt;&lt;/p&gt;
&lt;p&gt;If we repeat the A/A tests above, but with uncorrelated data:&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;Simulating 10000 A/B tests, true treatment lift is 0...
False positive rate (expected: 0.050):
Traditional: 0.050
CUPED:       0.049
Hacked:      0.050
&lt;/pre&gt;&lt;/div&gt;


&lt;p&gt;We see that all three false positive rates land at the expected 0.05: with no correlation, there is nothing to exploit, so hacking doesn't work.&lt;/p&gt;
&lt;h2&gt;Conclusion&lt;/h2&gt;
&lt;p&gt;The Monte Carlo simulations show explicitly that we have to pick whether we use traditional A/B testing evaluation (using just "after" data) or CUPED &lt;strong&gt;before&lt;/strong&gt; we evaluate the experiment data, preferably before we even run the experiment. As demonstrated in this post, evaluating using both and picking the one with more favorable results is not statistically sound.&lt;/p&gt;</content><category term="ab-testing"></category><category term="cuped"></category></entry><entry><title>Reducing variance in conversion A/B testing with CUPED</title><link href="https://bytepawn.com/reducing-variance-in-conversion-ab-testing-with-cuped.html" rel="alternate"></link><published>2021-08-07T00:00:00+02:00</published><updated>2021-08-07T00:00:00+02:00</updated><author><name>Marton Trencseni</name></author><id>tag:bytepawn.com,2021-08-07:/reducing-variance-in-conversion-ab-testing-with-cuped.html</id><summary type="html">&lt;p&gt;I use Monte Carlo simulations of conversion A/B tests to demonstrate how CUPED reduces measurement variance in conversion experiments.&lt;br/&gt;&lt;br/&gt; &lt;img src="/images/cuped_conversion_2.png" alt="Historic lift" style="width: 400px;"/&gt;&lt;/p&gt;</summary><content type="html">&lt;h2&gt;Introduction&lt;/h2&gt;
&lt;p&gt;In the previous post, &lt;a href="https://bytepawn.com/reducing-variance-in-ab-testing-with-cuped.html"&gt;&lt;em&gt;Reducing variance in A/B testing with CUPED&lt;/em&gt;&lt;/a&gt;, I ran Monte Carlo simulations to demonstrate CUPED, a variance reduction technique in A/B testing on continuous data, like $ spend per customer. Here I will repeat the same Monte Carlo simulations, but with binary 0/1 &lt;strong&gt;conversion data&lt;/strong&gt;.&lt;/p&gt;
&lt;p&gt;The experiment setup is almost entirely the same as in the last post. The only difference is in the &lt;code&gt;get_AB_samples()&lt;/code&gt; and &lt;code&gt;get_AB_samples_nocorr()&lt;/code&gt; functions, since these now have to generate 0/1 conversion data. The &lt;a href="https://github.com/mtrencseni/playground/blob/master/CUPED%20Conversion.ipynb"&gt;jupyter notebook for this post is up on Github&lt;/a&gt;.&lt;/p&gt;
&lt;h2&gt;Generating correlated conversion data&lt;/h2&gt;
&lt;p&gt;The approach is to assume a base conversion &lt;code&gt;before_p&lt;/code&gt;, say 10%. For each user, do a random 0/1 draw with this probability. This "before" outcome is either 0 or 1. Then, generate the "after" data:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;if the "before" outcome was 0, do a random 0/1 draw with conversion &lt;code&gt;0 + offset_p&lt;/code&gt;, say 10%. This way, most 0s remain 0s.&lt;/li&gt;
&lt;li&gt;if the "before" outcome was 1, do a random 0/1 draw with conversion &lt;code&gt;1 - offset_p&lt;/code&gt;, say 90%. This way, most 1s remain 1s.&lt;/li&gt;
&lt;li&gt;in the B variant, add &lt;code&gt;treatment_lift&lt;/code&gt; (say 1%) to the above probabilities.&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;This is just one possible scheme, but it results in "before" and "after" data being correlated. In code:&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;draw&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;p&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
    &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;random&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt; &lt;span class="o"&gt;&amp;lt;&lt;/span&gt; &lt;span class="n"&gt;p&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
        &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt;
    &lt;span class="k"&gt;else&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
        &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;

&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;lmd&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;x&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="nb"&gt;list&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nb"&gt;map&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;draw&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;x&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;

&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;get_AB_samples&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;before_p&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;offset_p&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;treatment_lift&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;N&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
    &lt;span class="n"&gt;A_before&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="nb"&gt;int&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;random&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt; &lt;span class="o"&gt;&amp;lt;&lt;/span&gt; &lt;span class="n"&gt;before_p&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;_&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="nb"&gt;range&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;N&lt;/span&gt;&lt;span class="p"&gt;)]&lt;/span&gt;
    &lt;span class="n"&gt;B_before&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="nb"&gt;int&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;random&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt; &lt;span class="o"&gt;&amp;lt;&lt;/span&gt; &lt;span class="n"&gt;before_p&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;_&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="nb"&gt;range&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;N&lt;/span&gt;&lt;span class="p"&gt;)]&lt;/span&gt;
    &lt;span class="n"&gt;A_after&lt;/span&gt;  &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="nb"&gt;int&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;random&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt; &lt;span class="o"&gt;&amp;lt;&lt;/span&gt; &lt;span class="nb"&gt;abs&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;x&lt;/span&gt; &lt;span class="o"&gt;-&lt;/span&gt; &lt;span class="n"&gt;offset_p&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;                  &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;x&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;A_before&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;
    &lt;span class="n"&gt;B_after&lt;/span&gt;  &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="nb"&gt;int&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;random&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt; &lt;span class="o"&gt;&amp;lt;&lt;/span&gt; &lt;span class="nb"&gt;abs&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;x&lt;/span&gt; &lt;span class="o"&gt;-&lt;/span&gt; &lt;span class="n"&gt;offset_p&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;+&lt;/span&gt; &lt;span class="n"&gt;treatment_lift&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;x&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;B_before&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;
    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;lmd&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;A_before&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt; &lt;span class="n"&gt;lmd&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;B_before&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt; &lt;span class="n"&gt;lmd&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;A_after&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt; &lt;span class="n"&gt;lmd&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;B_after&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/pre&gt;&lt;/div&gt;


&lt;p&gt;We can validate by computing conditional probabilities:&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;span class="n"&gt;counts&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;defaultdict&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nb"&gt;int&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;b&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;a&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="nb"&gt;zip&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;A_before&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;A_after&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
    &lt;span class="n"&gt;counts&lt;/span&gt;&lt;span class="p"&gt;[(&lt;/span&gt;&lt;span class="n"&gt;b&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;a&lt;/span&gt;&lt;span class="p"&gt;)]&lt;/span&gt; &lt;span class="o"&gt;+=&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt;
    &lt;span class="n"&gt;counts&lt;/span&gt;&lt;span class="p"&gt;[(&lt;/span&gt;&lt;span class="n"&gt;b&lt;/span&gt;&lt;span class="p"&gt;)]&lt;/span&gt; &lt;span class="o"&gt;+=&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt;

&lt;span class="k"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;(after = 1 | before = 0) = &lt;/span&gt;&lt;span class="si"&gt;%.2f&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;&lt;/span&gt; &lt;span class="o"&gt;%&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;counts&lt;/span&gt;&lt;span class="p"&gt;[(&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;)]&lt;/span&gt;&lt;span class="o"&gt;/&lt;/span&gt;&lt;span class="n"&gt;counts&lt;/span&gt;&lt;span class="p"&gt;[(&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;)]))&lt;/span&gt;
&lt;span class="k"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;(after = 0 | before = 0) = &lt;/span&gt;&lt;span class="si"&gt;%.2f&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;&lt;/span&gt; &lt;span class="o"&gt;%&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;counts&lt;/span&gt;&lt;span class="p"&gt;[(&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;)]&lt;/span&gt;&lt;span class="o"&gt;/&lt;/span&gt;&lt;span class="n"&gt;counts&lt;/span&gt;&lt;span class="p"&gt;[(&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;)]))&lt;/span&gt;
&lt;span class="k"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;(after = 1 | before = 1) = &lt;/span&gt;&lt;span class="si"&gt;%.2f&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;&lt;/span&gt; &lt;span class="o"&gt;%&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;counts&lt;/span&gt;&lt;span class="p"&gt;[(&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;)]&lt;/span&gt;&lt;span class="o"&gt;/&lt;/span&gt;&lt;span class="n"&gt;counts&lt;/span&gt;&lt;span class="p"&gt;[(&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;)]))&lt;/span&gt;
&lt;span class="k"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;(after = 0 | before = 1) = &lt;/span&gt;&lt;span class="si"&gt;%.2f&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;&lt;/span&gt; &lt;span class="o"&gt;%&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;counts&lt;/span&gt;&lt;span class="p"&gt;[(&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;)]&lt;/span&gt;&lt;span class="o"&gt;/&lt;/span&gt;&lt;span class="n"&gt;counts&lt;/span&gt;&lt;span class="p"&gt;[(&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;)]))&lt;/span&gt;
&lt;/pre&gt;&lt;/div&gt;


&lt;p&gt;Prints:&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;(after = 1 | before = 0) = 0.10
(after = 0 | before = 0) = 0.90 # if it was 0, it&amp;#39;s likely to be 0 again
(after = 1 | before = 1) = 0.90 # if it was 1, it&amp;#39;s likely to be 1 again
(after = 0 | before = 1) = 0.10
&lt;/pre&gt;&lt;/div&gt;


&lt;h2&gt;Simulating one experiment&lt;/h2&gt;
&lt;p&gt;The driver code is, apart from parametrization, the same as before:&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;span class="n"&gt;N&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;10&lt;/span&gt;&lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="mi"&gt;1000&lt;/span&gt;
&lt;span class="n"&gt;before_p&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mf"&gt;0.1&lt;/span&gt;
&lt;span class="n"&gt;offset_p&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mf"&gt;0.1&lt;/span&gt;
&lt;span class="n"&gt;treatment_lift&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mf"&gt;0.01&lt;/span&gt;

&lt;span class="n"&gt;A_before&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;B_before&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;A_after&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;B_after&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;get_AB_samples&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;before_p&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;offset_p&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;treatment_lift&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;N&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="n"&gt;A_after_adjusted&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;B_after_adjusted&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;get_cuped_adjusted&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;A_before&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;B_before&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;A_after&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;B_after&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="k"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;A mean before = &lt;/span&gt;&lt;span class="si"&gt;%.3f&lt;/span&gt;&lt;span class="s1"&gt;, A mean after = &lt;/span&gt;&lt;span class="si"&gt;%.3f&lt;/span&gt;&lt;span class="s1"&gt;, A mean after adjusted = &lt;/span&gt;&lt;span class="si"&gt;%.3f&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;&lt;/span&gt; &lt;span class="o"&gt;%&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;mean&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;A_before&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt; &lt;span class="n"&gt;mean&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;A_after&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt; &lt;span class="n"&gt;mean&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;A_after_adjusted&lt;/span&gt;&lt;span class="p"&gt;)))&lt;/span&gt;
&lt;span class="k"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;B mean be|fore = &lt;/span&gt;&lt;span class="si"&gt;%.3f&lt;/span&gt;&lt;span class="s1"&gt;, B mean after = &lt;/span&gt;&lt;span class="si"&gt;%.3f&lt;/span&gt;&lt;span class="s1"&gt;, B mean after adjusted = &lt;/span&gt;&lt;span class="si"&gt;%.3f&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;&lt;/span&gt; &lt;span class="o"&gt;%&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;mean&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;B_before&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt; &lt;span class="n"&gt;mean&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;B_after&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt; &lt;span class="n"&gt;mean&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;B_after_adjusted&lt;/span&gt;&lt;span class="p"&gt;)))&lt;/span&gt;
&lt;span class="k"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;Traditional    A/B test evaluation, lift = &lt;/span&gt;&lt;span class="si"&gt;%.3f&lt;/span&gt;&lt;span class="s1"&gt;, p-value = &lt;/span&gt;&lt;span class="si"&gt;%.3f&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;&lt;/span&gt; &lt;span class="o"&gt;%&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;lift&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;A_after&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;B_after&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt; &lt;span class="n"&gt;p_value&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;A_after&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;B_after&lt;/span&gt;&lt;span class="p"&gt;)))&lt;/span&gt;
&lt;span class="k"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;CUPED adjusted A/B test evaluation, lift = &lt;/span&gt;&lt;span class="si"&gt;%.3f&lt;/span&gt;&lt;span class="s1"&gt;, p-value = &lt;/span&gt;&lt;span class="si"&gt;%.3f&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;&lt;/span&gt; &lt;span class="o"&gt;%&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;lift&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;A_after_adjusted&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;B_after_adjusted&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt; &lt;span class="n"&gt;p_value&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;A_after_adjusted&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;B_after_adjusted&lt;/span&gt;&lt;span class="p"&gt;)))&lt;/span&gt;
&lt;/pre&gt;&lt;/div&gt;


&lt;p&gt;Prints (not deterministic):&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;A mean before = 0.097, A mean after = 0.179, A mean after adjusted = 0.180
B mean before = 0.099, B mean after = 0.191, B mean after adjusted = 0.190
Traditional    A/B test evaluation, lift = 0.012, p-value = 0.028
CUPED adjusted A/B test evaluation, lift = 0.010, p-value = 0.021
&lt;/pre&gt;&lt;/div&gt;


&lt;p&gt;In this particular run, CUPED approximated the true lift better and produced a lower p-value. However, as we saw before, this is not always the case: although on average CUPED is a better estimator with lower variance, there are experiment runs where CUPED makes the lift measurement worse. For example, after a few re-runs:&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;A mean before = 0.098, A mean after = 0.178, A mean after adjusted = 0.180
B mean before = 0.103, B mean after = 0.189, B mean after adjusted = 0.187
Traditional    A/B test evaluation, lift = 0.011, p-value = 0.053
CUPED adjusted A/B test evaluation, lift = 0.006, p-value = 0.133
&lt;/pre&gt;&lt;/div&gt;


&lt;p&gt;Next, it's interesting to print out the CUPED-transformed variables:&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;print(&amp;#39;Possible mappings:&amp;#39;)
mappings = set([&amp;#39;(before=%d, after=%d) -&amp;gt; adjusted=%.3f&amp;#39; % (b, a, p) for
  b, a, p in zip(A_before+B_before, A_after+B_after, A_after_adjusted+B_after_adjusted)])
for m in mappings:
    print(m)
&lt;/pre&gt;&lt;/div&gt;


&lt;p&gt;Prints:&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;Possible mappings:
(before=1, after=0) -&amp;gt; adjusted=-0.722
(before=0, after=1) -&amp;gt; adjusted=1.081
(before=0, after=0) -&amp;gt; adjusted=0.081
(before=1, after=1) -&amp;gt; adjusted=0.278
&lt;/pre&gt;&lt;/div&gt;


&lt;p&gt;Remember the CUPED transformation equation was $ Y'_i := Y_i - (X_i - \mu_X) \frac{cov(X, Y)}{var(X)} $. In this equation $\mu_X$, $cov(X, Y)$, and $var(X)$ are constants computed from the experiment results. Both $X_i$ and $Y_i$ can take on two values, 0 or 1, so there are 4 possible combinations, hence $Y'_i$ can take on 4 values. In the above run, the values were -0.722, 1.081, 0.081 and 0.278. It's a bit counter-intuitive, since now the conversion experiment has these weird, even negative values, instead of just 0 and 1.&lt;/p&gt;
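The four-value mapping can be reproduced directly from the equation. A minimal sketch; `theta` and `mu_x` below are made-up stand-ins for the cov(X, Y)/var(X) and mean(X) constants computed in an actual run (the numbers above imply theta was roughly 0.80 and mu_X roughly 0.10 in that run):

```python
# With binary before (X) and after (Y), the CUPED adjustment
# Y' = Y - (X - mu_X) * theta can only take 4 distinct values.
# theta and mu_x are made-up stand-ins for cov(X, Y)/var(X) and mean(X).
theta = 0.8
mu_x = 0.1

adjusted = {(x, y): y - (x - mu_x) * theta for x in (0, 1) for y in (0, 1)}
for (x, y), y_adj in sorted(adjusted.items()):
    print('(before=%d, after=%d) -> adjusted=%.3f' % (x, y, y_adj))
```

With these stand-in constants the four values come out close to the ones printed above, as expected.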
&lt;h2&gt;Simulating many experiments&lt;/h2&gt;
&lt;p&gt;As with continuous variables, CUPED measures the same lift in conversion, but with lower variance:&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;Simulating 10,000 A/B tests, true treatment lift is 0.010...
Traditional    A/B testing, mean lift = 0.010, variance of lift = 0.00030
CUPED adjusted A/B testing, mean lift = 0.010, variance of lift = 0.00019
CUPED lift variance / traditional lift variance = 0.626 (expected = 0.668)
&lt;/pre&gt;&lt;/div&gt;


&lt;p&gt;We can observe the tighter lifts on a histogram:&lt;/p&gt;
&lt;p&gt;&lt;img src="/images/cuped_conversion_1.png" alt="CUPED" style="width: 800px;"/&gt;&lt;/p&gt;
&lt;p&gt;As with continuous variables, the p-values decrease:&lt;/p&gt;
&lt;p&gt;&lt;img src="/images/cuped_conversion_2.png" alt="CUPED" style="width: 800px;"/&gt;&lt;/p&gt;
&lt;p&gt;As illustrated above, CUPED lift and p-value estimates have lower variance on average, but are not better in every individual experiment:&lt;/p&gt;
&lt;p&gt;&lt;img src="/images/cuped_conversion_3.png" alt="CUPED" style="width: 800px;"/&gt;&lt;/p&gt;
&lt;p&gt;&lt;img src="/images/cuped_conversion_4.png" alt="CUPED" style="width: 800px;"/&gt;&lt;/p&gt;
&lt;h2&gt;No correlation&lt;/h2&gt;
&lt;p&gt;The simplest way to generate uncorrelated conversion data is to use random draws independently. In code:&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;get_AB_samples_nocorr&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;before_p&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;treatment_lift&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;N&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
    &lt;span class="n"&gt;A_before&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;before_p&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt; &lt;span class="n"&gt;N&lt;/span&gt;
    &lt;span class="n"&gt;B_before&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;before_p&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt; &lt;span class="n"&gt;N&lt;/span&gt;
    &lt;span class="n"&gt;A_after&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;before_p&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt; &lt;span class="n"&gt;N&lt;/span&gt;
    &lt;span class="n"&gt;B_after&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;before_p&lt;/span&gt; &lt;span class="o"&gt;+&lt;/span&gt; &lt;span class="n"&gt;treatment_lift&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt; &lt;span class="n"&gt;N&lt;/span&gt;
    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;lmd&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;A_before&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt; &lt;span class="n"&gt;lmd&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;B_before&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt; &lt;span class="n"&gt;lmd&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;A_after&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt; &lt;span class="n"&gt;lmd&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;B_after&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/pre&gt;&lt;/div&gt;


&lt;p&gt;Checking conditional probabilities:&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;P(after = 1 | before = 0) = 0.10
P(after = 0 | before = 0) = 0.90
P(after = 1 | before = 1) = 0.10
P(after = 0 | before = 1) = 0.90
&lt;/pre&gt;&lt;/div&gt;


&lt;p&gt;It's uncorrelated because &lt;code&gt;P(after = 1) = 0.10&lt;/code&gt; whether "before" is 0 or 1, so "after" is independent of "before" (and likewise for &lt;code&gt;P(after = 0)&lt;/code&gt;).&lt;/p&gt;
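As a sanity check, we can draw "before" and "after" independently and verify the conditional probabilities numerically. A quick sketch; `bernoulli()` is a hypothetical helper standing in for however the notebook draws the 0/1 outcomes:

```python
from random import random, seed

# Sanity check: draw "before" and "after" independently and verify that
# P(after = 1 | before) matches the marginal P(after = 1) = 0.10.
# bernoulli() is a hypothetical helper standing in for the notebook's draws.
def bernoulli(p):
    return 1 if random() < p else 0

seed(0)
p, n = 0.1, 200_000
pairs = [(bernoulli(p), bernoulli(p)) for _ in range(n)]
for b in (0, 1):
    matching = [after for before, after in pairs if before == b]
    print('P(after = 1 | before = %d) = %.2f' % (b, sum(matching) / len(matching)))
```

Both conditionals come out around 0.10, confirming independence.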
&lt;p&gt;With this generator function, running &lt;code&gt;num_experiments=10,000&lt;/code&gt;, we observe no variance reduction (since "before" and "after" are uncorrelated) on the lift and p-value histograms:&lt;/p&gt;
&lt;p&gt;&lt;img src="/images/cuped_conversion_5.png" alt="CUPED" style="width: 800px;"/&gt;&lt;/p&gt;
&lt;p&gt;&lt;img src="/images/cuped_conversion_6.png" alt="CUPED" style="width: 800px;"/&gt;&lt;/p&gt;
&lt;h2&gt;Conclusion&lt;/h2&gt;
&lt;p&gt;CUPED works for both continuous and binary experimentation outcomes, and reduces variance if "before" and "after" are correlated.&lt;/p&gt;</content><category term="ab-testing"></category><category term="cuped"></category></entry><entry><title>Reducing variance in A/B testing with CUPED</title><link href="https://bytepawn.com/reducing-variance-in-ab-testing-with-cuped.html" rel="alternate"></link><published>2021-07-31T00:00:00+02:00</published><updated>2021-07-31T00:00:00+02:00</updated><author><name>Marton Trencseni</name></author><id>tag:bytepawn.com,2021-07-31:/reducing-variance-in-ab-testing-with-cuped.html</id><summary type="html">&lt;p&gt;I use Monte Carlo simulations of A/B tests to demonstrate CUPED, a method to use historic "before" data to reduce the variance in the measurement of the treatment lift.&lt;br/&gt;&lt;br/&gt; &lt;img src="/images/cuped2.png" alt="Historic lift" style="width: 400px;"/&gt;&lt;/p&gt;</summary><content type="html">&lt;h2&gt;Introduction&lt;/h2&gt;
&lt;p&gt;&lt;a href="https://exp-platform.com/cuped/"&gt;CUPED&lt;/a&gt; is a variance reduction technique introduced by Deng, Xu, Kohavi and Walker. Using CUPED we can achieve lower variance in A/B testing by computing an adjusted evaluation metric using historic "before" data. Based on &lt;a href="https://www.google.com/search?q=cuped&amp;amp;oq=cuped&amp;amp;aqs=chrome..69i57j0i10l3j0i512l2j0i10j69i60.1489j0j1&amp;amp;sourceid=chrome&amp;amp;ie=UTF-8"&gt;blog posts&lt;/a&gt;, it seems that many big Internet companies are using CUPED, such as Bing, Facebook, Netflix, Booking, Etsy. &lt;/p&gt;
&lt;p&gt;In this post I will use Monte Carlo simulations of A/B tests to demonstrate (to myself and readers) how CUPED works. The &lt;a href="https://github.com/mtrencseni/playground/blob/master/CUPED.ipynb"&gt;Jupyter notebook is up on Github&lt;/a&gt;.&lt;/p&gt;
&lt;h2&gt;Experiment setup&lt;/h2&gt;
&lt;p&gt;Assume an A/B testing setup where we're measuring a metric M, e.g. dollar spend per user. We have N users, randomly split into A and B. A is control, B is treatment. We have metric M for each user for the "before" time period, when treatment and control were treated identically, and for the "after" period, when the treatment was applied to B, which we hope increased their spend.&lt;/p&gt;
&lt;p&gt;Let $Y_i$ be the ith user's spend in the "after" period, and $X_i$ be their spend in the "before" period, both for A and B combined. We compute an adjusted "after" spend $Y'_i$.&lt;/p&gt;
&lt;p&gt;The CUPED recipe:&lt;/p&gt;
&lt;ol&gt;
&lt;li&gt;Compute the covariance $cov(X, Y)$ of X and Y.&lt;/li&gt;
&lt;li&gt;Compute the variance $var(X)$ of X.&lt;/li&gt;
&lt;li&gt;Compute the mean $\mu_X$ of X.&lt;/li&gt;
&lt;li&gt;Compute the adjusted $ Y'_i = Y_i - (X_i - \mu_X) \frac{cov(X, Y)}{var(X)} $ for each user.&lt;/li&gt;
&lt;li&gt;Evaluate the A/B test using $Y'$ instead of $Y$.&lt;/li&gt;
&lt;/ol&gt;
&lt;p&gt;The point of CUPED is that, when using $Y'$ instead of $Y$, the estimate of the treatment lift will have lower variance, i.e. on average we will be better at estimating the lift.&lt;/p&gt;
&lt;p&gt;Mathematically, the decrease in variance will be $ \frac{var(Y')}{var(Y)} = 1 - corr(X, Y)^2 $ where $ corr(X, Y) = \frac{cov(X, Y)}{\sqrt{var(X)var(Y)}} $ is the correlation between X and Y.&lt;/p&gt;
&lt;p&gt;Note: in some of the blog posts about CUPED, there seems to be a typo and the squared is missing from the $corr(X, Y)^2$. It's there in the original paper.&lt;/p&gt;
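The variance-reduction identity is easy to check numerically. A quick sketch with numpy on synthetic correlated data; the parameters are mine, chosen to resemble the simulation further below:

```python
import numpy as np

# Numerically check var(Y') / var(Y) = 1 - corr(X, Y)^2 on synthetic data.
rng = np.random.default_rng(0)
n = 100_000
x = rng.normal(100, 50, size=n)      # "before" spend, X
y = x + rng.normal(0, 20, size=n)    # correlated "after" spend, Y

theta = np.cov(x, y)[0, 1] / np.var(x, ddof=1)
y_adj = y - (x - x.mean()) * theta   # CUPED-adjusted Y'

ratio = np.var(y_adj, ddof=1) / np.var(y, ddof=1)
expected = 1 - np.corrcoef(x, y)[0, 1] ** 2
print('var(Y_adj)/var(Y) = %.4f, 1 - corr(X, Y)^2 = %.4f' % (ratio, expected))
```

With sample statistics computed consistently, the two quantities agree to floating-point precision.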
&lt;h2&gt;CUPED implementation&lt;/h2&gt;
&lt;p&gt;Implementing CUPED is straightforward. Assume &lt;code&gt;A_before&lt;/code&gt; and &lt;code&gt;B_before&lt;/code&gt; contain the $X_i$ values for each customer in groups A and B, and &lt;code&gt;A_after&lt;/code&gt; and &lt;code&gt;B_after&lt;/code&gt; contain the $Y_i$ values: &lt;code&gt;A_before[i]&lt;/code&gt; is the ith customer's spend in the "before" period, &lt;code&gt;A_after[i]&lt;/code&gt; is the same customer's "after" spend, and likewise for the Bs. Note that users in A and B are indexed separately: the ith user in the A list (who is in group A) is a different user than the ith user in the B list (who is in group B). In code:&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;get_cuped_adjusted&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;A_before&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;B_before&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;A_after&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;B_after&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
    &lt;span class="n"&gt;cv&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;cov&lt;/span&gt;&lt;span class="p"&gt;([&lt;/span&gt;&lt;span class="n"&gt;A_after&lt;/span&gt; &lt;span class="o"&gt;+&lt;/span&gt; &lt;span class="n"&gt;B_after&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;A_before&lt;/span&gt; &lt;span class="o"&gt;+&lt;/span&gt; &lt;span class="n"&gt;B_before&lt;/span&gt;&lt;span class="p"&gt;])&lt;/span&gt;
    &lt;span class="n"&gt;theta&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;cv&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="o"&gt;/&lt;/span&gt; &lt;span class="n"&gt;cv&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;
    &lt;span class="n"&gt;mean_before&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;mean&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;A_before&lt;/span&gt; &lt;span class="o"&gt;+&lt;/span&gt; &lt;span class="n"&gt;B_before&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="n"&gt;A_after_adjusted&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;after&lt;/span&gt; &lt;span class="o"&gt;-&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;before&lt;/span&gt; &lt;span class="o"&gt;-&lt;/span&gt; &lt;span class="n"&gt;mean_before&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt; &lt;span class="n"&gt;theta&lt;/span&gt; &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;after&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;before&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="nb"&gt;zip&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;A_after&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;A_before&lt;/span&gt;&lt;span class="p"&gt;)]&lt;/span&gt;
    &lt;span class="n"&gt;B_after_adjusted&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;after&lt;/span&gt; &lt;span class="o"&gt;-&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;before&lt;/span&gt; &lt;span class="o"&gt;-&lt;/span&gt; &lt;span class="n"&gt;mean_before&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt; &lt;span class="n"&gt;theta&lt;/span&gt; &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;after&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;before&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="nb"&gt;zip&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;B_after&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;B_before&lt;/span&gt;&lt;span class="p"&gt;)]&lt;/span&gt;
    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;A_after_adjusted&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;B_after_adjusted&lt;/span&gt;
&lt;/pre&gt;&lt;/div&gt;


&lt;h2&gt;Simulating one experiment&lt;/h2&gt;
&lt;p&gt;We need some code to generate the &lt;code&gt;A_before, B_before, A_after, B_after&lt;/code&gt; lists, and in a way so that "before" and "after" are correlated. For this experiment, I chose:&lt;/p&gt;
&lt;ol&gt;
&lt;li&gt;$X$ follows a normal distribution like &lt;code&gt;N(before_mean, before_sigma)&lt;/code&gt;&lt;/li&gt;
&lt;li&gt;The A control $Y_i$ is the same as $X_i$ plus a normal random variable like &lt;code&gt;N(0, eps_sigma)&lt;/code&gt;&lt;/li&gt;
&lt;li&gt;The B treatment $Y_i$ is the same as $X_i$ plus a normal random variable like &lt;code&gt;N(0, eps_sigma)&lt;/code&gt;, plus a fixed &lt;code&gt;treatment_lift&lt;/code&gt;&lt;/li&gt;
&lt;/ol&gt;
&lt;p&gt;In code:&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;get_AB_samples&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;before_mean&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;before_sigma&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;eps_sigma&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;treatment_lift&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;N&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
    &lt;span class="n"&gt;A_before&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nb"&gt;list&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;normal&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;loc&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;before_mean&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;scale&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;before_sigma&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;size&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;N&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;
    &lt;span class="n"&gt;B_before&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nb"&gt;list&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;normal&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;loc&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;before_mean&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;scale&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;before_sigma&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;size&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;N&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;
    &lt;span class="n"&gt;A_after&lt;/span&gt;  &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;x&lt;/span&gt; &lt;span class="o"&gt;+&lt;/span&gt; &lt;span class="n"&gt;normal&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;loc&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;scale&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;eps_sigma&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;x&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;A_before&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;
    &lt;span class="n"&gt;B_after&lt;/span&gt;  &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;x&lt;/span&gt; &lt;span class="o"&gt;+&lt;/span&gt; &lt;span class="n"&gt;normal&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;loc&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;scale&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;eps_sigma&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;+&lt;/span&gt; &lt;span class="n"&gt;treatment_lift&lt;/span&gt; &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;x&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;B_before&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;
    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;A_before&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;B_before&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;A_after&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;B_after&lt;/span&gt;
&lt;/pre&gt;&lt;/div&gt;


&lt;p&gt;With this, we can now run a single experiment, to see what happens with CUPED:&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;span class="n"&gt;N&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;1000&lt;/span&gt;
&lt;span class="n"&gt;before_mean&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;100&lt;/span&gt; 
&lt;span class="n"&gt;before_sigma&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;50&lt;/span&gt;
&lt;span class="n"&gt;eps_sigma&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;20&lt;/span&gt;
&lt;span class="n"&gt;treatment_lift&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;2&lt;/span&gt;

&lt;span class="n"&gt;A_before&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;B_before&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;A_after&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;B_after&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;get_AB_samples&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;before_mean&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;before_sigma&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;eps_sigma&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;treatment_lift&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;N&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="n"&gt;A_after_adjusted&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;B_after_adjusted&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;get_cuped_adjusted&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;A_before&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;B_before&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;A_after&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;B_after&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="k"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;A mean before = &lt;/span&gt;&lt;span class="si"&gt;%05.1f&lt;/span&gt;&lt;span class="s1"&gt;, A mean after = &lt;/span&gt;&lt;span class="si"&gt;%05.1f&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;&lt;/span&gt; &lt;span class="o"&gt;%&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;mean&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;A_before&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt; &lt;span class="n"&gt;mean&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;A_after&lt;/span&gt;&lt;span class="p"&gt;)))&lt;/span&gt;
&lt;span class="k"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;B mean before = &lt;/span&gt;&lt;span class="si"&gt;%05.1f&lt;/span&gt;&lt;span class="s1"&gt;, B mean after = &lt;/span&gt;&lt;span class="si"&gt;%05.1f&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;&lt;/span&gt; &lt;span class="o"&gt;%&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;mean&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;B_before&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt; &lt;span class="n"&gt;mean&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;B_after&lt;/span&gt;&lt;span class="p"&gt;)))&lt;/span&gt;
&lt;span class="k"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;Traditional    A/B test evaluation, lift = &lt;/span&gt;&lt;span class="si"&gt;%.3f&lt;/span&gt;&lt;span class="s1"&gt;, p-value = &lt;/span&gt;&lt;span class="si"&gt;%.3f&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;&lt;/span&gt; &lt;span class="o"&gt;%&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;lift&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;A_after&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;B_after&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt; &lt;span class="n"&gt;p_value&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;A_after&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;B_after&lt;/span&gt;&lt;span class="p"&gt;)))&lt;/span&gt;
&lt;span class="k"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;CUPED adjusted A/B test evaluation, lift = &lt;/span&gt;&lt;span class="si"&gt;%.3f&lt;/span&gt;&lt;span class="s1"&gt;, p-value = &lt;/span&gt;&lt;span class="si"&gt;%.3f&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;&lt;/span&gt; &lt;span class="o"&gt;%&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;lift&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;A_after_adjusted&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;B_after_adjusted&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt; &lt;span class="n"&gt;p_value&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;A_after_adjusted&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;B_after_adjusted&lt;/span&gt;&lt;span class="p"&gt;)))&lt;/span&gt;
&lt;/pre&gt;&lt;/div&gt;


&lt;p&gt;Yields, for example:&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;A mean before = 099.4, A mean after = 099.9, A mean after adjusted = 100.6
B mean before = 100.7, B mean after = 102.9, B mean after adjusted = 102.2
Traditional    A/B test evaluation, lift = 2.941, p-value = 0.214
CUPED adjusted A/B test evaluation, lift = 1.688, p-value = 0.052
&lt;/pre&gt;&lt;/div&gt;


&lt;p&gt;So this shows that in this particular instance, the CUPED adjusted lift computed from $Y'$ (&lt;code&gt;lift = 1.688&lt;/code&gt;) was closer to the actual lift of 2 than the traditional lift computed from $Y$ (&lt;code&gt;lift = 2.941&lt;/code&gt;), and the p-value was lower.&lt;/p&gt;
&lt;p&gt;If I run it a couple of times, we can also get:&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;A mean before = 101.8, A mean after = 102.5, A mean after adjusted = 101.9
B mean before = 100.5, B mean after = 101.7, B mean after adjusted = 102.3
Traditional    A/B test evaluation, lift = -0.854, p-value = 0.724
CUPED adjusted A/B test evaluation, lift = 0.380, p-value = 0.670
&lt;/pre&gt;&lt;/div&gt;


&lt;p&gt;This one is quite counter-intuitive: here, CUPED flipped the sign of the lift. Without CUPED, A was better than B (&lt;code&gt;lift = -0.854&lt;/code&gt;), but with the CUPED adjustment, B is better than A (&lt;code&gt;lift = 0.380&lt;/code&gt;). Running it a few more times, we can also get:&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;A mean before = 099.5, A mean after = 099.7, A mean after adjusted = 101.6
B mean before = 103.2, B mean after = 103.4, B mean after adjusted = 101.6
Traditional    A/B test evaluation, lift = 3.743, p-value = 0.120
CUPED adjusted A/B test evaluation, lift = 0.006, p-value = 0.995
&lt;/pre&gt;&lt;/div&gt;


&lt;p&gt;Here, CUPED adjustment made the lift estimate worse (&lt;code&gt;lift = 0.006&lt;/code&gt; is farther from 2 than &lt;code&gt;lift = 3.743&lt;/code&gt;), and had a higher p-value. This is because CUPED reduces the variance on average, but not in every particular experiment instance.&lt;/p&gt;
&lt;p&gt;We can also visualize the original $X$ and $Y$ and adjusted $Y'$ on a scatterplot:&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;span class="n"&gt;plt&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;figure&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;figsize&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;14&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;7&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;
&lt;span class="n"&gt;plt&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;scatter&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;A_before&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;A_after&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;marker&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;.&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="n"&gt;plt&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;scatter&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;B_before&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;B_after&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;marker&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;.&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="n"&gt;plt&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;scatter&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;A_before&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;A_after_adjusted&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;marker&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;.&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="n"&gt;plt&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;scatter&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;B_before&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;B_after_adjusted&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;marker&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;.&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="n"&gt;plt&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;xlabel&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;before&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="n"&gt;plt&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;xlabel&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;after&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="n"&gt;plt&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;legend&lt;/span&gt;&lt;span class="p"&gt;([&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;A without adjustment&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s1"&gt;&amp;#39;B without adjustment&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s1"&gt;&amp;#39;A with adjustment&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s1"&gt;&amp;#39;B with adjustment&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;])&lt;/span&gt;
&lt;span class="n"&gt;plt&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;show&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
&lt;/pre&gt;&lt;/div&gt;


&lt;p&gt;On this scatterplot, the x-axis is the "before" spend and the y-axis is the "after" spend; it shows both $Y$ vs $X$ and $Y'$ vs $X$, for A and B, so 4 groups of points. The "linear trend" is not seasonality; it's the correlation baked into the experiment setup: somebody who spent less "before" is also likely to spend less "after". This is exactly the correlation that CUPED removes:&lt;/p&gt;
&lt;p&gt;&lt;img src="/images/cuped1.png" alt="CUPED" style="width: 800px;"/&gt;&lt;/p&gt;
&lt;h2&gt;Simulating many experiments&lt;/h2&gt;
&lt;p&gt;In order to demonstrate the variance decrease $ \frac{var(Y')}{var(Y)} = 1 - corr(X, Y)^2 $, we need to run many A/B tests, store the lifts computed both without and with the CUPED adjustment, and then compute the variances. Let's run 1,000 A/B tests and record the variances:&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;span class="n"&gt;num_simulations&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;1000&lt;/span&gt;
&lt;span class="n"&gt;traditional_lifts&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;adjusted_lifts&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[],&lt;/span&gt; &lt;span class="p"&gt;[]&lt;/span&gt;
&lt;span class="n"&gt;traditional_pvalues&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;adjusted_pvalues&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[],&lt;/span&gt; &lt;span class="p"&gt;[]&lt;/span&gt;
&lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;i&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="nb"&gt;range&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;num_simulations&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
    &lt;span class="n"&gt;A_before&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;B_before&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;A_after&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;B_after&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;get_AB_samples&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;before_mean&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;before_sigma&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;eps_sigma&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;treatment_lift&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;N&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="n"&gt;A_after_adjusted&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;B_after_adjusted&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;get_cuped_adjusted&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;A_before&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;B_before&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;A_after&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;B_after&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="n"&gt;traditional_lifts&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;append&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;lift&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;A_after&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;B_after&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;
    &lt;span class="n"&gt;adjusted_lifts&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;append&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;lift&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;A_after_adjusted&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;B_after_adjusted&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;
    &lt;span class="n"&gt;traditional_pvalues&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;append&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;p_value&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;A_after&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;B_after&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;
    &lt;span class="n"&gt;adjusted_pvalues&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;append&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;p_value&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;A_after_adjusted&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;B_after_adjusted&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;

&lt;span class="k"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;Traditional    A/B testing, mean lift = &lt;/span&gt;&lt;span class="si"&gt;%.2f&lt;/span&gt;&lt;span class="s1"&gt;, variance of lift = &lt;/span&gt;&lt;span class="si"&gt;%.2f&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;&lt;/span&gt; &lt;span class="o"&gt;%&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;mean&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;traditional_lifts&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt; &lt;span class="n"&gt;cov&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;traditional_lifts&lt;/span&gt;&lt;span class="p"&gt;)))&lt;/span&gt;
&lt;span class="k"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;CUPED adjusted A/B testing, mean lift = &lt;/span&gt;&lt;span class="si"&gt;%.2f&lt;/span&gt;&lt;span class="s1"&gt;, variance of lift = &lt;/span&gt;&lt;span class="si"&gt;%.2f&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;&lt;/span&gt; &lt;span class="o"&gt;%&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;mean&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;adjusted_lifts&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt; &lt;span class="n"&gt;cov&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;adjusted_lifts&lt;/span&gt;&lt;span class="p"&gt;)))&lt;/span&gt;
&lt;span class="k"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;CUPED lift variance / tradititional lift variance = &lt;/span&gt;&lt;span class="si"&gt;%.2f&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;&lt;/span&gt; &lt;span class="o"&gt;%&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;cov&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;adjusted_lifts&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;&lt;span class="o"&gt;/&lt;/span&gt;&lt;span class="n"&gt;cov&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;traditional_lifts&lt;/span&gt;&lt;span class="p"&gt;)))&lt;/span&gt;
&lt;/pre&gt;&lt;/div&gt;


&lt;p&gt;Prints:&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;Simulating 1000 A/B tests, true treatment lift is 2...
Traditional    A/B testing, mean lift = 2.13, variance of lift = 5.66
CUPED adjusted A/B testing, mean lift = 2.06, variance of lift = 0.80
CUPED lift variance / traditional lift variance = 0.14
&lt;/pre&gt;&lt;/div&gt;


&lt;p&gt;Over 1,000 A/B tests, the mean lift using traditional A/B testing (without using historic "before" data) was &lt;code&gt;mean lift = 2.13&lt;/code&gt;, and the variance of the individual lifts about this mean was &lt;code&gt;variance of lift = 5.66&lt;/code&gt;. The CUPED adjusted method was closer to the true mean of 2 with &lt;code&gt;mean lift = 2.06&lt;/code&gt;, and had a significantly lower variance of &lt;code&gt;variance of lift = 0.80&lt;/code&gt;, a ratio of &lt;code&gt;CUPED lift variance / traditional lift variance = 0.14&lt;/code&gt;. I separately verified that this ratio matches the theoretically expected $ \frac{var(Y')}{var(Y)} = 1 - corr(X, Y)^2 $:&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;def expected_lift_ratio(A_before, B_before, A_after, B_after):
    cv = cov([A_after + B_after, A_before + B_before])
    corr = cv[0, 1] / (sqrt(cv[0, 0]) * sqrt(cv[1, 1]))
    return 1 - corr**2

large_N = 1000*1000
A_before, B_before, A_after, B_after = get_AB_samples(before_mean, before_sigma, eps_sigma, treatment_lift, large_N)
elr = expected_lift_ratio(A_before, B_before, A_after, B_after)
print(&amp;#39;CUPED lift variance / traditional lift variance = %.2f (expected = %.2f)&amp;#39; % (cov(adjusted_lifts)/cov(traditional_lifts), elr))
&lt;/pre&gt;&lt;/div&gt;


&lt;p&gt;Prints:&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;CUPED lift variance / tradititional lift variance = 0.14 (expected = 0.14)
&lt;/pre&gt;&lt;/div&gt;


&lt;p&gt;It does match; this is a good way to make sure there is no bug in the transformation code.&lt;/p&gt;
&lt;p&gt;We can also plot the histogram of lifts to visualize the decrease in variance, i.e. the decrease in spread:&lt;/p&gt;
&lt;p&gt;&lt;img src="/images/cuped2.png" alt="CUPED" style="width: 800px;"/&gt;&lt;/p&gt;
&lt;p&gt;We can also visualize the histogram of p-values to see that reduced variance goes along with lower p-values:&lt;/p&gt;
&lt;p&gt;&lt;img src="/images/cuped3.png" alt="CUPED" style="width: 800px;"/&gt;&lt;/p&gt;
&lt;h2&gt;Counter-intuitive aspects&lt;/h2&gt;
&lt;p&gt;Whether something is counter-intuitive is subjective. Below I show what I personally find counter-intuitive.&lt;/p&gt;
&lt;p&gt;As shown in the simulation above, CUPED reduces variance on average if "before" and "after" are correlated. So in the long-run, applying CUPED is worth it. But sometimes the adjustment yields a worse measurement, which could be confusing in practice. This can be visualized by putting the traditional $Y$ and adjusted $Y'$ lift on a scatterplot, with lines showing the true lifts:&lt;/p&gt;
&lt;p&gt;&lt;img src="/images/cuped4.png" alt="CUPED" style="width: 800px;"/&gt;&lt;/p&gt;
&lt;p&gt;This shows that sometimes CUPED moves the lift estimate from better to worse, and sometimes it flips the sign, from positive to negative or vice versa. This is not a problem per se, but when you get an experiment realization like this, it can be confusing.&lt;/p&gt;
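&lt;p&gt;A sketch of how such a scatterplot could be produced; the lift lists here are synthetic stand-ins with an illustrative correlation structure (in the real plot they are the &lt;code&gt;traditional_lifts&lt;/code&gt; and &lt;code&gt;adjusted_lifts&lt;/code&gt; collected above):&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;
```python
import numpy as np
import matplotlib
matplotlib.use('Agg')  # non-interactive backend
import matplotlib.pyplot as plt

rng = np.random.default_rng(0)
true_lift = 2.0
# synthetic stand-ins for the per-experiment lifts collected in the loop
traditional_lifts = rng.normal(true_lift, 2.4, 1000)
adjusted_lifts = rng.normal(true_lift, 0.9, 1000)

plt.figure(figsize=(7, 7))
plt.scatter(traditional_lifts, adjusted_lifts, marker='.')
plt.axvline(true_lift, color='gray')  # true lift along the x axis
plt.axhline(true_lift, color='gray')  # true lift along the y axis
plt.xlabel('traditional lift')
plt.ylabel('CUPED adjusted lift')
plt.savefig('cuped_lift_scatter.png')
```
&lt;/pre&gt;&lt;/div&gt;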
&lt;p&gt;The traditional vs CUPED adjusted p-values can also be shown on a scatterplot:&lt;/p&gt;
&lt;p&gt;&lt;img src="/images/cuped5.png" alt="CUPED" style="width: 800px;"/&gt;&lt;/p&gt;
&lt;p&gt;The earlier p-value histogram shows that most of the time CUPED reduces the p-value, but this shows that sometimes it increases it (experiments above the y=x line).&lt;/p&gt;
&lt;h2&gt;No correlation&lt;/h2&gt;
&lt;p&gt;Finally, let's check what happens if "before" and "after" is not correlated at all:&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;get_AB_samples_nocorr&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;mu&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;sigma&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;treatment_lift&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;N&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
    &lt;span class="n"&gt;A_before&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nb"&gt;list&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;normal&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;loc&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;mu&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;scale&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;sigma&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;size&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;N&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;
    &lt;span class="n"&gt;B_before&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nb"&gt;list&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;normal&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;loc&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;mu&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;scale&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;sigma&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;size&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;N&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;
    &lt;span class="n"&gt;A_after&lt;/span&gt;  &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nb"&gt;list&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;normal&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;loc&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;mu&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;scale&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;sigma&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;size&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;N&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;
    &lt;span class="n"&gt;B_after&lt;/span&gt;  &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nb"&gt;list&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;normal&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;loc&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;mu&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;scale&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;sigma&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;size&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;N&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;+&lt;/span&gt; &lt;span class="n"&gt;treatment_lift&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;A_before&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;B_before&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;A_after&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;B_after&lt;/span&gt;
&lt;/pre&gt;&lt;/div&gt;


&lt;p&gt;Per the formulas, if $cov(X, Y)=0$ then CUPED does "almost nothing" in $ Y'_i = Y_i - (X_i - \mu_X) \frac{cov(X, Y)}{var(X)} $. I say "almost", because in an experiment realization the covariance will not be exactly 0, but some small number. Simulating 1,000 A/B tests confirms this:&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;Simulating 1000 A/B tests, true treatment lift is 2...
Traditional    A/B testing, mean lift = 2.16, variance of lift = 5.26
CUPED adjusted A/B testing, mean lift = 2.16, variance of lift = 5.25
CUPED lift variance / traditional lift variance = 1.00
&lt;/pre&gt;&lt;/div&gt;
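&lt;p&gt;To make the "almost nothing" concrete, here is a small self-contained check, independent of the simulation code above (variable names are illustrative): with independent "before" and "after" samples, the fitted coefficient $ \theta = \frac{cov(X, Y)}{var(X)} $ comes out close to 0, and the adjustment leaves the variance essentially unchanged:&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;
```python
import numpy as np

rng = np.random.default_rng(0)
N = 1000 * 1000
# independent "before" and "after" samples, as in get_AB_samples_nocorr()
before = rng.normal(100, 10, N)
after = rng.normal(100, 10, N)

# theta is the CUPED regression coefficient cov(X, Y) / var(X)
theta = np.cov(before, after)[0, 1] / np.var(before)
adjusted = after - (before - before.mean()) * theta
print('theta = %.4f' % theta)
print('var(adjusted) / var(after) = %.4f' % (np.var(adjusted) / np.var(after)))
```
&lt;/pre&gt;&lt;/div&gt;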


&lt;p&gt;Visually:&lt;/p&gt;
&lt;p&gt;&lt;img src="/images/cuped6.png" alt="CUPED" style="width: 800px;"/&gt;&lt;/p&gt;
&lt;p&gt;&lt;img src="/images/cuped7.png" alt="CUPED" style="width: 800px;"/&gt;&lt;/p&gt;
&lt;p&gt;&lt;img src="/images/cuped8.png" alt="CUPED" style="width: 800px;"/&gt;&lt;/p&gt;
&lt;p&gt;&lt;img src="/images/cuped9.png" alt="CUPED" style="width: 800px;"/&gt;&lt;/p&gt;
&lt;h2&gt;Conclusion&lt;/h2&gt;
&lt;p&gt;CUPED is a win. It allows us to more accurately measure experimental lift if there is correlation between "before" and "after", without degrading measurements if there is not. I plan to do more posts on CUPED:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;Repeating these simulations for conversion experiments.&lt;/li&gt;
&lt;li&gt;Exploring how trends relate to correlation.&lt;/li&gt;
&lt;li&gt;Checking what happens if a Data Scientist evaluates an experiment using both the traditional and the CUPED adjusted method, and picks the more favorable outcome; I suspect this is not correct, and amounts to "p-hacking".&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;Finally, some meta-commentary: in almost all the posts about A/B testing here on Bytepawn, I run Monte Carlo simulations to make sure I don't fool myself. The last post was an exception: I didn't run MC simulations, I argued "intuitively", and (mis)led myself down a rabbit hole. This is a great lesson: it's best to double-check my intuition with simulations; they are relatively quick to implement and run, and well worth it.&lt;/p&gt;</content><category term="ab-testing"></category><category term="cuped"></category></entry><entry><title>A/B testing and the historic lift paradox [redacted]</title><link href="https://bytepawn.com/ab-testing-and-the-historic-lift-paradox.html" rel="alternate"></link><published>2021-07-25T00:00:00+02:00</published><updated>2021-07-25T00:00:00+02:00</updated><author><name>Marton Trencseni</name></author><id>tag:bytepawn.com,2021-07-25:/ab-testing-and-the-historic-lift-paradox.html</id><summary type="html">&lt;p&gt;I show how looking at historic "before" values in A/B testing can lead to an apparent paradox.&lt;br/&gt;&lt;br/&gt; &lt;img src="/images/historic_lift3.png" alt="Historic lift" style="width: 400px;"/&gt;&lt;/p&gt;</summary><content type="html">&lt;p&gt;&lt;strong&gt;This post has received very useful &lt;a href="https://news.ycombinator.com/item?id=27970438"&gt;comments on Hacker News&lt;/a&gt;. Multiple commenters pointed to &lt;a href="https://exp-platform.com/cuped/"&gt;CUPED&lt;/a&gt; (by Deng, Xu, Kohavi, Walker), a method to compute an adjusted evaluation metric using historic "before" data. If "before" and "after" are correlated, which is often the case, this adjusted metric has lower variance than the original metric. In practical A/B testing terms, this means that if we apply CUPED we will on average see lower p-values, which is desirable. So the explanation in the post below, which was that "before" should never be used, is &lt;u&gt;not correct&lt;/u&gt;! 
I redacted the misleading explanation to avoid confusing readers. Read this follow-up &lt;a href="https://bytepawn.com/reducing-variance-in-ab-testing-with-cuped.html"&gt;post about CUPED&lt;/a&gt;, which shows how to use correlations in historic data to reduce variance.&lt;/strong&gt;&lt;/p&gt;
&lt;h2&gt;Introduction&lt;/h2&gt;
&lt;p&gt;Recently I encountered a new paradox in A/B testing. It goes something like this:&lt;/p&gt;
&lt;ol&gt;
&lt;li&gt;Randomly split the set of people into treatment and control and apply the treatment to the treatment group.&lt;/li&gt;
&lt;li&gt;Measure the target metric M (e.g. \$ sales per person) for control and treatment for T days: $ \alpha_C $ and $ \alpha_T $.&lt;/li&gt;
&lt;li&gt;Using historic data, for the same groups, measure metric M for the T days &lt;em&gt;preceding&lt;/em&gt; the application of the treatment: $ \beta_C $ and $ \beta_T $.&lt;/li&gt;
&lt;li&gt;Compute the "control lift" and "target lift": $ L_C = \frac{\alpha_C}{\beta_C} - 1 $ and $ L_T = \frac{\alpha_T}{\beta_T} - 1 $&lt;/li&gt;
&lt;li&gt;The "effect lift" is then $ L = L_T - L_C $&lt;/li&gt;
&lt;/ol&gt;
&lt;p&gt;Let's put in some example numbers: We send some marketing materials to the treatment group, with the intention of increasing their spend. Measuring on a period of 28 days, the average spend in the treatment group is \$110 per person. For the same period, it's \$105 per person in the control group. Looking at historic data, we see that in the 28 days preceeding the experiment, the spends are \$101 per person and \$100 per person for treatment and control. So the "control lift" is \$105 / \$100 - 1 = 5% and the "treatment lift" is \$110 / \$101 - 1 = 8.9%, and the "effect lift" is 8.9% - 5% = 3.9%.&lt;/p&gt;
&lt;p&gt;&lt;img src="/images/historic_lift1.png" alt="." style="width: 400px;"/&gt;&lt;/p&gt;
&lt;p&gt;Note: red means no treatment, blue means treatment applied.&lt;/p&gt;
&lt;p&gt;What's the problem here? The simplest way to show that the logic above is wrong is to play around with the numbers a little bit. Let's change just one number, the top left number, from \$100 to \$95. This changes percentages like:&lt;/p&gt;
&lt;p&gt;&lt;img src="/images/historic_lift2.png" alt="." style="width: 400px;"/&gt;&lt;/p&gt;
&lt;p&gt;So now the "effect lift" is -1.6%, which means treatment didn't work. Right? But wait! If we look at the second row, we see that during the experiment, treatment spent \$110 per person and control spent \$105 per person, so the "experimental lift", equal to $ \frac{\alpha_T}{\alpha_C} -1 $, is 4.7%. &lt;strong&gt;It's a paradox!&lt;/strong&gt; Is treatment better or is control better? What's going on?&lt;/p&gt;
&lt;p&gt;&lt;img src="/images/historic_lift3.png" alt="." style="width: 400px;"/&gt;&lt;/p&gt;
&lt;p&gt;Note that this is not just a technical nuance. If we accept -1.6%, we say treatment is worse than control, and stop it going forward. If we accept 4.7%, we say treatment is better than control, the exact opposite. So the real-life business outcomes are very different!&lt;/p&gt;
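&lt;p&gt;For concreteness, the two competing calculations can be written out in a few lines (the function names here are mine, for illustration):&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;
```python
def effect_lift(alpha_T, alpha_C, beta_T, beta_C):
    # "effect lift": difference of the historic treatment and control lifts
    return (alpha_T / beta_T - 1) - (alpha_C / beta_C - 1)

def experimental_lift(alpha_T, alpha_C):
    # plain A/B comparison during the experiment period
    return alpha_T / alpha_C - 1

# original numbers: control before/after = 100/105, treatment before/after = 101/110
print('effect lift = %.2f%%' % (100 * effect_lift(110, 105, 101, 100)))    # 3.91%
print('experimental lift = %.2f%%' % (100 * experimental_lift(110, 105)))  # 4.76%
# change the control "before" number from 100 to 95
print('effect lift = %.2f%%' % (100 * effect_lift(110, 105, 101, 95)))     # -1.62%
```
&lt;/pre&gt;&lt;/div&gt;
&lt;p&gt;Changing a single "before" number flips the sign of the "effect lift", while the "experimental lift" is unaffected: exactly the paradox above.&lt;/p&gt;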
&lt;!--Explanation--&gt;

&lt;!---------------&gt;

&lt;!--Computing historic lifts like $ L_i = \frac{\alpha_i}{\beta_i} - 1 $ separately for control and treatment, and then subtracting them to get an "effect lift" $ L = L_T - L_C $ is **incorrect**. We just have to compare control and treatment. In the example above, 4.7% is the right lift, treatment is better, pending statistical significance.--&gt;

&lt;!--&lt;img src="/images/historic_lift4.png" alt="." style="width: 400px;"/&gt;--&gt;

&lt;!--Why? **There is no value in looking at the "before" numbers, since before the treatment was applied to the treatment group, control and treatment were the same!** Any difference between $\beta_i$s is a random fluctuation.--&gt;

&lt;!--Imagine having a very large sample size N (in math terms, $ N \to \infty) $, then $ \beta_C = \beta_T = \beta $, the conversion without the treatment applied. However, if that's the case, then $ L = L_T - L_C = \frac{\alpha_T}{\beta} - 1 - \frac{\alpha_T}{\beta} + 1 = \frac{\alpha_T - \alpha_C}{\beta} $. But, if N is very large, then also $ \beta = \alpha_C $, so $ L = \frac{\alpha_T - \alpha_C}{\alpha_C} = \frac{\alpha_T}{\alpha_C} - 1 $, which is just the usual "experimental lift" we compute for an A/B test, without looking at "before" data!--&gt;

&lt;!--So actually, by picking a large enough sample size, the two are the same. So, the correct thing to do is to pick a large enough sample size (per [statistical significance considerations](https://bytepawn.com/building-intuition-p-values-statistical-significance.html) and [moving fast vs being sure](https://bytepawn.com/ab-tests-moving-fast-vs-being-sure.html) considerations), measure control and treatment, and compare the two.--&gt;

&lt;!--Counterarguments, answered--&gt;

&lt;!------------------------------&gt;

&lt;!--**But suppose we don't have enough samples, by using the $L_i$ historic lifts, aren't we improving on things?**--&gt;

&lt;!--&lt;br/&gt;--&gt;

&lt;!--No! If we don't have enough of a sample size, then those historic measurement themselves will be off due to random fluctuations. We're not gaining anything by baselining on the $\beta_i$s like $\frac{\alpha_i}{\beta_i} $, since the $\beta_i$s themselves will be off, if N is too low. And if we have enough N samples, then we don't need to.--&gt;

&lt;!--**But if we don't have enough samples, then can't we look at the historic $ \beta_i $ baselines, and control for the non-representative population we have?**--&gt;

&lt;!--&lt;br/&gt;--&gt;

&lt;!--The problem is, this line of thinking explicitly assumes that the groups are not representative of the population, which invalidates the whole experiment. If our groups are not representative, and we implicitly admit that, then any results from the experiment will also not be representative and hence not conclusive.--&gt;

&lt;!--**Okay, but what if there is a seasonal effect between "before" and "after"?**--&gt;

&lt;!--&lt;br/&gt;--&gt;

&lt;!--A/B testing is not about seasonal effects. In an A/B test, we measure A and B during the same time period, and compare the results. An A/B test doesn't tell us anything about whether A or B will be better a year down the line, or was better a year ago. We use statistical significance to make sure our conclusion of the current time period is within our accepted bounds of error, and usually we assume that during other seasons, with other (new) users, this will hold. But this is not always the case, for example if a large user base comes in from another country, past experiment results may no longer be valid.--&gt;

&lt;!--As a final point here: remember that the equations of statistics, like the above (or p-value calculations) don't have built-in alerts and red flags. We can always plug in numbers and get some results out. But it's up to us, the data scientist, to make sure that the experiment is sound at a conceptual level. The equations don't help us with that.--&gt;

&lt;!--Randomized controlled trials--&gt;

&lt;!--------------------------------&gt;

&lt;!--Remember that **an A/B test is a randomized controlled trial**. Here, the key word is **randomized**. We have some large user base, and we want to draw conclusions about this user base by **running an experiment on a subset of all users**. For this, we conduct an experiment with 2 branches, A (control) and B (treatment). We want (i) both A and B to have the same population distribution and (ii) the population distributions should approximate the overall user base's distribution. The way to accomplish this is (i) randomization, ie. randomly putting users into A and B, and (ii) picking a large enough N sample size. There is no need to compare to "before".--&gt;

&lt;!--But, what if randomization is not possible? For example, sometimes we want to test something offline, "in real life". For example, we want to change the pricing structure in one of our movie theaters, and see if sales goes up. In such a case, we can pick one movie theater site as treatment, change the pricing there (print new banners, etc), and compare it to another, control site. However, this is not a randomized controlled trial, since there is no randomization. We can't randomly split our customers into A and B and force them go to the treatment and control site. In such scenarios, "before" and "after" measurements may make sense, as discussed in the post [Comparing conversion at control and treatment sites](https://bytepawn.com/comparing-conversion-at-control-and-treatment-sites.html#comparing-conversion-at-control-and-treatment-sites). But this is a sub-optimal scenario, drawing robust and credible conclusions is hard and uncertain, since we can't **control** for external effects that may only affect one site. If possible, we should always run a randomized controlled trial, with a large enough sample size N!--&gt;

&lt;!--Conclusion--&gt;

&lt;!--------------&gt;

&lt;!--Experimentation and A/B testing seem simple, but there is a lot of hidden nuance, and hence beauty, to them. Mathematics is only part of experimentation; we also have to make sure the experiment is well designed and conducted. If you want to learn more, there are [more than 20 posts on A/B testing on Bytepawn](https://bytepawn.com/tag/ab-testing.html).--&gt;
&lt;p&gt;What is a full-stack data scientist? I would argue it's &lt;strong&gt;a data scientist who can sustainably achieve bottom-line impact, without blocking on external help from other roles&lt;/strong&gt; [1]. Having said that, in many situations external help from other roles is available in the form of tools, platforms or actual help, and it's perfectly fine to rely on that. But, in other situations, such as an early startup environment, or when working for a company with no established technology culture, help is not always available.&lt;/p&gt;
&lt;p&gt;The term Data Science covers a lot of ground. Being a Data Scientist is very different at a 10-person startup versus a 100,000-person company like Google or Facebook. What I write here is &lt;strong&gt;based on my personal experience&lt;/strong&gt; at small-to-medium-sized San Francisco style startups, big tech companies like Facebook, and non-tech companies in the delivery and retail space.&lt;/p&gt;
&lt;p&gt;In terms of tooling, fortunately there is a bewildering amount of open source and SaaS offerings for Data Science and Machine Learning. No single individual can learn all the tools; that is not what "full-stack" means. I believe the right approach is &lt;strong&gt;judgmental vertical cutting&lt;/strong&gt;: using our experience, judgement and taste, we have to pick a vertical slice of tooling (eg. Scala or Python, AWS or Azure, etc), stop thinking about the other 100 options [2], and get good at the tools we picked.&lt;/p&gt;
&lt;p&gt;&lt;img src="https://miro.medium.com/max/1219/1*WWJWqqjsHQdVNt3uzobe5g.png" alt="Data Scientist" style="width: 400px;"/&gt;&lt;/p&gt;
&lt;p&gt;What are the core skills a data scientist needs to sustainably achieve bottom-line impact, without blocking on external help from other roles?&lt;/p&gt;
&lt;h2&gt;Core skills&lt;/h2&gt;
&lt;p&gt;&lt;strong&gt;Reading and learning.&lt;/strong&gt; I put this first because it is a building block for the rest: because the field is moving so fast, reading and learning is a core skill. If we want to keep up to date with the latest results in neural networks or reinforcement learning, we also need to be able to scan academic articles to extract relevant information [3].&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Product sense&lt;/strong&gt; (or business sense). We need to understand the product or business, both qualitatively and quantitatively (metrics), so we can formulate ideas/hypotheses/experiments on how to improve them, eg. get more engagement, more active users, sell more tickets. Without this, however good we are technically, we won't know what's worth working on.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;SQL.&lt;/strong&gt; Hopefully your data already lives in an SQL database (a "data warehouse"). If not, the first order of business is moving it there. In my experience, we need to spend a significant amount of our time deep-diving data in SQL, so we understand what datasets we have, data quality, delays and landing times, what the core metric values are, etc. Also, it's good practice to feed results from our own work (A/B tests, tabular ML outputs) back into the database; then it's easy to work on them (SQL is better than &lt;code&gt;awk&lt;/code&gt; or &lt;code&gt;pandas&lt;/code&gt;), share them with others, and put dashboards on top.&lt;/p&gt;
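&lt;p&gt;As a minimal sketch of what feeding results back into the database can look like (using Python's built-in &lt;code&gt;sqlite3&lt;/code&gt; as a stand-in for a real data warehouse; the table and column names are made up):&lt;/p&gt;

```python
import sqlite3

# stand-in for a real data warehouse connection; in practice this
# would be eg. a Redshift, BigQuery or Snowflake connection
con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE ab_test_results (experiment_id TEXT, branch TEXT, conversion REAL)")

# write back A/B test results computed in Python, so they can be
# queried with SQL and put on dashboards later
results = [
    ("signup_funnel_v2", "A", 0.112),
    ("signup_funnel_v2", "B", 0.127),
]
con.executemany("INSERT INTO ab_test_results VALUES (?, ?, ?)", results)
con.commit()

# now it's easy to work on the results with plain SQL
rows = con.execute(
    "SELECT branch, conversion FROM ab_test_results ORDER BY branch"
).fetchall()
```

&lt;p&gt;Once the results sit in a table, they can be joined against other tables and dashboarded like any other dataset.&lt;/p&gt;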
&lt;p&gt;&lt;strong&gt;Dashboarding.&lt;/strong&gt; To understand data, we have to visualize it. It's just worth it to do this on dashboards, so that the visualization automatically updates when the underlying data changes. This makes sense even if we're the only ones looking at the dashboard (eg. a model performance dashboard).&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;ETL.&lt;/strong&gt; ETL is just techno speak for "data job automation". The point is, there are some jobs involving data which have to be triggered once an hour, day or week. Examples:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;Usually, when we're visualizing data, we're not working off the raw tables, but off smaller, derived summary tables. These have to be continually re-computed.&lt;/li&gt;
&lt;li&gt;Machine Learning models have to be re-trained and re-deployed, usually daily or weekly.&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;Setting up and running an ETL scheduler like Airflow is quick and easy, and it's just worth it. At tech companies, this is one of the jobs of data engineers.&lt;/p&gt;
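&lt;p&gt;The summary-table recomputation above can be sketched as follows (again using &lt;code&gt;sqlite3&lt;/code&gt; in place of a real warehouse; the tables and the &lt;code&gt;recompute_daily_summary&lt;/code&gt; function are made up, and in practice the function body is the kind of thing an Airflow task would run daily):&lt;/p&gt;

```python
import sqlite3

# toy "raw" events table; in a real setup this lives in the warehouse
con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE events (day TEXT, user_id INTEGER)")
con.executemany("INSERT INTO events VALUES (?, ?)",
                [("2021-07-01", 1), ("2021-07-01", 2), ("2021-07-02", 1)])
con.commit()

def recompute_daily_summary(con):
    # the kind of job an ETL scheduler triggers daily: drop and
    # re-create a small derived summary table from the raw data
    con.execute("DROP TABLE IF EXISTS daily_active_users")
    con.execute("""
        CREATE TABLE daily_active_users AS
        SELECT day, COUNT(DISTINCT user_id) AS dau
        FROM events GROUP BY day
    """)
    con.commit()

recompute_daily_summary(con)
summary = con.execute(
    "SELECT day, dau FROM daily_active_users ORDER BY day"
).fetchall()
```

&lt;p&gt;Dashboards then read the small &lt;code&gt;daily_active_users&lt;/code&gt; table instead of scanning the raw events.&lt;/p&gt;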
&lt;p&gt;&lt;strong&gt;Modeling.&lt;/strong&gt; This is what people usually think about when they think Data Science: building random forests, timeseries forecasting models, training neural networks for image recognition, etc. But I suspect that most of us only spend 10-20% of our time doing modeling. At big companies, there are lots of different roles (infra engineers, data engineers, ML engineers, analysts, data scientists), so each role ends up more focused on their own turf, and can safely assume the other roles will do their part. But most companies are not big tech companies, so we have to be much more full-stack and be able to step outside of just modeling and statistics.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Experimentation and A/B testing.&lt;/strong&gt; Experimentation is a core activity of Data Science, whether we're running an A/B test with a product team to determine which version of a signup funnel works best, or comparing a new version of a model to the current one in production. A/B testing is a beautiful topic with unexpected depth; check out the &lt;a href="https://bytepawn.com/tag/ab-testing.html"&gt;many articles on this blog on A/B testing&lt;/a&gt;.&lt;/p&gt;
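&lt;p&gt;As an illustration, the core computation behind a conversion A/B test can be sketched as a two-proportion z-test in pure Python; the conversion counts below are made up:&lt;/p&gt;

```python
from math import sqrt, erf

def two_proportion_z(conv_a, n_a, conv_b, n_b):
    # two-proportion z-test: did branch B convert differently from branch A?
    p_a, p_b = conv_a / n_a, conv_b / n_b
    p = (conv_a + conv_b) / (n_a + n_b)           # pooled conversion rate
    se = sqrt(p * (1 - p) * (1 / n_a + 1 / n_b))  # standard error under H0
    z = (p_b - p_a) / se
    # two-sided p-value from the standard normal CDF
    p_value = 2 * (1 - 0.5 * (1 + erf(abs(z) / sqrt(2))))
    return z, p_value

# made-up numbers: 10.0% vs 11.0% conversion on 10,000 users per branch
z, p_value = two_proportion_z(conv_a=1000, n_a=10000, conv_b=1100, n_b=10000)
```

&lt;p&gt;With these numbers the lift is statistically significant at the usual alpha of 0.05; the same calculation with small sample sizes would not be, which is why sample size planning comes before the test.&lt;/p&gt;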
&lt;p&gt;&lt;strong&gt;Math, probability theory, statistics.&lt;/strong&gt; Without this, we cannot make sense of the inputs and outputs of our models, or run A/B tests. It's painful when people make mistakes and don't notice that a correlation should be between -1 and 1, that a z-score of 100 is suspect (as is a MAPE of 0.1%), confuse accuracy with AUC, or don't know what a p-value means.&lt;/p&gt;
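&lt;p&gt;The correlation sanity check can be made concrete: computing Pearson's correlation from its definition shows why a value outside [-1, 1] always signals a bug (a toy sketch, not from the original post):&lt;/p&gt;

```python
from math import sqrt

def pearson_r(xs, ys):
    # Pearson correlation coefficient, computed from the definition
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = sqrt(sum((x - mx) ** 2 for x in xs))
    sy = sqrt(sum((y - my) ** 2 for y in ys))
    # by the Cauchy-Schwarz inequality, cov is at most sx * sy in
    # absolute value, so the result always lies in [-1, 1]
    return cov / (sx * sy)

# perfectly linear data, so the correlation is 1.0
r = pearson_r([1, 2, 3, 4], [2, 4, 6, 8])
```
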
&lt;p&gt;&lt;strong&gt;Programming and systems.&lt;/strong&gt; In my experience, a full stack Data Scientist needs to have a strong software engineering background. Eg. Python is necessary to do basic modeling work in notebooks, but also to productionalize models later. Some companies will have dedicated people for productionalizing, but not smaller shops. We have to be able to install packages and use tools such as &lt;code&gt;git&lt;/code&gt; and Github. Also, in our daily work we inevitably hit issues such as exceptions or out-of-memory errors. We have to be able to find and examine logs if our production model crashes.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Productionalizing, MLOps.&lt;/strong&gt; To sustainably achieve impact, at a high velocity, we need to build an infrastructure that allows developing, deploying, training, hyperparameter tuning (possibly with AutoML), monitoring, experimenting and logging of models. MLOps is a relatively new field (outside of big tech) and there are lots of options, so it's especially important to be judicious and apply taste.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Communication.&lt;/strong&gt; Whether working with a product team or supporting other teams with analytics, insights, experimentation and decision science, we need to be credible and communicate in a clear and concise way [4].&lt;/p&gt;
&lt;h2&gt;Conclusion&lt;/h2&gt;
&lt;p&gt;In my experience, in Data Science it's best to have a software engineering background combined with a quantitative background like mathematics, physics or economics [5]. That, combined with curiosity and never saying "that's not my job", over an extended period of time, is the way to success in Data Science.&lt;/p&gt;
&lt;p&gt;-&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Footnotes:&lt;/strong&gt;&lt;/p&gt;
&lt;p&gt;[1] Google, Stackoverflow, etc. are fine of course...&lt;/p&gt;
&lt;p&gt;[2] I will give an example of what the right amount of judgmental is. If we're currently using X, and somebody comes along and shows us that Y is 10x faster or accomplishes the same thing in 10x less code, then we should switch to Y and thank the person for having taught us something. But in general, getting bogged down in "why are we using X and not Y" type discussions is a waste of time, because there are just too many Ys. If X works reasonably well, the burden of proof is on the other side to show a 10x improvement. The 10x rule I stole from the database engineering community, where it's common wisdom that if a challenger database technology wants to displace the current king, it needs to be 10x better in some relevant dimension. Otherwise, people just won't go through the trouble of replacing their core database systems.&lt;/p&gt;
&lt;p&gt;[3] Academic articles are not read "cover to cover", especially not by practitioners. To get an article published, academics are forced to surround a core result or argument with 80% fluff. A good rule of thumb is to read the abstract and the introduction, then skip sections where they talk about previous work, and look at the core result, usually in sections 3-5. Benchmarks can usually also be safely skipped, as can the conclusion, which is the same as the abstract and introduction.&lt;/p&gt;
&lt;p&gt;[4] I don't believe in dumbing down analytics and data science with too much "storytelling". Partner roles in 2021 need to be analytics savvy.&lt;/p&gt;
&lt;p&gt;[5] If I had to pick one, I would pick Computer Science. As part of a good computer science degree, we learn all the probability and statistics that's necessary.&lt;/p&gt;</content><category term="data"></category><category term="fallacies"></category></entry><entry><title>Comparing NeuralProphet and Prophet for timeseries forecasting</title><link href="https://bytepawn.com/comparing-neuralprophet-and-prophet-for-timeseries-forecasting.html" rel="alternate"></link><published>2021-07-20T00:00:00+02:00</published><updated>2021-07-20T00:00:00+02:00</updated><author><name>Marton Trencseni</name></author><id>tag:bytepawn.com,2021-07-20:/comparing-neuralprophet-and-prophet-for-timeseries-forecasting.html</id><summary type="html">&lt;p&gt;I compare Prophet and NeuralProphet performance using a toy forecasting benchmark.&lt;br/&gt;&lt;br/&gt;&lt;img src="/images/neuralprophet5.png" alt="NeuralProphet" style="width: 400px;"/&gt;&lt;/p&gt;</summary><content type="html">&lt;h2&gt;Introduction&lt;/h2&gt;
&lt;p&gt;In the &lt;a href="https://bytepawn.com/timeseries-forecasting-with-prophet.html#timeseries-forecasting-with-prophet"&gt;last post I ran forecasting experiments with Prophet&lt;/a&gt;. In the conclusion I mentioned &lt;a href="https://neuralprophet.com/"&gt;NeuralProphet&lt;/a&gt;, a &lt;a href="https://pytorch.org/"&gt;Pytorch&lt;/a&gt; and neural network based alternative to Prophet.&lt;/p&gt;
&lt;p&gt;An overview of the NeuralProphet architecture from the &lt;a href="https://neuralprophet.com/model-overview/"&gt;documentation&lt;/a&gt;:&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;NeuralProphet is a decomposable time series model with the components, trend, seasonality, auto-regression, special events, future regressors and lagged regressors. Future regressors are external variables which have known future values for the forecast period whereas the lagged regressors are those external variables which only have values for the observed period. Trend can be modelled either as a linear or a piece-wise linear trend by using changepoints. Seasonality is modelled using fourier terms and thus can handle multiple seasonalities for high-frequency data. Auto-regression is handled using an implementation of AR-Net, an Auto-Regressive Feed-Forward Neural Network for time series. Lagged regressors are also modelled using separate Feed-Forward Neural Networks. Future regressors and special events are both modelled as covariates of the model with dedicated coefficients. &lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;Here I will compare Prophet and NeuralProphet forecast and runtime performance. As in the previous post, let's use a sample timeseries dataset which contains hourly energy usage data for the major US energy company &lt;a href="https://en.wikipedia.org/wiki/American_Electric_Power"&gt;American Electric Power (AEP)&lt;/a&gt;, in &lt;a href="https://en.wikipedia.org/wiki/Watt#Megawatt"&gt;megawatts&lt;/a&gt;. We expect this timeseries to have daily and weekly seasonality, so it's an ideal candidate for forecasting.&lt;/p&gt;
&lt;p&gt;The &lt;a href="https://github.com/mtrencseni/playground/blob/master/Forecasting%20with%20NeuralProphet.ipynb"&gt;notebook for this post is on Github&lt;/a&gt;.&lt;/p&gt;
&lt;h2&gt;Getting started with NeuralProphet&lt;/h2&gt;
&lt;p&gt;Getting started with NeuralProphet is easy; the library's interface is similar to Prophet's, though, unfortunately, not identical:&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;span class="c1"&gt;# download data&lt;/span&gt;
&lt;span class="n"&gt;df&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;pd&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;read_csv&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;https://github.com/khsieh18/Time-Series/raw/master/AEP_hourly.csv&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="c1"&gt;# rename columns, NeuralProphet expects ds and y&lt;/span&gt;
&lt;span class="n"&gt;df&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;columns&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;ds&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s1"&gt;&amp;#39;y&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;
&lt;span class="n"&gt;df&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;ds&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;df&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;ds&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;astype&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;datetime64[ns]&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="c1"&gt;# keep training data&lt;/span&gt;
&lt;span class="n"&gt;training_days&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;2&lt;/span&gt;&lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="mi"&gt;365&lt;/span&gt;
&lt;span class="n"&gt;forecast_days&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;30&lt;/span&gt;
&lt;span class="n"&gt;df&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;df&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;sort_values&lt;/span&gt;&lt;span class="p"&gt;([&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;ds&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;])&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;head&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;training_days&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt; &lt;span class="mi"&gt;24&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="n"&gt;df&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;index&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;np&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;arange&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nb"&gt;len&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;df&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;
&lt;span class="c1"&gt;# train model&lt;/span&gt;
&lt;span class="n"&gt;model&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;NeuralProphet&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;yearly_seasonality&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="bp"&gt;True&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="n"&gt;model&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;fit&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;df&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;freq&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="s2"&gt;&amp;quot;H&amp;quot;&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="c1"&gt;# forecast&lt;/span&gt;
&lt;span class="n"&gt;df_predict&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;model&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;make_future_dataframe&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;df&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;periods&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;forecast_days&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt; &lt;span class="mi"&gt;24&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="n"&gt;df_predict&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;model&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;predict&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;df_predict&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="n"&gt;fig&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;model&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;plot&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;df_predict&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/pre&gt;&lt;/div&gt;


&lt;p&gt;Yields:&lt;/p&gt;
&lt;p&gt;&lt;img src="/images/neuralprophet1.png" alt="NeuralProphet" style="width: 650px;"/&gt;&lt;/p&gt;
&lt;h2&gt;NeuralProphet vs Prophet forecast performance&lt;/h2&gt;
&lt;p&gt;Let's compare NeuralProphet vs Prophet forecast performance, in terms of &lt;a href="https://en.wikipedia.org/wiki/Mean_absolute_percentage_error"&gt;Mean Absolute Percentage Error (MAPE)&lt;/a&gt; (strictly speaking, its symmetric variant sMAPE, which is what the helper below computes) and running time (in seconds, training and forecasting time combined, on an 8-core 64GB Intel Macbook Pro). For both MAPE and runtime, lower is better. The following helper functions have the core logic:&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;compute_mape&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;df&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;df_predict&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;forecast_days&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
    &lt;span class="n"&gt;df_cross&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;df&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;tail&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;forecast_days&lt;/span&gt;&lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="mi"&gt;24&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;merge&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;right&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;df_predict&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;on&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;ds&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;suffixes&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s1"&gt;&amp;#39;_predict&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;])&lt;/span&gt;
    &lt;span class="n"&gt;df_cross&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;df_cross&lt;/span&gt;&lt;span class="p"&gt;[[&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;ds&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s1"&gt;&amp;#39;gt&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s1"&gt;&amp;#39;yhat&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;]]&lt;/span&gt;
    &lt;span class="n"&gt;mape&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;mean&lt;/span&gt;&lt;span class="p"&gt;([&lt;/span&gt;&lt;span class="mi"&gt;2&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt; &lt;span class="nb"&gt;abs&lt;/span&gt;&lt;span class="p"&gt;((&lt;/span&gt;&lt;span class="n"&gt;row&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;gt&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="o"&gt;-&lt;/span&gt; &lt;span class="n"&gt;row&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;yhat&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;])&lt;/span&gt; &lt;span class="o"&gt;/&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;row&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;gt&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="o"&gt;+&lt;/span&gt; &lt;span class="n"&gt;row&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;yhat&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;]))&lt;/span&gt; &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;_&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;row&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;df_cross&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;iterrows&lt;/span&gt;&lt;span class="p"&gt;()])&lt;/span&gt;
    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;mape&lt;/span&gt;

&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;prepare_dfs&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;csv_path&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;training_days&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;forecast_days&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;drop_ratio&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mf"&gt;0.0&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
    &lt;span class="n"&gt;df&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;pd&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;read_csv&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;csv_path&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="c1"&gt;# rename columns, Prophet expects ds and y&lt;/span&gt;
    &lt;span class="n"&gt;df&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;columns&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;ds&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s1"&gt;&amp;#39;y&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;
    &lt;span class="n"&gt;df&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;ds&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;df&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;ds&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;astype&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;datetime64[ns]&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="n"&gt;df&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;df&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;sort_values&lt;/span&gt;&lt;span class="p"&gt;([&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;ds&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;])&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;head&lt;/span&gt;&lt;span class="p"&gt;((&lt;/span&gt;&lt;span class="n"&gt;training_days&lt;/span&gt; &lt;span class="o"&gt;+&lt;/span&gt; &lt;span class="n"&gt;forecast_days&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt; &lt;span class="mi"&gt;24&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="n"&gt;df&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;index&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;np&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;arange&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nb"&gt;len&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;df&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;
    &lt;span class="c1"&gt;# save ground truth&lt;/span&gt;
    &lt;span class="n"&gt;df&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;gt&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;df&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;y&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;
    &lt;span class="c1"&gt;# wipe target variable y for to-be-forecasted section&lt;/span&gt;
    &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;i&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;row&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;df&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;iterrows&lt;/span&gt;&lt;span class="p"&gt;():&lt;/span&gt;
        &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;i&lt;/span&gt; &lt;span class="o"&gt;&amp;gt;=&lt;/span&gt; &lt;span class="n"&gt;training_days&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt; &lt;span class="mi"&gt;24&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
            &lt;span class="n"&gt;df&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;at&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;i&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s1"&gt;&amp;#39;y&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="bp"&gt;None&lt;/span&gt;
    &lt;span class="n"&gt;drop_inds&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;i&lt;/span&gt; &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;i&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="nb"&gt;range&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nb"&gt;len&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;df&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt; &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;random&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt; &lt;span class="o"&gt;&amp;lt;&lt;/span&gt; &lt;span class="n"&gt;drop_ratio&lt;/span&gt; &lt;span class="ow"&gt;and&lt;/span&gt; &lt;span class="n"&gt;i&lt;/span&gt; &lt;span class="o"&gt;&amp;lt;&lt;/span&gt; &lt;span class="nb"&gt;len&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;df&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;-&lt;/span&gt; &lt;span class="n"&gt;forecast_days&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt; &lt;span class="mi"&gt;24&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;
    &lt;span class="n"&gt;df&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;df&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;drop&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;df&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;index&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;drop_inds&lt;/span&gt;&lt;span class="p"&gt;])&lt;/span&gt;
    &lt;span class="n"&gt;df_train&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;df&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;dropna&lt;/span&gt;&lt;span class="p"&gt;()[[&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;ds&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s1"&gt;&amp;#39;y&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;]]&lt;/span&gt;
    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;df&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;df_train&lt;/span&gt;

&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;fbprophet_test&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;model&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;csv_path&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;training_days&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;forecast_days&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;drop_ratio&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mf"&gt;0.0&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
    &lt;span class="n"&gt;df&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;df_train&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;prepare_dfs&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;csv_path&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;training_days&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;forecast_days&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;drop_ratio&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="n"&gt;model&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;fit&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;df_train&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="n"&gt;df_predict&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;df&lt;/span&gt;&lt;span class="p"&gt;[[&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;ds&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;]]&lt;/span&gt;
    &lt;span class="n"&gt;df_predict&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;model&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;predict&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;df_predict&lt;/span&gt;&lt;span class="p"&gt;)[[&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;ds&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s1"&gt;&amp;#39;yhat&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;]]&lt;/span&gt;
    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;compute_mape&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;df&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;df_predict&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;forecast_days&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;neuralprophet_test&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;model&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;csv_path&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;training_days&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;forecast_days&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;drop_ratio&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mf"&gt;0.0&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
    &lt;span class="n"&gt;df&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;df_train&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;prepare_dfs&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;csv_path&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;training_days&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;forecast_days&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;drop_ratio&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="n"&gt;model&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;fit&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;df_train&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;freq&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="s2"&gt;&amp;quot;H&amp;quot;&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="n"&gt;df_predict&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;model&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;make_future_dataframe&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;df_train&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;periods&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;forecast_days&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt; &lt;span class="mi"&gt;24&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="n"&gt;df_predict&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;model&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;predict&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;df_predict&lt;/span&gt;&lt;span class="p"&gt;)[[&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;ds&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s1"&gt;&amp;#39;yhat1&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;]]&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;rename&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;columns&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;yhat1&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="s1"&gt;&amp;#39;yhat&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;})&lt;/span&gt;
    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;compute_mape&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;df&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;df_predict&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;forecast_days&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/pre&gt;&lt;/div&gt;


&lt;p&gt;The differences between &lt;code&gt;fbprophet_test()&lt;/code&gt; and &lt;code&gt;neuralprophet_test()&lt;/code&gt; show the minor differences between the two forecasting APIs.&lt;/p&gt;
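&lt;p&gt;Both helpers return the result of &lt;code&gt;compute_mape()&lt;/code&gt;, which is defined earlier in the post; a plausible reconstruction, assuming &lt;code&gt;df&lt;/code&gt; still carries the actual &lt;code&gt;y&lt;/code&gt; values over the forecast horizon:&lt;/p&gt;

```python
import numpy as np
import pandas as pd

def compute_mape(df, df_predict, forecast_days):
    # Join actuals with predictions on the timestamp column 'ds',
    # keep only the forecast horizon (hourly data, so forecast_days * 24 rows),
    # and average the absolute percentage errors.
    merged = df.merge(df_predict, on='ds')
    horizon = merged.tail(forecast_days * 24)
    return np.mean(np.abs(horizon['y'] - horizon['yhat']) / horizon['y'])
```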
&lt;p&gt;Let's compare training on &lt;code&gt;1, 2, 3, 4, 5&lt;/code&gt; years of hourly data and forecasting on &lt;code&gt;1, 2, 3, 6, 9, 12&lt;/code&gt; months of hourly data:&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;span class="n"&gt;df&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;pd&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;read_csv&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;https://github.com/khsieh18/Time-Series/raw/master/AEP_hourly.csv&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="n"&gt;training_years&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;2&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;3&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;4&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;5&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;
&lt;span class="n"&gt;forecast_months&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;2&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;3&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;6&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;9&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;12&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;
&lt;span class="n"&gt;models&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;
    &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="k"&gt;lambda&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;Prophet&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;yearly_seasonality&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="bp"&gt;True&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt; &lt;span class="n"&gt;fbprophet_test&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt;
    &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="k"&gt;lambda&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;NeuralProphet&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;yearly_seasonality&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="bp"&gt;True&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt; &lt;span class="n"&gt;neuralprophet_test&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt;
&lt;span class="p"&gt;]&lt;/span&gt;
&lt;span class="n"&gt;results&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[]&lt;/span&gt;
&lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;ty&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;training_years&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
    &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;fm&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;forecast_months&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
        &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;funcs&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;models&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
            &lt;span class="n"&gt;training_days&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nb"&gt;int&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;ty&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt; &lt;span class="mi"&gt;365&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
            &lt;span class="n"&gt;forecast_days&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nb"&gt;int&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;fm&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt; &lt;span class="mi"&gt;30&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
            &lt;span class="n"&gt;start&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;time&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;time&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
            &lt;span class="n"&gt;model&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;funcs&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;]()&lt;/span&gt;
            &lt;span class="n"&gt;test_func&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;funcs&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;
            &lt;span class="n"&gt;mape&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;test_func&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;model&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;csv_path&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;training_days&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;forecast_days&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
            &lt;span class="n"&gt;elapsed&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;time&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;time&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt; &lt;span class="o"&gt;-&lt;/span&gt; &lt;span class="n"&gt;start&lt;/span&gt;
            &lt;span class="k"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;&lt;/span&gt;&lt;span class="si"&gt;%s&lt;/span&gt;&lt;span class="s1"&gt;, training years=&lt;/span&gt;&lt;span class="si"&gt;%d&lt;/span&gt;&lt;span class="s1"&gt;, forecast months=&lt;/span&gt;&lt;span class="si"&gt;%d&lt;/span&gt;&lt;span class="s1"&gt;, MAPE = &lt;/span&gt;&lt;span class="si"&gt;%.2f&lt;/span&gt;&lt;span class="s1"&gt;, elapsed secs = &lt;/span&gt;&lt;span class="si"&gt;%.2f&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;&lt;/span&gt; &lt;span class="o"&gt;%&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;model&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="vm"&gt;__class__&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="vm"&gt;__name__&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;ty&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;fm&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;mape&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;elapsed&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;
            &lt;span class="n"&gt;results&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;append&lt;/span&gt;&lt;span class="p"&gt;((&lt;/span&gt;&lt;span class="n"&gt;model&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="vm"&gt;__class__&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="vm"&gt;__name__&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;ty&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;fm&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;mape&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;elapsed&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;
&lt;/pre&gt;&lt;/div&gt;
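&lt;p&gt;The collected &lt;code&gt;results&lt;/code&gt; tuples can be turned into comparison plots with matplotlib; a rough sketch of one MAPE panel (this is not the exact code used for the figures in this post, and &lt;code&gt;plot_mape_panel&lt;/code&gt; is a hypothetical helper name):&lt;/p&gt;

```python
import matplotlib
matplotlib.use('Agg')  # non-interactive backend, so this also runs headless
import matplotlib.pyplot as plt

def plot_mape_panel(results, training_years, out_path):
    # results holds (model_name, training_years, forecast_months, mape, elapsed) tuples
    fig, ax = plt.subplots()
    for name in sorted({r[0] for r in results}):
        rows = [r for r in results if r[0] == name and r[1] == training_years]
        ax.plot([r[2] for r in rows], [r[3] for r in rows], marker='o', label=name)
    ax.set_xlabel('forecast months')
    ax.set_ylabel('MAPE')
    ax.set_ylim(0, 0.25)
    ax.legend()
    fig.savefig(out_path)
    plt.close(fig)
```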


&lt;p&gt;Plotting the results, both in terms of MAPE and runtime in seconds; lower is better for both. On all plots, the x-axis is forecast months (from 1 to 12); on the left-side plots, the y-axis is MAPE (all axes go from 0 to 0.25), while the right-side plots show runtime in seconds (not grounded at 0):&lt;/p&gt;
&lt;p&gt;&lt;code&gt;training_years = 1&lt;/code&gt;&lt;/p&gt;
&lt;p&gt;&lt;img src="/images/neuralprophet2.png" alt="NeuralProphet" style="width: 650px;"/&gt;&lt;/p&gt;
&lt;p&gt;&lt;code&gt;training_years = 2&lt;/code&gt;&lt;/p&gt;
&lt;p&gt;&lt;img src="/images/neuralprophet3.png" alt="NeuralProphet" style="width: 650px;"/&gt;&lt;/p&gt;
&lt;p&gt;&lt;code&gt;training_years = 3&lt;/code&gt;&lt;/p&gt;
&lt;p&gt;&lt;img src="/images/neuralprophet4.png" alt="NeuralProphet" style="width: 650px;"/&gt;&lt;/p&gt;
&lt;p&gt;&lt;code&gt;training_years = 4&lt;/code&gt;&lt;/p&gt;
&lt;p&gt;&lt;img src="/images/neuralprophet5.png" alt="NeuralProphet" style="width: 650px;"/&gt;&lt;/p&gt;
&lt;p&gt;&lt;code&gt;training_years = 5&lt;/code&gt;&lt;/p&gt;
&lt;p&gt;&lt;img src="/images/neuralprophet6.png" alt="NeuralProphet" style="width: 650px;"/&gt;&lt;/p&gt;
&lt;p&gt;Takeaways:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;NeuralProphet is significantly faster than Prophet, even running on a MacBook Pro without GPU support for PyTorch&lt;/li&gt;
&lt;li&gt;NeuralProphet has lower (better) MAPE in most cases&lt;/li&gt;
&lt;/ul&gt;
&lt;h2&gt;Handling missing data&lt;/h2&gt;
&lt;p&gt;Let's fix training years at 5 and forecast months at 6, but vary the drop ratio between 0.0 and 0.5, meaning that up to 50% of the training rows are randomly dropped, and see how MAPE varies:&lt;/p&gt;
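&lt;p&gt;The random row dropping is done inside &lt;code&gt;prepare_dfs()&lt;/code&gt;, defined earlier in the post; a plausible sketch of that step using pandas &lt;code&gt;sample()&lt;/code&gt; (&lt;code&gt;drop_rows&lt;/code&gt; is a hypothetical helper name):&lt;/p&gt;

```python
import pandas as pd

def drop_rows(df_train, drop_ratio, seed=None):
    # Randomly discard a drop_ratio fraction of the training rows to simulate
    # missing data, then restore chronological order.
    return df_train.sample(frac=1.0 - drop_ratio, random_state=seed).sort_values('ds')
```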
&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;span class="n"&gt;df&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;pd&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;read_csv&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;https://github.com/khsieh18/Time-Series/raw/master/AEP_hourly.csv&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="n"&gt;ty&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;5&lt;/span&gt;
&lt;span class="n"&gt;fm&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;6&lt;/span&gt;
&lt;span class="n"&gt;drop_ratios&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="mf"&gt;0.0&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mf"&gt;0.1&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mf"&gt;0.2&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mf"&gt;0.3&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mf"&gt;0.4&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mf"&gt;0.5&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;
&lt;span class="n"&gt;models&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;
    &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="k"&gt;lambda&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;Prophet&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;yearly_seasonality&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="bp"&gt;True&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt; &lt;span class="n"&gt;fbprophet_test&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt;
    &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="k"&gt;lambda&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;NeuralProphet&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;yearly_seasonality&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="bp"&gt;True&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt; &lt;span class="n"&gt;neuralprophet_test&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt;
&lt;span class="p"&gt;]&lt;/span&gt;
&lt;span class="n"&gt;results2&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[]&lt;/span&gt;
&lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;drop_ratio&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;drop_ratios&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
    &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;funcs&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;models&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
        &lt;span class="n"&gt;training_days&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nb"&gt;int&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;ty&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt; &lt;span class="mi"&gt;365&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
        &lt;span class="n"&gt;forecast_days&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nb"&gt;int&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;fm&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt; &lt;span class="mi"&gt;30&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
        &lt;span class="n"&gt;start&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;time&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;time&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
        &lt;span class="n"&gt;model&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;funcs&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;]()&lt;/span&gt;
        &lt;span class="n"&gt;test_func&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;funcs&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;
        &lt;span class="n"&gt;mape&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;test_func&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;model&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;csv_path&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;training_days&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;forecast_days&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;drop_ratio&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
        &lt;span class="n"&gt;elapsed&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;time&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;time&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt; &lt;span class="o"&gt;-&lt;/span&gt; &lt;span class="n"&gt;start&lt;/span&gt;
        &lt;span class="k"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;&lt;/span&gt;&lt;span class="si"&gt;%s&lt;/span&gt;&lt;span class="s1"&gt;, training years=&lt;/span&gt;&lt;span class="si"&gt;%d&lt;/span&gt;&lt;span class="s1"&gt;, forecast months=&lt;/span&gt;&lt;span class="si"&gt;%d&lt;/span&gt;&lt;span class="s1"&gt;, MAPE = &lt;/span&gt;&lt;span class="si"&gt;%.2f&lt;/span&gt;&lt;span class="s1"&gt;, elapsed secs = &lt;/span&gt;&lt;span class="si"&gt;%.2f&lt;/span&gt;&lt;span class="s1"&gt;, drop_ratio=&lt;/span&gt;&lt;span class="si"&gt;%.1f&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;&lt;/span&gt; 
            &lt;span class="o"&gt;%&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;model&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="vm"&gt;__class__&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="vm"&gt;__name__&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;ty&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;fm&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;mape&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;elapsed&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;drop_ratio&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;
        &lt;span class="n"&gt;results2&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;append&lt;/span&gt;&lt;span class="p"&gt;((&lt;/span&gt;&lt;span class="n"&gt;model&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="vm"&gt;__class__&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="vm"&gt;__name__&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;ty&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;fm&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;mape&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;elapsed&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;drop_ratio&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;
&lt;span class="k"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;&lt;/span&gt;&lt;span class="se"&gt;\n&lt;/span&gt;&lt;span class="s1"&gt;Done!&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/pre&gt;&lt;/div&gt;


&lt;p&gt;Yields:&lt;/p&gt;
&lt;p&gt;&lt;img src="/images/neuralprophet7.png" alt="NeuralProphet" style="width: 650px;"/&gt;&lt;/p&gt;
&lt;p&gt;Interestingly, neither Prophet nor NeuralProphet is significantly affected by dropping up to 50% of the training rows; NeuralProphet does a slightly better job at higher &lt;code&gt;drop_ratio&lt;/code&gt;s.&lt;/p&gt;
&lt;h2&gt;Conclusion&lt;/h2&gt;
&lt;p&gt;This toy benchmark is not conclusive, but it indicates that NeuralProphet is competitive with Prophet on MAPE, and much faster in terms of runtime. Since NeuralProphet is quite similar to Prophet and easy to use, it's worth checking out for real-life production use cases to save time and possibly gain a few MAPE points.&lt;/p&gt;</content><category term="modeling"></category><category term="timeseries"></category><category term="prophet"></category><category term="neuralprophet"></category></entry><entry><title>Timeseries forecasting with Prophet</title><link href="https://bytepawn.com/timeseries-forecasting-with-prophet.html" rel="alternate"></link><published>2021-07-18T00:00:00+02:00</published><updated>2021-07-18T00:00:00+02:00</updated><author><name>Marton Trencseni</name></author><id>tag:bytepawn.com,2021-07-18:/timeseries-forecasting-with-prophet.html</id><summary type="html">&lt;p&gt;Prophet is a simple to use timeseries forecasting library by Facebook.&lt;br/&gt;&lt;br/&gt;&lt;img src="/images/prophet1.png" alt="Prophet" style="width: 400px;"/&gt;&lt;/p&gt;</summary><content type="html">&lt;h2&gt;Introduction&lt;/h2&gt;
&lt;p&gt;&lt;a href="https://facebook.github.io/prophet/"&gt;Prophet&lt;/a&gt;, formerly FBProphet, is a best-of-class timeseries forecasting library from Facebook. It is open source, released by Facebook's Core Data Science Team. At Majid Al Futtaim we use it on a regular basis. It is a "one-shot" forecasting solution because it gives close-to-optimal forecasts with default arguments, without extensive parameter tweaking or feature engineering.&lt;/p&gt;
&lt;p&gt;Here I will show how to use Prophet without going into too much detail about its architecture. For that, check out the extensive documentation and other resources:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;&lt;a href="https://facebook.github.io/prophet/"&gt;Official Prophet documentation&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://towardsdatascience.com/a-quick-start-of-time-series-forecasting-with-a-practical-example-using-fb-prophet-31c4447a2274"&gt;Getting started guide on TowardsDataScience&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://towardsdatascience.com/time-series-analysis-with-facebook-prophet-how-it-works-and-how-to-use-it-f15ecf2c0e3a"&gt;Another guide on TowardsDataScience&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;The &lt;a href="https://github.com/mtrencseni/playground/blob/master/Forecasting%20with%20FBProphet.ipynb"&gt;Jupyter notebook for this post&lt;/a&gt; is up on Github.&lt;/p&gt;
&lt;h2&gt;Getting started&lt;/h2&gt;
&lt;p&gt;Getting started with Prophet is easy. Let's use a sample timeseries dataset which contains hourly energy usage data for the major US energy company &lt;a href="https://en.wikipedia.org/wiki/American_Electric_Power"&gt;American Electric Power (AEP)&lt;/a&gt;, in &lt;a href="https://en.wikipedia.org/wiki/Watt#Megawatt"&gt;megawatts&lt;/a&gt;. We expect this timeseries to have daily and weekly seasonality, so it's an ideal candidate for forecasting:&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;span class="c1"&gt;# download data&lt;/span&gt;
&lt;span class="n"&gt;df&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;pd&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;read_csv&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;https://github.com/khsieh18/Time-Series/raw/master/AEP_hourly.csv&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="c1"&gt;# rename columns, Prophet expects ds and y&lt;/span&gt;
&lt;span class="n"&gt;df&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;columns&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;ds&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s1"&gt;&amp;#39;y&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;
&lt;span class="n"&gt;df&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;ds&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;df&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;ds&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;astype&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;datetime64[ns]&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="c1"&gt;# keep training data&lt;/span&gt;
&lt;span class="n"&gt;training_days&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;2&lt;/span&gt;&lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="mi"&gt;365&lt;/span&gt;
&lt;span class="n"&gt;forecast_days&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;30&lt;/span&gt;
&lt;span class="n"&gt;df&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;df&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;sort_values&lt;/span&gt;&lt;span class="p"&gt;([&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;ds&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;])&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;head&lt;/span&gt;&lt;span class="p"&gt;((&lt;/span&gt;&lt;span class="n"&gt;training_days&lt;/span&gt; &lt;span class="o"&gt;+&lt;/span&gt; &lt;span class="n"&gt;forecast_days&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt; &lt;span class="mi"&gt;24&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="n"&gt;df&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;index&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;np&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;arange&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nb"&gt;len&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;df&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;
&lt;span class="c1"&gt;# wipe target variable y for to-be-forecasted section&lt;/span&gt;
&lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;i&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;row&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;df&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;iterrows&lt;/span&gt;&lt;span class="p"&gt;():&lt;/span&gt;
    &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;i&lt;/span&gt; &lt;span class="o"&gt;&amp;gt;=&lt;/span&gt; &lt;span class="n"&gt;training_days&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt; &lt;span class="mi"&gt;24&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
        &lt;span class="n"&gt;df&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;at&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;i&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s1"&gt;&amp;#39;y&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="bp"&gt;None&lt;/span&gt;
&lt;span class="n"&gt;df_train&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;df&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;dropna&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
&lt;span class="c1"&gt;# train model&lt;/span&gt;
&lt;span class="n"&gt;model&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;Prophet&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;yearly_seasonality&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="bp"&gt;True&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="n"&gt;model&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;fit&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;df_train&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="c1"&gt;# forecast&lt;/span&gt;
&lt;span class="n"&gt;df_predict&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;df&lt;/span&gt;&lt;span class="p"&gt;[[&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;ds&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;]]&lt;/span&gt;
&lt;span class="n"&gt;df_predict&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;model&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;predict&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;df_predict&lt;/span&gt;&lt;span class="p"&gt;)[[&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;ds&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s1"&gt;&amp;#39;yhat&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s1"&gt;&amp;#39;yhat_lower&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s1"&gt;&amp;#39;yhat_upper&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;]]&lt;/span&gt;
&lt;span class="n"&gt;fig&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;model&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;plot&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;df_predict&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/pre&gt;&lt;/div&gt;


&lt;p&gt;Prophet also allows us to visually examine the forecast:&lt;/p&gt;
&lt;p&gt;&lt;img src="/images/prophet1.png" alt="Prophet" style="width: 650px;"/&gt;&lt;/p&gt;
&lt;p&gt;In my experience, if a "default" run of Prophet like the one above achieves a certain &lt;a href="https://en.wikipedia.org/wiki/Mean_absolute_percentage_error"&gt;Mean Absolute Percentage Error (MAPE)&lt;/a&gt;, then investing 1-2 weeks in custom modeling can improve it by 2-3%, maybe 5% if we're lucky, but not more. So Prophet yields very good forecasts out of the box.&lt;/p&gt;
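&lt;p&gt;As a reminder, MAPE is the mean of the absolute errors divided by the actuals; a tiny illustration with NumPy:&lt;/p&gt;

```python
import numpy as np

actual = np.array([100.0, 200.0, 400.0])    # observed values
forecast = np.array([110.0, 180.0, 400.0])  # model predictions
# per-point absolute percentage errors: 0.1, 0.1, 0.0
mape = np.mean(np.abs(actual - forecast) / actual)
print(round(mape, 4))  # 0.0667
```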
&lt;p&gt;Speaking about MAPE, let's adjust the code above to compare the forecast to actuals and compute the MAPE explicitly. Training on 2 years worth of data and forecasting 30 days:&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;span class="c1"&gt;# download data&lt;/span&gt;
&lt;span class="n"&gt;df&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;pd&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;read_csv&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;https://github.com/khsieh18/Time-Series/raw/master/AEP_hourly.csv&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="c1"&gt;# rename columns, Prophet expects ds and y&lt;/span&gt;
&lt;span class="n"&gt;df&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;columns&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;ds&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s1"&gt;&amp;#39;y&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;
&lt;span class="n"&gt;df&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;ds&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;df&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;ds&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;astype&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;datetime64[ns]&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="c1"&gt;# keep training data&lt;/span&gt;
&lt;span class="n"&gt;training_days&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;2&lt;/span&gt;&lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="mi"&gt;365&lt;/span&gt;
&lt;span class="n"&gt;forecast_days&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;30&lt;/span&gt;
&lt;span class="n"&gt;df&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;df&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;sort_values&lt;/span&gt;&lt;span class="p"&gt;([&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;ds&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;])&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;head&lt;/span&gt;&lt;span class="p"&gt;((&lt;/span&gt;&lt;span class="n"&gt;training_days&lt;/span&gt; &lt;span class="o"&gt;+&lt;/span&gt; &lt;span class="n"&gt;forecast_days&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt; &lt;span class="mi"&gt;24&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="n"&gt;df&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;index&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;np&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;arange&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nb"&gt;len&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;df&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;
&lt;span class="c1"&gt;# save ground truth&lt;/span&gt;
&lt;span class="n"&gt;df&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;gt&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;df&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;y&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;
&lt;span class="c1"&gt;# wipe target variable y for to-be-forecasted section&lt;/span&gt;
&lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;i&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;row&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;df&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;iterrows&lt;/span&gt;&lt;span class="p"&gt;():&lt;/span&gt;
    &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;i&lt;/span&gt; &lt;span class="o"&gt;&amp;gt;=&lt;/span&gt; &lt;span class="n"&gt;training_days&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt; &lt;span class="mi"&gt;24&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
        &lt;span class="n"&gt;df&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;at&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;i&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s1"&gt;&amp;#39;y&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="bp"&gt;None&lt;/span&gt;
&lt;span class="n"&gt;df_train&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;df&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;dropna&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
&lt;span class="c1"&gt;# train model&lt;/span&gt;
&lt;span class="n"&gt;model&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;Prophet&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;yearly_seasonality&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="bp"&gt;True&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="n"&gt;model&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;fit&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;df_train&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="c1"&gt;# forecast&lt;/span&gt;
&lt;span class="n"&gt;df_predict&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;df&lt;/span&gt;&lt;span class="p"&gt;[[&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;ds&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;]]&lt;/span&gt;
&lt;span class="n"&gt;df_predict&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;model&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;predict&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;df_predict&lt;/span&gt;&lt;span class="p"&gt;)[[&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;ds&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s1"&gt;&amp;#39;yhat&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s1"&gt;&amp;#39;yhat_lower&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s1"&gt;&amp;#39;yhat_upper&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;]]&lt;/span&gt;
&lt;span class="n"&gt;fig&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;model&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;plot&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;df_predict&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="c1"&gt;# join train and predict&lt;/span&gt;
&lt;span class="n"&gt;df_cross&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;df&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;tail&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;forecast_days&lt;/span&gt;&lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="mi"&gt;24&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;merge&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;right&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;df_predict&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;on&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;ds&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;suffixes&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s1"&gt;&amp;#39;_predict&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;])&lt;/span&gt;
&lt;span class="n"&gt;df_cross&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;df_cross&lt;/span&gt;&lt;span class="p"&gt;[[&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;ds&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s1"&gt;&amp;#39;gt&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s1"&gt;&amp;#39;yhat&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;]]&lt;/span&gt;
&lt;span class="n"&gt;mape&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;mean&lt;/span&gt;&lt;span class="p"&gt;([&lt;/span&gt;&lt;span class="mi"&gt;2&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt; &lt;span class="nb"&gt;abs&lt;/span&gt;&lt;span class="p"&gt;((&lt;/span&gt;&lt;span class="n"&gt;row&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;gt&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="o"&gt;-&lt;/span&gt; &lt;span class="n"&gt;row&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;yhat&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;])&lt;/span&gt; &lt;span class="o"&gt;/&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;row&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;gt&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="o"&gt;+&lt;/span&gt; &lt;span class="n"&gt;row&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;yhat&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;]))&lt;/span&gt;
             &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;_&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;row&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;df_cross&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;iterrows&lt;/span&gt;&lt;span class="p"&gt;()])&lt;/span&gt;
&lt;span class="k"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;MAPE = &lt;/span&gt;&lt;span class="si"&gt;%.3f&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;&lt;/span&gt; &lt;span class="o"&gt;%&lt;/span&gt; &lt;span class="n"&gt;mape&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/pre&gt;&lt;/div&gt;


&lt;p&gt;This yields a MAPE of 0.12, or 12%, which is fairly typical for real-life multi-day forecasts.&lt;/p&gt;
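For reference, the per-row error computed in the loop above is actually the symmetric variant of MAPE (often called sMAPE, since the denominator averages actuals and forecasts). A minimal vectorized sketch of the same metric with numpy:

```python
import numpy as np

def smape(gt, yhat):
    """Symmetric MAPE, matching the row-wise loop above:
    mean of 2 * |gt - yhat| / (gt + yhat)."""
    gt = np.asarray(gt, dtype=float)
    yhat = np.asarray(yhat, dtype=float)
    return float(np.mean(2 * np.abs(gt - yhat) / (gt + yhat)))

print(smape([100, 200], [100, 200]))  # a perfect forecast scores 0.0
print(smape([100], [110]))            # a 10% miss scores roughly 0.095
```

On hourly series spanning years, the vectorized version is also much faster than iterating with iterrows.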
&lt;h2&gt;Holidays and periodicity&lt;/h2&gt;
&lt;p&gt;Internally, Prophet models the timeseries as:&lt;/p&gt;
&lt;p&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; $ y(t)= g(t) + s(t) + h(t) + ε(t) $&lt;/p&gt;
&lt;p&gt;where:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;$ g(t) $: growth, piecewise linear or logistic&lt;/li&gt;
&lt;li&gt;$ s(t) $: periodic changes (e.g. daily, weekly, yearly seasonality)&lt;/li&gt;
&lt;li&gt;$ h(t) $: holiday effects&lt;/li&gt;
&lt;li&gt;$ ε(t) $: error term&lt;/li&gt;
&lt;/ul&gt;
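To make the decomposition concrete, here is a small numpy sketch that generates a synthetic hourly series with exactly these four components; the specific shapes (a changepoint at day 15, a holiday bump on day 10) are illustrative choices of mine, not Prophet internals:

```python
import numpy as np

rng = np.random.default_rng(0)
t = np.arange(30 * 24)                            # 30 days of hourly timestamps

g = 0.01 * t + 0.02 * np.maximum(t - 15 * 24, 0)  # g(t): piecewise linear growth
s = np.sin(2 * np.pi * t / 24)                    # s(t): daily seasonality
h = np.where(t // 24 == 10, 3.0, 0.0)             # h(t): holiday effect on day 10
eps = rng.normal(0, 0.1, len(t))                  # eps(t): Gaussian error term

y = g + s + h + eps                               # the additive model
```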
&lt;p&gt;Adding holidays is as simple as:&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;span class="n"&gt;model&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;add_country_holidays&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;country_name&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;AE&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="c1"&gt;# for UAE holidays&lt;/span&gt;
&lt;/pre&gt;&lt;/div&gt;


&lt;p&gt;Regarding the $ h(t) $ term, most countries' holidays are correctly included out of the box, or you can add your own custom holidays.&lt;/p&gt;
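Custom holidays are supplied to the Prophet constructor as a DataFrame with 'holiday' and 'ds' columns, plus optional 'lower_window'/'upper_window' columns to extend the effect around each date. A sketch with made-up event dates:

```python
import pandas as pd

# hypothetical company-specific events, not real public holidays
custom_holidays = pd.DataFrame({
    'holiday': 'annual_sale',
    'ds': pd.to_datetime(['2023-11-24', '2024-11-29']),
    'lower_window': 0,   # effect starts on the day itself
    'upper_window': 2,   # ...and extends 2 days after
})

# model = Prophet(holidays=custom_holidays)  # then fit() as usual
```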
&lt;p&gt;Let's test how the periodic modeling term $ s(t) $ performs. As written above, out of the box Prophet assumes daily, weekly and yearly periodicity. Let's construct a sine wave time series which is perfectly periodic every 24 hours and fit Prophet:&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;span class="n"&gt;forecast_days&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;7&lt;/span&gt;
&lt;span class="n"&gt;df&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;pd&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;DataFrame&lt;/span&gt;&lt;span class="p"&gt;({&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;ds&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;pd&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;date_range&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;2020-01-01&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s1"&gt;&amp;#39;2020-01-31&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;freq&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;1H&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;closed&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;left&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;)})&lt;/span&gt;
&lt;span class="n"&gt;df&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;y&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mf"&gt;2.0&lt;/span&gt;

&lt;span class="n"&gt;spectrum&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;

&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;f&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;i&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;row&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;period_day&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
    &lt;span class="n"&gt;c&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;2&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt; &lt;span class="n"&gt;pi&lt;/span&gt; &lt;span class="o"&gt;/&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;period_day&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt; &lt;span class="mi"&gt;24&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;sin&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;c&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt; &lt;span class="n"&gt;i&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;i&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;row&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;df&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;iterrows&lt;/span&gt;&lt;span class="p"&gt;():&lt;/span&gt;
    &lt;span class="n"&gt;gt&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;row&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;y&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="o"&gt;+&lt;/span&gt; &lt;span class="nb"&gt;sum&lt;/span&gt;&lt;span class="p"&gt;([&lt;/span&gt;&lt;span class="n"&gt;f&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;i&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;row&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;period&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;period&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;spectrum&lt;/span&gt;&lt;span class="p"&gt;])&lt;/span&gt;
    &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;i&lt;/span&gt; &lt;span class="o"&gt;&amp;lt;&lt;/span&gt; &lt;span class="nb"&gt;len&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;df&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;-&lt;/span&gt; &lt;span class="n"&gt;forecast_days&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt; &lt;span class="mi"&gt;24&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
        &lt;span class="n"&gt;y&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;gt&lt;/span&gt;
    &lt;span class="k"&gt;else&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
        &lt;span class="n"&gt;y&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="bp"&gt;None&lt;/span&gt;
    &lt;span class="n"&gt;df&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;at&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;i&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s1"&gt;&amp;#39;y&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;y&lt;/span&gt;
    &lt;span class="n"&gt;df&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;at&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;i&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s1"&gt;&amp;#39;gt&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;gt&lt;/span&gt;

&lt;span class="n"&gt;df_train&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;df&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;dropna&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;

&lt;span class="n"&gt;model&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;Prophet&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
&lt;span class="n"&gt;model&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;fit&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;df_train&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="n"&gt;df_predict&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;df&lt;/span&gt;&lt;span class="p"&gt;[[&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;ds&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;]]&lt;/span&gt;
&lt;span class="n"&gt;df_predict&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;model&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;predict&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;df_predict&lt;/span&gt;&lt;span class="p"&gt;)[[&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;ds&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s1"&gt;&amp;#39;yhat&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s1"&gt;&amp;#39;yhat_lower&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s1"&gt;&amp;#39;yhat_upper&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;]]&lt;/span&gt;

&lt;span class="n"&gt;df_cross&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;df&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;tail&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;forecast_days&lt;/span&gt;&lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="mi"&gt;24&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;merge&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;right&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;df_predict&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;on&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;ds&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;suffixes&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s1"&gt;&amp;#39;_predict&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;])&lt;/span&gt;
&lt;span class="n"&gt;df_cross&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;df_cross&lt;/span&gt;&lt;span class="p"&gt;[[&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;ds&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s1"&gt;&amp;#39;gt&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s1"&gt;&amp;#39;yhat&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;]]&lt;/span&gt;

&lt;span class="n"&gt;mape&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;mean&lt;/span&gt;&lt;span class="p"&gt;([&lt;/span&gt;&lt;span class="mi"&gt;2&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt; &lt;span class="nb"&gt;abs&lt;/span&gt;&lt;span class="p"&gt;((&lt;/span&gt;&lt;span class="n"&gt;row&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;gt&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="o"&gt;-&lt;/span&gt; &lt;span class="n"&gt;row&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;yhat&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;])&lt;/span&gt; &lt;span class="o"&gt;/&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;row&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;gt&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="o"&gt;+&lt;/span&gt; &lt;span class="n"&gt;row&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;yhat&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;]))&lt;/span&gt;
             &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;_&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;row&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;df_cross&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;iterrows&lt;/span&gt;&lt;span class="p"&gt;()])&lt;/span&gt;
&lt;span class="k"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;MAPE = &lt;/span&gt;&lt;span class="si"&gt;%.3f&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;&lt;/span&gt; &lt;span class="o"&gt;%&lt;/span&gt; &lt;span class="n"&gt;mape&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="n"&gt;fig&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;model&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;plot&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;df_predict&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/pre&gt;&lt;/div&gt;


&lt;p&gt;The resulting MAPE is 0, which means it's a perfect fit:&lt;/p&gt;
&lt;p&gt;&lt;img src="/images/prophet2.png" alt="Prophet" style="width: 650px;"/&gt;&lt;/p&gt;
&lt;p&gt;By changing the spectrum of the function, we can superimpose several sine waves. For example, we can combine a 1-day periodic wave and a 7-day periodic wave by changing one line in the above code:&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;span class="n"&gt;spectrum&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;7&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;
&lt;/pre&gt;&lt;/div&gt;


&lt;p&gt;This now looks more funky, but it's still modeled with MAPE=0 by Prophet:&lt;/p&gt;
&lt;p&gt;&lt;img src="/images/prophet3.png" alt="Prophet" style="width: 650px;"/&gt;&lt;/p&gt;
&lt;p&gt;However, Prophet does not assume other periodicities. So if we add something else to the spectrum, the model's fit will no longer be perfect:&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;span class="n"&gt;spectrum&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;3&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;7&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;
&lt;/pre&gt;&lt;/div&gt;


&lt;p&gt;This yields MAPE=0.962, which means it's way off.&lt;/p&gt;
&lt;p&gt;&lt;img src="/images/prophet4.png" alt="Prophet" style="width: 650px;"/&gt;&lt;/p&gt;
&lt;p&gt;However, we can tell Prophet that we suspect the data has some specific periodicity:&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;span class="n"&gt;model&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;add_seasonality&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;name&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;monthly&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;period&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mi"&gt;3&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;fourier_order&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mi"&gt;3&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/pre&gt;&lt;/div&gt;


&lt;p&gt;After inserting this line, MAPE becomes 0 again.&lt;/p&gt;
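This works because add_seasonality fits a truncated Fourier series of the given period, with fourier_order controlling the number of sine/cosine pairs; a pure sine wave is just the first term of that series, so a low order is already enough. A quick numpy check that a single sine/cosine pair recovers the 3-day component exactly:

```python
import numpy as np

period = 3 * 24                       # 3-day period, in hours
t = np.arange(10 * period)
signal = np.sin(2 * np.pi * t / period)

# design matrix for a Fourier series of order 1: one sin/cos pair
X = np.column_stack([np.sin(2 * np.pi * t / period),
                     np.cos(2 * np.pi * t / period)])
coef, *_ = np.linalg.lstsq(X, signal, rcond=None)
fit = X @ coef

print(np.abs(fit - signal).max())     # ~0, a perfect reconstruction
```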
&lt;h2&gt;Random noise&lt;/h2&gt;
&lt;p&gt;To test the error term $ ε(t) $, let's add random Gaussian noise with an amplitude of 20% of the original signal to the training data:&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;span class="n"&gt;sigma&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mf"&gt;0.2&lt;/span&gt;
&lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;i&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;row&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;df_train&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;iterrows&lt;/span&gt;&lt;span class="p"&gt;():&lt;/span&gt;
    &lt;span class="n"&gt;gt&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;row&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;y&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="o"&gt;+&lt;/span&gt; &lt;span class="nb"&gt;sum&lt;/span&gt;&lt;span class="p"&gt;([&lt;/span&gt;&lt;span class="n"&gt;f&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;i&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;row&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;period&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;period&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;spectrum&lt;/span&gt;&lt;span class="p"&gt;])&lt;/span&gt;
    &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;i&lt;/span&gt; &lt;span class="o"&gt;&amp;lt;&lt;/span&gt; &lt;span class="nb"&gt;len&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;df_train&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;-&lt;/span&gt; &lt;span class="n"&gt;forecast_days&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt; &lt;span class="mi"&gt;24&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
        &lt;span class="n"&gt;y&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;gt&lt;/span&gt; &lt;span class="o"&gt;+&lt;/span&gt; &lt;span class="n"&gt;gauss&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;mu&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;sigma&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;sigma&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="k"&gt;else&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
        &lt;span class="n"&gt;y&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="bp"&gt;None&lt;/span&gt;
    &lt;span class="n"&gt;df_train&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;at&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;i&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s1"&gt;&amp;#39;y&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;y&lt;/span&gt;
    &lt;span class="n"&gt;df_train&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;at&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;i&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s1"&gt;&amp;#39;gt&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;gt&lt;/span&gt;
&lt;/pre&gt;&lt;/div&gt;


&lt;p&gt;Since Prophet explicitly models the error term, the MAPE is only 1-3% depending on the amount of training data.&lt;/p&gt;
&lt;h2&gt;Missing data&lt;/h2&gt;
&lt;p&gt;Prophet handles missing data out of the box, so we don't have to write any interpolation logic ourselves. Let's continue with the sine wave toy problem, but drop 20% of the rows:&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;span class="n"&gt;drop_ratio&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mf"&gt;0.2&lt;/span&gt;
&lt;span class="n"&gt;drop_inds&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;i&lt;/span&gt; &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;i&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="nb"&gt;range&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nb"&gt;len&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;df&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt; &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;random&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt; &lt;span class="o"&gt;&amp;lt;&lt;/span&gt; &lt;span class="n"&gt;drop_ratio&lt;/span&gt; &lt;span class="ow"&gt;and&lt;/span&gt; &lt;span class="n"&gt;i&lt;/span&gt; &lt;span class="o"&gt;&amp;lt;&lt;/span&gt; &lt;span class="nb"&gt;len&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;df&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;-&lt;/span&gt; &lt;span class="n"&gt;forecast_days&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt; &lt;span class="mi"&gt;24&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;
&lt;span class="n"&gt;df&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;df&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;drop&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;df&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;index&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;drop_inds&lt;/span&gt;&lt;span class="p"&gt;])&lt;/span&gt;
&lt;/pre&gt;&lt;/div&gt;


&lt;p&gt;The MAPE remains 0! Interestingly, even if I drop 90% of the rows, the MAPE is still 0. This is because the model only needs a handful of points to recover the daily sine wave, since daily periodicity is one of its built-in assumptions.&lt;/p&gt;
&lt;h2&gt;Other features&lt;/h2&gt;
&lt;p&gt;Other useful features of Prophet, not explored further in this post, include:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;&lt;strong&gt;External regressors:&lt;/strong&gt; tell Prophet that a portion of the history had some external effect (such as Covid-19) that it can regress on.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Floors and ceilings:&lt;/strong&gt; Prophet is domain agnostic, so it doesn't know about natural or reasonable limits. For example, if the tail end of the historic data slopes downwards, Prophet may pick up on that and produce a decreasing timeseries heading towards $ - \infty $, even though that may not make sense; e.g. megawatt usage per hour cannot be negative. By setting floors and ceilings, we inform Prophet of these "natural" bounds.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Changepoints:&lt;/strong&gt; Sometimes there is a distinct changepoint in the timeseries. For example, a SaaS company's daily active user (DAU) data may be flat until viral growth kicks in.&lt;/li&gt;
&lt;/ul&gt;
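As a sketch of the floors-and-ceilings feature: bounds are supplied as extra 'floor' and 'cap' columns on the dataframe, together with logistic growth. The column names are Prophet's convention; the values below are made up:

```python
import pandas as pd

df_bounded = pd.DataFrame({
    'ds': pd.date_range('2023-01-01', periods=100, freq='D'),
    'y': range(100),
})
df_bounded['floor'] = 0    # e.g. megawatt usage per hour cannot go below 0
df_bounded['cap'] = 500    # assumed saturation level

# model = Prophet(growth='logistic')        # reads the cap/floor columns
# model.fit(df_bounded)
# future = model.make_future_dataframe(periods=30)
# future['floor'], future['cap'] = 0, 500   # bounds are needed at predict time too
```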
&lt;h2&gt;Conclusion&lt;/h2&gt;
&lt;p&gt;Prophet is a highly useful, one-shot timeseries forecasting tool with defaults tuned to real-life product and business use-cases, such as daily, weekly and yearly seasonality and built-in holidays. I highly recommend it for everyday production use.&lt;/p&gt;
&lt;p&gt;If you are interested in timeseries forecasting, also check out &lt;a href="https://neuralprophet.com/"&gt;NeuralProphet&lt;/a&gt;, which is inspired by Prophet but uses &lt;a href="https://pytorch.org/"&gt;Pytorch&lt;/a&gt; and deep learning (not from Facebook), and &lt;a href="https://github.com/facebookresearch/Kats"&gt;Kats&lt;/a&gt; (also from Facebook), which provides a one-stop shop for time series analysis, including detection, forecasting, feature extraction and multivariate analysis.&lt;/p&gt;</content><category term="modeling"></category><category term="timeseries"></category><category term="prophet"></category></entry><entry><title>YOLO object detection architecture</title><link href="https://bytepawn.com/yolo-object-detection-architecture.html" rel="alternate"></link><published>2021-07-10T00:00:00+02:00</published><updated>2021-07-10T00:00:00+02:00</updated><author><name>Marton Trencseni</name></author><id>tag:bytepawn.com,2021-07-10:/yolo-object-detection-architecture.html</id><summary type="html">&lt;p&gt;I discuss the YOLO neural network architecture for object detection. &lt;br/&gt;&lt;br/&gt;&lt;img src="/images/yolov1_regions_notext.JPG" alt="YOLO architecture" style="width: 400px;"/&gt;&lt;/p&gt;</summary><content type="html">&lt;h2&gt;Introduction&lt;/h2&gt;
&lt;p&gt;In the &lt;a href="https://bytepawn.com/yolov5-object-detection-experiments.html"&gt;previous post I conducted simple experiments using YOLOv5&lt;/a&gt;. I checked how the model responds to rotation, scaling, stretching and blurring, and found that it's reasonably robust against these transformations. I briefly touched on the architecture in the &lt;strong&gt;Limitations&lt;/strong&gt; section of the post:&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;There is also a local maximum of objects that it can detect, due to the architecture of the neural network. So if there is a large image, with a smaller section containing 100s of objects to be detected, the user will run into this local maximum constraint.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;Here I will try to explain the architecture in more detail:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;input-output considerations of the neural network&lt;/li&gt;
&lt;li&gt;bounding boxes&lt;/li&gt;
&lt;li&gt;loss function&lt;/li&gt;
&lt;li&gt;training algorithm&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;Sources:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;&lt;a href="https://www.youtube.com/watch?v=NM6lrxy0bxs"&gt;13 minute YOLOv1 talk on Youtube&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://arxiv.org/abs/1506.02640"&gt;YOLOv1 paper&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://arxiv.org/abs/1612.08242"&gt;YOLOv2 aka YOLO9000 paper&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://arxiv.org/abs/1804.02767"&gt;YOLOv3 paper&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://arxiv.org/abs/2004.10934"&gt;YOLOv4 paper&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://pytorch.org/hub/ultralytics_yolov5/"&gt;YOLOv5&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;h2&gt;Input-output considerations of the neural network&lt;/h2&gt;
&lt;p&gt;Let's zoom out for a moment and think about object detection as an input-output problem. Let's assume the input image is W pixels wide and H pixels high. If it's grayscale, it can be encoded as a W x H matrix of floats between 0 and 1 representing pixel intensities. If it's an RGB image, it's a 3 channel tensor, 3 x W x H, each layer encoding one color intensity. But what about the output?&lt;/p&gt;
&lt;p&gt;Ideally, the output should be a list of variable length, like:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;object 1: confidence Z1, center X1, center Y1, width Q1, height T1, class C1 (like car, dog, cat)&lt;/li&gt;
&lt;li&gt;object 2: confidence Z2, center X2, center Y2, width Q2, height T2, class C2&lt;/li&gt;
&lt;li&gt;and so on, depending on how many objects are on the image...&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;However, neural networks cannot return a variable number of outputs. The output of a neural network, like the input, is a tensor, ie. a block of float values.&lt;/p&gt;
&lt;p&gt;Notice that each object's information (confidence Z, center X, center Y, width Q, height T, class C) can easily be encoded as a vector, with each value between 0 and 1:&lt;/p&gt;
&lt;ol&gt;
&lt;li&gt;confidence as a probability&lt;/li&gt;
&lt;li&gt;center X / image width W is a value between 0 and 1&lt;/li&gt;
&lt;li&gt;center Y / image height H is a value between 0 and 1&lt;/li&gt;
&lt;li&gt;object width Q / image width W is a value between 0 and 1&lt;/li&gt;
&lt;li&gt;object height T / image height H is a value between 0 and 1&lt;/li&gt;
&lt;li&gt;each of the C classes gets its own float between 0 and 1, encoding the probability of the object belonging to that class (the C floats sum to 1)&lt;/li&gt;
&lt;/ol&gt;
&lt;p&gt;So if we have a total of C classes, each detection can be encoded in a vector of 5 + C floats.&lt;/p&gt;
&lt;p&gt;One thing we could do is set a maximum number M of objects the network can detect, like M=100 or M=1000, and have the output of the network be a tensor (matrix) of dimensions M x (5 + C), each row encoding a possible object detection. This sounds pretty simple, and YOLO does in fact have a global limit on the number of objects, similar to this scheme's limitation. &lt;strong&gt;However, this is not how YOLO works.&lt;/strong&gt;&lt;/p&gt;
&lt;p&gt;The problem with the simple scheme above is that it would be hard to train. Imagine a training image which contains 5 cars. Which of the M rows in the output tensor do we want to give us the detection? There's a danger that certain rows get specialized, so a row always detects cars or dogs, or only detects objects in certain regions. Then, during usage, if the input image has some other configuration, for example nothing but cars (M of them), some of the output rows won't be useful.&lt;/p&gt;
&lt;p&gt;So YOLO puts a simple twist on the above scheme. It divides the image into S x S regions, and each region is its own detector. Each region is responsible for detecting objects whose bounding box is centered in that region. As before, tensors have to be of a fixed size, so each region can detect up to B objects. This way, the global maximum of objects that can be detected is S x S x B, and it's much clearer which row is responsible for detecting which object; there is still ambiguity, but it's greatly reduced, from S x S x B to B. In principle, by increasing S, B could be set to 1, but in practice this is not done:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;in the original YOLOv1 architecture, S = 7 and B = 2, so a total of 7 x 7 x 2 = 98 objects can be detected. C = 20 classes of objects can be detected.&lt;/li&gt;
&lt;li&gt;in YOLOv2, S = 13, B = 9.&lt;/li&gt;
&lt;li&gt;in later YOLO versions, all these parameters have increased, and can be changed if you train your own network (S, B, and the number of classes C)&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;The center floats are normalized to the region dimensions, so (0, 0) corresponds to the top left corner of the region (not the entire image), (0.5, 0.5) corresponds to the center of the region, and so on.&lt;/p&gt;
&lt;p&gt;One more change is that YOLOv1 does not detect a class for each object; it detects a class for each region. So each region can detect B objects, but each region only has one set of C floats for detecting the classes, so the B objects are all of the same class. So in the end, the output tensor of YOLOv1 is S x S x (B x 5 + C) floats. Later, in YOLOv2, this limitation was removed, and each detection has its own class, making the output tensor S x S x (B x (5 + C)).&lt;/p&gt;
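&lt;p&gt;To make the tensor layout concrete, here is a sketch that decodes a YOLOv1-style output tensor into candidate detections. The S, B, C values are YOLOv1's; the tensor is random, so the boxes are meaningless; the point is the indexing:&lt;/p&gt;

```python
import numpy as np

S, B, C = 7, 2, 20                       # YOLOv1: 7x7 grid, 2 boxes per cell, 20 classes
out = np.random.rand(S, S, B * 5 + C)    # per cell: B x (conf, x, y, w, h), then C class probs

detections = []
for i in range(S):          # grid row
    for j in range(S):      # grid column
        cell = out[i, j]
        class_probs = cell[B * 5:]   # in YOLOv1, one class distribution shared by all B boxes
        for b in range(B):
            conf, x, y, w, h = cell[b * 5 : b * 5 + 5]
            # x, y are relative to the cell; convert to image-relative [0, 1] coordinates
            cx, cy = (j + x) / S, (i + y) / S
            detections.append((conf, cx, cy, w, h, int(np.argmax(class_probs))))

print(len(detections))  # S * S * B = 98 candidate boxes before confidence pruning
```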
&lt;p&gt;Note that the price we pay with this architecture is that if an input image has big blank regions, and lots of small objects clustered in a small region, the network will miss a lot of them due to the local B limit. This is what the quote from the last post was referring to.&lt;/p&gt;
&lt;p&gt;One final step is pruning the output. Since a neural network is a "numerical beast", when using the model in production, all output floats will be non-zero, even though there is nothing there. To get rid of this noise, YOLO cuts off object detection confidence at 0.3, so anything less than 0.3 is thrown out as a non-detection.&lt;/p&gt;
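&lt;p&gt;The pruning step itself is a plain threshold. A sketch, where each detection is a (confidence, x, y, w, h, class) tuple and the tuples are made up for illustration:&lt;/p&gt;

```python
CONF_THRESHOLD = 0.3   # YOLO's cutoff: anything below is treated as a non-detection

def prune(detections, threshold=CONF_THRESHOLD):
    # keep only the boxes the network is reasonably confident about
    return [d for d in detections if d[0] >= threshold]

raw = [(0.95, 0.5, 0.5, 0.2, 0.1, 'car'),
       (0.12, 0.1, 0.9, 0.05, 0.05, 'dog'),    # numerical noise, thrown out
       (0.31, 0.7, 0.3, 0.3, 0.4, 'person')]
print(len(prune(raw)))  # 2
```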
&lt;p&gt;&lt;img src="/images/yolov1_regions.JPG" style="width: 500px;"/&gt;&lt;/p&gt;
&lt;p&gt;The image above is from the original YOLO paper. Note how each region corresponds to one class (depicted by the colors). This was changed in YOLOv2. The top image shows all 98 bounding boxes, the right side shows the 3 where confidence was higher than 0.3.&lt;/p&gt;
&lt;h2&gt;Loss function and bounding box scoring during training&lt;/h2&gt;
&lt;p&gt;Imagine we're training the neural network described above. One data point in the training set is an input picture, along with a set of objects that are on the picture. Each object has a bounding box (center X, center Y, width Q, height T and class C). Suppose we then run this through the neural network. Given the architecture of the network, there is one specific region which contains the center coordinates, which has B detection boxes. So we want that region to detect the box. However, it's possible that:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;multiple boxes (of the B) in that region start "finding" the object&lt;/li&gt;
&lt;li&gt;boxes in other regions also "find" the object (with shifted bounding boxes)&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;The question is, how do we "score" these detections, to force the neural network to learn to detect each object once, and in the appropriate region only? In a neural network, this scoring happens through the loss function, so we have to encode our logic / intention in the loss function.&lt;/p&gt;
&lt;p&gt;The way the YOLO loss function is set up, detections outside the correct region are deemed incorrect and ignored (zero multiplier in the loss function). Inside the correct region, the detector with the highest overlap gets scored, and the rest are ignored (zero multiplier in the loss function). Another trick is to "tell" the loss function that errors in bounding box width and height matter more if the bounding box itself is small (ie. a 10 pixel width error in a 20 pixel wide box matters more than a 10 pixel width error in a 200 pixel wide box). This is accomplished by square-rooting the widths and heights. The overall loss function from YOLOv1 is of a quasi $ L^2 $ form:&lt;/p&gt;
&lt;p&gt;&lt;img src="/images/yolov1_lf.JPG" style="width: 500px;"/&gt;&lt;/p&gt;
&lt;p&gt;The $ \lambda $ multipliers are set to balance between localization and classification error.&lt;/p&gt;
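&lt;p&gt;The square-root trick can be checked numerically. A sketch, showing just the width term of the loss, with made-up pixel widths:&lt;/p&gt;

```python
import math

def width_loss_term(w_true, w_pred):
    # YOLOv1 penalizes errors in sqrt(width), so the same absolute error
    # costs more in a small box than in a large one
    return (math.sqrt(w_true) - math.sqrt(w_pred)) ** 2

small = width_loss_term(20, 30)     # 10 pixel error in a 20 pixel wide box
large = width_loss_term(200, 210)   # 10 pixel error in a 200 pixel wide box
print(small > large)  # True: the small-box error is penalized more
```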
&lt;h2&gt;Training method&lt;/h2&gt;
&lt;p&gt;There are 2 main points about the training method I will point out:&lt;/p&gt;
&lt;ol&gt;
&lt;li&gt;The CNN portion of the deep neural network is pre-trained on &lt;a href="https://en.wikipedia.org/wiki/ImageNet"&gt;ImageNet&lt;/a&gt; classification, so it already has some "memory" about what features to pick up. Obviously, it can also be trained on other datasets for other use-cases.&lt;/li&gt;
&lt;li&gt;(C)NNs are not naturally resistant to rotation, scaling, stretching and blurring. Also, they may pick up spurious features, such as a toothbrush always occurring near a face. To make the model more general, the training includes rotation, scaling, blurring, superimposing different parts of images, and so on. Starting in YOLOv4, images are also "mosaiced" together to make the detection less context-dependent.&lt;/li&gt;
&lt;/ol&gt;
&lt;p&gt;&lt;img src="/images/yolov4_mosaic.JPG" style="width: 500px;"/&gt;&lt;/p&gt;
&lt;h2&gt;Conclusion&lt;/h2&gt;
&lt;p&gt;The YOLO models are deep neural networks, so they're fundamentally black boxes. Like all neural networks, as of today they are more art than science. We don't exactly know why one thing works better than the other. The field is very experimental, and when something works, others copy it and start trying out further variations. YOLO is no different: the CNN portion of YOLOv1 is a variation of the earlier &lt;a href="https://arxiv.org/abs/1409.4842"&gt;GoogLeNet&lt;/a&gt;. I say this because "understanding" a neural network is a misnomer; the best we can do is get a feeling for why, combined with the given training regimen, it works. Here I tried to explore the aspects of YOLO that are special to it, ie. the unified bounding box and classification training. I didn't talk about eg. activation functions, since in that respect there is nothing new in YOLO.&lt;/p&gt;
&lt;p&gt;&lt;a href="https://pytorch.org/hub/ultralytics_yolov5/"&gt;YOLOv5&lt;/a&gt; is a set of pre-trained PyTorch neural network based image detection and classification models. With YOLOv5, &lt;strong&gt;performing image detection and classification is a couple of lines of code&lt;/strong&gt;, whether the source is a local image, an image URL, a video or a live stream, such as a laptop's webcam. For example:&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="nn"&gt;torch&lt;/span&gt;
&lt;span class="n"&gt;model&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;torch&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;hub&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;load&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;ultralytics/yolov5&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s1"&gt;&amp;#39;yolov5s&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;pretrained&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="bp"&gt;True&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="n"&gt;results&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;model&lt;/span&gt;&lt;span class="p"&gt;([&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;https://www.brianhonigman.com/wp-content/uploads/2015/10/Large-crowd-of-people-014.jpg&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;])&lt;/span&gt;
&lt;span class="n"&gt;results&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;show&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
&lt;/pre&gt;&lt;/div&gt;


&lt;p&gt;Yields:&lt;/p&gt;
&lt;p&gt;&lt;img src="/images/yolov5_people_bb.jpg" style="width: 800px;"/&gt;&lt;/p&gt;
&lt;p&gt;What's happening in the above code:&lt;/p&gt;
&lt;ol&gt;
&lt;li&gt;YOLOv5s, the Small version of the YOLOv5 pre-trained neural network is downloaded from &lt;a href="https://pytorch.org/hub/"&gt;PyTorch Hub&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;The image is retrieved from the URL&lt;/li&gt;
&lt;li&gt;Object detection is run on the image&lt;/li&gt;
&lt;li&gt;Image with multiple bounding boxes and confidences is shown&lt;/li&gt;
&lt;/ol&gt;
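&lt;p&gt;Besides rendering the image, the hub models also expose the detections as data: &lt;code&gt;results.pandas().xyxy[0]&lt;/code&gt; returns a dataframe with one row per box. A sketch of post-processing such a frame (a hand-made stand-in, so the snippet runs without downloading the model):&lt;/p&gt;

```python
import pandas as pd

# stand-in for results.pandas().xyxy[0]: one row per detected box
boxes = pd.DataFrame({
    'xmin': [34.0, 210.5], 'ymin': [50.2, 80.0],
    'xmax': [120.0, 330.1], 'ymax': [200.8, 260.3],
    'confidence': [0.91, 0.45],
    'name': ['person', 'person'],
})

# e.g. count people detected with high confidence
people = boxes[(boxes['name'] == 'person') & (boxes['confidence'] > 0.5)]
print(len(people))  # 1
```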
&lt;p&gt;If we git clone the &lt;a href="https://github.com/ultralytics/yolov5/"&gt;full YOLOv5 repo&lt;/a&gt;, we get helper scripts which make running the models on the laptop's webcam feed a one-liner:&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;$ git clone https://github.com/ultralytics/yolov5/
$ &lt;span class="nb"&gt;cd&lt;/span&gt; yolov5
$ python3 detect.py --source &lt;span class="m"&gt;0&lt;/span&gt;
&lt;/pre&gt;&lt;/div&gt;


&lt;p&gt;Here is YOLOv5s correctly detecting me and some objects in the background, under low light conditions which make the image grainy:&lt;/p&gt;
&lt;p&gt;&lt;img src="/images/yolov5_webcam1.jpg" style="width: 800px;"/&gt;&lt;/p&gt;
&lt;p&gt;Here I put my sunglasses close to my mouth, which fools the model into thinking it's a toothbrush:&lt;/p&gt;
&lt;p&gt;&lt;img src="/images/yolov5_webcam2.jpg" style="width: 800px;"/&gt;&lt;/p&gt;
&lt;p&gt;The above is running the S (small) version of the network, which runs at real-time FPS. Bigger models may not be able to keep up with real time, depending on the hardware.&lt;/p&gt;
&lt;p&gt;It's worth noting the difference between &lt;strong&gt;object classification&lt;/strong&gt; and &lt;strong&gt;object detection&lt;/strong&gt;:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;In &lt;strong&gt;object classification, the entire image is assigned one category&lt;/strong&gt;. For example, with MNIST digits, each image of a hand-written digit belongs to one of 10 classes. The task is not to locate the digit on the image, just to figure out which digit (if any) is most likely on the image. See this earlier post about &lt;a href="https://bytepawn.com/solving-mnist-with-pytorch-and-skl.html"&gt;MNIST digit detection with Pytorch&lt;/a&gt;.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Object detection is much harder&lt;/strong&gt;. Here, the image potentially contains multiple objects, each belonging to a different class (eg. person, car, dog). The task is to locate each object (bounding box or pixels) and for each located object, return the most likely class.&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;Previous computer vision approaches consisted of multiple independent steps: for example, in step 1 detect objects (using features such as edges and contours), whatever the objects may be, then in step 2 classify the detected objects. The downside of these approaches is that the two sub-modules are trained independently, so errors cannot propagate from one to the other. Another possibility is to just take an image classifier and run it repeatedly on sections of the image. This is essentially a brute-force approach and can be very slow, since the model is run repeatedly at different coordinates and window sizes.&lt;/p&gt;
&lt;p&gt;YOLO stands for &lt;strong&gt;You Only Look Once&lt;/strong&gt;. In the YOLO architecture, a deep neural network is trained to perform multiple object detection and classification in one go: find both the bounding boxes of objects on the image, and their probable class (eg. person, car, dog). The best reference for the YOLO architecture is the &lt;a href="https://arxiv.org/abs/1506.02640"&gt;original 2015 paper&lt;/a&gt;. I will not discuss the details of the architecture in this post, just highlight that the YOLO, specifically YOLOv5 architecture is state-of-the-art as of the writing of this post:&lt;/p&gt;
&lt;p&gt;&lt;img src="https://user-images.githubusercontent.com/26833433/114313216-f0a5e100-9af5-11eb-8445-c682b60da2e3.png" alt="YOLOv5 performance" style="width: 800px;"/&gt;&lt;/p&gt;
&lt;p&gt;YOLO is very popular, so there are many good tutorials and discussions of the architecture. Good starting points are:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;&lt;a href="[YOLOv5](https://pytorch.org/hub/ultralytics_yolov5/)"&gt;official YOLOv5 Github repo with documentation&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://arxiv.org/abs/1506.02640"&gt;original 2015 paper&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://towardsdatascience.com/yolo-you-only-look-once-real-time-object-detection-explained-492dc9230006"&gt;explanation on TDS&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://towardsdatascience.com/yolov5-end-to-end-object-detector-project-on-custom-dataset-5d9cc2c95921"&gt;end-to-end tutorial on TDS&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://towardsai.net/p/data-science/yolo-v5-is-here-custom-object-detection-tutorial-with-yolo-v5"&gt;how to train your own YOLOv5 networks&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;There are 2 options if one wants to use YOLO:&lt;/p&gt;
&lt;ol&gt;
&lt;li&gt;Download a pre-trained model.&lt;/li&gt;
&lt;li&gt;Train a model yourself: the Github repo contains documentation on how to. You will need a set of labeled images (bounding boxes and classification per bounding box).&lt;/li&gt;
&lt;/ol&gt;
&lt;p&gt;I was curious about the performance of the pre-trained models. Models, plural, because there are different sizes of YOLOv5 models available, from S for Small to X for XLarge, the biggest. The smallest S model runs about 7x faster than the biggest X model, but finds fewer objects and is less certain of its findings. Here I will show results for S and X.&lt;/p&gt;
&lt;p&gt;&lt;img src="https://github.com/ultralytics/yolov5/releases/download/v1.0/model_comparison.png" alt="YOLOv5 performance" style="width: 800px;"/&gt;&lt;/p&gt;
&lt;h2&gt;Limitations&lt;/h2&gt;
&lt;p&gt;The YOLO architecture is fundamentally a deep convolutional neural network (CNN), as illustrated by this high-level diagram from the original YOLO paper:&lt;/p&gt;
&lt;p&gt;&lt;img src="https://www.researchgate.net/publication/330484322/figure/fig1/AS:716474243108866@1547832179294/YOLO-network-architecture-adapted-from-44.png" style="width: 800px;"/&gt;&lt;/p&gt;
&lt;p&gt;Although being able to download a pre-trained neural network and get pretty good object detection in 3 lines of Python is machine-learning-magic, it's worth remembering that:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;The model will only detect objects it was trained to detect: for example, if the training set didn't include any turtles, then the model will never return turtles. YOLOv5 is pre-trained on the &lt;a href="https://cocodataset.org/#home"&gt;COCO dataset&lt;/a&gt; and knows about 80 classes.&lt;/li&gt;
&lt;li&gt;CNNs such as YOLOv5 are not naturally scale or rotation invariant. This has to be taught at training time, for example by scaling and rotating the training images and passing them in as separate images during training.&lt;/li&gt;
&lt;li&gt;There is a global maximum number of objects YOLOv5 can detect: 300. There are also local maximums, due to the architecture of the neural network. So if there is a large image, with a smaller section containing 100s of objects to be detected, the user will run into this local maximum constraint.&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;Below I run experiments to understand how the pre-trained YOLOv5 models react to rotation, scaling, stretching and blurring. &lt;a href="https://github.com/mtrencseni/playground/blob/master/YOLOv5%20experiments.ipynb"&gt;The ipython notebook is up on Github.&lt;/a&gt; Note that YOLOv5 cuts off confidence at 0.3, so if an object detection's confidence is less than 0.3, the object is not returned. This is why confidences lower than 0.3 never occur on the plots below. For the experiments I use a picture of a yellow Porsche 911:&lt;/p&gt;
&lt;p&gt;&lt;img src="/images/porsche911s.png" style="width: 600px;"/&gt;&lt;/p&gt;
&lt;h2&gt;Rotation&lt;/h2&gt;
&lt;p&gt;First, I test object detection as a function of image rotation. I rotate the car 0, 1 ... 90 degrees using the &lt;a href="https://pillow.readthedocs.io/en/stable/"&gt;Pillow image manipulation library&lt;/a&gt;, and run YOLOv5s and YOLOv5x on the modified image:&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;span class="n"&gt;ms&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;torch&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;hub&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;load&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;ultralytics/yolov5&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s1"&gt;&amp;#39;yolov5s&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;pretrained&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="bp"&gt;True&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="n"&gt;mx&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;torch&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;hub&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;load&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;ultralytics/yolov5&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s1"&gt;&amp;#39;yolov5x&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;pretrained&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="bp"&gt;True&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;angle&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="nb"&gt;range&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;91&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
    &lt;span class="n"&gt;image_mod&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;image&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;rotate&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;angle&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="n"&gt;detections&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;ms&lt;/span&gt;&lt;span class="p"&gt;([&lt;/span&gt;&lt;span class="n"&gt;image_mod&lt;/span&gt;&lt;span class="p"&gt;]),&lt;/span&gt; &lt;span class="n"&gt;mx&lt;/span&gt;&lt;span class="p"&gt;([&lt;/span&gt;&lt;span class="n"&gt;image_mod&lt;/span&gt;&lt;span class="p"&gt;])&lt;/span&gt;
    &lt;span class="o"&gt;...&lt;/span&gt;
&lt;/pre&gt;&lt;/div&gt;


&lt;p&gt;Detection confidences:&lt;/p&gt;
&lt;p&gt;&lt;img src="/images/yolov5_rotation.png" style="width: 800px;"/&gt;&lt;/p&gt;
&lt;p&gt;It's really interesting how at some "critical" angles the S model fails (the confidence drops below 0.3), but then at higher angles it's back up again. The X model is able to detect the car at angles up to 65 degrees.&lt;/p&gt;
&lt;h2&gt;Scaling&lt;/h2&gt;
&lt;p&gt;Next, I scale the image by 0.1, 0.2 ... 1.0, ie. I make it smaller:&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;scale&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;np&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;arange&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mf"&gt;0.1&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mf"&gt;1.1&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mf"&gt;0.1&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
    &lt;span class="n"&gt;image_mod&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;image&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;resize&lt;/span&gt;&lt;span class="p"&gt;((&lt;/span&gt;&lt;span class="nb"&gt;int&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;width&lt;/span&gt;&lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="n"&gt;scale&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt; &lt;span class="nb"&gt;int&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;height&lt;/span&gt;&lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="n"&gt;scale&lt;/span&gt;&lt;span class="p"&gt;)),&lt;/span&gt; &lt;span class="n"&gt;Image&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;ANTIALIAS&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="n"&gt;detections&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;ms&lt;/span&gt;&lt;span class="p"&gt;([&lt;/span&gt;&lt;span class="n"&gt;image_mod&lt;/span&gt;&lt;span class="p"&gt;]),&lt;/span&gt; &lt;span class="n"&gt;mx&lt;/span&gt;&lt;span class="p"&gt;([&lt;/span&gt;&lt;span class="n"&gt;image_mod&lt;/span&gt;&lt;span class="p"&gt;])&lt;/span&gt;
    &lt;span class="o"&gt;...&lt;/span&gt;
&lt;/pre&gt;&lt;/div&gt;


&lt;p&gt;Detection confidences:&lt;/p&gt;
&lt;p&gt;&lt;img src="/images/yolov5_scale.png" style="width: 800px;"/&gt;&lt;/p&gt;
&lt;p&gt;At least on this clean picture, scaling is not a problem for YOLOv5, even the smallest 0.1x image is confidently detected.&lt;/p&gt;
&lt;h2&gt;Stretching&lt;/h2&gt;
&lt;p&gt;Next, I stretch the image, first in the vertical direction, then in the horizontal direction. The code is similar to the scaling case, so I just show the results:&lt;/p&gt;
&lt;p&gt;&lt;img src="/images/yolov5_vertical_stretch.png" style="width: 800px;"/&gt;&lt;/p&gt;
&lt;p&gt;&lt;img src="/images/yolov5_horizontal_stretch.png" style="width: 800px;"/&gt;&lt;/p&gt;
&lt;p&gt;Here, YOLOv5 also performs very well. From a stretch factor of 0.2-0.3, it's able to recognize the car.&lt;/p&gt;
&lt;h2&gt;Blurring&lt;/h2&gt;
&lt;p&gt;Next, I apply a gaussian blur of increasing strength:&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;blur&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="nb"&gt;range&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;15&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
    &lt;span class="n"&gt;image_mod&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;image&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;filter&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;ImageFilter&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;GaussianBlur&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;blur&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;
    &lt;span class="n"&gt;detections&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;ms&lt;/span&gt;&lt;span class="p"&gt;([&lt;/span&gt;&lt;span class="n"&gt;image_mod&lt;/span&gt;&lt;span class="p"&gt;]),&lt;/span&gt; &lt;span class="n"&gt;mx&lt;/span&gt;&lt;span class="p"&gt;([&lt;/span&gt;&lt;span class="n"&gt;image_mod&lt;/span&gt;&lt;span class="p"&gt;])&lt;/span&gt;
    &lt;span class="o"&gt;...&lt;/span&gt;
&lt;/pre&gt;&lt;/div&gt;


&lt;p&gt;&lt;img src="/images/yolov5_blur.png" style="width: 800px;"/&gt;&lt;/p&gt;
&lt;p&gt;Finally, I also try blurring on the picture of the crowd of people shown at the beginning of the post. Since the picture contains 50+ people, instead of plotting a single confidence, I plot the number of people detected:&lt;/p&gt;
&lt;p&gt;&lt;img src="/images/yolov5_people.png" style="width: 800px;"/&gt;&lt;/p&gt;
&lt;p&gt;On the simple car picture, the model breaks down at a blur factor of about 10-12, whereas on the more realistic crowd picture the model starts to miss people much earlier, at lower blur factors.&lt;/p&gt;
&lt;h2&gt;Conclusion&lt;/h2&gt;
&lt;p&gt;In Dubai, privacy laws prohibit the storage of videos taken of people. Since we cannot store CCTV video feeds, we cannot label it and train our own (YOLOv5 or other) models on our own labeled images. Hence, using pre-trained models such as YOLOv5 is our only option. This is true irrespective of the intended usage of the model, even if it is privacy-neutral, for example &lt;em&gt;count people entering and leaving through the door&lt;/em&gt;.&lt;/p&gt;
&lt;p&gt;The above experiments show that YOLOv5 does have limitations, it is not magic in the sense of being equivalent to a human. But overall I find the performance of the pre-trained YOLOv5 models quite impressive.&lt;/p&gt;</content><category term="yolo"></category><category term="yolov5"></category><category term="vision"></category><category term="object detection"></category></entry><entry><title>Predicting party affiliation of US politicians using fasttext</title><link href="https://bytepawn.com/predicting-party-affiliation-of-us-politicians-using-fasttext.html" rel="alternate"></link><published>2021-06-20T00:00:00+02:00</published><updated>2021-06-20T00:00:00+02:00</updated><author><name>Marton Trencseni</name></author><id>tag:bytepawn.com,2021-06-20:/predicting-party-affiliation-of-us-politicians-using-fasttext.html</id><summary type="html">&lt;p&gt;I train a fasttext classifier on 1.2M data points to predict US politicians' party affiliations from their twitter messages. &lt;br/&gt;&lt;br/&gt;&lt;img src="https://www.rollingstone.com/wp-content/uploads/2019/10/DTAS.jpg" alt="Trump Schiff" style="width: 400px;"/&gt;&lt;/p&gt;</summary><content type="html">&lt;h2&gt;Introduction&lt;/h2&gt;
&lt;p&gt;Recently, a colleague pointed me to the excellent &lt;strong&gt;&lt;a href="https://fasttext.cc/"&gt;fasttext&lt;/a&gt;&lt;/strong&gt; text classification library from Facebook. The promise of fasttext is that we don't have to bother with regularizing the text, tokenizing, or hand-engineering features: we just feed it the raw text and labels, and get a working classifier model within seconds.&lt;/p&gt;
&lt;p&gt;&lt;img src="https://fasttext.cc/img/fasttext-logo-color-web.png" alt="Fasttext" style="width: 400px;"/&gt;&lt;/p&gt;
&lt;p&gt;There is a short &lt;a href="https://arxiv.org/abs/1607.01759"&gt;4-page paper describing the topline architecture of fasttext&lt;/a&gt;. As of 2016, when the paper was written, fasttext was on par with deep neural networks for text classification tasks. &lt;a href="https://towardsdatascience.com/fasttext-under-the-hood-11efc57b2b3"&gt;FastText: Under the Hood&lt;/a&gt; from Towards Data Science explains some of the internals; I will not discuss them here.&lt;/p&gt;
&lt;p&gt;&lt;a href="https://github.com/mtrencseni/playground/blob/master/Predicting%20political%20affiliation%20with%20fasttext.ipynb"&gt;The ipython notebook is up on Github&lt;/a&gt;.&lt;/p&gt;
&lt;h2&gt;Dataset&lt;/h2&gt;
&lt;p&gt;I wanted to use an &lt;strong&gt;interesting and relatable toy dataset&lt;/strong&gt;, so I went with a dump of &lt;a href="https://www.reddit.com/r/datasets/comments/6fniik/over_one_million_tweets_collected_from_us/"&gt;1.2M tweets from U.S. politicians, such as Donald Trump and Adam Schiff&lt;/a&gt;. The dataset contains a &lt;strong&gt;total of 1,243,370 tweets from 548 politicians&lt;/strong&gt;.&lt;/p&gt;
&lt;p&gt;I tried two toy exercises with this dataset:&lt;/p&gt;
&lt;ol&gt;
&lt;li&gt;build a binary classification model to predict the political affiliation (Democrat or Republican) of the author (ignoring Independents and Libertarians)&lt;/li&gt;
&lt;li&gt;build a model to predict the author of a tweet (harder)&lt;/li&gt;
&lt;/ol&gt;
&lt;p&gt;&lt;img src="https://www.rollingstone.com/wp-content/uploads/2019/10/DTAS.jpg" alt="Trump Schiff" style="width: 400px;"/&gt;&lt;/p&gt;
&lt;h2&gt;Enriching the dataset&lt;/h2&gt;
&lt;p&gt;To run the above exercises, I need to transform the tweets into a list of tuples like:&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;(screen_name, party, text)
&lt;/pre&gt;&lt;/div&gt;


&lt;p&gt;Like:&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;(screen_name=&amp;#39;RepAdamSchiff&amp;#39;, party=&amp;#39;Democrat&amp;#39;, text=&amp;#39;Admin review should be accompanied by thorough investigation by both House &amp;amp; Senate Intel Committees, perhaps meeting in joint session.&amp;#39;)
&lt;/pre&gt;&lt;/div&gt;


&lt;p&gt;&lt;code&gt;screen_name&lt;/code&gt; and &lt;code&gt;text&lt;/code&gt; are in the json files, but &lt;code&gt;party&lt;/code&gt; is not. For that, I wrote a script to scrape each politician's Wikipedia page, which contains the party in a structured format in the page's infobox:&lt;/p&gt;
&lt;p&gt;&lt;img src="/images/wiki_politician2.jpg" alt="Wikipedia." style="width: 800px;"/&gt;&lt;/p&gt;
&lt;p&gt;The function is called with the politician's name attached to the Twitter account, which is also contained in the data dump:&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;fetch_politician_info&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;name&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
    &lt;span class="n"&gt;info&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;{}&lt;/span&gt;
    &lt;span class="n"&gt;info&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;name&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;name&lt;/span&gt;
    &lt;span class="n"&gt;info&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;wiki_page&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="s1"&gt;&amp;#39;&amp;#39;&lt;/span&gt;
    &lt;span class="n"&gt;info&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;party&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="s1"&gt;&amp;#39;&amp;#39;&lt;/span&gt;
    &lt;span class="k"&gt;try&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
        &lt;span class="n"&gt;normalized_name&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;normalize_name&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;name&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
        &lt;span class="n"&gt;search_url&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="s1"&gt;&amp;#39;https://en.wikipedia.org/w/api.php?action=query&amp;amp;list=search&amp;amp;format=json&amp;amp;srsearch=&amp;#39;&lt;/span&gt; &lt;span class="o"&gt;+&lt;/span&gt; &lt;span class="n"&gt;normalized_name&lt;/span&gt; &lt;span class="o"&gt;+&lt;/span&gt; &lt;span class="s1"&gt;&amp;#39; american politician&amp;#39;&lt;/span&gt;
        &lt;span class="n"&gt;wiki_page&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;requests&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;get&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;search_url&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;json&lt;/span&gt;&lt;span class="p"&gt;()[&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;query&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;][&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;search&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;][&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;][&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;title&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;
        &lt;span class="n"&gt;info&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;wiki_page&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="s1"&gt;&amp;#39;https://en.wikipedia.org/wiki/&amp;#39;&lt;/span&gt; &lt;span class="o"&gt;+&lt;/span&gt; &lt;span class="n"&gt;wiki_page&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;replace&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39; &amp;#39;&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s1"&gt;&amp;#39;_&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
        &lt;span class="n"&gt;wiki_info_url&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="s1"&gt;&amp;#39;https://en.wikipedia.org/w/api.php?action=query&amp;amp;prop=revisions&amp;amp;rvprop=content&amp;amp;rvsection=0&amp;amp;redirects=1&amp;amp;format=json&amp;amp;titles=&amp;#39;&lt;/span&gt; &lt;span class="o"&gt;+&lt;/span&gt; &lt;span class="n"&gt;wiki_page&lt;/span&gt;
        &lt;span class="n"&gt;result&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;requests&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;get&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;wiki_info_url&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;json&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
        &lt;span class="n"&gt;pages&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;result&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;query&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;][&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;pages&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;
        &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;_&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;v&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;pages&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;items&lt;/span&gt;&lt;span class="p"&gt;():&lt;/span&gt;
            &lt;span class="n"&gt;lines&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;v&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;revisions&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;][&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;][&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;*&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;split&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;&lt;/span&gt;&lt;span class="se"&gt;\n&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
            &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;line&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;lines&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
                &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;line&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;startswith&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;| party&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="ow"&gt;or&lt;/span&gt; &lt;span class="n"&gt;line&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;startswith&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;|party&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
                    &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;party&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;parties&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
                        &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;party&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;lower&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;line&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;lower&lt;/span&gt;&lt;span class="p"&gt;():&lt;/span&gt;
                            &lt;span class="n"&gt;info&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;party&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;party&lt;/span&gt;
                            &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;info&lt;/span&gt;
    &lt;span class="k"&gt;except&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
        &lt;span class="k"&gt;pass&lt;/span&gt;
    &lt;span class="k"&gt;finally&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
        &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;info&lt;/span&gt;
&lt;/pre&gt;&lt;/div&gt;


&lt;p&gt;For example:&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;fetch_politician_info(&amp;#39;Adam Schiff&amp;#39;)
&amp;gt; {
&amp;gt;   &amp;#39;name&amp;#39;:      &amp;#39;Adam Schiff&amp;#39;,
&amp;gt;   &amp;#39;wiki_page&amp;#39;: &amp;#39;https://en.wikipedia.org/wiki/Adam_Schiff&amp;#39;,
&amp;gt;   &amp;#39;party&amp;#39;:     &amp;#39;Democratic&amp;#39;
&amp;gt; }
&lt;/pre&gt;&lt;/div&gt;


&lt;p&gt;This method works for 537 out of 548 politicians (~98%). The remaining few I looked up manually. The final counts for the political affiliations:&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;span class="n"&gt;Democratic&lt;/span&gt;&lt;span class="o"&gt;:&lt;/span&gt;  &lt;span class="mi"&gt;242&lt;/span&gt;
&lt;span class="n"&gt;Republican&lt;/span&gt;&lt;span class="o"&gt;:&lt;/span&gt;  &lt;span class="mi"&gt;301&lt;/span&gt;
&lt;span class="n"&gt;Independent&lt;/span&gt;&lt;span class="o"&gt;:&lt;/span&gt; &lt;span class="mi"&gt;4&lt;/span&gt;
&lt;span class="n"&gt;Libertarian&lt;/span&gt;&lt;span class="o"&gt;:&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt;
&lt;/pre&gt;&lt;/div&gt;


&lt;h2&gt;Limiting information content of tweets&lt;/h2&gt;
&lt;p&gt;Examining the tweets I found that some signals may make classification too simple. For example, authors often @mention each other. I didn't check this, but I assume Dems are more likely to mention other Dems, and so on. Also, tweets often contain (shortened) URLs, which may also give away the author's identity or affiliation. So I tried the classification tasks both with the full tweet text, and with the @mentions and URLs removed.&lt;/p&gt;
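&lt;p&gt;A minimal sketch of this cleanup step (the helper name and exact regexes are mine, not from the notebook):&lt;/p&gt;

```python
import re

def strip_mentions_and_urls(text):
    # drop @mentions and (shortened) URLs, then collapse whitespace
    text = re.sub(r'@\w+', '', text)
    text = re.sub(r'https?://\S+', '', text)
    return ' '.join(text.split())
```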
&lt;h2&gt;Fasttext input format&lt;/h2&gt;
&lt;p&gt;Fasttext does not take dataframes or Python lists as input. The input needs to be a file, 1 line per data point. The target label is just part of the text, with a special prefix to designate it. The default prefix is &lt;code&gt;__label__&lt;/code&gt;.&lt;/p&gt;
&lt;p&gt;For predicting the author, I emit lines like:&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;Admin review should be accompanied by thorough investigation by both House &amp;amp; Senate Intel Committees,
perhaps meeting in joint session __label__RepAdamSchiff
&lt;/pre&gt;&lt;/div&gt;


&lt;p&gt;For predicting the party, I emit lines like:&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;Admin review should be accompanied by thorough investigation by both House &amp;amp; Senate Intel Committees,
perhaps meeting in joint session __label__Democratic
&lt;/pre&gt;&lt;/div&gt;
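&lt;p&gt;Producing these lines is a one-liner per tweet. A sketch (the helper is hypothetical, but &lt;code&gt;__label__&lt;/code&gt; is fasttext's default prefix):&lt;/p&gt;

```python
def to_fasttext_lines(rows, label_key):
    # rows: dicts with 'screen_name', 'party', 'text'; label_key picks the target
    lines = []
    for row in rows:
        text = ' '.join(row['text'].split())  # fasttext wants one line per example
        lines.append('{} __label__{}'.format(text, row[label_key]))
    return lines
```

&lt;p&gt;Writing the training file is then just joining these lines with newlines.&lt;/p&gt;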


&lt;h2&gt;Train-test split&lt;/h2&gt;
&lt;p&gt;I sort the tweets by create time, and use the first 1,000,000 for training and the chronological tail of 243,370 tweets for testing. This way there is no data leakage from train to test (other than at the boundary, which I ignore for this toy exercise), e.g. a test tweet that is just a retweet of a training tweet.&lt;/p&gt;
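&lt;p&gt;The split itself is straightforward. A sketch, assuming each tweet dict carries a &lt;code&gt;created_at&lt;/code&gt; timestamp (field name mine):&lt;/p&gt;

```python
def chronological_split(tweets, n_train):
    # sort by create time, train on the head, test on the chronological tail
    ordered = sorted(tweets, key=lambda t: t['created_at'])
    return ordered[:n_train], ordered[n_train:]
```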
&lt;h2&gt;Using fasttext&lt;/h2&gt;
&lt;p&gt;This is where fasttext shines. Once I build the training file per the above format, it's as simple as:&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;model = fasttext.train_supervised(file_path)
&lt;/pre&gt;&lt;/div&gt;


&lt;p&gt;Training the model on 1M data points takes about 5 seconds on an 8-core Intel MacBook Pro.&lt;/p&gt;
&lt;p&gt;Once the model is trained, predicting on a piece of text is also simple:&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;# sample tweet from Trump
txt = &amp;quot;&amp;quot;&amp;quot;
Mike Pence didn’t have the courage to do what should have been done
to protect our Country and our Constitution, giving States a chance to
certify a corrected set of facts, not the fraudulent or inaccurate ones
which they were asked to previously certify. USA demands the truth!
&amp;quot;&amp;quot;&amp;quot;.replace(&amp;#39;\n&amp;#39;, &amp;#39; &amp;#39;)
model.predict(txt)
((&amp;#39;__label__Republican&amp;#39;,), array([0.79000771]))
&lt;/pre&gt;&lt;/div&gt;


&lt;h2&gt;Accuracy of the fasttext models&lt;/h2&gt;
&lt;p&gt;First, how well does fasttext do predicting the political party of the author? I only kept Dems and Reps, and re-balanced both the training and test set, so for this binary-classification exercise, the baseline accuracy is 50%. The balanced dataset has 881,460 training tweets (down from 1,000,000) and 195,804 test tweets (down from 243,370). With the original tweet texts:&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;Train accuracy: 92.9%
Test accuracy:  74.6%
&lt;/pre&gt;&lt;/div&gt;


&lt;p&gt;After removing @atmentions and URLs:&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;Train accuracy: 85.2%
Test accuracy:  71.8%
&lt;/pre&gt;&lt;/div&gt;
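&lt;p&gt;The re-balancing above can be sketched as downsampling every class to the size of the smallest one (helper name and seeding are mine):&lt;/p&gt;

```python
import random

def balance_classes(rows, label_key, seed=0):
    # group rows by label, then downsample each group to the smallest group's size
    rng = random.Random(seed)
    groups = {}
    for row in rows:
        groups.setdefault(row[label_key], []).append(row)
    n = min(len(group) for group in groups.values())
    balanced = [row for group in groups.values() for row in rng.sample(group, n)]
    rng.shuffle(balanced)
    return balanced
```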


&lt;p&gt;Second, how does fasttext do at predicting the author of a tweet? This is a C=548 classification problem, so the baseline random predictor would achieve 1/548 ≈ 0.18% accuracy. Compared to this, fasttext achieves, on the original tweet texts:&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;Train accuracy: 51.2%
Test accuracy:  22.9%
&lt;/pre&gt;&lt;/div&gt;


&lt;p&gt;After removing @atmentions and URLs:&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;Train accuracy: 41.1%
Test accuracy:  18.4%
&lt;/pre&gt;&lt;/div&gt;


&lt;h2&gt;Conclusion&lt;/h2&gt;
&lt;p&gt;Fasttext is very convenient to use, and saves a lot of time that would be spent manually regularizing, normalizing and tokenizing the text and then manually building feature vectors. In terms of runtime, it's very fast, both for training and prediction. Although I have no other baseline available for the above toy example, the accuracy fasttext achieves out of the box is promising. Predicting the author is a hard problem, but fasttext does over 100x better than the random baseline out-of-the-box, which is impressive.&lt;/p&gt;
&lt;p&gt;I'm a bit surprised fasttext only achieves ~75% accuracy on the binary Dem-or-Rep classification problem; I expected this to be a tractable problem, where ~90%+ accuracy is achievable. By default, fasttext does not build word n-grams, so I tried running it with &lt;code&gt;wordNgrams=2&lt;/code&gt; and &lt;code&gt;3&lt;/code&gt;, but it didn't improve accuracy. I may revisit this problem in the future with a hand-built classifier to get a comparison baseline.&lt;/p&gt;</content><category term="statistics"></category><category term="trump"></category><category term="politics"></category><category term="fasttext"></category><category term="twitter"></category></entry><entry><title>Random digits and Benford's law</title><link href="https://bytepawn.com/random-digits-and-benfords-law.html" rel="alternate"></link><published>2021-05-29T00:00:00+02:00</published><updated>2021-05-29T00:00:00+02:00</updated><author><name>Marton Trencseni</name></author><id>tag:bytepawn.com,2021-05-29:/random-digits-and-benfords-law.html</id><summary type="html">&lt;p&gt;The post explores the distribution of digits of random and non-random numbers from receipts, verifying Benford's law of first digit distribution.&lt;br/&gt;&lt;br/&gt; &lt;img src="/images/digits11.png" alt="Early stopping" style="width: 400px;"/&gt;&lt;/p&gt;</summary><content type="html">&lt;h2&gt;Introduction&lt;/h2&gt;
&lt;p&gt;This is a post exploring the distribution of digits of random and non-random numbers:&lt;/p&gt;
&lt;ol&gt;
&lt;li&gt;Given a set of uniform random numbers, what is the distribution of digits?&lt;/li&gt;
&lt;li&gt;How about non-random numbers, amounts like $4.95 from receipts? How close are they to random?&lt;/li&gt;
&lt;li&gt;Can we reproduce &lt;a href="https://en.wikipedia.org/wiki/Benford%27s_law"&gt;Benford's law&lt;/a&gt; with receipt digits?&lt;/li&gt;
&lt;/ol&gt;
&lt;h2&gt;Random numbers&lt;/h2&gt;
&lt;p&gt;Let's start with random numbers. First, let's generate uniform random numbers between 0 and 1000, and look at the density plot of the numbers themselves. Left side is the histogram, right side is the cumulative histogram:&lt;/p&gt;
&lt;p&gt;&lt;img src="/images/digits1.png" alt="." style="width: 600px;"/&gt;&lt;/p&gt;
&lt;p&gt;This just confirms that the code used to generate random numbers is bug-free:&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;span class="n"&gt;num_numbers&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;100&lt;/span&gt;&lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="mi"&gt;1000&lt;/span&gt;
&lt;span class="n"&gt;upper_limit&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;1000&lt;/span&gt;
&lt;span class="n"&gt;numbers&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;randint&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;upper_limit&lt;/span&gt; &lt;span class="o"&gt;-&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;_&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="nb"&gt;range&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;num_numbers&lt;/span&gt;&lt;span class="p"&gt;)]&lt;/span&gt;
&lt;/pre&gt;&lt;/div&gt;


&lt;p&gt;Now, the digits of these numbers. To get the digits, I will just use &lt;code&gt;str(i)&lt;/code&gt;, like:&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;plt.hist(list(map(int, list(&amp;#39;&amp;#39;.join(str(i) for i in numbers)))),
         density=True, edgecolor=&amp;#39;black&amp;#39;, bins=list(range(0, 11)))
plt.title(&amp;#39;Density of digits of random numbers&amp;#39;)
plt.show()
&lt;/pre&gt;&lt;/div&gt;


&lt;p&gt;&lt;img src="/images/digits2.png" alt="." style="width: 600px;"/&gt;&lt;/p&gt;
&lt;p&gt;There is a deficit of 0s. This makes sense, since leading 0s are not printed with &lt;code&gt;str(i)&lt;/code&gt;, so 56 is &lt;code&gt;56&lt;/code&gt; and not &lt;code&gt;056&lt;/code&gt;. We can "fix" this by using &lt;code&gt;'{:03d}'.format(r)&lt;/code&gt;:&lt;/p&gt;
&lt;p&gt;&lt;img src="/images/digits3.png" alt="." style="width: 600px;"/&gt;&lt;/p&gt;
&lt;p&gt;This looks right. But formatting with leading 0s doesn't necessarily mean that the digit distribution is uniform just because the distribution of the numbers is uniform; it depends on the range of the numbers. Below is the distribution of digits for numbers drawn evenly from ranges with upper limits 200, 500 and 1000:&lt;/p&gt;
&lt;p&gt;&lt;img src="/images/digits4.png" alt="." style="width: 600px;"/&gt;&lt;/p&gt;
&lt;p&gt;Finally, let's see the distribution of first digits (without leading 0s):&lt;/p&gt;
&lt;p&gt;&lt;img src="/images/digits5.png" alt="." style="width: 600px;"/&gt;&lt;/p&gt;
&lt;h2&gt;Amounts from receipts&lt;/h2&gt;
&lt;p&gt;Let's leave random numbers behind, and look at amounts from real-world receipts from Dubai. As above, I cut off numbers at 1000. First, let's look at the distribution:&lt;/p&gt;
&lt;p&gt;&lt;img src="/images/digits6.png" alt="." style="width: 600px;"/&gt;&lt;/p&gt;
&lt;p&gt;Next, let's look at the distribution of digits, with amounts formatted like &lt;code&gt;4.90&lt;/code&gt;, with 2 decimals, without leading 0s:&lt;/p&gt;
&lt;p&gt;&lt;img src="/images/digits7.png" alt="." style="width: 600px;"/&gt;&lt;/p&gt;
&lt;p&gt;Unsurprisingly, 0 is the most frequent digit, because the decimals are often &lt;code&gt;00&lt;/code&gt; or &lt;code&gt;x0&lt;/code&gt;. What if we cut off the trailing 0s, so &lt;code&gt;4.90&lt;/code&gt; becomes &lt;code&gt;4.9&lt;/code&gt;:&lt;/p&gt;
&lt;p&gt;&lt;img src="/images/digits8.png" alt="." style="width: 600px;"/&gt;&lt;/p&gt;
&lt;p&gt;If we look at it like this, 0 is the least frequent digit, and 5 is the most frequent digit. To get a better feeling for the decimal part, we can plot the density of just the decimal numbers, so &lt;code&gt;90&lt;/code&gt; from &lt;code&gt;4.90&lt;/code&gt;.&lt;/p&gt;
&lt;p&gt;&lt;img src="/images/digits9.png" alt="." style="width: 600px;"/&gt;&lt;/p&gt;
&lt;p&gt;This shows that prices are likely to be multiples of 5s (like &lt;code&gt;4.00, 4.05, 4.10 ... 4.95&lt;/code&gt;), &lt;code&gt;00&lt;/code&gt; and &lt;code&gt;50&lt;/code&gt; being the most likely. Interestingly, prices are not likely to end in &lt;code&gt;9&lt;/code&gt; or &lt;code&gt;99&lt;/code&gt; for this dataset.&lt;/p&gt;
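&lt;p&gt;Extracting the decimal part can be sketched like this (the helper and its sample inputs are mine; real amounts come from the receipts dataset):&lt;/p&gt;

```python
from collections import Counter

def cents_distribution(amounts):
    # amounts: prices like 4.90; extract the two decimal digits as an int 0..99
    cents = [round(a * 100) % 100 for a in amounts]
    counts = Counter(cents)
    total = len(cents)
    return {c: counts[c] / total for c in counts}
```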
&lt;p&gt;Finally, the density of digits of the decimal parts, confirming that 5 is the most likely non-zero digit:&lt;/p&gt;
&lt;p&gt;&lt;img src="/images/digits10.png" alt="." style="width: 600px;"/&gt;&lt;/p&gt;
&lt;p&gt;Plotting just the last digit:&lt;/p&gt;
&lt;p&gt;&lt;img src="/images/digits20.png" alt="." style="width: 600px;"/&gt;&lt;/p&gt;
&lt;h2&gt;Benford's law&lt;/h2&gt;
&lt;p&gt;Next, let's check &lt;a href="https://en.wikipedia.org/wiki/Benford%27s_law"&gt;Benford's law&lt;/a&gt;:&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;Benford's law, also called the Newcomb–Benford law, the law of anomalous numbers, or the first-digit law, is an observation about the frequency distribution of leading digits in many real-life sets of numerical data... It has been shown that this result applies to a wide variety of data sets, including electricity bills, street addresses, stock prices, house prices, population numbers, death rates, lengths of rivers, and physical and mathematical constants. A set of numbers is said to satisfy Benford's law if the leading digit d occurs with probability:&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;&lt;center&gt;$ P(d) = \log_{10} (1 + 1/d) $ &amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; $ d \in \{1, \ldots, 9\} $&lt;/center&gt;&lt;/p&gt;
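&lt;p&gt;The predicted frequencies are easy to compute, and can be sanity-checked against a sequence known to follow the law, such as the powers of 2:&lt;/p&gt;

```python
import math
from collections import Counter

# Benford's predicted probability for each leading digit 1..9
benford = {d: math.log10(1 + 1 / d) for d in range(1, 10)}

# first digits of 2^1 .. 2^1000, a classic Benford-following sequence
counts = Counter(int(str(2 ** k)[0]) for k in range(1, 1001))
empirical = {d: counts[d] / 1000 for d in range(1, 10)}
```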
&lt;p&gt;Plotting the distribution of first digits of our amounts, with Benford's law:&lt;/p&gt;
&lt;p&gt;&lt;img src="/images/digits11.png" alt="." style="width: 600px;"/&gt;&lt;/p&gt;
&lt;p&gt;Note that first digits from uniform random numbers did not follow Benford's law, but our "real" numbers do. Interestingly, the same distributions also hold for the total amounts from receipts.&lt;/p&gt;
&lt;h2&gt;Conclusion&lt;/h2&gt;
&lt;p&gt;Digits from uniform random numbers follow a truncated-uniform distribution, depending on the formatting and upper bound. Digits from real sources like receipts and zip codes follow some artificial pattern that depends on the source, for example here the amounts tend to be multiples of 5s. Benford's law holds for 1st digits of "real" numbers.&lt;/p&gt;</content><category term="statistics"></category></entry><entry><title>Tricks vs implementation in coding interviews</title><link href="https://bytepawn.com/tricks-vs-implementation-in-coding-interviews.html" rel="alternate"></link><published>2021-05-22T00:00:00+02:00</published><updated>2021-05-22T00:00:00+02:00</updated><author><name>Marton Trencseni</name></author><id>tag:bytepawn.com,2021-05-22:/tricks-vs-implementation-in-coding-interviews.html</id><summary type="html">&lt;p&gt;Thinking up tricky solutions in 3-5 minutes is not a requirement in a work setting. Usually, there are days or weeks for that. But implementing an idea, once the idea is there, should be straightforward for a good programmer. &lt;br/&gt;&lt;br/&gt; &lt;img src="/images/daily_coding_problem.jpg" alt="Daily Coding Problem" style="width: 400px;"/&gt;&lt;/p&gt;</summary><content type="html">&lt;p&gt;Recently I bought the book &lt;a href="https://www.amazon.com/Daily-Coding-Problem-exceptionally-interviews/dp/1793296634"&gt;Daily Coding Problem&lt;/a&gt;. I bought it because:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;I find these sorts of problems intellectually pleasing&lt;/li&gt;
&lt;li&gt;it's a good way to keep myself sharp as a programmer&lt;/li&gt;
&lt;li&gt;perhaps I can re-purpose some of the problems for Data Science interviews.&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;&lt;img src="/images/daily_coding_problem.jpg" alt="Daily Coding Problem" style="width: 400px;"/&gt;&lt;/p&gt;
&lt;p&gt;The problems and solutions in the book reminded me of my own interviewing experiences, as a candidate: I get nervous in interview situations and don't do well on coding interviews. Doctors call this the &lt;strong&gt;white coat effect&lt;/strong&gt;:&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;The alerting reaction to the physician's visit is known to induce a blood pressure rise termed "white coat effect." This phenomenon has often been associated with a clinical condition characterized by a persistently high blood pressure in the doctor's office and a persistently normal blood pressure at other times.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;In other words, the measurement ("how well can this person program") yields a result below the actual truth. Let's look at an example:&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;Given an array of numbers, find the maximum sum of any contiguous subarray of the array.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;I think any candidate for a programming job can be expected to code up the $ O(n^3) $ brute force solution, but I'm not sure I would figure out the more efficient $ O(n) $ solution known as &lt;a href="https://en.wikipedia.org/wiki/Maximum_subarray_problem"&gt;Kadane's algorithm&lt;/a&gt;, with my blood pressure at 160-170, instead of my normal 135, in 3-5 minutes flat:&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;max_sum&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;arr&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
    &lt;span class="n"&gt;max_sum&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nb"&gt;float&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;-inf&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="n"&gt;cur_sum&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;
    &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;x&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;arr&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
        &lt;span class="n"&gt;cur_sum&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nb"&gt;max&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;cur_sum&lt;/span&gt; &lt;span class="o"&gt;+&lt;/span&gt; &lt;span class="n"&gt;x&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
        &lt;span class="n"&gt;max_sum&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nb"&gt;max&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;max_sum&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;cur_sum&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;max_sum&lt;/span&gt;
&lt;/pre&gt;&lt;/div&gt;


&lt;p&gt;So as a hiring manager, I think a lot about how to make hiring loops less painful and more fair. It occurred to me that a potentially better interviewing approach would be to:&lt;/p&gt;
&lt;ol&gt;
&lt;li&gt;Ask the candidate to figure out the naive / brute force solution on their own, and code it up.&lt;/li&gt;
&lt;li&gt;Then, &lt;strong&gt;tell them verbally what the optimal solution is&lt;/strong&gt;, and have them code it up.&lt;/li&gt;
&lt;/ol&gt;
&lt;p&gt;Why?&lt;/p&gt;
&lt;ol&gt;
&lt;li&gt;Removing the step of the candidate having to think up tricky solutions would remove significant pressure.&lt;/li&gt;
&lt;li&gt;It still tests the most important thing: the ability to implement without errors and write clean code.&lt;/li&gt;
&lt;/ol&gt;
&lt;p&gt;The example above is &lt;strong&gt;not&lt;/strong&gt; a good example, because Kadane's algorithm, once written down, is so simple, it's hard to "tell it" without effectively giving the candidate the solution itself. A better example is:&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;Given an array of numbers, find the bounds of the smallest window that must be sorted in order for the whole array to be sorted. For example, for the input &lt;code&gt;[3, 7, 5, 6, 9]&lt;/code&gt;, return &lt;code&gt;[1, 3]&lt;/code&gt;.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;So:&lt;/p&gt;
&lt;ol&gt;
&lt;li&gt;Candidate gives the naive / brute force solution: copy and sort the array, then compare with the original. Time complexity is $ O(n \log n) $.&lt;/li&gt;
&lt;li&gt;Tell the candidate that: &lt;em&gt;Traverse the array and note whether the element is less than the maximum up to that point. If it is, this element would have to be part of the sorting window, since it's out of order compared to that previous maximum. Same in the reverse direction.&lt;/em&gt; Time complexity is $ O(n) $.&lt;/li&gt;
&lt;/ol&gt;
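&lt;p&gt;For reference, the naive solution from step 1 can be sketched like this (my sketch; the helper name &lt;code&gt;min_window_naive&lt;/code&gt; is made up for illustration):&lt;/p&gt;

```python
def min_window_naive(arr):
    # Compare each position against a sorted copy; the first and last
    # mismatches bound the window that must be sorted.
    ordered = sorted(arr)
    diffs = [i for i, (a, b) in enumerate(zip(arr, ordered)) if a != b]
    if not diffs:
        return None, None  # already sorted
    return diffs[0], diffs[-1]

print(min_window_naive([3, 7, 5, 6, 9]))  # (1, 3)
```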
&lt;p&gt;Implementing the algorithm is still not trivial in an interview setting:&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;min_window&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;arr&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
    &lt;span class="n"&gt;left&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;right&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="bp"&gt;None&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="bp"&gt;None&lt;/span&gt;
    &lt;span class="n"&gt;max_seen&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;min_seen&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nb"&gt;float&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;-inf&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt; &lt;span class="nb"&gt;float&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;inf&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

    &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;i&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="nb"&gt;range&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nb"&gt;len&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;arr&lt;/span&gt;&lt;span class="p"&gt;)):&lt;/span&gt;
        &lt;span class="n"&gt;max_seen&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nb"&gt;max&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;max_seen&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;arr&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;i&lt;/span&gt;&lt;span class="p"&gt;])&lt;/span&gt;
        &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;arr&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;i&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="o"&gt;&amp;lt;&lt;/span&gt; &lt;span class="n"&gt;max_seen&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
            &lt;span class="n"&gt;right&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;i&lt;/span&gt;

    &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;i&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="nb"&gt;range&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;n&lt;/span&gt; &lt;span class="o"&gt;-&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="o"&gt;-&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="o"&gt;-&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
        &lt;span class="n"&gt;min_seen&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nb"&gt;min&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;min_seen&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;arr&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;i&lt;/span&gt;&lt;span class="p"&gt;])&lt;/span&gt;
        &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;arr&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;i&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="o"&gt;&amp;gt;&lt;/span&gt; &lt;span class="n"&gt;min_seen&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
            &lt;span class="n"&gt;left&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;i&lt;/span&gt;

    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;left&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;right&lt;/span&gt;
&lt;/pre&gt;&lt;/div&gt;


&lt;p&gt;This seems like a better interviewing approach, because in a work setting, thinking up tricky solutions in 3-5 minutes is not a requirement. Usually, there are days or weeks for that. But implementing an idea, once the idea is there, should be straightforward for a good programmer.&lt;/p&gt;</content><category term="hiring"></category><category term="interviewing"></category></entry><entry><title>10 ways to iterate from 0 to 1 with deciles</title><link href="https://bytepawn.com/ten-ways-to-iterate-from-0-to-1-with-deciles.html" rel="alternate"></link><published>2021-05-14T00:00:00+02:00</published><updated>2021-05-14T00:00:00+02:00</updated><author><name>Marton Trencseni</name></author><id>tag:bytepawn.com,2021-05-14:/ten-ways-to-iterate-from-0-to-1-with-deciles.html</id><summary type="html">&lt;p&gt;What's the best way to iterate from 0 to 1 in steps of 0.1 in Python, and what are the potential pitfalls?&lt;br/&gt;&lt;br/&gt; &lt;img src="/images/iterate10.png" alt="Iterating from 0 to 1 in steps of 0.1" style="width: 400px;"/&gt;&lt;/p&gt;</summary><content type="html">&lt;p&gt;Suppose we want to iterate from 0 to 1, in steps of 0.1, like &lt;code&gt;0.0, 0.1, 0.2, ... 0.9&lt;/code&gt;. This happens regularly in Data Science notebooks.&lt;/p&gt;
&lt;p&gt;Note that iteration like this is usually &lt;strong&gt;exclusive at the end&lt;/strong&gt;, i.e. we expect it to stop at 0.9, the same way &lt;code&gt;range(0, 10)&lt;/code&gt; stops at 9.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Approach #1:&lt;/strong&gt; Use &lt;a href="https://docs.python.org/3/library/functions.html#func-range"&gt;&lt;code&gt;range&lt;/code&gt;&lt;/a&gt;, divide manually:&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;for i in range(10):
    print(i/10)

&amp;gt; 0.0
&amp;gt; 0.1
&amp;gt; 0.2
&amp;gt; 0.3
&amp;gt; 0.4
&amp;gt; 0.5
&amp;gt; 0.6
&amp;gt; 0.7
&amp;gt; 0.8
&amp;gt; 0.9
&lt;/pre&gt;&lt;/div&gt;


&lt;p&gt;The downside of this basic solution is that the iteration logic leaks into the &lt;code&gt;print()&lt;/code&gt; code. Also what if we want to iterate with steps of 0.42:&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;for i in range(10):
    print(i*0.42)
&lt;/pre&gt;&lt;/div&gt;


&lt;p&gt;In real code, it would be less clear that this is actually a clean iteration, and not multiplication for some other reason.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Approach #2:&lt;/strong&gt; Use &lt;code&gt;arange&lt;/code&gt; from &lt;code&gt;numpy&lt;/code&gt;&lt;/p&gt;
&lt;p&gt;The overall best practice is to use &lt;a href="https://numpy.org/doc/stable/reference/generated/numpy.arange.html"&gt;&lt;code&gt;arange&lt;/code&gt;&lt;/a&gt; from &lt;code&gt;numpy&lt;/code&gt;:&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="nn"&gt;numpy&lt;/span&gt; &lt;span class="kn"&gt;as&lt;/span&gt; &lt;span class="nn"&gt;np&lt;/span&gt;

&lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;i&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;np&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="kp"&gt;arange&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mf"&gt;0.1&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
    &lt;span class="k"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;i&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="o"&gt;&amp;gt;&lt;/span&gt; &lt;span class="mf"&gt;0.0&lt;/span&gt;
&lt;span class="o"&gt;&amp;gt;&lt;/span&gt; &lt;span class="mf"&gt;0.1&lt;/span&gt;
&lt;span class="o"&gt;&amp;gt;&lt;/span&gt; &lt;span class="mf"&gt;0.2&lt;/span&gt;
&lt;span class="o"&gt;&amp;gt;&lt;/span&gt; &lt;span class="mf"&gt;0.30000000000000004&lt;/span&gt;
&lt;span class="o"&gt;&amp;gt;&lt;/span&gt; &lt;span class="mf"&gt;0.4&lt;/span&gt;
&lt;span class="o"&gt;&amp;gt;&lt;/span&gt; &lt;span class="mf"&gt;0.5&lt;/span&gt;
&lt;span class="o"&gt;&amp;gt;&lt;/span&gt; &lt;span class="mf"&gt;0.6000000000000001&lt;/span&gt;
&lt;span class="o"&gt;&amp;gt;&lt;/span&gt; &lt;span class="mf"&gt;0.7000000000000001&lt;/span&gt;
&lt;span class="o"&gt;&amp;gt;&lt;/span&gt; &lt;span class="mf"&gt;0.8&lt;/span&gt;
&lt;span class="o"&gt;&amp;gt;&lt;/span&gt; &lt;span class="mf"&gt;0.9&lt;/span&gt;
&lt;/pre&gt;&lt;/div&gt;


&lt;p&gt;The only downside here is that &lt;code&gt;arange&lt;/code&gt; creates the whole array in memory, instead of yielding a Python iterator. This is a non-issue in real life, but let's geek out and see what the options are to avoid creating the whole list/array in memory.&lt;/p&gt;
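&lt;p&gt;To make the memory point concrete, here is a quick check (assuming &lt;code&gt;numpy&lt;/code&gt; is installed): &lt;code&gt;arange&lt;/code&gt; returns a fully materialized &lt;code&gt;ndarray&lt;/code&gt;, so all 10 values exist up front:&lt;/p&gt;

```python
import numpy as np

# arange materializes all values immediately: 10 float64s = 80 bytes of data
a = np.arange(0, 1, 0.1)
print(type(a).__name__, len(a), a.nbytes)  # ndarray 10 80
```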
&lt;p&gt;&lt;strong&gt;Approach #3:&lt;/strong&gt; Use &lt;code&gt;count()&lt;/code&gt; from &lt;code&gt;itertools&lt;/code&gt;&lt;/p&gt;
&lt;p&gt;The closest built-in helper function Python has is &lt;a href="https://docs.python.org/3/library/itertools.html#itertools.count"&gt;&lt;code&gt;count()&lt;/code&gt;&lt;/a&gt; from &lt;code&gt;itertools&lt;/code&gt;. You supply &lt;code&gt;start&lt;/code&gt; and &lt;code&gt;step&lt;/code&gt; arguments, but it counts infinitely, so you have to take care of the stopping condition:&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="nn"&gt;itertools&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;count&lt;/span&gt;

&lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;i&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;count&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mf"&gt;0.1&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;  &lt;span class="c1"&gt;# yields 11 times, a logical error!&lt;/span&gt;
    &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;i&lt;/span&gt; &lt;span class="o"&gt;&amp;gt;=&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="k"&gt;break&lt;/span&gt;
    &lt;span class="k"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;i&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="o"&gt;&amp;gt;&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;
&lt;span class="o"&gt;&amp;gt;&lt;/span&gt; &lt;span class="mf"&gt;0.1&lt;/span&gt;
&lt;span class="o"&gt;&amp;gt;&lt;/span&gt; &lt;span class="mf"&gt;0.2&lt;/span&gt;
&lt;span class="o"&gt;&amp;gt;&lt;/span&gt; &lt;span class="mf"&gt;0.30000000000000004&lt;/span&gt;
&lt;span class="o"&gt;&amp;gt;&lt;/span&gt; &lt;span class="mf"&gt;0.4&lt;/span&gt;
&lt;span class="o"&gt;&amp;gt;&lt;/span&gt; &lt;span class="mf"&gt;0.5&lt;/span&gt;
&lt;span class="o"&gt;&amp;gt;&lt;/span&gt; &lt;span class="mf"&gt;0.6&lt;/span&gt;
&lt;span class="o"&gt;&amp;gt;&lt;/span&gt; &lt;span class="mf"&gt;0.7&lt;/span&gt;
&lt;span class="o"&gt;&amp;gt;&lt;/span&gt; &lt;span class="mf"&gt;0.7999999999999999&lt;/span&gt;
&lt;span class="o"&gt;&amp;gt;&lt;/span&gt; &lt;span class="mf"&gt;0.8999999999999999&lt;/span&gt;
&lt;span class="o"&gt;&amp;gt;&lt;/span&gt; &lt;span class="mf"&gt;0.9999999999999999&lt;/span&gt; &lt;span class="o"&gt;&amp;lt;-&lt;/span&gt; &lt;span class="n"&gt;due&lt;/span&gt; &lt;span class="n"&gt;to&lt;/span&gt; &lt;span class="n"&gt;floating&lt;/span&gt; &lt;span class="n"&gt;point&lt;/span&gt; &lt;span class="n"&gt;precision&lt;/span&gt; &lt;span class="n"&gt;issues&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;it&lt;/span&gt; &lt;span class="n"&gt;also&lt;/span&gt; &lt;span class="n"&gt;returns&lt;/span&gt; &lt;span class="o"&gt;~&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;
&lt;/pre&gt;&lt;/div&gt;


&lt;p&gt;This is very ugly because of the &lt;code&gt;if&lt;/code&gt;, but it also has a big logical error: it yields a value 11 times instead of 10, because due to floating point precision issues it also returns ~1, and doesn't stop at ~0.9. We'll return to floating point issues in a bit.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Approach #4:&lt;/strong&gt; Use &lt;code&gt;count()&lt;/code&gt; and &lt;code&gt;islice()&lt;/code&gt; from &lt;code&gt;itertools&lt;/code&gt;&lt;/p&gt;
&lt;p&gt;We can use &lt;a href="https://docs.python.org/3/library/itertools.html#itertools.islice"&gt;&lt;code&gt;islice&lt;/code&gt;&lt;/a&gt; to return the first &lt;code&gt;int((stop-start)/step)&lt;/code&gt; elements from the &lt;code&gt;count()&lt;/code&gt; iterator. This saves us the ugly manual &lt;code&gt;if&lt;/code&gt;.&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="nn"&gt;itertools&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;count&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;islice&lt;/span&gt;

&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;frange&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;start&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;stop&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;step&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;islice&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;count&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;start&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;step&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt; &lt;span class="n"&gt;start&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nb"&gt;int&lt;/span&gt;&lt;span class="p"&gt;((&lt;/span&gt;&lt;span class="n"&gt;stop&lt;/span&gt;&lt;span class="o"&gt;-&lt;/span&gt;&lt;span class="n"&gt;start&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;&lt;span class="o"&gt;/&lt;/span&gt;&lt;span class="n"&gt;step&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;

&lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;i&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;frange&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mf"&gt;0.1&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
    &lt;span class="k"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;i&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="o"&gt;&amp;gt;&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;
&lt;span class="o"&gt;&amp;gt;&lt;/span&gt; &lt;span class="mf"&gt;0.1&lt;/span&gt;
&lt;span class="o"&gt;&amp;gt;&lt;/span&gt; &lt;span class="mf"&gt;0.2&lt;/span&gt;
&lt;span class="o"&gt;&amp;gt;&lt;/span&gt; &lt;span class="mf"&gt;0.30000000000000004&lt;/span&gt;
&lt;span class="o"&gt;&amp;gt;&lt;/span&gt; &lt;span class="mf"&gt;0.4&lt;/span&gt;
&lt;span class="o"&gt;&amp;gt;&lt;/span&gt; &lt;span class="mf"&gt;0.5&lt;/span&gt;
&lt;span class="o"&gt;&amp;gt;&lt;/span&gt; &lt;span class="mf"&gt;0.6&lt;/span&gt;
&lt;span class="o"&gt;&amp;gt;&lt;/span&gt; &lt;span class="mf"&gt;0.7&lt;/span&gt;
&lt;span class="o"&gt;&amp;gt;&lt;/span&gt; &lt;span class="mf"&gt;0.7999999999999999&lt;/span&gt;
&lt;span class="o"&gt;&amp;gt;&lt;/span&gt; &lt;span class="mf"&gt;0.8999999999999999&lt;/span&gt;
&lt;/pre&gt;&lt;/div&gt;


&lt;p&gt;In this case the floats work out in our favor and it yields 10 elements.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Approach #5:&lt;/strong&gt; Use &lt;code&gt;map()&lt;/code&gt;&lt;/p&gt;
&lt;p&gt;Let's be functional and use &lt;a href="https://docs.python.org/3/library/functions.html#map"&gt;&lt;code&gt;map()&lt;/code&gt;&lt;/a&gt;:&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;for i in map(lambda x: x / 10, range(10)):
    print(i)

&amp;gt; 0.0
&amp;gt; 0.1
&amp;gt; 0.2
&amp;gt; 0.3
&amp;gt; 0.4
&amp;gt; 0.5
&amp;gt; 0.6
&amp;gt; 0.7
&amp;gt; 0.8
&amp;gt; 0.9
&lt;/pre&gt;&lt;/div&gt;


&lt;p&gt;&lt;strong&gt;Approach #6:&lt;/strong&gt; Use &lt;code&gt;map()&lt;/code&gt; in a cleaner way&lt;/p&gt;
&lt;p&gt;We can put the &lt;code&gt;map&lt;/code&gt; in a function to make it cleaner:&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;def frange(start, stop, step):
    return map(lambda x: start + x * step, range(int((stop-start)/step)))

for i in frange(0, 1, 0.1):
    print(i)

&amp;gt; 0.0
&amp;gt; 0.1
&amp;gt; 0.2
&amp;gt; 0.30000000000000004
&amp;gt; 0.4
&amp;gt; 0.5
&amp;gt; 0.6000000000000001
&amp;gt; 0.7000000000000001
&amp;gt; 0.8
&amp;gt; 0.9
&lt;/pre&gt;&lt;/div&gt;


&lt;p&gt;&lt;strong&gt;Approach #7:&lt;/strong&gt; Write it yourself&lt;/p&gt;
&lt;p&gt;Or we can just implement &lt;code&gt;frange&lt;/code&gt; ourselves:&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;def frange&lt;span class="p"&gt;(&lt;/span&gt;start&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="kp"&gt;stop&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; step&lt;span class="p"&gt;)&lt;/span&gt;&lt;span class="o"&gt;:&lt;/span&gt;
    &lt;span class="kr"&gt;while&lt;/span&gt; start &lt;span class="o"&gt;&amp;lt;&lt;/span&gt; &lt;span class="kp"&gt;stop&lt;/span&gt;&lt;span class="o"&gt;:&lt;/span&gt;
        yield start
        start &lt;span class="o"&gt;+=&lt;/span&gt; step

&lt;span class="kr"&gt;for&lt;/span&gt; i &lt;span class="kr"&gt;in&lt;/span&gt; frange&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="m"&gt;0&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="m"&gt;1&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="m"&gt;0.1&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;&lt;span class="o"&gt;:&lt;/span&gt; &lt;span class="c1"&gt;# yields 11 times, a logical error!&lt;/span&gt;
    &lt;span class="kp"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;i&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="o"&gt;&amp;gt;&lt;/span&gt; &lt;span class="m"&gt;0&lt;/span&gt;
&lt;span class="o"&gt;&amp;gt;&lt;/span&gt; &lt;span class="m"&gt;0.1&lt;/span&gt;
&lt;span class="o"&gt;&amp;gt;&lt;/span&gt; &lt;span class="m"&gt;0.2&lt;/span&gt;
&lt;span class="o"&gt;&amp;gt;&lt;/span&gt; &lt;span class="o"&gt;&amp;gt;&lt;/span&gt; &lt;span class="m"&gt;0.30000000000000004&lt;/span&gt;
&lt;span class="o"&gt;&amp;gt;&lt;/span&gt; &lt;span class="m"&gt;0.4&lt;/span&gt;
&lt;span class="o"&gt;&amp;gt;&lt;/span&gt; &lt;span class="m"&gt;0.5&lt;/span&gt;
&lt;span class="o"&gt;&amp;gt;&lt;/span&gt; &lt;span class="m"&gt;0.6&lt;/span&gt;
&lt;span class="o"&gt;&amp;gt;&lt;/span&gt; &lt;span class="m"&gt;0.7&lt;/span&gt;
&lt;span class="o"&gt;&amp;gt;&lt;/span&gt; &lt;span class="m"&gt;0.7999999999999999&lt;/span&gt;
&lt;span class="o"&gt;&amp;gt;&lt;/span&gt; &lt;span class="m"&gt;0.8999999999999999&lt;/span&gt;
&lt;span class="o"&gt;&amp;gt;&lt;/span&gt; &lt;span class="m"&gt;0.9999999999999999&lt;/span&gt; &lt;span class="o"&gt;&amp;lt;-&lt;/span&gt; due to floating point precision issues&lt;span class="p"&gt;,&lt;/span&gt; it also returns &lt;span class="o"&gt;~&lt;/span&gt;&lt;span class="m"&gt;1&lt;/span&gt;
&lt;/pre&gt;&lt;/div&gt;


&lt;p&gt;&lt;strong&gt;Approach #8:&lt;/strong&gt; Epsilons&lt;/p&gt;
&lt;p&gt;Some of the approaches had logical problems, because they yielded 11 times, also yielding ~1, due to floating point issues (the exact cases when this is a problem are platform and argument dependent). One way to address this is to anticipate floating point precision problems and explicitly handle ε differences up to some user-specified precision. But this is such a bad idea, I will stop here.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Approach #9:&lt;/strong&gt; Use &lt;code&gt;decimal&lt;/code&gt;&lt;/p&gt;
&lt;p&gt;The smart way to get rid of the floating point issues above is to use the Python standard &lt;a href="https://docs.python.org/3/library/decimal.html"&gt;&lt;code&gt;decimal&lt;/code&gt;&lt;/a&gt; library (together with the generator &lt;code&gt;frange()&lt;/code&gt; from Approach #7):&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="nn"&gt;decimal&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt;

&lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;i&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;frange&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;Decimal&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;0&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt; &lt;span class="n"&gt;Decimal&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;1&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt; &lt;span class="n"&gt;Decimal&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;0.1&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;)):&lt;/span&gt;
    &lt;span class="k"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;i&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="o"&gt;&amp;gt;&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;
&lt;span class="o"&gt;&amp;gt;&lt;/span&gt; &lt;span class="mf"&gt;0.1&lt;/span&gt;
&lt;span class="o"&gt;&amp;gt;&lt;/span&gt; &lt;span class="mf"&gt;0.2&lt;/span&gt;
&lt;span class="o"&gt;&amp;gt;&lt;/span&gt; &lt;span class="mf"&gt;0.3&lt;/span&gt;
&lt;span class="o"&gt;&amp;gt;&lt;/span&gt; &lt;span class="mf"&gt;0.4&lt;/span&gt;
&lt;span class="o"&gt;&amp;gt;&lt;/span&gt; &lt;span class="mf"&gt;0.5&lt;/span&gt;
&lt;span class="o"&gt;&amp;gt;&lt;/span&gt; &lt;span class="mf"&gt;0.6&lt;/span&gt;
&lt;span class="o"&gt;&amp;gt;&lt;/span&gt; &lt;span class="mf"&gt;0.7&lt;/span&gt;
&lt;span class="o"&gt;&amp;gt;&lt;/span&gt; &lt;span class="mf"&gt;0.8&lt;/span&gt;
&lt;span class="o"&gt;&amp;gt;&lt;/span&gt; &lt;span class="mf"&gt;0.9&lt;/span&gt;
&lt;/pre&gt;&lt;/div&gt;


&lt;p&gt;Notice that the arguments to the &lt;code&gt;Decimal&lt;/code&gt; constructor are strings, not numbers. Passing in numbers yields:&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;for i in frange(Decimal(0), Decimal(1), Decimal(0.1)):
    print(i)

&amp;gt; 0
&amp;gt; 0.1000000000000000055511151231
&amp;gt; 0.2000000000000000111022302462
&amp;gt; 0.3000000000000000166533453693
&amp;gt; 0.4000000000000000222044604924
&amp;gt; 0.5000000000000000277555756155
&amp;gt; 0.6000000000000000333066907386
&amp;gt; 0.7000000000000000388578058617
&amp;gt; 0.8000000000000000444089209848
&amp;gt; 0.9000000000000000499600361079
&lt;/pre&gt;&lt;/div&gt;


&lt;p&gt;Note that you can set the precision for the &lt;code&gt;decimal&lt;/code&gt; library to cut off the ugly trailing noise, which also lets you avoid passing strings to the constructor:&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;getcontext().prec = 6

for i in frange(Decimal(0), Decimal(1), Decimal(0.1)):
    print(i)

&amp;gt; 0
&amp;gt; 0.100000
&amp;gt; 0.200000
&amp;gt; 0.300000
&amp;gt; 0.400000
&amp;gt; 0.500000
&amp;gt; 0.600000
&amp;gt; 0.700000
&amp;gt; 0.800000
&amp;gt; 0.900000
&lt;/pre&gt;&lt;/div&gt;


&lt;p&gt;&lt;strong&gt;Approach #10:&lt;/strong&gt; Alternative signatures&lt;/p&gt;
&lt;p&gt;The floating point troubles suggest that &lt;code&gt;frange(start, stop, step)&lt;/code&gt; is maybe not the best signature; &lt;code&gt;frange(start, step, num)&lt;/code&gt; is cleaner:&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;def frange(start, step, num):
    for i in range(num):
        yield start
        start += step
&lt;/pre&gt;&lt;/div&gt;


&lt;p&gt;This way, we never have to worry about floating point issues yielding an extra value, even without using &lt;code&gt;decimal&lt;/code&gt;:&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;for i in frange(start=0, step=0.1, num=10):
    print(i)

&amp;gt; 0
&amp;gt; 0.1
&amp;gt; 0.2
&amp;gt; 0.30000000000000004
&amp;gt; 0.4
&amp;gt; 0.5
&amp;gt; 0.6
&amp;gt; 0.7
&amp;gt; 0.7999999999999999
&amp;gt; 0.8999999999999999
&lt;/pre&gt;&lt;/div&gt;


&lt;p&gt;Also, here backward iteration works as expected, unlike in some of the solutions given before:&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;for i in frange(start=0, step=-0.1, num=10):
    print(i)

&amp;gt; 0
&amp;gt; -0.1
&amp;gt; -0.2
&amp;gt; -0.30000000000000004
&amp;gt; -0.4
&amp;gt; -0.5
&amp;gt; -0.6
&amp;gt; -0.7
&amp;gt; -0.7999999999999999
&amp;gt; -0.8999999999999999
&lt;/pre&gt;&lt;/div&gt;


&lt;p&gt;Or:&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;def frange(start, step, num, backward=False):
    if backward:
        step *= -1
    for i in range(num):
        yield start
        start += step
&lt;/pre&gt;&lt;/div&gt;


&lt;p&gt;The downside here is that if we want to switch the step size from &lt;code&gt;0.1&lt;/code&gt; to &lt;code&gt;0.05&lt;/code&gt;, we have to remember to also change &lt;code&gt;num&lt;/code&gt; to still reach 1.&lt;/p&gt;
&lt;p&gt;There is also a third possible signature, &lt;a href="https://numpy.org/doc/stable/reference/generated/numpy.linspace.html"&gt;&lt;code&gt;linspace&lt;/code&gt;&lt;/a&gt; from &lt;code&gt;numpy&lt;/code&gt;:&lt;/p&gt;
&lt;p&gt;&lt;code&gt;numpy.linspace(start, stop, num=50, endpoint=True, ...)&lt;/code&gt;: return evenly spaced numbers over a specified interval.&lt;/p&gt;
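&lt;p&gt;For completeness, a quick sketch of &lt;code&gt;linspace&lt;/code&gt; for our use case: with &lt;code&gt;endpoint=False&lt;/code&gt; it reproduces the exclusive-at-the-end behavior of the other approaches:&lt;/p&gt;

```python
import numpy as np

# num=10 points over [0, 1) with the endpoint excluded, so iteration
# stops at 0.9, the same way range(0, 10) stops at 9
for i in np.linspace(0, 1, num=10, endpoint=False):
    print(i)
```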
&lt;h2&gt;Conclusion&lt;/h2&gt;
&lt;p&gt;In real life, my recommendations are:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;for &lt;code&gt;start, stop, step&lt;/code&gt; signature use &lt;code&gt;numpy.arange()&lt;/code&gt;&lt;/li&gt;
&lt;li&gt;for &lt;code&gt;start, stop, num&lt;/code&gt; signature use &lt;code&gt;numpy.linspace()&lt;/code&gt;&lt;/li&gt;
&lt;li&gt;for &lt;code&gt;start, step, num&lt;/code&gt; signature use the &lt;code&gt;frange()&lt;/code&gt; above.&lt;/li&gt;
&lt;/ul&gt;</content><category term="mlflow"></category><category term="tracking"></category></entry><entry><title>Sometimes brute forcing just works</title><link href="https://bytepawn.com/sometimes-brute-forcing-just-works.html" rel="alternate"></link><published>2021-05-06T00:00:00+02:00</published><updated>2021-05-06T00:00:00+02:00</updated><author><name>Marton Trencseni</name></author><id>tag:bytepawn.com,2021-05-06:/sometimes-brute-forcing-just-works.html</id><summary type="html">&lt;p&gt;I describe a real world use-case where a simple, brute force search based solution worked really well, making more sophisticated Machine Learning unnecessary. &lt;br/&gt;&lt;br/&gt; &lt;img src="/images/receipt_sample2.jpg" alt="Sample receipt" style="width: 400px;"/&gt;&lt;/p&gt;</summary><content type="html">&lt;h2&gt;Introduction&lt;/h2&gt;
&lt;p&gt;Recently I was working on the problem of "parsing" store receipts. The customer takes a picture of the receipt using their smartphone and submits it, and it is our job to extract certain core fields from the receipt, such as the total amount or the date of the receipt. We had an existing solution which used a state-of-the-art, but not receipt-specific, OCR tool to convert the image to a chunk of text, and then used various regular expressions to extract these fields --- however, the extraction rate and accuracy were only 70-80% for each extracted field. I started working on this as a weekend hack to see whether I could improve the accuracy of the existing solution.&lt;/p&gt;
&lt;p&gt;I will describe how I cracked extracting the date field, where a surprisingly simple brute force search based approach worked, without any machine learning. It's a good lesson to remember that sometimes relatively simple searching and string manipulation works, and we don't need to reach for gradient boosting or neural networks.&lt;/p&gt;
&lt;p&gt;It's fair to ask: isn't this a solved problem? Aren't there out of the box solutions for receipt parsing? I thought the same thing, so as a zeroth step we looked at 3-4 different SaaS offerings, but they had very poor recognition rates (in the 50-60% range), possibly because of our Arabic text, and because we don't currently enforce quality standards on the uploaded images (e.g. many are blurry).&lt;/p&gt;
&lt;h2&gt;Brute force search&lt;/h2&gt;
&lt;p&gt;Since I was hacking, I decided to first stick with the existing OCR approach, before moving on to more fancy deep learning on the image itself. First I did "data exploration", which in this case means looking at a lot of the receipt pictures and the corresponding OCR text to get a sense of the problem. I was able to do this, because the existing solution has been in production for a while, so we have more than a million labeled samples (there is a team of humans correcting and labeling low confidence results).&lt;/p&gt;
&lt;p&gt;&lt;img src="/images/receipt_sample2.jpg" alt="Sample receipt" style="width: 400px;"/&gt;&lt;/p&gt;
&lt;p&gt;An additional difficulty here was that most of the invoices also had Arabic text on them, which tripped up the OCR software, so sometimes it switched from left-to-right to right-to-left word ordering in the text output. Also, the OCR software is not specific to invoices, so sometimes it got confused and thought the invoice was like a column of text in a newspaper, and parsed out the line items ("t-shirt") on the left of the invoice as a block of text, and then all the amounts on the right. This means that the resulting OCR string was very "dirty", very hard to "read" and see patterns in.&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;LANDMARK, RETAIL INVESTMENT CO. LLC مركة داند مارلا وپتيل للاستثمار ذ
م Hone Box Sharjah City Center Sharjah 800HOMEBOX الكاشي : 173157
:Cashier بیان :173157 :Sales ۲۲۰۱۶-۲۱/۱۴ ۹۶/۶۷۴۹/۳۰۹۰۳۰/۲۰۶/۱۱۱
23-04-21/12:45/4739/29030/204/111 : : : TRN: 100260641400003 ۱۰۰۳۶۰۶۱۹۰۰۰۰۳ 
: وتم تسجيل ضريبة القيمة هم ببیة فاتورة TAX INVOICE 29 0 3 0 2 0 4 4 7 3 9 2 0 2 1 0 4 2 2 
۹۰۳۰۳۰ ۷۳ ۲۱۰۳۲ الصنف العدل السعر Iten Quantity Rate Value-AED وو؟ ۳۹۹۱۹۹۷۰۳۹۹۳ ۴X 
6299169703493 2 X Knit Basket 3.3L w/o Lid , 4.00 8.00 سنة تخز بین 
HH Excl. VAT Unit Price = 3.91 غير شامل ضريبة القيمة المضافة ۳,۹۱ = سعر الوحدة مفرد مائدة - 
P -Blaza - Beaded Placetat ۶۴۹۹۱۹۹۹:۱۶ ۱۷ ۱۰,۰۰ 5299149955164 1X 10.00 P . 
Excl. VAT Unit Price = 9.52 10.00 غير شامل ضريبة القيمة المضاة ۹,۵۲ = سعر الوحدة إجمالي الكمية 
Total Quantity :: : الاجمالي Total : 18.00 نقدی Cash مجمل المناقصة 100.00 و ۱۰۰
Total Tender Change Due : 100.00 المتبقي المستحق : ۸۳,۰۰۰ -82.00 اجمالي الفواتي 
Tax Summary رمز الضرائب المعفاة الضريبة ضريبة الدخل Code غربیبة ه 
Ex Tax ۱۷,۱ 17.14 Tax ۰,۸۹ 0.86 ,۸۶ 0.86 Inc Tax ۱۸,۰۰ 18.00 AD ۱۴ ,۱۷ الإجمالي
Total 17.14 18.00 معدلات الضرائب Tax Rates = %: 5% 5: % 5 % :5%AD = ضري 5 % معدل تباسي من أن %5 AD :
Standard Rate of عفري معدل :أو 04_ :40 خارج النطاق : أو 00s : 
Out of Scope معني من الضرائب : أكسر EX : Sax Exempt Excl. 
VAT Unit Price : غير شامل فر بيبة القيمة المضافة سعر الوحدة 
&lt;/pre&gt;&lt;/div&gt;


&lt;p&gt;Looking at the invoices, I noticed:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;Almost every one had the date in a different format, from &lt;code&gt;2020-12-24&lt;/code&gt; to &lt;code&gt;12-24-2020&lt;/code&gt; to &lt;code&gt;2020 Dec'24&lt;/code&gt; to &lt;code&gt;December 24, 2020&lt;/code&gt;.&lt;/li&gt;
&lt;li&gt;Because the text was the result of OCR, and the pictures were not clear, little characters like &lt;code&gt;,&lt;/code&gt; &lt;code&gt;.&lt;/code&gt; &lt;code&gt;'&lt;/code&gt; were not reliably parsed.&lt;/li&gt;
&lt;li&gt;However, putting aside the dirtiness of the little characters, once I found the date on the invoice image, I was usually able to find it in the OCR text.&lt;/li&gt;
&lt;li&gt;A big "insight" was that the date of the receipt was always very close to the date of submission (=the date when the user took the picture and uploaded it), and the date of submission is known at parse time.&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;So I tried a simple thing. I made a list of 10-20 Python format strings for &lt;code&gt;datetime.strftime()&lt;/code&gt;, like &lt;code&gt;%Y-%m-%d&lt;/code&gt;, and tried the following logic:&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;extract_date&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;text&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
    &lt;span class="n"&gt;formats&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;%Y-%m-&lt;/span&gt;&lt;span class="si"&gt;%d&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="o"&gt;...&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;
    &lt;span class="n"&gt;previous_days&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="bp"&gt;self&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;get_previous_days&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;submission_day&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;num_days&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="c1"&gt;# previous_days is ordered to start at submission_day, and go back in time&lt;/span&gt;
    &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;dt&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;previous_days&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
        &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;fmt&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;formats&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
            &lt;span class="n"&gt;ds&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;dt&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;strftime&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;fmt&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
            &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;ds&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;text&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
                &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;dt&lt;/span&gt; &lt;span class="c1"&gt;# found it!&lt;/span&gt;
    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="bp"&gt;None&lt;/span&gt;
&lt;/pre&gt;&lt;/div&gt;
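The `get_previous_days` helper isn't shown above; a minimal sketch of what it could look like — assuming it simply enumerates candidate dates backwards from the submission day — is:

```python
from datetime import date, timedelta

def get_previous_days(submission_day, num_days):
    # Candidate dates ordered to start at submission_day and go back
    # in time, so the search prefers dates closest to the submission
    return [submission_day - timedelta(days=i) for i in range(num_days)]
```

For example, `get_previous_days(date(2020, 12, 24), 3)` yields the 24th, 23rd and 22nd, in that order.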


&lt;p&gt;To my big surprise, this worked very well. For the cases where it returned something (not &lt;code&gt;None&lt;/code&gt;), it correctly found the date with 95%+ accuracy!&lt;/p&gt;
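For completeness, coverage and accuracy can be measured against a small hand-labeled sample roughly like this (a sketch; the `evaluate` helper and the `labeled` pairs are my illustration, not the code actually used):

```python
def evaluate(extractor, labeled):
    # labeled: list of (ocr_text, true_date) pairs
    predictions = [(extractor(text), truth) for text, truth in labeled]
    # coverage: fraction of invoices where the extractor returned a date
    answered = [(p, t) for p, t in predictions if p is not None]
    coverage = len(answered) / len(labeled)
    # accuracy: of the answered cases, the fraction matching the true date
    accuracy = sum(p == t for p, t in answered) / len(answered)
    return coverage, accuracy
```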
&lt;p&gt;So I went further down this path, and started adding more and more date formats. I quickly realized that instead of manually generating the date formats, I should have the computer do that, too. So I wrote a simple loop to generate lots of possible date formats, with different orderings, dividers and y/m/d formats:&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;span class="n"&gt;year_formats&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;%Y&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s1"&gt;&amp;#39;%y&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;
&lt;span class="n"&gt;month_formats&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;%m&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s1"&gt;&amp;#39;%b&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s1"&gt;&amp;#39;%B&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;
&lt;span class="n"&gt;day_formats&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;&lt;/span&gt;&lt;span class="si"&gt;%d&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;
&lt;span class="n"&gt;dividers&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;-&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s1"&gt;&amp;#39;/&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s1"&gt;&amp;#39;&lt;/span&gt;&lt;span class="se"&gt;\\&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="o"&gt;...&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;
&lt;span class="n"&gt;orderings&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;Y-M-D&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s1"&gt;&amp;#39;D-M-Y&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s1"&gt;&amp;#39;M-D-Y&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;
&lt;span class="c1"&gt;# generate all possible format strings and test&lt;/span&gt;
&lt;span class="n"&gt;formats&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[]&lt;/span&gt;
&lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;fmt&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;divider&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;year_format&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;month_format&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;day_format&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;product&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;orderings&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;dividers&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;year_formats&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;month_formats&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;day_formats&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
    &lt;span class="n"&gt;fmt&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;fmt&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;replace&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;-&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;divider&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="n"&gt;fmt&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;fmt&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;replace&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;Y&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;year_format&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="n"&gt;fmt&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;fmt&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;replace&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;M&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;month_format&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="n"&gt;fmt&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;fmt&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;replace&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;D&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;day_format&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="n"&gt;formats&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;append&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;fmt&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/pre&gt;&lt;/div&gt;


&lt;p&gt;This approach continued to work well. Eventually I was at ~80% recognition rate, still at ~95% accuracy. At this point I was finding receipts with date formats that I couldn't generate with &lt;code&gt;datetime.strftime()&lt;/code&gt;, so I came up with a minimal wrapper around it that let me add my own formattings. For the last few cases I found, I manually extended the generated date format list. In the end I had a list of hundreds of date formats to try.&lt;/p&gt;
&lt;p&gt;Finally, after adding some lower/uppercasing, logic to handle cases where multiple candidates are found, and logic to handle cases where spaces and small characters are lost in the OCR process, I arrived at a very competitive solution.&lt;/p&gt;
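As a sketch of what the space- and small-character handling can look like (my illustration of the idea, not the exact production logic), normalize both the candidate date string and the OCR text before the substring test:

```python
def normalize(s):
    # lowercase, and drop the characters OCR tends to lose or mangle
    # (spaces, commas, periods, apostrophes)
    return ''.join(c for c in s.lower() if c not in " ,.'")

def date_in_text(ds, text):
    # substring test on the normalized strings, so e.g. "Dec'24, 2020"
    # still matches OCR output like "dec 24 2020"
    return normalize(ds) in normalize(text)
```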
&lt;h2&gt;Conclusion&lt;/h2&gt;
&lt;p&gt;This brute-force, search-based method makes a prediction for ~97% of cases and achieves ~95% accuracy, outperforming our current solution by a very wide margin. Because of the repeated string searches (Python &lt;code&gt;substring in string&lt;/code&gt; in a loop) it's a bit slow, but it's not worth optimizing: it can still process thousands of requests per second, while we only need to process around 1-10 per second, so optimizing the string searches or rewriting in C++ would be premature optimization.&lt;/p&gt;</content><category term="python"></category></entry><entry><title>Building intuition for p-values and statistical significance</title><link href="https://bytepawn.com/building-intuition-p-values-statistical-significance.html" rel="alternate"></link><published>2021-04-25T00:00:00+02:00</published><updated>2020-04-25T00:00:00+02:00</updated><author><name>Marton Trencseni</name></author><id>tag:bytepawn.com,2021-04-25:/building-intuition-p-values-statistical-significance.html</id><summary type="html">&lt;p&gt;This is the transcript of a talk I did on experimentation and A/B testing to give the audience an intuitive understanding of p-values and statistical significance.&lt;br/&gt;&lt;br/&gt; &lt;img src="/images/coin_toss.jpg" alt="Coin flip" style="width: 300px;"/&gt;&lt;/p&gt;</summary><content type="html">&lt;h2&gt;Introduction&lt;/h2&gt;
&lt;p&gt;This is the transcript of a talk I did on experimentation and A/B testing to &lt;strong&gt;give the audience an intuitive understanding of p-values and statistical significance&lt;/strong&gt;.&lt;/p&gt;
&lt;p&gt;The format of the talk was a short introduction, then live-coding an &lt;a href="https://github.com/mtrencseni/playground/blob/master/Experimentation.ipynb"&gt;ipython notebook&lt;/a&gt;. The initial, short snippets I typed out to make it interesting, for the later parts I switched to an existing notebook to keep the talk's momentum going.&lt;/p&gt;
&lt;p&gt;I opened the talk by saying that in experimentation Data Scientists have 3 jobs. To &lt;strong&gt;(1) design (2) run, and (3) evaluate&lt;/strong&gt; experiments. This talk is about the evaluation phase, and can be summed up as "don’t get fooled by randomness".&lt;/p&gt;
&lt;h2&gt;Coin flips&lt;/h2&gt;
&lt;p&gt;Let's build our intuitive understanding of p-values and statistical significance by considering coin flips. Whenever I think about A/B testing, in my head I'm secretly thinking about coin flips.&lt;/p&gt;
&lt;p&gt;&lt;img src="/images/coin_toss.jpg" alt="Coin toss" style="width: 300px;"/&gt;&lt;/p&gt;
&lt;p&gt;Suppose somebody gives us a coin, and we suspect it's biased. What can we do? As an experimentalist, what we do is we start flipping it and record the outcomes.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Suppose we flip this coin 10 times, and we get 7 heads. What can we say about the coin?&lt;/strong&gt;&lt;/p&gt;
&lt;p&gt;To see how usual or unusual this is, let's conduct many coinflipping experiments, and count. However, instead of doing it for real, let's use the computer's built-in random number generator:&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;span class="n"&gt;random&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;

&lt;span class="o"&gt;&amp;gt;&lt;/span&gt; &lt;span class="mf"&gt;0.6316696366339524&lt;/span&gt;
&lt;/pre&gt;&lt;/div&gt;


&lt;p&gt;So if we call &lt;code&gt;random()&lt;/code&gt;, it returns a random float number between 0 and 1. Using this, we can simulate a &lt;strong&gt;fair coin&lt;/strong&gt;:&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;span class="c1"&gt;# let&amp;#39;s simulate fair coin flips&lt;/span&gt;
&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;coinflip&lt;/span&gt;&lt;span class="p"&gt;():&lt;/span&gt;
    &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;random&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt; &lt;span class="o"&gt;&amp;lt;&lt;/span&gt; &lt;span class="mf"&gt;0.5&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
        &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="s1"&gt;&amp;#39;H&amp;#39;&lt;/span&gt;
    &lt;span class="k"&gt;else&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
        &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="s1"&gt;&amp;#39;T&amp;#39;&lt;/span&gt;
&lt;/pre&gt;&lt;/div&gt;


&lt;p&gt;Let's try it. If we call it repeatedly, it returns H or T, fifty-fifty:&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;span class="n"&gt;coinflip&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;

&lt;span class="o"&gt;&amp;gt;&lt;/span&gt; &lt;span class="s1"&gt;&amp;#39;H&amp;#39;&lt;/span&gt;
&lt;/pre&gt;&lt;/div&gt;


&lt;p&gt;Okay, now let's change one little thing. Let's agree to use &lt;code&gt;1&lt;/code&gt; instead of &lt;code&gt;H&lt;/code&gt; and &lt;code&gt;0&lt;/code&gt; instead of &lt;code&gt;T&lt;/code&gt;:&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;span class="c1"&gt;# let&amp;#39;s simulate fair coin flips&lt;/span&gt;
&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;coinflip&lt;/span&gt;&lt;span class="p"&gt;():&lt;/span&gt;
    &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;random&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt; &lt;span class="o"&gt;&amp;lt;&lt;/span&gt; &lt;span class="mf"&gt;0.5&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
        &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt;
    &lt;span class="k"&gt;else&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
        &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;
&lt;/pre&gt;&lt;/div&gt;


&lt;p&gt;This will make our life easier in subsequent steps. Now, let's flip our virtual coin 10 times:&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;span class="c1"&gt;# 10 coinflips&lt;/span&gt;
&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;coinflip&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt; &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;_&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="nb"&gt;range&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;10&lt;/span&gt;&lt;span class="p"&gt;)]&lt;/span&gt;

&lt;span class="o"&gt;&amp;gt;&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;
&lt;/pre&gt;&lt;/div&gt;


&lt;p&gt;Since we're using &lt;code&gt;1&lt;/code&gt; for heads, we can just add up the results to get how many heads we had:&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;span class="nb"&gt;sum&lt;/span&gt;&lt;span class="p"&gt;([&lt;/span&gt;&lt;span class="n"&gt;coinflip&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt; &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;_&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="nb"&gt;range&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;10&lt;/span&gt;&lt;span class="p"&gt;)])&lt;/span&gt;

&lt;span class="o"&gt;&amp;gt;&lt;/span&gt; &lt;span class="mi"&gt;5&lt;/span&gt;
&lt;/pre&gt;&lt;/div&gt;


&lt;p&gt;Let's create a shorthand for this:&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;coinflips&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;num_flips&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mi"&gt;10&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="nb"&gt;sum&lt;/span&gt;&lt;span class="p"&gt;([&lt;/span&gt;&lt;span class="n"&gt;coinflip&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt; &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;_&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="nb"&gt;range&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;num_flips&lt;/span&gt;&lt;span class="p"&gt;)])&lt;/span&gt;

&lt;span class="n"&gt;coinflips&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;10&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="o"&gt;&amp;gt;&lt;/span&gt; &lt;span class="mi"&gt;5&lt;/span&gt;
&lt;/pre&gt;&lt;/div&gt;


&lt;p&gt;Okay, now it gets interesting. Let's conduct this 10-flip experiment 10 times:&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;coinflips&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;10&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;_&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="nb"&gt;range&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;10&lt;/span&gt;&lt;span class="p"&gt;)]&lt;/span&gt;

&lt;span class="o"&gt;&amp;gt;&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="mi"&gt;6&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;5&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;2&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;5&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;6&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;4&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;4&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;3&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;6&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;5&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;
&lt;/pre&gt;&lt;/div&gt;


&lt;p&gt;So, the first time we got 6 heads out of 10, the second time 5, then just 2 out of 10, and so on.&lt;/p&gt;
&lt;p&gt;&lt;em&gt;Presentation note: this is where I switch from typing to running an existing notebook.&lt;/em&gt;&lt;/p&gt;
&lt;p&gt;Let's count how many times we get 1 head out of 10, 2 heads out of 10, 3 heads out of 10, and so on, if we repeat the 10-flip experiment many times:&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;span class="n"&gt;NUM_FLIPS&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;10&lt;/span&gt;
&lt;span class="n"&gt;NUM_EXPERIMENTS&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;10&lt;/span&gt;
&lt;span class="n"&gt;outcomes&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;coinflips&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;NUM_FLIPS&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;_&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="nb"&gt;range&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;NUM_EXPERIMENTS&lt;/span&gt;&lt;span class="p"&gt;)]&lt;/span&gt;
&lt;span class="n"&gt;count&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;defaultdict&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="k"&gt;lambda&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;outcome&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;outcomes&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
    &lt;span class="n"&gt;count&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;outcome&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="o"&gt;+=&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt;
&lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;i&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="nb"&gt;range&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;NUM_FLIPS&lt;/span&gt;&lt;span class="o"&gt;+&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
    &lt;span class="k"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;Out of {} experiments, we flipped {} heads {} times&amp;#39;&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;format&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
        &lt;span class="n"&gt;NUM_EXPERIMENTS&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;i&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;count&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;i&lt;/span&gt;&lt;span class="p"&gt;]))&lt;/span&gt;

&lt;span class="o"&gt;&amp;gt;&lt;/span&gt; &lt;span class="n"&gt;Out&lt;/span&gt; &lt;span class="n"&gt;of&lt;/span&gt; &lt;span class="mi"&gt;10&lt;/span&gt; &lt;span class="n"&gt;experiments&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;we&lt;/span&gt; &lt;span class="n"&gt;wlipped&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt; &lt;span class="n"&gt;heads&lt;/span&gt;  &lt;span class="mi"&gt;0&lt;/span&gt; &lt;span class="n"&gt;times&lt;/span&gt;
&lt;span class="o"&gt;&amp;gt;&lt;/span&gt; &lt;span class="n"&gt;Out&lt;/span&gt; &lt;span class="n"&gt;of&lt;/span&gt; &lt;span class="mi"&gt;10&lt;/span&gt; &lt;span class="n"&gt;experiments&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;we&lt;/span&gt; &lt;span class="n"&gt;wlipped&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt; &lt;span class="n"&gt;heads&lt;/span&gt;  &lt;span class="mi"&gt;0&lt;/span&gt; &lt;span class="n"&gt;times&lt;/span&gt;
&lt;span class="o"&gt;&amp;gt;&lt;/span&gt; &lt;span class="n"&gt;Out&lt;/span&gt; &lt;span class="n"&gt;of&lt;/span&gt; &lt;span class="mi"&gt;10&lt;/span&gt; &lt;span class="n"&gt;experiments&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;we&lt;/span&gt; &lt;span class="n"&gt;wlipped&lt;/span&gt; &lt;span class="mi"&gt;2&lt;/span&gt; &lt;span class="n"&gt;heads&lt;/span&gt;  &lt;span class="mi"&gt;1&lt;/span&gt; &lt;span class="n"&gt;times&lt;/span&gt;
&lt;span class="o"&gt;&amp;gt;&lt;/span&gt; &lt;span class="n"&gt;Out&lt;/span&gt; &lt;span class="n"&gt;of&lt;/span&gt; &lt;span class="mi"&gt;10&lt;/span&gt; &lt;span class="n"&gt;experiments&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;we&lt;/span&gt; &lt;span class="n"&gt;wlipped&lt;/span&gt; &lt;span class="mi"&gt;3&lt;/span&gt; &lt;span class="n"&gt;heads&lt;/span&gt;  &lt;span class="mi"&gt;2&lt;/span&gt; &lt;span class="n"&gt;times&lt;/span&gt;
&lt;span class="o"&gt;&amp;gt;&lt;/span&gt; &lt;span class="n"&gt;Out&lt;/span&gt; &lt;span class="n"&gt;of&lt;/span&gt; &lt;span class="mi"&gt;10&lt;/span&gt; &lt;span class="n"&gt;experiments&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;we&lt;/span&gt; &lt;span class="n"&gt;wlipped&lt;/span&gt; &lt;span class="mi"&gt;4&lt;/span&gt; &lt;span class="n"&gt;heads&lt;/span&gt;  &lt;span class="mi"&gt;3&lt;/span&gt; &lt;span class="n"&gt;times&lt;/span&gt;
&lt;span class="o"&gt;&amp;gt;&lt;/span&gt; &lt;span class="n"&gt;Out&lt;/span&gt; &lt;span class="n"&gt;of&lt;/span&gt; &lt;span class="mi"&gt;10&lt;/span&gt; &lt;span class="n"&gt;experiments&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;we&lt;/span&gt; &lt;span class="n"&gt;wlipped&lt;/span&gt; &lt;span class="mi"&gt;5&lt;/span&gt; &lt;span class="n"&gt;heads&lt;/span&gt;  &lt;span class="mi"&gt;2&lt;/span&gt; &lt;span class="n"&gt;times&lt;/span&gt;
&lt;span class="o"&gt;&amp;gt;&lt;/span&gt; &lt;span class="n"&gt;Out&lt;/span&gt; &lt;span class="n"&gt;of&lt;/span&gt; &lt;span class="mi"&gt;10&lt;/span&gt; &lt;span class="n"&gt;experiments&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;we&lt;/span&gt; &lt;span class="n"&gt;wlipped&lt;/span&gt; &lt;span class="mi"&gt;6&lt;/span&gt; &lt;span class="n"&gt;heads&lt;/span&gt;  &lt;span class="mi"&gt;1&lt;/span&gt; &lt;span class="n"&gt;times&lt;/span&gt;
&lt;span class="o"&gt;&amp;gt;&lt;/span&gt; &lt;span class="n"&gt;Out&lt;/span&gt; &lt;span class="n"&gt;of&lt;/span&gt; &lt;span class="mi"&gt;10&lt;/span&gt; &lt;span class="n"&gt;experiments&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;we&lt;/span&gt; &lt;span class="n"&gt;wlipped&lt;/span&gt; &lt;span class="mi"&gt;7&lt;/span&gt; &lt;span class="n"&gt;heads&lt;/span&gt;  &lt;span class="mi"&gt;0&lt;/span&gt; &lt;span class="n"&gt;times&lt;/span&gt;
&lt;span class="o"&gt;&amp;gt;&lt;/span&gt; &lt;span class="n"&gt;Out&lt;/span&gt; &lt;span class="n"&gt;of&lt;/span&gt; &lt;span class="mi"&gt;10&lt;/span&gt; &lt;span class="n"&gt;experiments&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;we&lt;/span&gt; &lt;span class="n"&gt;wlipped&lt;/span&gt; &lt;span class="mi"&gt;8&lt;/span&gt; &lt;span class="n"&gt;heads&lt;/span&gt;  &lt;span class="mi"&gt;1&lt;/span&gt; &lt;span class="n"&gt;times&lt;/span&gt;
&lt;span class="o"&gt;&amp;gt;&lt;/span&gt; &lt;span class="n"&gt;Out&lt;/span&gt; &lt;span class="n"&gt;of&lt;/span&gt; &lt;span class="mi"&gt;10&lt;/span&gt; &lt;span class="n"&gt;experiments&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;we&lt;/span&gt; &lt;span class="n"&gt;wlipped&lt;/span&gt; &lt;span class="mi"&gt;9&lt;/span&gt; &lt;span class="n"&gt;heads&lt;/span&gt;  &lt;span class="mi"&gt;0&lt;/span&gt; &lt;span class="n"&gt;times&lt;/span&gt;
&lt;span class="o"&gt;&amp;gt;&lt;/span&gt; &lt;span class="n"&gt;Out&lt;/span&gt; &lt;span class="n"&gt;of&lt;/span&gt; &lt;span class="mi"&gt;10&lt;/span&gt; &lt;span class="n"&gt;experiments&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;we&lt;/span&gt; &lt;span class="n"&gt;wlipped&lt;/span&gt; &lt;span class="mi"&gt;10&lt;/span&gt; &lt;span class="n"&gt;heads&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt; &lt;span class="n"&gt;times&lt;/span&gt;
&lt;/pre&gt;&lt;/div&gt;


&lt;p&gt;Let's increase the number of experiments to get more counts and more representative statistics. Since we're using our computer's random number generator, we can do it a million times in about a second:&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;NUM_FLIPS = 10
NUM_EXPERIMENTS = 1000*1000  &amp;lt;---
outcomes = [coinflips(NUM_FLIPS) for _ in range(NUM_EXPERIMENTS)]
count = defaultdict(lambda: 0)
for outcome in outcomes:
    count[outcome] += 1
for i in range(NUM_FLIPS+1):
    print(&amp;#39;Out of {} experiments, we flipped {} heads {} times&amp;#39;.format(
        NUM_EXPERIMENTS, i, count[i]))

&amp;gt; Out of 1000000 experiments, we flipped 0 heads  1004 times
&amp;gt; Out of 1000000 experiments, we flipped 1 heads  9719 times
&amp;gt; Out of 1000000 experiments, we flipped 2 heads  44091 times
&amp;gt; Out of 1000000 experiments, we flipped 3 heads  116981 times
&amp;gt; Out of 1000000 experiments, we flipped 4 heads  206072 times
&amp;gt; Out of 1000000 experiments, we flipped 5 heads  245371 times
&amp;gt; Out of 1000000 experiments, we flipped 6 heads  205059 times
&amp;gt; Out of 1000000 experiments, we flipped 7 heads  117282 times  &amp;lt;---
&amp;gt; Out of 1000000 experiments, we flipped 8 heads  43764 times
&amp;gt; Out of 1000000 experiments, we flipped 9 heads  9675 times
&amp;gt; Out of 1000000 experiments, we flipped 10 heads 982 times
&lt;/pre&gt;&lt;/div&gt;


&lt;p&gt;Going back to our initial question, what can we say about a coin if we get 7 heads out of 10? From the above, we can see that if we repeat this 10-flip experiment &lt;strong&gt;with a fair coin&lt;/strong&gt; a million times, we actually get 7 heads out of 10 about 117,000 times. Let's make just one more change: instead of showing raw counts, let's divide them by &lt;code&gt;NUM_EXPERIMENTS = 1000*1000&lt;/code&gt; to get percentages:&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;span class="n"&gt;NUM_FLIPS&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;10&lt;/span&gt;
&lt;span class="n"&gt;NUM_EXPERIMENTS&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;1000&lt;/span&gt;&lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="mi"&gt;1000&lt;/span&gt;
&lt;span class="n"&gt;outcomes&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;coinflips&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;NUM_FLIPS&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;_&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="nb"&gt;range&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;NUM_EXPERIMENTS&lt;/span&gt;&lt;span class="p"&gt;)]&lt;/span&gt;
&lt;span class="n"&gt;count&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;defaultdict&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="k"&gt;lambda&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;outcome&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;outcomes&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
    &lt;span class="n"&gt;count&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;outcome&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="o"&gt;+=&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt;
&lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;i&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="nb"&gt;range&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;NUM_FLIPS&lt;/span&gt;&lt;span class="o"&gt;+&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
    &lt;span class="k"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;Out of {} experiments, we flipped {} heads {:.2f}&lt;/span&gt;&lt;span class="si"&gt;% o&lt;/span&gt;&lt;span class="s1"&gt;f the time&amp;#39;&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;format&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
        &lt;span class="n"&gt;NUM_EXPERIMENTS&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;i&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;100&lt;/span&gt;&lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="n"&gt;count&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;i&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;&lt;span class="o"&gt;/&lt;/span&gt;&lt;span class="n"&gt;NUM_EXPERIMENTS&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;

&lt;span class="o"&gt;&amp;gt;&lt;/span&gt; &lt;span class="n"&gt;Out&lt;/span&gt; &lt;span class="n"&gt;of&lt;/span&gt; &lt;span class="mi"&gt;1000000&lt;/span&gt; &lt;span class="n"&gt;experiments&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;we&lt;/span&gt; &lt;span class="n"&gt;flipped&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt; &lt;span class="n"&gt;heads&lt;/span&gt;  &lt;span class="mf"&gt;0.10&lt;/span&gt;&lt;span class="o"&gt;%&lt;/span&gt; &lt;span class="n"&gt;of&lt;/span&gt; &lt;span class="n"&gt;the&lt;/span&gt; &lt;span class="n"&gt;time&lt;/span&gt;
&lt;span class="o"&gt;&amp;gt;&lt;/span&gt; &lt;span class="n"&gt;Out&lt;/span&gt; &lt;span class="n"&gt;of&lt;/span&gt; &lt;span class="mi"&gt;1000000&lt;/span&gt; &lt;span class="n"&gt;experiments&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;we&lt;/span&gt; &lt;span class="n"&gt;flipped&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt; &lt;span class="n"&gt;heads&lt;/span&gt;  &lt;span class="mf"&gt;0.99&lt;/span&gt;&lt;span class="o"&gt;%&lt;/span&gt; &lt;span class="n"&gt;of&lt;/span&gt; &lt;span class="n"&gt;the&lt;/span&gt; &lt;span class="n"&gt;time&lt;/span&gt;
&lt;span class="o"&gt;&amp;gt;&lt;/span&gt; &lt;span class="n"&gt;Out&lt;/span&gt; &lt;span class="n"&gt;of&lt;/span&gt; &lt;span class="mi"&gt;1000000&lt;/span&gt; &lt;span class="n"&gt;experiments&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;we&lt;/span&gt; &lt;span class="n"&gt;flipped&lt;/span&gt; &lt;span class="mi"&gt;2&lt;/span&gt; &lt;span class="n"&gt;heads&lt;/span&gt;  &lt;span class="mf"&gt;4.40&lt;/span&gt;&lt;span class="o"&gt;%&lt;/span&gt; &lt;span class="n"&gt;of&lt;/span&gt; &lt;span class="n"&gt;the&lt;/span&gt; &lt;span class="n"&gt;time&lt;/span&gt;
&lt;span class="o"&gt;&amp;gt;&lt;/span&gt; &lt;span class="n"&gt;Out&lt;/span&gt; &lt;span class="n"&gt;of&lt;/span&gt; &lt;span class="mi"&gt;1000000&lt;/span&gt; &lt;span class="n"&gt;experiments&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;we&lt;/span&gt; &lt;span class="n"&gt;flipped&lt;/span&gt; &lt;span class="mi"&gt;3&lt;/span&gt; &lt;span class="n"&gt;heads&lt;/span&gt;  &lt;span class="mf"&gt;11.71&lt;/span&gt;&lt;span class="o"&gt;%&lt;/span&gt; &lt;span class="n"&gt;of&lt;/span&gt; &lt;span class="n"&gt;the&lt;/span&gt; &lt;span class="n"&gt;time&lt;/span&gt;
&lt;span class="o"&gt;&amp;gt;&lt;/span&gt; &lt;span class="n"&gt;Out&lt;/span&gt; &lt;span class="n"&gt;of&lt;/span&gt; &lt;span class="mi"&gt;1000000&lt;/span&gt; &lt;span class="n"&gt;experiments&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;we&lt;/span&gt; &lt;span class="n"&gt;flipped&lt;/span&gt; &lt;span class="mi"&gt;4&lt;/span&gt; &lt;span class="n"&gt;heads&lt;/span&gt;  &lt;span class="mf"&gt;20.53&lt;/span&gt;&lt;span class="o"&gt;%&lt;/span&gt; &lt;span class="n"&gt;of&lt;/span&gt; &lt;span class="n"&gt;the&lt;/span&gt; &lt;span class="n"&gt;time&lt;/span&gt;
&lt;span class="o"&gt;&amp;gt;&lt;/span&gt; &lt;span class="n"&gt;Out&lt;/span&gt; &lt;span class="n"&gt;of&lt;/span&gt; &lt;span class="mi"&gt;1000000&lt;/span&gt; &lt;span class="n"&gt;experiments&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;we&lt;/span&gt; &lt;span class="n"&gt;flipped&lt;/span&gt; &lt;span class="mi"&gt;5&lt;/span&gt; &lt;span class="n"&gt;heads&lt;/span&gt;  &lt;span class="mf"&gt;24.59&lt;/span&gt;&lt;span class="o"&gt;%&lt;/span&gt; &lt;span class="n"&gt;of&lt;/span&gt; &lt;span class="n"&gt;the&lt;/span&gt; &lt;span class="n"&gt;time&lt;/span&gt;
&lt;span class="o"&gt;&amp;gt;&lt;/span&gt; &lt;span class="n"&gt;Out&lt;/span&gt; &lt;span class="n"&gt;of&lt;/span&gt; &lt;span class="mi"&gt;1000000&lt;/span&gt; &lt;span class="n"&gt;experiments&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;we&lt;/span&gt; &lt;span class="n"&gt;flipped&lt;/span&gt; &lt;span class="mi"&gt;6&lt;/span&gt; &lt;span class="n"&gt;heads&lt;/span&gt;  &lt;span class="mf"&gt;20.51&lt;/span&gt;&lt;span class="o"&gt;%&lt;/span&gt; &lt;span class="n"&gt;of&lt;/span&gt; &lt;span class="n"&gt;the&lt;/span&gt; &lt;span class="n"&gt;time&lt;/span&gt;
&lt;span class="o"&gt;&amp;gt;&lt;/span&gt; &lt;span class="n"&gt;Out&lt;/span&gt; &lt;span class="n"&gt;of&lt;/span&gt; &lt;span class="mi"&gt;1000000&lt;/span&gt; &lt;span class="n"&gt;experiments&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;we&lt;/span&gt; &lt;span class="n"&gt;flipped&lt;/span&gt; &lt;span class="mi"&gt;7&lt;/span&gt; &lt;span class="n"&gt;heads&lt;/span&gt;  &lt;span class="mf"&gt;11.70&lt;/span&gt;&lt;span class="o"&gt;%&lt;/span&gt; &lt;span class="n"&gt;of&lt;/span&gt; &lt;span class="n"&gt;the&lt;/span&gt; &lt;span class="n"&gt;time&lt;/span&gt;  &lt;span class="o"&gt;&amp;lt;---&lt;/span&gt;
&lt;span class="o"&gt;&amp;gt;&lt;/span&gt; &lt;span class="n"&gt;Out&lt;/span&gt; &lt;span class="n"&gt;of&lt;/span&gt; &lt;span class="mi"&gt;1000000&lt;/span&gt; &lt;span class="n"&gt;experiments&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;we&lt;/span&gt; &lt;span class="n"&gt;flipped&lt;/span&gt; &lt;span class="mi"&gt;8&lt;/span&gt; &lt;span class="n"&gt;heads&lt;/span&gt;  &lt;span class="mf"&gt;4.39&lt;/span&gt;&lt;span class="o"&gt;%&lt;/span&gt; &lt;span class="n"&gt;of&lt;/span&gt; &lt;span class="n"&gt;the&lt;/span&gt; &lt;span class="n"&gt;time&lt;/span&gt;
&lt;span class="o"&gt;&amp;gt;&lt;/span&gt; &lt;span class="n"&gt;Out&lt;/span&gt; &lt;span class="n"&gt;of&lt;/span&gt; &lt;span class="mi"&gt;1000000&lt;/span&gt; &lt;span class="n"&gt;experiments&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;we&lt;/span&gt; &lt;span class="n"&gt;flipped&lt;/span&gt; &lt;span class="mi"&gt;9&lt;/span&gt; &lt;span class="n"&gt;heads&lt;/span&gt;  &lt;span class="mf"&gt;0.97&lt;/span&gt;&lt;span class="o"&gt;%&lt;/span&gt; &lt;span class="n"&gt;of&lt;/span&gt; &lt;span class="n"&gt;the&lt;/span&gt; &lt;span class="n"&gt;time&lt;/span&gt;
&lt;span class="o"&gt;&amp;gt;&lt;/span&gt; &lt;span class="n"&gt;Out&lt;/span&gt; &lt;span class="n"&gt;of&lt;/span&gt; &lt;span class="mi"&gt;1000000&lt;/span&gt; &lt;span class="n"&gt;experiments&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;we&lt;/span&gt; &lt;span class="n"&gt;flipped&lt;/span&gt; &lt;span class="mi"&gt;10&lt;/span&gt; &lt;span class="n"&gt;heads&lt;/span&gt; &lt;span class="mf"&gt;0.10&lt;/span&gt;&lt;span class="o"&gt;%&lt;/span&gt; &lt;span class="n"&gt;of&lt;/span&gt; &lt;span class="n"&gt;the&lt;/span&gt; &lt;span class="n"&gt;time&lt;/span&gt;
&lt;/pre&gt;&lt;/div&gt;


&lt;p&gt;Just as with the counts, we see that if we repeat this 10-flip experiment &lt;strong&gt;with a fair coin&lt;/strong&gt; many times, we actually get 7 heads out of 10 about 11.7% of the time!&lt;/p&gt;
&lt;h2&gt;What is the p-value&lt;/h2&gt;
&lt;ol&gt;
&lt;li&gt;Given an experimental outcome ("we flipped 7 heads out of 10"), assume the "boring", "non-action" case (the statistical term is "null hypothesis") = &lt;em&gt;"the coin is fair, it's not biased"&lt;/em&gt;.&lt;/li&gt;
&lt;li&gt;Compute the probability of the experimental outcome (7 heads) in the "boring" case (= 11.70% above)&lt;/li&gt;
&lt;li&gt;.. and also add to it the probabilities of the even-more-extreme outcomes (8, 9, 10 heads, which adds 4.39% + 0.97% + 0.10%) = 17.16%&lt;/li&gt;
&lt;/ol&gt;
&lt;p&gt;$ p = 0.17 $&lt;/p&gt;
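&lt;p&gt;The simulated percentages are approximations of exact binomial probabilities, and the same one-sided p-value can be computed exactly, without simulation. A minimal sketch using only the Python standard library (the helper name is illustrative):&lt;/p&gt;

```python
from math import comb

def binomial_p_value(heads, flips):
    # One-sided p-value: probability of `heads` or more heads
    # out of `flips` flips of a fair coin.
    return sum(comb(flips, k) for k in range(heads, flips + 1)) / 2**flips

p = binomial_p_value(7, 10)
print('p = {:.4f}'.format(p))  # p = 0.1719
```

&lt;p&gt;The exact value is $176/1024 \approx 0.1719$, which the Monte Carlo simulation approximates.&lt;/p&gt;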
&lt;p&gt;&lt;strong&gt;In summary: in real life, we don't know if the coin is fair or not. We just know we flipped 7 heads out of 10. What we know from the above is, if we assume it's fair, we would see this outcome (7 heads), or more extreme outcomes (8, 9, 10 heads out of 10), a combined 17% of the time. That is not very unlikely, about 1 in 6 odds.&lt;/strong&gt;&lt;/p&gt;
&lt;p&gt;&lt;em&gt;Comment: I've skipped 1-sided vs 2-sided testing, as it's not critical for building the core intuition.&lt;/em&gt;&lt;/p&gt;
&lt;h2&gt;What is statistical significance&lt;/h2&gt;
&lt;p&gt;Is the above statistically significant? What does "statistically significant" mean?&lt;/p&gt;
&lt;p&gt;Statistical significance is a way to make a decision based on an experiment. It's a pre-agreed threshold $p_{critical}$ for the p-value, set in the experimental protocol, so that if $p &amp;lt; p_{critical}$, we declare the result "statistically significant", reject the "boring" null hypothesis, and accept the alternative, "action" hypothesis.&lt;/p&gt;
&lt;p&gt;In our example, the "boring" null hypothesis is that the "coin is fair"; rejecting it would mean we conclude the "coin is biased". Since we calculated $p=0.17$, assuming we are working with $p_{critical} = 0.05$, we do not reject the "boring" null hypothesis. In other words, we cannot conclude that the coin is biased.&lt;/p&gt;
&lt;p&gt;So statistical significance is just a human convention for decision-making. Usually $p_{critical} = 0.05$, i.e. 5% or 1 in 20 odds, but this is completely arbitrary. There is nothing special or "right" about 5%. We can agree to use 1% or 10% or 20% in our experiments.&lt;/p&gt;
&lt;p&gt;From the above we can see that for 10 coin flips, if our $p_{critical}$ is 5%, we'd need to get 0, 1 or 9, 10 heads to conclude that the coin is biased.&lt;/p&gt;
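&lt;p&gt;This decision rule can be verified exactly: for each possible number of heads out of 10, compute a two-sided p-value and check it against the 5% threshold. A sketch, assuming the common convention of doubling the smaller tail:&lt;/p&gt;

```python
from math import comb

def tail(flips, k_lo, k_hi):
    # Probability of between k_lo and k_hi heads with a fair coin
    return sum(comb(flips, k) for k in range(k_lo, k_hi + 1)) / 2**flips

def two_sided_p(heads, flips):
    # Double the smaller tail, capped at 1
    return min(1.0, 2 * min(tail(flips, 0, heads), tail(flips, heads, flips)))

significant = [k for k in range(11) if two_sided_p(k, 10) < 0.05]
print(significant)  # [0, 1, 9, 10]
```

&lt;p&gt;Only 0, 1, 9 and 10 heads clear the 5% bar; every other outcome is consistent with a fair coin at this threshold.&lt;/p&gt;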
&lt;p&gt;Notes:&lt;/p&gt;
&lt;ol&gt;
&lt;li&gt;At this point I explained that if we set $p_{critical}$ to 5%, that means that in the cases where the null hypothesis is actually true, we will wrongly reject it on average 1 in 20 times. Depending on the audience, this may be too much and confuse them.&lt;/li&gt;
&lt;li&gt;When I presented I had a question about how to pick $p_{critical}$. Here's a dedicated blog post I wrote about how to pick $p_{critical}$ and how to balance experimentation velocity vs being sure: &lt;a href="https://bytepawn.com/ab-tests-moving-fast-vs-being-sure.html"&gt;A/B tests: Moving Fast vs Being Sure&lt;/a&gt;.&lt;/li&gt;
&lt;/ol&gt;
&lt;h2&gt;Sample size&lt;/h2&gt;
&lt;p&gt;We saw that getting 7 heads out of 10 flips, i.e. 70% heads, is not that unusual. Is 70 heads out of 100 flips also not unusual? Let's check:&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;span class="n"&gt;NUM_FLIPS&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;100&lt;/span&gt;  &lt;span class="o"&gt;&amp;lt;---&lt;/span&gt;
&lt;span class="n"&gt;NUM_EXPERIMENTS&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;100&lt;/span&gt;&lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="mi"&gt;1000&lt;/span&gt;
&lt;span class="n"&gt;outcomes&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;coinflips&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;NUM_FLIPS&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;_&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="nb"&gt;range&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;NUM_EXPERIMENTS&lt;/span&gt;&lt;span class="p"&gt;)]&lt;/span&gt;
&lt;span class="n"&gt;count&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;defaultdict&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="k"&gt;lambda&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;outcome&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;outcomes&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
    &lt;span class="n"&gt;count&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;outcome&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="o"&gt;+=&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt;
&lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;i&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="nb"&gt;range&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;NUM_FLIPS&lt;/span&gt;&lt;span class="o"&gt;+&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
    &lt;span class="k"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;Out of {} experiments, we flipped {} heads {:.3f}&lt;/span&gt;&lt;span class="si"&gt;% o&lt;/span&gt;&lt;span class="s1"&gt;f the time&amp;#39;&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;format&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
        &lt;span class="n"&gt;NUM_EXPERIMENTS&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;i&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;100&lt;/span&gt;&lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="n"&gt;count&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;i&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;&lt;span class="o"&gt;/&lt;/span&gt;&lt;span class="n"&gt;NUM_EXPERIMENTS&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;

&lt;span class="o"&gt;&amp;gt;&lt;/span&gt; &lt;span class="o"&gt;...&lt;/span&gt;
&lt;span class="o"&gt;&amp;gt;&lt;/span&gt; &lt;span class="n"&gt;Out&lt;/span&gt; &lt;span class="n"&gt;of&lt;/span&gt; &lt;span class="mi"&gt;100000&lt;/span&gt; &lt;span class="n"&gt;experiments&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;we&lt;/span&gt; &lt;span class="n"&gt;flipped&lt;/span&gt; &lt;span class="mi"&gt;70&lt;/span&gt; &lt;span class="n"&gt;heads&lt;/span&gt; &lt;span class="mf"&gt;0.001&lt;/span&gt;&lt;span class="o"&gt;%&lt;/span&gt; &lt;span class="n"&gt;of&lt;/span&gt; &lt;span class="n"&gt;the&lt;/span&gt; &lt;span class="n"&gt;time&lt;/span&gt;
&lt;span class="o"&gt;&amp;gt;&lt;/span&gt; &lt;span class="o"&gt;...&lt;/span&gt;
&lt;/pre&gt;&lt;/div&gt;


&lt;p&gt;What we see here is that 70% heads only happens 0.001% of the time, very rarely! And if we add the %s for 71, 72 .. 100, it's still a very small number. So with 100 coin flips, a 70% heads (or more) outcome yields a much, much smaller p-value than the 17% we had for 10 coin flips.&lt;/p&gt;
&lt;p&gt;This is the effect of sample size! &lt;code&gt;N=100&lt;/code&gt; coin flips is a lot more than &lt;code&gt;N=10&lt;/code&gt; coin flips. Telling apart a biased and a fair coin is much easier if you flip more and more. If we flip a coin 100 times and get 70 heads, then it is very unlikely that this coin is fair.&lt;/p&gt;
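&lt;p&gt;The simulated 0.001% can be cross-checked against the exact binomial tail. A sketch for 70-or-more heads out of 100, again with only the standard library:&lt;/p&gt;

```python
from math import comb

def p_at_least(heads, flips):
    # Exact probability of `heads` or more heads with a fair coin
    return sum(comb(flips, k) for k in range(heads, flips + 1)) / 2**flips

p = p_at_least(70, 100)
print('{:.6f}'.format(p))
```

&lt;p&gt;The exact tail is on the order of $4 \times 10^{-5}$, over three orders of magnitude smaller than the $p \approx 0.17$ we got for 7 heads out of 10.&lt;/p&gt;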
&lt;p&gt;&lt;em&gt;Note: the notebook also plots the bell curves, but I don't think it helped the audience.&lt;/em&gt;&lt;/p&gt;
&lt;h2&gt;A/B testing&lt;/h2&gt;
&lt;p&gt;Okay, now that we have built our intuition with coin flips, let's talk about a marketing A/B test. Suppose we have a control (A) and a treatment (B) version of an email. Maybe A is text-only, while B also has some colorful images. Let's say we send it out to &lt;code&gt;N=1000&lt;/code&gt; customers each. We wait one week, look at the logs, and the measurement outcome is that A converts at 10% and B converts at 12%.&lt;/p&gt;
&lt;p&gt;What can we say? It's actually the same thing as coin flips! Except here, the "fair" base coin is not a 50-50 coin, it's the 10% conversion we see in our control group A.&lt;/p&gt;
&lt;p&gt;&lt;em&gt;Note: here I'm decreasing rigor to make the explanation flow better. I gloss over the fact that the 10% conversion of control is also a sampled result, so it's not the same as the hypothetical fair coin, which has no uncertainty.&lt;/em&gt;&lt;/p&gt;
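&lt;p&gt;Under this simplification, the p-value of the A/B result can also be computed exactly rather than simulated: with a 10% conversion "coin", how often do we see 120 or more conversions out of 1000? A sketch, generalizing the binomial tail to an arbitrary conversion rate (the helper name is illustrative):&lt;/p&gt;

```python
from math import comb

def p_at_least(successes, n, rate):
    # Exact binomial tail: probability of `successes` or more
    # conversions out of n sends, each converting with probability `rate`
    return sum(comb(n, k) * rate**k * (1 - rate)**(n - k)
               for k in range(successes, n + 1))

p = p_at_least(120, 1000, 0.10)
print('p = {:.4f}'.format(p))
```

&lt;p&gt;This comes out to roughly 2%, small enough to clear the usual $p_{critical} = 0.05$ bar.&lt;/p&gt;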
&lt;p&gt;Let's build a "control (A)" coin:&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;coinflip&lt;/span&gt;&lt;span class="p"&gt;():&lt;/span&gt;
    &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;random&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt; &lt;span class="o"&gt;&amp;lt;&lt;/span&gt; &lt;span class="mf"&gt;0.10&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="o"&gt;&amp;lt;---&lt;/span&gt; &lt;span class="mf"&gt;0.10&lt;/span&gt; &lt;span class="n"&gt;instead&lt;/span&gt; &lt;span class="n"&gt;of&lt;/span&gt; &lt;span class="mf"&gt;0.50&lt;/span&gt;
        &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt;
    &lt;span class="k"&gt;else&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
        &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;

&lt;span class="n"&gt;coinflips&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;100&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="o"&gt;&amp;gt;&lt;/span&gt; &lt;span class="mi"&gt;11&lt;/span&gt;
&lt;/pre&gt;&lt;/div&gt;


&lt;p&gt;Now, let's see our experiment:&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;span class="n"&gt;NUM_FLIPS&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;1000&lt;/span&gt;
&lt;span class="n"&gt;NUM_EXPERIMENTS&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;10&lt;/span&gt;&lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="mi"&gt;1000&lt;/span&gt;
&lt;span class="n"&gt;outcomes&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;coinflips&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;NUM_FLIPS&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;_&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="nb"&gt;range&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;NUM_EXPERIMENTS&lt;/span&gt;&lt;span class="p"&gt;)]&lt;/span&gt;
&lt;span class="n"&gt;count&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;defaultdict&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="k"&gt;lambda&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;outcome&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;outcomes&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
    &lt;span class="n"&gt;count&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;outcome&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="o"&gt;+=&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt;
&lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;i&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="nb"&gt;range&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;NUM_FLIPS&lt;/span&gt;&lt;span class="o"&gt;+&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
    &lt;span class="k"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;Out of {} experiments, we flipped {} heads {:.2f}&lt;/span&gt;&lt;span class="si"&gt;% o&lt;/span&gt;&lt;span class="s1"&gt;f the time&amp;#39;&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;format&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
        &lt;span class="n"&gt;NUM_EXPERIMENTS&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;i&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;100&lt;/span&gt;&lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="n"&gt;count&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;i&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;&lt;span class="o"&gt;/&lt;/span&gt;&lt;span class="n"&gt;NUM_EXPERIMENTS&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;

&lt;span class="o"&gt;&amp;gt;&lt;/span&gt; &lt;span class="o"&gt;...&lt;/span&gt;
&lt;span class="o"&gt;&amp;gt;&lt;/span&gt; &lt;span class="n"&gt;Out&lt;/span&gt; &lt;span class="n"&gt;of&lt;/span&gt; &lt;span class="mi"&gt;10000&lt;/span&gt; &lt;span class="n"&gt;experiments&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;we&lt;/span&gt; &lt;span class="n"&gt;flipped&lt;/span&gt; &lt;span class="mi"&gt;120&lt;/span&gt; &lt;span class="n"&gt;heads&lt;/span&gt; &lt;span class="mf"&gt;0.52&lt;/span&gt;&lt;span class="o"&gt;%&lt;/span&gt; &lt;span class="n"&gt;of&lt;/span&gt; &lt;span class="n"&gt;the&lt;/span&gt; &lt;span class="n"&gt;time&lt;/span&gt;
&lt;span class="o"&gt;&amp;gt;&lt;/span&gt; &lt;span class="n"&gt;Out&lt;/span&gt; &lt;span class="n"&gt;of&lt;/span&gt; &lt;span class="mi"&gt;10000&lt;/span&gt; &lt;span class="n"&gt;experiments&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;we&lt;/span&gt; &lt;span class="n"&gt;flipped&lt;/span&gt; &lt;span class="mi"&gt;121&lt;/span&gt; &lt;span class="n"&gt;heads&lt;/span&gt; &lt;span class="mf"&gt;0.38&lt;/span&gt;&lt;span class="o"&gt;%&lt;/span&gt; &lt;span class="n"&gt;of&lt;/span&gt; &lt;span class="n"&gt;the&lt;/span&gt; &lt;span class="n"&gt;time&lt;/span&gt;
&lt;span class="o"&gt;&amp;gt;&lt;/span&gt; &lt;span class="n"&gt;Out&lt;/span&gt; &lt;span class="n"&gt;of&lt;/span&gt; &lt;span class="mi"&gt;10000&lt;/span&gt; &lt;span class="n"&gt;experiments&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;we&lt;/span&gt; &lt;span class="n"&gt;flipped&lt;/span&gt; &lt;span class="mi"&gt;122&lt;/span&gt; &lt;span class="n"&gt;heads&lt;/span&gt; &lt;span class="mf"&gt;0.26&lt;/span&gt;&lt;span class="o"&gt;%&lt;/span&gt; &lt;span class="n"&gt;of&lt;/span&gt; &lt;span class="n"&gt;the&lt;/span&gt; &lt;span class="n"&gt;time&lt;/span&gt;
&lt;span class="o"&gt;&amp;gt;&lt;/span&gt; &lt;span class="n"&gt;Out&lt;/span&gt; &lt;span class="n"&gt;of&lt;/span&gt; &lt;span class="mi"&gt;10000&lt;/span&gt; &lt;span class="n"&gt;experiments&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;we&lt;/span&gt; &lt;span class="n"&gt;flipped&lt;/span&gt; &lt;span class="mi"&gt;123&lt;/span&gt; &lt;span class="n"&gt;heads&lt;/span&gt; &lt;span class="mf"&gt;0.18&lt;/span&gt;&lt;span class="o"&gt;%&lt;/span&gt; &lt;span class="n"&gt;of&lt;/span&gt; &lt;span class="n"&gt;the&lt;/span&gt; &lt;span class="n"&gt;time&lt;/span&gt;
&lt;span class="o"&gt;&amp;gt;&lt;/span&gt; &lt;span class="n"&gt;Out&lt;/span&gt; &lt;span class="n"&gt;of&lt;/span&gt; &lt;span class="mi"&gt;10000&lt;/span&gt; &lt;span class="n"&gt;experiments&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;we&lt;/span&gt; &lt;span class="n"&gt;flipped&lt;/span&gt; &lt;span class="mi"&gt;124&lt;/span&gt; &lt;span class="n"&gt;heads&lt;/span&gt; &lt;span class="mf"&gt;0.20&lt;/span&gt;&lt;span class="o"&gt;%&lt;/span&gt; &lt;span class="n"&gt;of&lt;/span&gt; &lt;span class="n"&gt;the&lt;/span&gt; &lt;span class="n"&gt;time&lt;/span&gt;
&lt;span class="o"&gt;&amp;gt;&lt;/span&gt; &lt;span class="n"&gt;Out&lt;/span&gt; &lt;span class="n"&gt;of&lt;/span&gt; &lt;span class="mi"&gt;10000&lt;/span&gt; &lt;span class="n"&gt;experiments&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;we&lt;/span&gt; &lt;span class="n"&gt;flipped&lt;/span&gt; &lt;span class="mi"&gt;125&lt;/span&gt; &lt;span class="n"&gt;heads&lt;/span&gt; &lt;span class="mf"&gt;0.11&lt;/span&gt;&lt;span class="o"&gt;%&lt;/span&gt; &lt;span class="n"&gt;of&lt;/span&gt; &lt;span class="n"&gt;the&lt;/span&gt; &lt;span class="n"&gt;time&lt;/span&gt;
&lt;span class="o"&gt;&amp;gt;&lt;/span&gt; &lt;span class="n"&gt;Out&lt;/span&gt; &lt;span class="n"&gt;of&lt;/span&gt; &lt;span class="mi"&gt;10000&lt;/span&gt; &lt;span class="n"&gt;experiments&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;we&lt;/span&gt; &lt;span class="n"&gt;flipped&lt;/span&gt; &lt;span class="mi"&gt;126&lt;/span&gt; &lt;span class="n"&gt;heads&lt;/span&gt; &lt;span class="mf"&gt;0.13&lt;/span&gt;&lt;span class="o"&gt;%&lt;/span&gt; &lt;span class="n"&gt;of&lt;/span&gt; &lt;span class="n"&gt;the&lt;/span&gt; &lt;span class="n"&gt;time&lt;/span&gt;
&lt;span class="o"&gt;&amp;gt;&lt;/span&gt; &lt;span class="n"&gt;Out&lt;/span&gt; &lt;span class="n"&gt;of&lt;/span&gt; &lt;span class="mi"&gt;10000&lt;/span&gt; &lt;span class="n"&gt;experiments&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;we&lt;/span&gt; &lt;span class="n"&gt;flipped&lt;/span&gt; &lt;span class="mi"&gt;127&lt;/span&gt; &lt;span class="n"&gt;heads&lt;/span&gt; &lt;span class="mf"&gt;0.11&lt;/span&gt;&lt;span class="o"&gt;%&lt;/span&gt; &lt;span class="n"&gt;of&lt;/span&gt; &lt;span class="n"&gt;the&lt;/span&gt; &lt;span class="n"&gt;time&lt;/span&gt;
&lt;span class="o"&gt;&amp;gt;&lt;/span&gt; &lt;span class="n"&gt;Out&lt;/span&gt; &lt;span class="n"&gt;of&lt;/span&gt; &lt;span class="mi"&gt;10000&lt;/span&gt; &lt;span class="n"&gt;experiments&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;we&lt;/span&gt; &lt;span class="n"&gt;flipped&lt;/span&gt; &lt;span class="mi"&gt;128&lt;/span&gt; &lt;span class="n"&gt;heads&lt;/span&gt; &lt;span class="mf"&gt;0.09&lt;/span&gt;&lt;span class="o"&gt;%&lt;/span&gt; &lt;span class="n"&gt;of&lt;/span&gt; &lt;span class="n"&gt;the&lt;/span&gt; &lt;span class="n"&gt;time&lt;/span&gt;
&lt;span class="o"&gt;&amp;gt;&lt;/span&gt; &lt;span class="n"&gt;Out&lt;/span&gt; &lt;span class="n"&gt;of&lt;/span&gt; &lt;span class="mi"&gt;10000&lt;/span&gt; &lt;span class="n"&gt;experiments&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;we&lt;/span&gt; &lt;span class="n"&gt;flipped&lt;/span&gt; &lt;span class="mi"&gt;129&lt;/span&gt; &lt;span class="n"&gt;heads&lt;/span&gt; &lt;span class="mf"&gt;0.04&lt;/span&gt;&lt;span class="o"&gt;%&lt;/span&gt; &lt;span class="n"&gt;of&lt;/span&gt; &lt;span class="n"&gt;the&lt;/span&gt; &lt;span class="n"&gt;time&lt;/span&gt;
&lt;span class="o"&gt;&amp;gt;&lt;/span&gt; &lt;span class="n"&gt;Out&lt;/span&gt; &lt;span class="n"&gt;of&lt;/span&gt; &lt;span class="mi"&gt;10000&lt;/span&gt; &lt;span class="n"&gt;experiments&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;we&lt;/span&gt; &lt;span class="n"&gt;flipped&lt;/span&gt; &lt;span class="mi"&gt;130&lt;/span&gt; &lt;span class="n"&gt;heads&lt;/span&gt; &lt;span class="mf"&gt;0.03&lt;/span&gt;&lt;span class="o"&gt;%&lt;/span&gt; &lt;span class="n"&gt;of&lt;/span&gt; &lt;span class="n"&gt;the&lt;/span&gt; &lt;span class="n"&gt;time&lt;/span&gt;
&lt;span class="o"&gt;&amp;gt;&lt;/span&gt; &lt;span class="o"&gt;...&lt;/span&gt;
&lt;/pre&gt;&lt;/div&gt;


&lt;p&gt;Let's change one thing: instead of saying "120 heads", let's change our vocabulary, divide by 1000, and talk about conversion, like 120/1000 = 12%:&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;span class="n"&gt;NUM_FLIPS&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;1000&lt;/span&gt;
&lt;span class="n"&gt;NUM_EXPERIMENTS&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;10&lt;/span&gt;&lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="mi"&gt;1000&lt;/span&gt;
&lt;span class="n"&gt;outcomes&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;coinflips&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;NUM_FLIPS&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;_&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="nb"&gt;range&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;NUM_EXPERIMENTS&lt;/span&gt;&lt;span class="p"&gt;)]&lt;/span&gt;
&lt;span class="n"&gt;count&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;defaultdict&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="k"&gt;lambda&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;outcome&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;outcomes&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
    &lt;span class="n"&gt;count&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;outcome&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="o"&gt;+=&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt;
&lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;i&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="nb"&gt;range&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;NUM_FLIPS&lt;/span&gt;&lt;span class="o"&gt;+&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
    &lt;span class="k"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;Out of {} experiments, we had {:.2f}&lt;/span&gt;&lt;span class="si"&gt;% c&lt;/span&gt;&lt;span class="s1"&gt;onversion {:.2f}&lt;/span&gt;&lt;span class="si"&gt;% o&lt;/span&gt;&lt;span class="s1"&gt;f the time&amp;#39;&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;format&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
        &lt;span class="n"&gt;NUM_EXPERIMENTS&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;100&lt;/span&gt;&lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="n"&gt;i&lt;/span&gt;&lt;span class="o"&gt;/&lt;/span&gt;&lt;span class="n"&gt;NUM_FLIPS&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;100&lt;/span&gt;&lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="n"&gt;count&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;i&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;&lt;span class="o"&gt;/&lt;/span&gt;&lt;span class="n"&gt;NUM_EXPERIMENTS&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;

&lt;span class="o"&gt;&amp;gt;&lt;/span&gt; &lt;span class="o"&gt;...&lt;/span&gt;
&lt;span class="o"&gt;&amp;gt;&lt;/span&gt; &lt;span class="n"&gt;Out&lt;/span&gt; &lt;span class="n"&gt;of&lt;/span&gt; &lt;span class="mi"&gt;10000&lt;/span&gt; &lt;span class="n"&gt;experiments&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;we&lt;/span&gt; &lt;span class="n"&gt;had&lt;/span&gt; &lt;span class="mf"&gt;12.00&lt;/span&gt;&lt;span class="o"&gt;%&lt;/span&gt; &lt;span class="n"&gt;conversion&lt;/span&gt; &lt;span class="mf"&gt;0.48&lt;/span&gt;&lt;span class="o"&gt;%&lt;/span&gt; &lt;span class="n"&gt;of&lt;/span&gt; &lt;span class="n"&gt;the&lt;/span&gt; &lt;span class="n"&gt;time&lt;/span&gt;
&lt;span class="o"&gt;&amp;gt;&lt;/span&gt; &lt;span class="n"&gt;Out&lt;/span&gt; &lt;span class="n"&gt;of&lt;/span&gt; &lt;span class="mi"&gt;10000&lt;/span&gt; &lt;span class="n"&gt;experiments&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;we&lt;/span&gt; &lt;span class="n"&gt;had&lt;/span&gt; &lt;span class="mf"&gt;12.10&lt;/span&gt;&lt;span class="o"&gt;%&lt;/span&gt; &lt;span class="n"&gt;conversion&lt;/span&gt; &lt;span class="mf"&gt;0.27&lt;/span&gt;&lt;span class="o"&gt;%&lt;/span&gt; &lt;span class="n"&gt;of&lt;/span&gt; &lt;span class="n"&gt;the&lt;/span&gt; &lt;span class="n"&gt;time&lt;/span&gt;
&lt;span class="o"&gt;&amp;gt;&lt;/span&gt; &lt;span class="n"&gt;Out&lt;/span&gt; &lt;span class="n"&gt;of&lt;/span&gt; &lt;span class="mi"&gt;10000&lt;/span&gt; &lt;span class="n"&gt;experiments&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;we&lt;/span&gt; &lt;span class="n"&gt;had&lt;/span&gt; &lt;span class="mf"&gt;12.20&lt;/span&gt;&lt;span class="o"&gt;%&lt;/span&gt; &lt;span class="n"&gt;conversion&lt;/span&gt; &lt;span class="mf"&gt;0.41&lt;/span&gt;&lt;span class="o"&gt;%&lt;/span&gt; &lt;span class="n"&gt;of&lt;/span&gt; &lt;span class="n"&gt;the&lt;/span&gt; &lt;span class="n"&gt;time&lt;/span&gt;
&lt;span class="o"&gt;&amp;gt;&lt;/span&gt; &lt;span class="n"&gt;Out&lt;/span&gt; &lt;span class="n"&gt;of&lt;/span&gt; &lt;span class="mi"&gt;10000&lt;/span&gt; &lt;span class="n"&gt;experiments&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;we&lt;/span&gt; &lt;span class="n"&gt;had&lt;/span&gt; &lt;span class="mf"&gt;12.30&lt;/span&gt;&lt;span class="o"&gt;%&lt;/span&gt; &lt;span class="n"&gt;conversion&lt;/span&gt; &lt;span class="mf"&gt;0.22&lt;/span&gt;&lt;span class="o"&gt;%&lt;/span&gt; &lt;span class="n"&gt;of&lt;/span&gt; &lt;span class="n"&gt;the&lt;/span&gt; &lt;span class="n"&gt;time&lt;/span&gt;
&lt;span class="o"&gt;&amp;gt;&lt;/span&gt; &lt;span class="n"&gt;Out&lt;/span&gt; &lt;span class="n"&gt;of&lt;/span&gt; &lt;span class="mi"&gt;10000&lt;/span&gt; &lt;span class="n"&gt;experiments&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;we&lt;/span&gt; &lt;span class="n"&gt;had&lt;/span&gt; &lt;span class="mf"&gt;12.40&lt;/span&gt;&lt;span class="o"&gt;%&lt;/span&gt; &lt;span class="n"&gt;conversion&lt;/span&gt; &lt;span class="mf"&gt;0.18&lt;/span&gt;&lt;span class="o"&gt;%&lt;/span&gt; &lt;span class="n"&gt;of&lt;/span&gt; &lt;span class="n"&gt;the&lt;/span&gt; &lt;span class="n"&gt;time&lt;/span&gt;
&lt;span class="o"&gt;&amp;gt;&lt;/span&gt; &lt;span class="n"&gt;Out&lt;/span&gt; &lt;span class="n"&gt;of&lt;/span&gt; &lt;span class="mi"&gt;10000&lt;/span&gt; &lt;span class="n"&gt;experiments&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;we&lt;/span&gt; &lt;span class="n"&gt;had&lt;/span&gt; &lt;span class="mf"&gt;12.50&lt;/span&gt;&lt;span class="o"&gt;%&lt;/span&gt; &lt;span class="n"&gt;conversion&lt;/span&gt; &lt;span class="mf"&gt;0.14&lt;/span&gt;&lt;span class="o"&gt;%&lt;/span&gt; &lt;span class="n"&gt;of&lt;/span&gt; &lt;span class="n"&gt;the&lt;/span&gt; &lt;span class="n"&gt;time&lt;/span&gt;
&lt;span class="o"&gt;&amp;gt;&lt;/span&gt; &lt;span class="n"&gt;Out&lt;/span&gt; &lt;span class="n"&gt;of&lt;/span&gt; &lt;span class="mi"&gt;10000&lt;/span&gt; &lt;span class="n"&gt;experiments&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;we&lt;/span&gt; &lt;span class="n"&gt;had&lt;/span&gt; &lt;span class="mf"&gt;12.60&lt;/span&gt;&lt;span class="o"&gt;%&lt;/span&gt; &lt;span class="n"&gt;conversion&lt;/span&gt; &lt;span class="mf"&gt;0.11&lt;/span&gt;&lt;span class="o"&gt;%&lt;/span&gt; &lt;span class="n"&gt;of&lt;/span&gt; &lt;span class="n"&gt;the&lt;/span&gt; &lt;span class="n"&gt;time&lt;/span&gt;
&lt;span class="o"&gt;&amp;gt;&lt;/span&gt; &lt;span class="n"&gt;Out&lt;/span&gt; &lt;span class="n"&gt;of&lt;/span&gt; &lt;span class="mi"&gt;10000&lt;/span&gt; &lt;span class="n"&gt;experiments&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;we&lt;/span&gt; &lt;span class="n"&gt;had&lt;/span&gt; &lt;span class="mf"&gt;12.70&lt;/span&gt;&lt;span class="o"&gt;%&lt;/span&gt; &lt;span class="n"&gt;conversion&lt;/span&gt; &lt;span class="mf"&gt;0.08&lt;/span&gt;&lt;span class="o"&gt;%&lt;/span&gt; &lt;span class="n"&gt;of&lt;/span&gt; &lt;span class="n"&gt;the&lt;/span&gt; &lt;span class="n"&gt;time&lt;/span&gt;
&lt;span class="o"&gt;&amp;gt;&lt;/span&gt; &lt;span class="n"&gt;Out&lt;/span&gt; &lt;span class="n"&gt;of&lt;/span&gt; &lt;span class="mi"&gt;10000&lt;/span&gt; &lt;span class="n"&gt;experiments&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;we&lt;/span&gt; &lt;span class="n"&gt;had&lt;/span&gt; &lt;span class="mf"&gt;12.80&lt;/span&gt;&lt;span class="o"&gt;%&lt;/span&gt; &lt;span class="n"&gt;conversion&lt;/span&gt; &lt;span class="mf"&gt;0.07&lt;/span&gt;&lt;span class="o"&gt;%&lt;/span&gt; &lt;span class="n"&gt;of&lt;/span&gt; &lt;span class="n"&gt;the&lt;/span&gt; &lt;span class="n"&gt;time&lt;/span&gt;
&lt;span class="o"&gt;&amp;gt;&lt;/span&gt; &lt;span class="n"&gt;Out&lt;/span&gt; &lt;span class="n"&gt;of&lt;/span&gt; &lt;span class="mi"&gt;10000&lt;/span&gt; &lt;span class="n"&gt;experiments&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;we&lt;/span&gt; &lt;span class="n"&gt;had&lt;/span&gt; &lt;span class="mf"&gt;12.90&lt;/span&gt;&lt;span class="o"&gt;%&lt;/span&gt; &lt;span class="n"&gt;conversion&lt;/span&gt; &lt;span class="mf"&gt;0.04&lt;/span&gt;&lt;span class="o"&gt;%&lt;/span&gt; &lt;span class="n"&gt;of&lt;/span&gt; &lt;span class="n"&gt;the&lt;/span&gt; &lt;span class="n"&gt;time&lt;/span&gt;
&lt;span class="o"&gt;&amp;gt;&lt;/span&gt; &lt;span class="n"&gt;Out&lt;/span&gt; &lt;span class="n"&gt;of&lt;/span&gt; &lt;span class="mi"&gt;10000&lt;/span&gt; &lt;span class="n"&gt;experiments&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;we&lt;/span&gt; &lt;span class="n"&gt;had&lt;/span&gt; &lt;span class="mf"&gt;13.00&lt;/span&gt;&lt;span class="o"&gt;%&lt;/span&gt; &lt;span class="n"&gt;conversion&lt;/span&gt; &lt;span class="mf"&gt;0.03&lt;/span&gt;&lt;span class="o"&gt;%&lt;/span&gt; &lt;span class="n"&gt;of&lt;/span&gt; &lt;span class="n"&gt;the&lt;/span&gt; &lt;span class="n"&gt;time&lt;/span&gt;
&lt;span class="o"&gt;&amp;gt;&lt;/span&gt; &lt;span class="o"&gt;...&lt;/span&gt;
&lt;/pre&gt;&lt;/div&gt;


&lt;p&gt;Let's add up these "12% or more conversion cases":&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;span class="k"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;p_value = &amp;#39;&lt;/span&gt; &lt;span class="o"&gt;+&lt;/span&gt; &lt;span class="nb"&gt;str&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;100&lt;/span&gt;&lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="nb"&gt;sum&lt;/span&gt;&lt;span class="p"&gt;([&lt;/span&gt;&lt;span class="n"&gt;count&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;i&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;i&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="nb"&gt;range&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nb"&gt;int&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;NUM_FLIPS&lt;/span&gt;&lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="mf"&gt;0.12&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt; &lt;span class="n"&gt;NUM_FLIPS&lt;/span&gt;&lt;span class="p"&gt;)])&lt;/span&gt;&lt;span class="o"&gt;/&lt;/span&gt;&lt;span class="n"&gt;NUM_EXPERIMENTS&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;+&lt;/span&gt; &lt;span class="s1"&gt;&amp;#39;%&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="o"&gt;&amp;gt;&lt;/span&gt; &lt;span class="n"&gt;p_value&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mf"&gt;2.13&lt;/span&gt;&lt;span class="o"&gt;%&lt;/span&gt;
&lt;/pre&gt;&lt;/div&gt;


&lt;p&gt;So if we assume that treatment's (B's) true, intrinsic conversion is also 10%, just like control's (A's) (this is the "boring" null hypothesis), then we'd expect to see 12% or more conversion, just by chance, 2.13% of the time. If we were using $p_{critical} = 5\%$, we'd conclude that this result is "statistically significant": we'd reject the "boring" hypothesis that A and B are the same, and go with the "action" hypothesis: in the future, we will send the B variant to all users. Hence the name "action" hypothesis: it means we take some action.&lt;/p&gt;
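&lt;p&gt;As a sanity check, a figure close to the 2.13% above can be reproduced with a short self-contained simulation. This is a sketch, not the talk's original code: it assumes NUM_FLIPS = 1000 (suggested by the 0.1% granularity of the histogram above) and swaps the per-flip loop for NumPy's binomial sampler for speed.&lt;/p&gt;

```python
# Sketch: estimate the p-value by simulating the null hypothesis directly.
# Assumptions (not from the talk's code): NUM_FLIPS = 1000, and NumPy's
# binomial sampler replaces the per-flip loop.
import numpy as np

NUM_FLIPS = 1000          # flips per experiment (assumed)
NUM_EXPERIMENTS = 100_000
P_NULL = 0.10             # the "boring" hypothesis: B converts at 10%, like A

rng = np.random.default_rng(42)
# Number of conversions in each simulated experiment.
conversions = rng.binomial(NUM_FLIPS, P_NULL, size=NUM_EXPERIMENTS)

# One-sided p-value: how often chance alone yields >= 12% conversion.
p_value = (conversions >= 0.12 * NUM_FLIPS).mean()
print(f'p_value = {100 * p_value:.2f}%')  # roughly 2%
```

&lt;p&gt;Because this is a Monte Carlo estimate, the printed value fluctuates around 2% from run to run.&lt;/p&gt;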
&lt;h2&gt;Conclusion&lt;/h2&gt;
&lt;p&gt;I close the talk by briefly discussing the importance of having a culture of experimentation: the most important thing is to actually run experiments, repeatedly and rigorously. I make the point that this matters more than statistical rigor itself. First run a lot of experiments, then worry about p-values. I quote Jeff Bezos:&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;“Our success at Amazon is a function of how many experiments we run per year, per month, per week, per day.” - Jeff Bezos&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;&lt;a href="https://www.youtube.com/watch?v=XBqM17MKz6U"&gt;This YouTube video is a really good 5-minute pitch on experimentation culture&lt;/a&gt;. Depending on how much time is left, some points can be lifted from it, like:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;first, organizations have to learn how to experiment&lt;/li&gt;
&lt;li&gt;most experiments don't yield wins&lt;/li&gt;
&lt;li&gt;eventual big wins pay for many losers.&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;&lt;em&gt;Note: stressing experimentation culture may or may not be a good investment of time in a talk. For example, in big established companies, culture is usually top-down, so it only makes sense to spend significant time on this if the audience includes top management.&lt;/em&gt;&lt;/p&gt;
&lt;p&gt;Happy experimenting!&lt;/p&gt;</content><category term="ab-testing"></category></entry><entry><title>Random numbers, the natural logarithm and higher dimensional simplexes</title><link href="https://bytepawn.com/random-numbers-the-natural-logarithm.html" rel="alternate"></link><published>2021-04-17T00:00:00+02:00</published><updated>2021-04-17T00:00:00+02:00</updated><author><name>Marton Trencseni</name></author><id>tag:bytepawn.com,2021-04-17:/random-numbers-the-natural-logarithm.html</id><summary type="html">&lt;p&gt;The base $e$ of the natural logarithm shows up in an unexpected place. Let's derive why!&lt;br/&gt;&lt;br/&gt; &lt;img src="/images/rnd_enz.png" alt="Simplex" style="width: 400px;"/&gt;&lt;/p&gt;</summary><content type="html">&lt;h2&gt;Introduction&lt;/h2&gt;
&lt;p&gt;This is a fun post to exercise our brain and practice college math.&lt;/p&gt;
&lt;p&gt;Let's play a game: draw a uniform random number between 0 and 1. What is the expected value?&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="nn"&gt;random&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;random&lt;/span&gt;

&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;once&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;n&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mi"&gt;1000&lt;/span&gt;&lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="mi"&gt;1000&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
    &lt;span class="nb"&gt;sum&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mf"&gt;0.0&lt;/span&gt;
    &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;_&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="nb"&gt;range&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;n&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
        &lt;span class="nb"&gt;sum&lt;/span&gt; &lt;span class="o"&gt;+=&lt;/span&gt; &lt;span class="n"&gt;random&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="nb"&gt;sum&lt;/span&gt;&lt;span class="o"&gt;/&lt;/span&gt;&lt;span class="n"&gt;n&lt;/span&gt;
&lt;span class="k"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;once&lt;/span&gt;&lt;span class="p"&gt;())&lt;/span&gt;

&lt;span class="o"&gt;&amp;gt;&lt;/span&gt; &lt;span class="mf"&gt;0.50023&lt;/span&gt;
&lt;/pre&gt;&lt;/div&gt;


&lt;p&gt;As expected, it's 0.5.&lt;/p&gt;
&lt;p&gt;Okay, what about if we draw two numbers?&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;twice&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;n&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mi"&gt;1000&lt;/span&gt;&lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="mi"&gt;1000&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
    &lt;span class="nb"&gt;sum&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mf"&gt;0.0&lt;/span&gt;
    &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;_&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="nb"&gt;range&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;n&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
        &lt;span class="nb"&gt;sum&lt;/span&gt; &lt;span class="o"&gt;+=&lt;/span&gt; &lt;span class="n"&gt;random&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
        &lt;span class="nb"&gt;sum&lt;/span&gt; &lt;span class="o"&gt;+=&lt;/span&gt; &lt;span class="n"&gt;random&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="nb"&gt;sum&lt;/span&gt;&lt;span class="o"&gt;/&lt;/span&gt;&lt;span class="n"&gt;n&lt;/span&gt;
&lt;span class="k"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;twice&lt;/span&gt;&lt;span class="p"&gt;())&lt;/span&gt;

&lt;span class="o"&gt;&amp;gt;&lt;/span&gt;  &lt;span class="mf"&gt;1.00003&lt;/span&gt;
&lt;/pre&gt;&lt;/div&gt;


&lt;p&gt;As expected, it's 1.0.&lt;/p&gt;
&lt;p&gt;Okay, now let's turn it on its head: draw random numbers between 0 and 1, and stop when the sum exceeds 1. What is the expected number of draws?&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;stop_at&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;z&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mf"&gt;1.0&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;n&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mi"&gt;1000&lt;/span&gt;&lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="mi"&gt;1000&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
    &lt;span class="n"&gt;total_draws&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mf"&gt;0.0&lt;/span&gt;
    &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;_&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="nb"&gt;range&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;n&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
        &lt;span class="n"&gt;trial_sum&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mf"&gt;0.0&lt;/span&gt;
        &lt;span class="n"&gt;num_draws&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;
        &lt;span class="k"&gt;while&lt;/span&gt; &lt;span class="bp"&gt;True&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
            &lt;span class="n"&gt;trial_sum&lt;/span&gt; &lt;span class="o"&gt;+=&lt;/span&gt; &lt;span class="n"&gt;random&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
            &lt;span class="n"&gt;num_draws&lt;/span&gt; &lt;span class="o"&gt;+=&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt;
            &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;trial_sum&lt;/span&gt; &lt;span class="o"&gt;&amp;gt;=&lt;/span&gt; &lt;span class="n"&gt;z&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="k"&gt;break&lt;/span&gt;
        &lt;span class="n"&gt;total_draws&lt;/span&gt; &lt;span class="o"&gt;+=&lt;/span&gt; &lt;span class="n"&gt;num_draws&lt;/span&gt;
    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;total_draws&lt;/span&gt;&lt;span class="o"&gt;/&lt;/span&gt;&lt;span class="n"&gt;n&lt;/span&gt;
&lt;span class="k"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;stop_at&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;

&lt;span class="o"&gt;&amp;gt;&lt;/span&gt; &lt;span class="mf"&gt;2.71758&lt;/span&gt;
&lt;/pre&gt;&lt;/div&gt;


&lt;p&gt;&lt;strong&gt;This returns $ e $, the &lt;a href="https://en.wikipedia.org/wiki/E_(mathematical_constant)"&gt;base of the natural logarithm&lt;/a&gt;!&lt;/strong&gt; Why does it show up here?&lt;/p&gt;
&lt;h2&gt;Practicing college math&lt;/h2&gt;
&lt;p&gt;As a reminder, the definition of $ e $:&lt;/p&gt;
&lt;p&gt;&lt;center&gt;$ e  = 1 + \dfrac{1}{1!} + \dfrac{1}{2!} + \dfrac{1}{3!} + \dfrac{1}{4!} + ...  = 1 + \sum\limits_{n=1}^{\infty} \dfrac{1}{n!} $&lt;/center&gt;&lt;/p&gt;
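&lt;p&gt;As a quick numerical check (a sketch, not from the original post), the partial sums of this series converge to $e$ after just a handful of terms:&lt;/p&gt;

```python
# Numeric check: partial sums of 1 + sum(1/n!) converge to e quickly.
import math

partial_sum = 1.0
for n in range(1, 12):
    partial_sum += 1 / math.factorial(n)

print(partial_sum)                # ~2.71828...
print(abs(math.e - partial_sum))  # remainder on the order of 1e-9
```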
&lt;p&gt;Why is this the expected value of the draws?&lt;/p&gt;
&lt;p&gt;Let $ N $ be the random variable denoting the number of draws required to cross 1.0. The expected value:&lt;/p&gt;
&lt;p&gt;&lt;center&gt;$ E(N) = \sum\limits_{k=2}^{\infty}{ k \times P(N=k)} $&lt;/center&gt;&lt;/p&gt;
&lt;p&gt;The sum starts at $ k = 2 $ because each draw is less than 1, so we need at least 2 draws: $ P(N=k) = 0 $ for $ k = 1 $.&lt;/p&gt;
&lt;p&gt;The first step is to realize that $ P(N=k) $, the probability that we stop at $k$, is the probability that the sum is less than $1$ at $k-1$ draws, and exceeds $1$ after the $k$-th draw. Let $U_i$ denote the $i$-th draw, and let $S_k$ denote the sum up to $k$, $ S_k = \sum\limits_{i=1}^{k}{ U_i } $.&lt;/p&gt;
&lt;p&gt;Then:&lt;/p&gt;
&lt;p&gt;&lt;center&gt;$ P(N=k) = P( S_{k-1} &amp;lt; 1 \land S_k \geq 1 ) $&lt;/center&gt;&lt;/p&gt;
&lt;p&gt;Now some elementary probability theory:&lt;/p&gt;
&lt;p&gt;&lt;center&gt;$ P(A) = P(A \land B) + P(A \land \neg B) $&lt;/center&gt;&lt;/p&gt;
&lt;p&gt;With $ A = S_{k-1} &amp;lt; 1 $ and $ B = S_{k} \geq 1 $:&lt;/p&gt;
&lt;p&gt;&lt;center&gt;$ P(S_{k-1} &amp;lt; 1) = P( S_{k-1} &amp;lt; 1 \land S_k \geq 1 ) + P( S_{k-1} &amp;lt; 1 \land S_k &amp;lt; 1 ) $&lt;/center&gt;&lt;/p&gt;
&lt;p&gt;Since $ S_k &amp;lt; 1 $ implies $ S_{k-1} &amp;lt; 1 $ (the draws are non-negative), $ P( S_{k-1} &amp;lt; 1 \land S_k &amp;lt; 1 ) = P( S_k &amp;lt; 1 ) $, so:&lt;/p&gt;
&lt;p&gt;&lt;center&gt;$ P(S_{k-1} &amp;lt; 1) = P( S_{k-1} &amp;lt; 1 \land S_k \geq 1 ) + P( S_k &amp;lt; 1 ) $&lt;/center&gt;&lt;/p&gt;
&lt;p&gt;Moving terms:&lt;/p&gt;
&lt;p&gt;&lt;center&gt;$ P( S_{k-1} &amp;lt; 1 \land S_k \geq 1 ) = P(S_{k-1} &amp;lt; 1) - P( S_k &amp;lt; 1 ) $&lt;/center&gt;&lt;/p&gt;
&lt;p&gt;So:&lt;/p&gt;
&lt;p&gt;&lt;center&gt;$ P(N=k) = P( S_{k-1} &amp;lt; 1 \land S_k \geq 1 ) = P(S_{k-1} &amp;lt; 1) - P( S_k &amp;lt; 1 ) $&lt;/center&gt;&lt;/p&gt;
&lt;p&gt;Now we just need to figure out the formula for $ P( S_k &amp;lt; 1 ) $.&lt;/p&gt;
&lt;p&gt;This is most easily solved geometrically. For $k=2$, $ P( S_k &amp;lt; 1 ) $ is the triangle area inside the x-y unit square where $x+y &amp;lt; 1$, which is clearly $1/2$. For $k=3$, $ P( S_k &amp;lt; 1 ) $ is the simplex volume inside the x-y-z unit cube where $ x+y+z &amp;lt; 1 $. The simplex has as its base the previous triangle defined by $x+y &amp;lt; 1$, and has unit height.&lt;/p&gt;
&lt;p&gt;&lt;img src="/images/simplex.jpg" alt="Simplex" style="width: 600px;"/&gt;&lt;/p&gt;
&lt;p&gt;The general formula for the volume of such a simplex with unit height in $k$ dimensions is $ V = \dfrac{A}{k} $, where $A$ is the volume of its $(k-1)$-dimensional base. Since the base is the same construction one dimension lower, inductively this means that:&lt;/p&gt;
&lt;p&gt;&lt;center&gt;$ P( S_k &amp;lt; 1 ) = \dfrac{1}{k!}$&lt;/center&gt;&lt;/p&gt;
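&lt;p&gt;This $ P( S_k &amp;lt; 1 ) = 1/k! $ claim is easy to verify with a short Monte Carlo simulation; the sketch below (with the helper p_sum_below_one, introduced here, not from the post) compares the empirical fraction with the factorial formula:&lt;/p&gt;

```python
# Monte Carlo check of P(S_k < 1) = 1/k!: draw k uniforms and count how
# often their sum stays below 1. (Helper introduced for this check.)
import math
from random import random

def p_sum_below_one(k, n=200_000):
    hits = sum(1 for _ in range(n) if sum(random() for _ in range(k)) < 1)
    return hits / n

for k in [2, 3, 4]:
    print(k, p_sum_below_one(k), 1 / math.factorial(k))
```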
&lt;p&gt;We have the factorial from the definition of $e$! Now we apply a simple identity of the factorial, $ k! = k \times (k-1)! $ to our formula:&lt;/p&gt;
&lt;p&gt;&lt;center&gt;$ P(N=k) = P(S_{k-1} &amp;lt; 1) - P( S_k &amp;lt; 1 ) = \dfrac{1}{(k-1)!} - \dfrac{1}{k!} = \dfrac{k}{k \times (k-1)!} - \dfrac{1}{k!} = \dfrac{k}{k!} - \dfrac{1}{k!} = \dfrac{k-1}{k!} $&lt;/center&gt;&lt;/p&gt;
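&lt;p&gt;The formula $ P(N=k) = \dfrac{k-1}{k!} $ can also be checked empirically by simulating the stopping game directly (a sketch with a helper introduced here, not code from the post):&lt;/p&gt;

```python
# Empirical check of P(N=k) = (k-1)/k!: simulate the stopping game and
# compare the observed distribution of N with the formula.
import math
from collections import Counter
from random import random

def draws_to_exceed_one():
    total, num_draws = 0.0, 0
    while total < 1.0:
        total += random()
        num_draws += 1
    return num_draws

n = 200_000
counts = Counter(draws_to_exceed_one() for _ in range(n))
for k in [2, 3, 4, 5]:
    print(k, counts[k] / n, (k - 1) / math.factorial(k))
```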
&lt;p&gt;We're almost done:&lt;/p&gt;
&lt;p&gt;&lt;center&gt;$ E(N) = \sum\limits_{k=2}^{\infty}{ k \times P(N=k)} = \sum\limits_{k=2}^{\infty}{ k \times \dfrac{k-1}{k!} } = \sum\limits_{k=2}^{\infty}{ \dfrac{k-1}{(k-1)!} } $ &lt;/center&gt;&lt;/p&gt;
&lt;p&gt;Let $j = k-1$:&lt;/p&gt;
&lt;p&gt;&lt;center&gt;$ E(N) = \sum\limits_{j=1}^{\infty}{ \dfrac{j}{j!} } = 1 + \sum\limits_{j=2}^{\infty}{ \dfrac{j}{j!} } = 1 + \sum\limits_{j=2}^{\infty}{ \dfrac{1}{(j-1)!} } $ &lt;/center&gt;&lt;/p&gt;
&lt;p&gt;Let $n = j-1$:&lt;/p&gt;
&lt;p&gt;&lt;center&gt;$ E(N) = 1 + \sum\limits_{j=2}^{\infty}{ \dfrac{1}{(j-1)!} } = 1 + \sum\limits_{n=1}^{\infty}{ \dfrac{1}{n!} } = e $ &lt;/center&gt;&lt;/p&gt;
&lt;h2&gt;Varying the stopping threshold&lt;/h2&gt;
&lt;p&gt;We saw above that if we stop at $z=1$, we need an average of $e$ draws. What if we want to stop at $z=2$ or $z=3.14$? What is $E(N_z)$, the average number of draws required if we want to stop at $z$?&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="nn"&gt;matplotlib.pyplot&lt;/span&gt; &lt;span class="kn"&gt;as&lt;/span&gt; &lt;span class="nn"&gt;plt&lt;/span&gt;

&lt;span class="n"&gt;x&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;x&lt;/span&gt;&lt;span class="o"&gt;/&lt;/span&gt;&lt;span class="mi"&gt;10&lt;/span&gt; &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;x&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="nb"&gt;list&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nb"&gt;range&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;10&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;50&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;))]&lt;/span&gt;
&lt;span class="n"&gt;y&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;stop_at&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;i&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;n&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mi"&gt;100&lt;/span&gt;&lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="mi"&gt;1000&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;i&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;x&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;
&lt;span class="n"&gt;fig&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;ax&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;plt&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;subplots&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
&lt;span class="n"&gt;ax&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;plot&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;x&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;y&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;marker&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;o&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="n"&gt;ax&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;set&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;xlabel&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;z&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;ylabel&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;average draws&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
       &lt;span class="n"&gt;title&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;E(N_z)&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="n"&gt;ax&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;grid&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
&lt;span class="n"&gt;plt&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;show&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
&lt;/pre&gt;&lt;/div&gt;


&lt;p&gt;&lt;img src="/images/rnd_enz.png" alt="Random draws required to reach z" style="width: 400px;"/&gt;&lt;/p&gt;
&lt;p&gt;This is intuitive: we need $e$ draws to reach $z=1$, and since each draw adds $0.5$ on average, each additional unit of $z$ requires about 2 more draws, so approximately $E(N_z) = f(z) = e + 2(z-1)$.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Sometimes it's good to exercise our brain (and $\LaTeX$ skills) to stay sharp!&lt;/strong&gt;&lt;/p&gt;
&lt;p&gt;&lt;a href="https://github.com/mtrencseni/playground/blob/master/Random%20numbers%2C%20the%20natural%20logarithm%20and%20higher%20dimensional%20simplexes.ipynb"&gt;The ipython notebook is here.&lt;/a&gt;&lt;/p&gt;</content><category term="bayesian"></category><category term="ab-test"></category></entry><entry><title>Classification accuracy of quantized Autoencoders with Pytorch and MNIST</title><link href="https://bytepawn.com/classification-accuracy-of-quantized-autoencoders-with-pytorch-and-mnist.html" rel="alternate"></link><published>2021-04-09T00:00:00+02:00</published><updated>2021-04-09T00:00:00+02:00</updated><author><name>Marton Trencseni</name></author><id>tag:bytepawn.com,2021-04-09:/classification-accuracy-of-quantized-autoencoders-with-pytorch-and-mnist.html</id><summary type="html">&lt;p&gt;I measure how the classification accuracy of quantized Autoencoder neural network varies with encoding bits on MNIST digits. &lt;br/&gt;&lt;br/&gt; &lt;img src="/images/ae_quant_acc_1.png" alt="Classifier accuracy on quantized Autoencoder output after quantization" style="width: 400px;"/&gt;&lt;/p&gt;</summary><content type="html">&lt;h2&gt;Introduction&lt;/h2&gt;
&lt;p&gt;In the &lt;a href="https://bytepawn.com/investigating-information-storage-in-quantized-autoencoders-with-pytorch-and-mnist.html"&gt;previous post&lt;/a&gt; I investigated the information content, in bits, that Autoencoders store. I took MNIST digits and trained a simple Autoencoder neural network to encode the pictures into &lt;code&gt;4..256&lt;/code&gt; dimensions, where each vector element is a &lt;code&gt;float32&lt;/code&gt;. Then I quantized the elements to &lt;code&gt;2..32&lt;/code&gt; bits of accuracy and measured the reconstruction loss as a function of encoding dimension and quantization:&lt;/p&gt;
&lt;p&gt;&lt;img src="/images/pytorch_ae_q1.png" alt="Pytorch Autoencoder loss with encoding dimension and quantization bits" style="width: 400px;"/&gt;&lt;/p&gt;
&lt;p&gt;Based on the plot, my conclusion was:&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;... 512 bits --- which corresponds to 12x (lossy) compression --- is a good trade-off, or 1024 bits for 10% less loss. Loss does not decrease significantly after 1024 bits; that appears to be the best the Autoencoder can accomplish.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;However, this conclusion is just eyeballed from where the plots level off. The y-axis here is loss, which is related to the pixel distance between original and reconstruction, but it's actually hard to know what a loss of ~300 really means. Also, this loss is not straightforward pixel distance, since it's standard practice to re-scale and normalize MNIST images to help the neural networks:&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;span class="n"&gt;tv&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;datasets&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;MNIST&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="o"&gt;...&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;transform&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;transforms&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Compose&lt;/span&gt;&lt;span class="p"&gt;([&lt;/span&gt;
    &lt;span class="n"&gt;transforms&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;ToTensor&lt;/span&gt;&lt;span class="p"&gt;(),&lt;/span&gt;
    &lt;span class="n"&gt;transforms&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Normalize&lt;/span&gt;&lt;span class="p"&gt;((&lt;/span&gt;&lt;span class="mf"&gt;0.1307&lt;/span&gt;&lt;span class="p"&gt;,),&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mf"&gt;0.3081&lt;/span&gt;&lt;span class="p"&gt;,)),&lt;/span&gt; &lt;span class="o"&gt;&amp;lt;---&lt;/span&gt; &lt;span class="n"&gt;image&lt;/span&gt; &lt;span class="n"&gt;pixels&lt;/span&gt; &lt;span class="n"&gt;are&lt;/span&gt; &lt;span class="n"&gt;re&lt;/span&gt;&lt;span class="o"&gt;-&lt;/span&gt;&lt;span class="n"&gt;scaled&lt;/span&gt;
&lt;span class="p"&gt;]))&lt;/span&gt;
&lt;/pre&gt;&lt;/div&gt;


&lt;p&gt;&lt;strong&gt;A better approach to quantify the "loss" is to run a classifier on the output and see how recognizable the digits are in terms of Accuracy %.&lt;/strong&gt;&lt;/p&gt;
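&lt;p&gt;For clarity, accuracy here simply means the percentage of digits the classifier labels correctly. A minimal sketch with toy stand-in labels (not actual model output):&lt;/p&gt;

```python
# Toy sketch of the accuracy metric: percentage of predicted digits that
# match the true labels. The lists below are illustrative stand-ins.
def accuracy(predicted, actual):
    correct = sum(1 for p, a in zip(predicted, actual) if p == a)
    return 100.0 * correct / len(actual)

print(accuracy([7, 2, 1, 0, 4], [7, 2, 1, 0, 9]))  # -> 80.0
```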
&lt;p&gt;&lt;a href="https://github.com/mtrencseni/pytorch-playground/blob/master/16-pytorch-autoencoder-quantization-2/Pytorch%20MNIST%20Autoencoder%20quantization%20accuracy.ipynb"&gt;The notebook is up on Github.&lt;/a&gt;&lt;/p&gt;
&lt;h2&gt;Experiment setup&lt;/h2&gt;
&lt;p&gt;The MNIST dataset contains a total of 70,000 digits. Usually the first 60,000 are used for training, the remaining 10,000 for testing. Here, I will use:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;30,000 for training a CNN classifier, as in &lt;a href="https://bytepawn.com/solving-mnist-with-pytorch-and-skl.html"&gt;this earlier post&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;30,000 for training an Autoencoder, as in &lt;a href="https://bytepawn.com/investigating-information-storage-in-quantized-autoencoders-with-pytorch-and-mnist.html"&gt;the previous post&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;10,000 for testing how the classifier performs on quantized digits coming from the Autoencoder&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;Then I follow the same strategy as previously, but instead of using loss as a metric, I run the quantized Autoencoder's output through the classifier and plot the accuracy as a function of encoding dimension and quantization.&lt;/p&gt;
&lt;h2&gt;Code&lt;/h2&gt;
&lt;p&gt;First, some helpers to split the dataset into 3 parts:&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;span class="k"&gt;class&lt;/span&gt; &lt;span class="nc"&gt;MNISTPartialTrainDataset&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;Dataset&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
    &lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="fm"&gt;__init__&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="bp"&gt;self&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;root&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;num_samples&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="bp"&gt;None&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;offset&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
        &lt;span class="bp"&gt;self&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;base_dataset&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;tv&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;datasets&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;MNIST&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;root&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;root&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;train&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="bp"&gt;True&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;transform&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;transforms&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Compose&lt;/span&gt;&lt;span class="p"&gt;([&lt;/span&gt;
            &lt;span class="n"&gt;transforms&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;ToTensor&lt;/span&gt;&lt;span class="p"&gt;(),&lt;/span&gt;
            &lt;span class="n"&gt;transforms&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Normalize&lt;/span&gt;&lt;span class="p"&gt;((&lt;/span&gt;&lt;span class="mf"&gt;0.1307&lt;/span&gt;&lt;span class="p"&gt;,),&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mf"&gt;0.3081&lt;/span&gt;&lt;span class="p"&gt;,)),&lt;/span&gt;
        &lt;span class="p"&gt;]))&lt;/span&gt;
        &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;num_samples&lt;/span&gt; &lt;span class="ow"&gt;is&lt;/span&gt; &lt;span class="bp"&gt;None&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
            &lt;span class="bp"&gt;self&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;num_samples&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nb"&gt;len&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="bp"&gt;self&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;base_dataset&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
        &lt;span class="k"&gt;else&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
            &lt;span class="bp"&gt;self&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;num_samples&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nb"&gt;min&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;num_samples&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nb"&gt;len&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="bp"&gt;self&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;base_dataset&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;&lt;span class="o"&gt;-&lt;/span&gt;&lt;span class="n"&gt;offset&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
        &lt;span class="bp"&gt;self&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;offset&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;offset&lt;/span&gt;
    &lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="fm"&gt;__len__&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="bp"&gt;self&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
        &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="bp"&gt;self&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;num_samples&lt;/span&gt;
    &lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="fm"&gt;__getitem__&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="bp"&gt;self&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;idx&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
        &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="bp"&gt;self&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;base_dataset&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="bp"&gt;self&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;offset&lt;/span&gt;&lt;span class="o"&gt;+&lt;/span&gt;&lt;span class="n"&gt;idx&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;

&lt;span class="k"&gt;class&lt;/span&gt; &lt;span class="nc"&gt;MNISTPartialTestDataset&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;Dataset&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
    &lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="fm"&gt;__init__&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="bp"&gt;self&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;root&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;num_samples&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="bp"&gt;None&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;offset&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
        &lt;span class="bp"&gt;self&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;base_dataset&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;tv&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;datasets&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;MNIST&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;root&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;root&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;train&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="bp"&gt;False&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;transform&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;transforms&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Compose&lt;/span&gt;&lt;span class="p"&gt;([&lt;/span&gt;
            &lt;span class="n"&gt;transforms&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;ToTensor&lt;/span&gt;&lt;span class="p"&gt;(),&lt;/span&gt;
            &lt;span class="n"&gt;transforms&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Normalize&lt;/span&gt;&lt;span class="p"&gt;((&lt;/span&gt;&lt;span class="mf"&gt;0.1307&lt;/span&gt;&lt;span class="p"&gt;,),&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mf"&gt;0.3081&lt;/span&gt;&lt;span class="p"&gt;,)),&lt;/span&gt;
        &lt;span class="p"&gt;]))&lt;/span&gt;
        &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;num_samples&lt;/span&gt; &lt;span class="ow"&gt;is&lt;/span&gt; &lt;span class="bp"&gt;None&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
            &lt;span class="bp"&gt;self&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;num_samples&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nb"&gt;len&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="bp"&gt;self&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;base_dataset&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
        &lt;span class="k"&gt;else&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
            &lt;span class="bp"&gt;self&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;num_samples&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nb"&gt;min&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;num_samples&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nb"&gt;len&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="bp"&gt;self&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;base_dataset&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;&lt;span class="o"&gt;-&lt;/span&gt;&lt;span class="n"&gt;offset&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
        &lt;span class="bp"&gt;self&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;offset&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;offset&lt;/span&gt;
    &lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="fm"&gt;__len__&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="bp"&gt;self&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
        &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="bp"&gt;self&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;num_samples&lt;/span&gt;
    &lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="fm"&gt;__getitem__&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="bp"&gt;self&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;idx&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
        &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="bp"&gt;self&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;base_dataset&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="bp"&gt;self&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;offset&lt;/span&gt;&lt;span class="o"&gt;+&lt;/span&gt;&lt;span class="n"&gt;idx&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;
&lt;/pre&gt;&lt;/div&gt;
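The two dataset classes above differ only in `train=True` vs `train=False`; the interesting part is the windowing logic. As a framework-free sketch (the class name `WindowedDataset` is mine, not from the post), the same logic works over any indexable collection, which makes it easy to see how `offset` and `num_samples` carve out non-overlapping slices:

```python
# Minimal sketch of the windowing logic used by the MNIST partial datasets
# above: expose a contiguous [offset, offset + num_samples) slice of any
# indexable dataset. The name WindowedDataset is mine, not the post's.

class WindowedDataset:
    def __init__(self, base, num_samples=None, offset=0):
        self.base = base
        if num_samples is None:
            self.num_samples = len(base)
        else:
            # clamp so we never index past the end of the underlying dataset
            self.num_samples = min(num_samples, len(base) - offset)
        self.offset = offset

    def __len__(self):
        return self.num_samples

    def __getitem__(self, idx):
        return self.base[self.offset + idx]


# e.g. first 30,000 samples for the classifier, the next 30,000 for the
# autoencoder, with no overlap:
first_half = WindowedDataset(list(range(60_000)), num_samples=30_000, offset=0)
second_half = WindowedDataset(list(range(60_000)), num_samples=30_000, offset=30_000)
```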


&lt;p&gt;The classifier is a vanilla CNN:&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;span class="k"&gt;class&lt;/span&gt; &lt;span class="nc"&gt;Classifier&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;nn&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Module&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
    &lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="fm"&gt;__init__&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="bp"&gt;self&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
        &lt;span class="nb"&gt;super&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;Classifier&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="bp"&gt;self&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="fm"&gt;__init__&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
        &lt;span class="bp"&gt;self&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;model&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;nn&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Sequential&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
            &lt;span class="n"&gt;nn&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Conv2d&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;20&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;5&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt;
            &lt;span class="n"&gt;nn&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;ReLU&lt;/span&gt;&lt;span class="p"&gt;(),&lt;/span&gt;
            &lt;span class="n"&gt;nn&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;MaxPool2d&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;2&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;2&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt;
            &lt;span class="n"&gt;nn&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Conv2d&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;20&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;50&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;5&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt;
            &lt;span class="n"&gt;nn&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;ReLU&lt;/span&gt;&lt;span class="p"&gt;(),&lt;/span&gt;
            &lt;span class="n"&gt;nn&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;MaxPool2d&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;2&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;2&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt;
            &lt;span class="n"&gt;nn&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Flatten&lt;/span&gt;&lt;span class="p"&gt;(),&lt;/span&gt;
            &lt;span class="n"&gt;nn&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Linear&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;4&lt;/span&gt;&lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="mi"&gt;4&lt;/span&gt;&lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="mi"&gt;50&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;500&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt;
            &lt;span class="n"&gt;nn&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;ReLU&lt;/span&gt;&lt;span class="p"&gt;(),&lt;/span&gt;
            &lt;span class="n"&gt;nn&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Linear&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;500&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;10&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt;
            &lt;span class="n"&gt;nn&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;LogSoftmax&lt;/span&gt;&lt;span class="p"&gt;(),&lt;/span&gt;
            &lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;forward&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="bp"&gt;self&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;x&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
        &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="bp"&gt;self&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;model&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;x&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/pre&gt;&lt;/div&gt;
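The `4*4*50` input size of the first `Linear` layer follows from the shape arithmetic of the conv and pool layers on 28x28 MNIST inputs. A quick check (the helper functions are mine, for illustration):

```python
# Verify the 4*4*50 flatten size: a Conv2d with kernel 5, stride 1 and no
# padding shrinks each side by 4; a MaxPool2d(2, 2) halves it.
def conv_out(side, kernel=5, stride=1):
    return (side - kernel) // stride + 1

def pool_out(side, window=2, stride=2):
    return (side - window) // stride + 1

side = 28
side = pool_out(conv_out(side))  # 28 -> 24 -> 12
side = pool_out(conv_out(side))  # 12 -> 8 -> 4
flatten_size = side * side * 50  # 50 channels after the second conv
print(flatten_size)  # → 800, i.e. 4*4*50, the input size of Linear(4*4*50, 500)
```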


&lt;p&gt;We train the classifier on the first 30,000 MNIST digits:&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;span class="n"&gt;classifier&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;Classifier&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;to&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;device&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="n"&gt;distance&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;nn&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;NLLLoss&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
&lt;span class="n"&gt;optimizer&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;torch&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;optim&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;SGD&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;classifier&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;parameters&lt;/span&gt;&lt;span class="p"&gt;(),&lt;/span&gt; &lt;span class="n"&gt;lr&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mf"&gt;0.01&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;momentum&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mf"&gt;0.5&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="n"&gt;num_epochs&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;20&lt;/span&gt;
&lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;epoch&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="nb"&gt;range&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;num_epochs&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
    &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;data&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;target&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;classifier_dataloader&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
        &lt;span class="n"&gt;data&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;target&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;data&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;to&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;device&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt; &lt;span class="n"&gt;target&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;to&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;device&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
        &lt;span class="n"&gt;optimizer&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;zero_grad&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
        &lt;span class="n"&gt;output&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;classifier&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;data&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
        &lt;span class="n"&gt;loss&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;distance&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;output&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;target&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
        &lt;span class="n"&gt;loss&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;backward&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
        &lt;span class="n"&gt;optimizer&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;step&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
    &lt;span class="k"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;epoch {:d}/{:d}, loss: {:.4f}&amp;#39;&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;format&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;epoch&lt;/span&gt;&lt;span class="o"&gt;+&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;num_epochs&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;loss&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;item&lt;/span&gt;&lt;span class="p"&gt;()),&lt;/span&gt; &lt;span class="n"&gt;end&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;&lt;/span&gt;&lt;span class="se"&gt;\r&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="n"&gt;correct&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;
&lt;span class="n"&gt;classifier&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;classifier&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;to&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;cpu&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;data&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;target&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;classifier_dataloader&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
    &lt;span class="n"&gt;output&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;classifier&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;data&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="n"&gt;pred&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;output&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;argmax&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;dim&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;keepdim&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="bp"&gt;True&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="c1"&gt;# get the index of the max log-probability&lt;/span&gt;
    &lt;span class="n"&gt;correct&lt;/span&gt; &lt;span class="o"&gt;+=&lt;/span&gt; &lt;span class="n"&gt;pred&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;eq&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;target&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;view_as&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;pred&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;sum&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;item&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
&lt;span class="k"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;Classifier train accuracy: {:.1f}%&amp;#39;&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;format&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;100&lt;/span&gt;&lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="n"&gt;correct&lt;/span&gt;&lt;span class="o"&gt;/&lt;/span&gt;&lt;span class="nb"&gt;len&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;classifier_dataloader&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;dataset&lt;/span&gt;&lt;span class="p"&gt;)))&lt;/span&gt;
&lt;/pre&gt;&lt;/div&gt;


&lt;p&gt;The classifier achieves 99.8% train accuracy, as expected:&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;Classifier train accuracy: 99.8%
&lt;/pre&gt;&lt;/div&gt;


&lt;p&gt;Now the Autoencoder, exactly the same as before:&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;span class="k"&gt;class&lt;/span&gt; &lt;span class="nc"&gt;Autoencoder&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;nn&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Module&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
    &lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="fm"&gt;__init__&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="bp"&gt;self&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;encoding_dims&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
        &lt;span class="nb"&gt;super&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;Autoencoder&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="bp"&gt;self&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="fm"&gt;__init__&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
        &lt;span class="bp"&gt;self&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;encoder&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;nn&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Sequential&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
            &lt;span class="n"&gt;nn&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Flatten&lt;/span&gt;&lt;span class="p"&gt;(),&lt;/span&gt;
            &lt;span class="n"&gt;nn&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Linear&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;img_dims&lt;/span&gt;&lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="n"&gt;img_dims&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;encoding_dims&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt;
            &lt;span class="n"&gt;nn&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;ReLU&lt;/span&gt;&lt;span class="p"&gt;(),&lt;/span&gt;
            &lt;span class="p"&gt;)&lt;/span&gt;
        &lt;span class="bp"&gt;self&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;decoder&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;nn&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Sequential&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
            &lt;span class="n"&gt;nn&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Linear&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;encoding_dims&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;img_dims&lt;/span&gt;&lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="n"&gt;img_dims&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt;
            &lt;span class="n"&gt;nn&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Unflatten&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;img_dims&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;img_dims&lt;/span&gt;&lt;span class="p"&gt;)),&lt;/span&gt;
            &lt;span class="n"&gt;nn&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Sigmoid&lt;/span&gt;&lt;span class="p"&gt;(),&lt;/span&gt;
            &lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;forward&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="bp"&gt;self&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;x&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;quantize_bits&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="bp"&gt;None&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
        &lt;span class="n"&gt;x&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="bp"&gt;self&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;encoder&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;x&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
        &lt;span class="n"&gt;x&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;torch&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;sigmoid&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;x&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
        &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;quantize_bits&lt;/span&gt; &lt;span class="ow"&gt;is&lt;/span&gt; &lt;span class="ow"&gt;not&lt;/span&gt; &lt;span class="bp"&gt;None&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
            &lt;span class="n"&gt;x&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;round_bits&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;x&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;quantize_bits&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
        &lt;span class="n"&gt;x&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;torch&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;logit&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;x&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;eps&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mf"&gt;0.001&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
        &lt;span class="n"&gt;x&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="bp"&gt;self&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;decoder&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;x&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
        &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;x&lt;/span&gt;
&lt;/pre&gt;&lt;/div&gt;
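The `round_bits()` call in `forward()` is defined earlier in the post; for completeness, a plausible minimal sketch, assuming it uniformly quantizes the sigmoid activations in [0, 1] to `2**bits` discrete levels (this exact implementation is my assumption, not copied from the post):

```python
import torch

def round_bits(x, bits):
    # Uniform quantization of activations in [0, 1] to 2**bits levels
    # (my assumed implementation of the post's round_bits helper).
    levels = 2 ** bits - 1
    return torch.round(x * levels) / levels
```

Note that quantizing in sigmoid space and then applying `torch.logit` (with a small `eps` to avoid infinities at 0 and 1) maps the discrete levels back into the unbounded space the decoder was trained on.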


&lt;p&gt;The main loop is very similar to before, except for one additional step: the trained classifier is run on the Autoencoder's output digits:&lt;/p&gt;

&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;span class="n"&gt;results&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[]&lt;/span&gt;
&lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;encoding_dims&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="mi"&gt;4&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;8&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;16&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;32&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;64&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;128&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;256&lt;/span&gt;&lt;span class="p"&gt;]:&lt;/span&gt;
    &lt;span class="c1"&gt;# train autoencoder, code omitted&lt;/span&gt;
    &lt;span class="o"&gt;...&lt;/span&gt;
    &lt;span class="c1"&gt;# test&lt;/span&gt;
    &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;quantize_bits&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="bp"&gt;None&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;2&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;4&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;8&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;16&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;32&lt;/span&gt;&lt;span class="p"&gt;]:&lt;/span&gt;
        &lt;span class="n"&gt;correct&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;
        &lt;span class="n"&gt;autoencoder&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;autoencoder&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;to&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;cpu&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
        &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;imgs&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;labels&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;autoencoder_test_dataloader&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
            &lt;span class="n"&gt;ae_imgs&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;autoencoder&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;imgs&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;quantize_bits&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;quantize_bits&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
            &lt;span class="n"&gt;output&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;classifier&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;ae_imgs&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
            &lt;span class="n"&gt;pred&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;output&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;argmax&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;dim&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;keepdim&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="bp"&gt;True&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="c1"&gt;# get the index of the max log-probability&lt;/span&gt;
            &lt;span class="n"&gt;correct&lt;/span&gt; &lt;span class="o"&gt;+=&lt;/span&gt; &lt;span class="n"&gt;pred&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;eq&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;labels&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;view_as&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;pred&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;sum&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;item&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
        &lt;span class="n"&gt;accuracy&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;correct&lt;/span&gt;&lt;span class="o"&gt;/&lt;/span&gt;&lt;span class="nb"&gt;len&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;autoencoder_test_dataloader&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;dataset&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
        &lt;span class="n"&gt;results&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;append&lt;/span&gt;&lt;span class="p"&gt;((&lt;/span&gt;&lt;span class="n"&gt;encoding_dims&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;quantize_bits&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;accuracy&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;
&lt;/pre&gt;&lt;/div&gt;
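&lt;p&gt;The loop above appends &lt;code&gt;(encoding_dims, quantize_bits, accuracy)&lt;/code&gt; tuples to &lt;code&gt;results&lt;/code&gt;; here is a minimal sketch (not from the notebook) of grouping such tuples into one accuracy curve per &lt;code&gt;encoding_dims&lt;/code&gt;, ready for plotting. The sample tuples use accuracies quoted later in the post:&lt;/p&gt;

```python
from collections import defaultdict

def group_results(results):
    """Group (encoding_dims, quantize_bits, accuracy) tuples into
    one accuracy-per-quantize_bits series per encoding_dims value."""
    series = defaultdict(list)
    for encoding_dims, quantize_bits, accuracy in series.default_factory and [] or results:
        series[encoding_dims].append((quantize_bits, accuracy))
    # sort each curve by quantize_bits for a clean left-to-right plot
    return {dims: sorted(points) for dims, points in series.items()}

# hypothetical sample of collected results
results = [(64, 2, 0.60), (64, 4, 0.90), (64, 8, 0.97), (128, 8, 0.98)]
curves = group_results(results)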


&lt;h2&gt;Results&lt;/h2&gt;
&lt;p&gt;The results can be plotted to show the &lt;strong&gt;accuracy of the classifier&lt;/strong&gt; per &lt;code&gt;encoding_dims&lt;/code&gt;, per &lt;code&gt;quantize_bits&lt;/code&gt;:&lt;/p&gt;
&lt;p&gt;&lt;img src="/images/ae_quant_acc_1.png" alt="Classifier accuracy on quantized Autoencoder output after quantization" style="width: 400px;"/&gt;&lt;/p&gt;
&lt;p&gt;Alternatively we can plot &lt;code&gt;total_bits = encoding_dims * quantize_bits&lt;/code&gt; on the x-axis:&lt;/p&gt;
&lt;p&gt;&lt;img src="/images/ae_quant_acc_2.png" alt="Classifier accuracy on quantized Autoencoder output after quantization as a function of total bits" style="width: 400px;"/&gt;&lt;/p&gt;
&lt;p&gt;The plots re-affirm what I read off the plots in the previous post:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;each &lt;code&gt;float32&lt;/code&gt; in the encoding stores around 8 bits of useful information (out of 32), since all of the curves flatten out after 8 bits&lt;/li&gt;
&lt;li&gt;128 dimensions is the maximum required, since the next jump to 256 yields no significant increase in accuracy&lt;/li&gt;
&lt;li&gt;overall, based on these curves, &lt;code&gt;encoding_dims = 64&lt;/code&gt; and &lt;code&gt;quantize_bits = 8&lt;/code&gt; appear to be a good trade-off (&lt;code&gt;total_bits = 64*8 = 512 bits&lt;/code&gt;)&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;However, now it's easier to quantify:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;at &lt;code&gt;encoding_dims = 64&lt;/code&gt; and &lt;code&gt;quantize_bits = 8&lt;/code&gt; (&lt;code&gt;total_bits = 64*8 = 512 bits&lt;/code&gt;), the classifier achieves 97% accuracy, which is still great&lt;/li&gt;
&lt;li&gt;at higher &lt;code&gt;total_bits&lt;/code&gt;, the classifier achieves 98% on these unseen, encoded-then-decoded images, the same 98% we get with no quantization at all&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;Finally, to see visually what happens when we autoencode and quantize, here is a sample of the Autoencoder's output for &lt;code&gt;encoding_dims=64&lt;/code&gt; and various quantization levels:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;&lt;code&gt;encoding_dims = 64, quantize_bits = 2 -&amp;gt; total_bits = 128, accuracy = 60%&lt;/code&gt;:&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;&lt;img src="/images/ae_samples_2_64.png" style="width: 800px;"/&gt;&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;&lt;code&gt;encoding_dims = 64, quantize_bits = 4, total_bits = 256, accuracy = 90%&lt;/code&gt;:&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;&lt;img src="/images/ae_samples_4_64.png" style="width: 800px;"/&gt;&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;&lt;code&gt;encoding_dims = 64, quantize_bits = 8, total_bits = 512, accuracy = 97%&lt;/code&gt;:&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;&lt;img src="/images/ae_samples_8_64.png" style="width: 800px;"/&gt;&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;&lt;code&gt;encoding_dims = 64, quantize_bits = 16, total_bits = 1024, accuracy = 98%&lt;/code&gt;:&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;&lt;img src="/images/ae_samples_16_64.png" style="width: 800px;"/&gt;&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;&lt;code&gt;encoding_dims = 64, quantize_bits = 32, total_bits = 2048, accuracy = 98%&lt;/code&gt;:&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;&lt;img src="/images/ae_samples_32_64.png" style="width: 800px;"/&gt;&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;&lt;code&gt;encoding_dims = 64, not quantized, accuracy = 98%&lt;/code&gt;:&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;&lt;img src="/images/ae_samples_None_64.png" style="width: 800px;"/&gt;&lt;/p&gt;
&lt;p&gt;&lt;a href="https://github.com/mtrencseni/pytorch-playground/tree/master/16-pytorch-autoencoder-quantization-2"&gt;More images here.&lt;/a&gt;&lt;/p&gt;</content><category term="python"></category><category term="pytorch"></category><category term="cnn"></category><category term="torchvision"></category><category term="mnist"></category><category term="autoencoder"></category></entry><entry><title>Investigating information storage in quantized Autoencoders with Pytorch and MNIST</title><link href="https://bytepawn.com/investigating-information-storage-in-quantized-autoencoders-with-pytorch-and-mnist.html" rel="alternate"></link><published>2021-04-04T00:00:00+02:00</published><updated>2021-04-04T00:00:00+02:00</updated><author><name>Marton Trencseni</name></author><id>tag:bytepawn.com,2021-04-04:/investigating-information-storage-in-quantized-autoencoders-with-pytorch-and-mnist.html</id><summary type="html">&lt;p&gt;I investigate how much information an Autoencoder neural network encodes for MNIST digits. &lt;br/&gt;&lt;br/&gt; &lt;img src="/images/pytorch_ae_q1.png" alt="Pytorch Autoencoder loss with encoding dimension and quantization bits" style="width: 400px;"/&gt;&lt;/p&gt;</summary><content type="html">&lt;h2&gt;Introduction&lt;/h2&gt;
&lt;p&gt;In this experiment I wanted to understand the compression ratio of Autoencoders: &lt;strong&gt;how much information (how many bits) does an Autoencoder encode in the encoding dimensions?&lt;/strong&gt; Let's say an autoencoder is able to encode a 28x28 grayscale MNIST image (28x28x8 bits = 6272 bits) in a 32 dimensional encoding space with acceptable reconstruction loss. What is the compression ratio? On a CUDA/GPU, those 32 dimensions are actually 32 float32s, so it's 32x32 = 1024 bits, which corresponds to 6.1x (lossy) compression. But are all those 1024 bits really needed? Intuitively, the entire float32 range is probably not used. A related question is: what is the "right" number of encoding dimensions to pick for Autoencoders?&lt;/p&gt;
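&lt;p&gt;The arithmetic above is easy to check (a quick sketch, not from the notebook):&lt;/p&gt;

```python
# bits in one raw 28x28 grayscale MNIST image
raw_bits = 28 * 28 * 8           # 6272 bits

# a 32-dimensional encoding of float32s
encoded_bits = 32 * 32           # 1024 bits

# lossy compression ratio of the unquantized encoding
ratio = raw_bits / encoded_bits
print(ratio)                     # 6.125, i.e. roughly 6.1x
```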
&lt;p&gt;&lt;a href="https://github.com/mtrencseni/pytorch-playground/blob/master/15-pytorch-autoencoder-quantization-1/Pytorch%20MNIST%20Autoencoder%20quantization%20loss.ipynb"&gt;The notebook is up on Github.&lt;/a&gt;&lt;/p&gt;
&lt;h2&gt;Experiment setup&lt;/h2&gt;
&lt;p&gt;To answer these questions, I took a simple Autoencoder neural network with a &lt;strong&gt;Linear+ReLU&lt;/strong&gt; encoder and a &lt;strong&gt;Linear+Sigmoid&lt;/strong&gt; decoder layer. Since I will want to quantize the bits between the encoder and the decoder, I use the &lt;strong&gt;sigmoid() function to squash the encoder's output between 0 and 1, and then its inverse, the logit() function, before feeding it back to the decoder&lt;/strong&gt;.&lt;/p&gt;
&lt;p&gt;&lt;img src="https://miro.medium.com/max/7200/1*59xmdS1z6OfGIMogPM9P0g.png" alt="sigmoid() and logit() function" style="width: 600px;"/&gt;&lt;/p&gt;
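&lt;p&gt;A quick way to convince yourself that logit() is the inverse of sigmoid() --- a pure-Python sketch using &lt;code&gt;math&lt;/code&gt;, rather than the torch versions used in the code below:&lt;/p&gt;

```python
import math

def sigmoid(x):
    # maps any real x into the open interval (0, 1)
    return 1.0 / (1.0 + math.exp(-x))

def logit(p):
    # inverse of sigmoid: maps (0, 1) back to the reals
    return math.log(p / (1.0 - p))

# round-tripping through sigmoid and logit recovers the input
for x in [-3.0, -0.5, 0.0, 1.2, 4.0]:
    assert math.isclose(logit(sigmoid(x)), x, abs_tol=1e-9)
```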
&lt;h2&gt;Code&lt;/h2&gt;
&lt;p&gt;The code is a straightforward Autoencoder neural network implemented in Pytorch, with some additional transformations in the &lt;code&gt;forward()&lt;/code&gt; function to implement quantization. The arrows mark the departure from a vanilla Autoencoder:&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;span class="k"&gt;class&lt;/span&gt; &lt;span class="nc"&gt;Autoencoder&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;nn&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Module&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
    &lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="fm"&gt;__init__&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="bp"&gt;self&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;encoding_dims&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
        &lt;span class="nb"&gt;super&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;Autoencoder&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="bp"&gt;self&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="fm"&gt;__init__&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
        &lt;span class="bp"&gt;self&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;encoder&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;nn&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Sequential&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
            &lt;span class="n"&gt;nn&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Flatten&lt;/span&gt;&lt;span class="p"&gt;(),&lt;/span&gt;
            &lt;span class="n"&gt;nn&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Linear&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;img_dims&lt;/span&gt;&lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="n"&gt;img_dims&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;encoding_dims&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt;
            &lt;span class="n"&gt;nn&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;ReLU&lt;/span&gt;&lt;span class="p"&gt;(),&lt;/span&gt;
            &lt;span class="p"&gt;)&lt;/span&gt;
        &lt;span class="bp"&gt;self&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;decoder&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;nn&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Sequential&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
            &lt;span class="n"&gt;nn&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Linear&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;encoding_dims&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;img_dims&lt;/span&gt;&lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="n"&gt;img_dims&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt;
            &lt;span class="n"&gt;nn&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Unflatten&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;img_dims&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;img_dims&lt;/span&gt;&lt;span class="p"&gt;)),&lt;/span&gt;
            &lt;span class="n"&gt;nn&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Sigmoid&lt;/span&gt;&lt;span class="p"&gt;(),&lt;/span&gt;
            &lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;forward&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="bp"&gt;self&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;x&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;quantize_bits&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="bp"&gt;None&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
        &lt;span class="n"&gt;x&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="bp"&gt;self&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;encoder&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;x&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
        &lt;span class="n"&gt;x&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;torch&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;sigmoid&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;x&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;                   &lt;span class="o"&gt;&amp;lt;---&lt;/span&gt; &lt;span class="n"&gt;sigmoid&lt;/span&gt;
        &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;quantize_bits&lt;/span&gt; &lt;span class="ow"&gt;is&lt;/span&gt; &lt;span class="ow"&gt;not&lt;/span&gt; &lt;span class="bp"&gt;None&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;          &lt;span class="o"&gt;&amp;lt;---&lt;/span&gt; &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="ow"&gt;not&lt;/span&gt; &lt;span class="n"&gt;training&lt;/span&gt;
            &lt;span class="n"&gt;x&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;round_bits&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;x&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;quantize_bits&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;   &lt;span class="o"&gt;&amp;lt;---&lt;/span&gt; &lt;span class="o"&gt;..&lt;/span&gt; &lt;span class="n"&gt;then&lt;/span&gt; &lt;span class="n"&gt;quantize&lt;/span&gt; &lt;span class="n"&gt;the&lt;/span&gt; &lt;span class="n"&gt;encoding&lt;/span&gt;
        &lt;span class="n"&gt;x&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;torch&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;logit&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;x&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;eps&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mf"&gt;0.001&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;          &lt;span class="o"&gt;&amp;lt;---&lt;/span&gt; &lt;span class="n"&gt;logit&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;inverse&lt;/span&gt; &lt;span class="n"&gt;sigmoid&lt;/span&gt;
        &lt;span class="n"&gt;x&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="bp"&gt;self&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;decoder&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;x&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
        &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;x&lt;/span&gt;
&lt;/pre&gt;&lt;/div&gt;


&lt;p&gt;The function &lt;code&gt;round_bits()&lt;/code&gt; quantizes the input number to &lt;code&gt;2**quantize_bits&lt;/code&gt; levels between 0 and 1:&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;round_bits&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;x&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;quantize_bits&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
    &lt;span class="n"&gt;mul&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;2&lt;/span&gt;&lt;span class="o"&gt;**&lt;/span&gt;&lt;span class="n"&gt;quantize_bits&lt;/span&gt;
    &lt;span class="n"&gt;x&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;x&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt; &lt;span class="n"&gt;mul&lt;/span&gt;
    &lt;span class="n"&gt;x&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;torch&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;floor&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;x&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="n"&gt;x&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;x&lt;/span&gt; &lt;span class="o"&gt;/&lt;/span&gt; &lt;span class="n"&gt;mul&lt;/span&gt;
    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;x&lt;/span&gt;
&lt;/pre&gt;&lt;/div&gt;
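&lt;p&gt;To see the quantization concretely, here is the same &lt;code&gt;round_bits()&lt;/code&gt; logic in pure Python (scalar input, no torch), applied with &lt;code&gt;quantize_bits = 2&lt;/code&gt;, i.e. 4 levels:&lt;/p&gt;

```python
import math

def round_bits(x, quantize_bits):
    # snap x in [0, 1) down to one of 2**quantize_bits evenly spaced levels
    mul = 2 ** quantize_bits
    return math.floor(x * mul) / mul

# with quantize_bits=2 the levels are 0.0, 0.25, 0.5, 0.75
print(round_bits(0.37, 2))  # 0.25
print(round_bits(0.97, 2))  # 0.75
# with quantize_bits=8 there are 256 levels, so much less is lost
print(round_bits(0.37, 8))  # 0.3671875
```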


&lt;p&gt;The main training loop trains the Autoencoder for different &lt;code&gt;encoding_dims&lt;/code&gt;, and then tests the reconstruction loss for various values of &lt;code&gt;quantize_bits&lt;/code&gt;:&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;encoding_dims&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="mi"&gt;4&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;8&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;16&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;32&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;64&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;128&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;256&lt;/span&gt;&lt;span class="p"&gt;]:&lt;/span&gt;
    &lt;span class="c1"&gt;# train&lt;/span&gt;
    &lt;span class="n"&gt;autoencoder&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;Autoencoder&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;encoding_dims&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;encoding_dims&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;to&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;device&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="n"&gt;distance&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;nn&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;BCELoss&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
    &lt;span class="n"&gt;optimizer&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;torch&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;optim&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Adam&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;autoencoder&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;parameters&lt;/span&gt;&lt;span class="p"&gt;(),&lt;/span&gt; &lt;span class="n"&gt;lr&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mf"&gt;0.001&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="n"&gt;num_epochs&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;50&lt;/span&gt;
    &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;epoch&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="nb"&gt;range&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;num_epochs&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
        &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;imgs&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;_&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;autoencoder_train_dataloader&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
            &lt;span class="n"&gt;imgs&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;Variable&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;imgs&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;to&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;device&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
            &lt;span class="n"&gt;output&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;autoencoder&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;imgs&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
            &lt;span class="n"&gt;loss&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;distance&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;output&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;imgs&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
            &lt;span class="n"&gt;optimizer&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;zero_grad&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
            &lt;span class="n"&gt;loss&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;backward&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
            &lt;span class="n"&gt;optimizer&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;step&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
    &lt;span class="c1"&gt;# test&lt;/span&gt;
    &lt;span class="n"&gt;distance&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;nn&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;MSELoss&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
    &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;quantize_bits&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="mi"&gt;2&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;4&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;8&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;16&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;32&lt;/span&gt;&lt;span class="p"&gt;]:&lt;/span&gt;
        &lt;span class="n"&gt;loss&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;
        &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;imgs&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;_&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;autoencoder_train_dataloader&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
            &lt;span class="n"&gt;imgs&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;Variable&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;imgs&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;to&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;device&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
            &lt;span class="k"&gt;with&lt;/span&gt; &lt;span class="n"&gt;torch&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;no_grad&lt;/span&gt;&lt;span class="p"&gt;():&lt;/span&gt;
                &lt;span class="n"&gt;output&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;autoencoder&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;imgs&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;quantize_bits&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;quantize_bits&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
            &lt;span class="n"&gt;loss&lt;/span&gt; &lt;span class="o"&gt;+=&lt;/span&gt; &lt;span class="n"&gt;distance&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;output&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;imgs&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/pre&gt;&lt;/div&gt;


&lt;h2&gt;Results&lt;/h2&gt;
&lt;p&gt;The results can be plotted to show the loss per &lt;code&gt;encoding_dims&lt;/code&gt;, per &lt;code&gt;quantize_bits&lt;/code&gt;:&lt;/p&gt;
&lt;p&gt;&lt;img src="/images/pytorch_ae_q1.png" alt="Pytorch Autoencoder loss with encoding dimension and quantization bits" style="width: 400px;"/&gt;&lt;/p&gt;
&lt;p&gt;The plot shows that:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;each &lt;code&gt;float32&lt;/code&gt; in the encoding stores around 8 bits of useful information (out of 32), since all of the curves flatten out after 8 bits&lt;/li&gt;
&lt;li&gt;128 dimensions is the maximum required, since the next jump to 256 yields no significant decrease in loss&lt;/li&gt;
&lt;li&gt;overall, based on these curves, &lt;code&gt;encoding_dims = 64&lt;/code&gt; and &lt;code&gt;quantize_bits = 8&lt;/code&gt; appear to be a good trade-off (&lt;code&gt;total_bits = 64*8 = 512 bits&lt;/code&gt;)&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;Alternatively we can plot &lt;code&gt;total_bits = encoding_dims * quantize_bits&lt;/code&gt; on the x-axis:&lt;/p&gt;
&lt;p&gt;&lt;img src="/images/pytorch_ae_q2.png" alt="Pytorch Autoencoder loss with encoding dimension and total bits" style="width: 400px;"/&gt;&lt;/p&gt;
&lt;p&gt;This re-affirms that 512 bits --- which corresponds to 12x (lossy) compression --- is a good trade-off, or 1024 bits for 10% less loss. Loss does not decrease significantly after 1024 bits; that appears to be the best the Autoencoder can accomplish. For reference, the entire MNIST training dataset, uncompressed, is &lt;code&gt;28*28*8 * 60*1000 / 8 = 47,040,000 bytes&lt;/code&gt;. After gzip compression, the file size is &lt;code&gt;9,912,422 bytes&lt;/code&gt;, for a lossless compression ratio of 4.7x.&lt;/p&gt;
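&lt;p&gt;These ratios can be verified directly (a quick sketch; sizes as quoted above):&lt;/p&gt;

```python
# uncompressed MNIST training set: 60,000 images, 28x28 pixels, 8 bits each
uncompressed_bytes = 28 * 28 * 8 * 60 * 1000 // 8   # 47,040,000 bytes
gzipped_bytes = 9_912_422                           # size of the gzipped file

lossless_ratio = uncompressed_bytes / gzipped_bytes
print(round(lossless_ratio, 1))   # 4.7

# lossy compression at the 512-bit trade-off identified above
lossy_ratio = (28 * 28 * 8) / 512
print(round(lossy_ratio))         # 12
```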
&lt;p&gt;In the next post, I will explore what we lose with the Autoencoder's lossy compression in terms of recognizability of the digits.&lt;/p&gt;</content><category term="python"></category><category term="pytorch"></category><category term="cnn"></category><category term="torchvision"></category><category term="mnist"></category><category term="autoencoder"></category></entry><entry><title>Building a Pytorch Autoencoder for MNIST digits</title><link href="https://bytepawn.com/building-a-pytorch-autoencoder-for-mnist-digits.html" rel="alternate"></link><published>2021-03-18T00:00:00+01:00</published><updated>2021-03-18T00:00:00+01:00</updated><author><name>Marton Trencseni</name></author><id>tag:bytepawn.com,2021-03-18:/building-a-pytorch-autoencoder-for-mnist-digits.html</id><summary type="html">&lt;p&gt;I build an Autoencoder network to categorize MNIST digits in Pytorch.&lt;br/&gt;&lt;br/&gt; &lt;img src="/images/mnist_autoenc.jpg" alt="Conversion difference vs N" style="width: 400px;"/&gt;&lt;/p&gt;</summary><content type="html">&lt;h2&gt;Introduction&lt;/h2&gt;
&lt;p&gt;In the previous posts I was training GANs to auto-generate synthetic MNIST digits:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;&lt;a href="https://bytepawn.com/training-a-pytorch-wasserstain-mnist-gan-on-google-colab.html#training-a-pytorch-wasserstain-mnist-gan-on-google-colab"&gt;Training a Pytorch Wasserstein MNIST GAN on Google Colab&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://bytepawn.com/training-a-pytorch-classic-mnist-gan-on-google-colab.html#training-a-pytorch-classic-mnist-gan-on-google-colab"&gt;Training a Pytorch Classic MNIST GAN on Google Colab&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://bytepawn.com/training-a-pytorch-lightning-mnist-gan-on-google-colab.html#training-a-pytorch-lightning-mnist-gan-on-google-colab"&gt;Training a Pytorch Lightning MNIST GAN on Google Colab&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;Here I take a step back to a simpler idea from unsupervised learning, &lt;a href="https://en.wikipedia.org/wiki/Autoencoder"&gt;Autoencoders&lt;/a&gt;. The idea is simple: take the input, reduce the dimensionality toward the middle of the deep neural network in the encoder part, and then restore the original dimensionality in the second, decoder part. The network is then trained to attempt to restore the original input, which in this case is MNIST digits, with minimal loss.&lt;/p&gt;
&lt;p&gt;&lt;img src="/images/autoencoder.png" alt="Autoencoder." style="width: 600px;"/&gt;&lt;/p&gt;
&lt;p&gt;The &lt;a href="https://en.wikipedia.org/wiki/Autoencoder"&gt;Wikipedia explanation&lt;/a&gt;:&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;An autoencoder is a type of artificial neural network used to learn efficient data codings in an unsupervised manner. The aim of an autoencoder is to learn a representation (encoding) for a set of data, typically for dimensionality reduction, by training the network to ignore signal “noise”. Along with the reduction side, a reconstructing side is learned, where the autoencoder tries to generate from the reduced encoding a representation as close as possible to its original input, hence its name.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;In this post, I will try to build an Autoencoder in Pytorch, where the middle "encoded" layer is exactly 10 neurons wide. My assumption is that the best way to encode an MNIST digit is for the encoder to learn to classify digits, and then for the decoder to generate an average image of a digit for each. The &lt;a href="https://github.com/mtrencseni/pytorch-playground/blob/master/14-pytorch-autoencoder/Pytorch%20MNIST%20Autoencoder.ipynb"&gt;ipython notebook is here&lt;/a&gt;.&lt;/p&gt;
&lt;p&gt;&lt;img src="/images/mnist_autoenc.jpg" alt="MNIST autoencoder." style="width: 600px;"/&gt;&lt;/p&gt;
&lt;h2&gt;Code&lt;/h2&gt;
&lt;p&gt;I will only show the relevant parts of the code. First, the &lt;code&gt;Autoencoder&lt;/code&gt; model, split into an &lt;code&gt;encoder&lt;/code&gt; and a &lt;code&gt;decoder&lt;/code&gt;:&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;span class="k"&gt;class&lt;/span&gt; &lt;span class="nc"&gt;Autoencoder&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;nn&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Module&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
    &lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="fm"&gt;__init__&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="bp"&gt;self&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
        &lt;span class="nb"&gt;super&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;Autoencoder&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="bp"&gt;self&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="fm"&gt;__init__&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
        &lt;span class="bp"&gt;self&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;encoder&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;nn&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Sequential&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
            &lt;span class="c1"&gt;# 28 x 28&lt;/span&gt;
            &lt;span class="n"&gt;nn&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Conv2d&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;4&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;kernel_size&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mi"&gt;5&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt;
            &lt;span class="c1"&gt;# 4 x 24 x 24&lt;/span&gt;
            &lt;span class="n"&gt;nn&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;ReLU&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="bp"&gt;True&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt;
            &lt;span class="n"&gt;nn&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Conv2d&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;4&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;8&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;kernel_size&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mi"&gt;5&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt;
            &lt;span class="n"&gt;nn&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;ReLU&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="bp"&gt;True&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt;
            &lt;span class="c1"&gt;# 8 x 20 x 20 = 3200&lt;/span&gt;
            &lt;span class="n"&gt;nn&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Flatten&lt;/span&gt;&lt;span class="p"&gt;(),&lt;/span&gt;
            &lt;span class="n"&gt;nn&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Linear&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;3200&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;10&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt;
            &lt;span class="c1"&gt;# 10&lt;/span&gt;
            &lt;span class="n"&gt;nn&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Softmax&lt;/span&gt;&lt;span class="p"&gt;(),&lt;/span&gt;
            &lt;span class="p"&gt;)&lt;/span&gt;
        &lt;span class="bp"&gt;self&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;decoder&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;nn&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Sequential&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
            &lt;span class="c1"&gt;# 10&lt;/span&gt;
            &lt;span class="n"&gt;nn&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Linear&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;10&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;400&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt;
            &lt;span class="c1"&gt;# 400&lt;/span&gt;
            &lt;span class="n"&gt;nn&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;ReLU&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="bp"&gt;True&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt;
            &lt;span class="n"&gt;nn&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Linear&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;400&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;4000&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt;
            &lt;span class="c1"&gt;# 4000&lt;/span&gt;
            &lt;span class="n"&gt;nn&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;ReLU&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="bp"&gt;True&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt;
            &lt;span class="n"&gt;nn&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Unflatten&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;10&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;20&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;20&lt;/span&gt;&lt;span class="p"&gt;)),&lt;/span&gt;
            &lt;span class="c1"&gt;# 10 x 20 x 20&lt;/span&gt;
            &lt;span class="n"&gt;nn&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;ConvTranspose2d&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;10&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;10&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;kernel_size&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mi"&gt;5&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt;
            &lt;span class="c1"&gt;# 24 x 24&lt;/span&gt;
            &lt;span class="n"&gt;nn&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;ConvTranspose2d&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;10&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;kernel_size&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mi"&gt;5&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt;
            &lt;span class="c1"&gt;# 28 x 28&lt;/span&gt;
            &lt;span class="n"&gt;nn&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Sigmoid&lt;/span&gt;&lt;span class="p"&gt;(),&lt;/span&gt;
            &lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;forward&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="bp"&gt;self&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;x&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
        &lt;span class="n"&gt;enc&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="bp"&gt;self&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;encoder&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;x&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
        &lt;span class="n"&gt;dec&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="bp"&gt;self&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;decoder&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;enc&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
        &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;dec&lt;/span&gt;
&lt;/pre&gt;&lt;/div&gt;


&lt;p&gt;Note that the &lt;code&gt;forward()&lt;/code&gt; pass is &lt;code&gt;decoder(encoder(x))&lt;/code&gt;; the only reason I separated the two parts of the network is so that I can peek at the middle, encoded part, to see how it encodes the digits.&lt;/p&gt;
&lt;p&gt;The training loop:&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;span class="n"&gt;distance&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;nn&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;MSELoss&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
&lt;span class="n"&gt;optimizer&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;torch&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;optim&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;SGD&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;model&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;parameters&lt;/span&gt;&lt;span class="p"&gt;(),&lt;/span&gt; &lt;span class="n"&gt;lr&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mf"&gt;0.01&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;momentum&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mf"&gt;0.5&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;epoch&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="nb"&gt;range&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;num_epochs&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
    &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;data&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;dataloader&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
        &lt;span class="n"&gt;img&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;_&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;data&lt;/span&gt;
        &lt;span class="n"&gt;img&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;Variable&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;img&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;cpu&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
        &lt;span class="n"&gt;output&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;model&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;img&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
        &lt;span class="n"&gt;loss&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;distance&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;output&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;img&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
        &lt;span class="n"&gt;optimizer&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;zero_grad&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
        &lt;span class="n"&gt;loss&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;backward&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
        &lt;span class="n"&gt;optimizer&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;step&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
    &lt;span class="k"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;epoch [{}/{}], loss: {:.4f}&amp;#39;&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;format&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;epoch&lt;/span&gt;&lt;span class="o"&gt;+&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;num_epochs&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;loss&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;item&lt;/span&gt;&lt;span class="p"&gt;()))&lt;/span&gt;
&lt;/pre&gt;&lt;/div&gt;


&lt;h2&gt;Evaluation&lt;/h2&gt;
&lt;p&gt;Let's compare the activation of the 10 encoded neurons to the true labels of unseen MNIST test data:&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;span class="n"&gt;device&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;torch&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;device&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;cuda&amp;#39;&lt;/span&gt; &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;torch&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;cuda&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;is_available&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt; &lt;span class="k"&gt;else&lt;/span&gt; &lt;span class="s1"&gt;&amp;#39;cpu&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="n"&gt;model&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;Autoencoder&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;to&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;device&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="n"&gt;testset&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;tv&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;datasets&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;MNIST&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;root&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;./data&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;  &lt;span class="n"&gt;train&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="bp"&gt;False&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;download&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="bp"&gt;True&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;transform&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;transform&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="n"&gt;dataloader&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;torch&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;utils&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;data&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;DataLoader&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;testset&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;batch_size&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mi"&gt;10&lt;/span&gt;&lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="mi"&gt;1000&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;shuffle&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="bp"&gt;True&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;num_workers&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mi"&gt;4&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="n"&gt;confusion_matrix&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;np&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;zeros&lt;/span&gt;&lt;span class="p"&gt;((&lt;/span&gt;&lt;span class="mi"&gt;10&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;10&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;
&lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;data&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;dataloader&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
    &lt;span class="n"&gt;img&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;labels&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;data&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;to&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;device&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt; &lt;span class="n"&gt;data&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;to&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;device&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; 
    &lt;span class="n"&gt;imgs&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;Variable&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;imgs&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;cpu&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
    &lt;span class="n"&gt;encs&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;model&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;encoder&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;imgs&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;detach&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;numpy&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
    &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;i&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="nb"&gt;range&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nb"&gt;len&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;encs&lt;/span&gt;&lt;span class="p"&gt;)):&lt;/span&gt;
        &lt;span class="n"&gt;predicted&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;np&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;argmax&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;encs&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;i&lt;/span&gt;&lt;span class="p"&gt;])&lt;/span&gt;
        &lt;span class="n"&gt;actual&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;labels&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;i&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;
        &lt;span class="n"&gt;confusion_matrix&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;actual&lt;/span&gt;&lt;span class="p"&gt;][&lt;/span&gt;&lt;span class="n"&gt;predicted&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="o"&gt;+=&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt;
&lt;/pre&gt;&lt;/div&gt;


&lt;p&gt;Note that the confusion matrix here is not a classical confusion matrix, in the sense that a good result does not have to concentrate on the diagonal, since the ordering of the encoded neurons is arbitrary. In the best case, the resulting matrix can be row-column-reordered and normalized to yield an identity matrix (assuming balanced data).&lt;/p&gt;
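&lt;p&gt;To make the reordering idea concrete, here is a minimal sketch (a hypothetical helper, assuming the &lt;code&gt;confusion_matrix&lt;/code&gt; computed above and that SciPy is available) that finds the neuron-to-digit relabeling putting the most mass on the diagonal, using the Hungarian algorithm:&lt;/p&gt;

```python
import numpy as np
from scipy.optimize import linear_sum_assignment

def best_relabeling(confusion_matrix):
    # linear_sum_assignment minimizes total cost, so negate the counts
    # to find the one-to-one assignment that maximizes the diagonal mass
    rows, cols = linear_sum_assignment(-confusion_matrix)
    # map each encoded neuron index (column) to a digit (row)
    mapping = {int(c): int(r) for r, c in zip(rows, cols)}
    best_case_accuracy = confusion_matrix[rows, cols].sum() / confusion_matrix.sum()
    return mapping, best_case_accuracy
```

&lt;p&gt;This yields an upper bound on the accuracy achievable with any one-to-one neuron-to-digit relabeling.&lt;/p&gt;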
&lt;p&gt;Plotting the confusion matrix, where rows are the actual digits (0-9) and columns are the indices of the most active neuron (0-9):&lt;/p&gt;
&lt;p&gt;&lt;img src="/images/autoenc_confusion_matrix.png" alt="MNIST autoencoder." style="width: 600px;"/&gt;&lt;/p&gt;
&lt;p&gt;For example, for row 6 (actually the seventh row, since numbering starts at 0), we can see that column 8 is the most active. Conversely, most of the time that column 8 is the most active, the actual digit is a 6. So the model learned to tell 6s apart from the rest, with some accuracy.&lt;/p&gt;
&lt;p&gt;However, this unsupervised classifier is very far from perfect. For example, column 3 activates for actuals 3, 5, and 8. That's no surprise, since these digits look alike. To gauge how successful the encoding is, we can pick, for each encoding, the most likely actual value:&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;span class="n"&gt;classifier&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="c1"&gt;# enc -&amp;gt; predict&lt;/span&gt;
    &lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="mi"&gt;2&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="mi"&gt;2&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="mi"&gt;2&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="mi"&gt;3&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="mi"&gt;3&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="mi"&gt;4&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="mi"&gt;2&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="mi"&gt;5&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="mi"&gt;5&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="mi"&gt;6&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="mi"&gt;9&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="mi"&gt;7&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="mi"&gt;2&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="mi"&gt;8&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="mi"&gt;6&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="mi"&gt;9&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/pre&gt;&lt;/div&gt;
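&lt;p&gt;I read this mapping off the plot by hand, but it can also be derived from the confusion matrix directly (a sketch with a hypothetical helper name, assuming the &lt;code&gt;confusion_matrix&lt;/code&gt; computed above, with actuals as rows and encodings as columns):&lt;/p&gt;

```python
import numpy as np

def derive_classifier(confusion_matrix):
    # for each encoded neuron (column), predict the actual digit (row)
    # that most often co-occurs with it; the mapping need not be one-to-one
    num_encodings = confusion_matrix.shape[1]
    return {enc: int(np.argmax(confusion_matrix[:, enc])) for enc in range(num_encodings)}
```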


&lt;p&gt;If we run a prediction model on the encoder with this mapping, we get an accuracy of 40%. It's much worse than the &lt;a href="https://bytepawn.com/solving-mnist-with-pytorch-and-skl.html#solving-mnist-with-pytorch-and-skl"&gt;99% that is achievable with supervised learning&lt;/a&gt;, but it's much better than the 10% we'd expect from random chance.&lt;/p&gt;
&lt;h2&gt;Conclusion&lt;/h2&gt;
&lt;p&gt;This was an experiment to see whether an autoencoder can learn, without supervision, to recognize the 10 digits in MNIST. Assuming the images within each digit class are not too different, encoding each image by its digit class would probably be close to optimal. The result is worse than I expected, even after I &lt;a href="https://stackoverflow.com/questions/66667949/pytorch-mnist-autoencoder-to-learn-10-digit-classification#66667949"&gt;asked for help on Stack Overflow&lt;/a&gt; and merged those solutions with my own. I suspect that with more tweaking of the network and the training loop, at least 2/3 accuracy could be achieved. If I can improve this model significantly, I will revisit this topic in a future post.&lt;/p&gt;</content><category term="pytorch"></category><category term="autoencoder"></category><category term="mnist"></category></entry><entry><title>The best parts of Invent and Wander: the Collected Writings of Jeff Bezos</title><link href="https://bytepawn.com/the-best-parts-of-jeff-bezos-invent-and-wander.html" rel="alternate"></link><published>2021-03-06T00:00:00+01:00</published><updated>2021-03-06T00:00:00+01:00</updated><author><name>Marton Trencseni</name></author><id>tag:bytepawn.com,2021-03-06:/the-best-parts-of-jeff-bezos-invent-and-wander.html</id><summary type="html">&lt;p&gt;These are the best parts from the book "Invent and Wander: the Collected Writings of Jeff Bezos". The book is a collection of the annual Amazon shareholder letters that Jeff Bezos has been sending out since 1997, and speeches he has given over time.&lt;br/&gt;&lt;br/&gt;&lt;img src="/images/invent_wander.jpg" alt="Jeff Bezos Invent and Wander" style="width: 400px;"/&gt;&lt;/p&gt;</summary><content type="html">&lt;h2&gt;Introduction&lt;/h2&gt;
&lt;p&gt;These are the best parts from the book &lt;a href="https://www.amazon.com/Invent-Wander-Collected-Writings-Introduction/dp/1647820715"&gt;Invent and Wander: the Collected Writings of Jeff Bezos&lt;/a&gt;. The book is a collection of the annual Amazon shareholder letters that Jeff Bezos has been sending out since 1997, and speeches he has given over time. It's a great source of business and management wisdom, some also applicable to daily life. The points on &lt;strong&gt;experimentation&lt;/strong&gt; and &lt;strong&gt;decision making&lt;/strong&gt; are especially relevant for data scientists. Enjoy!&lt;/p&gt;
&lt;p&gt;&lt;img src="/images/invent_wander.jpg" alt="Jeff Bezos Invent and Wander" style="width: 400px;"/&gt;&lt;/p&gt;
&lt;h2&gt;Big winners pay for many failed experiments&lt;/h2&gt;
&lt;p&gt;&lt;strong&gt;To invent you have to experiment, and if you know in advance that it’s going to work, it’s not an experiment.&lt;/strong&gt; Most large organizations embrace the idea of invention, but are not willing to suffer the string of failed experiments necessary to get there. Outsized returns often come from betting against conventional wisdom, and conventional wisdom is usually right. Given a ten percent chance of a 100 times payoff, you should take that bet every time. But you’re still going to be wrong nine times out of ten.&lt;/p&gt;
&lt;p&gt;We all know that if you swing for the fences, you’re going to strike out a lot, but you’re also going to hit some home runs. The difference between baseball and business, however, is that baseball has a truncated outcome distribution. When you swing, no matter how well you connect with the ball, the most runs you can get is four. In business, every once in a while, when you step up to the plate, you can score 1,000 runs. This long-tailed distribution of returns is why it’s important to be bold. Big winners pay for so many experiments.&lt;/p&gt;
&lt;h2&gt;Making decisions&lt;/h2&gt;
&lt;p&gt;Many of the important decisions we make at Amazon can be made with data. There is a right answer or a wrong answer, a better answer or a worse answer, and math tells us which is which. These are our favorite kinds of decisions. Opening a new fulfillment center is an example. We use history from our existing fulfillment network to estimate seasonal peaks and to model alternatives for new capacity. We look at anticipated product mix, including product dimensions and weight, to decide how much space we need and whether we need a facility for smaller “sortable” items or for larger items that usually ship alone. To shorten delivery times and reduce outbound transportation costs, we analyze prospective locations based on proximity to customers, transportation hubs, and existing facilities. Quantitative analysis improves the customer’s experience and our cost structure. Similarly, most of our inventory purchase decisions can be numerically modeled and analyzed.&lt;/p&gt;
&lt;p&gt;The above decisions require us to make some assumptions and judgments, but in such decisions, judgment and opinion come into play only as junior partners. The heavy lifting is done by the math. As you would expect, however, not all of our important decisions can be made in this enviable, math-based way. Sometimes we have little or no historical data to guide us and proactive experimentation is impossible or impractical.&lt;/p&gt;
&lt;p&gt;As our shareholders know, we have made a decision to continuously and significantly lower prices for customers year after year as our efficiency and scale make it possible. This is an example of a very important decision that cannot be made in a math-based way. In fact, when we lower prices, we go against the math that we can do, which always says that the smart move is to raise prices. We have significant data related to price elasticity. With fair accuracy, we can predict that a price reduction of a certain percentage will result in an increase in units sold of a certain percentage. With rare exceptions, the volume increase in the short term is never enough to pay for the price decrease. However, our quantitative understanding of elasticity is short-term. We can estimate what a price reduction will do this week and this quarter. But we cannot numerically estimate the effect that consistently lowering prices will have on our business over five years or ten years or more. Our judgment is that relentlessly returning efficiency improvements and scale economies to customers in the form of lower prices creates a virtuous cycle that leads over the long term to a much larger dollar amount of free cash flow, and thereby to a much more valuable Amazon.&lt;/p&gt;
&lt;h2&gt;Type 1 and Type 2 decisions&lt;/h2&gt;
&lt;p&gt;Some decisions are consequential and irreversible or nearly irreversible – one-way doors – and these decisions must be made methodically, carefully, slowly, with great deliberation and consultation. If you walk through and don’t like what you see on the other side, you can’t get back to where you were before. We can call these Type 1 decisions. &lt;strong&gt;But most decisions aren’t like that – they are changeable, reversible – they’re two-way doors. If you’ve made a suboptimal Type 2 decision, you don’t have to live with the consequences for that long. You can reopen the door and go back through.&lt;/strong&gt; Type 2 decisions can and should be made quickly by high judgment individuals or small groups.&lt;/p&gt;
&lt;p&gt;As organizations get larger, there seems to be a tendency to use the heavy-weight Type 1 decision-making process on most decisions, including many Type 2 decisions. The end result of this is slowness, unthoughtful risk aversion, failure to experiment sufficiently, and consequently diminished invention. We’ll have to figure out how to fight that tendency.&lt;/p&gt;
&lt;h2&gt;High-Velocity decision making&lt;/h2&gt;
&lt;p&gt;Day 2 companies make high-quality decisions, but they make high-quality decisions slowly. &lt;strong&gt;To keep the energy and dynamism of Day 1, you have to somehow make high-quality, high-velocity decisions.&lt;/strong&gt; Easy for start-ups and very challenging for large organizations. The senior team at Amazon is determined to keep our decision-making velocity high. Speed matters in business—plus a high-velocity decision-making environment is more fun too. We don’t know all the answers, but here are some thoughts.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;First, never use a one-size-fits-all decision-making process.&lt;/strong&gt; Many decisions are reversible, two-way doors. Those decisions can use a light-weight process. For those, so what if you’re wrong? I wrote about this in more detail in last year’s letter.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Second, most decisions should probably be made with somewhere around 70 percent of the information you wish you had.&lt;/strong&gt; If you wait for 90 percent, in most cases, you’re probably being slow. Plus, either way, you need to be good at quickly recognizing and correcting bad decisions. If you’re good at course correcting, being wrong may be less costly than you think, whereas being slow is going to be expensive for sure.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Third, use the phrase “disagree and commit.”&lt;/strong&gt; This phrase will save a lot of time. If you have conviction on a particular direction even though there’s no consensus, it’s helpful to say, “Look, I know we disagree on this, but will you gamble with me on it? Disagree and commit?” By the time you’re at this point, no one can know the answer for sure, and you’ll probably get a quick yes.&lt;/p&gt;
&lt;h2&gt;Metrics and measurements&lt;/h2&gt;
&lt;p&gt;&lt;strong&gt;We first measure ourselves in terms of the metrics most indicative of our market leadership: customer and revenue growth, the degree to which our customers continue to purchase from us on a repeat basis, and the strength of our brand.&lt;/strong&gt; We have invested and will continue to invest aggressively to expand and leverage our customer base, brand, and infrastructure as we move to establish an enduring franchise.&lt;/p&gt;
&lt;h2&gt;Focus on the long term&lt;/h2&gt;
&lt;p&gt;Amazon will make bold rather than timid investment decisions where we see a sufficient probability of gaining market leadership advantages. Some of these investments will pay off, others will not, and we will have learned another valuable lesson in either case. When forced to choose between optimizing the appearance of our GAAP accounting and maximizing the present value of future cash flows, we’ll take the cash flows.&lt;/p&gt;
&lt;p&gt;We will work hard to spend wisely and maintain our lean culture. We understand the importance of continually reinforcing a cost-conscious culture, particularly in a business incurring net losses. We will balance our focus on growth with emphasis on long-term profitability and capital management. At this stage, we choose to prioritize growth because we believe that scale is central to achieving the potential of our business model. We will continue to focus on hiring and retaining versatile and talented employees and continue to weight their compensation to stock options rather than cash. We know our success will be largely affected by our ability to attract and retain a motivated employee base, each of whom must think like, and therefore must actually be, an owner.&lt;/p&gt;
&lt;h2&gt;How to hire&lt;/h2&gt;
&lt;p&gt;&lt;strong&gt;Will you admire this person?&lt;/strong&gt; If you think about the people you’ve admired in your life, they are probably people you’ve been able to learn from or take an example from. For myself, I’ve always tried hard to work only with people I admire, and I encourage folks here to be just as demanding. Life is definitely too short to do otherwise.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Will this person raise the average level of effectiveness of the group they’re entering?&lt;/strong&gt; We want to fight entropy. The bar has to continuously go up. I ask people to visualize the company five years from now. At that point, each of us should look around and say, “The standards are so high now—boy, I’m glad I got in when I did!”&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Along what dimension might this person be a superstar?&lt;/strong&gt; Many people have unique skills, interests, and perspectives that enrich the work environment for all of us. It’s often something that’s not even related to their jobs.&lt;/p&gt;
&lt;h2&gt;Pricing&lt;/h2&gt;
&lt;p&gt;&lt;strong&gt;Our pricing strategy does not attempt to maximize margin percentages,&lt;/strong&gt; but instead seeks to drive maximum value for customers and thereby create a much larger bottom line—in the long term. For example, we’re targeting gross margins on our jewelry sales to be substantially lower than industry norms because we believe over time—customers figure these things out—this approach will produce more value for shareholders.&lt;/p&gt;
&lt;h2&gt;Missionaries and mercenaries&lt;/h2&gt;
&lt;p&gt;I’m always trying to figure one thing first and foremost: Is that person a missionary or mercenary? &lt;strong&gt;The mercenaries are trying to flip their stock. The missionaries love their product or their service and love their customers and are trying to build a great service.&lt;/strong&gt;&lt;/p&gt;
&lt;p&gt;By the way, the great paradox here is that it’s usually the missionaries who make more money.&lt;/p&gt;
&lt;h2&gt;Working backwards&lt;/h2&gt;
&lt;p&gt;&lt;strong&gt;“Working backward” from customer needs can be contrasted with a “skills-forward” approach where existing skills and competencies are used to drive business opportunities.&lt;/strong&gt; &lt;/p&gt;
&lt;p&gt;The skills-forward approach says, “We are really good at X. What else can we do with X?” That’s a useful and rewarding business approach. However, if used exclusively, the company employing it will never be driven to develop fresh skills. Eventually the existing skills will become outmoded.&lt;/p&gt;
&lt;p&gt;Working backward from customer needs often demands that we acquire new competencies and exercise new muscles, never mind how uncomfortable and awkward-feeling those first steps might be. Kindle is a good example of our fundamental approach. More than four years ago, we began with a long-term vision: every book, ever printed, in any language, all available in less than sixty seconds. The customer experience we envisioned didn’t allow for any hard lines of demarcation between Kindle the device and Kindle the service—the two had to blend together seamlessly. Amazon had never designed or built a hardware device, but rather than change the vision to accommodate our then-existing skills, we hired a number of talented (and missionary!) hardware engineers and got started learning a new institutional skill, one that we needed to better serve readers in the future.&lt;/p&gt;
&lt;h2&gt;Customer focused vs competitor focused&lt;/h2&gt;
&lt;p&gt;Our energy at Amazon comes from the desire to impress customers rather than the zeal to best competitors. We don’t take a view on which of these approaches is more likely to maximize business success. There are pros and cons to both and many examples of highly successful competitor-focused companies. We do work to pay attention to competitors and be inspired by them, but it is a fact that the customer-centric way is at this point a defining element of our culture. One advantage—perhaps a somewhat subtle one—of a customer-driven focus is that it aids a certain type of proactivity. When we’re at our best, we don’t wait for external pressures. We are internally driven to improve our services, adding benefits and features, before we have to. We lower prices and increase value for customers before we have to. We invent before we have to. These investments are motivated by customer focus rather than by reaction to competition. We think this approach earns more trust with customers and drives rapid improvements in customer experience—importantly—even in those areas where we are already the leader.&lt;/p&gt;
&lt;h2&gt;Competition and zero-sum games&lt;/h2&gt;
&lt;p&gt;Zero-sum games are unbelievably rare. Sporting events are zero-sum games. Two teams enter an arena. One’s going to win; one’s going to lose. Elections are zero-sum games. One candidate is going to win; one candidate is going to lose. In business, however, several competitors can do well. That’s very normal.&lt;/p&gt;
&lt;p&gt;The most important thing for doing well against competition—in business and also, I think, with military adversaries—is to &lt;strong&gt;be both robust and nimble&lt;/strong&gt;. And it is scale. So it’s great to be in the US military because you’re big. Scale is a gigantic advantage because it gives you robustness. You can take a punch. But it’s also good if you can dodge a punch. And that’s the nimbleness. And as you get bigger, you grow more robust. The most important factor for nimbleness is decision-making speed.&lt;/p&gt;
&lt;p&gt;The second-most important factor is being willing to be experimental. &lt;strong&gt;You have to be willing to take risks. You have to be willing to fail,&lt;/strong&gt; and people don’t like failure. &lt;strong&gt;I always point out that there are two different kinds of failure.&lt;/strong&gt; There’s &lt;strong&gt;experimental failure&lt;/strong&gt;—that’s the kind of failure you should be happy with. And there’s operational failure. We’ve built hundreds of fulfillment centers at Amazon over the years, and we know how to do that. If we build a new fulfillment center and it’s a disaster, that’s just &lt;strong&gt;bad execution&lt;/strong&gt;. That’s not good failure. But when we are developing a new product or service or experimenting in some way, and it doesn’t work, that’s okay. That’s great failure. And you need to distinguish between those two types of failure and really be seeking invention and innovation.&lt;/p&gt;
&lt;h2&gt;Embrace external trends&lt;/h2&gt;
&lt;p&gt;The outside world can push you into Day 2 if you won’t or can’t embrace powerful trends quickly. &lt;strong&gt;If you fight them, you’re probably fighting the future&lt;/strong&gt;. Embrace them and you have a tailwind. These big trends are not that hard to spot (they get talked and written about a lot), but they can be strangely hard for large organizations to embrace. We’re in the middle of an obvious one right now: machine learning and artificial intelligence.&lt;/p&gt;
&lt;h2&gt;Fending off Day 2&lt;/h2&gt;
&lt;p&gt;“Jeff, what does Day 2 look like?” That’s a question I just got at our most recent all-hands meeting. I’ve been reminding people that it’s Day 1 for a couple of decades. I work in an Amazon building named Day 1, and when I moved buildings, I took the name with me. I spend time thinking about this topic. “Day 2 is stasis. Followed by irrelevance. Followed by excruciating, painful decline. Followed by death. And that is why it is always Day 1.” To be sure, this kind of decline would happen in extreme slow motion. An established company might harvest Day 2 for decades, but the final result would still come. I’m interested in the question “How do you fend off Day 2?” What are the techniques and tactics? How do you keep the vitality of Day 1, even inside a large organization? Such a question can’t have a simple answer. There will be many elements, multiple paths, and many traps. I don’t know the whole answer, but I may know bits of it. Here’s a starter pack of essentials for Day 1 defense: customer obsession, a skeptical view of proxies, the eager adoption of external trends, and high-velocity decision making.&lt;/p&gt;
&lt;p&gt;There are many ways to center a business. You can be competitor focused, you can be product focused, you can be technology focused, you can be business model focused, and there are more. But in my view, obsessive customer focus is by far the most protective of Day 1 vitality.&lt;/p&gt;
&lt;p&gt;Why? &lt;strong&gt;There are many advantages to a customer-centric approach, but here’s the big one: customers are always beautifully, wonderfully dissatisfied, even when they report being happy and business is great.&lt;/strong&gt; Even when they don’t yet know it, customers want something better, and your desire to delight customers will drive you to invent on their behalf. No customer ever asked Amazon to create the Prime membership program, but it sure turns out they wanted it, and I could give you many such examples. Staying in Day 1 requires you to experiment patiently, accept failures, plant seeds, protect saplings, and double down when you see customer delight. A customer-obsessed culture best creates the conditions where all of that can happen.&lt;/p&gt;
&lt;h2&gt;Resist Proxies&lt;/h2&gt;
&lt;p&gt;&lt;strong&gt;As companies get larger and more complex, there’s a tendency to manage to proxies.&lt;/strong&gt; This comes in many shapes and sizes, and it’s dangerous, subtle, and very Day 2. &lt;strong&gt;A common example is process as proxy.&lt;/strong&gt; Good process serves you so you can serve customers. But if you’re not watchful, the process can become the thing. This can happen very easily in large organizations. The process becomes the proxy for the result you want. You stop looking at outcomes and just make sure you’re doing the process right. Gulp. It’s not that rare to hear a junior leader defend a bad outcome with something like, “Well, we followed the process.” A more experienced leader will use it as an opportunity to investigate and improve the process. The process is not the thing. It’s always worth asking, do we own the process or does the process own us? In a Day 2 company, you might find it’s the second. Another example: market research and customer surveys can become proxies for customers—something that’s especially dangerous when you’re inventing and designing products. “Fifty-five percent of beta testers report being satisfied with this feature. That is up from 47 percent in the first survey.” That’s hard to interpret and could unintentionally mislead. Good inventors and designers deeply understand their customer. They spend tremendous energy developing that intuition. They study and understand many anecdotes rather than only the averages you’ll find on surveys. They live with the design.&lt;/p&gt;
&lt;h2&gt;High standards&lt;/h2&gt;
&lt;p&gt;&lt;strong&gt;First, there’s a foundational question: are high standards intrinsic or teachable?&lt;/strong&gt; If you take me on your basketball team, you can teach me many things, but you can’t teach me to be taller. Do we first and foremost need to select for “high standards” people? If so, this letter would need to be mostly about hiring practices, but I don’t think so. I believe high standards are teachable. In fact, people are pretty good at learning high standards simply through exposure. High standards are contagious. Bring a new person onto a high standards team, and they’ll quickly adapt. The opposite is also true. If low standards prevail, those too will quickly spread. And though exposure works well to teach high standards, I believe you can accelerate that rate of learning by articulating a few core principles of high standards, which I hope to share in this letter.&lt;/p&gt;
&lt;p&gt;Another important question is whether high standards are universal or domain specific. In other words, if you have high standards in one area, do you automatically have high standards elsewhere? I believe high standards are domain specific, and that you have to learn high standards separately in every arena of interest. When I started Amazon, I had high standards on inventing, on customer care, and (thankfully) on hiring. But I didn’t have high standards on operational process: how to keep fixed problems fixed, how to eliminate defects at the root, how to inspect processes, and much more. I had to learn and develop high standards on all of that (my colleagues were my tutors). Understanding this point is important because it keeps you humble. &lt;strong&gt;You can consider yourself a person of high standards in general and still have debilitating blind spots. There can be whole arenas of endeavor where you may not even know that your standards are low or nonexistent, and certainly not world class. It’s critical to be open to that likelihood.&lt;/strong&gt;&lt;/p&gt;
&lt;h2&gt;Patience&lt;/h2&gt;
&lt;p&gt;&lt;strong&gt;Perfect handstands.&lt;/strong&gt; A close friend recently decided to learn to do a perfect free-standing handstand. No leaning against a wall. Not for just a few seconds. Instagram good. She decided to start her journey by taking a handstand workshop at her yoga studio. She then practiced for a while but wasn’t getting the results she wanted. So, she hired a handstand coach. Yes, I know what you’re thinking, but evidently this is an actual thing that exists. In the very first lesson, the coach gave her some wonderful advice. “Most people,” he said, “think that if they work hard, they should be able to master a handstand in about two weeks. The reality is that it takes about six months of daily practice. If you think you should be able to do it in two weeks, you’re just going to end up quitting.” Unrealistic beliefs on scope—often hidden and undiscussed—kill high standards. To achieve high standards yourself or as part of a team, you need to form and proactively communicate realistic beliefs about how hard something is going to be—something this coach understood well.&lt;/p&gt;
&lt;h2&gt;Time management for executives&lt;/h2&gt;
&lt;p&gt;&lt;strong&gt;I like to putter in the morning. I get up early. I go to bed early.&lt;/strong&gt; I like to read the newspaper. I like to have coffee. I like to have breakfast with my kids before they go to school. So my puttering time is very important to me. That’s why I set my first meeting for ten o’clock. I like to do my high-IQ meetings before lunch. &lt;strong&gt;Anything that’s going to be really mentally challenging is a ten o’clock meeting because by 5 P.M., I’m, like, I can’t think more about this issue today. Let’s try this again tomorrow at 10 A.M.&lt;/strong&gt;&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Then on to eight hours of sleep.&lt;/strong&gt; I prioritize sleep unless I’m traveling in different time zones. Sometimes getting eight hours is impossible, but I am very focused on it, and I need eight hours. I think better. I have more energy. My mood is better. And think about it: As a senior executive, what do you really get paid to do? You get paid to make a small number of high-quality decisions. Your job is not to make thousands of decisions every day. So let’s say I slept six hours a day, or let’s go really crazy and say I slept four hours a day. I’d get four so-called productive hours back. So if before I had, say, twelve hours of productive time during any waking day, now all of a sudden I have twelve plus four—I have sixteen productive hours. So I have 33 percent more time to make decisions. If I was going to make, say, one hundred decisions, I can now make thirty-three more. Is that really worth it if the quality of those decisions might be lower because you’re tired or grouchy or any number of things? &lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Now, it’s different if the company is a start-up.&lt;/strong&gt; When Amazon was a hundred people, it was a different story, but Amazon’s not a start-up company, and all of our senior executives operate the same way I do. They work in the future. They live in the future. None of the people who report to me should really be focused on the current quarter. When I have a good quarterly conference call with Wall Street, people will stop me and say, “Congratulations on your quarter,” and I say, “Thank you,” but what I’m really thinking is that quarter was baked three years ago. Right now I’m working on a quarter that’s going to reveal itself in 2023 sometime, and that’s what you need to be doing. You need to be thinking two or three years in advance, and if you are, then why do I need to make a hundred decisions today? If I make, like, three good decisions a day, that’s enough, and they should just be as high quality as I can make them. &lt;strong&gt;Warren Buffet says he’s good if he makes three good decisions a year, and I really believe that.&lt;/strong&gt;&lt;/p&gt;
&lt;h2&gt;Trust&lt;/h2&gt;
&lt;p&gt;The way you earn trust, the way you develop a reputation is by doing hard things well over and over and over. &lt;strong&gt;The reason, for example, that the US military, in all polls, has such high credibility and reputation is because, over and over again, decade after decade, it has done hard things well.&lt;/strong&gt;&lt;/p&gt;
&lt;h2&gt;Questions for tomorrow&lt;/h2&gt;
&lt;p&gt;Tomorrow, in a very real sense, your life—the life you author from scratch on your own—begins. How will you use your gifts? What choices will you make? Will inertia be your guide, or will you follow your passions? Will you follow dogma, or will you be original? Will you choose a life of ease, or a life of service and adventure? Will you wilt under criticism, or will you follow your convictions? Will you bluff it out when you’re wrong, or will you apologize? Will you guard your heart against rejection, or will you act when you fall in love? Will you play it safe, or will you be a little bit swashbuckling? When it’s tough, will you give up, or will you be relentless? Will you be a cynic, or will you be a builder? Will you be clever at the expense of others, or will you be kind? I will hazard a prediction. When you are eighty years old and, in a quiet moment of reflection, narrating for only yourself the most personal version of your life story, the telling that will be most compact and meaningful will be the series of choices you have made. In the end, we are our choices. Build yourself a great story. 
Thank you, and good luck!&lt;/p&gt;</content><category term="business"></category><category term="experimentation"></category><category term="book"></category><category term="amazon"></category><category term="management"></category></entry><entry><title>Training a Pytorch Wasserstein MNIST GAN on Google Colab</title><link href="https://bytepawn.com/training-a-pytorch-wasserstain-mnist-gan-on-google-colab.html" rel="alternate"></link><published>2021-03-03T00:00:00+01:00</published><updated>2021-03-03T00:00:00+01:00</updated><author><name>Marton Trencseni</name></author><id>tag:bytepawn.com,2021-03-03:/training-a-pytorch-wasserstain-mnist-gan-on-google-colab.html</id><summary type="html">&lt;p&gt;I train a Pytorch Wasserstein MNIST GAN on Google Colab to beautiful MNIST digits.&lt;br/&gt;&lt;br/&gt; &lt;img src="/images/gan_wasserstein_200.png" alt="Wasserstein GAN Generated MNIST digits" style="width: 400px;"/&gt;&lt;/p&gt;</summary><content type="html">&lt;h2&gt;Introduction&lt;/h2&gt;
&lt;p&gt;In the &lt;a href="https://bytepawn.com/training-a-pytorch-classic-mnist-gan-on-google-colab.html"&gt;previous post&lt;/a&gt;, I trained a Classic GAN to produce MNIST digits using Pytorch. The results were okay, but a human discriminator can still tell these generated images apart from hand-written digits. In this post, I'll train a modification of the classic GAN called Wasserstein GAN. The &lt;a href="https://colab.research.google.com/drive/12lFAHpVIVOHK8dVbZOlVxnMhFlfKnMND#scrollTo=2jpVpu8OYlky&amp;amp;uniqifier=1"&gt;Google Colab notebook is here&lt;/a&gt;.&lt;/p&gt;
&lt;h2&gt;Architecture&lt;/h2&gt;
&lt;p&gt;The Wasserstein GAN (WGAN) was introduced in a &lt;a href="https://arxiv.org/pdf/1701.07875.pdf"&gt;2017 paper&lt;/a&gt;. This &lt;a href="https://developers.google.com/machine-learning/gan/loss"&gt;Google Machine Learning page explains WGANs&lt;/a&gt; and their relationship to classic GANs beautifully:&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;This loss function depends on a modification of the GAN scheme called "Wasserstein GAN" or "WGAN" in which the discriminator does not actually classify instances. For each instance it outputs a number. This number does not have to be between 0 and 1, so we can't use 0.5 as a threshold to decide whether an instance is real or fake. Discriminator training just tries to make the output bigger for real instances than for fake instances.&lt;/p&gt;
&lt;p&gt;Because it can't really discriminate between real and fake, the WGAN discriminator is actually called a "critic" instead of a "discriminator". This distinction has theoretical importance, but for practical purposes we can treat it as an acknowledgement that the inputs to the loss functions don't have to be probabilities.&lt;/p&gt;
&lt;p&gt;The loss functions themselves are deceptively simple:&lt;/p&gt;
&lt;p&gt;Critic Loss: &lt;code&gt;D(x) - D(G(z))&lt;/code&gt;&lt;/p&gt;
&lt;p&gt;The discriminator tries to maximize this function. In other words, it tries to maximize the difference between its output on real instances and its output on fake instances.&lt;/p&gt;
&lt;p&gt;Generator Loss: &lt;code&gt;D(G(z))&lt;/code&gt;&lt;/p&gt;
&lt;p&gt;The generator tries to maximize this function. In other words, it tries to maximize the discriminator's output for its fake instances. In these functions:&lt;/p&gt;
&lt;p&gt;&lt;code&gt;D(x)&lt;/code&gt; is the critic's output for a real instance.&lt;/p&gt;
&lt;p&gt;&lt;code&gt;G(z)&lt;/code&gt; is the generator's output when given noise z.&lt;/p&gt;
&lt;p&gt;&lt;code&gt;D(G(z))&lt;/code&gt; is the critic's output for a fake instance.&lt;/p&gt;
&lt;p&gt;The output of critic &lt;code&gt;D&lt;/code&gt; does not have to be between 1 and 0.&lt;/p&gt;
&lt;/blockquote&gt;
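&lt;p&gt;The two losses above fit in a few lines of Pytorch. The snippet below is an illustrative sketch with made-up critic scores, not code from the notebook:&lt;/p&gt;

```python
import torch

# Toy critic scores, just to illustrate the two WGAN losses
# (the values here are made up for illustration).
d_real = torch.tensor([0.9, 1.2, 0.7])   # D(x): critic output on real images
d_fake = torch.tensor([0.1, -0.3, 0.2])  # D(G(z)): critic output on fakes

# The critic maximizes D(x) - D(G(z)); as a loss to minimize, flip the sign:
critic_loss = torch.mean(d_fake) - torch.mean(d_real)

# The generator maximizes D(G(z)); as a loss to minimize:
generator_loss = -torch.mean(d_fake)
```

Note that neither quantity involves probabilities or a cross-entropy term, which is exactly the point made in the quote above.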
&lt;h2&gt;Code&lt;/h2&gt;
&lt;p&gt;The Generator and Discriminator neural networks are almost exactly the same as in the classic GAN case; the one difference is that the Discriminator (here acting as the critic) ends in a plain &lt;code&gt;Linear&lt;/code&gt; layer with no final &lt;code&gt;Sigmoid&lt;/code&gt;, since its output is an unbounded score rather than a probability.&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;span class="k"&gt;class&lt;/span&gt; &lt;span class="nc"&gt;Generator&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;nn&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Module&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
    &lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="fm"&gt;__init__&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="bp"&gt;self&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
        &lt;span class="nb"&gt;super&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;Generator&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="bp"&gt;self&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="fm"&gt;__init__&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
        &lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;block&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;in_feat&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;out_feat&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;normalize&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="bp"&gt;True&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
            &lt;span class="n"&gt;layers&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;nn&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Linear&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;in_feat&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;out_feat&lt;/span&gt;&lt;span class="p"&gt;)]&lt;/span&gt;
            &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;normalize&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
                &lt;span class="n"&gt;layers&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;append&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;nn&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;BatchNorm1d&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;out_feat&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mf"&gt;0.8&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;
            &lt;span class="n"&gt;layers&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;append&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;nn&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;LeakyReLU&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mf"&gt;0.2&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;inplace&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="bp"&gt;True&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;
            &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;layers&lt;/span&gt;
        &lt;span class="bp"&gt;self&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;model&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;nn&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Sequential&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
            &lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="n"&gt;block&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;opt&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;latent_dim&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;128&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;normalize&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="bp"&gt;False&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt;
            &lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="n"&gt;block&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;128&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;256&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt;
            &lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="n"&gt;block&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;256&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;512&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt;
            &lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="n"&gt;block&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;512&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;1024&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt;
            &lt;span class="n"&gt;nn&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Linear&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;1024&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nb"&gt;int&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;np&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;prod&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;img_shape&lt;/span&gt;&lt;span class="p"&gt;))),&lt;/span&gt;
            &lt;span class="n"&gt;nn&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Tanh&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
        &lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;forward&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="bp"&gt;self&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;z&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
        &lt;span class="n"&gt;img&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="bp"&gt;self&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;model&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;z&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
        &lt;span class="n"&gt;img&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;img&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;view&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;img&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;shape&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="n"&gt;img_shape&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
        &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;img&lt;/span&gt;

&lt;span class="k"&gt;class&lt;/span&gt; &lt;span class="nc"&gt;Discriminator&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;nn&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Module&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
    &lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="fm"&gt;__init__&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="bp"&gt;self&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
        &lt;span class="nb"&gt;super&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;Discriminator&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="bp"&gt;self&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="fm"&gt;__init__&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
        &lt;span class="bp"&gt;self&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;model&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;nn&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Sequential&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
            &lt;span class="n"&gt;nn&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Linear&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;opt&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;img_size&lt;/span&gt; &lt;span class="o"&gt;**&lt;/span&gt; &lt;span class="mi"&gt;2&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;512&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt;
            &lt;span class="n"&gt;nn&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;LeakyReLU&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mf"&gt;0.2&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;inplace&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="bp"&gt;True&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt;
            &lt;span class="n"&gt;nn&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Linear&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;512&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;256&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt;
            &lt;span class="n"&gt;nn&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;LeakyReLU&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mf"&gt;0.2&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;inplace&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="bp"&gt;True&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt;
            &lt;span class="n"&gt;nn&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Linear&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;256&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt;
        &lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;forward&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="bp"&gt;self&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;img&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
        &lt;span class="n"&gt;img_flat&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;img&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;view&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;img&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;shape&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt; &lt;span class="o"&gt;-&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
        &lt;span class="n"&gt;validity&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="bp"&gt;self&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;model&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;img_flat&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
        &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;validity&lt;/span&gt;
&lt;/pre&gt;&lt;/div&gt;


&lt;p&gt;The training code:&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;span class="n"&gt;real_imgs&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;Variable&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;imgs&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;type&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;Tensor&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;
&lt;span class="c1"&gt;# train Discriminator&lt;/span&gt;
&lt;span class="n"&gt;discriminator_optimizer&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;zero_grad&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
&lt;span class="c1"&gt;# sample noise as generator input&lt;/span&gt;
&lt;span class="n"&gt;z&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;Variable&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;Tensor&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;np&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;random&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;normal&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;imgs&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;shape&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt; &lt;span class="n"&gt;opt&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;latent_dim&lt;/span&gt;&lt;span class="p"&gt;))))&lt;/span&gt;
&lt;span class="c1"&gt;# generate a batch of images&lt;/span&gt;
&lt;span class="n"&gt;fake_imgs&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;generator&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;z&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;detach&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
&lt;span class="c1"&gt;# Adversarial loss&lt;/span&gt;
&lt;span class="n"&gt;discriminator_loss&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;torch&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;mean&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;discriminator&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;fake_imgs&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt; &lt;span class="o"&gt;-&lt;/span&gt; &lt;span class="n"&gt;torch&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;mean&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;discriminator&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;real_imgs&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt; &lt;span class="o"&gt;+&lt;/span&gt; 
&lt;span class="n"&gt;discriminator_loss&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;backward&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
&lt;span class="n"&gt;discriminator_optimizer&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;step&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
&lt;span class="c1"&gt;# clip weights of discriminator&lt;/span&gt;
&lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;p&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;discriminator&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;parameters&lt;/span&gt;&lt;span class="p"&gt;():&lt;/span&gt;
    &lt;span class="n"&gt;p&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;data&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;clamp_&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="o"&gt;-&lt;/span&gt;&lt;span class="n"&gt;opt&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;clip_value&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;opt&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;clip_value&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="c1"&gt;# train the generator every n_critic iterations&lt;/span&gt;
&lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;i&lt;/span&gt; &lt;span class="o"&gt;%&lt;/span&gt; &lt;span class="n"&gt;opt&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;n_critic&lt;/span&gt; &lt;span class="o"&gt;==&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
    &lt;span class="c1"&gt;# train Generator&lt;/span&gt;
    &lt;span class="n"&gt;generator_optimizer&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;zero_grad&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
    &lt;span class="c1"&gt;# generate a batch of fake images&lt;/span&gt;
    &lt;span class="n"&gt;critics_fake_imgs&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;generator&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;z&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="c1"&gt;# Adversarial loss&lt;/span&gt;
    &lt;span class="n"&gt;generator_loss&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="o"&gt;-&lt;/span&gt;&lt;span class="n"&gt;torch&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;mean&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;discriminator&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;critics_fake_imgs&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;
    &lt;span class="n"&gt;generator_loss&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;backward&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
    &lt;span class="n"&gt;generator_optimizer&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;step&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
&lt;/pre&gt;&lt;/div&gt;
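&lt;p&gt;A quick way to sanity-check the sign convention of the Wasserstein critic loss above is a stdlib-only sketch, with plain Python lists standing in for tensors (the helper names here are made up for illustration, not taken from the notebook):&lt;/p&gt;

```python
# Minimal sketch of the WGAN critic loss sign convention, using plain
# Python lists in place of tensors. The critic wants to assign high
# scores to real samples and low scores to fake ones; minimizing
# mean(scores_fake) - mean(scores_real) pushes it in that direction.

def mean(xs):
    return sum(xs) / len(xs)

def critic_loss(scores_fake, scores_real):
    # same formula as discriminator_loss in the training code above
    return mean(scores_fake) - mean(scores_real)

# an untrained critic scores real and fake about the same: loss near 0
print(critic_loss([0.1, -0.1], [0.0, 0.2]))   # -0.1

# a good critic scores real high and fake low: loss is very negative
print(critic_loss([-3.0, -2.0], [2.0, 3.0]))  # -5.0
```

&lt;p&gt;Note the generator then minimizes the opposite quantity, which is why generator_loss above is the negated mean critic score of the fakes.&lt;/p&gt;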


&lt;h2&gt;Results&lt;/h2&gt;
&lt;p&gt;Below are samples of MNIST digits produced by the GAN after 1, 5, 10, 50, 100, 150 and 200 epochs of training (an epoch is one full pass through the MNIST dataset of 60K digits):&lt;/p&gt;
&lt;p&gt;Epoch 1:&lt;/p&gt;
&lt;p&gt;&lt;img src="/images/gan_wasserstein_1.png" alt="Wasserstein GAN Generated MNIST digits" style="width: 600px;"/&gt;&lt;/p&gt;
&lt;p&gt;Epoch 5:&lt;/p&gt;
&lt;p&gt;&lt;img src="/images/gan_wasserstein_5.png" alt="Wasserstein GAN Generated MNIST digits" style="width: 600px;"/&gt;&lt;/p&gt;
&lt;p&gt;Epoch 10:&lt;/p&gt;
&lt;p&gt;&lt;img src="/images/gan_wasserstein_10.png" alt="Wasserstein GAN Generated MNIST digits" style="width: 600px;"/&gt;&lt;/p&gt;
&lt;p&gt;Epoch 50:&lt;/p&gt;
&lt;p&gt;&lt;img src="/images/gan_wasserstein_50.png" alt="Wasserstein GAN Generated MNIST digits" style="width: 600px;"/&gt;&lt;/p&gt;
&lt;p&gt;Epoch 100:&lt;/p&gt;
&lt;p&gt;&lt;img src="/images/gan_wasserstein_100.png" alt="Wasserstein GAN Generated MNIST digits" style="width: 600px;"/&gt;&lt;/p&gt;
&lt;p&gt;Epoch 150:&lt;/p&gt;
&lt;p&gt;&lt;img src="/images/gan_wasserstein_150.png" alt="Wasserstein GAN Generated MNIST digits" style="width: 600px;"/&gt;&lt;/p&gt;
&lt;p&gt;Epoch 200:&lt;/p&gt;
&lt;p&gt;&lt;img src="/images/gan_wasserstein_200.png" alt="Wasserstein GAN Generated MNIST digits" style="width: 600px;"/&gt;&lt;/p&gt;
&lt;h2&gt;Conclusion&lt;/h2&gt;
&lt;p&gt;Unlike the Classic GAN, the WGAN produces beautiful digits after 200 epochs.&lt;/p&gt;</content><category term="python"></category><category term="pytorch"></category><category term="torchvision"></category><category term="mnist"></category><category term="gan"></category></entry><entry><title>Training a Pytorch Classic MNIST GAN on Google Colab</title><link href="https://bytepawn.com/training-a-pytorch-classic-mnist-gan-on-google-colab.html" rel="alternate"></link><published>2021-03-02T00:00:00+01:00</published><updated>2021-03-02T00:00:00+01:00</updated><author><name>Marton Trencseni</name></author><id>tag:bytepawn.com,2021-03-02:/training-a-pytorch-classic-mnist-gan-on-google-colab.html</id><summary type="html">&lt;p&gt;I train a Pytorch Classic MNIST GAN on Google Colab to generate MNIST digits.&lt;br/&gt;&lt;br/&gt; &lt;img src="/images/gan_classic_200.png" alt="Classic GAN Generated MNIST digits" style="width: 400px;"/&gt;&lt;/p&gt;</summary><content type="html">&lt;h2&gt;Introduction&lt;/h2&gt;
&lt;p&gt;Generative Adversarial Networks (GANs) were &lt;a href="https://arxiv.org/abs/1406.2661"&gt;introduced by Ian Goodfellow in 2014&lt;/a&gt;. GANs are able to learn a probability distribution and generate new samples from noise per the probability distribution. In plain English, this means GANs can be trained on a set of images (or audio, etc.), and will then be able to produce realistic-looking new "fake" images (or audio, etc.). To connect the two explanations: MNIST digits are 28x28 grayscale images with pixel values 0-255, so an MNIST digit can be thought of as a 28x28=784 element vector, where each element is 0-255. Some (tiny) portions of this space look like digits; most of it looks like noise (and other portions look like letters, etc.).&lt;/p&gt;
&lt;p&gt;For another explanation, here is &lt;a href="https://en.wikipedia.org/wiki/Generative_adversarial_network"&gt;Wikipedia&lt;/a&gt;:&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;A generative adversarial network (GAN) is a class of machine learning frameworks designed by Ian Goodfellow and his colleagues in 2014. Two neural networks contest with each other in a game (in the form of a zero-sum game, where one agent's gain is another agent's loss).  Given a training set, this technique learns to generate new data with the same statistics as the training set. For example, a GAN trained on photographs can generate new photographs that look at least superficially authentic to human observers, having many realistic characteristics. Though originally proposed as a form of generative model for unsupervised learning, GANs have also proven useful for semi-supervised learning, fully supervised learning, and reinforcement learning. The core idea of a GAN is based on the "indirect" training through the discriminator, which itself is also being updated dynamically. This basically means that the generator is not trained to minimize the distance to a specific image, but rather to fool the discriminator. This enables the model to learn in an unsupervised manner.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;I took an old implementation from the abandoned &lt;a href="https://github.com/eriklindernoren/PyTorch-GAN"&gt;Pytorch-GAN repo&lt;/a&gt; and made it work on Google Colab; the &lt;a href="https://colab.research.google.com/drive/1Xxy_sH6sPJmktMzdE6H-Fs0TPnVSlCWa#scrollTo=6WhpmvokgLPa"&gt;notebook is here&lt;/a&gt;. You can just run the notebook and generate your own digits.&lt;/p&gt;
&lt;p&gt;&lt;em&gt;Note: In &lt;a href="https://bytepawn.com/training-a-pytorch-lightning-mnist-gan-on-google-colab.html#training-a-pytorch-lightning-mnist-gan-on-google-colab"&gt;a previous post&lt;/a&gt;, I tried to train the Softmax MNIST GAN in Pytorch Lighting. I decided to stop using Pytorch Lightning for now, because I ran into numerous framework issues, see &lt;a href="https://github.com/PyTorchLightning/pytorch-lightning/issues/6134"&gt;this&lt;/a&gt; and &lt;a href="https://github.com/PyTorchLightning/pytorch-lightning/issues/6213"&gt;this&lt;/a&gt; issue I opened. Also, the Softmax GAN itself gave me trouble even on Pytorch, so I decided to take a step back and start with Goodfellow's classic GAN.&lt;/em&gt;&lt;/p&gt;
&lt;h2&gt;Architecture&lt;/h2&gt;
&lt;p&gt;Conceptually a GAN program has 6 parts:&lt;/p&gt;
&lt;ol&gt;
&lt;li&gt;Data we're trying to generate&lt;/li&gt;
&lt;li&gt;Discriminator network&lt;/li&gt;
&lt;li&gt;Generator network&lt;/li&gt;
&lt;li&gt;Discriminator's loss function&lt;/li&gt;
&lt;li&gt;Generator's loss function&lt;/li&gt;
&lt;li&gt;Training process&lt;/li&gt;
&lt;/ol&gt;
&lt;p&gt;The explanation of the role of the generator and discriminator networks, from Wikipedia:&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;The generative network generates candidates while the discriminative network evaluates them. The contest operates in terms of data distributions. Typically, the generative network learns to map from a latent space to a data distribution of interest, while the discriminative network distinguishes candidates produced by the generator from the true data distribution. The generative network's training objective is to increase the error rate of the discriminative network (i.e., "fool" the discriminator network by producing novel candidates that the discriminator thinks are not synthesized (are part of the true data distribution)). A known dataset serves as the initial training data for the discriminator. Training it involves presenting it with samples from the training dataset, until it achieves acceptable accuracy. The generator trains based on whether it succeeds in fooling the discriminator. Typically the generator is seeded with randomized input that is sampled from a predefined latent space (e.g. a multivariate normal distribution). Thereafter, candidates synthesized by the generator are evaluated by the discriminator. Independent backpropagation procedures are applied to both networks so that the generator produces better images, while the discriminator becomes more skilled at flagging synthetic images. The generator is typically a deconvolutional neural network, and the discriminator is a convolutional neural network.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;For this blog post:&lt;/p&gt;
&lt;ol&gt;
&lt;li&gt;&lt;strong&gt;Data&lt;/strong&gt; = MNIST digits.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Discriminator&lt;/strong&gt; = a neural network which, given an MNIST digit, tries to distinguish real images (from the training set) from generated ones (produced by the generator). More precisely, it predicts the probability of the digit being real.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Generator&lt;/strong&gt; = a neural network which, given some random noise input, generates a 28x28 image; once trained, the generator's output should be MNIST-like.&lt;/li&gt;
&lt;/ol&gt;
&lt;p&gt;Note that 1-3 above are the same for many MNIST GAN architectures, whether it's Classic, Softmax, Wasserstein, etc. For the Classic GAN, we use &lt;a href="https://en.wikipedia.org/wiki/Cross_entropy"&gt;Binary Cross Entropy&lt;/a&gt; (BCE), a measure of the distance between two probability distributions, in both the generator's and the discriminator's loss functions:&lt;/p&gt;
&lt;ol&gt;
&lt;li&gt;&lt;strong&gt;Discriminator's loss function&lt;/strong&gt; = BCE between the discriminator's predicted probabilities and the digit's ground truth (0 for generated or 1 for real from the MNIST dataset), computed over an even mix of real and fake.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Generator's loss function&lt;/strong&gt; = BCE between the discriminator's prediction over generated images and the constant 1.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Training process&lt;/strong&gt; = one epoch goes through all 60K 28x28 MNIST digits in batches of 64, generating one fake digit for each real one. Use &lt;a href="https://en.wikipedia.org/wiki/Stochastic_gradient_descent"&gt;stochastic gradient descent&lt;/a&gt; (the Adam optimizer) to update the network weights on each batch based on the losses.&lt;/li&gt;
&lt;/ol&gt;
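&lt;p&gt;As a numeric sanity check on the two loss definitions above, here is a stdlib-only, hand-rolled version of binary cross entropy (a stand-in for Pytorch's BCELoss, not the notebook's code):&lt;/p&gt;

```python
# Hand-rolled binary cross entropy as a stand-in for Pytorch BCELoss:
# BCE(p, y) = -(y*log(p) + (1-y)*log(1-p)) for predicted probability p
# and ground-truth label y (1 = real, 0 = fake/generated).
import math

def bce(p, y):
    return -(y * math.log(p) + (1 - y) * math.log(1 - p))

# discriminator view: confident and correct means low loss
print(bce(0.99, 1))  # real digit scored 0.99: loss about 0.01
print(bce(0.01, 0))  # fake digit scored 0.01: loss about 0.01

# generator view: its loss is BCE(discriminator(fake), 1), so it is
# low exactly when the discriminator is fooled into predicting nearly 1
print(bce(0.99, 1))  # discriminator fooled: low generator loss
print(bce(0.01, 1))  # discriminator not fooled: loss about 4.6
```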
&lt;h2&gt;Code&lt;/h2&gt;
&lt;p&gt;First, let's see the Pytorch code for the discriminator, a straightforward fully-connected network:&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;span class="k"&gt;class&lt;/span&gt; &lt;span class="nc"&gt;Discriminator&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;nn&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Module&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
    &lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="fm"&gt;__init__&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="bp"&gt;self&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
        &lt;span class="nb"&gt;super&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;Discriminator&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="bp"&gt;self&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="fm"&gt;__init__&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
        &lt;span class="bp"&gt;self&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;model&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;nn&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Sequential&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
            &lt;span class="n"&gt;nn&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Linear&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nb"&gt;int&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;np&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;prod&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;img_shape&lt;/span&gt;&lt;span class="p"&gt;)),&lt;/span&gt; &lt;span class="mi"&gt;512&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt;
            &lt;span class="n"&gt;nn&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;LeakyReLU&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mf"&gt;0.2&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;inplace&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="bp"&gt;True&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt;
            &lt;span class="n"&gt;nn&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Linear&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;512&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;256&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt;
            &lt;span class="n"&gt;nn&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;LeakyReLU&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mf"&gt;0.2&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;inplace&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="bp"&gt;True&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt;
            &lt;span class="n"&gt;nn&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Linear&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;256&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt;
            &lt;span class="n"&gt;nn&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Sigmoid&lt;/span&gt;&lt;span class="p"&gt;(),&lt;/span&gt;
        &lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;forward&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="bp"&gt;self&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;img&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
        &lt;span class="n"&gt;img_flat&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;img&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;view&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;img&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;size&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt; &lt;span class="o"&gt;-&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
        &lt;span class="n"&gt;validity&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="bp"&gt;self&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;model&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;img_flat&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
        &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;validity&lt;/span&gt;
&lt;/pre&gt;&lt;/div&gt;
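&lt;p&gt;For a sense of scale, the layer sizes above pin down the discriminator's parameter count. A stdlib-only back-of-the-envelope, assuming img_shape = (1, 28, 28) as in the MNIST setup:&lt;/p&gt;

```python
# Back-of-the-envelope parameter count for the discriminator above,
# assuming img_shape = (1, 28, 28), i.e. a 784-element input vector.
# Each nn.Linear(m, n) layer has m*n weights plus n biases.

layer_sizes = [28 * 28, 512, 256, 1]  # 784, 512, 256, 1

params = sum(m * n + n for m, n in zip(layer_sizes, layer_sizes[1:]))
print(params)  # 533505 trainable parameters
```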


&lt;p&gt;Note that "straighforward" is a professional euphemism. One of the reasons Deep Learning took off a few years ago is that people realized that introducing non-linearities like ReLUs makes the network work (vs. purely linear NNs). Guessing what neural network works for a given use-case is a mix of black art, experimentation and copy/paste from existing.&lt;/p&gt;
&lt;p&gt;The generator:&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;span class="k"&gt;class&lt;/span&gt; &lt;span class="nc"&gt;Generator&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;nn&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Module&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
    &lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="fm"&gt;__init__&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="bp"&gt;self&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
        &lt;span class="nb"&gt;super&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;Generator&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="bp"&gt;self&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="fm"&gt;__init__&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
        &lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;block&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;in_feat&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;out_feat&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;normalize&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="bp"&gt;True&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
            &lt;span class="n"&gt;layers&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;nn&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Linear&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;in_feat&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;out_feat&lt;/span&gt;&lt;span class="p"&gt;)]&lt;/span&gt;
            &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;normalize&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
                &lt;span class="n"&gt;layers&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;append&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;nn&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;BatchNorm1d&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;out_feat&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mf"&gt;0.8&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;
            &lt;span class="n"&gt;layers&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;append&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;nn&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;LeakyReLU&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mf"&gt;0.2&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;inplace&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="bp"&gt;True&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;
            &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;layers&lt;/span&gt;
        &lt;span class="bp"&gt;self&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;model&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;nn&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Sequential&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
            &lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="n"&gt;block&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;opt&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;latent_dim&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;128&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;normalize&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="bp"&gt;False&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt;
            &lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="n"&gt;block&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;128&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;256&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt;
            &lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="n"&gt;block&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;256&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;512&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt;
            &lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="n"&gt;block&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;512&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;1024&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt;
            &lt;span class="n"&gt;nn&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Linear&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;1024&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nb"&gt;int&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;np&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;prod&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;img_shape&lt;/span&gt;&lt;span class="p"&gt;))),&lt;/span&gt;
            &lt;span class="n"&gt;nn&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Tanh&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
        &lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;forward&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="bp"&gt;self&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;z&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
        &lt;span class="n"&gt;img&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="bp"&gt;self&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;model&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;z&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
        &lt;span class="n"&gt;img&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;img&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;view&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;img&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;size&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="n"&gt;img_shape&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
        &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;img&lt;/span&gt;
&lt;/pre&gt;&lt;/div&gt;
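&lt;p&gt;One detail worth noting: the final Tanh means generated pixels land in [-1, 1], so the real MNIST images need to be normalized into the same range before being fed to the discriminator. A stdlib-only sketch of that mapping (the helper names are made up; in the notebook this would typically be done with a torchvision normalization transform):&lt;/p&gt;

```python
# The generator ends in Tanh, so its pixels are in [-1, 1]. Real MNIST
# pixels are 0..255 bytes, so they must be mapped into the same range.
# These helper names are illustrative, not from the notebook.

def to_tanh_range(pixel):
    # maps 0..255 onto -1.0..1.0
    return pixel / 127.5 - 1.0

def to_byte_range(value):
    # maps -1.0..1.0 back onto 0..255, e.g. when saving generated samples
    return round((value + 1.0) * 127.5)

print(to_tanh_range(0))    # -1.0
print(to_tanh_range(255))  # 1.0
print(to_byte_range(0.0))  # 128 (mid grey)
```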


&lt;p&gt;As mentioned above, both the generator's and the discriminator's loss are built on binary cross entropy (BCE), which is &lt;a href="https://pytorch.org/docs/stable/generated/torch.nn.BCELoss.html"&gt;BCELoss&lt;/a&gt; in Pytorch:&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;span class="n"&gt;bce_loss&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;torch&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;nn&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;BCELoss&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
&lt;span class="o"&gt;...&lt;/span&gt;
&lt;span class="n"&gt;generator_loss&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;bce_loss&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;discriminator&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;gen_imgs&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt; &lt;span class="n"&gt;real&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="o"&gt;...&lt;/span&gt;
&lt;span class="n"&gt;real_loss&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;bce_loss&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;discriminator&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;real_imgs&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt;         &lt;span class="n"&gt;real&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="n"&gt;fake_loss&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;bce_loss&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;discriminator&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;gen_imgs&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;detach&lt;/span&gt;&lt;span class="p"&gt;()),&lt;/span&gt; &lt;span class="n"&gt;fake&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="n"&gt;discriminator_loss&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;real_loss&lt;/span&gt; &lt;span class="o"&gt;+&lt;/span&gt; &lt;span class="n"&gt;fake_loss&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;/&lt;/span&gt; &lt;span class="mi"&gt;2&lt;/span&gt;
&lt;/pre&gt;&lt;/div&gt;


&lt;p&gt;The overall logic of the training loop is (see comments):&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;span class="n"&gt;real&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;Variable&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;Tensor&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;imgs&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;size&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;fill_&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mf"&gt;1.0&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt; &lt;span class="n"&gt;requires_grad&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="bp"&gt;False&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="n"&gt;fake&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;Variable&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;Tensor&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;imgs&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;size&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;fill_&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mf"&gt;0.0&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt; &lt;span class="n"&gt;requires_grad&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="bp"&gt;False&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="n"&gt;real_imgs&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;Variable&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;imgs&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;type&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;Tensor&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;
&lt;span class="c1"&gt;#  train Generator&lt;/span&gt;
&lt;span class="n"&gt;generator_optimizer&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;zero_grad&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
&lt;span class="c1"&gt;# sample noise as generator input&lt;/span&gt;
&lt;span class="n"&gt;z&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;Variable&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;Tensor&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;np&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;random&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;normal&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;imgs&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;shape&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt; &lt;span class="n"&gt;opt&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;latent_dim&lt;/span&gt;&lt;span class="p"&gt;))))&lt;/span&gt;
&lt;span class="c1"&gt;# generate a batch of images&lt;/span&gt;
&lt;span class="n"&gt;gen_imgs&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;generator&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;z&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="c1"&gt;# loss measures generator&amp;#39;s ability to fool the discriminator:&lt;/span&gt;
&lt;span class="c1"&gt;# for the generated images, the generator wants the discriminator to think they&amp;#39;re real (1)&lt;/span&gt;
&lt;span class="c1"&gt;# so if the discriminator(gen_imgs) == real == 1, then the generator is doing a good job, there is no loss&lt;/span&gt;
&lt;span class="n"&gt;generator_loss&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;bce_loss&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;discriminator&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;gen_imgs&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt; &lt;span class="n"&gt;real&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="n"&gt;generator_loss&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;backward&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
&lt;span class="n"&gt;generator_optimizer&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;step&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
&lt;span class="c1"&gt;# train Discriminator&lt;/span&gt;
&lt;span class="n"&gt;discriminator_optimizer&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;zero_grad&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
&lt;span class="c1"&gt;# loss measure discriminator&amp;#39;s ability to classify real from generated samples:&lt;/span&gt;
&lt;span class="n"&gt;real_loss&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;bce_loss&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;discriminator&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;real_imgs&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt;         &lt;span class="n"&gt;real&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="n"&gt;fake_loss&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;bce_loss&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;discriminator&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;gen_imgs&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;detach&lt;/span&gt;&lt;span class="p"&gt;()),&lt;/span&gt; &lt;span class="n"&gt;fake&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="n"&gt;discriminator_loss&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;real_loss&lt;/span&gt; &lt;span class="o"&gt;+&lt;/span&gt; &lt;span class="n"&gt;fake_loss&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;/&lt;/span&gt; &lt;span class="mi"&gt;2&lt;/span&gt;
&lt;span class="n"&gt;discriminator_loss&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;backward&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
&lt;span class="n"&gt;discriminator_optimizer&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;step&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
&lt;/pre&gt;&lt;/div&gt;
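&lt;p&gt;The loop above references &lt;code&gt;real&lt;/code&gt; and &lt;code&gt;fake&lt;/code&gt; label tensors that are set up outside the snippet. Here is a minimal sketch of how they are typically constructed; the batch size of 64 is an assumption, in the loop it would come from &lt;code&gt;imgs.shape[0]&lt;/code&gt;:&lt;/p&gt;

```python
import torch

# Target labels for the BCE loss: 1 for real images, 0 for generated ones.
# Batch size 64 is an assumption; in the loop above it would be imgs.shape[0].
batch_size = 64
real = torch.ones(batch_size, 1)   # discriminator should output 1 on real images
fake = torch.zeros(batch_size, 1)  # discriminator should output 0 on generated images
```

&lt;p&gt;Note also that &lt;code&gt;gen_imgs.detach()&lt;/code&gt; in the discriminator step stops gradients from flowing back into the generator, so the discriminator update does not touch the generator's weights.&lt;/p&gt;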


&lt;h2&gt;Results&lt;/h2&gt;
&lt;p&gt;Note that this is the first GAN invented, so it still suffers from childhood issues such as "mode collapse", where the GAN mostly emits the same few digits. In my opinion a better name would be "range collapse", since in mathematics the range is the term for the set of values a function maps into.&lt;/p&gt;
&lt;p&gt;Below are samples of MNIST digits produced by the GAN after 1, 5, 10, 50, 100, 150 and 200 epochs of training (an epoch is one full pass through the MNIST dataset, consisting of 60K digits):&lt;/p&gt;
&lt;p&gt;Epoch 1:&lt;/p&gt;
&lt;p&gt;&lt;img src="/images/gan_classic_1.png" alt="Classic GAN Generated MNIST digits" style="width: 600px;"/&gt;&lt;/p&gt;
&lt;p&gt;Epoch 5:&lt;/p&gt;
&lt;p&gt;&lt;img src="/images/gan_classic_5.png" alt="Classic GAN Generated MNIST digits" style="width: 600px;"/&gt;&lt;/p&gt;
&lt;p&gt;Epoch 10:&lt;/p&gt;
&lt;p&gt;&lt;img src="/images/gan_classic_10.png" alt="Classic GAN Generated MNIST digits" style="width: 600px;"/&gt;&lt;/p&gt;
&lt;p&gt;Epoch 50:&lt;/p&gt;
&lt;p&gt;&lt;img src="/images/gan_classic_50.png" alt="Classic GAN Generated MNIST digits" style="width: 600px;"/&gt;&lt;/p&gt;
&lt;p&gt;Epoch 100:&lt;/p&gt;
&lt;p&gt;&lt;img src="/images/gan_classic_100.png" alt="Classic GAN Generated MNIST digits" style="width: 600px;"/&gt;&lt;/p&gt;
&lt;p&gt;Epoch 150:&lt;/p&gt;
&lt;p&gt;&lt;img src="/images/gan_classic_150.png" alt="Classic GAN Generated MNIST digits" style="width: 600px;"/&gt;&lt;/p&gt;
&lt;p&gt;Epoch 200:&lt;/p&gt;
&lt;p&gt;&lt;img src="/images/gan_classic_200.png" alt="Classic GAN Generated MNIST digits" style="width: 600px;"/&gt;&lt;/p&gt;
&lt;h2&gt;Conclusion&lt;/h2&gt;
&lt;ol&gt;
&lt;li&gt;The digits above look okay: many of them are recognizable, but overall the quality is not great.&lt;/li&gt;
&lt;li&gt;The network suffers from range collapse; it mostly generates 2s and 8s.&lt;/li&gt;
&lt;/ol&gt;
&lt;p&gt;In the next post, I will look at the Wasserstein GAN, which doesn't suffer from range collapse and produces beautiful MNIST digits.&lt;/p&gt;</content><category term="python"></category><category term="pytorch"></category><category term="torchvision"></category><category term="mnist"></category><category term="gan"></category></entry><entry><title>How I manage notifications to keep myself uninterrupted</title><link href="https://bytepawn.com/how-i-manage-notifications-to-keep-myself-uninterrupted.html" rel="alternate"></link><published>2021-02-22T00:00:00+01:00</published><updated>2021-02-22T00:00:00+01:00</updated><author><name>Marton Trencseni</name></author><id>tag:bytepawn.com,2021-02-22:/how-i-manage-notifications-to-keep-myself-uninterrupted.html</id><summary type="html">&lt;p&gt;I love using apps, email, social networking and messaging. But I want to do use them on my own time. So many years ago I decided to not let my phone interrupt me.&lt;br/&gt;&lt;br/&gt; &lt;img src="/images/phone_lock.jpg" alt="My phone's lock screen" style="width: 200px;"/&gt;&lt;/p&gt;</summary><content type="html">&lt;p&gt;&lt;strong&gt;tldr = my phone is always on mute, no vibrations&lt;/strong&gt;&lt;/p&gt;
&lt;p&gt;As a tech worker I spend a lot of time in front of my MacBook and on my phone.
In terms of usage statistics, I spend about 3 to 5 hours per day on the phone, and an additional 4 to 8 hours on the MacBook.&lt;/p&gt;
&lt;p&gt;&lt;img src="/images/phone_usage.jpg" alt="My phone usage" style="width: 200px;"/&gt;&lt;/p&gt;
&lt;p&gt;I already have very good response time, since I check my phone hourly (or more frequently).&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;What I don't want to be is interruptible.&lt;/strong&gt; I decided and validated over the last 5 years that almost nothing happening day-to-day is important enough to interrupt me. After all, 30 years ago nobody had mobile phones, and life worked out just fine.&lt;/p&gt;
&lt;p&gt;Just because we have mobiles with email and great social networking and messaging apps doesn't mean that anybody who is my friend on Messenger (or LinkedIn or WhatsApp or Viber) has a real-time interrupt vector into my life and mind. They do not; I get to decide who can interrupt me and when. To be clear, I believe that email, social networking and messaging apps are great, in my opinion they are a net positive Δ for the world. I love using them and keeping in touch with my friends and family in other countries. I just want to do it on my own time.&lt;/p&gt;
&lt;p&gt;So many years ago I decided to not let my phone (or any other devices) interrupt me. The way to achieve this is to &lt;strong&gt;run my phone on mute, with vibrations turned off&lt;/strong&gt;. Additionally, to make sure I'm not interrupted when I just look at my phone to check the time, I also turned off all notifications on the lock screen. My lock screen is always blank, and my phone is always silent.&lt;/p&gt;
&lt;p&gt;&lt;img src="/images/phone_lock.jpg" alt="My phone's lock screen" style="width: 200px;"/&gt;&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;I also aggressively manage per app notifications.&lt;/strong&gt; Even though a notification will not interrupt me and won't annoy me on the lock screen, it will still show up once I unlock the phone. If it's not important, it needs to go. Android lets me turn off all notifications per app, or just certain notifications per app. A good example is Google Play notifying me that it's updating apps. Who cares? I turned it off. It's easy: when there's a notification, slide down from the top, touch and hold the notification, the app's notification settings come up, and turn it off, possibly for the whole app.&lt;/p&gt;
&lt;p&gt;&lt;img src="/images/phone_notifs.jpg" alt="My phone's empty notification panel" style="width: 200px;"/&gt;&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;What happens when somebody tries to call me?&lt;/strong&gt;&lt;br/&gt;
Unless the phone happens to be in my hand, I won't know about it, and I won't pick it up. But I will see it the next time I decide to spend time with the phone and unlock it, which is usually 20-60 minutes later. So I call them back, if I want to. If it looks like a sales call, I won't call back. Another good trick with phone calls: if it's really important, they'll call twice or thrice...&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;What if it's urgent?&lt;/strong&gt;&lt;br/&gt;
Empirically I have determined that nothing is urgent: I've been doing this for 5+ years. The only time I turned up the volume for a few days was when our baby was due, but even then it didn't matter in the end. To re-iterate, very few things are really urgent, and I can't think of a single thing that can't wait 20-60 minutes.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;What if it's work?&lt;/strong&gt;&lt;br/&gt;
Same: it can wait a few minutes to a few hours. Also, during work hours I'm usually on my laptop, so I see notifications there. As a tech worker, I don't commit to being reachable by phone for work purposes in real time, only to responding to internal chats in a reasonable time.&lt;/p&gt;
&lt;p&gt;Having said all that, since I check my phone regularly, &lt;strong&gt;I still have great response time&lt;/strong&gt;. Although I may not respond immediately, I almost always respond within an hour, because I like the Zero Inbox concept. So I like my chats to be in a state where I'm expecting a reply instead of the other party expecting a reply from me (that would be a mental todo item, which I try to keep at close to zero).&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;It's a trade-off&lt;/strong&gt;&lt;br/&gt;
This is a great trade-off quantitatively. Suppose that once a year there is something important that you would want to know about in real-time. On the flip side, there are about 100 unimportant notifications per day. So that's 1 out of 36,500. I'll take not being interrupted for 99.997% of the time for a 20-60 minute delay in learning about something important.&lt;/p&gt;
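&lt;p&gt;A back-of-the-envelope check of the numbers above (the 100 unimportant notifications per day and 1 important event per year are the rough assumptions from the text):&lt;/p&gt;

```python
# Rough check of the trade-off: what fraction of notifications is important?
unimportant_per_day = 100
notifications_per_year = unimportant_per_day * 365   # 36,500 per year
important_per_year = 1                               # assume one truly urgent event/year
fraction_important = important_per_year / notifications_per_year
print(f"{100 * (1 - fraction_important):.3f}% of interruptions avoided")  # 99.997%
```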
&lt;p&gt;&lt;strong&gt;Tablet for reading&lt;/strong&gt;&lt;br/&gt;
I also have a tablet for Kindle reading and taking notes. On it, I have totally disabled all notifications. I don't care about anything while I'm reading a book, just as if I were reading a paper book.&lt;/p&gt;
&lt;p&gt;Happy notifications!&lt;/p&gt;</content><category term="personal"></category><category term="interruptions"></category><category term="notifications"></category></entry><entry><title>Training a Pytorch Lightning MNIST GAN on Google Colab</title><link href="https://bytepawn.com/training-a-pytorch-lightning-mnist-gan-on-google-colab.html" rel="alternate"></link><published>2021-02-20T00:00:00+01:00</published><updated>2021-02-20T00:00:00+01:00</updated><author><name>Marton Trencseni</name></author><id>tag:bytepawn.com,2021-02-20:/training-a-pytorch-lightning-mnist-gan-on-google-colab.html</id><summary type="html">&lt;p&gt;I explore MNIST digits generated by a Generative Adversarial Network trained on Google Colab using Pytorch Lightning. &lt;br/&gt;&lt;br/&gt; &lt;img src="/images/gan_epoch5_c.png" alt="Softmax GAN after 5 epoch, 100 samples." style="width: 400px;"/&gt;&lt;/p&gt;</summary><content type="html">&lt;h2&gt;Getting started with Google Colab&lt;/h2&gt;
&lt;p&gt;&lt;a href="https://colab.research.google.com/notebooks/intro.ipynb#recent=true"&gt;Google Colab&lt;/a&gt; is a free Jupyter-like service from Google. The easiest way to get started is to &lt;a href="https://colab.research.google.com/drive/1T6TpBvtFt14UrvCwDgP3eIAzaz-Af_e9#scrollTo=inrPxI-LCiNI"&gt;open the notebook for this post&lt;/a&gt;, and then click &lt;code&gt;File &amp;gt; Save a copy in Drive&lt;/code&gt;, this will make a copy for you. Alternatively, in Google Drive, you can click &lt;code&gt;New &amp;gt; More &amp;gt; Google Colaboratory&lt;/code&gt; to start with a blank notebook. Try it out, it's free, &lt;strong&gt;you get GPU training for free&lt;/strong&gt;, and the Colab UI is very nice.&lt;/p&gt;
&lt;p&gt;It's important to know that there are limits on Colab, it's free after all. After one hour of UI inactivity, the underlying Virtual Machine (VM) will shut down, and all memory/state and files you wrote to the VM's disk will be lost. You can re-run the notebook, but it will be on a fresh new VM. So when working on Colab, always make sure your notebook is complete, so that &lt;code&gt;Runtime &amp;gt; Restart and run all&lt;/code&gt; results in correct execution. Also, even if you're active continuously, the VM will be shut down after 12 hours. This model is good enough for experimentation if you don't have access to your own GPU(s), but not for more serious work.&lt;/p&gt;
&lt;p&gt;&lt;img src="/images/google_colab_1.png" alt="Google Colab." style="width: 600px;"/&gt;&lt;/p&gt;
&lt;h2&gt;Enabling the GPU on Colab&lt;/h2&gt;
&lt;p&gt;In Colab, you get a notebook running on a VM with a GPU — but &lt;strong&gt;first you have to enable the GPU&lt;/strong&gt;. Go to &lt;code&gt;Edit &amp;gt; Notebook settings&lt;/code&gt;, and enable a GPU or a TPU. The GPU option is faster, go with that. Without this, the runtime will not see any GPUs! However, you only have to do this once, not on every VM restart.&lt;/p&gt;
&lt;p&gt;&lt;img src="/images/google_colab_2.png" alt="Google Colab." style="width: 300px;"/&gt;&lt;/p&gt;
&lt;h2&gt;GAN training with Pytorch Lightning&lt;/h2&gt;
&lt;p&gt;I will use one of &lt;a href="https://github.com/eriklindernoren"&gt;Erik Linder-Noren's&lt;/a&gt; GANs called &lt;a href="https://github.com/eriklindernoren/PyTorch-GAN/blob/master/implementations/softmax_gan/softmax_gan.py"&gt;Softmax GAN&lt;/a&gt;. Erik's version is for Pytorch; it was adapted for &lt;a href="https://www.pytorchlightning.ai/"&gt;Pytorch Lightning&lt;/a&gt; and made into an &lt;a href="https://github.com/PyTorchLightning/pytorch-lightning/blob/master/pl_examples/domain_templates/generative_adversarial_net.py"&gt;official Pytorch Lightning example&lt;/a&gt;.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;What is Pytorch Lightning?&lt;/strong&gt; It's a small framework on top of Pytorch that is quickly becoming the new standard. It helps you write more modular code by forcing you to factor out code into classes and callbacks. Usually, factoring existing Pytorch code into Lightning code is a simple matter, and results in less code (because Lightning has built in Trainers) and cleaner code. Check the &lt;a href="https://www.pytorchlightning.ai/"&gt;official page for a quick explanation&lt;/a&gt;.&lt;/p&gt;
&lt;p&gt;Since we're using MNIST here, note these previous Bytepawn posts about MNIST:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;&lt;a href="https://bytepawn.com/solving-mnist-with-pytorch-and-skl.html#solving-mnist-with-pytorch-and-skl"&gt;Solving MNIST with Pytorch and SKL&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://bytepawn.com/mnist-pixel-attacks-with-pytorch.html#mnist-pixel-attacks-with-pytorch"&gt;MNIST pixel attacks with Pytorch&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;h2&gt;Generative Adversarial Networks&lt;/h2&gt;
&lt;p&gt;From &lt;a href="https://en.wikipedia.org/wiki/Generative_adversarial_network"&gt;Wikipedia&lt;/a&gt;:&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;The generative network generates candidates while the discriminative network evaluates them. The contest operates in terms of data distributions. Typically, the generative network learns to map from a latent space to a data distribution of interest, while the discriminative network distinguishes candidates produced by the generator from the true data distribution. The generative network's training objective is to increase the error rate of the discriminative network (i.e., "fool" the discriminator network by producing novel candidates that the discriminator thinks are not synthesized (are part of the true data distribution)).&lt;/p&gt;
&lt;p&gt;A known dataset serves as the initial training data for the discriminator. Training it involves presenting it with samples from the training dataset, until it achieves acceptable accuracy. The generator trains based on whether it succeeds in fooling the discriminator. Typically the generator is seeded with randomized input that is sampled from a predefined latent space (e.g. a multivariate normal distribution). Thereafter, candidates synthesized by the generator are evaluated by the discriminator. Independent backpropagation procedures are applied to both networks so that the generator produces better images, while the discriminator becomes more skilled at flagging synthetic images. The generator is typically a deconvolutional neural network, and the discriminator is a convolutional neural network.&lt;/p&gt;
&lt;p&gt;GANs often suffer from a "mode collapse" where they fail to generalize properly, missing entire modes from the input data. For example, a GAN trained on the MNIST dataset containing many samples of each digit, might nevertheless timidly omit a subset of the digits from its output. Some researchers perceive the root problem to be a weak discriminative network that fails to notice the pattern of omission, while others assign blame to a bad choice of objective function.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;h2&gt;Code&lt;/h2&gt;
&lt;p&gt;Let's start with the code for the Generator. Note that while the Wikipedia quote above describes the generator as typically deconvolutional, this implementation is a simple fully-connected network: it produces an image from a lower-dimensional (random) input:&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;span class="k"&gt;class&lt;/span&gt; &lt;span class="nc"&gt;Generator&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;nn&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Module&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
    &lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="fm"&gt;__init__&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="bp"&gt;self&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;latent_dim&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;img_shape&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
        &lt;span class="nb"&gt;super&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="fm"&gt;__init__&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
        &lt;span class="bp"&gt;self&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;img_shape&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;img_shape&lt;/span&gt;

        &lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;block&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;in_feat&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;out_feat&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;normalize&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="bp"&gt;True&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
            &lt;span class="n"&gt;layers&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;nn&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Linear&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;in_feat&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;out_feat&lt;/span&gt;&lt;span class="p"&gt;)]&lt;/span&gt;
            &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;normalize&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
                &lt;span class="n"&gt;layers&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;append&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;nn&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;BatchNorm1d&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;out_feat&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mf"&gt;0.8&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;
            &lt;span class="n"&gt;layers&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;append&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;nn&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;LeakyReLU&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mf"&gt;0.2&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;inplace&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="bp"&gt;True&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;
            &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;layers&lt;/span&gt;
        &lt;span class="bp"&gt;self&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;model&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;nn&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Sequential&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
            &lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="n"&gt;block&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;latent_dim&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;128&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;normalize&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="bp"&gt;False&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt;
            &lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="n"&gt;block&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;128&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;256&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt;
            &lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="n"&gt;block&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;256&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;512&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt;
            &lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="n"&gt;block&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;512&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;1024&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt;
            &lt;span class="n"&gt;nn&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Linear&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;1024&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nb"&gt;int&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;np&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;prod&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;img_shape&lt;/span&gt;&lt;span class="p"&gt;))),&lt;/span&gt;
            &lt;span class="n"&gt;nn&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Tanh&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
        &lt;span class="p"&gt;)&lt;/span&gt;

    &lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;forward&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="bp"&gt;self&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;z&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
        &lt;span class="n"&gt;img&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="bp"&gt;self&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;model&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;z&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
        &lt;span class="n"&gt;img&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;img&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;view&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;img&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;size&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="bp"&gt;self&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;img_shape&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
        &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;img&lt;/span&gt;
&lt;/pre&gt;&lt;/div&gt;


&lt;p&gt;Next, the Discriminator, again a fully-connected network rather than a convolutional one: it takes an image and emits a probability (via a final Sigmoid) that the image comes from the training set:&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;span class="k"&gt;class&lt;/span&gt; &lt;span class="nc"&gt;Discriminator&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;nn&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Module&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
    &lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="fm"&gt;__init__&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="bp"&gt;self&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;img_shape&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
        &lt;span class="nb"&gt;super&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="fm"&gt;__init__&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;

        &lt;span class="bp"&gt;self&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;model&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;nn&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Sequential&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
            &lt;span class="n"&gt;nn&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Linear&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nb"&gt;int&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;np&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;prod&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;img_shape&lt;/span&gt;&lt;span class="p"&gt;)),&lt;/span&gt; &lt;span class="mi"&gt;512&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt;
            &lt;span class="n"&gt;nn&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;LeakyReLU&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mf"&gt;0.2&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;inplace&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="bp"&gt;True&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt;
            &lt;span class="n"&gt;nn&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Linear&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;512&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;256&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt;
            &lt;span class="n"&gt;nn&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;LeakyReLU&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mf"&gt;0.2&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;inplace&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="bp"&gt;True&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt;
            &lt;span class="n"&gt;nn&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Linear&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;256&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt;
            &lt;span class="n"&gt;nn&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Sigmoid&lt;/span&gt;&lt;span class="p"&gt;(),&lt;/span&gt;
        &lt;span class="p"&gt;)&lt;/span&gt;

    &lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;forward&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="bp"&gt;self&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;img&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
        &lt;span class="n"&gt;img_flat&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;img&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;view&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;img&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;size&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt; &lt;span class="o"&gt;-&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
        &lt;span class="n"&gt;validity&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="bp"&gt;self&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;model&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;img_flat&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
        &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;validity&lt;/span&gt;
&lt;/pre&gt;&lt;/div&gt;
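&lt;p&gt;As a quick sanity check of the shapes involved, here is a standalone sketch mirroring the Discriminator's structure (not the post's full code; the batch size of 16 and the single hidden layer are arbitrary): flattening a batch of 1×28×28 images and pushing it through a stack of Linear layers yields one probability per image.&lt;/p&gt;

```python
import numpy as np
import torch
import torch.nn as nn

img_shape = (1, 28, 28)
model = nn.Sequential(
    nn.Linear(int(np.prod(img_shape)), 512),  # 784 input pixels -> 512 hidden units
    nn.LeakyReLU(0.2),
    nn.Linear(512, 1),
    nn.Sigmoid(),                             # squashes the logit to a probability
)
imgs = torch.randn(16, *img_shape)            # a fake batch of 16 "images"
out = model(imgs.view(imgs.size(0), -1))      # flatten each image, then classify
print(out.shape)                              # torch.Size([16, 1])
```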


&lt;p&gt;I will not reproduce the rest of the code here or discuss the detail of the Softmax GAN architecture, &lt;a href="https://paperswithcode.com/paper/softmax-gan"&gt;check the paper&lt;/a&gt; if you're interested.&lt;/p&gt;
&lt;h2&gt;Running the code in Colab&lt;/h2&gt;
&lt;p&gt;Colab already comes with Pytorch pre-installed. The only library we have to install manually for this example to run is Pytorch Lightning:&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;# suppress boring pip output with capture
%%capture
!pip install pytorch_lightning
&lt;/pre&gt;&lt;/div&gt;


&lt;p&gt;To see example images produced by the GAN, I wrote this helper function, which is called after training completes:&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;show_sample&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;model&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
    &lt;span class="n"&gt;z&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;model&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;validation_z&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;type_as&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;model&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;generator&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;model&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;weight&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="n"&gt;sample_imgs&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;model&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;z&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="n"&gt;grid&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;torchvision&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;utils&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;make_grid&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;sample_imgs&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;nrow&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mi"&gt;10&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="n"&gt;plt&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;figure&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;figsize&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;10&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;10&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;
    &lt;span class="n"&gt;plt&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;imshow&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;np&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;transpose&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;grid&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;detach&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;numpy&lt;/span&gt;&lt;span class="p"&gt;(),&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;2&lt;/span&gt; &lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;)),&lt;/span&gt; &lt;span class="n"&gt;interpolation&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;nearest&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/pre&gt;&lt;/div&gt;
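One detail in the helper worth spelling out: the np.transpose call with axes (1, 2, 0) reorders the image grid from Pytorch's channels-first (C, H, W) layout into the channels-last (H, W, C) layout that matplotlib's imshow() expects. A dependency-free sketch of the same reordering, with plain nested lists standing in for tensors:

```python
# Reorder a (C, H, W) nested list into (H, W, C), mirroring what
# np.transpose(img, (1, 2, 0)) does in the helper above.
def chw_to_hwc(img):
    C, H, W = len(img), len(img[0]), len(img[0][0])
    return [[[img[c][h][w] for c in range(C)]
             for w in range(W)]
            for h in range(H)]

# One 2x2 image with 3 channels:
img = [[[1, 2], [3, 4]],      # channel 0
       [[5, 6], [7, 8]],      # channel 1
       [[9, 10], [11, 12]]]   # channel 2
print(chw_to_hwc(img)[0][0])  # pixel (0, 0) across channels: [1, 5, 9]
```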


&lt;h2&gt;Results&lt;/h2&gt;
&lt;p&gt;I ran the GAN with varying numbers of epochs to see how the Generator improves over time. For lower epoch counts, I ran it multiple times to see the variation between runs.
Each image shows 100 digits "randomly" generated by the Generator network.&lt;/p&gt;
&lt;p&gt;&lt;code&gt;num_epochs = 1&lt;/code&gt;, showing output from 3 different training runs:&lt;/p&gt;
&lt;p&gt;&lt;img src="/images/gan_epoch1_a.png" alt="Softmax GAN after 1 epoch, 100 samples." style="width: 600px;"/&gt;&lt;/p&gt;
&lt;p&gt;&lt;img src="/images/gan_epoch1_b.png" alt="Softmax GAN after 1 epoch, 100 samples." style="width: 600px;"/&gt;&lt;/p&gt;
&lt;p&gt;&lt;img src="/images/gan_epoch1_c.png" alt="Softmax GAN after 1 epoch, 100 samples." style="width: 600px;"/&gt;&lt;/p&gt;
&lt;p&gt;&lt;code&gt;num_epochs = 5&lt;/code&gt;, showing output from 3 different training runs:&lt;/p&gt;
&lt;p&gt;&lt;img src="/images/gan_epoch5_a.png" alt="Softmax GAN after 5 epoch, 100 samples." style="width: 600px;"/&gt;&lt;/p&gt;
&lt;p&gt;&lt;img src="/images/gan_epoch5_b.png" alt="Softmax GAN after 5 epoch, 100 samples." style="width: 600px;"/&gt;&lt;/p&gt;
&lt;p&gt;&lt;img src="/images/gan_epoch5_c.png" alt="Softmax GAN after 5 epoch, 100 samples." style="width: 600px;"/&gt;&lt;/p&gt;
&lt;p&gt;&lt;code&gt;num_epochs = 10&lt;/code&gt;, showing output from 3 different training runs:&lt;/p&gt;
&lt;p&gt;&lt;img src="/images/gan_epoch10_a.png" alt="Softmax GAN after 10 epoch, 100 samples." style="width: 600px;"/&gt;&lt;/p&gt;
&lt;p&gt;&lt;img src="/images/gan_epoch10_b.png" alt="Softmax GAN after 10 epoch, 100 samples." style="width: 600px;"/&gt;&lt;/p&gt;
&lt;p&gt;&lt;img src="/images/gan_epoch10_c.png" alt="Softmax GAN after 10 epoch, 100 samples." style="width: 600px;"/&gt;&lt;/p&gt;
&lt;p&gt;&lt;code&gt;num_epochs = 100&lt;/code&gt;, showing output from 1 training run:&lt;/p&gt;
&lt;p&gt;&lt;img src="/images/gan_epoch100_a.png" alt="Softmax GAN after 100 epoch, 100 samples." style="width: 600px;"/&gt;&lt;/p&gt;
&lt;p&gt;&lt;code&gt;num_epochs = 200&lt;/code&gt;, showing output from 1 training run:&lt;/p&gt;
&lt;p&gt;&lt;img src="/images/gan_epoch200_a.png" alt="Softmax GAN after 200 epoch, 100 samples." style="width: 600px;"/&gt;&lt;/p&gt;
&lt;h2&gt;Conclusion&lt;/h2&gt;
&lt;p&gt;The results support the "mode collapse" paragraph of the Wikipedia quote: the Generator tends to emit only 1 or 2 distinct digits. On the 4th image, it seems to be 5s and 7s, on the next, 8s and 9s.
The GAN does not converge to a good solution. In fact, I have trouble seeing any obvious visual benefit from training beyond 5 epochs. The other give-away is that the network produces too much noise: any human could easily distinguish these from the MNIST digits, which have no noise. Having said that, this could likely be fixed by running the generated images through a traditional de-noising filter; I think the digits would then be reasonably good.&lt;/p&gt;
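As a rough sketch of what such a de-noising step could look like (the 3x3 median filter below is my stand-in for a "traditional" filter; in practice a library routine such as scipy.ndimage.median_filter would be used):

```python
# A minimal 3x3 median filter in pure Python: each interior pixel is
# replaced by the median of its 3x3 neighborhood, which removes isolated
# "salt" noise pixels while leaving solid regions intact.
from statistics import median

def median_filter_3x3(img):
    """img: 2D list of pixel values; returns a de-noised copy."""
    h, w = len(img), len(img[0])
    out = [row[:] for row in img]  # borders are left untouched
    for i in range(1, h - 1):
        for j in range(1, w - 1):
            window = [img[i + di][j + dj]
                      for di in (-1, 0, 1) for dj in (-1, 0, 1)]
            out[i][j] = median(window)
    return out

# A lone bright pixel in a dark image is removed:
noisy = [[0] * 5 for _ in range(5)]
noisy[2][2] = 255
clean = median_filter_3x3(noisy)
print(clean[2][2])  # 0
```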
&lt;p&gt;Google Colab is a nice free notebook product from Google, and it's better than training on your laptop without a GPU. But even with a GPU, training for hundreds of epochs is slow and impractical for experimentation. In a subsequent post I will move training to AWS Sagemaker, where more powerful GPU instances are available, though not for free. With quicker training, I will try to fix the GAN to produce good quality images, like in &lt;a href="https://machinelearningmastery.com/how-to-develop-a-generative-adversarial-network-for-an-mnist-handwritten-digits-from-scratch-in-keras/"&gt;this post&lt;/a&gt;.&lt;/p&gt;</content><category term="python"></category><category term="pytorch"></category><category term="gan"></category><category term="mnist"></category><category term="google-colab"></category></entry><entry><title>Automatic MLFlow logging for Pytorch</title><link href="https://bytepawn.com/automatic-mlflow-logging-for-pytorch.html" rel="alternate"></link><published>2021-01-24T00:00:00+01:00</published><updated>2021-01-24T00:00:00+01:00</updated><author><name>Marton Trencseni</name></author><id>tag:bytepawn.com,2021-01-24:/automatic-mlflow-logging-for-pytorch.html</id><summary type="html">&lt;p&gt;I explore the automatic logging capabilities of MLFlow for Pytorch.&lt;br/&gt;&lt;br/&gt; &lt;img src="/images/mlflow_pytorch_loss.png" alt="MLFlow Pytorch loss example." style="width: 400px;"/&gt;&lt;/p&gt;</summary><content type="html">&lt;h2&gt;Introduction&lt;/h2&gt;
&lt;p&gt;In a &lt;a href="http://bytepawn.com/getting-started-with-mlflow.html#getting-started-with-mlflow"&gt;previous post I looked at getting MLFlow up and running and emitting some simple logs&lt;/a&gt; using &lt;code&gt;log_artifact()&lt;/code&gt;, &lt;code&gt;log_param()&lt;/code&gt;, &lt;code&gt;log_metric()&lt;/code&gt;. Then I played around with &lt;a href="http://bytepawn.com/automatic-mlflow-logging-for-scikit-learn.html"&gt;automatic logging for Scikit Learn&lt;/a&gt;, and in the process even found some bugs in MLFlow, and fixed them. In this post, I will look at the automatic logging capabilities of MLFlow, but this time for Pytorch. The relevant &lt;a href="https://mlflow.org/docs/latest/tracking.html#pytorch-experimental"&gt;documentation is here&lt;/a&gt;. &lt;a href="https://github.com/mtrencseni/pytorch-playground/blob/master/12-mlflow/MNIST%20Pytorch%20Lightning%20MLFlow.ipynb"&gt;The source code is up on Github.&lt;/a&gt;&lt;/p&gt;
&lt;h2&gt;Autologging for classic Pytorch code&lt;/h2&gt;
&lt;p&gt;First, we have to turn on MLFlow itself and the autologging:&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="nn"&gt;mlflow&lt;/span&gt;
&lt;span class="n"&gt;mlflow&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;set_tracking_uri&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;http://127.0.0.1:5000&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;  &lt;span class="c1"&gt;# set up connection&lt;/span&gt;
&lt;span class="n"&gt;mlflow&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;set_experiment&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;test-experiment&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;          &lt;span class="c1"&gt;# set the experiment&lt;/span&gt;
&lt;span class="n"&gt;mlflow&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;pytorch&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;autolog&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
&lt;/pre&gt;&lt;/div&gt;


&lt;p&gt;The easiest thing to try is to take existing Pytorch code, and run it after the above MLFlow code ran. So let's take the &lt;a href="http://bytepawn.com/solving-mnist-with-pytorch-and-skl.html"&gt;MNIST code from a previous blog post&lt;/a&gt; (only showing relevant bits):&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;span class="k"&gt;class&lt;/span&gt; &lt;span class="nc"&gt;CNN&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;nn&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Module&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
    &lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="fm"&gt;__init__&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="bp"&gt;self&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
        &lt;span class="o"&gt;...&lt;/span&gt;
    &lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;forward&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="bp"&gt;self&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;x&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
        &lt;span class="o"&gt;...&lt;/span&gt;

&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;train&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;model&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;device&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;train_loader&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;optimizer&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;epoch&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
    &lt;span class="n"&gt;losses&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[]&lt;/span&gt;
    &lt;span class="n"&gt;model&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;train&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
    &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;batch_idx&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;data&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;target&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="nb"&gt;enumerate&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;train_loader&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
        &lt;span class="n"&gt;data&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;target&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;data&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;to&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;device&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt; &lt;span class="n"&gt;target&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;to&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;device&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
        &lt;span class="n"&gt;optimizer&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;zero_grad&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
        &lt;span class="n"&gt;output&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;model&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;data&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
        &lt;span class="n"&gt;loss&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;F&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;nll_loss&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;output&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;target&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
        &lt;span class="n"&gt;loss&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;backward&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
        &lt;span class="n"&gt;optimizer&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;step&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
        &lt;span class="n"&gt;losses&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;append&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;loss&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;item&lt;/span&gt;&lt;span class="p"&gt;())&lt;/span&gt;
        &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;batch_idx&lt;/span&gt; &lt;span class="o"&gt;%&lt;/span&gt; &lt;span class="mi"&gt;10&lt;/span&gt; &lt;span class="o"&gt;==&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
            &lt;span class="k"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;Train Epoch: {} [{}/{} ({:.0f}%)]&lt;/span&gt;&lt;span class="se"&gt;\t&lt;/span&gt;&lt;span class="s1"&gt;Loss: {:.6f}&amp;#39;&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;format&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
                &lt;span class="n"&gt;epoch&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;batch_idx&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt; &lt;span class="nb"&gt;len&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;data&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt; &lt;span class="nb"&gt;len&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;train_loader&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;dataset&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt;
                &lt;span class="mf"&gt;100.&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt; &lt;span class="n"&gt;batch_idx&lt;/span&gt; &lt;span class="o"&gt;/&lt;/span&gt; &lt;span class="nb"&gt;len&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;train_loader&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt; &lt;span class="n"&gt;loss&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;item&lt;/span&gt;&lt;span class="p"&gt;()))&lt;/span&gt;
    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;losses&lt;/span&gt;


&lt;span class="n"&gt;train_loader&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;torch&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;utils&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;data&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;DataLoader&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
    &lt;span class="n"&gt;datasets&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;MNIST&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="o"&gt;...&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt; &lt;span class="n"&gt;batch_size&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mi"&gt;64&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;shuffle&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="bp"&gt;True&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="o"&gt;...&lt;/span&gt;
&lt;span class="n"&gt;model&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;CNN&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
&lt;span class="n"&gt;optimizer&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;optim&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;SGD&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;model&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;parameters&lt;/span&gt;&lt;span class="p"&gt;(),&lt;/span&gt; &lt;span class="n"&gt;lr&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mf"&gt;0.01&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;momentum&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mf"&gt;0.5&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;epoch&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="nb"&gt;range&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;10&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
    &lt;span class="n"&gt;train&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;model&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;device&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;train_loader&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;optimizer&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;epoch&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/pre&gt;&lt;/div&gt;


&lt;p&gt;However, &lt;strong&gt;this will not work&lt;/strong&gt;! Nothing will show up in MLFlow after we run the above code.&lt;/p&gt;
&lt;p&gt;Why not? This is classic Pytorch code, where we control the training loop, in the approx. 10 lines of code in &lt;code&gt;train()&lt;/code&gt;. MLFlow can't do any automatic logging, because it doesn't know where we're at in the training (or testing), so it doesn't know when or how to measure and emit metrics. For this, we need to transform the code so it's closer to Scikit Learn's &lt;code&gt;fit()&lt;/code&gt; and &lt;code&gt;predict()&lt;/code&gt; pattern.&lt;/p&gt;
&lt;p&gt;This is what &lt;a href="https://www.pytorchlightning.ai/"&gt;Pytorch Lightning&lt;/a&gt; gives us! Alternatively, if you actually read the linked MLFlow for Pytorch documentation, it states that automatic logging only works with Pytorch Lightning 😊&lt;/p&gt;
&lt;h2&gt;Autologging for Pytorch Lightning code&lt;/h2&gt;
&lt;p&gt;Let's refactor the above code to use Pytorch Lightning:&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;span class="k"&gt;class&lt;/span&gt; &lt;span class="nc"&gt;MNISTModel&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;pl&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;LightningModule&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
    &lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="fm"&gt;__init__&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="bp"&gt;self&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
        &lt;span class="c1"&gt;# same as in CNN&lt;/span&gt;
        &lt;span class="nb"&gt;super&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;MNISTModel&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="bp"&gt;self&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="fm"&gt;__init__&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
        &lt;span class="bp"&gt;self&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;conv1&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;nn&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Conv2d&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;20&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;5&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
        &lt;span class="bp"&gt;self&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;conv2&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;nn&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Conv2d&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;20&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;50&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;5&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
        &lt;span class="bp"&gt;self&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;fc1&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;nn&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Linear&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;4&lt;/span&gt;&lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="mi"&gt;4&lt;/span&gt;&lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="mi"&gt;50&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;500&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
        &lt;span class="bp"&gt;self&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;fc2&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;nn&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Linear&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;500&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;10&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

    &lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;forward&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="bp"&gt;self&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;x&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
        &lt;span class="c1"&gt;# same as in CNN&lt;/span&gt;
        &lt;span class="n"&gt;x&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;F&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;relu&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="bp"&gt;self&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;conv1&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;x&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;
        &lt;span class="n"&gt;x&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;F&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;max_pool2d&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;x&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;2&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;2&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
        &lt;span class="n"&gt;x&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;F&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;relu&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="bp"&gt;self&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;conv2&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;x&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;
        &lt;span class="n"&gt;x&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;F&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;max_pool2d&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;x&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;2&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;2&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
        &lt;span class="n"&gt;x&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;x&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;view&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="o"&gt;-&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;4&lt;/span&gt;&lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="mi"&gt;4&lt;/span&gt;&lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="mi"&gt;50&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
        &lt;span class="n"&gt;x&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;F&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;relu&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="bp"&gt;self&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;fc1&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;x&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;
        &lt;span class="n"&gt;x&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="bp"&gt;self&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;fc2&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;x&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
        &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;F&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;log_softmax&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;x&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;dim&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

    &lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;training_step&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="bp"&gt;self&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;batch&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;batch_nb&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
        &lt;span class="n"&gt;x&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;y&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;batch&lt;/span&gt;
        &lt;span class="n"&gt;loss&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;F&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;nll_loss&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="bp"&gt;self&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;x&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt; &lt;span class="n"&gt;y&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
        &lt;span class="n"&gt;acc&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;accuracy&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;loss&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;y&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
        &lt;span class="bp"&gt;self&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;log&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s2"&gt;&amp;quot;train_loss&amp;quot;&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;loss&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;on_epoch&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="bp"&gt;True&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
        &lt;span class="bp"&gt;self&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;log&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s2"&gt;&amp;quot;acc&amp;quot;&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;acc&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;on_epoch&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="bp"&gt;True&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
        &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;loss&lt;/span&gt;

    &lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;configure_optimizers&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="bp"&gt;self&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
        &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;torch&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;optim&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;SGD&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="bp"&gt;self&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;parameters&lt;/span&gt;&lt;span class="p"&gt;(),&lt;/span&gt; &lt;span class="n"&gt;lr&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mf"&gt;0.01&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;momentum&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mf"&gt;0.5&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="n"&gt;mnist_model&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;MNISTModel&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
&lt;span class="n"&gt;train_ds&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;MNIST&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;os&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;getcwd&lt;/span&gt;&lt;span class="p"&gt;(),&lt;/span&gt; &lt;span class="n"&gt;train&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="bp"&gt;True&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;download&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="bp"&gt;True&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;transform&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;transforms&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;ToTensor&lt;/span&gt;&lt;span class="p"&gt;())&lt;/span&gt;
&lt;span class="n"&gt;train_loader&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;DataLoader&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;train_ds&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;batch_size&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mi"&gt;32&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="n"&gt;trainer&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;pl&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Trainer&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;max_epochs&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mi"&gt;2&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;progress_bar_refresh_rate&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="k"&gt;with&lt;/span&gt; &lt;span class="n"&gt;mlflow&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;start_run&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt; &lt;span class="k"&gt;as&lt;/span&gt; &lt;span class="n"&gt;run&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
    &lt;span class="n"&gt;trainer&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;fit&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;mnist_model&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;train_loader&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/pre&gt;&lt;/div&gt;


&lt;p&gt;In the Pytorch Lightning structure, it's now obvious when training starts and ends, and what a training step is. &lt;strong&gt;MLFlow autologging works.&lt;/strong&gt; Note that logging of the metrics is not "free": we have to call &lt;code&gt;self.log()&lt;/code&gt; in &lt;code&gt;training_step()&lt;/code&gt;. This is because the framework doesn't know what kind of metrics make sense for our model (e.g. is it &lt;code&gt;nll_loss()&lt;/code&gt; or &lt;code&gt;cross_entropy()&lt;/code&gt;) and how to compute them.&lt;/p&gt;
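For intuition, the accuracy metric logged in training_step() reduces to the fraction of samples whose highest-scoring class matches the label. A dependency-free sketch, with plain lists standing in for the tensor-based metric function:

```python
# accuracy: for each row of class scores, take the argmax as the
# prediction, then return the fraction of predictions matching the labels.
def accuracy(logits, y):
    preds = [row.index(max(row)) for row in logits]
    correct = sum(p == label for p, label in zip(preds, y))
    return correct / len(y)

logits = [[0.1, 0.9],   # predicts class 1 (correct)
          [0.8, 0.2],   # predicts class 0 (correct)
          [0.3, 0.7]]   # predicts class 1 (wrong)
y = [1, 0, 0]
print(accuracy(logits, y))  # 2 of 3 correct
```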
&lt;p&gt;So what gets logged? The parameters of the model:&lt;/p&gt;
&lt;p&gt;&lt;img src="/images/mlflow_pytorch_params.png" alt="MLFlow Pytorch parameters example." style="width: 300px;"/&gt;&lt;/p&gt;
&lt;p&gt;Various metrics, such as accuracy and loss are logged for each epoch:&lt;/p&gt;
&lt;p&gt;&lt;img src="/images/mlflow_pytorch_metrics.png" alt="MLFlow Pytorch metrics example." style="width: 300px;"/&gt;&lt;/p&gt;
&lt;p&gt;Showing loss per training epoch:&lt;/p&gt;
&lt;p&gt;&lt;img src="/images/mlflow_pytorch_loss.png" alt="MLFlow Pytorch loss example." style="width: 600px;"/&gt;&lt;/p&gt;
&lt;p&gt;Various artifacts, similar to the Scikit Learn case, such as the environment description and the pickled model:&lt;/p&gt;
&lt;p&gt;&lt;img src="/images/mlflow_pytorch_artifacts.png" alt="MLFlow Pytorch artifacts example." style="width: 600px;"/&gt;&lt;/p&gt;
&lt;h2&gt;Conclusion&lt;/h2&gt;
&lt;p&gt;The automatic logging capabilities are still experimental, but it's a great start. The logging only works with Pytorch Lightning, which is a great excuse to refactor classic Pytorch code to use Lightning. I will definitely turn on automatic logging for my production Pytorch (Lightning) models.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Happy MLFlowing!&lt;/strong&gt;&lt;/p&gt;</content><category term="mlflow"></category><category term="tracking"></category></entry><entry><title>Automatic MLFlow logging for Scikit Learn</title><link href="https://bytepawn.com/automatic-mlflow-logging-for-scikit-learn.html" rel="alternate"></link><published>2021-01-15T00:00:00+01:00</published><updated>2021-01-15T00:00:00+01:00</updated><author><name>Marton Trencseni</name></author><id>tag:bytepawn.com,2021-01-15:/automatic-mlflow-logging-for-scikit-learn.html</id><summary type="html">&lt;p&gt;I explore the automatic logging capabilities of MLFlow for Scikit Learn. In the process I found a bug in MLFlow, reported it and wrote a pull request to fix it.&lt;br/&gt;&lt;br/&gt; &lt;img src="/images/mlflow_scatter.png" alt="MLFlow scatter plot." style="width: 400px;"/&gt;&lt;/p&gt;</summary><content type="html">&lt;h2&gt;Introduction&lt;/h2&gt;
&lt;p&gt;The &lt;a href="http://bytepawn.com/getting-started-with-mlflow.html#getting-started-with-mlflow"&gt;previous post was about getting MLFlow up and running and emitting some simple logs&lt;/a&gt; using &lt;code&gt;log_artifact()&lt;/code&gt;, &lt;code&gt;log_param()&lt;/code&gt;, &lt;code&gt;log_metric()&lt;/code&gt;. In this post, I will look at the automatic logging capabilities of MLFlow, specifically for Scikit Learn. The relevant &lt;a href="https://mlflow.org/docs/latest/tracking.html#scikit-learn-experimental"&gt;documentation is here&lt;/a&gt;. As the documentation states, this is an experimental feature, and in fact while writing this blog post I ran into a bug. If you're curious, read on. &lt;a href="https://github.com/mtrencseni/playground/blob/master/Automatic%20MLFlow%20logging%20for%20SKL.ipynb"&gt;The source code is up on Github.&lt;/a&gt;&lt;/p&gt;
&lt;h2&gt;Autologging for regression&lt;/h2&gt;
&lt;p&gt;First, we have to turn on MLFlow itself and the autologging:&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="nn"&gt;mlflow&lt;/span&gt;
&lt;span class="n"&gt;mlflow&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;set_tracking_uri&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;http://127.0.0.1:5000&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;  &lt;span class="c1"&gt;# set up connection&lt;/span&gt;
&lt;span class="n"&gt;mlflow&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;set_experiment&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;test-experiment&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;          &lt;span class="c1"&gt;# set the experiment&lt;/span&gt;
&lt;span class="n"&gt;mlflow&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;sklearn&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;autolog&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
&lt;/pre&gt;&lt;/div&gt;


&lt;p&gt;Let's start with a regression model and see what kind of logging we get from MLFlow:&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="nn"&gt;numpy&lt;/span&gt; &lt;span class="kn"&gt;as&lt;/span&gt; &lt;span class="nn"&gt;np&lt;/span&gt;
&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="nn"&gt;sklearn.linear_model&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;LinearRegression&lt;/span&gt;

&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;make_data&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;num_points&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
    &lt;span class="n"&gt;x&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;np&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;random&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;rand&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;num_points&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="n"&gt;e&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;np&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;random&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;rand&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;num_points&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;/&lt;/span&gt; &lt;span class="mf"&gt;10.0&lt;/span&gt;
    &lt;span class="n"&gt;y&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;x&lt;/span&gt; &lt;span class="o"&gt;+&lt;/span&gt; &lt;span class="n"&gt;e&lt;/span&gt;
    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;x&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;y&lt;/span&gt;

&lt;span class="n"&gt;num_points&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;1000&lt;/span&gt;
&lt;span class="k"&gt;with&lt;/span&gt; &lt;span class="n"&gt;mlflow&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;start_run&lt;/span&gt;&lt;span class="p"&gt;():&lt;/span&gt;
    &lt;span class="n"&gt;x&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;y&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;make_data&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;num_points&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="n"&gt;model&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;LinearRegression&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
    &lt;span class="n"&gt;model&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;fit&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;x&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;y&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/pre&gt;&lt;/div&gt;


&lt;p&gt;MLFlow logs the following &lt;strong&gt;parameters&lt;/strong&gt;:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;&lt;code&gt;copy_X&lt;/code&gt;&lt;/li&gt;
&lt;li&gt;&lt;code&gt;fit_intercept&lt;/code&gt;&lt;/li&gt;
&lt;li&gt;&lt;code&gt;n_jobs&lt;/code&gt;&lt;/li&gt;
&lt;li&gt;&lt;code&gt;normalize&lt;/code&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;These are the parameters of the &lt;a href="https://scikit-learn.org/stable/modules/generated/sklearn.linear_model.LinearRegression.html"&gt;&lt;code&gt;LinearRegression()&lt;/code&gt; constructor&lt;/a&gt;. Since I didn't specify anything, the logged values are the default values.&lt;/p&gt;
&lt;p&gt;MLFlow logs the following &lt;strong&gt;training metrics&lt;/strong&gt;, computed at the end of &lt;code&gt;model.fit()&lt;/code&gt;:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;&lt;code&gt;training_mae&lt;/code&gt;&lt;/li&gt;
&lt;li&gt;&lt;code&gt;training_mse&lt;/code&gt;&lt;/li&gt;
&lt;li&gt;&lt;code&gt;training_r2_score&lt;/code&gt;&lt;/li&gt;
&lt;li&gt;&lt;code&gt;training_rmse&lt;/code&gt;&lt;/li&gt;
&lt;li&gt;&lt;code&gt;training_score&lt;/code&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;&lt;img src="/images/mlflow_lr_params.png" alt="MLFlow LinearRegression params." style="width: 600px;"/&gt;&lt;/p&gt;
&lt;p&gt;Additionally, MLFlow logs the following &lt;strong&gt;artifacts&lt;/strong&gt;:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;&lt;code&gt;MLModel&lt;/code&gt;: MLFlow's environment descriptor, contains things like the Python and the SKL version&lt;/li&gt;
&lt;li&gt;&lt;code&gt;conda.yaml&lt;/code&gt;: another env descriptor, this one for Conda&lt;/li&gt;
&lt;li&gt;&lt;code&gt;model.pkl&lt;/code&gt;: the pickled model object&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;However, there are some additional artifacts that make sense even for such a toy project:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;a scatterplot showing the data and the fit line&lt;/li&gt;
&lt;li&gt;the source code of the ipython notebook: for this we need some ipython magic, see the &lt;a href="https://github.com/mtrencseni/playground/blob/master/Automatic%20MLFlow%20logging%20for%20SKL.ipynb"&gt;notebook for how to accomplish this&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;&lt;img src="/images/mlflow_lr_artifacts.png" alt="MLFlow LinearRegression artifacts." style="width: 300px;"/&gt;&lt;/p&gt;
&lt;p&gt;The improved version:&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;plot&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;x&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;y&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;model&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
    &lt;span class="n"&gt;plt&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;figure&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;figsize&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;10&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;10&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;
    &lt;span class="n"&gt;plt&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;scatter&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;x&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;y&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="mi"&gt;2&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;&lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="nb"&gt;len&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;x&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;
    &lt;span class="n"&gt;y_pred&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;model&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;predict&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;x&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="n"&gt;plt&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;plot&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;x&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;y_pred&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;color&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;red&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;linewidth&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mi"&gt;3&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;mlflow_log&lt;/span&gt;&lt;span class="p"&gt;():&lt;/span&gt;
    &lt;span class="n"&gt;filename&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="s1"&gt;&amp;#39;scatter.png&amp;#39;&lt;/span&gt;
    &lt;span class="n"&gt;plt&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;savefig&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;filename&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="n"&gt;mlflow&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;log_artifact&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;filename&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s1"&gt;&amp;#39;scatter plot with fit line&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="n"&gt;mlflow&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;log_artifact&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;nb_full_path&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s1"&gt;&amp;#39;notebook source&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; 

&lt;span class="n"&gt;num_points&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;1000&lt;/span&gt;
&lt;span class="k"&gt;with&lt;/span&gt; &lt;span class="n"&gt;mlflow&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;start_run&lt;/span&gt;&lt;span class="p"&gt;():&lt;/span&gt;
    &lt;span class="n"&gt;x&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;y&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;make_data&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;num_points&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="n"&gt;model&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;LinearRegression&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
    &lt;span class="n"&gt;model&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;fit&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;x&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;y&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="n"&gt;plot&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;x&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;y&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;model&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="n"&gt;mlflow_log&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
&lt;/pre&gt;&lt;/div&gt;


&lt;p&gt;This now also logs the scatter plot and the source:&lt;/p&gt;
&lt;p&gt;&lt;img src="/images/mlflow_scatter.png" alt="MLFlow scatter plot." style="width: 600px;"/&gt;&lt;/p&gt;
&lt;h2&gt;Autologging for classification&lt;/h2&gt;
&lt;p&gt;Let's set up a similar toy problem for classification:&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="nn"&gt;numpy&lt;/span&gt; &lt;span class="kn"&gt;as&lt;/span&gt; &lt;span class="nn"&gt;np&lt;/span&gt;
&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="nn"&gt;matplotlib&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;pyplot&lt;/span&gt; &lt;span class="k"&gt;as&lt;/span&gt; &lt;span class="n"&gt;plt&lt;/span&gt;
&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="nn"&gt;sklearn.linear_model&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;LogisticRegression&lt;/span&gt;

&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;make_data&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;num_points&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
    &lt;span class="n"&gt;x&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;np&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;random&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;rand&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;num_points&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="n"&gt;y&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;np&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;random&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;rand&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;num_points&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="n"&gt;y&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;np&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;ravel&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;np&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;round&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;y&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;
    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;x&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;y&lt;/span&gt;

&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;plot&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;x&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;y&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;model&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
    &lt;span class="n"&gt;plt&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;figure&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;figsize&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;10&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;10&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;
    &lt;span class="n"&gt;plt&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;scatter&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;x&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;y&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="mi"&gt;2&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;&lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="nb"&gt;len&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;x&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;
    &lt;span class="n"&gt;y_pred&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;model&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;predict&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;x&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="n"&gt;plt&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;scatter&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;x&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;y_pred&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="mi"&gt;2&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;&lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="nb"&gt;len&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;x&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt; &lt;span class="n"&gt;color&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;red&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="n"&gt;num_points&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;1000&lt;/span&gt;
&lt;span class="k"&gt;with&lt;/span&gt; &lt;span class="n"&gt;mlflow&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;start_run&lt;/span&gt;&lt;span class="p"&gt;():&lt;/span&gt;
    &lt;span class="n"&gt;x&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;y&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;make_data&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;num_points&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="n"&gt;model&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;LogisticRegression&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
    &lt;span class="n"&gt;model&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;fit&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;x&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;y&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="n"&gt;plot&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;x&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;y&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;model&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/pre&gt;&lt;/div&gt;


&lt;p&gt;Similar to the regression case, MLFlow logs the parameters of the model's constructor, and the following metrics of the &lt;code&gt;fit()&lt;/code&gt; training:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;&lt;code&gt;training_accuracy_score&lt;/code&gt;&lt;/li&gt;
&lt;li&gt;&lt;code&gt;training_f1_score&lt;/code&gt;&lt;/li&gt;
&lt;li&gt;&lt;code&gt;training_log_loss&lt;/code&gt;&lt;/li&gt;
&lt;li&gt;&lt;code&gt;training_precision_score&lt;/code&gt;&lt;/li&gt;
&lt;li&gt;&lt;code&gt;training_recall_score&lt;/code&gt;&lt;/li&gt;
&lt;li&gt;&lt;code&gt;training_score&lt;/code&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;&lt;img src="/images/mlflow_logr_metrics.png" alt="MLFlow scatter plot." style="width: 600px;"/&gt;&lt;/p&gt;
&lt;p&gt;This time, we also get some nice artifact charts for free from MLFlow:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;&lt;code&gt;training_confusion_matrix.png&lt;/code&gt;&lt;/li&gt;
&lt;li&gt;&lt;code&gt;training_precision_recall_curve.png&lt;/code&gt;&lt;/li&gt;
&lt;li&gt;&lt;code&gt;training_roc_curve.png&lt;/code&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;&lt;img src="/images/mlflow_logr_artifacts.png" alt="MLFlow scatter plot." style="width: 300px;"/&gt;&lt;/p&gt;
&lt;p&gt;However, this is buggy! There's a bug in the MLFlow code, where it passes &lt;code&gt;y_pred&lt;/code&gt; instead of &lt;code&gt;y_true&lt;/code&gt; to the SKL drawing functions. Because of this, the above 3 artifacts make it seem like the model has perfectly learned the training set. For example, &lt;code&gt;training_confusion_matrix.png&lt;/code&gt; always looks like this:&lt;/p&gt;
&lt;p&gt;&lt;img src="/images/mlflow_training_confusion_matrix.png" alt="MLFlow scatter plot." style="width: 600px;"/&gt;&lt;/p&gt;
&lt;p&gt;So I filed a &lt;a href="https://github.com/mlflow/mlflow/issues/3991"&gt;bug report&lt;/a&gt;, and then a &lt;a href="https://github.com/mlflow/mlflow/pull/3993"&gt;pull request&lt;/a&gt; to fix the bug.&lt;/p&gt;
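&lt;p&gt;The effect of the bug is easy to reproduce directly with Scikit Learn. The sketch below uses random toy data like the examples above; passing the predictions as the ground truth always produces a perfectly diagonal confusion matrix, regardless of what the model actually learned:&lt;/p&gt;

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import confusion_matrix

np.random.seed(0)
x = np.random.rand(1000, 1)
y = np.ravel(np.round(np.random.rand(1000, 1)))

model = LogisticRegression()
model.fit(x, y)
y_pred = model.predict(x)

# correct: ground truth vs. predictions
cm_correct = confusion_matrix(y, y_pred, labels=[0, 1])
# the bug: predictions compared against themselves, always perfectly diagonal
cm_buggy = confusion_matrix(y_pred, y_pred, labels=[0, 1])
print(cm_correct)
print(cm_buggy)
```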
&lt;h2&gt;Conclusion&lt;/h2&gt;
&lt;p&gt;Clearly, the automatic logging capabilities are still experimental, but it's a great start. I will definitely turn on automatic logging for my production Scikit Learn models.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Happy MLFlowing!&lt;/strong&gt;&lt;/p&gt;</content><category term="mlflow"></category><category term="tracking"></category></entry><entry><title>Getting Started with MLFlow</title><link href="https://bytepawn.com/getting-started-with-mlflow.html" rel="alternate"></link><published>2021-01-10T00:00:00+01:00</published><updated>2021-01-10T00:00:00+01:00</updated><author><name>Marton Trencseni</name></author><id>tag:bytepawn.com,2021-01-10:/getting-started-with-mlflow.html</id><summary type="html">&lt;p&gt;For the last few months I’ve been using MFlow in production, specifically its Tracking component. MLFlow is an open source project for lifecycle tracking and serving of ML models, coming out of Databricks. MLFlow is model agnostic, so you can use with SKLearn, XGBoost, Pytorch, Tensorflow, FBProphet, anything.&lt;br/&gt;&lt;br/&gt; &lt;img src="/images/mlflow-overview.png" alt="MLFlow overview" style="width: 400px;"/&gt;&lt;/p&gt;</summary><content type="html">&lt;h2&gt;Introduction&lt;/h2&gt;
&lt;p&gt;For the last few months I’ve been using &lt;a href="https://github.com/mlflow/mlflow"&gt;MLFlow&lt;/a&gt; in production, specifically its &lt;a href="https://www.mlflow.org/docs/latest/tracking.html"&gt;Tracking&lt;/a&gt; component. MLFlow is an open source project for lifecycle tracking and serving of ML models, coming out of &lt;a href="https://databricks.com/"&gt;Databricks&lt;/a&gt;. MLFlow is model agnostic, so you can use it with SKLearn, XGBoost, Pytorch, Tensorflow, FBProphet, anything. The best way to get an initial understanding of MLFlow Tracking is to watch the &lt;a href="https://www.youtube.com/watch?v=859OxXrt_TI"&gt;first 25 minutes of this talk by Corey Zumar&lt;/a&gt;, who works on MLFlow at Databricks.&lt;/p&gt;
&lt;p&gt;&lt;img src="/images/mlflow-overview.png" alt="MLFlow overview." style="width: 600px;"/&gt;&lt;/p&gt;
&lt;p&gt;MLFlow Tracking is a simple client/server solution to log, track and share model metrics at scale. Using MLFlow makes sense both during the development of the model and also once the model is running in production.&lt;/p&gt;
&lt;h2&gt;Using MLFlow for model development&lt;/h2&gt;
&lt;p&gt;In this scenario, I’m using an ipython notebook and iterating on a model. I’m either reading training data from files or straight from the database, training models, evaluating on test data, rinse and repeat, potentially 100s of times, over days, weeks or months. With MLFlow, I can continuously track various metrics and artifacts of my model runs, such as:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;the current source code for the model training/running&lt;/li&gt;
&lt;li&gt;size/shape of the data I'm using for training&lt;/li&gt;
&lt;li&gt;various metrics and distributions of the training and test data&lt;/li&gt;
&lt;li&gt;model runtime and memory usage&lt;/li&gt;
&lt;li&gt;goodness metrics such as accuracy, f1 score, precision, recall, TPR, FPR, etc.&lt;/li&gt;
&lt;li&gt;various plots such as ROC&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;With MLFlow I achieve a kind of raw versioning of my model development.&lt;/p&gt;
&lt;h2&gt;Using MLFlow in production&lt;/h2&gt;
&lt;p&gt;In production can mean several things:&lt;/p&gt;
&lt;ol&gt;
&lt;li&gt;Some models are batchy, and are &lt;strong&gt;re-trained at pre-specified intervals&lt;/strong&gt; (e.g. hourly, daily or weekly) on new data, and then &lt;strong&gt;emit their predictions for existing data into a table&lt;/strong&gt;, which is then read out by other processes.&lt;/li&gt;
&lt;li&gt;Other models are re-trained at regular intervals like above, but then &lt;strong&gt;serve prediction requests in real-time&lt;/strong&gt;, e.g. over an HTTP API.&lt;/li&gt;
&lt;li&gt;In advanced use-cases models are also updated in real-time as new training data comes in, but this is not typical outside of advanced Internet companies.&lt;/li&gt;
&lt;/ol&gt;
&lt;p&gt;MLFlow Tracking is agnostic to the above, so it can be used in all scenarios. Personally I'm using it in scenario #1 use-cases so far, from Airflow.&lt;/p&gt;
&lt;h2&gt;Setting up MLFlow and the artifact gotcha&lt;/h2&gt;
&lt;p&gt;Setting up MLFlow is easy. Just create a new &lt;code&gt;venv&lt;/code&gt; and &lt;code&gt;pip install mlflow&lt;/code&gt;. You will need to &lt;code&gt;pip install mlflow&lt;/code&gt; on both the server and all the clients that will be emitting tracking metrics to MLFlow. Remember that you can &lt;code&gt;!pip install mlflow&lt;/code&gt; from ipython notebooks.&lt;/p&gt;
&lt;p&gt;Now, we can start the server, like:&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;mlflow server --backend-store-uri XXX --default-artifact-root YYY -h 0.0.0.0 -p 5000
&lt;/pre&gt;&lt;/div&gt;


&lt;p&gt;What is &lt;code&gt;backend-store-uri&lt;/code&gt; and &lt;code&gt;default-artifact-root&lt;/code&gt;?&lt;/p&gt;
&lt;p&gt;We will use MLFlow for storing metrics like accuracy, essentially in a &lt;strong&gt;flat key-value manner. It will store these in the backend store&lt;/strong&gt;, which can be a filesystem location or a database like Postgresql. We will also send files to MLFlow for tracking, such as the source code currently running, or a plot we generated (e.g. an ROC curve). These will be &lt;strong&gt;sent and stored as files under the artifact root&lt;/strong&gt;, which has to be a filesystem or an S3 bucket.&lt;/p&gt;
&lt;p&gt;The big gotcha is that the MLFlow client library:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;when sending metrics, it sends those to the MLFlow server, which stores them in the backend store; the client does not access the backend store directly&lt;/li&gt;
&lt;li&gt;when sending artifacts, the MLFlow client library puts them directly into the artifact store, &lt;strong&gt;so both the client and the server need to have access to it!&lt;/strong&gt; This is very unexpected because it goes against basic client/server principles. For this reason, although you can start up MLFlow by specifying a directory on the server as the artifact root, client calls to save artifacts will fail. &lt;strong&gt;You need to specify something like an S3 bucket, and both the client and the server need an AWS access key to read/write!&lt;/strong&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;I have set up MLFlow to use the server's local filesystem for backend store and a shared (between server and clients) S3 bucket for artifact storage.&lt;/p&gt;
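&lt;p&gt;Putting it together, this setup's invocation looks something like the following. The backend directory and bucket name are placeholders, substitute your own; both the server and the clients need AWS credentials for the bucket:&lt;/p&gt;

```
mlflow server \
    --backend-store-uri ./mlruns \
    --default-artifact-root s3://my-mlflow-artifacts/ \
    -h 0.0.0.0 -p 5000
```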
&lt;p&gt;Once the server is up and running, you can open it in your browser, you should see an empty page like this:&lt;/p&gt;
&lt;p&gt;&lt;img src="/images/mlflow0.png" alt="Empty MLFlow page." style="width: 600px;"/&gt;&lt;/p&gt;
&lt;h2&gt;Sending tracking data to MLFlow&lt;/h2&gt;
&lt;p&gt;Now we can send tracking data to the MLFlow server. Fundamentally the steps are:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;set up a &lt;strong&gt;connection&lt;/strong&gt;&lt;/li&gt;
&lt;li&gt;set the &lt;strong&gt;experiment&lt;/strong&gt;; if the experiment doesn’t exist, it will be created automatically (the experiment is essentially the project you’re working on)&lt;/li&gt;
&lt;li&gt;open a new &lt;strong&gt;experiment run&lt;/strong&gt;&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;log&lt;/strong&gt; params, metrics and artifacts&lt;/li&gt;
&lt;li&gt;check the results on the &lt;strong&gt;web console&lt;/strong&gt;!&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;In code:&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="nn"&gt;os&lt;/span&gt;
&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="nn"&gt;mlflow&lt;/span&gt;

&lt;span class="n"&gt;mlflow&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;set_tracking_uri&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;http://127.0.0.1:5000&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;  &lt;span class="c1"&gt;# set up connection&lt;/span&gt;
&lt;span class="n"&gt;mlflow&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;set_experiment&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;test-experiment&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;          &lt;span class="c1"&gt;# set the experiment&lt;/span&gt;
&lt;span class="k"&gt;with&lt;/span&gt; &lt;span class="n"&gt;mlflow&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;start_run&lt;/span&gt;&lt;span class="p"&gt;():&lt;/span&gt;                          &lt;span class="c1"&gt;# open new experiment run&lt;/span&gt;
        &lt;span class="n"&gt;mlflow&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;log_artifact&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;os&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;path&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;abspath&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="vm"&gt;__file__&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt; &lt;span class="s1"&gt;&amp;#39;source code&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
        &lt;span class="n"&gt;mlflow&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;log_param&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;some parameter&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
        &lt;span class="n"&gt;mlflow&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;log_metric&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;accuracy&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mf"&gt;0.9&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/pre&gt;&lt;/div&gt;


&lt;p&gt;The experiment &lt;code&gt;test-experiment&lt;/code&gt; will show up on the web console:&lt;/p&gt;
&lt;p&gt;&lt;img src="/images/mlflow1.png" alt="MLFlow experiment." style="width: 600px;"/&gt;&lt;/p&gt;
&lt;p&gt;Inside the experiment are the &lt;strong&gt;experiment runs&lt;/strong&gt;. Selecting the latest one, we can see our logged data:&lt;/p&gt;
&lt;p&gt;&lt;img src="/images/mlflow2.png" alt="MLFlow experiment contents." style="width: 600px;"/&gt;&lt;/p&gt;
&lt;h2&gt;Conclusion&lt;/h2&gt;
&lt;p&gt;This was a quick introduction to setting up MLFlow and getting some initial data into it. Surprisingly, MLFlow does not offer much more handholding than this. Although there is experimental integration of MLFlow with common ML libraries such as scikit-learn (&lt;a href="https://www.mlflow.org/docs/latest/python_api/mlflow.sklearn.html"&gt;mlflow.sklearn&lt;/a&gt;), Pytorch (&lt;a href="https://www.mlflow.org/docs/latest/python_api/mlflow.pytorch.html"&gt;mlflow.pytorch&lt;/a&gt;) and many others, you will have to log most of the relevant metrics for your model manually, using the functions above: &lt;code&gt;log_artifact()&lt;/code&gt;, &lt;code&gt;log_param()&lt;/code&gt;, &lt;code&gt;log_metric()&lt;/code&gt;. For example, to save an ROC curve, you have to generate the plot using &lt;code&gt;pyplot&lt;/code&gt;, save it to a &lt;code&gt;.png&lt;/code&gt; file, and then save that file as an artifact. In later posts I plan to discuss useful patterns for MLFlow.&lt;/p&gt;
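&lt;p&gt;As a minimal sketch of that ROC pattern, with toy &lt;code&gt;fpr&lt;/code&gt;/&lt;code&gt;tpr&lt;/code&gt; values and a hypothetical &lt;code&gt;roc.png&lt;/code&gt; path standing in for real model output (the &lt;code&gt;mlflow.log_artifact()&lt;/code&gt; call is left as a comment, since it assumes an active run against a tracking server):&lt;/p&gt;

```python
import matplotlib
matplotlib.use('Agg')  # render without a display
import matplotlib.pyplot as plt

# Toy ROC points; in practice compute these from your model's scores,
# e.g. with sklearn.metrics.roc_curve(y_true, y_score)
fpr = [0.0, 0.1, 0.3, 1.0]
tpr = [0.0, 0.6, 0.9, 1.0]

plt.plot(fpr, tpr, label='model')
plt.plot([0, 1], [0, 1], linestyle='--', label='random')
plt.xlabel('False positive rate')
plt.ylabel('True positive rate')
plt.legend()
plt.savefig('roc.png')  # write the plot to a .png file

# inside a `with mlflow.start_run():` block, this would save the plot:
# mlflow.log_artifact('roc.png')
```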
&lt;p&gt;&lt;strong&gt;Happy MLFlowing!&lt;/strong&gt;&lt;/p&gt;</content><category term="mlflow"></category><category term="tracking"></category></entry><entry><title>Making statistics lie for the 2020 Presidential election</title><link href="https://bytepawn.com/making-statistics-lie-for-the-2020-presidential-election.html" rel="alternate"></link><published>2020-12-17T00:00:00+01:00</published><updated>2020-12-17T00:00:00+01:00</updated><author><name>Marton Trencseni</name></author><id>tag:bytepawn.com,2020-12-17:/making-statistics-lie-for-the-2020-presidential-election.html</id><summary type="html">&lt;p&gt;After the 2020 US presidential election, the Trump campaign filed over 50 lawsuits and attacked the integrity of the elections by claiming there was voter fraud. One of the last lawsuits was filed in the Supreme Court of the United States by the state of Texas. Here I look at the statistical claims made in this lawsuit that were supposed to show irregularities in the Georgia vote.&lt;br/&gt;&lt;br/&gt; &lt;img src="/images/trump-biden.png" alt="Trump vs Biden" style="width: 400px;"/&gt;&lt;/p&gt;</summary><content type="html">&lt;h2&gt;Introduction&lt;/h2&gt;
&lt;p&gt;After the 2020 US presidential election, &lt;a href="https://www.forbes.com/sites/alisondurkee/2020/12/08/trump-and-the-gop-have-now-lost-50-post-election-lawsuits/"&gt;the Trump campaign filed over 50 lawsuits and attacked the integrity of the elections by claiming there was voter fraud&lt;/a&gt;. One of the last lawsuits was filed in the Supreme Court of the United States (SCOTUS) by the state of Texas. Later, more than 100 Republican congressmen joined the lawsuit in support. However, just a few days after filing, &lt;a href="https://apnews.com/article/election-2020-joe-biden-donald-trump-lawsuits-elections-fe784086a81497dcfe38f0eb5bffd36d"&gt;SCOTUS rejected the lawsuit&lt;/a&gt; because &lt;em&gt;"Texas has not demonstrated a judicially cognizable interest in the manner in which another state conducts its elections."&lt;/em&gt; In other words, SCOTUS chose not to hear the lawsuit because the state of Texas has no business questioning how other states run their elections, but without opining on the merits of the lawsuit. &lt;strong&gt;So did the lawsuit have merit?&lt;/strong&gt;&lt;/p&gt;
&lt;p&gt;However, it was &lt;a href="https://www.bloomberg.com/news/articles/2020-12-11/understanding-that-one-in-a-quadrillion-claim-about-the-election"&gt;widely reported&lt;/a&gt; that the lawsuit contained flawed statistical arguments. Having said that, note that the lawsuit contained many claims and arguments; the statistical one is just one of them.&lt;/p&gt;
&lt;p&gt;The statistical claims can be summarized as:&lt;/p&gt;
&lt;ol&gt;
&lt;li&gt;Given the 2016 results of the &lt;strong&gt;Hillary Clinton - Donald Trump&lt;/strong&gt; race in Georgia, the 2020 &lt;strong&gt;Joe Biden - Donald Trump&lt;/strong&gt; results are very unlikely (and hence suspicious).&lt;/li&gt;
&lt;li&gt;Given the early lead of Trump in Georgia in 2020, the final win of Biden is very unlikely (and hence suspicious).&lt;/li&gt;
&lt;/ol&gt;
&lt;p&gt;The summary of this portion of the lawsuit is reproduced below (&lt;a href="https://www.supremecourt.gov/DocketPDF/22/22O155/163048/20201208132827887_TX-v-State-ExpedMot%202020-12-07%20FINAL.pdf"&gt;source, page 8&lt;/a&gt;):&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;The same less than one in a quadrillion statistical improbability of Mr. Biden winning the popular vote in the four Defendant States—Georgia, Michigan, Pennsylvania, and Wisconsin—independently exists when Mr. Biden’s performance in each of those Defendant States is compared to former Secretary of State Hilary Clinton’s performance in the 2016 general election and President Trump’s performance in the 2016 and 2020 general elections. Again, the statistical improbability of Mr. Biden winning the popular vote in these four States collectively is 1 in 1,000,000,000,000,000&lt;sup&gt;5&lt;/sup&gt;.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;And, &lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;The probability of former Vice President Biden winning the popular vote in the four Defendant States—Georgia, Michigan, Pennsylvania, and Wisconsin — independently given President Trump’s early lead in those States as of 3 a.m. on November 4, 2020, is less than one in a quadrillion, or 1 in 1,000,000,000,000,000. For former Vice President Biden to win these four States collectively, the odds of that event happening decrease to less than one in a quadrillion to the fourth power (i.e., 1 in 1,000,000,000,000,000&lt;sup&gt;4&lt;/sup&gt;).&lt;/p&gt;
&lt;/blockquote&gt;
&lt;h1&gt;2016 vs 2020&lt;/h1&gt;
&lt;p&gt;Let's look at the first one. The details start on page 20:&lt;/p&gt;
&lt;p&gt;&lt;img src="/images/scotus1.png" alt="SCOTUS 1" style="width: 600px;"/&gt;&lt;/p&gt;
&lt;p&gt;In plain English, the argument is:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;modeling Georgia voters across 2016 and 2020 with one random variable, with some probability p of voting Democratic (like the flip of a coin, except the coin is not exactly fair)&lt;/li&gt;
&lt;li&gt;what is the probability that this one random variable (coin) produced the results of both 2016 and 2020?&lt;/li&gt;
&lt;li&gt;the probability (p-value) is very small, so it is very unlikely&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;Using the cited numbers we can compute the p-value directly with the &lt;a href="http://bytepawn.com/ab-testing-and-the-chi-squared-test.html"&gt;Chi-squared test&lt;/a&gt;:&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="nn"&gt;scipy.stats&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;chi2_contingency&lt;/span&gt;
&lt;span class="n"&gt;observations&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;
    &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="mi"&gt;1877963&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;2474507&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt;
    &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="mi"&gt;2089104&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;2461836&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt;
&lt;span class="p"&gt;]&lt;/span&gt;
&lt;span class="c1"&gt;# first column in 2016 results, second is 2020 results&lt;/span&gt;
&lt;span class="n"&gt;p&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;chi2_contingency&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;observations&lt;/span&gt;&lt;span class="p"&gt;)[&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;
&lt;span class="k"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;p = &lt;/span&gt;&lt;span class="si"&gt;%.10f&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;&lt;/span&gt; &lt;span class="o"&gt;%&lt;/span&gt; &lt;span class="n"&gt;p&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/pre&gt;&lt;/div&gt;


&lt;p&gt;Prints:&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;0.0000000000
&lt;/pre&gt;&lt;/div&gt;


&lt;p&gt;The p-value is essentially zero, so we reject the null hypothesis: the voters in 2016 and 2020 cannot be modeled by the same random variable. &lt;strong&gt;So does this prove there was fraud?&lt;/strong&gt;&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;No! Voters from 2016 to 2020 cannot be modeled as a random variable!&lt;/strong&gt; Lots of things happened between 2016 and 2020 which affected Georgia voters. After all, they did not vote in 2020 for Biden based on what they thought of Hillary Clinton in 2016, and by 2020 &lt;a href="https://en.wikipedia.org/wiki/Veracity_of_statements_by_Donald_Trump"&gt;they had a lot more information about Donald Trump&lt;/a&gt;.&lt;/p&gt;
&lt;p&gt;The best way to prove this is to look at this historic Republican vs Democrat POTUS election chart for Georgia from the Wikipedia page &lt;a href="https://en.wikipedia.org/wiki/United_States_presidential_elections_in_Georgia"&gt;United States presidential elections in Georgia&lt;/a&gt;:&lt;/p&gt;
&lt;p&gt;&lt;img src="/images/scotus-georgia.png" alt="Georgia POTUS elections" style="width: 800px;"/&gt;&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Does it look stable? No! Voters' preferences change.&lt;/strong&gt; Georgia used to be blue, then it was swingy, then it was mostly Republican, and in 2020 Biden won by a thin margin. But this doesn't mean that there was constant voter fraud over the last 100 years as the results changed.&lt;/p&gt;
&lt;p&gt;Another quantitative way to show the absurdity of the argument is to compare the &lt;a href="https://en.wikipedia.org/wiki/2012_United_States_presidential_election_in_Georgia"&gt;2012&lt;/a&gt; and the &lt;a href="https://en.wikipedia.org/wiki/2016_United_States_presidential_election_in_Georgia"&gt;2016&lt;/a&gt; elections in Georgia. &lt;strong&gt;In both years, Georgia voted Republican, but:&lt;/strong&gt;&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="nn"&gt;scipy.stats&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;chi2_contingency&lt;/span&gt;
&lt;span class="n"&gt;observations&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;
    &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="mi"&gt;1773827&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;1877963&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt;
    &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="mi"&gt;2078688&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;2089104&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt;
&lt;span class="p"&gt;]&lt;/span&gt;
&lt;span class="c1"&gt;# first column in 2012, second is 2016&lt;/span&gt;
&lt;span class="n"&gt;p&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;chi2_contingency&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;observations&lt;/span&gt;&lt;span class="p"&gt;)[&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;
&lt;span class="k"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;p = &lt;/span&gt;&lt;span class="si"&gt;%.10f&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;&lt;/span&gt; &lt;span class="o"&gt;%&lt;/span&gt; &lt;span class="n"&gt;p&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/pre&gt;&lt;/div&gt;


&lt;p&gt;Again, prints:&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;0.0000000000
&lt;/pre&gt;&lt;/div&gt;


&lt;p&gt;Again, the p-value is essentially zero, so the voters in 2012 and 2016 cannot be modeled by the same random variable either. So does this prove there was fraud in 2016, when Trump won? No. It just proves there are hundreds of thousands of Georgians who, from one election to the next:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;enter or exit the voting system (become eligible, move to Georgia, move away, die)&lt;/li&gt;
&lt;li&gt;change their vote&lt;/li&gt;
&lt;li&gt;do or do not vote&lt;/li&gt;
&lt;/ul&gt;
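&lt;p&gt;The near-zero p-values above are largely a sample-size effect: with millions of votes, even a small shift in vote share is statistically "significant". As a sketch, scaling the same 2012 vs 2016 numbers down by a factor of 1,000 (a made-up, smaller electorate) makes the shift indistinguishable from noise:&lt;/p&gt;

```python
from scipy.stats import chi2_contingency

# same 2012 vs 2016 Georgia proportions as above, divided by 1,000
observations = [
    [1774, 1878],
    [2079, 2089],
]
# first column is 2012, second is 2016
p = chi2_contingency(observations)[1]
print('p = %.4f' % p)  # p is well above 0.05 at this sample size
```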
&lt;h1&gt;Early vs late counted votes&lt;/h1&gt;
&lt;p&gt;Here is the longer version of the argument, with computational results, from the lawsuit:&lt;/p&gt;
&lt;p&gt;&lt;img src="/images/scotus2.png" alt="SCOTUS 2" style="width: 600px;"/&gt;&lt;/p&gt;
&lt;p&gt;The problem here is the same as above: votes from different counties and different methods of voting (in-person, mail-in) cannot be modeled by the same random variable. The fact that the result flipped as votes were counted simply reflects that votes are counted faster or slower per county and per method of voting. In most counties, in-person votes were counted first and mail-in votes later, and different counties report their results at different speeds.&lt;/p&gt;
&lt;p&gt;The statistical argument here would only hold if all the ballots in Georgia, from all the counties and all methods:&lt;/p&gt;
&lt;ol&gt;
&lt;li&gt;were first collected&lt;/li&gt;
&lt;li&gt;put in a giant pile&lt;/li&gt;
&lt;li&gt;the pile were thoroughly shuffled&lt;/li&gt;
&lt;li&gt;votes randomly drawn from the pile one by one, and counted&lt;/li&gt;
&lt;/ol&gt;
&lt;p&gt;Similar to how balls are mixed and drawn in a lottery drawing:&lt;/p&gt;
&lt;p&gt;&lt;img src="/images/scotus-lottery.jpg" alt="SCOTUS Lottery" style="width: 600px;"/&gt;&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;In this case, after randomization, it's correct to expect that whatever result we see after the first million votes are counted is representative of the final result&lt;/strong&gt;, and the second million votes counted should yield very similar results as the first million --- assuming our random shuffle truly worked. &lt;strong&gt;But this is not how votes are counted in real life.&lt;/strong&gt;&lt;/p&gt;
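&lt;p&gt;This claim is easy to check with a quick simulation. Below is a sketch using made-up ballot counts of roughly Georgia-2020 proportions, scaled down for speed: after a thorough shuffle, the first and second halves of the counted pile show nearly identical proportions:&lt;/p&gt;

```python
import random

random.seed(0)
# toy ballot pile, roughly Georgia-2020 proportions, scaled down 10x
ballots = [1] * 247_000 + [0] * 246_000  # 1 = Biden, 0 = Trump
random.shuffle(ballots)  # the "giant pile, thoroughly shuffled"

half = len(ballots) // 2
first_half  = sum(ballots[:half]) / half
second_half = sum(ballots[half:]) / (len(ballots) - half)
print('Biden share, first  half counted: %.4f' % first_half)
print('Biden share, second half counted: %.4f' % second_half)
```

The two printed shares agree to within a fraction of a percent, which is exactly what the lawsuit's argument assumes; real counting, which is not shuffled, gives no such guarantee.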
&lt;h1&gt;Conclusion&lt;/h1&gt;
&lt;p&gt;This is a beautiful example of making statistics lie. The use of &lt;code&gt;scipy.stats&lt;/code&gt; above illustrates the point: statistics is like a library function. We put in numbers, it returns numbers, and it's up to us to interpret and explain them. We can make misleading and bogus assumptions and mischaracterize the results, or even put in bogus numbers. Statistics and library functions will never speak up.&lt;/p&gt;</content><category term="ab-testing"></category><category term="trump"></category><category term="politics"></category></entry><entry><title>Comparing conversion at control and treatment sites</title><link href="https://bytepawn.com/comparing-conversion-at-control-and-treatment-sites.html" rel="alternate"></link><published>2020-12-03T00:00:00+01:00</published><updated>2020-12-03T00:00:00+01:00</updated><author><name>Marton Trencseni</name></author><id>tag:bytepawn.com,2020-12-03:/comparing-conversion-at-control-and-treatment-sites.html</id><summary type="html">&lt;p&gt;In real-life, non-digital situations, it's often not feasible to run true A/B tests. In such cases, we can compare before and after rollout conversions at a treatment site, while using a similar control site to measure and correct for seasonality. The post discusses how to compute increasingly correct p-values and bayesian probabilities in such scenarios.&lt;br/&gt;&lt;br/&gt; &lt;img src="/images/ct_site_control_bayes.png" alt="Monte Carlo simulated control lifts" style="width: 400px;"/&gt;&lt;/p&gt;</summary><content type="html">&lt;h2&gt;Introduction&lt;/h2&gt;
&lt;p&gt;In real-life, non-digital situations, it's often not feasible to run true A/B tests. In such cases, we can compare before and after rollout conversions at a treatment site, while using a similar control site to measure and correct for seasonality. The post discusses how to compute increasingly correct p-values and bayesian probabilities in such scenarios.&lt;/p&gt;
&lt;p&gt;Imagine a cinema selling movie tickets and upselling food and beverages, and experimenting with different pricing to see how it affects upsell conversion. For simplicity, let's say non-conversion means a customer buys just a movie ticket, whereas a conversion means a customer buys a movie ticket plus some food and beverages. Since the cinema is a real-life venue (not digital) the pricing is shown on large printed banners, and can't be varied per arriving customer (there may also be legal reasons).&lt;/p&gt;
&lt;p&gt;In such situations, the experiment setup is:&lt;/p&gt;
&lt;p&gt;&lt;img src="/images/ct_site_setup.png" alt="Control site and treatment site setup" style="width: 400px;"/&gt;&lt;/p&gt;
&lt;ol&gt;
&lt;li&gt;&lt;strong&gt;Treatment site:&lt;/strong&gt; roll out the experimental pricing at the treatment site (let's say on Dec-1) and measure the change in conversion before vs. after rollout (let's say for a week).&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Control site:&lt;/strong&gt; don't make any changes, and measure the change in conversion before vs. after rollout. The control lift can be used to estimate and subtract out the "seasonal" change from the treatment effect.&lt;/li&gt;
&lt;/ol&gt;
&lt;p&gt;It's important to note that even when using a control site, we're doing a best-effort estimation of conversion lift. We cannot be sure that any signal we pick up is due to the treatment we applied, and not some external factor. Some issues we face:&lt;/p&gt;
&lt;ol&gt;
&lt;li&gt;The populations of the before and after (and even the control and treatment site) are not distinct: a customer could show up both at the control and the treatment site, both before and after the rollout. This could result in unexpected spill-over effects. If possible, it's best to exclude such customers from the study.&lt;/li&gt;
&lt;li&gt;The treatment site and the control site will never be statistically exactly the same, so any seasonal correction we measure at the control site is only an approximation of the treatment site's seasonality. For example, if a new Bollywood movie comes out after the treatment is applied at the treatment site, and one of the sites draws more people from India, who are more likely to buy food and beverages along with the movie ticket, that will skew the results.&lt;/li&gt;
&lt;li&gt;It's possible that an external event, unrelated to the pricing change, is causing a change at one of the sites after the rollout, and the lift we're measuring is due to this external reason and not the pricing change. For example, there is a big shop next to one of the sites, which closes down during the treatment period and skews the mix of people coming into this site.&lt;/li&gt;
&lt;/ol&gt;
&lt;p&gt;It would be tempting to say that "because of the above", only a strong signal (high lift, low p-value, high bayesian probability) should be accepted. But this is misleading, because external events like the above can also cause a strong signal. The result of such a control vs treatment site analysis should only be accepted after careful checks of the population mixes along various splits, of various metrics, and of other external factors (eg. nearby physical events) indicate that it's safe to proceed with the analysis.&lt;/p&gt;
&lt;p&gt;Having laid out these caveats, the two main questions are:&lt;/p&gt;
&lt;ol&gt;
&lt;li&gt;How can we compute an estimated treatment lift?&lt;/li&gt;
&lt;li&gt;How can we compute an estimated significance (or bayesian probability of "after" being better than "before")?&lt;/li&gt;
&lt;/ol&gt;
&lt;p&gt;&lt;a href="https://github.com/mtrencseni/playground/blob/master/Treatment%20site%20lift%20and%20significance.ipynb"&gt;The code is up on Github.&lt;/a&gt;&lt;/p&gt;
&lt;h2&gt;Experiment setup&lt;/h2&gt;
&lt;p&gt;Let's use the following toy numbers. Our measurements at the control site, before and after the rollout date:&lt;/p&gt;
&lt;p&gt;&lt;img src="/images/ct_site_control_numbers.png" alt="Control site numbers" style="width: 400px;"/&gt;&lt;/p&gt;
&lt;p&gt;At the treatment site, where we actually change the price:&lt;/p&gt;
&lt;p&gt;&lt;img src="/images/ct_site_treatment_numbers_2.png" alt="Treatment site numbers" style="width: 400px;"/&gt;&lt;/p&gt;
&lt;h2&gt;Estimating the treatment lift&lt;/h2&gt;
&lt;p&gt;At the control site, before rollout, conversion was $ 997 / (996+997) = 50 \% $, after rollout $ 1089 / (1004+1089) = 52 \% $, so the control's relative lift in conversion is $ 0.52 / 0.50 - 1 = 4 \% $. Since we didn't make any change at the control site, we assume that this lift is a seasonal lift. Doing the same at the treatment site, the uncorrected, raw relative lift is $ 11.2 \% $. Subtracting out the seasonal lift leaves us with $ 11.2 \% - 4 \% = 7.2 \% $ "real", corrected lift.&lt;/p&gt;
&lt;p&gt;Let's code it up:&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;span class="n"&gt;control_site&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;
    &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="mi"&gt;996&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;1004&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt;
    &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="mi"&gt;997&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;1089&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt;
&lt;span class="p"&gt;]&lt;/span&gt;

&lt;span class="n"&gt;treatment_site&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;
    &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="mi"&gt;888&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;  &lt;span class="mi"&gt;844&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt;
    &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="mi"&gt;911&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;1088&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt;
&lt;span class="p"&gt;]&lt;/span&gt;

&lt;span class="n"&gt;conv_control_before&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;control_site&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;][&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="o"&gt;/&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;control_site&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;][&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="o"&gt;+&lt;/span&gt; &lt;span class="n"&gt;control_site&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;][&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;])&lt;/span&gt;
&lt;span class="n"&gt;conv_control_after&lt;/span&gt;  &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;control_site&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;][&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="o"&gt;/&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;control_site&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;][&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="o"&gt;+&lt;/span&gt; &lt;span class="n"&gt;control_site&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;][&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;])&lt;/span&gt;
&lt;span class="n"&gt;conv_control_lift&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;conv_control_after&lt;/span&gt; &lt;span class="o"&gt;/&lt;/span&gt; &lt;span class="n"&gt;conv_control_before&lt;/span&gt; &lt;span class="o"&gt;-&lt;/span&gt; &lt;span class="mf"&gt;1.0&lt;/span&gt;
&lt;span class="k"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;conversion at control site before = &lt;/span&gt;&lt;span class="si"&gt;%.2f&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;&lt;/span&gt; &lt;span class="o"&gt;%&lt;/span&gt; &lt;span class="n"&gt;conv_control_before&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="k"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;conversion at control site after  = &lt;/span&gt;&lt;span class="si"&gt;%.2f&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;&lt;/span&gt; &lt;span class="o"&gt;%&lt;/span&gt; &lt;span class="n"&gt;conv_control_after&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="k"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;relative conversion lift at control = &lt;/span&gt;&lt;span class="si"&gt;%.3f&lt;/span&gt;&lt;span class="s1"&gt; (assumed to be seasonality)&amp;#39;&lt;/span&gt; &lt;span class="o"&gt;%&lt;/span&gt; &lt;span class="n"&gt;conv_control_lift&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="k"&gt;print&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;

&lt;span class="n"&gt;conv_treatment_before&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;treatment_site&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;][&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="o"&gt;/&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;treatment_site&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;][&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="o"&gt;+&lt;/span&gt; &lt;span class="n"&gt;treatment_site&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;][&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;])&lt;/span&gt;
&lt;span class="n"&gt;conv_treatment_after&lt;/span&gt;  &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;treatment_site&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;][&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="o"&gt;/&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;treatment_site&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;][&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="o"&gt;+&lt;/span&gt; &lt;span class="n"&gt;treatment_site&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;][&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;])&lt;/span&gt;
&lt;span class="n"&gt;conv_treatment_lift&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;conv_treatment_after&lt;/span&gt; &lt;span class="o"&gt;/&lt;/span&gt; &lt;span class="n"&gt;conv_treatment_before&lt;/span&gt; &lt;span class="o"&gt;-&lt;/span&gt; &lt;span class="mf"&gt;1.0&lt;/span&gt;
&lt;span class="n"&gt;conv_treatment_lift_corr&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;conv_treatment_lift&lt;/span&gt; &lt;span class="o"&gt;-&lt;/span&gt; &lt;span class="n"&gt;conv_control_lift&lt;/span&gt;
&lt;span class="k"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;conversion at treatment site before = &lt;/span&gt;&lt;span class="si"&gt;%.2f&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;&lt;/span&gt; &lt;span class="o"&gt;%&lt;/span&gt; &lt;span class="n"&gt;conv_treatment_before&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="k"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;conversion at treatment site after  = &lt;/span&gt;&lt;span class="si"&gt;%.2f&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;&lt;/span&gt; &lt;span class="o"&gt;%&lt;/span&gt; &lt;span class="n"&gt;conv_treatment_after&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="k"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;relative conversion lift at treatment = &lt;/span&gt;&lt;span class="si"&gt;%.3f&lt;/span&gt;&lt;span class="s1"&gt; (including assumed seasonality)&amp;#39;&lt;/span&gt; &lt;span class="o"&gt;%&lt;/span&gt; &lt;span class="n"&gt;conv_treatment_lift&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="k"&gt;print&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;

&lt;span class="k"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;relative conversion lift at treatment with correction (seasonality subtracted out) = &lt;/span&gt;&lt;span class="si"&gt;%.3f&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;&lt;/span&gt; &lt;span class="o"&gt;%&lt;/span&gt; &lt;span class="n"&gt;conv_treatment_lift_corr&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/pre&gt;&lt;/div&gt;


&lt;p&gt;Prints:&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;conversion at control site before = 0.50
conversion at control site after  = 0.52
relative conversion lift at control = 0.040 (assumed to be seasonality)

conversion at treatment site before = 0.51
conversion at treatment site after  = 0.56
relative conversion lift at treatment = 0.112 (including assumed seasonality)

relative conversion lift at treatment with correction (seasonality subtracted out) = 0.072
&lt;/pre&gt;&lt;/div&gt;


&lt;h2&gt;Approach 1: Naive estimate of the statistical significance of the treatment lift (wrong)&lt;/h2&gt;
&lt;p&gt;Based on the above analysis, it seems that the treatment worked, and taking into account seasonality, there is still a positive lift. But is it statistically significant? Naively, we could take the treatment site's contingency matrix, and compute a p-value from that:&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;span class="k"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;p-value of original treatment site contingency matrix = &lt;/span&gt;&lt;span class="si"&gt;%.3f&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;&lt;/span&gt; &lt;span class="o"&gt;%&lt;/span&gt; &lt;span class="n"&gt;chi2_contingency&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;treatment_site&lt;/span&gt;&lt;span class="p"&gt;)[&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;])&lt;/span&gt;
&lt;/pre&gt;&lt;/div&gt;


&lt;p&gt;Prints:&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;p-value of original treatment site contingency matrix = 0.001
&lt;/pre&gt;&lt;/div&gt;


&lt;p&gt;But this is not right: it ignores the fact that we normalized our conversion lift using the control site.&lt;/p&gt;
&lt;h2&gt;Approach 2: Estimate of the statistical significance using a corrected contingency matrix (flawed)&lt;/h2&gt;
&lt;p&gt;A less wrong approach is to re-normalize the treatment site's contingency matrix, and calculate a p-value from that. First, re-normalize, fixing the "after" sample size, and re-distributing the non-conversion and conversion counts according to the corrected "after" conversion (which excludes the seasonality we compute using the control):&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;correct_treatment_contingency_matrix&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;treatment_site&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;conv_control_lift&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
    &lt;span class="n"&gt;conv_treatment_before&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;treatment_site&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;][&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="o"&gt;/&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;treatment_site&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;][&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="o"&gt;+&lt;/span&gt; &lt;span class="n"&gt;treatment_site&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;][&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;])&lt;/span&gt;
    &lt;span class="n"&gt;conv_treatment_after&lt;/span&gt;  &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;treatment_site&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;][&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="o"&gt;/&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;treatment_site&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;][&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="o"&gt;+&lt;/span&gt; &lt;span class="n"&gt;treatment_site&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;][&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;])&lt;/span&gt;
    &lt;span class="n"&gt;conv_treatment_lift&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;conv_treatment_after&lt;/span&gt; &lt;span class="o"&gt;/&lt;/span&gt; &lt;span class="n"&gt;conv_treatment_before&lt;/span&gt; &lt;span class="o"&gt;-&lt;/span&gt; &lt;span class="mf"&gt;1.0&lt;/span&gt;
    &lt;span class="n"&gt;conv_treatment_lift_corr&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;conv_treatment_lift&lt;/span&gt; &lt;span class="o"&gt;-&lt;/span&gt; &lt;span class="n"&gt;conv_control_lift&lt;/span&gt;
    &lt;span class="n"&gt;conv_treatment_after_corr&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;conv_treatment_before&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt; &lt;span class="o"&gt;+&lt;/span&gt; &lt;span class="n"&gt;conv_treatment_lift_corr&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="n"&gt;treatment_site_corr&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;deepcopy&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;treatment_site&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="n"&gt;N&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;treatment_site_corr&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;][&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="o"&gt;+&lt;/span&gt; &lt;span class="n"&gt;treatment_site_corr&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;][&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;
    &lt;span class="n"&gt;treatment_site_corr&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;][&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nb"&gt;round&lt;/span&gt;&lt;span class="p"&gt;((&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt; &lt;span class="o"&gt;-&lt;/span&gt; &lt;span class="n"&gt;conv_treatment_after_corr&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt; &lt;span class="n"&gt;N&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="n"&gt;treatment_site_corr&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;][&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;N&lt;/span&gt; &lt;span class="o"&gt;-&lt;/span&gt; &lt;span class="n"&gt;treatment_site_corr&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;][&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;
    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;treatment_site_corr&lt;/span&gt;

&lt;span class="n"&gt;treatment_site_corr&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;correct_treatment_contingency_matrix&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;treatment_site&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;conv_control_lift&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="k"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;Original treatment site contingency matrix:&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="k"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;np&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;array&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;treatment_site&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;
&lt;span class="k"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;Corrected treatment site contingency matrix:&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="k"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;np&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;array&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;treatment_site_corr&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;
&lt;/pre&gt;&lt;/div&gt;


&lt;p&gt;Prints:&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;Original treatment site contingency matrix:
[[ 888  844]
 [ 911 1088]]
Corrected treatment site contingency matrix:
[[ 888  883]
 [ 911 1049]]
&lt;/pre&gt;&lt;/div&gt;


&lt;p&gt;And now we can calculate a more meaningful p-value:&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;span class="k"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;p-value of corrected treatment site contingency matrix = &lt;/span&gt;&lt;span class="si"&gt;%.3f&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;&lt;/span&gt; &lt;span class="o"&gt;%&lt;/span&gt; &lt;span class="n"&gt;chi2_contingency&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;treatment_site_corr&lt;/span&gt;&lt;span class="p"&gt;)[&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;])&lt;/span&gt;
&lt;/pre&gt;&lt;/div&gt;


&lt;p&gt;Prints:&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;p-value of corrected treatment site contingency matrix = 0.029
&lt;/pre&gt;&lt;/div&gt;


&lt;p&gt;Because the seasonality was positive, the correction lowered the positive lift, which naturally results in a higher p-value, i.e. less statistical significance.&lt;/p&gt;
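&lt;p&gt;As a quick sanity check (an illustrative sketch with made-up matrices, not part of the analysis above), we can verify this behavior directly: holding the column totals fixed, a smaller before-to-after shift in conversions produces a larger p-value:&lt;/p&gt;

```python
# Illustrative sanity check with made-up contingency matrices.
# Rows: [non-converted, converted]; columns: [before, after].
from scipy.stats import chi2_contingency

larger_lift  = [[880, 840], [900, 1080]]  # bigger before-to-after shift
smaller_lift = [[880, 880], [900, 1040]]  # same column totals, smaller shift

p_larger  = chi2_contingency(larger_lift)[1]
p_smaller = chi2_contingency(smaller_lift)[1]

# The smaller lift yields the larger, less significant p-value.
print('p-value with larger lift  = %.4f' % p_larger)
print('p-value with smaller lift = %.4f' % p_smaller)
```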
&lt;p&gt;Similar to the above, we can also use the corrected contingency matrix to calculate the Bayesian probability that the conversion at the treatment site is higher after the treatment was applied vs before. I've &lt;a href="http://bytepawn.com/bayesian-ab-conversion-tests.html"&gt;described in detail how to calculate Bayesian conversion probabilities in this previous post&lt;/a&gt;. We can either use a closed form solution or use Monte Carlo sampling to compute the probability. Since we will use Monte Carlo simulations later anyway, let's go that route:&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;sample_conversion_lifts&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;site_contingency&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;num_samples&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;10&lt;/span&gt;&lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="mi"&gt;1000&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
    &lt;span class="n"&gt;beta_before&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;beta&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;site_contingency&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;][&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt; &lt;span class="n"&gt;site_contingency&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;][&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;])&lt;/span&gt;
    &lt;span class="n"&gt;beta_after&lt;/span&gt;  &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;beta&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;site_contingency&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;][&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt; &lt;span class="n"&gt;site_contingency&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;][&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;])&lt;/span&gt;
    &lt;span class="n"&gt;samples_before&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;beta_before&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;rvs&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;size&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;num_samples&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="n"&gt;samples_after&lt;/span&gt;  &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;beta_after&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;rvs&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;size&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;num_samples&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;samples_before&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;samples_after&lt;/span&gt;

&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;bayesian_prob_after_gt_before&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;site_contingency&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;num_samples&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;10&lt;/span&gt;&lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="mi"&gt;1000&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
    &lt;span class="n"&gt;samples_before&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;samples_after&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;sample_conversion_lifts&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;site_contingency&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;num_samples&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="n"&gt;hits&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nb"&gt;sum&lt;/span&gt;&lt;span class="p"&gt;([&lt;/span&gt;&lt;span class="n"&gt;after&lt;/span&gt; &lt;span class="o"&gt;&amp;gt;&lt;/span&gt; &lt;span class="n"&gt;before&lt;/span&gt; &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;before&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;after&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="nb"&gt;zip&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;samples_before&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;samples_after&lt;/span&gt;&lt;span class="p"&gt;)])&lt;/span&gt;
    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;hits&lt;/span&gt;&lt;span class="o"&gt;/&lt;/span&gt;&lt;span class="n"&gt;num_samples&lt;/span&gt;

&lt;span class="k"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;P(conv_after &amp;gt; conv_before | treatment_site     ): &lt;/span&gt;&lt;span class="si"&gt;%.4f&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;&lt;/span&gt; &lt;span class="o"&gt;%&lt;/span&gt; &lt;span class="n"&gt;bayesian_prob_after_gt_before&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;treatment_site&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;
&lt;span class="k"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;P(conv_after &amp;gt; conv_before | treatment_site_corr): &lt;/span&gt;&lt;span class="si"&gt;%.4f&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;&lt;/span&gt; &lt;span class="o"&gt;%&lt;/span&gt; &lt;span class="n"&gt;bayesian_prob_after_gt_before&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;treatment_site_corr&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;
&lt;/pre&gt;&lt;/div&gt;


&lt;p&gt;Prints:&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;P(conv_after &amp;gt; conv_before | treatment_site     ) = 0.9997
P(conv_after &amp;gt; conv_before | treatment_site_corr) = 0.9859
&lt;/pre&gt;&lt;/div&gt;


&lt;p&gt;The story is the same as above: since the corrected conversion lift is lower, the probability of the effect being real is a bit lower ($ 98.6 \% $ vs $ 99.9 \% $).&lt;/p&gt;
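&lt;p&gt;As an aside, the closed-form route mentioned above can be sketched with numerical integration over the two Beta posteriors. The sketch below assumes the same Beta(conversions, non-conversions) parameterization as the sampling code, and should agree closely with the Monte Carlo estimate:&lt;/p&gt;

```python
from scipy.stats import beta
from scipy.integrate import quad

def prob_after_exceeds_before(site_contingency):
    # Beta posteriors with a flat prior: Beta(conversions, non-conversions),
    # matching the parameterization used in the sampling code.
    a_before, b_before = site_contingency[1][0], site_contingency[0][0]
    a_after,  b_after  = site_contingency[1][1], site_contingency[0][1]
    # The probability that a draw from the "after" posterior exceeds a draw
    # from the "before" posterior is the integral of pdf_after * cdf_before.
    f = lambda x: beta.pdf(x, a_after, b_after) * beta.cdf(x, a_before, b_before)
    m_before = a_before / (a_before + b_before)
    m_after  = a_after  / (a_after  + b_after)
    # Break points at the posterior means help quad resolve the narrow peaks.
    return quad(f, 0, 1, points=[m_before, m_after])[0]

treatment_site = [[888, 844], [911, 1088]]  # copied from the matrix above
p_closed_form = prob_after_exceeds_before(treatment_site)
print('P(conv_after exceeds conv_before | treatment_site) = %.4f' % p_closed_form)
```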
&lt;h2&gt;Approach 3: Using Bayesian Monte Carlo methods to take into account the sample size at the control site (still flawed)&lt;/h2&gt;
&lt;p&gt;From a mathematical perspective, the big shortcoming of the above analysis (both the frequentist p-value and the bayesian) is that it ignores the sample size at the control site. In other words, when we compute &lt;code&gt;conv_control_lift&lt;/code&gt; and then &lt;code&gt;conv_treatment_lift_corr&lt;/code&gt;:&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;span class="n"&gt;conv_control_lift&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;conv_control_after&lt;/span&gt; &lt;span class="o"&gt;/&lt;/span&gt; &lt;span class="n"&gt;conv_control_before&lt;/span&gt; &lt;span class="o"&gt;-&lt;/span&gt; &lt;span class="mf"&gt;1.0&lt;/span&gt;
&lt;span class="o"&gt;...&lt;/span&gt;
&lt;span class="n"&gt;conv_treatment_lift_corr&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;conv_treatment_lift&lt;/span&gt; &lt;span class="o"&gt;-&lt;/span&gt; &lt;span class="n"&gt;conv_control_lift&lt;/span&gt;
&lt;/pre&gt;&lt;/div&gt;


&lt;p&gt;We take the &lt;code&gt;conv_control_lift&lt;/code&gt; point estimate at face value, without taking into account that this lift at the control site is also a random variable, with spread around a mean. It could also be just random fluctuation.&lt;/p&gt;
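&lt;p&gt;To get a feel for how large this spread is, a back-of-envelope delta-method estimate helps. The control site's per-period sample sizes are not restated here, so the counts below are assumed (about 1,800 visitors per period, similar in magnitude to the treatment site):&lt;/p&gt;

```python
import numpy as np

# Hypothetical sample sizes: the control site's counts are assumed here,
# roughly matching the treatment site's order of magnitude.
n_before, n_after = 1800, 1800
p_before, p_after = 0.50, 0.52  # point estimates from above

# Delta-method standard error of the ratio lift p_after/p_before - 1,
# treating the two periods as independent binomial proportions.
se_before = np.sqrt(p_before * (1 - p_before) / n_before)
se_after  = np.sqrt(p_after  * (1 - p_after)  / n_after)
lift = p_after / p_before - 1.0
se_lift = (p_after / p_before) * np.sqrt(
    (se_before / p_before) ** 2 + (se_after / p_after) ** 2)
print('control lift = %.3f +/- %.3f' % (lift, se_lift))
```

&lt;p&gt;Under these assumed counts, the standard error is comparable to the 4% lift itself, which is exactly why treating the control lift as a fixed constant is misleading.&lt;/p&gt;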
&lt;p&gt;To model the uncertainty at the control site, we can use the &lt;a href="https://en.wikipedia.org/wiki/Beta_distribution"&gt;Beta distribution&lt;/a&gt; we know from &lt;a href="http://bytepawn.com/bayesian-ab-conversion-tests.html"&gt;Bayesian A/B testing&lt;/a&gt; and perform a Monte Carlo simulation.&lt;/p&gt;
&lt;ol&gt;
&lt;li&gt;Model the "before" conversion at the control site with a Beta distribution.&lt;/li&gt;
&lt;li&gt;Model the "after" conversion at the control site with a Beta distribution.&lt;/li&gt;
&lt;li&gt;In a Monte Carlo simulation, &lt;code&gt;num_samples&lt;/code&gt; times:
&lt;ol&gt;
&lt;li&gt;Draw pairs of "before" and "after" conversion probabilities from the above distributions, and using these sampled conversions compute a sampled control conversion lift.&lt;/li&gt;
&lt;li&gt;Using the sampled conversion lift, repeat the frequentist or Bayesian analysis above to get a p-value or a Bayesian probability.&lt;/li&gt;
&lt;/ol&gt;
&lt;/li&gt;
&lt;/ol&gt;
&lt;p&gt;First, the code and visualization for the sampling steps:&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;span class="n"&gt;num_samples&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;100&lt;/span&gt;&lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="mi"&gt;1000&lt;/span&gt;
&lt;span class="n"&gt;samples_conv_control_before&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;samples_conv_control_after&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;sample_conversion_lifts&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;control_site&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;num_samples&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="n"&gt;samples_conv_control_lift&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;after&lt;/span&gt;&lt;span class="o"&gt;/&lt;/span&gt;&lt;span class="n"&gt;before&lt;/span&gt;&lt;span class="o"&gt;-&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt; &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;before&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;after&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="nb"&gt;zip&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;samples_conv_control_before&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;samples_conv_control_after&lt;/span&gt;&lt;span class="p"&gt;)]&lt;/span&gt;
&lt;span class="n"&gt;plt&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;hist&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;samples_conv_control_lift&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;bins&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mi"&gt;100&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;density&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="bp"&gt;True&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;alpha&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mf"&gt;0.5&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="n"&gt;plt&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;show&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
&lt;span class="k"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;mean relative conversion lift at control = &lt;/span&gt;&lt;span class="si"&gt;%.3f&lt;/span&gt;&lt;span class="s1"&gt; (assumed to be seasonality)&amp;#39;&lt;/span&gt; &lt;span class="o"&gt;%&lt;/span&gt; &lt;span class="n"&gt;np&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;mean&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;samples_conv_control_lift&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;
&lt;/pre&gt;&lt;/div&gt;


&lt;p&gt;Prints:&lt;/p&gt;
&lt;p&gt;&lt;img src="/images/ct_site_control_bayes.png" alt="Monte Carlo simulated control lifts" style="width: 500px;"/&gt;&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;mean relative conversion lift at control = 0.041 (assumed to be seasonality)
&lt;/pre&gt;&lt;/div&gt;


&lt;p&gt;We can then feed these sampled lifts at the control site into our subsequent calculations. Code for the frequentist p-value case:&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;p_value_corr&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;treatment_site&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;conv_control_lift&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
    &lt;span class="n"&gt;treatment_site_corr&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;correct_treatment_contingency_matrix&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;treatment_site&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;conv_control_lift&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;chi2_contingency&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;treatment_site_corr&lt;/span&gt;&lt;span class="p"&gt;)[&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;

&lt;span class="n"&gt;num_samples&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;100&lt;/span&gt;&lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="mi"&gt;1000&lt;/span&gt;
&lt;span class="n"&gt;samples_conv_control_before&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;samples_conv_control_after&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;sample_conversion_lifts&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;control_site&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;num_samples&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="n"&gt;samples_conv_control_lift&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;after&lt;/span&gt;&lt;span class="o"&gt;/&lt;/span&gt;&lt;span class="n"&gt;before&lt;/span&gt;&lt;span class="o"&gt;-&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt; &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;before&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;after&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="nb"&gt;zip&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;samples_conv_control_before&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;samples_conv_control_after&lt;/span&gt;&lt;span class="p"&gt;)]&lt;/span&gt;
&lt;span class="n"&gt;ps&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;p_value_corr&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;treatment_site&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;lift&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;lift&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;samples_conv_control_lift&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;
&lt;span class="n"&gt;plt&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;hist&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;ps&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;bins&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mi"&gt;100&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;density&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="bp"&gt;True&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;alpha&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mf"&gt;0.5&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="n"&gt;plt&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;show&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
&lt;span class="k"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;p-value of corrected treatment site contingency matrix (taking into account control site uncertainty) = &lt;/span&gt;&lt;span class="si"&gt;%.3f&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;&lt;/span&gt; &lt;span class="o"&gt;%&lt;/span&gt; &lt;span class="n"&gt;np&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;mean&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;ps&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;
&lt;/pre&gt;&lt;/div&gt;


&lt;p&gt;Prints:&lt;/p&gt;
&lt;p&gt;&lt;img src="/images/ct_site_mc_p.png" alt="Monte Carlo simulated p-values" style="width: 500px;"/&gt;&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;p-value of corrected treatment site contingency matrix (taking into account control site uncertainty) = 0.115
&lt;/pre&gt;&lt;/div&gt;


&lt;p&gt;&lt;strong&gt;Note that this is a significant adjustment: without taking into account the control site's uncertainty, the p-value was 0.029; taking it into account yields 0.115, roughly a 4x increase.&lt;/strong&gt;&lt;/p&gt;
&lt;p&gt;We can repeat the same to get an adjusted Bayesian probability:&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;bayesian_prob_corr&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;treatment_site&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;conv_control_lift&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
    &lt;span class="n"&gt;treatment_site_corr&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;correct_treatment_contingency_matrix&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;treatment_site&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;conv_control_lift&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;bayesian_prob_after_gt_before&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;treatment_site_corr&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="n"&gt;num_samples&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;10&lt;/span&gt;&lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="mi"&gt;1000&lt;/span&gt;
&lt;span class="n"&gt;samples_conv_control_before&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;samples_conv_control_after&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;sample_conversion_lifts&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;control_site&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;num_samples&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="n"&gt;samples_conv_control_lift&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;after&lt;/span&gt;&lt;span class="o"&gt;/&lt;/span&gt;&lt;span class="n"&gt;before&lt;/span&gt;&lt;span class="o"&gt;-&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt; &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;before&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;after&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="nb"&gt;zip&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;samples_conv_control_before&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;samples_conv_control_after&lt;/span&gt;&lt;span class="p"&gt;)]&lt;/span&gt;
&lt;span class="n"&gt;bs&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;bayesian_prob_corr&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;treatment_site&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;lift&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;lift&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;samples_conv_control_lift&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;
&lt;span class="n"&gt;plt&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;hist&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;bs&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;bins&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mi"&gt;100&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;density&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="bp"&gt;True&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;alpha&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mf"&gt;0.5&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="n"&gt;plt&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;show&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
&lt;span class="k"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;P(conv_after &amp;gt; conv_before | treatment_site_corr) (taking into account control site uncertainty) = &lt;/span&gt;&lt;span class="si"&gt;%.3f&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;&lt;/span&gt; &lt;span class="o"&gt;%&lt;/span&gt; &lt;span class="n"&gt;np&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;mean&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;bs&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;
&lt;/pre&gt;&lt;/div&gt;


&lt;p&gt;Prints:&lt;/p&gt;
&lt;p&gt;&lt;img src="/images/ct_site_mc_bayes.png" alt="Monte Carlo simulated bayesian probabilities" style="width: 500px;"/&gt;&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;P(conv_after &amp;gt; conv_before | treatment_site_corr) (taking into account control site uncertainty) = 0.942
&lt;/pre&gt;&lt;/div&gt;


&lt;p&gt;&lt;strong&gt;Similar to the frequentist case, this is a significant adjustment: without taking into account the control site's uncertainty, the probability was $ 98.5 \% $, while this calculation yields a more realistic $ 94.2 \% $. Like in the frequentist case, $ 1-P $ increases by roughly 4x.&lt;/strong&gt;&lt;/p&gt;
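&lt;p&gt;Spelling out the 4x, using the two probabilities computed above:&lt;/p&gt;

```python
# Without the control correction P = 0.985, with it P = 0.942;
# compare the complements 1 - P:
p_naive, p_corrected = 0.985, 0.942
ratio = (1 - p_corrected) / (1 - p_naive)
print('%.1fx' % ratio)  # roughly 4x
```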
&lt;h2&gt;Approach 4: Pure Bayesian Monte Carlo approach without contingency matrix correction (best)&lt;/h2&gt;
&lt;p&gt;The problem with Approaches 2 and 3 is that correcting the treatment site's contingency matrix is not statistically sound: it amounts to computing probabilities for a pretend measurement.&lt;/p&gt;
&lt;p&gt;Fortunately, we can compute the correct probabilities directly with a pure Bayesian approach using Monte Carlo sampling. The approach is:&lt;/p&gt;
&lt;ol&gt;
&lt;li&gt;Model the "before" conversion at the control site with a Beta distribution.&lt;/li&gt;
&lt;li&gt;Model the "after" conversion at the control site with a Beta distribution.&lt;/li&gt;
&lt;li&gt;Model the "before" conversion at the treatment site with a Beta distribution.&lt;/li&gt;
&lt;li&gt;Model the "after" conversion at the treatment site with a Beta distribution.&lt;/li&gt;
&lt;li&gt;Sample the above distributions, and directly compute the control lift, treatment lift, and the corrected treatment lift.&lt;/li&gt;
&lt;li&gt;Calculate the ratio of samples where the corrected treatment lift &amp;gt; 0.&lt;/li&gt;
&lt;/ol&gt;
&lt;p&gt;Code:&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;sample_conversion_lifts&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;site_contingency&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;num_samples&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;100&lt;/span&gt;&lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="mi"&gt;1000&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
    &lt;span class="n"&gt;beta_before&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;beta&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;site_contingency&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;][&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt; &lt;span class="n"&gt;site_contingency&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;][&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;])&lt;/span&gt;
    &lt;span class="n"&gt;beta_after&lt;/span&gt;  &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;beta&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;site_contingency&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;][&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt; &lt;span class="n"&gt;site_contingency&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;][&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;])&lt;/span&gt;
    &lt;span class="n"&gt;samples_before&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;beta_before&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;rvs&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;size&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;num_samples&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="n"&gt;samples_after&lt;/span&gt;  &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;beta_after&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;rvs&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;size&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;num_samples&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;samples_before&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;samples_after&lt;/span&gt;

&lt;span class="n"&gt;samples_control_before&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;samples_control_after&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;sample_conversion_lifts&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;control_site&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="n"&gt;samples_treatment_before&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;samples_treatment_after&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;sample_conversion_lifts&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;treatment_site&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;calculate_conv_treatment_lift_corr&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;conv_control_before&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;conv_control_after&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;conv_treatment_before&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;conv_treatment_after&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
    &lt;span class="n"&gt;conv_control_lift&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;conv_control_after&lt;/span&gt; &lt;span class="o"&gt;/&lt;/span&gt; &lt;span class="n"&gt;conv_control_before&lt;/span&gt; &lt;span class="o"&gt;-&lt;/span&gt; &lt;span class="mf"&gt;1.0&lt;/span&gt;
    &lt;span class="n"&gt;conv_treatment_lift&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;conv_treatment_after&lt;/span&gt; &lt;span class="o"&gt;/&lt;/span&gt; &lt;span class="n"&gt;conv_treatment_before&lt;/span&gt; &lt;span class="o"&gt;-&lt;/span&gt; &lt;span class="mf"&gt;1.0&lt;/span&gt;
    &lt;span class="n"&gt;conv_treatment_lift_corr&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;conv_treatment_lift&lt;/span&gt; &lt;span class="o"&gt;-&lt;/span&gt; &lt;span class="n"&gt;conv_control_lift&lt;/span&gt;
    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;conv_treatment_lift_corr&lt;/span&gt;

&lt;span class="n"&gt;lift_corrs&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;calculate_conv_treatment_lift_corr&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;conv_control_before&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;conv_control_after&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;conv_treatment_before&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;conv_treatment_after&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
     &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;conv_control_before&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;conv_control_after&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;conv_treatment_before&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;conv_treatment_after&lt;/span&gt;
         &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="nb"&gt;zip&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;samples_control_before&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;samples_control_after&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;samples_treatment_before&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;samples_treatment_after&lt;/span&gt;&lt;span class="p"&gt;)]&lt;/span&gt;

&lt;span class="n"&gt;plt&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;hist&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;lift_corrs&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;bins&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mi"&gt;100&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;density&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="bp"&gt;True&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;alpha&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mf"&gt;0.5&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="n"&gt;plt&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;show&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;

&lt;span class="n"&gt;p&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nb"&gt;sum&lt;/span&gt;&lt;span class="p"&gt;([&lt;/span&gt;&lt;span class="n"&gt;lift_corr&lt;/span&gt; &lt;span class="o"&gt;&amp;gt;&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt; &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;lift_corr&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;lift_corrs&lt;/span&gt;&lt;span class="p"&gt;])&lt;/span&gt;&lt;span class="o"&gt;/&lt;/span&gt;&lt;span class="nb"&gt;len&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;lift_corrs&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="k"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;MC mean relative conversion lift at treatment with correction (seasonality subtracted out) = &lt;/span&gt;&lt;span class="si"&gt;%.3f&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;&lt;/span&gt; &lt;span class="o"&gt;%&lt;/span&gt; &lt;span class="n"&gt;np&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;mean&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;lift_corrs&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;
&lt;span class="k"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;P(corrected conversion lift &amp;gt; 0 at treatment site | control_site, treatment_site) = &lt;/span&gt;&lt;span class="si"&gt;%.3f&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;&lt;/span&gt; &lt;span class="o"&gt;%&lt;/span&gt; &lt;span class="n"&gt;p&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/pre&gt;&lt;/div&gt;


&lt;p&gt;Prints:&lt;/p&gt;
&lt;p&gt;&lt;img src="/images/ct_site_control_bayes_mc_best.png" alt="Monte Carlo simulated bayesian probabilities" style="width: 500px;"/&gt;&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;MC mean relative conversion lift at treatment with correction (seasonality subtracted out) = 0.072
P(corrected conversion lift &amp;gt; 0 at treatment site | control_site, treatment_site) = 0.938
&lt;/pre&gt;&lt;/div&gt;


&lt;p&gt;Note that here the mean relative conversion lift of the Monte Carlo simulation ($ 7.2 \% $) matches the point estimate from the beginning.&lt;/p&gt;
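&lt;p&gt;As an aside, since the samples are numpy arrays, the per-sample list comprehension above can be replaced by a single vectorized expression that computes the same corrected lifts much faster. A sketch with synthetic Beta samples (the parameters below are made up for illustration; in the real computation they come from the site contingency matrices):&lt;/p&gt;

```python
import numpy as np
from scipy.stats import beta

num_samples = 100 * 1000
# Hypothetical stand-ins for the four posterior sample arrays
samples_control_before   = beta(997, 996).rvs(size=num_samples, random_state=1)
samples_control_after    = beta(1089, 1004).rvs(size=num_samples, random_state=2)
samples_treatment_before = beta(900, 1100).rvs(size=num_samples, random_state=3)
samples_treatment_after  = beta(1000, 1000).rvs(size=num_samples, random_state=4)

# Vectorized equivalent of applying calculate_conv_treatment_lift_corr()
# to every sample: (treatment lift) - (control lift) simplifies to this
lift_corrs = (samples_treatment_after / samples_treatment_before
              - samples_control_after / samples_control_before)
p = np.mean(lift_corrs > 0)
```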
&lt;h2&gt;Why the control site's sample size matters&lt;/h2&gt;
&lt;p&gt;It's easy to see why by dividing the control site's contingency matrix by 10. Note that the point estimate of the corrected lift does not change, it's still $ 7.2 \% $, since it's just a function of ratios. Let's see what happens if we re-run the above code (Approach 4), which is sensitive to sample size:&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;control_site = [
    [99.6, 100.4],
    [99.7, 108.9],
]
# same as above
&lt;/pre&gt;&lt;/div&gt;


&lt;p&gt;A smaller sample size means our estimate of the seasonality is less accurate, so we expect to be less certain about our treatment lift:&lt;/p&gt;
&lt;p&gt;&lt;img src="/images/ct_site_control_bayes_mc_best_reduced.png" alt="Monte Carlo simulated bayesian probabilities" style="width: 500px;"/&gt;&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;MC mean relative conversion lift at treatment with correction (seasonality subtracted out) = 0.067
P(corrected conversion lift &amp;gt; 0 at treatment site | control_site, treatment_site) = 0.745
&lt;/pre&gt;&lt;/div&gt;


&lt;p&gt;In the Bayesian Monte Carlo method, at this low control sample size, since the spread is not symmetric about the mean, the mean relative lift actually goes down from $ 7.2 \% $ to $ 6.7 \% $, and our certainty that this effect is real goes down from $ 93.8 \% $ to $ 74.5 \% $!&lt;/p&gt;
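&lt;p&gt;The claim that the point estimate is unchanged by the division is easy to verify: conversions are ratios of counts, so a constant factor applied to the whole contingency matrix cancels out. A quick sketch, using the reduced control matrix above and its 10x version (the layout, with row 0 holding non-conversions, row 1 holding conversions, and before/after columns, follows sample_conversion_lifts):&lt;/p&gt;

```python
def conv_lift(site):
    # row 0: non-conversions, row 1: conversions; column 0: before, column 1: after
    conv_before = site[1][0] / (site[0][0] + site[1][0])
    conv_after  = site[1][1] / (site[0][1] + site[1][1])
    return conv_after / conv_before - 1.0

full    = [[996.0, 1004.0], [997.0, 1089.0]]
reduced = [[x / 10 for x in row] for row in full]
# the point estimate is scale-invariant, only the uncertainty changes
assert abs(conv_lift(full) - conv_lift(reduced)) < 1e-12
```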
&lt;h2&gt;Conclusion&lt;/h2&gt;
&lt;p&gt;The main takeaways here are:&lt;/p&gt;
&lt;ol&gt;
&lt;li&gt;When conducting such treatment vs control site experiments, we must conduct extensive internal checks to minimize the risk of external effects polluting our measurement.&lt;/li&gt;
&lt;li&gt;We have to take into account the uncertainty of the control site. The cleanest approach is the pure Bayesian Monte Carlo method.&lt;/li&gt;
&lt;/ol&gt;</content><category term="ab-testing"></category></entry><entry><title>Unevenness at the edges</title><link href="https://bytepawn.com/unevenness-at-the-edges.html" rel="alternate"></link><published>2020-10-30T00:00:00+01:00</published><updated>2020-10-30T00:00:00+01:00</updated><author><name>Marton Trencseni</name></author><id>tag:bytepawn.com,2020-10-30:/unevenness-at-the-edges.html</id><summary type="html">&lt;p&gt;Sometimes we look at the top performers in a field and see obviously uneven representations of groups (gender, ethnicity, etc). There a multitude of factors that can lead to it, such as unfair bias in access to opportunities. Here I will show one unintuitive mathematical effect that can contribute to such unevenness in the case of normal distributions. &lt;br/&gt;&lt;br/&gt; &lt;img src="/images/mean_shift_02.png" style="width: 400px;"/&gt;&lt;/p&gt;</summary><content type="html">&lt;h2&gt;Introduction&lt;/h2&gt;
&lt;p&gt;Sometimes we look at the top performers in a field and see obviously uneven representations of groups (gender, ethnicity, etc). There are a multitude of factors that can lead to this, such as unfair bias in access to opportunities. Here I will show one unintuitive mathematical effect that can contribute to such unevenness in the case of normal distributions.&lt;/p&gt;
&lt;p&gt;As a toy model, let's assume that there is just one metric which determines performance, such as "height" for basketball or "reaction time" for driving, and that the populations are described by normal distributions. &lt;strong&gt;The unintuitive effect is that even a small shift in the means of the normals (assuming equal standard deviations) can lead to a very uneven ratio at the tails.&lt;/strong&gt;&lt;/p&gt;
&lt;p&gt;&lt;a href="https://github.com/mtrencseni/playground/blob/master/Unevenness%20at%20the%20edges.ipynb"&gt;The code is up on Github.&lt;/a&gt;&lt;/p&gt;
&lt;h2&gt;Normal distributions&lt;/h2&gt;
&lt;p&gt;Let's take a simple toy example, two populations A and B, both described by a normal distribution, both have standard deviations of $ \sigma = 1 $. Population A has mean $ \mu = 0 $ and B has mean $ \mu = 0.2 $, so $ A \sim \mathcal{N}(0, 1), B \sim \mathcal{N}(0.2, 1) $. This is easy to draw:&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="nn"&gt;numpy&lt;/span&gt; &lt;span class="kn"&gt;as&lt;/span&gt; &lt;span class="nn"&gt;np&lt;/span&gt;
&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="nn"&gt;scipy.stats&lt;/span&gt; &lt;span class="kn"&gt;as&lt;/span&gt; &lt;span class="nn"&gt;stats&lt;/span&gt;
&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="nn"&gt;matplotlib.pyplot&lt;/span&gt; &lt;span class="kn"&gt;as&lt;/span&gt; &lt;span class="nn"&gt;plt&lt;/span&gt;

&lt;span class="n"&gt;mean_A&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mf"&gt;0.0&lt;/span&gt;
&lt;span class="n"&gt;mean_B&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mf"&gt;0.2&lt;/span&gt;
&lt;span class="n"&gt;stdev&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt;
&lt;span class="n"&gt;x&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;np&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;linspace&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;mean_A&lt;/span&gt; &lt;span class="o"&gt;-&lt;/span&gt; &lt;span class="mi"&gt;5&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;mean_A&lt;/span&gt; &lt;span class="o"&gt;+&lt;/span&gt; &lt;span class="mi"&gt;5&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;100&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="n"&gt;plt&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;plot&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;x&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;stats&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;norm&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;pdf&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;x&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;mean_A&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;stdev&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;
&lt;span class="n"&gt;plt&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;plot&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;x&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;stats&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;norm&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;pdf&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;x&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;mean_B&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;stdev&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;
&lt;span class="n"&gt;plt&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;show&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
&lt;/pre&gt;&lt;/div&gt;


&lt;p&gt;&lt;img src="/images/mean_shift_02.png" style="width: 400px;"/&gt;&lt;/p&gt;
&lt;p&gt;If we draw an element of A and B each at random, what's the probability that B will be "higher"?&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;span class="n"&gt;B_taller_A&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;stats&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;norm&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;cdf&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;loc&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;mean_A&lt;/span&gt;&lt;span class="o"&gt;-&lt;/span&gt;&lt;span class="n"&gt;mean_B&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;scale&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;stdev&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="k"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;&lt;/span&gt;&lt;span class="si"&gt;%.1f&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;&lt;/span&gt; &lt;span class="o"&gt;%&lt;/span&gt; &lt;span class="n"&gt;B_taller_A&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/pre&gt;&lt;/div&gt;


&lt;p&gt;It's roughly 60%.&lt;/p&gt;
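&lt;p&gt;A quick Monte Carlo sanity check of the same probability (a sketch):&lt;/p&gt;

```python
import numpy as np

rng = np.random.default_rng(0)
n = 1_000_000
a = rng.normal(0.0, 1.0, size=n)
b = rng.normal(0.2, 1.0, size=n)
print('%.1f' % np.mean(b > a))  # prints 0.6
```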
&lt;p&gt;Okay, now assume that this is in fact a very large population, and we take the "top performers". Let's take the ones above a high sigma of $ z_{cut} = 5.2 $. What will be the ratio of As to Bs at the edge of the distribution?&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;span class="n"&gt;z_cut&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mf"&gt;5.2&lt;/span&gt;
&lt;span class="n"&gt;area_A&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;stats&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;norm&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;sf&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;z_cut&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;loc&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;mean_A&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;scale&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;stdev&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="n"&gt;area_B&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;stats&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;norm&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;sf&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;z_cut&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;loc&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;mean_B&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;scale&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;stdev&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="k"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;1 in &lt;/span&gt;&lt;span class="si"&gt;%d&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;&lt;/span&gt; &lt;span class="o"&gt;%&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="o"&gt;/&lt;/span&gt;&lt;span class="n"&gt;area_A&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;
&lt;span class="k"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;1 in &lt;/span&gt;&lt;span class="si"&gt;%d&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;&lt;/span&gt; &lt;span class="o"&gt;%&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="o"&gt;/&lt;/span&gt;&lt;span class="n"&gt;area_B&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;
&lt;span class="k"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;&lt;/span&gt;&lt;span class="si"&gt;%.1f&lt;/span&gt;&lt;span class="s1"&gt; to 1&amp;#39;&lt;/span&gt; &lt;span class="o"&gt;%&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;area_A&lt;/span&gt;&lt;span class="o"&gt;/&lt;/span&gt;&lt;span class="n"&gt;area_B&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;
&lt;/pre&gt;&lt;/div&gt;


&lt;p&gt;This tells us that in the A population, 1 in 10,035,700 is above z_cut, in the B population it's 1 in 3,488,555, and in the end we expect the ratio of B to A to be 2.9 to 1, so roughly 3 to 1. So although Bs only have a slight edge over As in the mean, and in the overall population, at the edges we expect to see 3 times as many Bs as As! The effect becomes stronger at higher z's.&lt;/p&gt;
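&lt;p&gt;The strengthening at higher z is easy to see by computing the tail ratio at a few increasing cutoffs (a quick sketch):&lt;/p&gt;

```python
import scipy.stats as stats

# B:A ratio at the tail, for increasing cutoffs; the ratio grows with z
for z in [2, 3, 4, 5, 6]:
    ratio = stats.norm.sf(z, loc=0.2) / stats.norm.sf(z, loc=0.0)
    print('z = %d: %.1f to 1' % (z, ratio))
```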
&lt;p&gt;We can repeat this with real height data for males and females from &lt;a href="https://ourworldindata.org/human-height"&gt;Our world in data&lt;/a&gt;:&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;As an aggregate of the regions with available data – Europe, North America, Australia, and East Asia – they found the mean male height to be 178.4 centimeters (cm) in the most recent cohort (born between 1980 and 1994). The standard deviation was 7.59 cm... Women were smaller on average, with a mean height of 164.7 cm, and standard deviation of 7.07 cm.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;&lt;img src="https://ourworldindata.org/uploads/2019/06/distribution-1-775x550.png" style="width: 600px;"/&gt;&lt;/p&gt;
&lt;p&gt;Here we just have to do a bit more work because the standard deviations are not equal, but it's the same story. Assuming there is an equal number of males and females (a good assumption), what's the expected ratio of males to females above 185cm?&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;span class="n"&gt;mean_A&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mf"&gt;164.7&lt;/span&gt;
&lt;span class="n"&gt;mean_B&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mf"&gt;178.4&lt;/span&gt;
&lt;span class="n"&gt;stdev_A&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mf"&gt;7.1&lt;/span&gt;
&lt;span class="n"&gt;stdev_B&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mf"&gt;7.6&lt;/span&gt;
&lt;span class="n"&gt;z_cut&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;185&lt;/span&gt;
&lt;span class="n"&gt;area_A&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;stats&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;norm&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;sf&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;z_cut&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;loc&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;mean_A&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;scale&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;stdev_A&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="n"&gt;area_B&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;stats&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;norm&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;sf&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;z_cut&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;loc&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;mean_B&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;scale&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;stdev_B&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="k"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;1 in &lt;/span&gt;&lt;span class="si"&gt;%d&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;&lt;/span&gt; &lt;span class="o"&gt;%&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="o"&gt;/&lt;/span&gt;&lt;span class="n"&gt;area_A&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;
&lt;span class="k"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;1 in &lt;/span&gt;&lt;span class="si"&gt;%d&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;&lt;/span&gt; &lt;span class="o"&gt;%&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="o"&gt;/&lt;/span&gt;&lt;span class="n"&gt;area_B&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;
&lt;span class="k"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;&lt;/span&gt;&lt;span class="si"&gt;%.1f&lt;/span&gt;&lt;span class="s1"&gt; to 1&amp;#39;&lt;/span&gt; &lt;span class="o"&gt;%&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;area_B&lt;/span&gt;&lt;span class="o"&gt;/&lt;/span&gt;&lt;span class="n"&gt;area_A&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;
&lt;/pre&gt;&lt;/div&gt;


&lt;p&gt;We expect 1 in 470 females to be above 185cm and 1 in 5 males, so the ratio of males to females above 185cm is 90.7 to 1!&lt;/p&gt;
&lt;h2&gt;Exponential distributions&lt;/h2&gt;
&lt;p&gt;It's worth noting that this doesn't happen with all distributions. For example, with exponential distributions:&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;span class="n"&gt;mean_A&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mf"&gt;1.0&lt;/span&gt;
&lt;span class="n"&gt;mean_B&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mf"&gt;1.2&lt;/span&gt;
&lt;span class="n"&gt;stdev&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mf"&gt;0.5&lt;/span&gt;
&lt;span class="n"&gt;z_cut&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;9&lt;/span&gt;
&lt;span class="n"&gt;area_A&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;stats&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;expon&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;sf&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;z_cut&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;loc&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;mean_A&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;scale&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;stdev&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="n"&gt;area_B&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;stats&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;expon&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;sf&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;z_cut&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;loc&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;mean_B&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;scale&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;stdev&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="k"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;1 in &lt;/span&gt;&lt;span class="si"&gt;%d&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;&lt;/span&gt; &lt;span class="o"&gt;%&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="o"&gt;/&lt;/span&gt;&lt;span class="n"&gt;area_A&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;
&lt;span class="k"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;1 in &lt;/span&gt;&lt;span class="si"&gt;%d&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;&lt;/span&gt; &lt;span class="o"&gt;%&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="o"&gt;/&lt;/span&gt;&lt;span class="n"&gt;area_B&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;
&lt;span class="k"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;&lt;/span&gt;&lt;span class="si"&gt;%.1f&lt;/span&gt;&lt;span class="s1"&gt; to 1&amp;#39;&lt;/span&gt; &lt;span class="o"&gt;%&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;area_B&lt;/span&gt;&lt;span class="o"&gt;/&lt;/span&gt;&lt;span class="n"&gt;area_A&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;
&lt;/pre&gt;&lt;/div&gt;


&lt;p&gt;The unevenness at high z is only 1.5 to 1.&lt;/p&gt;
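&lt;p&gt;The reason is that for a location-shifted exponential the survival function is $ e^{-(z - \mu)/s} $, so the tail ratio is the constant $ e^{(\mu_B - \mu_A)/s} = e^{0.4} \approx 1.5 $, independent of the cutoff. A quick check:&lt;/p&gt;

```python
import numpy as np
import scipy.stats as stats

# for shifted exponentials the tail ratio does not depend on the cutoff
for z_cut in [3, 6, 9, 12]:
    ratio = stats.expon.sf(z_cut, loc=1.2, scale=0.5) / stats.expon.sf(z_cut, loc=1.0, scale=0.5)
    print('z_cut = %d: %.2f to 1' % (z_cut, ratio))  # always ~1.49 = exp(0.4)
```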
&lt;h2&gt;Causality&lt;/h2&gt;
&lt;p&gt;It's important to be careful with causality. What this shows is the mathematical fact that a shift in the mean of normal distributions leads to uneven ratios of populations at high z's. But in real life, we can't just turn it around: if we see an uneven distribution at high z's, we can't be sure if this is the reason, or some other effect. For example, currently 100% of Formula 1 drivers are male, but more than likely there are other strong determining factors, eg. parents of girls may be less likely to buy car toys, less likely to take girls gokarting at a young age, etc.&lt;/p&gt;</content><category term="stats"></category><category term="data"></category></entry><entry><title>Effective Data Visualization Part 3: Line charts and stacked area charts</title><link href="https://bytepawn.com/effective-data-visualization-part-3-line-charts.html" rel="alternate"></link><published>2020-09-01T00:00:00+02:00</published><updated>2020-09-01T00:00:00+02:00</updated><author><name>Marton Trencseni</name></author><id>tag:bytepawn.com,2020-09-01:/effective-data-visualization-part-3-line-charts.html</id><summary type="html">&lt;p&gt;Most charts should be line charts or stacked area chart, because they communicate valuable trend information and are easy to parse for the human eyes and brain. &lt;br/&gt;&lt;br/&gt; &lt;img src="/images/p3-weekly-double.png" style="width: 400px;"/&gt;&lt;/p&gt;</summary><content type="html">&lt;h2&gt;Introduction&lt;/h2&gt;
&lt;p&gt;On most dashboards for product, operations or for business consumption, charts should be line charts or stacked area charts. These chart types communicate valuable trend patterns effectively, and are easy for the human eyes and brain to parse. On many dashboards, a "big number with year-over-year" type widget is used, but line charts work better. For example, in 2020 August, showing the YTD revenue compared to 2019 August conflates the actual year-over-year performance during January and February, and the effect of Covid-19 in March and later. On a line chart, these effects are obvious and visible.&lt;/p&gt;
&lt;h2&gt;Toy example&lt;/h2&gt;
&lt;p&gt;Let's walk through an example. Suppose we have an app called MyApp and we want to show revenues up until 2019 Q3. First, let's try a daily line chart:&lt;/p&gt;
&lt;p&gt;&lt;img src="/images/p3-daily.png" style="width: 600px;"/&gt;&lt;/p&gt;
&lt;p&gt;There is too much information here. Seeing the weekly periodicity 52 times per year pollutes this chart. Let's turn it into a weekly chart:&lt;/p&gt;
&lt;p&gt;&lt;img src="/images/p3-weekly.png" style="width: 600px;"/&gt;&lt;/p&gt;
&lt;p&gt;If we want to show the weekly periodicity of a metric, we can make a separate, supporting chart for that, and show median values for daily metrics. I will skip that step here.&lt;/p&gt;
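&lt;p&gt;As a sketch of how such a supporting chart could be computed (hypothetical data, not real MyApp numbers; plain Python with only the standard library):&lt;/p&gt;

```python
import statistics
from collections import defaultdict
from datetime import date, timedelta

# Hypothetical daily revenue with a weekly pattern (not real MyApp data)
daily_revenue = {date(2019, 1, 1) + timedelta(days=i): 100 + (i % 7) * 10
                 for i in range(28)}

# Group by weekday and take the median daily value per weekday,
# which a small supporting chart can then display
by_weekday = defaultdict(list)
for d, revenue in daily_revenue.items():
    by_weekday[d.strftime('%A')].append(revenue)
medians = {day: statistics.median(values) for day, values in by_weekday.items()}
```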
&lt;p&gt;The weekly chart is better, but it's not totally clear whether 2019 is better than 2018. It doesn't jump off the screen. A sub-optimal solution is to compute the explicit year-over-year percentage, and show that as a % on a line chart. But we can do better, with a simple trick. Instead of showing the date on the x-axis, we can show the day-of-year, and show 2018 and 2019 as two lines:&lt;/p&gt;
&lt;p&gt;&lt;img src="/images/p3-weekly-double.png" style="width: 600px;"/&gt;&lt;/p&gt;
&lt;p&gt;This is a pretty neat trick that I never figured out myself, I learned this at Facebook. This makes it obvious that 2019 is indeed doing better than 2018, and also shows that there was a slump period in 2019 that's worth investigating.&lt;/p&gt;
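&lt;p&gt;One possible way to prepare the data for this year-over-year view, sketched in Python with hypothetical revenue numbers (the variable names are mine, not from any real pipeline):&lt;/p&gt;

```python
from collections import defaultdict
from datetime import date, timedelta

# Hypothetical daily revenue covering 2018 and 2019
daily_revenue = {date(2018, 1, 1) + timedelta(days=i): 100 + i % 7
                 for i in range(730)}

# Re-key by (year, day-of-year): each year becomes its own line,
# and all lines share the same day-of-year x-axis
lines = defaultdict(dict)
for d, revenue in daily_revenue.items():
    lines[d.year][d.timetuple().tm_yday] = revenue
```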
&lt;p&gt;From this chart it's not obvious whether 2019 is ahead of 2018 YTD (year-to-date), given that there was a slump. Let's make it visually obvious:&lt;/p&gt;
&lt;p&gt;&lt;img src="/images/p3-cumu.png" style="width: 600px;"/&gt;&lt;/p&gt;
&lt;p&gt;This nicely shows that although there was a slump (the 2019 cumulative line flattens for a bit), 2019 is doing much better than 2018. On a dashboard, actual numbers can be read off by hovering with the mouse over the chart.&lt;/p&gt;
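&lt;p&gt;The cumulative line is just a running sum of the weekly values; a minimal sketch with toy numbers:&lt;/p&gt;

```python
from itertools import accumulate

# Toy weekly revenues; the 90 stands in for a slump week
weekly_revenue = [120, 130, 125, 90, 140]
cumulative = list(accumulate(weekly_revenue))
# the cumulative line flattens at the slump week, but keeps climbing
```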
&lt;p&gt;Finally, let's say we want to split our revenue by country, and we're only in 2 countries, USA and Canada. In this case, since we're now segmenting by country, we can't segment by year, so we have to go back to showing the date on the x-axis:&lt;/p&gt;
&lt;p&gt;&lt;img src="/images/p3-country-line-2.png" style="width: 600px;"/&gt;&lt;/p&gt;
&lt;p&gt;Note that the y-axis is USD, we converted our CAD revenues to USD. When showing money on a chart, you must always convert to the same currency. The big improvement here is to show this as a stacked area chart, because our countrywise revenues are additive:&lt;/p&gt;
&lt;p&gt;&lt;img src="/images/p3-stacked.png" style="width: 600px;"/&gt;&lt;/p&gt;
&lt;h2&gt;Discussion&lt;/h2&gt;
&lt;p&gt;Most charts should be line charts or stacked area charts, because they communicate valuable trend information and are easy to parse for the human eyes and brain. When building these charts, consider these points:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;The x-axis should usually be a date (or day-of-year), formatted like &lt;code&gt;YYYY-MM-DD&lt;/code&gt; per the &lt;a href="https://en.wikipedia.org/wiki/ISO_8601"&gt;ISO-8601&lt;/a&gt; standard.&lt;/li&gt;
&lt;li&gt;The y-axis should be a well-defined metric, like &lt;code&gt;Revenue (USD)&lt;/code&gt; or &lt;code&gt;Monthly Active Users (28d)&lt;/code&gt;.&lt;ul&gt;
&lt;li&gt;Put the metric on the y-axis label, unless the metric is mentioned in the chart title, then you don't need to.&lt;/li&gt;
&lt;li&gt;If the metric needs to be explained, put a textbox on the dashboard and explain it.&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;li&gt;Remove unnecessary visual clutter (like 3D, gradients, etc) from the chart: &lt;strong&gt;a chart is perfect if you can no longer remove clutter from it but it still communicates all the information it needs to.&lt;/strong&gt;&lt;/li&gt;
&lt;li&gt;Pick good colors for different lines, like red and blue. Don't pick different shades of the same color, some of your users may be colorblind.&lt;/li&gt;
&lt;li&gt;Pick a range for your x-axis that makes sense. On some dashboards, the range can be specified, in this case pick a good default that renders a consumable chart, so that the user is not scared away when opening the dashboard.&lt;/li&gt;
&lt;li&gt;Pick a granularity (daily, weekly, monthly, etc) for your x-axis that makes sense. On some dashboards, the granularity can be specified, in this case pick a good default that renders a consumable chart, so that the user is not scared away when opening the dashboard.&lt;/li&gt;
&lt;li&gt;The chart should have a clear and concise title, like &lt;code&gt;Weekly MyApp revenue by Country&lt;/code&gt;.&lt;/li&gt;
&lt;li&gt;If showing unsegmented multi-year data, consider breaking it into 1 line per year, to effectively show year-over-year visual information.&lt;/li&gt;
&lt;li&gt;If it makes sense, also show the cumulative metric value on a supporting line chart, eg. revenue, units sold, etc.&lt;/li&gt;
&lt;li&gt;If the metric is additive and segmented, show it on a stacked area chart.&lt;/li&gt;
&lt;li&gt;But only do so if the total is the actual total. For example, if you're only showing USA and Canada on a chart, but the app is actually also out in Mexico (but we don't yet have data on it), showing a "total" on a stacked area chart is misleading, because it's not the actual total. In such a partial case, it's better to use line charts and put a note in a textbox.&lt;/li&gt;
&lt;li&gt;If your data has outliers, in a product/business setting, consider leaving that data off (eg. by excluding it with a &lt;code&gt;HAVING&lt;/code&gt; clause in the &lt;code&gt;SQL&lt;/code&gt;).&lt;ul&gt;
&lt;li&gt;Never use a logarithmic y-axis, it's visually misleading, and some of your users will be confused.&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;/ul&gt;</content><category term="charts"></category><category term="dashboards"></category><category term="data"></category><category term="visualization"></category></entry><entry><title>Effective Data Visualization Part 2: Formatting numbers</title><link href="https://bytepawn.com/effective-data-visualization-part-2-formatting-numbers.html" rel="alternate"></link><published>2020-08-23T00:00:00+02:00</published><updated>2020-08-23T00:00:00+02:00</updated><author><name>Marton Trencseni</name></author><id>tag:bytepawn.com,2020-08-23:/effective-data-visualization-part-2-formatting-numbers.html</id><summary type="html">&lt;p&gt;Format numbers for human consumption. What is more readable, &lt;code&gt;1.539e+5&lt;/code&gt; or &lt;code&gt;153,859&lt;/code&gt;? Showing numbers effectively on spreadsheets, charts, dashboards, reports is a basic ingredient for readability, like formatting code in programming. &lt;br/&gt;&lt;br/&gt; &lt;img src="/images/d3format.png" style="width: 400px;"/&gt;&lt;/p&gt;</summary><content type="html">&lt;h2&gt;Introduction&lt;/h2&gt;
&lt;p&gt;Format numbers for human consumption. What is more readable, &lt;code&gt;1.539e+5&lt;/code&gt; or &lt;code&gt;153,859&lt;/code&gt;? Showing numbers effectively on spreadsheets, charts, dashboards and reports is a basic ingredient for readability, like formatting code.&lt;/p&gt;
&lt;p&gt;For this article, I will use the &lt;a href="https://d3js.org/"&gt;d3 Javascript visualization library&lt;/a&gt;. I use &lt;a href="https://github.com/apache/incubator-superset"&gt;Superset&lt;/a&gt; for charts and dashboards on a daily basis, and Superset uses d3.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Where should numbers be formatted?&lt;/strong&gt; By the database? No. Numbers and dates should be formatted at the last possible moment, before human consumption, eg. by the Javascript in the browser. Why? Because formatting (i) may lose information, for example the time part of a datetime, or numbers are rounded (ii) the type of the data (&lt;code&gt;date, datetime, int, float&lt;/code&gt;) is lost as everything is converted to a &lt;code&gt;string&lt;/code&gt;. This can lead to incorrect ordering, because &lt;code&gt;'9' &amp;gt; '10'&lt;/code&gt; even though &lt;code&gt;9 &amp;lt; 10&lt;/code&gt;.&lt;/p&gt;
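&lt;p&gt;The ordering pitfall is easy to reproduce; the same effect occurs in any language that sorts formatted strings, here sketched in Python:&lt;/p&gt;

```python
values = [2, 9, 10]
formatted = [str(v) for v in values]

numeric_order = sorted(values)    # numbers sort numerically
string_order = sorted(formatted)  # strings sort lexicographically, '10' comes first
```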
&lt;p&gt;On my Macbook, to play around with d3, I use &lt;a href="https://nodejs.org"&gt;node.js&lt;/a&gt;:&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;span class="nx"&gt;$&lt;/span&gt; &lt;span class="nx"&gt;npm&lt;/span&gt; &lt;span class="nx"&gt;install&lt;/span&gt; &lt;span class="nx"&gt;d3&lt;/span&gt; &lt;span class="o"&gt;&amp;amp;&amp;amp;&lt;/span&gt; &lt;span class="nx"&gt;npm&lt;/span&gt; &lt;span class="nx"&gt;install&lt;/span&gt; &lt;span class="o"&gt;-&lt;/span&gt;&lt;span class="nx"&gt;g&lt;/span&gt; &lt;span class="nx"&gt;d3&lt;/span&gt;
&lt;span class="nx"&gt;$&lt;/span&gt; &lt;span class="nx"&gt;node&lt;/span&gt;
&lt;span class="o"&gt;&amp;gt;&amp;gt;&lt;/span&gt; &lt;span class="kd"&gt;var&lt;/span&gt; &lt;span class="nx"&gt;d3&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nx"&gt;require&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s2"&gt;&amp;quot;d3&amp;quot;&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
&lt;span class="o"&gt;&amp;gt;&amp;gt;&lt;/span&gt; &lt;span class="nx"&gt;d3&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;format&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;,&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;)(&lt;/span&gt;&lt;span class="mi"&gt;153859&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="s1"&gt;&amp;#39;153,859&amp;#39;&lt;/span&gt;
&lt;span class="o"&gt;&amp;gt;&amp;gt;&lt;/span&gt; &lt;span class="nx"&gt;d3&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;timeFormat&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;%Y-%m-%d&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;)(&lt;/span&gt;&lt;span class="k"&gt;new&lt;/span&gt; &lt;span class="nb"&gt;Date&lt;/span&gt;&lt;span class="p"&gt;())&lt;/span&gt;
&lt;span class="s1"&gt;&amp;#39;2020-08-23&amp;#39;&lt;/span&gt;
&lt;/pre&gt;&lt;/div&gt;


&lt;h2&gt;Localization&lt;/h2&gt;
&lt;p&gt;Number formatting is locale specific. For example, &lt;code&gt;153,859.12&lt;/code&gt; is written as &lt;code&gt;153.859,12&lt;/code&gt; in Turkey. There is no hard rule here. Personally, I try to stick with US-standards. For more, see:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;&lt;a href="https://en.wikipedia.org/wiki/Decimal_separator"&gt;Decimal separator on Wikipedia&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="http://www.moserware.com/2008/02/does-your-code-pass-turkey-test.html"&gt;Does your code pass the Turkey test&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;
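&lt;p&gt;If you do need a non-US format, one simple approach (sketched in Python; a proper i18n library is the robust choice for production) is to format US-style first and then swap the separators:&lt;/p&gt;

```python
def format_number(x, decimals=2, thousands_sep=',', decimal_sep='.'):
    # Format US-style first, then swap separators via a placeholder
    # so the two replacements don't clobber each other
    us = f'{x:,.{decimals}f}'
    return (us.replace(',', '\0')
              .replace('.', decimal_sep)
              .replace('\0', thousands_sep))
```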
&lt;h2&gt;Dates&lt;/h2&gt;
&lt;p&gt;Format dates per the &lt;a href="https://en.wikipedia.org/wiki/ISO_8601"&gt;ISO-8601&lt;/a&gt; standard, like &lt;code&gt;YYYY-MM-DD&lt;/code&gt;. In d3, the format string for this is &lt;code&gt;%Y-%m-%d&lt;/code&gt;.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;What if we're showing monthly data, like monthly revenue?&lt;/strong&gt; What to show for the day part? I standardize on always using &lt;code&gt;YYYY-MM-DD&lt;/code&gt;, so I show &lt;code&gt;-01&lt;/code&gt; for the day part. An alternative is to use textual three letter months, like:&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;span class="o"&gt;&amp;gt;&lt;/span&gt; &lt;span class="nx"&gt;d3&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;timeFormat&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;%Y-%b&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;)(&lt;/span&gt;&lt;span class="k"&gt;new&lt;/span&gt; &lt;span class="nb"&gt;Date&lt;/span&gt;&lt;span class="p"&gt;())&lt;/span&gt;
&lt;span class="s1"&gt;&amp;#39;2020-Aug&amp;#39;&lt;/span&gt;
&lt;/pre&gt;&lt;/div&gt;


&lt;p&gt;&lt;strong&gt;What about timezones?&lt;/strong&gt; It's best to standardize on one timezone, preferably UTC, and show all dates in UTC time, without specifying the timezone. But this can still lead to weird artefacts. For example, for countries that are ahead of UTC, the Monday morning usage peak will show up on Sunday in UTC time. So the weekly seasonality gets shifted around, per country/timezone. However, this is still preferable to showing times in local times on dashboards, as that is too error-prone. Imagine debugging a drop in traffic, and having to shift the times for each country's local timezone in your head.&lt;/p&gt;
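&lt;p&gt;The weekday shift is easy to demonstrate; a Python sketch (assumes Python 3.9+ for &lt;code&gt;zoneinfo&lt;/code&gt;; the date and timezone are just examples):&lt;/p&gt;

```python
from datetime import datetime, timezone
from zoneinfo import ZoneInfo  # Python 3.9+; may require the tzdata package

# Monday 09:00 morning peak in Auckland (UTC+12 in August)...
monday_peak = datetime(2020, 8, 24, 9, 0, tzinfo=ZoneInfo('Pacific/Auckland'))
in_utc = monday_peak.astimezone(timezone.utc)
# ...shows up on Sunday evening in UTC
```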
&lt;h2&gt;Currencies&lt;/h2&gt;
&lt;p&gt;Monetary values should be converted to a common currency, like USD or AED. Showing different currencies on the same chart is an error, whether it's a linechart or stacked area chart. Everybody knows what &lt;code&gt;$1,000&lt;/code&gt; means, but other currencies don't have such a one-letter abbreviation. I always use the &lt;a href="https://en.wikipedia.org/wiki/ISO_4217"&gt;ISO 4217&lt;/a&gt; currency designation, after the number, with a space, like &lt;code&gt;1,000 USD&lt;/code&gt;, which is &lt;code&gt;3,672.94 AED&lt;/code&gt; right now. The ISO standard does not specify the ordering or spacing.&lt;/p&gt;
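&lt;p&gt;A minimal formatting helper along these lines, sketched in Python (the exchange rate is the illustrative one from the text, not live data):&lt;/p&gt;

```python
AED_PER_USD = 3.67294  # illustrative rate, not live data

def format_money(amount, currency):
    # ISO 4217 code after the number, separated by a space
    return f'{amount:,.2f} {currency}'
```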
&lt;h2&gt;Numbers&lt;/h2&gt;
&lt;p&gt;&lt;strong&gt;Don't round numbers.&lt;/strong&gt; If your user doesn't care about the details, let him ignore the rest. Rounded numbers can be confusing, because it seems like we don't know the exact count.&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;span class="c1"&gt;// don&amp;#39;t do this, it seems like we don&amp;#39;t know the exact number&lt;/span&gt;
&lt;span class="o"&gt;&amp;gt;&lt;/span&gt; &lt;span class="nx"&gt;d3&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;format&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;.2r&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;)(&lt;/span&gt;&lt;span class="mi"&gt;12345&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="s1"&gt;&amp;#39;12000&amp;#39;&lt;/span&gt;
&lt;/pre&gt;&lt;/div&gt;


&lt;p&gt;&lt;strong&gt;Don't use Ms for millions and ks for thousands.&lt;/strong&gt; Same argument as for rounding: the user's eyes and brains has to do extra processing to tell what's going on, because the formatting changed in the middle.&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;span class="c1"&gt;// don&amp;#39;t do this, it looks weird&lt;/span&gt;
&lt;span class="o"&gt;&amp;gt;&lt;/span&gt; &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;i&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;800&lt;/span&gt;&lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="mi"&gt;1000&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt; &lt;span class="nx"&gt;i&lt;/span&gt; &lt;span class="o"&gt;&amp;lt;=&lt;/span&gt; &lt;span class="mi"&gt;1200&lt;/span&gt;&lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="mi"&gt;1000&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt; &lt;span class="nx"&gt;i&lt;/span&gt; &lt;span class="o"&gt;+=&lt;/span&gt; &lt;span class="mi"&gt;100&lt;/span&gt;&lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="mi"&gt;1000&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="nx"&gt;console&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;log&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;d3&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;format&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;.2s&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;)(&lt;/span&gt;&lt;span class="nx"&gt;i&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt; &lt;span class="p"&gt;}&lt;/span&gt;
&lt;span class="mi"&gt;800&lt;/span&gt;&lt;span class="nx"&gt;k&lt;/span&gt;
&lt;span class="mi"&gt;900&lt;/span&gt;&lt;span class="nx"&gt;k&lt;/span&gt;
&lt;span class="mf"&gt;1.0&lt;/span&gt;&lt;span class="nx"&gt;M&lt;/span&gt;
&lt;span class="mf"&gt;1.1&lt;/span&gt;&lt;span class="nx"&gt;M&lt;/span&gt;
&lt;span class="mf"&gt;1.2&lt;/span&gt;&lt;span class="nx"&gt;M&lt;/span&gt;
&lt;/pre&gt;&lt;/div&gt;


&lt;p&gt;Just show the number, aligned:&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;span class="c1"&gt;// do this: the 9 in the format string pads it on the left with spaces so it&amp;#39;s 9 characters wide&lt;/span&gt;
&lt;span class="o"&gt;&amp;gt;&lt;/span&gt; &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;i&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;800&lt;/span&gt;&lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="mi"&gt;1000&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt; &lt;span class="nx"&gt;i&lt;/span&gt; &lt;span class="o"&gt;&amp;lt;=&lt;/span&gt; &lt;span class="mi"&gt;1200&lt;/span&gt;&lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="mi"&gt;1000&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt; &lt;span class="nx"&gt;i&lt;/span&gt; &lt;span class="o"&gt;+=&lt;/span&gt; &lt;span class="mi"&gt;100&lt;/span&gt;&lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="mi"&gt;1000&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="nx"&gt;console&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;log&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;d3&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;format&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;9,&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;)(&lt;/span&gt;&lt;span class="nx"&gt;i&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt; &lt;span class="p"&gt;}&lt;/span&gt;
  &lt;span class="mi"&gt;800&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="mi"&gt;000&lt;/span&gt;
  &lt;span class="mi"&gt;900&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="mi"&gt;000&lt;/span&gt;
&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="mi"&gt;000&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="mi"&gt;000&lt;/span&gt;
&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="mi"&gt;100&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="mi"&gt;000&lt;/span&gt;
&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="mi"&gt;200&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="mi"&gt;000&lt;/span&gt;
&lt;/pre&gt;&lt;/div&gt;


&lt;p&gt;&lt;strong&gt;Use separators for thousands&lt;/strong&gt;, like &lt;code&gt;1,234,456&lt;/code&gt;. In d3, this is accomplished with the &lt;code&gt;,&lt;/code&gt; format string:&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;span class="nx"&gt;d3&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;format&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;,&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;)(&lt;/span&gt;&lt;span class="mi"&gt;123456&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="o"&gt;&amp;gt;&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="mi"&gt;234&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="mi"&gt;456&lt;/span&gt;
&lt;/pre&gt;&lt;/div&gt;


&lt;p&gt;Use the appropriate number of decimals. For integer numbers, like number of users, don't show decimals.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;For numbers that have a fractional part, but it's insignificant, don't show it&lt;/strong&gt; [in Data Science work]. If you're showing the monthly revenues, and it's in the millions of dollars, don't show the cents.&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;span class="o"&gt;&amp;gt;&lt;/span&gt; &lt;span class="nx"&gt;d3&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;format&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;,.0f&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;)(&lt;/span&gt;&lt;span class="mf"&gt;123456.123&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="s1"&gt;&amp;#39;123,456&amp;#39;&lt;/span&gt;
&lt;/pre&gt;&lt;/div&gt;


&lt;p&gt;&lt;strong&gt;For ratios and probabilities that are usually shown as percentages, show a percentage between 0 and 100, and not a ratio between 0 and 1.&lt;/strong&gt; For example, funnel conversions are usually shown as percentages. Show the appropriate number of digits after the decimal, which depends on the type of measurement. I usually show 0 or 1 digits after the decimal. Sometimes it's good to show at least one digit just to signify that the numbers are not estimates (which helps if the numbers happen to be round, like A had 40.2% conversion and B had 50.4%).&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;span class="o"&gt;&amp;gt;&lt;/span&gt; &lt;span class="nx"&gt;d3&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;format&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;.1%&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;)(&lt;/span&gt;&lt;span class="mf"&gt;0.402&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="s1"&gt;&amp;#39;40.2%&amp;#39;&lt;/span&gt;
&lt;/pre&gt;&lt;/div&gt;


&lt;p&gt;For p values in statistical significance testing, I like to show 3 decimals, like &lt;code&gt;0.001&lt;/code&gt; (d3 format string is &lt;code&gt;.3f&lt;/code&gt;). This is a bit weird if the value is &lt;code&gt;0.00001&lt;/code&gt;, because it will show as &lt;code&gt;0.000&lt;/code&gt;, but that's fine.&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;span class="o"&gt;&amp;gt;&lt;/span&gt; &lt;span class="nx"&gt;d3&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;format&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;.3f&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;)(&lt;/span&gt;&lt;span class="mf"&gt;0.001&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="s1"&gt;&amp;#39;0.001&amp;#39;&lt;/span&gt;
&lt;span class="o"&gt;&amp;gt;&lt;/span&gt; &lt;span class="nx"&gt;d3&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;format&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;.3f&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;)(&lt;/span&gt;&lt;span class="mf"&gt;0.00001&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="s1"&gt;&amp;#39;0.000&amp;#39;&lt;/span&gt;
&lt;/pre&gt;&lt;/div&gt;


&lt;h2&gt;Conclusion&lt;/h2&gt;
&lt;p&gt;Remember: format numbers for human consumption. Make it easy on the eyes, so the user doesn't have to think. Well-formatted numbers are beautiful.&lt;/p&gt;</content><category term="charts"></category><category term="dashboards"></category><category term="data"></category><category term="visualization"></category></entry><entry><title>Effective Data Visualization Part 1: Categorical data</title><link href="https://bytepawn.com/effective-data-visualization-part-1-categorical-data.html" rel="alternate"></link><published>2020-08-22T00:00:00+02:00</published><updated>2020-08-22T00:00:00+02:00</updated><author><name>Marton Trencseni</name></author><id>tag:bytepawn.com,2020-08-22:/effective-data-visualization-part-1-categorical-data.html</id><summary type="html">&lt;p&gt;Making clear, readable charts is part of the craftsmanship minimum for any data related role. In part one, I look at how to present categorical data. &lt;br/&gt;&lt;br/&gt; &lt;img src="/images/pie_chart1.png" alt="A pie chart" style="width: 400px;"/&gt;&lt;/p&gt;</summary><content type="html">&lt;h2&gt;Introduction&lt;/h2&gt;
&lt;p&gt;The goal of building charts and dashboards is to convey information about the underlying data effectively. &lt;strong&gt;The user should be able to look at the chart and "get it" quickly&lt;/strong&gt;. The data professional who builds the chart needs to know what "it" is, or at least how the user can get at it.&lt;/p&gt;
&lt;p&gt;Unfortunately, many data professionals consider building charts and dashboards to be a low-level activity that's "just BI", and they can't be bothered with such considerations. &lt;strong&gt;This is a fallacy.&lt;/strong&gt; Making clear, readable charts is part of the craftsmanship minimum for any data related role, whether it's a Data Engineer, Product Analyst, BI Analyst, Data Scientist or ML Engineer. Doing a poor job presenting data in the form of charts and dashboards can mislead or alienate the audience and block the impact of the underlying work.&lt;/p&gt;
&lt;h2&gt;Pie charts&lt;/h2&gt;
&lt;p&gt;What's the problem with this chart?&lt;/p&gt;
&lt;p&gt;&lt;img src="/images/pie_chart1.png" style="width: 500px;"/&gt;&lt;/p&gt;
&lt;p&gt;3D doesn't add anything. Let's make it 2D:&lt;/p&gt;
&lt;p&gt;&lt;img src="/images/pie_chart2.png" style="width: 500px;"/&gt;&lt;/p&gt;
&lt;p&gt;The hole in the middle doesn't help. Let's remove it:&lt;/p&gt;
&lt;p&gt;&lt;img src="/images/pie_chart3.png" style="width: 500px;"/&gt;&lt;/p&gt;
&lt;p&gt;Okay, now the pie chart is pretty minimal, but it's still not good. The problem is, the human brain is not good at reading off proportions from a circular scale. It's really hard to tell how the orange and red areas compare. You actually have to move your eye to the text and read it off. But if you're reading it off the text anyway, what's the point of the pie chart? Let's try a bar chart.&lt;/p&gt;
&lt;h2&gt;Bar charts&lt;/h2&gt;
&lt;p&gt;&lt;img src="/images/bar_chart1.png" style="width: 500px;"/&gt;&lt;/p&gt;
&lt;p&gt;This is better. But what happened with the percentages? The bar chart shows the original metric values now. Let's add some axis labels and a chart title to make it more clear what we're looking at:&lt;/p&gt;
&lt;p&gt;&lt;img src="/images/bar_chart2.png" style="width: 500px;"/&gt;&lt;/p&gt;
&lt;p&gt;This is better. But now we're forced to do eye gymnastics and track the height of the bars to the axis to try to read off values, and we'll be reading off estimated values. Let's fix that:&lt;/p&gt;
&lt;p&gt;&lt;img src="/images/bar_chart3.png" style="width: 500px;"/&gt;&lt;/p&gt;
&lt;p&gt;Okay, but it's still confusing. When was this the number of users? In 2019 June or in 2020 February? Let's add that to the title. Format dates per the &lt;a href="https://en.wikipedia.org/wiki/ISO_8601"&gt;ISO-8601&lt;/a&gt; standard, like &lt;code&gt;YYYY-MM-DD&lt;/code&gt;.&lt;/p&gt;
&lt;p&gt;&lt;img src="/images/bar_chart4.png" style="width: 500px;"/&gt;&lt;/p&gt;
&lt;p&gt;Not bad. But what if there are more countries? A vertical bar chart will quickly run out of room on the x-axis and the charting software will start tilting the labels:&lt;/p&gt;
&lt;p&gt;&lt;img src="/images/bar_chart5.png" style="width: 500px;"/&gt;&lt;/p&gt;
&lt;p&gt;Let's try a horizontal bar chart:&lt;/p&gt;
&lt;p&gt;&lt;img src="/images/bar_chart6.png" style="width: 500px;"/&gt;&lt;/p&gt;
&lt;p&gt;The major gridlines don't add any value here, let's remove those:&lt;/p&gt;
&lt;p&gt;&lt;img src="/images/bar_chart7.png" style="width: 500px;"/&gt;&lt;/p&gt;
&lt;p&gt;This is somewhat better. But with many values displayed, you have to be careful when moving your eye from left to right, to read off the right value, and not confuse numbers above/below. Let's use colors to guide the eye:&lt;/p&gt;
&lt;p&gt;&lt;img src="/images/bar_chart8.png" style="width: 500px;"/&gt;&lt;/p&gt;
&lt;p&gt;This is not terrible. What we've lost compared to the pie chart is that we can't read off percentages.&lt;/p&gt;
&lt;p&gt;At the end of the day, for such a chart, we should either use a horizontal bar chart with a table attached, or just the table.&lt;/p&gt;
&lt;h2&gt;Tables&lt;/h2&gt;
&lt;p&gt;&lt;img src="/images/table1.png" style="width: 300px;"/&gt;&lt;/p&gt;
&lt;p&gt;Let's improve the table, by:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;using borders to structure the table&lt;/li&gt;
&lt;li&gt;using bold text to highlight columns and totals&lt;/li&gt;
&lt;li&gt;centering text so it's easier on the eyes&lt;/li&gt;
&lt;li&gt;showing secondary information, such as percentages, in gray and smaller text&lt;/li&gt;
&lt;li&gt;adding cumulative percentages, so we can see that the top 4 countries contribute half the users&lt;/li&gt;
&lt;li&gt;changing the column header to make it clear that user counts are in millions&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;&lt;img src="/images/table2.png" style="width: 400px;"/&gt;&lt;/p&gt;
&lt;p&gt;Putting it together:&lt;/p&gt;
&lt;p&gt;&lt;img src="/images/final1.png" style="width: 900px;"/&gt;&lt;/p&gt;
&lt;p&gt;Great! But are we talking about registered users, monthly active users (MAUs) or daily active users (DAUs)? Let's assume we are talking about MAUs, and we can safely assume our audience understands the acronym:&lt;/p&gt;
&lt;p&gt;&lt;img src="/images/final2.png" style="width: 900px;"/&gt;&lt;/p&gt;
&lt;h2&gt;Conclusion&lt;/h2&gt;
&lt;p&gt;&lt;strong&gt;Making clear, readable charts is part of the craftsmanship minimum for any data related role.&lt;/strong&gt; It doesn't matter whether it's in Google Charts (what I used here), Excel, or a dashboarding software like Tableau or Superset. &lt;strong&gt;Pie charts are overused and should never be used.&lt;/strong&gt; A combination of bar charts, tables (and line charts, not covered in this article) is always preferable.&lt;/p&gt;</content><category term="charts"></category><category term="dashboards"></category><category term="data"></category><category term="visualization"></category></entry><entry><title>Multi-armed bandits and false positives</title><link href="https://bytepawn.com/multi-armed-bandits-and-false-positives.html" rel="alternate"></link><published>2020-08-21T00:00:00+02:00</published><updated>2020-08-21T00:00:00+02:00</updated><author><name>Marton Trencseni</name></author><id>tag:bytepawn.com,2020-08-21:/multi-armed-bandits-and-false-positives.html</id><summary type="html">&lt;p&gt;I use Monte Carlo simulations to explore the false positive rate of Multi-armed bandits.&lt;br/&gt;&lt;br/&gt; &lt;img src="/images/mab-fixed-fpr-ps.png" alt="Epsilon-greedy" style="width: 400px;"/&gt;&lt;/p&gt;</summary><content type="html">&lt;h2&gt;Introduction&lt;/h2&gt;
&lt;p&gt;In the previous blog post &lt;a href="http://bytepawn.com/ab-testing-and-multi-armed-bandits.html"&gt;A/B testing and Multi-armed bandits&lt;/a&gt; I compared fixed-split A/B testing, Epsilon-greedy, UCB1 and Thompson sampling. We saw that MABs offer an interesting alternative to fixed-split A/B testing, being in a different region of the exploration-exploitation space. We saw that with MABs, the price we pay for less regret at the end of the experiment is less statistical significance: in a balanced 50-50 fixed-split A/B test, A and B get an equal number of hits, irrespective of how they have performed so far, while MABs will favor the variant that has performed better. However, because of the imbalance, the final result will be less decisive (higher p-value), so we will be less sure that we actually found the better funnel.&lt;/p&gt;
&lt;p&gt;In this post, I will explore what happens if A and B are actually the same; so neither A and B are really better than the other. In such cases, we ideally want the A/B test to not "go off" and call a winner (a false positive). In frequentist hypothesis testing, we control this with the $\alpha=p_{crit}$ parameter, which we set to 0.05 here. As a reminder of statistical hypothesis testing, see the earlier post &lt;a href="http://bytepawn.com/ab-testing-and-the-ztest.html"&gt;A/B testing and the Z-test&lt;/a&gt;.&lt;/p&gt;
&lt;p&gt;&lt;a href="https://github.com/mtrencseni/playground/blob/master/AB%20testing%20and%20Multi-armed%20bandits%20(FPs).ipynb"&gt;The code is up on Github.&lt;/a&gt;&lt;/p&gt;
&lt;h2&gt;Metrics&lt;/h2&gt;
&lt;p&gt;I will use Monte Carlo simulations to compare the above 4 methods, with both A and B converting at 5%. I am curious about two outcome metrics:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;&lt;strong&gt;Imbalance&lt;/strong&gt;: even though A and B are the same, MABs will probably favor one or the other, so that, for example, A gets 80% of the hits and B only 20%; on average, how far off is this split from 50%-50%? Since there is no difference, a high imbalance is not actually a problem; it's a measure of how much the MABs "mislead" themselves.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;False positive rate&lt;/strong&gt;: how often is the result statistically significant?&lt;/li&gt;
&lt;/ul&gt;
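Both metrics are straightforward to compute from the simulation outputs. A minimal, self-contained sketch of the two computations, using randomly generated stand-in data rather than the post's actual simulation results:

```python
import numpy as np

# Stand-in simulation outputs: one row of 0/1 funnel choices per run, and
# one p-value per run. These arrays are synthetic, for illustration only.
rng = np.random.default_rng(0)
N, num_simulations, alpha = 10_000, 1_000, 0.05
funnels_chosen_runs = rng.integers(0, 2, size=(num_simulations, N))
p_values = rng.uniform(0, 1, size=num_simulations)  # uniform under the null

# Imbalance: average absolute deviation of funnel B's traffic share from 0.5
imbalances = np.abs(funnels_chosen_runs.mean(axis=1) - 0.5)
print('Avg imbalance: %.3f' % imbalances.mean())

# False positive rate: fraction of runs whose p-value falls below alpha
fpr = np.less(p_values, alpha).mean()
print('False positive rate: %.3f' % fpr)
```

With a fair 50-50 split and uniform p-values, the imbalance is close to 0 and the false positive rate close to alpha, which is the baseline the tables below compare against.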
&lt;h2&gt;Code&lt;/h2&gt;
&lt;p&gt;The Monte Carlo code, which computes the above metrics:&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;choice&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;ps&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;np&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;random&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;choice&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nb"&gt;len&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;ps&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt; &lt;span class="n"&gt;p&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;ps&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;simulate_abtest_many&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;funnels&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;N&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;num_simulations&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;simulate_abtest_one&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;p_crit&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mf"&gt;0.05&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
    &lt;span class="n"&gt;ps&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[]&lt;/span&gt;
    &lt;span class="n"&gt;num_winning&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;
    &lt;span class="n"&gt;num_significant&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;
    &lt;span class="n"&gt;imbalances&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[]&lt;/span&gt;
    &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;i&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="nb"&gt;range&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;num_simulations&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
        &lt;span class="k"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;Doing simulation run &lt;/span&gt;&lt;span class="si"&gt;%d&lt;/span&gt;&lt;span class="s1"&gt;...&amp;#39;&lt;/span&gt; &lt;span class="o"&gt;%&lt;/span&gt; &lt;span class="n"&gt;i&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;end&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;&lt;/span&gt;&lt;span class="se"&gt;\r&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
        &lt;span class="n"&gt;observations&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;funnels_chosen&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;simulate_abtest_one&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;funnels&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;N&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
        &lt;span class="n"&gt;imbalance&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nb"&gt;abs&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nb"&gt;sum&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;funnels_chosen&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;&lt;span class="o"&gt;/&lt;/span&gt;&lt;span class="n"&gt;N&lt;/span&gt; &lt;span class="o"&gt;-&lt;/span&gt; &lt;span class="mf"&gt;0.5&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
        &lt;span class="n"&gt;imbalances&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;append&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;imbalance&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
        &lt;span class="n"&gt;p&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;fisher_exact&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;observations&lt;/span&gt;&lt;span class="p"&gt;)[&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;
        &lt;span class="n"&gt;ps&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;append&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;p&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
        &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;p&lt;/span&gt; &lt;span class="o"&gt;&amp;lt;&lt;/span&gt; &lt;span class="mf"&gt;0.05&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
            &lt;span class="n"&gt;num_significant&lt;/span&gt; &lt;span class="o"&gt;+=&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt;
    &lt;span class="k"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;&lt;/span&gt;&lt;span class="se"&gt;\n&lt;/span&gt;&lt;span class="s1"&gt;Done!&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="k"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;Avg imbalance: &lt;/span&gt;&lt;span class="si"&gt;%.3f&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;&lt;/span&gt; &lt;span class="o"&gt;%&lt;/span&gt; &lt;span class="n"&gt;np&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;mean&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;imbalances&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;
    &lt;span class="k"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;False positive rate: &lt;/span&gt;&lt;span class="si"&gt;%.3f&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;&lt;/span&gt; &lt;span class="o"&gt;%&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;num_significant&lt;/span&gt;&lt;span class="o"&gt;/&lt;/span&gt;&lt;span class="n"&gt;num_simulations&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;
    &lt;span class="k"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;Histogram of p values:&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="n"&gt;plt&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;figure&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;figsize&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;10&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;5&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;
    &lt;span class="n"&gt;ax&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;sns&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;distplot&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;ps&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;kde&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="bp"&gt;False&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;rug&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="bp"&gt;True&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="n"&gt;ax&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;set&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;xlabel&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;p value&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;ylabel&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;count&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="n"&gt;plt&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;show&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
&lt;/pre&gt;&lt;/div&gt;


&lt;p&gt;The code for the methods is unchanged, so I won't repeat it. I will simulate funnels converting at 5%, with N=10,000 samples, and run the experiments 1,000 times to get statistics. The driver code (for the fixed case):&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;span class="n"&gt;funnels&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;
    &lt;span class="p"&gt;[[&lt;/span&gt;&lt;span class="mf"&gt;0.95&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mf"&gt;0.05&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt; &lt;span class="mf"&gt;0.50&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt; &lt;span class="c1"&gt;# the first vector is the actual outcomes, the second is the traffic split&lt;/span&gt;
    &lt;span class="p"&gt;[[&lt;/span&gt;&lt;span class="mf"&gt;0.95&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mf"&gt;0.05&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt; &lt;span class="mf"&gt;0.50&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt;
&lt;span class="p"&gt;]&lt;/span&gt;
&lt;span class="n"&gt;N&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;10&lt;/span&gt;&lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="mi"&gt;1000&lt;/span&gt;
&lt;span class="n"&gt;num_simulations&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;1000&lt;/span&gt;
&lt;span class="n"&gt;simulate_abtest_many&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;funnels&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;N&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;num_simulations&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;simulate_abtest_fixed&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/pre&gt;&lt;/div&gt;


&lt;h2&gt;Results&lt;/h2&gt;
&lt;p&gt;First, the results from the last post, when B was better than A:&lt;/p&gt;
&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;method&lt;/th&gt;
&lt;th&gt;N&lt;/th&gt;
&lt;th&gt;num_simulations&lt;/th&gt;
&lt;th&gt;conv_A&lt;/th&gt;
&lt;th&gt;conv_B&lt;/th&gt;
&lt;th&gt;ratio_better_won&lt;/th&gt;
&lt;th&gt;ratio_won_statsign&lt;/th&gt;
&lt;th&gt;ratio_traffic_better&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;Fixed-split&lt;/td&gt;
&lt;td&gt;10,000&lt;/td&gt;
&lt;td&gt;1,000&lt;/td&gt;
&lt;td&gt;5%&lt;/td&gt;
&lt;td&gt;6%&lt;/td&gt;
&lt;td&gt;1.00&lt;/td&gt;
&lt;td&gt;0.59&lt;/td&gt;
&lt;td&gt;0.50&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Epsilon-greedy(eps=0.1)&lt;/td&gt;
&lt;td&gt;10,000&lt;/td&gt;
&lt;td&gt;1,000&lt;/td&gt;
&lt;td&gt;5%&lt;/td&gt;
&lt;td&gt;6%&lt;/td&gt;
&lt;td&gt;0.85&lt;/td&gt;
&lt;td&gt;0.38&lt;/td&gt;
&lt;td&gt;0.70&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;UCB1&lt;/td&gt;
&lt;td&gt;10,000&lt;/td&gt;
&lt;td&gt;1,000&lt;/td&gt;
&lt;td&gt;5%&lt;/td&gt;
&lt;td&gt;6%&lt;/td&gt;
&lt;td&gt;0.99&lt;/td&gt;
&lt;td&gt;0.61&lt;/td&gt;
&lt;td&gt;0.58&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Thompson sampling&lt;/td&gt;
&lt;td&gt;10,000&lt;/td&gt;
&lt;td&gt;1,000&lt;/td&gt;
&lt;td&gt;5%&lt;/td&gt;
&lt;td&gt;6%&lt;/td&gt;
&lt;td&gt;0.96&lt;/td&gt;
&lt;td&gt;0.40&lt;/td&gt;
&lt;td&gt;0.80&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;
&lt;p&gt;The results for this post, when both A and B are the same:&lt;/p&gt;
&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;method&lt;/th&gt;
&lt;th&gt;N&lt;/th&gt;
&lt;th&gt;num_simulations&lt;/th&gt;
&lt;th&gt;conversion&lt;/th&gt;
&lt;th&gt;alpha&lt;/th&gt;
&lt;th&gt;imbalance&lt;/th&gt;
&lt;th&gt;fpr&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;Fixed-split&lt;/td&gt;
&lt;td&gt;10,000&lt;/td&gt;
&lt;td&gt;1,000&lt;/td&gt;
&lt;td&gt;0.05&lt;/td&gt;
&lt;td&gt;0.05&lt;/td&gt;
&lt;td&gt;0.004&lt;/td&gt;
&lt;td&gt;0.035&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Epsilon-greedy(eps=0.1)&lt;/td&gt;
&lt;td&gt;10,000&lt;/td&gt;
&lt;td&gt;1,000&lt;/td&gt;
&lt;td&gt;0.05&lt;/td&gt;
&lt;td&gt;0.05&lt;/td&gt;
&lt;td&gt;0.368&lt;/td&gt;
&lt;td&gt;0.039&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;UCB1&lt;/td&gt;
&lt;td&gt;10,000&lt;/td&gt;
&lt;td&gt;1,000&lt;/td&gt;
&lt;td&gt;0.05&lt;/td&gt;
&lt;td&gt;0.05&lt;/td&gt;
&lt;td&gt;0.029&lt;/td&gt;
&lt;td&gt;0.054&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Thompson sampling&lt;/td&gt;
&lt;td&gt;10,000&lt;/td&gt;
&lt;td&gt;1,000&lt;/td&gt;
&lt;td&gt;0.05&lt;/td&gt;
&lt;td&gt;0.05&lt;/td&gt;
&lt;td&gt;0.234&lt;/td&gt;
&lt;td&gt;0.090&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;
&lt;h2&gt;Conclusion&lt;/h2&gt;
&lt;p&gt;My takeaways:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;UCB1 is not very appealing, since when B is better than A, it only puts marginally more traffic (&lt;code&gt;ratio_traffic_better&lt;/code&gt;) into B versus A when compared to fixed-split&lt;/li&gt;
&lt;li&gt;Thompson sampling is the best for directing traffic to the better funnel (&lt;code&gt;ratio_traffic_better&lt;/code&gt;), but it has significantly higher false positive rate (&lt;code&gt;fpr&lt;/code&gt;)&lt;/li&gt;
&lt;li&gt;Epsilon-greedy looks like a good trade-off, and is simple to understand. If I had to use a MAB in production, I would conduct more tests with different epsilon parameters to see if there's a sweet spot.&lt;/li&gt;
&lt;/ul&gt;
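For intuition on why Thompson sampling funnels so much traffic to the apparent winner, here is a minimal sketch of its arm choice: sample a conversion rate from each arm's Beta posterior and play the arm with the highest draw. The Beta(1, 1) prior is my assumption here, a common default, not necessarily what these posts used:

```python
import random

def thompson_choice(conversions, trials):
    # One Beta posterior per arm: Beta(1 + successes, 1 + failures).
    samples = [random.betavariate(1 + c, 1 + t - c)
               for c, t in zip(conversions, trials)]
    # Play the arm whose posterior draw is highest.
    return max(range(len(samples)), key=lambda i: samples[i])
```

Once one arm's posterior concentrates above the other's, its draws win almost every round, which is exactly the traffic imbalance (and inflated false positive rate) seen in the table above.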
&lt;h2&gt;Appendix&lt;/h2&gt;
&lt;p&gt;Distribution of p values for fixed-split testing:&lt;/p&gt;
&lt;p&gt;&lt;img src="/images/mab-fixed-fpr-ps.png" style="width: 600px;"/&gt;&lt;/p&gt;
&lt;p&gt;Distribution of p values for the epsilon-greedy MAB:&lt;/p&gt;
&lt;p&gt;&lt;img src="/images/mab-eps-fpr-ps.png" style="width: 600px;"/&gt;&lt;/p&gt;
&lt;p&gt;Distribution of p values for the UCB1 MAB:&lt;/p&gt;
&lt;p&gt;&lt;img src="/images/mab-ucb1-fpr-ps.png" style="width: 600px;"/&gt;&lt;/p&gt;
&lt;p&gt;Distribution of p values for the Thompson MAB:&lt;/p&gt;
&lt;p&gt;&lt;img src="/images/mab-thompson-fpr-ps.png" style="width: 600px;"/&gt;&lt;/p&gt;</content><category term="ab-testing"></category></entry><entry><title>A/B testing and Multi-armed bandits</title><link href="https://bytepawn.com/ab-testing-and-multi-armed-bandits.html" rel="alternate"></link><published>2020-08-07T00:00:00+02:00</published><updated>2020-08-07T00:00:00+02:00</updated><author><name>Marton Trencseni</name></author><id>tag:bytepawn.com,2020-08-07:/ab-testing-and-multi-armed-bandits.html</id><summary type="html">&lt;p&gt;Multi-armed bandits minimize regret when performing A/B tests, trading off between exploration and exploitation. Monte Carlo simulations shows that less exploration yields less statistical significance.&lt;br/&gt;&lt;br/&gt; &lt;img src="/images/mab_epsg_ps.png" alt="Epsilon-greedy" style="width: 400px;"/&gt;&lt;/p&gt;</summary><content type="html">&lt;h2&gt;Introduction&lt;/h2&gt;
&lt;p&gt;When we perform an A/B test, we split the population in a fixed way, let's say 50-50. We then put N users through the different A and B funnels: for each user, we flip a coin to decide whether they get A or B. We then measure the outcome metric, let's say conversion, for the A and B populations, and compare the outcomes to tell whether A or B is better. We use statistical hypothesis testing, let's say &lt;a href="http://bytepawn.com/ab-testing-and-fishers-exact-test.html"&gt;Fisher's exact test&lt;/a&gt;, to see whether the difference is statistically significant.&lt;/p&gt;
&lt;p&gt;Suppose the A/B test is performed to compare different versions of signup funnels for paid licenses, so there is revenue involved. In this case, a business minded person could ask: &lt;em&gt;"If B is generating more revenue than A, could we have sent less users into A and more into B, to maximize our revenue, even while the test is running?"&lt;/em&gt;&lt;/p&gt;
&lt;p&gt;This is what &lt;a href="https://en.wikipedia.org/wiki/Multi-armed_bandit"&gt;Multi-armed bandit&lt;/a&gt; algorithms are for. Per Wikipedia:&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;In probability theory, the multi-armed bandit problem is a problem in which a fixed limited set of resources must be allocated between competing (alternative) choices in a way that maximizes their expected gain, when each choice's properties are only partially known at the time of allocation, and may become better understood as time passes or by allocating resources to the choice. This is a classic reinforcement learning problem that exemplifies the exploration–exploitation tradeoff dilemma.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;In a Multi-armed bandit (MAB) approach, instead of having a fixed split of users between A and B, we make this decision based on how A and B have performed so far. MAB approaches attempt to strike a balance between exploitation (putting users into the better funnel, based on what we've seen so far) and exploration (collecting more data about funnels which have seen less users so far). In general, MAB algorithms will favor funnels which have performed better so far.&lt;/p&gt;
&lt;p&gt;The high-level goal of MAB algorithms is to minimize regret: in this example, the additional amount of money we could have made if we put all users into the funnel that actually performs better. Here I will show three MAB algorithms: Epsilon-greedy, UCB1 and Thompson sampling.&lt;/p&gt;
&lt;p&gt;&lt;a href="https://github.com/mtrencseni/playground/blob/master/AB%20testing%20and%20Multi-armed%20bandits.ipynb"&gt;The code shown below is up on Github.&lt;/a&gt;&lt;/p&gt;
&lt;h2&gt;Fixed split A/B testing&lt;/h2&gt;
&lt;p&gt;To set a baseline and write some useful code, let's start with plain-vanilla, fixed split A/B testing:&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;simulate_abtest_fixed&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;funnels&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;N&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
    &lt;span class="n"&gt;traffic_split&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;x&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;x&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;funnels&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;
    &lt;span class="n"&gt;observations&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;np&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;zeros&lt;/span&gt;&lt;span class="p"&gt;([&lt;/span&gt;&lt;span class="nb"&gt;len&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;funnels&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt; &lt;span class="nb"&gt;len&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;funnels&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;][&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;])])&lt;/span&gt;
    &lt;span class="n"&gt;funnels_chosen&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[]&lt;/span&gt;
    &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;_&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="nb"&gt;range&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;N&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
        &lt;span class="n"&gt;which_funnel&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;choice&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;traffic_split&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
        &lt;span class="n"&gt;funnels_chosen&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;append&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;which_funnel&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
        &lt;span class="n"&gt;funnel_outcome&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;choice&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;funnels&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;which_funnel&lt;/span&gt;&lt;span class="p"&gt;][&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;])&lt;/span&gt;
        &lt;span class="n"&gt;observations&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;which_funnel&lt;/span&gt;&lt;span class="p"&gt;][&lt;/span&gt;&lt;span class="n"&gt;funnel_outcome&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="o"&gt;+=&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt;
    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;observations&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;funnels_chosen&lt;/span&gt;
&lt;/pre&gt;&lt;/div&gt;


&lt;p&gt;This is just regular A/B testing. It simulates N users, and each user goes into the funnels according to &lt;code&gt;traffic_split&lt;/code&gt;:&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;span class="n"&gt;funnels&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;
    &lt;span class="p"&gt;[[&lt;/span&gt;&lt;span class="mf"&gt;0.95&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mf"&gt;0.05&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt; &lt;span class="mf"&gt;0.50&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt; &lt;span class="c1"&gt;# the first vector is the actual outcomes, the second is the traffic split&lt;/span&gt;
    &lt;span class="p"&gt;[[&lt;/span&gt;&lt;span class="mf"&gt;0.94&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mf"&gt;0.06&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt; &lt;span class="mf"&gt;0.50&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt;
&lt;span class="p"&gt;]&lt;/span&gt;
&lt;span class="n"&gt;N&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;10&lt;/span&gt;&lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="mi"&gt;1000&lt;/span&gt;
&lt;span class="n"&gt;num_simulations&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;100&lt;/span&gt;
&lt;span class="n"&gt;simulate_abtest_many&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;funnels&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;N&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;num_simulations&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;simulate_abtest_fixed&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/pre&gt;&lt;/div&gt;


&lt;p&gt;We set up our experiment such that funnel A has 5% conversion, while funnel B has 6% conversion, so funnel B is better. In the context of MABs, we'd like funnel B to be chosen more often than funnel A, but we'd also like to be able to call a winner at the end decisively (with statistical significance).&lt;/p&gt;
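To make "calling a winner with statistical significance" concrete, here is a two-proportion z-test in pure stdlib Python, substituted for illustration; the posts themselves use Fisher's exact test from scipy, and the counts below are made-up examples of 5% vs 6% conversion:

```python
import math

def z_test(conv_a, n_a, conv_b, n_b):
    # Two-proportion z-test on conversion counts (an approximation of
    # Fisher's exact test, good for large samples).
    p_a, p_b = conv_a / n_a, conv_b / n_b
    p_pool = (conv_a + conv_b) / (n_a + n_b)
    se = math.sqrt(p_pool * (1 - p_pool) * (1 / n_a + 1 / n_b))
    z = (p_b - p_a) / se
    # Two-sided p-value from the standard normal survival function.
    return 2 * (1 - 0.5 * (1 + math.erf(abs(z) / math.sqrt(2))))

p = z_test(250, 5000, 300, 5000)  # 5% vs 6% conversion, 5,000 users each
print('p = %.4f' % p)
```

At these sample sizes the difference is just significant at the 0.05 level, which matches the `ratio_won_statsign` column being well below 1 in the results: with N=10,000 total users, a 5% vs 6% difference is near the edge of detectability.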
&lt;p&gt;The function &lt;code&gt;simulate_abtest_many()&lt;/code&gt; is a helper function which calls the passed-in experimental function (in this case, &lt;code&gt;simulate_abtest_fixed()&lt;/code&gt;) exactly &lt;code&gt;num_simulations&lt;/code&gt; times, and collects statistics:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;how many times has the better funnel won&lt;/li&gt;
&lt;li&gt;in those cases, was the result significant&lt;/li&gt;
&lt;li&gt;what was the histogram of p values&lt;/li&gt;
&lt;li&gt;after the i-th user, what % of users up to that point were put into the better funnel B&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;The code for &lt;code&gt;simulate_abtest_many()&lt;/code&gt; is:&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;simulate_abtest_many&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;funnels&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;N&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;num_simulations&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;simulate_abtest_one&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;p_crit&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mf"&gt;0.05&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
    &lt;span class="n"&gt;ps&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[]&lt;/span&gt;
    &lt;span class="n"&gt;num_winning&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;
    &lt;span class="n"&gt;num_significant&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;
    &lt;span class="n"&gt;funnels_chosen_many&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;np&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;zeros&lt;/span&gt;&lt;span class="p"&gt;([&lt;/span&gt;&lt;span class="n"&gt;num_simulations&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;N&lt;/span&gt;&lt;span class="p"&gt;])&lt;/span&gt;
    &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;i&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="nb"&gt;range&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;num_simulations&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
        &lt;span class="k"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;Doing simulation run &lt;/span&gt;&lt;span class="si"&gt;%d&lt;/span&gt;&lt;span class="s1"&gt;...&amp;#39;&lt;/span&gt; &lt;span class="o"&gt;%&lt;/span&gt; &lt;span class="n"&gt;i&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;end&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;&lt;/span&gt;&lt;span class="se"&gt;\r&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
        &lt;span class="n"&gt;observations&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;funnels_chosen&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;simulate_abtest_one&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;funnels&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;N&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
        &lt;span class="n"&gt;funnels_chosen_many&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;i&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;funnels_chosen&lt;/span&gt;
        &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;observations&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;][&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;&lt;span class="o"&gt;/&lt;/span&gt;&lt;span class="nb"&gt;sum&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;observations&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;])&lt;/span&gt; &lt;span class="o"&gt;&amp;gt;&lt;/span&gt; &lt;span class="n"&gt;observations&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;][&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;&lt;span class="o"&gt;/&lt;/span&gt;&lt;span class="nb"&gt;sum&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;observations&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;]):&lt;/span&gt;
            &lt;span class="n"&gt;num_winning&lt;/span&gt; &lt;span class="o"&gt;+=&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt;
            &lt;span class="n"&gt;p&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;fisher_exact&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;observations&lt;/span&gt;&lt;span class="p"&gt;)[&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;
            &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;p&lt;/span&gt; &lt;span class="o"&gt;&amp;lt;&lt;/span&gt; &lt;span class="mf"&gt;0.05&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
                &lt;span class="n"&gt;num_significant&lt;/span&gt; &lt;span class="o"&gt;+=&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt;
            &lt;span class="n"&gt;ps&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;append&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;p&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="k"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;&lt;/span&gt;&lt;span class="se"&gt;\n&lt;/span&gt;&lt;span class="s1"&gt;Done!&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="k"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;Ratio better funnel won:  &lt;/span&gt;&lt;span class="si"&gt;%.3f&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;&lt;/span&gt; &lt;span class="o"&gt;%&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;num_winning&lt;/span&gt;&lt;span class="o"&gt;/&lt;/span&gt;&lt;span class="n"&gt;num_simulations&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;
    &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;num_winning&lt;/span&gt; &lt;span class="o"&gt;&amp;gt;&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
        &lt;span class="k"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;Ratio of wins stat.sign.: &lt;/span&gt;&lt;span class="si"&gt;%.3f&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;&lt;/span&gt; &lt;span class="o"&gt;%&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;num_significant&lt;/span&gt;&lt;span class="o"&gt;/&lt;/span&gt;&lt;span class="n"&gt;num_winning&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;
        &lt;span class="k"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;Histogram of p values:&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
        &lt;span class="n"&gt;plt&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;figure&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;figsize&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;10&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;5&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;
        &lt;span class="n"&gt;ax&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;sns&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;distplot&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;ps&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;kde&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="bp"&gt;False&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;rug&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="bp"&gt;True&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
        &lt;span class="n"&gt;ax&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;set&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;xlabel&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;p value&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;ylabel&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;count&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
        &lt;span class="n"&gt;plt&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;show&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
    &lt;span class="n"&gt;funnels_chosen_avg&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;funnels_chosen_many&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;mean&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="k"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;Avg cumulative ratio of users in better funnel:&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="n"&gt;plt&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;figure&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;figsize&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;10&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;5&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;
    &lt;span class="n"&gt;plt&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;xlabel&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;user sequence&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="n"&gt;plt&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;ylabel&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;avg cumulative ratio of users in better funnel&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="n"&gt;plt&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;plot&lt;/span&gt;&lt;span class="p"&gt;([&lt;/span&gt;&lt;span class="n"&gt;np&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;average&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;funnels_chosen_avg&lt;/span&gt;&lt;span class="p"&gt;[:&lt;/span&gt;&lt;span class="n"&gt;i&lt;/span&gt;&lt;span class="p"&gt;])&lt;/span&gt; &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;i&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="nb"&gt;range&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nb"&gt;len&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;funnels_chosen_avg&lt;/span&gt;&lt;span class="p"&gt;))])&lt;/span&gt;
&lt;/pre&gt;&lt;/div&gt;


&lt;p&gt;Running this with a fixed split A/B test:&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;Ratio better funnel won:  1.000
Ratio of wins stat.sign.: 0.590
&lt;/pre&gt;&lt;/div&gt;


&lt;p&gt;&lt;img src="/images/mab_fixed_cumu_winning.png" style="width: 600px;"/&gt;&lt;/p&gt;
&lt;p&gt;&lt;img src="/images/mab_fixed_ps.png" style="width: 600px;"/&gt;&lt;/p&gt;
&lt;p&gt;Outcomes:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;the better funnel wins 100% of the time&lt;/li&gt;
&lt;li&gt;the result is stat. sign. 59% of the time&lt;/li&gt;
&lt;li&gt;as expected, the split is 50/50 and doesn't change during the experiment&lt;/li&gt;
&lt;/ul&gt;
&lt;h2&gt;Epsilon-greedy&lt;/h2&gt;
&lt;p&gt;Epsilon-greedy is the simplest MAB algorithm. There is a fixed epsilon parameter, say 10%. For each incoming user, with 10% probability we randomly put the user into A or B, and with 90% probability we put them in the funnel that has performed better so far. So we explicitly control the trade-off between explore (10%) and exploit (90%). Implementation is straightforward:&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;reward&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;observation_vector&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
    &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="nb"&gt;sum&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;observation_vector&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;==&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
        &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;
    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;observation_vector&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="o"&gt;/&lt;/span&gt; &lt;span class="nb"&gt;sum&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;observation_vector&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;simulate_eps_greedy&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;funnels&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;N&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;eps&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mf"&gt;0.1&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
    &lt;span class="n"&gt;explore_traffic_split&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="o"&gt;/&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nb"&gt;len&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;funnels&lt;/span&gt;&lt;span class="p"&gt;))]&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt; &lt;span class="nb"&gt;len&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;funnels&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="n"&gt;observations&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;np&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;zeros&lt;/span&gt;&lt;span class="p"&gt;([&lt;/span&gt;&lt;span class="nb"&gt;len&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;funnels&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt; &lt;span class="nb"&gt;len&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;funnels&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;])])&lt;/span&gt;
    &lt;span class="n"&gt;rewards&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt; &lt;span class="nb"&gt;len&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;funnels&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="n"&gt;funnels_chosen&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[]&lt;/span&gt;
    &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;_&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="nb"&gt;range&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;N&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
        &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;random&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;random&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt; &lt;span class="o"&gt;&amp;lt;&lt;/span&gt; &lt;span class="n"&gt;eps&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
            &lt;span class="c1"&gt;# explore, choose one at random&lt;/span&gt;
            &lt;span class="n"&gt;which_funnel&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;choice&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;explore_traffic_split&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
        &lt;span class="k"&gt;else&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
            &lt;span class="c1"&gt;# exploit, choose the best one so far&lt;/span&gt;
            &lt;span class="n"&gt;which_funnel&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;np&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;argmax&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;rewards&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
        &lt;span class="n"&gt;funnels_chosen&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;append&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;which_funnel&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
        &lt;span class="n"&gt;funnel_outcome&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;choice&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;funnels&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;which_funnel&lt;/span&gt;&lt;span class="p"&gt;])&lt;/span&gt;
        &lt;span class="n"&gt;observations&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;which_funnel&lt;/span&gt;&lt;span class="p"&gt;][&lt;/span&gt;&lt;span class="n"&gt;funnel_outcome&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="o"&gt;+=&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt;
        &lt;span class="n"&gt;rewards&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;which_funnel&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;reward&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;observations&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;which_funnel&lt;/span&gt;&lt;span class="p"&gt;])&lt;/span&gt;
    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;observations&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;funnels_chosen&lt;/span&gt;
&lt;/pre&gt;&lt;/div&gt;


&lt;p&gt;Running the Epsilon-greedy experiment 100 times yields the following results:&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;Ratio better funnel won:  0.850
Ratio of wins stat.sign.: 0.376
&lt;/pre&gt;&lt;/div&gt;


&lt;p&gt;&lt;img src="/images/mab_epsg_cumu_winning.png" style="width: 600px;"/&gt;&lt;/p&gt;
&lt;p&gt;&lt;img src="/images/mab_epsg_ps.png" style="width: 600px;"/&gt;&lt;/p&gt;
&lt;p&gt;Outcomes:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;the better funnel wins 85% of the time, less than fixed A/B testing&lt;/li&gt;
&lt;li&gt;the result is stat. sign. 37% of the time, less than with fixed A/B testing&lt;/li&gt;
&lt;li&gt;on average, 70% of users are put in the better funnel, better than fixed A/B testing&lt;/li&gt;
&lt;/ul&gt;
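&lt;p&gt;A quick back-of-envelope check on the last point (my own arithmetic, not from the simulation): once the better funnel has been identified, Epsilon-greedy sends it the exploit share plus its share of the explore traffic. The observed 70% sits below this ceiling because early in the experiment the running estimate often points at the wrong funnel:&lt;/p&gt;

```python
# with epsilon = 0.1 and two funnels, the long-run traffic share of the
# better funnel, assuming it has been correctly identified, is the
# exploit share plus half of the explore share:
eps = 0.1
num_funnels = 2
asymptotic_better_share = (1 - eps) + eps / num_funnels  # approximately 0.95
```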
&lt;h2&gt;UCB1&lt;/h2&gt;
&lt;p&gt;UCB stands for Upper Confidence Bound. It's an algorithm whose regret grows only logarithmically with the number of actions taken. For more details, see these &lt;a href="https://www.cs.bham.ac.uk/internal/courses/robotics/lectures/ucb1.pdf"&gt;lecture slides&lt;/a&gt;. The key points are:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;for each funnel $j$, maintain the conversion rate $c_j$ and number of users $n_j$&lt;/li&gt;
&lt;li&gt;$n$ is the total number of users so far&lt;/li&gt;
&lt;li&gt;choose the funnel that maximises $c_j + \sqrt{2 \ln(n) / n_j}$&lt;/li&gt;
&lt;/ul&gt;
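&lt;p&gt;A quick numeric sketch of the score (the counts are made up) shows how the bonus term favours under-sampled funnels:&lt;/p&gt;

```python
import numpy as np

def ucb1_score(conversion_rate, n_funnel, n_total):
    # empirical conversion rate plus an exploration bonus that shrinks
    # as the funnel accumulates observations
    return conversion_rate + np.sqrt(2 * np.log(n_total) / n_funnel)

# made-up counts after 1,000 users: A has 900 observations, B only 100
score_a = ucb1_score(0.05, n_funnel=900, n_total=1000)  # ~0.17
score_b = ucb1_score(0.04, n_funnel=100, n_total=1000)  # ~0.41
# B scores higher despite its lower conversion rate, so UCB1 explores it
```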
&lt;p&gt;Without going into the details, UCB1 achieves a good trade-off between exploration and exploitation. Implementation:&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;ucb1_score&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;reward_funnel&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;n_funnel&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;n_total&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;reward_funnel&lt;/span&gt; &lt;span class="o"&gt;+&lt;/span&gt; &lt;span class="n"&gt;np&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;sqrt&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;2&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt; &lt;span class="n"&gt;np&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;log&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;n_total&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;/&lt;/span&gt; &lt;span class="n"&gt;n_funnel&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;simulate_ucb1&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;funnels&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;N&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
    &lt;span class="n"&gt;observations&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;np&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;zeros&lt;/span&gt;&lt;span class="p"&gt;([&lt;/span&gt;&lt;span class="nb"&gt;len&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;funnels&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt; &lt;span class="nb"&gt;len&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;funnels&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;])])&lt;/span&gt;
    &lt;span class="c1"&gt;# initially, set each score to a big number, so each funnel goes at least once&lt;/span&gt;
    &lt;span class="n"&gt;ucb1_scores&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;sys&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;maxsize&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt; &lt;span class="nb"&gt;len&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;funnels&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="n"&gt;funnels_chosen&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[]&lt;/span&gt;
    &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;n&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="nb"&gt;range&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;N&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
        &lt;span class="n"&gt;options&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;i&lt;/span&gt; &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;i&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;score&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="nb"&gt;enumerate&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;ucb1_scores&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;score&lt;/span&gt; &lt;span class="o"&gt;==&lt;/span&gt; &lt;span class="nb"&gt;max&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;ucb1_scores&lt;/span&gt;&lt;span class="p"&gt;)]&lt;/span&gt;
        &lt;span class="n"&gt;which_funnel&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;random&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;choice&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;options&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
        &lt;span class="n"&gt;funnels_chosen&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;append&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;which_funnel&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
        &lt;span class="n"&gt;funnel_outcome&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;choice&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;funnels&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;which_funnel&lt;/span&gt;&lt;span class="p"&gt;])&lt;/span&gt;
        &lt;span class="n"&gt;observations&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;which_funnel&lt;/span&gt;&lt;span class="p"&gt;][&lt;/span&gt;&lt;span class="n"&gt;funnel_outcome&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="o"&gt;+=&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt;
        &lt;span class="n"&gt;r&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;reward&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;observations&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;which_funnel&lt;/span&gt;&lt;span class="p"&gt;])&lt;/span&gt;
        &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;i&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="nb"&gt;range&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nb"&gt;len&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;funnels&lt;/span&gt;&lt;span class="p"&gt;)):&lt;/span&gt;
            &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="nb"&gt;sum&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;observations&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;i&lt;/span&gt;&lt;span class="p"&gt;])&lt;/span&gt; &lt;span class="o"&gt;&amp;gt;&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
                &lt;span class="n"&gt;r&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;reward&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;observations&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;i&lt;/span&gt;&lt;span class="p"&gt;])&lt;/span&gt;
                &lt;span class="n"&gt;ucb1_scores&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;i&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;ucb1_score&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;r&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nb"&gt;sum&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;observations&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;i&lt;/span&gt;&lt;span class="p"&gt;]),&lt;/span&gt; &lt;span class="n"&gt;n&lt;/span&gt; &lt;span class="o"&gt;+&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;observations&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;funnels_chosen&lt;/span&gt;
&lt;/pre&gt;&lt;/div&gt;


&lt;p&gt;Running the UCB1 experiment 100 times yields the following results:&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;Ratio better funnel won:  0.990
Ratio of wins stat.sign.: 0.606
&lt;/pre&gt;&lt;/div&gt;


&lt;p&gt;&lt;img src="/images/mab_ucb1_cumu_winning.png" style="width: 600px;"/&gt;&lt;/p&gt;
&lt;p&gt;&lt;img src="/images/mab_ucb1_ps.png" style="width: 600px;"/&gt;&lt;/p&gt;
&lt;p&gt;Outcomes:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;the better funnel wins 99% of the time, roughly the same as fixed A/B testing&lt;/li&gt;
&lt;li&gt;the result is stat. sign. 60% of the time, roughly the same as fixed A/B testing&lt;/li&gt;
&lt;li&gt;on average, 58% of users are put in the better funnel, somewhat better than fixed A/B testing&lt;/li&gt;
&lt;/ul&gt;
&lt;h2&gt;Thompson sampling&lt;/h2&gt;
&lt;p&gt;Thompson sampling is easy to understand if you understand how &lt;a href="http://bytepawn.com/bayesian-ab-conversion-tests.html"&gt;Bayesian A/B tests work and what the Beta() distribution is&lt;/a&gt;. The idea is simple:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;for each funnel $j$, maintain the number of converting users $C_j$ and number of users $n_j$&lt;/li&gt;
&lt;li&gt;for each funnel, at each incoming user, create a Beta distribution with parameters $C_j + 1$ and $n_j - C_j + 1$; the Beta distributions model the conversion % of the funnels&lt;/li&gt;
&lt;li&gt;draw a random number (a sampled conversion rate) from each Beta distribution, and put the user into the funnel with the highest sample&lt;/li&gt;
&lt;/ul&gt;
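&lt;p&gt;As a minimal sketch of a single Thompson draw (the counts are made up, and numpy's Beta sampler stands in for the scipy distribution objects used in the implementation below):&lt;/p&gt;

```python
import numpy as np

rng = np.random.default_rng(0)

# made-up counts: A converted 40 of 1,000 users, B converted 50 of 1,000
conversions = [40, 50]
totals = [1000, 1000]

# one Thompson draw: sample a conversion rate from each funnel's
# Beta(C_j + 1, n_j - C_j + 1) posterior...
samples = [rng.beta(c + 1, n - c + 1) for c, n in zip(conversions, totals)]
# ...and route the incoming user to the funnel with the highest sample
which_funnel = int(np.argmax(samples))
```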
&lt;p&gt;The implementation is straightforward:&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;beta_distributions&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;observations&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;beta&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;observations&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;i&lt;/span&gt;&lt;span class="p"&gt;][&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;&lt;span class="o"&gt;+&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;observations&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;i&lt;/span&gt;&lt;span class="p"&gt;][&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;&lt;span class="o"&gt;+&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;i&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="nb"&gt;range&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nb"&gt;len&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;observations&lt;/span&gt;&lt;span class="p"&gt;))]&lt;/span&gt;

&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;simulate_thompson&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;funnels&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;N&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
    &lt;span class="n"&gt;observations&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;np&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;zeros&lt;/span&gt;&lt;span class="p"&gt;([&lt;/span&gt;&lt;span class="nb"&gt;len&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;funnels&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt; &lt;span class="nb"&gt;len&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;funnels&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;])])&lt;/span&gt;
    &lt;span class="n"&gt;funnels_chosen&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[]&lt;/span&gt;
    &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;n&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="nb"&gt;range&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;N&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
        &lt;span class="n"&gt;betas&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;beta_distributions&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;observations&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
        &lt;span class="n"&gt;p_conv&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;beta&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;rvs&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;)[&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;beta&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;betas&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;
        &lt;span class="n"&gt;which_funnel&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;np&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;argmax&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;p_conv&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
        &lt;span class="n"&gt;funnels_chosen&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;append&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;which_funnel&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
        &lt;span class="n"&gt;funnel_outcome&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;choice&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;funnels&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;which_funnel&lt;/span&gt;&lt;span class="p"&gt;])&lt;/span&gt;
        &lt;span class="n"&gt;observations&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;which_funnel&lt;/span&gt;&lt;span class="p"&gt;][&lt;/span&gt;&lt;span class="n"&gt;funnel_outcome&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="o"&gt;+=&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt;
    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;observations&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;funnels_chosen&lt;/span&gt;
&lt;/pre&gt;&lt;/div&gt;


&lt;p&gt;Running the Thompson experiment 100 times yields the following results:&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;Ratio better funnel won:  0.960
Ratio of wins stat.sign.: 0.396
&lt;/pre&gt;&lt;/div&gt;


&lt;p&gt;&lt;img src="/images/mab_thompson_cumu_winning.png" style="width: 600px;"/&gt;&lt;/p&gt;
&lt;p&gt;&lt;img src="/images/mab_thompson_ps.png" style="width: 600px;"/&gt;&lt;/p&gt;
&lt;p&gt;Outcomes:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;the better funnel wins 96% of the time, roughly the same as fixed A/B testing&lt;/li&gt;
&lt;li&gt;the result is stat. sign. 40% of the time, less than fixed A/B testing&lt;/li&gt;
&lt;li&gt;on average, 80% of users are put in the better funnel, better than fixed A/B testing&lt;/li&gt;
&lt;/ul&gt;
&lt;h2&gt;Conclusion&lt;/h2&gt;
&lt;p&gt;There is no free lunch. Statistical significance is governed by the funnel with the fewest samples, so diverting traffic away from an even split yields lower significance. Different multi-armed bandit algorithms strike different trade-offs in the space of:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;exploration (statistical significance)&lt;/li&gt;
&lt;li&gt;exploitation&lt;/li&gt;
&lt;/ul&gt;
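&lt;p&gt;The first point can be made concrete with the standard error of a two-proportion comparison (a sketch with made-up numbers): at a fixed total sample size, the standard error is smallest at an even split and grows as the split becomes lopsided, because the smaller arm dominates the uncertainty.&lt;/p&gt;

```python
import numpy as np

def standard_error(p, n_a, n_b):
    # standard error of the difference of two proportions, assuming
    # both groups have true conversion rate p
    return np.sqrt(p * (1 - p) * (1 / n_a + 1 / n_b))

N, p = 10_000, 0.05
se_even = standard_error(p, N // 2, N // 2)        # 50/50 split
se_skew = standard_error(p, 9 * N // 10, N // 10)  # 90/10 split
# se_skew is larger: diverting traffic to the winner costs significance
```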
&lt;p&gt;Another aspect to consider is that Multi-armed bandits are harder to debug and understand in production. With fixed A/B testing, at any time it's possible to compare counts from exposure logs to the experiment's configuration to see if metrics are as expected. With MAB this is more involved, because we have to compare exposure log and conversion counts to the expected MAB algorithm's behaviour given the current performance. The upside is less regret (higher overall conversion).&lt;/p&gt;</content><category term="ab-testing"></category></entry><entry><title>Understanding Facebook’s Planout A/B testing framework</title><link href="https://bytepawn.com/understanding-facebook-planout-ab-testing-framework.html" rel="alternate"></link><published>2020-05-22T00:00:00+02:00</published><updated>2020-05-22T00:00:00+02:00</updated><author><name>Marton Trencseni</name></author><id>tag:bytepawn.com,2020-05-22:/understanding-facebook-planout-ab-testing-framework.html</id><summary type="html">&lt;p&gt;PlanOut is a framework for online field experiments. It was created by Facebook in 2014 to make it easy to run and iterate on sophisticated experiments in a statistically sound manner.&lt;br/&gt;&lt;br/&gt; &lt;img src="/images/blue_dice.png" alt="Planout" style="width: 300px;"/&gt;&lt;/p&gt;</summary><content type="html">&lt;h2&gt;Introduction&lt;/h2&gt;
&lt;p&gt;In previous A/B testing posts I focused on the mathematical aspects. This time, I will explain &lt;a href="https://facebook.github.io/planout/index.html"&gt;Facebook’s Planout&lt;/a&gt; product for A/B testing. Planout can be used to declare and configure A/B tests and to assign users into buckets (A, B, etc.) in production. Planout was released in 2014, its main author is &lt;a href="https://eytan.github.io/"&gt;Eytan Bakshy&lt;/a&gt;. This post is based on a &lt;a href="https://www.youtube.com/watch?v=Ayd4sqPH2DE"&gt;PyData talk Eytan gave in 2014.&lt;/a&gt;&lt;/p&gt;
&lt;h2&gt;Unit of A/B testing&lt;/h2&gt;
&lt;p&gt;Let’s pretend we’re running A/B tests on our website, and we use Python/Django. Using Planout, we subclass &lt;code&gt;SimpleExperiment&lt;/code&gt; and define the function &lt;code&gt;assign()&lt;/code&gt;, like:&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="nn"&gt;planout.experiment&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;SimpleExperiment&lt;/span&gt;
&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="nn"&gt;planout.ops.random&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt;

&lt;span class="k"&gt;class&lt;/span&gt; &lt;span class="nc"&gt;MyExperiment&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;SimpleExperiment&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
    &lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;assign&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="bp"&gt;self&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;params&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;user_id&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
        &lt;span class="n"&gt;params&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;button_size&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;UniformChoice&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;choices&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="mi"&gt;100&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;120&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt; &lt;span class="n"&gt;unit&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;user_id&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/pre&gt;&lt;/div&gt;


&lt;p&gt;In case A the button size is 100; in case B it is 120. We want to see whether B lifts the click-through rate (CTR) of the button.&lt;/p&gt;
&lt;p&gt;Once the class is declared, we can create an instance by passing in the unit (in this case, &lt;code&gt;user_id&lt;/code&gt;), and then we can retrieve which experimental bucket the user is in by retrieving the param &lt;code&gt;button_size&lt;/code&gt;, like:&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;i&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="nb"&gt;range&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;10&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
    &lt;span class="n"&gt;e&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;MyExperiment&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;user_id&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;i&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="k"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;i&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;e&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;get&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;button_size&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;
&lt;/pre&gt;&lt;/div&gt;


&lt;p&gt;It will print:&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;0 120
1 120
2 100
3 120
4 120
5 120
6 100
7 120
8 100
9 100
&lt;/pre&gt;&lt;/div&gt;


&lt;h2&gt;Exposure logging and power&lt;/h2&gt;
&lt;p&gt;Continuing the above example, a file &lt;code&gt;MyExperiment.log&lt;/code&gt; will be generated, which looks like:&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;{&amp;quot;inputs&amp;quot;: {&amp;quot;user_id&amp;quot;: 0}, &amp;quot;name&amp;quot;: &amp;quot;MyExperiment&amp;quot;, &amp;quot;params&amp;quot;: {&amp;quot;button_size&amp;quot;: 120}, &amp;quot;time&amp;quot;: 1590153690, &amp;quot;salt&amp;quot;: &amp;quot;MyExperiment&amp;quot;, &amp;quot;event&amp;quot;: &amp;quot;exposure&amp;quot;}
{&amp;quot;inputs&amp;quot;: {&amp;quot;user_id&amp;quot;: 1}, &amp;quot;name&amp;quot;: &amp;quot;MyExperiment&amp;quot;, &amp;quot;params&amp;quot;: {&amp;quot;button_size&amp;quot;: 120}, &amp;quot;time&amp;quot;: 1590153690, &amp;quot;salt&amp;quot;: &amp;quot;MyExperiment&amp;quot;, &amp;quot;event&amp;quot;: &amp;quot;exposure&amp;quot;}
{&amp;quot;inputs&amp;quot;: {&amp;quot;user_id&amp;quot;: 2}, &amp;quot;name&amp;quot;: &amp;quot;MyExperiment&amp;quot;, &amp;quot;params&amp;quot;: {&amp;quot;button_size&amp;quot;: 100}, &amp;quot;time&amp;quot;: 1590153690, &amp;quot;salt&amp;quot;: &amp;quot;MyExperiment&amp;quot;, &amp;quot;event&amp;quot;: &amp;quot;exposure&amp;quot;}
{&amp;quot;inputs&amp;quot;: {&amp;quot;user_id&amp;quot;: 3}, &amp;quot;name&amp;quot;: &amp;quot;MyExperiment&amp;quot;, &amp;quot;params&amp;quot;: {&amp;quot;button_size&amp;quot;: 120}, &amp;quot;time&amp;quot;: 1590153690, &amp;quot;salt&amp;quot;: &amp;quot;MyExperiment&amp;quot;, &amp;quot;event&amp;quot;: &amp;quot;exposure&amp;quot;}
{&amp;quot;inputs&amp;quot;: {&amp;quot;user_id&amp;quot;: 4}, &amp;quot;name&amp;quot;: &amp;quot;MyExperiment&amp;quot;, &amp;quot;params&amp;quot;: {&amp;quot;button_size&amp;quot;: 120}, &amp;quot;time&amp;quot;: 1590153690, &amp;quot;salt&amp;quot;: &amp;quot;MyExperiment&amp;quot;, &amp;quot;event&amp;quot;: &amp;quot;exposure&amp;quot;}
{&amp;quot;inputs&amp;quot;: {&amp;quot;user_id&amp;quot;: 5}, &amp;quot;name&amp;quot;: &amp;quot;MyExperiment&amp;quot;, &amp;quot;params&amp;quot;: {&amp;quot;button_size&amp;quot;: 120}, &amp;quot;time&amp;quot;: 1590153690, &amp;quot;salt&amp;quot;: &amp;quot;MyExperiment&amp;quot;, &amp;quot;event&amp;quot;: &amp;quot;exposure&amp;quot;}
{&amp;quot;inputs&amp;quot;: {&amp;quot;user_id&amp;quot;: 6}, &amp;quot;name&amp;quot;: &amp;quot;MyExperiment&amp;quot;, &amp;quot;params&amp;quot;: {&amp;quot;button_size&amp;quot;: 100}, &amp;quot;time&amp;quot;: 1590153690, &amp;quot;salt&amp;quot;: &amp;quot;MyExperiment&amp;quot;, &amp;quot;event&amp;quot;: &amp;quot;exposure&amp;quot;}
{&amp;quot;inputs&amp;quot;: {&amp;quot;user_id&amp;quot;: 7}, &amp;quot;name&amp;quot;: &amp;quot;MyExperiment&amp;quot;, &amp;quot;params&amp;quot;: {&amp;quot;button_size&amp;quot;: 120}, &amp;quot;time&amp;quot;: 1590153690, &amp;quot;salt&amp;quot;: &amp;quot;MyExperiment&amp;quot;, &amp;quot;event&amp;quot;: &amp;quot;exposure&amp;quot;}
{&amp;quot;inputs&amp;quot;: {&amp;quot;user_id&amp;quot;: 8}, &amp;quot;name&amp;quot;: &amp;quot;MyExperiment&amp;quot;, &amp;quot;params&amp;quot;: {&amp;quot;button_size&amp;quot;: 100}, &amp;quot;time&amp;quot;: 1590153690, &amp;quot;salt&amp;quot;: &amp;quot;MyExperiment&amp;quot;, &amp;quot;event&amp;quot;: &amp;quot;exposure&amp;quot;}
{&amp;quot;inputs&amp;quot;: {&amp;quot;user_id&amp;quot;: 9}, &amp;quot;name&amp;quot;: &amp;quot;MyExperiment&amp;quot;, &amp;quot;params&amp;quot;: {&amp;quot;button_size&amp;quot;: 100}, &amp;quot;time&amp;quot;: 1590153690, &amp;quot;salt&amp;quot;: &amp;quot;MyExperiment&amp;quot;, &amp;quot;event&amp;quot;: &amp;quot;exposure&amp;quot;}
&lt;/pre&gt;&lt;/div&gt;


&lt;p&gt;This is called the &lt;strong&gt;exposure log&lt;/strong&gt; of the experiment. In a production environment this would be funneled into an event stream processing system and eventually stored in a data warehouse, where the experimental results can be evaluated. &lt;em&gt;Note that Planout only deals with running the experiment; it does not deal with evaluating the results, such as hypothesis testing.&lt;/em&gt;&lt;/p&gt;
&lt;p&gt;It is important that the exposure logs are only emitted when the experimental parameter &lt;code&gt;button_size&lt;/code&gt; is retrieved, not when the experiment object is created in the &lt;code&gt;e = MyExperiment(user_id=i)&lt;/code&gt; line. This matters for statistical power. When we compare the CTR for A and B, it's clear how many users in A and B clicked through; let's say it's 990 and 1051. But what was the sample size for A and B? It's important to only count users who were actually &lt;strong&gt;exposed&lt;/strong&gt; to the experiment, in this case who accessed this dialog box. Why? Because a difference of 1051-990=61 is much more significant if 5,000 users were exposed to the experiment than if 500,000 were.&lt;/p&gt;
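&lt;p&gt;To make the power argument concrete, here is a minimal two-proportion z-test sketch (plain Python, standard library only; the helper name is ours, not part of Planout) comparing the same 990 vs 1051 click difference at the two exposure counts, treating each count as the per-funnel sample size:&lt;/p&gt;

```python
from math import sqrt, erfc

def two_proportion_p(clicks_a, clicks_b, n_per_funnel):
    # pooled two-proportion z-test; two-tailed p-value from the normal tail
    p_a, p_b = clicks_a / n_per_funnel, clicks_b / n_per_funnel
    pooled = (clicks_a + clicks_b) / (2 * n_per_funnel)
    se = sqrt(2 * pooled * (1 - pooled) / n_per_funnel)
    z = abs(p_a - p_b) / se
    return erfc(z / sqrt(2))

# the same 61-click difference, at two very different exposure counts
p_small = two_proportion_p(990, 1051, 5000)
p_large = two_proportion_p(990, 1051, 500000)
assert p_small < p_large  # fewer exposed users: stronger evidence
```

&lt;p&gt;Counting unexposed users in the denominator inflates the sample size and dilutes the measured effect, which is exactly what late exposure logging avoids.&lt;/p&gt;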
&lt;p&gt;The best way to achieve this programmatically is to emit the &lt;strong&gt;exposure logs&lt;/strong&gt; as late as possible, when the parameter (&lt;code&gt;button_size&lt;/code&gt;) is retrieved. There's no guarantee this is enough: a novice programmer could still write code which retrieves these parameters and saves them in a database for later use. But it's the best an A/B testing framework can do.&lt;/p&gt;
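&lt;p&gt;The pattern is easy to sketch outside of Planout (the class and field names below are illustrative, not Planout's actual implementation): assignment can happen at construction time, but the exposure record is only emitted on the first parameter retrieval:&lt;/p&gt;

```python
import json
import time

class LazyExposureExperiment:
    # illustrative sketch: exposure is logged on the first .get() call,
    # not when the experiment object is constructed
    def __init__(self, name, user_id, params, log):
        self.name, self.user_id, self.params = name, user_id, params
        self.log = log  # a list standing in for an event stream
        self.exposed = False

    def get(self, param):
        if not self.exposed:
            self.exposed = True
            self.log.append(json.dumps({
                'inputs': {'user_id': self.user_id}, 'name': self.name,
                'time': int(time.time()), 'event': 'exposure'}))
        return self.params[param]

log = []
e = LazyExposureExperiment('MyExperiment', 0, {'button_size': 120}, log)
assert log == []             # construction alone logs nothing
size = e.get('button_size')  # first retrieval emits the exposure event
assert size == 120 and len(log) == 1
```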
&lt;h2&gt;Pseudo-random but deterministic through hashing&lt;/h2&gt;
&lt;p&gt;If you run the above code on your computer, you will notice that you get the same results! In other words, if you create a class called &lt;code&gt;MyExperiment&lt;/code&gt; like above, and pass in &lt;code&gt;user_id=0&lt;/code&gt;, you will also get &lt;code&gt;button_size=120&lt;/code&gt;. Also, if you re-run the code, you will always get the same results.&lt;/p&gt;
&lt;p&gt;This is because Planout is deterministic. It doesn't actually use a random number generator to decide whether to put a user into bucket A or B. Instead, it uses hashing: it takes the salt of the experiment (by default, the salt is the name of the class, &lt;code&gt;MyExperiment&lt;/code&gt;), combines it with the parameter name (&lt;code&gt;button_size&lt;/code&gt;) and the unit value (the unit of the experiment is the &lt;code&gt;user_id&lt;/code&gt;, eg. &lt;code&gt;0&lt;/code&gt;), and computes a SHA1 hash, eg. &lt;code&gt;SHA1('MyExperiment.button_size.0')&lt;/code&gt;. Since the experiment is set up to be a uniform choice between 2 cases, the result of the &lt;code&gt;SHA1()&lt;/code&gt; hash is taken &lt;code&gt;mod 2&lt;/code&gt;.&lt;/p&gt;
&lt;p&gt;Determinism is important for a number of reasons:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;if a user comes back a 2nd time, we want her to be in the same experimental bucket (A or B) as the 1st time to keep the experiment consistent&lt;/li&gt;
&lt;li&gt;as long as we know which users got exposed in the experiment (the &lt;code&gt;user_id&lt;/code&gt;), we can re-construct which bucket they were in, even if logs are lost; in other words, in the case above, even if the logs did not contain the assigned &lt;code&gt;button_size&lt;/code&gt;, we could re-compute it&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;&lt;a href="https://facebook.github.io/planout/docs/how-planout-works.html"&gt;See the documentation for more.&lt;/a&gt;&lt;/p&gt;
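&lt;p&gt;The scheme can be sketched in a few lines (a simplified illustration of the idea; Planout's actual implementation differs in details, such as how it truncates the hash):&lt;/p&gt;

```python
import hashlib

def deterministic_choice(experiment_salt, param_name, unit, choices):
    # hash "salt.param.unit" and map the digest onto the list of choices;
    # the same inputs always produce the same choice, no RNG involved
    key = '%s.%s.%s' % (experiment_salt, param_name, unit)
    digest = hashlib.sha1(key.encode('utf-8')).hexdigest()
    return choices[int(digest[:15], 16) % len(choices)]

a = deterministic_choice('MyExperiment', 'button_size', 0, [100, 120])
b = deterministic_choice('MyExperiment', 'button_size', 0, [100, 120])
assert a == b and a in (100, 120)  # deterministic across runs and machines
```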
&lt;h2&gt;Changing salts&lt;/h2&gt;
&lt;p&gt;We can override the experiment name and salt, so it's not the (default) class name:&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;span class="k"&gt;class&lt;/span&gt; &lt;span class="nc"&gt;MyExperiment&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;SimpleExperiment&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
  &lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;setup&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="bp"&gt;self&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
    &lt;span class="bp"&gt;self&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;name&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="s1"&gt;&amp;#39;My awesome experiment&amp;#39;&lt;/span&gt;
    &lt;span class="bp"&gt;self&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;salt&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="s1"&gt;&amp;#39;91551c2a7e9117429e43c59ec1e4e8035c19ae15&amp;#39;&lt;/span&gt;
    &lt;span class="c1"&gt;# salt is the result of:&lt;/span&gt;
    &lt;span class="c1"&gt;# date | awk &amp;#39;{print $0&amp;quot;My awesome experiment&amp;quot;}&amp;#39; | shasum&lt;/span&gt;
&lt;/pre&gt;&lt;/div&gt;


&lt;p&gt;This is useful for a number of reasons:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;in a large organization, different users of the A/B testing system could accidentally use the same names in their experiments (eg. &lt;code&gt;MyExperiment&lt;/code&gt; and &lt;code&gt;button_size&lt;/code&gt;); by explicitly setting the salt, experimental results will never get mixed up&lt;/li&gt;
&lt;li&gt;if the class is later renamed to a more descriptive name like &lt;code&gt;BigButtonExperiment&lt;/code&gt; during refactoring, the assignments don't change; users will continue to get hashed into the same buckets. If the salt is not set explicitly, renaming the class or the parameter will change the hashing!&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;The salt can also be explicitly set for the parameters, like:&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;span class="n"&gt;params&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;button_size&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;UniformChoice&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;choices&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="mi"&gt;100&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;120&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt; &lt;span class="n"&gt;unit&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;user_id&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;salt&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;27830a83e56b62d9f7cc03868a80f3a67cb69201&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/pre&gt;&lt;/div&gt;


&lt;p&gt;In a sophisticated environment, the salts can be set automatically the first time the code is checked into source control.&lt;/p&gt;
&lt;h2&gt;Multiple parameters&lt;/h2&gt;
&lt;p&gt;Suppose we want to also experiment with the color of the button:&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;span class="k"&gt;class&lt;/span&gt; &lt;span class="nc"&gt;MyExperiment&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;SimpleExperiment&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
    &lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;assign&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="bp"&gt;self&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;params&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;user_id&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
        &lt;span class="n"&gt;params&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;button_size&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;UniformChoice&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;choices&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="mi"&gt;100&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;120&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt; &lt;span class="n"&gt;unit&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;user_id&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
        &lt;span class="n"&gt;params&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;button_color&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;UniformChoice&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;choices&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;blue&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s1"&gt;&amp;#39;green&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt; &lt;span class="n"&gt;unit&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;user_id&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/pre&gt;&lt;/div&gt;


&lt;p&gt;If we do it like this, it's an A/B/C/D test, because we will get all 2x2 combinations of sizes and colors. But what if we just want an A/B test, with the combinations &lt;code&gt;(button_size, button_color)&lt;/code&gt; = &lt;code&gt;(100, blue)&lt;/code&gt; and &lt;code&gt;(120, green)&lt;/code&gt;? This can be accomplished by setting the parameter-level salts to be the same:&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;span class="k"&gt;class&lt;/span&gt; &lt;span class="nc"&gt;MyExperiment&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;SimpleExperiment&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
    &lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;assign&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="bp"&gt;self&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;params&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;user_id&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
        &lt;span class="n"&gt;params&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;button_size&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;UniformChoice&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;choices&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="mi"&gt;100&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;120&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt; &lt;span class="n"&gt;unit&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;user_id&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;salt&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;x&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
        &lt;span class="n"&gt;params&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;button_color&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;UniformChoice&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;choices&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;blue&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s1"&gt;&amp;#39;green&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt; &lt;span class="n"&gt;unit&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;user_id&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;salt&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;x&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/pre&gt;&lt;/div&gt;
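&lt;p&gt;The effect of a shared salt can be verified with a simplified hashing sketch (illustrative only, not Planout's exact code): when two parameters share the same salt and unit, they hash the same string, so their choice indices always coincide and only the paired combinations occur:&lt;/p&gt;

```python
import hashlib

def choice_index(salt, unit, n_choices):
    # simplified parameter-level salted hash
    digest = hashlib.sha1(('%s.%s' % (salt, unit)).encode('utf-8')).hexdigest()
    return int(digest[:15], 16) % n_choices

sizes, colors = [100, 120], ['blue', 'green']
for user_id in range(100):
    i = choice_index('x', user_id, len(sizes))   # index for button_size
    j = choice_index('x', user_id, len(colors))  # index for button_color
    # identical salt and unit => identical index, so only the paired
    # combinations (100, 'blue') and (120, 'green') ever occur
    assert (sizes[i], colors[j]) in [(100, 'blue'), (120, 'green')]
```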


&lt;h2&gt;Chaining&lt;/h2&gt;
&lt;p&gt;Suppose we have a baseline recommendation engine &lt;code&gt;v100&lt;/code&gt;, and we are experimenting with a new version, but we're not sure how to tune the new engine. For each user, we want to pick a tuned new engine (&lt;code&gt;v200...v202&lt;/code&gt;), and in each session, we want to use either the baseline or the new engine (but always the same new engine for the same user).&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;span class="k"&gt;class&lt;/span&gt; &lt;span class="nc"&gt;RecommendationEngineExperiment&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;SimpleExperiment&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
    &lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;assign&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="bp"&gt;self&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;params&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;user_id&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;session_id&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
        &lt;span class="n"&gt;params&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;new_model&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;UniformChoice&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;choices&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;v200&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s1"&gt;&amp;#39;v201&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s1"&gt;&amp;#39;v202&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt; &lt;span class="n"&gt;unit&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;user_id&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
        &lt;span class="n"&gt;params&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;session_model&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;UniformChoice&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;choices&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;v100&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;params&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;new_model&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt; &lt;span class="n"&gt;unit&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;user_id&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;session_id&lt;/span&gt;&lt;span class="p"&gt;])&lt;/span&gt;

&lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;i&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="nb"&gt;range&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;10&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
    &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;j&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="nb"&gt;range&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;10&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
        &lt;span class="n"&gt;e&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;RecommendationEngineExperiment&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;user_id&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;i&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;session_id&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;j&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
        &lt;span class="k"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;i&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;e&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;get&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;session_model&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;
&lt;/pre&gt;&lt;/div&gt;


&lt;p&gt;What we accomplish here is that in each session, the user either gets &lt;code&gt;v100&lt;/code&gt; or one of the new ones, but for a user, the new one never changes, eg. &lt;code&gt;user_id=0&lt;/code&gt; either gets &lt;code&gt;v100&lt;/code&gt; or &lt;code&gt;v202&lt;/code&gt;.&lt;/p&gt;
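&lt;p&gt;The chaining guarantee can be checked with a simplified salted-hash sketch (illustrative only, not Planout's exact code): the new-engine choice hashes on &lt;code&gt;user_id&lt;/code&gt; alone, while the session choice hashes on &lt;code&gt;(user_id, session_id)&lt;/code&gt;, so each user only ever sees the baseline or their single assigned new engine:&lt;/p&gt;

```python
import hashlib

def choose(salt, units, choices):
    # simplified salted choice over one or more units
    key = salt + '.' + '.'.join(str(u) for u in units)
    digest = hashlib.sha1(key.encode('utf-8')).hexdigest()
    return choices[int(digest[:15], 16) % len(choices)]

for user_id in range(10):
    new_model = choose('new_model', [user_id], ['v200', 'v201', 'v202'])
    seen = {choose('session_model', [user_id, s], ['v100', new_model])
            for s in range(50)}
    # across 50 sessions, this user sees at most v100 and one new model
    assert seen <= {'v100', new_model}
```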
&lt;h2&gt;Conclusion&lt;/h2&gt;
&lt;p&gt;Planout's design principles are solid and still apply today, so it's good practice to either understand Planout when designing an A/B testing framework, or just use Planout as-is.&lt;/p&gt;</content><category term="ab-testing"></category></entry><entry><title>Validation checks for A/B tests</title><link href="https://bytepawn.com/validation-checks-for-ab-tests.html" rel="alternate"></link><published>2020-04-16T00:00:00+02:00</published><updated>2020-04-16T00:00:00+02:00</updated><author><name>Marton Trencseni</name></author><id>tag:bytepawn.com,2020-04-16:/validation-checks-for-ab-tests.html</id><summary type="html">&lt;p&gt;A/B tests go wrong all the time, even in sophisticated product teams. As this article shows, for a range of problems we can run automated validation checks to catch problems early, before they have too bad of an effect on customers or the business. These validation checks compare various statistical properties of the funnels A and B to catch likely problems. Large technology companies are running such validation checks automatically and continuously for their online experiments.&lt;br/&gt;&lt;br/&gt; &lt;img src="/images/k-s.png" alt="Kolmogorov-Smirnov test" style="width: 400px;"/&gt;&lt;/p&gt;</summary><content type="html">&lt;h2&gt;Introduction&lt;/h2&gt;
&lt;p&gt;&lt;em&gt;"Anything that can go wrong will go wrong"&lt;/em&gt;, according to &lt;a href="https://en.wikipedia.org/wiki/Murphy%27s_law"&gt;Murphy’s law&lt;/a&gt;. A/B testing is no different. Anybody who has run a lot of A/B tests, over several years, has seen multiple failure modes. Some errors I've seen (assuming “B” is the new experimental funnel in the A/B test):&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;&lt;strong&gt;misconfigured test&lt;/strong&gt;: we want to do a 20%-80% split, but we accidentally configure 80%-20% split instead&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;accidentally still running&lt;/strong&gt;: we conclude the test, and want to remove A or B from production, but accidentally leave the test on with the original traffic split&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;randomization bias&lt;/strong&gt;: users are not assigned into A and B randomly&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;logging problem&lt;/strong&gt;: no/less/faulty logs coming from B&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;buggy product&lt;/strong&gt;: a software bug in B causes users to drop out&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;Let’s look at how we can automatically catch some of these problems. Note that in all of the tests below, we are doing a &lt;a href="https://en.wikipedia.org/wiki/One-_and_two-tailed_tests"&gt;two-tailed test&lt;/a&gt;: we want to catch deviations in either direction.&lt;/p&gt;
&lt;p&gt;&lt;a href="https://github.com/mtrencseni/playground/blob/master/Validation%20checks%20for%20AB%20tests.ipynb"&gt;The code shown below is up on Github.&lt;/a&gt;&lt;/p&gt;
&lt;h2&gt;Problem: misconfigured test, missing exposure logs&lt;/h2&gt;
&lt;p&gt;This is the easiest to catch, assuming we have an independent validation system that is correctly configured. If that is the case, we can simply write code that checks whether the configured traffic splits match between production and validation. However, this is often not possible, because eg. the production configuration is hardcoded into Python or Java source code files.&lt;/p&gt;
&lt;p&gt;In this case, we can perform validation on the exposure logs. An exposure log is simply a log entry generated when a user is assigned to funnel A or B. We can do a check using the one-way &lt;a href="https://en.wikipedia.org/wiki/Chi-squared_test"&gt;$\chi^2$ test&lt;/a&gt;: if we expect an 80%-20% split between A and B (where the 80-20 split is typed a second time, in the validation check), we can check how likely it is that the exposure log counts for A and B are coming from that distribution. If there is a misconfiguration, we will get a very low p-value, and can alert on it:&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="nn"&gt;scipy.stats&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;chisquare&lt;/span&gt;

&lt;span class="n"&gt;p_crit&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mf"&gt;0.0001&lt;/span&gt;
&lt;span class="n"&gt;N&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mi"&gt;10&lt;/span&gt;&lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="mi"&gt;1000&lt;/span&gt;
&lt;span class="n"&gt;traffic_split&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="mf"&gt;0.8&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mf"&gt;0.2&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;
&lt;span class="n"&gt;user_counts&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="mi"&gt;1987&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;8013&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="c1"&gt;# bad&lt;/span&gt;
&lt;span class="c1"&gt;#user_counts = [8013, 1987] # good&lt;/span&gt;
&lt;span class="c1"&gt;# simulates a case where we accidentally switched A and B&lt;/span&gt;
&lt;span class="c1"&gt;# so the test would return a very low ~0 p value, indicating&lt;/span&gt;
&lt;span class="c1"&gt;# that it&amp;#39;s very unlikely that the observed counts are coming from&lt;/span&gt;
&lt;span class="c1"&gt;# the indicated traffic_split&lt;/span&gt;
&lt;span class="n"&gt;p&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;chisquare&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;user_counts&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;f_exp&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;N&lt;/span&gt;&lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="n"&gt;s&lt;/span&gt; &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;s&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;traffic_split&lt;/span&gt;&lt;span class="p"&gt;])[&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;
&lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;p&lt;/span&gt; &lt;span class="o"&gt;&amp;lt;&lt;/span&gt; &lt;span class="n"&gt;p_crit&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
    &lt;span class="k"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;Low p value (&lt;/span&gt;&lt;span class="si"&gt;%f&lt;/span&gt;&lt;span class="s1"&gt;). Probably badly configured test, or bad logs!&amp;#39;&lt;/span&gt; &lt;span class="o"&gt;%&lt;/span&gt; &lt;span class="n"&gt;p&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="k"&gt;else&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
    &lt;span class="k"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;Everything seems good.&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/pre&gt;&lt;/div&gt;


&lt;p&gt;For automation, use a very small critical p-value like 0.001 or 0.0001: we're not interested in statistical fluctuations, we want to catch misconfiguration, which will, even at moderate sample sizes, yield a very small p-value, close to 0.&lt;/p&gt;
&lt;p&gt;Note that here we're not doing a significance test on the outcome of the A/B test. We're just making sure the split is what we think it is. In the example above, we assumed that out of 10,000 impressions, 1,987 were in the A funnel, even though it's configured to get 80% of the traffic. The $\chi^2$ test then tells us this is an extremely unlikely outcome. Note that the above $\chi^2$ test can also be run for experiments with more than 2 funnels.&lt;/p&gt;
&lt;p&gt;&lt;a href="http://bytepawn.com/ab-testing-and-the-chi-squared-test.html"&gt;See my earlier post on the $\chi^2$ test.&lt;/a&gt;&lt;/p&gt;
&lt;h2&gt;Problem: randomization bias&lt;/h2&gt;
&lt;p&gt;Sometimes the software engineer who implements some of the A/B testing code isn't math-savvy and makes a simple mistake. For example, they may think that &lt;code&gt;user_id % 2 == 0&lt;/code&gt; is a good enough way to do a 50%-50% split.&lt;/p&gt;
&lt;p&gt;Note: Another related fallacy is to use and re-use fixed buckets (10 buckets, &lt;code&gt;user_id % 10&lt;/code&gt;), and use different buckets for different A/B tests. This is a problem for a number of reasons: it unnecessarily lowers the sample size, &lt;a href="http://bytepawn.com/running-multiple-ab-tests-in-parallel.html"&gt;see this previous post on parallel testing&lt;/a&gt;. Also, this introduces bias, because we keep reusing the same groups of users, so if a previous A/B test influenced users in the first bucket in a certain way, we may still be measuring that effect in later, unrelated A/B tests.&lt;/p&gt;
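&lt;p&gt;A quick illustration of the reuse problem (plain Python sketch; the helper name is ours): two experiments that both bucket on &lt;code&gt;user_id % 2&lt;/code&gt; assign every user identically, so any effect of the first experiment contaminates the second, whereas per-experiment salted hashing makes the assignments effectively independent:&lt;/p&gt;

```python
import hashlib

def hash_bucket(salt, user_id):
    # per-experiment salt decorrelates assignments across experiments
    h = hashlib.sha1(('%s.%s' % (salt, user_id)).encode('utf-8')).hexdigest()
    return int(h[:15], 16) % 2

users = range(10000)
# naive modulo bucketing: a second experiment reusing the same rule
# agrees with the first on every single user
exp1 = [u % 2 for u in users]
exp2 = [u % 2 for u in users]
agree_mod = sum(a == b for a, b in zip(exp1, exp2))
# salted hashing: agreement is ~50%, i.e. the two splits are independent
agree_hash = sum(hash_bucket('exp1', u) == hash_bucket('exp2', u)
                 for u in users)
assert agree_mod == 10000
```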
&lt;p&gt;This can be tested by running a $\chi^2$ test on the last digit (or last 2 digits) of the &lt;code&gt;user_id&lt;/code&gt;s. The idea is simple: if the &lt;code&gt;user_id&lt;/code&gt;s are truly random in A and B, we should see 10% of &lt;code&gt;user_id&lt;/code&gt;s ending in each of 0, 1, ..., 9, in both A and B. The below code simulates this check:&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="nn"&gt;numpy&lt;/span&gt; &lt;span class="k"&gt;as&lt;/span&gt; &lt;span class="nn"&gt;np&lt;/span&gt;

&lt;span class="n"&gt;MAX_USER_ID&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mi"&gt;1000&lt;/span&gt;&lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="mi"&gt;1000&lt;/span&gt;
&lt;span class="n"&gt;N&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mi"&gt;10&lt;/span&gt;&lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="mi"&gt;1000&lt;/span&gt;
&lt;span class="n"&gt;traffic_split&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="mf"&gt;0.8&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mf"&gt;0.2&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;
&lt;span class="n"&gt;digits_to_test&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt;

&lt;span class="n"&gt;users_in_experiment&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;np&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;random&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;choice&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;MAX_USER_ID&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;N&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;replace&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="bp"&gt;False&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="n"&gt;start&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;
&lt;span class="n"&gt;users_in_funnels&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[]&lt;/span&gt;
&lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;s&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;traffic_split&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
    &lt;span class="n"&gt;end&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;start&lt;/span&gt; &lt;span class="o"&gt;+&lt;/span&gt; &lt;span class="nb"&gt;int&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;s&lt;/span&gt;&lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="n"&gt;N&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="n"&gt;users_in_funnels&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;append&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;users_in_experiment&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;start&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="n"&gt;end&lt;/span&gt;&lt;span class="p"&gt;])&lt;/span&gt;
    &lt;span class="n"&gt;start&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;end&lt;/span&gt;

&lt;span class="n"&gt;p_min&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt;
&lt;span class="n"&gt;expected_split&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="mi"&gt;10&lt;/span&gt;&lt;span class="o"&gt;**&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="o"&gt;-&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="n"&gt;digits_to_test&lt;/span&gt;&lt;span class="p"&gt;)]&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt; &lt;span class="mi"&gt;10&lt;/span&gt;&lt;span class="o"&gt;**&lt;/span&gt;&lt;span class="n"&gt;digits_to_test&lt;/span&gt;
&lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;id_list&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;users_in_funnels&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
    &lt;span class="n"&gt;digit_counts&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nb"&gt;list&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;Counter&lt;/span&gt;&lt;span class="p"&gt;([&lt;/span&gt;&lt;span class="n"&gt;x&lt;/span&gt; &lt;span class="o"&gt;%&lt;/span&gt; &lt;span class="mi"&gt;10&lt;/span&gt; &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;x&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;id_list&lt;/span&gt;&lt;span class="p"&gt;])&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;values&lt;/span&gt;&lt;span class="p"&gt;())&lt;/span&gt;
    &lt;span class="n"&gt;expected_counts&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;e&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt; &lt;span class="nb"&gt;len&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;id_list&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;e&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;expected_split&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;
    &lt;span class="n"&gt;p&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;chisquare&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;digit_counts&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;f_exp&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;expected_counts&lt;/span&gt;&lt;span class="p"&gt;)[&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;
    &lt;span class="n"&gt;p_min&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nb"&gt;min&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;p&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;p_min&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;p_min&lt;/span&gt; &lt;span class="o"&gt;&amp;lt;&lt;/span&gt; &lt;span class="n"&gt;p_crit&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
    &lt;span class="k"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;Low p value (&lt;/span&gt;&lt;span class="si"&gt;%f&lt;/span&gt;&lt;span class="s1"&gt;). Probably indicated badly configured test, or bad logs!&amp;#39;&lt;/span&gt; &lt;span class="o"&gt;%&lt;/span&gt; &lt;span class="n"&gt;p&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="k"&gt;else&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
    &lt;span class="k"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;Everything seems good.&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/pre&gt;&lt;/div&gt;


&lt;h2&gt;Problem: logging problem or buggy product&lt;/h2&gt;
&lt;p&gt;The kinds of tests we can run here are similar to the tests above. For example, if the product is buggy and the new dialog doesn’t load at all, we would see that with a test like in the first example, because logs would be missing for B.&lt;/p&gt;
&lt;p&gt;Similar tests can be performed on properties (segments) of users. For example, we can look at the country of the user (US, Europe, Rest), browser (Chrome, Firefox, Edge/IE, Rest), platform (desktop, mobile) and make sure that the splits are the same. For example, if the browser split (Chrome, Firefox, Edge/IE, Rest) of users in A is (0.4, 0.2, 0.3, 0.1) and for B it is (0.5, 0.24, 0.24, 0.02), we can run a $\chi^2$ test (on the counts) on the vector pair to see how likely it is that they come from the same distribution; maybe our new B version doesn’t render correctly on all browsers, or there are performance differences.&lt;/p&gt;
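&lt;p&gt;This browser-split check can be sketched with &lt;code&gt;scipy&lt;/code&gt;'s &lt;code&gt;chi2_contingency&lt;/code&gt; on the two count vectors; the counts below are illustrative, assuming 10,000 users in A and 2,500 in B:&lt;/p&gt;

```python
from scipy.stats import chi2_contingency

# Browser counts (Chrome, Firefox, Edge/IE, Rest) per funnel, illustrative numbers
counts_a = [4000, 2000, 3000, 1000]  # split (0.4, 0.2, 0.3, 0.1)
counts_b = [1250, 600, 600, 50]      # split (0.5, 0.24, 0.24, 0.02)

# Chi-squared test of homogeneity on the 2x4 contingency table
p = chi2_contingency([counts_a, counts_b])[1]
if p < 0.001:
    print('Browser splits differ between A and B (p=%g)' % p)
```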
&lt;p&gt;Another thing we can do is to run the test for the final target metric (e.g. conversion or timespent), and see if the difference is unusually large. As above, we can set an aggressive p-value threshold (like 0.001 or 0.0001) and alert on that. This test would go off if, for example, a software problem makes it impossible to convert in one funnel, or lots of users with a certain browser are unable to convert, so the conversion number is unrealistically suppressed.&lt;/p&gt;
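&lt;p&gt;For a numeric metric like timespent, such an alert can be sketched with a two-sample t-test and an aggressive threshold; the simulation below, in which a hypothetical bug halves timespents in B, is purely illustrative:&lt;/p&gt;

```python
from scipy.stats import ttest_ind, expon

# Simulate timespents; in funnel B a hypothetical bug suppresses them
timespent_a = expon(scale=10).rvs(size=10000, random_state=0)
timespent_b = expon(scale=5).rvs(size=10000, random_state=1)

p_crit = 0.0001  # aggressive threshold, so the alert rarely fires by chance
p = ttest_ind(timespent_a, timespent_b)[1]
if p < p_crit:
    print('Unusually large difference in the target metric (p=%g)' % p)
```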
&lt;p&gt;The problem with the above is that at large sample sizes, which happen at companies with hundreds of millions or billions of users, even small lifts will be highly significant and thus achieve a low p value. In such cases a better validation check is to compare the distributions: most A/B tests don't change the shape of the distribution much, they just shift the mean; but a software or logging error might change the distribution's shape, if we're "lucky". For this we can use the &lt;a href="https://en.wikipedia.org/wiki/Kolmogorov%E2%80%93Smirnov_test"&gt;Kolmogorov-Smirnov test&lt;/a&gt;, which tests how likely it is that two samples come from the same distribution. The test statistic used by the Kolmogorov-Smirnov test is the largest difference between the two empirical cumulative distribution functions.&lt;/p&gt;
&lt;p&gt;&lt;img src="/images/k-s.png" alt="Kolmogorov-Smirnov" style="width: 500px;"/&gt;&lt;/p&gt;
&lt;p&gt;Example code, which detects that a sample drawn from a uniform distribution does not follow a normal distribution:&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;span class="n"&gt;N&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mi"&gt;10&lt;/span&gt;&lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="mi"&gt;1000&lt;/span&gt;
&lt;span class="n"&gt;rvs1&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;norm&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;rvs&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;size&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;N&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;loc&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;scale&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="n"&gt;rvs2&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;uniform&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;rvs&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;size&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;N&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;loc&lt;/span&gt;&lt;span class="o"&gt;=-&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;scale&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mi"&gt;2&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="n"&gt;p&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;ks_2samp&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;rvs1&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;rvs2&lt;/span&gt;&lt;span class="p"&gt;)[&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;
&lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;p&lt;/span&gt; &lt;span class="o"&gt;&amp;lt;&lt;/span&gt; &lt;span class="n"&gt;p_crit&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
    &lt;span class="k"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;Low p value (&lt;/span&gt;&lt;span class="si"&gt;%f&lt;/span&gt;&lt;span class="s1"&gt;). It&amp;#39;&amp;#39;s very unlikely the two samples are coming from the same distribution.&amp;#39;&lt;/span&gt; &lt;span class="o"&gt;%&lt;/span&gt; &lt;span class="n"&gt;p&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="k"&gt;else&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
    &lt;span class="k"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;Everything seems good.&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/pre&gt;&lt;/div&gt;


&lt;h2&gt;Problem: accidentally still running&lt;/h2&gt;
&lt;p&gt;For this problem, the easiest sanity check is not statistical, it’s a consistency check in code/reporting:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;have a report of running A/B tests, which is based on exposure logs only, irrespective of whether the experiment is turned on or off in the experiment configuration tool&lt;/li&gt;
&lt;li&gt;have an alert, so that if an A/B test is turned off in the experiment configuration tool, but the experiment is still producing exposure logs, an alert is generated&lt;/li&gt;
&lt;/ul&gt;
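&lt;p&gt;The second check can be sketched as a set comparison between the configuration tool's state and the exposure logs; the experiment ids below are made up:&lt;/p&gt;

```python
# Experiment ids enabled in the configuration tool (illustrative)
configured_on = {'experiment_1', 'experiment_2'}
# Distinct experiment ids seen in today's exposure logs (illustrative)
logging_exposures = {'experiment_1', 'experiment_3'}

# Experiments that are off (or unknown) in the config tool but still log exposures
still_running = logging_exposures - configured_on
for experiment in still_running:
    print('Alert: %s is turned off but still producing exposure logs' % experiment)
```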
&lt;p&gt;In a good A/B testing culture, there is a separate library/framework for running experiments in code. For a good example, see &lt;a href="https://github.com/facebook/planout"&gt;PlanOut&lt;/a&gt;, an open-source framework released by Facebook. Here is the demo example from PlanOut:&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="nn"&gt;planout.experiment&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;SimpleExperiment&lt;/span&gt;
&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="nn"&gt;planout.ops.random&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt;

&lt;span class="k"&gt;class&lt;/span&gt; &lt;span class="nc"&gt;FirstExperiment&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;SimpleExperiment&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
  &lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;assign&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="bp"&gt;self&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;params&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;userid&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
    &lt;span class="n"&gt;params&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;button_color&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;UniformChoice&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;choices&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;#ff0000&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s1"&gt;&amp;#39;#00ff00&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt; &lt;span class="n"&gt;unit&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;userid&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="n"&gt;params&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;button_text&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;WeightedChoice&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
        &lt;span class="n"&gt;choices&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;Join now!&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s1"&gt;&amp;#39;Sign up.&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt;
        &lt;span class="n"&gt;weights&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="mf"&gt;0.3&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mf"&gt;0.7&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt; &lt;span class="n"&gt;unit&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;userid&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="n"&gt;my_exp&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;FirstExperiment&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;userid&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mi"&gt;12&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="c1"&gt;# parameters may be accessed via the . operator&lt;/span&gt;
&lt;span class="k"&gt;print&lt;/span&gt; &lt;span class="n"&gt;my_exp&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;get&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;button_text&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt; &lt;span class="n"&gt;my_exp&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;get&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;button_color&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="c1"&gt;# experiment objects include all input data&lt;/span&gt;
&lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;i&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="nb"&gt;xrange&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;6&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
  &lt;span class="k"&gt;print&lt;/span&gt; &lt;span class="n"&gt;FirstExperiment&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;userid&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;i&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/pre&gt;&lt;/div&gt;


&lt;p&gt;Exposure logs should be generated in this line: &lt;code&gt;my_exp = FirstExperiment(userid=12)&lt;/code&gt;, when the user “enters the funnel” and the framework randomly decides on A or B.&lt;/p&gt;
&lt;h2&gt;Conclusion&lt;/h2&gt;
&lt;p&gt;A/B tests go wrong all the time, even in sophisticated product teams. As this article shows, for a range of problems we can run automated validation checks to catch problems early, before they have too bad of an effect on customers or the business. These validation checks compare various statistical properties of the funnels A and B to catch likely problems. Large technology companies are running such validation checks automatically and continuously for their online experiments.&lt;/p&gt;</content><category term="ab-testing"></category></entry><entry><title>Running multiple A/B tests in parallel</title><link href="https://bytepawn.com/running-multiple-ab-tests-in-parallel.html" rel="alternate"></link><published>2020-04-06T00:00:00+02:00</published><updated>2020-04-06T00:00:00+02:00</updated><author><name>Marton Trencseni</name></author><id>tag:bytepawn.com,2020-04-06:/running-multiple-ab-tests-in-parallel.html</id><summary type="html">&lt;p&gt;I show using Monte Carlo simulations that randomizing user assignments into A/B test experiments makes it possible to run multiple A/B tests at once and measure accurate lifts on the same metric, assuming the experiments are independent.&lt;br/&gt;&lt;br/&gt;&lt;img src="/images/abpa4.png" alt="Watts-Strogatz" style="width: 400px;"/&gt;&lt;/p&gt;</summary><content type="html">&lt;h2&gt;Introduction&lt;/h2&gt;
&lt;p&gt;Suppose we have N=10,000 users and want to run 2 A/B test experiments, $E_1$ and $E_2$, both of which are trying to move the same metric. In this post I will assume the metric we are trying to move is timespent per user (or something like it, a number assigned to each user). The same logic also applies to conversions, but timespents are a better illustration of the concepts.&lt;/p&gt;
&lt;p&gt;It is a common misconception that when running two experiments, we have to split our users between the two experiments, so each experiment will have 5,000 users in it, and each bucket will have 2,500 users in it (A in $E_1$, B in $E_1$, A in $E_2$, B in $E_2$). The cause of this misconception is the belief that if a user is in both experiments, then we cannot tell which experiment led to the user spending more time.&lt;/p&gt;
&lt;p&gt;At face value, this is an error in statistical reasoning. We don’t really care why an &lt;em&gt;individual user&lt;/em&gt; spent more or less time with the product; what we care about is the average timespent between A and B. As long as that measurement is accurate, individual users being influenced by multiple experiments is irrelevant. Accurate here means that we would measure the same thing (statistically) if we were running only one A/B test.&lt;/p&gt;
&lt;p&gt;Having said that, there are cases when running multiple tests on the same user leads to statistical errors: this happens if the experiments interact. In other words, if run by itself $E_1$ lifts by X and $E_2$ lifts by Y, and if running both yields a lift of X+Y, then we're fine. But if the effects interact with each other, so that the combined lift is something else (e.g. X+Y/2, because the presence of $E_1$'s change suppresses $E_2$'s), then we cannot run them in parallel. This happens if, for example, the experiments are making UI changes to the same dialog.&lt;/p&gt;
&lt;p&gt;If the experiments are independent, there is in fact no need to limit the sample sizes: both experiments can run on all 10,000 users, in parallel.&lt;/p&gt;
&lt;p&gt;&lt;a href="https://github.com/mtrencseni/playground/blob/master/Parallel%20AB%20tests.ipynb"&gt;The code shown below is up on Github.&lt;/a&gt;&lt;/p&gt;
&lt;h2&gt;Modeling the experiments&lt;/h2&gt;
&lt;p&gt;To understand why parallel experiments work, let’s remember how an A/B test is modeled in our simulations: each user is represented by an independent random variable (RV). Independent, because we assume users don’t affect each other (so, we’re not in a social network setting here), and it’s a &lt;a href="https://en.wikipedia.org/wiki/Random_variable"&gt;random variable&lt;/a&gt; because individual user outcomes are random. In this post, like before, I will use an &lt;a href="https://en.wikipedia.org/wiki/Exponential_distribution"&gt;exponential distribution&lt;/a&gt; to model timespents. The exponential distribution has one parameter $\mu$, which works out to be the mean. I will assume that by default, users have $\mu=1$.&lt;/p&gt;
&lt;p&gt;In our timespent simulations, when we say that an A/B test is actually working, we model this by increasing the $\mu$ parameter for the user’s random variable. In the end, we will sample the random variable, so the actual outcome can be any timespent $t&amp;gt;0$, but on average, users with lifted parameters will have higher timespents. This is the key: in an A/B test, we don’t care about individual user’s outcomes, since they are statistically random anyway, we care about measuring accurate average lifts between groups of users.&lt;/p&gt;
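&lt;p&gt;As a sketch of this modeling step: sampling two exponential RVs with different mean parameters gives individually random outcomes, but the group averages reflect the lift (here the exponential's &lt;code&gt;scale&lt;/code&gt; parameter is used as the mean):&lt;/p&gt;

```python
import numpy as np
from scipy.stats import expon

# A control user RV with mean 1, and a lifted user RV with mean 2;
# for the exponential distribution, the scale parameter equals the mean
control = expon(scale=1.0).rvs(size=100000, random_state=0)
lifted = expon(scale=2.0).rvs(size=100000, random_state=1)

# Individual samples are random, but the group means show the lift
mean_control, mean_lifted = np.mean(control), np.mean(lifted)
```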
&lt;h2&gt;Visualizing one A/B test&lt;/h2&gt;
&lt;p&gt;There is an easy visual way to understand why parallel A/B tests work. Before we look at the parallel cases, as a starting point, let’s look at the simple case of just one experiment. We can use code like in the previous posts for this:&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;span class="n"&gt;N&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mi"&gt;10&lt;/span&gt;&lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="mi"&gt;1000&lt;/span&gt;
&lt;span class="n"&gt;funnels&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;
    &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="mf"&gt;0.00&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mf"&gt;0.5&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt;
    &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="mf"&gt;1.00&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mf"&gt;0.5&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt;
&lt;span class="p"&gt;]&lt;/span&gt;
&lt;span class="n"&gt;timespent_params&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;defaultdict&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="k"&gt;lambda&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="mf"&gt;1.0&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="n"&gt;timespents&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;{}&lt;/span&gt;

&lt;span class="n"&gt;funnel_assignment&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;simulate_abtest&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;funnels&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;N&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;timespent_params&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;force_equal&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="bp"&gt;True&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;i&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;param&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;timespent_params&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;items&lt;/span&gt;&lt;span class="p"&gt;():&lt;/span&gt;
    &lt;span class="n"&gt;timespents&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;i&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;expon&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;param&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;rvs&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;size&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;)[&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;
&lt;/pre&gt;&lt;/div&gt;


&lt;p&gt;What this code does: there are $N=10,000$ users, and we split them evenly between the A and B funnels of the experiment (per &lt;code&gt;funnels&lt;/code&gt;, 2nd column). Each user is modeled by an exponential random variable's parameter (&lt;code&gt;timespent_params&lt;/code&gt;, with default parameter 1). The function &lt;code&gt;simulate_abtest()&lt;/code&gt; assigns each user into A or B and returns this assignment in &lt;code&gt;funnel_assignment&lt;/code&gt;. Further, it adjusts &lt;code&gt;timespent_params&lt;/code&gt; by increasing the RV’s parameter for users in the B bucket by 1, leaving As alone (per &lt;code&gt;funnels&lt;/code&gt;, 1st column). The final &lt;code&gt;for&lt;/code&gt; loop samples the exponential distributions and stores the actual timespent values per user in &lt;code&gt;timespents&lt;/code&gt;.&lt;/p&gt;
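&lt;p&gt;&lt;code&gt;simulate_abtest()&lt;/code&gt; itself is defined in the earlier posts in this series; a minimal sketch consistent with the description above (the real signature and internals may differ) could look like this:&lt;/p&gt;

```python
import numpy as np

def simulate_abtest(funnels, N, timespent_params, force_equal=False):
    # funnels rows are [lift, traffic_share]; assign each of N users to a
    # funnel per the traffic share, then add the funnel's lift to the
    # user's timespent parameter
    weights = [f[1] for f in funnels]
    if force_equal:
        # split honoring the traffic shares exactly, in random order
        funnel_assignment = np.repeat(range(len(funnels)),
                                      [int(w * N) for w in weights])
        np.random.shuffle(funnel_assignment)
    else:
        funnel_assignment = np.random.choice(len(funnels), size=N, p=weights)
    for i, which_funnel in enumerate(funnel_assignment):
        timespent_params[i] += funnels[which_funnel][0]
    return funnel_assignment
```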
&lt;p&gt;We can visualize the outcome of this experiment by drawing both the parameters and the actual timespents of each user. Since there are $N=10,000$ users, we can do so on a 100x100 image. The left side shows the parameters ($\mu=1$ or $\mu=2$), the right side shows the actual, sampled timespents.&lt;/p&gt;
&lt;p&gt;&lt;img src="/images/abpa1.png" alt="." style="width: 600px;"/&gt;&lt;/p&gt;
&lt;p&gt;Let’s change our visualization a little bit: let’s make it so we draw the A bucket users on top, and the B bucket users on the bottom:&lt;/p&gt;
&lt;p&gt;&lt;img src="/images/abpa2.png" alt="." style="width: 600px;"/&gt;&lt;/p&gt;
&lt;p&gt;This nicely shows us what’s going on. On the left side, we can see that the random variables for A and B have different values. On the right side, we can see that after sampling, the difference is still discernible with the naked eye. Note that the two left and two right sides are showing the same values, only arranged differently.&lt;/p&gt;
&lt;h2&gt;Two A/B tests in parallel&lt;/h2&gt;
&lt;p&gt;Now let’s run 2 A/B tests in parallel. In both cases, A leaves the RV’s parameter alone. But for $E_1$, we lift it to $\mu=2$, for $E_2$ we lift it to $\mu=3$. Users are in both A/B tests, and they are assigned into A and B buckets randomly, independently in the two experiments:&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;span class="n"&gt;N&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;10&lt;/span&gt;&lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="mi"&gt;1000&lt;/span&gt;
&lt;span class="n"&gt;num_tests&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;2&lt;/span&gt;
&lt;span class="n"&gt;base_lift&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt;
&lt;span class="n"&gt;timespent_params&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;defaultdict&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="k"&gt;lambda&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="mf"&gt;1.0&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="n"&gt;timespents&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;{}&lt;/span&gt;
&lt;span class="n"&gt;funnel_assignments&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[]&lt;/span&gt;
&lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;test&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="nb"&gt;range&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;num_tests&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
    &lt;span class="n"&gt;funnels&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;
        &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;                  &lt;span class="mf"&gt;0.5&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt;
        &lt;span class="p"&gt;[(&lt;/span&gt;&lt;span class="n"&gt;test&lt;/span&gt;&lt;span class="o"&gt;+&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;&lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="n"&gt;base_lift&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mf"&gt;0.5&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt;
    &lt;span class="p"&gt;]&lt;/span&gt;
    &lt;span class="n"&gt;funnel_assignment&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;simulate_abtest&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;funnels&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;N&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;timespent_params&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="n"&gt;funnel_assignments&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;append&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;funnel_assignment&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;i&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;param&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;timespent_params&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;items&lt;/span&gt;&lt;span class="p"&gt;():&lt;/span&gt;
    &lt;span class="n"&gt;timespents&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;i&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;expon&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;param&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;rvs&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;size&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;)[&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;
&lt;/pre&gt;&lt;/div&gt;


&lt;p&gt;Note that we call &lt;code&gt;simulate_abtest()&lt;/code&gt; in a loop, for each experiment. Let’s visualize the outcome here: we expect that the parameter image will have 4 colors, corresponding to whether a user ended up in AA ($\mu=1$), AB ($\mu=2$), BA ($\mu=3$) or BB ($\mu=4$):&lt;/p&gt;
&lt;p&gt;&lt;img src="/images/abpa3.png" alt="." style="width: 600px;"/&gt;&lt;/p&gt;
&lt;p&gt;The left side image in fact has 4 colors. After random sampling, the right side looks just as random as in the single A/B test case. Now let’s do the same trick as before, and draw the image so that As are on top, and Bs are on the bottom. We can pick whether we do this for $E_1$ or $E_2$, we will see the same thing, here I'm doing it for $E_2$'s A and B:&lt;/p&gt;
&lt;p&gt;&lt;img src="/images/abpa4.png" alt="." style="width: 600px;"/&gt;&lt;/p&gt;
&lt;p&gt;Note again that these are the same exact values as above, only reordered. We can see that on the left side, there are 2 possible parameters on each side (the A and B variations from the &lt;em&gt;other&lt;/em&gt; experiment). And on the right side we can see that even though the other experiment is also running, we can clearly tell apart the average value between top (A) and bottom (B).&lt;/p&gt;
&lt;p&gt;Beyond the visualization, we can also see this numerically:&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;test&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;funnel_assignment&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="nb"&gt;enumerate&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;funnel_assignments&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
    &lt;span class="n"&gt;ps&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;ts&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[[],&lt;/span&gt; &lt;span class="p"&gt;[]],&lt;/span&gt; &lt;span class="p"&gt;[[],&lt;/span&gt; &lt;span class="p"&gt;[]]&lt;/span&gt;
    &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;i&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;which_funnel&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="nb"&gt;enumerate&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;funnel_assignment&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
        &lt;span class="n"&gt;ts&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;which_funnel&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;append&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;timespents&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;i&lt;/span&gt;&lt;span class="p"&gt;])&lt;/span&gt;
        &lt;span class="n"&gt;ps&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;which_funnel&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;append&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;timespent_params&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;i&lt;/span&gt;&lt;span class="p"&gt;])&lt;/span&gt;
    &lt;span class="n"&gt;ps_means&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;ts_means&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;np&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;mean&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;x&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;x&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;ps&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;np&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;mean&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;x&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;x&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;ts&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;
    &lt;span class="n"&gt;p&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;CompareMeans&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;from_data&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;ts&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt; &lt;span class="n"&gt;ts&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;])&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;ztest_ind&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;alternative&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;larger&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;usevar&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;unequal&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;)[&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;
    &lt;span class="k"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;test &amp;#39;&lt;/span&gt;&lt;span class="o"&gt;%&lt;/span&gt;&lt;span class="mo"&gt;02&lt;/span&gt;&lt;span class="n"&gt;d&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;, experiment lift=&lt;/span&gt;&lt;span class="si"&gt;%.4f&lt;/span&gt;&lt;span class="s1"&gt;, blended parameter lift=&lt;/span&gt;&lt;span class="si"&gt;%.4f&lt;/span&gt;&lt;span class="s1"&gt;, measured lift=&lt;/span&gt;&lt;span class="si"&gt;%.4f&lt;/span&gt;&lt;span class="s1"&gt;, p-value=&lt;/span&gt;&lt;span class="si"&gt;%.4f&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;&lt;/span&gt;
          &lt;span class="o"&gt;%&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;test&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;test&lt;/span&gt;&lt;span class="o"&gt;+&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;&lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="n"&gt;base_lift&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;ps_means&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;&lt;span class="o"&gt;-&lt;/span&gt;&lt;span class="n"&gt;ps_means&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt; &lt;span class="n"&gt;ts_means&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;&lt;span class="o"&gt;-&lt;/span&gt;&lt;span class="n"&gt;ts_means&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt; &lt;span class="n"&gt;p&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;
&lt;/pre&gt;&lt;/div&gt;


&lt;p&gt;Prints something like:&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;test 00, experiment lift=1.0000, blended parameter lift=0.9700, measured lift=0.9634, p-value=0.0000
test 01, experiment lift=2.0000, blended parameter lift=1.9850, measured lift=1.9924, p-value=0.0000
&lt;/pre&gt;&lt;/div&gt;


&lt;p&gt;In the first test, we lifted $\mu$ by 1. Because the other test is also running and changing the random variable parameters, on average our random variable parameters were shifted by 0.97 (instead of 1). After sampling, the actual measured lift was 0.96. At this sample size, this lift had a very low p-value (since it’s a doubling, it’s easy to measure). In the next row, we can see the second A/B test, which is also easily measurable.&lt;/p&gt;
&lt;h2&gt;Multiple A/B tests in parallel&lt;/h2&gt;
&lt;p&gt;Maybe this only worked because there were only 2 experiments, and we lifted the RV’s parameter so aggressively (doubling, tripling). Let’s see what happens if we run 11 in parallel, with $\mu$ lifts of 0, 0.1, 0.2 ... 1.0 (so the first one has no effect, the last one doubles). The numeric outcomes:&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;test 00, experiment lift=0.0000, blended parameter lift=0.0605, measured lift=0.0656, p-value=0.0101
test 01, experiment lift=0.1000, blended parameter lift=0.1168, measured lift=0.1027, p-value=0.0001
test 02, experiment lift=0.2000, blended parameter lift=0.2122, measured lift=0.1929, p-value=0.0000
test 03, experiment lift=0.3000, blended parameter lift=0.3125, measured lift=0.3113, p-value=0.0000
test 04, experiment lift=0.4000, blended parameter lift=0.4011, measured lift=0.4153, p-value=0.0000
test 05, experiment lift=0.5000, blended parameter lift=0.5037, measured lift=0.5526, p-value=0.0000
test 06, experiment lift=0.6000, blended parameter lift=0.6013, measured lift=0.5849, p-value=0.0000
test 07, experiment lift=0.7000, blended parameter lift=0.6867, measured lift=0.6740, p-value=0.0000
test 08, experiment lift=0.8000, blended parameter lift=0.8124, measured lift=0.8005, p-value=0.0000
test 09, experiment lift=0.9000, blended parameter lift=0.8973, measured lift=0.8763, p-value=0.0000
test 10, experiment lift=1.0000, blended parameter lift=1.0157, measured lift=1.0209, p-value=0.0000
&lt;/pre&gt;&lt;/div&gt;


&lt;p&gt;It’s still pretty good. Let’s visualize the middle one, where $\mu$ is lifted by 0.5:&lt;/p&gt;
&lt;p&gt;&lt;img src="/images/abpa6.png" alt="." style="width: 600px;"/&gt;&lt;/p&gt;
&lt;p&gt;On the left, because there are so many A/B tests in play, the RV parameters also look random, but we can still see the difference between top and bottom, both here and in the sampled timespents on the right side.&lt;/p&gt;
&lt;h2&gt;Monte Carlo simulations to estimate variance of parallel A/B tests&lt;/h2&gt;
&lt;p&gt;In the above case, for the 7th test, the true experiment lift was 0.7000, but due to the presence of other A/B tests, the blended parameter lift between the two buckets (left side on the images) was 0.6867. Let’s use &lt;a href="https://en.wikipedia.org/wiki/Monte_Carlo_method"&gt;Monte Carlo simulations&lt;/a&gt; to quantify how much variance we can expect, as a function of $N$. Let’s run a scenario where we’re running 7 A/B tests at the same time, with lifts of &lt;code&gt;[0, 0.01, 0.02, 0.03, 0.04, 0.05, 0.10]&lt;/code&gt;, on different $N$ population sizes &lt;code&gt;[10*1000, 20*1000, 30*1000, 40*1000, 50*1000, 100*1000, 200*1000, 1000*1000]&lt;/code&gt;. Let's run each scenario 100 times, and compute means and variances for the average parameter lift and the actual measured lift.&lt;/p&gt;
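&lt;p&gt;The core of such a simulation can be sketched as follows. This is a minimal model of my own, not the full &lt;code&gt;simulate_abtest()&lt;/code&gt; from above: each user gets an independent 50/50 assignment in every experiment, and being in B of experiment $i$ adds &lt;code&gt;lifts[i]&lt;/code&gt; to the mean of a normally distributed timespent:&lt;/p&gt;

```python
import numpy as np

def simulate_parallel_lifts(lifts, N, base_mu=1.0, sigma=1.0, rng=None):
    # Every user gets an independent 50/50 A/B assignment in each experiment;
    # being in B of experiment i adds lifts[i] to the user's mean timespent.
    if rng is None:
        rng = np.random.default_rng()
    assignments = rng.integers(0, 2, size=(len(lifts), N))  # row i: 0=A, 1=B in experiment i
    mu = base_mu + (assignments * np.array(lifts)[:, None]).sum(axis=0)
    timespents = rng.normal(mu, sigma)
    # measured lift of experiment i = mean(B users) - mean(A users)
    return [timespents[assignments[i] == 1].mean() -
            timespents[assignments[i] == 0].mean() for i in range(len(lifts))]

# repeat the scenario to estimate the mean and standard deviation of the measured lift
runs = np.array([simulate_parallel_lifts([0, 0.05, 0.10], 50 * 1000) for _ in range(20)])
print(runs.mean(axis=0))  # close to the true lifts [0, 0.05, 0.10]
print(runs.std(axis=0))
```

Even in this stripped-down sketch, the measured lifts concentrate around the true per-experiment lifts, because the other experiments' effects average out between a given experiment's A and B buckets.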
&lt;p&gt;First, the average parameter lifts, with errors to show the variance:&lt;/p&gt;
&lt;p&gt;&lt;img src="/images/abpa7.png" alt="." style="width: 600px;"/&gt;&lt;/p&gt;
&lt;p&gt;We can see that even at low $N$, the average parameter lifts are pretty close to the intended experimental lift. The variance is so small that we can barely see it. Let’s zoom in on the orange line:&lt;/p&gt;
&lt;p&gt;&lt;img src="/images/abpa8.png" alt="." style="width: 600px;"/&gt;&lt;/p&gt;
&lt;p&gt;The variance goes down with $N$ as expected, but even at $N=10,000$ it’s very low (notice the y-axis). Now the same for actual measured lifts (after the random variables are sampled):&lt;/p&gt;
&lt;p&gt;&lt;img src="/images/abpa9.png" alt="." style="width: 600px;"/&gt;&lt;/p&gt;
&lt;p&gt;Here the error bars are bigger, so it’s hard to see what’s going on. Let’s look at the orange line again:&lt;/p&gt;
&lt;p&gt;&lt;img src="/images/abpa10.png" alt="." style="width: 600px;"/&gt;&lt;/p&gt;
&lt;p&gt;The magnitude of the standard deviation (the error bar), plotted by itself:&lt;/p&gt;
&lt;p&gt;&lt;img src="/images/abpa11.png" alt="." style="width: 600px;"/&gt;&lt;/p&gt;
&lt;p&gt;Finally, we can plot the average measured p value (with a &lt;a href="https://en.wikipedia.org/wiki/Z-test"&gt;Z-test&lt;/a&gt;), for each parallel A/B test, for each $N$:&lt;/p&gt;
&lt;p&gt;&lt;img src="/images/abpa13.png" alt="." style="width: 600px;"/&gt;&lt;/p&gt;
&lt;p&gt;This essentially tells us what sample size $N$ we need to be able to detect the signal at a given $\alpha$ critical value of p.&lt;/p&gt;
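&lt;p&gt;For a rough sense of the required sample size, the textbook two-sample z-test formula can be evaluated directly. This is a sketch, not part of the simulation code above; it assumes a one-tailed test and takes &lt;code&gt;sigma&lt;/code&gt; as the standard deviation of the timespent:&lt;/p&gt;

```python
from statistics import NormalDist

def n_per_group(delta, sigma=1.0, alpha=0.05, power=0.8):
    # users needed per funnel to detect a lift of delta with a
    # one-tailed two-sample z-test at significance alpha and given power
    z = NormalDist().inv_cdf
    return 2 * ((z(1 - alpha) + z(power)) * sigma / delta) ** 2

for delta in [0.01, 0.05, 0.10]:
    print(delta, round(n_per_group(delta)))
```

As expected from the p-value plot, the smallest lifts need population sizes in the hundreds of thousands, while a lift of 0.10 is detectable with a few thousand users per funnel.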
&lt;h2&gt;Per experiment random user id&lt;/h2&gt;
&lt;p&gt;As these Monte Carlo simulations show, it is possible to run multiple A/B tests at once, across the whole population, on the same outcome metric, and still measure the experimental lift accurately, assuming the experiments are independent.&lt;/p&gt;
&lt;p&gt;It is only necessary to randomize users between the funnels A and B (and C...) of each experiment independently of the other experiments. A simple solution is to generate a random &lt;code&gt;test_seed&lt;/code&gt; once for each experiment, which is stored and constant throughout the experiment (like &lt;code&gt;90bb5357&lt;/code&gt; for experiment $E_1$, &lt;code&gt;a5f50c2b&lt;/code&gt; for experiment $E_2$, and so on). We combine this seed with a per-user id (like &lt;code&gt;user_id&lt;/code&gt; or &lt;code&gt;cookie_id&lt;/code&gt;) to get a per-experiment random number that is fixed for the user (so when the user comes back, we compute the same random number), and then reduce it modulo the number of funnels to decide whether to put the user into A or B in each experiment (so if the user returns, she goes back to the same bucket):&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;funnel_user&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;base_traffic_split&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;test_seed&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;user_id&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
    &lt;span class="n"&gt;test_id&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;hashlib&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;md5&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;test_seed&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;encode&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;ascii&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;+&lt;/span&gt; &lt;span class="nb"&gt;str&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;user_id&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;encode&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;ascii&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;hexdigest&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
    &lt;span class="n"&gt;bits&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nb"&gt;bin&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nb"&gt;int&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;test_id&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;16&lt;/span&gt;&lt;span class="p"&gt;))[&lt;/span&gt;&lt;span class="mi"&gt;3&lt;/span&gt;&lt;span class="p"&gt;:]&lt;/span&gt;
    &lt;span class="n"&gt;r&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nb"&gt;sum&lt;/span&gt;&lt;span class="p"&gt;([&lt;/span&gt;&lt;span class="nb"&gt;int&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;bit&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;&lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mf"&gt;0.5&lt;/span&gt;&lt;span class="o"&gt;**&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;i&lt;/span&gt;&lt;span class="o"&gt;+&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt; &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;i&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;bit&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="nb"&gt;enumerate&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;bits&lt;/span&gt;&lt;span class="p"&gt;)])&lt;/span&gt;
    &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;r&lt;/span&gt; &lt;span class="o"&gt;&amp;lt;&lt;/span&gt; &lt;span class="n"&gt;base_traffic_split&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
        &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="s1"&gt;&amp;#39;A&amp;#39;&lt;/span&gt;
    &lt;span class="k"&gt;else&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
        &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="s1"&gt;&amp;#39;B&amp;#39;&lt;/span&gt;
&lt;/pre&gt;&lt;/div&gt;
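&lt;p&gt;As a quick sanity check of this scheme (a self-contained sketch of my own; I map the hash to $[0, 1)$ more compactly than above, and re-use the made-up seeds from the text), assignments should be reproducible per user, split roughly 50/50, and be independent across experiments:&lt;/p&gt;

```python
import hashlib

def funnel_user(base_traffic_split, test_seed, user_id):
    # hash (test_seed, user_id) into a reproducible number r in [0, 1)
    digest = hashlib.md5((test_seed + str(user_id)).encode('ascii')).hexdigest()
    r = int(digest, 16) / 16**32
    return 'A' if r < base_traffic_split else 'B'

users = range(10 * 1000)
e1 = [funnel_user(0.5, '90bb5357', u) for u in users]
e2 = [funnel_user(0.5, 'a5f50c2b', u) for u in users]

assert e1 == [funnel_user(0.5, '90bb5357', u) for u in users]  # same user, same bucket
print(e1.count('A') / len(e1))                                 # ~0.5
both_a = sum(1 for a, b in zip(e1, e2) if a == 'A' and b == 'A') / len(e1)
print(both_a)                                                  # ~0.25 if the two assignments are independent
```

If the two experiments' assignments were correlated, the fraction of users in A of both experiments would drift away from 0.25, which is exactly the failure mode this per-experiment seeding avoids.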


&lt;h2&gt;Conclusion&lt;/h2&gt;
&lt;p&gt;By randomizing user assignments into A/B test experiments, it is possible to run multiple A/B tests at once and measure accurate lifts on the same metric, assuming the experiments are independent.&lt;/p&gt;</content><category term="ab-testing"></category></entry><entry><title>Bayesian A/B conversion tests</title><link href="https://bytepawn.com/bayesian-ab-conversion-tests.html" rel="alternate"></link><published>2020-03-31T00:00:00+02:00</published><updated>2020-03-31T00:00:00+02:00</updated><author><name>Marton Trencseni</name></author><id>tag:bytepawn.com,2020-03-31:/bayesian-ab-conversion-tests.html</id><summary type="html">&lt;p&gt;I compare probabilities from Bayesian A/B testing with Beta distributions to frequentist A/B tests using Monte Carlo simulations. Under a lot of circumstances, the bayesian probability of the action hypothesis being true and the frequentist p value are complementary.&lt;br/&gt;&lt;br/&gt; &lt;img src="/images/bayes4.png" alt="Bayes vs z-test" style="width: 400px;"/&gt;&lt;/p&gt;</summary><content type="html">&lt;h2&gt;Introduction&lt;/h2&gt;
&lt;p&gt;The A/B tests I talked about before, such as the &lt;a href="http://bytepawn.com/ab-testing-and-the-ztest.html#ab-testing-and-the-ztest"&gt;Z-test&lt;/a&gt;, &lt;a href="http://bytepawn.com/ab-testing-and-the-ttest.html#ab-testing-and-the-ttest"&gt;t-test&lt;/a&gt;, &lt;a href="http://bytepawn.com/ab-testing-and-the-chi-squared-test.html#ab-testing-and-the-chi-squared-test"&gt;$\chi^2$ test&lt;/a&gt;, &lt;a href="http://bytepawn.com/ab-testing-and-the-gtest.html#ab-testing-and-the-gtest"&gt;G-test&lt;/a&gt; and &lt;a href="http://bytepawn.com/ab-testing-and-fishers-exact-test.html#ab-testing-and-fishers-exact-test"&gt;Fisher’s exact test&lt;/a&gt; are so-called &lt;strong&gt;frequentist&lt;/strong&gt; hypothesis testing methodologies. In &lt;a href="https://en.wikipedia.org/wiki/Frequentist_inference"&gt;frequentist inference&lt;/a&gt;, we formulate a $H_0$ null hypothesis and an $H_1$ action hypothesis, run the experiment, and then calculate the $p_f$ value ($f$ for frequentist), which is the probability of the outcome of the experiment being at least as extreme as the actual outcome, assuming the null hypothesis $H_0$ is true. For one-tailed conversion tests, $H_0$ is &lt;em&gt;“B is converting worse or the same as A”&lt;/em&gt; and $H_1$ is &lt;em&gt;“B is converting better than A”&lt;/em&gt;. In the frequentist setting, if the $p_f$ value is lower than some threshold $\alpha$ (usually $\alpha=0.01$ or $\alpha=0.05$), then we reject the null hypothesis, and accept the action hypothesis.&lt;/p&gt;
&lt;p&gt;At a high level, &lt;a href="https://en.wikipedia.org/wiki/Bayesian_inference"&gt;&lt;strong&gt;bayesian&lt;/strong&gt; inference&lt;/a&gt; turns this on its head and computes the probability $p_b$ ($b$ for bayesian) that $H_1$ is true (and $H_0$ is false) given the outcome of the experiment. If this probability is high, we accept the action hypothesis.&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;Bayesian inference is a method of statistical inference in which Bayes' theorem is used to update the probability for a hypothesis as more evidence or information becomes available.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;We expect that the two are related: if there is a big difference in conversion in favor of B versus A, relative to the sample size $N$, we expect to get a low frequentist $p_f$ value and a high bayesian probability $p_b$. In terms of the math, however, the two are not complementary probabilities: $p_f + p_b \neq 1$. Still, as I will show here, this relationship approximately holds when doing 2x2 conversion A/B testing with Beta distributions and flat priors: $p_f + p_b \simeq 1$.&lt;/p&gt;
&lt;p&gt;&lt;a href="https://github.com/mtrencseni/playground/blob/master/Bayesian%20AB%20testing.ipynb"&gt;The code shown below is up on Github.&lt;/a&gt;&lt;/p&gt;
&lt;h2&gt;Conversion parameter estimation with the Beta distribution&lt;/h2&gt;
&lt;p&gt;In Bayesian modeling, we pick a distribution to model the $\mu$ conversion probability for each funnel A and B. In other words, we say &lt;em&gt;“I don’t know what the conversion of A is, but based on the experimental outcome, $P_A(\mu)$ is the probability that it is $\mu$.”&lt;/em&gt; For conversion modeling, we usually pick the &lt;a href="https://en.wikipedia.org/wiki/Beta_distribution"&gt;Beta distribution&lt;/a&gt; to model the $\mu$ conversion parameter. The Beta distribution has two parameters, $\alpha$ and $\beta$; its mean is at $\frac{\alpha}{\alpha + \beta}$, and its domain is the range $[0, 1]$. Given an experimental outcome where we observed $C$ conversions out of $N$ cases, we set $\alpha=C$ and $\beta=N-C$, so $\alpha$ is the conversion count, $\beta$ is the non-conversion count. The Beta distribution is available in &lt;a href="https://docs.scipy.org/doc/scipy/reference/generated/scipy.stats.beta.html"&gt;scipy&lt;/a&gt; and is easy to visualize:&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;span class="n"&gt;N&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;100&lt;/span&gt;
&lt;span class="n"&gt;convs&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="mf"&gt;0.03&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mf"&gt;0.10&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mf"&gt;0.50&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mf"&gt;0.90&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mf"&gt;0.97&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;
&lt;span class="n"&gt;plt&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;figure&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;figsize&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;10&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="mi"&gt;5&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;
&lt;span class="n"&gt;x&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;np&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;linspace&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mf"&gt;0.01&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mf"&gt;0.99&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;1000&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="n"&gt;legends&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[]&lt;/span&gt;
&lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;conv&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;convs&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
    &lt;span class="n"&gt;a&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nb"&gt;int&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;conv&lt;/span&gt;&lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="n"&gt;N&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="n"&gt;b&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nb"&gt;int&lt;/span&gt;&lt;span class="p"&gt;((&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="o"&gt;-&lt;/span&gt;&lt;span class="n"&gt;conv&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;&lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="n"&gt;N&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="n"&gt;plt&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;plot&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;x&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;beta&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;pdf&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;x&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;a&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;b&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;
    &lt;span class="n"&gt;legends&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;append&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;Beta(&lt;/span&gt;&lt;span class="si"&gt;%s&lt;/span&gt;&lt;span class="s1"&gt;, &lt;/span&gt;&lt;span class="si"&gt;%s&lt;/span&gt;&lt;span class="s1"&gt;)&amp;#39;&lt;/span&gt; &lt;span class="o"&gt;%&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;a&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;b&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;
&lt;span class="n"&gt;plt&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;legend&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;legends&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/pre&gt;&lt;/div&gt;


&lt;p&gt;&lt;img src="/images/bayes1.png" alt="Beta distributions" style="width: 600px;"/&gt;&lt;/p&gt;
&lt;h2&gt;Computing the bayesian probability&lt;/h2&gt;
&lt;p&gt;Given the parameter distributions for A and B, we can compute the probability that B’s conversion is greater than A’s: $P(\mu_A \leq \mu_B) = \int_{\mu_A \leq \mu_B} P_A(\mu_A) P_B(\mu_B) \, d\mu_A \, d\mu_B = \int_{\mu_A \leq \mu_B} Beta_{\alpha=C_A, \beta=N_A-C_A} (\mu_A) Beta_{\alpha=C_B, \beta=N_B-C_B} (\mu_B) \, d\mu_A \, d\mu_B$. To evaluate the integral, we can either use a &lt;a href="https://towardsdatascience.com/bayesian-a-b-testing-with-python-the-easy-guide-d638f89e0b8a"&gt;closed form solution from this post&lt;/a&gt;, or use Monte Carlo integration (sampling) to estimate it. Implementing the MC integration is good practice, and we can use it to make sure the copy/pasted closed form is correct:&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;bayesian_prob_mc&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;observations&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;num_samples&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;1000&lt;/span&gt;&lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="mi"&gt;1000&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
    &lt;span class="n"&gt;beta_A&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;beta&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;observations&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;][&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt; &lt;span class="n"&gt;observations&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;][&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;])&lt;/span&gt;
    &lt;span class="n"&gt;beta_B&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;beta&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;observations&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;][&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt; &lt;span class="n"&gt;observations&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;][&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;])&lt;/span&gt;
    &lt;span class="n"&gt;samples_A&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;beta_A&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;rvs&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;size&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;num_samples&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="n"&gt;samples_B&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;beta_B&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;rvs&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;size&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;num_samples&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="n"&gt;hits&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nb"&gt;sum&lt;/span&gt;&lt;span class="p"&gt;([&lt;/span&gt;&lt;span class="n"&gt;a&lt;/span&gt; &lt;span class="o"&gt;&amp;lt;=&lt;/span&gt; &lt;span class="n"&gt;b&lt;/span&gt; &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;a&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;b&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="nb"&gt;zip&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;samples_A&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;samples_B&lt;/span&gt;&lt;span class="p"&gt;)])&lt;/span&gt;
    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;hits&lt;/span&gt;&lt;span class="o"&gt;/&lt;/span&gt;&lt;span class="n"&gt;num_samples&lt;/span&gt;
&lt;/pre&gt;&lt;/div&gt;
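&lt;p&gt;For comparison, the closed form can be written with just the standard library, using &lt;code&gt;lgamma&lt;/code&gt; for the log of the Beta function. This is a sketch of the well-known formula, which requires integer $\alpha_B$; the function names are mine:&lt;/p&gt;

```python
from math import lgamma, log, exp

def log_beta_fn(a, b):
    # log of the Beta function B(a, b)
    return lgamma(a) + lgamma(b) - lgamma(a + b)

def prob_B_beats_A(alpha_A, beta_A, alpha_B, beta_B):
    # P(mu_B > mu_A) for mu_A ~ Beta(alpha_A, beta_A), mu_B ~ Beta(alpha_B, beta_B),
    # summing the standard closed form over i = 0 .. alpha_B - 1
    total = 0.0
    for i in range(alpha_B):
        total += exp(log_beta_fn(alpha_A + i, beta_A + beta_B)
                     - log(beta_B + i)
                     - log_beta_fn(1 + i, beta_B)
                     - log_beta_fn(alpha_A, beta_A))
    return total

print(prob_B_beats_A(20, 80, 30, 70))
```

The Monte Carlo estimate from &lt;code&gt;bayesian_prob_mc()&lt;/code&gt; should agree with this closed form to within sampling error.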


&lt;p&gt;Let’s see it in action, and let’s also show what we would get with a one-tailed z-test:&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;span class="n"&gt;funnels&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;
    &lt;span class="p"&gt;[[&lt;/span&gt;&lt;span class="mf"&gt;0.50&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mf"&gt;0.50&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt; &lt;span class="mf"&gt;0.5&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt; &lt;span class="c1"&gt;# the first vector is the actual outcomes,&lt;/span&gt;
    &lt;span class="p"&gt;[[&lt;/span&gt;&lt;span class="mf"&gt;0.50&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mf"&gt;0.50&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt; &lt;span class="mf"&gt;0.5&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt; &lt;span class="c1"&gt;# the second is the traffic split&lt;/span&gt;
&lt;span class="p"&gt;]&lt;/span&gt;
&lt;span class="n"&gt;N&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;1000&lt;/span&gt;

&lt;span class="n"&gt;observations&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;simulate_abtest&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;funnels&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;N&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="k"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;Observations:&lt;/span&gt;&lt;span class="se"&gt;\n&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;observations&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="n"&gt;pf&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;z_test&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;observations&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="k"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;z-test p = &lt;/span&gt;&lt;span class="si"&gt;%.3f&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;&lt;/span&gt; &lt;span class="o"&gt;%&lt;/span&gt; &lt;span class="n"&gt;pf&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="n"&gt;pb&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;bayesian_prob&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;observations&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="k"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;Bayesian p = &lt;/span&gt;&lt;span class="si"&gt;%.3f&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;&lt;/span&gt; &lt;span class="o"&gt;%&lt;/span&gt; &lt;span class="n"&gt;pb&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="n"&gt;pb&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;bayesian_prob_mc&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;observations&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="k"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;Bayesian MC p = &lt;/span&gt;&lt;span class="si"&gt;%.3f&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;&lt;/span&gt; &lt;span class="o"&gt;%&lt;/span&gt; &lt;span class="n"&gt;pb&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/pre&gt;&lt;/div&gt;


&lt;p&gt;Prints something like:&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;Observations:
 [[273 245]
 [242 240]]
z-test p      = 0.215
Bayesian p    = 0.785
Bayesian MC p = 0.785
&lt;/pre&gt;&lt;/div&gt;


&lt;p&gt;The results look good. Note that the z-test p-value $p_f$ and the Bayesian probability $p_b$ add to 1. What’s going on?&lt;/p&gt;
&lt;h2&gt;Bayesian Beta modeling vs the frequentist z-test&lt;/h2&gt;
&lt;p&gt;Let’s evaluate this by running 100 A/B tests, and plotting both $p_f$ and $p_b$:&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;run_simulations&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;funnels&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;N&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;num_simulations&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
    &lt;span class="n"&gt;results&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[]&lt;/span&gt;
    &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;_&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="nb"&gt;range&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;num_simulations&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
        &lt;span class="n"&gt;observations&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;simulate_abtest&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;funnels&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;N&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
        &lt;span class="n"&gt;pf&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;z_test&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;observations&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
        &lt;span class="n"&gt;pb&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;bayesian_prob&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;observations&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
        &lt;span class="n"&gt;conv_A&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;conversion&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;observations&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;])&lt;/span&gt;
        &lt;span class="n"&gt;conv_B&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;conversion&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;observations&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;])&lt;/span&gt;
        &lt;span class="n"&gt;results&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;append&lt;/span&gt;&lt;span class="p"&gt;((&lt;/span&gt;&lt;span class="n"&gt;conv_B&lt;/span&gt; &lt;span class="o"&gt;-&lt;/span&gt; &lt;span class="n"&gt;conv_A&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;pf&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;pb&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s1"&gt;&amp;#39;green&amp;#39;&lt;/span&gt; &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;conv_A&lt;/span&gt; &lt;span class="o"&gt;&amp;lt;=&lt;/span&gt; &lt;span class="n"&gt;conv_B&lt;/span&gt; &lt;span class="k"&gt;else&lt;/span&gt; &lt;span class="s1"&gt;&amp;#39;red&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;
    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;results&lt;/span&gt;

&lt;span class="n"&gt;funnels&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;
    &lt;span class="p"&gt;[[&lt;/span&gt;&lt;span class="mf"&gt;0.50&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mf"&gt;0.50&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt; &lt;span class="mf"&gt;0.5&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt; &lt;span class="c1"&gt;# the first vector is the actual outcomes,&lt;/span&gt;
    &lt;span class="p"&gt;[[&lt;/span&gt;&lt;span class="mf"&gt;0.50&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mf"&gt;0.50&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt; &lt;span class="mf"&gt;0.5&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt; &lt;span class="c1"&gt;# the second is the traffic split&lt;/span&gt;
&lt;span class="p"&gt;]&lt;/span&gt;
&lt;span class="n"&gt;N&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;1000&lt;/span&gt;
&lt;span class="n"&gt;num_simulations&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;100&lt;/span&gt;

&lt;span class="n"&gt;results&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;run_simulations&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;funnels&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;N&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;num_simulations&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="n"&gt;plt&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;figure&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;figsize&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;10&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;10&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;
&lt;span class="n"&gt;plt&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;xlabel&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;z-test p&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="n"&gt;plt&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;ylabel&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;Bayesian p&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="n"&gt;plt&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;scatter&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;x&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;take&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;results&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt; &lt;span class="n"&gt;y&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;take&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;results&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;2&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt; &lt;span class="n"&gt;c&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;take&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;results&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;3&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;
&lt;span class="n"&gt;plt&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;show&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
&lt;/pre&gt;&lt;/div&gt;


&lt;p&gt;&lt;img src="/images/bayes2.png" alt="p_f vs p_b" style="width: 400px;"/&gt;&lt;/p&gt;
&lt;p&gt;It seems that the two add up to 1. However, they don’t add up to 1 exactly; it’s only an approximation. Plotting $p_f + p_b$ directly (using the exact closed-form Bayesian evaluation, not the MC):&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;span class="n"&gt;results&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;run_simulations&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;funnels&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;N&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;num_simulations&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="n"&gt;plt&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;plot&lt;/span&gt;&lt;span class="p"&gt;([&lt;/span&gt;&lt;span class="n"&gt;x&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;&lt;span class="o"&gt;+&lt;/span&gt;&lt;span class="n"&gt;x&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="mi"&gt;2&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;x&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;results&lt;/span&gt;&lt;span class="p"&gt;])&lt;/span&gt;
&lt;/pre&gt;&lt;/div&gt;


&lt;p&gt;&lt;img src="/images/bayes15.png" alt="p_f vs p_b" style="width: 400px;"/&gt;&lt;/p&gt;
&lt;p&gt;What’s going on here? Every data scientist knows that, given a frequentist p-value $p_f$, it does not follow that the probability of the action hypothesis being true is $1-p_f$.&lt;/p&gt;
&lt;p&gt;Let’s repeat the above experiment, but with different sample sizes $N$, different conversions, and also look at cases where the action hypothesis is actually true (funnel B is in fact better). For easier readability, green dots mark cases where the outcome of the experiment was such that B’s conversion was better than A’s (irrespective of the true conversions), red dots the opposite. In all cases, 100 A/B tests are performed, and the results are plotted.&lt;/p&gt;
&lt;p&gt;$N=1000$, A and B are both 50%:&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;span class="n"&gt;funnels&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;
    &lt;span class="p"&gt;[[&lt;/span&gt;&lt;span class="mf"&gt;0.50&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mf"&gt;0.50&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt; &lt;span class="mf"&gt;0.5&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt; &lt;span class="c1"&gt;# the first vector is the actual outcomes,&lt;/span&gt;
    &lt;span class="p"&gt;[[&lt;/span&gt;&lt;span class="mf"&gt;0.50&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mf"&gt;0.50&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt; &lt;span class="mf"&gt;0.5&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt; &lt;span class="c1"&gt;# the second is the traffic split&lt;/span&gt;
&lt;span class="p"&gt;]&lt;/span&gt;
&lt;span class="n"&gt;N&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;1000&lt;/span&gt;
&lt;/pre&gt;&lt;/div&gt;


&lt;p&gt;&lt;img src="/images/bayes4.png" alt="p_f vs p_b" style="width: 600px;"/&gt;&lt;/p&gt;
&lt;p&gt;$N=1000$, B’s conversion is better than A’s (53% vs 50%):&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;span class="n"&gt;funnels&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;
    &lt;span class="p"&gt;[[&lt;/span&gt;&lt;span class="mf"&gt;0.50&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mf"&gt;0.50&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt; &lt;span class="mf"&gt;0.5&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt; &lt;span class="c1"&gt;# the first vector is the actual outcomes,&lt;/span&gt;
    &lt;span class="p"&gt;[[&lt;/span&gt;&lt;span class="mf"&gt;0.47&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mf"&gt;0.53&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt; &lt;span class="mf"&gt;0.5&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt; &lt;span class="c1"&gt;# the second is the traffic split&lt;/span&gt;
&lt;span class="p"&gt;]&lt;/span&gt;
&lt;span class="n"&gt;N&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;1000&lt;/span&gt;
&lt;/pre&gt;&lt;/div&gt;


&lt;p&gt;&lt;img src="/images/bayes5.png" alt="p_f vs p_b" style="width: 600px;"/&gt;&lt;/p&gt;
&lt;p&gt;In this case, we’re sampling the same curves, but in the region where B’s conversion is better than A’s (more green dots in the green section of the curve).&lt;/p&gt;
&lt;p&gt;$N=10,000$, A and B are both 1%:&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;span class="n"&gt;funnels&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;
    &lt;span class="p"&gt;[[&lt;/span&gt;&lt;span class="mf"&gt;0.99&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mf"&gt;0.01&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt; &lt;span class="mf"&gt;0.5&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt; &lt;span class="c1"&gt;# the first vector is the actual outcomes,&lt;/span&gt;
    &lt;span class="p"&gt;[[&lt;/span&gt;&lt;span class="mf"&gt;0.99&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mf"&gt;0.01&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt; &lt;span class="mf"&gt;0.5&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt; &lt;span class="c1"&gt;# the second is the traffic split&lt;/span&gt;
&lt;span class="p"&gt;]&lt;/span&gt;
&lt;span class="n"&gt;N&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;10&lt;/span&gt;&lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="mi"&gt;1000&lt;/span&gt;
&lt;/pre&gt;&lt;/div&gt;


&lt;p&gt;&lt;img src="/images/bayes6.png" alt="p_f vs p_b" style="width: 600px;"/&gt;&lt;/p&gt;
&lt;p&gt;The equation $p_f + p_b \simeq 1 $ seems to hold in all these cases!&lt;/p&gt;
&lt;h2&gt;Explanation&lt;/h2&gt;
&lt;p&gt;With the Z-test, we assume that the Central Limit Theorem (CLT) holds, and model each funnel’s conversion with a normal distribution centered on the measured conversion $\mu$, with variance equal to the squared standard error of the mean, $\sigma^2 = \mu (1 - \mu) / N$, where $N$ is the funnel’s sample size. The difference in conversion is then also normally distributed, with mean $\mu = \mu_B - \mu_A$ and variance $\sigma^2 = \sigma_A^2 + \sigma_B^2$. Then we assume the null hypothesis is true, and calculate the probability of getting results at least as extreme as observed (with respect to the conversion difference): we integrate the null-centered normal from the observed difference to $\infty$.&lt;/p&gt;
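&lt;p&gt;The procedure above fits in a few lines. This is a minimal sketch of what a &lt;code&gt;z_test()&lt;/code&gt; like the one used earlier may look like; the observation layout (one row of [non-conversions, conversions] per funnel) is read off the printed output above, and the one-sided direction is an assumption consistent with $p_f + p_b \simeq 1$:&lt;/p&gt;

```python
import numpy as np
from scipy.stats import norm

def z_test_sketch(observations):
    # observations[i] = [non_conversions, conversions] for funnel i
    n_a, n_b = sum(observations[0]), sum(observations[1])
    mu_a = observations[0][1] / n_a  # measured conversion of A
    mu_b = observations[1][1] / n_b  # measured conversion of B
    # variance of the difference is the sum of the two variances
    var = mu_a * (1 - mu_a) / n_a + mu_b * (1 - mu_b) / n_b
    # one-sided p: probability, under the null, of a conversion
    # difference at least as big as the one observed
    return 1.0 - norm.cdf(mu_b - mu_a, loc=0, scale=np.sqrt(var))

print('z-test p = %.3f' % z_test_sketch([[273, 245], [242, 240]]))  # prints: z-test p = 0.215
```

&lt;p&gt;On the observations printed earlier this reproduces $p_f = 0.215$.&lt;/p&gt;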
&lt;p&gt;With the Bayesian model, we model the actual conversion parameter with a Beta distribution. As long as the $Beta(\alpha, \beta)$ distribution and the $N(\mu, \sigma^2)$ distribution are close enough (with $\mu = \frac{ \alpha }{ \alpha + \beta } $), the two probabilities will be complementary, since in the Bayesian framework we’re computing the complementary probability. Let’s compare some Beta and normal distributions with different sample sizes and conversions to check how close these distributions are:&lt;/p&gt;
&lt;p&gt;&lt;img src="/images/bayes7.png" alt="Beta vs normal distributions" style="width: 600px;"/&gt;&lt;/p&gt;
&lt;p&gt;Thanks to the CLT, the only time the Beta pdf and the normal pdf are noticeably different is when the conversion probability is close to 0 or 1, and the sample size $N$ is low (top right case, above). We can visualize this directly:&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;span class="n"&gt;funnels&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;
    &lt;span class="p"&gt;[[&lt;/span&gt;&lt;span class="mf"&gt;0.90&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mf"&gt;0.10&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt; &lt;span class="mf"&gt;0.5&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt; &lt;span class="c1"&gt;# the first vector is the actual outcomes,&lt;/span&gt;
    &lt;span class="p"&gt;[[&lt;/span&gt;&lt;span class="mf"&gt;0.90&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mf"&gt;0.10&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt; &lt;span class="mf"&gt;0.5&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt; &lt;span class="c1"&gt;# the second is the traffic split&lt;/span&gt;
&lt;span class="p"&gt;]&lt;/span&gt;
&lt;span class="n"&gt;N&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;100&lt;/span&gt;
&lt;/pre&gt;&lt;/div&gt;


&lt;p&gt;&lt;img src="/images/bayes8.png" alt="p_f vs p_b" style="width: 600px;"/&gt;&lt;/p&gt;
&lt;p&gt;There is now a noticeable spread in the curves, and the “error” in the $p_f + p_b \simeq 1$ line goes as high as 0.04:&lt;/p&gt;
&lt;p&gt;&lt;img src="/images/bayes16.png" alt="p_f vs p_b" style="width: 400px;"/&gt;&lt;/p&gt;
&lt;p&gt;In earlier posts I showed that at moderate $N$ the &lt;a href="http://bytepawn.com/ab-testing-and-the-ttest.html"&gt;t-test and the z-test quickly become the same thing&lt;/a&gt;, so exchanging the z-test for the t-test doesn’t make a difference.&lt;/p&gt;
&lt;h2&gt;Bayesian Beta modeling vs the frequentist Fisher’s exact test&lt;/h2&gt;
&lt;p&gt;Let's do the same, but instead of using the Z-test, let's use Fisher's exact test (which doesn't have a normal distribution assumption) to get the frequentist $p_f$.&lt;/p&gt;
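&lt;p&gt;For the 2×2 case, the frequentist p can come from &lt;code&gt;scipy.stats.fisher_exact&lt;/code&gt;. A sketch; the one-sided &lt;code&gt;alternative&lt;/code&gt; choice below is an assumption, picked to match the one-sided z-test direction (“B converts better than A”):&lt;/p&gt;

```python
import numpy as np
from scipy.stats import fisher_exact

def fisher_test(observations):
    # observations[i] = [non_conversions, conversions] for funnel i;
    # with this layout, the table's odds ratio exceeds 1 exactly when
    # B's conversion is better than A's, hence alternative='greater'
    _, p = fisher_exact(np.array(observations), alternative='greater')
    return p
```

&lt;p&gt;On the observations from the first example, this gives a p close to the z-test’s 0.215.&lt;/p&gt;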
&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;span class="n"&gt;funnels&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;
    &lt;span class="p"&gt;[[&lt;/span&gt;&lt;span class="mf"&gt;0.50&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mf"&gt;0.50&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt; &lt;span class="mf"&gt;0.5&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt; &lt;span class="c1"&gt;# the first vector is the actual outcomes,&lt;/span&gt;
    &lt;span class="p"&gt;[[&lt;/span&gt;&lt;span class="mf"&gt;0.50&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mf"&gt;0.50&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt; &lt;span class="mf"&gt;0.5&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt; &lt;span class="c1"&gt;# the second is the traffic split&lt;/span&gt;
&lt;span class="p"&gt;]&lt;/span&gt;
&lt;span class="n"&gt;N&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;20&lt;/span&gt;
&lt;/pre&gt;&lt;/div&gt;


&lt;p&gt;&lt;img src="/images/bayes10.png" alt="p_f vs p_b" style="width: 600px;"/&gt;&lt;/p&gt;
&lt;p&gt;... and the same at $N=1000$:&lt;/p&gt;
&lt;p&gt;&lt;img src="/images/bayes11.png" alt="p_f vs p_b" style="width: 600px;"/&gt;&lt;/p&gt;
&lt;p&gt;At high $N$, $p_f + p_b \simeq 1$ also approximately holds, since here all these frequentist tests become the Z-test.&lt;/p&gt;
&lt;p&gt;Question: why the concave relationship at low $N$?&lt;/p&gt;
&lt;h2&gt;Bayesian modeling with normals, priors&lt;/h2&gt;
&lt;p&gt;There are no set rules for how to perform Bayesian modeling; it is the modeler's choice. It is up to us what kind of distributions we use to model the conversion parameter of our funnels. For example, another popular choice (besides the Beta) is the normal distribution. If we did that, with the parameters chosen as described above for the z-test, we would get exactly complementary probabilities.&lt;/p&gt;
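&lt;p&gt;A sketch of that variant: modeling both conversions with normals (same parameters as the z-test) and computing $P(\mu_B &amp;gt; \mu_A)$ gives exactly the z-test’s complement. The observation layout is assumed as before:&lt;/p&gt;

```python
import numpy as np
from scipy.stats import norm

def bayesian_prob_normal(observations):
    # model each funnel's conversion as a normal centered on the measured
    # conversion with variance mu*(1-mu)/N, then compute P(conv_B > conv_A)
    n_a, n_b = sum(observations[0]), sum(observations[1])
    mu_a, mu_b = observations[0][1] / n_a, observations[1][1] / n_b
    var = mu_a * (1 - mu_a) / n_a + mu_b * (1 - mu_b) / n_b
    # the difference of the two normals is normal; P(difference > 0)
    return norm.cdf(mu_b - mu_a, loc=0, scale=np.sqrt(var))

print('%.3f' % bayesian_prob_normal([[273, 245], [242, 240]]))  # prints: 0.785
```

&lt;p&gt;On the observations from the first example this is exactly $1 - p_f = 0.785$.&lt;/p&gt;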
&lt;p&gt;Another choice we have in Bayesian modeling is the prior. The prior is some up-front belief we have about the distribution of the conversions, which we then update based on the outcome of the experiment to get the posterior distribution. Two popular choices are $Beta(1, 1)$, which happens to be the uniform distribution, and $Beta(0.5, 0.5)$, called the Jeffreys prior:&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;span class="c1"&gt;# common priors&lt;/span&gt;
&lt;span class="n"&gt;plt&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;figure&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;figsize&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;10&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="mi"&gt;5&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;
&lt;span class="n"&gt;x&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;np&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;linspace&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mf"&gt;0.01&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mf"&gt;0.99&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;1000&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;  
&lt;span class="n"&gt;plt&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;plot&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;x&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;beta&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;pdf&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;x&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;
&lt;span class="n"&gt;plt&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;plot&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;x&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;beta&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;pdf&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;x&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mf"&gt;0.5&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mf"&gt;0.5&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;
&lt;span class="n"&gt;plt&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;legend&lt;/span&gt;&lt;span class="p"&gt;([&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;Beta(1, 1)&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s1"&gt;&amp;#39;Beta(0.5, 0.5)&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;])&lt;/span&gt;
&lt;/pre&gt;&lt;/div&gt;


&lt;p&gt;&lt;img src="/images/bayes12.png" alt="p_f vs p_b" style="width: 400px;"/&gt;&lt;/p&gt;
&lt;p&gt;When we have a prior belief expressed as a Beta distribution $Beta(\alpha_0, \beta_0)$, after we run the A/B test, which yields $\alpha$ conversions and $\beta$ non-conversion events, the posterior will be the Beta distribution $Beta(\alpha_0+\alpha, \beta_0+\beta)$. As you can imagine, at reasonable sample sizes such as $N&amp;gt;100$, Beta priors with relatively low parameters don’t matter much; this is called “washing out the prior with observations”.&lt;/p&gt;
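&lt;p&gt;The update rule is a one-liner, and the wash-out is easy to check numerically (a sketch, with hypothetical counts):&lt;/p&gt;

```python
def posterior(prior, conversions, non_conversions):
    # Beta(a0, b0) prior plus observed counts gives
    # Beta(a0 + conversions, b0 + non_conversions)
    a0, b0 = prior
    return (a0 + conversions, b0 + non_conversions)

def beta_mean(a, b):
    return a / (a + b)

# 450 conversions out of N=1000: both weak priors land on ~0.45
print(beta_mean(*posterior((1, 1), 450, 550)))      # uniform prior
print(beta_mean(*posterior((0.5, 0.5), 450, 550)))  # Jeffreys prior
```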
&lt;p&gt;However, note that any sort of prior can be chosen, including a very strong one that doesn’t wash out with $N=1000$ samples, like $Beta(1M, 1M)$; this says the modeler has a very strong prior belief that the conversion is 50%, and needs to see millions of observations to be convinced otherwise; getting 10 conversions out of 100 observations will not convince her, since $Beta(1M+10, 1M+90) \simeq Beta(1M, 1M)$ still peaks around 0.5 (notice the x-axis):&lt;/p&gt;
&lt;p&gt;&lt;img src="/images/bayes14.png" alt="Strong prior" style="width: 600px;"/&gt;&lt;/p&gt;
&lt;p&gt;This is a modeling decision. Obviously, with strong priors like this, $p_f + p_b \simeq 1$ will not hold (since $p_b$ will be frozen until a lot of observations are collected). Also, the modeler is free to choose any distribution family for the modeling, not just the Beta.&lt;/p&gt;
&lt;h2&gt;Conclusion&lt;/h2&gt;
&lt;p&gt;So in general, the $p_f + p_b \simeq 1$ approximation is not true, it only happens to roughly hold when:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;using Z-tests on the frequentist side (or other tests whose p value becomes the Z-test's at large $N$), and&lt;/li&gt;
&lt;li&gt;using Beta distributions (or other distributions that become roughly normal at large $N$) for Bayesian modeling, and&lt;/li&gt;
&lt;li&gt;using a weak prior that is washed out &lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;Turning it around: Bayesian inference, when applied to A/B testing using Beta distributions (or other distributions that become normal at large $N$) and weak priors, at reasonable sample sizes, yields roughly complementary probabilities to frequentist tests such as the Z-test (or other tests whose p value becomes the Z-test's at large $N$): $p_f + p_b \simeq 1$. At the end of the day, in conversion A/B testing, in the absence of strong prior beliefs, at reasonable sample sizes we end up putting roughly gaussian shaped functions around measured averages, so different statistical procedures yield roughly the same (complementary) probabilities and decisions.&lt;/p&gt;</content><category term="bayesian"></category><category term="ab-testing"></category></entry><entry><title>A/B testing and the G-test</title><link href="https://bytepawn.com/ab-testing-and-the-gtest.html" rel="alternate"></link><published>2020-03-23T00:00:00+01:00</published><updated>2020-03-23T00:00:00+01:00</updated><author><name>Marton Trencseni</name></author><id>tag:bytepawn.com,2020-03-23:/ab-testing-and-the-gtest.html</id><summary type="html">&lt;p&gt;The G-test for conversion A/B tests is similar to the Chi-squared test. Monte-Carlo simulations show that the two are indistinguishable in practice.&lt;br/&gt;&lt;br/&gt; &lt;img src="/images/gtest11.png" alt="G-test vs Chi-squared p differences" style="width: 400px;"/&gt;&lt;/p&gt;</summary><content type="html">&lt;h2&gt;Introduction&lt;/h2&gt;
&lt;p&gt;In previous posts I discussed the &lt;a href="https://en.wikipedia.org/wiki/Chi-squared_test"&gt;$\chi^2$ test&lt;/a&gt; and &lt;a href="https://en.wikipedia.org/wiki/Fisher%27s_exact_test"&gt;Fisher's exact test&lt;/a&gt;:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;&lt;a href="http://bytepawn.com/ab-testing-and-the-chi-squared-test.html"&gt;A/B testing and the Chi-squared test&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="http://bytepawn.com/ab-testing-and-fishers-exact-test.html"&gt;A/B testing and Fisher's exact test&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;Both are tests for conversion A/B testing; both can be used to test multiple funnels (A/B/C/...) with multiple outcomes (No conversion, Monthly, Annual). At low $N$, Fisher’s exact test gives accurate results, while at high $N$, the difference in $p$ values goes to zero.&lt;/p&gt;
&lt;p&gt;The G-test is a close relative to the $\chi^2$ test, in fact the $\chi^2$ test is an approximation of the G-test. The &lt;a href="https://en.wikipedia.org/wiki/G-test"&gt;Wikipedia page for G-test&lt;/a&gt;:&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;The commonly used $\chi^2$ tests for goodness of fit to a distribution and for independence in contingency tables are in fact approximations of the log-likelihood ratio on which the G-tests are based. The general formula for Pearson's $\chi^2$ test statistic is $ \chi^2 = \sum_i { \frac{ (O_i - E_{i} )^2 }{ E_i } } $. The approximation of G by $\chi^2$ is obtained by a second order Taylor expansion of the natural logarithm around 1. The general formula for G is $ G = 2 \sum_i { O_i \cdot \ln \frac{O_i}{E_i} } $, where $O_i$ is the observed count in a cell, $E_i$ is the expected count under the null hypothesis, $\ln$ denotes the natural logarithm, and the sum is taken over all non-empty cells.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;&lt;a href="https://github.com/mtrencseni/playground/blob/master/AB%20testing%20and%20the%20G%20test.ipynb"&gt;The code for this post is on Github.&lt;/a&gt;&lt;/p&gt;
&lt;p&gt;Let's reuse the contingency table example from the &lt;a href="http://bytepawn.com/ab-testing-and-the-chi-squared-test.html"&gt;$\chi^2$ post&lt;/a&gt;:&lt;/p&gt;
&lt;p&gt;&lt;img src="/images/contingency_table3.PNG" alt="Contingency table" style="width: 600px;"/&gt;&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;For each observation cell, we calculate the expected value. Expected here means according to the null hypothesis, which is that all funnels are the same. Our best guess for the null hypothesis is the blended bottom numbers: $7922/10000$ for &lt;strong&gt;No Conversion&lt;/strong&gt;, $1085/10000$ for &lt;strong&gt;Monthly&lt;/strong&gt;, etc. So for Funnel A, which has 5916 samples, our expected &lt;strong&gt;No Conversion&lt;/strong&gt; number is $5916*7922/10000=4686.6$. We do this for each cell. Then we subtract the expected from the observed, square it, and divide by the expected, like $(4748-4686.6)^2/4686.6=0.8$. We do this for each cell, and sum up the numbers to get the $\chi^2$ test statistic. We then look this up in a &lt;a href="https://en.wikipedia.org/wiki/Chi-squared_distribution"&gt;$\chi^2$ distribution&lt;/a&gt; table to get a p value. We have to use a degree of freedom of $k=(F-1)(C-1)$, where $F$ is the number of funnels, $C$ is the number of conversion events; $F=4, C=3$ above.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;For the G-test, we just have to change the inner formula. E.g. for Funnel A's &lt;strong&gt;No Conversion&lt;/strong&gt; case, instead of $(4748-4686.6)^2/4686.6$, we calculate $4748 \cdot \ln \frac{4748}{4686.6} $. Other than that, it's the same: add up the numbers for all cells, multiply the sum by 2 to get the G test statistic, and look it up in a $\chi^2$ distribution table to get the p value.&lt;/p&gt;
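&lt;p&gt;The two per-cell formulas, side by side, on the example cell from the contingency table above (a sketch):&lt;/p&gt;

```python
import math

def chisq_cell(observed, expected):
    # Pearson's chi-squared contribution of a single cell
    return (observed - expected) ** 2 / expected

def g_cell(observed, expected):
    # G-test contribution of a single cell (the sum over cells
    # then gets multiplied by 2, per the formula quoted above)
    return observed * math.log(observed / expected)

print(chisq_cell(4748, 4686.6))  # ~0.80, as in the quoted example
print(g_cell(4748, 4686.6))
```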
&lt;p&gt;Because the two tests are so similar, we can write a generalized test function with a pluggable cell formula:&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;generalized_contingency_independence&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;observations&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;cell_fn&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
    &lt;span class="n"&gt;row_marginals&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;np&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;sum&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;observations&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;axis&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="n"&gt;col_marginals&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;np&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;sum&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;observations&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;axis&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="n"&gt;N&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;np&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;sum&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;observations&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="n"&gt;chisq&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;
    &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;i&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="nb"&gt;range&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nb"&gt;len&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;row_marginals&lt;/span&gt;&lt;span class="p"&gt;)):&lt;/span&gt;
        &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;j&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="nb"&gt;range&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nb"&gt;len&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;col_marginals&lt;/span&gt;&lt;span class="p"&gt;)):&lt;/span&gt;
            &lt;span class="n"&gt;expected&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;row_marginals&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;i&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt; &lt;span class="n"&gt;col_marginals&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;j&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="o"&gt;/&lt;/span&gt; &lt;span class="n"&gt;N&lt;/span&gt;
            &lt;span class="n"&gt;chisq&lt;/span&gt; &lt;span class="o"&gt;+=&lt;/span&gt; &lt;span class="n"&gt;cell_fn&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;observations&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;i&lt;/span&gt;&lt;span class="p"&gt;][&lt;/span&gt;&lt;span class="n"&gt;j&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt; &lt;span class="n"&gt;expected&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="n"&gt;dof&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nb"&gt;len&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;row_marginals&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;-&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nb"&gt;len&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;col_marginals&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;-&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="n"&gt;p_value&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mf"&gt;1.0&lt;/span&gt; &lt;span class="o"&gt;-&lt;/span&gt; &lt;span class="n"&gt;chi2&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;dof&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;cdf&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;chisq&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;chisq&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;p_value&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;generalized_chi_squared&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;observations&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;generalized_contingency_independence&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;observations&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="k"&gt;lambda&lt;/span&gt; &lt;span class="n"&gt;obs&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;exp&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;obs&lt;/span&gt; &lt;span class="o"&gt;-&lt;/span&gt; &lt;span class="n"&gt;exp&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;&lt;span class="o"&gt;**&lt;/span&gt;&lt;span class="mi"&gt;2&lt;/span&gt; &lt;span class="o"&gt;/&lt;/span&gt; &lt;span class="n"&gt;exp&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;generalized_G&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;observations&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;generalized_contingency_independence&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;observations&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="k"&gt;lambda&lt;/span&gt; &lt;span class="n"&gt;obs&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;exp&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="mi"&gt;2&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt; &lt;span class="n"&gt;obs&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt; &lt;span class="n"&gt;np&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;log&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;obs&lt;/span&gt; &lt;span class="o"&gt;/&lt;/span&gt; &lt;span class="n"&gt;exp&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;
&lt;/pre&gt;&lt;/div&gt;


&lt;p&gt;Let's simulate an A/B test where both funnels are the same (null hypothesis is true) and see the difference between the two tests:&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;funnels = [
    [[0.50, 0.50], 0.5], # the first vector is the actual outcomes,
    [[0.50, 0.50], 0.5], # the second is the traffic split
]
N = 1000

observations = simulate_abtest(funnels, N)
print(observations)
c_our = generalized_chi_squared(observations)
g_our = generalized_G(observations)
print(&amp;#39;Chi-squared test statistic = %.3f&amp;#39; % c_our[0])
print(&amp;#39;G test statistic = %.3f&amp;#39; % g_our[0])
print(&amp;#39;Chi-squared p = %.6f&amp;#39; % c_our[1])
print(&amp;#39;G p = %.6f&amp;#39; % g_our[1])
&lt;/pre&gt;&lt;/div&gt;


&lt;p&gt;Prints something like:&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;span class="k"&gt;[[247. 278.]&lt;/span&gt;
 &lt;span class="k"&gt;[234. 241.]]&lt;/span&gt;
&lt;span class="na"&gt;Chi-squared test statistic&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="s"&gt;0.490&lt;/span&gt;
&lt;span class="na"&gt;G test statistic&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="s"&gt;0.490&lt;/span&gt;
&lt;span class="na"&gt;Chi-squared p&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="s"&gt;0.483775&lt;/span&gt;
&lt;span class="na"&gt;G p&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="s"&gt;0.483769&lt;/span&gt;
&lt;/pre&gt;&lt;/div&gt;
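&lt;p&gt;As a sanity check, the same table can be fed to scipy's built-in &lt;code&gt;chi2_contingency&lt;/code&gt;, which also implements the G-test via its &lt;code&gt;lambda_&lt;/code&gt; (power divergence) parameter. A minimal sketch, assuming scipy is available:&lt;/p&gt;

```python
import numpy as np
from scipy.stats import chi2_contingency

# the observed 2x2 table printed above
observations = np.array([[247., 278.], [234., 241.]])

# Pearson's chi-squared; correction=False matches the manual computation above
chisq, p, dof, expected = chi2_contingency(observations, correction=False)
# G-test, selected via the power-divergence lambda_ parameter
g, p_g, _, _ = chi2_contingency(observations, correction=False, lambda_='log-likelihood')

# both statistics come out around 0.490, matching the hand-rolled functions
print('Chi-squared = %.3f, p = %.6f' % (chisq, p))
print('G = %.3f, p = %.6f' % (g, p_g))
```

&lt;p&gt;The values agree with the hand-rolled functions to within floating point.&lt;/p&gt;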


&lt;p&gt;We can see that the results are very close. Let's look at the $p$ value difference as a function of the sample size $N$, for different lifts, in a 2x2 contingency table:&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;base_conversion = 0.5
traffic_split = 0.5

results = {}
for actual_lift in range(0, 10, 1):
    actual_lift /= 100.0
    results[actual_lift] = []
    for N in range(100, 10*1000, 50):
        observations = [
            [int(base_conversion * traffic_split * N), int((1-base_conversion) * traffic_split * N)],
            [int((base_conversion+actual_lift) * (1-traffic_split) * N), int((1-(base_conversion+actual_lift)) * (1-traffic_split) * N)],
        ]
        p_chi2 = generalized_chi_squared(observations)[1]
        p_g = generalized_G(observations)[1]
        p_diff = abs(p_chi2 - p_g)
        results[actual_lift].append((N, p_diff))

plt.figure(figsize=(10,5))
plt.xlabel(&amp;#39;sample size&amp;#39;)
plt.ylabel(&amp;#39;chi^2 vs G-test p difference&amp;#39;)
for actual_lift in results.keys():
    plt.plot([x[0] for x in results[actual_lift]], savgol_filter([x[1] for x in results[actual_lift]], 67, 3))
plt.legend([&amp;#39;actual_lift = %d%%&amp;#39; % (100*actual_lift) for actual_lift in results.keys()], loc=&amp;#39;upper right&amp;#39;)
plt.show()
&lt;/pre&gt;&lt;/div&gt;


&lt;p&gt;Shows:&lt;/p&gt;
&lt;p&gt;&lt;img src="/images/gtest11.png" alt="G-test vs Chi-squared p differences" style="width: 600px;"/&gt;&lt;/p&gt;
&lt;p&gt;The differences in $p$ values are tiny, in the fourth decimal. In practice, this means the tests are interchangeable, as they yield numerically the same results (similar to how the &lt;a href="http://bytepawn.com/ab-testing-and-the-ttest.html"&gt;t-test and z-test yield the same value numerically&lt;/a&gt;). This is in line with the Wikipedia page:&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;For samples of a reasonable size, the G-test and the chi-squared test will lead to the same conclusions. However, the approximation to the theoretical chi-squared distribution for the G-test is better than for the Pearson's chi-squared test.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;Let's see how significant the last sentence is. What it is saying is that if we (i) assume the null hypothesis is true (both funnels are the same), (ii) perform the tests many times and calculate the $\chi^2$ and G test statistics, and (iii) compute the histogram of test statistics, then (iv) the histogram of G test statistics should be a better fit to the theoretical $\chi^2$ distribution than the histogram of $\chi^2$ test statistics. This goodness of fit difference is something we can evaluate with a Monte Carlo (MC) simulation.&lt;/p&gt;
&lt;p&gt;The simplest way computationally is to compute the 10th, 20th ... 90th percentiles of the MC test statistics, and compare that to the inverse &lt;a href="https://en.wikipedia.org/wiki/Cumulative_distribution_function"&gt;cumulative distribution function&lt;/a&gt; (cdf) of the theoretical $\chi^2$ distribution taken at those percentiles. Let's run an $N=30$ A/B test 1000 times and compare the distributions:&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;span class="n"&gt;funnels&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;
    &lt;span class="p"&gt;[[&lt;/span&gt;&lt;span class="mf"&gt;0.50&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mf"&gt;0.50&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt; &lt;span class="mf"&gt;0.5&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt; &lt;span class="c1"&gt;# the first vector is the actual outcomes,&lt;/span&gt;
    &lt;span class="p"&gt;[[&lt;/span&gt;&lt;span class="mf"&gt;0.50&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mf"&gt;0.50&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt; &lt;span class="mf"&gt;0.5&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt; &lt;span class="c1"&gt;# the second is the traffic split&lt;/span&gt;
&lt;span class="p"&gt;]&lt;/span&gt;
&lt;span class="n"&gt;df&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt;
&lt;span class="n"&gt;N&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;30&lt;/span&gt;
&lt;span class="n"&gt;num_simulations&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;1000&lt;/span&gt;

&lt;span class="n"&gt;cs&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[]&lt;/span&gt;
&lt;span class="n"&gt;gs&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[]&lt;/span&gt;
&lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;_&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="nb"&gt;range&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;num_simulations&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
    &lt;span class="n"&gt;observations&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;simulate_abtest&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;funnels&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;N&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="n"&gt;cs&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;append&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;generalized_chi_squared&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;observations&lt;/span&gt;&lt;span class="p"&gt;)[&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;])&lt;/span&gt;
    &lt;span class="n"&gt;gs&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;append&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;generalized_G&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;observations&lt;/span&gt;&lt;span class="p"&gt;)[&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;])&lt;/span&gt;

&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;percentiles&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;arr&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;ps&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
    &lt;span class="n"&gt;arr&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nb"&gt;sorted&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;arr&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;arr&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="nb"&gt;int&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;x&lt;/span&gt;&lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="nb"&gt;len&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;arr&lt;/span&gt;&lt;span class="p"&gt;))]&lt;/span&gt; &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;x&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;ps&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;

&lt;span class="n"&gt;ps&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;x&lt;/span&gt;&lt;span class="o"&gt;/&lt;/span&gt;&lt;span class="mf"&gt;100.0&lt;/span&gt; &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;x&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="nb"&gt;range&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;100&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;5&lt;/span&gt;&lt;span class="p"&gt;)]&lt;/span&gt;
&lt;span class="n"&gt;expected_percentiles&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;chi2&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;ppf&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;x&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;df&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;x&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;ps&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;
&lt;span class="n"&gt;c_percentiles&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;percentiles&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;cs&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;ps&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="n"&gt;g_percentiles&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;percentiles&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;gs&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;ps&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="n"&gt;plt&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;figure&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;figsize&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;10&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="mi"&gt;5&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;
&lt;span class="n"&gt;plt&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;xlabel&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;percentile&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="n"&gt;plt&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;xlabel&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;inverse cdf&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="n"&gt;plt&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;plot&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;ps&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;chi2&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;ppf&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;ps&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;df&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt; &lt;span class="n"&gt;linestyle&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;--&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="n"&gt;plt&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;plot&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;ps&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;c_percentiles&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;marker&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;o&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="n"&gt;plt&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;plot&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;ps&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;g_percentiles&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;marker&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;x&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="n"&gt;plt&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;legend&lt;/span&gt;&lt;span class="p"&gt;([&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;theoretical chi^2&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s1"&gt;&amp;#39;chi^2 test&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s1"&gt;&amp;#39;G-test&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt; &lt;span class="n"&gt;loc&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;upper left&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="n"&gt;plt&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;show&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
&lt;/pre&gt;&lt;/div&gt;


&lt;p&gt;Shows:&lt;/p&gt;
&lt;p&gt;&lt;img src="/images/gtest2.png" alt="Inverse cdf" style="width: 600px;"/&gt;&lt;/p&gt;
&lt;p&gt;The differences are small, but still visible. However, such small sample sizes are unrealistic for A/B testing, and at low sample sizes we should be using Fisher's exact test anyway. Repeating the same at $N=1,000$:&lt;/p&gt;
&lt;p&gt;&lt;img src="/images/gtest3.png" alt="Inverse cdf" style="width: 600px;"/&gt;&lt;/p&gt;
&lt;p&gt;Already at $N=1,000$ we cannot see a difference between the $\chi^2$ and the G-test in terms of their test statistic distribution.&lt;/p&gt;
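&lt;p&gt;As an aside, for the small-sample regime mentioned above, scipy provides Fisher's exact test for 2x2 tables. A minimal sketch, with a hypothetical small table (not data from the simulations here):&lt;/p&gt;

```python
from scipy.stats import fisher_exact

# a hypothetical small 2x2 contingency table, e.g. from an N=30 A/B test
observations = [[9, 6], [7, 8]]
odds_ratio, p_value = fisher_exact(observations)
print('Fisher exact p = %.4f' % p_value)
```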
&lt;p&gt;Let's do this at scale: at different $N$s, let's run &lt;code&gt;num_simulations&lt;/code&gt; A/B tests, compute the histogram for both tests, compare to the theoretical $\chi^2$ distribution, and count whether the G or the $\chi^2$ test statistic (distribution) is the better fit:&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;percentile_diff&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;arr&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;percentiles&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;expected_percentiles&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
    &lt;span class="n"&gt;arr&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nb"&gt;sorted&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;arr&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="n"&gt;a_percentiles&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;arr&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="nb"&gt;int&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;x&lt;/span&gt;&lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="nb"&gt;len&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;arr&lt;/span&gt;&lt;span class="p"&gt;))]&lt;/span&gt; &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;x&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;percentiles&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;
    &lt;span class="n"&gt;a_diff&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nb"&gt;sum&lt;/span&gt;&lt;span class="p"&gt;([&lt;/span&gt;&lt;span class="nb"&gt;abs&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;a&lt;/span&gt; &lt;span class="o"&gt;-&lt;/span&gt; &lt;span class="n"&gt;e&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;a&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;e&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="nb"&gt;zip&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;a_percentiles&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;expected_percentiles&lt;/span&gt;&lt;span class="p"&gt;)])&lt;/span&gt;&lt;span class="o"&gt;/&lt;/span&gt;&lt;span class="nb"&gt;len&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;expected_percentiles&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;a_diff&lt;/span&gt;

&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;mp_fit_chi2&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;args&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
    &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;num_fits&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;num_simulations&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;funnels&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;N&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;df&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;args&lt;/span&gt;
    &lt;span class="n"&gt;percentiles&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;x&lt;/span&gt;&lt;span class="o"&gt;/&lt;/span&gt;&lt;span class="mf"&gt;10.0&lt;/span&gt; &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;x&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="nb"&gt;range&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;10&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;)]&lt;/span&gt;
    &lt;span class="n"&gt;expected_percentiles&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;chi2&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;ppf&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;x&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;df&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;x&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;percentiles&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;
    &lt;span class="n"&gt;c_diffs&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[]&lt;/span&gt;
    &lt;span class="n"&gt;g_diffs&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[]&lt;/span&gt;
    &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;i&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="nb"&gt;range&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;num_fits&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
        &lt;span class="n"&gt;cs&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[]&lt;/span&gt;
        &lt;span class="n"&gt;gs&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[]&lt;/span&gt;
        &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;_&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="nb"&gt;range&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;num_simulations&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
            &lt;span class="n"&gt;observations&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;simulate_abtest&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;funnels&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;N&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
            &lt;span class="n"&gt;cs&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;append&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;generalized_chi_squared&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;observations&lt;/span&gt;&lt;span class="p"&gt;)[&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;])&lt;/span&gt;
            &lt;span class="n"&gt;gs&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;append&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;generalized_G&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;observations&lt;/span&gt;&lt;span class="p"&gt;)[&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;])&lt;/span&gt;
        &lt;span class="n"&gt;c_diffs&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;append&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;percentile_diff&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;cs&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;percentiles&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;expected_percentiles&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;
        &lt;span class="n"&gt;g_diffs&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;append&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;percentile_diff&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;gs&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;percentiles&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;expected_percentiles&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;
    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;c_diffs&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;g_diffs&lt;/span&gt;

&lt;span class="n"&gt;funnels&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;
    &lt;span class="p"&gt;[[&lt;/span&gt;&lt;span class="mf"&gt;0.50&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mf"&gt;0.50&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt; &lt;span class="mf"&gt;0.5&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt; &lt;span class="c1"&gt;# the first vector is the actual outcomes,&lt;/span&gt;
    &lt;span class="p"&gt;[[&lt;/span&gt;&lt;span class="mf"&gt;0.50&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mf"&gt;0.50&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt; &lt;span class="mf"&gt;0.5&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt; &lt;span class="c1"&gt;# the second is the traffic split&lt;/span&gt;
&lt;span class="p"&gt;]&lt;/span&gt;
&lt;span class="n"&gt;df&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt;
&lt;span class="n"&gt;N&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;100&lt;/span&gt;
&lt;span class="n"&gt;num_simulations&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;1000&lt;/span&gt;
&lt;span class="n"&gt;num_fits_per_thread&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;500&lt;/span&gt;
&lt;span class="n"&gt;num_threads&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;24&lt;/span&gt;

&lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="vm"&gt;__name__&lt;/span&gt; &lt;span class="o"&gt;==&lt;/span&gt;  &lt;span class="s1"&gt;&amp;#39;__main__&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
    &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;N&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="mi"&gt;50&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;100&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;1000&lt;/span&gt;&lt;span class="p"&gt;]:&lt;/span&gt;
        &lt;span class="k"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;Running simulations for with (N, num_simulations, num_fits_per_thread) = (&lt;/span&gt;&lt;span class="si"&gt;%s&lt;/span&gt;&lt;span class="s1"&gt;, &lt;/span&gt;&lt;span class="si"&gt;%s&lt;/span&gt;&lt;span class="s1"&gt;, &lt;/span&gt;&lt;span class="si"&gt;%s&lt;/span&gt;&lt;span class="s1"&gt;) on &lt;/span&gt;&lt;span class="si"&gt;%s&lt;/span&gt;&lt;span class="s1"&gt; threads&amp;#39;&lt;/span&gt; &lt;span class="o"&gt;%&lt;/span&gt; &lt;span class="p"&gt;((&lt;/span&gt;&lt;span class="n"&gt;N&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;num_simulations&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;num_fits_per_thread&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;num_threads&lt;/span&gt;&lt;span class="p"&gt;)))&lt;/span&gt;
        &lt;span class="n"&gt;start_time&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;time&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;time&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
        &lt;span class="n"&gt;results&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;mp_map&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
            &lt;span class="n"&gt;gtest_worker&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;mp_fit_chi2&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
            &lt;span class="p"&gt;[(&lt;/span&gt;&lt;span class="n"&gt;num_fits_per_thread&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;num_simulations&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;funnels&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;N&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;df&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;_&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="nb"&gt;range&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;num_threads&lt;/span&gt;&lt;span class="p"&gt;)],&lt;/span&gt;
            &lt;span class="n"&gt;num_threads&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
        &lt;span class="n"&gt;elapsed_time&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;time&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;time&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt; &lt;span class="o"&gt;-&lt;/span&gt; &lt;span class="n"&gt;start_time&lt;/span&gt;
        &lt;span class="k"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;Done! Elapsed &lt;/span&gt;&lt;span class="si"&gt;%s&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;&lt;/span&gt; &lt;span class="o"&gt;%&lt;/span&gt; &lt;span class="n"&gt;time&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;strftime&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s2"&gt;&amp;quot;%M:%S&amp;quot;&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;time&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;gmtime&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;elapsed_time&lt;/span&gt;&lt;span class="p"&gt;)))&lt;/span&gt;
        &lt;span class="n"&gt;c_diffs&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;flatten&lt;/span&gt;&lt;span class="p"&gt;([&lt;/span&gt;&lt;span class="n"&gt;r&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;r&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;results&lt;/span&gt;&lt;span class="p"&gt;])&lt;/span&gt;
        &lt;span class="n"&gt;g_diffs&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;flatten&lt;/span&gt;&lt;span class="p"&gt;([&lt;/span&gt;&lt;span class="n"&gt;r&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;r&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;results&lt;/span&gt;&lt;span class="p"&gt;])&lt;/span&gt;
        &lt;span class="n"&gt;g_better_ratio&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nb"&gt;sum&lt;/span&gt;&lt;span class="p"&gt;([(&lt;/span&gt;&lt;span class="n"&gt;g_diff&lt;/span&gt; &lt;span class="o"&gt;&amp;lt;&lt;/span&gt; &lt;span class="n"&gt;c_diff&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;c_diff&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;g_diff&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="nb"&gt;zip&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;c_diffs&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;g_diffs&lt;/span&gt;&lt;span class="p"&gt;)])&lt;/span&gt;&lt;span class="o"&gt;/&lt;/span&gt;&lt;span class="nb"&gt;len&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;c_diffs&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
        &lt;span class="k"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;The G test statistic better approximates the theoretical Chi^2 distribution &lt;/span&gt;&lt;span class="si"&gt;%.4f&lt;/span&gt;&lt;span class="s1"&gt; fraction of times out of &lt;/span&gt;&lt;span class="si"&gt;%s&lt;/span&gt;&lt;span class="s1"&gt; fits&amp;#39;&lt;/span&gt; &lt;span class="o"&gt;%&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;g_better_ratio&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nb"&gt;len&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;c_diffs&lt;/span&gt;&lt;span class="p"&gt;)))&lt;/span&gt;
        &lt;span class="k"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;-&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/pre&gt;&lt;/div&gt;


&lt;p&gt;Prints:&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;Running simulations for with (N, num_simulations, num_fits_per_thread) = (50, 1000, 500) on 24 threads
Done! Elapsed 29:45
The G test statistic better approximates the theoretical Chi^2 distribution 0.14 fraction of times out of 12000 fits
-
Running simulations for with (N, num_simulations, num_fits_per_thread) = (100, 1000, 500) on 24 threads
Done! Elapsed 46:45
The G test statistic better approximates the theoretical Chi^2 distribution 0.56 fraction of times out of 12000 fits
-
Running simulations for with (N, num_simulations, num_fits_per_thread) = (1000, 1000, 500) on 24 threads
Done! Elapsed 52:06
The G test statistic better approximates the theoretical Chi^2 distribution 0.48 fraction of times out of 12000 fits
&lt;/pre&gt;&lt;/div&gt;


&lt;p&gt;The results are interesting. At $N=50$, the $\chi^2$ is actually a better fit (it wins 86% of the time). At higher $N$, the two are roughly the same.&lt;/p&gt;
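<p>As a side note, scipy exposes the G-test through the <code>lambda_</code> parameter of <code>chi2_contingency</code>. A minimal sketch comparing the two statistics on a 2x2 funnel (the counts below are made up for illustration):</p>

```python
import numpy as np
from scipy.stats import chi2_contingency

# hypothetical A/B funnel counts: [converted, not converted] per variant
observed = np.array([[100, 900],
                     [120, 880]])

# the default lambda_ computes the Pearson chi^2 statistic
chi2_stat, chi2_p, _, _ = chi2_contingency(observed, correction=False)
# lambda_="log-likelihood" computes the G test statistic instead
g_stat, g_p, _, _ = chi2_contingency(observed, correction=False, lambda_="log-likelihood")

print(chi2_stat, g_stat)  # the two statistics are numerically very close
```

<p>On data like this, the two statistics (and hence the p-values) agree to a couple of decimal places, consistent with the simulation results above.</p>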
&lt;h2&gt;Conclusion&lt;/h2&gt;
&lt;p&gt;The G-test numerically yields the same results as the $\chi^2$ test, so in practice it doesn't matter which one we pick for A/B tests.&lt;/p&gt;</content><category term="ab-testing"></category></entry><entry><title>A/B testing and network effects</title><link href="https://bytepawn.com/ab-testing-and-network-effects.html" rel="alternate"></link><published>2020-03-21T00:00:00+01:00</published><updated>2020-03-21T00:00:00+01:00</updated><author><name>Marton Trencseni</name></author><id>tag:bytepawn.com,2020-03-21:/ab-testing-and-network-effects.html</id><summary type="html">&lt;p&gt;I use Monte Carlo simulations to explore how A/B testing on Watts–Strogatz random graphs depends on the degree distribution of the social network.&lt;br/&gt;&lt;br/&gt; &lt;img src="/images/network_scaling1.png" alt="Watts-Strogatz degree distribution" style="width: 400px;"/&gt;&lt;/p&gt;</summary><content type="html">&lt;h2&gt;Introduction&lt;/h2&gt;
&lt;p&gt;In the &lt;a href="http://bytepawn.com/ab-testing-on-social-networks.html#ab-testing-on-social-networks"&gt;previous post&lt;/a&gt; I calculated the expected average number of posts on a social network. The model had two components:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;&lt;strong&gt;intrinsic:&lt;/strong&gt; users intrinsically create posts&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;network effects:&lt;/strong&gt; users create more posts if they see their friends’ posts&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;But, there is actually a third, even more nuanced effect: &lt;strong&gt;the strength of the network effect depends on the degree distribution in the network&lt;/strong&gt;.&lt;/p&gt;
&lt;p&gt;&lt;a href="https://github.com/mtrencseni/playground/blob/master/Network%20scaling.ipynb"&gt;The code shown below is up on Github.&lt;/a&gt;&lt;/p&gt;
&lt;h2&gt;Degree dependence of the network effect&lt;/h2&gt;
&lt;p&gt;Simulations showed that even a weak network effect boosts overall post production significantly. If $c_{int}$ is intrinsic post production, and for each friend's post users create an additional $c_{net}$ post the next day, and on average users have $k$ friends, then in the steady state $c = c_{int}  + c \times c_{net} \times k $. Solving this we get $ c = \frac{ c_{int} }{ 1 - c_{net} \times k  } $.&lt;/p&gt;
&lt;p&gt;In the last post, I used a $U(0, 1)$ random variable to multiply both $c_{int}$ and $c_{net}$. This is not necessary and just introduces noise, so let’s leave it out. For this post, let’s use $c_{int}=1/4$ and $c_{net}=3/200$; with these values, $c=1$.&lt;/p&gt;
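<p>Plugging these values into the steady-state formula confirms the choice:</p>

```python
c_int = 1 / 4    # intrinsic post production
c_net = 3 / 200  # additional posts per friend's post seen
k = 50           # average number of friends

# steady state: c = c_int + c * c_net * k  =>  c = c_int / (1 - c_net * k)
c = c_int / (1 - c_net * k)
print(c)  # 1.0 (up to floating point)
```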
&lt;p&gt;In the above formula, the structure of the graph only appears in the parameter $k$. This is a good first approximation, but it’s not entirely accurate. A more accurate way to think about it is to write $c = c_{int}  + c \times f(c_{net}, g) $, where $f$ is some function, and $g$ is the graph ($f=c_{net} \times k$ is our initial approximation). Let's explore this dependence further.&lt;/p&gt;
&lt;p&gt;We are using &lt;a href="https://en.wikipedia.org/wiki/Watts%E2%80%93Strogatz_model"&gt;Watts-Strogatz graphs&lt;/a&gt; for these experiments, which have a re-wiring randomization parameter $p$. When the Watts-Strogatz graph is constructed, initially a ring graph is created (large diameter) with $k$ edges from each node, and then each edge is randomly re-wired with probability $p$ to get a &lt;a href="https://en.wikipedia.org/wiki/Small-world_network"&gt;small-world network&lt;/a&gt; (small diameter). If we set $p=0$, every node has degree exactly $k$. As we increase $p$, the degree distribution becomes a Gaussian around $k$. Code to visualize this:&lt;/p&gt;
&lt;p&gt;&lt;em&gt;Note: some of these simulations take a while to run, so I now use &lt;a href="https://docs.python.org/2/library/multiprocessing.html"&gt;multiprocessing&lt;/a&gt;. This makes the code harder to read, but the speed-up is significant.&lt;/em&gt;&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;mp_degree_distribution&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;args&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
    &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;n&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;k&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;p&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;args&lt;/span&gt;
    &lt;span class="n"&gt;g&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;connected_watts_strogatz_graph&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;n&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;k&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;p&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="n"&gt;ds&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;Counter&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nb"&gt;sorted&lt;/span&gt;&lt;span class="p"&gt;([&lt;/span&gt;&lt;span class="n"&gt;d&lt;/span&gt; &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;n&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;d&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;g&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;degree&lt;/span&gt;&lt;span class="p"&gt;()],&lt;/span&gt; &lt;span class="n"&gt;reverse&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="bp"&gt;True&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;
    &lt;span class="n"&gt;d&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;c&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nb"&gt;zip&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="n"&gt;ds&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;items&lt;/span&gt;&lt;span class="p"&gt;())&lt;/span&gt;
    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;p&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;d&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;c&lt;/span&gt;

&lt;span class="n"&gt;n&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mi"&gt;100&lt;/span&gt;&lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="mi"&gt;1000&lt;/span&gt;
&lt;span class="n"&gt;k&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mi"&gt;50&lt;/span&gt;

&lt;span class="n"&gt;results&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;mp_map&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
    &lt;span class="n"&gt;network_scaling_worker&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;mp_degree_distribution&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="p"&gt;[(&lt;/span&gt;&lt;span class="n"&gt;n&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;k&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;p&lt;/span&gt;&lt;span class="o"&gt;/&lt;/span&gt;&lt;span class="mf"&gt;100.0&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;p&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="nb"&gt;range&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;10&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;110&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;20&lt;/span&gt;&lt;span class="p"&gt;)])&lt;/span&gt;

&lt;span class="n"&gt;plt&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;figure&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;figsize&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;10&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="mi"&gt;5&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;
&lt;span class="n"&gt;plt&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;xlabel&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;degree&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="n"&gt;plt&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;ylabel&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;node count&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;x&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;results&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
    &lt;span class="n"&gt;plt&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;plot&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;x&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt; &lt;span class="n"&gt;x&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="mi"&gt;2&lt;/span&gt;&lt;span class="p"&gt;])&lt;/span&gt;
&lt;span class="n"&gt;plt&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;legend&lt;/span&gt;&lt;span class="p"&gt;([&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;p=&lt;/span&gt;&lt;span class="si"&gt;%s&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;&lt;/span&gt; &lt;span class="o"&gt;%&lt;/span&gt; &lt;span class="n"&gt;x&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;x&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;results&lt;/span&gt;&lt;span class="p"&gt;])&lt;/span&gt;
&lt;span class="n"&gt;plt&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;show&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt; 
&lt;/pre&gt;&lt;/div&gt;


&lt;p&gt;Shows:&lt;/p&gt;
&lt;p&gt;&lt;img src="/images/network_scaling1.png" alt="Watts–Strogatz degree distribution" style="width: 600px;"/&gt;&lt;/p&gt;
&lt;p&gt;Now comes the interesting bit. Let’s calculate the average post production for a Watts-Strogatz graph for different parameters $(n, p)$, with fixed $k=50$:&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;mp_lifts&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;args&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
    &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;n&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;k&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;p&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;N&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;T&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;args&lt;/span&gt;
    &lt;span class="n"&gt;g&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;connected_watts_strogatz_graph&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;n&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;k&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;p&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;to_directed&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
    &lt;span class="n"&gt;population_A&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nb"&gt;set&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;sample&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;g&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;nodes&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;N&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;
    &lt;span class="n"&gt;stats&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;compute_stats&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;g&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;population_A&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;T&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="c1"&gt;# see the previous post&lt;/span&gt;
    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;args&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;stats&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="n"&gt;ns&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="mi"&gt;10&lt;/span&gt;&lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="mi"&gt;1000&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;50&lt;/span&gt;&lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="mi"&gt;1000&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;100&lt;/span&gt;&lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="mi"&gt;1000&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;300&lt;/span&gt;&lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="mi"&gt;1000&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;
&lt;span class="n"&gt;k&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mi"&gt;50&lt;/span&gt;
&lt;span class="n"&gt;ps&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nb"&gt;range&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;120&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;20&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="n"&gt;T&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mi"&gt;50&lt;/span&gt;
&lt;span class="n"&gt;N&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;

&lt;span class="n"&gt;results&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[]&lt;/span&gt;
&lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;n&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;ns&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
    &lt;span class="k"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;Running simulations for n=&lt;/span&gt;&lt;span class="si"&gt;%s&lt;/span&gt;&lt;span class="s1"&gt;...&amp;#39;&lt;/span&gt; &lt;span class="o"&gt;%&lt;/span&gt; &lt;span class="n"&gt;n&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="n"&gt;start_time&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;time&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;time&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
    &lt;span class="n"&gt;results&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;extend&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;mp_map&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
        &lt;span class="n"&gt;network_scaling_worker&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;mp_lifts&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="p"&gt;[(&lt;/span&gt;&lt;span class="n"&gt;n&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;k&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;p&lt;/span&gt;&lt;span class="o"&gt;/&lt;/span&gt;&lt;span class="mf"&gt;100.0&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;N&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;T&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;p&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;ps&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt;
    &lt;span class="p"&gt;))&lt;/span&gt;
    &lt;span class="n"&gt;elapsed_time&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;time&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;time&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt; &lt;span class="o"&gt;-&lt;/span&gt; &lt;span class="n"&gt;start_time&lt;/span&gt;
    &lt;span class="k"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;Done! Elapsed &lt;/span&gt;&lt;span class="si"&gt;%s&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;&lt;/span&gt; &lt;span class="o"&gt;%&lt;/span&gt; &lt;span class="n"&gt;time&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;strftime&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s2"&gt;&amp;quot;%M:%S&amp;quot;&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;time&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;gmtime&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;elapsed_time&lt;/span&gt;&lt;span class="p"&gt;)))&lt;/span&gt;
&lt;/pre&gt;&lt;/div&gt;


&lt;p&gt;Shows:&lt;/p&gt;
&lt;p&gt;&lt;img src="/images/network_scaling11.png" alt="Post production depends on Watts-Strogatz p" style="width: 600px;"/&gt;&lt;/p&gt;
&lt;p&gt;This shows the (third) effect:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;At $p=0$, when all nodes have exactly $k$ neighbours, the mean post production is exactly 1.&lt;/li&gt;
&lt;li&gt;At higher $p$, there is an additional boost to post production, due to some nodes having more than $k$ neighbours; this boost is not canceled out by the nodes having fewer neighbours.&lt;/li&gt;
&lt;li&gt;The maximum boost due to uneven degree distribution is about 2.5% at $p=1$ for $k=50$.&lt;/li&gt;
&lt;li&gt;The effect does not seem to depend on the network size $n$.&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;This has implications for A/B testing: &lt;strong&gt;when we compute the lift in our simulations, we need to use this adjusted baseline.&lt;/strong&gt;&lt;/p&gt;
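<p>For example, a sketch of what using the adjusted baseline means for the lift computation (the numbers below are hypothetical, for illustration only):</p>

```python
def lift(measured_mean_posts, baseline_mean_posts):
    # relative lift of the treatment over the baseline
    return measured_mean_posts / baseline_mean_posts - 1.0

measured = 1.05            # hypothetical mean posts measured in an A/B simulation
naive_baseline = 1.0       # analytic c = c_int / (1 - c_net * k)
adjusted_baseline = 1.025  # hypothetical simulated baseline for this (k, p)

print(lift(measured, naive_baseline))    # overstates the lift
print(lift(measured, adjusted_baseline)) # the more accurate reading
```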
&lt;p&gt;Another way to get a sense for this is: assume half the nodes have degree $50-d$ and half have degree $50+d$ after randomization, so the mean is still $k=50$. We can take the original formula $ c = \frac{ c_{int} }{ 1 - c_{net} \times k  } $ and make an improved version: $ c(d) = \frac{1}{2} ( \frac{ c_{int} }{ 1 - c_{net} \times (k+d) } + \frac{ c_{int} }{ 1 - c_{net} \times (k-d) } ) $. This is not the exact formula (the experimental result above is concave, and this is convex), but it’s good for intuition:&lt;/p&gt;
&lt;p&gt;&lt;img src="/images/network_scaling22.png" alt="Post production as a function of degree distribution" style="width: 600px;"/&gt;&lt;/p&gt;
&lt;p&gt;Initially I suspected the effect is due to the graph diameter decreasing with increasing $p$, so I checked how the mean shortest path depends on $p$ and $n$:&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;mp_shortest_paths&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;args&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
    &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;n&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;k&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;p&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;num_samples&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;args&lt;/span&gt;
    &lt;span class="n"&gt;g&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;connected_watts_strogatz_graph&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;n&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;k&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;p&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="n"&gt;samples&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[]&lt;/span&gt;
    &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;_&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="nb"&gt;range&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;num_samples&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
        &lt;span class="n"&gt;st&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;sample&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;g&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;nodes&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;2&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
        &lt;span class="n"&gt;l&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;nx&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;shortest_path_length&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;g&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;st&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt; &lt;span class="n"&gt;st&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;])&lt;/span&gt;
        &lt;span class="n"&gt;samples&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;append&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;l&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="n"&gt;mean_shortest_path&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;np&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;mean&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;samples&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;args&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;mean_shortest_path&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="n"&gt;ns&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="mi"&gt;10&lt;/span&gt;&lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="mi"&gt;1000&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;50&lt;/span&gt;&lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="mi"&gt;1000&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;100&lt;/span&gt;&lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="mi"&gt;1000&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;300&lt;/span&gt;&lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="mi"&gt;1000&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;
&lt;span class="n"&gt;k&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mi"&gt;50&lt;/span&gt;
&lt;span class="n"&gt;ps&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nb"&gt;range&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;120&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;20&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="n"&gt;num_samples&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;1000&lt;/span&gt;

&lt;span class="n"&gt;results&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[]&lt;/span&gt;
&lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;n&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;ns&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
    &lt;span class="k"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;Running simulations for n=&lt;/span&gt;&lt;span class="si"&gt;%s&lt;/span&gt;&lt;span class="s1"&gt;...&amp;#39;&lt;/span&gt; &lt;span class="o"&gt;%&lt;/span&gt; &lt;span class="n"&gt;n&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="n"&gt;start_time&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;time&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;time&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
    &lt;span class="n"&gt;results&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;extend&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;mp_map&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
        &lt;span class="n"&gt;network_scaling_worker&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;mp_shortest_paths&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="p"&gt;[(&lt;/span&gt;&lt;span class="n"&gt;n&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;k&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;p&lt;/span&gt;&lt;span class="o"&gt;/&lt;/span&gt;&lt;span class="mf"&gt;100.0&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;num_samples&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;p&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;ps&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt;
    &lt;span class="p"&gt;))&lt;/span&gt;
    &lt;span class="n"&gt;elapsed_time&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;time&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;time&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt; &lt;span class="o"&gt;-&lt;/span&gt; &lt;span class="n"&gt;start_time&lt;/span&gt;
    &lt;span class="k"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;Done! Elapsed &lt;/span&gt;&lt;span class="si"&gt;%s&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;&lt;/span&gt; &lt;span class="o"&gt;%&lt;/span&gt; &lt;span class="n"&gt;time&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;strftime&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s2"&gt;&amp;quot;%M:%S&amp;quot;&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;time&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;gmtime&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;elapsed_time&lt;/span&gt;&lt;span class="p"&gt;)))&lt;/span&gt;
&lt;/pre&gt;&lt;/div&gt;


&lt;p&gt;Shows:&lt;/p&gt;
&lt;p&gt;&lt;img src="/images/network_scaling21.png" alt="Mean shortest path as a function of Watts-Strogatz p" style="width: 600px;"/&gt;&lt;/p&gt;
&lt;p&gt;This does not appear to be the reason for the additional boost. The mean shortest path in the graph strongly depends on $n$, as expected, but the effect does not. Intuitively, since all nodes are equal for now (there is no A population producing more content), path length shouldn’t matter.&lt;/p&gt;
&lt;p&gt;So, to get more accurate lift readings for our A/B tests, we need to first calculate the correct baseline mean posts for a Watts-Strogatz graph with those $(k, p)$ params (no strong $n$ dependence, experimentally, as we just saw):&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;span class="n"&gt;n&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mi"&gt;100&lt;/span&gt;&lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="mi"&gt;1000&lt;/span&gt;
&lt;span class="n"&gt;k&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mi"&gt;50&lt;/span&gt;
&lt;span class="n"&gt;p&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mf"&gt;0.1&lt;/span&gt;
&lt;span class="n"&gt;T&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mi"&gt;50&lt;/span&gt;

&lt;span class="n"&gt;g&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;connected_watts_strogatz_graph&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;n&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;n&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;k&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;k&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;p&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;p&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;to_directed&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
&lt;span class="n"&gt;posts&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;prev_posts&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="bp"&gt;None&lt;/span&gt;
&lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;t&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="nb"&gt;range&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;T&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
    &lt;span class="n"&gt;posts&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;step_posts&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;g&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;prev_posts&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="n"&gt;prev_posts&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;posts&lt;/span&gt;
&lt;span class="n"&gt;baseline_avg_posts&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;np&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;mean&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nb"&gt;list&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;posts&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;values&lt;/span&gt;&lt;span class="p"&gt;()))&lt;/span&gt;
&lt;span class="k"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s2"&gt;&amp;quot;Correct baseline for a (&lt;/span&gt;&lt;span class="si"&gt;%s&lt;/span&gt;&lt;span class="s2"&gt;, &lt;/span&gt;&lt;span class="si"&gt;%s&lt;/span&gt;&lt;span class="s2"&gt;, &lt;/span&gt;&lt;span class="si"&gt;%s&lt;/span&gt;&lt;span class="s2"&gt;) Watts-Strogatz graph after &lt;/span&gt;&lt;span class="si"&gt;%s&lt;/span&gt;&lt;span class="s2"&gt; steps = &lt;/span&gt;&lt;span class="si"&gt;%.4f&lt;/span&gt;&lt;span class="s2"&gt;&amp;quot;&lt;/span&gt; &lt;span class="o"&gt;%&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;n&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;k&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;p&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;T&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;baseline_avg_posts&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;
&lt;/pre&gt;&lt;/div&gt;


&lt;p&gt;Prints:&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;Correct baseline for a (100000, 50, 0.1) Watts-Strogatz graph after 50 steps = 1.0044
&lt;/pre&gt;&lt;/div&gt;


&lt;h2&gt;A/B testing&lt;/h2&gt;
&lt;p&gt;Let’s re-run the A/B test experiment from the last post and calculate the lifts, but compared to the new, corrected baseline. Let’s have an experimental group of $N$ people, whose intrinsic post production $c_{int}$ is lifted by 5%, using both $N=100$ and $N=1,000$:&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;span class="n"&gt;ns&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="mi"&gt;5&lt;/span&gt;&lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="mi"&gt;1000&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;10&lt;/span&gt;&lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="mi"&gt;1000&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;50&lt;/span&gt;&lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="mi"&gt;1000&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;100&lt;/span&gt;&lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="mi"&gt;1000&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;
&lt;span class="n"&gt;k&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mi"&gt;50&lt;/span&gt;
&lt;span class="n"&gt;p&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mf"&gt;0.1&lt;/span&gt;
&lt;span class="n"&gt;T&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mi"&gt;50&lt;/span&gt;
&lt;span class="n"&gt;num_simulations&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;50&lt;/span&gt;

&lt;span class="n"&gt;params&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;chain&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
    &lt;span class="n"&gt;prepend&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;100&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;  &lt;span class="n"&gt;ns&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt;
    &lt;span class="n"&gt;prepend&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;1000&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;ns&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt;
&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;thread_count&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;n&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
    &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;n&lt;/span&gt; &lt;span class="o"&gt;&amp;lt;=&lt;/span&gt; &lt;span class="mi"&gt;100&lt;/span&gt;&lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="mi"&gt;1000&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="mi"&gt;24&lt;/span&gt;
    &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;n&lt;/span&gt; &lt;span class="o"&gt;==&lt;/span&gt; &lt;span class="mi"&gt;200&lt;/span&gt;&lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="mi"&gt;1000&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="mi"&gt;12&lt;/span&gt;
    &lt;span class="k"&gt;else&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;             &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="mi"&gt;8&lt;/span&gt;

&lt;span class="n"&gt;stats_list&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[]&lt;/span&gt;
&lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="vm"&gt;__name__&lt;/span&gt; &lt;span class="o"&gt;==&lt;/span&gt;  &lt;span class="s1"&gt;&amp;#39;__main__&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
    &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;N&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;n&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;params&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
        &lt;span class="k"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;Running simulations for n=&lt;/span&gt;&lt;span class="si"&gt;%s&lt;/span&gt;&lt;span class="s1"&gt;...&amp;#39;&lt;/span&gt; &lt;span class="o"&gt;%&lt;/span&gt; &lt;span class="n"&gt;n&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
        &lt;span class="n"&gt;start_time&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;time&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;time&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
        &lt;span class="n"&gt;results&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;mp_map&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
            &lt;span class="n"&gt;network_scaling_worker&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;mp_lifts&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
            &lt;span class="p"&gt;[(&lt;/span&gt;&lt;span class="n"&gt;n&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;k&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;p&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;N&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;T&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;_&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="nb"&gt;range&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;num_simulations&lt;/span&gt;&lt;span class="p"&gt;)],&lt;/span&gt;
            &lt;span class="n"&gt;thread_count&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;n&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt;
        &lt;span class="p"&gt;)&lt;/span&gt;
        &lt;span class="n"&gt;avg_stats&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;np&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;mean&lt;/span&gt;&lt;span class="p"&gt;([&lt;/span&gt;&lt;span class="n"&gt;x&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;][&lt;/span&gt;&lt;span class="n"&gt;i&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;x&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;results&lt;/span&gt;&lt;span class="p"&gt;])&lt;/span&gt; &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;i&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="nb"&gt;range&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nb"&gt;len&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;results&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;][&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;]))]&lt;/span&gt;
        &lt;span class="n"&gt;stats_list&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;append&lt;/span&gt;&lt;span class="p"&gt;((&lt;/span&gt;&lt;span class="n"&gt;N&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;n&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;avg_stats&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;
        &lt;span class="n"&gt;elapsed_time&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;time&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;time&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt; &lt;span class="o"&gt;-&lt;/span&gt; &lt;span class="n"&gt;start_time&lt;/span&gt;
        &lt;span class="k"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;Done! Elapsed &lt;/span&gt;&lt;span class="si"&gt;%s&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;&lt;/span&gt; &lt;span class="o"&gt;%&lt;/span&gt; &lt;span class="n"&gt;time&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;strftime&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s2"&gt;&amp;quot;%M:%S&amp;quot;&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;time&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;gmtime&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;elapsed_time&lt;/span&gt;&lt;span class="p"&gt;)))&lt;/span&gt;
&lt;/pre&gt;&lt;/div&gt;


&lt;p&gt;&lt;img src="/images/network_scaling31.png" alt="Experimental post production lift on a Watts-Strogatz network" style="width: 600px;"/&gt;&lt;/p&gt;
&lt;p&gt;What we expect to see:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;as $N/n$ goes to 0, the rest and overall lifts tend to 0.&lt;/li&gt;
&lt;li&gt;as $N/n$ goes to 1, we expect to see a bigger lift for population A (if $N=n$, we released to 100% and lifted everybody's post production)&lt;/li&gt;
&lt;/ul&gt;
&lt;h2&gt;Conclusion&lt;/h2&gt;
&lt;p&gt;A simple toy post production model on a Watts-Strogatz graph shows multiple interesting effects (also see &lt;a href="http://bytepawn.com/ab-testing-on-social-networks.html#ab-testing-on-social-networks"&gt;previous post&lt;/a&gt;):&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;&lt;strong&gt;network effect&lt;/strong&gt;: boosts post production by a significant factor&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;degree distribution effect&lt;/strong&gt;: the network effect boost is a function of the graph's degree distribution, which for a Watts-Strogatz graph is a function of $p$&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;dampening effect&lt;/strong&gt;: we underestimate the true intrinsic lift of the A/B test, because A’s non-A friends don’t get the intrinsic post production boost, so As don’t get the boost “back” through these edges&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;spillover effect&lt;/strong&gt;: we measure a lift due to the network effect for friends of A, and further down the network, depending on the distance from As&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;clustering effect&lt;/strong&gt;: if the A group is more tightly clustered, we measure a higher lift&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;experiment size effect&lt;/strong&gt;: as $N/n$ goes to 1, the effect size approaches the true effect size&lt;/li&gt;
&lt;/ul&gt;</content><category term="ab-testing"></category></entry><entry><title>A/B testing on social networks</title><link href="https://bytepawn.com/ab-testing-on-social-networks.html" rel="alternate"></link><published>2020-03-09T00:00:00+01:00</published><updated>2020-03-09T00:00:00+01:00</updated><author><name>Marton Trencseni</name></author><id>tag:bytepawn.com,2020-03-09:/ab-testing-on-social-networks.html</id><summary type="html">&lt;p&gt;I use Monte Carlo simulations to show that experimentation on social networks is a beatiful statistical problem with unexpected nuances due to network effects.&lt;br/&gt;&lt;br/&gt; &lt;img src="/images/strogatz2.png" alt="Watts-Strogatz" style="width: 400px;"/&gt;&lt;/p&gt;</summary><content type="html">&lt;h2&gt;Introduction&lt;/h2&gt;
&lt;p&gt;In the &lt;a href="http://bytepawn.com/tag/ab-testing.html"&gt;previous posts on A/B testing&lt;/a&gt; we have implicitly assumed independence:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;if $A_1$ and $A_2$ are two units in the A bucket, the choices of $A_1$ and $A_2$ are independent of each other&lt;/li&gt;
&lt;li&gt;the same across A and B&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;This even went into the math, because the &lt;a href="https://en.wikipedia.org/wiki/Central_limit_theorem"&gt;Central Limit Theorem&lt;/a&gt; assumes that the random variables added are independent. But the point this post drives home is not going to be about the CLT.&lt;/p&gt;
&lt;p&gt;Let’s take the case of post production. An experiment could test whether people are more likely to create a post if the UI element for posting is bigger and more prominent. If the product does not have a sharing/network component, it’s reasonable to make the above 2 independence assumptions. But on a social network they do not hold. If the experiment boosts a user's post production, their friends see more posts in their feed, which in turn could lead them to post more, which in turn... and so on.&lt;/p&gt;
&lt;p&gt;Sticking to the post production example, we can model the effect if we split posting propensity into two parts:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;&lt;strong&gt;intrinsic&lt;/strong&gt;: a random variable which describes how many posts daily a user on the network is likely to create&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;network effect&lt;/strong&gt;: users are more likely to create posts if they see their friends' posts&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;Let’s assume that group A gets the UI element and it actually boosts their intrinsic post production. Because of the network effect, we expect to:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;measure an increased boost for A (vs just the intrinsic effect), because of A-A “self” interaction (network effect)&lt;/li&gt;
&lt;li&gt;measure an increased boost for B (vs no effect), because of A-B interaction (spillover effect)&lt;/li&gt;
&lt;li&gt;since B is also boosted, A-B interaction also boosts A; everything is boosted, to a different degree&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;Additionally:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;the effect we measure in A (intrinsic effect plus network effect) will be less than what we get if we release A to 100%, since then the whole network will reinforce&lt;/li&gt;
&lt;li&gt;the network effect depends on the social network: more connections means more reinforcement&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;&lt;a href="https://github.com/mtrencseni/playground/blob/master/AB%20testing%20on%20social%20networks.ipynb"&gt;The code shown below is up on Github.&lt;/a&gt;&lt;/p&gt;
&lt;h2&gt;Watts–Strogatz random graphs&lt;/h2&gt;
&lt;p&gt;Let’s run some Monte Carlo simulations to see this in action. We will use a random &lt;a href="https://en.wikipedia.org/wiki/Watts%E2%80%93Strogatz_model"&gt;Watts–Strogatz model&lt;/a&gt; for the social network, and use the &lt;a href="https://networkx.github.io/"&gt;networkx&lt;/a&gt; library to generate it for us. The Watts-Strogatz model creates a graph with $n$ nodes, arranged in a ring, with each node connected to its $k$ nearest neighbors in the ring; this initial setup is clustered and has a high diameter. Then, with probability $p$, each edge is re-connected to a random node on the ring; this causes the diameter of the graph to drop and produces a “small-world graph”, where every node is reachable from every other node in a low number of hops.&lt;/p&gt;
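&lt;p&gt;To see the small-world effect concretely, here is a minimal pure-Python sketch (a toy re-implementation for illustration, not networkx's actual &lt;code&gt;connected_watts_strogatz_graph&lt;/code&gt;, which differs in details) that measures how the mean shortest path falls as $p$ grows:&lt;/p&gt;

```python
import random
from collections import deque

def watts_strogatz(n, k, p, seed=0):
    # Toy Watts-Strogatz: ring lattice with k/2 neighbors on each side,
    # then each "forward" edge is rewired to a random node with probability p.
    rng = random.Random(seed)
    adj = {v: set() for v in range(n)}
    for v in range(n):
        for j in range(1, k // 2 + 1):
            w = (v + j) % n
            if rng.random() < p:
                w = rng.randrange(n)
                while w == v or w in adj[v]:  # avoid self-loops and duplicates
                    w = rng.randrange(n)
            adj[v].add(w)
            adj[w].add(v)
    return adj

def mean_shortest_path(adj):
    # BFS from every node; assumes the graph is connected.
    total, pairs = 0, 0
    for s in adj:
        dist = {s: 0}
        q = deque([s])
        while q:
            u = q.popleft()
            for w in adj[u]:
                if w not in dist:
                    dist[w] = dist[u] + 1
                    q.append(w)
        total += sum(dist.values())
        pairs += len(dist) - 1
    return total / pairs

# Mean shortest path drops sharply once a few edges are rewired:
for p in (0.0, 0.1, 0.5):
    print(p, round(mean_shortest_path(watts_strogatz(200, 8, p)), 2))
```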
&lt;p&gt;Some examples of Watts–Strogatz graphs:&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;span class="n"&gt;nx&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;draw&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;connected_watts_strogatz_graph&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;n&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mi"&gt;10&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;k&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mi"&gt;4&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;p&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mf"&gt;0.0&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;
&lt;/pre&gt;&lt;/div&gt;


&lt;p&gt;&lt;img src="/images/strogatz1.png" alt="Watts–Strogatz random graph" style="width: 400px;"/&gt;&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;span class="n"&gt;nx&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;draw&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;connected_watts_strogatz_graph&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;n&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mi"&gt;10&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;k&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mi"&gt;4&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;p&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mf"&gt;0.5&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;
&lt;/pre&gt;&lt;/div&gt;


&lt;p&gt;&lt;img src="/images/strogatz2.png" alt="Watts–Strogatz random graph" style="width: 400px;"/&gt;&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;span class="n"&gt;nx&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;draw&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;connected_watts_strogatz_graph&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;n&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mi"&gt;100&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;k&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mi"&gt;10&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;p&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mf"&gt;0.5&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;
&lt;/pre&gt;&lt;/div&gt;


&lt;p&gt;&lt;img src="/images/strogatz3.png" alt="Watts–Strogatz random graph" style="width: 400px;"/&gt;&lt;/p&gt;
&lt;p&gt;For initial exploration, I will use a small graph:&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;span class="n"&gt;g&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;connected_watts_strogatz_graph&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;n&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mi"&gt;1000&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;k&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mi"&gt;50&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;p&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mf"&gt;0.1&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;to_directed&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
&lt;/pre&gt;&lt;/div&gt;


&lt;h2&gt;Post production model&lt;/h2&gt;
&lt;p&gt;For post production, let’s follow the simple model given above, with two parts:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;intrinsic post production&lt;/li&gt;
&lt;li&gt;network effect: seeing their friends' posts causes users to post more, proportionally&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;In code, we will run the simulation day-to-day, i.e. posts from day T will trigger people to post more on day T+1. In this toy model, we will allow non-integer post production, so people can write e.g. 0.1134 posts a day:&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;step_posts&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;g&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;yesterday_posts&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="bp"&gt;None&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;intrinsic&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mf"&gt;0.25&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;network_effect&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mf"&gt;0.03&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
    &lt;span class="n"&gt;today_posts&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;defaultdict&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nb"&gt;int&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="c1"&gt;# baseline&lt;/span&gt;
    &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;v&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;g&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;nodes&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
        &lt;span class="n"&gt;today_posts&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;v&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;intrinsic&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt; &lt;span class="n"&gt;random&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
    &lt;span class="c1"&gt;# network effect&lt;/span&gt;
    &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;yesterday_posts&lt;/span&gt; &lt;span class="ow"&gt;is&lt;/span&gt; &lt;span class="ow"&gt;not&lt;/span&gt; &lt;span class="bp"&gt;None&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
        &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;v1&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;v2&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;g&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;edges&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
            &lt;span class="n"&gt;today_posts&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;v2&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="o"&gt;+=&lt;/span&gt; &lt;span class="n"&gt;yesterday_posts&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;v1&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt; &lt;span class="n"&gt;network_effect&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt; &lt;span class="n"&gt;random&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;today_posts&lt;/span&gt;
&lt;/pre&gt;&lt;/div&gt;


&lt;p&gt;We can drive it like:&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;span class="n"&gt;T&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;100&lt;/span&gt;
&lt;span class="n"&gt;posts_series&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[]&lt;/span&gt;
&lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;t&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="nb"&gt;range&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;T&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
    &lt;span class="n"&gt;posts&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;step_posts&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;g&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="bp"&gt;None&lt;/span&gt; &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="nb"&gt;len&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;posts_series&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;==&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt; &lt;span class="k"&gt;else&lt;/span&gt; &lt;span class="n"&gt;posts_series&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="o"&gt;-&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;])&lt;/span&gt;
    &lt;span class="n"&gt;posts_series&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;append&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;posts&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/pre&gt;&lt;/div&gt;


&lt;p&gt;It will take a few days for the network to reach equilibrium:&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;span class="n"&gt;avg_posts&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;np&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;mean&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nb"&gt;list&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;posts&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;values&lt;/span&gt;&lt;span class="p"&gt;()))&lt;/span&gt; &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;posts&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;posts_series&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;
&lt;span class="n"&gt;plt&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;figure&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;figsize&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;10&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="mi"&gt;5&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;
&lt;span class="n"&gt;plt&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;xlabel&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;t&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="n"&gt;plt&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;ylabel&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;avg posts&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="n"&gt;plt&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;plot&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;avg_posts&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="n"&gt;plt&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;show&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
&lt;/pre&gt;&lt;/div&gt;


&lt;p&gt;Shows something like:&lt;/p&gt;
&lt;p&gt;&lt;img src="/images/sn1.png" alt="Post production" style="width: 600px;"/&gt;&lt;/p&gt;
&lt;p&gt;We see that with the parameters used, it converges to 0.5 posts / day on average across the network after about $T_c=20$ steps:&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;span class="n"&gt;np&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;mean&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;avg_posts&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="mi"&gt;20&lt;/span&gt;&lt;span class="p"&gt;:])&lt;/span&gt;
&lt;/pre&gt;&lt;/div&gt;


&lt;p&gt;Prints something like:&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;span class="mf"&gt;0.5040494951777046&lt;/span&gt;
&lt;/pre&gt;&lt;/div&gt;


&lt;p&gt;It’s easy to see why. On the first day, each person produces &lt;code&gt;intrinsic * random()&lt;/code&gt; posts, where &lt;code&gt;intrinsic = 0.25&lt;/code&gt; and &lt;code&gt;random()&lt;/code&gt; is a $U(0, 1)$ uniform random variable, so on average it’s 0.5. So this part is on average $c=0.125$. Then, starting the second day, each person produces $c$ on average, plus for each friend, &lt;code&gt;yesterday_posts[v1] * network_effect * random()&lt;/code&gt; additional posts, where &lt;code&gt;network_effect = 0.03&lt;/code&gt;, and from the graph each person has 50 friends. So overall this is on average $c * k$, with $k = 50 * 0.03 * 0.5 = 0.75$. Once equilibrium is reached, the following holds: $c_{next} = c + c_{next} * k$. Solving this, $c_{next} = 0.5$.&lt;/p&gt;
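&lt;p&gt;The fixed-point argument above can be checked numerically; a small sketch iterating the mean daily production with the same constants as the model:&lt;/p&gt;

```python
# Mean-field check of the equilibrium argument:
# c = average intrinsic production, k = network multiplier,
# and the daily recurrence level_{t+1} = c + k * level_t.
c = 0.25 * 0.5       # intrinsic * E[random()] = 0.125
k = 50 * 0.03 * 0.5  # friends * network_effect * E[random()] = 0.75

level = 0.0
for _ in range(200):  # iterate the daily recurrence until it settles
    level = c + k * level

closed_form = c / (1 - k)  # solving c_next = c + c_next * k
print(level, closed_form)
```

Both the iterated recurrence and the closed form land on 0.5 posts / day, matching the simulated equilibrium.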
&lt;p&gt;Note that the intrinsic part averages 0.125, and the network effect adds on another 0.375. &lt;strong&gt;In this toy model, 3 out of 4 posts are the result of network effects!&lt;/strong&gt; This is a good qualitative indication of why network effects are so important for engagement.&lt;/p&gt;
&lt;p&gt;We can also see that by making the network effect too strong, either by having too many friends or setting &lt;code&gt;network_effect&lt;/code&gt; too high, we get exponential growth (in this case, the $c_{next}$ equation yields a nonsensical negative solution). For example, if we double the friend count to 100 (but keep everything else the same):&lt;/p&gt;
&lt;p&gt;&lt;img src="/images/sn2.png" alt="Post production" style="width: 600px;"/&gt;&lt;/p&gt;
&lt;p&gt;For the purposes of this discussion, exponential growth is unrealistic. We are assuming there is a base steady-state, and we run an experiment which lifts the steady state by a few percentage points.&lt;/p&gt;
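&lt;p&gt;The blow-up is visible in the same one-step recurrence: doubling the friend count to 100 doubles the multiplier $k$ from 0.75 to 1.5, and the fixed point disappears (a sketch, using the mean-field constants assumed above):&lt;/p&gt;

```python
# The recurrence level_{t+1} = c + k * level_t converges to c / (1 - k)
# for k < 1, and diverges geometrically once k >= 1.
def simulate(c, k, steps=30):
    level, levels = 0.0, []
    for _ in range(steps):
        level = c + k * level
        levels.append(level)
    return levels

steady = simulate(c=0.125, k=0.75)  # 50 friends: converges toward 0.5
runaway = simulate(c=0.125, k=1.5)  # 100 friends: exponential growth
print(steady[-1], runaway[-1])
```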
&lt;h2&gt;Experiments&lt;/h2&gt;
&lt;p&gt;Let’s do an experiment and see what happens. For this, let's:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;use a bigger graph, with $n=100,000$ nodes, but keep $k=50$&lt;/li&gt;
&lt;li&gt;pick out $N=1,000$ people randomly ("population A"), and boost their intrinsic post production by 5%&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;Code:&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;span class="n"&gt;g&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;connected_watts_strogatz_graph&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;n&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mi"&gt;100&lt;/span&gt;&lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="mi"&gt;1000&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;k&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mi"&gt;50&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;p&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mf"&gt;0.1&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;to_directed&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
&lt;span class="n"&gt;N&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;1000&lt;/span&gt;
&lt;span class="n"&gt;population_A&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nb"&gt;set&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;sample&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;g&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;nodes&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;N&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;
&lt;span class="n"&gt;effect_size&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mf"&gt;0.05&lt;/span&gt;

&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;step_posts&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;g&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;yesterday_posts&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="bp"&gt;None&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;intrinsic&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mf"&gt;0.25&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;network_effect&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mf"&gt;0.03&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
    &lt;span class="n"&gt;today_posts&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;defaultdict&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nb"&gt;int&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="c1"&gt;# baseline&lt;/span&gt;
    &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;v&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;g&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;nodes&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
        &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;v&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;population_A&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
            &lt;span class="c1"&gt;# experiment&lt;/span&gt;
            &lt;span class="n"&gt;today_posts&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;v&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;intrinsic&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt; &lt;span class="n"&gt;random&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt; &lt;span class="o"&gt;+&lt;/span&gt; &lt;span class="n"&gt;effect_size&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
        &lt;span class="k"&gt;else&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
            &lt;span class="n"&gt;today_posts&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;v&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;intrinsic&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt; &lt;span class="n"&gt;random&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
    &lt;span class="c1"&gt;# network effect&lt;/span&gt;
    &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;yesterday_posts&lt;/span&gt; &lt;span class="ow"&gt;is&lt;/span&gt; &lt;span class="ow"&gt;not&lt;/span&gt; &lt;span class="bp"&gt;None&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
        &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;v1&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;v2&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;g&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;edges&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
            &lt;span class="n"&gt;today_posts&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;v2&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="o"&gt;+=&lt;/span&gt; &lt;span class="n"&gt;yesterday_posts&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;v1&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt; &lt;span class="n"&gt;network_effect&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt; &lt;span class="n"&gt;random&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;today_posts&lt;/span&gt;
&lt;/pre&gt;&lt;/div&gt;


&lt;p&gt;Looking at the converged part of the timeline, this is what we get for (i) overall post production, (ii) just A, (iii) friends of A, and (iv) the rest:&lt;/p&gt;
&lt;p&gt;&lt;img src="/images/sn3.png" alt="Post production" style="width: 600px;"/&gt;&lt;/p&gt;
&lt;p&gt;Combining all the days, we can get better statistics:&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;span class="n"&gt;base&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mf"&gt;0.5&lt;/span&gt;
&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;lift&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;a&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt; &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="p"&gt;((&lt;/span&gt;&lt;span class="n"&gt;np&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;mean&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;a&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;&lt;span class="o"&gt;/&lt;/span&gt;&lt;span class="n"&gt;base&lt;/span&gt;&lt;span class="o"&gt;-&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;&lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="mi"&gt;100&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="k"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s2"&gt;&amp;quot;A lift: &lt;/span&gt;&lt;span class="si"&gt;%.3f%%&lt;/span&gt;&lt;span class="s2"&gt;&amp;quot;&lt;/span&gt; &lt;span class="o"&gt;%&lt;/span&gt; &lt;span class="n"&gt;lift&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;avg_posts_A&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;
&lt;span class="k"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s2"&gt;&amp;quot;Friends of A lift: &lt;/span&gt;&lt;span class="si"&gt;%.3f%%&lt;/span&gt;&lt;span class="s2"&gt;&amp;quot;&lt;/span&gt; &lt;span class="o"&gt;%&lt;/span&gt; &lt;span class="n"&gt;lift&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;avg_posts_A_friends&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;
&lt;span class="k"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s2"&gt;&amp;quot;Rest lift: &lt;/span&gt;&lt;span class="si"&gt;%.3f%%&lt;/span&gt;&lt;span class="s2"&gt;&amp;quot;&lt;/span&gt; &lt;span class="o"&gt;%&lt;/span&gt; &lt;span class="n"&gt;lift&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;avg_posts_rest&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;
&lt;span class="k"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s2"&gt;&amp;quot;Overall lift: &lt;/span&gt;&lt;span class="si"&gt;%.3f%%&lt;/span&gt;&lt;span class="s2"&gt;&amp;quot;&lt;/span&gt; &lt;span class="o"&gt;%&lt;/span&gt; &lt;span class="n"&gt;lift&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;avg_posts_all&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;
&lt;/pre&gt;&lt;/div&gt;


&lt;p&gt;So compared to the base of 0.5 (no experiment), we measure:&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;span class="n"&gt;A&lt;/span&gt; &lt;span class="n"&gt;lift&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;            &lt;span class="mf"&gt;1.702&lt;/span&gt;&lt;span class="o"&gt;%&lt;/span&gt;
&lt;span class="n"&gt;Friends&lt;/span&gt; &lt;span class="n"&gt;of&lt;/span&gt; &lt;span class="n"&gt;A&lt;/span&gt; &lt;span class="n"&gt;lift&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="mf"&gt;0.605&lt;/span&gt;&lt;span class="o"&gt;%&lt;/span&gt;
&lt;span class="n"&gt;Rest&lt;/span&gt; &lt;span class="n"&gt;lift&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;         &lt;span class="mf"&gt;0.411&lt;/span&gt;&lt;span class="o"&gt;%&lt;/span&gt;
&lt;span class="n"&gt;Overall&lt;/span&gt; &lt;span class="n"&gt;lift&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;      &lt;span class="mf"&gt;0.501&lt;/span&gt;&lt;span class="o"&gt;%&lt;/span&gt;
&lt;/pre&gt;&lt;/div&gt;


&lt;p&gt;These results are very interesting:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;&lt;strong&gt;intrinsic production dampened by the network effect:&lt;/strong&gt; we underestimate the true intrinsic effect (1.7% vs 5%), because A’s non-A friends don’t have the feature, so As don’t get the boost “back” through these edges&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;spillover effect:&lt;/strong&gt; we measure a lift due to the network effect for friends of A, and further down the network, depending on the distance from As&lt;/li&gt;
&lt;li&gt;if we release this feature to the entire network, average post production would be $ (1 + 0.05) \times 0.25 \times 0.5 / (1 - 50 \times 0.5 \times 0.03) = 0.525$, or a 5% lift compared to the base of 0.5, as expected&lt;/li&gt;
&lt;li&gt;the overall lift is higher than the “rest” because A is pulling it up&lt;/li&gt;
&lt;li&gt;the last 2 lifts (rest and overall) can be made arbitrarily small by increasing the overall size $n$ of the network while keeping the experimental group size $N$ fixed&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;How much the network effect dampens the measured intrinsic lift depends on the relative strength of the network effect. In this simulation, we set the parameters so that the network effect is very strong: it boosts average post production from 0.125 to 0.5, a 4x increase! If the network effect were weaker, the experimental dampening would also be weaker, and so would the spillover effect.&lt;/p&gt;
&lt;p&gt;We can see this in action by repeating the experiment with &lt;code&gt;network_effect = 0.01&lt;/code&gt;, so a 3x weaker network effect. In this case, the base value works out to 0.1666 (no experiment), so the network effect only boosts post production by $0.1666 / 0.125 \approx 1.33$x. In the experiment, compared to this base, we measure (with +5% post production for the $N=1000$ population A):&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;span class="n"&gt;A&lt;/span&gt; &lt;span class="n"&gt;lift&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;            &lt;span class="mf"&gt;3.777&lt;/span&gt;&lt;span class="o"&gt;%&lt;/span&gt;
&lt;span class="n"&gt;Friends&lt;/span&gt; &lt;span class="n"&gt;of&lt;/span&gt; &lt;span class="n"&gt;A&lt;/span&gt; &lt;span class="n"&gt;lift&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="mf"&gt;0.070&lt;/span&gt;&lt;span class="o"&gt;%&lt;/span&gt;
&lt;span class="n"&gt;Rest&lt;/span&gt; &lt;span class="n"&gt;lift&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;         &lt;span class="mf"&gt;0.032&lt;/span&gt;&lt;span class="o"&gt;%&lt;/span&gt;
&lt;span class="n"&gt;Overall&lt;/span&gt; &lt;span class="n"&gt;lift&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;      &lt;span class="mf"&gt;0.085&lt;/span&gt;&lt;span class="o"&gt;%&lt;/span&gt;
&lt;/pre&gt;&lt;/div&gt;
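&lt;p&gt;The base value of 0.1666 quoted above follows from the same fixed-point formula; a quick check:&lt;/p&gt;

```python
# Same steady-state formula with network_effect = 0.01: k drops from 0.75
# to 0.25, so the base is 0.125 / 0.75 = 0.1666..., only about 1.33x the
# intrinsic 0.125 (versus 4x with network_effect = 0.03).
c = 0.25 * 0.5
k = 50 * 0.01 * 0.5
base = c / (1 - k)
print('%.4f' % base)        # 0.1667
print('%.2f' % (base / c))  # 1.33
```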


&lt;p&gt;This confirms the above: with a weaker network effect, the measured lift in the experimental group is closer to the true effect size because network-effect dampening is lower (3.77% vs 1.70%), and the spillover effect is also smaller (0.07% vs 0.60%). We could weaken the network effect in the same way by decreasing the edge count of the graph, i.e. we would get the same result by using a $k=50/3$ Watts–Strogatz graph instead of a $k=50$ one.&lt;/p&gt;
&lt;p&gt;Another interesting experiment is if we pick a highly clustered population for the experiment group A. We can achieve this by:&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;span class="n"&gt;population_A&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nb"&gt;set&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nb"&gt;list&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;g&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;nodes&lt;/span&gt;&lt;span class="p"&gt;)[:&lt;/span&gt;&lt;span class="n"&gt;N&lt;/span&gt;&lt;span class="p"&gt;])&lt;/span&gt;   &lt;span class="c1"&gt;# set(sample(g.nodes, N)) &amp;lt;- original sampling&lt;/span&gt;
&lt;/pre&gt;&lt;/div&gt;


&lt;p&gt;First, let’s make sure that this way of picking out $N=1,000$ nodes in fact yields a more highly clustered population than proper random sampling. In the original setup, we expect on average $N/n = 1\%$ of each A’s neighbours to also be in A, whereas when we pick out $N$ consecutive nodes, since only a $p=0.1$ fraction of edges was rewired in the Watts-Strogatz process, we expect this ratio to be significantly higher:&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;ratio_AA_friendship&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;g&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;population_A&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
    &lt;span class="n"&gt;num_AA_edges&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nb"&gt;sum&lt;/span&gt;&lt;span class="p"&gt;([(&lt;/span&gt;&lt;span class="n"&gt;v1&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;population_A&lt;/span&gt; &lt;span class="ow"&gt;and&lt;/span&gt; &lt;span class="n"&gt;v2&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;population_A&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;v1&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;v2&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;g&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;edges&lt;/span&gt;&lt;span class="p"&gt;])&lt;/span&gt;
    &lt;span class="n"&gt;num_A_edges&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nb"&gt;sum&lt;/span&gt;&lt;span class="p"&gt;([(&lt;/span&gt;&lt;span class="n"&gt;v1&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;population_A&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;v1&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;_&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;g&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;edges&lt;/span&gt;&lt;span class="p"&gt;])&lt;/span&gt;
    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;num_AA_edges&lt;/span&gt; &lt;span class="o"&gt;/&lt;/span&gt; &lt;span class="n"&gt;num_A_edges&lt;/span&gt;

&lt;span class="k"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;&lt;/span&gt;&lt;span class="si"&gt;%.2f&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;&lt;/span&gt;&lt;span class="o"&gt;%&lt;/span&gt; &lt;span class="n"&gt;ratio_AA_friendship&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;g&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nb"&gt;set&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;sample&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;g&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;nodes&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;N&lt;/span&gt;&lt;span class="p"&gt;))))&lt;/span&gt;
&lt;span class="k"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;&lt;/span&gt;&lt;span class="si"&gt;%.2f&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;&lt;/span&gt;&lt;span class="o"&gt;%&lt;/span&gt; &lt;span class="n"&gt;ratio_AA_friendship&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;g&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nb"&gt;set&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nb"&gt;list&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;g&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;nodes&lt;/span&gt;&lt;span class="p"&gt;)[:&lt;/span&gt;&lt;span class="n"&gt;N&lt;/span&gt;&lt;span class="p"&gt;])))&lt;/span&gt;
&lt;/pre&gt;&lt;/div&gt;


&lt;p&gt;Prints something like:&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;span class="mf"&gt;0.01&lt;/span&gt;
&lt;span class="mf"&gt;0.89&lt;/span&gt;
&lt;/pre&gt;&lt;/div&gt;


&lt;p&gt;With proper random sampling, the ratio is indeed 1%, whereas in the highly clustered case 89% of A’s friends are also As. So in this setup, we expect the measured A lift to be much closer to the true lift of 5% (using the original &lt;code&gt;network_effect = 0.03&lt;/code&gt;). Running the simulation with this clustered A population, we get:&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;span class="n"&gt;A&lt;/span&gt; &lt;span class="n"&gt;lift&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;            &lt;span class="mf"&gt;4.307&lt;/span&gt;&lt;span class="o"&gt;%&lt;/span&gt;
&lt;span class="n"&gt;Friends&lt;/span&gt; &lt;span class="n"&gt;of&lt;/span&gt; &lt;span class="n"&gt;A&lt;/span&gt; &lt;span class="n"&gt;lift&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="mf"&gt;0.579&lt;/span&gt;&lt;span class="o"&gt;%&lt;/span&gt;
&lt;span class="n"&gt;Rest&lt;/span&gt; &lt;span class="n"&gt;lift&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;         &lt;span class="mf"&gt;0.460&lt;/span&gt;&lt;span class="o"&gt;%&lt;/span&gt;
&lt;span class="n"&gt;Overall&lt;/span&gt; &lt;span class="n"&gt;lift&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;      &lt;span class="mf"&gt;0.508&lt;/span&gt;&lt;span class="o"&gt;%&lt;/span&gt;
&lt;/pre&gt;&lt;/div&gt;


&lt;p&gt;The result is as expected: the measured lift is much closer to the true lift than with a truly randomly sampled A population (4.3% is much closer to 5% than 1.7% is). It’s interesting that the friends-of-A lift is not much different (0.58% vs 0.60%). If A is more clustered, the set of non-A friends will be smaller (because there are fewer edges going to non-As), but each of them, on average (at least in the large-$n$ limit), still has the same number of A friends, so the boost they get from A will be similar.&lt;/p&gt;
&lt;h2&gt;Conclusion&lt;/h2&gt;
&lt;p&gt;When there are no network effects, or they are weak, a regular A/B test with one of the tests discussed in &lt;a href="http://bytepawn.com/tag/ab-testing.html"&gt;earlier posts&lt;/a&gt; works fine. But if there are strong network effects, these have to be taken into account when estimating lift and p-values. In real life there are many more nuances, both related to the network effects and otherwise (e.g. cannibalizing photo posts when testing video post lift).&lt;/p&gt;</content><category term="ab-testing"></category></entry><entry><title>Early stopping in A/B testing</title><link href="https://bytepawn.com/early-stopping-in-ab-testing.html" rel="alternate"></link><published>2020-03-05T00:00:00+01:00</published><updated>2020-03-05T00:00:00+01:00</updated><author><name>Marton Trencseni</name></author><id>tag:bytepawn.com,2020-03-05:/early-stopping-in-ab-testing.html</id><summary type="html">&lt;p&gt;Increased false positive rate due to early stopping is a beautiful nuance of statistical testing. It is equivalent to running at an overall higher alpha. Data scientists need to be aware of this phenomenon so they can control it and keep their organizations honest about their experimental results.&lt;br/&gt;&lt;br/&gt; &lt;img src="/images/early_stopping.png" alt="Early stopping" style="width: 400px;"/&gt;&lt;/p&gt;</summary><content type="html">&lt;h2&gt;Introduction&lt;/h2&gt;
&lt;p&gt;In the past posts we’ve been computing p values for various &lt;a href="https://en.wikipedia.org/wiki/Frequentist_inference"&gt;frequentist&lt;/a&gt; statistical tests that are useful for A/B testing (&lt;a href="http://bytepawn.com/ab-testing-and-the-ztest.html#ab-testing-and-the-ztest"&gt;Z-test&lt;/a&gt;, &lt;a href="http://bytepawn.com/ab-testing-and-the-ttest.html#ab-testing-and-the-ttest"&gt;t-test&lt;/a&gt;, &lt;a href="http://bytepawn.com/ab-testing-and-the-chi-squared-test.html#ab-testing-and-the-chi-squared-test"&gt;Chi-squared&lt;/a&gt;, &lt;a href="http://bytepawn.com/ab-testing-and-fishers-exact-test.html#ab-testing-and-fishers-exact-test"&gt;Fisher's exact&lt;/a&gt;). When we modeled the A/B test, we assumed the protocol is:&lt;/p&gt;
&lt;ol&gt;
&lt;li&gt;decide what metric we will use to evaluate the test (eg. conversion, timespent, DAU)&lt;/li&gt;
&lt;li&gt;decide how many samples $N$ we will collect&lt;/li&gt;
&lt;li&gt;decide what type of test (eg. t-test or $\chi^2$) we will use&lt;/li&gt;
&lt;li&gt;decide $\alpha$ acceptable false positive rate (FPR)&lt;/li&gt;
&lt;li&gt;collect $N$ samples&lt;/li&gt;
&lt;li&gt;compute $p$ value, if $p &amp;lt; \alpha$ reject the null hypothesis, else accept it&lt;/li&gt;
&lt;/ol&gt;
&lt;p&gt;&lt;a href="https://github.com/mtrencseni/playground/blob/master/AB%20testing%20and%20the%20pitfall%20of%20early%20stopping.ipynb"&gt;The code shown below is up on Github.&lt;/a&gt;&lt;/p&gt;
&lt;h2&gt;Early stopping&lt;/h2&gt;
&lt;p&gt;What happens if the tester is curious or impatient, follows a different protocol, and peeks at the data repeatedly to see if it’s “already significant”? Instead of steps 5-6 above, they:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;collect $N_1$ samples, run hypothesis test on $N’ := N_1$ samples, compute $p$ value, if $p &amp;lt; \alpha$ reject the null hypothesis and stop, else go on&lt;/li&gt;
&lt;li&gt;collect $N_2$ more samples, run hypothesis test on $N’ := N_1 + N_2$ samples, compute $p$ value, if $p &amp;lt; \alpha$ reject the null hypothesis and stop, else go on&lt;/li&gt;
&lt;li&gt;collect $N_3$ more samples...&lt;/li&gt;
&lt;li&gt;stop if $N’ \geq N$&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;We can simulate this &lt;strong&gt;early stopping protocol&lt;/strong&gt; with Monte Carlo code:&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;abtest_episode&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;funnels&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;N&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;prior_observations&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="bp"&gt;None&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
    &lt;span class="n"&gt;observations&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;simulate_abtest&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;funnels&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;N&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;prior_observations&lt;/span&gt; &lt;span class="ow"&gt;is&lt;/span&gt; &lt;span class="ow"&gt;not&lt;/span&gt; &lt;span class="bp"&gt;None&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
        &lt;span class="n"&gt;observations&lt;/span&gt; &lt;span class="o"&gt;+=&lt;/span&gt; &lt;span class="n"&gt;prior_observations&lt;/span&gt;
    &lt;span class="n"&gt;p&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;chi2_contingency&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;observations&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;correction&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="bp"&gt;False&lt;/span&gt;&lt;span class="p"&gt;)[&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;
    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;observations&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;p&lt;/span&gt;

&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;early_stopping_simulation&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;funnels&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;num_simulations&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;episodes&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;alphas&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
    &lt;span class="n"&gt;hits&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;
    &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;_&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="nb"&gt;range&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;num_simulations&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
        &lt;span class="n"&gt;observations&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="bp"&gt;None&lt;/span&gt;
        &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;i&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="nb"&gt;range&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nb"&gt;len&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;episodes&lt;/span&gt;&lt;span class="p"&gt;)):&lt;/span&gt;
            &lt;span class="n"&gt;observations&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;p&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;abtest_episode&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;funnels&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;episodes&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;i&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt; &lt;span class="n"&gt;observations&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
            &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;p&lt;/span&gt; &lt;span class="o"&gt;&amp;lt;=&lt;/span&gt; &lt;span class="n"&gt;alphas&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;i&lt;/span&gt;&lt;span class="p"&gt;]:&lt;/span&gt;
                &lt;span class="n"&gt;hits&lt;/span&gt; &lt;span class="o"&gt;+=&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt;
                &lt;span class="k"&gt;break&lt;/span&gt;
    &lt;span class="n"&gt;p&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;hits&lt;/span&gt; &lt;span class="o"&gt;/&lt;/span&gt; &lt;span class="n"&gt;num_simulations&lt;/span&gt;
    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;p&lt;/span&gt;
&lt;/pre&gt;&lt;/div&gt;
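&lt;p&gt;The &lt;code&gt;simulate_abtest()&lt;/code&gt; helper is defined in the full notebook linked above but not shown in this excerpt; a minimal sketch consistent with how it is used here (one row of outcome counts per funnel, with users routed according to the funnel’s traffic split) might look like this:&lt;/p&gt;

```python
import numpy as np

# Sketch of simulate_abtest(), matching the funnels format used below:
# each funnel is [outcome_probabilities, traffic_split]. Returns a numpy
# array with one row of sampled outcome counts per funnel, so that both
# the += accumulation and chi2_contingency() work on the result.
def simulate_abtest(funnels, N):
    observations = []
    for outcome_probs, traffic_split in funnels:
        n = int(N * traffic_split)                    # users in this variant
        counts = np.random.multinomial(n, outcome_probs)
        observations.append(counts)
    return np.array(observations)
```

With the 50/50 funnels defined below, &lt;code&gt;simulate_abtest(funnels, 1000)&lt;/code&gt; returns a 2x2 table whose rows each sum to 500.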


&lt;p&gt;Let’s assume our A/B test is actually not working (no lift), so both A and B are the same:&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;span class="n"&gt;funnels&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;
    &lt;span class="p"&gt;[[&lt;/span&gt;&lt;span class="mf"&gt;0.50&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mf"&gt;0.50&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt; &lt;span class="mf"&gt;0.5&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt; &lt;span class="c1"&gt;# the first vector is the actual outcomes,&lt;/span&gt;
    &lt;span class="p"&gt;[[&lt;/span&gt;&lt;span class="mf"&gt;0.50&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mf"&gt;0.50&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt; &lt;span class="mf"&gt;0.5&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt; &lt;span class="c1"&gt;# the second is the traffic split&lt;/span&gt;
&lt;span class="p"&gt;]&lt;/span&gt;
&lt;/pre&gt;&lt;/div&gt;


&lt;p&gt;First, let’s check that we get what we expect in the simple case, without early stopping:&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;span class="n"&gt;Ns&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="mi"&gt;3000&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;
&lt;span class="n"&gt;alphas&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="mf"&gt;0.05&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;

&lt;span class="n"&gt;p&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;early_stopping_simulation&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;funnels&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;T&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;Ns&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;alphas&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="k"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;False positive ratio: &lt;/span&gt;&lt;span class="si"&gt;%.3f&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;&lt;/span&gt; &lt;span class="o"&gt;%&lt;/span&gt; &lt;span class="n"&gt;p&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/pre&gt;&lt;/div&gt;


&lt;p&gt;Prints something like:&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;span class="bp"&gt;False&lt;/span&gt; &lt;span class="n"&gt;positive&lt;/span&gt; &lt;span class="n"&gt;ratio&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="mf"&gt;0.057&lt;/span&gt;
&lt;/pre&gt;&lt;/div&gt;


&lt;p&gt;This is what we expect. If the null hypothesis is true (A and B are the same), we expect a false positive rate of $\alpha$; that’s exactly what $\alpha$ controls. Let’s see what happens if we collect the same total number of samples, but follow the early stopping protocol with 2 extra peeks:&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;span class="n"&gt;Ns&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="mi"&gt;1000&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt; &lt;span class="mi"&gt;3&lt;/span&gt;
&lt;span class="n"&gt;alphas&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="mf"&gt;0.05&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt; &lt;span class="mi"&gt;3&lt;/span&gt;

&lt;span class="n"&gt;p&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;early_stopping_simulation&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;funnels&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;T&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;Ns&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;alphas&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="k"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;False positive ratio: &lt;/span&gt;&lt;span class="si"&gt;%.3f&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;&lt;/span&gt; &lt;span class="o"&gt;%&lt;/span&gt; &lt;span class="n"&gt;p&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/pre&gt;&lt;/div&gt;


&lt;p&gt;Prints something like:&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;span class="bp"&gt;False&lt;/span&gt; &lt;span class="n"&gt;positive&lt;/span&gt; &lt;span class="n"&gt;ratio&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="mf"&gt;0.105&lt;/span&gt;
&lt;/pre&gt;&lt;/div&gt;


&lt;p&gt;This is the problem with early stopping! &lt;strong&gt;If we repeatedly perform the significance test at the same $\alpha$ level, the overall $\alpha$ level will be higher.&lt;/strong&gt; If we do this, we will on average have a higher false positive rate than we think. In the above simulation, with 2 extra peeks, at equal $N$ intervals, the FPR roughly doubles!&lt;/p&gt;
&lt;h2&gt;Intuition&lt;/h2&gt;
&lt;p&gt;Why does the FPR go up in the case of early stopping? The best way to see this is to evaluate the $p$ value many times as the samples accumulate, and plot the results. The simulation code is straightforward:&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;repeated_significances&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;funnels&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;episodes&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
    &lt;span class="n"&gt;results&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[]&lt;/span&gt;
    &lt;span class="n"&gt;observations&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="bp"&gt;None&lt;/span&gt;
    &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;i&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="nb"&gt;range&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nb"&gt;len&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;episodes&lt;/span&gt;&lt;span class="p"&gt;)):&lt;/span&gt;
        &lt;span class="n"&gt;observations&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;p&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;abtest_episode&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;funnels&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;episodes&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;i&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt; &lt;span class="n"&gt;observations&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
        &lt;span class="n"&gt;N&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;np&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;sum&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;observations&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
        &lt;span class="n"&gt;results&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;append&lt;/span&gt;&lt;span class="p"&gt;((&lt;/span&gt;&lt;span class="n"&gt;N&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;p&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;
    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;results&lt;/span&gt;
&lt;/pre&gt;&lt;/div&gt;


&lt;p&gt;Let’s evaluate at every 100 samples, 100 times (total $N=10,000$), and run it 3 times:&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;span class="n"&gt;Ns&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="mi"&gt;100&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt; &lt;span class="mi"&gt;100&lt;/span&gt;

&lt;span class="n"&gt;results1&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;repeated_significances&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;funnels&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;Ns&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="n"&gt;results2&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;repeated_significances&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;funnels&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;Ns&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="n"&gt;results3&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;repeated_significances&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;funnels&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;Ns&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="n"&gt;plt&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;figure&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;figsize&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;10&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="mi"&gt;5&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;
&lt;span class="n"&gt;plt&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;xlabel&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;sample size&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="n"&gt;plt&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;ylabel&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;p&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="n"&gt;plt&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;plot&lt;/span&gt;&lt;span class="p"&gt;([&lt;/span&gt;&lt;span class="n"&gt;x&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;x&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;results1&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;x&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;x&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;results1&lt;/span&gt;&lt;span class="p"&gt;])&lt;/span&gt;
&lt;span class="n"&gt;plt&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;plot&lt;/span&gt;&lt;span class="p"&gt;([&lt;/span&gt;&lt;span class="n"&gt;x&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;x&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;results2&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;x&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;x&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;results2&lt;/span&gt;&lt;span class="p"&gt;])&lt;/span&gt;
&lt;span class="n"&gt;plt&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;plot&lt;/span&gt;&lt;span class="p"&gt;([&lt;/span&gt;&lt;span class="n"&gt;x&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;x&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;results3&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;x&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;x&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;results3&lt;/span&gt;&lt;span class="p"&gt;])&lt;/span&gt;
&lt;span class="n"&gt;plt&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;plot&lt;/span&gt;&lt;span class="p"&gt;([&lt;/span&gt;&lt;span class="n"&gt;x&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;x&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;results3&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="mf"&gt;0.05&lt;/span&gt; &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;_&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;results3&lt;/span&gt;&lt;span class="p"&gt;])&lt;/span&gt;
&lt;span class="n"&gt;plt&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;legend&lt;/span&gt;&lt;span class="p"&gt;([&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;Test 1&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s1"&gt;&amp;#39;Test 2&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s1"&gt;&amp;#39;Test 3&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s1"&gt;&amp;#39;p = 0.05&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt; &lt;span class="n"&gt;loc&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;upper right&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="n"&gt;plt&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;show&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
&lt;/pre&gt;&lt;/div&gt;


&lt;p&gt;The result is something like:&lt;/p&gt;
&lt;p&gt;&lt;img src="/images/early_stopping.png" alt="Early stopping" style="width: 800px;"/&gt;&lt;/p&gt;
&lt;p&gt;There is no “correct” $p$ value to compute at any point, since we are collecting samples from a random process. The guarantee of frequentist hypothesis testing (as discussed in past posts) is that if we evaluate the data once, at the end (at $N=10,000$ on the chart), and the null hypothesis is true, then on average in a $1-\alpha$ fraction of cases the $p$ value will be bigger than $\alpha$, and we will correctly accept the null hypothesis. But there is no guarantee about the trajectory of the $p$ value in between. The trajectory is by definition random, so if we repeatedly test against the $p=0.05$ line with an early stopping protocol, we will reject the null hypothesis (the incorrect decision) more often. In the case above, for the green line, we would have done so at the beginning, and for the orange line, we could have done so several times; even though at the end, as it happens, all of these tests would (correctly) accept the null hypothesis.&lt;/p&gt;
&lt;p&gt;A mathy way of saying this is to realize that $P(p_{N} &amp;lt; \alpha | H_0) &amp;lt; P(p_{N_1} &amp;lt; \alpha | H_0) + P(p_{N_1+N_2} &amp;lt; \alpha | H_0 \wedge p_{N_1} &amp;gt; \alpha) + ...$, where $N = N_1 + N_2 + ...$ is the total sample size.&lt;/p&gt;
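To make this concrete, here is a minimal, self-contained sketch (not the post’s own `early_stopping_simulation()`; the two-proportion z-test setup and all names here are illustrative assumptions) that estimates the false positive rate under the null hypothesis when we peek 3 times at equal intervals:

```python
import numpy as np
from math import erf, sqrt

def phi(z):
    # standard normal CDF via the error function
    return 0.5 * (1 + erf(z / sqrt(2)))

# Under H0 both arms convert at 0.5; peek 3 times at equal intervals and
# stop early the first time the two-proportion z-test crosses alpha.
rng = np.random.default_rng(0)
alpha, peeks, n_per_peek, trials = 0.05, 3, 1000, 2000
rejected = 0
for _ in range(trials):
    a = rng.binomial(1, 0.5, peeks * n_per_peek)
    b = rng.binomial(1, 0.5, peeks * n_per_peek)
    for i in range(1, peeks + 1):
        n = i * n_per_peek
        pooled = (a[:n].sum() + b[:n].sum()) / (2 * n)
        se = sqrt(2 * pooled * (1 - pooled) / n)
        z = (a[:n].mean() - b[:n].mean()) / se
        if 2 * (1 - phi(abs(z))) < alpha:
            rejected += 1
            break

print('False positive ratio: %.3f' % (rejected / trials))
```

With these settings the ratio should come out around 0.10, roughly double the nominal $\alpha=0.05$, in line with the simulation above.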
&lt;h2&gt;Alpha spending in sequential trials&lt;/h2&gt;
&lt;p&gt;In itself, an early stopping protocol is not a problem. In the above example, we saw that taking 2 extra peeks at equal $N$ intervals with early stopping at $\alpha=0.05$ each yields an overall $\alpha$ of ~0.10. As long as we know that the overall $\alpha$ of our protocol is what it is, we’re fine. The problem is if we’re not aware of this, and we believe we’re actually operating at a lower $\alpha$, and potentially report a lower $\alpha$ along with the results.&lt;/p&gt;
&lt;p&gt;What if we are mindful of the increase in $\alpha$ that early stopping induces, but want to keep the overall (= real) $\alpha$ at a certain level, say $\alpha=0.05$? Based on the previous simulation, intuitively, this is possible; we just have to test at a lower $\alpha$ at each early stopping opportunity. This is called &lt;a href="https://en.wikipedia.org/wiki/Sequential_analysis"&gt;alpha spending&lt;/a&gt;, because it’s like we have an overall budget of $\alpha$, and we’re spending it in steps. &lt;strong&gt;Note that alpha spending is not additive!&lt;/strong&gt; Let’s look at two protocols that achieve an overall $\alpha=0.05$.&lt;/p&gt;
&lt;p&gt;First, the &lt;a href="https://en.wikipedia.org/wiki/Pocock_boundary"&gt;Pocock boundary, from Wikipedia&lt;/a&gt;:&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;The Pocock boundary gives a p-value threshold for each interim analysis which guides the data monitoring committee on whether to stop the trial. The boundary used depends on the number of interim analyses.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;&lt;img src="/images/pocock.png" alt="Pocock table" style="width: 450px;"/&gt;&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;The Pocock boundary is simple to use in that the p-value threshold is the same at each interim analysis. The disadvantages are that the number of interim analyses must be fixed at the start and it is not possible under this scheme to add analyses after the trial has started. Another disadvantage is that investigators and readers frequently do not understand how the p-values are reported: for example, if there are five interim analyses planned, but the trial is stopped after the third interim analysis because the p-value was 0.01, then the overall p-value for the trial is still reported as &amp;lt;0.05 and not as 0.01.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;So, in the simulated case earlier, if we use $\alpha=0.0221$ at each step, we will achieve an overall $\alpha=0.05$:&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;span class="n"&gt;Ns&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="mi"&gt;1000&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt; &lt;span class="mi"&gt;3&lt;/span&gt;
&lt;span class="n"&gt;alphas&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="mf"&gt;0.0221&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt; &lt;span class="mi"&gt;3&lt;/span&gt;

&lt;span class="n"&gt;p&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;early_stopping_simulation&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;funnels&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;T&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;Ns&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;alphas&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="k"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;False positive ratio: &lt;/span&gt;&lt;span class="si"&gt;%.3f&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;&lt;/span&gt; &lt;span class="o"&gt;%&lt;/span&gt; &lt;span class="n"&gt;p&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/pre&gt;&lt;/div&gt;


&lt;p&gt;Prints something like:&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;span class="bp"&gt;False&lt;/span&gt; &lt;span class="n"&gt;positive&lt;/span&gt; &lt;span class="n"&gt;ratio&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="mf"&gt;0.052&lt;/span&gt;
&lt;/pre&gt;&lt;/div&gt;


&lt;p&gt;It works!&lt;/p&gt;
&lt;p&gt;The &lt;a href="https://en.wikipedia.org/wiki/Haybittle%E2%80%93Peto_boundary"&gt;Haybittle–Peto boundary&lt;/a&gt; is much simpler, but it’s not an exact rule. It essentially says: perform the in-between tests at a very low $\alpha=0.001$, and the final test at the desired $\alpha=0.05$. Because the early steps were performed at such low $\alpha$, they don’t change the overall $\alpha$ by much.&lt;/p&gt;
&lt;p&gt;&lt;img src="/images/haybittle-peto.png" alt="Haybittle-Peto table" style="width: 400px;"/&gt;&lt;/p&gt;
&lt;p&gt;What this is essentially saying is: if you peek early and the null hypothesis is wrong beyond reasonable doubt, ie. the treatment is already better than the control group beyond reasonable doubt at lower $N$s, then stop, else keep going. More mathy: peek at some low $\alpha=\epsilon$, and your additional overall $\alpha$ will be some $\delta_\epsilon$ that is also low.&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;span class="n"&gt;Ns&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="mi"&gt;1000&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt; &lt;span class="mi"&gt;3&lt;/span&gt;
&lt;span class="n"&gt;alphas&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="mf"&gt;0.001&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mf"&gt;0.001&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mf"&gt;0.05&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;

&lt;span class="n"&gt;p&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;early_stopping_simulation&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;funnels&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;T&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;Ns&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;alphas&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="k"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;False positive ratio: &lt;/span&gt;&lt;span class="si"&gt;%.3f&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;&lt;/span&gt; &lt;span class="o"&gt;%&lt;/span&gt; &lt;span class="n"&gt;p&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/pre&gt;&lt;/div&gt;


&lt;p&gt;Prints something like:&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;span class="bp"&gt;False&lt;/span&gt; &lt;span class="n"&gt;positive&lt;/span&gt; &lt;span class="n"&gt;ratio&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="mf"&gt;0.049&lt;/span&gt;
&lt;/pre&gt;&lt;/div&gt;


&lt;p&gt;Obviously, there are countless protocols like this we can construct.&lt;/p&gt;
&lt;p&gt;A final option is not to do any of this, and never stop the experiment until the original sample size of $N$ is reached. This is what I recommend. Note that peeking itself is not a problem, as long as we don’t stop the experiment. Some A/B testing tools go the extra mile and don’t show (blur out) the result before the agreed-upon sample size is reached.&lt;/p&gt;
&lt;p&gt;What is the actual formula for the final $\alpha$, if we follow a protocol of testing at $\alpha_i$ at $N_i$ with early stopping? I’m not going to derive it here; we can just run the simulation above and see what it is. In the real world, we don’t follow exact protocols like this anyway (peeking at exact $N$ steps). In my opinion, it’s enough to be aware of this issue, and control our behaviour. I recommend not stopping early, or following the conservative Haybittle–Peto boundary.&lt;/p&gt;
&lt;h2&gt;What’s the big deal?&lt;/h2&gt;
&lt;p&gt;Why do we even use A/B testing? Why don't we just eyeball the difference? After all, in a typical SaaS A/B testing setting, lives are not at stake, unlike in &lt;a href="https://en.wikipedia.org/wiki/Clinical_trial"&gt;clinical trials&lt;/a&gt;.&lt;/p&gt;
&lt;p&gt;We use formal A/B testing so we don’t fool ourselves. Usually the person or team performing the test has some attachment to the test, because it’s their idea, their code, they invested time into running the experiment, their bonus depends on their impact. We’re biased. By agreeing with ourselves or the team that we will follow a certain protocol to evaluate the experiment (sample size $N$, metric, $\alpha$), we’re eliminating or at least controlling our own bias.&lt;/p&gt;
&lt;p&gt;But, we have to be mindful that A/B testing shouldn’t hold us back. In the post &lt;a href="http://bytepawn.com/ab-tests-moving-fast-vs-being-sure.html#ab-tests-moving-fast-vs-being-sure"&gt;A/B tests: Moving Fast vs Being Sure&lt;/a&gt; I’ve argued that in startup settings it may make sense to run at a higher $\alpha$ (and collect fewer samples per experiment) and accept more false positives, because it will allow us to perform more experiments per year, which ultimately may be worth it.&lt;/p&gt;
&lt;p&gt;Another thing to keep in mind is that early stopping affects $\alpha$, the false positive rate. But false positives only matter for experiments that are not working, ie. when there is no lift from A to B. In the extreme case of all our experiments working, early stopping is actually good, because we will spend less time deciding, and we're "erring" in the right direction. But this is an extreme case; if all our experiments always worked out, there’d be no need to A/B test anyway.&lt;/p&gt;
&lt;p&gt;We can do a back-of-the-envelope calculation to see what happens if we follow an early stopping protocol and (accidentally, or knowingly) run at $\alpha=0.10$ instead of $\alpha=0.05$. Let’s suppose that 1 in 4 ($\gamma=0.25$) experiments actually works, and we’re running at $1-\beta = 0.8$ power, ie. when an experiment works, we catch it 80% of the time. If we run 100 experiments a year:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;25 will work, and we will catch 20 of these, 5 good ones will be misclassified as not working&lt;/li&gt;
&lt;li&gt;75 will not work, if we run at:&lt;ul&gt;
&lt;li&gt;$\alpha=0.05$, then 3.75 bad ones will be misclassified as working&lt;/li&gt;
&lt;li&gt;$\alpha=0.10$, then 7.50 bad ones will be misclassified as working&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;The additional misclassification rate due to an increase $\Delta \alpha$ is $(1 - \gamma) \Delta \alpha$.&lt;/p&gt;
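The bullet points above can be reproduced with a few lines of arithmetic (a quick check using the same illustrative numbers):

```python
# 100 experiments a year, gamma = 0.25 of them actually work,
# power 1 - beta = 0.8, i.e. we catch 80% of the working ones.
experiments, gamma, power = 100, 0.25, 0.8

caught = experiments * gamma * power        # working, and detected
missed = experiments * gamma * (1 - power)  # working, but misclassified
print('caught: %g, missed: %g' % (caught, missed))
for alpha in (0.05, 0.10):
    false_positives = experiments * (1 - gamma) * alpha
    print('alpha=%.2f: %.2f bad ones misclassified as working'
          % (alpha, false_positives))
```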
&lt;h2&gt;Conclusion&lt;/h2&gt;
&lt;p&gt;The increased false positive rate due to early stopping is a beautiful nuance of statistical testing. It is equivalent to running at an overall higher $\alpha$. Data scientists need to be aware of this phenomenon so they can control it and keep their organizations honest about their experimental results.&lt;/p&gt;
&lt;h2&gt;Links&lt;/h2&gt;
&lt;ul&gt;
&lt;li&gt;&lt;a href="https://www.evanmiller.org/how-not-to-run-an-ab-test.html"&gt;How Not To Run an A/B Test&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://www.evanmiller.org/sequential-ab-testing.html"&gt;Simple Sequential A/B Testing&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://codeascraft.com/2018/10/03/how-etsy-handles-peeking-in-a-b-testing/"&gt;How Etsy Handles Peeking in A/B Testing&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;</content><category term="ab-testing"></category></entry><entry><title>A/B testing and Fisher's exact test</title><link href="https://bytepawn.com/ab-testing-and-fishers-exact-test.html" rel="alternate"></link><published>2020-03-03T00:00:00+01:00</published><updated>2020-03-03T00:00:00+01:00</updated><author><name>Marton Trencseni</name></author><id>tag:bytepawn.com,2020-03-03:/ab-testing-and-fishers-exact-test.html</id><summary type="html">&lt;p&gt;Fisher’s exact test directly computes the same p value as the Chi-squared test, so it does not rely on the Central Limit Theorem to hold.&lt;br/&gt;&lt;br/&gt; &lt;img src="/images/fisher2.png" alt="Fisher's test, Fisher Monte Carlo and Chi-squared test p values" style="width: 400px;"/&gt;&lt;/p&gt;</summary><content type="html">&lt;h2&gt;Introduction&lt;/h2&gt;
&lt;p&gt;&lt;a href="https://en.wikipedia.org/wiki/Fisher%27s_exact_test"&gt;Fisher’s exact test&lt;/a&gt; directly computes the same $p$ value as the &lt;a href="https://en.wikipedia.org/wiki/Chi-squared_test"&gt;$\chi^2$ test&lt;/a&gt;, without relying on the &lt;a href="https://en.wikipedia.org/wiki/Central_limit_theorem"&gt;Central Limit Theorem&lt;/a&gt; (CLT) to hold, so it is accurate at low $N$. See the previous post on &lt;a href="http://bytepawn.com/ab-testing-and-the-chi-squared-test.html"&gt;A/B testing and the Chi-squared test&lt;/a&gt; for an introduction to the $\chi^2$ test. The trade-off is that Fisher’s exact test is more computationally intensive, so even at moderate $N$ the direct computation is not feasible. However, &lt;a href="https://en.wikipedia.org/wiki/Monte_Carlo_method"&gt;Monte Carlo sampling&lt;/a&gt; can be used to get estimated results quickly.&lt;/p&gt;
&lt;p&gt;&lt;a href="https://github.com/mtrencseni/playground/blob/master/AB%20testing%20and%20Fisher's%20exact%20test.ipynb"&gt;The code shown below is up on Github.&lt;/a&gt;&lt;/p&gt;
&lt;h2&gt;Diversion: the Binomial test&lt;/h2&gt;
&lt;p&gt;The best way to understand Fisher’s test is by a simpler analogue, coin flips. Suppose somebody gives you a coin, and you’re trying to decide whether the coin is fair. If you can flip it a lot of times, you can use a &lt;a href="https://en.wikipedia.org/wiki/Z-test"&gt;Z-test&lt;/a&gt; to decide whether it’s fair or not, because at high $N$, the CLT holds, and the distribution of heads follows a normal distribution.&lt;/p&gt;
&lt;p&gt;But, what if you’re only allowed to flip it $N=24$ times and you get $H=18$ heads? This is a low $N$, so the Z-test will not work. But, we can just directly compute the p value by computing $ P(H &amp;gt;= 18 \vee H &amp;lt;= 6) $ assuming the null hypothesis of a fair coin toss. Note that we’re doing a two-tailed test here. $ P(H &amp;gt;= 18 \vee H &amp;lt;= 6) = P(H = 0) + ... + P(H = 6) + P(H = 18) + ... + P(H = 24)$, where $ P(H = k) = {n \choose k} p^k q^{n-k} $, where $p = 0.5, q = 1 - p = 0.5$ in this case from the null hypothesis ($p$ is the probability of heads, $q$ of tails).&lt;/p&gt;
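For the $N=24$, $H=18$ example, the two-tailed sum can be computed directly with a couple of lines (a self-contained check using the standard library's `math.comb`):

```python
from math import comb

N, H = 24, 18
# P(H >= 18) under the null hypothesis of a fair coin...
one_tail = sum(comb(N, k) for k in range(H, N + 1)) / 2**N
# ...doubled, since the distribution is symmetric when p = 0.5
p_value = 2 * one_tail
print('%.3f' % p_value)  # prints 0.023
```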
&lt;p&gt;What we’re doing here is called the &lt;a href="https://en.wikipedia.org/wiki/Binomial_test"&gt;Binomial test&lt;/a&gt;. The &lt;a href="https://docs.scipy.org/doc/scipy/reference/stats.html"&gt;scipy stats&lt;/a&gt; package has a library function:&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;span class="n"&gt;p&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;binom_test&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;18&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;24&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mf"&gt;0.5&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="k"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;&lt;/span&gt;&lt;span class="si"&gt;%.3f&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;&lt;/span&gt; &lt;span class="o"&gt;%&lt;/span&gt; &lt;span class="n"&gt;p&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/pre&gt;&lt;/div&gt;


&lt;p&gt;Prints:&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;span class="mf"&gt;0.023&lt;/span&gt;
&lt;/pre&gt;&lt;/div&gt;


&lt;p&gt;We can calculate this ourselves per the above formula; we just have to be careful with the rounding:&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;binomial_test&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;k&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;n&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;p&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
    &lt;span class="n"&gt;delta&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nb"&gt;abs&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;n&lt;/span&gt;&lt;span class="o"&gt;/&lt;/span&gt;&lt;span class="mf"&gt;2.0&lt;/span&gt; &lt;span class="o"&gt;-&lt;/span&gt; &lt;span class="n"&gt;k&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="n"&gt;lo&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;floor&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;n&lt;/span&gt;&lt;span class="o"&gt;/&lt;/span&gt;&lt;span class="mf"&gt;2.0&lt;/span&gt; &lt;span class="o"&gt;-&lt;/span&gt; &lt;span class="n"&gt;delta&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="n"&gt;hi&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;ceil&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;n&lt;/span&gt;&lt;span class="o"&gt;/&lt;/span&gt;&lt;span class="mf"&gt;2.0&lt;/span&gt; &lt;span class="o"&gt;+&lt;/span&gt; &lt;span class="n"&gt;delta&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="n"&gt;p&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;
    &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;i&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;chain&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nb"&gt;range&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;lo&lt;/span&gt; &lt;span class="o"&gt;+&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt; &lt;span class="nb"&gt;range&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;hi&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;n&lt;/span&gt; &lt;span class="o"&gt;+&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;)):&lt;/span&gt;
        &lt;span class="n"&gt;p&lt;/span&gt; &lt;span class="o"&gt;+=&lt;/span&gt; &lt;span class="n"&gt;binom&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;n&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;i&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt; &lt;span class="n"&gt;p&lt;/span&gt;&lt;span class="o"&gt;**&lt;/span&gt;&lt;span class="n"&gt;i&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mf"&gt;1.0&lt;/span&gt; &lt;span class="o"&gt;-&lt;/span&gt; &lt;span class="n"&gt;p&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;&lt;span class="o"&gt;**&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;n&lt;/span&gt;&lt;span class="o"&gt;-&lt;/span&gt;&lt;span class="n"&gt;i&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;p&lt;/span&gt;
&lt;/pre&gt;&lt;/div&gt;


&lt;p&gt;This is a direct p value calculation; it works at any $N$. Let’s double-check what we know. According to the CLT, at high $N$ the average ratio of heads will follow a normal distribution, so we can use the Z-test, and it should yield the same result as the direct calculation above:&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;span class="n"&gt;N&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;10&lt;/span&gt;&lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="mi"&gt;1000&lt;/span&gt;
&lt;span class="n"&gt;H&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;5100&lt;/span&gt; &lt;span class="c1"&gt;# delta of +1% lift&lt;/span&gt;
&lt;span class="n"&gt;p&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mf"&gt;0.5&lt;/span&gt; &lt;span class="c1"&gt;# the null hypothesis&lt;/span&gt;
&lt;span class="n"&gt;raw_data&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;H&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="o"&gt;+&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;N&lt;/span&gt;&lt;span class="o"&gt;-&lt;/span&gt;&lt;span class="n"&gt;H&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;
&lt;span class="k"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;Binom test p: &lt;/span&gt;&lt;span class="si"&gt;%.3f&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;&lt;/span&gt; &lt;span class="o"&gt;%&lt;/span&gt; &lt;span class="n"&gt;binom_test&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;H&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;N&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;p&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;
&lt;span class="k"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;Z-test p:     &lt;/span&gt;&lt;span class="si"&gt;%.3f&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;&lt;/span&gt; &lt;span class="o"&gt;%&lt;/span&gt; &lt;span class="n"&gt;ztest&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;x1&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;raw_data&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;value&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;p&lt;/span&gt;&lt;span class="p"&gt;)[&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;])&lt;/span&gt;
&lt;/pre&gt;&lt;/div&gt;


&lt;p&gt;Prints:&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;span class="n"&gt;Binom&lt;/span&gt; &lt;span class="n"&gt;test&lt;/span&gt; &lt;span class="n"&gt;p&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="mf"&gt;0.047&lt;/span&gt;
&lt;span class="n"&gt;Z&lt;/span&gt;&lt;span class="o"&gt;-&lt;/span&gt;&lt;span class="n"&gt;test&lt;/span&gt; &lt;span class="n"&gt;p&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;     &lt;span class="mf"&gt;0.045&lt;/span&gt;
&lt;/pre&gt;&lt;/div&gt;


&lt;p&gt;Not quite the same, but pretty close. We can see how the exact binomial and the normal-approximation Z-test p values converge thanks to the CLT:&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;span class="n"&gt;p&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mf"&gt;0.5&lt;/span&gt; &lt;span class="c1"&gt;# the null hypothesis&lt;/span&gt;
&lt;span class="n"&gt;actual_lift&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mf"&gt;0.01&lt;/span&gt;
&lt;span class="n"&gt;results&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[]&lt;/span&gt;
&lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;N&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="nb"&gt;range&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;100&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;10&lt;/span&gt;&lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="mi"&gt;1000&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;10&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
    &lt;span class="n"&gt;H&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nb"&gt;int&lt;/span&gt;&lt;span class="p"&gt;((&lt;/span&gt;&lt;span class="n"&gt;p&lt;/span&gt; &lt;span class="o"&gt;+&lt;/span&gt; &lt;span class="n"&gt;actual_lift&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt; &lt;span class="n"&gt;N&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="n"&gt;raw_data&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;H&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="o"&gt;+&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;N&lt;/span&gt;&lt;span class="o"&gt;-&lt;/span&gt;&lt;span class="n"&gt;H&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;
    &lt;span class="n"&gt;p_binom&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;binom_test&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;H&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;N&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;p&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="n"&gt;p_z&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;ztest&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;x1&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;raw_data&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;value&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;p&lt;/span&gt;&lt;span class="p"&gt;)[&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;
    &lt;span class="n"&gt;p_diff&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nb"&gt;abs&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;p_binom&lt;/span&gt; &lt;span class="o"&gt;-&lt;/span&gt; &lt;span class="n"&gt;p_z&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="n"&gt;results&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;append&lt;/span&gt;&lt;span class="p"&gt;((&lt;/span&gt;&lt;span class="n"&gt;N&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;p_diff&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;

&lt;span class="n"&gt;plt&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;xlabel&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;sample size&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="n"&gt;plt&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;ylabel&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;|p z-test - p exact binomial|&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="n"&gt;plt&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;plot&lt;/span&gt;&lt;span class="p"&gt;([&lt;/span&gt;&lt;span class="n"&gt;x&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;x&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;results&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;x&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;x&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;results&lt;/span&gt;&lt;span class="p"&gt;])&lt;/span&gt;
&lt;span class="n"&gt;plt&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;show&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
&lt;/pre&gt;&lt;/div&gt;


&lt;p&gt;The difference goes to zero in the $N \rightarrow \infty $ limit:&lt;/p&gt;
&lt;p&gt;&lt;img src="/images/fisher1.png" alt="Binomial test and Z-test p value difference" style="width: 600px;"/&gt;&lt;/p&gt;
&lt;h2&gt;Fisher’s exact test&lt;/h2&gt;
&lt;p&gt;What the binomial test is to the Z-test, &lt;a href="https://en.wikipedia.org/wiki/Fisher%27s_exact_test"&gt;Fisher’s exact test&lt;/a&gt; is to the $\chi^2$ test. It’s a direct calculation of the p value for an $F \times C$ contingency table. Fisher’s exact test is accurate at all sample sizes $N$, and the $\chi^2$ test’s p value converges to it at high $N$, similar to the case above.&lt;/p&gt;
&lt;p&gt;The null hypothesis is that all funnels have the same conversion event probabilities. Given the $F \times C$ contingency table outcome of an A/B test ($F$ funnels tested, $C$ mutually exclusive conversion events), the calculation of the p value is:&lt;/p&gt;
&lt;ol&gt;
&lt;li&gt;first, calculate the marginals:&lt;ul&gt;
&lt;li&gt;row marginals: how many users were randomly assigned into each funnel in the A/B test&lt;/li&gt;
&lt;li&gt;column marginals: across the tested funnels, conversion event totals&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;li&gt;given the marginals, calculate the probability of the observed outcome&lt;/li&gt;
&lt;li&gt;for all the ways we can change the numbers in the contingency table while keeping the marginals fixed, take the ones that have equal or lower probability than the actual outcome, and add up those probabilities; this is the p value&lt;/li&gt;
&lt;/ol&gt;
&lt;p&gt;The trick is how to calculate the quantity “given the marginals, what is the probability of a specific outcome (numbers in the contingency table that add up to the marginals)?”; we need this in both steps 2 and 3. For this we have to use the &lt;a href="https://en.wikipedia.org/wiki/Hypergeometric_distribution"&gt;hypergeometric distribution&lt;/a&gt;, the distribution of “urn draws”. Let’s reuse the contingency table from the previous post:&lt;/p&gt;
&lt;p&gt;&lt;img src="/images/contingency_table3.PNG" alt="Contingency table" style="width: 600px;"/&gt;&lt;/p&gt;
&lt;p&gt;Imagine this: we have a total of $N=10,000$ marbles. Each marble is one of $C=3$ colors (&lt;strong&gt;No conversion, Monthly, Annual&lt;/strong&gt;). There are a total of 7,922 &lt;strong&gt;No conversion&lt;/strong&gt; marbles, 1,085 &lt;strong&gt;Monthly&lt;/strong&gt; conversion marbles, etc. All these marbles are in one big urn. We start drawing marbles; what’s the probability that the first 5,916 drawn will be colored &lt;strong&gt;(No conversion, Monthly, Annual) = (4748, 595, 573)&lt;/strong&gt;, irrespective of the order they are drawn? We can break this into two probabilities that we multiply: what is the probability that of 5,916 drawn the colors are &lt;strong&gt;(No conversion, Rest) = (4748, 595+573)&lt;/strong&gt; from an urn that contains &lt;strong&gt;(No conversion, Rest) = (7922, 1085+993)&lt;/strong&gt; marbles, multiplied by, what is the probability that of the remaining 595+573=1,168 drawn the colors are &lt;strong&gt;(Monthly, Annual) = (595, 573)&lt;/strong&gt; from an urn that contains &lt;strong&gt;(Monthly, Annual) = (1085, 993)&lt;/strong&gt; marbles. These individual probabilities are given by the hypergeometric probability $P(X=k | N, K, n)$, i.e. the probability of drawing $k$ red marbles out of a total of $n$ drawn ($k \leq n$) from an urn that contains a total of $N$ marbles, $K$ of which are red. It is $P(X=k | N, K, n) = \frac{ { K \choose k} { N-K \choose n-k } }{ {N \choose n} }$. Then, we go on and calculate the same probabilities in the second row, &lt;strong&gt;but keeping in mind that we have already removed (4748, 595, 573) marbles from the urn&lt;/strong&gt;. The ordering of the rows doesn’t matter.&lt;/p&gt;
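&lt;p&gt;As a sanity check, the formula can be evaluated directly with exact binomial coefficients and compared to scipy’s &lt;code&gt;hypergeom.pmf&lt;/code&gt; (a minimal sketch; the function name is mine):&lt;/p&gt;

```python
from fractions import Fraction
from math import comb
from scipy.stats import hypergeom

def hypergeom_pmf(k, N, K, n):
    # P(X=k | N, K, n): probability of drawing k red marbles out of n drawn,
    # from an urn of N marbles of which K are red; exact integer arithmetic
    return float(Fraction(comb(K, k) * comb(N - K, n - k), comb(N, n)))

# first factor from the text: 4,748 No conversion marbles among the
# 5,916 drawn, from an urn of 10,000 with 7,922 No conversion marbles
print(hypergeom_pmf(4748, 10000, 7922, 5916))
print(hypergeom.pmf(4748, 10000, 7922, 5916))  # should agree
```

&lt;p&gt;Multiplying this by the pmf of the second factor, (595, 573) drawn from an urn of (1085, 993), gives the probability of the whole first row.&lt;/p&gt;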
&lt;p&gt;The formula $P(X=k | N, K, n)$ above is implemented by the &lt;a href="https://docs.scipy.org/doc/scipy-0.14.0/reference/generated/scipy.stats.hypergeom.html"&gt;scipy hypergeometric probability function&lt;/a&gt; &lt;code&gt;hypergeom.pmf&lt;/code&gt;; with it, we can implement the calculation of the overall probability of a contingency table:&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;hypergeom_probability&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;observations&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
    &lt;span class="n"&gt;row_marginals&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;np&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;sum&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;observations&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;axis&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="n"&gt;col_marginals&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;np&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;sum&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;observations&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;axis&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="n"&gt;N&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;np&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;sum&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;observations&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="n"&gt;p&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt;
    &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;i&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="nb"&gt;range&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nb"&gt;len&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;observations&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;-&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
        &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;j&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="nb"&gt;range&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nb"&gt;len&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;observations&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;i&lt;/span&gt;&lt;span class="p"&gt;])&lt;/span&gt; &lt;span class="o"&gt;-&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
            &lt;span class="n"&gt;p&lt;/span&gt; &lt;span class="o"&gt;*=&lt;/span&gt; &lt;span class="n"&gt;hypergeom&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;pmf&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
              &lt;span class="n"&gt;observations&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;i&lt;/span&gt;&lt;span class="p"&gt;][&lt;/span&gt;&lt;span class="n"&gt;j&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt;
              &lt;span class="n"&gt;np&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;sum&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;col_marginals&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;j&lt;/span&gt;&lt;span class="p"&gt;:]),&lt;/span&gt;
              &lt;span class="n"&gt;col_marginals&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;j&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt;
              &lt;span class="n"&gt;row_marginals&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;i&lt;/span&gt;&lt;span class="p"&gt;])&lt;/span&gt;
            &lt;span class="n"&gt;row_marginals&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;i&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="o"&gt;-=&lt;/span&gt; &lt;span class="n"&gt;observations&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;i&lt;/span&gt;&lt;span class="p"&gt;][&lt;/span&gt;&lt;span class="n"&gt;j&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;
        &lt;span class="n"&gt;col_marginals&lt;/span&gt; &lt;span class="o"&gt;-=&lt;/span&gt; &lt;span class="n"&gt;observations&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;i&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;
    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;p&lt;/span&gt;
&lt;/pre&gt;&lt;/div&gt;
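&lt;p&gt;A quick sanity check: for a 2x2 table, the probability of the table given its marginals reduces to a single hypergeometric pmf. The sketch below restates the function (minus the unused total) so it runs standalone:&lt;/p&gt;

```python
import numpy as np
from scipy.stats import hypergeom

# hypergeom_probability restated from the listing above
def hypergeom_probability(observations):
    row_marginals = np.sum(observations, axis=1)
    col_marginals = np.sum(observations, axis=0)
    p = 1
    for i in range(len(observations) - 1):
        for j in range(len(observations[i]) - 1):
            p *= hypergeom.pmf(
              observations[i][j],
              np.sum(col_marginals[j:]),
              col_marginals[j],
              row_marginals[i])
            row_marginals[i] -= observations[i][j]
        col_marginals -= observations[i]
    return p

table = np.array([[37, 16], [31, 16]])
print(hypergeom_probability(table))
print(hypergeom.pmf(37, 100, 68, 53))  # should be identical
```

&lt;p&gt;Note that the loops only run over the first $F-1$ rows and $C-1$ columns: once those cells are fixed, the marginals determine the rest of the table.&lt;/p&gt;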


&lt;p&gt;We can now run an A/B test and, whatever the outcome, compute the probability of that specific outcome, which will be a very small number:&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;span class="n"&gt;funnels&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;
    &lt;span class="p"&gt;[[&lt;/span&gt;&lt;span class="mf"&gt;0.80&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mf"&gt;0.10&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mf"&gt;0.10&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt; &lt;span class="mf"&gt;0.6&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt; &lt;span class="c1"&gt;# the first vector is the actual outcomes,&lt;/span&gt;
    &lt;span class="p"&gt;[[&lt;/span&gt;&lt;span class="mf"&gt;0.80&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mf"&gt;0.10&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mf"&gt;0.10&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt; &lt;span class="mf"&gt;0.2&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt; &lt;span class="c1"&gt;# the second is the traffic split&lt;/span&gt;
    &lt;span class="p"&gt;[[&lt;/span&gt;&lt;span class="mf"&gt;0.80&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mf"&gt;0.10&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mf"&gt;0.10&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt; &lt;span class="mf"&gt;0.2&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt;
&lt;span class="p"&gt;]&lt;/span&gt;
&lt;span class="n"&gt;N&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;50&lt;/span&gt;&lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="mi"&gt;1000&lt;/span&gt;

&lt;span class="n"&gt;observations&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;simulate_abtest&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;funnels&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;N&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="n"&gt;hypergeom_probability&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;observations&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/pre&gt;&lt;/div&gt;


&lt;p&gt;But this is not the p value! This is just step 2. To get the p value, per step 3, we have to add up the probabilities of all possible numbers in the contingency table that add up to the marginals and have lower probability than the actual observations (= are more extreme).&lt;/p&gt;
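&lt;p&gt;For a 2x2 table the enumeration in step 3 is tractable: with the marginals fixed, a 2x2 table has a single degree of freedom, so we can sweep one cell (a sketch; the function name is mine):&lt;/p&gt;

```python
from scipy.stats import hypergeom

def fisher_2x2_p(table):
    # exact two-sided p value for a 2x2 table by full enumeration:
    # sweep the top-left cell over all values allowed by the marginals
    (a, b), (c, d) = table
    N, row1, col1 = a + b + c + d, a + b, a + c
    p_obs = hypergeom.pmf(a, N, col1, row1)
    p_value = 0.0
    for k in range(max(0, row1 + col1 - N), min(row1, col1) + 1):
        p_k = hypergeom.pmf(k, N, col1, row1)
        if p_k <= p_obs * (1 + 1e-7):  # tolerance for float ties
            p_value += p_k
    return p_value

print('%.3f' % fisher_2x2_p([[37, 16], [31, 16]]))
```

&lt;p&gt;For larger $F \times C$ tables there is no single free cell to sweep, which is why we resort to a Monte Carlo sum below.&lt;/p&gt;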
&lt;p&gt;But, even at moderate $N$, there are too many combinations! Instead, what we’ll do is a Monte Carlo (MC) sum. We will randomly sample combinations that satisfy the marginals, and compute the ratio of cases that have lower probability than the given A/B test outcome (= more extreme outcomes):&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;multi_hypergeom_sample&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;m&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;colors&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
    &lt;span class="n"&gt;remaining&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;np&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;cumsum&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;colors&lt;/span&gt;&lt;span class="p"&gt;[::&lt;/span&gt;&lt;span class="o"&gt;-&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;])[::&lt;/span&gt;&lt;span class="o"&gt;-&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;
    &lt;span class="n"&gt;result&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;np&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;zeros&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nb"&gt;len&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;colors&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt; &lt;span class="n"&gt;dtype&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;np&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;int&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;i&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="nb"&gt;range&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nb"&gt;len&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;colors&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;&lt;span class="o"&gt;-&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
        &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;m&lt;/span&gt; &lt;span class="o"&gt;&amp;lt;&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
            &lt;span class="k"&gt;break&lt;/span&gt;
        &lt;span class="n"&gt;result&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;i&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;np&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;random&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;hypergeometric&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;colors&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;i&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt; &lt;span class="n"&gt;remaining&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;i&lt;/span&gt;&lt;span class="o"&gt;+&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt; &lt;span class="n"&gt;m&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
        &lt;span class="n"&gt;m&lt;/span&gt; &lt;span class="o"&gt;-=&lt;/span&gt; &lt;span class="n"&gt;result&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;i&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;
    &lt;span class="n"&gt;result&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="o"&gt;-&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;m&lt;/span&gt;
    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;result&lt;/span&gt;

&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;sample_once&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;observations&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
    &lt;span class="n"&gt;row_marginals&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;np&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;sum&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;observations&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;axis&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="n"&gt;col_marginals&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;np&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;sum&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;observations&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;axis&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="n"&gt;sample&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;np&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;copy&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;observations&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;i&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="nb"&gt;range&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nb"&gt;len&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;row_marginals&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;-&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
        &lt;span class="n"&gt;sample&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;i&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;multi_hypergeom_sample&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;row_marginals&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;i&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt; &lt;span class="n"&gt;col_marginals&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
        &lt;span class="n"&gt;col_marginals&lt;/span&gt; &lt;span class="o"&gt;-=&lt;/span&gt; &lt;span class="n"&gt;sample&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;i&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;
    &lt;span class="n"&gt;sample&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="nb"&gt;len&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;sample&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;-&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;col_marginals&lt;/span&gt;
    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;sample&lt;/span&gt;

&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;fisher_monte_carlo&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;observations&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;num_simulations&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mi"&gt;10&lt;/span&gt;&lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="mi"&gt;1000&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
    &lt;span class="n"&gt;p_obs&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;hypergeom_probability&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;observations&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="n"&gt;hits&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;
    &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;_&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="nb"&gt;range&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;num_simulations&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
        &lt;span class="n"&gt;sample&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;sample_once&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;observations&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
        &lt;span class="n"&gt;p_sample&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;hypergeom_probability&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;sample&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
        &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;p_sample&lt;/span&gt; &lt;span class="o"&gt;&amp;lt;=&lt;/span&gt; &lt;span class="n"&gt;p_obs&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
            &lt;span class="n"&gt;hits&lt;/span&gt; &lt;span class="o"&gt;+=&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt;
    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;hits&lt;/span&gt;&lt;span class="o"&gt;/&lt;/span&gt;&lt;span class="n"&gt;num_simulations&lt;/span&gt;
&lt;/pre&gt;&lt;/div&gt;
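&lt;p&gt;Since the Monte Carlo p value is itself just a proportion of hits, its sampling error can be estimated with the usual binomial standard error (a small helper; the name is mine):&lt;/p&gt;

```python
from math import sqrt

def mc_standard_error(p_hat, num_simulations=10 * 1000):
    # the MC estimate is a hit proportion, so its standard error is
    # that of a binomial proportion
    return sqrt(p_hat * (1 - p_hat) / num_simulations)

# e.g. a p value estimated at 0.831 from 10,000 samples is good to ~0.004
print(mc_standard_error(0.831))
```

&lt;p&gt;When the estimated p value is close to the significance threshold, it’s worth increasing &lt;code&gt;num_simulations&lt;/code&gt; to shrink this error.&lt;/p&gt;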


&lt;p&gt;Our implementation above works for arbitrary $F \times C$ contingency tables. The scipy stats library has Fisher’s exact test built in, but it only works for 2x2 tables. And remember, we also have the $\chi^2$ test. Let’s compare the three on a 2x2 case:&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;span class="n"&gt;funnels&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;
    &lt;span class="p"&gt;[[&lt;/span&gt;&lt;span class="mf"&gt;0.60&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mf"&gt;0.40&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt; &lt;span class="mf"&gt;0.5&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt; &lt;span class="c1"&gt;# the first vector is the actual outcomes,&lt;/span&gt;
    &lt;span class="p"&gt;[[&lt;/span&gt;&lt;span class="mf"&gt;0.60&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mf"&gt;0.40&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt; &lt;span class="mf"&gt;0.5&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt; &lt;span class="c1"&gt;# the second is the traffic split&lt;/span&gt;
&lt;span class="p"&gt;]&lt;/span&gt;
&lt;span class="n"&gt;N&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;100&lt;/span&gt;

&lt;span class="n"&gt;observations&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;simulate_abtest&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;funnels&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;N&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="k"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;observations&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="n"&gt;ch&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;chi2_contingency&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;observations&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;correction&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="bp"&gt;False&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="k"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;chi-squared p = &lt;/span&gt;&lt;span class="si"&gt;%.3f&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;&lt;/span&gt; &lt;span class="o"&gt;%&lt;/span&gt; &lt;span class="n"&gt;ch&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;])&lt;/span&gt;
&lt;span class="n"&gt;fs&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;fisher_exact&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;observations&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="k"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;fishers exact p = &lt;/span&gt;&lt;span class="si"&gt;%.3f&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;&lt;/span&gt; &lt;span class="o"&gt;%&lt;/span&gt; &lt;span class="n"&gt;fs&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;])&lt;/span&gt;
&lt;span class="n"&gt;fmc&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;fisher_monte_carlo&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;observations&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="k"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s2"&gt;&amp;quot;&amp;quot;&amp;quot;monte carlo p = &lt;/span&gt;&lt;span class="si"&gt;%.3f&lt;/span&gt;&lt;span class="s2"&gt;&amp;quot;&amp;quot;&amp;quot;&lt;/span&gt; &lt;span class="o"&gt;%&lt;/span&gt; &lt;span class="n"&gt;fmc&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/pre&gt;&lt;/div&gt;


&lt;p&gt;Prints something like:&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;span class="p"&gt;[[&lt;/span&gt;&lt;span class="mf"&gt;37.&lt;/span&gt; &lt;span class="mf"&gt;16.&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;
 &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="mf"&gt;31.&lt;/span&gt; &lt;span class="mf"&gt;16.&lt;/span&gt;&lt;span class="p"&gt;]]&lt;/span&gt;
&lt;span class="n"&gt;chi&lt;/span&gt;&lt;span class="o"&gt;-&lt;/span&gt;&lt;span class="n"&gt;squared&lt;/span&gt; &lt;span class="n"&gt;p&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mf"&gt;0.680&lt;/span&gt;
&lt;span class="n"&gt;fishers&lt;/span&gt; &lt;span class="n"&gt;exact&lt;/span&gt; &lt;span class="n"&gt;p&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mf"&gt;0.830&lt;/span&gt;
&lt;span class="n"&gt;monte&lt;/span&gt; &lt;span class="n"&gt;carlo&lt;/span&gt; &lt;span class="n"&gt;p&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mf"&gt;0.831&lt;/span&gt;
&lt;/pre&gt;&lt;/div&gt;


&lt;p&gt;The p value coming from &lt;code&gt;fisher_exact()&lt;/code&gt; is exact, the one coming from &lt;code&gt;chi2_contingency()&lt;/code&gt; relies on the Central Limit Theorem, while the one coming from our own &lt;code&gt;fisher_monte_carlo()&lt;/code&gt; is a Monte Carlo approximation. Similar to the binomial coin toss example above, let’s see what happens to the differences in $p$ values with increasing $N$, given two funnels and an actual conversion lift of 1%:&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;span class="n"&gt;base_conversion&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mf"&gt;0.5&lt;/span&gt;
&lt;span class="n"&gt;traffic_split&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mf"&gt;0.5&lt;/span&gt;
&lt;span class="n"&gt;actual_lift&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mf"&gt;0.01&lt;/span&gt;

&lt;span class="n"&gt;results&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[]&lt;/span&gt;
&lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;N&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="nb"&gt;range&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;100&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;20&lt;/span&gt;&lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="mi"&gt;1000&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;200&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
    &lt;span class="n"&gt;observations&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;
        &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="nb"&gt;int&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;base_conversion&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt; &lt;span class="n"&gt;traffic_split&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt; &lt;span class="n"&gt;N&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt;
         &lt;span class="nb"&gt;int&lt;/span&gt;&lt;span class="p"&gt;((&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="o"&gt;-&lt;/span&gt;&lt;span class="n"&gt;base_conversion&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt; &lt;span class="n"&gt;traffic_split&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt; &lt;span class="n"&gt;N&lt;/span&gt;&lt;span class="p"&gt;)],&lt;/span&gt;
        &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="nb"&gt;int&lt;/span&gt;&lt;span class="p"&gt;((&lt;/span&gt;&lt;span class="n"&gt;base_conversion&lt;/span&gt;&lt;span class="o"&gt;+&lt;/span&gt;&lt;span class="n"&gt;actual_lift&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="o"&gt;-&lt;/span&gt;&lt;span class="n"&gt;traffic_split&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt; &lt;span class="n"&gt;N&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt;
         &lt;span class="nb"&gt;int&lt;/span&gt;&lt;span class="p"&gt;((&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="o"&gt;-&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;base_conversion&lt;/span&gt;&lt;span class="o"&gt;+&lt;/span&gt;&lt;span class="n"&gt;actual_lift&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="o"&gt;-&lt;/span&gt;&lt;span class="n"&gt;traffic_split&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt; &lt;span class="n"&gt;N&lt;/span&gt;&lt;span class="p"&gt;)],&lt;/span&gt;
    &lt;span class="p"&gt;]&lt;/span&gt;
    &lt;span class="n"&gt;p_chi2&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;chi2_contingency&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;observations&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;correction&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="bp"&gt;False&lt;/span&gt;&lt;span class="p"&gt;)[&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;
    &lt;span class="n"&gt;p_fish&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;fisher_exact&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;observations&lt;/span&gt;&lt;span class="p"&gt;)[&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;
    &lt;span class="n"&gt;p_fmc&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;fisher_monte_carlo&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;observations&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;num_simulations&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mi"&gt;100&lt;/span&gt;&lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="mi"&gt;1000&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="n"&gt;p_diff_chi2_fish&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nb"&gt;abs&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;p_chi2&lt;/span&gt; &lt;span class="o"&gt;-&lt;/span&gt; &lt;span class="n"&gt;p_fish&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="n"&gt;p_diff_chi2_fmc&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nb"&gt;abs&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;p_chi2&lt;/span&gt; &lt;span class="o"&gt;-&lt;/span&gt; &lt;span class="n"&gt;p_fmc&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="n"&gt;p_diff_fish_fmc&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nb"&gt;abs&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;p_fish&lt;/span&gt; &lt;span class="o"&gt;-&lt;/span&gt; &lt;span class="n"&gt;p_fmc&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="n"&gt;results&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;append&lt;/span&gt;&lt;span class="p"&gt;((&lt;/span&gt;&lt;span class="n"&gt;N&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;p_diff_chi2_fish&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;p_diff_chi2_fmc&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;p_diff_fish_fmc&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;

&lt;span class="n"&gt;plt&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;figure&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;figsize&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;10&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="mi"&gt;5&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;
&lt;span class="n"&gt;plt&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;xlabel&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;sample size&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="n"&gt;plt&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;ylabel&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s2"&gt;&amp;quot;&amp;quot;&amp;quot;p diff&amp;quot;&amp;quot;&amp;quot;&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="n"&gt;plt&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;plot&lt;/span&gt;&lt;span class="p"&gt;([&lt;/span&gt;&lt;span class="n"&gt;x&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;x&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;results&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;x&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;x&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;results&lt;/span&gt;&lt;span class="p"&gt;])&lt;/span&gt;
&lt;span class="n"&gt;plt&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;plot&lt;/span&gt;&lt;span class="p"&gt;([&lt;/span&gt;&lt;span class="n"&gt;x&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;x&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;results&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;x&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="mi"&gt;2&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;x&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;results&lt;/span&gt;&lt;span class="p"&gt;])&lt;/span&gt;
&lt;span class="n"&gt;plt&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;plot&lt;/span&gt;&lt;span class="p"&gt;([&lt;/span&gt;&lt;span class="n"&gt;x&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;x&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;results&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;x&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="mi"&gt;3&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;x&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;results&lt;/span&gt;&lt;span class="p"&gt;])&lt;/span&gt;
&lt;span class="n"&gt;plt&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;legend&lt;/span&gt;&lt;span class="p"&gt;([&lt;/span&gt;
  &lt;span class="s1"&gt;&amp;#39;| chi^2 - fisher exact |&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
  &lt;span class="s1"&gt;&amp;#39;| chi^2 - fisher mc |&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
  &lt;span class="s1"&gt;&amp;#39;| fisher exact - fisher mc |&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt;
  &lt;span class="n"&gt;loc&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;upper right&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="n"&gt;plt&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;show&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
&lt;/pre&gt;&lt;/div&gt;


&lt;p&gt;The result is:&lt;/p&gt;
&lt;p&gt;&lt;img src="/images/fisher2.png" alt="Fisher's test, Fisher Monte Carlo and Chi-squared test p values" style="width: 600px;"/&gt;&lt;/p&gt;
&lt;p&gt;This is a very interesting plot:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;The difference between Fisher’s exact and the MC (green line) wiggles around 0; it is not exactly 0 because of the random nature of MC.&lt;/li&gt;
&lt;li&gt;The difference between Fisher’s exact and the $\chi^2$ (blue line) converges to 0 smoothly, like the difference between the z-test and the binomial test in the introduction.&lt;/li&gt;
&lt;li&gt;The difference between the $\chi^2$ and the MC (orange line) follows the blue line, but is a bit more random, again because of the random nature of MC.&lt;/li&gt;
&lt;li&gt;We could make the MC curve smoother by running more simulations; here it drew $m=100,000$ samples per point, and running 10x longer would reduce the wiggle further.&lt;/li&gt;
&lt;/ul&gt;
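&lt;p&gt;On the last point, the size of the wiggle can be estimated directly: the Monte Carlo p value is the mean of $m$ Bernoulli indicator variables, so its standard error is $\sqrt{p(1-p)/m}$. A quick sketch (the &lt;code&gt;mc_standard_error()&lt;/code&gt; helper is illustrative, not part of the post’s code):&lt;/p&gt;

```python
import math

def mc_standard_error(p, m):
    # A Monte Carlo p value is the sample mean of m Bernoulli(p) indicators,
    # so its standard error is sqrt(p * (1 - p) / m).
    return math.sqrt(p * (1 - p) / m)

# With m = 100,000 simulations and a p value around 0.8,
# the expected wiggle is on the order of 0.001:
print(mc_standard_error(0.8, 100 * 1000))
# 10x more simulations only shrink the noise by a factor of sqrt(10):
print(mc_standard_error(0.8, 1000 * 1000))
```

&lt;p&gt;This is why 10x more samples buy only about 3x less noise.&lt;/p&gt;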
&lt;p&gt;This gives us confidence that our MC implementation is correct. Moreover, unlike Fisher’s exact test, and like the $\chi^2$ test, our Monte Carlo version also works on bigger contingency tables:&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;span class="n"&gt;funnels&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;
    &lt;span class="p"&gt;[[&lt;/span&gt;&lt;span class="mf"&gt;0.60&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mf"&gt;0.20&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mf"&gt;0.20&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt; &lt;span class="mf"&gt;0.6&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt; &lt;span class="c1"&gt;# the first vector is the actual outcomes,&lt;/span&gt;
    &lt;span class="p"&gt;[[&lt;/span&gt;&lt;span class="mf"&gt;0.60&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mf"&gt;0.20&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mf"&gt;0.20&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt; &lt;span class="mf"&gt;0.2&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt; &lt;span class="c1"&gt;# the second is the traffic split&lt;/span&gt;
    &lt;span class="p"&gt;[[&lt;/span&gt;&lt;span class="mf"&gt;0.70&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mf"&gt;0.15&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mf"&gt;0.15&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt; &lt;span class="mf"&gt;0.2&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt;
&lt;span class="p"&gt;]&lt;/span&gt;
&lt;span class="n"&gt;N&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;1000&lt;/span&gt;

&lt;span class="n"&gt;observations&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;simulate_abtest&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;funnels&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;N&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="k"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;observations&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="n"&gt;ch&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;chi2_contingency&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;observations&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;correction&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="bp"&gt;False&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="k"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;chi-squared p = &lt;/span&gt;&lt;span class="si"&gt;%.3f&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;&lt;/span&gt; &lt;span class="o"&gt;%&lt;/span&gt; &lt;span class="n"&gt;ch&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;])&lt;/span&gt;
&lt;span class="n"&gt;mcp&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;fisher_monte_carlo&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;observations&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;num_simulations&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mi"&gt;100&lt;/span&gt;&lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="mi"&gt;1000&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="k"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s2"&gt;&amp;quot;&amp;quot;&amp;quot;monte carlo p = &lt;/span&gt;&lt;span class="si"&gt;%.3f&lt;/span&gt;&lt;span class="s2"&gt;&amp;quot;&amp;quot;&amp;quot;&lt;/span&gt; &lt;span class="o"&gt;%&lt;/span&gt; &lt;span class="n"&gt;mcp&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/pre&gt;&lt;/div&gt;


&lt;p&gt;Prints something like:&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;span class="k"&gt;[[368. 113. 112.]&lt;/span&gt;
 &lt;span class="k"&gt;[126.  45.  55.]&lt;/span&gt;
 &lt;span class="k"&gt;[125.  26.  30.]]&lt;/span&gt;
&lt;span class="na"&gt;chi-squared p&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="s"&gt;0.076&lt;/span&gt;
&lt;span class="na"&gt;monte carlo p&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="s"&gt;0.078&lt;/span&gt;
&lt;/pre&gt;&lt;/div&gt;
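&lt;p&gt;For reference, since the body of &lt;code&gt;fisher_monte_carlo()&lt;/code&gt; was defined earlier in the post, here is an illustrative permutation-style sketch of the same idea (not necessarily the post’s exact implementation): keep the row and column margins fixed, shuffle the column labels of the individual observations, rebuild the table, and count how often the resampled table is at least as extreme (measured here by the $\chi^2$ statistic) as the observed one:&lt;/p&gt;

```python
import numpy as np
from scipy.stats import chi2_contingency

def fisher_mc_sketch(observations, num_simulations=10_000, seed=None):
    # Permutation test with fixed margins: each user is a (row, column) pair;
    # shuffling the column labels simulates the null of no association.
    rng = np.random.default_rng(seed)
    observations = np.asarray(observations, dtype=int)
    rows = np.repeat(np.arange(observations.shape[0]), observations.sum(axis=1))
    cols = np.repeat(np.arange(observations.shape[1]), observations.sum(axis=0))
    observed_stat = chi2_contingency(observations, correction=False)[0]
    hits = 0
    for _ in range(num_simulations):
        rng.shuffle(cols)
        table = np.zeros_like(observations)
        np.add.at(table, (rows, cols), 1)  # rebuild the contingency table
        if chi2_contingency(table, correction=False)[0] >= observed_stat:
            hits += 1
    return hits / num_simulations

# should land near the fisher_exact p of 0.830 reported above
print(fisher_mc_sketch([[37, 16], [31, 16]], num_simulations=10_000, seed=0))
```

&lt;p&gt;Because each user’s column label is re-assigned under the null of no association, this samples the same fixed-margin null distribution that Fisher’s exact test enumerates.&lt;/p&gt;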


&lt;h2&gt;Conclusion&lt;/h2&gt;
&lt;p&gt;At this point we have three tests we can use for conversion experiments: Fisher’s exact, Fisher MC and $\chi^2$. At low $N$, the Fisher variants are more accurate; the exact one (the scipy stats library implementation) only works on 2x2 contingency tables, while our MC version works for any $F \times C$ case. If we let the MC collect enough samples, the two yield numerically identical results. At high $N$, all three tests agree numerically, so the simplest option is to use the $\chi^2$ test.&lt;/p&gt;</content><category term="ab-testing"></category></entry><entry><title>A/B testing and the Chi-squared test</title><link href="https://bytepawn.com/ab-testing-and-the-chi-squared-test.html" rel="alternate"></link><published>2020-02-28T00:00:00+01:00</published><updated>2020-02-28T00:00:00+01:00</updated><author><name>Marton Trencseni</name></author><id>tag:bytepawn.com,2020-02-28:/ab-testing-and-the-chi-squared-test.html</id><summary type="html">&lt;p&gt;In an earlier post, I wrote about A/B testing conversion data with the Z-test. The Chi-squared test is a more general test for conversion data, because it can work with multiple conversion events and multiple funnels being tested (A/B/C/D/..).&lt;br/&gt;&lt;br/&gt; &lt;img src="/images/chi2.png" alt="Chi-squared distribution" style="width: 400px;"/&gt;&lt;/p&gt;</summary><content type="html">&lt;h2&gt;Introduction&lt;/h2&gt;
&lt;p&gt;In an earlier post, I wrote about &lt;a href="http://bytepawn.com/ab-testing-and-the-ztest.html"&gt;A/B testing conversion data with the Z-test&lt;/a&gt;. The $\chi^2$ test is a more general test for conversion data, because it can work with multiple conversion events and multiple funnels being tested (A/B/C/D/..).&lt;/p&gt;
&lt;p&gt;&lt;a href="https://github.com/mtrencseni/playground/blob/master/AB%20testing%20and%20the%20Chi-squared%20test.ipynb"&gt;The code shown below is up on Github.&lt;/a&gt;&lt;/p&gt;
&lt;p&gt;Before we go on, let’s use a $\chi^2$ test for a simple A/B conversion use-case and compare the results with the Z-test and the &lt;a href="http://bytepawn.com/ab-testing-and-the-ttest.html"&gt;t-test&lt;/a&gt; (both two-tailed). First, a Monte Carlo algorithm to simulate A/B tests:&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;choice&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;ps&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;np&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;random&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;choice&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nb"&gt;len&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;ps&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt; &lt;span class="n"&gt;p&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;ps&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;simulate_abtest&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;funnels&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;N&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
    &lt;span class="n"&gt;traffic_split&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;x&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;x&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;funnels&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;
    &lt;span class="n"&gt;observations&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;np&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;zeros&lt;/span&gt;&lt;span class="p"&gt;([&lt;/span&gt;&lt;span class="nb"&gt;len&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;funnels&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt; &lt;span class="nb"&gt;len&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;funnels&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;][&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;])])&lt;/span&gt;
    &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;_&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="nb"&gt;range&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;N&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
        &lt;span class="n"&gt;which_funnel&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;choice&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;traffic_split&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
        &lt;span class="n"&gt;funnel_outcome&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;choice&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;funnels&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;which_funnel&lt;/span&gt;&lt;span class="p"&gt;][&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;])&lt;/span&gt;
        &lt;span class="n"&gt;observations&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;which_funnel&lt;/span&gt;&lt;span class="p"&gt;][&lt;/span&gt;&lt;span class="n"&gt;funnel_outcome&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="o"&gt;+=&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt;
    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;observations&lt;/span&gt;
&lt;/pre&gt;&lt;/div&gt;


&lt;p&gt;Next, let’s pretend we’re running a conversion A/B test where the treatment has no effect (A and B conversion rates are the same) on $N=10,000$, and use the &lt;a href="https://www.statsmodels.org/"&gt;statsmodels&lt;/a&gt; and &lt;a href="https://docs.scipy.org/doc/scipy/reference/stats.html"&gt;scipy stats&lt;/a&gt; libraries to run all three tests on the results:&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;span class="n"&gt;funnels&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;
    &lt;span class="p"&gt;[[&lt;/span&gt;&lt;span class="mf"&gt;0.80&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mf"&gt;0.20&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt; &lt;span class="mf"&gt;0.6&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt; &lt;span class="c1"&gt;# the first vector element is the actual outcomes, &lt;/span&gt;
    &lt;span class="p"&gt;[[&lt;/span&gt;&lt;span class="mf"&gt;0.80&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mf"&gt;0.20&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt; &lt;span class="mf"&gt;0.4&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt; &lt;span class="c1"&gt;# the second is the traffic split&lt;/span&gt;
&lt;span class="p"&gt;]&lt;/span&gt;
&lt;span class="n"&gt;N&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;10&lt;/span&gt;&lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="mi"&gt;1000&lt;/span&gt;

&lt;span class="n"&gt;observations&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;simulate_abtest&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;funnels&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;N&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="n"&gt;raw_data&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nb"&gt;int&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;observations&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;][&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;])&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="o"&gt;+&lt;/span&gt; &lt;span class="nb"&gt;int&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;observations&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;][&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;])&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt; &lt;span class="nb"&gt;int&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;observations&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;][&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;])&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="o"&gt;+&lt;/span&gt; &lt;span class="nb"&gt;int&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;observations&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;][&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;])&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;
&lt;span class="k"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;Observations:&lt;/span&gt;&lt;span class="se"&gt;\n&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;observations&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="n"&gt;ch&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;chi2_contingency&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;observations&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;correction&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="bp"&gt;False&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="k"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;Chi-sq p = &lt;/span&gt;&lt;span class="si"&gt;%.3f&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;&lt;/span&gt; &lt;span class="o"&gt;%&lt;/span&gt; &lt;span class="n"&gt;ch&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;])&lt;/span&gt;
&lt;span class="n"&gt;zt&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;ztest&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="n"&gt;raw_data&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="k"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;Z-test p = &lt;/span&gt;&lt;span class="si"&gt;%.3f&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;&lt;/span&gt; &lt;span class="o"&gt;%&lt;/span&gt; &lt;span class="n"&gt;zt&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;])&lt;/span&gt;
&lt;span class="n"&gt;tt&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;ttest_ind&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="n"&gt;raw_data&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="k"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;t-test p = &lt;/span&gt;&lt;span class="si"&gt;%.3f&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;&lt;/span&gt; &lt;span class="o"&gt;%&lt;/span&gt; &lt;span class="n"&gt;zt&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;])&lt;/span&gt;
&lt;/pre&gt;&lt;/div&gt;


&lt;p&gt;All three yield the same p value:&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;span class="n"&gt;Observations&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
 &lt;span class="p"&gt;[[&lt;/span&gt;&lt;span class="mf"&gt;4825.&lt;/span&gt; &lt;span class="mf"&gt;1183.&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;
 &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="mf"&gt;3211.&lt;/span&gt;  &lt;span class="mf"&gt;781.&lt;/span&gt;&lt;span class="p"&gt;]]&lt;/span&gt;
&lt;span class="n"&gt;Chi&lt;/span&gt;&lt;span class="o"&gt;-&lt;/span&gt;&lt;span class="n"&gt;sq&lt;/span&gt; &lt;span class="n"&gt;p&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mf"&gt;0.876&lt;/span&gt;
&lt;span class="n"&gt;Z&lt;/span&gt;&lt;span class="o"&gt;-&lt;/span&gt;&lt;span class="n"&gt;test&lt;/span&gt; &lt;span class="n"&gt;p&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mf"&gt;0.876&lt;/span&gt;
&lt;span class="n"&gt;t&lt;/span&gt;&lt;span class="o"&gt;-&lt;/span&gt;&lt;span class="n"&gt;test&lt;/span&gt; &lt;span class="n"&gt;p&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mf"&gt;0.876&lt;/span&gt; &lt;span class="c1"&gt;# all three are the same&lt;/span&gt;
&lt;/pre&gt;&lt;/div&gt;


&lt;p&gt;We’re not surprised that the Z-test and the t-test yield identical results. We saw in the previous post that above $N=100$ the t-distribution is essentially a normal distribution, so the two tests yield the same p value. For this simple case (two outcomes: conversion or no conversion, and two funnels: A and B), the $\chi^2$ test is also identical to the Z-test, with the same limitation (it assumes the &lt;a href="https://en.wikipedia.org/wiki/Central_limit_theorem"&gt;Central Limit Theorem&lt;/a&gt;, so it is not reliable below $N=100$).&lt;/p&gt;
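&lt;p&gt;A quick way to see how fast the t-distribution approaches the normal is to compare tail probabilities directly. The sketch below is not from the original analysis; it just checks the $N=100$ rule of thumb with scipy (the df values are arbitrary):&lt;/p&gt;

```python
# Compare two-sided tail probabilities of the t-distribution and the
# standard normal at the normal's ~5% point, for growing degrees of freedom.
from scipy.stats import norm, t

z = 1.96  # ~5% two-sided point of the standard normal
for df in [9, 29, 99, 999]:
    p_t = 2 * t.sf(z, df)    # two-sided p under t with df degrees of freedom
    p_norm = 2 * norm.sf(z)  # two-sided p under the standard normal
    print('df = %4d   t p = %.4f   normal p = %.4f' % (df, p_t, p_norm))
```

&lt;p&gt;Around $df=100$ the difference between the two p values shrinks to a few thousandths, which is why the two tests become interchangeable.&lt;/p&gt;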
&lt;h2&gt;The $\chi^2$ test&lt;/h2&gt;
&lt;p&gt;For A/B testing, we can think of the $\chi^2$ test as a generalized Z-test. Generalized in the following sense:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;each funnel can have multiple outcomes, not just &lt;strong&gt;Conversion&lt;/strong&gt; and &lt;strong&gt;No Conversion&lt;/strong&gt;. E.g. imagine a funnel with multiple drop-off events and multiple conversion events, such as buying a &lt;strong&gt;Monthly&lt;/strong&gt; or an &lt;strong&gt;Annual&lt;/strong&gt; license (all of them mutually exclusive).&lt;/li&gt;
&lt;li&gt;we can test more than 2 funnel versions at once, so we can run an A/B/C/D… test.&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;Let’s see this in action with, e.g., 3 outcomes and 4 funnels:&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;span class="n"&gt;funnels&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;
    &lt;span class="p"&gt;[[&lt;/span&gt;&lt;span class="mf"&gt;0.80&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mf"&gt;0.10&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mf"&gt;0.10&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt; &lt;span class="mf"&gt;0.6&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt; &lt;span class="c1"&gt;# the first vector is the actual outcomes,&lt;/span&gt;
    &lt;span class="p"&gt;[[&lt;/span&gt;&lt;span class="mf"&gt;0.80&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mf"&gt;0.10&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mf"&gt;0.10&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt; &lt;span class="mf"&gt;0.2&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt; &lt;span class="c1"&gt;# the second is the traffic split&lt;/span&gt;
    &lt;span class="p"&gt;[[&lt;/span&gt;&lt;span class="mf"&gt;0.79&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mf"&gt;0.11&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mf"&gt;0.10&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt; &lt;span class="mf"&gt;0.1&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt;
    &lt;span class="p"&gt;[[&lt;/span&gt;&lt;span class="mf"&gt;0.70&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mf"&gt;0.20&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mf"&gt;0.10&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt; &lt;span class="mf"&gt;0.1&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt;
&lt;span class="p"&gt;]&lt;/span&gt;
&lt;span class="n"&gt;N&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;10&lt;/span&gt;&lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="mi"&gt;1000&lt;/span&gt;

&lt;span class="n"&gt;observations&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;simulate_abtest&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;funnels&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;N&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="k"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;Observations:&lt;/span&gt;&lt;span class="se"&gt;\n&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;observations&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="n"&gt;ch&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;chi2_contingency&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;observations&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;correction&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="bp"&gt;False&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="k"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;Chi-sq p = &lt;/span&gt;&lt;span class="si"&gt;%.3f&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;&lt;/span&gt; &lt;span class="o"&gt;%&lt;/span&gt; &lt;span class="n"&gt;ch&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;])&lt;/span&gt;
&lt;/pre&gt;&lt;/div&gt;


&lt;p&gt;Prints something like:&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;span class="n"&gt;Observations&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
 &lt;span class="p"&gt;[[&lt;/span&gt;&lt;span class="mf"&gt;4748.&lt;/span&gt;  &lt;span class="mf"&gt;595.&lt;/span&gt;  &lt;span class="mf"&gt;573.&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;
 &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="mf"&gt;1657.&lt;/span&gt;  &lt;span class="mf"&gt;197.&lt;/span&gt;  &lt;span class="mf"&gt;231.&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;
 &lt;span class="p"&gt;[&lt;/span&gt; &lt;span class="mf"&gt;807.&lt;/span&gt;   &lt;span class="mf"&gt;98.&lt;/span&gt;  &lt;span class="mf"&gt;103.&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;
 &lt;span class="p"&gt;[&lt;/span&gt; &lt;span class="mf"&gt;710.&lt;/span&gt;  &lt;span class="mf"&gt;195.&lt;/span&gt;   &lt;span class="mf"&gt;86.&lt;/span&gt;&lt;span class="p"&gt;]]&lt;/span&gt;
&lt;span class="n"&gt;Chi&lt;/span&gt;&lt;span class="o"&gt;-&lt;/span&gt;&lt;span class="n"&gt;sq&lt;/span&gt; &lt;span class="n"&gt;p&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mf"&gt;0.000&lt;/span&gt;
&lt;/pre&gt;&lt;/div&gt;


&lt;p&gt;What’s happening under the hood? Using the above 4x3 outcome table, first we construct the contingency table. We simply add the numbers row-wise and column-wise and write them at the right and bottom. These are called the marginals:&lt;/p&gt;
&lt;p&gt;&lt;img src="/images/contingency_table3.PNG" alt="Contingency table" style="width: 600px;"/&gt;&lt;/p&gt;
&lt;p&gt;Then, for each observation cell, we calculate the expected value. Expected here means according to the null hypothesis, which is that all funnels are the same. Our best guess for the null hypothesis are the blended bottom numbers: $7922/10000$ for &lt;strong&gt;No Conversion&lt;/strong&gt;, $1085/10000$ for &lt;strong&gt;Monthly&lt;/strong&gt;, etc. So for Funnel A, which has 5916 samples, our expected &lt;strong&gt;No Conversion&lt;/strong&gt; number is $5916*7922/10000=4686.6$. We do this for each cell. Then we subtract the expected value from the actual observation, square it, and divide by the expected, like $(4748-4686.6)^2/4686.6=0.8$. We do this for each cell and sum up the numbers to get the $\chi^2$ test statistic. We then look this up in a &lt;a href="https://en.wikipedia.org/wiki/Chi-squared_distribution"&gt;$\chi^2$ distribution&lt;/a&gt; table to get a p value. We have to use $k=(F-1)(C-1)$ degrees of freedom, where $F$ is the number of funnels and $C$ is the number of conversion events; $F=4, C=3$ above.&lt;/p&gt;
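&lt;p&gt;This arithmetic can be written compactly with numpy: the table of expected counts is the outer product of the row and column marginals, divided by $N$. A sketch (not from the original post), applied to the observation table printed earlier:&lt;/p&gt;

```python
# The chi-squared arithmetic from the text, vectorized with numpy,
# applied to the 4x3 observation table printed above.
import numpy as np
from scipy.stats import chi2

observations = np.array([
    [4748.,  595.,  573.],
    [1657.,  197.,  231.],
    [ 807.,   98.,  103.],
    [ 710.,  195.,   86.],
])
N = observations.sum()
# expected counts under the null: outer product of the marginals, divided by N
expected = np.outer(observations.sum(axis=1), observations.sum(axis=0)) / N
chisq = ((observations - expected)**2 / expected).sum()
dof = (observations.shape[0] - 1) * (observations.shape[1] - 1)
p_value = chi2(dof).sf(chisq)  # sf is 1 - cdf
print('chi-sq = %.1f, dof = %d, p = %.3f' % (chisq, dof, p_value))
```

&lt;p&gt;The top-left expected cell comes out to $4686.6$, as computed in the text, and the p value is effectively zero, matching the library output above.&lt;/p&gt;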
&lt;h2&gt;Implementation&lt;/h2&gt;
&lt;p&gt;This is so simple, we can implement it ourselves:&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;chi_squared&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;observations&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
    &lt;span class="n"&gt;row_marginals&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;np&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;sum&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;observations&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;axis&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="n"&gt;col_marginals&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;np&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;sum&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;observations&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;axis&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="n"&gt;N&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;np&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;sum&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;observations&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="n"&gt;chisq&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;
    &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;i&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="nb"&gt;range&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nb"&gt;len&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;row_marginals&lt;/span&gt;&lt;span class="p"&gt;)):&lt;/span&gt;
        &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;j&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="nb"&gt;range&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nb"&gt;len&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;col_marginals&lt;/span&gt;&lt;span class="p"&gt;)):&lt;/span&gt;
            &lt;span class="n"&gt;expected&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;row_marginals&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;i&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt; &lt;span class="n"&gt;col_marginals&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;j&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="o"&gt;/&lt;/span&gt; &lt;span class="n"&gt;N&lt;/span&gt;
            &lt;span class="n"&gt;chisq&lt;/span&gt; &lt;span class="o"&gt;+=&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;observations&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;i&lt;/span&gt;&lt;span class="p"&gt;][&lt;/span&gt;&lt;span class="n"&gt;j&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="o"&gt;-&lt;/span&gt; &lt;span class="n"&gt;expected&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;&lt;span class="o"&gt;**&lt;/span&gt;&lt;span class="mi"&gt;2&lt;/span&gt; &lt;span class="o"&gt;/&lt;/span&gt; &lt;span class="n"&gt;expected&lt;/span&gt;
    &lt;span class="n"&gt;dof&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nb"&gt;len&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;row_marginals&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;-&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nb"&gt;len&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;col_marginals&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;-&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="n"&gt;p_value&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mf"&gt;1.0&lt;/span&gt; &lt;span class="o"&gt;-&lt;/span&gt; &lt;span class="n"&gt;chi2&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;dof&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;cdf&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;chisq&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;chisq&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;p_value&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/pre&gt;&lt;/div&gt;


&lt;p&gt;We can verify we calculate the same test statistic and p value as the library function:&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;span class="n"&gt;funnels&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;
    &lt;span class="p"&gt;[[&lt;/span&gt;&lt;span class="mf"&gt;0.80&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mf"&gt;0.10&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mf"&gt;0.10&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt; &lt;span class="mf"&gt;0.6&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt; &lt;span class="c1"&gt;# the first vector is the actual outcomes,&lt;/span&gt;
    &lt;span class="p"&gt;[[&lt;/span&gt;&lt;span class="mf"&gt;0.80&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mf"&gt;0.10&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mf"&gt;0.10&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt; &lt;span class="mf"&gt;0.2&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt; &lt;span class="c1"&gt;# the second is the traffic split&lt;/span&gt;
    &lt;span class="p"&gt;[[&lt;/span&gt;&lt;span class="mf"&gt;0.80&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mf"&gt;0.10&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mf"&gt;0.10&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt; &lt;span class="mf"&gt;0.1&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt;
    &lt;span class="p"&gt;[[&lt;/span&gt;&lt;span class="mf"&gt;0.80&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mf"&gt;0.10&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mf"&gt;0.10&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt; &lt;span class="mf"&gt;0.1&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt;
&lt;span class="p"&gt;]&lt;/span&gt;
&lt;span class="n"&gt;N&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;10&lt;/span&gt;&lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="mi"&gt;1000&lt;/span&gt;

&lt;span class="n"&gt;observations&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;simulate_abtest&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;funnels&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;N&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="k"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;Observations:&lt;/span&gt;&lt;span class="se"&gt;\n&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;observations&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="n"&gt;ch_scipy&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;chi2_contingency&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;observations&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;correction&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="bp"&gt;False&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="n"&gt;ch_our&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;chi_squared&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;observations&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="k"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;Statsmodel chi-sq test statistic = &lt;/span&gt;&lt;span class="si"&gt;%.3f&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;&lt;/span&gt; &lt;span class="o"&gt;%&lt;/span&gt; &lt;span class="n"&gt;ch_scipy&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;])&lt;/span&gt;
&lt;span class="k"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;Our chi-sq test statistic        = &lt;/span&gt;&lt;span class="si"&gt;%.3f&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;&lt;/span&gt; &lt;span class="o"&gt;%&lt;/span&gt; &lt;span class="n"&gt;ch_our&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;])&lt;/span&gt;
&lt;span class="k"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;Statsmodel chi-sq p = &lt;/span&gt;&lt;span class="si"&gt;%.3f&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;&lt;/span&gt; &lt;span class="o"&gt;%&lt;/span&gt; &lt;span class="n"&gt;ch_scipy&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;])&lt;/span&gt;
&lt;span class="k"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;Our chi-sq p        = &lt;/span&gt;&lt;span class="si"&gt;%.3f&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;&lt;/span&gt; &lt;span class="o"&gt;%&lt;/span&gt; &lt;span class="n"&gt;ch_our&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;])&lt;/span&gt;
&lt;/pre&gt;&lt;/div&gt;


&lt;p&gt;Prints something like:&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;span class="n"&gt;Observations&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
 &lt;span class="p"&gt;[[&lt;/span&gt;&lt;span class="mf"&gt;4846.&lt;/span&gt;  &lt;span class="mf"&gt;594.&lt;/span&gt;  &lt;span class="mf"&gt;591.&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;
 &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="mf"&gt;1628.&lt;/span&gt;  &lt;span class="mf"&gt;188.&lt;/span&gt;  &lt;span class="mf"&gt;171.&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;
 &lt;span class="p"&gt;[&lt;/span&gt; &lt;span class="mf"&gt;767.&lt;/span&gt;  &lt;span class="mf"&gt;100.&lt;/span&gt;   &lt;span class="mf"&gt;98.&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;
 &lt;span class="p"&gt;[&lt;/span&gt; &lt;span class="mf"&gt;824.&lt;/span&gt;   &lt;span class="mf"&gt;84.&lt;/span&gt;  &lt;span class="mf"&gt;109.&lt;/span&gt;&lt;span class="p"&gt;]]&lt;/span&gt;
&lt;span class="n"&gt;Statsmodel&lt;/span&gt; &lt;span class="n"&gt;chi&lt;/span&gt;&lt;span class="o"&gt;-&lt;/span&gt;&lt;span class="n"&gt;sq&lt;/span&gt; &lt;span class="n"&gt;test&lt;/span&gt; &lt;span class="n"&gt;statistic&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mf"&gt;7.324&lt;/span&gt;
&lt;span class="n"&gt;Our&lt;/span&gt; &lt;span class="n"&gt;chi&lt;/span&gt;&lt;span class="o"&gt;-&lt;/span&gt;&lt;span class="n"&gt;sq&lt;/span&gt; &lt;span class="n"&gt;test&lt;/span&gt; &lt;span class="n"&gt;statistic&lt;/span&gt;        &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mf"&gt;7.324&lt;/span&gt;
&lt;span class="n"&gt;Statsmodel&lt;/span&gt; &lt;span class="n"&gt;chi&lt;/span&gt;&lt;span class="o"&gt;-&lt;/span&gt;&lt;span class="n"&gt;sq&lt;/span&gt; &lt;span class="n"&gt;p&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mf"&gt;0.292&lt;/span&gt;
&lt;span class="n"&gt;Our&lt;/span&gt; &lt;span class="n"&gt;chi&lt;/span&gt;&lt;span class="o"&gt;-&lt;/span&gt;&lt;span class="n"&gt;sq&lt;/span&gt; &lt;span class="n"&gt;p&lt;/span&gt;        &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mf"&gt;0.292&lt;/span&gt;
&lt;/pre&gt;&lt;/div&gt;


&lt;h2&gt;Intuition&lt;/h2&gt;
&lt;p&gt;The intuition behind the $\chi^2$ test is this: if the null hypothesis is true, then all rows should follow the same conversion ratios, which is also the marginal conversion ratio vector. When we subtract the expected number from the actual number (and normalize), similar to the Z-test, we get a standard normal variable. Since we have multiple cells, we need to add these variables to get an overall statistic, but we don’t want positive and negative fluctuations to cancel out. Hence we first square, and then add. So the $\chi^2$ statistic is a sum of squares of standard normals. This is exactly what the $\chi^2$ distribution is: a $\chi^2$ distribution with $k$ degrees of freedom is the distribution of the sum of $k$ independent squared standard normal variables. In the subsequent discussion we will get more intuition into why the degrees of freedom are $k=(F-1)(C-1)$. Note that the standard normal goes from $-\infty$ to $\infty$, but the $\chi^2$, being a sum of squares, goes from $0$ to $\infty$. This has implications for one-tailed vs two-tailed testing.&lt;/p&gt;
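&lt;p&gt;This construction is easy to verify by simulation: draw $k$ independent standard normals, square and sum them, and check the first two moments against the known $\chi^2_k$ mean of $k$ and variance of $2k$. A sketch (the seed and the choice $k=6$ are arbitrary):&lt;/p&gt;

```python
# A chi-squared variable with k degrees of freedom is the sum of k squared
# independent standard normals; check the first two moments by simulation.
import numpy as np

rng = np.random.default_rng(0)
k = 6                                # degrees of freedom
draws = rng.standard_normal((100000, k))
samples = (draws**2).sum(axis=1)     # 100,000 chi-squared(k) samples
# chi-squared(k) has mean k and variance 2k
print('sample mean = %.2f (expect %d)' % (samples.mean(), k))
print('sample var  = %.2f (expect %d)' % (samples.var(), 2 * k))
```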
&lt;p&gt;&lt;img src="/images/chi2.png" alt="Chi-squared distribution" style="width: 400px;"/&gt;&lt;/p&gt;
&lt;p&gt;In the 2x2 case, why is this exactly the same as the Z-test? The answer is simple: in the 2x2 case the degree of freedom is 1, the $\chi^2$ test is doing exactly the same thing as a two-sided Z-test, and in fact the $\chi^2$ test statistic in this case is $z^2$. We can see this numerically:&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;span class="n"&gt;funnels&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;
    &lt;span class="p"&gt;[[&lt;/span&gt;&lt;span class="mf"&gt;0.80&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mf"&gt;0.20&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt; &lt;span class="mf"&gt;0.6&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt; &lt;span class="c1"&gt;# the first vector is the actual outcomes,&lt;/span&gt;
    &lt;span class="p"&gt;[[&lt;/span&gt;&lt;span class="mf"&gt;0.80&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mf"&gt;0.20&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt; &lt;span class="mf"&gt;0.4&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt; &lt;span class="c1"&gt;# the second is the traffic split&lt;/span&gt;
&lt;span class="p"&gt;]&lt;/span&gt;
&lt;span class="n"&gt;N&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;10&lt;/span&gt;&lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="mi"&gt;1000&lt;/span&gt;

&lt;span class="n"&gt;observations&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;simulate_abtest&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;funnels&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;N&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="n"&gt;raw_data&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nb"&gt;int&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;observations&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;][&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;])&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="o"&gt;+&lt;/span&gt; &lt;span class="nb"&gt;int&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;observations&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;][&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;])&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt; &lt;span class="nb"&gt;int&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;observations&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;][&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;])&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="o"&gt;+&lt;/span&gt; &lt;span class="nb"&gt;int&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;observations&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;][&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;])&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;
&lt;span class="k"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;Observations:&lt;/span&gt;&lt;span class="se"&gt;\n&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;observations&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="n"&gt;ch&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;chi2_contingency&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;observations&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;correction&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="bp"&gt;False&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="k"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;Chi-sq test statistic = &lt;/span&gt;&lt;span class="si"&gt;%.3f&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;&lt;/span&gt; &lt;span class="o"&gt;%&lt;/span&gt; &lt;span class="n"&gt;ch&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;])&lt;/span&gt;
&lt;span class="k"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;Chi-sq p = &lt;/span&gt;&lt;span class="si"&gt;%.3f&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;&lt;/span&gt; &lt;span class="o"&gt;%&lt;/span&gt; &lt;span class="n"&gt;ch&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;])&lt;/span&gt;
&lt;span class="n"&gt;zt&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;ztest&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="n"&gt;raw_data&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="k"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;Z-test z = &lt;/span&gt;&lt;span class="si"&gt;%.3f&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;&lt;/span&gt; &lt;span class="o"&gt;%&lt;/span&gt; &lt;span class="n"&gt;zt&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;])&lt;/span&gt;
&lt;span class="k"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;Z-test z^2 = &lt;/span&gt;&lt;span class="si"&gt;%.3f&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;&lt;/span&gt; &lt;span class="o"&gt;%&lt;/span&gt; &lt;span class="n"&gt;zt&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;&lt;span class="o"&gt;**&lt;/span&gt;&lt;span class="mi"&gt;2&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="k"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;Z-test p = &lt;/span&gt;&lt;span class="si"&gt;%.3f&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;&lt;/span&gt; &lt;span class="o"&gt;%&lt;/span&gt; &lt;span class="n"&gt;zt&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;])&lt;/span&gt;
&lt;/pre&gt;&lt;/div&gt;


&lt;p&gt;Prints something like:&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;Observations:
 [[4836. 1193.]
 [3147.  824.]]
Chi-sq test statistic = 1.378
Chi-sq p = 0.240
Z-test z = 1.174
Z-test z^2 = 1.378 # z^2 is the same as the Chi-sq test statistic
Z-test p = 0.240
&lt;/pre&gt;&lt;/div&gt;


&lt;p&gt;If you compare the $\chi^2$ formulas with the Z-test formulas from the previous post, it works out that $z^2 = \chi^2$.&lt;/p&gt;
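&lt;p&gt;The identity can also be checked from the contingency counts alone, without expanding raw 0/1 data. A sketch using statsmodels’ count-based proportions_ztest (which, with defaults, uses the pooled variance, just like the Z-test above), on the observation counts printed above:&lt;/p&gt;

```python
# In the 2x2 case z^2 equals the chi-squared statistic; check it on the
# observation counts printed above, using count-based proportions_ztest.
import numpy as np
from scipy.stats import chi2_contingency
from statsmodels.stats.proportion import proportions_ztest

observations = np.array([[4836., 1193.],
                         [3147.,  824.]])
conversions = observations[:, 1]      # conversion counts per funnel
sizes = observations.sum(axis=1)      # funnel sizes
z, p_z = proportions_ztest(conversions, sizes)  # pooled, two-sided by default
chisq, p_chi, dof, _ = chi2_contingency(observations, correction=False)
print('z^2 = %.3f, chi-sq = %.3f' % (z**2, chisq))  # the two are equal
print('p_z = %.3f, p_chi = %.3f' % (p_z, p_chi))    # so are the p values
```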
&lt;h2&gt;One-tailed vs two-tailed&lt;/h2&gt;
&lt;p&gt;In the case of the Z-test (and t-test), we have a choice between a one-tailed and a two-tailed test, depending on whether we want the test to go off for deviations in just one direction or in both. In the case of the $\chi^2$ test, we do not have a choice:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;the $\chi^2$ distribution is asymmetric (from $0$ to $\infty$), so technically the $\chi^2$ test is always one-tailed&lt;/li&gt;
&lt;li&gt;however, since it’s the square of normals, both tails of the normal are folded together, so it corresponds to a two-tailed Z-test [in the 2x2 case]&lt;/li&gt;
&lt;li&gt;this is not just a mathematical artefact; when dealing with multiple conversion events, there is no such thing as “positive” and “negative” directions; for example, in a 2x3 conversion example, if the baseline is $80-10-10$ for &lt;strong&gt;No Conversion - Monthly - Annual&lt;/strong&gt;, and our test comes out at $79-11-10$ or $79-10-11$, which one is “positive” and which is “negative”? (If both count as “positive”, then merge the conversion columns and do a 2x2 one-tailed Z-test or t-test.)&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;We can check this simply:&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;span class="n"&gt;funnels&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;
    &lt;span class="p"&gt;[[&lt;/span&gt;&lt;span class="mf"&gt;0.80&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mf"&gt;0.20&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt; &lt;span class="mf"&gt;0.6&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt; &lt;span class="c1"&gt;# the first vector is the actual outcomes,&lt;/span&gt;
    &lt;span class="p"&gt;[[&lt;/span&gt;&lt;span class="mf"&gt;0.80&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mf"&gt;0.20&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt; &lt;span class="mf"&gt;0.4&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt; &lt;span class="c1"&gt;# the second is the traffic split&lt;/span&gt;
&lt;span class="p"&gt;]&lt;/span&gt;
&lt;span class="n"&gt;N&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;10&lt;/span&gt;&lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="mi"&gt;1000&lt;/span&gt;

&lt;span class="n"&gt;observations&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;simulate_abtest&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;funnels&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;N&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="n"&gt;raw_data&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nb"&gt;int&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;observations&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;][&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;])&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="o"&gt;+&lt;/span&gt; &lt;span class="nb"&gt;int&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;observations&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;][&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;])&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt; &lt;span class="nb"&gt;int&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;observations&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;][&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;])&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="o"&gt;+&lt;/span&gt; &lt;span class="nb"&gt;int&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;observations&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;][&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;])&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;
&lt;span class="k"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;Observations:&lt;/span&gt;&lt;span class="se"&gt;\n&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;observations&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="n"&gt;ch&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;chi2_contingency&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;observations&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;correction&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="bp"&gt;False&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="k"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;Chi-sq p = &lt;/span&gt;&lt;span class="si"&gt;%.3f&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;&lt;/span&gt; &lt;span class="o"&gt;%&lt;/span&gt; &lt;span class="n"&gt;ch&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;])&lt;/span&gt;
&lt;span class="n"&gt;zt&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;ztest&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="n"&gt;raw_data&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;alternative&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;two-sided&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="k"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;Z-test p (Two-tailed) = &lt;/span&gt;&lt;span class="si"&gt;%.3f&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;&lt;/span&gt; &lt;span class="o"&gt;%&lt;/span&gt; &lt;span class="n"&gt;zt&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;])&lt;/span&gt;
&lt;span class="n"&gt;tt&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;ttest_ind&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="n"&gt;raw_data&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;alternative&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;two-sided&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="k"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;t-test p (Two-tailed) = &lt;/span&gt;&lt;span class="si"&gt;%.3f&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;&lt;/span&gt; &lt;span class="o"&gt;%&lt;/span&gt; &lt;span class="n"&gt;zt&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;])&lt;/span&gt;
&lt;span class="n"&gt;zt&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;ztest&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="n"&gt;raw_data&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;alternative&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;larger&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="k"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;Z-test p (One-tailed) = &lt;/span&gt;&lt;span class="si"&gt;%.3f&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;&lt;/span&gt; &lt;span class="o"&gt;%&lt;/span&gt; &lt;span class="n"&gt;zt&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;])&lt;/span&gt;
&lt;span class="n"&gt;tt&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;ttest_ind&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="n"&gt;raw_data&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;alternative&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;larger&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="k"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;t-test p (One-tailed) = &lt;/span&gt;&lt;span class="si"&gt;%.3f&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;&lt;/span&gt; &lt;span class="o"&gt;%&lt;/span&gt; &lt;span class="n"&gt;zt&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;])&lt;/span&gt;
&lt;/pre&gt;&lt;/div&gt;


&lt;p&gt;Prints something like:&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;span class="n"&gt;Observations&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
 &lt;span class="p"&gt;[[&lt;/span&gt;&lt;span class="mf"&gt;4780.&lt;/span&gt; &lt;span class="mf"&gt;1181.&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;
 &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="mf"&gt;3243.&lt;/span&gt;  &lt;span class="mf"&gt;796.&lt;/span&gt;&lt;span class="p"&gt;]]&lt;/span&gt;
&lt;span class="n"&gt;Chi&lt;/span&gt;&lt;span class="o"&gt;-&lt;/span&gt;&lt;span class="n"&gt;sq&lt;/span&gt; &lt;span class="n"&gt;p&lt;/span&gt;              &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mf"&gt;0.898&lt;/span&gt; &lt;span class="c1"&gt;# the first three are the same&lt;/span&gt;
&lt;span class="n"&gt;Z&lt;/span&gt;&lt;span class="o"&gt;-&lt;/span&gt;&lt;span class="n"&gt;test&lt;/span&gt; &lt;span class="n"&gt;p&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;Two&lt;/span&gt;&lt;span class="o"&gt;-&lt;/span&gt;&lt;span class="n"&gt;tailed&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mf"&gt;0.898&lt;/span&gt;
&lt;span class="n"&gt;t&lt;/span&gt;&lt;span class="o"&gt;-&lt;/span&gt;&lt;span class="n"&gt;test&lt;/span&gt; &lt;span class="n"&gt;p&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;Two&lt;/span&gt;&lt;span class="o"&gt;-&lt;/span&gt;&lt;span class="n"&gt;tailed&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mf"&gt;0.898&lt;/span&gt;
&lt;span class="n"&gt;Z&lt;/span&gt;&lt;span class="o"&gt;-&lt;/span&gt;&lt;span class="n"&gt;test&lt;/span&gt; &lt;span class="n"&gt;p&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;One&lt;/span&gt;&lt;span class="o"&gt;-&lt;/span&gt;&lt;span class="n"&gt;tailed&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mf"&gt;0.551&lt;/span&gt; &lt;span class="c1"&gt;# these are different&lt;/span&gt;
&lt;span class="n"&gt;t&lt;/span&gt;&lt;span class="o"&gt;-&lt;/span&gt;&lt;span class="n"&gt;test&lt;/span&gt; &lt;span class="n"&gt;p&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;One&lt;/span&gt;&lt;span class="o"&gt;-&lt;/span&gt;&lt;span class="n"&gt;tailed&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mf"&gt;0.551&lt;/span&gt;
&lt;/pre&gt;&lt;/div&gt;


&lt;h2&gt;Degrees of freedom&lt;/h2&gt;
&lt;p&gt;When we're doing hypothesis testing, we're computing a p value. The p value is the probability that we'd get the measured outcome, or more extreme outcomes, assuming the null hypothesis is true. There is one caveat here, hidden in the "or more extreme": the statistically correct way to evaluate this "more extreme" part is by keeping both row and column marginals fixed. Ie. what are all the ways (their probabilities) that we can put different numbers in the contingency table, while keeping the marginals fixed. Although the $\chi^2$ is not calculating this probability directly, thanks to the CLT, this is in fact what it's approximating in the $N \rightarrow \infty$ limit. And given a $F \times C$ table with the marginals fixed, you can only change $(F-1)(C-1)$ numbers freely ("degrees of freedom"), the rest are fixed by the constraint that the rows and columns have to add up to the marginals.&lt;/p&gt;
&lt;p&gt;In the next post, I will talk about &lt;a href="https://en.wikipedia.org/wiki/Fisher%27s_exact_test"&gt;Fisher's exact test&lt;/a&gt;, which will give more intuition about this, because that test explicitly calculates this probability.&lt;/p&gt;
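To make the degrees-of-freedom bookkeeping concrete, here is a minimal sketch (the counts below are illustrative, not from the experiment above): for an $F \times C$ table, scipy's `chi2_contingency` reports $k=(F-1)(C-1)$ degrees of freedom, and the p-value is just the right tail of the $\chi^2$ distribution with that $k$:

```python
import numpy as np
from scipy.stats import chi2, chi2_contingency

# Illustrative 3 funnels x 2 outcomes contingency table (made-up counts).
observations = np.array([
    [4780, 1181],
    [3243,  796],
    [1508,  402],
])

F, C = observations.shape
dof = (F - 1) * (C - 1)  # (3-1)*(2-1) = 2 degrees of freedom

stat, p, dof_reported, expected = chi2_contingency(observations, correction=False)
assert dof_reported == dof

# The p-value is the right tail of the chi^2 distribution with k = dof:
p_manual = chi2.sf(stat, df=dof)
print(dof, p, p_manual)
```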
&lt;h2&gt;Conclusion: usage and limitations&lt;/h2&gt;
&lt;p&gt;&lt;strong&gt;Z-test.&lt;/strong&gt; In the 2x2 case, the $\chi^2$ test yields exactly the same results as a two-tailed Z-test (or t-test).
&lt;br/&gt;
&lt;strong&gt;Central Limit Theorem.&lt;/strong&gt; Like the Z-test, the $\chi^2$ test needs a large enough sample size for the normal approximation to hold. I would not be comfortable unless each cell in the contingency table has a count of at least 100. See the earlier post &lt;a href="http://bytepawn.com/ab-testing-and-the-central-limit-theorem.html"&gt;A/B Testing and the Central Limit Theorem&lt;/a&gt;.
&lt;br/&gt;
&lt;strong&gt;Multiple funnels, multiple outcomes.&lt;/strong&gt; Unlike the Z-test, the $\chi^2$ test can test multiple funnels and multiple outcomes at the same time.
&lt;br/&gt;
&lt;strong&gt;One-tailed distribution.&lt;/strong&gt; Unlike the Z-test, the $\chi^2$ test is directionless (technically one-tailed, but corresponds to the two-tailed Z-test in the 2x2 case).
&lt;br/&gt;
&lt;strong&gt;Degrees of freedom.&lt;/strong&gt; For a test with $F$ funnels and $C$ outcomes you have to use the $k=(F-1)(C-1)$ degree of freedom $\chi^2$ distribution to look up the p value.&lt;/p&gt;</content><category term="ab-testing"></category></entry><entry><title>A/B testing and the t-test</title><link href="https://bytepawn.com/ab-testing-and-the-ttest.html" rel="alternate"></link><published>2020-02-23T00:00:00+01:00</published><updated>2020-02-23T00:00:00+01:00</updated><author><name>Marton Trencseni</name></author><id>tag:bytepawn.com,2020-02-23:/ab-testing-and-the-ttest.html</id><summary type="html">&lt;p&gt;The t-test is better than the z-test for timespent A/B tests, because it explicitly models the uncertainty of the variance due to sampling. Using Monte-Carlo simulations I show that around N=100, the t-test becomes the z-test.&lt;br/&gt;&lt;br/&gt; &lt;img src="/images/t-test-5-10.png" alt="Normal distribution vs t-distribution" style="width: 400px;"/&gt;&lt;/p&gt;</summary><content type="html">&lt;h2&gt;Introduction&lt;/h2&gt;
&lt;p&gt;In the last post, I showed how to do &lt;a href="http://bytepawn.com/ab-testing-and-the-ztest.html"&gt;A/B testing with the z-test&lt;/a&gt;. I used two examples:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;conversions, ie. proportions ($X_A$ out of $N_A$ converted)&lt;/li&gt;
&lt;li&gt;timespents (timespents for A were $x_1, x_2, \ldots, x_N$)&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;In this post, let’s concentrate on timespent data. The t-test is a better version of z-tests for timespent data, because it explicitly models the uncertainty of the variance due to sampling. The &lt;a href="https://en.wikipedia.org/wiki/Student%27s_t-test"&gt;Wikipedia page for Student’s t-test&lt;/a&gt;:&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;The t-test is any statistical hypothesis test in which the test statistic follows a Student's t-distribution under the null hypothesis. A t-test is most commonly applied when the test statistic would follow a normal distribution if the value of a scaling term in the test statistic were known. When the scaling term is unknown and is replaced by an estimate based on the data, the test statistics (under certain conditions) follow a Student's t distribution. The t-test can be used, for example, to determine if the means of two sets of data are significantly different from each other.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;&lt;a href="https://github.com/mtrencseni/playground/blob/master/AB%20testing%20and%20the%20ttest.ipynb"&gt;The code shown below is up on Github.&lt;/a&gt;&lt;/p&gt;
&lt;h2&gt;The t-test vs the z-test&lt;/h2&gt;
&lt;p&gt;What does this mean? Before covering the z-test, &lt;a href="http://bytepawn.com/ab-testing-and-the-central-limit-theorem.html"&gt;I wrote about the Central Limit Theorem (CLT)&lt;/a&gt;. The CLT says that as we collect more independent samples from a population, we can estimate the true mean of the population by averaging our samples. The distribution of our estimate will be a normal distribution around the true mean, with variance $ \sigma^2 = \sigma_p^2 / N $, where $\sigma_p$ is the true standard deviation of the population. The population mean and standard deviation must exist, but the population doesn’t have to be normally distributed, eg. it can be exponential.&lt;/p&gt;
&lt;p&gt;When we use the z-test for timespent A/B testing, we model the distribution as a normal variable, with mean $ \mu = \frac{1}{N} \sum{ x_i } $ and variance $ \sigma^2 = s^2/N $, where $ s^2 = \frac{1}{N} \sum{(\mu - x_i)^2} $. The problem is, we cheated a little: we used $s^2$ and not $\sigma_p^2$! We do this because we don’t know $\sigma_p^2$, all we have is the estimate $s^2$.&lt;/p&gt;
&lt;p&gt;The t-test models this uncertainty in the estimation of $ \sigma^2 $. When we perform a t-test, it feels very similar to the z-test, except in some places we write $N-1$ instead of $N$. And in the end, we don’t look up a $z$ value on a normal distribution, instead we look up a $t$ value on a &lt;a href="https://en.wikipedia.org/wiki/Student%27s_t-distribution"&gt;t-distribution&lt;/a&gt;:&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;In probability and statistics, Student's t-distribution (or simply the t-distribution) is any member of a family of continuous probability distributions that arises when estimating the mean of a normally distributed population in situations where the sample size is small and the population standard deviation is unknown. If we take a sample of n observations from a normal distribution, then the t-distribution with $ \nu =n-1 $ degrees of freedom can be defined as the distribution of the location of the sample mean relative to the true mean, divided by the sample standard deviation, after multiplying by the standardizing term $ \sqrt {n} $. In this way, the t-distribution can be used to construct a confidence interval for the true mean.&lt;/p&gt;
&lt;/blockquote&gt;
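As a sketch of the mechanics (this is not code from the post; it uses a one-sample test for simplicity), here is the t-statistic computed by hand, showing where the $N-1$ and the t-distribution lookup appear, checked against scipy:

```python
import numpy as np
from scipy.stats import t, ttest_1samp

rng = np.random.default_rng(0)
sample = rng.exponential(scale=1.0, size=20)

mu0 = 1.0                           # null hypothesis: the population mean is 1
N = len(sample)
mean = sample.mean()
s = sample.std(ddof=1)              # sample standard deviation, with N-1 in the denominator
t_stat = (mean - mu0) / (s / np.sqrt(N))
p_manual = 2 * t.sf(abs(t_stat), df=N - 1)  # look up t on a t-distribution with nu = N-1

t_scipy, p_scipy = ttest_1samp(sample, mu0)  # scipy's two-sided one-sample t-test
print(p_manual, p_scipy)
```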
&lt;h2&gt;The normal distribution vs the t-distribution&lt;/h2&gt;
&lt;p&gt;As in the previous posts, we use the &lt;a href="https://docs.scipy.org/doc/scipy/reference/stats.html"&gt;scipy.stats&lt;/a&gt; module, which has pdfs for both normal and t-distributions. Compared to a standard normal distribution, the t-distribution has an additional parameter called $\nu$ or degrees of freedom (dof). When using the t-distribution on sample size $N$, $ \nu = N-1 $. Let’s plot a standard normal (blue) and t’s with $\nu=5$ (green) and $\nu=10$ (orange):&lt;/p&gt;
&lt;p&gt;&lt;img src="/images/t-test-5-10.png" alt="Normal distribution vs t-distribution" style="width: 400px;"/&gt;&lt;/p&gt;
&lt;p&gt;Note how the t-distributions have a bell shape like the normal, but have lower maximum and fatter tails.&lt;/p&gt;
&lt;p&gt;Next, let’s plot a standard normal (blue) and a t with $\nu=100$ (green):&lt;/p&gt;
&lt;p&gt;&lt;img src="/images/t-test-100.png" alt="Normal distribution vs t-distribution" style="width: 400px;"/&gt;&lt;/p&gt;
&lt;p&gt;At a moderate sample size of $N=100$ there is effectively no difference between the distributions.&lt;/p&gt;
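The plots above can be reproduced with a few lines using scipy's pdfs; this sketch assumes matplotlib and renders off-screen (drop the Agg backend line for interactive use):

```python
import numpy as np
import matplotlib
matplotlib.use('Agg')  # render off-screen; remove for interactive plotting
import matplotlib.pyplot as plt
from scipy.stats import norm, t

# Standard normal vs t-distributions with increasing degrees of freedom nu.
x = np.linspace(-5, 5, 500)
plt.plot(x, norm.pdf(x), label='N(0, 1)')
for nu in [5, 10, 100]:
    plt.plot(x, t.pdf(x, df=nu), label='t, nu=%d' % nu)
plt.legend()
plt.savefig('normal-vs-t.png')
```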
&lt;h2&gt;The t-test becomes the z-test at $ N = 100 $&lt;/h2&gt;
&lt;p&gt;As $ N \rightarrow \infty $:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;the t-distribution becomes a normal distribution&lt;/li&gt;
&lt;li&gt;the final outcome of hypothesis testing, the p-value becomes identical for a t-test and a z-test.&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;The difference effectively disappears at around $N=100$ sample size. So if you’re performing a timespent A/B test, and you have 100s or more samples in each bucket, the t-test and the z-test will yield numerically identical results. This is because at such sample sizes, the estimate $s^2$ of $\sigma_p^2$ becomes really good, and it’s divided by $N$ anyway, so the importance of the estimate goes down with increasing $N$.&lt;/p&gt;
&lt;p&gt;When googling for “z test vs t test”, a lot of advice goes like “use the t-test if you don’t know the variance” and “use the t-test for $N&amp;lt;100$”. This is not incorrect, but it’s a bit confusing. For A/B testing, a clear and concise statement is: &lt;strong&gt;in A/B testing you never know the population mean, you’re estimating it, so always use the t-test. For $N&amp;gt;100$, the t-test numerically yields the same results as the z-test.&lt;/strong&gt;&lt;/p&gt;
&lt;h2&gt;Simulating p-values&lt;/h2&gt;
&lt;p&gt;Let’s perform a Monte-Carlo simulation to see how the t-test becomes the z-test. The &lt;a href="https://www.statsmodels.org/stable/stats.html"&gt;statsmodels&lt;/a&gt; package has both t-tests and z-tests (one-sided and two-sided). Let’s assume we have true populations for A and B, we take some samples from both to estimate the mean, and we perform both a t-test and a z-test to get one-sided p-values. We then compute the average and maximum absolute p-value difference:&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;simulate_p_values&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;population_A&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;population_B&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;sample_size_A&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;sample_size_B&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;num_simulations&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mi"&gt;100&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
    &lt;span class="n"&gt;p_diffs&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[]&lt;/span&gt;
    &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;_&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="nb"&gt;range&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;num_simulations&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
        &lt;span class="n"&gt;sample_A&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;population_A&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;rvs&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;size&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;sample_size_A&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
        &lt;span class="n"&gt;sample_B&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;population_B&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;rvs&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;size&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;sample_size_B&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
        &lt;span class="n"&gt;t_stat&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;ttest_ind&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;sample_A&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;sample_B&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;value&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;alternative&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;larger&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
        &lt;span class="n"&gt;z_stat&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;ztest&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;sample_A&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;sample_B&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;value&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;alternative&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;larger&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
        &lt;span class="n"&gt;p_diff&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nb"&gt;abs&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;z_stat&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="o"&gt;-&lt;/span&gt;  &lt;span class="n"&gt;t_stat&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;])&lt;/span&gt;
        &lt;span class="n"&gt;p_diffs&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;append&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;p_diff&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;mean&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;p_diffs&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt; &lt;span class="nb"&gt;max&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;p_diffs&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/pre&gt;&lt;/div&gt;


&lt;p&gt;Let’s see what happens if we assume that both A and B are identical exponentials (so the null hypothesis is true), and we vary the sample size from 10 to 500:&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;span class="n"&gt;p_diffs&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[]&lt;/span&gt;
&lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;sample_size&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="nb"&gt;range&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;10&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;500&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
    &lt;span class="n"&gt;p_diff&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;simulate_p_values&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
        &lt;span class="n"&gt;population_A&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;expon&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="n"&gt;population_B&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;expon&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="n"&gt;sample_size_A&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;sample_size&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="n"&gt;sample_size_B&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;sample_size&lt;/span&gt;
    &lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="n"&gt;p_diffs&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;append&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;p_diff&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="n"&gt;plt&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;xlabel&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;sample size&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="n"&gt;plt&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;ylabel&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;|p z-test - p t-test|&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="n"&gt;plt&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;plot&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;p_diffs&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="n"&gt;plt&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;show&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
&lt;/pre&gt;&lt;/div&gt;


&lt;p&gt;The output is (blue is mean, orange is maximum absolute p-value difference across 100 A/B tests performed at each sample size):&lt;/p&gt;
&lt;p&gt;&lt;img src="/images/pz-same.png" alt="p-values for t and z tests" style="width: 400px;"/&gt;&lt;/p&gt;
&lt;p&gt;Note that:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;the difference tends to 0 with increasing sample size&lt;/li&gt;
&lt;li&gt;the p-value differences shown are on the order of 0.001, in real life we usually work with p-values between 0.01 and 0.05&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;Let’s see what happens when the A/B test is actually working, ie. B has better timespent on average (so the null hypothesis is false). To make the effect more visible, let’s shift B’s timespent distribution up by a whole time unit. For this, we just have to change the lines:&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;    &lt;span class="n"&gt;population_A&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;expon&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;loc&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt;
    &lt;span class="n"&gt;population_B&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;expon&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;loc&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mi"&gt;2&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt;
&lt;/pre&gt;&lt;/div&gt;


&lt;p&gt;in the above code. This yields:&lt;/p&gt;
&lt;p&gt;&lt;img src="/images/pz-different.png" alt="p-values for t and z tests" style="width: 400px;"/&gt;&lt;/p&gt;
&lt;p&gt;Comparing with the above (null hypothesis is true), in such a case the p-value difference drops even quicker. This makes sense: if there is an effect (null hypothesis is false), the tests return a lower p-value, so the difference will also be lower.&lt;/p&gt;
&lt;h2&gt;Conclusion&lt;/h2&gt;
&lt;p&gt;In a timespent A/B test scenario, we should always use the t-test. Both the t and z-tests are a library call, so there’s no difference in effort. For real-life high sample size use-cases, numerically there’s no difference in the p-values computed. However, the z-test is a much simpler mental model, as it models the test statistic as the intuitive normal distribution and there’s no degrees of freedom involved like in the t-distribution. So, my rule of thumb: &lt;strong&gt;use the t-test, pretend it’s a z-test&lt;/strong&gt;.&lt;/p&gt;</content><category term="ab-testing"></category></entry><entry><title>A/B testing and the Z-test</title><link href="https://bytepawn.com/ab-testing-and-the-ztest.html" rel="alternate"></link><published>2020-02-15T00:00:00+01:00</published><updated>2020-02-15T00:00:00+01:00</updated><author><name>Marton Trencseni</name></author><id>tag:bytepawn.com,2020-02-15:/ab-testing-and-the-ztest.html</id><summary type="html">&lt;p&gt;I discuss the Z-test for A/B testing and show how to compute parameters such as sample size from first principles. I use Monte Carlo simulations to validate significance level and statistical power, and visualize parameter scaling behaviour.&lt;br/&gt;&lt;br/&gt; &lt;img src="/images/conversion_diff.png" alt="Conversion difference vs N" style="width: 400px;"/&gt;&lt;/p&gt;</summary><content type="html">&lt;h2&gt;Introduction&lt;/h2&gt;
&lt;p&gt;In the previous two posts, we talked about &lt;a href="http://bytepawn.com/ab-testing-and-the-central-limit-theorem.html"&gt;A/B testing and the Central Limit Theorem&lt;/a&gt; (CLT) and discussed when the CLT doesn’t hold in &lt;a href="http://bytepawn.com/beyond-the-central-limit-theorem.html"&gt;Beyond the Central Limit Theorem&lt;/a&gt;. The next step in exploring A/B testing is to look at the Z-test, the most common and most straightforward statistical test.&lt;/p&gt;
&lt;p&gt;With our understanding of the CLT, the Z-test is simple to explain. We’re running a Z-test if:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;our null hypothesis is about the relationship between population means or other test statistics, and&lt;/li&gt;
&lt;li&gt;we can assume that the CLT holds and the test statistics follow a normal distribution&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;The same from &lt;a href="https://en.wikipedia.org/wiki/Z-test"&gt;Wikipedia&lt;/a&gt;:&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;A Z-test is any statistical test for which the distribution of the test statistic under the null hypothesis can be approximated by a normal distribution.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;&lt;a href="https://github.com/mtrencseni/playground/blob/master/Conversion%20AB%20test%20with%20the%20Z-test.ipynb"&gt;The code shown below is up on Github.&lt;/a&gt;&lt;/p&gt;
&lt;h2&gt;Statistical hypothesis testing&lt;/h2&gt;
&lt;p&gt;In a conversion A/B test setting, &lt;a href="https://en.wikipedia.org/wiki/Statistical_hypothesis_testing"&gt;statistical hypothesis testing&lt;/a&gt; is:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;we have a base version A and contender version B, and we’re trying to decide whether B is better than A&lt;/li&gt;
&lt;li&gt;if B is converting worse than A, then we’re done&lt;/li&gt;
&lt;li&gt;if B is converting better than A, we’d like to know how &lt;strong&gt;significant&lt;/strong&gt; our results are; in hypothesis testing, we compute the probability that we’d get this result if B is actually not better than A; ie. we compute the probability of getting the result that B is better than A due to random chance, even if B is not better than A; this probability is called the &lt;strong&gt;p-value&lt;/strong&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;To get a better feeling for the point above, imagine if somebody gives you a coin. They claim it’s a fair coin, meaning you get Heads and Tails half the time. You want to test this claim, ie. the null hypothesis of a fair coin. If you flip it 10 times, and you get 6 Hs and 4 Ts, how confident are you that it’s not a fair coin? You can’t be too sure, because you haven’t collected enough samples: the 6:4 result is a very likely outcome even if the coin is fair. The 6:4 result is not significant enough to disprove the null hypothesis of a fair coin. But if you flip it 1000 times, and you get 599 Hs and 401 Ts, that’s quite suspicious. Getting 599:401 from a fair coin is unlikely (it can be calculated explicitly, see below).&lt;/p&gt;
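The coin intuition can be made exact with a binomial test; here is a sketch using scipy's `binomtest` (available in scipy 1.7+, and not part of this post's code):

```python
from scipy.stats import binomtest

# 599 heads out of 1000 flips of a supposedly fair coin:
p_1000 = binomtest(k=599, n=1000, p=0.5, alternative='two-sided').pvalue
# 6 heads out of 10 flips:
p_10 = binomtest(k=6, n=10, p=0.5, alternative='two-sided').pvalue

# The 599:401 outcome is wildly improbable under fairness, the 6:4 outcome is not.
print(p_1000, p_10)
```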
&lt;h2&gt;Types of Z-tests&lt;/h2&gt;
&lt;p&gt;Some points to make our thinking about the Z-test clear.&lt;/p&gt;
&lt;p&gt;It’s called Z-test because when running the numbers, it’s common to transform the data to a standard normal distribution $N(0, 1)$. In the old days, before computers, the p-value, ie. the portion of the normal distribution outside the normalized test statistic (eg. the difference of the means) would be read off a printout table (eg. the back of statistics textbooks), and this statistic is conventionally denoted with the letter z. A more verbose, but descriptive name would be &lt;strong&gt;test-for-normally-distributed-test-statistics&lt;/strong&gt;.&lt;/p&gt;
&lt;p&gt;The Z-test is not one specific test, it’s a kind of test. Any time we work with an approximately normally distributed test statistic, we’re performing a Z-test. The practical bible of statistical testing, &lt;a href="https://www.amazon.com/Statistical-Tests-Third-Gopal-Kanji/dp/141292376X"&gt;100 Statistical Tests&lt;/a&gt; by Gopal Kanji, lists the following types of Z-tests:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;Test 1: Z-test for a population mean (variance known)&lt;/li&gt;
&lt;li&gt;Test 2: Z-test for two population means (variances known and equal)&lt;/li&gt;
&lt;li&gt;Test 3: Z-test for two population means (variances known and unequal)&lt;/li&gt;
&lt;li&gt;Test 4: Z-test for a proportion (binomial distribution)&lt;/li&gt;
&lt;li&gt;Test 5: Z-test for the equality of two proportions (binomial distribution)&lt;/li&gt;
&lt;li&gt;Test 6: Z-test for comparing two counts (Poisson distribution)&lt;/li&gt;
&lt;li&gt;Test 13: Z-test of a correlation coefficient&lt;/li&gt;
&lt;li&gt;Test 14: Z-test for two correlation coefficients&lt;/li&gt;
&lt;li&gt;Test 23: Z-test for correlated proportions&lt;/li&gt;
&lt;li&gt;Test 83: Z-test for the uncertainty of events&lt;/li&gt;
&lt;li&gt;Test 84: Z-test for comparing sequential contingencies across two groups using the ‘log odds ratio’&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;&lt;img src="/images/100_statistical_tests.jpg" alt="100 statistical tests" style="width: 300px;"/&gt;&lt;/p&gt;
&lt;p&gt;I listed these out to further the point that the Z-test is not just one test; it’s a type of test that makes sense in a variety of scenarios.&lt;/p&gt;
&lt;h2&gt;Formulas&lt;/h2&gt;
&lt;p&gt;The math is similar to the discussion in the CLT post. We’re sampling a distribution and computing a test statistic, and assuming that it follows a normal distribution $ N(\mu, \sigma^2) $. In an A/B test, we have two normal distributions, $ N(\mu_A, \sigma_A^2) $ and $ N(\mu_B, \sigma_B^2) $ with sample sizes $N_A$ and $N_B$, and the test statistic is $ N(\mu_B, \sigma_B^2) - N(\mu_A, \sigma_A^2) = N(\mu = \mu_B - \mu_A, \sigma^2 = \sigma_A^2 + \sigma_B^2) $ by the &lt;a href="https://en.wikipedia.org/wiki/Sum_of_normally_distributed_random_variables"&gt;addition rule for normal distributions&lt;/a&gt;.&lt;/p&gt;
&lt;p&gt;The test statistic then is $ Z = \frac{ \mu }{ \sigma } $; we use this to get the p-value for the experiment. This is simply the normalized distance from the mean of the distribution. With this normalized distance, we can use a standard normal distribution table and read off the p-value. In the age of computers, we don’t actually have to do this final normalization step to get Z; we can just get the p-value from the original $ N(\mu, \sigma^2) $ distribution.&lt;/p&gt;
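&lt;p&gt;A minimal sketch of this equivalence, with made-up numbers for the difference of means and its standard deviation: normalizing first and reading the tail of $N(0, 1)$ gives the same p-value as reading the tail of the unnormalized null distribution $N(0, \sigma^2)$ directly.&lt;/p&gt;

```python
from scipy.stats import norm

mu, sigma = 0.007, 0.004  # illustrative: observed difference of means and its standard deviation

# old-school: normalize to a standard normal, then look up the tail probability
z = mu / sigma
p_from_table = norm.sf(z)  # sf(z) = 1 - cdf(z)

# with a computer: read the tail off the N(0, sigma^2) null distribution directly
p_direct = norm.sf(mu, loc=0, scale=sigma)

print(p_from_table, p_direct)  # the two are identical
```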
&lt;p&gt;In an A/B test setting, $ \mu_A $ and $ \mu_B $ are known. The trick is, what are the variances $ \sigma_A^2 $ and $ \sigma_B^2 $? We compute them from the sample variance $ s^2 $, as $ \sigma^2 = s^2/N $. The sample variance is:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;for conversion, the population distribution is &lt;a href="https://en.wikipedia.org/wiki/Bernoulli_distribution"&gt;Bernoulli&lt;/a&gt;, so $ s^2 = \mu(1-\mu) $&lt;/li&gt;
&lt;li&gt;for timespent, you can compute the sample variance from the data directly: $ s^2 = \frac{1}{N} \sum{(\mu - x_i)^2} $, where $x_i$ are the individual timespents per user, and $ \mu = \frac{1}{N} \sum{ x_i } $.&lt;/li&gt;
&lt;/ul&gt;
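&lt;p&gt;For conversion, the Bernoulli shortcut $ s^2 = \mu(1-\mu) $ is in fact the direct formula evaluated on 0/1 data, so the two agree exactly. A quick check on simulated conversion data:&lt;/p&gt;

```python
from random import random, seed

seed(0)
# simulate 10,000 users converting at roughly 10%
xs = [1 if random() < 0.1 else 0 for _ in range(10_000)]
N = len(xs)
mu = sum(xs) / N

s2_direct    = sum((mu - x)**2 for x in xs) / N  # general formula, works for timespent too
s2_bernoulli = mu * (1 - mu)                     # shortcut for 0/1 conversion data

print(s2_direct, s2_bernoulli)  # identical up to floating point
```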
&lt;h2&gt;Sample size&lt;/h2&gt;
&lt;p&gt;Before running an A/B test, we have to decide 2 things:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;$ \alpha $, the False Positive Rate (FPR): if B is actually not better than A, by chance the measurement can still show B to be better. We can reduce this by collecting more samples, but we need to set an FPR that we are okay with. Usually people set this to 0.05, but as I discussed in the previous post &lt;a href="http://bytepawn.com/ab-tests-moving-fast-vs-being-sure.html"&gt;A/B tests: Moving Fast vs Being Sure&lt;/a&gt;, startups should favor velocity over certainty, so using 0.1 or 0.2 is fine.&lt;/li&gt;
&lt;li&gt;$ 1 - \beta $, the True Positive Rate (TPR) or power: if B is actually better than A, how likely are we to actually measure B to be better at the above $ \alpha $? If we don't account for this, the math works out to $ \beta = 0.5 $ by default, which means we would only find half of the good Bs. In real life we usually set power to 0.8. For more on power, see &lt;a href="https://influentialpoints.com/Training/statistical_power_and_sample_size-principles-properties-assumptions.htm"&gt;this article&lt;/a&gt;.&lt;/li&gt;
&lt;/ul&gt;
&lt;h2&gt;Simulating an A/B test&lt;/h2&gt;
&lt;p&gt;Let’s pretend we’re running an A/B test on funnel conversion. A is the current, B is the new version of the funnel. We want to know whether B is better. By looking at our funnel dashboard, we know that A is historically converting around 9-11%.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Step 1.&lt;/strong&gt; Formulate the action hypothesis: B has higher conversion than A, meaning we're doing a one-sided test.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Step 2.&lt;/strong&gt; We set $ \alpha = 0.10 $ and $ 1 - \beta = 0.80 $. This means we're okay with 10% false positives and we will capture 80% of improvements.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Step 3.&lt;/strong&gt; Decide traffic split. Let’s say we will keep 80% in A, 20% in B. This is how much of a chance we take, B could be worse, buggy, etc.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Step 4.&lt;/strong&gt; Figure out how many samples we need to collect, given the historic conversion, traffic split, alpha and the kind of lift we’re looking for. The code below computes sample size based on the math above:&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="nn"&gt;scipy.stats&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;norm&lt;/span&gt;

&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;alpha_to_z&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;alpha&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;one_sided&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
    &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;one_sided&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
        &lt;span class="n"&gt;pos&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt; &lt;span class="o"&gt;-&lt;/span&gt; &lt;span class="n"&gt;alpha&lt;/span&gt;
    &lt;span class="k"&gt;else&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
        &lt;span class="n"&gt;pos&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt; &lt;span class="o"&gt;-&lt;/span&gt; &lt;span class="n"&gt;alpha&lt;/span&gt;&lt;span class="o"&gt;/&lt;/span&gt;&lt;span class="mf"&gt;2.0&lt;/span&gt;
    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;norm&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;ppf&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;pos&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;power_to_z&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;power&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
    &lt;span class="n"&gt;pos&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;power&lt;/span&gt;
    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;norm&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;ppf&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;pos&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;num_samples&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;alpha&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;mu_A&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;mu_delta&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;traffic_ratio_A&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;power&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mf"&gt;0.50&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;one_sided&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="bp"&gt;True&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
    &lt;span class="n"&gt;z_alpha&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;alpha_to_z&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;alpha&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;one_sided&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="n"&gt;z_power&lt;/span&gt;  &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;power_to_z&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;power&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="n"&gt;mu_B&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;mu_A&lt;/span&gt; &lt;span class="o"&gt;+&lt;/span&gt; &lt;span class="n"&gt;mu_delta&lt;/span&gt;
    &lt;span class="n"&gt;traffic_ratio_B&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt; &lt;span class="o"&gt;-&lt;/span&gt; &lt;span class="n"&gt;traffic_ratio_A&lt;/span&gt;
    &lt;span class="n"&gt;N&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt; &lt;span class="n"&gt;mu_A&lt;/span&gt;&lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="o"&gt;-&lt;/span&gt;&lt;span class="n"&gt;mu_A&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;&lt;span class="o"&gt;/&lt;/span&gt;&lt;span class="n"&gt;traffic_ratio_A&lt;/span&gt; &lt;span class="o"&gt;+&lt;/span&gt; &lt;span class="n"&gt;mu_B&lt;/span&gt;&lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="o"&gt;-&lt;/span&gt;&lt;span class="n"&gt;mu_B&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;&lt;span class="o"&gt;/&lt;/span&gt;&lt;span class="n"&gt;traffic_ratio_B&lt;/span&gt; &lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt; &lt;span class="p"&gt;((&lt;/span&gt;&lt;span class="n"&gt;z_alpha&lt;/span&gt;&lt;span class="o"&gt;+&lt;/span&gt;&lt;span class="n"&gt;z_power&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;&lt;span class="o"&gt;**&lt;/span&gt;&lt;span class="mi"&gt;2&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;/&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;mu_A&lt;/span&gt; &lt;span class="o"&gt;-&lt;/span&gt; &lt;span class="n"&gt;mu_B&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;&lt;span class="o"&gt;**&lt;/span&gt;&lt;span class="mi"&gt;2&lt;/span&gt;  
    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;math&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;ceil&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;N&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/pre&gt;&lt;/div&gt;
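&lt;p&gt;For example, calling it with the parameters of the simulation below (10% baseline conversion, looking for a 1% lift, 80/20 traffic split, $ \alpha = 0.10 $, power 0.80) yields 27,138 total users; the function is restated here so the snippet runs standalone:&lt;/p&gt;

```python
import math
from scipy.stats import norm

def alpha_to_z(alpha, one_sided):
    return norm.ppf(1 - alpha) if one_sided else norm.ppf(1 - alpha / 2.0)

def power_to_z(power):
    return norm.ppf(power)

def num_samples(alpha, mu_A, mu_delta, traffic_ratio_A, power=0.50, one_sided=True):
    z_alpha = alpha_to_z(alpha, one_sided)
    z_power = power_to_z(power)
    mu_B = mu_A + mu_delta
    traffic_ratio_B = 1 - traffic_ratio_A
    N = (mu_A*(1-mu_A)/traffic_ratio_A + mu_B*(1-mu_B)/traffic_ratio_B) \
        * ((z_alpha + z_power)**2) / (mu_A - mu_B)**2
    return math.ceil(N)

N_required = num_samples(alpha=0.10, mu_A=0.10, mu_delta=0.01, traffic_ratio_A=0.8, power=0.80)
print(N_required)  # 27138
```

&lt;p&gt;Note that the smaller cell drives the sample size: with an 80/20 split, the $ \mu_B(1-\mu_B)/0.2 $ term dominates the variance sum.&lt;/p&gt;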


&lt;p&gt;Note that in real life, there are other considerations. For example, if possible we should run tests for complete days and/or weeks, to capture users who are active at different times. So after we calculate the sample size, we compare it to the number of users going through the funnel per day/week, and "round up". To take this into account in the simulation below, I will multiply by 2.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Step 5.&lt;/strong&gt; Create a random seed for the A/B test and save it server-side. We generate a new seed for each A/B test. Let’s say we generate the string for this one: &lt;code&gt;OkMdZa18pfr8m5sy2IL52pW9ol2EpLekgakJAIZFBbgZ&lt;/code&gt;&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Step 6.&lt;/strong&gt; Perform the test by splitting users randomly between A and B according to the above proportions. A returning user, identified by a &lt;code&gt;user_id&lt;/code&gt; (or &lt;code&gt;cookie_id&lt;/code&gt;), should always be put in the same funnel. We can accomplish this by hashing the &lt;code&gt;test_id&lt;/code&gt;, where &lt;code&gt;test_id = seed + user_id&lt;/code&gt;:&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="nn"&gt;hashlib&lt;/span&gt;

&lt;span class="n"&gt;test_seed&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="s1"&gt;&amp;#39;OkMdZa18pfr8m5sy2IL52pW9ol2EpLekgakJAIZFBbgZ&amp;#39;&lt;/span&gt;

&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;funnel_user&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;base_traffic_split&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;test_seed&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;user_id&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
    &lt;span class="n"&gt;test_id&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;hashlib&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;md5&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;test_seed&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;encode&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;ascii&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;+&lt;/span&gt; &lt;span class="nb"&gt;str&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;user_id&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;encode&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;ascii&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;hexdigest&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
    &lt;span class="n"&gt;bits&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nb"&gt;bin&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nb"&gt;int&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;test_id&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;16&lt;/span&gt;&lt;span class="p"&gt;))[&lt;/span&gt;&lt;span class="mi"&gt;3&lt;/span&gt;&lt;span class="p"&gt;:]&lt;/span&gt;
    &lt;span class="n"&gt;r&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nb"&gt;sum&lt;/span&gt;&lt;span class="p"&gt;([&lt;/span&gt;&lt;span class="nb"&gt;int&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;bit&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;&lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mf"&gt;0.5&lt;/span&gt;&lt;span class="o"&gt;**&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;i&lt;/span&gt;&lt;span class="o"&gt;+&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt; &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;i&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;bit&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="nb"&gt;enumerate&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;bits&lt;/span&gt;&lt;span class="p"&gt;)])&lt;/span&gt;
    &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;r&lt;/span&gt; &lt;span class="o"&gt;&amp;lt;&lt;/span&gt; &lt;span class="n"&gt;base_traffic_split&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
        &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="s1"&gt;&amp;#39;A&amp;#39;&lt;/span&gt;
    &lt;span class="k"&gt;else&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
        &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="s1"&gt;&amp;#39;B&amp;#39;&lt;/span&gt;
&lt;/pre&gt;&lt;/div&gt;
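&lt;p&gt;Two properties of this scheme worth sanity-checking: assignment is deterministic (the same &lt;code&gt;user_id&lt;/code&gt; always lands in the same funnel, across visits and servers), and the empirical split converges to the requested ratio. A quick check, with the function restated so the snippet runs standalone:&lt;/p&gt;

```python
import hashlib

def funnel_user(base_traffic_split, test_seed, user_id):
    test_id = hashlib.md5(test_seed.encode('ascii') + str(user_id).encode('ascii')).hexdigest()
    bits = bin(int(test_id, 16))[3:]
    r = sum([int(bit)*(0.5**(i+1)) for i, bit in enumerate(bits)])  # hash mapped to [0, 1)
    return 'A' if r < base_traffic_split else 'B'

test_seed = 'OkMdZa18pfr8m5sy2IL52pW9ol2EpLekgakJAIZFBbgZ'

# deterministic: repeated calls for the same user give the same funnel
assert funnel_user(0.8, test_seed, 42) == funnel_user(0.8, test_seed, 42)

# the empirical split converges to the requested 80/20
assignments = [funnel_user(0.8, test_seed, user_id) for user_id in range(20_000)]
ratio_A = assignments.count('A') / len(assignments)
print(ratio_A)
```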


&lt;p&gt;&lt;strong&gt;Step 7.&lt;/strong&gt; Run the test. We're simulating the real world here, so we have to pick the actual conversions for A and B. These are not known to the test; they are what it's trying to estimate, so we call them hidden variables:&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;span class="n"&gt;hidden_conversion_params&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;A&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="mf"&gt;0.105&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s1"&gt;&amp;#39;B&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="mf"&gt;0.115&lt;/span&gt; &lt;span class="p"&gt;}&lt;/span&gt;

&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;run_test&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;N&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;hidden_conversion_params&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;funnel_user_func&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
    &lt;span class="n"&gt;test_outcomes&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;A&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;N&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s1"&gt;&amp;#39;conversions&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;},&lt;/span&gt; &lt;span class="s1"&gt;&amp;#39;B&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;N&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s1"&gt;&amp;#39;conversions&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;}}&lt;/span&gt;
    &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;user_id&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="nb"&gt;range&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;N&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
        &lt;span class="n"&gt;which_funnel&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;funnel_user_func&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;user_id&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="c1"&gt;# returns &amp;#39;A&amp;#39; or &amp;#39;B&amp;#39;&lt;/span&gt;
        &lt;span class="n"&gt;test_outcomes&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;which_funnel&lt;/span&gt;&lt;span class="p"&gt;][&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;N&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="o"&gt;+=&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt;
        &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;random&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt; &lt;span class="o"&gt;&amp;lt;&lt;/span&gt; &lt;span class="n"&gt;hidden_conversion_params&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;which_funnel&lt;/span&gt;&lt;span class="p"&gt;]:&lt;/span&gt;
            &lt;span class="n"&gt;test_outcomes&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;which_funnel&lt;/span&gt;&lt;span class="p"&gt;][&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;conversions&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="o"&gt;+=&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt;
    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;test_outcomes&lt;/span&gt;
&lt;/pre&gt;&lt;/div&gt;


&lt;p&gt;&lt;strong&gt;Step 8.&lt;/strong&gt; Compute the p-value and compare it with the $ \alpha $ we set to decide whether to accept or reject B:&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;p_value&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;N_A&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;mu_A&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;N_B&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;mu_B&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;one_sided&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="bp"&gt;True&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
    &lt;span class="n"&gt;sigma_A_squared&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;mu_A&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt; &lt;span class="o"&gt;-&lt;/span&gt; &lt;span class="n"&gt;mu_A&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;/&lt;/span&gt; &lt;span class="n"&gt;N_A&lt;/span&gt;
    &lt;span class="n"&gt;sigma_B_squared&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;mu_B&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt; &lt;span class="o"&gt;-&lt;/span&gt; &lt;span class="n"&gt;mu_B&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;/&lt;/span&gt; &lt;span class="n"&gt;N_B&lt;/span&gt;
    &lt;span class="n"&gt;sigma_squared&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;sigma_A_squared&lt;/span&gt; &lt;span class="o"&gt;+&lt;/span&gt; &lt;span class="n"&gt;sigma_B_squared&lt;/span&gt;
    &lt;span class="n"&gt;z&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;mu_B&lt;/span&gt; &lt;span class="o"&gt;-&lt;/span&gt; &lt;span class="n"&gt;mu_A&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;/&lt;/span&gt; &lt;span class="n"&gt;math&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;sqrt&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;sigma_squared&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="n"&gt;p&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;z_to_p&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;z&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;one_sided&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;p&lt;/span&gt;

&lt;span class="n"&gt;alpha&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mf"&gt;0.10&lt;/span&gt;
&lt;span class="n"&gt;power&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mf"&gt;0.80&lt;/span&gt;
&lt;span class="n"&gt;base_conversion&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mf"&gt;0.10&lt;/span&gt;
&lt;span class="n"&gt;valuable_diff&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mf"&gt;0.01&lt;/span&gt;
&lt;span class="n"&gt;base_traffic_split&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mf"&gt;0.8&lt;/span&gt;

&lt;span class="n"&gt;N_required&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;num_samples&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
    &lt;span class="n"&gt;alpha&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;alpha&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;mu_A&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;base_conversion&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;mu_delta&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;valuable_diff&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;traffic_ratio_A&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;base_traffic_split&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;power&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;power&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="n"&gt;N_actual&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;2&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt; &lt;span class="n"&gt;N_required&lt;/span&gt; &lt;span class="c1"&gt;# because eg. we run it for a whole week&lt;/span&gt;

&lt;span class="c1"&gt;# hidden_conversion_params is how our funnels actually perform:&lt;/span&gt;
&lt;span class="c1"&gt;# the difference between the two is what we&amp;#39;re trying to establish&lt;/span&gt;
&lt;span class="c1"&gt;# with statistical confidence, using an A/B test&lt;/span&gt;
&lt;span class="n"&gt;hidden_conversion_params&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;A&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="mf"&gt;0.105&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s1"&gt;&amp;#39;B&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="mf"&gt;0.115&lt;/span&gt; &lt;span class="p"&gt;}&lt;/span&gt;
&lt;span class="n"&gt;test_seed&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="s1"&gt;&amp;#39;OkMdZa18pfr8m5sy2IL52pW9ol2EpLekgakJAIZFBbgZ&amp;#39;&lt;/span&gt;
&lt;span class="n"&gt;test_outcomes&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;run_test&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
    &lt;span class="n"&gt;N_actual&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;hidden_conversion_params&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="k"&gt;lambda&lt;/span&gt; &lt;span class="n"&gt;user_id&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;funnel_user&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;base_traffic_split&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;test_seed&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;user_id&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt;
&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="k"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;test_outcomes&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="n"&gt;mu_A&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;test_outcomes&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;A&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;][&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;conversions&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="o"&gt;/&lt;/span&gt; &lt;span class="n"&gt;test_outcomes&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;A&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;][&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;N&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;
&lt;span class="n"&gt;mu_B&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;test_outcomes&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;B&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;][&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;conversions&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="o"&gt;/&lt;/span&gt; &lt;span class="n"&gt;test_outcomes&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;B&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;][&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;N&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;

&lt;span class="k"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;Measured conversion for A: &lt;/span&gt;&lt;span class="si"&gt;%.3f&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;&lt;/span&gt; &lt;span class="o"&gt;%&lt;/span&gt; &lt;span class="n"&gt;mu_A&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="k"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;Measured conversion for B: &lt;/span&gt;&lt;span class="si"&gt;%.3f&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;&lt;/span&gt; &lt;span class="o"&gt;%&lt;/span&gt; &lt;span class="n"&gt;mu_B&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="n"&gt;p&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;p_value&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;test_outcomes&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;A&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;][&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;N&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt; &lt;span class="n"&gt;mu_A&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;test_outcomes&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;B&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;][&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;N&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt; &lt;span class="n"&gt;mu_B&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="k"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;p-value: &lt;/span&gt;&lt;span class="si"&gt;%.3f&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;&lt;/span&gt; &lt;span class="o"&gt;%&lt;/span&gt; &lt;span class="n"&gt;p&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;p&lt;/span&gt; &lt;span class="o"&gt;&amp;lt;=&lt;/span&gt; &lt;span class="n"&gt;alpha&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
    &lt;span class="k"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s2"&gt;&amp;quot;&amp;quot;&amp;quot;B is better, deploy&amp;quot;&amp;quot;&amp;quot;&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="k"&gt;else&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
    &lt;span class="k"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s2"&gt;&amp;quot;&amp;quot;&amp;quot;We&amp;#39;re not sure if B is better than A&amp;quot;&amp;quot;&amp;quot;&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/pre&gt;&lt;/div&gt;


&lt;p&gt;The outcome depends on chance, here's one run:&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;{&amp;#39;A&amp;#39;: {&amp;#39;N&amp;#39;: 43464, &amp;#39;conversions&amp;#39;: 4620}, &amp;#39;B&amp;#39;: {&amp;#39;N&amp;#39;: 10812, &amp;#39;conversions&amp;#39;: 1222}}
Measured conversion for A: 0.106
Measured conversion for B: 0.113
p-value: 0.023
Action: B is better, deploy
&lt;/pre&gt;&lt;/div&gt;


&lt;p&gt;If you run this repeatedly, sometimes it will indicate B is better, sometimes it will not. It will find B better more often:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;at higher $ N $, if in fact B is better&lt;/li&gt;
&lt;li&gt;if the conversion advantage of B is greater in the hidden conversion parameters&lt;/li&gt;
&lt;/ul&gt;
&lt;h2&gt;Consistency test for FPR ($\alpha$)&lt;/h2&gt;
&lt;p&gt;We can set A and B equal in the hidden conversion parameters, repeatedly perform the A/B test, and count the ratio of times it finds B to be better than A at the $ \alpha $ level, ie. the ratio of false positives. It should come out approximately equal to the $ \alpha $ we set to compute the sample size!&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;span class="n"&gt;num_tests&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;1000&lt;/span&gt;
&lt;span class="n"&gt;alpha&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mf"&gt;0.10&lt;/span&gt;
&lt;span class="n"&gt;base_conversion&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mf"&gt;0.10&lt;/span&gt;
&lt;span class="n"&gt;valuable_diff&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mf"&gt;0.01&lt;/span&gt;
&lt;span class="n"&gt;base_traffic_split&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mf"&gt;0.8&lt;/span&gt;
&lt;span class="n"&gt;hidden_conversion_params&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;A&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="mf"&gt;0.10&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s1"&gt;&amp;#39;B&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="mf"&gt;0.10&lt;/span&gt; &lt;span class="p"&gt;}&lt;/span&gt;
&lt;span class="n"&gt;N&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;num_samples&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
    &lt;span class="n"&gt;alpha&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;alpha&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;mu_A&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;base_conversion&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;mu_delta&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;valuable_diff&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;traffic_ratio_A&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;base_traffic_split&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="c1"&gt;# get rid of this of the hashing, it&amp;#39;s slow, we don&amp;#39;t need it for a simulation&lt;/span&gt;
&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;funnel_user&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;base_traffic_split&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
    &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;random&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt; &lt;span class="o"&gt;&amp;lt;&lt;/span&gt; &lt;span class="n"&gt;base_traffic_split&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
        &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="s1"&gt;&amp;#39;A&amp;#39;&lt;/span&gt;
    &lt;span class="k"&gt;else&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
        &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="s1"&gt;&amp;#39;B&amp;#39;&lt;/span&gt;

&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;run_tests&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;num_tests&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;N&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;alpha&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;hidden_conversion_params&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;funnel_user_func&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
    &lt;span class="n"&gt;num_successes&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;
    &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;i&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="nb"&gt;range&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;num_tests&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
        &lt;span class="n"&gt;test_outcomes&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;run_test&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
            &lt;span class="n"&gt;N&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
            &lt;span class="n"&gt;hidden_conversion_params&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
            &lt;span class="n"&gt;funnel_user_func&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="p"&gt;)&lt;/span&gt;
        &lt;span class="n"&gt;mu_A&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;test_outcomes&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;A&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;][&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;conversions&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="o"&gt;/&lt;/span&gt; &lt;span class="n"&gt;test_outcomes&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;A&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;][&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;N&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;
        &lt;span class="n"&gt;mu_B&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;test_outcomes&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;B&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;][&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;conversions&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="o"&gt;/&lt;/span&gt; &lt;span class="n"&gt;test_outcomes&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;B&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;][&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;N&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;
        &lt;span class="n"&gt;p&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;p_value&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;test_outcomes&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;A&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;][&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;N&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt; &lt;span class="n"&gt;mu_A&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;test_outcomes&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;B&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;][&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;N&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt; &lt;span class="n"&gt;mu_B&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
        &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;p&lt;/span&gt; &lt;span class="o"&gt;&amp;lt;=&lt;/span&gt; &lt;span class="n"&gt;alpha&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
            &lt;span class="n"&gt;num_successes&lt;/span&gt; &lt;span class="o"&gt;+=&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt;
    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;num_successes&lt;/span&gt;

&lt;span class="n"&gt;num_successes&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;run_tests&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
    &lt;span class="n"&gt;num_tests&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;N&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;alpha&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;hidden_conversion_params&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="k"&gt;lambda&lt;/span&gt; &lt;span class="n"&gt;_&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;funnel_user&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;base_traffic_split&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt;
&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="n"&gt;fpr&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;num_successes&lt;/span&gt;&lt;span class="o"&gt;/&lt;/span&gt;&lt;span class="n"&gt;num_tests&lt;/span&gt;
&lt;span class="k"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s2"&gt;&amp;quot;False Positive Rate = &lt;/span&gt;&lt;span class="si"&gt;%.2f&lt;/span&gt;&lt;span class="s2"&gt; (expected = &lt;/span&gt;&lt;span class="si"&gt;%.2f&lt;/span&gt;&lt;span class="s2"&gt;)&amp;quot;&lt;/span&gt; &lt;span class="o"&gt;%&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;fpr&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;alpha&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;
&lt;/pre&gt;&lt;/div&gt;


&lt;p&gt;It prints:&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;False Positive Rate = 0.10 (expected = 0.10)
&lt;/pre&gt;&lt;/div&gt;


&lt;h2&gt;Consistency test for TPR ($1 - \beta$)&lt;/h2&gt;
&lt;p&gt;We can set the A and B hidden conversion parameters to exactly the values we assumed when computing the sample size (ie. 10% and 11%), repeatedly perform the A/B test, and count the fraction of runs in which it finds B better than A at the $ \alpha $ level. This should equal the $ 1 - \beta $ power we set to compute the sample size!&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;span class="n"&gt;num_tests&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;1000&lt;/span&gt;
&lt;span class="n"&gt;alpha&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mf"&gt;0.10&lt;/span&gt;
&lt;span class="n"&gt;power&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mf"&gt;0.80&lt;/span&gt;
&lt;span class="n"&gt;base_conversion&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mf"&gt;0.10&lt;/span&gt;
&lt;span class="n"&gt;valuable_diff&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mf"&gt;0.01&lt;/span&gt;
&lt;span class="n"&gt;base_traffic_split&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mf"&gt;0.8&lt;/span&gt;
&lt;span class="n"&gt;hidden_conversion_params&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;A&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="mf"&gt;0.10&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s1"&gt;&amp;#39;B&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="mf"&gt;0.11&lt;/span&gt; &lt;span class="p"&gt;}&lt;/span&gt;
&lt;span class="n"&gt;N&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;num_samples&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
    &lt;span class="n"&gt;alpha&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;alpha&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;power&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;power&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;mu_A&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;base_conversion&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;mu_delta&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;valuable_diff&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;traffic_ratio_A&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;base_traffic_split&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="n"&gt;num_successes&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;run_tests&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
    &lt;span class="n"&gt;num_tests&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;N&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;alpha&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;hidden_conversion_params&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="k"&gt;lambda&lt;/span&gt; &lt;span class="n"&gt;_&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;funnel_user&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;base_traffic_split&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt;
&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="n"&gt;tpr&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;num_successes&lt;/span&gt;&lt;span class="o"&gt;/&lt;/span&gt;&lt;span class="n"&gt;num_tests&lt;/span&gt;
&lt;span class="k"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s2"&gt;&amp;quot;True Positive Rate: &lt;/span&gt;&lt;span class="si"&gt;%.2f&lt;/span&gt;&lt;span class="s2"&gt; (expected = &lt;/span&gt;&lt;span class="si"&gt;%.2f&lt;/span&gt;&lt;span class="s2"&gt;)&amp;quot;&lt;/span&gt; &lt;span class="o"&gt;%&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;tpr&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;power&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;
&lt;/pre&gt;&lt;/div&gt;


&lt;p&gt;It prints:&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;True Positive Rate: 0.80 (expected = 0.80)
&lt;/pre&gt;&lt;/div&gt;


&lt;h2&gt;Scaling behaviour&lt;/h2&gt;
&lt;p&gt;Let's fix all parameters but one, and see how many samples we need.&lt;/p&gt;
&lt;p&gt;Vary base conversion $\mu_A$, with fixed $ \mu_B - \mu_A, \alpha, 1 - \beta$, traffic split. Because the formula for $z$ includes a term like $\mu_A(1-\mu_A)$, the required $N$ should be highest at $\mu_A=0.5$.&lt;/p&gt;
&lt;p&gt;&lt;img src="/images/base_conversion_N.png" alt="Base conversion vs N" style="width: 400px;"/&gt;&lt;/p&gt;
&lt;p&gt;Vary conversion difference $\mu_B - \mu_A$, with fixed $ \mu_A, \alpha, 1 - \beta$, traffic split. A higher conversion difference requires fewer samples to detect.&lt;/p&gt;
&lt;p&gt;&lt;img src="/images/conversion_diff.png" alt="Conversion difference vs N" style="width: 400px;"/&gt;&lt;/p&gt;
&lt;p&gt;Vary $\alpha$, with fixed $ \mu_A, \mu_B - \mu_A, 1 - \beta$, traffic split. Lower $\alpha$ means we want fewer false positives, which requires more samples.&lt;/p&gt;
&lt;p&gt;&lt;img src="/images/alpha_N.png" alt="Alpha vs N" style="width: 400px;"/&gt;&lt;/p&gt;
&lt;p&gt;Vary the power $1 - \beta$, with fixed $ \mu_A, \mu_B - \mu_A, \alpha$, traffic split. Higher $1 - \beta$ translates to higher probability of detecting positive outcomes, which requires more samples.&lt;/p&gt;
&lt;p&gt;&lt;img src="/images/power_N.png" alt="Power vs N" style="width: 400px;"/&gt;&lt;/p&gt;
&lt;p&gt;Vary the traffic split to A, with fixed $ \mu_A, \mu_B - \mu_A, \alpha, 1 - \beta$. The test is constrained by the smaller of the two groups, so an equal split requires the fewest total samples.&lt;/p&gt;
&lt;p&gt;&lt;img src="/images/traffic_split_N.png" alt="Traffic split vs N" style="width: 400px;"/&gt;&lt;/p&gt;
&lt;h2&gt;Conclusion&lt;/h2&gt;
&lt;p&gt;Z-tests are simple, if you remember the CLT and are careful about controlling the false positive and true positive rates. If in doubt, write simulation code like the above and make sure the way you set your parameters gets you the results you want. Also remember that there are other types of tests, such as the &lt;a href="https://en.wikipedia.org/wiki/Chi-squared_test"&gt;$\chi^2$-test&lt;/a&gt; and the &lt;a href="https://en.wikipedia.org/wiki/Student%27s_t-test"&gt;t-test&lt;/a&gt;, to be discussed in the next posts.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Related links:&lt;/strong&gt;&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;&lt;a href="https://towardsdatascience.com/the-art-of-a-b-testing-5a10c9bb70a4"&gt;The Art of A/B Testing&lt;/a&gt; - good post on the same topic&lt;/li&gt;
&lt;/ul&gt;</content><category term="ab-testing"></category></entry><entry><title>Beyond the Central Limit Theorem</title><link href="https://bytepawn.com/beyond-the-central-limit-theorem.html" rel="alternate"></link><published>2020-02-06T00:00:00+01:00</published><updated>2020-02-06T00:00:00+01:00</updated><author><name>Marton Trencseni</name></author><id>tag:bytepawn.com,2020-02-06:/beyond-the-central-limit-theorem.html</id><summary type="html">&lt;p&gt;In the previous post, I talked about the importance of the Central Limit Theorem (CLT) to A/B testing. Here we will explore cases when we cannot rely on the CLT to hold.&lt;br/&gt;&lt;br/&gt; &lt;img src="/images/running_mean_cauchy.PNG" alt="Running mean for Cauchy distribution" style="width: 400px;"/&gt;&lt;/p&gt;</summary><content type="html">&lt;h2&gt;Introduction&lt;/h2&gt;
&lt;p&gt;In the previous post, &lt;a href="http://bytepawn.com/ab-testing-and-the-central-limit-theorem.html"&gt;A/B testing and the Central Limit Theorem&lt;/a&gt;, I talked about the importance of the &lt;a href="https://en.wikipedia.org/wiki/Central_limit_theorem"&gt;Central Limit Theorem&lt;/a&gt; (CLT) to A/B testing. Here we will explore cases when we cannot rely on the CLT to hold. Exploring when a theorem doesn’t hold is a good way to deepen our understanding of why the theorem works when it works. It’s a bit like writing tests for software and trying to break it.&lt;/p&gt;
&lt;p&gt;I will show 3 cases when we cannot rely on the CLT to hold:&lt;/p&gt;
&lt;ol&gt;
&lt;li&gt;the distribution does not have a mean, eg. the Cauchy distribution&lt;/li&gt;
&lt;li&gt;violating the independence assumption of the CLT, eg. with a random walk&lt;/li&gt;
&lt;li&gt;small sample size, eg. when events such as fraudulent transactions have a very low probability&lt;/li&gt;
&lt;/ol&gt;
&lt;p&gt;&lt;a href="https://github.com/mtrencseni/playground/blob/master/Beyond%20the%20Central%20Limit%20Theorem.ipynb"&gt;The code shown below is up on Github.&lt;/a&gt;&lt;/p&gt;
&lt;h2&gt;1. The distribution does not have a mean&lt;/h2&gt;
&lt;p&gt;The CLT says that when we are approximating the mean of a distribution by sampling, the sample means follow a normal distribution. So the CLT is about approximating the mean of a distribution. What if the distribution does not have a mean? In cases like this, we can still sample it, and compute a mean, but:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;since the original distribution doesn't have a mean, we're not approximating it&lt;/li&gt;
&lt;li&gt;the sampled means will not converge to any value, they will keep jumping around&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;How is it even possible for a distribution not to have a mean? The mean, or &lt;a href="https://en.wikipedia.org/wiki/Expected_value"&gt;expected value&lt;/a&gt; $ E[X] $ of a continuous random variable $ X $ with &lt;a href="https://en.wikipedia.org/wiki/Probability_density_function"&gt;probability density function&lt;/a&gt; (pdf) $ f $ is given by:&lt;/p&gt;
&lt;p&gt;$ E[X] = \int x f(x) dx $, where $ \int f(x) dx = 1 $&lt;/p&gt;
&lt;p&gt;For a discrete random variable:&lt;/p&gt;
&lt;p&gt;$ E[X] = \sum_i x_i p_i $, where $ \sum_i p_i = 1 $&lt;/p&gt;
&lt;p&gt;The mean of a distribution does not exist if the integral or sum does not exist, ie. for a "pathological" $ f $ or $ p_i $.&lt;/p&gt;
&lt;p&gt;One example is the &lt;a href="https://en.wikipedia.org/wiki/Cauchy_distribution"&gt;Cauchy distribution&lt;/a&gt;, defined by $ f(x) = \frac{ 1 }{ \pi ( 1 + x^2 )} $. If you plug this $f$ into the integral above, the integral does not exist.&lt;/p&gt;
&lt;p&gt;The Cauchy distribution looks like this (normal in blue, Cauchy in orange; notice how the Cauchy is narrower in the center and fatter in the tails than the normal):&lt;/p&gt;
&lt;p&gt;&lt;img src="/images/norm_cauchy.PNG" alt="Cauchy and normal" style="width: 400px;"/&gt;&lt;/p&gt;
&lt;p&gt;We can see this for ourselves with a Monte Carlo simulation:&lt;/p&gt;
&lt;ol&gt;
&lt;li&gt;Draw samples from a distribution&lt;/li&gt;
&lt;li&gt;Every 100 samples, compute the running mean (from the very beginning)&lt;/li&gt;
&lt;li&gt;Plot the running mean, together with the &lt;a href="https://en.wikipedia.org/wiki/Standard_error"&gt;standard error&lt;/a&gt;&lt;/li&gt;
&lt;/ol&gt;
&lt;p&gt;Like in the previous post, we'll use &lt;a href="https://www.scipy.org/"&gt;scipy&lt;/a&gt;. First let's do this for distributions that have a mean: uniform ($ \mu = 0.5 $), exponential ($ \mu = 1 $) and a standard normal ($ \mu = 0 $):&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;population_running_mean_plot&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;population&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;sample_size&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mi"&gt;10&lt;/span&gt;&lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="mi"&gt;1000&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
    &lt;span class="n"&gt;sample&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;population&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;rvs&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;size&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;sample_size&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="n"&gt;step_size&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nb"&gt;int&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nb"&gt;len&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;sample&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;&lt;span class="o"&gt;/&lt;/span&gt;&lt;span class="mi"&gt;100&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="n"&gt;running_stats&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[]&lt;/span&gt;
    &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;i&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="nb"&gt;range&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;100&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nb"&gt;len&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;sample&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt; &lt;span class="n"&gt;step_size&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
        &lt;span class="n"&gt;running_sample&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;sample&lt;/span&gt;&lt;span class="p"&gt;[:&lt;/span&gt;&lt;span class="n"&gt;i&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;
        &lt;span class="n"&gt;running_stats&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;append&lt;/span&gt;&lt;span class="p"&gt;(([&lt;/span&gt;&lt;span class="n"&gt;i&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;mean&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;running_sample&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt; &lt;span class="n"&gt;sem&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;running_sample&lt;/span&gt;&lt;span class="p"&gt;)]))&lt;/span&gt;
    &lt;span class="n"&gt;x&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;x&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;x&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;running_stats&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;
    &lt;span class="n"&gt;y&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;x&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;x&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;running_stats&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;
    &lt;span class="n"&gt;envelope_min&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;x&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="o"&gt;-&lt;/span&gt; &lt;span class="n"&gt;x&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="mi"&gt;2&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;x&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;running_stats&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;
    &lt;span class="n"&gt;envelope_max&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;x&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="o"&gt;+&lt;/span&gt; &lt;span class="n"&gt;x&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="mi"&gt;2&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;x&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;running_stats&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;
    &lt;span class="n"&gt;plt&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;plot&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;x&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;y&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="n"&gt;plt&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;fill_between&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;x&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;envelope_min&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;envelope_max&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;alpha&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mf"&gt;0.2&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="n"&gt;plt&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;show&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
&lt;span class="n"&gt;population_running_mean_plot&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;uniform&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="n"&gt;population_running_mean_plot&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;expon&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="n"&gt;population_running_mean_plot&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;norm&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/pre&gt;&lt;/div&gt;


&lt;p&gt;Let's see how the running mean converges to the true sample mean after $N=10000$ samples:&lt;/p&gt;
&lt;p&gt;&lt;img src="/images/running_mean_convergence.PNG" alt="Running mean converges for the uniform, exponential and normal distribution" style="width: 400px;"/&gt;&lt;/p&gt;
&lt;p&gt;Let's do the same for the Cauchy distribution, but let's let it run for $N=10000000$ samples:&lt;/p&gt;
&lt;p&gt;&lt;img src="/images/running_mean_cauchy.PNG" alt="Running mean for Cauchy distribution" style="width: 400px;"/&gt;&lt;/p&gt;
&lt;p&gt;For the Cauchy, even after millions of samples, there are occasional big jumps in the running mean (unlike with the previous distributions). It does not converge: if we keep running it, it will keep jumping around, and it will jump to arbitrarily large values.&lt;/p&gt;
&lt;p&gt;Why is this? The Cauchy distribution can be visualized like this: imagine drawing the bottom part of a circle (a half-circle) around the center (x=0, y=1). Each time we want to sample a number from the Cauchy distribution, first we pick an angle $ \theta \in ( -\pi/2, \pi/2) $ in a uniform way on the half circle, and then shoot a ray from the center at that $\theta$ angle to the half-circle, and then on to the x-axis. The coordinate of the x-axis intercept is the returned value for the Cauchy sampling. Although the angle $\theta$ is uniform, the ray can shoot arbitrarily far on the x-axis to generate extremely large or small values, and this happens quite often (the fat tail). Note that the normal distribution can also generate arbitrarily small or large values, but at a lower rate, so the mean still exists there.&lt;/p&gt;
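&lt;p&gt;The half-circle picture translates directly to code: the x-axis intercept of a ray at angle $ \theta $ is $ \tan(\theta) $, so uniform angles yield Cauchy draws. A small sketch of this sampler (my addition; the post's own code uses scipy):&lt;/p&gt;

```python
from math import tan, pi
from random import Random

rng = Random(42)  # fixed seed for reproducibility

def cauchy_draw():
    # pick a uniform angle on the half-circle, then shoot a ray to the x-axis
    theta = rng.uniform(-pi / 2, pi / 2)
    return tan(theta)  # the x-axis intercept is the sampled value

draws = sorted(cauchy_draw() for _ in range(100_000))
median = draws[len(draws) // 2]
print("median:", median)      # the median exists and sits near 0...
print("largest:", draws[-1])  # ...but extreme values are routine (the fat tail)
```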
&lt;p&gt;Another way to see this is to look at a scatter plot of $N=100000$ draws and compare it with a normal distribution (top is normal, bottom is Cauchy):&lt;/p&gt;
&lt;p&gt;&lt;img src="/images/scatter_norm_cauchy.PNG" alt="Scatter plot for normal and Cauchy distributions" style="width: 400px;"/&gt;&lt;/p&gt;
&lt;p&gt;The normal doesn't produce extreme values, whereas the Cauchy does.&lt;/p&gt;
&lt;h2&gt;2. Violating the independence assumption of the CLT&lt;/h2&gt;
&lt;p&gt;The wikipedia quote for the CLT starts like this: &lt;em&gt;“...when &lt;strong&gt;independent&lt;/strong&gt; random variables are added...”&lt;/em&gt;. In other words, the samples we draw should be independent of each other. What does this mean for an A/B test? For example, if user X and user Y are both using the product, they should not talk to each other, they should not influence each other when making the conversion “decision”, or when "deciding" how much time to spend with the product.&lt;/p&gt;
&lt;p&gt;A simple thought experiment that shows how dependence breaks the CLT is the following: suppose a sociology PhD student is taking a salary survey, she sits in a room, test subjects go in and tell her their salary figure (like $50,000), she records it, they leave the room, and the next person goes in. Now suppose that the person who is about to go in asks the person who just left what salary number they said. Then, because they want to look good, they decide to inflate their number and say a bigger number than the previous person, just to impress the sociology PhD student. Putting aside the fact that the survey is flawed, the poor student, if she keeps a running mean of salaries, will see that it keeps going up, and it doesn’t converge. The problem is that she gave her subjects a chance to communicate, and the individual measurements are no longer independent. She needs to have 2 doors, one for incoming and one for outgoing subjects, and make sure people don’t talk to each other.&lt;/p&gt;
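&lt;p&gt;The survey story is easy to simulate: each subject inflates the previous answer a little, and the running mean climbs instead of converging. A toy sketch (my addition, with made-up numbers):&lt;/p&gt;

```python
from random import Random

rng = Random(0)

salary = 50_000.0  # the first subject's (hypothetical) answer
total = 0.0
running_means = []
for i in range(1, 1001):
    # each subject hears the previous answer and inflates it a little
    salary *= 1 + rng.uniform(0.0, 0.05)
    total += salary
    running_means.append(total / i)

# the running mean keeps climbing instead of settling on a value
print(running_means[9], running_means[99], running_means[-1])
```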
&lt;p&gt;In statistics, we can construct a similar case by using a &lt;a href="https://en.wikipedia.org/wiki/Random_walk"&gt;random walk&lt;/a&gt;: a frog starts at 0, and either goes to +1 or -1 with even probabilities, and so on. We can imagine the frog’s position to be the draws of a distribution, and clearly the subsequent draws are not independent: if the $t=42$ draw was 13 (the frog was at position 13 after 42 time steps), the $t=43$ draw is going to be either 12 or 14, it’s conditioned on the previous draw(s). Similarly to the PhD student’s case, if we keep sampling these numbers and compute the mean, it will not converge. Note that this is not the same as the Cauchy case: here, at each step $t$ (so each random variable in the series) the mean exists and is finite; the mean position at $t=42$ or $t=43$ can be computed and it’s a finite number (it must be 0, see below). But averaging these dependent random variables yields a random variable that breaks the CLT, because the sum does not converge.&lt;/p&gt;
&lt;p&gt;Let's see this in action:&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;random_walk_draw&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;num_steps&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;np&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;cumsum&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;2&lt;/span&gt;&lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;bernoulli&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;p&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mf"&gt;0.5&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;rvs&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;size&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;num_steps&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;&lt;span class="o"&gt;-&lt;/span&gt;&lt;span class="mf"&gt;0.5&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;
    &lt;span class="c1"&gt;#return np.cumsum(norm().rvs(size=num_steps)) # we can also generate the steps with a std. normal&lt;/span&gt;

&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;random_walk_running_mean_plot&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;sample_size&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mi"&gt;10&lt;/span&gt;&lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="mi"&gt;1000&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;martingale_normalize&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="bp"&gt;False&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
    &lt;span class="n"&gt;sample&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;random_walk_draw&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;num_steps&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;sample_size&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="n"&gt;step_size&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nb"&gt;int&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nb"&gt;len&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;sample&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;&lt;span class="o"&gt;/&lt;/span&gt;&lt;span class="mi"&gt;100&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="n"&gt;running_stats&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[]&lt;/span&gt;
    &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;i&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="nb"&gt;range&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;100&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nb"&gt;len&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;sample&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt; &lt;span class="n"&gt;step_size&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
        &lt;span class="n"&gt;running_sample&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;sample&lt;/span&gt;&lt;span class="p"&gt;[:&lt;/span&gt;&lt;span class="n"&gt;i&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;
        &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;martingale_normalize&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
            &lt;span class="n"&gt;m&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;mean&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;running_sample&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;&lt;span class="o"&gt;/&lt;/span&gt;&lt;span class="n"&gt;sqrt&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;i&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
        &lt;span class="k"&gt;else&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
            &lt;span class="n"&gt;m&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;mean&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;running_sample&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
        &lt;span class="n"&gt;running_stats&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;append&lt;/span&gt;&lt;span class="p"&gt;(([&lt;/span&gt;&lt;span class="n"&gt;i&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;m&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;sem&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;running_sample&lt;/span&gt;&lt;span class="p"&gt;)]))&lt;/span&gt;
    &lt;span class="n"&gt;x&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;x&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;x&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;running_stats&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;
    &lt;span class="n"&gt;y&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;x&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;x&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;running_stats&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;
    &lt;span class="n"&gt;envelope_min&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;x&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="o"&gt;-&lt;/span&gt; &lt;span class="n"&gt;x&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="mi"&gt;2&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;x&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;running_stats&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;
    &lt;span class="n"&gt;envelope_max&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;x&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="o"&gt;+&lt;/span&gt; &lt;span class="n"&gt;x&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="mi"&gt;2&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;x&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;running_stats&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;
    &lt;span class="n"&gt;plt&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;plot&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;x&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;y&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="n"&gt;plt&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;plot&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;sample&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="n"&gt;plt&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;fill_between&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;x&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;envelope_min&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;envelope_max&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;alpha&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mf"&gt;0.2&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="n"&gt;plt&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;show&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;

&lt;span class="n"&gt;random_walk_running_mean_plot&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;sample_size&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mi"&gt;10&lt;/span&gt;&lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="mi"&gt;1000&lt;/span&gt;&lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="mi"&gt;1000&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/pre&gt;&lt;/div&gt;


&lt;p&gt;Even after $t=10000000$ steps, the running mean is still moving around (orange is the random walk itself, blue is the running mean):&lt;/p&gt;
&lt;p&gt;&lt;img src="/images/random_walk.PNG" alt="Random walk" style="width: 400px;"/&gt;&lt;/p&gt;
&lt;p&gt;Two interesting notes here:&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;A.&lt;/strong&gt; As I mentioned, the mean at any $t$-th timestep of the random walk can be sampled; it exists, and the CLT works. This is because up until the $t$-th timestep, the frog can be at most $t$ steps away from the origin (between $-t$ and $t$), so at any $t$ the probability distribution is bounded, the mean exists, and it is 0 (the true population mean). We can see this for ourselves:&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;random_walk_sample_mean_plot&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;sample_size&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mi"&gt;1000&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;num_samples&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mi"&gt;1000&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
    &lt;span class="n"&gt;sample_means&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;mean&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;random_walk_draw&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;num_steps&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;sample_size&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt; &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;_&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="nb"&gt;range&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;num_samples&lt;/span&gt;&lt;span class="p"&gt;)]&lt;/span&gt;
    &lt;span class="n"&gt;mn&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nb"&gt;min&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;sample_means&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="n"&gt;mx&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nb"&gt;max&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;sample_means&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="n"&gt;rng&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;mx&lt;/span&gt; &lt;span class="o"&gt;-&lt;/span&gt; &lt;span class="n"&gt;mn&lt;/span&gt;
    &lt;span class="n"&gt;padding&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;rng&lt;/span&gt; &lt;span class="o"&gt;/&lt;/span&gt; &lt;span class="mf"&gt;4.0&lt;/span&gt;
    &lt;span class="n"&gt;resolution&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;rng&lt;/span&gt; &lt;span class="o"&gt;/&lt;/span&gt; &lt;span class="mf"&gt;100.0&lt;/span&gt;
    &lt;span class="n"&gt;z&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;np&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;arange&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;mn&lt;/span&gt; &lt;span class="o"&gt;-&lt;/span&gt; &lt;span class="n"&gt;padding&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;mx&lt;/span&gt; &lt;span class="o"&gt;+&lt;/span&gt; &lt;span class="n"&gt;padding&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;resolution&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="n"&gt;plt&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;plot&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;z&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;norm&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;mean&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;sample_means&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt; &lt;span class="n"&gt;std&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;sample_means&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;pdf&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;z&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;
    &lt;span class="n"&gt;plt&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;hist&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;sample_means&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;density&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="bp"&gt;True&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;alpha&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mf"&gt;0.5&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="n"&gt;plt&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;show&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;

&lt;span class="n"&gt;random_walk_sample_mean_plot&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;sample_size&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mi"&gt;10&lt;/span&gt;&lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="mi"&gt;1000&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;num_samples&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mi"&gt;1000&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/pre&gt;&lt;/div&gt;


&lt;p&gt;At $t=10000$ steps:&lt;/p&gt;
&lt;p&gt;&lt;img src="/images/random_walk_after_10000.PNG" alt="Random walk sample means after 10,000 steps" style="width: 400px;"/&gt;&lt;/p&gt;
&lt;p&gt;If you're a physicist, you'll know that the standard error is $s=100$, because $s=\sqrt{t}$ for such a random walk, see below.&lt;/p&gt;
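&lt;p&gt;We can check the $\sqrt{t}$ scaling numerically. Here is a minimal sketch, independent of the plotting code above (the number of walks is an arbitrary choice):&lt;/p&gt;

```python
import numpy as np

rng = np.random.default_rng(0)

def final_positions(t, num_walks=5_000):
    # each walk is t steps of +1/-1; the final position is their sum
    steps = rng.integers(0, 2, size=(num_walks, t), dtype=np.int8) * 2 - 1
    return steps.sum(axis=1)

# the empirical spread of final positions tracks sqrt(t)
for t in [100, 1_000, 10_000]:
    print(t, final_positions(t).std(), np.sqrt(t))
```

&lt;p&gt;At $t=10000$ the empirical standard deviation comes out close to 100, matching $\sqrt{t}$.&lt;/p&gt;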
&lt;p&gt;&lt;strong&gt;B.&lt;/strong&gt; Cases like the random walk can be handled by an extension of the CLT called the &lt;a href="https://en.wikipedia.org/wiki/Martingale_central_limit_theorem"&gt;Martingale Central Limit Theorem&lt;/a&gt;. It essentially says that if you normalize the mean by a suitable function of the number of steps, then the normalized mean converges and exists:&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;In probability theory, the central limit theorem says that, under certain conditions, the sum of many independent identically-distributed random variables, when scaled appropriately, converges in distribution to a standard normal distribution. The martingale central limit theorem generalizes this result for random variables to martingales, which are stochastic processes where the change in the value of the process from time t to time t + 1 has expectation zero, even conditioned on previous outcomes.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;Because a simple random walk will, after $ t $ timesteps, on average be $ \sqrt{t} $ steps away from the origin, the normalizing factor is $ \sqrt{t} $. With this, the normalized mean converges to 0. It must be 0 because the whole setup is symmetric around the origin (orange is the random walk itself, blue is the running mean):&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;random_walk_running_mean_plot(sample_size=10*1000*1000, martingale_normalize=True)
&lt;/pre&gt;&lt;/div&gt;


&lt;p&gt;&lt;img src="/images/martingale_mean.PNG" alt="Martingale mean" style="width: 400px;"/&gt;&lt;/p&gt;
&lt;p&gt;But note that this is no longer the original CLT. Random walks like this are closely related to &lt;a href="https://en.wikipedia.org/wiki/Brownian_motion"&gt;Brownian motion&lt;/a&gt;, which Einstein famously explained in one of his 1905 papers (his Nobel Prize, though, was awarded for the photoelectric effect).&lt;/p&gt;
&lt;h2&gt;3. Small sample size&lt;/h2&gt;
&lt;p&gt;We also cannot rely on the CLT when the sample size is small. The CLT says that as we increase the sample size $N$, we get arbitrarily close to a normal distribution (even though we only reach it in the $ N \rightarrow \infty $ limit). In real life, we don’t have infinite time, so we always stop at some fixed $N$. It's the Data Scientist's job to make sure we have a big enough $N$ that our approximation is good enough.&lt;/p&gt;
&lt;p&gt;Let's use conversions, i.e. the &lt;a href="https://en.wikipedia.org/wiki/Bernoulli_distribution"&gt;Bernoulli&lt;/a&gt; distribution, as an example, because it illustrates this point in a counter-intuitive but instructive way. In the previous post we saw that even at $N=1000$ the sample means nicely approximate a normal... that is, when the $p$ parameter of the Bernoulli is $p=0.5$:&lt;/p&gt;
&lt;p&gt;&lt;img src="/images/bernoulli_mean_sample.PNG" alt="Bernoulli sample means" style="width: 400px;"/&gt;&lt;/p&gt;
&lt;p&gt;Let's run the same simulation, but at $p=0.001$, with $N=1000$:&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;span class="n"&gt;population_sample_mean_plot&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;bernoulli&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;p&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mf"&gt;0.001&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;
&lt;/pre&gt;&lt;/div&gt;


&lt;p&gt;We get this:&lt;/p&gt;
&lt;p&gt;&lt;img src="/images/bernoulli_p_small.PNG" alt="Bernoulli at p=0.001, N=1,000" style="width: 400px;"/&gt;&lt;/p&gt;
&lt;p&gt;Clearly, this does not (yet) look like a bell curve, and it's strongly asymmetric. What happens is that $Np=1$, so on average we get just 1 conversion per sample. A lot of the time we get 0, most often 1, sometimes 2, 3, and so on. At such low $Np$, we need more samples to accurately explore the region around the true mean. If we run at, say, $N=10000$, then $Np=10$, so we get counts like $0, 1, 2, \ldots, 10, \ldots$, and a clearer picture around the average of 10.&lt;/p&gt;
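&lt;p&gt;We can make the "0, 1, sometimes 2 or 3 conversions" intuition concrete by counting conversions per sample directly. A minimal sketch (it draws the per-sample conversion counts in one shot with numpy's binomial, rather than summing individual Bernoulli draws):&lt;/p&gt;

```python
from collections import Counter

import numpy as np

rng = np.random.default_rng(0)
N, p = 1000, 0.001

# number of conversions in each of 10,000 samples of size N=1000
counts = rng.binomial(N, p, size=10_000)
print(sorted(Counter(counts).items()))  # mostly 0s and 1s, occasionally 2, 3, ...
```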
&lt;p&gt;To be clear, the CLT still holds for a low $p$ Bernoulli distribution, we just need to collect more samples, so we get enough conversions per sample to actually see the bell curve. Let's repeat the previous simulation, but at $N=100000$. It shows a beautiful bell curve centered around $p=0.001$:&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;span class="n"&gt;population_sample_mean_plot&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;bernoulli&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;p&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mf"&gt;0.001&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt; &lt;span class="n"&gt;sample_size&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mi"&gt;100&lt;/span&gt;&lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="mi"&gt;1000&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/pre&gt;&lt;/div&gt;


&lt;p&gt;&lt;img src="/images/bernoulli_p_small_high_n.PNG" alt="Bernoulli at p=0.001, N=100,000" style="width: 400px;"/&gt;&lt;/p&gt;
&lt;p&gt;So, the CLT still holds, but sometimes we just need to collect more data for it to “start working”. Note that these considerations are baked into sample size calculators such as &lt;a href="https://www.evanmiller.org/ab-testing/sample-size.html"&gt;this&lt;/a&gt;.&lt;/p&gt;
&lt;p&gt;A good real-life example of working with very low probability events is fraud in credit card transactions. In real life only about 1 in 1,000 transactions are fraudulent. Suppose we're A/B testing an ML model to catch fraudulent transactions: A is the new model, B is the old model. We need to collect a lot of samples to catch enough fraud cases to get a good estimate of the ratios for A and B ($C_A$ and $C_B$ if we think of this as conversions), and the lift between A and B. What the above simulation shows is that if fraud cases happen at $p=0.001$ we will need to collect a lot more samples than if they were to happen at $p=0.5$ frequency.&lt;/p&gt;
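&lt;p&gt;To put a number on "a lot more samples", the standard two-proportion sample size formula (the same kind of calculation the linked calculator performs) can be sketched as follows. The 10% relative lift, $\alpha=0.05$ and 80% power are arbitrary illustrative choices:&lt;/p&gt;

```python
from scipy.stats import norm

def sample_size_per_group(p1, p2, alpha=0.05, power=0.8):
    # classic two-proportion z-test sample size formula;
    # p1 is the baseline rate, p2 the rate under the alternative
    z_a = norm.ppf(1 - alpha / 2)
    z_b = norm.ppf(power)
    p_bar = (p1 + p2) / 2
    numerator = (z_a * (2 * p_bar * (1 - p_bar)) ** 0.5 +
                 z_b * (p1 * (1 - p1) + p2 * (1 - p2)) ** 0.5) ** 2
    return int(numerator / (p1 - p2) ** 2) + 1

# detecting a 10% relative lift, per group:
print(sample_size_per_group(0.5, 0.55))      # on the order of a thousand samples
print(sample_size_per_group(0.001, 0.0011))  # on the order of a million samples
```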
&lt;h2&gt;Conclusion&lt;/h2&gt;
&lt;p&gt;There are a number of caveats when running A/B tests in real life. Making sure the CLT holds when we use a test statistic that assumes a normal distribution is one of them. From the list of considerations above, the most important is sample size (#3). In SaaS-like environments, the population distribution will always exist (#1), especially if we &lt;a href="https://en.wikipedia.org/wiki/Winsorizing"&gt;winsorize&lt;/a&gt; our results (winsorization just means we replace the most extreme values at the two ends of our sample with less extreme values). Independence (#2) usually holds in SaaS environments, but may not in social networks.&lt;/p&gt;
&lt;p&gt;It's worth remembering that there are also non-statistical A/B testing errors, e.g. if we run variant A in Germany and variant B in France. We may get enough samples for both A and B, both are normal, we get a nice reading on the difference, but we're comparing apples to oranges. We're not measuring (just) the difference between A and B, but the difference between German and French users.&lt;/p&gt;</content><category term="data"></category><category term="ab testing"></category><category term="statistics"></category></entry><entry><title>A/B testing and the Central Limit Theorem</title><link href="https://bytepawn.com/ab-testing-and-the-central-limit-theorem.html" rel="alternate"></link><published>2020-02-05T00:00:00+01:00</published><updated>2020-02-05T00:00:00+01:00</updated><author><name>Marton Trencseni</name></author><id>tag:bytepawn.com,2020-02-05:/ab-testing-and-the-central-limit-theorem.html</id><summary type="html">&lt;p&gt;When working with hypothesis testing, the descriptions of the statistical methods often have normality assumptions. For example, the Wikipedia page for the z-test starts like this: &lt;em&gt;"A Z-test is any statistical test for which the distribution of the test statistic under the null hypothesis can be approximated by a normal distribution"&lt;/em&gt;. What does this mean? How do I know it’s a valid assumption for my data?&lt;br/&gt;&lt;br/&gt; &lt;img src="/images/normal_from_uniform.PNG" alt="Normal distribution from uniform" style="width: 400px;"/&gt;&lt;/p&gt;</summary><content type="html">&lt;h2&gt;Introduction&lt;/h2&gt;
&lt;p&gt;Data Scientists run lots of A/B tests, whether they’re working on SaaS products, social networking, logistics or self-driving cars. A/B testing is a form of hypothesis testing, a decision-making method powered by statistics. By following the rules of hypothesis testing we make sure we have gathered enough and strong enough evidence to support our decisions.&lt;/p&gt;
&lt;p&gt;When working with hypothesis testing, the descriptions of the statistical methods often have normality assumptions. For example, &lt;a href="https://en.wikipedia.org/wiki/Z-test"&gt;the Wikipedia page for the z-test starts like this&lt;/a&gt;:&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;A Z-test is any statistical test for which the distribution of the test statistic under &lt;strong&gt;the null hypothesis can be approximated by a normal distribution&lt;/strong&gt;.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;This is a common source of confusion. What does it mean that &lt;em&gt;“the test statistic under the null hypothesis can be approximated by a normal distribution”&lt;/em&gt;? And whatever that is, how do I know it’s a valid assumption for my data?&lt;/p&gt;
&lt;p&gt;&lt;a href="https://github.com/mtrencseni/playground/blob/master/AB%20testing%20and%20the%20Central%20Limit%20Theorem.ipynb"&gt;The code shown below is up on Github.&lt;/a&gt;&lt;/p&gt;
&lt;h2&gt;A/B testing&lt;/h2&gt;
&lt;p&gt;Let’s take a concrete example. Suppose we’re doing an A/B test, and we’re measuring two metrics for variants A and B: a $C$ conversion rate (at what rate do people convert) and a $T$ timespent (how many minutes do they spend in the product). In both cases, our null hypothesis would be &lt;em&gt;“A and B are the same”&lt;/em&gt;, which translates into the language of mathematics as:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;$ H_0 $ null hypothesis for conversion rate: $ C_A = C_B $&lt;/li&gt;
&lt;li&gt;$ H_0 $ null hypothesis for timespent: $ T_A  = T_B $&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;Here comes the important part: in the above expressions, $ C_A $ and $ C_B $ are the conversion rates, and $ T_A $ and $ T_B $ are the average timespent minutes. Both of these quantities are averages: for the conversion rate, we can imagine a conversion counting as a 1 and a non-conversion as a 0, and the conversion rate is the average of this random variable (like coinflips). The timespent is also computed by adding up the individual timespent minutes and dividing by the number of samples.&lt;/p&gt;
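&lt;p&gt;For example, with made-up data, the conversion rate is literally the average of a 0/1 vector:&lt;/p&gt;

```python
import numpy as np

# hypothetical outcomes for 10 users: 1 = converted, 0 = did not convert
conversions = np.array([0, 1, 0, 0, 1, 1, 0, 1, 0, 0])
print(conversions.mean())  # the conversion rate: 4 conversions / 10 users = 0.4
```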
&lt;p&gt;And this is the key: the z-test works only if these averages can be approximated by a normal distribution. &lt;strong&gt;So it’s not the distribution of conversions or the distributions of timespents which must be normal.&lt;/strong&gt; In fact, these do not follow a normal distribution at all! The conversions are 0s and 1s and follow a &lt;a href="https://en.wikipedia.org/wiki/Bernoulli_distribution"&gt;Bernoulli distribution&lt;/a&gt;, like coin tosses. Timespents usually follow an exponential-looking drop-off in SaaS products. &lt;strong&gt;But, given a big enough sample size, the distribution of averages computed from samples can in fact be approximated by a normal.&lt;/strong&gt; This is the guarantee of the Central Limit Theorem (CLT).&lt;/p&gt;
&lt;p&gt;The test statistic is actually the difference of the means: for example, $ T_A = T_B $ can be reformulated as $ T_A - T_B = 0 $, and it is this difference (normalized) that is the test statistic. Fortunately, &lt;a href="https://en.wikipedia.org/wiki/Sum_of_normally_distributed_random_variables"&gt;independent normal distributions have a very nice additive property&lt;/a&gt;: if $ X $ and $ Y $ are independent normal random variables, then $ Z_+ = X + Y $ and $ Z_- = X - Y $ are also normals. So if $ T_A $ and  $ T_B $ are normal, so is the test statistic $ T_A - T_B $.&lt;/p&gt;
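&lt;p&gt;Putting the pieces together in code: a minimal sketch with made-up exponential timespent data, where A and B are drawn from the same population, so the null hypothesis is true by construction:&lt;/p&gt;

```python
import numpy as np
from scipy.stats import norm

rng = np.random.default_rng(0)

# hypothetical timespent samples (minutes) for variants A and B
t_a = rng.exponential(scale=10, size=5_000)
t_b = rng.exponential(scale=10, size=5_000)

# the test statistic: difference of sample means, normalized by its standard error
diff = t_a.mean() - t_b.mean()
se = np.sqrt(t_a.var(ddof=1) / len(t_a) + t_b.var(ddof=1) / len(t_b))
z = diff / se
p_value = 2 * norm.sf(abs(z))  # two-sided p-value
print(z, p_value)
```

&lt;p&gt;Since both samples come from the same distribution, we expect a small $z$ and a large $p$-value most of the time.&lt;/p&gt;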
&lt;h2&gt;The Central Limit Theorem&lt;/h2&gt;
&lt;p&gt;Without further ado, the &lt;a href="https://en.wikipedia.org/wiki/Central_limit_theorem"&gt;Central Limit Theorem&lt;/a&gt;:&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;In probability theory, the central limit theorem (CLT) establishes that, in some situations, when independent random variables are added, their properly normalized sum tends toward a normal distribution (informally a "bell curve") even if the original variables themselves are not normally distributed. The theorem is a key concept in probability theory because it implies that probabilistic and statistical methods that work for normal distributions can be applicable to many problems involving other types of distributions.&lt;/p&gt;
&lt;p&gt;For example, suppose that a sample is obtained containing many observations, each observation being randomly generated in a way that does not depend on the values of the other observations, and that the arithmetic mean of the observed values is computed. If this procedure is performed many times, the central limit theorem says that the distribution of the average will be closely approximated by a normal distribution. A simple example of this is that if one flips a coin many times the probability of getting a given number of heads in a series of flips will approach a normal curve, with mean equal to half the total number of flips in each series; in the limit of an infinite number of flips, it will equal a normal curve.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;In layman’s terms, the CLT says that: given a population P, with some metric M whose true average is $ \mu_M $, and you take a random sample of independent measurements from P and take the average $ a_M $, then $ a_M $ follows a normal distribution. Note that the error of the measurement, $ e_M = \mu_M - a_M $ also follows a normal distribution since $ \mu_M $ is a constant. $ e_M $ is the quantity most closely related to the $ H_0 $ null hypothesis' test statistic.&lt;/p&gt;
&lt;h2&gt;Monte Carlo simulation&lt;/h2&gt;
&lt;p&gt;We can "prove" this to ourselves by running Monte Carlo simulations:&lt;/p&gt;
&lt;ol&gt;
&lt;li&gt;First, we will define a population with some distribution. It doesn’t have to be normal, it can be uniform, exponential, whatever.&lt;/li&gt;
&lt;li&gt;Then we take &lt;code&gt;sample_size&lt;/code&gt; samples and compute the mean.&lt;/li&gt;
&lt;li&gt;We do the above step &lt;code&gt;num_samples&lt;/code&gt; times, so we have &lt;code&gt;num_samples&lt;/code&gt; means.&lt;/li&gt;
&lt;li&gt;Then we plot these means, and according to the CLT, we should see a nice bell curve centered on the true mean of the original population distribution (i.e. the mean of the uniform, the mean of the exponential, etc).&lt;/li&gt;
&lt;/ol&gt;
&lt;p&gt;Let's use &lt;a href="https://www.scipy.org/"&gt;scipy&lt;/a&gt;, it gives us convenient ways to sample standard distributions. First, a function which samples a given distribution and shows a histogram of the samples against the &lt;a href="https://en.wikipedia.org/wiki/Probability_density_function"&gt;probability density function&lt;/a&gt; of the distribution. We can use this to check that we're doing the right thing:&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;population_sample_plot&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;population&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;sample_size&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mi"&gt;1000&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
    &lt;span class="n"&gt;sample&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;population&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;rvs&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;size&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;sample_size&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="n"&gt;padding&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nb"&gt;max&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;sample&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;-&lt;/span&gt; &lt;span class="nb"&gt;min&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;sample&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt; &lt;span class="o"&gt;/&lt;/span&gt; &lt;span class="mf"&gt;4.0&lt;/span&gt;
    &lt;span class="n"&gt;resolution&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nb"&gt;max&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;sample&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;-&lt;/span&gt; &lt;span class="nb"&gt;min&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;sample&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt; &lt;span class="o"&gt;/&lt;/span&gt; &lt;span class="mf"&gt;100.0&lt;/span&gt;
    &lt;span class="n"&gt;z&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;np&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;arange&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nb"&gt;min&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;sample&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;-&lt;/span&gt; &lt;span class="n"&gt;padding&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nb"&gt;max&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;sample&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;+&lt;/span&gt; &lt;span class="n"&gt;padding&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;resolution&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="n"&gt;plt&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;plot&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;z&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;population&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;pdf&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;z&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;
    &lt;span class="n"&gt;plt&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;hist&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;sample&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;density&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="bp"&gt;True&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;alpha&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mf"&gt;0.5&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="n"&gt;plt&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;show&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
&lt;/pre&gt;&lt;/div&gt;


&lt;p&gt;Let's visualize a &lt;a href="https://en.wikipedia.org/wiki/Uniform_distribution_(continuous)"&gt;uniform&lt;/a&gt;, an &lt;a href="https://en.wikipedia.org/wiki/Exponential_distribution"&gt;exponential&lt;/a&gt; and a &lt;a href="https://en.wikipedia.org/wiki/Normal_distribution"&gt;normal distribution&lt;/a&gt; using the above function:&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;span class="n"&gt;population_sample_plot&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;uniform&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="n"&gt;population_sample_plot&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;expon&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="n"&gt;population_sample_plot&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;norm&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/pre&gt;&lt;/div&gt;


&lt;p&gt;&lt;img src="/images/uniform_exp_normal_histo.PNG" alt="Uniform, exponential and normal distributions" style="width: 400px;"/&gt;&lt;/p&gt;
&lt;p&gt;So these are the original population distributions, and we're trying to estimate the mean by drawing samples. Let's write code to do this:&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;population_sample_mean_plot&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;population&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;sample_size&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mi"&gt;1000&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;num_samples&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mi"&gt;1000&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
    &lt;span class="n"&gt;sample_means&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;mean&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;population&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;rvs&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;size&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;sample_size&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt; &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;_&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="nb"&gt;range&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;num_samples&lt;/span&gt;&lt;span class="p"&gt;)]&lt;/span&gt;
    &lt;span class="n"&gt;mn&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nb"&gt;min&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;sample_means&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="n"&gt;mx&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nb"&gt;max&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;sample_means&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="n"&gt;rng&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;mx&lt;/span&gt; &lt;span class="o"&gt;-&lt;/span&gt; &lt;span class="n"&gt;mn&lt;/span&gt;
    &lt;span class="n"&gt;padding&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;rng&lt;/span&gt; &lt;span class="o"&gt;/&lt;/span&gt; &lt;span class="mf"&gt;4.0&lt;/span&gt;
    &lt;span class="n"&gt;resolution&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;rng&lt;/span&gt; &lt;span class="o"&gt;/&lt;/span&gt; &lt;span class="mf"&gt;100.0&lt;/span&gt;
    &lt;span class="n"&gt;z&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;np&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;arange&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;mn&lt;/span&gt; &lt;span class="o"&gt;-&lt;/span&gt; &lt;span class="n"&gt;padding&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;mx&lt;/span&gt; &lt;span class="o"&gt;+&lt;/span&gt; &lt;span class="n"&gt;padding&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;resolution&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="n"&gt;plt&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;plot&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;z&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;norm&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;mean&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;sample_means&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt; &lt;span class="n"&gt;std&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;sample_means&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;pdf&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;z&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;
    &lt;span class="n"&gt;plt&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;hist&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;sample_means&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;density&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="bp"&gt;True&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;alpha&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mf"&gt;0.5&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="n"&gt;plt&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;show&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
&lt;/pre&gt;&lt;/div&gt;


&lt;p&gt;Let's see the distribution of the means for the same three distributions:&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;span class="n"&gt;population_sample_mean_plot&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;uniform&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="n"&gt;population_sample_mean_plot&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;expon&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="n"&gt;population_sample_mean_plot&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;norm&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/pre&gt;&lt;/div&gt;


&lt;p&gt;&lt;img src="/images/uniform_exp_normal_mean_sample.PNG" alt="Uniform, exponential and normal distribution sample means" style="width: 400px;"/&gt;&lt;/p&gt;
&lt;p&gt;The Central Limit Theorem works! The distribution of the means, for uniform, exponential and normal populations alike, is a normal distribution, centered about the true population mean.&lt;/p&gt;
&lt;p&gt;Note that &lt;strong&gt;the CLT also works for discrete distributions such as the &lt;a href="https://en.wikipedia.org/wiki/Bernoulli_distribution"&gt;Bernoulli distribution&lt;/a&gt;&lt;/strong&gt;, the underlying distribution for conversions.&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;span class="n"&gt;population_sample_mean_plot&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;bernoulli&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/pre&gt;&lt;/div&gt;


&lt;p&gt;&lt;img src="/images/bernoulli_mean_sample.PNG" alt="Bernoulli sample means" style="width: 400px;"/&gt;&lt;/p&gt;
&lt;h2&gt;Standard error&lt;/h2&gt;
&lt;p&gt;The CLT says that the means follow a normal distribution centered around the true mean of the population. What about the width of the bell curve? The technical term for the width of a distribution is the &lt;a href="https://en.wikipedia.org/wiki/Standard_deviation"&gt;standard deviation&lt;/a&gt;. Furthermore, there is a dedicated term for the standard deviation of a statistic (such as the sample mean) used to estimate a population parameter: we call this the &lt;a href="https://en.wikipedia.org/wiki/Standard_error"&gt;standard error&lt;/a&gt;:&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;The standard error (SE) of a statistic (usually an estimate of a parameter) is the standard deviation of its sampling distribution[1] or an estimate of that standard deviation. If the parameter or the statistic is the mean, it is called the standard error of the mean (SEM). The sampling distribution of a population mean is generated by repeated sampling and recording of the means obtained. This forms a distribution of different means, and this distribution has its own mean and variance.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;Without going into details, the standard error $ s $ is $ s = \sigma / \sqrt{N} $, where $ \sigma $ is the standard deviation of the original population, and $ N $ is the sample size (&lt;code&gt;sample_size&lt;/code&gt; in the code above). As the sample size $ N $ grows, the standard error $ s $ decreases, so we get a "needle" bell curve around the true population mean. We can get an arbitrarily accurate estimate of the mean by drawing a large enough sample.&lt;/p&gt;
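&lt;p&gt;We can check the $ s = \sigma / \sqrt{N} $ relationship numerically. A quick sketch, reusing the same &lt;code&gt;scipy.stats&lt;/code&gt; distributions as above (the function name is ours):&lt;/p&gt;

```python
import numpy as np
from scipy.stats import expon

def standard_error_check(population, sample_size, num_samples=1000):
    # draw num_samples samples of the given size and measure the spread of their means
    sample_means = [np.mean(population.rvs(size=sample_size)) for _ in range(num_samples)]
    measured = np.std(sample_means)
    predicted = population.std() / np.sqrt(sample_size)  # s = sigma / sqrt(N)
    return measured, predicted

np.random.seed(0)
measured, predicted = standard_error_check(expon, sample_size=1000)
```

&lt;p&gt;For the exponential distribution $ \sigma = 1 $, so with $ N = 1000 $ the measured spread of the sample means should come out close to $ 1 / \sqrt{1000} \approx 0.032 $, and quadrupling $ N $ should halve it.&lt;/p&gt;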
&lt;h2&gt;Conclusion&lt;/h2&gt;
&lt;p&gt;&lt;strong&gt;The Central Limit Theorem is the reason why, when you're doing A/B testing on averages (such as conversion or average timespents), the normality assumption for hypothesis testing is usually justified.&lt;/strong&gt; &lt;a href="http://bytepawn.com/beyond-the-central-limit-theorem.html"&gt;In the next post I will show cases when the CLT does not apply&lt;/a&gt;.&lt;/p&gt;</content><category term="data"></category><category term="ab testing"></category><category term="statistics"></category></entry><entry><title>Optimizing waits in Airflow</title><link href="https://bytepawn.com/optimizing-waits-in-airflow.html" rel="alternate"></link><published>2020-02-01T00:00:00+01:00</published><updated>2020-02-01T00:00:00+01:00</updated><author><name>Marton Trencseni</name></author><id>tag:bytepawn.com,2020-02-01:/optimizing-waits-in-airflow.html</id><summary type="html">&lt;p&gt;Sometimes I get to put on my Data Engineering hat for a few days. I enjoy this because I like to move up and down the Data Science stack and I try to keep myself sharp technically. Recently I was able to spend a few days optimizing our Airflow ETL for speed. &lt;br/&gt;&lt;br/&gt; &lt;img src="/images/airflow-dag.png" alt="Airflow DAG" style="width: 400px;"/&gt;&lt;/p&gt;</summary><content type="html">&lt;h2&gt;Introduction&lt;/h2&gt;
&lt;p&gt;Sometimes I get to put on my Data Engineering hat for a few days. I enjoy this because I like to move up and down the Data Science stack and I try to keep myself sharp technically. Recently I was able to spend a few days optimizing our &lt;a href="https://airflow.apache.org/"&gt;Airflow&lt;/a&gt; ETL for speed. We noticed that DWH jobs with lots of waits were spending a lot of time completing the waits (not counting the actual waiting time). Below is a list of changes I made to improve our waiting time.&lt;/p&gt;
&lt;h2&gt;Our history of waiting on tables&lt;/h2&gt;
&lt;p&gt;The basic premise is this. Suppose you have a DWH job that creates the latest &lt;code&gt;ds&lt;/code&gt; partition for &lt;code&gt;result_table&lt;/code&gt;, and the &lt;code&gt;INSERT&lt;/code&gt; is a result of a &lt;code&gt;SELECT&lt;/code&gt; like:&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;span class="k"&gt;INSERT&lt;/span&gt; &lt;span class="k"&gt;INTO&lt;/span&gt; &lt;span class="n"&gt;result_table&lt;/span&gt;
&lt;span class="k"&gt;SELECT&lt;/span&gt;
    &lt;span class="p"&gt;...&lt;/span&gt;
&lt;span class="k"&gt;FROM&lt;/span&gt;
    &lt;span class="n"&gt;table1&lt;/span&gt;
&lt;span class="k"&gt;INNER&lt;/span&gt; &lt;span class="k"&gt;JOIN&lt;/span&gt;
    &lt;span class="n"&gt;table2&lt;/span&gt;
&lt;span class="k"&gt;ON&lt;/span&gt;
    &lt;span class="p"&gt;...&lt;/span&gt;
&lt;span class="k"&gt;INNER&lt;/span&gt; &lt;span class="k"&gt;JOIN&lt;/span&gt;
    &lt;span class="n"&gt;table3&lt;/span&gt;
&lt;span class="k"&gt;ON&lt;/span&gt;
    &lt;span class="p"&gt;...&lt;/span&gt;
&lt;/pre&gt;&lt;/div&gt;


&lt;p&gt;In our ETL, we would write this like:&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;span class="k"&gt;INSERT&lt;/span&gt; &lt;span class="k"&gt;INTO&lt;/span&gt; &lt;span class="n"&gt;result_table&lt;/span&gt;
&lt;span class="k"&gt;SELECT&lt;/span&gt;
    &lt;span class="p"&gt;...&lt;/span&gt;
&lt;span class="k"&gt;FROM&lt;/span&gt;
    &lt;span class="n"&gt;curent_ds_wait&lt;/span&gt;&lt;span class="p"&gt;::&lt;/span&gt;&lt;span class="n"&gt;table1&lt;/span&gt;
&lt;span class="k"&gt;INNER&lt;/span&gt; &lt;span class="k"&gt;JOIN&lt;/span&gt;
    &lt;span class="n"&gt;curent_ds_wait&lt;/span&gt;&lt;span class="p"&gt;::&lt;/span&gt;&lt;span class="n"&gt;table2&lt;/span&gt;
&lt;span class="k"&gt;ON&lt;/span&gt;
    &lt;span class="p"&gt;...&lt;/span&gt;
&lt;span class="k"&gt;INNER&lt;/span&gt; &lt;span class="k"&gt;JOIN&lt;/span&gt;
    &lt;span class="n"&gt;curent_ds_wait&lt;/span&gt;&lt;span class="p"&gt;::&lt;/span&gt;&lt;span class="n"&gt;table3&lt;/span&gt;
&lt;span class="k"&gt;ON&lt;/span&gt;
    &lt;span class="p"&gt;...&lt;/span&gt;
&lt;/pre&gt;&lt;/div&gt;


&lt;p&gt;Our framework parses the SQL snippets and extracts table names after &lt;code&gt;current_ds_wait::&lt;/code&gt;. This is the list of tables for which we need to wait for today's &lt;code&gt;ds&lt;/code&gt; partition to land before we can run the &lt;code&gt;SELECT&lt;/code&gt; (otherwise the result would be incomplete).&lt;/p&gt;
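&lt;p&gt;The parser itself isn't shown in the post; a minimal sketch of the extraction step, assuming a simple regex is enough for our SQL snippets (the function name is illustrative):&lt;/p&gt;

```python
import re

def extract_wait_tables(sql):
    # table names to wait on are marked with the current_ds_wait:: prefix
    tables = re.findall(r'current_ds_wait::(\w+)', sql)
    # strip the markers so the remaining SQL is runnable as-is
    cleaned = sql.replace('current_ds_wait::', '')
    return cleaned, tables

sql = 'SELECT ... FROM current_ds_wait::table1 INNER JOIN current_ds_wait::table2 ON ...'
cleaned, tables = extract_wait_tables(sql)
```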
&lt;p&gt;I described &lt;code&gt;ds&lt;/code&gt; partitions in an &lt;a href="http://bytepawn.com/fetchr-airflow.html#fetchr-airflow"&gt;earlier post&lt;/a&gt;:&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;The other major design pattern from Facebook is the idea of daily partitioned tables. This is a feature available on Hive, and not really practical on eg. Redshift. Essentially we store (complete) daily, write-once slices of each table, which are generated by daily jobs. The partitions are called &lt;code&gt;ds&lt;/code&gt; at Facebook and logically show up as a column of the table, and you’ll find plenty of references to it if you read the Hive docs (because Hive was written at Facebook). Physically, these are essentially directories, each one holding the data files for that day’s data. We use S3, so in our case it looks something like &lt;code&gt;s3://dwh-bucket/&amp;lt;table&amp;gt;/&amp;lt;ds&amp;gt;/&amp;lt;data_files&amp;gt;&lt;/code&gt;. For example, &lt;code&gt;s3://dwh-bucket/company_metrics/2018-03-01/datafile&lt;/code&gt;.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;So when our framework generates the DAG for this DWH job, it generates an &lt;code&gt;insert&lt;/code&gt; task (&lt;code&gt;PrestoOperator&lt;/code&gt; operator), which depends on 3 &lt;code&gt;wait&lt;/code&gt; tasks (&lt;code&gt;DsPartitionSensor&lt;/code&gt; operators), one for each table. There are a bunch of other tasks that we generate (such as tasks for running &lt;code&gt;CREATE TABLE IF NOT EXISTS&lt;/code&gt;), but let’s ignore those.&lt;/p&gt;
&lt;p&gt;So this part of the DAG looks like:&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;    insert
    |
    +-- wait_table1
    |
    +-- wait_table2
    |
    +-- wait_table3
&lt;/pre&gt;&lt;/div&gt;


&lt;h2&gt;Chaining waits&lt;/h2&gt;
&lt;p&gt;Initially, the &lt;code&gt;wait&lt;/code&gt; jobs issued a Presto SQL statement like:&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;span class="k"&gt;SHOW&lt;/span&gt; &lt;span class="n"&gt;PARTITIONS&lt;/span&gt; &lt;span class="k"&gt;FROM&lt;/span&gt; &lt;span class="err"&gt;{&lt;/span&gt;&lt;span class="k"&gt;table&lt;/span&gt;&lt;span class="err"&gt;}&lt;/span&gt; &lt;span class="k"&gt;WHERE&lt;/span&gt; &lt;span class="n"&gt;ds&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="s1"&gt;&amp;#39;{ds}&amp;#39;&lt;/span&gt; &lt;span class="k"&gt;LIMIT&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt;
&lt;/pre&gt;&lt;/div&gt;


&lt;p&gt;The first thing we noticed is that this overloaded our Presto cluster. We have ~100 jobs, and each has a couple of &lt;code&gt;wait&lt;/code&gt;s, so this results in hundreds of &lt;code&gt;wait&lt;/code&gt;s trying to run at the same time. Also, since we only have a limited number of worker slots on our Airflow worker, sometimes the &lt;code&gt;wait&lt;/code&gt;s would use up all the slots, and the actual &lt;code&gt;insert&lt;/code&gt;s never ran, or spent a long time in the queue, waiting to be executed.&lt;/p&gt;
&lt;p&gt;So one of the initial optimizations was to chain the waits on the DAG, like:&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;    insert
    |
    +-- wait_table1
          |
          +-- wait_table2
               |
               +-- wait_table3
&lt;/pre&gt;&lt;/div&gt;


&lt;p&gt;This way each DAG only ever has one &lt;code&gt;wait&lt;/code&gt; job running. The &lt;code&gt;wait&lt;/code&gt; jobs within a DAG run sequentially. This change was easy to make, because we don't construct our DAGs "by hand" for each table; we have a helper function which does this (which also does the &lt;code&gt;current_ds_wait::&lt;/code&gt; stuff), so we just needed to make this change in one place.&lt;/p&gt;
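&lt;p&gt;The helper function isn't shown here; the rewiring it does can be sketched like this, with a tiny stand-in for an Airflow operator (the class and names are illustrative, not our actual code):&lt;/p&gt;

```python
class Task:
    """Tiny stand-in for an Airflow operator, tracking downstream dependencies."""
    def __init__(self, name):
        self.name = name
        self.downstream = []

    def set_downstream(self, other):
        self.downstream.append(other)

def chain_waits(insert_task, wait_tasks):
    # wait_table1 unblocks insert, wait_table2 unblocks wait_table1, and so on:
    # only one wait per DAG run is ever eligible to run at a time
    upstream = insert_task
    for wait in wait_tasks:
        wait.set_downstream(upstream)
        upstream = wait

insert = Task('insert')
waits = [Task('wait_table1'), Task('wait_table2'), Task('wait_table3')]
chain_waits(insert, waits)
```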
&lt;h2&gt;Task pools&lt;/h2&gt;
&lt;p&gt;The second thing we tried was to use Airflow’s pool feature. With this, tasks can be assigned to pools, and per-pool limits can be set on execution. So if we have 32 worker slots, we can set up a &lt;code&gt;wait&lt;/code&gt; pool with 24 slots, so no more than 24 &lt;code&gt;wait&lt;/code&gt;s can be running.&lt;/p&gt;
&lt;p&gt;Unfortunately, this feature in Airflow is buggy/broken. In our setup, where we’re running a separate master and worker, and using &lt;a href="https://airflow.apache.org/docs/1.10.6/howto/executor/use-celery.html"&gt;Celery&lt;/a&gt; for running worker tasks, the Airflow scheduler doesn’t respect the limits, &lt;a href="https://issues.apache.org/jira/browse/AIRFLOW-584"&gt;similar to this bug report&lt;/a&gt;.&lt;/p&gt;
&lt;h2&gt;Hive instead of Presto&lt;/h2&gt;
&lt;p&gt;Since all our DWH jobs run on Presto, our Hive execution engine is just sitting around idle handling metadata queries such as &lt;code&gt;CREATE TABLE&lt;/code&gt; (&lt;code&gt;create&lt;/code&gt; tasks in the DAG). So by running the &lt;code&gt;SHOW PARTITIONS&lt;/code&gt; checks on Hive (the syntax starts the same, but it’s a bit different there), we can get rid of 95% of the jobs on the Presto cluster, which were taking a long time to run, even though they’re just checking for the presence of a partition. The Hive engine can handle these metadata queries easily, returning in less than a second.&lt;/p&gt;
&lt;h2&gt;Multiwaits&lt;/h2&gt;
&lt;p&gt;In the example above, we’re waiting on 3 tables, and we generate 3 &lt;code&gt;wait&lt;/code&gt; jobs. We realized this is inefficient, and we can just have one &lt;code&gt;multi_wait&lt;/code&gt; task which checks all 3 partitions at once. We just generate an HQL with several &lt;code&gt;SHOW PARTITIONS&lt;/code&gt; statements separated by &lt;code&gt;;&lt;/code&gt; and parse the resulting string to see what’s there and what’s missing. So the final DAG looks very simple:&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;    insert
    |
    +-- multi_wait
&lt;/pre&gt;&lt;/div&gt;


&lt;p&gt;This is such an obvious idea in hindsight.&lt;/p&gt;
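&lt;p&gt;The &lt;code&gt;multi_wait&lt;/code&gt; body boils down to generating a batch of statements and parsing their output. A sketch, where the exact Hive output format is an assumption:&lt;/p&gt;

```python
def build_multi_wait_hql(tables, ds):
    # one SHOW PARTITIONS per table, separated by ';' so Hive runs the batch in one go
    return ';\n'.join(f"SHOW PARTITIONS {t} PARTITION (ds='{ds}')" for t in tables)

def missing_partitions(tables, output_blocks, ds):
    # output_blocks: one raw output string per statement; an output block that
    # does not mention the partition means it has not landed yet (hypothetical format)
    return [t for t, out in zip(tables, output_blocks) if f"ds={ds}" not in out]

hql = build_multi_wait_hql(['table1', 'table2'], '2020-02-01')
```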
&lt;h2&gt;Reducing time between jobs&lt;/h2&gt;
&lt;p&gt;Looking at the scheduler logs, we still have a problem: the &lt;code&gt;multi_wait&lt;/code&gt; finishes at time X, but the insert is only launched at time X+5 minutes. This is a generic issue with Airflow, not specific to &lt;code&gt;wait&lt;/code&gt;s. Why does Airflow need 5 minutes to figure out that a task’s dependencies are all finished?&lt;/p&gt;
&lt;p&gt;To understand this I looked through the logs, and found that indeed, the first time in the logs that Airflow notices that the insert can run is several minutes after the &lt;code&gt;multi_wait&lt;/code&gt; finishes. To understand this I took the log line and looked it up in the Airflow source code. What happens is this:&lt;/p&gt;
&lt;p&gt;Every 30 seconds the Airflow scheduler (as configured) lists out all .py files in the &lt;code&gt;dags&lt;/code&gt; folder. It saves these known &lt;code&gt;.py&lt;/code&gt; files, and then in a “round-robin” manner, executes them: it runs &lt;code&gt;one.py&lt;/code&gt;, &lt;code&gt;two.py&lt;/code&gt;, &lt;code&gt;three.py&lt;/code&gt;, and so on, where each of the &lt;code&gt;.py&lt;/code&gt; files is a DAG definition in our case. Each time it executes the &lt;code&gt;.py&lt;/code&gt; file, it looks at instances of the &lt;code&gt;DAG&lt;/code&gt; class in the global namespace, and those are the &lt;code&gt;DAG&lt;/code&gt;s it executes. &lt;strong&gt;The problem is, the Airflow scheduler only checks for new runnable tasks (ie. all dependencies are finished) when it’s running the appropriate &lt;code&gt;.py&lt;/code&gt; file!&lt;/strong&gt; This is a very unfortunate architectural choice. And this explains why it takes ~5 minutes between task executions: we have about ~100 ETL jobs in ~100 &lt;code&gt;.py&lt;/code&gt; files, and running a &lt;code&gt;.py&lt;/code&gt; file takes 3-5 seconds. The reason it takes 3-5 seconds to execute a 100-line Python program is that each file has to &lt;code&gt;import&lt;/code&gt; Airflow libraries (to get the &lt;code&gt;DAG&lt;/code&gt; class, etc), and those Airflow &lt;code&gt;import&lt;/code&gt;s take time:&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="nn"&gt;airflow.operators&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;PrestoOperator&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;DropDsPartitionOperator&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;DsPartitionSensor&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;PostgresOperator&lt;/span&gt;
&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="nn"&gt;airflow.operators.hive_operator&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;HiveOperator&lt;/span&gt;
&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="nn"&gt;airflow.hooks.presto_hook&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;PrestoHook&lt;/span&gt;
&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="nn"&gt;airflow.hooks.base_hook&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;BaseHook&lt;/span&gt;
&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="nn"&gt;airflow&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;DAG&lt;/span&gt;
&lt;/pre&gt;&lt;/div&gt;
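&lt;p&gt;One quick way to confirm where the seconds go is to time an import in a fresh interpreter; a sketch (only a cold import in a fresh Python process gives a meaningful number, since repeated imports hit the module cache):&lt;/p&gt;

```python
import importlib
import time

def import_seconds(module_name):
    # time a single import; run once per fresh process for a cold-import measurement
    start = time.perf_counter()
    importlib.import_module(module_name)
    return time.perf_counter() - start
```

&lt;p&gt;Running this on &lt;code&gt;airflow&lt;/code&gt; in a fresh process shows how much of the per-file cost is pure import overhead.&lt;/p&gt;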


&lt;p&gt;I asked the following question in the &lt;a href="https://apache-airflow-slack.herokuapp.com/"&gt;Airflow slack #support channel&lt;/a&gt;:&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;I'm trying to debug my Airflow (1.8.2), we've been using it in prod for ~2 yrs. My issue is that it takes a long time between task runs. Ie. task X is waiting on task Y to finish in a DAG. Y finishes, and then it takes ~5 minutes for X to get queued and executed. Overall these 5 mins add up and add hours to the ETL running time.&lt;/p&gt;
&lt;p&gt;I've been doing some debugging, and looking at the Airflow source code; what I found so far:
- for a task to be run, all upstream tasks have to finished, and the task has to be in &lt;code&gt;SCHEDULED&lt;/code&gt; state: &lt;code&gt;jobs.py::_execute_task_instances()&lt;/code&gt; called like &lt;code&gt;_execute_task_instances(simple_dag_bag, (State.SCHEDULED,))&lt;/code&gt;
- a task goes from &lt;code&gt;None&lt;/code&gt; state to &lt;code&gt;SCHEDULED&lt;/code&gt; state in &lt;code&gt;jobs.py::process_file()&lt;/code&gt;, which corresponds to lines like &lt;code&gt;Started a process (PID: 28897) to generate tasks for ...&lt;/code&gt; lines in my syslog
- by default my tasks are in &lt;code&gt;None&lt;/code&gt; state (I see this in Task Instance Details view on web UI).
- I have ~100 DAG python files, each takes ~3 seconds to execute to collect the DAG, so a "roundtrip" takes ~300secs = 5mins
- so I'm guessing this is what's causing the ~5 minute delay, that each DAG python file is re-read every 5 mins, and that's when Airflow realizes that the deps are good and makes it &lt;code&gt;SCHEDULED&lt;/code&gt;. Correct me if I'm wrong.
What's confusing to me is, why does Airflow need to re-read the file to notice that all upstream tasks are good to go?&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;I received no clear answer, other than a link about &lt;a href="https://www.astronomer.io/blog/profiling-the-airflow-scheduler/"&gt;profiling the Airflow scheduler&lt;/a&gt;.&lt;/p&gt;
&lt;p&gt;Given this limitation, the obvious workaround is to put several DAGs into one &lt;code&gt;.py&lt;/code&gt; file, thus saving time on &lt;code&gt;import&lt;/code&gt;s. For example, right now we have one &lt;code&gt;.py&lt;/code&gt; file per table imported from the production database, which is very nice in terms of code layout in the IDE, and in terms of following changes on &lt;code&gt;git&lt;/code&gt;. But we could put all these into one big &lt;code&gt;.py&lt;/code&gt; file, and have one big &lt;code&gt;.py&lt;/code&gt; file per “type” of DAG (eg. one file for imports, one for exports, etc).&lt;/p&gt;
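&lt;p&gt;The workaround can be sketched as a single file that registers many DAGs in its global namespace, paying the import cost once. The table list and factory below are illustrative; in real code &lt;code&gt;make_import_dag&lt;/code&gt; would return an &lt;code&gt;airflow.DAG&lt;/code&gt; with its tasks:&lt;/p&gt;

```python
IMPORT_TABLES = ['orders', 'drivers', 'shipments']

def make_import_dag(table):
    # placeholder for a factory that builds the real DAG and its tasks
    return f"import_{table}"

# Airflow collects DAG instances from the module's global namespace,
# so each generated DAG is registered under a unique global name
for _table in IMPORT_TABLES:
    globals()[f"dag_import_{_table}"] = make_import_dag(_table)
```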
&lt;p&gt;I haven’t yet made up my mind whether we should do this: it feels wrong to sacrifice everyday engineering UX for an accidental architectural flaw in the ETL system.&lt;/p&gt;
&lt;h2&gt;Further optimizations&lt;/h2&gt;
&lt;p&gt;Other ideas:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;talk straight to the Hive metastore&lt;/li&gt;
&lt;li&gt;cache existing partitions once we know they're there&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;These are good ideas, but at this point (repeatedly) querying Hive with &lt;code&gt;SHOW PARTITIONS&lt;/code&gt; is not a bottleneck, so they wouldn't help us much.&lt;/p&gt;
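&lt;p&gt;For the record, the caching idea is only a few lines, because &lt;code&gt;ds&lt;/code&gt; partitions are write-once: a positive answer can be cached forever, while a negative answer must be re-checked. A sketch (names are illustrative):&lt;/p&gt;

```python
landed = set()  # (table, ds) pairs already confirmed in the metastore

def partition_landed(table, ds, check_fn):
    # check_fn queries the metastore; we only call it until the partition appears
    key = (table, ds)
    if key in landed:
        return True
    if check_fn(table, ds):
        landed.add(key)
        return True
    return False
```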
&lt;h2&gt;Conclusion&lt;/h2&gt;
&lt;p&gt;Airflow is the 5th ETL tool I’ve used: we wrote 3 hand-rolled ETL systems at Prezi (one in bash, one in Haskell, one in Go), at Facebook we used &lt;a href="https://asiliconvalleyinsider.com/2016/05/01/data-engineering-facebook/"&gt;Dataswarm&lt;/a&gt;, and at Fetchr we use Airflow (which is based on Dataswarm). I think it’s great that we have Airflow, because it’s miles better than a hand-rolled ETL system. Also, as it matures, it will get better!&lt;/p&gt;
&lt;p&gt;Having said that, I hope the open source Data Engineering community will improve Airflow in the future to address these issues. My problems with Airflow are three-fold:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;bugs:&lt;ul&gt;
&lt;li&gt;the unreliable pools feature&lt;/li&gt;
&lt;li&gt;Airflow can get stuck in various ways:&lt;ul&gt;
&lt;li&gt;all worker slots are used up by tasks which block; in this case, we have to &lt;code&gt;ps ax | grep airflow | grep wait | awk '{ print $1 }' | xargs kill -9&lt;/code&gt; on the worker&lt;/li&gt;
&lt;li&gt;sometimes tasks get stuck in &lt;code&gt;null&lt;/code&gt; or &lt;code&gt;queued&lt;/code&gt; status; in this case, we have to manually re-kick them on the UI&lt;/li&gt;
&lt;li&gt;sometimes the scheduler itself runs into a bug and gets stuck; in this case we have to restart the scheduler itself on the master with &lt;code&gt;systemctl restart airflow-scheduler.service&lt;/code&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;li&gt;architectural shortcomings&lt;ul&gt;
&lt;li&gt;only making progress on the DAG when re-running the &lt;code&gt;.py&lt;/code&gt; file containing the &lt;code&gt;DAG&lt;/code&gt;&lt;/li&gt;
&lt;li&gt;right now, for every task instance Airflow launches a new, expensive Python process on the worker node, which takes hundreds of MBs of memory and the turn-around time is quite slow; it'd be nice to come up with an "in-process" way to launch small inexpensive checks quickly (like &lt;code&gt;wait&lt;/code&gt;s)&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;li&gt;resource hungry: we run two beefy EC2 instances, one airflow-master and one airflow-worker, but all these really do is manage a relatively small DAG (~100 DAGs each with ~10 tasks); the actual work is performed on a third node (actually, a cluster), the EC2 nodes that are running Presto and the various ML jobs; still, both nodes show 2-3 load with &lt;code&gt;top&lt;/code&gt;.&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;These are real issues that affect our production every day in terms of landing time, dollars and engineering time:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;Airflow is slow to make progress&lt;/li&gt;
&lt;li&gt;we often have to manually kill / clear / re-kick jobs&lt;/li&gt;
&lt;li&gt;we run two EC2 nodes just for Airflow (master and worker)&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;I definitely don’t regret using Airflow, but it would be nice if the core engine itself would be more efficient, performant, and less wasteful. 10 years ago I was a proud C++ programmer, &lt;a href="https://github.com/scalien/scaliendb"&gt;building database kernels and storage engines&lt;/a&gt;, optimizing away bits and bytes. Today, because I try to &lt;em&gt;move fast and focus on impact&lt;/em&gt;—which is the right thing to do, despite these issues—throwing hardware and money at a simple problem of managing a small DAG is the best option. Feels weird.&lt;/p&gt;</content><category term="data"></category><category term="airflow"></category><category term="python"></category></entry><entry><title>SQL best practices for Data Scientists and Analysts</title><link href="https://bytepawn.com/sql-best-practices-for-data-scientists-and-analysts.html" rel="alternate"></link><published>2020-01-26T00:00:00+01:00</published><updated>2020-01-26T00:00:00+01:00</updated><author><name>Marton Trencseni</name></author><id>tag:bytepawn.com,2020-01-26:/sql-best-practices-for-data-scientists-and-analysts.html</id><summary type="html">&lt;p&gt;My list of SQL best practices for Data Scientists and Analysts, or, how I personally write SQL code. I picked this up at Facebook, and later improved it at Fetchr. &lt;br/&gt;&lt;br/&gt; &lt;img src="/images/sql-constr.jpg" alt="SQL code" style="width: 300px;"/&gt;&lt;/p&gt;</summary><content type="html">&lt;h2&gt;Introduction&lt;/h2&gt;
&lt;p&gt;The &lt;a href="http://bytepawn.com/how-i-write-sql-code.html"&gt;previous post was about SQL code style&lt;/a&gt;, like uppercasing, indentation and overall structure like &lt;code&gt;WITH&lt;/code&gt;. This is about how to write readable, concise and efficient &lt;code&gt;SELECT&lt;/code&gt; statements that minimize bugs.&lt;/p&gt;
&lt;p&gt;For completeness, I will quickly list out the coding style suggestions from the &lt;a href="http://bytepawn.com/how-i-write-sql-code.html"&gt;last post&lt;/a&gt;, without explanations:&lt;/p&gt;
&lt;ol&gt;
&lt;li&gt;Use all upper-case for keywords like &lt;code&gt;SELECT&lt;/code&gt;&lt;/li&gt;
&lt;li&gt;Keep &lt;code&gt;SELECT .. FROM .. WHERE .. GROUP BY&lt;/code&gt; unindented&lt;/li&gt;
&lt;li&gt;Line up &lt;code&gt;WHERE&lt;/code&gt; conditions&lt;/li&gt;
&lt;li&gt;Write &lt;code&gt;GROUP BY 1, 2&lt;/code&gt;&lt;/li&gt;
&lt;li&gt;Indent &lt;code&gt;WHEN&lt;/code&gt; relative to &lt;code&gt;CASE&lt;/code&gt;&lt;/li&gt;
&lt;li&gt;Use &lt;code&gt;WITH&lt;/code&gt; instead of nested sub-&lt;code&gt;SELECT&lt;/code&gt;s&lt;/li&gt;
&lt;li&gt;Long lines are okay&lt;/li&gt;
&lt;li&gt;Break the rules for readability and flow of code&lt;/li&gt;
&lt;li&gt;Follow the same rules when writing interactive SQL code&lt;/li&gt;
&lt;/ol&gt;
&lt;p&gt;Without further ado, my SQL best practices for Data Scientists and Analysts follow below.&lt;/p&gt;
&lt;h2&gt;Use &lt;code&gt;COUNT(DISTINCT id)&lt;/code&gt; instead of &lt;code&gt;COUNT(*)&lt;/code&gt;&lt;/h2&gt;
&lt;p&gt;&lt;code&gt;COUNT(*)&lt;/code&gt; is dangerous, because you could be counting things you don't want to. For example, a table called &lt;code&gt;dispatches&lt;/code&gt; might include multiple dispatch events for a delivery order. If you're counting dispatches for a week, do you want to double-count dispatches for the same order, if the order was dispatched but didn't get delivered the first time (customer was unreachable), and was re-dispatched again? Also, tables can sometimes get polluted with bad data, eg. if there was a software issue in production and the warehouse staff triggered multiple dispatch events. Also, although we expect data engineers to pre-clean data, sometimes new modes of dirtiness appear. And sometimes a bug is introduced in the ETL pipeline, and eg. all rows are duplicated in a table. Although it's not the analyst's fault, it's better to be defensive. Example:&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;span class="k"&gt;SELECT&lt;/span&gt;
    &lt;span class="k"&gt;COUNT&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt;                    &lt;span class="c1"&gt;-- BAD:  what are we counting here?&lt;/span&gt;
    &lt;span class="k"&gt;COUNT&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="k"&gt;DISTINCT&lt;/span&gt; &lt;span class="n"&gt;tracking_id&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;  &lt;span class="c1"&gt;-- GOOD: much clearer, we&amp;#39;re counting unique orders&lt;/span&gt;
&lt;span class="k"&gt;FROM&lt;/span&gt;
    &lt;span class="n"&gt;dispatches&lt;/span&gt;
&lt;span class="k"&gt;WHERE&lt;/span&gt;
    &lt;span class="p"&gt;...&lt;/span&gt;
&lt;/pre&gt;&lt;/div&gt;


&lt;h2&gt;Use &lt;code&gt;ROW()&lt;/code&gt; when computing metrics over multiple days&lt;/h2&gt;
&lt;p&gt;In the above example, suppose we want to count dispatches for 7 days, and we want to count re-dispatches of the same package. In this case &lt;code&gt;COUNT(DISTINCT tracking_id)&lt;/code&gt; won't work, because it counts each order only once, no matter how many times it was dispatched. And we said &lt;code&gt;COUNT(*)&lt;/code&gt; is evil. The way out is to &lt;code&gt;DISTINCT&lt;/code&gt; on both days and orders, and the way to do that is with &lt;code&gt;ROW()&lt;/code&gt;:&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;span class="k"&gt;SELECT&lt;/span&gt;
    &lt;span class="k"&gt;COUNT&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt;                              &lt;span class="c1"&gt;-- BAD:  what are we counting here?&lt;/span&gt;
    &lt;span class="k"&gt;COUNT&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="k"&gt;DISTINCT&lt;/span&gt; &lt;span class="n"&gt;tracking_id&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;            &lt;span class="c1"&gt;-- GOOD: unique orders dispatched&lt;/span&gt;
    &lt;span class="k"&gt;COUNT&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="k"&gt;DISTINCT&lt;/span&gt; &lt;span class="k"&gt;ROW&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="k"&gt;day&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;tracking_id&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;  &lt;span class="c1"&gt;-- GOOD: dispatches, but we only count an order once a day&lt;/span&gt;
&lt;span class="k"&gt;FROM&lt;/span&gt;
    &lt;span class="n"&gt;dispatches&lt;/span&gt;
&lt;span class="k"&gt;WHERE&lt;/span&gt;
    &lt;span class="k"&gt;day&lt;/span&gt; &lt;span class="k"&gt;BETWEEN&lt;/span&gt; &lt;span class="nb"&gt;DATE&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;2020-01-06&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;AND&lt;/span&gt; &lt;span class="nb"&gt;DATE&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;2020-01-12&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/pre&gt;&lt;/div&gt;


&lt;h2&gt;Always name columns with &lt;code&gt;AS&lt;/code&gt;&lt;/h2&gt;
&lt;p&gt;Continuing with the previous example, we should always name our columns:&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;span class="k"&gt;SELECT&lt;/span&gt;
    &lt;span class="k"&gt;COUNT&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;AS&lt;/span&gt; &lt;span class="n"&gt;clown_town&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="k"&gt;COUNT&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="k"&gt;DISTINCT&lt;/span&gt; &lt;span class="n"&gt;tracking_id&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;AS&lt;/span&gt; &lt;span class="n"&gt;num_unique_orders_dispatched&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="k"&gt;COUNT&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="k"&gt;DISTINCT&lt;/span&gt; &lt;span class="k"&gt;ROW&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="k"&gt;day&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;tracking_id&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt; &lt;span class="k"&gt;AS&lt;/span&gt; &lt;span class="n"&gt;num_dispatches&lt;/span&gt;
&lt;span class="k"&gt;FROM&lt;/span&gt;
    &lt;span class="n"&gt;dispatches&lt;/span&gt;
&lt;span class="k"&gt;WHERE&lt;/span&gt;
    &lt;span class="k"&gt;day&lt;/span&gt; &lt;span class="k"&gt;BETWEEN&lt;/span&gt; &lt;span class="nb"&gt;DATE&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;2020-01-06&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;AND&lt;/span&gt; &lt;span class="nb"&gt;DATE&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;2020-01-12&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/pre&gt;&lt;/div&gt;


&lt;h2&gt;Use &lt;code&gt;CASE WHEN&lt;/code&gt; instead of writing multiple queries&lt;/h2&gt;
&lt;p&gt;Suppose we want to count both dispatches and deliveries, and there's a table &lt;code&gt;core_events&lt;/code&gt; which has both. We can accomplish this with one &lt;code&gt;SELECT&lt;/code&gt;, we don't have to write two:&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;span class="k"&gt;SELECT&lt;/span&gt;
    &lt;span class="k"&gt;COUNT&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="k"&gt;DISTINCT&lt;/span&gt; &lt;span class="k"&gt;CASE&lt;/span&gt; &lt;span class="k"&gt;WHEN&lt;/span&gt; &lt;span class="n"&gt;event&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="s1"&gt;&amp;#39;dispatched&amp;#39;&lt;/span&gt; &lt;span class="k"&gt;THEN&lt;/span&gt; &lt;span class="k"&gt;ROW&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="k"&gt;day&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;tracking_id&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;ELSE&lt;/span&gt; &lt;span class="k"&gt;NULL&lt;/span&gt; &lt;span class="k"&gt;END&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;AS&lt;/span&gt; &lt;span class="n"&gt;num_dispatches&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="k"&gt;COUNT&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="k"&gt;DISTINCT&lt;/span&gt; &lt;span class="k"&gt;CASE&lt;/span&gt; &lt;span class="k"&gt;WHEN&lt;/span&gt; &lt;span class="n"&gt;event&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="s1"&gt;&amp;#39;delivered&amp;#39;&lt;/span&gt;  &lt;span class="k"&gt;THEN&lt;/span&gt; &lt;span class="k"&gt;ROW&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="k"&gt;day&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;tracking_id&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;ELSE&lt;/span&gt; &lt;span class="k"&gt;NULL&lt;/span&gt; &lt;span class="k"&gt;END&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;AS&lt;/span&gt; &lt;span class="n"&gt;num_delivered&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="k"&gt;COUNT&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="k"&gt;DISTINCT&lt;/span&gt; &lt;span class="k"&gt;CASE&lt;/span&gt; &lt;span class="k"&gt;WHEN&lt;/span&gt; &lt;span class="n"&gt;event&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="s1"&gt;&amp;#39;dispatched&amp;#39;&lt;/span&gt; &lt;span class="k"&gt;THEN&lt;/span&gt; &lt;span class="n"&gt;tracking_id&lt;/span&gt;           &lt;span class="k"&gt;ELSE&lt;/span&gt; &lt;span class="k"&gt;NULL&lt;/span&gt; &lt;span class="k"&gt;END&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;AS&lt;/span&gt; &lt;span class="n"&gt;num_unique_orders_dispatched&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="k"&gt;COUNT&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="k"&gt;DISTINCT&lt;/span&gt; &lt;span class="k"&gt;CASE&lt;/span&gt; &lt;span class="k"&gt;WHEN&lt;/span&gt; &lt;span class="n"&gt;event&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="s1"&gt;&amp;#39;delivered&amp;#39;&lt;/span&gt;  &lt;span class="k"&gt;THEN&lt;/span&gt; &lt;span class="n"&gt;tracking_id&lt;/span&gt;           &lt;span class="k"&gt;ELSE&lt;/span&gt; &lt;span class="k"&gt;NULL&lt;/span&gt; &lt;span class="k"&gt;END&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;AS&lt;/span&gt; &lt;span class="n"&gt;num_unique_orders_delivered&lt;/span&gt;
&lt;span class="k"&gt;FROM&lt;/span&gt;
    &lt;span class="n"&gt;core_events&lt;/span&gt;
&lt;span class="k"&gt;WHERE&lt;/span&gt;
    &lt;span class="k"&gt;day&lt;/span&gt; &lt;span class="k"&gt;BETWEEN&lt;/span&gt; &lt;span class="nb"&gt;DATE&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;2020-01-06&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;AND&lt;/span&gt; &lt;span class="nb"&gt;DATE&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;2020-01-12&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/pre&gt;&lt;/div&gt;


&lt;p&gt;Remember, &lt;code&gt;COUNT()&lt;/code&gt; doesn't count &lt;code&gt;NULL&lt;/code&gt;s.&lt;/p&gt;
&lt;p&gt;In the above example, we expect:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;&lt;code&gt;num_delivered = num_unique_orders_delivered&lt;/code&gt; because an order can only be delivered once&lt;/li&gt;
&lt;li&gt;&lt;code&gt;num_unique_orders_dispatched &amp;lt;= num_dispatches&lt;/code&gt;&lt;/li&gt;
&lt;li&gt;&lt;code&gt;num_delivered &amp;lt;= num_dispatches&lt;/code&gt;&lt;/li&gt;
&lt;/ul&gt;
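&lt;p&gt;The &lt;code&gt;NULL&lt;/code&gt; behaviour is easy to verify directly. Here's a quick sanity check, using the &lt;code&gt;VALUES&lt;/code&gt; syntax that Presto supports, with a made-up inline column &lt;code&gt;x&lt;/code&gt;:&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;SELECT
    COUNT(*) AS num_rows,       -- counts all 3 rows
    COUNT(x) AS num_non_nulls   -- counts only the 2 non-NULL values
FROM
    (VALUES 1, NULL, 2) AS t(x)
&lt;/pre&gt;&lt;/div&gt;

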
&lt;h2&gt;Use &lt;code&gt;IN (...)&lt;/code&gt; instead of &lt;code&gt;OR&lt;/code&gt;s&lt;/h2&gt;
&lt;p&gt;In the above example, for clarity and efficiency, we should only include &lt;code&gt;dispatched&lt;/code&gt; and &lt;code&gt;delivered&lt;/code&gt; rows. Instead of writing &lt;code&gt;event = 'dispatched' OR event = 'delivered'&lt;/code&gt;, use &lt;code&gt;IN()&lt;/code&gt;:&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;span class="k"&gt;SELECT&lt;/span&gt;
    &lt;span class="p"&gt;...&lt;/span&gt;
&lt;span class="k"&gt;FROM&lt;/span&gt;
    &lt;span class="n"&gt;core_events&lt;/span&gt;
&lt;span class="k"&gt;WHERE&lt;/span&gt;
    &lt;span class="n"&gt;event&lt;/span&gt; &lt;span class="k"&gt;IN&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;dispatched&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s1"&gt;&amp;#39;delivered&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/pre&gt;&lt;/div&gt;


&lt;h2&gt;Use &lt;code&gt;BETWEEN&lt;/code&gt; for dates instead of &lt;code&gt;&amp;gt;=&lt;/code&gt; and &lt;code&gt;&amp;lt;=&lt;/code&gt;&lt;/h2&gt;
&lt;p&gt;Don't write &lt;code&gt;day &amp;gt;= DATE('2020-01-06') AND day &amp;lt;= DATE('2020-01-12')&lt;/code&gt;.&lt;/p&gt;
&lt;p&gt;Write &lt;code&gt;day BETWEEN DATE('2020-01-06') AND DATE('2020-01-12')&lt;/code&gt;, it's much more readable.&lt;/p&gt;
&lt;p&gt;Remember that &lt;code&gt;BETWEEN&lt;/code&gt; is inclusive, so &lt;code&gt;x BETWEEN 1 AND 3&lt;/code&gt; is the same as &lt;code&gt;x IN (1, 2, 3)&lt;/code&gt; for an &lt;code&gt;INT&lt;/code&gt;.&lt;/p&gt;
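&lt;p&gt;A quick way to convince yourself of the inclusivity, again using Presto's inline &lt;code&gt;VALUES&lt;/code&gt; syntax:&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;SELECT
    x
FROM
    (VALUES 0, 1, 2, 3, 4) AS t(x)
WHERE
    x BETWEEN 1 AND 3  -- returns 1, 2 and 3: both endpoints are included
&lt;/pre&gt;&lt;/div&gt;

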
&lt;h2&gt;Use “advanced” aggregation functions such as &lt;code&gt;MAX_BY()&lt;/code&gt;&lt;/h2&gt;
&lt;p&gt;If you're doing analytics work, use a database meant for it, like &lt;a href="https://prestodb.io/"&gt;Presto&lt;/a&gt;. A good database meant for analytics work will have lots of useful aggregation functions like:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;&lt;code&gt;MAX_BY()&lt;/code&gt;&lt;/li&gt;
&lt;li&gt;&lt;code&gt;MIN_BY()&lt;/code&gt;&lt;/li&gt;
&lt;li&gt;&lt;code&gt;ARRAY_AGG()&lt;/code&gt;&lt;/li&gt;
&lt;li&gt;&lt;code&gt;ARBITRARY()&lt;/code&gt;&lt;/li&gt;
&lt;li&gt;&lt;code&gt;COUNT_IF()&lt;/code&gt;&lt;/li&gt;
&lt;li&gt;&lt;code&gt;APPROX_PERCENTILE()&lt;/code&gt;&lt;/li&gt;
&lt;li&gt;&lt;code&gt;KURTOSIS()&lt;/code&gt;&lt;/li&gt;
&lt;li&gt;&lt;code&gt;SKEWNESS()&lt;/code&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;To understand what these are, check the &lt;a href="https://prestodb.io/docs/current/functions/aggregate.html"&gt;Presto docs&lt;/a&gt;.&lt;/p&gt;
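&lt;p&gt;For a taste, here is a sketch of how some of these save you from clunkier constructs; the table and columns (&lt;code&gt;deliveries&lt;/code&gt;, &lt;code&gt;status&lt;/code&gt;, &lt;code&gt;driver_name&lt;/code&gt;, &lt;code&gt;delivery_time_mins&lt;/code&gt;) are made up for illustration:&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;SELECT
    driver_id,
    COUNT_IF(status = 'delivered') AS num_delivered,  -- instead of SUM(CASE WHEN ... THEN 1 ELSE 0 END)
    ARBITRARY(driver_name) AS driver_name,            -- any one value per group, avoids adding it to GROUP BY
    APPROX_PERCENTILE(delivery_time_mins, 0.5) AS median_delivery_mins
FROM
    deliveries
GROUP BY
    1
&lt;/pre&gt;&lt;/div&gt;

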
&lt;h2&gt;When taking &lt;code&gt;AVG()&lt;/code&gt;, also check min/max/median/p10/p90 values&lt;/h2&gt;
&lt;p&gt;We often write aggregations and compute the mean summary statistic with &lt;code&gt;AVG()&lt;/code&gt;. In the logistics/delivery world, the most common example is to compute the metric &lt;strong&gt;average Deliveries per Driver&lt;/strong&gt; for a fleet. So we write the query, and the fleet average comes out to &lt;code&gt;DPD=30.4&lt;/code&gt;, which means on average a driver makes 30.4 deliveries per day. Sounds reasonable. But there could be a lot of junk in there:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;drivers who were just hired and are in training, making 2-3 deliveries/day; probably shouldn't be counted&lt;/li&gt;
&lt;li&gt;internal drivers making internal deliveries; probably shouldn't be counted&lt;/li&gt;
&lt;li&gt;suppose the company just introduced self-pickups, but technically in the production system these show up as a &lt;code&gt;SELF_PICKUP&lt;/code&gt; driver, ie. all self-pickups are under one virtual driver's account, which has 1000+ deliveries; probably shouldn't be counted&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;See my earlier post &lt;a href="http://bytepawn.com/beat-the-averages.html#beat-the-averages"&gt;Beat the averages&lt;/a&gt; for more on this.&lt;/p&gt;
&lt;p&gt;There's lots of ways to catch problems like this, but one cheap way is to check the edges of the distribution:&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;span class="k"&gt;WITH&lt;/span&gt;
&lt;span class="n"&gt;daily_dpd&lt;/span&gt; &lt;span class="k"&gt;AS&lt;/span&gt;
&lt;span class="p"&gt;(&lt;/span&gt;
    &lt;span class="k"&gt;SELECT&lt;/span&gt;
        &lt;span class="k"&gt;day&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="n"&gt;driver_id&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="k"&gt;COUNT&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="k"&gt;DISTINCT&lt;/span&gt; &lt;span class="n"&gt;tracking_id&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;AS&lt;/span&gt; &lt;span class="n"&gt;num_delivered&lt;/span&gt;
    &lt;span class="k"&gt;FROM&lt;/span&gt;
        &lt;span class="n"&gt;deliveries&lt;/span&gt;
    &lt;span class="k"&gt;GROUP&lt;/span&gt; &lt;span class="k"&gt;BY&lt;/span&gt;
        &lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;2&lt;/span&gt;
&lt;span class="p"&gt;),&lt;/span&gt;

&lt;span class="k"&gt;SELECT&lt;/span&gt;
    &lt;span class="k"&gt;day&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="k"&gt;AVG&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;num_delivered&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;AS&lt;/span&gt; &lt;span class="n"&gt;dpd&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="c1"&gt;-- let&amp;#39;s get some additional statistics to make sure we&amp;#39;re&lt;/span&gt;
    &lt;span class="c1"&gt;-- not fooling ourselves by reporting the average&lt;/span&gt;
    &lt;span class="k"&gt;MIN&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;num_delivered&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;AS&lt;/span&gt; &lt;span class="n"&gt;min_delivered&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;MIN_BY&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;driver_id&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;num_delivered&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;AS&lt;/span&gt; &lt;span class="n"&gt;min_delivered_driver_id&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="k"&gt;MAX&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;num_delivered&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;AS&lt;/span&gt; &lt;span class="n"&gt;max_delivered&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;MAX_BY&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;driver_id&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;num_delivered&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;AS&lt;/span&gt; &lt;span class="n"&gt;max_delivered_driver_id&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;APPROX_PERCENTILE&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;num_delivered&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="n"&gt;dpd_p10&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;APPROX_PERCENTILE&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;num_delivered&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="mi"&gt;5&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="n"&gt;dpd_p50&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="c1"&gt;-- aka median&lt;/span&gt;
    &lt;span class="n"&gt;APPROX_PERCENTILE&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;num_delivered&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="mi"&gt;9&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="n"&gt;dpd_p90&lt;/span&gt;
&lt;span class="k"&gt;FROM&lt;/span&gt;
    &lt;span class="n"&gt;daily_dpd&lt;/span&gt;
&lt;span class="k"&gt;GROUP&lt;/span&gt; &lt;span class="k"&gt;BY&lt;/span&gt;
    &lt;span class="mi"&gt;1&lt;/span&gt;
&lt;/pre&gt;&lt;/div&gt;


&lt;h2&gt;Don’t forget to &lt;code&gt;COALESCE()&lt;/code&gt; when doing &lt;code&gt;LEFT/RIGHT/OUTER JOIN&lt;/code&gt;&lt;/h2&gt;
&lt;p&gt;Similar to the above example, suppose you want to list out drivers' names and daily DPDs for 2020-01-06:&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;span class="k"&gt;WITH&lt;/span&gt;
&lt;span class="n"&gt;daily_dpd&lt;/span&gt; &lt;span class="k"&gt;AS&lt;/span&gt;
&lt;span class="p"&gt;(&lt;/span&gt;
    &lt;span class="k"&gt;SELECT&lt;/span&gt;
        &lt;span class="n"&gt;driver_id&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="k"&gt;COUNT&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="k"&gt;DISTINCT&lt;/span&gt; &lt;span class="n"&gt;tracking_id&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;AS&lt;/span&gt; &lt;span class="n"&gt;num_delivered&lt;/span&gt;
    &lt;span class="k"&gt;FROM&lt;/span&gt;
        &lt;span class="n"&gt;deliveries&lt;/span&gt;
    &lt;span class="k"&gt;WHERE&lt;/span&gt;
        &lt;span class="k"&gt;day&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nb"&gt;DATE&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;2020-01-06&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="k"&gt;GROUP&lt;/span&gt; &lt;span class="k"&gt;BY&lt;/span&gt;
        &lt;span class="mi"&gt;1&lt;/span&gt;
&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="k"&gt;SELECT&lt;/span&gt;
    &lt;span class="n"&gt;COALESCE&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt; &lt;span class="c1"&gt;-- COALESCE() returns the first non-NULL argument passed to it&lt;/span&gt;
        &lt;span class="n"&gt;drivers&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;name&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="k"&gt;CAST&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;daily_dpd&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;driver_id&lt;/span&gt; &lt;span class="k"&gt;AS&lt;/span&gt; &lt;span class="nb"&gt;VARCHAR&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;AS&lt;/span&gt; &lt;span class="n"&gt;driver_name&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;daily_dpd&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;num_delivered&lt;/span&gt;
&lt;span class="k"&gt;FROM&lt;/span&gt;
    &lt;span class="n"&gt;daily_dpd&lt;/span&gt;
&lt;span class="k"&gt;LEFT&lt;/span&gt; &lt;span class="k"&gt;JOIN&lt;/span&gt;
    &lt;span class="n"&gt;drivers&lt;/span&gt;
&lt;span class="k"&gt;ON&lt;/span&gt;
    &lt;span class="n"&gt;daily_dpd&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;driver_id&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;drivers&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;id&lt;/span&gt;
&lt;/pre&gt;&lt;/div&gt;


&lt;p&gt;Also, always write out the table aliases (&lt;code&gt;daily_dpd&lt;/code&gt; and &lt;code&gt;drivers&lt;/code&gt; in this example) for clarity. It may be clear to you now where each column is coming from, but will you know in 3 months? Will the next guy know? &lt;/p&gt;
&lt;h2&gt;Double-check your parentheses in &lt;code&gt;WHERE&lt;/code&gt; when using &lt;code&gt;OR&lt;/code&gt;s&lt;/h2&gt;
&lt;p&gt;A nasty source of bugs in SQL code is a list of &lt;code&gt;AND&lt;/code&gt;s, with an &lt;code&gt;OR&lt;/code&gt; hiding in there with no parentheses, like:&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;span class="k"&gt;SELECT&lt;/span&gt;
    &lt;span class="p"&gt;...&lt;/span&gt;
&lt;span class="k"&gt;FROM&lt;/span&gt;
    &lt;span class="p"&gt;...&lt;/span&gt;
&lt;span class="k"&gt;WHERE&lt;/span&gt;
    &lt;span class="n"&gt;country&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="s1"&gt;&amp;#39;UAE&amp;#39;&lt;/span&gt; &lt;span class="k"&gt;AND&lt;/span&gt; &lt;span class="k"&gt;day&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nb"&gt;DATE&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;2020-01-16&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;AND&lt;/span&gt; &lt;span class="n"&gt;fleet&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="s1"&gt;&amp;#39;B2C&amp;#39;&lt;/span&gt; &lt;span class="k"&gt;OR&lt;/span&gt; &lt;span class="n"&gt;client&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="s1"&gt;&amp;#39;Acme Bank&amp;#39;&lt;/span&gt;
&lt;/pre&gt;&lt;/div&gt;


&lt;p&gt;This is almost certainly not what the writer intended. Notice I didn't follow my indentation rules here, which would help catch this. What we really want here is:&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;span class="k"&gt;SELECT&lt;/span&gt;
    &lt;span class="p"&gt;...&lt;/span&gt;
&lt;span class="k"&gt;FROM&lt;/span&gt;
    &lt;span class="p"&gt;...&lt;/span&gt;
&lt;span class="k"&gt;WHERE&lt;/span&gt;
        &lt;span class="n"&gt;country&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="s1"&gt;&amp;#39;UAE&amp;#39;&lt;/span&gt;
    &lt;span class="k"&gt;AND&lt;/span&gt; &lt;span class="k"&gt;day&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nb"&gt;DATE&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;2020-01-16&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="k"&gt;AND&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;fleet&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="s1"&gt;&amp;#39;B2C&amp;#39;&lt;/span&gt; &lt;span class="k"&gt;OR&lt;/span&gt; &lt;span class="n"&gt;client&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="s1"&gt;&amp;#39;Acme Bank&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/pre&gt;&lt;/div&gt;


&lt;h2&gt;Write &lt;code&gt;1000*1000&lt;/code&gt; instead of &lt;code&gt;100000&lt;/code&gt;&lt;/h2&gt;
&lt;p&gt;Notice how I left out a 0 above?&lt;/p&gt;
&lt;p&gt;Unfortunately, in SQL (and many other programming languages) we cannot put commas into numbers to improve readability, so we can't write &lt;code&gt;1,000,000&lt;/code&gt;. The problem with &lt;code&gt;1000000&lt;/code&gt; is that it's hard to see whether we got the number of zeros right, which can lead to nasty bugs. It's better to simulate commas by multiplying, like &lt;code&gt;1000*1000&lt;/code&gt; or &lt;code&gt;500*1000&lt;/code&gt;.&lt;/p&gt;
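&lt;p&gt;For example, when filtering for high-value shipments (&lt;code&gt;order_value&lt;/code&gt; is a made-up column):&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;SELECT
    ...
FROM
    orders
WHERE
    order_value &amp;gt;= 500*1000  -- clearly half a million; no squinting to count zeros
&lt;/pre&gt;&lt;/div&gt;

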
&lt;h2&gt;Use &lt;code&gt;DECIMAL&lt;/code&gt; not &lt;code&gt;DOUBLE&lt;/code&gt; when dealing with money&lt;/h2&gt;
&lt;p&gt;When dealing with money, never use &lt;code&gt;DOUBLE&lt;/code&gt;. There is a type for it, called &lt;code&gt;DECIMAL&lt;/code&gt;. In SQL, like in many programming languages, doubles are &lt;a href="https://en.wikipedia.org/wiki/IEEE_754"&gt;IEEE 754 floating points&lt;/a&gt;, and there's weird precision behaviour that may introduce nasty bugs. There's &lt;a href="https://0.30000000000000004.com/"&gt;entire sites&lt;/a&gt; dedicated to explaining this. Tldr = &lt;code&gt;DOUBLE&lt;/code&gt; is meant to be used for math like sine and cosine, &lt;code&gt;DECIMAL&lt;/code&gt; for money. Note that by default, if you write a literal like &lt;code&gt;0.2&lt;/code&gt;, it will be &lt;code&gt;DOUBLE&lt;/code&gt;.&lt;/p&gt;
&lt;p&gt;Here's a simple example that may surprise you:&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;span class="k"&gt;SELECT&lt;/span&gt;
    &lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt; &lt;span class="o"&gt;+&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="mi"&gt;2&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="mi"&gt;3&lt;/span&gt; &lt;span class="k"&gt;AS&lt;/span&gt; &lt;span class="n"&gt;does_math_work&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt; &lt;span class="o"&gt;+&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="mi"&gt;2&lt;/span&gt; &lt;span class="k"&gt;AS&lt;/span&gt; &lt;span class="n"&gt;result_double_type&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="k"&gt;CAST&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt; &lt;span class="k"&gt;AS&lt;/span&gt; &lt;span class="nb"&gt;DECIMAL&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;16&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="mi"&gt;2&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt; &lt;span class="o"&gt;+&lt;/span&gt; &lt;span class="k"&gt;CAST&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="mi"&gt;2&lt;/span&gt; &lt;span class="k"&gt;AS&lt;/span&gt; &lt;span class="nb"&gt;DECIMAL&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;16&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="mi"&gt;2&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="mi"&gt;3&lt;/span&gt; &lt;span class="k"&gt;AS&lt;/span&gt; &lt;span class="n"&gt;does_this_other_math_work&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="k"&gt;CAST&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt; &lt;span class="k"&gt;AS&lt;/span&gt; &lt;span class="nb"&gt;DECIMAL&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;16&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="mi"&gt;2&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt; &lt;span class="o"&gt;+&lt;/span&gt; &lt;span class="k"&gt;CAST&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="mi"&gt;2&lt;/span&gt; &lt;span class="k"&gt;AS&lt;/span&gt; &lt;span class="nb"&gt;DECIMAL&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;16&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="mi"&gt;2&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt; &lt;span class="k"&gt;AS&lt;/span&gt; &lt;span class="n"&gt;result_decimal_type&lt;/span&gt;
&lt;/pre&gt;&lt;/div&gt;


&lt;p&gt;This returns on my Presto DB:&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;span class="o"&gt;+&lt;/span&gt;&lt;span class="c1"&gt;-----------------+---------------------+---------------------------+---------------------+&lt;/span&gt;
&lt;span class="o"&gt;|&lt;/span&gt; &lt;span class="n"&gt;does_math_work&lt;/span&gt;  &lt;span class="o"&gt;|&lt;/span&gt; &lt;span class="n"&gt;result_double_type&lt;/span&gt;  &lt;span class="o"&gt;|&lt;/span&gt; &lt;span class="n"&gt;does_this_other_math_work&lt;/span&gt; &lt;span class="o"&gt;|&lt;/span&gt; &lt;span class="n"&gt;result_decimal_type&lt;/span&gt; &lt;span class="o"&gt;|&lt;/span&gt;
&lt;span class="o"&gt;+&lt;/span&gt;&lt;span class="c1"&gt;-----------------+---------------------+---------------------------+---------------------+&lt;/span&gt;
&lt;span class="o"&gt;|&lt;/span&gt; &lt;span class="k"&gt;false&lt;/span&gt;           &lt;span class="o"&gt;|&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="mi"&gt;30000000000000004&lt;/span&gt; &lt;span class="o"&gt;|&lt;/span&gt; &lt;span class="k"&gt;true&lt;/span&gt;                      &lt;span class="o"&gt;|&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="mi"&gt;3&lt;/span&gt;                 &lt;span class="o"&gt;|&lt;/span&gt;
&lt;span class="o"&gt;+&lt;/span&gt;&lt;span class="c1"&gt;-----------------+---------------------+---------------------------+---------------------+&lt;/span&gt;
&lt;/pre&gt;&lt;/div&gt;</content><category term="data"></category><category term="programming"></category><category term="sql"></category></entry><entry><title>How I write SQL code</title><link href="https://bytepawn.com/how-i-write-sql-code.html" rel="alternate"></link><published>2020-01-24T00:00:00+01:00</published><updated>2020-01-24T00:00:00+01:00</updated><author><name>Marton Trencseni</name></author><id>tag:bytepawn.com,2020-01-24:/how-i-write-sql-code.html</id><summary type="html">&lt;p&gt;This is a simple post about SQL code formatting. Most of this comes from my time as a Data Engineer at Facebook. &lt;br/&gt;&lt;br/&gt; &lt;img src="/images/sql.png" alt="SQL code" style="width: 600px;"/&gt;&lt;/p&gt;</summary><content type="html">&lt;h2&gt;Introduction&lt;/h2&gt;
&lt;p&gt;This is a simple post about SQL code formatting. Most of this comes from my time as a Data Engineer at Facebook.&lt;/p&gt;
&lt;p&gt;I’ve always cared a lot about writing readable code. Readability has several components:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;directory layout&lt;/li&gt;
&lt;li&gt;breaking code into files (e.g. what to put in headers, how many)&lt;/li&gt;
&lt;li&gt;code layout within files (e.g. ordering of includes, templates, classes, functions)&lt;/li&gt;
&lt;li&gt;naming of files, classes, functions and variables&lt;/li&gt;
&lt;li&gt;indentation, line width&lt;/li&gt;
&lt;li&gt;comments in code files&lt;/li&gt;
&lt;li&gt;the modular structure of the code itself (e.g. design patterns)&lt;/li&gt;
&lt;li&gt;high-level documentation that explains design choices&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;Readable code should be relatively easy to explore and read. I say “relatively”, because most programs have a lot of innate complexity that you have to wrap your head around, even if it’s well-written.&lt;/p&gt;
&lt;p&gt;A famous example of high quality and readable (C++) code is the &lt;a href="https://github.com/mtrencseni/quake3"&gt;Quake3&lt;/a&gt; and &lt;a href="https://github.com/mtrencseni/doom3"&gt;Doom3&lt;/a&gt; code by &lt;a href="https://en.wikipedia.org/wiki/John_Carmack"&gt;John Carmack&lt;/a&gt;. When I was working on &lt;a href="https://github.com/scalien/scaliendb"&gt;ScalienDB&lt;/a&gt; many years ago, I tried to write C++ code in a similar style. A good C++ book on the above points is &lt;a href="https://www.amazon.com/Large-Scale-Software-Design-John-Lakos/dp/0201633620"&gt;Large Scale Software Design&lt;/a&gt; by John Lakos.&lt;/p&gt;
&lt;p&gt;Writing good SQL code is much simpler than writing good C++ code. SQL code has no templates or classes, and in my experience analytical “data mining” SQL code also has no functions (=stored procedures). It’s really just a lot of standalone &lt;code&gt;SELECT&lt;/code&gt;s, sometimes chained with &lt;code&gt;WITH&lt;/code&gt;, or by dropping results into tables and &lt;code&gt;SELECT&lt;/code&gt;ing out of those tables in the next step. Having said that, I still see a lot of data people writing hard-to-read SQL code.&lt;/p&gt;
&lt;p&gt;Without further ado, my rules for writing SQL code follow below.&lt;/p&gt;
&lt;h2&gt;Use all upper-case for keywords like &lt;code&gt;SELECT&lt;/code&gt;&lt;/h2&gt;
&lt;p&gt;Also use upper-case for built-in types and functions like &lt;code&gt;BIGINT&lt;/code&gt; and &lt;code&gt;COUNT()&lt;/code&gt;.
Use lower-case for column names and table names.&lt;/p&gt;
&lt;h2&gt;Keep &lt;code&gt;SELECT .. FROM .. WHERE .. GROUP BY&lt;/code&gt; unindented&lt;/h2&gt;
&lt;p&gt;But indent the rest. An example so far:&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;span class="k"&gt;SELECT&lt;/span&gt;
    &lt;span class="n"&gt;customer_name&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;customer_street&lt;/span&gt;
&lt;span class="k"&gt;FROM&lt;/span&gt;
    &lt;span class="n"&gt;company_metrics&lt;/span&gt;
&lt;span class="k"&gt;WHERE&lt;/span&gt;
    &lt;span class="n"&gt;customer_name&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="s1"&gt;&amp;#39;Marton Trencseni&amp;#39;&lt;/span&gt;
&lt;/pre&gt;&lt;/div&gt;


&lt;h2&gt;Line up &lt;code&gt;WHERE&lt;/code&gt; conditions&lt;/h2&gt;
&lt;p&gt;God is merciful because &lt;code&gt;AND⎵&lt;/code&gt; is 4 characters, a good tab width, so &lt;code&gt;WHERE&lt;/code&gt; conditions are to be lined up like (same for &lt;code&gt;JOIN&lt;/code&gt; conditions):&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;span class="k"&gt;SELECT&lt;/span&gt;
    &lt;span class="p"&gt;...&lt;/span&gt;
&lt;span class="k"&gt;FROM&lt;/span&gt;
    &lt;span class="n"&gt;deliveries&lt;/span&gt;
&lt;span class="k"&gt;WHERE&lt;/span&gt;
        &lt;span class="n"&gt;country&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="s1"&gt;&amp;#39;UAE&amp;#39;&lt;/span&gt;
    &lt;span class="k"&gt;AND&lt;/span&gt; &lt;span class="k"&gt;day&lt;/span&gt; &lt;span class="o"&gt;&amp;gt;=&lt;/span&gt; &lt;span class="nb"&gt;DATE&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;2019-07-01&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="k"&gt;AND&lt;/span&gt; &lt;span class="n"&gt;DAY_OF_WEEK&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="k"&gt;day&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;!=&lt;/span&gt; &lt;span class="mi"&gt;5&lt;/span&gt;
    &lt;span class="k"&gt;AND&lt;/span&gt; &lt;span class="n"&gt;scheduled_accuracy_meters&lt;/span&gt; &lt;span class="o"&gt;&amp;lt;=&lt;/span&gt; &lt;span class="mi"&gt;10&lt;/span&gt;&lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="mi"&gt;1000&lt;/span&gt;
&lt;/pre&gt;&lt;/div&gt;


&lt;h2&gt;Write &lt;code&gt;GROUP BY 1, 2&lt;/code&gt;&lt;/h2&gt;
&lt;p&gt;Always put the grouped columns first in the column list, and write &lt;code&gt;GROUP BY 1, 2 .. N&lt;/code&gt;:&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;span class="k"&gt;SELECT&lt;/span&gt;
    &lt;span class="n"&gt;region_fleet&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="k"&gt;CASE&lt;/span&gt; &lt;span class="k"&gt;WHEN&lt;/span&gt; &lt;span class="n"&gt;status&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="s1"&gt;&amp;#39;Delivered&amp;#39;&lt;/span&gt; &lt;span class="k"&gt;THEN&lt;/span&gt; &lt;span class="s1"&gt;&amp;#39;Delivered&amp;#39;&lt;/span&gt; &lt;span class="k"&gt;ELSE&lt;/span&gt; &lt;span class="s1"&gt;&amp;#39;Not Delivered&amp;#39;&lt;/span&gt; &lt;span class="k"&gt;END&lt;/span&gt; &lt;span class="k"&gt;AS&lt;/span&gt; &lt;span class="n"&gt;status&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;DATE_TRUNC&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;week&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="k"&gt;day&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;AS&lt;/span&gt; &lt;span class="n"&gt;week&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="k"&gt;COUNT&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="k"&gt;DISTINCT&lt;/span&gt; &lt;span class="k"&gt;ROW&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="k"&gt;day&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;so_number&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt; &lt;span class="k"&gt;AS&lt;/span&gt; &lt;span class="n"&gt;num_orders&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="k"&gt;COUNT&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="k"&gt;DISTINCT&lt;/span&gt; &lt;span class="k"&gt;CASE&lt;/span&gt; &lt;span class="k"&gt;WHEN&lt;/span&gt; &lt;span class="n"&gt;scheduled_accuracy_meters&lt;/span&gt; &lt;span class="o"&gt;&amp;lt;=&lt;/span&gt; &lt;span class="mi"&gt;500&lt;/span&gt; &lt;span class="k"&gt;THEN&lt;/span&gt; &lt;span class="k"&gt;ROW&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="k"&gt;day&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;so_number&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;ELSE&lt;/span&gt; &lt;span class="k"&gt;NULL&lt;/span&gt; &lt;span class="k"&gt;END&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;AS&lt;/span&gt; &lt;span class="n"&gt;num_accurate&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="k"&gt;AVG&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;scheduled_accuracy_meters&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;AS&lt;/span&gt; &lt;span class="n"&gt;scheduled_accuracy_meters&lt;/span&gt;
&lt;span class="k"&gt;FROM&lt;/span&gt;
    &lt;span class="n"&gt;deliveries&lt;/span&gt;
&lt;span class="k"&gt;WHERE&lt;/span&gt;
    &lt;span class="p"&gt;...&lt;/span&gt;
&lt;span class="k"&gt;GROUP&lt;/span&gt; &lt;span class="k"&gt;BY&lt;/span&gt;
    &lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;2&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;3&lt;/span&gt;
&lt;/pre&gt;&lt;/div&gt;


&lt;h2&gt;Indent &lt;code&gt;WHEN&lt;/code&gt; relative to &lt;code&gt;CASE&lt;/code&gt;&lt;/h2&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;span class="k"&gt;SELECT&lt;/span&gt;
    &lt;span class="p"&gt;...,&lt;/span&gt;
    &lt;span class="k"&gt;CASE&lt;/span&gt;
        &lt;span class="k"&gt;WHEN&lt;/span&gt; &lt;span class="n"&gt;schedule_channel&lt;/span&gt; &lt;span class="k"&gt;IN&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;Ticker&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;THEN&lt;/span&gt; &lt;span class="s1"&gt;&amp;#39;Ticker&amp;#39;&lt;/span&gt;
        &lt;span class="k"&gt;WHEN&lt;/span&gt; &lt;span class="n"&gt;schedule_channel&lt;/span&gt; &lt;span class="k"&gt;IN&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;CallCenter&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s1"&gt;&amp;#39;CSA&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;THEN&lt;/span&gt; &lt;span class="s1"&gt;&amp;#39;Callcenter&amp;#39;&lt;/span&gt;
        &lt;span class="k"&gt;WHEN&lt;/span&gt; &lt;span class="n"&gt;schedule_channel&lt;/span&gt; &lt;span class="k"&gt;IN&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;CallCenterBlind&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s1"&gt;&amp;#39;CSA-BD&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;THEN&lt;/span&gt; &lt;span class="s1"&gt;&amp;#39;Blind&amp;#39;&lt;/span&gt;
        &lt;span class="k"&gt;WHEN&lt;/span&gt; &lt;span class="n"&gt;schedule_channel&lt;/span&gt; &lt;span class="k"&gt;IN&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;AutoReschedule&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s1"&gt;&amp;#39;AI-AutoReschedul&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s1"&gt;&amp;#39;Auto Schedule&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s1"&gt;&amp;#39;DriverReschedule&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;THEN&lt;/span&gt; &lt;span class="s1"&gt;&amp;#39;Rest&amp;#39;&lt;/span&gt;
        &lt;span class="k"&gt;WHEN&lt;/span&gt; &lt;span class="n"&gt;schedule_channel&lt;/span&gt; &lt;span class="k"&gt;IN&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;BulkSchedule&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;THEN&lt;/span&gt; &lt;span class="s1"&gt;&amp;#39;Bulk&amp;#39;&lt;/span&gt;
        &lt;span class="k"&gt;WHEN&lt;/span&gt; &lt;span class="n"&gt;schedule_channel&lt;/span&gt; &lt;span class="k"&gt;IN&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;CallCenterSelfPickupPoint&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s1"&gt;&amp;#39;CallCenterSelfPickupWarehouse&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;THEN&lt;/span&gt; &lt;span class="s1"&gt;&amp;#39;Self-pickup&amp;#39;&lt;/span&gt;
        &lt;span class="k"&gt;WHEN&lt;/span&gt; &lt;span class="n"&gt;schedule_channel&lt;/span&gt; &lt;span class="k"&gt;IN&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;AI-AutoSchedulin&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s1"&gt;&amp;#39;AI_Rango&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s1"&gt;&amp;#39;AI-Rango&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;THEN&lt;/span&gt; &lt;span class="s1"&gt;&amp;#39;AI&amp;#39;&lt;/span&gt;
        &lt;span class="k"&gt;WHEN&lt;/span&gt; &lt;span class="n"&gt;schedule_channel&lt;/span&gt; &lt;span class="k"&gt;IN&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;MWeb&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s1"&gt;&amp;#39;MWEB&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s1"&gt;&amp;#39;mobile-ios&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s1"&gt;&amp;#39;mobile-android&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;THEN&lt;/span&gt; &lt;span class="s1"&gt;&amp;#39;Mweb/apps (various)&amp;#39;&lt;/span&gt;
        &lt;span class="k"&gt;WHEN&lt;/span&gt; &lt;span class="n"&gt;schedule_channel&lt;/span&gt; &lt;span class="k"&gt;IN&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;Whatsapp&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s1"&gt;&amp;#39;Callcenter&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s1"&gt;&amp;#39;Blind&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s1"&gt;&amp;#39;Driver (various)&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s1"&gt;&amp;#39;Bulk&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s1"&gt;&amp;#39;AI&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;THEN&lt;/span&gt; &lt;span class="n"&gt;schedule_channel&lt;/span&gt;
        &lt;span class="k"&gt;ELSE&lt;/span&gt; &lt;span class="s1"&gt;&amp;#39;Rest&amp;#39;&lt;/span&gt;
    &lt;span class="k"&gt;END&lt;/span&gt; &lt;span class="k"&gt;AS&lt;/span&gt; &lt;span class="n"&gt;schedule_channel&lt;/span&gt;
&lt;span class="k"&gt;FROM&lt;/span&gt;
    &lt;span class="n"&gt;deliveries&lt;/span&gt;
&lt;/pre&gt;&lt;/div&gt;


&lt;h2&gt;Use &lt;code&gt;WITH&lt;/code&gt; instead of nested sub-&lt;code&gt;SELECT&lt;/code&gt;s&lt;/h2&gt;
&lt;p&gt;Sub-&lt;code&gt;SELECT&lt;/code&gt;s with indenting are hard to read. Instead, create aliases with &lt;code&gt;WITH&lt;/code&gt;, and chain them. Put the &lt;code&gt;WITH&lt;/code&gt; on a separate line, and then write the aliases. If I have no better idea, I call the aliases &lt;code&gt;step1, step2 ...&lt;/code&gt;&lt;/p&gt;
&lt;p&gt;For example, suppose table &lt;code&gt;deliveries&lt;/code&gt; has columns &lt;code&gt;scheduled_coordinates&lt;/code&gt; and &lt;code&gt;actual_coordinates&lt;/code&gt; as &lt;code&gt;lat, lon&lt;/code&gt; string, and you want to compute the meter distance with the &lt;a href="https://en.wikipedia.org/wiki/Haversine_formula"&gt;Haversine formula&lt;/a&gt;:&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;span class="k"&gt;WITH&lt;/span&gt;
&lt;span class="n"&gt;step1&lt;/span&gt; &lt;span class="k"&gt;AS&lt;/span&gt;
&lt;span class="p"&gt;(&lt;/span&gt;
    &lt;span class="k"&gt;SELECT&lt;/span&gt;
        &lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="k"&gt;TRIM&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;SPLIT_PART&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;actual_coordinates&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;    &lt;span class="s1"&gt;&amp;#39; &amp;#39;&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt; &lt;span class="k"&gt;AS&lt;/span&gt; &lt;span class="n"&gt;act_lat_str&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="k"&gt;TRIM&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;SPLIT_PART&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;actual_coordinates&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;    &lt;span class="s1"&gt;&amp;#39; &amp;#39;&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;2&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt; &lt;span class="k"&gt;AS&lt;/span&gt; &lt;span class="n"&gt;act_lon_str&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="k"&gt;TRIM&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;SPLIT_PART&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;scheduled_coordinates&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s1"&gt;&amp;#39; &amp;#39;&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt; &lt;span class="k"&gt;AS&lt;/span&gt; &lt;span class="n"&gt;sch_lat_str&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="k"&gt;TRIM&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;SPLIT_PART&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;scheduled_coordinates&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s1"&gt;&amp;#39; &amp;#39;&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;2&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt; &lt;span class="k"&gt;AS&lt;/span&gt; &lt;span class="n"&gt;sch_lon_str&lt;/span&gt;
    &lt;span class="k"&gt;FROM&lt;/span&gt;
        &lt;span class="n"&gt;deliveries&lt;/span&gt;
&lt;span class="p"&gt;),&lt;/span&gt;
&lt;span class="n"&gt;step2&lt;/span&gt; &lt;span class="k"&gt;AS&lt;/span&gt;
&lt;span class="p"&gt;(&lt;/span&gt;
    &lt;span class="k"&gt;SELECT&lt;/span&gt;
        &lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="k"&gt;CASE&lt;/span&gt;
            &lt;span class="k"&gt;WHEN&lt;/span&gt; &lt;span class="n"&gt;REGEXP_LIKE&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;act_lat_str&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s1"&gt;&amp;#39;^[0-9]+\.[0-9]+$&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;THEN&lt;/span&gt; &lt;span class="k"&gt;CAST&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;act_lat_str&lt;/span&gt; &lt;span class="k"&gt;AS&lt;/span&gt; &lt;span class="n"&gt;DOUBLE&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
            &lt;span class="k"&gt;ELSE&lt;/span&gt; &lt;span class="k"&gt;NULL&lt;/span&gt;
        &lt;span class="k"&gt;END&lt;/span&gt; &lt;span class="k"&gt;AS&lt;/span&gt; &lt;span class="n"&gt;actual_lat&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="k"&gt;CASE&lt;/span&gt;
            &lt;span class="k"&gt;WHEN&lt;/span&gt; &lt;span class="n"&gt;REGEXP_LIKE&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;act_lon_str&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s1"&gt;&amp;#39;^[0-9]+\.[0-9]+$&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;THEN&lt;/span&gt; &lt;span class="k"&gt;CAST&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;act_lon_str&lt;/span&gt; &lt;span class="k"&gt;AS&lt;/span&gt; &lt;span class="n"&gt;DOUBLE&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
            &lt;span class="k"&gt;ELSE&lt;/span&gt; &lt;span class="k"&gt;NULL&lt;/span&gt;
        &lt;span class="k"&gt;END&lt;/span&gt; &lt;span class="k"&gt;AS&lt;/span&gt; &lt;span class="n"&gt;actual_lon&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="k"&gt;CASE&lt;/span&gt;
            &lt;span class="k"&gt;WHEN&lt;/span&gt; &lt;span class="n"&gt;REGEXP_LIKE&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;sch_lat_str&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s1"&gt;&amp;#39;^[0-9]+\.[0-9]+$&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;THEN&lt;/span&gt; &lt;span class="k"&gt;CAST&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;sch_lat_str&lt;/span&gt; &lt;span class="k"&gt;AS&lt;/span&gt; &lt;span class="n"&gt;DOUBLE&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
            &lt;span class="k"&gt;ELSE&lt;/span&gt; &lt;span class="k"&gt;NULL&lt;/span&gt;
        &lt;span class="k"&gt;END&lt;/span&gt; &lt;span class="k"&gt;AS&lt;/span&gt; &lt;span class="n"&gt;scheduled_lat&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="k"&gt;CASE&lt;/span&gt;
            &lt;span class="k"&gt;WHEN&lt;/span&gt; &lt;span class="n"&gt;REGEXP_LIKE&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;sch_lon_str&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s1"&gt;&amp;#39;^[0-9]+\.[0-9]+$&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;THEN&lt;/span&gt; &lt;span class="k"&gt;CAST&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;sch_lon_str&lt;/span&gt; &lt;span class="k"&gt;AS&lt;/span&gt; &lt;span class="n"&gt;DOUBLE&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
            &lt;span class="k"&gt;ELSE&lt;/span&gt; &lt;span class="k"&gt;NULL&lt;/span&gt;
        &lt;span class="k"&gt;END&lt;/span&gt; &lt;span class="k"&gt;AS&lt;/span&gt; &lt;span class="n"&gt;scheduled_lon&lt;/span&gt;
    &lt;span class="k"&gt;FROM&lt;/span&gt;
        &lt;span class="n"&gt;step1&lt;/span&gt;
&lt;span class="p"&gt;),&lt;/span&gt;
&lt;span class="n"&gt;step3&lt;/span&gt; &lt;span class="k"&gt;AS&lt;/span&gt;
&lt;span class="p"&gt;(&lt;/span&gt;
    &lt;span class="k"&gt;SELECT&lt;/span&gt;
        &lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="mi"&gt;6371&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt; &lt;span class="mi"&gt;1000&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt; &lt;span class="mi"&gt;2&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt; &lt;span class="n"&gt;ASIN&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;SQRT&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
            &lt;span class="n"&gt;POW&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
                &lt;span class="n"&gt;SIN&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;   &lt;span class="n"&gt;RADIANS&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;scheduled_lat&lt;/span&gt; &lt;span class="o"&gt;-&lt;/span&gt; &lt;span class="n"&gt;actual_lat&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;/&lt;/span&gt; &lt;span class="mi"&gt;2&lt;/span&gt;   &lt;span class="p"&gt;),&lt;/span&gt;
                &lt;span class="mi"&gt;2&lt;/span&gt;
            &lt;span class="p"&gt;)&lt;/span&gt;
            &lt;span class="o"&gt;+&lt;/span&gt;
              &lt;span class="n"&gt;COS&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;RADIANS&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;actual_lat&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;
            &lt;span class="o"&gt;*&lt;/span&gt; &lt;span class="n"&gt;COS&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;RADIANS&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;scheduled_lat&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;
            &lt;span class="o"&gt;*&lt;/span&gt; &lt;span class="n"&gt;POW&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
                &lt;span class="n"&gt;SIN&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;   &lt;span class="n"&gt;RADIANS&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;scheduled_lon&lt;/span&gt; &lt;span class="o"&gt;-&lt;/span&gt; &lt;span class="n"&gt;actual_lon&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;/&lt;/span&gt; &lt;span class="mi"&gt;2&lt;/span&gt;   &lt;span class="p"&gt;),&lt;/span&gt;
                &lt;span class="mi"&gt;2&lt;/span&gt;
            &lt;span class="p"&gt;)&lt;/span&gt;
        &lt;span class="p"&gt;))&lt;/span&gt; &lt;span class="k"&gt;AS&lt;/span&gt; &lt;span class="n"&gt;scheduled_accuracy_meters&lt;/span&gt;
    &lt;span class="k"&gt;FROM&lt;/span&gt;
        &lt;span class="n"&gt;step2&lt;/span&gt;
&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/pre&gt;&lt;/div&gt;


&lt;h2&gt;Long lines are okay&lt;/h2&gt;
&lt;p&gt;As you can see, long lines are okay in my opinion. We all have widescreen aspect ratio screens (MacBooks), let's use that space. Shorter lines &lt;em&gt;are&lt;/em&gt; more readable, but breaking everything into shorter lines, and re-breaking when making changes, is too much work in my opinion, especially when dealing with lengthy "business logic" in SQL.&lt;/p&gt;
&lt;h2&gt;Break the rules for readability and flow of code&lt;/h2&gt;
&lt;p&gt;Rules are made to be broken. If doing it in another way leads to better readability, break the rules.&lt;/p&gt;
&lt;p&gt;For example, sometimes we &lt;code&gt;SELECT&lt;/code&gt; out a horizontal/vertical part of a table in a quick sub-&lt;code&gt;SELECT&lt;/code&gt; to help the query optimizer. In cases like this I don't use &lt;code&gt;WITH&lt;/code&gt; and keep it in one line, like:&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;span class="k"&gt;SELECT&lt;/span&gt;
    &lt;span class="p"&gt;...&lt;/span&gt;

&lt;span class="k"&gt;FROM&lt;/span&gt;
    &lt;span class="n"&gt;deliveries&lt;/span&gt; &lt;span class="n"&gt;dls&lt;/span&gt;
&lt;span class="k"&gt;LEFT&lt;/span&gt; &lt;span class="k"&gt;JOIN&lt;/span&gt;
    &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="k"&gt;SELECT&lt;/span&gt; &lt;span class="n"&gt;id&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;name&lt;/span&gt; &lt;span class="k"&gt;FROM&lt;/span&gt; &lt;span class="n"&gt;drivers&lt;/span&gt; &lt;span class="k"&gt;WHERE&lt;/span&gt; &lt;span class="n"&gt;country&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="s1"&gt;&amp;#39;UAE&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="n"&gt;drv&lt;/span&gt;
&lt;span class="k"&gt;ON&lt;/span&gt;
    &lt;span class="n"&gt;dls&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;driver_id&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;drv&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;id&lt;/span&gt;
&lt;/pre&gt;&lt;/div&gt;


&lt;h2&gt;Follow the same rules when writing interactive SQL code&lt;/h2&gt;
&lt;p&gt;I follow these same rules when writing one-off queries in an interactive console. That way it’s easier to stick to them when writing long-lived code that goes into ETL and the repo. Also, one-off code often ends up being ETL’d.&lt;/p&gt;
&lt;h2&gt;Alternative ways to write SQL&lt;/h2&gt;
&lt;p&gt;Finally, some other, more comprehensive guides:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;&lt;a href="https://www.sqlstyle.guide/"&gt;Simon Holywell's SQL style guide&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://about.gitlab.com/handbook/business-ops/data-team/sql-style-guide/"&gt;Gitlab's SQL style guide&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://github.com/mattm/sql-style-guide"&gt;Matt Mazur's SQL style guide&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;</content><category term="data"></category><category term="programming"></category><category term="sql"></category></entry><entry><title>Small team planning</title><link href="https://bytepawn.com/small-team-planning.html" rel="alternate"></link><published>2020-01-10T00:00:00+01:00</published><updated>2020-01-10T00:00:00+01:00</updated><author><name>Marton Trencseni</name></author><id>tag:bytepawn.com,2020-01-10:/small-team-planning.html</id><summary type="html">&lt;p&gt;I’ve worked at 5-10 different organizations, most of them were startups or startuppy companies. I’ve done a lot of planning in small teams, and also taken part in company-wide leadership planning. Here I will describe what has worked well for me in small team settings, focusing on time estimation. &lt;br/&gt;&lt;br/&gt; &lt;img src="/images/peak.png" alt="Reaching the peak" style="width: 200px;"/&gt;&lt;/p&gt;</summary><content type="html">&lt;h2&gt;Introduction&lt;/h2&gt;
&lt;p&gt;I’ve worked at 5-10 different organizations, most of them startups or startuppy companies. I’ve done a lot of planning in small teams, and also taken part in company-wide leadership planning. Here I will describe what has worked well for me in &lt;strong&gt;small team settings&lt;/strong&gt;, focusing on time estimation.&lt;/p&gt;
&lt;h2&gt;Team activity&lt;/h2&gt;
&lt;p&gt;Team planning typically occurs at the beginning of each quarter or half. It is an activity for the whole team; everybody needs to own the plan. The techlead or engineering manager cannot do it alone, because there are multiple steps where everybody’s input is needed to generate project ideas and estimates.&lt;/p&gt;
&lt;h2&gt;Planning vs goaling&lt;/h2&gt;
&lt;p&gt;The focus here is on planning. Planning is not the same as goaling.&lt;/p&gt;
&lt;p&gt;&amp;nbsp; &amp;nbsp; &lt;strong&gt;Goal&lt;/strong&gt; = what we want to do&lt;/p&gt;
&lt;p&gt;&amp;nbsp; &amp;nbsp; &lt;strong&gt;Plan&lt;/strong&gt; = what we can do&lt;/p&gt;
&lt;p&gt;A plan incorporates resource and dependency constraints: people, time, money, team structure, software architecture, etc. Another way of saying this:&lt;/p&gt;
&lt;p&gt;&amp;nbsp; &amp;nbsp; &lt;strong&gt;Goal&lt;/strong&gt; = where we want to get&lt;/p&gt;
&lt;p&gt;&amp;nbsp; &amp;nbsp; &lt;strong&gt;Plan&lt;/strong&gt; = how we get there&lt;/p&gt;
&lt;p&gt;When goaling, if you don’t take into account resource and dependency constraints (=what makes a good plan), the goals will be unrealistic. When planning, if you don’t have a goaling direction, you won’t work in the right direction. Planning and goaling go hand in hand; it’s a back-and-forth.&lt;/p&gt;
&lt;p&gt;&lt;img src="/images/peak.png" alt="Planning peak" style="width: 200px;"/&gt;&lt;/p&gt;
&lt;p&gt;In my experience, &lt;strong&gt;a lot of people are bad at planning, and tend to overpromise and underdeliver&lt;/strong&gt;. I prefer to underpromise and overdeliver.&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;When I was a young Techlead, the first time I did small team planning, we spent hours writing lots of sticky notes and estimating man-days; we channeled &lt;a href="https://en.wikipedia.org/wiki/Fred_Brooks"&gt;Fred Brooks&lt;/a&gt;. At the end, when we were finished, we leaned back and looked at our plan, and I laughed out loud. I said: “Somehow we managed to fit all the projects we can think of into the quarter! We’ll be laid off because there will be nothing left to do.” It was a joke. We overpromised and underdelivered, because we didn’t know how to make a good plan.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;h2&gt;Counting days&lt;/h2&gt;
&lt;p&gt;The basic process is straightforward:&lt;/p&gt;
&lt;ol&gt;
&lt;li&gt;Draw a calendar on the whiteboard. Columns are people, rows are weeks in the quarter / half that you’re planning for. If you want, make cells for days, but I usually don’t. The point of the exercise is to count how many weeks we will have to work on “new stuff”.&lt;/li&gt;
&lt;li&gt;Count how many weeks are in the quarter, and make a row for each week. Number the week as the week-of-year, eg. W44.&lt;/li&gt;
&lt;li&gt;Cross out any weeks that are in the past.&lt;/li&gt;
&lt;li&gt;If you’re doing this exercise in the middle of the week, cross out the current week.&lt;/li&gt;
&lt;li&gt;Cross out bank holidays.&lt;/li&gt;
&lt;li&gt;Cross out days for team offsites, hackathons, etc.&lt;/li&gt;
&lt;li&gt;All team members cross out weeks in their column when they will be on vacation. If you’re unsure which week it is, take a guess.&lt;/li&gt;
&lt;li&gt;Mark 1-2 days per week for interrupts. In my line of work, for a generalist data team, there are lots of interrupts (the same goes for many software engineering teams). In my experience, 20-30% of time should be assumed to be interrupts (infrastructure breaks, logs change, new data breaks a pipeline, dashboards break, an ML model regresses, questions from PMs, SWEs, designers, AMs, sales, marketing, finance...).&lt;/li&gt;
&lt;li&gt;Estimate meeting load. Everybody opens their calendar and looks at the last 2 weeks to estimate their meeting load. Imagine compressing your meeting load into full days, and cross out that many days. Note: meetings are not interrupts here.&lt;/li&gt;
&lt;li&gt;The Techlead for a 4+ person team will spend 50%+ of their time on management duties. Take that time off from that person’s column.&lt;/li&gt;
&lt;/ol&gt;
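The steps above boil down to simple arithmetic. A minimal sketch, with made-up figures for a hypothetical 13-week quarter (the function name and all numbers are illustrative, not a real team's data):

```python
# Rough per-person budget of weeks left for "new stuff" in a quarter.
# All inputs and the example figures below are illustrative.

WEEKS_IN_QUARTER = 13
WORKDAYS_PER_WEEK = 5

def available_weeks(vacation_weeks, holiday_days, offsite_days,
                    interrupt_days_per_week, meeting_days_per_week,
                    management_fraction=0.0):
    weeks = WEEKS_IN_QUARTER - vacation_weeks
    weeks -= (holiday_days + offsite_days) / WORKDAYS_PER_WEEK
    # Fraction of each remaining week not eaten by interrupts and meetings.
    focus = 1.0 - (interrupt_days_per_week + meeting_days_per_week) / WORKDAYS_PER_WEEK
    weeks *= focus
    weeks *= 1.0 - management_fraction  # e.g. 0.5 for the techlead
    return round(weeks, 1)

# An engineer with 1 week of vacation, 2 bank holidays, 1 offsite day,
# 1 interrupt day and 1 meeting day per week:
print(available_weeks(1, 2, 1, 1, 1))  # 6.8 weeks of a 13-week quarter
```

Note how quickly 13 weeks shrinks to roughly half; that is the realism the whiteboard exercise is meant to force.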
&lt;p&gt;At this point, there is a per-person budget left. This is how much time is available to work on new stuff, to have direct impact.&lt;/p&gt;
&lt;p&gt;This is what an in-progress planning table might look like:&lt;/p&gt;
&lt;p&gt;&lt;img src="/images/planning-table.png" alt="Planning table" style="width: 400px;"/&gt;&lt;/p&gt;
&lt;h2&gt;Direct vs indirect impact&lt;/h2&gt;
&lt;p&gt;Sometimes people get upset because it seems too little time is left for new stuff. You could argue that putting aside days for interrupts and meetings is "planning to underperform". I would not agree with that, it's just a realistic plan.&lt;/p&gt;
&lt;p&gt;On the one hand, yes, we all spend a lot of time in meetings, and some of the meetings are useless; but that has nothing to do with planning. Planning may draw attention to it, but it’s not a planning issue. To decrease interrupts, have your team set up the interrupt magnet system (=an assigned team member who handles interrupts that day/week; the same idea works for infra magnets). If you manage to push down interrupts and meeting time, great: you'll have more time, you'll end up underpromising and overdelivering, and on the next cycle you can adjust down your interrupt/meeting estimate.&lt;/p&gt;
&lt;p&gt;&lt;img src="/images/indirect-impact.png" alt="Indirect impact" style="width: 400px;"/&gt;&lt;/p&gt;
&lt;p&gt;On the other hand, and this is the point: a lot of impact is delivered indirectly. It may be realized by another team, or later. Spending time talking to people or fixing stuff is not underperforming. A great example is a hackathon: at Facebook, a lot of important features were first built at hackathons, even though, in terms of planning, a hackathon would be “crossed-off time”. Hackathons also have a lot of other positive effects.&lt;/p&gt;
&lt;h2&gt;Project list&lt;/h2&gt;
&lt;p&gt;As a team, write a list of projects the team would like to work on, taking into account existing goals. Have everybody write their ideas on a sticky note (before the meeting), and then collate the ideas together on the whiteboard. The goal is to make sure everybody can give their ideas without being over-talked in a meeting setting.&lt;/p&gt;
&lt;p&gt;&lt;img src="/images/stickynotes-whiteboard.png" alt="Sticky notes whiteboard" style="width: 400px;"/&gt;&lt;/p&gt;
&lt;p&gt;The next step is to decide how much work to allocate to each project, and who works on each. I will skip the “who” and figuring out dependencies here.&lt;/p&gt;
&lt;h2&gt;Planning poker&lt;/h2&gt;
&lt;p&gt;The point of &lt;a href="https://en.wikipedia.org/wiki/Planning_poker"&gt;planning poker&lt;/a&gt; is to make sure everybody who is involved in a project can give their manday estimate, without being influenced by others who speak before them. The easiest way to do it is to have people write their estimate on a sticky note, and then share it. Super-high and super-low estimates should be explained. The goal is to arrive at a realistic estimate. “Realistic” is not the same as “consensus”, eg. you should not take an average or median of estimates. People who are better at planning (=usually more senior people) need to have more weight, while people who are gaining experience should use the exercise to learn how to do a better job estimating.&lt;/p&gt;
&lt;h2&gt;Mandays&lt;/h2&gt;
&lt;p&gt;Be careful about mandays versus calendar days when estimating projects: &lt;strong&gt;Pure mandays&lt;/strong&gt; is how much time you’d spend on something if you were to work uninterrupted. &lt;strong&gt;Real mandays&lt;/strong&gt; is how much time you will spend on it in real life, since you’re less efficient because of interrupts (but not counting time spent in interrupts and meetings). &lt;strong&gt;Calendar days&lt;/strong&gt; (calendar weeks) is how much time will elapse between start and end, also counting interrupt time, meetings and weekends. For example, if your compressed meeting load is 1 day per week, 1 day per week is interrupts, and a task is estimated to take 4-5 pure mandays, then it will probably take 6 real mandays, and overall it will consume 2 calendar weeks (1+1+3+weekend+1+1+3+weekend).&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Different people think in different quantities when planning (pure mandays, real mandays, calendar days). Make sure you’re talking about the same thing.&lt;/strong&gt;&lt;/p&gt;
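The conversion can be sketched as a small calculation. The 1.2 pure-to-real overhead factor below is an assumption chosen to match the example above, not a universal constant:

```python
import math

def calendar_weeks(pure_mandays, interrupt_days_per_week=1,
                   meeting_days_per_week=1, overhead=1.2):
    # Pure mandays: uninterrupted work. Real mandays: pure mandays plus
    # context-switching overhead (the 1.2 factor is an assumption).
    real_mandays = pure_mandays * overhead
    # Focus days actually available per 5-day week.
    focus_days = 5 - interrupt_days_per_week - meeting_days_per_week
    return math.ceil(real_mandays / focus_days)

# The example from the text: ~5 pure mandays -> 6 real mandays ->
# 2 calendar weeks at 3 focus days per week.
print(calendar_weeks(5))  # 2
```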
&lt;h2&gt;Coarse-graining&lt;/h2&gt;
&lt;p&gt;I prefer to keep the plan coarse-grained and think in terms of bigger projects/tasks that are at least 1-2 weeks in size. Trying to plan ahead at finer granularity is unrealistic. Good and motivated people will be able to manage the sub-tasks within coarse-grained projects during the quarter on their own (juniors with regular help).&lt;/p&gt;
&lt;h2&gt;Timing&lt;/h2&gt;
&lt;p&gt;At this point we know how much time each person has available, the list of projects, and who works on what, so we can put it all into the table, taking into account dependencies. This gives our estimated completion times. In my experience, for software, month granularity is the best that can be taken seriously (“We will start working on this in Nov, we will ship sometime in Dec”), and often only quarter granularity is dependable ("We will ship in Q4").&lt;/p&gt;
&lt;h2&gt;Projects and goals&lt;/h2&gt;
&lt;p&gt;Planning and goaling is a back-and-forth. If the plans and goals are too far off, one or both has to be adjusted; maybe the goal was too aggressive and needs to come down, or the team needs to find additional time this quarter and aggressively say no to interrupts and cancel meetings, or both.&lt;/p&gt;
&lt;h2&gt;Jiggle&lt;/h2&gt;
&lt;p&gt;I like to give space/time to reflect and adjust. After the initial meeting, we take a few days to think about it, and schedule a second, finalizing meeting, where we have an opportunity to make adjustments, add in stuff we missed.&lt;/p&gt;
&lt;h2&gt;Meeting minutes&lt;/h2&gt;
&lt;p&gt;The techlead or manager should take a picture of the table, enter it into a spreadsheet, and share it with the team. Then, if the team also uses software such as &lt;a href="https://trello.com"&gt;Trello&lt;/a&gt; or &lt;a href="https://jira.atlassian.com"&gt;Jira&lt;/a&gt;, enter it there.&lt;/p&gt;</content><category term="planning"></category><category term="teams"></category><category term="goaling"></category></entry><entry><title>Personal goaling</title><link href="https://bytepawn.com/personal-goaling.html" rel="alternate"></link><published>2019-12-22T00:00:00+01:00</published><updated>2019-12-22T00:00:00+01:00</updated><author><name>Marton Trencseni</name></author><id>tag:bytepawn.com,2019-12-22:/personal-goaling.html</id><summary type="html">&lt;p&gt;The meta-goal of goaling is to stretch yourself to achieve more, and to feel good about what you’ve achieved. Whatever happened this year, it’s always possible to achieve a lot more and feel better about yourself next year. To hijack a Feynman quote, &lt;em&gt;there is plenty of room at the top&lt;/em&gt;.&lt;br/&gt;&lt;br/&gt;&lt;img src="/images/run2019_1.PNG" alt="2019 running" style="width: 400px;"/&gt;&lt;/p&gt;</summary><content type="html">&lt;p&gt;I’ve been using a system of setting and tracking personal goals for the last 3 years. For me, this way has worked out well, so I’m sharing it here. This is not meant to be prescriptive (“do this”), it’s descriptive (“this worked for me”). Here is my &lt;a href="https://docs.google.com/document/d/1J_eay8axCVOTZ9gDk8wLiI3cdI5Qx4r_u318YTOsFg8/edit?usp=sharing"&gt;2019 master goaling doc&lt;/a&gt;, shown inline at the bottom of the page.&lt;/p&gt;
&lt;h2&gt;The meta-goal of goaling&lt;/h2&gt;
&lt;p&gt;The meta-goal of goaling is to stretch yourself to achieve more, and to feel good about what you’ve achieved. Whatever happened this year, it’s always possible to achieve a lot more and feel better about yourself next year. To hijack a Feynman quote, &lt;em&gt;there is plenty of room at the top&lt;/em&gt;.&lt;/p&gt;
&lt;h2&gt;More is better&lt;/h2&gt;
&lt;p&gt;I find that having lots of goals is fun. Life has multiple arenas: family, fitness, savings, projects, work, traveling, reading, writing, etc. I try to write down multiple goals for each arena, usually about 5 each. In the end, my complete list of goals is usually 2 pages of bullets and sub-bullets in a Google Doc.&lt;/p&gt;
&lt;h2&gt;50/50 goaling&lt;/h2&gt;
&lt;p&gt;This is something I learned while working at Facebook. There the philosophy was that goaling should be aggressive; on average you should hit 50% of your goals (or, 50% of teams should hit their goals). When I set my goals at the beginning of the year, I usually think “I’d be happy to hit 50% of these”, but then I usually end up hitting a bit more, like 60-70%. This always reminds me that there is so much more room to stretch and do more. Very few people are near their capacity, I am definitely not.&lt;/p&gt;
&lt;h2&gt;Use colors&lt;/h2&gt;
&lt;p&gt;I use a Google Doc to write down the goals as bullet points (like “run 1000km”), and separate Google Sheets to track my daily progress towards each goal. I use red, green and orange colors: green when I accomplished what I needed to that day to hit the goal, or when I hit the goal; orange for a missed day; and red when the verdict is in and I missed the goal. These colors carry strong psychological connotations and put pressure---the good kind of pressure---on me to hit my goals. Looking at green is pleasant, green means good.&lt;/p&gt;
&lt;p&gt;&lt;img src="/images/track2019.png" alt="2019 goals" style="width: 400px;"/&gt;&lt;/p&gt;
&lt;h2&gt;Visualize numeric goals on charts&lt;/h2&gt;
&lt;p&gt;For numeric goals (for me, this is fitness related goals) I use a Google Spreadsheet. The master tab has a daily log of activities. One row is one calendar day; if I didn’t do anything that day, it’s orange. If I did, it’s green. Separate tabs show various line charts, showing my progress as I track toward my yearly goal. Charts are another type of visualization that have a strong psychological effect on me in terms of motivating me to get the line to move up and to the right.&lt;/p&gt;
&lt;p&gt;&lt;img src="/images/run2019_1.PNG" alt="2019 goals" style="width: 400px;"/&gt;&lt;/p&gt;
&lt;h2&gt;Show the goal line&lt;/h2&gt;
&lt;p&gt;For goals such as "run 1000km", where I continuously make progress, I break it into a daily goal (1000/365 km per day), and show that goal line on the chart in green. This is another strong visual cue that motivates me, I have to follow the goal line.&lt;/p&gt;
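The goal line is just a linear ramp; a sketch for the "run 1000km" example (the numbers are the ones from the text):

```python
# Daily goal line for a cumulative yearly goal: the target for day d
# is simply d * (yearly goal / 365).
YEARLY_GOAL_KM = 1000.0
DAYS = 365

goal_line = [round(day * YEARLY_GOAL_KM / DAYS, 1) for day in range(1, DAYS + 1)]
print(goal_line[0])   # ~2.7 km expected by the end of day 1
print(goal_line[-1])  # 1000.0 km by Dec 31
```

Plotting this series in green next to the actual cumulative kilometers is what produces the motivating "follow the line" effect.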
&lt;h2&gt;Look at the goals daily&lt;/h2&gt;
&lt;p&gt;Goaling only works if you care about the goals. The best way to do that is to look at the goals all the time. I have 3 tabs (one master Google Doc, one Google Spreadsheet for fitness and one Google Spreadsheet for financials) pinned open in Chrome, and look at them daily. I always know how I’m tracking towards my goals.&lt;/p&gt;
&lt;h2&gt;Set cumulative goals&lt;/h2&gt;
&lt;p&gt;I find setting cumulative goals like “run 1000km in 2019” works much better for me than weekly goals like “run 20km per week in 2019”. The problem with the weekly version is: what happens if you miss your goal one week? Have you already missed the yearly goal? How many weeks are you allowed to miss? Setting a cumulative goal like “run 1000km in 2019” is great, because it becomes a game: if I don’t go running one week, no problem, but I’ll have to make it up in the following weeks. And knowing that making up is tough, I don’t miss my weekly targets unless there’s a good reason (like illness).&lt;/p&gt;
&lt;h2&gt;Set S and L goals&lt;/h2&gt;
&lt;p&gt;Sometimes it’s not clear what the right goal is. Should the target be 500km or 1,000km of running? When I don’t know what the realistic goal is, I set an S and an L goal. Some time down the line I will see if the S goal is too easy or if the L goal is too hard, and focus on hitting the right one. If I decide to hit the S goal, I keep the L goal on the list (marked in red); it’ll be a good stretch goal for next year.&lt;/p&gt;
&lt;h2&gt;Set realistic goals&lt;/h2&gt;
&lt;p&gt;If you set unrealistic goals, you will know deep down that you will never hit your goals, or it is very unlikely. If you’re unsure what’s realistic, set S and L goals.&lt;/p&gt;
&lt;h2&gt;Break your goals into H1 and H2 goals&lt;/h2&gt;
&lt;p&gt;If your goal is to run 1,000km in a year, break it down into a 500km goal for H1 (Half 1), and an overall goal of 1,000km for H2. This way, when you hit the 500 km sub-goal in June, you can green it out and feel good.&lt;/p&gt;
&lt;h2&gt;Renegotiate&lt;/h2&gt;
&lt;p&gt;These are personal goals. The meta-goal of goaling is to stretch yourself to do more, and feel good about it along the way. But there's no point in feeling bad about not hitting an unrealistic goal. If you’re some months into the year and you realize that a goal was too ambitious, it’s okay to re-label it as an L goal and introduce an S goal. Also, sometimes circumstances change (new job, new city, family), and goals become unrealistic; in that case, it’s okay to renegotiate the goal with yourself, and introduce a new, different, better-fitting goal.&lt;/p&gt;
&lt;h2&gt;Use leading metrics&lt;/h2&gt;
&lt;p&gt;There are leading metrics and lagging metrics. A leading metric is about something actionable today, like going running or cutting calorie intake. A lagging metric is the outcome of the leading metric, like “80kg body mass”. I set my goals as leading metrics, because they can be translated to something I will do daily, and I can keep track of my gradual progress. Also, I find goals like “lose 10kg” poor, because what happens once I lose 10kg? Suppose I lose it by September? Am I done with my yearly goal? If I just gain it back, that’s a fleeting accomplishment. I find it better to set leading metrics goals, which can be translated to daily actions and weekly cadence, ie. routine. It’s easier to maintain your weight if you’re used to running and working out a lot and controlling your calorie intake, and if you feel bad if you miss a day.&lt;/p&gt;
&lt;h2&gt;Should be up to you&lt;/h2&gt;
&lt;p&gt;The goals should be up to you, so "get promoted" is not a good personal goal, because it's not up to you: the company could have a bad year, you could get a new manager, etc. Instead, figure out the best things you can do to get promoted ("land 4 big features", "mentor 2 juniors", "present at 2 conferences", "organize an offsite") and write those down. A good goal is one where, if you didn't hit it, you feel bad and can blame it on yourself for not trying hard enough.&lt;/p&gt;
&lt;h2&gt;Track secondary metrics&lt;/h2&gt;
&lt;p&gt;Secondary metrics are, for example, "% active days" for your activities, or the location split of your running kilometers. Since you're tracking your metrics anyway, these are easy to split and visualize and can give you interesting insights ("wow, I do most of my running indoors").&lt;/p&gt;
&lt;p&gt;&lt;img src="/images/runloc2019.png" alt="2019 running" style="width: 400px;"/&gt;&lt;/p&gt;
&lt;h2&gt;External commitments&lt;/h2&gt;
&lt;p&gt;External commitments, created on purpose by you, can be a great tool for hitting your goals. Simply go around and tell everybody what your goal is. For example, in 2008 I decided to do an Ironman. I gave myself a year to prepare, and in that year I told everybody I met I'm going to do an Ironman. There was no way I could go back on it, I told everybody!&lt;/p&gt;
&lt;h2&gt;Dopamine hits&lt;/h2&gt;
&lt;p&gt;Using daily tracking, colors and cumulative goals (eg. running kilometers accumulate like gold in an RTS game) already gives you a dopamine hit every time you do something and update your tracking. You can get an additional hit if you post about it on Facebook. I think there is social good in this, because it may motivate others to also go out and do something.&lt;/p&gt;
&lt;h2&gt;Sample goals&lt;/h2&gt;
&lt;p&gt;&lt;a href="https://docs.google.com/document/d/1J_eay8axCVOTZ9gDk8wLiI3cdI5Qx4r_u318YTOsFg8/edit?usp=sharing"&gt;My 2019 master goaling doc&lt;/a&gt;, after redactions:&lt;/p&gt;
&lt;iframe src="https://docs.google.com/document/d/e/2PACX-1vQn49UZbLT-v88d6qu1ojz1dsz2bfhAW0C0DdZDIKBmYHMl1ApycI9a4_3McXCr4MfnBsQ4ZyzYArmu/pub?embedded=true" style="width: 1000px; height: 800px;"&gt;&lt;/iframe&gt;</content><category term="self help"></category><category term="goaling"></category></entry><entry><title>Pytorch in 2019</title><link href="https://bytepawn.com/pytorch-in-2019.html" rel="alternate"></link><published>2019-12-12T00:00:00+01:00</published><updated>2019-12-12T00:00:00+01:00</updated><author><name>Marton Trencseni</name></author><id>tag:bytepawn.com,2019-12-12:/pytorch-in-2019.html</id><summary type="html">&lt;p&gt;2019 was another big year for Pytorch, one of the most popular Deep Learning libraries out there. Pytorch has become the de facto deep learning library used for research thanks to its dynamic graph model which allows fast model experimentation. It’s also become production ready, with support for mobile and infrastructure tooling such as Tensorboard.&lt;br/&gt;&lt;br/&gt;&lt;img src="/images/pytorch-gt.png" alt="Pytorch Google Trends 2019" style="width: 400px;"/&gt;&lt;/p&gt;</summary><content type="html">&lt;p&gt;2019 was another big year for Pytorch, one of the most popular Deep Learning libraries out there. Pytorch has become the de facto deep learning library used for research thanks to its dynamic graph model which allows fast model experimentation. It’s also become production ready, with support for mobile and infrastructure tooling such as Tensorboard. Google Trends shows a slight increasing trend as measured by searches for Pytorch in 2019, although Pytorch is still behind Tensorflow and Keras in overall popularity:&lt;/p&gt;
&lt;p&gt;&lt;img src="/images/pytorch-gt.png" alt="Pytorch Google Trends 2019" style="width: 600px;"/&gt;&lt;/p&gt;
&lt;p&gt;Pytorch version 1.0 was released in late December, 2018. Since then, we’ve seen numerous minor releases, and it’s already up to version 1.3.1. Significant new features include:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;&lt;strong&gt;Tensorboard support:&lt;/strong&gt; visualization and tooling needed for ML experimentation, for example tracking metrics such as loss and accuracy.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Mobile support:&lt;/strong&gt; run any TorchScript model directly on Android/iOS, without conversion.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Extended ONNX support:&lt;/strong&gt; ONNX is an open format to represent deep learning models. With ONNX, AI developers can more easily move models between state-of-the-art tools and choose the combination that is best for them.&lt;/li&gt;
&lt;li&gt;&lt;a href="https://github.com/pytorch/pytorch/releases"&gt;and 100s of other improvements...&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;A number of Pytorch books came out in 2019 to help programmers and data scientists get started. There is &lt;a href="https://www.amazon.com/Programming-PyTorch-Deep-Learning-Applications/dp/1492045357"&gt;Programming PyTorch for Deep Learning&lt;/a&gt; by Ian Pointer, and &lt;a href="https://www.amazon.com/Natural-Language-Processing-PyTorch-Applications/dp/1491978236"&gt;Natural Language Processing with PyTorch&lt;/a&gt; by Delip Rao, both published by O’Reilly. Coming out soon is &lt;a href="https://www.amazon.com/Deep-Learning-PyTorch-Eli-Stevens/dp/1617295264"&gt;Deep Learning with PyTorch&lt;/a&gt; by Eli Stevens (Manning).&lt;/p&gt;
&lt;p&gt;Thanks to the community, milestone Deep Learning results have implementations in Pytorch:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;&lt;a href="https://github.com/suragnair/alpha-zero-general"&gt;AlphaGo Zero (Reinforcement Learning) in Pytorch&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://github.com/tingkai-zhang/openai-gpt-pytorch"&gt;GPT natural language model in Pytorch&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://github.com/eriklindernoren/PyTorch-GAN"&gt;GAN models&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://github.com/NVIDIA/FastPhotoStyle"&gt;NVIDIA style transfer model&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://pytorch.org/docs/stable/torchvision/models.html"&gt;Standard vision models&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;In 2019 there was another Pytorch developer conference in San Francisco, where &lt;strong&gt;&lt;a href="https://www.youtube.com/watch?v=oBklltKXtDE"&gt;Tesla’s Andrej Karpathy shared how the automaker uses Pytorch&lt;/a&gt;&lt;/strong&gt; to power their deep learning based self-driving architecture that relies solely on cameras.&lt;/p&gt;
&lt;p&gt;In Europe, the best place to learn about Pytorch in 2019 was at the &lt;a href="http://reinforceconf.com/"&gt;Reinforce Conference&lt;/a&gt; in Budapest! We had &lt;a href="https://www.youtube.com/watch?v=-pRBzh_hGqA"&gt;Stefan Otte doing a great tutorial style presentation&lt;/a&gt;, voted one of the best talks of the conference. &lt;strong&gt;Reinforce is returning in 2020&lt;/strong&gt;, as is Stefan Otte, speaking about Pytorch once again. &lt;strong&gt;Join us April 6-7 for the conference days, and optionally April 8 for workshops, in Budapest again!&lt;/strong&gt;&lt;/p&gt;
&lt;p&gt;&lt;img src="/images/reinforce.png" alt="Reinforce" style="width: 600px;"/&gt;&lt;/p&gt;</content><category term="pytorch"></category></entry><entry><title>Warren Buffett style fundamental metrics of long-term company performance</title><link href="https://bytepawn.com/warren-buffett-style-fundamental-metrics-of-long-term-company-performance.html" rel="alternate"></link><published>2019-12-02T00:00:00+01:00</published><updated>2019-12-02T00:00:00+01:00</updated><author><name>Marton Trencseni</name></author><id>tag:bytepawn.com,2019-12-02:/warren-buffett-style-fundamental-metrics-of-long-term-company-performance.html</id><summary type="html">&lt;p&gt;I look at some fundamental charts of Apple, Activision Blizzard and Intel.&lt;br/&gt;&lt;br/&gt;&lt;img src="/images/aapl_shareholder_wealth.PNG" alt="AAPL shareholder wealth curve" style="width: 400px;"/&gt;&lt;/p&gt;</summary><content type="html">&lt;h2&gt;Introduction&lt;/h2&gt;
&lt;p&gt;Over the years I’ve read a lot of books on investing, especially Warren Buffett-style value investing. I also read Warren Buffett’s yearly investor report. Some relevant links:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;&lt;a href="https://www.amazon.com/Intelligent-Investor-Definitive-Investing-Essentials/dp/0060555661"&gt;Benjamin Graham - The Intelligent Investor&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://www.amazon.com/Berkshire-Hathaway-Letters-Shareholders-Buffett/dp/0615975070"&gt;Berkshire Hathaway Letters to Shareholders&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://www.amazon.com/Security-Analysis-Foreword-Buffett-Editions/dp/0071592539"&gt;Graham, Dodd - Security Analysis&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://www.amazon.com/Warren-Buffett-Stock-Portfolio-Investing/dp/1451606486"&gt;Mary Buffett (ex-wife of WB’s son) - The Warren Buffett Stock Portfolio&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="http://www.stockpup.com"&gt;Stockpup&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://warrenbuffettstockportfolio.com"&gt;Warren Buffett : Latest Portfolio&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;Recently I found some great online resources which give this style of analysis (called fundamental analysis) of companies, and I ended up spending hours browsing company metrics. So here I will show (from these sources) fundamental metrics of publicly traded companies. The point of the exercise is to get a feeling for what these metrics look like for companies in different industries, at different stages of their lifecycle. Most of the charts shown here are from &lt;a href="http://www.stockpup.com"&gt;Stockpup&lt;/a&gt;.&lt;/p&gt;
&lt;h2&gt;Apple (AAPL)&lt;/h2&gt;
&lt;p&gt;Let’s start with Apple because Warren Buffett holds AAPL stock himself.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Cumulative shareholder wealth.&lt;/strong&gt; This is an interesting chart: it shows, per quarter, the book value of the company at that point in time (green), and the cumulative dividends that the company paid out to shareholders up to that point in time (blue). The chart is structured this way because it answers the question: if you bought 1 share of AAPL stock in 1994, how much “sure money” is that 1 share worth today? The cumulative dividends are “sure money”, since it’s cash that Apple has given you over the years. And book value is the closest accounting equivalent: it’s total assets minus total liabilities. Total assets can include intangibles like “brand value”, if they were the result of a purchase. One way to think of book value: this is how much would be left if the company sold off what it has (including intangibles, with a reasonable assumption of marketability) and paid back its liabilities. Note that this chart has nothing to do with the market price we would have to pay for the share on the open market, in 1994 or at a later time. Notice how Apple started paying dividends in 2012.&lt;/p&gt;
&lt;p&gt;&lt;img src="/images/aapl_shareholder_wealth.PNG" alt="AAPL shareholder wealth curve" style="width: 650px;"/&gt;&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Earnings per share.&lt;/strong&gt; This shows the amount of profit the company made per quarter. Clearly Apple’s earnings have exploded over the last 20 years, and show a nice cyclical pattern (Q4 = christmas). Note that, when a company earns profit, it can re-invest it, keep the money as cash or mostly liquid investment, or pay it out as dividends. In both the latter two cases it would show up as a positive contribution on the cumulative shareholder wealth chart.&lt;/p&gt;
&lt;p&gt;&lt;img src="/images/aapl_eps.PNG" alt="AAPL EPS" style="width: 650px;"/&gt;&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Price to book ratio.&lt;/strong&gt; This is the chart that incorporates the market price of the stock. It’s the ratio of the market price of AAPL (as traded on NASDAQ) and the book value, as shown on the first chart. The P/B ratio picks up in 2005-2006 (iPods, new Macbook Pros) and peaks around 2008 (after the iPhone release).&lt;/p&gt;
&lt;p&gt;&lt;img src="/images/aapl_pb.PNG" alt="AAPL P/B" style="width: 650px;"/&gt;&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Price to earnings ratio.&lt;/strong&gt; It’s fun to see how before 2005 or so, when earnings were low, this ratio is literally off-the-charts.&lt;/p&gt;
&lt;p&gt;&lt;img src="/images/aapl_pe.PNG" alt="AAPL P/E" style="width: 650px;"/&gt;&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Stock performance.&lt;/strong&gt; And finally, the long-term stock performance chart:&lt;/p&gt;
&lt;p&gt;&lt;img src="/images/aapl_stock.PNG" alt="AAPL stock performance" style="width: 400px;"/&gt;&lt;/p&gt;
&lt;h2&gt;Activision Blizzard (ATVI)&lt;/h2&gt;
&lt;p&gt;What does one of the biggest gaming companies, with franchises such as Starcraft, World of Warcraft, Diablo, Overwatch, Call of Duty and Candy Crush look like? Book value has gone up steadily; the big jump in 2008 is when Activision and Blizzard merged.&lt;/p&gt;
&lt;p&gt;&lt;img src="/images/atvi_shareholder_wealth.PNG" alt="ATVI shareholder wealth curve" style="width: 650px;"/&gt;&lt;/p&gt;
&lt;p&gt;Earnings per share is increasing (the big 2017Q4 loss was a one-time tax hit):&lt;/p&gt;
&lt;p&gt;&lt;img src="/images/atvi_eps.PNG" alt="AAPL EPS" style="width: 650px;"/&gt;&lt;/p&gt;
&lt;p&gt;P/B ratio is quite low, P/E ratio is erratic because earnings vary quarter-to-quarter, but generally high:&lt;/p&gt;
&lt;p&gt;&lt;img src="/images/atvi_px.PNG" alt="AAPL P/X" style="width: 650px;"/&gt;&lt;/p&gt;
&lt;p&gt;&lt;img src="/images/atvi_stock.PNG" alt="AAPL stock performance" style="width: 400px;"/&gt;&lt;/p&gt;
&lt;h2&gt;Intel (INTC)&lt;/h2&gt;
&lt;p&gt;Intel is incredibly consistent, despite competition from AMD and recently Apple:&lt;/p&gt;
&lt;p&gt;&lt;img src="/images/intc_shareholder_wealth.PNG" alt="INTC shareholder wealth curve" style="width: 650px;"/&gt;&lt;/p&gt;
&lt;p&gt;The company took the same one-time tax hit in 2017Q4, but otherwise EPS shows a slow but steady increase:&lt;/p&gt;
&lt;p&gt;&lt;img src="/images/intc_eps.PNG" alt="INTC EPS" style="width: 650px;"/&gt;&lt;/p&gt;
&lt;p&gt;The P/X ratios seem pretty reasonable for a tech company:&lt;/p&gt;
&lt;p&gt;&lt;img src="/images/intc_px.PNG" alt="AAPL P/X" style="width: 650px;"/&gt;&lt;/p&gt;
&lt;p&gt;&lt;img src="/images/intc_stock.PNG" alt="INTC stock performance" style="width: 400px;"/&gt;&lt;/p&gt;
&lt;h2&gt;Other interesting companies&lt;/h2&gt;
&lt;ul&gt;
&lt;li&gt;&lt;a href="http://www.stockpup.com/companies/AMZN/"&gt;Amazon&lt;/a&gt; - very high P/X ratios&lt;/li&gt;
&lt;li&gt;&lt;a href="http://www.stockpup.com/companies/MSFT/"&gt;Microsoft&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="http://www.stockpup.com/companies/FB/"&gt;Facebook&lt;/a&gt; - I worked here&lt;/li&gt;
&lt;li&gt;&lt;a href="http://www.stockpup.com/companies/GOOG/"&gt;Google&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="http://www.stockpup.com/companies/NVDA/"&gt;NVIDIA&lt;/a&gt; - exploded recently thanks to GPU usage in Deep Learning&lt;/li&gt;
&lt;li&gt;&lt;a href="http://www.stockpup.com/companies/FDX/"&gt;Fedex&lt;/a&gt; - a logistics/delivery company like Fetchr&lt;/li&gt;
&lt;li&gt;&lt;a href="http://www.stockpup.com/companies/MCO/"&gt;Moody’s&lt;/a&gt; - one of Warren Buffett’s investments, also detailed in the book, they are essentially a services company (ratings), so they barely have any real assets, resulting in “weird” fundamental analysis charts&lt;/li&gt;
&lt;li&gt;&lt;a href="http://www.stockpup.com/companies/KO/"&gt;The Coca-Cola Company&lt;/a&gt; - one of Warren Buffett’s oldest investments, also detailed in the book&lt;/li&gt;
&lt;li&gt;&lt;a href="http://www.stockpup.com/companies/WPI/"&gt;Allergan&lt;/a&gt; - my father worked here for 10 years&lt;/li&gt;
&lt;li&gt;&lt;a href="http://www.stockpup.com/companies/EK/"&gt;Eastman Kodak&lt;/a&gt; - went bankrupt thanks to mobile cameras, then came back&lt;/li&gt;
&lt;/ul&gt;</content><category term="investing"></category><category term="stocks"></category><category term="warren buffett"></category></entry><entry><title>Calibration curves for delivery prediction with Scikit-Learn</title><link href="https://bytepawn.com/calibration-curves-for-delivery-prediction-with-scikit-learn.html" rel="alternate"></link><published>2019-11-21T00:00:00+01:00</published><updated>2019-11-21T00:00:00+01:00</updated><author><name>Marton Trencseni</name></author><id>tag:bytepawn.com,2019-11-21:/calibration-curves-for-delivery-prediction-with-scikit-learn.html</id><summary type="html">&lt;p&gt;I show calibration curves for four different binary classification Scikit-Learn models we built for delivery prediction at Fetchr, trained using real-world data: &lt;code&gt;LogisticRegression&lt;/code&gt;, &lt;code&gt;DecisionTree&lt;/code&gt;, &lt;code&gt;RandomForest&lt;/code&gt; and &lt;code&gt;GradientBoosting&lt;/code&gt;. &lt;br/&gt;&lt;br/&gt;&lt;img src="/images/logisticregression-decile.png" alt="Logistic regression calibration curve" style="width: 400px;"/&gt;&lt;/p&gt;</summary><content type="html">&lt;h2&gt;Introduction&lt;/h2&gt;
&lt;p&gt;In a previous post about &lt;a href="http://bytepawn.com/machine-learning-at-fetchr.html"&gt;Machine Learning at Fetchr&lt;/a&gt;, I mentioned several families of models we have in production. The latest is &lt;strong&gt;Operational Choice&lt;/strong&gt;, which we use for delivery prediction. The idea is simple: we have a large number of features (essentially columns in our data warehouse) available for our historic dispatches:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;sender's information&lt;/li&gt;
&lt;li&gt;recipient’s information (address, etc.)&lt;/li&gt;
&lt;li&gt;recipient’s historic information&lt;/li&gt;
&lt;li&gt;geography&lt;/li&gt;
&lt;li&gt;scheduling channel&lt;/li&gt;
&lt;li&gt;timing&lt;/li&gt;
&lt;li&gt;etc.&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;For each dispatch, we know whether it was successfully delivered or not (True/False). Given our historic data, we can build a binary classifier which predicts which orders will be delivered (or not) tomorrow, out of all orders scheduled for dispatch. After one-hot encoding, our feature vector length is in the 1000s, and we can achieve 90%+ accuracy with out-of-the-box &lt;a href="https://scikit-learn.org/"&gt;Scikit-Learn&lt;/a&gt; models. In other words, perhaps not too surprisingly, it is possible to predict the chances of delivery success quite well.&lt;/p&gt;
&lt;p&gt;When using this in production, we don’t primarily look at the absolute value of the delivery probability itself. What we care about is the relative ordering: out of 1,000 orders, which are the least likely to be delivered successfully tomorrow? Operational Choice is about treating these orders differently. So while in standard ML classification tasks usually the most important metric is &lt;a href="https://scikit-learn.org/stable/modules/generated/sklearn.metrics.accuracy_score.html"&gt;accuracy&lt;/a&gt; (assuming a balanced dataset), ie. the ratio of test data that is predicted correctly by the predictor, here we also care about calibration: that the relationship between predicted and actual delivery probability should be monotonic, and as close to the &lt;code&gt;x=y&lt;/code&gt; line as possible.&lt;/p&gt;
&lt;p&gt;As a reminder, the way the binary (delivered or not) predictor models discussed here work is that given a feature vector, they return a probability of delivery, like 0.67 (SKL’s &lt;code&gt;model.predict_proba()&lt;/code&gt; function). If we want to get a True/False prediction, we cut the probability at 0.5, so for 0.67 we would predict True (SKL’s &lt;code&gt;model.predict()&lt;/code&gt; function). Accuracy is the ratio of test data (historic dispatches) where the True/False prediction matches the actual True/False historic delivery outcome. To get the calibration curve, we need to convert the True/False historic ground truth to probabilities, so we need to bucket the data and count the ratio of successful deliveries in each bucket.&lt;/p&gt;
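As a rough sketch of this bucketing (a hypothetical helper, not the notebook's actual code): group the test set by predicted probability and compare each bucket's mean prediction to its actual delivery ratio.

```python
import numpy as np

def calibration_points(y_true, y_prob, num_buckets=10):
    """For each probability bucket, return (mean predicted probability,
    actual ratio of successful deliveries, number of points).
    Hypothetical helper illustrating the bucketing described above."""
    edges = np.linspace(0.0, 1.0, num_buckets + 1)
    points = []
    for lo, hi in zip(edges[:-1], edges[1:]):
        mask = (y_prob >= lo) & (y_prob < hi)
        if mask.sum() == 0:
            continue  # empty bucket, nothing to plot
        points.append((y_prob[mask].mean(), y_true[mask].mean(), int(mask.sum())))
    return points
```

A well-calibrated model produces points close to the `x=y` line; plotting the counts alongside shows how unbalanced the fixed-width buckets are.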
&lt;p&gt;Below I show the calibration results for 4 Scikit-Learn models:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;&lt;a href="https://scikit-learn.org/stable/modules/generated/sklearn.linear_model.LogisticRegression.html"&gt;LogisticRegression&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://scikit-learn.org/stable/modules/generated/sklearn.tree.DecisionTreeClassifier.html"&gt;DecisionTree&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://scikit-learn.org/stable/modules/generated/sklearn.ensemble.RandomForestClassifier.html"&gt;RandomForest&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://scikit-learn.org/stable/modules/generated/sklearn.ensemble.GradientBoostingClassifier.html"&gt;GradientBoosting&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;The first, left chart shows the predicted probability on the x-axis, by deciles, as a bar chart; so the first bar is test data points where the model predicted between 0-10% delivery probability, and so on. The y-axis is the ratio of test data in the bucket that was actually delivered (ratio of Trues). The right chart shows the number of data points in each decile; since the deciles are fixed, the counts are unbalanced, which leads to imbalanced statistics, ie. the error varies between bars. The lower, third chart shows the same thing, but with equal bucket sizes (10 buckets in total).&lt;/p&gt;
&lt;p&gt;To get these results, I used 100,000 randomly chosen training points from our real delivery data and 100,000 test points. Both sets were randomly chosen, so the test distribution matches the training distribution. Both are balanced 50-50 between successful and unsuccessful deliveries.&lt;/p&gt;
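The comparison below sketches this setup with Scikit-Learn on synthetic data: the delivery features are proprietary, so `make_classification` stands in for them, and the hyperparameters mirror the models discussed below.

```python
import time
from sklearn.datasets import make_classification
from sklearn.ensemble import GradientBoostingClassifier, RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

# Synthetic stand-in for the one-hot encoded, balanced delivery dataset.
X, y = make_classification(n_samples=2000, n_features=50, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

models = {
    'LogisticRegression': LogisticRegression(max_iter=1000),
    'DecisionTree':       DecisionTreeClassifier(max_depth=10),
    'RandomForest':       RandomForestClassifier(max_depth=10),
    'GradientBoosting':   GradientBoostingClassifier(max_depth=10),
}
for name, model in models.items():
    t0 = time.time()
    model.fit(X_train, y_train)
    acc = accuracy_score(y_test, model.predict(X_test))
    print('%s: accuracy=%.3f, trained in %.1fs' % (name, acc, time.time() - t0))
```

The accuracy and timing numbers below come from the real dataset, not this toy setup, but the relative ordering of training times is already visible on synthetic data.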
&lt;h2&gt;LogisticRegression&lt;/h2&gt;
&lt;p&gt;&lt;code&gt;LogisticRegression&lt;/code&gt; is the simplest model; it takes 4 seconds to train and has an accuracy of 87.9% on the balanced dataset. Both the unbalanced and balanced calibration curves look very good.&lt;/p&gt;
&lt;p&gt;&lt;img src="/images/logisticregression-decile.png" alt="Logistic regression deciles" style="width: 650px;"/&gt;&lt;/p&gt;
&lt;p&gt;&lt;img src="/images/logisticregression-balanced.png" alt="Logistic regression balanced" style="width: 400px;"/&gt;&lt;/p&gt;
&lt;h2&gt;DecisionTree(max_depth=10)&lt;/h2&gt;
&lt;p&gt;The &lt;code&gt;DecisionTree&lt;/code&gt; model, after 13 seconds of training, has an accuracy of 90.1%. The decile calibration curve is beautiful, although the deciles are very unbalanced, so this could be misleading. The balanced calibration curve has an inversion between the 7th and 8th buckets.&lt;/p&gt;
&lt;p&gt;&lt;img src="/images/decisiontree-decile.png" alt="Decision tree deciles" style="width: 650px;"/&gt;&lt;/p&gt;
&lt;p&gt;&lt;img src="/images/decisiontree-balanced.png" alt="Decision tree balanced" style="width: 400px;"/&gt;&lt;/p&gt;
&lt;h2&gt;RandomForest(max_depth=10)&lt;/h2&gt;
&lt;p&gt;The &lt;code&gt;RandomForest&lt;/code&gt; model, after 7 seconds of training, has an accuracy of 87.5%. The decile calibration curve is more like a sigmoid, and it’s interesting that the decile counts are skewed towards the middle. The balanced calibration curve has an inversion between the 7th and 8th buckets, like the &lt;code&gt;DecisionTree&lt;/code&gt;.&lt;/p&gt;
&lt;p&gt;&lt;img src="/images/randomforest-decile.png" alt="Random forest deciles" style="width: 650px;"/&gt;&lt;/p&gt;
&lt;p&gt;&lt;img src="/images/randomforest-balanced.png" alt="Random forest balanced" style="width: 400px;"/&gt;&lt;/p&gt;
&lt;h2&gt;GradientBoosting(max_depth=10)&lt;/h2&gt;
&lt;p&gt;The &lt;code&gt;GradientBoosting&lt;/code&gt; model, after 1,679 seconds of training (!), has an accuracy of 91.1%. Both the balanced and unbalanced calibration curves are very close to the ideal &lt;code&gt;x=y&lt;/code&gt;. The decile counts are heavily skewed towards the two ends.&lt;/p&gt;
&lt;p&gt;&lt;img src="/images/gradientboosting-decile.png" alt="Gradient boosting deciles" style="width: 650px;"/&gt;&lt;/p&gt;
&lt;p&gt;&lt;img src="/images/gradientboosting-balanced.png" alt="Gradient boosting balanced" style="width: 400px;"/&gt;&lt;/p&gt;
&lt;h2&gt;Discussion&lt;/h2&gt;
&lt;p&gt;If training time is not an issue, the &lt;code&gt;GradientBoosting&lt;/code&gt; model is the best choice, both in terms of accuracy and in terms of calibration. Note how the subsequent gradient boosting steps push the predicted probabilities towards 0 and 1, resulting in the highly skewed decile counts. As a reminder, this is an ensemble of trees, trained and applied in sequence, where each subsequent tree is attempting to correct mistakes made so far; it's this construction which results in the skewed distribution.&lt;/p&gt;
&lt;p&gt;It’s also interesting to see how well the &lt;code&gt;LogisticRegression&lt;/code&gt; model performs. It’s (i) only 3% off in terms of accuracy from &lt;code&gt;GradientBoosting&lt;/code&gt; (ii) in terms of calibration it’s very close to &lt;code&gt;GradientBoosting&lt;/code&gt; (iii) it takes only 4 seconds to train, 400x faster than &lt;code&gt;GradientBoosting&lt;/code&gt;.&lt;/p&gt;
&lt;p&gt;The &lt;code&gt;DecisionTree&lt;/code&gt; and the &lt;code&gt;RandomForest&lt;/code&gt; models are not very appealing for this use-case. Note how the averaging between the trees in the &lt;code&gt;RandomForest&lt;/code&gt; ensemble pulls the predicted probabilities towards 0.5. As a reminder, a &lt;code&gt;DecisionTree&lt;/code&gt; cuts along a feature dimension at each step; when predicting, it travels down to a leaf, and returns the ratio of True training points in the leaf bucket. A &lt;code&gt;RandomForest&lt;/code&gt; is an ensemble of such trees, with the final prediction being the average of the ensemble trees’ predictions.&lt;/p&gt;
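This averaging can be checked directly in Scikit-Learn (on synthetic data below, since the delivery dataset is not public): a forest's `predict_proba` is exactly the mean of its member trees' leaf-ratio predictions.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier

X, y = make_classification(n_samples=500, n_features=20, random_state=0)
forest = RandomForestClassifier(n_estimators=10, max_depth=10, random_state=0).fit(X, y)

# Each tree predicts the ratio of True training points in the leaf the sample
# lands in; the forest's predicted probability is the average over the ensemble.
per_tree = np.stack([tree.predict_proba(X[:5]) for tree in forest.estimators_])
assert np.allclose(per_tree.mean(axis=0), forest.predict_proba(X[:5]))
```

This is why the forest's probabilities get pulled towards 0.5: extreme per-tree predictions rarely agree across the whole ensemble.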
&lt;p&gt;Accuracy is usually the primary indicator of how well a classification model is performing. However, in this case our primary goal is to extract the probabilities, so we can use them for Operational Choice, ie. ranking orders. An imaginary perfect predictor would return &lt;code&gt;p=0&lt;/code&gt; and &lt;code&gt;p=1&lt;/code&gt; at 100% accuracy (so &lt;code&gt;ROC AUC=1&lt;/code&gt;). This would be valuable because then we could skip dispatching the &lt;code&gt;p=0&lt;/code&gt; orders, since we could be 100% sure they wouldn't be delivered—but this is unrealistic. We actually prefer a model which nicely "stretches" the orders by the actual probability of delivery with a monotonic calibration curve, following the &lt;code&gt;x=y&lt;/code&gt; diagonal, like the &lt;code&gt;GradientBoosting&lt;/code&gt; model.&lt;/p&gt;</content><category term="machine"></category><category term="learning"></category><category term="fetchr"></category><category term="skl"></category><category term="calibration"></category></entry><entry><title>Using simulated self-play to solve all OpenAI Gym classic control problems with Pytorch</title><link href="https://bytepawn.com/using-simulated-self-play-to-solve-all-openai-gym-classic-control-problems-with-pytorch.html" rel="alternate"></link><published>2019-11-14T00:00:00+01:00</published><updated>2019-11-14T00:00:00+01:00</updated><author><name>Marton Trencseni</name></author><id>tag:bytepawn.com,2019-11-14:/using-simulated-self-play-to-solve-all-openai-gym-classic-control-problems-with-pytorch.html</id><summary type="html">&lt;p&gt;I use simulated self-play by ranking episodes by summed reward. Game outcomes are divided in two by cutting at the median, winners are assigned +1 rewards, losers are assigned -1 rewards, as in games like Go and Chess. 
Unlike naive policy gradient descent used in previous posts, this version solves all OpenAI classic control problems, albeit slowly.&lt;br/&gt;&lt;br/&gt; &lt;img src="/images/mountaincar.png" alt="OpenAI mountaincar" style="width: 400px;"/&gt;&lt;/p&gt;</summary><content type="html">&lt;h2&gt;Introduction&lt;/h2&gt;
&lt;p&gt;In &lt;a href="http://bytepawn.com/solving-the-cartpole-reinforcement-learning-problem-with-pytorch.html"&gt;a previous blog post&lt;/a&gt;, I applied plain vanilla Reinforcement Learning policy gradient to solve the CartPole OpenAI gym classic control problem. In &lt;a href="http://bytepawn.com/solving-openai-gym-classic-control-problems-with-pytorch.html#solving-openai-gym-classic-control-problems-with-pytorch"&gt;the subsequent blog post&lt;/a&gt;, I generalized that code (in a software engineering sense) and applied it to &lt;a href="https://gym.openai.com/envs/#classic_control"&gt;all classic control problems&lt;/a&gt;; the only "trick" was to quantize the applied action for the continuous problems to convert them to discrete problems. It was able to solve CartPole and Acrobot, but failed on Pendulum and MountainCar (the failure was unrelated to the discretization). I described the problem that I saw examining the numerics at the end of the post: &lt;/p&gt;
&lt;p&gt;&lt;em&gt;.. the loss function is structured like &lt;code&gt;loss := -1 * sum(reward x log(probability of action taken))&lt;/code&gt;, where the &lt;code&gt;log(probability of action taken)&lt;/code&gt; is negative, so the overall expression is positive, assuming the reward is positive. In this case, making the loss 0 would be a global minimum. This can happen if the optimizer sets the probabilities of an arbitrary sub-optimal policy to one, hence making the &lt;code&gt;log(probability)&lt;/code&gt; zero, making the entire loss function go to zero, even though the solution is actually "random".&lt;/em&gt;&lt;/p&gt;
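To make the degenerate minimum concrete, here is the quoted loss in plain Python (a minimal sketch; the notebook's actual `compute_loss` operates on PyTorch tensors):

```python
import math

def policy_gradient_loss(action_probs, rewards):
    # loss := -1 * sum(reward * log(probability of action taken))
    return -sum(r * math.log(p) for p, r in zip(action_probs, rewards))

# With all-positive rewards the loss is bounded below by 0, and pushing every
# taken action's probability to 1 makes each log term 0 -- so an arbitrary,
# sub-optimal deterministic policy reaches the global minimum, while any
# stochastic policy (probabilities below 1) yields a strictly positive loss.
degenerate = policy_gradient_loss([1.0, 1.0, 1.0], [1.0, 1.0, 1.0])
stochastic = policy_gradient_loss([0.5, 0.5, 0.5], [1.0, 1.0, 1.0])
```

With mixed +1/-1 rewards, as in the self-play scheme below, the loser terms have the opposite sign, so collapsing all probabilities to 1 no longer minimizes the loss.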
&lt;h2&gt;Simulated self-play&lt;/h2&gt;
&lt;p&gt;It occurred to me that this sort of problem wouldn’t occur when using RL for Chess or Go, since in that case, the agent plays itself, and the winner would get a +1 reward, the loser a -1 reward. This means that gradient descent can’t converge on a bad solution by just adjusting the weights so that the probability of an arbitrary action becomes 1, because this gets suppressed by the -1 reward parts in the loss function.&lt;/p&gt;
&lt;p&gt;It then occurred to me that it’s easy to simulate this setup with any game: let the agent play games, and once finished, pair the games up; the one with the higher episode reward (as set by the original rules of the game) is the winner, the other the loser. Overwrite the original reward, and simply use +1 for all actions taken by the winner, and -1 for all actions taken by the loser. In the actual implementation, there is no need to actually pair: it’s easier to just play N games, sort the games by overall reward, and treat the lower half as losers and give them -1 rewards, and the upper half as winners and give them +1 rewards. N should be chosen to control variance; I used N=100 here.&lt;/p&gt;
&lt;p&gt;The only "hack" necessary to make this work is for the MountainCar-v0 problem: the default reward returned by OpenAI Gym is structured so that initially every episode returns a constant -200 (-1 for each timestep the agent doesn't reach the top of the hill, with 200 timesteps per episode). With this default reward structure, the approach would fail, because the initial, totally random agent has no way of knowing which way to proceed, as all runs return a constant -200 &lt;code&gt;(min=median=avg=max=-200)&lt;/code&gt;. To make the approach work, I defined a custom reward function, which is simply the maximum x-coordinate that the agent reaches, and set the solved reward threshold at 0.5, which is the position of the flag at the top of the hill.&lt;/p&gt;
&lt;p&gt;&lt;img src="/images/mountaincar.png" alt="OpenAI Gym mountaincar" style="width: 400px;"/&gt;&lt;/p&gt;
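A minimal sketch of this reward shaping, with hypothetical helper names (the notebook wires the custom reward into its game-playing loop differently):

```python
SOLVED_THRESHOLD = 0.5  # x-position of the flag at the top of the hill

def custom_episode_reward(positions):
    # positions: the car's x-coordinate (obs[0]) at each timestep of one episode
    return max(positions)

def default_episode_reward(positions):
    # Gym's default for MountainCar-v0: -1 per timestep until the flag is reached
    return -len(positions)

# Two unsuccessful 200-step episodes: the default reward cannot tell them apart,
# but the custom reward favors the run that climbed higher up the hill.
flat_run = [-0.5] * 200
good_run = [-0.5 + 0.004 * t for t in range(200)]
```

With this reward, even purely random episodes spread out over a range of values, so the median split can meaningfully separate winners from losers.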
&lt;h2&gt;Coding it up&lt;/h2&gt;
&lt;p&gt;The code is very similar to the previous solutions. &lt;a href="https://github.com/mtrencseni/pytorch-playground/blob/master/11-gym-self-play/OpenAI%20Gym%20classic%20control.ipynb"&gt;The ipython notebook is up on Github.&lt;/a&gt; The main change is in the main training loop, which collects episode results, splits them into winners and losers, and computes the custom loss function based on that:&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;train_solve_self_play&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;env_desc&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;model&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="bp"&gt;None&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;num_epochs&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mi"&gt;5000&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;num_episodes&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mi"&gt;100&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;eps_non_greedy&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mf"&gt;0.0&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
    &lt;span class="k"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;Doing &lt;/span&gt;&lt;span class="si"&gt;%s&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;&lt;/span&gt; &lt;span class="o"&gt;%&lt;/span&gt; &lt;span class="n"&gt;env_desc&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;name&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="n"&gt;env&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;gym&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;make&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;env_desc&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;name&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="n"&gt;input_length&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;output_length&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;in_out_length&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;env&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;env_desc&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;output_quantization_levels&lt;/span&gt; &lt;span class="ow"&gt;is&lt;/span&gt; &lt;span class="ow"&gt;not&lt;/span&gt; &lt;span class="bp"&gt;None&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
        &lt;span class="n"&gt;output_length&lt;/span&gt; &lt;span class="o"&gt;*=&lt;/span&gt; &lt;span class="n"&gt;env_desc&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;output_quantization_levels&lt;/span&gt;
    &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;model&lt;/span&gt; &lt;span class="ow"&gt;is&lt;/span&gt; &lt;span class="bp"&gt;None&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
        &lt;span class="n"&gt;model&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;env_desc&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;model_class&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;input_length&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;output_length&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;env_desc&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;learning_rate&lt;/span&gt; &lt;span class="o"&gt;!=&lt;/span&gt; &lt;span class="bp"&gt;None&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
        &lt;span class="n"&gt;lr&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;env_desc&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;learning_rate&lt;/span&gt;
    &lt;span class="k"&gt;else&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
        &lt;span class="n"&gt;lr&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mf"&gt;0.01&lt;/span&gt;
    &lt;span class="n"&gt;optimizer&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;optim&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Adam&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;model&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;parameters&lt;/span&gt;&lt;span class="p"&gt;(),&lt;/span&gt; &lt;span class="n"&gt;lr&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;lr&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;epoch&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="nb"&gt;range&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;num_epochs&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
        &lt;span class="n"&gt;games&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;play_games&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;env_desc&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;env&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;model&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;num_episodes&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;eps_non_greedy&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
        &lt;span class="n"&gt;games&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;sort&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;key&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="k"&gt;lambda&lt;/span&gt; &lt;span class="n"&gt;x&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;x&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;])&lt;/span&gt; &lt;span class="c1"&gt;# sorts by first key, the reward&lt;/span&gt;
        &lt;span class="n"&gt;losses&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;games&lt;/span&gt;&lt;span class="p"&gt;[:&lt;/span&gt;&lt;span class="nb"&gt;int&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;num_episodes&lt;/span&gt;&lt;span class="o"&gt;/&lt;/span&gt;&lt;span class="mi"&gt;2&lt;/span&gt;&lt;span class="p"&gt;)]&lt;/span&gt;
        &lt;span class="n"&gt;wins&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;games&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="nb"&gt;int&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;num_episodes&lt;/span&gt;&lt;span class="o"&gt;/&lt;/span&gt;&lt;span class="mi"&gt;2&lt;/span&gt;&lt;span class="p"&gt;):]&lt;/span&gt;
        &lt;span class="n"&gt;sum_loss&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;
        &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;game&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;losses&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
            &lt;span class="n"&gt;probs&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;game&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;
            &lt;span class="n"&gt;sum_loss&lt;/span&gt; &lt;span class="o"&gt;+=&lt;/span&gt; &lt;span class="n"&gt;compute_loss&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;probs&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="o"&gt;-&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt; &lt;span class="nb"&gt;len&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;probs&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt; &lt;span class="c1"&gt;# losers get -1 reward for all actions taken&lt;/span&gt;
        &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;game&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;wins&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
            &lt;span class="n"&gt;probs&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;game&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;
            &lt;span class="n"&gt;sum_loss&lt;/span&gt; &lt;span class="o"&gt;+=&lt;/span&gt; &lt;span class="n"&gt;compute_loss&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;probs&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="o"&gt;+&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt; &lt;span class="nb"&gt;len&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;probs&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt; &lt;span class="c1"&gt;# winners get +1 reward for all actions taken&lt;/span&gt;
        &lt;span class="n"&gt;optimizer&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;zero_grad&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
        &lt;span class="n"&gt;sum_loss&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;backward&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
        &lt;span class="n"&gt;optimizer&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;step&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
        &lt;span class="n"&gt;evaluation_rewards&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[]&lt;/span&gt;
        &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;_&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="nb"&gt;range&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;10&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
            &lt;span class="n"&gt;reward&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;_&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;_&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;play_one_game&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;env_desc&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;env&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;model&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;eps_non_greedy&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mf"&gt;0.0&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;policy_func&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;select_action_from_policy_best&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
            &lt;span class="n"&gt;evaluation_rewards&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;append&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;reward&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
        &lt;span class="n"&gt;mean_reward&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;mean&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;evaluation_rewards&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
        &lt;span class="k"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;&lt;/span&gt;&lt;span class="si"&gt;%d&lt;/span&gt;&lt;span class="s1"&gt;: min=&lt;/span&gt;&lt;span class="si"&gt;%.2f&lt;/span&gt;&lt;span class="s1"&gt; median=&lt;/span&gt;&lt;span class="si"&gt;%.2f&lt;/span&gt;&lt;span class="s1"&gt; max=&lt;/span&gt;&lt;span class="si"&gt;%.2f&lt;/span&gt;&lt;span class="s1"&gt; eval=&lt;/span&gt;&lt;span class="si"&gt;%.2f&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;&lt;/span&gt; &lt;span class="o"&gt;%&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;epoch&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;games&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;][&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt; &lt;span class="n"&gt;games&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="nb"&gt;int&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;num_episodes&lt;/span&gt;&lt;span class="o"&gt;/&lt;/span&gt;&lt;span class="mi"&gt;2&lt;/span&gt;&lt;span class="p"&gt;)][&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt; &lt;span class="n"&gt;games&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="o"&gt;-&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;][&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt; &lt;span class="n"&gt;mean_reward&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;
        &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;mean_reward&lt;/span&gt; &lt;span class="o"&gt;&amp;gt;=&lt;/span&gt; &lt;span class="n"&gt;reward_threshold&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;env&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;env_desc&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
            &lt;span class="k"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;Solved!&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
            &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;model&lt;/span&gt;
    &lt;span class="k"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;Failed!&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;model&lt;/span&gt;
&lt;/pre&gt;&lt;/div&gt;


&lt;h2&gt;Results and discussion&lt;/h2&gt;
&lt;p&gt;This approach solves all the classic control problems from OpenAI Gym. I find it intellectually rewarding to adjust a simple, generic solution so it works across problems. Caveats of this approach are:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;As mentioned above, I had to adjust the reward structure for MountainCar so the training loop is able to tell apart better from worse runs (winners/losers).&lt;/li&gt;
&lt;li&gt;This approach still assumes a discrete action space, which means continuous problems like Pendulum had to be discretized. This would not work or would be very inefficient for higher dimensional action spaces.&lt;/li&gt;
&lt;li&gt;Although this approach works, it's quite slow: in each epoch, 100 games are played; for the simple games like CartPole about 100 epochs are enough, but for MountainCar about 2,000 epochs (so a total of 2,000,000 games played) are needed; other RL methods can find solutions to these simple problems using orders of magnitude less playing time.&lt;/li&gt;
&lt;li&gt;I played around with off-policy learning; this would mean that in epsilon percent of cases, the agent does not pick the action based on the current policy distribution, but totally randomly (ie. uniform); this introduces more random exploration (and noise) into the training process; in the end, this was not needed here.&lt;/li&gt;
&lt;li&gt;As before, I used the &lt;a href="https://pytorch.org/docs/stable/_modules/torch/optim/adam.html"&gt;Adam optimizer&lt;/a&gt; for all problems; I had to adjust learning rate to 0.001 for MountainCar, the default 0.01 worked for the other environments.&lt;/li&gt;
&lt;li&gt;This approach is still naive because it evenly splits each epoch's N games at the median into two equal sets of winners and losers. As the agent approaches a good solution (and the &lt;code&gt;[min, max]&lt;/code&gt; range of rewards in the pool of N games becomes tighter and better), this strategy becomes less effective, because good runs are classified as losses, and the decisions taken in them are penalized (-1 reward). Similar to adjusting the learning rate, it would be more sophisticated to adjust the ratio of winners to losers as the solution approaches the reward threshold (but I didn't implement that here).&lt;/li&gt;
&lt;/ul&gt;</content><category term="python"></category><category term="pytorch"></category><category term="reinforcement"></category><category term="learning"></category><category term="openai"></category><category term="gym"></category></entry><entry><title>Applying policy gradient to OpenAI Gym classic control problems with Pytorch</title><link href="https://bytepawn.com/solving-openai-gym-classic-control-problems-with-pytorch.html" rel="alternate"></link><published>2019-11-12T00:00:00+01:00</published><updated>2019-11-12T00:00:00+01:00</updated><author><name>Marton Trencseni</name></author><id>tag:bytepawn.com,2019-11-12:/solving-openai-gym-classic-control-problems-with-pytorch.html</id><summary type="html">&lt;p&gt;I try to generalize the policy gradient algorithm as introduced earlier to solve all the OpenAI classic control problems. It works for CartPole and Acrobot, but not for Pendulum and MountainCar environments. &lt;br/&gt;&lt;br/&gt; &lt;img src="/images/classic_control.png" alt="OpenAI classic control environments" style="width: 400px;"/&gt;&lt;/p&gt;</summary><content type="html">&lt;h2&gt;Introduction&lt;/h2&gt;
&lt;p&gt;In &lt;a href="http://bytepawn.com/solving-the-cartpole-reinforcement-learning-problem-with-pytorch.html"&gt;the previous blog post&lt;/a&gt; we used a simple Reinforcement Learning method called policy gradient to solve the CartPole-v1 environment from OpenAI. This post is about seeing how far I can take this basic approach. Can it solve the other, harder classic control problems in OpenAI Gym?&lt;/p&gt;
&lt;p&gt;The OpenAI classic control problem set consists of:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;&lt;a href="https://github.com/openai/gym/blob/master/gym/envs/classic_control/cartpole.py"&gt;CartPole-v1&lt;/a&gt;: Balance a pole on a cart.&lt;/li&gt;
&lt;li&gt;&lt;a href="https://github.com/openai/gym/blob/master/gym/envs/classic_control/acrobot.py"&gt;Acrobot-v1&lt;/a&gt;: Swing up and balance a two-link robot.&lt;/li&gt;
&lt;li&gt;&lt;a href="https://github.com/openai/gym/blob/master/gym/envs/classic_control/pendulum.py"&gt;Pendulum-v0&lt;/a&gt;: Swing up and balance a pendulum.&lt;/li&gt;
&lt;li&gt;&lt;a href="https://github.com/openai/gym/blob/master/gym/envs/classic_control/mountain_car.py"&gt;MountainCar-v0&lt;/a&gt;: Drive up a big hill.&lt;/li&gt;
&lt;li&gt;&lt;a href="https://github.com/openai/gym/blob/master/gym/envs/classic_control/continuous_mountain_car.py"&gt;MountainCarContinuous-v0&lt;/a&gt;: Drive up a big hill.&lt;/li&gt;
&lt;/ul&gt;
&lt;h2&gt;Environments&lt;/h2&gt;
&lt;p&gt;All these problems are similar in that the state space (which is the input space for the policy neural network) is a few real numbers. The action space (which is the output space for the policy) is sometimes discrete (left/right) and sometimes a real (magnitude):&lt;/p&gt;
&lt;p&gt;&lt;img src="/images/classic_control.png" alt="OpenAI classic control environments" style="width: 600px;"/&gt;&lt;/p&gt;
&lt;table&gt;
    &lt;tr&gt;
        &lt;td&gt;&lt;b&gt;env&lt;/b&gt;&lt;/td&gt;
        &lt;td&gt;&lt;b&gt;CartPole-v1&lt;/b&gt;&lt;/td&gt;
        &lt;td&gt;&lt;b&gt;Acrobot-v1&lt;/b&gt;&lt;/td&gt;
        &lt;td&gt;&lt;b&gt;Pendulum-v0&lt;/b&gt;&lt;/td&gt;
        &lt;td&gt;&lt;b&gt;MountainCar-v0&lt;/b&gt;&lt;/td&gt;
        &lt;td&gt;&lt;b&gt;MountainCarContinuous-v0&lt;/b&gt;&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
        &lt;td&gt;&lt;b&gt;description&lt;/b&gt;&lt;/td&gt;
        &lt;td&gt;balance a pole on a cart&lt;/td&gt;
        &lt;td&gt;swing up a two-link robot&lt;/td&gt;
        &lt;td&gt;swing up a pendulum&lt;/td&gt;
        &lt;td&gt;drive up a big hill&lt;/td&gt;
        &lt;td&gt;drive up a big hill&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
        &lt;td&gt;&lt;b&gt;code&lt;/b&gt;&lt;/td&gt;
        &lt;td&gt;&lt;a href="https://github.com/openai/gym/blob/master/gym/envs/classic_control/cartpole.py"&gt;cartpole.py&lt;/a&gt;&lt;/td&gt;
        &lt;td&gt;&lt;a href="https://github.com/openai/gym/blob/master/gym/envs/classic_control/acrobot.py"&gt;acrobot.py&lt;/a&gt;&lt;/td&gt;
        &lt;td&gt;&lt;a href="https://github.com/openai/gym/blob/master/gym/envs/classic_control/pendulum.py"&gt;pendulum.py&lt;/a&gt;&lt;/td&gt;
        &lt;td&gt;&lt;a href="https://github.com/openai/gym/blob/master/gym/envs/classic_control/mountain_car.py"&gt;mountain_car.py&lt;/a&gt;&lt;/td&gt;
        &lt;td&gt;&lt;a href="https://github.com/openai/gym/blob/master/gym/envs/classic_control/continuous_mountain_car.py"&gt;continuous_mountain_car.py&lt;/a&gt;&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
        &lt;td&gt;&lt;b&gt;state&lt;/b&gt;&lt;/td&gt;
        &lt;td&gt;4 reals: cart position, cart velocity, pole angle, pole angular velocity&lt;/td&gt;
        &lt;td&gt;6 reals: sine and cosine of the two rotational joint angles and the joint angular velocities&lt;/td&gt;
        &lt;td&gt;3 reals: sine and cosine of the angle and angular velocity&lt;/td&gt;
        &lt;td&gt;2 reals: position, velocity&lt;/td&gt;
        &lt;td&gt;2 reals: position, velocity&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
        &lt;td&gt;&lt;b&gt;action&lt;/b&gt;&lt;/td&gt;
        &lt;td&gt;discrete: left/right&lt;/td&gt;
        &lt;td&gt;discrete: left/nothing/right, the torque on the second joint&lt;/td&gt;
        &lt;td&gt;1 real between -2 and 2, the torque&lt;/td&gt;
        &lt;td&gt;discrete: left/nothing/right, the force on the car&lt;/td&gt;
        &lt;td&gt;1 real between -1 and 1, the force on the car&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
        &lt;td&gt;&lt;b&gt;episode length&lt;/b&gt;&lt;/td&gt;
        &lt;td&gt;500&lt;/td&gt;
        &lt;td&gt;500&lt;/td&gt;
        &lt;td&gt;200&lt;/td&gt;
        &lt;td&gt;200&lt;/td&gt;
        &lt;td&gt;999&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
        &lt;td&gt;&lt;b&gt;reward&lt;/b&gt;&lt;/td&gt;
        &lt;td&gt;+1 for each timestep the agent stays alive&lt;/td&gt;
        &lt;td&gt;-1 for each timestep the agent takes to swing up&lt;/td&gt;
        &lt;td&gt;negative reward as a function of the angle&lt;/td&gt;
        &lt;td&gt;-1 for each timestep the agent doesn’t reach the top of the hill&lt;/td&gt;
        &lt;td&gt;negative for applied action, +100 once solved&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
        &lt;td&gt;&lt;b&gt;reward threshold for solved&lt;/b&gt;&lt;/td&gt;
        &lt;td&gt;475&lt;/td&gt;
        &lt;td&gt;-100&lt;/td&gt;
        &lt;td&gt;None (I used -150)&lt;/td&gt;
        &lt;td&gt;-110&lt;/td&gt;
        &lt;td&gt;90&lt;/td&gt;
    &lt;/tr&gt;
&lt;/table&gt;

&lt;h2&gt;Coding it up&lt;/h2&gt;
&lt;p&gt;To attack all the problems with one script, I took the script from the previous post and made it more general. First, a function which returns the input and output space dimensions:&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;in_out_length&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;env&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
    &lt;span class="n"&gt;input_length&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;env&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;observation_space&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;shape&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;
    &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="nb"&gt;isinstance&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;env&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;action_space&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;gym&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;spaces&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;box&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Box&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
        &lt;span class="n"&gt;output_length&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;env&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;action_space&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;shape&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;
    &lt;span class="k"&gt;elif&lt;/span&gt; &lt;span class="nb"&gt;isinstance&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;env&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;action_space&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;gym&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;spaces&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;discrete&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Discrete&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
        &lt;span class="n"&gt;output_length&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;env&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;action_space&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;n&lt;/span&gt;
    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;input_length&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;output_length&lt;/span&gt;
&lt;/pre&gt;&lt;/div&gt;


&lt;p&gt;For example, for CartPole, it returns &lt;code&gt;(4, 2)&lt;/code&gt;. The next question is how to attack environments such as Pendulum, where the action is real and not discrete. The policy gradient formalism uses probabilities, so I chose to quantize the action space and pretend it’s discrete. This makes the number of quantization levels a parameter. I also wanted to play around with different reward functions when constructing the loss function, so I collected these into an &lt;code&gt;EnvDescriptor&lt;/code&gt;:&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;span class="n"&gt;fields&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;
    &lt;span class="s1"&gt;&amp;#39;name&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="s1"&gt;&amp;#39;model_class&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="c1"&gt;# the rest are optional, in case it&amp;#39;s needed for the env&lt;/span&gt;
    &lt;span class="s1"&gt;&amp;#39;reward_func&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="s1"&gt;&amp;#39;output_quantization_levels&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="s1"&gt;&amp;#39;output_range&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="s1"&gt;&amp;#39;reward_threshold&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
&lt;span class="p"&gt;]&lt;/span&gt;
&lt;span class="n"&gt;EnvDescription&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;collections&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;namedtuple&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;EnvDescription&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;fields&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;defaults&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="bp"&gt;None&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt; &lt;span class="nb"&gt;len&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;fields&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;    
&lt;/pre&gt;&lt;/div&gt;


&lt;p&gt;The descriptors for the problems are:&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;span class="n"&gt;env_descriptions&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;
    &lt;span class="n"&gt;EnvDescription&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;name&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;CartPole-v1&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
                   &lt;span class="n"&gt;model_class&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;PolicyShallowNN&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
                  &lt;span class="p"&gt;),&lt;/span&gt;
    &lt;span class="n"&gt;EnvDescription&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;name&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;Acrobot-v1&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
                   &lt;span class="n"&gt;model_class&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;PolicyShallowNN&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
                  &lt;span class="p"&gt;),&lt;/span&gt;
    &lt;span class="n"&gt;EnvDescription&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;name&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;Pendulum-v0&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
                   &lt;span class="n"&gt;model_class&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;PolicyDeepNN&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
                   &lt;span class="n"&gt;reward_func&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;pendelum_reward_func&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
                   &lt;span class="n"&gt;output_quantization_levels&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mi"&gt;101&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
                   &lt;span class="n"&gt;output_range&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="o"&gt;-&lt;/span&gt;&lt;span class="mi"&gt;2&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;2&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt;
                   &lt;span class="n"&gt;reward_threshold&lt;/span&gt;&lt;span class="o"&gt;=-&lt;/span&gt;&lt;span class="mi"&gt;150&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
                  &lt;span class="p"&gt;),&lt;/span&gt;
    &lt;span class="n"&gt;EnvDescription&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;name&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;MountainCar-v0&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
                   &lt;span class="n"&gt;model_class&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;PolicyDeepNN&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
                   &lt;span class="n"&gt;reward_func&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;mountaincar_reward_func&lt;/span&gt;
                  &lt;span class="p"&gt;),&lt;/span&gt;
    &lt;span class="n"&gt;EnvDescription&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;name&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;MountainCarContinuous-v0&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
                   &lt;span class="n"&gt;model_class&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;PolicyDeepNN&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
                   &lt;span class="n"&gt;reward_func&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;mountaincar_reward_func&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
                   &lt;span class="n"&gt;output_quantization_levels&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mi"&gt;101&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
                   &lt;span class="n"&gt;output_range&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="o"&gt;-&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt;
                  &lt;span class="p"&gt;),&lt;/span&gt;
&lt;span class="p"&gt;]&lt;/span&gt;
&lt;/pre&gt;&lt;/div&gt;
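&lt;p&gt;To make the quantization concrete: for Pendulum-v0, the 101 discrete network outputs map back onto the &lt;code&gt;[-2, 2]&lt;/code&gt; torque range, the same arithmetic the training loop below applies to the sampled action. A standalone sketch (&lt;code&gt;dequantize&lt;/code&gt; is a hypothetical helper name, not from the script):&lt;/p&gt;

```python
# Hypothetical helper: map a discrete action index back onto the
# continuous action range, mirroring the formula in the training loop.
def dequantize(idx, output_range, levels):
    lo, hi = output_range
    return lo + idx * float(hi - lo) / (levels - 1)

print(dequantize(0, [-2, 2], 101))    # -2.0, minimum torque
print(dequantize(50, [-2, 2], 101))   # 0.0, the middle level applies no torque
print(dequantize(100, [-2, 2], 101))  # 2.0, maximum torque
```

&lt;p&gt;With an odd number of levels such as 101, the middle level lands exactly on zero, so the agent can choose to apply no torque at all.&lt;/p&gt;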


&lt;p&gt;I defined two policy neural networks: one which is a simple affine linear map, and one with hidden layers and &lt;code&gt;ReLU&lt;/code&gt; nonlinearities. Both end with &lt;code&gt;softmax&lt;/code&gt; to get a probability distribution over the discrete (or discretized) action space, as before:&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;span class="k"&gt;class&lt;/span&gt; &lt;span class="nc"&gt;PolicyShallowNN&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;nn&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Module&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
    &lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="fm"&gt;__init__&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="bp"&gt;self&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;input_length&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;output_length&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
        &lt;span class="nb"&gt;super&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;PolicyShallowNN&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="bp"&gt;self&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="fm"&gt;__init__&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
        &lt;span class="bp"&gt;self&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;fc&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;nn&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Linear&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;input_length&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;output_length&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

    &lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;forward&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="bp"&gt;self&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;x&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
        &lt;span class="n"&gt;x&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="bp"&gt;self&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;fc&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;x&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
        &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;F&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;softmax&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;x&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;dim&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="k"&gt;class&lt;/span&gt; &lt;span class="nc"&gt;PolicyDeepNN&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;nn&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Module&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
    &lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="fm"&gt;__init__&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="bp"&gt;self&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;input_length&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;output_length&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
        &lt;span class="nb"&gt;super&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;PolicyDeepNN&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="bp"&gt;self&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="fm"&gt;__init__&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
        &lt;span class="bp"&gt;self&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;fc1&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;nn&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Linear&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;input_length&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;16&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
        &lt;span class="bp"&gt;self&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;fc2&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;nn&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Linear&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;16&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;16&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
        &lt;span class="bp"&gt;self&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;fc3&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;nn&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Linear&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;16&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;output_length&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

    &lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;forward&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="bp"&gt;self&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;x&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
        &lt;span class="n"&gt;x&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="bp"&gt;self&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;fc1&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;x&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
        &lt;span class="n"&gt;x&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;F&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;relu&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;x&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
        &lt;span class="n"&gt;x&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="bp"&gt;self&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;fc2&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;x&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
        &lt;span class="n"&gt;x&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;F&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;relu&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;x&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
        &lt;span class="n"&gt;x&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="bp"&gt;self&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;fc3&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;x&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
        &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;F&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;softmax&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;x&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;dim&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/pre&gt;&lt;/div&gt;
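&lt;p&gt;The final &lt;code&gt;softmax&lt;/code&gt; is what turns the raw network outputs (logits) into a probability distribution we can sample actions from. A minimal pure-Python sketch of what it computes (no PyTorch, just the math):&lt;/p&gt;

```python
import math

# Softmax: exponentiate each logit, then normalize so the outputs are
# positive and sum to 1 -- a valid probability distribution over actions.
def softmax(logits):
    exps = [math.exp(x) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

probs = softmax([2.0, 1.0, 0.1])
# probs are all positive, sum to 1, and preserve the ordering of the logits,
# so the action with the largest logit is the most likely to be sampled
print(probs)
```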


&lt;p&gt;With these generalizations, we use plain-vanilla policy gradient:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;for each episode&lt;ul&gt;
&lt;li&gt;finish the episode&lt;/li&gt;
&lt;li&gt;if the descriptor contains a custom reward function, use it; otherwise use the env’s default reward function to compute rewards&lt;/li&gt;
&lt;li&gt;roll up the rewards with a discount factor of &lt;code&gt;gamma = 0.99&lt;/code&gt; and multiply by -1 to get the loss (value)&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;li&gt;every 10 episodes, use the Adam optimizer to improve the weights of the policy, by backpropagating the summed losses of those 10 episodes&lt;/li&gt;
&lt;li&gt;check whether the average reward of the last 10 episodes is greater than the threshold (always using the env’s rewards here); if yes, we’re done&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;Computing the loss function:&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;compute_loss&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;probs&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;rewards&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;states&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="bp"&gt;None&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;reward_func&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="bp"&gt;None&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;gamma&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mf"&gt;0.99&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
    &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;reward_func&lt;/span&gt; &lt;span class="ow"&gt;is&lt;/span&gt; &lt;span class="ow"&gt;not&lt;/span&gt; &lt;span class="bp"&gt;None&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
        &lt;span class="n"&gt;rewards&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;reward_func&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;rewards&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;states&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="n"&gt;scaled_rewards&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[]&lt;/span&gt;
    &lt;span class="c1"&gt;# decay rewards with gamma&lt;/span&gt;
    &lt;span class="n"&gt;tail&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;
    &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;reward&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;rewards&lt;/span&gt;&lt;span class="p"&gt;[::&lt;/span&gt;&lt;span class="o"&gt;-&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;]:&lt;/span&gt; &lt;span class="c1"&gt;# backward&lt;/span&gt;
        &lt;span class="n"&gt;tail&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;reward&lt;/span&gt; &lt;span class="o"&gt;+&lt;/span&gt; &lt;span class="n"&gt;gamma&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt; &lt;span class="n"&gt;tail&lt;/span&gt;
        &lt;span class="n"&gt;scaled_rewards&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;insert&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;tail&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="c1"&gt;# insert at beginning&lt;/span&gt;
    &lt;span class="n"&gt;loss&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;
    &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;p&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;r&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="nb"&gt;zip&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;probs&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;scaled_rewards&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
        &lt;span class="n"&gt;loss&lt;/span&gt; &lt;span class="o"&gt;+=&lt;/span&gt; &lt;span class="n"&gt;p&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt; &lt;span class="n"&gt;r&lt;/span&gt;
    &lt;span class="n"&gt;loss&lt;/span&gt; &lt;span class="o"&gt;*=&lt;/span&gt; &lt;span class="o"&gt;-&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;
    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;loss&lt;/span&gt;
&lt;/pre&gt;&lt;/div&gt;


&lt;p&gt;The main training loop:&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;train_solve&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;env_desc&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;num_episodes&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mi"&gt;100&lt;/span&gt;&lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="mi"&gt;1000&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
    &lt;span class="k"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;Doing &lt;/span&gt;&lt;span class="si"&gt;%s&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;&lt;/span&gt; &lt;span class="o"&gt;%&lt;/span&gt; &lt;span class="n"&gt;env_desc&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;name&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="n"&gt;env&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;gym&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;make&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;env_desc&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;name&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="n"&gt;input_length&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;output_length&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;in_out_length&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;env&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;env_desc&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;output_quantization_levels&lt;/span&gt; &lt;span class="ow"&gt;is&lt;/span&gt; &lt;span class="ow"&gt;not&lt;/span&gt; &lt;span class="bp"&gt;None&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
        &lt;span class="n"&gt;output_length&lt;/span&gt; &lt;span class="o"&gt;*=&lt;/span&gt; &lt;span class="n"&gt;env_desc&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;output_quantization_levels&lt;/span&gt;
    &lt;span class="n"&gt;model&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;env_desc&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;model_class&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;input_length&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;output_length&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="n"&gt;optimizer&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;optim&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Adam&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;model&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;parameters&lt;/span&gt;&lt;span class="p"&gt;(),&lt;/span&gt; &lt;span class="n"&gt;lr&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mf"&gt;0.01&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="n"&gt;episode_rewards&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[]&lt;/span&gt;
    &lt;span class="n"&gt;loss&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;
    &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;episode&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="nb"&gt;range&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;num_episodes&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
        &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;episode&lt;/span&gt; &lt;span class="o"&gt;&amp;gt;&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="k"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;.&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;end&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;&amp;#39;&lt;/span&gt; &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="nb"&gt;bool&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;episode&lt;/span&gt; &lt;span class="o"&gt;%&lt;/span&gt; &lt;span class="mi"&gt;100&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;else&lt;/span&gt; &lt;span class="s1"&gt;&amp;#39;&lt;/span&gt;&lt;span class="se"&gt;\n&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;
        &lt;span class="n"&gt;probs&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;rewards&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;states&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[],&lt;/span&gt; &lt;span class="p"&gt;[],&lt;/span&gt; &lt;span class="p"&gt;[]&lt;/span&gt;
        &lt;span class="n"&gt;state&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;env&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;reset&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
        &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;t&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="nb"&gt;range&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;env&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;_max_episode_steps&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
            &lt;span class="n"&gt;action&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;prob&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;select_action_from_policy&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;model&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;state&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
            &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;env_desc&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;output_quantization_levels&lt;/span&gt; &lt;span class="ow"&gt;is&lt;/span&gt; &lt;span class="ow"&gt;not&lt;/span&gt; &lt;span class="bp"&gt;None&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
                &lt;span class="n"&gt;action&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;env_desc&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;output_range&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="o"&gt;+&lt;/span&gt; &lt;span class="n"&gt;action&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt; &lt;span class="nb"&gt;float&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;env_desc&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;output_range&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="o"&gt;-&lt;/span&gt; &lt;span class="n"&gt;env_desc&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;output_range&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;])&lt;/span&gt; &lt;span class="o"&gt;/&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;env_desc&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;output_quantization_levels&lt;/span&gt;&lt;span class="o"&gt;-&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
                &lt;span class="n"&gt;action&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;np&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;array&lt;/span&gt;&lt;span class="p"&gt;([&lt;/span&gt;&lt;span class="n"&gt;action&lt;/span&gt;&lt;span class="p"&gt;])&lt;/span&gt;
            &lt;span class="n"&gt;state&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;reward&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;done&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;_&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;env&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;step&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;action&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
            &lt;span class="n"&gt;probs&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;append&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;prob&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
            &lt;span class="n"&gt;rewards&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;append&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;reward&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
            &lt;span class="n"&gt;states&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;append&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;state&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;copy&lt;/span&gt;&lt;span class="p"&gt;())&lt;/span&gt;
            &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;done&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
                &lt;span class="k"&gt;break&lt;/span&gt;
        &lt;span class="n"&gt;episode_reward&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nb"&gt;sum&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;rewards&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
        &lt;span class="n"&gt;episode_rewards&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;append&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;episode_reward&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
        &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;is_solved&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;env&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;episode_rewards&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;reward_threshold&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;env&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;env_desc&lt;/span&gt;&lt;span class="p"&gt;)):&lt;/span&gt;
            &lt;span class="k"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;&lt;/span&gt;&lt;span class="se"&gt;\n&lt;/span&gt;&lt;span class="s1"&gt;Solved in &lt;/span&gt;&lt;span class="si"&gt;%s&lt;/span&gt;&lt;span class="s1"&gt; episodes!&amp;#39;&lt;/span&gt; &lt;span class="o"&gt;%&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;episode&lt;/span&gt;&lt;span class="o"&gt;+&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;
            &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;env&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;model&lt;/span&gt;
        &lt;span class="n"&gt;loss&lt;/span&gt; &lt;span class="o"&gt;+=&lt;/span&gt; &lt;span class="n"&gt;compute_loss&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;probs&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;rewards&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;states&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;env_desc&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;reward_func&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
        &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;episode&lt;/span&gt; &lt;span class="o"&gt;&amp;gt;&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt; &lt;span class="ow"&gt;and&lt;/span&gt; &lt;span class="n"&gt;episode&lt;/span&gt; &lt;span class="o"&gt;%&lt;/span&gt; &lt;span class="mi"&gt;10&lt;/span&gt; &lt;span class="o"&gt;==&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
            &lt;span class="n"&gt;optimizer&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;zero_grad&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
            &lt;span class="n"&gt;loss&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;backward&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
            &lt;span class="n"&gt;optimizer&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;step&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
            &lt;span class="n"&gt;loss&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;
    &lt;span class="k"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;&lt;/span&gt;&lt;span class="se"&gt;\n&lt;/span&gt;&lt;span class="s1"&gt;Failed!&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;env&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;model&lt;/span&gt;
&lt;/pre&gt;&lt;/div&gt;


&lt;h2&gt;Results&lt;/h2&gt;
&lt;p&gt;This approach is able to solve CartPole-v1 and Acrobot-v1, but &lt;strong&gt;fails&lt;/strong&gt; on the Pendulum and MountainCar problems. The optimizer gets stuck in a bad minimum and never finds a good policy. An interesting failure mode occurs when the loss is positive: the loss function is structured like &lt;code&gt;loss := -1 * sum(reward x log(probability of action taken))&lt;/code&gt;, where &lt;code&gt;log(probability of action taken)&lt;/code&gt; is negative, so the overall expression is positive, assuming the rewards are positive. In this case, driving the loss to 0 is a global minimum. The optimizer can achieve this by pushing the action probabilities of an arbitrary sub-optimal policy to one, which makes each &lt;code&gt;log(probability)&lt;/code&gt; zero and hence the entire loss zero, even though the resulting policy is no better than random.&lt;/p&gt;
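&lt;p&gt;The degenerate zero-loss behavior is easy to demonstrate in isolation. The sketch below (illustrative only, not the training code above) evaluates the loss formula directly: a sub-optimal policy executed with probability one scores a "perfect" loss of zero, while a better policy that still explores does not:&lt;/p&gt;

```python
import math

def reinforce_loss(rewards, probs):
    # loss := -1 * sum(reward x log(probability of action taken))
    return sum(-r * math.log(p) for r, p in zip(rewards, probs))

# A sub-optimal policy taken with full confidence: log(1) = 0,
# so every term vanishes and the loss sits at its global minimum...
print(reinforce_loss([1.0, 1.0, 1.0], [1.0, 1.0, 1.0]))  # 0.0

# ...while a better policy that still explores has positive loss,
# so the optimizer is pulled towards the degenerate solution.
print(reinforce_loss([1.0, 1.0, 1.0], [0.9, 0.8, 0.95]))
```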
&lt;p&gt;Intuitively, it seems this simple implementation of gradient descent only works for reactive environments like CartPole and Acrobot, where the policy network doesn’t have to find a “plan” (ie. swing left and then right to get up), it just has to react to the current state, irrespective of the history. To solve the harder environments, more advanced training approaches are required.&lt;/p&gt;</content><category term="python"></category><category term="pytorch"></category><category term="reinforcement"></category><category term="learning"></category><category term="openai"></category><category term="gym"></category></entry><entry><title>Machine Learning at Fetchr</title><link href="https://bytepawn.com/machine-learning-at-fetchr.html" rel="alternate"></link><published>2019-10-29T00:00:00+01:00</published><updated>2019-10-29T00:00:00+01:00</updated><author><name>Marton Trencseni</name></author><id>tag:bytepawn.com,2019-10-29:/machine-learning-at-fetchr.html</id><summary type="html">&lt;p&gt;Opportunities for automating, optimizing and enabling processes with ML at a delivery company such as Fetchr are plentiful. We put three families of ML models into production. These 3 areas are: Scheduling, Notifications and Operational choice.&lt;br/&gt;&lt;br/&gt;&lt;img src="/images/opchoice.png" alt="Operational choice" style="width: 400px;"/&gt;&lt;/p&gt;</summary><content type="html">&lt;h2&gt;Introduction&lt;/h2&gt;
&lt;p&gt;In the first year of Data Science at Fetchr, &lt;a href="http://bytepawn.com/fetchr-data-science-infra.html"&gt;we built out a data science infrastructure&lt;/a&gt; that allows us to understand the data, our business and operations from a quantitative perspective. We built a &lt;a href="http://prestodb.github.io/"&gt;Presto&lt;/a&gt; data warehouse on AWS, built approx. 100-200 &lt;a href="https://airflow.apache.org/"&gt;Airflow&lt;/a&gt; pipelines to feed it from our production systems, and approx. 50-100 &lt;a href="https://superset.apache.org/"&gt;Superset&lt;/a&gt; dashboards to visualize it all. This has been a great success and a big step towards making the company data-driven.&lt;/p&gt;
&lt;p&gt;While doing this work, we identified a number of opportunities to deploy Machine Learning. During our second year at Fetchr, we put three families of ML models into production. These 3 areas are:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;Scheduling&lt;/li&gt;
&lt;li&gt;Notifications&lt;/li&gt;
&lt;li&gt;Operational choice&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;In all cases, &lt;a href="http://bytepawn.com/ab-tests-moving-fast-vs-being-sure.html"&gt;we used A/B tests&lt;/a&gt; and % releases, both when first putting our models into production and when rolling out new versions of models and deprecating older ones.&lt;/p&gt;
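&lt;p&gt;A % release needs stable assignment, so that an order sees the same variant on every evaluation. A minimal sketch of deterministic bucketing (hypothetical, not Fetchr’s actual rollout code) hashes a stable id into 100 buckets:&lt;/p&gt;

```python
import hashlib

def bucket(order_id, rollout_pct):
    # hash the stable id into one of 100 buckets; the first
    # rollout_pct buckets get the new model, the rest the control
    h = int(hashlib.md5(order_id.encode()).hexdigest(), 16)
    return 'model' if h % 100 < rollout_pct else 'control'

# the assignment is deterministic: repeated calls for the same id agree
print(bucket('ORD-12345', 10))
```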
&lt;h2&gt;Scheduling&lt;/h2&gt;
&lt;p&gt;Scheduling is a critical step in the delivery funnel in the Middle East. It refers to looking at an order’s data (recipient name, phone number, freetext address, etc), and trying to figure out where our courier has to go (latitude, longitude) to deliver the package. This is trivial in Europe or the US, where we have mature addressing systems, and often the address can be resolved to a (latitude, longitude) perfectly with a Google Maps API call. In the Middle East, addressing is a challenge, because there are no zip codes, and street names and numbers are unclear and/or unknown to people. Google Maps has very limited coverage, so querying their APIs doesn’t help (same with OSM based services). Also, because people know this, they often don’t try to put their actual address into the address field; instead, they put down a nearby point of interest and/or instructions. For this reason, the &lt;code&gt;(address) -&amp;gt; (latitude, longitude)&lt;/code&gt; mapping was originally performed manually:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;either through self-scheduling (recipient gets an SMS, clicks through, and drops a pin in Google Maps on our scheduling mweb page)&lt;/li&gt;
&lt;li&gt;or by a call center agent, either by just reading the address (“Blind”), or by calling the recipient and talking with them on the phone, meanwhile dropping a pin in Google Maps&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;We realized this mapping can be automated with Machine Learning for a large majority of orders. &lt;a href="http://bytepawn.com/automating-a-call-center-with-machine-learning.html"&gt;I’ve described the automated scheduling on the blog before&lt;/a&gt;, and described the models we use. &lt;/p&gt;
&lt;p&gt;&lt;img src="/images/jvt.png" alt="JVT and JVC" style="width: 650px;"/&gt;&lt;/p&gt;
&lt;p&gt;By now we have a lot more models in production. They are (see earlier post for details):&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;repeats&lt;/li&gt;
&lt;li&gt;address matching to:&lt;ul&gt;
&lt;li&gt;manually maintained rules&lt;/li&gt;
&lt;li&gt;ML rules (single text fragment)&lt;/li&gt;
&lt;li&gt;multi-level ML rules (multi text fragment)&lt;/li&gt;
&lt;li&gt;ML rules for Arabic text&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;li&gt;provided locations&lt;/li&gt;
&lt;li&gt;zip codes (in KSA)&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;As described in an earlier post, we have lots of knobs to tune to move the models in the (conversion, Delivery Performance) space, ie. schedule more or less orders at lower or higher Delivery %. Scheduling more orders at overall lower delivery % makes sense if alternative scheduling channels (such as the call center) are experiencing technical difficulties (eg. lines are down), because in this case there is no next-best alternative for scheduling.&lt;/p&gt;
&lt;p&gt;In 2019 September, ML scheduling was Fetchr’s biggest scheduling channel globally, handling approx. 37% of all orders dispatched. In terms of Delivery Performance, it outperforms call center scheduling, and trails only our best channel, recipient self-scheduling, by approx. 5%.&lt;/p&gt;
&lt;h2&gt;Notifications&lt;/h2&gt;
&lt;p&gt;As mentioned above, our best scheduling channel in terms of Delivery Performance is self-scheduling. This makes sense: self-scheduling means the recipient visits our website and explicitly tells us the (day, time, location) they want us to deliver the order. These are recipients who really want their orders, and are willing to invest time to give us high-quality scheduling coordinates. So this is a biased, but highly valuable group. &lt;/p&gt;
&lt;p&gt;Clearly, more self-scheduling is better for any delivery company. How can we get more self-scheduling? The basic scheduling flow is for us to send out notifications to the recipients that their orders are ready to go in our last mile warehouse in their city. There are various notification channels, the biggest one is SMS. The message contains a link to our mweb scheduling page, where the recipients can self-schedule. If we can get more people to click and convert, we get more self-scheduling, which means we will have higher overall Delivery Performance (since this is the best channel wrt Delivery Performance).&lt;/p&gt;
&lt;p&gt;&lt;a href="http://bytepawn.com/ab-tests-moving-fast-vs-being-sure.html"&gt;We ran many A/B tests on notifications&lt;/a&gt;, and found that getting the language right matters a lot. In our markets, the biggest split is between English and Arabic (the third would be Hindi). So the challenge is: given a name like “Marton Trencseni” (Hungary), “Mohit Ahuja” (India), or “Tariq Sanad” (Bahrain), all expats living in the UAE, what is the right language? In the first 2 cases, it should be English; in the last, it should be Arabic.&lt;/p&gt;
&lt;p&gt;&lt;img src="/images/languages.jpg" alt="Languages" style="width: 650px;"/&gt;&lt;/p&gt;
&lt;p&gt;&lt;a href="http://bytepawn.com/arabic-name-classification-with-scikit-learn-and-pytorch.html"&gt;We experimented with numerous Scikit Learn models&lt;/a&gt;, but in the end we went with a hand-rolled one. We were not able to use public datasets for this classification task, because these datasets are highly polluted:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;many common Arabic names are also common in non-Arabic countries (eg. India)&lt;/li&gt;
&lt;li&gt;popular Arabic names also show up in English-speaking countries’ name databases (eg. Ali and Ahmed are very common in the US/UK)&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;So in the end we used names from our own delivery dataset (10M+ deliveries) to bootstrap a classification dataset, where we used frequency in mostly homogeneous countries as an initial signal and went from there. This worked, but then we realized that the algorithms (n-gram frequency and co-occurrence counting) we used to do the bootstrapping can also be re-used for the classification task itself; there’s no need for Scikit Learn. The resulting hand-tuned model is 99% accurate.&lt;/p&gt;
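&lt;p&gt;To give a flavor of the approach, here is a toy sketch of character n-gram frequency classification (illustrative only; the production model, its training data and tuning are more involved):&lt;/p&gt;

```python
from collections import Counter

def ngrams(name, n=3):
    # character n-grams, with markers so boundaries are captured
    s = '^' + name.lower() + '$'
    return [s[i:i+n] for i in range(len(s) - n + 1)]

def train(labeled_names):
    # count n-gram frequencies per language class
    freqs = {}
    for name, lang in labeled_names:
        freqs.setdefault(lang, Counter()).update(ngrams(name))
    return freqs

def classify(name, freqs):
    # score each language by the relative frequency of the name's n-grams
    def score(lang):
        total = sum(freqs[lang].values())
        return sum(freqs[lang][g] / total for g in ngrams(name))
    return max(freqs, key=score)

# tiny hypothetical training set; the real one is bootstrapped from 10M+ deliveries
data = [('mohammed', 'ar'), ('ahmed', 'ar'), ('john', 'en'), ('peter', 'en')]
freqs = train(data)
print(classify('mohamed', freqs))  # ar
```

&lt;p&gt;Normalizing by each class’s total n-gram count keeps a class with more training names from winning by volume alone.&lt;/p&gt;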
&lt;h2&gt;Operational choice&lt;/h2&gt;
&lt;p&gt;This is the latest family of models we introduced at Fetchr. From a Machine Learning perspective this is the most straightforward: we have a large number of features (essentially columns in our data warehouse) available for our historic dispatches:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;sender’s information&lt;/li&gt;
&lt;li&gt;recipient’s information (address, etc.)&lt;/li&gt;
&lt;li&gt;recipient’s historic information&lt;/li&gt;
&lt;li&gt;geography&lt;/li&gt;
&lt;li&gt;scheduling channel&lt;/li&gt;
&lt;li&gt;timing&lt;/li&gt;
&lt;li&gt;etc.&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;For each dispatch, we know whether it was successfully delivered or not. Given our historic data, we can build a classifier which predicts which orders will be delivered (or not) tomorrow (or a later date), of all orders scheduled for dispatch. After one-hot encoding, our feature vector length is in the 1000s, and we can achieve 90%+ accuracy with out-of-the-box Scikit Learn models. In other words, perhaps not too surprisingly, it is possible to predict the chances of delivery success quite well.&lt;/p&gt;
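&lt;p&gt;The feature preparation step can be sketched in a few lines. The example below (hypothetical column names, not our production schema) one-hot encodes categorical dispatch features into the kind of long numeric vectors the classifier consumes:&lt;/p&gt;

```python
def one_hot(rows, columns):
    # collect the vocabulary of each categorical column,
    # then expand every row into one 0/1 indicator per value
    vocab = [(col, sorted({row[col] for row in rows})) for col in columns]
    def encode(row):
        vec = []
        for col, values in vocab:
            vec += [1 if row[col] == v else 0 for v in values]
        return vec
    return [encode(row) for row in rows], vocab

dispatches = [
    {'city': 'Dubai',  'channel': 'sms',  'delivered': 1},
    {'city': 'Riyadh', 'channel': 'call', 'delivered': 0},
    {'city': 'Dubai',  'channel': 'call', 'delivered': 1},
]
X, vocab = one_hot(dispatches, ['city', 'channel'])
y = [d['delivered'] for d in dispatches]
print(X[0])  # [1, 0, 0, 1] -> city=Dubai, channel=sms
```

&lt;p&gt;With 1000s of distinct values across the real columns, the same expansion yields the feature vectors of length in the 1000s mentioned above.&lt;/p&gt;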
&lt;p&gt;&lt;img src="/images/opchoice.png" alt="Operational choice" style="width: 650px;"/&gt;&lt;/p&gt;
&lt;p&gt;What are the use-cases for delivery prediction? Not dispatching orders, even if they have a low predicted probability of delivery success, is not an option: it's our job to attempt the delivery! But we can use the relative probabilities to prioritize orders to increase the chances of success and improve efficiency. Another potential use-case, currently not planned at Fetchr, is differential pricing.&lt;/p&gt;
&lt;h2&gt;Conclusions&lt;/h2&gt;
&lt;p&gt;I worked at Facebook in 2016-17 and experienced a very effective Data Science culture as part of a product team. We’ve been building Data Science at Fetchr based on this template and it has worked out well so far. We had significant impact in the past 2 years, both with our Analytics and our Machine Learning projects. Opportunities for automating, optimizing and enabling processes with ML are plentiful.&lt;/p&gt;</content><category term="machine"></category><category term="learning"></category><category term="fetchr"></category><category term="skl"></category></entry><entry><title>Solving the CartPole Reinforcement Learning problem with Pytorch</title><link href="https://bytepawn.com/solving-the-cartpole-reinforcement-learning-problem-with-pytorch.html" rel="alternate"></link><published>2019-10-22T00:00:00+02:00</published><updated>2019-10-22T00:00:00+02:00</updated><author><name>Marton Trencseni</name></author><id>tag:bytepawn.com,2019-10-22:/solving-the-cartpole-reinforcement-learning-problem-with-pytorch.html</id><summary type="html">&lt;p&gt;The CartPole problem is the Hello World of Reinforcement Learning, originally described in 1985 by Sutton et al. The environment is a pole balanced on a cart. CartPole is one of the environments in OpenAI Gym, so we don't have to code up the physics. Here I walk through a simple solution using Pytorch. &lt;br/&gt;&lt;br/&gt; &lt;img src="/images/cartpole.gif" alt="Cartpole animation" style="width: 400px;"/&gt;&lt;/p&gt;</summary><content type="html">&lt;h2&gt;Introduction&lt;/h2&gt;
&lt;p&gt;The CartPole problem is the &lt;a href="https://en.wikipedia.org/wiki/%22Hello,_World!%22_program"&gt;Hello World&lt;/a&gt; of Reinforcement Learning, originally described in &lt;a href="http://www.incompleteideas.net/papers/OSB-tracking-85.pdf"&gt;1985 by Sutton et al&lt;/a&gt;. The environment is a pole balanced on a cart. Here I walk through a simple solution using Pytorch. 
The &lt;a href="https://github.com/mtrencseni/pytorch-playground/tree/master/09-cartpole"&gt;ipython notebook is up on Github&lt;/a&gt;.&lt;/p&gt;
&lt;p&gt;The cartpole environment’s state is described by a 4-tuple:&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
    &lt;span class="n"&gt;x&lt;/span&gt; &lt;span class="n"&gt;position&lt;/span&gt; &lt;span class="n"&gt;of&lt;/span&gt; &lt;span class="n"&gt;cart&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;x&lt;/span&gt; &lt;span class="n"&gt;velocity&lt;/span&gt; &lt;span class="n"&gt;of&lt;/span&gt; &lt;span class="n"&gt;cart&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;angular&lt;/span&gt; &lt;span class="n"&gt;position&lt;/span&gt; &lt;span class="n"&gt;of&lt;/span&gt; &lt;span class="n"&gt;pole&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;angular&lt;/span&gt; &lt;span class="n"&gt;velocity&lt;/span&gt; &lt;span class="n"&gt;of&lt;/span&gt; &lt;span class="n"&gt;pole&lt;/span&gt;
&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/pre&gt;&lt;/div&gt;


&lt;p&gt;At every timestep, the physics simulation is updated. The input is 0 or 1, depending on whether we want to move the cart to the left or to the right.&lt;/p&gt;
&lt;p&gt;With the OpenAI Gym environment, we don’t have to code up the physics simulation, &lt;a href="https://github.com/openai/gym/blob/master/gym/envs/classic_control/cartpole.py"&gt;it comes out of the box&lt;/a&gt;. We just have to:&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="nn"&gt;gym&lt;/span&gt;
&lt;span class="n"&gt;env&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;gym&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;make&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;CartPole-v1&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/pre&gt;&lt;/div&gt;


&lt;p&gt;&lt;img src="/images/cartpole.gif" alt="Cartpole animation" style="width: 400px;"/&gt;&lt;/p&gt;
&lt;p&gt;Note: the &lt;code&gt;-v1&lt;/code&gt; in the environment spec makes each episode run for 500 steps. &lt;code&gt;CartPole-v0&lt;/code&gt; only runs for 200 steps.&lt;/p&gt;
&lt;p&gt;Before we get into neural networks and Reinforcement Learning (RL), let’s play around with the environment to get some intuition. The basic simulation loop is:&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;span class="n"&gt;state&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;env&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;reset&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
&lt;span class="k"&gt;while&lt;/span&gt; &lt;span class="bp"&gt;True&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
    &lt;span class="n"&gt;action&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;select_action&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;state&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="n"&gt;state&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;_&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;done&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;_&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;env&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;step&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;action&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="n"&gt;env&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;render&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
    &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;done&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
        &lt;span class="k"&gt;break&lt;/span&gt;
&lt;/pre&gt;&lt;/div&gt;


&lt;p&gt;Note: &lt;code&gt;env.render()&lt;/code&gt; will open a GUI window and show you the cartpole.&lt;/p&gt;
&lt;p&gt;The environment will return &lt;code&gt;done=True&lt;/code&gt; if either 500 timesteps have elapsed (episode success) or if the pole has fallen over (angular position of the pole has reached +- 12 degrees) or the cart has left the simulation space (cart position has reached +- 2.4), in which case the episode failed.&lt;/p&gt;
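&lt;p&gt;The failure conditions can be written out explicitly, with the thresholds taken from the Gym CartPole source:&lt;/p&gt;

```python
import math

# thresholds from gym/envs/classic_control/cartpole.py
X_THRESHOLD = 2.4                         # cart position limit
THETA_THRESHOLD = 12 * 2 * math.pi / 360  # pole angle limit (12 degrees)

def episode_failed(state):
    x, _, theta, _ = state  # (position, velocity, angle, angular velocity)
    return abs(x) > X_THRESHOLD or abs(theta) > THETA_THRESHOLD

print(episode_failed((0.0, 0.0, 0.0, 0.0)))  # False: pole upright, cart centered
print(episode_failed((2.5, 0.0, 0.0, 0.0)))  # True: cart left the simulation space
```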
&lt;p&gt;To make the above snippet work, we just have to supply a &lt;code&gt;select_action()&lt;/code&gt; function, which given the state, returns what to do: move the cart left or right. Let’s see what happens if we supply a random agent:&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;select_action_random&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;state&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
    &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;random&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt; &lt;span class="o"&gt;&amp;lt;&lt;/span&gt; &lt;span class="mf"&gt;0.5&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
        &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;
    &lt;span class="k"&gt;else&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
        &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt;
&lt;/pre&gt;&lt;/div&gt;


&lt;p&gt;Obviously this will not perform very well, but it’s a start. How about something smarter: if the pole is falling to the right, let’s move the cart to the right to compensate, and vice versa:&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;select_action_simple&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;state&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
    &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;state&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="mi"&gt;2&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="o"&gt;&amp;lt;&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
        &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;
    &lt;span class="k"&gt;else&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
        &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt;
&lt;/pre&gt;&lt;/div&gt;


&lt;p&gt;This should perform better than our random agent, but how much better? Let’s write a simple function which measures how far, on average, our agent survives (as a fraction of the 500 steps), returning a number between 0 and 1, 1 being a perfect score:&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;goodness_score&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;select_action&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;num_episodes&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mi"&gt;100&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
    &lt;span class="n"&gt;num_steps&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;500&lt;/span&gt;
    &lt;span class="n"&gt;ts&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[]&lt;/span&gt;
    &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;episode&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="nb"&gt;range&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;num_episodes&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
        &lt;span class="n"&gt;state&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;env&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;reset&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
        &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;t&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="nb"&gt;range&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;num_steps&lt;/span&gt;&lt;span class="o"&gt;+&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
            &lt;span class="n"&gt;action&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;select_action&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;state&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
            &lt;span class="n"&gt;state&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;_&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;done&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;_&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;env&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;step&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;action&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
            &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;done&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
                &lt;span class="k"&gt;break&lt;/span&gt;
        &lt;span class="n"&gt;ts&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;append&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;t&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="n"&gt;score&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nb"&gt;sum&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;ts&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;/&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nb"&gt;len&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;ts&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;&lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="n"&gt;num_steps&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;score&lt;/span&gt;
&lt;/pre&gt;&lt;/div&gt;


&lt;p&gt;If we run this for the two test functions above:&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;span class="k"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;goodness_score&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;select_action_simple&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;
&lt;span class="k"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;goodness_score&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;select_action_random&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;
&lt;/pre&gt;&lt;/div&gt;


&lt;p&gt;We see that random scores 0.04 and simple scores 0.08. So the simple policy is better than random, but still very far from 1. Let’s try something better: what if we also add the angular velocity?&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;select_action_good&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;state&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
    &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;state&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="mi"&gt;2&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;&lt;span class="o"&gt;+&lt;/span&gt;&lt;span class="n"&gt;state&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="mi"&gt;3&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="o"&gt;&amp;lt;&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
        &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;
    &lt;span class="k"&gt;else&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
        &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt;

&lt;span class="n"&gt;goodness_score&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;select_action_good&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/pre&gt;&lt;/div&gt;


&lt;p&gt;It turns out this simple test function solves the problem almost perfectly: it gets a score of nearly 1.0 (meaning the simulation survives all the way to 500 steps). We got lucky with this ansatz, but there’s no lucky shortcut to solving Starcraft, so let’s move on to Reinforcement Learning.&lt;/p&gt;
&lt;p&gt;Essentially, we will build a neural network that tries to guess/learn the &lt;code&gt;select_action()&lt;/code&gt; function above. The input state is 4 doubles, and the output is 2 doubles (the probabilities of going left and right, which sum to 1). Since we got lucky above with &lt;code&gt;select_action_good()&lt;/code&gt;, we know a small neural network will do; it just has to learn to add the right components:&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;span class="k"&gt;class&lt;/span&gt; &lt;span class="nc"&gt;PolicyNN&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;nn&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Module&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
    &lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="fm"&gt;__init__&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="bp"&gt;self&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
        &lt;span class="nb"&gt;super&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;PolicyNN&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="bp"&gt;self&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="fm"&gt;__init__&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
        &lt;span class="bp"&gt;self&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;fc&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;nn&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Linear&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;4&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;2&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

    &lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;forward&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="bp"&gt;self&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;x&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
        &lt;span class="n"&gt;x&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="bp"&gt;self&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;fc&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;x&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
        &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;F&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;softmax&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;x&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;dim&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/pre&gt;&lt;/div&gt;


&lt;p&gt;The &lt;code&gt;select_action()&lt;/code&gt; that goes along with the network is:&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;select_action_from_policy&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;model&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;state&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
    &lt;span class="n"&gt;state&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;torch&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;from_numpy&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;state&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;float&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;unsqueeze&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="n"&gt;probs&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;model&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;state&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="n"&gt;m&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;Categorical&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;probs&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="n"&gt;action&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;m&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;sample&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;action&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;item&lt;/span&gt;&lt;span class="p"&gt;(),&lt;/span&gt; &lt;span class="n"&gt;m&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;log_prob&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;action&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;select_action_from_policy_best&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;model&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;state&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
    &lt;span class="n"&gt;state&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;torch&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;from_numpy&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;state&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;float&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;unsqueeze&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="n"&gt;probs&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;model&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;state&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;probs&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;][&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="o"&gt;&amp;gt;&lt;/span&gt; &lt;span class="n"&gt;probs&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;][&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;]:&lt;/span&gt;
        &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;
    &lt;span class="k"&gt;else&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
        &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt;
&lt;/pre&gt;&lt;/div&gt;


&lt;p&gt;&lt;code&gt;select_action_from_policy()&lt;/code&gt; runs the network on the state, and returns a left/right action sampled according to the output probabilities (and also the log probability of the chosen action; see later). &lt;code&gt;select_action_from_policy_best()&lt;/code&gt; can be used after training; it always returns the action with the higher probability.&lt;/p&gt;
&lt;p&gt;The next question is, how do we train our network? In &lt;a href="http://bytepawn.com/solving-mnist-with-pytorch-and-skl.html"&gt;supervised learning like MNIST&lt;/a&gt;, we train the network on independent samples; for each sample we know the desired response, and we construct a loss function (from some sort of distance-like metric) from that. Here, there is no explicit training data to tell us whether in a given state a left/right prediction was good or bad. But it's actually not that hard to construct a loss function.&lt;/p&gt;
&lt;p&gt;Taking a step back, what is our goal in the CartPole environment? The goal is for the agent to survive as long as possible, 500 steps in this simulation. So as a first idea, we could simply try to use &lt;code&gt;(1 - t/500)&lt;/code&gt;, one minus the normalized length of the simulation (how far the agent got), as the loss function. Then we can use gradient descent to search for a minimum of the loss function (maximum length of simulation). Assuming we do a gradient descent step after each episode (simulation run), the code would be:&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;train_wont_work&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;num_episodes&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mi"&gt;1000&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
    &lt;span class="n"&gt;num_steps&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;500&lt;/span&gt;
    &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;episode&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="nb"&gt;range&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;num_episodes&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
        &lt;span class="n"&gt;state&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;env&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;reset&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
        &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;t&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="nb"&gt;range&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;num_steps&lt;/span&gt;&lt;span class="o"&gt;+&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
            &lt;span class="n"&gt;action&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;select_action_from_policy&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;state&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
            &lt;span class="n"&gt;state&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;_&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;done&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;_&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;env&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;step&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;action&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
            &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;done&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
                &lt;span class="k"&gt;break&lt;/span&gt;
        &lt;span class="n"&gt;loss&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mf"&gt;1.0&lt;/span&gt; &lt;span class="o"&gt;-&lt;/span&gt; &lt;span class="n"&gt;t&lt;/span&gt; &lt;span class="o"&gt;/&lt;/span&gt; &lt;span class="n"&gt;num_steps&lt;/span&gt;
        &lt;span class="c1"&gt;# this doesn&amp;#39;t actually work, because&lt;/span&gt;
        &lt;span class="c1"&gt;# the loss function is not an explicit&lt;/span&gt;
        &lt;span class="c1"&gt;# function of the model&amp;#39;s output; it&amp;#39;s&lt;/span&gt;
        &lt;span class="c1"&gt;# a function of book keeping variables&lt;/span&gt;
        &lt;span class="n"&gt;optimizer&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;zero_grad&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
        &lt;span class="n"&gt;loss&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;backward&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt; &lt;span class="c1"&gt;# AttributeError: &amp;#39;float&amp;#39; object has no attribute &amp;#39;backward&amp;#39;&lt;/span&gt;
        &lt;span class="n"&gt;optimizer&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;step&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
&lt;/pre&gt;&lt;/div&gt;


&lt;p&gt;However, this won't work. The problem is that &lt;code&gt;t&lt;/code&gt; is a bookkeeping &lt;code&gt;int&lt;/code&gt; variable, not a Pytorch variable. It's not derived from (not a function of) the neural network's output, so we can't take its derivative with respect to the network's weights.&lt;/p&gt;
&lt;p&gt;But from the above idea, we can get something that works. At each timestep, let's say the reward is how long the simulation survived after that timestep, multiplied by the network's output for the action we took (the left/right probability). The loss is the negative sum of all rewards. This sounds good, because: (i) the global minimum of this loss is reached when the agent survives all the way, and takes the moves that lead there with probability 1; (ii) it's a function of the network's outputs (the probabilities). In code:&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;train_simple&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;num_episodes&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mi"&gt;10&lt;/span&gt;&lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="mi"&gt;1000&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
    &lt;span class="n"&gt;num_steps&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;500&lt;/span&gt;
    &lt;span class="n"&gt;ts&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[]&lt;/span&gt;
    &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;episode&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="nb"&gt;range&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;num_episodes&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
        &lt;span class="n"&gt;state&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;env&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;reset&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
        &lt;span class="n"&gt;probs&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[]&lt;/span&gt;
        &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;t&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="nb"&gt;range&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;num_steps&lt;/span&gt;&lt;span class="o"&gt;+&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
            &lt;span class="n"&gt;action&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;prob&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;select_action_from_policy&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;model&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;state&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
            &lt;span class="n"&gt;probs&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;append&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;prob&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
            &lt;span class="n"&gt;state&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;_&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;done&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;_&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;env&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;step&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;action&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
            &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;done&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
                &lt;span class="k"&gt;break&lt;/span&gt;
        &lt;span class="n"&gt;loss&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;
        &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;i&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;prob&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="nb"&gt;enumerate&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;probs&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
            &lt;span class="n"&gt;loss&lt;/span&gt; &lt;span class="o"&gt;+=&lt;/span&gt; &lt;span class="o"&gt;-&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;t&lt;/span&gt; &lt;span class="o"&gt;-&lt;/span&gt; &lt;span class="n"&gt;i&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt; &lt;span class="n"&gt;prob&lt;/span&gt;
        &lt;span class="k"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;episode&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;t&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;loss&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;item&lt;/span&gt;&lt;span class="p"&gt;())&lt;/span&gt;
        &lt;span class="n"&gt;optimizer&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;zero_grad&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
        &lt;span class="n"&gt;loss&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;backward&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
        &lt;span class="n"&gt;optimizer&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;step&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
        &lt;span class="n"&gt;ts&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;append&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;t&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
        &lt;span class="c1"&gt;# check stopping condition:&lt;/span&gt;
        &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="nb"&gt;len&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;ts&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;&amp;gt;&lt;/span&gt; &lt;span class="mi"&gt;10&lt;/span&gt; &lt;span class="ow"&gt;and&lt;/span&gt; &lt;span class="nb"&gt;sum&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;ts&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="o"&gt;-&lt;/span&gt;&lt;span class="mi"&gt;10&lt;/span&gt;&lt;span class="p"&gt;:])&lt;/span&gt;&lt;span class="o"&gt;/&lt;/span&gt;&lt;span class="mf"&gt;10.0&lt;/span&gt; &lt;span class="o"&gt;&amp;gt;=&lt;/span&gt; &lt;span class="n"&gt;num_steps&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt; &lt;span class="mf"&gt;0.95&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
            &lt;span class="k"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;Stopping training, looks good...&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
            &lt;span class="k"&gt;return&lt;/span&gt;

&lt;span class="n"&gt;train_simple&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
&lt;/pre&gt;&lt;/div&gt;


&lt;p&gt;This works! After a couple hundred episodes, training stops, because the network successfully balances the pole:&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;span class="k"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;goodness_score&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="k"&gt;lambda&lt;/span&gt; &lt;span class="n"&gt;state&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;select_action_from_policy_best&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;model&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;state&lt;/span&gt;&lt;span class="p"&gt;)))&lt;/span&gt;
&lt;span class="o"&gt;&amp;gt;&amp;gt;&amp;gt;&lt;/span&gt; &lt;span class="mf"&gt;1.0&lt;/span&gt; 
&lt;/pre&gt;&lt;/div&gt;


&lt;p&gt;This "derivation" should give an intuitive understanding of RL. There is one bit that we jumped over: we use the log probabilities in the loss function, not the regular probabilities (see last line in &lt;code&gt;select_action_from_policy()&lt;/code&gt;). The training loop doesn't work if we change it to use the regular probabilities. For an explanation, see this &lt;a href="https://medium.com/@thechrisyoon/deriving-policy-gradients-and-implementing-reinforce-f887949bd63"&gt;derivation of the policy gradient method&lt;/a&gt;, which is what we actually implemented here.&lt;/p&gt;
&lt;p&gt;Another important note is that here we didn't use a discount factor (usually called &lt;code&gt;gamma&lt;/code&gt;; what we did here is the same as setting &lt;code&gt;gamma=1&lt;/code&gt;, i.e. no decay of rewards), and the training loop still converges. Most RL problems use a discount factor, because there is an assumption that whatever action we took at time t influenced what happens after, but as time goes on it becomes less important (the importance decays). Check the &lt;a href="https://github.com/pytorch/examples/blob/master/reinforcement_learning/reinforce.py"&gt;official Pytorch CartPole example&lt;/a&gt; for an implementation with a discount factor; interestingly, it doesn't seem to have better convergence properties than this naive implementation.&lt;/p&gt;
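&lt;p&gt;As a hypothetical sketch (the function name is mine, not from the Pytorch example), computing discounted per-step returns with a &lt;code&gt;gamma&lt;/code&gt; factor looks like this; setting &lt;code&gt;gamma=1&lt;/code&gt; recovers the undiscounted "steps survived after this step" weighting used above:&lt;/p&gt;

```python
def discounted_returns(rewards, gamma=0.99):
    # Walk backwards through the episode, accumulating
    # G_t = r_t + gamma * G_{t+1}.
    returns = []
    g = 0.0
    for r in reversed(rewards):
        g = r + gamma * g
        returns.append(g)
    return list(reversed(returns))

# With gamma=1 and a reward of 1 per step, each step's return is
# the number of steps from that step to the end of the episode.
print(discounted_returns([1, 1, 1], gamma=1.0))  # [3.0, 2.0, 1.0]
```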
&lt;p&gt;More links:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;&lt;a href="https://medium.com/@ts1829/policy-gradient-reinforcement-learning-in-pytorch-df1383ea0baf"&gt;Policy Gradient Reinforcement Learning in PyTorchs&lt;/a&gt; (similar intro)&lt;/li&gt;
&lt;li&gt;&lt;a href="https://gym.openai.com/envs/"&gt;OpenAI environments&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://www.youtube.com/playlist?list=PLzuuYNsE1EZAXYR4FJ75jcJseBmo4KQ9-"&gt;10 part RL course by Google DeepMind's David Silver&lt;/a&gt; (highly recommended)&lt;/li&gt;
&lt;li&gt;&lt;a href="https://www.amazon.com/Deep-Learning-Game-Max-Pumperla/dp/1617295329"&gt;Deep Learning and the Game of Go book&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://www.amazon.com/Reinforcement-Learning-Introduction-Adaptive-Computation/dp/0262193981"&gt;Sutton, Barto: &lt;em&gt;Reinforcement Learning&lt;/em&gt; book&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;</content><category term="python"></category><category term="pytorch"></category><category term="reinforcement"></category><category term="learning"></category><category term="openai"></category><category term="gym"></category><category term="cartpole"></category></entry><entry><title>Metrics Atlas</title><link href="https://bytepawn.com/metrics-atlas.html" rel="alternate"></link><published>2019-08-29T00:00:00+02:00</published><updated>2019-08-29T00:00:00+02:00</updated><author><name>Marton Trencseni</name></author><id>tag:bytepawn.com,2019-08-29:/metrics-atlas.html</id><summary type="html">&lt;p&gt;The idea is simple: write a document which helps new and existing people—both managers and individual contributors—get an objective, metrics-based picture of the business. This is helpful when new people join, when people start working in new segments of the business, and to understand other parts of the company.&lt;br/&gt;&lt;br/&gt;&lt;img src="/images/metrics_atlas.png" alt="Metrics atlas" style="width: 400px;"/&gt;&lt;/p&gt;</summary><content type="html">&lt;p&gt;The idea is simple: &lt;strong&gt;write a document which helps new and existing people—both managers and individual contributors—get an objective, metrics-based picture of the business.&lt;/strong&gt; This is helpful when new people join, when people start working in new segments of the business, and to understand other parts of the company. &lt;/p&gt;
&lt;p&gt;Companies with a data/analytics team end up with lots of dashboards and reports, plus hundreds of ad-hoc, non-dashboarded analyses that data scientists produce over the years. The dashboards are usually discoverable because they're in one system, but there will be a lot of them, and it's hard mental work to look through 50-100 dashboards and make sense of it all. Many people won't do that. And the ad-hoc reports and presentations are locked up in documents, notebooks and email threads, not discoverable by others, even though many of them contain valuable insights.&lt;/p&gt;
&lt;p&gt;The idea is similar to when an engineering team concludes their system design work by writing &lt;a href="https://arxiv.org/abs/1302.3860"&gt;a document explaining the architecture they chose, why they chose it, trade-offs, lessons learned&lt;/a&gt;. A written document, with lots of charts and links, that somebody can read and get a good picture of the product/business. &lt;strong&gt;This is one, long, stand-alone document.&lt;/strong&gt; Wikis are good, but wikis will have a lot of sub-pages and are better for search-and-find consumption use-cases. With the atlas, the user experience is that the reader goes through the whole thing and comes away feeling that they now understand the business better. And because it has links, they now know where to look for metrics, dashboards, wikis, or who to ask. It doesn't have to be up to date to the day; updating it every 3-6 months is good enough.&lt;/p&gt;
&lt;p&gt;The first time I had this idea was at Prezi, but we never got around to it. At Facebook I was working on &lt;a href="https://www.facebook.com/workplace"&gt;Workplace&lt;/a&gt;, which at that time was so new and changing so fast that this didn't make sense. At Fetchr, we were able to allocate a few days to this recently, and it turned out very useful!&lt;/p&gt;
&lt;p&gt;For us it's a Google doc, we co-edit it, and share it within the company. &lt;strong&gt;Right now it's about 40 pages, it will probably come in at 50 pages in the end.&lt;/strong&gt; Since we've been working with this data and metrics every day for the last 2 years, writing this was not a big effort.&lt;/p&gt;
&lt;p&gt;This is what the first page looks like:&lt;/p&gt;
&lt;p&gt;&lt;img src="/images/metrics_atlas.png" alt="Metrics atlas" style="width: 800px;"/&gt;&lt;/p&gt;</content><category term="data"></category><category term="fetchr"></category></entry><entry><title>Playing Go with supervised learning in Pytorch</title><link href="https://bytepawn.com/playing-go-with-supervised-learning-in-pytorch.html" rel="alternate"></link><published>2019-08-25T00:00:00+02:00</published><updated>2019-08-25T00:00:00+02:00</updated><author><name>Marton Trencseni</name></author><id>tag:bytepawn.com,2019-08-25:/playing-go-with-supervised-learning-in-pytorch.html</id><summary type="html">&lt;p&gt;Using historic gameplay between strong Go players as training data, a CNN model is built to predict good Go moves on a standard 19x19 Go board. &lt;br/&gt;&lt;br/&gt; &lt;img src="/images/go-supervised-prediction-sample-1.jpg" alt="Go prediction sample" style="width: 600px;"/&gt;&lt;/p&gt;</summary><content type="html">&lt;h2&gt;Introduction&lt;/h2&gt;
&lt;p&gt;&lt;a href="https://en.wikipedia.org/wiki/AlphaGo"&gt;AlphaGo&lt;/a&gt; and &lt;a href="https://en.wikipedia.org/wiki/AlphaGo_Zero"&gt;AlphaGo Zero&lt;/a&gt; are tour-de-force demonstrations of the power of Deep Learning combined with Reinforcement Learning.&lt;/p&gt;
&lt;p&gt;The new book &lt;a href="https://www.amazon.com/Deep-Learning-Game-Max-Pumperla/dp/1617295329"&gt;Deep Learning and the Game of Go&lt;/a&gt; walks the reader through the steps of building an AlphaGo-like Go playing agent, starting from a completely randomized bot (but one which knows the rules of the game), to eventually a full-blown agent that learns with Reinforcement Learning. I’ve been reading it, it’s great, highly recommended.&lt;/p&gt;
&lt;p&gt;&lt;img src="/images/dl-go-book.jpg" alt="CIFAR examples" style="width: 400px;"/&gt;&lt;/p&gt;
&lt;p&gt;One of the &lt;em&gt;intermediate chapters&lt;/em&gt; in the book (Chapter 7) shows how to build a Deep Learning agent using supervised learning. In other words, given historic games between strong players (humans or bots) let a deep neural network learn to play good moves. In the context of building Go bots, the idea is not to use the output of the neural network directly. Instead, this output of the neural network can be used to build a bot that uses &lt;a href="https://en.wikipedia.org/wiki/Monte_Carlo_tree_search"&gt;Monte Carlo Tree Search&lt;/a&gt;, as does AlphaGo.&lt;/p&gt;
&lt;p&gt;The idea behind Monte Carlo tree search in games like Go is simple. When the bot has to make a move, there are a number of possibilities for placing the next stone on the board. In MCTS, before making the actual move, the bot tries a number of moves in simulation, then switches sides, makes the next move, and so on; based on the outcome of these simulations (or a score assignment if the game is not over), it weighs its options and selects the next move accordingly. What makes Go hard, even harder than Chess, is that there are a lot of moves to try! Trying all valid moves is computationally infeasible, so the bot needs some help to select and prune interesting parts of the tree.&lt;/p&gt;
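&lt;p&gt;As an illustrative toy (my own sketch, not code from the book): flat Monte Carlo evaluation, the simplest precursor of full tree search, scores each candidate move by playouts and picks the best average outcome. Here &lt;code&gt;simulate&lt;/code&gt; is a hypothetical function that plays out a game after the given move and returns 1 for a win and 0 for a loss:&lt;/p&gt;

```python
def flat_monte_carlo(moves, simulate, n_rollouts=50):
    # Score each candidate move by its average playout result,
    # then play the move with the best average.
    def avg_outcome(move):
        return sum(simulate(move) for _ in range(n_rollouts)) / n_rollouts
    return max(moves, key=avg_outcome)

# Toy "game" where move 'b' always wins:
best = flat_monte_carlo(['a', 'b'], lambda move: 1 if move == 'b' else 0)
print(best)  # 'b'
```

&lt;p&gt;Real MCTS improves on this by building a tree and reusing statistics across moves, which is exactly where the pruning help described above comes in.&lt;/p&gt;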
&lt;p&gt;This is where a neural network trained with supervised learning is useful. If we train it on historic games, we get an agent which can return the top-N moves, which we can use in an MCTS setting. To avoid limiting the bot too much by historical play, it can also use an &lt;a href="https://en.wikipedia.org/wiki/Multi-armed_bandit"&gt;epsilon-greedy explore/exploit&lt;/a&gt; mechanism to find other interesting moves.&lt;/p&gt;
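&lt;p&gt;A minimal sketch of combining the network’s top-N moves with epsilon-greedy exploration (illustrative names, plain Python; in the real bot the probabilities would come from the trained network):&lt;/p&gt;

```python
import random

def candidate_moves(move_probs, n=5, epsilon=0.1):
    """Return n candidate moves for MCTS to explore: usually the
    network's top-n moves, but with probability epsilon swap the
    weakest candidate for a random other move (explore/exploit)."""
    # move indices sorted by descending predicted probability
    ranked = sorted(range(len(move_probs)), key=lambda i: -move_probs[i])
    top_n = ranked[:n]
    if random.random() < epsilon:
        others = [i for i in range(len(move_probs)) if i not in top_n]
        if others:
            top_n[-1] = random.choice(others)
    return top_n

probs = [0.05, 0.40, 0.10, 0.30, 0.15]
print(candidate_moves(probs, n=2, epsilon=0.0))  # [1, 3] -- the two best moves
```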
&lt;p&gt;The book uses Keras in its examples. Inspired by the book, I wanted to see what it’s like to do this with Pytorch. &lt;a href="https://github.com/mtrencseni/pytorch-playground/tree/master/08-go-supervised"&gt;The notebooks are up on Github.&lt;/a&gt;&lt;/p&gt;
&lt;h2&gt;Getting historic gameplay data&lt;/h2&gt;
&lt;p&gt;The site &lt;a href="https://u-go.net/gamerecords/"&gt;u-go.net lists many years’ worth of games&lt;/a&gt; between strong players (6d or better). The files are zipped, and inside we find an SGF file for each game. Fortunately, there is a Python library called &lt;a href="https://github.com/mattheww/sgfmill"&gt;sgfmill&lt;/a&gt; which makes opening and processing SGF game files simple.&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="nn"&gt;os&lt;/span&gt;

&lt;span class="c1"&gt;## download:&lt;/span&gt;
&lt;span class="c1"&gt;# curl https://u-go.net/gamerecords/ | grep &amp;#39;https://dl.u-go.net/gamerecords/&amp;#39; | grep zip | cut -d&amp;#39;&amp;quot;&amp;#39; -f 2 | awk &amp;#39;{ print &amp;quot;wget &amp;quot; $1 }&amp;#39; | sh&lt;/span&gt;
&lt;span class="c1"&gt;## extract:&lt;/span&gt;
&lt;span class="c1"&gt;# ls -l | awk &amp;#39;{ print &amp;quot;unzip &amp;quot; $9 }&amp;#39; | sh&lt;/span&gt;

&lt;span class="n"&gt;GAMEFILES_PATH&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="s1"&gt;&amp;#39;&amp;lt;something&amp;gt;&amp;#39;&lt;/span&gt;

&lt;span class="n"&gt;game_files&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[]&lt;/span&gt;
&lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;r&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;d&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;f&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;os&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;walk&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;GAMEFILES_PATH&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
    &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="nb"&gt;file&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;f&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
        &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="s1"&gt;&amp;#39;.sgf&amp;#39;&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="nb"&gt;file&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
            &lt;span class="n"&gt;game_files&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;append&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;os&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;path&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;join&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;r&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nb"&gt;file&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;

&lt;span class="k"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;Total games: &lt;/span&gt;&lt;span class="si"&gt;%s&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;&lt;/span&gt;  &lt;span class="o"&gt;%&lt;/span&gt; &lt;span class="nb"&gt;len&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;game_files&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;
&lt;/pre&gt;&lt;/div&gt;


&lt;p&gt;We use these games to create the training (and test) sets. We loop through each move in each game, and create the mapping &lt;code&gt;board -&amp;gt; next move&lt;/code&gt;. This is the mapping we want our neural network to learn.&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="nn"&gt;sgfmill&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;sgf&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;sgf_moves&lt;/span&gt;

&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;make_data_points&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;game_files&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
    &lt;span class="n"&gt;data_points&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[]&lt;/span&gt;
    &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;i&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;game_file&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="nb"&gt;enumerate&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;game_files&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
        &lt;span class="k"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;Processing &lt;/span&gt;&lt;span class="si"&gt;%s&lt;/span&gt;&lt;span class="s1"&gt;/&lt;/span&gt;&lt;span class="si"&gt;%s&lt;/span&gt;&lt;span class="s1"&gt;: &lt;/span&gt;&lt;span class="si"&gt;%s&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;&lt;/span&gt; &lt;span class="o"&gt;%&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;i&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nb"&gt;len&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;game_files&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt; &lt;span class="n"&gt;game_file&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;
        &lt;span class="k"&gt;with&lt;/span&gt; &lt;span class="nb"&gt;open&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;game_file&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;as&lt;/span&gt; &lt;span class="n"&gt;f&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
            &lt;span class="n"&gt;contents&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;f&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;read&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;encode&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;ascii&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
            &lt;span class="n"&gt;game&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;sgf&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Sgf_game&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;from_bytes&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;contents&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
            &lt;span class="n"&gt;board&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;plays&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;sgf_moves&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;get_setup_and_moves&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;game&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
            &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;color&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;move&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;plays&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
                &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;move&lt;/span&gt; &lt;span class="ow"&gt;is&lt;/span&gt; &lt;span class="bp"&gt;None&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="k"&gt;continue&lt;/span&gt;
                &lt;span class="n"&gt;row&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;col&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;move&lt;/span&gt;
                &lt;span class="n"&gt;tp&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;data_point&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;board&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;move&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;color&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
                &lt;span class="n"&gt;data_points&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;append&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;tp&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
                &lt;span class="n"&gt;board&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;play&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;row&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;col&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;color&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;data_points&lt;/span&gt;

&lt;span class="n"&gt;NUM_TRAINING_GAMES&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;1000&lt;/span&gt;
&lt;span class="n"&gt;NUM_TEST_GAMES&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;100&lt;/span&gt;

&lt;span class="n"&gt;training_game_files&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;game_files&lt;/span&gt;&lt;span class="p"&gt;[:&lt;/span&gt;&lt;span class="n"&gt;NUM_TRAINING_GAMES&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;
&lt;span class="n"&gt;test_game_files&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;game_files&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="o"&gt;-&lt;/span&gt;&lt;span class="n"&gt;NUM_TEST_GAMES&lt;/span&gt;&lt;span class="p"&gt;:]&lt;/span&gt;
&lt;span class="n"&gt;training_points&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;make_data_points&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;training_game_files&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="n"&gt;test_points&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;make_data_points&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;test_game_files&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/pre&gt;&lt;/div&gt;


&lt;p&gt;The code above calls &lt;code&gt;data_point()&lt;/code&gt;, which actually constructs the desired representation, shown below.&lt;/p&gt;
&lt;h2&gt;Encoding data&lt;/h2&gt;
&lt;p&gt;Encoding the board positions is simple. Go is played on a 19x19 grid, and each grid point is either empty, occupied by white, or occupied by black. We create 19x19 float arrays, where empty is encoded as &lt;code&gt;0.0&lt;/code&gt;, the player to move as &lt;code&gt;1.0&lt;/code&gt;, and the opponent as &lt;code&gt;-1.0&lt;/code&gt;. This way both white’s and black’s moves can be used for training. The move is the grid location of the next stone, encoded as an index into a one-hot vector of length 19x19 = 361.&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="nn"&gt;torch&lt;/span&gt;

&lt;span class="n"&gt;BOARD_SIZE&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;19&lt;/span&gt;

&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;data_point&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;board&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;move&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;color&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
    &lt;span class="n"&gt;board_array&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;torch&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;zeros&lt;/span&gt;&lt;span class="p"&gt;((&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;BOARD_SIZE&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;BOARD_SIZE&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt; &lt;span class="n"&gt;dtype&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;torch&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;float32&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;device&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;device&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;p&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;board&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;list_occupied_points&lt;/span&gt;&lt;span class="p"&gt;():&lt;/span&gt;
        &lt;span class="n"&gt;board_array&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;p&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;][&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt; &lt;span class="n"&gt;p&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;][&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;]]&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="o"&gt;-&lt;/span&gt;&lt;span class="mf"&gt;1.0&lt;/span&gt; &lt;span class="o"&gt;+&lt;/span&gt; &lt;span class="mi"&gt;2&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt; &lt;span class="nb"&gt;int&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;p&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="o"&gt;==&lt;/span&gt; &lt;span class="n"&gt;color&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;board_array&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;move&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;&lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="n"&gt;BOARD_SIZE&lt;/span&gt;&lt;span class="o"&gt;+&lt;/span&gt;&lt;span class="n"&gt;move&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;

&lt;span class="k"&gt;class&lt;/span&gt; &lt;span class="nc"&gt;GoDataset&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;data&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Dataset&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
    &lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="fm"&gt;__init__&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="bp"&gt;self&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;data_points&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
        &lt;span class="bp"&gt;self&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;data_points&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;data_points&lt;/span&gt;
    &lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="fm"&gt;__getitem__&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="bp"&gt;self&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;index&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
        &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="bp"&gt;self&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;data_points&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;index&lt;/span&gt;&lt;span class="p"&gt;][&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt; &lt;span class="bp"&gt;self&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;data_points&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;index&lt;/span&gt;&lt;span class="p"&gt;][&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;
    &lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="fm"&gt;__len__&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="bp"&gt;self&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
        &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="nb"&gt;len&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="bp"&gt;self&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;data_points&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="n"&gt;training_dataset&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;GoDataset&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;training_points&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="n"&gt;test_dataset&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;GoDataset&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;test_points&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="n"&gt;train_loader&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;torch&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;utils&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;data&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;DataLoader&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;training_dataset&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;batch_size&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mi"&gt;64&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;shuffle&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="bp"&gt;True&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="n"&gt;test_loader&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;torch&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;utils&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;data&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;DataLoader&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;test_dataset&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;batch_size&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mi"&gt;1000&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;shuffle&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="bp"&gt;True&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/pre&gt;&lt;/div&gt;


&lt;h2&gt;Neural network model&lt;/h2&gt;
&lt;p&gt;The neural network is a relatively simple &lt;a href="https://en.wikipedia.org/wiki/Convolutional_neural_network"&gt;Convolutional Neural Network&lt;/a&gt;, since we want it to learn 2D spatial patterns on the board. As usual, we use ReLU for nonlinearities. Interestingly, unlike with image recognition CNNs like the ones I built for &lt;a href="http://bytepawn.com/solving-mnist-with-pytorch-and-skl.html#solving-mnist-with-pytorch-and-skl"&gt;MNIST&lt;/a&gt; and &lt;a href="http://bytepawn.com/solving-cifar-10-with-pytorch-and-skl.html#solving-cifar-10-with-pytorch-and-skl"&gt;CIFAR-10&lt;/a&gt;, here I had to use &lt;a href="https://pytorch.org/docs/stable/_modules/torch/nn/modules/dropout.html"&gt;dropout&lt;/a&gt;, i.e. randomly zeroing a fraction p of activations during each training pass. Without dropout, the model overfit the training data but performed barely better than random guessing on test data.&lt;/p&gt;
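&lt;p&gt;Dropout itself is simple to sketch in plain Python (a hedged illustration of the “inverted dropout” scheme that PyTorch’s &lt;code&gt;nn.Dropout&lt;/code&gt; implements): during training each activation is zeroed with probability p and the survivors are scaled by 1/(1-p) so the expected value is unchanged, while at evaluation time activations pass through untouched.&lt;/p&gt;

```python
import random

def dropout(activations, p=0.5, training=True):
    """Inverted dropout: during training, zero each activation with
    probability p and scale survivors by 1/(1-p); identity at eval time."""
    if not training:
        return list(activations)
    scale = 1.0 / (1.0 - p)
    return [a * scale if random.random() >= p else 0.0 for a in activations]

acts = [0.5, 1.0, 1.5, 2.0]
print(dropout(acts, p=0.5, training=False))  # unchanged: [0.5, 1.0, 1.5, 2.0]
print(dropout(acts, p=0.5, training=True))   # some zeroed, the rest doubled
```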
&lt;p&gt;The structure is:&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;span class="k"&gt;class&lt;/span&gt; &lt;span class="nc"&gt;CNN&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;nn&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Module&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
    &lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="fm"&gt;__init__&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="bp"&gt;self&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
        &lt;span class="nb"&gt;super&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;CNN&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="bp"&gt;self&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="fm"&gt;__init__&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
        &lt;span class="bp"&gt;self&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;conv1&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;nn&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Conv2d&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;in_channels&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;out_channels&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mi"&gt;32&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;kernel_size&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mi"&gt;9&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;stride&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;padding&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mi"&gt;4&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
        &lt;span class="bp"&gt;self&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;conv2&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;nn&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Conv2d&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;in_channels&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mi"&gt;32&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;out_channels&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mi"&gt;32&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;kernel_size&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mi"&gt;7&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;stride&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;padding&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mi"&gt;3&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
        &lt;span class="bp"&gt;self&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;conv3&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;nn&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Conv2d&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;in_channels&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mi"&gt;32&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;out_channels&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mi"&gt;32&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;kernel_size&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mi"&gt;5&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;stride&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;padding&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mi"&gt;2&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
        &lt;span class="bp"&gt;self&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;fc1&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;nn&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Linear&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;in_features&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mi"&gt;32&lt;/span&gt;&lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="n"&gt;BOARD_SIZE&lt;/span&gt;&lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="n"&gt;BOARD_SIZE&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;out_features&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;BOARD_SIZE&lt;/span&gt;&lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="n"&gt;BOARD_SIZE&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
        &lt;span class="bp"&gt;self&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;fc2&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;nn&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Linear&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;in_features&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;BOARD_SIZE&lt;/span&gt;&lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="n"&gt;BOARD_SIZE&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;out_features&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;BOARD_SIZE&lt;/span&gt;&lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="n"&gt;BOARD_SIZE&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
        &lt;span class="bp"&gt;self&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;dropout&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;nn&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Dropout&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mf"&gt;0.5&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

    &lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;forward&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="bp"&gt;self&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;x&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
        &lt;span class="n"&gt;x&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;F&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;relu&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="bp"&gt;self&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;conv1&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;x&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;
        &lt;span class="n"&gt;x&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;F&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;relu&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="bp"&gt;self&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;conv2&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;x&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;
        &lt;span class="n"&gt;x&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;F&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;relu&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="bp"&gt;self&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;conv3&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;x&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;
        &lt;span class="n"&gt;x&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;x&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;view&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="o"&gt;-&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;32&lt;/span&gt;&lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="n"&gt;BOARD_SIZE&lt;/span&gt;&lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="n"&gt;BOARD_SIZE&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
        &lt;span class="n"&gt;x&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="bp"&gt;self&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;dropout&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;F&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;relu&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="bp"&gt;self&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;fc1&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;x&lt;/span&gt;&lt;span class="p"&gt;)))&lt;/span&gt;      &lt;span class="c1"&gt;# notice the dropout&lt;/span&gt;
        &lt;span class="n"&gt;x&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="bp"&gt;self&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;dropout&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="bp"&gt;self&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;fc2&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;x&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;              &lt;span class="c1"&gt;# notice the dropout&lt;/span&gt;
        &lt;span class="n"&gt;x&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;F&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;log_softmax&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;x&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;dim&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
        &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;x&lt;/span&gt;
&lt;/pre&gt;&lt;/div&gt;


&lt;p&gt;Note that this is a pretty big model. It has 1M+ parameters, almost all of them in the final two fully connected (FC) layers; this is why it is able to overfit a small training set.&lt;/p&gt;
&lt;h2&gt;Training and test accuracy&lt;/h2&gt;
&lt;p&gt;This is a relatively big model, so it needs to be trained on a GPU. On my GTX 970, training on 3,000 games (500,000 moves) takes about 6 hours. The model converges to about 75% accuracy on training data and 25% accuracy on test data. Note that random guessing would achieve 1/19^2 ≈ 0.28% accuracy, so this is roughly 90x better. Also, there is not necessarily one right play, so differing from the test data’s move is not automatically wrong.&lt;/p&gt;
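&lt;p&gt;The baseline arithmetic, as a quick sanity check:&lt;/p&gt;

```python
# A uniformly random guesser picks one of 19 * 19 = 361 grid points.
baseline = 1 / (19 * 19)
print(f'{baseline:.4%}')        # 0.2770%
# ~25% test accuracy, the approximate figure from the training run above
print(round(0.25 / baseline))   # 90, i.e. roughly 90x better than random
```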
&lt;p&gt;&lt;img src="/images/go-supervised-loss.png" alt="Loss" style="width: 800px;"/&gt;&lt;br/&gt;
&lt;img src="/images/go-supervised-accuracy.png" alt="Accuracy" style="width: 800px;"/&gt;&lt;/p&gt;
&lt;p&gt;Examining the predictions was a bit surprising. The model has a softmax at the end, so it predicts a probability for each “class” (each possible move). I expected the model to identify multiple good moves, which would make it useful in an MCTS algorithm. But that’s not what happens most of the time: the probability mass is usually “focused” on one move. Below are samples from the test data (left: board setup, center: actual move per test data, right: predicted move probabilities by model):&lt;/p&gt;
&lt;p&gt;The test dataset's move is predicted:&lt;br/&gt;&lt;/p&gt;
&lt;p&gt;&lt;img src="/images/go-supervised-prediction-sample-1.jpg" alt="Go prediction sample" style="width: 800px;"/&gt;&lt;/p&gt;
&lt;p&gt;Another move has a higher probability, but the test dataset's move also has a high probability (two maximums):&lt;br/&gt;&lt;/p&gt;
&lt;p&gt;&lt;img src="/images/go-supervised-prediction-sample-2.jpg" alt="Go prediction sample" style="width: 800px;"/&gt;&lt;/p&gt;
&lt;p&gt;The test dataset's move is missed completely:&lt;br/&gt;&lt;/p&gt;
&lt;p&gt;&lt;img src="/images/go-supervised-prediction-sample-3.jpg" alt="Go prediction sample" style="width: 800px;"/&gt;&lt;/p&gt;
&lt;p&gt;&lt;a href="https://github.com/mtrencseni/pytorch-playground/blob/master/08-go-supervised/Go-GPU-run.ipynb"&gt;See the output here for more examples&lt;/a&gt;.&lt;/p&gt;
&lt;p&gt;I didn’t try this, but this is probably not a problem: we can train multiple models on different training sets, and use the outputs of several models to guide the MCTS walk during simulation.&lt;/p&gt;
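The ensemble idea can be sketched as follows. This is a hypothetical helper, not code from the post; it assumes each trained network returns log-probabilities from its `forward()`, as the model above does:

```python
import torch

def ensemble_move_probs(models, board_tensor):
    """Average the per-move probabilities of several policy networks.

    Hypothetical helper: `models` is a list of trained networks whose
    forward() returns log-probabilities (as the model above does);
    `board_tensor` is one encoded board position with a batch dimension.
    """
    probs = []
    with torch.no_grad():
        for model in models:
            model.eval()                    # disable dropout etc.
            log_p = model(board_tensor)     # shape: (1, 19*19)
            probs.append(log_p.exp())       # back to probabilities
    return torch.stack(probs).mean(dim=0)   # element-wise average
```

During an MCTS simulation, the averaged probabilities could then be used to weight which candidate moves to expand.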
&lt;h2&gt;Conclusion&lt;/h2&gt;
&lt;p&gt;Neural networks are a great tool for well-defined problems with rich training data, and Go is like this. I didn’t pursue this specific supervised learning approach further, because it is not the most interesting way to build a Go bot. Here we use other players’ moves to teach the network, which would potentially limit the agent’s strength and creativity. With Reinforcement Learning and self-play, as demonstrated by AlphaGo, it’s possible to build an agent that doesn’t use past gameplay data at all.&lt;/p&gt;
&lt;p&gt;I don't actually play Go, so I plan to switch to Chess for further toy model building. Stay tuned!&lt;/p&gt;</content><category term="python"></category><category term="pytorch"></category><category term="cnn"></category><category term="go"></category></entry><entry><title>Arabic name classification with Scikit-Learn and Pytorch</title><link href="https://bytepawn.com/arabic-name-classification-with-scikit-learn-and-pytorch.html" rel="alternate"></link><published>2019-08-02T00:00:00+02:00</published><updated>2019-08-02T00:00:00+02:00</updated><author><name>Marton Trencseni</name></author><id>tag:bytepawn.com,2019-08-02:/arabic-name-classification-with-scikit-learn-and-pytorch.html</id><summary type="html">&lt;p&gt;While working on arabic-vs-rest classification, I was curious how good out-of-the-box models perform with publicly available data, and then compare that with what we can achieve with internal data / features derived from millions of deliveries. We train Scikit-learn and Pytorch models for this classification task and achieve 90% prediction accuracy on publicly available data and out-of-the-box models, while internally 99% is achievable. &lt;br/&gt;&lt;br/&gt; &lt;img src="/images/arabic-1.png" alt="ROC curve" style="width: 400px;"/&gt;&lt;/p&gt;</summary><content type="html">&lt;h2&gt;Introduction&lt;/h2&gt;
&lt;p&gt;Many times, a delivery company doing parcel deliveries on behalf of its clients to the clients’ customers (the recipients) doesn’t have a direct relationship with the recipient herself. All it has is the parcel, the attached name and address, and maybe a product description on the air waybill. Nevertheless, there are numerous steps in the logistics funnel when the delivery company would like to communicate with the recipient; the simplest example is sending an “I’m coming” notification on the day of (attempted) delivery. These notifications present rich opportunities for data scientists to experiment. One simple thing is to get the language right. In the Middle East, in countries like the UAE, more than half the population is expats, so we can probably do better than a country default.&lt;/p&gt;
&lt;p&gt;In this region, the simplest base case is to tell an arabic name apart from a non-arabic name, and assume that arabic/english notifications work for those two cases. While working on this arabic-vs-rest classification problem, I was curious how good out-of-the-box models perform with publicly available data, and then compare that with what we can achieve with internal data / features derived from millions of deliveries.&lt;/p&gt;
&lt;p&gt;&lt;a href="https://github.com/mtrencseni/pytorch-playground/tree/master/07-arabic"&gt;The code and training data is up on Github.&lt;/a&gt;&lt;/p&gt;
&lt;h2&gt;Building a training set from publicly available data&lt;/h2&gt;
&lt;p&gt;Fortunately there are some publicly available datasets that we can merge to get training data. Specifically, I used these sources (first 4 for arabic, last for english):&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;&lt;a href="https://en.wikipedia.org/wiki/List_of_Arabic_given_names"&gt;https://en.wikipedia.org/wiki/List_of_Arabic_given_names&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://github.com/zakahmad/ArabicNameGenderFinder"&gt;https://github.com/zakahmad/ArabicNameGenderFinder&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="http://www.20000-names.com/female_arabian_names.htm"&gt;http://www.20000-names.com/female_arabian_names.htm&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="http://www.20000-names.com/male_arabian_names.htm"&gt;http://www.20000-names.com/male_arabian_names.htm&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://github.com/ligi/HiLoCo/blob/master/app/src/main/res/raw/names.csv"&gt;https://github.com/ligi/HiLoCo/blob/master/app/src/main/res/raw/names.csv&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;Using these sources I created a training set of 10,000 names:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;5,000 arabic, 5,000 english&lt;/li&gt;
&lt;li&gt;8,000 for training (balanced)&lt;/li&gt;
&lt;li&gt;2,000 for testing  (balanced)&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;Some arabic examples: &lt;em&gt;ahmed, ghalib, hasna, salar, afruz.&lt;/em&gt;
Some english examples: &lt;em&gt;john, westwood, eldon, corina, margareta.&lt;/em&gt;&lt;/p&gt;
&lt;p&gt;The names are cleaned: ASCII a-z letters only; all lower-case; at least 3 characters long; arabic names that are also common in english-speaking countries (like ahmed or ali) were removed from the english set; and so on.&lt;/p&gt;
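These cleaning rules can be sketched in a few lines; the function name and the explicit stoplist argument are mine, not from the repo:

```python
import re

def clean_name(name, stoplist=frozenset()):
    """Return the cleaned name, or None if it should be dropped.

    Sketch of the cleaning rules described above; `stoplist` holds
    names removed by hand (eg. arabic names common in english sources).
    """
    name = name.strip().lower()
    if not re.fullmatch(r'[a-z]{3,}', name):  # ASCII a-z only, >= 3 chars
        return None
    if name in stoplist:
        return None
    return name
```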
&lt;h2&gt;Model training&lt;/h2&gt;
&lt;p&gt;I specifically wanted to see how bag-of-chars feature vectors perform against one-hot encoded ones; other than that I just wanted to try a bunch of models I had anyway from previous SKL and Pytorch projects:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;&lt;a href="https://scikit-learn.org/stable/modules/generated/sklearn.linear_model.LogisticRegression.html"&gt;logistic regression&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://scikit-learn.org/stable/modules/generated/sklearn.tree.DecisionTreeClassifier.html"&gt;simple decision trees&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://scikit-learn.org/stable/modules/generated/sklearn.ensemble.RandomForestClassifier.html"&gt;Random Forests&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://scikit-learn.org/stable/modules/generated/sklearn.ensemble.GradientBoostingClassifier.html"&gt;Gradient Boosted Trees&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://scikit-learn.org/stable/modules/generated/sklearn.neural_network.MLPClassifier.html"&gt;Neural nets&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;fully connected deep NNs in Pytorch,&lt;/li&gt;
&lt;li&gt;and finally CNNs in Pytorch.&lt;/li&gt;
&lt;/ul&gt;
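For reference, the two featurizations being compared can be sketched like this (a minimal illustration; it assumes names are already cleaned to lower-case a-z, and the padding length `MAX_LEN` is an assumption of mine):

```python
import numpy as np

ALPHABET = 'abcdefghijklmnopqrstuvwxyz'
MAX_LEN = 20  # assumed fixed padded length, not a figure from the post

def bag_of_chars(name):
    """26-dim vector of per-character counts; position is discarded."""
    v = np.zeros(len(ALPHABET))
    for ch in name:
        v[ALPHABET.index(ch)] += 1
    return v

def one_hot(name):
    """Flattened (MAX_LEN x 26) indicator matrix; position is kept."""
    v = np.zeros((MAX_LEN, len(ALPHABET)))
    for i, ch in enumerate(name[:MAX_LEN]):
        v[i, ALPHABET.index(ch)] = 1
    return v.flatten()
```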
&lt;p&gt;I won’t go into the details of these models; I will just show the results.&lt;/p&gt;
&lt;p&gt;&lt;img src="/images/arabic-2.png" alt="Arabic-vs-rest model train and test results" style="width: 600px;"/&gt;&lt;/p&gt;
&lt;p&gt;My observations:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;the tree based and NN models have enough parameters to learn the training set (train accuracy approaches 100%)&lt;/li&gt;
&lt;li&gt;the neural nets don't outperform the trees&lt;/li&gt;
&lt;li&gt;no model is able to go significantly above 90% test accuracy, so generalization is limited&lt;/li&gt;
&lt;li&gt;90% is not good enough for production on this problem (although that wasn’t the goal here)&lt;/li&gt;
&lt;li&gt;I was expecting the one-hot encoded models to pick up on useful trigram features, but they don’t significantly outperform the simple bag-of-chars models; I specifically expected the CNN to outperform the others here; the likely reason it doesn't: the characters of a trigram end up far apart in the flattened one-hot encoding, so the CNN cannot pick up the trigram directly, only in downstream "averaged" layers&lt;/li&gt;
&lt;li&gt;I was training the Pytorch models on my laptop’s CPU, which is why the Pytorch training times are so high; it’s interesting how much slower the Pytorch network training is than SKL’s MLPClassifier (which is also a deep neural network); possible reasons: the MLP has far fewer parameters and/or the Pytorch models have ReLU() and other non-linearities&lt;/li&gt;
&lt;/ul&gt;
&lt;h2&gt;Doing much better with a lot more data&lt;/h2&gt;
&lt;p&gt;&lt;em&gt;Note: these datasets derived from internal data, and the models tied to it, are not shared.&lt;/em&gt;&lt;/p&gt;
&lt;p&gt;Since we do millions of deliveries, specifically in the countries of interest to us, we have the potential to have much more training data than the datasets above. The challenge for us was ground truth: our historic deliveries are not labeled for arabic/rest. Fortunately there are tricks we could use (other available fields, frequency, etc.) to create a high-quality arabic names dataset for each of our target countries. As is usual in these projects, 80% of the work went into creating a high-quality dataset; once this was done, building a predictor was simple. In the end, the best-performing model is a hand-tuned tri-gram based model that also uses frequencies of names (not part of the public datasets), and achieves 99% accuracy. It only gets confused on names that genuinely sound like they could be arabic (but are actually not, or are shared arabic/indian names). I leave it to the reader to change the models/feature vectors above to be bag-of-trigrams; it should only take a few lines of code changes.&lt;/p&gt;
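A bag-of-trigrams featurizer could look roughly like this; illustrative code only, not the internal hand-tuned model:

```python
def trigrams(name):
    """All overlapping 3-character windows of a name."""
    return [name[i:i + 3] for i in range(len(name) - 2)]

def bag_of_trigrams(train_names):
    """Build a trigram vocabulary from the training names; return it
    together with an encoder mapping a name to trigram counts."""
    vocab = sorted({t for n in train_names for t in trigrams(n)})
    index = {t: i for i, t in enumerate(vocab)}
    def encode(name):
        v = [0] * len(vocab)
        for t in trigrams(name):
            if t in index:  # trigrams unseen in training are ignored
                v[index[t]] += 1
        return v
    return vocab, encode
```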
&lt;h2&gt;Estimating impact&lt;/h2&gt;
&lt;p&gt;A model like this (once we’re reasonably sure its accuracy is good enough that we can trust it for making estimates) is not just useful in production, but also for estimating its own impact. By running it on (unseen) past delivery data in our target countries, we can see how much of an impact the model will have in production: there are various “naive” ways to predict language (always arabic, always english, use information from other fields), and we can compare the accuracy of these to what we can achieve with our model, to get the additional % of deliveries where we will get the notification language right. If we also have an estimate for how much getting the language right lifts the probability of delivery success, we can estimate the overall lift in delivery success, which is then easy to translate to dollars (or dirhams). If any of these multipliers is not available in this &lt;a href="https://en.wikipedia.org/wiki/Fermi_problem"&gt;Fermi-decomposition&lt;/a&gt;, then we can always perform an A/B test to measure the impact. For us, running this estimate shows that the model will be most useful in the UAE; this makes sense, as there are lots of expats in Dubai. Interestingly, it is also useful in another GCC country where most of our deliveries go to arabic names, but this would be hard to tell without the predictor, since the recipients give their names with english letters (ie. not with arabic Unicode characters).&lt;/p&gt;
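The Fermi decomposition is just a few multiplications; every number below is made up for illustration, not a figure from our data:

```python
# Hypothetical inputs to the Fermi decomposition (all made up):
deliveries_per_month = 1_000_000
extra_correct_language = 0.30  # extra share of deliveries where the model
                               # beats the best naive language rule
success_lift = 0.02            # assumed lift in delivery success from
                               # getting the notification language right
value_per_delivery = 25        # dirhams per additional successful delivery

extra_successes = deliveries_per_month * extra_correct_language * success_lift
monthly_value = extra_successes * value_per_delivery  # roughly 150,000 dirhams
```

If one of the multipliers (say `success_lift`) is unknown, that is exactly the factor to pin down with an A/B test.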
&lt;p&gt;The arabic expression for &lt;em&gt;“done”&lt;/em&gt; is &lt;em&gt;“khallas”&lt;/em&gt;, which is used often enough in the region that it has become slang even for non-arabic speakers like myself. This project, however, is not done; there are always ways to improve such models. So the more appropriate expression here is &lt;em&gt;“yalla”&lt;/em&gt;, which roughly means &lt;em&gt;“let’s go”&lt;/em&gt;: let’s go and improve the models further!&lt;/p&gt;
&lt;p&gt;Yalla let's go!&lt;/p&gt;</content><category term="pytorch"></category><category term="skl"></category><category term="arabic"></category><category term="fetchr"></category></entry><entry><title>Exploring prior beliefs with MCMC</title><link href="https://bytepawn.com/exploring-prior-beliefs-with-mcmc.html" rel="alternate"></link><published>2019-07-06T00:00:00+02:00</published><updated>2019-07-06T00:00:00+02:00</updated><author><name>Marton Trencseni</name></author><id>tag:bytepawn.com,2019-07-06:/exploring-prior-beliefs-with-mcmc.html</id><summary type="html">&lt;p&gt;I use PyMC3 to solve the food delivery toy problem and explore some alternative priors. &lt;br/&gt;&lt;br/&gt; &lt;img src="/images/pymc-2.png" alt="PyMC3 traceplot()" style="width: 400px;"/&gt;&lt;/p&gt;</summary><content type="html">&lt;h2&gt;Introduction&lt;/h2&gt;
&lt;p&gt;In the previous post, &lt;a href="http://bytepawn.com/food-deliveries-bayes-and-computational-statistics.html"&gt;Food deliveries, Bayes and Computational Statistics&lt;/a&gt;, a first-principles simulation was used to calculate the Bayes posterior probability of UberEats being the most popular food delivery service. After I finished writing the post, I was left feeling unsatisfied: I wrote too much simulation code. I realized I could've used a &lt;a href="https://en.wikipedia.org/wiki/Markov_chain_Monte_Carlo"&gt;Markov Chain Monte Carlo (MCMC)&lt;/a&gt; framework to get the same result with less code. &lt;a href="https://github.com/mtrencseni/playground/blob/master/PyMC%20UberEats.ipynb"&gt;The notebook is up on Github&lt;/a&gt;.&lt;/p&gt;
&lt;p&gt;So let's do that. Let's use &lt;a href="https://docs.pymc.io/"&gt;PyMC3&lt;/a&gt;, the standard MCMC library for Python. This code is so simple, without further ado, here it is:&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="nn"&gt;pymc3&lt;/span&gt; &lt;span class="kn"&gt;as&lt;/span&gt; &lt;span class="nn"&gt;pm&lt;/span&gt;
&lt;span class="o"&gt;%&lt;/span&gt;&lt;span class="n"&gt;matplotlib&lt;/span&gt; &lt;span class="n"&gt;inline&lt;/span&gt;

&lt;span class="n"&gt;data&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="s1"&gt;&amp;#39;Carriage&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;  &lt;span class="mi"&gt;2&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="s1"&gt;&amp;#39;Talabat&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;   &lt;span class="mi"&gt;2&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="s1"&gt;&amp;#39;UberEats&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;  &lt;span class="mi"&gt;16&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="s1"&gt;&amp;#39;Deliveroo&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="mi"&gt;12&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;

&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;uniform_prior&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;k&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
    &lt;span class="n"&gt;rs&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;pm&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Uniform&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;raw&lt;/span&gt;&lt;span class="si"&gt;%s&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;&lt;/span&gt; &lt;span class="o"&gt;%&lt;/span&gt; &lt;span class="n"&gt;i&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;i&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="nb"&gt;range&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;k&lt;/span&gt;&lt;span class="p"&gt;)]&lt;/span&gt;
    &lt;span class="n"&gt;s&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nb"&gt;sum&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;rs&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;pm&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Deterministic&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;p&lt;/span&gt;&lt;span class="si"&gt;%s&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;&lt;/span&gt; &lt;span class="o"&gt;%&lt;/span&gt; &lt;span class="n"&gt;i&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;rs&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;i&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;&lt;span class="o"&gt;/&lt;/span&gt;&lt;span class="n"&gt;s&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;i&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="nb"&gt;range&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;k&lt;/span&gt;&lt;span class="p"&gt;)]&lt;/span&gt;

&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;run_model&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;prior&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;data&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
    &lt;span class="k"&gt;with&lt;/span&gt; &lt;span class="n"&gt;pm&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Model&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt; &lt;span class="k"&gt;as&lt;/span&gt; &lt;span class="n"&gt;model&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
        &lt;span class="n"&gt;vals&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nb"&gt;list&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;data&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;values&lt;/span&gt;&lt;span class="p"&gt;())&lt;/span&gt;
        &lt;span class="n"&gt;k&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nb"&gt;len&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;vals&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
        &lt;span class="n"&gt;ps&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;prior&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;k&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
        &lt;span class="n"&gt;multi&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;pm&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Multinomial&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;multi&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;n&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="nb"&gt;sum&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;vals&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt; &lt;span class="n"&gt;p&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;ps&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;observed&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;vals&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
        &lt;span class="n"&gt;trace&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;pm&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;sample&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;draws&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mi"&gt;10&lt;/span&gt;&lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="mi"&gt;1000&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;progressbar&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="bp"&gt;True&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
        &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;pm&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;trace&lt;/span&gt;

&lt;span class="n"&gt;pm&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;trace&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;run_model&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;uniform_prior&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;data&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/pre&gt;&lt;/div&gt;


&lt;p&gt;Okay, what's happening here? PyMC3 is an MCMC library: it computes representative samples of random variables based on some data. We specify a &lt;code&gt;model&lt;/code&gt;, which is essentially a way to compute a desired random variable (RV), ie. a probability distribution function (pdf), from other random variables. In this case, we say that &lt;code&gt;multi&lt;/code&gt; is a &lt;a href="https://en.wikipedia.org/wiki/Multinomial_distribution"&gt;multinomial RV&lt;/a&gt;, with the parameters of &lt;code&gt;multi&lt;/code&gt; themselves being RVs; these are &lt;code&gt;ps&lt;/code&gt; in the code. When we specify &lt;code&gt;multi&lt;/code&gt;, we pass in our observed values, so we "fix" this RV. The &lt;code&gt;ps&lt;/code&gt; RVs are constructed in the &lt;code&gt;prior&lt;/code&gt; helper function, named so because this is where we encode our prior belief about the food delivery business. To match the previous post, we don't assume any courier is more popular than the others; the prior is constructed so that the probabilities sum to 1.&lt;/p&gt;
&lt;p&gt;When we call &lt;code&gt;sample()&lt;/code&gt; on the model, Markov Chain Monte Carlo simulation is performed; the details of this are beyond the scope of this article. &lt;strong&gt;The important thing is that after the simulation is complete, &lt;code&gt;trace&lt;/code&gt; will contain a sample of all free (non-fixed) RVs, updated according to the observations passed to the fixed RVs&lt;/strong&gt;. Ie. when we specified these RVs, we gave our prior belief pdfs, $ p(H) $, whereas &lt;code&gt;trace&lt;/code&gt; now contains samples from the posterior pdf, ie. $ p(H|D) $. That's exactly what we worked for in the previous post!&lt;/p&gt;
&lt;p&gt;Given this sample that PyMC3 gives us, it's easy to calculate the probability of the hypothesis. We simply count the fraction of cases where the probability parameter of UberEats is the maximum in the &lt;code&gt;trace&lt;/code&gt;:&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;span class="n"&gt;hypothesis&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;trace&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;p2&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;][&lt;/span&gt;&lt;span class="n"&gt;i&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="o"&gt;==&lt;/span&gt; &lt;span class="nb"&gt;max&lt;/span&gt;&lt;span class="p"&gt;([&lt;/span&gt;&lt;span class="n"&gt;trace&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;p&lt;/span&gt;&lt;span class="si"&gt;%s&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;&lt;/span&gt; &lt;span class="o"&gt;%&lt;/span&gt; &lt;span class="n"&gt;j&lt;/span&gt;&lt;span class="p"&gt;][&lt;/span&gt;&lt;span class="n"&gt;i&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;j&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="nb"&gt;range&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;k&lt;/span&gt;&lt;span class="p"&gt;)])&lt;/span&gt; &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;i&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="nb"&gt;range&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nb"&gt;len&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;trace&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;p0&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;]))]&lt;/span&gt;
&lt;span class="nb"&gt;sum&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;hypothesis&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;/&lt;/span&gt; &lt;span class="nb"&gt;len&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;hypothesis&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/pre&gt;&lt;/div&gt;


&lt;p&gt;The program outputs the probability at 72%, like in the original post.&lt;/p&gt;
&lt;p&gt;PyMC3 has a few useful helper functions. One is &lt;code&gt;pm.summary(trace)&lt;/code&gt;, this outputs some statistics about the RVs:&lt;/p&gt;
&lt;p&gt;&lt;img src="/images/pymc-1.png" alt="PyMC3 summary()" style="width: 600px;"/&gt;&lt;/p&gt;
&lt;p&gt;The other is &lt;code&gt;pm.traceplot(trace)&lt;/code&gt;, this outputs the pdf and the random walk for each RV:&lt;/p&gt;
&lt;p&gt;&lt;img src="/images/pymc-2.png" alt="PyMC3 traceplot()" style="width: 600px;"/&gt;&lt;/p&gt;
&lt;p&gt;There is something weird here. Looking at the original observation data, half of the deliveries are from UberEats (16/32). The corresponding RV in the &lt;code&gt;summary()&lt;/code&gt; output is &lt;code&gt;p2&lt;/code&gt;. So why is the posterior mean for &lt;code&gt;p2&lt;/code&gt; 0.44 and not closer to 0.50? Intuitively, after looking at the data, shouldn't that be the maximum likelihood (ML) prediction? Let's think this through:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;the mean is not the ML, but looking at the posterior curve for &lt;code&gt;p2&lt;/code&gt;, the maximum of the curve (=ML) is also not at 0.5&lt;/li&gt;
&lt;li&gt;it could be that the simulation has not reached a stationary state (in Markov chain language); but looking at the random walk output, it has (and to double-check I ran it longer, this is not the problem)&lt;/li&gt;
&lt;li&gt;numerical error: this is not the reason&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;The reason is simple: it's our prior belief. Our prior belief was a uniform distribution. We observed some data and showed it to the model, and based on this our posterior belief was updated, &lt;strong&gt;but the prior belief is still baked in&lt;/strong&gt;. The way to think about this is, the prior pdf slowly morphs into the posterior pdf as data is fed to the model.&lt;/p&gt;
&lt;p&gt;One good way to get a feel for this is to imagine feeding a low observation count to the model. Suppose we feed it observations like &lt;code&gt;(1, 0, 2, 1)&lt;/code&gt;. Should we now believe that Talabat never gets orders? Clearly not.&lt;/p&gt;
&lt;p&gt;Another good way to get a feel for this is to feed the model more data, but keeping the same ratios, ie. multiplying each number by 10, like &lt;code&gt;(20, 20, 160, 120)&lt;/code&gt;. At such high observation counts, the posteriors will be very narrow, and close to the actual ratios. This is called flooding the priors: as the model observes a lot of data, the prior belief is less and less important. &lt;/p&gt;
&lt;p&gt;&lt;img src="/images/pymc-3.png" alt="PyMC3 summary()" style="width: 600px;"/&gt;&lt;/p&gt;
&lt;p&gt;Also the hypothesis probability of UberEats being the most popular will be 98.7%, or almost certain.&lt;/p&gt;
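The flooding experiment only requires scaling the observed counts and rerunning the model; a minimal sketch (repeating the data dict for completeness, and reusing `run_model` and `uniform_prior` from above):

```python
data = {'Carriage': 2, 'Talabat': 2, 'UberEats': 16, 'Deliveroo': 12}

# Same ratios, 10x the observations:
data_10x = {name: count * 10 for name, count in data.items()}
# pm, trace = run_model(uniform_prior, data_10x)  # rerun as before (needs pymc3)
```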
&lt;p&gt;PyMC3 also allows us to easily explore other prior beliefs. For example, we could specify a prior centered around the observed ratios. &lt;strong&gt;BAKING OBSERVATIONS INTO THE PRIOR IS A MODELLING MISTAKE, DON'T DO THIS.&lt;/strong&gt; I just used it as a debugging exercise to make sure that in such a case the posteriors are what I expect them to be, ie. that they remain centered on the observed ratios (yes). The evilness also shows in the function signature: passing the observed values to the prior shouldn't happen. But in any case, it shows how easy it is to change the priors with PyMC3:&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;beta_prior&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;vals&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;k&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
    &lt;span class="n"&gt;rs&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;pm&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Beta&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;raw&lt;/span&gt;&lt;span class="si"&gt;%s&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;&lt;/span&gt; &lt;span class="o"&gt;%&lt;/span&gt; &lt;span class="n"&gt;i&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;vals&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;i&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt; &lt;span class="nb"&gt;sum&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;vals&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;&lt;span class="o"&gt;-&lt;/span&gt;&lt;span class="n"&gt;vals&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;i&lt;/span&gt;&lt;span class="p"&gt;])&lt;/span&gt; &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;i&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="nb"&gt;range&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;k&lt;/span&gt;&lt;span class="p"&gt;)]&lt;/span&gt;
    &lt;span class="n"&gt;s&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nb"&gt;sum&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;rs&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;pm&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Deterministic&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;p&lt;/span&gt;&lt;span class="si"&gt;%s&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;&lt;/span&gt; &lt;span class="o"&gt;%&lt;/span&gt; &lt;span class="n"&gt;i&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;rs&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;i&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;&lt;span class="o"&gt;/&lt;/span&gt;&lt;span class="n"&gt;s&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;i&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="nb"&gt;range&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;k&lt;/span&gt;&lt;span class="p"&gt;)]&lt;/span&gt;

&lt;span class="n"&gt;pm&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;trace&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;run_model&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;beta_prior&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;data&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/pre&gt;&lt;/div&gt;


&lt;p&gt;Going back to the original problem, we can make our code shorter by using PyMC3's built-in &lt;a href="https://en.wikipedia.org/wiki/Dirichlet_distribution"&gt;Dirichlet distribution&lt;/a&gt;. The Dirichlet distribution can be parameterized so that it's the uniform distribution in n dimensions:&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;dirichlet_uniform_prior&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;k&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;pm&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Dirichlet&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;ps&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;a&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="n"&gt;np&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;ones&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;k&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;
&lt;span class="n"&gt;pm&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;trace&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;run_model&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;dirichlet_uniform_prior&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;data&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="n"&gt;hypothesis&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;ps&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="mi"&gt;2&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="o"&gt;==&lt;/span&gt; &lt;span class="nb"&gt;max&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;ps&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;ps&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;trace&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;ps&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;]]&lt;/span&gt;
&lt;span class="nb"&gt;sum&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;hypothesis&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;/&lt;/span&gt; &lt;span class="nb"&gt;len&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;hypothesis&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/pre&gt;&lt;/div&gt;


&lt;p&gt;Running this reveals something interesting. The probability of the hypothesis that UberEats is the most popular comes out to ~77%, which is about 5% more than before. What's going on here? It turns out that the prior we baked in before, although it didn't favor any courier over the others, did have some bias built in. We sampled each individual &lt;code&gt;raw&lt;/code&gt; RV uniformly between 0 and 1, and then divided by the &lt;code&gt;sum&lt;/code&gt;. Each &lt;code&gt;raw&lt;/code&gt;'s mean is 0.5, &lt;code&gt;sum&lt;/code&gt;'s mean is 2, so by symmetry each &lt;code&gt;ps&lt;/code&gt; component's mean is 0.25. By constructing the prior like this, we biased it towards &lt;code&gt;(0.25, 0.25, 0.25, 0.25)&lt;/code&gt;, ie. we said this is a more likely prior than eg. &lt;code&gt;(0.7, 0.1, 0.1, 0.1)&lt;/code&gt;. This can be seen by using MCMC to just sample:&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;span class="k"&gt;with&lt;/span&gt; &lt;span class="n"&gt;pm&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Model&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt; &lt;span class="k"&gt;as&lt;/span&gt; &lt;span class="n"&gt;model&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
    &lt;span class="n"&gt;k&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;4&lt;/span&gt;
    &lt;span class="n"&gt;rs&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;pm&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Uniform&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;raw&lt;/span&gt;&lt;span class="si"&gt;%s&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;&lt;/span&gt; &lt;span class="o"&gt;%&lt;/span&gt; &lt;span class="n"&gt;i&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;i&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="nb"&gt;range&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;k&lt;/span&gt;&lt;span class="p"&gt;)]&lt;/span&gt;
    &lt;span class="n"&gt;s&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;pm&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Deterministic&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;sum&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nb"&gt;sum&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;rs&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;
    &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;pm&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Deterministic&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;p&lt;/span&gt;&lt;span class="si"&gt;%s&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;&lt;/span&gt; &lt;span class="o"&gt;%&lt;/span&gt; &lt;span class="n"&gt;i&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;rs&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;i&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;&lt;span class="o"&gt;/&lt;/span&gt;&lt;span class="n"&gt;s&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;i&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="nb"&gt;range&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;k&lt;/span&gt;&lt;span class="p"&gt;)]&lt;/span&gt;
    &lt;span class="n"&gt;trace&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;pm&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;sample&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;draws&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mi"&gt;10&lt;/span&gt;&lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="mi"&gt;1000&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;progressbar&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="bp"&gt;True&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="n"&gt;pm&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;traceplot&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;trace&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/pre&gt;&lt;/div&gt;


&lt;p&gt;&lt;img src="/images/pymc-5.png" alt="PyMC3 traceplot()" style="width: 600px;"/&gt;&lt;/p&gt;
&lt;p&gt;Looking at the &lt;code&gt;sum&lt;/code&gt;:&lt;/p&gt;
&lt;p&gt;&lt;img src="/images/pymc-6.png" alt="PyMC3 traceplot()" style="width: 600px;"/&gt;&lt;/p&gt;
&lt;p&gt;Versus the Dirichlet distribution (&lt;a href="http://jessicastringham.net/2018/05/25/quick-example-of-dirichlet-distribution.html"&gt;see this blog post for more&lt;/a&gt;):&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;span class="k"&gt;with&lt;/span&gt; &lt;span class="n"&gt;pm&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Model&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt; &lt;span class="k"&gt;as&lt;/span&gt; &lt;span class="n"&gt;model&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
    &lt;span class="n"&gt;k&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;4&lt;/span&gt;
    &lt;span class="n"&gt;pm&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Dirichlet&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;ps&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;a&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="n"&gt;np&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;ones&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;k&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;
    &lt;span class="n"&gt;trace&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;pm&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;sample&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;draws&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mi"&gt;10&lt;/span&gt;&lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="mi"&gt;1000&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;progressbar&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="bp"&gt;True&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="n"&gt;pm&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;traceplot&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;trace&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/pre&gt;&lt;/div&gt;


&lt;p&gt;&lt;img src="/images/pymc-4.png" alt="PyMC3 traceplot()" style="width: 600px;"/&gt;&lt;/p&gt;
&lt;p&gt;I would say the original post is buggy, because I intended to make no prior assumption (=use the "uninformative prior"), so the Dirichlet distribution (at 𝛼=1) is the right prior.&lt;/p&gt;</content><category term="python"></category><category term="math"></category><category term="pymc3"></category></entry><entry><title>A/B tests: Moving Fast vs Being Sure</title><link href="https://bytepawn.com/ab-tests-moving-fast-vs-being-sure.html" rel="alternate"></link><published>2019-07-01T00:00:00+02:00</published><updated>2019-07-01T00:00:00+02:00</updated><author><name>Marton Trencseni</name></author><id>tag:bytepawn.com,2019-07-01:/ab-tests-moving-fast-vs-being-sure.html</id><summary type="html">&lt;p&gt;Most A/B testing tools default to α=0.05, meaning the expected false positive rate is 5%. In this post I explore the trade-offs between moving fast, ie. using higher α, versus being sure, ie. using lower α.&lt;br/&gt;&lt;br/&gt;&lt;img src="/images/ab-testing-base.png" alt="14. slide" style="width: 400px;"/&gt;&lt;/p&gt;</summary><content type="html">&lt;p&gt;The basic flow of an A/B test (in a frequentist setting) is roughly:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;formulate a hypothesis ("sending additional notifications will cause people to be available for deliveries")&lt;/li&gt;
&lt;li&gt;select a target metric ("Delivery Performance = Deliveries/Dispatches") and specify the base value ("75%")&lt;/li&gt;
&lt;li&gt;estimate the lift on the target metric ("1%")&lt;/li&gt;
&lt;li&gt;use an &lt;a href="https://www.evanmiller.org/ab-testing/sample-size.html"&gt;off-the-shelf A/B testing tool to figure out how many N samples you will need&lt;/a&gt;, given the base metric value and expected lift&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;&lt;img src="/images/ab-testing-evanmiller-sample-size.png" alt="Sample size" style="width: 650px;"/&gt;&lt;/p&gt;
&lt;p&gt;At this point, in the A/B testing tool, there are 2 magic parameters that the Data Scientist has to set for the Chi-squared test:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;&lt;strong&gt;α:&lt;/strong&gt;   also called the &lt;strong&gt;significance level&lt;/strong&gt;, the probability that the experiment goes off even though the effect is actually not there (&lt;strong&gt;False Positive Rate, FPR&lt;/strong&gt;)&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;1-β:&lt;/strong&gt; also called &lt;strong&gt;power&lt;/strong&gt;, the probability that, if the effect is there, the experiment will go off (&lt;strong&gt;True Positive Rate, TPR&lt;/strong&gt;)&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;Most tools default to α=0.05 and 1-β=0.8, which are sensible defaults. With these, the A/B testing tool will tell the Data Scientist how many N samples she needs to collect per variant (control and test) to be able to detect the given lift (1%) from the base value (75%), with the specified statistics (FPR, TPR).&lt;/p&gt;
&lt;p&gt;By pushing down the FPR, we can increase our confidence in our results. More power will enable us to catch more working experiments. But there is no free lunch: we will need to collect more N samples for this. So what is a good trade-off here? Almost all tools default to the values above, but there is nothing special about them.&lt;/p&gt;
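To make the no-free-lunch concrete, here is a sketch of the usual normal-approximation sample size formula for comparing two proportions (an approximation of what tools like Evan Miller's compute, not their exact code):

```python
from scipy.stats import norm

def sample_size(p1, p2, alpha, power=0.8):
    # per-variant sample size for detecting a lift from p1 to p2
    # with a two-sided test at significance alpha and given power
    z_alpha = norm.ppf(1 - alpha / 2)
    z_beta = norm.ppf(power)
    p_bar = (p1 + p2) / 2
    numerator = (z_alpha * (2 * p_bar * (1 - p_bar)) ** 0.5 +
                 z_beta * (p1 * (1 - p1) + p2 * (1 - p2)) ** 0.5) ** 2
    return numerator / (p2 - p1) ** 2

# base metric 75%, expected lift +1%
for alpha in [0.05, 0.10, 0.20]:
    print(alpha, round(sample_size(0.75, 0.76, alpha)))
```

Going from α=0.05 to α=0.20 cuts the required N per variant to roughly 60% of its original value, which is the entire point of the trade-off.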
&lt;p&gt;The reason I started thinking about this is that in a startup setting, when there are low volumes [for the subset we're testing, eg. a specific city for a delivery company], it takes many days to collect the N samples, which slows down the product development velocity. This is essentially a management concern: if A/B testing is seen by management to be a burden that slows things down, then it won't happen, which is a net loss. Often I feel that it's probably fine to run at α=0.1 or even α=0.2, or, in other words, to accept higher p-values as statistically significant; being wrong 20% of the time is not too terrible in a startup setting when no lives are at stake (ie. we're not experimenting with new drugs).&lt;/p&gt;
&lt;p&gt;So let's try to quantify this with a &lt;strong&gt;toy model&lt;/strong&gt;:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;We continue with the assumptions above: base metric = 75%, expected lift = +1%.&lt;/li&gt;
&lt;li&gt;Let's look at three different scenarios, α=0.05, α=0.10 and α=0.20.&lt;/li&gt;
&lt;li&gt;We use &lt;a href="https://www.evanmiller.org/ab-testing/sample-size.html"&gt;Evan Miller's A/B testing sample size tool&lt;/a&gt; to read off the N samples required.&lt;/li&gt;
&lt;li&gt;At higher α, we need less N, so we can run more experiments; that's the point (but we will get more false positives, FPs).&lt;/li&gt;
&lt;li&gt;We continuously run experiments for 365 days.&lt;/li&gt;
&lt;li&gt;Let's assume that 1 in 4 experiments actually yield a hit, which results in the desired +1% lift.&lt;/li&gt;
&lt;li&gt;Let's estimate the $ value of a hit at $100,000 (we realize this on true positives, TPs),&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;Let's estimate the cost of a rollout at $25,000 (we incur this cost on both TP and FP hits).&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;This is what the three scenarios look like, annualized for easy readability, &lt;a href="https://docs.google.com/spreadsheets/d/1thsMPiUAd4WYbxZI4cVB1zyZqcYW9lCZHqPrR9X3gFQ"&gt;the spreadsheet is here&lt;/a&gt;:&lt;/p&gt;
&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;&lt;img src="/images/ab-testing-base.png" alt="A/B testing base case" style="width: 500px;"/&gt;&lt;/p&gt;
&lt;p&gt;In this base scenario, with these parameters, running at high α is better. It yields more experiments, because each experiment needs a smaller sample size, so we will find more hits, and even though we incur more false positives, it's still worth it. As we increase the cost of the rollout, this benefit goes away, as the cost of rolling out the FPs eats up the value generated by the TPs. &lt;strong&gt;Having said that, I think this scenario applies in many startups: moving fast over being sure is worth it&lt;/strong&gt;.&lt;/p&gt;
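The base-case logic can be sketched in a few lines of Python; the daily sample volume and per-scenario sample sizes below are illustrative assumptions, not the original spreadsheet's exact inputs:

```python
# Hypothetical re-creation of the spreadsheet's base-case logic; the
# daily sample volume and per-scenario sample sizes are assumptions,
# not the original spreadsheet's exact inputs.
def annual_value(n_per_variant, daily_volume, alpha, power=0.8,
                 hit_rate=0.25, hit_value=100_000, rollout_cost=25_000):
    days_per_experiment = 2 * n_per_variant / daily_volume
    experiments = 365 / days_per_experiment
    true_positives = experiments * hit_rate * power          # hits we detect
    false_positives = experiments * (1 - hit_rate) * alpha   # duds we roll out
    rollouts = true_positives + false_positives
    return true_positives * hit_value - rollouts * rollout_cost

# sample sizes roughly read off a sample size calculator for 75% to 76%
print(annual_value(29_000, daily_volume=2_000, alpha=0.05))
print(annual_value(17_000, daily_volume=2_000, alpha=0.20))
```

With these assumed numbers the α=0.20 scenario comes out ahead, matching the base-case table: more experiments per year outweigh the extra false-positive rollouts.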
&lt;p&gt;Let's add one parameter to the toy model. Suppose there is a cost to running an experiment, $10,000/experiment. This essentially means that there is a cost to "moving fast" (new parameters highlighted in blue):&lt;/p&gt;
&lt;p&gt;&lt;img src="/images/ab-testing-with-cost.png" alt="A/B testing with cost" style="width: 500px;"/&gt;&lt;/p&gt;
&lt;p&gt;In this scenario, α=0.05 is the better choice. When we move fast at higher αs, we lose more on running and rolling out our FPs than we gain by finding more TPs.&lt;/p&gt;
&lt;p&gt;Let's try another one, where we add a loss, ie. 50% of our false positives actually make things worse, and in this case we incur a loss of $50,000 (new parameters highlighted in blue):&lt;/p&gt;
&lt;p&gt;&lt;img src="/images/ab-testing-with-loss.png" alt="A/B testing with loss" style="width: 500px;"/&gt;&lt;/p&gt;
&lt;p&gt;In this case, α=0.10 is the better choice, but it's pretty close.&lt;/p&gt;
&lt;p&gt;Let's combine the two, and have both cost and loss:&lt;/p&gt;
&lt;p&gt;&lt;img src="/images/ab-testing-with-both.png" alt="A/B testing with both" style="width: 500px;"/&gt;&lt;/p&gt;
&lt;p&gt;In this scenario, we're better off being sure and running at α=0.05, running at α=0.20 would actually cause the startup to lose money (not to mention the salary of the Data Scientist!).&lt;/p&gt;
&lt;p&gt;As these simple toy models show, it does make sense to increase α, the accepted false positive rate, in some situations. For example, at my current company the base case is the closest to reality, except the value of a hit is even higher and the rollout cost is even lower (and the likelihood of a loss on a FP is low: sending more notifications is unlikely to reduce availability for delivery when our driver shows up).&lt;/p&gt;
&lt;p&gt;Having said that, there are valid counter-arguments to running at high α (accepting higher p-values):&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;less certainty (higher FPR) could diminish the organizational belief in experimental results and A/B testing&lt;/li&gt;
&lt;li&gt;can a startup come up with enough experiments to justify a higher α?&lt;/li&gt;
&lt;li&gt;can a startup roll out all those (TP and FP) hits, is the velocity of the Product and Engineering team high enough?&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;Fortunately, building a &lt;a href="https://docs.google.com/spreadsheets/d/1thsMPiUAd4WYbxZI4cVB1zyZqcYW9lCZHqPrR9X3gFQ"&gt;spreadsheet like this&lt;/a&gt; takes less than an hour, so everybody can make up their own mind, weigh the pros and cons of statistical and management aspects, and find a trade-off in (α, β) parameter space which works for them!&lt;/p&gt;</content><category term="ab-testing"></category><category term="fetchr"></category></entry><entry><title>Food deliveries, Bayes and Computational Statistics</title><link href="https://bytepawn.com/food-deliveries-bayes-and-computational-statistics.html" rel="alternate"></link><published>2019-06-22T00:00:00+02:00</published><updated>2019-06-22T00:00:00+02:00</updated><author><name>Marton Trencseni</name></author><id>tag:bytepawn.com,2019-06-22:/food-deliveries-bayes-and-computational-statistics.html</id><summary type="html">&lt;p&gt;I was grabbing a burger at &lt;a href="https://www.shakeshack.com/location/dubai-moe/"&gt;Shake Shack&lt;/a&gt;, &lt;a href="http://www.malloftheemirates.com/"&gt;Mall of the Emirates&lt;/a&gt; in Dubai, when I noticed this notebook on the counter. The staff is using it to track food deliveries and each service (Carriage, Talabat, UberEats, Deliveroo) has its own column with the order numbers. Let's assume this is the only page for the day, and ask ourselves: &lt;em&gt;given this data, what is the probability that UberEats is the most popular food delivery service?&lt;/em&gt;. &lt;br/&gt;&lt;br/&gt; &lt;img src="/images/shake-shack.jpg" alt="Shake shack food deliveries" style="width: 400px;"/&gt;&lt;/p&gt;</summary><content type="html">&lt;h2&gt;Introduction&lt;/h2&gt;
&lt;p&gt;I was grabbing a burger at &lt;a href="https://www.shakeshack.com/location/dubai-moe/"&gt;Shake Shack&lt;/a&gt;, &lt;a href="http://www.malloftheemirates.com/"&gt;Mall of the Emirates&lt;/a&gt; in Dubai, when I noticed this notebook on the counter. The staff is using it to track food deliveries, and each service &lt;strong&gt;(Carriage, Talabat, UberEats, Deliveroo)&lt;/strong&gt; has its own column with the order numbers. Let's assume this is the only page for the day, and ask ourselves: &lt;em&gt;given this data, what is the probability that UberEats is the most popular food delivery service?&lt;/em&gt;. &lt;a href="https://github.com/mtrencseni/playground/blob/master/UberEats.ipynb"&gt;The ipython notebook is up on Github.&lt;/a&gt;&lt;/p&gt;
&lt;p&gt;&lt;img src="/images/shake-shack.jpg" alt="Shake shack food deliveries" style="width: 400px;"/&gt;&lt;/p&gt;
&lt;p&gt;This is a good challenge to exercise our Bayesian reasoning. &lt;a href="https://en.wikipedia.org/wiki/Bayes%27_theorem"&gt;Bayes' theorem&lt;/a&gt; states that $ P(A|B)P(B) = P(A,B) = P(B|A)P(A) $, where $ P(A|B) $ is the conditional probability of A given B, and $ P(A, B) $ is the joint probability of A and B co-occurring. All we have to do is replace A with &lt;strong&gt;H for Hypothesis&lt;/strong&gt; and B with &lt;strong&gt;D for Data&lt;/strong&gt;, and we get our formula for attacking the challenge: $ P(H|D) = \frac { P(D|H)P(H) }{ P(D) } $. Here the &lt;strong&gt;Hypothesis&lt;/strong&gt; is that UberEats is the most popular service, and the &lt;strong&gt;Data&lt;/strong&gt; is the observed counts (2, 2, 16, 14) in the notebook.&lt;/p&gt;
&lt;h2&gt;Simulation&lt;/h2&gt;
&lt;p&gt;First, let's do a brute-force simulation. Our model is that there are 4 services, each service has an associated probability, and the probabilities sum to 1. When a new order arrives at Shake Shack, the service is drawn according to this model. A single simulation run---given a model (so, 4 probabilities, one per service, summing to 1)---consists of 2 + 2 + 16 + 14 = 34 draws, and the outcome is the distribution between the services, eg. (10, 10, 10, 4). In mathematical terms, these draws follow a multinomial distribution; &lt;code&gt;numpy&lt;/code&gt; has a &lt;code&gt;multinomial&lt;/code&gt; function, so we will use that. So our outer loop looks something like:&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="nn"&gt;random&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;random&lt;/span&gt;
&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="nn"&gt;numpy.random&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;multinomial&lt;/span&gt;

&lt;span class="n"&gt;data&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="s1"&gt;&amp;#39;Carriage&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;  &lt;span class="mi"&gt;2&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="s1"&gt;&amp;#39;Talabat&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;   &lt;span class="mi"&gt;2&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="s1"&gt;&amp;#39;UberEats&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;  &lt;span class="mi"&gt;16&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="s1"&gt;&amp;#39;Deliveroo&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="mi"&gt;12&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;

&lt;span class="n"&gt;n_models&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;1000&lt;/span&gt;

&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;random_model&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;k&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
    &lt;span class="n"&gt;ps&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;random&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt; &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;_&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="nb"&gt;range&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;k&lt;/span&gt;&lt;span class="p"&gt;)]&lt;/span&gt;
    &lt;span class="n"&gt;s&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nb"&gt;sum&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;ps&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;p&lt;/span&gt;&lt;span class="o"&gt;/&lt;/span&gt;&lt;span class="n"&gt;s&lt;/span&gt; &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;p&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;ps&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;

&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;random_simulation&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;ps&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;n&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="nb"&gt;list&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;multinomial&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;n&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;ps&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;

&lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;i&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="nb"&gt;range&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;n_models&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
    &lt;span class="n"&gt;model&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;random_model&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nb"&gt;len&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;data&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;keys&lt;/span&gt;&lt;span class="p"&gt;()))&lt;/span&gt;
    &lt;span class="c1"&gt;# call random_simulation(model, sum(data.values())) and count things&lt;/span&gt;
&lt;/pre&gt;&lt;/div&gt;


&lt;p&gt;Since this is a simulation, we had to introduce &lt;code&gt;n_models&lt;/code&gt;, the number of random models we sample (the ensemble size). Now we just have to implement the $ P(H|D) $ formula. This is simple. $ P(D|H) $ is the fraction of &lt;code&gt;random_simulation&lt;/code&gt; cases that match the data, when the model matches the hypothesis (see next sentence). $ P(H) $ is the fraction of models (!) that match the hypothesis of UberEats being the most popular, ie. that probability being the highest. $ P(D) $ is the overall fraction of cases when &lt;code&gt;random_simulation&lt;/code&gt; matches the data. First, some helper functions:&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;same_as_data&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;outcome&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="nb"&gt;all&lt;/span&gt;&lt;span class="p"&gt;([&lt;/span&gt;&lt;span class="n"&gt;i&lt;/span&gt; &lt;span class="o"&gt;==&lt;/span&gt; &lt;span class="n"&gt;j&lt;/span&gt; &lt;span class="k"&gt;for&lt;/span&gt;  &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;i&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;j&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="nb"&gt;zip&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;outcome&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nb"&gt;list&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;data&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;values&lt;/span&gt;&lt;span class="p"&gt;()))])&lt;/span&gt;

&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;model_satisfies_hypothesis&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;ps&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="nb"&gt;max&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;ps&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;==&lt;/span&gt; &lt;span class="n"&gt;ps&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="mi"&gt;2&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="c1"&gt;# UberEats&lt;/span&gt;
&lt;/pre&gt;&lt;/div&gt;


&lt;p&gt;We have to introduce another parameter, the number of trial runs per model, which we'll call &lt;code&gt;n_simulations&lt;/code&gt;. The simulation is now (note that this is actually two loops, the list expression for &lt;code&gt;hits&lt;/code&gt; is a loop):&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;span class="n"&gt;n_models&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;1000&lt;/span&gt;
&lt;span class="n"&gt;n_simulations&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;1000&lt;/span&gt;&lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="mi"&gt;1000&lt;/span&gt;

&lt;span class="n"&gt;hits_total&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;
&lt;span class="n"&gt;hits_hypothesis&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;
&lt;span class="n"&gt;num_model_satisfies_hypothesis&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;
&lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;i&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="nb"&gt;range&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;n_models&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
    &lt;span class="n"&gt;model&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;random_model&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nb"&gt;len&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;data&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;keys&lt;/span&gt;&lt;span class="p"&gt;()))&lt;/span&gt;
    &lt;span class="n"&gt;hits&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nb"&gt;sum&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
        &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;same_as_data&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;random_simulation&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;model&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nb"&gt;sum&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;data&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;values&lt;/span&gt;&lt;span class="p"&gt;())))&lt;/span&gt; &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;_&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="nb"&gt;range&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;n_simulations&lt;/span&gt;&lt;span class="p"&gt;)]&lt;/span&gt;
    &lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="n"&gt;hits_total&lt;/span&gt; &lt;span class="o"&gt;+=&lt;/span&gt; &lt;span class="n"&gt;hits&lt;/span&gt;
    &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;model_satisfies_hypothesis&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;model&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
        &lt;span class="n"&gt;hits_hypothesis&lt;/span&gt; &lt;span class="o"&gt;+=&lt;/span&gt; &lt;span class="n"&gt;hits&lt;/span&gt;
        &lt;span class="n"&gt;num_model_satisfies_hypothesis&lt;/span&gt; &lt;span class="o"&gt;+=&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt;
&lt;/pre&gt;&lt;/div&gt;


&lt;p&gt;In terms of these variables:&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;span class="n"&gt;p_data_given_hypothesis&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;hits_hypothesis&lt;/span&gt; &lt;span class="o"&gt;/&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;num_model_satisfies_hypothesis&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt; &lt;span class="n"&gt;n_simulations&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="n"&gt;p_hypothesis&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;num_model_satisfies_hypothesis&lt;/span&gt; &lt;span class="o"&gt;/&lt;/span&gt; &lt;span class="n"&gt;n_models&lt;/span&gt;
&lt;span class="n"&gt;p_data&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;hits_total&lt;/span&gt; &lt;span class="o"&gt;/&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;n_models&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt; &lt;span class="n"&gt;n_simulations&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="c1"&gt;# p_hypothesis_given_data = p_data_given_hypothesis * p_hypothesis / p_data&lt;/span&gt;
&lt;span class="c1"&gt;# but, for better numerics, the above simplifies to:&lt;/span&gt;
&lt;span class="n"&gt;p_hypothesis_given_data&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;hits_hypothesis&lt;/span&gt;&lt;span class="o"&gt;/&lt;/span&gt;&lt;span class="n"&gt;hits_total&lt;/span&gt;
&lt;/pre&gt;&lt;/div&gt;
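&lt;p&gt;To see why the simplification holds, substitute the definitions: the factors &lt;code&gt;num_model_satisfies_hypothesis&lt;/code&gt;, &lt;code&gt;n_models&lt;/code&gt; and &lt;code&gt;n_simulations&lt;/code&gt; all cancel, leaving &lt;code&gt;hits_hypothesis/hits_total&lt;/code&gt;. A quick numeric sanity check of this cancellation, with made-up counts (the values below are hypothetical, not from the simulation):&lt;/p&gt;

```python
# hypothetical counts, just to check the algebraic cancellation
hits_hypothesis, hits_total = 37, 120
num_model_satisfies_hypothesis, n_models, n_simulations = 250, 1000, 500

p_data_given_hypothesis = hits_hypothesis / (num_model_satisfies_hypothesis * n_simulations)
p_hypothesis = num_model_satisfies_hypothesis / n_models
p_data = hits_total / (n_models * n_simulations)

lhs = p_data_given_hypothesis * p_hypothesis / p_data  # full Bayes formula
rhs = hits_hypothesis / hits_total                     # simplified form
assert abs(lhs - rhs) < 1e-12
```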


&lt;p&gt;That's it! We just have to run this simulation, and it will tell us our desired probability. Unfortunately, this amounts to a lot of &lt;code&gt;random()&lt;/code&gt; calls, and on my Macbook it takes a long time to run. So I modified it a bit to update and display the estimate &lt;code&gt;hits_hypothesis/hits_total&lt;/code&gt; as the simulation runs. This is what it looks like:&lt;/p&gt;
&lt;p&gt;&lt;img src="/images/food-bayes-simu.png" alt="Food delivereis simulation" style="width: 600px;"/&gt;&lt;/p&gt;
&lt;p&gt;Obviously it takes a long time for the simulation to converge. In fact, the &lt;code&gt;p_est&lt;/code&gt; shown in the screenshot above is not correct; the simulation has not yet converged. The reason is that the simulation is sparse: in most cases there are no hits, so we waste a lot of cycles not updating our estimate &lt;code&gt;hits_hypothesis/hits_total&lt;/code&gt;.&lt;/p&gt;
&lt;h2&gt;Monte Carlo integration&lt;/h2&gt;
&lt;p&gt;The above code is a good start, but it's terribly inefficient. There's actually no reason to brute-force the simulation itself, since there is an &lt;a href="https://en.wikipedia.org/wiki/Multinomial_distribution#Probability_mass_function"&gt;explicit formula for the multinomial probability&lt;/a&gt; of drawing $ k_i $ given $ p_i $ probabilities. The package &lt;code&gt;scipy&lt;/code&gt; implements it; it's also called &lt;code&gt;multinomial&lt;/code&gt; (like the &lt;code&gt;numpy&lt;/code&gt; one), but this is the explicit probability mass function, not a random draw. All we have to do is replace one line, the &lt;code&gt;hits&lt;/code&gt; calculation, to compute the exact probability of observing the data. This changes the meaning of the variables (the &lt;code&gt;hits&lt;/code&gt; variables are no longer integer counts, they're summed fractional probabilities), but the ratios are the same. So the simulation becomes:&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="nn"&gt;random&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;random&lt;/span&gt;
&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="nn"&gt;scipy.stats&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;multinomial&lt;/span&gt;

&lt;span class="n"&gt;data&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="s1"&gt;&amp;#39;Carriage&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;  &lt;span class="mi"&gt;2&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="s1"&gt;&amp;#39;Talabat&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;   &lt;span class="mi"&gt;2&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="s1"&gt;&amp;#39;UberEats&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;  &lt;span class="mi"&gt;16&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="s1"&gt;&amp;#39;Deliveroo&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="mi"&gt;12&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;

&lt;span class="n"&gt;n_models&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;100&lt;/span&gt;&lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="mi"&gt;1000&lt;/span&gt;

&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;random_model&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;k&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
    &lt;span class="n"&gt;ps&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;random&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt; &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;_&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="nb"&gt;range&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;k&lt;/span&gt;&lt;span class="p"&gt;)]&lt;/span&gt;
    &lt;span class="n"&gt;s&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nb"&gt;sum&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;ps&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;p&lt;/span&gt;&lt;span class="o"&gt;/&lt;/span&gt;&lt;span class="n"&gt;s&lt;/span&gt; &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;p&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;ps&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;

&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;same_as_data&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;outcome&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="nb"&gt;all&lt;/span&gt;&lt;span class="p"&gt;([&lt;/span&gt;&lt;span class="n"&gt;i&lt;/span&gt; &lt;span class="o"&gt;==&lt;/span&gt; &lt;span class="n"&gt;j&lt;/span&gt; &lt;span class="k"&gt;for&lt;/span&gt;  &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;i&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;j&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="nb"&gt;zip&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;outcome&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nb"&gt;list&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;data&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;values&lt;/span&gt;&lt;span class="p"&gt;()))])&lt;/span&gt;

&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;model_satisfies_hypothesis&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;ps&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="nb"&gt;max&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;ps&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;==&lt;/span&gt; &lt;span class="n"&gt;ps&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="mi"&gt;2&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="c1"&gt;# UberEats&lt;/span&gt;

&lt;span class="n"&gt;hits_total&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;
&lt;span class="n"&gt;hits_hypothesis&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;
&lt;span class="n"&gt;num_model_satisfies_hypothesis&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;
&lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;i&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="nb"&gt;range&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;n_models&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
    &lt;span class="n"&gt;model&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;random_model&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nb"&gt;len&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;data&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;keys&lt;/span&gt;&lt;span class="p"&gt;()))&lt;/span&gt;
    &lt;span class="n"&gt;hits&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;multinomial&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nb"&gt;sum&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;data&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;values&lt;/span&gt;&lt;span class="p"&gt;()),&lt;/span&gt; &lt;span class="n"&gt;model&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;pmf&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nb"&gt;list&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;data&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;values&lt;/span&gt;&lt;span class="p"&gt;()))&lt;/span&gt; &lt;span class="c1"&gt;# &amp;lt;-- this changed&lt;/span&gt;
    &lt;span class="n"&gt;hits_total&lt;/span&gt; &lt;span class="o"&gt;+=&lt;/span&gt; &lt;span class="n"&gt;hits&lt;/span&gt;
    &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;model_satisfies_hypothesis&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;model&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
        &lt;span class="n"&gt;hits_hypothesis&lt;/span&gt; &lt;span class="o"&gt;+=&lt;/span&gt; &lt;span class="n"&gt;hits&lt;/span&gt;
        &lt;span class="n"&gt;num_model_satisfies_hypothesis&lt;/span&gt; &lt;span class="o"&gt;+=&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt;
&lt;span class="n"&gt;p_hypothesis_given_data&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;hits_hypothesis&lt;/span&gt;&lt;span class="o"&gt;/&lt;/span&gt;&lt;span class="n"&gt;hits_total&lt;/span&gt;
&lt;span class="k"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;p_hypothesis_given_data&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/pre&gt;&lt;/div&gt;


&lt;p&gt;Note that we no longer need the variable &lt;code&gt;n_simulations&lt;/code&gt;. This version is lightning fast, converges in a few seconds, and quickly tells us that the probability we're looking for is 72%. &lt;strong&gt;Given what I saw on the notebook, the probability that UberEats is their most popular delivery service is ~72%.&lt;/strong&gt; Given this code, it's easy to check another hypothesis. What's the probability that Deliveroo is the most popular? Just change &lt;code&gt;model_satisfies_hypothesis&lt;/code&gt;, run it again, and it's ~28%. The other two have almost zero chance of actually being the most popular. In this whole discussion I'm assuming a uniform prior, i.e. I have no prior bias towards believing one service is more popular than the rest.&lt;/p&gt;
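&lt;p&gt;For example, checking a different service only requires swapping in a different check function (a minimal sketch; the function names below are mine, and the index follows the key order of the &lt;code&gt;data&lt;/code&gt; dict):&lt;/p&gt;

```python
# Hypothetical variants of model_satisfies_hypothesis; the index into ps
# follows the key order of the data dict: Carriage, Talabat, UberEats, Deliveroo.
def deliveroo_is_most_popular(ps):
    return max(ps) == ps[3]  # Deliveroo

def carriage_is_most_popular(ps):
    return max(ps) == ps[0]  # Carriage

# example: a model where Deliveroo has the highest probability
print(deliveroo_is_most_popular([0.1, 0.2, 0.3, 0.4]))  # True
```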
&lt;p&gt;&lt;strong&gt;Note: this post has a subtle bug in the prior assumption. For more, see this &lt;a href="/exploring-prior-beliefs-with-mcmc.html"&gt;subsequent post&lt;/a&gt;. The prior, as implemented above, does not favor one courier over another, but it is not in fact uniform in p-space.&lt;/strong&gt;&lt;/p&gt;
&lt;p&gt;Before moving on, here's a screenshot of this version before it converges. It shows the initial oscillations, which are what we were observing with the brute-force simulation (note that it eventually converges to 0.72):&lt;/p&gt;
&lt;p&gt;&lt;img src="/images/food-bayes-simu2.png" alt="Food delivereis simulation" style="width: 600px;"/&gt;&lt;/p&gt;
&lt;h2&gt;Integrals&lt;/h2&gt;
&lt;p&gt;In case you remember college calculus, what we're doing above is a &lt;a href="https://en.wikipedia.org/wiki/Monte_Carlo_method"&gt;Monte Carlo (MC) integration&lt;/a&gt; of:&lt;/p&gt;
&lt;p&gt;$$ \frac { \int_{\sum p_i = 1 \wedge p_3 = max(p_i)} P(x_i=k_i) dp_i }{ \int_{\sum p_i = 1} P(x_i=k_i) dp_i } $$&lt;/p&gt;
&lt;p&gt;where:&lt;/p&gt;
&lt;p&gt;$$ P(x_i = k_i) = \frac {N!}{k_1! ... k_s!} p_1^{k_1} ... p_s^{k_s} $$&lt;/p&gt;
&lt;p&gt;is the multinomial probability mass function evaluated at&lt;/p&gt;
&lt;p&gt;$$ k_i = (2, 2, 16, 12), s = 4, i = 1.. 4, N = \sum k_i = 32.$$&lt;/p&gt;
&lt;p&gt;It's called Monte Carlo because instead of computing the integral explicitly, we're taking random samples in p-space and using them to estimate it. The $ P() $ function above can be integrated (it's just a polynomial), and the whole integral can (I think) be explicitly calculated with pen and paper (or Mathematica); the integration domains in both numerator and denominator lie on a hyperplane in 4D p-space, with the numerator being trickier because of the $ max() $ constraint. I will let the reader do this as homework :)&lt;/p&gt;
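&lt;p&gt;As a cross-check (my own sketch, not from the original code), the same Monte Carlo integration can be vectorized with numpy's Dirichlet sampler. Note that Dirichlet(1,1,1,1) draws genuinely uniformly from the probability simplex, which is a slightly different prior than the normalization trick used above (see the prior caveat earlier), so the estimate will differ somewhat:&lt;/p&gt;

```python
import numpy as np
from math import factorial, prod

rng = np.random.default_rng(0)
counts = np.array([2, 2, 16, 12])  # Carriage, Talabat, UberEats, Deliveroo
n = counts.sum()

# Dirichlet(1,1,1,1) is the uniform distribution on the probability simplex
ps = rng.dirichlet(np.ones(4), size=100_000)

# multinomial pmf evaluated at the observed counts, for each sampled model
coef = factorial(n) // prod(factorial(int(k)) for k in counts)
likes = coef * np.prod(ps ** counts, axis=1)

# models where UberEats (index 2) is the most popular
mask = ps[:, 2] == ps.max(axis=1)
p_hypothesis_given_data = likes[mask].sum() / likes.sum()
print(p_hypothesis_given_data)
```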
&lt;p&gt;If you're having trouble understanding this post, check out the free book &lt;a href="https://greenteapress.com/wp/think-bayes/"&gt;Think Bayes&lt;/a&gt;.&lt;/p&gt;
&lt;p&gt;&lt;img src="/images/thinkbayes.png" alt="Think Bayes book" style="width: 300px;"/&gt;&lt;/p&gt;</content><category term="python"></category><category term="math"></category><category term="fetchr"></category></entry><entry><title>The Collatz conjecture</title><link href="https://bytepawn.com/the-collatz-conjecture.html" rel="alternate"></link><published>2019-06-02T00:00:00+02:00</published><updated>2019-06-02T00:00:00+02:00</updated><author><name>Marton Trencseni</name></author><id>tag:bytepawn.com,2019-06-02:/the-collatz-conjecture.html</id><summary type="html">&lt;p&gt;The Collatz conjecture is a conjecture in mathematics that concerns a sequence defined as follows: start with any positive integer n. Then each term is obtained from the previous term as follows: if the previous term is even, the next term is one half the previous term. If the previous term is odd, the next term is 3 times the previous term plus 1. The conjecture is that no matter what value of n, the sequence will always reach 1. &lt;br/&gt;&lt;br/&gt; &lt;img src="/images/collatz.png" alt="Collatz" style="width: 300px;"/&gt;&lt;/p&gt;</summary><content type="html">&lt;h2&gt;Introduction&lt;/h2&gt;
&lt;p&gt;I came across the Collatz conjecture reading the book &lt;a href="https://www.amazon.com/Model-Thinker-What-Need-Know/dp/0465094627"&gt;The Model Thinker&lt;/a&gt;. I don't particularly recommend the book: 80% of it is topics science/engineering students learn at college (e.g. entropy, the normal distribution). But 20% is interesting tidbits and references I hadn't heard of. One was the &lt;a href="https://en.wikipedia.org/wiki/Collatz_conjecture"&gt;Collatz conjecture. From wikipedia&lt;/a&gt;:&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;The Collatz conjecture is a conjecture in mathematics that concerns a sequence defined as follows: start with any positive integer n. Then each term is obtained from the previous term as follows: if the previous term is even, the next term is one half the previous term. If the previous term is odd, the next term is 3 times the previous term plus 1. The conjecture is that no matter what value of n, the sequence will always reach 1.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;The kicker is, nobody can prove the Collatz conjecture. &lt;a href="https://terrytao.wordpress.com/2011/08/25/the-collatz-conjecture-littlewood-offord-theory-and-powers-of-2-and-3/"&gt;Not even Terry Tao!&lt;/a&gt;&lt;/p&gt;
&lt;p&gt;Here it is as code, so we can play around with it. &lt;a href="https://github.com/mtrencseni/playground/blob/master/Collatz.ipynb"&gt;The code is up on Github&lt;/a&gt;:&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;collatz&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;n&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
    &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;n&lt;/span&gt; &lt;span class="o"&gt;%&lt;/span&gt; &lt;span class="mi"&gt;2&lt;/span&gt; &lt;span class="o"&gt;==&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="nb"&gt;int&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;n&lt;/span&gt;&lt;span class="o"&gt;/&lt;/span&gt;&lt;span class="mi"&gt;2&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="k"&gt;else&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;          &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="mi"&gt;3&lt;/span&gt;&lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="n"&gt;n&lt;/span&gt;&lt;span class="o"&gt;+&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;

&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;collatz_sequence&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;n&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
    &lt;span class="n"&gt;sequence&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;n&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;
    &lt;span class="k"&gt;while&lt;/span&gt; &lt;span class="n"&gt;n&lt;/span&gt; &lt;span class="o"&gt;!=&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
        &lt;span class="n"&gt;n&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;collatz&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;n&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
        &lt;span class="n"&gt;sequence&lt;/span&gt; &lt;span class="o"&gt;+=&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;n&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;
    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;sequence&lt;/span&gt;
&lt;/pre&gt;&lt;/div&gt;


&lt;p&gt;For example:&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;span class="n"&gt;collatz_sequence&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;7&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="mi"&gt;7&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;22&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;11&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;34&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;17&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;52&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;26&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;13&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;40&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;20&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;10&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;5&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;16&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;8&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;4&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;2&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;
&lt;/pre&gt;&lt;/div&gt;


&lt;p&gt;How about a bigger one:&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;span class="n"&gt;collatz_sequence&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;2498904803&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="mi"&gt;2498904803&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
 &lt;span class="mi"&gt;7496714410&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
 &lt;span class="mi"&gt;3748357205&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
 &lt;span class="mi"&gt;11245071616&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
 &lt;span class="mi"&gt;5622535808&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
 &lt;span class="mi"&gt;2811267904&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
 &lt;span class="mi"&gt;1405633952&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
 &lt;span class="mi"&gt;702816976&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
 &lt;span class="o"&gt;...&lt;/span&gt;
 &lt;span class="mi"&gt;53&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
 &lt;span class="mi"&gt;160&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
 &lt;span class="mi"&gt;80&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
 &lt;span class="mi"&gt;40&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
 &lt;span class="mi"&gt;20&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
 &lt;span class="mi"&gt;10&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
 &lt;span class="mi"&gt;5&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
 &lt;span class="mi"&gt;16&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
 &lt;span class="mi"&gt;8&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
 &lt;span class="mi"&gt;4&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
 &lt;span class="mi"&gt;2&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
 &lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;
&lt;/pre&gt;&lt;/div&gt;


&lt;p&gt;The one above is 278 steps long! This is what it looks like on a log plot:&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;span class="n"&gt;plt&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;plot&lt;/span&gt;&lt;span class="p"&gt;([&lt;/span&gt;&lt;span class="n"&gt;log2&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;i&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;i&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;collatz_sequence&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;2498904803&lt;/span&gt;&lt;span class="p"&gt;)])&lt;/span&gt;
&lt;/pre&gt;&lt;/div&gt;


&lt;p&gt;&lt;img src="/images/collatz-log2-2498904803.png" alt="Collatz log2 2498904803" style="width: 600px;"/&gt;&lt;/p&gt;
&lt;h2&gt;Sequence properties&lt;/h2&gt;
&lt;p&gt;The sequences have many seemingly random properties. For example, plotting the length of Collatz sequences:&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;span class="n"&gt;plt&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;plot&lt;/span&gt;&lt;span class="p"&gt;([&lt;/span&gt;&lt;span class="nb"&gt;len&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;collatz_sequence&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;i&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt; &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;i&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="nb"&gt;range&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;2&lt;/span&gt;&lt;span class="o"&gt;**&lt;/span&gt;&lt;span class="mi"&gt;11&lt;/span&gt;&lt;span class="p"&gt;)])&lt;/span&gt;
&lt;/pre&gt;&lt;/div&gt;


&lt;p&gt;&lt;img src="/images/collatz-length.png" alt="Collatz length" style="width: 600px;"/&gt;&lt;/p&gt;
&lt;p&gt;What is the biggest value the sequence gets to, on a log plot:&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;span class="n"&gt;plt&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;plot&lt;/span&gt;&lt;span class="p"&gt;([&lt;/span&gt;&lt;span class="n"&gt;log2&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nb"&gt;max&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;collatz_sequence&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;i&lt;/span&gt;&lt;span class="p"&gt;)))&lt;/span&gt; &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;i&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="nb"&gt;range&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;2&lt;/span&gt;&lt;span class="o"&gt;**&lt;/span&gt;&lt;span class="mi"&gt;11&lt;/span&gt;&lt;span class="p"&gt;)])&lt;/span&gt;
&lt;/pre&gt;&lt;/div&gt;


&lt;p&gt;&lt;img src="/images/collatz-max.png" alt="Collatz max" style="width: 600px;"/&gt;&lt;/p&gt;
&lt;p&gt;&lt;a href="http://www.ericr.nl/wondrous/pathrecs.html"&gt;Some people are brute-forcing it&lt;/a&gt;, and have checked that all numbers satisfy the conjecture up to about 2^64. They're also recording path records, ie. the biggest number reached on the path down to 1. The current path record is (this is the &lt;code&gt;max()&lt;/code&gt; I plotted above), reading off from the linked page (without running):&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;span class="nb"&gt;max&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;collatz_sequence&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;71&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="mi"&gt;149323&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="mi"&gt;674102&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="mi"&gt;624415&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt; &lt;span class="o"&gt;==&lt;/span&gt; &lt;span class="mi"&gt;9055&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="mi"&gt;383924&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="mi"&gt;226744&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="mi"&gt;340579&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="mi"&gt;466230&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="mi"&gt;337749&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="mi"&gt;396932&lt;/span&gt;
&lt;/pre&gt;&lt;/div&gt;


&lt;p&gt;Plotting the path records on a log-log plot looks very linear:&lt;/p&gt;
&lt;p&gt;&lt;img src="/images/collatz-path-records.png" alt="Collatz max" style="width: 400px;"/&gt;&lt;/p&gt;
&lt;p&gt;The only way the sequence can get down to 1 is by decreasing, and it only decreases when dividing by 2. So another way to state the conjecture is that every sequence eventually reaches a number that is 2^k for some integer k, after which it falls straight down to 1. So let's check the biggest such k, as a function of n:&lt;/p&gt;
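&lt;p&gt;The helper &lt;code&gt;max_power2&lt;/code&gt; used below is not defined in the post; a minimal sketch consistent with the call site would be:&lt;/p&gt;

```python
def max_power2(sequence):
    # largest element of the sequence that is an exact power of two;
    # (x & (x - 1)) == 0 tests for powers of two, for positive x
    return max(x for x in sequence if (x & (x - 1)) == 0)

print(max_power2([7, 22, 11, 34, 17, 52, 26, 13, 40, 20, 10, 5, 16, 8, 4, 2, 1]))  # 16
```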
&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;span class="n"&gt;plt&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;plot&lt;/span&gt;&lt;span class="p"&gt;([&lt;/span&gt;&lt;span class="n"&gt;log2&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;max_power2&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;collatz_sequence&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;i&lt;/span&gt;&lt;span class="p"&gt;)))&lt;/span&gt; &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;i&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="nb"&gt;range&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;2&lt;/span&gt;&lt;span class="o"&gt;**&lt;/span&gt;&lt;span class="mi"&gt;16&lt;/span&gt;&lt;span class="p"&gt;)])&lt;/span&gt;
&lt;/pre&gt;&lt;/div&gt;


&lt;p&gt;&lt;img src="/images/collatz-max2power.png" alt="Collatz max" style="width: 600px;"/&gt;&lt;/p&gt;
&lt;p&gt;It would be nice if there were some regularity here, like numbers smaller than 2^k ending up at 2^k, but that's not true. However, something similar was found by the &lt;a href="http://www.ericr.nl/wondrous/pathrecs.html"&gt;brute-force check project&lt;/a&gt;:&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;The table therefore establishes the practical fact that for all numbers in the interval researched so far the path of every number taking k bytes (assuming a byte consists of 8 bits) can be completely determined using a storage of just 2k bytes for intermediate results.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;Another idea I had was around &lt;a href="https://en.wikipedia.org/wiki/Lyapunov_function"&gt;Lyapunov functions&lt;/a&gt;. If one can come up with a function L() that is decreasing along a Collatz sequence (and has some other good properties), that could help. One thing I tried is counting the 1s in the binary representation of the numbers in the sequence:&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;span class="n"&gt;plt&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;plot&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
    &lt;span class="p"&gt;[&lt;/span&gt;
        &lt;span class="nb"&gt;max&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
            &lt;span class="p"&gt;[&lt;/span&gt;
                &lt;span class="s1"&gt;&amp;#39;{0:b}&amp;#39;&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;format&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;k&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;count&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;1&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;k&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;collatz_sequence&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;i&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
            &lt;span class="p"&gt;])&lt;/span&gt; &lt;span class="o"&gt;-&lt;/span&gt; &lt;span class="s1"&gt;&amp;#39;{0:b}&amp;#39;&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;format&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;i&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;count&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;1&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;i&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="nb"&gt;range&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;2&lt;/span&gt;&lt;span class="o"&gt;**&lt;/span&gt;&lt;span class="mi"&gt;16&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="p"&gt;])&lt;/span&gt;
&lt;/pre&gt;&lt;/div&gt;


&lt;p&gt;&lt;img src="/images/collatz-binary1s.png" alt="Collatz max" style="width: 600px;"/&gt;&lt;/p&gt;
&lt;p&gt;The 1s are not conserved. It is possible to upper bound the count, though, for example:&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;span class="n"&gt;plt&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;plot&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
    &lt;span class="p"&gt;[&lt;/span&gt;
        &lt;span class="nb"&gt;len&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;{0:b}&amp;#39;&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;format&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;i&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt; &lt;span class="o"&gt;+&lt;/span&gt; &lt;span class="s1"&gt;&amp;#39;{0:b}&amp;#39;&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;format&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;i&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;count&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;1&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;-&lt;/span&gt; &lt;span class="nb"&gt;max&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
            &lt;span class="p"&gt;[&lt;/span&gt;
                &lt;span class="s1"&gt;&amp;#39;{0:b}&amp;#39;&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;format&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;k&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;count&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;1&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;k&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;collatz_sequence&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;i&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
            &lt;span class="p"&gt;])&lt;/span&gt; &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;i&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="nb"&gt;range&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;2&lt;/span&gt;&lt;span class="o"&gt;**&lt;/span&gt;&lt;span class="mi"&gt;16&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="p"&gt;])&lt;/span&gt;
&lt;/pre&gt;&lt;/div&gt;


&lt;p&gt;&lt;img src="/images/collatz-bound.png" alt="Collatz max" style="width: 600px;"/&gt;&lt;/p&gt;
&lt;p&gt;But this bound isn't actually useful (it's not a good Lyapunov function), since the 1 bits can keep traveling up to higher bit positions, so the numbers can keep getting bigger.&lt;/p&gt;
&lt;h2&gt;Primes and cycles&lt;/h2&gt;
&lt;p&gt;Another way to think about it is this: imagine the number broken out into its prime factors. Any factors of the form 2^k are chopped off by the divisions. If something other than 1 is left, it is odd; call it n = m + 1, where m is even. We multiply it by 3 and add 1 to get 3m + 4, which is even, so we divide by 2. So what do we know about (3m+4)/2 = m + m/2 + 2, given that m is even? &lt;a href="https://www.quora.com/How-are-the-prime-factors-of-a-number-affected-when-multiplying-by-3-and-adding-1"&gt;I don't know, but other people are also thinking about this.&lt;/a&gt;&lt;/p&gt;
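&lt;p&gt;As a quick sanity check of the algebra above, here is a small sketch (just a numeric verification, not part of any proof) confirming that one "multiply by 3, add 1" step followed by a halving takes n = m + 1 to m + m/2 + 2 for even m:&lt;/p&gt;

```python
# For even m, the odd number n = m + 1 maps to (3n + 1) / 2 = m + m/2 + 2
# after one "multiply by 3, add 1" step and one halving.
for m in range(0, 1000, 2):
    n = m + 1                      # n is odd
    after_step = (3 * n + 1) // 2  # 3n + 1 is even, so halve it
    assert after_step == m + m // 2 + 2
print('identity holds for all even m tested')
```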
&lt;p&gt;Taking one step back: if n is odd, it does not have 2 as a prime factor. We multiply it by 3, so 3 is added to its factorization. Then we add 1; what happens to the prime factors? &lt;a href="https://math.stackexchange.com/questions/173113/relationship-between-prime-factorizations-of-n-and-n1"&gt;I don't know, nobody knows!&lt;/a&gt;&lt;/p&gt;
&lt;p&gt;Another thing to keep in mind is &lt;a href="https://en.wikipedia.org/wiki/Collatz_conjecture#Cycles"&gt;cycles&lt;/a&gt;:&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;Any counterexample to the Collatz conjecture would have to consist either of an infinite divergent trajectory or a cycle different from the trivial (4; 2; 1) cycle. Thus, if one could prove that neither of these types of counterexample could exist, then all positive integers would have a trajectory that reaches the trivial cycle. Such a strong result is not known, but certain types of cycles have been ruled out.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;To prove that every sequence comes down to 1, we have to prove there are no non-trivial cycles: no number k that is reached (by 3n+1'ing a smaller number) and will eventually be reached again (by repeated halving), after which the sequence keeps looping. The only known cycle is the trivial [1 → 4 → 2 → 1], and there is no proof that no other cycles exist.&lt;/p&gt;
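&lt;p&gt;To make the cycle idea concrete, here is a minimal sketch (my own illustration, not a serious search) that follows a trajectory and reports the first repeated value, which is where a cycle begins:&lt;/p&gt;

```python
def collatz_step(n):
    # one application of the Collatz map
    if n % 2 == 0:
        return n // 2
    return 3 * n + 1

def first_repeat(n, max_steps=10**6):
    # follow the trajectory from n and return the first value seen twice
    seen = set()
    for _ in range(max_steps):
        if n in seen:
            return n
        seen.add(n)
        n = collatz_step(n)
    return None  # no repeat within max_steps

# every starting value tested so far ends up in the trivial 4 -> 2 -> 1 cycle
print(first_repeat(27))
```

&lt;p&gt;Running this for every starting value up to some bound only ever finds the trivial cycle, which is exactly what is conjectured but unproven.&lt;/p&gt;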
&lt;p&gt;&lt;em&gt;Happy Collatz'ing! ☺&lt;/em&gt;&lt;/p&gt;</content><category term="python"></category><category term="math"></category></entry><entry><title>MNIST pixel attacks with Pytorch</title><link href="https://bytepawn.com/mnist-pixel-attacks-with-pytorch.html" rel="alternate"></link><published>2019-06-01T00:00:00+02:00</published><updated>2019-06-01T00:00:00+02:00</updated><author><name>Marton Trencseni</name></author><id>tag:bytepawn.com,2019-06-01:/mnist-pixel-attacks-with-pytorch.html</id><summary type="html">&lt;p&gt;It’s easy to build a CNN that does well on MNIST digit classification. How easy is it to break it, to distort the images and cause the model to misclassify? &lt;br/&gt;&lt;br/&gt; &lt;img src="/images/mnist-attack-accuracy.png" alt="MNIST attack accuracy" style="width: 400px;"/&gt;&lt;/p&gt;</summary><content type="html">&lt;h2&gt;Introduction&lt;/h2&gt;
&lt;p&gt;It’s easy to build a CNN that does well on MNIST digit classification. How easy is it to break it, to distort the images and cause the model to misclassify? This post was inspired by two papers, both worth reading:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;&lt;a href="https://arxiv.org/abs/1710.08864"&gt;One Pixel Attack for Fooling Deep Neural Networks&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://keenlab.tencent.com/en/2019/03/29/Tencent-Keen-Security-Lab-Experimental-Security-Research-of-Tesla-Autopilot/"&gt;Tencent’s Keen Labs get a Tesla to leave the lane by placing a few white dots on the road&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;For the experiments, I’m using the ~99% accurate CNN that I’ve trained in the &lt;a href="http://bytepawn.com/solving-mnist-with-pytorch-and-skl.html#solving-mnist-with-pytorch-and-skl"&gt;previous MNIST post&lt;/a&gt;. 
The &lt;a href="https://github.com/mtrencseni/pytorch-playground/blob/master/06-pixel-attacks/MNIST-pixel-attacks.ipynb"&gt;ipython notebook is up on Github.&lt;/a&gt;&lt;/p&gt;
&lt;h2&gt;Random noise&lt;/h2&gt;
&lt;p&gt;The first and simplest thing I tried is adding random noise. I wanted to see how random noise leads to classification errors, i.e. reduced accuracy. Pytorch’s &lt;a href="https://pytorch.org/docs/stable/torchvision/transforms.html"&gt;transforms framework&lt;/a&gt; made these experiments easy: I just had to add one &lt;code&gt;Lambda()&lt;/code&gt; to the transforms chain:&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;distort&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;x&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;num_pixels&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;value&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mf"&gt;1.0&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
    &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;_&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="nb"&gt;range&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;num_pixels&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
        &lt;span class="n"&gt;x&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;][&lt;/span&gt;&lt;span class="nb"&gt;int&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;random&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;&lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="mi"&gt;28&lt;/span&gt;&lt;span class="p"&gt;)][&lt;/span&gt;&lt;span class="nb"&gt;int&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;random&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;&lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="mi"&gt;28&lt;/span&gt;&lt;span class="p"&gt;)]&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;value&lt;/span&gt;
    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;x&lt;/span&gt;

&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;plot_accuracies&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;distorted_pixels&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;accuracies&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
    &lt;span class="n"&gt;clear_output&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
    &lt;span class="n"&gt;plt&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;figure&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;figsize&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;14&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;4&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;
    &lt;span class="n"&gt;plt&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;xlabel&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;distorted pixels&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="n"&gt;plt&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;ylabel&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;accuracy&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="n"&gt;plt&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;ylim&lt;/span&gt;&lt;span class="p"&gt;((&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;
    &lt;span class="n"&gt;plt&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;plot&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;distorted_pixels&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;accuracies&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;marker&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;o&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="n"&gt;plt&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;show&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;

&lt;span class="n"&gt;distorted_pixels&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[]&lt;/span&gt;
&lt;span class="n"&gt;accuracies&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[]&lt;/span&gt;
&lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;i&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="nb"&gt;range&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;20&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
    &lt;span class="n"&gt;my_test_loader&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;torch&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;utils&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;data&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;DataLoader&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
        &lt;span class="n"&gt;datasets&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;MNIST&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
            &lt;span class="s1"&gt;&amp;#39;../data&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
            &lt;span class="n"&gt;train&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="bp"&gt;False&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
            &lt;span class="n"&gt;transform&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;transforms&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Compose&lt;/span&gt;&lt;span class="p"&gt;([&lt;/span&gt;
               &lt;span class="n"&gt;transforms&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;ToTensor&lt;/span&gt;&lt;span class="p"&gt;(),&lt;/span&gt;
               &lt;span class="n"&gt;transforms&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Lambda&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="k"&gt;lambda&lt;/span&gt; &lt;span class="n"&gt;x&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;distort&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;x&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;num_pixels&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;i&lt;/span&gt;&lt;span class="p"&gt;)),&lt;/span&gt;
               &lt;span class="c1"&gt;# the above line was added compared to the original model&lt;/span&gt;
               &lt;span class="n"&gt;transforms&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Normalize&lt;/span&gt;&lt;span class="p"&gt;((&lt;/span&gt;&lt;span class="mf"&gt;0.1307&lt;/span&gt;&lt;span class="p"&gt;,),&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mf"&gt;0.3081&lt;/span&gt;&lt;span class="p"&gt;,)),&lt;/span&gt;
            &lt;span class="p"&gt;])&lt;/span&gt;
        &lt;span class="p"&gt;),&lt;/span&gt;
        &lt;span class="n"&gt;batch_size&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mi"&gt;1000&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="n"&gt;shuffle&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="bp"&gt;True&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="n"&gt;actuals&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;predictions&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;test_label_predictions&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;model&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;device&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;my_test_loader&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="n"&gt;distorted_pixels&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;append&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;i&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="n"&gt;accuracies&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;append&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;accuracy_score&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;actuals&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;predictions&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;
    &lt;span class="n"&gt;plot_accuracies&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;distorted_pixels&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;accuracies&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/pre&gt;&lt;/div&gt;


&lt;p&gt;Note that the &lt;code&gt;transforms.Lambda()&lt;/code&gt; was added after the &lt;code&gt;transforms.ToTensor()&lt;/code&gt; and before the &lt;code&gt;transforms.Normalize()&lt;/code&gt;, so at this point the image is a 28x28 tensor of floats between 0 and 1. For reference, this is what a distorted image looks like (fifth test image in MNIST, a digit 4, original and with 100 pixels distorted):&lt;/p&gt;
&lt;p&gt;&lt;img src="/images/mnist-four.png" alt="MNIST 4 distorted" style="width: 600px;"/&gt;&lt;/p&gt;
&lt;p&gt;The result shows the decline in accuracy, as a function of how many pixels are randomly distorted on the test image:&lt;/p&gt;
&lt;p&gt;&lt;img src="/images/mnist-attack-accuracy.png" alt="MNIST attack accuracy" style="width: 600px;"/&gt;&lt;/p&gt;
&lt;p&gt;The accuracy degradation is pretty linear. I was surprised by this; I was expecting the model to perform well up to a certain distortion and then break down (more like an inverted S-curve). Note that I set the pixels randomly, so at high counts it’s possible that a pixel is “set twice”, meaning the actual number of distinct distorted pixels is lower.&lt;/p&gt;
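&lt;p&gt;The “set twice” effect can be quantified: drawing k pixels uniformly with replacement from the n = 28x28 grid, the expected number of distinct pixels hit is n * (1 - (1 - 1/n)^k). A small sketch (my own back-of-the-envelope helper, not from the notebook):&lt;/p&gt;

```python
def expected_distinct(k, n=28 * 28):
    # expected number of distinct pixels when k pixels are drawn
    # uniformly at random with replacement from an n-pixel image
    return n * (1 - (1 - 1 / n) ** k)

# at 20 draws nearly all pixels are distinct; at 100 draws a few collide
print(round(expected_distinct(20), 1))
print(round(expected_distinct(100), 1))
```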
&lt;h2&gt;One-pixel attacks&lt;/h2&gt;
&lt;p&gt;Based on the above, breaking the model by setting a single pixel shouldn’t be easy. But in the random noise example, we were setting pixels randomly; is it possible to find a specially vulnerable pixel and break the classification? When looking for a one-pixel vulnerability, there are 3 parameters: x, y, and the pixel value. Here I will only play with x and y, and set the pixel value to 1.0.&lt;/p&gt;
&lt;p&gt;Unlike in the article referenced in the intro, here I’m doing a brute-force attack. Two outer loops go over the pixel locations, and for each location the test images are run through the model and the misclassified images are counted. Note that without any distortion the model achieves 98.9% accuracy, so out of 10,000 test images there are 110 that are misclassified even without distortion. These 110 are always subtracted; they are not counted.&lt;/p&gt;
&lt;p&gt;The code:&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;set_pixel&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;x&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;i&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;j&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;value&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mf"&gt;1.0&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
    &lt;span class="n"&gt;x&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;][&lt;/span&gt;&lt;span class="n"&gt;i&lt;/span&gt;&lt;span class="p"&gt;][&lt;/span&gt;&lt;span class="n"&gt;j&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;value&lt;/span&gt;
    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;x&lt;/span&gt;

&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;misclassified_images&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;actuals&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;predictions&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
    &lt;span class="n"&gt;vuls&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[]&lt;/span&gt;
    &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;i&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="nb"&gt;range&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nb"&gt;len&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;actuals&lt;/span&gt;&lt;span class="p"&gt;)):&lt;/span&gt;
        &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;actuals&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;i&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="o"&gt;!=&lt;/span&gt; &lt;span class="n"&gt;predictions&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;i&lt;/span&gt;&lt;span class="p"&gt;]:&lt;/span&gt;
            &lt;span class="n"&gt;vuls&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;append&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;i&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="nb"&gt;set&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;vuls&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="n"&gt;my_test_loader&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;torch&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;utils&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;data&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;DataLoader&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
    &lt;span class="n"&gt;datasets&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;MNIST&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
        &lt;span class="s1"&gt;&amp;#39;../data&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="n"&gt;train&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="bp"&gt;False&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="n"&gt;transform&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;transforms&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Compose&lt;/span&gt;&lt;span class="p"&gt;([&lt;/span&gt;
           &lt;span class="n"&gt;transforms&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;ToTensor&lt;/span&gt;&lt;span class="p"&gt;(),&lt;/span&gt;
           &lt;span class="n"&gt;transforms&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Normalize&lt;/span&gt;&lt;span class="p"&gt;((&lt;/span&gt;&lt;span class="mf"&gt;0.1307&lt;/span&gt;&lt;span class="p"&gt;,),&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mf"&gt;0.3081&lt;/span&gt;&lt;span class="p"&gt;,)),&lt;/span&gt;
        &lt;span class="p"&gt;])&lt;/span&gt;
    &lt;span class="p"&gt;),&lt;/span&gt;
    &lt;span class="n"&gt;batch_size&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mi"&gt;1000&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;shuffle&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="bp"&gt;False&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="n"&gt;actuals&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;predictions&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;test_label_predictions&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;model&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;device&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;my_test_loader&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="n"&gt;base_misclassifieds&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;misclassified_images&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;actuals&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;predictions&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="n"&gt;vulnerable_images&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nb"&gt;set&lt;/span&gt;&lt;span class="p"&gt;([])&lt;/span&gt;
&lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;i&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="nb"&gt;range&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;28&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
    &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;j&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="nb"&gt;range&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;28&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
        &lt;span class="n"&gt;my_test_loader&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;torch&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;utils&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;data&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;DataLoader&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
            &lt;span class="n"&gt;datasets&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;MNIST&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
                &lt;span class="s1"&gt;&amp;#39;../data&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
                &lt;span class="n"&gt;train&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="bp"&gt;False&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
                &lt;span class="n"&gt;transform&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;transforms&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Compose&lt;/span&gt;&lt;span class="p"&gt;([&lt;/span&gt;
                   &lt;span class="n"&gt;transforms&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;ToTensor&lt;/span&gt;&lt;span class="p"&gt;(),&lt;/span&gt;
                   &lt;span class="n"&gt;transforms&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Lambda&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="k"&gt;lambda&lt;/span&gt; &lt;span class="n"&gt;x&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;set_pixel&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;x&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;i&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;j&lt;/span&gt;&lt;span class="p"&gt;)),&lt;/span&gt;
                   &lt;span class="n"&gt;transforms&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Normalize&lt;/span&gt;&lt;span class="p"&gt;((&lt;/span&gt;&lt;span class="mf"&gt;0.1307&lt;/span&gt;&lt;span class="p"&gt;,),&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mf"&gt;0.3081&lt;/span&gt;&lt;span class="p"&gt;,)),&lt;/span&gt;
                &lt;span class="p"&gt;])&lt;/span&gt;
            &lt;span class="p"&gt;),&lt;/span&gt;
            &lt;span class="n"&gt;batch_size&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mi"&gt;1000&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
            &lt;span class="n"&gt;shuffle&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="bp"&gt;False&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
        &lt;span class="n"&gt;actuals&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;predictions&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;test_label_predictions&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;model&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;device&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;my_test_loader&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
        &lt;span class="n"&gt;misc&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;misclassified_images&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;actuals&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;predictions&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
        &lt;span class="n"&gt;actual_vulnerables&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;misc&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;difference&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;base_misclassifieds&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
        &lt;span class="k"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;(&lt;/span&gt;&lt;span class="si"&gt;%s&lt;/span&gt;&lt;span class="s1"&gt;, &lt;/span&gt;&lt;span class="si"&gt;%s&lt;/span&gt;&lt;span class="s1"&gt;) vulnerables: &lt;/span&gt;&lt;span class="si"&gt;%s&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;&lt;/span&gt; &lt;span class="o"&gt;%&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;i&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;j&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;actual_vulnerables&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;
        &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;a&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;actual_vulnerables&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
            &lt;span class="k"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;--&amp;gt; &lt;/span&gt;&lt;span class="si"&gt;%s&lt;/span&gt;&lt;span class="s1"&gt; is &lt;/span&gt;&lt;span class="si"&gt;%s&lt;/span&gt;&lt;span class="s1"&gt;, classified as &lt;/span&gt;&lt;span class="si"&gt;%s&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;&lt;/span&gt; &lt;span class="o"&gt;%&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;a&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;actuals&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;a&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt; &lt;span class="n"&gt;predictions&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;a&lt;/span&gt;&lt;span class="p"&gt;]))&lt;/span&gt;
        &lt;span class="n"&gt;vulnerable_images&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;update&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;actual_vulnerables&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="k"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nb"&gt;len&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;base_misclassifieds&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt; &lt;span class="nb"&gt;len&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;vulnerable_images&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;
&lt;/pre&gt;&lt;/div&gt;


&lt;p&gt;In the end, it finds only 69 images (out of 10,000) that are vulnerable. Vulnerabilities are found at most pixel locations, but it’s always the same images. For example, image #3906 is very vulnerable: changing the (0,8) pixel gets this 1 classified as a 3:&lt;/p&gt;
&lt;p&gt;&lt;img src="/images/mnist-one.png" alt="MNIST 1 attacked" style="width: 600px;"/&gt;&lt;/p&gt;
&lt;p&gt;This seems to go against the first cited article’s finding, where ~75% of CIFAR-10 images are found to have a one-pixel vulnerability. Possible explanations:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;classifying digits is simpler than image categories, so MNIST is less vulnerable&lt;/li&gt;
&lt;li&gt;playing around with the pixel values would yield more vulnerabilities&lt;/li&gt;
&lt;/ul&gt;
&lt;h2&gt;Out-of-bounds one-pixel attacks&lt;/h2&gt;
&lt;p&gt;Finally, I played around with setting out-of-bounds values, i.e. what happens if I numerically set the pixel value to be greater than 1.0. Let’s repeat the distortion experiment, but set values to 5.0 instead of 1.0. At 20 distorted pixels the accuracy already drops to ~50%:&lt;/p&gt;
&lt;p&gt;&lt;img src="/images/mnist-oob-attack-accuracy.png" alt="MNIST attack accuracy" style="width: 600px;"/&gt;&lt;/p&gt;
&lt;p&gt;I attribute this to the &lt;code&gt;relu()&lt;/code&gt; activation, which cuts off at the low end but not at the high end, so these out-of-bounds values are able to travel through the deep layers and throw off the model:&lt;/p&gt;
&lt;p&gt;&lt;img src="/images/relu.png" alt="relu()" style="width: 600px;"/&gt;&lt;/p&gt;
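&lt;p&gt;A toy illustration of the asymmetry (plain Python, not the model’s actual layers): relu passes arbitrarily large positive values through unchanged, so an out-of-bounds pixel value survives layer after layer, while negative values are clipped to zero:&lt;/p&gt;

```python
def relu(x):
    # clips at zero on the low end, unbounded on the high end
    return max(0.0, x)

print(relu(1.0))   # an in-bounds pixel value passes through
print(relu(5.0))   # an out-of-bounds value also passes through, unchanged
print(relu(-5.0))  # negative values are clipped to zero
```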
&lt;p&gt;Intuitively, neural nets find blobs in high-dimensional space and assign it to a label, and by doing this we move the image away from the blob along one dimension.&lt;/p&gt;</content><category term="python"></category><category term="pytorch"></category><category term="cnn"></category><category term="torchvision"></category><category term="mnist"></category><category term="skl"></category></entry><entry><title>Solving CIFAR-10 with Pytorch and SKL</title><link href="https://bytepawn.com/solving-cifar-10-with-pytorch-and-skl.html" rel="alternate"></link><published>2019-05-14T00:00:00+02:00</published><updated>2019-05-14T00:00:00+02:00</updated><author><name>Marton Trencseni</name></author><id>tag:bytepawn.com,2019-05-14:/solving-cifar-10-with-pytorch-and-skl.html</id><summary type="html">&lt;p&gt;CIFAR-10 is a classic image recognition problem, consisting of 60,000 32x32 pixel RGB images (50,000 for training and 10,000 for testing) in 10 categories: plane, car,  bird,  cat, deer, dog, frog, horse, ship, truck. Convolutional Neural Networks (CNN) do really well on CIFAR-10, achieving 99%+ accuracy. The Pytorch distribution includes an example CNN for solving CIFAR-10, at 45% accuracy. I will use that and merge it with a Tensorflow example implementation to achieve 75%. We use torchvision to avoid downloading and data wrangling the datasets. Like in the MNIST example, I use Scikit-Learn to calculate goodness metrics and plots. &lt;br/&gt;&lt;br/&gt; &lt;img src="/images/cifar-10-example.png" alt="CIFAR examples" style="width: 400px;"/&gt;&lt;/p&gt;</summary><content type="html">&lt;h2&gt;Introduction&lt;/h2&gt;
&lt;p&gt;&lt;a href="https://www.cs.toronto.edu/~kriz/cifar.html"&gt;CIFAR-10&lt;/a&gt;  is a classic image recognition problem, consisting of 60,000 32x32 pixel RGB images (50,000 for training and 10,000 for testing) in 10 categories: plane, car,  bird,  cat, deer, dog, frog, horse, ship, truck. Convolutional Neural Networks (CNN) do really well on CIFAR-10, achieving 99%+ accuracy. The Pytorch distribution includes an &lt;a href="https://github.com/pytorch/tutorials/blob/master/beginner_source/blitz/cifar10_tutorial.py"&gt;example CNN for solving CIFAR-10&lt;/a&gt;, at 45% accuracy. I will use that and merge it with &lt;a href="https://towardsdatascience.com/cifar-10-image-classification-in-tensorflow-5b501f7dc77c"&gt;a Tensorflow example implementation&lt;/a&gt; to achieve 75%. We use &lt;a href="https://pytorch.org/docs/stable/torchvision/index.html"&gt;torchvision&lt;/a&gt; to avoid downloading and data wrangling the datasets. Like in the previous &lt;a href="http://bytepawn.com/solving-mnist-with-pytorch-and-skl.html#solving-mnist-with-pytorch-and-skl"&gt;MNIST post&lt;/a&gt;, I use &lt;a href="https://scikit-learn.org/stable/modules/classes.html#module-sklearn.metrics"&gt;SciKit-Learn&lt;/a&gt; to calculate goodness metrics and plots. You can run this on your laptop in a couple of hours without a GPU, the &lt;strong&gt;&lt;a href="https://github.com/mtrencseni/pytorch-playground/blob/master/05-cifar-10/CIFAR-10.ipynb"&gt;ipython notebook is up on Github.&lt;/a&gt;&lt;/strong&gt;&lt;/p&gt;
&lt;p&gt;&lt;img src="/images/cifar-10-example.png" alt="CIFAR examples" style="width: 400px;"/&gt;&lt;/p&gt;
&lt;h2&gt;The neural network&lt;/h2&gt;
&lt;p&gt;The CNN architecture is from &lt;a href="https://towardsdatascience.com/cifar-10-image-classification-in-tensorflow-5b501f7dc77c"&gt;this example implementation&lt;/a&gt;, ported to Pytorch, with &lt;code&gt;log_softmax()&lt;/code&gt; at the final layer like in the &lt;a href="http://bytepawn.com/solving-mnist-with-pytorch-and-skl.html#solving-mnist-with-pytorch-and-skl"&gt;MNIST CNN&lt;/a&gt;:&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;span class="k"&gt;class&lt;/span&gt; &lt;span class="nc"&gt;CNN&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;nn&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Module&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
    &lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="fm"&gt;__init__&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="bp"&gt;self&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
        &lt;span class="nb"&gt;super&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;CNN&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="bp"&gt;self&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="fm"&gt;__init__&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
        &lt;span class="bp"&gt;self&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;conv1&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;nn&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Conv2d&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;3&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;   &lt;span class="mi"&gt;64&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;  &lt;span class="mi"&gt;3&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
        &lt;span class="bp"&gt;self&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;conv2&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;nn&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Conv2d&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;64&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;  &lt;span class="mi"&gt;128&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;3&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
        &lt;span class="bp"&gt;self&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;conv3&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;nn&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Conv2d&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;128&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;256&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;3&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
        &lt;span class="bp"&gt;self&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;pool&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;nn&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;MaxPool2d&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;2&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;2&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
        &lt;span class="bp"&gt;self&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;fc1&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;nn&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Linear&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;64&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt; &lt;span class="mi"&gt;4&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt; &lt;span class="mi"&gt;4&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;128&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
        &lt;span class="bp"&gt;self&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;fc2&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;nn&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Linear&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;128&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;256&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
        &lt;span class="bp"&gt;self&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;fc3&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;nn&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Linear&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;256&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;10&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

    &lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;forward&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="bp"&gt;self&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;x&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
        &lt;span class="n"&gt;x&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="bp"&gt;self&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;pool&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;F&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;relu&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="bp"&gt;self&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;conv1&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;x&lt;/span&gt;&lt;span class="p"&gt;)))&lt;/span&gt;
        &lt;span class="n"&gt;x&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="bp"&gt;self&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;pool&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;F&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;relu&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="bp"&gt;self&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;conv2&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;x&lt;/span&gt;&lt;span class="p"&gt;)))&lt;/span&gt;
        &lt;span class="n"&gt;x&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="bp"&gt;self&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;pool&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;F&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;relu&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="bp"&gt;self&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;conv3&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;x&lt;/span&gt;&lt;span class="p"&gt;)))&lt;/span&gt;
        &lt;span class="n"&gt;x&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;x&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;view&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="o"&gt;-&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;64&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt; &lt;span class="mi"&gt;4&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt; &lt;span class="mi"&gt;4&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
        &lt;span class="n"&gt;x&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;F&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;relu&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="bp"&gt;self&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;fc1&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;x&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;
        &lt;span class="n"&gt;x&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;F&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;relu&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="bp"&gt;self&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;fc2&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;x&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;
        &lt;span class="n"&gt;x&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="bp"&gt;self&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;fc3&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;x&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
        &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;F&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;log_softmax&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;x&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;dim&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/pre&gt;&lt;/div&gt;


&lt;p&gt;This time, let's automate printing out how many parameters the net has:&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;span class="n"&gt;total&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;
&lt;span class="k"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;Trainable parameters:&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;name&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;param&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;model&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;named_parameters&lt;/span&gt;&lt;span class="p"&gt;():&lt;/span&gt;
    &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;param&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;requires_grad&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
        &lt;span class="k"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;name&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s1"&gt;&amp;#39;&lt;/span&gt;&lt;span class="se"&gt;\t&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;param&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;numel&lt;/span&gt;&lt;span class="p"&gt;())&lt;/span&gt;
        &lt;span class="n"&gt;total&lt;/span&gt; &lt;span class="o"&gt;+=&lt;/span&gt; &lt;span class="n"&gt;param&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;numel&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
&lt;span class="k"&gt;print&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
&lt;span class="k"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;Total&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s1"&gt;&amp;#39;&lt;/span&gt;&lt;span class="se"&gt;\t&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;total&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/pre&gt;&lt;/div&gt;


&lt;p&gt;Output (formatted):&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;span class="n"&gt;Trainable&lt;/span&gt; &lt;span class="n"&gt;parameters&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;

&lt;span class="n"&gt;conv1&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;weight&lt;/span&gt;      &lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="mi"&gt;728&lt;/span&gt;
&lt;span class="n"&gt;conv1&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;bias&lt;/span&gt;           &lt;span class="mi"&gt;64&lt;/span&gt;
&lt;span class="n"&gt;conv2&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;weight&lt;/span&gt;     &lt;span class="mi"&gt;73&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="mi"&gt;728&lt;/span&gt;
&lt;span class="n"&gt;conv2&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;bias&lt;/span&gt;          &lt;span class="mi"&gt;128&lt;/span&gt;
&lt;span class="n"&gt;conv3&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;weight&lt;/span&gt;    &lt;span class="mi"&gt;294&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="mi"&gt;912&lt;/span&gt;
&lt;span class="n"&gt;conv3&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;bias&lt;/span&gt;          &lt;span class="mi"&gt;256&lt;/span&gt;
&lt;span class="n"&gt;fc1&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;weight&lt;/span&gt;      &lt;span class="mi"&gt;131&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="mo"&gt;072&lt;/span&gt;
&lt;span class="n"&gt;fc1&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;bias&lt;/span&gt;            &lt;span class="mi"&gt;128&lt;/span&gt;
&lt;span class="n"&gt;fc2&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;weight&lt;/span&gt;       &lt;span class="mi"&gt;32&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="mi"&gt;768&lt;/span&gt;
&lt;span class="n"&gt;fc2&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;bias&lt;/span&gt;            &lt;span class="mi"&gt;256&lt;/span&gt;
&lt;span class="n"&gt;fc3&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;weight&lt;/span&gt;        &lt;span class="mi"&gt;2&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="mi"&gt;560&lt;/span&gt;
&lt;span class="n"&gt;fc3&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;bias&lt;/span&gt;             &lt;span class="mi"&gt;10&lt;/span&gt;
&lt;span class="o"&gt;-----------------------&lt;/span&gt;
&lt;span class="n"&gt;Total&lt;/span&gt;           &lt;span class="mi"&gt;537&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="mi"&gt;610&lt;/span&gt;
&lt;/pre&gt;&lt;/div&gt;
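&lt;p&gt;As a sanity check, these counts follow directly from the layer shapes: a conv layer has out_channels * in_channels * kernel * kernel weights plus one bias per output channel, and a fully connected layer has out * in weights plus out biases. A minimal sketch in plain Python:&lt;/p&gt;

```python
# Conv weights: out_channels * in_channels * kernel * kernel; fc weights: out * in.
conv_w = 64*3*3*3 + 128*64*3*3 + 256*128*3*3   # 1728 + 73728 + 294912
fc_w   = 128*1024 + 256*128 + 10*256           # 131072 + 32768 + 2560
biases = 64 + 128 + 256 + 128 + 256 + 10       # one bias per output unit
print(conv_w + fc_w + biases)                  # 537610, matching the Total above
```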


&lt;h2&gt;Getting data&lt;/h2&gt;
&lt;p&gt;As mentioned, we use torchvision here:&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;span class="n"&gt;transform&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;transforms&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Compose&lt;/span&gt;&lt;span class="p"&gt;([&lt;/span&gt;
    &lt;span class="n"&gt;transforms&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;ToTensor&lt;/span&gt;&lt;span class="p"&gt;(),&lt;/span&gt;
    &lt;span class="n"&gt;transforms&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Normalize&lt;/span&gt;&lt;span class="p"&gt;((&lt;/span&gt;&lt;span class="mf"&gt;0.5&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mf"&gt;0.5&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mf"&gt;0.5&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mf"&gt;0.5&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mf"&gt;0.5&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mf"&gt;0.5&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;
&lt;span class="p"&gt;])&lt;/span&gt;

&lt;span class="n"&gt;train_loader&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;torch&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;utils&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;data&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;DataLoader&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
    &lt;span class="n"&gt;torchvision&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;datasets&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;CIFAR10&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
        &lt;span class="n"&gt;root&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;./data&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="n"&gt;train&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="bp"&gt;True&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="n"&gt;download&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="bp"&gt;True&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="n"&gt;transform&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;transform&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt;
    &lt;span class="n"&gt;batch_size&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mi"&gt;4&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;shuffle&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="bp"&gt;True&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;num_workers&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mi"&gt;2&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="n"&gt;test_loader&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;torch&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;utils&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;data&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;DataLoader&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
    &lt;span class="n"&gt;torchvision&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;datasets&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;CIFAR10&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
        &lt;span class="n"&gt;root&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;./data&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="n"&gt;train&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="bp"&gt;False&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="n"&gt;download&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="bp"&gt;True&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="n"&gt;transform&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;transform&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt;
    &lt;span class="n"&gt;batch_size&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mi"&gt;4&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;shuffle&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="bp"&gt;False&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;num_workers&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mi"&gt;2&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/pre&gt;&lt;/div&gt;


&lt;p&gt;The only trick here is the normalization. The per-channel mean and standard deviation passed to &lt;code&gt;Normalize()&lt;/code&gt; are subtracted and divided out, so after the transform the pixel values are roughly centered at zero with unit scale, like a standard normal N(0,1) distribution. This just helps the training.&lt;/p&gt;
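&lt;p&gt;To see concretely what this does, here is a tiny sketch in plain Python: &lt;code&gt;ToTensor()&lt;/code&gt; scales pixels to [0, 1], and normalizing with mean 0.5 and std 0.5 then maps them to [-1, 1]:&lt;/p&gt;

```python
# Normalize(mean, std) computes (pixel - mean) / std per channel.
def normalize(pixel, mean=0.5, std=0.5):
    return (pixel - mean) / std

print(normalize(0.0), normalize(0.5), normalize(1.0))  # -1.0 0.0 1.0
```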
&lt;h2&gt;Training the model&lt;/h2&gt;
&lt;p&gt;Training is straightforward. I've modified the code so that it returns the losses, which we can plot later:&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;train&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;model&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;device&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;train_loader&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;optimizer&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;epoch&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
    &lt;span class="n"&gt;losses&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[]&lt;/span&gt;
    &lt;span class="n"&gt;model&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;train&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
    &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;batch_idx&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;data&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;target&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="nb"&gt;enumerate&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;train_loader&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
        &lt;span class="n"&gt;data&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;target&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;data&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;to&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;device&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt; &lt;span class="n"&gt;target&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;to&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;device&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
        &lt;span class="n"&gt;optimizer&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;zero_grad&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
        &lt;span class="n"&gt;output&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;model&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;data&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
        &lt;span class="n"&gt;loss&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;F&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;nll_loss&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;output&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;target&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
        &lt;span class="n"&gt;loss&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;backward&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
        &lt;span class="n"&gt;optimizer&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;step&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
        &lt;span class="n"&gt;losses&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;append&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;loss&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;item&lt;/span&gt;&lt;span class="p"&gt;())&lt;/span&gt;
        &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;batch_idx&lt;/span&gt; &lt;span class="o"&gt;&amp;gt;&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt; &lt;span class="ow"&gt;and&lt;/span&gt; &lt;span class="n"&gt;batch_idx&lt;/span&gt; &lt;span class="o"&gt;%&lt;/span&gt; &lt;span class="mi"&gt;1000&lt;/span&gt; &lt;span class="o"&gt;==&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
            &lt;span class="k"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;Train Epoch: {} [{}/{}&lt;/span&gt;&lt;span class="se"&gt;\t&lt;/span&gt;&lt;span class="s1"&gt;({:.0f}%)]&lt;/span&gt;&lt;span class="se"&gt;\t&lt;/span&gt;&lt;span class="s1"&gt;Loss: {:.6f}&amp;#39;&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;format&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
                &lt;span class="n"&gt;epoch&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;batch_idx&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt; &lt;span class="nb"&gt;len&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;data&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt; &lt;span class="nb"&gt;len&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;train_loader&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;dataset&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt;
                &lt;span class="mf"&gt;100.&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt; &lt;span class="n"&gt;batch_idx&lt;/span&gt; &lt;span class="o"&gt;/&lt;/span&gt; &lt;span class="nb"&gt;len&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;train_loader&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt; &lt;span class="n"&gt;loss&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;item&lt;/span&gt;&lt;span class="p"&gt;()))&lt;/span&gt;
    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;losses&lt;/span&gt;
&lt;/pre&gt;&lt;/div&gt;


&lt;p&gt;This is a plain vanilla training loop, with &lt;code&gt;nll_loss()&lt;/code&gt;, which stands for &lt;a href="https://ljvmiranda921.github.io/notebook/2017/08/13/softmax-and-the-negative-log-likelihood/"&gt;negative log likelihood (NLL)&lt;/a&gt;.
NLL essentially transforms the class probability (0 to 1) to run from ∞ to 0, which makes it a good loss function. The combination of outputting &lt;code&gt;log_softmax()&lt;/code&gt; and minimizing &lt;code&gt;nll_loss()&lt;/code&gt; is mathematically the same as outputting the probabilities and minimizing &lt;a href="https://en.wikipedia.org/wiki/Cross_entropy"&gt;cross-entropy&lt;/a&gt; (a measure, in bits, of how different two probability distributions are), but with better numerical stability.&lt;/p&gt;
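&lt;p&gt;To make the equivalence concrete, here is a small numeric sketch in plain Python, with made-up logits: the NLL of the log-softmax output and the cross-entropy of the softmax probabilities against a one-hot target come out to the same number:&lt;/p&gt;

```python
import math

logits = [2.0, 1.0, 0.1]  # made-up raw network outputs
target = 0                # index of the true class

# softmax probabilities from the logits
exps = [math.exp(v) for v in logits]
total = sum(exps)
probs = [v / total for v in exps]

# cross-entropy with a one-hot target reduces to -log(p_target)
cross_entropy = -math.log(probs[target])

# nll_loss applied to a log_softmax output picks out -log_softmax[target];
# log(exp(v) / total) simplifies to v - log(total)
log_softmax = [v - math.log(total) for v in logits]
nll = -log_softmax[target]

print(cross_entropy, nll)  # identical up to floating-point error
```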
&lt;p&gt;Using matplotlib we can see how the model converges:&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;mean&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;li&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt; &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="nb"&gt;sum&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;li&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;&lt;span class="o"&gt;/&lt;/span&gt;&lt;span class="nb"&gt;len&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;li&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="n"&gt;plt&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;figure&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;figsize&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;14&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;4&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;
&lt;span class="n"&gt;plt&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;xlabel&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;training batch&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="n"&gt;plt&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;ylabel&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;loss&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="n"&gt;plt&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;plot&lt;/span&gt;&lt;span class="p"&gt;([&lt;/span&gt;&lt;span class="n"&gt;mean&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;losses&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;i&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="n"&gt;i&lt;/span&gt;&lt;span class="o"&gt;+&lt;/span&gt;&lt;span class="mi"&gt;1000&lt;/span&gt;&lt;span class="p"&gt;])&lt;/span&gt; &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;i&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="nb"&gt;range&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nb"&gt;len&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;losses&lt;/span&gt;&lt;span class="p"&gt;))])&lt;/span&gt;
&lt;/pre&gt;&lt;/div&gt;


&lt;p&gt;&lt;img src="/images/cifar-10-training-loss.png" alt="CIFAR-10 training loss" style="width: 600px;"/&gt;&lt;/p&gt;
&lt;p&gt;By computing the accuracy on the training set at the end of each epoch, we can see how our model improves:&lt;/p&gt;
&lt;p&gt;&lt;img src="/images/cifar-10-training-accuracy.png" alt="CIFAR-10 training accuracy" style="width: 600px;"/&gt;&lt;/p&gt;
&lt;h2&gt;Evaluating the model&lt;/h2&gt;
&lt;p&gt;Now we use the test data portion of CIFAR-10, and run the model on it.
Most SKL metrics expect two of the following three inputs:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;the ground truth labels&lt;/li&gt;
&lt;li&gt;the predicted labels&lt;/li&gt;
&lt;li&gt;the prediction probabilities (e.g. for the ROC curve)&lt;/li&gt;
&lt;/ul&gt;
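&lt;p&gt;For example, given the ground truth labels and the predicted labels, accuracy and the confusion matrix are one SKL call each. A quick sketch with toy labels, for illustration (in the post, the labels come from running the model on the test set):&lt;/p&gt;

```python
from sklearn.metrics import accuracy_score, confusion_matrix

# toy labels for illustration only
actuals     = [0, 1, 2, 2, 1, 0]
predictions = [0, 2, 2, 2, 1, 0]

print(accuracy_score(actuals, predictions))    # fraction of correct labels
print(confusion_matrix(actuals, predictions))  # rows: actual, columns: predicted
```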
&lt;p&gt;Now we can use SKL to get various metrics:&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;test_label_predictions&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;model&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;device&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;test_loader&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
    &lt;span class="n"&gt;model&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;eval&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
    &lt;span class="n"&gt;actuals&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[]&lt;/span&gt;
    &lt;span class="n"&gt;predictions&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[]&lt;/span&gt;
    &lt;span class="k"&gt;with&lt;/span&gt; &lt;span class="n"&gt;torch&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;no_grad&lt;/span&gt;&lt;span class="p"&gt;():&lt;/span&gt;
        &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;data&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;target&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;test_loader&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
            &lt;span class="n"&gt;data&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;target&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;data&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;to&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;device&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt; &lt;span class="n"&gt;target&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;to&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;device&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
            &lt;span class="n"&gt;output&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;model&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;data&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
            &lt;span class="n"&gt;prediction&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;output&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;argmax&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;dim&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;keepdim&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="bp"&gt;True&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
            &lt;span class="n"&gt;actuals&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;extend&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;target&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;view_as&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;prediction&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;
            &lt;span class="n"&gt;predictions&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;extend&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;prediction&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;i&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;item&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt; &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;i&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;actuals&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;i&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;item&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt; &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;i&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;predictions&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;

&lt;span class="n"&gt;actuals&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;predictions&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;test_label_predictions&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;model&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;device&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;test_loader&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="k"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;Confusion matrix:&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="k"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;confusion_matrix&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;actuals&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;predictions&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;
&lt;span class="k"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;F1 score: &lt;/span&gt;&lt;span class="si"&gt;%f&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;&lt;/span&gt; &lt;span class="o"&gt;%&lt;/span&gt; &lt;span class="n"&gt;f1_score&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;actuals&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;predictions&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;average&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;micro&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;
&lt;span class="k"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;Accuracy score: &lt;/span&gt;&lt;span class="si"&gt;%f&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;&lt;/span&gt; &lt;span class="o"&gt;%&lt;/span&gt; &lt;span class="n"&gt;accuracy_score&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;actuals&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;predictions&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;
&lt;/pre&gt;&lt;/div&gt;


&lt;p&gt;Outputs:&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;Confusion matrix:
[[778  20  37  12  18   6   2  25  65  37]
 [  6 854   7   5   2   4   1   1  44  76]
 [ 61   5 525  97  89  92  55  36  24  16]
 [ 20  10  32 587  65 162  45  35  19  25]
 [  9   5  34  63 715  51  41  65  13   4]
 [ 13   4  28 155  41 683  12  44   9  11]
 [  7   2  24  62  25  37 814   5  13  11]
 [ 13   3  17  38  54  61  10 785   5  14]
 [ 39   9  12  14   4   1   5   6 874  36]
 [ 23  56   6  14   4   4   5  14  35 839]]
F1 score: 0.745400
Accuracy score: 0.745400
&lt;/pre&gt;&lt;/div&gt;
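&lt;p&gt;Note that the F1 and accuracy scores above are identical. That is not a coincidence: in single-label multiclass classification, micro-averaged F1 always equals accuracy, because every misclassified sample counts as exactly one false positive (for the predicted class) and one false negative (for the true class) in the pooled totals. A quick pure-Python check (a sketch, not from the notebook):&lt;/p&gt;

```python
def micro_f1_and_accuracy(actuals, predictions):
    # Every misclassified sample is one false positive (for the predicted
    # class) and one false negative (for the true class), so the pooled
    # (micro) precision, recall, F1 and accuracy all coincide.
    tp = sum(a == p for a, p in zip(actuals, predictions))
    fp = fn = len(actuals) - tp   # pooled over all classes
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    f1 = 2 * precision * recall / (precision + recall)
    accuracy = tp / len(actuals)
    return f1, accuracy

f1, acc = micro_f1_and_accuracy([0, 1, 2, 2, 1], [0, 2, 2, 2, 1])
assert abs(f1 - acc) < 1e-9   # both equal 4/5
```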


&lt;p&gt;Let's see the ROC curve for the cat(3) class:&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;test_class_probabilities&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;model&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;device&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;test_loader&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;which_class&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
    &lt;span class="n"&gt;model&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;eval&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
    &lt;span class="n"&gt;actuals&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[]&lt;/span&gt;
    &lt;span class="n"&gt;probabilities&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[]&lt;/span&gt;
    &lt;span class="k"&gt;with&lt;/span&gt; &lt;span class="n"&gt;torch&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;no_grad&lt;/span&gt;&lt;span class="p"&gt;():&lt;/span&gt;
        &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;data&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;target&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;test_loader&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
            &lt;span class="n"&gt;data&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;target&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;data&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;to&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;device&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt; &lt;span class="n"&gt;target&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;to&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;device&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
            &lt;span class="n"&gt;output&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;model&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;data&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
            &lt;span class="n"&gt;prediction&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;output&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;argmax&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;dim&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;keepdim&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="bp"&gt;True&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
            &lt;span class="n"&gt;actuals&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;extend&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;target&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;view_as&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;prediction&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;==&lt;/span&gt; &lt;span class="n"&gt;which_class&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
            &lt;span class="n"&gt;probabilities&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;extend&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;np&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;exp&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;output&lt;/span&gt;&lt;span class="p"&gt;[:,&lt;/span&gt; &lt;span class="n"&gt;which_class&lt;/span&gt;&lt;span class="p"&gt;]))&lt;/span&gt;
    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;i&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;item&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt; &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;i&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;actuals&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;i&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;item&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt; &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;i&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;probabilities&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;

&lt;span class="n"&gt;which_class&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;3&lt;/span&gt;
&lt;span class="n"&gt;actuals&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;class_probabilities&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;test_class_probabilities&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;model&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;device&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;test_loader&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;which_class&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="n"&gt;fpr&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;tpr&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;_&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;roc_curve&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;actuals&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;class_probabilities&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="n"&gt;roc_auc&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;auc&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;fpr&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;tpr&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="n"&gt;plt&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;figure&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
&lt;span class="n"&gt;lw&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;2&lt;/span&gt;
&lt;span class="n"&gt;plt&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;plot&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;fpr&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;tpr&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;color&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;darkorange&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
         &lt;span class="n"&gt;lw&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;lw&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;label&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;ROC curve (area = &lt;/span&gt;&lt;span class="si"&gt;%0.2f&lt;/span&gt;&lt;span class="s1"&gt;)&amp;#39;&lt;/span&gt; &lt;span class="o"&gt;%&lt;/span&gt; &lt;span class="n"&gt;roc_auc&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="n"&gt;plt&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;plot&lt;/span&gt;&lt;span class="p"&gt;([&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt; &lt;span class="n"&gt;color&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;navy&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;lw&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;lw&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;linestyle&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;--&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="n"&gt;plt&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;xlim&lt;/span&gt;&lt;span class="p"&gt;([&lt;/span&gt;&lt;span class="mf"&gt;0.0&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mf"&gt;1.0&lt;/span&gt;&lt;span class="p"&gt;])&lt;/span&gt;
&lt;span class="n"&gt;plt&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;ylim&lt;/span&gt;&lt;span class="p"&gt;([&lt;/span&gt;&lt;span class="mf"&gt;0.0&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mf"&gt;1.05&lt;/span&gt;&lt;span class="p"&gt;])&lt;/span&gt;
&lt;span class="n"&gt;plt&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;xlabel&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;False Positive Rate&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="n"&gt;plt&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;ylabel&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;True Positive Rate&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="n"&gt;plt&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;title&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;ROC for label=cat(&lt;/span&gt;&lt;span class="si"&gt;%d&lt;/span&gt;&lt;span class="s1"&gt;) class&amp;#39;&lt;/span&gt; &lt;span class="o"&gt;%&lt;/span&gt; &lt;span class="n"&gt;which_class&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="n"&gt;plt&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;legend&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;loc&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="s2"&gt;&amp;quot;lower right&amp;quot;&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="n"&gt;plt&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;show&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
&lt;/pre&gt;&lt;/div&gt;
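&lt;p&gt;As a sanity check on the sklearn numbers, the AUC can also be computed directly as a rank statistic: it is the probability that a randomly chosen positive example gets a higher score than a randomly chosen negative one. A minimal pure-Python sketch (the function name is mine, not from the notebook):&lt;/p&gt;

```python
def roc_auc(labels, scores):
    # AUC equals the probability that a randomly chosen positive example
    # scores higher than a randomly chosen negative one (ties count half);
    # this is the normalized Mann-Whitney U statistic.
    pos = [s for l, s in zip(labels, scores) if l == 1]
    neg = [s for l, s in zip(labels, scores) if l == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

# A perfect ranking gives AUC = 1.0, a fully reversed ranking gives 0.0:
assert roc_auc([0, 0, 1, 1], [0.1, 0.2, 0.8, 0.9]) == 1.0
assert roc_auc([0, 0, 1, 1], [0.9, 0.8, 0.2, 0.1]) == 0.0
```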


&lt;p&gt;&lt;img src="/images/cifar-10-roc.png" alt="CIFAR-10 ROC curve" style="width: 600px;"/&gt;&lt;/p&gt;
&lt;h2&gt;Conclusion&lt;/h2&gt;
&lt;p&gt;While this result is not state-of-the-art, it's still much better than random, which would be 10% accuracy. I was able to run this with minimal changes from the MNIST code, since the model and the train/test framework are cleanly separated.&lt;/p&gt;</content><category term="python"></category><category term="pytorch"></category><category term="cnn"></category><category term="torchvision"></category><category term="cifar"></category><category term="skl"></category></entry><entry><title>Solving MNIST with Pytorch and SKL</title><link href="https://bytepawn.com/solving-mnist-with-pytorch-and-skl.html" rel="alternate"></link><published>2019-05-02T00:00:00+02:00</published><updated>2019-05-02T00:00:00+02:00</updated><author><name>Marton Trencseni</name></author><id>tag:bytepawn.com,2019-05-02:/solving-mnist-with-pytorch-and-skl.html</id><summary type="html">&lt;p&gt;MNIST is a classic image recognition problem, specifically digit recognition. It contains 70,000 28x28 pixel grayscale images of hand-written, labeled images, 60,000 for training and 10,000 for testing. Convolutional Neural Networks (CNN) do really well on MNIST, achieving 99%+ accuracy. The Pytorch distribution includes a 4-layer CNN for solving MNIST. Here I will unpack and go through this example. We use torchvision to avoid downloading and data wrangling the datasets. Finally, instead of calculating performance metrics of the model by hand, I will extract results in a format so we can use SciKit-Learn's rich library of metrics. &lt;br/&gt;&lt;br/&gt; &lt;img src="/images/mnist-example.png" alt="MNIST example digits" style="width: 400px;"/&gt;&lt;/p&gt;</summary><content type="html">&lt;h2&gt;Introduction&lt;/h2&gt;
&lt;p&gt;&lt;a href="http://yann.lecun.com/exdb/mnist/"&gt;MNIST&lt;/a&gt; is a classic image recognition problem, specifically digit recognition. It contains 70,000 28x28 pixel grayscale images of hand-written, labeled images, 60,000 for training and 10,000 for testing. Convolutional Neural Networks (CNN) do really well on MNIST, achieving 99%+ accuracy. The Pytorch distribution includes a &lt;a href="https://github.com/pytorch/examples/blob/master/mnist/main.py"&gt;4-layer CNN for solving MNIST&lt;/a&gt;. Here I will unpack and go through this example. We use &lt;a href="https://pytorch.org/docs/stable/torchvision/index.html"&gt;torchvision&lt;/a&gt; to avoid downloading and data wrangling the datasets. Finally, instead of calculating performance metrics of the model by hand, I will extract results in a format so we can use &lt;a href="https://scikit-learn.org/stable/modules/classes.html#module-sklearn.metrics"&gt;SciKit-Learn's rich library of metrics&lt;/a&gt;. You can run this on your laptop in a couple of minutes without a GPU, the &lt;a href="https://github.com/mtrencseni/pytorch-playground/blob/master/04-mnist/MNIST.ipynb"&gt;ipython notebook is up on Github.&lt;/a&gt;&lt;/p&gt;
&lt;p&gt;&lt;img src="/images/mnist-example.png" alt="MNIST example digits" style="width: 400px;"/&gt;&lt;/p&gt;
&lt;h2&gt;The neural network&lt;/h2&gt;
&lt;p&gt;The definition for the CNN is just a couple of lines, taken from &lt;a href="https://github.com/pytorch/examples/blob/master/mnist/main.py"&gt;https://github.com/pytorch/examples/blob/master/mnist/main.py&lt;/a&gt;:&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;span class="k"&gt;class&lt;/span&gt; &lt;span class="nc"&gt;CNN&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;nn&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Module&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
    &lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="fm"&gt;__init__&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="bp"&gt;self&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
        &lt;span class="nb"&gt;super&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;CNN&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="bp"&gt;self&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="fm"&gt;__init__&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
        &lt;span class="bp"&gt;self&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;conv1&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;nn&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Conv2d&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;20&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;5&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
        &lt;span class="bp"&gt;self&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;conv2&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;nn&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Conv2d&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;20&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;50&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;5&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
        &lt;span class="bp"&gt;self&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;fc1&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;nn&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Linear&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;4&lt;/span&gt;&lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="mi"&gt;4&lt;/span&gt;&lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="mi"&gt;50&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;500&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
        &lt;span class="bp"&gt;self&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;fc2&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;nn&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Linear&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;500&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;10&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

    &lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;forward&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="bp"&gt;self&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;x&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
        &lt;span class="n"&gt;x&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;F&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;relu&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="bp"&gt;self&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;conv1&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;x&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;
        &lt;span class="n"&gt;x&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;F&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;max_pool2d&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;x&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;2&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;2&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
        &lt;span class="n"&gt;x&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;F&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;relu&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="bp"&gt;self&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;conv2&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;x&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;
        &lt;span class="n"&gt;x&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;F&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;max_pool2d&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;x&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;2&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;2&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
        &lt;span class="n"&gt;x&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;x&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;view&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="o"&gt;-&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;4&lt;/span&gt;&lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="mi"&gt;4&lt;/span&gt;&lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="mi"&gt;50&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
        &lt;span class="n"&gt;x&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;F&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;relu&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="bp"&gt;self&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;fc1&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;x&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;
        &lt;span class="n"&gt;x&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="bp"&gt;self&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;fc2&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;x&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
        &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;F&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;log_softmax&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;x&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;dim&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/pre&gt;&lt;/div&gt;


&lt;p&gt;As an exercise, let's make sure we understand what's going on here:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;&lt;code&gt;conv1&lt;/code&gt; is the first convolutional layer:&lt;ul&gt;
&lt;li&gt;the MNIST images are grayscale, so there is just 1 input channel&lt;/li&gt;
&lt;li&gt;this layer computes 20 convolutions, so the output is 20 channels&lt;/li&gt;
&lt;li&gt;each kernel is 5x5&lt;/li&gt;
&lt;li&gt;with a stride of 1&lt;/li&gt;
&lt;li&gt;by default, each kernel has a bias&lt;/li&gt;
&lt;li&gt;20 x (5 x 5 + 1) = 520 parameters to train&lt;/li&gt;
&lt;li&gt;the input is 28x28 pixels, the output is 28 - (5-1) = 24x24 pixels, on 20 channels each&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;li&gt;in &lt;code&gt;forward()&lt;/code&gt;, &lt;code&gt;conv1&lt;/code&gt; is applied to the input image, followed by a &lt;code&gt;max_pool2d()&lt;/code&gt;&lt;ul&gt;
&lt;li&gt;2x2 patches are maxpool'd with a stride of 2; this halves the image&lt;/li&gt;
&lt;li&gt;the input is 24x24 pixels, the output is 12x12 pixels&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;li&gt;&lt;code&gt;conv2&lt;/code&gt; is the second convolutional layer:&lt;ul&gt;
&lt;li&gt;input = 12x12 pixels, 20 channels&lt;/li&gt;
&lt;li&gt;this layer computes 50 convolutions, so the output is 50 channels&lt;/li&gt;
&lt;li&gt;same kernel and stride as the previous layer&lt;/li&gt;
&lt;li&gt;50 x (20 x 5 x 5 + 1) = 25,050 parameters to train&lt;/li&gt;
&lt;li&gt;the input is 12x12 pixels, the output is 12 - (5-1) = 8x8 pixels, on 50 channels each&lt;/li&gt;
&lt;li&gt;having &amp;gt;1 input and output channels: there is a separate 5x5 kernel for each input-output combination (50x20 kernels); to get each of the 50 output channels, the 20 convolved input channels are summed, and the bias is added&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;li&gt;we apply &lt;code&gt;relu()&lt;/code&gt;, which doesn't change dimensionality&lt;/li&gt;
&lt;li&gt;another &lt;code&gt;max_pool2d()&lt;/code&gt; follows, cutting image size from 8x8 to 4x4 pixels&lt;/li&gt;
&lt;li&gt;&lt;code&gt;fc1&lt;/code&gt; (for fully connected):&lt;ul&gt;
&lt;li&gt;takes the 4 x 4 x 50 = 800 input values and treats it as one big vector&lt;/li&gt;
&lt;li&gt;projects it to a 500 dimensional vector with an Ax+b matrix multiplication&lt;/li&gt;
&lt;li&gt;this is 4 x 4 x 50 x 500 + 500 = 400,500 parameters to train&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;li&gt;then another &lt;code&gt;relu()&lt;/code&gt;&lt;/li&gt;
&lt;li&gt;&lt;code&gt;fc2&lt;/code&gt;&lt;ul&gt;
&lt;li&gt;projects down to a 10 dimensional vector&lt;/li&gt;
&lt;li&gt;this is 500 x 10 + 10 = 5,010 parameters to train (one bias per output class)&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;li&gt;then &lt;code&gt;log_softmax()&lt;/code&gt; to get log-probabilities for each class; regular &lt;code&gt;softmax()&lt;/code&gt; would output probabilities, but here they are &lt;code&gt;log()&lt;/code&gt;'d, so later we have to &lt;code&gt;exp()&lt;/code&gt; to get back the probabilities&lt;/li&gt;
&lt;/ul&gt;
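&lt;p&gt;The last point is easy to verify in plain Python, independent of PyTorch (a sketch; the real &lt;code&gt;F.log_softmax()&lt;/code&gt; operates on tensors):&lt;/p&gt;

```python
import math

def log_softmax(xs):
    # Numerically stable log-softmax: shift by the max, then subtract
    # the log of the sum of exponentials.
    m = max(xs)
    log_sum = m + math.log(sum(math.exp(x - m) for x in xs))
    return [x - log_sum for x in xs]

logits = [2.0, 1.0, 0.1]                    # e.g. the fc2 output for one image
log_probs = log_softmax(logits)
probs = [math.exp(lp) for lp in log_probs]  # exp() recovers the probabilities
assert abs(sum(probs) - 1.0) < 1e-9         # they sum to 1, as probabilities must
```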
&lt;p&gt;Total parameters = 520 + 25,050 + 400,500 + 5,010 = 431,080 floats&lt;/p&gt;
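&lt;p&gt;The bookkeeping is easy to double-check in plain Python (note that &lt;code&gt;fc2&lt;/code&gt; has one bias per output class, i.e. 10; in PyTorch itself, &lt;code&gt;sum(p.numel() for p in model.parameters())&lt;/code&gt; gives the same total):&lt;/p&gt;

```python
# Re-deriving the parameter counts layer by layer:
conv1 = 20 * (1 * 5 * 5 + 1)     # 20 kernels over 1 input channel, plus biases
conv2 = 50 * (20 * 5 * 5 + 1)    # 50 kernels over 20 input channels, plus biases
fc1 = 4 * 4 * 50 * 500 + 500     # weight matrix plus 500 biases
fc2 = 500 * 10 + 10              # weight matrix plus 10 biases
assert (conv1, conv2, fc1, fc2) == (520, 25050, 400500, 5010)
assert conv1 + conv2 + fc1 + fc2 == 431080
```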
&lt;h2&gt;Getting data&lt;/h2&gt;
&lt;p&gt;As mentioned, we use torchvision here:&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;span class="n"&gt;train_loader&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;torch&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;utils&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;data&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;DataLoader&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
    &lt;span class="n"&gt;datasets&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;MNIST&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
        &lt;span class="s1"&gt;&amp;#39;../data&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
       &lt;span class="n"&gt;train&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="bp"&gt;True&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
       &lt;span class="n"&gt;download&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="bp"&gt;True&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
       &lt;span class="n"&gt;transform&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;transforms&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Compose&lt;/span&gt;&lt;span class="p"&gt;([&lt;/span&gt;
           &lt;span class="n"&gt;transforms&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;ToTensor&lt;/span&gt;&lt;span class="p"&gt;(),&lt;/span&gt;
           &lt;span class="n"&gt;transforms&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Normalize&lt;/span&gt;&lt;span class="p"&gt;((&lt;/span&gt;&lt;span class="mf"&gt;0.1307&lt;/span&gt;&lt;span class="p"&gt;,),&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mf"&gt;0.3081&lt;/span&gt;&lt;span class="p"&gt;,))&lt;/span&gt;
       &lt;span class="p"&gt;])&lt;/span&gt;
    &lt;span class="p"&gt;),&lt;/span&gt;
    &lt;span class="n"&gt;batch_size&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mi"&gt;64&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;shuffle&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="bp"&gt;True&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="n"&gt;test_loader&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;torch&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;utils&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;data&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;DataLoader&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
    &lt;span class="n"&gt;datasets&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;MNIST&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
        &lt;span class="s1"&gt;&amp;#39;../data&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="n"&gt;train&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="bp"&gt;False&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="n"&gt;transform&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;transforms&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Compose&lt;/span&gt;&lt;span class="p"&gt;([&lt;/span&gt;
           &lt;span class="n"&gt;transforms&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;ToTensor&lt;/span&gt;&lt;span class="p"&gt;(),&lt;/span&gt;
           &lt;span class="n"&gt;transforms&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Normalize&lt;/span&gt;&lt;span class="p"&gt;((&lt;/span&gt;&lt;span class="mf"&gt;0.1307&lt;/span&gt;&lt;span class="p"&gt;,),&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mf"&gt;0.3081&lt;/span&gt;&lt;span class="p"&gt;,))&lt;/span&gt;
        &lt;span class="p"&gt;])&lt;/span&gt;
    &lt;span class="p"&gt;),&lt;/span&gt;
    &lt;span class="n"&gt;batch_size&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mi"&gt;1000&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;shuffle&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="bp"&gt;True&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/pre&gt;&lt;/div&gt;


&lt;p&gt;The only trick here is the normalization. The mean and standard deviation passed in are the actual values computed over the MNIST training set; after normalization (subtracting the mean and dividing by the standard deviation) the pixel values approximately follow a standard normal N(0,1) distribution, which helps training converge.&lt;/p&gt;
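&lt;p&gt;To see what the subtract-and-divide does, here is a minimal sketch in plain Python (simulated pixel values, not the real MNIST data): applying the same (x - mean) / std transform that &lt;code&gt;transforms.Normalize()&lt;/code&gt; performs yields values with mean close to 0 and standard deviation close to 1.&lt;/p&gt;

```python
import random
import statistics

random.seed(0)
mean, std = 0.1307, 0.3081  # the MNIST constants used in the loaders above
# Simulated "pixel" values with that mean and standard deviation
xs = [random.gauss(mean, std) for _ in range(100000)]
# The same (x - mean) / std transform that transforms.Normalize() applies
zs = [(x - mean) / std for x in xs]
print(round(statistics.mean(zs), 2), round(statistics.stdev(zs), 2))
```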
&lt;h2&gt;Training the model&lt;/h2&gt;
&lt;p&gt;Training is straightforward. I've modified the code so it returns the losses, so we can plot them later:&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;train&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;model&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;device&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;train_loader&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;optimizer&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;epoch&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
    &lt;span class="n"&gt;losses&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[]&lt;/span&gt;
    &lt;span class="n"&gt;model&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;train&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
    &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;batch_idx&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;data&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;target&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="nb"&gt;enumerate&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;train_loader&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
        &lt;span class="n"&gt;data&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;target&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;data&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;to&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;device&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt; &lt;span class="n"&gt;target&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;to&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;device&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
        &lt;span class="n"&gt;optimizer&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;zero_grad&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
        &lt;span class="n"&gt;output&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;model&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;data&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
        &lt;span class="n"&gt;loss&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;F&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;nll_loss&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;output&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;target&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
        &lt;span class="n"&gt;loss&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;backward&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
        &lt;span class="n"&gt;optimizer&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;step&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
        &lt;span class="n"&gt;losses&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;append&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;loss&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;item&lt;/span&gt;&lt;span class="p"&gt;())&lt;/span&gt;
        &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;batch_idx&lt;/span&gt; &lt;span class="o"&gt;&amp;gt;&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt; &lt;span class="ow"&gt;and&lt;/span&gt; &lt;span class="n"&gt;batch_idx&lt;/span&gt; &lt;span class="o"&gt;%&lt;/span&gt; &lt;span class="mi"&gt;100&lt;/span&gt; &lt;span class="o"&gt;==&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
            &lt;span class="k"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;Train Epoch: {} [{}/{}&lt;/span&gt;&lt;span class="se"&gt;\t&lt;/span&gt;&lt;span class="s1"&gt;({:.0f}%)]&lt;/span&gt;&lt;span class="se"&gt;\t&lt;/span&gt;&lt;span class="s1"&gt;Loss: {:.6f}&amp;#39;&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;format&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
                &lt;span class="n"&gt;epoch&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;batch_idx&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt; &lt;span class="nb"&gt;len&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;data&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt; &lt;span class="nb"&gt;len&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;train_loader&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;dataset&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt;
                &lt;span class="mf"&gt;100.&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt; &lt;span class="n"&gt;batch_idx&lt;/span&gt; &lt;span class="o"&gt;/&lt;/span&gt; &lt;span class="nb"&gt;len&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;train_loader&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt; &lt;span class="n"&gt;loss&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;item&lt;/span&gt;&lt;span class="p"&gt;()))&lt;/span&gt;
    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;losses&lt;/span&gt;
&lt;/pre&gt;&lt;/div&gt;


&lt;p&gt;This is a plain vanilla training loop, with &lt;code&gt;nll_loss()&lt;/code&gt;, which stands for &lt;a href="https://ljvmiranda921.github.io/notebook/2017/08/13/softmax-and-the-negative-log-likelihood/"&gt;negative log likelihood (NLL)&lt;/a&gt;.
NLL essentially transforms a class probability (0 to 1) into a value that runs from ∞ to 0, which is what we want in a loss function. The combination of outputting &lt;code&gt;log_softmax()&lt;/code&gt; and minimizing &lt;code&gt;nll_loss()&lt;/code&gt; is mathematically the same as outputting the probabilities and minimizing &lt;a href="https://en.wikipedia.org/wiki/Cross_entropy"&gt;cross-entropy&lt;/a&gt; (a measure, in bits, of how different two probability distributions are), but with better numerical stability.&lt;/p&gt;
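&lt;p&gt;The equivalence can be checked numerically with a tiny example in plain Python (the logits here are made up for illustration): computing -log of the softmax probability directly, and computing NLL of the log-softmax, give the same number.&lt;/p&gt;

```python
import math

# Raw scores (logits) for 3 classes, true class index 1
logits = [2.0, 0.5, -1.0]
target = 1

# Path 1: softmax then cross-entropy, i.e. -log p[target]
exps = [math.exp(z) for z in logits]
probs = [e / sum(exps) for e in exps]
ce = -math.log(probs[target])

# Path 2: log_softmax then NLL, i.e. -log_softmax[target]
log_probs = [z - math.log(sum(exps)) for z in logits]
nll = -log_probs[target]

print(abs(ce - nll) < 1e-12)  # True: the two formulations agree
```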
&lt;p&gt;Using matplotlib we can see how the model converges:&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;mean&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;li&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt; &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="nb"&gt;sum&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;li&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;&lt;span class="o"&gt;/&lt;/span&gt;&lt;span class="nb"&gt;len&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;li&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="n"&gt;plt&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;figure&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;figsize&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;14&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;4&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;
&lt;span class="n"&gt;plt&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;xlabel&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;training batch&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="n"&gt;plt&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;ylabel&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;loss&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="n"&gt;plt&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;plot&lt;/span&gt;&lt;span class="p"&gt;([&lt;/span&gt;&lt;span class="n"&gt;mean&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;losses&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;i&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="n"&gt;i&lt;/span&gt;&lt;span class="o"&gt;+&lt;/span&gt;&lt;span class="mi"&gt;10&lt;/span&gt;&lt;span class="p"&gt;])&lt;/span&gt; &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;i&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="nb"&gt;range&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nb"&gt;len&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;losses&lt;/span&gt;&lt;span class="p"&gt;))])&lt;/span&gt;
&lt;/pre&gt;&lt;/div&gt;


&lt;p&gt;&lt;img src="/images/mnist-training-loss.png" alt="MNIST training loss" style="width: 600px;"/&gt;&lt;/p&gt;
&lt;p&gt;By computing the accuracy on the training set at the end of each epoch, we can see how our model improves:&lt;/p&gt;
&lt;p&gt;&lt;img src="/images/mnist-training-accuracy.png" alt="MNIST training accuracy" style="width: 600px;"/&gt;&lt;/p&gt;
&lt;h2&gt;Evaluating the model&lt;/h2&gt;
&lt;p&gt;Now we use the test portion of MNIST, and run the model on it.
Most SciKit-Learn (SKL) metrics expect two of the following three inputs:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;the ground truth labels&lt;/li&gt;
&lt;li&gt;the predicted labels&lt;/li&gt;
&lt;li&gt;the prediction probabilities (e.g. for the ROC curve)&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;Now we can use SKL to get various metrics. Because the model performs so well, most of these are not very interesting.&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;test_label_predictions&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;model&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;device&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;test_loader&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
    &lt;span class="n"&gt;model&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;eval&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
    &lt;span class="n"&gt;actuals&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[]&lt;/span&gt;
    &lt;span class="n"&gt;predictions&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[]&lt;/span&gt;
    &lt;span class="k"&gt;with&lt;/span&gt; &lt;span class="n"&gt;torch&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;no_grad&lt;/span&gt;&lt;span class="p"&gt;():&lt;/span&gt;
        &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;data&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;target&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;test_loader&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
            &lt;span class="n"&gt;data&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;target&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;data&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;to&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;device&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt; &lt;span class="n"&gt;target&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;to&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;device&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
            &lt;span class="n"&gt;output&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;model&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;data&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
            &lt;span class="n"&gt;prediction&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;output&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;argmax&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;dim&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;keepdim&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="bp"&gt;True&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
            &lt;span class="n"&gt;actuals&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;extend&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;target&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;view_as&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;prediction&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;
            &lt;span class="n"&gt;predictions&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;extend&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;prediction&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;i&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;item&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt; &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;i&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;actuals&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;i&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;item&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt; &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;i&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;predictions&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;

&lt;span class="n"&gt;actuals&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;predictions&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;test_label_predictions&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;model&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;device&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;test_loader&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="k"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;Confusion matrix:&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="k"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;confusion_matrix&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;actuals&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;predictions&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;
&lt;span class="k"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;F1 score: &lt;/span&gt;&lt;span class="si"&gt;%f&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;&lt;/span&gt; &lt;span class="o"&gt;%&lt;/span&gt; &lt;span class="n"&gt;f1_score&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;actuals&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;predictions&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;average&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;micro&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;
&lt;span class="k"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;Accuracy score: &lt;/span&gt;&lt;span class="si"&gt;%f&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;&lt;/span&gt; &lt;span class="o"&gt;%&lt;/span&gt; &lt;span class="n"&gt;accuracy_score&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;actuals&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;predictions&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;
&lt;/pre&gt;&lt;/div&gt;


&lt;p&gt;Outputs:&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;Confusion matrix:
[[ 973    0    0    0    0    0    4    1    2    0]
 [   0 1130    0    2    0    1    1    1    0    0]
 [   1    1 1024    0    2    0    1    2    1    0]
 [   0    0    0 1005    0    2    0    0    3    0]
 [   0    0    1    0  975    0    1    1    1    3]
 [   2    0    0   11    0  874    2    1    2    0]
 [   0    1    0    0    1    1  955    0    0    0]
 [   0    3    3    1    0    0    0 1019    1    1]
 [   0    0    1    1    0    1    0    0  970    1]
 [   0    2    0    6    7    2    0    4    1  987]]
F1 score: 0.991200
Accuracy score: 0.991200
&lt;/pre&gt;&lt;/div&gt;


&lt;p&gt;ROC curve for one of the digit classes, with AUC; since the classifier is so good, the ROC curve hugs the ideal top-left corner, and the AUC is 1.0 (again, not very interesting):&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;test_class_probabilities&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;model&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;device&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;test_loader&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;which_class&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
    &lt;span class="n"&gt;model&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;eval&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
    &lt;span class="n"&gt;actuals&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[]&lt;/span&gt;
    &lt;span class="n"&gt;probabilities&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[]&lt;/span&gt;
    &lt;span class="k"&gt;with&lt;/span&gt; &lt;span class="n"&gt;torch&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;no_grad&lt;/span&gt;&lt;span class="p"&gt;():&lt;/span&gt;
        &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;data&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;target&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;test_loader&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
            &lt;span class="n"&gt;data&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;target&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;data&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;to&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;device&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt; &lt;span class="n"&gt;target&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;to&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;device&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
            &lt;span class="n"&gt;output&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;model&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;data&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
            &lt;span class="n"&gt;prediction&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;output&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;argmax&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;dim&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;keepdim&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="bp"&gt;True&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
            &lt;span class="n"&gt;actuals&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;extend&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;target&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;view_as&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;prediction&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;==&lt;/span&gt; &lt;span class="n"&gt;which_class&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
            &lt;span class="n"&gt;probabilities&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;extend&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;np&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;exp&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;output&lt;/span&gt;&lt;span class="p"&gt;[:,&lt;/span&gt; &lt;span class="n"&gt;which_class&lt;/span&gt;&lt;span class="p"&gt;]))&lt;/span&gt;
    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;i&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;item&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt; &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;i&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;actuals&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;i&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;item&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt; &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;i&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;probabilities&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;

&lt;span class="n"&gt;which_class&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;9&lt;/span&gt;
&lt;span class="n"&gt;actuals&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;class_probabilities&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;test_class_probabilities&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;model&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;device&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;test_loader&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;which_class&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="n"&gt;fpr&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;tpr&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;_&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;roc_curve&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;actuals&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;class_probabilities&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="n"&gt;roc_auc&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;auc&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;fpr&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;tpr&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="n"&gt;plt&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;figure&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
&lt;span class="n"&gt;lw&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;2&lt;/span&gt;
&lt;span class="n"&gt;plt&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;plot&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;fpr&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;tpr&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;color&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;darkorange&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
         &lt;span class="n"&gt;lw&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;lw&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;label&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;ROC curve (area = &lt;/span&gt;&lt;span class="si"&gt;%0.2f&lt;/span&gt;&lt;span class="s1"&gt;)&amp;#39;&lt;/span&gt; &lt;span class="o"&gt;%&lt;/span&gt; &lt;span class="n"&gt;roc_auc&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="n"&gt;plt&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;plot&lt;/span&gt;&lt;span class="p"&gt;([&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt; &lt;span class="n"&gt;color&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;navy&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;lw&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;lw&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;linestyle&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;--&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="n"&gt;plt&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;xlim&lt;/span&gt;&lt;span class="p"&gt;([&lt;/span&gt;&lt;span class="mf"&gt;0.0&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mf"&gt;1.0&lt;/span&gt;&lt;span class="p"&gt;])&lt;/span&gt;
&lt;span class="n"&gt;plt&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;ylim&lt;/span&gt;&lt;span class="p"&gt;([&lt;/span&gt;&lt;span class="mf"&gt;0.0&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mf"&gt;1.05&lt;/span&gt;&lt;span class="p"&gt;])&lt;/span&gt;
&lt;span class="n"&gt;plt&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;xlabel&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;False Positive Rate&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="n"&gt;plt&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;ylabel&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;True Positive Rate&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="n"&gt;plt&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;title&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;ROC for digit=&lt;/span&gt;&lt;span class="si"&gt;%d&lt;/span&gt;&lt;span class="s1"&gt; class&amp;#39;&lt;/span&gt; &lt;span class="o"&gt;%&lt;/span&gt; &lt;span class="n"&gt;which_class&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="n"&gt;plt&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;legend&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;loc&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="s2"&gt;&amp;quot;lower right&amp;quot;&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="n"&gt;plt&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;show&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
&lt;/pre&gt;&lt;/div&gt;


&lt;p&gt;&lt;img src="/images/mnist-roc.png" alt="MNIST ROC curve" style="width: 600px;"/&gt;&lt;/p&gt;
&lt;h2&gt;Conclusion&lt;/h2&gt;
&lt;p&gt;The combination of Pytorch, torchvision and SKL makes it really quick to play around with deep neural network architectures and see how they perform, without writing too much code, while using regular Python to do the debugging. A good next step is to play around with &lt;a href="https://www.cs.toronto.edu/~kriz/cifar.html"&gt;CIFAR data&lt;/a&gt;.&lt;/p&gt;</content><category term="python"></category><category term="pytorch"></category><category term="cnn"></category><category term="torchvision"></category><category term="mnist"></category><category term="skl"></category></entry><entry><title>SVM with Pytorch</title><link href="https://bytepawn.com/svm-with-pytorch.html" rel="alternate"></link><published>2019-04-16T00:00:00+02:00</published><updated>2019-04-16T00:00:00+02:00</updated><author><name>Marton Trencseni</name></author><id>tag:bytepawn.com,2019-04-16:/svm-with-pytorch.html</id><summary type="html">&lt;p&gt;I use the standard Iris dataset for supervised learning with a Support Vector Machine model using Pytorch's autograd.&lt;br/&gt;&lt;br/&gt;&lt;img src="/images/svm_1.png" alt="SVM" style="width: 300px;"/&gt;&lt;/p&gt;</summary><content type="html">&lt;p&gt;&lt;a href="https://en.wikipedia.org/wiki/Support-vector_machine"&gt;Support Vector Machines&lt;/a&gt; are a standard ML model for supervised classification. The basic idea behind a (linear) SVM is to find a separating hyperplane for two categories of points. Additionally, to make the model as generic as possible, SVM tries to make the margin separating the two sets of points as wide as possible. When a linear separator is not enough, SVM can be made non-linear with the kernel trick, but here I will stick to the linear model. &lt;a href="https://github.com/mtrencseni/pytorch-playground/blob/master/03-svm/SVM%20with%20Pytorch.ipynb"&gt;All the code is up on Github.&lt;/a&gt;&lt;/p&gt;
&lt;p&gt;&lt;img src="/images/svm_1.png" alt="SVM" style="width: 300px;"/&gt;&lt;/p&gt;
&lt;p&gt;Doing SVM in Pytorch is pretty simple, and we will follow the same recipe as in &lt;a href="http://bytepawn.com/pytorch-basics-solving-the-axb-matrix-equation-with-gradient-descent.html#pytorch-basics-solving-the-axb-matrix-equation-with-gradient-descent"&gt;the Ax=b post&lt;/a&gt;. We will use the standard &lt;a href="https://en.wikipedia.org/wiki/Iris_flower_data_set"&gt;Iris dataset&lt;/a&gt; for supervised learning.&lt;/p&gt;
&lt;h2&gt;Setting up the model: differentiable SVM&lt;/h2&gt;
&lt;p&gt;In order for Pytorch and autograd to work, we need to formulate the SVM model in a differentiable way. This is pretty straightforward, and has been done before by &lt;a href="http://deeplearning.net/wp-content/uploads/2013/03/dlsvm.pdf"&gt;Tang in this 2013 paper&lt;/a&gt;.&lt;/p&gt;
&lt;p&gt;The separating hyperplane is defined by the equation &lt;strong&gt;wx - b = 0&lt;/strong&gt;, where &lt;strong&gt;w&lt;/strong&gt; is the normal vector and &lt;strong&gt;b&lt;/strong&gt; is a scalar offset. &lt;strong&gt;w&lt;/strong&gt;’s dimensionality is however many features we have. Additionally, we will try to place the plane in such a way that it falls halfway between the two classes, so that, if possible, there are no points behind the &lt;strong&gt;wx - b = ±1&lt;/strong&gt; lines (see first image). For each training point &lt;strong&gt;x&lt;/strong&gt;, we want &lt;strong&gt;wx - b &amp;gt; 1&lt;/strong&gt; if &lt;strong&gt;x&lt;/strong&gt; is in the &lt;strong&gt;+1&lt;/strong&gt; class, and &lt;strong&gt;wx - b &amp;lt; -1&lt;/strong&gt; if &lt;strong&gt;x&lt;/strong&gt; is in the &lt;strong&gt;-1&lt;/strong&gt; class (we re-label classes to &lt;strong&gt;±1&lt;/strong&gt;). Calling the labels &lt;strong&gt;y&lt;/strong&gt;, we can multiply both inequalities by the label to get a single condition: &lt;strong&gt;y ( wx - b ) &amp;gt; 1&lt;/strong&gt;, or &lt;strong&gt;1 - y ( wx - b ) &amp;lt; 0&lt;/strong&gt;.&lt;/p&gt;
&lt;p&gt;So our constraint is for this expression to be less than zero for each training point. If it’s positive, that’s “bad”. If it’s negative, we don’t really care how negative it is. This leads to the loss function: &lt;strong&gt;∑ max[0, 1 - y ( wx - b ) ]&lt;/strong&gt;. To make it optimizer-friendly, we square it, which smooths out the kink the plain hinge has at the margin boundary: &lt;strong&gt;∑ max[0, 1 - y ( wx - b ) ]²&lt;/strong&gt;.&lt;/p&gt;
&lt;p&gt;There is a caveat though. What if the training points overlap? Or what if there are just a few points that would make the separating hyperplane’s margin very narrow? As the first picture shows, the width of the margin is &lt;strong&gt;2/|w|&lt;/strong&gt;; we also want to maximize this, or equivalently minimize &lt;strong&gt;|w|/2&lt;/strong&gt;, so the model generalizes better. So the full loss function is: &lt;strong&gt;|w|/2 + C ∑ max[0, 1 - y ( wx - b ) ]²&lt;/strong&gt;. &lt;strong&gt;C&lt;/strong&gt; is an important hyperparameter: it balances the importance of separating all the points and pushing them outside the margin &lt;em&gt;versus&lt;/em&gt; getting a wide margin.&lt;/p&gt;
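To make the loss concrete, here is a minimal sketch of the full loss function in Pytorch; the function name `svm_loss` and the toy points are my own for illustration, not from the post:

```python
import torch

def svm_loss(w, b, X, y, C=1.0):
    # |w|/2 + C * sum of squared hinge losses, matching the formula above
    hinge = torch.clamp(1 - y * (X @ w - b), min=0)
    return torch.norm(w) / 2 + C * torch.sum(hinge ** 2)

w = torch.tensor([1.0, 1.0])
b = torch.tensor(0.0)
X = torch.tensor([[2.0, 2.0], [0.2, 0.2]])  # one toy point per class
y = torch.tensor([1.0, -1.0])

# first point:  1 - 1*(4 - 0)    = -3  -> clamped to 0
# second point: 1 - (-1)*(0.4-0) = 1.4 -> squared: 1.96
print(svm_loss(w, b, X, y).item())  # sqrt(2)/2 + 1.96 ≈ 2.667
```

The second toy point is on the wrong side of the margin for its class, so it alone contributes the hinge term; the first point contributes nothing.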
&lt;h2&gt;Pytorch code&lt;/h2&gt;
&lt;p&gt;First, let’s get the Iris data. The easiest way is to get it from SciKit-Learn, which comes with a bunch of standard datasets. We can use &lt;a href="https://matplotlib.org/api/pyplot_api.html"&gt;pyplot&lt;/a&gt; to visualize Iris’s 4 features and the 3 species:&lt;/p&gt;
&lt;p&gt;&lt;img src="/images/svm_2.png" alt="Iris dataset" style="width: 600px;"/&gt;&lt;/p&gt;
&lt;p&gt;The code for this is:&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="nn"&gt;matplotlib&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;pyplot&lt;/span&gt; &lt;span class="k"&gt;as&lt;/span&gt; &lt;span class="n"&gt;plt&lt;/span&gt;
&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="nn"&gt;sklearn.datasets&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;load_iris&lt;/span&gt;

&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;plot&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;x_index&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;y_index&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
    &lt;span class="n"&gt;formatter&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;plt&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;FuncFormatter&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="k"&gt;lambda&lt;/span&gt; &lt;span class="n"&gt;i&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="n"&gt;args&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;iris&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;target_names&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="nb"&gt;int&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;i&lt;/span&gt;&lt;span class="p"&gt;)])&lt;/span&gt;
    &lt;span class="n"&gt;plt&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;scatter&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;iris&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;data&lt;/span&gt;&lt;span class="p"&gt;[:,&lt;/span&gt; &lt;span class="n"&gt;x_index&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt; &lt;span class="n"&gt;iris&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;data&lt;/span&gt;&lt;span class="p"&gt;[:,&lt;/span&gt; &lt;span class="n"&gt;y_index&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt; &lt;span class="n"&gt;c&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;iris&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;target&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="n"&gt;plt&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;colorbar&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;ticks&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;2&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt; &lt;span class="n"&gt;format&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;formatter&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="n"&gt;plt&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;xlabel&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;iris&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;feature_names&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;x_index&lt;/span&gt;&lt;span class="p"&gt;])&lt;/span&gt;
    &lt;span class="n"&gt;plt&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;ylabel&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;iris&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;feature_names&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;y_index&lt;/span&gt;&lt;span class="p"&gt;])&lt;/span&gt;

&lt;span class="n"&gt;iris&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;load_iris&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
&lt;span class="n"&gt;plt&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;figure&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;figsize&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;14&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;4&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;
&lt;span class="n"&gt;plt&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;subplot&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;121&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="n"&gt;plot&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="n"&gt;plt&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;subplot&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;122&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="n"&gt;plot&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;2&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;3&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="n"&gt;plt&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;show&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
&lt;/pre&gt;&lt;/div&gt;


&lt;p&gt;For this demonstration, I will just run SVM on the petal length and width (the last two features), and build a &lt;em&gt;setosa vs rest&lt;/em&gt; classifier. Constructing the training and test data:&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="nn"&gt;sklearn.model_selection&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;train_test_split&lt;/span&gt;

&lt;span class="n"&gt;X&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[[&lt;/span&gt;&lt;span class="n"&gt;x&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="mi"&gt;2&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt; &lt;span class="n"&gt;x&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="mi"&gt;3&lt;/span&gt;&lt;span class="p"&gt;]]&lt;/span&gt; &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;x&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;iris&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;data&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;
&lt;span class="n"&gt;y&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;iris&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;target&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;copy&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
&lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;i&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="nb"&gt;range&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nb"&gt;len&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;y&lt;/span&gt;&lt;span class="p"&gt;)):&lt;/span&gt;
    &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;y&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;i&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="o"&gt;==&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;y&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;i&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt;
    &lt;span class="k"&gt;else&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;y&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;i&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="o"&gt;-&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;
&lt;span class="n"&gt;X_train&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;X_test&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;y_train&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;y_test&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;train_test_split&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;X&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;y&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;test_size&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mf"&gt;0.2&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/pre&gt;&lt;/div&gt;


&lt;p&gt;Writing the code is straightforward; it’s the same story as in the Ax=b post. &lt;strong&gt;w&lt;/strong&gt; and &lt;strong&gt;b&lt;/strong&gt; are the variables that we want to optimize. For this task we can leave out the first part of the loss function (the &lt;strong&gt;|w|/2&lt;/strong&gt; term), because the classes are cleanly separable, so an exact solution is possible:&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="nn"&gt;numpy&lt;/span&gt; &lt;span class="k"&gt;as&lt;/span&gt; &lt;span class="nn"&gt;np&lt;/span&gt;
&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="nn"&gt;torch&lt;/span&gt;
&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="nn"&gt;random&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;shuffle&lt;/span&gt;

&lt;span class="n"&gt;dim&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nb"&gt;len&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;X&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;])&lt;/span&gt;
&lt;span class="n"&gt;w&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;torch&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;autograd&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Variable&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;torch&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;rand&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;dim&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt; &lt;span class="n"&gt;requires_grad&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="bp"&gt;True&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="n"&gt;b&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;torch&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;autograd&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Variable&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;torch&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;rand&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt;   &lt;span class="n"&gt;requires_grad&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="bp"&gt;True&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="n"&gt;step_size&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mf"&gt;1e-3&lt;/span&gt;
&lt;span class="n"&gt;num_epochs&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;5000&lt;/span&gt;

&lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;epoch&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="nb"&gt;range&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;num_epochs&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
    &lt;span class="n"&gt;inds&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;i&lt;/span&gt; &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;i&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="nb"&gt;range&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nb"&gt;len&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;X_train&lt;/span&gt;&lt;span class="p"&gt;))]&lt;/span&gt;
    &lt;span class="n"&gt;shuffle&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;inds&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;i&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="nb"&gt;range&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nb"&gt;len&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;inds&lt;/span&gt;&lt;span class="p"&gt;)):&lt;/span&gt;
        &lt;span class="n"&gt;L&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nb"&gt;max&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt; &lt;span class="o"&gt;-&lt;/span&gt; &lt;span class="n"&gt;y_train&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;inds&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;i&lt;/span&gt;&lt;span class="p"&gt;]]&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;torch&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;dot&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;w&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;torch&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Tensor&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;X_train&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;inds&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;i&lt;/span&gt;&lt;span class="p"&gt;]]))&lt;/span&gt; &lt;span class="o"&gt;-&lt;/span&gt; &lt;span class="n"&gt;b&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;&lt;span class="o"&gt;**&lt;/span&gt;&lt;span class="mi"&gt;2&lt;/span&gt;
        &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;L&lt;/span&gt; &lt;span class="o"&gt;!=&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="c1"&gt;# if the loss is zero, Pytorch leaves the variables as a float 0.0, so we can&amp;#39;t call backward() on it&lt;/span&gt;
            &lt;span class="n"&gt;L&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;backward&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
            &lt;span class="n"&gt;w&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;data&lt;/span&gt; &lt;span class="o"&gt;-=&lt;/span&gt; &lt;span class="n"&gt;step_size&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt; &lt;span class="n"&gt;w&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;grad&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;data&lt;/span&gt; &lt;span class="c1"&gt;# step&lt;/span&gt;
            &lt;span class="n"&gt;b&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;data&lt;/span&gt; &lt;span class="o"&gt;-=&lt;/span&gt; &lt;span class="n"&gt;step_size&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt; &lt;span class="n"&gt;b&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;grad&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;data&lt;/span&gt; &lt;span class="c1"&gt;# step&lt;/span&gt;
            &lt;span class="n"&gt;w&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;grad&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;data&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;zero_&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
            &lt;span class="n"&gt;b&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;grad&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;data&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;zero_&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
&lt;/pre&gt;&lt;/div&gt;


&lt;p&gt;Let’s print out the &lt;strong&gt;w&lt;/strong&gt; and &lt;strong&gt;b&lt;/strong&gt; values, and evaluate the model:&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;span class="k"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;plane equation:  w=&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;w&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;detach&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;numpy&lt;/span&gt;&lt;span class="p"&gt;(),&lt;/span&gt; &lt;span class="s1"&gt;&amp;#39;b =&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;b&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;detach&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;numpy&lt;/span&gt;&lt;span class="p"&gt;()[&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;])&lt;/span&gt;

&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;accuracy&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;X&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;y&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
    &lt;span class="n"&gt;correct&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;
    &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;i&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="nb"&gt;range&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nb"&gt;len&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;y&lt;/span&gt;&lt;span class="p"&gt;)):&lt;/span&gt;
        &lt;span class="n"&gt;y_predicted&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nb"&gt;int&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;np&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;sign&lt;/span&gt;&lt;span class="p"&gt;((&lt;/span&gt;&lt;span class="n"&gt;torch&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;dot&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;w&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;torch&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Tensor&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;X&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;i&lt;/span&gt;&lt;span class="p"&gt;]))&lt;/span&gt; &lt;span class="o"&gt;-&lt;/span&gt; &lt;span class="n"&gt;b&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;detach&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;numpy&lt;/span&gt;&lt;span class="p"&gt;()[&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;]))&lt;/span&gt;
        &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;y_predicted&lt;/span&gt; &lt;span class="o"&gt;==&lt;/span&gt; &lt;span class="n"&gt;y&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;i&lt;/span&gt;&lt;span class="p"&gt;]:&lt;/span&gt; &lt;span class="n"&gt;correct&lt;/span&gt; &lt;span class="o"&gt;+=&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt;
    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="nb"&gt;float&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;correct&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;&lt;span class="o"&gt;/&lt;/span&gt;&lt;span class="nb"&gt;len&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;y&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="k"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;train accuracy&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;accuracy&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;X_train&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;y_train&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;
&lt;span class="k"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;test accuracy&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;accuracy&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;X_test&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;y_test&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;
&lt;/pre&gt;&lt;/div&gt;


&lt;p&gt;I get:&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;span class="n"&gt;plane&lt;/span&gt; &lt;span class="n"&gt;equation&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;  &lt;span class="n"&gt;w&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="o"&gt;-&lt;/span&gt;&lt;span class="mf"&gt;0.8717707&lt;/span&gt; &lt;span class="o"&gt;-&lt;/span&gt;&lt;span class="mf"&gt;1.4143362&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="n"&gt;b&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="o"&gt;-&lt;/span&gt;&lt;span class="mf"&gt;3.2047558&lt;/span&gt;
&lt;span class="n"&gt;train&lt;/span&gt; &lt;span class="n"&gt;accuracy&lt;/span&gt; &lt;span class="mf"&gt;1.0&lt;/span&gt;
&lt;span class="n"&gt;test&lt;/span&gt; &lt;span class="n"&gt;accuracy&lt;/span&gt; &lt;span class="mf"&gt;1.0&lt;/span&gt;
&lt;/pre&gt;&lt;/div&gt;
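As a quick sanity check, the margin width &lt;strong&gt;2/|w|&lt;/strong&gt; can be computed directly from the printed parameters; the numbers below are from my run above and will vary slightly between runs:

```python
import numpy as np

# learned parameters from the run above; your numbers will differ slightly
w = np.array([-0.8717707, -1.4143362])

margin = 2 / np.linalg.norm(w)  # width of the margin, 2/|w|
print(margin)  # ≈ 1.20
```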


&lt;p&gt;Let’s visualize the solution:&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;line_func&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;x&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;offset&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
    &lt;span class="k"&gt;return&lt;/span&gt;   &lt;span class="o"&gt;-&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;offset&lt;/span&gt; &lt;span class="o"&gt;-&lt;/span&gt; &lt;span class="n"&gt;b&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;detach&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;numpy&lt;/span&gt;&lt;span class="p"&gt;()[&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="o"&gt;+&lt;/span&gt; &lt;span class="n"&gt;w&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;detach&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;numpy&lt;/span&gt;&lt;span class="p"&gt;()[&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt; &lt;span class="n"&gt;x&lt;/span&gt; &lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;/&lt;/span&gt; &lt;span class="n"&gt;w&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;detach&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;numpy&lt;/span&gt;&lt;span class="p"&gt;()[&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;

&lt;span class="n"&gt;x&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;np&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;array&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nb"&gt;range&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;5&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;
&lt;span class="n"&gt;ym&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;line_func&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;x&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;  &lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="n"&gt;yp&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;line_func&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;x&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;  &lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="n"&gt;yn&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;line_func&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;x&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="o"&gt;-&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="n"&gt;x_index&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;2&lt;/span&gt;
&lt;span class="n"&gt;y_index&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;3&lt;/span&gt;
&lt;span class="n"&gt;plt&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;figure&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;figsize&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;8&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;6&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;
&lt;span class="n"&gt;formatter&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;plt&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;FuncFormatter&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="k"&gt;lambda&lt;/span&gt; &lt;span class="n"&gt;i&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="n"&gt;args&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;iris&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;target_names&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="nb"&gt;int&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;i&lt;/span&gt;&lt;span class="p"&gt;)])&lt;/span&gt;
&lt;span class="n"&gt;plt&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;scatter&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;iris&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;data&lt;/span&gt;&lt;span class="p"&gt;[:,&lt;/span&gt; &lt;span class="n"&gt;x_index&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt; &lt;span class="n"&gt;iris&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;data&lt;/span&gt;&lt;span class="p"&gt;[:,&lt;/span&gt; &lt;span class="n"&gt;y_index&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt; &lt;span class="n"&gt;c&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;iris&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;target&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="n"&gt;plt&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;colorbar&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;ticks&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;2&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt; &lt;span class="n"&gt;format&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;formatter&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="n"&gt;plt&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;xlabel&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;iris&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;feature_names&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;x_index&lt;/span&gt;&lt;span class="p"&gt;])&lt;/span&gt;
&lt;span class="n"&gt;plt&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;ylabel&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;iris&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;feature_names&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;y_index&lt;/span&gt;&lt;span class="p"&gt;])&lt;/span&gt;
&lt;span class="n"&gt;plt&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;plot&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;x&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;ym&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="n"&gt;plt&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;plot&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;x&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;yp&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="n"&gt;plt&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;plot&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;x&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;yn&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/pre&gt;&lt;/div&gt;


&lt;p&gt;To get:&lt;/p&gt;
&lt;p&gt;&lt;img src="/images/svm_3.png" alt="Iris dataset with SVM separator" style="width: 400px;"/&gt;&lt;/p&gt;
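For comparison, here is a sketch of the same setosa-vs-rest classifier in SciKit-Learn, using `LinearSVC`, one of several SVM variants the library offers:

```python
from sklearn.datasets import load_iris
from sklearn.svm import LinearSVC

iris = load_iris()
X = iris.data[:, 2:4]                # petal length and width
y = (iris.target == 0).astype(int)   # setosa vs rest

clf = LinearSVC(C=1.0).fit(X, y)
print('accuracy:', clf.score(X, y))  # setosa is linearly separable
```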
&lt;p&gt;This was just a game. There’s no good reason to run SVM on Pytorch: &lt;a href="https://scikit-learn.org/stable/modules/svm.html"&gt;SciKit-Learn has a built-in SVM model&lt;/a&gt; that is more robust and scalable and can get this done in fewer lines of code.&lt;/p&gt;</content><category term="pytorch"></category><category term="svm"></category><category term="iris"></category></entry><entry><title>Hacker News Embeddings with PyTorch</title><link href="https://bytepawn.com/hacker-news-embeddings-with-pytorch.html" rel="alternate"></link><published>2019-03-12T00:00:00+01:00</published><updated>2019-03-12T00:00:00+01:00</updated><author><name>Marton Trencseni</name></author><id>tag:bytepawn.com,2019-03-12:/hacker-news-embeddings-with-pytorch.html</id><summary type="html">&lt;p&gt;A PyTorch model is trained on public Hacker News data, embedding posts and comments into a high-dimensional vector space, using the mean squared error (MSE) of dot products as the loss function. The resulting model is reasonably good at finding similar posts and recommending posts for users.&lt;br/&gt;&lt;br/&gt;&lt;img src="/images/vectors.png" alt="Vector space" style="width: 300px;"/&gt;&lt;/p&gt;</summary><content type="html">&lt;p&gt;This post is based on &lt;a href="https://douwe.com/about"&gt;Douwe Osinga’s&lt;/a&gt; excellent &lt;a href="https://www.amazon.com/Deep-Learning-Cookbook-Practical-Recipes/dp/149199584X"&gt;Deep Learning Cookbook&lt;/a&gt;, specifically &lt;a href="https://github.com/DOsinga/deep_learning_cookbook/blob/master/04.2%20Build%20a%20recommender%20system%20based%20on%20outgoing%20Wikipedia%20links.ipynb"&gt;Chapter 4&lt;/a&gt;, embeddings. Embedding is a simple thing: given an entity like a Hacker News post or a Hacker News user, we associate an n-dimensional vector with it. If two entities are similar in some way, we assert that the dot product (cosine similarity) should be &lt;code&gt;+1&lt;/code&gt;, i.e.
the vectors should be “aligned”. If two entities are not similar, we assert that the dot product should be &lt;code&gt;-1&lt;/code&gt;, i.e. they should point in opposite directions. We then feed the data to a model, and in the training process get the optimizer to find assignments of entities to vectors such that those assertions are satisfied as much as possible. The most famous example of embeddings is Google's &lt;a href="https://en.wikipedia.org/wiki/Word2vec"&gt;word2vec&lt;/a&gt;.&lt;/p&gt;
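&lt;p&gt;To make the objective concrete, here is a minimal pure-Python sketch: for unit-length vectors, the dot product equals the cosine similarity, +1 for aligned vectors and -1 for opposite ones.&lt;/p&gt;

```python
import math

def dot(u, v):
    # Dot product of two equal-length vectors.
    return sum(a * b for a, b in zip(u, v))

def normalize(v):
    # Scale a vector to unit length, so dot products equal cosine similarity.
    norm = math.sqrt(dot(v, v))
    return [a / norm for a in v]

u = normalize([1.0, 2.0])
neg_u = [-a for a in u]         # the opposite-pointing vector
print(round(dot(u, u), 3))      # aligned: 1.0
print(round(dot(u, neg_u), 3))  # opposite: -1.0
```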
&lt;p&gt;&lt;img src="/images/dlc.jpg" alt="Deep Learning Cookbook" style="width: 400px;"/&gt;&lt;/p&gt;
&lt;p&gt;In the book, embedding is performed on movies. For each movie, the wikipedia page is retrieved, and outgoing links to other wiki pages are collected. Two movies are similar if they both link to the same wiki page, else they are not similar. &lt;a href="https://keras.io/"&gt;Keras&lt;/a&gt; is used to train the model and the results are reasonably good.&lt;/p&gt;
&lt;p&gt;I wanted to implement the same thing in &lt;a href="https://pytorch.org"&gt;PyTorch&lt;/a&gt;, but on a different data set, to keep it interesting. As a regular &lt;a href="https://news.ycombinator.com"&gt;Hacker News&lt;/a&gt; reader, I chose Hacker News. Users’ likes are not public, but their comments are, so I use comments for similarity.&lt;/p&gt;
&lt;p&gt;The plan is:&lt;/p&gt;
&lt;ol&gt;
&lt;li&gt;Retrieve the top 1,000 HN posts from 2018 by number of comments&lt;/li&gt;
&lt;li&gt;For each post, retrieve the unique set of users who commented&lt;/li&gt;
&lt;li&gt;Use these &lt;code&gt;(post, user)&lt;/code&gt; pairs for similarity embedding&lt;/li&gt;
&lt;li&gt;Train with &lt;a href="https://en.wikipedia.org/wiki/Mean_squared_error"&gt;mean squared error&lt;/a&gt; (MSE)&lt;/li&gt;
&lt;li&gt;Use the resulting model to get:&lt;ul&gt;
&lt;li&gt;post similarity: if I like post P, recommend other posts I might like&lt;/li&gt;
&lt;li&gt;user recommendations: I am user U, recommend posts I might like&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;/ol&gt;
&lt;p&gt;&lt;a href="https://github.com/mtrencseni/pytorch-playground/tree/master/02-hacker-news-recommendations"&gt;All the code shown here, with the data files, is up on Github.&lt;/a&gt;&lt;/p&gt;
&lt;h2&gt;Getting the top 1,000 HN posts&lt;/h2&gt;
&lt;p&gt;The simplest way to get this is from &lt;a href="https://cloud.google.com/bigquery"&gt;Google BigQuery&lt;/a&gt;, which has a &lt;a href="https://console.cloud.google.com/marketplace/details/y-combinator/hacker-news"&gt;public Hacker News dataset&lt;/a&gt;. We can write a SQL query and download the results as a CSV file from the Google Cloud console:&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;span class="k"&gt;SELECT&lt;/span&gt;
    &lt;span class="n"&gt;id&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;descendants&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;title&lt;/span&gt;
&lt;span class="k"&gt;FROM&lt;/span&gt;
    &lt;span class="o"&gt;`&lt;/span&gt;&lt;span class="n"&gt;bigquery&lt;/span&gt;&lt;span class="o"&gt;-&lt;/span&gt;&lt;span class="k"&gt;public&lt;/span&gt;&lt;span class="o"&gt;-&lt;/span&gt;&lt;span class="k"&gt;data&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;hacker_news&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="k"&gt;full&lt;/span&gt;&lt;span class="o"&gt;`&lt;/span&gt;
&lt;span class="k"&gt;WHERE&lt;/span&gt;
        &lt;span class="k"&gt;timestamp&lt;/span&gt; &lt;span class="o"&gt;&amp;gt;=&lt;/span&gt; &lt;span class="ss"&gt;&amp;quot;2018-01-01&amp;quot;&lt;/span&gt;
    &lt;span class="k"&gt;AND&lt;/span&gt; &lt;span class="k"&gt;timestamp&lt;/span&gt; &lt;span class="o"&gt;&amp;lt;&lt;/span&gt;  &lt;span class="err"&gt;“&lt;/span&gt;&lt;span class="mi"&gt;2019&lt;/span&gt;&lt;span class="o"&gt;-&lt;/span&gt;&lt;span class="mi"&gt;01&lt;/span&gt;&lt;span class="o"&gt;-&lt;/span&gt;&lt;span class="mi"&gt;01&lt;/span&gt;&lt;span class="err"&gt;”&lt;/span&gt;
    &lt;span class="k"&gt;AND&lt;/span&gt; &lt;span class="k"&gt;type&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="ss"&gt;&amp;quot;story&amp;quot;&lt;/span&gt;
    &lt;span class="k"&gt;AND&lt;/span&gt; &lt;span class="n"&gt;score&lt;/span&gt; &lt;span class="o"&gt;&amp;gt;&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt;
&lt;span class="k"&gt;ORDER&lt;/span&gt; &lt;span class="k"&gt;BY&lt;/span&gt;
    &lt;span class="mi"&gt;2&lt;/span&gt; &lt;span class="k"&gt;DESC&lt;/span&gt;
&lt;span class="k"&gt;LIMIT&lt;/span&gt;
    &lt;span class="mi"&gt;1000&lt;/span&gt;
&lt;/pre&gt;&lt;/div&gt;


&lt;p&gt;The result of this is &lt;a href="https://github.com/mtrencseni/pytorch-playground/blob/master/02-hacker-news-recommendations/top_1000_posts.csv"&gt;top_1000_posts.csv&lt;/a&gt;.&lt;/p&gt;
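&lt;p&gt;Back in Python, the CSV can be read with the standard library. A minimal sketch, assuming the header matches the &lt;code&gt;SELECT&lt;/code&gt; above (the sample rows here are made up for illustration):&lt;/p&gt;

```python
import csv
import io

# Stand-in for top_1000_posts.csv, with the columns from the query above;
# the ids and titles below are made up for illustration.
sample = io.StringIO(
    "id,descendants,title\n"
    "17000001,950,Example post A\n"
    "17000002,800,Example post B\n"
)

rows = list(csv.DictReader(sample))
post_ids = [int(row["id"]) for row in rows]
print(post_ids)   # [17000001, 17000002]
```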
&lt;h2&gt;Retrieve commenters for top posts&lt;/h2&gt;
&lt;p&gt;Getting the comments is not practical from BigQuery because the table stores the tree hierarchy (&lt;code&gt;parent_id&lt;/code&gt; of the parent comment, but not the &lt;code&gt;post_id&lt;/code&gt;), so we’d have to query repeatedly to get all the comments of the post, which is inconvenient. Fortunately there’s an easier way. &lt;a href="https://algolia.com"&gt;Algolia&lt;/a&gt; has a Hacker News API where we can download one big JSON per post, containing all the comments. The API endpoint for this is:&lt;/p&gt;
&lt;p&gt;&lt;code&gt;https://hn.algolia.com/api/v1/items/&amp;lt;post_id&amp;gt;&lt;/code&gt;&lt;/p&gt;
&lt;p&gt;So we just go through all the posts from the previous step and download each one from Algolia.
Getting the set of commenters out of the JSON would be easiest with &lt;code&gt;json.load()&lt;/code&gt;, but this sometimes fails on bad JSON. Instead we use an &lt;a href="https://github.com/mtrencseni/rxe"&gt;rxe&lt;/a&gt; regexp:&lt;/p&gt;
&lt;p&gt;&lt;code&gt;rxe.one('"author":"').one_or_more(rxe.set_except(['"'])).one('"')&lt;/code&gt;&lt;/p&gt;
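&lt;p&gt;With the standard &lt;code&gt;re&lt;/code&gt; module, a roughly equivalent pattern would be the following (the JSON snippet below is made up for illustration):&lt;/p&gt;

```python
import re

# "author":" followed by one or more non-quote characters, then a closing quote.
pattern = re.compile(r'"author":"([^"]+)"')

snippet = '{"author":"pg","text":"..."},{"author":"dang","text":"..."}'
authors = set(pattern.findall(snippet))
print(sorted(authors))   # ['dang', 'pg']
```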
&lt;p&gt;The &lt;a href="https://github.com/mtrencseni/pytorch-playground/blob/master/02-hacker-news-recommendations/download_comments.py"&gt;entire code for this download script is on Github&lt;/a&gt;. The script &lt;a href="https://github.com/mtrencseni/pytorch-playground/tree/master/02-hacker-news-recommendations/cache"&gt;caches&lt;/a&gt; files, so repeatedly running it doesn’t repeatedly re-download data from Algolia.&lt;/p&gt;
&lt;p&gt;The script outputs the &lt;code&gt;(post, user)&lt;/code&gt; pairs into &lt;a href="https://github.com/mtrencseni/pytorch-playground/blob/master/02-hacker-news-recommendations/post_comments_1000.csv"&gt;post_comments_1000.csv&lt;/a&gt;.&lt;/p&gt;
&lt;h2&gt;Building the model&lt;/h2&gt;
&lt;p&gt;PyTorch has a built-in module for &lt;a href="https://pytorch.org/tutorials/beginner/nlp/word_embeddings_tutorial.html"&gt;Embeddings&lt;/a&gt;, which makes building the model simple. It’s essentially a big array, which stores for each entity the assigned high-dimensional vector. In our case, both posts and users are embedded, so if there are &lt;code&gt;num_posts&lt;/code&gt; posts and &lt;code&gt;num_users&lt;/code&gt; users, then &lt;code&gt;num_vectors = num_posts + num_users&lt;/code&gt;. The array has &lt;code&gt;num_vectors&lt;/code&gt; rows; each row is that entity’s embedding vector.&lt;/p&gt;
&lt;p&gt;PyTorch will then optimize the entries in this array, so that the dot products of the combinations of the vectors are &lt;code&gt;+1&lt;/code&gt; and &lt;code&gt;-1&lt;/code&gt; as specified during training, or as close as possible.&lt;/p&gt;
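&lt;p&gt;Conceptually, the embedding table is just an indexable array of vectors, one row per entity. A minimal pure-Python sketch of the lookup (made-up sizes, not the actual &lt;code&gt;torch.nn.Embedding&lt;/code&gt; API):&lt;/p&gt;

```python
import random

num_posts, num_users, embedding_dim = 3, 2, 4
num_vectors = num_posts + num_users

# One row per entity: the first num_posts rows are posts, the rest are users.
random.seed(0)
table = [[random.uniform(-1, 1) for _ in range(embedding_dim)]
         for _ in range(num_vectors)]

def lookup(entity_idx):
    # An embedding lookup is just row indexing into the table.
    return table[entity_idx]

post_vec = lookup(0)                  # first post's vector
user_vec = lookup(num_posts)          # first user's vector
print(len(post_vec), len(user_vec))   # 4 4
```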
&lt;p&gt;&lt;img src="/images/vectors.png" alt="Vector space" style="width: 400px;"/&gt;&lt;/p&gt;
&lt;p&gt;The next step is to create a Model which contains the embedding. We implement the &lt;code&gt;forward()&lt;/code&gt; function, which just returns the dot product for a minibatch of posts and users, as per the current embedding vectors:&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;span class="k"&gt;class&lt;/span&gt; &lt;span class="nc"&gt;Model&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;torch&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;nn&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Module&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
    &lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="fm"&gt;__init__&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="bp"&gt;self&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;num_vectors&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;embedding_dim&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
        &lt;span class="nb"&gt;super&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;Model&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="bp"&gt;self&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="fm"&gt;__init__&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
        &lt;span class="bp"&gt;self&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;embedding&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;torch&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;nn&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Embedding&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;num_vectors&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;embedding_dim&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;max_norm&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mf"&gt;1.0&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;forward&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="bp"&gt;self&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nb"&gt;input&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
        &lt;span class="n"&gt;t1&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="bp"&gt;self&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;embedding&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;torch&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;LongTensor&lt;/span&gt;&lt;span class="p"&gt;([&lt;/span&gt;&lt;span class="n"&gt;v&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;v&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="nb"&gt;input&lt;/span&gt;&lt;span class="p"&gt;]))&lt;/span&gt;
        &lt;span class="n"&gt;t2&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="bp"&gt;self&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;embedding&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;torch&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;LongTensor&lt;/span&gt;&lt;span class="p"&gt;([&lt;/span&gt;&lt;span class="n"&gt;v&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;v&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="nb"&gt;input&lt;/span&gt;&lt;span class="p"&gt;]))&lt;/span&gt;
        &lt;span class="n"&gt;dot_products&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;torch&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;bmm&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
            &lt;span class="n"&gt;t1&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;contiguous&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;view&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nb"&gt;len&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nb"&gt;input&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="bp"&gt;self&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;embedding&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;embedding_dim&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt;
            &lt;span class="n"&gt;t2&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;contiguous&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;view&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nb"&gt;len&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nb"&gt;input&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt; &lt;span class="bp"&gt;self&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;embedding&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;embedding_dim&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
        &lt;span class="p"&gt;)&lt;/span&gt;
        &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;dot_products&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;contiguous&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;view&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nb"&gt;len&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nb"&gt;input&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;
&lt;/pre&gt;&lt;/div&gt;


&lt;p&gt;Next, we need to write a function to build the minibatches we will use for training. For training, we will pass in existing combinations and “assert” that the dot product should be &lt;code&gt;+1&lt;/code&gt;, and some missing combinations with &lt;code&gt;-1&lt;/code&gt;:&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;build_minibatch&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;num_positives&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;num_negatives&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
    &lt;span class="n"&gt;minibatch&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[]&lt;/span&gt;
    &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;_&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="nb"&gt;range&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;num_positives&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
        &lt;span class="n"&gt;which&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nb"&gt;int&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nb"&gt;len&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;idx_list&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt; &lt;span class="n"&gt;random&lt;/span&gt;&lt;span class="p"&gt;())&lt;/span&gt;
        &lt;span class="n"&gt;minibatch&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;append&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;idx_list&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;which&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="o"&gt;+&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;])&lt;/span&gt;
    &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;_&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="nb"&gt;range&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;num_negatives&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
        &lt;span class="k"&gt;while&lt;/span&gt; &lt;span class="bp"&gt;True&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
            &lt;span class="n"&gt;post&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nb"&gt;int&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nb"&gt;len&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;posts&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt; &lt;span class="n"&gt;random&lt;/span&gt;&lt;span class="p"&gt;())&lt;/span&gt;
            &lt;span class="n"&gt;user&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;min_user_idx&lt;/span&gt; &lt;span class="o"&gt;+&lt;/span&gt; &lt;span class="nb"&gt;int&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nb"&gt;len&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;users&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt; &lt;span class="n"&gt;random&lt;/span&gt;&lt;span class="p"&gt;())&lt;/span&gt;
            &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;post&lt;/span&gt; &lt;span class="ow"&gt;not&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;idx_user_posts&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;user&lt;/span&gt;&lt;span class="p"&gt;]:&lt;/span&gt;
                &lt;span class="k"&gt;break&lt;/span&gt;
        &lt;span class="n"&gt;minibatch&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;append&lt;/span&gt;&lt;span class="p"&gt;([&lt;/span&gt;&lt;span class="n"&gt;post&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;user&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="o"&gt;+&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="o"&gt;-&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;])&lt;/span&gt;
    &lt;span class="n"&gt;shuffle&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;minibatch&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;minibatch&lt;/span&gt;
&lt;/pre&gt;&lt;/div&gt;


&lt;p&gt;Now we can perform the training. We will embed into 50 dimensions, we will use 500 positive and 500 negative combinations per minibatch. We use the &lt;a href="https://pytorch.org/docs/stable/optim.html"&gt;Adam optimizer&lt;/a&gt; and minimize the mean squared error between our asserted dot products and the actual dot products:&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;span class="n"&gt;embedding_dim&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;50&lt;/span&gt;
&lt;span class="n"&gt;model&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;Model&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;num_vectors&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;embedding_dim&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="n"&gt;optimizer&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;torch&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;optim&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Adam&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;model&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;parameters&lt;/span&gt;&lt;span class="p"&gt;())&lt;/span&gt;
&lt;span class="n"&gt;loss_function&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;torch&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;nn&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;MSELoss&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;reduction&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;mean&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="n"&gt;num_epochs&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;50&lt;/span&gt;
&lt;span class="n"&gt;num_positives&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;500&lt;/span&gt;
&lt;span class="n"&gt;num_negatives&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;500&lt;/span&gt;
&lt;span class="n"&gt;num_steps_per_epoch&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nb"&gt;int&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nb"&gt;len&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;post_comments&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;/&lt;/span&gt; &lt;span class="n"&gt;num_positives&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;i&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="nb"&gt;range&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;num_epochs&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
    &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;j&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="nb"&gt;range&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;num_steps_per_epoch&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
        &lt;span class="n"&gt;optimizer&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;zero_grad&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
        &lt;span class="n"&gt;minibatch&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;build_minibatch&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;num_positives&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;num_negatives&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
        &lt;span class="n"&gt;y&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;model&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;forward&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;minibatch&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
        &lt;span class="n"&gt;target&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;torch&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;FloatTensor&lt;/span&gt;&lt;span class="p"&gt;([&lt;/span&gt;&lt;span class="n"&gt;v&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="mi"&gt;2&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;v&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;minibatch&lt;/span&gt;&lt;span class="p"&gt;])&lt;/span&gt;
        &lt;span class="n"&gt;loss&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;loss_function&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;y&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;target&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
        &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;i&lt;/span&gt; &lt;span class="o"&gt;==&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt; &lt;span class="ow"&gt;and&lt;/span&gt; &lt;span class="n"&gt;j&lt;/span&gt; &lt;span class="o"&gt;==&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
            &lt;span class="k"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;r: loss = &lt;/span&gt;&lt;span class="si"&gt;%.3f&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;&lt;/span&gt; &lt;span class="o"&gt;%&lt;/span&gt; &lt;span class="nb"&gt;float&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;loss&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;
        &lt;span class="n"&gt;loss&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;backward&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;retain_graph&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="bp"&gt;True&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
        &lt;span class="n"&gt;optimizer&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;step&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
    &lt;span class="k"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;&lt;/span&gt;&lt;span class="si"&gt;%s&lt;/span&gt;&lt;span class="s1"&gt;: loss = &lt;/span&gt;&lt;span class="si"&gt;%.3f&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;&lt;/span&gt; &lt;span class="o"&gt;%&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;i&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nb"&gt;float&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;loss&lt;/span&gt;&lt;span class="p"&gt;)))&lt;/span&gt;
&lt;span class="c1"&gt;# print out some samples to see how good the fit is&lt;/span&gt;
&lt;span class="n"&gt;minibatch&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;build_minibatch&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;5&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;5&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="n"&gt;y&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;model&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;forward&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;minibatch&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="n"&gt;target&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;torch&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;FloatTensor&lt;/span&gt;&lt;span class="p"&gt;([&lt;/span&gt;&lt;span class="n"&gt;v&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="mi"&gt;2&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;v&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;minibatch&lt;/span&gt;&lt;span class="p"&gt;])&lt;/span&gt;
&lt;span class="k"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;Sample vectors:&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
&lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;i&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="nb"&gt;range&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;5&lt;/span&gt;&lt;span class="o"&gt;+&lt;/span&gt;&lt;span class="mi"&gt;5&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
    &lt;span class="k"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;&lt;/span&gt;&lt;span class="si"&gt;%.3f&lt;/span&gt;&lt;span class="s1"&gt; vs &lt;/span&gt;&lt;span class="si"&gt;%.3f&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;&lt;/span&gt; &lt;span class="o"&gt;%&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nb"&gt;float&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;y&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;i&lt;/span&gt;&lt;span class="p"&gt;]),&lt;/span&gt; &lt;span class="nb"&gt;float&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;target&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;i&lt;/span&gt;&lt;span class="p"&gt;])))&lt;/span&gt;
&lt;/pre&gt;&lt;/div&gt;


&lt;p&gt;Output:&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;span class="n"&gt;r&lt;/span&gt;&lt;span class="o"&gt;:&lt;/span&gt; &lt;span class="n"&gt;loss&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mf"&gt;1.016&lt;/span&gt;
&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="o"&gt;:&lt;/span&gt; &lt;span class="n"&gt;loss&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mf"&gt;1.009&lt;/span&gt;
&lt;span class="o"&gt;...&lt;/span&gt;
&lt;span class="mi"&gt;49&lt;/span&gt;&lt;span class="o"&gt;:&lt;/span&gt; &lt;span class="n"&gt;loss&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mf"&gt;0.633&lt;/span&gt;

&lt;span class="n"&gt;Sample&lt;/span&gt; &lt;span class="n"&gt;vectors&lt;/span&gt;&lt;span class="o"&gt;:&lt;/span&gt;
&lt;span class="mf"&gt;0.319&lt;/span&gt;  &lt;span class="n"&gt;vs&lt;/span&gt; &lt;span class="o"&gt;-&lt;/span&gt;&lt;span class="mf"&gt;1.000&lt;/span&gt;
&lt;span class="mf"&gt;0.226&lt;/span&gt;  &lt;span class="n"&gt;vs&lt;/span&gt; &lt;span class="o"&gt;-&lt;/span&gt;&lt;span class="mf"&gt;1.000&lt;/span&gt;
&lt;span class="o"&gt;-&lt;/span&gt;&lt;span class="mf"&gt;0.232&lt;/span&gt; &lt;span class="n"&gt;vs&lt;/span&gt; &lt;span class="o"&gt;-&lt;/span&gt;&lt;span class="mf"&gt;1.000&lt;/span&gt;
&lt;span class="mf"&gt;0.179&lt;/span&gt;  &lt;span class="n"&gt;vs&lt;/span&gt; &lt;span class="o"&gt;-&lt;/span&gt;&lt;span class="mf"&gt;1.000&lt;/span&gt;
&lt;span class="o"&gt;-&lt;/span&gt;&lt;span class="mf"&gt;0.096&lt;/span&gt; &lt;span class="n"&gt;vs&lt;/span&gt; &lt;span class="o"&gt;-&lt;/span&gt;&lt;span class="mf"&gt;1.000&lt;/span&gt;
&lt;span class="mf"&gt;0.395&lt;/span&gt;  &lt;span class="n"&gt;vs&lt;/span&gt; &lt;span class="mf"&gt;1.000&lt;/span&gt;
&lt;span class="mf"&gt;0.537&lt;/span&gt;  &lt;span class="n"&gt;vs&lt;/span&gt; &lt;span class="mf"&gt;1.000&lt;/span&gt;
&lt;span class="o"&gt;-&lt;/span&gt;&lt;span class="mf"&gt;0.020&lt;/span&gt; &lt;span class="n"&gt;vs&lt;/span&gt; &lt;span class="mf"&gt;1.000&lt;/span&gt;
&lt;span class="mf"&gt;0.392&lt;/span&gt;  &lt;span class="n"&gt;vs&lt;/span&gt; &lt;span class="mf"&gt;1.000&lt;/span&gt;
&lt;span class="mf"&gt;0.141&lt;/span&gt;  &lt;span class="n"&gt;vs&lt;/span&gt; &lt;span class="mf"&gt;1.000&lt;/span&gt;
&lt;/pre&gt;&lt;/div&gt;


&lt;p&gt;We can see that training is able to reduce the MSE by about 40% from the initial random vectors by finding better alignments. That doesn’t sound too good, but it’s good enough for recommendations to work. Let’s write a function to find the closest vectors to a query vector:&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;similar_posts_by_title&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;title&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
    &lt;span class="n"&gt;post_id&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;title_to_id&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;title&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;
    &lt;span class="n"&gt;pv&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;get_post_vector&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;post_id&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="n"&gt;dists&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[]&lt;/span&gt;
    &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;other_post&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;posts&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
        &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;other_post&lt;/span&gt; &lt;span class="o"&gt;==&lt;/span&gt; &lt;span class="n"&gt;post_id&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="k"&gt;continue&lt;/span&gt;
        &lt;span class="n"&gt;ov&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;get_post_vector&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;other_post&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
        &lt;span class="n"&gt;dist&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;torch&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;dot&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;pv&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;ov&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
        &lt;span class="n"&gt;dists&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;append&lt;/span&gt;&lt;span class="p"&gt;([&lt;/span&gt;&lt;span class="nb"&gt;float&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;dist&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt; &lt;span class="s1"&gt;&amp;#39;https://news.ycombinator.com/item?id=&amp;#39;&lt;/span&gt; &lt;span class="o"&gt;+&lt;/span&gt; &lt;span class="n"&gt;other_post&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;id_to_title&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;other_post&lt;/span&gt;&lt;span class="p"&gt;]])&lt;/span&gt;
    &lt;span class="n"&gt;similars&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nb"&gt;sorted&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;dists&lt;/span&gt;&lt;span class="p"&gt;)[&lt;/span&gt;&lt;span class="o"&gt;-&lt;/span&gt;&lt;span class="mi"&gt;3&lt;/span&gt;&lt;span class="p"&gt;:]&lt;/span&gt;
    &lt;span class="n"&gt;similars&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;reverse&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;similars&lt;/span&gt;
&lt;/pre&gt;&lt;/div&gt;


&lt;p&gt;&lt;a href="https://github.com/mtrencseni/pytorch-playground/blob/master/02-hacker-news-recommendations/Hacker%20News%20Embeddings%20with%20PyTorch.ipynb"&gt;The entire ipython notebook is on Github&lt;/a&gt;. We can use this to find similar posts, it works reasonably well.&lt;/p&gt;
&lt;p&gt;Query: &lt;a href="https://news.ycombinator.com/item?id=16619917"&gt;Self-driving Uber car kills Arizona woman crossing street&lt;/a&gt;&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;0.89, &lt;a href="https://news.ycombinator.com/item?id=16643056"&gt;Tempe Police Release Video of Uber Accident&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;0.69, &lt;a href="https://news.ycombinator.com/item?id=16629673"&gt;Police Say Video Shows Woman Stepped Suddenly in Front of Self-Driving Uber&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;0.68, &lt;a href="https://news.ycombinator.com/item?id=16761602"&gt;Tesla crash in September showed similarities to fatal Mountain View accident&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;Query: &lt;a href="https://news.ycombinator.com/item?id=16967543"&gt;Ask HN: Who is hiring? (May 2018)&lt;/a&gt;&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;0.98, &lt;a href="https://news.ycombinator.com/item?id=16735011"&gt;Ask HN: Who is hiring? (April 2018)&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;0.98, &lt;a href="https://news.ycombinator.com/item?id=17205865"&gt;Ask HN: Who is hiring? (June 2018)&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;0.98, &lt;a href="https://news.ycombinator.com/item?id=18113144"&gt;Ask HN: Who is hiring? (October 2018)&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;Query: &lt;a href="https://news.ycombinator.com/item?id=17015661"&gt;Conversations with a six-year-old on functional programming&lt;/a&gt;&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;0.76, &lt;a href="https://news.ycombinator.com/item?id=17036497"&gt;Common Lisp homepage&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;0.67, &lt;a href="https://news.ycombinator.com/item?id=16884067"&gt;Towards Scala 3&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;0.66, &lt;a href="https://news.ycombinator.com/item?id=17079952"&gt;JavaScript is Good, Actually&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;Query: &lt;a href="https://news.ycombinator.com/item?id=16898827"&gt;You probably don't need AI/ML. You can make do with well written SQL scripts&lt;/a&gt;&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;0.66, &lt;a href="https://news.ycombinator.com/item?id=16847781"&gt;Time to rebuild the web?&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;0.65, &lt;a href="https://news.ycombinator.com/item?id=16688521"&gt;Oracle Wins Revival of Billion-Dollar Case Against Google&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;0.62, &lt;a href="https://news.ycombinator.com/item?id=16959188"&gt;IBM is not doing "cognitive computing" with Watson (2016)&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;Query: &lt;a href="https://news.ycombinator.com/item?id=16120235"&gt;Bitcoin has little shot at ever being a major global currency&lt;/a&gt;&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;0.71, &lt;a href="https://news.ycombinator.com/item?id=16267428"&gt;U.S. Regulators to Subpoena Crypto Exchange Bitfinex, Tether&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;0.71, &lt;a href="https://news.ycombinator.com/item?id=16115240"&gt;Buffett Says Stock Ownership Became More Attractive With Tax Cut&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;0.70, &lt;a href="https://news.ycombinator.com/item?id=16107597"&gt;Building for the Blockchain&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;Query: &lt;a href="https://news.ycombinator.com/item?id=17536352"&gt;2018 MacBook Pro Review&lt;/a&gt;&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;0.75, &lt;a href="https://news.ycombinator.com/item?id=17231120"&gt;Apple introduces macOS Mojave&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;0.75, &lt;a href="https://news.ycombinator.com/item?id=16766129"&gt;Apple’s 2019 Mac Pro will be shaped by workflows&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;0.75, &lt;a href="https://news.ycombinator.com/item?id=17554902"&gt;MacBook Pro with i9 chip is throttled due to thermal issues, claims YouTuber&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;Posts recommended for: &lt;a href="http://news.ycombinator.com/user?id=Maro"&gt;Maro&lt;/a&gt;&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;0.58, &lt;a href="https://news.ycombinator.com/item?id=16411662"&gt;Ask HN: Is it 'normal' to struggle so hard with work?&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;0.49, &lt;a href="https://news.ycombinator.com/item?id=16409768"&gt;Ask HN: What has HN given you?&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;0.47, &lt;a href="https://news.ycombinator.com/item?id=16153840"&gt;Google Memory Loss&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;0.46, &lt;a href="https://news.ycombinator.com/item?id=16424954"&gt;Why is it hard to make friends over 30? (2012)&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;0.45, &lt;a href="https://news.ycombinator.com/item?id=17039144"&gt;Microsoft Turned Consumers Against the Skype Brand&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;0.45, &lt;a href="https://news.ycombinator.com/item?id=16465762"&gt;Ask HN: I'm writing a book about white-collar drug use, including tech sector&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;0.44, &lt;a href="https://news.ycombinator.com/item?id=16483241"&gt;Why I Quit Google to Work for Myself&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;0.41, &lt;a href="https://news.ycombinator.com/item?id=16200007"&gt;The Death of Microservice Madness in 2018&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;0.40, &lt;a href="https://news.ycombinator.com/item?id=16724962"&gt;Facebook Secretly Saved Videos Users Deleted&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;0.40, &lt;a href="https://news.ycombinator.com/item?id=16146950"&gt;CES Was Full of Useless Robots and Machines That Don’t Work&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;h2&gt;Discussion&lt;/h2&gt;
&lt;ol&gt;
&lt;li&gt;Clearly we could use the text of the posts/comments to gauge similarity, and would get much better results.&lt;/li&gt;
&lt;li&gt;If the positive/negative ratio of training samples is too different from 1:1, we actually get a significantly lower MSE, but the resulting model is not useful. Why? If we include too many positive pairs where we “assert” &lt;code&gt;+1&lt;/code&gt; for the dot products, the optimizer will just pull all the vectors together to get &lt;code&gt;+1&lt;/code&gt; all the time and reduce MSE. If we include too many negative pairs, it will pull all posts to one vector and all users to the opposing vector; this configuration mostly satisfies the training criterion and results in a low MSE. (In the book, a 1:10 ratio is used; I think it’s accidental that it works in that case.)&lt;/li&gt;
&lt;li&gt;When emitting the &lt;code&gt;(post, user)&lt;/code&gt; pairs, we cut the users, and only keep users who have between 3 and 50 comments. The lower limit of 3 cuts out users who don’t connect posts and so won’t be valuable to the embedding similarity training; this cut makes the training set leaner and meaner. The upper limit of 50 throws out users who comment on a lot of posts and hence pollute the similarity signal during training. Interestingly, without the upper limit of 50, the model doesn’t converge to a useful configuration! This took a lot of playing around to figure out.&lt;/li&gt;
&lt;li&gt;Notice that when we got recommendations for a user (user-post dot product), the dot product is always significantly lower than in the post-post case (user-user dot products are also lower). The users seem to be more scattered in the high-dimensional space, the posts seem to be in a more tightly packed subspace.&lt;/li&gt;
&lt;li&gt;Issues/bugs that slowed me down:&lt;ul&gt;
&lt;li&gt;Both posts and users are embedded, so we must remember at which row in the embedding matrix the user vectors start (&lt;code&gt;min_user_idx&lt;/code&gt; in the code). Initially I forgot to account for this, both started indexing at 0. Everything ran, but the similarity results were garbage. A nicer solution here would be to use 2 &lt;code&gt;Embedding&lt;/code&gt; objects (essentially 2 arrays), so we don’t have to remember the offset.&lt;/li&gt;
&lt;li&gt;I forgot to call &lt;code&gt;optimizer.zero_grad()&lt;/code&gt; in the training loop. Everything ran, but the similarity results were garbage. Without the &lt;code&gt;zero_grad()&lt;/code&gt; call, the gradients are accumulated, and the optimizer jumps around aimlessly.&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;/ol&gt;</content><category term="pytorch"></category><category term="embedding"></category></entry><entry><title>rxe: literate and composable regular expressions</title><link href="https://bytepawn.com/rxe-literate-and-composable-regular-expressions.html" rel="alternate"></link><published>2019-03-02T00:00:00+01:00</published><updated>2019-03-02T00:00:00+01:00</updated><author><name>Marton Trencseni</name></author><id>tag:bytepawn.com,2019-03-02:/rxe-literate-and-composable-regular-expressions.html</id><summary type="html">&lt;p&gt;&lt;code&gt;rxe&lt;/code&gt; is a thin wrapper around Python's &lt;code&gt;re&lt;/code&gt; module. The various &lt;code&gt;rxe&lt;/code&gt; functions are wrappers around corresponding &lt;code&gt;re&lt;/code&gt; patterns. For example, &lt;code&gt;rxe.digit().one_or_more('a').whitespace()&lt;/code&gt; corresponds to &lt;code&gt;\da+\s&lt;/code&gt;. Because &lt;code&gt;rxe&lt;/code&gt; uses parentheses but wants to avoid unnamed groups, the internal (equivalent) representation is actually &lt;code&gt;\d(?:a)+\s&lt;/code&gt;. This pattern can always be retrieved with &lt;code&gt;get_pattern()&lt;/code&gt;.&lt;br/&gt;&lt;br/&gt;&lt;img src="/images/rxe2.png" alt="rxe example code" style="width: 500px;"/&gt;&lt;/p&gt;</summary><content type="html">&lt;h2&gt;Introduction&lt;/h2&gt;
&lt;p&gt;&lt;code&gt;rxe&lt;/code&gt; is a thin wrapper around Python's &lt;code&gt;re&lt;/code&gt; module (see &lt;a href="https://docs.python.org/2/library/re.html"&gt;official re docs&lt;/a&gt;). The various &lt;code&gt;rxe&lt;/code&gt; functions are wrappers around corresponding &lt;code&gt;re&lt;/code&gt; patterns. For example, &lt;code&gt;rxe.digit().one_or_more('a').whitespace()&lt;/code&gt; corresponds to &lt;code&gt;\da+\s&lt;/code&gt;. Because &lt;code&gt;rxe&lt;/code&gt; uses parentheses but wants to avoid unnamed groups, the internal (equivalent) representation is actually &lt;code&gt;\d(?:a)+\s&lt;/code&gt;. This pattern can always be retrieved with &lt;code&gt;get_pattern()&lt;/code&gt;.&lt;/p&gt;
&lt;p&gt;Github repo: &lt;a href="https://github.com/mtrencseni/rxe"&gt;https://github.com/mtrencseni/rxe&lt;/a&gt;&lt;/p&gt;
&lt;h2&gt;Motivation&lt;/h2&gt;
&lt;p&gt;Suppose you want to parse geo coordinates from a string, like &lt;code&gt;(&amp;lt;latitude&amp;gt;,&amp;lt;longitude&amp;gt;)&lt;/code&gt;, where each is a decimal. The raw regular expression would look like &lt;code&gt;\(\d+\.\d+,\d+\.\d+\)&lt;/code&gt;. This is hard to read and maintain for the next guy, and diffs will be hard to understand and verify.&lt;/p&gt;
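As a quick sanity check (my addition, not from the original post), here is that raw pattern in action with Python's built-in `re` module; the coordinate values are made up for illustration:

```python
import re

# The raw coordinate pattern: escaped parens, two decimals separated by a comma
pattern = re.compile(r'\(\d+\.\d+,\d+\.\d+\)')

assert pattern.match('(23.34,11.0)') is not None
assert pattern.match('(23.34; 11.0)') is None  # wrong separator, no match
```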
&lt;p&gt;With &lt;code&gt;rxe&lt;/code&gt;, you can write:&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;span class="n"&gt;decimal&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;rxe&lt;/span&gt;
  &lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;one_or_more&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;rxe&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;digit&lt;/span&gt;&lt;span class="p"&gt;())&lt;/span&gt;
  &lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;literal&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;.&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
  &lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;one_or_more&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;rxe&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;digit&lt;/span&gt;&lt;span class="p"&gt;())&lt;/span&gt;
&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="n"&gt;coord&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;rxe&lt;/span&gt;
  &lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;literal&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;(&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
  &lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;exactly&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;decimal&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
  &lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;literal&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;,&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
  &lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;exactly&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;decimal&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
  &lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;literal&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;)&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/pre&gt;&lt;/div&gt;


&lt;p&gt;Note how rxe allows the &lt;code&gt;decimal&lt;/code&gt; regex to be re-used in the &lt;code&gt;coord&lt;/code&gt; pattern! Although it's more code, it's much more readable.&lt;/p&gt;
&lt;p&gt;Suppose you want to support arbitrary number of whitespace. The diff for this change will be:&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;span class="n"&gt;coord&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;rxe&lt;/span&gt;
  &lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;literal&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;(&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
  &lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;zero_or_more&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;rxe&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;whitespace&lt;/span&gt;&lt;span class="p"&gt;())&lt;/span&gt; &lt;span class="c1"&gt;# &amp;lt;--- line added&lt;/span&gt;
  &lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;exactly&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;decimal&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
  &lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;zero_or_more&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;rxe&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;whitespace&lt;/span&gt;&lt;span class="p"&gt;())&lt;/span&gt; &lt;span class="c1"&gt;# &amp;lt;--- line added&lt;/span&gt;
  &lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;literal&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;,&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
  &lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;zero_or_more&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;rxe&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;whitespace&lt;/span&gt;&lt;span class="p"&gt;())&lt;/span&gt; &lt;span class="c1"&gt;# &amp;lt;--- line added&lt;/span&gt;
  &lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;exactly&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;decimal&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
  &lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;zero_or_more&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;rxe&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;whitespace&lt;/span&gt;&lt;span class="p"&gt;())&lt;/span&gt; &lt;span class="c1"&gt;# &amp;lt;--- line added&lt;/span&gt;
  &lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;literal&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;)&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/pre&gt;&lt;/div&gt;
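For comparison, this is a sketch (my addition) of what the whitespace-tolerant version corresponds to as a raw `re` pattern; the test strings are invented for illustration:

```python
import re

# Raw-regex equivalent of the whitespace-tolerant coord pattern:
# \s* wherever zero_or_more(rxe.whitespace()) was added
coord = re.compile(r'\(\s*\d+\.\d+\s*,\s*\d+\.\d+\s*\)')

assert coord.match('(23.34, 11.0)') is not None
assert coord.match('( 23.34 ,11.0 )') is not None
```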


&lt;p&gt;Okay, but we also want to extract the latitude and longitude, not just match on them. Let's extract them, but in a readable way:&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;span class="n"&gt;coord&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;rxe&lt;/span&gt;
  &lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;literal&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;(&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
  &lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;zero_or_more&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;rxe&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;whitespace&lt;/span&gt;&lt;span class="p"&gt;())&lt;/span&gt;
  &lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;exactly&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;rxe&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;named&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;lat&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;decimal&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt; &lt;span class="c1"&gt;# &amp;lt;--- line changed&lt;/span&gt;
  &lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;zero_or_more&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;rxe&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;whitespace&lt;/span&gt;&lt;span class="p"&gt;())&lt;/span&gt;
  &lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;literal&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;,&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
  &lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;zero_or_more&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;rxe&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;whitespace&lt;/span&gt;&lt;span class="p"&gt;())&lt;/span&gt;
  &lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;exactly&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;rxe&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;named&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;lon&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;decimal&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt; &lt;span class="c1"&gt;# &amp;lt;--- line changed&lt;/span&gt;
  &lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;zero_or_more&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;rxe&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;whitespace&lt;/span&gt;&lt;span class="p"&gt;())&lt;/span&gt;
  &lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;literal&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;)&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="n"&gt;m&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;coord&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;match&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;(23.34, 11.0)&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="k"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;m&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;group&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;lat&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;
&lt;span class="k"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;m&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;group&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;lon&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;
&lt;/pre&gt;&lt;/div&gt;


&lt;p&gt;One more example, parsing email addresses. The regex is &lt;code&gt;[\w.%+-]+@[\w.-]+\.[a-zA-Z]{2,6}&lt;/code&gt;. The equivalent &lt;code&gt;rxe&lt;/code&gt; code:&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;username = rxe.one_or_more(rxe.set([rxe.alphanumeric(), &amp;#39;.&amp;#39;, &amp;#39;%&amp;#39;, &amp;#39;+&amp;#39;, &amp;#39;-&amp;#39;]))
domain = rxe.one_or_more(rxe.set([rxe.alphanumeric(), &amp;#39;.&amp;#39;, &amp;#39;-&amp;#39;]))
tld = rxe.at_least_at_most(2, 6, rxe.set([rxe.range(&amp;#39;a&amp;#39;, &amp;#39;z&amp;#39;), rxe.range(&amp;#39;A&amp;#39;, &amp;#39;Z&amp;#39;)]))
email = (rxe
    .exactly(1, username)
    .literal(&amp;#39;@&amp;#39;)
    .exactly(1, domain)
    .literal(&amp;#39;.&amp;#39;)
    .exactly(1, tld)
)
&lt;/pre&gt;&lt;/div&gt;
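As a sanity check (my addition, not from the post), the raw email pattern behaves as expected with the built-in `re` module; the addresses are made up for illustration:

```python
import re

# The raw email pattern from above
email = re.compile(r'[\w.%+-]+@[\w.-]+\.[a-zA-Z]{2,6}')

assert email.fullmatch('marton@example.com') is not None
assert email.fullmatch('no-at-sign.example.com') is None  # missing '@', no match
```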


&lt;h2&gt;Install&lt;/h2&gt;
&lt;p&gt;Use &lt;code&gt;pip&lt;/code&gt;:&lt;/p&gt;
&lt;p&gt;&lt;code&gt;pip install git+https://github.com/mtrencseni/rxe&lt;/code&gt;&lt;/p&gt;
&lt;p&gt;Then:&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;span class="err"&gt;$&lt;/span&gt; &lt;span class="n"&gt;python&lt;/span&gt;
&lt;span class="o"&gt;&amp;gt;&amp;gt;&amp;gt;&lt;/span&gt; &lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="nn"&gt;rxe&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt;
&lt;span class="o"&gt;&amp;gt;&amp;gt;&amp;gt;&lt;/span&gt; &lt;span class="n"&gt;r&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;rxe&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;digit&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;at_least&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s1"&gt;&amp;#39;p&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;at_least&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;2&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s1"&gt;&amp;#39;q&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="o"&gt;&amp;gt;&amp;gt;&amp;gt;&lt;/span&gt; &lt;span class="k"&gt;assert&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;r&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;match&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;1ppppqqqqq&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="ow"&gt;is&lt;/span&gt; &lt;span class="ow"&gt;not&lt;/span&gt; &lt;span class="bp"&gt;None&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; 
&lt;/pre&gt;&lt;/div&gt;</content><category term="python"></category></entry><entry><title>PyTorch Basics: Solving the Ax=b matrix equation with gradient descent</title><link href="https://bytepawn.com/pytorch-basics-solving-the-axb-matrix-equation-with-gradient-descent.html" rel="alternate"></link><published>2019-02-08T00:00:00+01:00</published><updated>2019-02-08T00:00:00+01:00</updated><author><name>Marton Trencseni</name></author><id>tag:bytepawn.com,2019-02-08:/pytorch-basics-solving-the-axb-matrix-equation-with-gradient-descent.html</id><summary type="html">&lt;p&gt;I will show how to solve the standard &lt;strong&gt;A x = b&lt;/strong&gt; matrix equation with PyTorch. This is a good toy problem to show some guts of the framework without involving neural networks.&lt;br/&gt;&lt;br/&gt;&lt;img src="/images/computational-graph.PNG" alt="PyTorch computational graph" style="width: 300px;"/&gt;&lt;/p&gt;</summary><content type="html">&lt;h2&gt;Introduction&lt;/h2&gt;
&lt;p&gt;PyTorch is my favorite deep learning framework, because it's a hacker's deep learning framework. It’s easier to work with than Tensorflow, which was developed for Google’s internal use-cases and ways of working, which just don’t apply to use-cases that are several orders of magnitude smaller (less data, fewer features, less prediction volume, fewer people working on it). This is not a PyTorch vs Tensorflow comparison post; for that, see &lt;a href="https://towardsdatascience.com/pytorch-vs-tensorflow-spotting-the-difference-25c75777377b"&gt;this post&lt;/a&gt;. There are tons of &lt;a href="https://pytorch.org/tutorials/"&gt;great tutorials&lt;/a&gt; and &lt;a href="https://www.youtube.com/watch?v=_H3aw6wkCv0"&gt;talks&lt;/a&gt; that help people quickly get a deep neural network model up and running with PyTorch. &lt;a href="https://pytorch.org/docs/stable/torchvision/datasets.html"&gt;PyTorch comes with standard datasets (like MNIST)&lt;/a&gt; and &lt;a href="https://pytorch.org/docs/stable/torchvision/models.html"&gt;famous models (like Alexnet) out of the box&lt;/a&gt;.&lt;/p&gt;
&lt;p&gt;Under the hood, PyTorch is computing derivatives of functions, and backpropagating the gradients in a computational graph; this is called autograd. This can also be applied to solve problems that don’t explicitly involve a deep neural network.&lt;/p&gt;
&lt;p&gt;&lt;img src="/images/computational-graph.PNG" alt="PyTorch computational graph" style="width: 300px;"/&gt;&lt;/p&gt;
&lt;p&gt;To illustrate this, we will show how to solve the standard &lt;strong&gt;A x = b&lt;/strong&gt; matrix equation with PyTorch. This is a good toy problem to show some guts of the framework without involving neural networks.&lt;/p&gt;
&lt;p&gt;&lt;img src="/images/axb.PNG" alt="Ax=b" style="width: 400px;"/&gt;&lt;/p&gt;
&lt;p&gt;As a reminder, in the &lt;strong&gt;A x = b&lt;/strong&gt; matrix equation, &lt;strong&gt;A&lt;/strong&gt; is a fixed matrix, &lt;strong&gt;b&lt;/strong&gt; is a fixed vector, and we’re looking for vector &lt;strong&gt;x&lt;/strong&gt; such that &lt;strong&gt;A x&lt;/strong&gt; is just the vector &lt;strong&gt;b&lt;/strong&gt;. If &lt;strong&gt;A&lt;/strong&gt; is a 1x1 matrix, then this is just a scalar equation, and &lt;strong&gt;x = b / A&lt;/strong&gt;. Let’s write this as &lt;strong&gt;x = A&lt;sup&gt;-1&lt;/sup&gt; b&lt;/strong&gt;, and then this applies to the n x n matrix case as well: the exact solution is to compute the inverse of &lt;strong&gt;A&lt;/strong&gt;, and multiply it by &lt;strong&gt;b&lt;/strong&gt;. (Note: the technical condition for a solution is &lt;strong&gt;det A ≠ 0&lt;/strong&gt;; I'll ignore this since I'll be using random matrices.) Let’s say the solution is &lt;strong&gt;x = x&lt;sub&gt;s&lt;/sub&gt;&lt;/strong&gt;.&lt;/p&gt;
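The exact solution is easy to check numerically; here is a minimal NumPy sketch (my addition, with an arbitrary 2x2 system whose solution happens to be (1, 3)):

```python
import numpy as np

A = np.array([[2.0, 1.0],
              [1.0, 3.0]])
b = np.array([5.0, 10.0])

# Solve A x = b directly; np.linalg.solve is preferred over forming inv(A) explicitly
x_s = np.linalg.solve(A, b)

assert np.allclose(x_s, [1.0, 3.0])
assert np.allclose(A @ x_s, b)
```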
&lt;p&gt;&lt;em&gt;Note: using gradient descent to estimate the solution is not necessary, as the solution can be computed quickly with &lt;a href="https://en.wikipedia.org/wiki/Invertible_matrix#Methods_of_matrix_inversion"&gt;matrix inversion&lt;/a&gt;. We're doing this to understand PyTorch on a toy problem.&lt;/em&gt;&lt;/p&gt;
&lt;p&gt;How can we use &lt;a href="https://pytorch.org/docs/stable/autograd.html"&gt;PyTorch and autograd&lt;/a&gt; to solve it? We can use it to approximate the solution: start with some random &lt;strong&gt;x&lt;sub&gt;0&lt;/sub&gt;&lt;/strong&gt;, compute the vector &lt;strong&gt;A x&lt;sub&gt;0&lt;/sub&gt; - b&lt;/strong&gt;, take the norm &lt;strong&gt;L = ‖ A x&lt;sub&gt;0&lt;/sub&gt; - b ‖&lt;/strong&gt;, and use gradient descent to find a next, better &lt;strong&gt;x&lt;sub&gt;1&lt;/sub&gt;&lt;/strong&gt; vector that is closer to the real solution &lt;strong&gt;x&lt;sub&gt;s&lt;/sub&gt;&lt;/strong&gt;. The key idea is that for &lt;strong&gt;x = x&lt;sub&gt;s&lt;/sub&gt;&lt;/strong&gt;, the norm &lt;strong&gt;L = ‖ A x&lt;sub&gt;s&lt;/sub&gt; - b ‖ = 0&lt;/strong&gt; vanishes. So we want to minimize &lt;strong&gt;L&lt;/strong&gt;. This &lt;strong&gt;L&lt;/strong&gt; is called the loss function in such optimization problems.&lt;/p&gt;
&lt;p&gt;Let’s start with the standard L2 norm:&lt;/p&gt;
&lt;p&gt;&lt;img src="/images/l2-definition.PNG" alt="L2 norm definition" style="width: 150px;"/&gt;&lt;/p&gt;
&lt;p&gt;This will result in a parabolic loss function, where we will converge to the minimum.&lt;/p&gt;
&lt;p&gt;&lt;img src="/images/parabola2d.PNG" alt="L2 norm parabola" style="width: 300px;"/&gt;&lt;/p&gt;
&lt;h2&gt;Code&lt;/h2&gt;
&lt;p&gt;This is what the PyTorch code for setting up &lt;strong&gt;A, x&lt;/strong&gt; and &lt;strong&gt;b&lt;/strong&gt; looks like. We initialize &lt;strong&gt;A&lt;/strong&gt; and &lt;strong&gt;b&lt;/strong&gt; to random:&lt;/p&gt;
&lt;script src="https://gist.github.com/mtrencseni/d0f65883f2be329cac1ec390869d02e0.js"&gt;&lt;/script&gt;

&lt;p&gt;We set &lt;code&gt;requires_grad&lt;/code&gt; to &lt;code&gt;False&lt;/code&gt; for &lt;strong&gt;A&lt;/strong&gt; and &lt;strong&gt;b&lt;/strong&gt;. These are constants in this scenario; their gradient is zero. &lt;strong&gt;x&lt;/strong&gt; is the variable we will compute gradients for, so we set &lt;code&gt;requires_grad = True&lt;/code&gt;.&lt;/p&gt;
&lt;script src="https://gist.github.com/mtrencseni/a232e345680f17f0c00ce816036308f1.js"&gt;&lt;/script&gt;

&lt;p&gt;We then tell PyTorch to do a backward pass and compute the gradients:&lt;/p&gt;
&lt;script src="https://gist.github.com/mtrencseni/4583fdf66ea02a4d4756103c734074c5.js"&gt;&lt;/script&gt;

&lt;p&gt;At this point, PyTorch will have computed the gradient for &lt;strong&gt;x&lt;/strong&gt;, stored in &lt;code&gt;x.grad.data&lt;/code&gt;. What this means is “adjust &lt;strong&gt;x&lt;/strong&gt; in this direction, by this much, to decrease the loss function, given what &lt;strong&gt;x&lt;/strong&gt; is right now”. Now we just need to introduce a step size to control our speed of descent, and actually adjust &lt;strong&gt;x&lt;/strong&gt;:&lt;/p&gt;
&lt;script src="https://gist.github.com/mtrencseni/c1929a58684b2962c4008b24564b48a0.js"&gt;&lt;/script&gt;

&lt;p&gt;Almost done. We just need to set &lt;code&gt;step_size&lt;/code&gt;, put this in a for loop, and figure out when to stop it:&lt;/p&gt;
&lt;script src="https://gist.github.com/mtrencseni/3e1b3d47f55fdb5a54518312d298940a.js"&gt;&lt;/script&gt;

&lt;p&gt;Let’s stop it when the loss is smaller than &lt;code&gt;stop_loss&lt;/code&gt;, and put an upper bound on the number of iterations. Putting it all together:&lt;/p&gt;
&lt;script src="https://gist.github.com/mtrencseni/d6be2e6b35f100be1a92e6726f5210c2.js"&gt;&lt;/script&gt;

&lt;p&gt;It’s a good exercise to play around with this. Set a specific &lt;strong&gt;A&lt;/strong&gt; and &lt;strong&gt;b&lt;/strong&gt;, print things out, try other dimensions, use numpy to get the inverse and compare the solutions, etc.&lt;/p&gt;
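&lt;p&gt;As a worked version of that exercise, here is a dependency-free sketch (a hypothetical 2×2 system, not the article's code): compute the exact solution via the closed-form 2×2 inverse, then recover the same solution by gradient descent on the squared loss, whose hand-derived gradient is &lt;strong&gt;2 A&lt;sup&gt;T&lt;/sup&gt;(A x - b)&lt;/strong&gt;.&lt;/p&gt;

```python
# A hypothetical 2x2 system, solved two ways: the closed-form inverse,
# and gradient descent on the squared loss L(x) = ||A x - b||^2.
A = [[2.0, 1.0], [1.0, 3.0]]
b = [3.0, 5.0]

# Exact solution via the 2x2 inverse formula.
det = A[0][0] * A[1][1] - A[0][1] * A[1][0]
inv = [[ A[1][1] / det, -A[0][1] / det],
       [-A[1][0] / det,  A[0][0] / det]]
exact = [inv[0][0] * b[0] + inv[0][1] * b[1],
         inv[1][0] * b[0] + inv[1][1] * b[1]]  # [0.8, 1.4]

# Gradient descent with the hand-derived gradient 2 * A^T (A x - b).
x = [0.0, 0.0]
step_size = 0.05
for _ in range(2000):
    r = [A[0][0] * x[0] + A[0][1] * x[1] - b[0],    # residual A x - b
         A[1][0] * x[0] + A[1][1] * x[1] - b[1]]
    grad = [2 * (A[0][0] * r[0] + A[1][0] * r[1]),  # 2 * A^T r
            2 * (A[0][1] * r[0] + A[1][1] * r[1])]
    x = [x[0] - step_size * grad[0], x[1] - step_size * grad[1]]
```

&lt;p&gt;Both approaches agree; the gradient-descent version is what autograd generalizes to arbitrary losses and dimensions.&lt;/p&gt;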
&lt;h2&gt;Discussion&lt;/h2&gt;
&lt;p&gt;&lt;strong&gt;How is this related to neural networks?&lt;/strong&gt;&lt;br/&gt;
In a fully connected (FC) layer, each input is multiplied by a weight to get the next layer's values. Putting the weights together, they form a matrix (tensor), which is multiplied by the input activations, just like &lt;strong&gt;A x&lt;/strong&gt;. In real neural networks, a non-linearity is introduced at each node, e.g. a &lt;a href="https://en.wikipedia.org/wiki/Rectifier_(neural_networks)"&gt;ReLU&lt;/a&gt; function; we don't have that here, so this is a linear problem. Note that in real machine learning scenarios, the loss function (potential surface) lives in a very high-dimensional space and has many minima; the goal is not to find the global optimum (which is intractable), just a good enough local one (that gets the job done).&lt;/p&gt;
&lt;p&gt;&lt;img src="/images/potential-surface.PNG" alt="Potential surface" style="width: 300px;"/&gt;&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;How do we know what &lt;code&gt;step_size&lt;/code&gt; should be? How do we know when to stop?&lt;/strong&gt;&lt;br/&gt;
In this case, because the loss function we’re optimizing is quadratic, we can get away with a fixed step size, as the gradient gets smaller as we approach the optimum (the solution). In more complicated deep neural network scenarios (where the step size is called the learning rate), there are strategies for gradually decaying the step size. In general, if the step size is too small, we waste a lot of time far away from the solution. If it’s too big, we jump around the optimum and may never converge. See also &lt;a href="https://towardsdatascience.com/learning-rate-schedules-and-adaptive-learning-rate-methods-for-deep-learning-2c8f433990d1"&gt;Learning Rate Schedules and Adaptive Learning Rate Methods for Deep Learning&lt;/a&gt;.&lt;/p&gt;
&lt;p&gt;&lt;img src="/images/learning-rate.PNG" alt="Learning rate" style="width: 300px;"/&gt;&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;What should the stopping condition be?&lt;/strong&gt;&lt;br/&gt;
In this toy example, the &lt;code&gt;step_size&lt;/code&gt; is set smaller than &lt;code&gt;stop_loss&lt;/code&gt;, so the iteration can converge on the optimum and get below the accepted loss. If you set &lt;code&gt;stop_loss&lt;/code&gt; much smaller than the step size, you will see that the loop never stops: it keeps jumping around the optimum (left side of the picture above).&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;What exactly is the quantity &lt;code&gt;x.grad.data&lt;/code&gt;?&lt;/strong&gt;&lt;br/&gt;
It's the gradient of the loss function we called &lt;code&gt;backward()&lt;/code&gt; on, with respect to the variable, in this case &lt;strong&gt;x&lt;/strong&gt;. So in the dim=1 case it’s just &lt;strong&gt;dL/dx&lt;/strong&gt;. Note that since &lt;strong&gt;x&lt;/strong&gt; is the only independent variable here, the partial derivative is the total derivative. If we had multiple independent variables, we'd have to add the partial derivatives to get the &lt;a href="https://en.wikipedia.org/wiki/Total_derivative"&gt;total derivative&lt;/a&gt;.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;What if we use the L1 norm as the loss function?&lt;/strong&gt;&lt;br/&gt;
To use the L1 norm, set &lt;code&gt;p=1&lt;/code&gt; in the code. The L1 norm in &lt;code&gt;dim=1&lt;/code&gt; is the &lt;strong&gt;abs()&lt;/strong&gt; function, so its derivative is piecewise constant. In this case the slope is &lt;strong&gt;±‖A‖&lt;/strong&gt;.&lt;/p&gt;
&lt;p&gt;&lt;img src="/images/l1-norm.PNG" alt="L1 norm" style="width: 300px;"/&gt;&lt;/p&gt;
&lt;p&gt;This is “less nice” than the L2 norm for this simple case, because the gradient doesn’t vanish as &lt;strong&gt;x&lt;/strong&gt; approaches the optimum. The iteration is more likely to overshoot the optimum and oscillate.&lt;/p&gt;
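&lt;p&gt;A plain-Python illustration of this (made-up numbers, not the article's code): with the L1 loss &lt;strong&gt;|3x - 6|&lt;/strong&gt;, the gradient magnitude is a constant 3 everywhere away from the solution, so a fixed step size of 0.1 moves &lt;strong&gt;x&lt;/strong&gt; in constant increments of 0.3, and the iterates end up bouncing around the optimum forever.&lt;/p&gt;

```python
def sign(v):
    """Sign of v; the hand-computed derivative of abs() is sign()."""
    return (v > 0) - (v < 0)

x = 0.0
step_size = 0.1
trajectory = []
for _ in range(50):
    # d|3x - 6|/dx = 3 * sign(3x - 6): constant magnitude away from x = 2
    x -= step_size * 3 * sign(3 * x - 6)
    trajectory.append(round(x, 10))

tail = trajectory[-4:]  # keeps alternating between 1.8 and 2.1, never settling on 2
```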
&lt;p&gt;&lt;strong&gt;What happens if we run this on the GPU?&lt;/strong&gt;&lt;br/&gt;
To turn on the GPU, put this line at the top:&lt;/p&gt;
&lt;script src="https://gist.github.com/mtrencseni/ae8c2b405025c21a63c6562d8dc7dcff.js"&gt;&lt;/script&gt;

&lt;p&gt;On my computer, this is significantly slower on the GPU than the CPU, because copying such a small problem to the GPU on each iteration creates a time overhead which is not worth it. This toy example is too small to demonstrate how GPUs speed up deep learning.&lt;/p&gt;</content><category term="pytorch"></category></entry><entry><title>Automating a Call Center with Machine Learning</title><link href="https://bytepawn.com/automating-a-call-center-with-machine-learning.html" rel="alternate"></link><published>2019-01-27T00:00:00+01:00</published><updated>2019-01-27T00:00:00+01:00</updated><author><name>Marton Trencseni</name></author><id>tag:bytepawn.com,2019-01-27:/automating-a-call-center-with-machine-learning.html</id><summary type="html">&lt;p&gt;Over a period of 6 months, we rolled out a Machine Learning model to predict a customer’s delivery (latitude, longitude). During the recent holiday peak, this ML model handled most of Fetchr’s order scheduling.&lt;br/&gt;&lt;br/&gt;&lt;img src="/images/ml-share3.png" alt="Share of ML scheduled versus Call center scheduled deliveries" style="width: 400px;"/&gt;&lt;/p&gt;</summary><content type="html">&lt;h2&gt;Impact&lt;/h2&gt;
&lt;p&gt;Over a period of 6 months, we rolled out a Machine Learning model to predict a customer’s delivery (latitude, longitude). During the recent holiday peak, this ML model handled most of Fetchr’s order scheduling.&lt;/p&gt;
&lt;p&gt;&lt;img src="/images/ml-share3.png" alt="Share of ML scheduled versus Call center scheduled deliveries" style="width: 800px;"/&gt;&lt;/p&gt;
&lt;h2&gt;Introduction&lt;/h2&gt;
&lt;p&gt;In Europe and the US, addresses are not something we think about a lot. My address in Hungary is, for example:&lt;/p&gt;
&lt;p&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&lt;em&gt;1114 Budapest, Szabolcska Mihaly u. 7&lt;/em&gt;&lt;/p&gt;
&lt;p&gt;Here “u.” stands for “utca”, which means “street”. 1114 is my zip code in Hungary. Sometimes I’m lazy and I shorten it, like:&lt;/p&gt;
&lt;p&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&lt;em&gt;1114 BP, Szabolcska 7&lt;/em&gt;&lt;/p&gt;
&lt;p&gt;In the US, it’s customary to write it out in a different order and “street” is dropped:&lt;/p&gt;
&lt;p&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&lt;em&gt;7 Szabolcska Mihaly, Budapest, 1114&lt;/em&gt;&lt;/p&gt;
&lt;p&gt;If I open Google Maps (in an incognito window), I can enter any of the three, and it will point me to the precise (latitude, longitude) of my apartment, which happens to be (47.476117, 19.044950).&lt;/p&gt;
&lt;p&gt;&lt;img src="/images/szabolcska7.png" alt="Szabolcska Mihaly u. 7" style="width: 650px;"/&gt;&lt;/p&gt;
&lt;p&gt;I can give any of these address strings to a delivery company in Hungary, and they will find my apartment. Why does this work? In the US and Europe, the following all hold:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;zip codes exist and everybody uses them&lt;/li&gt;
&lt;li&gt;address formats are reasonably standardized&lt;/li&gt;
&lt;li&gt;most people know what their address is (“my zip code is 1114”)&lt;/li&gt;
&lt;li&gt;most people know how to write out their addresses&lt;/li&gt;
&lt;li&gt;companies like Google have a known database of addresses (and maps)&lt;/li&gt;
&lt;li&gt;companies like Google have an incentive to make services like Google Maps work&lt;/li&gt;
&lt;li&gt;web shops can enforce address formats, eg. can force the user to select from known zip codes, street names in those zip codes, etc.&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;In many countries in the Middle East, some of the above do not hold. For example, in the United Arab Emirates (UAE), there are no zip codes. There are street names, but which street a building falls on is often ambiguous. Also, streets have many names, official and slang, English and Arabic. Often, people don’t know their streets: for example, I live in a hotel in Dubai, and I don’t know which street the building is on (more than half of the population in Dubai are expats). Sometimes buildings have a street number, sometimes not; sometimes people know the number, sometimes not. Finally, there are large areas, even in Dubai, where Google Maps doesn't know street names or numbers:&lt;/p&gt;
&lt;p&gt;&lt;img src="/images/no-street-names.png" alt="No street names" style="width: 650px;"/&gt;&lt;/p&gt;
&lt;p&gt;Many times people also don't give their street name as an address, instead they give an area name (which is itself ambiguous) and building name (“Princess Tower”) or a nearby point of interest (POI) like “near Burger King in Al Barsha, next to SZR” ("SZR" stands for "Sheikh Zayed Road", it's a 2x8 lane super-highway in Dubai). This is the situation in cities like Dubai or Riyadh; in remote areas, in the desert, resolving locations is even harder.&lt;/p&gt;
&lt;p&gt;&lt;em&gt;Note: Interestingly, a few years ago the UAE government created a system to identify buildings called Makani codes, which is a 10 digit number. Every building in the UAE has a Makani code, and every building must have a plaque showing the Makani code. Unfortunately, very few people know their building's Makani code; it’s not widely used (eg. I don’t know my building's Makani code).&lt;/em&gt;&lt;/p&gt;
&lt;p&gt;&lt;img src="/images/makani.jpg" alt="Makani numbers" style="width: 650px;"/&gt;&lt;/p&gt;
&lt;p&gt;Here are some UAE and KSA addresses Fetchr delivered to in the past (capitalization changed to improve readability):&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;&lt;em&gt;Near to Safeer Mall, Khuzam&lt;/em&gt;&lt;/li&gt;
&lt;li&gt;&lt;em&gt;Greece Cluster, Building Greece 05 Shop 04, International City, Dubai&lt;/em&gt;&lt;/li&gt;
&lt;li&gt;&lt;em&gt;Near by Emirates NBD, Nad Al Sheba, Dubai&lt;/em&gt;&lt;/li&gt;
&lt;li&gt;&lt;em&gt;Batha Near Al Rajhi Building near Al Electron Building&lt;/em&gt;&lt;/li&gt;
&lt;li&gt;&lt;em&gt;I work at Royal Green Golf &amp;amp; Country Club&lt;/em&gt;&lt;/li&gt;
&lt;li&gt;&lt;em&gt;My home is near by colors street for car decoration in Jeddah&lt;/em&gt;&lt;/li&gt;
&lt;li&gt;&lt;em&gt;Villa, King Khalid street, Down Town&lt;/em&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;Neither Google Maps nor OpenStreetMap helps here!&lt;/p&gt;
&lt;p&gt;The trick to a successful delivery in this region is the phone number! Unlike in the US or Europe, where an address is enough, here the &lt;strong&gt;phone number is king&lt;/strong&gt;; no package is accepted without the customer’s phone number. For incomplete or ambiguous addresses, delivery companies rely on calling the customer: an agent tries to figure out where to send the package based on the conversation, and the package is then dispatched to that (latitude, longitude). This is called &lt;strong&gt;scheduling&lt;/strong&gt;; the goal is to figure out the spacetime coordinates of the delivery: (latitude, longitude, day, time), but we’ll ignore the (day, time) part here.&lt;/p&gt;
&lt;h2&gt;Modeling&lt;/h2&gt;
&lt;p&gt;The problem we took on: given the freetext (phone, address), can we predict (latitude, longitude), so we can avoid a call to the customer? We set up a dummy service for this and got our software engineers to pass in the (phone, country, city, address); if we can make a good prediction, we return the predicted (latitude, longitude), else we return &lt;code&gt;NO_PREDICTION&lt;/code&gt;, in which case everything happens as before, the customer gets a call. (This is actually an oversimplification, for example the customer can also self-schedule using our app or mweb.)&lt;/p&gt;
&lt;p&gt;The service in production is running a number of models. A model is a way to predict the (latitude, longitude). When the service receives a (phone ... address) request, it goes through the models in a fixed order. If a model returns the (latitude, longitude), the service returns it. If the model returns &lt;code&gt;NO_PREDICTION&lt;/code&gt;, it moves on to the next. The model which returns the best-quality coordinates is first in line, and so on.&lt;/p&gt;
&lt;p&gt;So what models do we actually use? We currently have a total of 5 models running in production. I will describe 2 at a high level below.&lt;/p&gt;
&lt;h2&gt;Repeats&lt;/h2&gt;
&lt;p&gt;When working on building dashboards to understand our delivery operations, we created a metric which shows the % of our customers who are repeat customers. Customers can be identified by their phone numbers, which are also passed in as free text, but normalizing this is easy. It turns out we have a lot of customers that we’ve already delivered to! This is an obvious opportunity: if we’ve delivered to a customer before, and recorded the actual (latitude, longitude) of the delivery (the driver app automatically does this when the package is delivered), then we can look this up. This should work most of the time, because people don’t move that often. This is the basic idea of this model (details omitted on purpose).&lt;/p&gt;
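&lt;p&gt;The normalization step can be as simple as stripping everything that isn't a digit. A sketch (the real pipeline is omitted on purpose; the phone numbers below are fabricated, and real-world normalization would also have to reconcile country-code prefixes):&lt;/p&gt;

```python
import re

def normalize_phone(raw):
    """Collapse a free-text phone number to its digits only."""
    return re.sub(r"\D", "", raw)

# The same customer, keyed identically despite different formatting:
normalize_phone("+971 50 123 4567")   # '971501234567'
normalize_phone("(971) 50-123-4567")  # '971501234567'
```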
&lt;p&gt;The repeat model is simple, but it works amazingly well. The delivery performance (out of 100 dispatches, how many deliveries are successful) of this model outperforms our call center, and is on par with customer self-scheduling (which is the best channel). Part of the reason is that repeat customers are a biased group.&lt;/p&gt;
&lt;h2&gt;Address matching&lt;/h2&gt;
&lt;p&gt;What about non-repeat customers? Can we know where to go just based on the address?&lt;/p&gt;
&lt;p&gt;Initially we tried a lot of things, too many to detail here. Broadly:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;instead of predicting the (latitude, longitude), use a more coarse-grained geographic division of zones (e.g. divide Dubai into a few hundred polygons), and try to predict the correct zone; here we tried various approaches:&lt;ul&gt;
&lt;li&gt;building a separate model for each zone&lt;/li&gt;
&lt;li&gt;building one city-level model with multiple activations, one per zone&lt;/li&gt;
&lt;li&gt;decision tree and other models on feature vectors constructed from bag-of-words models, TF-IDF, etc.&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;li&gt;use raw OpenStreetMaps (OSM) data, extracting “sites”, and matching to that&lt;/li&gt;
&lt;li&gt;mixing-and-matching the above two&lt;/li&gt;
&lt;li&gt;various string tokenization and matching approaches&lt;/li&gt;
&lt;li&gt;etc.&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;After a lot of experimentation, I wasn’t satisfied with the overall performance of the models, and I didn't have enough confidence to put them into production. However, after weeks of working with the data, I realized that I could try something pretty simple “by hand”. I usually look at Dubai data, and I noticed a lot of addresses include the area name, which is pretty unambiguous, for example “Jumeirah Village Circle” or “Jumeirah Village Triangle”.&lt;/p&gt;
&lt;p&gt;&lt;img src="/images/jvt.png" alt="JVT and JVC" style="width: 650px;"/&gt;&lt;/p&gt;
&lt;p&gt;I knew that these areas were served by the same driver, because I’ve been out with him several times to understand what happens on the ground. So if the service returns the middle of the area as a (latitude, longitude) for those addresses, it’ll get dispatched to the correct driver, a good enough first step. So I spent a day looking at Google Maps and OSM and simply wrote out a few hundred rules by hand, did some quick sanity checks to make sure past addresses which would match these were in the right location, and then wrote a simple model which essentially does substring checking. I then put it into production for a few orders / day. A few days later I evaluated the delivery performance, and saw that while it’s not excellent, it’s not that bad. (I later removed this manual model from production; the ML version is much better.)&lt;/p&gt;
&lt;p&gt;So the question was, how do I make this better, and generalize it? I noticed this pattern while looking at the data; clearly there are more patterns like this in the data, so let’s get the machine to learn them. This is what we did: there’s a backend component which looks at all our historic deliveries and finds good rules, and the production service then just uses these rules (details omitted on purpose). It's an interesting approach: a rule-based engine in production, but with the rules coming out of an ML model; this makes it really easy to tune (see below) and to add/remove exceptions.&lt;/p&gt;
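&lt;p&gt;In spirit, the production side of such a rule engine reduces to substring checks. A toy sketch (the rule names and coordinates here are made up for illustration; the real rule extraction is omitted on purpose):&lt;/p&gt;

```python
# Toy rule table: area-name substring -> representative (latitude, longitude).
# Names and coordinates are illustrative, not Fetchr's real rules.
RULES = [
    ("jumeirah village triangle", (25.05, 55.19)),
    ("jumeirah village circle",   (25.06, 55.21)),
]

def predict(address):
    """Return the first matching rule's coordinate, or None (NO_PREDICTION)."""
    address = address.lower()
    for needle, coordinate in RULES:
        if needle in address:
            return coordinate
    return None

predict("Villa 12, Jumeirah Village Triangle, Dubai")  # (25.05, 55.19)
predict("Near Safeer Mall, Khuzam")                    # None -> call center
```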
&lt;h2&gt;Knobs to turn&lt;/h2&gt;
&lt;p&gt;A really nice property of our models is that they have knobs to turn. On the repeat model, we can accept better or worse address similarity when comparing to past addresses. On the address matching type models, we can accept more or less tightly packed historic coordinates when deciding which rule to run in production. This allows us to turn knobs:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;run models in “tight” mode, where we schedule fewer orders (more prediction queries return &lt;code&gt;NO_PREDICTION&lt;/code&gt; and go to the call center), but the returned coordinates are very accurate and hence we get good delivery performance.&lt;/li&gt;
&lt;li&gt;run models in “wide” mode, where we schedule more orders (fewer orders return &lt;code&gt;NO_PREDICTION&lt;/code&gt; and go to the call center), but the returned coordinates are on average less accurate and hence we get lower delivery performance---but we pass fewer orders to the call center.&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;We can use these knobs to make choices. For example, if other scheduling channels are not available, it makes sense to run the model as wide as possible; and there's a break-even point, where the model performs as well on average as the call center.&lt;/p&gt;
&lt;h2&gt;Winner take all&lt;/h2&gt;
&lt;p&gt;The delivery market has a “winner take all” dynamic: more order volume means higher density, means more loaded drivers, means more efficient drivers, means lower cost. This also applies to the ML models. The more deliveries a company has made, the more repeats it will have (eventually, it will cover the entire population of a country/city). The more deliveries a company has made, the better address rules it can extract from its data. More past deliveries lead to higher efficiency today.&lt;/p&gt;
&lt;h2&gt;Statistical improvements&lt;/h2&gt;
&lt;p&gt;There are a lot of ways to improve these models. The simplest one is based on counting. Using the address matching model as a use-case, we can simply count how many dispatches are coming from each rule (like the toy model example &lt;code&gt;“jumeirah village triangle” -&amp;gt; (latitude, longitude)&lt;/code&gt;), compute the delivery performance (=deliveries/dispatches) for each rule, and prune the badly performing ones. There’s an exploration-exploitation trade-off here, so we use an epsilon-greedy strategy. For more on this, see &lt;a href="https://en.wikipedia.org/wiki/Multi-armed_bandit"&gt;multi-armed bandits&lt;/a&gt;.&lt;/p&gt;
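&lt;p&gt;An epsilon-greedy selection over rules can be sketched in a few lines (the counters and rule names are made up for illustration; the real pruning logic is more involved):&lt;/p&gt;

```python
import random

# Per-rule counters: dispatches and successful deliveries (made-up numbers).
rules = {
    "jumeirah village triangle": {"dispatches": 200, "delivered": 180},  # 90%
    "khuzam near safeer mall":   {"dispatches": 50,  "delivered": 20},   # 40%
}

def pick_rule(rules, epsilon=0.1, rng=random):
    """Epsilon-greedy: mostly exploit the best rule, explore with probability epsilon."""
    if rng.random() < epsilon:
        # explore: occasionally dispatch via a random rule to keep measuring it
        return rng.choice(list(rules))
    # exploit: the rule with the best delivery performance (deliveries/dispatches)
    return max(rules, key=lambda r: rules[r]["delivered"] / rules[r]["dispatches"])
```

&lt;p&gt;With &lt;code&gt;epsilon=0&lt;/code&gt; this always picks the best-performing rule; a small positive epsilon keeps gathering fresh data on the others, which is what lets us prune bad rules with confidence.&lt;/p&gt;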
&lt;h2&gt;Metrics&lt;/h2&gt;
&lt;p&gt;&lt;strong&gt;Good Machine Learning goes hand in hand with good Data Engineering and Analytics.&lt;/strong&gt; This project came out of building 100s of charts and metrics to understand and visualize Fetchr’s operations and business. For this project, the most relevant were:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;&lt;strong&gt;Repeat %&lt;/strong&gt;: what % of our daily dispatches are going to customers we’ve delivered to before; the higher, the easier it is to do a good job of predicting customer location and behaviour based on past data. Since Fetchr is very successful and operates at scale, we have a fair share of repeats.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Scheduling accuracy&lt;/strong&gt;: scheduling accuracy is the % of deliveries where the scheduled coordinate and the delivery coordinate are within X meters. The challenge is that the delivery coordinate is unreliable: sometimes the drivers update the order status hours after the delivery event (e.g. while having coffee). The scheduled coordinate itself could also be incorrect. But when the two are close together, it’s very likely that they point to the correct location. Scheduling accuracy can also be benchmarked when back-testing models.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Delivery performance&lt;/strong&gt;: Delivery performance is a daily metric, it’s the % of dispatches that are successfully delivered. Delivery performance is not something we can back-test when building models, it has to be measured in production, experimentally, eg. on a small 1% release. (Delivery performance is the One Metric That Matters for delivery companies, we live and die by it.)&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Scheduling channel splits, model splits&lt;/strong&gt;: also a daily metric, it shows what % of dispatches came from which scheduling channel (call center, ML, self-scheduling, etc.), and specifically for the ML channel, what % came from which model.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Conversion:&lt;/strong&gt; of all orders passed to the ML model for coordinate prediction, for what % do we return a coordinate (instead of &lt;code&gt;NO_PREDICTION&lt;/code&gt;).&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;&lt;img src="/images/ml-conversion3.png" alt="ML Conversion%" style="width: 650px;"/&gt;&lt;/p&gt;
&lt;h2&gt;Conclusion&lt;/h2&gt;
&lt;p&gt;The delivery coordinate prediction service has been a great success at Fetchr. The version currently in production is relatively simple, easy to understand and tunable, and adding exceptions is easy. There are lots of improvement opportunities in the current models themselves, ordering of models based on features, and of course making more complex models. Our goal is to go further up and toward the right in (conversion, accuracy) space!&lt;/p&gt;</content><category term="fetchr"></category><category term="machine-learning"></category><category term="call-center"></category></entry><entry><title>5 things that happened in Data Science in 2018</title><link href="https://bytepawn.com/five-things-2018-data-science.html" rel="alternate"></link><published>2019-01-09T00:00:00+01:00</published><updated>2019-01-09T00:00:00+01:00</updated><author><name>Marton Trencseni</name></author><id>tag:bytepawn.com,2019-01-09:/five-things-2018-data-science.html</id><summary type="html">&lt;p&gt;2018 was a hot year for Data Science and AI. Here we picked out 5 highlights, which in our opinion shaped the field in the past year.&lt;br/&gt;&lt;br/&gt;&lt;img src="/images/q3deepmind.png" alt="Deepmind playing CTF" style="width: 400px;"/&gt;&lt;/p&gt;</summary><content type="html">&lt;p&gt;2018 was a hot year for Data Science and AI. Here we picked out 5 highlights, which in our opinion shaped the field in the past year.&lt;/p&gt;
&lt;h2&gt;OpenAI Five achieves human-level DOTA team gameplay&lt;/h2&gt;
&lt;p&gt;OpenAI started to achieve professional-level 1v1 play in 2017, but 2018 was the year when team gameplay became human-level. In DOTA, two teams of 5 players (or bots) battle for domination of resources on a map and try to kill each other off. OpenAI uses Reinforcement Learning with an LSTM network and self-play for learning, playing 180 years' worth of games against itself every day on 256 GPUs and 128,000 CPU cores.&lt;/p&gt;
&lt;p&gt;&lt;img src="/images/dota.png" alt="OpenAI Five" style="width: 800px;"/&gt;&lt;/p&gt;
&lt;p&gt;At a high level, one interesting aspect of OpenAI’s success is that no deep theoretical breakthrough was necessary (so far) to achieve this; the basic ideas of the AI are similar to Google DeepMind’s AlphaGo, which defeated Lee Sedol in 2016. But Go, although a very deep, strategic game, still seemed very “discrete” compared to a Real Time Strategy game like DOTA. It will be interesting to see how far the OpenAI architecture will scale: can it defeat the top professional human team in 2019? I wouldn’t bet against the AI.&lt;/p&gt;
&lt;p&gt;OpenAI resources:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;&lt;a href="https://blog.openai.com/openai-five/"&gt;OpenAI Five&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://gym.openai.com/"&gt;OpenAI Gym&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;h2&gt;Human-level performance in Quake 3 Arena&lt;/h2&gt;
&lt;p&gt;Staying with games, 2018 was also the year when DeepMind achieved human-level performance in Quake 3, using only pixels and game points as input. Although this result received less coverage than the earlier AlphaGo and the OpenAI successes, for people who grew up playing Doom and Quake, this is perhaps even cooler.&lt;/p&gt;
&lt;p&gt;&lt;img src="/images/q3deepmind.png" alt="Deepmind playing CTF" style="width: 800px;"/&gt;&lt;/p&gt;
&lt;p&gt;Like the other game playing bots, this one is also based on Reinforcement Learning, but here thousands of bots are “alive” at any given time, playing in teams against themselves in randomly generated environments. Each agent in the population learns its own internal reward signal to complement the sparse delayed reward from winning and selects actions using a hierarchical representation that enables the agent to reason at multiple timescales. The videos show that the bots display human-like behaviors such as navigating, following, and defending:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;&lt;a href="https://arxiv.org/abs/1807.01281"&gt;DeepMind paper: Human-level performance in first-person multiplayer games with population-based deep reinforcement learning&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://www.youtube.com/watch?v=dltN4MxV1RI"&gt;Video supplement to the paper&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://www.youtube.com/watch?v=MvFABFWPBrw&amp;amp;t=96s"&gt;Video explanation of the DeepMind Quake3 architecture&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;h2&gt;Self-driving cars did not arrive in 2018&lt;/h2&gt;
&lt;p&gt;Self-driving cars are the “killer app” of the current AI hype cycle, and many people look at them as an indicator of the field's progress. Everybody understands this application, most people's lives would be affected, and it’s hard to argue with the cool-factor. The stakes are high, and many big players are working on this disruptive advancement: who will supply the brain of these self-driving cars? who will sell the most self-driving cars? which self-driving car service will dominate? and equally importantly, what will drivers who no longer have to drive do? whose content will they consume?&lt;/p&gt;
&lt;p&gt;Progress is steady, with Tesla pushing out new versions over the air, and other major car manufacturers also experimenting with self-driving features, but it seems Google’s Waymo is furthest ahead in terms of commercial deployment: in 2018 Waymo launched its first commercial self-driving car service called "Waymo One", where users in the Phoenix metropolitan area can use an app to request the service.&lt;/p&gt;
&lt;p&gt;But in 2018, we still were not able to get in a self-driving car, put our kids in the back, and get from San Francisco to Los Angeles without touching the wheel.&lt;/p&gt;
&lt;p&gt;&lt;img src="/images/waymo.jpg" alt="Deepmind playing CTF" style="width: 800px;"/&gt;&lt;/p&gt;
&lt;p&gt;Instead, we had to cope with sad news. There were fatal crashes related to self-driving technology of Uber and Tesla, which triggered widespread debate about how the technology should be marketed, tested and deployed.&lt;/p&gt;
&lt;p&gt;References:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;&lt;a href="https://www.theverge.com/2018/6/22/17492320/safety-driver-self-driving-uber-crash-hulu-police-report"&gt;Uber crash&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://www.wired.com/story/tesla-autopilot-self-driving-crash-california/"&gt;Tesla crash&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;h2&gt;PyTorch reached 1.0&lt;/h2&gt;
&lt;p&gt;Data Scientists and ML Researchers have a wide variety of tools and frameworks to choose from, with Tensorflow, Keras and PyTorch being the most widely used Deep Learning ones. PyTorch stands out to us because it is a “hacker’s framework”. Tensorflow came out of Google, where most ML jobs operate on vast amounts of data, which led to design decisions trading ease of use and debuggability for raw performance.&lt;/p&gt;
&lt;p&gt;&lt;img src="/images/pytorch.png" alt="PyTorch logo" style="width: 500px;"/&gt;&lt;/p&gt;
&lt;p&gt;For smaller scale, more lean problems and teams, we believe PyTorch is a better fit. PyTorch is easier to work with than Tensorflow, the execution model is nicely integrated with Python, for example, step-by-step &lt;code&gt;print()&lt;/code&gt;s work---at the expense of some performance. But unless you’re working at Google scale, the speed of iterating on your model and code should come before the speed of training. To get an introduction to PyTorch, check out the excellent tutorial lecture by Stefan Otte:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;&lt;a href="https://www.youtube.com/watch?v=_H3aw6wkCv0"&gt;Deep Neural Networks with PyTorch - Stefan Otte&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;h2&gt;Reinforce AI Conference announced!&lt;/h2&gt;
&lt;p&gt;&lt;a href="https://reinforceconf.com/"&gt;Reinforce Conf&lt;/a&gt; gives product managers, data scientists and engineers insights into success stories, lessons learned, best practices and new approaches, &lt;strong&gt;in the beautiful European city of Budapest, on March 20-22, 2019&lt;/strong&gt;.&lt;/p&gt;
&lt;p&gt;&lt;img src="/images/reinforce.png" alt="Reinforce" style="width: 800px;"/&gt;&lt;/p&gt;
&lt;p&gt;Reinforce brings together a diverse group of leading experts and practitioners to share their knowledge and experience on the fast-moving field of Machine Learning and Artificial Intelligence. Some of the speakers:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;Andrea Pasqua, Data Science Manager, Uber&lt;/li&gt;
&lt;li&gt;Christian Szegedy, Staff Research Scientist, Google&lt;/li&gt;
&lt;li&gt;Cibele Montez Halasz, Machine Learning Engineer, Twitter&lt;/li&gt;
&lt;li&gt;Kush R. Varshney, Principal Research Staff Member, and Manager, IBM Watson&lt;/li&gt;
&lt;li&gt;Patrick van der Smagt, Director Artificial Intelligence, Volkswagen&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;Use promo code &lt;code&gt;KDNuggets&lt;/code&gt; to get a 20% discount.&lt;/p&gt;</content><category term="data"></category><category term="openai"></category><category term="waymo"></category><category term="deepmind"></category><category term="tesla"></category><category term="reinforce"></category></entry><entry><title>Warehouse locations with k-means</title><link href="https://bytepawn.com/warehouse-locations-with-kmeans.html" rel="alternate"></link><published>2018-09-26T00:00:00+02:00</published><updated>2018-09-26T00:00:00+02:00</updated><author><name>Marton Trencseni</name></author><id>tag:bytepawn.com,2018-09-26:/warehouse-locations-with-kmeans.html</id><summary type="html">&lt;p&gt;Sometimes, the seven gods of data science, Pascal, Gauss, Bayes, Poisson, Markov, Shannon  and Fisher, all wake up in a good mood, and things just work out. Recently we had such an occurence at Fetchr, when the Operational Excellence team posed the following question: &lt;i&gt;if we could pick our Saudi warehouse locations, where would be put them? What is the ideal number of warehouses, and, what does ideal even mean? Also, what should our “delivery radius” be?&lt;/i&gt;&lt;br/&gt;&lt;br/&gt;&lt;img src="/images/pickup_locations_riyadh.png" alt="." style="width: 400px;"/&gt;&lt;/p&gt;</summary><content type="html">&lt;h2&gt;Introduction&lt;/h2&gt;
&lt;p&gt;Sometimes, the seven gods of data science, &lt;a href="https://en.wikipedia.org/wiki/Blaise_Pascal"&gt;Pascal&lt;/a&gt;, &lt;a href="https://en.wikipedia.org/wiki/Carl_Friedrich_Gauss"&gt;Gauss&lt;/a&gt;, &lt;a href="https://en.wikipedia.org/wiki/Thomas_Bayes"&gt;Bayes&lt;/a&gt;, &lt;a href="https://en.wikipedia.org/wiki/Sim%C3%A9on_Denis_Poisson"&gt;Poisson&lt;/a&gt;, &lt;a href="https://en.wikipedia.org/wiki/Andrey_Markov"&gt;Markov&lt;/a&gt;, &lt;a href="https://en.wikipedia.org/wiki/Claude_Shannon"&gt;Shannon&lt;/a&gt; and &lt;a href="https://en.wikipedia.org/wiki/Ronald_Fisher"&gt;Fisher&lt;/a&gt;, all wake up in a good mood, and things just work out. Recently we had such an occurrence at Fetchr, when the Operational Excellence team posed the following question: &lt;i&gt;if we could pick our KSA (=Kingdom of Saudi Arabia) warehouse locations, where would we put them? What is the ideal number of warehouses, and, what does ideal even mean? Also, what should our “delivery radius” be?&lt;/i&gt;&lt;/p&gt;
&lt;p&gt;For those of us ignorant of Middle East geography, some facts about KSA:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;the biggest country in the Middle East&lt;/li&gt;
&lt;li&gt;mostly desert&lt;/li&gt;
&lt;li&gt;not very far from global conflict locations such as Iraq, Syria, Lebanon&lt;/li&gt;
&lt;li&gt;about 6x as big as Germany (a “big” European country), with 0.4x of the population&lt;/li&gt;
&lt;li&gt;about 25x as big as UAE (where Dubai, Fetchr’s HQ is)&lt;/li&gt;
&lt;li&gt;about 24x as big as Hungary (my home)&lt;/li&gt;
&lt;li&gt;responsible for 13% of the world’s oil production&lt;/li&gt;
&lt;li&gt;e-commerce is exploding, lots of people are ordering stuff online&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;&lt;img src="/images/ksa_map.png" alt="KSA map" style="width: 650px;"/&gt;&lt;/p&gt;
&lt;p&gt;&lt;i&gt;Note: I will describe this fun little project as best as I can without giving away sensitive information. In some parts I will use synthetic data and in cases where the information is public/discoverable ("Fetchr has a warehouse in Riyadh"), I will just show the real thing.&lt;/i&gt;&lt;/p&gt;
&lt;h2&gt;Metrics and k-means&lt;/h2&gt;
&lt;p&gt;The first question is, what is “good” here? What are “good” warehouse locations? We need to find a metric to minimize/maximize. This is pretty straightforward: given a set of warehouse locations, we assign each order to the nearest warehouse (we assume it would be dispatched from there) and calculate that distance. The goal is then to find warehouse locations which minimize the average distance across all orders.&lt;/p&gt;
&lt;p&gt;We can write this out as a (naive) algorithm:&lt;/p&gt;
&lt;p&gt;&lt;img src="/images/avg_distance.png" alt="Code for avg_distance()" style="width: 650px;"/&gt;&lt;/p&gt;
&lt;p&gt;Given this choice of metric, we can evaluate a set of warehouse locations, and compare it to another.&lt;/p&gt;
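&lt;p&gt;&lt;i&gt;For readability, here is roughly what such an &lt;code&gt;avg_distance()&lt;/code&gt; looks like in Python; this is a sketch of the idea in the screenshot above, not the original code, with haversine standing in for as-the-crow-flies distance:&lt;/i&gt;&lt;/p&gt;

```python
from math import radians, sin, cos, asin, sqrt

def haversine_km(lat1, lon1, lat2, lon2):
    # as-the-crow-flies distance between two (lat, lon) points, in kilometers
    lat1, lon1, lat2, lon2 = map(radians, (lat1, lon1, lat2, lon2))
    a = sin((lat2 - lat1) / 2) ** 2 + cos(lat1) * cos(lat2) * sin((lon2 - lon1) / 2) ** 2
    return 2 * 6371 * asin(sqrt(a))  # 6371 km = mean Earth radius

def avg_distance(warehouses, orders):
    # assign each order to its nearest warehouse, return the mean distance
    total = 0.0
    for (olat, olon) in orders:
        total += min(haversine_km(olat, olon, wlat, wlon) for (wlat, wlon) in warehouses)
    return total / len(orders)
```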
&lt;p&gt;The next question is, how do we actually pick the warehouse locations? Putting aside the question of how many warehouse locations we should have, assuming we know we want N locations, there is an algorithm just for this: &lt;a href="https://en.wikipedia.org/wiki/K-means_clustering"&gt;k-means clustering&lt;/a&gt;. You give k-means a set of points and a parameter N, and it returns the best N cluster centroid locations, which minimize the average distance to the nearest centroid. (This problem is actually NP-hard, so k-means returns an approximation of the solution.) At Fetchr we are standardized on Python and &lt;a href="http://scikit-learn.org/stable/"&gt;scikit-learn&lt;/a&gt;, so we used SKL’s excellent k-means implementation; it’s just 2 lines of code:&lt;/p&gt;
&lt;p&gt;&lt;img src="/images/skl_kmeans.png" alt="k-means in scikit-learn" style="width: 650px;"/&gt;&lt;/p&gt;
&lt;h2&gt;Analysis&lt;/h2&gt;
&lt;p&gt;The next question is, how do we pick N, the number of warehouses?&lt;/p&gt;
&lt;p&gt;&lt;i&gt;Note: There are other clustering algorithms such as &lt;a href="https://en.wikipedia.org/wiki/DBSCAN"&gt;DBSCAN&lt;/a&gt; that do not depend on N as an input, and tell you the best N as an output. But there is no free lunch, so even with DBSCAN you have to specify an epsilon parameter as an input, and it uses that epsilon to tell apart “core” and “noise” points, with clusters being dense “core” regions surrounded by sparse “noise” regions. So DBSCAN is also not parameter-free.&lt;/i&gt;&lt;/p&gt;
&lt;p&gt;To keep things parameter-free and &lt;a href="https://en.wikipedia.org/wiki/KISS_principle"&gt;simple&lt;/a&gt; to interpret, we ran k-means from &lt;code&gt;N = 1 .. N_max&lt;/code&gt; for a large &lt;code&gt;N_max&lt;/code&gt;, plotted the average distance achieved for each N, then read off the “best value” in some sense. Later we will see that being able to vary N is actually a good thing, because we can get easy-to-interpret insights from it. So k-means it is. With this, the basic approach of our analysis is, in pseudo-code:&lt;/p&gt;
&lt;p&gt;&lt;img src="/images/kmeans_skeleton.png" alt="." style="width: 650px;"/&gt;&lt;/p&gt;
&lt;p&gt;We can also compute the average distance with our actual, current KSA warehouse locations, and see how good a job the Operations team did picking them out. Running this analysis with different N, also showing actual warehouse locations, this is what the average distance curve might look like:&lt;/p&gt;
&lt;p&gt;&lt;img src="/images/wh_dist_1.png" alt="." style="width: 650px;"/&gt;&lt;/p&gt;
&lt;p&gt;This (the real version of this) shows us:&lt;/p&gt;
&lt;ol&gt;
&lt;li&gt;How much average distance we could gain by keeping the number of hubs constant but moving to optimal locations (move down vertically from the blue cross to the red line).&lt;/li&gt;
&lt;li&gt;How many warehouses we could close and remain at the same average distance (move left from the blue cross to the red line).&lt;/li&gt;
&lt;li&gt;Our Operations team did a good job, our current warehouse locations are pretty good.&lt;/li&gt;
&lt;/ol&gt;
&lt;p&gt;Thinking it through, it’s obvious that the average distance will not converge to some non-zero value: it goes to zero as N (the number of warehouse locations) approaches the number of input points, since in theory we can achieve zero average distance by placing a warehouse next to each delivery location. The next best thing to check is the “derivative” of this curve, i.e. how much distance we “gain” (well, lose) if we add one more warehouse. From this we will see at which N we get the biggest gains, and where the gains level off.&lt;/p&gt;
&lt;p&gt;&lt;img src="/images/wh_dist_2.png" alt="." style="width: 650px;"/&gt;&lt;/p&gt;
&lt;p&gt;Here we can see that initially we gain a lot, then at some N our gains even out (but remain non-zero). Then, at a higher N, the gains drop further. These are interesting points to investigate further.&lt;/p&gt;
&lt;p&gt;With these same tools we can also investigate the question of “delivery radius”. Delivery companies often have no-service zones where they don’t accept packages, because they can’t efficiently (=profitably) service these areas. Or these areas are serviced, but only once or twice a week. To get a feel for this, we took our actual warehouse locations and put a circle of radius R kilometers around them. We investigated, as a function of the R cut-off:&lt;/p&gt;
&lt;ol&gt;
&lt;li&gt;What percentage of orders lie outside of R?&lt;/li&gt;
&lt;li&gt;If we cut these outliers and re-run k-means, how much average distance do we gain?&lt;/li&gt;
&lt;/ol&gt;
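&lt;p&gt;&lt;i&gt;Both questions can be answered with a few lines (again a sketch with illustrative names; Euclidean distance stands in for the real metric):&lt;/i&gt;&lt;/p&gt;

```python
import numpy as np

def radius_cutoff_stats(orders, warehouses, r_km):
    # distance of each order to its nearest warehouse
    dists = np.array([min(np.linalg.norm(o - w) for w in warehouses) for o in orders])
    outside = dists > r_km
    pct_outside = outside.mean() * 100        # 1. % of orders outside radius R
    avg_dist_inside = dists[~outside].mean()  # 2. average distance after the cut
    return pct_outside, avg_dist_inside
```

&lt;p&gt;Re-running k-means on just the orders inside the radius then gives the improved average distance.&lt;/p&gt;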
&lt;p&gt;For us, it turns out we can make a strong argument: there is a given R for which we only cut off a very low percent of orders, but we gain a lot of average distance!&lt;/p&gt;
&lt;p&gt;&lt;img src="/images/wh_dist_3.png" alt="." style="width: 650px;"/&gt;&lt;/p&gt;
&lt;p&gt;This also shows that by performing both optimizations (distance cut-off and k-means) we can actually gain a lot of efficiency; moving from the blue cross to the green line is significant. For more on outliers and outlier detection, see my previous post &lt;a href="http://bytepawn.com/beat-the-averages.html#beat-the-averages"&gt;Beat the averages&lt;/a&gt;.&lt;/p&gt;
&lt;h2&gt;Locations&lt;/h2&gt;
&lt;p&gt;So far our analysis has been very quantitative in that we looked at plots and curves. We can also look at the actual recommended locations (latitude, longitude) on a map. As a starting point, we look at the k-means recommended warehouse locations when running at &lt;code&gt;N = N_actual&lt;/code&gt;; here we expect the recommended locations to resemble our actuals, as a sanity check on k-means. And it actually works out! For example, k-means correctly places warehouses in the center of the biggest KSA cities (e.g. Riyadh, Jeddah, etc., where all medium-to-large delivery companies like Fetchr must have warehouses).&lt;/p&gt;
&lt;p&gt;&lt;img src="/images/wh_locations_1.png" alt="." style="width: 650px;"/&gt;
&lt;center&gt;Red is actual, yellow is k-means at &lt;code&gt;N = N_actual&lt;/code&gt;.&lt;/center&gt;&lt;/p&gt;
&lt;p&gt;The interesting thing happens as we start to decrease N; essentially k-means starts finding more optimal configurations and/or recommends merging warehouses:&lt;/p&gt;
&lt;p&gt;&lt;img src="/images/wh_locations_2.png" alt="." style="width: 650px;"/&gt;
&lt;center&gt;Red is actual, yellow is k-means at &lt;code&gt;N &amp;lt; N_actual&lt;/code&gt;.&lt;/center&gt;&lt;/p&gt;
&lt;p&gt;As we decrease N (the number of warehouses), it essentially makes recommendations, like &lt;i&gt;“if you want to decrease the number of warehouse locations, join these two, and try to place the new one in the middle”.&lt;/i&gt; In real life this may or may not be feasible, because the middle may be just desert.&lt;/p&gt;
&lt;h2&gt;Conclusion and impact&lt;/h2&gt;
&lt;p&gt;This analysis was not prescriptive (&lt;i&gt;“rent N warehouses here”&lt;/i&gt;); it was a discussion starter for our operations and business teams. For this reason, we could get away with using as-the-crow-flies distances instead of proper road-routed distances. But still, based on this, we were able to make a good recommendation for N, the number of warehouses (lower than the actual count), and for locations, and we also learned that our actual warehouse locations are not too bad. We also investigated delivery radius and found that if we cut orders at a certain R distance away from a warehouse, we only cut a few % of orders, but average distance drops significantly.&lt;/p&gt;
&lt;p&gt;As a bonus, a few weeks later the seven gods of data science smiled on us again. It turns out there is a very similar logistics problem to picking warehouse locations: pickup locations, where customers can go and self-pickup their packages. This time, the question was: &lt;i&gt;“What would the best pickup locations be in Riyadh? How many should we even have?”.&lt;/i&gt; We were able to re-use the same analytical framework, only this time running it on just Riyadh city data. The analysis says: &lt;i&gt;"put the first 5 pickup locations into central Riyadh, and the next one into Al Muzahimiyah, the next one into... and so on."&lt;/i&gt;&lt;/p&gt;
&lt;p&gt;&lt;img src="/images/pickup_locations_riyadh.png" alt="." style="width: 650px;"/&gt;&lt;/p&gt;
&lt;p&gt;It’s interesting that all examples on the k-means Wikipedia page are cases where the distance metric is synthetic (vector quantization, image recognition, NLP). In this beautiful logistics use-case the distance is physical Euclidean distance, the real thing. It’s amazing how an old-school algorithm like k-means can deliver so much impact in unexpected places!&lt;/p&gt;
&lt;p&gt;Previously I wrote two articles about &lt;a href="http://bytepawn.com/fetchr-data-science-infra.html#fetchr-data-science-infra"&gt;data infra and data engineering at Fetchr&lt;/a&gt;. This time I want to move up the stack and talk about a simple piece of metrics engineering that proved to be very impactful: &lt;a href="https://www.facebook.com/FacebookforDevelopers/videos/growth-accounting-triangle-heatmap-explanation/3707283286197/"&gt;Growth Accounting&lt;/a&gt; and Backtraced Growth Accounting.&lt;/p&gt;
&lt;h2&gt;Standard Growth Accounting&lt;/h2&gt;
&lt;p&gt;Let’s start with Growth Accounting, a standard framework for understanding user lifecycles and churn in the SaaS world. I’ve implemented and run Growth Accounting in previous jobs, but really grasped its importance (and how to do it properly) during my time at Facebook.&lt;/p&gt;
&lt;p&gt;The basic framework of Growth Accounting is to assign a state to every object, every day. The possible states are:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;New&lt;/li&gt;
&lt;li&gt;Retained&lt;/li&gt;
&lt;li&gt;Churned&lt;/li&gt;
&lt;li&gt;Stale&lt;/li&gt;
&lt;li&gt;Resurrected&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;First, what is an ‘object’? An object can be anything, like a FB user, a FB group, a Prezi presentation or a Fetchr client company. The second part of Growth Accounting is determining which actions (or lack of them) lead to state transitions. For a user, actions are usually any activity; for a group, any activity in the group; for a Prezi, it may be views. The third part of Growth Accounting is fixing the time horizon; this is usually either 1 day (Daily Growth Accounting), 7 days (Weekly) or 28 days (Monthly).&lt;/p&gt;
&lt;p&gt;So, before Growth Accounting, we need to decide the:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;object&lt;/li&gt;
&lt;li&gt;actions&lt;/li&gt;
&lt;li&gt;time horizon&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;The most common use-case for Growth Accounting is users, on a monthly basis, so I’ll stick to that example in the first part:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;object = users&lt;/li&gt;
&lt;li&gt;actions = any activity by user&lt;/li&gt;
&lt;li&gt;time horizon = 28 days&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;Growth Accounting then says:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;a user is a Monthly Active User (MAU) if they had any activity in the last 28 days&lt;/li&gt;
&lt;li&gt;run the below state transitions daily to get the user’s Monthly Growth Accounting state:&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;&lt;img src="/images/ga_state_transitions.png" alt="Growth Accounting State Transitions" style="width: 650px;"/&gt;&lt;/p&gt;
&lt;p&gt;The same state transition table works for Daily Growth Accounting by replacing MAU with DAU (Daily Active User), and Weekly Growth Accounting by replacing MAU with WAU (Weekly Active User).&lt;/p&gt;
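&lt;p&gt;&lt;i&gt;The transition table can be sketched as a function of an object’s activity history (a reconstruction from the definitions in this post, not our production code; &lt;code&gt;day&lt;/code&gt; is assumed to be on or after the first activity):&lt;/i&gt;&lt;/p&gt;

```python
from datetime import date, timedelta

def ga_state(activity_days, day, horizon=28):
    # activity_days: set of dates on which the object was active
    # horizon: 1 for Daily, 7 for Weekly, 28 for Monthly Growth Accounting
    def active(d):
        # active within the trailing window ending on (and including) d
        return any((d - timedelta(days=k)) in activity_days for k in range(horizon))
    if day == min(activity_days):
        return 'New'
    if active(day):
        return 'Retained' if active(day - timedelta(days=1)) else 'Resurrected'
    if active(day - timedelta(days=1)):
        return 'Churned'
    return 'Stale'
```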
&lt;p&gt;Some notes:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;users are in the New state for only one day (the first), irrespective of whether Growth Accounting is Daily/Weekly/Monthly&lt;/li&gt;
&lt;li&gt;users are in the Churned state for only one day, irrespective of whether Growth Accounting is Daily/Weekly/Monthly&lt;/li&gt;
&lt;li&gt;users are in the Resurrected state for only one day (the first active after a period of inactivity), irrespective of whether Growth Accounting is Daily/Weekly/Monthly&lt;/li&gt;
&lt;li&gt;inactive users end up in the Stale state (when speaking, we tend to say ‘churned users’, but stale is a more accurate term)&lt;/li&gt;
&lt;li&gt;continuously active users end up in the Retained state&lt;/li&gt;
&lt;li&gt;the two states where users bulk up are Retained and Stale&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;The point of doing Growth Accounting is to count every day the number of New, Retained, etc. users (plus Net New, see below) and put the counts on a time series chart (perhaps a stacked one), and look at it every day. What this usually shows:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;every product churns out users, so Churned will be non-zero&lt;/li&gt;
&lt;li&gt;almost every product churns out a lot of users, so Stale will monotonically increase&lt;/li&gt;
&lt;li&gt;if Churn increases, that’s potentially bad, or the product is just growing&lt;/li&gt;
&lt;li&gt;if the product is healthy, Retained should increase&lt;/li&gt;
&lt;li&gt;The Growth Accounting equation is:&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;&lt;code&gt;Net New = New + Resurrected - Churned&lt;/code&gt;&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;if Net New &amp;gt; 0, that’s good, the MAU count (or DAU or WAU) is increasing&lt;/li&gt;
&lt;li&gt;if Net New &amp;lt; 0, that’s bad, the MAU count is decreasing, the product is slowly “dying”&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;Some comments about the Monthly, Weekly, Daily bit:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;When doing Monthly Growth Account, 1 day of activity is enough to make the user MAU for 28 subsequent days (first they will be New or Resurrected, then Retained for 27 days, then Churned for one day and then Stale, assuming they’re not active again):
&lt;code&gt;New/Resurrected (1st day) → Retained (2..28) → Churned (29) → Stale (30th day)&lt;/code&gt;&lt;/li&gt;
&lt;li&gt;When doing Daily Growth Accounting, 1 day of activity will make the user DAU for one day, the state transitions will be quick:
&lt;code&gt;New/Resurrected  (1st day) → Churned (2) → Stale (3rd day)&lt;/code&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;Usually it makes sense to run all three (D/M/W) and keep tabs on all of them. DAU and WAU respond quicker to problems (with MAU, if there’s a problem, people won’t enter Churned for 28 days after the last activity), but are also more volatile.&lt;/p&gt;
&lt;p&gt;As discussed in the introduction, Growth Accounting as a framework can be separated from the object and activity definition, and can be run for different time periods. So it makes sense to write it as a modular script that can be reused by plugging in whatever object/action definition. Thinking about tables, Growth Accounting needs just 2 input columns:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;day: a date&lt;/li&gt;
&lt;li&gt;id: the id of the object&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;A row like (day, id) simply means that the object with that id was active on that day. That’s it. Given this table, a starting date, and a choice of D/W/M, a Growth Accounting framework can generate the output:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;day&lt;/li&gt;
&lt;li&gt;id&lt;/li&gt;
&lt;li&gt;state: the growth accounting state of the object with id on day&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;We use Airflow, our Growth Accounting framework is baked into a function &lt;code&gt;dag_growth_accounting()&lt;/code&gt;, and it’s parameterized like:&lt;/p&gt;
&lt;p&gt;&lt;img src="/images/ga_client.png" alt="B2C Client Growth Accounting" style="width: 650px;"/&gt;&lt;/p&gt;
&lt;p&gt;This is the definition of our B2C Client Growth Accounting pipeline. The resulting table is &lt;code&gt;client_growth_accounting_28d&lt;/code&gt;, the first day for which states are generated is &lt;code&gt;2017-07-01&lt;/code&gt; (on that day every object with activity is New), it’s Monthly Growth Accounting, and activity is when we make a delivery for a client. Day is not an explicit column here, the &lt;code&gt;events_table&lt;/code&gt; is used as a sub-select in the script for every day when Growth Accounting runs.&lt;/p&gt;
&lt;p&gt;There is one additional feature here: a &lt;code&gt;country&lt;/code&gt; column. This makes it possible to filter the charts by country, so we can see per-country counts. This is just a property attached to the objects.&lt;/p&gt;
&lt;p&gt;The code for the framework function &lt;code&gt;dag_growth_accounting()&lt;/code&gt; which creates the Airflow DAG is below:&lt;/p&gt;
&lt;script src="https://gist.github.com/mtrencseni/8c93349f12aa48478588a2ecd91d4c1b.js"&gt;&lt;/script&gt;

&lt;p&gt;Note that this also calculates the L-number for each object: the number of active days in the last X days, where X is 1, 7, 28 depending on whether we're doing D/W/M Growth Accounting.&lt;/p&gt;
&lt;h2&gt;Backtraced Growth Accounting&lt;/h2&gt;
&lt;p&gt;The challenge we faced at Fetchr is that standard Growth Accounting treats every object the same. One user equals another user, one churned user equals another churned user, and so on. There is no concept of a “more valuable user”.&lt;/p&gt;
&lt;p&gt;But when doing Growth Accounting for our B2C client companies, this is not true. If Client X gives us 5,000 deliveries per day, and Client Y gives us 10 deliveries per day, then if Client X churns, that’s a big deal, but Client Y is less important than Client X.&lt;/p&gt;
&lt;p&gt;There are various ways to go about this. One possibility is to use the standard Growth Accounting count charts, but weigh each object by their “activity weight”. For example, in our case, we could weigh each client by the number of deliveries they gave us in the last 28 days. But then how do we weigh them once they churn and then become stale? In the end we did not go down this route.&lt;/p&gt;
&lt;p&gt;We chose a relatively simple approach, which works well for us:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;Picture #1: Show Standard Growth Accounting charts to show how clients behave, with each client being weighted equally&lt;/li&gt;
&lt;li&gt;Picture #2: Also show our historic B2C deliveries, with each delivery (from Client X) colored according to Client X’s current (today’s) Growth Accounting state, historically, all the way back. This “backtracing” is what gives the name of Backtraced Growth Accounting.&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;&lt;img src="/images/ga_picture1.png" alt="Standard Growth Accounting" style="width: 650px;"/&gt;
&lt;center&gt;Standard Growth Accounting for clients on synthetic data (Picture #1).&lt;/center&gt;&lt;/p&gt;
&lt;p&gt;&lt;img src="/images/ga_picture_2.png" alt="Backtraced Growth Accounting" style="width: 650px;"/&gt;
&lt;center&gt;The order-wise view on synthetic data (Picture #2).&lt;/center&gt;&lt;/p&gt;
&lt;p&gt;&lt;img src="/images/ga_picture_2_pct.png" alt="Backtraced Growth Accounting" style="width: 650px;"/&gt;
&lt;center&gt;The same view, percentage-wise split (Picture #2). This is what I call a scissor chart.&lt;/center&gt;&lt;/p&gt;
&lt;p&gt;&lt;img src="/images/ga_read_off.png" alt="Backtraced Growth Accounting" style="width: 650px;"/&gt;
&lt;center&gt;We can read off what our order-wise “retention” is from a year ago on this synthetic dataset.&lt;/center&gt;&lt;/p&gt;
&lt;p&gt;How to read the “scissor chart” above:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;Today, all deliveries are Retained (if we made a delivery in the last 28 days for a client, that client is Monthly Active, so they will be Retained or Resurrected)&lt;/li&gt;
&lt;li&gt;As we go back in time, more and more deliveries are “colored” Stale, because they are for clients that don't use us anymore (no delivery in the last 28 days)&lt;/li&gt;
&lt;li&gt;Going back in time it's easy to read off what % of deliveries were for companies that have already churned out&lt;/li&gt;
&lt;li&gt;Whales churning/retaining is easy to read off: going backward in time, at the point where a whale churned there will be a jump in Stale deliveries (when they stopped using us)&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;In terms of code, performing the backtrace (the scissor view) is trivial once we have Growth Accounting:&lt;/p&gt;
&lt;p&gt;&lt;img src="/images/backtrace_view.png" alt="Backtraced Growth Accounting" style="width: 650px;"/&gt;&lt;/p&gt;
&lt;p&gt;The big risk of this Picture #2 is that it hides onboarding problems: if a lot of clients come on board, try us out by giving us low volume and then leave, it will not dramatically impact such a weighted view. That's not a problem though; that's why we also look at Standard Growth Accounting (Picture #1).&lt;/p&gt;
&lt;p&gt;Backtraced Growth Accounting like this also makes sense for more common SaaS use-cases:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;Color (state ↔ color) historic pageviews, message sends, document creates, time spent by the current GA state of the user&lt;/li&gt;
&lt;li&gt;Color views of a document by the current GA state of the document (eg. a Prezi)&lt;/li&gt;
&lt;li&gt;Color posts in a group by the current GA state of the group (eg. a FB group)&lt;/li&gt;
&lt;li&gt;Color deliveries made by a driver by whether the driver is still active with the company (eg. at Fetchr or Uber)&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;The Backtraced Growth Accounting picture has become a standard part of our thinking when it comes to client retention, but as the examples above show it can also be useful in other product and business settings. Implementing a framework is simple and fun, it can be deployed repeatedly (we run it for both B2C and C2C use-cases), it’s a relatively easy way to get insight and have top-level impact.&lt;/p&gt;
&lt;p&gt;At Fetchr, implementing Growth Accounting and Backtraced Growth Accounting had a major impact on our B2C thinking. Before this, other teams ran one-off analyses with arbitrary cuts (if Client X gives us Y orders one month, but only 0.6 * Y the next, then...), which could be tuned to give any desired answer. Standardized Growth Accounting as described in this article gives a clear, parameter-free picture of both our client-wise and order-wise retention. For us, it showed the business is healthy and growing. The Growth Accounting charts today are on CxOs’ dashboards and are also presented to investors to explain our B2C business.&lt;/p&gt;
&lt;p&gt;This is a quick follow-up to my &lt;a href="http://bytepawn.com/fetchr-airflow.html#fetchr-airflow"&gt;previous post describing the Fetchr Data Science infra and philosophy&lt;/a&gt;. The platform has doubled in the last 6 months, and I'm currently approaching the end of my first year at Fetchr, so it's a good time to post an update.&lt;/p&gt;
&lt;p&gt;The basic principles behind our infrastructure have not changed, but we have scaled it out horizontally in key areas. We have also added a small ML prediction cluster, which is already in production and having a big impact on on-the-ground operations. As of today, the Data Science infra is about 20 nodes and looks like this:&lt;/p&gt;
&lt;p&gt;&lt;img src="/images/fetchr-data-science-infra-update.png" alt="Fetchr Data Science Infra" style="width: 800px;"/&gt;&lt;/p&gt;
&lt;p&gt;Architecture:&lt;/p&gt;
&lt;ol&gt;
&lt;li&gt;S3 buckets (all data lives here)&lt;/li&gt;
&lt;li&gt;two Presto clusters&lt;ul&gt;
&lt;li&gt;5 node Presto cluster for ETL and dashboards&lt;/li&gt;
&lt;li&gt;5 node Presto cluster for analytics queries&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;li&gt;Airflow&lt;ul&gt;
&lt;li&gt;1 node for scheduler + webserver&lt;/li&gt;
&lt;li&gt;1 node for workers&lt;/li&gt;
&lt;li&gt;1 node for staging&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;li&gt;Superset&lt;ul&gt;
&lt;li&gt;1 node for dashboarding&lt;/li&gt;
&lt;li&gt;1 node for analytics queries&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;li&gt;Jupyter host (Machine Learning notebooks)&lt;/li&gt;
&lt;li&gt;2 node ML prediction cluster (blue+green)&lt;/li&gt;
&lt;/ol&gt;
&lt;h2&gt;S3&lt;/h2&gt;
&lt;p&gt;As before, all data lives on S3, whether it's data imported from our production databases or data produced by the ETL. Data imported is stored in flat CSV files, whereas DWH tables produced by Airflow running Presto jobs are stored in ORC format (like at Facebook). EMR/EC2 nodes never store data.&lt;/p&gt;
&lt;p&gt;We continue to use the &lt;code&gt;ds&lt;/code&gt; partitioned DWH architecture. This means that every night the ETL imports a fresh copy of all production tables into a new ds partition, like &lt;code&gt;ds=2018-08-01&lt;/code&gt;, and all subsequent tables are also re-created in a new partition. Because all tables are backed on S3, this is also mirrored in our S3 path hierarchy. For example, the backing files for our main &lt;code&gt;company_metrics&lt;/code&gt; table are divided like this:&lt;/p&gt;
&lt;p&gt;&lt;img src="/images/company_metrics_s3.png" alt="A table on S3." style="width: 650px;"/&gt;&lt;/p&gt;
&lt;p&gt;When querying, we always specify the &lt;code&gt;ds&lt;/code&gt; like &lt;code&gt;WHERE ds='2018-08-13'&lt;/code&gt;, otherwise we're looking at multiple copies of the table. This is cumbersome, so for most of our tables, the Airflow jobs create a &lt;code&gt;xyz_latest&lt;/code&gt; view that just points to the latest partition, like &lt;code&gt;CREATE OR REPLACE VIEW xyz_latest AS SELECT * FROM xyz WHERE ds = '{{ ds }}'&lt;/code&gt;. This way analysts can usually just query the &lt;code&gt;_latest&lt;/code&gt; and not think about it.&lt;/p&gt;
&lt;p&gt;There are many upsides to this scheme: (i) if today's ETL fails, people can continue working with yesterday's data and (ii) since partitions are write-once, data never changes, so it's always easy to reproduce a number sent out 6 months ago (just run the query on that &lt;code&gt;ds&lt;/code&gt; partition, like &lt;code&gt;WHERE ds='2018-02-08'&lt;/code&gt;). The downside is that it's a lot of duplicate data, but with S3 being super-cheap this is a non-issue.&lt;/p&gt;
&lt;p&gt;Okay, but still, isn’t this wasteful and slow in terms of ETL time? After all, we just imported all these tables last night; do we need to re-import the whole dataset again? As the company and our data volume grew, the nightly import actually started taking too long, so we were forced to optimize this: for our big tables, we now import the historic tail once a week on weekends, and on a daily basis we only import data on orders that we received in the last ~3 months. This ensures our ETL finishes on time every night.&lt;/p&gt;
&lt;h2&gt;Presto&lt;/h2&gt;
&lt;p&gt;We currently run two EMR+Presto clusters, each 5 nodes. As before we don't run any ETL or queries on Hive/MapReduce, we exclusively use Presto for compute, since our queries never touch more than 10-100M rows.&lt;/p&gt;
&lt;p&gt;We introduced a secondary cluster for ad-hoc analytics queries because, in cases when our ETL is slow and running during the day, or our regular hourly ETLs are running during the day, it kept blocking us from getting our work done.&lt;/p&gt;
&lt;p&gt;Since all our data lives in S3, having two clusters see (read and write) the same data is not very hard. All we had to do was make sure our schemas are in sync on the two clusters. Since 99% of our schema operations are managed through Airflow jobs (&lt;code&gt;CREATE TABLE IF NOT EXISTS ...&lt;/code&gt;), we just had to modify our Airflow framework to also execute these on the secondary cluster. Additionally, when we manually make changes to an existing table (&lt;code&gt;ALTER TABLE ...&lt;/code&gt;), we have to execute this on both clusters, which is an inconvenience, but a minor one, and quite manageable at this scale.&lt;/p&gt;
&lt;h2&gt;Airflow&lt;/h2&gt;
&lt;p&gt;We continue to use Airflow as the backbone of our data pipelines, with great success. We have two nodes for production: one running the scheduler and webserver, and one running the worker processes. Since these nodes don't do any compute themselves (they just launch Presto &lt;code&gt;INSERT INTO ... SELECT&lt;/code&gt; jobs), we have not needed to scale out here so far, nor do we expect to in the next year.&lt;/p&gt;
&lt;p&gt;On the other hand, we have deepened our investment into Airflow as our standard ETL system on the code side. We have identified the 4-5 common Airflow use-cases we have (import from Postgres to S3, run an ETL job on Presto, export data to the BI team's Redshift DWH, create dashboard screenshots and send them in email, run Growth Accounting) and we have created helper functions to encapsulate them. As a result, the vast majority of our Airflow DAGs don't create Airflow operators directly; instead they call these library functions, which construct and return the DAGs. Our Airflow jobs therefore look very clean, with all of the messy complexity hidden away:&lt;/p&gt;
&lt;p&gt;&lt;img src="/images/airflow_code_example.png" alt="A table on S3." style="width: 650px;"/&gt;&lt;/p&gt;
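&lt;p&gt;A minimal sketch of the pattern, with the Airflow specifics replaced by plain dicts (all names here are hypothetical, not our actual library):&lt;/p&gt;

```python
# Sketch of the library-function pattern: DAG files call a helper that
# constructs the whole pipeline, instead of wiring operators by hand.
# Airflow specifics are replaced by plain dicts; names are hypothetical.

def postgres_to_s3_import(table, schedule="@daily"):
    """Build a standard Postgres -> S3 -> Presto import pipeline."""
    return {
        "dag_id": f"import_{table}",
        "schedule": schedule,
        "tasks": [
            f"dump_{table}_from_postgres",
            f"upload_{table}_to_s3",
            f"create_{table}_partition_on_presto",
        ],
    }

dag = postgres_to_s3_import("orders")
```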
&lt;p&gt;This had a big pay-off when we introduced our secondary Presto cluster a few months ago and needed to automatically create all our schemas there. We just added extra operators to our library functions to run schema operations on the secondary, and the next night when the ETL ran, all our table schemas were created on the secondary Presto cluster, pointing to the backing S3 files, ready to go. We were running analytics queries on the secondary cluster the day after we spun it up!&lt;/p&gt;
&lt;p&gt;Currently we have 76 DAGs in Airflow, importing and exporting from 5-10 data sources (3 production databases, S3, 2 Presto clusters, Redshift, DynamoDB, various custom extracts sent to clients and the ML cluster).&lt;/p&gt;
&lt;h2&gt;Superset&lt;/h2&gt;
&lt;p&gt;Superset is both a dashboarding system and an SQL IDE (called SQL Lab), which our Data Scientists use as their primary tool for accessing data. We continue to use Superset for dashboarding and have spun up a secondary Superset instance just for queries.&lt;/p&gt;
&lt;p&gt;&lt;img src="/images/sql-lab.png" alt="Fetchr Data Science Infra" style="width: 800px;"/&gt;&lt;/p&gt;
&lt;p&gt;We have hit a limitation with Superset: it ties up a gunicorn worker process for each page request, and if the page happens to be a dashboard, one for each chart on the dashboard. Each chart's worker launches a Presto query, which can take 10-60 seconds to return. Some of our dashboards have a lot of charts on them (20+), plus we have many concurrent users accessing dashboards (or running queries). In cases like this, Superset runs out of worker processes and becomes totally unresponsive. Each worker process eats up 1-2G of RAM, so the number of processes it can run is limited.&lt;/p&gt;
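&lt;p&gt;This makes worker-count sizing a memory question rather than a CPU question; a hypothetical back-of-the-envelope helper:&lt;/p&gt;

```python
# Hypothetical sizing helper: with each gunicorn worker eating 1-2 GB of
# RAM and potentially blocked on a 10-60s Presto query, the worker count
# is bounded by memory, not CPU.

def max_workers(ram_gb, per_worker_gb=2.0, reserve_gb=4.0):
    """Worker processes that fit in RAM, leaving headroom for the OS."""
    return max(1, int((ram_gb - reserve_gb) // per_worker_gb))
```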
&lt;p&gt;As an initial workaround, we split the dashboarding and querying use-cases into two Superset instances, so analysts are not blocked by the dashboards. Additionally, we broke large dashboards into smaller pieces (which is unfortunate). This way, running 32 worker processes each, both instances are good for every-day work at current loads.&lt;/p&gt;
&lt;p&gt;Both Airflow and Superset are still rough around the edges, but since Superset is user-facing, it can create more problems. It still happens that we want to look at a dashboard but the webserver times out because Superset ran out of worker processes (maybe because the Presto queries are slow, because they're running on the same cluster as the ETL, and the ETL is slow, because something changed in production). Right now we get by with work-arounds (for example, in the previous case, we temporarily re-direct the Superset dashboarding traffic to the analytics cluster by changing a connection string, until the ETL finishes). So far Superset is good enough for internal Data Science / Understand dashboards, and we do have a fair number of colleagues using it on a daily basis for basic reporting. But admittedly we will need to invest more time into understanding how to tune it if we want to deploy it to a company-wide 1000+ person audience and feel good about it.&lt;/p&gt;
&lt;p&gt;Currently we have 26 dashboards in Superset, many of them viewed by CxOs and country General Managers every day.&lt;/p&gt;
&lt;h2&gt;Machine Learning&lt;/h2&gt;
&lt;p&gt;In the last 3 months we have rolled out a prediction model in production. We perform the Data Science work to arrive at the models on our laptops and/or on a dedicated Jupyter host, in IPython notebooks. Once we're happy with the result, we deploy the model to our ML prediction cluster. We don't yet have CI/CD set up for it; deployment is manual and requires domain knowledge.&lt;/p&gt;
&lt;p&gt;The model is already running in production and is being used for on-the-ground delivery operations, so downtime is not acceptable. We quickly arrived at a 2-node blue/green model: both nodes run identical code/data; we deploy to green first, and if that goes well, to blue. Both are behind an Elastic Load Balancer (&lt;code&gt;predict&lt;/code&gt; happens over an HTTP API call), so things keep going if one of them is down, even for a long time.&lt;/p&gt;
&lt;p&gt;Fresh data is loaded every night by an Airflow process: first it creates a daily dump for the ML models, uploads it to S3, and then triggers a special &lt;code&gt;load&lt;/code&gt; API call, first on the green, then the blue host. The Airflow job for blue depends on green, so if green fails, it won't touch blue, so production will not be impacted.&lt;/p&gt;
&lt;p&gt;&lt;img src="/images/model_dag.png" alt="A table on S3." style="width: 650px;"/&gt;&lt;/p&gt;
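&lt;p&gt;The green-before-blue dependency can be sketched in plain Python (&lt;code&gt;load_model&lt;/code&gt; is a hypothetical stand-in for the HTTP &lt;code&gt;load&lt;/code&gt; API call):&lt;/p&gt;

```python
# Sketch of the green-before-blue ordering: blue is only refreshed if
# the load on green succeeded, so a bad dump never takes down both
# nodes. load_model and the host names are hypothetical stand-ins.

def load_model(host, dump_path, failing=()):
    if host in failing:
        raise RuntimeError(f"load failed on {host}")
    return f"{host}:{dump_path}"

def nightly_refresh(dump_path, failing=()):
    loaded = [load_model("green", dump_path, failing)]     # green first
    loaded.append(load_model("blue", dump_path, failing))  # blue depends on green
    return loaded
```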
&lt;p&gt;At this scale/complexity, this simple model works quite well and we have excellent uptime; we have more problems from software bugs than from availability.&lt;/p&gt;
&lt;h2&gt;Impact&lt;/h2&gt;
&lt;p&gt;This post focused on what is usually called Data Infrastructure and Data Engineering, but building and running this platform actually only took about 0.5-1 FTE of effort on average over time (though very senior FTEs). The rest of our time was directly focused on operational and business impact: building metrics and dashboards, running ad-hoc analytics, building operational models for forecasting and sizing, building ML models, and, most importantly, explaining it all to our colleagues so that our work is adopted and has impact on the ground. It's interesting to note that just 5 years ago, in a similar scenario, good-enough open source tools like Airflow and Superset were not available, so we had to roll our own and ended up spending an order of magnitude more time on DI/DE work.&lt;/p&gt;
&lt;p&gt;Overall, in the last year, our small Data Science team was able to have dollar-measurable outsized impact on Fetchr by using data to understand and optimize operations and business processes. Today, many important operational decisions are based on data and driven by Data Scientists, including sizing our fleets and warehouses, understanding their performance, and ML models optimizing and automating human labor.&lt;/p&gt;</content><category term="data"></category><category term="etl"></category><category term="workflow"></category><category term="airflow"></category><category term="fetchr"></category><category term="model"></category><category term="ml"></category></entry><entry><title>What not to spend time on</title><link href="https://bytepawn.com/what-not-to-spend-time-on.html" rel="alternate"></link><published>2018-07-23T00:00:00+02:00</published><updated>2018-07-23T00:00:00+02:00</updated><author><name>Marton Trencseni</name></author><id>tag:bytepawn.com,2018-07-23:/what-not-to-spend-time-on.html</id><summary type="html">&lt;p&gt;Warren Buffett says deciding what &lt;strong&gt;not&lt;/strong&gt; to spend time on is just as important as deciding what to spend time on.&lt;br/&gt;&lt;br/&gt;&lt;img src="/images/warren-buffett.jpg" alt="Warren Buffett" style="width: 400px;"/&gt;&lt;/p&gt;</summary><content type="html">&lt;h2&gt;Introduction&lt;/h2&gt;
&lt;p&gt;A couple of years ago I read &lt;a href="http://www.euclidean.com/the-essays-of-warren-buffett-review-lessons-quotes/"&gt;Warren Buffett's books&lt;/a&gt; and some stories about him. One of the lessons stuck with me; it's something I think about regularly when deciding what to spend time on. Below is an excerpt from &lt;a href="https://jamesclear.com/buffett-focus"&gt;James Clear's website&lt;/a&gt;.&lt;/p&gt;
&lt;h2&gt;The story of Mike Flint&lt;/h2&gt;
&lt;p&gt;Mike Flint was Buffett's personal airplane pilot for 10 years. (Flint has also flown four US Presidents, so I think we can safely say he is good at his job.) According to Flint, he was talking about his career priorities with Buffett when his boss asked the pilot to go through a 3-step exercise. Here's how it works…&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;STEP 1:&lt;/strong&gt; Buffett started by asking Flint to write down his top 25 career goals. So, Flint took some time and wrote them down. (Note: you could also complete this exercise with goals for a shorter timeline. For example, write down the top 25 things you want to accomplish this week.)&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;STEP 2:&lt;/strong&gt; Then, Buffett asked Flint to review his list and circle his top 5 goals. Again, Flint took some time, made his way through the list, and eventually decided on his 5 most important goals.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;STEP 3:&lt;/strong&gt; At this point, Flint had two lists. The 5 items he had circled were &lt;strong&gt;List A&lt;/strong&gt; and the 20 items he had not circled were &lt;strong&gt;List B&lt;/strong&gt;.&lt;/p&gt;
&lt;p&gt;Flint confirmed that he would start working on his top 5 goals right away. And that's when Buffett asked him about the second list, &lt;em&gt;“And what about the ones you didn't circle?”&lt;/em&gt;&lt;/p&gt;
&lt;p&gt;&lt;img src="/images/warren-buffett.jpg" alt="Warren Buffett" style="width: 600px;"/&gt;&lt;/p&gt;
&lt;p&gt;Flint replied, &lt;em&gt;“Well, the top 5 are my primary focus, but the other 20 come in a close second. They are still important so I’ll work on those intermittently as I see fit. They are not as urgent, but I still plan to give them a dedicated effort.”&lt;/em&gt;&lt;/p&gt;
&lt;p&gt;To which Buffett replied, &lt;em&gt;“No. You’ve got it wrong, Mike. Everything you didn’t circle just became your Avoid-At-All-Cost list. No matter what, these things get no attention from you until you’ve succeeded with your top 5.”&lt;/em&gt;&lt;/p&gt;
&lt;h2&gt;What I don't spend time on&lt;/h2&gt;
&lt;p&gt;I think this is great advice, and it's something I think about regularly. I will list 2 things here that I decided not to spend time on after reading this Warren Buffett bit a few years ago.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Quantum Field Theory&lt;/strong&gt; (and research-level physics in general). I have a degree in Physics; it's a second degree I got mostly for fun and curiosity. Getting it has been one of the greatest decisions of my life: learning and doing Physics has been paying off continually ever since then, both personally and for work (Computer Science + Physics = Data Science). As a physicist you learn the deep insights of your predecessors, but also that, with certainty, they are wrong in some sense, and over time their achievements will turn out to be "just" an approximation or a special case. So physicists are always learning and adjusting their perspective. Physicists also have a keen sense for measurements, statistics and errors, which is very useful when dealing with numbers in the real world.&lt;/p&gt;
&lt;p&gt;So after I got my MSc in Physics back in 2008, I started a PhD that I never finished, because I went off to do a startup. Maybe it's because I never finished it, but for many years afterwards I had this romantic notion and kept going back to Physics in my free time, reading papers and books. I once even had a yearly goal of writing a paper and submitting it to a journal. I actually did it: I wrote a short paper titled &lt;a href="https://arxiv.org/abs/1208.4990"&gt;Pure Lattice Gauge Theory in the Expanding Universe&lt;/a&gt; and submitted it to Physical Review Letters (PRL), but it got rejected (most papers get rejected at first; &lt;em&gt;real&lt;/em&gt; scientists keep improving and resubmitting). Since I don't &lt;em&gt;really&lt;/em&gt; care about getting into a journal, I didn't spend any more time on it; now it just sits on Arxiv [1].&lt;/p&gt;
&lt;p&gt;So deciding not to spend time on Physics anymore was not a trivial thing to do. I really like doing Physics, even in my spare time. But at the end of the day, I'm not in academia, so I don't actually have time to follow topics of research like real scientists do and figure out how to contribute. Also, academia is pretty crowded, and research topics have become very specialized. It's not really a good investment of time for me; investing my time into things related to software/startups has much higher potential impact and payoff. I also stopped reading Physics blogs; all the drama around string theory is fun, but ultimately just a distraction.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Haskell.&lt;/strong&gt; I first came into contact with Haskell in 2013, when working at Prezi. There are a lot of things that are interesting about Haskell: it's a strongly typed, purely functional language with type classes, type inference, etc. You can do really cool things with Haskell, or so it seems at first. In 2013, before systems like &lt;a href="http://bytepawn.com/fetchr-airflow.html#fetchr-airflow"&gt;Airflow&lt;/a&gt; were open-sourced by Airbnb, there was no good standard open source ETL system; but we needed one at Prezi, to replace the bash hairball we had. We used this project as a testing ground for Haskell and wrote an ETL framework in it (I think we called it Datapipe). We spent about 3-6 months on it, and it was a big disappointment. Although we put it into production at one point, it was quickly replaced by a re-write (this time in Go, which I believe is still in production). At a later point I tried using Haskell for a personal project for representing Physics equations and quantities, and also ran into major/deep problems (oddly, I found it easier to model what I wanted with C++ templates, of all things).&lt;/p&gt;
&lt;p&gt;Despite all these failures, I continued to be interested in Haskell. Although I adopted a critical stance and usually argued against it in conversations, I assumed &lt;em&gt;"it's just me, I don't get it"&lt;/em&gt;. But I never found or really saw good reasons and generalizable examples where using Haskell in production really made sense. So when I read the Warren Buffett bit, I knew that Haskell is one of those things I had to stop spending time on; it's just not a practical thing for me to use. I share &lt;a href="http://functionaltalks.org/2013/08/26/john-carmack-thoughts-on-haskell/"&gt;John Carmack's stance&lt;/a&gt;: he views Haskell as a good source of ideas to use in imperative languages like C++.&lt;/p&gt;
&lt;h2&gt;Conclusion&lt;/h2&gt;
&lt;p&gt;I don't think spending time reading/writing Physics papers or Haskell has been a waste for me. Far from it. I learned a lot doing these things. But I've also determined that investing more time into them is not worth it right now. Remembering Warren Buffett and his pilot is a great way to remind myself to stick to it, and spend time on other, potentially more impactful things [2].&lt;/p&gt;
&lt;p&gt;-&lt;/p&gt;
&lt;p&gt;[1] A &lt;a href="https://arxiv.org/abs/1405.6665"&gt;PRD paper&lt;/a&gt; later referenced it and called it "pioneering work", but I'm pretty sure it's not.&lt;/p&gt;
&lt;p&gt;[2] Having said that, if I were to go on a sabbatical, doing Physics would be fair game. The point of a sabbatical is to take time off from the normal pursuit of things and spend time on riskier/more fun projects, in a timeboxed way.&lt;/p&gt;
&lt;p&gt;&lt;em&gt;The numbers in this article are made up, but the lessons come from real life.&lt;/em&gt;&lt;/p&gt;
&lt;p&gt;When doing Data Science, we almost always report averages. This is natural, because it stands for a simple model that everybody can understand. For example, in the delivery business, a topline metric is Deliveries per Driver (DPD), the average deliveries made per driver per day. This is a simple model the CEO can also remember: if our fleet is performing at DPD = 40, and we have 1,000 drivers, we make 40,000 deliveries per day. Being able to multiply two topline numbers and get a third one is a good thing.&lt;/p&gt;
&lt;p&gt;When working with averages, we have to be careful though: there are pitfalls lurking to pollute our statistics and results reported. It is important to note that &lt;strong&gt;there is nothing wrong with averages themselves, we just have to be careful with them&lt;/strong&gt;. I don’t believe that for most reporting purposes averages should or can be replaced (eg. by reporting the median), it is simply the job of the Data Science team to make sure the metrics make sense.&lt;/p&gt;
&lt;h2&gt;Outliers&lt;/h2&gt;
&lt;p&gt;When we say that our DPD is 40 and we have 1,000 drivers, the natural inclination (even for data people) is to &lt;em&gt;imagine&lt;/em&gt; 1,000 equivalent drivers, each performing exactly 40 deliveries every day. But we know that the world isn’t this simple. Things like driver performance tend to follow some more interesting distribution. The simplest thing we can imagine is that it follows a normal distribution. The plot below shows a normal distribution, the mean (green) and median (red) coincide. Gauss is happy.&lt;/p&gt;
&lt;p&gt;&lt;img src="/images/averages-1.png" alt="Probability distribution" style="width: 650px;"/&gt;&lt;/p&gt;
&lt;p&gt;But almost always &lt;strong&gt;there are outliers&lt;/strong&gt;. In the case of drivers, there are various special circumstances which can cause a driver to have very low or very high DPD. For example, maybe the driver got sick, interrupted his route and went home early. Below is the same distribution as above, with some stragglers introduced. We can see that this shifts the mean (green) down.&lt;/p&gt;
&lt;p&gt;&lt;img src="/images/averages-2.png" alt="Probability distribution" style="width: 650px;"/&gt;&lt;/p&gt;
&lt;p&gt;The shift in the mean is important, because it signals that something is going on: a bunch of our drivers got sick and went home early. Maybe tomorrow they are not coming to work. So monitoring both the average and median is important to detect and understand deviations.&lt;/p&gt;
&lt;p&gt;Apart from the median, which is also called the 50th &lt;a href="https://en.wikipedia.org/wiki/Percentile"&gt;percentile&lt;/a&gt;, &lt;strong&gt;checking the bottom and top percentiles&lt;/strong&gt; is also very helpful. Below are the same two plots, with p10 and p90 also shown in red.&lt;/p&gt;
&lt;div&gt;
&lt;img src="/images/averages-3.png" alt="Probability distribution" style="width: 325px;"/&gt;

&lt;img src="/images/averages-4.png" alt="Probability distribution" style="width: 325px;"/&gt;
&lt;/div&gt;

&lt;p&gt;Something really useful happened! After we introduced the stragglers, the p10 dropped from about 27 to about 8!&lt;/p&gt;
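&lt;p&gt;This check is easy to automate; a minimal stdlib-only sketch with made-up DPD numbers:&lt;/p&gt;

```python
import statistics

def summarize(values):
    """Mean, median, p10 and p90 for a metric like DPD (stdlib only)."""
    deciles = statistics.quantiles(values, n=10)  # 9 cut points: p10..p90
    return {
        "mean": statistics.mean(values),
        "median": statistics.median(values),
        "p10": deciles[0],
        "p90": deciles[-1],
    }

fleet = [40] * 90 + [5] * 10  # 10% stragglers who went home early
stats = summarize(fleet)
```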
&lt;p&gt;In general, &lt;strong&gt;showing percentiles is a useful technique because, as the example above shows, they can dramatically speed up detection of anomalies&lt;/strong&gt;. In real-life work, looking at the distribution doesn’t happen on a daily basis, but a timeseries of the historic DPD can also show p10, median and p90, and thus reveal such anomalies. The chart below shows a made-up example, with p10, p50 and p90 in red and the average in green, for the last 30 days for the fleet. On the 25th day a flu started spreading among our drivers, introducing the stragglers shown in the distribution above. The mean and the median separate somewhat, but &lt;strong&gt;the p10 gives it away&lt;/strong&gt;. It’s worth showing all four lines, at least on internal, debug dashboards.&lt;/p&gt;
&lt;p&gt;&lt;img src="/images/averages-5.png" alt="Probability distribution" style="width: 650px;"/&gt;&lt;/p&gt;
&lt;p&gt;It’s worth noting that outliers can come from another source too: data bugs. Another good trick is to periodically examine low and high performers in a table attached to the bottom of the internal report/dashboard.&lt;/p&gt;
&lt;p&gt;Finally, outlier/anomaly detection can also be automated; for example, Facebook does this internally for various metrics. It’s important to automate at least the visualization of anomalies/distributions/stragglers in a debug dashboard, because in the long run Data Scientists will forget to check manually (exporting and plotting in IPython takes time).&lt;/p&gt;
&lt;h2&gt;Several populations&lt;/h2&gt;
&lt;p&gt;Another reason averages can be polluted is because of multiple populations (outliers can also be thought of as a population). In the delivery business, it is not uncommon to have many separate fleets of drivers, for different purposes. For example, we may have a B2C and a C2C fleet. Another distinction is cars vs bikes. Uber could have a fleet for passengers and a totally separate fleet for UberEats. Below is a (made-up) distribution that’s actually two fleets, a C2C fleet performing at DPD=20 and a B2C fleet performing at DPD=40.&lt;/p&gt;
&lt;p&gt;&lt;img src="/images/averages-6.png" alt="Probability distribution" style="width: 650px;"/&gt;&lt;/p&gt;
&lt;p&gt;In cases like this, &lt;strong&gt;reporting on the blended mean may be misleading&lt;/strong&gt;. For example, if country X has a B2C and a C2C fleet, while country Y only has a B2C fleet, then reporting just on country-wise DPD will be misleading. For country X the C2C fleet will pull the DPD down, but this doesn’t mean that the Ops team in country X is performing worse, in fact it’s possible their B2C fleet is outperforming country Y’s. Report the per-fleet mean instead.&lt;/p&gt;
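&lt;p&gt;With made-up numbers, the difference between the blended and the per-fleet means looks like this:&lt;/p&gt;

```python
import statistics

# Made-up fleets matching the distribution above: a B2C fleet at DPD~40
# and a smaller C2C fleet at DPD~20.
b2c = [40] * 50
c2c = [20] * 30

blended = statistics.mean(b2c + c2c)  # pulled down by the C2C fleet
per_fleet = {"B2C": statistics.mean(b2c), "C2C": statistics.mean(c2c)}
```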
&lt;h2&gt;Skewed distributions&lt;/h2&gt;
&lt;p&gt;Sometimes distributions are not symmetric; they can be lopsided. In this case the median, mode (the most frequent outcome, the maximum of the distribution) and mean can be at different locations, which is often unintuitive for people. This isn’t a problem with the mean per se, but it’s good to know. The &lt;a href="https://en.wikipedia.org/wiki/Log-normal_distribution"&gt;Log-normal distribution&lt;/a&gt; is one such example:&lt;/p&gt;
&lt;p&gt;&lt;img src="/images/averages-7.png" alt="Probability distribution" style="width: 650px;"/&gt;&lt;/p&gt;
&lt;h2&gt;Percentage instead of mean&lt;/h2&gt;
&lt;p&gt;Sometimes, when building a metric, the mean is not a good choice. Let’s take pageload times as an example. Suppose we measure the average pageload time in milliseconds, and we see that it is 4,200ms; too high. After rolling out changes, it goes down to 3,700ms; but 3,700 is still too high. Does that mean the rollout wasn't successful?&lt;/p&gt;
&lt;p&gt;In situations like this, it makes sense to &lt;strong&gt;bake the goal into the metric&lt;/strong&gt;. Suppose our goal is 2,000ms, which we deem pleasant from a UX perspective. Then a better way to define the metric is "% of pageloads that are within 2,000ms". If it was 57% before, and 62% after the rollout, it’s &lt;strong&gt;more natural to understand what happened&lt;/strong&gt;: an additional 5% of people now have a pleasant experience when loading our page. If there are 1,000,000 users per month, we impacted 50,000 users per month with the rollout. Not bad! A metric like this is also &lt;strong&gt;more motivating for product teams to work on&lt;/strong&gt;.&lt;/p&gt;
&lt;p&gt;Another big advantage of using percentages is &lt;strong&gt;increased resiliency to outliers&lt;/strong&gt;. While the mean could be polluted by outliers (users on slow connections, bots, data bugs), in the % it will be “just” a constant additive factor.&lt;/p&gt;
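&lt;p&gt;A minimal sketch of such a metric (the helper name is made up); the second check below shows the resiliency to outliers:&lt;/p&gt;

```python
def pct_within(load_times_ms, goal_ms=2000):
    """% of pageloads within the goal: the goal is baked into the metric."""
    within = sum(1 for t in load_times_ms if t <= goal_ms)
    return 100.0 * within / len(load_times_ms)

# A huge outlier (a bot, a user on a terrible connection) barely moves
# the %, while it would wreck the mean.
```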
&lt;h2&gt;Ratios instead of means&lt;/h2&gt;
&lt;p&gt;Our delivery business also has C2C, ie. people can call a driver and send a package to another person, on demand. For example, if my partner is at the airport but she forgot her passport at home, I can use the C2C app to fetch a car and send her the passport in an envelope. As such, the C2C app has standard metrics such as Daily Active Users (DAU) and Monthly Active Users (MAU). These are topline metrics, but we also need a metric which expresses how often people use the product. One way to do it using means would be to count, for each user, how many days they were a DAU out of the last 28 days. Suppose we call this Average DAU, and it’s 5.2. This is not that hard to understand, but it could still be confusing. For example, people always forget the definition of a metric; in this case they would forget whether the metric is 28 or 30 or 7 day based. Also, increments like this don’t feel natural: a +1% increment corresponds to +0.28 active days, or 6.72 hours.&lt;/p&gt;
&lt;p&gt;A better metric is simply DAU divided by MAU. This is a common metric also used inside Facebook. It feels more natural: if we are honest with ourselves, a user is essentially a MAU, because somebody who hasn’t used the product for 28 days is probably not coming back (for products with more sporadic usage, the base could be 3×28 days). Thinking like this, DAU/MAU is a very natural metric: it is the % of "users" who use the product daily.&lt;/p&gt;
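&lt;p&gt;A sketch of computing DAU/MAU from per-user active-day counts (made-up numbers; a real version would aggregate event logs):&lt;/p&gt;

```python
# DAU/MAU from per-user counts of active days in a 28-day window.
# Anyone active at least once in the window counts as a MAU.

def dau_mau(active_days_per_user, window=28):
    mau = len(active_days_per_user)               # active at least once
    avg_dau = sum(active_days_per_user) / window  # average daily actives
    return avg_dau / mau

ratio = dau_mau([28, 14, 7, 7])  # 4 MAU, 2 average DAU
```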
&lt;h2&gt;Daily variations&lt;/h2&gt;
&lt;p&gt;Suppose our fleet’s average DPD is 40. Looking at driver X, his DPD yesterday was 29. Is he a low performer? Our first intuition might be to ask what the standard deviation of the fleet is (suppose it is 10), and then argue that this value is not “significantly” off. But from a business perspective, variance is irrelevant: if the COO wants to improve DPD and is looking for low performing drivers to cut, "cutting" at mean minus one sigma is a valid approach.&lt;/p&gt;
&lt;p&gt;However, &lt;strong&gt;it’s possible that our drivers have significant daily variation in their performance&lt;/strong&gt;. It’s possible that this driver had a DPD of 29 yesterday, but the previous day it was 47, and their historic average is actually 42. Always &lt;strong&gt;compare averages to averages&lt;/strong&gt;. In this case, compare the fleet’s average DPD over a long enough timeframe (probably at least 28 days) to the driver’s average DPD in the same 28 days. That is a more fair comparison to make, because it smooths daily variation. Of course, remember what was said here, for example don’t count days when the driver was sick, and compare him to his own fleet. &lt;/p&gt;
&lt;h2&gt;In summary&lt;/h2&gt;
&lt;ol&gt;
&lt;li&gt;Using averages is okay most of the time.&lt;ul&gt;
&lt;li&gt;Reporting on medians is probably not feasible in a business/product setting.&lt;/li&gt;
&lt;li&gt;Instead, make sure the average is meaningful.&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;li&gt;Watch out for outliers.&lt;ul&gt;
&lt;li&gt;Check median/p10/p90 and distributions regularly.&lt;/li&gt;
&lt;li&gt;Prune/separate outliers.&lt;/li&gt;
&lt;li&gt;Split up populations (B2C/C2C, car/bike), etc. to make sure the reported average (or median) is a meaningful number.&lt;/li&gt;
&lt;li&gt;Outliers can be real outliers, or issues in the data (eg. Self Pickup as a driver)&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;li&gt;Sometimes the population is homogeneous, but the distribution is skewed to one side or bimodal, in this case the average may be intuitively misleading.&lt;/li&gt;
&lt;li&gt;Sometimes, using %s instead of averages makes a better metric (Pageloads within 2000ms vs Average Pageload time).&lt;/li&gt;
&lt;li&gt;Sometimes, using a ratio instead of averages makes a better metric (example: DAU/MAU vs average number of DAUs in the last 28 days).&lt;/li&gt;
&lt;li&gt;Be careful when comparing daily snapshots and averages, there may be significant daily variation in performance.&lt;/li&gt;
&lt;/ol&gt;</content><category term="statistics"></category><category term="data"></category></entry><entry><title>Building the Fetchr Data Science Infra on AWS with Presto and Airflow</title><link href="https://bytepawn.com/fetchr-airflow.html" rel="alternate"></link><published>2018-03-14T00:00:00+01:00</published><updated>2018-03-14T00:00:00+01:00</updated><author><name>Marton Trencseni</name></author><id>tag:bytepawn.com,2018-03-14:/fetchr-airflow.html</id><summary type="html">&lt;p&gt;We used Hive/Presto on AWS together with Airflow to rapidly build out the Data Science Infrastructure at Fetchr in less than 6 months.&lt;br/&gt;&lt;br/&gt;&lt;img src="/images/warehouse-dag.png" alt="Warehouse DAG" style="width: 400px;"/&gt;&lt;/p&gt;</summary><content type="html">&lt;h2&gt;Introduction&lt;/h2&gt;
&lt;p&gt;Our goal at &lt;a href="https://fetchr.us"&gt;Fetchr&lt;/a&gt; is to build a world-class Data Science team. To do this, we need a world-class Data Science platform. I was fortunate enough to work at Facebook previously, which over the years arrived at a very efficient way of doing Data Science. So, when it came to building the platform I decided to follow the basic design patterns that I saw at Facebook.&lt;/p&gt;
&lt;p&gt;Based on the last 6 months, building a platform (including computation jobs and dashboarding) that is simple but allows us to move fast is feasible in just a 3-6 month period. So what does our platform look like? Like most things at Fetchr, we run on AWS. Our infra consists of 5-10 nodes right now (5 EMR, 2 Airflow, a few more for Superset and other services).&lt;/p&gt;
&lt;p&gt;&lt;img src="/images/fetchr-ds-arch.png" alt="Fetchr Data Science Infra" style="width: 650px;"/&gt;&lt;/p&gt;
&lt;p&gt;We use &lt;a href="https://aws.amazon.com/emr/"&gt;EMR&lt;/a&gt; to get a Hadoop instance, with S3 as the backing storage. We actually don’t use the Hive query engine or MapReduce. We just use &lt;a href="http://hadoop.apache.org/"&gt;Hadoop&lt;/a&gt; as a metadata store (table definitions) for &lt;a href="https://prestodb.io/"&gt;Presto&lt;/a&gt;. Each EMR node also runs a Presto worker. Right now we use 1+4 nodes, with plans to scale it out to ~10.&lt;/p&gt;
&lt;p&gt;The data warehouse (DWH) philosophy is again based on the Facebook design pattern. We use flat tables, no fact/dimension tables; usually you can look at a table and see a complete picture. This makes the tables very usable and allows us to move fast, for example writing quick queries against tables is easy because it doesn’t require a lot of JOINs to get readable strings.&lt;/p&gt;
&lt;p&gt;&lt;img src="/images/flat-table.png" alt="Flat DWH table" style="width: 650px;"/&gt;&lt;/p&gt;
&lt;p&gt;The other major design pattern from Facebook is the idea of daily partitioned tables. This is a feature available on Hive, and not really practical on eg. &lt;a href="https://aws.amazon.com/redshift/"&gt;Redshift&lt;/a&gt;. Essentially we store (complete) daily, write-once slices of each table, which are generated by daily jobs. The partitions are called &lt;code&gt;ds&lt;/code&gt; at Facebook and logically show up as a column of the table, and you’ll find plenty of references to it if you read the &lt;a href="https://cwiki.apache.org/confluence/display/Hive/LanguageManual"&gt;Hive docs&lt;/a&gt; (because Hive was written at Facebook). Physically, these are essentially directories, each one holding the data files for that day’s data. We use S3, so in our case it looks something like &lt;code&gt;s3://dwh-bucket/&amp;lt;table&amp;gt;/&amp;lt;ds&amp;gt;/&amp;lt;data_files&amp;gt;&lt;/code&gt;. For example, &lt;code&gt;s3://dwh-bucket/company_metrics/2018-03-01/datafile&lt;/code&gt;. For technical reasons, when importing data from our production (Postgresql) database, we use .csv, for later computed warehouse tables we use &lt;a href="https://orc.apache.org/"&gt;ORC&lt;/a&gt;.&lt;/p&gt;
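&lt;p&gt;The layout can be sketched as a simple path-building helper (bucket and file names as in the example above):&lt;/p&gt;

```python
# Sketch of the partition layout: each ds partition is a directory on S3
# holding that day's data files, s3://dwh-bucket/<table>/<ds>/<data_files>.

def partition_path(table, ds, data_file="datafile"):
    return f"s3://dwh-bucket/{table}/{ds}/{data_file}"
```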
&lt;p&gt;The advantage of this is that we have a complete history of the data warehouse going back as far as we’d like (old partitions can be deleted from a script after the desired retention period expires). There are two ways to use &lt;code&gt;ds&lt;/code&gt; partitions, cumulative and events: each partition can store a complete copy of its data up to that day (cumulative), or each partition stores just that day’s worth of (event) data. For aggregate tables it’s usually the first, for raw event tables it’s usually the second. For example, our &lt;code&gt;company_metrics&lt;/code&gt; table has complete cumulative data in each &lt;code&gt;ds&lt;/code&gt;, while our &lt;code&gt;driver_telemetry&lt;/code&gt; table has just that day’s worth of telemetry events. A further advantage is that if something breaks, it’s almost never a big problem: we can always refer to yesterday’s data, and get away with it. Data will never be unavailable, it may just be late. Also, if there’s ever a question why a number changed, it’s easy to see what the reported number was a month ago (by examining that day’s &lt;code&gt;ds&lt;/code&gt; partition).&lt;/p&gt;
&lt;p&gt;We use &lt;a href="https://airflow.apache.org/"&gt;Airflow&lt;/a&gt; for data piping, which is loosely based on &lt;a href="http://www.asiliconvalleyinsider.com/asiliconvalleyinsider/Blog_A_Silicon_Valley_Insider/Entries/2016/5/1_Data_Engineering_%40_Facebook.html"&gt;Facebook’s Dataswarm system&lt;/a&gt;. Airflow allows us to write jobs as Directed Acyclic Graphs (DAGs) of tasks, with each task getting something useful done, like a database &lt;code&gt;INSERT&lt;/code&gt;. In Airflow, each DAG has a schedule, which uses the &lt;a href="https://airflow.apache.org/scheduler.html"&gt;cron format&lt;/a&gt;, so it can be daily, hourly, or just run every Wednesday at 3:15PM. On each of these runs, Airflow creates an instance of the DAG (identified by the timestamp), and executes the tasks, taking into account the dependencies between them. We have 2 types of DAGs: imports, for importing tables from the production database to the DWH, and compute jobs, which take existing (imported or computed) tables and make a new, more useful table. Fundamentally, each table is its own DAG.&lt;/p&gt;
&lt;p&gt;This poses a question: how do we make sure that a table’s DAG only runs once the tables it requires (eg. tables used in the &lt;code&gt;FROM&lt;/code&gt; part) are available, i.e. their latest &lt;code&gt;ds&lt;/code&gt; partition exists? This is accomplished with special sensor tasks, which repeatedly check something (in this case, whether a table’s partition is there) and only succeed once the check succeeds; until then these “wait” tasks block the DAG from executing. For example, this is what a typical DAG looks like:&lt;/p&gt;
&lt;p&gt;&lt;img src="/images/warehouse-dag.png" alt="Warehouse DAG" style="width: 650px;"/&gt;&lt;/p&gt;
&lt;p&gt;There are two waits: one for a table called &lt;code&gt;deliveries&lt;/code&gt;, and one for this table’s own &lt;code&gt;ds&lt;/code&gt; partition from yesterday, which is a kind of self-dependency. The &lt;code&gt;create&lt;/code&gt; task creates the table in case it doesn’t exist, &lt;code&gt;drop_partition&lt;/code&gt; drops the partition in case it already exists (in case we’re re-running the job), and &lt;code&gt;insert&lt;/code&gt; does the actual &lt;code&gt;INSERT INTO … SELECT ... FROM ...&lt;/code&gt;. Finally, some useful views are created: eg. for a table called &lt;code&gt;company_metrics&lt;/code&gt;, the view task creates a view called &lt;code&gt;company_metrics_latest&lt;/code&gt;, which points to the latest &lt;code&gt;ds&lt;/code&gt; partition.&lt;/p&gt;
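&lt;p&gt;Conceptually, each wait task boils down to a poll loop. The sketch below is a simplified stand-in for what an Airflow sensor does, not the real operator code; &lt;code&gt;partition_exists&lt;/code&gt; is a hypothetical callable that would, in reality, query the Hive metastore or list keys under the table’s S3 path:&lt;/p&gt;

```python
import time

def wait_for_partition(partition_exists, table, ds, poke_interval=60, timeout=3600):
    """Block until a table's ds partition appears, like an Airflow sensor.

    partition_exists is a callable (table, ds) returning True/False; in
    reality it would check the metastore or the backing S3 storage.
    """
    waited = 0
    while not partition_exists(table, ds):
        if waited >= timeout:
            raise TimeoutError(f"partition {table}/{ds} not available")
        time.sleep(poke_interval)
        waited += poke_interval
    return True

# With a stub check that always succeeds, the "wait" returns immediately:
print(wait_for_partition(lambda t, d: True, "deliveries", "2018-03-01"))
# True
```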
&lt;p&gt;DAGs for import jobs are simpler:&lt;/p&gt;
&lt;p&gt;&lt;img src="/images/import-dag.png" alt="Import DAG" style="width: 650px;"/&gt;&lt;/p&gt;
&lt;p&gt;The &lt;code&gt;s3copy&lt;/code&gt; task dumps the table from the production Postgresql into a local file and then copies it to S3, to the appropriate path. The &lt;code&gt;notice&lt;/code&gt; task lets Hive know that we “manually” created a new partition on the backing storage, and triggers the metadata store to re-scan for new partitions by issuing &lt;code&gt;MSCK REPAIR TABLE &amp;lt;table&amp;gt;&lt;/code&gt;. (The &lt;code&gt;notice&lt;/code&gt; in the upper DAG is actually not required, since it’s a Presto job.)&lt;/p&gt;
&lt;p&gt;Airflow creates daily instances (for daily jobs) of these DAGs, and has a very helpful view to show progress/completion.&lt;/p&gt;
&lt;p&gt;&lt;img src="/images/dag-runs.png" alt="Warehouse DAG" style="width: 650px;"/&gt;&lt;/p&gt;
&lt;p&gt;The UI also allows for tasks to be cleared, re-run, etc.&lt;/p&gt;
&lt;p&gt;&lt;img src="/images/task-actions.png" alt="Task actions" style="width: 650px;"/&gt;&lt;/p&gt;
&lt;p&gt;Each DAG is implemented as Python code, in our case one &lt;code&gt;.py&lt;/code&gt; file per DAG. Most of these DAGs are highly repetitive, so we wrote a small library to save us time. For example, since we’re importing from a Postgresql database, which is itself a relational database, it’s enough to say which table we want to import: our scripts figure out the source table’s schema, know how to map Postgresql types to Hive types, handle column names which are not allowed on Hive, etc. This makes importing a table as easy as:&lt;/p&gt;
&lt;p&gt;&lt;img src="/images/import-code.png" alt="Import code" style="width: 650px;"/&gt;&lt;/p&gt;
&lt;p&gt;All the logic is contained in the &lt;code&gt;dag_import_erp_table()&lt;/code&gt; function, which is re-used for all imports.&lt;/p&gt;
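&lt;p&gt;To give a flavor of what such a helper needs to do internally, here is a hypothetical sketch of the Postgresql-to-Hive type mapping; the mapping table and the &lt;code&gt;hive_column()&lt;/code&gt; function are illustrative, not the actual production code:&lt;/p&gt;

```python
# Illustrative mapping from Postgresql column types to Hive types --
# a sketch of the kind of logic a helper like dag_import_erp_table()
# needs, not the real implementation.
PG_TO_HIVE = {
    "integer": "INT",
    "bigint": "BIGINT",
    "numeric": "DOUBLE",
    "text": "STRING",
    "character varying": "STRING",
    "boolean": "BOOLEAN",
    "timestamp without time zone": "TIMESTAMP",
}

def hive_column(name, pg_type):
    """Map one Postgresql column to a Hive column definition,
    renaming column names that Hive does not allow."""
    hive_name = name.lower().replace(" ", "_")
    if hive_name in ("timestamp", "date", "order"):  # reserved words in Hive
        hive_name = hive_name + "_"
    return f"{hive_name} {PG_TO_HIVE.get(pg_type, 'STRING')}"

print(hive_column("Order", "integer"))  # order_ INT
```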
&lt;p&gt;We wrote similar helper functions for our common warehouse jobs, which take existing tables to build a new, more useful one. We specify the name of the output table, the &lt;code&gt;schedule_interval&lt;/code&gt;, the Hive columns (used to generate the &lt;code&gt;CREATE TABLE&lt;/code&gt; task), and the Presto &lt;code&gt;SELECT&lt;/code&gt; query, which will be placed after the &lt;code&gt;INSERT&lt;/code&gt; part in the insert task. Note the use of the &lt;code&gt;wait::&lt;/code&gt; prefix in the &lt;code&gt;FROM&lt;/code&gt; part: the helper function automatically parses these out and generates wait tasks for those tables. A number of other such features were added to make it easy, fast and convenient to write jobs without having to step outside these helper functions. The &lt;code&gt;{{ ds }}&lt;/code&gt; macro will be replaced by the Airflow runtime with the proper ds, like &lt;code&gt;2018-02-20&lt;/code&gt;.&lt;/p&gt;
&lt;p&gt;&lt;img src="/images/warehouse-code.png" alt="Warehouse code" style="width: 650px;"/&gt;&lt;/p&gt;
&lt;p&gt;Right now we have around 50 jobs; about half are “real” computations, the rest are imports. At this point we are able to move really fast: writing a new job and deploying it to production takes about an hour, and new joiners can ramp up quickly. Because we use Presto/Hive on top of S3 (whereas &lt;a href="https://medium.com/airbnb-engineering/data-infrastructure-at-airbnb-8adfb34f169c"&gt;Airbnb runs their own Hadoop cluster&lt;/a&gt;), we ran into some low-level difficulties and had to write our own Operators, for example a PrestoOperator. Overall this code, plus the helper code, is about 1-2k LOC, so it wasn’t too much work. To be fair, we never hit any data size problems, since compared to the capabilities of these tools, we have "small data". Our biggest tables are ~100M rows (these are part of 10-way &lt;code&gt;JOINs&lt;/code&gt;), but Hive/Presto can easily handle this with zero tuning. We expect to grow 10x within a year, but we expect that naive linear scaling will suffice.&lt;/p&gt;
&lt;p&gt;Maintaining a staging data warehouse is not practical in our experience, but maintaining a staging Airflow instance is practical and useful. This is because of Airflow’s brittle execution model: DAGs’ &lt;code&gt;.py&lt;/code&gt; files are executed by the main webserver/scheduler process, and if there’s a syntax error then bad things happen, for example certain webserver pages don’t load. So it’s best to make sure that scripts deployed to the production Airflow instance are already working. To this end we set up a second, staging Airflow instance, which writes to the same data warehouse (we have only one), but has its own internal state. Our production Airflow instance runs on two EC2 nodes: one for the webserver and the scheduler, one for the workers. The staging instance runs on a third node, with all 3 components on the same host.&lt;/p&gt;
&lt;p&gt;Overall, getting here was fast, mostly because:&lt;/p&gt;
&lt;ol&gt;
&lt;li&gt;the database components (Hive, Presto) were open sourced by Facebook&lt;/li&gt;
&lt;li&gt;Amazon runs them for us as part of EMR&lt;/li&gt;
&lt;li&gt;we don't have to manage storage because of S3&lt;/li&gt;
&lt;li&gt;other former Facebook engineers built Airflow and Airbnb open sourced it&lt;/li&gt;
&lt;li&gt;the common background (Facebook) meant everything made sense.&lt;/li&gt;
&lt;/ol&gt;
&lt;p&gt;Having said that, Airflow still feels very “beta”. It’s not hard to “confuse” it, at which point it behaves in weird ways, pages don’t load, etc. For example, if a DAG’s structure changes too much, Airflow seems to get confused and exceptions are thrown; for cases like this we wrote a custom script which wipes Airflow’s memory of the DAG completely (we didn’t find a way to do this with the provided CLI or UI). But, once we understood how it works and learned its quirks, we found a way to use it for our use-case. This process took about 1-2 months. We now rarely run into Airflow issues, perhaps once a month.&lt;/p&gt;
&lt;p&gt;The main limitation of this architecture is that it's very batch-y. For "real-time" jobs, we use hourly or 15-minute jobs to get frequent updates, but we apply manual filters on data size to make these run fast(er). Overall, this is inconvenient and won't scale very well; eventually we'll have to look at other technologies for this use-case. Still, we feel this inconvenience/limitation/techdebt is a small price to pay for all the high-level product and business impact that we were able to deliver with this architecture.&lt;/p&gt;
&lt;p&gt;Airflow is now under Apache incubation, with lots of development activity, so it will surely get even better in the coming years. Going with Airflow was a bet that paid off, and we expect that Airflow will become the de facto open source ETL tool, if it isn’t already.&lt;/p&gt;
&lt;p&gt;In the next part about Fetchr's Data Science Infra, I’ll talk about how we use Superset for dashboarding and SQL.&lt;/p&gt;</content><category term="data"></category><category term="etl"></category><category term="workflow"></category><category term="airflow"></category><category term="fetchr"></category></entry><entry><title>Don’t build cockpits, become a coach</title><link href="https://bytepawn.com/data-science-coaching.html" rel="alternate"></link><published>2016-11-09T00:00:00+01:00</published><updated>2016-11-09T00:00:00+01:00</updated><author><name>Marton Trencseni</name></author><id>tag:bytepawn.com,2016-11-09:/data-science-coaching.html</id><summary type="html">&lt;p&gt;I used to think that a good analogy for using data is the instrumentation of a cockpit in an airliner. Lots of instruments, and if they fail, the pilot can’t fly the plane and bad things happen. There’s no autopilot for companies. The problem with this analogy is that planes aren’t built in mid-air. Product teams and companies constantly need to build and ship new products.&lt;br/&gt;&lt;br/&gt;&lt;img src="/images/hosszu-shane.jpg" alt="A big complicated cockpit" style="width: 400px;"/&gt;&lt;/p&gt;</summary><content type="html">&lt;p&gt;I used to think that a good analogy for using data is the instrumentation of a cockpit in an airliner. Lots of instruments, and if they fail, the pilot can’t fly the plane and bad things happen. There’s no autopilot for companies.&lt;/p&gt;
&lt;p&gt;The problem with this analogy is that planes aren’t built in mid-air. Product teams and companies constantly need to build and ship new products. Facebook is very good at this.&lt;/p&gt;
&lt;p&gt;&lt;img src="/images/cockpit.jpg" alt="A big complicated cockpit" style="width: 650px;"/&gt;&lt;/p&gt;
&lt;p&gt;A better model is a person, let’s say Petie, who is overweight and wants to lose weight. We’re going to need a scale, to see how much Petie actually weighs. If there’s no scale around, we’ll need to build one. While we’re at it, we can also collect other numbers, like body fat, circumference, etc. Then, we need to get Petie to actually look at the numbers. So we send these numbers to Petie every day in an email, or maybe we build a dashboard for him. Sounds good! But at this point all we’ve done is make Petie know precisely how overweight he is. We probably successfully made Petie depressed about himself.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;What Petie really needs is a coach who helps him get in shape. He needs a workout plan and somebody to work with him on reaching his goals.&lt;/strong&gt; He needs somebody to help figure out what the workout should be, and set goals for the workout sessions. Once he’s reached his goals, he needs help figuring out the next phase, what other exercises to do. Less running, more lifting, maybe do an experiment to see what diet works better for him. Knowing numbers is part of it, but the point is to somehow get him to go and do the things which will make him lose weight, keep track of how he’s doing, make sure he’s on track, help him make changes on the way.&lt;/p&gt;
&lt;p&gt;&lt;img src="/images/hosszu-shane.jpg" alt="A big complicated cockpit" style="width: 650px;"/&gt;&lt;/p&gt;
&lt;p&gt;Many professional sports teams already use data heavily. The most famous example I know is the British cycling team, first described in this &lt;a href="http://www.bbc.co.uk/sport/olympics/19174302"&gt;2012 BBC article&lt;/a&gt;.&lt;/p&gt;
&lt;p&gt;I think being a good coach is really hard. A good coach needs to know the sport, the athlete, the equipment, how training works, how the season and races work, everything. A good coach is on excellent terms with the athlete and is a great communicator. The coach needs to be able to convince the athlete to perform deliberate practice, which is hard and painful stuff. And, to make it harder, to painstakingly take precise measurements while he’s doing it. A good coach is not made overnight. A good idea is to learn from other, more experienced coaches, who have successfully helped athletes reach their goals. Facebook has good coaches (data scientists).&lt;/p&gt;
&lt;p&gt;Whenever I use an analogy I set off the Elon-alarm in my head. &lt;a href="http://jamesclear.com/first-principles"&gt;Elon Musk famously said&lt;/a&gt;:&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;I think it is important to reason from first principles rather than by analogy. The normal way we conduct our lives is we reason by analogy. [When reasoning by analogy] we are doing this because it’s like something else that was done or it is like what other people are doing — slight iterations on a theme.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;But, this one seems useful enough to ignore Elon.&lt;/p&gt;</content><category term="data"></category><category term="science"></category><category term="product"></category><category term="analytics"></category></entry><entry><title>Beautiful A/B testing</title><link href="https://bytepawn.com/beautiful-ab-testing.html" rel="alternate"></link><published>2016-06-05T00:00:00+02:00</published><updated>2016-06-05T00:00:00+02:00</updated><author><name>Marton Trencseni</name></author><id>tag:bytepawn.com,2016-06-05:/beautiful-ab-testing.html</id><summary type="html">&lt;p&gt;I gave this talk at the O’Reilly Strata Conference London in 2016 June, mostly based on what  I learned at Prezi from 2012-2016.&lt;br/&gt;&lt;br/&gt;&lt;img src="/images/strata-2016/14.png" alt="14. slide" style="width: 400px;"/&gt;&lt;/p&gt;</summary><content type="html">&lt;p&gt;I gave this talk at the &lt;a href="http://conferences.oreilly.com/strata/hadoop-big-data-eu/public/schedule/detail/49583"&gt; O’Reilly Strata Conference London&lt;/a&gt; in 2016 June.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Disclaimer:&lt;/strong&gt; this is not about Facebook A/B testing. If you want to hear about that, you will be disappointed, because nothing here is about Facebook. This is based on my experiences at previous jobs.&lt;/p&gt;
&lt;p&gt;&lt;img src="/images/strata-2016/1.png" alt="1. slide" style="width: 650px;"/&gt;&lt;/p&gt;
&lt;p&gt;5 years ago I was working on my own startup, it was called Scalien, a combination of the words Scale + Alien. It was a cool name, and we had a cool product called &lt;a href="https://github.com/scalien/scaliendb"&gt;ScalienDB&lt;/a&gt; that nobody wanted, a NoSQL database that unlike other NoSQL databases used an algorithm called &lt;a href="https://en.wikipedia.org/wiki/Paxos_(computer_science)"&gt;Paxos&lt;/a&gt; for consistent replication, whereas the competing products used Eventual Consistency. Scalien didn't work out and after we shut down the company I took a job with &lt;a href="https://prezi.com"&gt;Prezi&lt;/a&gt; in Budapest, Hungary, where I'm from. I was hired at Prezi as a data engineer, and eventually became the head data guy. My job was to build out the data team and the data infrastructure, and one of the projects I worked on was A/B testing, figuring out how the company should do A/B testing.&lt;/p&gt;
&lt;p&gt;&lt;img src="/images/strata-2016/2.png" alt="2. slide" style="width: 650px;"/&gt;&lt;/p&gt;
&lt;p&gt;When I first started to think about how to improve the A/B testing culture and saw the problems we were having, I was a bit surprised. I thought, isn't it as simple as taking four numbers (the total impressions for A and B, and conversions for A and B), plugging it into some math equations, and reading off the results? I guess the naivete I had back then is a testament to the fact that even an education in Physics, the cradle of the scientific method, doesn't prepare you for how much more there is to it (=the scientific method), and sort of explains &lt;a href="http://blog.minitab.com/blog/understanding-statistics/what-can-you-say-when-your-p-value-is-greater-than-005"&gt;why so many academic studies are flawed&lt;/a&gt;. So the title is beautiful because (other than stealing this trendy phrase from O'Reilly) it turns out that A/B testing is a much more complex topic, and the actual complexities that you find when you look into it I personally find beautiful.&lt;/p&gt;
&lt;p&gt;I cannot claim I have found all pitfalls or know all the answers to the issues around A/B testing; here I will tell you what I learned in the last 3 years, what fallacies I found, and how we tried to work around them.&lt;/p&gt;
&lt;p&gt;&lt;img src="/images/strata-2016/3.png" alt="3. slide" style="width: 650px;"/&gt;&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Let’s define what A/B testing is in this context here:&lt;/strong&gt; you have two variants of your SaaS product, and you want to see which one is better. Note that you don't need to perform an A/B test (=compare two variants) if you just want to measure metrics for the new version! Performance is always measured relative to some relevant business metric, and on your target audience, the users you care about. The standard simple example is checking whether a button should be green or blue, which converts better. Another example is reorganizing templates that you can use in an editor, and seeing whether it affects user behaviour, like spending more time in the editor or creating more presentations. A third example is trying out different variants of a pricing page, and seeing how it influences people's choices between free and various paying options of a freemium product.&lt;/p&gt;
&lt;p&gt;No talk about A/B testing is complete without mentioning HIPPOs. &lt;strong&gt;HIPPO stands for Highest Paid Person's Opinion&lt;/strong&gt;, and is the antithesis of A/B testing and scientific thought: it's when the big guy decides what happens with the software based on her own personal preferences, and in general it's a bad idea.&lt;/p&gt;
&lt;p&gt;&lt;img src="/images/strata-2016/4.png" alt="4. slide" style="width: 650px;"/&gt;&lt;/p&gt;
&lt;p&gt;The talk is divided into 9 topics, organized in a nice staircase, as they are based on each other:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;4 are things you should think about before doing an A/B test&lt;/li&gt;
&lt;li&gt;3 are relevant while the test is running&lt;/li&gt;
&lt;li&gt;2 are things you should do after the A/B test is finished&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;&lt;img src="/images/strata-2016/5.png" alt="5. slide" style="width: 650px;"/&gt;&lt;/p&gt;
&lt;p&gt;The first example is about pricing pages. In a freemium business, users can go free or choose one of the paying options to get extra features. One thing we learned is that as we made the free option on the pricing page smaller and its color less distinguished, people chose the paying option more (paying after a free trial period). After a few such experiments, we didn’t need to do an A/B test just to learn this, we already knew.&lt;/p&gt;
&lt;p&gt;The other example is about sign up emails. After a large number of such experiments, we learned that green buttons work better with our blue color scheme. (We also learned that designers are unhappy with green buttons.)&lt;/p&gt;
&lt;p&gt;So the lesson is, &lt;strong&gt;don't run an A/B test---which is an experiment to tell whether A or B is better---when you already know from past experiments which one will be better.&lt;/strong&gt; Note that not doing an experiment doesn’t mean you should not measure the metrics associated with a new version, that’s not the same thing.&lt;/p&gt;
&lt;p&gt;&lt;img src="/images/strata-2016/6.png" alt="6. slide" style="width: 650px;"/&gt;&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;When you run an A/B test, you have to decide what metric to look at, and that should be the right metric.&lt;/strong&gt; A mistake many people make is they focus on a more immediate metric like conversion. But usually your desired business outcome is 3 more steps down the line, so you should focus on that. Usually this metric you should look at is related to something important to your business, like revenue, or how many pieces of content people make, or how much time people spend viewing content, and so on; or to give an example from social networking: do people post more, comment more, do they spend more time with the product, and so on.&lt;/p&gt;
&lt;p&gt;This is very tricky, and even when you think you're doing the right thing, you may not be. One story here is a series of experiments performed on a pricing page. And we were smart: we didn't just look at clicks, we looked at the revenue generated. And we saw it was better, and we were super happy, we changed the HTML of the pricing page and now we're making more money! But we didn't have a good understanding of payment flows, so we didn't take into account refunds and chargebacks. A refund is when a customer asks you, the merchant, for the money back. A chargeback is when the customer asks their credit card company for the money back. The way it works for an online SaaS is, essentially: when a customer asks for the money back, you want to give it back, to keep good relations with customers and look good with the payment providers, because you're at their mercy. So these are 2 channels that we should have taken into account, and subtracted from the results of the A/B test. This is of course hard, because it requires sophisticated tracking and data collection, plus patience, because chargebacks are not collected by you, and only arrive at your doorstep 30-60 days after the payment event.&lt;/p&gt;
&lt;p&gt;A fallacy related to the question of which metric to look at is what I call &lt;strong&gt;data mining metrics&lt;/strong&gt;. I haven't yet talked much about technical concepts like statistical significance, and I won't, because this is not intended to be a deeply technical talk. But most A/B tests are run using what's called frequentist statistics. With these statistics, after collecting samples for both A and B (exposures, conversions), you calculate something called the p-value. The p-value is usually expressed as a %, like 3%. The p-value has a very specific meaning, related to the false positive rate. It says: if you were to repeat this experiment many, many times, there would be a 3% chance of seeing a result at least this extreme, assuming that A and B are actually the same. So the way frequentist A/B testing works is that a company usually standardizes on a threshold like 5%, saying: let's run our A/B tests long enough and collect enough samples that we only accept a false positive 1 out of 20 times.
So the fallacy here is this: you have many business metrics, let's say 20, you run a test, you calculate A vs B for all 20 metrics, you find one metric where B is better than A (p is less than 5%), and you accept it. If this is your methodology, then in the long run you're wrong to think that your false positive rate is only 5%. It will be much higher, because you're essentially giving yourself extra chances: is B better with respect to Metric 1? How about Metric 2? And so on. So you shouldn't fool yourself like this.&lt;/p&gt;
&lt;p&gt;Of course you &lt;em&gt;should&lt;/em&gt; look at a bunch of metrics after the A/B test, but:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;before the test, you should select the target metric, and put it down in your A/B testing tool to keep yourself honest&lt;/li&gt;
&lt;li&gt;when you look at other metrics later, just remember that you're increasing the chance of seeing a false positive&lt;/li&gt;
&lt;li&gt;if you find a lift in another metric that seems like a reasonable effect that you didn't think of, repeat the A/B test with that target metric&lt;/li&gt;
&lt;/ul&gt;
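&lt;p&gt;As a sanity check of the fallacy above, here is a short calculation and simulation, assuming 20 independent metrics and a correctly calibrated test (under the null hypothesis, p-values are uniformly distributed):&lt;/p&gt;

```python
import random

random.seed(0)
ALPHA = 0.05      # per-metric significance threshold
N_METRICS = 20    # number of business metrics peeked at

# Analytically: the chance of at least one false positive across
# 20 independent metrics, when A and B are actually identical.
analytic = 1 - (1 - ALPHA) ** N_METRICS
print(round(analytic, 3))  # 0.642

# The same by simulation: each A/A "experiment" draws one uniform
# p-value per metric and declares a win if any metric looks significant.
trials = 20_000
hits = sum(
    any(ALPHA > random.random() for _ in range(N_METRICS))
    for _ in range(trials)
)
print(round(hits / trials, 2))  # roughly 0.64, nowhere near 0.05
```

&lt;p&gt;In other words, with 20 free peeks your effective false positive rate is around 64%, not the 5% you standardized on.&lt;/p&gt;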
&lt;p&gt;&lt;strong&gt;Large numbers =&amp;gt; significance.&lt;/strong&gt; There is one more thing to keep in mind. Large sites with a lot of traffic won't have these problems, because they can quickly collect so many samples that the p-value goes way down to something like 1 in 10,000, and then you can look at many metrics and still be reasonably sure that it's not a false positive. Essentially, if you collect a lot of samples, you're converting that bell curve estimate of the metric, which has a certain precision error, into a point estimate as the bell curve becomes narrower and narrower, and you can compare those "freely". I'm oversimplifying a bit, but that's the gist of it.&lt;/p&gt;
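&lt;p&gt;To make the narrowing bell curve concrete: for a conversion-rate estimate the standard error shrinks like 1/sqrt(n), so 10x the samples buys you a roughly 3.2x narrower estimate. A minimal illustration:&lt;/p&gt;

```python
from math import sqrt

def std_error(p, n):
    """Standard error of a conversion rate p estimated from n samples."""
    return sqrt(p * (1 - p) / n)

# A 5% conversion rate measured on ever larger samples; each 10x
# in sample size shrinks the uncertainty by a factor of sqrt(10).
for n in (1_000, 10_000, 100_000, 1_000_000):
    print(n, round(std_error(0.05, n), 5))
```

&lt;p&gt;At a million samples the uncertainty is so small that the estimate behaves almost like a point value, which is the "freely comparable" regime described above.&lt;/p&gt;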
&lt;p&gt;&lt;img src="/images/strata-2016/7.png" alt="7. slide" style="width: 650px;"/&gt;&lt;/p&gt;
&lt;p&gt;This is a simple lesson that I learned by observing PMs in action. The anti-pattern is that a product team will spend 2 months building out a cool new feature, and then they will run an A/B test on it "to prove that it works". Note how I said "to prove that it works". The problem is, after 2 months of invested time in the project, the team wants to see the thing succeed, they don't want their work to go to waste. This is when people can get really creative with p-values and explaining the results, if the results are negative or just inconclusive.&lt;/p&gt;
&lt;p&gt;I will cite an example I heard from a guy at Etsy some time back, because it's a great one. They were developing infinite scroll, and they invested a lot of work, but the A/B test showed it performing much worse than the original. It was hard to let go. In the end they performed a sort of back-test to verify the negative result: they simply increased one parameter in the original version, showing 80 items instead of 40. Surprisingly, that alone already showed that users don't like it, so they could have saved a lot of development time by performing that simple A/B test first.&lt;/p&gt;
&lt;p&gt;In a SaaS environment, where you can release anytime and hopefully you have a lot of users, always try to test mock versions early. It's really the same idea as the Build-Measure-Learn loop from the Lean Startup book, it's just that people forget that the whole point of BML is that you want to get to a place where you're moving through BML loops very quickly. So I think if you spend more than 2 weeks on a feature before you start collecting data on it, you are in danger of getting attached to it.&lt;/p&gt;
&lt;p&gt;&lt;img src="/images/strata-2016/8.png" alt="8. slide" style="width: 650px;"/&gt;&lt;/p&gt;
&lt;p&gt;The message around logging is pretty simple. Make sure that your product has the logging in place to tell which user was exposed to which treatment, and that you can connect your target metric. When I say you can connect your target metric, I mean that you can attribute the bottom of the funnel with what's happening in your A/B test. So if your A/B test is about which pricing page variant you show, and your target metric is revenue, you have to be able to connect the users to the purchases (and refunds, and chargebacks, etc). Sounds easy, but there are lots of caveats. For example, you have to make sure they always see the same pricing page. Or maybe there are 10 other ways to make a purchase. And so on. Or maybe your target metric is time spent, in that case you want to be able to slice that. Think it through, and make sure you have enough logging in place that you can compute your results at the end. It sounds easy and obvious, but it's actually pretty hard to do, because in a reasonably complex application, events come from lots of different sources on the site, and you have to make sure the logging is good and uniform everywhere.&lt;/p&gt;
&lt;p&gt;Also, make sure you have a dashboard that shows, &lt;strong&gt;based on the logs, what % of your users are seeing which variant&lt;/strong&gt;. I've seen a situation where the A/B test was essentially misconfigured due to a bug, and the actual exposure %s were different from what we thought they were. It would have been easy to catch by just showing this simple split of users, but we didn't. If one of the variants is performing very poorly, this can be very painful.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Logging is an investment you want to make early on in the game.&lt;/strong&gt; At that point it's pretty cheap, because the code for this is not terribly complex even for a moderately complex site. But if you don't do it, then it will be very hard and painful later, because you have to go through your entire product and change the logging. That is so painful that it pretty much requires executive buy-in, because product teams need to stop what they're doing and fix the logging.&lt;/p&gt;
&lt;p&gt;A related issue is hashing. Suppose you have an A/B test that you want to run on 10% of your users, so you want to run A for 5% and B for 5%. So who should those 10% be? A deceptively simple answer is: take the user's user_id modulo 100. Modulo 100 is just the fancy term for taking the last 2 digits of the number, like 11 or 39. There happen to be 100 two-digit numbers (00 to 99), so you can easily think in terms of %s. So you give treatment A to 00-04 and B to 05-09. We ended up assigning these ranges to product teams to use: team one uses 00-09, team two uses 10-19, and so on.&lt;/p&gt;
&lt;p&gt;So what's the problem with this? It turns out there are a number of problems here. Let me point out a couple:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;Uneven distribution: servers assign user_ids in some systematic way, so for example German users may be more likely to end up with user_ids that end in 50-99.&lt;/li&gt;
&lt;li&gt;Memory: Suppose you run an A/B test the way I described above, in the first 10%, with an even split. Then it's over, and a week later, you run another one, with the same split. You risk contaminating the new test with the effects of the old tests, because users remember! Eg. if they got frustrated with the previous B, then you'll measure that frustration for your new B, too.&lt;/li&gt;
&lt;li&gt;What if you have more tests to run than you have ranges, or a product team wants to run more tests than fit in its own range? The accounting becomes hard and error-prone. And if you start to overlap A/B tests, and different A/B tests overlap in uneven ways, you risk measuring the effects of other A/B tests, not your own.&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;&lt;strong&gt;Hashing:&lt;/strong&gt; Fortunately there's an easy way out of this, and that's to use hashing; I hope everybody is already doing this and nobody is learning anything here. Hashing just means that for each test, you randomize your users, and divide them into buckets after randomization. It's simple: take the name of the A/B test, combine it with a user_id or session_id, and use a standard hashing function like MD5(). This generates a number that's unique for each test+user combination. Then you can apply the modulo rule above, take 10%, and so on. Because the randomization is different for each test, tests will overlap (whether they're running at the same time or one after the other) in random ways, so other tests will "contaminate" your A and B to the same degree, and you can still measure the lift. This also shows that when you're running a large number of A/B tests, you're never really measuring the “true” value of metrics; your measurements are always affected by other tests. But if you’re using md5(test_name+user_id) hashing, both your variants are equally affected, by a constant background lift, so your measured lift is still the true difference between your A and B.&lt;/p&gt;
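&lt;p&gt;A minimal sketch of such per-test hashing (the function name, test name and 10% default are illustrative, not from any particular framework):&lt;/p&gt;

```python
import hashlib

def assign_variant(test_name, user_id, exposure_pct=10):
    """Per-test randomization: hash test_name + user_id, then bucket.
    Each test gets an independent shuffle of the users, so overlapping
    tests contaminate A and B to the same degree."""
    digest = hashlib.md5(f"{test_name}:{user_id}".encode()).hexdigest()
    bucket = int(digest, 16) % 100
    if bucket < exposure_pct // 2:
        return 'A'
    if bucket < exposure_pct:
        return 'B'
    return None  # user is not in this test
```

&lt;p&gt;Note that the assignment is deterministic and stateless: any server can compute a user's variant from the test name and user_id alone, with no shared assignment table.&lt;/p&gt;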
&lt;p&gt;&lt;img src="/images/strata-2016/9.png" alt="9. slide" style="width: 650px;"/&gt;&lt;/p&gt;
&lt;p&gt;Now we move on to things that are relevant while the experiment is running. The first one says, "Don't change the experiment while it's running".&lt;/p&gt;
&lt;p&gt;Let’s take this example: a product team is churning out new versions of a feature every week, and they're A/B testing it. They are using frequentist A/B testing, and they want to prove that B is better than A, so they're trying to:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;build a better product (variant B)&lt;/li&gt;
&lt;li&gt;collect enough samples so that the p-value goes below 5%&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;The fallacy is, when they release a new version, they don't restart the A/B test, because then they'd have to restart counting the exposures and everything from 0. They just release into a running A/B test, to “keep the existing samples”. But you shouldn't do that, because you're cheating. Of course it's fine to release minor bugfixes and such, but if you want to run an A/B test where you take the p-value seriously at the end, then the "treatment" that you expose your users to has to be constant and fixed. Otherwise it's possible that the initial version was good and got some good conversions, the subsequent versions were not good, but statistical significance was reached anyway, so now you think your final version is good. Imagine doing coin flips, trying to tell whether two coins are different, while you keep changing one of the coins without resetting the counts. What does the final result say about the final coin? Not much.&lt;/p&gt;
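&lt;p&gt;A tiny simulation of the coin analogy (the conversion rates are made up) shows why the pooled number says little about the final version:&lt;/p&gt;

```python
import random

random.seed(0)  # deterministic for illustration

def conversions(rate, n):
    """Simulate n exposures with a given true conversion rate."""
    return sum(random.random() < rate for _ in range(n))

# B v1 converts at 8%; halfway through, v2 (converting at only 4%)
# is silently released into the same running test, keeping the old
# samples instead of resetting the counts.
converted = conversions(0.08, 10000) + conversions(0.04, 10000)
pooled_rate = converted / 20000
# pooled_rate comes out near 6%, overstating the final version's
# true 4%, because the early, better version's samples remain
```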
&lt;p&gt;Of course, if you're very quickly releasing new versions, that's a good thing. There are three things you can do:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;Keep the old ones running "in the background" for long enough to get enough samples. &lt;/li&gt;
&lt;li&gt;Restart the experiment each time, and since you have lower counts, accept looser (higher) p-value thresholds, and hence accept more false positives. Remember, there's nothing magic or special about 5%.&lt;/li&gt;
&lt;li&gt;Do release the new version into the running test, but in that case you shouldn't calculate the p-value, just look at the time series for A and B, and if you have enough users, maybe you can see the difference between A and B, and between the versions of B. Or maybe you won't be able to.&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;&lt;img src="/images/strata-2016/10.png" alt="10. slide" style="width: 650px;"/&gt;&lt;/p&gt;
&lt;p&gt;This one is also pretty simple, but it's hard to be disciplined about. Suppose you have a freemium product and you want to make more money. But only a fraction of your users are paying users, so you test on free users, because you have plenty of those. That's great, but:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;You won't be able to measure revenue related metrics.&lt;/li&gt;
&lt;li&gt;Users who don't want to pay for your product will be fundamentally different from users who do, so you're not really measuring on your target demographic.&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;The other fallacy here is to run a test for 12 hours on a Wednesday. The problem is that while your German users are up, your Chinese users are asleep, and so on. So depending on your user base, it's possible that the sample was not representative. Weekly seasonality is a similar story: don’t run a test from Friday to Tuesday. I found that in most cases it's a good idea to run tests for multiples of 7 days. Even tougher is yearly seasonality. For example, during the summer students and teachers are on vacation, so if your product has a mix of students and teachers, that mix will change sharply between June and September. Fortunately these are all things you can be cautious about, take into account, and even correct for.&lt;/p&gt;
&lt;p&gt;The keyword is &lt;strong&gt;representative sample&lt;/strong&gt;, you have to run your test on a representative sample.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Mechanical Turk.&lt;/strong&gt; Sometimes at A/B testing talks the question comes up: what if you have a small site without a lot of visitors, and you want to do A/B testing? The speaker will say you can use Amazon Mechanical Turk. I think this is dangerous. For many SaaS sites your measurements will be meaningless, because people getting paid to take your test will behave differently from your real users. So what can you do?&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;From a math perspective, if you don’t have enough users, you’ll have to accept higher false positive/negative rates.&lt;/li&gt;
&lt;li&gt;Perhaps you should (contrary to one of the fallacies I list here) concentrate on an intermediate metric, like how much time visitors spend on your site: did that increase significantly? Or look at heatmaps.&lt;/li&gt;
&lt;li&gt;If you don’t have enough traffic, you could concentrate on getting more traffic, by eg. purchasing email lists.&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;&lt;img src="/images/strata-2016/11.png" alt="11. slide" style="width: 650px;"/&gt;&lt;/p&gt;
&lt;p&gt;This presentation is not specifically about the mathematics of A/B testing, so I decided not to talk too much about it. What is perhaps interesting is that there are two distinct schools of thought about A/B testing: &lt;strong&gt;frequentist and bayesian&lt;/strong&gt;.&lt;/p&gt;
&lt;p&gt;The &lt;strong&gt;frequentist view&lt;/strong&gt; of the world is an objective view. It says that there is a true conversion rate of A and B, that you could in theory repeat your finite A/B test an infinite number of times, each time collecting only a finite number N of samples, and that you should worry about your false positive and false negative rates. The concept of statistical significance is about false positive rates; the concept of statistical power is related to false negative rates. A false positive is when B isn't better than A, but you measure B to be better and keep it. A false negative is when B is better than A, but you measure it to be worse, or your experiment does not reach significance, so you reject it. So in the frequentist view, you, the observer, are not part of the picture; you're just controlling statistical fluctuations.&lt;/p&gt;
&lt;p&gt;The &lt;strong&gt;bayesian approach&lt;/strong&gt; is one where the observer is part of the picture. Essentially, it starts by asking what your belief is about A and B's conversion rates; this is called the prior. It then updates this belief after each and every observation. The end result is something called the posterior, which is the current best guess about A and B, based on what we've seen.&lt;/p&gt;
&lt;p&gt;Personally, I like the bayesian approach better. The reason is that the frequentist approach is very hard to use, and the quantities it deals with are not intuitive. Conducting a frequentist test requires the product manager, or somebody, to:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;guess how big the lift will be; this is necessary to figure out how many samples N to collect&lt;/li&gt;
&lt;li&gt;understand the concepts of statistical significance (~false positive rate) and statistical power (~false negative rate); based on my experience, many non-technical people like product managers are confused and easily misled by these&lt;/li&gt;
&lt;li&gt;not commit a number of mistakes, such as peeking and stopping early&lt;/li&gt;
&lt;/ul&gt;
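&lt;p&gt;To illustrate the first requirement, here is the standard two-proportion sample-size approximation (the baseline and lift numbers are made up; 5% two-sided significance and 80% power assumed):&lt;/p&gt;

```python
def samples_per_variant(p_base, lift, z_alpha=1.96, z_beta=0.84):
    """Rough number of samples needed per variant to detect an
    absolute lift over a baseline conversion rate, at 5%
    significance (two-sided, z_alpha) and 80% power (z_beta)."""
    p_avg = p_base + lift / 2
    return int((z_alpha + z_beta) ** 2 * 2 * p_avg * (1 - p_avg) / lift ** 2)

# detecting a 1% absolute lift over a 10% baseline takes roughly
# 15,000 users per variant; halving the guessed lift roughly
# quadruples the required sample size
n = samples_per_variant(0.10, 0.01)
```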
&lt;p&gt;The bayesian approach, on the other hand, is more natural. It does not solve all of these issues, but at least it operates with easier-to-understand statements like: "What is the probability that B is better than A? The probability, based on the observations so far, is 82%". Note that this is not the same thing as (one minus) statistical significance.&lt;/p&gt;
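&lt;p&gt;A minimal sketch of how such a statement can be computed (the counts are made up; uniform priors assumed): draw from each variant's Beta posterior and count how often B beats A.&lt;/p&gt;

```python
import random

random.seed(1)  # deterministic for illustration

def prob_b_beats_a(conv_a, n_a, conv_b, n_b, draws=50000):
    """Monte Carlo estimate of P(B's true rate > A's true rate),
    using Beta posteriors with uniform Beta(1, 1) priors."""
    wins = 0
    for _ in range(draws):
        p_a = random.betavariate(1 + conv_a, 1 + n_a - conv_a)
        p_b = random.betavariate(1 + conv_b, 1 + n_b - conv_b)
        wins += p_b > p_a
    return wins / draws

# e.g. A converted 100/1000 users, B converted 115/1000
p = prob_b_beats_a(conv_a=100, n_a=1000, conv_b=115, n_b=1000)
# p directly answers "what is the probability that B is better?"
```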
&lt;p&gt;Personally, I would use a mix of bayesian and frequentist today; frequentist to gauge how long to run a test, bayesian to evaluate (but also show frequentist results in small print). Note that large organizations which can afford to hire a team of smart data scientists can essentially go either way, because that team will make sure the decisions are statistically sound.&lt;/p&gt;
&lt;p&gt;There are other approaches like the &lt;strong&gt;Multi-Armed Bandit&lt;/strong&gt;, which is about regret minimization. The idea is: if one of the variants is performing better, show it more often, so as not to lose conversions. It dynamically re-allocates traffic between A and B based on current conversion rates.&lt;/p&gt;
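&lt;p&gt;One common way to do this re-allocation is Thompson sampling; here is a sketch with made-up conversion rates:&lt;/p&gt;

```python
import random

random.seed(2)  # deterministic for illustration

def thompson_choose(stats):
    """Pick the arm with the highest draw from its Beta posterior.
    stats maps arm name to [conversions, exposures]."""
    def draw(arm):
        conv, n = stats[arm]
        return random.betavariate(1 + conv, 1 + n - conv)
    return max(stats, key=draw)

true_rates = {'A': 0.05, 'B': 0.08}
stats = {'A': [0, 0], 'B': [0, 0]}
for _ in range(5000):
    arm = thompson_choose(stats)
    stats[arm][0] += random.random() < true_rates[arm]
    stats[arm][1] += 1
# the better arm, B, typically ends up receiving most of the traffic
```

&lt;p&gt;The exploration happens automatically: an arm with few samples has a wide posterior, so it still wins some draws and keeps getting tested.&lt;/p&gt;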
&lt;p&gt;&lt;strong&gt;Maserati problem.&lt;/strong&gt; A Maserati problem is a "nice to have" problem, like "once I'm rich, should I buy a Maserati or a Ferrari". In my experience, choosing between these statistical engines is a nice problem to have: if this is your biggest concern, it means your logging is good, you have good metrics, you have enough traffic, and you're not committing any of these fallacies. The flipside is, if you have problems with more basic things like logging, you probably shouldn't obsess too much about the math: as long as you're feeding wrong numbers into the equations, your numerical results will be wrong, so your number one priority should be to fix that.&lt;/p&gt;
&lt;p&gt;&lt;img src="/images/strata-2016/12.png" alt="12. slide" style="width: 650px;"/&gt;&lt;/p&gt;
&lt;p&gt;Now we move on to what to do after the test has finished running. The first question is how to report the results, whatever they are. I found it's very important to have a &lt;strong&gt;standardized reporting&lt;/strong&gt; tool for A/B results, and not have PMs write up their results in custom emails that get sent around. There are two reasons. One is, it helps keep us honest. If there is a standard tool where everybody can go and check out results, there’s less of a chance that people will cherry-pick and just talk about the good results, or do metrics data mining, the fallacy I mentioned before. Also, it conditions stakeholders and executives to a standard format, so they’ll get used to a standard set of numbers and how they’re presented. They’ll get used to typical lift %s, so they can relate results to past results and have expectations.&lt;/p&gt;
&lt;p&gt;One story I want to share is related to &lt;strong&gt;raw vs projected&lt;/strong&gt; results. In this case there was no standardized reporting, the results were communicated in email, and the result was something like a 40% lift. So everybody was happy, and they were right to be happy, and the thing got deployed to 100% of users. Then a month later the team got an email from the CEO, asking why the metric is flat if the thing got us a 40% lift. It turns out that the PM was talking about the raw lift %, which is the lift you see among users who actually use that feature in the product. Think of the feature as a dialog in a big application, and let’s say only 1% of users ever use it. Then your overall, or projected, or top-line lift will only be 0.4%. That’s why the CEO didn’t see a big lift across the company: the result was miscommunicated and a false expectation was set.&lt;/p&gt;
&lt;p&gt;So the lesson: keep your raw and projected (to all users) lifts separate, and report in a standard way to your stakeholders!&lt;/p&gt;
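&lt;p&gt;The arithmetic from the story, as a sketch (this simple scaling assumes the metric is additive and the rest of the product is unchanged):&lt;/p&gt;

```python
def projected_lift(raw_lift, feature_usage):
    """Scale a raw lift (measured among users who actually use the
    feature) down to the whole user base."""
    return raw_lift * feature_usage

# a 40% raw lift on a feature only 1% of users touch
top_line = projected_lift(raw_lift=0.40, feature_usage=0.01)
# top_line is 0.004, i.e. 0.4%: the number the CEO actually sees
```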
&lt;p&gt;&lt;img src="/images/strata-2016/13.png" alt="13. slide" style="width: 650px;"/&gt;&lt;/p&gt;
&lt;p&gt;This point just says: remember. It connects back nicely to the first point, where I said you should ask yourself, do you already know? The example I gave there was about the green button, which had been tested and always performed better in emails than the blue one. The way to make sure your organization learns and doesn’t repeat experiments is to record them in a tool. That way you can look them up and reference them later. Referencing them is important, because people’s memory gets worse over time, so it’s important to have raw results that you can point to when making arguments. This also ties in with the previous point about standardized reports: once you have standardized the reporting, it’s easy to keep the reports around. If you don’t have a fancy tool, you can just use a standard Google Docs template and put it into a shared folder!&lt;/p&gt;
&lt;p&gt;&lt;img src="/images/strata-2016/14.png" alt="14. slide" style="width: 650px;"/&gt;&lt;/p&gt;
&lt;p&gt;There are a number of other topics around A/B testing that would be interesting to talk about; these were the ones that were the most important lessons for me over the last few years.&lt;/p&gt;
&lt;p&gt;&lt;img src="/images/strata-2016/15.png" alt="15. slide" style="width: 650px;"/&gt;&lt;/p&gt;
&lt;p&gt;Based on these lessons I constructed this simple flowchart of what a good A/B testing procedure looks like. There’s the Build-Measure-Learn cycle from Lean Startup; this is sort of an exploded view of that. Speed is very important: at large sophisticated websites only 1 in 10 experiments yields a clear lift, so you have to be fast to get to the wins.&lt;/p&gt;
&lt;h2&gt;References&lt;/h2&gt;
&lt;ul&gt;
&lt;li&gt;&lt;a href="https://prezi.com/sqzccuudw5hz/strata-2016/"&gt;Prezi of this talk&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="http://greenteapress.com/wp/think-bayes/"&gt;Think Bayes&lt;/a&gt;: free book&lt;/li&gt;
&lt;li&gt;&lt;a href="http://greenteapress.com/thinkstats/"&gt;Think Stats&lt;/a&gt;: free book&lt;/li&gt;
&lt;li&gt;&lt;a href="https://www.amazon.com/How-Measure-Anything-Intangibles-Business/dp/1118539273"&gt;How to Measure Anything&lt;/a&gt; - good book, though not specifically about A/B testing&lt;/li&gt;
&lt;li&gt;&lt;a href="http://www.evanmiller.org/"&gt;Evan Miller’s site&lt;/a&gt;: articles and tools about A/B testing&lt;/li&gt;
&lt;li&gt;&lt;a href="http://varianceexplained.org/"&gt;Variance Explained&lt;/a&gt;: articles about A/B testing&lt;/li&gt;
&lt;li&gt;&lt;a href="http://ai.stanford.edu/~ronnyk/ronnyk-bib.html"&gt;Ron Kohavi’s articles&lt;/a&gt;: he worked A/B testing at Bing&lt;/li&gt;
&lt;li&gt;&lt;a href="http://www.ncss.com/wp-content/themes/ncss/pdf/Procedures/PASS/Group-Sequential_Tests_for_Two_Means.pdf"&gt;Group Sequential Testing&lt;/a&gt;: how to take into account peeking in frequentist experiment design&lt;/li&gt;
&lt;li&gt;&lt;a href="https://www.youtube.com/watch?v=Eh00PoR76NY"&gt;Youtube: talk from the CEO of Optimizely on A/B testing lessons learned after 100,000+ experiments&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://www.youtube.com/watch?v=0dVIjWTI_A0"&gt;Youtube: A/B testing war stories from Etsy&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://www.youtube.com/watch?v=Auu9AnCozWQ"&gt;Youtube: A/B testing math from a Groupon statistician&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://www.youtube.com/watch?v=8DMU9h4dwi8"&gt;Youtube: introductory talk on Bayesian A/B testing&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://en.wikipedia.org/wiki/A/B_testing"&gt;Wikipedia: starting point on A/B testing&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;</content><category term="ab-testing"></category><category term="strata"></category><category term="statistics"></category><category term="data"></category></entry><entry><title>Hack, HHVM and avoiding the Second-system effect</title><link href="https://bytepawn.com/hack-hhvm-second-system-effect.html" rel="alternate"></link><published>2016-05-14T00:00:00+02:00</published><updated>2016-05-14T00:00:00+02:00</updated><author><name>Marton Trencseni</name></author><id>tag:bytepawn.com,2016-05-14:/hack-hhvm-second-system-effect.html</id><summary type="html">&lt;p&gt;I read this book on my first vacation after I started working at Facebook and thus became a semi-regular &lt;a href="http://hhvm.com/"&gt;Hack/HHVM&lt;/a&gt; user. I highly recommend reading (parts of) it. But not to learn Hack/PHP, which is irrelevant to most people. Instead, it’s to learn about how Facebook improved it’s www codebase and performance without rewriting the old PHP code in one big effort, and thus avoided the famous Second-system effect.&lt;br/&gt;&lt;br/&gt;&lt;img src="/images/hack_book.jpg" alt="Hack book"/&gt;&lt;/p&gt;</summary><content type="html">&lt;h2&gt;Introduction&lt;/h2&gt;
&lt;p&gt;I read &lt;a href="http://www.amazon.com/Hack-HHVM-Programming-Productivity-Breaking/dp/1491920874"&gt;Hack &amp;amp; HHVM—Programming Productivity without Breaking Things&lt;/a&gt; on my first vacation after I started working at Facebook and thus became a semi-regular Hack/HHVM user. I highly recommend reading (parts of) it. But not to learn Hack/PHP, which is irrelevant to most people. Instead, it’s to learn about how Facebook improved its www codebase and performance without rewriting the old PHP code in one big effort, and thus avoided the famous &lt;em&gt;Second-system effect&lt;/em&gt;.&lt;/p&gt;
&lt;p&gt;&lt;img alt="Hack &amp;amp; HHVM book" src="/images/hack_book.jpg" title="Hack &amp;amp; HHVM book"&gt;&lt;/p&gt;
&lt;h2&gt;Second-system effect&lt;/h2&gt;
&lt;p&gt;The second system effect was first described by Fred Brooks in &lt;a href="https://en.wikipedia.org/wiki/The_Mythical_Man-Month"&gt;The Mythical Man Month&lt;/a&gt;, based on his experiences managing operating system software development at IBM in the early 1960s:&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;The second-system effect proposes that, when an architect designs a second system, it is the most dangerous system they will ever design, because they will tend to incorporate all of the additions they originally did not add to the first system due to inherent time constraints. Thus, when embarking on a second system, an engineer should be mindful that they are susceptible to over-engineering it.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;Let me offer a more modern description: version 1.0 of the product/app/software is successful. Over time the programmers realize that, knowing what they know now, they could do a much better job. Meanwhile, the technology landscape changes, and it'd be nice to take advantage of the shiny new architectures, languages and frameworks available. So the team embarks on the quest to ship 2.0―a rewrite. Inevitably, even good teams will over-engineer, and the result will be a technological and project management mess. 2.0 projects like this miss their original ship dates by several years. Once 2.0 does ship, it’s buggy and slow, because unlike 1.0 it has seen no fine-tuning, since it hasn't been exposed to real-world usage yet. So several more years go by until 2.0 is also fine-tuned. At this point the new set of programmers―the cohort who joined after 1.0―can repeat the Second-system effect with 3.0, which for them will be the new 2.0. Rinse, repeat.&lt;/p&gt;
&lt;p&gt;I think within this book is a nice little lesson about how to avoid the Second-system effect. The book doesn't actually mention the Second-system effect, and I'm not implying anything about the history of the main www codebase at Facebook. I'm not saying that Facebook specifically did this to avoid the Second-system effect. It's just a lesson that I think can be extracted from the design decisions explained in the book.&lt;/p&gt;
&lt;h2&gt;PHP, Hack, HHVM&lt;/h2&gt;
&lt;p&gt;The story here is that Facebook started out as a PHP codebase. Over time the product became very successful, which meant that it was:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;very large (1M+ LOC)&lt;/li&gt;
&lt;li&gt;serving a large number of users&lt;/li&gt;
&lt;li&gt;being worked on by a large number of engineers&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;So there was a desire to:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;speed it up so it can serve more users per node&lt;/li&gt;
&lt;li&gt;make it easier for engineers to work on the code&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;I think that for many programmers (including yours truly) the instinctive reaction would have been to say "PHP sucks, it's slow and unsafe, let's rewrite the www codebase in a real programming language like Java and run on the JVM". What's interesting is that Facebook did not do this; Facebook did not discard PHP.&lt;/p&gt;
&lt;p&gt;Instead, Facebook decided to improve the layer below the application code to improve overall performance, and write new code in a way which takes advantage of the features of the improved layer (and very slowly deprecate old code). The "layer" here is actually many things:&lt;/p&gt;
&lt;ol&gt;
&lt;li&gt;Hack, a language like PHP, but much nicer&lt;/li&gt;
&lt;li&gt;a static type-checker for Hack&lt;/li&gt;
&lt;li&gt;HHVM, a runtime for Hack (and also regular PHP)&lt;/li&gt;
&lt;/ol&gt;
&lt;p&gt;Two notes are in order here:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;Historically, there was something called &lt;a href="https://en.wikipedia.org/wiki/HipHop_for_PHP"&gt;HPHPc&lt;/a&gt; before HHVM. It was a PHP-to-C++ compiler, but it’s no longer being used at Facebook.&lt;/li&gt;
&lt;li&gt;Hack and HHVM did not come about as a result of a committee sitting down, identifying the problem, scoping out the solutions, and picking one. They originated (both HPHPc and the Hack language) from &lt;a href="https://www.facebook.com/hackathon"&gt;Hackathons&lt;/a&gt;, an integral part of Facebook engineering culture, where individual engineers attacked problems they thought were promising and important.&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;My favorite features of Hack/HHVM:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;very fast&lt;/li&gt;
&lt;li&gt;100% interoperability with regular PHP code (eg. existing code)&lt;/li&gt;
&lt;li&gt;&lt;a href="https://docs.hhvm.com/hack/types/introduction"&gt;types&lt;/a&gt; (can also run regular PHP code in untyped mode)&lt;/li&gt;
&lt;li&gt;the &lt;a href="https://docs.hhvm.com/hack/typechecker/introduction"&gt;type-checker&lt;/a&gt; is very fast, millisecond response time even is very large codebases since it maintains state&lt;/li&gt;
&lt;li&gt;&lt;a href="https://docs.hhvm.com/hack/generics/introduction"&gt;generics&lt;/a&gt;, &lt;a href="https://docs.hhvm.com/hack/lambdas/introduction"&gt;lambdas&lt;/a&gt;, etc.&lt;/li&gt;
&lt;li&gt;&lt;a href="https://docs.hhvm.com/hack/async/introduction"&gt;async/await keyword for cooperative multitasking&lt;/a&gt;: this is very cool and worth reading up on. Essentially it's language/runtime level support for (single-threaded) event-driven architecture (epoll, kqueue, Completion Ports), so you don't have to explicitly manage the state like we did in the plain old C++ &lt;a href="https://github.com/scalien/scaliendb"&gt;ScalienDB codebase&lt;/a&gt;.&lt;/li&gt;
&lt;li&gt;&lt;a href="https://docs.hhvm.com/hack/XHP/introduction"&gt;XHP&lt;/a&gt;: the way to do www rendering safely (in the xss sense) in Hack, with language level support for XHTML and custom modules (eg. a Comments box)&lt;/li&gt;
&lt;/ul&gt;
&lt;h2&gt;Lesson learned&lt;/h2&gt;
&lt;p&gt;So the interesting lesson here is that a possible way out of the Second-system effect is to start improving the environment (language, runtime, frameworks, etc.) of the main codebase instead of rewriting the main codebase. I certainly don't think this is the solution; in many cases it cannot be applied, but it's something to keep in mind as a design pattern. Some of the challenges of this approach:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;You need a couple of really smart people who can design and implement a new language that's backwards compatible with existing code.&lt;/li&gt;
&lt;li&gt;You need to put sustained effort into it afterwards, keeping it mostly compatible with the standard version of the language.&lt;/li&gt;
&lt;li&gt;Every new engineer will need time to ramp up using the new language.&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;Another interesting aspect of this is the investment needed. Rewriting the whole application codebase is an all-in project, with all (or much) of the engineering team working on it. I assert that changing out the layers below and around it can be accomplished by a smaller, focused team, iteratively. It's a smaller bet. Writing HHVM was certainly a smaller effort than rewriting all of Facebook in Java would have been! Having said that, an organizational/management note: I do think you need a fairly large group of people to generate enough ideas (and Hackathon projects) so that some really good and impactful ones come out of it.&lt;/p&gt;
&lt;h2&gt;Conclusion&lt;/h2&gt;
&lt;p&gt;I will conclude this post with my personal impressions: working with Hack/HHVM is very pleasant. The type checker holds your hand all the way, so it feels much nicer/safer than eg. writing Python. The syntax is a bit unfortunate in places, but overall it’s a non-issue for me. I'd consider using Hack/HHVM for personal projects or a startup. It's completely open source, so anybody can use it for their projects.&lt;/p&gt;
&lt;p&gt;Thanks to &lt;a href="https://github.com/zsol"&gt;Zsolt Dollenstein&lt;/a&gt; for reviewing this blog post and giving valuable suggestions.&lt;/p&gt;
&lt;p&gt;Links:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;&lt;a href="http://hhvm.com/"&gt;HHVM main site&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://www.facebook.com/hhvm/"&gt;HHVM Facebook page&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://github.com/facebook/hhvm"&gt;HHVM on Github&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://blog.engineyard.com/2014/hhvm-hack"&gt;A series of posts about Hack/HHVM from 2014&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;If you find Hack interesting, check out &lt;a href="http://flowtype.org/"&gt;Flow, a static type checker for Javascript, by Facebook&lt;/a&gt;.&lt;/li&gt;
&lt;/ul&gt;</content><category term="books"></category><category term="programming"></category><category term="hhvm"></category><category term="brooks"></category></entry><entry><title>Einstein's amazing theory</title><link href="https://bytepawn.com/einsteins-amazing-theory.html" rel="alternate"></link><published>2016-02-16T00:00:00+01:00</published><updated>2016-02-16T00:00:00+01:00</updated><author><name>Marton Trencseni</name></author><id>tag:bytepawn.com,2016-02-16:/einsteins-amazing-theory.html</id><summary type="html">&lt;p&gt;This post is about the amazing success of Einstein's general theory of relativity. The theory predicts, among other things the accelerating Universe, black holes, gravitational lensing and gravitational waves. The real shocker is to remember that Einstein didn't invent general relativity to explain these. He didn’t know about these, they didn't exist at that time!&lt;/p&gt;</summary><content type="html">&lt;h2&gt;Introduction&lt;/h2&gt;
&lt;p&gt;Recently the &lt;a href="https://www.ligo.caltech.edu/"&gt;LIGO&lt;/a&gt; experiment &lt;a href="https://www.ligo.caltech.edu/news/ligo20160211"&gt;reported the measurement and  experimental verification of gravitational waves&lt;/a&gt;. This is a big deal and should get those involved in the experiment a &lt;a href="https://en.wikipedia.org/wiki/Nobel_Prize"&gt;Nobel prize&lt;/a&gt;. Great news for physics and our understanding of nature!&lt;/p&gt;
&lt;p&gt;From the press release:&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;Gravitational waves carry information about their dramatic origins and about the nature of gravity that cannot otherwise be obtained. Physicists have concluded that the detected gravitational waves were produced during the final fraction of a second of the merger of two black holes to produce a single, more massive spinning black hole. This collision of two black holes had been predicted but never observed.&lt;/p&gt;
&lt;p&gt;Based on the observed signals, LIGO scientists estimate that the black holes for this event were about 29 and 36 times the mass of the sun, and the event took place 1.3 billion years ago. About 3 times the mass of the sun was converted into gravitational waves in a fraction of a second—with a peak power output about 50 times that of the whole visible universe. By looking at the time of arrival of the signals—the detector in Livingston recorded the event 7 milliseconds before the detector in Hanford—scientists can say that the source was located in the Southern Hemisphere.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;This post is about the amazing success of the &lt;a href="https://en.wikipedia.org/wiki/General_relativity"&gt;theory of general relativity&lt;/a&gt;, the theory &lt;a href="https://en.wikipedia.org/wiki/Albert_Einstein"&gt;Einstein&lt;/a&gt; developed between 1907 and 1916.&lt;/p&gt;
&lt;h2&gt;Historical context&lt;/h2&gt;
&lt;p&gt;Let's start with some history and context. The 100 years between 1850 and 1950 were a tremendously fruitful time for physics. &lt;a href="https://en.wikipedia.org/wiki/Maxwell%27s_equations"&gt;Maxwell wrote down his famous equations&lt;/a&gt;, Einstein developed the &lt;a href="https://en.wikipedia.org/wiki/Special_relativity"&gt;special&lt;/a&gt; and the &lt;a href="https://en.wikipedia.org/wiki/General_relativity"&gt;general theory of relativity&lt;/a&gt;, &lt;a href="https://en.wikipedia.org/wiki/Quantum_mechanics"&gt;quantum mechanics&lt;/a&gt; and then &lt;a href="https://en.wikipedia.org/wiki/Quantum_field_theory"&gt;quantum field theories&lt;/a&gt; were invented.&lt;/p&gt;
&lt;p&gt;It is often said that given the Maxwell equations and their invariance under the &lt;a href="https://en.wikipedia.org/wiki/Lorentz_transformation"&gt;Lorentz transformation&lt;/a&gt;, somebody would have eventually extracted special relativity from them. The invention of &lt;a href="https://en.wikipedia.org/wiki/Quantum_mechanics"&gt;quantum mechanics&lt;/a&gt; (QM) came after experimental observations that needed a new theory to explain them. Various frameworks for QM were devised in parallel by a number of physicists (Heisenberg, Schrodinger, Dirac, Pauli, Bohr, Sommerfeld, Einstein, etc.). Quantum mechanics isn’t compatible with special relativity, so a few years later &lt;a href="https://en.wikipedia.org/wiki/Quantum_field_theory"&gt;quantum field theories&lt;/a&gt; came along (developed mostly by the same QM physicists, plus new kids on the block, like Feynman). QFTs are extensions of QM to take into account special relativity and the creation and destruction of particles and antiparticles. QFTs have predicted the existence of then-unobserved particles, but primarily have been constructed to model experimental observations, and have to be continuously patched and hacked to do so.&lt;/p&gt;
&lt;h2&gt;General relativity&lt;/h2&gt;
&lt;p&gt;Compared to the history of quantum theories, Einstein's invention of general relativity is very different and elevates Einstein into a class by himself. Einstein conducted &lt;a href="https://en.wikipedia.org/wiki/Thought_experiment"&gt;gedanken experiments&lt;/a&gt; and concluded &lt;em&gt;“this is how the Universe must work”&lt;/em&gt;. &lt;a href="http://www.astronomynotes.com/relativity/s3.htm"&gt;Here is a short description of his famous thought experiment involving elevators.&lt;/a&gt; His invention of general relativity was completely unexpected because from an experimental viewpoint it was "unnecessary": there were no experiments that needed to be explained. From &lt;a href="https://en.wikipedia.org/wiki/History_of_general_relativity"&gt;Wikipedia:&lt;/a&gt;&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;As Einstein later said, the reason for the development of general relativity was the preference of inertial motion within special relativity, while a theory which from the outset prefers no state of motion (even accelerated ones) appeared more satisfactory to him. So, while still working at the patent office in 1907, Einstein had what he would call his "happiest thought". He realized that the principle of relativity could be extended to gravitational fields. Consequently, in 1907 (published 1908) he wrote an article on acceleration under special relativity. In that article, he argued that free fall is really inertial motion, and that for a free falling observer the rules of special relativity must apply. This argument is called the Equivalence principle. In the same article, Einstein also predicted the phenomenon of gravitational time dilation.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;Einstein's theory was successful of course, and in the next 100 years turned out to predict, among other things: (i) &lt;a href="https://en.wikipedia.org/wiki/Accelerating_expansion_of_the_universe"&gt;the accelerating Universe&lt;/a&gt;, (ii) &lt;a href="https://en.wikipedia.org/wiki/Black_hole"&gt;black holes&lt;/a&gt;, (iii) &lt;a href="https://en.wikipedia.org/wiki/Gravitational_lens"&gt;gravitational lensing&lt;/a&gt; and (iv) &lt;a href="https://en.wikipedia.org/wiki/Gravitational_wave"&gt;gravitational waves&lt;/a&gt;. The real shocker is to remember that Einstein didn't invent general relativity to explain these things. He didn’t know about them: none of them had even been observed at the time!&lt;/p&gt;
&lt;p&gt;So how does thinking about inertial reference frames and accelerating observers lead one to a theory that somehow features black holes and gravitational waves? The original thought experiments were “just” arguments about what would happen in an elevator in space, in an elevator in a gravitational field, and so on. Einstein needed a mathematical framework which could be extended with some physics, ie. equations, and a mapping from the mathematical quantities to measurable quantities, and hence a way to connect the math to an understanding of what is being calculated. He found this in &lt;a href="https://en.wikipedia.org/wiki/Manifold"&gt;manifolds&lt;/a&gt; and &lt;a href="https://en.wikipedia.org/wiki/Differential_geometry"&gt;differential geometry&lt;/a&gt;, put the famous &lt;a href="https://en.wikipedia.org/wiki/Einstein_field_equations"&gt;Einstein field equations&lt;/a&gt; on top, and connected the resulting theory to the real world (eg. what in the mathematical framework corresponds to a real-world event or world line, how an observer perceives time and distance, etc). It is this theory—that treats space and time as a combined entity called &lt;a href="https://en.wikipedia.org/wiki/Spacetime"&gt;spacetime&lt;/a&gt;, modeled as a manifold—that amazingly predicts (i)-(iv): (i) spacetime is expanding at an accelerating rate; (ii) spacetime can have singularities; (iii) spacetime warps near a heavy object, photons follow the warped spacetime, hence an object behind a heavy object appears lensed; (iv) ripples in spacetime propagate as gravitational waves.&lt;/p&gt;
&lt;h2&gt;Conclusion&lt;/h2&gt;
&lt;p&gt;I’m not sure how successful I was in communicating the amazing fact that Einstein’s theory, developed from a very general but simple idea, predicts such a variety of mind-boggling phenomena, which were one after the other found to exist in nature. I recommend reading the &lt;a href="http://www.amazon.co.uk/Relativity-Special-General-Albert-Einstein/dp/1891396307/ref=la_B00BGN8B7O_1_1?s=books&amp;amp;ie=UTF8&amp;amp;qid=1455650031&amp;amp;sr=1-1"&gt;book Einstein wrote for the layman&lt;/a&gt;, which explains special and general relativity with his original thought experiments. Read the book, and then think about how, starting from those thought experiments, you could get to gravitational waves!&lt;/p&gt;
&lt;p&gt;It is often said that Einstein touched so many areas of physics, he could have received several Nobel prizes. An incomplete list of Einstein's more famous results:&lt;/p&gt;
&lt;ol&gt;
&lt;li&gt;&lt;a href="https://en.wikipedia.org/wiki/Brownian_motion"&gt;Brownian motion.&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;The special theory of relativity.&lt;/li&gt;
&lt;li&gt;The general theory of relativity.&lt;/li&gt;
&lt;li&gt;His contributions to quantum mechanics.&lt;/li&gt;
&lt;/ol&gt;
&lt;p&gt;Einstein did get a Nobel prize, the 1921 Nobel Prize in Physics (awarded in 1922), for the last item on the list: &lt;em&gt;“for his services to Theoretical Physics, and especially for his discovery of the law of the &lt;a href="https://en.wikipedia.org/wiki/Photoelectric_effect"&gt;photoelectric effect&lt;/a&gt;”&lt;/em&gt;.&lt;/p&gt;</content><category term="physics"></category><category term="einstein"></category><category term="relativity"></category></entry><entry><title>Heisengames and the importance of patience in business</title><link href="https://bytepawn.com/heisengames-business.html" rel="alternate"></link><published>2016-02-08T00:00:00+01:00</published><updated>2016-02-08T00:00:00+01:00</updated><author><name>Marton Trencseni</name></author><id>tag:bytepawn.com,2016-02-08:/heisengames-business.html</id><summary type="html">&lt;p&gt;Most bets businesses take, be it hiring, features, products or strategy, don't work out. Still, many businesses are successful despite setbacks. A negative attitude---even when the analysis of the situation is in fact correct---may be missing the bigger picture.&lt;/p&gt;</summary><content type="html">&lt;h2&gt;Introduction&lt;/h2&gt;
&lt;p&gt;Most bets businesses take, be it hiring, features, products or strategy, don't work out. Only a few of them are ever successful. Analytical employees often notice this and become negative: "We just hired executive E, she did X, Y and Z, and it's not working!", and similar sentiments.&lt;/p&gt;
&lt;p&gt;Still, many businesses are successful despite setbacks. A negative attitude---even when the analysis of the situation is in fact &lt;strong&gt;correct&lt;/strong&gt; and X, Y and Z were failures---may be missing the bigger picture. Putting aside the demotivating psychological aspect of negativity, &lt;em&gt;how can you be right and still miss the bigger picture?&lt;/em&gt;&lt;/p&gt;
&lt;h2&gt;Toy model&lt;/h2&gt;
&lt;p&gt;Let's model the business as a dice game you're playing at a casino. The game is simple: if you roll a six, you get &lt;code&gt;$X&lt;/code&gt;. The game costs &lt;code&gt;$1&lt;/code&gt; to play at the casino. Clearly, the breakeven point is &lt;code&gt;X=6&lt;/code&gt;.&lt;/p&gt;
&lt;p&gt;Suppose the casino offers the game at &lt;code&gt;X=7&lt;/code&gt;. In this case, &lt;strong&gt;you still lose 5 times out of 6&lt;/strong&gt;. However, &lt;strong&gt;when you win, you offset your losses, and in the long run you will make a lot of money&lt;/strong&gt;. (Unless you hit a losing streak and lose all your money, in which case you don't have any more money to keep playing the game; but let's ignore this.) This toy model illustrates the point about business and negativity: most bets will be "failures", but in the long run, it can still be a (very) profitable game!&lt;/p&gt;
&lt;p&gt;Of course, it's also possible that &lt;code&gt;X=5&lt;/code&gt;. In this case the negativity is justified, because even when we win, it doesn't offset the cost of playing the game. Unless...&lt;/p&gt;
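&lt;p&gt;The arithmetic behind the toy model can be sketched in a few lines of Python (the payout values, roll count and random seed are arbitrary choices for illustration):&lt;/p&gt;

```python
import random

def expected_value(x):
    # Analytic expected profit per $1 play: win $x with probability 1/6.
    return x / 6 - 1

def simulate(x, n_rolls=1_000_000, seed=0):
    # Monte Carlo check: pay $1 per roll, collect $x on a six,
    # return the average profit per roll.
    rng = random.Random(seed)
    profit = 0.0
    for _ in range(n_rolls):
        profit -= 1
        if rng.randint(1, 6) == 6:
            profit += x
    return profit / n_rolls

for x in (5, 6, 7):
    print(f"X={x}: analytic EV {expected_value(x):+.3f}, simulated {simulate(x):+.3f}")
```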
&lt;h2&gt;Heisengame&lt;/h2&gt;
&lt;p&gt;We can make the game even more interesting. Suppose &lt;code&gt;X=5&lt;/code&gt;, but every time you win, the casino increases the payout by a cent. In this case, most of the time you lose, and initially you also lose out on average, but if you keep it up long enough (eg. using venture capital), eventually the game will become (very) profitable! I call this Heisengame, because playing the game changes the game itself.&lt;/p&gt;
&lt;p&gt;We can also turn it around. Suppose &lt;code&gt;X=7&lt;/code&gt;, but every time you win, the casino decreases the payout by a cent. In this case, analysis will show that the game is currently profitable on average, but it won't be in the long run.&lt;/p&gt;
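&lt;p&gt;The Heisengame can be sketched the same way (a toy simulation; the 1-cent bump is from the example above, the number of rolls is arbitrary): start at X=5, a losing game, and let every win raise the payout. Eventually the payout crosses the breakeven point of 6 and the game turns profitable.&lt;/p&gt;

```python
import random

def heisengame(x=5.0, bump=0.01, n_rolls=100_000, seed=0):
    # Each play costs $1; a six pays the current x, after which the
    # casino raises x by `bump`: playing the game changes the game.
    rng = random.Random(seed)
    profit = 0.0
    for _ in range(n_rolls):
        profit -= 1
        if rng.randint(1, 6) == 6:
            profit += x
            x += bump
    return x, profit

final_x, profit = heisengame()
print(f"final payout: {final_x:.2f}, total profit: {profit:.0f}")
```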
&lt;h2&gt;Patience&lt;/h2&gt;
&lt;p&gt;In real life, the payout itself is also not fixed. You can model it like this: the casino may pay you a random amount when you roll a six, say according to a normal distribution centered on &lt;code&gt;X&lt;/code&gt;. In this case, you need to play even longer to learn what the game actually is, because you need to collect many data points to understand the payout function. &lt;em&gt;You need to have patience to understand the game.&lt;/em&gt;&lt;/p&gt;
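&lt;p&gt;How long do you need to play? A minimal sketch (assuming normally distributed payouts with a standard deviation of 3, an arbitrary choice): the standard error of the estimated mean payout shrinks as 1/sqrt(n), so telling an X=5 game apart from an X=7 game takes many observed wins.&lt;/p&gt;

```python
import random
import statistics

def estimate_payout(true_mean, sd=3.0, n_wins=100, seed=0):
    # You only observe the payout when you win; estimate its mean
    # and the standard error of that estimate from n_wins samples.
    rng = random.Random(seed)
    samples = [rng.gauss(true_mean, sd) for _ in range(n_wins)]
    mean = statistics.fmean(samples)
    stderr = statistics.stdev(samples) / len(samples) ** 0.5
    return mean, stderr

for n in (10, 100, 10_000):
    mean, stderr = estimate_payout(true_mean=7.0, n_wins=n)
    print(f"{n:>6} wins: estimated payout {mean:.2f} +/- {stderr:.2f}")
```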
&lt;h2&gt;Conclusion&lt;/h2&gt;
&lt;p&gt;These are just toy models, but they illustrate important points:&lt;/p&gt;
&lt;ol&gt;
&lt;li&gt;The fact that most bets are losers isn't a problem in itself. As long as the winners generate big enough wins, the game is worth playing. To know which game the business is playing, long-term data and analysis are needed.&lt;/li&gt;
&lt;li&gt;Business is a Heisengame: playing the game changes the rules of the game. Long-term data and analysis are needed to tell the direction of the change.&lt;/li&gt;
&lt;li&gt;In the real world, you don't know the parameters of the game. You need to be patient while you collect data and measure the parameters.&lt;/li&gt;
&lt;/ol&gt;
&lt;p&gt;So next time you see a bet not working out, consider that maybe the business just rolled a five. As long as you win in the long run, it's okay.&lt;/p&gt;</content><category term="heisengames"></category><category term="business"></category></entry><entry><title>Cloud9: Cloud coding that actually works</title><link href="https://bytepawn.com/cloud9.html" rel="alternate"></link><published>2016-02-07T00:00:00+01:00</published><updated>2016-02-07T00:00:00+01:00</updated><author><name>Marton Trencseni</name></author><id>tag:bytepawn.com,2016-02-07:/cloud9.html</id><summary type="html">&lt;p&gt;For the past 2 months I've been using Cloud9 for writing code in the cloud, and I can wholeheartedly recommend it: it just works for me. It's basically Docker plus an IDE: you get a Docker container running Ubuntu that you can access over a web IDE.&lt;/p&gt;</summary><content type="html">&lt;p&gt;I've always wondered when it will be practical to write code in the cloud (and not locally).
Most of the apps I use have already moved to the cloud: email, calendar, docs, music. Writing code is one of the last blockers to a true thin client approach for me (the other notable one being photo management).&lt;/p&gt;
&lt;p&gt;Recently I've been using &lt;a href="https://c9.io"&gt;Cloud9&lt;/a&gt; for writing code in the cloud, and I can wholeheartedly recommend it: it just works for me. It's basically &lt;a href="https://www.docker.com/"&gt;Docker&lt;/a&gt; plus an IDE: you get a Docker container running Ubuntu that you can access over a web IDE. Docker is a pretty standard thing, so there's not much to say there. The good news is that there's also not much to say about the IDE: &lt;strong&gt;it just works&lt;/strong&gt;.&lt;/p&gt;
&lt;p&gt;You can theme it; I have it set up to look and feel like Sublime. It has a nice treeview on the left where you can do simple file management. Shortcuts like copy/paste, find, etc. work as expected, as does full-text search. It's quite impressive: I've been using it for about 2 months and I haven't run into a single instance where the webapp/Javascript feel leaks through. Part of the IDE is the ability to open terminal consoles. Like the editing, the terminal also just works, including copy/paste, shortcuts, and so on. The one use-case where the IDE doesn't work is mobile: trying to use it over a touch interface is horrible. They're working on it, and this will get better over time.&lt;/p&gt;
&lt;p&gt;Some screenshots:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;&lt;a href="https://bytepawn.com/images/c9-1.png"&gt;Editing a Markdown file&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://bytepawn.com/images/c9-2.png"&gt;Terminal console&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;The one problem I kept hitting is that I'm used to Cmd-Tab'ing to get to Sublime, but if Cloud9 is running in Chrome, it's just a browser tab, and I couldn't get used to that. So I used &lt;a href="https://www.lessannoyingcrm.com/blog/2010/08/149/Create+application+shortcuts+in+Google+Chrome+on+a+Mac"&gt;this shell script&lt;/a&gt; to create a dedicated Cloud9 Chromium app.&lt;/p&gt;
&lt;p&gt;The container is accessible from the outside, so if you launch a webserver on localhost:8080, you can open it from your browser at https://&amp;lt;project&amp;gt;-&amp;lt;username&amp;gt;.c9users.io. At first I thought this was public, so anybody could see it, but they check whether you're logged into c9, so actually only you can see it---nice!&lt;/p&gt;
&lt;p&gt;The &lt;a href="https://c9.io/pricing/webide"&gt;business model&lt;/a&gt; is freemium. In the free tier you can run 1 private Docker container (with 1 GB of disk space); you can launch more, but they will be public, meaning anybody can see your files. For $20/month you get unlimited private containers and 10 GB disks (see &lt;a href="https://docs.c9.io/docs/reaching-resource-limits"&gt;reaching resource limits&lt;/a&gt;). I've been using the free tier, and right now I'm at 50% disk usage. Most of it is stuff I install to get something working (&lt;code&gt;apt-get&lt;/code&gt;, &lt;code&gt;pip install&lt;/code&gt;, etc).&lt;/p&gt;
&lt;p&gt;This also enables a nice "separation of concerns": you can work on your personal projects on a computer (eg. a work computer) without storing any files on it.&lt;/p&gt;</content><category term="coding"></category><category term="ide"></category><category term="c9"></category></entry><entry><title>Luigi vs Airflow vs Pinball</title><link href="https://bytepawn.com/luigi-airflow-pinball.html" rel="alternate"></link><published>2016-02-06T00:00:00+01:00</published><updated>2016-02-06T00:00:00+01:00</updated><author><name>Marton Trencseni</name></author><id>tag:bytepawn.com,2016-02-06:/luigi-airflow-pinball.html</id><summary type="html">&lt;p&gt;A spreadsheet comparing the three opensource workflow tools for ETL.&lt;br/&gt;&lt;br/&gt;&lt;img src="/images/airflow-luigi-pinball.png" alt="Comparison" style="width: 400px;"/&gt;&lt;/p&gt;</summary><content type="html">&lt;p&gt;After reviewing these three ETL workflow frameworks, I compiled a table comparing them. Here's the original &lt;a href="https://docs.google.com/spreadsheets/d/1KCXtuht_wZPFROFwdeg7IXrNPUhFI277y4h-xnc8mgk/edit#gid=0"&gt;Gdoc spreadsheet&lt;/a&gt;. If I had to build a new ETL system today from scratch, &lt;strong&gt;I would use Airflow&lt;/strong&gt;. If you find any mistakes, please let me know at &lt;a href="mailto:mtrencseni@gmail.com"&gt;mtrencseni@gmail.com&lt;/a&gt;.&lt;/p&gt;
&lt;style type="text/css"&gt;.ritz .waffle a { color: inherit; }.ritz .waffle .s1{border-bottom:1px SOLID #000000;text-align:center;font-weight:bold;color:#000000;background-color:#ffffff;font-family:'Arial';font-size:10pt;vertical-align:bottom;white-space:nowrap;direction:ltr;padding:2px 3px 2px 3px;}.ritz .waffle .s4{text-align:center;color:#000000;background-color:#fce5cd;font-family:'Arial';font-size:10pt;vertical-align:bottom;white-space:nowrap;direction:ltr;padding:2px 3px 2px 3px;}.ritz .waffle .s6{text-align:center;color:#000000;background-color:#d9ead3;font-family:'Arial';font-size:10pt;vertical-align:bottom;white-space:nowrap;direction:ltr;padding:2px 3px 2px 3px;}.ritz .waffle .s9{text-align:left;color:#000000;background-color:#ffffff;font-family:'arial';font-size:10pt;vertical-align:bottom;white-space:nowrap;direction:ltr;padding:2px 3px 2px 3px;}.ritz .waffle .s0{border-bottom:1px SOLID #000000;border-right:1px SOLID #000000;text-align:left;font-weight:bold;color:#000000;background-color:#ffffff;font-family:'Arial';font-size:10pt;vertical-align:bottom;white-space:nowrap;direction:ltr;padding:2px 3px 2px 3px;}.ritz .waffle .s3{text-align:center;text-decoration:underline;color:#1155cc;background-color:#ffffff;font-family:'Arial';font-size:10pt;vertical-align:bottom;white-space:nowrap;direction:ltr;padding:2px 3px 2px 3px;}.ritz .waffle .s5{text-align:center;color:#000000;background-color:#ffffff;font-family:'Arial';font-size:10pt;vertical-align:bottom;white-space:nowrap;direction:ltr;padding:2px 3px 2px 3px;}.ritz .waffle .s8{text-align:center;color:#000000;background-color:#d9ead3;font-family:'arial';font-size:10pt;vertical-align:bottom;white-space:nowrap;direction:ltr;padding:2px 3px 2px 3px;}.ritz .waffle .s2{border-right:1px SOLID #000000;text-align:left;color:#000000;background-color:#ffffff;font-family:'Arial';font-size:10pt;vertical-align:bottom;white-space:nowrap;direction:ltr;padding:2px 3px 2px 3px;}.ritz .waffle .s7{border-right:1px SOLID 
#000000;text-align:left;font-weight:bold;text-decoration:underline;color:#000000;background-color:#ffffff;font-family:'Arial';font-size:10pt;vertical-align:bottom;white-space:nowrap;direction:ltr;padding:2px 3px 2px 3px;}
.row-headers-background {display:none;}
.column-headers-background {display:none;}
.freezebar-cell {display:none;}
&lt;/style&gt;

&lt;div class="ritz grid-container" dir="ltr"&gt;&lt;table class="waffle" cellspacing="0" cellpadding="0"&gt;&lt;thead&gt;&lt;tr&gt;&lt;th class="row-header freezebar-vertical-handle"&gt;&lt;/th&gt;&lt;th id="0C0" style="width:183px" class="column-headers-background"&gt;A&lt;/th&gt;&lt;th id="0C1" style="width:295px" class="column-headers-background"&gt;B&lt;/th&gt;&lt;th id="0C2" style="width:329px" class="column-headers-background"&gt;C&lt;/th&gt;&lt;th id="0C3" style="width:295px" class="column-headers-background"&gt;D&lt;/th&gt;&lt;/tr&gt;&lt;/thead&gt;&lt;tbody&gt;&lt;tr style='height:20px;'&gt;&lt;th id="0R0" style="height: 20px;" class="row-headers-background"&gt;&lt;div class="row-header-wrapper" style="line-height: 20px;"&gt;1&lt;/div&gt;&lt;/th&gt;&lt;td class="s0"&gt;&lt;/td&gt;&lt;td class="s1" dir="ltr"&gt;Luigi&lt;/td&gt;&lt;td class="s1" dir="ltr"&gt;Airflow&lt;/td&gt;&lt;td class="s1" dir="ltr"&gt;Pinball&lt;/td&gt;&lt;/tr&gt;&lt;tr&gt;&lt;th style="height:3px" class="freezebar-cell freezebar-horizontal-handle"&gt;&lt;/th&gt;&lt;td class="freezebar-cell"&gt;&lt;/td&gt;&lt;td class="freezebar-cell"&gt;&lt;/td&gt;&lt;td class="freezebar-cell"&gt;&lt;/td&gt;&lt;td class="freezebar-cell"&gt;&lt;/td&gt;&lt;/tr&gt;&lt;tr style='height:20px;'&gt;&lt;th id="0R1" style="height: 20px;" class="row-headers-background"&gt;&lt;div class="row-header-wrapper" style="line-height: 20px;"&gt;2&lt;/div&gt;&lt;/th&gt;&lt;td class="s2" dir="ltr"&gt;repo&lt;/td&gt;&lt;td class="s3" dir="ltr"&gt;&lt;a target="_blank" href="https://github.com/spotify/luigi"&gt;https://github.com/spotify/luigi&lt;/a&gt;&lt;/td&gt;&lt;td class="s3" dir="ltr"&gt;&lt;a target="_blank" href="https://github.com/airbnb/airflow"&gt;https://github.com/airbnb/airflow&lt;/a&gt;&lt;/td&gt;&lt;td class="s3" dir="ltr"&gt;&lt;a target="_blank" href="https://github.com/pinterest/pinball"&gt;https://github.com/pinterest/pinball&lt;/a&gt;&lt;/td&gt;&lt;/tr&gt;&lt;tr style='height:20px;'&gt;&lt;th id="0R2" 
style="height: 20px;" class="row-headers-background"&gt;&lt;div class="row-header-wrapper" style="line-height: 20px;"&gt;3&lt;/div&gt;&lt;/th&gt;&lt;td class="s2" dir="ltr"&gt;docs&lt;/td&gt;&lt;td class="s3" dir="ltr"&gt;&lt;a target="_blank" href="http://luigi.readthedocs.org"&gt;http://luigi.readthedocs.org&lt;/a&gt;&lt;/td&gt;&lt;td class="s3" dir="ltr"&gt;&lt;a target="_blank" href="https://airflow.readthedocs.org"&gt;https://airflow.readthedocs.org&lt;/a&gt;&lt;/td&gt;&lt;td class="s4" dir="ltr"&gt;none&lt;/td&gt;&lt;/tr&gt;&lt;tr style='height:20px;'&gt;&lt;th id="0R3" style="height: 20px;" class="row-headers-background"&gt;&lt;div class="row-header-wrapper" style="line-height: 20px;"&gt;4&lt;/div&gt;&lt;/th&gt;&lt;td class="s2" dir="ltr"&gt;my review&lt;/td&gt;&lt;td class="s3" dir="ltr"&gt;&lt;a target="_blank" href="http://bytepawn.com/luigi.html"&gt;http://bytepawn.com/luigi.html&lt;/a&gt;&lt;/td&gt;&lt;td class="s3" dir="ltr"&gt;&lt;a target="_blank" href="http://bytepawn.com/airflow.html"&gt;http://bytepawn.com/airflow.html&lt;/a&gt;&lt;/td&gt;&lt;td class="s3" dir="ltr"&gt;&lt;a target="_blank" href="http://bytepawn.com/pinball.html"&gt;http://bytepawn.com/pinball.html&lt;/a&gt;&lt;/td&gt;&lt;/tr&gt;&lt;tr style='height:20px;'&gt;&lt;th id="0R4" style="height: 20px;" class="row-headers-background"&gt;&lt;div class="row-header-wrapper" style="line-height: 20px;"&gt;5&lt;/div&gt;&lt;/th&gt;&lt;td class="s2" dir="ltr"&gt;github forks&lt;/td&gt;&lt;td class="s5" dir="ltr"&gt;750&lt;/td&gt;&lt;td class="s5" dir="ltr"&gt;345&lt;/td&gt;&lt;td class="s5" dir="ltr"&gt;58&lt;/td&gt;&lt;/tr&gt;&lt;tr style='height:20px;'&gt;&lt;th id="0R5" style="height: 20px;" class="row-headers-background"&gt;&lt;div class="row-header-wrapper" style="line-height: 20px;"&gt;6&lt;/div&gt;&lt;/th&gt;&lt;td class="s2" dir="ltr"&gt;github stars&lt;/td&gt;&lt;td class="s5" dir="ltr"&gt;4029&lt;/td&gt;&lt;td class="s5" dir="ltr"&gt;1798&lt;/td&gt;&lt;td class="s5" 
dir="ltr"&gt;506&lt;/td&gt;&lt;/tr&gt;&lt;tr style='height:20px;'&gt;&lt;th id="0R6" style="height: 20px;" class="row-headers-background"&gt;&lt;div class="row-header-wrapper" style="line-height: 20px;"&gt;7&lt;/div&gt;&lt;/th&gt;&lt;td class="s2" dir="ltr"&gt;github watchers&lt;/td&gt;&lt;td class="s5" dir="ltr"&gt;319&lt;/td&gt;&lt;td class="s5" dir="ltr"&gt;166&lt;/td&gt;&lt;td class="s5" dir="ltr"&gt;47&lt;/td&gt;&lt;/tr&gt;&lt;tr style='height:20px;'&gt;&lt;th id="0R7" style="height: 20px;" class="row-headers-background"&gt;&lt;div class="row-header-wrapper" style="line-height: 20px;"&gt;8&lt;/div&gt;&lt;/th&gt;&lt;td class="s2" dir="ltr"&gt;commits in last 30 days&lt;/td&gt;&lt;td class="s6" dir="ltr"&gt;lots of commits&lt;/td&gt;&lt;td class="s6" dir="ltr"&gt;lots of commits&lt;/td&gt;&lt;td class="s4" dir="ltr"&gt;3 commits&lt;/td&gt;&lt;/tr&gt;&lt;tr style='height:20px;'&gt;&lt;th id="0R8" style="height: 20px;" class="row-headers-background"&gt;&lt;div class="row-header-wrapper" style="line-height: 20px;"&gt;9&lt;/div&gt;&lt;/th&gt;&lt;td class="s7" dir="ltr"&gt;architecture&lt;/td&gt;&lt;td class="s5" dir="ltr"&gt;&lt;/td&gt;&lt;td class="s5" dir="ltr"&gt;&lt;/td&gt;&lt;td class="s5" dir="ltr"&gt;&lt;/td&gt;&lt;/tr&gt;&lt;tr style='height:20px;'&gt;&lt;th id="0R9" style="height: 20px;" class="row-headers-background"&gt;&lt;div class="row-header-wrapper" style="line-height: 20px;"&gt;10&lt;/div&gt;&lt;/th&gt;&lt;td class="s2" dir="ltr"&gt;web dashboard&lt;/td&gt;&lt;td class="s4" dir="ltr"&gt;not really, minimal&lt;/td&gt;&lt;td class="s6" dir="ltr"&gt;very nice&lt;/td&gt;&lt;td class="s6" dir="ltr"&gt;yes&lt;/td&gt;&lt;/tr&gt;&lt;tr style='height:20px;'&gt;&lt;th id="0R10" style="height: 20px;" class="row-headers-background"&gt;&lt;div class="row-header-wrapper" style="line-height: 20px;"&gt;11&lt;/div&gt;&lt;/th&gt;&lt;td class="s2" dir="ltr"&gt;code/dsl&lt;/td&gt;&lt;td class="s5" dir="ltr"&gt;code&lt;/td&gt;&lt;td class="s5" 
dir="ltr"&gt;code&lt;/td&gt;&lt;td class="s5" dir="ltr"&gt;python dict + python code&lt;/td&gt;&lt;/tr&gt;&lt;tr style='height:20px;'&gt;&lt;th id="0R11" style="height: 20px;" class="row-headers-background"&gt;&lt;div class="row-header-wrapper" style="line-height: 20px;"&gt;12&lt;/div&gt;&lt;/th&gt;&lt;td class="s2" dir="ltr"&gt;files/datasets&lt;/td&gt;&lt;td class="s6" dir="ltr"&gt;yes, targets&lt;/td&gt;&lt;td class="s4" dir="ltr"&gt;not really, as special tasks&lt;/td&gt;&lt;td class="s5" dir="ltr"&gt;?&lt;/td&gt;&lt;/tr&gt;&lt;tr style='height:20px;'&gt;&lt;th id="0R12" style="height: 20px;" class="row-headers-background"&gt;&lt;div class="row-header-wrapper" style="line-height: 20px;"&gt;13&lt;/div&gt;&lt;/th&gt;&lt;td class="s2" dir="ltr"&gt;calendar scheduling&lt;/td&gt;&lt;td class="s4" dir="ltr"&gt;no, use cron&lt;/td&gt;&lt;td class="s6" dir="ltr"&gt;yes, LocalScheduler&lt;/td&gt;&lt;td class="s6" dir="ltr"&gt;yes&lt;/td&gt;&lt;/tr&gt;&lt;tr style='height:20px;'&gt;&lt;th id="0R13" style="height: 20px;" class="row-headers-background"&gt;&lt;div class="row-header-wrapper" style="line-height: 20px;"&gt;14&lt;/div&gt;&lt;/th&gt;&lt;td class="s2" dir="ltr"&gt;datadoc&amp;#39;able [1]&lt;/td&gt;&lt;td class="s5" dir="ltr"&gt;maybe, doesn&amp;#39;t really fit&lt;/td&gt;&lt;td class="s5" dir="ltr"&gt;probably, by convention&lt;/td&gt;&lt;td class="s6" dir="ltr"&gt;yes, dicts would be easy to parse&lt;/td&gt;&lt;/tr&gt;&lt;tr style='height:20px;'&gt;&lt;th id="0R14" style="height: 20px;" class="row-headers-background"&gt;&lt;div class="row-header-wrapper" style="line-height: 20px;"&gt;15&lt;/div&gt;&lt;/th&gt;&lt;td class="s2" dir="ltr"&gt;backfill jobs&lt;/td&gt;&lt;td class="s6" dir="ltr"&gt;yes&lt;/td&gt;&lt;td class="s6" dir="ltr"&gt;yes&lt;/td&gt;&lt;td class="s5" dir="ltr"&gt;?&lt;/td&gt;&lt;/tr&gt;&lt;tr style='height:20px;'&gt;&lt;th id="0R15" style="height: 20px;" class="row-headers-background"&gt;&lt;div class="row-header-wrapper" style="line-height: 
20px;"&gt;16&lt;/div&gt;&lt;/th&gt;&lt;td class="s2" dir="ltr"&gt;persists state&lt;/td&gt;&lt;td class="s4" dir="ltr"&gt;kindof&lt;/td&gt;&lt;td class="s6" dir="ltr"&gt;yes, to db&lt;/td&gt;&lt;td class="s6" dir="ltr"&gt;yes, to db&lt;/td&gt;&lt;/tr&gt;&lt;tr style='height:20px;'&gt;&lt;th id="0R16" style="height: 20px;" class="row-headers-background"&gt;&lt;div class="row-header-wrapper" style="line-height: 20px;"&gt;17&lt;/div&gt;&lt;/th&gt;&lt;td class="s2" dir="ltr"&gt;tracks history&lt;/td&gt;&lt;td class="s4" dir="ltr"&gt;yes&lt;/td&gt;&lt;td class="s6" dir="ltr"&gt;yes, in db&lt;/td&gt;&lt;td class="s6" dir="ltr"&gt;yes, in db&lt;/td&gt;&lt;/tr&gt;&lt;tr style='height:20px;'&gt;&lt;th id="0R17" style="height: 20px;" class="row-headers-background"&gt;&lt;div class="row-header-wrapper" style="line-height: 20px;"&gt;18&lt;/div&gt;&lt;/th&gt;&lt;td class="s2" dir="ltr"&gt;code shipping&lt;/td&gt;&lt;td class="s5" dir="ltr"&gt;no&lt;/td&gt;&lt;td class="s6" dir="ltr"&gt;yes, pickle&lt;/td&gt;&lt;td class="s4" dir="ltr"&gt;workflow is shipped using pickle, jobs are not?&lt;/td&gt;&lt;/tr&gt;&lt;tr style='height:20px;'&gt;&lt;th id="0R18" style="height: 20px;" class="row-headers-background"&gt;&lt;div class="row-header-wrapper" style="line-height: 20px;"&gt;19&lt;/div&gt;&lt;/th&gt;&lt;td class="s2" dir="ltr"&gt;priorities&lt;/td&gt;&lt;td class="s6" dir="ltr"&gt;yes&lt;/td&gt;&lt;td class="s6" dir="ltr"&gt;yes&lt;/td&gt;&lt;td class="s5" dir="ltr"&gt;?&lt;/td&gt;&lt;/tr&gt;&lt;tr style='height:20px;'&gt;&lt;th id="0R19" style="height: 20px;" class="row-headers-background"&gt;&lt;div class="row-header-wrapper" style="line-height: 20px;"&gt;20&lt;/div&gt;&lt;/th&gt;&lt;td class="s2" dir="ltr"&gt;parallelism&lt;/td&gt;&lt;td class="s6" dir="ltr"&gt;yes, workers, threads per workers&lt;/td&gt;&lt;td class="s6" dir="ltr"&gt;yes, workers&lt;/td&gt;&lt;td class="s5" dir="ltr"&gt;?&lt;/td&gt;&lt;/tr&gt;&lt;tr style='height:20px;'&gt;&lt;th id="0R20" style="height: 20px;" 
class="row-headers-background"&gt;&lt;div class="row-header-wrapper" style="line-height: 20px;"&gt;21&lt;/div&gt;&lt;/th&gt;&lt;td class="s2" dir="ltr"&gt;control parallelism&lt;/td&gt;&lt;td class="s6" dir="ltr"&gt;yes, resources&lt;/td&gt;&lt;td class="s6" dir="ltr"&gt;yes, pools&lt;/td&gt;&lt;td class="s5" dir="ltr"&gt;?&lt;/td&gt;&lt;/tr&gt;&lt;tr style='height:20px;'&gt;&lt;th id="0R21" style="height: 20px;" class="row-headers-background"&gt;&lt;div class="row-header-wrapper" style="line-height: 20px;"&gt;22&lt;/div&gt;&lt;/th&gt;&lt;td class="s2" dir="ltr"&gt;cross-dag deps&lt;/td&gt;&lt;td class="s6" dir="ltr"&gt;yes, using targets&lt;/td&gt;&lt;td class="s6" dir="ltr"&gt;yes, using sensors&lt;/td&gt;&lt;td class="s6" dir="ltr"&gt;yes&lt;/td&gt;&lt;/tr&gt;&lt;tr style='height:20px;'&gt;&lt;th id="0R22" style="height: 20px;" class="row-headers-background"&gt;&lt;div class="row-header-wrapper" style="line-height: 20px;"&gt;23&lt;/div&gt;&lt;/th&gt;&lt;td class="s2" dir="ltr"&gt;finds new deployed tasks&lt;/td&gt;&lt;td class="s4" dir="ltr"&gt;no&lt;/td&gt;&lt;td class="s6" dir="ltr"&gt;yes&lt;/td&gt;&lt;td class="s5" dir="ltr"&gt;?&lt;/td&gt;&lt;/tr&gt;&lt;tr style='height:20px;'&gt;&lt;th id="0R23" style="height: 20px;" class="row-headers-background"&gt;&lt;div class="row-header-wrapper" style="line-height: 20px;"&gt;24&lt;/div&gt;&lt;/th&gt;&lt;td class="s2" dir="ltr"&gt;executes dag&lt;/td&gt;&lt;td class="s4" dir="ltr"&gt;no, have to create special sink task&lt;/td&gt;&lt;td class="s6" dir="ltr"&gt;yes&lt;/td&gt;&lt;td class="s6" dir="ltr"&gt;yes&lt;/td&gt;&lt;/tr&gt;&lt;tr style='height:20px;'&gt;&lt;th id="0R24" style="height: 20px;" class="row-headers-background"&gt;&lt;div class="row-header-wrapper" style="line-height: 20px;"&gt;25&lt;/div&gt;&lt;/th&gt;&lt;td class="s2" dir="ltr"&gt;multiple dags&lt;/td&gt;&lt;td class="s4" dir="ltr"&gt;no, just one&lt;/td&gt;&lt;td class="s6" dir="ltr"&gt;yes, also several dag instances (dagruns)&lt;/td&gt;&lt;td 
class="s6" dir="ltr"&gt;yes&lt;/td&gt;&lt;/tr&gt;&lt;tr style='height:20px;'&gt;&lt;th id="0R25" style="height: 20px;" class="row-headers-background"&gt;&lt;div class="row-header-wrapper" style="line-height: 20px;"&gt;26&lt;/div&gt;&lt;/th&gt;&lt;td class="s7" dir="ltr"&gt;scheduler/workers&lt;/td&gt;&lt;td class="s5" dir="ltr"&gt;&lt;/td&gt;&lt;td class="s5" dir="ltr"&gt;&lt;/td&gt;&lt;td&gt;&lt;/td&gt;&lt;/tr&gt;&lt;tr style='height:20px;'&gt;&lt;th id="0R26" style="height: 20px;" class="row-headers-background"&gt;&lt;div class="row-header-wrapper" style="line-height: 20px;"&gt;27&lt;/div&gt;&lt;/th&gt;&lt;td class="s2" dir="ltr"&gt;starting workers&lt;/td&gt;&lt;td class="s4" dir="ltr"&gt;users start worker procceses&lt;/td&gt;&lt;td class="s6" dir="ltr"&gt;scheduler spawns workers processes&lt;/td&gt;&lt;td class="s4" dir="ltr"&gt;users start worker procceses&lt;/td&gt;&lt;/tr&gt;&lt;tr style='height:20px;'&gt;&lt;th id="0R27" style="height: 20px;" class="row-headers-background"&gt;&lt;div class="row-header-wrapper" style="line-height: 20px;"&gt;28&lt;/div&gt;&lt;/th&gt;&lt;td class="s2" dir="ltr"&gt;comms&lt;/td&gt;&lt;td class="s4" dir="ltr"&gt;scheduler&amp;#39;s HTTP API&lt;/td&gt;&lt;td class="s8" dir="ltr"&gt;minimal, in state db&lt;/td&gt;&lt;td class="s4" dir="ltr"&gt;through master module using Swift&lt;/td&gt;&lt;/tr&gt;&lt;tr style='height:20px;'&gt;&lt;th id="0R28" style="height: 20px;" class="row-headers-background"&gt;&lt;div class="row-header-wrapper" style="line-height: 20px;"&gt;29&lt;/div&gt;&lt;/th&gt;&lt;td class="s2" dir="ltr"&gt;workers execute&lt;/td&gt;&lt;td class="s4" dir="ltr"&gt;worker can execute tasks that is has locally&lt;/td&gt;&lt;td class="s8" dir="ltr"&gt;worker reads pickled tasks from db&lt;/td&gt;&lt;td class="s4" dir="ltr"&gt;worker can execute tasks that is has locally?&lt;/td&gt;&lt;/tr&gt;&lt;tr style='height:20px;'&gt;&lt;th id="0R29" style="height: 20px;" class="row-headers-background"&gt;&lt;div 
class="row-header-wrapper" style="line-height: 20px;"&gt;30&lt;/div&gt;&lt;/th&gt;&lt;td class="s7" dir="ltr"&gt;contrib&lt;/td&gt;&lt;td class="s5" dir="ltr"&gt;&lt;/td&gt;&lt;td class="s5" dir="ltr"&gt;&lt;/td&gt;&lt;td class="s5"&gt;&lt;/td&gt;&lt;/tr&gt;&lt;tr style='height:20px;'&gt;&lt;th id="0R30" style="height: 20px;" class="row-headers-background"&gt;&lt;div class="row-header-wrapper" style="line-height: 20px;"&gt;31&lt;/div&gt;&lt;/th&gt;&lt;td class="s2" dir="ltr"&gt;hadoop&lt;/td&gt;&lt;td class="s6" dir="ltr"&gt;yes&lt;/td&gt;&lt;td class="s6" dir="ltr"&gt;yes&lt;/td&gt;&lt;td class="s6" dir="ltr"&gt;yes&lt;/td&gt;&lt;/tr&gt;&lt;tr style='height:20px;'&gt;&lt;th id="0R31" style="height: 20px;" class="row-headers-background"&gt;&lt;div class="row-header-wrapper" style="line-height: 20px;"&gt;32&lt;/div&gt;&lt;/th&gt;&lt;td class="s2" dir="ltr"&gt;pig&lt;/td&gt;&lt;td class="s6" dir="ltr"&gt;yes&lt;/td&gt;&lt;td class="s4" dir="ltr"&gt;doc mentions PigOperator, it&amp;#39;s not in the source&lt;/td&gt;&lt;td class="s4" dir="ltr"&gt;no&lt;/td&gt;&lt;/tr&gt;&lt;tr style='height:20px;'&gt;&lt;th id="0R32" style="height: 20px;" class="row-headers-background"&gt;&lt;div class="row-header-wrapper" style="line-height: 20px;"&gt;33&lt;/div&gt;&lt;/th&gt;&lt;td class="s2" dir="ltr"&gt;hive&lt;/td&gt;&lt;td class="s6" dir="ltr"&gt;yes&lt;/td&gt;&lt;td class="s6" dir="ltr"&gt;yes&lt;/td&gt;&lt;td class="s6" dir="ltr"&gt;yes&lt;/td&gt;&lt;/tr&gt;&lt;tr style='height:20px;'&gt;&lt;th id="0R33" style="height: 20px;" class="row-headers-background"&gt;&lt;div class="row-header-wrapper" style="line-height: 20px;"&gt;34&lt;/div&gt;&lt;/th&gt;&lt;td class="s2" dir="ltr"&gt;pgsql&lt;/td&gt;&lt;td class="s6" dir="ltr"&gt;yes&lt;/td&gt;&lt;td class="s6" dir="ltr"&gt;yes&lt;/td&gt;&lt;td class="s4" dir="ltr"&gt;no&lt;/td&gt;&lt;/tr&gt;&lt;tr style='height:20px;'&gt;&lt;th id="0R34" style="height: 20px;" class="row-headers-background"&gt;&lt;div class="row-header-wrapper" 
style="line-height: 20px;"&gt;35&lt;/div&gt;&lt;/th&gt;&lt;td class="s2" dir="ltr"&gt;mysql&lt;/td&gt;&lt;td class="s6" dir="ltr"&gt;yes&lt;/td&gt;&lt;td class="s6" dir="ltr"&gt;yes&lt;/td&gt;&lt;td class="s4" dir="ltr"&gt;no&lt;/td&gt;&lt;/tr&gt;&lt;tr style='height:20px;'&gt;&lt;th id="0R35" style="height: 20px;" class="row-headers-background"&gt;&lt;div class="row-header-wrapper" style="line-height: 20px;"&gt;36&lt;/div&gt;&lt;/th&gt;&lt;td class="s2" dir="ltr"&gt;redshift&lt;/td&gt;&lt;td class="s6" dir="ltr"&gt;yes&lt;/td&gt;&lt;td class="s4" dir="ltr"&gt;no&lt;/td&gt;&lt;td class="s4" dir="ltr"&gt;no&lt;/td&gt;&lt;/tr&gt;&lt;tr style='height:20px;'&gt;&lt;th id="0R36" style="height: 20px;" class="row-headers-background"&gt;&lt;div class="row-header-wrapper" style="line-height: 20px;"&gt;37&lt;/div&gt;&lt;/th&gt;&lt;td class="s2" dir="ltr"&gt;s3&lt;/td&gt;&lt;td class="s6" dir="ltr"&gt;yes&lt;/td&gt;&lt;td class="s6" dir="ltr"&gt;yes&lt;/td&gt;&lt;td class="s6" dir="ltr"&gt;yes&lt;/td&gt;&lt;/tr&gt;&lt;tr style='height:20px;'&gt;&lt;th id="0R37" style="height: 20px;" class="row-headers-background"&gt;&lt;div class="row-header-wrapper" style="line-height: 20px;"&gt;38&lt;/div&gt;&lt;/th&gt;&lt;td class="s7" dir="ltr"&gt;source&lt;/td&gt;&lt;td class="s5"&gt;&lt;/td&gt;&lt;td class="s5"&gt;&lt;/td&gt;&lt;td class="s5"&gt;&lt;/td&gt;&lt;/tr&gt;&lt;tr style='height:20px;'&gt;&lt;th id="0R38" style="height: 20px;" class="row-headers-background"&gt;&lt;div class="row-header-wrapper" style="line-height: 20px;"&gt;39&lt;/div&gt;&lt;/th&gt;&lt;td class="s2" dir="ltr"&gt;written in&lt;/td&gt;&lt;td class="s5" dir="ltr"&gt;python&lt;/td&gt;&lt;td class="s5" dir="ltr"&gt;python&lt;/td&gt;&lt;td class="s5" dir="ltr"&gt;python&lt;/td&gt;&lt;/tr&gt;&lt;tr style='height:20px;'&gt;&lt;th id="0R39" style="height: 20px;" class="row-headers-background"&gt;&lt;div class="row-header-wrapper" style="line-height: 20px;"&gt;40&lt;/div&gt;&lt;/th&gt;&lt;td class="s2" 
dir="ltr"&gt;loc&lt;/td&gt;&lt;td class="s5" dir="ltr"&gt;18,000&lt;/td&gt;&lt;td class="s5" dir="ltr"&gt;21,000&lt;/td&gt;&lt;td class="s5" dir="ltr"&gt;18,000&lt;/td&gt;&lt;/tr&gt;&lt;tr style='height:20px;'&gt;&lt;th id="0R40" style="height: 20px;" class="row-headers-background"&gt;&lt;div class="row-header-wrapper" style="line-height: 20px;"&gt;41&lt;/div&gt;&lt;/th&gt;&lt;td class="s2" dir="ltr"&gt;tests&lt;/td&gt;&lt;td class="s6" dir="ltr"&gt;lots&lt;/td&gt;&lt;td class="s4" dir="ltr"&gt;minimal&lt;/td&gt;&lt;td class="s6" dir="ltr"&gt;lots&lt;/td&gt;&lt;/tr&gt;&lt;tr style='height:20px;'&gt;&lt;th id="0R41" style="height: 20px;" class="row-headers-background"&gt;&lt;div class="row-header-wrapper" style="line-height: 20px;"&gt;42&lt;/div&gt;&lt;/th&gt;&lt;td class="s2" dir="ltr"&gt;maturity&lt;/td&gt;&lt;td class="s6" dir="ltr"&gt;fair&lt;/td&gt;&lt;td class="s4" dir="ltr"&gt;low&lt;/td&gt;&lt;td class="s4" dir="ltr"&gt;low&lt;/td&gt;&lt;/tr&gt;&lt;tr style='height:20px;'&gt;&lt;th id="0R42" style="height: 20px;" class="row-headers-background"&gt;&lt;div class="row-header-wrapper" style="line-height: 20px;"&gt;43&lt;/div&gt;&lt;/th&gt;&lt;td class="s2" dir="ltr"&gt;other serious users&lt;/td&gt;&lt;td class="s6" dir="ltr"&gt;yes&lt;/td&gt;&lt;td class="s4" dir="ltr"&gt;not really&lt;/td&gt;&lt;td class="s4" dir="ltr"&gt;no&lt;/td&gt;&lt;/tr&gt;&lt;tr style='height:20px;'&gt;&lt;th id="0R43" style="height: 20px;" class="row-headers-background"&gt;&lt;div class="row-header-wrapper" style="line-height: 20px;"&gt;44&lt;/div&gt;&lt;/th&gt;&lt;td class="s2" dir="ltr"&gt;pip install&lt;/td&gt;&lt;td class="s6" dir="ltr"&gt;yes&lt;/td&gt;&lt;td class="s6" dir="ltr"&gt;yes&lt;/td&gt;&lt;td class="s4" dir="ltr"&gt;broken&lt;/td&gt;&lt;/tr&gt;&lt;tr style='height:20px;'&gt;&lt;th id="0R44" style="height: 20px;" class="row-headers-background"&gt;&lt;div class="row-header-wrapper" style="line-height: 20px;"&gt;45&lt;/div&gt;&lt;/th&gt;&lt;td class="s7" 
dir="ltr"&gt;niceties&lt;/td&gt;&lt;td class="s4" dir="ltr"&gt;-&lt;/td&gt;&lt;td class="s6" dir="ltr"&gt;sla, xcom, variables, trigger rules, celery, charts&lt;/td&gt;&lt;td class="s4" dir="ltr"&gt;pass data between jobs&lt;/td&gt;&lt;/tr&gt;&lt;tr style='height:20px;'&gt;&lt;th id="0R45" style="height: 20px;" class="row-headers-background"&gt;&lt;div class="row-header-wrapper" style="line-height: 20px;"&gt;46&lt;/div&gt;&lt;/th&gt;&lt;td class="s7" dir="ltr"&gt;does it for you&lt;/td&gt;&lt;td class="s5" dir="ltr"&gt;&lt;/td&gt;&lt;td class="s5"&gt;&lt;/td&gt;&lt;td class="s5"&gt;&lt;/td&gt;&lt;/tr&gt;&lt;tr style='height:20px;'&gt;&lt;th id="0R46" style="height: 20px;" class="row-headers-background"&gt;&lt;div class="row-header-wrapper" style="line-height: 20px;"&gt;47&lt;/div&gt;&lt;/th&gt;&lt;td class="s2" dir="ltr"&gt;sync tasks to workers&lt;/td&gt;&lt;td class="s4" dir="ltr"&gt;no&lt;/td&gt;&lt;td class="s6" dir="ltr"&gt;yes&lt;/td&gt;&lt;td class="s4" dir="ltr"&gt;no&lt;/td&gt;&lt;/tr&gt;&lt;tr style='height:20px;'&gt;&lt;th id="0R47" style="height: 20px;" class="row-headers-background"&gt;&lt;div class="row-header-wrapper" style="line-height: 20px;"&gt;48&lt;/div&gt;&lt;/th&gt;&lt;td class="s2" dir="ltr"&gt;scheduling&lt;/td&gt;&lt;td class="s4" dir="ltr"&gt;no&lt;/td&gt;&lt;td class="s6" dir="ltr"&gt;yes&lt;/td&gt;&lt;td class="s6" dir="ltr"&gt;yes&lt;/td&gt;&lt;/tr&gt;&lt;tr style='height:20px;'&gt;&lt;th id="0R48" style="height: 20px;" class="row-headers-background"&gt;&lt;div class="row-header-wrapper" style="line-height: 20px;"&gt;49&lt;/div&gt;&lt;/th&gt;&lt;td class="s2" dir="ltr"&gt;monitoring&lt;/td&gt;&lt;td class="s4" dir="ltr"&gt;no&lt;/td&gt;&lt;td class="s4" dir="ltr"&gt;no&lt;/td&gt;&lt;td class="s4" dir="ltr"&gt;no&lt;/td&gt;&lt;/tr&gt;&lt;tr style='height:20px;'&gt;&lt;th id="0R49" style="height: 20px;" class="row-headers-background"&gt;&lt;div class="row-header-wrapper" style="line-height: 20px;"&gt;50&lt;/div&gt;&lt;/th&gt;&lt;td 
class="s2" dir="ltr"&gt;alerting&lt;/td&gt;&lt;td class="s4" dir="ltr"&gt;no&lt;/td&gt;&lt;td class="s4" dir="ltr"&gt;slas, but probably not enough&lt;/td&gt;&lt;td class="s4" dir="ltr"&gt;sends emails&lt;/td&gt;&lt;/tr&gt;&lt;tr style='height:20px;'&gt;&lt;th id="0R50" style="height: 20px;" class="row-headers-background"&gt;&lt;div class="row-header-wrapper" style="line-height: 20px;"&gt;51&lt;/div&gt;&lt;/th&gt;&lt;td class="s2" dir="ltr"&gt;dashboards&lt;/td&gt;&lt;td class="s4" dir="ltr"&gt;no&lt;/td&gt;&lt;td class="s6" dir="ltr"&gt;yes&lt;/td&gt;&lt;td class="s4" dir="ltr"&gt;yes&lt;/td&gt;&lt;/tr&gt;&lt;/tbody&gt;&lt;/table&gt;&lt;/div&gt;


&lt;p&gt;[1] By datadoc'able I mean: could you write a script that reads and parses the ETL jobs and generates nice documentation about your datasets and which ETL jobs read/write them. At Prezi we did this; we called it datadoc.&lt;/p&gt;</content><category term="data"></category><category term="etl"></category><category term="workflow"></category><category term="luigi"></category><category term="airflow"></category><category term="pinball"></category></entry><entry><title>Pinball review</title><link href="https://bytepawn.com/pinball.html" rel="alternate"></link><published>2016-02-06T00:00:00+01:00</published><updated>2016-02-06T00:00:00+01:00</updated><author><name>Marton Trencseni</name></author><id>tag:bytepawn.com,2016-02-06:/pinball.html</id><summary type="html">&lt;p&gt;Pinball is an ETL tool written by Pinterest. Like Airflow, it supports defining tasks and dependencies as Python code, executing and scheduling them, and distributing tasks across worker nodes. It supports calendar scheduling (hourly/daily jobs, also visualized on the web dashboard). Unfortunately, I found Pinball has very little documentation, very few recent commits in the Github repo and few meaningful answers to Github issues by maintainers, while its architecture is complicated and undocumented.&lt;/p&gt;</summary><content type="html">&lt;h2&gt;Introduction&lt;/h2&gt;
&lt;p&gt;&lt;a href="https://github.com/pinterest/pinball"&gt;Pinball&lt;/a&gt; is Pinterest’s open sourced workflow manager / ETL system. It supports defining several workflows (DAGs) consisting of jobs, and dependencies within jobs. Workflows are defined using a combination of declarative-style Python dictionary objects (like JSON) and Python code referenced in these objects. Pinball comes with a dashboard for checking currently running and past workflows.&lt;/p&gt;
&lt;p&gt;This review will be shorter than the previous &lt;a href="/luigi.html"&gt;Luigi&lt;/a&gt; and &lt;a href="/airflow.html"&gt;Airflow&lt;/a&gt; reviews, because Pinball turned out to be not very interesting to me for the following reasons:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;Very little &lt;a href="https://github.com/pinterest/pinball#installation"&gt;documentation&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;Very &lt;a href="https://github.com/pinterest/pinball/commits/master"&gt;few recent commits&lt;/a&gt; in the Github repo&lt;/li&gt;
&lt;li&gt;Very &lt;a href="https://github.com/pinterest/pinball/issues?utf8=%E2%9C%93&amp;amp;q=is%3Aissue"&gt;few meaningful answers&lt;/a&gt; to Github issues from the maintainers&lt;/li&gt;
&lt;li&gt;Complicated and undocumented architecture&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;Unfortunately &lt;code&gt;pip install pinball&lt;/code&gt; doesn’t work and &lt;a href="https://github.com/pinterest/pinball/issues/9"&gt;the maintainers don’t care&lt;/a&gt;, so I didn't invest time in actually trying out Pinball; I just read the source code. Since this review is short and opinionated, I recommend also reading the Pinterest posts:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;&lt;a href="https://engineering.pinterest.com/blog/pinball-building-workflow-management"&gt;Pinball: Building workflow management&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://engineering.pinterest.com/blog/open-sourcing-pinball"&gt;Open-sourcing Pinball&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;h2&gt;Architecture&lt;/h2&gt;
&lt;p&gt;Pinball has a modularized architecture. There are 5 modules:&lt;/p&gt;
&lt;ol&gt;
&lt;li&gt;Master (sits on the DB)&lt;/li&gt;
&lt;li&gt;Scheduler (also accesses the DB)&lt;/li&gt;
&lt;li&gt;Worker (also accesses the DB)&lt;/li&gt;
&lt;li&gt;UI web server (also accesses the DB)&lt;/li&gt;
&lt;li&gt;Command-line&lt;/li&gt;
&lt;/ol&gt;
&lt;p&gt;The master module sits on top of a &lt;a href="https://www.mysql.com/"&gt;Mysql&lt;/a&gt; database (no others supported) and uses &lt;a href="https://www.djangoproject.com/"&gt;Django&lt;/a&gt; for ORM. The master exposes a synchronization token API using &lt;a href="https://thrift.apache.org/"&gt;Thrift&lt;/a&gt; to the other modules, and that’s all the master does. I think this is an unnecessary layer of abstraction; the Airflow design decision is better: everybody sees the DB and uses it to communicate, getting &lt;a href="https://en.wikipedia.org/wiki/ACID"&gt;ACID&lt;/a&gt; for free; no need to define and maintain an API, no need for Thrift. In the blog post, they say &lt;em&gt;“component-wise design allows for easy alterations”&lt;/em&gt;, eg. you could write a different scheduler implementation. But:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;Who’d ever want to write a different scheduler implementation? I'm using an open-source project to avoid writing my own ETL system.&lt;/li&gt;
&lt;li&gt;You can change the code in other architectures too, as long as it’s modularized.&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;Moving on, the other daemon modules are the scheduler, the worker and the UI web server. The scheduler performs calendar scheduling of workflows. The workers actually execute individual jobs.&lt;/p&gt;
&lt;p&gt;An important piece of the Pinball architecture is the token. Tokens are basically records, and the collection of all tokens is the system state. Unfortunately the different sorts of tokens are not documented, and since Python is dynamic, there’s no usable documentation in the code either (no equivalent of, say, a C++ header file). Tokens have a &lt;code&gt;data&lt;/code&gt; member, and Python objects are pickled and stored there on the fly as the state.
At first when I read the blog posts and code, I saw &lt;a href="https://github.com/pinterest/pinball/blob/master/pinball_system.png"&gt;this diagram&lt;/a&gt; and &lt;a href="https://engineering.pinterest.com/sites/default/files/article/fields/field_image/tumblr_inline_mzxiegqh5c1s1gqll.png"&gt;then this&lt;/a&gt;, and I thought that only the master accesses the database, and the scheduler and workers don’t: everything goes through the master using tokens. But actually that’s not true; I think the architecture is that everybody reads the database (as an optimization), but only the master writes to it. This seems like a leaky abstraction, and again it’s not clear why the modules can’t use the DB to communicate state, or why Thrift is needed. Relevant parts from the &lt;a href="https://engineering.pinterest.com/blog/pinball-building-workflow-management"&gt;blog post&lt;/a&gt;:&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;Every state (token) change goes through the master and gets committed to the persistent store before the worker request returns… workers can read archived tokens directly from the persistent storage, bypassing the master, greatly improving system scalability.&lt;/p&gt;
&lt;/blockquote&gt;
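&lt;p&gt;To make the token idea concrete, here is a minimal pure-Python sketch (my own toy model with invented field names, not actual Pinball code) of a record whose &lt;code&gt;data&lt;/code&gt; member holds a pickled Python object:&lt;/p&gt;

```python
import pickle

# Toy model of a Pinball-style token: a named record whose 'data'
# field stores an arbitrary Python object, pickled on the fly.
class Token:
    def __init__(self, name, version, payload):
        self.name = name                   # hierarchical name, e.g. '/workflow/job'
        self.version = version             # bumped on every write through the master
        self.data = pickle.dumps(payload)  # the state, as an opaque blob

    def payload(self):
        return pickle.loads(self.data)

token = Token('/workflows/daily_report/cleanup', 1, {'status': 'RUNNABLE'})
print(token.payload()['status'])  # prints: RUNNABLE
```

&lt;p&gt;Because the payload is an opaque pickled blob, nothing in the schema tells you what the different kinds of tokens look like, which is exactly the documentation problem described above.&lt;/p&gt;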
&lt;p&gt;An interesting design decision is the separation of the workflow description, which is given in Python dictionaries, from the actual job code. &lt;a href="https://github.com/pinterest/pinball/tree/master/pinball_ext/examples"&gt;See example here.&lt;/a&gt; It’s a bit weird that the workflow references the actual job using a string. I think this is because many modules load the workflow (eg. scheduler), but only the workers actually load the jobs.&lt;/p&gt;
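&lt;p&gt;The string indirection can be illustrated with a schematic sketch (invented names, not Pinball's actual configuration schema): any module can parse the workflow dictionary and see the DAG shape, but only a worker needs to resolve the string into runnable code.&lt;/p&gt;

```python
# Toy illustration of a Pinball-style workflow config: the workflow is a
# plain dictionary, and jobs are referenced by dotted string names, so a
# module can inspect the DAG shape without importing any job code.
WORKFLOWS = {
    'daily_report': {
        'extract': {'job': 'jobs.ExtractJob', 'depends_on': []},
        'report':  {'job': 'jobs.ReportJob',  'depends_on': ['extract']},
    },
}

def resolve(dotted_name, registry):
    """Only a worker calls this, turning the string reference into code."""
    return registry[dotted_name]

# A scheduler can find the root jobs without importing any job classes:
roots = [name for name, spec in WORKFLOWS['daily_report'].items()
         if not spec['depends_on']]
print(roots)  # prints: ['extract']
```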
&lt;h2&gt;Contrib stuff&lt;/h2&gt;
&lt;p&gt;Pinball has contrib stuff for the following job types:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;Bash&lt;/li&gt;
&lt;li&gt;Python&lt;/li&gt;
&lt;li&gt;S3 (also EMR)&lt;/li&gt;
&lt;li&gt;Hadoop, Hive&lt;/li&gt;
&lt;li&gt;&lt;a href="https://www.qubole.com/"&gt;Qubole&lt;/a&gt; (a data processing platform-as-a-service Pinterest uses)&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;There are no connectors to Postgres, Mysql, Redshift, Presto or any SQL databases.&lt;/p&gt;
&lt;h2&gt;Source code and tests&lt;/h2&gt;
&lt;p&gt;The main codebase is ~18,000 LOC (python), plus about ~7,000 lines of unit test code. Other Python libraries used on the server side:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;&lt;a href="https://www.djangoproject.com/"&gt;Django&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://thrift.apache.org/"&gt;Thrift&lt;/a&gt; for RPC&lt;/li&gt;
&lt;li&gt;&lt;a href="https://tox.readthedocs.org/"&gt;Tox&lt;/a&gt; for testing&lt;/li&gt;
&lt;li&gt;and a few more…&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;I think it’s cool that Pinball doesn’t have many library dependencies; for a Python project, it barely has any.&lt;/p&gt;
&lt;h2&gt;Conclusions&lt;/h2&gt;
&lt;p&gt;If I had to build an ETL system from scratch today, I would not use Pinball. It’s barely documented, has few recent commits, I can't find other serious users, and I'm suspicious of the architecture. I would use Airflow.&lt;/p&gt;</content><category term="data"></category><category term="etl"></category><category term="workflow"></category><category term="pinball"></category></entry><entry><title>How to make a blog like this</title><link href="https://bytepawn.com/how-to-make-a-blog-like-this.html" rel="alternate"></link><published>2016-01-07T00:00:00+01:00</published><updated>2016-01-07T00:00:00+01:00</updated><author><name>Marton Trencseni</name></author><id>tag:bytepawn.com,2016-01-07:/how-to-make-a-blog-like-this.html</id><summary type="html">&lt;p&gt;Make a simple blog with Github Pages and Pelican.&lt;/p&gt;</summary><content type="html">&lt;p&gt;Getting your own blog like this is really easy, no server hosting needed. There are two ingredients:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;Github Pages&lt;/li&gt;
&lt;li&gt;the Pelican static blog generator&lt;/li&gt;
&lt;/ul&gt;
&lt;h2&gt;Github Pages&lt;/h2&gt;
&lt;p&gt;Suppose your username on github is mtrencseni. Create a repo called &lt;code&gt;mtrencseni.github.io&lt;/code&gt;. Here's mine: &lt;a href="https://github.com/mtrencseni/mtrencseni.github.io"&gt;https://github.com/mtrencseni/mtrencseni.github.io&lt;/a&gt;. Anything you put in there will be served up at &lt;a href="http://mtrencseni.github.io"&gt;http://mtrencseni.github.io&lt;/a&gt;. Try it out for yourself: put in an index.html containing &lt;code&gt;Hello world&lt;/code&gt;.&lt;/p&gt;
&lt;h2&gt;Your own domain name with Github Pages&lt;/h2&gt;
&lt;p&gt;I wanted to use my existing domain name &lt;code&gt;bytepawn.com&lt;/code&gt;. Github is so nice, they support this. Put a file called &lt;code&gt;CNAME&lt;/code&gt; into your repo. Here's mine: &lt;a href="https://github.com/mtrencseni/mtrencseni.github.io/blob/master/CNAME"&gt;https://github.com/mtrencseni/mtrencseni.github.io/blob/master/CNAME&lt;/a&gt;&lt;/p&gt;
&lt;p&gt;This tells Github to expect requests for this domain. What's left is to tell your DNS provider to use Github. (My registrar is Internet.bs, so much better than Godaddy.) Create an A record that points to &lt;code&gt;192.30.252.153&lt;/code&gt;; this is a Github IP address. That's it. For &lt;code&gt;bytepawn.com&lt;/code&gt; it looks like this:&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;$ dig bytepawn.com

&lt;span class="p"&gt;;&lt;/span&gt; &amp;lt;&amp;lt;&amp;gt;&amp;gt; DiG &lt;span class="m"&gt;9&lt;/span&gt;.9.5-3ubuntu0.5-Ubuntu &amp;lt;&amp;lt;&amp;gt;&amp;gt; bytepawn.com
&lt;span class="p"&gt;;;&lt;/span&gt; global options: +cmd
&lt;span class="p"&gt;;;&lt;/span&gt; Got answer:
&lt;span class="p"&gt;;;&lt;/span&gt; -&amp;gt;&amp;gt;HEADER&lt;span class="s"&gt;&amp;lt;&amp;lt;- opco&lt;/span&gt;de: QUERY, status: NOERROR, id: &lt;span class="m"&gt;34353&lt;/span&gt;
&lt;span class="p"&gt;;;&lt;/span&gt; flags: qr rd ra&lt;span class="p"&gt;;&lt;/span&gt; QUERY: &lt;span class="m"&gt;1&lt;/span&gt;, ANSWER: &lt;span class="m"&gt;1&lt;/span&gt;, AUTHORITY: &lt;span class="m"&gt;0&lt;/span&gt;, ADDITIONAL: &lt;span class="m"&gt;1&lt;/span&gt;

&lt;span class="p"&gt;;;&lt;/span&gt; OPT PSEUDOSECTION:
&lt;span class="p"&gt;;&lt;/span&gt; EDNS: version: &lt;span class="m"&gt;0&lt;/span&gt;, flags:&lt;span class="p"&gt;;&lt;/span&gt; udp: &lt;span class="m"&gt;512&lt;/span&gt;
&lt;span class="p"&gt;;;&lt;/span&gt; QUESTION SECTION:
&lt;span class="p"&gt;;&lt;/span&gt;bytepawn.com.                  IN      A

&lt;span class="p"&gt;;;&lt;/span&gt; ANSWER SECTION:
bytepawn.com.           &lt;span class="m"&gt;600&lt;/span&gt;     IN      A       &lt;span class="m"&gt;192&lt;/span&gt;.30.252.153

&lt;span class="p"&gt;;;&lt;/span&gt; Query time: &lt;span class="m"&gt;19&lt;/span&gt; msec
&lt;span class="p"&gt;;;&lt;/span&gt; SERVER: &lt;span class="m"&gt;172&lt;/span&gt;.17.0.1#53&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="m"&gt;172&lt;/span&gt;.17.0.1&lt;span class="o"&gt;)&lt;/span&gt;
&lt;span class="p"&gt;;;&lt;/span&gt; WHEN: Thu Jan &lt;span class="m"&gt;07&lt;/span&gt; &lt;span class="m"&gt;21&lt;/span&gt;:36:09 UTC &lt;span class="m"&gt;2016&lt;/span&gt;
&lt;span class="p"&gt;;;&lt;/span&gt; MSG SIZE  rcvd: &lt;span class="m"&gt;57&lt;/span&gt;
&lt;/pre&gt;&lt;/div&gt;


&lt;h2&gt;Generating a static blog&lt;/h2&gt;
&lt;p&gt;Github will serve static content from your repo; it doesn't run any scripts.
So you need to use a static site generator and serve up the generated pages.
Fortunately, there are many.
The two biggest ones are &lt;a href="https://jekyllrb.com/"&gt;Jekyll&lt;/a&gt; for Rubyists and &lt;a href="http://docs.getpelican.com"&gt;Pelican&lt;/a&gt; for Pythonistas.
I'm a Python guy, so I use Pelican.
The &lt;a href="http://docs.getpelican.com/en/3.6.3/quickstart.html"&gt;Pelican quickstart doc&lt;/a&gt; explains how to generate an empty blog. Basically:&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;$ pip install pelican markdown
$ &lt;span class="nb"&gt;cd&lt;/span&gt; blog
&lt;span class="c1"&gt;### this is https://github.com/mtrencseni/blog locally&lt;/span&gt;
$ pelican-quickstart
$ vi content/my-first-article.md
&lt;span class="c1"&gt;### see the docs what an article should look like&lt;/span&gt;
$ pelican content
&lt;span class="c1"&gt;### generates static files in the `output` dir&lt;/span&gt;
$ cp -R output/* ../mtrencseni.github.io
$ &lt;span class="nb"&gt;cd&lt;/span&gt; ../mtrencseni.github.io
$ git add * &lt;span class="o"&gt;&amp;amp;&amp;amp;&lt;/span&gt; git commit -m &lt;span class="s2"&gt;&amp;quot;Working :)&amp;quot;&lt;/span&gt; &lt;span class="o"&gt;&amp;amp;&amp;amp;&lt;/span&gt; git push
&lt;span class="c1"&gt;### it&amp;#39;s live!&lt;/span&gt;
&lt;/pre&gt;&lt;/div&gt;


&lt;p&gt;Simply copy the contents of Pelican's &lt;code&gt;output&lt;/code&gt; directory into &lt;code&gt;mtrencseni.github.io&lt;/code&gt;, and that's it.&lt;/p&gt;
&lt;h2&gt;Getting a theme for Pelican&lt;/h2&gt;
&lt;p&gt;The default theme is kind of crappy looking.
Fortunately, there's a ton of free themes for Pelican at &lt;a href="http://www.pelicanthemes.com/"&gt;http://www.pelicanthemes.com/&lt;/a&gt;. Here's the github repo for all those themes: &lt;a href="https://github.com/getpelican/pelican-themes"&gt;https://github.com/getpelican/pelican-themes&lt;/a&gt;.
I picked &lt;a href="https://github.com/alexandrevicenzi/Flex"&gt;Flex&lt;/a&gt;, and did some very minor customization on it.
Getting Pelican to use a theme is simple: put a line like &lt;code&gt;THEME = 'flex'&lt;/code&gt; into your &lt;code&gt;pelicanconf.py&lt;/code&gt;, where &lt;code&gt;flex&lt;/code&gt; is the directory containing the theme you picked. Here is mine: &lt;a href="https://github.com/mtrencseni/blog/blob/master/pelicanconf.py"&gt;https://github.com/mtrencseni/blog/blob/master/pelicanconf.py&lt;/a&gt;&lt;/p&gt;
&lt;h2&gt;Putting it all together&lt;/h2&gt;
&lt;p&gt;I have a repo &lt;a href="https://github.com/mtrencseni/blog"&gt;https://github.com/mtrencseni/blog&lt;/a&gt; which contains the source of the blog. It's a copy of Pelican, with my articles in the &lt;code&gt;content&lt;/code&gt; directory, and a (customized) copy of the flex theme in the &lt;code&gt;flex&lt;/code&gt; directory.&lt;/p&gt;
&lt;p&gt;I use Pelican to generate the static output, and then that gets pushed to &lt;a href="https://github.com/mtrencseni/mtrencseni.github.io"&gt;https://github.com/mtrencseni/mtrencseni.github.io&lt;/a&gt;&lt;/p&gt;
&lt;p&gt;I customized the Makefile to automate this. Here's mine: &lt;a href="https://github.com/mtrencseni/blog/blob/master/Makefile"&gt;https://github.com/mtrencseni/blog/blob/master/Makefile&lt;/a&gt;. It contains targets for &lt;code&gt;clean&lt;/code&gt;, &lt;code&gt;output&lt;/code&gt;, &lt;code&gt;preview&lt;/code&gt; and &lt;code&gt;publish&lt;/code&gt;. &lt;code&gt;preview&lt;/code&gt; uses the built-in Python webserver to serve up the static site on localhost:8080 for testing. &lt;code&gt;publish&lt;/code&gt; git commits to the &lt;code&gt;mtrencseni.github.io&lt;/code&gt; repo.&lt;/p&gt;
&lt;p&gt;That's it. You can get up and running within a day.&lt;/p&gt;
&lt;h2&gt;Links&lt;/h2&gt;
&lt;ul&gt;
&lt;li&gt;&lt;a href="https://pages.github.com/"&gt;Github Pages&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://help.github.com/articles/setting-up-a-custom-domain-with-github-pages/"&gt;Custom domain with Github pages&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="http://stackoverflow.com/questions/9082499/custom-domain-for-github-project-pages"&gt;Stackoverflow help on custom domain setup&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="http://docs.getpelican.com"&gt;Pelican&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="http://www.pelicanthemes.com/"&gt;Pelican themes&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;</content><category term="blog"></category><category term="pelican"></category></entry><entry><title>Airflow review</title><link href="https://bytepawn.com/airflow.html" rel="alternate"></link><published>2016-01-06T00:00:00+01:00</published><updated>2016-01-06T00:00:00+01:00</updated><author><name>Marton Trencseni</name></author><id>tag:bytepawn.com,2016-01-06:/airflow.html</id><summary type="html">&lt;p&gt;Airflow is a workflow scheduler written by Airbnb. It supports defining tasks and dependencies as Python code, executing and scheduling them, and distributing tasks across worker nodes. It supports calendar scheduling (hourly/daily jobs, also visualized on the web dashboard), so it can be used as a starting point for traditional ETL. It has a nice web dashboard for seeing current and past task state, querying the history and making changes to metadata such as connection strings.&lt;br/&gt;&lt;br/&gt;&lt;img src="/images/airflow-main-view.png" alt="Airflow" style="width: 400px;"/&gt;&lt;/p&gt;</summary><content type="html">&lt;h2&gt;Introduction&lt;/h2&gt;
&lt;p&gt;Airflow is a workflow scheduler written by Airbnb. It supports defining tasks and dependencies as Python code, executing and scheduling them, and distributing tasks across worker nodes. It supports calendar scheduling (hourly/daily jobs, also visualized on the web dashboard), so it can be used as a starting point for traditional ETL. It has a nice web dashboard for seeing current and past task state, querying the history and making changes to metadata such as connection strings.
I wrote this after my &lt;a href="/luigi.html"&gt;Luigi review&lt;/a&gt;, so I make comparisons to Luigi throughout the article.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Note: Airflow has come a long way since I wrote this. Also, I've been using Airflow in production at Fetchr for a while. Check out &lt;a href="http://bytepawn.com/fetchr-airflow.html"&gt;Building the Fetchr Data Science Infra on AWS with Presto and Airflow&lt;/a&gt;.&lt;/strong&gt;&lt;/p&gt;
&lt;h2&gt;Architecture&lt;/h2&gt;
&lt;p&gt;Airflow is designed to store and persist its state in a relational database such as Mysql or Postgresql. It uses &lt;a href="http://www.sqlalchemy.org/"&gt;SQLAlchemy&lt;/a&gt; to abstract away the choice of database and the queries against it. As such, much of the logic is implemented as database calls.
It would be fair to call the core of Airflow “an SQLAlchemy app”. This allows for very clean separation of high-level functionality, such as persisting the state (done by the database itself) from scheduling, the web dashboard, etc.&lt;/p&gt;
&lt;p&gt;&lt;img src="/images/airflow-main-view.png" alt="Airflow" style="width: 650px;"/&gt;&lt;/p&gt;
&lt;p&gt;Similarly to Luigi, workflows are specified as a DAG of tasks in Python code. But there are many differences. Luigi knows that tasks operate on targets (datasets, files) and includes this abstraction; eg. it checks the existence of targets when deciding whether to run a task (if all output targets exist, there’s no need to run the task). This concept is missing from Airflow: it never checks for the existence of targets to decide whether to run a task. Like in Luigi, tasks depend on each other (and not on datasets). Unlike Luigi, Airflow supports the concept of calendar scheduling, ie. you can specify that a DAG should run every hour or every day, and the Airflow scheduler process will execute it. Unlike Luigi, Airflow supports shipping the task’s code around to different nodes using &lt;code&gt;pickle&lt;/code&gt;, ie. Python binary serialization.&lt;/p&gt;
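&lt;p&gt;The target-existence difference can be sketched in a few lines of toy Python (neither framework's real API): a Luigi-style runner skips a task whose output targets already exist, while an Airflow-style runner executes every scheduled task instance regardless.&lt;/p&gt;

```python
existing = {'/data/2016-01-01/clean.csv'}  # toy stand-in for a filesystem

def luigi_style_run(task_id, outputs, run):
    # Luigi checks targets: if every output already exists, skip the task.
    if all(out in existing for out in outputs):
        return 'skipped'
    run()
    existing.update(outputs)
    return 'ran'

def airflow_style_run(task_id, run):
    # Airflow has no target concept: a scheduled task instance always runs.
    run()
    return 'ran'

print(luigi_style_run('clean', ['/data/2016-01-01/clean.csv'], lambda: None))  # prints: skipped
print(airflow_style_run('clean', lambda: None))                                # prints: ran
```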
&lt;p&gt;Airflow also has a webserver which shows dashboards and lets users edit metadata like connection strings to data sources. Since everything is stored in the database, the web server component of Airflow is an independent &lt;a href="http://gunicorn.org/"&gt;gunicorn&lt;/a&gt; process which reads and writes the database.&lt;/p&gt;
&lt;h2&gt;Execution&lt;/h2&gt;
&lt;p&gt;In Airflow, the unit of execution is a &lt;code&gt;Task&lt;/code&gt;. DAGs are made up of tasks; one &lt;code&gt;.py&lt;/code&gt; file is a DAG. &lt;a href="http://pythonhosted.org/airflow/tutorial.html"&gt;See tutorial.&lt;/a&gt; Although you can tell Airflow to execute just one task, the common thing to do is to load a DAG, or all DAGs in a subdirectory. Airflow loads the &lt;code&gt;.py&lt;/code&gt; file and looks for instances of class &lt;code&gt;DAG&lt;/code&gt;. DAGs are identified by the textual &lt;code&gt;dag_id&lt;/code&gt; given to them in the &lt;code&gt;.py&lt;/code&gt; file. This is important, because this is used to identify the DAG (and its hourly/daily instances) throughout Airflow; changing the &lt;code&gt;dag_id&lt;/code&gt; will break dependencies in the state!&lt;/p&gt;
&lt;p&gt;The DAG contains the first date when these tasks should (have been) run (called &lt;code&gt;start_date&lt;/code&gt;), the recurrence interval if any (called &lt;code&gt;schedule_interval&lt;/code&gt;), and whether the subsequent runs should depend on each other (called &lt;code&gt;depends_on_past&lt;/code&gt;). Airflow will interleave slow-running DAG instances, ie. it will start the next hour’s jobs even if the last hour hasn’t completed, as long as dependencies and overlap limits permit. An instance of a &lt;code&gt;DAG&lt;/code&gt;, eg. one that is running for 2016-01-01 06:00:00, is called a &lt;code&gt;DAGRun&lt;/code&gt;. A &lt;code&gt;DAGRun&lt;/code&gt; is identified by the id of the DAG postfixed by the &lt;code&gt;execution_date&lt;/code&gt; (not when it’s running, ie. not &lt;code&gt;now()&lt;/code&gt;).&lt;/p&gt;
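&lt;p&gt;As a rough sketch of this scheduling model (toy code, not Airflow's actual scheduler or naming scheme), the &lt;code&gt;DAGRun&lt;/code&gt; identifiers of a daily DAG are just the execution dates enumerated from &lt;code&gt;start_date&lt;/code&gt; at &lt;code&gt;schedule_interval&lt;/code&gt; steps:&lt;/p&gt;

```python
from datetime import datetime, timedelta

def dag_run_ids(dag_id, start_date, schedule_interval, until):
    """Enumerate DAGRun identifiers: the dag_id postfixed by each execution_date."""
    run_ids, execution_date = [], start_date
    while execution_date <= until:
        run_ids.append('%s__%s' % (dag_id, execution_date.isoformat()))
        execution_date += schedule_interval
    return run_ids

runs = dag_run_ids('etl', datetime(2016, 1, 1), timedelta(days=1),
                   until=datetime(2016, 1, 3))
print(runs)
# prints: ['etl__2016-01-01T00:00:00', 'etl__2016-01-02T00:00:00', 'etl__2016-01-03T00:00:00']
```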
&lt;p&gt;Tasks, like DAGs, are also identified by a textual id. Internally, instances of tasks are instances of &lt;code&gt;TaskInstance&lt;/code&gt;, identified by the task’s &lt;code&gt;task_id&lt;/code&gt; plus the &lt;code&gt;execution_date&lt;/code&gt;.&lt;/p&gt;
&lt;p&gt;The tasks in a DAG may define dependencies on other tasks using &lt;code&gt;set_upstream()&lt;/code&gt; and &lt;code&gt;set_downstream()&lt;/code&gt;. Airflow will raise an exception when it finds cycles in the DAG.&lt;/p&gt;
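&lt;p&gt;The cycle check can be modeled with a standard depth-first search over the downstream edges (a toy version, not Airflow's implementation):&lt;/p&gt;

```python
def has_cycle(downstream):
    """downstream maps each task_id to the list of task_ids it triggers."""
    WHITE, GRAY, BLACK = 0, 1, 2           # unvisited / in progress / done
    color = {task: WHITE for task in downstream}

    def visit(task):
        color[task] = GRAY
        for nxt in downstream.get(task, []):
            if color.get(nxt, WHITE) == GRAY:   # back edge: found a cycle
                return True
            if color.get(nxt, WHITE) == WHITE and visit(nxt):
                return True
        color[task] = BLACK
        return False

    return any(color[task] == WHITE and visit(task) for task in downstream)

print(has_cycle({'a': ['b'], 'b': ['c'], 'c': []}))  # prints: False
print(has_cycle({'a': ['b'], 'b': ['a']}))           # prints: True
```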
&lt;p&gt;A task is a parameterized operator. Airflow provides many types of operators, such as &lt;code&gt;BashOperator&lt;/code&gt; for executing a bash script, &lt;code&gt;HiveOperator&lt;/code&gt; for executing Hive queries, and so on. All these operators derive from &lt;code&gt;BaseOperator&lt;/code&gt;. In line with Airflow being “an SQLAlchemy app”, &lt;code&gt;BaseOperator&lt;/code&gt; is derived from SQLAlchemy's &lt;code&gt;Base&lt;/code&gt; class, so objects can be pushed to the database; this pattern happens throughout Airflow. Operators don’t actually contain the database-specific API calls (eg. for Hive or Mysql); this logic is contained in hooks, eg. class &lt;code&gt;HiveCliHook&lt;/code&gt;. All hooks are derived from class &lt;code&gt;BaseHook&lt;/code&gt;, a common interface for connecting and executing queries. So, whereas Luigi has one &lt;code&gt;Target&lt;/code&gt; class (and subclasses), in Airflow this logic is distributed into operators and hooks.&lt;/p&gt;
&lt;p&gt;There are three main types of operators (all three use the same hook classes to accomplish their job):&lt;/p&gt;
&lt;ol&gt;
&lt;li&gt;&lt;strong&gt;Sensor:&lt;/strong&gt; Waits for events to happen. This could be a file appearing in HDFS, the existence of a Hive partition, or waiting for an arbitrary MySQL query to return a row.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Remote Execution:&lt;/strong&gt; Triggers an operation in a remote system. This could be an HQL statement in Hive, a Pig script, a map reduce job, a stored procedure in Oracle or a Bash script to run.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Data transfers:&lt;/strong&gt; Move data from one system to another. Push data from Hive to MySQL, from a local file to HDFS, from Postgres to Oracle, or anything of that nature.&lt;/li&gt;
&lt;/ol&gt;
&lt;p&gt;The most interesting are sensors. They let tasks depend on external conditions, such as files or datasets appearing, via special “sensor tasks”. A sensor lets you specify how often it should be checked (default 1 minute), and when it should time out (default 1 week). These are all derived from class &lt;code&gt;BaseSensorOperator&lt;/code&gt;. There is a special sensor called &lt;code&gt;ExternalTaskSensor&lt;/code&gt;, which lets a task depend on another task (specified by a &lt;code&gt;dag_id&lt;/code&gt;, a &lt;code&gt;task_id&lt;/code&gt; and an &lt;code&gt;execution_date&lt;/code&gt;) in another DAG, since this is not supported “by default”. &lt;code&gt;ExternalTaskSensor&lt;/code&gt; actually just checks what the specified record looks like in the Airflow state database.&lt;/p&gt;
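&lt;p&gt;The poke/timeout loop can be modeled in a few lines (an illustrative sketch, not &lt;code&gt;BaseSensorOperator&lt;/code&gt; itself; the injectable &lt;code&gt;clock&lt;/code&gt; and &lt;code&gt;sleep&lt;/code&gt; parameters are my additions to make the sketch testable):&lt;/p&gt;

```python
import time

# Toy sketch of a sensor's poke loop (not Airflow's BaseSensorOperator):
# re-check a condition every poke_interval seconds until it holds or we time out.
def run_sensor(poke, poke_interval=60, timeout=7 * 24 * 3600,
               clock=time.time, sleep=time.sleep):
    deadline = clock() + timeout
    while True:
        if poke():                 # condition met, eg. a file landed in HDFS
            return True
        if clock() > deadline:     # give up after the timeout (default one week)
            raise TimeoutError("sensor timed out")
        sleep(poke_interval)
```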
&lt;p&gt;All operators have a &lt;code&gt;trigger_rule&lt;/code&gt; argument which defines the rule by which the generated task gets triggered. The default value for &lt;code&gt;trigger_rule&lt;/code&gt; is &lt;code&gt;all_success&lt;/code&gt;, which can be read as “trigger this task when all directly upstream tasks have succeeded”. Others are: &lt;code&gt;all_failed&lt;/code&gt;, &lt;code&gt;all_done&lt;/code&gt;, &lt;code&gt;one_failed&lt;/code&gt;, &lt;code&gt;one_success&lt;/code&gt;.&lt;/p&gt;
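&lt;p&gt;The trigger rules can be modeled as a simple predicate over the states of the directly upstream tasks (an illustrative sketch, not Airflow’s code):&lt;/p&gt;

```python
# Toy evaluation of the trigger rules listed above (illustrative only).
# `upstream` is a list of terminal upstream task states: "success" or "failed".
def should_trigger(rule, upstream):
    succeeded = upstream.count("success")
    failed = upstream.count("failed")
    if rule == "all_success":
        return failed == 0 and succeeded == len(upstream)
    if rule == "all_failed":
        return succeeded == 0 and failed == len(upstream)
    if rule == "all_done":
        return succeeded + failed == len(upstream)
    if rule == "one_success":
        return succeeded > 0
    if rule == "one_failed":
        return failed > 0
    raise ValueError("unknown trigger rule %r" % rule)
```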
&lt;h2&gt;Scheduling and executors&lt;/h2&gt;
&lt;p&gt;Recap: Airflow supports calendar scheduling (hour/daily tasks). Each such run is an instance of a DAG (internally, a &lt;code&gt;DAGRun&lt;/code&gt; object), with tasks and their dependencies. As mentioned previously, DAGs can depend on their previous runs (&lt;code&gt;depends_on_past&lt;/code&gt;), and additionally, specific task dependencies across DAGs are possible with the &lt;code&gt;ExternalTaskSensor&lt;/code&gt; operator. The maximum number of DAG runs to allow per DAG can be limited with &lt;code&gt;max_active_runs_per_dag&lt;/code&gt; in &lt;code&gt;airflow.cfg&lt;/code&gt;.&lt;/p&gt;
&lt;p&gt;When running Airflow, we have to specify what sort of executor to use in &lt;code&gt;airflow.cfg&lt;/code&gt;: &lt;code&gt;SequentialExecutor&lt;/code&gt;, &lt;code&gt;LocalExecutor&lt;/code&gt; or &lt;code&gt;CeleryExecutor&lt;/code&gt;; all three derive from &lt;code&gt;BaseExecutor&lt;/code&gt;. The sequential executor runs locally in a single process/thread, and waits for each task to finish before starting the next one; it should only be used for testing/debugging. The &lt;code&gt;LocalExecutor&lt;/code&gt; also runs tasks locally, but spawns a new process for each one using &lt;code&gt;subprocess.popen()&lt;/code&gt; to run a new &lt;code&gt;bash&lt;/code&gt;; the maximum number of processes can be configured with &lt;code&gt;parallelism&lt;/code&gt; in &lt;code&gt;airflow.cfg&lt;/code&gt;. Inside the &lt;code&gt;bash&lt;/code&gt;, it runs an &lt;code&gt;airflow&lt;/code&gt; command, parameterized to run just the given &lt;code&gt;dag_id&lt;/code&gt; &lt;code&gt;task_id&lt;/code&gt; &lt;code&gt;execution_date&lt;/code&gt; combination using the &lt;code&gt;airflow run&lt;/code&gt; command line parametrization. The Python code belonging to the task is read back from the database (where it was stored by the scheduler using &lt;code&gt;pickle&lt;/code&gt;). The &lt;code&gt;CeleryExecutor&lt;/code&gt; works similarly, except the job is pushed onto a distributed &lt;a href="http://www.celeryproject.org/"&gt;celery&lt;/a&gt; queue.&lt;/p&gt;
&lt;p&gt;When running Airflow, internally a number of jobs are created. A job is a long-running process that handles running smaller units of work; all jobs derive from &lt;code&gt;BaseJob&lt;/code&gt;. There is &lt;code&gt;SchedulerJob&lt;/code&gt;, which manages a single DAG (creates DAG runs, task instances, manages priorities), &lt;code&gt;BackfillJob&lt;/code&gt; for backfilling a specific DAG, and &lt;code&gt;LocalTaskJob&lt;/code&gt; when running a specific &lt;code&gt;dag_id&lt;/code&gt; &lt;code&gt;task_id&lt;/code&gt; &lt;code&gt;execution_date&lt;/code&gt; combination (as requested by the &lt;code&gt;LocalExecutor&lt;/code&gt; or the &lt;code&gt;CeleryExecutor&lt;/code&gt;).&lt;/p&gt;
&lt;p&gt;When running the airflow scheduler, the &lt;code&gt;SchedulerJob&lt;/code&gt; supports loading DAGs from a folder: in this case, new code added/changed is automatically detected and loaded. This is very convenient, because new code just has to be placed on the production server, and it’s automatically picked up by Airflow.&lt;/p&gt;
&lt;p&gt;So in Airflow there is no need to start worker processes: workers are spawned as subprocesses by the &lt;code&gt;LocalExecutor&lt;/code&gt; or remotely by celery. Also, more than one scheduler/executor/main process can run, sharing the same main database. When running tasks, Airflow creates a lock in the database to make sure tasks aren’t run twice by schedulers; other constraints are enforced by unique database keys (eg. only one &lt;code&gt;dag_id&lt;/code&gt; &lt;code&gt;execution_date&lt;/code&gt; combination is allowed, to avoid schedulers creating multiple &lt;code&gt;DAGRun&lt;/code&gt; copies). &lt;em&gt;Note: I’m not sure what the point would be of running several schedulers, other than redundancy, and whether this truly works without hiccups; the TODO file includes this todo item: “Distributed scheduler”.&lt;/em&gt;&lt;/p&gt;
&lt;p&gt;Airflow supports pools to limit parallelism of certain types of tasks (eg. limit number of bash jobs, limit number of Hive connections); this is similar to Luigi resources. Priorities are also supported: The default &lt;code&gt;priority_weight&lt;/code&gt; is 1, and can be bumped to any number. When sorting the queue to evaluate which task should be executed next, Airflow uses the &lt;code&gt;priority_weight&lt;/code&gt;, summed up with all of the &lt;code&gt;priority_weight&lt;/code&gt; values from tasks downstream from this task.&lt;/p&gt;
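&lt;p&gt;The priority calculation can be sketched as follows (a toy model of the summation described above, not Airflow’s implementation; the task names and weights are made up):&lt;/p&gt;

```python
# Toy sketch of the priority calculation (not Airflow's code): a task's
# effective priority is its own priority_weight plus the weights of
# everything downstream of it.
def effective_priority(task_id, downstream, weights):
    """downstream: task_id to list of downstream task_ids;
    weights: task_id to priority_weight (default 1)."""
    seen = set()

    def collect(t):
        if t in seen:
            return 0
        seen.add(t)
        return weights.get(t, 1) + sum(collect(d) for d in downstream.get(t, []))

    return collect(task_id)

# An upstream task inherits the weight of everything behind it:
downstream = {"extract": ["transform"], "transform": ["load"], "load": []}
weights = {"extract": 1, "transform": 1, "load": 5}
effective_priority("extract", downstream, weights)  # 1 + 1 + 5 = 7
```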
&lt;p&gt;Airflow supports heartbeats. Each job will update a heartbeat entry in the database. If a job hasn’t updated its heartbeat for a while, it’s assumed that it has failed and its state is set to &lt;code&gt;SHUTDOWN&lt;/code&gt; in the database. This also allows any job to be killed externally, regardless of who is running it or on which machine it is running. &lt;em&gt;Note: I’m not sure how this works, because from my reading of the code, the actual termination of the process that didn’t send the heartbeat should be performed by the process itself; but if it’s stuck or blocked and didn’t send a heartbeat, then how will it notice it should shut itself down?&lt;/em&gt;&lt;/p&gt;
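&lt;p&gt;The database-side half of the mechanism is straightforward to model (an illustrative sketch; the &lt;code&gt;RUNNING&lt;/code&gt; state name and the 300-second threshold are my assumptions, only &lt;code&gt;SHUTDOWN&lt;/code&gt; is from the text):&lt;/p&gt;

```python
# Toy model of the heartbeat sweep (illustrative only): a running job whose
# last heartbeat is older than the allowed threshold is marked SHUTDOWN.
# "RUNNING" and the threshold are assumptions, not Airflow's actual values.
def sweep_stale_jobs(jobs, now, threshold=300):
    """jobs: dict of job_id to {"state": ..., "last_heartbeat": timestamp}."""
    for job in jobs.values():
        stale = now - job["last_heartbeat"] > threshold
        if job["state"] == "RUNNING" and stale:
            job["state"] = "SHUTDOWN"
    return jobs
```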
&lt;h2&gt;Other interesting features&lt;/h2&gt;
&lt;p&gt;&lt;strong&gt;SLAs:&lt;/strong&gt; Service Level Agreements, or time by which a task or DAG should have succeeded, can be set at a task level as a timedelta. If one or many instances have not succeeded by that time, an alert email is sent detailing the list of tasks that missed their SLA. The event is also recorded in the database and made available in the web UI under Browse -&amp;gt; Missed SLAs where events can be analyzed and documented.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;XCom:&lt;/strong&gt; XComs let tasks exchange messages, allowing more nuanced forms of control and shared state. The name is an abbreviation of “cross-communication”. XComs are principally defined by a key, value, and timestamp, but also track attributes like the task/DAG that created the XCom and when it should become visible. Any object that can be pickled can be used as an XCom value, so users should make sure to use objects of appropriate size. XComs can be “pushed” (sent) or “pulled” (received). When a task pushes an XCom, it makes it generally available to other tasks. Tasks can push XComs at any time by calling the &lt;code&gt;xcom_push()&lt;/code&gt; method. In addition, if a task returns a value (either from its Operator’s &lt;code&gt;execute()&lt;/code&gt; method, or from a &lt;code&gt;PythonOperator&lt;/code&gt;’s &lt;code&gt;python_callable()&lt;/code&gt; function), then an XCom containing that value is automatically pushed. Tasks call &lt;code&gt;xcom_pull()&lt;/code&gt; to retrieve XComs, optionally applying filters based on criteria like key, source &lt;code&gt;task_id&lt;/code&gt;s, and source &lt;code&gt;dag_id&lt;/code&gt;. By default, &lt;code&gt;xcom_pull()&lt;/code&gt; filters for the keys that are automatically given to XComs when they are pushed by being returned from execute functions (as opposed to XComs that are pushed manually).&lt;/p&gt;
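&lt;p&gt;The push/pull behavior can be modeled as a small in-memory store (a toy model, not Airflow’s XCom implementation; the filtering mirrors the defaults described above):&lt;/p&gt;

```python
# Toy in-memory model of the XCom store (not Airflow's implementation):
# values keyed by (dag_id, task_id, key), with pull-side filtering.
class XComStore:
    RETURN_VALUE = "return_value"   # key for auto-pushed execute() results

    def __init__(self):
        self.records = []

    def push(self, dag_id, task_id, key, value):
        self.records.append({"dag_id": dag_id, "task_id": task_id,
                             "key": key, "value": value})

    def pull(self, dag_id=None, task_id=None, key=RETURN_VALUE):
        # Like xcom_pull(), default to the auto-pushed return values.
        matches = [r["value"] for r in self.records
                   if (dag_id is None or r["dag_id"] == dag_id)
                   and (task_id is None or r["task_id"] == task_id)
                   and r["key"] == key]
        return matches[-1] if matches else None
```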
&lt;p&gt;&lt;strong&gt;Variables:&lt;/strong&gt; Variables are a generic way to store and retrieve arbitrary content or settings as a simple key value store within Airflow. Variables can be listed, created, updated and deleted from the UI (Admin -&amp;gt; Variables) or from code. While your pipeline code definition and most of your constants and variables should be defined in code and stored in source control, it can be useful to have some variables or configuration items accessible and modifiable through the UI.&lt;/p&gt;
&lt;h2&gt;Contrib stuff&lt;/h2&gt;
&lt;p&gt;Like Luigi, Airflow has an impressive library of stock operator classes:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;Bash&lt;/li&gt;
&lt;li&gt;Mysql&lt;/li&gt;
&lt;li&gt;Postgresql&lt;/li&gt;
&lt;li&gt;MSSQL&lt;/li&gt;
&lt;li&gt;Hive&lt;/li&gt;
&lt;li&gt;Presto&lt;/li&gt;
&lt;li&gt;HDFS&lt;/li&gt;
&lt;li&gt;S3&lt;/li&gt;
&lt;li&gt;HTTP sensor&lt;/li&gt;
&lt;li&gt;and many more...&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;Redshift is currently not supported.&lt;/p&gt;
&lt;h2&gt;Source code and tests&lt;/h2&gt;
&lt;p&gt;The main codebase is ~21,000 LOC (Python, JS, HTML), plus about 1,200 lines of unit test code.
Other Python libraries used on the server side:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;&lt;a href="http://www.sqlalchemy.org/"&gt;SQLAlchemy&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="http://jinja.pocoo.org/"&gt;Jinja&lt;/a&gt; for templating (why, if we’re using Python code to define jobs anyway?)&lt;/li&gt;
&lt;li&gt;&lt;a href="http://gunicorn.org/"&gt;Gunicorn&lt;/a&gt; and &lt;a href="http://flask.pocoo.org/"&gt;Flask&lt;/a&gt; for HTTP&lt;/li&gt;
&lt;li&gt;&lt;a href="https://pypi.python.org/pypi/dill"&gt;Dill&lt;/a&gt; for pickling&lt;/li&gt;
&lt;li&gt;&lt;a href="https://tox.readthedocs.org"&gt;Tox&lt;/a&gt; for testing
and many more...&lt;/li&gt;
&lt;/ul&gt;
&lt;h2&gt;Conclusion&lt;/h2&gt;
&lt;p&gt;Airflow’s design decisions are very close to my heart: the fact that it’s an SQLAlchemy app makes managing state, restarting the daemon, or running more in parallel very easy. It has lots of contrib stuff baked in, so it’s easy to get started. The dashboard is very nice, and also shows historic runs nicely color-coded. If I were to build a new ETL system, I would definitely consider using Airflow (over Luigi, since Airflow has many more features out of the box).&lt;/p&gt;
&lt;p&gt;What I don’t like about Airflow:&lt;/p&gt;
&lt;ol&gt;
&lt;li&gt;Apart from the special sensor operators, it doesn’t deal with files/datasets as inputs/outputs of tasks directly. I find this an odd design decision, as it leads to some complications:&lt;ul&gt;
&lt;li&gt;The state database stores the state of tasks, not the datasets; if the state database is lost, it’s hard to restore the historic state of the ETL, even if all the datasets are there. It’s better to separate datasets and tasks, and represent the historic state of ETL using the state of the datasets.&lt;/li&gt;
&lt;li&gt;It’s harder to deal with tasks that appear to finish correctly, but don’t actually produce output, or produce bad output. In the Airflow architecture this problem only shows up later, when a task downstream (hopefully) errors out. This can happen eg. if a bash script forgets to &lt;code&gt;set -e&lt;/code&gt;.&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;li&gt;I think it’d be better if workers could be started independently, and pick up tasks scheduled by a central scheduler; instead Airflow starts workers centrally.&lt;/li&gt;
&lt;li&gt;Still a work in progress, not many tests, probably will run into bugs in production. Also see the end of &lt;a href="https://medium.com/handy-tech/airflow-tips-tricks-and-pitfalls-9ba53fba14eb#.lzfjq4wx9"&gt;this blog post&lt;/a&gt;, they restart the Airflow process pretty often because of some bug.&lt;/li&gt;
&lt;li&gt;Personally, I'm still not convinced that the ETL-job-as-code is the right way to go.&lt;/li&gt;
&lt;/ol&gt;
&lt;h2&gt;Links, talks&lt;/h2&gt;
&lt;ul&gt;
&lt;li&gt;&lt;a href="https://github.com/airbnb/airflow"&gt;Github&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://airflow.readthedocs.org"&gt;Docs&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://github.com/airbnb/airflow#links"&gt;Slides from Airflow users&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;</content><category term="data"></category><category term="etl"></category><category term="workflow"></category><category term="airflow"></category></entry><entry><title>Systems thinking and system traps</title><link href="https://bytepawn.com/systems-thinking.html" rel="alternate"></link><published>2016-01-06T00:00:00+01:00</published><updated>2016-01-06T00:00:00+01:00</updated><author><name>Marton Trencseni</name></author><id>tag:bytepawn.com,2016-01-06:/systems-thinking.html</id><summary type="html">&lt;p&gt;Thinking in Systems, written by the late Donella Meadows, is a book about how to think about systems, how to control systems and how systems change and control themselves. A system can be anything from a heating furnace to a social system. The gem of the book is the part about system traps. System traps are ways a system can go wrong; examples are drift to low performance, seeking the wrong goals, shifting the burden, etc.&lt;br/&gt;&lt;br/&gt;&lt;img src="/images/thinking-in-systems.png" alt="Thinking in systems" style="width: 400px;"/&gt;&lt;/p&gt;</summary><content type="html">&lt;h2&gt;Introduction&lt;/h2&gt;
&lt;p&gt;&lt;a href="http://www.amazon.com/gp/product/1603580557/ref=pd_lpo_sbs_dp_ss_1?pf_rd_p=1944687762&amp;amp;pf_rd_s=lpo-top-stripe-1&amp;amp;pf_rd_t=201&amp;amp;pf_rd_i=0123859158&amp;amp;pf_rd_m=ATVPDKIKX0DER&amp;amp;pf_rd_r=181NKCEKSEPQ62PT0S07"&gt;Thinking in Systems&lt;/a&gt;, written by the late &lt;a href="https://en.wikipedia.org/wiki/Donella_Meadows"&gt;Donella Meadows&lt;/a&gt;, is a book about how to think about systems, how to control systems and how systems change and control themselves. A system can be anything from a heating furnace to a social system. The book is conceptual, there’s not a single equation in it, it's not about differential equations or control theory.&lt;/p&gt;
&lt;p&gt;&lt;img src="/images/thinking-in-systems.png" alt="Thinking in systems" style="width: 650px;"/&gt;&lt;/p&gt;
&lt;h2&gt;System traps&lt;/h2&gt;
&lt;p&gt;The gem of the book is the part about &lt;em&gt;system traps&lt;/em&gt;. System traps are ways a system can go wrong. It’s really interesting to read about system traps and then notice and observe them in action: in micro environments such as a company and in macro environments such as an industry or a  country. Here’s a list of the most interesting system traps from the book, with some examples.&lt;/p&gt;
&lt;ol&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;Policy resistance:&lt;/strong&gt; The inherent resistance of the establishment to allow changes to affect the system. People would rather live with a flawed system that is familiar than allow changes that might cause uncertainty and instability. Such resistance can cause inevitable collapse to be more dramatic, sometimes even catastrophic.&lt;br&gt;
Example: &lt;a href="http://www.wsj.com/articles/obama-to-discuss-gun-control-options-with-attorney-general-1451646004"&gt;US citizens resisting Obama’s gun control changes.&lt;/a&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;Drift to low performance:&lt;/strong&gt; The notion that prolonged failure causes acceptance of the new state of things, the “new normal”.&lt;br&gt;
Example: a great example is soccer in Hungary. Hungary used to have a very strong soccer culture, but over time quality decayed to the point where today, a draw or only getting defeated by 1 goal is considered a good result. All this even though the Hungarian government is investing large amounts into the sport. The root cause for this sustained drift to low performance seems to be that soccer is used as a way to channel money from the government to private individuals, ie. corruption.&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;Seeking wrong goals:&lt;/strong&gt; Sometimes goals change. Many systems suffer from the fact that their original goals don’t make any sense in the current context, or never did. Pursuing the wrong goals causes the system to capture wrong or insignificant metrics, leaving the illusion of progress, while heading toward system collapse.&lt;br&gt;
Example: &lt;a href="http://techcrunch.com/2011/07/30/vanity-metrics/"&gt;startups seeking to increase vanity metrics&lt;/a&gt; such as registered users and bookings instead of engagement and profits.&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;Shifting the burden:&lt;/strong&gt; The notion that risk is shifted to someone else, while success is reaped by the actor.&lt;br&gt;
Example: &lt;a href="https://hbr.org/2013/05/six-myths-about-venture-capitalists"&gt;venture capitalists and hedge fund managers work under a model&lt;/a&gt; where they get a nice base salary, a nice bonus if their fund performs well, but there is no downside for them. Turnaround times are on the order of 10 years, so there’s little historic data on fund managers’ performance.&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;The tragedy of commons:&lt;/strong&gt; This is classic economic theory, described in terms of systems thinking. The common, defined as community space such as a town common, is a shared resource. This resource can be governed by community standards, privatization or effective regulation. Each approach has tradeoffs and benefits. It’s the conclusion of Donella Meadows that only regulation is effective, since community standards are usually not enough.&lt;br&gt;
Example: &lt;a href="https://en.wikipedia.org/wiki/Cybersquatting"&gt;domain name squatting&lt;/a&gt;.&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;Success to the Successful:&lt;/strong&gt; The notion that success will give advantage to those that have already succeeded, thus limiting the “losers” ability to win in the future. “The rich get richer!”&lt;br&gt;
Example: that’s how it is everywhere, see &lt;a href="http://www.economist.com/blogs/economist-explains/2014/05/economist-explains"&gt;Thomas Piketty’s book Capital&lt;/a&gt;. A more specific example is that entrepreneurs who’ve had a successful startup previously have an easier time raising money for their next startup. Surprisingly, &lt;a href="https://hbr.org/2014/02/research-serial-entrepreneurs-arent-any-more-likely-to-succeed/"&gt;data doesn’t show a correlation between past and future success&lt;/a&gt;.&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;Rule beating:&lt;/strong&gt; The notion that rules, or laws, are ignored, broken or skirted. The cause of rule breaking is usually that these rules are perceived as unjust or not flexible enough with regard to real-life issues.&lt;br&gt;
Example: there’s a whole industry called &lt;a href="https://en.wikipedia.org/wiki/Search_engine_optimization"&gt;SEO&lt;/a&gt; to &lt;a href="https://www.quora.com/What-techniques-do-websites-use-to-game-the-Google-search-engine"&gt;game search engine rankings&lt;/a&gt;.&lt;/p&gt;
&lt;/li&gt;
&lt;/ol&gt;</content><category term="systems"></category><category term="books"></category></entry><entry><title>Luigi review</title><link href="https://bytepawn.com/luigi.html" rel="alternate"></link><published>2015-12-20T00:00:00+01:00</published><updated>2015-12-22T00:00:00+01:00</updated><author><name>Marton Trencseni</name></author><id>tag:bytepawn.com,2015-12-20:/luigi.html</id><summary type="html">&lt;p&gt;I review Luigi, an execution framework for writing data pipes in Python code. It supports task-task dependencies, it has a simple central scheduler with an HTTP API and an extensive library of helpers for building data pipes for Hadoop, AWS, Mysql etc. It was written by Spotify for internal use and open sourced in 2012. A number of companies use it, such as Foursquare, Stripe, Asana.&lt;/p&gt;</summary><content type="html">&lt;h2&gt;Introduction&lt;/h2&gt;
&lt;p&gt;Luigi is an execution framework for writing data pipes in Python code. It supports task-task dependencies, it has a simple central scheduler with an HTTP API and an extensive library of helpers for building data pipes for Hadoop, AWS, Mysql etc. It was written by Spotify for internal use and open sourced in 2012. A number of companies use it, such as Foursquare, Stripe, Asana.&lt;/p&gt;
&lt;h2&gt;Execution&lt;/h2&gt;
&lt;p&gt;Suppose that part of your ETL process is to take some data A, apply transformation X on it, and save it as Y. In Luigi, you would write a &lt;code&gt;.py&lt;/code&gt; file which contains a class X, which derives from class &lt;code&gt;Task&lt;/code&gt;. X would have three methods: &lt;code&gt;requires(), run(), and output()&lt;/code&gt;.&lt;/p&gt;
&lt;table class="highlighttable"&gt;&lt;tr&gt;&lt;td class="linenos"&gt;&lt;div class="linenodiv"&gt;&lt;pre&gt;1
2
3
4&lt;/pre&gt;&lt;/div&gt;&lt;/td&gt;&lt;td class="code"&gt;&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;span class="k"&gt;class&lt;/span&gt; &lt;span class="nc"&gt;X&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;luigi&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Task&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
    &lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;requires&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="bp"&gt;self&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt; &lt;span class="o"&gt;...&lt;/span&gt;
    &lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;run&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="bp"&gt;self&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt; &lt;span class="o"&gt;...&lt;/span&gt;
    &lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;output&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="bp"&gt;self&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt; &lt;span class="o"&gt;...&lt;/span&gt;
&lt;/pre&gt;&lt;/div&gt;
&lt;/td&gt;&lt;/tr&gt;&lt;/table&gt;

&lt;p&gt;Then you execute luigi and pass this &lt;code&gt;.py&lt;/code&gt; file to it, like &lt;code&gt;luigi --module x X&lt;/code&gt; if the file name is &lt;code&gt;x.py&lt;/code&gt;. When given a &lt;code&gt;Task&lt;/code&gt;, luigi:&lt;/p&gt;
&lt;ol&gt;
&lt;li&gt;Calls the &lt;code&gt;output()&lt;/code&gt; method, which returns one or more objects deriving from class &lt;code&gt;Target&lt;/code&gt;. A &lt;code&gt;Target&lt;/code&gt; is something which has an &lt;code&gt;exists()&lt;/code&gt; method which returns either &lt;code&gt;True&lt;/code&gt; or &lt;code&gt;False&lt;/code&gt;. Luigi calls &lt;code&gt;exists()&lt;/code&gt; on all the targets to see whether they exist. If all return &lt;code&gt;True&lt;/code&gt;, luigi will flag this task as &lt;code&gt;DONE&lt;/code&gt; and never call &lt;code&gt;run()&lt;/code&gt;.
   If at least one of the output targets returned &lt;code&gt;False&lt;/code&gt;, this job needs to be run.&lt;/li&gt;
&lt;li&gt;Luigi then calls the &lt;code&gt;requires()&lt;/code&gt; method to see what other tasks need to first run for this task to run successfully. &lt;code&gt;requires()&lt;/code&gt; returns one or more objects deriving from class &lt;code&gt;Task&lt;/code&gt;, and recursively performs this process for all those.
   Note: after returning, luigi checks whether the output targets of the required tasks really exists. This is encapsulated in the &lt;code&gt;complete()&lt;/code&gt; method, the default implementation just calls &lt;code&gt;exists()&lt;/code&gt; on all targets returned by &lt;code&gt;output()&lt;/code&gt;; the method can optionally be overridden in the derived &lt;code&gt;Target&lt;/code&gt; class. The purpose of &lt;code&gt;complete()&lt;/code&gt; is to make sure &lt;code&gt;run()&lt;/code&gt; was successful, because if a required target’s &lt;code&gt;run()&lt;/code&gt; didn’t raise a Python exception but didn’t actually produce the output needed, then &lt;code&gt;run()&lt;/code&gt; shouldn’t be called. In this case the required task is re-run.&lt;/li&gt;
&lt;li&gt;Luigi calls the &lt;code&gt;run()&lt;/code&gt; method and sets the task status to &lt;code&gt;DONE&lt;/code&gt; if no Python exceptions were raised.
   Note: &lt;code&gt;run()&lt;/code&gt; can also dynamically &lt;code&gt;yield&lt;/code&gt; dependency tasks.&lt;/li&gt;
&lt;/ol&gt;
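&lt;p&gt;The three steps above can be sketched with in-memory stand-ins (a toy model of the algorithm, not the real luigi classes):&lt;/p&gt;

```python
# Toy model of Luigi's scheduling algorithm (not the real luigi classes):
# a task runs only if its outputs are missing, after its requirements ran.
class Target:
    def __init__(self):
        self._exists = False
    def exists(self):
        return self._exists
    def touch(self):
        self._exists = True

class Task:
    def requires(self):
        return []
    def output(self):
        return []
    def complete(self):
        # Default complete(): all output targets exist.
        return all(t.exists() for t in self.output())
    def run(self):
        pass

def execute(task, log):
    if task.complete():
        return                      # outputs exist: DONE, run() never called
    for dep in task.requires():     # recurse into requirements first
        execute(dep, log)
    task.run()
    log.append(type(task).__name__)
```

Running a task twice illustrates the idempotency: the second call sees the output targets exist and does nothing.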
&lt;h2&gt;Local and central scheduler&lt;/h2&gt;
&lt;p&gt;When luigi is launched and a task is given to it, a &lt;code&gt;Worker&lt;/code&gt; object is created. Workers need to talk to a &lt;code&gt;Scheduler&lt;/code&gt;, which manages the dependency graph of tasks and tells workers what to do. So when the local worker object is created, it can either:&lt;/p&gt;
&lt;ol&gt;
&lt;li&gt;Create a local scheduler in the process, or&lt;/li&gt;
&lt;li&gt;Connect to a remote scheduler using the HTTP API. This is the default.&lt;/li&gt;
&lt;/ol&gt;
&lt;p&gt;&lt;em&gt;Local scheduler:&lt;/em&gt; The local scheduler can be used by passing &lt;code&gt;--local-scheduler&lt;/code&gt; to the luigi runtime. When running with the local scheduler, the algorithm given above is run recursively, and then luigi exits. This is usually only used for testing and debugging purposes.&lt;/p&gt;
&lt;p&gt;&lt;em&gt;Central scheduler:&lt;/em&gt; More interesting is the central scheduler. The central scheduler is a separate &lt;code&gt;luigid&lt;/code&gt; Python Tornado app that workers can talk to over HTTP. It performs two tasks: scheduling of tasks based on the dependency graph and serving a simple web dashboard on port 8082 (default). Note that the central scheduler:&lt;/p&gt;
&lt;ol&gt;
&lt;li&gt;Doesn’t see or execute the &lt;code&gt;Task&lt;/code&gt; object's code, hence it never sees or checks whether targets exist; this is always performed by workers.&lt;/li&gt;
&lt;li&gt;The task is identified by its signature:&lt;ul&gt;
&lt;li&gt;Python name of the class; in the example above it’s X.&lt;/li&gt;
&lt;li&gt;The values of the parameters passed to the task, eg. &lt;code&gt;day=2015-12-01&lt;/code&gt;.&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;/ol&gt;
&lt;p&gt;Parameters are member variables in the &lt;code&gt;Task&lt;/code&gt; objects which derive from class &lt;code&gt;Parameter&lt;/code&gt;, eg.:&lt;/p&gt;
&lt;table class="highlighttable"&gt;&lt;tr&gt;&lt;td class="linenos"&gt;&lt;div class="linenodiv"&gt;&lt;pre&gt;1
2
3&lt;/pre&gt;&lt;/div&gt;&lt;/td&gt;&lt;td class="code"&gt;&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;span class="k"&gt;class&lt;/span&gt; &lt;span class="nc"&gt;X&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;luigi&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Task&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
    &lt;span class="n"&gt;day&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;luigi&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;DateParameter&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;default&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;datetime&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;date&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;today&lt;/span&gt;&lt;span class="p"&gt;())&lt;/span&gt;
    &lt;span class="o"&gt;...&lt;/span&gt;
&lt;/pre&gt;&lt;/div&gt;
&lt;/td&gt;&lt;/tr&gt;&lt;/table&gt;

&lt;p&gt;By specifying &lt;code&gt;significant=False&lt;/code&gt; in the &lt;code&gt;Parameter&lt;/code&gt; constructor, we can tell Luigi not to treat it as part of the task signature.&lt;/p&gt;
&lt;p&gt;The worker builds the local dependency graph and then uploads it to the central scheduler. Then it asks the central scheduler what it should do. The central scheduler potentially receives dependency graphs from several workers, and merges them, assuming tasks with the same name (and parameter values) uploaded from different workers are the same (generate the same &lt;code&gt;output()&lt;/code&gt; targets, contain the same &lt;code&gt;run()&lt;/code&gt; logic, etc).&lt;/p&gt;
&lt;p&gt;Given the dependency graph, the central scheduler then tells workers to start running tasks. A worker can only run tasks that it uploaded to the central scheduler, because those are the tasks that that Python process loaded. So workers are not generic workers, they can only work on the tasks that they were started with!&lt;/p&gt;
&lt;p&gt;Given a dependency graph, the scheduler will tell workers to run tasks that have no dependencies. By default, the order is non-deterministic. However, tasks can specify a priority, tasks with higher priority run first. The default priority is 0. Example:&lt;/p&gt;
&lt;table class="highlighttable"&gt;&lt;tr&gt;&lt;td class="linenos"&gt;&lt;div class="linenodiv"&gt;&lt;pre&gt;1
2
3
4
5
6
7&lt;/pre&gt;&lt;/div&gt;&lt;/td&gt;&lt;td class="code"&gt;&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;span class="k"&gt;class&lt;/span&gt; &lt;span class="nc"&gt;X&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;luigi&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Task&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
    &lt;span class="nd"&gt;@property&lt;/span&gt;
    &lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;priority&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="bp"&gt;self&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
        &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="o"&gt;&amp;lt;&lt;/span&gt;&lt;span class="n"&gt;something&lt;/span&gt;&lt;span class="o"&gt;&amp;gt;&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
            &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="mi"&gt;2&lt;/span&gt;
        &lt;span class="k"&gt;else&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
            &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt;
&lt;/pre&gt;&lt;/div&gt;
&lt;/td&gt;&lt;/tr&gt;&lt;/table&gt;

&lt;p&gt;Because priorities are in code, the worker must evaluate them and pass it on to the central scheduler.&lt;/p&gt;
&lt;h2&gt;Local parallelism&lt;/h2&gt;
&lt;p&gt;More than one worker thread can be created by passing &lt;code&gt;--workers N&lt;/code&gt; to luigi. This is registered with the central scheduler, and if possible N tasks are run in parallel by one worker. So there are multiple levels of parallelism in Luigi:&lt;/p&gt;
&lt;ol&gt;
&lt;li&gt;Multiple workers&lt;/li&gt;
&lt;li&gt;Multiple threads in workers&lt;/li&gt;
&lt;li&gt;Each task can have further parallelism, eg. a Hadoop MapReduce job.&lt;/li&gt;
&lt;/ol&gt;
&lt;h2&gt;Managing a library of tasks&lt;/h2&gt;
&lt;p&gt;What if we’re managing a library of 100s or 1000s of ETL jobs? While I haven’t used Luigi for this, it seems that the basic building blocks are:&lt;/p&gt;
&lt;ol&gt;
&lt;li&gt;Python &lt;code&gt;import&lt;/code&gt; statements: our jobs are distributed into different &lt;code&gt;.py&lt;/code&gt; files, so we need to &lt;code&gt;import&lt;/code&gt; them to use them in &lt;code&gt;requires()&lt;/code&gt;.&lt;/li&gt;
&lt;li&gt;&lt;code&gt;WrapperTask&lt;/code&gt; objects: these are special sink tasks which don’t have an output, they just require other tasks to be run.&lt;/li&gt;
&lt;/ol&gt;
&lt;p&gt;This part puts a lot of work on the user of Luigi:&lt;/p&gt;
&lt;ol&gt;
&lt;li&gt;If we create a new task and forget to add it to the sink task, it won’t be executed (unless it’s a dependency of something else).&lt;/li&gt;
&lt;li&gt;If we refactor a job (eg. rename the task class, change parameters), we have to search and replace all references in dependent &lt;code&gt;requires()&lt;/code&gt; methods. Since Python isn’t statically typed, this has to be done by hand.&lt;/li&gt;
&lt;li&gt;If workers run on separate machines, it’s our job to synchronize the library of &lt;code&gt;.py&lt;/code&gt; files (eg. using &lt;code&gt;git&lt;/code&gt; and &lt;code&gt;cron&lt;/code&gt; jobs to sync frequently). Different versions of tasks with different logic, or local uncommitted changes propagating to the central scheduler, will lead to hard-to-find bugs and data corruption.&lt;/li&gt;
&lt;/ol&gt;
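&lt;p&gt;Point 1 is easy to illustrate without Luigi itself: the sink’s dependency list is maintained by hand, so anything missing from it is silently never scheduled. All task names below are hypothetical:&lt;/p&gt;

```python
# Minimal sketch of the sink-task pitfall, without Luigi.
# A "sink" enumerates its dependencies explicitly, like a
# WrapperTask's requires() method; anything not listed is
# silently never scheduled.

ALL_TASKS = {'ComputeDau', 'ComputeRevenue', 'ComputeRetention'}

def sink_requires():
    # Maintained by hand; ComputeRetention was forgotten!
    return ['ComputeDau', 'ComputeRevenue']

def scheduled_tasks():
    return set(sink_requires())

forgotten = ALL_TASKS - scheduled_tasks()
print(sorted(forgotten))  # prints ['ComputeRetention']
```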
&lt;h2&gt;Date parameters&lt;/h2&gt;
&lt;p&gt;In an ETL system, most tasks will have a date(time) parameter which tells the code which day/hour to run the scripts for. For example, a Daily Active User (DAU) script computes the number of unique DAUs for a given day. Because this is such a common use-case, Luigi has a number of helper classes for dealing with date parameters:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;&lt;a href="http://luigi.readthedocs.org/en/stable/api/luigi.parameter.html#luigi.parameter.DateParameter"&gt;DateParameter&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="http://luigi.readthedocs.org/en/stable/api/luigi.parameter.html#luigi.parameter.MonthParameter"&gt;MonthParameter&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="http://luigi.readthedocs.org/en/stable/api/luigi.parameter.html#luigi.parameter.YearParameter"&gt;YearParameter&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="http://luigi.readthedocs.org/en/stable/api/luigi.parameter.html#luigi.parameter.DateHourParameter"&gt;DateHourParameter&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="http://luigi.readthedocs.org/en/stable/api/luigi.parameter.html#luigi.parameter.DateMinuteParameter"&gt;DateMinuteParameter&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="http://luigi.readthedocs.org/en/stable/api/luigi.parameter.html#luigi.parameter.DateIntervalParameter"&gt;DateIntervalParameter&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;
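&lt;p&gt;These classes mostly differ in how they parse and serialize the command-line value. A rough sketch of the core of a &lt;code&gt;DateParameter&lt;/code&gt;-style round-trip, using plain &lt;code&gt;datetime&lt;/code&gt; rather than Luigi’s actual implementation:&lt;/p&gt;

```python
import datetime

# Rough sketch of a DateParameter-style parse/serialize round-trip.
# Luigi's real classes do more (defaults, intervals), but this is
# the core idea.

def parse_date(s):
    return datetime.datetime.strptime(s, '%Y-%m-%d').date()

def serialize_date(d):
    return d.strftime('%Y-%m-%d')

d = parse_date('2015-01-15')
print(serialize_date(d))       # prints 2015-01-15
print(d.year, d.month, d.day)  # prints 2015 1 15
```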
&lt;p&gt;Often tasks have to be re-run for a number of days. One way to do this is to call luigi repeatedly from the command line. Or we can use the built-in &lt;code&gt;RangeDailyBase&lt;/code&gt; (also &lt;code&gt;RangeHourlyBase&lt;/code&gt;) helpers:&lt;/p&gt;
&lt;table class="highlighttable"&gt;&lt;tr&gt;&lt;td class="linenos"&gt;&lt;div class="linenodiv"&gt;&lt;pre&gt;1
2
3
4&lt;/pre&gt;&lt;/div&gt;&lt;/td&gt;&lt;td class="code"&gt;&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;span class="c1"&gt;# instead of calling this repeatedly:&lt;/span&gt;
    &lt;span class="c1"&gt;# luigi task Task --date 2015-01-XX&lt;/span&gt;
&lt;span class="c1"&gt;# do this:&lt;/span&gt;
$ luigi --module task RangeDailyBase --of Task --start &lt;span class="m"&gt;2015&lt;/span&gt;-01-01 --stop &lt;span class="m"&gt;2015&lt;/span&gt;-01-31
&lt;/pre&gt;&lt;/div&gt;
&lt;/td&gt;&lt;/tr&gt;&lt;/table&gt;

&lt;p&gt;The name of the task’s date parameter can be specified with &lt;code&gt;--param_name&lt;/code&gt;.&lt;/p&gt;
&lt;p&gt;When we pass in a large number of dates (as an interval), the &lt;code&gt;RangeXBase&lt;/code&gt; classes will instantiate a task object for each date and call &lt;code&gt;complete()&lt;/code&gt; to check whether that task needs to be run. This can be very slow, eg. if each check opens and then closes a database connection.
There are two optimized classes, &lt;code&gt;RangeDaily&lt;/code&gt; and &lt;code&gt;RangeHourly&lt;/code&gt;, that solve this problem. They are used from the command line just like the two &lt;code&gt;Base&lt;/code&gt; versions. But instead of instantiating many tasks which potentially don’t have to be run, they call the task’s &lt;code&gt;bulk_complete()&lt;/code&gt; classmethod to get the list of dates which still have to be run. So the user has to implement &lt;code&gt;bulk_complete()&lt;/code&gt; to use &lt;code&gt;RangeDaily&lt;/code&gt; and &lt;code&gt;RangeHourly&lt;/code&gt;.&lt;/p&gt;
&lt;table class="highlighttable"&gt;&lt;tr&gt;&lt;td class="linenos"&gt;&lt;div class="linenodiv"&gt;&lt;pre&gt;1&lt;/pre&gt;&lt;/div&gt;&lt;/td&gt;&lt;td class="code"&gt;&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;$ luigi --module task RangeDaily --of Task --start &lt;span class="m"&gt;2015&lt;/span&gt;-01-01 --stop &lt;span class="m"&gt;2015&lt;/span&gt;-01-31
&lt;/pre&gt;&lt;/div&gt;
&lt;/td&gt;&lt;/tr&gt;&lt;/table&gt;
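&lt;p&gt;The idea behind &lt;code&gt;bulk_complete()&lt;/code&gt; can be sketched in plain Python; here the &lt;code&gt;completed&lt;/code&gt; set stands in for a single cheap bulk query (eg. against a marker table), and all names are hypothetical:&lt;/p&gt;

```python
import datetime

# Sketch of bulk_complete()-style logic: given a date range, answer
# "which dates already have output?" with one bulk lookup instead of
# one complete() call per task. The 'completed' set is a stand-in
# for a single query against e.g. a marker table.

def daterange(start, stop):
    d = start
    while d != stop + datetime.timedelta(days=1):
        yield d
        d = d + datetime.timedelta(days=1)

def bulk_complete(start, stop, completed):
    return [d for d in daterange(start, stop) if d in completed]

completed = {datetime.date(2015, 1, 1), datetime.date(2015, 1, 2)}
start, stop = datetime.date(2015, 1, 1), datetime.date(2015, 1, 4)
done = bulk_complete(start, stop, completed)
missing = [d for d in daterange(start, stop) if d not in completed]
print(len(done), len(missing))  # prints 2 2
```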

&lt;p&gt;Note: it seems Luigi doesn’t support bulk &lt;em&gt;running&lt;/em&gt; of parameter intervals.&lt;/p&gt;
&lt;h2&gt;Scheduling&lt;/h2&gt;
&lt;p&gt;Most ETL systems have jobs which need to run every hour or every day. Luigi doesn’t have a concept of calendar scheduling, this is up to the user. The recommended method by the authors is to create sink tasks and run them from &lt;code&gt;cron&lt;/code&gt; when the external input files (eg. raw log files) are likely to be available.&lt;/p&gt;
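&lt;p&gt;For example, a crontab entry kicking off a hypothetical daily sink task after the previous day’s raw logs usually arrive might look like this (module name, task name and schedule are made up for illustration):&lt;/p&gt;

```shell
# Hypothetical crontab entry: run the daily sink task at 02:30,
# after the previous day's raw log files are usually available.
# Note that % is special in crontab and must be escaped as \%;
# 'date -d yesterday' assumes GNU date.
30 2 * * * luigi --module etl_tasks AllReportsDaily --date $(date -d yesterday +\%Y-\%m-\%d)
```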
&lt;p&gt;Rescheduling failed tasks is influenced by the following parameters in the central scheduler’s &lt;code&gt;luigi.cfg&lt;/code&gt;:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;&lt;code&gt;retry-delay&lt;/code&gt;: when to re-schedule, default 900 seconds&lt;/li&gt;
&lt;li&gt;&lt;code&gt;remove-delay&lt;/code&gt;: how long the central scheduler keeps tasks around that have no stakeholder; a stakeholder is a worker who uploaded that task&lt;/li&gt;
&lt;li&gt;&lt;code&gt;disable-hard-timeout&lt;/code&gt;: if a task fails again after this much time, it is disabled for good&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;In the worker’s &lt;code&gt;luigi.cfg&lt;/code&gt;:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;&lt;code&gt;worker-keep-alive&lt;/code&gt;: you probably need to set this to true, so that workers stay alive when they temporarily run out of runnable jobs, as long as they still have pending jobs waiting on dependencies. Otherwise workers will disconnect from the central scheduler and exit if there’s nothing to do, even if there are tasks which will become runnable a few minutes from now.&lt;/li&gt;
&lt;li&gt;&lt;code&gt;retry-external-tasks&lt;/code&gt;: If true, incomplete external tasks (i.e. tasks where the &lt;code&gt;run()&lt;/code&gt; method is &lt;code&gt;NotImplemented&lt;/code&gt;) will be retested for completion while Luigi is running. This means that if external dependencies are satisfied after a workflow has started, any tasks dependent on that resource will be eligible for running.&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;The central scheduler has a feature called task history. This logs task completion to a database, and exposes it on the dashboard.&lt;/p&gt;
&lt;p&gt;For tasks where the output is a database table, Luigi needs to keep track of successful inserts. It uses a special marker table for this (set with &lt;code&gt;marker-table&lt;/code&gt; in &lt;code&gt;luigi.cfg&lt;/code&gt;, default name is &lt;code&gt;table_updates&lt;/code&gt;). When a task finishes whose target is a database table, an entry is created in the marker table with the task’s &lt;code&gt;task_id&lt;/code&gt; (its name and parameter values). When the target’s &lt;code&gt;exists()&lt;/code&gt; method is called, this marker table is queried to check whether the task has been run (the &lt;code&gt;task_id&lt;/code&gt; is passed by the task to the &lt;code&gt;Target&lt;/code&gt; in its constructor).&lt;/p&gt;
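&lt;p&gt;The mechanism can be sketched with &lt;code&gt;sqlite3&lt;/code&gt;; the real implementation lives in Luigi’s database contrib code, and the schema details may differ:&lt;/p&gt;

```python
import sqlite3

# Sketch of the marker-table idea: one row per finished task_id,
# queried by the target's exists(). Table name and schema are
# illustrative, not Luigi's exact layout.

conn = sqlite3.connect(':memory:')
conn.execute('CREATE TABLE table_updates (task_id TEXT PRIMARY KEY)')

def mark_complete(task_id):
    # Called when a task whose target is a database table finishes.
    conn.execute('INSERT INTO table_updates (task_id) VALUES (?)', (task_id,))

def exists(task_id):
    # Called by the target's exists() to decide if the task must run.
    row = conn.execute('SELECT 1 FROM table_updates WHERE task_id = ?',
                       (task_id,)).fetchone()
    return row is not None

task_id = 'ComputeDau(date=2015-01-15)'
print(exists(task_id))   # prints False
mark_complete(task_id)
print(exists(task_id))   # prints True
```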
&lt;h2&gt;Resources&lt;/h2&gt;
&lt;p&gt;Resources can be used to introduce limits on task parallelism. For example, suppose we never want to run more than 10 mysql tasks, or we never want to run more than 3 instances of the hourly job &lt;code&gt;count_users&lt;/code&gt;.&lt;/p&gt;
&lt;p&gt;Resources are declared in the &lt;code&gt;luigi.cfg&lt;/code&gt; file of the scheduler:&lt;/p&gt;
&lt;table class="highlighttable"&gt;&lt;tr&gt;&lt;td class="linenos"&gt;&lt;div class="linenodiv"&gt;&lt;pre&gt;1
2
3&lt;/pre&gt;&lt;/div&gt;&lt;/td&gt;&lt;td class="code"&gt;&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;resources&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;
&lt;span class="n"&gt;mysql&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="mi"&gt;10&lt;/span&gt;
&lt;span class="n"&gt;count_users&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="mi"&gt;3&lt;/span&gt;
&lt;/pre&gt;&lt;/div&gt;
&lt;/td&gt;&lt;/tr&gt;&lt;/table&gt;

&lt;p&gt;Resource usage is declared in the &lt;code&gt;resources&lt;/code&gt; property of the task object in the Python code, like:&lt;/p&gt;
&lt;table class="highlighttable"&gt;&lt;tr&gt;&lt;td class="linenos"&gt;&lt;div class="linenodiv"&gt;&lt;pre&gt;1&lt;/pre&gt;&lt;/div&gt;&lt;/td&gt;&lt;td class="code"&gt;&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;span class="n"&gt;resources&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="err"&gt;“&lt;/span&gt;&lt;span class="n"&gt;mysql&lt;/span&gt;&lt;span class="err"&gt;”&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="mi"&gt;2&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt; &lt;span class="c1"&gt;# using 2 mysql connections in this task&lt;/span&gt;
&lt;/pre&gt;&lt;/div&gt;
&lt;/td&gt;&lt;/tr&gt;&lt;/table&gt;

&lt;h2&gt;Contrib stuff&lt;/h2&gt;
&lt;p&gt;Luigi has an impressive library of stock &lt;code&gt;Target&lt;/code&gt; and &lt;code&gt;Task&lt;/code&gt; classes, each with lots of functionality baked in as helper methods. This is the big reason why I think Luigi is popular and why I would consider using it.&lt;/p&gt;
&lt;p&gt;Luigi has &lt;code&gt;Task&lt;/code&gt; and &lt;code&gt;Target&lt;/code&gt; classes which support:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;Google BigQuery&lt;/li&gt;
&lt;li&gt;Hadoop jobs&lt;/li&gt;
&lt;li&gt;Hive queries&lt;/li&gt;
&lt;li&gt;Pig queries&lt;/li&gt;
&lt;li&gt;Scalding jobs&lt;/li&gt;
&lt;li&gt;Spark jobs&lt;/li&gt;
&lt;li&gt;PostgreSQL, Redshift, MySQL tables&lt;/li&gt;
&lt;li&gt;and more… &lt;/li&gt;
&lt;/ul&gt;
&lt;h2&gt;Source code and tests&lt;/h2&gt;
&lt;p&gt;I spent a fair amount of time digging through the Luigi source. It’s clean Python code with a lot of tests: about 18 KLOC of code plus 16 KLOC of tests. It’s easy to understand and extend.&lt;/p&gt;
&lt;h2&gt;Sample cases&lt;/h2&gt;
&lt;p&gt;Trying it out on a free &lt;a href="http://c9.io"&gt;cloud9&lt;/a&gt; Docker instance:&lt;/p&gt;
&lt;table class="highlighttable"&gt;&lt;tr&gt;&lt;td class="linenos"&gt;&lt;div class="linenodiv"&gt;&lt;pre&gt;1
2
3
4
5
6
7&lt;/pre&gt;&lt;/div&gt;&lt;/td&gt;&lt;td class="code"&gt;&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;$ pip install tornado &lt;span class="c1"&gt;# luigi uses the tornado web server&lt;/span&gt;
$ &lt;span class="nb"&gt;export&lt;/span&gt; &lt;span class="nv"&gt;PATH&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="nv"&gt;$PATH&lt;/span&gt;:/home/ubuntu/workspace/luigi/bin
$ &lt;span class="nb"&gt;export&lt;/span&gt; &lt;span class="nv"&gt;PYTHONPATH&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;/home/ubuntu/workspace/luigi:.
$ luigid
&lt;span class="m"&gt;2015&lt;/span&gt;-12-19 &lt;span class="m"&gt;14&lt;/span&gt;:18:08,492 luigi-interface&lt;span class="o"&gt;[&lt;/span&gt;&lt;span class="m"&gt;11022&lt;/span&gt;&lt;span class="o"&gt;]&lt;/span&gt; INFO: Loaded &lt;span class="o"&gt;[]&lt;/span&gt;
&lt;span class="m"&gt;2015&lt;/span&gt;-12-19 &lt;span class="m"&gt;14&lt;/span&gt;:18:08,494 luigi.server&lt;span class="o"&gt;[&lt;/span&gt;&lt;span class="m"&gt;11022&lt;/span&gt;&lt;span class="o"&gt;]&lt;/span&gt; INFO: No prior state file exists at /var/lib/luigi-server/state.pickle. Starting with clean slate
&lt;span class="m"&gt;2015&lt;/span&gt;-12-19 &lt;span class="m"&gt;14&lt;/span&gt;:18:08,497 luigi.server&lt;span class="o"&gt;[&lt;/span&gt;&lt;span class="m"&gt;11022&lt;/span&gt;&lt;span class="o"&gt;]&lt;/span&gt; INFO: Scheduler starting up
&lt;/pre&gt;&lt;/div&gt;
&lt;/td&gt;&lt;/tr&gt;&lt;/table&gt;

&lt;p&gt;In another terminal, run the stock Luigi example:&lt;/p&gt;
&lt;table class="highlighttable"&gt;&lt;tr&gt;&lt;td class="linenos"&gt;&lt;div class="linenodiv"&gt;&lt;pre&gt;1
2
3
4
5&lt;/pre&gt;&lt;/div&gt;&lt;/td&gt;&lt;td class="code"&gt;&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;$ &lt;span class="nb"&gt;cd&lt;/span&gt; luigi/examples
$ luigi --module top_artists AggregateArtists --date-interval &lt;span class="m"&gt;2012&lt;/span&gt;-06
&lt;span class="c1"&gt;# does the job, creates files locally!&lt;/span&gt;
$ luigi --module top_artists AggregateArtists --date-interval &lt;span class="m"&gt;2012&lt;/span&gt;-06
&lt;span class="c1"&gt;# notices files are there, doesn’t do anything&lt;/span&gt;
&lt;/pre&gt;&lt;/div&gt;
&lt;/td&gt;&lt;/tr&gt;&lt;/table&gt;

&lt;p&gt;Let’s play around with Luigi by creating this x.py:&lt;/p&gt;
&lt;table class="highlighttable"&gt;&lt;tr&gt;&lt;td class="linenos"&gt;&lt;div class="linenodiv"&gt;&lt;pre&gt; 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
12
13
14
15
16
17
18
19
20
21&lt;/pre&gt;&lt;/div&gt;&lt;/td&gt;&lt;td class="code"&gt;&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="nn"&gt;time&lt;/span&gt;
&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="nn"&gt;luigi&lt;/span&gt;

&lt;span class="k"&gt;class&lt;/span&gt; &lt;span class="nc"&gt;X&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;luigi&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;WrapperTask&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
    &lt;span class="n"&gt;task_namespace&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="s1"&gt;&amp;#39;examples&amp;#39;&lt;/span&gt;
    &lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;run&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="bp"&gt;self&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
        &lt;span class="k"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s2"&gt;&amp;quot;Running X&amp;quot;&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;requires&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="bp"&gt;self&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
        &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;i&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="nb"&gt;range&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;10&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
            &lt;span class="k"&gt;yield&lt;/span&gt; &lt;span class="n"&gt;Bar&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;i&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="k"&gt;class&lt;/span&gt; &lt;span class="nc"&gt;Bar&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;luigi&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Task&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
    &lt;span class="n"&gt;task_namespace&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="s1"&gt;&amp;#39;examples&amp;#39;&lt;/span&gt;
    &lt;span class="n"&gt;num&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;luigi&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;IntParameter&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
    &lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;run&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="bp"&gt;self&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
        &lt;span class="k"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s2"&gt;&amp;quot;Bar &lt;/span&gt;&lt;span class="si"&gt;%s&lt;/span&gt;&lt;span class="s2"&gt;&amp;quot;&lt;/span&gt; &lt;span class="o"&gt;%&lt;/span&gt; &lt;span class="bp"&gt;self&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;num&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
        &lt;span class="n"&gt;time&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;sleep&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;10&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
        &lt;span class="bp"&gt;self&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;output&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;open&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;w&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;close&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
        &lt;span class="k"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s2"&gt;&amp;quot;Bar touched &lt;/span&gt;&lt;span class="si"&gt;%s&lt;/span&gt;&lt;span class="s2"&gt;&amp;quot;&lt;/span&gt; &lt;span class="o"&gt;%&lt;/span&gt; &lt;span class="bp"&gt;self&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;output&lt;/span&gt;&lt;span class="p"&gt;())&lt;/span&gt;
    &lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;output&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="bp"&gt;self&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
        &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;luigi&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;LocalTarget&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;/tmp/bar/&lt;/span&gt;&lt;span class="si"&gt;%d&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;&lt;/span&gt; &lt;span class="o"&gt;%&lt;/span&gt; &lt;span class="bp"&gt;self&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;num&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/pre&gt;&lt;/div&gt;
&lt;/td&gt;&lt;/tr&gt;&lt;/table&gt;

&lt;p&gt;And run it like:&lt;/p&gt;
&lt;table class="highlighttable"&gt;&lt;tr&gt;&lt;td class="linenos"&gt;&lt;div class="linenodiv"&gt;&lt;pre&gt;1&lt;/pre&gt;&lt;/div&gt;&lt;/td&gt;&lt;td class="code"&gt;&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;$ luigi --module x examples.X
&lt;/pre&gt;&lt;/div&gt;
&lt;/td&gt;&lt;/tr&gt;&lt;/table&gt;

&lt;p&gt;This will create an &lt;code&gt;X&lt;/code&gt; task and 10 &lt;code&gt;Bar&lt;/code&gt; tasks. The 10 &lt;code&gt;Bar&lt;/code&gt; tasks will touch &lt;code&gt;/tmp/bar/…&lt;/code&gt; and that’s it.&lt;/p&gt;
&lt;p&gt;Now let’s delete the tmp files and create a similarly named y.py, with identical &lt;code&gt;X&lt;/code&gt; and &lt;code&gt;Bar&lt;/code&gt; tasks, except &lt;code&gt;X&lt;/code&gt; renamed to &lt;code&gt;Y&lt;/code&gt;. Let’s launch two workers, one with x and one with y. Notice that the central scheduler merges the dependency graphs and treats the &lt;code&gt;Bar&lt;/code&gt; tasks coming from the different workers/codebases as the same, because their &lt;code&gt;task_id&lt;/code&gt;s (class name plus parameters) are identical. It’s a bit weird, but this is how Luigi works.&lt;/p&gt;
&lt;p&gt;Another thing you’ll notice is that at the end of the execution, one of &lt;code&gt;X&lt;/code&gt; and &lt;code&gt;Y&lt;/code&gt; will be unfinished (not green on the dashboard). This is because the workers are run without &lt;code&gt;--worker-keep-alive&lt;/code&gt;: the first worker that finishes its own tasks while waiting for the other worker to finish the last &lt;code&gt;Bar&lt;/code&gt; will exit, since it has nothing left to do. If that worker was eg. the x worker, then task &lt;code&gt;X&lt;/code&gt; is never run by anyone! If we turn on &lt;code&gt;--worker-keep-alive&lt;/code&gt; on the command line, this oddity goes away.&lt;/p&gt;
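&lt;p&gt;The merging behaviour follows from how Luigi identifies tasks. A rough sketch of task-id construction (the real format differs in its details):&lt;/p&gt;

```python
# Rough sketch of why the scheduler merges tasks from x.py and y.py:
# a task's identity is derived from its class name plus its parameter
# values, with no notion of which file or worker it came from.

def task_id(family, params):
    joined = ','.join(f'{k}={v}' for k, v in sorted(params.items()))
    return f'{family}({joined})'

# Bar(5) defined in x.py and Bar(5) defined in y.py:
id_from_x = task_id('Bar', {'num': 5})
id_from_y = task_id('Bar', {'num': 5})
print(id_from_x)               # prints Bar(num=5)
print(id_from_x == id_from_y)  # prints True: the scheduler sees one task
```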
&lt;h2&gt;Conclusion&lt;/h2&gt;
&lt;p&gt;When designing an ETL framework, I would make (and have made) different design decisions compared to Luigi. But if I were tasked with creating a new ETL framework from scratch (eg. at a new company), I would definitely consider using Luigi. There is simply too much useful stuff there to ignore (and re-implement).
&lt;strong&gt;However, I would expect to:&lt;/strong&gt;&lt;/p&gt;
&lt;ol&gt;
&lt;li&gt;Find unexpected and painful behaviour in Luigi.&lt;/li&gt;
&lt;li&gt;Write significant scaffolding code to make it useful:&lt;ol&gt;
&lt;li&gt;Syncing the task library to different workers&lt;/li&gt;
&lt;li&gt;Scheduling series of tasks&lt;/li&gt;
&lt;li&gt;Monitoring&lt;/li&gt;
&lt;li&gt;Alerting&lt;/li&gt;
&lt;li&gt;Dashboard for the ETL datasets and jobs (see below)&lt;/li&gt;
&lt;/ol&gt;
&lt;/li&gt;
&lt;/ol&gt;
&lt;p&gt;&lt;strong&gt;Downsides of Luigi:&lt;/strong&gt;&lt;/p&gt;
&lt;ol&gt;
&lt;li&gt;Sometimes unexpected behaviour: for example, a wrapper task can reach &lt;code&gt;DONE&lt;/code&gt; status without ever running the &lt;code&gt;run()&lt;/code&gt; method depending on non-deterministic execution order.&lt;/li&gt;
&lt;li&gt;The biggest downside to Luigi is that ETL jobs are specified as programmatic Python &lt;code&gt;Task&lt;/code&gt; objects and not given in some sort of DSL. This means no external tool can reasonably/easily parse a library of tasks and extract dependency information, which would be useful eg. for generating documentation of the ETL system. Also, analysts have to learn Python.&lt;/li&gt;
&lt;li&gt;The web dashboard of the central scheduler is basically useless.&lt;/li&gt;
&lt;/ol&gt;
&lt;h2&gt;Links, talks&lt;/h2&gt;
&lt;ul&gt;
&lt;li&gt;&lt;a href="https://github.com/spotify/luigi"&gt;Github&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="http://luigi.readthedocs.org/en/stable/index.html"&gt;Docs&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://github.com/spotify/luigi#who-uses-luigi"&gt;Slides from Luigi users&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;</content><category term="data"></category><category term="etl"></category><category term="workflow"></category><category term="luigi"></category></entry><entry><title>Cargo Cult Data</title><link href="https://bytepawn.com/cargo-cult-data.html" rel="alternate"></link><published>2015-01-26T00:00:00+01:00</published><updated>2015-12-22T00:00:00+01:00</updated><author><name>Marton Trencseni</name></author><id>tag:bytepawn.com,2015-01-26:/cargo-cult-data.html</id><summary type="html">&lt;p&gt;Cargo cult data is when you're collecting and looking at data when making decisions, but you're only following the forms and outside appearances of scientific investigation and missing the essentials, so it doesn't work. &lt;br/&gt;&lt;br/&gt;&lt;img src="/images/cargo_cult_data.jpg" alt="Cargo cult data" style="width: 400px;"/&gt;&lt;/p&gt;</summary><content type="html">&lt;h2&gt;Cargo cult science&lt;/h2&gt;
&lt;p&gt;&lt;a href="https://en.wikipedia.org/wiki/Richard_Feynman"&gt;R. P. Feynman&lt;/a&gt; was a Nobel-prize winning physicist who coined the term &lt;a href="https://en.wikipedia.org/wiki/Cargo_cult_science"&gt;cargo cult science&lt;/a&gt;. In Feynman's words:&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;In the South Seas there is a cargo cult of people. During the [second world] war they saw airplanes land with lots of good materials, and they want the same thing to happen now [after the Americans left]. So they've arranged to imitate things like runways, to put fires along the sides of the runways, to make a wooden hut for a man to sit in, with two wooden pieces on his head like headphones and bars of bamboo sticking out like antennas—he's the controller—and they wait for the airplanes to land. They're doing everything right. The form is perfect. It looks exactly the way it looked before. But it doesn't work. No airplanes land. So I call these things cargo cult science, because they follow all the apparent precepts and forms of scientific investigation, but they're missing something essential, because the planes don't land.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;Feynman cautioned that to avoid becoming cargo cult scientists, researchers must avoid fooling themselves, be willing to question and doubt their own theories and their own results, and investigate possible flaws in a theory or an experiment. He recommended that researchers adopt an unusually high level of honesty which is rarely encountered in everyday life, and gave examples from advertising, politics, and behavioral psychology to illustrate the everyday dishonesty which should be unacceptable in science.&lt;/p&gt;
&lt;p&gt;&lt;img src="/images/cargo_cult_data.jpg" alt="Cargo cult data" style="width: 400px;"/&gt;&lt;/p&gt;
&lt;h2&gt;Cargo cult data&lt;/h2&gt;
&lt;p&gt;The same idea applies to data. Cargo cult data is when you're collecting and looking at data when making decisions, but you're only following the forms of scientific investigation and missing the essentials, so it doesn't work. &lt;em&gt;So in the end you're like the natives of the South Seas, and the planes don't land for you either.&lt;/em&gt;&lt;/p&gt;
&lt;p&gt;Signs that you're doing cargo cult data:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;you don't have standardized logging across your products&lt;/li&gt;
&lt;li&gt;you routinely break your logging and have holes in your dataset&lt;/li&gt;
&lt;li&gt;you don't have standardized KPIs across your products and company&lt;/li&gt;
&lt;li&gt;you're not A/B testing all your releases&lt;/li&gt;
&lt;li&gt;you don't have explicit hypotheses for your experiments&lt;/li&gt;
&lt;li&gt;you don't know what statistical power is&lt;/li&gt;
&lt;li&gt;you confuse statistical significance and magnitude of change&lt;/li&gt;
&lt;li&gt;you stop A/B tests as soon as they're statistically significant (=peeking)&lt;/li&gt;
&lt;li&gt;you're not tracking your experiments and their outcomes historically&lt;/li&gt;
&lt;li&gt;you don't display and think about standard deviation and standard error on diagrams (=confuse signal and noise)&lt;/li&gt;
&lt;/ul&gt;
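&lt;p&gt;The peeking point deserves emphasis, because it is easy to demonstrate. The following simulation (a sketch; sample sizes, batch counts and the conversion rate are arbitrary choices) runs A/A tests with no true effect, and shows that stopping at the first p-value below 0.05 yields far more false positives than testing once at a fixed horizon:&lt;/p&gt;

```python
import math
import random

# Simulation of the "peeking" problem on A/A tests (no true effect).
# Sample sizes, batch counts and the 5% conversion rate are arbitrary
# choices for illustration.

random.seed(42)

def z_test_p(sa, na, sb, nb):
    # Two-sided two-proportion z-test, normal approximation.
    p_pool = (sa + sb) / (na + nb)
    se = math.sqrt(p_pool * (1 - p_pool) * (1 / na + 1 / nb))
    if se == 0:
        return 1.0
    z = abs(sa / na - sb / nb) / se
    return 2 * (1 - 0.5 * (1 + math.erf(z / math.sqrt(2))))

def run_experiment(peeking, batches=20, batch_size=100, rate=0.05):
    sa = sb = na = nb = 0
    for _ in range(batches):
        sa += sum(1 for _ in range(batch_size) if rate > random.random())
        sb += sum(1 for _ in range(batch_size) if rate > random.random())
        na += batch_size
        nb += batch_size
        if peeking and 0.05 > z_test_p(sa, na, sb, nb):
            return True  # stopped early, declared a bogus "winner"
    return 0.05 > z_test_p(sa, na, sb, nb)

trials = 300
peek_fp = sum(run_experiment(True) for _ in range(trials)) / trials
fixed_fp = sum(run_experiment(False) for _ in range(trials)) / trials
print('peeking false-positive rate:', peek_fp)
print('fixed-horizon false-positive rate:', fixed_fp)
print(peek_fp > fixed_fp)  # peeking inflates the false-positive rate
```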
&lt;p&gt;There are no easy answers for avoiding cargo cult data, just as there are none for avoiding cargo cult science. If you are thinking about this as a company, your best bet is to hire smart mathematicians or physicists for your data team and listen to what they say. Individually, it's a matter of understanding statistics and being disciplined in your work. Fortunately there are &lt;a href="https://www.coursera.org/specializations/jhu-data-science"&gt;great courses on Coursera&lt;/a&gt;, &lt;a href="http://www.amazon.com/s/ref=dp_byline_sr_book_1?ie=UTF8&amp;amp;text=Allen+B.+Downey&amp;amp;search-alias=books&amp;amp;field-author=Allen+B.+Downey&amp;amp;sort=relevancerank"&gt;great books on Amazon&lt;/a&gt; and a &lt;a href="https://en.wikipedia.org/wiki/A/B_testing"&gt;wealth of information available online&lt;/a&gt;.&lt;/p&gt;</content><category term="data"></category></entry></feed>