Monthly Archives: April 2018

Variance Reduction Techniques to Improve Power of the Test

Improving the Power of Our Experiment: Now that we understand the system we are dealing with, we can ask the question: how can we increase the detectable effect size of our experiments? We are left with a few options: increase the effect size, increase the sample size, or decrease the variance. Increasing the effect size may […]
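
To make the three options concrete, here is a minimal sketch, assuming a two-sided, two-sample z-test with equal group sizes and a known standard deviation. The function name `samples_per_group` and the numbers passed to it are illustrative and do not come from the post. It uses the standard approximation n = 2(z_{1-α/2} + z_{power})² σ² / δ², which shows that halving the variance roughly halves the sample size needed to detect the same effect.

```python
import math
from statistics import NormalDist


def samples_per_group(delta, sigma, alpha=0.05, power=0.8):
    """Approximate per-group sample size for a two-sided two-sample z-test.

    delta: difference in means we want to detect (assumed value)
    sigma: standard deviation of the metric (assumed known and equal in both groups)
    """
    z = NormalDist()
    z_alpha = z.inv_cdf(1 - alpha / 2)  # critical value for the significance level
    z_power = z.inv_cdf(power)          # quantile matching the desired power
    return math.ceil(2 * (z_alpha + z_power) ** 2 * sigma ** 2 / delta ** 2)


# Hypothetical metric: sigma = 10, and we want to detect a shift of 0.5.
baseline = samples_per_group(delta=0.5, sigma=10.0)
# Same effect, but with the variance halved (sigma divided by sqrt(2)).
reduced = samples_per_group(delta=0.5, sigma=10.0 / math.sqrt(2))

print(baseline, reduced)
```

Under these assumptions, the second call needs about half as many samples as the first, which is why variance reduction is attractive when the effect size and the traffic are hard to change.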

Central Limit Theorem, Violations & Remedy

Normal Distribution: About 68% of values drawn from a normal distribution are within one standard deviation σ of the mean; about 95% of the values lie within two standard deviations; and about 99.7% are within three standard deviations. This fact is known as the 68-95-99.7 (empirical) rule, or the 3-sigma rule. What is the Central Limit Theorem (CLT)? In […]
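
The 68-95-99.7 figures can be checked directly. For a normal variable, the probability of falling within k standard deviations of the mean is erf(k / √2). A short sketch using only the standard library (the helper name `prob_within` is illustrative):

```python
import math


def prob_within(k):
    """P(|X - mu| < k * sigma) for a normally distributed X."""
    return math.erf(k / math.sqrt(2))


for k in (1, 2, 3):
    print(f"within {k} sigma: {prob_within(k):.4f}")
# within 1 sigma: 0.6827
# within 2 sigma: 0.9545
# within 3 sigma: 0.9973
```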

TPG, Carlyle Consortium Acquires Baidu Financial Services Group For $1.9B

Chinese internet giant Baidu Inc. has sold a majority stake in its Financial Services Group (Baidu FSG) for US$1.9 billion to an investor group led by TPG and The Carlyle Group, with participation from Taikang Group, ABC International Holdings and others. China Money Network first reported that Baidu was planning to “dispose of a majority equity stake” in FSG on Friday. TPG […]

Multi-Armed Bandits and Contextual-Bandit

Multi-armed bandits use machine learning algorithms to minimize opportunity cost and regret. They’re more efficient because they move traffic towards winning variations gradually, instead of forcing you to wait for a “final answer” at the end of an experiment. They’re faster because samples that would have gone to obviously inferior variations can be assigned to […]
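
The excerpt does not say which bandit algorithm is used, so the following is only a sketch of one common choice, Thompson sampling with Beta posteriors over conversion rates. The function name, the conversion rates and the number of rounds are all hypothetical. It illustrates how traffic drifts towards the better variation as evidence accumulates.

```python
import random


def thompson_sampling(true_rates, rounds=5000, seed=0):
    """Simulate Bernoulli Thompson sampling; return how often each arm was shown.

    true_rates: hypothetical conversion rates, unknown to the algorithm.
    """
    rng = random.Random(seed)
    n_arms = len(true_rates)
    wins = [0] * n_arms
    losses = [0] * n_arms
    for _ in range(rounds):
        # Draw one plausible conversion rate per arm from its Beta posterior
        # (Beta(1, 1) prior, i.e. uniform before any data).
        draws = [rng.betavariate(wins[i] + 1, losses[i] + 1) for i in range(n_arms)]
        arm = max(range(n_arms), key=draws.__getitem__)
        # Show the chosen variation and observe a simulated conversion.
        if rng.random() < true_rates[arm]:
            wins[arm] += 1
        else:
            losses[arm] += 1
    return [w + l for w, l in zip(wins, losses)]


pulls = thompson_sampling([0.05, 0.10, 0.30])
print(pulls)  # most of the traffic should end up on the last arm
```

Early on, the draws are noisy and every arm gets traffic. As the posteriors sharpen, inferior arms are sampled less and less, which is the "gradual" reallocation the excerpt describes.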

Multivariate Tests (Orthogonal Design)

Because resources are limited, it is very important to get the most information from each experiment you do. Well-designed experiments can produce significantly more information and often require fewer runs than haphazard or unplanned experiments. Also, a well-designed experiment will ensure that you can evaluate the effects that you have identified as important. As a […]

Recency Frequency Monetary (RFM) Customer Segmentation

RFM analysis is a marketing technique used to quantitatively determine which customers are the best ones by examining their shopping behaviour: how recently a customer has purchased (recency), how often they purchase (frequency), and how much the customer spends (monetary). RFM analysis is based on an extension of Pareto’s principle which says that […]
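
As a rough illustration of the three quantities, the sketch below computes recency, frequency and monetary value per customer from a small transaction list. The data, the reference date and the function name `rfm_table` are made up for the example. Scoring or segmenting customers on these values would be a further step that is not shown here.

```python
from datetime import date

# Hypothetical transactions: (customer_id, purchase_date, amount)
transactions = [
    ("alice", date(2018, 4, 20), 120.0),
    ("alice", date(2018, 3, 2), 80.0),
    ("bob", date(2018, 1, 15), 40.0),
    ("carol", date(2018, 4, 1), 300.0),
    ("carol", date(2018, 2, 10), 150.0),
    ("carol", date(2017, 12, 5), 50.0),
]


def rfm_table(rows, today):
    """Return {customer: {"recency_days", "frequency", "monetary"}}."""
    table = {}
    for customer, day, amount in rows:
        rec = table.setdefault(
            customer, {"recency_days": None, "frequency": 0, "monetary": 0.0}
        )
        days_ago = (today - day).days
        # Recency is the number of days since the most recent purchase.
        if rec["recency_days"] is None or days_ago < rec["recency_days"]:
            rec["recency_days"] = days_ago
        rec["frequency"] += 1       # number of purchases
        rec["monetary"] += amount   # total spend
    return table


rfm = rfm_table(transactions, today=date(2018, 4, 30))
for customer, values in sorted(rfm.items()):
    print(customer, values)
```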

Using Google’s Convolutional Neural Networks (CNN) for Image Recognition

Convolutional neural networks are the state-of-the-art technique for image recognition, that is, identifying objects such as people or cars in pictures. We call this a “deep neural network” because it has more layers than a traditional neural network. How Convolution Works: Instead of feeding entire images into our neural network as one […]

Sequential Probability Ratio Test

What is a Sequential Probability Ratio Test? A sequential probability ratio test (SPRT) is a hypothesis test for sequential samples. Sequential sampling works in a very non-traditional way; instead of a fixed sample size, you choose one item (or a few) at a time, and then test your hypothesis. You can either: Reject the null hypothesis (H0) in favor of […]
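
The excerpt is cut off before it lists the possible decisions, so the sketch below falls back on the textbook Wald formulation for Bernoulli data rather than on anything stated in the post. After each observation the log-likelihood ratio is compared with two thresholds, log((1 - β) / α) and log(β / (1 - α)), and the test rejects H0, accepts H0, or keeps sampling. The function name and all parameter values are assumptions.

```python
import math


def sprt_bernoulli(observations, p0, p1, alpha=0.05, beta=0.2):
    """Wald's SPRT for H0: p = p0 against H1: p = p1 (with p1 > p0).

    Returns (decision, number_of_observations_used).
    """
    upper = math.log((1 - beta) / alpha)   # cross this: reject H0
    lower = math.log(beta / (1 - alpha))   # cross this: accept H0
    llr = 0.0                              # running log-likelihood ratio
    for n, x in enumerate(observations, start=1):
        if x:
            llr += math.log(p1 / p0)
        else:
            llr += math.log((1 - p1) / (1 - p0))
        if llr >= upper:
            return "reject H0", n
        if llr <= lower:
            return "accept H0", n
    return "continue sampling", len(observations)


# Hypothetical streams of 0/1 outcomes, tested one item at a time.
print(sprt_bernoulli([1] * 10, p0=0.1, p1=0.3))
print(sprt_bernoulli([0] * 10, p0=0.1, p1=0.3))
```

The point of the design is that the sample size is not fixed in advance: a clear-cut stream stops after a handful of items, while an ambiguous one keeps returning "continue sampling".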

How to Choose the Right KPI?

Product key performance indicators (KPIs) are metrics that measure your product’s performance. They help you understand whether the product is meeting its business goals and whether the product strategy is working. Without KPIs, you end up guessing how your product is performing. Long- vs. short-term metrics; KPI/business metrics, counter metrics; optional (deep dive/debug/perf/safety metrics) […]

Sampling and Sampling Bias

In statistics, sampling bias is a bias in which a sample is collected in such a way that some members of the intended population are less likely to be included than others. It results in a biased sample, a non-random sample of a population (or non-human factors) in which all individuals, or instances, were not equally likely to have been selected. If this […]