I will be writing every couples of weeks about stuff that interests me in the fields of ML/AI/Data Science. These can be newly learned/researched topics or tutorials in the format of how-to posts. The aim of this blog is to give back to the community at large and share the knowledge I’ve acquired through my journey in ML/AI which is a life-long journey ðŸ˜„

Introduction Conda in an open source package management system that works on all platforms. It is a tool that helps manage packages and environments for different programming languages. Develop a high level understanding of how Conda works helped me at so many levels especially when it comes to managing environments and make my work more reproducable. Below are the notes that I wrote down during my journey of learning Conda and I always refere back to them:

Overview
Clustering Kmeans Algorithm Implementation Applications Geyser’s Eruptions Segmentation Image Compression Evaluation Methods Drawbacks Conclusion Clustering
Clustering is one of the most common exploratory data analysis technique used to get an intuition about the structure of the data. It can be defined as the task of identifying subgroups in the data such that data points in the same subgroup (cluster) are very similar while data points in different clusters are very different.

Dropout. Dropout is a regularization technique. On each iteration, we randomly shut down some neurons (units) on each layer and don’t use those neurons in both forward propagation and back-propagation. Since the units that will be dropped out on each iteration will be random, the learning algorithm will have no idea which neurons will be shut down on every iteration; therefore, force the learning algorithm to spread out the weights and not focus on some specific feattures (units).

Bias-Variance Trade-off Bias and variance as a function of model complexity (flexibility). Generalization (test) error is the most important metric in Machine/Deep Learning. It gives us an estimate on the performance of the model on unseen data. Test error is decomposed into 3 parts (see figure 1): Variance, Squared-Bias, and Irreducible Error. Models with high bias are not complex enough (too simple) for the data and tend to underfit.

Optimization, in Machine Learning/Deep Learning contexts, is the process of changing the model’s parameters to improve its performance. In other words, it’s the process of finding the best parameters in the predefined hypothesis space to get the best possible performance. There are three kinds of optimization algorithms:
Optimization algorithm that is not iterative and simply solves for one point. Optimization algorithm that is iterative in nature and converges to acceptable solution regardless of the parameters initialization such as gradient descent applied to logistic regression.

Gradients’ Directions. In the previous post, Coding Neural Network - Forward Propagation and Backpropagation, we implemented both forward propagation and backpropagation in numpy. However, implementing backpropagation from scratch is usually more prune to bugs/errors. Therefore, it’s necessary before running the neural network on training data to check if our implementation of backpropagation is correct. Before we start, let’s revisit what back-propagation is: We loop over the nodes in reverse topological order starting at the final node to compute the derivative of the cost with respect to each edge’s node tail.

Why Neural Networks?
According to Universal Approximate Theorem, Neural Networks can approximate as well as learn and represent any function given a large enough layer and desired error margin. The way neural network learns the true function is by building complex representations on top of simple ones. On each hidden layer, the neural network learns new feature space by first compute the affine (linear) transformations of the given inputs and then apply non-linear function which in turn will be the input of the next layer.

A/B testing can be defined as a randomized controlled experiment that allows us to test if there is a causal relationship between a change to a website/app and the user behavior. The change can be visible such as location of a button on the homepage or invisible such as the ranking/recommendation algorithms and backend infrastructure.
Web/Mobile developers and business stakeholders always face the following dilemma: Should we try out all ideas and explore all options continuously?

Introduction
The two most critical questions in the lending industry are: 1) How risky is the borrower? 2) Given the borrower’s risk, should we lend him/her? The answer to the first question determines the interest rate the borrower would have. Interest rate measures among other things (such as time value of money) the riskness of the borrower, i.e. the riskier the borrower, the higher the interest rate. With interest rate in mind, we can then determine if the borrower is eligible for the loan.

Iphone’s text suggestion. Have you ever wondered how Gmail automatic reply works? Or how your phone suggests next word when texting? Or even how a Neural Network can generate musical notes? The general way of generating a sequence of text is to train a model to predict the next word/character given all previous words/characters. Such model is called a Statistical Language Model. What is a statistical language model?