Data Science How to create and add a conda environment as Jupyter Kernel? As a data scientist, I work with Jupyter Notebook/Jupyter Lab daily. One thing that I used to google every time I started a new project is how to

Data Science When (& why) to use log transformation in regression? As a data scientist working on regression problems, I have often faced datasets with right-skewed target distributions. By googling it, I found out that log transformation can help

Data Science Python vs Swift for data science: Python's days are numbered At the TensorFlow Developer Summit in March 2018, Swift for TensorFlow was announced as an open-source project on GitHub. Later, in March 2019, Jeremy Howard, founder of fastai, announced that

Data Science Top 10 Technical Machine Learning YouTube Channels to follow In this article, I will present my 10 favorite Machine Learning YouTube channels to follow in order to keep up with current trends. The list of my favorite channels

Data Science Be a more efficient data scientist by using these Pandas Tricks In this article, I will present some useful Pandas commands that I use in my day-to-day job. So let’s get started. ✏️ Table of Contents: Fix messy dataframe column names, Remove Outliers, Inspect

Data Science Hypothesis Testing: Z-test & Student's t-test Today I am going to talk about Hypothesis Testing, which data scientists frequently use to: test a particular idea; construct an experiment to answer a particular question. ✏️ Table of

Data Science Pandas-Profiling: A useful EDA tool When loading a new data set, the first thing we do is to get an understanding of the data. This includes steps like determining the number of unique values, identifying

Data Science Linear Models Decoded - Part 1 In this article, I will present linear models as a series of questions and answers of the kind that can come up during an interview process. I will try to start from very

Interview Coding Problems Interview Coding Problems: 1. Two Sum 2. Longest Common Prefix 3. Longest Substring Without Repeating Characters A great way to improve your coding skills is by solving coding challenges. Solving different types of challenges and puzzles can help you become a better problem solver, learn the

Python How to "best" print out a list/tuple in Python For a long time I used to print out a Python list as below: a = [1, 2, 3]; print(a). This kind of works, but without printing it in

Python “is” vs “==” for Object Comparisons In this article, I will discuss the difference between the "==" and the "is" operators. The "==" operator compares by checking for equality, while the "is" operator compares

Data Science Tmux an essential tool for Data Scientists Many data scientists get overwhelmed by the multitude of tools available to them. In this article, I will present the Tmux tool. What it does best is turn a single

NLP What is an N-gram Multichannel Convolutional Neural Network for Text Classification Deep neural networks have achieved remarkable results in some NLP tasks. One of them is text classification, i.e., assigning a set of predefined tags to a text based on

Data Science Importance of Cross-Validation Validation is probably one of the most important techniques that a data scientist uses, as there is always a need to validate the stability of the machine learning model: how well

NLP Sentiment Analysis on IMDB movie dataset - Achieve state of the art result using a simple Neural Network In my previous articles, I used two models to predict whether the movie reviews were positive or negative using the IMDB dataset. If you haven't read those articles I would

Data Science What is an Embedding Layer? A couple of months ago I asked myself the same question, so I thought of writing an article to summarize and document my understanding of an embedding layer. ✏️ Table

NLP Sentiment Analysis on IMDB movie dataset - Achieve state of the art result using Logistic Regression In my previous article, I used the Naive Bayes model to predict whether the movie reviews were positive or negative using the IMDB dataset. If you haven't read this article

Data Science Support Vector Machine vs Logistic Regression Support Vector Machine (SVM) is an algorithm used for classification problems, similar to Logistic Regression (LR). LR and SVM with a linear kernel generally perform comparably in practice. The goal of

NLP Sentiment Analysis on IMDB movie dataset - Achieve state of the art result using Naive Bayes NLP refers to any kind of modelling where we are working with natural language text. Sentiment Analysis is one of the most common NLP tasks that Data Scientists need

NLP NLP Tutorial: MultiLabel Classification Problem using Linear Models This article presents in detail how to predict tags for StackOverflow posts using a linear model after carefully preprocessing our text features. Table of Contents: Introduction, Dataset, Import Libraries and Load the data, Text

NLP Transforming tokens into useful features (BOW,TF-IDF) In my previous article, I presented different methods to preprocess text and extract useful tokens. However, these tokens are only useful if you can transform them into features for your

NLP All you need to know about NLP Text Preprocessing Text preprocessing is a severely overlooked topic, and a lot of NLP applications fail badly due to the use of the wrong kind of text preprocessing. With that in mind, I thought of

Interview Coding Problems How to nail a Data Scientist SQL interview (includes SQL code snippets to tough interview questions) - Part 2 As we already know, Data Science/Machine Learning is such a multi-disciplinary field that the sheer volume of material you need to cover to feel properly prepared can become overwhelming. The

Data Science Random Forest regression model Advanced Topics (+ Python code snippet using Sklearn) In my previous article, I presented the Random Forest Regressor model. If you haven't read this article I would urge you to read it before continuing. In simple terms, a

Data Science Random Forest Regressor explained in depth In my previous article, I presented the Decision Tree Regressor algorithm. If you haven't read this article I would urge you to read it before continuing. The reason is that