K-Means Clustering: The Secret Weapon to Personalized Marketing Campaigns

INTRODUCTION Once upon a time, marketers relied on intuition, creativity, and a little bit of luck to execute their campaigns. They threw ideas at the wall to see what would stick. And while that approach may have had some charm, they live in a different era now. Today, they have the power of data…

Demystifying Data Scientist Salaries using Random Forest Algorithm

Introduction Data scientists are in high demand, and their salaries reflect that. But what factors truly influence how much they earn? Our analysis of a rich dataset sheds light on this often opaque topic, offering valuable insights for both aspiring and experienced data professionals. Variables Work_year: Year in which the individual was employed. Experience_level: Total work…

Using Hypothesis Testing and Exploratory Data Analysis To Unravel The Complex Tapestry of 2023 Unemployment Trends In India

INTRODUCTION In the dynamic landscape of India’s economic scenario, the study of unemployment trends plays a pivotal role in understanding the diverse factors influencing the workforce. This blog delves into the analysis of the unemployment rates across different states, shedding light on intriguing patterns and unexpected correlations. Unemployment and Literacy: An Unexpected Dichotomy Contrary to…

Air Quality Prediction Using FB PROPHET Algorithm

INTRODUCTION Some Indian cities rank among the most polluted cities in the world, and the threat of air pollution is growing day by day. Poor air quality in India is now considered a significant health challenge and a major obstacle to economic growth. The main pollutant emissions in India are due…

Employee Attrition Prediction Using SMOTE & Logistic Regression

INTRODUCTION In today’s competitive business landscape, organizations face a significant challenge in retaining talented employees. It has become crucial for organizations to understand the reasons behind departures. Attrition rate is the metric that quantifies the rate at which employees depart an organization, whether voluntarily or involuntarily. Having a clear view of your employee attrition rate…

Netflix Recommender System

Introduction The platform and production firm Netflix is based in the United States. Since its inception in 1997, Netflix has provided us with incredible entertainment, whether it be movies or television shows. Given the present global pandemic, Netflix has proven to be a great stress reliever for many of us. In numerous languages, genres, and…

Analyzing the Income Statement of Asian Paints

Introduction Analyzing the financial health of a company is essential for all stakeholders who want to make efficient, well-informed decisions. Python can be used to calculate financial ratios and even to predict finances using machine learning algorithms. We have taken the example of Asian Paints in order to…

Credit Card Clustering

Introduction The task of categorizing credit card customers according to their spending patterns, credit limits, and a variety of other financial characteristics is known as credit card clustering. This post shows how to group credit card customers using clustering analysis in Python with k-means++. Credit card clustering means grouping credit card holders based on their…

EMPLOYEE QUITTING THEIR JOB PREDICTION

Introduction: EMPLOYEE QUITTING THEIR JOB PREDICTION It’s entirely possible that someone on your team may depart in the near future, given that over four million people left their employment each month during the first quarter of 2022 and that 44% of workers are now seeking new jobs. Additionally, it might not be the…

Heart Disease Prediction

INTRODUCTION Heart disease, often known as cardiovascular disease (CVD), is a general term that refers to a variety of illnesses that have an impact on the heart. Cardiomyopathy, arrhythmia, and problems with the blood vessels such as peripheral or coronary artery disease are a few of them. Inhibition of the heart’s normal function is the primary cause…

Online Payments Fraud Detection Using Machine Learning

INTRODUCTION The introduction of online payment systems has made payments much easier. But, at the same time, it has led to an increase in payment fraud. Online payment fraud can happen to anyone using any payment system, especially while making payments using a credit card. That is why detecting online payment fraud is very important…

Predicting Breast cancer using logistic regression

INTRODUCTION Nowadays, breast cancer is the most frequently diagnosed life-threatening cancer in women and the leading cause of cancer death among women. Breast cancer was responsible for 626,679 out of 9.55 million cancer-related fatalities (or 6.6% of all cancer-related deaths) and 2.08 million out of 18.08 million new cancer cases worldwide in 2018.  Breast cancer…

Analyzing world energy consumption pattern using cluster analysis

INTRODUCTION WORLD ENERGY CONSUMPTION USING CLUSTER ANALYSIS Energy is the primary source that fuels all activities, our industries and every country’s GDP. Without energy the world would come to a standstill. There have been debates revolving around the sources of energy currently being used by the entire world (i.e. renewable and non-renewable energies),…

CUSTOMER CHURN PREDICTION USING LOGISTIC REGRESSION

INTRODUCTION Customer churn analysis and prediction: In this fast-moving era, it is quite a challenge to retain customers. Customer churn is the proportion of customers who stopped using your business’s goods or services over a certain period of time. A low churn rate is essential to maintain in order to maintain…

Customer Churn Analysis using ANN

Introduction What is ANN? An artificial neural network (neural network) is a computational model that mimics the way nerve cells work in the human brain. Artificial neural networks (ANNs) use learning algorithms that can independently adjust – or learn, in a sense – as they receive new input. This makes them a very effective tool…

SMS Spam Classification using Naïve Bayes Classifier

Introduction to Naïve Bayes Naive Bayes methods are a set of supervised learning algorithms based on applying Bayes’ theorem with the “naive” assumption of conditional independence between every pair of features given the value of the class variable. For example, a fruit may be considered to be an apple if it is red, round, and…

Lip Reader using CNN + LSTM

Lip reading is the task of decoding text from the movement of a speaker’s mouth. It plays a crucial role in human communication and speech understanding. Lip reading is a notoriously difficult task for humans, especially without any context. Most lip-reading actuations, besides the lips and sometimes the tongue and teeth, are latent and difficult to disambiguate without context. Hearing-impaired…

Determining Beer Sales

For a beer company, one of the most difficult tasks in serving customers is ensuring that the firm’s products are available on store shelves. According to the market study, when customers discover their favourite brand is constantly absent from the shelves, they are more likely to switch to a competitor’s brand and considering the…

Sentiment Analysis on Tweets about Omicron

Introduction Sentiment analysis is a natural language processing technique that determines the emotional tone of a body of text. This is a common method for businesses to determine and categorize customer views about a product, service, or concept. Sentiment analysis technologies assist enterprises in extracting information from unstructured and disorganized texts. The data is analyzed…

Consumer Sentiment Analysis

INTRODUCTION Sentiment analysis is an approach to natural language processing that identifies the emotional tone behind a body of text. This is a popular way for organizations to determine and categorize opinions about a product, service, or idea. Sentiment analysis systems help organizations gather insights from unorganized and unstructured texts. Rule-based automatic or hybrid methods…

Crypto Prediction by ARIMA

Cryptocurrencies are digital tokens that, in the future, might easily replace traditional currency. The ease with which they may be accessed is one of the reasons for their rapid popularity. These coins are available to almost everyone and are accepted as payment in the same way that traditional currency is. In the future, the blockchain…

HUMAN ACTIVITY RECOGNITION

DESCRIPTION A person performs various activities in regular day-to-day life. It is quite difficult to differentiate between a regular and an anomalous activity. Human activity recognition (HAR) plays an important part in people’s everyday lives. Effective HAR has applications in behavior analysis of elderly persons at home, video surveillance, gesture recognition,…

Credit Card Analytics

Introduction As we delve deeper into the realms of digitization, the transition from paper to plastic money is inevitable. In India, credit cards account for just 6.5% of the total cards issued, and less than 5% of the population holds a credit card. However, in the present times, the rising incomes and aspirations of the Indian millennial population…

Bankruptcy Prediction-five Algorithmic Model Comparison

What & Why of Bankruptcy Prediction? Bankruptcy prediction is a technique for predicting bankruptcy and various financial conditions of state-owned enterprises. It is a broad area of study in finance and accounting. This area’s importance is partly because it allows lenders and investors to evaluate the likelihood of a company going bankrupt. The number of…

Rainfall Prediction using Support Vector Machines

Introduction India is an agrarian economy with 60-70% of its population deriving their livelihood from agriculture. A major chunk of Indian farmers depends on rainfall for carrying out agricultural activities. As rainfall in India is mostly restricted to the monsoon season, it becomes imperative to have a rainfall prediction model so as to help farmers…

Predicting Job-Switching Behaviour by Employees through Random Forest Model

Introduction In a Harvard Business Review article of 2012, Data Scientist was described as the most desired job of the 21st century. Given the buzz and craze surrounding data science jobs, an organization receives many applications for a Data Science position. The company would want to know which of these candidates genuinely…

App Rating Prediction on Google Store Data

Google Play acts as the official Android app store, allowing you to install apps developed with the Android Software Development Kit and published through Google. As Figure 1 and Figure 2 show, Google Play has gained popularity in recent years, with the most diverse apps available in the world’s leading app stores. The situation at…

Data Analytics in Formula 1

Sports are one of the few things which are enjoyed by people of all ages. People passionately supporting their teams and wanting their teams to succeed has always been a constant; it is a ride full of emotions. The teams too, to deliver on the expectations of their fans, always try to enhance their…

Predicting Life Expectancy through Random Effect Model

What is life expectancy? Life expectancy is the average number of years a person is expected to live. Several factors affect life expectancy, such as socioeconomic status (income, education and economic well-being), the quality of health care available in the country, alcohol consumption, poor nutrition, lack of exercise etc. The importance of…

RFM ANALYSIS

In today’s dynamic digital era, where there is an inherent need for organizations to increase their digital presence, companies get a lot of information about their customers. With Customer-centricity becoming an integral part of mission statements for most corporations, this data becomes a holy grail for marketers to address each customer most effectively, be in…

Automation of Mechanical pressure-controlled ventilator

What do doctors do when a patient has trouble breathing? Doctors use a ventilator to pump oxygen into a sedated patient’s lungs via a tube in the windpipe. But it’s easier said than done. Every patient has a unique requirement for a breathing support system tailored according to their needs. Resetting a ventilator is a…

Industry Applications of Churn Analysis

Ever wondered where Pokémon Go stands in the current market scenario?

Or how has Netflix managed to establish its stronghold in the market despite exorbitant prices?

In today’s blog, we will take a look at Industry Applications of Churn Analysis.

Read how this simple yet effective technique helps companies to retain their customers!

Predicting customer churn the right way!

In business analytics, we cover a wide range of topics that can essentially be classified under the data-preprocessing techniques.

Popular instances where the application has become of utmost importance include fraud detection, customer relationship management, and customer churn prediction.

In this blog, we will take a look at why predicting customer churn is important…

Stock Screening using web scraping and Power BI

In today’s world there is a lot of information available regarding stock markets. But the big question is how to use the relevant information and do well in the market. A stock screener is a tool which presents all this information in a simple and understandable format which makes the job easy…

Book Recommender system

INTRODUCTION Have you ever wondered how e-commerce sites such as Amazon, Flipkart etc. suggest items to you based on your purchases or on items frequently bought with your product, or how Netflix and other OTT platforms suggest shows based on your watchlist? A lot of companies nowadays are using such recommender systems to not only increase their…

Generative Adversarial Networks

Introduction The algorithms we have looked at so far, ranging from Linear Regression to Random Forests and Neural Networks are amazing in their predictive and classification abilities and have helped individuals and organizations develop products and services used by millions of people on a daily basis. But most of them are only capable of predicting…

NEXT WORD PREDICTOR USING LSTM

INTRODUCTION Have you ever wondered how the keyboard apps in WhatsApp, Google, Facebook, Instagram etc. predict the next word after we start typing? It would be great if we were able to predict the next word, as it would save us a lot of typing time. In this…

Has COVID-19 dragged world employment down: Looking towards the Future

According to the World Employment and Social Outlook, unemployment was projected to increase by around 2.5 million in 2020. Global unemployment has been roughly stable for the last nine years, but the pandemic has drastically increased global unemployment and fewer jobs are being generated in the formal sector. So, we collected…

RNN vs TRANSFORMERS

The ultimate showdown between RNN & Transformers. RNN models: GRU, LSTM, Bi-LSTM. Transformers: BERT, XLM-R, GPT-2, T5.

How Do RNN & LSTM Work?

This blog is for those who want to understand in depth the functioning of RNN and LSTM, among the most complicated and hard-to-understand neural networks of all time.

CNN IMPLEMENTATION (Part -2)

In this blog we will be discussing different CNN architectures and their applications. We will also build a Convolutional Neural Network from scratch.

Convolutional Neural Network (Part-1)

The most comprehensive blog on Convolutional Neural Networks (CNNs) that you’ll find. We will be dissecting and analyzing every layer and function of the CNN, along with the mathematics, to gain a truly scrupulous insight.

ARTIFICIAL NEURAL NETWORKS IN PYTHON

In the previous post, we looked at the concepts of neural networks. Let us now consider an example to understand the working of neural networks. For this, we have considered a Churn dataset.

Artificial Neural Network- PART 1

Recently there has been a great buzz around the words neural network in the field of computer science and it has attracted a great deal of attention from many people. But what is this all about, how do they work, and are these things really beneficial? Essentially, neural networks are composed of layers of computational…

XGBOOST IMPLEMENTATION

Introduction XGBoost, or extreme gradient boosting, is a supervised learning algorithm that uses gradient boosting, a decision tree-based ensemble machine learning technique. XGBoost is mainly known for its speed and performance, is very popular among the Kaggle community, and is widely used in many competitions. Algorithms on which XGBoost is based:…

A Deep Dive Into Clustering

Introduction Unsupervised and supervised learning are two main strategies of machine learning. But these techniques are used in different ways and with different datasets. Supervised learning is a method in which the model is trained with labelled data. In this model, one needs to find the relation between the independent and dependent variables. Classification…

EXPLORING NATURAL LANGUAGE PROCESSING WITH NLTK

Introduction Natural Language Processing (NLP) is a field for analysis and generation of human languages. The language we humans use is highly context sensitive and can even be difficult to discern for humans because of ambiguities, so it’s a given that computers would have a hard time with them too. NLP is an interdisciplinary domain…

IMPLEMENTING DECISION TREE AND RANDOM FOREST IN PYTHON

The decision tree is one of the most important models, as it lays out a concept that is used in other machine learning models like Random Forest, XGBoost, bagging & boosting etc., which all together come under the ensemble methods. It’s a tree-shaped model consisting of root nodes, branches and internal & leaf nodes which…

Explaining k-NN, Naïve Bayes and SVM from Scratch

k-NN, Naïve Bayes, and SVM are Machine Learning algorithms that can be implemented on datasets without much hassle. However, truly understanding them can at times be tough and easier said than done. Do you want to learn about these algorithms and develop your understanding of them? Check…

TYPES OF REGRESSION AND THEIR USES

We need different types of regression analysis to tackle different problems, as the basic linear model might not always be significant.

Each model has its own set of assumptions which makes it more suitable to use in certain scenarios.

Curious to know more about it?

Check out our in-depth analysis of this algorithm.

Trading Strategies

The hunt for the right trading strategy almost never seems to end. With that in mind, we have made 4 trading strategies and 2 prediction models. The trading strategies, along with the techniques that we have used, are: Momentum – by modelling the returns on 5 stocks selected randomly; Weighted Volatility – by giving weight to volatility…

Varying Sentiments during Pandemic

Almost every company takes steps to understand how a looming crisis affects market sentiment. Thus, to replicate and understand the steps a company takes to track the sentiment surrounding it over a long period, we tried to conduct a sentiment analysis on our own.

FIFA 20 – Part ii: One for the money

We are back with the second part of our blog on FIFA game analysis. Here, we will take up some intriguing questions for FIFA fans: How do you select a team under a given budget? Can we predict the overall performance of an individual player for a particular season? What are the different parameters…

FIFA 20: For the Game. For the World.

Ever wondered why Bayern Munich performed so well last season? Are the overall ratings and potential of footballers justified? What parameters does EA Sports consider when finding these ratings? We got you covered. We analyzed the FIFA 20 data and found how various parameters affect the performance of the players.

Using ML to Find “REAL” Job Postings on Job Portals

We started with this noble cause to help you in your job search and prevent people from getting scammed in the name of jobs. COVID-19 is also not helping the job search, and if our models come in handy to anyone, then we will be very satisfied with our work. Let’s…

Market Basket Analysis in Management Research (using R)

“MARKET BASKET ANALYSIS” IN THE BUSINESS INTELLIGENCE ENVIRONMENT HELPS RETAILERS BETTER UNDERSTAND AND EVENTUALLY SUPPORT THEIR CUSTOMERS BY ANTICIPATING THEIR PURCHASING HABITS. IN THIS BLOG POST WE WILL CLARIFY HOW THE ANALYSIS OF MARKET BASKETS WORKS AND WHAT IT TAKES TO DEPLOY A PROJECT FOR MARKET ANALYSIS. Introduction Market basket analysis (MBA) is a collection…

Leveraging online unstructured data to provide curated map-based COVID solutions for businesses and individuals

INTRODUCTION Before we start, have a look at a version of the maps we made on Google Maps: https://www.google.com/maps/d/edit?mid=1EyvHRefIY54pPfPjiZhReZMRMZtj0CMY&usp=sharing As the country starts on a revival path amidst the pandemic, it can certainly be stated that no such event before had caused an uproar or indecision as the COVID-19 pandemic. The imprints of the pandemic…

Developing Random Forest Classification in R

Among supervised machine learning algorithms, Random Forest stands apart as it is arguably the most powerful classification model. When Microsoft developed their Xbox game, which enables you to play using the movement of your body, they used Random Forest over any other machine learning algorithm and over ANN (Artificial Neural Networks) as well!…

Developing Machine Learning Model using SVM in R to solve A Business Problem

Support Vector Machine (SVM) is a supervised machine learning algorithm which can be used for both classification and regression challenges. However, it is mostly used in classification problems.

SVMs have been used successfully in many real-world problems, such as text (and hypertext) categorization, image classification, bioinformatics (protein classification and cancer classification), handwritten character recognition, etc.

Naïve Bayes Classifier from Scratch with Hands-on Examples in R

Naïve Bayes classifiers are a family of simple “probabilistic classifiers” based on applying Bayes’ theorem with strong independence assumptions between the features. In this post you will learn about: what Bayes’ theorem is; the Naïve Bayes classifier; why the algorithm is called Naïve Bayes; advantages and applications of using Naïve Bayes to classify data and its…

C4.5 in detail and comparative analysis of decision tree algorithms

Previously, we have talked about 2 of the Decision Tree Algorithms:
1. Gini Index (while implementing CART – Classification)
2. ID3 – Iterative Dichotomiser 3 (while implementing CART – Regression)
There are a few more algorithms that you need to be aware of. In this blog, we are going to learn about those algorithms and…

Logistic Regression From scratch with a hands-on Example in R

Introduction Researchers are often interested in setting up a model to analyse the relationship between some predictors (i.e., independent variables) and a response (i.e., dependent variable). Linear regression is commonly used when the response variable is continuous. One assumption of linear models is that the residual errors follow a normal distribution. This assumption fails when…

CART – Regression Tree from scratch with a hands-on example(in R)

Decision trees are made with the objective of creating a model that predicts the value of a target or dependent variable based on the values of several input or independent variables. The CART algorithm is structured as a sequence of questions, the answers to which determine what the next question, if any, should be. The…

Classification and regression tree

This blog aims to explain the Classification part of the CART algorithm in as much detail as possible. Give it a read to gain hands-on experience with real-life data.

Linear Regression for Cross-Sectional Data from Scratch with Hands-on example (in R)

Linear Regression is one of the first algorithms that everyone learns in Machine Learning, Statistics, Financial Econometrics and Data Science. In this post you will go through: types of datasets in Machine Learning; assumptions that go into Linear Regression; understanding the statistical (or econometric) route of data preparation, model fitting and diagnostics that goes…

Text Classification in R

Text classification is the task of assigning a set of predefined categories to free text. Businesses are turning to text classification for structuring text in a fast and cost-efficient way to enhance decision-making and automate processes. Some examples of text classification are: sentiment analysis, detection of spam and non-spam emails, auto-tagging of customer queries, and…

Basics of Statistics and Hypothesis Testing

Statistician Valen Johnson recently published an earth-shaking proceeding which asserts that 17 to 25% of published scientific results may simply be wrong. Statistics is the base of all analytics, and it is essential for an analyst to know the basics of statistics and hypothesis testing. Hypothesis testing is generally used in research where one tries to answer research…