Big Data Analytics, Business Intelligence, Data Science and Machine Learning

Introduction

Compared with earlier eras, most businesses today serve a significantly larger number of clients. This growth can be attributed to novel business models that did not previously exist and that have allowed the businesses using them to reach more clients. While a larger customer base is an advantage, it also brings challenges. First, an institution serving more clients needs to strengthen its security strategies to ensure that client confidentiality is maintained. Secondly, the volume of customer data that management has to handle increases sharply (Grace, 2018). Most businesses rely on client data to derive insights that further their business objectives. For such insights to be obtained, however, businesses need data processing techniques that can effectively handle large volumes of data collected at a fast rate. Traditional tools such as conventional databases and Excel have proven ineffective for this purpose. As a result, novel approaches such as big data analytics, business intelligence, data science and machine learning have been proposed to address this shortcoming.

Machine learning became popular towards the latter half of the 20th century as advances in computing technology became widely available. Since then, machine learning has been incorporated into almost all fields, ranging from medicine and other fundamental sciences to economics, finance and the social sciences. Its success can largely be attributed to the availability of rich sources of data to which machine learning models can be applied (Grace, 2018). The effectiveness of machine learning relies significantly on the availability of adequate volumes of relevant data. Where such data is lacking, traditional programming and other forms of analytics are usually sufficient for the task at hand.

This discussion intends to show how machine learning can be used to predict customer churn in the telecommunications industry. It is expected to shed light on how effective machine learning techniques can be in helping organizations make sound decisions that ultimately benefit their economic progress. The business problem investigated in this report concerns an American telecommunications company that intends to increase its productivity through two primary means. First, the company intends to increase the number of customers it retains, which should subsequently lead to increased profits. Secondly, the company intends to employ machine learning models to predict the clients that are most likely to change service providers. The data used in this study was obtained from Kaggle; more information about it is provided in the descriptive data analysis section. The paper is structured as follows: this introduction is followed by a literature review, a descriptive data analysis, a description of the models used, a discussion of the focused model and, finally, a conclusion.

Literature Review

According to IBM Watson, machine learning has shown promising potential to significantly change how organizations optimize their business models. The organization has developed a rigorous Business Analytics and Optimization approach as its primary means of leveraging technology to support business optimization as well as routine operational decision-making. This approach stems from the organization’s acknowledgement that changes in the marketing sphere have shifted how data is handled and leveraged in modern-day businesses. IBM Watson, like many other leading technology providers, also describes the significance of descriptive analytics of the data present in an organization (Apte, 2010). Through such analytics, management can obtain insights into key trends in the organization and answer fundamental questions, such as which crucial events occurred in the organization’s operations and how frequently they occurred. Descriptive analytics provides further benefits in that it allows individuals to anticipate future events, such as occurrences that may negatively impact the organization. With such foresight, an organization is well positioned to apply proper mitigation strategies to avoid undesirable eventualities. These extended functionalities are the main building blocks of predictive analysis (Apte, 2010). The main purpose of the article is to provide a clear road map of how machine learning techniques can be used to expedite the normal operating procedures of any given organization.

Understanding the fundamental concepts of business intelligence and analytics is equally important to this study. Chen et al. (2012) studied business intelligence and analytics by tracing the evolution from big data to the big impact brought about by the new approach. According to the publication, business intelligence and analytics can be traced back to the field of database management. Business intelligence relies on collecting data, extracting significant features from it and applying analysis technologies to obtain insights and make predictions. In principle, these fundamental components agree with IBM Watson’s view of the importance of machine learning technologies in business models. The authors note that recent developments in database management have given rise to studies in promising fields such as web analytics, text analytics, mobile analytics, network analytics and big data analytics in general. They further point out that business intelligence and big data analytics have been extensively leveraged by key players in e-commerce. Using examples such as Amazon, Google and Facebook, the paper shows how advances in fields such as web analytics can be put to use by different business institutions to improve productivity. The improved performance of businesses that rely on novel forms of analytics stems from the fact that these approaches provide organizations with a richer source of customer information than was previously available. Organizations can therefore adjust their operations to meet the expectations of their clients, a feat that was challenging in the past (Chen et al., 2012). This study plays a major role in this paper in that it provides evidence of how business intelligence and analytics, implemented through machine learning models, can be used to improve the competitiveness of institutions in their different markets.

To narrow the focus of this work, an article by Renjith was analyzed to investigate existing studies that have used machine learning to reduce customer churn. Renjith compared the results of using the support vector machine and a hybrid method to detect the likelihood of customer churn in a company. The author stated that churn detection can be achieved proactively by identifying the risk score of every individual client and then applying various predictive analysis techniques to their data (Renjith, 2017). The article proposed either statistical models or machine learning approaches. Renjith identified several key factors that affect the likelihood of successfully detecting client churn. These include customer reviews of the company under study, the demographic characteristics of the clients, consumption metrics and many more. As a statistical approach, the author proposed logistic regression as a means of obtaining insights from the data given the existence of such numerous variables. However, statistical models have been found to perform poorly where significant volumes of data are involved. Therefore, machine learning methodologies such as the support vector machine are employed. This approach separates the provided data into different classes using a set of decision planes. Learning occurs when the model is fed individual elements from the training set belonging to either of the classes created by the separating hyper-planes. Classification is then achieved when the trained model can assign new elements to their respective classes (Renjith, 2017). Effective classification, as in any other machine learning approach, relies on optimization. For the support vector machine, an optimal hyper-plane is defined to serve as the decision boundary.
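
The role of the hyper-plane as a decision boundary can be illustrated with a minimal sketch. The weights, bias and customer features below are hypothetical values chosen for illustration; they are not taken from Renjith (2017) or from the data in this study, and a real support vector machine would learn the weights by maximizing the margin between the two classes.

```python
import math

def decision_value(x, w, b):
    """Signed score w.x + b; its sign tells which side of the hyper-plane x lies on."""
    return sum(wi * xi for wi, xi in zip(w, x)) + b

def classify(x, w, b):
    """Assign the class according to the side of the hyper-plane."""
    return "churn" if decision_value(x, w, b) >= 0 else "no-churn"

def distance_to_hyperplane(x, w, b):
    """Geometric distance from x to the hyper-plane w.x + b = 0."""
    return abs(decision_value(x, w, b)) / math.sqrt(sum(wi * wi for wi in w))

# Hypothetical hyper-plane and two hypothetical customers (assumed values).
w, b = [3.0, 4.0], -5.0
customer_a = [2.0, 1.0]
customer_b = [0.0, 0.0]

label_a = classify(customer_a, w, b)
label_b = classify(customer_b, w, b)
dist_a = distance_to_hyperplane(customer_a, w, b)
```

The distance to the hyper-plane can be read as a rough confidence measure: points far from the boundary are classified more decisively than points close to it.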

Descriptive Data Analysis

The data used in this exercise was obtained from Kaggle. Originally, the data contained close to 100,000 observations. Through data cleaning and pre-processing, these were reduced to 66,000. The pre-processing removed missing values and outliers, which would otherwise affect the models during training. Outliers in particular can significantly distort the parameters a model learns, thus reducing its accuracy.
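
The cleaning step described above can be sketched as follows. The essay does not state which outlier rule was applied, so the interquartile-range (IQR) rule used here is an assumption, and the column name `monthly_minutes` and the sample rows are hypothetical.

```python
from statistics import quantiles

def clean_rows(rows, numeric_key, k=1.5):
    """Drop rows with missing values, then drop IQR outliers on one numeric column."""
    # Step 1: remove any row that contains a missing (None) value.
    complete = [r for r in rows if all(v is not None for v in r.values())]
    if len(complete) < 4:
        return complete  # too few rows to estimate quartiles reliably
    # Step 2: keep only values inside [Q1 - k*IQR, Q3 + k*IQR] (assumed rule).
    values = [r[numeric_key] for r in complete]
    q1, _, q3 = quantiles(values, n=4)
    iqr = q3 - q1
    low, high = q1 - k * iqr, q3 + k * iqr
    return [r for r in complete if low <= r[numeric_key] <= high]

# Hypothetical sample: one missing value and one extreme value.
rows = [
    {"monthly_minutes": 290, "churn": 1},
    {"monthly_minutes": 295, "churn": 1},
    {"monthly_minutes": 298, "churn": 0},
    {"monthly_minutes": 300, "churn": 1},
    {"monthly_minutes": None, "churn": 0},
    {"monthly_minutes": 302, "churn": 0},
    {"monthly_minutes": 305, "churn": 0},
    {"monthly_minutes": 310, "churn": 0},
    {"monthly_minutes": 9000, "churn": 0},
]
cleaned = clean_rows(rows, "monthly_minutes")
```

In this sample, the row with the missing value and the row with the extreme value are both removed, which mirrors on a small scale the reduction in observations described above.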

The data was also found to have 100 variable columns, more than the number commonly used in machine learning problems of this kind. Through data cleaning and further feature engineering, these variables were reduced to 35. The reduction is also motivated by the curse of dimensionality, where a model tends to perform poorly as a result of being fed data containing too many features.

As a means of evaluating the performance of the models used in this study, various metrics were used. To begin with, the confusion matrix was used to find the number of false positives, false negatives, true positives and true negatives (Visa et al., 2011). True positives are the positive instances that were correctly identified as positive. In this case, the churn instances that were predicted to be churn are the true positives. True negatives, on the other hand, are the no-churn instances that were correctly predicted as no-churn. False positives are the instances where the classifier predicted churn when the correct classification was no-churn. Finally, false negatives are the instances where the classifier returned no-churn when it should have returned churn.
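
The four confusion-matrix counts defined above can be computed with a short sketch. The label lists are hypothetical, with 1 standing for churn and 0 for no-churn; they are not results from this study.

```python
def confusion_counts(y_true, y_pred, positive=1):
    """Count true/false positives and negatives for a binary classifier."""
    tp = fp = tn = fn = 0
    for actual, predicted in zip(y_true, y_pred):
        if predicted == positive and actual == positive:
            tp += 1  # predicted churn, actually churn
        elif predicted == positive:
            fp += 1  # predicted churn, actually no-churn
        elif actual == positive:
            fn += 1  # predicted no-churn, actually churn
        else:
            tn += 1  # predicted no-churn, actually no-churn
    return {"tp": tp, "fp": fp, "tn": tn, "fn": fn}

# Hypothetical labels: 1 = churn, 0 = no-churn.
y_true = [1, 0, 1, 1, 0, 0, 1, 0]
y_pred = [1, 0, 0, 1, 1, 0, 1, 0]
counts = confusion_counts(y_true, y_pred)
accuracy = (counts["tp"] + counts["tn"]) / len(y_true)
```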

The root mean square error (RMSE) was also used to evaluate the performance of the model. The RMSE is the square root of the mean squared prediction error and can be read as the standard deviation of the prediction errors made by the model (Willmott and Matsuura, 2005). Commonly used for regression problems, the measure captures how far the actual values lie from the values predicted by the model.
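
A minimal sketch of the RMSE calculation is given below. It assumes that the metric is applied to hard 0/1 predictions (1 for churn, 0 for no-churn), which the essay does not state explicitly; the label lists are hypothetical.

```python
import math

def rmse(y_true, y_pred):
    """Square root of the mean squared difference between actual and predicted values."""
    squared = [(a - p) ** 2 for a, p in zip(y_true, y_pred)]
    return math.sqrt(sum(squared) / len(squared))

# Hypothetical labels: 1 = churn, 0 = no-churn.
y_true = [1, 0, 1, 1, 0, 0, 1, 0]
y_pred = [1, 0, 0, 1, 1, 0, 1, 0]

score = rmse(y_true, y_pred)
# With 0/1 labels every mistake contributes a squared error of exactly 1,
# so the squared RMSE equals the misclassification rate.
error_rate = sum(a != p for a, p in zip(y_true, y_pred)) / len(y_true)
```

Under that assumption, squaring an RMSE gives the share of misclassified clients. The values of 0.650 and 0.6760 reported later in this essay would then correspond to misclassification rates of roughly 0.42 and 0.46 respectively. If the RMSE was instead computed on predicted probabilities, this reading does not apply.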

Models Used

For this study, two models were used. These were the logistic regression classifier and the decision tree classifier. Each of these is explained in detail as follows.

The logistic regression model is a statistical model based on the logistic function. In most settings, it is used to model a binary dependent variable, which explains why it is one of the two models investigated in this study. Although this is the reason the model is so common, it can also be extended to serve other purposes. In regression analysis, for instance, logistic regression is used to estimate the parameters of a logistic model. Binary models have a fixed set of outputs. For the case study, these outputs were either churn or no-churn. The logistic regression classifier works by computing the probability that a given element falls into a given class. This probability is obtained by modelling the log-odds of the class as a linear combination of the independent variables in the data and passing the result through the logistic function.
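
The link between log-odds and class probability can be sketched as follows. The weights, bias and feature values are hypothetical and are not the coefficients fitted in this study.

```python
import math

def sigmoid(z):
    """Logistic function: maps any real number to a value between 0 and 1."""
    return 1.0 / (1.0 + math.exp(-z))

def churn_probability(features, weights, bias):
    """Probability of churn from a linear combination of the input features."""
    log_odds = bias + sum(w * x for w, x in zip(weights, features))
    return sigmoid(log_odds)

# Hypothetical coefficients and one hypothetical customer (assumed values).
weights, bias = [0.8, -0.5], -0.2
customer = [1.0, 2.0]

p = churn_probability(customer, weights, bias)
label = "churn" if p >= 0.5 else "no-churn"
# Applying the logit to the probability recovers the linear score (the log-odds).
recovered_log_odds = math.log(p / (1.0 - p))
```

A threshold of 0.5 on the probability is the usual default for turning the probability into a churn or no-churn decision; other thresholds can be chosen when the costs of the two kinds of error differ.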

In binary classification, the logistic function is also known as the sigmoid function; it maps any real-valued input to a value between 0 and 1, which is then used to assign an element to one of the two categories. The implementation of a logistic regression model can be tuned in various ways. For instance, there are different solvers that can be chosen; the liblinear solver was chosen for this exercise. The model also provides a class_weight option that allows the user to specify how the classes being investigated should be weighted. The balanced class weight was selected; because this exercise had an equal distribution of churn and no-churn clients, this setting weights the two classes equally. Finally, a random state of 42 was set so that the stochastic parts of the fitting procedure are reproducible. The logistic regression classifier used obtained a root mean square error of 0.650.
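
The effect of a balanced class weight can be illustrated with a small sketch. It assumes the weighting rule documented for scikit-learn's "balanced" mode, namely n_samples / (n_classes * count of each class); the essay does not name the library used, so this is an assumption, and the label lists are hypothetical.

```python
from collections import Counter

def balanced_class_weights(labels):
    """Weight each class inversely to its frequency: n_samples / (n_classes * count)."""
    counts = Counter(labels)
    n_samples, n_classes = len(labels), len(counts)
    return {cls: n_samples / (n_classes * count) for cls, count in counts.items()}

# Equal numbers of churn (1) and no-churn (0) clients, as reported in this exercise.
even = balanced_class_weights([0, 1, 0, 1])
# A skewed sample, shown only for contrast.
skewed = balanced_class_weights([0, 0, 0, 1])
```

With an even split, both classes receive a weight of 1.0, so the balanced setting leaves the fit unchanged. It only alters the fit when one class is rarer than the other, in which case the rarer class is given the larger weight.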

Decision Tree Classifier

The decision tree classifier was the second approach used in this exercise. Like other models, the classifier provides predictions based on provided inputs. The model can be described as a classification network with a tree structure. Every internal node in the structure tests an input variable, and arcs lead from each node to subsequent features further down the tree. A notable characteristic of the decision tree classifier is that every leaf of the tree has an associated probability distribution over the classes into which the data has to be categorized. Decision trees can be used for various tasks (Jain et al., 2018), including binary classification and multi-class classification.
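
How a single internal node chooses its test can be sketched with the Gini index, one of the splitting criteria discussed by Jain et al. (2018). The essay does not state which criterion was used in this study, so the choice of Gini here is an assumption, and the feature name `support_calls` and its values are hypothetical.

```python
def gini(labels):
    """Gini impurity of a list of 0/1 labels (0.0 means the node is pure)."""
    n = len(labels)
    if n == 0:
        return 0.0
    p = sum(labels) / n  # fraction of churn (1) labels
    return 1.0 - p ** 2 - (1.0 - p) ** 2

def best_threshold(values, labels):
    """Find the threshold t for the test `value <= t` with the lowest weighted impurity."""
    best = (None, float("inf"))
    # Candidate thresholds: every distinct value except the largest one.
    for t in sorted(set(values))[:-1]:
        left = [y for x, y in zip(values, labels) if x <= t]
        right = [y for x, y in zip(values, labels) if x > t]
        score = (len(left) * gini(left) + len(right) * gini(right)) / len(labels)
        if score < best[1]:
            best = (t, score)
    return best

# Hypothetical feature and labels: 1 = churn, 0 = no-churn.
support_calls = [0, 1, 1, 2, 4, 5, 6]
churned = [0, 0, 0, 0, 1, 1, 1]
threshold, impurity = best_threshold(support_calls, churned)
```

A full decision tree repeats this search over every feature at every node and stops when a node is pure or a depth limit is reached.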

The choice to use a decision tree classifier can be justified for various reasons. Fundamentally, the decision tree classifier is known to be effective because it is highly flexible to changes in the features of the input data. As a result, these classifiers have been found in past studies to perform well across various supervised classification tasks. The decision tree classifier also does not rely on any distributional assumption about the input data. The flexibility and adaptability of the model stem from this feature.

Other characteristics that distinguish the decision tree classifier are that it is non-parametric, that it can handle non-linear relations between features and classes effectively, and that the tree structure of its decision-making process is easy to interpret. The decision tree model implemented in this study was found to have a root mean square error of 0.6760. This was slightly higher than the 0.650 obtained using the logistic regression model.

Focused Model Used

Following the two results above, the logistic regression model was selected as the focus model for the case study. This decision was based on the difference in the performance of the two models: the logistic regression model obtained the lower root mean square error. Although the difference was small, it suggests that the two models could diverge further on data containing more features. Decision tree classifiers tend to have their performance limited by the number of nodes and the depth of the trees used in the classification problem. Logistic regression, on the other hand, tends to perform well in binary classification. Among other reasons, this can be attributed to the fact that the logistic regression model can be viewed as a special case of a neural network, namely one with no hidden layer (Dreiseitl and Ohno-Machado, 2002). Neural networks have been found to perform very well in predictive analysis, depending on the architecture used to design them. Their success can be seen in deep learning architectures commonly used in image recognition and other artificial intelligence applications that have been widely incorporated into other fields. Another reason logistic regression was preferred is that the decision tree classifier was considered to work better in multi-class classification than in binary classification. As stated previously, the logistic regression classifier relies on probabilistic methods to decide which of two categories a new entity belongs to. This sigmoid-based approach, the two-class case of the softmax function, makes it well suited to binary classification.
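
The relationship between the softmax function and the logistic (sigmoid) function in the two-class case can be checked numerically. The score below is an arbitrary illustrative value, not an output of the model in this study.

```python
import math

def sigmoid(z):
    """Logistic function used by binary logistic regression."""
    return 1.0 / (1.0 + math.exp(-z))

def softmax(scores):
    """Softmax over a list of class scores (shifted by the maximum for stability)."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

# With two classes and the second score fixed at 0, softmax reduces to the sigmoid.
z = 1.3  # arbitrary illustrative score for the churn class
p_sigmoid = sigmoid(z)
p_softmax = softmax([z, 0.0])[0]
```

The two probabilities agree, which is why a binary logistic regression can be described either in terms of the sigmoid or as a softmax over two classes.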

Conclusion

In conclusion, this study has investigated how machine learning methodologies can be employed in business analytics to gain insights from data. Using data from a telecommunications company, two machine learning models were developed and their results analyzed. The models were developed after sufficient pre-processing had been carried out on the data, as is customary in any statistical process. The study found that, on this data set, the logistic regression model performed slightly better than the decision tree classifier at binary classification. This exercise therefore provides a foundation from which more knowledge of how machine learning can be applied in business processes can be obtained. As stated in the literature review, machine learning has been used extensively by top-tier e-commerce companies to improve their performance. Building on the fundamentals explored in this exercise, it is anticipated that more complex algorithms such as deep learning techniques can be employed to come up with even more effective models for business intelligence and analytics.

References

Apte, C. (2010). The role of machine learning in business optimization. In Proceedings of the 27th International Conference on Machine Learning (ICML-10) (pp. 1-2).

Chen, H., Chiang, R. H., & Storey, V. C. (2012). Business intelligence and analytics: From big data to big impact. MIS Quarterly, 1165-1188.

Dreiseitl, S., & Ohno-Machado, L. (2002). Logistic regression and artificial neural network classification models: a methodology review. Journal of Biomedical Informatics, 35(5-6), 352-359.

Grace, R. (2018). The Impact of Artificial Intelligence and Machine Learning in Business.

Hajmeer, M., & Basheer, I. (2003). Comparison of logistic regression and neural network-based classifiers for bacterial growth. Food Microbiology, 20(1), 43-55.

Jain, V., Phophalia, A., & Bhatt, J. S. (2018, October). Investigation of joint splitting criteria for decision tree classifier use of information gain and Gini index. In TENCON 2018 - 2018 IEEE Region 10 Conference (pp. 2187-2192). IEEE.

Renjith, S. (2017). B2C E-Commerce Customer Churn Management: Churn Detection using Support Vector Machine and Personalized Retention using Hybrid Recommendations. International Journal on Future Revolution in Computer Science & Communication Engineering (IJFRCSCE), 3(11), 34-39.

Visa, S., Ramsay, B., Ralescu, A. L., & Van Der Knaap, E. (2011). Confusion Matrix-based Feature Selection. MAICS, 710, 120-127.

Willmott, C. J., & Matsuura, K. (2005). Advantages of the mean absolute error (MAE) over the root mean square error (RMSE) in assessing average model performance. Climate Research, 30(1), 79-82.
