Data Analysis and Business Intelligence

 

 

 

Name

University

Date

 

 

 

 

 

 

 

 

Statistics deals with gathering and analyzing numerical data to understand the characteristics of the population from which the data were obtained. There are two main types of data. Qualitative data are non-numerical and describe attributes of a population, such as color. Quantitative data are numerical and are classified as either discrete or continuous. Discrete data take countable values such as 1, 2, 3, 4, … (Hawkins & Panel, 1985); for example, the number of students who attended school today. Continuous data can take any value within a range, such as 1.23, 1.435, or 1.245; for example, physical measurements such as mass, volume, and length. Two common terms in statistics are population and sample. A population is the complete set of elements under study, and a sample is a portion selected from that population. Statistics uses four levels of measurement: nominal, ordinal, interval, and ratio (Popović, 2014).

 

Frequency is the number of times a value or an event occurs, typically denoted ‘f’. A frequency distribution table lists items alongside their respective frequencies. Such a table can be grouped or ungrouped. In a grouped table the items are categorized into disjoint class intervals, while an ungrouped table lists each individual value. In either case, the items are usually arranged in ascending or descending order (Wang, 2015).
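The difference between the two kinds of table can be sketched in a few lines of Python. The marks below are hypothetical values chosen for illustration, and the class width of 10 is an assumption; this is a minimal sketch, not a prescribed method.

```python
from collections import Counter

# Hypothetical test marks (illustrative only, not taken from the essay's tables).
marks = [12, 25, 25, 31, 38, 38, 38, 44, 47, 52]

# Ungrouped frequency table: each distinct value with its frequency f,
# listed in ascending order.
ungrouped = sorted(Counter(marks).items())

# Grouped frequency table: disjoint class intervals of width 10 (0-9, 10-19, ...),
# keyed by the lower limit of each class.
width = 10
grouped = Counter((m // width) * width for m in marks)

print("Value  f")
for value, f in ungrouped:
    print(f"{value:<6} {f}")

print("\nClass  f")
for lower in sorted(grouped):
    label = f"{lower}-{lower + width - 1}"
    print(f"{label:<6} {grouped[lower]}")

# In both tables the frequencies add up to the number of observations.
total = sum(grouped.values())
```

Summing the frequency column is a quick consistency check on any frequency table, since the total must equal the number of observations.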

 

The following grouped frequency distribution table shows the marks scored in a test and their frequencies.

Marks obtained in a test    No. of students (f)
0-9                         0
10-19                       1
20-29                       3
30-39                       10
40-49                       15
50-59                       4
60-69                       2
70-79                       1
Total                       36

 

 

The following frequency table shows family planning methods, which are a categorical variable.

Family planning method    Number (f)
Abstinence                10
Condoms                   15
Pills                     20
Injectable                5
Total                     50

 

 

Quantitative data are categorized as either discrete or continuous. A discrete variable takes countable, whole-number values such as 1, 2, 3; for example, the number of cars, cows, or people. A continuous variable can take any value within a range and represents measurements such as height, mass, or weight; for example, 3.56 kg or 2.78 m. Measures of dispersion show how scattered a data set is and describe the variation in its distribution, that is, how heterogeneous or homogeneous the values are. Data sets are commonly summarized with measures of central tendency (mean, median, and mode) and measures of dispersion (range, variance, and standard deviation). The mean is the sum of the values divided by the number of values (Wang, 2015). The median is the middle value of the ordered data, and the mode is the most frequent value. The range is the difference between the two extremes of the data set, obtained by subtracting the minimum value from the maximum value. The variance is the mean of the squared deviations from the mean. The standard deviation is the positive square root of the variance, usually denoted by sigma, σ.
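These summary measures can be computed with Python's standard `statistics` module. The data set below is hypothetical, and the population forms of the variance and standard deviation (`pvariance`, `pstdev`) are used on the assumption that the data are a whole population rather than a sample.

```python
import statistics

# Hypothetical data set (illustrative only).
data = [2, 4, 4, 4, 5, 5, 7, 9]

# Measures of central tendency.
mean = statistics.mean(data)      # sum of values / number of values
median = statistics.median(data)  # middle value of the ordered data
mode = statistics.mode(data)      # most frequent value

# Measures of dispersion.
data_range = max(data) - min(data)     # maximum minus minimum
variance = statistics.pvariance(data)  # mean of squared deviations from the mean
std_dev = statistics.pstdev(data)      # positive square root of the variance

print(f"mean={mean}, median={median}, mode={mode}")
print(f"range={data_range}, variance={variance}, standard deviation={std_dev}")
```

For sample data, `statistics.variance` and `statistics.stdev` divide by n - 1 instead of n.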

 

 

A dot plot is used to display data. Each observation is shown as a dot above a horizontal number line at its actual value. When observations are identical or very close, the dots are stacked on top of one another (Kamath, 2009).

Dot plot
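A dot plot can be approximated in plain text. The sketch below assumes small, single-digit integer observations so that the columns line up; the data and the helper name `dot_plot` are hypothetical.

```python
from collections import Counter

# Hypothetical observations (illustrative only); single-digit integers are
# assumed so that each value occupies one character on the number line.
observations = [1, 2, 2, 3, 3, 3, 4, 6]

def dot_plot(values):
    """Return a text dot plot: one column per value, one 'o' per observation."""
    counts = Counter(values)
    axis = range(min(values), max(values) + 1)
    rows = []
    # Build rows from the top of the tallest stack down to the number line.
    for level in range(max(counts.values()), 0, -1):
        row = " ".join("o" if counts[v] >= level else " " for v in axis)
        rows.append(row.rstrip())
    rows.append(" ".join(str(v) for v in axis))
    return "\n".join(rows)

print(dot_plot(observations))
```

Repeated values appear as taller stacks, so the shape of the distribution is visible while each individual observation stays on the plot.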

 

A stem-and-leaf display presents data by splitting each value into two parts: the leading digit or digits (stem) and the trailing digit (leaf). The stems are listed vertically, and the leaves of each stem are written horizontally beside it. The display shows the frequency distribution while preserving every individual value. Other measures used to describe and explore data include the standard deviation and measures of position such as quartiles, deciles, and percentiles (Corlett, 1966).

Stem-and-leaf

Stem | Leaf
2    | 2 5 7 9
3    | 4 6
5    | 0
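The display above corresponds to the values 22, 25, 27, 29, 34, 36, and 50. A minimal sketch of how such a display can be built follows, assuming two-digit values with the tens digit as the stem; the helper name `stem_and_leaf` is hypothetical.

```python
from collections import defaultdict

# Values read off the stem-and-leaf display above.
values = [22, 25, 27, 29, 34, 36, 50]

def stem_and_leaf(data):
    """Map each stem (tens digit) to its sorted leaves (units digits)."""
    display = defaultdict(list)
    for x in sorted(data):
        stem, leaf = divmod(x, 10)  # e.g. 27 -> stem 2, leaf 7
        display[stem].append(leaf)
    return dict(display)

for stem, leaves in stem_and_leaf(values).items():
    print(stem, "|", " ".join(str(leaf) for leaf in leaves))
```

Because every leaf is kept, the original values can be recovered from the display, which is what distinguishes it from a grouped frequency table.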

 

 

Probability refers to the chance or likelihood that an event will occur. The probabilities of all possible outcomes of an experiment sum to 1. A random variable is a quantity whose value depends on the outcome of an uncertain event (Fatovich & Phillips, 2017). For example, the color of a ball drawn blindly from a bag cannot be known in advance, so the color drawn is a random variable. There are three primary types of probability: marginal, joint, and conditional.

The marginal probability of an event B occurring is written P(B). For example, in a standard deck of playing cards, the marginal probability that a drawn card is black is P(black) = 0.5.

Joint probability is the probability that two or more events occur together, that is, the probability of their intersection. For example, for events A and B, the joint probability is P(A ∩ B).

Conditional probability is the probability that an event will occur given that another event is known to have occurred. For example, for events X and Y, the probability of X given Y is written P(X|Y) and equals P(X ∩ Y) / P(Y) when P(Y) > 0.
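The three types of probability can be illustrated by enumerating a standard 52-card deck. The events "black" and "king", and the helper names below, are chosen for illustration only.

```python
from fractions import Fraction
from itertools import product

ranks = ["A", "2", "3", "4", "5", "6", "7", "8", "9", "10", "J", "Q", "K"]
suit_colors = {"spades": "black", "clubs": "black",
               "hearts": "red", "diamonds": "red"}

# A standard deck: every combination of rank and suit (52 cards).
deck = list(product(ranks, suit_colors))

def prob(event):
    """Probability of an event when every card is equally likely to be drawn."""
    favorable = sum(1 for card in deck if event(card))
    return Fraction(favorable, len(deck))

def is_black(card):
    return suit_colors[card[1]] == "black"

def is_king(card):
    return card[0] == "K"

p_black = prob(is_black)                                        # marginal P(black)
p_black_and_king = prob(lambda c: is_black(c) and is_king(c))   # joint P(black ∩ king)
p_king_given_black = p_black_and_king / p_black                 # conditional P(king | black)

print(p_black, p_black_and_king, p_king_given_black)
```

Using `Fraction` keeps the results exact, so the conditional probability is obtained directly as the joint probability divided by the marginal probability.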

XYZ company management has used probability distributions to make strategic decisions. One example is scenario analysis, in which managers lay out a range of theoretically possible outcomes of a particular event, typically a best-case, a likely, and a worst-case scenario. The worst-case scenario takes values from the low end of the probability distribution, the likely scenario takes values from the middle, and the best-case scenario takes values from the upper end. The company has also used probability to forecast sales with the aim of optimizing production. By using scenario analysis, management can plan efficiently (Fatovich & Phillips, 2017).
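One simple way to turn a three-scenario analysis into a sales forecast is to weight each scenario by its probability. The sketch below is illustrative only: the probabilities and unit figures are hypothetical and are not data from XYZ company.

```python
# Hypothetical scenario figures for illustration; not data from XYZ company.
scenarios = {
    "worst case": {"probability": 0.25, "units_sold": 800},
    "likely":     {"probability": 0.50, "units_sold": 1000},
    "best case":  {"probability": 0.25, "units_sold": 1300},
}

# The scenarios are assumed to cover every outcome, so their probabilities sum to 1.
total_probability = sum(s["probability"] for s in scenarios.values())

# Expected sales: each scenario's outcome weighted by its probability.
expected_units = sum(s["probability"] * s["units_sold"] for s in scenarios.values())

for name, s in scenarios.items():
    print(f"{name}: {s['units_sold']} units (p = {s['probability']})")
print(f"Expected sales: {expected_units:.0f} units")
```

The expected value gives a single planning figure, while the worst-case and best-case values bound the production levels management may need to prepare for.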

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

References

Corlett, T. (1966). Statistical Applications in Market Research: An Introduction. Applied Statistics, 15(3), 155. https://doi.org/10.2307/2985297

Fatovich, D., & Phillips, M. (2017). The probability of probability and research truths. Emergency Medicine Australasia, 29(2), 242-244. https://doi.org/10.1111/1742-6723.12740

Hawkins, A., & Panel, S. (1985). Statistics and probability-An Introductory course. Applied Statistics, 34(2), 174. https://doi.org/10.2307/2347372

Kamath, C. (2009). Application-Driven Data Analysis. Statistical Analysis and Data Mining, 1(5), 285-285. https://doi.org/10.1002/sam.10023

Popović, B. (2014). Understanding advanced statistical methods. Journal of Applied Statistics, 41(12), 2777-2777. https://doi.org/10.1080/02664763.2014.913838

Wang, S. (2015). Exploring a Research Method – Interview. Advances in Social Sciences Research Journal, 2(7). https://doi.org/10.14738/assrj.27.1270
