Quantitative data analysis

QUESTION TWO

  1. Explain what is meant by quantitative data analysis

Fielding et al. (1998) state that quantitative analysis refers to economic, business or financial analysis that aims to understand or predict behavior or events through the use of mathematical measurements and calculations, statistical modeling and research. Quantitative analysts aim to represent a given reality in terms of a numerical value. Quantitative analysis is employed for a number of reasons, including measurement, performance evaluation or valuation of a financial instrument, and prediction of real-world events such as changes in a country's gross domestic product (GDP) growth rate.

In general terms, quantitative analysis can best be understood as simply a way of measuring or evaluating things through the examination of mathematical values of variables. The primary advantage of quantitative analysis is that it involves studying precise, definitive values that can easily be compared with each other, such as a company’s year-over-year revenues or earnings. In the financial world, analysts who rely strictly on quantitative analysis are frequently referred to as “quants” or “quant jockeys.”

Governments rely on quantitative analysis to make monetary and other economic policy decisions. Governments and central banks commonly track and evaluate statistical data such as GDP and employment figures.

Common uses of quantitative analysis in investing include the calculation and evaluation of key financial ratios such as the price-earnings ratio (P/E) or earnings per share (EPS). Quantitative analysis ranges from examination of simple statistical data such as revenue, to complex calculations such as discounted cash flow or option pricing.
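As a sketch of these ratio calculations, here is a short example. All figures are hypothetical, invented for a fictional company purely to illustrate the arithmetic:

```python
# Hypothetical figures for a fictional company -- illustrative only.
net_income = 5_000_000.0        # annual net income
shares_outstanding = 2_000_000  # common shares in issue
share_price = 30.0              # current market price per share

# Earnings per share (EPS) = net income / shares outstanding
eps = net_income / shares_outstanding

# Price-earnings ratio (P/E) = share price / EPS
pe_ratio = share_price / eps

print(f"EPS: {eps:.2f}")       # EPS: 2.50
print(f"P/E: {pe_ratio:.2f}")  # P/E: 12.00
```

Because both ratios are precise numerical values, they can be compared directly across companies or across years, which is the advantage of quantitative analysis noted above.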
  1. Show the stages to be followed in quantitative data analysis

The following are some of the stages that are used in quantitative data analysis

Formulate a hypothesis and select variables

A hypothesis is a statement about an expected relationship between two or more variables that permits empirical testing. Formulating the hypothesis and then choosing the variables is the key conceptual stage of the research, since it defines the direction of the study. If you explore a data set for long enough you will find some sort of relationship, but the relationships that are meaningful are those defined in the hypothesis.

Determine sample

Decide how many observations are needed and how they will be chosen. The sample can be determined using probability-based random sampling, in which every member of the population has a known chance of selection. To put the data in context, describe the sample in terms of averages (e.g. average height) and variation (e.g. the range of heights).
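A simple random sample can be drawn as follows. The population of 500 numbered respondents and the sample size of 50 are hypothetical values chosen for illustration:

```python
import random

# Hypothetical sampling frame: 500 numbered respondents.
population = list(range(1, 501))

# Simple random sample: every member has an equal probability of
# selection; a fixed seed makes the draw reproducible.
random.seed(42)
sample = random.sample(population, k=50)

print(len(sample))       # 50
print(len(set(sample)))  # 50 -- sampling is without replacement, so no duplicates
```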

Collect the data

The data is collected after the proposal has been approved.

Prepare the data

Data must be cleaned and organised for analysis. (Note that the coding and nature of the data should be thought through before the data-gathering process starts, and should be pre-tested.)

Actions:

  • Code/ input the data in the analysis software
  • Check the data for errors and accuracy (Are all the responses reasonable? Are all relevant questions answered? Are the responses complete?)
  • Transform the data (e.g. collapse data into categories, handle missing values)
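The preparation actions above can be sketched in a few lines. The survey values, the valid age range and the category cut-off are all hypothetical choices made for this example:

```python
# Hypothetical raw survey responses: age in years, coded as strings,
# with "" marking a missing answer and one impossible value (-3).
raw = ["25", "31", "", "47", "-3", "52"]

cleaned = []
for value in raw:
    if value == "":             # handle missing values: drop the record
        continue
    age = int(value)
    if not 0 <= age <= 120:     # error check: is the response reasonable?
        continue
    cleaned.append(age)

# Transform: collapse ages into categories for analysis.
categories = ["under 40" if a < 40 else "40 and over" for a in cleaned]

print(cleaned)     # [25, 31, 47, 52]
print(categories)  # ['under 40', 'under 40', '40 and over', '40 and over']
```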

Organize and present the data

The data is organised and presented in form of tables, frequencies and graphs.
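A frequency table of the kind described can be produced like this; the survey responses are invented for illustration:

```python
from collections import Counter

# Hypothetical responses to a Likert-type question.
responses = ["agree", "agree", "neutral", "disagree", "agree", "neutral"]

freq = Counter(responses)
n = len(responses)

# Present as a frequency table with percentages.
for category, count in freq.most_common():
    print(f"{category:<10}{count:>3}{count / n:>8.1%}")
```

The same counts feed directly into a bar chart or pie chart when a graphical presentation is preferred.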

Validate/discuss with key stakeholders

The data findings are discussed with key stakeholders and recommendations made.

  1. Explain when to use individual measure of central tendency

According to Felsing et al. (2000), central tendency is defined as the statistical measure that identifies a single value as representative of an entire distribution. It aims to provide an accurate description of the entire data set: the single value that is most typical or representative of the collected data. The term "number crunching" is sometimes used to describe this aspect of data description. The mean, median and mode are the three commonly used measures of central tendency.

Mean

Mean is the most commonly used measure of central tendency. There are different types of mean, viz. arithmetic mean, weighted mean, geometric mean (GM) and harmonic mean (HM). If mentioned without an adjective (as mean), it generally refers to the arithmetic mean. This is used when you want to get the average of any data set.

Arithmetic mean

Arithmetic mean (or, simply, "mean") is nothing but the average. It is computed by adding all the values in the data set and dividing by the number of observations. For raw data x1, x2, ..., xn, the mean is given by the formula x̄ = (x1 + x2 + ... + xn) / n = (Σxi) / n.
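A minimal sketch of the calculation, using hypothetical heights:

```python
import statistics

heights = [160, 165, 170, 175, 180]  # hypothetical heights in cm

# Arithmetic mean: sum of all values / number of observations.
mean = sum(heights) / len(heights)
print(mean)                     # 170.0
print(statistics.mean(heights)) # the library function agrees
```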

Weighted mean

Weighted mean is calculated when certain values in a data set are more important than others. A weight wi is attached to each value xi to reflect its importance, and the weighted mean is given by x̄w = (Σ wi xi) / (Σ wi).
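For example, using hypothetical course scores where the exam carries more weight than the homework or the quiz:

```python
# Hypothetical scores and weights: homework, exam, quiz.
values  = [80, 90, 70]
weights = [2, 5, 3]  # the exam (weight 5) matters most

# Weighted mean = sum of w_i * x_i divided by sum of weights.
weighted_mean = sum(w * x for w, x in zip(weights, values)) / sum(weights)
print(weighted_mean)  # 82.0
```

Note that the simple arithmetic mean of the same scores is 80.0; the higher exam score pulls the weighted mean up.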

Median

Median is the value which occupies the middle position when all the observations are arranged in an ascending/descending order. It divides the frequency distribution exactly into two halves. Fifty percent of observations in a distribution have scores at or below the median. Hence median is the 50th percentile. Median is also known as positional average.

It is easy to calculate the median. If the number of observations is odd, then the ((n + 1)/2)th observation (in the ordered set) is the median. When the total number of observations is even, the median is given by the mean of the (n/2)th and (n/2 + 1)th observations.
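This positional rule can be sketched as follows; the two small data sets are invented to show the odd and even cases:

```python
import statistics

def median(data):
    """Median by the positional rule described above."""
    s = sorted(data)
    n = len(s)
    if n % 2 == 1:
        return s[(n + 1) // 2 - 1]           # the ((n+1)/2)th ordered observation
    return (s[n // 2 - 1] + s[n // 2]) / 2   # mean of the (n/2)th and (n/2+1)th

odd  = [3, 1, 7, 5, 9]  # n = 5 (odd): median is the 3rd ordered value
even = [3, 1, 7, 5]     # n = 4 (even): mean of the 2nd and 3rd ordered values

print(median(odd), statistics.median(odd))    # 5 5
print(median(even), statistics.median(even))  # 4.0 4.0
```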

Mode

Mode is defined as the value that occurs most frequently in the data. Some data sets do not have a mode because each value occurs only once. On the other hand, some data sets can have more than one mode. This happens when the data set has two or more values of equal frequency which is greater than that of any other value. Mode is rarely used as a summary statistic except to describe a bimodal distribution. In a bimodal distribution, the taller peak is called the major mode and the shorter one is the minor mode.
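For illustration, with invented data sets covering the single-mode, bimodal and no-repeat cases:

```python
import statistics

data = [2, 3, 3, 5, 7, 7, 7]           # 7 occurs most frequently
print(statistics.mode(data))            # 7

# A bimodal data set: two values tie for the highest frequency.
bimodal = [1, 1, 2, 3, 3]
print(statistics.multimode(bimodal))    # [1, 3]

# Every value occurs only once, so multimode returns all of them.
unique = [4, 8, 15]
print(statistics.multimode(unique))     # [4, 8, 15]
```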
  1. Explain different statistical tests and show why they can be used in a quantitative study.

According to Felsing et al. (2000), the following are some common statistical tests and their uses.

Correlational tests: these look for an association between variables.

  • Pearson correlation: tests the strength of the association between two continuous variables
  • Spearman correlation: tests the strength of the association between two ordinal variables (does not rely on the assumption of normally distributed data)
  • Chi-square: tests the strength of the association between two categorical variables

Comparison of means: these look for the difference between the means of variables.

  • Paired t-test: tests for the difference between two related variables
  • Independent t-test: tests for the difference between two independent groups
  • ANOVA: tests the difference between group means after any other variance in the outcome variable is accounted for

Regression: assesses whether change in one variable predicts change in another variable.

  • Simple regression: tests how change in the predictor variable predicts the level of change in the outcome variable
  • Multiple regression: tests how change in a combination of two or more predictor variables predicts the level of change in the outcome variable

Non-parametric tests: used when the data do not meet the assumptions required for parametric tests.

  • Wilcoxon rank-sum test: tests for the difference between two independent variables; takes into account the magnitude and direction of the difference
  • Wilcoxon signed-rank test: tests for the difference between two related variables; takes into account the magnitude and direction of the difference
  • Sign test: tests whether two related variables differ; ignores the magnitude of change and takes only the direction into account
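As an illustration of the correlational tests, Pearson's r can be computed by hand on hypothetical paired data. This is only a sketch of the coefficient itself, not a full significance test:

```python
import math

# Hypothetical paired observations of two continuous variables.
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.1, 3.9, 6.2, 8.0, 9.8]

n = len(x)
mean_x = sum(x) / n
mean_y = sum(y) / n

# Pearson r = sum of co-deviations / product of root sums of squared deviations.
num = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
den = math.sqrt(sum((a - mean_x) ** 2 for a in x) *
                sum((b - mean_y) ** 2 for b in y))
r = num / den
print(round(r, 3))  # close to 1: a very strong positive association
```

Values of r near +1 or -1 indicate a strong linear association; values near 0 indicate little or none.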
References

Agresti, A. (1996). An Introduction to Categorical Data Analysis. New York: Wiley.

Anderson, R.L. (1959). Use of contingency tables in the analysis of consumer preference studies. Biometrics, 15, 582-590.

Conover, W.J. (1999). Practical Non-parametric Statistics (3rd ed.). New York: Wiley.

Felsing, M., Haylor, G.S., Lawrence, A. and Abeyasekera, S. (2000). Evaluating some statistical methods for preference testing in participatory research. DFID Aquaculture Research Programme Project.

Fielding, W.J., Riley, J. and Oyejola, B.A. (1998). Ranks are statistics: some advice for their interpretation. PLA Notes 33. London: IIED.