Research consultancy

Content

Introduction:
What is SPSS?
Major windows used in SPSS
Opening SPSS
Creating a data entry screen/ template
Creating Variables (For both closed and open-ended questions)
Entering data
Saving data
Creating new variables from the existing ones
Computing new variables
Importing data from Excel
Merging data files
Data cleaning
Data analysis and interpretation of results
Descriptive statistics (Maximum, minimum, and standard deviation of the variables)
Sample T-tests
Frequency tables (one-way table and cross tabulation)
Graphs
Hypothesis testing
Correlations
Regression (Assessing an effect/Impact of one variable on the other)

1.0 Introduction:

1.1 What is SPSS and its Role?

SPSS originally stood for Statistical Package for the Social Sciences.
SPSS is a statistical package that can perform highly complex data manipulation and analysis with simple instructions.

1.2 To start SPSS

Double-click the SPSS icon if it is on your taskbar or desktop

After you double-click the SPSS icon, the screen will appear as shown below.

1.3 Major Windows used in SPSS

Data View (Data Editor) window. It is used to enter, edit and view data. To open it, click the “Data View” tab at the bottom-left corner of the SPSS spreadsheet. Each row holds one case (for example, one respondent) and each column holds one variable. The screen will appear as shown below.
Variable View window. It is used to create, edit, add and delete variables. Each row corresponds to one variable, and each column holds one property of that variable. To open it, click the “Variable View” tab at the bottom-left corner of the SPSS spreadsheet. The screen will appear as shown below.
Each column in the Variable View has its own function, as shown in the table below:

Column	Function
Name	Name of the variable. Keep it short and understandable. Start the name with a letter, not a number or symbol, since SPSS will not accept it. You cannot use spaces in the name; use an underscore instead, for example “Edu_level”. You cannot use words that SPSS reserves as commands, such as ALL, AND, BY, EQ, GE, GT, LE, LT, NE, NOT, OR, TO, WITH. The name cannot exceed 64 characters. Each name must be unique (no two variables can share a name).
Type	The kind of data being entered. Examples of types are Numeric (numbers), Date (dates) and String (letters and other characters, with or without spaces).
Width	The number of characters that can be typed in the data cell. The default for numeric and string variables is 8, which only needs to be changed if you want to type long strings of numbers or whole sentences.
Decimals	The number of decimal places shown. The default is 2 for numeric variables. It applies only to continuous data.
Label	The description of the variable. Use the question that the variable is based on, or something else that accurately describes the variable. For example: “What is your highest level of education?”
Values	Here you can add a label to each response alternative. For example, for the variable Sex, “Male” is coded as 1 and “Female” is coded as 2.
Missing	By default, missing values are shown as “.” (dot) for numeric variables in the data set. For missing values in string variables, cells are left blank.
Align	Sets how values are aligned in the data cell. You do not need to change it.
Measure	The level of measurement of the data being entered. In SPSS it is Scale (measured data, e.g., weight, height and temperature), Ordinal (data in ranked categories, e.g., Likert-scale questions) or Nominal (categorical data without ranking, e.g., gender, religion, tribe).

Output window. It displays the results as tables or graphs after an analysis has been run. Before any analysis is conducted, the output window appears as shown below.

After you run an analysis, the output window displays your results as shown in the sample below.
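
The naming rules listed in the Name row of the table can be checked mechanically. The following is a minimal Python sketch, not part of SPSS itself: the function `check_name` is a hypothetical helper, it covers only the rules stated in this tutorial, and it assumes that names differing only in letter case count as duplicates.

```python
# Reserved words listed in the Name row of the Variable View table.
RESERVED = {"ALL", "AND", "BY", "EQ", "GE", "GT", "LE", "LT",
            "NE", "NOT", "OR", "TO", "WITH"}

def check_name(name, existing=()):
    """Return a list of problems with a proposed variable name (empty if none)."""
    problems = []
    if not name or not name[0].isalpha():
        problems.append("must start with a letter")
    if " " in name:
        problems.append("must not contain spaces")
    if name.upper() in RESERVED:
        problems.append("is a reserved word")
    if len(name) > 64:
        problems.append("is longer than 64 characters")
    # Assumption: names that differ only in case are treated as duplicates.
    if name.lower() in {e.lower() for e in existing}:
        problems.append("duplicates an existing name")
    return problems

print(check_name("Edu_level"))   # []
print(check_name("2nd score"))   # ['must start with a letter', 'must not contain spaces']
print(check_name("BY"))          # ['is a reserved word']
```

An empty list means the name passes every rule in the table; SPSS applies further rules that are not covered here.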

1.4 Saving an SPSS file

Click “File” at the upper-left corner of the SPSS window
Select “Save As” or “Save”. Note: “Save As” and “Save” are not active until you have created a variable or entered some data.
Give the file a name and click “Save”

1.5 Opening data

To open a file in SPSS, navigate to File >>> Open >>> Data.
Browse for the desired dataset, select it and press the “Open” button.
If the dataset is not visible, make sure that the directory is correct and that the proper file format (SPSS (*.sav), Excel (*.xls, *.xlsx, *.xlsm), etc.) is selected.

2.0 Computing a New Variable

Go to Transform >>> Compute Variable
In the “Target Variable” box, enter the name of the variable being computed. In the “Numeric Expression” box, move in the variable(s) you want to use and build the calculation of interest.
Press the “OK” button to compute the variable
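
The idea behind Compute Variable can be sketched outside SPSS as applying one expression to every case. The Python sketch below is only an illustration: the variables `test1`, `test2` and `total` and the helper `compute` are hypothetical, and `None` stands in for a missing value.

```python
# Hypothetical cases: two test scores per respondent; None marks a missing value.
cases = [
    {"test1": 12, "test2": 15},
    {"test1": 9, "test2": None},
    {"test1": 14, "test2": 11},
]

def compute(cases, target, expression):
    """Add `target` to every case by applying `expression` to that case.

    If the expression meets a missing value (None), the result is left
    missing as well.
    """
    for case in cases:
        try:
            case[target] = expression(case)
        except TypeError:  # arithmetic on None
            case[target] = None
    return cases

# Target variable "total", numeric expression test1 + test2.
compute(cases, "total", lambda c: c["test1"] + c["test2"])
print([c["total"] for c in cases])  # [27, None, 25]
```

The second case stays missing because one of its inputs is missing, which is the safer choice when a total would otherwise be understated.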

3.0 Recoding Variables

When recoding variables there are two options: recoding into the same variable or recoding into a different variable.
It is recommended to almost always recode into a different (new) variable, so that the original data are not overwritten.
To recode a variable, go to Transform >>> Recode into Different Variables…
In the leftmost column of the dialog box, select the variable to be recoded and click the arrow
The variable(s) will appear in the middle box, “Numeric Variable –> Output Variable”
Enter the name of the new recoded variable in the “Name” text box
Enter a label if desired
Click the “Change” button
This replaces the question mark with the new variable’s name, showing that the variable on the left of the “–>” will be recoded into the variable on the right of the “–>”.

Click “Old and New Values”
A new dialog box appears with three sections: “Old Value”, “New Value” and “Old –> New”.
Enter the code “1” in the “New Value” text box
Under “Old Value”, click “Range” and enter the minimum and maximum of the values that should receive code “1”
Click the “Add” button in the “Old –> New” section
Repeat the process for each remaining code until all categories are covered
Click “Continue” and then “OK” to finish the recode process
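
The old-to-new mapping built in the steps above is a list of ranges, each with a new code. The Python sketch below shows that logic under stated assumptions: the age ranges, the codes and the helper `recode` are all hypothetical, and values that fall in no range are left missing (`None`).

```python
# Hypothetical recoding rules: (lowest, highest, new code), both ends inclusive.
AGE_GROUPS = [
    (18, 29, 1),
    (30, 44, 2),
    (45, 120, 3),
]

def recode(value, rules):
    """Return the new code for `value`, or None if no range covers it."""
    if value is None:
        return None
    for low, high, code in rules:
        if low <= value <= high:
            return code
    return None

ages = [22, 30, 51, 17, None]
# Recode into a different variable: `ages` itself is left untouched.
age_group = [recode(a, AGE_GROUPS) for a in ages]
print(age_group)  # [1, 2, 3, None, None]
```

Because the result goes into a new list, the original ages remain available, which mirrors the recommendation to recode into a different variable.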

4.0 Merging variables and data cases

4.1 Merging cases

To add cases, ensure that the variable names are the same in both files
Open the first data set
Click Data >>> Merge Files >>> Add Cases

Browse for and select the file you want to add
Click “Open”
Click “OK”

4.2 Merging variables

Open the first data set
Click Data >>> Merge Files >>> Add Variables

Browse for and select the file you want to add
Click “Open”
Click “OK”
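
The difference between the two kinds of merge in 4.1 and 4.2 can be sketched in Python: adding cases stacks rows, while adding variables widens each row. The files, the `id` variable and both helper functions below are hypothetical, and matching cases on a shared `id` is an assumption made for this sketch rather than something the steps above prescribe.

```python
# Hypothetical files, each a list of cases (one dict per case).
file_a = [{"id": 1, "sex": 1}, {"id": 2, "sex": 2}]
file_b = [{"id": 3, "sex": 2}]                          # same variables, new cases
file_c = [{"id": 1, "age": 34}, {"id": 2, "age": 27}]   # same cases, new variable

def add_cases(first, second):
    """Stack the cases of two files; variable names must be the same."""
    if first and second and set(first[0]) != set(second[0]):
        raise ValueError("variable names differ between the files")
    return first + second

def add_variables(first, second, key):
    """Match cases on `key` and bring in the second file's variables."""
    lookup = {case[key]: case for case in second}
    return [{**case, **lookup.get(case[key], {})} for case in first]

print(add_cases(file_a, file_b))
print(add_variables(file_a, file_c, key="id"))
```

The check in `add_cases` reflects the requirement in 4.1 that variable names be the same in both files.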

5.0 Frequency distribution – categorical/nominal variables

Analyze >>> Descriptive Statistics >>> Frequencies

Select variables from the left-hand box and move them into the right-hand box
Additional statistics can be selected by clicking the “Statistics” button
Charts such as a histogram can be selected by clicking “Charts”
Press “OK”

6.0 Statistical analysis of data

It involves five major steps.

Enter your data in the Data Editor
Select a procedure from the menu
Select the variables for the analysis
Examine the results in the output window
Interpret the results in a Word document

6.1 Frequency distribution – categorical/nominal variables, e.g., sex, marital status, age group

Analyze >>> Descriptive Statistics >>> Frequencies
Select variables from the left-hand box and move them into the right-hand box
Additional statistics can be selected by clicking the “Statistics” button
Charts such as a histogram can be selected by clicking “Charts”
Press “OK”
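
What the Frequencies procedure reports for a categorical variable is, at its core, a count and a percentage per category. The Python sketch below illustrates that with hypothetical responses for Sex, coded 1 = Male and 2 = Female as in the Values example earlier; `frequency_table` is a hypothetical helper, not an SPSS function.

```python
from collections import Counter

# Hypothetical responses for Sex, coded 1 = Male, 2 = Female.
sex = [1, 2, 2, 1, 2, 2, 1, 2]
labels = {1: "Male", 2: "Female"}

def frequency_table(values, labels):
    """Return (label, count, percent) rows sorted by code."""
    counts = Counter(values)
    total = sum(counts.values())
    return [(labels.get(code, code), n, round(100 * n / total, 1))
            for code, n in sorted(counts.items())]

for label, n, pct in frequency_table(sex, labels):
    print(f"{label:<8}{n:>5}{pct:>8.1f}")
```

With these eight responses the table has 3 males (37.5%) and 5 females (62.5%).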

6.2 Descriptive statistics – quantitative/continuous variables, e.g., age, height, weight, temperature

Descriptive statistics are mostly generated for discrete and continuous variables. The most common ones are the mean and the sum. Dispersion statistics such as the maximum, minimum, range and standard deviation can be generated in the same window as the descriptives. Skewness and kurtosis can also be generated.

Analyze >>> Descriptive Statistics >>> Descriptives
Select variables from the left-hand box and move them into the right-hand box
You can specify the particular statistics required by clicking the “Options” or “Statistics” button
Press “OK”
Interpret the results, i.e., the mean, median, mode, frequency, quartiles, sum, variance, standard deviation, minimum, maximum, range, kurtosis and skewness.
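
To make the listed statistics concrete, the following Python sketch computes several of them for a small set of hypothetical ages using the standard `statistics` module. The data and the helper `describe` are assumptions for illustration, and the standard deviation and variance use the sample (n − 1) formulas.

```python
import statistics

ages = [23, 35, 31, 28, 40, 35]  # hypothetical data

def describe(values):
    """Return common descriptive statistics for a list of numbers."""
    return {
        "N": len(values),
        "Minimum": min(values),
        "Maximum": max(values),
        "Range": max(values) - min(values),
        "Sum": sum(values),
        "Mean": statistics.mean(values),
        "Std. Deviation": statistics.stdev(values),  # sample SD (n - 1)
        "Variance": statistics.variance(values),     # sample variance (n - 1)
    }

for name, value in describe(ages).items():
    print(f"{name:<15}{value}")
```

For these six ages the mean is 32, the range is 17 and the sample standard deviation is 6.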

7.0 Graphing Data

7.1 Bar graph

Illustration

Sex

Sex: 01=Male 02=Female

Procedure

Variable View >>> Name “Sex” >>> Type: select “Numeric” >>> Decimals: change to 0 >>> Label “Sex of the respondents” >>> Values “1=Male, 2=Female” >>> Measure: select “Nominal”
Enter the sex of each participant in the study in the Data View cells
Graphs >>> Legacy Dialogs >>> Bar >>> Simple
Select “Summaries for groups of cases”
Click “Define”
Select the categorical variable to be charted, “Sex”
Press “OK”

Output

Right-click the bar graph >>> select “Edit Content” >>> “In Separate Window” >>> in the Chart Editor, choose “Show Data Labels” >>> in Properties, drag “Percent” from “Not Displayed” to “Displayed” and drag “Count” from “Displayed” to “Not Displayed”
Press “Apply”
Close Properties
Close the Chart Editor

7.2 Box Plot

Illustration

HHNumber

Procedure

Variable View >>> Name “HH_no” >>> Type: select “Numeric” >>> Decimals: change to 0 >>> Label “Household Number” >>> Values: leave blank >>> Measure: select “Scale”
Enter the household number of each participant in the study in the Data View cells
Graphs >>> Legacy Dialogs >>> Boxplot >>> Simple
Select “Summaries of separate variables”
Click “Define”
Select the continuous variable(s) to be charted
Press “OK”

Output

7.3 Histogram

Illustration

HHNumber

Procedure

Variable View >>> Name “HH_no” >>> Type: select “Numeric” >>> Decimals: change to 0 >>> Label “Household Number” >>> Values: leave blank >>> Measure: select “Scale”
Enter the household number of each respondent in the study in the Data View cells
Graphs >>> Legacy Dialogs >>> Histogram
Select the variable(s)
Optionally, select “Display normal curve”
Press “OK”

Output

7.4 Scatter Plot

Illustration

Weight (kg)	56	56	67	89	98	65	76	71	63	72	74	73	64	75	75	76	89	56	56	67	89	98	65	76	71	63	72	74	73	64	75	75	76	89	89
Height (M)	1	2	2	2	2	2	1	2	2	1	2	2	2	2	2	1	2	2	1	2	2	2	2	2	1	1	2	2	2	2	2	1	2	2	2

Procedure

Variable View >>> Name “weight” >>> Type: select “Numeric” >>> Decimals: change to 0 >>> Label “Weight of the respondents” >>> Values: leave blank >>> Measure: select “Scale”
Variable View >>> Name “height” >>> Type: select “Numeric” >>> Decimals: change to 0 >>> Label “Height of the respondents” >>> Values: leave blank >>> Measure: select “Scale”
Enter the weight and height of each respondent in the study in the Data View cells
Graphs >>> Legacy Dialogs >>> Scatter >>> Simple >>> Define
Select the Y-axis and X-axis variables
Press “OK”

Output

7.5 Line Graph

Illustration

HHNumber

Procedure

Variable View >>> Name “HH_no” >>> Type: select “Numeric” >>> Decimals: change to 0 >>> Label “Household Number” >>> Values: leave blank >>> Measure: select “Scale”
Enter the household number of each participant in the study in the Data View cells
Graphs >>> Legacy Dialogs >>> Line >>> Simple
Select “Values of individual cases”
Click “Define”
Select the Y-axis and X-axis variables
Press “OK”

Output

8.0 Sample Tests in SPSS

One-sample t-test
Paired-samples t-test
Independent-samples t-test
ANOVA test

Please always remember that:

A one-sample t-test compares the mean of one variable with a hypothesized value
A paired-samples t-test compares the means of two variables for a single group
An independent-samples t-test compares the means of two groups of cases
A t-test is used for testing a single mean or comparing two means, while ANOVA is used for comparing several (three or more) means.

8.1 ONE SAMPLE T-TEST

A one-sample t-test is performed when you want to determine whether the mean value of a target variable differs from a hypothesized value.

When should one use a one-sample t-test?

If you have a single sample of data and want to test whether it comes from a population with a known mean
If you want to test whether the mean of a single variable differs from a specified constant
If you want to test the hypothesis that a sample comes from a population with a particular mean

Assumptions for the one sample t-test

The dependent variable is normally distributed within the population
The data are independent (scores of participants are not dependent on scores of others)

Steps

Analyze >>> Compare Means >>> One-Sample T Test
Move the variable to be tested into the “Test Variable(s)” box
Enter the hypothesized test value, i.e., the number against which the sample mean is compared
Optionally, click “Options” to control the treatment of missing data and the confidence interval level
Finally, click “OK”
Interpret the result

ILLUSTRATION

The data below shows the glucose levels of 10 runners before and after the marathon race.

After	31.4	25.9	16.8	23.8	24.6	31.5	26.8	22.6	16.9	32.6
Before	3.4	6.5	5.4	6.9	8.3	7	9.5	10.4	14	17.5
Diff

Question: Is there a difference in the glucose levels before and after the marathon race?

State the hypothesis
Use a t-test to determine whether the mean difference in glucose levels before and after the marathon race is zero.

Procedure

Variable View >>> Name “after” >>> Type: select “Numeric” >>> Decimals: change to 1 >>> Label “After-marathon glucose level of the respondents” >>> Values: leave blank >>> Measure: select “Scale”
Variable View >>> Name “before” >>> Type: select “Numeric” >>> Decimals: change to 1 >>> Label “Before-marathon glucose level of the respondents” >>> Values: leave blank >>> Measure: select “Scale”
Enter the after and before glucose levels of each respondent in the study in the Data View cells
First compute a new variable: the difference between the after value and the before value
Transform >>> Compute Variable
For “Target Variable” type diff2; for “Numeric Expression” type after - before
Click “OK”
Analyze >>> Compare Means >>> One-Sample T Test
Select diff2 as the test variable and set the test value to 0
Click “Options” and set the confidence interval to 95%
Under “Missing Values”, select “Exclude cases analysis by analysis”
Click “Continue”
Click “OK”

Interpretation of the result

Test for Normality

Before you conduct any parametric test, you need to check that the data values come from an approximately normal distribution. To do this, you can compare the frequency distribution of your data values with that of a normalized version of these values. If the data are approximately normal, the two distributions should be similar. A normality test provides a statistic that indicates whether your data are significantly different from normal. The null hypothesis is that the distribution of your data is NOT different from a normal distribution. The alternative hypothesis is that the distribution of your data is different from a normal distribution. We reject the null hypothesis if the p-value is less than 0.05 (meaning that data this far from normal would occur less than 5% of the time if the null hypothesis were true).

For the glucose example, the hypotheses of the t-test are:

H₀: µ₁ – µ₂ = 0 or H₀: µ₁ = µ₂

H₁: µ₁ – µ₂ ≠ 0 or H₁: µ₁ ≠ µ₂

One-Sample Test (Test Value = 0)
	t	df	Sig. (2-tailed)	Mean Difference	95% CI of the Difference (Lower)	95% CI of the Difference (Upper)
diff2	7.444	9	.000	16.40000	11.4163	21.3837

Looking at the two-tailed significance level, the p-value (shown as .000, i.e., p < 0.001) is less than 0.05, so we reject the null hypothesis.
Looking at the confidence interval, it does not include 0, so we reject the null hypothesis.

Conclusion

Glucose levels rise during a marathon run.
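
The values in the One-Sample Test table can be checked by hand from the illustration data. The Python sketch below computes diff2 = after − before and then the t statistic, t = mean difference ÷ (standard deviation ÷ √n). The helper `one_sample_t` is hypothetical, and the critical value 2.262 (two-tailed, 5%, 9 degrees of freedom) is taken from a standard t table as an assumption, since the standard library has no t distribution.

```python
import math
import statistics

after = [31.4, 25.9, 16.8, 23.8, 24.6, 31.5, 26.8, 22.6, 16.9, 32.6]
before = [3.4, 6.5, 5.4, 6.9, 8.3, 7.0, 9.5, 10.4, 14.0, 17.5]

# Same computation as Transform >>> Compute Variable with "after - before".
diff2 = [a - b for a, b in zip(after, before)]

def one_sample_t(values, test_value):
    """Return (t, df, mean difference, standard error of the mean)."""
    n = len(values)
    mean_diff = statistics.mean(values) - test_value
    std_error = statistics.stdev(values) / math.sqrt(n)
    return mean_diff / std_error, n - 1, mean_diff, std_error

t, df, mean_diff, std_error = one_sample_t(diff2, 0)

# Assumption: two-tailed 5% critical value for df = 9 from a standard t table.
T_CRIT_95_DF9 = 2.262
lower = mean_diff - T_CRIT_95_DF9 * std_error
upper = mean_diff + T_CRIT_95_DF9 * std_error

print(f"t = {t:.3f}, df = {df}, mean difference = {mean_diff:.2f}")
print(f"95% CI: {lower:.2f} to {upper:.2f}")
```

Run on the data above, this should agree with the SPSS table (t = 7.444, df = 9, mean difference 16.40) up to rounding.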

Illustration

The data below shows the Household number of 35 respondents that participated in the study

HHNo.

Question: Is the mean of the household number equal to 3?

State the hypothesis

Procedure

Variable View >>> Name “HH_no” >>> Type: select “Numeric” >>> Decimals: change to 0 >>> Label “Household Number” >>> Values: leave blank >>> Measure: select “Scale”
Enter the household number of each participant in the study in the Data View
Analyze >>> Compare Means >>> One-Sample T Test
Select HH_no as the test variable and set the test value to 3
Click “Options” and set the confidence interval to 95%
Under “Missing Values”, select “Exclude cases analysis by analysis”
Click “Continue”
Click “OK”

Output

One-Sample Statistics
	N	Mean	Std. Deviation	Std. Error Mean
HH Number of the respondent	35	3.17	1.014	.171

One-Sample Test (Test Value = 3)
	t	df	Sig. (2-tailed)	Mean Difference	95% CI of the Difference (Lower)	95% CI of the Difference (Upper)
HH Number of the respondent	1.000	34	.324	.171	-.18	.52

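H₀: µ = 3

H₁: µ ≠ 3

Since the p-value (.324) is greater than 0.05 and the confidence interval includes 0, we fail to reject the null hypothesis: the data give no evidence that the mean household number differs from 3.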
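
Because the household data themselves are not listed here, the test can only be checked from the summary figures in the two output tables. The Python sketch below does that under stated assumptions: it uses the reported N, standard deviation and mean difference, and a critical value of 2.032 (two-tailed, 5%, 34 degrees of freedom) taken from a standard t table.

```python
import math

n, sd = 35, 1.014         # from the One-Sample Statistics table
mean_difference = 0.171   # from the One-Sample Test table (sample mean minus 3)

std_error = sd / math.sqrt(n)     # standard error of the mean
t = mean_difference / std_error   # one-sample t statistic

# Assumption: two-tailed 5% critical value for df = 34 from a standard t table.
T_CRIT_95_DF34 = 2.032
lower = mean_difference - T_CRIT_95_DF34 * std_error
upper = mean_difference + T_CRIT_95_DF34 * std_error

print(f"std. error = {std_error:.3f}, t = {t:.2f}")
print(f"95% CI of the difference: {lower:.2f} to {upper:.2f}")
```

Since the inputs are already rounded, the results match the SPSS output only to about two decimal places. The interval still contains 0, which is consistent with not rejecting the null hypothesis.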

Wilbroad Oketcho
