Research consultancy

Content

  • Introduction:
  • What is SPSS?
  • Major windows used in SPSS
  • Opening SPSS
  • Creating a data entry screen/ template
  • Creating Variables (For both closed and open-ended questions)
  • Entering data
  • Saving data
  • Creating new variables from the existing ones
  • Computing new variables
  • Importing data from excel
  • Merging data files
  • Data cleaning
  • Data analysis and interpretation of results
  • Descriptive statistics (Maximum, minimum, and standard deviation of the variables)
  • Sample T-tests
  • Frequency tables (one-way table and cross tabulation)
  • Graphs
  • Hypothesis testing
  • Correlations
  • Regression (Assessing an effect/Impact of one variable on the other)

1.0 Introduction:

1.1 What is SPSS and its Role?

  • SPSS originally stood for Statistical Package for the Social Sciences.
  • SPSS is a statistical package that can perform highly complex data manipulation and analysis with simple instructions.

1.2 To start SPSS

 
  • Double-click the SPSS icon if it is on your taskbar or desktop.
  • After you click the SPSS icon, the screen will appear as shown below:

1.3 Major Windows used in SPSS

  • Data View (Data Editor) window. It is used to enter, edit, and view data. To open it, click the tab named “Data View” at the bottom-left corner of the SPSS window. Each row holds one case (for example, one respondent), and each column holds one variable. The screen will appear as shown below:
  • Variable View window. It is used to create, edit, and delete variables. Each row corresponds to one variable, and each column holds one property of that variable. To open it, click the tab named “Variable View” at the bottom-left corner of the SPSS window. The screen will appear as shown below:
  • Each column in the Variable View has its own function, as shown in the table below:

| Column | Function |
|---|---|
| Name | Name of the variable. Keep it short and understandable. Do not begin it with a number or symbol, since SPSS will not accept it. Names cannot contain spaces; use an underscore instead, for example “Edu_level”. Names cannot be words that SPSS uses as commands, such as ALL, AND, BY, EQ, GE, GT, LE, LT, NE, NOT, OR, TO, WITH, etc. Names cannot exceed 64 characters and must be unique (no two variables may share a name). |
| Type | The specific kind of data being entered. Some examples of types are Numeric (numbers), Date (dates), and String (letters and characters, with or without spaces). |
| Width | The number of characters that can be typed in a data cell. The default for numeric and string variables is 8, which only needs to be changed if you want to type long strings of numbers or whole sentences. |
| Decimals | The default is 2 for numeric variables. Decimals are relevant only for continuous data. |
| Label | The description of the variable. Use the question that the variable is based on, or something else that accurately describes the variable. For example: “What is your highest level of education?” |
| Values | Here you can add a label to each response alternative. For example, for the variable Sex, “Male” is coded as 1 and “Female” is coded as 2. |
| Missing | By default, missing values are shown as “.” (dot) for numeric variables in the data set. For missing values in string variables, cells are left blank. |
| Align | You do not need to change anything here. |
| Measure | The nature of the data being entered. In SPSS it is represented as Scale (measurable data, e.g., weight, height, and temperature), Ordinal (data in ranked categories/groups, e.g., Likert scale questions), or Nominal (categorical data without ranks, e.g., gender, religion, tribe). |
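The naming rules in the Name row can be checked mechanically before variables are typed into the Variable View. The sketch below is only an illustration in Python, not part of SPSS: the function name `name_problems` is hypothetical, the reserved-word set contains only the keywords listed above (that list ends with “etc.”), and the case-insensitive duplicate check is an assumption.

```python
# Reserved keywords copied from the Name row above (that list ends with "etc.").
RESERVED = {"ALL", "AND", "BY", "EQ", "GE", "GT", "LE", "LT",
            "NE", "NOT", "OR", "TO", "WITH"}


def name_problems(name, existing=()):
    """Return the reasons why `name` breaks the naming rules (empty list = fine)."""
    problems = []
    if not name or not name[0].isalpha():
        problems.append("must start with a letter")
    if " " in name:
        problems.append("must not contain spaces")
    if len(name) > 64:
        problems.append("must not exceed 64 characters")
    if name.upper() in RESERVED:
        problems.append("is a reserved keyword")
    # Assumption: names that differ only in letter case count as duplicates.
    if name.lower() in {other.lower() for other in existing}:
        problems.append("duplicates an existing variable name")
    return problems


for candidate in ["Edu_level", "HH no", "2nd_visit", "with"]:
    print(candidate, "->", name_problems(candidate) or "OK")
```

Running a planned list of names through a check like this before data entry makes it easier to spot names such as “HH no”, which contains a space and would need to become “HH_no”.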

  • Output window. It displays the results as tables or graphs after an analysis has been performed. Before any analysis is run, the Output window appears as shown below:

  • After conducting some analysis, the output window will display your results as shown in the sample below.

1.4 Saving an SPSS file

  • Click “File” at the upper-left corner of the SPSS window.
  • Select “Save As” or “Save”. Note: until you have created a variable or entered some data, the “Save As” and “Save” options are not active.
  • Enter a file name and click “Save”.

1.5 Opening data

  • To open a file in SPSS, navigate to File>>>Open>>>Data.
  • Browse for the desired dataset, select it, and press the “Open” button.
  • If the dataset is not visible, make sure that the directory is correct and that the proper file format (SPSS (*.sav), Excel (*.xls, *.xlsx, *.xlsm), etc.) is selected.

2.0 Computing a New Variable

  • Go to Transform>>>Compute Variable.
  • In the “Target Variable” box, enter the name of the variable that is being computed. In the “Numeric Expression” box, move in the variable(s) you want to use and build the calculation of interest.
  • Press the “OK” button to compute the variable.
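The logic behind Compute Variable is that one expression is evaluated for every case and the result is stored under the target name. The Python sketch below illustrates that idea only; the helper `compute_variable` is hypothetical, and the three cases are the first three runners from the glucose illustration in section 8.1.

```python
# Each dict is one case (row); the keys are variable names.
cases = [
    {"after": 31.4, "before": 3.4},
    {"after": 25.9, "before": 6.5},
    {"after": 16.8, "before": 5.4},
]


def compute_variable(rows, target, expression):
    """Evaluate `expression` for every case and store the result as `target`."""
    for row in rows:
        row[target] = expression(row)
    return rows


# Target variable "diff2", numeric expression "after - before".
# Rounding to 1 decimal only hides floating-point noise in the printout.
compute_variable(cases, "diff2", lambda r: round(r["after"] - r["before"], 1))

for row in cases:
    print(row)
```

As in SPSS, the existing variables are left unchanged and the new variable appears as an extra column.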

3.0 Recoding Variables

  • When recoding variables there are two options: recoding into the same variable or recoding into a different variable.
  • It is recommended to almost always recode into a new variable so that none of the original data is lost.
  • To recode a variable, go to Transform>>>Recode into Different Variables.
  • In the leftmost column of the dialog box, select the variable to be recoded and click the arrow.
  • The variable(s) will appear in the middle box, “Numeric Variable –> Output Variable”.
  • Enter the name of the new recoded variable in the “Name” text box under “Output Variable”.
  • Enter a label if desired.
  • Click the “Change” button. This replaces the question mark with the new variable’s name, showing that the variable on the left of the “–>” will be recoded into the variable on the right of the “–>”.
  • Click the “Old and New Values” button.
  • A new box will pop up with three sections: “Old Value”, “New Value”, and “Old –> New”.
  • Enter code “1” in the “New Value” text box.
  • Under “Old Value”, click “Range” and enter the minimum and maximum of the value(s) that should receive code “1”.
  • Click the “Add” button in the “Old –> New” section.
  • Repeat the process for each remaining code until every category has been assigned.
  • Click “Continue” and then “OK” to finish the recode process.
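The “Old –> New” list built in the steps above is essentially a set of range rules applied to every case. A minimal Python sketch of that idea follows; the grouping (1 to 2, 3 to 4, 5 and above) and the name `recode` are made up for illustration and do not come from the tutorial’s data dictionary.

```python
def recode(value, rules, default=None):
    """Return the new code for `value` from (low, high, new_code) range rules."""
    for low, high, new_code in rules:
        if low <= value <= high:
            return new_code
    return default  # a value outside every range receives no code


# Hypothetical rules: 1-2 -> code 1, 3-4 -> code 2, 5 and above -> code 3.
rules = [(1, 2, 1), (3, 4, 2), (5, float("inf"), 3)]

hh_no = [1, 2, 3, 4, 3, 5]                     # original variable, left untouched
hh_group = [recode(v, rules) for v in hh_no]   # new (output) variable

print(hh_no)
print(hh_group)
```

Writing the result to a separate list mirrors the advice to recode into a different variable: the original values remain available if a rule turns out to be wrong.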

4.0 Merging variables and data cases

4.1 Merging cases

  • To add cases, ensure that the variable names are the same in both files.
  • Open the first data set.
  • Click Data>>>Merge Files>>>Add Cases.
  • Browse for and select the file you want to add.
  • Click “Open”.
  • Click “OK”.
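Conceptually, adding cases stacks the rows of the second file underneath the rows of the first. The sketch below shows this with plain Python lists of dictionaries; the data and the function name `add_cases` are hypothetical, and the strict check on variable names is a choice made here to mirror the first step above rather than a description of how SPSS behaves.

```python
def add_cases(first, second):
    """Stack the rows of two data sets that share the same variable names."""
    if first and second and set(first[0]) != set(second[0]):
        raise ValueError("variable names differ between the two files")
    return first + second


file_a = [{"id": 1, "sex": 1}, {"id": 2, "sex": 2}]   # hypothetical first file
file_b = [{"id": 3, "sex": 2}]                        # hypothetical second file

merged = add_cases(file_a, file_b)
print(len(merged), "cases after merging")
```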

4.2 Merging variables

  • Open the first data set.
  • Click Data>>>Merge Files>>>Add Variables.
  • Browse for and select the file you want to add.
  • Click “Open”.
  • Click “OK”.
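Adding variables joins the columns of a second file onto the cases of the first. The Python sketch below assumes that both files share an identifier variable, here called `id`; that key, the data, and the function name `add_variables` are all assumptions made for illustration, since the steps above do not say how cases are matched.

```python
def add_variables(first, second, key="id"):
    """Attach the variables in `second` to the matching cases in `first`."""
    lookup = {row[key]: row for row in second}
    merged = []
    for row in first:
        extra = lookup.get(row[key], {})   # no match: the case keeps its own variables
        merged.append({**row, **extra})
    return merged


demographics = [{"id": 1, "sex": 1}, {"id": 2, "sex": 2}]          # hypothetical
measurements = [{"id": 2, "weight": 67}, {"id": 1, "weight": 56}]  # hypothetical

for case in add_variables(demographics, measurements):
    print(case)
```

Matching on a key rather than on row order keeps the merge correct even when the two files are sorted differently, as in this example.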

5.0 Frequency distribution: categorical/nominal variables

  • Go to Analyze>>>Descriptive Statistics>>>Frequencies.
  • Move variables from the left-hand box into the right-hand box.
  • Additional statistics can be selected by clicking the “Statistics” button.
  • Charts such as histograms can be selected by clicking “Charts”.
  • Press OK.

6.0 Statistical analysis of data

It involves five major steps.

  • Enter your data in the Data Editor
  • Select a procedure from the menu
  • Select variables for the analysis
  • Examine the results in the Output window
  • Interpret the results in a Word document

6.1 Frequency distribution: categorical/nominal variables, e.g., sex, marital status, age group

  • Go to Analyze>>>Descriptive Statistics>>>Frequencies.
  • Move variables from the left-hand box into the right-hand box.
  • Additional statistics can be selected by clicking the “Statistics” button.
  • Charts such as histograms can be selected by clicking “Charts”.
  • Press OK.
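A one-way frequency table is a count of how often each code occurs, usually shown with a percentage. The sketch below reproduces that calculation in Python for the Sex values from the bar-graph illustration in section 7.1 (1 = Male, 2 = Female). It is an illustration of the arithmetic, not SPSS output, and the layout of the printed table is an assumption.

```python
from collections import Counter

# Sex codes copied from the illustration in section 7.1.
sex = [int(ch) for ch in "12111111222222111211111111222222222"]
labels = {1: "Male", 2: "Female"}

counts = Counter(sex)
total = len(sex)

# One row per category: label, frequency, percent of all cases.
table = [(labels[code], n, round(100 * n / total, 1))
         for code, n in sorted(counts.items())]

print(f"{'Sex':<8}{'Freq':>5}{'Percent':>9}")
for label, n, percent in table:
    print(f"{label:<8}{n:>5}{percent:>9.1f}")
print(f"{'Total':<8}{total:>5}{100.0:>9.1f}")
```

The percentages printed here are the same figures that the bar graph in section 7.1 displays once “Percent” is moved to “Displayed”.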

6.2 Descriptive statistics: quantitative/continuous variables, e.g., age, height, weight, temperature

Descriptive statistics are mostly generated for discrete and continuous variables. The most common are the mean and the sum. Dispersion statistics, such as the maximum, minimum, range, and standard deviation, can be generated in the same Descriptives window, as can skewness and kurtosis.

  • Go to Analyze>>>Descriptive Statistics>>>Descriptives.
  • Move variables from the left-hand box into the right-hand box.
  • Specify the particular statistics required by clicking the “Options” or “Statistics” button.
  • Press OK.
  • Interpret the results, i.e., the mean, median, mode, frequency, quartiles, sum, variance, standard deviation, minimum, maximum, range, kurtosis, and skewness.
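To see what these statistics mean numerically, they can be recomputed by hand. The Python sketch below does so for the household-number values used in the graph illustrations of section 7, using only the standard `statistics` module. The standard deviation and variance are the sample versions (dividing by n − 1). The printed layout is an assumption and does not imitate the SPSS table.

```python
import statistics

# Household numbers copied from the illustrations in section 7.
hh_no = [int(ch) for ch in "12343134324534333444555222333333333"]

summary = {
    "N": len(hh_no),
    "Minimum": min(hh_no),
    "Maximum": max(hh_no),
    "Range": max(hh_no) - min(hh_no),
    "Sum": sum(hh_no),
    "Mean": statistics.mean(hh_no),
    "Std. Deviation": statistics.stdev(hh_no),    # sample SD (n - 1)
    "Variance": statistics.variance(hh_no),       # sample variance (n - 1)
}

for name, value in summary.items():
    print(f"{name:<15}{value:>9.3f}")
```

The mean and standard deviation computed this way can be compared with the One-Sample Statistics table in section 8.1, which reports the same variable.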

7.0 Graphing Data

7.1 Bar graph

Illustration

Sex: 1 2 1 1 1 1 1 1 2 2 2 2 2 2 1 1 1 2 1 1 1 1 1 1 1 1 2 2 2 2 2 2 2 2 2

Coding: 1 = Male, 2 = Female

Procedure

  • Variable View>>>Name: “Sex”>>>Type: select “Numeric”>>>Decimals: change to 0>>>Label: “Sex of the respondents”>>>Values: “1 = Male, 2 = Female”>>>Measure: select “Nominal”
  • Enter the sex of each participant in the study in the Data View cells
  • Graphs>>>Legacy Dialogs>>>Bar>>>Simple
  • Select “Summaries for groups of cases”
  • Click “Define”
  • Select the categorical variable to be charted, “Sex”
  • Press OK

Output

  • Right-click the bar graph>>>select “Edit Content”>>>“In Separate Window”>>>in the Chart Editor, select “Show Data Labels”>>>in Properties, drag “Percent” from “Not Displayed” to “Displayed” and drag “Count” from “Displayed” to “Not Displayed”
  • Press “Apply”
  • Close Properties
  • Close the Chart Editor

 

7.2 Box Plot

Illustration

HH Number: 1 2 3 4 3 1 3 4 3 2 4 5 3 4 3 3 3 4 4 4 5 5 5 2 2 2 3 3 3 3 3 3 3 3 3

Procedure

  • Variable View>>>Name: “HH_no” (names cannot contain spaces)>>>Type: select “Numeric”>>>Decimals: change to 0>>>Label: “Household Number”>>>Values: leave blank>>>Measure: select “Scale”
  • Enter the household number of each participant in the study in the Data View cells
  • Graphs>>>Legacy Dialogs>>>Boxplot>>>Simple
  • Select “Summaries of separate variables”
  • Click “Define”
  • Select the continuous variable(s) to be charted
  • Press OK
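A box plot is a picture of five numbers (minimum, first quartile, median, third quartile, maximum) plus any points flagged as outliers. The Python sketch below computes them for the household-number data above. The quartile method is the default of `statistics.quantiles`, and the 1.5 × IQR rule for flagging outliers is the usual box-plot convention; SPSS may use a slightly different quartile definition, so treat this as an approximation of what the chart shows.

```python
import statistics

# Household numbers copied from the illustration above.
hh_no = [int(ch) for ch in "12343134324534333444555222333333333"]

q1, median, q3 = statistics.quantiles(hh_no, n=4)   # three cut points
iqr = q3 - q1

# Conventional box-plot fences: 1.5 * IQR beyond the quartiles.
low_fence = q1 - 1.5 * iqr
high_fence = q3 + 1.5 * iqr
outliers = sorted(v for v in hh_no if v < low_fence or v > high_fence)

print("min, Q1, median, Q3, max:", min(hh_no), q1, median, q3, max(hh_no))
print("IQR:", iqr, "fences:", (low_fence, high_fence))
print("values flagged as outliers:", outliers)
```

Because the median and the first quartile coincide for these data, the median line sits on the lower edge of the box.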

Output

 

7.3 Histogram

Illustration

HH Number: 1 2 3 4 3 1 3 4 3 2 4 5 3 4 3 3 3 4 4 4 5 5 5 2 2 2 3 3 3 3 3 3 3 3 3

Procedure

  • Variable View>>>Name: “HH_no” (names cannot contain spaces)>>>Type: select “Numeric”>>>Decimals: change to 0>>>Label: “Household Number”>>>Values: leave blank>>>Measure: select “Scale”
  • Enter the household number of each respondent in the study in the Data View cells
  • Graphs>>>Legacy Dialogs>>>Histogram
  • Select the variable(s)
  • Optionally, select “Display normal curve”
  • Press OK

 

 

Output

 

 

7.4 Scatter Plot

Illustration

Weight (kg): 56 56 67 89 98 65 76 71 63 72 74 73 64 75 75 76 89 56 56 67 89 98 65 76 71 63 72 74 73 64 75 75 76 89 89
Height (m): 1 2 2 2 2 2 1 2 2 1 2 2 2 2 2 1 2 2 1 2 2 2 2 2 1 1 2 2 2 2 2 1 2 2 2

Procedure

  • Variable View>>>Name: “weight”>>>Type: select “Numeric”>>>Decimals: change to 0>>>Label: “Weight of the respondents”>>>Values: leave blank>>>Measure: select “Scale”
  • Variable View>>>Name: “height”>>>Type: select “Numeric”>>>Decimals: change to 0>>>Label: “Height of the respondents”>>>Values: leave blank>>>Measure: select “Scale”
  • Enter the weight and height of each respondent in the study in the Data View cells
  • Graphs>>>Legacy Dialogs>>>Scatter/Dot>>>Simple Scatter>>>Define
  • Select the Y-axis and X-axis variables
  • Press OK

Output

7.5 Line Graph

Illustration

HH Number: 1 2 3 4 3 1 3 4 3 2 4 5 3 4 3 3 3 4 4 4 5 5 5 2 2 2 3 3 3 3 3 3 3 3 3

Procedure

  • Variable View>>>Name: “HH_no” (names cannot contain spaces)>>>Type: select “Numeric”>>>Decimals: change to 0>>>Label: “Household Number”>>>Values: leave blank>>>Measure: select “Scale”
  • Enter the household number of each participant in the study in the Data View cells
  • Graphs>>>Legacy Dialogs>>>Line>>>Simple
  • Select “Values of individual cases”
  • Click “Define”
  • Select the Y-axis and X-axis variables
  • Press OK

Output

8.0 Sample Tests in SPSS

  • One-sample t-test
  • Paired-sample t-test
  • Independent-samples t-test
  • ANOVA test

Please always remember that:

  • The one-sample t-test compares the mean of one variable with a hypothesized value.
  • The paired-sample t-test compares the means of two variables for a single group.
  • The independent-samples t-test compares the means of two groups of cases.
  • A t-test is used to test a single mean or to compare two means, while ANOVA is used to compare several (three or more) means.

8.1 One-Sample T-Test

A one-sample t-test is performed when you want to determine whether the mean value of a target variable differs from a hypothesized value.

When should one use a one-sample t-test?

  • If you have a single sample of data and you want to test whether your sample comes from a population with a known mean
  • If you want to test whether the mean of a single variable differs from a specified constant
  • If you want to test the hypothesis that a sample comes from a population with a particular mean

Assumptions for the one-sample t-test

  • The dependent variable is normally distributed within the population
  • The data are independent (scores of participants are not dependent on scores of others)

Steps

  • Analyze>>>Compare Means>>>One-Sample T Test
  • Select the test variable(s)
  • Enter the hypothesized test value, i.e., the numeric value against which each sample mean is compared
  • Optionally, click “Options” to control the treatment of missing data and the confidence level of the interval
  • Finally, click OK
  • Interpret the result

Illustration

The data below show the glucose levels of 10 runners before and after a marathon race.

After: 31.4, 25.9, 16.8, 23.8, 24.6, 31.5, 26.8, 22.6, 16.9, 32.6
Before: 3.4, 6.5, 5.4, 6.9, 8.3, 7, 9.5, 10.4, 14, 17.5
Diff: (to be computed as After minus Before)

Question: Is there a difference in the glucose levels before and after the marathon race?

  • State the hypothesis
  • Use a t-test to determine whether the mean difference in glucose levels before and after the marathon race is zero.

Procedure

  • Variable View>>>Name: “after”>>>Type: select “Numeric”>>>Decimals: change to 1 (the readings have one decimal place)>>>Label: “After-marathon glucose level of the respondents”>>>Values: leave blank>>>Measure: select “Scale”
  • Variable View>>>Name: “before”>>>Type: select “Numeric”>>>Decimals: change to 1>>>Label: “Before-marathon glucose level of the respondents”>>>Values: leave blank>>>Measure: select “Scale”
  • Enter the after and before glucose levels of each respondent in the study in the Data View cells
  • First compute a new variable: the difference between the after value and the before value
  • Transform>>>Compute Variable
  • For “Target Variable” type diff2; for “Numeric Expression” type after-before
  • Click OK
  • Analyze>>>Compare Means>>>One-Sample T Test
  • Select diff2 as the test variable and set the test value to 0
  • Click “Options” and set the confidence interval to 95%
  • Under “Missing Values”, select “Exclude cases analysis by analysis”
  • Click “Continue”
  • Click OK
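The arithmetic that SPSS performs in this procedure can be followed by hand. The Python sketch below computes the differences, the t statistic against a test value of 0, and a 95% confidence interval for the mean difference. It uses only the standard library, so the critical value 2.262 (two-tailed, 5%, df = 9) is typed in from a t table rather than computed, and no p-value is produced. Treat it as a cross-check on the SPSS output rather than a replacement for it.

```python
import math
import statistics

# Glucose readings copied from the illustration above.
after = [31.4, 25.9, 16.8, 23.8, 24.6, 31.5, 26.8, 22.6, 16.9, 32.6]
before = [3.4, 6.5, 5.4, 6.9, 8.3, 7, 9.5, 10.4, 14, 17.5]

# Step 1: the computed variable diff2 = after - before.
diff2 = [a - b for a, b in zip(after, before)]

# Step 2: one-sample t-test of diff2 against the test value 0.
TEST_VALUE = 0
n = len(diff2)
df = n - 1
mean_diff = statistics.mean(diff2) - TEST_VALUE
std_error = statistics.stdev(diff2) / math.sqrt(n)   # sample SD / sqrt(n)
t_stat = mean_diff / std_error

# Step 3: 95% confidence interval of the difference.
T_CRIT = 2.262   # assumption: table value for df = 9, two-tailed alpha = 0.05
lower = mean_diff - T_CRIT * std_error
upper = mean_diff + T_CRIT * std_error

print(f"t = {t_stat:.3f}, df = {df}")
print(f"mean difference = {mean_diff:.2f}")
print(f"95% CI = ({lower:.2f}, {upper:.2f})")
```

If the interval excludes 0, or equivalently if the absolute value of t exceeds the critical value, the null hypothesis of no mean difference is rejected at the 5% level.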

Interpretation of the result

Test for Normality

Before you conduct any parametric test, check that the data values come from an approximately normal distribution. To do this, compare the frequency distribution of your data values with that of a normal distribution with the same mean and standard deviation. If the data are approximately normal, the two distributions should be similar. A normality test gives a statistic that indicates whether your data differ significantly from normal. The null hypothesis is that the distribution of your data does NOT differ from a normal distribution; the alternative hypothesis is that it does. We reject the null hypothesis if the p-value is less than 0.05 (meaning that, if the null hypothesis were true, a departure from normality this large would be expected less than 5% of the time).

The hypotheses for the t-test are:

H0: µ1 – µ2 = 0 or H0: µ1 = µ2

H1: µ1 – µ2 ≠ 0 or H1: µ1 ≠ µ2

 

One-Sample Test (Test Value = 0)

| | t | df | Sig. (2-tailed) | Mean Difference | 95% Confidence Interval of the Difference: Lower | 95% Confidence Interval of the Difference: Upper |
|---|---|---|---|---|---|---|
| diff2 | 7.444 | 9 | .000 | 16.40000 | 11.4163 | 21.3837 |

  • Looking at the two-tailed significance level, the p-value (reported as .000, i.e., p < 0.001) is less than 0.05, so we reject the null hypothesis.
  • Looking at the 95% confidence interval (11.4163 to 21.3837), it does not include 0, so we reject the null hypothesis.

Conclusion

Glucose levels rise during a marathon run.

Illustration

The data below shows the Household number of 35 respondents that participated in the study

HH No.: 1 2 3 4 3 1 3 4 3 2 4 5 3 4 3 3 3 4 4 4 5 5 5 2 2 2 3 3 3 3 3 3 3 3 3

Question: Is the mean household number equal to 3?

  • State the hypothesis

Procedure

  • Variable View>>>Name: “HH_no” (names cannot contain spaces)>>>Type: select “Numeric”>>>Decimals: change to 0>>>Label: “Household Number”>>>Values: leave blank>>>Measure: select “Scale”
  • Enter the household number of each participant in the study in the Data View
  • Analyze>>>Compare Means>>>One-Sample T Test
  • Select HH_no as the test variable and set the test value to 3
  • Click “Options” and set the confidence interval to 95%
  • Under “Missing Values”, select “Exclude cases analysis by analysis”
  • Click “Continue”
  • Click OK

Output

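One-Sample Statistics

| | N | Mean | Std. Deviation | Std. Error Mean |
|---|---|---|---|---|
| HH Number of the respondent | 35 | 3.17 | 1.014 | .171 |

One-Sample Test (Test Value = 3)

| | t | df | Sig. (2-tailed) | Mean Difference | 95% Confidence Interval of the Difference: Lower | 95% Confidence Interval of the Difference: Upper |
|---|---|---|---|---|---|---|
| HH Number of the respondent | 1.000 | 34 | .324 | .171 | -.18 | .52 |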

 

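H0: µ1 = 3

H1: µ1 ≠ 3

Because the p-value (0.324) is greater than 0.05 and the 95% confidence interval of the difference (-.18 to .52) includes 0, we fail to reject the null hypothesis. There is no evidence that the mean household number differs from 3.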
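The decision for this second illustration can be re-derived from the raw values. The Python sketch below computes the t statistic against the test value 3 and checks whether the 95% confidence interval of the difference contains 0. The critical value 2.032 (two-tailed, 5%, df = 34) is taken from a t table and is an assumption of the sketch, as are the variable names.

```python
import math
import statistics

# Household numbers copied from the illustration above.
hh_no = [int(ch) for ch in "12343134324534333444555222333333333"]

TEST_VALUE = 3
T_CRIT = 2.032   # assumption: table value for df = 34, two-tailed alpha = 0.05

n = len(hh_no)
mean_diff = statistics.mean(hh_no) - TEST_VALUE
std_error = statistics.stdev(hh_no) / math.sqrt(n)
t_stat = mean_diff / std_error

lower = mean_diff - T_CRIT * std_error
upper = mean_diff + T_CRIT * std_error

# Rejecting H0 is equivalent to the confidence interval excluding 0.
reject_h0 = abs(t_stat) > T_CRIT

print(f"t = {t_stat:.3f}, df = {n - 1}")
print(f"95% CI of the difference = ({lower:.2f}, {upper:.2f})")
print("reject H0" if reject_h0 else "fail to reject H0")
```

Both routes lead to the same decision: the t statistic is compared with the critical value, and the interval is checked for whether it includes 0.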