Research Consultancy in Uganda
ASSIGNMENT 1
VALIDITY
It is the degree to which results obtained from the analysis of the data actually represent the phenomenon under study (Mugenda & Mugenda, 2003).
Validity is the extent to which a concept, conclusion or measurement is well-founded and corresponds accurately to the real world.
The word valid is derived from the Latin word “validus”, meaning strong. The validity of a measurement tool (for example, a test in education) is considered to be the degree to which the tool measures what it claims to measure; in this sense, validity is equivalent to accuracy.
In psychometrics, validity has a particular application known as test validity: the degree to which evidence and theory support the interpretations of test scores, as entailed by the proposed uses of tests.
It is generally accepted that the concept of scientific validity addresses the nature of reality and as such is an epistemological and philosophical issue as well as a question of measurement. The use of the term in logic is narrower, relating to the truth of inferences made from premises.
Validity is important because it can help determine what types of tests to use and help ensure that researchers are using methods that are not only ethical and cost-effective, but that also truly measure the idea or construct in question.
Validity of an assessment is the degree to which it measures what it is supposed to measure. This is not the same as reliability, which is the extent to which a measurement gives results that are highly consistent. For validity, repeated measurements do not always have to be similar, as they do for reliability.
However, just because a measure is reliable, it is not necessarily valid; for example, a scale that is consistently 5 pounds off is reliable but not valid. A test cannot be valid unless it is reliable. Validity also depends on the measurement measuring what it was designed to measure, and not something else instead.
Validity (similar to reliability) is a relative concept; validity is not an all-or-nothing idea. There are many different types of validity.
Reliability
The term reliability in psychological research refers to the consistency of a research study or measuring test.
For example, if a person weighs themselves during the course of a day they would expect to see a similar reading. Scales which measured weight differently each time would be of little use.
Reliability has to do with the quality of measurement. In its everyday sense, reliability is the consistency or repeatability of your measures. Before we can define reliability precisely we have to lay the groundwork. First, you have to learn about the foundation of reliability, the true score theory of measurement. Along with that, you need to understand the different types of measurement error, because errors in measures play a key role in degrading reliability. With this foundation, you can consider the basic types of reliability.
Types of reliability
Test-retest reliability is a measure of reliability obtained by administering the same test twice over a period of time to a group of individuals. The scores from Time 1 and Time 2 can then be correlated in order to evaluate the test for stability over time.
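As a minimal illustration, test-retest reliability can be computed as the Pearson correlation between the two administrations; the scores below are invented purely for the sketch.

```python
# Illustrative sketch: test-retest reliability as the Pearson correlation
# between two administrations of the same test. All scores are invented.
import numpy as np

time1 = np.array([12, 15, 9, 20, 17, 11, 14, 18])   # hypothetical Time 1 scores
time2 = np.array([13, 14, 10, 19, 18, 12, 13, 17])  # same people at Time 2

test_retest_r = np.corrcoef(time1, time2)[0, 1]
print(f"Test-retest reliability (r) = {test_retest_r:.2f}")
```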
Parallel forms reliability is a measure of reliability obtained by administering different versions of an assessment tool (both versions must contain items that probe the same construct, skill, knowledge base, etc.) to the same group of individuals. The scores from the two versions can then be correlated in order to evaluate the consistency of results across alternate versions.
Inter-rater reliability is a measure of reliability used to assess the degree to which different judges or raters agree in their assessment decisions. Inter-rater reliability is useful because human observers will not necessarily interpret answers the same way; raters may disagree as to how well certain responses or material demonstrate knowledge of the construct or skill being assessed.
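One hedged sketch of inter-rater reliability, using invented ratings from two hypothetical raters, is Cohen's kappa, which corrects simple percent agreement for the agreement expected by chance:

```python
# Illustrative sketch: inter-rater reliability for two raters assigning
# categorical codes, computed as Cohen's kappa. Ratings are invented.
import numpy as np

rater_a = np.array(["pass", "pass", "fail", "pass", "fail", "pass", "fail", "pass"])
rater_b = np.array(["pass", "fail", "fail", "pass", "fail", "pass", "pass", "pass"])

categories = np.unique(np.concatenate([rater_a, rater_b]))

# Observed agreement: proportion of cases where the raters give the same code.
p_observed = np.mean(rater_a == rater_b)

# Expected agreement by chance, from each rater's marginal proportions.
p_expected = sum(
    np.mean(rater_a == c) * np.mean(rater_b == c) for c in categories
)

kappa = (p_observed - p_expected) / (1 - p_expected)
print(f"Percent agreement = {p_observed:.2f}, Cohen's kappa = {kappa:.2f}")
```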
Internal consistency reliability is a measure of reliability used to evaluate the degree to which different test items that probe the same construct produce similar results.
- Average inter-item correlation is a subtype of internal consistency reliability. It is obtained by taking all of the items on a test that probe the same construct (e.g., reading comprehension), determining the correlation coefficient for each pair of items, and finally taking the average of all of these correlation coefficients. This final step yields the average inter-item correlation.
- Split-half reliability is another subtype of internal consistency reliability. The process of obtaining split-half reliability begins by “splitting in half” all items of a test that are intended to probe the same area of knowledge (e.g., World War II) in order to form two “sets” of items. The entire test is administered to a group of individuals, the total score for each “set” is computed, and finally the split-half reliability is obtained by determining the correlation between the two total “set” scores; both subtypes are illustrated in the sketch after this list.
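As a rough illustration of both subtypes, the sketch below uses an invented 4-item test given to six people: it averages the pairwise item correlations and then correlates the totals of two halves of the test.

```python
# Illustrative sketch of the two internal-consistency subtypes described above,
# using an invented 4-item test (rows = people, columns = items).
import numpy as np

scores = np.array([
    [4, 5, 4, 5],
    [2, 3, 2, 2],
    [5, 4, 5, 5],
    [3, 3, 4, 3],
    [1, 2, 1, 2],
    [4, 4, 3, 4],
], dtype=float)

# Average inter-item correlation: correlate every pair of items, then average.
item_corr = np.corrcoef(scores, rowvar=False)            # 4 x 4 correlation matrix
pairs = item_corr[np.triu_indices_from(item_corr, k=1)]  # each pair counted once
print(f"Average inter-item correlation = {pairs.mean():.2f}")

# Split-half reliability: split the items into two halves (here, odd-numbered vs
# even-numbered items), total each half per person, and correlate the totals.
half_1 = scores[:, ::2].sum(axis=1)   # items 1 and 3
half_2 = scores[:, 1::2].sum(axis=1)  # items 2 and 4
split_half_r = np.corrcoef(half_1, half_2)[0, 1]
print(f"Split-half reliability = {split_half_r:.2f}")
```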
Different measures used in quantitative research
Quantitative research requires that measurements be both accurate and reliable. Researchers commonly assign numbers or values to the attributes of people, objects, events, perceptions, or concepts. This process is referred to as measurement. The variables that are measured are commonly classified as being measured on a nominal, ordinal, interval or ratio scale. The following discussion defines and provides examples of each of the four levels of measurement.
Nominal Scale: The nominal scale is essentially a type of coding that simply puts people, events, perceptions, objects or attributes into categories based on a common trait or characteristic. The coding can be accomplished by using numbers, letters, colors, labels or any symbol that can distinguish between the groups. The nominal scale is the lowest level of measurement because it is used simply to categorize and not to capture additional information. Other features of a nominal scale are that each participant or object measured is placed exclusively into one category and there is no relative ordering of the categories. Some examples include distinguishing between smokers and nonsmokers, males and females, types of religious affiliations, blondes vs. brunettes and so on. In a study related to smoking, smokers may be assigned a value of 1 and nonsmokers may be assigned a value of 2. The assignment of the number is purely arbitrary and at the researcher’s discretion.
Ordinal Scale: The ordinal scale differs from the nominal scale in that it ranks the data from lowest to highest and provides information regarding where the data points lie in relation to one another. An ordinal scale typically uses non-numerical categories such as low, medium and high to demonstrate the relationships between the data points. The disadvantage of the ordinal scale is that it does not provide information regarding the magnitude of the difference between the data points or rankings. An example of the use of an ordinal scale would be a study that examines the smoking rates of teenagers. The data collected may indicate that the teenage smokers in the study smoked anywhere from 15 to 40 cigarettes per day. The data could be arranged in order and examined in terms of the number of smokers at each level.
Interval Scale: An interval scale is one in which the actual distances, or intervals, between the categories or points on the scale can be compared. The distance between the numbers or units on the scale is equal across the scale. An example would be a temperature scale, such as the Fahrenheit scale. The distance between 20 degrees and 40 degrees is the same as between 60 degrees and 80 degrees. A distinguishing feature of interval scales is that there is no absolute zero point; the key is simply the consistent distance or interval between categories or data points.
Ratio Scale: The ratio scale contains the most information about the values in a study. It contains all of the information of the other three categories because it categorizes the data, places the data along a continuum so that researchers can examine categories or data points in relation to each other, and the data points or categories are equal distances or intervals apart. However, the difference is that the ratio scale also contains a non-arbitrary absolute zero point, which allows for the interpretation of ratio comparisons. Time is one example of a ratio measurement scale in a study because it is divided into equal intervals and a ratio comparison can be made. For example, 20 minutes is twice as long as 10 minutes.
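As an informal illustration of the four levels, the sketch below uses invented data to show which operations are meaningful at each level: counting for nominal, ordering for ordinal, differences for interval, and ratios for ratio.

```python
# Illustrative sketch: what each level of measurement supports, using invented data.

# Nominal: categories only; counting is meaningful, ordering is not.
smoking_status = ["smoker", "nonsmoker", "smoker", "nonsmoker", "nonsmoker"]
print("Smokers:", smoking_status.count("smoker"))

# Ordinal: ranked categories; order is meaningful, distances between ranks are not.
severity = ["low", "high", "medium", "low"]
rank = {"low": 1, "medium": 2, "high": 3}
print("Ranked:", sorted(severity, key=rank.get))

# Interval: equal distances but no absolute zero; differences are meaningful,
# ratios are not (40 degrees F is not "twice as hot" as 20 degrees F).
temps_f = [20, 40, 60, 80]
print("Difference:", temps_f[1] - temps_f[0], "degrees")

# Ratio: equal distances plus an absolute zero; ratios are meaningful.
minutes = [10, 20]
print("20 minutes is", minutes[1] / minutes[0], "times as long as 10 minutes")
```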
Different types of validity
Concurrent validity refers to the degree to which the operationalization correlates with other measures of the same construct that are measured at the same time. When the measure is compared to another measure of the same type, they should be related (or correlated).
Predictive validity refers to the degree to which the operationalization can predict (or correlate with) other measures of the same construct that are measured at some time in the future. With a job selection test, for example, this would mean that the test is administered to applicants, all applicants are hired, their performance is reviewed at a later time, and then their scores on the two measures are correlated.
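A minimal sketch of predictive validity, assuming invented selection-test scores and later performance ratings, is simply the correlation between the two measures:

```python
# Illustrative sketch: predictive validity as the correlation between
# selection-test scores and later performance ratings. Data are invented.
import numpy as np

selection_test = np.array([55, 70, 62, 81, 49, 77, 68, 59])        # at hiring
performance = np.array([3.1, 4.0, 3.4, 4.5, 2.8, 4.2, 3.6, 3.2])   # reviewed later

predictive_r = np.corrcoef(selection_test, performance)[0, 1]
print(f"Predictive validity (r) = {predictive_r:.2f}")
```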
Statistical conclusion validity is the degree to which conclusions about the relationship among variables based on the data are correct or ‘reasonable’. This began as being solely about whether the statistical conclusion about the relationship of the variables was correct, but there is now a movement towards ‘reasonable’ conclusions that draw on quantitative, statistical, and qualitative evidence.
Internal validity is an inductive estimate of the degree to which conclusions about causal relationships can be made, based on the measures used, the research setting, and the whole research design.
Threats to validity
Maturation:
Changes in the dependent variable due to normal developmental processes operating within the subject as a function of time.
- Is a threat for the one-group design.
- Is not a threat for the two-group design, assuming that participants in both groups change (“mature”) at the same rate.
History:
Changes in the dependent variable due to events external to the study that take place between the pre-test and the post-test.
- Is not a threat for the two-group (treatment/experimental and comparison/control) design because the comparison is between the treatment group and the comparison group.
- If the history event occurs for both groups, the difference between the two groups will not be due to the history event.
Statistical regression:
An effect that results from the tendency of subjects selected on the basis of extreme scores to regress towards the mean on subsequent tests.
- When measurement of the dependent variable is not perfectly reliable, there is a tendency for extreme scores to regress, or move, toward the mean, as the simulation sketch below illustrates.
- The amount of statistical regression is inversely related to the reliability of the test.
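The simulation sketch below uses invented numbers: when an unreliable measure is used and subjects are selected for extreme Time 1 scores, their Time 2 mean falls back toward the overall mean even though nothing about them has changed.

```python
# Illustrative sketch: regression toward the mean when selecting on extreme
# scores from an unreliable measure. All numbers are simulated.
import numpy as np

rng = np.random.default_rng(0)
n = 1000
true_score = rng.normal(100, 10, n)          # stable "true" ability
test1 = true_score + rng.normal(0, 10, n)    # noisy (unreliable) measurement, Time 1
test2 = true_score + rng.normal(0, 10, n)    # noisy measurement, Time 2

extreme = test1 > np.percentile(test1, 90)   # subjects selected for extreme Time 1 scores
print(f"Extreme group, test 1 mean: {test1[extreme].mean():.1f}")
print(f"Extreme group, test 2 mean: {test2[extreme].mean():.1f}  (closer to 100)")
```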
Testing:
A pre-test may sensitize participants in unanticipated ways, and their performance on the post-test may be due to the pre-test, not to the treatment, or, more likely, to an interaction of the pre-test and the treatment.
- Is not a threat for the two-group design. Both groups are exposed to the pre-test, so the difference between groups is not due to testing.
Compensatory rivalry:
When subjects in some treatments receive goods or services believed to be desirable and this becomes known to subjects in other groups, social competition may motivate the latter to attempt to reverse or reduce the anticipated effects of the desirable treatment levels.
Saretsky (1972) named this the “John Henry” effect in honor of the steel driver who, upon learning that his output was being compared with that of a steam drill, worked so hard that he outperformed the drill and died of overexertion.
ASSIGNMENT 3
Hypothesis
According to Shields and Tajalli (2006), a hypothesis is a proposed explanation for a phenomenon.
For a hypothesis to be scientific, the scientific method requires that one can test it. Scientists generally base scientific hypotheses on previous observations that cannot satisfactorily be explained by the available scientific theories.
A hypothesis is a tentative statement about the relationship between two or more variables.
It is a specific, testable prediction about what you expect to happen in a study. For example, a study designed to look at the relationship between sleep deprivation and test performance might have a hypothesis that states, “This study is designed to assess the hypothesis that sleep-deprived people will perform worse on a test than individuals who are not sleep deprived.”
Even though the words hypothesis and theory are often used synonymously, a scientific hypothesis is not the same as a scientific theory. A working hypothesis is a provisionally accepted hypothesis proposed for further research.
Types of Hypothesis
Null Hypothesis
This is the conventional approach to making a prediction. It involves a statement that says there is no relationship between two groups that the researcher compares on a certain variable. The hypothesis also may state that there is no significant difference when different groups are compared with respect to a particular variable.
“There is no difference in the academic performance of high school students who participate in extracurricular activities and those who do not participate in such activities” is a null hypothesis. In many cases, the purpose of a null hypothesis is to allow the experimental results to contradict the hypothesis and demonstrate that there is a definite relationship.
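As a hedged illustration, the null hypothesis above could be tested with a two-sample t-test; the GPA values below are invented purely for the sketch.

```python
# Illustrative sketch: testing the null hypothesis of no difference between
# groups with a two-sample t-test on invented GPA data.
from scipy import stats

gpa_participants = [3.4, 3.1, 3.8, 2.9, 3.5, 3.2, 3.6, 3.0]
gpa_nonparticipants = [3.0, 2.8, 3.3, 2.7, 3.1, 2.9, 3.2, 2.6]

result = stats.ttest_ind(gpa_participants, gpa_nonparticipants)
print(f"t = {result.statistic:.2f}, p = {result.pvalue:.3f}")
# A small p-value (e.g. below 0.05) would lead us to reject the null hypothesis
# of no difference between the groups.
```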
Non-directional Hypothesis
Certain hypothesis statements convey a relationship between the variables that the researcher compares, but do not specify the exact nature of this relationship. This form of hypothesis is used in studies where there is insufficient past research on which to base a prediction. Continuing with the same example, a non-directional hypothesis would read, “The academic performance of high school students is related to their participation in extracurricular activities.”
Directional Hypothesis
This type of hypothesis suggests the outcome the investigator expects at the end of the study. Scientific journal articles generally use this form of hypothesis. The investigator bases this hypothesis on the trends apparent from previous research on this topic. Considering the previous example, a researcher may state the hypothesis as, “High school students who participate in extracurricular activities have a lower GPA than those who do not participate in such activities.” Such hypotheses provide a definite direction to the prediction.
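A directional hypothesis corresponds to a one-sided (one-tailed) test. The sketch below, again with invented GPA values, passes alternative="less" to SciPy's ttest_ind (available in SciPy 1.6 and later) to test the prediction that participants' GPAs are lower.

```python
# Illustrative sketch: a directional hypothesis maps onto a one-sided test.
# The prediction is that participants' GPAs are *lower*, so alternative="less"
# is used (SciPy 1.6+). Data are invented.
from scipy import stats

gpa_participants = [3.0, 2.8, 3.3, 2.7, 3.1, 2.9]
gpa_nonparticipants = [3.4, 3.1, 3.6, 3.0, 3.5, 3.2]

result = stats.ttest_ind(gpa_participants, gpa_nonparticipants, alternative="less")
print(f"t = {result.statistic:.2f}, one-sided p = {result.pvalue:.3f}")
```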
Causal Hypothesis
Some studies involve a measurement of the degree of influence of one variable on another. In such cases, the researcher states the hypothesis in terms of the effect of variations in a particular factor on another factor. This causal hypothesis is said to be bivariate because it specifies two aspects — the cause and the effect. For the example mentioned, the causal hypothesis will state, “High school students who participate in extracurricular activities spend less time studying which leads to a lower GPA.” When verifying such hypotheses, the researcher needs to use statistical techniques to demonstrate the presence of a relationship between the cause and effect. Such hypotheses also need the researcher to rule out the possibility that the effect is a result of a cause other than what the study has examined.
The following outlines the advantages and disadvantages of hypothesis testing, and the factors considered in assessing a hypothesis.
Advantages of Hypothesis Testing
- Well suited for comparing a treatment with the control.
- Relatively simple to calculate.
Disadvantages of Hypothesis Testing
- Dependent on concentrations tested.
- Statistical power is influenced by variability.
- Inability to calculate confidence intervals.
- Confounded by hormesis or poorly behaved data.
- Frequently need to use non-parametric statistical methods.
Factors considered in assessing a hypothesis
- A hypothesis should focus on something that can be tested.
- A hypothesis should include both independent and dependent variables.
- The researcher should be able to manipulate the variables.
- A hypothesis should be testable without violating ethical standards.
REFERENCES
Shields, P. M., & Tajalli, H. (2006). Intermediate theory: The missing link in successful student scholarship. Journal of Public Affairs Education, 12(3), 313–334.
Chapman, G. A., Anderson, B. S., Bailer, A. J., Baird, R. B., Berger, R., Burton, D. T., Denton, D. L., Goodfellow, W. L., Heber, M. A., McDonald, L. L., Norberg-King, T. J., & Ruffier, P. J. (1996). Discussion synopsis. In D. R. Grothe, K. L. Dickson, & D. K. Reed-Judkins (Eds.), Whole Effluent Toxicity Testing: An Evaluation of Methods and Prediction of Receiving System Impacts (pp. 51–78). SETAC Press, Pensacola, FL, USA.
Grothe, D. R., Dickson, K. L., & Reed-Judkins, D. K. (Eds.). (1996). Whole Effluent Toxicity Testing: An Evaluation of Methods and Prediction of Receiving System Impacts. SETAC Press, Pensacola, FL, USA. 340 p.