Measurement is the quantification of what the student has learned through the use of tests, questionnaires, rating scales, and checklists.
Assessment refers to the full range of information gathered and synthesized by the teacher about their students in the classroom.
Evaluation is the process of making a judgment about, or assigning a value to, the worth of the student's performance.
Types of classroom assessment
1. Official assessment
2. Sizing-up assessment
3. Instructional assessment
Tests and their uses in educational assessment
Test - a systematic procedure for measuring an individual's behavior
Uses of tests
1. School administrators use test results in making decisions regarding the promotion or retention of students.
2. Supervisors use test results in discovering learning areas that need special attention.
3. Teachers use tests for numerous purposes. Through testing, they are able to gather information about the effectiveness of instruction, give feedback to their students, and assign grades.
Assessment of learning in the cognitive domain
1. Arrangement type
2. Matching type
3. Multiple choice
4. Alternative response type
5. Key list type
6. Interpretative exercise
7. Brief restricted essay type
8. Extended essay type
Types of objective tests
Supply type
1. Completion drawing type
2. Completion statement type
3. Correction type
4. Identification type
5. Simple recall type
Selection type
Tools for measuring acquisition scale
1. Rating scale
2. Checklist
3. Questionnaire
Attributes of a good test as an assessment tool
Validity - the degree to which a test measures what it seeks to measure.
Reliability - the consistency with which a test measures whatever it measures.
Objectivity - the extent to which the personal bias or subjective judgment of the test scorer is eliminated in checking the students' responses to the test items, as there is only one correct answer for each question.
Scorability
Administrability
Relevance
Balance
Efficiency
Difficulty
Discrimination
Fairness
Preparing a Table of Specifications
1. List the topics covered for inclusion in the test.
2. Determine the objectives to be assessed in the test.
3. Specify the number of days spent teaching each topic.
4. Determine the percentage allocation of test items for each topic.
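The allocation described in steps 3 and 4 can be sketched in Python. This is only an illustration under stated assumptions: the topic names, the day counts and the 50-item test length are hypothetical, and the rule that a topic's percentage equals its share of the total teaching days is a common convention that the steps above do not spell out.

```python
def allocate_items(days_per_topic, total_items):
    """Return {topic: (percent_of_test, number_of_items)}.

    Assumes each topic's share of the test equals its share of teaching days.
    """
    total_days = sum(days_per_topic.values())
    table = {}
    for topic, days in days_per_topic.items():
        percent = days * 100 / total_days
        items = round(days * total_items / total_days)
        table[topic] = (percent, items)
    return table


# Hypothetical topics and teaching days, for illustration only.
days_per_topic = {"Fractions": 5, "Decimals": 3, "Percentages": 2}
spec = allocate_items(days_per_topic, total_items=50)

for topic, (percent, items) in spec.items():
    print(f"{topic}: {percent:.0f}% of the test, {items} items")
```

Because item counts are rounded to whole numbers, they may not always add up exactly to the intended test length; in that case the counts would need a small manual adjustment.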
Student's t-test is used for comparing the means of two samples (or treatments), even if they have different numbers of replicates. In simple terms, the t-test compares the actual difference between two means in relation to the variation in the data (expressed as the standard deviation of the difference between the means).
Procedure
First, we will see how to do this test using "pencil and paper" (with a calculator to help with the calculations). Then we can see how the same test can be done in a spreadsheet package (Microsoft Excel).
1. We need to construct a null hypothesis - an expectation - which the experiment was designed to test. For example:
- If we are analysing the heights of pine trees growing in two different locations, a suitable null hypothesis would be that there is no difference in height between the two locations. Student's t-test will tell us if the data are consistent with this or depart significantly from this expectation. [NB: the null hypothesis is simply something to test against. We might well expect a difference between trees growing in a cold, windy location and those in a warm, protected location, but it would be difficult to predict the scale of that difference - twice as high? three times as high? So it is sensible to have a null hypothesis of "no difference" and then to see if the data depart from this.]
2. List the data for sample (or treatment) 1.
3. List the data for sample (or treatment) 2.
4. Record the number (n) of replicates for each sample (the number of replicates for sample 1 being termed n1 and the number for sample 2 being termed n2).
5. Calculate the mean of each sample (x̄1 and x̄2).
6. Calculate σ² for each sample; call these σ1² and σ2². [Note that we are actually using S² as an estimate of σ² in each case.]
7. Calculate the variance of the difference between the two means (σd²) as follows:
$\sigma_d^2 = \frac{\sigma_1^2}{n_1} + \frac{\sigma_2^2}{n_2}$
8. Calculate σd (the square root of σd²).
9. Calculate the t value as follows:
$t = \frac{\bar{x}_1 - \bar{x}_2}{\sigma_d}$
(when doing this, swap samples 1 and 2 if the mean of sample 2 is greater than the mean of sample 1, so that you always get a positive value)
10. Enter the t-table at (n1 + n2 - 2) degrees of freedom; choose the level of significance required (normally p = 0.05) and read the tabulated t value.
11. If the calculated t value exceeds the tabulated value, we say that the means are significantly different at that level of probability.
12. A significant difference at p = 0.05 means that if the null hypothesis were correct (i.e. the samples or treatments do not differ) then we would expect to get a t value as great as this on less than 5% of occasions. So we can be reasonably confident that the samples/treatments do differ from one another, but we still have nearly a 5% chance of being wrong in reaching this conclusion.
Now compare your calculated t value with tabulated values for more stringent levels of significance (e.g. p = 0.01). These levels tell us how rarely a t value this large would arise if the null hypothesis were correct. For example, if our calculated t value exceeds the tabulated value for p = 0.01, then a difference this large would be expected on less than 1% of occasions under the null hypothesis (and on less than 0.1% of occasions if the calculated t value exceeds the tabulated value for p = 0.001). By convention, we say that a difference between means at the 95% level is "significant", a difference at the 99% level is "highly significant" and a difference at the 99.9% level is "very highly significant".
What does this mean in "real" terms? Statistical tests allow us to make statements with a degree of precision, but cannot actually prove or disprove anything. A significant result at the 95% probability level tells us that our data are good enough to support a conclusion with 95% confidence (but there is a 1 in 20 chance of being wrong). In biological work we accept this level of significance as being reasonable.
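The pencil-and-paper procedure can also be sketched in Python. This is a minimal illustration, not a definitive implementation: it assumes the usual form of the calculation, in which the variance of the difference is the sum of each sample variance divided by its number of replicates and t is the difference between the means divided by the square root of that variance. The two data sets are hypothetical, and the tabulated value 2.306 is the standard two-tailed t-table entry for 8 degrees of freedom at p = 0.05.

```python
from math import sqrt
from statistics import mean, variance


def t_value(sample1, sample2):
    """Return (t, degrees_of_freedom) for two lists of replicate values."""
    n1, n2 = len(sample1), len(sample2)
    # variance() is the sample variance, used as the estimate for each sample.
    var_of_difference = variance(sample1) / n1 + variance(sample2) / n2
    sd_of_difference = sqrt(var_of_difference)
    # abs() has the same effect as swapping the samples to keep t positive.
    t = abs(mean(sample1) - mean(sample2)) / sd_of_difference
    return t, n1 + n2 - 2


# Hypothetical replicate measurements, for illustration only.
location_a = [10, 12, 14, 16, 18]
location_b = [5, 7, 9, 11, 13]

t, df = t_value(location_a, location_b)
TABULATED_T = 2.306  # two-tailed t-table value for 8 degrees of freedom, p = 0.05
print(f"t = {t:.2f} with {df} degrees of freedom")
print("significant at p = 0.05" if t > TABULATED_T else "not significant at p = 0.05")
```

With these made-up data, each sample variance is 10, so the variance of the difference is 4 and t is 5 / 2 = 2.5, which exceeds the tabulated value.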
Correlating test scores
In testing, teachers are frequently called upon to describe the relationship between two sets of measures. These two measures might be scores by the same set of students on two different forms of the same test or on two separate tests. This chapter focuses on two of the most commonly used measures of relationship in educational measurement and evaluation, namely the Pearson product-moment correlation and Spearman rho.
Correlation
Correlation is the relationship between two or more paired factors or two or more sets of test scores (Best and Kahn, 1998). A correlation coefficient is a numerical measure of the linear relationship between two factors or sets of scores (Deauna, 1996). This coefficient can be identified by the letter r, the Greek letter rho, or other symbols, depending on the manner in which the coefficient has been computed.
Correlation coefficients can range from -1.00 to +1.00, passing through zero. The sign of the coefficient indicates the direction of the relationship, and the numerical value indicates its strength.
An obtained correlation coefficient can be interpreted with the use of a scale, like the one presented below (Best and Kahn, 1998).
Correlation coefficient | Degree of relationship |
.00-.20 | negligible |
.21-.40 | low |
.41-.60 | moderate |
.61-.80 | substantial |
.81-1.00 | high to very high |
The correlation between two sets of scores can be either positive or negative (Garcia, 2003). A positive correlation means that high scores on one variable (X) are associated with high scores on another variable (Y). Conversely, a negative correlation means that high scores on one variable are associated with low scores on another variable, or vice versa.
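The scale and the sign rule can be combined into a small helper, sketched here in Python. The function name is hypothetical, and two choices are assumptions: the scale is applied to the absolute value of the coefficient, and each band's upper limit is treated as inclusive so that values falling between the printed bands are still classified.

```python
def describe_relationship(r):
    """Return (degree, direction) for a correlation coefficient r."""
    if not -1.0 <= r <= 1.0:
        raise ValueError("a correlation coefficient must lie between -1 and +1")
    size = abs(r)  # the scale describes strength, regardless of sign
    if size <= 0.20:
        degree = "negligible"
    elif size <= 0.40:
        degree = "low"
    elif size <= 0.60:
        degree = "moderate"
    elif size <= 0.80:
        degree = "substantial"
    else:
        degree = "high to very high"
    direction = "positive" if r > 0 else "negative" if r < 0 else "none"
    return degree, direction


print(describe_relationship(0.96))
print(describe_relationship(-0.35))
```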
Pearson’s Product-Moment Correlation
This measure of relationship is used when the factors to be correlated are both metric data. Metric data are measurements that can be subjected to the four fundamental operations. To compute the correlation coefficient using this statistic, follow these steps:
- Compute the sum of each set of scores (ΣX, ΣY).
- Square each score and sum the squares (ΣX², ΣY²).
- Count the number of scores in each group (N).
- Multiply each X score by its corresponding Y score.
- Sum the cross products of X and Y (ΣXY).
- Calculate the correlation, following the formula:
$r = \frac{N\sum XY - (\sum X)(\sum Y)}{\sqrt{[N\sum X^2 - (\sum X)^2][N\sum Y^2 - (\sum Y)^2]}}$
Where: N = number of paired observations
ΣXY = sum of the cross products of X and Y
ΣX = sum of the scores under variable X
ΣY = sum of the scores under variable Y
(ΣX)² = square of the sum of the X scores
(ΣY)² = square of the sum of the Y scores
ΣX² = sum of the squared X scores
ΣY² = sum of the squared Y scores
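The steps and the raw-score formula above translate fairly directly into Python. The sketch below is an illustration rather than a reference implementation; the function name and the two short score lists are hypothetical, chosen so that the expected results (a perfect positive and a perfect negative relationship) are easy to check by hand.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Pearson product-moment correlation from paired raw scores."""
    if len(xs) != len(ys):
        raise ValueError("X and Y must contain the same number of scores")
    n = len(xs)
    sum_x, sum_y = sum(xs), sum(ys)
    sum_x2 = sum(x * x for x in xs)               # sum of squared X scores
    sum_y2 = sum(y * y for y in ys)               # sum of squared Y scores
    sum_xy = sum(x * y for x, y in zip(xs, ys))   # sum of cross products
    numerator = n * sum_xy - sum_x * sum_y
    denominator = sqrt((n * sum_x2 - sum_x ** 2) * (n * sum_y2 - sum_y ** 2))
    return numerator / denominator


# Hypothetical scores: the second list rises in step with the first.
quiz_1 = [1, 2, 3, 4, 5]
quiz_2 = [2, 4, 6, 8, 10]
print(pearson_r(quiz_1, quiz_2))        # 1.0
print(pearson_r(quiz_1, quiz_2[::-1]))  # -1.0
```

Note that the function divides by zero if every score in either list is identical, since the relationship is undefined in that case.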
Let us illustrate how Pearson’s r is computed. Table 10.1 shows the computational procedures in determining the degree of relationship between the test scores of 10 students in English (X) and Mathematics (Y).
Table 10.1
Computation of Correlation Coefficient Using Pearson’s r
X | Y | X² | Y² | XY |
90 | 80 | 8100 | 6400 | 7200 |
85 | 72 | 7225 | 5184 | 6120 |
80 | 70 | 6400 | 4900 | 5600 |
75 | 65 | 5625 | 4225 | 4875 |
70 | 68 | 4900 | 4624 | 4760 |
65 | 55 | 4225 | 3025 | 3575 |
60 | 60 | 3600 | 3600 | 3600 |
55 | 50 | 3025 | 2500 | 2750 |
50 | 53 | 2500 | 2809 | 2650 |
45 | 44 | 2025 | 1936 | 1980 |
ΣX = 675 | ΣY = 617 | ΣX² = 47,625 | ΣY² = 39,203 | ΣXY = 43,110 |
N = 10
NΣXY = 10(43,110) = 431,100
(ΣX)(ΣY) = (675)(617) = 416,475
NΣXY - (ΣX)(ΣY) = 431,100 - 416,475 = 14,625
NΣX² = 10(47,625) = 476,250
(ΣX)² = (675)(675) = 455,625
NΣX² - (ΣX)² = 476,250 - 455,625 = 20,625
NΣY² = 10(39,203) = 392,030
(ΣY)² = (617)(617) = 380,689
NΣY² - (ΣY)² = 392,030 - 380,689 = 11,341
[NΣX² - (ΣX)²][NΣY² - (ΣY)²] = (20,625)(11,341) = 233,908,125
√233,908,125 = 15,294.05522
r = 14,625 / 15,294.05522
r = 0.956 or 0.96
The computation yielded a Pearson’s r of 0.96. This indicates that a very high degree of relationship exists between the test scores in English and Mathematics: a student who scored high in English also tended to obtain a high score in Mathematics.
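The arithmetic in Table 10.1 can be checked with a few lines of Python. This is only a verification sketch using the ten English and Mathematics scores from the table; the variable names are illustrative.

```python
from math import sqrt

english = [90, 85, 80, 75, 70, 65, 60, 55, 50, 45]      # X, from Table 10.1
mathematics = [80, 72, 70, 65, 68, 55, 60, 50, 53, 44]  # Y, from Table 10.1

n = len(english)
sum_x = sum(english)
sum_y = sum(mathematics)
sum_x2 = sum(x * x for x in english)
sum_y2 = sum(y * y for y in mathematics)
sum_xy = sum(x * y for x, y in zip(english, mathematics))

# Numerator and the two bracketed terms under the square root.
numerator = n * sum_xy - sum_x * sum_y
x_term = n * sum_x2 - sum_x ** 2
y_term = n * sum_y2 - sum_y ** 2
r = numerator / sqrt(x_term * y_term)

print(sum_x, sum_y, sum_x2, sum_y2, sum_xy)  # 675 617 47625 39203 43110
print(numerator, x_term, y_term)             # 14625 20625 11341
print(round(r, 3))                           # 0.956
```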
Spearman Rho
This measure of relationship is used when test scores are ordinal or rank-ordered. In computing rho, the following steps should be observed:
1. Rank the scores in distribution X, giving the highest score a rank of 1.
2. Repeat the process for the scores in distribution Y.
3. Obtain the difference between the two sets of ranks (D).
4. Square each of these differences and sum the squared differences (ΣD²).
5. Solve for rho, following the formula:
$\rho = 1 - \frac{6\sum D^2}{N(N^2 - 1)}$
Where: rho = rank-order correlation coefficient
D = difference between paired ranks
ΣD² = sum of squared differences between paired ranks
N = number of paired ranks
The computational procedures for the calculation of rho are shown in Table 10.2.
Table 10.2
Computation of Correlation Coefficient
Using Spearman Rho
X | Y | Rank of X | Rank of Y | D | D² |
90 | 80 | 1 | 1 | 0 | 0 |
85 | 72 | 2 | 2 | 0 | 0 |
80 | 70 | 3 | 3 | 0 | 0 |
75 | 65 | 4 | 5 | -1 | 1 |
70 | 68 | 5 | 4 | 1 | 1 |
65 | 55 | 6 | 7 | -1 | 1 |
60 | 60 | 7 | 6 | 1 | 1 |
55 | 50 | 8 | 9 | -1 | 1 |
50 | 53 | 9 | 8 | 1 | 1 |
45 | 44 | 10 | 10 | 0 | 0 |
| | | | | ΣD² = 6 |
N = 10
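Table 10.2 stops at the sum of squared rank differences and N, so the final step of solving for rho can be sketched in Python. The sketch assumes the standard rank-order formula, rho = 1 - 6ΣD² / [N(N² - 1)], and a simple ranking helper that only works when there are no tied scores (as is the case for these data). The function names are illustrative.

```python
def ranks_descending(scores):
    """Rank scores so that the highest gets rank 1 (assumes no tied scores)."""
    ordered = sorted(scores, reverse=True)
    return [ordered.index(score) + 1 for score in scores]


def spearman_rho(xs, ys):
    """Spearman rank-order correlation for paired scores without ties."""
    n = len(xs)
    rank_x = ranks_descending(xs)
    rank_y = ranks_descending(ys)
    sum_d2 = sum((rx - ry) ** 2 for rx, ry in zip(rank_x, rank_y))
    return 1 - 6 * sum_d2 / (n * (n ** 2 - 1))


english = [90, 85, 80, 75, 70, 65, 60, 55, 50, 45]      # X, from Table 10.2
mathematics = [80, 72, 70, 65, 68, 55, 60, 50, 53, 44]  # Y, from Table 10.2

rho = spearman_rho(english, mathematics)
print(f"rho = {rho:.2f}")  # rho = 0.96
```

With ΣD² = 6 and N = 10, this works out to 1 - 36/990, or about 0.96, which is in line with the Pearson’s r obtained for the same scores.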