ABSTRACT
The main aim of this minor thesis is to evaluate the reliability of the final achievement
computer-based MCQs test 1 for 4th-semester non-English majors at Hanoi
University of Business and Technology.
To achieve this aim, a combination of qualitative and quantitative research methods
was adopted. The findings indicate that there is a certain degree of unreliability
in the final achievement computer-based MCQs test 1 and that two main factors
cause this unreliability: test item quality and test-takers’ performance.
Having thoroughly analysed the collected data, the author made some suggestions
to improve the quality of the final achievement test and the MCQs test 1 for
non-English majors in the 4th semester at Hanoi University of
Business and Technology. Firstly, the test objectives, sections and skill weight should be
adjusted to be more compatible with the course objectives and the syllabus. Secondly, a
testing committee should be set up to construct and develop a multiple-choice
item bank made up of test items with good p-values and discrimination values.
LIST OF ABBREVIATIONS
1. CBT: Computer-based testing
2. HUBT: Hanoi University of Business and Technology
3. MC: Multiple choice
4. MCQs: Multiple-choice questions
5. ML Pre-: Market Leader Pre-intermediate
6. KD: Kuder-Richardson
7. SD: Standard deviation
LIST OF TABLES AND CHARTS
1. Table 1: Types of tests
2. Table 2: Scoring format for each semester
3. Table 3: The syllabus for the 4th semester (for non-English majors)
4. Table 4: Time allocation for language skills and sections
5. Table 5: Specification grid for the final computer-based MCQs test 1
6. Table 6: Main points in the grammar section
7. Table 7: Main points in the vocabulary section
8. Table 8: Topics in the reading section
9. Table 9: Items in the functional language section
10. Table 10: Test reliability coefficient
11. Table 11: p-value of items in 4 sections
12. Table 12: Discrimination value of items in 4 sections
13. Table 13: Number of test items with acceptable p-value and discrimination value in 4 sections
14. Table 14: Suggested scoring format
15. Table 15: Proposed test specifications
16. Chart 1: Students’ responses on test content
17. Chart 2: Students’ responses on item discrimination value
18. Chart 3: Students’ responses on time length
19. Chart 4: Students’ responses on arbitrariness
20. Chart 5: Students’ responses on the relation between test score and their achievement
TABLE OF CONTENTS
CANDIDATE’S STATEMENT
ACKNOWLEDGEMENT
ABSTRACT
LIST OF ABBREVIATIONS
LIST OF TABLES AND CHARTS
TABLE OF CONTENTS
Chapter 1: INTRODUCTION
1.1. Rationale for the study
1.2. Aims and research questions
1.3. Scope of the study
1.4. Theoretical and practical significance of the study
1.5. Method of the study
1.6. Organization of the paper
Chapter 2: LITERATURE REVIEW
2.1. Language testing
2.1.1. What is a language test?
2.1.2. The purposes of language tests
2.1.3. Types of language tests
2.1.4. Criteria of a good language test
2.2. Achievement test
2.2.1. Definition
2.2.2. Types of achievement test
2.2.3. Considerations in final achievement test construction
2.3. MCQs test
2.3.1. Definition
2.3.2. Benefits of MCQs test
2.3.3. Limitations of MCQs test
2.3.4. Principles on designing a good MCQs test
2.4. Reliability of a test
2.4.1. Definition
2.4.2. Methods for test reliability estimate
2.4.3. Measures to improve test reliability
2.5. Summary
Chapter 3: THE CONTEXT OF THE STUDY
3.1. The current English learning, teaching and testing situation at HUBT
3.2. The course objectives, syllabus and materials used for the second-year non-English majors in Semester 4
3.2.1. The course objectives
3.2.2. Business English syllabus
3.2.3. The course book
3.2.4. Specification grid for the final achievement computer-based MCQs test in Semester 4
Chapter 4: METHODOLOGY
4.1. Participants
4.2. Data collection instruments
4.3. Data collection procedure
4.4. Data analysis procedure
Chapter 5: RESULTS AND DISCUSSIONS
5.1. The compatibility of the objectives, content and skill weight format of the final achievement computer-based MCQs test 1 for the 4th semester with the course objectives and the syllabus
5.1.1. The test objectives and the course objectives
5.1.2. The test item content in four sections and the syllabus content
5.1.3. The skill weight format in the test and the syllabus
5.2. The reliability of the final achievement test
5.2.1. Reliability coefficient
5.2.2. Item difficulty and discrimination value
5.3. The attitude of students towards the MCQs test 1
5.4. Pedagogical implications and suggestions on improvements of the existing final achievement computer-based MCQs test 1 for the non-English majors at HUBT
5.5. Summary
Chapter 6: CONCLUSION
6.1. Summary of the findings
6.2. Limitations of the study
6.3. Suggestions for further study
REFERENCES
APPENDICES
APPENDIX 1: Grammar, reading, vocabulary and functional language checklist
APPENDIX 2: Survey questionnaire (for students at HUBT)
APPENDIX 3: Students’ test scores
APPENDIX 4: Item analysis of the final achievement computer-based MCQs test 1 (150 items, 349 examinees)
APPENDIX 5: Item indices of the final achievement computer-based MCQs test 1
Chapter 1: Introduction
1.1. Rationale for the study
Testing plays a very important role in the teaching and learning process. Testing is one
form of measurement, used to point out strengths and weaknesses in the learned
abilities of students. Through testing, and test scores in particular, we may discover how
both students and teachers are performing. For students, test scores
reveal what they have achieved after a learning period. For teachers, test scores indicate
what they have taught their students. Based on test results, we may make improvements in
teaching, learning and testing for better instructional effectiveness.
Another reason for choosing testing as the subject of this study is that
current language testing at Hanoi University of Business and Technology (HUBT) has
been a matter of considerable controversy among students and teachers. Testing is mainly carried out
in the form of two objective tests on computers (named test 1 and test 2), which are
administered at the end of each semester. The scores that a student gets on these tests are
the main indicators of his or her performance during the whole semester. Opinions on
the results of these tests differ, especially for test 1 for second-year
non-English majors. Some subject teachers claim that these tests do not truly reflect the
students’ language competence. Others say that the tests are appropriate to what students
have learnt in class, compatible with the course objectives, and therefore reliable.
Opposing views also exist among the students. Many think that the tests are more difficult
than what they have learnt and studied for the exam; others say that the test items are
easy and relevant to what they have been taught. It is therefore essential to find out whether
the tests are closely related to what the students have learnt and what the teachers have taught,
and whether the tests are reliable.
For the two reasons mentioned above, the author undertook this study,
entitled “A study on the reliability of the final achievement Computer-based MCQs Test
1 for the 4th semester non-English majors at Hanoi University of Business and
Technology”, with the intention of examining the conflicting opinions about this test. In addition,
the author hopes that the results will help to raise awareness among teachers as well as
others who are interested in this field. At the same time, the results can, to some extent,
be applied to improve the current testing situation at HUBT.
1.2. Aims and research questions
The main aim of the study is to investigate the reliability of the existing final
achievement MCQs test 1 (4th semester) for non-English majors at HUBT by
analyzing the test objectives, test content and skill weight format, students’ scores, test
items, and students’ perceptions of and comments on the test, and then to make suggestions
for the test’s improvement.
To achieve this aim, the following research questions are addressed:
1. Are the objectives, content and skill weight format of the final achievement
computer-based MCQs test 1 compatible with the course objectives, the
syllabus content and its skill weight format?
2. To what extent is test 1 reliable?
3. What are the students’ attitudes towards the final achievement computer-based
MCQs test 1?
1.3. Scope of the study
The study is limited to the existing final achievement computer-based MCQs test 1
in the 4th semester for second-year non-English majors at HUBT.
1.4. Theoretical and practical significance of the study
Theoretically, the study underlines that testing is crucial for measuring and
evaluating the quality of learning and teaching, and that test reliability is one of the most
important criteria for the evaluation of a test.
Practically, the study shows how reliable the final achievement MCQs test 1
administered at HUBT is and how its quality can be improved.
1.5. Method of the study
Both qualitative and quantitative methods are used.
The qualitative method is applied to the literature review on language testing, the
course objectives and syllabus, the objectives, content and format of achievement
test 1 for the 4th term, and the results of the questionnaires for students.
The quantitative method is used for the analysis of test scores and test items.
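To illustrate the kind of quantitative item analysis referred to here, the two item indices named in this thesis, the p-value and the discrimination value, can be sketched as follows. This is only a minimal sketch and not the procedure actually used in the study. It assumes the commonly used definitions: the p-value of an item is the proportion of examinees who answer it correctly, and the discrimination value is the difference in that proportion between the highest-scoring and lowest-scoring groups. The function names, the 0/1 response matrix and the default 27% group size are assumptions made for illustration.

```python
from typing import List


def item_p_values(matrix: List[List[int]]) -> List[float]:
    """Item difficulty: proportion of examinees answering each item correctly.

    `matrix` is a hypothetical 0/1 table, one row per examinee and one
    column per item (1 = correct, 0 = incorrect).
    """
    if not matrix:
        raise ValueError("matrix must contain at least one examinee")
    n_examinees = len(matrix)
    n_items = len(matrix[0])
    return [sum(row[i] for row in matrix) / n_examinees for i in range(n_items)]


def item_discrimination(matrix: List[List[int]],
                        group_fraction: float = 0.27) -> List[float]:
    """Upper-lower discrimination index for each item.

    Examinees are ranked by total score; the index is the difference between
    the proportions of correct answers in the top and bottom groups.
    The 27% default is a common convention, assumed here for illustration.
    """
    if not matrix:
        raise ValueError("matrix must contain at least one examinee")
    ranked = sorted(matrix, key=sum, reverse=True)
    group_size = max(1, round(len(ranked) * group_fraction))
    upper, lower = ranked[:group_size], ranked[-group_size:]
    n_items = len(matrix[0])
    return [
        (sum(row[i] for row in upper) - sum(row[i] for row in lower)) / group_size
        for i in range(n_items)
    ]


# Invented data: 4 examinees, 3 items.
responses = [
    [1, 1, 1],
    [1, 1, 0],
    [1, 0, 0],
    [0, 0, 0],
]
print(item_p_values(responses))
print(item_discrimination(responses, group_fraction=0.5))
```

Under these definitions an item with a p-value close to 1 is easy and one close to 0 is hard, while a discrimination value near zero or below suggests that the item does not separate stronger from weaker examinees.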
1.6. Organization of the paper
The study is composed of 6 chapters.
Chapter 1 (Introduction) briefly states the rationale, aims and research questions,
scope of the study, theoretical and practical significance of the study, method of the study
and organization of the paper.
Chapter 2 (Literature review) discusses relevant theories of language testing, final
achievement tests, computer-based MCQs tests and test reliability.
Chapter 3 (The context of the study) deals with the English learning, teaching and testing
situation at HUBT, the course book, the syllabus and the checklist for the test.
Chapter 4 (Methodology) presents the participants, the data collection instruments, and the
data collection and data analysis procedures.
Chapter 5 (Results and discussions) presents and discusses the results of the study.
Suggestions for the improvement of achievement test 1 are also proposed in this
chapter.
Chapter 6 (Conclusion) summarizes the findings, mentions the limitations and
provides suggestions for further study.
Chapter 2: Literature review
2.1. Language testing
2.1.1. What is a language test?
There is a wide variety of definitions of a language test, and they share one point:
a language test is regarded as a device for measuring
individuals’ language ability.
According to Henning (1987, p.1), “Testing, including all forms of language test, is
one form of measurement”. In his opinion, tests such as listening or reading
comprehension are delivered in order to find out the extent to which these
abilities are present in the learners. Similarly, Bachman (1990, p.20) stated: “A test is a
measurement instrument designed to elicit a specific sample of an individual’s
behavior”. He regarded the elicitation of a specific sample of behavior as what
distinguishes a test from other types of measurement.
Brown, H.D. (1995, p.384) presented the notion in a simpler way: “A test, in plain
words, is a method of measuring a person’s ability or knowledge in a given domain”.
He explained that a test is first and foremost a method, which includes items and
techniques requiring a performance from the test-takers. Through this performance, a person’s
ability or language competence is measured.
These viewpoints show that a language test is an effective tool for measuring and
assessing students’ language knowledge and skills and for providing valuable information
for better future teaching and learning.
2.1.2. The purposes of language tests
The purposes of language tests are viewed from different perspectives
by different scholars. Henton (1990), for example, mentioned 7 purposes, which can be
listed as follows:
• Finding out about progress
• Encouraging students
• Finding out about learning difficulties
• Finding out about achievement
• Placing students
• Selecting students
• Finding out about proficiency
In general, a language test is used to evaluate both teachers’ and students’
performance, to inform judgments about and adjustments to teaching materials and methods, and
to strengthen students’ motivation for further study.
2.1.3. Types of language tests
Language tests can be classified into different types according to their purposes.
Henton (1990), Brown (1995), Harrison (1983) and Hughes (1989) pointed out that
language tests include four main types: proficiency tests, diagnostic tests, placement
tests and achievement tests, with the characteristics summarized in the following table:
Type of test        Characteristics
Proficiency test    Measures people’s abilities in a language regardless of any training they may have had in that language
Diagnostic test     Checks students’ progress for their strengths and weaknesses and what further teaching is necessary
Achievement test    Assesses what students have learnt from a known syllabus
Placement test      Classifies students into groups at different levels at the beginning of a course
Table 1: Types of tests
Another researcher, Henning (1987), divided tests into objective and subjective ones
on the basis of the manner in which they are scored. Subjective tests are scored by
the opinion and judgment of the scorer, while objective tests are scored by
comparing examinee responses with an established set of acceptable responses or
a scoring key.
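The scoring of an objective test, as described above, can be sketched in a few lines of code. This is a minimal illustration only: it assumes that each item has exactly one keyed option, and the function name and the sample key and responses are hypothetical.

```python
from typing import Sequence


def score_objective_test(responses: Sequence[str], key: Sequence[str]) -> int:
    """Return the number of responses that match the scoring key."""
    if len(responses) != len(key):
        raise ValueError("responses and key must have the same length")
    # An item earns one point only when the chosen option equals the keyed one.
    return sum(1 for given, correct in zip(responses, key) if given == correct)


# Invented five-item example: items 1, 2 and 4 match the key.
scoring_key = ["A", "C", "B", "D", "A"]
examinee_responses = ["A", "C", "D", "D", "B"]
print(score_objective_test(examinee_responses, scoring_key))  # prints 3
```

Because the score depends only on the key, any marker, or a computer, arrives at the same result for the same responses, and no judgment on the part of the scorer is involved.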
2.1.4. Criteria of a good language test
Just like any measuring device, a language test is subject to potential measurement
error. For the purpose of investigating, evaluating and “testing” a test,
researchers such as Brown (1995), Henning (1987), Bachman (1990) and Harrison
(1983) identified criteria to determine whether a test is good or not. A good language test
must have four key qualities: reliability, validity, practicality and
discrimination.
The reliability of a test is its consistency (Brown, 1995; Harrison, 1983). A test is
reliable only when it yields the same results regardless of the circumstances in which it is
administered or the markers who score it. The validity of a test refers to “the degree to
which the test actually measures what it is intended to measure” (Brown, 1995, p.387).
A test is considered to be valid if it possesses content validity, face validity and