School of Science and Technology 科技學院
Computing Programmes 電腦學系

Classification of Short Answers for Semi-Automated Grading and Feedback in Online Assessment

TSUI Yiu Chuen, LAI Ka Wai, CHENG Mang Kwan

Programme: Bachelor of Computing with Honours in Internet Technology
Supervisor: Dr. Andrew Lui
Areas: Teaching and Classroom Support
Year of Completion: 2017
Award Received: First Runner-Up, IEEE Computational Intelligence Chapter FYP Competition 2017


The aim of this project is to develop a semi-automated algorithm for grading short answers that reduces instructors' grading effort and makes better use of it. Traditionally, grading short answers takes a great deal of time and effort, and we believe the process can be made more efficient. Semi-automated grading combines traditional grading with automated grouping of answers. We design a framework that clusters short answers into groups, so that answers which are similar and share common features fall into the same cluster. We want to find the ideal feature set, so that answers are clustered successfully according to their similarity. Instructors then grade only a representative answer from each cluster, and that grade is applied to all the answers in the cluster.
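The grading step described in the paragraph above can be sketched as follows. This is a minimal illustration under stated assumptions, not the project's implementation: it assumes cluster assignments are already available, uses word-overlap (Jaccard) similarity as a stand-in for the feature model, and picks the medoid of each cluster as its representative. All function names are hypothetical.

```python
from collections import defaultdict


def tokens(answer):
    """Lower-cased set of whitespace-separated words (a stand-in feature model)."""
    return set(answer.lower().split())


def jaccard(a, b):
    """Word-overlap similarity between two token sets, in [0, 1]."""
    union = a | b
    return len(a & b) / len(union) if union else 1.0


def pick_representatives(answers, cluster_ids):
    """Return {cluster_id: index of the answer most similar to its cluster mates}."""
    members = defaultdict(list)
    for idx, cid in enumerate(cluster_ids):
        members[cid].append(idx)
    reps = {}
    for cid, idxs in members.items():
        # Medoid: the member with the highest total similarity to the cluster.
        reps[cid] = max(
            idxs,
            key=lambda i: sum(
                jaccard(tokens(answers[i]), tokens(answers[j])) for j in idxs
            ),
        )
    return reps


def propagate_grades(cluster_ids, representative_grades):
    """Give every answer the grade the instructor gave its cluster's representative."""
    return [representative_grades[cid] for cid in cluster_ids]


answers = [
    "the cpu executes instructions",
    "cpu executes the instructions",
    "cpu executes program instructions quickly",
    "memory stores data",
    "memory stores the data",
]
cluster_ids = [0, 0, 0, 1, 1]             # assumed output of a clustering step
reps = pick_representatives(answers, cluster_ids)
instructor_grades = {0: 5, 1: 2}          # the instructor grades two answers only
print(propagate_grades(cluster_ids, instructor_grades))  # [5, 5, 5, 2, 2]
```

In this sketch the instructor grades two representatives instead of five answers; the saving grows with cluster size, provided the clusters are pure.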

We also want to investigate the balance between the number of clusters that need to be graded and the accuracy of the clusters. Ideally, instructors would grade as few clusters as possible while still achieving excellent grading performance.

To achieve this aim, the main objective is to classify and cluster students' short answers and then grade them automatically. The project defines the following sub-objectives:

  • The design of the feature set for clustering the answers, for example, which factors are essential and which are not essential for determining the similarity of answers.
  • The selection of data to be collected and processed.
  • Evaluation of the design. The evaluation assesses the accuracy of the feature set.
  • Evaluation of the cluster quality indices. The evaluation assesses which index can estimate the purity of the clusters.


Background and Methodology

This is a research-based project that aims to cluster short answers in order to amplify graders' effort. Graders would prefer to have fewer clusters to grade (i.e. a minimal K value). However, the lower the number of clusters, the harder it is to obtain pure clusters. A pure cluster is one that contains only answers deserving the same grade.
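The trade-off between K and purity can be made concrete with the standard purity measure: for each cluster, count the answers that share the cluster's most common grade, sum these counts, and divide by the total number of answers. The sketch below assumes the true grades are known, as they would be in a labelled evaluation dataset; the function name and the toy data are illustrative only.

```python
from collections import Counter, defaultdict


def purity(cluster_ids, grades):
    """Fraction of answers whose grade matches the majority grade of their cluster."""
    by_cluster = defaultdict(list)
    for cid, grade in zip(cluster_ids, grades):
        by_cluster[cid].append(grade)
    majority_total = sum(
        Counter(cluster_grades).most_common(1)[0][1]
        for cluster_grades in by_cluster.values()
    )
    return majority_total / len(grades)


grades = [5, 5, 2, 2, 0, 0]            # toy ground-truth grades
two_clusters = [0, 0, 0, 1, 1, 1]      # K = 2: less grading work, mixed clusters
three_clusters = [0, 0, 1, 1, 2, 2]    # K = 3: more grading work, pure clusters

print(purity(two_clusters, grades))    # 4 of 6 answers match their cluster majority
print(purity(three_clusters, grades))  # 1.0
```

Purity rises trivially to 1.0 when every answer is its own cluster, which is why it has to be weighed against the number of clusters the grader must inspect.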

To achieve this, we investigated how to balance these two dimensions in the following areas:

  • Feature model. It is important to determine which features are capable of distinguishing similar answers from dissimilar ones.
  • Clustering algorithm. The selection of the clustering algorithm is key to success, because the choice of algorithm affects the quality of the clustering result.
  • Cluster quality measurements. Purity measures whether the answers within a cluster all have the same grade. In real-life cases, however, the answers are not labelled, so we need to find a measurement that is able to estimate purity.
  • Implementation technologies. The prototype system needs suitable technologies for its implementation.
  • Dataset. The datasets to be used in the project.
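As an illustration of a cluster quality measurement that needs no grade labels, the sketch below computes the mean silhouette coefficient. This text does not say which indices the project evaluated, so the silhouette is used here purely as an assumed example of an internal index; the Jaccard distance on word sets is likewise a stand-in for the real feature model.

```python
from collections import defaultdict


def jaccard_distance(a, b):
    """1 minus the word-overlap similarity of two token sets."""
    union = a | b
    return 1.0 - (len(a & b) / len(union) if union else 1.0)


def silhouette(items, cluster_ids, dist):
    """Mean silhouette coefficient; higher means tighter, better separated clusters."""
    members = defaultdict(list)
    for idx, cid in enumerate(cluster_ids):
        members[cid].append(idx)
    if len(members) < 2:
        raise ValueError("silhouette needs at least two clusters")
    scores = []
    for i, cid in enumerate(cluster_ids):
        own = [j for j in members[cid] if j != i]
        if not own:
            scores.append(0.0)  # convention for single-member clusters
            continue
        # a: mean distance to the other members of the same cluster
        a = sum(dist(items[i], items[j]) for j in own) / len(own)
        # b: mean distance to the members of the nearest other cluster
        b = min(
            sum(dist(items[i], items[j]) for j in idxs) / len(idxs)
            for other, idxs in members.items()
            if other != cid
        )
        denom = max(a, b)
        scores.append((b - a) / denom if denom else 0.0)
    return sum(scores) / len(scores)


items = [
    {"cpu", "runs", "code"},
    {"cpu", "runs", "programs"},
    {"disk", "stores", "files"},
    {"disk", "stores", "data"},
]
good = silhouette(items, [0, 0, 1, 1], jaccard_distance)
bad = silhouette(items, [0, 1, 0, 1], jaccard_distance)
print(good > bad)  # True: the grouping that keeps similar answers together scores higher
```

Whether such an index actually tracks purity on real answer sets is an empirical question, which is what the evaluation of cluster quality indices is meant to settle.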

Conclusion and Future Development

Semi-automated grading is still a challenging problem. This research suggests that using Typed Dependencies as features is an advanced approach that is worth investigating further. Compared with fully automated grading, semi-automated grading combines human grading with computer assistance, which gives graders greater flexibility and accuracy. Graders' effort is also amplified compared with the traditional grading process. Our project aim has been achieved. This research makes two main contributions, which are discussed in the following sub-chapters.
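To show why typed dependencies can be more discriminating than plain word counts, the sketch below turns sets of dependency triples into binary feature vectors. The triples are written by hand as hypothetical parser output in (relation, head, dependent) form; the project's actual parser, feature encoding, and function names are not given in this text, so everything here is an assumption.

```python
def dependency_features(triples_per_answer):
    """Build a sorted vocabulary of triples and one binary vector per answer."""
    vocab = sorted({t for triples in triples_per_answer for t in triples})
    vectors = [
        [1 if t in triples else 0 for t in vocab]
        for triples in triples_per_answer
    ]
    return vocab, vectors


# Hand-written triples for "The CPU executes instructions" (lemmatised).
answer_a = {("nsubj", "execute", "cpu"), ("dobj", "execute", "instruction")}
# Hand-written triples for "Instructions execute the CPU": same words, roles swapped.
answer_b = {("nsubj", "execute", "instruction"), ("dobj", "execute", "cpu")}

vocab, vectors = dependency_features([answer_a, answer_b])

# A bag-of-words view cannot tell the two answers apart ...
words_a = {w for _, head, dep in answer_a for w in (head, dep)}
words_b = {w for _, head, dep in answer_b for w in (head, dep)}
print(words_a == words_b)        # True
# ... but the dependency feature vectors share no active feature.
print(vectors[0], vectors[1])    # [0, 1, 1, 0] [1, 0, 0, 1]
```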

Our project has some limitations. Our algorithm cannot identify synonyms and antonyms. Handling them is important because students may use different words to express the same meaning. Abbreviations cannot be detected either. In some datasets, students frequently use abbreviations in their answers. For example, “us” can be read literally as a pronoun or as an abbreviation of “United States”. Handling these cases would benefit the accuracy of the algorithm.
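One possible way to address these limitations is to normalise answers before feature extraction. The sketch below is an assumption about how that might look, not part of the project: the lookup tables are hypothetical and tiny, and a real system would more likely draw on a thesaurus or a curated abbreviation list.

```python
# Hypothetical lookup tables, for illustration only.
SYNONYMS = {"quick": "fast", "rapid": "fast"}
ABBREVIATIONS = {"usa": "united states", "u.s.": "united states"}
# "us" is deliberately absent: a table lookup cannot tell the pronoun from the
# abbreviation, so resolving it would need context.


def normalize(answer):
    """Lower-case the answer, then expand abbreviations and map synonyms word by word."""
    normalized = []
    for token in answer.lower().split():
        token = ABBREVIATIONS.get(token, token)
        token = SYNONYMS.get(token, token)
        normalized.append(token)
    return " ".join(normalized)


print(normalize("A rapid CPU"))   # a fast cpu
print(normalize("Made in USA"))   # made in united states
print(normalize("Tell us more"))  # tell us more
```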

In addition, we did not investigate the possible use of other clustering algorithms; we used only K-Means in our experiments. It is possible that other clustering algorithms would perform better than K-Means.
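For reference, K-Means can be sketched in a few lines as Lloyd's algorithm. This is a generic textbook version, not the project's code: the initial centroids are passed in explicitly to keep the example deterministic, and the two-dimensional points stand in for real answer feature vectors.

```python
def kmeans(points, initial_centroids, max_iter=100):
    """Lloyd's algorithm: alternate nearest-centroid assignment and centroid update."""
    centroids = [list(c) for c in initial_centroids]
    labels = None
    for _ in range(max_iter):
        # Assignment step: each point joins its nearest centroid (squared Euclidean).
        new_labels = [
            min(
                range(len(centroids)),
                key=lambda c: sum((x - y) ** 2 for x, y in zip(p, centroids[c])),
            )
            for p in points
        ]
        if new_labels == labels:
            break  # assignments are stable, so the algorithm has converged
        labels = new_labels
        # Update step: move each centroid to the mean of its members.
        for c in range(len(centroids)):
            cluster_points = [p for p, label in zip(points, labels) if label == c]
            if cluster_points:  # an empty cluster keeps its previous centroid
                centroids[c] = [
                    sum(col) / len(cluster_points) for col in zip(*cluster_points)
                ]
    return labels, centroids


points = [[1.0, 0.0], [1.0, 0.2], [0.0, 1.0], [0.1, 1.0]]
labels, centroids = kmeans(points, initial_centroids=[points[0], points[2]])
print(labels)  # [0, 0, 1, 1]
```

K-Means needs K fixed in advance and assumes roughly spherical clusters, which is one reason other algorithms may behave differently on answer data.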

We suggest that future work address the limitations stated above. Synonym, antonym and abbreviation features can be investigated to check whether they help to identify the answers more accurately.

Investigating other clustering algorithms is also future work, since the choice of algorithm can be a major factor in performance.

Jonathan Chiu
Marketing Director
3DP Technology Limited

Jonathan handles all external affairs, including business development, patent write-ups and public relations. He is frequently interviewed by the media and is considered a pioneer in 3D printing products.

Krutz Cheuk
Biomedical Engineer
Hong Kong Sanatorium & Hospital

After graduating from OUHK, Krutz obtained an M.Sc. in Engineering Management from CityU. He is now completing his second master's degree, an M.Sc. in Biomedical Engineering, at CUHK. Krutz has a wide range of working experience, having been with Siemens, VTech, and PCCW.

Hugo Leung
Software and Hardware Engineer
Innovation Team Company Limited

Hugo Leung Wai-yin, who graduated from his four-year programme in 2015, won the Best Paper Award for his ‘intelligent pill-dispenser’ design at the Institute of Electrical and Electronics Engineers’ International Conference on Consumer Electronics – China 2015.

The pill-dispenser alerts patients via sound and LED flashes to pre-set dosage and time intervals. Unlike units currently on the market, Hugo’s design connects to any mobile phone globally. In explaining how it works, he said: ‘There are three layers in the portable pillbox. The lowest level is a controller with various devices which can be connected to mobile phones in remote locations. Patients are alerted by a sound alarm and flashes. Should they fail to follow their prescribed regime, data can be sent via SMS to relatives and friends for follow up.’ The pill-dispenser has four medicine slots, plus a back-up with a LED alert, topped by a 500ml water bottle. It took Hugo three months of research and coding to complete his design, but he feels it was worth all his time and effort.

Hugo’s public examination results were disappointing and he was at a loss about his future before enrolling at the OUHK, which he now realizes was a major turning point in his life. He is grateful for the OUHK’s learning environment, its industry links and the positive guidance and encouragement from his teachers. The University is now exploring the commercial potential of his design with a pharmaceutical company. He hopes that this will benefit the elderly and chronically ill, as well as the society at large.

Soon after completing his studies, Hugo joined an automation technology company as an assistant engineer. He is responsible for the design and development of automation devices, with the aim of minimizing human labor and increasing product quality. He is developing products for use in various sectors, including healthcare, manufacturing and consumer electronics.

Course Code | Title | Credits
COMP S321F | Advanced Database and Data Warehousing | 5
COMP S333F | Advanced Programming and AI Algorithms | 5
COMP S351F | Software Project Management | 5
COMP S362F | Concurrent and Network Programming | 5
COMP S363F | Distributed Systems and Parallel Computing | 5
COMP S382F | Data Mining and Analytics | 5
COMP S390F | Creative Programming for Games | 5
COMP S492F | Machine Learning | 5
ELEC S305F | Computer Networking | 5
ELEC S348F | IOT Security | 5
ELEC S371F | Digital Forensics | 5
ELEC S431F | Blockchain Technologies | 5
ELEC S425F | Computer and Network Security | 5

Course Code | Title | Credits
ELEC S201F | Basic Electronics | 5
IT S290F | Human Computer Interaction & User Experience Design | 5
STAT S251F | Statistical Data Analysis | 5

Course Code | Title | Credits
COMP S333F | Advanced Programming and AI Algorithms | 5
COMP S362F | Concurrent and Network Programming | 5
COMP S363F | Distributed Systems and Parallel Computing | 5
COMP S380F | Web Applications: Design and Development | 5
COMP S381F | Server-side Technologies and Cloud Computing | 5
COMP S382F | Data Mining and Analytics | 5
COMP S390F | Creative Programming for Games | 5
COMP S413F | Application Design and Development for Mobile Devices | 5
COMP S492F | Machine Learning | 5
ELEC S305F | Computer Networking | 5
ELEC S363F | Advanced Computer Design | 5
ELEC S425F | Computer and Network Security | 5