School of Science and Technology 科技學院
Electronic and Computer Engineering 電子工程學系

Development and Evaluation of a Real-time System for Detection of Personal Protective Equipment

Student: Yuen Wang Leung
Programme: Bachelor of Science with Honours in Computer Engineering
Supervisor: Dr. Kevin Hung
Year: 2021/22


This proposal discusses construction safety issues and analyses how object detection can innovate the industry. It covers a literature review, the methodology, the expected results and a risk assessment.

A project plan is included to give an overview of how the system will be built. At this stage, the proposal covers only the planning, design and analysis phases; implementation will be discussed later.

Demonstration Video



This project aims to develop a real-time software system for the detection and recognition of personal protective gear on construction sites. A quadruped robot, the Jueying Mini Lite (see Figure 1) developed by DeepRobotics Co. Ltd, may also be used for dataset collection and as part of the recognition system.

Figure 1. quadruped robot – Jueying Mini Lite


The objective of this project is to design, develop and evaluate a software system for real-time object detection that recognizes personal protective gear on construction sites.

The software system contains the following features:

  • Image pre-processing;
  • AI model training for detection and recognition of specific objects;
  • Detecting and recognizing personal protective gear.

Accidents often happen in construction areas. This system aims to strengthen the danger awareness of construction workers and to ease the duties of construction safety officers and auditors. The construction workforce is also ageing. To help manage the workers, this project will develop an accurate and reliable object detection model for construction use, and will identify the difficulties and precautions in this area. Robotic technology is widely used in industry, so this project will try to use it as an effective way to collect datasets and as a safety monitoring system.

Methodologies and Technologies used

System Design

Figure 2. Entity relationship diagram of A.I. recognition of personal protective gear

Figure 2 shows, as an entity relationship diagram, how the A.I. recognizes personal protective gear in this project. The diagram can be divided into two major parts (A.I. and gear).

For the gear part, the diagram starts with the quadruped robot, which has two attributes: a monocular wide-angle camera and a 16-line lidar. The lidar is used to build a map for the patrolling function. In this project, the robot will patrol the construction site to detect construction workers, including operators and safety officers. The construction site is also equipped with CCTV with a built-in recording function, and the robot's camera can record video while patrolling. The videos recorded by the robot and the CCTV will then be exported and converted into frames. During image pre-processing, each frame will be labelled with the worker's gear, including safety helmet and vest.

For the A.I. part, the central entity is an object detection model, which learns from the dataset by machine learning. The dataset is built from the labelled frames mentioned in the paragraph above and is then separated into two parts: training and testing. Besides the dataset, the model also takes a framework as input, which determines the algorithm to use; in this project, YOLO is chosen. To facilitate the learning process, the model has a weight parameter that defines the class scores (worker score, safety helmet score and safety vest score), the coordinate scores (x-coordinate, y-coordinate, width and height) and the object score during AI model training for detection and recognition of safety gear.
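As a minimal sketch of how the three kinds of score might be combined, assume a prediction has already been decoded into an object score, per-class scores and normalized box coordinates. In the YOLO formulation, the class-specific confidence is the object score multiplied by the class score. The `Prediction` type, its field names and the class order are hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass

# Class order is an assumption made for this sketch.
CLASSES = ("worker", "safety helmet", "safety vest")


@dataclass
class Prediction:
    """One decoded box (hypothetical structure, not the project's own)."""
    x: float             # box centre x, normalized to 0..1
    y: float             # box centre y, normalized to 0..1
    w: float             # box width, normalized to 0..1
    h: float             # box height, normalized to 0..1
    object_score: float  # how likely the box contains any object
    class_scores: tuple  # one score per entry in CLASSES


def best_class(pred):
    """Return (class name, confidence), where confidence is
    object score multiplied by the highest class score."""
    idx = max(range(len(CLASSES)), key=lambda i: pred.class_scores[i])
    return CLASSES[idx], pred.object_score * pred.class_scores[idx]


pred = Prediction(0.5, 0.4, 0.2, 0.3,
                  object_score=0.9, class_scores=(0.1, 0.8, 0.1))
name, confidence = best_class(pred)
print(name, round(confidence, 2))
```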

Together, the two parts of the system (A.I. and personal safety gear) form the entity relationship diagram (Figure 2), which shows how the A.I. recognizes gear.

System Development

Figure 3 is a flowchart showing the decision making of the system: dataset collection for image pre-processing, AI model training for detection and recognition, and the final detection and recognition of personal protective gear.

The flow starts with the quadruped robot recording source videos while patrolling the construction site. After the patrol, the robot returns to its starting position and the operators export the latest video recordings by remotely accessing the computer built into the robot. To turn the videos into a usable dataset, they are first converted into image frames. Each second of video may be converted into 30 images, depending on the original FPS (frames per second) of the video. After that, the researcher or operators label the images with a dedicated tool, LabelImg. The labelling process focuses on three targets: safety helmet, safety vest and worker. This forms the dataset the project needs. The dataset is then separated into two parts, training and testing, which contain 80 percent and 20 percent of the total dataset respectively. The system then reaches its first decision point: if the user decides that the dataset is not large enough to start training, the system returns to the step where the robot records videos while patrolling, until the user is satisfied with the size of the dataset.
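The 80/20 separation described above can be sketched as follows. This is a minimal illustration using only the Python standard library; the function name, the fixed seed and the frame file names are assumptions, not part of the project.

```python
import random


def split_dataset(items, train_ratio=0.8, seed=42):
    """Shuffle a copy of items and split it into (training, testing) lists."""
    shuffled = list(items)
    random.Random(seed).shuffle(shuffled)  # fixed seed keeps the split repeatable
    cut = int(len(shuffled) * train_ratio)
    return shuffled[:cut], shuffled[cut:]


# Hypothetical frame file names standing in for the labelled images.
frames = [f"frame_{i:04d}.jpg" for i in range(100)]
train, test = split_dataset(frames)
print(len(train), len(test))
```

Because consecutive frames of one video are very similar, splitting by video rather than by individual frame may give a more honest testing set.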

In the model training stage, the object detection system first tunes the parameters, including the weights from the last training run. It then estimates the training time, to analyse how much time needs to be invested to train a usable object detection model. After the estimation, the model starts training, with the training dataset as input to the machine learning process. When training is finished, new weights are produced and scored by AP (average precision). The testing dataset is then used to find the mAP (mean average precision), which shows the performance of the model at each epoch. After the mAP and the output are reviewed, the system may be in one of four situations: a configuration issue, a dataset issue, a training time issue, or no issue. If there is a configuration issue, such as wrong object or class definitions, the system returns to the parameter tuning step. If there is a training time issue, such as too much or too little training time, the system returns to the training time estimation. If there is a dataset issue, for example labels that are not comprehensive, the system returns to the very beginning: the robot recording videos while patrolling. After several loops, or after a smooth learning process, the system enters the further extension part, which will be discussed later.
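To make the AP and mAP scores concrete, here is a minimal sketch of one common way to compute them: the area under the precision-recall curve with all-point interpolation, averaged over classes. It assumes each detection has already been marked as a true or false positive (for example by an IoU test against the labels). The function names and the sample numbers are illustrative, and the training framework's own evaluation may differ in detail.

```python
def average_precision(detections, num_ground_truth):
    """detections: list of (confidence, is_true_positive) for one class.
    Returns the area under the interpolated precision-recall curve."""
    if num_ground_truth == 0:
        return 0.0
    ranked = sorted(detections, key=lambda d: d[0], reverse=True)
    tp = fp = 0
    recalls, precisions = [], []
    for _, is_tp in ranked:
        if is_tp:
            tp += 1
        else:
            fp += 1
        recalls.append(tp / num_ground_truth)
        precisions.append(tp / (tp + fp))
    # Interpolation: each precision becomes the best precision
    # reached at this recall level or any higher one.
    for i in range(len(precisions) - 2, -1, -1):
        precisions[i] = max(precisions[i], precisions[i + 1])
    ap, previous_recall = 0.0, 0.0
    for recall, precision in zip(recalls, precisions):
        ap += (recall - previous_recall) * precision
        previous_recall = recall
    return ap


def mean_average_precision(per_class):
    """per_class: dict of class name -> (detections, num_ground_truth)."""
    aps = [average_precision(d, n) for d, n in per_class.values()]
    return sum(aps) / len(aps)


# Made-up sample data for two classes.
sample = {
    "safety helmet": ([(0.9, True), (0.8, False), (0.7, True)], 2),
    "safety vest": ([(0.9, True), (0.8, True)], 2),
}
helmet_ap = average_precision(*sample["safety helmet"])
overall = mean_average_precision(sample)
```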

Figure 3. Flowchart of personal protective gear detection with quadruped robot video output

Tools and Equipment


Colab (Google Colaboratory)

For AI model training for detection and recognition, Colab is a code editor and code executor that works within Google Drive. Colab is connected to a cloud-based runtime, so users can run Python code with GPU support, TensorFlow and OpenCV without any setup. This makes object detection model development more convenient than developing the model on a local PC, and it avoids a large investment in equipment.

YOLO (You Only Look Once)

For AI model training for detection and recognition, a training algorithm must be chosen. YOLOv4 will be used. It is a one-stage object detection framework. After the input images are resized to 448×448, a single pass through one fully convolutional neural network (F-CNN) outputs the predictions. This method provides fast training per epoch with an accurate mAP.


Kaggle

For image pre-processing, online datasets can be used to enrich the training dataset. Kaggle is a machine learning and data science community that hosts a huge number of datasets shared by users from all around the world. In this project, the platform speeds up the dataset collection process when the datasets are already labelled.


LabelImg

For image pre-processing, a self-made dataset is created by labelling images to produce annotations. LabelImg is a tool for labelling images to produce datasets for machine learning. It is the tool of choice when a collected dataset does not provide labels or when a custom dataset is needed.
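LabelImg can save annotations in YOLO format, where each line of a label file holds a class id followed by the box centre, width and height, all normalized by the image size. The sketch below converts a pixel bounding box into such a line. The function name, the sample box and the class id are illustrative assumptions.

```python
def to_yolo_line(class_id, box, image_width, image_height):
    """Convert a pixel box (x_min, y_min, x_max, y_max) into a YOLO label line:
    '<class_id> <x_center> <y_center> <width> <height>', normalized to 0..1."""
    x_min, y_min, x_max, y_max = box
    x_center = (x_min + x_max) / 2 / image_width
    y_center = (y_min + y_max) / 2 / image_height
    width = (x_max - x_min) / image_width
    height = (y_max - y_min) / image_height
    return f"{class_id} {x_center:.6f} {y_center:.6f} {width:.6f} {height:.6f}"


# A 200x200 pixel box in a 640x480 image, with a hypothetical class id of 1.
line = to_yolo_line(1, (100, 50, 300, 250), 640, 480)
print(line)
```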


OpenCV

For detecting and recognizing personal protective gear, a computer vision library such as OpenCV can be used to run the AI model on a camera feed. OpenCV stands for Open Source Computer Vision Library. In this project, the library will be used for reading images and video, using the camera for real-time object recognition, image transformations, etc.

System Evaluation


Impact of training time

For AI model training for detection and recognition, training time affects how many tests can be made in the development cycle. For detecting and recognizing personal protective gear, the number of tests made affects the quality of the final model. The precision of the model keeps changing after each machine learning loop, and time is one of the factors needed to estimate the slope of the learning progress. mAP will be used to compare the performance of different training schedules. Besides the training schedule, the number of iterations within each schedule also affects performance, and this will be tested experimentally as well.

Relationship between IOU and NMS

For detecting and recognizing personal protective gear, IOU (Intersection over Union) and NMS (Non-maximum Suppression) are two of the factors affecting the coordinates and positions of bounding boxes during detection. The position of the displayed bounding boxes can strongly influence how easily a user can read the real-time detection on camera. The IOU threshold will also be varied experimentally to find the most suitable value for detecting gear with the fewest false positives and false negatives.
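The relationship between the two can be sketched in a few lines: IOU measures how much two boxes overlap, and greedy NMS uses an IOU threshold to discard lower-scoring boxes that overlap a higher-scoring one. This is a minimal pure-Python illustration with made-up boxes; the 0.5 threshold is an assumption, and in practice NMS is usually applied separately to each class.

```python
def iou(a, b):
    """Intersection over Union of two boxes given as (x_min, y_min, x_max, y_max)."""
    overlap_w = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    overlap_h = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    intersection = overlap_w * overlap_h
    union = ((a[2] - a[0]) * (a[3] - a[1])
             + (b[2] - b[0]) * (b[3] - b[1])
             - intersection)
    return intersection / union if union > 0 else 0.0


def nms(detections, iou_threshold=0.5):
    """Greedy non-maximum suppression over a list of (box, score) pairs.
    Keeps the highest-scoring box, drops boxes overlapping it by more
    than iou_threshold, then repeats on what is left."""
    remaining = sorted(detections, key=lambda d: d[1], reverse=True)
    kept = []
    while remaining:
        best = remaining.pop(0)
        kept.append(best)
        remaining = [d for d in remaining
                     if iou(best[0], d[0]) <= iou_threshold]
    return kept


# Two heavily overlapping boxes and one separate box (made-up values).
detections = [
    ((0, 0, 10, 10), 0.9),
    ((1, 1, 11, 11), 0.8),
    ((20, 20, 30, 30), 0.7),
]
kept = nms(detections)
print([score for _, score in kept])
```

Raising the threshold keeps more overlapping boxes (more duplicates on screen), while lowering it may suppress boxes of workers standing close together.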

True condition and precision

For detecting and recognizing personal protective gear, the outcome of each prediction is a factor in model accuracy. The four outcomes (true positive, true negative, false positive and false negative) are used to validate the precision claimed for a model. In this experiment, the results will be converted into positive predictive value, false discovery rate, false omission rate and negative predictive value as research material.
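The four rates follow directly from the four outcome counts, as the sketch below shows. The counts are made up for illustration. It also assumes that true negatives can be counted at all, which in object detection usually means evaluating per image or per worker rather than per bounding box.

```python
def ratio(numerator, denominator):
    """Divide safely, returning 0.0 when the denominator is zero."""
    return numerator / denominator if denominator else 0.0


def predictive_rates(tp, fp, tn, fn):
    """Return the four predictive rates derived from outcome counts."""
    return {
        "positive predictive value": ratio(tp, tp + fp),  # also called precision
        "false discovery rate": ratio(fp, tp + fp),
        "false omission rate": ratio(fn, fn + tn),
        "negative predictive value": ratio(tn, fn + tn),
    }


# Made-up counts: 80 true positives, 20 false positives,
# 90 true negatives and 10 false negatives.
rates = predictive_rates(tp=80, fp=20, tn=90, fn=10)
for name, value in rates.items():
    print(f"{name}: {value:.2f}")
```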

Impact of SVM

For detecting and recognizing personal protective gear, SVM (Support Vector Machine) is a class separation method that can be used in object detection to assess accuracy. To better resolve the gray area between classes, support vectors are used to define the boundary between each pair of classes. Experiments will be done to find out the impact of these vectors.


Risk Assessment


Model Training Interruption

In this project, object detection model training runs on the Google cloud platform. Several conditions may shut down the training, for example a browser crash, an automatic Windows update or a low computer battery. The training result will not be lost, because the latest trained weights are saved after a configurable number of epochs (around 1000 epochs). However, the scheduled training time will be wasted and the training will fall behind the plan.
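After an interruption, training would normally resume from the most recent saved weights. The sketch below picks the weights file with the highest iteration number from a folder. The file naming pattern (`<name>_<iteration>.weights`) and the function name are assumptions made for illustration; the demo creates empty placeholder files in a temporary folder.

```python
import re
import tempfile
from pathlib import Path


def latest_checkpoint(folder):
    """Return the weights file with the highest iteration number, or None.
    Assumes files are named like '<name>_<iteration>.weights'."""
    pattern = re.compile(r"_(\d+)\.weights$")
    best_path, best_iteration = None, -1
    for path in Path(folder).glob("*.weights"):
        match = pattern.search(path.name)
        if match and int(match.group(1)) > best_iteration:
            best_path, best_iteration = path, int(match.group(1))
    return best_path


# Demo with empty placeholder files (hypothetical names).
with tempfile.TemporaryDirectory() as folder:
    for name in ("model_1000.weights", "model_2000.weights", "model_last.weights"):
        Path(folder, name).touch()
    newest_name = latest_checkpoint(folder).name

print(newest_name)
```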

Third-party dataset

To collect the dataset, third-party datasets will be researched to reduce the inconvenient labelling work. However, the quality of most such datasets is hard to control when they contain thousands of images. Poor-quality datasets may lead the model weights in the wrong direction.

Robot Camera Access Interruption

To access the camera in the quadruped robot, remote desktop software (NoMachine) is used to connect to the robot's computer system (Nvidia Xavier NX). This remote software has multiple known issues listed in the robot's official tutorial sheet: the remote desktop screen may be white after connection, and a “Session Negotiation Failed” error may be shown after the password is entered. These errors will interrupt the model learning process.

Run out of Reliable Datasets

To improve model accuracy, the dataset should be expanded from time to time to cover a wider range of detectable targets. In this project, Kaggle open-source datasets and the patrolling robot's video recordings are the two major dataset collection methods. If the reliable Kaggle datasets are used up and the robot's video recordings have all been processed, the training process will be forced to stop.


To develop the real-time system for detection of personal safety gear on construction sites, the quadruped robot will be tested by recording patrolling videos, which will be exported and labelled to build the datasets. The model training process will run on a cloud platform to lower the cost and provide better risk management. The trained model should achieve at least 80 percent accuracy after tuning and research. The research process, the experiments and the model will provide a valuable reference for object detection in construction safety, as well as for the use of quadruped robots in data collection and patrolling.

For image pre-processing, this project successfully labelled images to produce a self-made dataset after testing multiple labelling tools. The major concern in choosing a labelling tool was that its output should fit the YOLO format. To enrich the dataset, online datasets were combined with the self-made dataset. After multiple rounds of testing and data cleaning, the dataset became better balanced according to the dataset health check, which covers the number of annotations in the 4 classes (vest, not vest, helmet, not helmet), dataset choices, image sizes and heatmaps. In total, 2000 images with 10000 annotations were used to form the final dataset for this project.
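One part of such a health check, counting annotations per class, can be sketched as follows. The sketch reads YOLO-format label text, where the first field of each line is the class id. The mapping of ids to class names and the sample labels are assumptions for illustration, not the project's actual data.

```python
from collections import Counter

# The id order of the four classes is an assumption made for this sketch.
CLASS_NAMES = ("vest", "not vest", "helmet", "not helmet")


def count_annotations(label_texts):
    """Count annotations per class across the contents of YOLO label files."""
    counts = Counter()
    for text in label_texts:
        for line in text.splitlines():
            parts = line.split()
            if len(parts) == 5:  # class id plus four box values
                counts[CLASS_NAMES[int(parts[0])]] += 1
    return counts


# Contents of two made-up label files.
labels = [
    "0 0.5 0.5 0.2 0.3\n2 0.5 0.2 0.1 0.1",
    "1 0.4 0.6 0.2 0.3\n2 0.4 0.3 0.1 0.1\n3 0.7 0.3 0.1 0.1",
]
counts = count_annotations(labels)
for name in CLASS_NAMES:
    print(name, counts[name])
```

A large gap between the counts would suggest collecting or labelling more images of the under-represented class before training.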

For AI model training for detection and recognition of specific objects, the YOLO algorithm was chosen to meet the requirement of a real-time detection system with high recognition accuracy. The best configuration for the training process was found (ref. 4.1.3) after extensive testing to avoid errors such as CUDA out-of-memory errors, time-out errors, environment errors and run-time errors. The major question about model training in this project was whether to use local development or a cloud platform. After testing both, local development offers better time control and the cloud platform offers better file management. Both produce AI models of similar accuracy, since they use the same training algorithm. In my experience, the cloud platform is great for testing at the beginning and local development is great for producing the final product.

For detecting and recognizing personal protective gear, the AI model was trained with different configurations and datasets to look for the best detection and recognition result. Each training run took around 9000 epochs and 15 hours, and around 30 models with different settings were tested. In the end, the best trained model achieved 86.7% overall accuracy for the safety gear in the validation set (ref. 4.2.1). For class precision and confidence, all classes achieved 1.00 precision at 0.804 confidence (ref. 4.2.2). When tested on video to analyse the real-time detection result, the AI model recognized most of the test cases while maintaining 40 FPS (ref. 4.3).

Jonathan Chiu
Marketing Director
3DP Technology Limited

Jonathan handles all external affairs, including business development, patent write-ups and public relations. He is frequently interviewed by the media and is considered a pioneer in 3D printing products.

Krutz Cheuk
Biomedical Engineer
Hong Kong Sanatorium & Hospital

After graduating from OUHK, Krutz obtained an M.Sc. in Engineering Management from CityU. He is now completing his second master's degree, an M.Sc. in Biomedical Engineering, at CUHK. Krutz has a wide range of working experience. He has been with Siemens, VTech, and PCCW.

Hugo Leung
Software and Hardware Engineer
Innovation Team Company Limited

Hugo Leung Wai-yin, who graduated from his four-year programme in 2015, won the Best Paper Award for his ‘intelligent pill-dispenser’ design at the Institute of Electrical and Electronics Engineers’ International Conference on Consumer Electronics – China 2015.

The pill-dispenser alerts patients via sound and LED flashes to pre-set dosage and time intervals. Unlike units currently on the market, Hugo’s design connects to any mobile phone globally. In explaining how it works, he said: ‘There are three layers in the portable pillbox. The lowest level is a controller with various devices which can be connected to mobile phones in remote locations. Patients are alerted by a sound alarm and flashes. Should they fail to follow their prescribed regime, data can be sent via SMS to relatives and friends for follow up.’ The pill-dispenser has four medicine slots, plus a back-up with a LED alert, topped by a 500ml water bottle. It took Hugo three months of research and coding to complete his design, but he feels it was worth all his time and effort.

Hugo’s public examination results were disappointing and he was at a loss about his future before enrolling at the OUHK, which he now realizes was a major turning point in his life. He is grateful for the OUHK’s learning environment, its industry links and the positive guidance and encouragement from his teachers. The University is now exploring the commercial potential of his design with a pharmaceutical company. He hopes that this will benefit the elderly and chronically ill, as well as the society at large.

Soon after completing his studies, Hugo joined an automation technology company as an assistant engineer. He is responsible for the design and development of automation devices. The target is to minimize human labor and increase the quality of products. He is developing products which are used in various sectors, including healthcare, manufacturing and consumer electronics.

Course Code Title Credits
  COMP S321F Advanced Database and Data Warehousing 5
  COMP S333F Advanced Programming and AI Algorithms 5
  COMP S351F Software Project Management 5
  COMP S362F Concurrent and Network Programming 5
  COMP S363F Distributed Systems and Parallel Computing 5
  COMP S382F Data Mining and Analytics 5
  COMP S390F Creative Programming for Games 5
  COMP S492F Machine Learning 5
  ELEC S305F Computer Networking 5
  ELEC S348F IOT Security 5
  ELEC S371F Digital Forensics 5
  ELEC S431F Blockchain Technologies 5
  ELEC S425F Computer and Network Security 5

Course Code Title Credits
  ELEC S201F Basic Electronics 5
  IT S290F Human Computer Interaction & User Experience Design 5
  STAT S251F Statistical Data Analysis 5

Course Code Title Credits
  COMP S333F Advanced Programming and AI Algorithms 5
  COMP S362F Concurrent and Network Programming 5
  COMP S363F Distributed Systems and Parallel Computing 5
  COMP S380F Web Applications: Design and Development 5
  COMP S381F Server-side Technologies and Cloud Computing 5
  COMP S382F Data Mining and Analytics 5
  COMP S390F Creative Programming for Games 5
  COMP S413F Application Design and Development for Mobile Devices 5
  COMP S492F Machine Learning 5
  ELEC S305F Computer Networking 5
  ELEC S363F Advanced Computer Design 5
  ELEC S425F Computer and Network Security 5