To develop a real-time system for detecting personal safety gear on construction sites, a quadruped robot will be tested to record patrol videos, which will then be exported for labelling to build the datasets. Model training will run on a cloud platform to lower cost and provide better risk management. The trained model should achieve at least 80 percent accuracy after tuning and research. The research process, the experiments and the resulting model will provide a valuable reference for object detection in construction safety development, as well as for the use of quadruped robots in data collection and patrolling.
For image pre-processing, this project labelled images to produce a self-made dataset after testing multiple labelling tools. The major concern in choosing a labelling tool was that its output should fit the YOLO format. To enrich the dataset, an online dataset was combined with the self-made dataset. After repeated testing and data cleaning, the dataset became better balanced, as confirmed by reviewing the dataset health check, which covers the number of annotations for the 4 classes (vest, not vest, helmet, not helmet), dataset choices, image sizes and heatmaps. In total, 2000 images with 10000 annotations form the final dataset for this project.
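As an illustration of the labelling requirement described above, the following is a minimal sketch of how a pixel-space bounding box maps to a YOLO label line (class id followed by the normalised centre, width and height), and how annotations per class could be counted to check balance. The function names and the numeric order of the four classes are assumptions made for this example; they are not taken from the project's labelling tool or health check.

```python
# Class order is an assumption for illustration; the report names the four
# classes but does not state their numeric ids.
CLASS_NAMES = ["vest", "not vest", "helmet", "not helmet"]


def to_yolo_line(class_id, x_min, y_min, x_max, y_max, img_w, img_h):
    """Convert a pixel-space corner box to one YOLO label line.

    YOLO labels store: class_id x_center y_center width height,
    with all four box values normalised to the range [0, 1].
    """
    x_center = (x_min + x_max) / 2 / img_w
    y_center = (y_min + y_max) / 2 / img_h
    width = (x_max - x_min) / img_w
    height = (y_max - y_min) / img_h
    return f"{class_id} {x_center:.6f} {y_center:.6f} {width:.6f} {height:.6f}"


def count_annotations(label_lines):
    """Count annotations per class name from YOLO label lines."""
    counts = {name: 0 for name in CLASS_NAMES}
    for line in label_lines:
        parts = line.split()
        if not parts:
            continue  # skip blank lines
        counts[CLASS_NAMES[int(parts[0])]] += 1
    return counts


if __name__ == "__main__":
    # A hypothetical helmet box in a 640x480 image.
    print(to_yolo_line(2, 100, 50, 300, 250, 640, 480))
    print(count_annotations(["0 0.5 0.5 0.1 0.1", "2 0.3 0.3 0.2 0.2"]))
```

A count like this gives only the per-class totals; the other health check items mentioned above (image sizes, heatmaps) would need separate tooling.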
For AI model training for detection and recognition of specific objects, the YOLO algorithm was chosen to meet the requirement of a real-time detection system with high recognition accuracy. The best configuration for the training process was found (ref. 4.1.3) after extensive testing to avoid errors such as CUDA out-of-memory errors, time-out errors, environment errors and run-time errors. The major concern about model training in this project was whether to use local development or a cloud platform. After testing both platforms, local development was found to offer better time control, while the cloud platform offers better file management. Both produce AI models of similar accuracy since they use the same training algorithm. In my experience, a cloud platform is well suited to testing at the beginning, and local development is well suited to producing the final product.
For detecting and recognizing personal protective gear, the AI model was trained with different configurations and datasets to find the model with the best detection and recognition results. Each training run took around 9000 epochs and 15 hours, and around 30 models with different settings were tested. In the end, the best trained model reached 86.7% overall accuracy for the safety gear in the validation set (ref. 4.2.1). For per-class precision and confidence, all classes achieved 1.00 precision at a confidence of 0.804 (ref. 4.2.2). When tested on video to analyse real-time detection results, the AI model recognized most of the test cases while maintaining 40 FPS (ref. 4.3).
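To make the two metrics above concrete, here is a minimal sketch of how precision at a confidence threshold and frames per second could be computed. It is not the project's evaluation code: the function names are hypothetical, the sample detections are made up for illustration, and matching each detection to ground truth (for example by IoU) is assumed to have been done already.

```python
import time


def precision_at_confidence(detections, threshold):
    """Precision over detections whose confidence is >= threshold.

    `detections` is a list of (confidence, is_true_positive) pairs.
    Returns None when no detection passes the threshold.
    """
    kept = [is_tp for conf, is_tp in detections if conf >= threshold]
    if not kept:
        return None
    return sum(kept) / len(kept)  # true positives / all kept detections


def measure_fps(process_frame, frames):
    """Average frames per second of `process_frame` over `frames`."""
    start = time.perf_counter()
    n = 0
    for frame in frames:
        process_frame(frame)
        n += 1
    elapsed = time.perf_counter() - start
    return n / elapsed if elapsed > 0 else float("inf")


if __name__ == "__main__":
    # Made-up detections: (confidence, matched a ground-truth box?)
    sample = [(0.95, True), (0.85, True), (0.70, False), (0.60, True)]
    print(precision_at_confidence(sample, 0.804))
    print(precision_at_confidence(sample, 0.5))
    # Stand-in for a detector call; a real run would pass model inference.
    print(measure_fps(lambda frame: None, range(1000)))
```

Raising the threshold discards low-confidence detections, which is why precision can reach 1.00 at a high enough confidence; this usually comes at the cost of missed objects, so precision at a threshold is best read together with recall.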