YOLO-Enhanced Care Label Reader for Clothing Care and Maintenance

School of Science and Technology 科技學院
Computing Programmes 電腦學系

Tse Long Fung Ivan, Chan Hau Wing Harriet, Cheng Tsz Tsun Issac, Cheng Wai Kit Jacky

Programme	Bachelor of Computing with Honours in Internet Technology; Bachelor of Science with Honours in Computer Science
Supervisor	Dr. Roy Li
Areas	Intelligent Applications
Year of Completion	2024

Objectives

Project Aim

The project aims to address the negative impacts of textile waste by providing a proactive solution that extends the lifespan of clothing items. To achieve this, an Android application incorporating a care label reader and a smart wardrobe feature will be developed.

Project Objectives

High-Precision Care Label Recognition

Train a model using a dataset of 10,000+ images (≥100 per class) based on the GINETEX standard.

Target >80% mAP50 and >90% real-world accuracy in recognizing garment care labels.
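For context on the first target: mAP50 is conventionally the mean average precision computed when a predicted box counts as correct only if it has the right class and overlaps the ground-truth box with an intersection over union (IoU) of at least 0.5. The sketch below shows just that matching rule, not the full mAP computation. The box format `(x_min, y_min, x_max, y_max)` and the function names are assumptions made for illustration.

```python
from typing import Tuple

# Assumed box format: (x_min, y_min, x_max, y_max) in pixels.
Box = Tuple[float, float, float, float]


def iou(a: Box, b: Box) -> float:
    """Intersection over union of two axis-aligned boxes."""
    ix_min, iy_min = max(a[0], b[0]), max(a[1], b[1])
    ix_max, iy_max = min(a[2], b[2]), min(a[3], b[3])
    # Width and height are clamped at zero when the boxes do not overlap.
    inter = max(0.0, ix_max - ix_min) * max(0.0, iy_max - iy_min)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def is_true_positive(pred: Box, truth: Box, same_class: bool,
                     threshold: float = 0.5) -> bool:
    """A prediction counts toward mAP50 only at IoU >= 0.5 with the right class."""
    return same_class and iou(pred, truth) >= threshold
```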

Model Accuracy & Reliability Enhancement

Evaluate the model with 100 real-world tests.

Apply refinements to maintain or exceed 90% real-use accuracy.
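One way to score the 100 real-world tests is sketched below. The source does not define how accuracy is measured, so this assumes an exact-match rule: a test photo counts as correct only when the predicted set of care symbols equals the expected set. The symbol names in the usage note are hypothetical.

```python
from typing import List, Set, Tuple

TARGET_ACCURACY = 0.90  # the real-use target stated in the objectives


def real_world_accuracy(results: List[Tuple[Set[str], Set[str]]]) -> float:
    """Fraction of tests whose predicted symbol set equals the expected set.

    Each entry in `results` is a (predicted, expected) pair for one label photo.
    """
    if not results:
        return 0.0
    correct = sum(1 for predicted, expected in results if predicted == expected)
    return correct / len(results)


def meets_target(results: List[Tuple[Set[str], Set[str]]]) -> bool:
    """True when the measured accuracy reaches the 90% target."""
    return real_world_accuracy(results) >= TARGET_ACCURACY
```

For example, a run in which 92 of 100 photos are recognized exactly would give `real_world_accuracy(...)` equal to 0.92, which meets the target. A stricter or looser rule (such as per-symbol accuracy) would give different numbers for the same predictions.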

Result Categorization Algorithm

Develop logic to group care label outputs.

Suggest laundry instructions (e.g., what can be washed together) to help users care for garments and extend clothing lifespan.

Clothing Registration Interface

Design a user-friendly input system where users scan care labels and manually enter details like color, type, or season.

Allow users to build a personalized clothing inventory.
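A registered garment could be represented by a record like the one sketched below, which combines the scanned care symbols with the manually entered details. The field names, the `GarmentRecord` class, and the `filter_by_season` helper are hypothetical; the source does not describe the actual schema.

```python
from dataclasses import dataclass, asdict
from typing import Dict, List


@dataclass
class GarmentRecord:
    """Hypothetical shape of one wardrobe entry."""
    owner_id: str
    care_symbols: List[str]   # output of the care label reader
    color: str                # entered manually
    garment_type: str         # entered manually
    season: str               # entered manually

    def to_document(self) -> Dict[str, object]:
        """Plain dict form, suitable for a document database."""
        return asdict(self)


def filter_by_season(wardrobe: List[GarmentRecord], season: str) -> List[GarmentRecord]:
    """Return the garments registered for the given season."""
    return [g for g in wardrobe if g.season == season]
```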

Inventory Management System

Create tools for users to manage and view their wardrobe digitally, improving outfit planning and item care.

User Experience Optimization

Collect prototype feedback to refine design, usability, and satisfaction.

Apply findings to align the app with user expectations and behaviours.

Videos

Demonstration Video

Presentation Video

Methodologies and Technologies used

Frontend:

Built using React Native to deliver a responsive and user-friendly mobile interface.

Backend:

Combines a Node.js HTTP server with a Python script to deploy the trained model. Hosted on AWS EC2, ensuring scalability and stable performance.

Database:

Uses MongoDB, a flexible NoSQL database, to manage user accounts and garment registration data with minimal setup complexity.

AI Model:

Implements a YOLOv8-based multilabel classification model trained via Ultralytics’ SDK and fine-tuned with a custom dataset for care label recognition.

Hardware:

An Android phone is used for real-device testing to assess compatibility, performance, and user interaction.

Local Development:

Visual Studio Code is the main IDE, leveraging port forwarding and Expo Go to simplify cross-platform mobile development and streamline testing.

System Design

Figure 1: High-Level System Design Diagram

Figure 2: Component Diagram showcasing the architecture of the application

Figure 3: Dataflow Diagram representing the flow of information of the application

Figure 4: Use-Case Diagram illustrating the functionality of the application

Sign In

Navigation & Input:

User clicks the account icon → redirected to Sign-In page.

Inputs email and password, then clicks “Sign In”.

Validation Check:

If fields are invalid or incomplete → show “Missing field” toast.

If valid → send login API request with {email, password} to backend.
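The validation check above can be sketched as follows. The app's frontend is built with React Native, so this Python version only illustrates the logic. The email pattern and the function names are assumptions; the source states only that invalid or incomplete fields produce the “Missing field” toast.

```python
import re
from typing import Dict, Optional

# Assumed, deliberately loose email shape: something@something.something
EMAIL_PATTERN = re.compile(r"^[^@\s]+@[^@\s]+\.[^@\s]+$")


def validate_sign_in(email: str, password: str) -> Optional[str]:
    """Return the toast message for an invalid form, or None when it is valid."""
    if not email.strip() or not password:
        return "Missing field"
    if not EMAIL_PATTERN.match(email.strip()):
        return "Missing field"
    return None


def build_login_request(email: str, password: str) -> Dict[str, str]:
    """Body of the login API request: {email, password}."""
    return {"email": email.strip(), "password": password}
```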

Backend Processing:

Server checks for the email in the database.

If not found → respond with error, show toast: “No account found”.
If found → compare hashed input password with stored hash.

Authentication Outcome:

If hashes match → return success with user ID and username.
If hashes don't match → return failure, show toast: “Login failed”.
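The backend lookup and hash comparison can be sketched as below. The project's server runs on Node.js and the source does not name a hashing scheme, so salted PBKDF2-SHA256 from the Python standard library is an assumption here, as are the in-memory `USERS` store (standing in for the database) and the `register` helper.

```python
import hashlib
import hmac
import os
from typing import Dict

# In-memory stand-in for the user collection, keyed by email (assumption).
USERS: Dict[str, Dict[str, object]] = {}


def hash_password(password: str, salt: bytes) -> bytes:
    """Derive a password hash with PBKDF2-HMAC-SHA256 (assumed scheme)."""
    return hashlib.pbkdf2_hmac("sha256", password.encode("utf-8"), salt, 100_000)


def register(email: str, username: str, password: str) -> str:
    """Store a new user with a random salt and return the generated user ID."""
    salt = os.urandom(16)
    user_id = f"user-{len(USERS) + 1}"
    USERS[email] = {"user_id": user_id, "username": username,
                    "salt": salt, "hash": hash_password(password, salt)}
    return user_id


def sign_in(email: str, password: str) -> Dict[str, object]:
    """Follow the flow above: look up the email, then compare hashes."""
    user = USERS.get(email)
    if user is None:
        return {"ok": False, "message": "No account found"}
    candidate = hash_password(password, user["salt"])
    # compare_digest avoids leaking information through comparison timing.
    if not hmac.compare_digest(candidate, user["hash"]):
        return {"ok": False, "message": "Login failed"}
    return {"ok": True, "user_id": user["user_id"],
            "username": user["username"], "message": "Login Success"}
```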

Final Response:

On success, the frontend shows the toast: “Login Success”.

The user is then redirected to the User Account page.

Figure 5: Activity diagram of Sign In page

Image Preview

Retrieve Image URI:

After a photo is taken in the Camera activity, the image URI is passed to the Preview activity.

User Confirmation:

If the user wants to retake, the app navigates back to the Camera activity.

If the user confirms, the image is:

Encoded to Base64

Sent to the backend for processing (e.g., label recognition or storage)

Post-Processing:

Once backend handling is complete, the frontend navigates to the Add Garment activity to continue the registration process with the processed image data.
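The encode-and-send step can be sketched as below, showing how image bytes become a Base64 string inside a JSON body and how the receiving side recovers them. The `image` field name and both function names are hypothetical; the source does not specify the request format.

```python
import base64
import json


def build_upload_payload(image_bytes: bytes) -> str:
    """Encode raw image bytes as Base64 and wrap them in a JSON body."""
    encoded = base64.b64encode(image_bytes).decode("ascii")
    return json.dumps({"image": encoded})  # "image" is an assumed field name


def decode_upload_payload(payload: str) -> bytes:
    """Recover the original image bytes on the receiving side."""
    return base64.b64decode(json.loads(payload)["image"])
```

Base64 represents every 3 bytes as 4 ASCII characters, so the request body is roughly a third larger than the original image. That is one reason an app might resize or compress photos before sending them.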

Figure 6: Activity diagram of Image Preview page

Laundry Care

Data Retrieval:

On entry, the page loads a list of registered garments from MongoDB, fetched via a backend API.

Item Selection:

Each garment is displayed with a checkbox.

Users select the laundry items they wish to process.

Item Confirmation Modal:

Upon clicking an icon, a Modal pops up displaying the care label, color, and image of each selected item.

Options:

Next → proceeds to washing suggestions

Close → returns to main page

Washing Suggestion Modal:

Selected item details are sent to the backend, where the Washing Suggestion Algorithm processes them.

A second Modal shows tailored washing instructions.

Options:

Back → returns to confirmation Modal

Close → exits to main page
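The source does not describe how the Washing Suggestion Algorithm works internally. As a purely illustrative sketch of how selected items could be grouped into loads that can be washed together, the code below groups by wash method and color group and then uses the lowest maximum temperature in each group. The `LaundryItem` fields, the grouping rule, and the function name are all assumptions, not the project's actual logic.

```python
from collections import defaultdict
from dataclasses import dataclass
from typing import Dict, List, Tuple


@dataclass
class LaundryItem:
    """Hypothetical per-garment data derived from its care label."""
    name: str
    max_temp_c: int        # maximum wash temperature shown on the label
    hand_wash_only: bool   # True when the label shows the hand-wash symbol
    color_group: str       # e.g. "light" or "dark", entered at registration


def suggest_loads(items: List[LaundryItem]) -> Dict[Tuple[str, str], Dict[str, object]]:
    """Group items by (wash method, color group) and pick a safe temperature."""
    loads: Dict[Tuple[str, str], List[LaundryItem]] = defaultdict(list)
    for item in items:
        method = "hand" if item.hand_wash_only else "machine"
        loads[(method, item.color_group)].append(item)

    suggestions: Dict[Tuple[str, str], Dict[str, object]] = {}
    for key, group in loads.items():
        suggestions[key] = {
            "items": [g.name for g in group],
            # The most delicate item limits the temperature of the whole load.
            "wash_temp_c": min(g.max_temp_c for g in group),
        }
    return suggestions
```

With this rule, a 40 °C shirt and a 60 °C towel of the same color group would share one machine load washed at 40 °C, while a hand-wash-only scarf would be kept separate.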

Navigation Handling:

If the user switches to another feature, the Laundry Care page is closed, and they are redirected accordingly.

Figure 7: Activity diagram of Laundry Care page

Module Design and Hierarchy

Figure 8: Hierarchy of frontend functions

Results (Prototype System Design)

Sign-In and Sign-Up Page

Wardrobe Page

Figure 9: UI design of Sign-In and Sign-Up page

Figure 10: UI design of Wardrobe page

Main UI

Garment Register Page

Laundry Care Page

Figure 11: UI design of main page

Figure 12: UI design of Garment Register page

Figure 13: UI design of Laundry Care page

Camera and Image Preview Page

Label Scanning Camera Page

Figure 14: UI design of Camera and Image Preview page

Figure 15: UI design of Label Scanning Camera page

Conclusion

The project successfully implemented a Retrieval-Augmented Generation (RAG) system powered by LLaMA-2 and enhanced with Semantic Textual Similarity (STS) to provide accurate, context-aware answers within the singing domain. Key achievements included:

Building a specialized knowledge base.

Fine-tuning LLaMA-2 for domain relevance.

Enabling efficient content retrieval and generating high-quality responses.

Establishing and meeting robust evaluation benchmarks.

Limitations

The knowledge base, while rich, lacked full coverage for rare or complex singing topics.

The system was unable to handle multi-turn dialogue, limiting conversational continuity.

Use of top-1 context retrieval restricted the model’s response depth.

Future Development

Shift to a top-k retrieval approach to synthesize insights from multiple sources.

Expand the knowledge base to include wider and deeper content across all singing subfields.

Improve context management for smoother, ongoing conversations.