How to Train YOLO Model to Detect Distracted Drivers
Distracted driving is any activity that diverts a driver's attention while operating a motor vehicle. This includes texting, talking on the phone, drinking, doing hair and makeup, fiddling with the stereo or radio, and talking to fellow passengers.
Distracted driving is one of the leading causes of death on US roads. According to the National Highway Traffic Safety Administration (NHTSA), distracted driving killed 3,142 people in the US in 2020 (source: https://www.nhtsa.gov/risky-driving/distracted-driving).
In this short article, we will explore a computer vision and machine learning technique that can detect driver distraction in real time. It is our hope that such a system, when implemented in practice, will help save thousands of lives.
I will illustrate how to train a YOLO model from a labeled set of images. You Only Look Once (YOLO) is a popular state-of-the-art object detection algorithm that can efficiently detect objects within an image. YOLO v5 is an open-source implementation of YOLO by Ultralytics; see the GitHub repository for more details: https://github.com/ultralytics/yolov5.
Dataset: For this article, we will use an already labeled image set that is publicly and freely available on Roboflow at https://universe.roboflow.com/ipylot-project/distracted-driving-v2wk5. You may have to create an account to access the labeled images.
The dataset contains about 7,000 images in the training set, 1,000 in the validation set, and 1,000 in the test set. There are 12 classes, labeled 0 through 11, mapped as follows:
0: Safe Driving
1: Texting
2: Talking on the phone
3: Operating the Radio
4: Drinking
5: Reaching Behind
6: Hair and Makeup
7: Talking to Passenger
8: Eyes Closed
9: Yawning
10: Nodding Off
11: Eyes Open
The class distribution is shown below in Figure 1.0.
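When interpreting the model's output later, it can be handy to keep this class mapping in code. A minimal Python sketch of the mapping above:

# Class IDs (0 through 11) mapped to human-readable labels.
CLASS_NAMES = {
    0: "Safe Driving", 1: "Texting", 2: "Talking on the phone",
    3: "Operating the Radio", 4: "Drinking", 5: "Reaching Behind",
    6: "Hair and Makeup", 7: "Talking to Passenger", 8: "Eyes Closed",
    9: "Yawning", 10: "Nodding Off", 11: "Eyes Open",
}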
Downloading the Dataset: Visit the above URL and click on the “Download this Dataset” button. Select “YOLO v5 PyTorch” format from the dropdown options and check “Download zip to computer” and click the “Continue” button. See Figure 1.1 below for an example.
If you want to label your own images using another annotation tool, such as LabelImg, follow the guidelines in section 1.2 of this page: https://github.com/ultralytics/yolov5/wiki/Train-Custom-Data#12-create-labels-1.
Ensure your directory structure of the labeled images looks like the one shown below in Figure 1.2.
The images directory contains the actual .png or .jpg images. The labels directory contains a .txt file corresponding to each image in the images directory. Each .txt file has exactly the same name as its corresponding image file, except for the extension.
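As a quick sanity check before training, you can verify that every image has a matching label file. A short Python sketch, assuming the images/labels layout described above (the driving/train path is only an example):

# Check that each image in images/ has a corresponding .txt in labels/.
from pathlib import Path

split_dir = Path("driving/train")  # example path; adjust to your dataset
images = sorted((split_dir / "images").glob("*.jpg")) + sorted((split_dir / "images").glob("*.png"))
missing = [img.name for img in images
           if not (split_dir / "labels" / img.with_suffix(".txt").name).exists()]
print(f"{len(images)} images, {len(missing)} without a label file")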
The *.txt file specifications are:
- One row per object.
- Each row is in class x_center y_center width height format.
- Box coordinates must be in normalized xywh format (from 0 to 1). If your boxes are in pixels, divide x_center and width by the image width, and y_center and height by the image height.
- Class numbers are zero-indexed (start from 0).
If an image contains two objects, the label .txt file will contain two lines, as shown in Figure 1.3.
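To make the normalization concrete, here is a small Python sketch that converts a pixel-space box (x_min, y_min, x_max, y_max) into a YOLO label line; the numbers are made up for illustration:

def to_yolo_line(cls, x_min, y_min, x_max, y_max, img_w, img_h):
    # Convert pixel coordinates to normalized xywh and format one label row.
    x_center = (x_min + x_max) / 2 / img_w
    y_center = (y_min + y_max) / 2 / img_h
    width = (x_max - x_min) / img_w
    height = (y_max - y_min) / img_h
    return f"{cls} {x_center:.6f} {y_center:.6f} {width:.6f} {height:.6f}"

# A 200x300-pixel box at (100, 50) in a 640x480 image, class 1 (Texting):
print(to_yolo_line(1, 100, 50, 300, 350, 640, 480))
# -> 1 0.312500 0.416667 0.312500 0.625000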
For the purpose of this article, we assume that you have downloaded the labeled images from Roboflow to your local computer as a zip file.
YOLO v5 Model Training Tool: We will explore how to use Momentum AI's computer vision training tool to train the YOLO v5 model to detect driver distraction. Create an account by visiting https://one.accure.ai:5555/ and log in with your username and password.
Uploading Labeled Data to Momentum: Scroll down to locate "Data Management" in the left-side menu and click it to launch the page for uploading the zip file that you downloaded from Roboflow.
Enter the name of the directory to which you want to upload the zip file containing images and annotations. For example, I entered the directory name "driving", as shown in the screenshot in Figure 1.4 below. To upload, drag and drop the zip file from your local computer onto the designated area on this page. Wait until the file is fully uploaded.
After the file is fully uploaded, expand the top level directory, and then the directory you just uploaded the file to. You should see the expanded form of the directory structure that should look something like the one shown in Figure 1.5 below.
Training YOLO v5 Model: From the left menu list, click “Train New Model” option and select “YOLOv5 Object Detection”. This will open a form to configure your YOLO model parameters. Figure 1.6 shows a sample training configuration. The form fields are explained below.
Name: A user-defined name.
Description: Give a detailed description of the model and its purpose.
model_name: Give a meaningful name. This will be the name of the model file after the training is completed.
model_version: Specify a version number
train_dir: Path to the training directory that contains the images and labels subdirectories. For our example, we have the path as driving/train.
validation_dir: The path to the validation directory.
test_dir: The path to the test directory.
model_output_dir: This is the path where the trained model will be stored. For our example, we entered driving/output.
num_classes: Specify the number of object classes you are training to detect. In our case, we have 12 classes.
class_names: Comma-separated list of class names, for example: c0 - Safe Driving, c1 - Texting, c2 - Talking on the phone, c3 - Operating the Radio, c4 - Drinking, c5 - Reaching Behind, c6 - Hair and Makeup, c7 - Talking to Passenger, d0 - Eyes Closed, d1 - Yawning, d2 - Nodding Off, d3 - Eyes Open
image_size: Specify the dimension to which all images will be resized. This depends on the transfer learning model selected in the next field: if we select YOLOv5 with input size 640x640, the image size should be 640; if we select YOLOv5 with input size 1280x1280, it should be 1280. In our case, we will use YOLOv5 with input size 640x640 for transfer learning, and hence image_size will be 640.
epochs: The maximum number of training epochs, that is, full passes over the training set.
batch_size: The number of images in a single training batch. Enter an appropriate batch size based on your hardware memory. Since we are running this example on a small GPU machine, we use a batch size of 4, but you should consider a larger batch size, such as 32, 64, or higher.
transfer_learning_using: Select the appropriate YOLO pre-trained model.
cache_images: Select Yes if you have large enough memory to keep all images in cache to speed up the training. Otherwise, select No.
Click the Submit button to save the model configuration.
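The "YOLO v5 PyTorch" export from Roboflow typically includes a data.yaml file listing the class count and names, so you can cross-check the num_classes and class_names you entered in the form. A quick Python sketch (the driving/data.yaml path is an example; requires the PyYAML package):

# Print the class count and names from the Roboflow export's data.yaml.
import yaml  # pip install pyyaml

with open("driving/data.yaml") as f:  # example path; adjust to your export
    cfg = yaml.safe_load(f)
print("num_classes:", cfg["nc"])                # should be 12 for this dataset
print("class_names:", ", ".join(cfg["names"]))  # compare with the form field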
Starting the YOLO Model Training: After saving, the page will transition to show the model details. You can also navigate to this page by expanding “My Models” on the left menu list and clicking on the model name.
On the model detail page, click on the green “Start” button to start the model.
It will take a while to transfer all the images and annotations to the training cluster, and during that time you might see a message saying the model training has failed. Ignore that "Failed" message, especially within the first few minutes after launching the model. Keep refreshing the page until the status shows "Running" with the spinner spinning.
Monitoring the Training: As shown in Figure 1.7 below, the model detail page shows the training status and the latest 1000 lines of the training logs.
The monitoring screen also shows various types of losses and metrics as shown in Figure 1.8 below.
Displaying Evaluation Results: On the model details and monitoring page, click the “Evaluate” button to display the evaluation result. If the model is still training, you will see the training evaluations only. After the model is done training, it will show the evaluation based on the test data. Here is the screenshot (Figure 1.9) showing the evaluation results. Click on the thumbnail images to see the enlarged view.
After the model is fully trained, clicking on the “Evaluate” button will show thumbnails of evaluation results. Clicking on the thumbnail shows an enlarged view of the evaluation result (Figure 1.10 below).
Downloading the Trained Model: Click the "Download Model" button to download the model in ONNX format. The download may take a while, so wait until the browser spinner stops spinning and the model file has finished downloading.
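Before wiring the model into an application, you can quickly confirm that the downloaded ONNX file loads and inspect its input shape. A minimal sketch using the onnxruntime package (pip install onnxruntime; the file name is an example):

# Load the downloaded ONNX model and print its input/output signatures.
import onnxruntime as ort

session = ort.InferenceSession("distracted_driver.onnx")  # example file name
for inp in session.get_inputs():
    print("input:", inp.name, inp.shape)   # e.g. [1, 3, 640, 640] for the 640 model
for out in session.get_outputs():
    print("output:", out.name, out.shape)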
Using the Trained Model for Inference: After you download the model, save it to a location on your local computer or server.
1. Download the latest YOLO v5 source from GitHub using the command:
git clone https://github.com/ultralytics/yolov5
2. Install the dependencies using the command:
cd yolov5
pip install -r requirements.txt
3. After all the requirements are successfully installed, go back to the directory from which you cloned the repository and run detect.py:
python yolov5/detect.py --weights path_to_onnx_file --source path_to_input_images --project path_to_save_output
In the above command, the arguments should be changed to match the paths on your computer or server.
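For example, assuming the model file was saved under models/ and the test images extracted to driving/test/images (hypothetical paths), a concrete invocation might look like:

python yolov5/detect.py --weights models/distracted_driver.onnx --source driving/test/images --project runs/distracted_driving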
If the command runs successfully, the output images, with bounding boxes drawn around the detected objects, will be stored under the specified output path.
References:
- Ansari, S. (2020). Building Computer Vision Applications Using Artificial Neural Networks. Apress. doi: 10.1007/978-1-4842-5887-3_4. https://link.springer.com/book/10.1007/978-1-4842-5887-3
- https://github.com/ultralytics/yolov5