The Darknet YOLO model must be trained on a Linux platform.
- Install OpenCV from here, following the detailed installation steps.
- Install Darknet from here.
- To compile with OpenCV (a quick sanity check for this step appears after this list), edit the `Makefile` by:
  - changing `OPENCV=0` to `OPENCV=1`
  - changing line 45 from ``LDFLAGS+= `pkg-config --libs opencv` -lstdc++`` to ``LDFLAGS+= `pkg-config --libs opencv4` -lstdc++``
  - changing line 46 from ``COMMON+= `pkg-config --cflags opencv` `` to ``COMMON+= `pkg-config --cflags opencv4` ``
- Finally, `cd src` and modify `image_opencv.cpp` by:
  - adding the missing header files:
    ```cpp
    #include "opencv2/core/core_c.h"
    #include "opencv2/videoio/legacy/constants_c.h"
    #include "opencv2/highgui/highgui_c.h"
    ```
  - changing the line `IplImage ipl = m;` to `IplImage ipl = cvIplImage(m);`
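Before building, it can help to confirm that `pkg-config` actually resolves the `opencv4` module referenced in the edited Makefile lines. A minimal Python sketch (assuming `pkg-config` is on your `PATH`; this helper is not part of the repo):

```python
# Sanity check (a sketch): ask pkg-config for the version of each OpenCV
# module name used in the Makefile; "opencv4" should resolve after an
# OpenCV 4 install, while the old "opencv" name may not exist.
import subprocess

for module in ("opencv4", "opencv"):
    result = subprocess.run(
        ["pkg-config", "--modversion", module],
        capture_output=True, text=True,
    )
    if result.returncode == 0:
        print(f"{module}: version {result.stdout.strip()}")
    else:
        print(f"{module}: not found by pkg-config")
```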
- Videos for training can be found on the Aquadrone Google Drive.
- The `Timestamps` folder contains CSV files with start and end times, as well as labels, for objects that appear in the videos.
- The timestamps are used by `Data Collection/split_video.py` (from the original aquadrone-vision repo) to produce images of objects from the videos that can be used to train the neural network.

```
python split_video.py --directory <directory path to store frames> --video <path to video file> --file <path to timestamp file> [--compress] [-n <number of frames per second>]
```
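For reference, the core of a timestamp-driven frame extractor can look like the sketch below. The CSV column names (`start`, `end`, `label`) are assumptions here; the actual `split_video.py` in the repo is the authority.

```python
# A sketch of timestamp-driven frame extraction, assuming a CSV with
# "start"/"end" times in seconds and a "label" column per row.
import csv
import os
import cv2

def extract_frames(video_path, timestamp_csv, out_dir, frames_per_second=1):
    cap = cv2.VideoCapture(video_path)
    os.makedirs(out_dir, exist_ok=True)
    with open(timestamp_csv, newline="") as f:
        for row in csv.DictReader(f):
            t, end = float(row["start"]), float(row["end"])
            while t < end:
                # Seek to the timestamp (in milliseconds) and grab one frame.
                cap.set(cv2.CAP_PROP_POS_MSEC, t * 1000)
                ok, frame = cap.read()
                if not ok:
                    break
                name = f"{row['label']}_{t:.2f}.jpg"
                cv2.imwrite(os.path.join(out_dir, name), frame)
                t += 1.0 / frames_per_second
    cap.release()
```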
Before training, each image must be annotated in a corresponding .txt file with one line per object, in the following format (a conversion sketch follows the field descriptions):

```
<object_class> <x_center> <y_center> <width> <height>
```

- `object_class` is a number between 0 and (total number of classes - 1) that identifies which object the annotation belongs to.
- `x_center` is the x coordinate of the center of the object divided by the image width.
- `y_center` is the y coordinate of the center of the object divided by the image height.
- `width` is the width of the object divided by the image width.
- `height` is the height of the object divided by the image height.
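For illustration, here is a small hypothetical helper (not part of the repo) that converts a pixel-space box into a line in this format:

```python
# Convert a pixel-space bounding box (x_min, y_min, x_max, y_max) into a
# Darknet/YOLO annotation line with all values normalized to [0, 1].
def to_yolo_line(object_class, box, image_width, image_height):
    x_min, y_min, x_max, y_max = box
    x_center = (x_min + x_max) / 2 / image_width
    y_center = (y_min + y_max) / 2 / image_height
    width = (x_max - x_min) / image_width
    height = (y_max - y_min) / image_height
    return f"{object_class} {x_center:.6f} {y_center:.6f} {width:.6f} {height:.6f}"

# Example: class 0, a 100x200 px box with top-left (50, 80) in a 640x480 image
print(to_yolo_line(0, (50, 80, 150, 280), 640, 480))
# -> "0 0.156250 0.375000 0.156250 0.416667"
```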
Yolo_mark and LabelImg are two good annotation tools for producing these files.
The data augmentation app must be run on a Windows platform.
To install the required packages, run `pip install -r requirements.txt` in the root of the repo.
A GUI has been made to facilitate the process of augmenting images. It uses the Albumentations library; demos of data augmentation techniques can be found here.
The GUI supports the following data augmentation techniques (an Albumentations usage sketch follows the list):

- Horizontal Flip
- Motion Blur
- ISO Noise
- Rotate
- Cutout
- Crop
- RGB Shift
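For reference, the same kinds of augmentations can be applied with Albumentations directly; a minimal sketch (the file names and probabilities are illustrative, not from the repo):

```python
# Apply a few of the listed augmentations with Albumentations; bbox_params
# keeps YOLO-format bounding boxes in sync with the transformed image.
import albumentations as A
import cv2

transform = A.Compose(
    [
        A.HorizontalFlip(p=0.5),
        A.MotionBlur(p=0.2),
        A.ISONoise(p=0.2),
        A.Rotate(limit=15, p=0.5),
        A.RGBShift(p=0.2),
    ],
    bbox_params=A.BboxParams(format="yolo", label_fields=["class_labels"]),
)

image = cv2.imread("example.jpg")           # illustrative input image
bboxes = [(0.5, 0.375, 0.15625, 0.416667)]  # one YOLO box: x_c y_c w h
augmented = transform(image=image, bboxes=bboxes, class_labels=[0])
cv2.imwrite("example_aug.jpg", augmented["image"])
print(augmented["bboxes"])  # transformed boxes, still YOLO-normalized
```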
To run the app, navigate to the `Data Augmentation` subdirectory and run `python aug_app.py`.
The collected data must be split into three subsets: a training set (to train the model), a validation set (to evaluate the model and tune the training process), and a test set (for a final check of the trained model). The `divide_data.py` script can help automate this division.
To run the script, use:

```
python divide_data.py --directory <directory path to store image sets> --images <path to where all images and bounding boxes are stored> --test <percent of images to use in test set> --train <percent of images to use in training set> --valid <percent of images to use in the validation set>
```
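The division itself is straightforward; below is a sketch of what such a script can do (the `divide_data.py` in the repo is the authority; the file extensions and copy-vs-move behavior are assumptions):

```python
# Shuffle image/annotation pairs and copy them into train/valid/test
# subdirectories according to the requested percentages.
import os
import random
import shutil

def divide_data(images_dir, out_dir, train_pct, valid_pct):
    stems = sorted({os.path.splitext(f)[0] for f in os.listdir(images_dir)
                    if f.lower().endswith((".jpg", ".png"))})
    random.shuffle(stems)
    n_train = int(len(stems) * train_pct / 100)
    n_valid = int(len(stems) * valid_pct / 100)
    splits = {
        "train": stems[:n_train],
        "valid": stems[n_train:n_train + n_valid],
        "test": stems[n_train + n_valid:],  # remainder goes to the test set
    }
    for split, names in splits.items():
        split_dir = os.path.join(out_dir, split)
        os.makedirs(split_dir, exist_ok=True)
        for stem in names:
            # Copy the image and its matching .txt annotation, if present.
            for ext in (".jpg", ".png", ".txt"):
                src = os.path.join(images_dir, stem + ext)
                if os.path.exists(src):
                    shutil.copy(src, split_dir)
```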