Development of Tomato Leaf Disease Detection using Single Shot Detector (SSD) Mobilenet V2

Purpose – To create a software prototype of a tomato leaf disease detection model that identifies the condition of a tomato leaf and detects and identifies any disease present in it. Methodology – The object detection model is the Single Shot Detector (SSD) MobileNetV2 model from the TensorFlow 2 Object Detection API. The feature extractor is the pre-trained TF2 MobileNetV2 model, whose weights were trained on the ImageNet dataset. Using the pre-trained TF2 MobileNetV2 as the base Convolutional Neural Network (CNN) for SSD combines the object localization and image classification of SSD with a pre-trained feature extractor. Result – Within the first 1300 of 6000 training steps, the learning rate rose from 0 to 0.7999. It then gradually decreased from 0.7999 to 0.7796. After training, the total loss of the model is 46.95% for evaluation and 45.32% for training. The average recall of the model is


INTRODUCTION
Tomato, scientifically known as Solanum lycopersicum, is one of the major products of the Philippines and other tropical countries (Statista, 2022). The two most prevalent diseases that can affect tomatoes are bacterial spot and the yellow leaf curl virus (Mississippi State University Extension Service, n.d.). These diseases largely affect the development, production, planting, and harvesting of tomato fruits (Gorme et al., 2017). Additionally, these leaf diseases are caused by plant pathogens found in regions around the world with warm temperate climates (Velásquez et al., 2018).
Diagnosing a problem with a tomato plant early makes it possible to stop the disease or condition from spreading, since the spread can be contained to a single leaf if it is treated promptly. Object detection uses deep learning methods to recognize objects such as humans, buildings, and automobiles in images and videos (Patel, 2019; Patel, 2021; Great Learning Team, 2021; Great Learning Team, 2022). It is one of the notable applications of deep learning in computer vision and is used in a variety of applications (Sabrol & Satish, 2016; Habiba & Islam, 2021).

General Objectives
The proponents aim to develop a tomato leaf disease detection model that uses a pre-trained object detection model to identify the condition of a tomato leaf. Additionally, the model will detect and identify the specific disease present in a tomato leaf.

Specific Objectives
• To develop a tomato leaf disease detection model that uses MobileNetV2;
• To distinguish the tomato leaf condition based on two (2) categories: (1) healthy, and (2) unhealthy; and
• To identify the tomato leaf disease under 'unhealthy' into two classifications: (1) Bacterial Spot, and (2) Yellow Leaf Curl Virus.

Scope and Delimitation
The proposed project employs Python as the programming language and MobileNetV2 as the pre-trained model; together they serve as the main framework for the project's object detection. The project is limited to detecting whether a tomato leaf is healthy or unhealthy, and covers only two diseases under the unhealthy condition. The simulation and detection are shown only in the program's output, without a GUI. Figure 1 illustrates the procedural sequence for detecting tomato leaf disease. Each step is explained in detail in the project methodology section.

LITERATURE REVIEW
According to reports, tomatoes were planted on 4.8 million hectares around the world, with a total yield of 161.8 million metric tons in 2012 (FAOSTAT 2012). This corresponds to 4.47% of the total plantation (around 214.6 thousand metric tons) of the Philippines' 16.7 thousand hectares of land (BAS 2015) (Philippine Statistics Authority - Republic of the Philippines, n.d.). The Ilocos Region contributes 34.1% of the country's total tomato output (24.41 thousand metric tons), followed by Central Luzon (12.6%) and Cagayan Valley (10.2%).
Bacterial spot is a major tomato disease globally in terms of economic impact (Emerging Pathogens Institute - University of Florida, n.d.). Due to its nutritional worth and culinary qualities, the tomato (Solanum lycopersicum L.) is the second most significant vegetable crop in the world, following potatoes. Bacterial spot causes losses in field-grown tomatoes. Four species of the bacterial genus Xanthomonas are responsible for tomato bacterial spot. Some of the Xanthomonas strains that cause tomato bacterial spot can frequently also infect peppers. Although they do not infect other plants, these bacteria can persist on the surface of leaves, which helps them survive between growing seasons.

The tomato yellow leaf curl virus (TYLCV) is a widespread virus that affects the majority of tomato leaves (Simone, 2019). It causes a variety of issues in a tomato plant. Aside from the yellowing and curling of leaves, it reduces leaflet size, stunts the growth of the plant, and causes flowers to drop. These symptoms affect the tomato fruits, which cannot get the nutrients they need to reach harvest quality.
Once TYLCV starts, it easily spreads among tomato plants. This can lead to a severe yield loss of up to 100% of the tomato fruit harvest. Such a loss would cause a drop in the supply of local tomatoes in the market (Yong et al., 2019). In turn, this would push local and national governments to source tomatoes internationally to meet demand and to avoid further economic loss for the country.
The detection of plant disease has practical importance in agricultural production. The success of planting and harvesting relies primarily on the scrutiny of a plant's development and growth, to identify its health status. The advancement of computer vision technology has provided the feasibility of detecting plant diseases. The application of an automatic plant disease detection system benefits not only farmers but also the agriculture of an economy.

METHODOLOGY
Using the TensorFlow 2 Object Detection API, the object detection model used is the SSD MobileNetV2 Object Detection model. The feature extractor used by this model is the pre-trained TF2 MobileNetV2 model. MobileNetV2 is a model for image classification (Koech, 2020). The convolutional blocks present in MobileNetV2 are illustrated in Figure 2.
Two distinct block types are present in the MobileNetV2 architecture. The first is a residual block with a stride of 1. The second is a downsizing block with a stride of 2. Both blocks contain the same three layers: the first layer is a 1x1 convolution with ReLU6, the second layer is a depth-wise convolution, and the third layer is another 1x1 convolution, this time without a non-linearity. The residual block allows information to flow from its first layer to its last: when the block takes an image as input, patterns within the image are extracted, and at the end of the block the input is added to the processed output. The downsizing block, because of its stride of 2, reduces the image size while extracting features from the input. Together, these two block types continuously extract features from the image while reducing its size.
SSD, illustrated in Figure 3, is a single convolutional network capable of learning to predict box locations and classify the predicted locations in one single shot. Because object detection is the combination of object localization and image classification, SSD needs only one shot to detect any object within an image. Additionally, the SSD network consists of a base CNN architecture followed by several convolutional SSD layers.
Using the pre-trained TF2 MobileNetV2 as the base CNN for SSD combines the object localization and image classification of SSD with an already accurate pre-trained feature extractor. The resulting model can perform feature extraction and object detection.
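The two MobileNetV2 block types described above can be illustrated by tracing how the shape of a feature map changes through each one. The following is a minimal sketch in plain Python, not the TensorFlow implementation: the `FeatureMap` class, the `bottleneck_block` function, the expansion factor of 6, and the input sizes are all illustrative assumptions. It only tracks shapes and assumes the input is added back at the end of a block when the stride is 1 and the channel count is unchanged.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class FeatureMap:
    """Shape of a feature map (hypothetical helper, not a TensorFlow type)."""
    height: int
    width: int
    channels: int


def bottleneck_block(x, out_channels, stride, expansion=6):
    """Trace a feature-map shape through one MobileNetV2-style block.

    Returns the output shape and whether the residual connection applies.
    """
    if stride not in (1, 2):
        raise ValueError("stride must be 1 or 2")

    # Layer 1: 1x1 convolution with ReLU6 expands the channel count.
    expanded = FeatureMap(x.height, x.width, x.channels * expansion)

    # Layer 2: depth-wise convolution with "same" padding; a stride of 2
    # halves the spatial size (rounded up), a stride of 1 keeps it.
    height = -(-expanded.height // stride)
    width = -(-expanded.width // stride)
    filtered = FeatureMap(height, width, expanded.channels)

    # Layer 3: 1x1 convolution with no activation projects to out_channels.
    out = FeatureMap(filtered.height, filtered.width, out_channels)

    # The input can only be added back when input and output shapes match.
    uses_residual = stride == 1 and x.channels == out_channels
    return out, uses_residual


# Illustrative input size; not taken from the study.
x = FeatureMap(56, 56, 24)
same, skip = bottleneck_block(x, out_channels=24, stride=1)
down, skip_down = bottleneck_block(x, out_channels=32, stride=2)
print("residual block:  ", same, "residual:", skip)
print("downsizing block:", down, "residual:", skip_down)
```

In this sketch the stride-1 block preserves the spatial size and keeps the residual connection, while the stride-2 block halves the height and width and has no residual connection, which matches the feature extraction with progressive downsizing described in the text.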

RESULTS
There are 10 images of tomato leaves to be identified by the model. Of the tomato leaf images used, 3 had "Tomato Bacterial Spot", 3 had "Tomato Yellow Leaf Curl Virus", and the remaining 4 were healthy tomato leaves. The model returned results with accuracies of at least 97% and up to 100%, and correctly identified all 10 test images. Refer to Table 1.
The total loss is the sum of the classification loss, localization loss, and regularization loss. The total loss of the model is 46.95% for evaluation and 45.32% for training. Refer to Figure 5 for additional information.
Figure 5. Total Loss
Average Recall is used to measure the assertiveness of the object detection model for a specific class. Figure 6 shows the average recall of the model, which is 94.51%.

Figure 6. Average Recall
The learning rate controls how large each update to the model's weights is, and therefore how quickly the model abandons patterns it has learned in order to learn new ones. Figure 7 shows the learning rate of the model. It shows that within the first 1300 steps of training, the model learned patterns and concepts about the three leaf conditions very quickly.

Training and Validation Loss
The model's classification loss is 4.3% for evaluation and 2.7% for training, indicating a 4.3% loss when comparing the predicted class of each bounding box with its labeled class. The model's localization loss is 0.89% for evaluation and 0.52% for training, indicating a 0.89% difference between the predicted and labeled bounding boxes. The model has a regularization loss of 41.73% for evaluation and 42.11% for training. The total loss of the model is 46.95% for evaluation and 45.32% for training.
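Since the total loss is described as the sum of the classification, localization, and regularization losses, the reported figures can be cross-checked with simple arithmetic. The short sketch below uses only the values stated in this section; the dictionary layout and the `summed_total` helper are illustrative names, not part of the study's code.

```python
# Loss values as reported in this section, in percent.
reported = {
    "evaluation": {
        "classification": 4.3,
        "localization": 0.89,
        "regularization": 41.73,
        "total": 46.95,
    },
    "training": {
        "classification": 2.7,
        "localization": 0.52,
        "regularization": 42.11,
        "total": 45.32,
    },
}


def summed_total(parts):
    """Add the three loss components and round to two decimals."""
    total = parts["classification"] + parts["localization"] + parts["regularization"]
    return round(total, 2)


for split, parts in reported.items():
    total = summed_total(parts)
    gap = round(abs(total - parts["total"]), 2)
    print(f"{split}: summed={total:.2f} reported={parts['total']:.2f} gap={gap:.2f}")
```

The summed components differ from the reported totals by only a few hundredths of a percentage point, which is consistent with the individual values having been rounded before reporting.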

Average Recall
In this study, average recall is the recall of detections in an image, averaged over the three leaf conditions. As an example, given a total of 100 tomato leaves with different conditions (some healthy, some with bacterial spot, and some with yellow leaf curl virus), the model can correctly detect the actual leaf condition of 94 to 95 of those leaves. In other words, across all the tomato leaf images evaluated, the model correctly recognizes a healthy tomato leaf, a tomato leaf with bacterial spot, and a tomato leaf with yellow leaf curl virus 94.51% of the time.
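The idea of recall averaged over the three leaf conditions can be sketched as follows. This is a simplified illustration, not the evaluation code used in the study: the per-class counts are hypothetical numbers chosen to resemble the 100-leaf example above, and the function names are assumptions. COCO-style average recall, as commonly reported by object detection toolkits, additionally averages over several bounding-box overlap thresholds, which this sketch leaves out.

```python
def recall(true_positives, false_negatives):
    """Fraction of actual instances of a class that the model detected."""
    actual = true_positives + false_negatives
    return true_positives / actual if actual else 0.0


def average_recall(counts):
    """Average the per-class recall over all classes.

    counts maps a class name to (true_positives, false_negatives).
    """
    per_class = {name: recall(tp, fn) for name, (tp, fn) in counts.items()}
    mean = sum(per_class.values()) / len(per_class)
    return mean, per_class


# Hypothetical counts for 100 leaves; not the study's evaluation data.
hypothetical_counts = {
    "healthy": (38, 2),
    "bacterial_spot": (28, 2),
    "yellow_leaf_curl_virus": (28, 2),
}

mean_recall, per_class = average_recall(hypothetical_counts)
for name, value in per_class.items():
    print(f"{name}: {value:.4f}")
print(f"average recall: {mean_recall:.4f}")
```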

Learning Rate
Within the first 1300 steps of training, the learning rate of the model rose from 0 to 0.7999. It then gradually decreased from 0.7999 to 0.7796. Figure 7 shows that within the first 1300 steps of training, the model learned patterns and concepts about the three leaf conditions very quickly. It then stabilizes afterward, which indicates a desirable learning rate for a model, since the model maintained the learned patterns while still being able to adjust slightly to subtle changes.
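A rise followed by a slow decline is the shape produced by a warmup phase followed by cosine decay, a learning rate schedule available in the TensorFlow 2 Object Detection API. The text does not state which schedule was configured, so the sketch below is an assumption, and the parameter values (a base rate of 0.8, 1300 warmup steps, a warmup start of 0, and 50000 total schedule steps) are illustrative choices that are not tuned to reproduce the reported figures.

```python
import math


def warmup_cosine_lr(step, base_lr, total_steps, warmup_lr, warmup_steps):
    """Linear warmup to base_lr, then cosine decay toward zero."""
    if step < warmup_steps:
        # Warmup: move linearly from warmup_lr to base_lr.
        slope = (base_lr - warmup_lr) / warmup_steps
        return warmup_lr + slope * step
    # Decay: follow half a cosine wave from base_lr down to zero.
    progress = (step - warmup_steps) / (total_steps - warmup_steps)
    progress = min(progress, 1.0)
    return 0.5 * base_lr * (1.0 + math.cos(math.pi * progress))


# Hypothetical schedule parameters; the study's actual values are not given.
params = dict(base_lr=0.8, total_steps=50000, warmup_lr=0.0, warmup_steps=1300)

for step in (0, 650, 1300, 3000, 6000):
    print(step, round(warmup_cosine_lr(step, **params), 4))
```

Under this kind of schedule, the early rise and later decline of the curve are set by the configuration rather than by the data, so a plot of the learning rate mainly confirms that the schedule was applied as intended.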

Mean Average Precision
Of all the model's predictions of a given class (healthy, with yellow leaf curl virus, or with bacterial spot) on the tomato leaf images it tested, 92 to 93 out of every 100 are correct.
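To make the metric concrete, the sketch below computes average precision for one class from a ranked list of detections and then averages it over classes. It is a simplified illustration under stated assumptions: it takes the area under the interpolated precision-recall curve at a single overlap threshold, whereas COCO-style mAP also averages over several thresholds. The function names and the detection lists are hypothetical and are not the study's data.

```python
def average_precision(detections, num_ground_truth):
    """Area under the interpolated precision-recall curve for one class.

    detections is a list of (confidence, is_true_positive) pairs.
    """
    if num_ground_truth == 0:
        return 0.0
    ranked = sorted(detections, key=lambda d: d[0], reverse=True)

    precisions, recalls = [], []
    tp = fp = 0
    for _, is_tp in ranked:
        if is_tp:
            tp += 1
        else:
            fp += 1
        precisions.append(tp / (tp + fp))
        recalls.append(tp / num_ground_truth)

    # Interpolate: precision at each rank becomes the best precision
    # achieved at that rank or any later one.
    for i in range(len(precisions) - 2, -1, -1):
        precisions[i] = max(precisions[i], precisions[i + 1])

    # Sum the rectangles under the step-wise curve.
    ap = 0.0
    previous_recall = 0.0
    for p, r in zip(precisions, recalls):
        ap += (r - previous_recall) * p
        previous_recall = r
    return ap


def mean_average_precision(per_class):
    """per_class maps a class name to (detections, num_ground_truth)."""
    scores = [average_precision(d, n) for d, n in per_class.values()]
    return sum(scores) / len(scores)


# Hypothetical detections: (confidence, matched a labeled leaf of this class).
hypothetical = {
    "healthy": ([(0.99, True), (0.97, True)], 2),
    "bacterial_spot": ([(0.9, True), (0.8, False), (0.7, True)], 2),
}
print(f"mAP: {mean_average_precision(hypothetical):.4f}")
```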

CONCLUSIONS AND RECOMMENDATIONS
This project developed a model that uses TF2 MobileNetV2 as the pre-trained feature extractor and adds SSD layers on top of it. The resulting model can perform object detection to localize a tomato leaf and identify its condition. Within the first 1300 of 6000 training steps, the learning rate rose from 0 to 0.7999. It then gradually decreased from 0.7999 to 0.7796. After training, the total loss of the model is 46.95% for evaluation and 45.32% for training. The average recall of the model is 94.51%. The mAP of the model is 92.45%. The prototype can be improved further by using other datasets with a larger sample of tomato leaf images and by including other types of tomato leaf diseases, which would improve the model's results and broaden the project's scope. Furthermore, a mobile application for real-time plant detection is recommended to improve the user-friendliness and usability of the project.

IMPLICATIONS
The study aims to help people identify the tomato leaf diseases found in tomato plantations in the Philippines, namely bacterial spot and yellow leaf curl virus. By identifying these diseases accurately, people can spot them early, remedy them correctly, salvage the remaining tomato plants, and avoid these diseases more effectively in the future. By providing a means to identify tomato leaf diseases, this study can help speed up identification and so support the development, production, planting, and harvesting of tomato fruits.