Price Prediction with Bidirectional Long Short-Term Memory Algorithm for High-Value Commercial Crops


INTRODUCTION
The Food and Agriculture Organization defines food security as the condition in which all people, at all times, have physical, social, and economic access to sufficient, safe, and nutritious food (Barrett, 2010; Havas, 2011). Programs addressing food security have adopted various approaches, including improved seed quality, good management practices, and prior knowledge of the expected yield. Because agricultural productivity, hunger, poverty, and sustainability are directly related, increasing agricultural productivity and adopting a systematic marketing strategy that adequately balances supply and demand are important (Chawla, 2016; Jain et al., 2018).
Farmers and policymakers need price information for efficient agricultural development, for deciding on important related issues, and for competing in the vegetable market. Borate et al. (2016) also assert that data mining plays a crucial role in decision-making on issues related to agriculture.
The price of vegetables is volatile and changes frequently, which is a primary impediment to the continued and steady growth of vegetable production. Scientific methods are therefore needed to investigate how vegetable prices change and to predict their trend quickly and accurately. Such work has a significant impact on vegetable output, government regulation, and the stability of the vegetable business (Luo et al., 2010).
As the world's population increases, so does agri-food demand, requiring a shift away from traditional agricultural practices toward smart agriculture. Even though agriculture is regarded as one of the most important industries in sustaining food security, studies and software development that incorporate technology into agriculture are still few (Abbasi et al., 2022).
Because prices fluctuate widely between seasons, farmers and farm investors do not always capture the best price for their crops. Overproduction resulting from unguided planting calendars, together with the unavailability of data on price trends, leads to zero or low return on investment in vegetable production.
Therefore, innovative approaches for sustainable vegetable production should be developed. The present study developed a system that uses available datasets to predict vegetable prices, serving as decision support for vegetable producers and investors.

Objectives of the Study
This study explored the integration of technology and developed a crop price prediction system to address the concerns of High-Value Commercial Crops. Specifically, the objectives of this study were the following:
1. to design and develop the features of the High-Value Commercial Crop price prediction system using Bidirectional Long Short-Term Memory;
2. to test the system functionalities using real datasets for a) price forecasting and b) decision support; and
3. to evaluate the accuracy of the system by applying assessment metrics in terms of a) computation of errors using the RMSE of the actual and predicted prices and b) computation of the 'loss' of the training and test datasets using MSE with the Adam optimizer.

Price Prediction
The effects of price fluctuations on consumers, farmers, and grain processors are extensive. Thus, forecasting the prices has attracted significant attention from researchers. Forecasting models use quantitative time-series data (dos Reis Filho et al. 2020).
According to Zhao (2021), prediction methods for futures price patterns have been evolving for over a century, and many people have attempted to make predictions using traditional regression analysis. The level of mathematical statistics has improved in tandem with the advancement of computer technology. As a result, artificial neural networks with advanced nonlinear computing capabilities have begun to emerge and are now widely deployed. The capacity of neural networks to fit nonlinear functions is their most notable attribute. This property is the primary reason for neural networks' enormous potential in the financial sector. Nasira et al. (2012) argued that price forecasting aids farmers and the government in making informed decisions. They used the price of tomatoes as a basis for their research and used a neural network-based prediction model. The result displayed the absolute error percentage of monthly and weekly vegetable price prediction as well as the price prediction accuracy percentage. Wang et al. (2019) used the LSTM neural network to create a model to anticipate high and low futures prices of soybean and used Mean Absolute Error (MAE) and trend precision to assess the performance of the model. To compare, they used the Long Short-Term Memory (LSTM) neural network to predict the closing price and the BP neural network to develop another prediction model. As a result, the prediction model based on the LSTM neural network performed better and achieved more than 80% accuracy in trend estimation. Siami-Namini et al. (2019) research looked at whether Bi-directional Long Short-Term Memory (Bi-LSTM) with additional training capability outperforms traditional unidirectional LSTM. They stated that machine and deep learning-based algorithms are two new ways of solving time series prediction challenges. Traditional regression-based modeling has been found to generate less accurate findings than these techniques. 
Artificial Recurrent Neural Networks (RNNs) with memory, such as LSTM, have been shown to outperform the Autoregressive Integrated Moving Average (ARIMA) by a significant margin. LSTM-based models include additional "gates" to memorize longer sequences of input data. The main question is whether the gates in the LSTM architecture already provide a good forecast or whether further training on the data is required to improve the prediction. By traversing the input data twice (i.e., left-to-right and right-to-left), Bi-directional LSTMs (Bi-LSTMs) enable additional training. The findings of their study suggest that this additional training, and hence Bi-LSTM-based modeling, provides better predictions than standard LSTM-based models. In particular, Bi-LSTM models outperformed ARIMA and LSTM models in terms of prediction. Bi-LSTM models also appear to attain equilibrium much more slowly than LSTM-based models.

Bi-LSTM in Forecasting
For stock market prediction, Althelaya et al. (2018) examined the performance of Bi-directional and Stacked LSTM deep learning methodologies. Three performance indicators were used to evaluate performance on a benchmark dataset for short- and long-term prediction. The performance of the tuned Bi-LSTM and Stacked LSTM (SLSTM) models was also compared with shallow neural networks and unidirectional LSTM models. Overall, Bi-LSTM networks outperformed the other networks in terms of accuracy and convergence for both short- and long-term forecasting. Sunny et al. (2020) also focused on stock price forecasting. Stock price prediction is important for the growth of shareholders in a business's stock since it increases the interest of speculators in investing money in the company, and a good prediction of a stock's future price could yield a significant profit. Their study used two stock price prediction models, an LSTM model and a Bi-LSTM model. The RMSE for both models was calculated while varying the number of epochs, hidden layers, dense layers, and hidden layer units to find a better model for accurately estimating future stock values. The Bi-LSTM model generated a lower RMSE than the LSTM model, and the authors therefore suggested that individuals and ventures use Bi-LSTM for stock market forecasting.
Tiwari et al. (2021) studied air-quality prediction in selected parts of Delhi, India, using recent versions of LSTM such as bidirectional LSTM and encoder-decoder LSTM models, in response to the COVID-19 pandemic. They employed a multivariate time series approach to estimate air quality over ten prediction horizons totaling 80 hours, as well as a long-term (one month ahead) forecast with quantifiable uncertainty. Despite COVID-19's impact on air quality during full and partial lockdown periods, their findings suggest that the multivariate Bi-LSTM model delivers the best forecasts.
Power load forecasting accuracy is critical for ensuring the power system's safety, stability, and economic operation. Short-term power load forecasting, in particular, provides the foundation for grid planning and decision-making. Machine learning methods have been frequently employed for short-term power load forecasting in recent years. Tang et al. (2019) used a multi-layer Bi-directional recurrent neural network model based on LSTM and GRU to anticipate short-term power usage, which was then validated using two data sets. The proposed method achieved a smaller mean absolute percentage error (MAPE), root mean square error (RMSE), and mean absolute error (MAE) than the LSTM, Support Vector Regression (SVR), and Back Propagation (BP) models. The proposed method could be utilized to increase the accuracy of short-term load forecasting and reduce anticipated value variations during the forecasting process. Graves et al. (2005) conducted two experiments with Bi-directional and unidirectional LSTM networks on the TIMIT speech corpus. The first was framewise phoneme classification, in which Bi-directional LSTM outperformed unidirectional LSTM and conventional Recurrent Neural Networks (RNNs). The advantage of bidirectional training carried over into the second experiment, phoneme recognition with hybrid Hidden Markov Model (HMM)/LSTM systems. With these systems, Graves et al. recorded better phoneme accuracy than with equivalent traditional HMMs and did so with fewer parameters. In addition, the phoneme recognition score of Bi-LSTM was improved by using a duration-weighted error function. Another study aimed to develop a suitable methodology for mapping rice crops in West Rio Grande do Sul and concluded that, in contrast to unidirectional LSTM models, Bi-LSTM models were more effective because their output depends on both the previous and next segments. Overall accuracy and Kappa were high in the results (>97 percent for all methods and measures). The best model was Bi-LSTM, which showed significant differences in the McNemar test at a significance level of 0.05.
Gao et al. (2021) built an environmental information gathering system for citrus orchards based on the Internet of Things (IoT) to create an irrigation scheduling plan for use in large-area citrus orchards. Deep bidirectional long short-term memory (Bi-LSTM) networks were employed to improve soil moisture (SM) and soil electrical conductivity (SEC) predictions using environmental data, offering a useful reference for citrus crop irrigation and fertilization. The deep Bi-LSTM model was compared to a multilayer neural network (MLNN) in terms of performance, and many of the assessment indicators in their study showed that the Bi-LSTM networks performed better than the MLNN model. Kiperwasser et al. (2016) described a Bidirectional-LSTM-based dependency parsing approach that is both simple and effective. Each sentence token has a Bi-LSTM vector that represents it in its sentential context, and feature vectors were created by concatenating a few Bi-LSTM vectors. The Bi-LSTM was trained jointly with the parser objective, providing very effective feature extractors for parsing. They demonstrated the usefulness of the approach by using it to create a greedy transition-based parser and a globally optimized graph-based parser. The resulting parsers had very simple architectures and matched or exceeded state-of-the-art accuracy in English and Chinese.

Bi-LSTM with another Neural Network
For business process management, outcome-oriented predictive process monitoring is critical. Unfortunately, current classical machine learning algorithms for this topic require a significant amount of manual involvement and a long time in real-time prediction applications. To automatically build a predictive categorization model, Wang et al. (2019) presented the Att-Bi-LSTM strategy, which combined the Bi-directional LSTM network with an attention mechanism based on deep learning techniques. This method can capture and optimize the elements that significantly impact the outcome of a completed case, resulting in a high-performing prediction model. Extensive tests on twelve real datasets showed that Att-Bi-LSTM beats state-of-the-art approaches in terms of prediction accuracy, adequacy, and timeliness. Chen et al. (2019) proposed a real-time monitoring method (CABLSTM) based on a fusion of a convolutional neural network (CNN) and a Bi-LSTM network with an attention mechanism to monitor the tool wear state of computerized numerical control (CNC) machining equipment in a manufacturing workshop. In this technique, the CNN was used to extract deep features from the input time-series signal, and a Bi-LSTM network with a symmetric structure was then built to learn the time-series information between the feature vectors. The experimental results showed that, compared with other deep learning neural networks and traditional machine learning models, the model can accurately predict the tool wear state in real time from original data collected by sensors, and that recognition accuracy and generalization improved to some extent. Kim et al. (2019) focused on evaluating and predicting multivariate time series data using an artificial neural network-based model. Experiments and performance evaluations were conducted using the Root Mean Square Error, which was calculated based on whether each field learned independently and whether the input value varied. According to the RMSE results, the proposed model outperformed the other models in tests.

RMSE, Activation Function, Optimizer
To compare the training capability of bidirectional and unidirectional LSTM, Siami-Namini et al. (2019) used RMSE together with loss computation as assessment metrics for the error reduction in the forecasting performance of the two LSTM variants. Wu (2018) studied cryptocurrency prediction using a conventional LSTM and an LSTM with an Autoregressive (AR) model, with RMSE as one of the assessment metrics for prediction accuracy. Tang et al. (2019) dealt with power load forecasting based on a multi-layer bidirectional recurrent neural network; RMSE served as the error evaluation index, and the mean squared error was used specifically for loss computation. The authors also employed the Adam optimization algorithm to minimize the computed loss. They compared the proposed method with LSTM, support vector regression, and back propagation models for forecasting seasonal load separately, and the comparison revealed that the proposed multi-layer bidirectional LSTM had a higher forecasting accuracy. Jais et al. (2019) evaluated the effects of the Adam optimization function. A diagnostic breast cancer dataset was fed to a conventional neural network with and without Adam as an optimizer. The study concluded that the Adam optimization function indeed improves the performance of a wide and deep neural network.

METHODOLOGY
The study on Price Prediction with Bidirectional Long Short-Term Memory for High-Value Commercial Crops was motivated by the desire to make vegetable farming more profitable. Figure 1 specifies the hardware and software needed in building the Price Prediction with Bidirectional LSTM for high-value commercial crops.
The hardware needed for system development is shown in Figure 1. The minimum requirements were a complete computer set with at least an Intel Core i3 10th Generation processor, 8 GB of DDR4 RAM, a monitor with 1280 x 760 resolution, 1 GB of VRAM, and 256 GB of SSD storage.
Figure 1 also shows the software needed for the development and testing of the proposed system. The programming languages used were VB.NET 2015 and Python 3.5.x. XAMPP Control Panel v3.2.4 (Apache) served as the localhost server, and MySQL together with MySQL ODBC Driver 5.1 was used as the database. TensorFlow 1.14.x was used as the machine learning platform, and the NumPy, pandas, matplotlib, and statsmodels libraries were employed.
Bidirectional LSTM was chosen as the predictive model for this study after the available real data sets were rigorously tested against different algorithms. Table 1 shows the results of the testing. Bidirectional LSTM, using an input of 24, a batch size of 3, and 50 epochs, had the lowest Root Mean Square Error (RMSE), a common evaluation metric in prediction, at 7.03. This was close to the 7.2 of Stacked Bidirectional LSTM using an input of 24, a batch size of 1, and 50 epochs. The highest RMSE was produced by LSTM (50) Dropout 0.15 Dense (32) with an input of 48, a batch size of 12, and 50 epochs. A bidirectional LSTM model with the same parameters was also applied to the squash data set, and the result was more convincing since it gave a lower RMSE of 4.36.
Price was the only variable used by the researcher due to the unavailability of data on other variables. Gathering the price data used in this study took the researcher less than two years. Although the volume of production was initially considered together with price, the insufficiency of data limited the researcher to price only.

Planting Calendar
The office of the Bureau of Plant Industry (2020) posted on their official website a guide to planting and growing vegetable crops. Table 2 shows the vegetable/crop and the ideal month of planting for tomato and squash.
The months of November to January were the best months to plant squash, while tomato is ideally planted from January to May and from September to October.

Data Set
After collecting scattered data from different sources, the researcher preprocessed them to make them usable for the study. The data were the prices of tomatoes and squash from 2015 to 2019. Table 3 shows the data sets for training and testing. The raw data gathered were daily, but price data were not available on some days. The researcher therefore computed the mean of the available daily prices in each week to produce weekly data.
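The weekly averaging described above can be sketched as follows. This is a minimal illustration, not the study's actual preprocessing code: the function name `weekly_mean_prices`, the grouping by ISO calendar week, and the sample prices are all assumptions made for the example. Days without a price record are simply absent from the input, so each weekly mean uses only the days that were observed.

```python
from collections import defaultdict
from datetime import date
from statistics import mean


def weekly_mean_prices(daily_prices):
    """Average the available daily prices within each ISO week.

    daily_prices maps a date to a price; days with no record are absent.
    Returns a dict keyed by (ISO year, ISO week) in chronological order.
    """
    buckets = defaultdict(list)
    for day, price in daily_prices.items():
        iso_year, iso_week, _ = day.isocalendar()
        buckets[(iso_year, iso_week)].append(price)
    return {week: mean(prices) for week, prices in sorted(buckets.items())}


# Hypothetical prices in pesos; 2019-01-09 is left out to mimic a missing day.
sample = {
    date(2019, 1, 7): 20.0,
    date(2019, 1, 8): 22.0,
    date(2019, 1, 10): 24.0,
    date(2019, 1, 14): 30.0,
}
weekly = weekly_mean_prices(sample)
```

Here the three observed days of the first week collapse into one weekly value, and the single observed day of the following week becomes that week's value.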

Bidirectional LSTM
Bidirectional LSTM is a variant of LSTM based on the bidirectional Recurrent Neural Network developed by Schuster and Paliwal (1997) to train the network using past and future input data sequences. The input data are processed using two connected layers, each of which carries out its operations in the opposite time-step order from the other. The results can then be combined using various merging techniques. Accordingly, Bidirectional LSTM employs two layers, one of which operates in the same direction as the data sequence and the other in the opposite direction, as illustrated in Figure 2. Having two LSTMs as one layer improves the learning of long-term dependencies and, as a result, the model performance.

Figure 2. Bidirectional LSTM Model
The backward LSTM layer output sequence was generated using the reversed inputs from time $t-1$ to $t-n$, similar to how the forward LSTM layer output sequence was calculated as shown in Figure 1. These output sequences were then passed into the $\sigma$ function, which combines them into the output vector $y_t$. The final output of a Bi-LSTM layer, like that of an LSTM layer, can be represented as a vector, $y_t = [y_{t-n}, \ldots, y_{t-1}]$, and the last element, $y_{t-1}$, represented the expected price for the following iteration.
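The two-direction processing can be illustrated with a small sketch. To keep the example short, a single-unit tanh recurrence stands in for the LSTM cell, so this is not an LSTM implementation; the weights `w_in` and `w_rec`, the function names, and the input values are hypothetical. The point shown is only the bidirectional mechanics: one pass reads the sequence in order, the other reads it reversed, and the two state sequences are aligned by time step and merged (here by pairing them, which corresponds to concatenation).

```python
import math


def recurrent_pass(inputs, w_in=0.5, w_rec=0.3):
    """Run a toy tanh recurrence and return the state at every time step."""
    state, states = 0.0, []
    for x in inputs:
        state = math.tanh(w_in * x + w_rec * state)
        states.append(state)
    return states


def bidirectional_pass(inputs):
    """Pair each time step's forward state with its backward state."""
    forward = recurrent_pass(inputs)
    # Process the reversed sequence, then flip the states back so that
    # index t in both lists refers to the same time step.
    backward = recurrent_pass(inputs[::-1])[::-1]
    return list(zip(forward, backward))


prices = [0.2, 0.4, 0.1, 0.5]  # hypothetical scaled weekly prices
merged = bidirectional_pass(prices)
```

At each step the forward state summarizes the inputs up to that step, while the backward state summarizes the inputs from that step to the end of the sequence.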

Hyperparameters
The Bidirectional LSTM algorithm has several hyperparameters, and adjusting them can improve model performance. Epochs, batch size, and steps per epoch are the basic hyperparameters. The number of epochs is the number of training passes that use all the training data. Steps per epoch is the number of batches fed to the model before an epoch is considered complete and the next one begins. The batch size indicates that the complete data set is divided into sections of equal size and that one batch of that size is fed to the model at a time.
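The relationship between these hyperparameters can be made concrete with a short sketch. It assumes that the "input of 24" reported for the selected model is the number of past weekly prices in each input window, and it uses the reported batch size of 3; the series of 100 values and the helper names `make_windows` and `steps_per_epoch` are hypothetical.

```python
import math


def make_windows(series, n_input):
    """Split a series into (input window, next value) training samples."""
    samples = []
    for start in range(len(series) - n_input):
        window = series[start:start + n_input]
        target = series[start + n_input]
        samples.append((window, target))
    return samples


def steps_per_epoch(n_samples, batch_size):
    """Number of batches needed to pass over every sample once."""
    return math.ceil(n_samples / batch_size)


series = list(range(100))  # hypothetical series of 100 weekly prices
samples = make_windows(series, n_input=24)
steps = steps_per_epoch(len(samples), batch_size=3)
```

With these assumed values, 100 observations yield 76 samples, which require 26 batches of at most 3 samples to complete one epoch.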

Activation Function and Optimizer
Rectified Linear Units (ReLU) activation function was employed in this study. Hahnloser et al. (2000) presented ReLU as a biologically and mathematically valid activation function. It was first demonstrated in 2011 to allow for improved training of deeper networks.

Figure 3. Rectified Linear Unit
It works by thresholding at 0, i.e., f(x) = max(0, x), as shown in Figure 3. Simply put, it outputs 0 when x is less than or equal to zero and x itself when x is greater than zero. Because of their closeness to linear units, ReLU networks are now commonly accepted as simple to optimize; the only difference is that a ReLU unit outputs zero over half of its domain. Because of this, the gradient of a rectified linear unit remains not just large but also constant whenever the unit is active.
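The function f(x) = max(0, x) and its gradient can be written directly. This is a minimal scalar sketch; the choice of 0 for the gradient at exactly x = 0 is a common convention and is an assumption here, not something stated in the study.

```python
def relu(x):
    """Rectified Linear Unit: 0 for non-positive inputs, x otherwise."""
    return max(0.0, x)


def relu_grad(x):
    """Gradient of ReLU: 1 while the unit is active (x > 0), else 0."""
    return 1.0 if x > 0 else 0.0


outputs = [relu(x) for x in (-2.5, 0.0, 3.0)]
grads = [relu_grad(x) for x in (-2.5, 0.0, 3.0)]
```

The constant gradient of 1 over the active half of the domain is what makes the unit behave like a linear unit during optimization.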
The Adam optimization method was used in the model. Adam is an optimization technique that uses repeated cycles of "adaptive moment estimation" to solve non-convex problems faster and with fewer resources than many other optimization algorithms. It works best in very large data sets since the gradients are kept "tighter" across many learning iterations. Adam combines the benefits of Adaptive Gradients (AdaGrad) and Root Mean Square Propagation (RMSProp), two other stochastic gradient techniques, to offer a learning methodology for optimizing a variety of neural networks.
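The "adaptive moment estimation" idea can be illustrated with the standard Adam update rule applied to a one-dimensional problem. This is a sketch for intuition only: the study relied on the optimizer provided by its machine learning platform, and the learning rate of 0.1, the other default values, and the quadratic test function below are assumptions chosen for the example.

```python
import math


def adam_minimize(grad_fn, x0, steps, lr=0.1, beta1=0.9, beta2=0.999, eps=1e-8):
    """Minimize a one-dimensional function with the Adam update rule."""
    x, m, v = x0, 0.0, 0.0
    for t in range(1, steps + 1):
        g = grad_fn(x)
        m = beta1 * m + (1 - beta1) * g      # running mean of gradients
        v = beta2 * v + (1 - beta2) * g * g  # running mean of squared gradients
        m_hat = m / (1 - beta1 ** t)         # bias correction for early steps
        v_hat = v / (1 - beta2 ** t)
        x -= lr * m_hat / (math.sqrt(v_hat) + eps)
    return x


def grad(x):
    """Gradient of the hypothetical loss (x - 3)^2, minimized at x = 3."""
    return 2.0 * (x - 3.0)


after_one_step = adam_minimize(grad, x0=0.0, steps=1)
after_many_steps = adam_minimize(grad, x0=0.0, steps=1000)
```

On the first step the bias-corrected ratio is close to the sign of the gradient, so the parameter moves by roughly one learning rate regardless of the gradient's magnitude; this scale invariance is the adaptive part of the method.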

Assessment Metrics
Deep learning algorithms often report the "loss" figures. Loss is a kind of penalty for a bad prediction. In more detail, if the model's forecast is flawless, the "loss" value will be 0. As a result, the goal was to reduce the loss values. Mean Squared Error was employed in the computation of loss of the training and testing dataset.

$$\mathrm{MSE} = \frac{1}{n} \sum_{i=1}^{n} (y_i - \hat{y}_i)^2$$

Equation 1
where $y_i$ is the i-th observed value, $\hat{y}_i$ is the corresponding predicted value, and $n$ is the number of observations.
Researchers frequently use the Root-Mean-Square-Error (RMSE) to analyze prediction performance in addition to loss, which is used by deep learning systems. The difference between actual and predicted values was measured by the RMSE. The main advantage of RMSE is that it penalizes huge mistakes.
The RMSE formula is as follows:

$$\mathrm{RMSE} = \sqrt{\frac{1}{N} \sum_{i=1}^{N} (y_i - \hat{y}_i)^2}$$

Equation 2

where $N$ is the total number of observations, $y_i$ is the actual value, and $\hat{y}_i$ is the predicted value. The lower the RMSE, the better the model.
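Both assessment metrics can be computed in a few lines. The sketch below follows the definitions given above, with RMSE taken as the square root of MSE; the function names and the four price pairs are hypothetical and are not results from the study.

```python
import math


def mean_squared_error(actual, predicted):
    """Average of the squared differences between actual and predicted values."""
    if not actual or len(actual) != len(predicted):
        raise ValueError("inputs must be non-empty and of equal length")
    squared = [(y - y_hat) ** 2 for y, y_hat in zip(actual, predicted)]
    return sum(squared) / len(actual)


def root_mean_squared_error(actual, predicted):
    """Square root of the MSE, expressed in the same unit as the prices."""
    return math.sqrt(mean_squared_error(actual, predicted))


actual = [10.0, 12.0, 14.0, 16.0]     # hypothetical actual prices
predicted = [11.0, 12.0, 12.0, 17.0]  # hypothetical predicted prices
mse = mean_squared_error(actual, predicted)
rmse = root_mean_squared_error(actual, predicted)
```

In this example the single error of 2 contributes 4 of the 6 units of total squared error, which shows how the squaring step weights large errors more heavily than small ones.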
Figure 4 shows how the researcher designed and developed the system. The researcher used the hardware and software specifications to create and develop the system's graphical user interface. To speed up processing and the generation of prediction results during testing, the researcher used a laptop with an Intel Core i3 10th generation processor, 8GB of DDR4 RAM, a monitor with VRAM, and Solid-State Drive storage. In terms of software, the system architecture combined Python and VB.NET. The system's learning capabilities were implemented using TensorFlow, which integrates well with the key libraries of the study, namely NumPy, pandas, matplotlib, and statsmodels. MySQL, together with its driver, MySQL ODBC, was required for data management and storage. The price prediction function of the system, which is the main objective of the study, was implemented using a bidirectional LSTM algorithm.
➢ Master List - It shows the list of the top five HVCCs cultivated in the province, although the researcher focused only on tomatoes and squash in this study. The three other crops were included in the system for future study.
➢ User management - This is where the system admin adds (registers), updates, and deletes accounts.
➢ Advance - This is the icon through which the prediction functionality of the system is navigated and performed.
➢ About - This is where the developer of the system is identified.

Price forecasting
The system received data inputs from its users, which included year, month, week, and price, as seen in Figure 6. Before the system does any forecasting, the entered data must first be prepared and fine-tuned. Figure 7 shows the preparation of the data before fine-tuning. The system user first needed to choose the vegetable, either tomato or squash, using the price radio button shown in the GUI before clicking the 'Prepare' button.
After the data were prepared and the hyperparameters were set (see Figure 8), the system was ready for fine-tuning, where the Bidirectional LSTM algorithm takes its role. Loss in each epoch was calculated using MSE, with ReLU as the activation function. The RMSE of the whole data set was also calculated upon fine-tuning. Figure 9 shows the training data set in blue and the prediction in red. The system used the whole data set less the last twelve (12) rows for training, and the last twelve rows, which cover the last quarter (three months) of the data set, for testing, as specified in Table 3 and Figure 10. After training and testing, the system was ready to forecast. Upon clicking the forecast button, as shown in Figure 11, a one-year price prediction was shown for the first week of each month only. The forecast prices can be viewed in print preview and downloaded in PDF, Word, and Excel formats.
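The chronological hold-out described above, in which the last twelve rows are reserved for testing, can be sketched as follows. The helper name `split_last_rows` and the series of 60 values are hypothetical; only the rule of holding out the final twelve rows comes from the text. The rows are not shuffled, so the test set always lies after the training set in time.

```python
def split_last_rows(rows, n_test=12):
    """Hold out the final n_test rows for testing, keeping time order."""
    if len(rows) <= n_test:
        raise ValueError("not enough rows to leave a training set")
    return rows[:-n_test], rows[-n_test:]


weekly_prices = [float(p) for p in range(1, 61)]  # hypothetical 60 weekly prices
train, test = split_last_rows(weekly_prices)
```

Keeping the split chronological means the model is evaluated on weeks it has not seen and that follow the training period, which matches how the forecast is used in practice.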

Decision Support
The DA Region Two (02) provides a tomato production guide that includes the most up-to-date and locally applicable technical information on tomato cultivation in the region. Tomato harvesting begins 55-65 days after transplantation or 15-20 days after blossoming. Depending on the cultivar and management measures used, tomatoes can grow for up to 100 days after transplantation.
Squash, like tomato, matures three to four months after sowing, according to Tepper (2013), Agriculturist II of the BPI-Los Banos National Crop Research and Development Center. Squash is ready for harvest 30-40 days after pollination. Market-ready fruits (with a light-yellow stripe on the skin) are widely gathered. If the fruit is meant for seeds, the emergence of a powdery, whitish substance on the surface and the stiffening of the rind are indicators of maturity; squash seeds are mature and fully formed at this point. With the information already available to farmers, combined with the predicted prices of tomato and squash, farmers can now adjust their planting calendar. For tomatoes, the highest predicted price was in the first week of September 2020 and the lowest was in the first week of March, as shown in Table 4. This prediction is useful to farmers when deciding when to invest and when to start planting tomatoes. The predicted squash price was steady, ranging only from ten to eleven pesos in the first week of every month throughout the year.
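One way a farmer might combine the predicted price peak with the production guide is to count back from the target harvest date. The sketch below is illustrative only: it assumes that harvesting begins 55-65 days after transplanting, as stated in the tomato production guide, and it takes 1 September 2020 as a representative day in the week with the highest predicted tomato price. The function name `transplant_window` is hypothetical, and the calculation ignores the recommended planting months and other agronomic constraints.

```python
from datetime import date, timedelta


def transplant_window(target_harvest, min_days=55, max_days=65):
    """Return the earliest and latest transplanting dates for a harvest date."""
    earliest = target_harvest - timedelta(days=max_days)
    latest = target_harvest - timedelta(days=min_days)
    return earliest, latest


target = date(2020, 9, 1)  # assumed day within the first week of September 2020
earliest, latest = transplant_window(target)
```

Under these assumptions, transplanting between 28 June and 8 July 2020 would bring the start of harvest to the target date.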

Computation of RMSE of actual and predicted data
With the predicted prices and the actual prices that were not included in the training and test data available, the RMSE was computed to check accuracy. Table 5 shows that tomato and squash obtained RMSEs of 2.07 and 2.04, respectively, which were low enough to conclude that the HVCC Price Prediction with Bidirectional Algorithm predicts well. For tomatoes, the highest error, or difference between the forecast and actual price, occurred in the third month, and the forecast price nearest to the actual price occurred in May of the year being evaluated. For squash, the almost steady price prediction for the whole year was farthest from the actual price in the first month and nearest in May, the same as for tomato, as shown in Figure 10.

Computation of loss
For tomato, the training loss started at 0.0553 and ended at 0.0128 at the final epoch, while the test loss started at 0.0570 and ended at 0.0128. Figure 11 displays the loss across all epochs and shows how close the training and testing losses were. The training and testing losses for squash were lower than those for tomato. Figure 12 shows that the training loss started at 0.0172, which was close to the testing loss of 0.0176. At the end of training, the squash loss was 0.0019 for training and 0.0020 for testing, values that were much lower than those of tomato and close to each other.

CONCLUSIONS AND RECOMMENDATIONS
Based on the actual prices of tomato and squash over the previous five years, it is concluded that the bidirectional long short-term memory technique can be used in price prediction for High-Value Commercial Crops. Farmers and farm investors can use the developed system's capacity to forecast prices for High-Value Commercial Crops as a reference when choosing when to sow their crops. The steady decline in the losses of the training and test datasets and the low RMSE between actual and predicted data demonstrated the accuracy of the prediction.
To improve the capability and accuracy of price prediction, it is recommended that price data for tomatoes and squash continue to be gathered in the upcoming years. Extending the study's scope to more high-value commercial crops, together with multivariate datasets on the factors that directly influence price fluctuation, could greatly boost the accuracy of price prediction. The model employed in this study could also be refined and/or combined with another method to increase the precision of the predictions.

RESEARCH IMPLICATIONS
Farmers, farm investors, and agricultural government offices can use the developed system's capacity to forecast prices for high-value commercial crops as a reference when choosing and advising when to sow crops.

PRACTICAL IMPLICATION
The results of the study would serve as the foundation for a broader agricultural endeavor, namely price forecasting for high-value commercial commodities, which includes the bulk of the top HVCC goods in the area. Because the provincial agricultural government offices do not have a system in place for storing price data, the result of the study is also crucial in organizing and safeguarding the storage of price data.