Abstract:
Although machine learning approaches have been shown to model the perturbations in weather patterns effectively, they remain under-deployed in under-represented countries. This points to a gap between weather service providers and the institutions that advocate data-driven approaches to modelling stochastic systems such as weather. For instance, the Botswana Department of Meteorological Services is currently looking for new methods to complement conventional weather models, particularly for one-to-three-month-ahead forecasts of localised minimum and maximum temperature. To that end, this work applies predictive analytics to local climatological data harvested, using Perl, from the Shakawe automated weather station for the period 01 July 2014 to 28 February 2019. First, exploratory tools such as scatter plots, box plots, and correlation coefficients are used to infer patterns and relationships within the collected numerical data. The same process, coupled with Random Forests, is used to reduce the dimensionality of the collected data by discarding redundant variables. In the first phase, the models (Multi-Layer Perceptron (MLP), k-Nearest Neighbours, and Random Forests) are built using the available data. In the second phase, the selected variables (average air temperature, diurnal temperature range, average wind speed, humidity, minimum temperature, and barometric pressure) are used to build and compare the proposed models. The models were fitted on 70% of the data and validated on the remaining 30%. The results show that the MLP outperforms the other models in terms of the correlation coefficient, Root Mean Squared Error, and Mean Absolute Error.
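
As an illustration of the evaluation protocol described above, the following minimal Python sketch shows a 70/30 split and the three comparison metrics (correlation coefficient, Root Mean Squared Error, Mean Absolute Error). The function names, the chronological (unshuffled) split, and the sample temperature values are assumptions made for illustration only; they are not taken from the study, which does not state how the split was drawn.

```python
import math


def split_70_30(rows, train_fraction=0.7):
    """Split rows in their given order: first 70% for fitting, the rest for validation.

    Assumption: a chronological split without shuffling.
    """
    cut = int(round(train_fraction * len(rows)))
    return rows[:cut], rows[cut:]


def rmse(observed, predicted):
    """Root Mean Squared Error between observed and predicted values."""
    squared = [(o - p) ** 2 for o, p in zip(observed, predicted)]
    return math.sqrt(sum(squared) / len(squared))


def mae(observed, predicted):
    """Mean Absolute Error between observed and predicted values."""
    absolute = [abs(o - p) for o, p in zip(observed, predicted)]
    return sum(absolute) / len(absolute)


def pearson_r(observed, predicted):
    """Pearson correlation coefficient between observed and predicted values."""
    n = len(observed)
    mean_o = sum(observed) / n
    mean_p = sum(predicted) / n
    cov = sum((o - mean_o) * (p - mean_p) for o, p in zip(observed, predicted))
    var_o = sum((o - mean_o) ** 2 for o in observed)
    var_p = sum((p - mean_p) ** 2 for p in predicted)
    return cov / math.sqrt(var_o * var_p)


# Hypothetical maximum temperatures (degrees Celsius), not data from the study.
observed = [30.0, 32.0, 28.0, 35.0]
predicted = [31.0, 31.0, 29.0, 33.0]

scores = {
    "r": pearson_r(observed, predicted),
    "rmse": rmse(observed, predicted),
    "mae": mae(observed, predicted),
}

# Ten placeholder records split into 7 for fitting and 3 for validation.
train_rows, test_rows = split_70_30(list(range(10)))
```

Under these metrics, a better model has a correlation coefficient closer to 1 and lower RMSE and MAE on the held-out 30%.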