Abstract:
Automatic Sign Language Recognition (ASLR) converts sign language gestures into spoken or
written language, thereby enabling communication between hearing and deaf people.
There is abundant research on ASLR for British Sign Language and American Sign Language.
However, Botswana Sign Language has received far less attention, at least in terms of the
computational representation required for automatic sign language recognition; this can be
attributed to the lack of a Botswana Sign Language dataset. Work done on other sign languages is
not directly transferable to Botswana Sign Language because sign languages differ significantly
from country to country. A dataset plays a pivotal role in the sign language recognition pipeline.
However, one of the major challenges researchers encounter is accurately extracting a signer's
hands and fingers when they are not in the camera's field of view (occlusion).
Researchers have argued that using multiple sensors addresses occlusion better than using a single
sensor. This study proposes an approach to developing a Botswana Sign Language dataset based
on tracking data from Microsoft's Kinect sensor and the Leap Motion controller. The feature
sets from both devices are combined to improve recognition performance, especially under
occlusion. Recognition is performed with Support Vector Machine (SVM) and K-Nearest
Neighbour (KNN) classifiers. The resulting dataset consists of 5,433 Botswana Sign Language
gestures spanning five different sign words.
The experimental results show that recognition performance improves when the two devices are
combined, compared to capturing sign gestures with a single device. Overall recognition
accuracies of 99.90% and 99.40% were recorded for SVM and KNN, respectively.
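The fusion-and-classification idea summarised above can be sketched in miniature. The snippet below is a hypothetical illustration, not the paper's implementation: feature values, the `fuse` helper, and the tiny from-scratch KNN are all assumptions, standing in for the real Kinect/Leap Motion feature sets and the full SVM/KNN toolkit used in the study.

```python
# Sketch: fuse per-gesture feature vectors from two sensors by concatenation,
# then classify with a minimal k-nearest-neighbour vote (standard library only).
import math
from collections import Counter

def fuse(kinect_features, leap_features):
    """Concatenate the feature vectors captured by the two devices."""
    return kinect_features + leap_features

def knn_predict(train, query, k=3):
    """train: list of (feature_vector, label) pairs; query: feature_vector."""
    dists = sorted((math.dist(vec, query), label) for vec, label in train)
    votes = Counter(label for _, label in dists[:k])
    return votes.most_common(1)[0][0]

# Hypothetical fused samples for two sign words.
train = [
    (fuse([0.1, 0.2], [0.9, 0.1]), "hello"),
    (fuse([0.2, 0.1], [0.8, 0.2]), "hello"),
    (fuse([0.9, 0.8], [0.1, 0.9]), "thanks"),
    (fuse([0.8, 0.9], [0.2, 0.8]), "thanks"),
]
print(knn_predict(train, fuse([0.15, 0.15], [0.85, 0.15])))  # → hello
```

Concatenation is the simplest fusion strategy; when one sensor loses the hands to occlusion, the other sensor's features can still dominate the distance computation.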