For many people, online calls have become a big part of their day-to-day work routine. Software companies have kept pace, adding useful features to these platforms, such as highlighting whoever is speaking. But when a person uses sign language, the software gives them no such recognition. These platforms can therefore act as a barrier for those who rely on sign language; one engineer seeks to change that.
To bridge that gap, Priyanjali Gupta, an engineering student at Tamil Nadu’s Vellore Institute of Technology (VIT), created an AI model that translates American Sign Language (ASL) into English in real time. She shared her creation on LinkedIn, where it drew over 60,000 likes from people impressed with the idea.
Gupta said the driving force behind the software was her mother, who encouraged her to do something different as an engineering student. “It made me contemplate what I could do with my knowledge and skillset. The idea of inclusive technology struck me. That triggered a set of plans,” Gupta said in an interview with Interesting Engineering.
A big leap
The system Gupta developed converts signs into English text by analyzing the motions of multiple body parts, such as the arms and fingers, using image recognition. To build it, she digitized sign language performed by a few people, as she explains in her GitHub post, which went viral on the platform.
The AI software offers a dynamic way of communicating with deaf or hard-of-hearing people because it works in real time. Nevertheless, it is only in its initial stages: it can currently convert movements into just six gestures – Yes, No, Please, Thank You, I Love You, and Hello. Far more sign language data would be needed to build a reliable model, but as a proof of concept it works.
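To make the idea concrete, here is a minimal sketch of how a single-frame gesture classifier over those six labels could look. The prototype vectors, the 42-dimensional feature size (e.g. 21 hand landmarks with x/y coordinates), and the nearest-neighbor rule are all illustrative assumptions, not Gupta's actual pipeline:

```python
import numpy as np

LABELS = ["Yes", "No", "Please", "Thank You", "I Love You", "Hello"]

# Hypothetical prototype feature vectors, one per gesture: imagine flattened
# hand-landmark coordinates extracted from annotated webcam frames.
rng = np.random.default_rng(0)
PROTOTYPES = rng.normal(size=(len(LABELS), 42))  # 21 landmarks x (x, y)

def classify_frame(features: np.ndarray) -> str:
    """Return the label whose prototype is nearest to this frame's features."""
    dists = np.linalg.norm(PROTOTYPES - features, axis=1)
    return LABELS[int(np.argmin(dists))]

# A frame whose features lie close to the "Please" prototype maps back to it.
noisy = PROTOTYPES[2] + rng.normal(scale=0.01, size=42)
print(classify_frame(noisy))  # -> Please
```

A real system would replace the random prototypes with a trained model, but the shape of the problem is the same: one feature vector per frame, one label out.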
Gupta said the dataset was made manually with a webcam and annotated by hand. The model is trained only on single frames, so it cannot yet handle video. Gupta said she is currently researching Long Short-Term Memory (LSTM) networks, a type of recurrent neural network, to incorporate multiple frames into the software.
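The move from single frames to video is exactly what an LSTM enables: its cell state carries information across frames. The sketch below is a bare-bones LSTM step in NumPy, with invented dimensions and random weights purely to show the mechanics; it is not Gupta's implementation:

```python
import numpy as np

def lstm_step(x, h, c, W, U, b):
    """One LSTM step: gates computed from frame features x and prior state h."""
    z = W @ x + U @ h + b                                # all four gates at once
    i, f, o, g = np.split(z, 4)
    i, f, o = (1 / (1 + np.exp(-v)) for v in (i, f, o))  # sigmoid gates
    g = np.tanh(g)                                       # candidate memory
    c = f * c + i * g                                    # update cell memory
    h = o * np.tanh(c)                                   # new hidden state
    return h, c

feat_dim, hidden = 42, 16  # assumed per-frame feature and state sizes
rng = np.random.default_rng(1)
W = rng.normal(scale=0.1, size=(4 * hidden, feat_dim))
U = rng.normal(scale=0.1, size=(4 * hidden, hidden))
b = np.zeros(4 * hidden)

# Feed a 10-frame sequence of landmark features through the cell; the final
# hidden state h summarizes the whole motion, ready for a gesture classifier.
h = c = np.zeros(hidden)
for frame in rng.normal(size=(10, feat_dim)):
    h, c = lstm_step(frame, h, c, W, U, b)
print(h.shape)  # -> (16,)
```

In practice one would use a framework's LSTM layer rather than hand-rolled gates, but the principle is the same: a single vector accumulated over the sequence replaces the one-frame snapshot.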
The model was built with the help of Nicholas Renotte, a machine learning expert with a popular YouTube channel, Gupta said. While she acknowledged that building software for sign detection is complex, Gupta said she is hopeful that the open-source community will soon help her find ways to extend the work.
American Sign Language is believed to be the third most commonly used language in the United States, after English and Spanish. However, applications that translate ASL into other languages have a lot of catching up to do. The pandemic has put this into the spotlight, and the work done by Gupta and others is a promising starting point.
Last year, Google researchers presented a detection model that can identify people who are signing in real time with up to 91% accuracy. While she welcomed these developments, Gupta said the first step is to normalize sign languages and other means of communication and to work on closing the communication gap.