Subdegree ypec-2024

SD06 – Gemini Pro Vision AI Screen Reader

Generative artificial intelligence (Gen AI) such as Google Gemini can help bridge the information gap that people with visual impairments face on the internet. The problem we address is that visually impaired users cannot access the information carried by images, because many websites do not follow the W3C Web Accessibility Initiative guidelines. Currently, about 60% of websites lack meaningful alternative text for their images, and it is infeasible to add descriptive text to all existing websites manually after the fact. Even when an image description is provided, it may not tell visually impaired users what they want to know. We have therefore extended the traditional Google ChromeVox Classic screen reader with Google Gemini Pro Vision to tackle the challenge of web image access for people with visual impairments. The result not only provides automatic descriptions of images on the web but also serves as an assistant for daily use. For instance, during online shopping, the Gen AI gives detailed descriptions of products, such as their appearance and usage, which helps users buy items online with confidence. By building on a popular, fully functional open-source screen reader, we can quickly deliver a production-ready and affordable solution for visually impaired people.
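
As a rough illustration of the idea described above, the following sketch scans a page for images whose alternative text is missing or uninformative and asks a vision model to describe them. It is not the project's actual implementation: the names `needs_description`, `ImageCollector`, and `describe_images`, as well as the placeholder list, are hypothetical, and the `describe` callback stands in for a call to a vision model such as Gemini Pro Vision, whose real API is not shown here.

```python
from html.parser import HTMLParser
from typing import Callable, List, Optional

# Hypothetical heuristic: alt values that carry no real information.
# Simplification: an empty alt can intentionally mark a decorative image,
# but this sketch treats it as missing.
PLACEHOLDER_ALTS = {"", "image", "img", "photo", "picture", "graphic", "icon"}
IMAGE_SUFFIXES = (".jpg", ".jpeg", ".png", ".gif", ".webp")


def needs_description(alt: Optional[str]) -> bool:
    """Return True when alt text is missing or an uninformative placeholder."""
    if alt is None:
        return True
    text = alt.strip().lower()
    # A bare file name such as "IMG_0042.jpg" is also treated as uninformative.
    return text in PLACEHOLDER_ALTS or text.endswith(IMAGE_SUFFIXES)


class ImageCollector(HTMLParser):
    """Collects the src and alt attributes of every <img> tag in a page."""

    def __init__(self) -> None:
        super().__init__()
        self.images: List[dict] = []

    def handle_starttag(self, tag, attrs):
        if tag == "img":
            attr_map = dict(attrs)
            self.images.append(
                {"src": attr_map.get("src"), "alt": attr_map.get("alt")}
            )


def describe_images(html: str, describe: Callable[[str], str]) -> List[dict]:
    """Return the text a screen reader would speak for each image.

    `describe` is an assumed callback that takes an image URL and returns a
    generated description; in a real system it would wrap the vision model.
    """
    collector = ImageCollector()
    collector.feed(html)
    results = []
    for image in collector.images:
        if image["src"] and needs_description(image["alt"]):
            spoken = describe(image["src"])  # generated description
        else:
            spoken = image["alt"]  # keep the author's own alt text
        results.append({"src": image["src"], "spoken": spoken})
    return results
```

Passing the model call in as a callback keeps the sketch independent of any particular API and makes it easy to try with a stub, for example `describe_images(page_html, lambda src: "description of " + src)`.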
