Description
Navigation remains a major barrier to independence for the visually impaired and
blind. Existing assistive tools such as guide dogs and white mobility canes provide
limited, immediate information within a range of about 5 feet. Assistive navigation
applications, by contrast, offer only static, generalized information about a broader
area, anywhere from a few hundred feet to several miles in radius. Currently, no
solution effectively covers the 5-to-20-foot range, leaving users without crucial
information about their surroundings at this mid-distance. This project explores the potential
of state-of-the-art vision-language models (VLMs) to provide new navigation solutions
for the visually impaired and blind that bridge the aforementioned gap in information
about the environment. VLMs have proven capable of identifying key objects and
reasoning over paired text and images in real time, making them ideal candidates for
assistive technology. Leveraging these capabilities, such models could be integrated
into wearable or extendable devices that give users continuous support in
unfamiliar environments, improving their independence and maintaining safety. This
project investigates the practical application of VLMs in real-world scenarios, with
an emphasis on ease of use and reliability. This work has the potential to expand the
role of assistive technology in daily life and to complement existing solutions with
more intuitive and responsive environmental understanding.
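
To make the envisioned pipeline concrete, the minimal sketch below shows one way such a system could be wired together: capture a frame from a wearable or phone camera, send it with a navigation-oriented prompt to a vision-capable model, and return a short description suitable for speech output. The OpenAI client, the gpt-4o-mini model name, and the prompt wording are illustrative assumptions only; the project does not prescribe a particular model, API, or deployment here.

    # Hypothetical sketch: describe obstacles roughly 5-20 feet ahead from one camera frame.
    # Model name, prompt, and the use of the OpenAI client are illustrative assumptions.
    import base64

    import cv2                  # camera capture
    from openai import OpenAI   # any vision-capable chat API would work similarly

    client = OpenAI()

    def describe_scene(frame) -> str:
        """Encode a frame as JPEG and request a short navigation-oriented description."""
        ok, jpeg = cv2.imencode(".jpg", frame)
        if not ok:
            raise RuntimeError("failed to encode frame")
        image_b64 = base64.b64encode(jpeg.tobytes()).decode("utf-8")
        response = client.chat.completions.create(
            model="gpt-4o-mini",  # assumed vision-capable model
            messages=[{
                "role": "user",
                "content": [
                    {"type": "text",
                     "text": "Briefly describe obstacles and landmarks roughly 5 to 20 feet "
                             "ahead, in one or two short sentences suitable for text-to-speech."},
                    {"type": "image_url",
                     "image_url": {"url": f"data:image/jpeg;base64,{image_b64}"}},
                ],
            }],
        )
        return response.choices[0].message.content

    if __name__ == "__main__":
        cap = cv2.VideoCapture(0)        # wearable or phone camera
        ok, frame = cap.read()
        cap.release()
        if ok:
            print(describe_scene(frame))  # would be routed to audio output in practice

In a deployed device, the returned text would feed a text-to-speech engine and the loop would run continuously on successive frames to provide the ongoing support described above.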
Details
Contributors
- Raines, Kelly (Author)
- Senanayake, Ransalu (Thesis director)
- Osburn, Steven (Committee member)
- Barrett, The Honors College (Contributor)
- Computer Science and Engineering Program (Contributor)
- School of International Letters and Cultures (Contributor)
Date Created
2024-12