An assistive mobile app created to empower visually impaired users; it automatically converts camera images into spoken descriptions with intuitive voice command activation and prompting.
Demo video: Perception.Highlight.Video.mp4
Perception is a mobile application designed to assist visually impaired individuals in navigating their surroundings through an intuitive interface, on-device prompt transcription, and AI-powered image recognition. The app captures images through the device's camera and converts visual information into detailed audio descriptions in real time. The intuitive voice command interface allows users to interact with the application completely hands-free and ask about specific objects in their environment, making it truly accessible to those with visual impairments.
Core Feature - Real-Time Guidance: Simply speak to the app as you would to another person and receive near-instant audio responses with spatial awareness (a minimal sketch of this loop follows the list below).
Spatial Awareness:
- Visual Awareness: With every prompt, the application also captures and processes an image of your nearby surroundings.
- Locational Awareness: The application can access the user's location to address location-related prompts.
- Environmental Awareness: The application provides real-time weather information and forecasts to address weather-related prompts.
- Time Awareness: The application can provide information about the current time, date, and time zone when prompted.
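Under the hood, one prompt/response cycle roughly ties these awareness sources together as sketched below. This is a minimal sketch, not the app's actual code: `expo-speech` comes from the stack listed further down, `expo-location` is an assumption for locational awareness, and `transcribePrompt`, `captureImage`, and `describeScene` are hypothetical stand-ins for the native transcription bridge, camera capture, and Gemini Vision call.

```typescript
import * as Location from 'expo-location'; // assumed here for locational awareness
import * as Speech from 'expo-speech';

// One prompt/response cycle: listen, look, add context, answer aloud.
async function handlePrompt(
  transcribePrompt: () => Promise<string>,                                  // hypothetical: local speech-to-text
  captureImage: () => Promise<string>,                                      // hypothetical: returns a base64 JPEG
  describeScene: (prompt: string, imageBase64: string) => Promise<string>,  // hypothetical: Gemini Vision call
): Promise<void> {
  // 1. Transcribe the spoken prompt on-device.
  const prompt = await transcribePrompt();

  // 2. Capture an image of the nearby surroundings for visual awareness.
  const imageBase64 = await captureImage();

  // 3. Gather locational and time awareness to enrich the prompt.
  const { status } = await Location.requestForegroundPermissionsAsync();
  const position =
    status === 'granted' ? await Location.getCurrentPositionAsync({}) : null;
  const context = [
    `Current time: ${new Date().toLocaleString()}`,
    position
      ? `User location: ${position.coords.latitude}, ${position.coords.longitude}`
      : 'User location unavailable',
  ].join('\n');

  // 4. Ask the vision model and speak the answer aloud.
  const answer = await describeScene(`${context}\n\nUser prompt: ${prompt}`, imageBase64);
  Speech.speak(answer);
}
```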
Common Use Cases:
- Text Recognition: Extract and read text from images, documents, signs, and labels
- Object Recognition: Identify common objects, people, and environments with high accuracy
- Scene Understanding: Receive contextual descriptions of surroundings for better spatial awareness
- Intuitive Audio Feedback: Clear voice prompts guide users through the application
- Customizable Settings: Adjust the recognition model, activation threshold angle, speech recognition timeout period, and compression settings (sketched below)
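As a rough illustration only, the adjustable options might be modeled like this; the names and defaults below are hypothetical, not the app's actual settings schema.

```typescript
// Hypothetical shape of the user-adjustable settings; names and defaults are illustrative only.
interface PerceptionSettings {
  recognitionModel: string;          // e.g. which Gemini Vision model variant to use
  activationThresholdAngle: number;  // device tilt (degrees) that triggers voice activation
  speechTimeoutMs: number;           // how long speech recognition waits before giving up
  compressionQuality: number;        // 0..1 image compression applied before upload
}

const defaultSettings: PerceptionSettings = {
  recognitionModel: 'gemini-vision',
  activationThresholdAngle: 45,
  speechTimeoutMs: 5000,
  compressionQuality: 0.5,
};
```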
Tech Stack:
- Frameworks: React Native with Expo
- Language: TypeScript
- Text-to-Speech: `expo-speech`
- Speech transcription: `SFSpeechRecognizer` (iOS) and `SpeechRecognizer` (Android) for local transcription
- Image recognition: Gemini Vision for its low latency and cost (see the sketch below)
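For the image recognition step, the request to Gemini might look roughly like the sketch below. The `@google/generative-ai` SDK, the model name, and the environment variable are assumptions here, not taken from this repo; the returned description would then be handed to `Speech.speak` from `expo-speech` as in the loop sketched earlier.

```typescript
import { GoogleGenerativeAI } from '@google/generative-ai';

// Sketch: send the captured image plus the user's prompt to Gemini Vision.
// SDK package, model name, and env variable are assumptions, not taken from this repo.
async function describeScene(prompt: string, imageBase64: string): Promise<string> {
  const genAI = new GoogleGenerativeAI(process.env.EXPO_PUBLIC_GEMINI_API_KEY ?? '');
  const model = genAI.getGenerativeModel({ model: 'gemini-1.5-flash' });

  const result = await model.generateContent([
    prompt,
    // The image is passed inline as base64 data alongside the text prompt.
    { inlineData: { data: imageBase64, mimeType: 'image/jpeg' } },
  ]);
  return result.response.text();
}
```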
Android Development Build:
- Install dependencies: `npm install`
- Ensure the Android SDK is installed
- Create a development build: `npx expo prebuild --platform android`
- Create `android/local.properties` to point to the Android SDK path: `sdk.dir=C:\\Users\\YOUR_USERNAME\\AppData\\Local\\Android\\Sdk`
- Compile into native Android code: `npx expo run:android`
  - Reflects real-time code changes
  - Automatically connects to the local development server after the app is started
Debug APK:
```
cd android
./gradlew assembleDebug
```
- Located at `android/app/build/outputs/apk/debug/app-debug.apk`
- Once installed on a testing device, open the dev settings (by shaking the device) and change the build location to `YOUR_LOCAL_IP:8081`
- Used for distribution of a development build for initial testing
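If it helps with that initial testing, the debug APK can also be side-loaded onto a connected device with standard adb usage (not a step from this repo):
```
adb install android/app/build/outputs/apk/debug/app-debug.apk
```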
Release APK:
```
cd android
./gradlew assembleRelease
```
- Located at `android/app/build/outputs/apk/release/app-release.apk`
- Bundles all JavaScript into the APK
- Suitable for distribution to end users
Play Store Bundle (AAB):
```
cd android
./gradlew bundleRelease
```
- Located at `android/app/build/outputs/bundle/release/app-release.aab`
- Bundles all JavaScript into the AAB
- Suitable for distribution on the Play Store
iOS Development Build:
- Ensure Xcode is installed (Mac only)
- Create a development build: `npx expo prebuild --platform ios`
- Install iOS dependencies:
  ```
  cd ios
  pod install
  cd ..
  ```
- Compile and run on a simulator or device: `npx expo run:ios`
iOS Release Archive:
```
cd ios
xcodebuild -workspace iSight.xcworkspace -scheme iSight -configuration Release -archivePath iSight.xcarchive archive
```
Start the app:
```
npx expo start
```
Perception prioritizes user privacy. All image processing happens on-device when possible, and any data sent to cloud services is anonymized and not stored permanently.



