An assistive mobile app created to empower visually impaired users; it automatically converts camera images into spoken descriptions with intuitive voice command activation and prompting.
Demo video: Perception.Highlight.Video.mp4
Perception is a mobile application designed to assist visually impaired individuals in navigating their surroundings through an intuitive interface, on-device prompt transcription, and AI-powered image recognition. The app captures images through the device's camera and converts visual information into detailed audio descriptions in real time. The intuitive voice command interface allows users to interact with the application completely hands-free and ask about specific objects in their environment, making it truly accessible to those with visual impairments.
Core Feature - Real-Time Guidance: Simply speak to the app as you would to another person and receive near-instant audio responses with spatial awareness (a minimal sketch of this loop follows the list below).
Spatial Awareness:
- Visual Awareness: With every prompt, the application also captures and processes an image of your nearby surroundings.
- Locational Awareness: The application can access the user's location to address location-related prompts.
- Environmental Awareness: The application provides real-time weather information and forecasts to address weather-related prompts.
- Time Awareness: The application can provide information about the current time, date, and time zone when prompted.
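Under the hood, one prompt/response cycle roughly ties these awareness sources together as sketched below. This is a minimal sketch, not the app's actual code: `expo-speech` comes from the stack listed further down, `expo-location` is an assumption for locational awareness, and `transcribePrompt`, `captureImage`, and `describeScene` are hypothetical stand-ins for the native transcription bridge, camera capture, and Gemini Vision call.

```typescript
import * as Location from 'expo-location'; // assumed here for locational awareness
import * as Speech from 'expo-speech';

// One prompt/response cycle: listen, look, add context, answer aloud.
async function handlePrompt(
  transcribePrompt: () => Promise<string>,                                  // hypothetical: local speech-to-text
  captureImage: () => Promise<string>,                                      // hypothetical: returns a base64 JPEG
  describeScene: (prompt: string, imageBase64: string) => Promise<string>,  // hypothetical: Gemini Vision call
): Promise<void> {
  // 1. Transcribe the spoken prompt on-device.
  const prompt = await transcribePrompt();

  // 2. Capture an image of the nearby surroundings for visual awareness.
  const imageBase64 = await captureImage();

  // 3. Gather locational and time awareness to enrich the prompt.
  const { status } = await Location.requestForegroundPermissionsAsync();
  const position =
    status === 'granted' ? await Location.getCurrentPositionAsync({}) : null;
  const context = [
    `Current time: ${new Date().toLocaleString()}`,
    position
      ? `User location: ${position.coords.latitude}, ${position.coords.longitude}`
      : 'User location unavailable',
  ].join('\n');

  // 4. Ask the vision model and speak the answer aloud.
  const answer = await describeScene(`${context}\n\nUser prompt: ${prompt}`, imageBase64);
  Speech.speak(answer);
}
```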
Common Use Cases:
- Text Recognition: Extract and read text from images, documents, signs, and labels
- Object Recognition: Identify common objects, people, and environments with high accuracy
- Scene Understanding: Receive contextual descriptions of surroundings for better spatial awareness
- Intuitive Audio Feedback: Clear voice prompts guide users through the application
- Customizable Settings: Adjust the recognition model, activation threshold angle, speech recognition timeout period, and compression settings (sketched below)
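As a rough illustration only, the adjustable options might be modeled like this; the names and defaults below are hypothetical, not the app's actual settings schema.

```typescript
// Hypothetical shape of the user-adjustable settings; names and defaults are illustrative only.
interface PerceptionSettings {
  recognitionModel: string;          // e.g. which Gemini Vision model variant to use
  activationThresholdAngle: number;  // device tilt (degrees) that triggers voice activation
  speechTimeoutMs: number;           // how long speech recognition waits before giving up
  compressionQuality: number;        // 0..1 image compression applied before upload
}

const defaultSettings: PerceptionSettings = {
  recognitionModel: 'gemini-vision',
  activationThresholdAngle: 45,
  speechTimeoutMs: 5000,
  compressionQuality: 0.5,
};
```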
Tech Stack:
- Frameworks: React Native with Expo
- Language: TypeScript
- Text-to-Speech: `expo-speech`
- Speech transcription: `SFSpeechRecognizer` (iOS) and `SpeechRecognizer` (Android) for local transcription
- Image recognition: Gemini Vision for its low latency and cost (see the sketch below)
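For the image recognition step, the request to Gemini might look roughly like the sketch below. The `@google/generative-ai` SDK, the model name, and the environment variable are assumptions here, not taken from this repo; the returned description would then be handed to `Speech.speak` from `expo-speech` as in the loop sketched earlier.

```typescript
import { GoogleGenerativeAI } from '@google/generative-ai';

// Sketch: send the captured image plus the user's prompt to Gemini Vision.
// SDK package, model name, and env variable are assumptions, not taken from this repo.
async function describeScene(prompt: string, imageBase64: string): Promise<string> {
  const genAI = new GoogleGenerativeAI(process.env.EXPO_PUBLIC_GEMINI_API_KEY ?? '');
  const model = genAI.getGenerativeModel({ model: 'gemini-1.5-flash' });

  const result = await model.generateContent([
    prompt,
    // The image is passed inline as base64 data alongside the text prompt.
    { inlineData: { data: imageBase64, mimeType: 'image/jpeg' } },
  ]);
  return result.response.text();
}
```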
Android Development Build:
- Install dependencies: `npm install`
- Ensure the Android SDK is installed
- Create a development build: `npx expo prebuild --platform android`
- Create `android/local.properties` to point to the Android SDK path: `sdk.dir=C:\\Users\\YOUR_USERNAME\\AppData\\Local\\Android\\Sdk`
- Compile into native Android code: `npx expo run:android`
  - Reflects real-time code changes
  - Automatically connects to the local development server after the app is started
Debug APK:
```
cd android
./gradlew assembleDebug
```
- Located at `android/app/build/outputs/apk/debug/app-debug.apk`
- Once installed on a testing device, open the dev settings (by shaking the device) and change the build location to `YOUR_LOCAL_IP:8081`
- Used for distribution of a development build for initial testing
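If it helps with that initial testing, the debug APK can also be side-loaded onto a connected device with standard adb usage (not a step from this repo):
```
adb install android/app/build/outputs/apk/debug/app-debug.apk
```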
Release APK:
```
cd android
./gradlew assembleRelease
```
- Located at `android/app/build/outputs/apk/release/app-release.apk`
- Bundles all JavaScript into the APK
- Suitable for distribution to end users
Play Store Bundle (AAB):
```
cd android
./gradlew bundleRelease
```
- Located at `android/app/build/outputs/bundle/release/app-release.aab`
- Bundles all JavaScript into the AAB
- Suitable for distribution on the Play Store
iOS Development Build:
- Ensure Xcode is installed (Mac only)
- Create a development build: `npx expo prebuild --platform ios`
- Install iOS dependencies:
  ```
  cd ios
  pod install
  cd ..
  ```
- Compile and run on a simulator or device: `npx expo run:ios`
iOS Release Archive:
```
cd ios
xcodebuild -workspace iSight.xcworkspace -scheme iSight -configuration Release -archivePath iSight.xcarchive archive
```
Start the app:
```
npx expo start
```
Perception prioritizes user privacy. All image processing happens on-device when possible, and any data sent to cloud services is anonymized and not stored permanently.



