Vision is an AI-powered mobile application built with React Native and Expo that helps users "see" the world through auditory feedback. By capturing a photo, the app analyzes the user's facial expression or the objects in their environment and speaks the results back to them with a touch of personality.
## Features

- Emotion Detection: Recognizes a wide range of human emotions (Happy, Sad, Angry, Surprised, Calm, etc.) using AWS Rekognition.
- Intelligent Object Identification: Beyond simple labeling, the app uses Google Gemini to refine multiple detected tags into a single, most-likely physical object.
- Voice Synthesis: Converts detection results into natural speech using Google Cloud Text-to-Speech (Chirp HD voices).
- Concurrent Analysis: Runs facial-expression analysis and object detection simultaneously for a fuller understanding of the scene (see the sketch after this list).
- Dynamic Responses: Features a variety of personality-filled responses based on the detected emotion.
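For illustration, here is a minimal sketch of what the concurrent analysis could look like with the AWS SDK. It assumes the captured photo has already been read into a `Uint8Array`; `analyzeScene` and the placeholder credentials are hypothetical, not the app's actual code (the real credentials live in `config.ts`, described under Getting Started):

```ts
import {
  RekognitionClient,
  DetectFacesCommand,
  DetectLabelsCommand,
} from "@aws-sdk/client-rekognition";

// Placeholder client setup; the app reads its keys from config.ts.
const client = new RekognitionClient({
  region: "us-east-1",
  credentials: { accessKeyId: "YOUR_KEY", secretAccessKey: "YOUR_SECRET" },
});

// Run facial-expression and object-label analysis on the same photo in parallel.
export async function analyzeScene(imageBytes: Uint8Array) {
  const [faces, labels] = await Promise.all([
    client.send(
      new DetectFacesCommand({ Image: { Bytes: imageBytes }, Attributes: ["ALL"] })
    ),
    client.send(
      new DetectLabelsCommand({ Image: { Bytes: imageBytes }, MaxLabels: 10, MinConfidence: 70 })
    ),
  ]);

  // Pick the highest-confidence emotion on the first detected face, if any.
  const emotions = faces.FaceDetails?.[0]?.Emotions ?? [];
  const top = [...emotions].sort((a, b) => (b.Confidence ?? 0) - (a.Confidence ?? 0))[0];

  return {
    emotion: top?.Type, // e.g. "HAPPY"; undefined if no face was found
    labelNames: labels.Labels?.map((l) => l.Name ?? "") ?? [],
  };
}
```

Because the two Rekognition calls run under `Promise.all`, the round trip is bounded by the slower call rather than the sum of both.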
## Tech Stack

- Framework: React Native with Expo
- Navigation: Expo Router
- Computer Vision: AWS Rekognition SDK
- AI/LLM: Google Generative AI (Gemini 2.5 Flash Lite)
- Audio & Speech: Google Cloud Text-to-Speech API and `expo-av`
## Project Structure

- `/app`: Main application screens, including the Home (`index.tsx`) and Camera (`capture.tsx`) interfaces.
- `/lib`: Core logic for API integrations:
  - `rekognition.ts`: Handles AWS facial and label analysis.
  - `gemini.ts`: Refines object labels using generative AI (sketched below).
  - `googleTTS.ts`: Manages text-to-speech synthesis.
- `/assets`: Project images and custom fonts.
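The refinement step in `gemini.ts` can be sketched with the `@google/generative-ai` client. The function name `refineLabels` and the prompt wording below are assumptions for illustration, not the app's actual code:

```ts
import { GoogleGenerativeAI } from "@google/generative-ai";

// Placeholder key; the app would import this from config.ts.
const genAI = new GoogleGenerativeAI("YOUR_GEMINI_API_KEY");
const model = genAI.getGenerativeModel({ model: "gemini-2.5-flash-lite" });

// Collapse several Rekognition tags into the single most likely physical object.
export async function refineLabels(labels: string[]): Promise<string> {
  const prompt =
    `These tags all describe one photo: ${labels.join(", ")}. ` +
    `Answer with the single physical object they most likely depict, and nothing else.`;
  const result = await model.generateContent(prompt);
  return result.response.text().trim();
}
```

For example, Rekognition tags like "Electronics, Computer, Laptop, Keyboard" would come back as simply "laptop", which reads far more naturally when spoken aloud.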
## Getting Started

- Clone the repository.
- Install dependencies with `npm install`.
- Configure environment: create a `config.ts` file (referenced in the source) and provide your API credentials (see the sketch after these steps) for:
  - AWS Rekognition (Access Key, Secret Key, Region)
  - Google Cloud API Key (for TTS)
  - Google Gemini API Key
- Start the app with `npx expo start`.
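The repository does not ship `config.ts`, so the exact shape below is an assumption; match the export names to whatever the source actually imports:

```ts
// config.ts — example only; field names are assumptions.
export const AWS_ACCESS_KEY_ID = "YOUR_AWS_ACCESS_KEY";
export const AWS_SECRET_ACCESS_KEY = "YOUR_AWS_SECRET_KEY";
export const AWS_REGION = "us-east-1";
export const GOOGLE_CLOUD_API_KEY = "YOUR_GOOGLE_CLOUD_API_KEY"; // Text-to-Speech
export const GEMINI_API_KEY = "YOUR_GEMINI_API_KEY";
```

Since this file holds live credentials, keep it out of version control (add it to `.gitignore`).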
## Usage

- Open the app to the Welcome Screen.
- Tap the Vision Logo to open the camera.
- Point the camera at yourself or an object and press the Capture button.
- The app will display "Analyzing..." while it communicates with AWS and Google Cloud.
- Listen to the AI describe your emotion or the object it sees!
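The speech step can be as simple as one REST call to the Cloud Text-to-Speech API followed by `expo-av` playback. A minimal sketch, with some assumptions: `speak` is a hypothetical helper, the Chirp 3 HD voice name is just one valid example, and playback relies on `expo-av` accepting a base64 data URI as a source:

```ts
import { Audio } from "expo-av";

// Synthesize speech via the Cloud TTS REST endpoint, then play it back.
export async function speak(text: string, apiKey: string) {
  const res = await fetch(
    `https://texttospeech.googleapis.com/v1/text:synthesize?key=${apiKey}`,
    {
      method: "POST",
      headers: { "Content-Type": "application/json" },
      body: JSON.stringify({
        input: { text },
        voice: { languageCode: "en-US", name: "en-US-Chirp3-HD-Aoede" },
        audioConfig: { audioEncoding: "MP3" },
      }),
    }
  );
  const { audioContent } = await res.json(); // base64-encoded MP3

  // Load the synthesized audio from a data URI and play it.
  const { sound } = await Audio.Sound.createAsync({
    uri: `data:audio/mp3;base64,${audioContent}`,
  });
  await sound.playAsync();
}
```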
Developed with assistance from Gemini 2.5 Pro.