Sixth Sense is an AI-powered real-time assistant designed to help visually impaired individuals with navigation, object detection, and the automation of daily tasks. Using computer vision, voice recognition, and conversational AI, the system provides real-time audio feedback about the surrounding environment and enables seamless voice-command-based interaction.
Built by Team Code Crusaders for Nexathon'25
- Real-Time Object & Obstacle Detection: Identifies objects and obstacles in the user's path and provides audio alerts with relative distance.
- Indoor & Outdoor Navigation: Provides voice-guided navigation from one location to another.
- Conversational AI Interface: Accepts and processes voice commands for various tasks.
- Image, Face, and Text Recognition: Describes specific images, recognizes faces, and extracts text.
- Environmental Awareness: Describes what’s happening around the user.
- Smart Assistant Features: Allows texting, calling, and other mobile interactions via voice commands.
- Computer Vision: OpenCV, YOLO (You Only Look Once) for object detection.
- Speech Processing: Azure Speech Services for voice commands and text-to-speech.
- Cloud Services: Azure Cognitive Services (Computer Vision, Speech APIs).
- Automation & Device Control: ADB (Android Debug Bridge) for mobile automation.
- User opens the mobile app and provides a live camera feed.
- The system detects objects & obstacles in real-time and alerts the user.
- The user can issue voice commands to perform actions like navigation, object recognition, or automation tasks.
- If a destination is provided, the app offers step-by-step voice-guided navigation.
- The system integrates with third-party apps for tasks like calling, messaging, or media playback.
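One plausible way the "detect and alert" step in this flow could turn a detection into spoken feedback is sketched below; the bounding-box-height heuristic for relative distance and the use of `pyttsx3` for text-to-speech are illustrative assumptions, not the project's confirmed approach.

```python
# Sketch of the alert step: convert a detection into a spoken proximity warning.
# The box-height heuristic for "relative distance" and pyttsx3 for offline
# text-to-speech are assumptions made for illustration.
import pyttsx3

engine = pyttsx3.init()

def alert_user(label: str, box_height: int, frame_height: int) -> None:
    """Speak an obstacle alert, estimating proximity from bounding-box height."""
    ratio = box_height / frame_height
    if ratio > 0.6:
        distance = "very close"
    elif ratio > 0.3:
        distance = "a few steps ahead"
    else:
        distance = "far ahead"
    engine.say(f"{label} {distance}")
    engine.runAndWait()

# Example: a detected chair whose box covers half the frame height.
alert_user("chair", box_height=360, frame_height=720)
```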
- Navigation & Object Detection:
- Get directions (source & destination).
- Detect obstacles in a path with relative distance.
- Describe images or surroundings.
- Text & Speech Processing:
- Extract text from an image.
- Read out what’s on the screen.
- App Integrations & Automation:
- Search on YouTube.
- Search on Google.
- Send a WhatsApp message.
- Play music on Spotify.
- Order food via Zomato.
- Book a ride on Rapido.
- Book tickets via Redbus.
- Check unread messages on WhatsApp.
- Upload an Instagram story.
- Send a WhatsApp voice message.
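Internally, commands like these have to be mapped to actions. The sketch below shows one simple keyword-routing approach, purely as an illustration; the actual intent handling may be more sophisticated.

```python
# Sketch: route a transcribed voice command to an action via keyword matching.
# Actions are stubbed with print statements; the real app would call its
# navigation, vision, or automation modules instead.
def handle_command(command: str) -> None:
    text = command.lower()
    if "navigate" in text or "directions" in text:
        print("Starting voice-guided navigation...")
    elif "describe" in text or "around" in text:
        print("Describing the surroundings...")
    elif "read" in text:
        print("Reading on-screen text aloud...")
    elif "whatsapp" in text:
        print("Sending a WhatsApp message...")
    else:
        print("Sorry, I didn't understand that command.")

handle_command("Navigate to the nearest bus stop")
```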
- Python 3.x
- OpenCV, YOLO, Azure SDKs
- Android Debug Bridge (ADB) for automation
- Clone the Repository

  ```bash
  git clone https://github.com/RO-HIT17/SixthSense.git
  cd sixthsense
  ```

- Install Dependencies

  ```bash
  pip install -r requirements.txt
  ```

- Set Up Azure Services
  - Create an Azure Cognitive Services account.
  - Obtain API keys for Speech and Computer Vision.
  - Store them in a `.env` file (a loading sketch follows these steps):

    ```
    AZURE_SPEECH_KEY=your_speech_api_key
    AZURE_VISION_KEY=your_vision_api_key
    ```
- Enable ADB for Mobile Automation (an automation sketch also follows these steps)

  ```bash
  adb tcpip 5555
  adb connect DEVICE_IP:5555
  ```

- Run the Application

  ```bash
  python main.py
  ```
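For reference, the keys stored in `.env` could be loaded and used to configure the Azure Speech SDK roughly as follows. `python-dotenv` and an `AZURE_SPEECH_REGION` variable are assumptions here, since the region is not part of the `.env` snippet above.

```python
# Sketch: load Azure credentials from .env and synthesize speech.
# Assumes python-dotenv is installed and that AZURE_SPEECH_REGION is also
# stored in .env; the .env snippet above only shows the two keys.
import os
from dotenv import load_dotenv
import azure.cognitiveservices.speech as speechsdk

load_dotenv()
speech_config = speechsdk.SpeechConfig(
    subscription=os.environ["AZURE_SPEECH_KEY"],
    region=os.environ.get("AZURE_SPEECH_REGION", "eastus"),
)
synthesizer = speechsdk.SpeechSynthesizer(speech_config=speech_config)
synthesizer.speak_text_async("Sixth Sense is ready.").get()
```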
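Once ADB is connected, the app can drive the phone from Python. The snippet below is an illustrative sketch using `subprocess`; the device address and the example dialer action are assumptions, not the project's actual automation code.

```python
# Sketch: issue ADB commands to a wirelessly connected Android device.
# DEVICE and the dialer example are placeholders for illustration.
import subprocess

DEVICE = "DEVICE_IP:5555"

def adb(*args: str) -> str:
    """Run an adb command against the connected device and return its output."""
    result = subprocess.run(
        ["adb", "-s", DEVICE, *args],
        capture_output=True, text=True, check=True,
    )
    return result.stdout

# Example: open the dialer pre-filled with a phone number.
adb("shell", "am", "start", "-a", "android.intent.action.DIAL", "-d", "tel:1234567890")
```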
- Offline Functionality: Reduce dependency on cloud services for better accessibility and performance.
- Advanced NLP Integration: Improve natural language processing to enhance conversational capabilities.
- Wearable Support: Extend functionality to smart glasses and other assistive wearables.
- AI-Powered Assistance: Evolve towards a Jarvis-like AI assistant for comprehensive automation.
- Enhanced Depth Estimation: Improve object detection accuracy with advanced depth perception for safer navigation.
- Smart Device & App Integration: Enable seamless interaction with IoT devices and third-party applications.
- Contextual Awareness: Provide more personalized and situation-aware assistance for users.