Egocentric Diarization

## 🚀 Feature
EGO4D is the world's largest egocentric (first-person) video ML dataset and benchmark suite, with 3,600 hours (and counting) of densely narrated video and a wide range of annotations across five new benchmark tasks. It covers hundreds of daily-life scenarios (household, outdoor, workplace, leisure, etc.) captured in the wild by 926 unique camera wearers from 74 worldwide locations in 9 different countries. Portions of the video are accompanied by audio, 3D meshes of the environment, eye gaze, stereo, and/or synchronized videos from multiple egocentric cameras at the same event.


This project focuses on the following use cases:
- Audio-visual speaker diarization: given an egocentric video clip, identify which person spoke and when they spoke.
- Speech transcription: given an egocentric video clip, transcribe the speech of each person.
- Use Ego4D video clips as the input data.
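
To make the expected output of the two use cases concrete, here is a minimal sketch of how diarization and transcription results could be represented. The names `SpeakerSegment`, `speaking_time`, and `speakers_at` are hypothetical and only for illustration; they are not part of LabGraph or the Ego4D tooling, and the actual extension may choose a different format.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass(frozen=True)
class SpeakerSegment:
    """One contiguous stretch of speech (hypothetical structure)."""

    speaker_id: str       # e.g. "camera_wearer" or "person_1"
    start_s: float        # segment start, in seconds from clip start
    end_s: float          # segment end, in seconds from clip start
    transcript: str = ""  # optional transcription of this segment


def speaking_time(segments: List[SpeakerSegment]) -> Dict[str, float]:
    """Total speaking time per speaker, in seconds."""
    totals: Dict[str, float] = {}
    for seg in segments:
        duration = seg.end_s - seg.start_s
        totals[seg.speaker_id] = totals.get(seg.speaker_id, 0.0) + duration
    return totals


def speakers_at(segments: List[SpeakerSegment], t_s: float) -> List[str]:
    """Sorted ids of everyone speaking at time t_s (overlaps allowed)."""
    return sorted(
        {seg.speaker_id for seg in segments if seg.start_s <= t_s < seg.end_s}
    )


# Made-up example segments for a single clip (not real Ego4D annotations).
example_segments = [
    SpeakerSegment("camera_wearer", 0.0, 2.5, "where did you put the keys"),
    SpeakerSegment("person_1", 2.0, 4.0, "on the kitchen table"),
    SpeakerSegment("camera_wearer", 5.0, 6.5, "thanks"),
]

if __name__ == "__main__":
    print(speaking_time(example_segments))
    print(speakers_at(example_segments, 2.25))
```

Segments are allowed to overlap in time, since people in egocentric recordings often talk over each other; that is why `speakers_at` returns a list rather than a single speaker.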

## Additional context

- An example application can be found [here](https://eval.ai/web/challenges/challenge-page/1640/overview).
- The code should be added in the folder https://github.com/facebookresearch/labgraph/tree/main/extensions/labgraph_diarization
- Create `setup.py` and `README.md`; an example can be found at https://github.com/facebookresearch/labgraph/tree/main/extensions/labgraph_viz
- Add GitHub Actions support; for reference, see https://github.com/facebookresearch/labgraph/actions/workflows/main.yml
- Add a proper license header.
