Audio Weaver is a cutting-edge technological solution designed to streamline podcast creation and foster a deeper, more collaborative learning experience around audio content. Leveraging the power of Large Language Models (LLMs), Audio Weaver enables users to generate podcast episodes and engage with the content in meaningful ways.
## Features

Audio Weaver is designed to offer the following key functionalities:
- Effortless Podcast Generation: Quickly create podcast episodes based on your input, powered by Google's Gemini models.
- Guided Comprehension of Complex Information: Navigate and understand dense topics through well-structured audio, potentially with segmented content and integrated summaries.
- Active Engagement and Shared Understanding: Interact with content through integrated note-taking, time-stamped comments, discussion forums, and self-assessment features.
- Potential for Future Multilingual Content: While currently English-only, the platform is designed with future multilingual capabilities in mind.
- Seamless Sharing and Community Building: Easily share podcast episodes or specific segments across various platforms and connect with others around the content.
## Current Limitations

Please be aware of the current limitations of Audio Weaver:
- Podcast Output Quality Dependent on LLM: The quality of the generated audio content is inherently tied to the capabilities and nuances of the underlying Large Language Model.
- Limited Output Control Beyond System Prompts: User control over the generated output is primarily managed through the system prompts provided to the application.
- Potential for Users to Bypass Scaffolding: While designed to guide learning, users may choose to interact with the content outside of the intended scaffolding features.
- Evolving User Experience: Continuous effort is still being placed on refining the user experience and ensuring an intuitive user journey within the application.
- English Outputs Only (Currently): The current version of Audio Weaver exclusively supports the generation of podcast content in English.
- LLM Performance in Other Languages: While LLMs generally perform exceptionally well in English, performance in other languages is gradually improving and will be a focus for future development.
## API Keys

Audio Weaver relies on external services for its AI-powered features. You will need to provide API keys for these services. While the application allows you to enter these keys through an in-app settings modal, it's recommended to configure them as environment variables for more robust backend functionality, especially in deployed environments.
- `GEMINI_API_KEY`
  - Purpose: Used for generating summaries and other AI-driven content analysis via Google's Gemini models.
  - Setup: Obtain your API key from Google AI Studio (or your Google Cloud project) and set it as an environment variable named `GEMINI_API_KEY`.
- `ELEVENLABS_API_KEY`
  - Purpose: Used for converting the generated text summaries into high-quality audio using ElevenLabs' text-to-speech service.
  - Setup: Obtain your API key from your ElevenLabs account and set it as an environment variable named `ELEVENLABS_API_KEY`.
Fallback Behavior:
- The application's backend services (like document processing) will prioritize API keys set as environment variables.
- If these environment variables are not set, the backend will attempt to use keys that you have saved through the in-app settings modal.
- If no keys are found in either environment variables or the in-app settings, the backend will use placeholder demo keys. This allows the application to run, but the AI summarization and text-to-speech features will not function correctly. (This resolution order is sketched after this list.)
- The in-app settings modal (mentioned in the "Usage" section below) provides a user-friendly way to manage your API keys, especially for client-side operations or if you prefer not to set environment variables during local development.
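For clarity, here is a minimal sketch of that resolution order in TypeScript. The function name, the `savedSettings` argument, and the placeholder value are illustrative, not the actual helpers in this codebase:

```typescript
// Illustrative sketch of the key-resolution order described above.
// `savedSettings` stands in for keys saved via the in-app settings modal;
// the real storage mechanism in this codebase may differ.
function resolveApiKey(
  envName: "GEMINI_API_KEY" | "ELEVENLABS_API_KEY",
  savedSettings: Record<string, string | undefined>,
): string {
  // 1. Environment variables take precedence (e.g., Replit Secrets,
  //    Netlify environment variables).
  const fromEnv = process.env[envName];
  if (fromEnv) return fromEnv;

  // 2. Otherwise, fall back to a key saved through the settings modal.
  const fromSettings = savedSettings[envName];
  if (fromSettings) return fromSettings;

  // 3. Last resort: a placeholder demo key. The app still runs, but AI
  //    summarization and text-to-speech calls will not function correctly.
  return "demo-key-placeholder";
}
```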
- The app currently uses Gemini 2.0 Flash Lite (or a similar model accessible via your API key) for improved performance on PDF parsing at lower cost.
- The app also uses ElevenLabs Flash v2.5 (or a similar model accessible via your API key) for producing high-quality TTS output; a combined sketch of both calls follows this list.
- The app provides a modal for users to reflect on the produced content.
- Use Replit or Netlify to host the application.
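The following sketch shows roughly how these two model calls fit together, using the official `@google/generative-ai` Node SDK and the ElevenLabs REST API. The voice ID and prompt are placeholders, and the actual pipeline in this codebase may be structured differently:

```typescript
import { GoogleGenerativeAI } from "@google/generative-ai";

// Summarize a PDF with Gemini 2.0 Flash Lite, then synthesize the summary
// with ElevenLabs Flash v2.5. The model IDs match those mentioned above;
// the voice ID and prompt wording are illustrative placeholders.
async function pdfToAudio(pdfBase64: string): Promise<ArrayBuffer> {
  const genAI = new GoogleGenerativeAI(process.env.GEMINI_API_KEY!);
  const model = genAI.getGenerativeModel({ model: "gemini-2.0-flash-lite" });

  // Gemini accepts PDFs as inline base64 data alongside a text prompt.
  const result = await model.generateContent([
    { inlineData: { mimeType: "application/pdf", data: pdfBase64 } },
    "Summarize this document as a short podcast script.",
  ]);
  const script = result.response.text();

  // ElevenLabs text-to-speech; replace YOUR_VOICE_ID with a real voice.
  const ttsResponse = await fetch(
    "https://api.elevenlabs.io/v1/text-to-speech/YOUR_VOICE_ID",
    {
      method: "POST",
      headers: {
        "xi-api-key": process.env.ELEVENLABS_API_KEY!,
        "Content-Type": "application/json",
      },
      body: JSON.stringify({ text: script, model_id: "eleven_flash_v2_5" }),
    },
  );
  if (!ttsResponse.ok) throw new Error(`TTS failed: ${ttsResponse.status}`);
  return ttsResponse.arrayBuffer(); // MP3 audio bytes
}
```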
## Database Setup

Audio Weaver requires a PostgreSQL database to store user data, podcast information, and other application data.
- Set up a PostgreSQL database. You can use any PostgreSQL provider; some popular choices include:
  - Neon (serverless PostgreSQL, offers a free tier)
  - AWS RDS for PostgreSQL
  - Google Cloud SQL for PostgreSQL
  - A local PostgreSQL instance for development
- Configure `DATABASE_URL`:
  - Once your database is provisioned, you will get a connection string (URL).
  - Set this connection string as an environment variable named `DATABASE_URL` in your deployment environment (e.g., Replit Secrets, Netlify environment variables).
  - The format typically looks like: `postgresql://user:password@host:port/database`
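For orientation, a backend using Drizzle ORM (as this project does) could open a connection from that variable roughly as follows; this is a generic sketch, and the actual wiring lives in the project source:

```typescript
import pg from "pg";
import { drizzle } from "drizzle-orm/node-postgres";

// Build a connection pool from the DATABASE_URL environment variable.
const pool = new pg.Pool({ connectionString: process.env.DATABASE_URL });

// Drizzle wraps the pool; queries elsewhere import `db` from this module.
export const db = drizzle(pool);
```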
## Schema Management

The database schema is defined in `shared/schema.ts`. This project uses Drizzle ORM and Drizzle Kit for schema management. The Drizzle Kit configuration is located in `drizzle.config.ts`.
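The repository's `drizzle.config.ts` is authoritative; for reference, a typical Drizzle Kit configuration for this file layout might look like the following:

```typescript
import { defineConfig } from "drizzle-kit";

// Hypothetical configuration matching the schema and migrations paths
// described above; consult the actual drizzle.config.ts in this repository.
export default defineConfig({
  dialect: "postgresql",
  schema: "./shared/schema.ts",
  out: "./migrations",
  dbCredentials: { url: process.env.DATABASE_URL! },
});
```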
There are two primary ways to manage and apply schema changes:
- Generating SQL Migrations (Recommended for Production):
  - After making changes to `shared/schema.ts`, you can generate SQL migration files by running `npm run db:generate`.
  - This command will create new migration files in the `./migrations` directory. These files contain the SQL statements to update your database schema.
  - Review these files and commit them to your version control system.
  - To apply these migrations to your database in a production environment, you would typically use a separate migration tool or script that executes the SQL in these files. (Note: this project does not currently include a built-in script for running these SQL migrations; `db:push` is used for development. One possible runner is sketched after this list.)
- Pushing Schema Changes Directly (Convenient for Development):
  - For development purposes, you can directly synchronize your database schema with the definitions in `shared/schema.ts` by running `npm run db:push`.
  - This command will attempt to make the necessary changes to your database to match the schema.
  - Caution: `db:push` is destructive and not recommended for production databases, as it can lead to data loss if not used carefully. It's best suited for local development and prototyping.
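If you do want to apply the generated SQL migrations programmatically, Drizzle ORM ships a migrator that could be wired into a small script like the one below. This is an illustrative sketch; no such script currently exists in this repo:

```typescript
import pg from "pg";
import { drizzle } from "drizzle-orm/node-postgres";
import { migrate } from "drizzle-orm/node-postgres/migrator";

// Apply all pending SQL files from ./migrations, then close the pool.
async function runMigrations(): Promise<void> {
  const pool = new pg.Pool({ connectionString: process.env.DATABASE_URL });
  const db = drizzle(pool);
  await migrate(db, { migrationsFolder: "./migrations" });
  await pool.end();
}

runMigrations().catch((err) => {
  console.error("Migration failed:", err);
  process.exit(1);
});
```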
## Usage

- The app will ask you for your API keys, which are stored locally in your instance for connecting with the AI services (Gemini, ElevenLabs).
- The app will ask you to upload the documents that you want to parse and analyze.
- Before processing documents, the app will ask you for additional context through a set of guiding questions.
- The app then works its magic to generate a summary and an audio output.
- Finally, you critically reflect on the output through a series of prompts to enhance the audio generation.
## Contributing

We welcome contributions to Audio Weaver! If you have ideas for improvements, bug reports, or would like to contribute code, please follow these guidelines:
- Fork the repository.
- Create a new branch for your feature or bug fix.
- Make your changes and commit them.
- Push your changes to your fork.
- Submit a pull request.
Please adhere to our Code of Conduct.

## Support
For any questions, issues, or feedback, please [link to your support channel, e.g., GitHub Issues, a forum, or contact email].
## License

This project is licensed under the MIT License - see the LICENSE file for details.
## Acknowledgments

- Powered by Google's Gemini models and ElevenLabs text-to-speech.
- Inspired by research in [relevant research areas]
## Future Enhancements

We are continuously working on improving Audio Weaver. Some planned future enhancements include:
- Expansion to support multiple languages for both prompting and output.
- Refinements to the audio output quality through advanced LLM techniques or post-processing options.
- More granular control over the generated audio (e.g., voice selection, music integration).
- Enhanced scaffolding features to further guide and support learning.
- Integration with other learning platforms and tools.
- Improved user interface and user experience based on user feedback.
Stay tuned for updates and new features as we continue to weave the future of audio learning!