Audio Weaver is a cutting-edge technological solution designed to streamline podcast creation and foster a deeper, more collaborative learning experience around audio content. Leveraging the power of Large Language Models (LLMs), Audio Weaver enables users to generate podcast episodes and engage with the content in meaningful ways.
## Features

Audio Weaver is designed to offer the following key functionalities:
- Effortless Podcast Generation: Quickly create podcast episodes based on your input, powered by Google's Gemini models.
- Guided Comprehension of Complex Information: Navigate and understand dense topics through well-structured audio, potentially with segmented content and integrated summaries.
- Active Engagement and Shared Understanding: Interact with content through integrated note-taking, time-stamped comments, discussion forums, and self-assessment features.
- Potential for Future Multilingual Content: While currently English-only, the platform is designed with future multilingual capabilities in mind.
- Seamless Sharing and Community Building: Easily share podcast episodes or specific segments across various platforms and connect with others around the content.
## Current Limitations

Please be aware of the current limitations of Audio Weaver:
- Podcast Output Quality Dependent on LLM: The quality of the generated audio content is inherently tied to the capabilities and nuances of the underlying Large Language Model.
- Limited Output Control Beyond System Prompts: User control over the generated output is primarily managed through the system prompts provided to the application.
- Potential for Users to Bypass Scaffolding: While designed to guide learning, users may choose to interact with the content outside of the intended scaffolding features.
- Evolving User Experience: Continuous effort is still being placed on refining the user experience and ensuring an intuitive user journey within the application.
- English Outputs Only (Currently): The current version of Audio Weaver exclusively supports the generation of podcast content in English.
- LLM Performance in Other Languages: While LLMs generally perform exceptionally well in English, performance in other languages is gradually improving and will be a focus for future development.
## API Keys

Audio Weaver relies on external services for its AI-powered features. You will need to provide API keys for these services. While the application allows you to enter these keys through an in-app settings modal, it's recommended to configure them as environment variables for more robust backend functionality, especially in deployed environments.
- `GEMINI_API_KEY`
  - Purpose: Used for generating summaries and other AI-driven content analysis via Google's Gemini models.
  - Setup: Obtain your API key from Google AI Studio (or your Google Cloud project) and set it as an environment variable named `GEMINI_API_KEY`.
- `ELEVENLABS_API_KEY`
  - Purpose: Used for converting the generated text summaries into high-quality audio using ElevenLabs' text-to-speech service.
  - Setup: Obtain your API key from your ElevenLabs account and set it as an environment variable named `ELEVENLABS_API_KEY`.
Fallback Behavior:
- The application's backend services (like document processing) will prioritize API keys set as environment variables.
- If these environment variables are not set, the backend will attempt to use keys that you have saved through the in-app settings modal.
- If no keys are found in either environment variables or the in-app settings, the backend will use placeholder demo keys. This allows the application to run, but the AI summarization and text-to-speech features will not function correctly. (This resolution order is sketched after this list.)
- The in-app settings modal (mentioned in the "Usage" section below) provides a user-friendly way to manage your API keys, especially for client-side operations or if you prefer not to set environment variables during local development.
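For clarity, here is a minimal sketch of that resolution order in TypeScript. The function name, the `savedSettings` argument, and the placeholder value are illustrative, not the actual helpers in this codebase:

```typescript
// Illustrative sketch of the key-resolution order described above.
// `savedSettings` stands in for keys saved via the in-app settings modal;
// the real storage mechanism in this codebase may differ.
function resolveApiKey(
  envName: "GEMINI_API_KEY" | "ELEVENLABS_API_KEY",
  savedSettings: Record<string, string | undefined>,
): string {
  // 1. Environment variables take precedence (e.g., Replit Secrets,
  //    Netlify environment variables).
  const fromEnv = process.env[envName];
  if (fromEnv) return fromEnv;

  // 2. Otherwise, fall back to a key saved through the settings modal.
  const fromSettings = savedSettings[envName];
  if (fromSettings) return fromSettings;

  // 3. Last resort: a placeholder demo key. The app still runs, but AI
  //    summarization and text-to-speech calls will not function correctly.
  return "demo-key-placeholder";
}
```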
- The app currently uses Gemini 2.0 Flash Lite (or a similar model accessible via your API key) for improved performance on PDF parsing at lower cost.
- The app also uses ElevenLabs Flash v2.5 (or a similar model accessible via your API key) for producing high-quality TTS output; a combined sketch of both calls follows this list.
- The app provides a modal for users to reflect on the produced content.
- Use Replit or Netlify to host the application.
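The following sketch shows roughly how these two model calls fit together, using the official `@google/generative-ai` Node SDK and the ElevenLabs REST API. The voice ID and prompt are placeholders, and the actual pipeline in this codebase may be structured differently:

```typescript
import { GoogleGenerativeAI } from "@google/generative-ai";

// Summarize a PDF with Gemini 2.0 Flash Lite, then synthesize the summary
// with ElevenLabs Flash v2.5. The model IDs match those mentioned above;
// the voice ID and prompt wording are illustrative placeholders.
async function pdfToAudio(pdfBase64: string): Promise<ArrayBuffer> {
  const genAI = new GoogleGenerativeAI(process.env.GEMINI_API_KEY!);
  const model = genAI.getGenerativeModel({ model: "gemini-2.0-flash-lite" });

  // Gemini accepts PDFs as inline base64 data alongside a text prompt.
  const result = await model.generateContent([
    { inlineData: { mimeType: "application/pdf", data: pdfBase64 } },
    "Summarize this document as a short podcast script.",
  ]);
  const script = result.response.text();

  // ElevenLabs text-to-speech; replace YOUR_VOICE_ID with a real voice.
  const ttsResponse = await fetch(
    "https://api.elevenlabs.io/v1/text-to-speech/YOUR_VOICE_ID",
    {
      method: "POST",
      headers: {
        "xi-api-key": process.env.ELEVENLABS_API_KEY!,
        "Content-Type": "application/json",
      },
      body: JSON.stringify({ text: script, model_id: "eleven_flash_v2_5" }),
    },
  );
  if (!ttsResponse.ok) throw new Error(`TTS failed: ${ttsResponse.status}`);
  return ttsResponse.arrayBuffer(); // MP3 audio bytes
}
```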
## Database Setup

Audio Weaver requires a PostgreSQL database to store user data, podcast information, and other application data.
- Set up a PostgreSQL database. You can use any PostgreSQL provider; some popular choices include:
  - Neon (serverless PostgreSQL, offers a free tier)
  - AWS RDS for PostgreSQL
  - Google Cloud SQL for PostgreSQL
  - A local PostgreSQL instance for development
- Configure `DATABASE_URL`:
  - Once your database is provisioned, you will get a connection string (URL).
  - Set this connection string as an environment variable named `DATABASE_URL` in your deployment environment (e.g., Replit Secrets, Netlify environment variables).
  - The format typically looks like: `postgresql://user:password@host:port/database`
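For orientation, a backend using Drizzle ORM (as this project does) could open a connection from that variable roughly as follows; this is a generic sketch, and the actual wiring lives in the project source:

```typescript
import pg from "pg";
import { drizzle } from "drizzle-orm/node-postgres";

// Build a connection pool from the DATABASE_URL environment variable.
const pool = new pg.Pool({ connectionString: process.env.DATABASE_URL });

// Drizzle wraps the pool; queries elsewhere import `db` from this module.
export const db = drizzle(pool);
```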
## Schema Management

The database schema is defined in `shared/schema.ts`. This project uses Drizzle ORM and Drizzle Kit for schema management. The Drizzle Kit configuration is located in `drizzle.config.ts`.
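The repository's `drizzle.config.ts` is authoritative; for reference, a typical Drizzle Kit configuration for this file layout might look like the following:

```typescript
import { defineConfig } from "drizzle-kit";

// Hypothetical configuration matching the schema and migrations paths
// described above; consult the actual drizzle.config.ts in this repository.
export default defineConfig({
  dialect: "postgresql",
  schema: "./shared/schema.ts",
  out: "./migrations",
  dbCredentials: { url: process.env.DATABASE_URL! },
});
```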
There are two primary ways to manage and apply schema changes:
- Generating SQL Migrations (Recommended for Production):
  - After making changes to `shared/schema.ts`, you can generate SQL migration files by running `npm run db:generate`.
  - This command will create new migration files in the `./migrations` directory. These files contain the SQL statements to update your database schema.
  - Review these files and commit them to your version control system.
  - To apply these migrations to your database in a production environment, you would typically use a separate migration tool or script that executes the SQL in these files. (Note: this project does not currently include a built-in script for running these SQL migrations; `db:push` is used for development. One possible runner is sketched after this list.)
- Pushing Schema Changes Directly (Convenient for Development):
  - For development purposes, you can directly synchronize your database schema with the definitions in `shared/schema.ts` by running `npm run db:push`.
  - This command will attempt to make the necessary changes to your database to match the schema.
  - Caution: `db:push` is destructive and not recommended for production databases, as it can lead to data loss if not used carefully. It's best suited for local development and prototyping.
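If you do want to apply the generated SQL migrations programmatically, Drizzle ORM ships a migrator that could be wired into a small script like the one below. This is an illustrative sketch; no such script currently exists in this repo:

```typescript
import pg from "pg";
import { drizzle } from "drizzle-orm/node-postgres";
import { migrate } from "drizzle-orm/node-postgres/migrator";

// Apply all pending SQL files from ./migrations, then close the pool.
async function runMigrations(): Promise<void> {
  const pool = new pg.Pool({ connectionString: process.env.DATABASE_URL });
  const db = drizzle(pool);
  await migrate(db, { migrationsFolder: "./migrations" });
  await pool.end();
}

runMigrations().catch((err) => {
  console.error("Migration failed:", err);
  process.exit(1);
});
```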
## Usage

- The app will ask you for your API keys, which are stored locally in your instance for connecting with the AI services (Gemini, ElevenLabs).
- The app will ask you to upload the documents that you want to parse and analyze.
- Before processing documents, the app will ask you for additional context through a set of guiding questions.
- The app then works its magic to generate a summary and an audio output.
- Finally, you critically reflect on the output through a series of prompts to enhance the audio generation.
## Contributing

We welcome contributions to Audio Weaver! If you have ideas for improvements, bug reports, or would like to contribute code, please follow these guidelines:
- Fork the repository.
- Create a new branch for your feature or bug fix.
- Make your changes and commit them.
- Push your changes to your fork.
- Submit a pull request.
Please adhere to our Code of Conduct.

## Support
For any questions, issues, or feedback, please [link to your support channel, e.g., GitHub Issues, a forum, or contact email].
## License

This project is licensed under the MIT License - see the LICENSE file for details.
## Acknowledgments

- Powered by Google's Gemini models and ElevenLabs text-to-speech.
- Inspired by research in [relevant research areas]
## Future Enhancements

We are continuously working on improving Audio Weaver. Some planned future enhancements include:
- Expansion to support multiple languages for both prompting and output.
- Refinements to the audio output quality through advanced LLM techniques or post-processing options.
- More granular control over the generated audio (e.g., voice selection, music integration).
- Enhanced scaffolding features to further guide and support learning.
- Integration with other learning platforms and tools.
- Improved user interface and user experience based on user feedback.
Stay tuned for updates and new features as we continue to weave the future of audio learning!