Media Tool Kit is my personal framework for media manipulation. It's designed for LLMs within generative coding environments to make modifications to images, sounds, and video quickly and consistently.
The idea is to make quick edits to media just by asking for them.
"Compress this video to 10MB" or "Remove transparency". That kind of stuff.
This is a Python-centric workspace. Scripts created in this workspace will leverage the following frameworks.
- Video: ffmpeg
- Image: ImageMagick
- Audio: SoX (Sound eXchange)
I expect this list to grow over time, but for now this will do.
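As a rough sketch of how a script in this workspace might drive ffmpeg for a request like "Compress this video to 10MB": the helper names below are hypothetical, the 128 kbit/s audio budget and the libx264/aac codecs are assumptions, and the clip duration is passed in by hand (in practice it would come from something like ffprobe). The function only builds the command, it does not run it.

```python
import shlex


def target_video_bitrate_kbps(target_mb: float, duration_s: float,
                              audio_kbps: int = 128) -> int:
    """Video bitrate (kbit/s) that should land near target_mb for the clip."""
    if duration_s <= 0:
        raise ValueError("duration_s must be positive")
    # 1 MB is treated as 8000 kilobits (decimal megabytes).
    total_kbps = target_mb * 8000 / duration_s
    video_kbps = int(total_kbps - audio_kbps)
    if video_kbps <= 0:
        raise ValueError("target size is too small for this duration")
    return video_kbps


def compress_command(src: str, dst: str, target_mb: float,
                     duration_s: float, audio_kbps: int = 128) -> list[str]:
    """Build (but do not run) a single-pass ffmpeg command."""
    video_kbps = target_video_bitrate_kbps(target_mb, duration_s, audio_kbps)
    return [
        "ffmpeg", "-i", src,
        "-c:v", "libx264", "-b:v", f"{video_kbps}k",
        "-c:a", "aac", "-b:a", f"{audio_kbps}k",
        dst,
    ]


if __name__ == "__main__":
    # Hypothetical 60-second clip and file names, for illustration only.
    print(shlex.join(compress_command("input.mp4", "output.mp4", 10, 60)))
```

A single-pass bitrate target only approximates the final size, so the result may need a second attempt with a slightly lower bitrate.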
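Install the dependencies with Homebrew:
brew install ffmpeg imagemagick sox python3
Or with apt on Debian/Ubuntu:
sudo apt update
sudo apt install ffmpeg imagemagick sox python3 python3-pip
Verify the installation:
ffmpeg -version
convert -version
sox --version
python3 --version
The workspace is laid out as follows:
media-tool-kit/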
├── projects/ # Individual media processing projects
├── scripts/ # Reusable utility scripts
├── .cursor/rules/ # Cursor IDE rules and guidelines
└── README.md
- Drop media files into the root or describe what you need
- AI assistant creates organized project folder in projects/
- Processing happens with clear logging
- Results saved to the project's output/ directory
- Commands logged for reproducibility
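One way the "commands logged for reproducibility" step could look in a script is sketched below. The `run_logged` helper and the `commands.log` file name are hypothetical, not something this workspace prescribes; the log could just as well be appended to the project's README.md.

```python
import shlex
import subprocess
import sys
import tempfile
from datetime import datetime
from pathlib import Path


def run_logged(cmd: list[str], project_dir: Path) -> int:
    """Run cmd, then append the command line and exit code to commands.log."""
    log_path = Path(project_dir) / "commands.log"  # hypothetical log file name
    log_path.parent.mkdir(parents=True, exist_ok=True)
    stamp = datetime.now().isoformat(timespec="seconds")
    result = subprocess.run(cmd)
    with log_path.open("a", encoding="utf-8") as log:
        log.write(f"{stamp} exit={result.returncode} $ {shlex.join(cmd)}\n")
    return result.returncode


if __name__ == "__main__":
    # Demo with a harmless command so it runs without any media files.
    with tempfile.TemporaryDirectory() as tmp:
        run_logged([sys.executable, "-c", "print('ok')"], Path(tmp))
        print((Path(tmp) / "commands.log").read_text(encoding="utf-8"))
```

Logging the exact argument list, quoted with `shlex.join`, means a line from the log can be pasted back into a shell to repeat the step.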
Each new project should live in the projects/ directory and follow a consistent structure:
projects/
└── [project-name]/
    ├── input/ # Source/original files
    ├── output/ # Processed results
    └── README.md # What was done, commands used
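The project structure above could be scaffolded with a few lines of Python. This is a minimal sketch: `create_project` is a hypothetical helper, and the starter README text is a placeholder of my choosing.

```python
import tempfile
from pathlib import Path


def create_project(name: str, root: Path = Path("projects")) -> Path:
    """Create <root>/<name>/ with input/, output/ and a starter README.md."""
    project = Path(root) / name
    (project / "input").mkdir(parents=True, exist_ok=True)
    (project / "output").mkdir(exist_ok=True)
    readme = project / "README.md"
    if not readme.exists():  # never overwrite existing notes
        readme.write_text(f"# {name}\n\nWhat was done, commands used.\n",
                          encoding="utf-8")
    return project


if __name__ == "__main__":
    # Demo in a temporary directory so nothing is left behind.
    with tempfile.TemporaryDirectory() as tmp:
        project = create_project("demo-project", Path(tmp) / "projects")
        print(sorted(p.name for p in project.iterdir()))
```

Because every step tolerates existing folders and files, calling it again for the same project name leaves earlier work untouched.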
This keeps work organized, originals safe, and makes it easy to reproduce or reference past work.
Just describe what you want to do with your media files. Examples:
- "Convert all these images to PNG"
- "Extract audio from this video and normalize it"
- "Trim this video from 1:30 to 2:45"
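To give a feel for what a generated script might do with a request like the trim example, here is a small sketch. The function names are hypothetical, and stream copy (`-c copy`) is an assumption chosen for speed; it cuts on keyframes, so the start can be off by a little, and re-encoding would be needed for frame-accurate cuts.

```python
import shlex


def to_seconds(timestamp: str) -> int:
    """Convert "SS", "MM:SS" or "HH:MM:SS" to whole seconds."""
    seconds = 0
    for part in timestamp.split(":"):
        seconds = seconds * 60 + int(part)
    return seconds


def trim_command(src: str, dst: str, start: str, end: str) -> list[str]:
    """Build (but do not run) an ffmpeg command that keeps start..end."""
    start_s, end_s = to_seconds(start), to_seconds(end)
    if end_s <= start_s:
        raise ValueError("end must be after start")
    return [
        "ffmpeg",
        "-ss", str(start_s),          # seek in the input to the start time
        "-i", src,
        "-t", str(end_s - start_s),   # keep this many seconds
        "-c", "copy",                 # copy streams without re-encoding
        dst,
    ]


if __name__ == "__main__":
    # Hypothetical file names, for illustration only.
    print(shlex.join(trim_command("input.mp4", "output.mp4", "1:30", "2:45")))
```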
