Welcome to the Space Biology Knowledge Engine, a web application designed to explore, search, and analyze decades of NASA's space biology research. This project was built for the NASA Space Apps Challenge.
Our engine transforms a simple CSV list of publications into a rich, interactive, and AI-summarized database, making complex scientific discoveries more accessible to researchers, students, and enthusiasts alike.
You can view the live project deployment here:
- AI-Powered Summaries: Leverages the Google Gemini AI to read research papers and generate concise, easy-to-understand summaries, categories, and keywords.
- Robust Data Pipeline: A Node.js script scrapes publication data with a headless browser (Puppeteer), processes it with AI, and builds a JSON database from the CSV source.
- Advanced Search & Filtering: Instantly search through hundreds of publications by title or keyword, and filter by specific research categories.
- Thematic UI: A beautiful, responsive interface with a custom-animated "space biology" background, available in both light and dark themes.
- Detailed Research View: Click on any research card to open a detailed modal with the AI summary and a direct link to the full publication on the NCBI website.
- Resilient Data Processing: The data generation script is built to be robust, featuring automatic retries, batch processing, and an API key pool to manage rate limits.
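To illustrate the search and filtering feature listed above, here is a minimal sketch of how title/keyword search could be combined with a category filter. The `Publication` shape and the `filterPublications` helper are hypothetical names for illustration; the actual fields in the generated data and the app's real implementation may differ.

```typescript
// Hypothetical record shape; the real fields in the generated JSON may differ.
interface Publication {
  title: string;
  category: string;
  keywords: string[];
}

// Keep publications whose title or keywords contain the query
// (case-insensitive) and, when a category is given, match it exactly.
function filterPublications(
  items: Publication[],
  query: string,
  category?: string
): Publication[] {
  const q = query.trim().toLowerCase();
  return items.filter((p) => {
    const matchesCategory = category === undefined || p.category === category;
    const matchesQuery =
      q === "" ||
      p.title.toLowerCase().includes(q) ||
      p.keywords.some((k) => k.toLowerCase().includes(q));
    return matchesCategory && matchesQuery;
  });
}

// Made-up sample data, for demonstration only.
const sample: Publication[] = [
  { title: "Bone loss in microgravity", category: "Physiology", keywords: ["bone", "mice"] },
  { title: "Plant growth on the ISS", category: "Botany", keywords: ["arabidopsis", "roots"] },
];

console.log(filterPublications(sample, "roots").map((p) => p.title));
```

An empty query matches everything, so the category filter can be used on its own.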
- Framework: React
- Language: TypeScript
- Build Tool: Vite
- Styling: Tailwind CSS
- Icons: Lucide React
- Runtime: Node.js
- AI Model: Google Gemini
- Web Scraping: Puppeteer & Cheerio
- CSV Parsing: Papaparse
- API Key Management: Dotenv
Follow these instructions to get a copy of the project up and running on your local machine.
- Node.js (version 20.x or higher recommended)
- npm (comes with Node.js)
- Google Account(s) to generate Gemini API keys
Clone the repository to your local machine:
git clone https://github.com/parmarth-kumar/Zen-1.git
cd Zen-1
Install the required npm packages:
npm install
This project requires API keys from Google AI to power the data generation script.
- Generate API Keys: Have each team member go to Google AI Studio and create a free API key.
- Create .env file: In the root of the project, create a file named .env.
- Add Keys: Open the .env file and add your keys as a comma-separated list.
GEMINI_API_KEYS=key_from_person_1,key_from_person_2,key_from_person_3
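As a rough sketch of how a comma-separated key list like this could feed an API key pool, the example below splits the raw value and hands out keys in round-robin order. `parseApiKeys` and `KeyPool` are hypothetical names, and round-robin rotation is an assumption; the project's script may rotate keys differently.

```typescript
// Split the raw GEMINI_API_KEYS value into a clean list of keys,
// tolerating stray spaces and empty entries.
function parseApiKeys(raw: string | undefined): string[] {
  return (raw ?? "")
    .split(",")
    .map((k) => k.trim())
    .filter((k) => k.length > 0);
}

// Hypothetical round-robin pool: each call to next() returns the following key,
// wrapping back to the first key after the last one.
class KeyPool {
  private index = 0;
  private readonly keys: string[];

  constructor(keys: string[]) {
    if (keys.length === 0) {
      throw new Error("No API keys configured");
    }
    this.keys = keys;
  }

  next(): string {
    const key = this.keys[this.index];
    this.index = (this.index + 1) % this.keys.length;
    return key;
  }
}

// In a real script the raw string would come from the environment; it is inlined here.
const pool = new KeyPool(parseApiKeys("key_a, key_b,key_c"));
console.log(pool.next(), pool.next(), pool.next(), pool.next());
// prints: key_a key_b key_c key_a
```

Spreading requests across several keys is one way to stay under per-key rate limits.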
The application's data is generated by a script that reads the SB_publication_PMC.csv file and processes it with AI. The dataset only needs to be generated once, although the script itself must be run several times to get through every article.
- Run the Script: In your terminal, run the script. It will process one batch of articles.
node scripts/generate-dataset.js
- Repeat: Run the same command again. The script will automatically pick up where it left off.
- Note: This script is slow but reliable as it uses a real browser. You will need to run it repeatedly over several sessions to process all 600+ articles due to API rate limits.
The final data will be saved in src/data/ai-generated-data.json.
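The "pick up where it left off" behavior described above can be sketched as follows, assuming the script treats articles already present in the saved JSON as done and selects the next unprocessed batch. The `nextBatch` helper, the batch size, and the sample IDs are all hypothetical; the real script's resume logic is not shown in this README.

```typescript
// Return the next batch of IDs that have not been processed yet,
// preserving the original order of the source list.
function nextBatch(allIds: string[], doneIds: Set<string>, batchSize: number): string[] {
  return allIds.filter((id) => !doneIds.has(id)).slice(0, batchSize);
}

// Made-up IDs standing in for rows of the CSV.
const allIds = ["PMC1", "PMC2", "PMC3", "PMC4", "PMC5"];
// Assumption: IDs already saved by earlier runs are loaded into this set.
const done = new Set<string>(["PMC1", "PMC2"]);

// First run processes the next two unprocessed articles.
const batch = nextBatch(allIds, done, 2);
console.log(batch); // ["PMC3", "PMC4"]

// After saving results, mark them done so the next run skips them.
batch.forEach((id) => done.add(id));
console.log(nextBatch(allIds, done, 2)); // ["PMC5"]
```

Once every ID is marked done, `nextBatch` returns an empty array, which a script could use as its signal that the dataset is complete.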
Once the data generation is complete, you can run the web application.
npm run dev
Open your browser and navigate to the local URL provided (usually http://localhost:5173).
- Data Source: This project uses the "Space Biology Publications" dataset provided by NASA GeneLab. It can be found here.
- Team: Built with ❤️ by ZeN-1.
This project is licensed under the MIT License. See the LICENSE file for details.