Scribe is a web app that aims to make it easier for the public to understand what goes on during Singapore's Parliament sittings. It aggregates data from the Hansard, which contains the official reports of parliamentary debates, and organises the data from each session into a more user-friendly format.
NOTE: Currently, Scribe only contains data for sittings from the (current) 15th Parliament onwards (September 2025 - present).
The official Hansard reports are text-only and rather difficult to read. Thus, each sitting's report is broken down into sections, organised by content type, and properly paginated where necessary, allowing the user to view the proceedings in a more digestible manner.
Broadly speaking, each sitting has three categories of content: questions, bills, and (other) motions. Each category is collated into its own page, which users can search.
Each question, bill, and motion is tagged with its respective speakers. Each MP has an individual page listing the items they have spoken on, which users can use to find out more about that MP's involvement in Parliament.
Each question, bill, and motion is usually associated with a particular ministry, so each piece of content is tagged to that ministry where possible. The tag serves as a proxy for the topic of discussion: users can browse a ministry's page to find parliamentary discussions related to that ministry or topic.
AI-generated summaries are provided for:
- Each section of a sitting
- Each MP's contributions in Parliament (based on their last 20 involvements)
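As a rough illustration of the "last 20 involvements" window, the sketch below picks the most recent items for an MP before they would be handed to a summariser. The `Involvement` record and `latest_involvements` helper are hypothetical names invented for this example; Scribe's actual schema and selection logic may differ.

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class Involvement:
    # Hypothetical record shape, not Scribe's real schema.
    title: str
    sitting_date: date


def latest_involvements(items, limit=20):
    """Return up to `limit` involvements, most recent sitting first."""
    return sorted(items, key=lambda item: item.sitting_date, reverse=True)[:limit]
```

Capping the window keeps the input to the summariser bounded regardless of how long an MP has been in Parliament.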
Scribe is written in TypeScript and built using Next.js, with styling done using Tailwind CSS. The database uses PostgreSQL and is hosted on Supabase. Finally, the summaries are generated using Llama 3.1 via Groq's API.
Data is fetched from the Hansard API, processed, and then inserted into the database using the Python scripts found in the /python directory. For more details, please refer to the README.md there.
Note that the script only works for Hansard data for sittings from ~2012 onwards, as earlier sittings have their data stored in a different format.
If you wish to replicate Scribe independently, you can follow the steps below.
- Clone the repository

      git clone https://github.com/isaacyclai/scribe.git
      cd scribe

- Install frontend dependencies

      npm install

- Set up environment

  Create an .env.local file with your database credentials:

      DATABASE_URL=your_database_url

  If you are using Supabase, you will also need to add your Supabase URL and keys. These can be found by clicking "Connect" on the Supabase dashboard.

- Install Python dependencies

  We use uv to manage dependencies.

      cd python
      uv sync

- Set up database

  Create an .env file with your database URL:

      DATABASE_URL=your_database_url

  Then, run the following script. Dates should be in DD-MM-YYYY format.

      uv run batch_process.py <start_date> <end_date>

- (Optional) Generate summaries

  Add your Groq API key in the .env file:

      GROQ_API_KEY=your_groq_api_key

  To generate summaries for sessions and members, run the following.

      uv run generate_summaries.py --sessions <start_date> <end_date>
      uv run generate_summaries.py --members

- Run development server

  From the repository root:

      npm run dev

  Open http://localhost:3000 to view the app.
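The scripts above take dates in DD-MM-YYYY format, which is easy to confuse with ISO-style YYYY-MM-DD. The following is a minimal sketch of how such arguments could be validated before running a batch. The function names are hypothetical, and this is not necessarily how batch_process.py parses its arguments.

```python
from datetime import date, datetime

# DD-MM-YYYY, the date format the setup steps ask for.
DATE_FORMAT = "%d-%m-%Y"


def parse_sitting_date(text: str) -> date:
    """Parse a DD-MM-YYYY string; raises ValueError on any other format."""
    return datetime.strptime(text, DATE_FORMAT).date()


def parse_date_range(start_text: str, end_text: str) -> tuple[date, date]:
    """Parse both ends of a range and check that they are in order."""
    start = parse_sitting_date(start_text)
    end = parse_sitting_date(end_text)
    if start > end:
        raise ValueError(f"start date {start_text} is after end date {end_text}")
    return start, end
```

For example, `parse_date_range("05-09-2025", "30-09-2025")` returns a pair of `date` objects, while an ISO-style string such as `"2025-09-05"` is rejected with a `ValueError`.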
- src/app: Next.js App Router pages and API routes.
- src/components: Reusable React components (MemberCard, Pagination, Filters).
- src/lib: Utility functions and database connection.
- python: Data ingestion and processing scripts.
- supabase: Database schema and migrations.
The choice to use Llama 3.1 via the Groq API was entirely due to its generous API limits (and my lack of credits for other providers' APIs). If you are using this code and would like to use some other API, simply paste your API key into the .env file and modify the first few lines of generate_summaries.py to set up the OpenAI client with the model and provider of your choice.
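One way to keep the provider swap in a single place is a small lookup table, sketched below using only the standard library. Everything here is an assumption made for illustration rather than the contents of generate_summaries.py: the `ProviderConfig` and `client_kwargs` names, the `OPENAI_API_KEY` variable, and the model IDs (the specific Llama 3.1 variant Scribe uses is not stated above). The sketch also assumes the .env file has already been loaded into the process environment.

```python
import os
from dataclasses import dataclass


@dataclass(frozen=True)
class ProviderConfig:
    base_url: str     # OpenAI-compatible endpoint of the provider
    api_key_env: str  # name of the environment variable holding the key
    model: str        # example model ID; replace with the one you want


# Illustrative entries only; the model IDs are assumptions, not Scribe's settings.
PROVIDERS = {
    "groq": ProviderConfig(
        "https://api.groq.com/openai/v1", "GROQ_API_KEY", "llama-3.1-8b-instant"
    ),
    "openai": ProviderConfig(
        "https://api.openai.com/v1", "OPENAI_API_KEY", "gpt-4o-mini"
    ),
}


def client_kwargs(provider: str, env=None) -> dict:
    """Build the base_url/api_key keyword arguments for an OpenAI-style client."""
    env = os.environ if env is None else env
    config = PROVIDERS[provider]
    api_key = env.get(config.api_key_env)
    if not api_key:
        raise RuntimeError(f"{config.api_key_env} is not set")
    return {"base_url": config.base_url, "api_key": api_key}
```

With the `openai` package installed, the result can be passed straight to the client constructor, for example `OpenAI(**client_kwargs("groq"))`, with `PROVIDERS["groq"].model` supplied as the model name in each request.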
This project is inspired by the creators of Telescope and Pair Search.
The copyright to the Hansard is owned by the Singapore Government.