Janus

Real-time AI companion for children, built to turn curiosity into safe, guided exploration while keeping parents in the loop.

Demo • The Problem • What Janus Does • Why It Wins • Google Stack • Safety & Security • How It Works • Architecture • Getting Started • Roadmap

Kids do not want to "search the internet." They want to point, ask, explore, move, and wonder.
Janus turns that energy into a live, safe, visual learning experience and gives parents visibility instead of guesswork.

The Problem

Children are not interested in typing careful search queries into a box and reading blue links. Their curiosity is physical, fast, emotional, and constant. They want to hold up a flower, ask about a dinosaur, point at a machine, go on a scavenger hunt, and keep moving.

At the same time, one of the biggest real-world pressures on parents is keeping that curiosity alive while they handle everything else life demands. Parents want a tool that helps keep their child engaged, learning, and safe, without becoming an opaque black box.

There is a gap between:

how children naturally explore the world
how the internet expects them to ask questions
how much visibility parents have into AI interactions

Janus is our answer to that gap.

Demo

Built for the Gemini Live API Challenge.

Demo links can be dropped in here before submission:

Live demo: TODO
Demo video: TODO

What Janus Does

Janus is a real-time multimodal AI companion for youth that sees through the camera, listens through the microphone, responds in natural voice, and brings learning into the child's physical space.

Instead of asking a child to sit still and type into a search bar, Janus lets them:

ask questions out loud in natural language
show real-world objects to the camera
explore with AR-style teaching overlays
go on short scavenger hunts
see safe images and videos related to what they are curious about
place generated creatures or objects onto visible surfaces like a table or desk
interact with those generated objects through hand tracking

And Janus does not stop at the child experience. It exports the Gemini session into a parent-facing review flow so a parent can check what happened from their phone, see summaries, review alerts, and ask follow-up questions about the session.

Why Janus Wins

1. It uses Gemini Live the way kids actually behave

Children do not interact with intelligence as text. They interact with intelligence as presence.

Janus uses live audio, live video, realtime transcription, visual grounding, and tool-driven interaction so the model can do more than answer. It can guide, react, point attention, celebrate, and stay quiet when the child is focused.

2. It transforms the internet into something a child can actually use

Janus is not "Google, but with a chatbot."

It changes the interaction model entirely:

from typed search to spoken curiosity
from abstract answers to grounded visuals
from passive browsing to guided co-play
from isolated sessions to parent-visible learning

3. It helps parents, not just children

This is not an AI toy that disappears into a black box.

Janus creates a parent layer with:

exported Gemini session history
conversation summaries
activity logs
distress and safety alerts
at-risk event capture
parent Q&A about what happened during the session

That makes Janus useful for the whole family, not just impressive for a demo.

4. It is ambitious in product design, not just model usage

We did not just connect a model and call it done. We built:

a realtime child-facing companion
a mobile-friendly parent portal
an AR-style visual teaching system
a retrieval stack for grounded visuals
a clip-and-alert flow for parent awareness
deployment infrastructure on Google Cloud

Features

Real-time multimodal conversation

Live microphone streaming into Gemini
Live camera streaming into Gemini
Native voice responses in real time
Realtime transcription for both the child and Janus
Context-aware silence and wake behavior so Janus does not constantly interrupt

Curiosity-driven visual learning

show_visual tool for on-demand images and videos
Real image retrieval first, AI generation second
Child-safe prompts for image generation
Clean visual rendering with graceful frontend fallbacks

AR-style co-play

Guided teaching overlays for objects the child is showing
Short scavenger hunts that keep energy focused
Celebration states for successful interaction
Generated AR characters/objects placed onto visible surfaces
Hand tracking so the child can grab and move those generated objects

Parent intelligence layer

Session history
Session summaries
Activity timelines
Distress and safety indicators
Parent-side Q&A assistant
Mobile-aware portal for reviewing sessions on a phone

Built With Google

Janus is deeply built around the Google AI and Google Cloud ecosystem.

Gemini Live API

Powers the core realtime child-facing experience
Streams audio/video into a live model session
Produces natural spoken responses
Enables tool-calling during live interaction

Gemini via the Google GenAI SDK

Used for structured helper reasoning
Used for parent-side assistant responses
Used for summary generation
Used for prompt-based scene/anchor reasoning
Used for at-risk evaluation and contextual decisioning

Imagen 3

Generates child-safe visuals when grounded retrieval is not available
Generates sprite-like AR characters and objects
Generates photorealistic fallback images for curiosity-driven visual moments

Vertex AI Search

Retrieves real images for real-world concepts
Lets Janus prefer grounded, factual, non-hallucinated visuals when possible
Helps keep the experience educational instead of fully synthetic

Firebase / Firestore

Stores session transcripts
Stores summaries, activities, co-play state, and alerts
Creates the data backbone for the parent portal

Google Cloud Run

Hosts the backend service
Runs with a dedicated runtime service account
Uses default credentials in production instead of shipping secrets into runtime code

Google Cloud Secret Manager

Stores API secrets such as the YouTube key used for video retrieval
Keeps sensitive values out of source code and deployment config

Google Cloud IAM + Artifact Registry + Cloud Build

IAM roles are granted for runtime and build responsibilities
Artifact Registry stores container images
Cloud Build builds the deployable container

Safety & Security

Janus was designed for youth, which means safety and privacy could not be an afterthought.

Child safety guardrails

The core system prompt explicitly blocks violent, scary, adult, or age-inappropriate content
Janus is instructed not to request or retain personal identifying information from children
Emotional safety rules redirect the model toward comfort and a trusted grown-up instead of overstepping into counseling
The visual generation prompts are constrained to bright, friendly, age-appropriate outputs

Parent visibility by default

Sessions are not hidden inside a model conversation
Parents can review summaries, transcripts, activities, and alerts
This creates accountability instead of blind trust

Risk detection layers

Transcript analysis looks for distress and safety signals
Helper-model evaluation checks for at-risk behavior requiring adult attention
At-risk events can include captured media context, not just text labels

Cloud security posture

Production runs on Cloud Run with runtime service accounts
Firestore access is handled through Firebase Admin on trusted backend infrastructure
Secrets are managed through Secret Manager
Local development uses a service account file, while cloud runtime uses platform credentials

Important note

Janus is a hackathon prototype with a strong safety-first architecture. It is not presented as a fully audited production child-safety platform. What it does show is the right direction: controlled prompts, parent visibility, backend-managed credentials, secret handling, and multiple layers of intervention logic.

How It Works

Child flow

A child opens Janus and activates the experience.
The browser starts the webcam and microphone.
Audio and video stream to the backend over Socket.io.
The backend opens a Gemini Live session and forwards realtime media input.
Gemini responds with live voice and can call tools to launch visuals, overlays, hunts, or generated AR objects.
Session activity is stored so parents can review it later.

Parent flow

Session messages and activities are stored in Firestore.
The backend derives summaries, alerts, and at-risk events.
The parent portal fetches this session data.
A parent can review the session and ask Janus questions about what happened.

Visual flow

Janus receives a child request like "show me a tiger" or "show me the Eiffel Tower."
The backend first searches for a real image using Vertex AI Search.
Search results are filtered and scored to remove junk results.
If no good real result survives, Imagen 3 generates a safe fallback image.
The child sees a visual answer without needing to browse the web.

Architecture

flowchart TD
    A["Child Browser"] --> B["Webcam + Microphone"]
    B --> C["Socket.io Realtime Stream"]
    C --> D["Express + GeminiInteractionSystem"]
    D --> E["Gemini Live API"]
    D --> F["Gemini Helper Models"]
    D --> G["Vertex AI Search"]
    D --> H["Imagen 3"]
    D --> I["Firestore"]
    I --> J["Parent Portal"]
    D --> J
    E --> A
    G --> A
    H --> A

Core backend pieces

backend/src/server.ts Serves the frontend, exposes parent APIs, and hosts the realtime socket server.
backend/src/scripts/gemini-interactions-system.ts The main realtime engine. Handles Gemini Live sessions, tool calls, silence logic, transcript storage, summary generation, at-risk detection, generated AR objects, and activity persistence.
backend/src/scripts/media-search-service.ts Handles image and video retrieval. This became a major quality-control layer after naive search returned junk results.
backend/src/scripts/parent-chat.ts Shapes stored session data into something a parent can understand and query.

Security-Focused Design Choices

These are not marketing bullets. They are reflected directly in the implementation:

Child PII is explicitly discouraged in the system prompt
Risky content is blocked at the prompt layer
Distress and safety language is surfaced for parents
Production credentials come from Cloud Run identity instead of hardcoded runtime secrets
Secrets are provisioned into Secret Manager during deployment
Firestore session storage keeps parent-facing review on the backend side
The deployment script creates a dedicated runtime service account and grants only needed roles

Why This Matters For Youth

The internet was not designed around the way young children learn.

Children learn by:

asking out loud
pointing at things
moving around
repeating themselves
seeing vivid feedback
staying engaged through play

Janus meets them there.

Instead of flattening curiosity into text entry, Janus gives children a guided, visual, voice-first path through exploration while giving parents confidence that the experience is visible, reviewable, and designed with safeguards.

This is not just a better interface. It is a different model for how youth interact with the internet.

Getting Started

Prerequisites

Node.js
npm
A Google Cloud project
Gemini / Vertex access configured in Google Cloud
Firestore access configured in Google Cloud
service-account.json in root for local development (note: we recommend just setting owner in IAM to help speed up the process)
Optional YOUTUBE_API_KEY

Install

npm install

Local development

npm run dev

This starts:

Vite frontend
TypeScript backend with nodemon

Build

npm run build

Production start

npm run start

Deployment

Janus includes a deployment script that prepares a Google Cloud deployment path:

enables required Google Cloud APIs
creates or reuses a project
ensures Firestore exists
creates a Cloud Run runtime service account
grants runtime IAM roles
syncs the YouTube key into Secret Manager
builds the container with Cloud Build
deploys the service to Cloud Run

A prerequisite is you need gcloud cli installed, with your account authenticated via gcloud auth login.

Note: A billing account must be connected for deployment to succeed.

Run:

./deploy.sh

Roadmap

Persistent child-safe memory
Better personalized learning loops
Stronger parent controls and settings
More precise risk detection and review workflows
Better AR grounding and placement
Richer mobile parent experience
More structured lesson modes for specific age ranges

Built For

Janus was built for the Gemini Live API Challenge.

We wanted to show what a live AI agent can look like when it is designed for the way children actually explore the world: energetic, visual, curious, and always moving, with parents included instead of excluded.

Name		Name	Last commit message	Last commit date
Latest commit History 96 Commits
.junie/skills		.junie/skills
backend		backend
frontend		frontend
.aiignore		.aiignore
.gcloudignore		.gcloudignore
.gitignore		.gitignore
Dockerfile		Dockerfile
README.md		README.md
ddebug		ddebug
deploy.sh		deploy.sh
package-lock.json		package-lock.json
package.json		package.json
vite.config.ts		vite.config.ts

Folders and files

Latest commit

History

Repository files navigation