Offline · On-device · Private by design
An offline AI companion, built for the ones you love.
Meet Gemma 3n Voice Companion — a private, real-time voice assistant that runs entirely on a $499 NVIDIA Jetson. No cloud, no data collection, no internet required. Warm, patient, friendly conversation that stays on the device — whether it's helping a kid with homework, keeping Grandma company, or riding shotgun on a camping trip where the signal quits.
- Gemma 3n 4B
- Jetson Orin NX
- Ollama
- Whisper STT
- Piper TTS
- Silero VAD
# Gemma 3n Voice Assistant Pipeline
class GemmaVoice:
    def __init__(self):
        self.stt = WhisperSTT("small")
        self.llm = Ollama("gemma3n:4b")
        self.tts = PiperTTS("amy")

    async def respond(self, audio):
        # Real-time pipeline — TTS begins mid-stream
        text = await self.stt.transcribe(audio)
        response = await self.llm.stream(text)
        await self.tts.speak(response)
# Jetson Orin NX 16GB · 12 tok/s · 2.3GB GPU · 100% offline

Watch it in action
3-minute demo running end-to-end on the Jetson
Project writeup
The full story, on Kaggle
This project was my entry to the Google Gemma 3n Impact Challenge, and it ended up placing 1st. The Kaggle writeup walks through every design decision — the hardware, the quantization, the real-time streaming pipeline, and the safety model for kids.
Why this matters
Four principles behind the build — for kids, grandparents, and anyone who'd rather keep their conversations to themselves.
Privacy by Design
Nothing leaves the device. Safe for therapy sessions, classrooms, family kitchens, and anywhere privacy isn't optional.
Edge AI Innovation
A 4B-parameter LLM quantized to 4-bit and streamed live — all on $499 of hardware sipping just 15 watts.
Friendly by Default
Warm, patient replies. Short sentences. Easy to understand whether you're 7 or 77 — no tech vocabulary required.
Real-Time Pipeline
Streaming STT → LLM → TTS. Speech starts before generation finishes. It feels like a conversation, not a query.
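The mechanic behind "speech starts before generation finishes" is sentence-boundary chunking: flush each completed sentence to TTS while the model is still generating. Here is a minimal sketch of that idea, with a stand-in token generator in place of the real Ollama stream; the function names and the sentence-splitting regex are illustrative, not the project's actual code.

```python
import asyncio
import re

SENTENCE_END = re.compile(r"[.!?]\s")

async def fake_llm_stream():
    """Stand-in for the real Ollama token stream (hypothetical)."""
    for token in ["Sure! ", "Photosynthesis ", "turns ", "sunlight ",
                  "into ", "food. ", "Plants ", "do ", "it ", "daily."]:
        await asyncio.sleep(0)  # yield control, as a real stream would
        yield token

async def speak_while_generating(token_stream, speak):
    """Flush each completed sentence to TTS before generation finishes."""
    buffer = ""
    async for token in token_stream:
        buffer += token
        match = SENTENCE_END.search(buffer)
        if match:
            # A sentence just completed: hand it to TTS immediately
            speak(buffer[:match.end()].strip())
            buffer = buffer[match.end():]
    if buffer.strip():
        speak(buffer.strip())  # flush whatever trails the last boundary

spoken = []
asyncio.run(speak_while_generating(fake_llm_stream(), spoken.append))
```

With this shape, the first short sentence ("Sure!") reaches the speaker while the rest of the reply is still being generated, which is what makes the exchange feel conversational rather than query-and-wait.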
About the builder
A dad, a soldering iron, and a stubborn idea.
I'm Stephen Murphy — engineer, dad, and privacy advocate. I built this because the people I care about — my kids, my parents, anyone curious enough to talk to an AI — deserve one that isn't quietly logging them to a server farm. So I put a whole language model in a box that fits on a desk, costs less than a weekend of daycare, and runs with the WiFi unplugged.
Under the hood
Wake Word Detection
Two-stage "Hey Gemma" activation with fuzzy matching to handle 7-year-old diction.
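The writeup is the authority on the actual two-stage detector; as a sketch of the fuzzy-matching half, Python's stdlib `difflib` is enough to tolerate kid-style pronunciations. The threshold value and helper name below are assumptions for illustration.

```python
from difflib import SequenceMatcher

WAKE_PHRASE = "hey gemma"
THRESHOLD = 0.75  # assumed tolerance, loose enough for a 7-year-old's diction

def is_wake_word(transcript: str) -> bool:
    """Fuzzy-match a transcript against the wake phrase."""
    heard = transcript.lower().strip()
    return SequenceMatcher(None, heard, WAKE_PHRASE).ratio() >= THRESHOLD
```

Near-misses like "hey jemma" or "hay gemma" score well above the threshold, while unrelated speech such as "what time is it" falls far below it.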
Speech Recognition
Whisper Small + Silero VAD — robust offline STT even in a noisy kitchen.
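Silero VAD is a neural model; as a toy illustration of what a VAD gate contributes to this pipeline (frames too quiet to contain speech never reach Whisper, so the GPU isn't transcribing kitchen hum), here is a simple energy-based stand-in, not the actual Silero code.

```python
def rms(frame):
    """Root-mean-square energy of one audio frame (floats in [-1, 1])."""
    return (sum(s * s for s in frame) / len(frame)) ** 0.5

def speech_frames(frames, threshold=0.05):
    """Keep only frames loud enough to plausibly contain speech."""
    return [f for f in frames if rms(f) >= threshold]

silence = [0.001] * 160                 # near-silent frame
speech = [0.2, -0.3, 0.25, -0.1] * 40   # loud frame
kept = speech_frames([silence, speech, silence])
```

A real VAD is far more discriminating than an energy gate (it rejects loud non-speech too), but the pipeline role is the same: it decides which audio is worth transcribing.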
Language Model
Gemma 3n 4B quantized to Q4_K_XL via Unsloth. 5.4GB on disk, 2.3GB on the GPU.
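A quick back-of-envelope check on that GPU figure, treating the 4B parameter count as nominal (rough arithmetic, not a measurement):

```python
# Q4 quantization stores roughly 4 bits, i.e. half a byte, per weight
params = 4e9            # nominal Gemma 3n 4B parameter count
bytes_per_weight = 0.5
weights_gb = params * bytes_per_weight / 1e9
# ~2.0 GB of quantized weights; KV cache and runtime overhead
# plausibly account for the rest of the reported 2.3 GB footprint
```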
Voice Synthesis
Piper TTS "Amy" voice, streamed out sentence-by-sentence so there's no silent wait.
Questions people ask
What is Gemma 3n Voice Companion?
A privacy-first AI voice assistant that runs entirely on an NVIDIA Jetson Orin NX — no cloud, no internet needed. It listens, understands, and replies in real time with warm, patient conversation. It was my 1st-place entry in the Google Gemma 3n Impact Challenge on Kaggle.
Does it send any data to the cloud?
No. Speech-to-text, the language model, and text-to-speech all run on the device. Not a single byte of conversation leaves the hardware. Unplug the Ethernet and Wi-Fi and it still works.
How much does the hardware cost?
About $499 for an NVIDIA Jetson Orin NX 16GB plus basic accessories. Total power draw is around 15 watts under load — less than a desk lamp.
Who is it built for?
Anyone who wants a warm, private AI companion: children, elderly family members, therapy settings, classrooms, campers, off-grid homes, and anyone who would rather keep their conversations to themselves.
What model does it use?
Google's Gemma 3n 4B, quantized to Q4_K_XL via Unsloth and served with Ollama. Whisper Small handles speech-to-text, Silero VAD handles voice activity detection, and Piper TTS (the "Amy" voice) handles text-to-speech — all running locally.
How fast does it respond?
End-to-end latency is 2-3 seconds from end-of-speech to start-of-reply. The LLM generates around 12 tokens per second, and TTS streams out sentence-by-sentence so there's no silent wait.
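Rough arithmetic behind that feel, using the 12 tok/s figure above plus an assumed length for a short opening sentence:

```python
tok_per_sec = 12                 # generation rate from the page
first_sentence_tokens = 15       # assumed length of a short opening sentence
time_to_first_sentence = first_sentence_tokens / tok_per_sec
# ~1.25 s of generation before the first sentence can be spoken;
# add transcription and synthesis and the 2-3 s end-to-end figure follows
```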
Curious how it all fits together?
Everything — the pipeline, the prompting, the quantization tradeoffs, the safety guardrails — is laid out in the writeup.
Read the full writeup on Kaggle