Posted on Feb 25

🚀 Getting Started with Deepgram Nova-3 for Real-Time Speech-to-Text

Introduction

Deepgram’s Nova-3 is a speech-to-text model that offers real-time multilingual transcription, improved accuracy, and instant vocabulary updates. If you’re working on AI-driven transcription, it’s worth exploring.

At Transgate.ai, we took our first look at Nova-3, and the results were promising! (Check out our insights here: Transgate’s article).

In this guide, let’s get started with Nova-3 by setting up a simple transcription pipeline using Deepgram’s API.

Step 1: Set Up Your Deepgram API Key

First, sign up at Deepgram and grab your API key.

Then, install the Deepgram SDK along with the ws package, which the script in Step 2 uses for its WebSocket connection:

npm install @deepgram/sdk ws
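
Hardcoding the key in source makes it easy to leak, so one option is to read it from an environment variable instead. Here is a minimal sketch, assuming a variable named DEEPGRAM_API_KEY; both that name and the getDeepgramApiKey helper are chosen here for illustration and are not part of the Deepgram SDK:

```javascript
// Hypothetical helper: read the API key from the environment instead of
// hardcoding it. DEEPGRAM_API_KEY is an assumed variable name.
function getDeepgramApiKey(env = process.env) {
  const key = env.DEEPGRAM_API_KEY;
  if (!key || key.trim() === '') {
    throw new Error('Set the DEEPGRAM_API_KEY environment variable first.');
  }
  // Trim stray whitespace that often sneaks in when copying keys.
  return key.trim();
}
```

You would then export the variable in your shell before running the script and call getDeepgramApiKey() wherever the key is needed.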

Step 2: Basic Real-Time Transcription

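Using Node.js, we’ll open a WebSocket connection to Deepgram and stream an audio file over it, which stands in for live audio. The script uses ES module imports, so save it as a .mjs file or set "type": "module" in your package.json.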

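import WebSocket from 'ws';
import fs from 'fs';

// Replace with your Deepgram API key
const deepgramApiKey = 'YOUR_DEEPGRAM_API_KEY';
const audioFile = 'sample.wav'; // Path to your audio file

// Connect to the streaming endpoint directly and request the Nova-3 model.
// The SDK client is not needed when talking to the raw WebSocket.
const ws = new WebSocket('wss://api.deepgram.com/v1/listen?model=nova-3', {
  headers: { Authorization: `Token ${deepgramApiKey}` },
});

ws.on('open', () => {
  console.log('Connected to Deepgram WebSocket');
  const stream = fs.createReadStream(audioFile);
  stream.on('data', (chunk) => ws.send(chunk));
  // Ask Deepgram to finish processing and close the connection,
  // instead of dropping the socket before the last results arrive.
  stream.on('end', () => ws.send(JSON.stringify({ type: 'CloseStream' })));
});

ws.on('message', (message) => {
  const data = JSON.parse(message.toString());
  // Only "Results" messages carry a transcript; skip metadata and other types.
  if (data.type !== 'Results') return;
  const transcript = data.channel.alternatives[0].transcript;
  if (transcript) console.log('Transcript:', transcript);
});

ws.on('error', (err) => console.error('WebSocket error:', err.message));
ws.on('close', () => console.log('Connection closed'));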

What This Script Does:

✅ Connects to Deepgram’s real-time transcription API

✅ Streams an audio file for processing

✅ Logs transcriptions in real time
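
Not every message on the socket contains a transcript, so it can help to pull the parsing step into its own function. Below is a sketch of such a helper, assuming the streaming response exposes type, is_final, and channel.alternatives[0].transcript. The name extractTranscript is hypothetical, and the field names should be checked against Deepgram’s current streaming response reference:

```javascript
// Hypothetical helper: turn a raw WebSocket message into { text, isFinal },
// or null when the message carries no usable transcript.
// Assumed message shape: { type, is_final, channel: { alternatives: [{ transcript }] } }
function extractTranscript(rawMessage) {
  let data;
  try {
    // ws delivers messages as Buffers, so convert to a string first.
    data = JSON.parse(rawMessage.toString());
  } catch {
    return null; // Not valid JSON
  }
  // Skip metadata and any other non-transcript message types.
  if (!data || data.type !== 'Results') return null;

  const text = data.channel?.alternatives?.[0]?.transcript ?? '';
  if (text === '') return null; // Silence or no speech detected

  return { text, isFinal: Boolean(data.is_final) };
}
```

Inside the message handler, you could call extractTranscript(message) and log only results where isFinal is true, which avoids printing interim hypotheses that may still change.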

Step 3: Customizing the Transcription

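Nova-3 supports custom vocabulary through query parameters on the connection URL. To improve accuracy on specific terms, pass them with the keyterm parameter, repeated once per term. This is Nova-3’s Keyterm Prompting; the keywords parameter applies to earlier models.

const url = 'wss://api.deepgram.com/v1/listen?model=nova-3&language=en&keyterm=AI&keyterm=transcription';
const ws = new WebSocket(url, { headers: { Authorization: `Token ${deepgramApiKey}` } });

This improves recognition of domain-specific terms such as product names, medical jargon, or industry vocabulary.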
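
As the list of terms grows, assembling the query string by hand gets error-prone, especially for terms with spaces. Here is a small sketch that builds the URL with URLSearchParams. The buildListenUrl helper and its option names are hypothetical, and the default parameter name keyterm is an assumption based on Nova-3’s Keyterm Prompting, so pass a different termParam if your model documents another name:

```javascript
// Hypothetical helper: build the streaming URL from options.
// termParam defaults to 'keyterm' (assumed for Nova-3); the parameter is
// repeated once per term, and URLSearchParams handles the encoding.
function buildListenUrl({
  model = 'nova-3',
  language = 'en',
  terms = [],
  termParam = 'keyterm',
} = {}) {
  const params = new URLSearchParams({ model, language });
  for (const term of terms) {
    params.append(termParam, term);
  }
  return `wss://api.deepgram.com/v1/listen?${params.toString()}`;
}
```

For example, buildListenUrl({ terms: ['AI', 'transcription'] }) produces a URL that can be passed straight to the WebSocket constructor together with the Authorization header.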

Final Thoughts

Deepgram’s Nova-3 is fast, multilingual, and highly customizable. It’s a powerful tool for anyone building real-time voice applications.

DEV Community