Introduction
Deepgram’s Nova-3 is the latest evolution in speech-to-text AI, offering real-time multilingual transcription, improved accuracy, and instant vocabulary updates. If you’re working on AI-driven transcription, you’ll want to explore this model.
At Transgate.ai, we took our first look at Nova-3, and the results were promising! (Check out our insights here: Transgate’s article).
In this guide, let’s get started with Nova-3 by setting up a simple transcription pipeline using Deepgram’s API.
Step 1: Set Up Your Deepgram API Key
First, sign up at Deepgram and grab your API key.
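A key pasted into source code is easy to leak, so one option is to read it from the environment instead. The sketch below assumes an environment variable named `DEEPGRAM_API_KEY`. That name and the `getApiKey` helper are illustrative choices, not something Deepgram requires.

```javascript
// Hypothetical helper: reads the key from an environment variable
// (the DEEPGRAM_API_KEY name is an assumption) and fails fast if it is missing.
function getApiKey(env = process.env) {
  const key = env.DEEPGRAM_API_KEY;
  if (!key || key.trim() === '') {
    throw new Error('Set DEEPGRAM_API_KEY before running the script');
  }
  return key.trim();
}
```

Any Node.js script can then call `getApiKey()` wherever it would otherwise use a hardcoded key string.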
Then, install the Deepgram SDK:
```bash
npm install @deepgram/sdk ws
```
Step 2: Basic Real-Time Transcription
Using Node.js, we’ll open a WebSocket connection and stream an audio file to it, the same way you would send live audio.
```javascript
import WebSocket from 'ws';
import fs from 'fs';

// Replace with your Deepgram API key
const deepgramApiKey = 'YOUR_DEEPGRAM_API_KEY';
const audioFile = 'sample.wav'; // Path to your audio file

// This script talks to the WebSocket endpoint directly through `ws`,
// so it does not need an SDK client. Nova-3 is selected explicitly.
const ws = new WebSocket('wss://api.deepgram.com/v1/listen?model=nova-3', {
  headers: { Authorization: `Token ${deepgramApiKey}` },
});

ws.on('open', () => {
  console.log('Connected to Deepgram WebSocket');
  const stream = fs.createReadStream(audioFile);
  stream.on('data', (chunk) => ws.send(chunk));
  // Signal end of audio rather than closing at once, so pending results still arrive
  stream.on('end', () => ws.send(JSON.stringify({ type: 'CloseStream' })));
});

ws.on('message', (message) => {
  const data = JSON.parse(message.toString());
  // Metadata messages carry no transcript, so guard the lookup
  const transcript = data.channel?.alternatives?.[0]?.transcript;
  if (transcript) console.log('Transcript:', transcript);
});

ws.on('error', (err) => console.error('WebSocket error:', err.message));
ws.on('close', () => console.log('Connection closed'));
```
What This Script Does:
✅ Connects to Deepgram’s real-time transcription API
✅ Streams an audio file for processing
✅ Logs transcriptions in real time
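The message handler can be made more defensive, since the endpoint sends more than one kind of JSON message and transcripts can be interim or final. The sketch below assumes the streaming response shape described in Deepgram’s documentation (a `type` of `Results`, an `is_final` flag, and `channel.alternatives`). `parseResult` is a hypothetical helper name.

```javascript
// Hypothetical helper: turns one raw WebSocket message into
// { text, isFinal }, or null when the message carries no transcript.
function parseResult(raw) {
  let data;
  try {
    data = JSON.parse(raw.toString());
  } catch {
    return null; // Not JSON, nothing to extract
  }
  if (data?.type !== 'Results') return null; // e.g. metadata messages
  const text = data.channel?.alternatives?.[0]?.transcript ?? '';
  if (text === '') return null; // Silence yields empty transcripts
  return { text, isFinal: data.is_final === true };
}
```

Inside `ws.on('message', ...)`, a call such as `const result = parseResult(message)` followed by `if (result?.isFinal) console.log(result.text)` would print only finalized segments.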
Step 3: Customizing the Transcription
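Nova-3 supports custom vocabulary and language selection. To improve accuracy, pass query parameters on the connection URL:
```javascript
const ws = new WebSocket(
  'wss://api.deepgram.com/v1/listen?model=nova-3&language=en&keyterm=AI&keyterm=transcription',
  { headers: { Authorization: `Token ${deepgramApiKey}` } }
);
```
This boosts accuracy for domain-specific terms like AI, medical jargon, or industry-specific words. According to Deepgram’s documentation, Nova-3 takes these terms through the keyterm parameter, repeated once per term, while the older keywords parameter applies to earlier models.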
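Hand-written query strings get error-prone once terms contain spaces or punctuation. As a minimal sketch, the URL can be assembled with the built-in `URL` API. The `buildListenUrl` helper is hypothetical, and the `keyterm` parameter name is an assumption taken from Deepgram’s Nova-3 documentation, so check it against the current docs for your model.

```javascript
// Hypothetical helper: builds the streaming URL from an options object.
// The `keyterm` parameter name is an assumption based on Deepgram's Nova-3 docs.
function buildListenUrl({ model = 'nova-3', language = 'en', terms = [] } = {}) {
  const url = new URL('wss://api.deepgram.com/v1/listen');
  url.searchParams.set('model', model);
  url.searchParams.set('language', language);
  // One repeated parameter per term; URLSearchParams handles the encoding
  for (const term of terms) url.searchParams.append('keyterm', term);
  return url.toString();
}

console.log(buildListenUrl({ terms: ['AI', 'transcription'] }));
// wss://api.deepgram.com/v1/listen?model=nova-3&language=en&keyterm=AI&keyterm=transcription
```

The returned string can be passed as the first argument to `new WebSocket(...)`, with the `Authorization` header supplied in the options object as before.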
Final Thoughts
Deepgram’s Nova-3 is fast, multilingual, and highly customizable. It’s a powerful tool for anyone building real-time voice applications.
🚀 Next Steps:
- Try it with your own audio files 🎙️
- Experiment with different languages 🌍
- Fine-tune with custom vocabulary 🔧
Check out Transgate.ai’s first impressions of Nova-3 in our article.
What do you think about Nova-3? Let’s discuss in the comments! 👇