whisper-node

Node.js bindings for OpenAI's Whisper. Transcription is done locally.

Features

Output transcripts to JSON (also .txt .srt .vtt)
Optimized for CPU (Including Apple Silicon ARM)
Timestamp precision to single word

Installation

Add dependency to project

npm install whisper-node

Download whisper model of choice [OPTIONAL]

npx whisper-node download

Requirement for Windows: Install the make command from here.

Usage

    import whisper from 'whisper-node';

    const transcript = await whisper("example/sample.wav");

    console.log(transcript); // output: [ {start,end,speech} ]

Output (JSON)

    [
      {
        "start": "00:00:14.310", // time stamp begin
        "end": "00:00:16.480",   // time stamp end
        "speech": "howdy"        // transcription
      }
    ]
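Because `start` and `end` arrive as strings, doing arithmetic on them (for example, to measure how long a segment lasts) needs a conversion step first. The following is a minimal sketch, assuming timestamps always follow the `HH:MM:SS.mmm` shape shown above. `TranscriptEntry`, `timestampToMs`, and `durationMs` are hypothetical names for illustration and are not exported by whisper-node.

```typescript
// Shape of one transcript entry, matching the JSON output shown above.
interface TranscriptEntry {
  start: string; // "HH:MM:SS.mmm"
  end: string; // "HH:MM:SS.mmm"
  speech: string;
}

// Hypothetical helper: convert "HH:MM:SS.mmm" into whole milliseconds.
// Integer milliseconds avoid floating-point drift when subtracting times.
function timestampToMs(timestamp: string): number {
  const match = /^(\d+):(\d{2}):(\d{2})\.(\d{3})$/.exec(timestamp);
  if (match === null) {
    throw new Error(`Unexpected timestamp format: ${timestamp}`);
  }
  const [, hours, minutes, seconds, millis] = match;
  return (
    Number(hours) * 3600000 +
    Number(minutes) * 60000 +
    Number(seconds) * 1000 +
    Number(millis)
  );
}

// Hypothetical helper: length of one entry in milliseconds.
function durationMs(entry: TranscriptEntry): number {
  return timestampToMs(entry.end) - timestampToMs(entry.start);
}

const sample: TranscriptEntry = {
  start: "00:00:14.310",
  end: "00:00:16.480",
  speech: "howdy",
};

console.log(durationMs(sample)); // 2170
```

The helper throws on any string that does not match the expected pattern, so a change in the output format would surface immediately instead of producing `NaN` values.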

Full Options List

    import whisper from 'whisper-node';

    const filePath = "example/sample.wav"; // required

    const options = {
      modelName: "base.en", // default
      // modelPath: "/custom/path/to/model.bin", // use model in a custom directory (cannot use along with 'modelName')
      whisperOptions: {
        language: 'auto',         // default (use 'auto' for auto detect)
        gen_file_txt: false,      // outputs .txt file
        gen_file_subtitle: false, // outputs .srt file
        gen_file_vtt: false,      // outputs .vtt file
        word_timestamps: true     // timestamp for every word
        // timestamp_size: 0      // cannot use along with word_timestamps:true
      }
    }

    const transcript = await whisper(filePath, options);
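The comments in the options list describe two pairs of settings that cannot be combined: `modelName` with `modelPath`, and `timestamp_size` with `word_timestamps: true`. A small pre-flight check can catch those combinations before calling `whisper`. This is only a sketch: the interfaces `WhisperFlags` and `WhisperNodeOptions` and the function `findOptionConflicts` are hypothetical names written from the option list above, not types shipped by the package, and how whisper-node itself reacts to these combinations is not covered here.

```typescript
// Hypothetical types, written from the option names listed above.
interface WhisperFlags {
  language?: string;
  gen_file_txt?: boolean;
  gen_file_subtitle?: boolean;
  gen_file_vtt?: boolean;
  word_timestamps?: boolean;
  timestamp_size?: number;
}

interface WhisperNodeOptions {
  modelName?: string;
  modelPath?: string;
  whisperOptions?: WhisperFlags;
}

// Hypothetical helper: return a description of every documented conflict
// found in the options object. An empty array means no conflict was found.
function findOptionConflicts(options: WhisperNodeOptions): string[] {
  const problems: string[] = [];

  if (options.modelName !== undefined && options.modelPath !== undefined) {
    problems.push("modelName and modelPath cannot be used together");
  }

  const flags = options.whisperOptions;
  if (
    flags !== undefined &&
    flags.word_timestamps === true &&
    flags.timestamp_size !== undefined
  ) {
    problems.push("timestamp_size cannot be used with word_timestamps: true");
  }

  return problems;
}

const conflicts = findOptionConflicts({
  modelName: "base.en",
  modelPath: "/custom/path/to/model.bin",
  whisperOptions: { word_timestamps: true, timestamp_size: 0 },
});

console.log(conflicts.length); // 2
```

Returning a list of messages, instead of throwing on the first problem, lets the caller report every conflict at once.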

Input File Format

Files must be .wav with a 16 kHz sample rate.

Example .mp3 file converted with an FFmpeg command:

    ffmpeg -i input.mp3 -ar 16000 output.wav
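To confirm that a converted file really has the expected sample rate before transcribing it, the rate can be read from the WAV header. The sketch below assumes a standard RIFF/WAVE layout, where the `fmt ` chunk stores the sample rate as a little-endian 32-bit integer after the format tag and channel count. `readWavSampleRate` and `isWhisperReady` are hypothetical helper names, not part of whisper-node.

```typescript
// Hypothetical helper: read the sample rate from the "fmt " chunk of a
// RIFF/WAVE file held in memory.
function readWavSampleRate(bytes: Uint8Array): number {
  const view = new DataView(bytes.buffer, bytes.byteOffset, bytes.byteLength);

  // Read a four-character chunk tag at the given offset.
  const tag = (offset: number): string =>
    String.fromCharCode(
      bytes[offset],
      bytes[offset + 1],
      bytes[offset + 2],
      bytes[offset + 3]
    );

  if (bytes.length < 12 || tag(0) !== "RIFF" || tag(8) !== "WAVE") {
    throw new Error("Not a RIFF/WAVE file");
  }

  // Walk the chunks that follow the 12-byte RIFF header.
  let offset = 12;
  while (offset + 8 <= bytes.length) {
    const id = tag(offset);
    const size = view.getUint32(offset + 4, true);
    if (id === "fmt ") {
      if (offset + 16 > bytes.length) {
        throw new Error("Truncated fmt chunk");
      }
      // Chunk body: format tag (2 bytes), channels (2 bytes), sample rate (4 bytes).
      return view.getUint32(offset + 12, true);
    }
    // Chunk bodies are padded to an even number of bytes.
    offset += 8 + size + (size % 2);
  }
  throw new Error("No fmt chunk found");
}

// Hypothetical helper: true when the file has the 16 kHz rate required above.
function isWhisperReady(bytes: Uint8Array): boolean {
  return readWavSampleRate(bytes) === 16000;
}
```

In Node.js, the result of `fs.readFileSync("output.wav")` is a `Buffer`, which is a `Uint8Array`, so it can be passed to `isWhisperReady` directly.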

Made with

Roadmap

Support projects not using TypeScript
Allow custom directory for storing models
Config files as alternative to model download CLI
Remove path, shelljs and prompt-sync package for browser, react-native expo, and webassembly compatibility
fluent-ffmpeg to automatically convert to 16 kHz .wav files as well as support separating audio from video
Pyannote diarization for speaker names
Implement WhisperX as optional alternative model for diarization and higher precision timestamps (as alternative to C++ version)
Add option for viewing detected language as described in Issue 16
Include TypeScript types in d.ts file
Add support for language option
Add support for transcribing audio streams as already implemented in whisper.cpp

Modifying whisper-node

npm run dev - runs nodemon and tsc on '/src/test.ts'

npm run build - runs tsc, outputs to '/dist' and gives sh permission to 'dist/download.js'
