Skip to content

screenpipe/uniOCR

Folders and files

NameName
Last commit message
Last commit date

Latest commit

 

History

39 Commits
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 

Repository files navigation

uniocr 📸

Crates.io Docs.rs MIT/Apache-2.0

universal ocr engine for rust that works everywhere. supports native ocr on macos, windows, tesseract, and cloud providers.

need a feature like NodeJS, HTTP example, etc.? open an issue or PR.

features 🚀

  • native ocr
    • macos: native vision kit api
    • windows: windows ocr engine
  • tesseract integration
    • full support for tesseract with custom models
    • fast initialization and caching
  • cloud providers
    • custom ocr provider
  • unified api
    • single interface for all providers
    • easy provider switching
    • batch processing support
  • performance focused
    • async/await support
    • parallel processing
    • memory efficient
    • unsafe code memory leaks battle tested

quickstart 🏃

[dependencies] uni-ocr = { git = "https://github.com/mediar-ai/uniocr.git" }
use uniocr::{OcrEngine, OcrProvider}; use anyhow::Result; #[tokio::main] async fn main() -> Result<()> { // auto-detect best available provider let engine = OcrEngine::new(OcrProvider::Auto)?; // perform ocr on an image let text = engine.recognize_file("path/to/image.png").await?; println!("extracted text: {}", text); Ok(()) }

providers 🔌

// use native macos vision let engine = OcrEngine::new(OcrProvider::MacOS)?; // use windows ocr let engine = OcrEngine::new(OcrProvider::Windows)?; // use tesseract let engine = OcrEngine::new(OcrProvider::Tesseract)?; // use google cloud vision // let engine = OcrEngine::new(OcrProvider::GoogleCloud { // credentials: ..., // })?;

advanced usage 🛠️

use uni_ocr::{OcrEngine, OcrProvider, OcrOptions}; // configure ocr options let options = OcrOptions::default() .languages(vec!["eng", "fra"]) .confidence_threshold(0.8) .timeout(std::time::Duration::from_secs(30)); let engine = OcrEngine::new(OcrProvider::Auto)? .with_options(options); // batch processing let images = vec!["img1.png", "img2.png", "img3.png"]; let results = engine.recognize_batch(images).await?;

installation requirements 🔧

  • macos: no additional setup (vision kit included)
  • windows: windows 10+ with ocr capabilities
  • tesseract: tesseract-ocr installed:
    # macos brew install tesseract # ubuntu apt-get install tesseract-ocr # windows winget install tesseract

performance 📊

benchmark results on m4 macbook pro max (images/second):

provider speed accuracy
macos vision 3.2 90.0%
windows ocr 1.2 95.2%
tesseract tbd tbd
google cloud tbd tbd

contributing 🤝

contributions welcome!

license 📜

this project is licensed under either of:

at your option.

acknowledgments 🙏

  • apple vision team
  • microsoft windows ocr team
  • tesseract ocr project
  • cloud provider teams

examples 📚

the repository includes several example programs demonstrating different use cases:

run examples

# basic example cargo run --example basic # batch processing cargo run --example batch_processing # custom options cargo run --example custom_options # platform specific cargo run --example platform_specific

check the examples directory for more detailed examples including:

  • batch processing multiple images
  • configuring custom options
  • using platform-specific providers
  • handling multilingual text

About

native OCR for MacOS, Windows, Linux

Resources

License

Stars

Watchers

Forks

Packages

 
 
 

Contributors

Languages