
ezoic/scigo

 
 


SciGo 🚀


SciGo's official mascot - Ready, Set, SciGo!

The blazing-fast scikit-learn compatible ML library for Go

Say "Goodbye" to slow ML, "Sci-Go" to fast learning!



🌟 Why SciGo?

SciGo = Statistical Computing In Go

SciGo brings the power and familiarity of scikit-learn to the Go ecosystem, offering:

  • 🔥 Blazing Fast: Native Go implementation with built-in parallelization
  • 🎯 scikit-learn Compatible: Familiar Fit/Predict API for easy migration
  • 🌲 LightGBM Support: Full compatibility with Python LightGBM models (.txt/JSON/string)
  • 📖 Well Documented: Complete API documentation with examples on pkg.go.dev
  • 🌊 Streaming Support: Online learning algorithms for real-time data
  • 🚀 Zero Heavy Dependencies: Pure Go implementation (only scientific essentials)
  • 📊 Comprehensive: Regression, classification, clustering, tree-based models, and more
  • 🧪 Production Ready: Extensive tests, benchmarks, and error handling
  • More than leaves: Supports full training in addition to inference, plus convenience features and numerical precision

📦 Installation

Go Module (Recommended)

go get github.com/ezoic/scigo@latest

Quick Start Options

  • 🐳 Docker: docker run --rm -it ghcr.io/yuminosukesato/scigo:latest
  • ☁️ GitPod: Open in Gitpod
  • 📦 Go Install: go install github.com/ezoic/scigo/examples/quick-start@latest

🚀 Quick Start

💡 Tip: For complete API documentation with examples, visit pkg.go.dev/scigo

Option 1: One-Liner with LightGBM 🌲

package main

import (
	"github.com/ezoic/scigo/sklearn/lightgbm"
	"gonum.org/v1/gonum/mat"
)

func main() {
	// Super convenient one-liner training!
	// data, labels, and XTest are placeholders for your own values.
	X := mat.NewDense(100, 4, data)   // Your data
	y := mat.NewDense(100, 1, labels) // Your labels

	// Train and predict in one line!
	result := lightgbm.QuickTrain(X, y)
	predictions := result.Predict(XTest)

	// Or use AutoML for automatic tuning
	best := lightgbm.AutoFit(X, y)

	// Load Python LightGBM models directly!
	model := lightgbm.NewLGBMClassifier()
	model.LoadModel("python_model.txt") // Full compatibility!
	loadedPredictions, _ := model.Predict(XTest)
}

Option 2: Classic Linear Regression

package main

import (
	"fmt"
	"log"

	"github.com/ezoic/scigo/linear"
	"gonum.org/v1/gonum/mat"
)

func main() {
	// Create and train model - just like scikit-learn!
	model := linear.NewLinearRegression()

	// Training data
	X := mat.NewDense(4, 2, []float64{
		1, 1,
		1, 2,
		2, 2,
		2, 3,
	})
	y := mat.NewDense(4, 1, []float64{
		2, 3, 3, 4,
	})

	// Fit the model
	if err := model.Fit(X, y); err != nil {
		log.Fatal(err)
	}

	// Make predictions
	XTest := mat.NewDense(2, 2, []float64{
		1.5, 1.5,
		2.5, 3.5,
	})
	predictions, _ := model.Predict(XTest)

	fmt.Println("Ready, Set, SciGo! Predictions:", predictions)
}

📚 API Documentation


📖 Package Documentation

| Package | Description | Go Doc |
|---|---|---|
| sklearn/lightgbm | 🌲 LightGBM with Python model compatibility & convenience features | GoDoc |
| sklearn/linear_model | Linear models with full scikit-learn compatibility | GoDoc |
| preprocessing | Data preprocessing utilities (StandardScaler, MinMaxScaler, OneHotEncoder) | GoDoc |
| linear | Linear machine learning algorithms (LinearRegression) | GoDoc |
| metrics | Model evaluation metrics (MSE, RMSE, MAE, R², MAPE) | GoDoc |
| core/model | Base model with weight export/import and scikit-learn compatibility | GoDoc |

📋 Complete API Examples

The documentation includes comprehensive examples for all major APIs. Visit the Go Doc links above or use go doc locally:

# View package documentation
go doc github.com/ezoic/scigo/preprocessing
go doc github.com/ezoic/scigo/linear
go doc github.com/ezoic/scigo/metrics

# View specific function documentation
go doc github.com/ezoic/scigo/preprocessing.StandardScaler.Fit
go doc github.com/ezoic/scigo/linear.LinearRegression.Predict
go doc github.com/ezoic/scigo/metrics.MSE

# Run example tests
go test -v ./preprocessing -run Example
go test -v ./linear -run Example
go test -v ./metrics -run Example

📚 Algorithms

Supervised Learning

Linear Models

  • Linear Regression - Full scikit-learn compatible implementation with QR decomposition
  • SGD Regressor - Stochastic Gradient Descent for large-scale learning
  • SGD Classifier - Linear classifiers with SGD training
  • Passive-Aggressive - Online learning for classification and regression
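
To make the idea behind the SGD-based models concrete, here is a minimal standard-library sketch of linear regression trained with per-sample stochastic gradient descent. The `sgdRegressor` type and its methods are hypothetical names chosen for illustration and are not SciGo's actual API; the learning rate and epoch count are arbitrary choices for the toy data.

```go
package main

import "fmt"

// sgdRegressor is a hypothetical minimal linear model trained with SGD.
// It is an illustration only, not SciGo's implementation.
type sgdRegressor struct {
	weights   []float64
	intercept float64
	lr        float64 // constant learning rate
}

func newSGDRegressor(nFeatures int, lr float64) *sgdRegressor {
	return &sgdRegressor{weights: make([]float64, nFeatures), lr: lr}
}

// predictOne returns w·x + b for a single sample.
func (m *sgdRegressor) predictOne(x []float64) float64 {
	out := m.intercept
	for j, v := range x {
		out += m.weights[j] * v
	}
	return out
}

// partialFit performs one pass over a batch, updating the parameters after
// every sample using the gradient of the squared-error loss 0.5*(pred-y)^2.
func (m *sgdRegressor) partialFit(X [][]float64, y []float64) {
	for i, x := range X {
		grad := m.predictOne(x) - y[i]
		for j, v := range x {
			m.weights[j] -= m.lr * grad * v
		}
		m.intercept -= m.lr * grad
	}
}

func main() {
	// Toy data generated from y = 2*x + 1.
	X := [][]float64{{0}, {1}, {2}, {3}}
	y := []float64{1, 3, 5, 7}

	model := newSGDRegressor(1, 0.05)
	for epoch := 0; epoch < 2000; epoch++ {
		model.partialFit(X, y)
	}
	fmt.Printf("weight=%.3f intercept=%.3f\n", model.weights[0], model.intercept)
}
```

Because `partialFit` only ever sees one batch at a time, the same loop works whether the data arrives all at once or incrementally.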

Data Preprocessing

  • StandardScaler - Standardizes features by removing mean and scaling to unit variance
  • MinMaxScaler - Scales features to a given range (e.g., [0,1] or [-1,1])
  • OneHotEncoder - Encodes categorical features as one-hot numeric arrays
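
As a rough illustration of what standardization does, the sketch below centers each column and divides by its population standard deviation (the convention scikit-learn's StandardScaler uses). The `standardScale` function is a hypothetical helper written against plain slices, not SciGo's `preprocessing` API.

```go
package main

import (
	"fmt"
	"math"
)

// standardScale returns a copy of X in which every column has zero mean and
// unit population variance. Constant columns are only centered.
// It assumes X is non-empty and rectangular.
func standardScale(X [][]float64) [][]float64 {
	nRows, nCols := len(X), len(X[0])
	out := make([][]float64, nRows)
	for i := range out {
		out[i] = make([]float64, nCols)
	}
	for j := 0; j < nCols; j++ {
		var mean float64
		for i := 0; i < nRows; i++ {
			mean += X[i][j]
		}
		mean /= float64(nRows)

		var variance float64
		for i := 0; i < nRows; i++ {
			d := X[i][j] - mean
			variance += d * d
		}
		std := math.Sqrt(variance / float64(nRows))
		if std == 0 {
			std = 1 // avoid dividing by zero for constant features
		}
		for i := 0; i < nRows; i++ {
			out[i][j] = (X[i][j] - mean) / std
		}
	}
	return out
}

func main() {
	X := [][]float64{{1, 10}, {2, 20}, {3, 30}}
	for _, row := range standardScale(X) {
		fmt.Println(row)
	}
}
```

In a real pipeline the mean and standard deviation would be learned once on training data and reused on test data; this sketch folds both steps together for brevity.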

Tree-based Models

  • LightGBM - Full Python model compatibility (.txt/JSON/string formats)
    • LGBMClassifier - Binary and multiclass classification
    • LGBMRegressor - Regression with multiple objectives
    • QuickTrain - One-liner training with automatic model selection
    • AutoFit - Automatic hyperparameter tuning
    • Superior to leaves - training + convenience features
  • 🚧 Random Forest (Coming Soon)
  • 🚧 XGBoost compatibility (Coming Soon)

Unsupervised Learning

Clustering

  • MiniBatch K-Means - Scalable K-Means for large datasets
  • 🚧 DBSCAN (Coming Soon)
  • 🚧 Hierarchical Clustering (Coming Soon)
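
The mini-batch variant of K-Means can be sketched in a few lines: each iteration samples a small batch, assigns its points to the nearest center, and nudges each center toward its points with a step size of one over the number of points that center has seen. The code below is a simplified standard-library sketch under those assumptions; the function names, fixed seed, and toy data are illustrative and do not reflect SciGo's `cluster` package.

```go
package main

import (
	"fmt"
	"math/rand"
)

// nearest returns the index of the center closest to x (squared Euclidean).
func nearest(centers [][]float64, x []float64) int {
	best, bestDist := 0, -1.0
	for c, center := range centers {
		var d float64
		for j := range x {
			diff := x[j] - center[j]
			d += diff * diff
		}
		if bestDist < 0 || d < bestDist {
			best, bestDist = c, d
		}
	}
	return best
}

// miniBatchKMeans refines the given initial centers in place.
func miniBatchKMeans(data, centers [][]float64, batchSize, iters int, rng *rand.Rand) {
	counts := make([]int, len(centers))
	for it := 0; it < iters; it++ {
		// Sample a batch and cache assignments against the current centers.
		batch := make([][]float64, batchSize)
		assign := make([]int, batchSize)
		for b := range batch {
			batch[b] = data[rng.Intn(len(data))]
			assign[b] = nearest(centers, batch[b])
		}
		// Move each center toward its points with a per-center step 1/count.
		for b, x := range batch {
			c := assign[b]
			counts[c]++
			eta := 1.0 / float64(counts[c])
			for j := range x {
				centers[c][j] = (1-eta)*centers[c][j] + eta*x[j]
			}
		}
	}
}

// demoCenters clusters two well-separated toy blobs with a fixed seed.
func demoCenters() [][]float64 {
	data := [][]float64{
		{0, 0}, {0, 1}, {1, 0}, {1, 1},
		{10, 10}, {10, 11}, {11, 10}, {11, 11},
	}
	centers := [][]float64{{0, 0}, {10, 10}} // initial guesses
	miniBatchKMeans(data, centers, 4, 50, rand.New(rand.NewSource(1)))
	return centers
}

func main() {
	fmt.Println(demoCenters())
}
```

Since each update touches only `batchSize` points, the cost per iteration stays constant regardless of how large the full dataset is.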

Special Features

Online Learning & Streaming

  • Incremental Learning - Update models with new data batches
  • Partial Fit - scikit-learn compatible online learning
  • Concept Drift Detection - DDM and ADWIN algorithms
  • Streaming Pipelines - Real-time data processing with channels
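
For readers unfamiliar with DDM (Drift Detection Method), the following sketch shows its core rule: track the running error rate p and its standard deviation s, remember the lowest p+s seen, and flag a warning or drift when the current p+s exceeds that minimum by two or three standard deviations. The `ddm` type, the 30-sample warm-up, and the string return values are assumptions made for this example, not SciGo's `drift` API.

```go
package main

import (
	"fmt"
	"math"
)

// ddm is a minimal Drift Detection Method monitor over a stream of
// per-prediction outcomes (true means the model made an error).
type ddm struct {
	n, errors  int
	pMin, sMin float64
}

func newDDM() *ddm { return &ddm{pMin: math.Inf(1), sMin: math.Inf(1)} }

// update records one outcome and reports "stable", "warning", or "drift".
func (d *ddm) update(isError bool) string {
	d.n++
	if isError {
		d.errors++
	}
	if d.n < 30 { // too few samples for a meaningful estimate
		return "stable"
	}
	p := float64(d.errors) / float64(d.n)
	s := math.Sqrt(p * (1 - p) / float64(d.n))
	// Skip degenerate estimates (s == 0) so the thresholds stay meaningful.
	if s > 0 && p+s < d.pMin+d.sMin {
		d.pMin, d.sMin = p, s
	}
	switch {
	case p+s >= d.pMin+3*d.sMin:
		return "drift"
	case p+s >= d.pMin+2*d.sMin:
		return "warning"
	default:
		return "stable"
	}
}

// firstDrift feeds a synthetic stream (10% errors for 200 samples, then only
// errors) and returns the index at which drift is first reported, or -1.
func firstDrift() int {
	d := newDDM()
	for i := 0; i < 300; i++ {
		isError := i%10 == 9 || i >= 200
		if d.update(isError) == "drift" {
			return i
		}
	}
	return -1
}

func main() {
	fmt.Println("drift first reported at sample index:", firstDrift())
}
```

A production detector would typically reset its statistics after signalling drift so that monitoring can resume on the retrained model.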

🎯 scikit-learn Compatibility

SciGo implements the familiar scikit-learn API with full compatibility:

// Just like scikit-learn!
model.Fit(X, y)        // Train the model
model.Predict(X)       // Make predictions
model.Score(X, y)      // Evaluate the model
model.PartialFit(X, y) // Incremental learning

// New in v0.3.0 - Full scikit-learn compatibility
model.GetParams(deep)               // Get model parameters
model.SetParams(params)             // Set model parameters
weights, _ := model.ExportWeights() // Export model weights
model.ImportWeights(weights)        // Import with guaranteed reproducibility

// Streaming - unique to Go!
model.FitStream(ctx, dataChan) // Streaming training
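
The streaming call above takes a context and a channel. One plausible shape for such a method is sketched below with a deliberately trivial model that predicts the running mean of its targets. The `batch` and `meanModel` types and the `fitStream` signature are hypothetical and only illustrate the channel-plus-context pattern; SciGo's real signatures may differ.

```go
package main

import (
	"context"
	"fmt"
)

// batch is a hypothetical unit of streamed training data.
type batch struct {
	X [][]float64
	y []float64
}

// meanModel predicts the running mean of every target seen so far.
type meanModel struct {
	sum   float64
	count int
}

// partialFit folds one batch into the running statistics.
func (m *meanModel) partialFit(b batch) {
	for _, v := range b.y {
		m.sum += v
		m.count++
	}
}

// fitStream consumes batches until the channel closes or ctx is cancelled.
func (m *meanModel) fitStream(ctx context.Context, batches <-chan batch) error {
	for {
		select {
		case <-ctx.Done():
			return ctx.Err()
		case b, ok := <-batches:
			if !ok {
				return nil // producer finished
			}
			m.partialFit(b)
		}
	}
}

// predict returns the current running mean (0 before any data is seen).
func (m *meanModel) predict() float64 {
	if m.count == 0 {
		return 0
	}
	return m.sum / float64(m.count)
}

// trainOnStream sends two small batches from a producer goroutine and
// returns the model's prediction once the stream is drained.
func trainOnStream() (float64, error) {
	batches := make(chan batch)
	go func() {
		defer close(batches)
		batches <- batch{y: []float64{1, 2}}
		batches <- batch{y: []float64{3, 6}}
	}()
	m := &meanModel{}
	err := m.fitStream(context.Background(), batches)
	return m.predict(), err
}

func main() {
	pred, err := trainOnStream()
	fmt.Println(pred, err)
}
```

Passing a context lets the caller stop training with a timeout or cancellation, while closing the channel signals a normal end of data.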

🆕 New Features in v0.3.0

  • Complete Weight Reproducibility - Guaranteed identical outputs with same weights
  • gRPC/Protobuf Support - Distributed training and prediction
  • Full Parameter Management - GetParams/SetParams for all models
  • Model Serialization - Export/Import with full precision
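
To illustrate why exported weights can reproduce outputs exactly, the sketch below round-trips a set of float64 weights through JSON using only the standard library. Go's `encoding/json` writes each finite float64 in the shortest decimal form that parses back to the same value, so the restored weights compare equal to the originals. The `weights` struct and helper names are hypothetical and are not SciGo's serialization format.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// weights is a hypothetical serializable snapshot of a linear model.
type weights struct {
	Coefficients []float64 `json:"coefficients"`
	Intercept    float64   `json:"intercept"`
}

func exportWeights(w weights) ([]byte, error) { return json.Marshal(w) }

func importWeights(data []byte) (weights, error) {
	var w weights
	err := json.Unmarshal(data, &w)
	return w, err
}

// roundTripIsExact reports whether exporting and re-importing w yields
// exactly the same float64 values.
func roundTripIsExact(orig weights) (bool, error) {
	data, err := exportWeights(orig)
	if err != nil {
		return false, err
	}
	restored, err := importWeights(data)
	if err != nil {
		return false, err
	}
	if restored.Intercept != orig.Intercept ||
		len(restored.Coefficients) != len(orig.Coefficients) {
		return false, nil
	}
	for i, c := range orig.Coefficients {
		if restored.Coefficients[i] != c {
			return false, nil
		}
	}
	return true, nil
}

func main() {
	w := weights{Coefficients: []float64{0.1, 1.0 / 3.0, -2.5e-17}, Intercept: 0.7}
	ok, err := roundTripIsExact(w)
	fmt.Println("exact round trip:", ok, "error:", err)
}
```

Note that JSON cannot represent NaN or infinite values, so `json.Marshal` returns an error for them; a binary format would be needed if such weights must be preserved.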

📊 Performance Benchmarks

SciGo leverages Go's concurrency for exceptional performance:

| Algorithm | Dataset Size | SciGo | scikit-learn (Python) | Speedup |
|---|---|---|---|---|
| Linear Regression | 1M×100 | 245ms | 890ms | 3.6× |
| SGD Classifier | 500K×50 | 180ms | 520ms | 2.9× |
| MiniBatch K-Means | 100K×20 | 95ms | 310ms | 3.3× |
| Streaming SGD | 1M streaming | 320ms | 1.2s | 3.8× |

Benchmarks on MacBook Pro M2, 16GB RAM

Memory Efficiency

| Dataset Size | Memory | Allocations |
|---|---|---|
| 100×10 | 22.8KB | 22 |
| 1,000×10 | 191.8KB | 22 |
| 10,000×20 | 3.4MB | 57 |
| 50,000×50 | 41.2MB | 61 |

🏗️ Architecture

scigo/
├── linear/           # Linear models
├── sklearn/          # scikit-learn compatible implementations
│   ├── linear_model/ # SGD, Passive-Aggressive
│   ├── cluster/      # Clustering algorithms
│   └── drift/        # Concept drift detection
├── metrics/          # Evaluation metrics
├── core/             # Core abstractions
│   ├── model/        # Base model interfaces
│   ├── tensor/       # Tensor operations
│   └── parallel/     # Parallel processing
├── datasets/         # Dataset utilities
└── examples/         # Usage examples

📊 Metrics

Comprehensive evaluation metrics with full documentation, including MSE, RMSE, MAE, R², and MAPE in the metrics package.
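
As a quick reference for what the regression metrics compute, here is a standard-library sketch of MSE, RMSE, MAE, MAPE, and R² over plain slices. The lowercase function names are hypothetical stand-ins rather than the signatures of SciGo's `metrics` package, and MAPE is returned as a fraction (not a percentage), which is an assumption of this example.

```go
package main

import (
	"fmt"
	"math"
)

// All functions assume yTrue and yPred are non-empty and of equal length.

// mse is the mean of squared residuals.
func mse(yTrue, yPred []float64) float64 {
	var sum float64
	for i := range yTrue {
		d := yTrue[i] - yPred[i]
		sum += d * d
	}
	return sum / float64(len(yTrue))
}

// rmse is the square root of the MSE, in the same units as the target.
func rmse(yTrue, yPred []float64) float64 { return math.Sqrt(mse(yTrue, yPred)) }

// mae is the mean of absolute residuals.
func mae(yTrue, yPred []float64) float64 {
	var sum float64
	for i := range yTrue {
		sum += math.Abs(yTrue[i] - yPred[i])
	}
	return sum / float64(len(yTrue))
}

// mape is the mean of |residual| / |target|; it is undefined when a target is 0.
func mape(yTrue, yPred []float64) float64 {
	var sum float64
	for i := range yTrue {
		sum += math.Abs(yTrue[i]-yPred[i]) / math.Abs(yTrue[i])
	}
	return sum / float64(len(yTrue))
}

// r2 is 1 - SS_res/SS_tot, the fraction of target variance explained.
func r2(yTrue, yPred []float64) float64 {
	var mean float64
	for _, v := range yTrue {
		mean += v
	}
	mean /= float64(len(yTrue))

	var ssRes, ssTot float64
	for i := range yTrue {
		ssRes += (yTrue[i] - yPred[i]) * (yTrue[i] - yPred[i])
		ssTot += (yTrue[i] - mean) * (yTrue[i] - mean)
	}
	return 1 - ssRes/ssTot
}

func main() {
	yTrue := []float64{3, -0.5, 2, 7}
	yPred := []float64{2.5, 0.0, 2, 8}
	fmt.Printf("MSE=%.4f RMSE=%.4f MAE=%.4f MAPE=%.4f R2=%.4f\n",
		mse(yTrue, yPred), rmse(yTrue, yPred), mae(yTrue, yPred),
		mape(yTrue, yPred), r2(yTrue, yPred))
}
```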

🧪 Testing & Quality

# Run tests
go test ./...

# Run benchmarks
go test -bench=. -benchmem ./...

# Check coverage (76.7% overall coverage)
go test -cover ./...

# Run linter (errcheck, govet, ineffassign, staticcheck, unused, misspell)
make lint-full

# Run examples to see API usage
go test -v ./preprocessing -run Example
go test -v ./linear -run Example
go test -v ./metrics -run Example
go test -v ./core/model -run Example

Quality Gates

  • Test Coverage: 76.7% (target: 70%+)
  • Linting: golangci-lint with comprehensive checks
  • Documentation: Complete godoc for all public APIs
  • Examples: Comprehensive example functions for all major APIs

📚 Examples

Check out the examples directory for usage examples.

🤝 Contributing

We welcome contributions! Please see our Contributing Guide.

Development Setup

# Clone the repository
git clone https://github.com/ezoic/scigo.git
cd scigo

# Install dependencies
go mod download

# Run tests
go test ./...

# Run linter
golangci-lint run

🚀 Continuous Delivery (CD)

SciGo uses automated continuous delivery for releases:

  • Automatic Release: Every push to the main branch triggers an automatic patch version release
  • Version Management: Versions are automatically incremented (e.g., 0.4.0 → 0.4.1)
  • Release Assets: Binaries for Linux, macOS, and Windows are automatically built and attached
  • Docker Images: Docker images are automatically built and pushed to GitHub Container Registry (ghcr.io)
  • Documentation: pkg.go.dev is automatically updated with the latest version

Release Process

  1. Merge PR to main: The process starts when a PR is merged into the main branch
  2. Automatic Tests: CI runs all tests and coverage checks
  3. Version Bump: Patch version is automatically incremented
  4. Create Release: GitHub Release is created with:
    • Multi-platform binaries (Linux, macOS, Windows)
    • Release notes from CHANGELOG.md
    • Docker image at ghcr.io/yuminosukesato/scigo:VERSION
  5. Post-Release: An issue is created to track post-release verification tasks

Manual Release

For major or minor version releases, create and push a tag manually:

git tag v0.5.0 -m "Release v0.5.0"
git push origin v0.5.0

Pushing the tag triggers the existing release.yml workflow.

🗺️ Roadmap

Phase 1: Core ML (Current)

  • ✅ Linear models
  • ✅ Online learning
  • ✅ Basic clustering
  • 🚧 Tree-based models

Phase 2: Advanced Features

  • Neural Networks (MLP)
  • Deep Learning integration
  • Model serialization (ONNX export)
  • GPU acceleration

Phase 3: Enterprise Features

  • Distributed training
  • AutoML capabilities
  • Model versioning
  • A/B testing framework

📖 Documentation

Core Documentation

API Quick Reference

| API | Package | Documentation |
|---|---|---|
| StandardScaler | preprocessing | pkg.go.dev/preprocessing.StandardScaler |
| MinMaxScaler | preprocessing | pkg.go.dev/preprocessing.MinMaxScaler |
| OneHotEncoder | preprocessing | pkg.go.dev/preprocessing.OneHotEncoder |
| LinearRegression | linear | pkg.go.dev/linear.LinearRegression |
| BaseEstimator | core/model | pkg.go.dev/model.BaseEstimator |

Migration & Advanced Guides

🙏 Acknowledgments

📄 License

SciGo is licensed under the MIT License. See LICENSE for details.

📧 Contact


🚀 Ready, Set, SciGo! 🚀

Where Science Meets Go - Say goodbye to slow ML!

Made with ❤️ and lots of ☕ in Go

Running scikit-learn parity tests

Development-only parity tests compare the Go implementation against scikit-learn outputs. They are excluded from a default go test run; enable them explicitly with the parity build tag.

Steps

  1. Generate golden data
    • Use uv instead of pip.
    • Command: uv run --with scikit-learn --with numpy --with scipy python scripts/golden/gen_logreg.py
  2. Run parity tests
    • Command: go test ./sklearn/linear_model -tags=parity -run Parity -v

One-liner

make parity-linear

Notes

  • The current LogisticRegression uses simplified gradient descent; tolerances will be tightened once lbfgs/newton-cg solvers are implemented.
  • Golden file is written to tests/golden/logreg_case1.json.
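
A parity check of this kind boils down to loading reference values and comparing them to the Go results within a tolerance. The sketch below shows that comparison with the standard library only. The JSON layout (`coef`, `intercept`), the tolerance values, and the function names are assumptions for illustration; the actual schema of the golden file is not shown in this README. In a real test file, a `//go:build parity` line at the top is what keeps the test out of a default `go test` run.

```go
package main

import (
	"encoding/json"
	"fmt"
	"math"
)

// golden mirrors a hypothetical golden-file layout; the real schema of
// tests/golden/logreg_case1.json may differ.
type golden struct {
	Coef      []float64 `json:"coef"`
	Intercept float64   `json:"intercept"`
}

// withinTol reports whether got matches want within an absolute plus
// relative tolerance.
func withinTol(got, want, atol, rtol float64) bool {
	return math.Abs(got-want) <= atol+rtol*math.Abs(want)
}

// checkParity returns an error naming the first value out of tolerance.
func checkParity(goldenJSON []byte, coef []float64, intercept, atol, rtol float64) error {
	var g golden
	if err := json.Unmarshal(goldenJSON, &g); err != nil {
		return fmt.Errorf("decode golden data: %w", err)
	}
	if len(g.Coef) != len(coef) {
		return fmt.Errorf("coef length: got %d, want %d", len(coef), len(g.Coef))
	}
	for i := range coef {
		if !withinTol(coef[i], g.Coef[i], atol, rtol) {
			return fmt.Errorf("coef[%d]: got %g, want %g", i, coef[i], g.Coef[i])
		}
	}
	if !withinTol(intercept, g.Intercept, atol, rtol) {
		return fmt.Errorf("intercept: got %g, want %g", intercept, g.Intercept)
	}
	return nil
}

func main() {
	// Inline stand-in for the golden file contents (made-up numbers).
	goldenJSON := []byte(`{"coef": [0.52, -1.30], "intercept": 0.10}`)
	err := checkParity(goldenJSON, []float64{0.521, -1.298}, 0.101, 1e-2, 1e-2)
	fmt.Println("parity error:", err)
}
```

Loose tolerances like the ones here fit an approximate solver; tighter values become appropriate once the Go and Python implementations use the same optimization method.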
