scikit-learn: machine learning in Python
Updated Nov 29, 2025 - Python
Data science is an interdisciplinary field that uses scientific methods, processes, algorithms, and systems to extract knowledge from structured and unstructured data. Data scientists perform data analysis and preparation, and their findings inform high-level decisions in many organizations.
Deep Learning for humans
Flexible and powerful data analysis / manipulation library for Python, providing labeled data structures similar to R data.frame objects, statistical functions, and much more
Apache Airflow - A platform to programmatically author, schedule, and monitor workflows
Streamlit — A faster way to build and share data apps.
Build and share delightful machine learning apps, all in Python. 🌟 Star to support our work!
Ray is an AI compute engine. Ray consists of a core distributed runtime and a set of AI Libraries for accelerating ML workloads.
💫 Industrial-strength Natural Language Processing (NLP) in Python
Pretrain, finetune ANY AI model of ANY size on 1 or 10,000+ GPUs with zero code changes.
Machine Learning From Scratch. Bare bones NumPy implementations of machine learning models and algorithms with a focus on accessibility. Aims to cover everything from linear regression to deep learning.
Data science Python notebooks: Deep learning (TensorFlow, Theano, Caffe, Keras), scikit-learn, Kaggle, big data (Spark, Hadoop MapReduce, HDFS), matplotlib, pandas, NumPy, SciPy, Python essentials, AWS, and various command lines.
Interactive deep learning book with multi-framework code, math, and discussions. Adopted at 500 universities from 70 countries including Stanford, MIT, Harvard, and Cambridge.
Data Apps & Dashboards for Python. No JavaScript Required.
Chat with your database or your datalake (SQL, CSV, parquet). PandasAI makes data analysis conversational using LLMs and RAG.
matplotlib: plotting with Python
Best Practices on Recommendation Systems
Prefect is a workflow orchestration framework for building resilient data pipelines in Python.
A reactive notebook for Python — run reproducible experiments, query with SQL, execute as a script, deploy as an app, and version with git. Stored as pure Python. All in a modern, AI-native editor.
Official repository for IPython itself. Other repos in the IPython organization contain things like the website, documentation builds, etc.
Topic Modelling for Humans