pfletcherhill

Organizations

@PatientBank @hillstreetlabs @veilco


KernelBench: Can LLMs Write GPU Kernels? - Benchmark + Toolkit with Torch -> CUDA (+ more DSLs)

Jupyter Notebook 876 145 Updated Mar 9, 2026

[ICLR 2026] Learning to Reason without External Rewards

Python 403 43 Updated Jan 26, 2026

Code and Data for Tau-Bench

Python 1,135 187 Updated Mar 18, 2026

A benchmark for LLMs on complicated tasks in the terminal

Python 1,753 488 Updated Jan 22, 2026

[NeurIPS 2025 D&B Spotlight] Scaling Data for SWE-agents

Python 601 114 Updated Mar 16, 2026

Long context evaluation for large language models

Python 228 27 Updated Mar 3, 2025

Multiple datasets for ARC (Abstraction and Reasoning Corpus)

Python 86 15 Updated Mar 28, 2025

Bootstrapping ARC

Python 156 24 Updated Nov 20, 2024

Elegant easy-to-use neural networks + scientific computing in JAX. https://docs.kidger.site/equinox/

Python 2,821 180 Updated Mar 9, 2026

Orbax provides common checkpointing and persistence utilities for JAX users

Python 490 85 Updated Mar 21, 2026

Flax is a neural network library for JAX that is designed for flexibility.

Jupyter Notebook 7,127 793 Updated Mar 21, 2026

Hackable and optimized Transformers building blocks, supporting a composable construction.

Python 10,382 772 Updated Mar 18, 2026

The original code for the paper "How to train your MAML" along with a replication of the original "Model Agnostic Meta Learning" (MAML) paper in PyTorch.

Python 828 147 Updated Dec 5, 2023

higher is a PyTorch library allowing users to obtain higher-order gradients over losses spanning training loops rather than individual training steps.

Python 1,628 128 Updated Mar 25, 2022

Domain Specific Language for the Abstraction and Reasoning Corpus

Python 324 68 Updated Oct 11, 2024

Reverse Engineering the Abstraction and Reasoning Corpus

Jupyter Notebook 335 54 Updated Feb 24, 2025

LLM training in simple, raw C/CUDA

Cuda 29,220 3,438 Updated Jun 26, 2025

LLM101n: Let's build a Storyteller

36,568 2,000 Updated Aug 1, 2024

The Abstraction and Reasoning Corpus

JavaScript 4,735 705 Updated Apr 4, 2025

SWE-bench: Can Language Models Resolve Real-world GitHub Issues?

Python 4,522 800 Updated Mar 19, 2026

SWE-agent takes a GitHub issue and tries to automatically fix it, using your LM of choice. It can also be employed for offensive cybersecurity or competitive coding challenges. [NeurIPS 2024]

Python 18,803 2,025 Updated Mar 16, 2026

Embeddable Postgres with real-time, reactive bindings.

TypeScript 14,912 368 Updated Mar 19, 2026

Devika is the first open-source implementation of an Agentic Software Engineer. Initially started as an open-source alternative to Devin.

Python 19,499 2,590 Updated Sep 25, 2025

A Desktop App for Easily Viewing and Editing Markdown Files

TypeScript 1,180 44 Updated Jun 17, 2024

Noosphere is a protocol for thought; let's discover it together!

Rust 694 38 Updated Jul 8, 2024

A powerful, flexible, Markdown-based authoring framework.

TypeScript 7,933 216 Updated Mar 11, 2026

Markdown for the component era

JavaScript 19,336 1,181 Updated Mar 21, 2026

A fast implementation of a Farcaster Hub, in Rust.

Rust 58 7 Updated Sep 9, 2024

Generative Agents: Interactive Simulacra of Human Behavior

20,940 2,922 Updated Aug 5, 2024