kadarakos/Transformer-Cookbook
Transformer Cookbook

This repository is synced with the Transformer Cookbook Overleaf project.

What's Cooking?

Informally, this is an initiative within the FLaNN community to make a "recipe book" for transformers. Formally, it is a collection of algorithmic constructions that transformers can provably implement (in theory). Many of these constructions have been used in a line of work establishing lower bounds on the expressivity of transformers (see the survey); however, their proofs lie scattered across the appendices of different papers. We think a centralized collection of everything transformers have been proven able to do, alongside the constructions themselves, will be useful for researchers.

This collection will cover constructions implementable in unique hard attention, average hard attention, and standard softmax attention transformers. The exact details of what we consider an "implementable construction" will be fleshed out as the project unfolds. The scope is limited to theoretical constructions, not empirical observations: we will only discuss what algorithms we can engineer into transformers, not what algorithms people have been able to reverse-engineer from trained transformers.
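For readers unfamiliar with the three attention variants mentioned above, here is a minimal NumPy sketch (not code from this repository) of how each one aggregates values for a single query, assuming a common convention in which unique hard attention breaks ties toward the leftmost maximum; papers differ on such details.

```python
import numpy as np

def softmax_attention(scores, values):
    # Standard softmax attention: every position gets weight
    # exp(score) / sum(exp(scores)).
    w = np.exp(scores - scores.max())  # subtract max for stability
    w = w / w.sum()
    return w @ values

def unique_hard_attention(scores, values):
    # Attends to exactly one maximal-score position
    # (leftmost tie-break, by convention here).
    return values[np.argmax(scores)]

def average_hard_attention(scores, values):
    # Averages uniformly over all positions that achieve the max score.
    mask = scores == scores.max()
    return values[mask].mean(axis=0)

scores = np.array([1.0, 3.0, 3.0, 0.0])
values = np.array([[1.0], [2.0], [4.0], [8.0]])
print(unique_hard_attention(scores, values))   # -> [2.]
print(average_hard_attention(scores, values))  # -> [3.]
print(softmax_attention(scores, values))       # a soft weighted average
```

Note how the two hard variants disagree exactly when the maximum score is tied, which is one reason results proved for one variant do not automatically transfer to the other.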

How can I contribute?

Feel free to make a pull request! For other suggestions, or to get more involved, email me at ayang4@nd.edu and I can add you to our Discord server.
