

nomelt

Designing high temperature protein sequences via learned language processing

Install

TODO
- conda installs
- Rosetta (installed separately)
- AlphaFold, "af" (installed separately)
- FATCAT

Config

Accelerate config

ENV variables

- TMP
- AF_APPTAINER_SCRIPT
- LOG_LEVEL
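The README does not yet document what each variable means, so the following is only a minimal sketch of how they might be read in one place. It assumes that `TMP` is a scratch directory, that `AF_APPTAINER_SCRIPT` is a path to a script that launches AlphaFold inside an Apptainer container, and that `LOG_LEVEL` is a standard Python logging level name. The `NomeltEnv` class, the `load_env` function, and the defaults are hypothetical and are not part of the nomelt package.

```python
import logging
import os
import tempfile
from dataclasses import dataclass
from pathlib import Path
from typing import Mapping, Optional


@dataclass(frozen=True)
class NomeltEnv:
    """Hypothetical container for the three environment variables."""

    tmp_dir: Path                         # from TMP (assumed: scratch directory)
    af_apptainer_script: Optional[Path]   # from AF_APPTAINER_SCRIPT (assumed: launcher script)
    log_level: int                        # from LOG_LEVEL (assumed: logging level name)


def load_env(environ: Mapping[str, str] = os.environ) -> NomeltEnv:
    """Read TMP, AF_APPTAINER_SCRIPT and LOG_LEVEL, with assumed defaults."""
    # Assumed default: fall back to the system temporary directory.
    tmp_dir = Path(environ.get("TMP", tempfile.gettempdir()))

    # Assumed optional: only stages that run AlphaFold would need it.
    script = environ.get("AF_APPTAINER_SCRIPT")

    # Assumed default: INFO. getLevelName maps a known name to its integer.
    level_name = environ.get("LOG_LEVEL", "INFO").upper()
    level = logging.getLevelName(level_name)
    if not isinstance(level, int):
        raise ValueError(f"Unknown LOG_LEVEL: {level_name!r}")

    return NomeltEnv(
        tmp_dir=tmp_dir,
        af_apptainer_script=Path(script) if script else None,
        log_level=level,
    )
```

Passing a plain dictionary instead of `os.environ` makes the function easy to exercise without touching the real environment, e.g. `load_env({"LOG_LEVEL": "debug"})`.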

TODO

HF caching: Hugging Face (HF) dataset caching and DVC clash a little. By default, operations on an HF dataset create cache files in the dataset folder, which makes DVC think the dataset has changed. To use those cached operations, you currently have to commit the changed data/dataset object to DVC. It would be better to abstract cached dataset operations into their own script and manually point the cache file at a DVC-tracked output. Then, if parameters in the pipeline change the operation, only that one stage is rerun, while downstream stages that use the same operation (e.g., tokenization) can reuse the DVC-tracked cache.
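One possible shape for that change is sketched below. It relies on the `cache_file_name` and `load_from_cache_file` arguments of `datasets.Dataset.map` to write the cache to an explicit path instead of the dataset folder. The helper name `map_with_tracked_cache` and the example path are hypothetical, and the sketch assumes a single `Dataset` rather than a `DatasetDict`.

```python
from pathlib import Path
from typing import Any, Callable


def map_with_tracked_cache(
    dataset: Any,
    fn: Callable[..., Any],
    cache_file: Path,
    **map_kwargs: Any,
) -> Any:
    """Apply `fn` with `dataset.map`, caching to an explicit file.

    `dataset` is assumed to be a `datasets.Dataset`. Because the cache
    path is given explicitly, nothing new is written into the dataset
    folder, so DVC does not see the dataset itself as modified.
    """
    cache_file = Path(cache_file)
    # Create the output directory so the cache file can be written there.
    cache_file.parent.mkdir(parents=True, exist_ok=True)
    return dataset.map(
        fn,
        cache_file_name=str(cache_file),  # explicit, DVC-trackable location
        load_from_cache_file=True,        # reuse the file if it already exists
        **map_kwargs,
    )
```

A tokenization stage could then call, for example, `map_with_tracked_cache(ds, tokenize, Path("data/cache/tokenized.arrow"), batched=True)` and list that file under `outs` in `dvc.yaml`, with downstream stages listing it under `deps`. With an explicit cache file, deciding when the cache is stale is left to DVC, which is the behaviour the paragraph above asks for.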

Models

All nomelt models are designed to produce amino acid sequences of proteins that are stable at high temperature, conditioned on an input.

nomelt-s2s: (seq -> seq) translate moderate-temperature variants of proteins into high-temperature variants
- Traditional architectures (e.g., seq2seq T5, autoregressive decoder-only) and tokenizers for protein LMs are usable out of the box
- TODO
nomelt-hmm: (hmm -> seq) develop high-temperature variants of a protein from a representative HMM
- Traditional LM architectures (e.g., seq2seq T5, autoregressive decoder-only) are usable out of the box
- Novel tokenizer required to prepare HMM inputs
- TODO
nomelt-hmm+: (hmm, T -> seq) develop variants of a protein stable at a specific temperature from a representative HMM
- Novel architecture and tokenizer required
- TODO
