Heavily WIP. May never be finished. Oh well!
An (attempted!) PyTorch implementation of "test-time training" language models for long context, where an inner memory module encodes the sequence into its own weights by taking optimization steps on them as it processes tokens. The approach was introduced in Learning to (Learn at Test Time): RNNs with Expressive Hidden States by Sun et al.
This likely won't work because I am not good at computer, but it's worth a shot anyway.
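To make the idea concrete, here's a minimal sketch of a TTT-Linear-style layer, assuming one SGD step per token on a self-supervised reconstruction loss. This is my reading of the paper, not this repo's actual code; the class name, projection names, dimensions, and inner learning rate are all illustrative.

```python
import torch
import torch.nn as nn

class TTTLinearSketch(nn.Module):
    """Sketch of a TTT-Linear-style layer: the hidden state is the weight
    matrix W of an inner linear model, and W is updated by one SGD step
    per token on a self-supervised reconstruction loss."""

    def __init__(self, dim: int, eta: float = 0.1):
        super().__init__()
        self.theta_k = nn.Linear(dim, dim, bias=False)  # "training view" projection
        self.theta_v = nn.Linear(dim, dim, bias=False)  # "label view" projection
        self.theta_q = nn.Linear(dim, dim, bias=False)  # "test view" projection
        self.eta = eta  # inner-loop learning rate (illustrative value)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (batch, seq_len, dim); W holds the per-sequence fast weights.
        B, T, D = x.shape
        W = torch.zeros(B, D, D, device=x.device, dtype=x.dtype)
        outputs = []
        for t in range(T):
            k, v, q = self.theta_k(x[:, t]), self.theta_v(x[:, t]), self.theta_q(x[:, t])
            # Inner loss 0.5 * ||k @ W - v||^2 has the analytic gradient
            # outer(k, err), so no autograd is needed for the inner step.
            err = torch.bmm(k.unsqueeze(1), W).squeeze(1) - v        # (B, D)
            grad_W = torch.bmm(k.unsqueeze(2), err.unsqueeze(1))     # (B, D, D)
            W = W - self.eta * grad_W                                # one SGD step per token
            outputs.append(torch.bmm(q.unsqueeze(1), W).squeeze(1))  # read with updated W
        return torch.stack(outputs, dim=1)  # (B, T, D)
```

A real implementation updates W over chunks of tokens rather than one at a time so the inner steps parallelize on GPU, which (as I understand it) is what the custom kernels mentioned below provide.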
This framework will use the Hugging Face ecosystem, including the Transformers Trainer. Easier this way.
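Roughly how I expect the training entry point to look. The Trainer and TrainingArguments API is real; `TTTForCausalLM`, `config`, and `build_dataset` are placeholders for things this repo would define.

```python
# Sketch only: TTTForCausalLM / config / build_dataset are hypothetical
# names standing in for this repo's code; the HF API calls are real.
from transformers import Trainer, TrainingArguments

model = TTTForCausalLM(config)   # hypothetical model class
train_dataset = build_dataset()  # hypothetical dataset builder

args = TrainingArguments(
    output_dir="checkpoints",
    per_device_train_batch_size=8,
    learning_rate=3e-4,
    max_steps=10_000,
    bf16=True,
    logging_steps=50,
    report_to="none",
)

trainer = Trainer(model=model, args=args, train_dataset=train_dataset)
trainer.train()
```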
The usual.
```sh
pip3 install -r requirements.txt
```
Two of the libraries in the requirements file are custom kernels from other authors. Please make sure your GPUs can support them; I'll add alternative native versions later.
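A quick way to sanity-check your hardware first. The sm_80 (Ampere) threshold here is my assumption; check the kernel libraries' own docs for the actual requirement.

```python
import torch

# Assumption: the custom kernels want an Ampere-or-newer GPU (compute
# capability >= 8.0). Adjust the threshold to match the kernel docs.
if not torch.cuda.is_available():
    raise SystemExit("No CUDA device found; the custom kernels will not load.")
major, minor = torch.cuda.get_device_capability()
print(f"Compute capability: {major}.{minor}")
if (major, minor) < (8, 0):
    print("Warning: this GPU may be too old for the custom kernels.")
```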
- Learning to (Learn at Test Time): RNNs with Expressive Hidden States
- Titans: Learning to Memorize at Test Time
- Test-Time Training Done Right
- ATLAS: Learning to Optimally Memorize the Context at Test Time
- TNT: Improving Chunkwise Training For Test-Time Memorization
- ViT³: Unlocking Test-Time Training in Vision
Will add more papers as I find them.