conceptofmind / attention_sinks
This project is forked from tomaarsen/attention_sinks
Extend existing LLMs way beyond the original training length with constant memory usage, without retraining
Home Page: https://huggingface.co/blog/tomaarsen/attention-sinks
License: Apache License 2.0