XAttention is a plug-and-play sparse attention framework for Transformers that speeds up long-context inference by up to 13.5× without sacrificing accuracy. It introduces a lightweight metric based ...
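The description of the metric is cut off above; the XAttention paper scores the importance of attention blocks by summing antidiagonal values within each block and keeps only the highest-scoring blocks. A minimal NumPy sketch of that idea follows. Function names, the block size, and the keep ratio are illustrative assumptions, not the project's actual API:

```python
import numpy as np

def antidiagonal_block_scores(attn, block=4):
    """Score each (block x block) tile of an (L, L) attention matrix
    by the sum of its antidiagonal entries, a cheap proxy for tile
    importance (the idea behind XAttention's scoring; details here
    are a simplified sketch)."""
    L = attn.shape[0]
    nb = L // block  # assumes L is a multiple of `block` for simplicity
    scores = np.zeros((nb, nb))
    for i in range(nb):
        for j in range(nb):
            tile = attn[i * block:(i + 1) * block, j * block:(j + 1) * block]
            # Flip left-right so the antidiagonal becomes the main
            # diagonal, then take the trace to sum it.
            scores[i, j] = np.fliplr(tile).trace()
    return scores

def select_blocks(scores, keep_ratio=0.25):
    """Keep the top `keep_ratio` fraction of tiles; return a boolean
    mask marking which tiles of the attention matrix to compute."""
    k = max(1, int(keep_ratio * scores.size))
    thresh = np.sort(scores.ravel())[-k]
    return scores >= thresh
```

Dense attention would then be computed only inside the selected tiles, which is where the speedup comes from; everything outside the mask is skipped.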