
v0.1.2

@christopher-w-murphy christopher-w-murphy released this 26 Aug 18:38
· 37 commits to main since this release
1a9de7a

The attention bias in MosaicBERT has attn_bias.ndim == 4, so I generalized flash_attention_n to accommodate it.
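
As a rough illustration of what accepting a 4-dimensional bias can involve, the sketch below normalizes a bias of rank 2, 3, or 4 to a common 4-dimensional shape so it broadcasts against the attention scores. It is not the code from this release: the helper name expand_bias_to_4d, the use of NumPy, and the assumed axis order (batch, heads, q_len, k_len) are all assumptions made for the example.

```python
import numpy as np


def expand_bias_to_4d(attn_bias: np.ndarray) -> np.ndarray:
    """Hypothetical helper: view a bias as (batch, heads, q_len, k_len).

    Assumes the trailing axes are always (q_len, k_len) and that any
    missing leading axes (heads, then batch) should broadcast.
    """
    if attn_bias.ndim < 2 or attn_bias.ndim > 4:
        raise ValueError(f"expected attn_bias.ndim in 2..4, got {attn_bias.ndim}")
    # Prepend singleton axes until the array has rank 4.
    missing = 4 - attn_bias.ndim
    return attn_bias.reshape((1,) * missing + attn_bias.shape)


batch, heads, q_len, k_len = 2, 3, 4, 5
scores = np.zeros((batch, heads, q_len, k_len))

# A 2-D, 3-D, or 4-D bias all add cleanly to the 4-D scores.
for bias in (
    np.ones((q_len, k_len)),
    np.ones((heads, q_len, k_len)),
    np.ones((batch, heads, q_len, k_len)),
):
    biased = scores + expand_bias_to_4d(bias)
    assert biased.shape == scores.shape
```

With this kind of normalization a caller can pass a 4-dimensional bias directly, while lower-rank biases keep working through broadcasting.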