This happens because flashinfer currently uses -5e4 as a surrogate for -inf, and when the sequence length is large the ALiBi bias can become smaller than -5e4. We chose -5e4 mainly because -inf breaks some operations (they produce NaN), and we want the value to stay within the valid range of the data type of m (fp32 in almost all cases, but we provide an option to use fp16 when allow_fp16_qk_reduction=True).
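To make the trade-off concrete, here is a minimal NumPy sketch (not flashinfer's kernel code). It assumes the standard ALiBi form `bias = -slope * distance` and a largest slope of 0.5, which is what the usual schedule gives for 8 heads. The names `NEG_INF_SURROGATE` and `alibi_bias` are hypothetical and used only for illustration.

```python
import numpy as np

# Surrogate value quoted in the comment above; the constant name is hypothetical.
NEG_INF_SURROGATE = -5e4

# 1) With a true -inf, subtracting the running max from itself gives nan.
with np.errstate(invalid="ignore"):
    neg_inf = np.float32(-np.inf)
    nan_from_inf = np.exp(neg_inf - neg_inf)  # exp(-inf - (-inf)) = exp(nan)

# With a finite surrogate the same expression is well defined.
surrogate = np.float32(NEG_INF_SURROGATE)
finite_result = np.exp(surrogate - surrogate)  # exp(0)

# 2) -5e4 is still finite in fp16 (largest finite fp16 value is 65504).
fits_fp16 = bool(np.isfinite(np.float16(NEG_INF_SURROGATE)))

# 3) An ALiBi bias can fall below the surrogate for long sequences.
def alibi_bias(slope: float, distance: int) -> float:
    """Assumed ALiBi form: a linear penalty on the query-key distance."""
    return -slope * distance

slope = 0.5          # assumption: largest slope for 8 heads
distance = 200_000   # assumption: a long-context query-key distance
bias = alibi_bias(slope, distance)
below_surrogate = bias < NEG_INF_SURROGATE

print(nan_from_inf, finite_result, fits_fp16, bias, below_surrogate)
```

Under these assumptions, any query-key distance above 100,000 with slope 0.5 yields a bias below -5e4, so a legitimate logit can end up smaller than the value meant to stand in for "minus infinity".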
latest main, A100