#position-bias
1 paper
-
inspiration
Position Is Not Neutral
A model that fits 128K tokens can still fail to use information you placed at token 60K. The context window is a capacity claim. Where you put information inside that window is a separate engineering decision — one with a measurable performance cost if you get it wrong.