The right context window for the right job
Bigger is not better. The advertised number lies. Pick the smallest window that fits the work and you save money and time and protect accuracy.
5 long-form essays. Deep dives, opinions, tutorials.
43 words a second. 0.29 second wait. 20 of 20 coding answers correct. 19.5 GB used out of 24. The card runs this AI very well, and it can hold a 128k window with 4 chats at the same time.
44 words a second. 0.26 second wait. 19 of 20 coding answers worked. 18 GB used out of 24. The card runs this AI well, and it can hold a 128k window with 4 chats at the same time.
105 words a second solo. 160 words a second with 2 chats. 0.56 second wait. 32 GB card almost full. Tested on vLLM.
Three ways to work with Claude. They are not interchangeable. Using the wrong one wastes hours. Here is when to reach for each.