the new laptop only has 16GB of memory total, with another 7 dedicated to the NPU.
i tried pulling up Qwen 3 4B on it, but the max context i can get loaded is about 12k before the laptop crashes.
my next attempt is gonna be a 0.5B one, but i think ill still end up having to compress the context every call, which is my real challenge