Technically yes - if you have lots of RAM you can use that and your CPU. As you say, though, the performance would be pretty poor, especially as it’s a tool where you want to tweak your responses quite frequently. I’ve been running an old Nvidia Tesla P100 card that I got cheap on eBay for a while now. It has 16 GB of VRAM, but it is pretty old. I’m so interested in this now that I’ve gone out and got myself a secondhand RTX 3090 - something I never thought I’d do, but I’d really like to run 30B models on the GPU.
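
For a rough sense of why the extra VRAM matters for 30B models, here is a back-of-the-envelope sketch. It counts only the weights (parameter count times bits per weight) and ignores the context cache and runtime overhead, so real usage will be higher. The function name and the bit widths tried are assumptions for illustration, and the 24 GB figure is the RTX 3090's VRAM.

```python
def weight_size_gb(params_billions: float, bits_per_weight: float) -> float:
    """Approximate size of the model weights alone, in GB (1 GB = 1e9 bytes)."""
    total_bytes = params_billions * 1e9 * bits_per_weight / 8
    return total_bytes / 1e9


# VRAM per card in GB; weights-only comparison, so treat "fits" as optimistic.
CARDS_GB = {"Tesla P100": 16, "RTX 3090": 24}

for bits in (16, 8, 4):
    size = weight_size_gb(30, bits)
    fits = [name for name, vram in CARDS_GB.items() if size < vram]
    print(f"30B @ {bits}-bit: ~{size:.0f} GB of weights, fits on: {fits or 'neither'}")
```

By this estimate a 30B model only gets near either card at around 4 bits per weight, where the weights alone come to about 15 GB. That technically squeezes under 16 GB but leaves almost nothing for context, while 24 GB leaves some headroom.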