1. Install llama.cpp from https://github.com/ggerganov/llama.cpp. Alternatively, install https://github.com/oobabooga/text-generation-webui.
2. Go to https://huggingface.co/TheBloke and search for GGUF. Download a model file and put it in the same directory as the llama.cpp binaries. Then find the "example llama.cpp command line" on the model page and run it without the "-ngl 35" switch (that switch offloads layers to a GPU).
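The steps above might look like this on a Linux box. The Mixtral model file used here is only an example - substitute whatever GGUF file you actually downloaded from the model page - and the binary name (`main`) reflects llama.cpp at the time of writing, so check the project README if it has changed:

```shell
# Sketch of the workflow above, assuming a Linux/macOS shell with git,
# make, and wget available. The exact model filename is an assumption --
# check TheBloke's model page for the real file list.

# 1. Build llama.cpp from source
git clone https://github.com/ggerganov/llama.cpp
cd llama.cpp
make

# 2. Download a GGUF model into the same directory
wget https://huggingface.co/TheBloke/Mixtral-8x7B-Instruct-v0.1-GGUF/resolve/main/mixtral-8x7b-instruct-v0.1.Q4_K_M.gguf

# 3. Run the model-page example command, but drop "-ngl 35"
#    (that flag offloads layers to a GPU; leave it out for CPU-only use)
./main -m mixtral-8x7b-instruct-v0.1.Q4_K_M.gguf -p "Explain GGUF in one sentence." -n 128
```

The Q4_K_M quantization is a common middle-ground choice; smaller quants trade answer quality for RAM, larger ones the reverse.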
However, at this point, if your laptop has at least 32 GB of RAM, there is little point in trying anything except Mixtral 8x7B and its fine-tunes. It is fast (about 4 tokens per second on an 8-core Ryzen with no GPU acceleration, which would not work on integrated Ryzen APUs anyway, since they have no dedicated VRAM) and gives answers only slightly worse than ChatGPT 3.5. Its main deficiency is a tendency to forget the initial instruction: for example, when asked to explain a particular Samba configuration file, it started out fine but then went on to mention directives that were not in the file under discussion.