Running LLMs from a single file, with llamafile from Mozilla! All you need is:

curl -LO https://huggingface.co/jartine/llava-v1.5-7B-GGUF/resolve/main/llamafile-server-0.1-llava-v1.5-7b-q4
chmod 755 llamafile-server-0.1-llava-v1.5-7b-q4
./llamafile-server-0.1-llava-v1.5-7b-q4
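Once the server is running, it should expose the same HTTP API as the underlying llama.cpp server on port 8080, so you can query it from the command line too (the endpoint, port, and JSON fields here are assumptions based on that API, and the prompt is just illustrative):

```shell
# Hypothetical query against the local llamafile server; assumes the
# llama.cpp-style /completion endpoint on the default port 8080.
curl -s http://localhost:8080/completion \
  -H 'Content-Type: application/json' \
  -d '{"prompt": "Why is the sky blue?", "n_predict": 64}'
```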


Now I'm waiting for an easy way to fine-tune an LLM on a Mac (mostly doable already) or on a CPU; then the world of custom agents promised by @cstross will be within reach.