Local private LLM
Motivation
- Large language models (LLMs) are very good at analyzing (large) texts
- There are multiple free-to-test providers on the internet
- Sensitive private or confidential data (like diary/journal, medical records, etc.) should NOT be shared with any external service
Solution
Run your own LLM
Installation
I use the nice open-source tool Ollama to manage and host the LLM.
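A quick way to install it (a sketch; check ollama.com for the current instructions and for the Windows installer):
macOS (Homebrew): brew install ollama
Linux (install script): curl -fsSL https://ollama.com/install.sh | sh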
Running
Start it in a terminal/command window by passing a model name (llama3.2 did not work well for me, see below)
ollama run llama3.2:latest
A list of supported LLM models can be found in the Ollama model library (https://ollama.com/library).
For me, the qwen3 model has worked best so far.
Besides the model, you need to select a model size that fits your hardware.
For me:
- MacBook Pro M1 16GB -> models up to 8b work, but with patience
- Windows PC with gaming graphics card (GPU) GeForce RTX 3060 12GB -> models up to 14b run very smoothly
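The size is selected via the model tag, for example (the exact tag names are taken from the model library and may change, so treat them as examples):
ollama run qwen3:8b
A model can also be downloaded without starting a chat session
ollama pull qwen3:14b
and all locally downloaded models are listed with
ollama list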
To check if the model fits into your GPU memory, first run
ollama run <model>
then open a second terminal window and run
ollama ps
to see whether the model is loaded into GPU memory or (partly) into system RAM.
Usage
chat commands
/clear : clear current session
/bye : exit
command line commands
show currently running model(s) and where they are loaded (RAM or GPU RAM)
ollama ps
stop a model (done automatically after 5 min of inactivity)
ollama stop llama3.2:latest
delete model
ollama rm llama3.2:latest
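A model can also be used non-interactively by passing the prompt directly (the model tag and file name below are just examples):
ollama run qwen3:8b "Summarize the following text in three bullet points"
Piping a local file into the prompt should also work:
cat journal.txt | ollama run qwen3:8b "Summarize the following text"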
Remote access
To make the Ollama server reachable from other machines on the network, set the environment variable
OLLAMA_HOST=192.168.0.123:11434
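For example on macOS/Linux, when starting the server manually (if Ollama runs as a background service, the variable has to be set in the service environment instead, e.g. in the systemd unit on Linux):
export OLLAMA_HOST=192.168.0.123:11434
ollama serve
On a client machine, setting the same variable to the server's address makes the local ollama CLI talk to the remote instance.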
Local storage of the models
macOS: ~/.ollama/models
Linux: /usr/share/ollama/.ollama/models
Windows: C:\Users\%username%\.ollama\models
Change via the env var OLLAMA_MODELS (an Ollama service restart is needed).
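For example on macOS/Linux (the target directory is just a placeholder):
export OLLAMA_MODELS=/data/ollama/models
If Ollama runs as a background service, set the variable in the service environment and restart the service afterwards.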