Llama-cpp: Difference between revisions
new heading MoE and styling improvements |
m added webui segment, and a warning about naming |
||
| Line 19: | Line 19: | ||
==== in NixOS ==== | ==== in NixOS ==== | ||
After enabling unfree software in NixOS, add llama-cpp with CUDA support to your packages:<syntaxhighlight lang="nixos"> | After enabling unfree software in NixOS, add llama-cpp with CUDA support to your packages:<syntaxhighlight lang="nixos">{ | ||
{ | |||
environment.systemPackages = [ | environment.systemPackages = [ | ||
(pkgs.llama-cpp.override { cudaSupport = true; }) | (pkgs.llama-cpp.override { cudaSupport = true; }) | ||
]; | ]; | ||
} | }</syntaxhighlight>Then switch to the new configuration: | ||
</syntaxhighlight>Then switch to the new configuration: | |||
sudo nixos-rebuild switch | sudo nixos-rebuild switch | ||
| Line 157: | Line 155: | ||
== llama-server == | == llama-server == | ||
<code>llama-server</code> runs a server, and it can run models on demand. It's quite similar to [[Ollama]]. | <code>llama-server</code> runs a server, and it can run models on demand. It supports the OpenAI API standard. It's quite similar to [[Ollama]]. | ||
You can manually start the server from your terminal; its usage is not that different from <code>llama-cli</code>. Here we look at the integration with NixOS as a service.<syntaxhighlight lang="nixos"> | |||
You can manually start the server from your terminal; its usage is not that different from <code>llama-cli</code>. Here we look at the integration with NixOS as a service. | |||
{{Warning|Note that the service is actually called <code>llama-cpp</code>, not <code>llama-server</code>.}}<syntaxhighlight lang="nixos"> | |||
{ | { | ||
services.llama-cpp = { | services.llama-cpp = { | ||
enable = true; | enable = true; | ||
package = pkgs.llama-cpp-vulkan; | package = pkgs.llama-cpp-vulkan; | ||
# package = (pkgs.llama-cpp.override { cudaSupport = true; }); | |||
# package = pkgs.llama-cpp-rocm; | |||
# Takes care of downloading if model not present | # Takes care of downloading if model not present | ||
modelsPreset = { | modelsPreset = { | ||
| Line 183: | Line 187: | ||
sudo nixos-rebuild switch | sudo nixos-rebuild switch | ||
</pre> | </pre> | ||
=== Web UI === | |||
The llama-cpp service includes a web interface where you can chat. To access it, navigate to http://localhost:8080, or use the port configured in <code>services.llama-cpp.port</code> instead of 8080. | |||
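As a minimal sketch of changing the port mentioned above: the value <code>8081</code> is an arbitrary example, not a recommendation, and the fragment assumes that <code>services.llama-cpp.port</code> takes a plain integer. The rest of the service configuration is left out.<syntaxhighlight lang="nixos">
{
  services.llama-cpp = {
    enable = true;
    # Example value only (assumption): with this setting the web UI
    # would be reached at http://localhost:8081 instead of port 8080.
    port = 8081;
  };
}
</syntaxhighlight>After changing the port, switch to the new configuration again for it to take effect.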
=== Troubleshooting === | === Troubleshooting === | ||