Ollama
To find out whether a model is running on CPU or GPU, there are two options.
You can look at the logs of
<syntaxhighlight lang="bash">
$ ollama serve
</syntaxhighlight>
and search for entries like "looking for compatible GPUs" and "new model will fit in available VRAM in single GPU, loading".
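If Ollama runs as a systemd service (for example when installed as a standalone system package), the same log lines can be filtered directly. A minimal sketch, assuming the service unit is named <code>ollama</code>:
<syntaxhighlight lang="bash">
# Assumes Ollama runs as a systemd service named "ollama";
# adjust the unit name if your setup differs.
$ journalctl -u ollama | grep -i "compatible GPUs"
</syntaxhighlight>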
Alternatively, while a model is answering, run in another terminal
<syntaxhighlight lang="bash">
$ ollama ps
</syntaxhighlight>
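The PROCESSOR column of the output shows how the model is split between CPU and GPU. Illustrative output (the values here are made up; exact columns vary by Ollama version and hardware):
<syntaxhighlight lang="bash">
$ ollama ps
NAME                      ID              SIZE      PROCESSOR    UNTIL
codellama:13b-instruct    9f438cb9cd58    7.4 GB    100% GPU     4 minutes from now
</syntaxhighlight>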
<syntaxhighlight lang="bash">
$ ollama run codellama:13b-instruct "Write an extended Python program with a typical structure. It should print the numbers 1 to 10 to standard output."
</syntaxhighlight>
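The prompt can also be supplied on standard input, which is handy for scripting. A small sketch (<code>prompt.txt</code> is a hypothetical file containing your prompt):
<syntaxhighlight lang="bash">
# Non-interactive use: when stdin is not a terminal,
# ollama run reads the prompt from it and exits after answering.
$ ollama run codellama:13b-instruct < prompt.txt
</syntaxhighlight>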
=== See usage and speed statistics ===
Add "--verbose" to see statistics after each prompt: | |||
<syntaxhighlight lang="bash">
$ ollama run codellama:13b-instruct --verbose "Write an extended Python program..."
...
total duration:       50.302071991s
load duration:        50.912267ms
prompt eval count:    49 token(s)
prompt eval duration: 4.654s
prompt eval rate:     10.53 tokens/s   <- how fast it processed your input prompt
eval count:           182 token(s)
eval duration:        45.595s
eval rate:            3.99 tokens/s    <- how fast it printed a response
</syntaxhighlight>
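The same timing numbers are reported by Ollama's HTTP API: with <code>"stream": false</code>, the <code>/api/generate</code> endpoint (served on port 11434 by default) returns a single JSON object whose fields include <code>total_duration</code>, <code>load_duration</code>, <code>prompt_eval_count</code>, <code>eval_count</code> and <code>eval_duration</code>, with durations in nanoseconds. A minimal sketch (the prompt is just an example, and <code>jq</code> is assumed to be installed):
<syntaxhighlight lang="bash">
# Query the local Ollama API; jq only picks the statistics
# fields out of the final JSON response.
$ curl -s http://localhost:11434/api/generate \
    -d '{"model": "codellama:13b-instruct", "prompt": "Print 1 to 10 in Python.", "stream": false}' \
  | jq '{total_duration, load_duration, prompt_eval_count, eval_count, eval_duration}'
</syntaxhighlight>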