# energybench

Benchmark LLMs for energy consumption and task performance.
## Requirements

- Ollama. The Ollama server must be running in the background, with the models you want to benchmark already downloaded.
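You can confirm the server is reachable before starting a run. A minimal sketch (Ollama listens on port 11434 by default; the helper name here is ours, not part of energybench):

```python
import urllib.request
import urllib.error

def ollama_available(base_url: str = "http://localhost:11434") -> bool:
    """Return True if an Ollama server answers at base_url."""
    try:
        # /api/tags lists the locally downloaded models
        with urllib.request.urlopen(base_url + "/api/tags", timeout=2) as resp:
            return resp.status == 200
    except (urllib.error.URLError, OSError):
        return False
```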
## Installation

```bash
pip install -e .
```
## Usage

```bash
python3 src/energybench.py <model> <benchmark> [options]
```
### Arguments

- `model`: LLM model to use (e.g., `llama3.2`)
- `benchmark`: Benchmark to run (`gsm8k`, `mmlu`, `arc_easy`, `arc_challenge`, `boolq`)
### Options

- `-t, --temperature`: Sampling temperature (default: 0.0)
- `-s, --samples`: Number of samples to run (default: all)
- `-i, --iterations`: Iterations per prompt (default: 1)
- `-e, --evaluate`: Enable evaluation of responses
- `--evaluator`: Evaluator to use (`numeric`, `exact_match`, `multiple_choice`, `boolean`; default: `none`)
- `-p, --prompt-template`: Prompt template name (see `prompt_templates.py`)
- `-sp, --system-prompt`: System prompt name (see `prompt_templates.py`)
- `--no-energy`: Disable energy measurement
- `--offline`: Use offline datasets (download them first with `python3 src/download_datasets.py`)
- `--subset`: Use a dataset subset
- `--subset-size`: Size of the subset (default: 200)
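The evaluators compare a model's response against the reference answer from the dataset. As an illustration of what a `numeric` evaluator (suitable for GSM8K-style answers) might do — this is a sketch, not the shipped implementation:

```python
import re

def numeric_evaluate(response: str, reference: str) -> bool:
    """Compare the last number found in the response to the reference answer."""
    # Strip thousands separators, then pull out all integer/decimal literals
    numbers = re.findall(r"-?\d+(?:\.\d+)?", response.replace(",", ""))
    if not numbers:
        return False
    # Models usually state the final answer last, so compare the last number
    return float(numbers[-1]) == float(reference)
```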
### Example

```bash
python3 src/energybench.py llama3.2 arc_easy -e -s 100 -i 5
```

This runs `llama3.2` on 100 ARC-Easy samples, with 5 iterations per prompt and evaluation enabled.
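A run like the one above yields per-prompt energy and correctness figures, which can be combined into a single efficiency metric such as energy per correct answer. A sketch under assumed field names (`energy_j`, `correct` — the actual output schema of energybench may differ):

```python
def energy_per_correct(results: list[dict]) -> float:
    """Joules spent per correctly answered prompt."""
    total_energy = sum(r["energy_j"] for r in results)
    n_correct = sum(1 for r in results if r["correct"])
    return total_energy / n_correct if n_correct else float("inf")

# Hypothetical per-prompt results
runs = [
    {"energy_j": 12.5, "correct": True},
    {"energy_j": 14.0, "correct": False},
    {"energy_j": 11.5, "correct": True},
]
print(energy_per_correct(runs))  # 38.0 J over 2 correct answers -> 19.0
```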