Reproducibility is a cornerstone of scientific research. localLLM is designed with reproducibility as a first-class feature, ensuring that your LLM-based analyses can be reliably replicated.
All generation functions in localLLM (quick_llama(), generate(), and generate_parallel()) use deterministic greedy decoding by default, so running the same prompt twice produces identical results.
library(localLLM)
# Run the same query twice
response1 <- quick_llama("What is the capital of France?")
response2 <- quick_llama("What is the capital of France?")
# Results are identical
identical(response1, response2)
#> [1] TRUE
Reproducibility is ensured even when temperature > 0, as long as you supply a seed:
# Stochastic generation with seed control
response1 <- quick_llama(
  "Write a haiku about data science",
  temperature = 0.9,
  seed = 92092
)
response2 <- quick_llama(
  "Write a haiku about data science",
  temperature = 0.9,
  seed = 92092
)
# Still reproducible with matching seeds
identical(response1, response2)
#> [1] TRUE
# Different seeds produce different outputs
response3 <- quick_llama(
  "Write a haiku about data science",
  temperature = 0.9,
  seed = 12345
)
identical(response1, response3)
#> [1] FALSE
All generation functions compute SHA-256 hashes for both inputs and outputs. These hashes enable verification that collaborators used identical configurations and obtained matching results.
result <- quick_llama("What is machine learning?")
# Access the hashes
hashes <- attr(result, "hashes")
print(hashes)
#> $input
#> [1] "a3f2b8c9d4e5f6a7b8c9d0e1f2a3b4c5d6e7f8a9b0c1d2e3f4a5b6c7d8e9f0a1"
#>
#> $output
#> [1] "b4c5d6e7f8a9b0c1d2e3f4a5b6c7d8e9f0a1b2c3d4e5f6a7b8c9d0e1f2a3b4c5"
The input hash includes:

- Model identifier
- Prompt text
- Generation parameters (temperature, seed, max_tokens, etc.)
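Conceptually, the input hash is a SHA-256 digest of a canonical serialization of these fields. The sketch below illustrates the idea using the digest package; the canonical format shown is hypothetical and localLLM's actual internal serialization may differ:

```r
library(digest)

# Hypothetical canonical serialization of the inputs; localLLM's
# internal format may differ
canonical_input <- paste(
  "model=Llama-3.2-3B-Instruct-Q5_K_M.gguf",
  "prompt=What is machine learning?",
  "temperature=0",
  "seed=1234",
  "max_tokens=100",
  sep = "|"
)

# Any change to the model, prompt, or parameters changes this digest
digest(canonical_input, algo = "sha256")
```

Because every parameter feeds into the digest, two runs with matching input hashes are guaranteed to have used the same configuration.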
The output hash covers the generated text, allowing collaborators to verify they obtained matching results.
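A verification sketch: compare your hash attributes against a collaborator's (collab_hashes here is a hypothetical object they might share, e.g. via saveRDS()):

```r
my_hashes <- attr(result, "hashes")

# collab_hashes: hypothetical hash list received from a collaborator
same_input  <- identical(my_hashes$input,  collab_hashes$input)
same_output <- identical(my_hashes$output, collab_hashes$output)

if (same_input && !same_output) {
  message("Same configuration but different outputs - investigate!")
}
```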
For multi-model comparisons, explore() computes hashes per model:
res <- explore(
  models = models,
  prompts = template_builder,
  hash = TRUE
)
# View hashes for each model
hash_df <- attr(res, "hashes")
print(hash_df)
#>   model_id input_hash output_hash
#> 1 gemma4b a3f2b8c9d4e5f6a7b8c9d0e1f2a3b4c5... b4c5d6e7f8a9b0c1d2e3f4a5b6c7d8e9...
#> 2 llama3b c5d6e7f8a9b0c1d2e3f4a5b6c7d8e9f0... d6e7f8a9b0c1d2e3f4a5b6c7d8e9f0a1...
Set hash = FALSE to disable hash computation if not needed.
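The per-model hash table makes cross-machine checks straightforward. A sketch, assuming your collaborator shares their hash data frame (collab_df is hypothetical; column names follow the output above):

```r
# Join the two hash tables on model_id and compare per-model results
merged <- merge(
  hash_df, collab_df,
  by = "model_id",
  suffixes = c("_mine", "_theirs")
)
all(merged$output_hash_mine == merged$output_hash_theirs)
```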
Use document_start() and document_end() to capture everything that happens during your analysis in a complete session log:
# Start documentation
document_start(path = "analysis-log.txt")
# Run your analysis
result1 <- quick_llama("Classify this text: 'Great product!'")
result2 <- explore(models = models, prompts = prompts)
# End documentation
document_end()

The log file contains a complete audit trail:
localLLM Run Log
File: /path/to/analysis-log.txt
Started: 2025-01-15 14:30:22 EST
Ended: 2025-01-15 14:35:12 EST
Duration: 289.9 seconds
Events:
- [2025-01-15 14:30:22 EST] document_start
{
"package_version": "1.2.1",
"r_version": "4.4.1",
"platform": "aarch64-apple-darwin22.6.0",
"os": "Darwin",
"user": "researcher",
"working_directory": "/home/user/analysis"
}
- [2025-01-15 14:30:25 EST] quick_llama
{
"model": "Llama-3.2-3B-Instruct-Q5_K_M.gguf",
"prompt_count": 1,
"n_gpu_layers": 999,
"n_ctx": 2048,
"max_tokens": 100,
"temperature": 0,
"seed": 1234,
"auto_format": true,
"clean": false
}
- [2025-01-15 14:30:25 EST] quick_llama_hash
{
"input_hash": "a3f2b8c9...",
"output_hash": "b4c5d6e7..."
}
- [2025-01-15 14:35:12 EST] document_end
{
"duration_seconds": 289.9,
"total_events": 4
}
Hash (SHA-256): e7f8a9b0c1d2e3f4a5b6c7d8e9f0a1b2...
Even with temperature = 0, explicitly setting a seed documents your intent.
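For example, a deterministic call with an explicit seed (the seed value is arbitrary):

```r
response <- quick_llama(
  "What is the capital of France?",
  temperature = 0,  # deterministic greedy decoding
  seed = 42         # recorded in the audit log and input hash
)
```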
Record your setup at the start of your analysis with hardware_profile():

hardware_profile()
#> $os
#> [1] "Darwin"
#>
#> $cpu_cores
#> [1] 10
#>
#> $ram_total
#> [1] 17179869184
#>
#> $gpu
#> $gpu$name
#> [1] "Apple M2 Pro"
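To keep a permanent record, you might save the profile alongside your results (a sketch; the file name is arbitrary):

```r
hw <- hardware_profile()
saveRDS(hw, "hardware-profile.rds")
```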
Wrap your entire analysis in document_start()/document_end() calls, as shown above.
| Feature | Function/Parameter | Purpose |
|---|---|---|
| Deterministic output | temperature = 0 (default) | Same input = same output |
| Seed control | seed = 42 | Reproducible stochastic generation |
| Hash verification | attr(result, "hashes") | Verify identical configurations |
| Audit trails | document_start()/document_end() | Complete session logging |
| Hardware info | hardware_profile() | Record execution environment |
With these tools, your LLM-based analyses become fully reproducible and verifiable.