pensar

Personal wiki engine for R with a large language model (LLM) as research assistant.

Alpha. Expect bugs and missing functionality. Feel free to open an issue if you hit a bug or need a feature that isn’t already tracked.

An open-source R implementation of the Obsidian markdown vault concept combined with Karpathy’s LLM wiki pattern. Three modes share the same vault: autonomous research, guided work with an AI agent, and manual ingest.

Vaults are plain markdown with YAML frontmatter and [[wikilinks]] — diffable, version-controllable, Obsidian-compatible.

pensar (“to think”) has four base dependencies: curl, digest, stringdist, yaml. The LLM workflows are gated by Suggests: (llm.api, jsonlite).

Install

install.packages("pensar")   # or
remotes::install_github("cornball-ai/pensar")

Quick start

Autonomous research

library(pensar)

init_vault("~/wiki")
use_vault("~/wiki")

Sys.setenv(TAVILY_API_KEY = "tvly-...")        # free tier at tavily.com
Sys.setenv(ANTHROPIC_API_KEY = "sk-ant-...")   # or OPENAI_API_KEY / MOONSHOT_API_KEY

res <- autoresearch("transformer scaling laws")
print(res)
show_page(res$synthesis$slug)

Without API keys, autoresearch() falls back to deterministic heuristics so the pipeline still runs (search-and-file with no LLM judgment, much lower output quality).
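To see ahead of time which path a run is likely to take, you can check the environment for the keys named above. This is a minimal sketch in base R: llm_keys_present() is a hypothetical helper, not part of pensar, and it assumes a key counts as present whenever the variable is non-empty.

```r
# Hypothetical helper (not part of pensar): report which of the API keys
# mentioned above are set to a non-empty value in the current session.
llm_keys_present <- function(vars = c("TAVILY_API_KEY", "ANTHROPIC_API_KEY",
                                      "OPENAI_API_KEY", "MOONSHOT_API_KEY")) {
  vals <- Sys.getenv(vars, unset = "", names = FALSE)
  # Return a named logical vector, one entry per variable.
  structure(nzchar(vals), names = vars)
}

keys <- llm_keys_present()
if (!any(keys)) {
  message("No API keys found; expect the heuristic fallback described above.")
}
```

The check only looks at whether the variables are set. It does not validate the keys against any provider.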

Adopt an existing Obsidian vault

init_vault("~/Documents/my-obsidian-vault", adopt = TRUE)
status("~/Documents/my-obsidian-vault")

Adopt mode adds only three small files (schema.md, index.md, log.md) and flags the vault as read-first. ingest() and the wiki writer refuse to write unless you pass force = TRUE. See vignette("adopt-obsidian", package = "pensar") for a walkthrough across six real Obsidian vaults.

Manual ingest

ingest("Article content here...",
       type = "articles",
       source = "https://example.com/interesting-post",
       title = "Interesting Post")

status()
backlinks("Interesting Post")

Configuring the vault path

Per CRAN policy, pensar never writes to a default home-filespace location. You have to point it somewhere. Resolution order (first hit wins):

  1. The vault = argument when passed explicitly.
  2. The PENSAR_VAULT environment variable.
  3. Walk-up from getwd() for a directory containing schema.md.
  4. options("pensar.vault"), set by use_vault() — typically in ~/.Rprofile.

If none resolves, pensar errors with a setup hint.
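The resolution order above can be sketched in base R as follows. This is an illustration of the documented order, not pensar's actual implementation; resolve_vault_sketch() and its start argument are hypothetical names introduced here.

```r
# Illustrative sketch of the resolution order above. NOT pensar's real code;
# resolve_vault_sketch() is a hypothetical name.
resolve_vault_sketch <- function(vault = NULL, start = getwd()) {
  # 1. An explicit argument wins.
  if (!is.null(vault)) return(path.expand(vault))

  # 2. The PENSAR_VAULT environment variable.
  env <- Sys.getenv("PENSAR_VAULT", unset = "")
  if (nzchar(env)) return(path.expand(env))

  # 3. Walk up from the starting directory looking for schema.md.
  dir <- normalizePath(start, mustWork = FALSE)
  repeat {
    if (file.exists(file.path(dir, "schema.md"))) return(dir)
    parent <- dirname(dir)
    if (identical(parent, dir)) break  # reached the filesystem root
    dir <- parent
  }

  # 4. The session option that use_vault() is described as setting.
  opt <- getOption("pensar.vault")
  if (!is.null(opt)) return(path.expand(opt))

  stop("No vault found: pass vault =, set PENSAR_VAULT, or call use_vault().")
}
```

Because the walk-up step comes before the option, working inside a vault directory takes precedence over a path remembered in ~/.Rprofile.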

Vault structure

{vault}/
  raw/
    articles/       clipped articles, autoresearch sources, pasted text
    chats/          conversation logs worth keeping
    briefings/      project briefings
    matrix/         messages from Matrix rooms
  wiki/             synthesized pages (autoresearch + your edits)
  index.md          auto-generated catalog of everything
  log.md            append-only record of operations
  schema.md         conventions for content in the vault
  CLAUDE.md         instructions for Claude Code when started here
  AGENTS.md         same content for Codex and other agents
  {name}.Rproj      RStudio project file
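
A quick way to compare a directory against the layout above is to test each expected entry. This is a minimal base R sketch; check_vault_layout() is a hypothetical helper, not a pensar function, and the list of entries is copied from the tree above (the agent files and the .Rproj file are left out because they are optional or vault-specific).

```r
# Hypothetical helper (not part of pensar): report which entries from the
# layout above exist under `vault`.
check_vault_layout <- function(vault) {
  expected <- c("raw/articles", "raw/chats", "raw/briefings", "raw/matrix",
                "wiki", "index.md", "log.md", "schema.md")
  # file.exists() returns TRUE for directories as well as regular files.
  present <- file.exists(file.path(vault, expected))
  structure(present, names = expected)
}
```

In an adopted vault, only schema.md, index.md, and log.md are added, so the other entries may legitimately be reported as missing.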

Working with an AI agent (guided mode)

init_vault() seeds CLAUDE.md and AGENTS.md so any agent you start in the vault (Claude Code, Codex, etc.) knows how to operate on it — what files are immutable, how to drill down with pensar show, when to rebuild the site.

A Claude Code autoresearch skill ships at system.file("skills/pensar/autoresearch", package = "pensar"). Symlink it into your skills directory:

ln -s "$(Rscript -e 'cat(pensar::pensar_skill_path())')" ~/.claude/skills/pensar

The skill routes research requests through pensar::autoresearch(), not through a manual WebSearch/WebFetch/file-edit loop. Pass agent_instructions = FALSE to init_vault() if you don’t want CLAUDE.md / AGENTS.md.

Versioning: git or syncthing?

Use both, for different things:

Don’t sync the vault source via Syncthing. Concurrent edits from multiple devices on the same .md file get messy.

Functions

Function                       What it does
autoresearch(topic, vault)     Decompose, search, ingest, synthesize into the vault
init_vault(path, adopt)        Create or adopt a vault
use_vault(path)                Remember a vault path for this session
ingest(content, type, source)  Write a source to raw/, update index + log
ingest_url(url)                Fetch + file a URL into raw/articles/
update_index(vault)            Regenerate index.md from all vault pages
vault_commit(message)          Auto-commit vault changes (no-op for non-git)
status(vault)                  Page counts by category
backlinks(page)                Find all pages linking to a given page
outlinks(page)                 Find pages this page cites
related_pages(name, k)         Surface related pages by overlap
show_page(page)                Content + outlinks + backlinks
lint(vault)                    Orphans, broken wikilinks, tag clusters
vault_export(vault, out_dir)   Render vault to static HTML (requires pandoc)

A pensar CLI is installed at {pkg}/bin/pensar:

pensar status              page counts by category
pensar lint                health check
pensar show "<page>"       drill-down inspection
pensar back "<page>"       backlinks only
pensar tag <tag>           pages with this tag
pensar log [n]             last n log entries
pensar export [out-dir]    render to static HTML

Symlink {pkg}/bin/pensar to somewhere on your PATH (e.g., ~/.local/bin/pensar) to use it as a command.

Conventions

Every page uses YAML frontmatter and plain [[wikilinks]]. Obsidian-style aliases ([[page-slug|display text]]) are supported: pensar resolves page-slug as the target and renders display text for readers.

---
title: Page Title
type: concept
source: "[[Raw Source]]"
date: 2026-04-13
tags:
  - R
  - testing
---

This connects to [[Other Page]] and builds on [[Raw Source]].
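
To make the link convention concrete, here is a minimal base R sketch that pulls the target and display text out of plain and aliased wikilinks. parse_wikilinks() is a hypothetical helper, not pensar's parser, and it assumes a link never spans more than one line.

```r
# Hypothetical helper (not pensar's parser): extract [[target]] and
# [[target|display]] links from a character vector of markdown lines.
parse_wikilinks <- function(text) {
  text <- paste(text, collapse = "\n")
  # Non-greedy match between [[ and ]]; "." does not cross newlines.
  m <- regmatches(text, gregexpr("\\[\\[.+?\\]\\]", text, perl = TRUE))[[1]]
  inner <- substr(m, 3, nchar(m) - 2)          # strip the surrounding brackets
  has_alias <- grepl("|", inner, fixed = TRUE)
  target <- trimws(sub("\\|.*$", "", inner))   # text before the first "|"
  display <- target                            # default: show the target itself
  display[has_alias] <- trimws(sub("^[^|]*\\|", "", inner[has_alias]))
  data.frame(target = target, display = display, stringsAsFactors = FALSE)
}

links <- parse_wikilinks("See [[Other Page]] and [[page-slug|display text]].")
# links$target  is c("Other Page", "page-slug")
# links$display is c("Other Page", "display text")
```

Run over a whole page, the sketch also picks up links inside frontmatter values such as the source field above.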

Sister packages

Package   Purpose
saber     Context engineering for LLM agents (CRAN)
pensar    Personal LLM wiki engine (this package)
corteza   Agent runtime and chat loop
llm.api   LLM provider connectivity (CRAN)
mx.api    Matrix Client-Server API

License

Apache 2.0