API and Database Reference

Overview

metasurvey provides a REST API, built with plumber and backed by MongoDB, for sharing recipes, workflows, and variable metadata with the community. The API can be self-hosted (see vignette("self-hosting")) and is used by both the R client functions (api_*) and the Shiny exploration application.

After deployment, the Swagger UI at <your-api-url>/__docs__/ provides an interactive endpoint explorer that plumber generates automatically. Request/response details and MongoDB collection documentation are in the sections below.

Configuration

library(metasurvey)

# Point to your self-hosted API
configure_api("https://your-api-host.example.com")

# Or use an environment variable
Sys.setenv(METASURVEY_API_URL = "https://your-api-host.example.com")

The R client reads the URL first from configure_api(), then falls back to the METASURVEY_API_URL environment variable.
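
The lookup order described above can be sketched in a few lines of base R. This is only an illustration: resolve_api_url() is a hypothetical helper, not part of metasurvey, and the error raised when nothing is configured is an assumption.

```r
# Hypothetical helper that mirrors the documented lookup order.
# It is not metasurvey's internal implementation.
resolve_api_url <- function(configured = NULL,
                            env_value = Sys.getenv("METASURVEY_API_URL", unset = "")) {
  if (!is.null(configured) && nzchar(configured)) {
    return(configured)   # 1. a value set through configure_api() wins
  }
  if (nzchar(env_value)) {
    return(env_value)    # 2. otherwise fall back to the environment variable
  }
  # Assumption: fail loudly when neither source provides a URL.
  stop("No API URL configured", call. = FALSE)
}

resolve_api_url("https://a.example.com", env_value = "https://b.example.com")
# "https://a.example.com"
```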

Authentication

The API uses JWT (JSON Web Token) authentication with HMAC-SHA256 signing. Standard tokens expire after 24 hours; long-lived tokens (90 days) can be generated for automated scripts.

Registration

# Individual account (auto-approved)
api_register("Ana Garcia", "ana@example.com", "password123")

# Institutional member (requires admin review)
api_register(
  "Carlos Rodriguez",
  "carlos@ine.gub.uy",
  "password123",
  user_type = "institutional_member",
  institution = "INE Uruguay"
)

Account types:

Type	Description	Approval
`individual`	Independent researcher	Automatic
`institutional_member`	Member of a recognized institution	Requires admin review
`institution`	Institutional account	Requires admin review

Login

api_login("ana@example.com", "password123")

The token is stored in the session and used automatically in subsequent API calls. The client automatically renews a token once it is within 5 minutes of expiring.
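
To make the renewal window concrete, the check can be written as a comparison between the remaining lifetime and a 5-minute threshold. needs_refresh() is a hypothetical helper for illustration; how the client actually tracks expiration is not described here.

```r
# Hypothetical helper: TRUE when the token has 5 minutes or less left
# (an already expired token also returns TRUE).
needs_refresh <- function(expires_at, now = Sys.time(), window_secs = 5 * 60) {
  remaining <- as.numeric(difftime(expires_at, now, units = "secs"))
  remaining <= window_secs
}

issued  <- as.POSIXct("2026-02-15 12:00:00", tz = "UTC")
expires <- issued + 24 * 60 * 60                 # standard 24-hour lifetime

needs_refresh(expires, now = issued + 23 * 3600) # FALSE: one hour left
needs_refresh(expires, now = expires - 4 * 60)   # TRUE: four minutes left
```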

Session Management

# View current user profile
api_me()

# Refresh token
api_refresh_token()

# Logout
api_logout()

Long-lived Tokens

For automated scripts and CI/CD, generate a 90-day token from the Shiny application (Profile tab) and pass it to the client through the METASURVEY_TOKEN environment variable:

Sys.setenv(METASURVEY_TOKEN = "your-long-lived-token")

# API calls work without interactive login
recipes <- api_list_recipes(survey_type = "ech")
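
For scripts that call the HTTP endpoints without the R client, the token has to be attached to each request. The sketch below assumes the conventional "Bearer" scheme in the Authorization header; that scheme and the auth_header() helper are assumptions, not something this reference confirms.

```r
# Hypothetical helper. The "Bearer" scheme is an assumption: the reference
# only states that JWTs are used and that the Authorization header is allowed.
auth_header <- function(token = Sys.getenv("METASURVEY_TOKEN", unset = "")) {
  if (!nzchar(token)) {
    return(character())   # no token: only public endpoints will be usable
  }
  c(Authorization = paste("Bearer", token))
}

auth_header("your-long-lived-token")
```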

API Endpoints

Recipes

Method	Endpoint	Auth	Description
`GET`	`/recipes`	No	List and search recipes
`GET`	`/recipes/:id`	No	Get an individual recipe
`POST`	`/recipes`	Yes	Publish a new recipe
`POST`	`/recipes/:id/download`	No	Increment download counter

List Recipes

# All recipes
all <- api_list_recipes()

# Filter by survey type
ech <- api_list_recipes(survey_type = "ech")

# Search by text
labor <- api_list_recipes(search = "empleo")

# Filter by topic
income <- api_list_recipes(topic = "income")

# Filter by certification level
official <- api_list_recipes(certification = "official")

# Pagination
page2 <- api_list_recipes(limit = 10, offset = 10)

Query parameters:

Parameter	Type	Description
`search`	string	Regex search on recipe name
`survey_type`	string	`ech`, `eaii`, `eph`, `eai`
`topic`	string	`labor_market`, `income`, `education`, `health`, `demographics`, `housing`
`certification`	string	`community`, `reviewed`, `official`
`user`	string	Filter by author email
`limit`	integer	Maximum results (default 50)
`offset`	integer	Skip N results (default 0)
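
As a rough illustration of how these parameters combine into a request, the following base R sketch builds a GET /recipes URL and derives offset from a page number. recipes_url() and page_offset() are hypothetical helpers; the R client may assemble its requests differently.

```r
# Hypothetical helper: builds a GET /recipes URL from the documented parameters.
recipes_url <- function(base_url, ..., limit = 50, offset = 0) {
  params <- c(list(...), list(limit = limit, offset = offset))
  params <- params[!vapply(params, is.null, logical(1))]   # drop unset filters
  values <- vapply(
    params,
    function(v) utils::URLencode(as.character(v), reserved = TRUE),
    character(1)
  )
  pairs <- paste0(names(params), "=", values)
  paste0(sub("/+$", "", base_url), "/recipes?", paste(pairs, collapse = "&"))
}

# Page n (1-based) with page size `limit` starts at offset (n - 1) * limit.
page_offset <- function(page, limit) (page - 1) * limit

recipes_url("https://your-api-host.example.com",
            survey_type = "ech", topic = "labor_market",
            limit = 10, offset = page_offset(2, 10))
```

With limit = 10, the second page starts at offset 10, which matches the pagination call shown earlier.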

Get Recipe

recipe <- api_get_recipe("ech_employment_001")

Publish Recipe

api_login("ana@example.com", "password123")
api_publish_recipe(my_recipe)

The server automatically sets the user field from the JWT, initializes downloads = 0, generates an id if not provided, and assigns the community certification by default.
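
The defaults listed above amount to a small transformation of the submitted document. The sketch below shows one way to express it; apply_publish_defaults() is hypothetical, and the id format in particular is an assumption, since the reference does not say how ids are generated.

```r
# Hypothetical server-side helper, for illustration only.
apply_publish_defaults <- function(recipe, jwt_email) {
  recipe$user <- jwt_email        # author always comes from the JWT
  recipe$downloads <- 0           # download counter starts at zero
  if (is.null(recipe$id) || !nzchar(recipe$id)) {
    # Assumption: any unique string would do; the real format is not documented.
    recipe$id <- paste0("r_", format(Sys.time(), "%Y%m%d%H%M%S"), "_",
                        sample.int(99999, 1))
  }
  if (is.null(recipe$certification)) {
    recipe$certification <- list(level = "community")
  }
  recipe
}

published <- apply_publish_defaults(
  list(name = "Employment indicators", user = "someone-else@example.com"),
  jwt_email = "ana@example.com"
)
published$user   # "ana@example.com"
```

Taking user from the token rather than from the request body means a client cannot publish under another author's email.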

Workflows

Method	Endpoint	Auth	Description
`GET`	`/workflows`	No	List and search workflows
`GET`	`/workflows/:id`	No	Get an individual workflow
`POST`	`/workflows`	Yes	Publish a new workflow
`POST`	`/workflows/:id/download`	No	Increment download counter

# List workflows for ECH
wf <- api_list_workflows(survey_type = "ech")

# Find workflows that use a specific recipe
wf <- api_list_workflows(recipe_id = "ech_employment_001")

# Get specific workflow
w <- api_get_workflow("wf_labor_market_001")

# Publish
api_publish_workflow(my_workflow)

ANDA Variable Metadata

Note: The ANDA integration is an unofficial implementation that parses DDI XML metadata from INE Uruguay’s public ANDA catalog. It is not endorsed by INE and may contain errors or become outdated if INE changes the catalog structure. Always verify critical variable definitions against the official codebook.

The /anda/variables endpoint provides variable metadata obtained from INE Uruguay’s ANDA catalog (DDI XML format). This includes variable labels, value categories, and type information.

Method	Endpoint	Auth	Description
`GET`	`/anda/variables`	No	Get variable metadata

# Get all ECH variables
vars <- api_get_anda_variables(survey_type = "ech")

# Get specific variables
vars <- api_get_anda_variables(
  survey_type = "ech",
  var_names = c("pobpcoac", "e27", "ht11")
)

Query parameters:

Parameter	Type	Description
`survey_type`	string	Survey type (default `"ech"`)
`names`	string	Comma-separated variable names (all if empty)
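
The R function takes a character vector (var_names), while the endpoint expects a single comma-separated names value. A plausible mapping is sketched below; anda_query() is a hypothetical helper, and lowercasing the names on the client side is an assumption based on the lowercase names stored in the database.

```r
# Hypothetical helper: maps R-side arguments to the endpoint's query parameters.
anda_query <- function(survey_type = "ech", var_names = NULL) {
  query <- list(survey_type = survey_type)
  if (length(var_names) > 0) {
    # Assumption: names are lowercased to match the stored variable names.
    query$names <- paste(tolower(var_names), collapse = ",")
  }
  query   # no `names` entry means "return all variables"
}

anda_query("ech", c("pobpcoac", "e27", "ht11"))$names
# "pobpcoac,e27,ht11"
```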

Each variable document contains:

Field	Description
`name`	Variable name (lowercase)
`label`	Human-readable label
`type`	`discrete`, `continuous`, or `unknown`
`value_labels`	List of code-label mappings
`description`	Extended description
`source_edition`	Survey edition (e.g., `"2024"`)
`source_catalog_id`	ANDA catalog ID (e.g., `767`)

Administration

Method	Endpoint	Auth	Description
`GET`	`/admin/pending-users`	Admin	List institutional accounts pending review
`POST`	`/admin/approve/:email`	Admin	Approve an institutional account
`POST`	`/admin/reject/:email`	Admin	Reject an institutional account

Admin access is controlled via the METASURVEY_ADMIN_EMAIL environment variable on the server.

Health Check

Method	Endpoint	Auth	Description
`GET`	`/health`	No	API and MongoDB status

{
  "status": "ok",
  "service": "metasurvey-api",
  "version": "2.0.0",
  "database": "metasurvey",
  "mongodb": "connected",
  "timestamp": "2026-02-15T12:00:00Z"
}
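
A monitoring script can reduce this response to a single pass/fail value. The sketch below works on the already parsed body (for example, the result of jsonlite::fromJSON()); is_healthy() is a hypothetical helper, and treating any value other than "ok" and "connected" as a failure is an assumption.

```r
# Hypothetical helper: `health` is the parsed JSON body of GET /health.
is_healthy <- function(health) {
  identical(health$status, "ok") && identical(health$mongodb, "connected")
}

health <- list(
  status = "ok",
  service = "metasurvey-api",
  version = "2.0.0",
  database = "metasurvey",
  mongodb = "connected",
  timestamp = "2026-02-15T12:00:00Z"
)
is_healthy(health)   # TRUE
```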

MongoDB Schema

The database has four collections, each with JSON Schema validation and optimized indexes.

Entity-Relationship Diagram

The following diagram shows the MongoDB collections and their relationships:

  ┌──────────────────┐       ┌──────────────────────┐
  │     users        │       │      recipes         │
  ├──────────────────┤       ├──────────────────────┤
  │ email (PK)       │──┐    │ id (PK)              │
  │ name             │  │    │ name                 │
  │ password_hash    │  ├───>│ user (FK)            │
  │ user_type        │  │    │ survey_type          │
  │ institution      │  │    │ edition              │
  └──────────────────┘  │    │ steps[]              │
                        │    │ certification{}      │
                        │    │ categories[]         │
                        │    └──────────┬───────────┘
                        │               │
                        │    ┌──────────┴───────────┐
                        │    │     workflows        │
                        │    ├──────────────────────┤
                        │    │ id (PK)              │
                        └───>│ user (FK)            │
                             │ survey_type          │
                             │ recipe_ids[] (FK)    │
                             │ calls[]              │
                             └──────────────────────┘

  ┌──────────────────────┐
  │   anda_variables     │
  ├──────────────────────┤
  │ survey_type (PK)     │
  │ name (PK)            │
  │ label                │
  │ type                 │
  │ value_labels{}       │
  └──────────────────────┘

  Relationships:
    users    ──1:N──>  recipes     (publishes)
    users    ──1:N──>  workflows   (publishes)
    recipes  ──N:M──>  workflows   (referenced by)
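
Because workflows point to recipes through the recipe_ids array, finding the workflows that use a given recipe is a membership test over that array. The sketch below runs on in-memory lists with made-up sample documents; workflows_using() is a hypothetical helper, and the API's own recipe_id filter may be implemented differently.

```r
# Illustrative sample documents (not real database content).
workflows <- list(
  list(id = "wf_labor_market_001", recipe_ids = c("ech_employment_001")),
  list(id = "wf_example_002",      recipe_ids = character())
)

# Hypothetical helper: ids of the workflows whose recipe_ids contain `recipe_id`.
workflows_using <- function(workflows, recipe_id) {
  hits <- vapply(workflows, function(w) recipe_id %in% w$recipe_ids, logical(1))
  vapply(workflows[hits], function(w) w$id, character(1))
}

workflows_using(workflows, "ech_employment_001")
# "wf_labor_market_001"
```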

Collections

`users`

Field	Type	Required	Description
`name`	string	Yes	Display name
`email`	string	Yes	Email (unique, validated)
`password_hash`	string	Yes	SHA-256 hash (64 characters)
`user_type`	enum	Yes	`individual`, `institutional_member`, `institution`
`institution`	string	No	Institution name
`verified`	boolean	No	Whether identity is verified
`review_status`	enum	No	`approved`, `pending`, `rejected`
`reviewed_by`	string	No	Reviewing admin’s email
`reviewed_at`	string	No	ISO timestamp
`created_at`	string	Yes	ISO timestamp

Indexes: unique on email.
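
The constraints in this table can be mirrored on the client side before a document is sent. The following sketch is not the JSON Schema the database uses; validate_user() is a hypothetical helper, and it assumes the 64-character hash is a lowercase hexadecimal digest.

```r
# Hypothetical helper: returns a character vector of problems (empty if none).
validate_user <- function(user) {
  required <- c("name", "email", "password_hash", "user_type", "created_at")
  allowed_types <- c("individual", "institutional_member", "institution")
  problems <- character()

  absent <- setdiff(required, names(user))
  if (length(absent) > 0) {
    problems <- c(problems, paste("missing field:", absent))
  }
  # Assumption: the SHA-256 hash is stored as 64 lowercase hex characters.
  if (!is.null(user$password_hash) &&
      !grepl("^[0-9a-f]{64}$", user$password_hash)) {
    problems <- c(problems, "password_hash is not a 64-character hex string")
  }
  if (!is.null(user$user_type) && !user$user_type %in% allowed_types) {
    problems <- c(problems, "user_type is not an allowed value")
  }
  problems
}

validate_user(list(name = "Ana Garcia", email = "ana@example.com"))
```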

`recipes`

Field	Type	Required	Description
`id`	string	No	Unique identifier (auto-generated)
`name`	string	Yes	Recipe name
`user`	string	Yes	Author email
`survey_type`	enum	Yes	`ech`, `eaii`, `eph`, `eai`
`edition`	string/array	No	Survey edition(s)
`description`	string	No	Description
`topic`	enum	No	`labor_market`, `income`, `education`, `health`, `demographics`, `housing`
`version`	string	No	Semantic version (default `"1.0.0"`)
`downloads`	number	No	Download counter (default `0`)
`steps`	array	No	Step expressions as strings
`depends_on`	array	No	Required input variable names
`depends_on_recipes`	array	No	IDs of recipes this recipe depends on
`categories`	array	No	Category objects
`certification`	object	No	`{level, certified_at, certified_by, notes}`
`user_info`	object	No	`{name, user_type, email, url, verified}`
`doc`	object	No	`{input_variables, output_variables, pipeline}`
`data_source`	object	No	`{s3_bucket, s3_prefix, file_pattern, provider}`

Indexes: unique on id; on user, survey_type, topic, downloads (desc), certification.level; compound on (survey_type, edition); text search on (name, description, topic).

`workflows`

Field	Type	Required	Description
`id`	string	No	Unique identifier (auto-generated)
`name`	string	Yes	Workflow name
`user`	string	Yes	Author email
`survey_type`	enum	Yes	`ech`, `eaii`, `eph`, `eai`
`edition`	string/array	No	Survey edition(s)
`description`	string	No	Description
`version`	string	No	Semantic version
`downloads`	number	No	Download counter
`estimation_type`	string/array	No	`annual`, `quarterly`, `monthly`
`recipe_ids`	array	No	Referenced recipe IDs
`calls`	array	No	Estimation calls as strings
`call_metadata`	array	No	Call descriptions
`categories`	array	No	Category objects
`certification`	object	No	Same as recipes
`user_info`	object	No	Same as recipes

Indexes: unique on id; on user, survey_type, recipe_ids, downloads (desc); compound on (survey_type, edition); text search on (name, description).

`anda_variables`

Field	Type	Required	Description
`survey_type`	string	Yes	Survey type
`name`	string	Yes	Variable name (lowercase)
`label`	string	Yes	Human-readable label
`type`	enum	No	`discrete`, `continuous`, `unknown`
`value_labels`	object	No	Code-label mappings
`description`	string	No	Extended description
`source_edition`	string	No	Edition (e.g., `"2024"`)
`source_catalog_id`	number	No	ANDA catalog ID

Indexes: compound unique on (survey_type, name); on survey_type.
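
The compound unique index means a variable name may repeat across survey types but not within one. Before seeding, the same rule can be checked locally, as in this sketch; the rows are made-up sample data and has_duplicate_keys() is a hypothetical helper.

```r
# Made-up sample rows, for illustration only.
vars <- data.frame(
  survey_type = c("ech", "ech", "eph"),
  name        = c("e27", "ht11", "e27"),
  stringsAsFactors = FALSE
)

# Hypothetical helper: TRUE if any (survey_type, name) pair appears twice.
has_duplicate_keys <- function(df) {
  any(duplicated(df[, c("survey_type", "name")]))
}

has_duplicate_keys(vars)                     # FALSE: "e27" repeats, but in different surveys
has_duplicate_keys(rbind(vars, vars[1, ]))   # TRUE: ("ech", "e27") appears twice
```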

Database Setup

To set up the database on a new deployment:

# 1. Create collections with JSON Schema validation and indexes
mongosh "$METASURVEY_MONGO_URI" inst/scripts/setup_mongodb.js

# 2. Seed recipes, workflows, and users
METASURVEY_MONGO_URI="..." Rscript inst/scripts/seed_ech_recipes.R

# 3. Seed ANDA variable metadata from INE catalog
METASURVEY_MONGO_URI="..." Rscript inst/scripts/seed_anda_metadata.R

The setup script creates the four collections and builds the indexes. It is idempotent: existing collections are skipped.
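
The idempotent behavior comes down to creating only the collections that do not exist yet. The actual script runs in mongosh; the R sketch below only illustrates the logic, and collections_to_create() is a hypothetical helper.

```r
required <- c("users", "recipes", "workflows", "anda_variables")

# Hypothetical helper: the collections a run still has to create.
collections_to_create <- function(existing, required) {
  setdiff(required, existing)
}

collections_to_create(c("users", "recipes"), required)
# "workflows" "anda_variables"
collections_to_create(required, required)
# character(0): a second run has nothing left to do
```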

Server Deployment

Environment Variables

Variable	Required	Default	Description
`METASURVEY_MONGO_URI`	Yes	—	MongoDB connection string
`METASURVEY_DB`	No	`metasurvey`	Database name
`METASURVEY_JWT_SECRET`	No	`metasurvey-dev-secret-...`	JWT signing secret (override in production)
`METASURVEY_ADMIN_EMAIL`	No	—	Admin email for institutional review
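
A startup routine could read these variables along the following lines. This is a sketch under assumptions: read_server_config() is hypothetical, the getenv argument exists only to make the function easy to test, and stopping when METASURVEY_MONGO_URI is missing is an assumed behavior.

```r
# Hypothetical helper; `getenv` defaults to Sys.getenv and can be swapped in tests.
read_server_config <- function(getenv = Sys.getenv) {
  mongo_uri <- getenv("METASURVEY_MONGO_URI", unset = "")
  if (!nzchar(mongo_uri)) {
    # Assumption: the only required variable aborts startup when missing.
    stop("METASURVEY_MONGO_URI is required", call. = FALSE)
  }
  list(
    mongo_uri      = mongo_uri,
    db             = getenv("METASURVEY_DB", unset = "metasurvey"),
    # FALSE means the development default secret would be used.
    jwt_secret_set = nzchar(getenv("METASURVEY_JWT_SECRET", unset = "")),
    admin_email    = getenv("METASURVEY_ADMIN_EMAIL", unset = "")
  )
}
```

Reporting whether a JWT secret was supplied, instead of returning the secret itself, keeps the value out of logs while still letting a deployment check flag the development default.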

Running Locally

METASURVEY_MONGO_URI="mongodb+srv://user:pass@cluster.mongodb.net" \
  Rscript -e 'plumber::plumb("inst/api/plumber.R")$run(port = 8787)'

The Swagger UI will be available at http://localhost:8787/__docs__/.

Docker

docker build -t metasurvey-api inst/api/
docker run -p 8787:8787 \
  -e METASURVEY_MONGO_URI="mongodb+srv://..." \
  -e METASURVEY_JWT_SECRET="your-production-secret" \
  -e METASURVEY_ADMIN_EMAIL="admin@example.com" \
  metasurvey-api

Railway

The API is configured for Railway deployment via the render.yaml file in inst/api/. Push the repository and configure the environment variables in the Railway dashboard.

CORS

The API allows cross-origin requests from any origin:

Allowed methods: GET, POST, OPTIONS
Allowed headers: Content-Type, Authorization
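
In HTTP terms, this policy corresponds to three Access-Control-* response headers. The sketch below lists them and applies them to any response object that exposes setHeader(name, value), as plumber's res object does. cors_headers() and apply_cors() are hypothetical helpers, and the way the API actually registers its CORS handling is not shown in this reference.

```r
# The CORS policy described above, expressed as response headers.
cors_headers <- function() {
  c(
    "Access-Control-Allow-Origin"  = "*",
    "Access-Control-Allow-Methods" = "GET, POST, OPTIONS",
    "Access-Control-Allow-Headers" = "Content-Type, Authorization"
  )
}

# Hypothetical helper: sets each header on a response object that has a
# setHeader(name, value) method.
apply_cors <- function(res, headers = cors_headers()) {
  for (name in names(headers)) {
    res$setHeader(name, headers[[name]])
  }
  invisible(res)
}
```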

Next Steps

Interactive recipe explorer – Browse recipes and workflows through the Shiny web application
Creating and publishing recipes – Build recipes programmatically and publish them to the API
Estimation workflows – Compute weighted survey estimates with workflow()