metasurvey provides a REST API, built with plumber and backed by MongoDB, for sharing recipes, workflows, and variable metadata with the community. The API can be self-hosted (see vignette("self-hosting")) and is used by both the R client functions (api_*) and the Shiny exploration application.

After deployment, the Swagger UI at <your-api-url>/__docs__/ provides an interactive endpoint explorer generated automatically by plumber. For detailed request/response schemas and MongoDB collection documentation, see the sections below.
```r
library(metasurvey)

# Point to your self-hosted API
configure_api("https://your-api-host.example.com")

# Or use an environment variable
Sys.setenv(METASURVEY_API_URL = "https://your-api-host.example.com")
```

The R client reads the URL first from configure_api(), then falls back to the METASURVEY_API_URL environment variable.
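The lookup order described above can be sketched in a few lines of base R. This is an illustration only: resolve_api_url() and its configured argument are hypothetical names, not part of metasurvey, and the package's internal logic may differ.

```r
# Hypothetical helper (not part of metasurvey): prefer an explicitly
# configured URL, otherwise fall back to the environment variable.
resolve_api_url <- function(configured = NULL) {
  if (!is.null(configured) && nzchar(configured)) {
    return(configured)
  }
  env_url <- Sys.getenv("METASURVEY_API_URL", unset = "")
  if (nzchar(env_url)) {
    return(env_url)
  }
  stop("No API URL configured; set METASURVEY_API_URL or call configure_api().")
}
```

Failing loudly when neither source is set is a design choice of this sketch; it avoids silently sending requests to an unintended host.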
The API uses JWT (JSON Web Token) authentication with HMAC-SHA256 signing. Tokens expire after 24 hours; long-lived tokens (90 days) can be generated for automated scripts.
```r
# Individual account (auto-approved)
api_register("Ana Garcia", "ana@example.com", "password123")

# Institutional member (requires admin review)
api_register(
  "Carlos Rodriguez",
  "carlos@ine.gub.uy",
  "password123",
  user_type = "institutional_member",
  institution = "INE Uruguay"
)
```

Account types:

| Type | Description | Approval |
|---|---|---|
| individual | Independent researcher | Automatic |
| institutional_member | Member of a recognized institution | Requires admin review |
| institution | Institutional account | Requires admin review |
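The approval rule in the table can be written as a small function. The sketch below is hypothetical: initial_review_status() is not a metasurvey function, and it assumes that new accounts start with one of the review_status values listed in the users collection schema ("approved" or "pending").

```r
# Hypothetical helper: map an account type to its initial review status,
# following the table above (only "individual" is approved automatically).
initial_review_status <- function(user_type) {
  valid <- c("individual", "institutional_member", "institution")
  if (!user_type %in% valid) {
    stop("Unknown user_type: ", user_type)
  }
  if (user_type == "individual") "approved" else "pending"
}

initial_review_status("individual")            # "approved"
initial_review_status("institutional_member")  # "pending"
```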
The token is stored in the session and sent automatically with subsequent API calls. The client renews a token when it is within 5 minutes of expiring.
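The renewal window can be illustrated with a simple time comparison. This is a sketch under assumptions: token_needs_refresh() is a hypothetical name, and the real client may track expiration differently (for example by decoding the exp claim of the JWT).

```r
# Hypothetical helper: TRUE when a token is inside the renewal window.
# `expires_at` and `now` are POSIXct timestamps; `window_secs` defaults
# to the 5-minute window described above.
token_needs_refresh <- function(expires_at, now = Sys.time(), window_secs = 5 * 60) {
  remaining <- as.numeric(difftime(expires_at, now, units = "secs"))
  remaining <= window_secs
}

issued <- as.POSIXct("2025-01-01 00:00:00", tz = "UTC")
expires <- issued + 24 * 60 * 60  # standard 24-hour token

token_needs_refresh(expires, now = issued)            # FALSE
token_needs_refresh(expires, now = expires - 4 * 60)  # TRUE
```

In this sketch an already expired token also returns TRUE; whether an expired token can still be renewed or requires a new login is up to the server.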
| Method | Endpoint | Auth | Description |
|---|---|---|---|
| GET | /recipes | No | List and search recipes |
| GET | /recipes/:id | No | Get an individual recipe |
| POST | /recipes | Yes | Publish a new recipe |
| POST | /recipes/:id/download | No | Increment download counter |
```r
# All recipes
all <- api_list_recipes()

# Filter by survey type
ech <- api_list_recipes(survey_type = "ech")

# Search by text
labor <- api_list_recipes(search = "empleo")

# Filter by topic
income <- api_list_recipes(topic = "income")

# Filter by certification level
official <- api_list_recipes(certification = "official")

# Pagination
page2 <- api_list_recipes(limit = 10, offset = 10)
```

Query parameters:

| Parameter | Type | Description |
|---|---|---|
| search | string | Regex search on recipe name |
| survey_type | string | ech, eaii, eph, eai |
| topic | string | labor_market, income, education, health, demographics, housing |
| certification | string | community, reviewed, official |
| user | string | Filter by author email |
| limit | integer | Maximum results (default 50) |
| offset | integer | Skip N results (default 0) |
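For callers that do not use the R client, the query parameters above map directly onto a URL query string. The following base R sketch shows one way to build it; build_recipes_url() is a hypothetical helper, not part of metasurvey.

```r
# Hypothetical helper: turn named filters into a /recipes URL.
# NULL entries are dropped; values are percent-encoded.
build_recipes_url <- function(base_url, ...) {
  params <- Filter(Negate(is.null), list(...))
  if (length(params) == 0) {
    return(paste0(base_url, "/recipes"))
  }
  encoded <- vapply(
    params,
    function(value) utils::URLencode(as.character(value), reserved = TRUE),
    character(1)
  )
  query <- paste(names(encoded), encoded, sep = "=", collapse = "&")
  paste0(base_url, "/recipes?", query)
}

# Second page of ten income-related ECH recipes
build_recipes_url(
  "https://your-api-host.example.com",
  survey_type = "ech", topic = "income", limit = 10, offset = 10
)
```

Because offset counts skipped results, page n of size limit corresponds to offset = (n - 1) * limit.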
| Method | Endpoint | Auth | Description |
|---|---|---|---|
| GET | /workflows | No | List and search workflows |
| GET | /workflows/:id | No | Get an individual workflow |
| POST | /workflows | Yes | Publish a new workflow |
| POST | /workflows/:id/download | No | Increment download counter |
Note: The ANDA integration is an unofficial implementation that parses DDI XML metadata from INE Uruguay’s public ANDA catalog. It is not endorsed by INE and may contain errors or become outdated if INE changes the catalog structure. Always verify critical variable definitions against the official codebook.
The /anda/variables endpoint provides variable metadata
obtained from INE Uruguay’s ANDA catalog (DDI XML format). This includes
variable labels, value categories, and type information.
| Method | Endpoint | Auth | Description |
|---|---|---|---|
| GET | /anda/variables | No | Get variable metadata |
```r
# Get all ECH variables
vars <- api_get_anda_variables(survey_type = "ech")

# Get specific variables
vars <- api_get_anda_variables(
  survey_type = "ech",
  var_names = c("pobpcoac", "e27", "ht11")
)
```

Query parameters:

| Parameter | Type | Description |
|---|---|---|
| survey_type | string | Survey type (default "ech") |
| names | string | Comma-separated variable names (all if empty) |
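The R client takes a character vector (var_names) while the endpoint expects a single comma-separated string (names). A conversion between the two might look like the sketch below; anda_names_param() is a hypothetical helper, and lowercasing is an assumption based on variable names being stored in lowercase.

```r
# Hypothetical helper: collapse a character vector of variable names into
# the comma-separated string expected by the `names` query parameter.
anda_names_param <- function(var_names) {
  var_names <- unique(tolower(trimws(var_names)))
  paste(var_names[nzchar(var_names)], collapse = ",")
}

anda_names_param(c("pobpcoac", "E27", " ht11 "))  # "pobpcoac,e27,ht11"
anda_names_param(character(0))                    # "" (all variables)
```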
Each variable document contains:
| Field | Description |
|---|---|
| name | Variable name (lowercase) |
| label | Human-readable label |
| type | discrete, continuous, or unknown |
| value_labels | List of code-label mappings |
| description | Extended description |
| source_edition | Survey edition (e.g., "2024") |
| source_catalog_id | ANDA catalog ID (e.g., 767) |
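A common use of value_labels is to recode raw survey codes into a labelled factor. The sketch below assumes the parsed document holds value_labels as a named list keyed by code; the variable, codes, and labels are invented for illustration and do not come from ANDA, and apply_value_labels() is a hypothetical helper.

```r
# One variable document as it might look after parsing the JSON response.
# The codes and labels here are made up for illustration.
variable_doc <- list(
  name = "example_var",
  label = "Example categorical variable",
  type = "discrete",
  value_labels = list("1" = "Category A", "2" = "Category B", "3" = "Category C")
)

# Hypothetical helper: recode raw codes using the document's value_labels.
# Codes without a matching label become NA.
apply_value_labels <- function(codes, doc) {
  factor(
    as.character(codes),
    levels = names(doc$value_labels),
    labels = unlist(doc$value_labels, use.names = FALSE)
  )
}

apply_value_labels(c(2, 1, 3, 2), variable_doc)
```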
| Method | Endpoint | Auth | Description |
|---|---|---|---|
| GET | /admin/pending-users | Admin | List institutional accounts pending review |
| POST | /admin/approve/:email | Admin | Approve an institutional account |
| POST | /admin/reject/:email | Admin | Reject an institutional account |
Admin access is controlled via the
METASURVEY_ADMIN_EMAIL environment variable on the
server.
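One way such a check could work is to compare the email carried by the authenticated token with the configured value. This is a guess at the mechanism, not the server's actual code: is_admin() is a hypothetical name, and the case-insensitive comparison is an assumption of this sketch.

```r
# Hypothetical helper: compare the email from an authenticated token with
# the configured admin email. Returns FALSE when no admin is configured.
is_admin <- function(token_email,
                     admin_email = Sys.getenv("METASURVEY_ADMIN_EMAIL")) {
  if (!nzchar(admin_email)) {
    return(FALSE)
  }
  identical(tolower(token_email), tolower(admin_email))
}

is_admin("admin@example.com", admin_email = "admin@example.com")  # TRUE
is_admin("ana@example.com", admin_email = "admin@example.com")    # FALSE
```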
The database has four collections, each with JSON Schema validation and optimized indexes.
The following diagram shows the MongoDB collections and their relationships:
```
┌──────────────────┐       ┌──────────────────────┐
│ users            │       │ recipes              │
├──────────────────┤       ├──────────────────────┤
│ email (PK)       │──┐    │ id (PK)              │
│ name             │  │    │ name                 │
│ password_hash    │  ├───>│ user (FK)            │
│ user_type        │  │    │ survey_type          │
│ institution      │  │    │ edition              │
└──────────────────┘  │    │ steps[]              │
                      │    │ certification{}      │
                      │    │ categories[]         │
                      │    └──────────┬───────────┘
                      │               │
                      │    ┌──────────┴───────────┐
                      │    │ workflows            │
                      │    ├──────────────────────┤
                      │    │ id (PK)              │
                      └───>│ user (FK)            │
                           │ survey_type          │
                           │ recipe_ids[] (FK)    │
                           │ calls[]              │
                           └──────────────────────┘

┌──────────────────────┐
│ anda_variables       │
├──────────────────────┤
│ survey_type (PK)     │
│ name (PK)            │
│ label                │
│ type                 │
│ value_labels{}       │
└──────────────────────┘

Relationships:
  users   ──1:N──> recipes   (publishes)
  users   ──1:N──> workflows (publishes)
  recipes ──1:N──> workflows (referenced by)
```
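MongoDB does not enforce references between collections, so the links in the diagram are conventions that clients may want to check themselves. The sketch below uses invented documents shaped like the collections above; resolve_recipe_ids() is a hypothetical helper.

```r
# Toy documents shaped like the collections above (all values are invented).
recipes <- list(
  list(id = "r_001", user = "ana@example.com", survey_type = "ech"),
  list(id = "r_002", user = "ana@example.com", survey_type = "ech")
)
workflow <- list(
  id = "w_001",
  user = "ana@example.com",
  survey_type = "ech",
  recipe_ids = c("r_001", "r_003")
)

# Hypothetical helper: split a workflow's recipe_ids into those that
# resolve to a known recipe and those that do not.
resolve_recipe_ids <- function(workflow, recipes) {
  known_ids <- vapply(recipes, function(r) r$id, character(1))
  list(
    found = intersect(workflow$recipe_ids, known_ids),
    missing = setdiff(workflow$recipe_ids, known_ids)
  )
}

resolve_recipe_ids(workflow, recipes)
# $found is "r_001"; $missing is "r_003"
```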
users

| Field | Type | Required | Description |
|---|---|---|---|
| name | string | Yes | Display name |
| email | string | Yes | Email (unique, validated) |
| password_hash | string | Yes | SHA-256 hash (64 characters) |
| user_type | enum | Yes | individual, institutional_member, institution |
| institution | string | No | Institution name |
| verified | boolean | No | Whether identity is verified |
| review_status | enum | No | approved, pending, rejected |
| reviewed_by | string | No | Reviewing admin's email |
| reviewed_at | string | No | ISO timestamp |
| created_at | string | Yes | ISO timestamp |

Indexes: unique on email.
recipes

| Field | Type | Required | Description |
|---|---|---|---|
| id | string | No | Unique identifier (auto-generated) |
| name | string | Yes | Recipe name |
| user | string | Yes | Author email |
| survey_type | enum | Yes | ech, eaii, eph, eai |
| edition | string/array | No | Survey edition(s) |
| description | string | No | Description |
| topic | enum | No | labor_market, income, education, health, demographics, housing |
| version | string | No | Semantic version (default "1.0.0") |
| downloads | number | No | Download counter (default 0) |
| steps | array | No | Step expressions as strings |
| depends_on | array | No | Required input variable names |
| depends_on_recipes | array | No | IDs of dependent recipes |
| categories | array | No | Category objects |
| certification | object | No | {level, certified_at, certified_by, notes} |
| user_info | object | No | {name, user_type, email, url, verified} |
| doc | object | No | {input_variables, output_variables, pipeline} |
| data_source | object | No | {s3_bucket, s3_prefix, file_pattern, provider} |

Indexes: unique on id; on user, survey_type, topic, downloads (desc), certification.level; compound on (survey_type, edition); text search on (name, description, topic).
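Checking a recipe against the schema before publishing avoids a round trip that the server-side validation would reject. The following sketch mirrors only the required fields and two enums from the table above; validate_recipe() is a hypothetical helper and does not reproduce the full JSON Schema.

```r
# Hypothetical pre-flight check mirroring part of the recipes schema:
# required fields must be present and enum fields must hold allowed values.
validate_recipe <- function(recipe) {
  errors <- character(0)

  required <- c("name", "user", "survey_type")
  missing_fields <- setdiff(required, names(recipe))
  if (length(missing_fields) > 0) {
    errors <- c(errors, paste("missing required field:", missing_fields))
  }

  survey_types <- c("ech", "eaii", "eph", "eai")
  survey_type <- recipe[["survey_type"]]
  if (!is.null(survey_type) && !survey_type %in% survey_types) {
    errors <- c(errors, paste("invalid survey_type:", survey_type))
  }

  topics <- c("labor_market", "income", "education",
              "health", "demographics", "housing")
  topic <- recipe[["topic"]]
  if (!is.null(topic) && !topic %in% topics) {
    errors <- c(errors, paste("invalid topic:", topic))
  }

  errors
}

validate_recipe(list(name = "Labor indicators", survey_type = "ech", topic = "jobs"))
# "missing required field: user" "invalid topic: jobs"
```

Returning a character vector of problems, rather than stopping at the first one, lets a caller report everything that needs fixing at once.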
workflows

| Field | Type | Required | Description |
|---|---|---|---|
| id | string | No | Unique identifier (auto-generated) |
| name | string | Yes | Workflow name |
| user | string | Yes | Author email |
| survey_type | enum | Yes | ech, eaii, eph, eai |
| edition | string/array | No | Survey edition(s) |
| description | string | No | Description |
| version | string | No | Semantic version |
| downloads | number | No | Download counter |
| estimation_type | string/array | No | annual, quarterly, monthly |
| recipe_ids | array | No | Referenced recipe IDs |
| calls | array | No | Estimation calls as strings |
| call_metadata | array | No | Call descriptions |
| categories | array | No | Category objects |
| certification | object | No | Same as recipes |
| user_info | object | No | Same as recipes |

Indexes: unique on id; on user, survey_type, recipe_ids, downloads (desc); compound on (survey_type, edition); text search on (name, description).
anda_variables

| Field | Type | Required | Description |
|---|---|---|---|
| survey_type | string | Yes | Survey type |
| name | string | Yes | Variable name (lowercase) |
| label | string | Yes | Human-readable label |
| type | enum | No | discrete, continuous, unknown |
| value_labels | object | No | Code-label mappings |
| description | string | No | Extended description |
| source_edition | string | No | Edition (e.g., "2024") |
| source_catalog_id | number | No | ANDA catalog ID |

Indexes: compound unique on (survey_type, name); on survey_type.
To set up the database on a new deployment:
```sh
# 1. Create collections with JSON Schema validation and indexes
mongosh "$METASURVEY_MONGO_URI" inst/scripts/setup_mongodb.js

# 2. Seed recipes, workflows, and users
METASURVEY_MONGO_URI="..." Rscript inst/scripts/seed_ech_recipes.R

# 3. Seed ANDA variable metadata from INE catalog
METASURVEY_MONGO_URI="..." Rscript inst/scripts/seed_anda_metadata.R
```

The setup script creates the four collections and builds the indexes. It is idempotent: existing collections are skipped.
| Variable | Required | Default | Description |
|---|---|---|---|
| METASURVEY_MONGO_URI | Yes | (none) | MongoDB connection string |
| METASURVEY_DB | No | metasurvey | Database name |
| METASURVEY_JWT_SECRET | No | metasurvey-dev-secret-... | JWT signing secret (override in production) |
| METASURVEY_ADMIN_EMAIL | No | (none) | Admin email for institutional review |
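A startup routine could read these variables with their defaults and fail fast when the required one is missing. The sketch below is an assumption about how that might be written: load_api_config() is a hypothetical helper, not the code in inst/api/plumber.R.

```r
# Hypothetical helper: read server settings from the environment, applying
# the defaults from the table above and failing fast on the required URI.
load_api_config <- function() {
  mongo_uri <- Sys.getenv("METASURVEY_MONGO_URI", unset = "")
  if (!nzchar(mongo_uri)) {
    stop("METASURVEY_MONGO_URI is required but not set.")
  }

  jwt_secret <- Sys.getenv("METASURVEY_JWT_SECRET", unset = "")
  if (!nzchar(jwt_secret)) {
    warning("METASURVEY_JWT_SECRET is not set; the development default would be used.")
  }

  list(
    mongo_uri = mongo_uri,
    db = Sys.getenv("METASURVEY_DB", unset = "metasurvey"),
    jwt_secret_set = nzchar(jwt_secret),
    admin_email = Sys.getenv("METASURVEY_ADMIN_EMAIL", unset = "")
  )
}
```

The warning on a missing JWT secret reflects the table's advice to override the development default in production, since tokens signed with a publicly known secret can be forged.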
```sh
METASURVEY_MONGO_URI="mongodb+srv://user:pass@cluster.mongodb.net" \
  Rscript -e 'plumber::plumb("inst/api/plumber.R")$run(port = 8787)'
```

The Swagger UI will be available at http://localhost:8787/__docs__/.
The API is configured for Railway deployment via the
render.yaml file in inst/api/. Push the
repository and configure the environment variables in the Railway
dashboard.
The API allows cross-origin requests (CORS) from any origin.
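In plumber, this kind of policy is usually implemented as a filter that sets the Access-Control-Allow-Origin header on every response. The sketch below follows that common pattern and is not copied from the API's source; the allowed methods and headers listed for preflight requests are assumptions.

```r
# Sketch of a permissive CORS filter in plumber's annotation style.
#* @filter cors
cors <- function(req, res) {
  # Allow any origin to read responses from this API.
  res$setHeader("Access-Control-Allow-Origin", "*")

  if (identical(req$REQUEST_METHOD, "OPTIONS")) {
    # Answer preflight requests directly instead of routing them.
    res$setHeader("Access-Control-Allow-Methods", "GET, POST, OPTIONS")
    res$setHeader("Access-Control-Allow-Headers", "Content-Type, Authorization")
    res$status <- 200
    return(list())
  }

  # Hand every other request on to the next filter or endpoint.
  plumber::forward()
}
```

A wildcard origin is convenient for a public, read-mostly API; deployments that need to restrict browser access would replace "*" with a specific origin.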