Basalt

#349 most-used

Manage, version, and monitor every AI prompt your team ships

Productivity · Analytics · Developer · AI · Monitoring & Alerts

Basalt is an AI product development platform that gives engineering and product teams a single hub for prompt management, dataset curation, experiment tracking, and production observability. Connect it to Actionist and your agents can fetch the latest deployed prompt for any feature, log traces and evaluation results automatically, add rows to training datasets from live interactions, and create experiments to compare prompt variants — all without manually touching the Basalt dashboard.

Average time saved
10 hours
per person · per month
≈ 1 workday back

Eliminates manual work. Agents take over copying prompts between environments, logging evaluation results, and assembling experiment comparison reports.

Schedule

What your Basalt agent runs on autopilot

A week of scheduled jobs your Actionist agent will execute on your behalf.

28 scheduled jobs
7 agents at work
24/7 always on
Multi-app workflows

Basalt × every other app you use

End-to-end automations that span multiple apps — each one a real business outcome.

6 workflows
5 apps spanned
~26 hrs saved / week
6 personas served
For operations
Featured · 3 apps

Prompt quality gate before every production promotion

When the team approves a prompt candidate in Slack, the agent fetches the current production version and the evaluation dataset from Basalt, runs an experiment to compare the two, and writes the score delta to the promotion log in Google Sheets. If the candidate scores at least 5% higher, the agent promotes it to production automatically — every promotion is backed by an experiment result, not guesswork.

~6 hrs

Time saved for your team — every week, on autopilot

The flow
Trigger: When a prompt candidate is approved for production testing
Result:
  • Create experiment comparing candidate vs production
  • Write experiment result and score delta to the promotion log
  • Publish prompt to production if experiment score exceeds threshold
The win
Saved per run: 45 min
Runs / week: ~8×
Zero production prompt changes without a passing experiment result
Driven by Operations Agent
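
The promotion rule in this workflow reduces to a small comparison. Here is a minimal sketch in Python, assuming both scores are positive numbers on the same scale and that "5% higher" means a relative improvement. The function and field names are hypothetical and are not part of Basalt's or Actionist's API.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class GateResult:
    delta: float          # candidate score minus production score
    relative_gain: float  # delta as a fraction of the production score
    promote: bool         # True when the gain meets the threshold


def evaluate_gate(production_score: float, candidate_score: float,
                  min_relative_gain: float = 0.05) -> GateResult:
    """Decide whether a candidate prompt clears the promotion threshold."""
    if production_score <= 0:
        raise ValueError("production_score must be positive")
    delta = candidate_score - production_score
    relative_gain = delta / production_score
    return GateResult(delta, relative_gain, relative_gain >= min_relative_gain)


strong = evaluate_gate(0.80, 0.90)  # clear improvement, promote is True
weak = evaluate_gate(0.80, 0.82)    # small improvement, promote is False
```

Scores that sit exactly on the threshold are subject to floating-point rounding, so a real gate may want a small tolerance or integer-scaled scores.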
ROI

Savings

What your team gets back — two angles: what you stop doing manually, and what that's worth.

Without Actionist

What you do manually today

With Actionist

What your agent runs for you

  • Sales
    20 min / week
    Manual prompt version checks before outreach runs

    Sales ops manually confirms which prompt version is live in Basalt before each outreach campaign, copying version IDs between tools and logging them in a spreadsheet.

    Sales Agent
    0 min
    Agent confirms prompt version and logs every run automatically

    The sales agent fetches the current production prompt at runtime, logs every LLM call to Basalt monitoring, and flags low-confidence outputs before they reach prospects — with no manual version checks.

  • Marketing
    60 min / week
    Manual prompt iteration tracking and experiment setup

    The marketing team tracks prompt variants in a shared doc, manually runs comparison tests by calling the LLM twice with each version, and eyeballs outputs to decide which is better.

    Marketing Agent
    0 min
    Agent creates and runs experiments automatically, then promotes the winner

    When a candidate prompt is ready, the agent creates a Basalt experiment against the evaluation dataset, waits for results, and promotes the winning version to production — no eyeballing, no manual comparison.

  • Customer Support
    25 min / week
    Escalations logged manually with no connection to the evaluation set

    When a support escalation is caused by a bad AI response, the team logs it in a ticket but the input/output pair is never captured in a format usable for prompt improvement.

    Customer Support Agent
    0 min
    Agent captures every escalation as an evaluation row the moment it happens

    When an escalation fires, the agent retrieves the original Basalt trace, adds the corrected pair to the evaluation dataset, and posts a summary to Slack — the failure immediately improves the next experiment.

  • Human Resources
    15 min / week
    No audit trail for which prompt version screened which candidate

    HR uses AI for screening but has no record of which prompt version was active when each candidate was evaluated — a compliance gap that only surfaces during audits.

    Human Resources Agent
    0 min
    Agent logs prompt version and trace ID for every screening run

    The HR agent fetches the production prompt, creates a trace for each screening workflow, and logs the version and trace ID to the compliance record — every candidate decision is attributable to a specific prompt version.

  • Finance
    30 min / week
    AI token costs discovered on the invoice, not during the billing period

    Finance has no per-prompt cost visibility during the month — token usage spikes only become visible when the LLM provider invoice arrives, too late to optimise.

    Finance Agent
    0 min
    Agent surfaces per-prompt token usage weekly before the billing cycle closes

    The finance agent retrieves Basalt traces weekly, groups token counts by prompt, writes cost estimates to the spend tracker, and alerts engineering when a prompt exceeds its budget — spikes are caught with time to act.

  • Operations
    45 min / week
    Prompt governance is ad-hoc, rollbacks are slow

    Operations tracks prompt versions in a spreadsheet, rolls back by manually calling the Basalt API, and discovers missing rollback versions during incidents rather than in advance.

    Operations Agent
    0 min
    Agent audits tags, confirms rollback readiness, and executes promotions automatically

    The operations agent audits all production tags weekly, confirms every prompt has a valid rollback version, and executes data-driven promotions from experiment results — governance is a scheduled task, not a scramble.

  • Legal
    30 min / week
    Compliance change logs assembled manually before each quarterly audit

    Before each audit, legal manually retrieves Basalt prompt version histories, formats them into a compliance export, and cross-references with the approval log — a multi-hour effort each quarter.

    Legal Agent
    0 min
    Agent generates a compliance-ready change log every week automatically

    The legal agent exports full version histories for all compliance-register prompts to Google Sheets every Monday — the quarterly audit artefact is always current, and the audit itself takes hours instead of days.
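
The Finance agent's weekly roll-up in the list above (group token counts by prompt, estimate cost, flag budget overruns) can be sketched as follows. This is illustrative only: the trace record shape, the price per 1,000 tokens, and the budget figures are assumptions made for the example, not Basalt's trace schema or any provider's pricing.

```python
from collections import defaultdict

# Hypothetical trace records; real Basalt traces will have a different shape.
traces = [
    {"prompt": "outreach-email", "tokens": 12_000},
    {"prompt": "support-reply", "tokens": 30_000},
    {"prompt": "outreach-email", "tokens": 8_000},
]

PRICE_PER_1K_TOKENS = 0.01  # assumed flat rate in USD
BUDGETS = {"outreach-email": 0.50, "support-reply": 0.25}  # assumed weekly budgets in USD


def weekly_costs(records, price_per_1k):
    """Sum tokens per prompt and convert the totals to an estimated cost."""
    totals = defaultdict(int)
    for record in records:
        totals[record["prompt"]] += record["tokens"]
    return {name: tokens / 1000 * price_per_1k for name, tokens in totals.items()}


def over_budget(costs, budgets):
    """Return the prompts whose estimated cost exceeds their budget."""
    return sorted(name for name, cost in costs.items()
                  if name in budgets and cost > budgets[name])


costs = weekly_costs(traces, PRICE_PER_1K_TOKENS)
alerts = over_budget(costs, BUDGETS)  # prompts to raise with engineering
```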

+ 100s of other Basalt automations
Average time saved
23 hrs / person / month
Calculator

Calculate what your team saves

Team size: 5 people
Hourly rate: $75 / hr
Hours saved / week: 13
Hours saved / year: 625
Annual ROI: $46,875

Based on typical Basalt team usage (the visible tasks plus a few other automations the agent runs): about 2.5 hrs / person / week of admin work automated.
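
The calculator's figures can be reproduced with a few lines of arithmetic. The sketch below assumes 50 working weeks per year. The page does not state this, but it is the value consistent with 625 hours per year at 12.5 hours per week. The function name is hypothetical.

```python
def team_savings(team_size: int, hourly_rate: float,
                 hours_per_person_per_week: float = 2.5,
                 weeks_per_year: int = 50) -> dict:
    """Reproduce the calculator: weekly hours, yearly hours, annual value."""
    hours_per_week = team_size * hours_per_person_per_week
    hours_per_year = hours_per_week * weeks_per_year
    return {
        "hours_per_week": hours_per_week,
        "hours_per_year": hours_per_year,
        "annual_value": hours_per_year * hourly_rate,
    }


result = team_savings(team_size=5, hourly_rate=75)
# hours_per_week is 12.5, which the calculator displays rounded to 13
# hours_per_year is 625.0 and annual_value is 46875.0
```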

Connect

How to plug Basalt into Actionist

Pick the connection method that suits your environment.

Authenticate with a Basalt API key scoped to your workspace. The key grants read and write access to prompts, datasets, experiments, and monitoring endpoints.

1. Open Basalt Settings

Log in to your Basalt workspace, go to Settings, and open the API Keys section. Click Create key to generate a new API key.

2. Copy the API key

Copy the key immediately; it is shown only once. Store it in a secrets manager, not in plain text.

3. Paste into Actionist

Paste the key into the Actionist connection dialog and click Test connection. Actionist will call List Prompts to verify access.
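
If you also use the key in your own scripts, one way to keep it out of plain text is to read it from the environment at runtime. This is a minimal sketch. The variable name `BASALT_API_KEY` is an assumption for illustration, not a name defined by Basalt or Actionist.

```python
import os


def load_api_key(env=os.environ, name: str = "BASALT_API_KEY") -> str:
    """Return the API key from the environment, failing loudly if it is absent."""
    key = env.get(name, "").strip()
    if not key:
        raise RuntimeError(f"{name} is not set; export it from your secrets manager")
    return key


def redact(key: str) -> str:
    """Mask all but the last four characters so the full key stays out of logs."""
    return "*" * max(len(key) - 4, 0) + key[-4:]
```

Passing `env` as a parameter keeps the function testable with a plain dictionary instead of the real process environment.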

Credentials you'll need
API key*
Basalt dashboard → Settings → API Keys → Create key
Actions

12 actions your agent can call

Read and write operations available to your Actionist agent.

Triggers

0 events your agent can react to

Events your agent watches for, and the actions it kicks off in response.

This app has no triggers yet.
MCP servers

MCP servers that work with Basalt

Connect Actionist to MCP servers built for or around this app.

Basalt Design System

Exposes Basalt design tokens, components, icons, and accessibility data via MCP. Built for Cursor, Claude Code, and Windsurf, it lets AI coding tools generate on-brand code automatically using your actual token files.

FAQs

Questions about Basalt + Actionist

How does Actionist connect to Basalt?
Go to the Apps tab, find Basalt, and click Connect. Paste your Basalt API key — generated in the Basalt dashboard under Settings → API Keys — into the connection dialog. Actionist runs a test call to List Prompts to verify the key before saving. Once connected, any agent in your workspace can fetch prompts, log monitoring traces, update datasets, or create experiments against your Basalt workspace.
What permissions does the Basalt API key need?
The Basalt API key is workspace-scoped — it grants read and write access to prompts, datasets, experiments, and monitoring endpoints within the workspace where it was generated. There is no per-resource permission granularity in Basalt's current API; the key either has access to the workspace or it does not. Generate the key with the account that owns the workspace, and rotate it from the Basalt Settings → API Keys page if you ever need to revoke access.
Can Actionist agents fetch a prompt and use it in an LLM call in the same workflow?
Yes. An Actionist agent can call Get Prompt to retrieve the current production-tagged version of a Basalt prompt, use the returned text as the system or user message in an LLM API call (via Actionist's built-in AI step), and then log the result back to Basalt with Monitor Single Prompt — all in the same scheduled agent task. This creates a closed loop: prompts are managed in Basalt, executed by Actionist agents, and their outputs are observed in Basalt.
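
The closed loop described in this answer can be sketched with the three steps passed in as plain callables. The parameter names `get_prompt`, `call_llm`, and `monitor` are hypothetical stand-ins for the Get Prompt action, Actionist's AI step, and the Monitor Single Prompt action. Their real signatures are not documented on this page, so the sketch only shows the order of operations.

```python
from typing import Callable


def run_closed_loop(
    slug: str,
    user_input: str,
    get_prompt: Callable[[str, str], dict],
    call_llm: Callable[[str, str], str],
    monitor: Callable[[dict], None],
) -> str:
    """Fetch the production prompt, run one LLM call, and log the result."""
    prompt = get_prompt(slug, "production")        # step 1: fetch by tag
    output = call_llm(prompt["text"], user_input)  # step 2: prompt as system message
    monitor({                                      # step 3: log back for observability
        "slug": slug,
        "version": prompt["version"],
        "input": user_input,
        "output": output,
    })
    return output


# Stand-in implementations so the sketch runs without any network access.
logged: list[dict] = []
answer = run_closed_loop(
    "support-reply",
    "Where is my order?",
    get_prompt=lambda slug, tag: {"text": "You are a helpful agent.", "version": "3"},
    call_llm=lambda system, user: f"[{system}] {user}",
    monitor=logged.append,
)
```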
What is the difference between monitoring a single prompt and creating a full trace?
Monitor Single Prompt logs a single LLM input/output pair — useful when your workflow has one AI call and you want to record its prompt version, model, latency, and output. Create Trace instruments a multi-step workflow end-to-end using OpenTelemetry spans — it captures every step (retrieval, classification, generation, post-processing) as part of one timeline. For simple one-call tasks, use Monitor Single Prompt. For RAG pipelines, multi-agent handoffs, or any workflow where you need to see which step caused a latency spike or error, use Create Trace.
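
To illustrate why a multi-step trace helps, the toy model below represents a trace as a list of timed steps and finds the one that dominates latency. It uses plain dataclasses rather than the OpenTelemetry SDK, and the step names and durations are invented for the example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Span:
    name: str
    start_ms: int
    end_ms: int

    @property
    def duration_ms(self) -> int:
        return self.end_ms - self.start_ms


def slowest_span(spans: list[Span]) -> Span:
    """Return the step that took the longest within one trace."""
    return max(spans, key=lambda span: span.duration_ms)


# One hypothetical RAG request broken into its steps.
trace = [
    Span("retrieval", 0, 120),
    Span("classification", 120, 160),
    Span("generation", 160, 1460),
    Span("post-processing", 1460, 1490),
]

culprit = slowest_span(trace)
total_ms = trace[-1].end_ms - trace[0].start_ms
```

A single input/output log would record only the 1,490 ms total. The per-step view shows that generation accounts for 1,300 ms of it.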
How do agents automatically promote a winning prompt to production?
After creating a Basalt experiment and waiting for it to complete, the agent checks the result score against a configured threshold. If the candidate variant scores above the threshold (typically 5% better than production on the evaluation dataset), the agent calls Publish Prompt with the candidate version number and the 'production' tag — this moves the tag to the new version atomically. The next time any other agent calls Get Prompt with the 'production' tag, it gets the newly promoted version. The experiment ID is logged alongside the promotion for full traceability.
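
The effect of moving the 'production' tag can be pictured with a small in-memory model. This is a toy sketch of the semantics described in this answer, not Basalt's implementation. The class and method names are hypothetical.

```python
class PromptRegistry:
    """Toy model: each tag points at exactly one version of a prompt."""

    def __init__(self) -> None:
        self._versions: dict[str, str] = {}  # version -> prompt text
        self._tags: dict[str, str] = {}      # tag -> version

    def add_version(self, version: str, text: str) -> None:
        self._versions[version] = text

    def publish(self, version: str, tag: str) -> None:
        """Point the tag at a known version with a single assignment."""
        if version not in self._versions:
            raise KeyError(f"unknown version {version!r}")
        self._tags[tag] = version

    def get(self, tag: str) -> tuple[str, str]:
        """Return (version, text) for the tag; raises KeyError if no version carries it."""
        version = self._tags[tag]
        return version, self._versions[version]


registry = PromptRegistry()
registry.add_version("1", "Answer briefly.")
registry.add_version("2", "Answer briefly and cite sources.")
registry.publish("1", "production")
before = registry.get("production")
registry.publish("2", "production")  # promotion after a passing experiment
after = registry.get("production")
```

Because readers resolve the tag at call time, no agent needs to be redeployed or reconfigured when the tag moves.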
Can I use Basalt with the Actionist App Store MCP server for Cursor, Claude, and Windsurf?
Yes — Basalt publishes an MCP server (listed in the Actionist App Store's linked MCP servers) that exposes design tokens, components, and accessibility data to AI coding tools like Cursor, Claude Code, and Windsurf. This is separate from Basalt's prompt management API. The MCP server is for design-system consumers who want their AI coding assistant to generate on-brand code automatically using Basalt's token files. The Actionist integration targets Basalt's prompt management, dataset, experiment, and observability APIs.
What happens if a prompt is not tagged 'production' when an agent tries to fetch it?
Basalt's Get Prompt endpoint only returns a prompt version if it has been published (tagged for deployment). If no version of the prompt has the requested tag, the API returns an error rather than a default or latest version. Actionist agents handle this by checking for the error response and posting a warning to the configured Slack channel — the workflow pauses gracefully rather than proceeding with an empty or stale prompt. This is why weekly prompt audits that verify production tags are valuable: they catch untagged prompts before an agent run fails in production.
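
The pause-and-warn behaviour can be sketched as follows. `PromptNotPublished`, `fetch`, and `notify` are hypothetical names for illustration. The actual error returned by Basalt and the way Actionist posts to Slack are not specified on this page.

```python
from typing import Callable, Optional


class PromptNotPublished(Exception):
    """Hypothetical error: no version of the prompt carries the requested tag."""


def fetch_or_pause(
    slug: str,
    tag: str,
    fetch: Callable[[str, str], str],
    notify: Callable[[str], None],
) -> Optional[str]:
    """Return the prompt text, or warn and return None so the caller can stop."""
    try:
        return fetch(slug, tag)
    except PromptNotPublished:
        notify(f"Prompt '{slug}' has no version tagged '{tag}'; run paused.")
        return None


def missing(slug: str, tag: str) -> str:
    """Stand-in fetcher that always fails, to exercise the warning path."""
    raise PromptNotPublished(slug)


posted: list[str] = []
outcome = fetch_or_pause("screening", "production", missing, posted.append)
```

Returning `None` rather than a fallback prompt makes the caller decide explicitly what to do, which matches the goal of never running with an empty or stale prompt.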
How do I make sure dataset rows added by Actionist agents are useful for future experiments?
The quality of rows added by Add Dataset Row determines how useful your evaluation set is. For support escalations, capture the original AI input (the user message and system context) and the human-corrected output with a label explaining why the AI was wrong. For human overrides in HR or sales contexts, include the reason for the override as a metadata field. Basalt's experiment runner uses these rows as evaluation examples — rows with rich context and clear labels produce more discriminating experiment results than rows with just input/output pairs and no annotation.
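
As an illustration of what a well-annotated row might contain, the sketch below builds one from a support escalation and rejects rows that lack a label. The field names are assumptions for the example, not the schema of the Add Dataset Row action.

```python
from dataclasses import dataclass, field, asdict


@dataclass
class EvaluationRow:
    user_message: str      # original AI input
    system_context: str    # context the model saw
    corrected_output: str  # human-corrected answer
    label: str             # why the AI response was wrong
    metadata: dict = field(default_factory=dict)  # e.g. override reason, source

    def validate(self) -> None:
        """Reject rows missing the annotation that makes them useful."""
        for name in ("user_message", "corrected_output", "label"):
            if not getattr(self, name).strip():
                raise ValueError(f"{name} must not be empty")


row = EvaluationRow(
    user_message="Can I return an opened item?",
    system_context="Returns policy v4",
    corrected_output="Yes, within 30 days with a receipt.",
    label="Model cited the outdated 14-day policy",
    metadata={"source": "support-escalation"},
)
row.validate()
payload = asdict(row)  # plain dict, ready to hand to whatever adds the row
```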