# Replicate AI integration on Definable

> Replicate allows users to run AI models via a cloud API without managing infrastructure.

## What this connects

Vendor: https://replicate.com/

## Tools available

**31** tools available. First 12:

- `REPLICATE_ACCOUNT_GET` — Get Account Information — Tool to get authenticated account information. Use when you need to retrieve details about the account associated with the API token.
- `REPLICATE_CANCEL_PREDICTION` — Cancel Prediction — Tool to cancel a prediction that is still running. Use when you need to stop an in-progress prediction to free up resources or halt execution.
- `REPLICATE_COLLECTIONS_GET` — Get model collection — Tool to get a specific collection of models by its slug. Use when you need detailed information about a collection and its models.
- `REPLICATE_COLLECTIONS_LIST` — List model collections — Tool to list all collections of models. Use when you need to retrieve available model collections. Collections are curated groupings of related models. Response includes only collection metadata (name, slug, description), not individual models within each collection; use REPLICATE_MODELS_GET for per-model details. Response may include a non-null `next` field indicating additional pages; follow it to enumerate all collections.
- `REPLICATE_CREATE_MODEL` — Create Model — Tool to create a new Replicate model with specified owner, name, visibility, and hardware. Use when you need to create a destination model before launching LoRA/fine-tune training.
- `REPLICATE_CREATE_PREDICTION` — Create Prediction — Tool to create a prediction for a Replicate Deployment. IMPORTANT: This action ONLY works with Replicate Deployments (persistent instances you create and manage), NOT public models. Deployments are created via REPLICATE_DEPLOYMENTS_CREATE. To run public models (e.g., 'meta/llama-2-70b-chat', 'stability-ai/sdxl'), use REPLICATE_MODELS_PREDICTIONS_CREATE instead. Use 'wait_for' to wait until the prediction completes.
- `REPLICATE_DEPLOYMENTS_CREATE` — Create Deployment — Tool to create a new deployment with specified model, version, hardware, and scaling parameters. Use when you need to deploy a model for production use with auto-scaling.
- `REPLICATE_DEPLOYMENTS_DELETE` — Delete Deployment — Tool to delete a deployment from your account. Use when you need to remove a deployment. Deployments must be offline and unused for at least 15 minutes before deletion.
- `REPLICATE_DEPLOYMENTS_GET` — Get Deployment Details — Tool to get deployment details by owner and name. Use when you need information about a specific deployment including its release configuration and hardware settings.
- `REPLICATE_DEPLOYMENTS_LIST` — List deployments — Tool to list all deployments associated with the account. Use when you need to retrieve deployment configurations and their latest releases.
- `REPLICATE_FILES_CREATE` — Create File — Tool to create or upload a file to Replicate. Use when you need to upload file content with optional metadata.
- `REPLICATE_FILES_DELETE` — Delete File — Tool to delete a file by its ID. Use when you need to remove a file from storage. Returns 204 No Content on success.
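
Two rules in the list above are easy to get wrong: deployments and public models use different prediction tools, and `REPLICATE_COLLECTIONS_LIST` may return a non-null `next` field that has to be followed. The sketch below illustrates both in Python. It is not Definable's implementation: `pick_prediction_tool`, `iter_collections`, and the `fetch_page` callable are hypothetical names, and the `results` key is an assumption about the shape of a page.

```python
from typing import Any, Callable, Dict, Iterator, Optional

# Tool names copied from the list above.
DEPLOYMENT_TOOL = "REPLICATE_CREATE_PREDICTION"
PUBLIC_MODEL_TOOL = "REPLICATE_MODELS_PREDICTIONS_CREATE"

Page = Dict[str, Any]


def pick_prediction_tool(is_deployment: bool) -> str:
    """Return the prediction tool that matches the target.

    Deployments (created via REPLICATE_DEPLOYMENTS_CREATE) use one tool,
    public models such as 'stability-ai/sdxl' use the other.
    """
    return DEPLOYMENT_TOOL if is_deployment else PUBLIC_MODEL_TOOL


def iter_collections(
    fetch_page: Callable[[Optional[str]], Page],
) -> Iterator[Dict[str, Any]]:
    """Yield every collection, following `next` until it is empty.

    `fetch_page` is a hypothetical callable: it takes a cursor (None for
    the first page) and returns a dict with a "results" list (assumed key)
    and a "next" cursor that is None on the last page.
    """
    cursor: Optional[str] = None
    while True:
        page = fetch_page(cursor)
        yield from page.get("results", [])
        cursor = page.get("next")
        if not cursor:
            break
```

In a workflow, `fetch_page` would wrap whatever call invokes `REPLICATE_COLLECTIONS_LIST`; here it is left abstract so the paging logic can be read on its own.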

## Auth

Auth schemes: `API_KEY`.
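
For orientation, this is roughly what an API key does when sent to Replicate's HTTP API directly: it travels as a bearer token in the `Authorization` header. The sketch builds, but does not send, the request behind an account lookup. How Definable stores and injects the key is not described here, so `build_account_request` is a hypothetical helper, and reading the token from `REPLICATE_API_TOKEN` is an assumption borrowed from Replicate's client conventions.

```python
import urllib.request

# Base URL of Replicate's public HTTP API.
API_BASE = "https://api.replicate.com/v1"


def build_account_request(token: str) -> urllib.request.Request:
    """Build (without sending) a GET request for the authenticated account."""
    if not token:
        raise ValueError("missing Replicate API token")
    return urllib.request.Request(
        f"{API_BASE}/account",
        headers={"Authorization": f"Bearer {token}"},
        method="GET",
    )


# Example use (assumes the token is exported as REPLICATE_API_TOKEN):
#   import os
#   req = build_account_request(os.environ["REPLICATE_API_TOKEN"])
#   with urllib.request.urlopen(req) as resp:
#       print(resp.read().decode())
```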

## How agents use Replicate

Inside a Definable workflow, Replicate is one of the tools the **Distributor specialist** can call. Example coordination patterns:

- **Researcher → Replicate** — the Researcher (GPT-5.5) pulls context from Replicate (account details, model collections, deployment configurations), synthesises findings, and briefs the rest of the team.
- **Writer → Distributor → Replicate** — the Writer (Claude Opus 4.7) drafts copy in brand voice, the Verifier passes it, then the Distributor hands the result to Replicate (create a prediction, upload a file).
- **Designer / Engineer → Distributor → Replicate** — the Designer ships an asset or the Engineer ships a code change, and the Distributor delivers it via Replicate (upload a file, create a model or deployment).

The Verifier checks every Replicate call. On a rate limit, schema drift, or auth refresh, it self-heals and retries, so the workflow can complete without manual intervention.
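
The retry behaviour described above is not specified in detail. One common way to handle rate limits is exponential backoff, sketched below under stated assumptions: `RateLimited` and `call_with_retry` are hypothetical names, the delays are illustrative, and nothing here is taken from the Verifier's actual implementation.

```python
import time
from typing import Callable, TypeVar

T = TypeVar("T")


class RateLimited(Exception):
    """Hypothetical error raised when a tool call hits a rate limit."""


def call_with_retry(
    call: Callable[[], T],
    max_attempts: int = 4,
    base_delay: float = 0.5,
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Run `call`, retrying on RateLimited with exponential backoff.

    Waits base_delay, then 2x, then 4x, and so on between attempts.
    The final failure is re-raised so the caller can surface it.
    """
    if max_attempts < 1:
        raise ValueError("max_attempts must be at least 1")
    for attempt in range(1, max_attempts + 1):
        try:
            return call()
        except RateLimited:
            if attempt == max_attempts:
                raise
            sleep(base_delay * 2 ** (attempt - 1))
    raise AssertionError("unreachable")
```

Passing `sleep` as a parameter keeps the helper easy to exercise without real waiting; a caller would normally leave it at the default.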

## Categories

- artificial intelligence — https://definable.ai/apps/category/artificial-intelligence/
- ai models — https://definable.ai/apps/category/ai-models/

## Related

- HTML page: https://definable.ai/apps/replicate/
- Same category (artificial intelligence): https://definable.ai/apps/category/artificial-intelligence/
- All integrations: https://definable.ai/apps/
- Workflow (multi-agent loop): https://definable.ai/workflow/
- Apps llms.txt index: https://definable.ai/llms-apps.txt