# How to Test Codex Plugins Before Publishing: Complete Local Testing Guide

> Test Codex plugins locally with plugin-eval. Lint manifests, simulate requests, and benchmark executions before publishing to the marketplace. Your complete local testing guide.

- Repository: [OpenAI/plugins](https://github.com/openai/plugins)
- Tags: how-to-guide
- Published: 2026-06-07

---

**Run the `plugin-eval` CLI to lint manifests, simulate natural language requests, and benchmark live Codex executions locally before submitting your plugin to the marketplace.**

Testing a Codex plugin locally guarantees that the **manifest**, **skills**, and **runtime commands** are functional before publication. The `openai/plugins` repository provides a built-in `plugin-eval` harness that mirrors how Codex discovers plugins in production. This guide covers the step-by-step validation workflow using the actual CLI implementation found in [`plugins/plugin-eval/scripts/plugin-eval.js`](https://github.com/openai/plugins/blob/main/plugins/plugin-eval/scripts/plugin-eval.js).

## Install the Plugin Locally for Testing

Codex discovers plugins through a local marketplace directory. Before running tests, you must symlink your plugin into the correct location so the runtime can load it.

### Create the Marketplace Directory

Create the plugins directory in your home folder if it does not exist:

```bash
mkdir -p ~/.agents/plugins

```

### Link Your Plugin Source

Create a symbolic link from your development directory to the marketplace location:

```bash
ln -sfn "$(pwd)/plugins/<plugin-name>" "$HOME/.agents/plugins/<plugin-name>"

```

For example, to test the `render` plugin from the repository root:

```bash
ln -sfn "$(pwd)/plugins/render" "$HOME/.agents/plugins/render"

```

### Register in marketplace.json

Add your plugin to the marketplace descriptor so Codex can discover it. Create or edit `~/.agents/plugins/marketplace.json`:

```json
{
  "name": "local",
  "interface": { "displayName": "Local Plugins" },
  "plugins": [
    {
      "name": "<plugin-name>",
      "source": { "source": "local", "path": "./plugins/<plugin-name>" },
      "policy": {
        "installation": "AVAILABLE",
        "authentication": "ON_INSTALL"
      },
      "category": "Developer Tools"
    }
  ]
}

```

Restart Codex or the Codex CLI to reload the marketplace configuration.

## Validate the Plugin Manifest

Each plugin requires a valid manifest at [`.codex-plugin/plugin.json`](https://github.com/openai/plugins/blob/main/.codex-plugin/plugin.json). Run the analysis command to verify schema compliance:

```bash
node ./plugins/plugin-eval/scripts/plugin-eval.js analyze ./plugins/<plugin-name> --format markdown

```

The `analyze` command validates required fields including `name`, `description`, `version`, and `manifestVersion`. If the JSON is malformed, the CLI reports the exact line and field requiring correction.

## Simulate User Interactions with Chat-First Testing

Test how Codex routes natural language requests to your plugin skills using the `start` command:

```bash
node ./plugins/plugin-eval/scripts/plugin-eval.js start ./plugins/<plugin-name> \
  --request "Evaluate this plugin." \
  --format markdown

```

This outputs the exact command sequence Codex will execute, including the routed skill and any subsequent `analyze` or initialization steps. According to the `plugin-eval` README, this simulates the live user experience without requiring a full Codex UI session.

## Run Static Analysis

Perform a comprehensive static analysis to check skill metadata, asset consistency, and safety constraints:

```bash
node ./plugins/plugin-eval/scripts/plugin-eval.js analyze ./plugins/<plugin-name> --format markdown

```

The analysis validates:

- **Schema compliance**: Required fields in [`plugin.json`](https://github.com/openai/plugins/blob/main/plugin.json)
- **Skill front-matter**: Presence of `retrieval.aliases`, `intents`, and `pathPatterns` in SKILL.md files
- **Asset integrity**: All referenced files exist with no stray binaries
- **Safety constraints**: No hard-coded secrets and proper permission declarations

## Benchmark Live Executions (Optional)

For plugins that interact with external services, generate realistic usage profiles to test runtime behavior.

### Initialize and Run Benchmarks

```bash
node ./plugins/plugin-eval/scripts/plugin-eval.js init-benchmark ./plugins/<plugin-name>
node ./plugins/plugin-eval/scripts/plugin-eval.js benchmark ./plugins/<plugin-name> --format markdown

```

Benchmarks write JSONL usage logs to `.plugin-eval/runs/<timestamp>/usage.jsonl`.

### Generate Measurement Plans

Feed usage logs back into the evaluator to create measurement plans:

```bash
node ./plugins/plugin-eval/scripts/plugin-eval.js measurement-plan ./plugins/<plugin-name> \
  --observed-usage .plugin-eval/runs/<timestamp>/usage.jsonl \
  --format markdown

```

The benchmark harness implementation is documented in [`plugins/plugin-eval/references/benchmark-harness.md`](https://github.com/openai/plugins/blob/main/plugins/plugin-eval/references/benchmark-harness.md).

## Common Failure Modes and Fixes

When `plugin-eval` reports errors, check these typical issues:

- **Missing front-matter**: SKILL.md files require `retrieval.aliases`, `intents`, and other metadata fields
- **Invalid paths**: Verify all paths in [`.codex-plugin/plugin.json`](https://github.com/openai/plugins/blob/main/.codex-plugin/plugin.json) resolve correctly from the plugin root
- **Environment variables**: Ensure required variables like `OPENAI_API_KEY` are set; note that `plugin-eval` deliberately never prints secret values to prevent leakage

Fix reported issues iteratively, re-running the relevant `plugin-eval` commands until all diagnostics pass.

## Summary

Testing Codex plugins locally prevents publishing broken manifests or non-functional skills. Key steps include:

- Symlink your plugin to `~/.agents/plugins/` and register it in [`marketplace.json`](https://github.com/openai/plugins/blob/main/marketplace.json)
- Run `plugin-eval analyze` to validate JSON schema and skill metadata
- Use `plugin-eval start` to simulate natural language routing
- Execute `plugin-eval benchmark` for live execution testing of external service integrations
- Iterate until static analysis reports zero errors and benchmarks complete successfully

## Frequently Asked Questions

### How do I test a Codex plugin without modifying my marketplace?

Use the CLI direct path invocation. Run `node ./plugins/plugin-eval/scripts/plugin-eval.js analyze ./plugins/<plugin-name>` without creating symlinks or editing [`marketplace.json`](https://github.com/openai/plugins/blob/main/marketplace.json). This approach works in CI pipelines where persistent marketplace modifications are undesirable.

### What does the `plugin-eval start` command actually simulate?

The `start` command simulates Codex's natural language processing pipeline. It takes a user request, routes it to the appropriate skill based on front-matter matching, and displays the exact CLI command sequence that would execute. This verifies intent recognition without requiring a running Codex UI instance.

### Where does `plugin-eval` store benchmark results?

Benchmark results are written to `.plugin-eval/runs/<timestamp>/usage.jsonl` in your plugin directory. You can reference this file with the `--observed-usage` flag when generating measurement plans to analyze performance characteristics and API call patterns.

### Why does `plugin-eval` not show my API key in error messages?

The tool explicitly filters secret values from all output to prevent credential leakage in logs. As documented in the `plugin-eval` README, the CLI performs safety checks but never prints environment variable values that match secret patterns, ensuring secure CI/CD integration.