12
/100
prowl
Benchmarked Apr 06, 2026

Anthropic

Claude AI models — chat completions, tool use, vision, long context (200K tokens). Models: Claude 4.5 Sonnet, Opus 4, Haiku. REST API, API key auth.

aillm api_benchmark api_key
Benchmark Your API

Score Breakdown

Latency 4/10
Consistency 2/10
Documentation 2/10
Auth Simplicity 2/10
Token Efficiency 1/10
First-Try Success 1/10
Parseability 1/10
Error Clarity 0/10

Benchmark Analysis Log

Full LLM thinking from the 4-phase benchmark pipeline.

Analyze
```json
{
  "service_type": "llm_provider",
  "base_url": "https://www.anthropic.com",
  "auth_method": "api_key_header",
  "auth_config": {
    "header": "Authorization",
    "prefix": "Bearer"
  },
  "endpoints": [],
  "pricing_model": {
    "type": "unknown",
    "details": {
      "input_cost_per_1m_tokens": null,
      "output_cost_per_1m_tokens": null,
      "context_window": 200000,
      "max_output_tokens": null
    },
    "free_tier": null,
    "paid_tiers": []
  },
  "rate_limits": {
    "rpm": null,
    "tpm": null,
    "daily": null,
    "concurrent": null
  },
  "capabilities": ["chat completions", "tool use", "vision", "long context"],
  "agent_readiness": {
    "supports_x402": false,
    "supports_streaming": false,
    "has_sandbox": false,
    "sdks": [],
    "agent_auth_methods": ["api_key"]
  }
}
```
Plan
```json
{
  "tests": [
    {
      "name": "verify_basic_connectivity",
      "endpoint": "/v1/status",
      "method": "GET",
      "headers": {
        "Authorization": "Bearer YOUR_API_KEY"
      },
      "payload": {},
      "expected_status": 200,
      "expected_behavior": "API should respond with a status OK",
      "metrics": ["latency", "status_code"],
      "validation": {"field": "status", "type": "string", "min_length": 1}
    },
    {
      "name": "test_chat_completions_endpoint",
      "endpoint": "/v1/chat/completions",
      "method": "POST",
      "headers": {
        "Authorization": "Bearer YOUR_API_KEY"
      },
      "payload": {
        "model": "default",
        "messages": [{"role": "user", "content": "Hello world!"}],
        "max_tokens": 30
      },
      "expected_status": 200,
      "expected_behavior": "Returns a chat completion",
      "metrics": ["latency", "accuracy", "status_code"],
      "validation": {"field": "choices", "type": "array", "min_length": 1}
    },
    {
      "name": "test_error_handling_missing_params",
      "endpoint": "/v1/chat/completions",
      "method": "POST",
      "headers": {
        "Authorization": "Bearer YOUR_API_KEY"
      },
      "payload": {
        "model": "default"
      },
      "expected_status": 400,
      "expected_behavior": "API should return an error about missing fields",
      "metrics": ["latency", "status_code"],
      "validation": {"field": "error", "type": "object", "min_length": 1}
    }
  ],
  "pricing_probes": [
    {
      "name": "verify_token_pricing",
      "description": "Send known-length input, check response headers for usage",
      "endpoint": "/v1/chat/completions",
      "method": "POST",
      "payload": {
        "model": "default",
        "messages": [{"role": "user", "content": "Test pricing check!"}],
        "max_tokens": 30
      },
      "headers": {
        "Authorization": "Bearer YOUR_API_KEY"
      },
      "check": "response.usage.prompt_tokens should match ~input_length/4"
    }
  ],
  "stress_profile": {
    "concurrent_requests": 5,
    "duration_seconds": 10,
    "ramp_up": true
  }
}
```
Execute

0/3 tests passed

TestEndpointStatusLatency
verify_basic_connectivityGET /v1/status404469ms
test_chat_completions_endpointPOST /v1/chat/completions404130ms
test_error_handling_missing_paramsPOST /v1/chat/completions404122ms
Interpret
{"multi_model": true, "models_used": ["openai", "claude_cli"], "model_scores": {"GPT-4o": {"overall": 0, "dimensions": {"token_efficiency": 2.0, "first_try_success": 2.0, "response_parseability": 2.0, "error_clarity": 1.0, "doc_quality": 3.0, "auth_simplicity": 4.0, "latency": 7.0, "consistency": 3.0}}, "Claude CLI": {"overall": 0, "dimensions": {"token_efficiency": 0.0, "first_try_success": 0.0, "response_parseability": 0.0, "error_clarity": 0.0, "doc_quality": 0.0, "auth_simplicity": 0.0, "latency": 0.0, "consistency": 0.0}}}, "averaged": true}

Agent Readiness

x402 Payments
Not supported
Streaming
No
Sandbox
None
Agent Auth
api_key
SDKs
None listed
MCP Support
No

Want the full interactive view?

See operational metrics, LLM evaluations, agent readiness, and more.

Open in Dashboard