API Reference

Meter AI Completion

Submit AI completion metadata for metering and billing purposes.

Body Params

AI completion metadata including token counts, costs, and timing information

Completion metadata details for LLM completions, capturing identifiers and cost metrics for analytics and monetization.

string

Unique identifier for this specific AI completion transaction. Used for deduplication, correlation with request/response pairs, and transaction lookup in Revenium analytics. If not provided, a UUID will be auto-generated. As a best practice, generate a UUID in your application before making the AI call and use the same ID when submitting to Revenium. You can use either 'spanId' (recommended) or 'transactionId'.

string

Optional trace identifier to group multiple related AI completion calls that belong to the same overall user request or workflow. For example, if a single user query triggers multiple LLM calls (e.g., retrieval + generation), use the same traceId for all calls to analyze them together in Revenium's analytics. Leave null for standalone completions.
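
The ID guidance above can be sketched as follows. This is a minimal illustration in Python using the standard `uuid` module; the helper name `new_completion_ids` is hypothetical, and only the `transactionId` and `traceId` keys come from the field descriptions.

```python
import uuid


def new_completion_ids(trace_id=None):
    """Return IDs for one AI call; pass trace_id to group related calls."""
    return {
        # Generated before the AI call so the same ID can be sent to Revenium.
        "transactionId": str(uuid.uuid4()),
        # Shared by all calls in one user request; None for standalone calls.
        "traceId": trace_id,
    }


# One user query that triggers two LLM calls (retrieval + generation).
trace_id = str(uuid.uuid4())
retrieval_ids = new_completion_ids(trace_id)
generation_ids = new_completion_ids(trace_id)
```

Both records share one `traceId` but carry distinct transaction IDs, so the calls can be analyzed together while each remains individually addressable.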

string
required

The AI model identifier used for this completion. Should match the exact model name from your AI provider (e.g., 'gpt-4', 'claude-3-opus-20240229', 'gemini-pro'). This is used for cost calculation, performance analytics, and model comparison reporting in Revenium. Valid model names, which Revenium needs for accurate cost estimates, can be verified using the sources/ai/models endpoint.

double

Optional quality score for the AI response on a 0.0-1.0 scale. Set by your application's evaluation logic (e.g., RAGAS, human feedback, custom scoring). Used in Revenium analytics to correlate quality with cost, model choice, and other metrics. Leave null if not tracking quality scores.

string

The routing or aggregation layer used to access the AI model. This identifies whether you're calling the AI provider directly or through an intermediary service.

Common values: 'DIRECT', 'LITELLM', 'OPENROUTER', 'PORTKEY', 'AZURE_OPENAI', or provider names ('OPENAI', 'ANTHROPIC', 'GOOGLE', etc.) when calling directly.

Custom values are accepted for specialized routing layers or gateways. This field is used for integration tracking and analytics.

int64
required

The count of consumed input tokens

int64
required

The count of consumed output tokens

int64

The number of reasoning tokens used in the completion. Reasoning tokens are extended thinking tokens used by AI models for complex problem-solving. These are sometimes billed separately from regular input/output tokens. Only include this field if your AI provider reports reasoning tokens. Revenium's middleware will always populate this field if reasoning tokens are reported by the AI provider. Leave null for models without reasoning capabilities.

int64

The number of tokens used to create new cache entries (prompt caching). When you send a long prompt for the first time, the AI provider may cache it for faster subsequent requests. Cache creation tokens are typically billed at a higher rate than regular input tokens. Only include if your provider supports prompt caching (e.g., Anthropic Claude, OpenAI with cache-enabled models). Revenium's middleware will always populate this field automatically. Leave null otherwise.

int64

The number of tokens read from cache (prompt caching). When reusing a previously cached prompt, these tokens are read from cache instead of being processed as new input tokens. Cache read tokens are typically billed at a lower rate than regular input tokens. Only include if your provider supports prompt caching and reports cache hits. Revenium's middleware will always populate this field automatically. Leave null otherwise.

int64
required

The total number of tokens

string
enum
required

The reason for stopping the completion

double

The cost in USD for input tokens in this completion. Typically leave null to let Revenium automatically calculate costs based on the model and provider's current pricing. Only provide a value if you have custom pricing agreements or want to override Revenium's cost calculation. Note: Manual cost override may not be available on all Revenium plans.

double

The cost in USD for output tokens in this completion. Typically leave null to let Revenium automatically calculate costs based on the model and provider's current pricing. Only provide a value if you have custom pricing agreements or want to override Revenium's cost calculation. If provided, this will override Revenium's automatic calculation. Note: Manual cost override may not be available on all Revenium plans.

double

The cost in USD for cache creation tokens in this completion. Typically leave null to let Revenium automatically calculate costs based on the model and provider's caching pricing. Only provide a value if you have custom pricing agreements or want to override Revenium's cost calculation. If provided, this will override Revenium's automatic calculation.

double

The cost in USD for cache read tokens in this completion. Typically leave null to let Revenium automatically calculate costs based on the model and provider's caching pricing. Only provide a value if you have custom pricing agreements or want to override Revenium's cost calculation. If provided, this will override Revenium's automatic calculation.

double

The total cost in USD for this completion (sum of all token costs). Typically leave null to let Revenium automatically calculate the total based on token counts and current pricing. Only provide a value if you have custom pricing agreements or want to override Revenium's cost calculation. If provided, this will override Revenium's automatic calculation.

string
enum

The type of cost being tracked. Currently always 'AI' for AI completion costs. This field is used internally by Revenium to categorize different types of metered usage. You typically do not need to set this field as it defaults to 'AI'.

string
required

The timestamp when your application sent the request to the AI provider, in ISO 8601 format with UTC timezone (e.g., '2025-03-02T15:04:05Z'). This is used to calculate request duration and analyze usage patterns over time. Set this to the time immediately before calling the AI provider's API.

string
required

The timestamp when the AI completion started generating output, in ISO 8601 format with UTC timezone. For streaming requests, this is when the first token was received. For non-streaming requests, this is typically the same as or very close to responseTime. Used to calculate time-to-first-token latency for streaming completions.

int64

The latency in milliseconds from request start to first token received. Calculated as (completionStartTime - requestTime). This metric is particularly important for streaming completions to measure perceived responsiveness. For non-streaming completions, this may be null or equal to requestDuration.

boolean
required

Indicates whether this completion used streaming (true) or non-streaming/batch mode (false). Streaming completions receive tokens incrementally as they're generated, while non-streaming completions wait for the complete response. This affects how timeToFirstToken and responseTime are interpreted.

string
required

The timestamp when the AI completion finished, in ISO 8601 format with UTC timezone. For streaming requests, this is when the last token was received and the stream closed. For non-streaming requests, this is when the complete response was received. Used to calculate total request duration.

int64
required

The total duration of the AI completion request in milliseconds, from request start to completion. Calculated as (responseTime - requestTime). This includes network latency, AI processing time, and any mediation/gateway overhead. Used for performance analytics and SLA monitoring.
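
The timing fields described above relate to each other as sketched below. This is an illustration under stated assumptions, not Revenium's implementation: the helper names `iso_utc` and `elapsed_ms` are hypothetical, and the millisecond fraction in the timestamps is an assumption (the documented example, '2025-03-02T15:04:05Z', shows whole seconds only).

```python
from datetime import datetime, timedelta, timezone


def iso_utc(ts):
    """Format an aware datetime as ISO 8601 in UTC with a trailing 'Z'."""
    utc = ts.astimezone(timezone.utc)
    return utc.isoformat(timespec="milliseconds").replace("+00:00", "Z")


def elapsed_ms(start, end):
    """Whole milliseconds between two datetimes (exact integer division)."""
    return (end - start) // timedelta(milliseconds=1)


# Example values for a streaming completion.
request_time = datetime(2025, 3, 2, 15, 4, 5, tzinfo=timezone.utc)
completion_start = request_time + timedelta(milliseconds=350)
response_time = request_time + timedelta(milliseconds=1200)

timing = {
    "requestTime": iso_utc(request_time),
    "completionStartTime": iso_utc(completion_start),
    "responseTime": iso_utc(response_time),
    # completionStartTime - requestTime
    "timeToFirstToken": elapsed_ms(request_time, completion_start),
    # responseTime - requestTime
    "requestDuration": elapsed_ms(request_time, response_time),
}
```

In a real integration the three datetimes would be captured with `datetime.now(timezone.utc)` immediately before the call, on the first token, and when the response completes.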

int64

The latency in milliseconds introduced by intermediate systems between your application and the AI provider, such as API gateways, proxies, or AI mediation layers. This helps identify performance bottlenecks outside of the AI provider's processing time. Leave null if not using intermediate systems or if latency is not tracked separately.

string
required

The underlying AI provider/vendor whose model is actually processing the request. This identifies which company's AI model is being used, regardless of how you're accessing it (direct API, proxy, or gateway).

Common values: 'OpenAI' (for GPT models), 'Anthropic' (for Claude models), 'Google' (for Gemini models), 'Cohere', 'Mistral', 'Meta' (for Llama models), 'Amazon Bedrock', 'Azure'.

Custom values are accepted but may affect analytics categorization. Revenium looks up model pricing primarily by model name (e.g., 'gpt-4', 'claude-3-opus'), so using non-standard provider names will not break cost calculation. However, using standard provider names ensures proper categorization in analytics and reporting.

If using an aggregation service like LiteLLM or OpenRouter, this should still be the actual provider (e.g., 'Anthropic' not 'LiteLLM'). If using Revenium middleware, this is typically auto-populated from the AI provider's API response. Supported provider models can be verified using the sources/ai/models endpoint which returns both providers and model names.

string

Optional category to group related AI tasks for cost and performance analysis. Use consistent values to compare metrics across different models or vendors performing the same type of work. Examples: 'chat', 'summarization', 'code-generation', 'translation', 'image-generation', 'embeddings', 'classification', 'sentiment-analysis'. This is freeform text - choose values that match your use cases.

string

The name of the subscriber's organization from your system, which allows Revenium to track usage and costs by company (e.g., AcmeCorp). If several subscriberIds have the same organizationName, Revenium's reporting will show usage for the entire organization broken down by subscriberId. This field is used for lookup and auto-creation of organizations in Revenium.

string

Unique identifier of the subscription from your own system that you wish to use to correlate usage between Revenium & your application.

string

The name of the product from your system that you wish to use to correlate usage between Revenium & your application. This field is used for lookup and auto-creation of products in Revenium.

string

The AI agent that is making the request

string
enum

The type of operation performed

string

A unique identifier provided by the AI provider that represents the statistical signature of the language model that generated this completion. This fingerprint can be used for model attribution, debugging, and monitoring model behavior across requests. Automatically provided by some AI providers (e.g., OpenAI) in their API responses. Leave null if your provider does not supply this value.

double

The temperature parameter used for this completion, controlling randomness in the AI's output. Typically ranges from 0.0 (deterministic) to 2.0 (very random). Track this to correlate temperature settings with response quality, cost, or other metrics. Useful for A/B testing different temperature values.

string

Error message or reason if the AI completion failed. Include this field when the AI provider returns an error (e.g., rate limit exceeded, invalid API key, model not found, content policy violation). Used for error rate analytics and debugging. Leave null for successful completions.

int32

HTTP error code if the completion failed (e.g., 429, 503, 500). Used with billingSkipped to determine no-charge scenarios (e.g., OpenAI flex tier 429 = no charge).

subscriber
object

Metadata about the subscriber/end-user making this AI request. Include this to track usage by individual users within an organization. Contains user identifiers and associated credential information. Leave null if not tracking individual user-level usage.

string

Identifier of the Revenium middleware package or SDK that captured and submitted this AI completion metadata. This field is AUTOMATICALLY SET by Revenium's middleware packages (e.g., 'revenium-openai-python', 'revenium-anthropic-node'). You typically should NOT manually set this field. It is used for analytics to track which integration methods are being used and for debugging middleware-specific issues.

string

Deployment environment where this AI completion was executed. Used for filtering and analyzing usage patterns across different deployment stages. Common values: 'production', 'staging', 'development', 'test'. Leave null if not tracking by environment.

string

Technical classification of the specific operation or tool used within the AI completion. This provides finer-grained categorization than operationType. Examples: 'web_search', 'function_call', 'code_interpreter', 'sql_query', 'http_request', 'file_read', 'ocr'. Used for analyzing which tools or capabilities are being used most frequently. Leave null for standard completions without tool usage.

string

Parent transaction ID for hierarchical tracing. When an AI completion is part of a larger workflow or spawned by another AI call, this field references the parent's spanId/transactionId. Used to build transaction trees and understand call hierarchies in complex multi-step AI workflows. Leave null for root-level transactions. You can use either 'parentSpanId' (recommended) or 'parentTransactionId'.
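
To illustrate how parent references form a transaction tree, the sketch below groups child span IDs under their parent. It is a client-side illustration only; the function name `build_children_index` and the sample span IDs are hypothetical, while `spanId` and `parentSpanId` are the names given in the description above.

```python
from collections import defaultdict


def build_children_index(transactions):
    """Map each parent ID (None for roots) to a list of child span IDs."""
    children = defaultdict(list)
    for tx in transactions:
        children[tx.get("parentSpanId")].append(tx["spanId"])
    return dict(children)


transactions = [
    {"spanId": "root", "parentSpanId": None},        # root-level transaction
    {"spanId": "retrieve", "parentSpanId": "root"},  # spawned by root
    {"spanId": "generate", "parentSpanId": "root"},  # spawned by root
]
tree = build_children_index(transactions)
```

Root-level transactions appear under the `None` key because their parent reference is left null.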

string

Human-readable name for this transaction. Provides context about what this AI completion is doing in business terms. Examples: 'Summarize Application', 'Credit Risk Analysis', 'Customer Support Response'. Used in trace visualization for better readability. Falls back to taskType if not provided.

string

Cloud region or geographic location where this AI completion was processed. Used for analyzing latency patterns, compliance requirements, and regional cost differences. Examples: 'us-east-1', 'eu-west-1', 'ap-southeast-2'. Leave null if not tracking by region.

int32

Retry attempt number for this AI completion. 0 indicates the first attempt (no retry), 1 indicates first retry, 2 indicates second retry, etc. Used for analyzing failure rates and retry patterns. Each retry attempt should be a separate transaction with incrementing retryNumber and the same traceId. Leave null or set to 0 for first attempts.
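
The retry convention above (a separate transaction per attempt, an incrementing retryNumber, and a shared traceId) might look like the following. The helper `retry_records` and the sample trace ID are hypothetical; only the key names come from the field descriptions.

```python
import uuid


def retry_records(trace_id, attempts):
    """Build one record per attempt: new transaction ID, same traceId."""
    return [
        {
            "transactionId": str(uuid.uuid4()),
            "traceId": trace_id,
            "retryNumber": attempt,  # 0 = first attempt, 1 = first retry, ...
        }
        for attempt in range(attempts)
    ]


records = retry_records("trace-123", 3)
```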

string

Categorical identifier for grouping similar workflows (e.g., 'create_video', 'write_script'). Enables trace-level analytics and anomaly detection within workflow cohorts. Must contain only alphanumeric characters, hyphens, and underscores. Max 128 characters. Defaults to 'uncategorized' if not provided or invalid. All transactions in the same trace must have the same traceType.
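
The traceType rules above can be checked client-side before submission. The sketch below mirrors the documented constraints; it assumes "alphanumeric" means ASCII letters and digits, which the description does not specify, and the function name `normalize_trace_type` is hypothetical.

```python
import re

# Alphanumerics, hyphens, and underscores; 1 to 128 characters.
# Assumption: "alphanumeric" is restricted to ASCII letters and digits.
TRACE_TYPE_PATTERN = re.compile(r"[A-Za-z0-9_-]{1,128}")


def normalize_trace_type(value):
    """Return value if it satisfies the rules, else 'uncategorized'."""
    if isinstance(value, str) and TRACE_TYPE_PATTERN.fullmatch(value):
        return value
    return "uncategorized"
```

For example, `normalize_trace_type("create_video")` returns the value unchanged, while `normalize_trace_type("write script")` falls back to 'uncategorized' because of the space.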

string

Human-readable label for individual trace instances (e.g., 'Marketing Video Q4 2025'). Enables finding specific workflow executions. Can contain any UTF-8 string. Max 256 characters. All transactions in the same trace must have the same traceName.

string

Identifier for multi-agent framework team or squad (e.g., ARK, CrewAI). Used to group AI operations by agent team for analytics and cost tracking. Free-form string, max 256 characters.

string

Human-readable name for the squad type (e.g., 'Loan Processing', 'Document Analysis'). Used for display in the UI and analytics grouping. If squadId is provided without squadName, squadId will be used as the display name.

string

The agent's role within this squad execution (e.g., 'Document Extractor', 'Credit Checker'). Used to identify the purpose of each agent's contribution to the squad workflow.

string

Unique identifier for the agentic job instance. Used to correlate all AI operations within a single job execution. Should be unique per job run (e.g., UUID or meaningful ID like 'loan-123-processing').

string

Human-readable name for the agentic job. Used for display in the UI and analytics grouping. If agenticJobId is provided without agenticJobName, the ID will be used as the display name.

string

Category or type of the agentic job. Normalized to lowercase on ingest to prevent data fragmentation. Used for grouping jobs by type in analytics and ROI calculations.

string

Version identifier for the job definition. Used to track job evolution and compare performance across versions.

string

System prompt/instructions sent to the model.

string

User input messages as JSON array.

string

Assistant output/response content.

boolean

Indicates if prompts were truncated due to size limits.

boolean

If true, backend returns $0 cost. Set for free tier, rate-limited requests (429 on flex), or other no-charge scenarios.

string
enum

Reason why billing was skipped.

string
enum

Pricing tier for batch discounts. BATCH = 50% discount, STANDARD = normal pricing.

string

Service tier requested by user (e.g., 'priority', 'default', 'flex').

string

Actual service tier used for billing. May differ from requested if downgraded. Backend uses this for cost calculation.

string

The subscription tier for coding assistant tools (e.g., 'pro', 'free', 'unknown'). Only populated for coding_assistant cost sources.

double

Multiplier applied to raw AI provider costs for this coding assistant event. Used to normalize costs relative to subscription pricing.

string

Account UUID from the coding assistant tool (e.g., CLAUDE_CODE_ACCOUNT_UUID). Used for subscriber upsert to correlate coding assistant usage to a specific account.

Headers
string
length between 1 and 255

Optional Stripe-style retry-safety key. If present, the response (status + body) is cached keyed by (tenant, key). Identical retries replay the cached response; body mismatch returns 409 idempotency_key_mismatch; a concurrent in-flight call returns 409 idempotency_key_in_progress with Retry-After: 1. Must be 1-255 printable ASCII characters; UUID v4 recommended. See https://docs.revenium.io/integrations/idempotency for full behavior.
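
A key that satisfies the constraints above can be generated and checked as sketched here. The function names are hypothetical, and treating the space character (0x20) as printable ASCII is an assumption; the linked idempotency documentation is the authority on the exact character set.

```python
import uuid


def new_idempotency_key():
    """Generate a UUID v4 string, the recommended key format."""
    return str(uuid.uuid4())


def is_valid_idempotency_key(key):
    """True if key is 1-255 printable ASCII characters (0x20 to 0x7E)."""
    return 1 <= len(key) <= 255 and all(0x20 <= ord(ch) <= 0x7E for ch in key)


key = new_idempotency_key()
```

Reusing the same key with an identical body on a retry is what allows the cached response to be replayed, so the key should be generated once per logical request rather than once per attempt.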
