Overview
Apers uses a token-based billing system called Smart Request Credit. Each API request is metered based on actual compute consumption, and credits are charged accordingly. This page describes how credit calculations work.
Design philosophy
Consumer-facing AI products operating on fixed-rate subscriptions face a fundamental economic constraint: the provider must ensure that average per-user compute remains below the subscription price.
In practice, this leads to silent trade-offs — routing requests to smaller models, truncating context windows, compressing output quality, or throttling usage during peak demand. The end user rarely sees these decisions, but they directly affect the reliability and quality of every response.
Institutional and production-grade AI systems operate under a different set of requirements. Consistent output quality is not a preference — it is an expectation. When AI is embedded in professional workflows, the cost of a degraded response is measured in downstream errors, missed analysis, and lost time.
In this context, transparent compute-based pricing is not just a billing model. It is an architectural guarantee that the system will never silently reduce capability to protect margins.
Smart Request Credit is built on this principle. Every request is served by the same model at full capability. There is no tiering, no throttling, and no quality trade-off. Cost scales linearly with actual token usage, and the calculation is fully visible to the user.
Credit formula
Each request consumes a number of tokens across input and output. The total credit charge is computed as:
Total Credits Charged = Unit Cost × Tokens per Query ÷ 0.1
Credits are computed as precise fractional values, with no floor rounding and no minimum charge. Your account balance is deducted by the exact fractional amount; the dashboard simply displays credit totals rounded to whole numbers.
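As a sketch, the formula above can be written in a few lines of Python. The unit cost and token count used here are hypothetical illustration values, not actual Apers rates:

```python
def credits_charged(unit_cost: float, tokens: int) -> float:
    """Exact fractional credit charge for one request:
    Total Credits Charged = Unit Cost x Tokens per Query / 0.1
    """
    return unit_cost * tokens / 0.1

# Hypothetical unit cost of 0.0002 credits per token, for illustration only.
charge = credits_charged(unit_cost=0.0002, tokens=450)

# The exact fractional amount is deducted from the balance...
balance = 100.0 - charge

# ...while the dashboard displays the total rounded to a whole number.
dashboard_total = round(charge)
```

Note that the rounding happens only at display time; the stored balance keeps full precision.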
Token-based unit costs
Tokens are the fundamental units of input and output in large language models. Each token — typically a few characters of text — carries a unit cost that depends on the operation being performed. Input tokens, output tokens, and cached tokens are each priced at different rates.
Apers implements a Cache Read/Write optimization layer. When data from a previous request can be reused, cached token rates apply, significantly reducing cost for large or repetitive workloads.
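To illustrate how per-type rates combine with the cache layer, here is a hedged sketch. The rate table, its values, and the function name are invented for illustration and do not reflect actual Apers pricing:

```python
# Hypothetical per-token unit costs (in credits), for illustration only.
UNIT_COSTS = {
    "input": 0.0003,
    "output": 0.0006,
    "cache_read": 0.00003,  # reused data is billed at a reduced rate
}

def request_credits(token_counts: dict) -> float:
    """Apply the Unit Cost x Tokens / 0.1 formula per token type and sum."""
    return sum(UNIT_COSTS[kind] * n / 0.1 for kind, n in token_counts.items())

# A request that reuses a large cached prompt pays far less than one
# that resubmits the same tokens as fresh input.
fresh = request_credits({"input": 10_000, "output": 500})
cached = request_credits({"cache_read": 10_000, "output": 500})
```

Under these illustrative rates, the cached variant of the same workload costs a fraction of the fresh one, which is why cache reuse matters most for large or repetitive workloads.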
Examples
The following examples illustrate how credits are calculated for requests of varying complexity. All figures are approximate.
Simple query
Prompt: "Explain what a cap rate is."
A straightforward question-and-answer exchange consumes less than 1 credit.
Moderate query
Prompt: "Read five documents and compare the differences."
Multi-document reading and comparison tasks typically fall in the range of 5–15 credits.
Complex task
Prompt: "Plan and build a 5-tab Excel spreadsheet with linked cash flow and sensitivity analysis."
Multi-step generation tasks involving structured output and iterative reasoning scale proportionally with token volume: the larger the generated output and the more reasoning steps involved, the higher the charge.
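Because the charge is a linear function of token volume, a task that consumes one hundred times the tokens costs one hundred times the credits. A minimal sketch, using a hypothetical unit cost:

```python
def credits_charged(unit_cost: float, tokens: int) -> float:
    """Total Credits Charged = Unit Cost x Tokens per Query / 0.1"""
    return unit_cost * tokens / 0.1

# Same hypothetical unit cost for both requests; only token volume differs.
small = credits_charged(0.0002, 400)     # a short Q&A-sized request
large = credits_charged(0.0002, 40_000)  # a long multi-step task
```

Here `large` works out to exactly 100 times `small` (up to floating-point error), reflecting that there are no per-request minimums or tier surcharges in the formula.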
Summary
Credit charges are determined entirely by measured token consumption. There are no per-request minimums, no rounding adjustments, and no model tier restrictions. All requests are served by the same model at the same level of capability.