Smart Request Credits

How Credit Works

Overview

Apers uses a token-based billing system called Smart Request Credit. Each API request is metered based on actual compute consumption, and credits are charged accordingly. This page describes how credit calculations work.

Design philosophy

Consumer-facing AI products operating on fixed-rate subscriptions face a fundamental economic constraint: the provider must ensure that the average per-user compute cost remains below the subscription price.

In practice, this leads to silent trade-offs — routing requests to smaller models, truncating context windows, compressing output quality, or throttling usage during peak demand. The end user rarely sees these decisions, but they directly affect the reliability and quality of every response.

Institutional and production-grade AI systems operate under a different set of requirements. Consistent output quality is not a preference — it is an expectation. When AI is embedded in professional workflows, the cost of a degraded response is measured in downstream errors, missed analysis, and lost time.

In this context, transparent compute-based pricing is not just a billing model. It is an architectural guarantee that the system will never silently reduce capability to protect margins.

Smart Request Credit is built on this principle. Every request is served by the same model at full capability. There is no tiering, no throttling, and no quality trade-off. Cost scales linearly with actual token usage, and the calculation is fully visible to the user.

Credit formula

Each request consumes a number of tokens across input and output. The total credit charge is computed as:

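Total Credits Charged = Unit Cost × Tokens per Query ÷ 0.1

| Parameter | Description |
| --- | --- |
| Unit Cost | The per-token rate, which varies by token type (e.g., input, output, cache read/write). |
| Tokens per Query | The number of tokens of that type consumed during the request. |
| 0.1 | The credit denomination. Dividing the raw dollar cost by 0.1 converts it into credit units. |

Because the unit cost differs by token type, the formula is applied to each token type separately and the results are summed to give the total for the request.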

Credits are computed as exact fractional values, with no floor rounding and no minimum charge, and the exact amount is deducted from your account balance. The dashboard displays credit totals as whole numbers.
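As a minimal sketch of the formula above, the following Python snippet applies it to a single token type. The names `credits_charged` and `CREDIT_DENOMINATION` are illustrative and not part of any Apers API. `Decimal` is used here only so that fractional credits are not distorted by binary floating-point arithmetic; how Apers represents the values internally is not stated on this page.

```python
from decimal import Decimal

# The credit denomination from the formula: raw cost is divided by 0.1.
CREDIT_DENOMINATION = Decimal("0.1")


def credits_charged(unit_cost, tokens):
    """Apply Unit Cost x Tokens / 0.1 for one token type.

    unit_cost is passed as a string (or Decimal) so the rate is stored exactly.
    No rounding and no minimum charge are applied to the result.
    """
    return Decimal(unit_cost) * tokens / CREDIT_DENOMINATION


# 500 input tokens at the input rate used in the examples on this page.
print(credits_charged("0.000006", 500))
```

The result is a fractional credit value well below one credit, which matches the statement that no minimum charge applies.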

Token-based unit costs

Tokens are the fundamental units of input and output in large language models. Each token — typically a few characters of text — carries a unit cost that depends on the operation being performed. Input tokens, output tokens, and cached tokens are each priced at different rates.

Apers implements a Cache Read/Write optimization layer. When data from a previous request can be reused, cached token rates apply, significantly reducing cost for large or repetitive workloads.
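To make the effect of cached rates concrete, here is a rough sketch that compares the credits for re-sending tokens at the input rate with the credits for reading the same tokens from cache. The input rate is the one used in the examples on this page. The cache read rate is a purely hypothetical placeholder chosen for illustration, since this page does not list cache rates, and the function name `cache_saving` is likewise an assumption.

```python
from decimal import Decimal

CREDIT_DENOMINATION = Decimal("0.1")


def cache_saving(input_rate, cache_read_rate, reused_tokens):
    """Credits saved when reused tokens are billed at a cache read rate
    instead of the regular input rate."""
    full = Decimal(input_rate) * reused_tokens / CREDIT_DENOMINATION
    cached = Decimal(cache_read_rate) * reused_tokens / CREDIT_DENOMINATION
    return full - cached


INPUT_RATE = "0.000006"                    # input rate from the examples on this page
HYPOTHETICAL_CACHE_READ_RATE = "0.0000006" # NOT an Apers rate; illustration only

# Saving on 60,000 reused tokens under the hypothetical cache rate.
print(cache_saving(INPUT_RATE, HYPOTHETICAL_CACHE_READ_RATE, 60_000))
```

The actual saving depends on the published cache read and write rates and on how much of a request can be served from cache.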

Examples

The following examples illustrate how credits are calculated for requests of varying complexity. All figures are approximate.

Simple query

Prompt: "Explain what a cap rate is."

| Component | Calculation | Credits |
| --- | --- | --- |
| Input (500 tokens) | 0.000006 × 500 ÷ 0.1 | 0.03 |
| Output (1,500 tokens) | 0.00003 × 1,500 ÷ 0.1 | 0.45 |
| Total | | 0.48 |

A straightforward question-and-answer exchange consumes less than 1 credit.

Moderate query

Prompt: "Read five documents and compare the differences."

| Component | Calculation | Credits |
| --- | --- | --- |
| Input (60,000 tokens) | 0.000006 × 60,000 ÷ 0.1 | 3.6 |
| Output (20,000 tokens) | 0.00003 × 20,000 ÷ 0.1 | 6.0 |
| Total | | 9.6 |

Multi-document reading and comparison tasks typically fall in the range of 5–15 credits.

Complex task

Prompt: "Plan and build a 5-tab Excel spreadsheet with linked cash flow and sensitivity analysis."

| Component | Calculation | Credits |
| --- | --- | --- |
| Input (230,000 tokens) | 0.000006 × 230,000 ÷ 0.1 | 13.8 |
| Output (120,000 tokens) | 0.00003 × 120,000 ÷ 0.1 | 36.0 |
| Total | | 49.8 |

Multi-step generation tasks involving structured output and iterative reasoning scale proportionally with token volume.
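The three worked examples can be reproduced with a short script. This is a sketch under stated assumptions: the rates are the approximate input and output figures used in the tables above, and the names `RATES`, `EXAMPLES`, and `request_credits` are illustrative rather than part of any Apers API.

```python
from decimal import Decimal

CREDIT_DENOMINATION = Decimal("0.1")

# Approximate per-token rates taken from the example tables on this page.
RATES = {
    "input": Decimal("0.000006"),
    "output": Decimal("0.00003"),
}

# Token counts from the simple, moderate, and complex examples.
EXAMPLES = {
    "simple": {"input": 500, "output": 1_500},
    "moderate": {"input": 60_000, "output": 20_000},
    "complex": {"input": 230_000, "output": 120_000},
}


def request_credits(tokens_by_type):
    """Apply the credit formula per token type and sum the results."""
    return sum(
        (RATES[kind] * count / CREDIT_DENOMINATION
         for kind, count in tokens_by_type.items()),
        Decimal(0),
    )


for name, usage in EXAMPLES.items():
    print(name, request_credits(usage))
```

Each total is the sum of the per-type charges, so doubling every token count in a request doubles its credit charge.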

Summary

Credit charges are determined entirely by measured token consumption. There are no per-request minimums, no rounding adjustments, and no model tier restrictions. All requests are served by the same model at the same level of capability.

| Principle | Detail |
| --- | --- |
| Precision billing | Credits reflect exact compute used. |
| No minimums | Sub-credit charges are applied as fractional values. |
| Proportional scaling | Cost increases linearly with token usage. |
| Full model access | Compute metering replaces model gating. All users access the highest-performing model. |
