How Scoring Works

A detailed walkthrough of the scoring algorithm, bin matching rules, and performance characteristics. Understanding this process helps you debug unexpected scores and optimize your integration.

Scoring Algorithm

When the scoring engine receives a request, it executes these five steps:

1. Load the scorecard spec. The engine retrieves the CalibrScorecardSpec for the requested scorecard ID from an in-memory cache. On a cache miss, the spec is loaded from the database and cached for subsequent requests.
2. Match bins for each variable. For each variable defined in the spec, the engine looks up the applicant's value and finds the matching bin. The matching logic depends on the variable type (numeric or categorical).
3. Sum points. The total score is computed as:

       total_score = base_points + sum(variable_points)

4. Compute probability of default (PD). The score is converted to a probability using the scaling parameters offset and factor:

       PD = 1 / (1 + exp((score - offset) / factor))

   With a positive factor, a higher score yields a lower PD, and a score equal to the offset yields a PD of 0.5.
5. Map to risk grade. If risk grades are configured, the engine finds the grade whose score range contains the total score and includes it in the response.
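
The last three steps are plain arithmetic, so they can be reproduced outside the engine when debugging a score. The following is a minimal sketch, not the engine's implementation: the function names, the shape of the grade records (`min_score`, `max_score`, `grade`), the inclusive grade bounds, and all numeric values are assumptions made for illustration.

```python
import math

def total_score(base_points, variable_points):
    """Step 3: base points plus the points of each matched bin."""
    return base_points + sum(variable_points)

def probability_of_default(score, offset, factor):
    """Step 4: PD = 1 / (1 + exp((score - offset) / factor))."""
    return 1.0 / (1.0 + math.exp((score - offset) / factor))

def risk_grade(score, grades):
    """Step 5: return the first grade whose range contains the score.

    Assumption: each grade is a dict with inclusive "min_score" and
    "max_score" bounds; the real spec may encode ranges differently.
    """
    for grade in grades:
        if grade["min_score"] <= score <= grade["max_score"]:
            return grade["grade"]
    return None  # no grades configured, or score outside every range

# Illustrative numbers only, not taken from a real scorecard.
GRADES = [
    {"grade": "A", "min_score": 700, "max_score": 850},
    {"grade": "B", "min_score": 650, "max_score": 699},
    {"grade": "C", "min_score": 300, "max_score": 649},
]
score = total_score(base_points=600, variable_points=[41, 28, -5])  # 664
pd = probability_of_default(score, offset=600, factor=20)
grade = risk_grade(score, GRADES)  # "B"
```

If a recomputed total differs from the engine's, the usual culprit is step 2: compare the points you assumed for each variable against the bin the engine actually matched.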

Bin Matching Rules

Numeric variables

Numeric bins use the interval notation (min, max], which is left-exclusive, right-inclusive. The engine parses the range string, extracts the boundaries, and checks:

    min < value <= max

Special boundary values -Infinity and Infinity are supported for the first and last bins respectively.
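
To make the boundary behavior concrete, here is a hedged sketch of numeric bin matching. It assumes range strings look like `"(25, 40]"` and that each bin is a dict with `range` and `points` keys; the helper names `parse_range` and `match_numeric_bin` are hypothetical and the engine's actual parser may differ.

```python
def parse_range(range_str):
    """Parse a "(min, max]" string into a (min, max) pair of floats.

    Python's float() accepts "Infinity" and "-Infinity", so the open-ended
    first and last bins need no special casing.
    """
    text = range_str.strip()
    if not (text.startswith("(") and text.endswith("]")):
        raise ValueError(f"expected '(min, max]', got {range_str!r}")
    low, high = (part.strip() for part in text[1:-1].split(","))
    return float(low), float(high)

def match_numeric_bin(value, bins):
    """Return the first bin where min < value <= max, or None."""
    for b in bins:
        if b["range"] == "Missing":
            continue  # the Missing bin is only used for absent or null values
        low, high = parse_range(b["range"])
        if low < value <= high:
            return b
    return None

# Illustrative bins for a hypothetical "age" variable.
AGE_BINS = [
    {"range": "(-Infinity, 25]", "points": -10},
    {"range": "(25, 40]", "points": 12},
    {"range": "(40, Infinity]", "points": 30},
]
```

Because the interval is right-inclusive, a value of exactly 25 lands in the first bin and a value of exactly 40 lands in the second.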

Categorical variables

Categorical bins match by exact, case-sensitive string comparison. A single bin can match multiple categories using the %,% delimiter:

json

// Single category
{ "value": "MORTGAGE", "points": 41 }

// Multiple categories grouped into one bin
{ "value": "OTHER%,%NONE", "points": -5 }  // Matches "OTHER" or "NONE"

If a categorical value does not match any defined bin, the engine falls through to the Missing bin (if defined) or assigns 0 points.

Multi-category delimiter

The %,% delimiter is used instead of a comma because category names themselves may contain commas. The engine splits the bin value on %,% and checks if the applicant's value matches any of the resulting strings.
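
The split-and-compare logic, together with the fallback to the Missing bin, might look like the sketch below. It assumes each bin is a dict with `value` and `points` keys as in the JSON above; the function name `categorical_points` is hypothetical.

```python
DELIMITER = "%,%"

def categorical_points(value, bins):
    """Return the points of the bin that lists `value`.

    Falls back to the Missing bin's points if one is defined,
    otherwise to 0 points.
    """
    fallback = 0
    for b in bins:
        # Exact, case-sensitive comparison against each grouped category.
        if value in b["value"].split(DELIMITER):
            return b["points"]
        if b["value"] == "Missing":
            fallback = b["points"]
    return fallback

# Illustrative bins; the second one contains a category with a comma in it.
HOME_BINS = [
    {"value": "MORTGAGE", "points": 41},
    {"value": "RENT, SHARED%,%OTHER%,%NONE", "points": -5},
]
```

Splitting on a plain comma would break `"RENT, SHARED"` into two categories, which is exactly the case the `%,%` delimiter avoids.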

Missing Value Handling

A value is considered “missing” when the variable is absent from the request body or explicitly set to null. The engine handles missing values as follows:

Look for a bin with "range": "Missing" (numeric) or "value": "Missing" (categorical)
If a Missing bin exists, use its points
If no Missing bin is defined, the variable contributes 0 points to the total score

Why 0 points for missing values? In weight-of-evidence (WOE) based scorecards, a WOE of 0 means the observation carries no predictive information — it neither increases nor decreases the score. This is the correct neutral fallback for unknown or unmapped values.
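
As a rough illustration of the fallback order described above, the sketch below treats both an absent key and an explicit null (`None` after JSON decoding) as missing. The helper names and the bin dict shape are assumptions for this example.

```python
def is_missing(payload, variable):
    """True if the variable is absent from the payload or set to null."""
    return payload.get(variable) is None

def missing_points(bins):
    """Points of the Missing bin if one is defined, otherwise 0."""
    for b in bins:
        # Numeric bins use the "range" key, categorical bins use "value".
        if b.get("range") == "Missing" or b.get("value") == "Missing":
            return b["points"]
    return 0

# Illustrative payload and bins.
payload = {"age": None, "home_ownership": "MORTGAGE"}
age_bins = [
    {"range": "(-Infinity, 25]", "points": -10},
    {"range": "Missing", "points": -3},
]
```

Note that `0` and the empty string are not missing under this definition; only an absent key or null triggers the fallback.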

Strict Mode Warnings

When a scoring request includes variables not defined in the scorecard spec, or when a value does not match any bin, the response includes a warnings array:

json

{
  "score": 687,
  "pd": 0.034,
  "risk_grade": "B",
  "warnings": [
    {
      "type": "unknown_variable",
      "variable": "credit_score",
      "message": "Variable 'credit_score' is not defined in the scorecard spec and was ignored"
    },
    {
      "type": "no_bin_match",
      "variable": "home_ownership",
      "value": "UNKNOWN_TYPE",
      "message": "Value 'UNKNOWN_TYPE' did not match any bin for 'home_ownership'; used 0 points"
    }
  ]
}

Warnings do not prevent scoring — they are informational. Check the warnings array in your integration to catch data quality issues early.
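
One way to act on warnings in an integration is to group them by type and feed the result into logging or monitoring. This is a sketch only: the function name is hypothetical, and it assumes the `warnings` key may be absent or null when there is nothing to report, which this guide does not specify.

```python
from collections import defaultdict

def group_warnings(response):
    """Map each warning type to the list of variables that triggered it."""
    grouped = defaultdict(list)
    # Tolerate a missing or null "warnings" key (assumed, not documented).
    for warning in response.get("warnings") or []:
        grouped[warning["type"]].append(warning["variable"])
    return dict(grouped)

# Decoded response body, abbreviated from the example above.
response = {
    "score": 687,
    "warnings": [
        {"type": "unknown_variable", "variable": "credit_score"},
        {"type": "no_bin_match", "variable": "home_ownership",
         "value": "UNKNOWN_TYPE"},
    ],
}
by_type = group_warnings(response)
```

A sustained rise in `no_bin_match` counts for one variable usually points to a new upstream category or a formatting change rather than a scoring problem.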

Performance

The scoring engine is optimized for low-latency, high-throughput scoring:

Single score: ~0.1 ms per applicant (excluding network round-trip)
Batch scoring: Up to 1,000 applicants per request via the POST /api/v1/score/batch endpoint
Spec caching: Scorecard specs are cached in memory after first load, with a TTL-based invalidation strategy
No ML runtime: Scoring uses pure arithmetic (addition and a single sigmoid), so there is no dependency on ML frameworks at inference time
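
Because a batch request accepts at most 1,000 applicants, larger workloads have to be split on the client side. The helper below is a hypothetical sketch of that split; the request and response body format of the batch endpoint is not described in this guide, so only the chunking is shown.

```python
MAX_BATCH_SIZE = 1000  # per-request limit stated above

def chunked(applicants, size=MAX_BATCH_SIZE):
    """Yield consecutive slices of at most `size` applicants."""
    if size < 1:
        raise ValueError("size must be at least 1")
    for start in range(0, len(applicants), size):
        yield applicants[start:start + size]

# Illustrative: 2,500 applicant records become three batches.
applicants = [{"applicant_id": i} for i in range(2500)]
batch_sizes = [len(batch) for batch in chunked(applicants)]
```

Each yielded slice can then be sent as one POST /api/v1/score/batch request.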