DQ.

Consistency — Data Quality Dimension

Consistency

Consistency failures are the hardest to detect because they require comparing multiple values — not inspecting a single column in isolation. A row where start_date > end_date passes all completeness, uniqueness, and validity checks. It fails consistency.

Definition

Consistency measures the proportion of rows where related values agree with each other, either within a single row or across two tables representing the same entity.

consistency = consistent_row_count / total_rows

A score of 1.0 means all defined relationships hold for every row. Consistency rules express business invariants: totals that must sum correctly, dates that must be ordered, statuses that must be logically compatible with other fields.

How DQ Measures It

DQ supports two types of consistency checks:

Intra-row consistency. Rules that compare columns within the same row using SQL expressions:

-- order_total must equal the sum of line item amounts
order_total = subtotal + tax + shipping_fee

-- end_date must be after start_date
end_date >= start_date

-- refund_amount cannot exceed original charge
refund_amount <= charge_amount

Cross-table consistency. Rules that compare aggregate values or individual records across two registered datasources:

-- Total revenue in orders must match total in accounting_ledger
-- for the same date range (cross-source check)
ABS(orders.daily_total - ledger.daily_total) / ledger.daily_total < 0.001

DQ evaluates each rule and returns the proportion of rows (or date windows) that pass. Failed rows are surfaced with the specific values that violated the rule.

Consistency rules are authored as SQL expressions in the DQ rule editor. For intra-row rules, DQ can auto-suggest candidates based on column name patterns and type pairs detected during profiling.

Common Failure Modes

Currency mismatch. order_total is stored in USD but line_items.amount is stored in the local currency. The sum check fails for all non-USD orders. The bug is invisible without the consistency rule.

Timezone handling. created_at is stored in UTC; local_event_time is stored in the user's local timezone without explicit offset. For users in UTC-5, local_event_time appears 5 hours before created_at, failing a local_event_time >= created_at consistency rule.

Denormalization drift. A summary table is populated by a nightly job that reads from a transactional table. The job fails silently for one day. The summary table now has stale values that disagree with the source — cross-table consistency score drops.

Status-field contradictions. An order_status = "delivered" row has a null delivered_at. The delivery date is required when status is delivered. Neither completeness nor validity catches this without a conditional consistency rule.

For a full overview of all six dimensions, see /blog/six-dimensions-of-data-quality.


About DQ. DQ is the data quality engine that profiles, validates, and remediates your tables in 90 seconds. Built by K/20X Labs.