Overall score
91.4%
+0.3% vs last week
Six-dimension scores
completeness
99.7%
+0.1%
All required fields populated; 0.3% nullable metadata fields left intentionally null.
uniqueness
99.3%
-0.1%
transaction_id 100% unique; 487 orphan merchant_id references skew composite uniqueness.
validity
92.1%
+0.4%
Currency enum valid (COP/USD/EUR). 0.6% amount_cop values below floor — possible refunds.
consistency
90.7%
-0.8%
created_at 7-day KS drift: D=0.12, p=0.031. Upper-tail amount_cop trending higher.
integrity
89.1%
-0.2%
487 rows reference merchant_id not in merchants table (deleted merchants). FK integrity 99.97%.
accuracy
90.4%
+0.2%
Benford first-digit χ²=0.038 on amount_cop — slightly off but within tolerance for COP denominations.
payments_pg.transactions
1,800,000 rows · 14 cols · scanned 14h ago| Column | Type | Null% | Distinct | Mean | p95 | Dimensions | Notes |
|---|---|---|---|---|---|---|---|
transaction_id | uuid | 0% | 1,800,000 | — | — | compl 100.0% valid 100.0% uniq 100.0% | Primary key. 100% complete and unique. |
amount_cop | numeric | 0% | 1,200,000 | 412,000 | 2,100,000 | compl 100.0% valid 99.4% uniq 66.7% | Colombian Peso amounts. Mean 412k, p95 2.1M. Benford χ²=0.038 — mild anomaly. |
currency | text | 0% | 3 | — | — | compl 100.0% valid 100.0% uniq 0.0% | 3 distinct values. COP 92.4%, USD 6.1%, EUR 1.5%. |
merchant_id | uuid | 0% | 4,200 | — | — | compl 100.0% valid 100.0% uniq 0.2% | 487 orphan rows reference deleted merchants. FK integrity 99.97%. |
created_at | timestamptz | 0% | 1,710,000 | — | — | compl 100.0% valid 100.0% uniq 95.0% | Drift detected vs 7-day baseline. KS D=0.12, p=0.031. |
customer_id | uuid | 0.3% | 150,000 | — | — | compl 99.7% valid 100.0% uniq 8.3% | |
card_last4PII credit_card_last4 | char(4) | 12.0% | 10,000 | — | — | compl 88.0% valid 99.9% uniq 0.0% | |
status | text | 0% | 5 | — | — | compl 100.0% valid 100.0% uniq 0.0% |
payments_pg.merchants
4,200 rows · 8 cols · scanned 14h ago| Column | Type | Null% | Distinct | Mean | p95 | Dimensions | Notes |
|---|---|---|---|---|---|---|---|
merchant_id | uuid | 0% | 4,200 | — | — | compl 100.0% valid 100.0% uniq 100.0% | |
legal_name | text | 1.0% | 4,158 | — | — | compl 99.0% valid 97.0% uniq 98.0% | |
contact_emailPII | text | 2.0% | 4,116 | — | — | compl 98.0% valid 96.0% uniq 97.0% | 3 typo-domain emails detected. |
nitPII co_nit | text | 3.0% | 4,158 | — | — | compl 97.0% valid 98.0% uniq 99.0% |
payments_pg.chargebacks
28,000 rows · 9 cols · scanned 14h ago| Column | Type | Null% | Distinct | Mean | p95 | Dimensions | Notes |
|---|---|---|---|---|---|---|---|
chargeback_id | uuid | 0% | 28,000 | — | — | compl 100.0% valid 100.0% uniq 100.0% | |
transaction_id | uuid | 0% | 28,000 | — | — | compl 100.0% valid 100.0% uniq 100.0% | 168 chargebacks reference transactions no longer in main table. |
amount_cop | numeric | 0% | 22,400 | 318,000 | 1,800,000 | compl 100.0% valid 99.8% uniq 80.0% |
Auto-prioritized fixes (2)
Handle 487 orphan transactions referencing deleted merchants
487 transaction rows reference a merchant_id that no longer exists in the merchants table (merchants deleted without cascade). These transactions are invisible in merchant-level reporting and may cause FK violations in downstream pipelines.
-- Option A: Archive deleted merchants INSERT INTO merchants_archive SELECT * FROM merchants_deleted WHERE merchant_id IN ( SELECT DISTINCT merchant_id FROM transactions t LEFT JOIN merchants m ON t.merchant_id = m.merchant_id WHERE m.merchant_id IS NULL ); -- Option B: Null out orphan references (loses traceability) UPDATE transactions SET merchant_id = NULL WHERE merchant_id NOT IN (SELECT merchant_id FROM merchants);
487 rows affected
Monitor amount_cop distribution drift — KS D=0.12
KS test on amount_cop vs 7-day rolling baseline: D=0.12, p=0.031. The upper tail (p95+) is trending ~18% higher than expected. Could be a seasonal spike (Día sin IVA) or a legitimate merchant ramp-up. Not yet an anomaly but warrants a watchlist rule.
-- Set up a weekly KS monitor
INSERT INTO dq_monitors (dataset, column, test, threshold_D, threshold_p, alert_channel)
VALUES ('payments_pg', 'transactions.amount_cop', 'ks_2sample', 0.15, 0.05, 'slack:#data-alerts');
-- Also review by currency split: COP vs USD separately.1,800,000 rows affected
Column similarity matches
| Column A | Column B | Jaccard |
|---|---|---|
| crm_postgres.customers.email | payments_pg.merchants.contact_email | 0.89 |
| payments_pg.transactions.customer_id | crm_postgres.customers.customer_id | 0.83 |
| payments_pg.merchants.nit | legacy_mysql.suppliers.supplier_name | 0.61 |