Data Catalog
5 datasets · 14 tables · 146 columns · 8,270,952 total rows
Connections
| Dataset | Host | Score |
|---|---|---|
| payments_pg | pg-prod-01.us-east-1.k20x.internal | 91.4% |
| crm_postgres | pg-crm-01.us-east-1.k20x.internal | 76.8% |
| analytics_orders | xy12345.us-east-1.snowflakecomputing.com | 88.2% |
| compliance_consent | sheets.googleapis.com | 82.1% |
| legacy_mysql | mysql-legacy-01.k20x.internal | 64.3% |
Column similarity — MinHash + Jaccard
Column pairs whose Jaccard similarity (estimated with MinHash) is ≥ 0.60 are flagged as potential duplicates, renames, or shared master data.
| Column A | Column B | Jaccard |
|---|---|---|
| crm_postgres.customers.email | payments_pg.merchants.contact_email | 0.89 |
| crm_postgres.customers.document_id | compliance_consent.consent_log.subject_id | 0.97 |
| legacy_mysql.products.sku | analytics_orders.dim_products.product_code | 0.94 |
| crm_postgres.customers.customer_id | analytics_orders.dim_customers.customer_id | 0.98 |
| payments_pg.transactions.customer_id | crm_postgres.customers.customer_id | 0.83 |
| legacy_mysql.products.price_cop | analytics_orders.fct_orders.revenue_local | 0.71 |
| crm_postgres.customers.city | crm_postgres.addresses.city | 0.86 |
| payments_pg.merchants.nit | legacy_mysql.suppliers.supplier_name | 0.61 |
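
For readers unfamiliar with how scores like those above can be produced, here is a minimal sketch of MinHash-based Jaccard estimation between the value sets of two columns. It is an illustration under stated assumptions, not the catalog's actual implementation: the function names, the 128-hash signature length, the seeded BLAKE2b hashing, and the sample values are all hypothetical. Only the 0.60 threshold comes from the text above.

```python
import hashlib

# Threshold taken from the catalog text; everything else here is assumed.
FLAG_THRESHOLD = 0.60


def exact_jaccard(values_a, values_b):
    """Exact Jaccard similarity: |A ∩ B| / |A ∪ B| over distinct values."""
    a, b = set(values_a), set(values_b)
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)


def _seeded_hash(value, seed):
    """Deterministic 64-bit hash of a value under a given seed."""
    data = f"{seed}:{value}".encode("utf-8")
    return int.from_bytes(hashlib.blake2b(data, digest_size=8).digest(), "big")


def minhash_signature(values, num_hashes=128):
    """For each seeded hash function, keep the minimum hash over the set."""
    distinct = set(values)
    if not distinct:
        raise ValueError("cannot build a signature for an empty column")
    return [
        min(_seeded_hash(v, seed) for v in distinct)
        for seed in range(num_hashes)
    ]


def estimate_jaccard(sig_a, sig_b):
    """Fraction of signature positions that agree estimates the Jaccard index."""
    if len(sig_a) != len(sig_b):
        raise ValueError("signatures must have the same length")
    matches = sum(1 for x, y in zip(sig_a, sig_b) if x == y)
    return matches / len(sig_a)


def is_flagged(values_a, values_b, num_hashes=128, threshold=FLAG_THRESHOLD):
    """True when the estimated similarity meets the flagging threshold."""
    sig_a = minhash_signature(values_a, num_hashes)
    sig_b = minhash_signature(values_b, num_hashes)
    return estimate_jaccard(sig_a, sig_b) >= threshold


if __name__ == "__main__":
    # Hypothetical sample values standing in for two columns.
    col_a = [f"user{i}@example.com" for i in range(500)]
    col_b = [f"user{i}@example.com" for i in range(50, 550)]
    sig_a = minhash_signature(col_a)
    sig_b = minhash_signature(col_b)
    print("exact:    ", round(exact_jaccard(col_a, col_b), 3))
    print("estimated:", round(estimate_jaccard(sig_a, sig_b), 3))
    print("flagged:  ", is_flagged(col_a, col_b))
```

The signature has a fixed length regardless of column size, so signatures can be computed once per column and compared pairwise without rescanning the data. The estimate carries sampling error that shrinks as the number of hash functions grows, so pairs scoring close to 0.60 are worth confirming with an exact comparison.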