How to Monitor Data Quality Without Writing Code
Code-first tools like Great Expectations, Soda Core, and dbt tests are powerful — once configured. Getting there requires writing YAML expectation suites, managing Python environments, and wiring up checkpoint runners. Most teams spend two to four weeks before their first failing test is meaningful.
DQ takes a different path: connect a datasource, run an automatic profile, then click to promote any discovered pattern into a standing rule. No YAML. No Python. No pipeline edits.
Why YAML Rule Files Slow You Down
Great Expectations stores expectations in JSON/YAML files under version control. Every new column requires a new expectation. Every schema change breaks existing suites. Soda uses SodaCL, a custom YAML dialect with similar overhead. dbt tests are simpler but live inside your dbt project — useless if you don't run dbt.
The mental model is: write the rules, then run them. This is fine for tables you already understand. It breaks down when a table has 80 columns and you have no idea which ones matter.
Auto-Profile: Discover Before You Validate
DQ flips the model. On the first run, DQ:
- Samples up to 1 million rows (configurable).
- Computes all six quality dimensions for every column.
- Infers data types, formats, and value distributions.
- Surfaces anomalies via Isolation Forest (Liu et al., 2008) and Benford's law checks.
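The anomaly step above can be sketched with scikit-learn's `IsolationForest` — a stand-in for DQ's internal implementation, which is not public. The column name and sample values are invented for illustration:

```python
import numpy as np
from sklearn.ensemble import IsolationForest

# Invented sample: twenty typical order totals plus one extreme value.
order_total = np.array(
    [12.5, 9.9, 14.2, 11.0, 13.7, 10.4, 12.1, 9.5, 15.0, 11.8,
     10.9, 13.3, 12.7, 9.8, 14.6, 11.2, 10.1, 13.9, 12.4, 10.7,
     9500.0]
).reshape(-1, 1)

# Fit an Isolation Forest; roughly 5% of rows are expected to be anomalous.
clf = IsolationForest(contamination=0.05, random_state=0)
labels = clf.fit_predict(order_total)  # -1 = anomaly, 1 = normal

anomalies = order_total[labels == -1].ravel()
print(anomalies)
```

The same per-column pass can compute Benford first-digit frequencies alongside the forest scores; any column whose digit distribution diverges sharply from Benford's law gets flagged in the same scorecard.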
The result is a scorecard — no rules written yet. From the scorecard, one click converts any observation into a persistent rule: "null rate on order_total must stay below 2%", "all email values must match RFC 5322 format".
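Conceptually, the two promoted rules above reduce to simple per-column checks. This is a minimal sketch of what such rules evaluate, not DQ's rule engine; the sample rows and the simplified email pattern are invented (real RFC 5322 validation is far stricter):

```python
import re

# Invented sample rows; None stands in for SQL NULL.
orders = [
    {"order_total": 19.99, "email": "a@example.com"},
    {"order_total": None,  "email": "b@example.com"},
    {"order_total": 5.00,  "email": "not-an-email"},
    {"order_total": 12.50, "email": "c@example.com"},
]

# Rule 1: null rate on order_total must stay below 2%.
null_rate = sum(r["order_total"] is None for r in orders) / len(orders)
rule_1_passes = null_rate < 0.02

# Rule 2: every email value must look like an address.
EMAIL = re.compile(r"^[^@\s]+@[^@\s]+\.[^@\s]+$")
rule_2_passes = all(EMAIL.match(r["email"]) for r in orders)

print(rule_1_passes, rule_2_passes)
```

On this deliberately dirty sample both rules fail, which is exactly the point: the rule is derived from an observed baseline, then enforced on every subsequent run.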
Connect Postgres in 90 Seconds
```shell
pip install dq-cli

dq connect \
  --type postgres \
  --url "postgresql://user:pass@host:5432/mydb"

dq profile --table public.orders
dq scorecard --table public.orders
```
The fourth command prints a terminal scorecard. Open the web UI for the full interactive version with column-level drill-down.
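To make the scorecard concrete, here is a minimal sketch of the kind of per-column profile it summarizes — type, null rate, and distinct count — computed with pandas on an invented sample standing in for `public.orders` (this is an illustration, not DQ's output format):

```python
import pandas as pd

# Invented sample rows standing in for public.orders.
df = pd.DataFrame({
    "order_id": [1, 2, 3, 4, 5],
    "order_total": [19.99, None, 5.00, 12.50, 7.25],
    "email": ["a@x.com", "b@x.com", None, "c@x.com", "c@x.com"],
})

# A minimal per-column profile: inferred type, null rate, distinct count.
profile = pd.DataFrame({
    "dtype": df.dtypes.astype(str),
    "null_rate": df.isna().mean(),
    "distinct": df.nunique(),
})
print(profile)
```

A real profile adds the remaining quality dimensions (format conformance, distribution stats, anomaly scores), but the shape is the same: one row of metrics per column.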
Compared to Great Expectations, Soda, and dbt Tests
| Dimension | Great Expectations | Soda Core | dbt tests | DQ |
|---|---|---|---|---|
| First result | 2–4 weeks | 1–3 days | Hours (dbt required) | 90 seconds |
| Rule authoring | Python/YAML | SodaCL YAML | SQL macros | Click-to-enable |
| Auto-profiling | No | Partial | No | Yes |
| Catalog + lineage | No | No | Partial | Yes |
| Remediation | No | No | No | Yes |
For teams already deep in dbt, dbt tests are fine for schema-level coverage. For everyone else, the code overhead is a barrier. See the full breakdown at /compare/great-expectations.
Pricing and Getting Started
DQ is available on a free tier (up to 3 datasources) and paid plans starting at $99/month. See /pricing for current limits.
For deeper understanding of what DQ measures, read the six dimensions explainers or browse /dimensions.
FAQ
Q: Does DQ require an agent or sidecar running in my infrastructure?
A: No. DQ connects directly to your database over a standard connection string. Read-only credentials are sufficient for profiling.

Q: How does DQ handle schema changes?
A: When a column is added or removed, DQ re-profiles automatically on the next run and flags rules that no longer apply.

Q: Can DQ run on a schedule without manual intervention?
A: Yes. Schedules are configured in the UI (cron syntax) or via the API. Failed checks trigger webhook or Slack alerts.
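The schema-change behavior boils down to a set difference between the old and new column lists, then flagging rules bound to columns that disappeared. A minimal sketch with invented rule and column names (not DQ's internal representation):

```python
# Invented rules, each bound to a single column.
rules = {
    "r1": {"column": "order_total", "check": "null_rate < 0.02"},
    "r2": {"column": "coupon_code", "check": "not_null"},
}

old_columns = {"order_id", "order_total", "coupon_code"}
new_columns = {"order_id", "order_total", "email"}  # coupon_code was dropped

added = new_columns - old_columns
removed = old_columns - new_columns

# Flag rules that reference columns that no longer exist.
stale = [rid for rid, r in rules.items() if r["column"] in removed]
print(added, removed, stale)
```

Newly added columns get profiled from scratch on the next run, so they show up on the scorecard before any rule exists for them.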
About DQ. DQ is the data quality engine that profiles, validates, and remediates your tables in 90 seconds. Built by K/20X Labs.