Guide · about 1 min read
Understanding Data Quality
Dimensions of quality, lightweight checks, and how to document issues for stakeholders.
Dimensions of quality
Accuracy, completeness, consistency, timeliness, and fitness-for-use are common lenses. A dataset can be internally consistent yet wrong for your question if definitions shifted—see metadata and provenance.
Practical checks
Profile key columns: min/max for numerics, cardinality for categories, and null rates for missing data. Compare row counts before and after joins to catch unintended duplication.
Quality and bias
Quality reviews should explicitly ask who is over- or under-represented and whether collection incentives skew results. The bias glossary entry links to storytelling pitfalls.
Communicating limitations
Plain-language limitation statements build trust. Point readers to the Data Literacy Basics hub for vocabulary you can reuse in footnotes and slide decks.
Related glossary terms
Related guides
Curated external resources
- W3C Data on the Web Best Practices (opens in new tab)
Standards-oriented checklist for publishing usable, documented datasets.
- NIST de-identification overview (opens in new tab)
Introductory framing for protecting individuals when sharing data.