Data cleaning toolkit

Interactive data demo
An upstream prerequisite rather than a side utility: downstream models and dashboards ingest the same reviewed tables. Multi-format inputs become auditable CSV/Parquet/JSON exports plus an HTML step log and before/after views, capped near 100K rows. Cleaning rules cover bad formats, duplicates, skewed categories, and optional outliers, and bundled sample files support dry runs.
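
A minimal pandas sketch of the rule families described above; column handling, thresholds, and file names are illustrative assumptions, not the repository's actual API or defaults:

```python
# Illustrative only: names, thresholds, and paths are assumptions, not the repo's code.
import pandas as pd

ROW_CAP = 100_000  # inputs are capped near 100K rows


def clean(df: pd.DataFrame, drop_outliers: bool = False):
    """Apply the demo's rule families and return the table plus a step log."""
    steps = []
    df = df.head(ROW_CAP).copy()
    steps.append(f"capped input to {len(df)} rows")

    # Bad formats: coerce mostly-numeric text columns, leaving failures as NaN.
    for col in df.select_dtypes(include="object").columns:
        converted = pd.to_numeric(df[col], errors="coerce")
        if converted.notna().mean() > 0.9:
            df[col] = converted
            steps.append(f"coerced {col} to numeric")

    # Duplicates: drop exact repeats across all columns.
    before = len(df)
    df = df.drop_duplicates()
    steps.append(f"dropped {before - len(df)} duplicate rows")

    # Skewed categories: collapse labels below 1% frequency into 'other'.
    for col in df.select_dtypes(include="object").columns:
        freq = df[col].value_counts(normalize=True)
        rare = freq[freq < 0.01].index
        if len(rare):
            df[col] = df[col].replace(dict.fromkeys(rare, "other"))
            steps.append(f"collapsed {len(rare)} rare labels in {col}")

    # Optional outliers: simple IQR filter on numeric columns.
    if drop_outliers:
        for col in df.select_dtypes(include="number").columns:
            q1, q3 = df[col].quantile([0.25, 0.75])
            df = df[df[col].between(q1 - 1.5 * (q3 - q1), q3 + 1.5 * (q3 - q1))]
        steps.append("removed IQR outliers")

    return df, steps


cleaned, steps = clean(pd.read_csv("sample_reviews.csv"), drop_outliers=True)
cleaned.to_parquet("cleaned.parquet")  # the same table can also be written as CSV or JSON
pd.DataFrame({"step": steps}).to_html("step_log.html", index=False)
```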

Commercial fit

Paid work follows the same constraints as the storefront Data Prep Sprint: a frozen export list you sign off, auditable artefacts your next KPI or automation step can ingest, and explicit exclusions rather than open-ended exploratory cleaning.

Handoff notes

JSON flattening stays one level deep by design. This toolkit pairs with the EDA project on this page (profile vs. fix). The deployment mirrors the repository's limits and validation logic.
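
As an illustration of the one-level rule, a sketch using pandas' json_normalize with max_level=1 (an assumption about the mechanism, not a quote of the repository code): first-level keys become columns, while deeper objects stay in a single raw column.

```python
import pandas as pd

# Hypothetical nested records; only the first level of nesting is expanded.
records = [
    {"id": 1, "user": {"name": "a", "address": {"city": "x", "zip": "10115"}}},
    {"id": 2, "user": {"name": "b", "address": {"city": "y", "zip": "75001"}}},
]

flat = pd.json_normalize(records, max_level=1)
print(flat.columns.tolist())
# ['id', 'user.name', 'user.address']  <- 'user.address' keeps the raw dict
```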

Repositories & demos

Public proof only; client deliverables stay under separate agreements.

Evidence id: data-cleaning
Closest storefront package: Data Prep Sprint

Cleaning, profiling, and clear documentation so the next dashboard, forecast, or model rests on inputs your team can explain and reproduce.

Stack & keywords
  • Streamlit
  • pandas
  • pytest
  • Parquet / JSON
  • Docker