Data cleaning toolkit
Interactive data demo
Upstream prerequisite, not a side utility: downstream models and dashboards ingest the same reviewed tables. Multi-format inputs become auditable CSV/Parquet/JSON outputs plus an HTML step log and before/after views, capped near 100K rows. Rules cover bad formats, duplicates, skewed categories, and optional outliers, with bundled sample datasets for dry runs.
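A minimal sketch of the rule pipeline described above, assuming a pandas-based implementation; the `clean` function, the `ROW_CAP` constant, and the `amount` column are hypothetical names for illustration, not the toolkit's actual API.

```python
import pandas as pd

ROW_CAP = 100_000  # mirrors the ~100K-row cap described above (assumed constant name)

def clean(df: pd.DataFrame) -> pd.DataFrame:
    """Hypothetical sketch: cap rows, fix bad formats, drop duplicates."""
    df = df.head(ROW_CAP).copy()
    # Bad formats: coerce a numeric column; invalid entries become NaN for review
    if "amount" in df.columns:
        df["amount"] = pd.to_numeric(df["amount"], errors="coerce")
    # Duplicates: drop exact duplicate rows
    return df.drop_duplicates().reset_index(drop=True)

demo = pd.DataFrame({"amount": ["10", "x", "10"], "cat": ["a", "b", "a"]})
cleaned = clean(demo)
# "x" is coerced to NaN; the repeated ("10", "a") row is dropped
```

In a real pipeline each rule would also append an entry to the HTML step log so the before/after views stay auditable.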

Commercial fit
Paid work follows the same constraints as the storefront Data Prep Sprint: a frozen export list you sign off on, auditable artefacts your next KPI or automation step can ingest, and explicit exclusions, rather than open-ended exploratory cleaning.
Handoff notes
JSON flattening stays at one level by design. The toolkit pairs with the EDA demo on this page: profile there, fix here. The deployed demo mirrors the repository's row limits and validation logic.
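The one-level flattening rule can be illustrated with `pandas.json_normalize` and its `max_level` parameter; this is a sketch of the behavior, not necessarily how the toolkit implements it, and the `record` data is invented for the example.

```python
import pandas as pd

# One-level flattening: nested keys one level down become dotted columns,
# while anything deeper stays as a raw dict, matching "one level by design".
record = {"id": 1, "user": {"name": "Ada", "address": {"city": "Oslo"}}}
flat = pd.json_normalize(record, max_level=1)
# columns: id, user.name, user.address; user.address keeps its nested dict
```

Keeping deeper structures intact avoids an explosion of sparse columns and leaves genuinely nested data for a later, deliberate pass.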