Data Extraction · gpt-4o

Clean and Normalize Data Records

Prompt
You are a data cleaning specialist. Standardize and normalize the provided records according to the rules specified.

**Normalization Rules:**
{{normalization_rules}}

**Raw Data Records:**
{{raw_data}}

For each record, apply the normalization rules and output:

1. **Cleaned Record**: The standardized data
2. **Changes Made**: What was modified and why
3. **Confidence**: HIGH/MEDIUM/LOW for each field
4. **Flags**: Any data quality issues that need human review

Common normalizations to apply unless otherwise specified:
- Names: Title case, remove extra spaces, split into first and last name
- Phones: E.164 format (e.g., +1XXXXXXXXXX for US numbers) or specified regional format
- Emails: Lowercase, validate format
- Addresses: Standard postal format, expand abbreviations
- Dates: ISO 8601 (YYYY-MM-DD) or specified format
- Company names: Remove legal suffixes for matching (Inc, LLC, Ltd)

If data is ambiguous or conflicting, flag it rather than guessing. Output in the same structure as input unless a different format is specified.
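
As a point of comparison for the model's output, several of the common normalizations listed above can be sketched deterministically in Python. This is a minimal sketch under stated assumptions, not part of the prompt: all function names are hypothetical, phones are assumed to be US numbers, slash-separated dates are assumed to be month-first, and the email check is a loose shape check rather than full validation. Following the prompt's own rule, each helper returns `None` (or sets a flag) instead of guessing when the input is ambiguous.

```python
import re
from datetime import datetime

# Assumption: only the three suffixes named in the prompt are stripped.
LEGAL_SUFFIXES = re.compile(r"[,\s]+(inc|llc|ltd)\.?$", re.IGNORECASE)
# Loose shape check only (something@something.tld), not RFC-level validation.
EMAIL_RE = re.compile(r"^[^@\s]+@[^@\s]+\.[^@\s]+$")
# Assumption: slash dates are US-style month/day/year.
DATE_FORMATS = ("%Y-%m-%d", "%m/%d/%Y")


def normalize_name(raw):
    """Collapse whitespace, title-case, and split into first/last."""
    parts = " ".join(raw.split()).title().split(" ")
    if len(parts) < 2:
        return {"first_name": parts[0], "last_name": "", "flag": "single-token name"}
    return {"first_name": parts[0], "last_name": " ".join(parts[1:]), "flag": None}


def normalize_us_phone(raw):
    """Return +1XXXXXXXXXX, or None if the digit count is not a US number."""
    digits = re.sub(r"\D", "", raw)
    if len(digits) == 11 and digits.startswith("1"):
        digits = digits[1:]
    if len(digits) != 10:
        return None  # ambiguous: flag for human review rather than guess
    return "+1" + digits


def normalize_email(raw):
    """Lowercase and trim; return None if the shape check fails."""
    email = raw.strip().lower()
    return email if EMAIL_RE.match(email) else None


def normalize_date(raw):
    """Return an ISO 8601 date string, or None if no known format matches."""
    for fmt in DATE_FORMATS:
        try:
            return datetime.strptime(raw.strip(), fmt).date().isoformat()
        except ValueError:
            continue
    return None


def company_match_key(raw):
    """Strip a trailing legal suffix and lowercase, for matching only."""
    name = " ".join(raw.split())
    return LEGAL_SUFFIXES.sub("", name).strip().lower()


if __name__ == "__main__":
    # Made-up sample values for illustration.
    print(normalize_name("  john   SMITH "))
    print(normalize_us_phone("(555) 123-4567"))
    print(normalize_email(" John.Smith@Example.COM "))
    print(normalize_date("03/15/2024"))
    print(company_match_key("Acme, Inc."))
```

Helpers like these can serve as a spot check: fields where the rule-based result and the model's cleaned record disagree are natural candidates for the **Flags** output.
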
Example

Input

Rules: US phone format, title case names...

Output

**Cleaned Record**: {"first_name": "John", "last_name": "Smith"...
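
The `{{normalization_rules}}` and `{{raw_data}}` placeholders in the prompt must be filled in before it is sent to a model. The page does not say how that substitution is done, so the following is only one possible sketch: `render_prompt` is a hypothetical helper, the template string is an abbreviated excerpt of the prompt, and the rules and records are made-up sample values. It substitutes in a single pass so that placeholder-like text inside the data is not expanded a second time, and it raises an error on any placeholder that has no value.

```python
import json
import re

# Abbreviated excerpt of the prompt; the full text would be used in practice.
PROMPT_TEMPLATE = (
    "You are a data cleaning specialist.\n\n"
    "**Normalization Rules:**\n{{normalization_rules}}\n\n"
    "**Raw Data Records:**\n{{raw_data}}\n"
)

PLACEHOLDER = re.compile(r"\{\{(\w+)\}\}")


def render_prompt(template, values):
    """Replace each {{name}} with values[name] in one pass."""
    def substitute(match):
        key = match.group(1)
        if key not in values:
            raise KeyError(f"no value supplied for placeholder {key!r}")
        return values[key]
    return PLACEHOLDER.sub(substitute, template)


# Made-up sample inputs for illustration.
rules = ["US phone format", "Title case names"]
records = [{"name": "  john   SMITH ", "phone": "(555) 123-4567"}]

prompt = render_prompt(
    PROMPT_TEMPLATE,
    {
        "normalization_rules": "\n".join(f"- {rule}" for rule in rules),
        # One JSON object per line keeps record boundaries unambiguous.
        "raw_data": "\n".join(json.dumps(r, ensure_ascii=False) for r in records),
    },
)

if __name__ == "__main__":
    print(prompt)
```

Serializing each record as one JSON object per line is a design choice that pairs with the prompt's instruction to output in the same structure as the input, which makes the cleaned records easier to parse afterwards.
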