The workflow improves deferment data quality by using an LLM-powered engine that reads and interprets free-form operator comments. For each entry, it identifies the most likely true deferment cause, drawing on the cause definitions, a list of relevant industry abbreviations, and the asset's context. Each proposed cause comes with a justification and a confidence rating (high, medium, or low), and only high-confidence interpretations are applied back into the dataset.
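The classification step described above can be sketched as a prompt builder plus a strict validator for the model's reply. This is a minimal illustration, not the workflow's actual implementation: the function names `build_prompt` and `parse_response`, the prompt wording, and the JSON reply shape (`cause`, `confidence`, `justification`) are all assumptions, and the LLM call itself is left out.

```python
import json

# Confidence labels named in the workflow description.
CONFIDENCE_LEVELS = {"high", "medium", "low"}


def build_prompt(comment, cause_definitions, abbreviations):
    """Assemble one classification prompt (hypothetical wording).

    cause_definitions: dict mapping cause name -> definition text
    abbreviations:     dict mapping abbreviation -> expansion
    """
    causes = "\n".join(
        f"- {name}: {text}" for name, text in sorted(cause_definitions.items())
    )
    abbrevs = "\n".join(
        f"- {short} = {full}" for short, full in sorted(abbreviations.items())
    )
    return (
        "Classify the deferment cause for the operator comment below.\n"
        f"Allowed causes:\n{causes}\n"
        f"Abbreviations:\n{abbrevs}\n"
        f"Comment: {comment}\n"
        "Reply with a JSON object containing the keys "
        "cause, confidence (high, medium or low) and justification."
    )


def parse_response(raw, cause_definitions):
    """Return the parsed reply, or None if it is not usable.

    A reply is rejected when it is not a JSON object, names a cause that
    is not in the definitions, or carries an unknown confidence label.
    """
    try:
        data = json.loads(raw)
    except json.JSONDecodeError:
        return None
    if not isinstance(data, dict):
        return None
    cause = data.get("cause")
    confidence = data.get("confidence")
    if not isinstance(cause, str) or cause not in cause_definitions:
        return None
    if not isinstance(confidence, str) or confidence not in CONFIDENCE_LEVELS:
        return None
    return data
```

Rejecting any reply that falls outside the defined causes or confidence labels keeps malformed model output from reaching the dataset, which matters when only high-confidence results are applied automatically.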
Once the corrections are integrated, the workflow generates an HTML report that shows how many events were reclassified, how deferment volumes shifted between causes, and where accuracy improved most. The result is a more consistent, audit-ready dataset that reduces manual review effort and makes deferment analytics more reliable across the business.
| Step | Inputs | Function | Outputs |
|---|---|---|---|
| 1. LLM-powered Comments-to-Cause Classification | Comments.csv (free text)<br>defer_cause_definitions.json (EC manual definitions)<br>Abbreviations.json (domain-specific terms) | Use LLM to classify causes based on Comments, Cause definitions and Abbreviations list. Provide justification and confidence (high/medium/low) for proposed Causes. | Comments_Cause_Fixed.csv |
| 2. Generate Improved Cause Classification Dataset & Report | Deferment_dataset_preprocess.csv (base data)<br>Comments_Cause_Fixed.csv | Integrate corrected Causes into dataset where Cause changed and confidence = high. Generate resulting deferments table and HTML report with changes overview. | Deferment_dataset_Cause-fixed.csv<br>Deferment_Cause_Reclassification_Report.html |
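
The integration rule in step 2 (apply a proposed cause only where it differs from the current one and confidence is high) can be sketched as follows. The record fields `event_id`, `cause`, `proposed_cause`, `confidence`, and `volume` are hypothetical, since the actual CSV column names are not given here, and the function name `apply_corrections` is likewise illustrative.

```python
def apply_corrections(base_rows, proposals):
    """Apply high-confidence cause changes and summarise their effect.

    base_rows: list of dicts with (assumed) keys event_id, cause, volume
    proposals: list of dicts with (assumed) keys event_id, proposed_cause,
               confidence
    Returns (updated_rows, summary); the input rows are not modified.
    """
    proposal_by_id = {p["event_id"]: p for p in proposals}
    updated_rows = []
    reclassified = 0
    volume_shift = {}  # cause -> net deferred volume gained or lost

    for row in base_rows:
        new_row = dict(row)  # copy so the base data stays untouched
        proposal = proposal_by_id.get(row["event_id"])
        if (
            proposal is not None
            and proposal["confidence"] == "high"
            and proposal["proposed_cause"] != row["cause"]
        ):
            old_cause = row["cause"]
            new_cause = proposal["proposed_cause"]
            volume = float(row["volume"])
            new_row["cause"] = new_cause
            reclassified += 1
            # Volume moves out of the old cause and into the new one.
            volume_shift[old_cause] = volume_shift.get(old_cause, 0.0) - volume
            volume_shift[new_cause] = volume_shift.get(new_cause, 0.0) + volume
        updated_rows.append(new_row)

    summary = {"reclassified": reclassified, "volume_shift": volume_shift}
    return updated_rows, summary
```

The returned summary holds the kind of figures a changes overview needs (count of reclassified events and net volume moved per cause); rendering them into the HTML report is outside this sketch.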