LLM-ready dataset augmented with chunking, paraphrasing, back-translation, self-consistency, and counterfactuals.
Data processing. Derived from a 500k medical knowledge mix.
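
As a rough illustration of what the chunking step could look like, here is a minimal sketch that assumes fixed-size word windows with overlap. The function name `chunk_words` and the `size` and `overlap` parameters are hypothetical; the pipeline actually used to build this dataset is not described here and may differ.

```python
from typing import List


def chunk_words(text: str, size: int = 200, overlap: int = 50) -> List[str]:
    """Split text into windows of `size` words; neighbours share `overlap` words.

    Hypothetical helper for illustration only, not the dataset's real pipeline.
    """
    if size <= 0 or not 0 <= overlap < size:
        raise ValueError("need size > 0 and 0 <= overlap < size")
    words = text.split()
    step = size - overlap  # how far the window advances each iteration
    chunks = []
    for start in range(0, len(words), step):
        chunks.append(" ".join(words[start:start + size]))
        if start + size >= len(words):
            break  # this window already reaches the end of the text
    return chunks
```

Overlapping windows keep some shared context between neighbouring chunks, which can help when a sentence straddles a chunk boundary. For example, `chunk_words(record_text, size=200, overlap=50)` would yield 200-word chunks that each repeat the last 50 words of the previous one.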