Image samples with identical or highly similar content and pixels in a dataset. Duplicate images cause the model to overfit related samples and distort evaluation results, requiring detection and removal through methods such as image hashing and feature comparison.





