This collection contains Turkish multimodal datasets suitable for the Image-Text-to-Text task.