## Download the COCO captions, RefCOCO, RefCOCO+, RefCOCOg, Visual Genome, TextCaps, LLaVA, AOK-VQA, OK-VQA, OCR-VQA, filtered Flickr-30k, multi-task conversation, and Unnatural Instruction datasets

### Download the dataset

Image source | Download path
--- | :---:
COCO 2014 images | images &nbsp;&nbsp; captions
COCO VQA | vqa train &nbsp;&nbsp; vqa val
Visual Genome | images part1 &nbsp;&nbsp; images part2
TextCaps | images &nbsp;&nbsp; annotations
RefCOCO | annotations
RefCOCO+ | annotations
RefCOCOg | annotations
LLaVA | Complex reasoning &nbsp;&nbsp; Detailed description &nbsp;&nbsp; Conversation
OKVQA | annotations
AOK-VQA | annotations
OCR-VQA | annotations
Filtered Flickr-30k | annotations
Multi-task conversation | annotations
Filtered unnatural instruction | annotations

### COCO captions

Download the COCO 2014 images and captions.

```
${MINIGPTv2_DATASET}
├── coco_captions
│   ├── coco_images
│   └── annotations
│       ├── coco_karpathy_train.json
│       ...
...
```

Set **image_path** to the COCO 2014 image folder.
Similarly, set **ann_path** to the coco_karpathy_train.json path.

- [minigpt4/configs/datasets/coco/caption.yaml](../minigpt4/configs/datasets/coco/caption.yaml)

### COCO VQA

Download the VQA v2 train and validation JSON files.

```
${MINIGPTv2_DATASET}
├── vqav2
│   ├── vqa_train.json
│   └── vqa_val.json
```

Set **image_path** to the COCO 2014 image folder.
Similarly, set **ann_path** to the vqa_train.json and vqa_val.json paths.

- [minigpt4/configs/datasets/coco/defaults_vqa.yaml](../minigpt4/configs/datasets/coco/defaults_vqa.yaml)

### Visual Genome

Download the Visual Genome images and annotation files.

```
${MINIGPTv2_DATASET}
├── visual_genome
│   ├── VG_100K
│   ├── VG_100K_2
│   ├── region_descriptions.json
│   ...
...
```

Set both **image_path** and **ann_path** to the visual_genome folder.

- [minigpt4/configs/datasets/vg/ref.yaml](../minigpt4/configs/datasets/vg/ref.yaml)

### TextCaps

Download the TextCaps images and annotation files.

```
${MINIGPTv2_DATASET}
├── TextCaps
│   ├── train_images
│   └── TextCaps_0.1_train.json
...
```

Set **image_path** to the TextCaps train_images folder.
Similarly, set **ann_path** to the TextCaps_0.1_train.json path.

- [minigpt4/configs/datasets/textcaps/caption.yaml](../minigpt4/configs/datasets/textcaps/caption.yaml)

### RefCOCO, RefCOCO+, RefCOCOg

Download the RefCOCO, RefCOCO+, and RefCOCOg annotation files.

```
${MINIGPTv2_DATASET}
├── refcoco_annotations
│   ├── refcoco
│   │   ├── instances.json
│   │   ├── refs(google).p
│   │   └── refs(unc).p
│   ├── refcoco+
│   │   ├── instances.json
│   │   └── refs(unc).p
│   └── refcocog
│       ├── instances.json
│       ├── refs(google).p
│       └── refs(umd).p
...
```

Set **image_path** to the COCO 2014 image folder.
Similarly, set **ann_path** in all of the following configs to the *refcoco_annotations* folder above, which contains refcoco, refcoco+, and refcocog.

- [minigpt4/configs/datasets/coco_bbox/refcoco.yaml](../minigpt4/configs/datasets/coco_bbox/refcoco.yaml)
- [minigpt4/configs/datasets/coco_bbox/refcocog.yaml](../minigpt4/configs/datasets/coco_bbox/refcocog.yaml)
- [minigpt4/configs/datasets/coco_bbox/refcocop.yaml](../minigpt4/configs/datasets/coco_bbox/refcocop.yaml)
- [minigpt4/configs/datasets/coco_bbox/invrefcoco.yaml](../minigpt4/configs/datasets/coco_bbox/invrefcoco.yaml)
- [minigpt4/configs/datasets/coco_bbox/invrefcocog.yaml](../minigpt4/configs/datasets/coco_bbox/invrefcocog.yaml)
- [minigpt4/configs/datasets/coco_bbox/invrefcocop.yaml](../minigpt4/configs/datasets/coco_bbox/invrefcocop.yaml)
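Once the files are in place, it can be worth sanity-checking them before editing any config. A minimal sketch, assuming the dataset root below (a placeholder) and the standard REFER annotation format, which these pickle and JSON files normally follow:

```python
import json
import pickle
from pathlib import Path

# Placeholder: point this at your own ${MINIGPTv2_DATASET} root.
root = Path("/path/to/MINIGPTv2_DATASET/refcoco_annotations")

# refs(unc).p is a pickled list of referring expressions; instances.json
# is a COCO-style annotation file with an "annotations" list of boxes.
with open(root / "refcoco" / "refs(unc).p", "rb") as f:
    refs = pickle.load(f)
with open(root / "refcoco" / "instances.json") as f:
    instances = json.load(f)

print(f"{len(refs)} referring expressions, {len(instances['annotations'])} boxes")
# Each ref carries its raw sentences (field names assume the REFER schema).
print(refs[0]["sentences"][0]["sent"])
```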
### LLaVA

Download the LLaVA annotation files.

```
${MINIGPTv2_DATASET}
├── llava
│   ├── conversation_58k.json
│   ├── detail_23k.json
│   └── complex_reasoning_77k.json
...
```

Set **image_path** to the COCO 2014 image folder.
Similarly, set **ann_path** to the locations of the previously downloaded conversation_58k.json, detail_23k.json, and complex_reasoning_77k.json in conversation.yaml, detail.yaml, and reason.yaml, respectively.

- [minigpt4/configs/datasets/llava/conversation.yaml](../minigpt4/configs/datasets/llava/conversation.yaml)
- [minigpt4/configs/datasets/llava/detail.yaml](../minigpt4/configs/datasets/llava/detail.yaml)
- [minigpt4/configs/datasets/llava/reason.yaml](../minigpt4/configs/datasets/llava/reason.yaml)

### OKVQA

Download the OK-VQA questions and annotations:

- [OK-VQA Input Questions](https://okvqa.allenai.org/static/data/OpenEnded_mscoco_train2014_questions.json.zip)
- [OK-VQA Annotations](https://okvqa.allenai.org/static/data/mscoco_train2014_annotations.json.zip)

```
${MINIGPTv2_DATASET}
├── OKVQA
│   ├── okvqa_train.json
│   ...
...
```

Set **image_path** to the COCO 2014 image folder.
Similarly, set **ann_path** to the location of the OKVQA dataset.

- [minigpt4/configs/datasets/okvqa/defaults.yaml](../minigpt4/configs/datasets/okvqa/defaults.yaml)

### AOK-VQA

Download the AOK-VQA annotation dataset.

```
export AOKVQA_DIR=YOUR_DATASET_PATH
mkdir -p ${AOKVQA_DIR}
curl -fsSL https://prior-datasets.s3.us-east-2.amazonaws.com/aokvqa/aokvqa_v1p0.tar.gz | tar xvz -C ${AOKVQA_DIR}
```

```
${MINIGPTv2_DATASET}
├── AOKVQA
│   ├── aokvqa_v1p0_train.json
│   ...
...
```

Set **image_path** to the COCO 2014 image folder.
Similarly, set **ann_path** to the location of the AOKVQA dataset.

- [minigpt4/configs/datasets/aokvqa/defaults.yaml](../minigpt4/configs/datasets/aokvqa/defaults.yaml)

### OCR-VQA

Download the OCR-VQA annotation files.

```
${MINIGPTv2_DATASET}
├── OCR-VQA
│   ├── images
│   ├── dataset.json
│   ...
...
```

Set **image_path** to the OCR-VQA image folder.
Similarly, set **ann_path** to the OCR-VQA dataset.json path.

- [minigpt4/configs/datasets/ocrvqa/ocrvqa.yaml](../minigpt4/configs/datasets/ocrvqa/ocrvqa.yaml)

### Filtered Flickr-30k

Download the filtered Flickr-30k images and annotation files.

```
${MINIGPTv2_DATASET}
├── filtered_flickr
│   ├── images
│   ├── captiontobbox.json
│   ├── groundedcaption.json
│   └── phrasetobbox.json
...
```

Set **image_path** to the Flickr-30k images folder.
Similarly, set **ann_path** to groundedcaption.json, captiontobbox.json, and phrasetobbox.json for the grounded image caption, caption-to-bbox, and phrase-to-bbox datasets, respectively.

- [minigpt4/configs/datasets/flickr/default.yaml](../minigpt4/configs/datasets/flickr/default.yaml)
- [minigpt4/configs/datasets/flickr/caption_to_phrase.yaml](../minigpt4/configs/datasets/flickr/caption_to_phrase.yaml)
- [minigpt4/configs/datasets/flickr/object_to_phrase.yaml](../minigpt4/configs/datasets/flickr/object_to_phrase.yaml)
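Since these three annotation files feed three separate grounding configs, a quick, schema-agnostic check that each one parses can catch a bad download early. A minimal sketch (the root path below is a placeholder for your own dataset location):

```python
import json
from pathlib import Path

# Placeholder: point this at your own ${MINIGPTv2_DATASET} root.
flickr_root = Path("/path/to/MINIGPTv2_DATASET/filtered_flickr")

# Each file backs one of the three configs listed above; this only
# verifies that the JSON parses and reports how many entries it holds.
for name in ("groundedcaption.json", "captiontobbox.json", "phrasetobbox.json"):
    with open(flickr_root / name) as f:
        data = json.load(f)
    print(f"{name}: {len(data)} top-level entries")
```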
### Multi-task conversation

Download the multi-task conversation dataset.

```
${MINIGPTv2_DATASET}
├── multitask_conversation
│   └── multitask_conversation.json
...
```

Set **image_path** to the COCO 2014 image folder.
Similarly, set **ann_path** to the multitask_conversation.json file path.

- [minigpt4/configs/datasets/multitask_conversation/default.yaml](../minigpt4/configs/datasets/multitask_conversation/default.yaml)

### Unnatural instruction

Download the filtered unnatural instruction annotation files (we removed the very long sentences from the original Unnatural Instructions dataset).

```
${MINIGPTv2_DATASET}
├── unnatural-instructions
│   └── filtered_unnatural_instruction.json
...
```

This dataset is text-only, so there is no **image_path** to set.
Set **ann_path** to the filtered_unnatural_instruction.json file path.

- [minigpt4/configs/datasets/nlp/unnatural_instruction.yaml](../minigpt4/configs/datasets/nlp/unnatural_instruction.yaml)
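With everything downloaded, one pass over the full layout catches missing or misplaced files before finetuning starts. A minimal check script, assuming the directory names shown in the trees above (MINIGPTv2_DATASET is this guide's placeholder for the dataset root, read here from an environment variable of the same name):

```python
import os
from pathlib import Path

# MINIGPTv2_DATASET is the placeholder root used throughout this guide.
root = Path(os.environ.get("MINIGPTv2_DATASET", "/path/to/MINIGPTv2_DATASET"))

# One representative annotation file per dataset, taken from the trees above.
expected = [
    "coco_captions/annotations/coco_karpathy_train.json",
    "vqav2/vqa_train.json",
    "vqav2/vqa_val.json",
    "visual_genome/region_descriptions.json",
    "TextCaps/TextCaps_0.1_train.json",
    "refcoco_annotations/refcoco/instances.json",
    "refcoco_annotations/refcoco+/instances.json",
    "refcoco_annotations/refcocog/instances.json",
    "llava/conversation_58k.json",
    "llava/detail_23k.json",
    "llava/complex_reasoning_77k.json",
    "OKVQA/okvqa_train.json",
    "AOKVQA/aokvqa_v1p0_train.json",
    "OCR-VQA/dataset.json",
    "filtered_flickr/groundedcaption.json",
    "multitask_conversation/multitask_conversation.json",
    "unnatural-instructions/filtered_unnatural_instruction.json",
]

missing = [p for p in expected if not (root / p).exists()]
if missing:
    print("Missing files:\n" + "\n".join(missing))
else:
    print("All expected annotation files are in place.")
```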