
Download the COCO captions, RefCOCO, RefCOCO+, RefCOCOg, Visual Genome, TextCaps, LLaVA, GQA, AOK-VQA, OK-VQA, OCR-VQA, filtered Flickr-30k, multi-task conversation, and Unnatural Instruction datasets.

### COCO captions

### RefCOCO, RefCOCO+, RefCOCOg

Make sure you have the COCO 2014 images first.

Then, download the RefCOCO, RefCOCO+, and RefCOCOg annotation files from the following links.

Unzip these files to a location of your choice. It should have a structure like the following:

```
Location_you_like
├── refcoco
│   ├── instances.json
│   ├── refs(google).p
│   └── refs(unc).p
├── refcoco+
│   ├── instances.json
│   └── refs(unc).p
└── refcocog
    ├── instances.json
    ├── refs(google).p
    └── refs(umd).p
```
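
A minimal shell sketch of the unzip step, assuming the three annotation zips have already been downloaded into the current directory and that each archive unpacks into its own refcoco*/ subfolder as shown above (the target path is a placeholder):

```bash
# Sketch only: unpack the RefCOCO, RefCOCO+, and RefCOCOg annotation zips
# into one shared folder ("Location_you_like" above).
REF_ANN_DIR=/path/to/Location_you_like   # placeholder, pick any location

mkdir -p "${REF_ANN_DIR}"
for f in refcoco refcoco+ refcocog; do
    unzip "${f}.zip" -d "${REF_ANN_DIR}"
done
```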

Set image_path in each of the RefCOCO, RefCOCO+, and RefCOCOg dataset configuration files to the COCO 2014 image folder. Similarly, set ann_path in those configs to the folder above (Location_you_like) that contains refcoco, refcoco+, and refcocog.
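
If you prefer to script the config edit, here is a hedged sketch; the yaml filename below is a placeholder for each of the RefCOCO-related configs, and it assumes the keys are literally `image_path` and `ann_path` as described above:

```bash
# Sketch only: point one RefCOCO-style dataset config at the local paths.
COCO2014_IMAGES=/path/to/coco/train2014    # placeholder
REF_ANN_DIR=/path/to/Location_you_like     # placeholder
CFG=path/to/refcoco.yaml                   # placeholder config file

sed -i "s#image_path:.*#image_path: ${COCO2014_IMAGES}#" "${CFG}"
sed -i "s#ann_path:.*#ann_path: ${REF_ANN_DIR}#" "${CFG}"
```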

### Visual Genome

### TextCaps

### LLaVA

Make sure you have the COCO 2014 images first.

Download the LLaVA annotation files from the following links to a location of your choice.

Set image_path in conversation.yaml, detail.yaml, and reason.yaml to the COCO 2014 image folder. Similarly, set ann_path in each file to the location of the previously downloaded conversation_58k.json, detail_23k.json, and complex_reasoning_77k.json, respectively.
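
A small sanity-check sketch before editing the yaml files; LLAVA_ANN_DIR below is a placeholder for wherever you saved the three annotation files:

```bash
# Sketch only: confirm the downloaded LLaVA annotation files are in place.
#   conversation_58k.json      -> ann_path in conversation.yaml
#   detail_23k.json            -> ann_path in detail.yaml
#   complex_reasoning_77k.json -> ann_path in reason.yaml
LLAVA_ANN_DIR=/path/to/llava_annotations   # placeholder

for f in conversation_58k.json detail_23k.json complex_reasoning_77k.json; do
    [ -f "${LLAVA_ANN_DIR}/${f}" ] || echo "missing: ${LLAVA_ANN_DIR}/${f}"
done
```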

### OKVQA

### AOK-VQA

```bash
export AOKVQA_DIR=YOUR_DATASET_PATH
mkdir -p ${AOKVQA_DIR}
curl -fsSL https://prior-datasets.s3.us-east-2.amazonaws.com/aokvqa/aokvqa_v1p0.tar.gz | tar xvz -C ${AOKVQA_DIR}
```

### OCR-VQA

### filtered Flickr-30k

### Multi-task conversation

### Unnatural instruction