MiniGPT-4/README_MINIGPTv2_FINETUNE.md at 45a97de8cc02e1b85bb92bf5a194a2569a0a5ada

chris/MiniGPT-4

mirror of https://github.com/Vision-CAIR/MiniGPT-4.git synced 2025-04-05 02:20:47 +00:00

Deyao Zhu 45a97de8cc remove unused pretrain set in the finetune readme

2023-10-23 21:49:33 +03:00

Download the COCO captions, RefCOCO, RefCOCO+, RefCOCOg, Visual Genome, TextCaps, LLaVA, GQA, AOK-VQA, OK-VQA, OCR-VQA, filtered Flickr-30k, multi-task conversation, and Unnatural instruction datasets

COCO captions

train2017

Visual genome

part1, part2

TextCaps

RefCOCO, RefCOCO+, RefCOCOg

Make sure you have the COCO 2014 images first.

Then download the RefCOCO, RefCOCO+, and RefCOCOg annotation files from the following links.

Unzip these files to a location of your choice. The resulting folder should have the following structure:

Location_you_like
├── refcoco
│   ├── instances.json
│   ├── refs(google).p
│   └── refs(unc).p
├── refcoco+
│   ├── instances.json
│   └── refs(unc).p
└── refcocog
    ├── instances.json
    ├── refs(google).p
    └── refs(umd).p
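
Before editing any configs, it can help to confirm that the unzipped folder matches the tree above. The following is a minimal sketch, not part of the repository: the helper name `missing_files` and the `EXPECTED` table are made up for illustration, and the file names are copied from the tree.

```python
from pathlib import Path

# Expected files per dataset folder, copied from the directory tree above.
EXPECTED = {
    "refcoco": ["instances.json", "refs(google).p", "refs(unc).p"],
    "refcoco+": ["instances.json", "refs(unc).p"],
    "refcocog": ["instances.json", "refs(google).p", "refs(umd).p"],
}


def missing_files(root):
    """Return the relative paths of expected files that are absent under root."""
    root = Path(root)
    return [
        f"{folder}/{name}"
        for folder, names in EXPECTED.items()
        for name in names
        if not (root / folder / name).is_file()
    ]
```

Calling `missing_files("Location_you_like")` returns an empty list when every file is in place, and otherwise lists what still needs to be downloaded or unzipped.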

Set image_path in all the following dataset configuration files to the COCO 2014 image folder. Similarly, set ann_path in all the following configs to the above folder (Location_you_like) that contains refcoco, refcoco+, and refcocog.
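
If several config files need the same two values, the edit can be scripted. The sketch below is only an illustration under assumptions: it treats a config as plain text and rewrites any line whose key is `image_path` or `ann_path`, the two keys named above. The function name `set_paths` and the sample fragment are hypothetical, and the real config files may be laid out differently, so review the result before saving it.

```python
import re


def set_paths(yaml_text, image_path, ann_path):
    """Replace the values of the image_path and ann_path keys, keeping indentation."""
    for key, value in (("image_path", image_path), ("ann_path", ann_path)):
        pattern = re.compile(rf"^([ \t]*{key}[ \t]*:).*$", flags=re.MULTILINE)
        # A function replacement avoids backslash-escape surprises in paths.
        yaml_text = pattern.sub(lambda m, v=value: f"{m.group(1)} {v}", yaml_text)
    return yaml_text


# Hypothetical config fragment, for illustration only.
sample = """build_info:
  image_path: /path/to/coco/images
  ann_path: /path/to/annotations
"""

updated = set_paths(sample, "/data/coco2014", "/data/Location_you_like")
```

Here `/data/coco2014` and `/data/Location_you_like` are placeholder paths. The `ann_path` value should point at the folder that contains `refcoco`, `refcoco+`, and `refcocog`, not at one of those subfolders.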

Visual Genome

textcaps

LLaVA

Make sure you have the COCO 2014 images first.

Download the LLaVA annotation files from the following link to a location of your choice.

Set image_path in all the following dataset configuration files to the COCO 2014 image folder. Similarly, set ann_path to the location of the previously downloaded conversation_58k.json, detail_23k.json, and complex_reasoning_77k.json in conversation.yaml, detail.yaml, and reason.yaml, respectively.
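
The pairing of config files and annotation files is easy to mix up, so here is a small sketch that spells it out. The file names come from the paragraph above. The helper `llava_ann_paths` and the download folder passed to it are hypothetical.

```python
from pathlib import Path

# Config file -> annotation file, as paired in the text above.
LLAVA_ANNOTATIONS = {
    "conversation.yaml": "conversation_58k.json",
    "detail.yaml": "detail_23k.json",
    "reason.yaml": "complex_reasoning_77k.json",
}


def llava_ann_paths(download_dir):
    """Map each config file name to the ann_path value it should be given."""
    download_dir = Path(download_dir)
    return {
        config: str(download_dir / annotation)
        for config, annotation in LLAVA_ANNOTATIONS.items()
    }
```

For example, `llava_ann_paths("/data/llava")` yields one ann_path per config file. The image_path value is the same COCO 2014 image folder in all three.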

OKVQA

AOK-VQA

OCR-VQA

filtered Flickr-30k

Multi-task conversation

Unnatural instruction
