mirror of
https://github.com/Vision-CAIR/MiniGPT-4.git
synced 2025-04-04 18:10:47 +00:00
add minigptv2_train md
parent eb560920e0
commit 7073c19bb3
3
.gitignore
vendored
@@ -180,4 +180,5 @@ jobs/
slurm*
sbatch_generate*
eval_data/
dataset/Evaluation.md
dataset/Evaluation.md
jupyter_notebook.slurm
@@ -1,22 +1,21 @@
## Finetune of MiniGPT-4

The training of MiniGPT-4 contains two alignment stages.

**1. First pretraining stage**
First, prepare the dataset by following
our [dataset preparation](dataset/README_MINIGPTv2_FINETUNE.md).

In train_configs/minigptv2_finetune.yaml, set the following paths:
llama_model checkpoint path: "/path/to/llama_checkpoint"
ckpt: "/path/to/pretrained_checkpoint"
ckpt save path: "/path/to/save_checkpoint"

For ckpt, you may load from our pretrained model checkpoints:
| MiniGPT-v2 (after stage-2) | MiniGPT-v2 (after stage-3) | MiniGPT-v2 (developing model (online demo)) |
|------------------------------|------------------------------|------------------------------|
| [Download](https://drive.google.com/file/d/1Vi_E7ZtZXRAQcyz4f8E6LtLh2UXABCmu/view?usp=sharing) | [Download](https://drive.google.com/file/d/1jAbxUiyl04SFJMN4sF1vvUU69Etuz4qa/view?usp=sharing) | [Download](https://drive.google.com/file/d/1aVbfW7nkCSYx99_vCRyP1sOlQiWVSnAl/view?usp=sharing) |

In the first pretraining stage, the model is trained using image-text pairs from the Laion and CC datasets
to align the vision and language model. To download and prepare the datasets, please check
our [first stage dataset preparation instructions](dataset/README_1_STAGE.md).
After the first stage, the visual features are mapped and can be understood by the language
model.
To launch the first stage training, run the following command. In our experiments, we use 4 A100 GPUs.
You can change the save path in the config file
[train_configs/minigpt4_stage1_pretrain.yaml](train_configs/minigpt4_stage1_pretrain.yaml)

```bash
torchrun --nproc-per-node NUM_GPU train.py --cfg-path train_configs/minigpt4_stage1_pretrain.yaml
torchrun --nproc-per-node NUM_GPU train.py --cfg-path train_configs/minigptv2_finetune.yaml
```

A MiniGPT-4 checkpoint with only stage one training can be downloaded
[here (13B)](https://drive.google.com/file/d/1u9FRRBB3VovP1HxCAlpD9Lw4t4P6-Yq8/view?usp=share_link) or [here (7B)](https://drive.google.com/file/d/1HihQtCEXUyBM1i9DQbaK934wW3TZi-h5/view?usp=share_link).
Compared to the model after stage two, this checkpoint frequently generates incomplete and repeated sentences.

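Note on the launch commands in the hunk above: they leave `NUM_GPU` as a placeholder. As a minimal sketch that is not part of this commit, the helper below fills in the placeholder and prints the resulting command without running it. The function name `build_launch_cmd` is hypothetical, and the GPU count of 4 is only an example value borrowed from the stage-one note in the same hunk.

```bash
# Hypothetical helper (not in the repository): prints the finetune launch
# command for a given GPU count instead of executing it.
build_launch_cmd() {
    local num_gpu="$1"
    # Default to the finetune config named in the diff.
    local cfg="${2:-train_configs/minigptv2_finetune.yaml}"
    # Reject empty values, non-digits, and a literal 0.
    case "$num_gpu" in
        ''|*[!0-9]*|0)
            echo "NUM_GPU must be a positive integer" >&2
            return 1
            ;;
    esac
    printf 'torchrun --nproc-per-node %s train.py --cfg-path %s\n' "$num_gpu" "$cfg"
}

# Example value only: 4 GPUs.
build_launch_cmd 4
```

Printing the command first makes it easy to check the config path before starting a multi-GPU job; the printed line can then be run as-is from the repository root.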
@@ -1,35 +0,0 @@
#!/bin/bash -l
#SBATCH --ntasks=1
#SBATCH --cpus-per-task=6
#SBATCH --gres=gpu:1
#SBATCH --reservation=A100
#SBATCH --mem=32GB
#SBATCH --time=4:00:00
#SBATCH --partition=batch
##SBATCH --account=conf-iclr-2023.09.29-elhosemh

# Load environment which has Jupyter installed. It can be one of the following:
# - Machine Learning module installed on the system (module load machine_learning)
# - your own conda environment on Ibex
# - a singularity container with python environment (conda or otherwise)

module load machine_learning

# get tunneling info
export XDG_RUNTIME_DIR="" node=$(hostname -s)
user=$(whoami)
submit_host=${SLURM_SUBMIT_HOST}
port=10035
echo $node pinned to port $port
# print tunneling instructions

echo -e "
To connect to the compute node ${node} on IBEX running your jupyter notebook server, you need to run following two commands in a terminal 1.
Command to create ssh tunnel from you workstation/laptop to glogin:

ssh -L ${port}:${node}:${port} ${user}@glogin.ibex.kaust.edu.sa

Copy the link provided below by jupyter-server and replace the NODENAME with localhost before pasting it in your browser on your workstation/laptop "

# Run Jupyter
jupyter notebook --no-browser --port=${port} --port-retries=50 --ip=${node}
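Note on the deleted script above: it prints two manual steps, opening an SSH tunnel and replacing the node name in the Jupyter URL with localhost. The sketch below, which is not part of this commit, shows both steps as small shell functions. The names `tunnel_cmd` and `local_url` are hypothetical, and the node `gpu101`, user `alice`, and token `abc` are made-up example values; only the login host and port come from the script.

```bash
# Hypothetical helper: prints the ssh tunnel command that the script echoes.
tunnel_cmd() {
    local port="$1" node="$2" user="$3"
    # Login host taken from the deleted script; override with a 4th argument.
    local login_host="${4:-glogin.ibex.kaust.edu.sa}"
    printf 'ssh -L %s:%s:%s %s@%s\n' "$port" "$node" "$port" "$user" "$login_host"
}

# Hypothetical helper: replaces the first occurrence of the node name in a
# Jupyter URL with "localhost"; returns the URL unchanged if it is absent.
local_url() {
    local url="$1" node="$2"
    case "$url" in
        *"$node"*)
            # Text before and after the first occurrence of the node name.
            printf '%s%s%s\n' "${url%%"$node"*}" "localhost" "${url#*"$node"}"
            ;;
        *)
            printf '%s\n' "$url"
            ;;
    esac
}

# Example values only (node, user, and token are invented).
tunnel_cmd 10035 gpu101 alice
local_url "http://gpu101:10035/?token=abc" gpu101
```

The first command is run on the workstation or laptop; the rewritten URL is what would be pasted into the local browser once the tunnel is open.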