diff --git a/PrepareVicuna.md b/PrepareVicuna.md
new file mode 100644
index 0000000..89372cd
--- /dev/null
+++ b/PrepareVicuna.md
@@ -0,0 +1,30 @@
+## How to Prepare Vicuna Weights
+Vicuna is an open-source LLaMA-based LLM with performance close to that of ChatGPT.
+We currently use the v0 version of Vicuna-13B.
+
+To prepare Vicuna’s weights, first download Vicuna’s **delta** weights from [https://huggingface.co/lmsys/vicuna-13b-delta-v0](https://huggingface.co/lmsys/vicuna-13b-delta-v0). If you have git-lfs installed (https://git-lfs.com), this can be done by
+
+```
+git lfs install
+git clone https://huggingface.co/lmsys/vicuna-13b-delta-v0
+```
+
+Note that these are not directly the working weights, but the difference between the working weights and the original weights of LLaMA-13B. (Due to LLaMA’s license, we cannot distribute the LLaMA weights.)
+
+Then, you need to obtain the original LLaMA-13B weights in the HuggingFace format, either by following the instructions provided by HuggingFace [here](https://huggingface.co/docs/transformers/main/model_doc/llama) or by downloading them from the Internet.
+
+When these two sets of weights are ready, we can use tools from Vicuna’s team to create the real working weights.
+First, install the version of their library that is compatible with Vicuna v0:
+
+```
+pip install git+https://github.com/lm-sys/FastChat.git@v0.1.10
+```
+
+Then, run the following command to create the final working weights:
+
+```
+python -m fastchat.model.apply_delta --base /path/to/llama-13b-hf/ --target /path/to/save/working/vicuna/weight/ --delta /path/to/vicuna-13b-delta-v0/
+```
+
+Now you are good to go!
+
diff --git a/README.md b/README.md
index c00046b..a6a5d8d 100644
--- a/README.md
+++ b/README.md
@@ -53,7 +53,8 @@ conda activate minigpt4
 **2. Prepare the pretrained Vicuna weights**
 
-The current version of MiniGPT-4 is built on the v0 versoin of Vicuna-13B.
-Please refer to their instructions [here](https://huggingface.co/lmsys/vicuna-13b-delta-v0) to obtaining the weights.
+The current version of MiniGPT-4 is built on the v0 version of Vicuna-13B.
+Please refer to our instructions [here](PrepareVicuna.md)
+to prepare the Vicuna weights.
 The final weights would be in a single folder with the following structure:
 
 ```
@@ -105,7 +106,7 @@ You can change the save path in the config file
 torchrun --nproc-per-node NUM_GPU train.py --cfg-path train_configs/minigpt4_stage1_pretrain.yaml
 ```
 
-**1. Second finetuning stage**
+**2. Second finetuning stage**
 
 In the second stage, we use a small high quality image-text pair dataset created by ourselves and convert it to a conversation format to further align MiniGPT-4.
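
A note on the LLaMA-13B step in PrepareVicuna.md above: if you start from Meta's original checkpoint rather than a HuggingFace-format copy, the `transformers` repository ships a conversion script for this. A minimal sketch, assuming the raw files (the `13B/` shard folder plus `tokenizer.model`) sit under `/path/to/llama`; all paths here are placeholders:

```
# Run from a checkout of https://github.com/huggingface/transformers
python src/transformers/models/llama/convert_llama_weights_to_hf.py \
    --input_dir /path/to/llama --model_size 13B --output_dir /path/to/llama-13b-hf/
```

The `--output_dir` produced here is the folder you would then pass to `apply_delta` as `--base`.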
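
Once `apply_delta` finishes, it is worth confirming that the merged folder actually loads before wiring it into MiniGPT-4. A minimal sketch using the standard `transformers` auto classes; the path is the `--target` folder from the command above and is a placeholder:

```
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

path = "/path/to/save/working/vicuna/weight/"  # the --target folder from apply_delta

# fp16 keeps the 13B model at roughly 26 GB; loading on CPU is fine for a smoke test.
tokenizer = AutoTokenizer.from_pretrained(path, use_fast=False)
model = AutoModelForCausalLM.from_pretrained(path, torch_dtype=torch.float16)

print(model.config.model_type)  # expect "llama"
```

If both objects load without shape or vocabulary-size errors, the delta was applied against the right base weights.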