From 446ede21292b2ce3c4eae6cef314373358e93277 Mon Sep 17 00:00:00 2001
From: Deyao Zhu
Date: Thu, 20 Apr 2023 22:03:34 +0300
Subject: [PATCH] add checkpoint for vicuna 7b

---
 PrepareVicuna.md | 13 +++++++++----
 README.md        | 16 ++++++++++------
 2 files changed, 19 insertions(+), 10 deletions(-)

diff --git a/PrepareVicuna.md b/PrepareVicuna.md
index df24b31..0585e62 100644
--- a/PrepareVicuna.md
+++ b/PrepareVicuna.md
@@ -2,16 +2,21 @@
 Vicuna is an open-source LLAMA-based LLM that performs close to ChatGPT.
 We currently use the v0 version of Vicuna-13B.
 
-To prepare Vicuna’s weight, first download Vicuna’s **delta** weight from [https://huggingface.co/lmsys/vicuna-13b-delta-v0](https://huggingface.co/lmsys/vicuna-13b-delta-v0). In case you have git-lfs installed (https://git-lfs.com), this can be done by
+To prepare Vicuna’s weight, first download Vicuna’s **delta** weight from [https://huggingface.co/lmsys/vicuna-13b-delta-v0](https://huggingface.co/lmsys/vicuna-13b-delta-v0).
+If you have git-lfs installed (https://git-lfs.com), this can be done by
 
 ```
 git lfs install
-git clone https://huggingface.co/lmsys/vicuna-13b-delta-v0
+git clone https://huggingface.co/lmsys/vicuna-13b-delta-v0  # more powerful, needs at least 24G of GPU memory
+# or
+git clone https://huggingface.co/lmsys/vicuna-7b-delta-v0   # smaller, needs 12G of GPU memory
 ```
 
 Note that this is not directly the working weight, but the difference between the working weight and the original weight of LLAMA-13B. (Due to LLAMA’s rules, we cannot distribute the weight of LLAMA.)
 
-Then, you need to obtain the original LLAMA-13B weights in the HuggingFace format either following the instruction provided by HuggingFace [here](https://huggingface.co/docs/transformers/main/model_doc/llama) or from the Internet.
+Then, you need to obtain the original LLAMA-7B or LLAMA-13B weights in the HuggingFace format,
+either by following the instructions provided by HuggingFace
+[here](https://huggingface.co/docs/transformers/main/model_doc/llama) or from the Internet.
 
 When these two weights are ready, we can use tools from Vicuna’s team to create the real working weight.
 First, install their library that is compatible with v0 Vicuna by
@@ -23,7 +28,7 @@ pip install git+https://github.com/lm-sys/FastChat.git@v0.1.10
 
 Then, run the following command to create the final working weight
 
 ```
-python -m fastchat.model.apply_delta --base /path/to/llama-13b-hf/ --target /path/to/save/working/vicuna/weight/ --delta /path/to/vicuna-13b-delta-v0/
+python -m fastchat.model.apply_delta --base /path/to/llama-13bOR7b-hf/ --target /path/to/save/working/vicuna/weight/ --delta /path/to/vicuna-13bOR7b-delta-v0/
 ```
 
 Now you are good to go!
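Taken together, the PrepareVicuna.md steps above form one end-to-end flow. A minimal sketch for the 7B variant follows; the clone URL, the FastChat pin, and the apply_delta flags come from the patch itself, while the local directory names are illustrative placeholders:

```
# Sketch of the full Vicuna-7B preparation flow described in PrepareVicuna.md.
# Local directory names are placeholders; substitute your own paths.
git lfs install
git clone https://huggingface.co/lmsys/vicuna-7b-delta-v0

# v0-compatible FastChat release, as pinned above
pip install git+https://github.com/lm-sys/FastChat.git@v0.1.10

# Merge the delta into the original LLAMA-7B weights (HuggingFace format)
python -m fastchat.model.apply_delta \
    --base ./llama-7b-hf/ \
    --target ./vicuna-7b-working/ \
    --delta ./vicuna-7b-delta-v0/
```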
diff --git a/README.md b/README.md
index c7c3c52..d06acf4 100644
--- a/README.md
+++ b/README.md
@@ -69,8 +69,13 @@ Then, set the path to the vicuna weight in the model config file
 
 **3. Prepare the pretrained MiniGPT-4 checkpoint**
 
-To play with our pretrained model, download the pretrained checkpoint
-[here](https://drive.google.com/file/d/1a4zLvaiDBr-36pasffmgpvH5P7CKmpze/view?usp=share_link).
+Download the pretrained checkpoint that matches the Vicuna model you prepared.
+
+| Checkpoint Aligned with Vicuna 13B | Checkpoint Aligned with Vicuna 7B |
+|:----------------------------------:|:---------------------------------:|
+| [Download](https://drive.google.com/file/d/1a4zLvaiDBr-36pasffmgpvH5P7CKmpze/view?usp=share_link) | [Download](https://drive.google.com/file/d/1RY9jV0dyqLX-o38LrumkKRh6Jtaop58R/view?usp=sharing) |
+
 
 Then, set the path to the pretrained checkpoint in the evaluation config file
 in [eval_configs/minigpt4_eval.yaml](eval_configs/minigpt4_eval.yaml#L10) at Line 11.
@@ -84,10 +89,9 @@ Try out our demo [demo.py](demo.py) on your local machine by running
 
 ```
 python demo.py --cfg-path eval_configs/minigpt4_eval.yaml --gpu-id 0
 ```
 
-Here, we load Vicuna as 8 bit by default to save some GPU memory usage.
-Besides, the default beam search width is 1.
-Under this setting, the demo cost about 23G GPU memory.
-If you have a more powerful GPU with larger GPU memory, you can run the model
+To save GPU memory, Vicuna is loaded in 8 bit by default, with a beam search width of 1.
+This configuration requires about 23G of GPU memory for Vicuna 13B and 11.5G for Vicuna 7B.
+For more powerful GPUs, you can run the model
 in 16 bit by setting low_resource to False in the config file
 [minigpt4_eval.yaml](eval_configs/minigpt4_eval.yaml) and use a larger beam search width.
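For the 16-bit option described above, a minimal sketch follows. It assumes the config key appears literally as `low_resource: True` in eval_configs/minigpt4_eval.yaml, so check the file before editing:

```
# Assumption: the eval config contains the literal line "low_resource: True".
# Verify the exact spelling in eval_configs/minigpt4_eval.yaml before running.
sed -i 's/low_resource: True/low_resource: False/' eval_configs/minigpt4_eval.yaml

# Relaunch the demo; expect roughly double the 8-bit memory footprint.
python demo.py --cfg-path eval_configs/minigpt4_eval.yaml --gpu-id 0
```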