# Audio Dataset

## Stage 1: Pretraining

We mainly use the WavCaps dataset for pre-training.

### Download
```bash
# install git-lfs
sudo apt update
sudo apt-get install git-lfs

git clone https://huggingface.co/datasets/cvssp/WavCaps
cd WavCaps
git lfs pull --include "*"
```
### Processing

- Extract the zip file

```bash
# merge shards first
zip -s- FILE_NAME.zip -O COMBINED_FILE.zip
unzip COMBINED_FILE.zip
```

- Extract the raw audio data

```bash
unzip COMBINED_FILE.zip -d /target/dir
```
- Create JSON files (annotations) for each example. Before processing, modify dataset/audio/process.py to set the data and JSON paths.

```bash
python3 dataset/audio/process.py --dataset test --data_dir /path/to/data --json_path /path/to/json
```
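For orientation, this step produces one small JSON annotation per audio clip, keyed to the clip's filename. The sketch below only illustrates the idea; the actual schema, field names, and audio extension are defined by dataset/audio/process.py and are assumptions here.

```python
# Illustrative sketch only: write one annotation JSON next to each audio clip.
# The real schema lives in dataset/audio/process.py; the "caption" field and
# the .flac extension are assumptions.
import json
from pathlib import Path

def write_annotations(data_dir: str, json_dir: str) -> None:
    data_dir, json_dir = Path(data_dir), Path(json_dir)
    json_dir.mkdir(parents=True, exist_ok=True)
    for audio_path in sorted(data_dir.rglob("*.flac")):
        record = {
            "audio": str(audio_path),  # path to the raw clip
            "caption": "",             # text paired with the clip (from the WavCaps metadata)
        }
        (json_dir / f"{audio_path.stem}.json").write_text(json.dumps(record))

if __name__ == "__main__":
    write_annotations("/path/to/data", "/path/to/json")
```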
- Pack with tar

```bash
python3 dataset/audio/make_tar.py --input /path/to/data --output /path/to/web_dataset \
    --dataclass none --filename filename --num_element 500
```
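make_tar.py shards the processed examples into tar archives of `--num_element` samples each, in the WebDataset style where an audio file and its JSON annotation share a basename inside the tar. A minimal standard-library sketch of that packing step (the shard naming and file extensions are assumptions, not the script's exact behavior):

```python
# Sketch of WebDataset-style shard packing: one .flac plus one .json per
# sample, grouped 500 samples per tar shard (extensions/naming are assumed).
import tarfile
from pathlib import Path

def pack_shards(data_dir: str, out_dir: str, num_element: int = 500) -> None:
    data_dir, out_dir = Path(data_dir), Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    samples = sorted(data_dir.rglob("*.flac"))
    for start in range(0, len(samples), num_element):
        shard_path = out_dir / f"{start // num_element:06d}.tar"
        with tarfile.open(shard_path, "w") as tar:
            for audio in samples[start:start + num_element]:
                tar.add(audio, arcname=audio.stem + ".flac")
                ann = audio.with_suffix(".json")
                if ann.exists():
                    tar.add(ann, arcname=audio.stem + ".json")

if __name__ == "__main__":
    pack_shards("/path/to/data", "/path/to/web_dataset")
```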
To view the first few entries of a tar file:

```bash
tar tf filename.tar | sed 10q
```
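If you consume the shards with the webdataset library (an assumption; any tar reader works), each sample comes back as a dict keyed by file extension, with values as raw bytes:

```python
# Assumes the shards follow the WebDataset naming convention produced above.
import json
import webdataset as wds

dataset = wds.WebDataset("/path/to/web_dataset/000000.tar")
for sample in dataset:
    annotation = json.loads(sample["json"])  # annotation bytes decoded as JSON
    audio_bytes = sample["flac"]             # raw audio bytes
    print(sample["__key__"], annotation, len(audio_bytes))
    break
```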
To set everything up in one line:

```bash
# DATASET can be one of: soundbible, bbc, audioset, freesound
DATASET=soundbible bash dataset/audio/setup.sh
```