# Audio Dataset

## Stage 1: Pretraining

We mainly use the WavCaps dataset for pre-training.

### Download
```bash
# install git-lfs
sudo apt update
sudo apt-get install git-lfs

git clone https://huggingface.co/datasets/cvssp/WavCaps
cd WavCaps
git lfs pull --include "*"
```
### Processing

- Extract the zip file

```bash
# merge shards first
zip -s- FILE_NAME.zip -O COMBINED_FILE.zip
unzip COMBINED_FILE.zip
```

- Extract the raw audio data

```bash
unzip COMBINED_FILE.zip -d /target/dir
```
- Create JSON files (annotations) for each example. Before processing, modify dataset/audio/process.py to set the data and JSON paths.

```bash
python3 dataset/audio/process.py --dataset test --data_dir /path/to/data --json_path /path/to/json
```
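For orientation, this step produces one small JSON annotation per audio clip, keyed to the clip's filename. The sketch below only illustrates the idea; the actual schema, field names, and audio extension are defined by dataset/audio/process.py and are assumptions here.

```python
# Illustrative sketch only: write one annotation JSON next to each audio clip.
# The real schema lives in dataset/audio/process.py; the "caption" field and
# the .flac extension are assumptions.
import json
from pathlib import Path

def write_annotations(data_dir: str, json_dir: str) -> None:
    data_dir, json_dir = Path(data_dir), Path(json_dir)
    json_dir.mkdir(parents=True, exist_ok=True)
    for audio_path in sorted(data_dir.rglob("*.flac")):
        record = {
            "audio": str(audio_path),  # path to the raw clip
            "caption": "",             # text paired with the clip (from the WavCaps metadata)
        }
        (json_dir / f"{audio_path.stem}.json").write_text(json.dumps(record))

if __name__ == "__main__":
    write_annotations("/path/to/data", "/path/to/json")
```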
- Pack with tar

```bash
python3 dataset/audio/make_tar.py --input /path/to/data --output /path/to/web_dataset \
    --dataclass none --filename filename --num_element 500
```
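make_tar.py shards the processed examples into tar archives of `--num_element` samples each, in the WebDataset style where an audio file and its JSON annotation share a basename inside the tar. A minimal standard-library sketch of that packing step (the shard naming and file extensions are assumptions, not the script's exact behavior):

```python
# Sketch of WebDataset-style shard packing: one .flac plus one .json per
# sample, grouped 500 samples per tar shard (extensions/naming are assumed).
import tarfile
from pathlib import Path

def pack_shards(data_dir: str, out_dir: str, num_element: int = 500) -> None:
    data_dir, out_dir = Path(data_dir), Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    samples = sorted(data_dir.rglob("*.flac"))
    for start in range(0, len(samples), num_element):
        shard_path = out_dir / f"{start // num_element:06d}.tar"
        with tarfile.open(shard_path, "w") as tar:
            for audio in samples[start:start + num_element]:
                tar.add(audio, arcname=audio.stem + ".flac")
                ann = audio.with_suffix(".json")
                if ann.exists():
                    tar.add(ann, arcname=audio.stem + ".json")

if __name__ == "__main__":
    pack_shards("/path/to/data", "/path/to/web_dataset")
```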
To view the first few entries of a tar file:

```bash
tar tf filename.tar | sed 10q
```
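If you consume the shards with the webdataset library (an assumption; any tar reader works), each sample comes back as a dict keyed by file extension, with values as raw bytes:

```python
# Assumes the shards follow the WebDataset naming convention produced above.
import json
import webdataset as wds

dataset = wds.WebDataset("/path/to/web_dataset/000000.tar")
for sample in dataset:
    annotation = json.loads(sample["json"])  # annotation bytes decoded as JSON
    audio_bytes = sample["flac"]             # raw audio bytes
    print(sample["__key__"], annotation, len(audio_bytes))
    break
```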
To set everything up in one line:

```bash
# DATASET can be one of: soundbible, bbc, audioset, freesound
DATASET=soundbible bash dataset/audio/setup.sh
```