Download

Dataset Download via ModelScope

Step 1: Install the ModelScope Library

pip install -U modelscope

Step 2: Download the Dataset

Create a Python script with the following code:

from modelscope.hub.snapshot_download import snapshot_download

# Pass the repository ID ("namespace/name"), not the full page URL.
model_dir = snapshot_download(
    'AiurRuili/TheMatrix',
    cache_dir='your/custom/path',  # Optional: specify custom storage path
    ignore_file_pattern=['*.bin', '*.pt']  # Optional: exclude weight files
)
print(model_dir)  # Local directory that holds the downloaded files
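Hub download functions generally take a repository ID of the form namespace/name rather than the address of the repository's web page. The following is a minimal sketch of a helper that derives such an ID from a page URL. The function name `repo_id_from_url` and the set of path prefixes it strips are assumptions made for illustration and are not part of either library.

```python
from urllib.parse import urlparse

# Path segments that a hub page URL may carry before "namespace/name".
# This set is an assumption for illustration, not an official list.
_KIND_PREFIXES = {"models", "datasets", "spaces"}


def repo_id_from_url(url_or_id: str) -> str:
    """Return 'namespace/name' from a hub page URL, or the input if it is already an ID."""
    if "://" not in url_or_id:
        return url_or_id.strip("/")
    # Split the URL path into its non-empty segments.
    parts = [p for p in urlparse(url_or_id).path.split("/") if p]
    if parts and parts[0] in _KIND_PREFIXES:
        parts = parts[1:]
    if len(parts) < 2:
        raise ValueError(f"cannot find 'namespace/name' in {url_or_id!r}")
    return "/".join(parts[:2])


print(repo_id_from_url("https://www.modelscope.cn/models/AiurRuili/TheMatrix"))
```

The returned string can then be passed as the first argument of `snapshot_download`.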

Configuration Options:

| Parameter | Description |
| --- | --- |
| cache_dir | Custom storage path (default: ~/.cache/modelscope) |
| ignore_file_pattern | File patterns to exclude (e.g. model weights) |
| revision | Repository version, such as a branch or tag (the default branch on ModelScope is 'master') |
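After a download finishes, it can be useful to confirm that the target directory actually contains files. The sketch below counts files and sums their sizes using only the standard library. The helper name `summarize_download` is hypothetical, and the demo runs against a temporary directory that stands in for the path returned by `snapshot_download`.

```python
import os
import tempfile
from typing import Tuple


def summarize_download(root: str) -> Tuple[int, int]:
    """Return (file_count, total_bytes) for every file under root."""
    file_count = 0
    total_bytes = 0
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            file_count += 1
            total_bytes += os.path.getsize(os.path.join(dirpath, name))
    return file_count, total_bytes


# Demo on a temporary directory standing in for the returned download path.
with tempfile.TemporaryDirectory() as demo_dir:
    with open(os.path.join(demo_dir, "sample.txt"), "w") as fh:
        fh.write("hello")
    count, size = summarize_download(demo_dir)
    print(f"{count} file(s), {size} byte(s)")
```

In practice, call `summarize_download(model_dir)` with the directory returned by the download call.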

Dataset Download via Hugging Face

Step 1: Install the Required Libraries

pip install -U huggingface_hub datasets

Step 2: Download the Dataset

Create a Python script with the following code:

from huggingface_hub import snapshot_download

# repo_id is the "namespace/name" identifier, not the full page URL.
dataset_dir = snapshot_download(
    repo_id="MatrixTeam/TheMatrix",
    repo_type="dataset",
    cache_dir="your/custom/path",  # Optional: custom storage path
    ignore_patterns=["*.weights", "*.safetensors"],  # Optional: exclude model files
    token="hf_YourAccessToken"  # Required for private datasets
)
print(dataset_dir)  # Local directory that holds the downloaded snapshot
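Hard-coding an access token in a script makes it easy to leak. The sketch below reads the token from the `HF_TOKEN` environment variable instead and only passes it when it is set. The helper names `build_download_kwargs` and `download_dataset` are hypothetical, and `huggingface_hub` is imported lazily so that the argument-building part can be checked without network access.

```python
import os
from typing import Any, Dict, Optional


def build_download_kwargs(repo_id: str, cache_dir: Optional[str] = None) -> Dict[str, Any]:
    """Assemble keyword arguments for snapshot_download without hard-coding a token."""
    kwargs: Dict[str, Any] = {"repo_id": repo_id, "repo_type": "dataset"}
    if cache_dir:
        kwargs["cache_dir"] = cache_dir
    token = os.environ.get("HF_TOKEN")  # Unset for public repositories
    if token:
        kwargs["token"] = token
    return kwargs


def download_dataset(repo_id: str, cache_dir: Optional[str] = None) -> str:
    """Download the dataset snapshot and return its local path (needs network access)."""
    from huggingface_hub import snapshot_download  # Imported lazily on purpose

    return snapshot_download(**build_download_kwargs(repo_id, cache_dir))
```

A typical call would be `download_dataset("MatrixTeam/TheMatrix", cache_dir="your/custom/path")` after exporting `HF_TOKEN` in the shell.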

Configuration Options:

| Parameter | Description |
| --- | --- |
| repo_type | Type of repository: dataset, model, or space |
| cache_dir | Custom storage path (default: ~/.cache/huggingface/hub) |
| ignore_patterns | File patterns to exclude (e.g. model weights) |
| revision | Dataset version, such as a branch, tag, or commit hash (default: 'main') |
| token | Access token for private repositories |
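To preview which files a set of exclusion patterns would skip, the patterns can be tried against a file listing before downloading anything. The sketch below assumes shell-style glob matching as implemented by Python's `fnmatch` module. The helper name `files_kept` and the file listing are hypothetical, and the library's own matching may differ in edge cases.

```python
from fnmatch import fnmatch
from typing import Iterable, List


def files_kept(filenames: Iterable[str], ignore_patterns: Iterable[str]) -> List[str]:
    """Return the filenames that match none of the glob-style ignore patterns."""
    patterns = list(ignore_patterns)
    return [f for f in filenames if not any(fnmatch(f, p) for p in patterns)]


# Hypothetical repository listing, used only to illustrate the matching.
listing = ["README.md", "data/train.json", "ckpt/model.safetensors", "old.weights"]
print(files_kept(listing, ["*.weights", "*.safetensors"]))
```

Note that in `fnmatch` the `*` wildcard also matches path separators, so `*.safetensors` excludes matching files in subdirectories as well.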