Allow download via ftp. #27

Open
opened 2025-12-07 17:59:38 +00:00 by brent.edwards · 0 comments
Member

Running `python scripts/upload_all_datasets.py --dataset uniprot` stops quickly with the message:

Downloading uniprot... Attempt {attempt_count}

Processing: UniProt RDF

Step 1: Downloading raw RDF dataset...

Downloading: UniProt RDF
Description: Comprehensive protein knowledgebase with functional annotations
Format: xml
Size: 50.0 GB (compressed) → 50.0 GB (extracted)
Entities: ~90M protein entries
Triples: ~3.4B
Downloading ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 0/? bytes ?  
Unexpected error during download: Request URL has an unsupported protocol 'ftp://'.
Error downloading dataset: Request URL has an unsupported protocol 'ftp://'.

Download failed
Download failed for uniprot (exit code: 1)
Try running manually: python scripts/rdf_dataset_downloader.py uniprot -o dataset_processing/downloads/uniprot
Download failed. Aborting.

Add `ftp` as a supported download protocol.
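One possible shape for the fix, as a rough sketch only: the error wording looks like what an HTTP-only client reports for non-HTTP URLs, so the downloader could branch on the URL scheme and hand `ftp://` URLs to the standard library's `ftplib`. The function names (`split_ftp_url`, `download_ftp`, `download`) and the `http_download` parameter are hypothetical and not taken from `scripts/rdf_dataset_downloader.py`; the existing HTTP code path is represented by a callable passed in.

```python
import ftplib
import os
from urllib.parse import unquote, urlparse


def split_ftp_url(url):
    """Return (host, port, user, password, remote_path) for an ftp:// URL."""
    parts = urlparse(url)
    if parts.scheme != "ftp":
        raise ValueError(f"not an ftp URL: {url!r}")
    if not parts.hostname:
        raise ValueError(f"ftp URL has no host: {url!r}")
    return (
        parts.hostname,
        parts.port or 21,  # default FTP control port
        unquote(parts.username) if parts.username else "anonymous",
        unquote(parts.password) if parts.password else "",
        unquote(parts.path),
    )


def download_ftp(url, dest_path, chunk_size=1 << 20, timeout=60):
    """Stream one remote file to dest_path over FTP (hypothetical helper)."""
    host, port, user, password, remote_path = split_ftp_url(url)
    os.makedirs(os.path.dirname(dest_path) or ".", exist_ok=True)
    with ftplib.FTP(timeout=timeout) as ftp:
        ftp.connect(host, port)
        ftp.login(user, password)
        with open(dest_path, "wb") as out:
            # retrbinary calls out.write for each block, so the file is
            # never held in memory as a whole.
            ftp.retrbinary(f"RETR {remote_path}", out.write, blocksize=chunk_size)
    return dest_path


def download(url, dest_path, http_download=None):
    """Dispatch on URL scheme; http_download stands in for the existing code path."""
    scheme = urlparse(url).scheme.lower()
    if scheme == "ftp":
        return download_ftp(url, dest_path)
    if scheme in ("http", "https") and http_download is not None:
        return http_download(url, dest_path)
    raise ValueError(f"unsupported protocol {scheme!r}")
```

This sketch leaves out the progress bar, retries, and resuming of partial transfers, all of which would matter for a file of roughly 50 GB.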

Reference
cleverdatasets/dataset-uploader#27