How to download Hiro archives
Comprehensive guide for downloading large archive files reliably with troubleshooting tips.
Overview
Hiro Archive files are large datasets (ranging from several GB to several hundred GB) hosted on Google Cloud Storage. Due to their size, downloads can be interrupted by network issues, rate limits, or connection timeouts. This guide provides multiple download methods and troubleshooting solutions.
File sizes and requirements
Before downloading, ensure you have sufficient:
- Disk space: Archives range from 10GB (APIs) to several hundred GB+ (blockchain data)
- Bandwidth: Downloads can take hours or days depending on your connection
- Storage for extraction: Archives expand significantly when extracted
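As a rough pre-flight check, you can compare free disk space against an estimated requirement. A minimal sketch (the 10GB archive size and the ~3x extraction multiplier below are illustrative assumptions; substitute numbers for the archive you plan to download):

```shell
#!/bin/sh
# Rough pre-flight disk space check (illustrative numbers).
ARCHIVE_GB=10                                  # assumed compressed size
REQUIRED_KB=$((ARCHIVE_GB * 4 * 1024 * 1024))  # compressed + ~3x for extraction

# Free space, in 1K blocks, on the current filesystem (POSIX df output).
AVAIL_KB=$(df -Pk . | awk 'NR==2 {print $4}')

if [ "$AVAIL_KB" -lt "$REQUIRED_KB" ]; then
  echo "Insufficient disk space: have ${AVAIL_KB} KB, need ${REQUIRED_KB} KB"
else
  echo "Disk space OK: ${AVAIL_KB} KB available"
fi
```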
Download methods
Method 1: wget with resume (Recommended for most users)
The `wget` command with the `-c` flag enables resuming interrupted downloads. You may need to install wget first (`brew install wget` on macOS); alternatively, use the curl method below, which is pre-installed on macOS.

```shell
wget -c https://archive.hiro.so/mainnet/stacks-blockchain/mainnet-stacks-blockchain-latest.tar.gz
```

To force a single-line progress bar (useful when output is redirected to a log):

```shell
wget -c --progress=bar:force:noscroll https://archive.hiro.so/mainnet/stacks-blockchain/mainnet-stacks-blockchain-latest.tar.gz
```
Advantages:
- Resumes interrupted downloads automatically
- Built into most Unix systems
- Simple to use
Disadvantages:
- Single-threaded downloads
- May still experience connection resets
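One way to mitigate connection resets is a small retry wrapper that keeps re-invoking `wget -c`, which resumes from wherever the previous attempt stopped. A sketch (the attempt limit and delay are arbitrary choices, not values from this guide):

```shell
#!/bin/sh
# retry MAX DELAY CMD...: run CMD until it succeeds, up to MAX attempts,
# sleeping DELAY seconds between failed tries.
retry() {
  max=$1; delay=$2; shift 2
  i=1
  until "$@"; do
    if [ "$i" -ge "$max" ]; then
      echo "Giving up after $max attempts" >&2
      return 1
    fi
    i=$((i + 1))
    sleep "$delay"
  done
}

# Real usage would look like:
#   retry 20 10 wget -c https://archive.hiro.so/mainnet/stacks-blockchain/mainnet-stacks-blockchain-latest.tar.gz
retry 3 0 true && echo "retry: command succeeded"
```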
Method 2: curl with retries
Use `curl` with automatic retries for robust downloads. The `--continue-at -` flag resumes partial downloads, while `--output` specifies the filename:

```shell
curl --continue-at - --retry 10 --retry-delay 5 --retry-max-time 0 \
  --progress-bar \
  --output mainnet-stacks-blockchain-latest.tar.gz \
  https://archive.hiro.so/mainnet/stacks-blockchain/mainnet-stacks-blockchain-latest.tar.gz
```
Advantages:
- Automatic retry mechanism
- Resume capability with `--continue-at -` (short form `-C -`)
- More configuration options
Method 3: gcloud storage cp (Fastest, requires authentication)
Google Cloud CLI provides the fastest download speeds with parallel transfers. First authenticate with `gcloud auth login`, then either download the file to disk or stream directly to extraction:

```shell
# Download the file to the current directory
gcloud storage cp gs://archive.hiro.so/mainnet/stacks-blockchain/mainnet-stacks-blockchain-latest.tar.gz .
```

```shell
# Or stream directly to extraction (saves disk space, but slower: the download becomes sequential)
gcloud storage cp gs://archive.hiro.so/mainnet/stacks-blockchain/mainnet-stacks-blockchain-latest.tar.gz - | tar -xz
```
Advantages:
- Significantly faster downloads (2-3x speed improvement)
- Built-in parallel transfers
- Automatic retry handling
- Can stream directly to extraction (useful when disk space is limited, but disables parallel transfers)
Disadvantages:
- Requires Google account authentication
- Additional software installation needed
Method 4: Download managers (JDownloader)
For users who prefer GUI applications or need advanced download management:
1. Download and install JDownloader.
2. Copy the archive URL into JDownloader.
3. Configure parallel connections for faster downloads.
Advantages:
- Graphical interface
- Parallel downloading
- Advanced retry mechanisms
- Cross-platform support
Verification and extraction
After downloading, verify the file integrity:
SHA256 checksum files are available for all archives to verify download integrity.
1. Download the checksum file:

```shell
wget https://archive.hiro.so/mainnet/stacks-blockchain/mainnet-stacks-blockchain-latest.sha256
```

2. Verify the download (note the two spaces between the hash and the filename, which the checksum format requires):

```shell
echo "$(awk '{print $1}' mainnet-stacks-blockchain-latest.sha256)  mainnet-stacks-blockchain-latest.tar.gz" | shasum -a 256 -c
```

3. Extract the archive:

```shell
tar -zxvf mainnet-stacks-blockchain-latest.tar.gz -C /target/directory
```
The `marf.sqlite.blobs` file can be very large and may take significant time to extract. Ensure you have sufficient disk space and be patient during extraction.
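The verification step can be rehearsed end-to-end on a throwaway local file before committing to a multi-hour download. The file names below are illustrative, and the snippet assumes (as the verify command above does) that the first field of the `.sha256` file is the hash; on macOS, replace `sha256sum` with `shasum -a 256`:

```shell
#!/bin/sh
# Rehearse the checksum workflow on a small local file.
printf 'example archive contents\n' > sample.tar.gz

# Produce a checksum file whose first field is the hash.
sha256sum sample.tar.gz | awk '{print $1}' > sample.sha256

# Verify: rebuild a "<hash>  <filename>" line and feed it to the checker.
echo "$(awk '{print $1}' sample.sha256)  sample.tar.gz" | sha256sum -c
# prints: sample.tar.gz: OK

rm -f sample.tar.gz sample.sha256
```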
Performance tips
1. Use gcloud for the fastest downloads: it requires authentication but provides significant speed improvements.
2. Download during off-peak hours, typically late night or early morning.
3. Use wired connections; avoid Wi-Fi for large downloads when possible.
4. Monitor disk space: extracted archives can be 2-3x larger than compressed files.
5. Consider streaming extraction with gcloud to save disk space.