How to download Hiro archives

A comprehensive guide to downloading large archive files reliably, with troubleshooting tips.

Overview

Hiro Archive files are large datasets (ranging from several GB to several hundred GB) hosted on Google Cloud Storage. Due to their size, downloads can be interrupted by network issues, rate limits, or connection timeouts. This guide provides multiple download methods and troubleshooting solutions.

File sizes and requirements

Before downloading, ensure you have sufficient:

  • Disk space: Archives range from 10GB (APIs) to several hundred GB+ (blockchain data)
  • Bandwidth: Downloads can take hours or days depending on your connection
  • Storage for extraction: Archives expand significantly when extracted
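
The free-space requirement can be checked up front with standard Unix tools. A minimal sketch (the `-P` flag keeps `df` output on one line so it is safe to parse):

```shell
# Show free space on the current filesystem in human-readable form
df -h .

# Print only the available kilobytes (POSIX output format), e.g. for scripting
df -kP . | awk 'NR==2 {print $4}'
```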

Download methods

Method 1: wget with resume support

The wget command with the -c (--continue) flag resumes interrupted downloads:

macOS Users

You may need to install wget first: brew install wget. Alternatively, use the curl method below; curl comes pre-installed on macOS.

Terminal
$
wget -c https://archive.hiro.so/mainnet/stacks-blockchain/mainnet-stacks-blockchain-latest.tar.gz
$
wget -c --progress=bar:force:noscroll https://archive.hiro.so/mainnet/stacks-blockchain/mainnet-stacks-blockchain-latest.tar.gz

Advantages:

  • Resumes interrupted downloads automatically
  • Built into most Unix systems
  • Simple to use

Disadvantages:

  • Single-threaded downloads
  • May still experience connection resets
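
To cope with repeated connection resets, wget can be wrapped in a simple retry loop. Because -c resumes from the last completed byte, each retry picks up where the previous attempt stopped. A sketch (the retry_download function name is our own, not part of wget):

```shell
# Re-run a command until it exits successfully, pausing between attempts.
# Combined with wget -c, every retry resumes the partial download.
retry_download() {
  until "$@"; do
    echo "download interrupted, retrying in 10s..." >&2
    sleep 10
  done
}

# Example usage with the archive URL from this guide:
# retry_download wget -c https://archive.hiro.so/mainnet/stacks-blockchain/mainnet-stacks-blockchain-latest.tar.gz
```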

Method 2: curl with retries

Use curl with automatic retries for robust downloads. The --continue-at - flag resumes partial downloads, while --output specifies the filename:

Terminal
$
curl --continue-at - --retry 10 --retry-delay 5 --retry-max-time 0 \
--progress-bar \
--output mainnet-stacks-blockchain-latest.tar.gz \
https://archive.hiro.so/mainnet/stacks-blockchain/mainnet-stacks-blockchain-latest.tar.gz

Advantages:

  • Automatic retry mechanism
  • Resume capability with --continue-at - (short form: -C -)
  • More configuration options

Method 3: gcloud storage cp (Fastest, requires authentication)

Google Cloud CLI provides the fastest download speeds with parallel transfers. First authenticate with gcloud auth login, then either download the file to disk or stream directly to extraction:

Download file to current directory

Terminal
$
gcloud storage cp gs://archive.hiro.so/mainnet/stacks-blockchain/mainnet-stacks-blockchain-latest.tar.gz .

Or stream directly to extraction (saves disk space, but slower because the download is sequential):

Terminal
$
gcloud storage cp gs://archive.hiro.so/mainnet/stacks-blockchain/mainnet-stacks-blockchain-latest.tar.gz - | tar -xz

Advantages:

  • Significantly faster downloads (2-3x speed improvement)
  • Built-in parallel transfers
  • Automatic retry handling
  • Can stream directly to extraction (useful when disk space is limited, but disables parallel transfers)

Disadvantages:

  • Requires Google account authentication
  • Additional software installation needed

Method 4: Download managers (JDownloader)

For users who prefer GUI applications or need advanced download management:

  1. Download and install JDownloader
  2. Copy the archive URL into JDownloader
  3. Configure parallel connections for faster downloads

Advantages:

  • Graphical interface
  • Parallel downloading
  • Advanced retry mechanisms
  • Cross-platform support

Verification and extraction

After downloading, verify the file integrity:

Checksum Availability

SHA256 checksum files are available for all archives to verify download integrity.

  1. Download the checksum file:

     Terminal
     $
     wget https://archive.hiro.so/mainnet/stacks-blockchain/mainnet-stacks-blockchain-latest.sha256

  2. Verify the download (note that shasum -c expects two spaces between the hash and the filename):

     Terminal
     $
     echo "$(awk '{print $1}' mainnet-stacks-blockchain-latest.sha256)  mainnet-stacks-blockchain-latest.tar.gz" | shasum -a 256 -c

  3. Extract the archive:

     Terminal
     $
     tar -zxvf mainnet-stacks-blockchain-latest.tar.gz -C /target/directory
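
On Linux systems without shasum, GNU sha256sum performs the same check. A minimal sketch (verify_archive is a hypothetical helper, not a standard tool; sha256sum -c also requires two spaces between hash and filename):

```shell
# verify_archive: check <file> against <file>.sha256 whose first field is the hash
verify_archive() {
  file="$1"
  printf '%s  %s\n' "$(awk '{print $1}' "$file.sha256")" "$file" | sha256sum -c
}

# Usage:
# verify_archive mainnet-stacks-blockchain-latest.tar.gz
```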

Large file extraction

The marf.sqlite.blobs file can be very large and may take significant time to extract. Ensure you have sufficient disk space and be patient during extraction.
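
Extraction failures from a full disk can be avoided with a rough pre-flight check. A sketch (check_space is our own helper; the 3x multiplier is a conservative assumption based on the expansion estimate above, not an exact figure):

```shell
# Warn if free space looks insufficient for extraction (assumes ~3x expansion)
check_space() {
  archive="$1"
  need_kb=$(( $(du -k "$archive" | awk '{print $1}') * 3 ))
  free_kb=$(df -kP . | awk 'NR==2 {print $4}')
  if [ "$free_kb" -lt "$need_kb" ]; then
    echo "warning: ~${need_kb}KB needed but only ${free_kb}KB free" >&2
    return 1
  fi
  echo "ok: ${free_kb}KB free, ~${need_kb}KB estimated for extraction"
}

# Usage:
# check_space mainnet-stacks-blockchain-latest.tar.gz && tar -zxvf mainnet-stacks-blockchain-latest.tar.gz -C /target/directory
```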

Performance tips

  1. Use gcloud for the fastest downloads - requires authentication but provides significant speed improvements
  2. Download during off-peak hours - typically late night or early morning
  3. Use wired connections - avoid Wi-Fi for large downloads when possible
  4. Monitor disk space - extracted archives can be 2-3x larger than compressed files
  5. Consider streaming extraction with gcloud to save disk space
