Context

The site and its sub-projects need general-purpose object storage for serving files to clients and storing versioned data. Two concrete use cases already exist in the Terraform configuration:

  1. Map Tiles: Projects like the Leaflet demo and pathology-related blogs use pre-generated map tile images that need to be served publicly with aggressive edge caching via tiles.robbiepalmer.me.
  2. DVC (Data Version Control): ML pipelines produce versioned datasets and model artifacts that need to be stored privately and retrieved reproducibly via S3-compatible APIs.

Beyond these, there are several natural extensions for a project-oriented personal site:

  • Server-Generated Files: Storing outputs from server-side processes—e.g. a PDF resume generated by a build step, processed data pipeline results, or generated reports that are too large for git but need to be served to clients. (Client-side generation with local download is preferred when feasible, but server-side generation needs somewhere to put the output.)
  • User-Uploaded Content: If sub-projects ever accept user input, R2 provides a natural home for uploaded files without provisioning a separate storage service.
  • Backup & Archival: Snapshots of external data sources, API responses, or scraped datasets used in blog posts—ensuring reproducibility even if the original source disappears.
  • Image Hosting via R2 + Worker: As noted in ADR 029, Cloudflare Images costs ~$5/month for storage and delivery. Since Cloudflare merged Images and Image Resizing, the transformation API is available on a free tier (5,000 unique transformations/month) without needing a Pro plan or Images subscription. A Pages Function or Worker sitting in front of an R2 bucket could serve and resize images on-the-fly using the Image Transformations API—paying only for R2 storage and operations, with no fixed monthly cost. Beyond the free tier, transformations cost $0.50 per 1,000 unique variants, which is still significantly cheaper than the Images subscription at personal-project scale.
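The pricing comparison in the last bullet can be checked with a few lines of arithmetic. This is a minimal sketch using only the figures quoted above (5,000 free unique transformations, $0.50 per 1,000 beyond that, roughly $5/month for Images). The function names are hypothetical, pro-rata billing is an assumption, and R2 storage and operation charges are deliberately left out.

```python
FREE_TRANSFORMATIONS = 5_000   # free unique transformations per month (from the text)
PRICE_PER_1000 = 0.50          # USD per 1,000 unique transformations beyond the free tier
IMAGES_FIXED_MONTHLY = 5.00    # approximate Cloudflare Images cost cited from ADR 029


def transformation_cost(unique_transformations: int) -> float:
    """Monthly transformation cost in USD, assuming pro-rata billing past the free tier."""
    billable = max(0, unique_transformations - FREE_TRANSFORMATIONS)
    return billable * PRICE_PER_1000 / 1000


def break_even_transformations() -> int:
    """Unique transformations per month at which the cost equals the fixed fee."""
    paid = round(IMAGES_FIXED_MONTHLY / PRICE_PER_1000 * 1000)
    return FREE_TRANSFORMATIONS + paid


if __name__ == "__main__":
    for n in (1_000, 8_000, break_even_transformations()):
        print(n, transformation_cost(n))
```

Under these assumptions the pay-as-you-go model only matches the fixed fee at around 15,000 unique transformations per month, which is well beyond what a personal site is likely to generate.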

Decision

Use Cloudflare R2 as the primary object storage layer for the site and its sub-projects.

R2 buckets are provisioned via Terraform in infra/cloudflare/main.tf and scoped per use case (e.g. map-tiles, dvc). Public buckets are exposed via custom subdomains with Cloudflare Rulesets controlling cache behaviour. Private buckets are accessed via S3-compatible API tokens.

Why R2 Over Other Object Stores

Zero Egress Fees

The defining feature of R2. Traditional cloud object stores (AWS S3, GCP GCS) charge per-GB egress fees that scale with traffic. For a personal site that serves assets directly to browsers—map tiles, images, datasets—egress can become the dominant cost. R2 eliminates this entirely: you pay only for storage and operations (Class A/B), never for bandwidth. This makes cost predictable and effectively zero at personal-project scale.
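To make the cost argument concrete, the sketch below compares a store that bills egress with one that does not. The rates are placeholder assumptions chosen only to show the shape of the arithmetic, not quoted prices for any provider, and per-operation (Class A/B) charges are omitted.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Pricing:
    storage_per_gb: float  # USD per GB stored per month
    egress_per_gb: float   # USD per GB served to clients


# Placeholder rates (assumptions for illustration, not quoted prices).
EGRESS_BILLED = Pricing(storage_per_gb=0.02, egress_per_gb=0.09)
ZERO_EGRESS = Pricing(storage_per_gb=0.02, egress_per_gb=0.0)


def monthly_cost(pricing: Pricing, stored_gb: float, served_gb: float) -> float:
    """Monthly bill in USD from storage and egress only."""
    return stored_gb * pricing.storage_per_gb + served_gb * pricing.egress_per_gb


if __name__ == "__main__":
    # Same 10 GB of tiles, with traffic growing from 100 GB to 10 TB served.
    for served in (100, 1_000, 10_000):
        print(
            served,
            round(monthly_cost(EGRESS_BILLED, 10, served), 2),
            round(monthly_cost(ZERO_EGRESS, 10, served), 2),
        )
```

With zero egress the bill depends only on how much is stored, so a traffic spike changes nothing in this simplified model. With billed egress the bandwidth term quickly dominates.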

Platform Consolidation

The site already runs on Cloudflare: Pages for hosting (ADR 011), DNS (ADR 028), and Images (ADR 029), all managed with Terraform as IaC (ADR 017). Adding R2 keeps everything under one account, one billing relationship, and one set of API tokens, in line with the Less Is More principle. There is no new platform to sign up for, no new IAM model to learn, and no cross-cloud networking to configure.

S3 API Compatibility

R2 implements the S3 API, which means existing tooling (AWS CLI, Boto3, DVC, Terraform) works once it is pointed at the R2 endpoint. This provides an exit path: if R2 ever becomes insufficient, migrating to S3 or any S3-compatible store (MinIO, Backblaze B2) means copying the objects and changing endpoint and credential configuration, not rewriting application code.
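The "configuration change" can be illustrated by the arguments an S3 SDK client would receive. This is a hedged sketch: the helper names are hypothetical, the account ID and keys are dummy values, and the block only builds dictionaries so that it runs without any third-party package installed. The endpoint shape and the "auto" region follow Cloudflare's documented convention for R2's S3 API.

```python
from typing import Dict


def r2_client_kwargs(account_id: str, access_key_id: str, secret_access_key: str) -> Dict[str, str]:
    """Keyword arguments for an S3 client that talks to R2 (hypothetical helper)."""
    return {
        "endpoint_url": f"https://{account_id}.r2.cloudflarestorage.com",
        "aws_access_key_id": access_key_id,
        "aws_secret_access_key": secret_access_key,
        "region_name": "auto",  # R2 does not use AWS regions
    }


def s3_client_kwargs(region: str, access_key_id: str, secret_access_key: str) -> Dict[str, str]:
    """The same shape for AWS S3: no custom endpoint, a real region instead."""
    return {
        "aws_access_key_id": access_key_id,
        "aws_secret_access_key": secret_access_key,
        "region_name": region,
    }


if __name__ == "__main__":
    kwargs = r2_client_kwargs("example-account-id", "dummy-key", "dummy-secret")
    print(kwargs["endpoint_url"])
    # With boto3 installed, the client would be created as:
    #   boto3.client("s3", **kwargs)
```

Code that calls the resulting client (uploads, downloads, listings) stays the same whichever helper supplied the arguments, which is the portability the paragraph above relies on.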

Simplicity

Compared to provisioning AWS S3 (which requires an AWS account, IAM policies, bucket policies, and optionally CloudFront for CDN), R2 is operationally simpler. A single Terraform resource creates the bucket. A custom domain and a Cloudflare Ruleset handle public access and caching. No separate CDN configuration is needed: once a bucket is attached to a custom domain, Cloudflare's edge network sits in front of it.

Alternatives Considered

AWS S3 + CloudFront

  • Pros: Industry standard. Battle-tested. Broadest ecosystem support.
  • Cons: Egress fees scale with traffic. Requires a separate AWS account, IAM configuration, and CloudFront distribution—significant platform overhead for a personal site already consolidated on Cloudflare.
  • Decision: Rejected. The operational complexity and egress cost model don't justify the marginal maturity advantage for this use case.

Backblaze B2

  • Pros: Low-cost storage. Free egress when paired with Cloudflare CDN (Bandwidth Alliance).
  • Cons: Adds a second platform (Backblaze account, separate API keys). No native Cloudflare integration—still requires configuring Cloudflare as a CDN proxy manually. Terraform support is less mature.
  • Decision: Rejected. R2 provides the same cost benefit with tighter integration and fewer moving parts.

Google Cloud Storage

  • Pros: Tight integration with BigQuery and GCP ML tooling.
  • Cons: Same platform sprawl and egress cost issues as S3. No synergy with the existing Cloudflare stack.
  • Decision: Rejected.

Git LFS

  • Pros: Keeps assets versioned alongside code.
  • Cons: Bloats the repository for large or frequently changing files. GitHub LFS has bandwidth and storage quotas on the free tier. Not suitable for serving assets to end users.
  • Decision: Rejected. R2 + DVC is a superior model for versioning large artifacts without polluting git history.

Consequences

Positive

  • Zero Egress: Serving map tiles, datasets, and images to users costs nothing beyond storage and operations—ideal for a site where traffic is unpredictable.
  • Consolidation: All infrastructure remains under one Cloudflare account, managed via Terraform, in line with the Less Is More principle.
  • Scalable Image Strategy: R2 + a Pages Function/Worker provides a path to replace or supplement Cloudflare Images (ADR 029) with a pay-as-you-go model, eliminating the fixed $5/month cost.
  • Tooling Compatibility: S3-compatible API means DVC, AWS CLI, and any S3 SDK work out of the box with no custom integration.
  • IaC: Buckets, access controls, and caching rules are all defined in Terraform, versioned in git, and reviewed via Pull Requests.
  • Edge Performance: Public R2 buckets served via custom domains benefit from Cloudflare's global edge network automatically—no separate CDN setup required.

Negative

  • Cloudflare Dependency: Further deepens the reliance on a single platform. If Cloudflare changes pricing or terms, more infrastructure is affected. Mitigated by S3 API compatibility providing a portable exit path.
  • Operational Limits: R2 allows at most 1 write per second to the same object key (excess writes receive HTTP 429) and does not publish per-bucket read throughput caps for custom domain access. By comparison, S3 documents 5,500 GET and 3,500 PUT requests per second per prefix, with aggregate throughput scaling as prefixes are added. In practice, reads through a custom domain are served from Cloudflare's edge cache, so origin throughput is rarely the bottleneck. This is unlikely to matter at personal-project scale, but is worth noting if write-heavy workloads emerge.
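
If a write-heavy workload did emerge, one way to stay under the per-key write limit is to space out writes on the client side. The sketch below is one possible approach, not something the current setup uses: the class name is hypothetical, it tracks state in a single process only, and it is not thread-safe. The clock and sleep functions are injectable so the behaviour can be checked without real waiting.

```python
import time
from typing import Callable, Dict


class PerKeyWriteThrottle:
    """Spaces out writes so each object key is written at most once per interval."""

    def __init__(
        self,
        min_interval: float = 1.0,
        clock: Callable[[], float] = time.monotonic,
        sleep: Callable[[float], None] = time.sleep,
    ) -> None:
        self._min_interval = min_interval
        self._clock = clock
        self._sleep = sleep
        self._last_write: Dict[str, float] = {}

    def wait_time(self, key: str) -> float:
        """Seconds still to wait before `key` may be written again."""
        last = self._last_write.get(key)
        if last is None:
            return 0.0
        return max(0.0, self._min_interval - (self._clock() - last))

    def acquire(self, key: str) -> float:
        """Block until `key` may be written, record the write, return the delay."""
        delay = self.wait_time(key)
        if delay > 0:
            self._sleep(delay)
        self._last_write[key] = self._clock()
        return delay


if __name__ == "__main__":
    throttle = PerKeyWriteThrottle()
    for _ in range(3):
        waited = throttle.acquire("dvc/cache/example-object")
        print(f"waited {waited:.2f}s before writing")
```

Writes to different keys are not delayed by each other. A caller would invoke `acquire(key)` immediately before each upload and should still handle HTTP 429 responses, since other writers may touch the same key.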