Incompatible: Static S3 Website With CloudFront Forwarding All Headers

A small lesson learned while setting up a static website with S3 and CloudFront

I created a static website hosted in an S3 bucket named www.example.com (not the real name) and enabled static website hosting on the bucket. I wanted delivery to be fast for everyone around the world, so I created a CloudFront distribution in front of the S3 bucket.

I wanted S3 to automatically add “index.html” to URLs ending in a slash (CloudFront’s default root object only applies to the root URL, not to subdirectories), so I configured the CloudFront distribution to access the S3 bucket as a website, using www.example.com.s3-website-us-east-1.amazonaws.com as the origin server.

Before sending all of the www.example.com traffic to the new setup, I wanted to test it, so I added test.example.com to the list of CNAMEs in the CloudFront distribution.

After setting up Route53 so that DNS lookups for test.example.com would resolve to the new CloudFront endpoint, I loaded it in my browser and got the following error:

404 Not Found

Code: NoSuchBucket
Message: The specified bucket does not exist
BucketName: test.example.com
RequestId: [short string]
HostId: [long string]

Why would AWS be trying to find an S3 bucket named test.example.com? That hostname pointed at the CloudFront distribution endpoint, and CloudFront was configured to get the content from www.example.com.s3-website-us-east-1.amazonaws.com.

After debugging, I found out that the problem was that I had configured the CloudFront distribution to forward “all” HTTP headers. I thought that this would be a sneaky way to turn off caching in CloudFront so that I could keep updating the content in S3 and not have to wait to see the latest changes.

However, this also meant that CloudFront was forwarding the HTTP Host header from my browser to the S3 website endpoint. When S3 saw a request for the host test.example.com, it looked for a bucket of the same name and didn’t find one, resulting in the above error.
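To make the failure concrete, here is a minimal sketch of how the lookup appears to behave from the outside. It is a simplified model, not S3's actual code: the function name bucket_for_host and the regular expression are my own assumptions for illustration.

```python
import re

# Simplified model (an assumption, not S3's real implementation):
# website endpoints look like <bucket>.s3-website-<region>.amazonaws.com
# or <bucket>.s3-website.<region>.amazonaws.com.
_WEBSITE_ENDPOINT = re.compile(
    r"^(?P<bucket>.+)\.s3-website[-.][a-z0-9-]+\.amazonaws\.com$"
)


def bucket_for_host(host: str) -> str:
    """Return the bucket name a website endpoint would look up for this Host header."""
    host = host.strip().lower().rstrip(".")
    match = _WEBSITE_ENDPOINT.match(host)
    if match:
        # Host is the endpoint itself: the bucket is the part before the suffix.
        return match.group("bucket")
    # Any other Host is treated as a custom domain, so the whole
    # hostname is taken as the bucket name.
    return host


# Host header sent when CloudFront uses the origin's own hostname
print(bucket_for_host("www.example.com.s3-website-us-east-1.amazonaws.com"))
# Host header forwarded unchanged from the browser
print(bucket_for_host("test.example.com"))
```

Under this model the forwarded header maps to a bucket called test.example.com, which is exactly the name in the NoSuchBucket error.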

When I turned off forwarding all HTTP headers in CloudFront, it started sending the origin’s own hostname in the Host header:

Host: www.example.com.s3-website-us-east-1.amazonaws.com

which S3 correctly interpreted as a request for the bucket www.example.com in website mode (adding index.html after trailing slashes).
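If you want to check a distribution for this mistake, a small helper along these lines may be useful. It is only a sketch, and it assumes the legacy ForwardedValues layout of a cache behavior in the CloudFront API; distributions that use cache policies and origin request policies store this setting elsewhere. The helper name forwards_host_header and the two sample dictionaries are hypothetical.

```python
def forwards_host_header(cache_behavior: dict) -> bool:
    """Return True if this cache behavior would pass the viewer's Host header to the origin.

    Assumes the legacy ForwardedValues.Headers structure, where
    Items == ["*"] means "forward all headers".
    """
    headers = cache_behavior.get("ForwardedValues", {}).get("Headers", {})
    items = [name.lower() for name in headers.get("Items", [])]
    return "*" in items or "host" in items


# Sample fragment resembling the broken setup: all headers forwarded.
FORWARD_ALL = {
    "ForwardedValues": {
        "QueryString": False,
        "Cookies": {"Forward": "none"},
        "Headers": {"Quantity": 1, "Items": ["*"]},
    }
}

# Sample fragment resembling the fixed setup: no headers forwarded.
FORWARD_NONE = {
    "ForwardedValues": {
        "QueryString": False,
        "Cookies": {"Forward": "none"},
        "Headers": {"Quantity": 0},
    }
}

for name, behavior in [("all", FORWARD_ALL), ("none", FORWARD_NONE)]:
    print(name, forwards_host_header(behavior))
```

The check also flags an explicit "Host" entry, since forwarding only that one header would cause the same error with an S3 website origin.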

It makes sense for CloudFront to support forwarding the Host header from the browser, especially when your origin server is a dynamic web site that can act on the original hostname. You can set up a wildcard *.example.com DNS entry pointing at your CloudFront distribution, and have the back end server return different results depending on what host the browser requested.
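As an illustration of that use case, here is a minimal sketch of a dynamic origin that varies its response by the forwarded Host header. It is a plain WSGI application using only the Python standard library; the names site_for_host and app, and the response text, are made up for this example.

```python
def site_for_host(host: str, base_domain: str = "example.com") -> str:
    """Map a Host header such as 'blog.example.com:8080' to a site name ('blog')."""
    hostname = host.split(":")[0].strip().lower()
    suffix = "." + base_domain
    if hostname.endswith(suffix) and len(hostname) > len(suffix):
        return hostname[: -len(suffix)]
    return "default"


def app(environ, start_response):
    """WSGI app that returns different content depending on the requested host."""
    site = site_for_host(environ.get("HTTP_HOST", ""))
    body = f"Hello from the {site} site\n".encode("utf-8")
    start_response(
        "200 OK",
        [
            ("Content-Type", "text/plain; charset=utf-8"),
            ("Content-Length", str(len(body))),
        ],
    )
    return [body]


if __name__ == "__main__":
    # Local test server; behind CloudFront this would be the custom origin.
    from wsgiref.simple_server import make_server

    make_server("127.0.0.1", 8000, app).serve_forever()
```

This only works if CloudFront forwards the Host header, which is the same setting that breaks an S3 website origin.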

However, passing the Host header doesn’t work so well for an origin server S3 bucket in website mode. Lesson learned and lesson passed on.