Scale Or Optimize

As documented earlier this week, I migrated my whole online ecosystem to AWS over the weekend. It was a fun project. I shared this with a co-worker at lunch today, which naturally led into a conversation about our own environment in the healthcare industry. Because of HIPAA compliance and such, it’s probably not a worthwhile endeavor (for now at least). Our conversation led to scale in general. Does every app really need to scale to the level that AWS allows? Probably not. We own our servers, and, as far as I know, we have maybe three boxes…not even enough to fill a rack. Maybe there’s more, but we support tons of stuff on those three boxes. AWS is way overkill for my measly WordPress site, but it was fun to build and get some experience nonetheless.

So, should we plan for scaling? Or should we optimize our existing code base to be as performant as possible while minimizing additional overhead in load balancers and other such fun magic? We chose to simplify. Less code to maintain means we get to do more fun projects in the future, because we carry far less cognitive load keeping up an increasingly complex code base that has more room for bugs. Fewer lines and a less complicated architecture keep smells from creeping in because there’s no place for them to hide.

Migrating to AWS

I spent some time over the course of this past week migrating my entire online ecosystem to AWS. There was talk at work about using AWS SQS for some pushes we do for various objects to various back-end service providers. I already have a CDN at AWS, so I took the opportunity to migrate a piece at a time.

I started by moving DNS to Route53. I created a new CloudFront distribution for my static site and requested some new certificates using ACM. Once I got certs applied to the new CloudFront distribution, I moved my static site (this one) to S3. I shuffled the DNS around to point to the right place. I’d spent some time a while back with GitLab’s pipelines to automatically deploy this Jekyll site, which left me with a dilemma: I couldn’t automatically deploy to S3 from GitLab. Well, I could; I just wasn’t sure how yet. More on this later.

The other project I took on was getting familiar with RDS and EC2. I started by pointing the existing site (not on AWS) to the RDS instance. I got that working, then recreated my WordPress install on EC2. I got that working, but certs were broken… 😕 The whole reason I switched to ACM was to keep from having to upload new certs through the AWS CLI each time I needed to renew. After some quick research, I discovered that ACM can manage certs automatically but only for Elastic Beanstalk and load balanced instances. Oi! I guess I’m going to get real comfortable with AWS!
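
The mechanics of getting the database into RDS are basically a dump and import. Here’s a minimal sketch of that step; the endpoint, user, and database names are placeholders, not my actual values:

# Dump the existing WordPress database, then import it into the RDS instance.
mysqldump -u root -p wordpress > wordpress.sql
mysql -h my-rds-endpoint.rds.amazonaws.com -u wp_user -p wordpress < wordpress.sql

After the import, the existing site just needs its database host pointed at the RDS endpoint.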

I did some reading on ELB and got everything sorted out there for a WordPress install. After some trial and error, I finally got it working on Saturday morning. This was the trickiest part, not due to anything AWS-related, but to how I wanted my SSL to work. I don’t want anything non-SSL on any site. After some finagling of the WordPress database, and a lot of tinkering with the ELB environment and load balancer, I’ve got it working exactly how I want it.
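
For the curious, the database finagling mostly boiled down to pointing WordPress at the https:// URLs. Here’s a minimal sketch of that step, assuming the default wp_options table; the endpoint, user, and database names are placeholders rather than my actual values:

# Point the WordPress siteurl and home options at the https URL (sketch only).
mysql -h my-rds-endpoint.rds.amazonaws.com -u wp_user -p wordpress <<'SQL'
UPDATE wp_options
   SET option_value = 'https://example.com'
 WHERE option_name IN ('siteurl', 'home');
SQL

Behind a load balancer you may also need WordPress to trust the X-Forwarded-Proto header so it knows the original request was over SSL, but that part varies by setup.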

Now, back to CI for my static site. Naturally I started with the code repository. I moved the repo to CodeCommit, created a CodeBuild project, then automated all of it with CodePipeline. This was the easiest part of the whole process, and it put a big smile on my face. Even this post has been automatically deployed using all of the magic above. Git push, and we’re on our way.
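
The build step itself is tiny. Here’s a rough sketch of what my CodeBuild project effectively runs; the bucket name is a placeholder, and your Ruby/Jekyll setup may differ:

# Build the Jekyll site and push the output to the S3 bucket behind CloudFront.
bundle install
bundle exec jekyll build
aws s3 sync _site/ s3://my-static-site-bucket --delete

In practice those commands live in the CodeBuild project’s build spec, with CodePipeline kicking off a build whenever CodeCommit sees a push.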

To summarize:

  • Moved DNS to Route53
  • Moved SSL certs to ACM
  • Moved static professional site to S3
  • Moved WordPress family site to ELB with load balancer (like we’ll ever need it 😉 )
  • Moved repos for both sites to CodeCommit
  • Automated deployment for each site to ELB and S3 with CodeBuild and CodePipeline respectively
  • All of the above included establishing IAM roles and policies to ensure appropriate access for every service involved in the process

After some lengthy evenings (in the midst of trying to buy a house!), I was done…at least for now. It was frustrating at times because I was in territory that was very new and foreign to me. Thanks to lots of tutorials, I was successful.

I was very pleased with the entire experience, though. Amazon has made it incredibly easy to onboard into their ecosystem…almost too easy. It was an extremely fun project. Their documentation is stellar. The community around it is also very active and seems to provide lots of helpful tips. I’m not on a paid support plan, so it took a little bit of digging to find the answers I needed, but I can’t see myself moving from AWS for quite a while.

Refactor or Hack

In my day-to-day work, I often come across code that was written before I arrived. Sometimes it’s great. Easy to understand, abstracted at an appropriate level, maybe even commented (gasp!). Sometimes, though, it’s been written poorly. If the person is still with the company, I can just ask about their intentions or for clarity if something isn’t easily followed. The worst case is when the code is written poorly and by someone who is no longer with the company. Now I have code that is hard to understand, the person who wrote it is gone, and I’m being asked to either fix it or implement a new feature utilizing pieces of this existing code.

As I dig further in to understand what’s happened before me, I realize that there are a multitude of issues with the existing code. It didn’t account for this case or that business rule that it was supposed to. It’s duplicating data in two tables or it’s written in an unintelligible way in order to be “fancy”. I like fancy code as much as the next developer, but more often than not, simplicity is best. Me: 0. Technical Debt: 1.

This brings me to the all-important question: Refactor or hack? Should I take the time to refactor the existing code, clean it up, and make it easier for the next person that comes along or should I hack at the existing code to deliver the feature or bug fix on time and on budget? It’s difficult for a developer to answer this question because most of us want to do a good job and create the best possible code base even if it takes longer. We take pride in our work and genuinely want to do a good job. Often the pressure from the business side reveals that they don’t care how well the code works under the covers. They just want it to work and work today.

I’m fortunate to have a great business team that understands the long game and genuinely wants an easily maintainable code base, but I still struggle with “how long has this story been kicked down the sprint road because of other broken places in the code?” I don’t have an answer for this question at the moment. It’s just a thought that’s been tumbling around in my mind as I’ve added at least three stories to refactor large portions of code in an effort to simplify and make go-forward maintenance easier.

Let’s Encrypt Nginx & AWS CloudFront

As I learn so many different technologies, I figured it would be helpful to set up a sub domain for things not directly related to the Walsh family. In typical Stephen-like fashion, I didn’t half-ass it. It’s probably overkill, over-engineered, and way more complicated than it needs to be, but that’s why you love me. ♥️

High Level

If you don’t want to read through the more technical steps, take a look at the table below to get the high level. In a nutshell, I wanted simple for the top-level domain (i.e. WordPress), and I wanted a sub domain that wasn’t WordPress where I could play around and keep some of my professional work grouped together as a working portfolio of sorts.

Domain                            Location
connectwithawalsh.com             SSL’d Nginx on Digital Ocean
stephen.connectwithawalsh.com     SSL’d Nginx on Digital Ocean
media.connectwithawalsh.com       SSL’d S3/CloudFront

Again, for those who want to skim, here are links to each of the tutorials that I followed to get the above setup. I’m a big fan of Digital Ocean, and you should be too 🙂

Links

  • Ubuntu 16.04 initial server setup [1]
  • Nginx server blocks (for the top level and sub domains on Digital Ocean) [2]
  • SSL Encryption with Let’s Encrypt for Nginx [3]
  • S3 Static Site CloudFront CDN [4]
  • Let’s Encrypt S3/CloudFront CDN [5]
  • MySQL backup to S3 [6]

Now, here are the specifics of how I set things up. I hope this is beneficial in some way to others out there who want to run a secure, “light-weight” infrastructure that costs less than $15/month. I’m assuming that you know how to get around comfortably on the command line, know how to make some basic DNS adjustments when needed, and have a “can-do” attitude, because you’ll more than likely enter something incorrectly like I did and have to backtrack. Get comfortable with snapshotting your server so you can roll back without losing all of your work.

Ubuntu 16.04 initial server setup

I’ve used Ubuntu for years. I’ve tried other stuff, but Ubuntu just seems to keep trucking along with no signs of stopping. There’s nothing really fancy in this tutorial and nothing extra special to my setup.

Nginx server blocks to set up a sub domain hosted on the same server

I hadn’t used Nginx for WordPress until recently. I’ve been happy with how easy it is to set up, even with SSL (which we’ll get to later). Nothing really important to note here. I’ll give some more detail on the server blocks in the section below because they have SSL directives that won’t make sense outside the context of encryption.

SSL Encryption with Let’s Encrypt for Nginx

This is where things start to get a bit more custom and interesting. Let’s Encrypt has not provided any “out of the box” functionality for Nginx. It’s not wholly unsupported, but it does require a bit of know-how. First, this tutorial is going to ask you to install Let’s Encrypt manually from a Git repository. This is not necessary. Simply run the apt-get command below. If you have a sudo user set up (which you should!), then this will run without a hitch.

sudo apt-get install letsencrypt

Installing from the package manager makes running some of the other commands quite a bit easier. Instead of:

cd /opt/letsencrypt
./letsencrypt-auto

You just do:

sudo letsencrypt --foobar command

When you get to the point of actually wanting to generate your certs, you run this:

sudo letsencrypt certonly --webroot -w /var/www/mytoplevel.com/html/ -d www.mytoplevel.com -d mytoplevel.com -w /var/www/sub.mytoplevel.com/html -d sub.mytoplevel.com

You’ll notice I didn’t include the S3/CloudFront sub domain. There’s a reason for that, which we’ll get to shortly. If you followed the guide above, you’ll have three sites in /etc/nginx/sites-available. Now for their server blocks.

Default server block

server {
        listen 80 default_server;
        listen [::]:80 default_server;

        root /var/www/html;

        # Add index.php to the list if you are using PHP
        index index.php index.html index.htm index.nginx-debian.html;

        server_name _;      

        # other configuration below
}

Top level domain server blocks

server {
        listen 80;
        listen [::]:80;
        server_name toplevel.com www.toplevel.com;
        return 301 https://$server_name$request_uri; # NOTE this line. It may have to be commented out, then commented in after the S3/CloudFront configuration
}

server {
        # SSL configuration

        listen 443 ssl http2 default_server;
        listen [::]:443 ssl http2 default_server;
        include snippets/ssl-toplevel.com.conf;
        include snippets/ssl-params.conf;

        root /var/www/toplevel.com/html;

        # Add index.php to the list if you are using PHP
        index index.php index.html index.htm index.nginx-debian.html;

        server_name toplevel.com www.toplevel.com;

        # other configuration below
}

Sub domain server blocks

server {
        listen 80;
        listen [::]:80;
        server_name sub.toplevel.com;
        return 301 https://$server_name$request_uri;
}

server {
        # SSL configuration

        listen 443 ssl http2;
        listen [::]:443 ssl http2;
        include snippets/ssl-toplevel.com.conf;  # NOTE that this is the same file as the top level domain based on the tutorial
        include snippets/ssl-params.conf;

        root /var/www/sub.toplevel.com/html;

        # Add index.php to the list if you are using PHP
        index index.php index.html index.htm index.nginx-debian.html;

        server_name sub.toplevel.com;

        # other configuration below
}

Make sure you point your top level and sub domain server blocks to the same snippets for ssl-toplevel.com.conf and ssl-params.conf. Note also that you may have to comment out the 301 redirect on your top level domain if the S3/CloudFront configuration doesn’t work as expected. That line reroutes a standard http:// request to the https:// endpoint, which doesn’t yet exist for the S3/CloudFront sub domain.

Check your work

Can you get to both domains with an encrypted connection? If you followed the server block tutorial, you should have some test index.html files to validate that your domain and sub domain are routing and working properly with SSL. Try them with http:// to make sure requests are rerouting. If so, take a snapshot of your VPS. I took about 5-7 snapshots at various points throughout this process in case I messed something up.

S3 Static Site CloudFront CDN

Setting up a static site was pretty straightforward. Just be mindful that I am not using Amazon Route 53 for DNS, so I skipped those steps in this tutorial. I did a standard CNAME pointer from my sub domain to the CloudFront-given domain name. Make sure you can reach the CloudFront-provided URL before moving forward. It may take 15-20 minutes to provision everything and be ready for traffic. Remember that you are creating a bucket for your sub domain, so title it appropriately. In my case this was media.connectwithawalsh.com. Since I’m not using a top level domain and www-style domain in S3, I had to adjust the tutorial to suit my needs.
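
If you’d rather script the bucket setup than click through the console, a rough CLI equivalent looks like this (the bucket name and region are examples, and the tutorial itself walks through the console instead):

# Create the bucket named after the sub domain and enable static website hosting.
aws s3 mb s3://media.connectwithawalsh.com --region us-east-1
aws s3 website s3://media.connectwithawalsh.com --index-document index.html --error-document error.html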

Let’s Encrypt S3/CloudFront CDN

This tutorial was a great help, but I pieced together bits from some other sites as well. The first thing you’ll want to do is install the AWS CLI. This will require you to install python-pip from apt-get if you’re using Ubuntu. While you’re waiting on that, go set up a new IAM user on AWS and give it an appropriate policy to upload certificates.
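
On Ubuntu 16.04 that install boils down to something like this (a quick sketch using the stock packages):

# Install pip from the Ubuntu repos, then the AWS CLI from PyPI.
sudo apt-get update
sudo apt-get install -y python-pip
sudo pip install awscli
aws --version   # sanity check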

To recap, we have the AWS CLI installed and an IAM user ready to send a not-yet-generated certificate to our CloudFront distribution. Still with me? If you haven’t already, take a snapshot before we start working on this next section.

First, you should already have a terminal open from which you’ve been running all of these other commands. Open a second one. This is important. When you run the next letsencrypt command, it’s going to validate a file served through your CloudFront distribution. You must place that file there with the AWS CLI before hitting Enter in the first terminal window. We’ll generate that file in just a bit. Ready?

 sudo letsencrypt certonly --manual

This is going to give you a prompt for your CloudFront sub domain (i.e. media.connectwithawalsh.com). Remember when we didn’t generate a certificate for this domain earlier? We did that because we wanted to run this one manually. It’s special. Follow the prompts and you’ll get something that looks like this:

Make sure your web server displays the following content at
http://your-site.whatever/.well-known/acme-challenge/some_long_path before continuing:

some_long_string

Content-Type header MUST be set to text/plain.

... <snip> ...
Press ENTER to continue

Notice it says, “Press ENTER to continue”. Don’t do that yet. We have to copy the acme-challenge file up to your S3 bucket first. If you already ran the aws configure command with your new IAM user, great. If not, go ahead and run this in a separate terminal window:

aws configure

It’s going to prompt you for your AWS Access Key ID, AWS Secret Access Key, and your Default region name. You should enter the default region for your S3 bucket. When you’ve configured this correctly, you can view your credentials and config files in ~/.aws just to make sure they are correct.

Now we’re ready to put the acme-challenge file into our S3 bucket. Store some_long_string in a temp file first, ensuring that you’ve replaced some_long_string with the long string from your first terminal window.

printf "%s" some_long_string > /tmp/acme-challenge

Then upload the verification file to your S3 bucket ensuring that you’ve replaced some_long_path with the long path from your first terminal window.

aws s3 cp /tmp/acme-challenge s3://your_s3_bucket_name/.well-known/acme-challenge/some_long_path --content-type text/plain

Sanity check that the file is there.

curl -D - http://your-site.whatever/.well-known/acme-challenge/some_long_path

I had a bit of trouble with this. I was able to successfully upload the file to my S3 bucket, but for some reason I couldn’t get to it with my sub domain address. Remember that line we noted in our top level domain server block? Try commenting it out and restarting Nginx. You can also set each level of that URL in your S3 bucket to public by clicking the checkbox, then going to Actions and selecting “Make Public”. Do this starting with .well-known, then acme-challenge, then the actual some_long_path file itself. Try loading the file in a private browser session with no caches until it works, then run the curl command above. When you’re able to view that file, you can press Enter in your first terminal window. If you get a “Self-verify of challenge failed.” error from Let’s Encrypt, it’s more than likely because it can’t get to that file. Read the description of the error and troubleshoot as necessary; it’s a pretty detailed log of what happened.
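
If you’d rather skip the console clicking, the same “make it public” step can be done from the CLI. This is just an alternative sketch using the same placeholder names as above, not what the tutorial prescribes:

# Make the uploaded challenge file publicly readable.
aws s3api put-object-acl --bucket your_s3_bucket_name --key .well-known/acme-challenge/some_long_path --acl public-read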

Assuming that you’ve passed the self-verify, you should get a message like “Congratulations! Your certificate and chain have been saved at /etc/letsencrypt/live/your-site.whatever/fullchain.pem”. Now you can upload your certificate for use in CloudFront. It took some trial and error, but the following worked for me:

sudo aws iam upload-server-certificate \
    --server-certificate-name your-site.whatever \
    --certificate-body file:///etc/letsencrypt/live/your-site.whatever/cert.pem \
    --private-key file:///etc/letsencrypt/live/your-site.whatever/privkey.pem \
    --certificate-chain file:///etc/letsencrypt/live/your-site.whatever/chain.pem \
    --path /cloudfront/

Obviously you’ll want to make sure that the your-site.whatever directory is where your certificates are stored. Also mind the forward slashes and the third / in the file:/// directive. My certs are stored in /etc/letsencrypt, so I needed that third one to tell the AWS CLI where to start looking for the file. If it works, you’ll get a confirmation. Now head over to CloudFront, select the certificate you just uploaded under “Distribution Settings”, and complete the remainder of Nathan Parry’s tutorial from Step 4.
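
If you want to double-check the upload before heading to the console, listing the server certificates should show the new one under the /cloudfront/ path:

# Confirm the certificate was uploaded where CloudFront can find it.
aws iam list-server-certificates --path-prefix /cloudfront/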

If everything is working, take a snapshot. 📸

MySQL backup to S3

Home stretch! Once you’ve got all of these pieces in place, you can finish up the WordPress installation if that’s what you’re using. If not, you’re free to go. 🏃 If you are using WordPress, let’s back up that precious MySQL database to S3 as well. Go create another IAM user, and give it AmazonS3FullAccess. I created a separate group for S3 access so I could reuse that group later if needed. Make sure you run s3cmd --configure before creating the bash script and running it. s3cmd is going to use this other IAM user. In that bash script you’ll need to specify S3_BUNDLE, which is a bucket name. I created a separate bucket for this sole purpose so that my MySQL backups are kept separate from the sub domain set up earlier. Make your bash script executable and run it. If all goes well, you should see a gzipped file in your bucket. Set up a cron task to automate your nightly or weekly backups, and you’re off to the races.
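
If you want a starting point before digging into the linked tutorial, a stripped-down version of that backup script might look like this; the user, database, password, and bucket names are all placeholders:

# Minimal sketch of a nightly MySQL-to-S3 backup.
S3_BUNDLE="my-mysql-backup-bucket"
STAMP=$(date +%Y-%m-%d)
DUMP="/tmp/wordpress-${STAMP}.sql.gz"

# Dump and compress the WordPress database (credentials here are placeholders;
# a ~/.my.cnf file is a better home for them than the script itself).
mysqldump -u backup_user -pYOUR_PASSWORD wordpress | gzip > "${DUMP}"

# Ship it to S3 using the credentials from s3cmd --configure, then clean up.
s3cmd put "${DUMP}" "s3://${S3_BUNDLE}/"
rm "${DUMP}"

A crontab entry along the lines of 0 2 * * * /home/youruser/backup-mysql.sh (plus a similar one for letsencrypt renew) covers the cron pieces mentioned in the summary below.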

Summary

Whew! That was a lot. Here’s what we accomplished:

  1. We set up an Nginx server that hosts two domains (one top level, and one sub).
  2. We secured both domains with SSL certificates from Let’s Encrypt.
  3. We created some IAM users to upload certificates and MySQL backups.
  4. We created a static S3 bucket behind a CloudFront distribution for content delivery.
  5. We secured our sub domain that points to our CloudFront CDN with an SSL certificate from Let’s Encrypt.
  6. We automated certificate renewal and MySQL backups with cron tasks.

This seems like a lot of work, but securing your server and sub-domains is a good practice to get in. Let me know if there’s anything I missed or need to correct in the comments.