Automating AWS Backups

As developers, we all know that things go wrong: machines break (sometimes for totally unpredictable reasons) and data can become corrupt. What better way to mitigate these problems than taking backups and snapshots of your infrastructure and data?

In the office we use AWS extensively for our infrastructure. We have various tools running in EC2 instances that we use for our day-to-day development. You’d think that a suite of tools as sophisticated as AWS would have a way of automatically backing up volumes and instances and retaining them for periods of time!

We found a bash script that pretty much did what we were after, but there were licence implications with using it. Bash scripting can be a bit of a black art, especially for sophisticated operations, so why not do it in a language we all understand: Java!

Spring Boot allows us to create a command line Java application, simply by implementing the CommandLineRunner interface:

@SpringBootApplication
public class BackupApplication implements CommandLineRunner {

  private final BackupService backupService;

  @Autowired
  public BackupApplication(final BackupService backupService) {
    this.backupService = backupService;
  }

  public static void main(String[] args) {
    SpringApplication.run(BackupApplication.class, args);
  }
 
  @Override
  public void run(String... args) throws Exception {
    backupService.backup();
    backupService.purge();
  }
}

Connecting to AWS using an already created access key and secret is also relatively simple using the AWS SDK. You can pass the parameters below on the command line (-Daws.secret.key=KEY):

@Configuration
public class AwsConfiguration {

  @Bean
  public AWSCredentials credentials(@Value("${aws.secret.key}") final String secretKey, @Value("${aws.secret.pass}") final String secretPass) {
    return new BasicAWSCredentials(secretKey, secretPass);
  }

  @Bean
  public AmazonEC2Client ec2Client(final AWSCredentials credentials) {
    return new AmazonEC2Client(credentials);
  }
}
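
The `-D` flags set JVM system properties, which Spring can resolve in `@Value` placeholders. They need to appear before `-jar` on the command line (for example `java -Daws.secret.key=KEY -Daws.secret.pass=SECRET -jar backup.jar`, where the jar name is hypothetical), otherwise they are passed to the application as ordinary arguments. As a minimal sketch of failing early with a readable message when a property is missing, a small helper could look like this. `RequiredProperties` is a made-up name and is not part of the original application:

```java
import java.util.Optional;

/** Hypothetical helper, not part of the original application. */
final class RequiredProperties {

  private RequiredProperties() {
  }

  /**
   * Returns the JVM system property with the given name, or throws with a
   * readable message when it is missing or blank.
   */
  static String require(final String name) {
    return Optional.ofNullable(System.getProperty(name))
        .map(String::trim)
        .filter(value -> !value.isEmpty())
        .orElseThrow(() -> new IllegalStateException(
            "Missing required system property: -D" + name + "=..."));
  }
}
```

Calling `RequiredProperties.require("aws.secret.key")` at startup would then report exactly which flag was forgotten.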

To find the instances I want to back up, I’ve tagged them with ‘mint-backup’ as the key and the backup period as the value (e.g. ‘nightly’, ‘weekly’). The tag key and value are passed in on the command line as arguments and are used by the Backup Service:

@Autowired
public BackupServiceImpl(final AmazonEC2Client ec2Client,
  @Value("${tag.key}") final String tagKey,
  @Value("${tag.value}") final String tagValue,
  @Value("${retention.days}") final long days) {
  this.ec2Client = ec2Client;
  this.tagKey = tagKey;
  this.tagValue = tagValue;
  this.duration = Duration.ofDays(days);
}

The AWS SDK allows you to search on Tags:

private List<Volume> getVolumes() {
  // Match volumes carrying the configured tag, e.g. tag:mint-backup=nightly.
  final Filter filter = new Filter("tag:" + tagKey, Collections.singletonList(tagValue));
  final DescribeVolumesRequest request = new DescribeVolumesRequest().withFilters(filter);

  DescribeVolumesResult result = ec2Client.describeVolumes(request);
  // Copy the first page into a new list so that later pages can be appended
  // without adding the first page twice.
  final List<Volume> volumes = new ArrayList<>(result.getVolumes());
  String token;
  while ((token = result.getNextToken()) != null) {
    request.setNextToken(token);
    result = ec2Client.describeVolumes(request);
    volumes.addAll(result.getVolumes());
  }
  return volumes;
}
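
The token loop above is a general pattern: fetch a page, collect its items, and repeat while a next token is returned. As a sketch of that pattern on its own, with no AWS dependency, the following uses a hypothetical `Page` type and `Paginator` helper (neither is part of the AWS SDK or the original application), where `fetchPage` stands in for a call such as `describeVolumes`:

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Function;

/** Hypothetical stand-in for one page of a Describe* result. */
final class Page<T> {
  final List<T> items;
  final String nextToken; // null when there are no more pages

  Page(final List<T> items, final String nextToken) {
    this.items = items;
    this.nextToken = nextToken;
  }
}

/** Hypothetical helper that drains a token-paginated source. */
final class Paginator {

  private Paginator() {
  }

  /**
   * Fetches every page, starting with a null token, and returns all items in
   * the order they were received.
   */
  static <T> List<T> collectAll(final Function<String, Page<T>> fetchPage) {
    final List<T> all = new ArrayList<>();
    String token = null;
    do {
      final Page<T> page = fetchPage.apply(token);
      all.addAll(page.items);
      token = page.nextToken;
    } while (token != null);
    return all;
  }
}
```

Because the result list is always a fresh `ArrayList`, each page is added exactly once and the list returned by the source is never modified.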

Now that I have a list of volumes, I can create a snapshot of each one:

private List<String> createSnapshots(final List<Volume> volumes) {
  final List<String> snapshotIds = new ArrayList<>();
  volumes.forEach(volume -> {
    // The second constructor argument is the snapshot description.
    final CreateSnapshotRequest createSnapshotRequest = new CreateSnapshotRequest(volume.getVolumeId(),
        "SNAPSHOT-" + DateTime.now().getMillis());
    final CreateSnapshotResult createSnapshotResult = ec2Client.createSnapshot(createSnapshotRequest);
    final Snapshot snapshot = createSnapshotResult.getSnapshot();
    snapshotIds.add(snapshot.getSnapshotId());
  });
  return snapshotIds;
}

Once they are all created they are then tagged with a purge date, so that another process can remove them once they have expired:

private void tagSnapshots(final List<String> snapshotIds) {
  // duration is the java.time.Duration set in the constructor; Joda's
  // DateTime.plus(long) takes milliseconds.
  final long purgeDate = DateTime.now().plus(duration.toMillis()).getMillis();
  final Tag purgeTag = new Tag("purge-date", String.valueOf(purgeDate));
  final CreateTagsRequest createTagsRequest = new CreateTagsRequest()
    .withTags(purgeTag)
    .withResources(snapshotIds);
  ec2Client.createTags(createTagsRequest);
}
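
The purge-date value is simply the current time plus the retention period, stored as epoch milliseconds in a string. As a sketch of that calculation in isolation, the following uses `java.time` rather than Joda-Time and takes a `Clock` so that the result can be checked in a test. `PurgeDates` is a hypothetical name, not part of the original application:

```java
import java.time.Clock;
import java.time.Duration;
import java.time.Instant;

/** Hypothetical helper, not part of the original application. */
final class PurgeDates {

  private PurgeDates() {
  }

  /**
   * Returns the value for the purge-date tag: the instant at which a
   * snapshot taken now expires, as epoch milliseconds in string form.
   */
  static String purgeDateTagValue(final Clock clock, final Duration retention) {
    final Instant purgeAt = clock.instant().plus(retention);
    return String.valueOf(purgeAt.toEpochMilli());
  }
}
```

In production code the clock would typically be `Clock.systemUTC()`; a fixed clock makes the arithmetic easy to verify.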

Purging is a little easier. We request all snapshots that have a ‘purge-date’ tag, keep only those whose tag value is earlier than now, take their IDs, and issue a delete request for each one to the EC2 client:

@Override
public void purge() {
  // "tag-key" is the EC2 filter name; it matches any snapshot that has a tag with the given key.
  final Filter filter = new Filter("tag-key", Collections.singletonList("purge-date"));
  final DescribeSnapshotsRequest request = new DescribeSnapshotsRequest().withFilters(filter);
  DescribeSnapshotsResult result = ec2Client.describeSnapshots(request);
  String token;
  final List<Snapshot> snapshots = new ArrayList<>(result.getSnapshots());
  while ((token = result.getNextToken()) != null) {
    request.setNextToken(token);
    result = ec2Client.describeSnapshots(request);
    snapshots.addAll(result.getSnapshots());
  }
  final DateTime now = DateTime.now();
  snapshots.stream()
    .filter(snapshot -> filterSnapshot(snapshot, now))
    .map(Snapshot::getSnapshotId)
    .map(DeleteSnapshotRequest::new)
    .forEach(ec2Client::deleteSnapshot);
}

private boolean filterSnapshot(final Snapshot snapshot, final DateTime now) {
  for (final Tag tag : snapshot.getTags()) {
    // Compare against the tag's own key ("purge-date"), not the "tag-key" filter name.
    if (tag.getKey().equals("purge-date") && readyForDeletion(tag.getValue(), now)) {
      return true;
    }
  }
  return false;
}

private boolean readyForDeletion(final String tagValue, final DateTime now) {
  final long purgeTag = Long.parseLong(tagValue);
  final DateTime dateTime = new DateTime(purgeTag);
  return dateTime.isBefore(now);
}
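
One thing to be aware of is that `Long.parseLong` throws a `NumberFormatException` if a ‘purge-date’ tag holds something other than a number (for example, a tag edited by hand), which would abort the whole purge run. As a sketch of a more defensive check, assuming the tags have been copied into a plain `Map` and using `java.time` instead of Joda-Time, a snapshot with a malformed value can simply be left alone. `PurgeCheck` is a hypothetical name, not part of the original application:

```java
import java.time.Instant;
import java.util.Map;

/** Hypothetical helper, not part of the original application. */
final class PurgeCheck {

  static final String PURGE_TAG_KEY = "purge-date";

  private PurgeCheck() {
  }

  /**
   * Returns true only when the purge-date tag is present, holds a valid
   * epoch-millisecond value, and that instant is strictly before now.
   */
  static boolean isExpired(final Map<String, String> tags, final Instant now) {
    final String value = tags.get(PURGE_TAG_KEY);
    if (value == null) {
      return false; // no purge-date tag: never delete
    }
    try {
      return Instant.ofEpochMilli(Long.parseLong(value.trim())).isBefore(now);
    } catch (NumberFormatException e) {
      return false; // malformed tag value: keep the snapshot
    }
  }
}
```

Treating a missing or unreadable tag as "not expired" errs on the side of keeping a backup rather than deleting it.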

This is packaged as an executable JAR so that it can be placed on our Jenkins instances and executed as needed (either by specifying a cron expression or by clicking Build Now). It can also be run from the command line on a developer’s PC if needed. This means that the process of taking snapshots of our EBS volumes is consistent, no matter who runs it or where, which should help avoid any problems or differences between backup executions.