In HowTo: Rotate Logs to S3 we saw how to rotate logs to S3, including the logs for Apache web servers. In that article, the website name was hard-coded into the S3 path in the logrotate configuration file; here we will see how to set it from an EC2 tag instead.

In addition to s3cmd (see HowTo: Install AWS CLI - Amazon Simple Storage Service (S3) - S3cmd for how to install), we will need the Amazon Web Services command line tool aws (see HowTo: Install AWS CLI - AWS Command Line Interface for how to install) and the command line JSON processor jq (see HowTo: Install JQ for how to install).

For this example, we will use a tag named Site. If the value of the Site tag is set to www.example.com, the access log will upload under s3://logging-bucket/apache/access_log/site=www.example.com/; if the Site tag is not set, it will upload under s3://logging-bucket/apache/access_log/site=default/. The other logs follow the same layout under their own names.

/etc/logrotate.d/apache2

# Apache2 logrotate snippet for Gentoo Linux
# Contributed by Chuck Short
#
/var/log/apache2/*log {
  missingok
  notifempty
  sharedscripts
  postrotate
  /etc/init.d/apache2 reload > /dev/null 2>&1 || true

  BUCKET=logging-bucket
  INSTANCE_ID=`curl --silent http://169.254.169.254/latest/meta-data/instance-id`
  SITE=`aws --region $(curl --silent http://169.254.169.254/latest/meta-data/placement/availability-zone | sed -e "s/.$//") ec2 describe-tags --filters "[{\"Name\": \"key\", \"Values\": [\"Site\"]},{\"Name\": \"resource-id\", \"Values\": [\"${INSTANCE_ID}\"]}]" | jq -r ".Tags[0].Value // \"default\""`
  /usr/bin/s3cmd -m text/plain sync /var/log/apache2/access_log-* s3://${BUCKET}/apache/access_log/site=${SITE}/instance=${INSTANCE_ID}/
  /usr/bin/s3cmd -m text/plain sync /var/log/apache2/error_log-* s3://${BUCKET}/apache/error_log/site=${SITE}/instance=${INSTANCE_ID}/
  /usr/bin/s3cmd -m text/plain sync /var/log/apache2/ssl_access_log-* s3://${BUCKET}/apache/ssl_access_log/site=${SITE}/instance=${INSTANCE_ID}/
  /usr/bin/s3cmd -m text/plain sync /var/log/apache2/ssl_error_log-* s3://${BUCKET}/apache/ssl_error_log/site=${SITE}/instance=${INSTANCE_ID}/
  /usr/bin/s3cmd -m text/plain sync /var/log/apache2/ssl_request_log-* s3://${BUCKET}/apache/ssl_request_log/site=${SITE}/instance=${INSTANCE_ID}/

  endscript
}

Breakdown

INSTANCE_ID

INSTANCE_ID=`curl --silent http://169.254.169.254/latest/meta-data/instance-id`

This retrieves the ID of the current instance from the EC2 instance metadata service:

Console - user@hostname ~ $

curl --silent http://169.254.169.254/latest/meta-data/instance-id

Output

i-b96483d1
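
On instances where the metadata service only accepts session-based (IMDSv2) requests, the plain curl call above returns nothing. The following is a minimal sketch of a wrapper that requests a session token first and falls back to the plain request if no token is returned. The function name imds_get, the 60-second token lifetime, and the 2-second timeout are assumptions made for illustration; they are not part of the original configuration.

```shell
#!/bin/sh
# Base URL of the EC2 instance metadata service (same address as above).
IMDS_BASE="http://169.254.169.254/latest"

# imds_get PATH: fetch one metadata value, e.g. "instance-id".
# Hypothetical helper: tries to obtain an IMDSv2 session token first and,
# if that fails, falls back to the plain request used in this article.
imds_get() {
  token=$(curl --silent --fail --max-time 2 -X PUT \
    -H "X-aws-ec2-metadata-token-ttl-seconds: 60" \
    "${IMDS_BASE}/api/token") || token=""
  if [ -n "$token" ]; then
    curl --silent --fail --max-time 2 \
      -H "X-aws-ec2-metadata-token: ${token}" \
      "${IMDS_BASE}/meta-data/$1"
  else
    curl --silent --fail --max-time 2 "${IMDS_BASE}/meta-data/$1"
  fi
}
```

With this in place, the assignment in the postrotate script could read INSTANCE_ID=`imds_get instance-id`.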

SITE

SITE=`aws --region $(curl --silent http://169.254.169.254/latest/meta-data/placement/availability-zone | sed -e "s/.$//") ec2 describe-tags --filters "[{\"Name\": \"key\", \"Values\": [\"Site\"]},{\"Name\": \"resource-id\", \"Values\": [\"${INSTANCE_ID}\"]}]" | jq -r ".Tags[0].Value // \"default\""`

Get the availability zone the instance is in

Console - user@hostname ~ $

curl --silent http://169.254.169.254/latest/meta-data/placement/availability-zone

Output

us-east-1a

Get the availability zone and drop the last character, giving us the region

Console - user@hostname ~ $

curl --silent http://169.254.169.254/latest/meta-data/placement/availability-zone | sed -e "s/.$//"

Output

us-east-1
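
The same trimming can be done without sed by using shell parameter expansion. This is only a sketch: the function name az_to_region is made up for illustration, and, like the sed version, it assumes the availability zone name is the region followed by a single letter. Local Zone names do not follow that pattern, so neither approach yields the region for them.

```shell
#!/bin/sh
# az_to_region AZ: strip the trailing zone letter from an availability zone
# name. ${1%?} removes the shortest suffix matching one character.
az_to_region() {
  printf '%s\n' "${1%?}"
}

az_to_region us-east-1a   # prints us-east-1
```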

Get the Site tag for the current instance

Console - user@hostname ~ $

aws --region $(curl --silent http://169.254.169.254/latest/meta-data/placement/availability-zone | sed -e "s/.$//") ec2 describe-tags --filters "[{\"Name\": \"key\", \"Values\": [\"Site\"]},{\"Name\": \"resource-id\", \"Values\": [\"${INSTANCE_ID}\"]}]"

Output - If Tag Is Not Set

{
    "Tags": []
}

Output - If Tag Is Set

{
    "Tags": [
        {
            "ResourceType": "instance",
            "ResourceId": "i-b96483d1",
            "Value": "www.example.com",
            "Key": "Site"
        }
    ]
}

If the Site tag is set, get the value; if not, return default

jq -r ".Tags[0].Value // \"default\""
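
To see the fallback in action without calling AWS, the filter can be fed the two sample outputs shown above. The wrapper function site_from_tags is a hypothetical name used only for this sketch; the jq filter inside it is the one from the configuration.

```shell
#!/bin/sh
# site_from_tags: read `aws ec2 describe-tags` JSON on stdin and print the
# value of the first tag, or "default" when the Tags array is empty.
# With an empty array, .Tags[0] is null, so the // alternative is used.
site_from_tags() {
  jq -r '.Tags[0].Value // "default"'
}

# Example calls:
#   printf '%s' '{"Tags": []}' | site_from_tags
#     prints: default
#   printf '%s' '{"Tags": [{"Key": "Site", "Value": "www.example.com"}]}' | site_from_tags
#     prints: www.example.com
```

Note that jq only treats null and false as missing, so a Site tag that exists with an empty value yields an empty string rather than default.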

s3cmd

/usr/bin/s3cmd -m text/plain sync /var/log/apache2/access_log-* s3://${BUCKET}/apache/access_log/site=${SITE}/instance=${INSTANCE_ID}/

This uploads every file in /var/log/apache2/ whose name starts with access_log- to our S3 bucket. Each key starts with apache/access_log/, followed by site= and the value of the Site tag (or default if the tag is not set), then /instance= and the instance ID.

The other logs, such as error_log and the SSL logs, behave similarly.
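
Since the five s3cmd lines differ only in the log name, they could also be generated with a loop. The sketch below only prints the commands instead of running them, and it uses the example values from this article in place of the real tag and metadata lookups. The helper name s3_prefix is an assumption made for illustration.

```shell
#!/bin/sh
# Example values from this article; in the real postrotate script SITE and
# INSTANCE_ID come from the tag lookup and the metadata service.
BUCKET=logging-bucket
SITE=www.example.com
INSTANCE_ID=i-b96483d1

# s3_prefix LOG: build the destination prefix for one log type.
s3_prefix() {
  printf 's3://%s/apache/%s/site=%s/instance=%s/\n' \
    "$BUCKET" "$1" "$SITE" "$INSTANCE_ID"
}

# Print (rather than run) the sync command for each of the five log types.
for LOG in access_log error_log ssl_access_log ssl_error_log ssl_request_log; do
  echo "/usr/bin/s3cmd -m text/plain sync /var/log/apache2/${LOG}-* $(s3_prefix "$LOG")"
done
```

Dropping the echo and the surrounding quotes would run the commands, with the shell expanding the file glob as in the original lines.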