How to estimate S3 PutObject events

To estimate the number of incoming files per hour, enable CloudTrail to log S3 PutObject events, then query the resulting CloudTrail logs in Athena.

Prerequisite: You need to set up a dedicated bucket to store the CloudTrail logs before you can enable CloudTrail for S3 PutObject events.

  1. In CloudTrail, click "Create trail".

    Dashboard page

  2. Enter a Trail name and the Trail log bucket name where the logs will be stored.

    1. (Optional) Enter a Prefix for the log files in the bucket.
    2. (Optional) Configure SSE-KMS encryption for the log files.
    3. Click Next.

    Choose trail attributes

  3. Configure what log events to collect:

    1. Check Data events, uncheck Management events.
    2. In Data events, select:
      • S3 for Data event type.
      • Log writeOnly events for Log selector template.
    3. Click Next.

    Choose log events

  4. Click Create trail. On the Dashboard, the new trail appears with a status of Logging.

    Dashboards with trail
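
For readers who script their setup, the choices in step 3 can be sketched as a basic CloudTrail event selector. This is a minimal sketch, not the console's exact configuration: it assumes the boto3 library, AWS credentials, and an existing trail, and the helper names `build_write_only_s3_selector` and `apply_selector` are hypothetical. The console may store the same choices in a different selector form.

```python
import json


def build_write_only_s3_selector():
    """Return a basic event selector that logs write-only S3 data events.

    Mirrors step 3: Data events on, Management events off,
    S3 as the data event type, write-only events.
    """
    return {
        "ReadWriteType": "WriteOnly",
        "IncludeManagementEvents": False,
        "DataResources": [
            # "arn:aws:s3" without a bucket name covers all buckets in the account.
            {"Type": "AWS::S3::Object", "Values": ["arn:aws:s3"]}
        ],
    }


def apply_selector(trail_name):
    """Apply the selector to an existing trail (needs boto3 and AWS credentials)."""
    import boto3  # imported lazily so the payload can be inspected without boto3

    client = boto3.client("cloudtrail")
    return client.put_event_selectors(
        TrailName=trail_name,
        EventSelectors=[build_write_only_s3_selector()],
    )


if __name__ == "__main__":
    # Print the payload only; nothing is sent to AWS here.
    print(json.dumps(build_write_only_s3_selector(), indent=2))
```

Running the file only prints the selector. Calling `apply_selector` with your trail name would send it to CloudTrail.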

Create Athena table to query logs

To create a database and table in Athena, you describe the schema and the Amazon S3 location of the table data. Athena applies the schema when it reads the data at query time.

  1. Go to CloudTrail > Event history.

  2. Click "Create Athena table".

    Create Athena Table

  3. For Storage location, select the bucket that stores the trail logs (for example, "my-trail-logs-bucket-name").

    Create Table in AWS

  4. Click Create table.

Query PutObject logs in Athena

  1. In Athena, click "+" to add a new query.

    Add query

  2. Copy the following query into the query panel.

    -- Count successful PutObject events per hour and bucket
    select count(*) as totalevent,
        eventname,
        SUBSTR(eventtime, 1, 13) as eventhour,  -- first 13 characters: yyyy-MM-ddTHH
        json_extract_scalar(requestparameters, '$.bucketName') as bkt
    from cloudtrail_logs_<bucket_name>
    where eventname = 'PutObject' and errorcode is NULL  -- skip failed requests
    group by eventname,
        json_extract_scalar(requestparameters, '$.bucketName'),
        SUBSTR(eventtime, 1, 13)
    order by eventhour
    
  3. In the from clause, replace "cloudtrail_logs_<bucket_name>" with the table name shown under "Tables".

  4. Click "Run" or "Run again".
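
To make the query's logic concrete, the following minimal sketch reproduces the same aggregation in plain Python over a few hand-written records. The records in `SAMPLE_RECORDS` are hypothetical and only imitate the shape of CloudTrail log entries. The function name `count_puts_per_hour` is also an assumption made for illustration.

```python
from collections import Counter

# Hypothetical records imitating the shape of CloudTrail log entries.
SAMPLE_RECORDS = [
    {"eventName": "PutObject", "eventTime": "2024-01-01T09:15:00Z",
     "requestParameters": {"bucketName": "example-bucket"}},
    {"eventName": "PutObject", "eventTime": "2024-01-01T09:45:10Z",
     "requestParameters": {"bucketName": "example-bucket"}},
    {"eventName": "PutObject", "eventTime": "2024-01-01T10:02:00Z",
     "errorCode": "AccessDenied",
     "requestParameters": {"bucketName": "example-bucket"}},
    {"eventName": "GetObject", "eventTime": "2024-01-01T10:05:00Z",
     "requestParameters": {"bucketName": "example-bucket"}},
    {"eventName": "PutObject", "eventTime": "2024-01-01T10:30:00Z",
     "requestParameters": {"bucketName": "example-bucket"}},
]


def count_puts_per_hour(records):
    """Count successful PutObject events per (hour, bucket), ordered by hour."""
    counts = Counter()
    for rec in records:
        if rec.get("eventName") != "PutObject":
            continue  # same filter as: eventname = 'PutObject'
        if rec.get("errorCode") is not None:
            continue  # same filter as: errorcode is NULL
        hour = rec["eventTime"][:13]  # same as SUBSTR(eventtime, 1, 13)
        bucket = (rec.get("requestParameters") or {}).get("bucketName")
        counts[(hour, bucket)] += 1
    # Sort by the hour string only, as the query does with "order by eventhour".
    return sorted(counts.items(), key=lambda item: item[0][0])


if __name__ == "__main__":
    for (hour, bucket), total in count_puts_per_hour(SAMPLE_RECORDS):
        print(hour, bucket, total)
```

In this sample, the failed PutObject and the GetObject are excluded, so only three events are counted across two hourly rows.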