How to estimate S3 PutObject events
You can find the total number of incoming files per hour by enabling CloudTrail to log S3 PutObject events and then querying the CloudTrail logs for those events.
Prerequisite: You need to set up a dedicated bucket to store the CloudTrail logs before you can enable CloudTrail for S3 PutObject events.
In CloudTrail, click "Create trail".
Fill out Trail name, and set Trail log bucket name to the bucket that will store the logs.
- (Optional) Fill out Prefix to store the logs under a prefix in the bucket.
- (Optional) Configure SSE-KMS to encrypt the log data in the bucket.
- Click Next.
Configure what log events to collect:
- Check Data events, uncheck Management events.
- In Data events, select:
  - S3 for Data event type.
  - Log writeOnly events for Log selector template.
- Click Next.
Click Create trail. In the Dashboard, you can see that the Trail is in Logging status.
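If you prefer to script the setup, the console steps above can be approximated with boto3. This is a minimal sketch, not the console's exact configuration: it assumes boto3 is installed and AWS credentials are configured, the function names and the bucket and trail names passed in are hypothetical, and it limits logging to a single data bucket rather than all buckets.

```python
def write_only_s3_selectors(bucket_name):
    """Build a basic event selector for write-only S3 data events on one bucket."""
    return [
        {
            "ReadWriteType": "WriteOnly",      # corresponds to "Log writeOnly events"
            "IncludeManagementEvents": False,  # corresponds to unchecking Management events
            "DataResources": [
                {
                    "Type": "AWS::S3::Object",
                    # Bucket ARN with an empty object prefix: all objects in that bucket.
                    "Values": [f"arn:aws:s3:::{bucket_name}/"],
                }
            ],
        }
    ]


def create_put_object_trail(trail_name, log_bucket, data_bucket):
    """Create the trail, attach the selector, and start logging.

    Requires boto3 and AWS credentials; the log bucket must already exist
    and allow CloudTrail to write to it (see the prerequisite).
    """
    import boto3  # imported lazily so the selector builder works without boto3

    cloudtrail = boto3.client("cloudtrail")
    cloudtrail.create_trail(Name=trail_name, S3BucketName=log_bucket)
    cloudtrail.put_event_selectors(
        TrailName=trail_name,
        EventSelectors=write_only_s3_selectors(data_bucket),
    )
    cloudtrail.start_logging(Name=trail_name)
```

For example, create_put_object_trail("put-object-trail", "my-trail-logs-bucket-name", "my-data-bucket") would log write events for "my-data-bucket" only; all three names here are placeholders.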
Create Athena table to query logs
To create a database and table in Athena, you describe the schema and the Amazon S3 location of the table data. Athena applies the schema when it reads the data at query time.
Go to CloudTrail > Event history.
Click "Create Athena table".
For Storage location, select the bucket that stores the trail logs (e.g. "my-trail-logs-bucket-name").
Click Create table.
Query PutObject logs in Athena
In Athena, click "+" to add a new query.
Copy the following query string into the query panel.
select count(*) as totalevent,
       eventname,
       SUBSTR(eventtime, 1, 13) as eventhour,
       json_extract(requestparameters, '$.bucketName') as bkt
from cloudtrail_logs_<bucket_name>
where eventname = 'PutObject' and errorcode is NULL
group by eventname, json_extract(requestparameters, '$.bucketName'), SUBSTR(eventtime, 1, 13)
order by eventhour
Replace "cloudtrail_logs_<bucket_name>" with the table name shown under "Tables".
- Click "Run" or "Run again".
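To see what the query computes, the following Python sketch reproduces the same grouping on a handful of made-up records. It assumes eventtime is an ISO 8601 UTC string such as "2023-05-01T13:45:10Z", so its first 13 characters identify the date and hour. The sample records, the flattened "bucket" field, and the function name are hypothetical and for illustration only.

```python
from collections import Counter


def count_puts_per_hour(events):
    """Count successful PutObject events per (hour, bucket)."""
    counts = Counter()
    for event in events:
        # Mirrors: where eventname = 'PutObject' and errorcode is NULL
        if event["eventname"] != "PutObject" or event.get("errorcode") is not None:
            continue
        # Mirrors: SUBSTR(eventtime, 1, 13), e.g. "2023-05-01T13"
        event_hour = event["eventtime"][:13]
        counts[(event_hour, event["bucket"])] += 1
    return counts


# Made-up records standing in for rows of the CloudTrail table.
sample_events = [
    {"eventname": "PutObject", "eventtime": "2023-05-01T13:05:10Z", "bucket": "bucket-a", "errorcode": None},
    {"eventname": "PutObject", "eventtime": "2023-05-01T13:45:59Z", "bucket": "bucket-a", "errorcode": None},
    {"eventname": "PutObject", "eventtime": "2023-05-01T14:00:01Z", "bucket": "bucket-a", "errorcode": None},
    {"eventname": "PutObject", "eventtime": "2023-05-01T14:10:00Z", "bucket": "bucket-a", "errorcode": "AccessDenied"},
    {"eventname": "DeleteObject", "eventtime": "2023-05-01T14:20:00Z", "bucket": "bucket-a", "errorcode": None},
]

hourly = count_puts_per_hour(sample_events)
for (event_hour, bucket), total in sorted(hourly.items()):
    print(event_hour, bucket, total)
```

The failed upload and the non-PutObject event are excluded, which is why the per-hour totals approximate the number of files that actually arrived.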