My Panther Data Explorer query to list cloud resources IDs is missing resources
Issue
I am seeing DELETE events for IAM users, S3 buckets, and other cloud resources that still exist. This is my query:
SELECT DISTINCT id FROM resource_history WHERE resourcetype = 'AWS.IAM.User' AND (changetype != 'DELETED') AND id NOT IN ( SELECT DISTINCT id FROM resource_history WHERE resourcetype = 'AWS.IAM.User' AND changetype = 'DELETED' )
How do I know which events are real and get a list of IDs for only those that exist?
Resolution
To resolve this issue, modify your resource_history
query to add a check that integrationLabel
and p_source_label
match.
SELECT DISTINCT id FROM resource_history WHERE resourcetype = 'AWS.IAM.User' AND (changetype != 'DELETED') AND id NOT IN ( SELECT DISTINCT id FROM resource_history WHERE resourcetype = 'AWS.IAM.User' AND changetype = 'DELETED' AND integrationLabel = p_source_label --new addition )
Cause
This issue occurs because there is a bug in the code for cloud resource event history forwarding to the Data Lake that we are actively tracking down. All events are still being sent to the Data Lake, but there are also extra events included in the history with mismatched source labels.
For now, adding the clause to ensure matched labels will filter all erroneous events from the resource_history until a patch is released.