What are the fields in the panther_monitor.public.data_audit table, and what is it used for?
Last updated: May 7, 2026
QUESTION
I'd like to understand the schema of panther_monitor.public.data_audit, what each field means and what the table represents. Is it only for S3-based log sources, or does it cover other ingestion paths too? Can I use it as a shortcut for per-source event counts instead of running counts across panther_views.public.all_logs?
ANSWER
The panther_monitor.public.data_audit is part of Panther's data pipeline monitoring infrastructure that provides an audit trail of S3 object processing events. It's essentially a log for every file that Panther ingests from your log sources. Each read of an object in those internal buckets produces one row in data_audit.
The table contains the columns below, in addition to the standard Panther fields:
Column | Type | Description |
|---|---|---|
operation | variant | The pipeline operation that occurred |
status | variant | Outcome of the operation (SUCCESS or FAILURE) |
error | variant | The error occurred. Populated only if status is FAILURE |
s3bucket | variant | The S3 object's bucket |
s3key | variant | The S3 object key (file path) that was processed |
s3objectsize | variant | The size of the S3 object in bytes. -1 if unknown. Empty if status is FAILURE |
s3objectversionid | variant | S3 version ID of the object; empty on failure |
recordcount | variant | The records parsed from the s3 object. Zero if status is FAILURE |
assumedrolearn | variant | The AWS role ARN used to access the s3 object |
What this table is good for
Confirming that Panther read a specific object from your source, useful when troubleshooting "is my data reaching Panther?" questions.
Spotting failed reads (
status = 'FAILURE') for a particular bucket, key prefix, or source.Comparing record-count totals.
Investigating ingestion filtering or compression issues.
Example query (failed reads for a specific source label in the last 24 hours):
SELECT s3bucket, s3key, p_event_time, recordcount, status
FROM panther_monitor.public.data_audit
WHERE p_occurs_since('1 day') AND status = 'FAILURE' AND p_source_label = 'My Log Source'
ORDER BY p_event_time DESC
LIMIT 100;Please note that recordcount reflects what Panther saw at read time on the input object. That said, each row represents one object read, and one object can contain many records so count(*) on data_audit is a count of read operations, not events.
For per-source and per-schema event counts, feel free to check: How do I query daily log collection volume by source and schema in Panther?