# How Bytebase Handles Audit Logging

> How Bytebase audits the workflow layer: every SQL, schema change, approval, and export tied to a real user identity, with structured exports for any SIEM.

Tianzhou | 2026-05-05 | Source: https://www.bytebase.com/blog/bytebase-audit-logging/

---

Database audit logging splits into two layers. The **infrastructure layer** captures what the provider does: provisioning, configuration, backups. The **workflow layer** captures what happens inside the database: schema changes, queries, approvals, exports. Why compliance needs both is covered in [Database Audit Logging: Two Layers, One Trail](/blog/database-audit-logging/).

Bytebase covers the workflow layer. This post is how.

## How Bytebase Closes the Three Gaps

```mermaid
flowchart LR
    subgraph EN["Engine-native auditing"]
        direction TB
        EI["❌ app_user"]
        EC["❌ SQL only"]
        EV["❌ per-engine format"]
    end

    subgraph BB["Bytebase"]
        direction TB
        BI["✅ SSO end user"]
        BC["✅ SQL + ticket + approver"]
        BV["✅ one unified schema"]
    end

    EI -->|Identity| BI
    EC -->|Context| BC
    EV -->|Coverage| BV

    style BB fill:#e0f2fe,stroke:#0369a1
```

Engine-native auditing leaves three gaps: identity, context, coverage. Bytebase closes each one because it sits in front of the database, not inside it.

**Identity.** Bytebase records the SSO-mapped end user. Everyone authenticates to Bytebase before a single SQL goes out, so attribution survives the shared `app_user` connection. That shared connection is exactly where engine-native logs lose the thread: every query shows up as `app_user`, and you have no idea who actually ran it.

**Context.** Bytebase pairs every SQL execution with the approval chain, project, and issue ID. The justification gets captured at change time, next to the SQL, instead of reconstructed weeks later when an auditor asks why row 4,000,000 changed.

**Coverage.** Bytebase emits one audit schema across PostgreSQL, MySQL, SQL Server, Oracle, and the rest. Capture happens at the gateway, so a multi-engine fleet produces a single-format trail rather than five dialects of audit log you have to normalize yourself.

## What Bytebase Records

Six categories of events:

- **SQL execution**: every query through SQL Editor or change rollouts, with full SQL text, target database, parameters, and execution result
- **Schema changes**: issue creation, approval decisions, rollout status, rollback
- **Data access**: queries and exports tied to the requesting user, including row count and export destination
- **Authentication**: login, logout, SSO token exchange, failed attempts
- **Permission changes**: role grants, project membership updates, policy modifications
- **System configuration**: instance connections, environment settings, workspace policies

Data access entries also carry **per-column masking decisions**: which columns came back masked, which semantic type triggered it, and which rule matched.

Each entry carries the user's email, source IP, timestamp, operation duration, affected resource path, and request/response payloads. Sensitive fields (passwords, certificates, SSH keys, connection strings) are redacted automatically before the record is written.

A typical SQL execution record (with `request` and `response` shown decoded; on the wire they are JSON-encoded strings):

```json
{
	"name": "workspaces/-/auditLogs/12345",
	"createTime": "2026-05-05T14:23:11Z",
	"user": "users/alice@example.com",
	"method": "/bytebase.v1.SQLService/Query",
	"severity": "INFO",
	"resource": "instances/prod-postgres/databases/app",
	"request": {
		"name": "instances/prod-postgres/databases/app",
		"statement": "SELECT email, plan FROM users WHERE id = 12345",
		"limit": 1000
	},
	"response": {
		"results": [
			{
				"columnNames": ["email", "plan"],
				"columnTypeNames": ["TEXT", "TEXT"],
				"rowsCount": "1",
				"masked": [
					{
						"semanticTypeId": "bb.default.email",
						"semanticTypeTitle": "Email",
						"algorithm": "Range Mask",
						"context": "Matched global rule: PII Protection"
					},
					{}
				]
			}
		]
	},
	"status": { "code": 0 },
	"latency": "0.018s",
	"requestMetadata": {
		"callerIp": "10.0.5.42",
		"callerSuppliedUserAgent": "bytebase-web/3.x"
	}
}
```

The `masked` array is the part engine-native auditing can't give you. Engine-native logs that the SQL ran and returned a row. Bytebase logs the per-column masking reason: `email` came back masked because it matched the `bb.default.email` semantic type under the global PII Protection rule. Non-masked columns show up as empty entries, which keeps column-to-reason alignment intact. The shape is identical across SQL Editor queries, change rollouts, and data exports. One schema, every action.

## Getting Audit Data Out

Three export paths.

### GUI

Settings → Audit Log. Filter by user, method, resource, severity, and date range. Good for spot checks and walking an auditor through the trail.

### API

Two endpoints, depending on scope:

- `/v1/auditLogs:search`: workspace-level, returns every audit record across projects
- `/v1/projects/{project}/auditLogs:search`: project-scoped

Both accept filter expressions on user, time range, method, severity, and resource. The response is structured JSON ready to ingest into any SIEM. Pagination via `pageToken`. Full schema and filter syntax are in the [API audit log tutorial](https://docs.bytebase.com/tutorials/api-audit-log).

### Log streaming

Settings → General → Audit Log Export. Bytebase writes audit records to stdout on top of internal storage. Add the `--enable-json-logging` flag to the process to emit structured JSON. Any sidecar log shipper (Vector, Fluent Bit, the Datadog Agent, the Splunk Universal Forwarder) picks up stdout and forwards to the SIEM.

Use streaming for high-volume environments where API polling adds latency. Use the API for ad-hoc evidence pulls and scheduled exports.

## Pairing with Engine-Native Auditing

Let's be honest about the boundary: Bytebase audits SQL routed through Bytebase. Other paths bypass it.

- Direct connections from `psql`, `mysql`, or other clients
- Application traffic from production services connecting with their own credentials
- Emergency SSH access to the database host
- Replication and CDC pipelines reading the WAL or binlog

So for full coverage, run Bytebase as the primary trail for human activity and keep engine-native auditing on as a backstop:

| Path                                | Captured by                                                       |
| ----------------------------------- | ----------------------------------------------------------------- |
| Human SQL through Bytebase          | Bytebase                                                          |
| Application service-account traffic | Engine-native (`pgaudit`, MySQL audit plugin, SQL Server Audit)   |
| Emergency direct access             | Engine-native + alerting on session source IP                     |
| Cloud infrastructure changes        | Provider audit logs (CloudTrail, Cloud Audit Logs, Azure Monitor) |

## Plan Availability

Audit logging is available on [Pro and Enterprise plans](https://www.bytebase.com/pricing/). Pro covers every event category, GUI and API access, and automatic redaction. Enterprise adds custom approval workflows and advanced access control, which generate additional audit events around policy enforcement.

For setup details and the latest field reference, see the [Bytebase audit log documentation](https://docs.bytebase.com/security/audit-log/).