
When Your Parser Breaks: Schema Drift and Detection Gaps That Sneak Up On You
Did you know that Palo Alto traffic log schemas have changed recently? PAN-OS 11.1 introduced seven new fields: ai_fwd_error, ai_traffic, cluster_name, dst_adv_dev_id, flow_type, k8s_cluster_id, and src_adv_dev_id. Also, two existing fields were replaced with version-qualified variants (you can find the full PAN-OS 11.1 traffic log field reference here).
If your SIEM parser was written for PAN-OS 10.x, it didn't crash. It didn't alert you. It kept running, reading the wrong columns, feeding malformed data into your detection rules. The lateral movement detection that keys off traffic action fields? Silently broken. The C2 pattern rule watching unusual destination zones? Reading garbage.
But you wouldn't know until a breach investigation showed you the gap.
This is schema drift. We've been building logging infrastructure since 1998: syslog-ng, the Kubernetes Logging Operator, and now AxoRouter. In that time, the single most underestimated cause of SOC blind spots isn't a failed SIEM or a misconfigured detection rule. It's the pipeline feeding the SIEM breaking quietly, without anyone noticing.
Here's exactly how it happens, why it's getting worse, and what you can actually do about it.
The Parser That Always Says It's Fine
Security teams are trained to respond to alerts. A SIEM that goes down triggers an incident. A source that stops sending logs shows up in your ingestion dashboard. You have runbooks for those.
Schema drift doesn't trigger any of that.
When a vendor changes their log format, your parser doesn't throw an exception. Most SIEM parsers work on a best-effort basis: if a field can't be extracted, it gets skipped. The parser produces output. It just produces wrong output. Your detection rule expects a field that's either missing entirely or now contains data from a different column. The rule doesn't fire. Nothing tells you it stopped working.
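Here's a minimal sketch of that failure mode. The field names and log line are hypothetical, but the mechanics are exactly what happens with a positional parser written for an older schema: a vendor inserts one field, and the parser keeps "succeeding" while every later column lands in the wrong slot.

```python
# Sketch (hypothetical field list): a positional parser written for an
# older "v10" schema keeps running after the vendor inserts a new field.

V10_FIELDS = ["recv_time", "src_ip", "dst_ip", "action", "dst_zone"]

def parse_v10(line: str) -> dict:
    """Best-effort positional parse: never raises, drops what it can't map."""
    values = line.split(",")
    return dict(zip(V10_FIELDS, values))  # extra columns silently discarded

# A newer record with a field inserted after dst_ip ("flow-basic"):
v11_line = "2026/01/15 10:00:00,10.0.0.5,10.0.9.9,flow-basic,allow,dmz"

event = parse_v10(v11_line)
# The parser "succeeds", but action now holds the inserted field's value
# and every later column is shifted by one:
print(event["action"])    # "flow-basic"  (not "allow")
print(event["dst_zone"])  # "allow"       (not "dmz")
```

A detection rule matching `action == "allow"` never fires again, and nothing in the pipeline reports an error.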
More than 18% of SIEM rules are broken and will never fire an alert, due to common issues such as misconfigured data sources, missing fields, and parsing errors. (4th Annual Report: State of SIEM Detection Risk, 2024 Edition, CardinalOps)
The reason this is so damaging is that everything looks normal. Logs are arriving. Ingestion volume looks healthy. Parsing technically "succeeds." You only find the problem when you're hunting through a timeline mid-investigation and realize there are holes in your firewall coverage, and by then you're already behind.
This isn't the same as a poorly written detection rule. A bad rule is a logic problem you can fix in an afternoon. Silent parser failure means your data is corrupted at the source. Every correlation, every investigation, every forensic timeline downstream is built on a broken foundation.
This Happens More Than You Think
Most security teams treat schema drift as an occasional nuisance. An unfortunate side effect of a major upgrade cycle. Something to handle once and move on from.
The data tells a different story.
Axoflow's schema monitor tracks vendor log schemas continuously across major security data sources. In the 12 months ending March 2026, across just five vendors, we detected 10 schema changes, 2 of them major breaking changes. Here's what two of those looked like in practice.
Palo Alto Networks, PAN-OS 11.1 (January 2026): Seven fields added to traffic logs, two removed. The additions include AI-related telemetry (ai_fwd_error, ai_traffic) and Kubernetes context (k8s_cluster_id), reflecting how the product is evolving to understand modern infrastructure. Existing fields cluster_name and flow_type were replaced with version-qualified variants, shifting field positions for parsers that rely on positional extraction. Full traffic log field changes are documented here.
Elastic ECS base schema (March 2026): 65 fields added, 3 removed, 2 modified in a single release. For teams using ECS as a common normalization layer across sources, this is a significant structural change. Any downstream rule that depends on modified field behavior needs to be reviewed and validated.
Both of these happened recently, to schemas that security teams depend on for daily detection. And that's only five vendors; even a mid-sized organization ingests dozens of sources, so your environment almost certainly has more.
The underlying driver isn't carelessness, it's the normal pace of product development. Cloud providers add API telemetry quarterly. Security vendors add AI and Kubernetes context as they build new capabilities. Application teams evolve their structured logging without notifying the security team. Every individual change is reasonable. Collectively, they mean your parsers are always drifting toward obsolescence, and the clock restarts every time a vendor ships a release.
Four Ways Schema Drift Breaks Your Detections
Not all schema drift fails the same way. Understanding the failure modes helps you know where to look.
Field position shift. Certain parsers extract fields by position, not name. When a vendor inserts a new field mid-record, every field after it shifts by one. Your parser reads the wrong data for every subsequent field, not just the new one. This is the most destructive form because it corrupts the entire event, not a single attribute.
Field rename. A vendor renames src_ip to source.address as part of a normalization effort. Your detection rule still looks for src_ip. The field is gone. The rule never matches. This is common in major version bumps and schema standardization projects, and it's easy to miss because the field disappears rather than producing an obvious error.
Type change. A field that was an integer becomes a string. The parser extracts it correctly. But your threshold-based rule that compares the value against a numeric limit now behaves unpredictably. The field is present and populated, which is why this variant is the hardest to catch: everything looks fine until you realize a rule hasn't fired in three weeks.
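The type-change case is worth seeing concretely. In this sketch (hypothetical rule and field names), a rule engine that swallows evaluation errors, as best-effort engines tend to, converts a type change into a rule that silently never fires:

```python
# Sketch (hypothetical rule): a threshold rule assumes bytes_sent is an int.
# After a schema change ships it as a string, the comparison raises, and a
# best-effort engine that swallows errors just stops firing.

def exfil_rule(event: dict) -> bool:
    """Fire when more than 100 MB left the network in one session."""
    try:
        return event["bytes_sent"] > 100_000_000
    except TypeError:
        return False  # error swallowed: the rule silently never fires again

old_event = {"bytes_sent": 250_000_000}   # integer, as the rule expects
new_event = {"bytes_sent": "250000000"}   # same value, now a string

print(exfil_rule(old_event))  # True
print(exfil_rule(new_event))  # False -- the exfiltration goes unreported
```

The field is populated, ingestion is healthy, and the only symptom is an alert count of zero.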
Structural addition. New fields don't always append to the end of a record. In JSON logs, new nested objects change how parsers traverse the document. In CSV syslog, insertions shift positions. In Elastic ECS, adding 65 fields to the base schema can change what default normalization produces across every integration built on top of it.
In every case, the parser keeps running, the SIEM keeps ingesting, and your detection coverage quietly degrades.
Why Fixing This at the SIEM Layer Doesn't Work
The instinct is to fix schema drift at the SIEM. That's where the failure becomes visible, so that's where teams go to patch it. But there are three reasons why this approach keeps failing.
You find out too late. By the time you discover a parser failure, it's because a detection rule didn't fire, a threat hunt surfaced a gap, or an auditor flagged missing log coverage. The drift happened upstream days or weeks earlier. You're already in damage control.
It doesn't scale. If you're ingesting 30 log sources and each drifts once or twice a year, you're running a continuous parser maintenance operation. Each fix requires understanding the new schema, rewriting extraction logic in your SIEM's query language, validating that existing rules still work, and then doing all of it again for your second SIEM if you're running one.
You can't validate what you can't see. When a parser fails silently, you need to know which events failed, which fields were affected, and how long the failure has been running. Most SIEMs bury parsing warnings in debug logs, if they log them at all. Reconstructing the failure timeline from SIEM internals is painful work that teams rarely have the bandwidth for.
The right place to catch schema drift is before data reaches your SIEM, at the pipeline layer, where you can validate schema, flag mismatches, and make routing decisions before ingestion costs are incurred and before malformed data reaches your detection engine.
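What pipeline-layer validation means in practice can be sketched in a few lines. This is a simplified illustration, not AxoRouter's actual API: compare each event's fields against the expected schema before it reaches the SIEM, and route mismatches to a quarantine destination instead of the detection engine.

```python
# A minimal sketch of pipeline-layer schema validation (illustrative only,
# not AxoRouter's actual API): check each event against the expected field
# set before ingestion, and divert mismatches to quarantine.

EXPECTED = {"recv_time", "src_ip", "dst_ip", "action", "dst_zone"}

def route(event: dict) -> str:
    fields = set(event)
    missing, unexpected = EXPECTED - fields, fields - EXPECTED
    if missing or unexpected:
        # Drift is detected *before* ingestion costs are incurred and
        # before malformed data reaches any detection rule.
        print(f"schema drift: missing={sorted(missing)} new={sorted(unexpected)}")
        return "quarantine"
    return "siem"

ok = {"recv_time": "t", "src_ip": "a", "dst_ip": "b",
      "action": "allow", "dst_zone": "dmz"}
drifted = dict(ok, k8s_cluster_id="prod")  # vendor added a field

print(route(ok))       # "siem"
print(route(drifted))  # "quarantine"
```

The point of the sketch is the placement: the same check bolted onto the SIEM would only run after the corrupted data had already been ingested and billed.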
What Teams Report After Deployment
Teams under HIPAA or PCI DSS compliance requirements face this risk directly. When parsers break silently and events go missing from the retention timeline, the gap shows up during compliance reviews. Not as a security failure, but as a coverage failure: events the retention policy requires are simply not there because the parser stopped extracting them correctly.
With a centrally maintained classification database, the window of exposure shrinks. Instead of discovering a parser failure weeks later during an audit, the database update ships as part of a regular biweekly cycle. The team still needs to validate that their specific detection rules work against updated schemas, but the parser maintenance itself is no longer their burden.
One healthcare customer reduced SIEM ingestion from 45GB/day to 32GB/day (30% cost reduction, $180K annual savings) using AxoRouter for classification and routing.
After deployment, schema drift became a routine maintenance task. When vendors updated their log formats, Axoflow updated AxoRouter's classification database.
The time savings were measurable. Before AxoRouter, the team spent several hours each week troubleshooting SIEM parser failures, identifying which events failed, when the failure started, and which detection rules were affected. After deployment, that work collapsed into routine AxoRouter updates. A task that used to take hours now takes minutes.
For air-gapped environments with proprietary log formats, the challenge is different. Commercial SIEMs rarely ship parsers for custom or defense-specific log sources. Teams in these environments typically maintain their own parsing logic, and every time a system vendor updates their software, that logic needs updating too.
OCSF normalization at the pipeline layer reduces this burden. Instead of maintaining parsers per SIEM, teams define normalization rules once in AxoRouter. When a source format changes, one normalization rule gets updated. Every downstream destination (whether a legacy SIEM, a new cloud SIEM, or long-term storage) receives consistently structured data.
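The "normalize once, fan out" model can be sketched like this. The mapping table and destination functions are hypothetical, not AxoRouter's actual rule syntax; the point is that a vendor format change touches exactly one place, and every destination keeps receiving the same structure.

```python
# Sketch of normalize-once, fan-out (hypothetical OCSF-style mapping, not
# AxoRouter's actual rule syntax): one mapping per source, maintained in
# one place, feeding every destination identically.

NORMALIZE = {          # vendor field -> OCSF-style field
    "src": "src_endpoint.ip",
    "dst": "dst_endpoint.ip",
    "act": "action",
}

def normalize(raw: dict) -> dict:
    return {NORMALIZE.get(k, k): v for k, v in raw.items()}

def fan_out(raw: dict, destinations: list) -> None:
    event = normalize(raw)   # the only code a source format change touches
    for send in destinations:
        send(event)          # legacy SIEM, cloud SIEM, long-term storage...

fan_out({"src": "10.0.0.5", "dst": "10.0.9.9", "act": "allow"},
        [lambda e: print("splunk:", e), lambda e: print("secops:", e)])
```

If the vendor renames `src`, only the `NORMALIZE` table changes; neither SIEM's parsers or rules need to be rewritten.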
One government customer used this approach during a SIEM migration from legacy Splunk to Google SecOps, achieving 40% log volume reduction while maintaining zero data loss through AxoStore edge buffering in their air-gapped environment. Read the full case study.
AxoRouter gave them a better model. Both their legacy Splunk environment and their new Google SecOps deployment received correctly structured data. Parser maintenance became Axoflow’s problem.
Auditing Your Current Exposure
You don't need to deploy anything to start understanding your schema drift risk.
Start with your top 10 log sources by detection coverage, not by volume. The question isn't which sources generate the most data; it's which sources your critical detection rules depend on. If a high-volume source with low detection value drifts, that's operationally annoying. If a source feeding your lateral movement or credential theft detections drifts, that's a genuine blind spot.
For each source, find out when the format last changed. Check vendor release notes for the past 12 months. If you upgraded to PAN-OS 11.1 after January 2026 and haven't reviewed your traffic log field changes, that's your first item. If you're normalizing to ECS and haven't validated your mappings against the latest ECS release notes since early 2026, you're likely behind.
Review your SIEM's parser warning logs. Enable verbose logging if it isn't already active. Look for messages about failed field extraction, unexpected data types, or schema validation failures. When you find them, trace them back to the source: which format changed, when, and which detection rules depend on the affected fields.
Then run a detection rule validation. Inject known-bad events, a simulated AssumeRole from an unauthorized principal, an authentication failure sequence that should trigger a brute-force detection, and confirm they produce the expected alerts. If they don't, work backwards to determine whether the failure is in the rule logic or in the parser feeding it.
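A validation run like that can be as small as a harness that pushes known-bad events through the same parser and rule the SIEM uses. This sketch uses a hypothetical parser and brute-force rule; the structure is what matters: if the assertion fails, you work backwards from the rule to the parser.

```python
# Sketch of end-to-end rule validation (hypothetical parser and rule):
# inject known-bad events and fail loudly if the expected alert doesn't fire.

def parse(line: str) -> dict:
    user, outcome = line.split(",")
    return {"user": user, "outcome": outcome}

def brute_force_rule(events: list) -> bool:
    """Fire on 5 or more authentication failures."""
    fails = sum(1 for e in events if e.get("outcome") == "failure")
    return fails >= 5

# A known-bad sample that MUST alert. If it doesn't, determine whether
# the failure is in the rule logic or in the parser feeding it.
sample = [parse("alice,failure") for _ in range(6)]
assert brute_force_rule(sample), "validation failed: check parser, then rule"
print("brute-force detection validated")
```

Run the same harness after every vendor upgrade and every parser change, and silent drift turns into a failed test instead of a missed incident.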
Who Actually Needs This
Not every environment does. If you're running a small deployment with a handful of stable log sources and a single SIEM with maintained built-in parsers, you can manage schema drift manually. It's painful but tractable.
The threshold where pipeline-layer validation earns its place: 50 or more log source types, multi-SIEM architecture, regulated compliance requirements, or an environment where vendor software updates are frequent and your upgrade cycles mean you're regularly running parsers against schemas they weren't written for.
The calculus also changes if you're planning a SIEM migration. Normalizing to OCSF at the pipeline layer before migration means your detection rules survive the transition. You're not rewriting parsers twice. You rewrite them once, in a vendor-neutral schema, and AxoRouter handles routing to whichever destination you land on.
Where This Leaves You
Schema drift is not a corner case. Five vendors, 12 months, 10 schema changes, 2 of them breaking. These weren't minor documentation tweaks. They were structural changes to the log formats that security teams rely on for daily detection.
The failure mode is what makes it dangerous: parsers don't crash when schemas change. They produce wrong output silently, feeding malformed data to detection rules that stop working without any visible indication. You find out during an incident investigation, not before it.
Catching drift at the SIEM layer is too late. By then, you've ingested corrupted data, your rules have already missed events, and your forensic timeline has holes. The right place to catch it is at the pipeline layer, before ingestion, before normalization, before the data reaches any downstream system.
That's what AxoRouter does. 28 years of logging infrastructure. From the people who built syslog-ng.
Already know what you need? Book a demo.
