Remote work has changed what "location" means for most organizations. For many roles, flexibility is the whole point. But for certain positions – export-controlled work, regional licensing requirements, contractual obligations – location still matters. When policy says people should be in specific places, you need a way to verify that without being invasive or building something overly complex.
A customer came to us with a specific problem: they suspected some remote workers weren't actually working from their reported locations. Rather than immediately building a multi-system correlation engine, we started with something simpler: use Zoom's own telemetry to flag employees whose connection patterns look consistently different from their peers.
Why IP geolocation doesn't work
The obvious first move is to check each participant's IP-derived country against their reported work location. Zoom includes location strings based on IP geolocation. But this signal falls apart for exactly the scenarios we care about:
VPN/proxy masking – Anyone can route through a domestic VPN, making a foreign connection appear local. If you're actually in Vietnam but connect through a US VPN service, your IP says "United States."
Corporate egress points – Remote workers who tunnel through HQ infrastructure show the office IP, not their home IP. Your entire remote workforce might appear to be in your headquarters city.
Cloud/VDI – Virtual desktops hosted in US data centers produce US IPs regardless of where the person controlling them is sitting. Someone operating a cloud desktop from overseas looks identical to someone using it from down the street.
Geolocation errors – Mobile carriers and smaller ISPs frequently map to the wrong city or even wrong country. IP geolocation databases are best-effort guesses, not ground truth.
The fundamental issue: IP country tells you where the connection appears to terminate, not where the human is physically located.
Why QoS metrics are better
Network quality telemetry reflects the actual path packets travel. These metrics are constrained by physics in ways IP addresses aren't.
Latency – Round-trip time to Zoom's infrastructure. A VPN can mask your IP origin, but it can't eliminate the extra milliseconds from routing through distant servers. Someone in Southeast Asia connecting via a US VPN will still show higher round-trip time than someone genuinely sitting in the US. You can fake where your packets come from, but you can't fake how long they take to get there.
Jitter – Variance in packet timing. Multi-hop international paths introduce more timing variation than direct domestic connections. Consistent low jitter suggests a clean, short path. Erratic jitter suggests a complex route.
Packet loss – Correlates with path complexity and congestion. International routes and overloaded VPN tunnels tend to show higher loss rates than direct domestic ISP connections.
The key insight: QoS metrics encode distance and path quality in ways that are hard to fake. Someone claiming to be in New York but consistently showing 250ms latency while their meeting peers show 40ms is worth investigating.
Why we compare within meetings, not against absolute thresholds
You might think the answer is to set absolute thresholds (e.g., flag anyone with latency over 200ms, or jitter above 50ms). But network conditions vary legitimately based on ISP quality, time of day, local infrastructure, and a dozen other factors. Rural fiber performs differently than urban cable, which performs differently than a mobile hotspot. What's "normal" in one context is terrible in another.
The solution is cohort-relative comparison: measure each participant against others in the same meeting, at the same time. This approach normalizes for meeting-specific conditions (Zoom server location, time of day, overall network weather) and surfaces participants who are consistently different from their peers.
If everyone in a meeting has high latency, that's a Zoom infrastructure issue or a bad meeting route. If one person has high latency while everyone else is fine, repeatedly, that's interesting.
The data we used
Building on the framework from part 1, we pulled from three related endpoints in Zoom's historical Dashboard/Metrics APIs:
Meeting context (/metrics/meetings)
- Meeting identifiers, topics, timestamps
- Duration – both scheduled and actual
- Basic metadata we use for grouping and filtering
Participant sessions (/metrics/meetings/{id}/participants)
- Identity: user_id, email, display name
- Session timing: when they joined, when they left
- Connection context: device type, IP address, location string
- We also calculated an attendance ratio (participant_duration / meeting_duration) to filter out people who only dropped in briefly
QoS summaries (/metrics/meetings/{id}/participants/qos_summary)
This is where the signal lives. Per-participant network quality metrics, broken down by stream type:
- audio_input, audio_output (microphone and speaker)
- as_input, as_output (screen share, both directions)
Each stream provides metrics that fall into two categories:
Degradation-when-high (bad when numbers go up):
- Latency (milliseconds)
- Jitter (milliseconds)
- Packet loss (percentage)
Degradation-when-low (bad when numbers go down):
- Bitrate (kbps)
- Frame rate (fps)
- Resolution
We also pulled sharing context from /metrics/meetings/{id}/participants/sharing for enrichment, though it didn't factor into the core scoring. ASN enrichment via MaxMind was available if we wanted ISP-level context, but we kept it optional to keep the initial implementation simple.
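To make the shape of the pull concrete, here's a minimal sketch of the collection step in Python. The `zoom_get` helper is assumed (a thin wrapper that handles OAuth and next_page_token pagination), and the field names are abbreviated from the payloads described above, so treat this as an illustration rather than the exact code we ran:

```python
from datetime import datetime

def _iso(ts: str) -> datetime:
    # Zoom timestamps look like "2024-05-01T14:03:22Z"
    return datetime.fromisoformat(ts.replace("Z", "+00:00"))

def collect_meeting_qos(zoom_get, date_from: str, date_to: str) -> list[dict]:
    """Pull meetings, participant sessions, and QoS summaries for a date window."""
    meetings = zoom_get("/metrics/meetings",
                        {"type": "past", "from": date_from, "to": date_to})["meetings"]
    records = []
    for m in meetings:
        mid = m["uuid"]
        participants = zoom_get(f"/metrics/meetings/{mid}/participants",
                                {"type": "past"})["participants"]
        qos = zoom_get(f"/metrics/meetings/{mid}/participants/qos_summary",
                       {"type": "past"})["participants"]
        qos_by_user = {q.get("user_id"): q.get("qos", []) for q in qos}

        meeting_secs = max((_iso(m["end_time"]) - _iso(m["start_time"])).total_seconds(), 1)
        for p in participants:
            session_secs = (_iso(p["leave_time"]) - _iso(p["join_time"])).total_seconds()
            records.append({
                "meeting_id": mid,
                "user_id": p.get("user_id"),
                "email": p.get("email"),
                "ip_address": p.get("ip_address"),
                "location": p.get("location"),
                "device": p.get("device"),
                # Attendance ratio: used later to filter out brief drop-ins.
                "attendance_ratio": session_secs / meeting_secs,
                "qos_streams": qos_by_user.get(p.get("user_id"), []),
            })
    return records
```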
The detection logic
The approach has two layers: flag anomalies within individual meetings, then look for people who show the same pattern repeatedly across many meetings.
Step 1: Per-meeting peer comparison
For each meeting, each QoS metric type, and each participant, we calculated leave-one-out statistics:
peer_mean = (sum of all values - this participant's value) / (n - 1)
peer_std = standard deviation of all values excluding this participant
z_score = (participant_value - peer_mean) / peer_std
This gives us a measure of how different this participant is from everyone else in the same meeting, without letting their own value skew the baseline.
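In code, the leave-one-out comparison is only a few lines. A minimal sketch in plain Python, where `values` holds one metric (say, audio-input latency) for every participant in a single meeting:

```python
from statistics import mean, stdev

def leave_one_out_z(values: list[float], i: int) -> float | None:
    """Z-score of participant i against the other participants in the same meeting."""
    peers = values[:i] + values[i + 1:]
    if len(peers) < 2:
        return None          # not enough peers for a meaningful baseline
    peer_std = stdev(peers)
    if peer_std == 0:
        return None          # all peers identical; z-score is undefined
    return (values[i] - mean(peers)) / peer_std
```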
A participant was flagged as anomalous for that meeting if both of these conditions were true:
- Their z-score exceeded ~2 standard deviations in the "bad" direction for that metric type (high for latency/jitter/loss, low for bitrate/framerate/resolution)
- Their absolute value fell in the extreme quantiles of the global distribution (e.g., ≥75th percentile for high-is-bad metrics, ≤25th percentile for low-is-bad metrics)
Why the dual threshold? The z-score catches people who are outliers relative to their peers. The global quantile check prevents flagging someone just because they had the "worst" connection in a meeting where everyone's connection was actually fine. You need to be both different from your peers AND objectively in the bad tail of the overall distribution.
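Combining the two conditions, the per-meeting flag might look like the sketch below. The `pct` argument is assumed to be the participant's percentile rank (0–1) for that metric in the global, cross-meeting distribution, computed elsewhere:

```python
# Which direction counts as "bad" for each metric family.
HIGH_IS_BAD = {"latency", "jitter", "packet_loss"}
LOW_IS_BAD = {"bitrate", "frame_rate", "resolution"}

def is_anomalous(metric: str, z: float, pct: float,
                 z_cut: float = 2.0, q_cut: float = 0.75) -> bool:
    """Flag only if the participant is both an outlier against meeting peers
    (z-score) and in the bad tail of the global distribution (percentile)."""
    if metric in HIGH_IS_BAD:
        return z >= z_cut and pct >= q_cut
    if metric in LOW_IS_BAD:
        return z <= -z_cut and pct <= 1 - q_cut
    return False
```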
Step 2: Cross-meeting consistency
A single bad meeting means nothing. Someone's home internet could have hiccupped. They could have been on a train. We only surfaced participants with repeating patterns:
- Minimum meetings observed: participant must appear in enough meetings for the metric to be meaningful (we used 5 as a baseline)
- Minimum anomalous meetings: at least 2 meetings flagged as anomalous
- Anomaly ratio threshold: ≥60% of their appearances must show the same anomaly
- Directional consistency: sustained extreme values in the same direction across meetings (always high latency, not sometimes high/sometimes low)
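Rolling that up across meetings is a simple aggregation. A sketch, where `flags` is the list of per-meeting anomaly booleans for one participant/metric pair produced by the step above (directional consistency comes for free, because a flag is only ever raised in that metric's "bad" direction):

```python
def consistently_anomalous(flags: list[bool],
                           min_meetings: int = 5,
                           min_anomalies: int = 2,
                           min_ratio: float = 0.6) -> bool:
    """True if this participant/metric pair is anomalous often enough,
    across enough meetings, to be worth putting in the review queue."""
    if len(flags) < min_meetings:
        return False
    hits = sum(flags)
    return hits >= min_anomalies and hits / len(flags) >= min_ratio
```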
The output wasn't just a list of names. For each flagged participant, we included:
- Meeting timestamps
- Devices used
- IP addresses and countries
- ASN organization (ISP) when available
- Attendance ratios (to understand if they were full participants or brief drop-ins)
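To give a sense of shape, a review-queue entry might look roughly like this (the values and field names here are illustrative, not customer data):

```python
flagged_example = {
    "email": "user@example.com",
    "metric": "audio_input.latency",
    "meetings_observed": 11,
    "meetings_anomalous": 8,
    "anomaly_ratio": 0.73,
    "meeting_timestamps": ["2024-05-02T15:00:00Z", "2024-05-06T18:30:00Z"],
    "devices": ["Windows"],
    "ip_countries": ["United States"],
    "asn_org": "Example VPN Provider",   # present only when MaxMind enrichment is on
    "median_attendance_ratio": 0.92,
}
```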
What we found
The analysis produced a focused shortlist: participants whose QoS repeatedly diverged from their meeting peers in a consistent, directional way. This wasn’t a verdict – it was a prioritized review queue with enough context for responsible follow-up. The attendance ratio was especially helpful for filtering out brief, legitimate outliers (e.g., someone joining from a car for five minutes and dropping off).
One nuance worth calling out: the most interesting cases weren’t simply “weird geo.” They were consistently extreme QoS even when the IP geolocation looked normal and expected (e.g., within the US) – which is why, at this stage, we compared candidates against the locations inferred from their IPs for context, rather than treating geo alone as the signal.
For example, one participant repeatedly showed 220–280ms latency while peers in the same meetings averaged 35–50ms. Another exhibited sustained high jitter and packet loss across 12 of 14 meetings, consistently tied to the same IP block that geolocated to the correct country, but whose ASN pointed to a commercial VPN provider.
Why single-source first
The mature version layers in cross-correlation with Okta sign-ins, VPN logs, endpoint telemetry, and even badge swipe data.
But starting with Zoom alone had real advantages: speed (no integrations or pipelines, analysis started the same day), explainability (every flag ties directly to observable peer differences in specific meetings), and validation (prove the signal exists before investing in broader infrastructure). Once the signal is clearly real and useful, it makes sense to add additional context.
Next: Schema and repeatability
Zoom’s APIs are a goldmine – but to get the full value, they need to be mapped into the same normalized schema as the rest of your security telemetry.
In part 3, we’ll show how we mapped this custom Zoom stream (meetings, participants, QoS) to the customer’s schema alongside Okta/VPN/endpoint, so Zoom becomes a first-class source you can query consistently and correlate naturally with everything else.