SharePoint Online Permission Auditing at Scale ~ M365 Show with Mirko Peters - Microsoft 365 Digital Workplace Daily Podcast

Your SharePoint permissions are probably a mess. Not because you
don’t manage them — but because nobody can keep up with thousands
of sites changing daily. The shocking part? Most organizations
have no single report showing who has access to what. In this
session, I’ll show you the exact steps to scan every site, every
library, every user, without touching a single site manually. By
the end, you’ll know how to turn work that normally takes
weeks into an automated process that delivers daily, accurate reports, and
actually sleep better knowing you have control.

Why Traditional Permission Reviews Break at Enterprise Scale

You know that annual permissions review everyone gets so excited
about? The spreadsheet goes out, site owners tick through their
lists, managers sign off, and for about twenty-four hours it
feels like you’ve got everything under control. By the next week,
someone’s shared a folder with a new contractor, a project site
has been spun up without notice, and the “final” record you just
archived is already out of step with reality by a mile.

On a small
collection, it’s still possible to catch changes before they
spiral. You pull the list of site members, maybe check a couple
of groups, and confirm no one has oddball access. In that world,
manual review works. The permissions tree is short enough to see
in one screen, and the number of hands making changes is small
enough to track. It’s boring, but it’s manageable.

At enterprise
scale, that model falls apart fast. You’re no longer looking at a
tidy set of five intranet sites. You might be staring down ten
thousand sites across departments, regions, and business units —
and they’re not static. Teams create new sites daily, archived
projects never quite disappear, and content churn means
permission changes happen constantly. The window between your
review and the next significant change is sometimes measured in
hours.

Even worse, SharePoint is deceptive when you try to
eyeball it. Permissions can be inherited from the parent site,
overridden at the library level, tweaked on a folder, and then
patched again on a single file. A user’s access might not be
obvious because they’re coming in through a nested group, maybe
even through a security group synced from Azure AD (now Microsoft Entra ID) that itself
holds other groups. One missing click into those layers, and you
have no clue they’re in there.

Compliance teams still expect
clean audit logs and evidence of regular reviews. The reality is,
you’d need an army of admins to manually walk through each site’s
structure, note every permission, and confirm it’s valid. That’s
without factoring in time to re-check inherited and group-based
access, which changes the moment someone moves a user between
teams. The practicalities just don’t match the scale.

I worked
with an organization that dedicated over 80 admin hours to one
quarterly review. They split the workload, pulled membership
reports, even had a formal process mapped out. The end file
looked thorough — but two weeks later, a penetration test found
guests with edit access to confidential folders that had been
missed entirely. Not because anyone failed at their job, but
because the access came through a nested group that never
appeared on the manual report.

That’s the gap that will keep you
awake. Stale permissions hiding deep in site structures.
Terminated employees whose accounts linger in synced groups.
Guest accounts that were supposed to expire but didn’t. They’re
easy to miss, and if you’re relying on a manual sweep, you’re
counting on luck as much as process.

You start to realise the
“snapshot once a year” model isn’t broken because people are lazy
— it’s broken because the system it’s trying to capture moves
constantly. Permissions are living data. Treating them like a
static list means you’re always in the past, never in the live
state of your environment.

The solution isn’t throwing more
people at the review. It’s building a way to query and
consolidate this data automatically, so the moment something
changes, your reports reflect that. The next step is connecting
to every site without needing to click through them one by one —
and that’s where a more capable tool comes in.

Building the Foundation with PnP PowerShell

Imagine opening a PowerShell window, running one command, and
being connected to every SharePoint site in your tenant. No
browser tabs, no endless clicking through site collections — just
a direct line into the entire environment from a single place.
That’s exactly what PnP PowerShell gives you, and if you’ve only
used it for small ad‑hoc scripts, it can be a bit of a shock how
far it can actually stretch.

PnP PowerShell is essentially your
bridge between SharePoint Online and your automation environment.
It wraps Microsoft’s APIs into commands that are easier to work
with, while still giving you access to advanced functionality
under the hood. At a small scale, you can get away with running
`Connect-PnPOnline` interactively, logging in with your account,
and pulling some site data. But at scale, interactive logins
become a nightmare — you can’t expect scheduled processes to sit
there waiting for someone to type a password or approve MFA.

That’s where the cracks start to show in naïve scripts. You might
get halfway through enumerating sites before your token expires.
You might hammer the service too quickly and hit throttling
limits. Or you discover that not every site fits the same neat
structure — some use modern team templates, others are classic
collections with oddball permissions and settings in unexpected
places. The more you try to brute‑force it, the more brittle it
becomes.

A better way is to shift to app-only connections. In
practice, this means creating an Azure AD app registration,
granting it the necessary SharePoint and Graph permissions, and
authenticating with a certificate rather than a user account.
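
For the Graph calls used later in this audit, the same app-only pattern can be sketched in Python. This is a minimal sketch under stated assumptions, not PnP PowerShell's own mechanism: it assumes the third-party `msal` package is installed, and the tenant ID, client ID, thumbprint, and key path are hypothetical values you would supply yourself, ideally from Key Vault rather than a local file.

```python
from pathlib import Path

GRAPH_SCOPE = "https://graph.microsoft.com/.default"


def authority_url(tenant_id: str) -> str:
    """Build the tenant-specific sign-in authority URL."""
    return f"https://login.microsoftonline.com/{tenant_id}"


def build_certificate_credential(thumbprint: str, private_key_pem: str) -> dict:
    """Shape the certificate material as the dict MSAL accepts."""
    if "PRIVATE KEY" not in private_key_pem:
        raise ValueError("expected a PEM-encoded private key")
    # Assumption: thumbprints copied from a portal may contain separators,
    # so strip colons and spaces and keep plain hex.
    cleaned = thumbprint.replace(":", "").replace(" ", "").upper()
    return {"thumbprint": cleaned, "private_key": private_key_pem}


def acquire_app_only_token(tenant_id, client_id, thumbprint, key_path):
    """Request an app-only Graph token: no user, password, or MFA prompt."""
    import msal  # third-party dependency: pip install msal

    credential = build_certificate_credential(
        thumbprint, Path(key_path).read_text()
    )
    app = msal.ConfidentialClientApplication(
        client_id,
        authority=authority_url(tenant_id),
        client_credential=credential,
    )
    result = app.acquire_token_for_client(scopes=[GRAPH_SCOPE])
    if "access_token" not in result:
        raise RuntimeError(result.get("error_description", "token request failed"))
    return result["access_token"]
```

Because nothing here waits on a person, the same function can run unattended on a schedule.
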
That certificate‑based auth is far more stable for unattended
processes. PnP PowerShell supports it out of the box, so once you
have the certificate stored securely — preferably somewhere like
Azure Key Vault — your scripts can connect without prompts and
without risking expired passwords.

Now, how do you actually find
all the sites to connect to? At tenant scale, you can’t maintain
a hardcoded list. After connecting to the admin center with `Connect-PnPOnline`, you can
list site collections with `Get-PnPTenantSite`, use Search-based site enumeration, or integrate with Microsoft Graph
to pull every site collection URL dynamically. Graph tends to be
better for consistency, but PnP’s Search approach can give you
quick wins in smaller tenants. The key is that the enumeration
itself has to be tenant‑wide and automated — no manual curation.
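
As a rough illustration of the Graph route, the sketch below collects every site URL by following result pages until none remain. It makes several assumptions: the `/sites/getAllSites` starting URL is one possible choice of endpoint, `make_fetcher` and `iter_site_urls` are hypothetical helper names, and the access token comes from an app-only sign-in.

```python
import json
import urllib.request
from typing import Callable, Iterator

# Assumed starting point for tenant-wide enumeration.
ALL_SITES_URL = "https://graph.microsoft.com/v1.0/sites/getAllSites"


def make_fetcher(access_token: str) -> Callable[[str], dict]:
    """Return a function that GETs a Graph URL and parses the JSON body."""
    def fetch_page(url: str) -> dict:
        request = urllib.request.Request(
            url, headers={"Authorization": f"Bearer {access_token}"}
        )
        with urllib.request.urlopen(request) as response:
            return json.load(response)
    return fetch_page


def iter_site_urls(fetch_page: Callable[[str], dict],
                   start_url: str = ALL_SITES_URL) -> Iterator[str]:
    """Yield every site webUrl, following @odata.nextLink until it runs out."""
    url = start_url
    while url:
        page = fetch_page(url)
        for site in page.get("value", []):
            web_url = site.get("webUrl")
            if web_url:
                yield web_url
        # The next-page link is present only while more pages remain.
        url = page.get("@odata.nextLink")
```

Passing the fetch function in as a parameter keeps the loop testable without a tenant: a dictionary of canned pages can stand in for the service.
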
Once you have a list, you still need to be respectful to the
service. Batch your requests. Use pauses or throttling controls.
It’s not just about avoiding 429 errors from Microsoft; it’s
about making sure your process finishes in a realistic timeframe
without overwhelming the endpoints. Handling this well means
structuring your loops so they process a manageable subset of
sites at a time, writing interim results, and resuming gracefully
if a session drops.

An example of secure handling in action would
be using a PowerShell runbook that pulls your certificate from
Key Vault at runtime, connects to the admin center, retrieves all
site URLs using Graph, and then iterates through them in
controlled batches. No login prompts. No hardcoded credentials.
Fully repeatable. You could run that on demand today, and
tomorrow on a schedule. At this point, you’ve essentially wired
your console into the nervous system of your SharePoint tenant.
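
The batching and resume behaviour described above can be sketched like this. It is a minimal sketch, not a full runbook: `audit_site` is a hypothetical callback that does the per-site work, and the checkpoint is assumed to be a small JSON file holding the URLs already finished.

```python
import json
from pathlib import Path
from typing import Callable, Iterable


def load_done(checkpoint: Path) -> set:
    """Return the set of site URLs already audited in an earlier run."""
    if checkpoint.exists():
        return set(json.loads(checkpoint.read_text()))
    return set()


def audit_in_batches(site_urls: Iterable[str],
                     audit_site: Callable[[str], None],
                     checkpoint: Path,
                     batch_size: int = 50) -> int:
    """Audit sites in fixed-size batches, saving progress after each batch."""
    done = load_done(checkpoint)
    pending = [url for url in site_urls if url not in done]
    processed = 0
    for start in range(0, len(pending), batch_size):
        for url in pending[start:start + batch_size]:
            audit_site(url)
            done.add(url)
            processed += 1
        # Interim write: a dropped session loses at most one batch of work.
        checkpoint.write_text(json.dumps(sorted(done)))
    return processed
```

If a run dies partway through, calling `audit_in_batches` again with the same checkpoint file skips the sites that were already saved and picks up with the rest.
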
You can reach every site programmatically without ever touching
the UI. That solves the first hurdle for enterprise‑scale
permission auditing: discovery and connection.

But what you have
right now is still surface‑level. You can grab site properties,
maybe top‑level groups, but you’re not yet seeing the nested,
inherited access that actually matters for compliance. Getting
that depth means tapping into a richer dataset than PnP alone
provides. The commands here are great at orchestrating
connections and traversing sites, but to unpick the full
permission story across every file and folder, we need to bring
in another API that was built to expose those relationships
cleanly. That’s where the next layer of this approach comes into
play.
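
The "pauses or throttling controls" mentioned earlier in this section can be sketched as a small retry wrapper. This is one way to do it, under assumptions: `send` is a hypothetical function returning a `(status, headers, body)` tuple, the `Retry-After` value is taken to be a number of seconds, and the exponential fallback is an illustrative choice rather than a documented requirement.

```python
import time
from typing import Callable, Tuple


def call_with_retry(send: Callable[[], Tuple[int, dict, object]],
                    max_attempts: int = 5,
                    default_wait: float = 2.0,
                    sleep: Callable[[float], None] = time.sleep):
    """Call send() until it stops returning 429, honouring Retry-After."""
    for attempt in range(1, max_attempts + 1):
        status, headers, body = send()
        if status != 429:
            return status, body
        if attempt == max_attempts:
            break
        # Prefer the server's hint; otherwise back off exponentially.
        retry_after = headers.get("Retry-After")
        if retry_after:
            wait = float(retry_after)
        else:
            wait = default_wait * 2 ** (attempt - 1)
        sleep(wait)
    raise RuntimeError(f"still throttled after {max_attempts} attempts")
```

Injecting `sleep` makes the wrapper easy to exercise without real delays, and capping the attempts means a persistently throttled job fails loudly instead of hanging.
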

Mining Permission Data with Microsoft Graph API

If connecting with PnP PowerShell gives you the keys to every
site, using Microsoft Graph API is like walking into each one and
actually seeing the full guest list — who’s there because they
were invited directly, who’s part of a group, and who’s passing
through from an inherited door you didn’t even notice. It’s the
part where you stop guessing and start getting a clear, unified
view across thousands of sites and libraries at once.

Graph sits
underneath a lot of Microsoft 365 services. For permissions, it
acts as the backbone that lets you query SharePoint, OneDrive,
and Teams in a consistent way. The difference is it doesn’t just
hand you a flat list. It lets you pull site objects, lists,
libraries, files, and the associated permission objects for each.
That matters, because nothing in SharePoint permissions lives
neatly in one place. Direct assignments live alongside group
memberships, which may be sitting in Azure AD groups that have
their own nested groups inside.

For example, the
`/sites/{site-id}/permissions` endpoint can tell you about
access grants at the site level (in practice, application access grants rather than sharing links), but that
doesn’t give you everything. List-level permissions might require
`/sites/{site-id}/lists/{list-id}/permissions`, and item or
file-level access calls might need
`/drives/{drive-id}/items/{item-id}/permissions`. To make sense
of who actually has what, you have to stitch those results
together. That includes looking up group memberships using
`/groups/{group-id}/members` and resolving user objects so you
know exactly who’s behind a group entry.

Where it gets messy is
that inheritance is invisible if you only look at direct
permissions. A file might say it has no unique permissions, which
really means “look up a level.” If you stop there, you’ll miss
whole categories of access. So, you need logic in your process
that steps up the chain — from file to folder to library to site
— checking at each level and consolidating that data until you
see the complete inherited path.

Pagination and throttling are
another reality here. Graph returns large result sets one page at a
time (the page size varies by endpoint), and you need to follow each
`@odata.nextLink` URL to
pull the rest. At scale, that means building request loops that
can handle thousands of responses without timing out or losing
context. Throttling shows up as HTTP 429 responses carrying a
`Retry-After` header, so your code has to wait that long before retrying or
you’ll get nowhere fast.

One trap admins fall into is only
collecting direct permissions. That produces a clean-looking
dataset that’s also dangerously incomplete. Using multiple Graph
calls together solves that — file-level permissions plus
library-level, list-level, and site-level data, cross-referenced
with full Azure AD group membership expansions. The end goal is
not separate spreadsheets for each type, but one flattened,
normalized dataset where each row shows the resource, the
resolved user, and the effective access level they have,
regardless of how it was granted.

A practical approach is to run
collection in two passes. First, enumerate all resources — sites,
lists, and critical libraries or document sets. Second, for each
resource, query direct permissions and then walk upward to
collect inherited entries. During that, resolve any group IDs you
find into actual user accounts by calling the group membership
endpoints. That way, by the time you run analysis, you’re working
only with tangible user and guest objects, not cryptic IDs.

The
result is a dataset that’s usable. You can sort by user and see
every resource they touch, or sort by resource and see every
account with access. You can apply filters for things like
“guest” or “external” and have instant answers without pulling
fresh reports. This is the kind of visibility that an annual
manual review could never match — because you can run it any time
you want and be confident that nothing’s hidden in a group
nesting three levels deep.

With that level of accuracy, the next
obvious step is to stop running it manually at all. If you can
make the queries run on their own, on a schedule, you’ll always
have a fresh picture without someone hovering over a PowerShell
window. That’s where orchestration kicks in.
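
The flattening described in this section (walk up the inheritance chain, expand nested groups, emit one row per user and resource) can be sketched on in-memory data. This is a simplified model, not the shape of real Graph responses: the `resources` and `group_members` dictionaries, their field names, and the helper names are all hypothetical, and `grants` set to `None` stands for "no unique permissions, look up a level".

```python
def effective_grants(resource_id, resources):
    """Walk up the parent chain until a level with its own grants is found."""
    current = resource_id
    while current is not None:
        node = resources[current]
        if node["grants"] is not None:
            return current, node["grants"]
        current = node["parent"]
    return None, []


def expand_group(group_id, group_members, seen=None):
    """Resolve a group to user IDs, visiting each nested group only once."""
    seen = set() if seen is None else seen
    if group_id in seen:
        return set()  # guards against groups that contain each other
    seen.add(group_id)
    users = set()
    for member in group_members.get(group_id, []):
        if member["kind"] == "group":
            users |= expand_group(member["id"], group_members, seen)
        else:
            users.add(member["id"])
    return users


def flatten(resources, group_members):
    """Produce one row per (resource, resolved user, role)."""
    rows = []
    for resource_id in resources:
        source, grants = effective_grants(resource_id, resources)
        for grant in grants:
            if grant["kind"] == "group":
                users = expand_group(grant["principal"], group_members)
                via = grant["principal"]
            else:
                users, via = {grant["principal"]}, "direct"
            for user in sorted(users):
                rows.append({
                    "resource": resource_id,
                    "user": user,
                    "role": grant["role"],
                    "granted_at": source,
                    "via": via,
                })
    return rows
```

Keeping `granted_at` and `via` on each row preserves how the access was granted, so a reviewer can tell a direct grant on a file from access inherited through a nested group at the site level.
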

Automating the Audit with Azure Automation

Picture starting your day and finding a complete, up-to-date
permissions report sitting in your inbox — no late‑night scripts,
no one remoting into a server, no manual exports. That’s the
appeal of putting the whole process on autopilot, and Azure
Automation is one of the best ways to make it happen. It’s
essentially the scheduler and execution engine for all the PnP
PowerShell and Microsoft Graph work you’ve already put together,
but without you having to be in front of a keyboard.

Azure
Automation runbooks are where your scripts live and run in the
cloud. Instead of leaving them on a server that someone might
reboot or lose access to, you upload them into a managed service.
That service handles the execution, logs the results, and lets
you trigger them on a schedule. But when scripts run without you
watching, things can go wrong in ways that are easy to miss —
like an expired certificate stopping authentication, or a
long‑running job hitting a timeout halfway through. If you don’t
plan for those, you’ll have a report that fails silently, or
worse, delivers incomplete data that looks fine at a glance.

The
starting point is securing your authentication. Pasting
credentials into a script is a quick way to make a security team
very unhappy. The right approach is to store your certificate or
client secret in Azure Key Vault and have the runbook pull it at
runtime. Key Vault keeps the sensitive material encrypted, and
role‑based access controls make sure only your automation account
can retrieve it. When the certificate expires, you can roll it
over in one place without editing every script.

Scheduling in
Azure Automation is flexible. Daily runs capture a near‑real‑time
picture, but if your environment changes more slowly, a weekly
schedule might be enough. You can set exact times, align with
off‑peak hours to reduce load on the tenant, and even kick off
runs in response to events instead of just the clock. If the job
needs more resources than the Azure sandbox can offer — for
example, if you’re dealing with extremely large tenants and
running very long enumerations — Hybrid Runbook Workers let you
execute those scripts on‑premises or in a dedicated VM while
still managing them from Azure Automation.

Logging is just as
important as the output itself. Without logs, troubleshooting
becomes guesswork. Azure Automation can capture both standard
output and error streams into job logs, which you can review in
the portal or export for longer‑term storage. Keeping that
history means you can prove the audit ran at a given time and see
exactly what happened if it didn’t finish. For compliance, that
audit trail can be as valuable as the permission report itself.
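
One way to avoid the "fails silently" and "incomplete data that looks fine" problems is to have the job record its own outcome. The sketch below is a minimal illustration using Python's standard `logging` module as a stand-in for runbook job logs; `run_audit` and `audit_site` are hypothetical names, and the summary fields are assumptions about what a reviewer would want to see.

```python
import logging
from datetime import datetime, timezone

log = logging.getLogger("permission_audit")


def run_audit(site_urls, audit_site):
    """Audit each site, logging failures instead of silently skipping them."""
    summary = {
        "started": datetime.now(timezone.utc).isoformat(),
        "succeeded": 0,
        "failed": [],
    }
    for url in site_urls:
        try:
            audit_site(url)
            summary["succeeded"] += 1
        except Exception as exc:  # keep going, but leave evidence behind
            log.error("audit failed for %s: %s", url, exc)
            summary["failed"].append(url)
    summary["finished"] = datetime.now(timezone.utc).isoformat()
    # A report is only marked complete when every site was audited.
    summary["complete"] = not summary["failed"]
    log.info("audit finished: %d ok, %d failed",
             summary["succeeded"], len(summary["failed"]))
    return summary
```

Storing the summary next to the report gives you the timestamped proof that the audit ran, plus an explicit flag when the data is partial.
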
When the script completes, you have options for where the data
lands. You could write the CSV or JSON output to a SharePoint
document library, drop it into Azure Blob Storage, or attach it
to an automated email via an SMTP relay or a Logic App. Each has
trade‑offs — SharePoint is great for team access, Blob Storage
handles very large files, email is instant but less secure for
sensitive datasets. The point is, you choose the delivery that
fits your review process.

One more layer is identifying when
something truly needs attention. It’s possible to integrate basic
change detection — for example, compare today’s dataset with
yesterday’s, and if a new guest user appears in a sensitive site,
post an alert in Teams or send a flagged email to the security
group. That turns your scheduled job from just a reporting tool
into an active early-warning system.

By combining Azure
Automation’s scheduling and credential management with the data
collection you’ve already built using PnP PowerShell and Graph,
you move from reactive, ad‑hoc checks to a baked‑in, hands‑off
process. Now, those three parts — connection, data retrieval, and
automation — work together as one continuous, proactive posture
instead of three disconnected tasks.
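
The basic change detection mentioned above (compare today's dataset with yesterday's and flag new guests on sensitive sites) can be sketched as a set difference. The row fields `site`, `user`, `role`, and `is_guest`, and the function name, are hypothetical; they stand for whatever columns your flattened report actually uses.

```python
def new_guest_access(yesterday, today, sensitive_sites):
    """Return today's guest rows on sensitive sites that were absent yesterday."""
    def key(row):
        return (row["site"], row["user"], row["role"])

    previous = {key(row) for row in yesterday}
    return [
        row for row in today
        if row["is_guest"]
        and row["site"] in sensitive_sites
        and key(row) not in previous
    ]
```

Each returned row is a candidate alert. Posting it to Teams or emailing the security group is left to whatever delivery mechanism you already use.
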

Conclusion

At enterprise scale, guessing at permissions isn’t an option.
Without a live, accurate view, you’re hoping nothing slips
through — and hope isn’t a security strategy. The tools are there
to make this effortless once you connect them. If you do one
thing this week, set up a PnP PowerShell connection to your
tenant. That’s the base you can build on. From there, expand into
Graph queries and automation. When you move from chasing problems
to monitoring in real time, you stop firefighting. You start
managing with intent — and that shift changes both your
productivity and your peace of mind.

