Cyberax AI Playbook
cyberax.com
How-to · Operations & Knowledge

SOP extraction from interviews

Turn the "how I actually do this" interview with a senior operator into a written, structured standard operating procedure — without the operator writing it themselves and without the documentation gap that opens when they leave the company. This playbook covers the interview structure that captures real workflow, the extraction pipeline that produces something a new hire can follow, and the gap-finding pass that surfaces the steps the operator forgot to mention.

At a glance Last verified · May 2026
Problem solved Convert interviews with senior operators into structured, written SOPs — without requiring the operator to write the doc themselves and without the institutional-knowledge gap that opens when they leave or rotate roles
Best for Ops leads documenting tribal-knowledge workflows, COOs systematising growing teams, HR / L&D preparing for onboarding scale, founders capturing what's currently in their head
Tools Claude, GPT-4o, Gemini, Otter, Fireflies, Tango, Scribe
Difficulty Intermediate
Cost $0.20–$2 per interview (LLM extraction) → $5–25/month (transcription tools) → $30–60/month (workflow-capture tools like Tango / Scribe)
Time to set up A few days for the first SOP; the interview structure and extraction prompts amortise across many

The workflow goes: interview the senior operator about how they actually do the work, transcribe the conversation, run it through an AI pipeline that pulls out steps, inputs, outputs, and decision points, then do a gap-finding pass that flags everything the operator forgot to mention. The operator narrates. The pipeline writes.

The reason this matters: every growing company has the operator who keeps the books closing on time, the office manager who knows every vendor and lease, the customer-success lead who handles every executive escalation. The standard ask is “write up your SOPs.” It’s also the standard failure mode. Senior operators are bad at writing SOPs about their own work because the work is too obvious to them — they leave out the steps they consider too basic to mention, and those are usually the ones that trip a new hire.

This piece walks through the pipeline end to end: the interview structure that captures real workflow, the extraction that produces a follow-able document, and the gap-finding pass that catches the institutional knowledge the operator forgot to mention.

When to use

Where this fits — and where it doesn't

Use this if you have senior operators with tribal-knowledge workflows, the workflows are operationally important (closing the books, handling escalations, managing vendors), and the documentation gap is real. Common fits: companies past 30 employees, ops teams where founders are still doing operator work, services businesses where the principal’s workflow is the product, any team facing a senior-operator transition.

Don’t use this if the workflow is already well-documented (use what exists), the operator is willing and able to write the SOP themselves (let them — the result is better), or the workflow involves heavy software / UI interaction (use a screen-recording tool like Tango or Scribe instead, which captures the UI steps automatically). For workflow types where the operational knowledge is mostly in software (a specific reporting workflow in Salesforce), screen capture beats interview extraction.

Prerequisites

What you'll need before starting

  • A senior operator willing to be interviewed — usually 30–60 minutes per workflow.
  • A transcription tool — Otter, Fireflies, or your meeting platform’s native transcription. Transcript quality matters: a live transcript with poor accuracy costs more time than it saves.
  • A model API key with reasonable context window — Claude, GPT, or Gemini. Interviews can run to 10–30k tokens; all three handle this comfortably.
  • A draft SOP structure for each workflow type. “Customer escalation procedure,” “monthly close,” and “vendor onboarding” each have a different shape. Pre-built skeletons accelerate extraction.
  • A reviewer who will validate the extracted SOP — usually the operator themselves, plus one person from the team that will execute the procedure.

The solution

Six steps from a 45-minute interview to a usable SOP

  1. Pre-write the interview structure — open-ended, scenario-driven, not yes/no

    The interview is the input quality lever. Structure it around three prompt types: (a) walk-through (“walk me through the last time you did X — what did you do first?”), (b) edge cases (“what happens when Y goes wrong? When did you last have to handle that?”), and (c) decision points (“when you see Z, how do you decide between option A and option B?”). Avoid yes/no questions — they produce checklists, not workflows. Avoid hypotheticals — operators describe what they wish they did, not what they actually do.
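The three prompt types can be kept as a small reusable template bank so the interview structure amortises across workflows. A minimal Python sketch; the question wording and the slot names (`failure_mode`, `signal`, `option_a`, `option_b`) are illustrative, not prescriptive:

```python
# Illustrative question bank for the three prompt types.
# Wording is an example; adapt it to the workflow being documented.
QUESTION_BANK = {
    "walk_through": [
        "Walk me through the last time you did {workflow} -- what did you do first?",
        "What happened right before you started? What triggered it?",
    ],
    "edge_cases": [
        "What happens when {failure_mode} goes wrong? When did you last handle that?",
    ],
    "decision_points": [
        "When you see {signal}, how do you decide between {option_a} and {option_b}?",
    ],
}

def build_script(workflow: str, **slots: str) -> list[str]:
    """Fill the templates for one workflow; unfillable templates are kept as-is."""
    questions = []
    for prompts in QUESTION_BANK.values():
        for tpl in prompts:
            try:
                questions.append(tpl.format(workflow=workflow, **slots))
            except KeyError:
                # Missing slot: keep the raw template for manual editing.
                questions.append(tpl)
    return questions
```

Unfilled templates survive as placeholders, so a half-specified workflow still produces a usable interview script.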

  2. Record and transcribe — high-quality audio matters more than fancy tooling

    Record the conversation with a meeting-recording tool. Good audio matters more than the tool brand. Live transcription tools (Otter, Fireflies) save a step; post-meeting transcription from a clean audio recording is more accurate. For workflows involving specific terminology (medical, legal, financial jargon), pass a glossary to the transcription tool ahead of time or post-process with terminology correction.
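The terminology post-process can start as a glossary substitution pass over the transcript before extraction. A minimal sketch; the glossary entries are hypothetical examples of transcription errors, not a real tool's output:

```python
import re

# Known mis-transcriptions -> correct terms. Entries are hypothetical
# examples; build the real glossary from the workflow's actual jargon.
GLOSSARY = {
    "net suite": "NetSuite",
    "a r aging": "AR aging",
}

def correct_terms(transcript: str, glossary: dict[str, str] = GLOSSARY) -> str:
    """Case-insensitive replacement of known transcription errors."""
    for wrong, right in glossary.items():
        transcript = re.sub(re.escape(wrong), right, transcript, flags=re.IGNORECASE)
    return transcript
```

For heavy jargon this eventually wants a proper custom-vocabulary feature in the transcription tool; the substitution pass is the cheap first step.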

  3. Extract the workflow with structured output — steps, inputs, outputs, decision points

    Pass the transcript to the LLM with a schema: list of steps in order, each step with a description, required inputs (data, tools, information), expected outputs (deliverable, state change, downstream trigger), and decision branches (when does the operator choose differently). For each step, ask the model to extract the verbatim quote from the transcript that supports it — the audit trail makes the SOP trustworthy and lets reviewers verify against the source.
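One way to pin the schema down is to embed it in the extraction prompt and validate the parsed response before drafting. The field names below are one reasonable choice, not a fixed standard, and the prompt wording is illustrative:

```python
import json

# Schema the model is asked to fill. Field names are an assumption,
# not a standard; rename to match your SOP skeletons.
STEP_SCHEMA = {
    "steps": [{
        "order": "int",
        "description": "str",
        "inputs": ["data, tools, or information required"],
        "outputs": ["deliverable, state change, or downstream trigger"],
        "decision_branches": [{"condition": "str", "action": "str"}],
        "supporting_quote": "verbatim transcript excerpt",
    }]
}

def build_extraction_prompt(transcript: str) -> str:
    return (
        "Extract the workflow from this interview transcript as JSON "
        f"matching this schema:\n{json.dumps(STEP_SCHEMA, indent=2)}\n"
        "For supporting_quote, copy the exact words from the transcript.\n\n"
        f"Transcript:\n{transcript}"
    )

def validate_steps(raw_json: str) -> list[dict]:
    """Reject extractions that drop the audit-trail quote."""
    steps = json.loads(raw_json)["steps"]
    for step in steps:
        if not step.get("supporting_quote"):
            raise ValueError(f"step {step.get('order')} has no supporting quote")
    return steps
```

The validator enforces the audit trail: any step without a verbatim quote goes back for re-extraction rather than into the draft.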

  4. Run the gap-finding pass — what did the operator skip over?

    The most valuable extraction step is finding what’s missing. Pass the extracted SOP back to the LLM with a different prompt: “what gaps exist in this procedure that a brand-new hire would need? What background knowledge is assumed? What systems / tools are referenced without explanation?” The model surfaces the obvious-to-the-operator-but-not-to-anyone-else gaps. Some of these are real (the operator genuinely forgot to mention something); some are signals to ask a follow-up question in a second interview.
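The gap-finding prompt is worth templating so it stays consistent across SOPs. A sketch; the gap categories and wording are one reasonable starting point, not a canonical prompt:

```python
# Second-pass prompt template. Categories and phrasing are a starting
# point; tune them against the gaps your test executions actually find.
GAP_PROMPT = """You are reviewing a standard operating procedure extracted \
from an interview. Identify gaps a brand-new hire would hit:
1. Assumed background knowledge
2. Systems or tools referenced without explanation
3. Steps where the transition to the next step is unexplained
For each gap, say whether it can be filled from the transcript or needs a \
follow-up interview question, and draft that question.

SOP:
{sop}

Transcript:
{transcript}
"""

def build_gap_prompt(sop: str, transcript: str) -> str:
    return GAP_PROMPT.format(sop=sop, transcript=transcript)
```

Passing the transcript alongside the SOP lets the model separate "the operator said it but extraction missed it" from "nobody has said it yet".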

  5. Draft the SOP — structure for the executor, not the author

    The output document is optimised for someone executing the procedure, not for someone who wrote it. Standard structure: purpose (one sentence), prerequisites (tools, access, prior knowledge), steps (numbered, with inputs and outputs per step), decision branches (visual flowchart for non-linear procedures), edge cases (a few common ones; reference the operator if more arise), success criteria (how do you know the procedure was done correctly). Keep verbatim operator-voice in the steps where it adds clarity; rewrite for clarity where the operator was clearly speaking conversationally.
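Rendering extracted steps into the executor-facing structure is mechanical once the extraction schema is fixed. A sketch assuming step fields from a structured extraction (`description`, `inputs`, `outputs`, `decision_branches`); the section layout mirrors the standard structure above:

```python
def render_sop(title: str, purpose: str, prerequisites: list[str],
               steps: list[dict], success_criteria: list[str]) -> str:
    """Render extracted steps as an executor-facing markdown SOP."""
    lines = [f"# {title}", "", f"**Purpose:** {purpose}", "", "## Prerequisites"]
    lines += [f"- {p}" for p in prerequisites]
    lines += ["", "## Steps"]
    for i, step in enumerate(steps, 1):
        lines.append(f"{i}. {step['description']}")
        for inp in step.get("inputs", []):
            lines.append(f"   - Input: {inp}")
        for out in step.get("outputs", []):
            lines.append(f"   - Output: {out}")
        for branch in step.get("decision_branches", []):
            lines.append(f"   - If {branch['condition']}: {branch['action']}")
    lines += ["", "## Success criteria"]
    lines += [f"- {c}" for c in success_criteria]
    return "\n".join(lines)
```

Keeping the renderer deterministic means operator corrections edit the extracted data, not the formatting, so re-renders after validation are free.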

  6. Validate with the operator and a test execution

    Two-stage validation. First, the operator reviews the extracted SOP and corrects misinterpretations — usually 15–30 minutes for a 45-minute interview’s output. Second, someone other than the operator executes the procedure following the SOP. The test execution catches the gaps the operator-review missed: steps that assume context, jargon that needs definition, decision branches that the SOP didn’t capture. Most SOPs need one revision after the first test execution; mature SOPs are stable after that.

The numbers

What it costs and what to expect

Per-SOP cost — transcription + LLM extraction $1–$5 per SOP including the transcript and extraction passes
Interview duration per workflow 30–60 minutes typically
Total operator time per SOP (interview + review) 1.5–2.5 hours including the validation review
Total documentation time saved per SOP vs operator writing it themselves 3–6 hours typically, because operators rarely complete the writing they commit to
Gap-finding pass effectiveness — gaps identified 3–8 per SOP typically; varies by workflow complexity
Test-execution corrections per SOP 2–5 typically on first run; near-zero by version 2
Time savings during a senior-operator transition 2–4 months of onboarding compression in typical roles
SOP "completeness ceiling" — how much tribal knowledge gets captured 80–90% of the operationally-critical workflow
Time to first SOP 2–3 days for the interview, extraction, and validation cycle
Time to ten SOPs (same workflow family) 1–2 weeks with the reusable interview structure and prompts

The operator’s time investment is small for the institutional-knowledge value created. The transition-time savings is the strategic ROI; the immediate one is just having the SOPs exist at all.

Alternatives

Other ways to solve this

Screen-recording workflow tools (Tango, Scribe, Loom + transcription). Right answer for software-heavy workflows where the operator interacts with specific tools. Tango and Scribe capture click-by-click; the AI generates the step descriptions automatically. For pure-software workflows, this beats interview extraction every time. Complement: use interview extraction for the judgement layer (“when do you choose A vs B?”) and screen capture for the mechanical layer (“click here, then here, then here”).

Pair-write with the operator. A documentation specialist or junior operator sits with the senior operator and writes the SOP collaboratively. Slow, high-quality, builds shared understanding. Works at small scale; doesn’t scale across many SOPs. Useful for the most critical workflows where the highest fidelity is worth the time.

Operator writes the SOP themselves (the default, often the failure). The cheapest path on paper, the most common in practice, the most likely to produce no SOP at all. Senior operators are busy; SOP-writing is low-leverage from their perspective; the work slips. Most teams that have tried this end up at the AI-extraction approach because the SOPs the operator was supposed to write never materialised.

Don’t document — accept the institutional-knowledge risk. Honest answer for some very-small teams. Becomes increasingly indefensible as headcount grows past 20 and the cost of any operator transition becomes meaningful. The AI extraction pipeline lowers the threshold where documentation is worth doing; for many ops workflows, “worth doing” arrives sooner than teams had assumed.

What's next

Related work

For the underlying meeting-transcription tooling, see Voice transcription for sales calls and customer interviews. For the comparison of transcription services that often handle the interview audio, see Whisper API vs Deepgram vs AssemblyAI. For making the resulting SOPs searchable by the team that needs them, see Internal Q&A bot over company docs. For the onboarding-document pattern that builds on SOP extraction, see Onboarding documentation generation for new hires.

Common questions

FAQ

What about workflows that are mostly judgement, not steps?

Judgement-heavy workflows (customer escalation, deal-quality assessment, hiring decisions) extract differently — the SOP captures decision frameworks and reference patterns rather than linear steps. Interview around case studies: "walk me through three escalations from last quarter — what was different about each?" The output is a decision-framework document with worked examples rather than a numbered procedure. The pipeline pattern is the same; the SOP shape changes.

How do we handle confidential information that surfaces in the interview?

Two layers. (1) Brief the operator before the interview that the transcript will go through an LLM; they should redact specific client names or sensitive numbers verbally during the session if needed. (2) Run a redaction pass on the transcript before extraction — names, account numbers, dollar amounts above a threshold. The redacted transcript still produces a usable SOP for procedure-level work; sensitive-content workflows may need a fully on-prem extraction pipeline. See AI privacy — what to watch for for the broader privacy framework.
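The redaction pass can start as simple pattern substitution before anything reaches the LLM. A sketch; reliable name redaction needs an NER model or a maintained known-names list, so names are handled here with an explicit list:

```python
import re

def redact(transcript: str, names: list[str],
           dollar_threshold: int = 10_000) -> str:
    """Mask emails, known names, and dollar amounts above a threshold.
    A first pass only; it will not catch names absent from the list."""
    # Emails first, so a client name inside an address is not half-masked.
    transcript = re.sub(r"[\w.+-]+@[\w-]+\.[\w.]+", "[EMAIL]", transcript)
    for name in names:
        transcript = re.sub(re.escape(name), "[CLIENT]", transcript,
                            flags=re.IGNORECASE)

    def mask_amount(m: re.Match) -> str:
        value = int(m.group(1).replace(",", ""))
        return "[AMOUNT]" if value >= dollar_threshold else m.group(0)

    return re.sub(r"\$([\d,]+)", mask_amount, transcript)
```

The threshold keeps procedure-relevant small figures ("the $50 filing fee") readable while masking the sensitive ones.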

How often do SOPs need updating?

Annually at minimum; quarterly for fast-changing workflows. Each major tool change or process redesign triggers an off-cadence refresh. The pipeline accelerates updates — interview the operator again on what's changed, re-extract, version. The SOP becomes a living document rather than a one-time artifact; teams that treat SOPs as static find them stale within 18 months.

What about cross-functional workflows that involve multiple operators?

Interview each operator separately about their part of the workflow, then combine the extractions into a unified SOP with clear ownership per step. The combination step is where conflicts surface — one operator's understanding of a handoff may not match the next operator's expectations. The extracted SOP becomes the basis for resolving these gaps, not just for documenting them.
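Combining per-operator extractions is where handoff mismatches surface mechanically: if one operator's listed outputs never appear in the next operator's inputs, the handoff is undocumented. A sketch assuming the step fields from a structured extraction and steps supplied in execution order:

```python
def merge_extractions(parts: list[tuple[str, list[dict]]]) -> tuple[list[dict], list[str]]:
    """Concatenate per-operator steps (assumed already in execution order),
    tag each with its owner, and flag handoffs with no output/input overlap."""
    merged, conflicts = [], []
    for owner, steps in parts:
        for step in steps:
            merged.append({**step, "owner": owner})
    for prev, nxt in zip(merged, merged[1:]):
        if prev["owner"] != nxt["owner"]:
            handoff = set(prev.get("outputs", [])) & set(nxt.get("inputs", []))
            if not handoff:
                conflicts.append(
                    f"handoff {prev['owner']} -> {nxt['owner']}: "
                    "no matching output/input"
                )
    return merged, conflicts
```

Each flagged handoff becomes a follow-up question for both operators rather than a silent gap in the unified SOP.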

Can we use this to onboard a replacement when an operator leaves?

Yes, and it's one of the highest-value use cases. Conduct the interviews during the notice period (not the last week — earlier, when the operator has time to participate well). Extract the SOPs and validate them with the operator before departure. The replacement onboards against the SOPs with the original operator as occasional reference for ambiguous cases. The interview-extraction pattern compresses onboarding by months in high-tribal-knowledge roles.

What if the operator is bad at explaining their own work?

Common. The pre-written interview structure helps; scenario-driven questions extract better content than open-ended ones. For operators who genuinely struggle, two patterns work: (1) shadow them through a real instance of the workflow while recording (in-context narration beats abstract explanation); (2) interview their peers or direct reports about what the operator does — sometimes others can articulate the workflow better than the operator can articulate it themselves. The goal is the extracted SOP; multiple inputs are fine.

Sources & references

Change history (1 entry)
  • 2026-05-13 Initial publication.