Most non-tech companies buy AI tools the way they buy other software: demo, references, security questionnaire, contract. The deal usually closes on the demo’s quality, which has roughly no predictive value for how the AI will hold up in production. The questions that actually predict outcomes — what does the AI fail at, what data flows where, how is the model updated, what happens to your pricing if your usage grows — get skipped because the buyer doesn’t know to ask.
Below: a 20-question checklist for non-technical buyers. Each question is operational, answerable in plain language, and produces a decision-relevant input. The framework doesn’t need a CTO in the room; it does need the discipline to ask the questions and not accept “trust the vendor” as the answer.
What the AI actually does (and doesn't)
- What is the specific workflow this AI is replacing or accelerating? Vague answers (“AI for productivity”) predict vague outcomes. Specific answers (“AI that classifies and routes customer-support tickets”) predict implementable deployments.
- What does accuracy look like in production on tasks like ours? Vendors will quote demo numbers. Ask for production data: actual customer-specific results, not benchmarks. Healthy vendors share this; vendors that deflect are telling you something.
- What are the failure modes? Every AI fails sometimes. Vendors that can’t articulate their failure modes either don’t understand them or don’t want to share them. Either is a red flag.
- How does the AI handle edge cases or unusual inputs? Production work is mostly the long tail. The demo shows the common 60%; the actual work includes the difficult 40%. Ask specifically about your domain’s edge cases.
- How do we know when the AI is wrong? The mechanism for catching errors matters as much as the error rate. Ask which of these the product offers: confidence scores, validation rules, audit logs, human-in-the-loop gates (see the sketch after this list).
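If you want a picture of what a human-in-the-loop gate looks like in practice, here is a minimal sketch. Everything in it is hypothetical: the vendor call is a stub and the 0.85 threshold is a placeholder. It shows the shape of the mechanism, not any particular product’s API.

```python
from dataclasses import dataclass

# Illustrative only: the vendor call is stubbed out and every name here is hypothetical.

@dataclass
class Prediction:
    label: str
    confidence: float  # 0.0-1.0, as many vendors report it

CONFIDENCE_THRESHOLD = 0.85  # placeholder; tune against your own error tolerance

def classify_ticket(text: str) -> Prediction:
    """Stand-in for the vendor's classification call."""
    return Prediction(label="billing", confidence=0.62)

def handle_ticket(text: str) -> str:
    prediction = classify_ticket(text)
    # Audit trail: record every decision, whether or not it was accepted.
    print(f"AUDIT: {prediction.label} @ {prediction.confidence:.2f} for {text[:40]!r}")
    if prediction.confidence >= CONFIDENCE_THRESHOLD:
        return f"routed automatically to the {prediction.label} queue"
    # Low confidence: the AI's answer goes to a person, not straight to the customer.
    return "escalated to human review"

if __name__ == "__main__":
    print(handle_ticket("I was charged twice for my March invoice."))
```

The point of asking the question is to learn whether the vendor offers a gate like this at all, and whether the threshold and the audit trail are under your control or theirs.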
What flows where, and what happens to it
- What data do we send to the AI vendor? Sometimes obvious; sometimes not (the AI may receive more context than is immediately visible).
- Where does the data physically reside? US, EU, regional. Affects regulatory exposure.
- Is our data used to train the vendor’s models? Default for free tiers: often yes. Enterprise tiers: usually no, but verify in writing.
- What are the data retention and deletion policies? How long does the vendor keep our data, and what happens when we churn? “Deleted within 30 days” is materially different from “retained indefinitely for service improvement.”
- What compliance certifications does the vendor hold? SOC 2 Type II, ISO 27001, HIPAA (if relevant), GDPR-readiness. Required for regulated industries; useful evidence for others.
How it fits with existing systems
- What’s the realistic integration effort? Plug-and-play, low-code, or a full engineering project? Vendors overstate ease; ask for the engineering hours of a representative recent customer integration.
- What systems does it integrate with natively? Your CRM, helpdesk, doc store, identity provider. Lack of native integration means custom integration work.
- What’s the API stability commitment? Will the vendor break our integration when they update? Healthy vendors version their APIs and deprecate slowly (see the sketch after this list).
- What happens if we need to migrate off? Data export formats, transition support, contractual obligations. The exit path matters even if you don’t expect to use it.
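On the API-stability question, one concrete thing to look for is whether the vendor versions its API (in the URL or in headers) and how long old versions stay supported once a new one ships. A minimal sketch of what a version-pinned integration looks like, using an entirely made-up vendor URL and request shape:

```python
import requests  # assumes a REST-style vendor; the URL and fields below are made up

API_BASE = "https://api.example-ai-vendor.com/v1"  # the "v1" pin is the point

def classify(text: str, api_key: str) -> dict:
    # Because the call is pinned to v1, the vendor can ship v2 without breaking it.
    # The question to ask: how long does v1 stay supported after v2 appears?
    response = requests.post(
        f"{API_BASE}/classify",
        headers={"Authorization": f"Bearer {api_key}"},
        json={"text": text},
        timeout=10,
    )
    response.raise_for_status()
    return response.json()
```

If the vendor’s answer amounts to “we update the endpoint in place,” budget for integration maintenance you did not plan for.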
What changes after the demo
- What’s the actual pricing at our expected usage? Demo pricing often differs from real-volume pricing. Ask for a realistic-volume cost projection (see the cost sketch after this list).
- How are pricing increases handled? Annual increases, mid-term adjustments, surprise charges. Standard SaaS contracts handle this; AI-specific contracts sometimes have usage-based surprises.
- What’s the indemnification posture? IP, accuracy, output liability. Major vendors offer limited indemnification on enterprise plans; consumer plans typically don’t.
- What’s the termination flexibility? A 12-month auto-renew with 60-day notice is common; mid-term termination is usually costly. Negotiate based on your confidence in the vendor.
- What’s the support and SLA commitment? Response times, availability guarantees, dedicated contacts. Production AI workloads benefit from real SLAs.
- What’s the vendor’s financial health? Smaller AI vendors face the same uncertainty most early-stage companies do. Established vendors with revenue carry less risk than venture-funded startups burning cash.
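To make the realistic-volume projection concrete, here is a back-of-the-envelope sketch with entirely hypothetical numbers; swap in the seat price, included volume, and overage rate from the vendor’s actual price sheet.

```python
# Hypothetical numbers throughout; replace with the vendor's actual price sheet.
SEATS = 25
PRICE_PER_SEAT_MONTHLY = 30.00        # list price per seat
INCLUDED_REQUESTS_PER_SEAT = 1_000    # volume bundled into the seat price
OVERAGE_PRICE_PER_REQUEST = 0.02      # usage-based charge above the bundle

expected_requests_per_seat = 2_500    # your estimate of real usage, not the demo's

overage = max(0, expected_requests_per_seat - INCLUDED_REQUESTS_PER_SEAT)
monthly = SEATS * (PRICE_PER_SEAT_MONTHLY + overage * OVERAGE_PRICE_PER_REQUEST)
demo_monthly = SEATS * PRICE_PER_SEAT_MONTHLY

print(f"Monthly at expected volume: ${monthly:,.2f}")       # $1,500.00 here
print(f"Annual at expected volume:  ${monthly * 12:,.2f}")  # $18,000.00 here
print(f"Monthly at the demo's assumed volume: ${demo_monthly:,.2f}")  # $750.00 here
```

In this made-up example the real bill is double the quoted seat price; the exact multiple matters less than whether the vendor will put the projection in writing.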
What due diligence actually costs and saves
The checklist’s value is in preventing six-figure mistakes; the time investment is modest relative to the cost of getting procurement wrong.
Related work
For the broader risk framework, see AI risk assessment for legal and compliance teams. For why programs fail without rigorous setup, see Why most “AI strategies” fail in the first 90 days. For the privacy-specific evaluation, see AI privacy — what to watch for. For the vendor-lock-in considerations that connect to procurement, see Vendor lock-in risks with AI.
FAQ
Do we need a CTO to run this checklist?
No, that's the point. Each question is answerable in plain language. Bring in a technical reviewer for the integration questions if you have one; many SMBs don't, and the checklist works without one. A trusted external advisor (fractional CTO, peer at another company) can fill in for the technical questions when needed.
What if the vendor won't answer some questions?
That's the answer. Vendors that deflect on data handling, accuracy, or contractual terms are signalling something. Some deflection is reasonable (early-stage vendors may not have formal answers); refusal to engage is a red flag.
Should we run all 20 questions for every AI vendor?
Yes, scaled to the deployment size. For a $20/month seat subscription, you can compress the conversation; for a $50,000/year enterprise contract, run the full checklist with documentation. The framework scales; the discipline of asking the questions is what matters.
What about open-source AI tools we're considering?
Most questions still apply; the answers shift. Data-handling becomes about your own infrastructure (you're the operator); accuracy and failure modes are still relevant; integration and exit considerations change shape. The same evaluation framework, applied to a different ownership model.