Our AI principles
- Useful by design. An AI system that does not measurably improve a decision is not deployed.
- Context over cleverness. Retrieval and tool calls beat raw model size for the questions we answer.
- Bounded autonomy. Models are scoped to specific tasks with explicit inputs and outputs. They do not act on customer systems.
- Human accountability. A named human reviews the system's behaviour, approves releases and owns escalations.
- Transparent recommendations. Every recommendation in a report is grounded in either the customer's own answers or a retrieved passage from our knowledge base.
- Regionally appropriate. Models, prompts and evaluations are tuned for the UAE and GCC context, not silently imported from elsewhere.
Where AI is used in the product
| System | What it does | Risk class |
|---|---|---|
| Assessment scoring | Deterministic rule-based scoring across data, compliance, sponsorship and integration readiness. | Low — no LLM call |
| Deep-analysis report | LLM with RAG over UAE-specific knowledge; produces the 6–10 page advisory PDF. | Medium — advisory only, human follow-up expected |
| Vendor & cost shortlist | LLM calls into Model Context Protocol servers that return curated vendor lists and cost bands. | Medium — sources cited inline |
| Glossary & guide generation | Editorial AI used during content authoring; reviewed by a human before publication. | Low — human-edited output |
| Chat / support replies | Not currently powered by an LLM. Any future deployment will be flagged here first. | N/A |
Model selection
We treat model choice as an engineering decision, not a brand decision. Models enter production only after passing an evaluation set built from real assessment shapes and known-hard prompts. The current production stack is:
- Primary: Claude Sonnet 4.6 — best balance of reasoning quality, instruction-following and cost-per-page for our advisory format.
- Fallback: GPT-4o — used when the primary is unavailable or rate-limited; same prompt contract.
- In-region: Anthropic models on Amazon Bedrock or GPT-class models on Azure UAE North for customers in Mode B (see data residency).
We do not keep any customer pinned to a model that its provider no longer supports. When a provider deprecates or materially changes a model, we migrate, re-run our evaluation suite and record the change in the report metadata so customers can see which model produced each report.
RAG & the knowledge base
The deep-analysis report is generated with Retrieval-Augmented Generation (RAG). The model does not answer from its parametric memory alone; it is given passages retrieved from a curated UAE-specific knowledge base covering PDPL, sector-specific regulation, vendor benchmarks and case studies. Grounding the model in retrieved passages reduces the risk of hallucination and keeps recommendations tied to regionally accurate sources.
Knowledge base hygiene
- Every passage carries a source, an effective date and a reviewer name.
- Outdated passages are demoted from retrieval rather than silently rewritten.
- Customer data never enters the shared knowledge base.
- Quarterly review of high-traffic topics; immediate review when a relevant law or regulation changes.
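The hygiene rules above suggest a passage record that carries its own provenance and a demotion flag. The following is a minimal sketch under assumed names (`Passage`, `demote`, `retrievable` and the sample IDs are hypothetical); it illustrates demoting an outdated passage without altering its text.

```python
from dataclasses import dataclass, replace
from datetime import date


@dataclass(frozen=True)
class Passage:
    passage_id: str
    text: str
    source: str
    effective_date: date
    reviewer: str
    demoted: bool = False  # outdated passages are flagged, not rewritten


def demote(passage: Passage) -> Passage:
    """Return a demoted copy; the original text and provenance are preserved."""
    return replace(passage, demoted=True)


def retrievable(passages: list[Passage]) -> list[Passage]:
    """Return only the passages that remain eligible for retrieval."""
    return [p for p in passages if not p.demoted]


# Placeholder knowledge-base entries for illustration only.
kb = [
    Passage("kb-001", "Example passage text.", "Example source", date(2022, 1, 2), "Reviewer A"),
    Passage("kb-002", "Superseded passage text.", "Example source", date(2020, 6, 1), "Reviewer B"),
]
kb[1] = demote(kb[1])
eligible_ids = [p.passage_id for p in retrievable(kb)]
```

Keeping the record immutable means a demotion leaves the passage text and reviewer untouched, so the history of what was once retrievable stays intact.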
Prompt & output safety
- Prompts are templated; user inputs are slotted into well-defined variables and stripped of control sequences.
- Output is constrained to a structured schema (sections, bullets, citations) so it can be validated before render.
- Refusal handling: when the model declines or returns an off-policy response, the request is logged and surfaced to a human reviewer rather than silently retried.
- Profanity, harassment and prompts attempting to extract another customer's data are blocked at the gateway.
- A PII redaction layer strips obvious personal identifiers from prompts before they leave the trust boundary in Mode A.
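The first two safeguards in this list can be illustrated together. The sketch below assumes a hypothetical two-slot template and assumes the model returns JSON with `sections`, `bullets` and `citations` lists; the actual prompt contract and schema are not shown in this document. It strips ANSI escape sequences and other control characters from user input before slotting, then validates the output shape before anything is rendered.

```python
import json
import re
from string import Template

# Hypothetical template; the real prompt contract is an assumption here.
REPORT_PROMPT = Template("Sector: $sector\nCustomer answer: $answer")

# ANSI escape sequences such as "\x1b[31m".
_ANSI_ESCAPES = re.compile(r"\x1b\[[0-9;]*[A-Za-z]")
# Remaining control characters, keeping tab and newline.
_CONTROL_CHARS = re.compile(r"[\x00-\x08\x0b-\x1f\x7f]")

REQUIRED_KEYS = {"sections", "bullets", "citations"}


def sanitise(value: str) -> str:
    """Strip control sequences from a user-supplied value."""
    return _CONTROL_CHARS.sub("", _ANSI_ESCAPES.sub("", value))


def build_prompt(sector: str, answer: str) -> str:
    """Slot sanitised user input into the fixed template variables."""
    return REPORT_PROMPT.substitute(sector=sanitise(sector), answer=sanitise(answer))


def validate_output(raw: str) -> dict:
    """Parse model output and check its structure before render."""
    data = json.loads(raw)
    if not isinstance(data, dict):
        raise ValueError("output must be a JSON object")
    missing = REQUIRED_KEYS - data.keys()
    if missing:
        raise ValueError(f"missing keys: {sorted(missing)}")
    for key in REQUIRED_KEYS:
        if not isinstance(data[key], list):
            raise ValueError(f"{key} must be a list")
    return data


prompt = build_prompt("Health\x1b[31mcare", "We store records\x00 on-premises.")
parsed = validate_output('{"sections": ["Summary"], "bullets": [], "citations": ["kb-001"]}')
```

Because `Template.substitute` performs a single pass, a `$` inside user input is inserted literally and is not re-interpreted as a template variable.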
Evaluation & accuracy
We maintain an evaluation suite of representative assessment inputs and known-good report shapes. Every model change, prompt change and retrieval change is run against this suite before it reaches production.
- Quality rubric covering factual accuracy, regional relevance, recommendation alignment and tone.
- Side-by-side human grading on a random sample of production reports each month.
- Regression check on a sealed set of customer-anonymised reports to detect drift over time.
- Public commitment to publishing a yearly summary of accuracy metrics and known limitations.
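A pre-release gate over the rubric above might look like the following. This is a sketch under stated assumptions: scores are taken to be floats between 0 and 1 per rubric dimension, the `tolerance` value is arbitrary, and the function names and sample scores are hypothetical rather than drawn from the real suite.

```python
from statistics import mean

# Rubric dimensions named in the text above.
RUBRIC = ("factual_accuracy", "regional_relevance", "recommendation_alignment", "tone")


def rubric_means(graded: list[dict[str, float]]) -> dict[str, float]:
    """Average each rubric dimension across all graded evaluation cases."""
    return {dim: mean(case[dim] for case in graded) for dim in RUBRIC}


def regressions(
    baseline: list[dict[str, float]],
    candidate: list[dict[str, float]],
    tolerance: float = 0.02,  # assumed threshold, for illustration only
) -> list[str]:
    """List the dimensions where the candidate falls below baseline by more than tolerance."""
    base, cand = rubric_means(baseline), rubric_means(candidate)
    return [dim for dim in RUBRIC if cand[dim] < base[dim] - tolerance]


# Made-up scores for two evaluation cases under each configuration.
baseline_scores = [
    {"factual_accuracy": 0.9, "regional_relevance": 0.8, "recommendation_alignment": 0.8, "tone": 0.9},
    {"factual_accuracy": 0.9, "regional_relevance": 0.8, "recommendation_alignment": 0.8, "tone": 0.9},
]
candidate_scores = [
    {"factual_accuracy": 0.7, "regional_relevance": 0.8, "recommendation_alignment": 0.8, "tone": 0.9},
    {"factual_accuracy": 0.9, "regional_relevance": 0.9, "recommendation_alignment": 0.8, "tone": 0.9},
]

failed = regressions(baseline_scores, candidate_scores)
release_allowed = not failed
```

In this made-up example the candidate regresses on factual accuracy, so `release_allowed` is `False` and the change would not reach production.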
Human oversight
- No AI output is auto-executed against a customer system; every action is reviewed by a Scinops staff member or by the customer.
- A named report editor reviews a rolling sample of generated reports each week.
- Customers can flag any sentence in their report and request a human re-review at no additional cost.
- Material recommendations (e.g. specific vendor shortlists for regulated workloads) are validated against our internal vendor register before they enter retrieval.
Customer data & training
Aggregated, fully anonymised statistics (for example, how many assessments selected a given sector) may be used internally to improve the product. We never reveal information that could identify any single customer or individual.
Auditability
- Every report records the model and version that produced it, the prompt template version and the retrieved passage IDs.
- A read-only audit trail records admin actions affecting prompts, models or the knowledge base.
- On request, enterprise customers can receive a JSON export of all AI-related metadata for their tenant.
- Model and prompt changes are versioned in source control with code review.
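The per-report metadata and the JSON export described above could be modelled roughly as follows. The field names, the `ReportAuditRecord` type and the sample values are assumptions made for illustration; the actual export format is not specified in this document.

```python
import json
from dataclasses import asdict, dataclass


@dataclass(frozen=True)
class ReportAuditRecord:
    report_id: str
    model: str
    model_version: str
    prompt_template_version: str
    retrieved_passage_ids: tuple[str, ...]


def export_tenant_metadata(records: list[ReportAuditRecord]) -> str:
    """Serialise a tenant's AI-related report metadata as a JSON array."""
    return json.dumps([asdict(r) for r in records], indent=2, sort_keys=True)


# Placeholder values; real model names and versions would come from production.
records = [
    ReportAuditRecord(
        report_id="rpt-0001",
        model="example-model",
        model_version="2025-01",
        prompt_template_version="v12",
        retrieved_passage_ids=("kb-001", "kb-007"),
    )
]
exported = export_tenant_metadata(records)
```

Sorting keys keeps successive exports diff-friendly, which can help when a customer compares metadata across reports.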
Regulatory & framework alignment
We track and align our practices with the frameworks most relevant to our UAE / GCC customers:
- UAE Charter for the Development & Use of Artificial Intelligence — human dignity, inclusion, fairness, transparency and accountability principles applied across the product lifecycle.
- UAE National Strategy for Artificial Intelligence 2031 — design choices favour responsible deployment in regulated sectors.
- UAE PDPL (Federal Decree-Law No. 45 of 2021), including the right (Article 18) to object to fully automated decisions; our reports are advisory and do not constitute a fully automated decision.
- ISO/IEC 42001:2023 AI management system controls — used as our internal AI risk-management backbone.
- NIST AI Risk Management Framework (AI RMF 1.0) — informs our Govern, Map, Measure, Manage activities.
AI-specific incidents
An AI incident is any event in which a model output materially misleads a customer or violates one of the principles above. We treat AI incidents on the same severity ladder as security incidents:
- Detection: weekly review, customer flags, automated quality checks against the evaluation suite.
- Containment: route around the affected model, prompt or knowledge passage within one hour of detection.
- Notification: affected customers contacted within two business days, with the offending report identified and an offer to regenerate or refund.
- Post-mortem: blameless review with documented corrective actions; published internally and shared with affected customers on request.
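The two-business-day notification window above can be turned into a concrete deadline. The sketch below assumes a Saturday and Sunday weekend and ignores public holidays; both are simplifications, and the function name is hypothetical.

```python
from datetime import date, timedelta


def add_business_days(
    start: date,
    days: int,
    weekend: frozenset[int] = frozenset({5, 6}),  # assumed: Saturday=5, Sunday=6
) -> date:
    """Return the date that falls `days` business days after `start`.

    Public holidays are not considered in this sketch.
    """
    current = start
    remaining = days
    while remaining > 0:
        current += timedelta(days=1)
        if current.weekday() not in weekend:
            remaining -= 1
    return current


# An incident detected on Thursday 2 January 2025.
notification_deadline = add_business_days(date(2025, 1, 2), 2)
```

For a Thursday detection the deadline lands on the following Monday, 6 January 2025, because the intervening Saturday and Sunday are skipped.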