Rule Authoring Guide

A comprehensive guide to writing compliance rules (ontologies) for the aare.ai /verify API.

1. Quick Start

What is an Ontology?

An ontology is a JSON file that defines the rules for verifying LLM outputs. It contains:

Constraints: Logical rules that LLM outputs must satisfy
Extractors: Patterns to extract structured data from unstructured text
Variables: Data types used in constraints

Your First Ontology

{
  "name": "my-first-ontology",
  "version": "1.0.0",
  "description": "A simple ontology to verify loan amounts",
  "constraints": [
    {
      "id": "MAX_LOAN",
      "category": "Lending",
      "description": "Maximum loan amount is $100,000",
      "formula_readable": "loan_amount <= 100000",
      "formula": {"<=": ["loan_amount", 100000]},
      "variables": [
        {"name": "loan_amount", "type": "int"}
      ],
      "error_message": "Loan amount exceeds $100,000 maximum",
      "citation": "Internal Policy"
    }
  ],
  "extractors": {
    "loan_amount": {
      "type": "int",
      "pattern": "\\$([\\d,]+)"
    }
  }
}

Test It

# Using curl
curl -X POST https://api.aare.ai/verify \
  -H "Content-Type: application/json" \
  -d '{"llm_output": "Approved loan for $75,000", "ontology": "my-first-ontology"}'

2. Core Concepts

How Verification Works

INPUT

LLM Output

→

1. EXTRACT

LLMParser

→

2. COMPILE

FormulaCompiler

→

3. VERIFY
Z3 Prover

→

OUTPUT

Result + Proof

The Three Components

Component	Purpose	Example
Constraints	Define what must be true	`dti <= 43`
Variables	Declare data types	`{"name": "dti", "type": "real"}`
Extractors	Pull values from text	Regex pattern to find DTI

3. Ontology Structure

Required Fields

{
  "name": "ontology-name",
  "version": "1.0.0",
  "description": "What this ontology verifies",
  "constraints": [...],
  "extractors": {...}
}

4. Writing Constraints

Anatomy of a Constraint

{
  "id": "UNIQUE_IDENTIFIER",
  "category": "Category Name",
  "description": "Human-readable explanation",
  "formula_readable": "x <= 100",
  "formula": {"<=": ["x", 100]},
  "variables": [
    {"name": "x", "type": "int"}
  ],
  "error_message": "Shown when constraint fails",
  "citation": "Regulation reference"
}

Variable Types

Type	Z3 Type	Use For	Example Values
`bool`	Bool	True/false flags	`true`, `false`
`int`	Int	Whole numbers	`600`, `100000`
`real`	Real	Decimal numbers	`43.5`, `0.05`

Simple Constraints

Numeric comparison

{
  "id": "MIN_CREDIT_SCORE",
  "formula": {">=": ["credit_score", 600]},
  "variables": [{"name": "credit_score", "type": "int"}],
  "error_message": "Credit score below 600 minimum"
}

Boolean check

{
  "id": "EMPLOYMENT_REQUIRED",
  "formula": {"==": ["employment_verified", true]},
  "variables": [{"name": "employment_verified", "type": "bool"}],
  "error_message": "Employment verification required"
}

Compound Constraints

AND - all conditions must be true

{
  "formula": {
    "and": [
      {">=": ["credit_score", 600]},
      {"<=": ["dti", 43]}
    ]
  }
}

OR - at least one condition must be true

{
  "formula": {
    "or": [
      {"<=": ["dti", 43]},
      {">=": ["compensating_factors", 2]}
    ]
  }
}

IMPLIES - if A then B

{
  "id": "DENIAL_REASON_REQUIRED",
  "description": "If denied, must provide specific reason",
  "formula": {
    "implies": [
      {"==": ["is_denial", true]},
      {"==": ["has_specific_reason", true]}
    ]
  },
  "variables": [
    {"name": "is_denial", "type": "bool"},
    {"name": "has_specific_reason", "type": "bool"}
  ]
}

5. Formula Syntax Reference

Operators

Category	Operators	Example
Logical	`and`, `or`, `not`, `implies`, `ite`	`{"and": [A, B]}`
Comparison	`==`, `!=`, `<`, `<=`, `>`, `>=`	`{"<=": ["x", 100]}`
Arithmetic	`+`, `-`, `*`, `/`, `min`, `max`	`{"+": ["a", "b"]}`
Constants	`true`, `false`, numbers	`{"const": true}`

6. Writing Extractors

Extractors pull structured data from unstructured LLM output text.

Extractor Types

Type	Purpose	Key Options
`int`	Integer extraction	`pattern` (regex)
`float`	Decimal extraction	`pattern` (regex)
`money`	Currency with k/m/b	`pattern` (regex)
`boolean`	True/false from keywords	`keywords`, `negation_words`
`enum`	Predefined choices	`choices`
`computed`	Derived from other fields	`formula`

Numeric Extractors

Integer extraction

"credit_score": {
  "type": "int",
  "pattern": "(?:credit\\s*(?:score)?|fico|score)\\s*(?:is|of|:)?\\s*(\\d{3})"
}

Float extraction

"dti": {
  "type": "float",
  "pattern": "(?:debt-to-income|dti)(?:\\s*ratio)?\\s*(?:is|of|:)?\\s*([\\d.]+)\\s*%?"
}

Boolean Extractors

Keyword-based with negation

"has_documentation": {
  "type": "boolean",
  "keywords": ["documentation", "paperwork", "documents"],
  "negation_words": ["no paperwork", "no documentation"],
  "check_negation": true
}

Enum Extractors

"decision": {
  "type": "enum",
  "choices": {
    "approved": ["approved", "accepted", "granted"],
    "denied": ["denied", "rejected", "declined"],
    "pending": ["pending", "under review"]
  },
  "default": "pending"
}

Tip: Use capture groups () in your regex patterns. The first capture group is extracted as the value.

7. Testing Your Rules

HTTP API Testing

curl -X POST http://localhost:8080/verify \
  -H "Content-Type: application/json" \
  -d '{
    "llm_output": "Approved for $75,000. Credit score: 720, DTI: 35%",
    "ontology": "my-ontology"
  }'

Understanding Results

Successful verification

{
  "verified": true,
  "violations": [],
  "parsed_data": {
    "loan_amount": 75000,
    "credit_score": 720,
    "dti": 35
  },
  "proof": {
    "method": "Z3 SMT Solver",
    "version": "4.12.1"
  },
  "verification_id": "a3d8c1f2-5b7e-4a9d-8c6f-1e2b3a4d5c6e",
  "execution_time_ms": 12
}

Failed verification

{
  "verified": false,
  "violations": [
    {
      "constraint_id": "MAX_DTI",
      "error_message": "DTI exceeds 43% maximum",
      "formula": {"<=": ["dti", 43]},
      "citation": "12 CFR § 1026.43(c)"
    }
  ],
  "parsed_data": {
    "dti": 48
  }
}

8. Best Practices

Naming Conventions

Ontology name: domain-purpose-v1 (e.g., fair-lending-v1)
Constraint IDs: CATEGORY_DESCRIPTION (e.g., ATR_QM_DTI)
Variable names: snake_case (e.g., credit_score)

Constraint Design

One rule per constraint - Keep constraints atomic
Clear error messages - User should understand what failed
Include citations - Reference regulations or policies
Use categories - Group related constraints

Extractor Design

Test with variations - LLMs phrase things differently
Handle common formats - "Credit score: 720", "FICO 720", "score of 720"
Use negation carefully - Only when needed for boolean accuracy

9. Complete Examples

See full production-ready examples on GitHub:

fair-lending-v1.json - Fair lending compliance (5 constraints)
mortgage-compliance-v1.json - Mortgage compliance (8 constraints)
hipaa-v1.json - HIPAA compliance (52 constraints)

10. Troubleshooting

Common Issues

"Variable not found in input"

Problem: Extractor couldn't find the value in the text.

Solutions:

Check your regex pattern matches the actual text
Add more keyword variations
Test the pattern against sample text

"Formula validation error"

Check:

Operators are spelled correctly
Arrays have correct number of elements (binary ops need 2)
Variable names match between formula and variables array

"Unknown variable: X"

Fix: Add the variable to the variables array in your constraint.

Debugging Tips

Use the API's parsed_data - Shows exactly what was extracted
Test incrementally - Add one constraint at a time
Check case sensitivity - Patterns are case-insensitive by default
Verify regex groups - First capture group () is extracted

Additional Resources

aare-core on GitHub - Full source code and documentation
Z3 Documentation - Z3 theorem prover guide
Contact Support - Get help with custom ontologies