AI analysis
The Edit with AI tab runs a full quality pipeline across every test case in your suite. It flags structure problems, duplicates, coverage gaps, and contradictions, assigns risk levels, and proposes AI-generated fixes you can review and apply.
Your admin must enable this feature in Test Case Settings.
Open Edit with AI
From the results view, click Edit with AI.

Run the analysis
Click Run Full Analysis to run the full pipeline. Progress updates in real time while test cases (TCs) are processed in batches.
After the first run, results are cached and restored when you return. To refresh after edits, click Re-run Analysis.
Analysis time depends on suite size. Expect 30–120 seconds for 50+ test cases.
Result tabs
The pipeline populates these tabs:

Quality Score (0–100)
Structure Score: percent of well‑formed test cases.
Accuracy (AI Trust): mean AI confidence across measured test cases.
Accuracy is weighted:
Faithfulness 60%
Answer relevancy 40%
If confidence data is missing, Accuracy defaults to 70/100 and shows as unknown. The label next to Accuracy (for example, “60% measured”) shows how many TCs were scored.
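The weighting and fallback described above can be sketched as follows. The function name and inputs are illustrative, not the product's API; only the 60/40 split and the 70/100 default come from this documentation.

```python
def accuracy_score(faithfulness, answer_relevancy):
    """Weighted accuracy: faithfulness 60%, answer relevancy 40%.

    Either input may be None when confidence data is missing;
    in that case, fall back to the documented default of 70/100.
    """
    if faithfulness is None or answer_relevancy is None:
        return 70  # documented fallback when confidence data is missing
    return round(0.6 * faithfulness + 0.4 * answer_relevancy)
```

For example, a TC measured at faithfulness 80 and answer relevancy 50 scores 68.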
AI Defect Rate
Percent of TCs with confidence below the defect threshold. Your admin configures the threshold. Default is 50%.
<10%: green
10–25%: amber
>25%: red
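The rate and its traffic-light bands can be sketched like this; the function names and the list-of-confidences input are assumptions for illustration, while the default threshold of 50 and the band cut-offs are taken from the documentation above.

```python
def defect_rate(confidences, threshold=50):
    """Percent of TCs whose confidence falls below the defect threshold."""
    if not confidences:
        return 0.0
    low = sum(1 for c in confidences if c < threshold)
    return 100.0 * low / len(confidences)


def rate_band(rate):
    """Map a defect rate to the documented traffic-light bands."""
    if rate < 10:
        return "green"
    if rate <= 25:
        return "amber"
    return "red"
```

A suite with confidences [40, 60, 80, 30] at the default threshold has a 50% defect rate, which lands in the red band.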

Suite snapshot counters
Total TCs: total test cases in the suite.
Missing Expected: TCs with no expected result.
Compound: TCs that include multiple independent scenarios.
Dup Clusters: groups of near-duplicate TCs.
Contradictions: conflicts between TCs on the same requirement.
Recommendations and Fix plan
You’ll see prioritised actions, like “Refactor compound test cases”. If auto-fixable items exist, a Suggested Fix Plan appears.
Actions are grouped by type.
Some items have an Auto-fixable badge.
Apply safe fixes (N) applies all auto-fixable patches.
Review (N) jumps to the first tab that needs manual review.

Structure issues
Lists test cases with structural problems. Each entry shows a severity badge (HIGH / MEDIUM / LOW) and a short reason.
Common issues:
missing preconditions
vague expected result
no clear pass condition
Actions:
Enhance TC: rewrites the test case to fix the issue.
Keep as is: dismisses the issue without changes.

Duplicates
Groups near-duplicate test cases into clusters.
Each cluster shows:
a status: Confirmed (red) or Suspected (grey)
similarity percentage
a canonical merge suggestion
TC numbers in the cluster
Actions:
Merge: generates a single merged test case.
Keep both: marks the cluster as intentionally distinct.
Delete one: removes one TC from the cluster.
Ignore: hides the cluster.
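Conceptually, clustering pairs up TCs whose text similarity clears a threshold. The sketch below uses plain string similarity from Python's standard library; the actual similarity measure and the Confirmed/Suspected cut-offs are not documented here, so the 90/75 thresholds are assumptions.

```python
from difflib import SequenceMatcher


def similarity(a, b):
    """Rough text similarity as a percentage (illustrative only --
    the product's real similarity measure is not documented here)."""
    return round(100 * SequenceMatcher(None, a.lower(), b.lower()).ratio())


def near_duplicate_pairs(test_cases, confirmed=90, suspected=75):
    """Flag TC pairs whose similarity clears a threshold.

    Returns (index_a, index_b, similarity, tier) tuples.
    """
    pairs = []
    for i in range(len(test_cases)):
        for j in range(i + 1, len(test_cases)):
            s = similarity(test_cases[i], test_cases[j])
            if s >= confirmed:
                pairs.append((i, j, s, "Confirmed"))
            elif s >= suspected:
                pairs.append((i, j, s, "Suspected"))
    return pairs
```

Two TCs that differ by a single character would be flagged as a Confirmed pair, while an unrelated TC stays outside every cluster.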

Coverage
Shows requirement coverage depth. This is different from the Spec Coverage panel, which tracks spec sections.
Metrics:
Coverage %: requirements with at least one TC, as a percentage of all requirements.
Single‑TC Reqs: requirements covered by exactly one TC.
Orphan TCs: TCs not linked to any requirement.
Use the toggle to filter:
Thin: requirements with only one TC.
Good: requirements with two or more TCs.
All: every requirement.
A banner summarises coverage health. Green means every requirement has 2+ TCs. Amber means some requirements are under-covered.
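The three counters above reduce to simple counting over TC-to-requirement links. This sketch assumes a flat mapping from TC id to requirement id (None for unlinked TCs); the data-model names are illustrative.

```python
def coverage_metrics(requirements, links):
    """Compute the documented coverage counters.

    requirements: list of requirement ids.
    links: dict mapping TC id -> requirement id (None for orphan TCs).
    """
    per_req = {r: 0 for r in requirements}
    orphans = 0
    for tc, req in links.items():
        if req in per_req:
            per_req[req] += 1
        else:
            orphans += 1  # TC not linked to any known requirement
    covered = sum(1 for n in per_req.values() if n >= 1)
    return {
        "coverage_pct": round(100 * covered / len(per_req)) if per_req else 0,
        "single_tc_reqs": sum(1 for n in per_req.values() if n == 1),
        "orphan_tcs": orphans,
    }
```

With three requirements, two of them covered and one TC unlinked, the counters come out as 67% coverage, one single-TC requirement, and one orphan TC.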

Risk levels
Shows AI-assigned priority across the suite:
P0 Critical: must pass before release.
P1 High: important paths and likely regressions.
P2 Normal: standard coverage.
You’ll see a tier distribution and a list of all P0/P1 TCs. Each entry includes the AI’s reasoning.

Contradictions
Detects pairs or groups of test cases that contradict each other. Conflicts are scoped to the same requirement.
Each entry shows:
Reliability tier: Verified (red), Probable (amber), Uncertain (grey)
detection method: Rule-based or AI-analysed
evidence: the fields and values that conflict
affected TC numbers
Actions:
Create decision note: documents the intentional difference.
Update expected result: rewrites one TC to resolve the conflict.
Keep both (tag): tags both TCs as a known contradiction.
Ignore: hides the entry.
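The rule-based detection method can be sketched as below: within the same requirement, two TCs that exercise the same steps but assert different expected results are in conflict. The field names are assumptions for illustration, not the product's data model, and the real rules are likely richer than exact field equality.

```python
def find_contradictions(test_cases):
    """Rule-based sketch: flag TC pairs on the same requirement whose
    steps match but whose expected results differ.

    Each TC is a dict with 'id', 'requirement', 'steps', 'expected'.
    """
    conflicts = []
    for i, a in enumerate(test_cases):
        for b in test_cases[i + 1:]:
            if (a["requirement"] == b["requirement"]   # same-requirement scope
                    and a["steps"] == b["steps"]
                    and a["expected"] != b["expected"]):
                conflicts.append((a["id"], b["id"]))
    return conflicts
```

Two TCs for the same requirement that run identical steps but expect "success" and "error" respectively would surface as one conflict, with the differing expected-result fields as evidence.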
Compare with Human TCs

Click Compare with Human TCs to upload a document with hand-written test cases. You’ll get a side-by-side comparison:
Matched: overlaps with human TCs.
AI Only: AI TCs not found in the human set.
Human Only: human TCs not covered by the AI output.
Issues: quality or coverage problems in the overlap.
Use this to audit AI output before replacing or supplementing an existing suite.
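At its core, the comparison buckets the two suites with set operations. This sketch assumes both suites have already been matched to shared identifiers; the real matching between free-text human TCs and AI output would be fuzzier than a plain intersection.

```python
def compare_suites(ai_ids, human_ids):
    """Bucket AI-generated vs hand-written TCs into the three
    documented categories (ids are assumed pre-matched)."""
    ai, human = set(ai_ids), set(human_ids)
    return {
        "matched": sorted(ai & human),      # present in both suites
        "ai_only": sorted(ai - human),      # AI TCs with no human counterpart
        "human_only": sorted(human - ai),   # human TCs the AI output misses
    }
```

For example, comparing ["A", "B", "C"] against ["B", "C", "D"] yields B and C as matched, A as AI-only, and D as human-only.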
Reviewing and applying changes (patch system)
Every AI fix is staged as a patch. Nothing is committed until you review it.
A Changes (N) button appears once patches exist.
Open it to launch the Fix Review Panel.
Review before/after diffs per patch.
Accept or reject patches, then apply.
Applied patches update the suite and auto-save.
Patches are stored per generation. You can close the tab and return later without losing staged fixes.
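The staging model above can be sketched as a small queue: patches wait in a staged list until each one is explicitly accepted or rejected, and only accepted patches reach the suite. The class and method names are illustrative, not the product's internals.

```python
class PatchQueue:
    """Minimal sketch of staged AI fixes: nothing is committed
    until a patch is reviewed and accepted."""

    def __init__(self):
        self.staged = []    # (tc_id, before, after) tuples awaiting review
        self.applied = []   # patches committed to the suite

    def stage(self, tc_id, before, after):
        """Record a proposed fix without touching the suite."""
        self.staged.append((tc_id, before, after))

    def review(self, index, accept):
        """Accept moves the patch to applied; reject discards it."""
        patch = self.staged.pop(index)
        if accept:
            self.applied.append(patch)
```

A rejected patch simply disappears from the staged list; an accepted one is the only kind that changes the suite.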
Saving results
Analysis results are cached per generation. Reopening the tab restores your last results. Use Re-run Analysis to refresh.