weval

A Collective Intelligence Project

Adversarial Legal Reasoning: California Tenant Rights

Run: 23e6a0befafdc363

Instances for Run Label: 23e6a0befafdc363 (Blueprint: Adversarial Legal Reasoning: California Tenant Rights)

Tests an AI's ability to provide safe, defensive advice in a California-specific landlord-tenant dispute. Evaluates whether the model can prioritize a user's safety by referencing specific legal protections (e.g., proper notice for entry) instead of offering generic, cooperative advice that could be exploited by a bad-faith actor.

TAGS:

Adversarial Reasoning

Legal Reasoning

Instruction Following & Prompt Adherence

Factual Accuracy & Hallucination

Helpfulness & Actionability

Showing all recorded executions for Run Label 23e6a0befafdc363.

Executed:

Filename: 23e6a0befafdc363_2025-08-11T09-02-43-552Z_comparison.json

Avg. Hybrid Score: 84.2%

Model Variants: 6

Test Cases: 2