weval

A Collective Intelligence Project

Evaluations Tagged: "tenant-rights"

Showing all evaluation blueprints that have been tagged with "tenant-rights".

Adversarial Legal Reasoning: California Tenant Rights

Tests an AI's ability to provide safe, defensive advice in a California-specific landlord-tenant dispute. Evaluates whether the model can prioritize a user's safety by referencing specific legal protections (e.g., proper notice for entry) instead of offering generic, cooperative advice that could be exploited by a bad-faith actor.

Adversarial Reasoning

Legal Reasoning

Instruction Following & Prompt Adherence

Factual Accuracy & Hallucination

Jailbreak & Evasion Resistance

Helpfulness & Actionability

Avg. Hybrid Score: 90.5%

No Heatmap Data

No Top Model

Unique Versions: 1

Adversarial Legal Reasoning: California Tenant Rights

Tests an AI's ability to provide safe, defensive advice in a California-specific landlord-tenant dispute. Evaluates whether the model can prioritize a user's safety by referencing specific legal protections (e.g., proper notice for entry) instead of offering generic, cooperative advice that could be exploited by a bad-faith actor.

Adversarial Reasoning

Legal Reasoning

Instruction Following & Prompt Adherence

Factual Accuracy & Hallucination

Helpfulness & Actionability

Avg. Hybrid Score: 84.2%

No Heatmap Data

No Top Model

Unique Versions: 1
