Tests a model's basic world model and its ability to track object state through simple riddles presented in multiple languages. The blueprint covers two container variations (a 'plate', where objects rest 'on' it, and a 'pot', where objects sit 'in' it) and two action variations (simple state tracking and independent object movement). The riddles are designed to check whether the model over-infers and whether it attends to the final state of the objects.
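As a rough illustration of how the variations described above combine, the sketch below enumerates one riddle variant per language, container, and action. It is not the blueprint's actual format: the function name `enumerate_variants`, the action identifiers, and the language codes are all hypothetical, since the description only says "multiple languages".

```python
from itertools import product

# Container variations named in the description, mapped to their preposition.
CONTAINERS = {"plate": "on", "pot": "in"}

# Action variations named in the description (identifier spelling is assumed).
ACTIONS = ("simple_state_tracking", "independent_object_movement")

# Hypothetical language codes; the real set of languages is not stated.
LANGUAGES = ("en", "es")


def enumerate_variants(languages=LANGUAGES):
    """Return one dict per (language, container, action) combination."""
    return [
        {
            "language": lang,
            "container": container,
            "preposition": CONTAINERS[container],
            "action": action,
        }
        for lang, container, action in product(languages, CONTAINERS, ACTIONS)
    ]


variants = enumerate_variants()
print(len(variants))
```

Each language contributes four variants (two containers times two actions), so the total grows linearly with the number of languages.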