LLM-based embodied agents are usually optimized for instruction following and task efficiency. In long-term human–AI collaboration, however, agency also includes deliberate noncompliance. We present a Refusal Agent in Minecraft that can execute feasible requests yet may refuse (“won’t”) based on internal fatigue and mood. Instruction understanding is handled by an LLM, while refusal is decided by a rule-based state model. So that refusal is perceived as intentional rather than as a malfunction, the agent gives a brief explanation of its state and performs a visible alternative action, with “Safe Rebellion” guardrails for controllability, bounded harm, and transparency. Monte Carlo simulations show stable refusal–recovery dynamics and tunable refusal frequency, and a video-based impression study suggests that refusal can feel mildly unpleasant yet is not always judged as an obstruction. These findings motivate interactive studies of how refusal shapes perceived agency, trust, and frustration.
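The rule-based refusal model and its Monte Carlo evaluation can be illustrated with a minimal sketch. This is not the authors' implementation: the update rule, the constants (0.5, 0.3, 0.1, 0.05), the `threshold` and `max_streak` parameters, and all function names are assumptions chosen only to show how fatigue and mood could drive refusal, how a guardrail could bound it, and how one parameter could tune refusal frequency.

```python
import random
from dataclasses import dataclass


@dataclass
class AgentState:
    fatigue: float = 0.0  # 0 = rested, 1 = exhausted (assumed scale)
    mood: float = 0.5     # 0 = low, 1 = high (assumed scale)
    streak: int = 0       # consecutive refusals so far


def clamp(x: float) -> float:
    """Keep a state variable inside [0, 1]."""
    return max(0.0, min(1.0, x))


def step(state: AgentState, threshold: float, rng: random.Random,
         max_streak: int = 3) -> str:
    """Handle one feasible request and return 'refuse' or 'comply'."""
    # Hypothetical rule: fatigue pushes toward refusal, good mood offsets it.
    pressure = state.fatigue - 0.5 * state.mood
    # Assumed guardrail: cap consecutive refusals so the user keeps control.
    refuse = pressure > threshold and state.streak < max_streak
    if refuse:
        state.fatigue = clamp(state.fatigue - 0.3)  # resting recovers energy
        state.streak += 1
    else:
        state.fatigue = clamp(state.fatigue + 0.1)  # working costs energy
        state.streak = 0
    # Mood drifts as a small bounded random walk.
    state.mood = clamp(state.mood + rng.uniform(-0.05, 0.05))
    return "refuse" if refuse else "comply"


def explain(state: AgentState) -> str:
    """Brief state explanation plus a visible alternative action (assumed wording)."""
    return (f"I won't right now (fatigue {state.fatigue:.2f}, "
            f"mood {state.mood:.2f}). I'll sit down for a moment instead.")


def refusal_rate(threshold: float, steps: int = 1000, seed: int = 0) -> float:
    """Monte Carlo estimate of the fraction of requests that are refused."""
    rng = random.Random(seed)
    state = AgentState()
    refusals = sum(step(state, threshold, rng) == "refuse" for _ in range(steps))
    return refusals / steps


if __name__ == "__main__":
    for threshold in (0.2, 0.4, 0.6):
        print(threshold, refusal_rate(threshold))
    print(explain(AgentState(fatigue=0.9, mood=0.4)))
```

In this sketch `threshold` is the tuning knob. A threshold above the maximum possible pressure means the agent never refuses, while a very low one makes it refuse as often as the streak cap allows. Refusing lowers fatigue, so each refusal moves the agent back toward compliance, which gives a simple refusal and recovery cycle.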
Research papers (proceedings of international meetings)