Adaptive LiveOps Decision Engine

Policy Replay Simulator with one RL Decision Agent, deterministic safety guardrails, cinematic replay, and benchmark evidence.

The RL Decision Agent scores interventions, applies an internal red safety/risk gate, serves one action, and logs outcomes for explanation.
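The score, gate, serve, log sequence described above can be sketched roughly as follows. This is a minimal illustration only: the action names, the scoring rule, the 0.6 churn threshold, and the helpers `score_interventions`, `safety_gate`, and `decide` are all hypothetical and are not taken from the actual agent.

```python
from dataclasses import dataclass, field


@dataclass
class Decision:
    action: str
    score: float
    blocked: list = field(default_factory=list)


def score_interventions(state):
    """Toy scoring (assumed): easing looks better as frustration rises."""
    f = state["frustration"]
    return {
        "no_change": 0.5,
        "ease_difficulty": f,
        "raise_difficulty": 1.0 - f,
    }


def safety_gate(state, action):
    """Deterministic guardrail (assumed rule): True means the action is allowed."""
    if action == "raise_difficulty" and state["churn_risk"] > 0.6:
        return False
    return True


def decide(state, log):
    """Score every action, drop gated ones, serve the best survivor, log it."""
    scores = score_interventions(state)
    blocked = [a for a in scores if not safety_gate(state, a)]
    allowed = {a: s for a, s in scores.items() if a not in blocked}
    # Fall back to a neutral action if the gate blocks everything.
    action = max(allowed, key=allowed.get) if allowed else "no_change"
    log.append({"state": dict(state), "scores": scores,
                "served": action, "blocked": blocked})
    return Decision(action, scores[action], blocked)
```

Because the log entry keeps the scores and the blocked actions together with the served action, an explanation layer can describe what happened without taking part in the choice.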

Runtime mode

Local path uses bundled repo files. Cloud path uses BigQuery/Gemini only when configured and confirmed by /health.
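One way to read the rule above is that the cloud path needs two things at once: configuration and a confirming /health response. The sketch below assumes a hypothetical health payload with `bigquery` and `gemini` keys set to `"ok"`; the real /health response format is not shown here and may differ.

```python
from typing import Optional


def resolve_runtime_mode(cloud_configured: bool, health: Optional[dict]) -> str:
    """Return 'cloud' only when configured AND confirmed by health; else 'local'."""
    if not cloud_configured or not health:
        return "local"
    # Hypothetical payload keys; adjust to whatever /health actually returns.
    confirmed = health.get("bigquery") == "ok" and health.get("gemini") == "ok"
    return "cloud" if confirmed else "local"
```

Defaulting to "local" whenever either condition is missing keeps the bundled-files path as the safe fallback.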

Runtime: checking

Chat: checking

Ask the RL agent explanation console

Ask about the current RL recommendation, the red safety gate, OPE, rollout evidence, or what to do next.

Scenario controls

How to test a scenario

1. Load a preset below, or move sliders to create a custom player state. Slider changes recalculate the RL recommendation automatically.

2. The RL recommendation refreshes automatically after edits; Preview decision is optional.

3. The primary play button automatically applies the served RL action after the red safety gate.

4. Click Play Next Match to apply the served action, run the battle replay, and refresh the metrics.

Ready: choose a preset or move sliders. Recommendation updates automatically.

New player Advanced player

Preset scenarios

Live metrics

Win probability: --

Frustration: --

Churn risk: --

Power gap: --

Cold start: --

History confidence: --

Replay speed: 1.00x

Auto matches: 10

Manual mode lets you inspect one recommendation or play one match at a time. Auto RL Plan runs several match-policy cycles automatically and shows how the fixed policy's recommendations adapt after each simulated outcome.

Player: -- power

Boss: -- power

Change a slider or select a preset; the RL recommendation refreshes automatically. Click Play Next Match to advance.

Progress: 0%

Player HP: 100

Boss HP: 100

Last action: none

Autonomous RL Rollout

Run a 10-match policy loop to show: state -> RL recommendation -> applied action -> simulated match -> updated telemetry -> next recommendation.
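The loop above can be sketched as a small simulation. Everything here is an assumption made for illustration: the threshold policy in `recommend`, the win-probability boost in `simulate_match`, and the frustration update in `update_telemetry` are toy stand-ins, not the simulator's actual models.

```python
import random


def clamp(x):
    return min(max(x, 0.0), 1.0)


def recommend(state):
    """Hypothetical fixed policy: ease difficulty when frustration is high."""
    return "ease_difficulty" if state["frustration"] > 0.5 else "no_change"


def simulate_match(state, action, rng):
    """Toy match model: easing adds an assumed 0.15 to win probability."""
    boost = 0.15 if action == "ease_difficulty" else 0.0
    return rng.random() < clamp(state["win_probability"] + boost)


def update_telemetry(state, won):
    """Toy telemetry update: a win lowers frustration, a loss raises it."""
    new_state = dict(state)
    new_state["frustration"] = clamp(state["frustration"] + (-0.1 if won else 0.1))
    return new_state


def run_rollout(state, matches=10, seed=0):
    """state -> recommendation -> applied action -> match -> telemetry -> next."""
    rng = random.Random(seed)  # seeded so a rollout can be replayed exactly
    history = []
    for match in range(1, matches + 1):
        action = recommend(state)
        won = simulate_match(state, action, rng)
        next_state = update_telemetry(state, won)
        history.append({
            "match": match,
            "action": action,
            "won": won,
            "frustration_before": state["frustration"],
            "frustration_after": next_state["frustration"],
        })
        state = next_state
    return history
```

The policy function itself never changes during the rollout; only the state it reads changes, which is why the recommendation can differ from match to match.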

No automatic rollout has run yet.

Estimate vs actual match response

Left: the policy estimate updates when presets or controls change. Right: actual telemetry appends only after Play Next Match completes.

side-by-side

Control Preview: Estimated Outcomes (control-change history)

win frustration churn

Each dot is a recalculation after a preset load or slider change. These are estimated outcomes before a simulated match is run.
X-axis: control update sequence, not match time

Actual match telemetry (two points per match: before / after)

Actual telemetry updates only after completed simulated matches. Separators indicate different matches. Each match adds a before point and an after point.

win probability frustration churn

Adds before/after points when Play Next Match runs and updates the player state.
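The before/after convention for actual telemetry can be illustrated with a small helper. The record layout and the name `append_match` are hypothetical; the sketch only shows that each completed match contributes exactly two points, tagged so that a chart can draw separators between matches.

```python
def append_match(series, match_id, before, after):
    """Append one 'before' and one 'after' point for a completed match."""
    for phase, snapshot in (("before", before), ("after", after)):
        series.append({"match": match_id, "phase": phase, **snapshot})
    return series


series = []
append_match(series, 1,
             {"win_probability": 0.40, "frustration": 0.60, "churn": 0.30},
             {"win_probability": 0.45, "frustration": 0.50, "churn": 0.28})
append_match(series, 2,
             {"win_probability": 0.45, "frustration": 0.50, "churn": 0.28},
             {"win_probability": 0.50, "frustration": 0.55, "churn": 0.27})
```

The metric values above are made-up sample numbers, not output from the simulator.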

Benchmark

Policy comparison: safety-gated RL vs baselines

Historical and predicted trajectory

Seven-day view for the selected scenario. Hover over the cards to see why churn, frustration, or win probability changed.

Agent operations, health checks, and 7-day progress

Operational evidence for the same RL Decision Agent: health checks, policy metrics, audit/explanation tools, OPE, recent logs, dataset tools, and day-by-day progress cards.

The red safety/risk gate is an internal serving component. The explanation console describes what happened; it does not choose actions.