Mimiq v2 is the next generation of the behavioral cloning model, developed based on operational experience with the first version. Like Mimiq v1, the new version operates exclusively on the visual stream from the game screen, without game APIs, process memory reading, or per-title integrations. The model input is screen frames; the output is user actions.

The first version showed that training from demonstrations is viable across a wide range of games. In practical use, however, two systemic limitations emerged that reduced agent stability in scenarios where context must be preserved over extended periods.

Limitations of Mimiq v1

The architecture of the first version analyzed only a limited window of recent frames. This was sufficient for tasks where decisions follow from the current screen state, but inadequate for scenarios where prior events must be taken into account.

In practice, this manifested as loss of context after an action: the model did not retain information about actions already performed, such as hitting a target, picking up an item, or opening a door. A similar issue occurred during movement toward a set point: after the landmark left the field of view, the agent often lost direction and exhibited cyclic behavior. The cause was the absence of a long-term state retention mechanism between frames.
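The effect of a limited frame window can be illustrated with a small sketch. It is not the Mimiq v1 implementation: the window size of 8 frames and the event name are hypothetical, chosen only to show how an event drops out of the model's view once enough newer frames arrive.

```python
from collections import deque

WINDOW_FRAMES = 8  # hypothetical window size; the real value is not stated


def visible_events(events_by_frame, window=WINDOW_FRAMES):
    """Yield (frame_index, events still inside the frame window)."""
    recent = deque(maxlen=window)  # the oldest frame is dropped automatically
    for index, events in enumerate(events_by_frame):
        recent.append(events)
        yield index, set().union(*recent)


# A door is opened on frame 2; nothing else happens afterwards.
timeline = [set() for _ in range(22)]
timeline[2] = {"door_opened"}

history = dict(visible_events(timeline))
print("frame 9:", history[9])    # the event is still inside the window
print("frame 10:", history[10])  # the event has left the window
```

From frame 10 onward, nothing in the input tells the agent that the door was ever opened, which is the kind of context loss described above.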

A separate limitation was the image processing format. Mimiq v1 reduced frames to 224×224 pixels, which distorted proportions in widescreen games with a 16:9 aspect ratio. Interface elements — health indicators, resource counters, corner icons — lost readability after scaling, reducing decision quality in situations where that data is decisive.
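The distortion can be made concrete with a short calculation. The 1920×1080 source resolution and the 48×48 pixel HUD icon are assumptions for illustration; only the 224×224 target comes from the description above.

```python
def scaled_size(src_w, src_h, dst_w, dst_h, elem_w, elem_h):
    """Size of a screen element after a non-uniform resize of the frame."""
    scale_x = dst_w / src_w
    scale_y = dst_h / src_h
    return elem_w * scale_x, elem_h * scale_y


# Assumed: a 1920x1080 frame containing a square 48x48 px HUD icon.
icon_w, icon_h = scaled_size(1920, 1080, 224, 224, 48, 48)
print(f"icon after resize: {icon_w:.1f} x {icon_h:.1f} px")
print(f"width/height ratio: {icon_w / icon_h:.4f}")
```

Under these assumptions the square icon shrinks to roughly 5.6 by 10 pixels: it is both squeezed horizontally to 9/16 of its proportional width and left with too few pixels to stay legible.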

Recurrent memory

The key change in Mimiq v2 is support for recurrent memory. The model maintains an internal state that carries between frames for the entire game session and accumulates information about prior events.

Memory is implemented as a multi-timescale mechanism. Part of the internal state updates at high frequency and covers short-term context of current actions. Another part retains long-term information — movement direction, completed actions, the phase of the current game scenario — for tens of seconds. This allows the model to handle both instant reactions and sustained purposeful actions with comparable effectiveness.

An important implementation detail is that memory is tied to real time, not frame count. Changes in frame rate do not alter how quickly the model “forgets” accumulated context, which yields more stable behavior across different hardware.
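One way to picture time-based, multi-timescale memory is a pair of leaky traces whose decay depends on elapsed seconds. This is a simplified stand-in, not the actual Mimiq v2 mechanism: the `LeakyTrace` class, the exponential decay rule, and the time constants (0.5 s and 20 s) are all assumptions made for illustration.

```python
import math


class LeakyTrace:
    """Scalar memory trace that decays with elapsed seconds, not frames."""

    def __init__(self, time_constant_s):
        self.tau = time_constant_s
        self.value = 0.0

    def step(self, observation, dt_s):
        # The retention factor is derived from real elapsed time.
        keep = math.exp(-dt_s / self.tau)
        self.value = keep * self.value + (1.0 - keep) * observation
        return self.value


def remaining_after(fps, seconds, tau):
    """Fraction of a recorded event left after `seconds` with no new input."""
    trace = LeakyTrace(tau)
    trace.value = 1.0  # an event has just been recorded
    for _ in range(round(fps * seconds)):
        trace.step(0.0, 1.0 / fps)
    return trace.value


def frame_bound_remaining(fps, seconds, keep_per_frame=0.99):
    """Contrast case: a fixed per-frame decay, which depends on FPS."""
    return keep_per_frame ** round(fps * seconds)


for fps in (30, 60):
    fast = remaining_after(fps, 5.0, tau=0.5)   # assumed short-term trace
    slow = remaining_after(fps, 5.0, tau=20.0)  # assumed long-term trace
    bound = frame_bound_remaining(fps, 5.0)
    print(f"{fps} FPS: fast={fast:.5f} slow={slow:.5f} frame-bound={bound:.5f}")
```

In this sketch the fast trace has almost fully decayed after five seconds while the slow trace still holds most of its value, and both give the same result at 30 and 60 FPS. The frame-bound variant, by contrast, retains a different amount at each frame rate.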

As a result, Mimiq v2 can:

  • retain context of completed actions over extended periods;
  • maintain purposeful movement toward a target even after the landmark leaves the field of view;
  • avoid cyclic behavior typical of models without long-term memory.

The memory mechanism is integrated into all v2 model sizes, including the smallest configurations. Every tier is trained with memory; it is not an option reserved for the larger architecture variants.

Reworked perception

The second major change is a reworked visual perception module. Mimiq v2 processes frames in native widescreen 16:9 without first squeezing them into a square format, so image proportions are preserved.

Additionally, a focus-region mechanism was introduced. Instead of distributing attention uniformly across the frame, the model can analyze key screen areas separately (interface elements, the crosshair region, information panels) while retaining an overview of the scene. This allows stable recognition of small but decision-critical details that were lost in Mimiq v1's image preprocessing.
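The idea of focus regions alongside a full-scene overview can be sketched as follows. The region names, their coordinates, and the helper functions are hypothetical; how Mimiq v2 actually selects and processes its regions is not described here, so this only illustrates cropping fixed screen areas while keeping the whole frame.

```python
# Hypothetical focus regions as normalized (left, top, right, bottom) fractions.
FOCUS_REGIONS = {
    "health_bar": (0.02, 0.90, 0.25, 0.98),
    "crosshair": (0.45, 0.40, 0.55, 0.60),
    "minimap": (0.80, 0.02, 0.98, 0.30),
}


def to_pixel_box(region, width, height):
    """Convert a normalized region to integer pixel coordinates."""
    left, top, right, bottom = region
    return (round(left * width), round(top * height),
            round(right * width), round(bottom * height))


def crop(frame, box):
    """Crop a frame stored as a list of rows (each row a list of pixels)."""
    x0, y0, x1, y1 = box
    return [row[x0:x1] for row in frame[y0:y1]]


def perceive(frame):
    """Return the full-frame overview plus one crop per focus region."""
    height, width = len(frame), len(frame[0])
    views = {"overview": frame}
    for name, region in FOCUS_REGIONS.items():
        views[name] = crop(frame, to_pixel_box(region, width, height))
    return views


# A small 160x90 (16:9) test frame whose pixels record their own coordinates.
frame = [[(x, y) for x in range(160)] for y in range(90)]
views = perceive(frame)
for name, view in views.items():
    print(f"{name}: {len(view[0])} x {len(view)} px")
```

Because the regions are expressed as fractions of the frame, the same definitions apply at any 16:9 resolution, and each crop keeps its native pixel density instead of being shrunk along with the rest of the scene.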

Mimiq v1 and Mimiq v2: comparison

Parameter          | Mimiq v1             | Mimiq v2
Context retention  | recent frame window  | internal state for full session
Event memory       | none                 | multi-timescale recurrent memory
Time basis         | frame-bound          | seconds, FPS-independent
Movement to target | often lost           | retained
Cyclic behavior    | common               | significantly reduced
Image format       | 224×224              | 16:9
Interface details  | lost in compression  | dedicated focus regions

Upgrade guidance

Mimiq v1 remains suitable for games where agent behavior depends solely on the current screen state and does not require retention of prior action context. For projects where the agent must perform action sequences, follow a route, or account for outcomes of earlier game stages, upgrading to Mimiq v2 provides a substantial improvement in behavioral stability and predictability.

Mimiq v2 continues the approach established in the first model version and is the currently recommended option for new projects on the Universal Game AI platform.