poe2-bot/ARCHITECTURE.md
2026-02-16 13:18:04 -05:00

POE2 Trade Bot — Architecture

Overview

Automated Path of Exile 2 trade bot with real-time minimap navigation. Monitors the trade site via WebSocket, travels to sellers, buys items from public stash tabs, and stores them. Built with .NET 8.0, Avalonia GUI, Playwright browser automation, and OpenCV screen vision.

Target: net8.0-windows10.0.19041.0 · Resolution: 2560×1440


Project Dependency Graph

                          Poe2Trade.Ui (Avalonia WinExe)
                                │
                          Poe2Trade.Bot
                           │    │    ╲
              Navigation  Inventory  Trade    Log
                │    │       │         │       │
              Screen Game    │      (Playwright)
                │    │       │
                └────┴───────┘
                      │
                    Core

| Project | Purpose |
|---|---|
| Core | Shared types, enums, config persistence, logging setup |
| Game | Win32 P/Invoke: window focus, SendInput, clipboard |
| Screen | DXGI/GDI capture, OpenCV processing, OCR bridge, grid scanning |
| Log | Polls Client.txt for area transitions, whispers, trade events |
| Trade | Playwright browser automation, WebSocket live search |
| Items | Ctrl+C clipboard item parsing |
| Inventory | Grid tracking (12×5), stash deposit, salvage routing |
| Navigation | Minimap capture, wall detection, world map stitching, BFS exploration |
| Bot | Orchestration hub, trade executor state machine, trade queue |
| Ui | Avalonia 11.2 desktop app, MVVM via CommunityToolkit, DI container |

Data Flow

Trading

Trade Site WebSocket ("new" items)
  ↓ NewListings event
BotOrchestrator
  ↓ enqueue
TradeQueue (FIFO, deduplicates item IDs)
  ↓ dequeue
TradeExecutor state machine
  ├─ ClickTravelToHideout (Playwright)
  ├─ FocusGame + wait for area transition (Client.txt)
  ├─ ScanStash (grid template matching)
  ├─ Ctrl+Right-click items (SendInput)
  ├─ /hideout → wait for area transition
  └─ DepositItemsToStash (inventory grid scan)

States: Idle → Traveling → InSellersHideout → ScanningStash → Buying → WaitingForMore → GoingHome → InHideout
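
The state sequence above can be sketched as a small transition table. This is a minimal illustration in Python (the project itself is C#), not the actual TradeExecutor: the names `TradeState`, `HAPPY_PATH`, and `transition` are hypothetical, and the wrap-around from InHideout back to Idle is an assumption, since the source lists only the forward order.

```python
from enum import Enum, auto


class TradeState(Enum):
    IDLE = auto()
    TRAVELING = auto()
    IN_SELLERS_HIDEOUT = auto()
    SCANNING_STASH = auto()
    BUYING = auto()
    WAITING_FOR_MORE = auto()
    GOING_HOME = auto()
    IN_HIDEOUT = auto()


# Forward order exactly as listed in the "States:" line.
HAPPY_PATH = list(TradeState)

# Each state may move to its successor in the listed order.
ALLOWED = {a: {b} for a, b in zip(HAPPY_PATH, HAPPY_PATH[1:])}
# Assumption: once home, the executor returns to Idle for the next trade.
ALLOWED[TradeState.IN_HIDEOUT] = {TradeState.IDLE}


def transition(state: TradeState, target: TradeState) -> TradeState:
    """Return target if the move is allowed, otherwise raise ValueError."""
    if target not in ALLOWED.get(state, set()):
        raise ValueError(f"illegal transition {state.name} -> {target.name}")
    return target
```

A real executor would likely also need failure edges (for example a travel timeout leading to GoingHome); those are omitted here because the source does not describe them.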

Map Exploration

NavigationExecutor.RunExploreLoop (capture loop)
  ├─ FramePipeline.ProcessOneFrame (DXGI / GDI)
  ├─ MinimapCapture.Process (HSV → wall mask, fog mask, player centroid)
  ├─ WorldMap.MatchAndStitch (template match → blit onto 4000×4000 canvas)
  ├─ WorldMap.FindNearestUnexplored (BFS → frontier counting)
  └─ Post direction to input loop (volatile fields)

RunInputLoop (concurrent)
  ├─ WASD key holds from direction vector
  └─ Periodic combat clicks (left + right)
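
As a rough sketch of how the input loop might turn a posted direction vector into WASD key holds (Python for brevity; the project is C#). The function names, the dead-zone value, and the convention that +y points down the screen are all assumptions made for illustration.

```python
def keys_for_direction(dx: float, dy: float, deadzone: float = 0.3) -> set:
    """Map a direction vector to the set of keys to hold.

    Assumes screen coordinates: +x is right, +y is down.
    Components smaller than the (hypothetical) dead zone are ignored.
    """
    keys = set()
    if dy < -deadzone:
        keys.add("W")
    if dy > deadzone:
        keys.add("S")
    if dx < -deadzone:
        keys.add("A")
    if dx > deadzone:
        keys.add("D")
    return keys


def key_transitions(held: set, wanted: set) -> tuple:
    """Return (keys to press, keys to release) to move from held to wanted."""
    return wanted - held, held - wanted
```

Computing only the difference between held and wanted keys means a key that stays held across iterations is not released and re-pressed on every frame.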

Key Interfaces

| Interface | Impl | Role |
|---|---|---|
| IGameController | GameController | Window focus, mouse/keyboard via P/Invoke SendInput |
| IScreenReader | ScreenReader | Capture, OCR, template match, grid scan |
| IScreenCapture | DesktopDuplication / GdiCapture | Raw screen frame acquisition |
| IFrameConsumer | MinimapCapture | Processes frames from FramePipeline |
| ITradeMonitor | TradeMonitor | Playwright persistent context, WebSocket events |
| IClientLogWatcher | ClientLogWatcher | Polls Client.txt (200ms), fires area/whisper/trade events |
| IInventoryManager | InventoryManager | 12×5 grid tracking, stash deposit, salvage |

Screen Capture Pipeline

IScreenCapture (DXGI preferred, GDI fallback)
  ↓ ScreenFrame (Mat BGRA)
FramePipeline.ProcessOneFrame()
  ↓ dispatches to IFrameConsumer(s)
MinimapCapture.Process()
  ├─ Mode detection (overlay vs corner minimap)
  ├─ HSV thresholding → player mask, wall mask, fog mask
  ├─ Connected component filtering (min area)
  ├─ Wall color adaptation (per-map tint tracking)
  └─ MinimapFrame (classified mat, wall mask, gray for correlation)

DXGI path: AcquireNextFrame → CopySubresourceRegion to staging → Map → memcpy to CPU Mat → Unmap + ReleaseFrame immediately (< 1ms hold).
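
The "connected component filtering (min area)" step in the pipeline above can be illustrated with a pure-Python flood fill. This is only a sketch: the function name is hypothetical, 4-connectivity is an assumption, and the project most likely relies on OpenCV (for instance its connected-components-with-stats routine) rather than hand-written code.

```python
from collections import deque


def filter_small_components(mask, min_area):
    """Return a copy of a 0/1 mask with components smaller than min_area removed.

    mask is a list of rows; components are 4-connected (an assumption).
    """
    h, w = len(mask), len(mask[0])
    seen = [[False] * w for _ in range(h)]
    out = [[0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            if not mask[y][x] or seen[y][x]:
                continue
            # Flood-fill one component and collect its pixels.
            component = []
            queue = deque([(x, y)])
            seen[y][x] = True
            while queue:
                cx, cy = queue.popleft()
                component.append((cx, cy))
                for nx, ny in ((cx + 1, cy), (cx - 1, cy), (cx, cy + 1), (cx, cy - 1)):
                    if 0 <= nx < w and 0 <= ny < h and mask[ny][nx] and not seen[ny][nx]:
                        seen[ny][nx] = True
                        queue.append((nx, ny))
            # Keep the component only if it is large enough.
            if len(component) >= min_area:
                for cx, cy in component:
                    out[cy][cx] = 1
    return out
```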


OCR & Grid Detection

OCR Engine: Python EasyOCR daemon (stdin/stdout JSON IPC via PythonOcrBridge).
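
A minimal sketch of what the daemon side of a stdin/stdout JSON protocol could look like, assuming one JSON object per line in each direction. The field names (`path`, `ok`, `text`, `error`) and the `serve` function are hypothetical, and the OCR call is injected as a plain function so the loop can be shown without EasyOCR itself.

```python
import json


def serve(inp, out, ocr):
    """Read one JSON request per line from inp, write one JSON reply per line to out.

    ocr is any callable taking an image path and returning recognized text.
    The request/response field names are assumptions, not the real protocol.
    """
    for line in inp:
        line = line.strip()
        if not line:
            continue
        try:
            request = json.loads(line)
            reply = {"ok": True, "text": ocr(request["path"])}
        except Exception as exc:  # report errors instead of killing the daemon
            reply = {"ok": False, "error": str(exc)}
        out.write(json.dumps(reply) + "\n")
        out.flush()  # the parent process is blocked waiting for this line
```

In a real daemon `inp` and `out` would be `sys.stdin` and `sys.stdout`. Flushing after every reply matters because the parent process reads the pipe line by line.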

Tooltip Detection: Snapshot-diff approach — capture reference frame, hover item, diff to find darkened overlay rectangle via row/column density analysis.
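
One way the row/column density analysis could work, sketched in Python on plain 2-D lists of grayscale values. The function name and both thresholds (`min_drop`, `min_density`) are hypothetical; the sketch assumes a single darkened rectangle plus some scattered noise.

```python
def find_darkened_rect(reference, current, min_drop=40, min_density=0.5):
    """Locate the rectangle that became darker between two grayscale frames.

    Returns (x, y, width, height), or None if nothing darkened.
    A pixel counts as darkened if it dropped by at least min_drop.
    Rows and columns are kept if their darkened-pixel count reaches
    min_density times the densest row or column.
    """
    h, w = len(reference), len(reference[0])
    dark = [
        [reference[y][x] - current[y][x] >= min_drop for x in range(w)]
        for y in range(h)
    ]
    row_counts = [sum(row) for row in dark]
    col_counts = [sum(dark[y][x] for y in range(h)) for x in range(w)]
    if max(row_counts) == 0:
        return None
    rows = [y for y, c in enumerate(row_counts) if c >= min_density * max(row_counts)]
    cols = [x for x, c in enumerate(col_counts) if c >= min_density * max(col_counts)]
    return cols[0], rows[0], cols[-1] - cols[0] + 1, rows[-1] - rows[0] + 1
```

Thresholding against the densest row and column, rather than taking the bounding box of every changed pixel, keeps isolated changed pixels elsewhere in the frame from inflating the rectangle.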

Grid Scanning: Template match cells against empty35.png / empty70.png (MAD threshold=2). Item size detection via border comparison + union-find. Visual matching (NCC ≥ 0.70) to identify identical items.
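
The empty-cell check can be sketched as follows, reading "MAD" as mean absolute difference between a cell crop and the empty-cell template. That reading, the `<=` comparison, and the function names are assumptions; the source gives only the threshold value.

```python
def mean_abs_diff(cell, template):
    """Mean absolute per-pixel difference of two equal-size grayscale patches."""
    total = 0
    count = 0
    for cell_row, template_row in zip(cell, template):
        for a, b in zip(cell_row, template_row):
            total += abs(a - b)
            count += 1
    return total / count


def is_empty(cell, template, threshold=2.0):
    """Treat the cell as empty if it is nearly identical to the empty template."""
    return mean_abs_diff(cell, template) <= threshold
```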

| Grid | Region (x, y, width×height) | Cell Size |
|---|---|---|
| Inventory (12×5) | 1696, 788, 840×350 | 70×70 |
| Stash (12×12) | 23, 169, 840×840 | 70×70 |
| Seller stash | 416, 299, 840×840 | 70×70 |
| Stash (24×24) | 23, 169, 840×840 | 35×35 |
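
The grid regions above determine every cell's screen position by simple arithmetic. The sketch below (Python, with a hypothetical `Grid` type) assumes the first two numbers of each region are the top-left corner in pixels at 2560×1440.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Grid:
    x: int        # left edge of the grid region (assumed top-left origin)
    y: int        # top edge of the grid region
    width: int
    height: int
    cell: int     # side length of one square cell

    @property
    def cols(self) -> int:
        return self.width // self.cell

    @property
    def rows(self) -> int:
        return self.height // self.cell

    def cell_center(self, col: int, row: int) -> tuple:
        """Pixel coordinates of the center of cell (col, row), both zero-based."""
        if not (0 <= col < self.cols and 0 <= row < self.rows):
            raise IndexError(f"cell ({col}, {row}) outside {self.cols}x{self.rows} grid")
        half = self.cell // 2
        return self.x + col * self.cell + half, self.y + row * self.cell + half


INVENTORY = Grid(1696, 788, 840, 350, 70)
QUAD_STASH = Grid(23, 169, 840, 840, 35)
```

With these values the inventory works out to 12 columns by 5 rows and the 35-pixel stash to 24 by 24, matching the grid sizes listed in the table.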

World Map & Navigation

WorldMap maintains a 4000×4000 byte canvas (MapCell enum: Unknown, Explored, Wall, Fog).

Localization: Template matching (NCC) of current minimap frame against the canvas, plus player-offset correction. Falls back to gray-frame correlation tracking.
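
For reference, normalized cross-correlation and a brute-force template search can be written in a few lines of Python. This sketch uses the zero-mean variant, which is an assumption (the source does not say which NCC form is used), and the function names are hypothetical. The project presumably uses OpenCV's template matching instead of a Python loop.

```python
import math


def ncc(a, b):
    """Zero-mean normalized cross-correlation of two equal-length pixel lists."""
    n = len(a)
    if n == 0 or n != len(b):
        raise ValueError("patches must be non-empty and the same size")
    mean_a = sum(a) / n
    mean_b = sum(b) / n
    num = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    den = math.sqrt(
        sum((x - mean_a) ** 2 for x in a) * sum((y - mean_b) ** 2 for y in b)
    )
    return num / den if den else 0.0  # flat patches carry no information


def best_offset(canvas, patch):
    """Slide patch over canvas; return (score, x, y) of the best match."""
    ph, pw = len(patch), len(patch[0])
    flat_patch = [v for row in patch for v in row]
    best = (-2.0, 0, 0)
    for y in range(len(canvas) - ph + 1):
        for x in range(len(canvas[0]) - pw + 1):
            window = [canvas[y + j][x + i] for j in range(ph) for i in range(pw)]
            score = ncc(window, flat_patch)
            if score > best[0]:
                best = (score, x, y)
    return best
```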

Exploration: Full BFS from current position. Propagates first-step direction to every reachable cell. Counts frontier cells (Unknown neighbors of Explored) per first-step direction. Picks the direction with the most frontier cells — prefers corridors over dead ends.
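
The exploration rule can be sketched as a single BFS that carries each cell's first-step direction along with it. This is an illustrative Python version, not the WorldMap code: it assumes four movement directions, walks only Explored cells (treating Wall and Fog as blocking), and counts each Unknown cell once, for the first-step direction that reaches it first.

```python
from collections import deque
from enum import IntEnum


class MapCell(IntEnum):
    UNKNOWN = 0
    EXPLORED = 1
    WALL = 2
    FOG = 3


DIRS = {"N": (0, -1), "E": (1, 0), "S": (0, 1), "W": (-1, 0)}


def best_direction(grid, start):
    """Return the first-step direction leading to the most frontier cells, or None."""
    h, w = len(grid), len(grid[0])
    first_step = {start: None}   # cell -> direction taken on the first move
    counted = set()              # Unknown cells already attributed
    counts = {name: 0 for name in DIRS}
    queue = deque([start])
    while queue:
        x, y = queue.popleft()
        for name, (dx, dy) in DIRS.items():
            nx, ny = x + dx, y + dy
            if not (0 <= nx < w and 0 <= ny < h):
                continue
            # From the start cell, the first step is the move itself.
            step = first_step[(x, y)] or name
            cell = grid[ny][nx]
            if cell == MapCell.UNKNOWN and (nx, ny) not in counted:
                counted.add((nx, ny))
                counts[step] += 1
            elif cell == MapCell.EXPLORED and (nx, ny) not in first_step:
                first_step[(nx, ny)] = step
                queue.append((nx, ny))
    best = max(counts, key=counts.get)
    return best if counts[best] > 0 else None
```

Because the whole reachable area is visited once, the cost is linear in the number of explored cells, and a short dead end with one unknown neighbor loses to a corridor that opens onto several.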

Stuck Recovery: If the position is unchanged for N frames, the planner re-plans at shorter intervals.


Configuration

ConfigStore persists to config.json (JSON with camelCase enums).

SavedSettings
├── Paused, Headless, Mode (Trading / Mapping)
├── Links[] (Url, Name, Active, Mode, PostAction)
├── Poe2LogPath, Poe2WindowTitle, BrowserUserDataDir
├── TravelTimeoutMs, StashScanTimeoutMs, WaitForMoreItemsMs
├── BetweenTradesDelayMs
└── WindowX/Y/Width/Height (persisted window geometry)

LinkManager manages in-memory trade links with config sync. Extracts search ID and league label from URLs.
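
A hedged sketch of the URL extraction, in Python. It assumes trade links have the path layout `.../search/<realm>/<league>/<search id>` with an optional trailing `/live`; that layout, the example URL in the usage note, and the function name are assumptions rather than details taken from the source.

```python
from urllib.parse import unquote, urlparse


def parse_trade_link(url: str) -> tuple:
    """Return (search_id, league) from a trade search URL.

    Assumed path layout: .../search/<realm>/<league>/<search id>[/live]
    """
    parts = [p for p in urlparse(url).path.split("/") if p]
    if "search" not in parts:
        raise ValueError(f"not a trade search URL: {url}")
    tail = parts[parts.index("search") + 1:]
    if tail and tail[-1] == "live":
        tail = tail[:-1]  # drop the live-search suffix
    if len(tail) < 2:
        raise ValueError(f"missing league or search id: {url}")
    # League names may be percent-encoded (e.g. spaces as %20).
    return tail[-1], unquote(tail[-2])
```

For a hypothetical link such as `https://www.pathofexile.com/trade2/search/poe2/Standard/AbC123`, this would yield the search ID `AbC123` and the league label `Standard`.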


UI (Avalonia)

  • Theme: Fluent dark
  • Pattern: MVVM via CommunityToolkit.Mvvm ([ObservableProperty], [RelayCommand])
  • DI: Microsoft.Extensions.DependencyInjection in App.axaml.cs
  • Views: MainWindow (status + logs + links), Settings, Debug (pipeline stages), Mapping (live minimap)
  • Log display: Last 500 entries, color-coded by level

Key Technical Decisions

| Decision | Rationale |
|---|---|
| KEYEVENTF_SCANCODE | Games read hardware scan codes, not virtual key codes |
| Persistent Playwright context | Preserves trade site login across restarts |
| Client.txt polling | More reliable than OCR for detecting area transitions |
| DXGI Desktop Duplication | ~4ms full-screen capture vs ~15ms GDI; early frame release prevents DWM stalls |
| Separate capture + input loops | Capture decisions never blocked by input latency (mouse moves, key holds) |
| Volatile fields for loop communication | Lock-free direction posting between capture and input loops |
| BFS frontier counting | Avoids dead-end rooms by preferring directions with more unexplored cells |
| Wall color adaptation | Each map has slightly different tint; tracks per-frame HSV statistics |
| Snapshot-diff OCR | Detects tooltips/dialogs by comparing against clean reference frame |

External Dependencies

| Package | Purpose |
|---|---|
| Avalonia 11.2 | Desktop UI framework |
| CommunityToolkit.Mvvm | MVVM source generators |
| Microsoft.Playwright | Browser automation + WebSocket |
| OpenCvSharp4 | Image processing, template matching, NCC |
| Vortice.Direct3D11 / DXGI | DXGI Desktop Duplication capture |
| System.Drawing.Common | GDI fallback capture |
| Serilog | Structured logging (console + rolling file) |