About arc42
arc42, the template for documentation of software and system architecture.
Template Version 8.2 EN. (based upon AsciiDoc version), January 2023
Created, maintained and © by Dr. Peter Hruschka, Dr. Gernot Starke and contributors. See https://arc42.org.
1. Introduction and Goals
We are a team of four Software Engineering students from the University of Oviedo developing YOVI, a web-based gaming platform for the company Micrati as part of our Architecture of Software Systems (ASR) course project.
The platform will let users play Game Y against AI opponents with selectable strategies and track their games through a personal account. Our solution uses a service-oriented architecture with three main components: a React frontend (webapp/), a Node.js user service (users/), and a Rust game engine (gamey/), all reached through an API gateway. This architecture aims for clarity, maintainability, and scalability.
This document outlines the architectural vision for YOVI, balancing functionality with clarity and maintainability to meet both academic requirements and player expectations.
1.1. Requirements Overview
The YOVI system is a web-based gaming platform based on Game Y, developed for the company Micrati. The main goal is to allow users to play matches against the machine or via bots, with support for registration and statistics functionalities.
Key functional requirements include:
- Public deployment and web accessibility.
- Development of a web application in React that allows playing at least the classic version of Game Y.
- Implementation of a Rust module to verify the game state and suggest moves to the bot.
- User registration and access to match history.
- Support for multiple strategies and difficulty levels in the vs. machine mode.
1.2. Quality Goals
The main quality attributes for the system architecture are:
- Functionality: The system must correctly implement Game Y rules and provide at least one working AI strategy.
- Usability: The React-based interface must be intuitive for players. The component-based architecture of the webapp supports future internationalization (i18n).
- Modularity & Maintainability: The clear separation into services ensures that components can be developed, tested, and extended independently.
- Deployability & Availability: Using Docker and Docker Compose ensures that the multi-service application can be consistently deployed in any environment. The gateway provides a single entry point, simplifying deployment.
- Testability: The separation of concerns allows for comprehensive testing at multiple levels: unit tests in Rust and Node.js, integration tests between services, and end-to-end tests for the webapp.
1.3. Stakeholders
| Role/Name | Contact | Expectations |
|---|---|---|
| Development Team | yovi_en2d | Implement a scalable, maintainable, and well-documented solution. |
| End Users | N/A | Usable, stable interface with complete gameplay and statistics features. |
| Micrati (Client) | N/A | Fulfillment of game, API, and public deployment requirements. |
| Project Evaluators | Arquisoft | Compliance with documentation, testing, deployment, and code quality criteria. |
| Bot Developers | API users | Well-documented and stable API for automated integration. |
2. Architecture Constraints
This section lists the main constraints that shape the architecture of the YOVI system. Unlike requirements, which state what the system must do, constraints define how it must be built or which technologies must be used. These are boundaries within which we must operate; the few that leave room for negotiation are marked as such.
2.1. Technical Constraints
These constraints are imposed by the client (Micrati). They dictate the technology stack and communication protocols.
| Constraint | Explanation | Rationale | Negotiable? |
|---|---|---|---|
| Frontend Technology | The web application MUST be implemented in TypeScript. React is the chosen framework but not strictly mandatory. | The client requires TypeScript for type safety and maintainability. | No (TypeScript is mandatory) |
| Game Engine Language | The game logic module MUST be implemented in Rust. | Performance and safety requirements for game state validation and AI strategies. | No |
| Communication Protocol | All communication between the webapp and game engine MUST use JSON messages following YEN notation for game states. | Standardized format specified by the client for interoperability. | No |
| Public API | The system MUST expose a documented public API that allows external bots to interact. | Core requirement from Micrati for third-party integration. | No |
| Deployment | The complete system MUST be publicly accessible via the Web using HTTPS. | Ensures secure communication and production readiness. | No |
| Containerization | All services MUST be containerized using Docker with a root-level | Ensures consistent deployment across environments. | Partial (format flexible) |
| Database | User data MUST be persisted. MongoDB is the current implementation, but alternatives are possible. | Data persistence is mandatory; the specific database technology is negotiable. | Yes (technology choice) |
2.2. Domain Constraints
These constraints come from the game domain itself and the specific requirements of Game Y.
| Constraint | Explanation | Rationale | Negotiable? |
|---|---|---|---|
| Game Rules | The system MUST correctly implement the rules of Game Y (classic version minimum). | Core business requirement; incorrect game rules make the product worthless. | No |
| Game Mode | A player-versus-machine mode MUST be available. | Primary use case for the platform. | No |
| Board Size | The game MUST support variable board sizes configurable by the user. | Required for different difficulty levels and game variants. | No |
| AI Strategies | The computer MUST implement more than one strategy, selectable by the user. | Demonstrates AI sophistication and provides varied gameplay. | No |
| User Data | Users MUST be able to register and access their match history. | Basic user management requirement. | No |
2.3. Organizational Constraints
These constraints govern our development process and team practices.
| Constraint | Explanation | Rationale | Negotiable? |
|---|---|---|---|
| Documentation Standard | Architecture MUST be documented following the Arc42 template. | Course requirement for the ASR lab. | No |
| Decision Recording | Architectural decisions MUST be recorded as ADRs (Architectural Decision Records) in the GitHub wiki. | Ensures traceability and rationale documentation. | No |
| Version Control | Development MUST use Git with the repository hosted on GitHub ( | Course requirement for collaboration and evaluation. | No |
| Branching Strategy | A branch-based workflow MUST be followed (feature branches, pull requests, code reviews). | Ensures code quality and team coordination. | Partial (specific workflow details negotiable) |
| Testing Requirements | The system MUST include unit tests, integration tests, and end-to-end tests. | Quality assurance requirement from course evaluation. | No |
| Test Coverage | Code coverage MUST meet thresholds defined in | Quality gate for acceptance. | No |
| CI/CD | Automated build, test, and deployment MUST be implemented via CI/CD pipelines (GitHub Actions). | Ensures reproducible builds and deployment automation. | No |
| Issue Tracking | All tasks MUST be tracked using GitHub Issues. | Project management and traceability requirement. | No |
2.4. External Constraints
These constraints come from external stakeholders and the operating environment.
| Constraint | Explanation | Rationale | Negotiable? |
|---|---|---|---|
| Bot Integration | External bots can ONLY interact through the public API exposed via the secure HTTPS endpoint; no direct access to internal services is allowed. | Security and encapsulation requirement. | No |
| API Documentation | The public API MUST be documented for third-party developers. | Enables external developers to build compatible bots. | Partial (documentation format negotiable) |
| YEN Notation Compliance | Any system claiming to support Game Y MUST use YEN notation for game state representation. | Industry standard for this game family. | No |
| Public Accessibility | The deployed system MUST be accessible without special network configuration. | End users and evaluators need to access the platform easily. | No |
2.5. Quality Constraints
These constraints define the minimum acceptable quality levels the architecture must guarantee.
| Constraint | Target | Measurement | Negotiable? |
|---|---|---|---|
| Response Time | Game engine must respond within 2 seconds for standard board sizes. | Performance testing with realistic loads. | Partial (exact threshold) |
| Concurrent Users | Support at least 10 concurrent users without degradation. | Load testing simulation. | Yes (scale up with resources) |
| Availability | System must achieve 99% uptime during the evaluation period. | Monitoring and observability tools. | Partial (percentage) |
2.6. Implications of These Constraints
These constraints collectively shape our architecture in specific ways:
- Three-service structure: The technology constraints (TypeScript + Rust) force a separation between the frontend (webapp) and the game engine (gamey). The user management requirement adds a third service (users).
- API Gateway necessity: Multiple services with public API exposure require a gateway for request routing and security.
- Reverse proxy layer: A reverse proxy is required to handle HTTPS, enforce secure communication, and route external traffic to the appropriate internal services.
- Docker standardization: The containerization constraint ensures consistent development and deployment environments.
- Documentation overhead: Arc42 and ADR requirements mean we must invest time in documentation alongside development.
- Testing necessity: The testing requirements force us to design for testability from the start.
3. YOVI System Architecture Documentation – Context and Scope
4. Context and Scope
This section delimits the YOVI system from its environment. Following Arc42 guidelines, we separate business context (what the system does) from technical context (how it communicates).
4.1. Business Context
The YOVI system is a web-based gaming platform for Game Y. From a business perspective, it provides two distinct value propositions:
- For human players: An intuitive web interface to play Game Y against AI opponents, register, and track performance.
- For bot developers: A public API that allows automated bots to interact with the game engine, request moves, and play matches.
The system is self-contained; all game logic, user management, and AI strategies are implemented within YOVI boundaries.
4.1.1. Business Context Diagram
4.1.2. Business Interfaces
| Interface | Description | Input | Output | Communication Partner |
|---|---|---|---|---|
| Game Play Interface | Web UI for human players to interact with the game | Mouse clicks, keyboard | Visual board, game state | Human Player |
| User Management | Registration and login interface | Username, password | Session token, user profile | Human Player |
| Statistics Interface | Access to match history and metrics | User ID | Win/loss records, game history | Human Player |
| Bot API - Move Request | Programmatic interface for bots to get next move | Board state (YEN), strategy | Move (YEN) | Bot |
| Bot API - Game State | Programmatic interface to query game status | Game ID | Current board, turn, winner | Bot |
4.2. Technical Context
From a technical perspective, the YOVI system is implemented as five independent services communicating via HTTP/REST, plus an API gateway that routes external requests.
4.2.1. Technical Context Diagram
4.2.2. Technical Interfaces and Channels
| Business Interface | Technical Channel | Protocol | Data Format |
|---|---|---|---|
| Game Play UI | Browser ↔ Gateway | HTTPS | HTML/CSS/JS + JSON |
| User Registration | Browser ↔ Gateway ↔ Auth ↔ Users | HTTPS → HTTP → HTTP | JSON |
| User Login | Browser ↔ Gateway ↔ Auth ↔ Users | HTTPS → HTTP → HTTP | JSON |
| Token Verification | Browser ↔ Gateway ↔ Auth | HTTPS → HTTP | JSON (JWT) |
| Statistics Retrieval | Browser ↔ Gateway ↔ Users | HTTPS → HTTP | JSON |
| Bot API - Move Request | Bot ↔ Gateway ↔ Gamey | HTTPS → HTTP | JSON (YEN) |
| Bot API - Game State | Bot ↔ Gateway ↔ Gamey | HTTPS → HTTP | JSON (YEN) |
| Game Logic Execution | Gateway ↔ Gamey Service | HTTP (internal) | JSON (YEN) |
| Database Access | Users Service ↔ MongoDB | MongoDB Wire Protocol | BSON |
5. Solution Strategy
This section summarizes the fundamental decisions that shape the YOVI system architecture. Each decision is motivated by specific constraints (from section 2) and quality goals (from section 1), and forms the foundation for detailed design decisions in later sections.
5.1. Technology Decisions
These decisions are primarily driven by the technical constraints defined in section 2 (TypeScript, Rust, Docker) and the quality goal of maintainability.
| Decision | Rationale | Constraints/Goals Addressed | Alternatives Considered |
|---|---|---|---|
| Frontend: React + TypeScript | React’s component model enables UI reuse and supports future i18n. TypeScript is mandatory per client. | Technical Constraint: TypeScript; Quality: Usability, Maintainability | Vue, Angular (rejected: the team is more familiar with React) |
| User Service: Node.js/Express | Lightweight, fast for prototyping, integrates well with MongoDB. Team has JavaScript expertise. | Quality: Development speed, Testability | Python/Flask, Java/Spring (rejected: heavier, slower for simple CRUD) |
| Game Engine: Rust | Mandated by client. Provides memory safety and performance for game logic and AI. | Technical Constraint: Rust; Quality: Performance, Reliability | C++, Go (not allowed by constraint) |
| Database: MongoDB | Schema flexibility for user data, easy integration with Node.js, quick setup. | Quality: Development speed, Deployability | PostgreSQL (rejected: schema changes slower for evolving requirements) |
| Containerization: Docker + docker-compose | Ensures consistent environments across dev/test/prod. Required for deployment. | Technical Constraint: Containerization; Quality: Deployability | Manual deployment (rejected: inconsistent, error-prone) |
| CI/CD: GitHub Actions | Integrated with repository, automates testing and deployment, free for students. | Organizational Constraint: CI/CD; Quality: Testability | Jenkins (rejected: requires separate infrastructure) |
5.2. Top-Level Decomposition (Architectural Pattern)
Decision: Microservices architecture with three independent services (webapp, users, gamey) plus an API gateway behind an Nginx reverse proxy.
Reasons:
- Technology constraints (TypeScript + Rust) force separation: the frontend and the game engine cannot run in the same process.
- Separating user management from game logic allows independent scaling and development.
- The API gateway provides a single entry point, hiding internal complexity and enforcing security.
- The Nginx reverse proxy handles HTTPS termination and redirects HTTP → HTTPS.
How it maps to quality goals:
- Maintainability: Services can be modified independently
- Testability: Each service can be tested in isolation
- Deployability: Services can be deployed independently or together via docker-compose
- Security: HTTPS handled at proxy, internal services not exposed
5.3. Design Patterns
| Pattern | Application | Rationale | Location |
|---|---|---|---|
| Model-View-Controller | Webapp structure | Separates UI (View) from game state (Model) and user input handling (Controller). Essential for maintainability. | |
| Strategy Pattern | Bot AI implementation | Allows multiple difficulty levels/strategies without changing game core. New strategies = new classes, no modification to existing code. | |
| Observer Pattern | UI updates | When game state changes, UI must update automatically. React’s virtual DOM and state management handle this efficiently. | |
| Gateway/Router Pattern | API Gateway + reverse proxy | Single entry point routes requests to appropriate services. Simplifies client and adds security layer. HTTPS enforced at reverse proxy. | |
5.4. Quality Goal Realization
| Quality Goal | How We Achieve It | Key Decisions | Verification |
|---|---|---|---|
| Functionality | Rust engine ensures correct game rules; multiple strategies in | Rust for game logic; Strategy pattern for AI | Unit tests in |
| Usability | React provides responsive UI; component design supports i18n | React frontend; Externalized strings | E2E tests verify user flows; usability testing |
| Modularity & Maintainability | Microservices separation; Strategy pattern for extensions | Three-service architecture; Strategy pattern | Independent deployment possible; new strategies added without modifying core |
| Deployability & Availability | Docker containers; docker-compose; CI/CD; HTTPS via Nginx | Containerization; GitHub Actions; Reverse proxy | |
| Testability | Separation of concerns; dependency injection; Rust test framework | Microservices; Repository pattern | Unit tests: |
| Interoperability | REST API with YEN notation; documented for bots | Public API in | API tests verify YEN format; documentation generated automatically |
5.5. Organizational and Process Decisions
| Decision | Rationale | Impact | Constraint Addressed |
|---|---|---|---|
| Iterative Development (2-week sprints) | Early validation of gameplay; adapt to feedback | Regular demos; continuous integration | Course timeline; quality goals |
| Kanban Board (GitHub Projects) | Visualize work; identify bottlenecks | Issues tracked; clear priorities | Organizational: issue tracking |
| Code Reviews via Pull Requests | Catch issues early; share knowledge | All PRs require review; coding standards enforced | Quality: maintainability, reliability |
| Pair Programming for Complex Features | Rust game logic is critical | Higher quality for core engine | Quality: functionality, reliability |
| Architectural Decision Records (ADRs) | Document why decisions were made | GitHub wiki with numbered ADRs | Organizational: documentation standard |
| Definition of Done | Feature = code + tests + docs + reviewed | Ensures completeness before merge | Quality: testability, maintainability |
5.6. Key Architectural Decisions Summary
| Decision | ADR | Summary | Status |
|---|---|---|---|
| Modes of game available to the user | ADR-001 | Basic functionality, extendable project | Implemented |
| MongoDB for user data | ADR-002 | Schema flexibility for evolving requirements | Implemented |
| Three-service architecture | ADR-003 | Separate webapp, users, gamey services due to technology constraints | Implemented |
| API Gateway + Reverse Proxy | ADR-004 | Single entry point, routing, HTTPS enforced | Implemented |
| Strategy Pattern for AI | ADR-005 | Bot strategies as pluggable components | In progress |
5.7. Traceability to Constraints
| Constraint (from Section 2) | How Solution Strategy Addresses It |
|---|---|
| TypeScript frontend | React + TypeScript in |
| Rust game engine | |
| JSON + YEN communication | REST APIs with YEN validation in |
| Public API for bots | |
| Docker containerization | Each service has Dockerfile; root |
| User data persistence | MongoDB in |
| Multiple AI strategies | Strategy pattern in |
| Testing requirements | Unit, integration, E2E tests in all services |
| CI/CD | GitHub Actions workflows for test and deploy |
| Documentation (Arc42 + ADRs) | This document + ADRs in GitHub wiki |
| Secure HTTPS access | Nginx reverse proxy terminates SSL; internal services not exposed |
6. Building Block View
This section describes the static decomposition of the YOVI system into building blocks. Following arc42 guidelines, we present a hierarchical view:
- Level 1: Overview of top-level building blocks with responsibilities and key interfaces.
- Level 2: White box descriptions of selected building blocks showing their internal structure.
This abstraction allows understanding the system structure without going deep into implementation details.
6.1. Level 1: Overall System White Box
The YOVI system consists of six top-level building blocks: the Nginx reverse proxy, the API gateway, three services, and an external database.
6.1.1. Black Box Descriptions
Nginx Reverse Proxy
- Purpose: Public-facing entry point for HTTPS requests. Handles SSL termination and HTTP → HTTPS redirects, and forwards traffic to the API gateway.
- Provided: Public Web UI endpoint (HTTPS), Public Bot API endpoint (HTTPS)
- Required: API Gateway (internal HTTP)
- Quality/Performance: Stateless, scalable horizontally, must handle all external traffic
- Location: /gateway/nginx/
- Fulfilled Requirements: Public deployment, secure HTTPS
- Open Issues: Certificate renewal, load balancing
API Gateway
- Purpose: Internal routing and orchestration between services. Receives traffic from reverse proxy.
- Provided: Routes for Webapp, Users, Gamey
- Required: Webapp, Users, Gamey (internal HTTP)
- Quality/Performance: Stateless, scalable horizontally
- Location: /gateway/
- Open Issues: None (internal use only)
Web Frontend Service
- Purpose: Provides the user interface for humans. Handles game rendering, user interaction, backend calls.
- Provided: Web UI (HTML/CSS/JS)
- Required: User Service API, Game Engine API
- Quality/Performance: Responsive (<100ms), supports i18n, stateless
- Location: /webapp/
- Open Issues: Real-time updates for future multiplayer features
User Service
- Purpose: Manages user data: registration, authentication, profiles, match history, stats.
- Provided: REST API for registration, login, profile, stats
- Required: MongoDB
- Quality/Performance: Response <200ms, reliable persistence
- Location: /users/
- Open Issues: Token expiration & refresh strategy
Game Engine Service
- Purpose: Game logic, move validation, win conditions, AI move calculation.
- Provided: REST API (/api/game/*) using YEN notation
- Required: None
- Quality/Performance: Low latency (<500ms), high correctness
- Location: /gamey/
- Open Issues: Performance optimization for large boards
MongoDB Database
- Purpose: Persist user data including profiles, credentials, match history, stats.
- Provided: MongoDB Wire Protocol
- Required: None
- Quality/Performance: Durable, supports concurrent connections
- Location: Managed via Docker
- Open Issues: Backup and recovery strategy
6.2. Level 2: Building Block Internals
6.2.1. Web Frontend Service (webapp/)
- Component-based architecture: UI Components, State Management, API Client, Routing, i18n module.
- Relationships:
  - UI Components → State Management, Routing, i18n
  - State Management → API Client
  - API Client → External Services
- Key building blocks:
  - GameBoard, RegisterForm, UserProfile
6.2.2. Game Engine Service (gamey/)
- Layered architecture: HTTP Server, YEN Parser, Game Core, AI Strategies, Board Model
- Relationships:
  - HTTP Server → Parser → Core → Board & AI
- Strategies:
  - Random (implemented), Heuristic (under discussion), Minimax (under discussion)
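The pluggable-strategy idea can be illustrated with a minimal sketch. The engine itself is written in Rust; TypeScript is used here only for illustration, and every name in it (`BotStrategy`, `RandomStrategy`, `FirstEmptyStrategy`, `botMove`, the `"."` empty-cell marker) is an assumption rather than the project's actual API.

```typescript
// Hypothetical board model: "." marks an empty cell (an assumption).
type Cell = "R" | "B" | ".";
type Board = Cell[][];

interface Move {
  row: number;
  col: number;
}

// Every strategy exposes the same interface, so the game core never needs
// to know which one is active.
interface BotStrategy {
  readonly name: string;
  chooseMove(board: Board): Move | null;
}

// Collects the coordinates of all empty cells, row by row.
function emptyCells(board: Board): Move[] {
  const moves: Move[] = [];
  board.forEach((cells, row) => {
    cells.forEach((cell, col) => {
      if (cell === ".") moves.push({ row, col });
    });
  });
  return moves;
}

// Picks a random empty cell. The random source is injectable so tests can
// make the choice deterministic.
class RandomStrategy implements BotStrategy {
  readonly name = "random";
  private readonly random: () => number;

  constructor(random: () => number = Math.random) {
    this.random = random;
  }

  chooseMove(board: Board): Move | null {
    const moves = emptyCells(board);
    const index = Math.floor(this.random() * moves.length);
    return moves[index] ?? null; // null when the board is full
  }
}

// Placeholder second strategy: always takes the first empty cell.
class FirstEmptyStrategy implements BotStrategy {
  readonly name = "first-empty";

  chooseMove(board: Board): Move | null {
    return emptyCells(board)[0] ?? null;
  }
}

// Registry keyed by strategy name; adding a strategy means adding an entry.
const strategies = new Map<string, BotStrategy>([
  ["random", new RandomStrategy()],
  ["first-empty", new FirstEmptyStrategy()],
]);

function botMove(strategyName: string, board: Board): Move | null {
  const strategy = strategies.get(strategyName);
  if (!strategy) throw new Error(`Unknown strategy: ${strategyName}`);
  return strategy.chooseMove(board);
}
```

A heuristic or minimax strategy would be one more class implementing `BotStrategy` plus one registry entry, leaving `botMove` and the existing strategies untouched.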
6.2.3. User Service (users/)
- Layered architecture for REST APIs: Routes → Controllers → Services → Repositories → Models → MongoDB
- Key endpoints: POST /register, POST /login, GET /profile/:id, GET /:id/stats
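The layering can be sketched for the GET /:id/stats endpoint. This is a simplified illustration, not the service's code: `MatchRecord`, `MatchRepository`, `StatsService`, and `statsController` are hypothetical names, the repository is in-memory instead of MongoDB-backed, and the calls are synchronous for brevity (a real data layer would be asynchronous).

```typescript
// Model: one finished match for a user (hypothetical shape).
interface MatchRecord {
  userId: string;
  won: boolean;
}

// Repository layer: the only layer that knows where data lives.
interface MatchRepository {
  findByUser(userId: string): MatchRecord[];
}

// In-memory stand-in for a MongoDB-backed repository, useful in unit tests.
class InMemoryMatchRepository implements MatchRepository {
  private readonly records: MatchRecord[];

  constructor(records: MatchRecord[] = []) {
    this.records = records;
  }

  findByUser(userId: string): MatchRecord[] {
    return this.records.filter((record) => record.userId === userId);
  }
}

interface UserStats {
  played: number;
  wins: number;
  losses: number;
}

// Service layer: business logic, independent of HTTP and of the database.
class StatsService {
  private readonly repository: MatchRepository;

  constructor(repository: MatchRepository) {
    this.repository = repository;
  }

  getStats(userId: string): UserStats {
    const matches = this.repository.findByUser(userId);
    const wins = matches.filter((match) => match.won).length;
    return { played: matches.length, wins, losses: matches.length - wins };
  }
}

// Controller layer: translates route parameters into a service call and the
// result into an HTTP-style response.
function statsController(service: StatsService, params: { id: string }) {
  return { status: 200, body: service.getStats(params.id) };
}
```

Because the service depends only on the `MatchRepository` interface, tests can pass the in-memory implementation while production wiring passes a database-backed one.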
7. Runtime View
This section describes the dynamic behavior of the YOVI system. Following arc42 guidelines, we present a representative selection of architecturally significant scenarios:
- Critical use cases: User registration, gameplay against bot, bot API usage
- Key interactions: How building blocks cooperate
- Error scenarios: How the system handles failures
- Operation scenarios: System startup and shutdown
These scenarios demonstrate how the building blocks defined in Section 6 collaborate at runtime.
7.1. Runtime Scenario 1: User Registration (In development)
This scenario shows how a new user creates an account in the system. Currently, only username is required, but verification works the same way.
Steps:
- User fills registration form in browser.
- Reverse proxy receives HTTPS request.
- Proxy forwards request to Web Frontend.
- Frontend performs validation.
- Web frontend sends registration request to Proxy → Gateway.
- Gateway forwards to User Service.
- User Service hashes password and checks uniqueness.
- Alternative 1: User exists → 409 Conflict sent back through Gateway → Proxy → Webapp → User.
- Alternative 2: New user inserted into DB, JWT generated. Response flows back through Gateway → Proxy → Webapp → User.
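The User Service step and its two alternatives can be sketched as a single function. This is an illustration under assumptions: `registerUser` and the `RegistrationDeps` collaborators are hypothetical, hashing and token creation are injected rather than implemented, and the 201 status for success is an assumption (the scenario only names 409 for the conflict case).

```typescript
interface RegistrationRequest {
  username: string;
  password: string;
}

interface RegistrationResult {
  status: 201 | 400 | 409;
  body: Record<string, string>;
}

// Hypothetical collaborators, injected so the flow can be tested without a
// database, a hashing library, or a JWT library.
interface RegistrationDeps {
  userExists(username: string): boolean;
  saveUser(username: string, passwordHash: string): void;
  hashPassword(password: string): string;
  issueToken(username: string): string;
}

function registerUser(
  request: RegistrationRequest,
  deps: RegistrationDeps
): RegistrationResult {
  // Server-side validation repeats the frontend check; never trust the client.
  if (request.username.trim() === "") {
    return { status: 400, body: { error: "username is required" } };
  }

  // Alternative 1: the username is taken, so answer 409 Conflict.
  if (deps.userExists(request.username)) {
    return { status: 409, body: { error: "username already exists" } };
  }

  // Alternative 2: store the user with a hashed password and return a token.
  deps.saveUser(request.username, deps.hashPassword(request.password));
  return { status: 201, body: { token: deps.issueToken(request.username) } };
}
```

The response object then travels back through Gateway → Proxy → Webapp unchanged; only the User Service decides between the two alternatives.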
7.2. Runtime Scenario 2: Playthrough of a Game between Two Players
Steps summary: all external requests in this scenario go through Proxy → Gateway.
7.3. Runtime Scenario 3: Play Game vs. AI Bot
7.4. Runtime Scenario 4: Error and Exception Handling
In every error scenario, the Nginx reverse proxy is the first hop for external requests.
- Game Engine Unavailable: User request → Proxy → Gateway → Gamey; the error is returned through the Proxy.
- Invalid Move: The request follows the same path, Proxy → Gateway → Gamey.
- Database Connection Lost: Requests from the Webapp still go through Proxy → Gateway → Users → MongoDB.
8. Deployment View
This section describes the technical infrastructure that executes the YOVI system and the mapping of building blocks (from Section 6) to infrastructure elements. Following arc42 guidelines, we present:
- Level 1: Overall deployment environment (production)
- Level 2: Internal structure of key infrastructure nodes
- Environment variations: Development, test, and production differences
The application runs in Docker containers hosted on a virtual machine, combining the portability of containers with the security and isolation features of the VM.
8.1. Level 1: Production Deployment Overview
The YOVI system is deployed as a containerized application running on a cloud virtual machine. All services run in Docker containers, orchestrated with docker-compose.
8.1.1. Infrastructure Elements (Level 1)
| Element | Description | Technology | Purpose |
|---|---|---|---|
| Cloud VM | Virtual machine hosting all containers | Ubuntu 22.04 LTS | Execution environment |
| Docker Engine | Container runtime | Docker v24.0+ | Container orchestration |
| Gateway Container | API Gateway service | Nginx / Express Gateway | Request routing, SSL termination |
| Webapp Container | Frontend service | Node.js + React (static files) | Serve UI to users |
| Users Container | User management service | Node.js + Express | User data & auth |
| Gamey Container | Game engine service | Rust | Game logic & AI |
| MongoDB Container | Database | MongoDB 7.x | Data persistence |
| Prometheus Container | Metrics collection | Prometheus latest | Monitoring |
| Grafana Container | Metrics visualization | Grafana latest | Dashboards |
| GitHub Registry | Container image registry | GitHub Container Registry | Image storage |
8.1.2. Communication Channels (Level 1)
| Channel | Protocol | Source | Destination | Purpose |
|---|---|---|---|---|
| External HTTPS | HTTPS (TLS 1.3) | Internet | Gateway (port 443) | User/bot access |
| Internal HTTP | HTTP (plain) | Gateway | Webapp/Users/Gamey | Service communication |
| MongoDB Wire | MongoDB protocol | Users | MongoDB (27017) | Database access |
| Docker Pull | HTTPS | VM | GitHub Registry | Image download |
| Metrics Scrape | HTTP | Prometheus | All services | Monitoring data |
| Grafana Query | HTTP | Grafana | Prometheus | Dashboard queries |
8.2. Level 2: Internal Structure of Key Nodes
8.2.1. Level 2: Cloud VM Internal Structure
This diagram zooms into the Cloud VM, showing the container organization and internal networks.
8.2.2. Production Environment
| Aspect | Configuration | Justification |
|---|---|---|
| Location | Cloud VM (AWS EC2 / DigitalOcean) | Public accessibility |
| Orchestration | docker-compose + systemd | Simple, sufficient for scale |
| Networking | Public IP + domain | User access |
| Data | Persistent volumes | Data durability |
| Monitoring | Prometheus + Grafana | Observability |
| Backup | Daily automated backups | Disaster recovery |
| SSL | Let’s Encrypt auto-renew | Security |
8.3. Mapping of Building Blocks to Infrastructure
This table maps the building blocks from Section 6 to the infrastructure elements described above.
| Building Block | Deployed As | Environment | Node | Container | Network |
|---|---|---|---|---|---|
| Web Frontend Service | Static files + Node server | Production | Cloud VM | webapp_container | frontend_network |
| User Service | Node.js application | Production | Cloud VM | users_container | backend_network |
| Game Engine Service | Rust binary | Production | Cloud VM | gamey_container | backend_network |
| API Gateway | Nginx configuration | Production | Cloud VM | gateway_container | frontend_network |
| MongoDB Database | MongoDB instance | Production | Cloud VM | mongodb_container | backend_network |
| Prometheus | Prometheus server | Production | Cloud VM | prometheus_container | monitoring_network |
| Grafana | Grafana server | Production | Cloud VM | grafana_container | monitoring_network |
8.4. Deployment Decisions and Rationale
| Decision | Rationale | Alternatives Considered | Trade-offs |
|---|---|---|---|
| Single VM with docker-compose | Simplicity, cost-effective for expected load (10 concurrent users) | Kubernetes (overkill), separate VMs (costly) | Limited horizontal scaling, but simpler operations |
| Separate networks | Security isolation between frontend and backend | Single flat network | More complex config, but better security |
| Monitoring stack included | Required for observability and course evaluation | External monitoring service | Additional resource usage, but integrated |
| Persistent volumes | Data durability across container restarts | Ephemeral storage | Requires backup strategy |
| Let’s Encrypt SSL | Free, automated certificates | Paid certificates, self-signed | Auto-renewal complexity, but free |
8.5. Deployment Scripts and Configuration
The deployment is automated via:
- docker-compose.yml: at repository root. Defines all services.
- deploy.sh: script for production deployment.
- GitHub Actions workflow: CI/CD pipeline for testing and deployment.
8.6. Quality and Performance Features
The chosen deployment provides the following characteristics:
- Isolation: Provided both by Docker's container isolation and by the virtual machine's own security features.
- Portability and Compatibility: Provided by the Docker containers.
- Security: If a container is compromised, the VM acts as an additional security layer.
- Others: Scalability, ease of maintenance, and simple backups.
8.7. Mapping of Building Blocks to Infrastructure
- Webapp: Listens on port 80. It is the service users connect to and interact with: the graphical user interface (GUI), and therefore the frontend of the application.
- Gateway: Listens on port 8080 and acts as a bridge between the other building blocks of the application.
- Users: Handles user management and authentication. It listens on port 3000.
- Gamey: Written in Rust, this service contains the complete functionality of Game Y. It listens on port 4000.
- Database: An external block that stores all application data (user data, game data, and so on).
Prometheus, Loki, and Grafana are additional external blocks used to monitor the application and collect metrics about its usage.
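To make the gateway's bridging role concrete, the sketch below resolves a request path to an internal upstream using the ports listed above. It is illustrative only: the host names (`webapp`, `users`, `gamey`), the `/api/users` prefix, and the function `resolveUpstream` are assumptions; only the `/api/game` prefix and the port numbers come from this document.

```typescript
interface Route {
  prefix: string;
  upstream: string;
}

// Ports follow the list above; host names and the /api/users prefix are
// assumptions made for this sketch.
const routes: Route[] = [
  { prefix: "/api/game", upstream: "http://gamey:4000" },
  { prefix: "/api/users", upstream: "http://users:3000" },
  { prefix: "/", upstream: "http://webapp:80" },
];

// True when `path` equals the prefix or continues it at a "/" boundary, so
// "/api/game" matches "/api/game/move" but not "/api/gamers".
function matchesPrefix(path: string, prefix: string): boolean {
  const withSlash = prefix.endsWith("/") ? prefix : prefix + "/";
  return path === prefix || path.startsWith(withSlash);
}

// Longest matching prefix wins, so "/" acts as the catch-all for the web UI.
function resolveUpstream(path: string, table: Route[] = routes): string | null {
  const match = table
    .filter((route) => matchesPrefix(path, route.prefix))
    .sort((a, b) => b.prefix.length - a.prefix.length)[0];
  return match ? match.upstream + path : null;
}
```

For example, `resolveUpstream("/api/game/move")` yields the Gamey upstream, while any path without a more specific match falls through to the webapp.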
9. Cross-cutting Concepts
9.1. Game State Representation
All game state across the system is represented using YEN notation, as mandated by the client.
    {
      "size": 11,
      "turn": "R",
      "board": {
        "cells": [
          ["R", "B", "R", "B"]
        ]
      }
    }
Usage:
- gamey/ service: Parses and validates YEN, returns moves in YEN
- webapp/: Converts UI interactions to YEN for API calls
- Public API: Accepts and returns YEN exclusively
Rationale: Standardization ensures interoperability with external bots and consistency across services.
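A consumer of this format might validate incoming messages before using them. The sketch below is an assumption-laden illustration: the `YenState` type and `parseYen` function are hypothetical, and the shape is inferred solely from the example above (a numeric `size`, a `turn` of "R" or "B", and `board.cells` as rows of strings), not from the YEN specification.

```typescript
type Player = "R" | "B";

// Shape inferred from the example message only; the real YEN notation may
// define more fields or stricter rules.
interface YenState {
  size: number;
  turn: Player;
  board: { cells: string[][] };
}

function isStringMatrix(value: unknown): value is string[][] {
  return (
    Array.isArray(value) &&
    value.every(
      (row) => Array.isArray(row) && row.every((cell) => typeof cell === "string")
    )
  );
}

// Parses a JSON string and checks the fields shown in the example.
// Throws an Error describing the first problem found.
function parseYen(json: string): YenState {
  let data: unknown;
  try {
    data = JSON.parse(json);
  } catch {
    throw new Error("malformed JSON");
  }
  if (typeof data !== "object" || data === null) {
    throw new Error("YEN message must be a JSON object");
  }

  const { size, turn, board } = data as Record<string, unknown>;
  if (typeof size !== "number" || !Number.isInteger(size) || size <= 0) {
    throw new Error("size must be a positive integer");
  }
  if (turn !== "R" && turn !== "B") {
    throw new Error('turn must be "R" or "B"');
  }
  const cells = (board as { cells?: unknown } | null | undefined)?.cells;
  if (!isStringMatrix(cells)) {
    throw new Error("board.cells must be an array of string arrays");
  }

  return { size, turn: turn as Player, board: { cells } };
}
```

A service receiving malformed input would typically translate the thrown error into a 400 Bad Request response.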
9.2. Internationalization
For accessibility, the application supports English and Spanish, with a design that allows easy addition of new languages.
Implementation:
- All UI strings externalized to JSON files
- Language selection in user preferences
- Optional URL-based language: /en/play, /es/play
// i18n file structure
/locales/
  en/
    common.json
    game.json
    user.json
  es/
    common.json
    game.json
    user.json
Fallback: English as default if translation is missing
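The fallback rule can be sketched as a small lookup function. This is an assumption-laden illustration rather than the project's i18n code: the catalogue is held in memory instead of being loaded from the JSON files, and the keys, texts and the `translate` helper are made up for the example.

```typescript
// Illustrative in-memory catalogue; in the project these entries would come
// from the JSON files under /locales/. Keys and texts are hypothetical.
type YoviLocale = "en" | "es";
type Catalogue = Record<YoviLocale, Record<string, string>>;

const catalogue: Catalogue = {
  en: { "game.start": "Start game", "game.resign": "Resign" },
  es: { "game.start": "Iniciar partida" }, // "game.resign" intentionally missing
};

const DEFAULT_LOCALE: YoviLocale = "en";

// Returns the string for `key` in `locale`, falling back to English and
// finally to the key itself so that a missing entry never breaks the UI.
function translate(messages: Catalogue, locale: YoviLocale, key: string): string {
  return messages[locale][key] ?? messages[DEFAULT_LOCALE][key] ?? key;
}
```

With this catalogue, `translate(catalogue, "es", "game.resign")` yields the English text because the Spanish entry is absent.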
9.3. Network Flow (Proxy → Gateway → Services)
External client requests flow through the reverse proxy and the API gateway before reaching internal services:
-
External Clients → Reverse Proxy → API Gateway
-
API Gateway forwards requests to Webapp and Users
-
Webapp communicates with the Game Engine
The flow can be depicted as follows:
       External Clients
              |
              v
      +----------------+
      | Reverse Proxy  |
      +----------------+
              |
              v
      +----------------+
      |  API Gateway   |
      +----------------+
         |          |
         v          v
  +-----------+ +-----------+
  |  Webapp   | |   Users   |
  +-----------+ +-----------+
         |
         v
    Game Engine
9.4. Secrets Management
Secrets are handled differently depending on environment:
-
Development: .env files (gitignored)
-
CI/CD: GitHub Secrets for all credentials
-
Production: Environment variables in docker-compose (never committed)
Example secrets: JWT_SECRET, MONGODB_URI, GITHUB_TOKEN
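One way a service could consume these values is to read them once at startup and fail fast when any is absent. The sketch below is hypothetical: the `loadSecrets` helper is not part of the documented code base, and it assumes that a backend service needs `JWT_SECRET` and `MONGODB_URI` at runtime. The environment is passed in as a plain object so the function stays testable; in Node.js the caller would pass `process.env`.

```typescript
// Secret names taken from the list above; the loader itself is hypothetical.
const REQUIRED_SECRETS = ["JWT_SECRET", "MONGODB_URI"] as const;
type SecretName = (typeof REQUIRED_SECRETS)[number];
type Secrets = Record<SecretName, string>;

// Reads the required secrets from an environment-like object and fails fast.
// The error names every missing variable but never echoes a secret value.
// Empty strings are treated as missing.
function loadSecrets(env: Record<string, string | undefined>): Secrets {
  const missing = REQUIRED_SECRETS.filter((name) => !env[name]);
  if (missing.length > 0) {
    throw new Error(`Missing required secrets: ${missing.join(", ")}`);
  }
  return {
    JWT_SECRET: env["JWT_SECRET"] as string,
    MONGODB_URI: env["MONGODB_URI"] as string,
  };
}
```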
9.5. Architecture and Design Patterns
Key design patterns applied across services:
-
Strategy Pattern: Used for the bot AI strategies; allows extensibility (a new strategy is added as a new implementation of the common interface).
-
Repository Pattern: Applied in the users service for data access; improves testability (mock repositories).
-
Gateway/Router: API Gateway implementation; ensures security, routing, and monitoring (internal traffic behind proxy).
-
Observer/Reactive: UI updates via React state; ensures consistent UI across state changes.
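To make the testability argument for the Repository Pattern concrete, here is a minimal sketch. All names (`UserRecord`, `UserRepository`, `InMemoryUserRepository`, `registerUser`) are hypothetical, since the actual users service code is not shown in this document. The point is only that business logic depends on an interface, so a test can substitute an in-memory implementation for the MongoDB-backed one.

```typescript
// Hypothetical record; the real user schema may contain more fields.
interface UserRecord {
  username: string;
  passwordHash: string;
}

// The service logic depends on this interface, not on the database driver.
interface UserRepository {
  findByUsername(username: string): Promise<UserRecord | null>;
  save(user: UserRecord): Promise<void>;
}

// In-memory implementation used as a test double.
class InMemoryUserRepository implements UserRepository {
  private readonly users = new Map<string, UserRecord>();

  async findByUsername(username: string): Promise<UserRecord | null> {
    return this.users.get(username) ?? null;
  }

  async save(user: UserRecord): Promise<void> {
    this.users.set(user.username, user);
  }
}

// Business rule: a username can only be registered once.
async function registerUser(
  repo: UserRepository,
  user: UserRecord,
): Promise<"created" | "conflict"> {
  if ((await repo.findByUsername(user.username)) !== null) return "conflict";
  await repo.save(user);
  return "created";
}
```

A production implementation of `UserRepository` would wrap the database access; `registerUser` would not need to change.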
9.6. Error Handling Strategy
All services implement consistent error handling, including requests coming via the reverse proxy:
-
400 Bad Request: Invalid input format, e.g., malformed YEN
-
401 Unauthorized: Missing or invalid token (no JWT)
-
403 Forbidden: Valid token but insufficient permissions
-
404 Not Found: Resource does not exist (invalid game ID)
-
409 Conflict: Business rule violation, e.g., username already exists
-
500 Internal Error: Unexpected server error, e.g., database connection lost
-
503 Unavailable: Service temporarily down, e.g., game engine unreachable (proxied)
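The mapping above could be centralized in one helper shared by the services. The following is a sketch under assumptions: the error categories (`ServiceErrorKind`) and the helper `toErrorBody` are hypothetical names, and only the status codes come from the list above.

```typescript
// Hypothetical error categories; the status codes are the ones listed above.
type ServiceErrorKind =
  | "invalid_input"
  | "unauthenticated"
  | "forbidden"
  | "not_found"
  | "conflict"
  | "upstream_unavailable"
  | "unexpected";

const STATUS_BY_KIND: Record<ServiceErrorKind, number> = {
  invalid_input: 400,
  unauthenticated: 401,
  forbidden: 403,
  not_found: 404,
  conflict: 409,
  upstream_unavailable: 503,
  unexpected: 500,
};

interface ServiceError {
  kind: ServiceErrorKind;
  message: string;
}

interface ErrorBody {
  status: number;
  error: string;
}

// Type guard: accepts only objects whose `kind` is one of the known categories.
function isServiceError(err: unknown): err is ServiceError {
  if (typeof err !== "object" || err === null) return false;
  const candidate = err as { kind?: unknown; message?: unknown };
  return (
    typeof candidate.kind === "string" &&
    Object.prototype.hasOwnProperty.call(STATUS_BY_KIND, candidate.kind) &&
    typeof candidate.message === "string"
  );
}

// Known errors keep their message; anything else becomes a generic 500 so
// that internal details (stack traces, driver messages) are not leaked.
function toErrorBody(err: unknown): ErrorBody {
  if (isServiceError(err)) {
    return { status: STATUS_BY_KIND[err.kind], error: err.message };
  }
  return { status: 500, error: "Internal server error" };
}
```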
9.7. Code Organization
All services follow a consistent structure:
/service-name/
  src/                        # Production code
  test/                       # Unit tests
  integration/                # Integration tests
  package.json or Cargo.toml
  Dockerfile
  README.md
9.8. Testing Strategy
The project follows a comprehensive testing strategy to ensure correctness, reliability, and quality across all system components. Different types of tests are implemented, including unit tests, end-to-end tests, and load testing.
9.8.1. Unit Tests
Unit tests are implemented for individual services to validate core logic in isolation. These tests focus on:
-
Business logic in backend services
-
API endpoint behavior
-
Game logic validation in the game engine
Code coverage is measured and reported as part of the continuous integration process, ensuring that critical components are adequately tested.
9.8.2. End-to-End Tests (E2E)
End-to-End tests are implemented using Playwright together with Cucumber for behavior-driven development (BDD). These tests validate the system from the user’s perspective, ensuring that all components work together correctly.
The E2E tests are executed against the deployed application and simulate real user interactions through the browser.
Tested Flows
The following key user flows are covered:
- Authentication flow
  - User registration
  - User login (valid and invalid credentials)
  - Logout functionality
- Game flow
  - Navigation from home to game configuration
  - Selection of bot and board size
  - Starting a new game
  - Interaction with the game board (selecting cells, sending moves)
  - Returning to home
- Bot interaction
  - Playing against AI bots
  - Verifying game state updates after moves
  - Ensuring correct rendering of game elements
- Navigation and UI flow
  - Access to statistics page
  - Navigation between pages (home, game, stats)
  - UI validation (buttons, selectors, labels)
Architecture Coverage
A key aspect of the E2E tests is that they validate the full system integration, covering the entire request flow:
User → Reverse Proxy → API Gateway → Microservices → Response
Specifically:
- Requests are sent to the public URL (HTTPS endpoint)
- The reverse proxy (Nginx) handles routing and SSL termination
- Requests are forwarded to the API Gateway
- The gateway routes requests to the appropriate services:
  - Users service
  - Game service
  - Authentication service
- Responses propagate back through the same path to the user interface
This ensures that all infrastructure components (networking, routing, and service communication) are tested as part of the system.
9.8.3. Load Testing
Load testing is performed to evaluate system performance under concurrent usage.
The tests simulate multiple users interacting with the system simultaneously, focusing on:
-
Concurrent game sessions
-
API response times under load
-
System stability under stress
Key metrics analyzed include:
-
Average response time
-
Throughput (requests per second)
-
Error rate under load
These tests help identify performance bottlenecks and validate that the system meets performance requirements.
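The three metrics can be derived from the raw per-request samples that a load-testing tool records. The sketch below assumes a hypothetical sample shape (`RequestSample`) and a helper `summarizeLoad`; it is not tied to any particular load-testing tool.

```typescript
// One sample per request issued during the load test (hypothetical shape).
interface RequestSample {
  durationMs: number;
  ok: boolean;
}

interface LoadSummary {
  averageMs: number;
  throughputRps: number;
  errorRate: number;
}

// Summarizes a run that lasted `elapsedSeconds` of wall-clock time.
// An empty run or a non-positive duration yields an all-zero summary.
function summarizeLoad(
  samples: readonly RequestSample[],
  elapsedSeconds: number,
): LoadSummary {
  if (samples.length === 0 || elapsedSeconds <= 0) {
    return { averageMs: 0, throughputRps: 0, errorRate: 0 };
  }
  const totalMs = samples.reduce((sum, s) => sum + s.durationMs, 0);
  const failures = samples.filter((s) => !s.ok).length;
  return {
    averageMs: totalMs / samples.length,
    throughputRps: samples.length / elapsedSeconds,
    errorRate: failures / samples.length,
  };
}
```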
9.8.4. Continuous Integration
All tests are integrated into the CI pipeline, ensuring that:
-
Tests run automatically on pull requests (unit tests) and on releases (unit, E2E, and load tests)
-
Failures are detected early in the development process
-
Code quality and stability are maintained over time
9.8.5. Summary
This multi-layered testing strategy ensures:
-
Correctness of individual components (unit tests)
-
Proper integration of all services (E2E tests)
-
System performance and scalability (load tests)
Together, these tests contribute to the overall reliability, maintainability, and quality of the system.
10. Architecture Decisions
This section documents the most significant architectural decisions made during the design and development of the YOVI platform. Each decision follows the ADR (Architecture Decision Record) format to provide context, rationale, and consequences. These decisions complement the solution strategy outlined in Section 4, providing deeper rationale for key choices.
10.1. ADR-001: Game Modes Available to the User
-
Status: Implemented
-
Date: 2026-01-15
-
Deciders: Development Team
Context: The platform needs to offer users different ways to play Game Y. The initial requirements specify player-versus-machine mode, but the architecture should be extensible to support additional modes in the future (multiplayer, different variants, etc.).
Decision: We will design the game mode system with extensibility as a core principle. The architecture will separate game mode logic from core game rules, allowing new modes to be added by:
-
Defining new mode types in the webapp UI
-
Extending the game engine to support mode-specific rules
-
Keeping mode state separate from core game state
Alternatives Considered:
-
Hard-code only PvM mode: Rejected - would require major refactoring for future modes
-
Configuration-driven modes: Considered but deferred - adds complexity for current requirements
-
Plugin architecture: Rejected - overkill for project scope
Consequences:
-
Positive: New game modes can be added without modifying core game logic
-
Positive: Clear separation between "how game is played" and "game rules"
-
Positive: Meets current requirements while preparing for future
-
Negative: Slightly more complex initial implementation
-
Negative: Requires careful design to avoid over-engineering
-
Mitigation: Focus on PvM mode first, ensure extension points are clean
10.2. ADR-002: MongoDB for User Data Persistence
-
Status: Implemented
-
Date: 2026-01-20
-
Deciders: Development Team
Context: The users service needs to persist user profiles, authentication data, and match history. The team needed to choose a database technology that balances development speed with future flexibility.
Decision: We will use MongoDB for user data persistence. The schema will be designed to accommodate:
-
User profiles (username, password hash, email)
-
Authentication data (tokens, sessions)
-
Match history (games played, results, moves)
Alternatives Considered:
-
PostgreSQL: Rejected - schema changes for evolving requirements would be slower
-
MySQL: Rejected - similar concerns
-
In-memory storage: Rejected - data must persist
-
SQLite: Rejected - not suitable for concurrent access
Consequences:
-
Positive: Flexible schema allows easy addition of user fields
-
Positive: Excellent integration with Node.js via Mongoose ODM
-
Positive: Fast prototyping and iteration on data models
-
Negative: No built-in relations (handled in application code)
-
Negative: Eventual consistency (not critical for this domain)
-
Mitigation: Use atomic operations for critical updates (e.g., game statistics)
10.3. ADR-003: Three-Service Microservices Architecture
-
Status: Implemented
-
Date: 2026-01-15
-
Deciders: Development Team
Context: The lab assignment requires a web application in TypeScript and a game engine in Rust. These technologies cannot run in the same process. Additionally, user management (registration, history) is a distinct responsibility that could be separated.
Decision:
We will implement a microservices architecture with three independent services:
-
webapp/: TypeScript/React frontend
-
users/: Node.js/Express user management service
-
gamey/: Rust game engine service
All services will communicate via REST APIs and be orchestrated with an API gateway.
Consequences:
-
Positive: Clear separation of concerns
-
Positive: Each service can be developed and tested independently
-
Positive: Technology-specific optimizations possible
-
Negative: Increased operational complexity (multiple services to deploy)
-
Negative: Network latency between services
-
Mitigation: docker-compose simplifies local development and deployment
10.4. ADR-004: API Gateway as Single Entry Point
-
Status: Implemented
-
Date: 2026-01-20
-
Deciders: Development Team
Context: With three independent services, external clients (human users and bots) would need to know multiple endpoints. This increases complexity and exposes internal service topology.
Decision:
We will implement an API Gateway (gateway/) as the single entry point for all external requests. The gateway will:
-
Route / → webapp service (static files)
-
Route /users/* → users service
-
Route /game/* → gamey service
-
Handle SSL termination
-
Provide a security layer
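The path-based routing rules can be expressed as a small prefix table. This is a sketch, not the gateway's actual implementation: the hostnames are hypothetical Docker service names, the ports are the ones listed in Section 8.7, and `resolveUpstream` is an illustrative helper.

```typescript
// Hostnames are hypothetical Docker service names; ports follow Section 8.7.
interface RouteRule {
  prefix: string;
  upstream: string;
}

const API_ROUTE_RULES: RouteRule[] = [
  { prefix: "/users/", upstream: "http://users:3000" },
  { prefix: "/game/", upstream: "http://gamey:4000" },
];

// Everything that is not an API route is served by the webapp (static files).
const DEFAULT_UPSTREAM = "http://webapp:80";

// Returns the upstream base URL for a request path (first matching prefix wins).
function resolveUpstream(path: string): string {
  const normalized = path.startsWith("/") ? path : `/${path}`;
  for (const rule of API_ROUTE_RULES) {
    if (normalized.startsWith(rule.prefix)) return rule.upstream;
  }
  return DEFAULT_UPSTREAM;
}
```

Matching on the prefix including the trailing slash prevents a path such as `/usersfoo` from being sent to the users service by accident.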
Alternatives Considered:
-
Direct client-to-service communication: Rejected - exposes internal services, complicates client logic
-
Load balancer only: Rejected - doesn’t provide routing based on paths
-
No gateway: Rejected - security concerns
Consequences:
-
Positive: Single URL for all clients simplifies access
-
Positive: Internal services remain hidden from external clients
-
Positive: Centralized SSL and security policies
-
Negative: Additional component to deploy and maintain
-
Negative: Potential performance bottleneck
-
Mitigation: Gateway is stateless and can be scaled horizontally
10.5. ADR-005: Strategy Pattern for AI Bot Behaviors
-
Status: In Progress
-
Date: 2026-01-25
-
Deciders: Development Team
Context: The system must support multiple AI strategies and difficulty levels. New strategies may be added in the future without modifying existing code. The requirement states "more than one strategy" must be available.
Decision:
We will implement the Strategy Pattern in the gamey service. Each AI strategy will be a separate module implementing a common trait (BotStrategy). The game engine will select the appropriate strategy based on user input. Initial strategies will include:
-
Random move (implemented)
-
Heuristic-based (in discussion)
-
Minimax (in discussion)
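The shape of this decision can be sketched as follows. The example is written in TypeScript purely for illustration; in gamey the equivalent would be a Rust trait (BotStrategy) with one implementing module per strategy. The cell model (a list of empty cell indices), `FirstEmptyStrategy`, and `selectStrategy` are hypothetical and only serve to show that adding a strategy does not touch existing ones.

```typescript
// Illustration of the pattern only; not the gamey/ implementation.
type CellIndex = number;

interface BotStrategy {
  readonly name: string;
  // Returns the chosen cell, or null when no move is possible.
  chooseMove(emptyCells: readonly CellIndex[]): CellIndex | null;
}

// Picks a uniformly random empty cell; the random source is injectable so
// that tests can be deterministic.
class RandomStrategy implements BotStrategy {
  readonly name = "random";
  private readonly random: () => number;

  constructor(random: () => number = Math.random) {
    this.random = random;
  }

  chooseMove(emptyCells: readonly CellIndex[]): CellIndex | null {
    if (emptyCells.length === 0) return null;
    const index = Math.floor(this.random() * emptyCells.length);
    return emptyCells[index] ?? null;
  }
}

// Placeholder second strategy: always takes the first empty cell.
class FirstEmptyStrategy implements BotStrategy {
  readonly name = "first-empty";

  chooseMove(emptyCells: readonly CellIndex[]): CellIndex | null {
    return emptyCells[0] ?? null;
  }
}

// Registry keyed by the strategy name chosen by the user.
const availableStrategies: BotStrategy[] = [new RandomStrategy(), new FirstEmptyStrategy()];
const STRATEGY_REGISTRY = new Map<string, BotStrategy>();
for (const strategy of availableStrategies) {
  STRATEGY_REGISTRY.set(strategy.name, strategy);
}

function selectStrategy(strategyName: string): BotStrategy {
  const strategy = STRATEGY_REGISTRY.get(strategyName);
  if (!strategy) throw new Error(`Unknown strategy: ${strategyName}`);
  return strategy;
}
```

Adding a heuristic or Minimax strategy would mean writing one more implementation and registering it; `selectStrategy` and the existing strategies stay unchanged.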
Alternatives Considered:
-
Conditional logic (if/else or switch): Rejected - violates Open/Closed principle, hard to extend
-
Configuration flags with hard-coded algorithms: Rejected - similar concerns
-
Machine learning model: Rejected - overkill for requirements, unpredictable performance
Consequences:
-
Positive: New strategies can be added without modifying core game logic
-
Positive: Each strategy can be tested independently
-
Positive: Clear separation of concerns
-
Negative: Slight indirection overhead (negligible)
-
Negative: Requires understanding of trait/interface design
-
Mitigation: Well-documented strategy trait and example implementations
10.6. ADR-006: Reverse Proxy and HTTPS Termination
-
Status: Implemented
-
Date: 2026-04-06
-
Deciders: Development Team
Context: The platform must be accessible securely over HTTPS and expose multiple internal services through a single public endpoint. Additionally, SSL/TLS certificate management should be centralized to avoid duplication across services.
Decision: We introduce a reverse proxy using Nginx as the public entry point of the system. The reverse proxy is responsible for:
-
Redirecting all HTTP traffic (port 80) to HTTPS
-
Handling SSL/TLS termination using Let’s Encrypt certificates
-
Routing incoming requests:
-
/ → webapp service
-
/api/* → API Gateway
This approach ensures that all internal services communicate over HTTP, while external communication is secured via HTTPS.
Alternatives Considered:
-
Handling HTTPS in each service: Rejected - duplicates configuration and increases complexity
-
Letting the API Gateway manage HTTPS: Rejected - mixes routing and infrastructure concerns
-
No HTTPS: Rejected - insecure and not suitable for production environments
Consequences:
-
Positive: Centralized SSL/TLS management simplifies configuration
-
Positive: Improved security by enforcing HTTPS
-
Positive: Clean separation between infrastructure (proxy) and application logic
-
Positive: Single public endpoint for all services
-
Negative: Additional component to deploy and maintain
-
Negative: Slight latency introduced by proxy layer
-
Mitigation: Nginx is lightweight and widely used in production environments
10.7. ADR-007: Monitoring and Observability Stack
-
Status: Implemented
-
Date: 2026-04-20
-
Deciders: Development Team
Context: As the system evolved into a distributed microservices architecture, it became necessary to ensure visibility into system behavior, performance, and failures. Without proper monitoring and logging, diagnosing issues across multiple services would be complex and time-consuming. Additionally, the evaluation criteria require support for availability and observability.
Decision: We implemented a complete monitoring and observability stack composed of:
-
Prometheus for metrics collection
-
Grafana for metrics visualization and dashboards
-
Loki for centralized log aggregation
-
Promtail for log collection from Docker containers
All monitoring components are deployed within the same Docker Compose environment and connected through the shared internal network.
The system collects and monitors:
-
Resource usage (CPU, memory)
-
Request rates and response times
-
Error rates and failed requests
-
Service availability
Logs from the user and authentication services are centralized and can be queried and visualized through Grafana, enabling correlation between logs and metrics.
Consequences:
-
Positive: Improved visibility into system performance and behavior
-
Positive: Faster debugging through centralized logs
-
Positive: Real-time monitoring via Grafana dashboards
-
Positive: Supports availability and reliability quality attributes
-
Negative: Additional infrastructure components to maintain
-
Negative: Slight increase in resource consumption
-
Mitigation: Lightweight tools (Prometheus, Loki) and containerized deployment minimize overhead
10.8. Summary of Architectural Decisions
ID | Decision | Key Rationale | Status
ADR-001 | Game Modes | Extensibility for future modes | Implemented
ADR-002 | MongoDB for User Data | Schema flexibility, Node.js integration | Implemented
ADR-003 | Three-Service Architecture | Technology constraints (TypeScript + Rust) | Implemented
ADR-004 | API Gateway | Security, single entry point | Implemented
ADR-005 | Strategy Pattern for AI | Extensibility for multiple strategies | In Progress
ADR-006 | Reverse Proxy + HTTPS | Security and centralized routing | Implemented
ADR-007 | Monitoring and Observability Stack | Observability | Implemented
11. Quality Requirements
This section details the quality requirements for the YOVI system. While Section 1.2 introduced the high-level quality goals, this section makes them concrete, measurable, and testable through scenarios and descriptive metrics.
Quality requirements are critical because they influence architectural decisions and determine whether stakeholders consider the system a success.
11.1. Quality Tree
The quality tree organizes requirements hierarchically, following the ATAM (Architecture Tradeoff Analysis Method) approach. "Quality" is decomposed into categories and subcategories, each linked to measurable scenarios:
- Performance
  - Response Time → QS-01: Move calculation < 2 seconds
  - Throughput
  - Resource Usage
- Usability
  - Learnability → QS-02: New user starts without help
  - Operability
  - Accessibility (i18n) → QS-07: EN + ES supported
- Maintainability
  - Modularity
  - Testability → QS-10: Coverage > 80%
  - Extensibility → QS-03: New feature < 2 days
- Reliability
  - Availability → QS-05: 10+ concurrent users
  - Fault Tolerance → QS-04: Crash recovery < 30 sec
  - Recoverability
- Portability
  - Browser Support → QS-06: Chrome, Firefox, Edge
  - Platform Independence
- Security
  - Authentication → QS-08: Unauthorized access blocked
  - Data Protection → QS-09: Zero data loss
11.2. Quality Scenarios
11.2.1. QS-01: Move Calculation Performance
Measures the game engine response time for calculating a bot move.
-
Stimulus: Human player makes a move vs. bot
-
Source: Human player
-
Environment: Normal operation, board size 11x11
-
Artifact: Game Engine Service (gamey/)
-
Response: System calculates and returns bot move
-
Response Measure: 95% of moves < 2 seconds
-
Priority: High
-
Verification: Automated performance tests with timing metrics
Flow: Human Player → Web Frontend → API Gateway → Game Engine → returns move → Web Frontend → User
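The response measure "95% of moves < 2 seconds" can be checked from a list of measured move times. The sketch below uses the nearest-rank percentile definition, which is one of several common definitions and is an assumption here; `percentile` and `meetsMoveBudget` are hypothetical helper names.

```typescript
// Nearest-rank percentile: the smallest sample such that at least p percent
// of all samples are less than or equal to it.
function percentile(samplesMs: readonly number[], p: number): number {
  if (samplesMs.length === 0) throw new Error("no samples");
  if (p <= 0 || p > 100) throw new Error("p must be in (0, 100]");
  const sorted = samplesMs.slice().sort((a, b) => a - b);
  // Integer arithmetic first, to avoid floating-point surprises in the rank.
  const rank = Math.ceil((p * sorted.length) / 100);
  return sorted[rank - 1] as number;
}

const MOVE_BUDGET_MS = 2000; // "< 2 seconds" from the response measure above

// QS-01 holds for a run when the 95th percentile is under the budget.
function meetsMoveBudget(samplesMs: readonly number[]): boolean {
  return percentile(samplesMs, 95) < MOVE_BUDGET_MS;
}
```

With the nearest-rank definition, the 95th percentile is below the budget exactly when at least 95% of the samples are below it, which matches the wording of the response measure.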
11.2.2. QS-02: New User Learnability
Tests whether a new user can start playing without external instructions.
-
Stimulus: New user wants to play
-
Source: First-time player
-
Environment: User has never seen Game Y
-
Artifact: Web Frontend (webapp/)
-
Response: User starts a valid game without help
-
Response Measure: 80% complete first move within 2 minutes
-
Priority: High
-
Verification: Usability testing with 5 new users
Flow: New User → Web Frontend → explore UI → make first move → UI updates
11.2.3. QS-03: Feature Extensibility
Measures how easily a new game variant can be added.
-
Stimulus: Developer adds new variant (e.g., Hex)
-
Source: Development team
-
Environment: Development environment, existing codebase
-
Artifact: Game Engine Service (gamey/)
-
Response: New variant added with minimal changes
-
Response Measure: Implementation < 2 developer-days, changes only at extension points
-
Priority: Medium
-
Verification: Code review, time tracking, demonstration
Flow: Developer → Codebase → implement variant → run tests → all pass
11.2.4. QS-04: Crash Recovery (Fault Tolerance)
Tests the system’s ability to recover from a service failure.
-
Stimulus: Game Engine crashes
-
Source: Internal failure
-
Environment: Production, active game
-
Artifact: Game Engine Service (gamey/)
-
Response: System detects failure, restarts service, continues game
-
Response Measure: Downtime < 30 seconds, game state recoverable
-
Priority: High
-
Verification: Chaos testing: kill container, measure recovery time
Flow: User → Web Frontend → API Gateway → Game Engine crash → detect & restart → User continues
11.2.5. QS-05: Concurrent Users
Tests system performance under load.
-
Stimulus: Multiple users play simultaneously
-
Source: Human players
-
Environment: Production deployment
-
Artifact: All services
-
Response: All users experience normal response times
-
Response Measure: Support 10+ concurrent users, response degradation < 50%
-
Priority: Medium
-
Verification: Load testing with 10 concurrent sessions
Flow: Users 1–10 → System → Responses → Users
11.2.6. QS-06: Browser Compatibility
Tests frontend across major browsers.
-
Stimulus: User accesses game from different browsers
-
Source: Human player
-
Environment: Latest Chrome, Firefox, Edge
-
Artifact: Web Frontend (webapp/)
-
Response: Game renders correctly, fully playable
-
Response Measure: Core functionality works in Chrome, Firefox, Edge
-
Priority: Medium
-
Verification: CI cross-browser testing, manual verification
Flow: User → Chrome/FF/Edge → Web Frontend → Game renders
11.2.7. QS-07: Internationalization
Tests language switching functionality.
-
Stimulus: User selects Spanish language
-
Source: Spanish-speaking player
-
Environment: Normal operation
-
Artifact: Web Frontend (webapp/)
-
Response: All UI text in Spanish
-
Response Measure: 100% of strings translated to EN and ES
-
Priority: Low (optional feature)
-
Verification: Automated checks for missing translations
Flow: User → Web Frontend → i18n Module → UI in Spanish
11.2.8. QS-08: Unauthorized Access Prevention
Tests authentication and authorization.
-
Stimulus: User tries to access another user’s data
-
Source: Malicious/curious user
-
Environment: Production
-
Artifact: User Service (users/), API Gateway
-
Response: Access rejected with 403 Forbidden
-
Response Measure: 100% of cross-user attempts blocked
-
Priority: High
-
Verification: CI security tests, penetration testing
Flow: User A → API Gateway → User Service → blocked → 403 → User A
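The decision behind this scenario can be reduced to a small pure function. The sketch assumes that the requester's identity has already been extracted from a verified JWT (or is `null` when the token is missing or invalid); `decideAccess` is a hypothetical helper, not the documented service code.

```typescript
// Hypothetical helper: `requesterId` is the user id taken from a verified
// JWT, or null when the token is missing or invalid.
type AccessStatus = 200 | 401 | 403;

function decideAccess(requesterId: string | null, resourceOwnerId: string): AccessStatus {
  if (requesterId === null) return 401; // not authenticated
  if (requesterId !== resourceOwnerId) return 403; // authenticated, but another user's data
  return 200;
}
```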
11.2.9. QS-09: Data Persistence
Tests data durability after system restart.
-
Stimulus: System restarts
-
Source: Maintenance or crash
-
Environment: Post-restart
-
Artifact: MongoDB, User Service
-
Response: All user data intact
-
Response Measure: Zero data loss, 100% records recoverable
-
Priority: High
-
Verification: Backup/restore tests, crash simulation
Flow: User → System → MongoDB → Restart → User → System → MongoDB → Data intact
11.2.10. QS-10: Code Maintainability
Tests code quality and maintainability for new developers.
-
Stimulus: New developer joins
-
Source: Onboarding
-
Environment: Development
-
Artifact: All services
-
Response: Developer can locate/modify feature in a day
-
Response Measure: Coverage > 80%, cyclomatic complexity < 10
-
Priority: Medium
-
Verification: SonarQube metrics, onboarding test
Flow: Developer → Codebase → Read docs → Modify feature → Run tests → Coverage verified
11.3. Traceability to Quality Goals
Mapping scenarios to Section 1.2 quality goals:
-
Functionality: QS-01, QS-03 → correct game rules, multiple AI strategies
-
Usability: QS-02, QS-07 → learnability (80% success), i18n support (EN+ES)
-
Modularity & Maintainability: QS-03, QS-10 → extensibility < 2 days, coverage > 80%
-
Deployability & Availability: QS-04, QS-05 → recovery < 30 sec, 10+ concurrent users
-
Testability: QS-10 → coverage > 80%
-
Interoperability: Covered in Section 6 → API works for external bots
12. Risks and Technical Debts
Following the principle that "risk management is project management for grown-ups" (Tim Lister), we systematically evaluate and track items that could impact project success.
Each item is assessed using a risk matrix:
-
Probability: Low (1), Medium (2), High (3)
-
Impact: Low (1), Medium (2), High (3)
-
Risk Score = Probability × Impact (1-9)
Items with score ≥ 6 require active mitigation. Items with score ≥ 8 are critical and require immediate attention.
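The scoring rule is simple enough to state as code. In this sketch the thresholds come from the text above, while the type names and the label "monitor" for items scoring below 6 are hypothetical. Note that with three levels per axis the only possible scores are 1, 2, 3, 4, 6 and 9, so "≥ 8" is reached only by a High/High item.

```typescript
type RiskLevel = 1 | 2 | 3; // Low, Medium, High
type RiskClass = "critical" | "active-mitigation" | "monitor";

interface Risk {
  id: string;
  probability: RiskLevel;
  impact: RiskLevel;
}

// Risk Score = Probability × Impact
function riskScore(risk: Risk): number {
  return risk.probability * risk.impact;
}

// Thresholds from the text above; "monitor" is a hypothetical label for
// everything below 6.
function classifyRisk(risk: Risk): RiskClass {
  const score = riskScore(risk);
  if (score >= 8) return "critical";
  if (score >= 6) return "active-mitigation";
  return "monitor";
}
```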
12.1. Technical Risks
Identified technical risks (ordered by Risk Score, highest first):
-
R1: Rust learning curve
-
Probability / Impact / Score: 3 / 3 / 9
-
Description: Team lacks Rust experience
-
Mitigation: Learning sprint (week 1), pair programming, focus on core game verification first, leverage Rust’s type system
-
R2: TypeScript-Rust integration
-
Probability / Impact / Score: 3 / 3 / 9
-
Description: Communication bugs between services
-
Mitigation: Versioned JSON interface (YEN), contract tests, integration tests in CI, OpenAPI documentation
-
R3: Game Y rules misunderstanding
-
Probability / Impact / Score: 2 / 3 / 6
-
Description: Incorrect implementation of rules
-
Mitigation: Domain analysis session, shared glossary, early walking skeleton, reference authoritative sources
-
R4: Scope creep
-
Probability / Impact / Score: 2 / 3 / 6
-
Description: Optional features delay mandatory ones
-
Mitigation: MoSCoW prioritization, clear definition of done for optional features, feature flags
-
R5: Reverse Proxy misconfiguration
-
Probability / Impact / Score: 2 / 3 / 6
-
Description: Proxy incorrectly routes traffic or breaks TLS
-
Mitigation: Automated integration tests via proxy, configuration templates, staging verification, monitoring/logging
-
R6: API stability for bots
-
Probability / Impact / Score: 2 / 2 / 4
-
Description: Changes break external bots
-
Mitigation: Versioned API (/v1/), documentation, deprecation policy (6 months), compatibility tests
-
R7: AI performance
-
Probability / Impact / Score: 2 / 2 / 4
-
Description: Slow on large boards
-
Mitigation: Performance budget (< 2s), early profiling, CI performance tests
-
R8: DevOps complexity
-
Probability / Impact / Score: 2 / 2 / 4
-
Description: Inconsistent deployments
-
Mitigation: Docker + docker-compose, CI/CD automation, environment parity
-
R9: MongoDB data loss
-
Probability / Impact / Score: 1 / 3 / 3
-
Description: Corruption or deletion of data
-
Mitigation: Daily backups, 30-day retention, monthly restore tests, replication
-
R10: JWT vulnerability
-
Probability / Impact / Score: 1 / 3 / 3
-
Description: Weak secrets or token leakage
-
Mitigation: Strong secrets (env vars), short expiration (1h), refresh tokens, GitHub Secrets
-
R11: Gateway single point of failure
-
Probability / Impact / Score: 1 / 3 / 3
-
Description: Entire system down if gateway fails
-
Mitigation: Stateless design, multiple instances, Docker restart, Prometheus monitoring
12.2. Technical Debts
Technical debts are already incurred design or implementation shortcuts that need to be addressed:
-
TD1: Frontend Component Duplication
-
Impact if not fixed: Medium – inconsistent UI, harder maintenance
-
Planned Resolution: Refactor to use shared component library
-
TD2: Hardcoded Configuration
-
Impact if not fixed: Low – works in dev but may cause production issues
-
Planned Resolution: Move all configuration to .env files
-
TD3: Missing Input Validation
-
Impact if not fixed: Medium – poor user experience, extra server load
-
Planned Resolution: Add validation to all forms
-
TD4: Incomplete Test Coverage
-
Impact if not fixed: Medium – higher risk of undetected bugs
-
Planned Resolution: Increase coverage to 80% (per QS-10)
-
TD5: Proxy Integration Not Fully Tested
-
Impact if not fixed: Medium – could cause downtime or failed API calls
-
Planned Resolution: Add automated end-to-end tests covering Proxy → Gateway → Services, including stress tests
13. Glossary
This glossary defines the most important domain and technical terms used throughout the YOVI project. It ensures that all stakeholders (developers, client, evaluators, bot developers) have an identical understanding of these terms.
13.1. Domain Terms
-
YOVI: The name of the system developed in this project. A web-based platform for playing Game Y, supporting human players and external bots.
-
Game Y: An abstract strategy board game in which players aim to connect three different sides of the board. Played on a triangular or hexagonal grid.
-
Classic Game Y: The standard version of Game Y defined by the original rules, without additional variants or optional mechanics. Mandatory implementation for the project.
-
Game Variant: An alternative version of Game Y with modified rules, such as Poly-Y, Hex, Tabu Y, or Holey Y. Optional features.
-
Player: A human user who interacts with the system through the web frontend to play games.
-
Bot: An external program that interacts with the system exclusively through the public API to retrieve information and play games automatically. Bots have no UI.
-
Player-versus-Machine (PvM): A game mode in which a human player competes against a computer-controlled opponent (AI).
-
Match / Game Session: A single instance of a game, from initialization to completion, including all moves and the final result.
-
Move: A single action by a player placing a piece on the board. Represented in YEN notation.
-
Strategy: A specific algorithm or heuristic used by the AI to determine its next move. Examples: Random, Heuristic, Minimax.
-
Difficulty Level: A configuration of the AI strategy that affects its playing strength. Usually implemented by varying strategy parameters or search depth.
-
User Registration: The process of creating an account in the system, providing username, password, and optionally email.
-
Match History: A record of all games played by a user, including date, opponent (human/bot), result, and moves.
-
Statistics: Aggregated metrics about a user’s performance, such as total games played, wins, losses, and win rate.
-
Leaderboard: An optional feature ranking users based on various metrics (win rate, games played, etc.).
13.2. Technical Terms
-
Architecture Decision Record (ADR): A document capturing an important architectural decision, including context, rationale, alternatives, and consequences. Stored in the project wiki.
-
API Gateway: A service acting as a single entry point for all external requests. Routes requests to the appropriate internal service (webapp, users, gamey). SSL termination, originally assigned to the gateway in ADR-004, is handled by the Nginx reverse proxy per ADR-006.
-
Microservice: An architectural style in which a system is composed of small, independent services that communicate over a network and can be developed, deployed, and scaled separately. YOVI uses three microservices.
-
Web Frontend Service (webapp/): The TypeScript/React service providing the UI for human players. Handles game rendering, user interaction, and API calls to backend services.
-
User Service (users/): Node.js/Express service responsible for user management: registration, authentication, profile storage, match history, and statistics.
-
Game Engine Service (gamey/): Rust service implementing all Game Y logic: move validation, win condition checking, and AI move calculation with multiple strategies.
-
MongoDB: NoSQL database used by the User Service to persist user profiles, authentication data, and match history.
-
Docker: Containerization platform used to package each service with dependencies, ensuring consistent execution across development, test, and production.
-
docker-compose: Tool for defining and running multi-container Docker applications. Orchestrates all YOVI services locally and in production.
-
CI/CD (Continuous Integration / Continuous Deployment): Automated pipeline (via GitHub Actions) that builds, tests, and deploys the system on code changes.
-
GitHub Actions: CI/CD platform used for automated testing, building, and deployment of YOVI.
-
SonarQube: Code quality analysis tool integrated into the CI pipeline to monitor test coverage, code smells, bugs, and technical debt.
-
Prometheus: Monitoring and metrics collection tool used to gather performance data from all services.
-
Grafana: Visualization tool creating dashboards from Prometheus metrics, used for monitoring system health.
-
JWT (JSON Web Token): Compact, URL-safe token used for stateless authentication. Contains user identity and claims, signed with a secret key.
-
YEN Notation: JSON-based notation representing the state of a Game Y match, including board size, current turn, players, and board layout. Mandated by the client.
-
REST API: Architectural style for designing networked applications. YOVI exposes RESTful APIs for service-to-service and external communication.
-
HTTP / HTTPS: Hypertext Transfer Protocol (Secure). Used for all communication between clients, bots, and services.
-
JSON: JavaScript Object Notation, a lightweight data-interchange format used for all API payloads, especially YEN notation.
-
TypeScript: Statically typed programming language extending JavaScript, used to implement the web frontend.
-
Rust: Systems programming language focused on safety and performance, used to implement the game engine.
-
Node.js: JavaScript runtime used to execute the User Service backend.
-
Express: Web framework for Node.js used to build the User Service REST API.
-
React: JavaScript library for building user interfaces, used in the web frontend.
13.3. Acronyms
-
ADR: Architecture Decision Record
-
API: Application Programming Interface
-
CI/CD: Continuous Integration / Continuous Deployment
-
DoD: Definition of Done
-
HTTP: Hypertext Transfer Protocol
-
HTTPS: Hypertext Transfer Protocol Secure
-
i18n: Internationalization
-
JSON: JavaScript Object Notation
-
JWT: JSON Web Token
-
MVC: Model-View-Controller
-
ODM: Object Document Mapper
-
PvM: Player versus Machine
-
REST: Representational State Transfer
-
SSL: Secure Sockets Layer
-
TLS: Transport Layer Security
-
UI: User Interface
-
UX: User Experience
-
VM: Virtual Machine
-
YEN: Y Notation (specific to Game Y)
