Note

This version of the template contains some help and explanations. It is intended for getting familiar with arc42 and understanding its concepts. To document your own system, it is better to use the plain version.

1. Introduction and Goals

Describes the relevant requirements and the driving forces that software architects and the development team must consider. These include

  • underlying business goals,

  • essential features,

  • essential functional requirements,

  • quality goals for the architecture and

  • relevant stakeholders and their expectations

1.1. Requirements Overview

Contents

Short description of the functional requirements, driving forces, extract (or abstract) of requirements. Link to (hopefully existing) requirements documents (with version number and information where to find it).

Motivation

From the point of view of the end users a system is created or modified to improve support of a business activity and/or improve the quality.

Form

Short textual description, probably in tabular use-case format. If requirements documents exist this overview should refer to these documents.

Keep these excerpts as short as possible. Balance readability of this document with potential redundancy w.r.t. requirements documents.

Further Information

See Introduction and Goals in the arc42 documentation.

  • The system must allow users to play a complete match of the classic Game Y, following the official rules and supporting different board sizes.

  • The application must allow the user to select the difficulty level before starting a match.

  • The system must correctly detect the win condition and determine when a match has ended.

  • After the match finishes, the system must display an end-game summary screen including relevant statistics such as match duration, number of moves, difficulty level, and winner.

  • The application must allow users to view rankings, including local and global rankings.

  • The system must provide a tutorial or help section explaining the rules of Game Y and how to interact with the interface.

  • The application must support multiple game modes, including local 2-player and variant rule sets.

  • The interface must support multiple languages.

  • The system must provide an AI opponent capable of generating valid moves according to the selected difficulty level.

  • The AI must always generate legal moves and must never violate the rules of Game Y.

  • The system must store user accounts and authentication data securely.

  • The system must store match results, including outcome, duration, difficulty, and player statistics.

  • The system must allow users to register or log in with valid credentials.

  • The system shall also include an option to play as a guest without requiring authentication.

1.2. Quality Goals

Contents

The top three (max five) quality goals for the architecture whose fulfillment is of highest importance to the major stakeholders. We really mean quality goals for the architecture. Don’t confuse them with project goals. They are not necessarily identical.

Consider this overview of potential topics (based upon the ISO 25010 standard):

[Figure: Categories of Quality Requirements]

Motivation

You should know the quality goals of your most important stakeholders, since they will influence fundamental architectural decisions. Make sure to be very concrete about these qualities, avoid buzzwords. If you as an architect do not know how the quality of your work will be judged…​

Form

A table with quality goals and concrete scenarios, ordered by priorities

Quality Goal Description

Functional Suitability

The system shall correctly implement the classic version of Game Y, including win-condition verification for any supported board size. All user-facing features (play game, select strategy, view statistics) and API operations shall behave according to the specified requirements.

Performance Efficiency

For a standard match, the system shall return a computer move or API response within 2 seconds under normal load. The architecture shall support at least 20 concurrent players without noticeable degradation in responsiveness.

Reliability

The system shall consistently determine correct game outcomes and suggested moves, with an error rate of less than 1% in win detection or AI move generation. User data and match history shall not be lost during normal operation.

Usability

First-time users shall be able to start and complete a game without external help. The web interface shall clearly display the board state, available actions, and game status. The learning time for basic gameplay should be less than 30 minutes.

Maintainability

The architecture shall allow new AI strategies, difficulty levels, or Game Y variants to be added without major refactoring. Clear module boundaries and documented interfaces shall enable developers to modify or extend functionality efficiently.

Security

User accounts and personal data shall be protected through proper authentication and authorization mechanisms. API endpoints shall restrict access to authorized clients and prevent unauthorized manipulation of game or user data.

1.3. Stakeholders

Contents

Explicit overview of stakeholders of the system, i.e. all persons, roles or organizations that

  • should know the architecture

  • have to be convinced of the architecture

  • have to work with the architecture or with code

  • need the documentation of the architecture for their work

  • have to come up with decisions about the system or its development

Motivation

You should know all parties involved in development of the system or affected by the system. Otherwise, you may get nasty surprises later in the development process. These stakeholders determine the extent and the level of detail of your work and its results.

Form

Table with role names, person names, and their expectations with respect to the architecture and its documentation.

| Role/Name | Contact | Expectations |
| --- | --- | --- |
| Owner [Client] | Jose Emilio Labra Gayo | The main client of the system. He defines the expected behavior and is the stakeholder whose satisfaction matters most. |
| Dev Team | Jaime Alonso, Pelayo Pérez, Ana Calleja, Matías Valle | They implement the system and determine whether a problem can be solved within the given constraints, discussing it with the client (Labra) when needed. |
| Game Y rules | Official Wiki - Game Y | Fixed rules that define how the vanilla version of the game is played. They shall be followed. |
| Final users | No direct communication, though social media or polls may be useful | The main users of the system. They are expected to report errors and suggest improvements, and the system shall be designed with their use in mind. |
| Server (Azure VM) | 20.250.145.156 | Hosts all Dockerized services, providing public access to all users. |
| Firebase (Google Cloud) | Firestore + Firebase Auth | A non-relational cloud database that stores all persistent information regarding users and games, and handles authentication. |

[Figure: Stakeholders Map]

2. Architecture Constraints

2.1. Technical Constraints

Constraint Explanation

Web application must be implemented in TypeScript/JavaScript

The frontend and external API are required to be developed using TypeScript/JavaScript as specified in the assignment.

Bot logic must be implemented in Rust

The game engine responsible for move suggestion and win detection must be developed as a separate Rust module.

Communication between subsystems via JSON using YEN notation

The web application and the Rust service must exchange data using JSON messages following the YEN game notation defined in the assignment.

Service-oriented architecture

The Rust module must expose its functionality through a web service interface callable by the web application.

Use of Firebase as the database

User data, match history, and statistics must be stored in Firebase, as chosen by the team.

Deployment on a cloud platform

The system must be publicly accessible. The backend services are deployed on an Azure Virtual Machine, with all services containerized via Docker Compose.

Public external API for bots

The system must expose a documented API allowing external bots to access game data and play matches.

Browser-based execution

The frontend must run in standard modern web browsers without requiring special installations.


2.2. Organizational Constraints

Constraint Explanation

Assignment requirements define core functionality

The system must implement at least the classic version of Game Y with player-vs-machine mode and variable board size.

Public deployment required

The application must be accessible through the web as part of the evaluation criteria.

Documentation must follow arc42

Architecture documentation must comply with the arc42 structure provided by the course.

Quality requirements for evaluation

The project must include testing (unit, integration, and e2e), CI/CD, monitoring, and proper repository management.

Team-based academic project

Decisions must consider limited time, resources, and team size typical of a university assignment.


2.3. Conventions and Standards

Constraint Explanation

Use of YEN notation for game state

All game states exchanged between components must follow the defined YEN notation format.

REST-style API with JSON

All external and internal services will communicate using JSON over HTTP.

Version control with Git

All source code and documentation will be managed using a Git repository.

Code quality and testing practices

The project must include automated tests and follow agreed coding standards.

Contents

Any requirement that constrains software architects in their freedom of design and implementation decisions or decisions about the development process. These constraints sometimes go beyond individual systems and are valid for whole organizations and companies.

Motivation

Architects should know exactly where they are free in their design decisions and where they must adhere to constraints. Constraints must always be dealt with; they may be negotiable, though.

Form

Simple tables of constraints with explanations. If needed you can subdivide them into technical constraints, organizational and political constraints and conventions (e.g. programming or versioning guidelines, documentation or naming conventions)

Further Information

See Architecture Constraints in the arc42 documentation.

3. Context and Scope

Contents

Context and scope - as the name suggests - delimits your system (i.e. your scope) from all its communication partners (neighboring systems and users, i.e. the context of your system). It thereby specifies the external interfaces.

If necessary, differentiate the business context (domain specific inputs and outputs) from the technical context (channels, protocols, hardware).

Motivation

The domain interfaces and technical interfaces to communication partners are among your system’s most critical aspects. Make sure that you completely understand them.

Form

Various options:

  • Context diagrams

  • Lists of communication partners and their interfaces.

Further Information

See Context and Scope in the arc42 documentation.

3.1. Business Context

Contents

Specification of all communication partners (users, IT-systems, …​) with explanations of domain specific inputs and outputs or interfaces. Optionally you can add domain specific formats or communication protocols.

Motivation

All stakeholders should understand which data are exchanged with the environment of the system.

Form

All kinds of diagrams that show the system as a black box and specify the domain interfaces to communication partners.

Alternatively (or additionally) you can use a table. The title of the table is the name of your system, the three columns contain the name of the communication partner, the inputs, and the outputs.

The following table includes the different business components of the application, explaining for each their points of interaction with the rest of the system.

| Element | Description | Inputs | Outputs |
| --- | --- | --- | --- |
| Player | Takes part in the game, is able to create a user in the application and launch a game either against a bot or against a local opponent | The current state of the game and any requested statistics | Their username and password as well as their moves and requests |
| Interface | Shows the user the current state of the match as well as any relevant information | All the user's interactions with its components as well as the moves and information obtained from the game engine and database respectively | The updated state of the game and statistics |
| Engine | Generates the bot moves in a PvE game | The user's move coordinates | The following bot move |
| Database | Stores information about past matches | The data from played games | Information requested by the player to be displayed by the UI |

3.2. Technical Context

Contents

Technical interfaces (channels and transmission media) linking your system to its environment. In addition a mapping of domain specific input/output to the channels, i.e. an explanation which I/O uses which channel.

Motivation

Many stakeholders make architectural decisions based on the technical interfaces between the system and its context. Especially infrastructure or hardware designers decide these technical interfaces.

Form

E.g. UML deployment diagram describing channels to neighboring systems, together with a mapping table showing the relationships between channels and input/output.

The following table describes the technical interfaces and the channels used to link the system components and external entities.

Component / Channel Description

User Device

Computer, laptop, or mobile device used to access the system via HTTPS (port 443). HTTP on port 80 is automatically redirected to HTTPS by nginx.

Webapp (Frontend)

React/TypeScript SPA built with Vite, served by nginx. Renders the game UI and communicates exclusively with the Users API gateway via HTTP/JSON.

Users API (Gateway)

Node.js/Express service acting as the single entry point for all client requests. Manages server-side sessions in Redis, enforces CSRF protection, and proxies traffic to Auth Engine and Game Manager.

Auth Engine

Rust/Axum microservice handling user registration and login. Communicates with Firebase Auth via the Firebase Admin SDK to verify and store credentials.

Game Manager

Rust/Axum microservice managing match lifecycle: creation, move processing, and result persistence. Caches active game state in Redis and persists final results to Firestore.

Gamey Engine

Rust/Axum microservice implementing game rules, win detection, and AI move generation. Called by the Game Manager to validate moves and compute bot responses.

Redis

In-memory data store serving two roles: (1) server-side session storage for the Users API (TTL 30 min), and (2) active game state cache for the Game Manager.

Firebase (Auth + Firestore)

Google Cloud managed platform. Firebase Auth handles identity verification; Firestore stores user profiles, match history, and global rankings.

Public API for Bots

External REST interface exposed by the Users API for automated agents to interact with the game over HTTPS using YEN notation.

Deployment Platform (Azure VM)

Single Azure virtual machine (20.250.145.156) hosting all Docker containers orchestrated via docker-compose.yml.

[Figure: context component diagram]

3.2.1. Mapping Input/Output to Channels

The following table maps the domain-specific inputs and outputs to their respective technical channels and protocols.

| Channel | Input | Output |
| --- | --- | --- |
| HTTPS (Browser → Webapp) | User interactions: login, registration, move submissions, game start. | Board state, game results, statistics, UI updates. |
| HTTP/JSON (Webapp → Users API :3000) | All API requests from the frontend (auth, game, moves). | Responses proxied from the appropriate backend service. |
| HTTP/JSON (Users API → Auth Engine :4001) | Registration and login requests with user credentials. | Session token, authentication result. |
| HTTP/JSON (Users API → Game Manager :5000) | Game creation requests, player move submissions. | Match ID, updated board state, game outcome. |
| HTTP/JSON (Game Manager → Gamey Engine :4000) | Current board state (YEN notation), player move. | Validated new state, win detection result, bot move. |
| Redis Protocol (Game Manager → Redis :6379) | Active match state writes on every move. | Current match state reads for move processing. |
| Firebase SDK (Auth Engine ↔ Firebase Auth) | User credentials for registration and login. | User profile (username, email); session is then stored in Redis by the Gateway. |
| Firestore SDK (Game Manager ↔ Firestore) | Match results and player statistics on game end. | Historical match data and global rankings. |
| HTTPS (External Bot → Public API) | Board state (YEN notation), strategy parameters. | Bot move response in YEN notation. |

[Figure: context sequence diagram]

4. Solution Strategy

4.1. Technical Decisions

4.1.1. Frontend

  • The user interface is implemented using React with TypeScript, built with Vite. React’s component model allows us to organize and reuse UI elements efficiently, while TypeScript enforces type safety across the codebase.

4.1.2. Backend Services

The backend follows a microservices architecture with the following services:

  • Users Service (Node.js/Express): Acts as the central API gateway. All client requests go through this service, which proxies to the appropriate internal service. It also handles CSRF protection.

  • Auth Engine (Rust/Axum): A dedicated Rust microservice responsible for user registration, login, and token validation, backed by Firebase Auth.

  • Game Manager (Rust/Axum): Manages match lifecycle — game creation, move processing, and result persistence. Bridges Redis for active game state and Firestore for long-term storage.

  • Gamey Engine (Rust/Axum): The computational core. Implements game rules, win detection, and AI move generation. Written in Rust for maximum performance and determinism.
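The gateway's routing role described above can be sketched as a simple prefix lookup. This is only an illustration: the path prefixes (`/auth`, `/game`) and the host names are assumptions made for the example, and only the ports are taken from this document.

```typescript
// Hypothetical route table. Prefixes and host names are assumptions;
// the ports follow this document (Auth Engine 4001, Game Manager 5000).
interface Route {
  prefix: string;
  target: string;
}

const routes: Route[] = [
  { prefix: "/auth", target: "http://auth:4001" },
  { prefix: "/game", target: "http://gamemanager:5000" },
];

// Returns the internal URL a request would be proxied to, or null when
// no internal service owns the path (the gateway could answer 404).
function resolveTarget(path: string): string | null {
  const route = routes.find(
    (r) => path === r.prefix || path.startsWith(r.prefix + "/"),
  );
  return route ? route.target + path.slice(route.prefix.length) : null;
}
```

Matching on the prefix followed by a slash, and not on the bare prefix, avoids accidentally routing a path such as `/gameplay` to the Game Manager.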

4.1.3. Data Storage

  • Firebase (Firestore + Auth): Chosen as the primary persistent data store for user accounts, match history, and global rankings. Fully managed with built-in authentication.

  • Redis: Used as an in-memory store for active match state and for the gateway's server-side sessions, providing low-latency reads and writes during gameplay.

4.1.4. Infrastructure

  • Docker & Docker Compose: Every service is containerized to ensure environment parity between development and production. A single docker-compose.yml orchestrates all services.

  • Azure VM: The production deployment target. All containers run on a single Azure virtual machine at 20.250.145.156.

4.1.5. Monitoring & Quality

  • Prometheus + Grafana: The Users Service exposes metrics collected by Prometheus and visualized through Grafana dashboards for operational monitoring.

  • SonarCloud: Integrated into the CI/CD pipeline for static analysis, code quality gates, and coverage reporting.

4.2. Organizational Decisions

  • GitHub is used to manage the project, with branch protection rules ensuring that every change to the main branch is reviewed by at least one other team member.

  • GitHub Actions automates the CI/CD pipeline: unit tests, E2E tests, Docker image builds, and deployment to Azure on each release.

  • GitHub Wiki is used to document team meetings and decisions.

Contents

A short summary and explanation of the fundamental decisions and solution strategies, that shape system architecture. It includes

  • technology decisions

  • decisions about the top-level decomposition of the system, e.g. usage of an architectural pattern or design pattern

  • decisions on how to achieve key quality goals

  • relevant organizational decisions, e.g. selecting a development process or delegating certain tasks to third parties.

Motivation

These decisions form the cornerstones for your architecture. They are the foundation for many other detailed decisions or implementation rules.

Form

Keep the explanations of such key decisions short.

Motivate what was decided and why it was decided that way, based upon problem statement, quality goals and key constraints. Refer to details in the following sections.

Further Information

See Solution Strategy in the arc42 documentation.

5. Building Block View

5.1. Level 1

5.1.1. Whitebox Overall System

[Figure: Level 1 whitebox diagram of the overall system]

Motivation

The system uses a hybrid backend approach. Firebase is used as a Third-Party Building Block to offload identity management and long-term persistence. The Gamey system handles its own server-side session management via Redis, while delegating identity verification and data storage to Firebase.

Contained Building Blocks
Name Responsibility

Gamey System

Custom-developed layers of the application, composed of orchestrated services.

Firebase

External black box providing Authentication and cloud-based data storage.

5.1.2. Gamey System

Purpose/Responsibility

Coordinates the user experience and provides a single gateway that connects the frontend and backend interfaces.

5.1.3. Firebase

Purpose/Responsibility

Provides a secure external layer for authentication and database storage.

5.1.4. Game Engine

Purpose/Responsibility

The computational core. It is agnostic of user identity (delegated to other blocks) and focuses on the game rules.

5.2. Level 2

5.2.1. Whitebox Gamey System

[Figure: Level 2 whitebox diagram]

Motivation

The Gamey System is decomposed into specialized services to ensure a clear separation of concerns. A central Gateway (Users) provides a single entry point for Webapp and API requests. This isolates the Gamey Engine from session management: the Auth Engine handles identity and authentication, the Game Manager handles the match lifecycle, Redis provides high-speed state caching, and Firebase provides long-term persistence.

Contained Building Blocks
Name Responsibility

Users

The central API Gateway that orchestrates and routes all incoming traffic to the appropriate internal services.

Webapp

The frontend user interface providing the visual representation of the game and user dashboard.

Auth Engine

Dedicated microservice that interacts with Firebase to manage user identity, tokens, and authentication flows.

Game Manager

Orchestration service responsible for handling match state, player matchmaking, and synchronizing data with Firebase.

Gamey

The core game engine service that executes game rules and handles CPU-intensive mechanics.

Redis

In-memory data store used for low-latency caching of active game states and temporary session data.

Firebase

Third-party platform providing robust authentication and cloud-hosted NoSQL data storage.

5.3. Level 3

[Figure: Level 3 whitebox diagram]

5.3.1. Whitebox Game Manager (Rust)

Purpose/Responsibility

The Game Manager acts as the bridge between the high-level game requests and the data persistence layers. It translates business logic into actionable data operations across NoSQL and cache environments.

Contained Building Blocks (Components)
Name Responsibility

REST API (api_rest.rs)

Exposes endpoints for match management and receives commands from the Gateway.

Data Layer (data.rs)

Contains the core business logic and decides whether to fetch data from cache or primary storage.

Firebase Client (firebase.rs)

Handles the low-level communication and serialization for Firebase Firestore and Auth.

Redis Client (redis_client.rs)

Manages the connection pool and CRUD operations for the volatile Redis cache.
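The Data Layer's choice between cache and primary storage could follow a cache-aside pattern. The sketch below is written in TypeScript for brevity even though the service itself is Rust, and it is an assumption about the policy: this document does not state which caching strategy `data.rs` actually uses. `Store`, `MapStore`, and `readThrough` are hypothetical names.

```typescript
// Minimal key-value interface standing in for both Redis and Firestore.
interface Store<T> {
  get(key: string): T | undefined;
  set(key: string, value: T): void;
}

class MapStore<T> implements Store<T> {
  private data = new Map<string, T>();
  get(key: string): T | undefined {
    return this.data.get(key);
  }
  set(key: string, value: T): void {
    this.data.set(key, value);
  }
}

// Cache-aside read: try the cache first, fall back to primary storage,
// and repopulate the cache on a miss.
function readThrough<T>(
  key: string,
  cache: Store<T>,
  primary: Store<T>,
): T | undefined {
  const cached = cache.get(key);
  if (cached !== undefined) return cached;
  const stored = primary.get(key);
  if (stored !== undefined) cache.set(key, stored);
  return stored;
}
```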

5.3.2. Whitebox Gamey Engine (Rust)

Purpose/Responsibility

This is the computational heart of the ecosystem. It is designed to be highly modular, allowing for different interfaces (CLI or Bot Server) to interact with the same deterministic game logic.

Contained Building Blocks (Components)
Name Responsibility

Core Engine (core/)

Implements the fundamental rules, state machine, and move validation of the game.

Bot Server (bot_server/)

A wrapper that allows the engine to be played against or by automated agents over a network.

Bot Logic (bot/)

Contains the algorithms and decision-making heuristics for AI players.

Notation (notation/)

Handles the parsing and generation of the game-specific YEN notation (comparable in role to FEN or PGN in chess).

CLI (cli.rs)

Provides a command-line interface for direct interaction with the engine for debugging or standalone play.
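As an illustration of the kind of rule the Core Engine enforces, the following sketch checks the Game Y win condition: a player wins when one connected group of their stones touches all three sides of the triangular board. It is written in TypeScript for readability, while the real engine is Rust, and the board layout (row `r` holds `r + 1` cells) and the name `hasWon` are assumptions of this example, not the engine's actual data model.

```typescript
type Cell = 0 | 1 | 2; // 0 = empty, 1 and 2 = the two players

// Assumed layout: a triangular board with n rows, where row r holds
// r + 1 cells, so board[r][c] is valid for 0 <= c <= r < n.
function hasWon(board: Cell[][], player: 1 | 2): boolean {
  const n = board.length;
  const seen = board.map((row) => row.map(() => false));
  // Six neighbours of a cell on the triangular hexagonal grid.
  const deltas = [[0, -1], [0, 1], [-1, -1], [-1, 0], [1, 0], [1, 1]];

  for (let r = 0; r < n; r++) {
    for (let c = 0; c <= r; c++) {
      if (board[r][c] !== player || seen[r][c]) continue;
      // Flood-fill one connected group and record which sides it touches.
      let left = false;
      let right = false;
      let bottom = false;
      const stack: [number, number][] = [[r, c]];
      seen[r][c] = true;
      while (stack.length > 0) {
        const [cr, cc] = stack.pop()!;
        if (cc === 0) left = true;
        if (cc === cr) right = true;
        if (cr === n - 1) bottom = true;
        for (const [dr, dc] of deltas) {
          const nr = cr + dr;
          const nc = cc + dc;
          if (nr < 0 || nr >= n || nc < 0 || nc > nr) continue;
          if (board[nr][nc] !== player || seen[nr][nc]) continue;
          seen[nr][nc] = true;
          stack.push([nr, nc]);
        }
      }
      if (left && right && bottom) return true;
    }
  }
  return false;
}
```

Corner cells count for both sides they belong to, which the side checks handle naturally: the top cell has `cc === 0` and `cc === cr` at the same time.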

6. Runtime View

Contents

The runtime view describes concrete behavior and interactions of the system’s building blocks in form of scenarios.

Motivation

You should understand how building blocks of your system perform their job and communicate at runtime.

6.1. User Registration and Login

Both flows follow the same pattern: the client first fetches a CSRF token, then submits credentials. The Gateway validates the CSRF token, forwards the request to the Auth Engine, and on success creates a server-side session in Redis, returning an httpOnly session cookie. The only difference is what the Auth Engine does with Firebase.

[Figure: runtime auth sequence diagram]

6.2. Post-Login Session Usage

After login, every request that reads or mutates user state goes through the session stored in Redis. Read requests only need the sessionId cookie; state-changing requests additionally require the X-CSRF-Token header to match the csrf_token cookie.
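The rule above can be sketched as a small decision function. This is a minimal sketch, not the gateway's actual middleware: treating GET, HEAD, and OPTIONS as the read requests is an assumption, and `RequestView` and `checkRequest` are hypothetical names. The cookie and header names come from the text.

```typescript
interface RequestView {
  method: string;
  cookies: Record<string, string | undefined>;
  headers: Record<string, string | undefined>; // keys assumed lower-cased
}

// Assumption: these methods are the "read requests" that skip the CSRF check.
const SAFE_METHODS = new Set(["GET", "HEAD", "OPTIONS"]);

type Decision = "allow" | "no-session" | "csrf-mismatch";

function checkRequest(req: RequestView): Decision {
  // Every request needs the session cookie.
  if (!req.cookies["sessionId"]) return "no-session";
  if (SAFE_METHODS.has(req.method.toUpperCase())) return "allow";
  // State-changing requests: the header must match the csrf_token cookie.
  const header = req.headers["x-csrf-token"];
  const cookie = req.cookies["csrf_token"];
  const matches = header !== undefined && header !== "" && header === cookie;
  return matches ? "allow" : "csrf-mismatch";
}
```

A production check would typically also look the session up in Redis and compare the tokens in constant time; both are left out here to keep the sketch focused on the cookie/header rule.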

[Figure: runtime session lifecycle sequence diagram]

6.3. Match Creation

This view illustrates how the system initializes a game session. The Users Gateway routes the request to the Game Manager, which prepares the environment for a new match.

Resource Allocation: The Game Manager creates a unique match ID.

Fast Persistence: The initial state is stored in Redis to allow for high-frequency updates during gameplay.
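The two steps above might look roughly like this. The sketch uses a `Map` as a stand-in for Redis and TypeScript in place of the service's Rust; the `match:{id}` key format, the `MatchState` fields, and the injected `newId` generator are all assumptions made for illustration.

```typescript
interface MatchState {
  id: string;
  size: number;
  moves: [number, number][];
  finished: boolean;
}

// Stand-in for Redis; the real service would issue SET/GET commands.
const cache = new Map<string, string>();

function createMatch(size: number, newId: () => string): MatchState {
  // Resource allocation: a unique match ID (generator is injected).
  const state: MatchState = { id: newId(), size, moves: [], finished: false };
  // Fast persistence: the initial state goes to the cache, not Firestore.
  cache.set(`match:${state.id}`, JSON.stringify(state));
  return state;
}
```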

[Figure: runtime create game sequence diagram]

6.4. Player Move

The player move scenario represents the core interactive loop of the application. It involves real-time validation and state synchronization across the engine and the cache.

State Retrieval: The Game Manager pulls the current board status from Redis.

Engine Validation: The Gamey Engine processes the move logic to ensure it follows game rules.

Cache Update: The new state is pushed back to Redis for immediate availability.
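The three steps can be sketched end to end as follows. Everything here is a simplified stand-in: a `Map` replaces Redis, `engineValidate` replaces the HTTP call to the Gamey Engine and only checks bounds and occupancy, and the `Board` shape is an assumption of this example.

```typescript
interface Board {
  size: number;
  cells: Record<string, 1 | 2>; // key "row,col" -> player
  next: 1 | 2;
}

const redisStandIn = new Map<string, string>();

// Hypothetical engine check: the cell is inside the board and empty.
function engineValidate(board: Board, row: number, col: number): boolean {
  const inside = row >= 0 && row < board.size && col >= 0 && col <= row;
  return inside && !(`${row},${col}` in board.cells);
}

function applyPlayerMove(matchId: string, row: number, col: number): Board | null {
  const raw = redisStandIn.get(matchId); // 1. state retrieval
  if (raw === undefined) return null;
  const board: Board = JSON.parse(raw);
  if (!engineValidate(board, row, col)) return null; // 2. engine validation
  board.cells[`${row},${col}`] = board.next;
  board.next = board.next === 1 ? 2 : 1;
  redisStandIn.set(matchId, JSON.stringify(board)); // 3. cache update
  return board;
}
```

An invalid move returns `null` without touching the cache, so a rejected request never changes the stored match state.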

[Figure: runtime player move sequence diagram]

6.5. Bot Move

This scenario covers automated gameplay. It is triggered by the frontend as a second HTTP call immediately after the player’s move response returns — the Game Manager does not initiate it automatically.

Two-step flow: After the player move response confirms the game is not over, the frontend calls /reqBotMove. The Game Manager reads the current state from Redis, requests the bot move from the Gamey Engine, writes the result back to Redis, and returns the bot’s coordinates. The player sees the update as part of this second response — no polling, refresh, or WebSocket is involved.
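From the frontend's point of view, the two-step flow reduces to the logic below. The `GameApi` interface and its method names are hypothetical, and the calls are synchronous only to keep the sketch short; in the browser they would be asynchronous HTTP requests.

```typescript
interface MoveResult {
  gameOver: boolean;
}

interface BotMove {
  row: number;
  col: number;
  gameOver: boolean;
}

// Hypothetical client wrapper around the two HTTP calls.
interface GameApi {
  sendMove(row: number, col: number): MoveResult;
  reqBotMove(): BotMove;
}

function playTurn(api: GameApi, row: number, col: number): BotMove | null {
  const result = api.sendMove(row, col); // first call: the player's move
  if (result.gameOver) return null; // match ended, no bot move is requested
  return api.reqBotMove(); // second call: /reqBotMove
}
```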

[Figure: runtime bot move sequence diagram]

6.6. Game End and Result Persistence

This scenario describes what happens when the final move of a match is played. The Gamey Engine detects the win condition, and the Game Manager persists the result to Firebase Firestore before clearing the now-finished match from Redis.
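The ordering matters here: the result is persisted first and the cached match is removed only afterwards. A minimal sketch of that ordering, with in-memory stand-ins for Redis and Firestore, is shown below. Keeping the cached state when persistence fails is an assumption of this example, not documented behavior, and all names are hypothetical.

```typescript
interface MatchResult {
  matchId: string;
  winner: string;
  moves: number;
}

const activeMatches = new Map<string, string>(); // stand-in for Redis
const savedResults: MatchResult[] = []; // stand-in for Firestore

function finishMatch(
  result: MatchResult,
  persist: (r: MatchResult) => void,
): boolean {
  try {
    persist(result); // write the final result to long-term storage first
  } catch {
    return false; // assumption: keep the cached state so nothing is lost
  }
  activeMatches.delete(result.matchId); // then clear the finished match
  return true;
}
```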

[Figure: runtime game end sequence diagram]

7. Deployment View

7.1. Infrastructure Level 1

[Figure: deployment diagram, infrastructure level 1]

Motivation

The system is fully containerized using Docker, orchestrated via docker-compose.yml. All backend services run on a single Azure VM (20.250.145.156). Firebase is used as a managed external platform for authentication and persistence, reducing infrastructure overhead.

Quality and/or Performance Features
  • Containerization: Every service runs in an isolated Docker container, ensuring consistent behavior across development and production.

  • Separation of concerns: The Users API acts as a single entry point, hiding the internal service topology from external clients.

  • Low-latency state management: Redis caches active game states, enabling fast move validation without hitting Firebase on every request.

  • Managed auth and persistence: Firebase offloads identity management and long-term data storage, reducing the attack surface of the self-hosted infrastructure.

Mapping of Building Blocks to Infrastructure
Building Block Infrastructure Element

Webapp

Docker container serving the compiled React/Vite SPA on port 80.

Users API

Docker container running the Node.js/Express gateway on port 3000.

Auth Engine

Docker container running the Rust/Axum authentication service on port 4001.

Game Manager

Docker container running the Rust/Axum game state manager on port 5000.

Gamey Engine

Docker container running the Rust/Axum game engine and bot service on port 4000.

Redis

Docker container running Redis for server-side session storage and in-memory game state caching on port 6379.

Firebase Auth + Firestore

Fully managed by Google Cloud (Firebase Platform). No self-hosted component.

Prometheus

Docker container scraping metrics from the Users API for operational monitoring.

Grafana

Docker container providing dashboards fed by Prometheus for visualizing system health.

7.2. Infrastructure Level 2

7.2.1. Docker Environment

Each service is defined as a container in docker-compose.yml and shares a common internal Docker network, allowing inter-service communication by service name.

Webapp:

  • Runtime: Multi-stage build — Node.js (build stage) compiles the React/Vite SPA; the output is served by jonasal/nginx-certbot (production) or nginx:alpine (local).

  • TLS: In production, the jonasal/nginx-certbot image automatically obtains and renews a Let’s Encrypt certificate for the domain resolved via sslip.io (e.g. 20-250-145-156.sslip.io). All HTTP traffic on port 80 is automatically redirected to HTTPS on port 443 by the certbot image’s built-in redirect block.

  • nginx config: Only the HTTPS server block is defined in nginx.conf. It terminates TLS, serves the SPA static files, and reverse-proxies /api/ and /game/ to the Users API on port 3000.

  • Build Args: VITE_API_URL is injected at build time to configure the frontend’s API base URL.

  • Ports: Exposes 80 (HTTP → HTTPS redirect) and 443 (HTTPS, TLS termination).

Users API:

  • Runtime: Node.js 22.

  • Config: Environment variables for Firebase Admin credentials and internal service URLs (AUTH_URL, GAMEMANAGER_URL, GAMEY_URL).

  • Port: Exposes 3000 (REST API gateway).
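The internal service URLs could be read from the environment along these lines. The variable names are the ones listed above; the default host names (`auth`, `gamemanager`, `gamey`) are assumed Docker Compose service names, chosen for illustration, and `loadServiceUrls` is a hypothetical helper.

```typescript
interface ServiceUrls {
  auth: string;
  gameManager: string;
  gamey: string;
}

// Reads the internal service URLs, falling back to assumed service names
// on the shared Docker network when a variable is not set.
function loadServiceUrls(env: Record<string, string | undefined>): ServiceUrls {
  return {
    auth: env["AUTH_URL"] ?? "http://auth:4001",
    gameManager: env["GAMEMANAGER_URL"] ?? "http://gamemanager:5000",
    gamey: env["GAMEY_URL"] ?? "http://gamey:4000",
  };
}
```

In the Node.js service this would be called once at startup as `loadServiceUrls(process.env)`. Taking the environment as a parameter keeps the function easy to test.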

Auth Engine:

  • Runtime: Compiled Rust binary on a minimal base image.

  • Config: FIREBASE_PROJECT_ID and FIREBASE_CREDENTIALS_B64 for Firebase Admin SDK.

  • Port: Exposes 4001.

Game Manager:

  • Runtime: Compiled Rust binary on a minimal base image.

  • Config: Firebase credentials and REDIS_HOST for Redis connectivity.

  • Port: Exposes 5000.

Gamey Engine:

  • Runtime: Compiled Rust binary on a minimal base image.

  • Performance: Built with release optimizations (LTO) for maximum throughput.

  • Port: Exposes 4000.

Redis:

  • Runtime: Official Redis image.

  • Persistence: Volume-mounted data directory to survive container restarts.

  • Port: Exposes 6379 (internal network only).

7.3. HTTPS Request Flow

This sequence shows how a browser request travels through the TLS termination layer and the internal Docker network. All external traffic enters exclusively through nginx on port 443; no backend port is exposed to the public internet.

[Figure: deployment HTTPS flow sequence diagram]

8. Cross-cutting Concepts

Content

This section describes overall, principal regulations and solution ideas that are relevant in multiple parts (= cross-cutting) of your system. Such concepts are often related to multiple building blocks. They can include many different topics, such as

  • models, especially domain models

  • architecture or design patterns

  • rules for using specific technology

  • principal, often technical decisions of an overarching (= cross-cutting) nature

  • implementation rules

Motivation

Concepts form the basis for conceptual integrity (consistency, homogeneity) of the architecture. Thus, they are an important contribution to achieve inner qualities of your system.

Some of these concepts cannot be assigned to individual building blocks, e.g. security or safety.

Form

The form can be varied:

  • concept papers with any kind of structure

  • cross-cutting model excerpts or scenarios using notations of the architecture views

  • sample implementations, especially for technical concepts

  • reference to typical usage of standard frameworks (e.g. using Hibernate for object/relational mapping)

Structure

A potential (but not mandatory) structure for this section could be:

  • Domain concepts

  • User Experience concepts (UX)

  • Safety and security concepts

  • Architecture and design patterns

  • "Under-the-hood"

  • development concepts

  • operational concepts

Note: it might be difficult to assign individual concepts to one specific topic on this list.

[Figure: Possible topics for crosscutting concepts]

Further Information

See Concepts in the arc42 documentation.

8.1. Domain Model: YEN Notation

The core domain concept is the YEN Notation. To ensure all components (Frontend, Gateway, and Engine) communicate seamlessly, a standardized JSON format has been established. This format describes the hexagonal board, edge connections, and the current turn, preventing logical interpretation errors when passing data between JavaScript and Rust environments.

8.2. Security: Layered Authentication, Sessions, and CSRF

Security is implemented across several layers:

8.2.1. Identity — Firebase Auth

User registration and login are handled by the dedicated Auth Engine (Rust/Axum), which communicates with Firebase Auth. The Auth Engine validates credentials and returns the authenticated user’s profile to the Gateway. The browser never interacts with Firebase directly.

8.2.2. Server-side Sessions — Redis

After a successful login or registration, the Users Gateway creates a server-side session in Redis using a cryptographically random 64-character hex session ID. The session record (session:{id}) stores the user’s username and email with a 30-minute TTL (setex). Only the opaque session ID is sent to the browser, as an httpOnly, SameSite=Lax cookie named sessionId. Because the cookie is httpOnly, JavaScript cannot read it, so an XSS payload cannot steal the session ID. In production the cookie is also marked Secure so that it is only transmitted over HTTPS.

The GET /api/me endpoint reads the session from Redis on each request. The POST /api/logout endpoint deletes the Redis key and clears the cookie.
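The session lifecycle described above can be sketched as follows. This is a minimal, language-neutral illustration in Python and not the Gateway's actual code: InMemorySessionStore is a hypothetical stand-in for Redis, and the function names create_session, current_user and logout are assumptions made for this example. Only the key format, the 64-character hex ID, the 30-minute TTL and the cookie attributes come from the description above.

```python
import secrets
import time

SESSION_TTL_SECONDS = 30 * 60  # 30-minute TTL, as described above


class InMemorySessionStore:
    """Hypothetical stand-in for Redis offering setex/get/delete with expiry."""

    def __init__(self, clock=time.monotonic):
        self._clock = clock
        self._data = {}  # key -> (expires_at, value)

    def setex(self, key, ttl_seconds, value):
        self._data[key] = (self._clock() + ttl_seconds, value)

    def get(self, key):
        entry = self._data.get(key)
        if entry is None:
            return None
        expires_at, value = entry
        if self._clock() >= expires_at:
            del self._data[key]  # an expired key behaves as if it never existed
            return None
        return value

    def delete(self, key):
        self._data.pop(key, None)


def create_session(store, username, email, production=False):
    """Store the session record and return (session_id, Set-Cookie value)."""
    session_id = secrets.token_hex(32)  # 32 random bytes -> 64 hex characters
    store.setex(f"session:{session_id}", SESSION_TTL_SECONDS,
                {"username": username, "email": email})
    cookie = f"sessionId={session_id}; HttpOnly; SameSite=Lax"
    if production:
        cookie += "; Secure"  # only sent over HTTPS in production
    return session_id, cookie


def current_user(store, session_id):
    """Lookup performed for GET /api/me; None means 'not authenticated'."""
    return store.get(f"session:{session_id}")


def logout(store, session_id):
    """Server-side part of POST /api/logout: drop the session record."""
    store.delete(f"session:{session_id}")
```

Because the browser only ever holds the opaque ID, deleting the record is enough to invalidate the session immediately.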

8.2.3. CSRF Protection: Double-Submit Cookie

All state-changing requests (POST, PUT, DELETE) are protected against Cross-Site Request Forgery using the double-submit cookie pattern:

  1. The client calls GET /api/csrf-token. The Gateway generates a random token, stores it in an httpOnly cookie (csrf_token), and also returns it in the JSON response body.

  2. The client includes the token value in the X-CSRF-Token request header on every subsequent state-changing call.

  3. The verifyCsrf middleware in the Gateway compares the cookie value with the header value. If they do not match the request is rejected with HTTP 403.

Because the browser’s same-origin policy prevents a page on a foreign origin from reading the csrf_token cookie or the response of GET /api/csrf-token, an attacker cannot reproduce the header value. The check therefore blocks CSRF attacks without requiring a server-side token store.
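The three steps of the double-submit check can be sketched as below. This is an illustrative Python sketch under stated assumptions, not the verifyCsrf middleware itself: the function names, the token length, the plain-dict representation of cookies and headers, and the constant-time comparison are choices made for this example. The cookie name, header name, protected methods and the 403 response come from the description above.

```python
import hmac
import secrets

STATE_CHANGING_METHODS = {"POST", "PUT", "DELETE"}


def issue_csrf_token():
    """Step 1: create a token and the Set-Cookie value that carries it.

    The same token would also be returned in the JSON response body.
    """
    token = secrets.token_hex(32)  # token length is an assumption
    cookie = f"csrf_token={token}; HttpOnly"
    return token, cookie


def verify_csrf(method, cookies, headers):
    """Step 3: return 200 to let the request through, 403 to reject it.

    `cookies` and `headers` are plain dicts here; a real framework would
    also treat header names case-insensitively.
    """
    if method.upper() not in STATE_CHANGING_METHODS:
        return 200  # safe methods are not checked
    cookie_token = cookies.get("csrf_token", "")
    header_token = headers.get("X-CSRF-Token", "")
    if not cookie_token or not header_token:
        return 403  # a missing value on either side is a mismatch
    # Compare as bytes in constant time to avoid leaking how many characters match.
    if not hmac.compare_digest(cookie_token.encode(), header_token.encode()):
        return 403
    return 200
```

A forged cross-site request carries the cookie automatically but cannot set a matching X-CSRF-Token header, so it ends in the 403 branch.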

8.3. HTTPS: Let’s Encrypt and nginx via sslip.io

In production, TLS is handled entirely by the Webapp container using the jonasal/nginx-certbot Docker image:

  • sslip.io: The Azure VM’s public IP (20.250.145.156) is mapped to the domain 20-250-145-156.sslip.io using the wildcard DNS service sslip.io. This gives the VM a stable, publicly resolvable domain name without purchasing a custom domain, which is a requirement for Let’s Encrypt.

  • Let’s Encrypt: On first startup, the nginx-certbot image automatically issues a free TLS certificate for the sslip.io domain using the ACME HTTP-01 challenge. It stores the certificates in a Docker volume (letsencrypt) and renews them automatically before expiry.

  • nginx: The nginx.conf defines only the HTTPS server block (port 443). The certbot image automatically adds a port 80 server block that redirects all plain HTTP traffic to HTTPS. nginx terminates TLS, serves the compiled React SPA from /usr/share/nginx/html, and reverse-proxies /api/ and /game/ to the Users API container at http://users:3000.

  • Local development: For local development, nginx:alpine is used instead of the certbot image (swap two comment pairs in the Dockerfile). A plain HTTP server block defined in nginx.local.conf is used, so no certificates are needed.
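The sslip.io naming scheme mentioned in the first bullet is a purely textual mapping between an IPv4 address and a hostname. A minimal Python sketch of that mapping follows; the function names are hypothetical and only the dashed form used in this document is handled.

```python
import ipaddress

SSLIP_SUFFIX = ".sslip.io"


def sslip_hostname(ip):
    """Turn an IPv4 address into its dashed sslip.io hostname."""
    address = ipaddress.IPv4Address(ip)  # raises ValueError on malformed input
    return str(address).replace(".", "-") + SSLIP_SUFFIX


def embedded_ip(hostname):
    """Recover the IPv4 address embedded in a dashed sslip.io hostname."""
    if not hostname.endswith(SSLIP_SUFFIX):
        raise ValueError(f"not an sslip.io hostname: {hostname}")
    dashed = hostname[: -len(SSLIP_SUFFIX)]
    return str(ipaddress.IPv4Address(dashed.replace("-", ".")))


print(sslip_hostname("20.250.145.156"))  # 20-250-145-156.sslip.io
```

Because the hostname is derived from the IP, a change of the VM's public IP also changes the domain and requires a new certificate.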

8.4. Persistency: Hybrid State Management

The system distinguishes between three data lifecycles:

  • Session State: Managed by Redis in the Users Gateway. Active user sessions are stored with a 30-minute TTL.

  • Volatile Game State (Real-time): Managed by Redis in the Game Manager, which caches the state of active matches where low latency is critical.

  • Persistent State: Managed by Firebase/Firestore for long-term storage of user profiles, match history, and global rankings.

8.5. Deterministic Logic (Rust Engine)

Since Game Y requires complex graph pathfinding to detect wins, all computational logic is centralized in Rust. This ensures that rules are applied deterministically and with maximum CPU efficiency, regardless of whether the move comes from the Web UI or the Bot API.
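To illustrate the kind of graph search involved, the following Python sketch checks the classic Game Y win condition: one connected group of a player's pieces touching all three sides. It is not the Rust engine's implementation. The coordinate system (row r from the apex, column c with 0 <= c <= r), the side names and the function names are assumptions made for this example, and corner cells are counted as belonging to both adjacent sides.

```python
from collections import deque


def neighbours(r, c, size):
    """The up-to-six adjacent cells of (r, c) on a triangular board of hexagons."""
    candidates = [(r, c - 1), (r, c + 1),      # same row
                  (r - 1, c - 1), (r - 1, c),  # row above
                  (r + 1, c), (r + 1, c + 1)]  # row below
    return [(nr, nc) for nr, nc in candidates
            if 0 <= nr < size and 0 <= nc <= nr]


def sides_touched(r, c, size):
    """Which of the three board sides the cell (r, c) lies on."""
    sides = set()
    if c == 0:
        sides.add("left")
    if c == r:
        sides.add("right")
    if r == size - 1:
        sides.add("bottom")
    return sides


def has_won(stones, size):
    """True if any connected group of `stones` touches all three sides."""
    stones = set(stones)
    seen = set()
    for start in stones:
        if start in seen:
            continue
        # Breadth-first search over one connected group of pieces.
        touched = set()
        queue = deque([start])
        seen.add(start)
        while queue:
            r, c = queue.popleft()
            touched |= sides_touched(r, c, size)
            for cell in neighbours(r, c, size):
                if cell in stones and cell not in seen:
                    seen.add(cell)
                    queue.append(cell)
        if len(touched) == 3:
            return True
    return False
```

The search visits every occupied cell at most once, and the result depends only on the set of pieces, which is what makes the outcome reproducible regardless of which client submitted the move.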

8.6. Deployment: Containerization (Docker)

To avoid "it works on my machine" problems, every sub-service is packaged in a Docker container. This cross-cutting concept simplifies local development and cloud deployment by guaranteeing identical runtimes for the Node.js gateway and the compiled Rust binaries.

8.7. Testing Strategy

The project applies a multi-layer testing approach enforced through CI/CD:

  • Unit tests: Vitest for the webapp and users service; cargo test for the three Rust services (gamey, game_manager, userAuthentification).

  • Integration tests: Rust integration tests run against real Firebase instances using randomised test data and clean up after themselves.

  • End-to-end tests: Playwright and Cucumber scenarios covering full user flows, executed against the deployed application on each release.

  • Load tests: Gatling simulations (Maven) targeting the deployed server, measuring response times and success rates under concurrent load.

9. Architecture Decisions

This section documents the most important architectural decisions made during the development of the system. Each decision includes its context, the chosen option, and the rationale.


9.1. ADR-001: Separation into Web Application and Rust Game Engine

Status: Accepted

Context: The assignment requires the system to be composed of at least two subsystems:

  • A web application implemented in TypeScript.

  • A game logic module implemented in Rust.

Decision: The system will follow a service-oriented architecture where:

  • The web application handles the user interface, API, and user management.

  • The Rust module provides game logic (win detection and move suggestion) through a web service.

Rationale:

  • Required by the assignment.

  • Separates concerns between UI and game logic.

  • Rust provides high performance for algorithmic computations.

Consequences:

  • Requires inter-service communication via HTTP and JSON.

  • Adds deployment and integration complexity.


9.2. ADR-002: Communication via JSON using YEN Notation

Status: Accepted

Context: The assignment specifies that both subsystems must communicate using JSON messages following YEN notation.

Decision: All communication between the web application and the Rust service will use:

  • HTTP requests

  • JSON payloads

  • YEN notation for game states

Rationale:

  • Mandatory requirement from the assignment.

  • JSON is simple, widely supported, and easy to debug.

Consequences:

  • Both services must implement serialization and validation of YEN data.

  • Changes to the notation would affect both subsystems.


9.3. ADR-003: Use of Firebase as the Database

Status: Accepted

Context: The system requires user registration, match history, and statistics storage.

Possible options:

  • Self-hosted database (PostgreSQL, MySQL)

  • Managed cloud database

  • Firebase

Decision: Use Firebase as the main data storage solution.

Rationale:

  • Fully managed service with minimal setup.

  • Built-in authentication and real-time database features.

  • Reduces infrastructure and maintenance effort.

  • Suitable for small-to-medium academic projects.

Consequences:

  • Vendor lock-in to Firebase ecosystem.

  • Limited flexibility compared to self-hosted databases.

  • Requires internet connectivity for all operations.


9.4. ADR-004: Cloud Deployment of the Application

Status: Accepted

Context: The assignment requires the system to be publicly accessible via the web.

Possible hosting options:

  • Local/on-premise server

  • Traditional VM hosting

  • Cloud platform (Vercel, Railway, Azure)

Decision: Deploy all services on an Azure Virtual Machine (20.250.145.156), running the full docker-compose.yml stack.

Rationale:

  • Azure provides a reliable, always-on VM suitable for running multiple Docker containers simultaneously.

  • A single VM with Docker Compose is simpler to manage for a university project than a managed container orchestration platform.

  • Meets the public deployment and accessibility requirements.

  • Enables CI/CD with automated deployment via GitHub Actions SSH.

Consequences:

  • All services share the resources of a single VM (no auto-scaling).

  • Deployment is triggered automatically on each GitHub release via an SSH deploy step in the CI pipeline.

  • Firebase credentials and other secrets are managed through GitHub Actions secrets and passed as environment variables at runtime.


9.5. ADR-005: Server-side Sessions in Redis

Status: Accepted

Context: The system needs to maintain authenticated user state across requests. The main options were:

  • Stateless JWT tokens returned to and stored by the client.

  • Server-side sessions stored in Redis with an opaque session ID sent as a cookie.

Decision: Use server-side sessions stored in Redis. After a successful login or registration, the Users Gateway creates a Redis key (session:{randomId}) containing the user’s profile with a 30-minute TTL. The opaque session ID is delivered to the browser as an httpOnly, Secure, SameSite=Lax cookie.

Rationale:

  • httpOnly cookies are inaccessible to JavaScript, eliminating XSS-based token theft.

  • Server-side sessions can be invalidated instantly (logout deletes the Redis key), whereas JWTs remain valid until expiry.

  • Redis provides sub-millisecond session lookups with negligible overhead.

  • No sensitive user data is stored client-side.

Consequences:

  • Redis becomes a required infrastructure dependency for the Users Gateway.

  • Session state is lost if Redis is restarted without persistence (mitigated by the Redis volume mount).

  • Horizontal scaling of the Users Gateway requires a shared Redis instance (acceptable given single-VM deployment).


9.6. ADR-006: HTTPS via Let’s Encrypt and jonasal/nginx-certbot

Status: Accepted

Context: The system must be accessible over HTTPS in production. Options considered:

  • Self-signed certificate.

  • Manual Let’s Encrypt certificate management.

  • Automated certificate management via a purpose-built Docker image.

Decision: Use the jonasal/nginx-certbot Docker image for the Webapp container. The VM’s public IP is exposed as a domain via sslip.io (e.g. 20-250-145-156.sslip.io), which Let’s Encrypt can validate via the ACME HTTP-01 challenge.

Rationale:

  • sslip.io provides a free, always-valid domain from a bare IP address, satisfying Let’s Encrypt’s domain requirement without buying a custom domain.

  • jonasal/nginx-certbot handles certificate issuance and renewal automatically, with zero manual intervention.

  • The image also adds an HTTP→HTTPS redirect block on port 80 automatically, enforcing HTTPS for all traffic.

  • Certificates are persisted in a Docker named volume, surviving container restarts.

Consequences:

  • The Webapp container requires internet access on port 80 for the ACME challenge during certificate issuance.

  • A CERTBOT_EMAIL environment variable must be set for Let’s Encrypt account registration.

  • Local development requires swapping to nginx:alpine with a plain HTTP config (two comment pairs in the Dockerfile).


9.7. ADR-007: External REST API for Bots

Status: Accepted

Context: The assignment requires an API that allows external bots to interact with the system and play games.

Decision: Expose a REST-style HTTP API that:

  • Accepts JSON requests.

  • Uses YEN notation for game state.

  • Provides endpoints for game interaction.

Rationale:

  • REST is simple and widely supported.

  • Easy integration with external bots.

  • Matches the JSON communication requirement.

Consequences:

  • Requires proper authentication and validation.

  • Must be fully documented.

9.8. ADR-008: Internationalisation via react-i18next

Status: Accepted

Context: The application targets users who may not speak English. The team needed a way to support multiple languages without duplicating components.

Decision: Use react-i18next as the internationalisation library, with translations stored in JSON locale files (en.json, es.json) loaded at runtime.

Rationale: react-i18next is the standard i18n solution for React, well-maintained, and requires no backend changes — all translation logic stays in the frontend.

Consequences:

  • Adding a new language requires only a new locale file.

  • All user-facing strings must be referenced through the library’s t() function rather than hardcoded.


Contents

Important, expensive, large scale or risky architecture decisions including rationales. With "decisions" we mean selecting one alternative based on given criteria.

Please use your judgement to decide whether an architectural decision should be documented here in this central section or whether you better document it locally (e.g. within the white box template of one building block).

Avoid redundancy. Refer to section 4, where you already captured the most important decisions of your architecture.

Motivation

Stakeholders of your system should be able to comprehend and retrace your decisions.

Form

Various options:

  • ADR (Documenting Architecture Decisions) for every important decision

  • List or table, ordered by importance and consequences or:

  • more detailed in form of separate sections per decision

Further Information

See Architecture Decisions in the arc42 documentation. There you will find links and examples about ADR.

10. Quality Requirements

10.1. Quality Tree

[Figure: quality tree]

Explanation: The quality tree is organised into categories inspired by ISO 25010. Each main branch (Functional, Performance, etc.) links to more concrete sub-goals, which are then detailed in the scenarios below.


10.2. Quality Scenarios

| Quality Attribute | Scenario | Stimulus | Environment | Response | Metric / Success Criteria |
|---|---|---|---|---|---|
| Functional Suitability | Correct game logic | A player makes a move | Web frontend or bot API | The system updates the board and verifies the win condition | Must correctly determine game state |
| Functional Suitability | API operation correctness | External bot calls the 'play' endpoint | System running normally | System returns next move in YEN notation | Response contains valid move; no invalid game states |
| Performance Efficiency | Fast move computation | Player requests next move | Normal load (~1 concurrent player) | System calculates AI move | Response returned within 2 seconds |
| Performance Efficiency | Handle concurrency | Multiple users/bots interact simultaneously | Normal server load | System serves all requests | 20 concurrent matches without noticeable lag |
| Reliability | Accurate game outcome | Match completes | Normal operation | System records winner/loser correctly | <1% error rate in win detection or move suggestion |
| Reliability | Data persistence | Unexpected restart or refresh | Web app and backend running | User stats and match history retained | No loss of game or user data |
| Usability | Learnability | New user starts first game | First-time user accesses web frontend | User can complete a match | Game completed without external help; <30 min to understand basic rules |
| Usability | Board clarity | Player observes board and options | Normal gameplay | Actions and current game state are clearly displayed | User can easily understand current turn, available moves, and game status |
| Maintainability | Extend AI strategies | Developers add new AI strategy or difficulty | Source code maintained and documented | System integrates new strategy without breaking existing features | Minimal code changes needed; no regression errors |
| Maintainability | Clear module interfaces | Developers modify Rust module or frontend | Existing system deployed | Interfaces documented; components interact correctly | Changes localized; integration works without extensive refactoring |
| Security | Authentication | User logs in | Public internet | System verifies credentials | Unauthorized access denied; valid users granted access |
| Security | API protection | External bot tries to call API | System exposed to network | API checks credentials and permissions | Unauthorized requests blocked; only authorized clients allowed |

11. Risks and Technical Debts

Contents

A list of identified technical risks or technical debts, ordered by priority

Motivation

“Risk management is project management for grown-ups” (Tim Lister, Atlantic Systems Guild.)

This should be your motto for systematic detection and evaluation of risks and technical debts in the architecture, which will be needed by management stakeholders (e.g. project managers, product owners) as part of the overall risk analysis and measurement planning.

Form

List of risks and/or technical debts, probably including suggested measures to minimize, mitigate or avoid risks or reduce technical debts.

Further Information

See Risks and Technical Debt in the arc42 documentation.

11.1. Technical Risks

| Risk | Impact | Mitigation |
|---|---|---|
| Data Consistency | Potential desync between Redis (volatile) and Firebase (persistent). | Implement robust write-behind patterns and retry logic. |
| JS Performance | Real-time processing lag in the browser client. | Optimize event loop and use Web Workers for networking. |
| Rust Complexity | Steep learning curve affecting development velocity. | Pair programming and strict linting (clippy). |
| Single VM infrastructure | All services share one Azure VM. A hardware failure, reboot, or network issue makes the entire application unreachable. | No automatic failover exists. Mitigation relies on Prometheus/Grafana alerts to detect outages quickly and on manual restart procedures. |

11.2. Technical Debts

| Technical Debt | Status/Context | Priority |
|---|---|---|
| Session Reload | On page reload the frontend must call GET /api/me to re-hydrate auth state; if this call is slow or fails, the UI briefly shows an unauthenticated state. Needs a loading state or optimistic restore. | Medium |
| Code Duplication | Identified logic redundancy across modules. | Low |
| Redis single point of failure | The Users Gateway’s session store is a single Redis container with no replica. A Redis crash invalidates all active sessions. A replica or persistence strategy should be evaluated. | Medium |
| Firebase vendor lock-in | All authentication and persistence depend on Firebase. Migrating to a different provider would require rewriting the Auth Engine and Game Manager data layers. Accepted trade-off for the project scale. No mitigation planned. | Low |
| CORS allows localhost origins in production | The CORS configuration permits localhost origins in the production environment. Accepted as a known misconfiguration; will not be addressed (wontfix). | Low |
| Bilingual comments in code | Source code contains comments written in multiple languages, reducing consistency and maintainability. | Low |
| No rate limiting on login / race condition on user creation | No rate limiting is applied to the login endpoint, and a race condition window exists during user creation. Accepted as-is; will not be addressed (wontfix). | Low |
| No draw detection | The original version of Game Y had no scenario in which a match could end in a draw. New game modes (e.g. Holey Y) make a draw possible, and that case is not handled. A check for positions that can no longer end in a win or a loss has to be added. | Medium |

12. API Documentation

The Gamey REST API is documented using the OpenAPI 3.0 specification and served via Swagger UI.

The interactive documentation is available at the /api-docs path of the public HTTPS endpoint when the application is deployed.

12.1. How it works

The openapi.yaml file located in the users/ directory is loaded at startup by the Users gateway (users-service.js) and served through swagger-ui-express at the /api-docs path. Nginx proxies all /api-docs traffic from the public HTTPS endpoint directly to the Users container, so no separate deployment step is required.

12.2. Documented endpoint groups

| Tag | Endpoints |
|---|---|
| Auth | Register, login, logout |
| Session | CSRF token, current user (/api/me) |
| User | Update username |
| Game | Match lifecycle, moves, bot, history and rankings (offline and vs-bot) |
| Online | Online match creation, matchmaking, long-poll update, forfeit |
| Interops | Forwards a board state (YEN) to the Gamey engine and returns the coordinates the selected bot would play next (GET /play) |
| Legacy | Backwards-compatible endpoints |

13. Glossary

Contents

The most important domain and technical terms that your stakeholders use when discussing the system.

You can also see the glossary as source for translations if you work in multi-language teams.

Motivation

You should clearly define your terms, so that all stakeholders

  • have an identical understanding of these terms

  • do not use synonyms and homonyms

Form

A table with columns <Term> and <Definition>.

Potentially more columns in case you need translations.

Further Information

See Glossary in the arc42 documentation.

Term Definition

Bot API

Public interface allowing external automated agents to interact with the game, request moves, and query game states over HTTPS.

CSRF (Cross-Site Request Forgery)

An attack where a malicious site tricks a logged-in user’s browser into making an unwanted request to the application. Mitigated in this system using the double-submit cookie pattern: the client must include a secret token in both a cookie and a request header; the Gateway rejects requests where they do not match.

Docker

Containerization platform used to package services (Users Service, Gamey Engine) ensuring consistent execution across different hardware and environments.

Double-Submit Cookie

The CSRF mitigation strategy used in this system. A random token is set in an httpOnly cookie and also returned in the response body. The client must echo it in the X-CSRF-Token header; the server compares the two values before processing any state-changing request.

Firebase

Cloud platform providing Authentication and NoSQL persistence (Firestore) for user profiles, match history, and global statistics.

Gamey Engine

The core computational module written in Rust, responsible for deterministic game logic, win detection, and AI move generation.

Gateway

Node.js service acting as the primary entry point for user-related requests, handling session management, CSRF protection, and proxying to internal services.

httpOnly Cookie

A browser cookie flag that prevents JavaScript from reading the cookie value, protecting it from XSS attacks. Used for both the sessionId and csrf_token cookies in this system.

Let’s Encrypt

A free, automated, and open Certificate Authority (CA) that issues TLS certificates via the ACME protocol. Used in production to provide HTTPS for the sslip.io domain.

nginx

High-performance web server and reverse proxy. Used in the Webapp container to serve the React SPA and proxy /api/ and /game/ traffic to the Users API.

Redis

In-memory data store used for two purposes: (1) server-side session storage in the Users Gateway (session TTL 30 min), and (2) active game state caching in the Game Manager.

Rust

High-performance systems programming language used for the game’s core logic and AI algorithms to ensure computational efficiency.

Session Cookie

An httpOnly, Secure, SameSite=Lax cookie named sessionId set by the Users Gateway after a successful login or registration. Its value is an opaque random ID pointing to the user’s session record in Redis.

sslip.io

A wildcard DNS service that maps IP-based subdomains (e.g. 20-250-145-156.sslip.io) to their embedded IP address. Used to give the Azure VM a resolvable domain name so Let’s Encrypt can issue a certificate without a custom domain purchase.

YEN Notation

Project-specific JSON-based format used for serializing board states and moves between the frontend and backend subsystems.

Game Y

Abstract strategy board game played on a triangular board of hexagonal cells. The goal is to connect all three sides of the board with a continuous chain of pieces.

Why Not

A variant of Game Y in which the player who completes the winning connection loses instead of winning. Also known as the reverse or "last player loses" rule.

ELO

A numerical rating system used to rank players based on match outcomes. A win increases the rating; a loss decreases it.

SPA

Single Page Application. A web application that loads a single HTML page and dynamically updates content without full page reloads. Used to refer to the React frontend.

i18n

Abbreviation for internationalisation. The process of adapting software to support multiple languages without code changes. In this system it is implemented via the react-i18next library, with translations stored in dedicated locale files (en.json, es.json) and selectable from the settings menu at runtime.


About arc42

arc42, the template for documentation of software and system architecture.

Template Version 8.2 EN. (based upon AsciiDoc version), January 2023

Created, maintained and © by Dr. Peter Hruschka, Dr. Gernot Starke and contributors. See https://arc42.org.