About arc42

arc42, the template for documentation of software and system architecture.

Template Version 8.2 EN. (based upon AsciiDoc version), January 2023

Created, maintained and © by Dr. Peter Hruschka, Dr. Gernot Starke and contributors. See https://arc42.org.



1. Introduction and Goals

1.1. Introduction

We are a team of seven software engineering students working for ChattySw, a company hired by RTVE to extend the online version of its "Saber y Ganar" quiz show. The application we take as a reference was originally developed by another team, from HappySw. We will take ideas from their repository: 📂 HappySw repository.

1.2. Requirements Overview

The system is designed to provide an interactive quiz experience based on Saber y Ganar. The key requirements include:

  • Web-Based Interface: The application will be accessible through a web frontend, displaying questions, answers, and a hint system.

  • User Accounts & Progress Tracking: Users will be able to register and view their participation history, including game statistics.

  • Automated Question Generation: Questions and answers will be taken from WikiData.

  • AI-Powered Hints: An LLM-based hint system will assist users without revealing the answer directly, while keeping incorrect information to a minimum.

  • Time-Limited Responses: Players must answer within a given time frame.

  • API Access: The system will expose APIs to retrieve user data and question details.

For a detailed breakdown of requirements, please refer to the following document: 📄 Project Requirements Document.

1.3. Quality Goals

The following quality attributes are prioritized in the architecture:

| Quality Goal | Description |
| --- | --- |
| Usability | The interface should be user-friendly, ensuring smooth interaction with questions and the AI assistant. |
| Scalability | The system should efficiently handle multiple concurrent users. |
| Reliability | The application must function without errors, correctly retrieving and displaying questions. |
| Maintainability | The codebase should follow best practices, allowing future extensions and improvements. |
| Security | User data and interactions must be securely managed, preventing unauthorized access. |

1.4. Stakeholders

| Role/Name | Contact | Expectations |
| --- | --- | --- |
| RTVE | Client | To have a modernized and functional quiz system with AI-based hints to engage users. |
| End Users (Players) | Public Users | Want an enjoyable, responsive, and fair game experience with accurate hints and reliable performance. |
| Development Team (ChattySw) | Internal Team | Responsible for implementing the system following the defined requirements and quality standards. |
| WikiData | API Provider | Provides the database of questions and images, ensuring relevant and accurate information for the game. |
| External AI Provider (LLM Service) | API Provider | Provides the AI-based hint system, which must be reliable and minimize incorrect responses. |

2. Architecture Constraints

In this section we will pinpoint the constraints that come into play for the design of our application and its architecture.

2.1. List of constraints

| Constraint type | Constraint description | Repercussions |
| --- | --- | --- |
| Technical | Using Git and GitHub for version control. | Requires the team to get accustomed to GitHub and its functionalities. |
| Technical | Deploying the application so that it can be accessed through a web browser. | Requires the team to become familiar with tools and services such as Docker and Azure. |
| Technical | Not using the project from the previous year. | Requires the team to start fresh, without building on code written by the previous year's students for the chosen project. |
| Technical | Mandatory inclusion of LLM functionalities to help the user. | Requires the team to become familiar with the selected LLM (Qwen) and integrate it into the application. |
| Organizational | There is an established time limit. | The application must be developed within a few months. |
| Organizational | Using pull requests for code submissions. | Ensures that more than one person is responsible for each commit. |
| Organizational | Holding regular team meetings. | A team meeting must be held at least once a week (more often if required), and its participants and contents must be recorded. |
| Conventional | Developing an application with code up to standards. | Code developed by the team must be as easy to understand as possible (e.g. by following code conventions and adding documentation), in order to ensure the maintainability of the application. |
| Conventional | Ensuring the LLM provides factual clues. | Requires adjusting the LLM so that it does not provide falsehoods that may confuse the player. |
| Conventional | Following the arc42 template in the documentation. | The project documentation must be modelled after arc42. |

3. Context and Scope

This chapter describes the environment and context of the app: who uses the system and which other systems it depends on.

3.1. Business Context

The user interacts with the application and is shown a question with its possible answers. If they want, they can also request a hint that helps them find the correct answer. The questions are generated from information retrieved from Wikidata. The hint is obtained from an LLM (Empathy) through its API. The three wrong answers of each question are also generated by the LLM.

Business Context Photo

3.2. Technical Context

technical context photo

The application is split into microservices for its different functions (LLMService, Scoreboard, QuestionService, ...). Users, questions and scoreboard data are stored in a database, in this case MongoDB. The LLM service allows the other microservices to query an LLM; in our app it is used to request the hint and the three wrong answers for each question. The question service manages the questions and answers of the game. Finally, the webapp service is the front end of the application and allows the user to play the game.

4. Solution Strategy

In this section we will describe the approach taken to develop the system from a high-level perspective. This includes the technologies used, the architecture of the system and the most important organizational decisions.

4.1. Technologies

The following technologies are used in our project:

  • NodeJS: To build our backend, we will use this technology due to its simplicity and its ability to create scalable and efficient servers. The team is not very familiar with it, but since it is widely used we do not expect many issues: there are plenty of resources available (documentation, GitHub repositories, StackOverflow, etc.).

  • MongoDB: To store the data, we will use this technology due to its flexibility and scalability. Furthermore, it is commonly recommended in combination with NodeJS, as they perform well together (which we expect to avoid the performance pitfalls sometimes associated with choosing a NoSQL database over a SQL one). The team is not used to working with NoSQL databases, but we think it will be a good opportunity to learn something new.

  • React: To build the web application’s UI, we will use this technology due to its simplicity and its ability to create reusable components. It has also been chosen as it has a relatively gentle learning curve (as we will be learning while building).

  • Docker: To deploy the system, we will use this technology due to its portability and reliability.

  • Git & GitHub: For version control and collaboration, we will use Git and GitHub. These tools will enable the team to collaborate efficiently and keep track of changes, issues and documentation.

  • GitHub Actions (CI/CD): We will use GitHub Actions to automate the CI/CD process. The details are still to be defined and will be elaborated once we have more knowledge.

  • SonarQube: For static code analysis, we will use SonarQube. This tool will help the team to keep the codebase clean and maintainable.

  • QWen2.5 (7B parameters): To add the LLM functionality to the system, we will use this model. This is the main reason for the choice: it lets us enter the Empathy contest if we decide to do so. We felt that other models such as Gemini or DeepSeek would probably provide better results, but with them we would not be able to enroll our project in the contest.

  • Azure: To deploy the system, we will use Azure due to its scalability and reliability. We will use a VM hosted in Azure to deploy the system and make it accessible. We could have used any other cloud provider, but we chose Azure because it was the one shown in the course. We could also have used a local server, but that would require keeping a machine running 24/7 and dealing with security issues that are less of a concern when using a cloud provider.

4.2. Architecture

The system will be divided into several components, each with its own responsibility. The idea is to follow a microservices architecture, where each component is responsible for a specific task. This way we can scale the system more easily and keep a more organized codebase. It also allows us to work in parallel on different components.

The specific components are:

  • WebApp: This component will be responsible for the user interface. It will be built using React and will be responsible for displaying the information to the user and for sending the user’s input to the backend services.

  • Gateway-Service: This component will be responsible for routing the requests to the correct service. It will be built using NodeJS and will be responsible for handling the requests from the WebApp and sending them to the correct service.

  • Users: This component will be responsible for managing the users. It will be built using NodeJS and will be responsible for handling the user’s information and authentication. Within this service, there are several subcomponents:

    • User-Service: This sub-service will be used to handle users within the database.

    • Auth-Service: This sub-service will be used to handle the authentication of the users.

    • Game-Service: This sub-service will be used to store the games that the users play.

    • LeaderBoard-Service: This sub-service will be used to retrieve information about users and games to be displayed ordered (for now by the total score obtained).

  • Questions-Service: This component will be responsible for retrieving the questions from Wikidata and storing them in its database. It will be built using NodeJS and will be responsible for handling the requests from the Gateway-Service and sending them to Wikidata.

  • LLM-Service: This component will be responsible for any interaction the system performs with the LLM. It will be built using NodeJS and will handle the requests from the Gateway-Service and send them to the LLM.
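
The routing role of the Gateway-Service described above can be sketched as a small prefix-to-service lookup. This is only an illustration of the idea, not the project's actual code: the path prefixes, host names and ports are hypothetical, and stripping the prefix before forwarding is an assumption.

```typescript
// Hypothetical route table: path prefix -> internal service base URL.
// None of these prefixes, host names or ports come from the project itself.
interface Route {
  prefix: string;
  baseUrl: string;
}

const gatewayRoutes: Route[] = [
  { prefix: "/users", baseUrl: "http://users:8001" },
  { prefix: "/questions", baseUrl: "http://questions:8002" },
  { prefix: "/llm", baseUrl: "http://llm:8003" },
];

// Resolve an incoming request path to the URL the gateway would forward it to.
// Returns undefined when no service owns the path (the gateway would answer 404).
function resolveTarget(path: string): string | undefined {
  for (const route of gatewayRoutes) {
    // Match the exact prefix or the prefix followed by a sub-path,
    // so that "/usersx" is not routed to the Users component.
    if (path === route.prefix || path.startsWith(route.prefix + "/")) {
      return route.baseUrl + path.slice(route.prefix.length);
    }
  }
  return undefined;
}
```

Keeping the table in a single place means that adding a new service only requires one new entry in the gateway, while the WebApp keeps talking to a single entry point.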

4.3. Organizational Decisions

All decisions will be recorded in the Decisions section of the project Wiki. We chose this approach so that we do not need to redeploy the documentation every time we record a decision.

5. Building Block View

5.1. Whitebox Overall System

Whitebox Overall System
Motivation

This diagram shows the different parts of the application that interact when a user is playing, as well as outside elements used.

Contained Building Blocks
  • Player: User that interacts with the game, they will need to fill some sort of login information and be validated before playing.

  • App System: Main place where the game takes place.

  • WikiData API: External API, which will be used to generate questions and answers.

  • LLM API: External use of LLM (QWen) to allow the user to get hints from an artificial intelligence.

Important Interfaces

As we can see in the diagram, the user will interact with the App Internal System, which will first ask for a list of questions and answers from the WikiData API. Then, as well as showing the questions to the player, it will allow them to see an AI generated hint.

5.1.1. Container Diagram

level2 container
Motivation

This diagram shows in more detail the different (internal) parts of the application that interact when a user is playing, as well as outside elements used.

Contained Building Blocks
  • Player’s browser: The browser will send us some information about the user who interacts with the game. Users will need to fill in some sort of login information and be validated before playing.

  • App System: Main place where the game takes place. Will receive some information about the user from the browser (probably using a proxy) and communicate with the other internal parts of the app, or external (like the LLM functionality).

  • App API: Built with the backend technology described in Section 4.1 (NodeJS). This part of the app will receive the information about the user from the browser and save it in the database, as well as information about the status of the game.

  • Question generator: This part will first ask the WikiData API for some questions and answers, which it will later store in the database.

  • Database: In MongoDB. Will receive information about the user/state of the game/questions and answers.

  • WikiData API: External API, which will be used to generate questions and answers.

  • LLM API: External use of LLM (for now it will be QWen) to allow the user to get hints from an artificial intelligence.

Important Interfaces

As we can see in the diagram, the user will interact with the App Internal System first, which will ask for information about the user from their browser (proxy). This container will send this info to the app API which will store it in the database.

Then, when the game starts, the Question generator will use the WikiData API to get a list of questions and answers, which will be stored in the database as well. Then, as well as showing the questions to the player, the App System container will communicate with the LLM Container to allow players to see an AI generated hint.

6. Runtime View

6.1. Runtime Scenario 1: Login and Authentication

  • The user opens the application and is asked to log in.

  • They enter their credentials (username and password).

  • The client application sends an authentication request to the backend.

  • The backend verifies the credentials by checking in its database.

  • If login is successful, a session token is generated and sent back to the client.

  • The client stores the token for future requests.

  • If login fails, an error message is displayed, asking the user to try again.

Runtime 1 Login Diagram
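
The login steps above can be sketched as a single backend function. This is a minimal sketch under stated assumptions: the type and function names are hypothetical, the user store is an in-memory map standing in for the database, and password verification and token creation are injected as functions so that the flow does not depend on any particular hashing or token library.

```typescript
interface LoginCredentials {
  username: string;
  password: string;
}

interface StoredUser {
  username: string;
  passwordHash: string;
}

type LoginResult =
  | { ok: true; token: string }
  | { ok: false; error: string };

// verifyPassword and issueToken are injected (hypothetical helpers), so the
// sketch shows only the flow: look up the user, verify, then issue a token.
function login(
  credentials: LoginCredentials,
  users: Map<string, StoredUser>,
  verifyPassword: (plain: string, hash: string) => boolean,
  issueToken: (username: string) => string
): LoginResult {
  const user = users.get(credentials.username);
  // The same message is used for an unknown user and a wrong password,
  // so the response does not reveal which usernames exist.
  if (user === undefined || !verifyPassword(credentials.password, user.passwordHash)) {
    return { ok: false, error: "Invalid username or password" };
  }
  return { ok: true, token: issueToken(user.username) };
}
```

On success the client would store the returned token and attach it to later requests; on failure it would show the error message and let the user try again.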

6.2. Runtime Scenario 2: Game Mode and Question Selection

After logging in, the user navigates to the game selection screen.

  • The user selects a game mode (e.g., cities, historical figures, sports, etc.).

  • The client requests a set of questions from the backend.

  • The game retrieves a set of questions, based on images, from WikiData.

  • The questions and associated images are sent to the client.

  • The game session starts, displaying the first question.

Runtime 2 Question Selection Diagram
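
As an illustration of how image-based question data could be requested from WikiData, the sketch below builds a SPARQL query for items of a given class that have an image. It relies on the Wikidata properties P31 (instance of) and P18 (image); the function name, the mapping from a game mode to a class such as Q515 (city), and the result limit are assumptions made for this example, not decisions taken in the project.

```typescript
// Build a SPARQL query for the Wikidata Query Service that returns items of a
// given class together with an image (P18) and an English label.
// classId is a Wikidata item id such as "Q515" (city).
function buildImageQuery(classId: string, limit: number): string {
  // Reject anything that is not a plain item id, so that user input
  // cannot be injected into the query text.
  if (!/^Q\d+$/.test(classId)) {
    throw new Error("Invalid Wikidata class id: " + classId);
  }
  return [
    "SELECT ?item ?itemLabel ?image WHERE {",
    "  ?item wdt:P31 wd:" + classId + ";",
    "        wdt:P18 ?image.",
    '  SERVICE wikibase:label { bd:serviceParam wikibase:language "en". }',
    "}",
    "LIMIT " + Math.floor(limit),
  ].join("\n");
}

// Hypothetical usage for the "cities" game mode.
const citiesQuery = buildImageQuery("Q515", 20);
```

The resulting text would be sent to the Wikidata SPARQL endpoint by the Questions-Service, and each returned row (label plus image) could become one question with the label as its correct answer.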

6.3. Runtime Scenario 3: Image-Based Questioning

  • The app presents an image to the user based on the game topic.

  • The app gets the set of questions related to the image from the QuestionsDB, such as:

    • "Which historical figure is shown in this photo?"

    • "What city is this landmark located in?"

    • "Which sport is depicted in this picture?"

    • "Which historical event is depicted in this picture?"

  • The user selects an answer from multiple-choice options or types in a response.

  • The client sends the answer to the backend for validation.

  • The backend checks if the response matches the correct answer.

  • If correct:

    • The user earns points.

    • The next question is displayed.

  • If incorrect:

    • The user can try again or request a hint.

Runtime 3 Questioning Diagram
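
The validation step of this scenario can be sketched as follows. It is a minimal sketch, not the project's implementation: the type names, the fixed number of points per correct answer and the way the time limit is passed in are all assumptions. The comparison ignores case and surrounding whitespace because the scenario allows typed answers.

```typescript
interface Question {
  id: string;
  text: string;
  options: string[];
  correctAnswer: string;
}

interface AnswerResult {
  correct: boolean;
  pointsEarned: number;
}

// Hypothetical scoring rule: a fixed amount of points per correct answer.
const POINTS_PER_CORRECT_ANSWER = 10;

// Compare answers ignoring case and surrounding whitespace.
function normalizeAnswer(answer: string): string {
  return answer.trim().toLowerCase();
}

// An answer only counts if it matches the correct one and arrives in time.
function checkAnswer(
  question: Question,
  answer: string,
  elapsedMs: number,
  timeLimitMs: number
): AnswerResult {
  const inTime = elapsedMs <= timeLimitMs;
  const correct =
    inTime && normalizeAnswer(answer) === normalizeAnswer(question.correctAnswer);
  return { correct, pointsEarned: correct ? POINTS_PER_CORRECT_ANSWER : 0 };
}
```

Doing this check in the backend rather than in the client keeps the correct answer out of the browser until the player has responded.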

6.4. Runtime Scenario 4: Hint System and AI Assistance

  • If the user does not know the answer, they can request a hint.

  • The backend calls the AI to generate a hint based on the image and question.

  • Types of hints may include:

    • A brief historical fact about the person, place, or event.

    • A partial reveal of the answer.

    • A clue referencing a related event.

    • Eliminating one of the wrong answer choices (if it is a multiple-choice question).

  • The hint is sent to the client and displayed to the user.

  • The user can then retry answering the question.

Runtime 4 Hints Diagram
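
One way to keep hints from giving the answer away is to combine an explicit instruction in the prompt with a check on the generated text. The sketch below shows both steps; the prompt wording, the function names and the fallback message are hypothetical, and the call to the LLM itself is left out.

```typescript
// Build the prompt sent to the LLM. The wording is only an example.
function buildHintPrompt(questionText: string, correctAnswer: string): string {
  return [
    "You are helping a quiz player.",
    "Question: " + questionText,
    "Correct answer (never mention it): " + correctAnswer,
    "Give one short, factual clue that does not contain the answer.",
  ].join("\n");
}

// Return the hint only if it does not contain the full answer;
// otherwise return a neutral fallback message.
function sanitizeHint(hint: string, correctAnswer: string, fallback: string): string {
  const leaks =
    correctAnswer.length > 0 &&
    hint.toLowerCase().includes(correctAnswer.toLowerCase());
  return leaks ? fallback : hint.trim();
}
```

The check only catches a literal occurrence of the answer, so it complements the prompt instruction rather than replacing it.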

6.5. Runtime Scenario 5: Game Progress and Score Management

  • The game continues through a predefined number of images and questions.

  • The backend keeps track of the user’s progress, including:

    • Score.

    • Number of correct and incorrect answers.

    • Time taken per question.

  • After the final question, the backend calculates the final score and stores it in the database.

  • The client displays a summary of the results.

Runtime 5 Game Progress Diagram
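
The progress data listed above can be reduced to a final summary once the last question has been answered. The following sketch assumes one record per answered question; the type names and the choice of reporting the average time per question are illustrative assumptions.

```typescript
interface AnswerRecord {
  correct: boolean;
  pointsEarned: number;
  timeMs: number;
}

interface GameSummary {
  score: number;
  correctAnswers: number;
  incorrectAnswers: number;
  averageTimeMs: number;
}

// Aggregate the per-question records into the summary shown to the player.
function summarizeGame(records: AnswerRecord[]): GameSummary {
  const score = records.reduce((sum, r) => sum + r.pointsEarned, 0);
  const correctAnswers = records.filter((r) => r.correct).length;
  const totalTimeMs = records.reduce((sum, r) => sum + r.timeMs, 0);
  return {
    score,
    correctAnswers,
    incorrectAnswers: records.length - correctAnswers,
    // Avoid dividing by zero when no question was answered.
    averageTimeMs: records.length === 0 ? 0 : totalTimeMs / records.length,
  };
}
```

The resulting object is what the backend would store in the database and what the client would display as the results summary.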

6.6. Runtime Scenario 6: Leaderboard and Social Features

  • The user’s score is compared against other players.

  • The backend updates the leaderboard using the scores database or a dedicated one.

  • The client displays rankings, showing top players.

Runtime 6 Leaderboard Diagram
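
The ranking step can be sketched as a sort by total score. The type and function names are hypothetical, and breaking ties alphabetically by username is an assumption added so that the order is deterministic.

```typescript
interface PlayerScore {
  username: string;
  totalScore: number;
}

// Return the best players, highest total score first.
// The input array is copied, so the caller's data is not reordered.
function topPlayers(scores: PlayerScore[], limit: number): PlayerScore[] {
  return [...scores]
    .sort(
      (a, b) =>
        b.totalScore - a.totalScore || a.username.localeCompare(b.username)
    )
    .slice(0, limit);
}
```

With MongoDB the same ordering could instead be requested from the database through a sorted and limited query, which avoids loading every score into memory.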

7. Deployment View

Overview Diagram

deployment
Motivation

The application is built using Docker and an Azure virtual machine running Ubuntu. Using both together combines the portability and efficiency of Docker containers with the additional security and isolation provided by virtual machines.

Docker will be used for the deployment of the application both while developing (local deployment) and when the final result is ready. It is a containerization platform that packages applications and their dependencies into lightweight, portable containers, ensuring consistency across environments.

Quality and/or Performance Features

Expanding on the advantages of the combined use of Docker and a VM (and also their independent characteristics that add their own value), we can remark:

  • Isolation: Not only because of the container-level isolation provided by Docker (which helps prevent dependency conflicts), but also because of the additional OS-level isolation provided by the VM.

  • Security: The VM can act as a security boundary if a container is ever compromised.

  • Resource Management and Allocation: The VM has a dedicated CPU, RAM and disk which can be allocated specifically for Docker, and Docker itself reduces resource consumption by sharing the OS kernel.

  • Portability and Compatibility: Using Docker ensures that the application works in the exact same way across different environments.

  • More: Easy recovery and backup (which can also be attributed to the use of a GitHub repository), flexible scalability, etc.

Mapping of Building Blocks to Infrastructure

As of now, the application consists of the following building blocks, whose definition may still change:

  • Webapp: Its port (3000) is the one the user will connect to when interacting with the application, as it is the graphical user interface and works as the frontend.

  • Gateway: As its name suggests, it works as the interface that communicates the webapp with the rest of the services.

  • Userservice: It will be in charge of the user management.

  • Authservice: On the same level as the previous one, it will be used for the authentication of users.

  • Llmservice: This one is also on the same level as the two before it, and it is the one that manages the LLM that will be used in the application. The LLM API used will be the one from QWen.

  • Questionservice: This service communicates with the Wikidata API and returns a type of question, with a picture attached to it and a correct answer.

  • Database: The database that will contain all the data of the application. Currently we only have one database. Further ahead it may be split into several databases to manage different features independently (one for users, another for the game questions, etc.), but this has not been decided yet.

It is also worth mentioning other tools and services that are used or will probably be adopted in the future:

  • APIs: The Wikidata API, the LLM API, etc.

  • Prometheus and Grafana: Combined, these tools make it possible to monitor and observe the system and to produce statistics about it.

8. Cross-cutting Concepts

8.1. Internationalization

In hopes of increasing accessibility for our users, we want to give our application the ability to be played in different languages.

We are starting off only offering English and Spanish, but this offer may be expanded in the future with different prototypes.

As we know that internationalization is more than just changing the language of the text in the game, we will have to look further into a library that makes this feature easier to implement.

Our first prototype won’t be capable of changing the language, but hopefully in the future it will.
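
Until a library is chosen, the basic idea behind the language switch can be illustrated with a small message catalog. This sketch is not tied to any internationalization library: the message keys, the texts and the rule of falling back to English when a translation is missing are assumptions made for the example.

```typescript
type SupportedLocale = "en" | "es";

// Hypothetical message catalog: one set of texts per supported language.
const catalog: Record<SupportedLocale, Record<string, string>> = {
  en: { "game.start": "Start game", "game.hint": "Ask for a hint" },
  es: { "game.start": "Empezar partida" },
};

// Look up a text in the requested language, fall back to English,
// and finally to the key itself so missing texts are easy to spot.
function translateKey(key: string, locale: SupportedLocale): string {
  return catalog[locale][key] ?? catalog.en[key] ?? key;
}
```

For example, `translateKey("game.start", "es")` yields the Spanish text, while a key that has not been translated yet is shown in English instead of leaving a gap in the interface.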

8.2. Security

We want our users not only to have fun while playing our game, but also to feel safe. For this purpose, we use the following mechanisms:

  • Bcrypt: used to securely hash and store user passwords, protecting them from breaches. Its adaptive nature makes it resistant to brute-force attacks, ensuring robust user authentication.

  • JWT (JSON Web Token): it enables secure, stateless user authentication. It allows users to stay logged in without storing session data on the server, enhancing scalability and providing a secure way to transmit user data between client and server.

8.3. Testing

We want our team to pay lots of attention to testing while developing the app. A correct testing suite can ensure that the app is evolving correctly and that we won’t have problems in the future.

For this, we are using Jest to test each individual service as well as the endpoints in the gateway service, before they are connected to the webapp (front end).

9. Architecture Decisions

9.1. Using endpoints in gateway service

This will provide some important benefits, for example:

  • Centralized Routing: Manages and directs traffic to microservices, improving scalability (see the quality requirements part).

  • Enhanced Security: Acts as a security layer for authentication and logging.

  • Simplified Client Access: Provides a unified API endpoint, making client-side development easier.

There are also downsides: the gateway service becomes a single point of failure, so if it fails the whole app fails, and it adds one more component to maintain.

9.2. Using MongoDB

We considered other databases for our project, including MySQL and PostgreSQL. After some thought, we settled on MongoDB for the following reasons:

  • Scalability: Easily handles large amounts of unstructured or semi-structured data.

  • Flexible Schema: NoSQL structure allows rapid iteration without worrying about complex migrations.

  • High Performance: Fast reads and writes, very important when working with a game.

10. Quality Requirements

| Quality Requirement | Scenario | Priority |
| --- | --- | --- |
| Security | Protect user data through secure authentication (e.g., bcrypt, JWT) and defend against injection attacks (such as SQL or NoSQL injection). | High |
| Scalability | Ensure the app can handle increased load from more users and questions without degrading performance. | High |
| Performance | Provide fast response times for a smooth user experience, especially during gameplay. | High |
| Maintainability | Make the codebase easy to update, modify, and extend, with clean architecture and documentation for future extensions of the prototype. | Medium |
| Usability | Ensure the user interface is intuitive and responsive, offering a pleasant experience for all users. | Medium |
| Testability | Facilitate automated testing for quicker development cycles and better reliability. | Medium |
| Portability | Allow smooth deployment on different operating systems, devices and platforms. | Low |

11. Risks and Technical Debts

One of the main risks of our project is the use of innovative but potentially unstable technologies and approaches. The table below lists the key risks and technical debts, comparing our current choices with alternative solutions.

| Decision | Current approach | Pros of current approach | Cons of current approach | Alternative approach |
| --- | --- | --- | --- | --- |
| Software environment | Node.js | Great performance, and we get to work with something new | We have not worked with it until now and have more experience with other languages (Java) | Java |
| Database | MongoDB | Flexible, great performance with Node.js and easy to use | Flexibility can lead to problems; we lack experience with non-relational databases | PostgreSQL |
| Incorrect answers | LLM | Innovative and flexible | Performance, inconsistent results, dependence on an external service | Database, Wikidata |
| Deployment | Azure Virtual Machine | Full control, no local resources needed | External service, costs money | Another external service, a local machine |

12. Glossary

| Term | Definition |
| --- | --- |
| MongoDB | A NoSQL, document-oriented database that stores data in flexible, JSON-like documents, making it easy to handle data with a varying structure. |
| JWT (JSON Web Token) | Compact, URL-safe token used for securely transmitting information between parties, commonly for user authentication and authorization in web applications. |
| CI/CD Testing | Automated process of running tests during Continuous Integration (CI) and Continuous Deployment (CD) pipelines to ensure code changes are reliable, functional, and ready for deployment. |
| WikiData | Knowledge base that stores structured data to support Wikipedia and external applications, enabling access to facts and relationships. |
| LLM (Large Language Model) | Artificial intelligence model trained on large amounts of text data to understand, generate, and process human language. |