About arc42
arc42, the template for documentation of software and system architecture.
Template Version 8.2 EN. (based upon AsciiDoc version), January 2023
Created, maintained and © by Dr. Peter Hruschka, Dr. Gernot Starke and contributors. See https://arc42.org.
1. Introduction and Goals
The application aims to be a web-based question-and-answer game that follows the format of the TV show ‘Saber y Ganar’. The game shows the player a question followed by a fixed number of possible answers; the information used in the questions and answers is obtained from Wikidata (https://www.wikidata.org). The player answers by clicking on one of the options. A correct answer increases their score and a wrong answer decreases it.
Up to this point, the description matches the project implemented by the students who took the software architecture course last year. This year there is a new feature: a clue system that uses an LLM. Once they have seen the question, the user can decide whether or not to use clues. Using a clue will incur a penalty that is still to be decided. Some goals established for the application are (in no particular order):
-
The game should implement all the functionality of the games developed the previous year.
-
The game should integrate a clue system that uses LLM technology.
-
There should be a large enough pool of questions and answers for the game (bearing in mind that the information for these Q&As must be taken from Wikidata).
-
The application shall provide a login that allows users to access historical data about their games.
1.1. Requirements Overview
-
The system should give non-registered or unidentified users the option to register to the application
-
The user should be able to ask for hints from the LLM in each question
-
A specific approach should be used to avoid incorrect answers and hallucinations from the LLM
-
Each question must have a single correct answer and at least one incorrect answer
-
The game shall correctly show the questions and all the possible answers.
-
The game shall manage the answers given by the player as expected.
-
The LLM clue system should be implemented so it gives quality clues.
-
The user interaction with the clue system shall be managed.
-
The questions shall be obtained from Wikidata.
-
The questions shall come with an image obtained from Wikidata.
-
The application should be accessible through the web.
-
There shall be a record of the historical data of the games from logged in players (number of games, number of failed questions, …).
-
There shall be a countdown timer for each question in the game.
-
The system shall give access to information about questions through an API.
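The requirement that each question has a single correct answer and at least one incorrect answer could be enforced when a question object is created. The following is a minimal sketch under that assumption; the class `QuizQuestion` and its members are hypothetical names chosen for illustration and are not taken from the project code.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Random;

// Hypothetical question model; all names are illustrative assumptions.
final class QuizQuestion {
    private final String text;
    private final String imageUrl; // may be null if no image was found
    private final String correctAnswer;
    private final List<String> incorrectAnswers;

    QuizQuestion(String text, String imageUrl, String correctAnswer, List<String> incorrectAnswers) {
        if (text == null || text.trim().isEmpty()) {
            throw new IllegalArgumentException("Question text is required");
        }
        if (correctAnswer == null || correctAnswer.trim().isEmpty()) {
            throw new IllegalArgumentException("A single correct answer is required");
        }
        if (incorrectAnswers == null || incorrectAnswers.isEmpty()) {
            throw new IllegalArgumentException("At least one incorrect answer is required");
        }
        if (incorrectAnswers.contains(correctAnswer)) {
            throw new IllegalArgumentException("The correct answer must not be listed as incorrect");
        }
        this.text = text;
        this.imageUrl = imageUrl;
        this.correctAnswer = correctAnswer;
        this.incorrectAnswers = new ArrayList<>(incorrectAnswers);
    }

    // All options in random order, so the correct one is not always in the same position.
    List<String> shuffledOptions(Random random) {
        List<String> options = new ArrayList<>(incorrectAnswers);
        options.add(correctAnswer);
        Collections.shuffle(options, random);
        return options;
    }

    boolean isCorrect(String option) {
        return correctAnswer.equals(option);
    }
}
```

Rejecting invalid questions in the constructor means that the rest of the game logic never has to handle a question without a correct answer.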
1.2. Quality Goals
Testability and extensibility are among the main goals, as they are among the most important requirements of some of our stakeholders (our teachers).
Goal | Description |
---|---|
Compatibility & Transferability | The system has to be compatible with different browsers and devices, ensuring an optimal experience regardless of the chosen platform. It also has to take different screen sizes into account |
Availability | The system shall be available 99% of the times a user tries to access it |
Usability & Operability | The application is intuitive: users can learn to use it fluently during their first session |
Performance | The application is able to handle a large number of users without degrading response times |
Maintainability | The system has to be designed to facilitate maintenance and updates |
Functional suitability | The system can fulfill its intended goal effectively by allowing users to register, play the game, ask the LLM for hints and access their statistics |
Reliability | The system has to be able to generate questions from Wikidata ensuring their accuracy and diversity. It should handle the storage of user registrations, logins and game data without errors |
Security | Information about a user can only be accessed by that user |
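To make the availability goal more tangible, it can be translated into a downtime budget. The sketch below assumes the goal is read as time-based uptime, which is only one possible interpretation of the table entry; the class `AvailabilityBudget` is a hypothetical name. With a 99% target, the budget is about 87.6 hours per 365-day year, or about 7.2 hours per 30-day month.

```java
// Hypothetical helper; assumes availability is measured as uptime over a period.
final class AvailabilityBudget {

    // Hours of downtime allowed in a period for an availability target such as 0.99.
    static double allowedDowntimeHours(double availability, double periodHours) {
        if (availability < 0.0 || availability > 1.0) {
            throw new IllegalArgumentException("Availability must be between 0 and 1");
        }
        return (1.0 - availability) * periodHours;
    }
}
```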
Other goals to take into account are:
-
We should follow a test-focused development approach
-
We should carefully choose the design to make sure that the application is maintainable
-
The system has to be easy to use and understand
1.3. Stakeholders
Role/Name | Expectations |
---|---|
Users | Test their knowledge. |
Software architecture professors | Provide their students, the development team, with practical experience in software architecture by letting them work in an environment that emulates the development flow of a real project. This means that they expect us to meet the quality and production standards of a real application. |
RTVE | Get an evolution of their quiz game and continue promoting their program all around the world |
ChattySW | Develop an application that stays within the given constraints and fulfills the requirements given by the client |
Wikidata | Offer a service that allows the development team to create the questions using an API |
2. Architecture Constraints
The application has some constraints: requirements imposed by the stakeholders or the environment that we must accept. They are divided into technical constraints, organizational constraints and conventions.
2.1. Technical Constraints
Constraint | Limitations |
---|---|
Web Frontend | The system will have at least a Web frontend |
Wikidata | We must use Wikidata to get information to generate the questions |
LLM | An LLM will be used to generate hints for the questions |
GitHub | We must use GitHub for the version control of the project |
2.2. Organizational Constraints
Constraints | Limitations |
---|---|
Team | The team for the project was chosen by the professors |
Weekly meetings | There must be at least one weekly meeting |
Public repository | The project source code will be available in a public repository |
2.3. Conventions
Conventions | Limitations |
---|---|
Documentation architecture | The documentation must follow arc42 |
Language | The documentation and, in general, the whole project will be written in English |
3. Context and Scope
3.1. Business Context
Figure: Business context
-
Player (user, person): the player interacts with our system by using the front-end or user interface of the application to play the question games.
-
Our system (internal system): allows the player to play the question games. It interacts with the Wikidata API and the LLM API through queries, to get the questions and options and to ask for hints related to the questions, respectively.
-
Wikidata API: API used to extract the information for generating the different questions, options and answers.
-
LLM API: API used to ask an AI for hints related to the question shown to the user.
3.2. Technical Context
Figure: Technical context
The application is deployed using Docker on Azure. The user interacts with the web app (the front-end of the application). It will be a Spring Boot project that contains the user interfaces and requests the data it needs from the other parts of the application, so it acts as a controller.
The services will be:
-
Authentication service
-
Users service (statistics)
-
Question generation service (WikiData)
-
Hint service (LLM)
-
Game service
Several services will interact with the database (MongoDB) to store information about different entities (Questions, Users…).
4. Solution Strategy
4.1. Technology breakdown
Goal / Requirement | Architectural approach | Conclusions |
---|---|---|
Implement LLM technologies in the game | Research shall be done to decide which LLM is most convenient for our game, and how to integrate these technologies properly so that hallucinations are avoided | The LLM to be used will be decided in later stages of development. This is possible because most LLMs are integrated in a very similar way, so the model can be changed with only small changes in the code. |
Develop the program using GitHub. | Research on branching strategies in GitHub shall be done. All team members shall be acquainted with GitHub | The development team will use the ‘GitHub Flow’ branching strategy, as it is the one that best fits the characteristics of the team and the project. |
The code should be easy to understand and to extend in the future | Design techniques and patterns shall be applied. | |
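One way to keep the choice of LLM open, as the table above intends, is to hide the provider behind a small interface so that swapping models only requires a new implementation. The sketch below illustrates the idea; `LlmClient`, `CannedLlmClient` and `HintGenerator` are hypothetical names, and the stub returns a fixed reply instead of calling any real provider API.

```java
import java.util.Objects;

// Hypothetical abstraction: every LLM provider is accessed through the same interface.
interface LlmClient {
    String complete(String prompt);
}

// Stub used for illustration and tests; a real implementation would call the provider's API.
final class CannedLlmClient implements LlmClient {
    private final String reply;

    CannedLlmClient(String reply) {
        this.reply = reply;
    }

    @Override
    public String complete(String prompt) {
        return reply;
    }
}

// The hint logic depends only on the interface, never on a concrete provider.
final class HintGenerator {
    private final LlmClient llm;

    HintGenerator(LlmClient llm) {
        this.llm = Objects.requireNonNull(llm);
    }

    String hintFor(String question) {
        String prompt = "Give one short hint, without revealing the answer, for: " + question;
        return llm.complete(prompt).trim();
    }
}
```

With this structure, changing the model means passing a different `LlmClient` to `HintGenerator`, and the stub makes the hint logic testable without network access.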
4.2. Organizational breakdown
So far we are meeting only once a week, but we expect that this may have to change as the project evolves. We are using a WhatsApp group as a secondary means of communication.
4.3. Quality decisions
Attribute pursued | Choice |
---|---|
5. Building Block View
5.1. Level 1
Overall view of the system and the parts in which it is divided as well as the external systems it connects to.
Figure: Level 1
Application represents the entire system implemented by our team; it is the part of the system with which the user interacts. LLM stands for Large Language Model, which is used to generate hints. Wikidata is the external system that provides the information for the questions in the game. The last part is the Database, which represents the connection to a database where application data, such as user data, is stored.
5.2. Level 2
Figure: Level 2
-
The Frontend is the part of the system that the user sees and interacts with. It will send requests to the GameService and UserService.
-
The GameService handles the game: it will ask for questions and hints, send them to the frontend and then process the answers. It will also interact with the UserService to keep track of scores and other user information related to the game.
-
The UserService is in charge of logging in the user and keeping track of all their relevant information by storing it and retrieving it from a database.
-
The HintService will be used to interact with an LLM in order to generate hints.
-
The QuestionService will be used to generate questions based on data extracted from WikiData.
-
The Database will store information used in the system, such as the user login details, previous games and scores. In the future we might also use it to store information about the questions.
6. Runtime View
6.1. Sign up
The following diagram shows how the sign up process of a user is done.
-
The client enters the credentials needed to create an account.
-
Check that the credentials have a valid format.
-
Check that the fields that need to be unique are unique.
-
Return whether they are unique.
-
Create the user and store it in the database.
-
Notify to what page the user should be redirected.
-
The user is redirected to the menu. The user can access their statistics (interaction with the User service) and play the game.
-
Notify that the credentials are invalid.
-
The errors are shown and the user isn’t registered.
Figure: Sign up diagram
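The sign-up steps above (format check, uniqueness check, user creation or error list) can be sketched as follows. This is only an illustration: `SignUpService`, the format rules and the in-memory set standing in for the database are all assumptions, since the document does not specify them.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Locale;
import java.util.Set;
import java.util.regex.Pattern;

// Hypothetical sign-up logic; the format rules below are illustrative assumptions.
final class SignUpService {
    private static final Pattern USERNAME = Pattern.compile("[A-Za-z0-9_]{3,20}");
    private static final int MIN_PASSWORD_LENGTH = 8;

    // Stand-in for the users collection in the database.
    private final Set<String> takenUsernames = new HashSet<>();

    // Returns the validation errors; an empty list means the user was created.
    List<String> signUp(String username, String password) {
        List<String> errors = new ArrayList<>();
        if (username == null || !USERNAME.matcher(username).matches()) {
            errors.add("Username must be 3-20 letters, digits or underscores");
        }
        if (password == null || password.length() < MIN_PASSWORD_LENGTH) {
            errors.add("Password must have at least " + MIN_PASSWORD_LENGTH + " characters");
        }
        // Uniqueness is checked only after the format is valid, as in the steps above.
        if (errors.isEmpty() && takenUsernames.contains(normalize(username))) {
            errors.add("Username is already taken");
        }
        if (errors.isEmpty()) {
            // A real service would also store a salted password hash, never the plain password.
            takenUsernames.add(normalize(username));
        }
        return errors;
    }

    private static String normalize(String username) {
        return username.toLowerCase(Locale.ROOT);
    }
}
```

Returning a list of errors instead of throwing on the first problem lets the front-end show every invalid field at once.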
6.2. Login
The following diagram shows how the login process of a user is done.
-
The client enters the credentials needed to log in.
-
Check that the credentials have a valid format.
-
Check that the credentials are valid for the user.
-
Return whether the credentials are valid.
-
The user is authenticated.
-
The user is redirected to the menu. The user can access their statistics (interaction with the User service) and play the game.
-
Notify that the credentials are invalid.
-
The errors are shown and the user isn’t logged in.
Figure: Login diagram
6.3. Game
The following diagram shows how the playing of the game is done.
-
The user starts a game.
-
The game starts.
-
See the Question Generation section below.
-
The data related to the question is returned.
-
User asks for a hint.
-
The timer is stopped.
-
See the Hint Generation section below.
-
The information related to the hint is returned.
-
User chooses one of the four options given.
-
The game service checks whether the chosen option is the correct answer to the question.
-
The game service returns whether the chosen option is correct.
-
The user is informed whether their guess was correct.
-
Store the results and information of the game in the database.
-
Returns the information of the game to the Application.
-
The information of the game is shown to the user.
Figure: Game diagram
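The answer check and the score change for one question can be sketched as below. The point values and the hint penalty are placeholders, because the document leaves the penalty undecided; `GameRound` and its constants are hypothetical names.

```java
// Hypothetical scoring for a single question; all point values are placeholder assumptions.
final class GameRound {
    static final int CORRECT_POINTS = 10;
    static final int WRONG_PENALTY = 5;
    static final int HINT_PENALTY = 2;

    private final String correctAnswer;
    private int hintsUsed = 0;

    GameRound(String correctAnswer) {
        this.correctAnswer = correctAnswer;
    }

    // Called each time the user asks for a hint on this question.
    void useHint() {
        hintsUsed++;
    }

    // Score change for this question: the base result minus the penalty for hints used.
    int answer(String chosenOption) {
        int base = correctAnswer.equals(chosenOption) ? CORRECT_POINTS : -WRONG_PENALTY;
        return base - hintsUsed * HINT_PENALTY;
    }
}
```

Keeping the values as named constants makes it easy to adjust them once the penalty has been decided.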
6.4. Question generation
The following diagrams show how the questions are generated. Two options are being considered for the moment.
6.4.1. Option 1
This option loads the questions during the game directly from Wikidata.
-
The game service asks for a question.
-
The question generation service requests data for creating the questions.
-
Wikidata returns the question, image and options for the question.
-
The question generation service returns all the information.
-
The game service stores all the information returned.
-
The game service returns the data to the Application
-
The question, image and options for the question are shown
Figure: Question generation 1 diagram
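As an illustration of the request that the question generation service could send, the sketch below builds a SPARQL query for the public Wikidata Query Service. The query (countries, their capitals and an optional image) is only an example of a question source, and `WikidataQueries` is a hypothetical class; the actual queries used by the project are not described in this document.

```java
import java.net.URLEncoder;
import java.nio.charset.StandardCharsets;

// Hypothetical helper that builds request URLs for the Wikidata Query Service.
final class WikidataQueries {
    // Public SPARQL endpoint of the Wikidata Query Service.
    static final String ENDPOINT = "https://query.wikidata.org/sparql";

    // Example query: countries (Q6256) with their capital (P36) and, if present, an image (P18).
    static String capitalsQuery(int limit) {
        return "SELECT ?countryLabel ?capitalLabel ?image WHERE {\n"
             + "  ?country wdt:P31 wd:Q6256 ;\n"
             + "           wdt:P36 ?capital .\n"
             + "  OPTIONAL { ?country wdt:P18 ?image . }\n"
             + "  SERVICE wikibase:label { bd:serviceParam wikibase:language \"en\" . }\n"
             + "}\n"
             + "LIMIT " + limit;
    }

    // URL for an HTTP GET request that asks for the results in JSON.
    static String requestUrl(String sparql) {
        return ENDPOINT + "?format=json&query=" + URLEncoder.encode(sparql, StandardCharsets.UTF_8);
    }
}
```

Each result row could then be turned into a question such as "What is the capital of X?", with the capitals of other rows used as incorrect options.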
6.4.2. Option 2
This option can offer better performance: the game itself does not depend on the Wikidata API, because the questions are loaded from the database during the game.
-
Run the question generation service to load the questions from Wikidata.
-
WikiData returns all the information related to the question.
-
The information is stored in the database for later use.
-
The game service asks the database for a question
-
The database returns all the information of a question
-
The game service returns the data to the Application
-
The question, image and options for the question are shown
Figure: Question generation 2 diagram
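The separation in Option 2 between loading questions ahead of time and serving them during a game can be sketched as follows. An in-memory list stands in for the database, and `QuestionPool` is a hypothetical name; questions are represented as plain strings to keep the example short.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Optional;
import java.util.Random;

// In-memory stand-in for the stored questions; names are illustrative assumptions.
final class QuestionPool {
    private final List<String> questions = new ArrayList<>();
    private final Random random;

    QuestionPool(Random random) {
        this.random = random;
    }

    // Called by the question generation service, outside of any game.
    void preload(List<String> generated) {
        questions.addAll(generated);
    }

    // Called by the game service during a game; no Wikidata request is involved.
    Optional<String> nextQuestion() {
        if (questions.isEmpty()) {
            return Optional.empty();
        }
        return Optional.of(questions.get(random.nextInt(questions.size())));
    }
}
```

Returning an empty `Optional` makes the case of a pool that has not been loaded yet explicit, so the game service can decide how to react.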
6.5. Hint generation
The following diagram shows how the generation of the hints is done.
-
The user asks for a hint for the current question.
-
The application requests a hint for the current question.
-
The hint service gives some context to the LLM.
-
The LLM returns the clue.
-
The hint service returns the clue.
-
The clue is shown to the user.
Figure: Hint generation diagram
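The step in which the hint service gives context to the LLM could look like the sketch below. It assumes that the context contains the question, the options and the correct answer, and that the reply is checked before it is shown; both the prompt wording and the class `HintPrompts` are hypothetical, and this is only one possible way to reduce misleading clues.

```java
import java.util.List;
import java.util.Locale;

// Hypothetical prompt construction and reply check for the hint service.
final class HintPrompts {

    // Builds the context sent to the LLM; the wording is illustrative only.
    static String build(String question, List<String> options, String correctAnswer) {
        StringBuilder sb = new StringBuilder();
        sb.append("You are a quiz assistant. Give one short clue for the question below.\n");
        sb.append("Base the clue only on the correct answer and never state it.\n");
        sb.append("Question: ").append(question).append('\n');
        sb.append("Options: ").append(String.join(", ", options)).append('\n');
        sb.append("Correct answer: ").append(correctAnswer).append('\n');
        return sb.toString();
    }

    // Basic guard applied to the reply: a clue that names the answer should be rejected.
    static boolean revealsAnswer(String clue, String correctAnswer) {
        return clue.toLowerCase(Locale.ROOT).contains(correctAnswer.toLowerCase(Locale.ROOT));
    }
}
```

Giving the model the correct answer as ground truth is intended to keep clues factual, while the guard catches the simplest case of a clue that gives the answer away.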
7. Deployment View
7.1. Infrastructure Level 1
Figure: Deployment
7.2. Motivation
A well-defined deployment view provides a clear understanding of how different system components interact and operate within the infrastructure. This ensures smooth functionality, scalability, and maintainability.
7.3. Content
The deployment view represents the technical infrastructure that supports the system’s execution. The diagram offers a high-level visualization of key components and their interactions.
7.4. Deployment Architecture Components
-
User: The end-user interacts with the system through the web application.
-
Web Application: The central interface that connects users to the system.
-
Application Layer: Manages multiple services, including:
-
LLM Service: Interfaces with an external Large Language Model (Gemini) to process queries.
-
Question Service: Handles question-related logic, using the Wikidata API and interacting with the database.
-
User Service: Manages user-related functionalities such as authentication and profile management.
-
Database Layer: Uses MongoDB as the primary database for storing user and question-related data.
8. Cross-cutting Concepts
8.1. <Concept 1>
<explanation>
8.2. <Concept 2>
<explanation>
…
8.3. <Concept n>
<explanation>
9. Architecture Decisions
10. Quality Requirements
10.1. Quality Tree
10.2. Quality Scenarios
11. Risks and Technical Debts
Risk | Description |
---|---|
Integration with Wikidata | Extracting meaningful and accurate data from Wikidata for generating questions and hints may be challenging. Data inconsistencies, missing attributes, or outdated information could lead to incorrect questions or hints. |
LLM Accuracy | The LLM may generate incorrect or misleading hints (hallucinations). |
Scalability of the System | As more users register and play, the backend (Wikidata API, LLM API, and database) may face performance bottlenecks. |
User Authentication and Data Privacy | User registration and historical data storage introduce potential security vulnerabilities (e.g., data leaks, unauthorized access). Compliance with GDPR or other privacy regulations is also a concern. |
12. Glossary
Term | Definition |
---|---|
Wikidata | Large database containing information on many subjects that is free to access and open to modifications. We will use it to generate questions for the game. |
LLM | A Large Language Model is a pre-trained language model with ample knowledge about many different topics that can deliver answers in a way similar to human speech. It will be used to provide clues to the user. |
Spring | Popular framework for the development of web applications that provides different modules for common services such as authentication, access to databases, security, etc. |
Framework | Set of concepts and practices used for solving a problem that can be used as a template for solving similar problems. |
Gemini | An LLM developed by Google. It will be one of the LLMs used in our application. |
GitLab flow | Branching strategy with 3 different branch types: a Master branch with stable code for release, a Deployment branch where errors are still being fixed and a Feature branch for each feature developed for the system. |
MongoDB | An open-source NoSQL database system. Since it is non-relational, the data isn’t stored in tables but in BSON, a structure similar to JSON. |
NoSQL databases | Umbrella term used for database systems that store data in a non-relational way. In relational databases, data is stored in tables and connections are established through relationships. NoSQL databases don’t follow this format and may relate the elements of the database using any other method. |
JSON | Text format used for transferring data. Information is stored in pairs of “name” and “value”, with “name” acting as an identifier for the attribute and “value” holding its current value. |
BSON | (Binary JSON) Data format used by MongoDB. It includes all JSON data structure types and adds support for types including dates, different size integers, ObjectIds, and binary data. |
Docker | Platform for developers and system administrators to build and run applications in containers. It provides an extra layer of abstraction compared to virtual machines, since containers do not each include a guest operating system but run on top of Docker. It also provides orchestration between containers without the big files needed for VMs. |
Container | Executable package that encloses an application. Multiple containers can form a complex architecture while remaining isolated from each other. It is a live instance of an image, allowing it to be shared and stored. |
Image | File that includes everything necessary to run an application such as code, runtime system, libraries, runtime variables and configuration files. This will ensure that running the application is consistent independently of the environment. |