About arc42

arc42, the template for documentation of software and system architecture.

Template Version 8.2 EN. (based upon AsciiDoc version), January 2023

Created, maintained and © by Dr. Peter Hruschka, Dr. Gernot Starke and contributors. See https://arc42.org.


Note

This version of the template contains some help and explanations. It is meant for getting familiar with arc42 and understanding its concepts. To document your own system, you are better off using the plain version.

1. Introduction and Goals

The application aims to be a web-based question-and-answer game following the format of the TV show ‘Saber y Ganar’. The game shows the player a question followed by a fixed number of possible answers; the information used in the questions and answers is obtained from Wikidata. The player answers by clicking on one of the answers: clicking a correct answer increases their score, and clicking an incorrect answer decreases it.

Up to this point, the description matches the project implemented by the students who took the software architecture course last year. This year, however, there is a new feature: a clue system that uses an LLM. Once players have seen the question, they can decide whether or not to use clues. Using a clue results in a penalty that is still to be decided.

Some goals established for the application are (in no particular order):

  • The game should implement all the functionality of the games developed the previous year.

  • The game should integrate a hint system that uses LLM technology.

  • There should be a good quantity of questions and answers to be asked in the game (considering that the information for these Q&As must be taken from Wikidata).

  • The application shall provide a login that allows users to retrieve historical data about their past games.
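The scoring behaviour described in this section (correct answers raise the score, incorrect answers lower it, and using a clue carries a penalty) can be sketched as follows. This is only an illustration: the point values and the size of the penalty are assumptions, since the penalty is still to be decided, and the names `ScoreBoard` and `apply_answer` are hypothetical.

```python
from dataclasses import dataclass

# Assumed values: the document does not fix the points or the hint penalty.
CORRECT_POINTS = 10
INCORRECT_POINTS = -5
HINT_PENALTY = 3


@dataclass
class ScoreBoard:
    score: int = 0

    def apply_answer(self, correct: bool, used_hint: bool = False) -> int:
        """Update the score for one answered question and return the new total."""
        self.score += CORRECT_POINTS if correct else INCORRECT_POINTS
        if used_hint:
            self.score -= HINT_PENALTY
        return self.score


board = ScoreBoard()
board.apply_answer(correct=True)                  # total is now 10
board.apply_answer(correct=True, used_hint=True)  # total is now 17
board.apply_answer(correct=False)                 # total is now 12
```

Keeping the penalty as a single constant makes it easy to change once the team settles on a value.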

1.1. Requirements Overview

1.1.1. The functional requirements

About the Game service:

  • The game shall correctly show the questions and all the possible answers.

  • The game shall manage the answers given by the player as expected.

  • The user interaction with the clue system shall be managed.

  • The application should be accessible through the web.

  • There shall be a countdown timer for each question in the game.

About the Hint service:

  • The user should be able to ask the LLM for hints on each question.

  • The system shall give access to information about questions through an API.

About the Question service:

  • Each question must have a single correct answer and at least one incorrect answer.

  • The questions shall be obtained from Wikidata.

  • The questions shall come with an image obtained from Wikidata.
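The Question service requirements above can be expressed as a small data model that rejects malformed questions. This is a sketch under assumptions: the class name `Question`, its field names and the example URL are hypothetical and not taken from the project.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Question:
    text: str
    correct_answer: str           # exactly one correct answer
    incorrect_answers: List[str]  # at least one incorrect answer
    image_url: str                # image obtained from Wikidata

    def __post_init__(self) -> None:
        if not self.incorrect_answers:
            raise ValueError("a question needs at least one incorrect answer")
        if self.correct_answer in self.incorrect_answers:
            raise ValueError("the correct answer must not also be listed as incorrect")

    def options(self) -> List[str]:
        """All answers shown to the player (not shuffled here)."""
        return [self.correct_answer, *self.incorrect_answers]

    def is_correct(self, choice: str) -> bool:
        return choice == self.correct_answer


q = Question(
    text="What is the capital of France?",
    correct_answer="Paris",
    incorrect_answers=["Rome", "Madrid", "Lisbon"],
    image_url="https://example.org/paris.jpg",  # placeholder URL
)
```

Validating in the constructor means an invalid question can never reach the Game service.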

About the Users service:

  • The system should give non-registered or unidentified users the option to register in the application.

  • There shall be a record of the historical data of the games from logged-in players (number of games, number of failed questions, …).
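As an illustration of the historical data mentioned above, the sketch below keeps per-player counters and derives an accuracy figure from them. The class and field names are hypothetical; only "number of games" and "number of failed questions" come from the requirement, and the count of correct answers is an assumed addition.

```python
from dataclasses import dataclass


@dataclass
class PlayerStats:
    games_played: int = 0
    correct_answers: int = 0   # assumed extra counter, not named in the requirement
    failed_questions: int = 0

    def record_game(self, correct: int, failed: int) -> None:
        """Add the result of one finished game to the player's history."""
        self.games_played += 1
        self.correct_answers += correct
        self.failed_questions += failed

    def accuracy(self) -> float:
        """Share of correctly answered questions, or 0.0 if nothing was answered."""
        total = self.correct_answers + self.failed_questions
        return self.correct_answers / total if total else 0.0


stats = PlayerStats()
stats.record_game(correct=7, failed=3)
stats.record_game(correct=5, failed=5)
```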

1.1.2. The non-functional requirements

About the Hint service:

  • A specific approach should be used to avoid incorrect answers and hallucinations from the LLM.

  • The LLM clue system should be implemented so that it gives high-quality clues.

1.2. Quality Goals

About the Hint service: implementing tests and keeping the product extensible are among the main goals, as they are among the most important requirements of some of our stakeholders (our teachers).

Goal Description

Compatibility & Transferability

The system has to be compatible with different browsers (at least Chrome, Firefox and Opera) and devices while ensuring an optimal experience regardless of the chosen platform. It also has to take into account different screen sizes (computer/laptop size, tablet size and phone size)

Availability

The system shall be available 99% of the times a user tries to access it.

Usability & Operability

The application is intuitive: users can learn to use it fluently during their first use. The users should be able to play the game and complete it.

Performance

The application is able to handle a large number of users (100) without degrading response times.

Maintainability

The system has to be designed to facilitate maintenance and updates.

Functional suitability

The system can fulfill its intended goal effectively by allowing the users to register, play the game, ask the LLM for hints and access their statistics.

Reliability

The system has to be able to generate questions from Wikidata ensuring their accuracy and diversity. It should handle the storage of user registrations, logins and game data without errors

Security

The information about a user can only be accessed by that user.

Other goals to take into account are:

  • We should follow a test-focused development approach.

  • We should carefully choose the design to make sure that the application is maintainable

  • The system has to be easy to use and understand

1.3. Stakeholders

Role/Name Expectations

Users

Test their knowledge.

Software architecture professors

Provide their students, the development team, with practical experience regarding software architecture by allowing them to work in an environment that emulates the development flow of a real project. This means that they expect us to meet the quality and production standards of a real application.

RTVE

Get an evolution of their quiz game and continue promoting their program all around the world

ChattySW

Develop an application that stays within the given constraints and fulfills the requirements given by the client.

Wikidata

Offer a service allowing the development team to create the questions using an API

2. Architecture Constraints

The application has some constraints: requirements of the stakeholders or the environment that we must accept. They are divided into technical constraints, organizational constraints and convention constraints.

2.1. Technical Constraints

Constraint Limitations

Web Frontend

The system will have at least a Web frontend

Deployment

The application should be deployed

Wikidata

We must use Wikidata to get information to generate the questions

LLM

An LLM will be used to generate hints for the questions

GitHub

We must use GitHub for the version control of the project

2.2. Organizational Constraints

Constraints Limitations

Team

The team for the project was chosen by the professors

Weekly meetings

There must be at least one weekly meeting

Public repository

The project source code will be available in a public repository

2.3. Conventions

Conventions Limitations

Documentation architecture

The documentation must follow arc42

Language

The documentation and, in general, the project will be written in English

3. Context and Scope

3.1. Business Context

Business context
  • User: the player interacts with our application through its front-end to play the question game, sign up and log in.

  • Application: our application allows the player to play the question game. It queries the WikiData API to get the questions and options, and the LLM API to ask for hints related to the questions. It also authenticates the player and lets them see their statistics.

  • WikiData API: API used to extract the information for generating the different questions, options and answers.

  • Database: database for storing user information.

  • LLM API: API used to ask an AI for hints related to the question shown to the user.

3.2. Technical Context

Technical context

The whole application is deployed using Docker with Azure. The user interacts with the front-end of the application. The front-end will be a SpringBoot project that contains the user interfaces and requests the data it needs from the other parts of the application, so it acts as a Controller.

The services will be:

  • Users service (login and statistics)

  • Question service (WikiData)

  • Hint service (LLM)

  • Game service

The Users service will interact with the database (MongoDB) to store the information needed for user authentication and the statistics for the games each user has played.

4. Solution Strategy

4.1. Technology breakdown

4.1.1. Backend

Decision Reasons

Use SpringBoot in the project

We are going to use SpringBoot in the project because:

- It gives us a lot of functionality already implemented so that we don’t have to implement it ourselves.

- It allows us to use Java, which we are very used to and prefer over JavaScript.

MongoDB as a database system

We will use MongoDB because:

- It’s already integrated in the user and authentication service

- It’s the most popular non-relational database, and it has a relatively simple syntax

4.1.2. Frontend

Decision Reasons

Use React for developing the project

We are going to use React to develop the frontend because:

- React’s component-based structure allows for modular and reusable code.

- React has a vast ecosystem of libraries, tools, and community resources.

4.1.3. LLM

Decision Reasons

Gemini as the LLM for generating hints

We will use (at least at first) Gemini because:

- It offers free API usage with limitations, which is enough for our application.

- It’s already integrated in the LLM service.

- It has a way to handle hallucinations and errors.

4.1.4. Deployment

Decision Reasons

Docker with Azure for deploying the application

We will use Docker with Azure because:

- Docker containers encapsulate the application and its dependencies, ensuring consistency across different environments (development, testing, production).

- This portability allows seamless deployment on Azure, regardless of the underlying infrastructure.

4.2. Toplevel decomposition

Decision Reasons

MVC architectural pattern

We will use MVC for the project because:

- It allows us to divide the application into model, view and controller, which are then less coupled

Microservices architecture

We will use the microservices architecture because:

- We can divide the project into independent services (other projects) that can be reused

4.3. Organizational breakdown

Scrum, Pull requests revision,

Decision Reasons

Weekly meeting

So far we have been meeting only once a week because:

- We can work together in class in person

- We can stay up to date with the work done, discuss future decisions and see what problems each teammate has

- Everyone has a moment to express themselves

WhatsApp group

We are using a WhatsApp group as a secondary way of communication because it’s a common application for all of us and it can be accessed anywhere.

GitLab Flow as branching strategy

The branching strategy ‘GitLab Flow’ will be used by the development team because:

- It is the branching strategy that best fits the characteristics of the team and the project.

- It’s simple as it only has 3-4 types of branches. It’s less complex than Gitflow but more structured than GitHub Flow.

- It’s well divided, and allows simultaneous work on different features

4.4. Quality decisions

Attribute pursued Choice

5. Building Block View

5.1. Level 1

Overall view of the system and the parts in which it is divided as well as the external systems it connects to.

Level 1

Application represents the entire system implemented by our team. This is the part of the system with which the user interacts.

LLM stands for Large Language Model, which is used to generate hints.

Wikidata is the external system that provides information about the questions in the game.

The last part is the Database, which represents the connection to a database where data about the program, such as user data, is stored.

5.2. Level 2

Level 2
  • The Frontend is the part of the system that the user sees and interacts with. It will send requests to the GameService and UserService.

  • The GameService handles the game. It will ask for questions and hints, send them to the Frontend, and then process the answers. It will also interact with the UserService to keep track of scores and other user information related to the game.

  • The UserService is in charge of logging in the user and keeping track of all their relevant information by storing it and retrieving it from the Database.

  • The HintService will be used to interact with an LLM in order to generate hints.

  • The QuestionService will be used to generate questions based on data extracted from WikiData.

  • The Database will store information used in the system such as the user login details, the previous games and scores.
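The relationship between the building blocks above can be sketched in code. The snippet is a language-neutral illustration written in Python rather than the project's actual SpringBoot code; the method names (`next_question`, `hint_for`, `start_round`) and the fake services are hypothetical and exist only to show how the GameService could coordinate the QuestionService and HintService behind interfaces.

```python
from typing import Optional, Protocol


class QuestionService(Protocol):
    def next_question(self) -> dict: ...


class HintService(Protocol):
    def hint_for(self, question_text: str) -> str: ...


class GameService:
    """Coordinates the other services; the Frontend only talks to this class."""

    def __init__(self, questions: QuestionService, hints: HintService) -> None:
        self._questions = questions
        self._hints = hints
        self._current: Optional[dict] = None

    def start_round(self) -> dict:
        self._current = self._questions.next_question()
        # The correct answer stays on the server side.
        return {"text": self._current["text"], "options": self._current["options"]}

    def request_hint(self) -> str:
        return self._hints.hint_for(self._current["text"])

    def answer(self, choice: str) -> bool:
        return choice == self._current["correct"]


# In-memory stand-ins for the real services (assumed data).
class FakeQuestions:
    def next_question(self) -> dict:
        return {"text": "Capital of Spain?", "options": ["Madrid", "Rome"], "correct": "Madrid"}


class FakeHints:
    def hint_for(self, question_text: str) -> str:
        return "It hosts the Prado Museum."


game = GameService(FakeQuestions(), FakeHints())
round_data = game.start_round()
```

Depending on interfaces rather than concrete services keeps the GameService testable without calling Wikidata or the LLM.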

6. Runtime View

6.1. Sign up

The following diagram shows how the sign up process of a user is done.

  1. The client enters the credentials needed to create an account.

  2. Check that the credentials are valid in format.

  3. Check that the fields that need to be unique are unique.

  4. Return whether they are unique.

  5. Create the user and store it in the Database.

  6. Notify to which page the user should be redirected.

  7. The user is redirected to the menu. The user can access their statistics (interaction with the User service) and play the game.

  8. Notify that the credentials are invalid.

  9. The errors are shown and the user isn’t registered.

Sign up diagram
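The sign-up steps above can be sketched as a single function that first checks the format, then checks uniqueness, and only then creates the user. The validation rules (username pattern, minimum password length) and all names are assumptions made for illustration; an in-memory set stands in for the Database, and storing the credentials themselves is omitted.

```python
import re
from typing import List, Set

USERNAME_RE = re.compile(r"[A-Za-z0-9_]{3,20}")  # assumed username rule
MIN_PASSWORD_LENGTH = 8                          # assumed password rule


def validate_format(username: str, password: str) -> List[str]:
    """Step 2: return the list of format errors (empty if the input is valid)."""
    errors = []
    if not USERNAME_RE.fullmatch(username):
        errors.append("username must be 3-20 letters, digits or underscores")
    if len(password) < MIN_PASSWORD_LENGTH:
        errors.append("password must have at least 8 characters")
    return errors


def sign_up(users: Set[str], username: str, password: str) -> List[str]:
    """Return the errors to show; an empty list means the user was created."""
    errors = validate_format(username, password)
    if not errors and username in users:   # steps 3-4: uniqueness check
        errors.append("username is already taken")
    if errors:
        return errors                      # steps 8-9: invalid credentials
    users.add(username)                    # step 5: create and store the user
    return []                              # steps 6-7: caller redirects to the menu


registered: Set[str] = set()
first_attempt = sign_up(registered, "alice_01", "s3cretpass")
second_attempt = sign_up(registered, "alice_01", "s3cretpass")
```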

6.2. Login

The following diagram shows how the login process of a user is done.

  1. The client enters the credentials needed to log in.

  2. Check that the credentials are valid in format.

  3. Check that the credentials are valid for the user.

  4. Return whether the credentials are valid.

  5. The user is authenticated.

  6. The user is redirected to the menu. The user can access their statistics (interaction with the UserService) and play the game.

  7. Notify that the credentials are invalid.

  8. The errors are shown and the user isn’t logged in.

Login diagram
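Step 3 of the login flow (checking that the credentials are valid for the user) is commonly done by comparing salted password hashes rather than plain passwords. The sketch below shows one way to do this with Python's standard library. It is not the project's UserService: the function names, the in-memory store and the iteration count are assumptions.

```python
import hashlib
import hmac
import os
from typing import Dict, Tuple

ITERATIONS = 100_000  # assumed work factor


def hash_password(password: str, salt: bytes) -> bytes:
    """Derive a key from the password with PBKDF2-HMAC-SHA256."""
    return hashlib.pbkdf2_hmac("sha256", password.encode("utf-8"), salt, ITERATIONS)


def register(store: Dict[str, Tuple[bytes, bytes]], username: str, password: str) -> None:
    """Store a random salt and the derived hash, never the password itself."""
    salt = os.urandom(16)
    store[username] = (salt, hash_password(password, salt))


def login(store: Dict[str, Tuple[bytes, bytes]], username: str, password: str) -> bool:
    """Return True only if the user exists and the password matches."""
    record = store.get(username)
    if record is None:
        return False
    salt, expected = record
    # Constant-time comparison avoids leaking information through timing.
    return hmac.compare_digest(hash_password(password, salt), expected)


credentials: Dict[str, Tuple[bytes, bytes]] = {}
register(credentials, "alice", "correct horse")
```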

6.3. Game

The following diagram shows how the playing of the game is done.

  1. The user chooses the game option in the menu.

  2. The game starts.

  3. See the Question Generation section below.

  4. The data related to the question is returned.

  5. The user asks for a hint.

  6. The timer is stopped.

  7. See the Hint Generation section below.

  8. The information related to the hint is returned.

  9. The user chooses an option from the four given.

  10. The GameService checks whether the option picked is the answer to the question.

  11. The GameService returns whether the option picked is correct.

  12. The user is informed whether their guess was correct.

  13. Store the results and information of the game in the database.

  14. Return the information of the game to the FrontEnd.

  15. The information of the game is shown to the user.

Game diagram
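Step 6 above stops the question timer while a hint is being generated. A countdown that can be paused could look like the sketch below. It assumes the timer resumes once the hint has been returned, which the steps do not state, and the class name `CountdownTimer` is hypothetical. The current time is passed in explicitly so the behaviour can be checked without waiting.

```python
from typing import Optional


class CountdownTimer:
    """Countdown that can be paused, e.g. while a hint is being generated."""

    def __init__(self, seconds: float, now: float) -> None:
        self._remaining = float(seconds)
        self._started_at: Optional[float] = now  # None while paused

    def remaining(self, now: float) -> float:
        if self._started_at is None:
            return self._remaining
        return max(0.0, self._remaining - (now - self._started_at))

    def pause(self, now: float) -> None:
        if self._started_at is not None:
            self._remaining = self.remaining(now)
            self._started_at = None

    def resume(self, now: float) -> None:
        if self._started_at is None:
            self._started_at = now

    def expired(self, now: float) -> bool:
        return self.remaining(now) <= 0


timer = CountdownTimer(30, now=0)
timer.pause(now=10)    # user asks for a hint after 10 seconds
timer.resume(now=25)   # hint arrives 15 seconds later; those seconds are not counted
```

In a real service the `now` argument would come from a monotonic clock such as `time.monotonic()`.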

6.4. Question generation

The following diagram shows how the generation of the questions is done. This option loads the questions during the game directly from Wikidata.

  1. The GameService asks for a question.

  2. The QuestionService requests data for creating the questions.

  3. Wikidata returns the question, image and options for the question.

  4. The QuestionService returns all the information.

  5. The GameService stores all the information returned.

  6. The GameService returns the data to the FrontEnd.

  7. The question, image and options for the question are shown.

Question generation 1 diagram
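As an illustration of steps 2 to 4, the sketch below builds a request for the public Wikidata SPARQL endpoint and turns result bindings into questions with one correct answer and three distractors. The question template (capitals of countries), the function names and the sample data are assumptions. The snippet does not perform the HTTP request itself and works on bindings in the standard SPARQL JSON results shape.

```python
import random
from urllib.parse import urlencode

WIKIDATA_SPARQL = "https://query.wikidata.org/sparql"

CAPITALS_QUERY = """
SELECT ?countryLabel ?capitalLabel ?flag WHERE {
  ?country wdt:P31 wd:Q6256;   # instance of: country
           wdt:P36 ?capital;   # capital
           wdt:P41 ?flag.      # flag image
  SERVICE wikibase:label { bd:serviceParam wikibase:language "en". }
}
LIMIT 50
"""


def request_url(query: str) -> str:
    """URL a client would GET to receive the results as JSON."""
    return WIKIDATA_SPARQL + "?" + urlencode({"query": query, "format": "json"})


def build_questions(bindings: list, rng: random.Random, n_options: int = 4) -> list:
    """Create one question per binding; needs at least n_options distinct capitals."""
    capitals = sorted({b["capitalLabel"]["value"] for b in bindings})
    questions = []
    for b in bindings:
        correct = b["capitalLabel"]["value"]
        distractors = rng.sample([c for c in capitals if c != correct], n_options - 1)
        options = distractors + [correct]
        rng.shuffle(options)
        questions.append({
            "text": f"What is the capital of {b['countryLabel']['value']}?",
            "image": b["flag"]["value"],
            "options": options,
            "correct": correct,
        })
    return questions


# Hand-written sample in the shape of response["results"]["bindings"].
sample = [
    {"countryLabel": {"value": name}, "capitalLabel": {"value": cap},
     "flag": {"value": f"https://example.org/{name}.svg"}}
    for name, cap in [("France", "Paris"), ("Italy", "Rome"),
                      ("Spain", "Madrid"), ("Portugal", "Lisbon")]
]
questions = build_questions(sample, random.Random(0))
```

Because this option loads questions during the game, a slow or failing Wikidata response delays the round; caching generated questions would be one way to reduce that dependency.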

6.5. Hint generation

The following diagram shows how the generation of the hints is done.

  1. After the user requests it, the GameService asks for a hint for the current question.

  2. The HintService gives some context to the LLM.

  3. The LLM returns the clue.

  4. The HintService returns the clue.

Hint generation diagram
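Step 2 (giving context to the LLM) and the requirement to avoid incorrect or revealing clues can be illustrated with a prompt builder and a simple output check. This is one possible approach and not necessarily the one used by the project's HintService; the prompt wording, the function names and the fallback text are assumptions, and no real LLM API is called.

```python
from typing import List


def build_hint_prompt(question: str, options: List[str], correct: str) -> str:
    """Context sent to the LLM: the question, its options and the rules for the clue."""
    return (
        "You are helping a quiz player.\n"
        f"Question: {question}\n"
        f"Options: {', '.join(options)}\n"
        f"Correct answer (never reveal it): {correct}\n"
        "Give one short clue that points towards the correct answer "
        "without naming any of the options."
    )


def safe_hint(raw_hint: str, options: List[str], fallback: str = "No hint available.") -> str:
    """Reject clues that are empty or that mention one of the options verbatim."""
    text = raw_hint.strip()
    lowered = text.lower()
    if not text or any(opt.lower() in lowered for opt in options):
        return fallback
    return text


options = ["Paris", "Rome", "Madrid", "Lisbon"]
prompt = build_hint_prompt("What is the capital of France?", options, "Paris")
hint = safe_hint("It is crossed by the Seine.", options)
```

The substring check is deliberately conservative: it may reject a harmless clue that happens to contain an option's name, but it never lets the answer through verbatim.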

7. Deployment View

7.1. Infrastructure Level 1

Deployment

7.2. Deployment Architecture Components

  • User: The end-user interacts with the system through the web application.

  • Application: The central interface that connects users to the system.

  • GameService: Manages multiple services, including:

    • Hint Service: Interfaces with an external Large Language Model (Gemini) to process queries.

    • Question Service: Handles question-related logic and interacts with the database using the Wikidata API.

    • User Service: Manages user-related functionalities such as authentication and profile management.

  • Database: Uses MongoDB as the primary database for storing user and question-related data.

8. Cross-cutting Concepts


9. Architecture Decisions

9.1. Client/Server architecture

The architecture consists of a main application, split into a client and a server, and several servers for the external services (users, WikiData, LLM, etc.) that act as data APIs for the main application server. These service servers respond to requests by returning data that the main application processes and passes to the view.

10. Quality Requirements


10.1. Quality Tree


10.2. Quality Scenarios


11. Risks and Technical Debts

Risk Description

Integration with Wikidata

Extracting meaningful and accurate data from Wikidata for generating questions and hints may be challenging. Data inconsistencies, missing attributes, or outdated information could lead to incorrect questions or hints.

LLM Accuracy

The LLM may generate incorrect or misleading hints (hallucinations).

Scalability of the System

As more users register and play, the backend (Wikidata API, LLM API, and database) may face performance bottlenecks.

User Authentication and Data Privacy

User registration and historical data storage introduce potential security vulnerabilities (e.g., data leaks, unauthorized access). Compliance with GDPR or other privacy regulations is also a concern.

12. Glossary

Term Definition

WikiData

Large database containing information about many subjects that’s free to access and open to modifications. We will use it to generate questions for the game.

LLM

A Large Language Model is a pre-trained language model with ample knowledge about many different topics that can deliver answers in a way similar to human speech. It will be used to provide clues to the user.

Spring

Popular framework for the development of web applications that provides different modules for common services such as authentication, access to databases, security, etc.

Framework

Set of concepts and practices used for solving a problem that can be used as a template for solving similar problems.

Gemini

An LLM developed by Google. It will be one of the LLMs used in our application.

Gitlab flow

Branching strategy with 3 different branch types: a Master branch with stable code for release, a Deployment branch where errors are still being fixed, and a Feature branch for each feature developed for the system.

MongoDB

An open-source NoSQL database system. Since it is non-relational, the data isn’t stored in tables but in BSON, a structure similar to JSON.

NoSQL databases

Umbrella term used for database systems that store data in a non-relational way. In relational databases, data is stored in tables and connections are established through relationships. NoSQL databases don’t follow this format and may relate the elements of the database using any other method.

JSON

Text format used for transferring data. Information is stored in pairs of “name” and “value”, with “name” acting as an identifier for the attribute and “value” its current value.

BSON

(Binary JSON) Data format used by MongoDB. It includes all JSON data structure types and adds support for types including dates, different size integers, ObjectIds, and binary data.

Docker

Platform for developers and system administrators. Provides an extra layer of abstraction compared to Virtual Machines since the containers themselves are not found on the guest operating system but on Docker. It also provides orchestration between containers without the big files needed for VMs.

Container

Executable package that encloses an application. Multiple containers can form a complex architecture while remaining isolated from each other. It is a live instance of an image, allowing it to be shared and stored.

Image

File that includes everything necessary to run an application, such as code, runtime system, libraries, runtime variables and configuration files. This ensures that the application runs consistently regardless of the environment.