About arc42

arc42, the template for documentation of software and system architecture.

Template Version 8.2 EN. (based upon AsciiDoc version), January 2023

Created, maintained and © by Dr. Peter Hruschka, Dr. Gernot Starke and contributors. See https://arc42.org.


1. Introduction and Goals

The application aims to be a web-based question-and-answer game that follows the format of the TV show ‘Saber y Ganar’. The game shows the player a question followed by a fixed number of possible answers (the information used in the questions and answers is obtained from Wikidata, https://www.wikidata.org). The player answers by clicking on one of the options: a correct answer increases their score and an incorrect answer decreases it.

Up to this point, the description matches the project implemented by the students who took the software architecture course last year. This year, in addition, there is a new feature: a clue system based on an LLM. This system lets the user decide whether or not to use clues once they have seen the question. Using a clue incurs a penalty, which is still to be decided. Some goals established for the application are (in no particular order):

  • The game should implement all the functionality of the games developed the previous year.

  • The game should integrate a clue system that uses LLM technology.

  • There should be a large enough pool of questions and answers for the game (bearing in mind that the information for these Q&As must be taken from Wikidata).

  • The application shall provide a login that allows users to retrieve historical data about their past games.

1.1. Requirements Overview

  • The system should give non-registered or unidentified users the option to register to the application

  • The user should be able to ask the LLM for hints on each question

  • A specific approach should be used to avoid incorrect answers and hallucinations from the LLM

  • Each question must have a single correct answer and at least one incorrect answer

  • The game shall correctly show the questions and all the possible answers.

  • The game shall manage the answers given by the player as expected.

  • The LLM clue system should be implemented so it gives quality clues.

  • The user interaction with the clue system shall be managed.

  • The questions shall be obtained from Wikidata.

  • The questions shall come with an image obtained from Wikidata.

  • The application should be accessible through the web.

  • There shall be a record of the historical data of the games from logged in players (number of games, number of failed questions, …).

  • There shall be a time count down in the game questions.

  • The system shall give access to information about questions through an API.
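The requirement that each question has a single correct answer and at least one incorrect answer can be enforced at the moment a question is built. The following is a minimal sketch under assumptions: the names `Question`, `correctAnswer`, `incorrectAnswers` and `shuffledOptions` are illustrative and are not taken from the project's code.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Hypothetical question model; all names here are assumptions made for illustration.
record Question(String text, String imageUrl, String correctAnswer, List<String> incorrectAnswers) {

    // Compact constructor: rejects questions that break the "one correct, at least one incorrect" rule.
    Question {
        if (correctAnswer == null || correctAnswer.isBlank()) {
            throw new IllegalArgumentException("A question needs exactly one correct answer");
        }
        if (incorrectAnswers == null || incorrectAnswers.isEmpty()) {
            throw new IllegalArgumentException("A question needs at least one incorrect answer");
        }
        if (incorrectAnswers.contains(correctAnswer)) {
            throw new IllegalArgumentException("The correct answer must not appear among the incorrect ones");
        }
        incorrectAnswers = List.copyOf(incorrectAnswers); // defensive, immutable copy
    }

    // All options in random order, as they could be shown to the player.
    List<String> shuffledOptions() {
        List<String> options = new ArrayList<>(incorrectAnswers);
        options.add(correctAnswer);
        Collections.shuffle(options);
        return options;
    }

    boolean isCorrect(String chosen) {
        return correctAnswer.equals(chosen);
    }
}
```

Validating in the constructor means that an invalid question obtained from Wikidata would be rejected before it can reach a player, rather than being detected during a game.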

1.2. Quality Goals

Test implementation and extensibility are among the main goals, as they are among the most important requirements of some of our stakeholders (our teachers).

Goal Description

Compatibility & Transferability

The system has to be compatible with different browsers and devices while ensuring an optimal experience regardless of the chosen platform. It also has to take into account different screen sizes

Availability

The system shall be available for 99% of user access attempts

Usability & Operability

The application is intuitive: users can learn to use it fluently during their first session

Performance

The application is able to handle a large number of users without degrading response times

Maintainability

The system has to be designed to facilitate maintenance and updates

Functional suitability

The system can fulfill its intended goal effectively by allowing the users to register, play the game, ask the LLM for hints and access their statistics

Reliability

The system has to be able to generate questions from Wikidata ensuring their accuracy and diversity. It should handle the storage of user registrations, logins and game data without errors

Security

The information about a user can only be accessed by that user

Other goals to take into account are:

  • We should follow a test-focused development approach

  • We should carefully choose the design to make sure that the application is maintainable

  • The system has to be easy to use and understand

1.3. Stakeholders

Role/Name Expectations

Users

Test their knowledge.

Software architecture professors

Provide their students, the development team, with practical experience in software architecture by allowing them to work in an environment that emulates the development flow of a real project. This means that they expect us to meet the quality and production standards of a real application.

RTVE

Obtain an evolved version of their quiz game and continue promoting their program all around the world

ChattySW

Develop an application that stays within the given constraints and fulfills the requirements given by the client

Wikidata

Offer a service allowing the development team to create the questions using an API

2. Architecture Constraints

The application is subject to some constraints: requirements imposed by the stakeholders or the environment that we must accept. They are divided into technical constraints, organizational constraints and conventions.

2.1. Technical Constraints

Constraint Limitations

Web Frontend

The system will have at least a Web frontend

Wikidata

We must use Wikidata to get information to generate the questions

LLM

An LLM will be used to generate hints for the questions

GitHub

We must use GitHub for the version control of the project

2.2. Organizational Constraints

Constraints Limitations

Team

The team for the project was chosen by the professors

Weekly meetings

There must be at least one weekly meeting

Public repository

The project source code will be available in a public repository

2.3. Conventions

Conventions Limitations

Documentation architecture

The documentation must follow arc42

Language

The documentation, and the project in general, will be written in English

3. Context and Scope

3.1. Business Context

Business context

  • Player (user, person): the player interacts with our system through the front-end (the user interface of the application) to play the quiz games.

  • Our system (internal system): our system allows the player to play the quiz games. It queries the Wikidata API to get the questions and options, and the LLM API to ask for hints related to the questions.

  • Wikidata API: API used to extract the information for generating the questions, options and answers.

  • LLM API: API used to ask an AI for hints related to the question shown to the user.

3.2. Technical Context

Technical context

The application is deployed using Docker on Azure. The user interacts with the web app (the front-end of the application). The web app will be a Spring Boot project that contains the user interfaces and requests the data it needs from the other parts of the application, so it acts as a Controller.

The services will be:

  • Authentication service

  • Users service (statistics)

  • Question generation service (WikiData)

  • Hint service (LLM)

  • Game service

Many services will interact with the database (MongoDB) to add information for different entities (Questions, Users…)

4. Solution Strategy

4.1. Technology breakdown

Goal / Requirement: Implement LLM technologies in the game

Architectural approach: Research shall be done to decide which LLM is most suitable for our game. Research shall also be done on how to integrate this technology well, so that hallucinations are avoided.

Conclusions: The LLM to be used will be decided at a later stage of development. This is possible because most LLMs are integrated in a very similar way, so the model can be swapped with only small changes in the code.

Goal / Requirement: Develop the program using GitHub.

Architectural approach: Research about branching strategies in GitHub shall be done. All team members shall be acquainted with GitHub.

Conclusions: The development team will use the ‘GitHub Flow’ branching strategy, as it is the one that best fits the characteristics of the team and the project.

Goal / Requirement: The code should be easy to understand and to extend in the future

Architectural approach: Design techniques and patterns shall be implemented.
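The idea that the model can be swapped with only small code changes usually relies on hiding the provider behind a narrow interface. The sketch below illustrates this under assumptions: `LlmClient`, `CannedLlmClient`, `HintGenerator` and the prompt wording are hypothetical names chosen for this example, and no real provider API is called.

```java
import java.util.Objects;

// Hypothetical abstraction: the rest of the code depends only on this interface,
// so switching models would mean adding one new implementation.
interface LlmClient {
    String complete(String prompt);
}

// Stand-in implementation for local development and tests.
// A real implementation would call the chosen provider's API instead.
class CannedLlmClient implements LlmClient {
    private final String reply;

    CannedLlmClient(String reply) {
        this.reply = Objects.requireNonNull(reply);
    }

    @Override
    public String complete(String prompt) {
        return reply;
    }
}

// Depends on the interface only, never on a concrete model.
class HintGenerator {
    private final LlmClient llm;

    HintGenerator(LlmClient llm) {
        this.llm = Objects.requireNonNull(llm);
    }

    String hintFor(String question) {
        // The prompt wording is an assumption.
        return llm.complete("Give a short clue, without revealing the answer, for: " + question);
    }
}
```

With this shape, `new HintGenerator(new CannedLlmClient("..."))` could be used in tests, while a production build would pass in whichever client wraps the model that is eventually selected.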

4.2. Organizational breakdown

So far we have been meeting only once a week, but we expect that this may have to change as the project evolves. We use a WhatsApp group as a secondary communication channel.

4.3. Quality decisions

Attribute pursued Choice

5. Building Block View

5.1. Level 1

Overall view of the system and the parts in which it is divided as well as the external systems it connects to.

Level 1

Application represents the entire system implemented by our team; it is the part of the system with which the user interacts. LLM stands for Large Language Model, which is used to generate hints. Wikidata is the external system that provides the information for the questions in the game. The last part is the Database, which represents the connection to a database where data about the program, such as user data, is stored.

5.2. Level 2

Level 2

  • The Frontend is the part the user sees and interacts with. It will send requests to the GameService and UserService.

  • The GameService handles the game: it will ask for questions and hints, send them to the frontend, and then process the answers. It will also interact with the UserService to keep track of scores and other user information related to the game.

  • The UserService is in charge of logging in the user and keeping track of all their relevant information by storing it in and retrieving it from a database.

  • The HintService will be used to interact with an LLM in order to generate hints.

  • The QuestionService will be used to generate questions based on data extracted from Wikidata.

  • The Database will store information used in the system, such as the user login details and the previous games and scores. In the future we might also use it for information about the questions.
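The responsibilities of the Level 2 building blocks can be expressed as a set of small interfaces with the GameService coordinating the others. This is only a sketch: every method signature, the `QuizQuestion` record and the +1/-1 score change are assumptions made for illustration, not the project's actual API.

```java
import java.util.List;
import java.util.Optional;

// Hypothetical interfaces mirroring the Level 2 building blocks; every signature is an assumption.
interface QuestionService {
    QuizQuestion nextQuestion();
}

interface HintService {
    String hintFor(QuizQuestion question);
}

interface UserService {
    Optional<String> login(String username, String password); // user id on success

    void recordResult(String userId, int scoreChange);
}

record QuizQuestion(String text, List<String> options, int correctIndex) {}

// The GameService asks for questions and hints, processes answers and reports results to the UserService.
class GameService {
    private final QuestionService questions;
    private final HintService hints;
    private final UserService users;

    GameService(QuestionService questions, HintService hints, UserService users) {
        this.questions = questions;
        this.hints = hints;
        this.users = users;
    }

    QuizQuestion ask() {
        return questions.nextQuestion();
    }

    String hint(QuizQuestion question) {
        return hints.hintFor(question);
    }

    boolean answer(String userId, QuizQuestion question, int chosenIndex) {
        boolean correct = question.correctIndex() == chosenIndex;
        users.recordResult(userId, correct ? 1 : -1); // assumed scoring: +1 correct, -1 incorrect
        return correct;
    }
}
```

Because the GameService receives its collaborators through the constructor, each service could be replaced by a stub in tests or by a remote client in a deployed setup.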

6. Runtime View

6.1. Sign up

The following diagram shows the sign-up process for a user.

  1. The client enters the credentials needed to create an account.

  2. Check that the credentials are valid in format.

  3. Check that the fields that need to be unique are unique.

  4. Return whether they are unique.

  5. Create the user and store it in the database.

  6. Notify which page the user should be redirected to.

  7. The user is redirected to the menu. The user can access their statistics (interaction with the User service) and play the game.

  8. If a check fails, notify that the credentials are invalid.

  9. The errors are shown and the user isn’t registered.

Sign up diagram
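The sign-up steps above can be sketched as a single method that returns the validation errors. This is a minimal illustration under assumptions: the class name `SignUpService`, the format rules (username pattern, minimum password length) and the in-memory set standing in for the database are all hypothetical.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Minimal sketch of the sign-up flow; an in-memory set stands in for the database.
class SignUpService {
    private final Set<String> registeredNames = new HashSet<>();

    // Returns the list of validation errors; an empty list means the user was created.
    List<String> signUp(String username, String password) {
        List<String> errors = new ArrayList<>();

        // Step 2: format checks (the concrete rules are assumptions).
        if (username == null || !username.matches("[A-Za-z0-9_]{3,20}")) {
            errors.add("Username must be 3-20 letters, digits or underscores");
        }
        if (password == null || password.length() < 8) {
            errors.add("Password must have at least 8 characters");
        }

        // Steps 3-4: uniqueness check, only worth doing if the format is valid.
        if (errors.isEmpty() && registeredNames.contains(username)) {
            errors.add("Username is already taken");
        }

        // Step 5: create the user. Password hashing and persistence are omitted in this sketch.
        if (errors.isEmpty()) {
            registeredNames.add(username);
        }

        // Steps 6-9: the caller redirects on an empty list and shows the errors otherwise.
        return errors;
    }
}
```

Returning all errors at once, rather than stopping at the first one, lets the front-end show every problem with the form in a single round trip.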

6.2. Login

The following diagram shows the login process for a user.

  1. The client enters the credentials needed to log in.

  2. Check that the credentials are valid in format.

  3. Check that the credentials are valid for the user.

  4. Return whether the credentials are valid.

  5. The user is authenticated.

  6. The user is redirected to the menu. The user can access their statistics (interaction with the User service) and play the game.

  7. If a check fails, notify that the credentials are invalid.

  8. The errors are shown and the user isn’t logged in.

Login diagram
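The credential check in the login steps can be illustrated with salted password hashing from the Java standard library. This is a sketch under assumptions, not the project's implementation: the class `LoginService`, the in-memory map standing in for the database, and the PBKDF2 parameters (100,000 iterations, 256-bit key) are all chosen for the example.

```java
import java.security.GeneralSecurityException;
import java.security.MessageDigest;
import java.security.SecureRandom;
import java.util.HashMap;
import java.util.Map;
import javax.crypto.SecretKeyFactory;
import javax.crypto.spec.PBEKeySpec;

// Minimal sketch of the login check; a map stands in for the user collection in the database.
class LoginService {
    private record StoredCredential(byte[] salt, byte[] hash) {}

    private final Map<String, StoredCredential> credentials = new HashMap<>();
    private final SecureRandom random = new SecureRandom();

    // Stores a salted hash of the password, never the password itself.
    void register(String username, char[] password) throws GeneralSecurityException {
        byte[] salt = new byte[16];
        random.nextBytes(salt);
        credentials.put(username, new StoredCredential(salt, hash(password, salt)));
    }

    // Steps 3-4: returns whether the credentials are valid for the user.
    boolean login(String username, char[] password) throws GeneralSecurityException {
        StoredCredential stored = credentials.get(username);
        if (stored == null) {
            return false; // unknown user
        }
        // Constant-time comparison of the stored and the recomputed hash.
        return MessageDigest.isEqual(stored.hash(), hash(password, stored.salt()));
    }

    // Iteration count and key length are assumptions for this example.
    private static byte[] hash(char[] password, byte[] salt) throws GeneralSecurityException {
        PBEKeySpec spec = new PBEKeySpec(password, salt, 100_000, 256);
        return SecretKeyFactory.getInstance("PBKDF2WithHmacSHA256").generateSecret(spec).getEncoded();
    }
}
```

Returning the same `false` for an unknown user and for a wrong password keeps the error shown to the client from revealing which usernames exist.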

6.3. Game

The following diagram shows how a game is played.

  1. The user starts a game.

  2. The game starts.

  3. See the Question Generation section below.

  4. The data related to the question is returned.

  5. The user asks for a hint.

  6. The timer is stopped.

  7. See the Hint Generation section below.

  8. The information related to the hint is returned.

  9. The user chooses one of the four options given.

  10. The game service checks whether the chosen option is the correct answer to the question.

  11. The game service returns whether the chosen option is correct.

  12. The user is informed whether their guess was correct.

  13. Store the results and information of the game in the database.

  14. The information about the game is returned to the Application.

  15. The information about the game is shown to the user.

Game diagram
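The interplay between the countdown, the hint request and the answer check in the steps above can be sketched as a small state object. Several details are assumptions: the class name `GameRound`, the score values, the idea that a hint reduces the reward (the actual penalty is still to be decided), and the resumption of the timer once the hint arrives.

```java
// Hypothetical state of a single question round; scoring values and timer behavior are assumptions.
class GameRound {
    private final int correctIndex;
    private long remainingMillis;
    private boolean timerRunning = true;
    private boolean hintUsed = false;

    GameRound(int correctIndex, long timeLimitMillis) {
        this.correctIndex = correctIndex;
        this.remainingMillis = timeLimitMillis;
    }

    // Called periodically by the UI or a scheduler; time only passes while the timer runs.
    void tick(long elapsedMillis) {
        if (timerRunning) {
            remainingMillis = Math.max(0, remainingMillis - elapsedMillis);
        }
    }

    // Steps 5-6: asking for a hint stops the timer.
    void requestHint() {
        timerRunning = false;
        hintUsed = true;
    }

    // Step 8: assumed here to resume the timer once the hint has been returned.
    void hintDelivered() {
        timerRunning = true;
    }

    boolean timedOut() {
        return remainingMillis == 0;
    }

    // Steps 9-11: returns the score change for the chosen option.
    int answer(int chosenIndex) {
        if (timedOut() || chosenIndex != correctIndex) {
            return -1; // incorrect or too late
        }
        return hintUsed ? 1 : 2; // assumed penalty: a hint halves the reward
    }
}
```

Keeping the elapsed time as an explicit parameter of `tick` makes the round easy to test without waiting for real time to pass.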

6.4. Question generation

The following diagrams show how the questions are generated. For the moment we are considering two options.

6.4.1. Option 1

This option loads the questions during the game directly from Wikidata.

  1. The game service asks for a question.

  2. The question generation service requests data for creating the questions.

  3. Wikidata returns the question, image and options for the question.

  4. The question generation service returns all the information.

  5. The game service stores all the information returned.

  6. The game service returns the data to the Application

  7. The question, image and options for the question are shown

Question generation 1 diagram
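Requesting data from Wikidata during the game typically means sending a SPARQL query to the public Wikidata Query Service. The sketch below shows one way to do this with the JDK's HTTP client. The class name `WikidataClient`, the example query (countries with their capital and flag image) and the User-Agent string are assumptions for illustration; the project's real queries may differ.

```java
import java.io.IOException;
import java.net.URI;
import java.net.URLEncoder;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;
import java.nio.charset.StandardCharsets;

// Minimal sketch of a Wikidata Query Service client.
class WikidataClient {
    private static final String ENDPOINT = "https://query.wikidata.org/sparql";

    // Example query (an assumption): countries (Q6256) with capital (P36) and flag image (P41).
    static final String CAPITALS_QUERY = """
        SELECT ?countryLabel ?capitalLabel ?flag WHERE {
          ?country wdt:P31 wd:Q6256 ;
                   wdt:P36 ?capital ;
                   wdt:P41 ?flag .
          SERVICE wikibase:label { bd:serviceParam wikibase:language "en". }
        }
        LIMIT 50
        """;

    // Builds the request URI with the SPARQL text URL-encoded as the query parameter.
    static URI buildUri(String sparql) {
        return URI.create(ENDPOINT + "?format=json&query="
                + URLEncoder.encode(sparql, StandardCharsets.UTF_8));
    }

    // Performs the network call and returns the raw JSON body.
    static String fetch(String sparql) throws IOException, InterruptedException {
        HttpRequest request = HttpRequest.newBuilder(buildUri(sparql))
                .header("Accept", "application/sparql-results+json")
                .header("User-Agent", "quiz-game-sketch/0.1 (example)") // placeholder value
                .GET()
                .build();
        HttpResponse<String> response = HttpClient.newHttpClient()
                .send(request, HttpResponse.BodyHandlers.ofString());
        if (response.statusCode() != 200) {
            throw new IOException("Wikidata returned HTTP " + response.statusCode());
        }
        return response.body();
    }
}
```

Calling `WikidataClient.fetch(WikidataClient.CAPITALS_QUERY)` would return the result bindings as JSON, from which a question, its image and its options could then be assembled. With this option every question shown in the game depends on a successful network call at play time.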

6.4.2. Option 2

This option can offer better performance because the game itself does not depend on the Wikidata API: the questions are loaded from the database during the game.

  1. Run the question generation service to load the questions from Wikidata.

  2. WikiData returns all the information related to the question.

  3. The information is stored in the database for later use.

  4. The game service asks the database for a question.

  5. The database returns all the information of a question.

  6. The game service returns the data to the Application.

  7. The question, image and options for the question are shown.

Question generation 2 diagram
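The second option separates loading from serving: questions are fetched ahead of time and the game only reads from storage. A minimal sketch follows, assuming an in-memory list in place of the database; the names `StoredQuestion`, `QuestionStore`, `preload` and `randomQuestion` are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Optional;
import java.util.Random;
import java.util.function.Supplier;

// Hypothetical stored form of a question.
record StoredQuestion(String text, String imageUrl, List<String> options, int correctIndex) {}

// In-memory stand-in for the question collection in the database.
class QuestionStore {
    private final List<StoredQuestion> questions = new ArrayList<>();
    private final Random random;

    QuestionStore(Random random) {
        this.random = random;
    }

    // Steps 1-3: run ahead of time (for example at start-up) to fill the store from Wikidata.
    int preload(Supplier<List<StoredQuestion>> wikidataLoader) {
        List<StoredQuestion> loaded = wikidataLoader.get();
        questions.addAll(loaded);
        return loaded.size();
    }

    // Steps 4-5: served during the game without contacting Wikidata.
    Optional<StoredQuestion> randomQuestion() {
        if (questions.isEmpty()) {
            return Optional.empty();
        }
        return Optional.of(questions.get(random.nextInt(questions.size())));
    }
}
```

The trade-off compared with the first option is freshness: stored questions reflect Wikidata as it was at load time, so the loader would need to be re-run to pick up changes.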

6.5. Hint generation

The following diagram shows how the generation of the hints is done.

  1. The user asks for a hint for the current question.

  2. The application requests a hint for the current question.

  3. The hint service gives some context to the LLM.

  4. The LLM returns the clue.

  5. The hint service returns the clue.

  6. The clue is shown to the user.

Hint generation diagram
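The hint steps above can be sketched as a prompt that carries the question context plus a simple check on the returned clue. Everything here is an assumption made for illustration: the class `HintFlow`, the prompt wording, the fallback message and the use of a plain function in place of a real LLM client. The check only catches clues that contain the correct answer verbatim and is not a complete safeguard against hallucinations.

```java
import java.util.List;
import java.util.Locale;
import java.util.function.UnaryOperator;

// Minimal sketch of the hint flow; the LLM is represented by a function from prompt to reply.
class HintFlow {
    static final String FALLBACK = "No clue is available for this question.";

    // Step 3: the context given to the LLM. The wording is an assumption.
    static String buildPrompt(String question, List<String> options) {
        return "You are helping a quiz player. Question: " + question + "\n"
                + "Options: " + String.join(", ", options) + "\n"
                + "Give one short clue. Do not name any of the options.";
    }

    // True if the clue mentions the correct answer verbatim (case-insensitive).
    static boolean leaksAnswer(String clue, String correctAnswer) {
        return clue.toLowerCase(Locale.ROOT).contains(correctAnswer.toLowerCase(Locale.ROOT));
    }

    // Steps 3-5: ask the model, then fall back to a neutral message if the clue is unusable.
    static String hint(UnaryOperator<String> llm, String question,
                       List<String> options, String correctAnswer) {
        String clue = llm.apply(buildPrompt(question, options));
        if (clue == null || clue.isBlank() || leaksAnswer(clue, correctAnswer)) {
            return FALLBACK;
        }
        return clue.strip();
    }
}
```

Checking the reply before it reaches the user keeps a misbehaving model from simply giving the answer away, at the cost of occasionally discarding a usable clue.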

7. Deployment View

7.1. Infrastructure Level 1

Deployment

7.2. Motivation

A well-defined deployment view provides a clear understanding of how different system components interact and operate within the infrastructure. This ensures smooth functionality, scalability, and maintainability.

7.3. Content

The deployment view represents the technical infrastructure that supports the system’s execution. The diagram offers a high-level visualization of key components and their interactions.

7.4. Deployment Architecture Components

  • User: The end-user interacts with the system through the web application.

  • Web Application: The central interface that connects users to the system.

  • Application Layer: Manages multiple services, including:

    • LLM Service: Interfaces with an external Large Language Model (Gemini) to process queries.

    • Question Service: Handles question-related logic, obtaining data through the Wikidata API and interacting with the database.

    • User Service: Manages user-related functionalities such as authentication and profile management.

  • Database Layer: Uses MongoDB as the primary database for storing user and question-related data.

8. Cross-cutting Concepts

9. Architecture Decisions

10. Quality Requirements

10.1. Quality Tree

10.2. Quality Scenarios

11. Risks and Technical Debts

Risk Description

Integration with Wikidata

Extracting meaningful and accurate data from Wikidata for generating questions and hints may be challenging. Data inconsistencies, missing attributes, or outdated information could lead to incorrect questions or hints.

LLM Accuracy

The LLM may generate incorrect or misleading hints (hallucinations).

Scalability of the System

As more users register and play, the backend (Wikidata API, LLM API, and database) may face performance bottlenecks.

User Authentication and Data Privacy

User registration and historical data storage introduce potential security vulnerabilities (e.g., data leaks, unauthorized access). Compliance with GDPR or other privacy regulations is also a concern.

12. Glossary

Term Definition

WikiData

Large database containing information on many subjects that is free to access and open to modification. We will use it to generate questions for the game.

LLM

A Large Language Model is a pre-trained language model with ample knowledge about many different topics that can deliver answers in a way similar to human speech. It will be used to provide clues to the user.

Spring

Popular framework for the development of web applications that provides different modules for common services such as authentication, access to databases, security, etc.

Framework

Set of concepts and practices used for solving a problem that can be used as a template for solving similar problems.

Gemini

An LLM developed by Google. It will be one of the LLMs used in our application.

Gitlab flow

Branching strategy with 3 different branch types: a Master branch with stable code for release, a Deployment branch where errors are still being fixed, and a Feature branch for each feature developed for the system.

MongoDB

An open-source NoSQL database system. Since it is non-relational, the data isn’t stored in tables but in BSON, a structure similar to JSON.

NoSQL databases

Umbrella term used for database systems that store data in a non-relational way. In relational databases, data is stored in tables and connections are established through relationships. NoSQL databases don’t follow this format and may relate the elements of the database using any other method.

JSON

Text format used for transferring data. Information is stored in pairs of “name” and “value”, with “name” acting as an identifier for the attribute and “value” its current value.

BSON

(Binary JSON) Data format used by MongoDB. It includes all JSON data structure types and adds support for types including dates, different size integers, ObjectIds, and binary data.

Docker

Platform for developers and system administrators. Provides an extra layer of abstraction compared to Virtual Machines, since the containers themselves are not found on the guest operating system but on Docker. It also provides orchestration between containers without the large files needed for VMs.

Container

Executable package that encloses an application. Multiple containers can form a complex architecture while remaining isolated from each other. It is a live instance of an image, allowing it to be shared and stored.

Image

File that includes everything necessary to run an application such as code, runtime system, libraries, runtime variables and configuration files. This will ensure that running the application is consistent independently of the environment.