Difference between revisions of "CNM Bureau Farm"

 
(894 intermediate revisions by the same user not shown)
Line 1: Line 1:
[[CNM Bureau Farm]] (formerly known as [[CNM EndUser Farm]]; hereinafter, the ''Farm'') is the [[CNM Farms|CNM farm]] that is based on [[bare-metal]] servers. This ''Farm'' also utilizes a portion of one bare-metal server that belongs to the [[CNM Lab Farm]]. The issues to work on may include (a) security outside of [[iptables]], (b) adding NAS, as well as advanced backup and recovery systems, and (c) advanced monitoring systems.
+
[[CNM Bureau Farm]] (formerly known as [[CNM EndUser Farm]]; hereinafter, [[#The Farm]]) is the segment of [[Opplet]] that is responsible for providing users with [[CNM Corp]] and [[CNMCyber.com]]. Most likely, [[#The Farm]] will also handle the applications that [[CNM Fed Farm]] currently provides.
  
 +
[[CNMCyber Team]] (hereinafter, [[#The Team]]) develops and administers [[#The Farm]]. To keep [[#The Farm]] online, [[#The Team]] rents [[Bureau Infrastructure]].
  
==Features==
 
===Communication channels===
 
The current cluster uses three IP addresses for its communication channels:

# A public IPv4 address for cluster network communication.

# An IPv6 address for cluster network communication.

# An internal IPv4 address for the cluster storage network.
 
  
===DNS entry point===
+
==In a nutshell==
: [[load balancer]] on a public web address
+
For the purposes of this very wikipage, [[end-user]]s are called [[#The End-Users]]. Collectively, [[end-user application]]s are called [[#The User Apps]].
  
===Synchronization===
+
===Architecture===
: synchronization of resources of common individual nodes, at least databases.
+
: While using [[#The Farm]], [[#The End-Users]] work with [[#The User Apps]], which are installed in [[#The Cluster]], which, in turn, is hosted by [[#The Infrastructure]]. [[#The Cluster]] consists of [[#The Storage]], [[#The Environment]], and [[#The Gateway]]. [[#The Infrastructure]] includes [[#The Bridges]] and [[#The Metal]], which is [[#The Farm]]'s hardware.
  
===Monitoring===
+
===Cluster-based===
 +
: To mitigate a [[single point of failure]] ([[single point of failure|SPOF]]), [[#The Farm]] is built on not just one, but three [[bare-metal server]]s. Each of those hardware servers, together with all of the software installed on top of it (hereinafter, [[#The Node]]), is self-sufficient to host [[#The User Apps]]. Various solutions such as [[#The uptime tools]] coordinate [[#The Node]]s.
  
===Security===
+
===COTS-powered===
: [[iptables]] as a firewall
+
: [[#The Farm]]'s software is a collection of [[commercial off-the-shelf]] ([[commercial off-the-shelf|COTS]]) packages (hereinafter, [[#The COTS]]). In plain English, [[#The COTS]] is the software that is already available on the market. Not a single line of programming code has been written specifically for [[#The Farm]]. Only time-tested, market-proven solutions have been used. [[#The User Apps]] use [[HumHub]], [[Jitsi]], and [[Odoo]] instances. [[#The Cluster]] uses [[Ceph]], [[Proxmox]], and [[pfSense]].
  
: For security, we use [[Fail2ban]] because it operates by monitoring log files (e.g. /var/log/auth.log, /var/log/apache/access.log, etc.) for selected entries and running scripts based on them. Most commonly this is used to block selected IP addresses that may belong to hosts that are trying to breach the system's security. It can ban any host IP address that makes too many login attempts or performs any other unwanted action within a time frame defined by the administrator. Includes support for both IPv4 and IPv6.
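: The banning rule described above can be illustrated with a minimal sketch. It is not Fail2ban's code; it only assumes that failed-login events have already been extracted from the log files as (timestamp, IP address) pairs. The function name and the sample addresses are hypothetical, and the <code>max_retry</code> and <code>find_time</code> parameters only loosely mirror Fail2ban's <code>maxretry</code> and <code>findtime</code> options.

```python
from collections import defaultdict, deque


def find_banned_ips(events, max_retry=3, find_time=600):
    """Return IPs with at least max_retry failures within find_time seconds.

    events: iterable of (timestamp_seconds, ip) pairs for failed logins,
    sorted by timestamp. Names and defaults are illustrative assumptions.
    """
    recent = defaultdict(deque)  # ip -> timestamps of its recent failures
    banned = set()
    for ts, ip in events:
        window = recent[ip]
        window.append(ts)
        # Drop failures that fell out of the observation window.
        while window and ts - window[0] > find_time:
            window.popleft()
        if len(window) >= max_retry:
            banned.add(ip)
    return banned


# Hypothetical sample: one IPv4 host fails three times in 250 seconds,
# one IPv6 host fails twice, far apart in time.
events = [
    (0, "203.0.113.7"),
    (100, "203.0.113.7"),
    (200, "2001:db8::1"),
    (250, "203.0.113.7"),
    (5000, "2001:db8::1"),
]
banned = find_banned_ips(events)
```

: With these sample events, only the IPv4 host crosses the threshold; the IPv6 host's two failures are too far apart to count together.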
+
==Addressing the needs==
 +
Development and sustenance of [[#The Farm]] address two [[business need]]s of [[CNMCyber Team]] (hereinafter, [[#The Team]]). For [[#The Team]], [[#The Farm]] shall serve as both [[#Tech side of the Apps]] and [[#Worksite]].
  
===Backup and recovery===
+
===Tech side of the Apps===
 +
: [[#The Team]] needs to provide [[#The End-Users]] with the services of [[#The User Apps]] 24 hours a day, 7 days a week. [[#The Farm]] shall power [[#The User Apps]] technologically; their content is outside of [[#The Farm]]'s scope.
  
==Development==
+
===Worksite===
Development of the ''Farm'' occurs under the [[Administration for CNM Farms]] project.
+
: [[#The Team]] needs to provide those incumbents of [[CNMCyber practice]]s who work with [[#The Farm]]'s technologies with their worksite.
  
==See also==
+
==Farm users==
 +
For the purposes of this very wikipage, a ''farm user'' refers to any user of [[#The Farm]]. The ''farm user's'' [[authentication]] and [[authorization]] administration is a part of [[identity and access management]] ([[Identity and access management|IAM]]).
 +
 
 +
===The End-Users===
 +
: For the purposes of this very wikipage, the ''Patron'' refers to an [[end-user]] of [[#The User Apps]]. The ''Patrons'' access [[#The User Apps]] via the [[graphic user interface]]s ([[graphic user interface|GUI]]s) that are associated with the particular application. At [[#The Farm]], those [[#User interfaces (UIs)]] are located at [[IPv4 address]]es. Access is granted to the ''Patrons'' either automatically by [[Opplet.net]] or manually by [[#The Power-Users]]. The ''Patrons'' can access [[#The User Apps]] only without administrative tools. The ''Patrons'' may or may not be members of [[#The Team]].
 +
 
 +
===The Power-Users===
 +
: For the purposes of this very wikipage, the ''Power-User'' refers to a [[power-user]] of [[#The User Apps]]. By definition, the ''Power-Users'' are those members of [[#The Team]] who are authorized to access more of [[#The User Apps]]' resources than [[#The End-Users]]. Normally, those resources are administrative tools that allow the ''Power-Users'' to administer one or more of [[#The User Apps]].
 +
 
 +
: While having administrative access, the ''Power-Users'' serve as application-level administrators. They access [[#The User Apps]] via the same [[graphic user interface]]s ([[graphic user interface|GUI]]s) as [[#The End-Users]], but administrative-level ''UIs'' display administration tools in addition to those tools that are available to non-privileged [[end-user]]s.
 +
 
 +
: At the moment, [[#The Sysadmins]] can grant the ''Power-User'' rights to one or more of [[#The End-Users]] manually. Administrative access credentials are classified and securely stored in the application project spaces of [[CNM Lab]]. Those incumbents of [[CNMCyber practice]] who work with [[#The Farm]]'s technologies may or may not be the ''Power-Users''.
 +
 
 +
===The Sysadmins===
 +
: For the purposes of this very wikipage, the ''Sysadmin'' refers to a [[system administrator]] of [[#The Cluster]]. By definition, the ''Sysadmins'' are those members of [[#The Team]] who are authorized to access at least some of [[#The Cluster]]'s resources:
 +
:* [[#The Environment]] is accessed via [[#UI of the Environment]]. That access also includes root-level access to all of the applications that [[#The Environment]] hosts. With such access, the ''Sysadmins'' are able to delete or re-install applications such as [[#The User Apps]].
 +
:* [[#The Storage]] is accessed via [[#UI of the Storage]].
 +
:* [[#The Gateway]] is accessed via [[#UI of the Gateway]].
 +
 
 +
: [[#The Superusers]] can grant the ''Sysadmin'' rights to one or more of [[#The End-Users]] manually. Administrative access credentials are classified and securely stored in the cluster project spaces of [[CNM Lab]]. No incumbent of [[CNMCyber practice]] can be the ''Sysadmin''; only those who work in [[CNMCyber Office]] can.
 +
 
 +
===The Superusers===
 +
: For the purposes of this very wikipage, the ''Superuser'' refers to a [[superuser]] of [[#The Farm]]. By definition, the ''Superusers'' are those members of [[#The Team]] who are authorized to access any part of [[#The Farm]], starting with its hardware. [[#The Metal]] is accessed via [[#UI for the Metal]].
 +
 
 +
: [[#The Provider]] grants the ''Superuser'' rights to [[CNMCyber Customer]], who can manually pass them to one or more members of [[CNMCyber Office]]. The ''Superuser'' credentials are not stored in [[CNM Lab]].
 +
 
 +
==User interfaces (UIs)==
 +
For the purposes of this very wikipage, a [[user interface]] ([[user interface|UI]]) refers to [[#The COTS]]' feature that allows [[#The COTS]]' instance and its users to interact. ''UIs'' of [[#The User Apps]] are described in the wikipages that are dedicated to particular applications. The other ''UIs'' of [[#The Farm]] are described in the [[#UI of the Environment]], [[#UI of the Storage]], [[#UI of the Gateway]], and [[#UI for the Metal]] sections of this very wikipage.
 +
 
 +
===Dashboards===
 +
: For the purposes of this very wikipage, a ''Dashboard'' refers to a [[graphic user interface]] ([[graphic user interface|GUI]]) that either belongs to any [[#The COTS]] package installed in [[#The Farm]] or is provided by [[#The Provider]]. This screen-based interface allows [[#Farm users]] to interact with [[#The User Apps]] and other software through graphical buttons, icons, or hyperlinked texts rather than typed commands in [[command line interface]]s ([[command line interface|CLI]]s).
 +
 
 +
===Third-party UIs===
 +
: For the purposes of this very wikipage, a ''Third-party UI'' refers to any [[user interface]] ([[user interface|UI]]) that neither belongs to any [[#The COTS]] package installed in [[#The Farm]] nor is provided by [[#The Provider]]. [[#The Sysadmins]] and [[#The Superusers]] may use access tools such as [[PuTTy|PuTTY]] and [[Midnight Commander]] to access [[#UI for the Metal]].
 +
 
 +
==The User Apps==
 +
For the purposes of this very wikipage, the ''User Apps'' refer to those [[end-user application]]s with which [[#The End-Users]] interact.
 +
 
 +
===HumHub===
 +
: [[CNMCyber.com]], which is the end-user instance of [[Educaship HumHub]].
 +
 
 +
===Jitsi===
 +
: [[CNM Talk]], which is the end-user instance of [[Educaship Jitsi]].
 +
 
 +
===Odoo===
 +
: [[CNM Corp]], which is the end-user instance of [[Educaship Odoo]].
 +
 
 +
==The Cluster==
 +
For the purposes of this very wikipage, the ''Cluster'' refers to all of the software between [[#The User Apps]] and [[#The Infrastructure]].
 +
 
 +
===Cluster components===
 +
: [[#The Cluster]] consists of [[#The Node]]s and their management tools. The following components compose [[#The Cluster]]:
 +
:# '''[[#The Storage]]''' that is powered by [[Educaship Ceph]] to provide [[#The Environment]] with stored objects, blocks, and files. Three storage spaces of [[#The Node]]s create one distributed storage foundation.
 +
:# '''[[#The Environment]]''' that is powered by [[Educaship Proxmox]] to make [[container]]s and [[virtual machine]]s ([[virtual machine|VM]]s) available to [[#The User Apps]], so [[#The User Apps]] can function properly. Each of [[#The Node]]s features its own environment; [[#The uptime tools]] orchestrate them all.
 +
:# '''[[#The Gateway]]''' that is powered by [[Educaship pfSense]] to create a gateway between [[#The Environment]] and the outside world. There is only one gateway; if it fails, [[#The Farm]] fails.
 +
 
 +
===Choice of COTS===
 +
: While building [[#The Farm]] generally and [[#Cluster components]] specifically, [[#The Team]] utilized only [[#The COTS]] that is both [[open-source]] and free of charge. Other considerations for the choice are stated in the [[#COTS for the Environment]], [[#COTS for the Storage]], [[#COTS for the Gateway]], and [[#COTS for backups]] sections of this very wikipage.
 +
 
 +
===Cluster provisioning===
 +
: Provisioning of [[#The Cluster]] occurs in the following sequence:
 +
:# Since [[#The Environment]] is installed on top of [[#The Infrastructure]], the [[#Environment provisions]] shall be accommodated first.
 +
:# Since [[#The Storage]] is a part of [[#The Environment]], [[#Storage provisions]] shall be accommodated second.
 +
:# Since [[#The Gateway]] is installed in [[#The Environment]], [[#Gateway provisions]] shall be accommodated third.
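: The sequence above follows from the dependencies between the components. As a minimal sketch, those dependencies can be written down and ordered with Python's standard <code>graphlib</code> module (Python 3.9 or later). The dictionary below encodes only the dependencies stated in the list; the component labels are shorthand, not identifiers used by any tool in [[#The Farm]].

```python
from graphlib import TopologicalSorter

# component -> components that must be provisioned before it
depends_on = {
    "Environment": {"Infrastructure"},
    "Storage": {"Environment"},
    "Gateway": {"Environment"},
}

# static_order() yields every component after all of its prerequisites.
order = list(TopologicalSorter(depends_on).static_order())
```

: The resulting order always places the infrastructure before the environment and the environment before both the storage and the gateway. The stated dependencies alone do not decide between the storage and the gateway; the list above settles that by provisioning the storage second and the gateway third.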
 +
 
 +
===Cluster monitoring===
 +
: Monitoring features are to be identified. Previously, various candidates offered three options:
 +
:# Stack -- [[prometheus]] + node-exporter + [[grafana]]
 +
:# [[Prometheus]] to monitor VMs, InfluxDB to monitor PVE nodes, and [[Grafana]] for dashboards
 +
:# [[grafana]] + [[influxdb]] + [[telegraf]], as well as [[zabbix]]. To monitor websites, use [[uptimerobot]]
 +
 
 +
Observability vs. APM vs. Monitoring
 +
 
 +
application performance management
 +
 
 +
===Cluster recovery===
 +
==High availability (HA)==
 +
Generally speaking, [[high availability]] ([[High availability|HA]]) of any system means a higher uptime in comparison with a similar system that has no tools for increasing its uptime. [[High availability|HA]] of [[#The Farm]] means a higher uptime in comparison with a similar farm built on just one of [[#The Node]]s. Before [[#The uptime tools]] were deployed, [[#The Farm]] functioned on only one of [[#The Node]]s and, when that node failed, the services of [[#The User Apps]] were unavailable until the failure was fixed. Now, as long as at least one of [[#The Node]]s is operational, [[#The Farm]] is operational.
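: The gain from redundancy can be estimated with simple arithmetic. The sketch below assumes that nodes fail independently of one another and that each node is up 99% of the time; both the independence and the 99% figure are illustrative assumptions, not measurements of [[#The Farm]]. The estimate also ignores shared components such as a single gateway.

```python
def farm_availability(node_availability: float, nodes: int = 3) -> float:
    """Probability that at least one of `nodes` independent nodes is up."""
    all_down = (1.0 - node_availability) ** nodes
    return 1.0 - all_down


single = farm_availability(0.99, nodes=1)  # a farm built on one node
triple = farm_availability(0.99, nodes=3)  # a farm built on three nodes
```

: Under these assumptions, the three-node estimate is roughly 0.999999 against 0.99 for a single node. Any component that all nodes share would cap the real figure below that estimate.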
 +
 
 +
===The uptime tools===
 +
: Both [[#The Environment]] and [[#The Storage]] feature advanced tools for [[#High availability (HA)]].
 +
:* '''The [[Educaship Proxmox]] instances'''. With regard to [[#The Farm]]'s applications, when any application fails, its sister application installed on the second of [[#The Node]]s continues its work. If that application fails too, its sister application installed on the third of [[#The Node]]s continues the work. If the third application fails, [[#The Farm]] can no longer provide [[#The End-Users]] with [[#The Farm]]'s services in full. To ensure that, [[#The Farm]] utilizes tools that come with [[ProxmoxVE]]. Every [[virtual machine]] ([[virtual machine|VM]]) or [[container]] is kept on at least two of [[#The Node]]s. When the operational resource, [[virtual machine|VM]] or [[container]], fails on one instance, the second [[Educaship Proxmox]] instance activates its own resource and requests the third instance to create the third resource as a reserve. As a result, the [[virtual machine|VM]] or [[container]] "migrates" from one of [[#The Node]]s to another.
 +
:* '''The [[Educaship Ceph]] instance'''. Because of the distributed nature of [[Ceph]], [[#High availability (HA)]] is a native feature of [[#The Storage]]. When one [[database management system|DBMS]] fails, its sister [[database management system|DBMS]] installed on the second of [[#The Node]]s continues its work. When that [[database management system|DBMS]] fails too, its sister [[database management system|DBMS]] installed on the third of [[#The Node]]s continues the work. If the third [[database management system|DBMS]] fails, [[#The Farm]] can no longer provide [[#The User Apps]] with the data they require to work properly.
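: The "migration" of a [[virtual machine|VM]] or [[container]] described above can be sketched as a small state transition. This is an illustration of the idea only, not the logic of the tools that come with [[ProxmoxVE]]; the function and the node names are hypothetical.

```python
def fail_over(active, reserve, failed, nodes):
    """Return (active, reserve) for one VM or container after `failed` goes down.

    active, reserve: names of the nodes holding the running and the standby copy.
    nodes: names of all nodes in the cluster.
    """
    healthy = [n for n in nodes if n != failed]
    if active == failed:
        # The standby copy takes over the work of the failed one.
        active, reserve = reserve, None
    elif reserve == failed:
        reserve = None
    if reserve is None:
        # Ask a remaining healthy node to hold a fresh reserve copy.
        candidates = [n for n in healthy if n != active]
        reserve = candidates[0] if candidates else None
    return active, reserve


nodes = ["node1", "node2", "node3"]
state = fail_over("node1", "node2", failed="node1", nodes=nodes)
```

: In this example the copy on <code>node2</code> becomes active and <code>node3</code> is asked to hold the new reserve, so the resource has effectively moved from one node to another.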
 +
 
 +
===Uptime limitations===
 +
: Generally speaking, [[High availability|HA]] comes with significant costs. So do [[#The uptime tools]]. At the very least, running three of [[#The Node]]s is more expensive than running one. The cost cannot exceed the benefit, so [[high availability]] ([[high availability|HA]]) cannot be equal to [[failure tolerance]].
 +
 
 +
===Uptime management===
 +
: To manage redundant resources, [[#The uptime tools]]:
 +
:* '''Monitor''' their resources to identify whether they are operational or failed as described in the [[#Monitoring]] section of this very wikipage.
 +
:* '''Fence''' those resources that are identified as failed. As a result, non-operational resources are withdrawn from the list of available.
 +
:* '''Restore''' those resources that are fenced. The [[#Recovery]] supports that feature, while constantly creating snapshots and reserve copies of [[#The Farm]] and its parts in order to make them available for restoring when needed.
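: The three steps above can be sketched as one pass of a management loop. The sketch is an illustration under stated assumptions, not the implementation of [[#The uptime tools]]: the resource names, the two state labels, and the <code>is_healthy</code> and <code>restore</code> hooks are hypothetical.

```python
def manage_uptime(resources, is_healthy, restore):
    """Run one monitor -> fence -> restore pass over named resources.

    resources: dict mapping a resource name to "available" or "fenced".
    is_healthy: callable(name) -> bool, the monitoring check.
    restore: callable(name) -> bool, True when recovery succeeded.
    """
    for name in resources:
        if resources[name] == "available" and not is_healthy(name):
            resources[name] = "fenced"     # withdraw from the available list
        if resources[name] == "fenced" and restore(name):
            resources[name] = "available"  # back in service after recovery
    return resources


# First pass: vm-b is found failed and cannot be restored yet.
state = manage_uptime(
    {"vm-a": "available", "vm-b": "available"},
    is_healthy=lambda name: name != "vm-b",
    restore=lambda name: False,
)
fenced_state = dict(state)

# Second pass: recovery from a backup succeeds.
state = manage_uptime(state, is_healthy=lambda name: True, restore=lambda name: True)
```

: After the first pass only <code>vm-b</code> is fenced; after the second pass it is available again.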
 +
 
 +
===Uptime principles===
 +
: Principally, [[#High availability (HA)]] of [[#The Cluster]] is based on:
 +
:* A principle of redundancy. Each of [[#The User Apps]], as well as every object, block, or file that [[#The User Apps]] may use, is stored at least twice on different hardware servers of [[#The Node]]s, as [[#The uptime tools]] and [[#Uptime of the Storage]] sections describe.
 +
:* Management of redundant resources. [[#The Cluster]] needs to put into operation those and only those resources that are in good standing and operational shape, as described in the [[#Uptime management]] section.
 +
 
 +
==The Environment==
 +
For the purposes of this very wikipage, the ''Environment'' refers to the [[virtual environment]] ([[virtual environment|VE]]) of [[#The Cluster]] or, allegorically, to the environment where [[#The User Apps]] "live".
 +
 
 +
===COTS for the Environment===
 +
: As [[#The COTS]] for the ''Environment'', [[#The Team]] utilizes [[Educaship Proxmox]]. For a while, [[#The Team]] also tried [[OpenStack]] and [[VirtualBox]] as its virtualization tools. The trials suggested that [[OpenStack]] required more hardware resources and that [[VirtualBox]] didn't allow for the required sophistication in comparison with [[ProxmoxVE]], which has been chosen as [[#The COTS]] for [[#The Farm]]'s virtualization.
 +
 
 +
===Environment features===
 +
: [[#The Team]] uses [[virtualization]] to divide hardware resources of [[#The Node]]'s [[bare-metal server]]s into smaller [[container]]s and [[virtual machine]]s ([[virtual machine|VM]]s), which are created in the ''Environment'' to run [[#The User Apps]], [[#The Gateway]], and other applications. In [[#The Farm]], the [[ProxmoxVE]] instance tightly integrates [[KVM hypervisor]], [[LXC container]]s, [[Educaship Ceph]] as software-defined storage, as well as networking functionality on a single virtualization platform.
 +
 
 +
===Environment functions===
 +
: [[#The Environment]] executes four major functions. It:
 +
:# '''Runs''' [[#The User Apps]], [[#The Gateway]], and other applications. They can be deployed utilizing two models:<ol type="a"><li>Using [[container]]s; they already contain [[operating system]]s tailored specifically to the needs of the ''App''.</li><li>In [[virtual machine]]s ([[virtual machine|VM]]) and without [[container]]s. In that model, the ''App'' is installed on the [[operating system]] of its [[virtual machine|VM]].</li></ol>
 +
:# '''Hosts''' [[#The Storage]] and [[#Backup box]]
 +
:# '''Connects''' the applications it runs and the storages it hosts to each other and to [[#The Bridges]], while creating networks.
 +
:# '''Creates''' backups and accommodates its own recovery when requested.
 +
 
 +
===Environment provisions===
 +
: Every instance of [[ProxmoxVE]] requires one "physical" [[bare-metal server]]. The interaction between [[ProxmoxVE]] instances and [[#The Infrastructure]] is carried out by the [[Debian]] [[operating system]] ([[operating system|OS]]) that comes in the same "box" of [[#The COTS]] as [[ProxmoxVE]] and is specifically configured for that interaction. [[#The Farm]]'s [[ProxmoxVE]] also hosts [[#The Storage]] as its [[storage]].
  
===Related lectures===
+
===UI of the Environment===
:*[[What CNM Farms Are]].  
+
: With regard to [[#User interfaces (UIs)]], [[#The Environment]] features its administrative interface, which belongs to [[#Dashboards]].
  
[[Делова Ферма]] (formerly called [[Деловы Кластер]]; hereinafter, the ''Farm'') is the fault-tolerant ([[high availability]]) cluster of [[Брацки Фермы]] (hereinafter, the ''Farms'') that keeps the services of the [[Делово Бюро]] applications, the so-called [[деловы прилады]], running and highly available.
+
==The Storage==
 +
For the purposes of this very wikipage, the ''Storage'' refers to the storage platform or [[database management system]] ([[database management system|DBMS]]) that provides [[#The User Apps]] with the [[storage]] they need to operate. Thus, the ''Storage'' supports [[#The Environment]]'s non-emergency operations and differs from the [[#Backup box]] that comes into play in emergencies.
  
 +
===COTS for the Storage===
 +
: As [[#The COTS]] for [[#The Storage]], [[#The Team]] utilizes [[Educaship Ceph]]. Any [[ProxmoxVE]] instance requires some [[storage]] to operate.
  
==General description==
+
: Before deploying [[#The uptime tools]], [[#The Team]] used [[RAID]] to make the paired hard disks redundant. The [[ProxmoxVE]] instance was installed on one disk and replicated to the other disk automatically. [[ProxmoxVE]] allows for more flexible use of hard disks: it can be configured to host many storage-type software packages, such as [[ZFS]], [[NFS]], and [[GlusterFS]], as [[#The COTS]].
  
===Common concepts===
+
: Initially, the cluster developer proposed using [[Ceph]]. Later, [[#The Team]] substituted one node with another that had a larger hard disk, but no [[SSD]] or [[NVMe]]; as a result, [[#The Farm]]'s storage collapsed. The substituted node was disconnected (today, it serves as hardware for [[CNM Lab Farm]]), a new [[bare-metal server]] was purchased (today, it is the [[#Node 3 hardware]]), and [[Ceph]] was restored.
:This wikipage uses the following terms for common concepts:

:*'''[[A запись]]''' ([[A record]]). The [[DNS запись|DNS record]] that maps a domain name ([[domain name]]) to its [[IPv4 адрес|IPv4 address]]. When a user of the [[Всемирная Паутина|World Wide Web]] types a domain name such as "bskol.com", the web browser looks up in the [[DNS]] zone the [[IPv4 адрес|IPv4 address]] to which that domain name is bound. The letter "A" in the record's name, the so-called record type, comes from the first letter of the English word "address".

:*'''[[AAAA запись]]''' ([[AAAA record]]). The [[DNS запись|DNS record]] that maps a domain name ([[domain name]]) to its [[IPv6 адрес|IPv6 address]]. When a user of the [[Всемирная Паутина|World Wide Web]] types a domain name such as "bskol.com", the web browser ([[web browser]]) looks up in the [[DNS]] zone the [[IPv6 адрес|IPv6 address]] to which that domain name is bound. The four letters "A" in this record type reflect the fact that an [[IPv6]] address (128 bits) is four times as long as an [[IPv4]] address (32 bits).

:*'''[[IP адрес]]''' ([[IP address]]). The address of a computing device that conforms to either the [[IPv4]] or the [[IPv6]] protocol. The addresses reachable on the Internet are purchased from the hosting provider.

:*'''[[IPv4 адрес]]''' ([[IPv4 address]]). An [[IP адрес|IP address]] that conforms to the [[IPv4]] protocol. These addresses consist of four groups of digits separated by dots. For example, 88.99.71.85 is one of the addresses of the ''Farm''. Some addresses are reserved for private networks and cannot appear on the Internet. The number of [[IPv4]] addresses is limited to about 4.3 billion, which seemed sufficient when the protocol was designed. The [[IPv4]] protocol was developed in 1981. To resolve the limitation, the [[IPv6]] protocol was developed in 1995; however, as of summer 2022, 62% of the Internet continues to use the [[IPv4]] protocol. In a [[DNS зона|DNS zone]], this address is specified in an [[A запись|"A" record]].

:*'''[[IPv6 адрес]]''' ([[IPv6 address]]). An [[IP адрес|IP address]] that conforms to the [[IPv6]] protocol. These addresses consist of several groups of digits and letters separated by colons. Some groups may be omitted. For example, 2a01:4f8:fff0:53::2 is one of the addresses of the ''Farm'', and the groups between the doubled colons are omitted because they are all zeros. In a [[DNS зона|DNS zone]], this address is specified in an [[AAAA запись|"AAAA" record]].

:*'''[[DNS]]''' ([[Domain Name System]]). A hierarchical and decentralized system of domain names that was originally created to bind human-readable [[доменное имя|domain names]] to machine-processable Internet Protocol addresses ([[IP адрес|IP addresses]]) and later came to be used for defining other data for those names and addresses. For example, a public key for signing mail can be added to text records. [[DNS запись|DNS records]] are kept in so-called [[DNS зона|DNS zones]], which are provided by Internet service providers ([[Internet service provider]] or [[ISP]]).

:*'''[[DNS запись]]''' ([[DNS record]]). The binding of standardized data to a specific domain name. A record consists of a type, for example "AAAA" in an [[AAAA запись|AAAA record]], a name (resource record), for example jitsi.bskol.com, and the data bound to that name. Together, the records make up a [[DNS зона|DNS zone]].

:*'''[[DNS зона]]''' ([[DNS zone]]). The part of the domain name system ([[DNS]]) that is managed by the Internet service provider ([[Internet service provider]] or [[ISP]]) responsible in the system for a specific domain name and that defines the data associated with that domain name. These data are represented as [[DNS запись|DNS records]] such as an [[A запись|A record]] or an [[AAAA запись|AAAA record]].

:*'''[[Виртуальная машина]]''' ([[virtual machine]] or [[VM]]). A virtual computing device that emulates a computer and is created by a virtual environment. As on an ordinary computer, an operating system, usually an out-of-the-box one, is installed on the VM, and user applications are installed on top of it.

:*'''[[Высокая доступность]]''' ([[high availability]] or [[HA]]). The property of a system to have a longer [[uptime]] than an identical system that does not use high-availability tools and techniques. No system and no part of a system can be fully protected from the threat of abnormal operation or an emergency. High availability can be described as the system continuing to provide services at some "working" level when a certain part of it fails, while the part affected by the failure is being restored. High-availability tools include redundant parts ready to take over the role of the primary ones, monitoring devices for detecting failures, and control devices that fence ([[fencing]]) the non-working parts and redirect requests to the working ones. The requirement of a "working", even if degraded, state distinguishes high availability from the concept of [[отказоустойчивость|failure tolerance]] ([[failure tolerance]]), which aims for an ordinary user of the system not to notice the failure of its part at all.

:*'''[[Доменное имя]]''' ([[domain name]], [[hostname]]). The human-readable name of a website or another resource, especially on the Internet, for example "bskol.com". Web browsers and other devices work with [[IP адрес|IP addresses]], but those addresses are hard for people to remember and reproduce; domain names were created for people. In [[DNS]] zones, domain names are bound to an [[IPv4 адрес|IPv4 address]], an [[IPv6 адрес|IPv6 address]], or both.

:*'''[[Контейнер|Container]]'''. A virtual computing device, created by a virtual environment, that emulates a computer with an operating system and user applications installed. As a rule, containers use a lightweight operating system tailored exclusively to running the installed applications.

:*'''[[Операционная система]]''' ([[operating system]] or [[OS]]). The software that, on the one hand, interacts with either a bare-metal or a virtual computing device and, on the other hand, can interact with user applications.

:*'''[[Отказоустойчивость]]''' ([[failure tolerance]]). The concept of system operation in which an end user of the system cannot notice that a part of it has stopped operating normally. Some failure-tolerance tools and techniques are the same as the high-availability ([[high availability]]) tools and techniques, which help the system keep providing services when a certain part of it fails while that part is being restored. However, no toolset guarantees that every recovery will be instantaneous and 100% complete. Therefore, "failure tolerance" is a concept to strive for rather than an end point that can be reached.

:*'''[[Поставщик услуг Интернета]]''' ([[Internet service provider]] or [[ISP]]). An organization authorized by the Internet's administration to provide [[доменное имя|domain names]] and other Internet services. With some exceptions, Internet service providers provide network access directly to end users or to intermediaries. Many Internet service providers are also [[Поставщик услуг размещения|hosting providers]].
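: The address and record-type definitions above can be checked with Python's standard <code>ipaddress</code> module. The two addresses are the ones quoted in the definitions; the <code>record_type</code> mapping is only an illustration of which DNS record type carries which address family.

```python
import ipaddress

v4 = ipaddress.ip_address("88.99.71.85")
v6 = ipaddress.ip_address("2a01:4f8:fff0:53::2")

v4_bits = v4.max_prefixlen   # length of an IPv4 address in bits
v6_bits = v6.max_prefixlen   # length of an IPv6 address in bits
v4_total = 2 ** v4_bits      # size of the whole IPv4 address space

# The full form writes out the zero groups hidden behind the double colon.
v6_expanded = v6.exploded

# Illustrative mapping from IP version to the DNS record type that holds it.
record_type = {4: "A", 6: "AAAA"}
v4_record = record_type[v4.version]
v6_record = record_type[v6.version]
```

: Here <code>v4_bits</code> is 32 and <code>v6_bits</code> is 128, <code>v4_total</code> is 4,294,967,296 (about 4.3 billion), and <code>v6_expanded</code> is <code>2a01:04f8:fff0:0053:0000:0000:0000:0002</code>.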
 
  
===Special terms===
+
: As [[#The COTS]], [[ProxmoxVE]] comes with [[OpenZFS]]. [[#The Team]] has deployed the combination of both in its [[CNM Lab Farm]].
:This wikipage uses the following terms that are specific to this page:

:*'''[[Железo|Metal]]''' ([[bare-metal server]]). A "physical", bare-metal server that is rented from the hosting provider and described in [[#Инфраструктура|Infrastructure]].

:*'''[[Пользовательское приложение|User application]]'''. One of the [[деловы прилады]] installed on the ''Farm''.

:*'''[[Поставщик услуг размещения|Hosting provider]]'''. An Internet service provider ([[Internet service provider]] or [[ISP]]) that rents out its Internet-connected bare-metal servers for hosting the ''Farm''.

:*'''[[Соединитель|Bridge]]'''. A switching device that is provided by the hosting provider of the ''Farm'' and described in [[#Соединители|Bridges]].

:*'''[[Среда|Environment]]''' ([[virtual environment]]). A virtual environment that is based on the [[ProxmoxVE]] software and described in [[#Виртуальные среды|Virtual environments]].

:*'''[[Узел|Node]]''' ([[node]]). The combination of one ''Metal'' server and the software installed on it, as presented on the network and described in [[#Узлы Фермы|Nodes of the Farm]].

:*'''Farm'''. [[Делова Ферма]], which this wikipage is intended to describe.

:*'''[[Хранилище|Storage]]'''. A system for storing the objects, blocks, and files that the ''Farm'' either processes or provides to users without processing. The terms "Node storage" or, in the plural, "storages" refer to the storage systems on an individual ''Node''. The system is described in [[#Хранилища Узлов|Node storages]].
 
  
===Architecture===
+
===Storage features===
:'''To provide services''' to users:
+
: [[#The Storage]] features are:
:#The user applications of the ''Farm'' are installed:
+
:* File system
:#*either in containers, which already contain operating systems tailored exclusively to the needs of the application,
+
:* Distributed
:#*or on virtual machines. For a virtual machine and an application to interact, out-of-the-box operating systems are installed on the machines before the applications are installed.
+
:* Fault tolerant
:#The containers and virtual machines of the ''Farm'' are created in virtual environments.
+
:* Object storage
:#The virtual environments of the ''Farm'' require "physical", so-called bare-metal, servers ([[bare-metal server]]; hereinafter, the ''Metal'') to operate.
+
:* Block device
:#The interaction between the virtual environments and the ''Metal'' is carried out by an [[операционная система|operating system]] that is specifically oriented toward that interaction. On the network, the combination of one ''Metal'' server and the software installed on it is called a node (hereinafter, the ''Node'').
 
  
:'''For high availability''' ([[high availability]] or [[HA]]) and failure tolerance of the services:
+
===Storage functions===
:*Three ''Nodes'' are used, joined into common networks by the [[#Соединители|Bridges]]. Two of the three ''Nodes'' are "load-bearing"; their databases are synchronized, so a change in one database automatically causes a change in the other. Of the two load-bearing ''Nodes'', one is the primary. The third ''Node'' is a requirement of [[ProxmoxVE]], the software used to create the virtual environments, to ensure quorum.
+
: To make objects, blocks, and files immediately available for [[#The User Apps]]' operations, [[#The Cluster]] uses a common distributed cluster foundation that orchestrates storage spaces of [[#The Node]]s.
:*In normal operation, the user's web browser requests the [[IP адрес|IP address]] of a [[Hetzner vSwitch]], which directs the user to the primary ''Node''.
 
:*If the primary ''Node'' is unable to serve clients, the virtual environment isolates it and switches the clients to the second load-bearing ''Node'', which is still working.
 
  
:'''Fault tolerance requires''' a couple of additional functions:
+
===Storage provisions===
:#To detect a failure or another abnormal situation, the ''Farm'' is monitored constantly. The failure signal goes to the virtual environment, which fences the failed ''Node'' and starts restoring data from a backup copy.
+
: Since [[#The Storage]] is installed on top of [[#The Environment]], provisioning the ''Storage'' entails configuring a [[ProxmoxVE]] instance to work with an [[Educaship Ceph]] instance.
:#To restore data lost because of a failure or another abnormal situation, every ''Node'' performs backups continuously.
 
  
==Accesses==
+
: At [[#The Farm]], [[Educaship Ceph]] is deployed on all of [[#The Node]]s. The server of each of [[#The Node]]s features two hard disks. Physically, a [[ProxmoxVE]] instance is installed on one disk of each of [[#The Node]]s; [[Educaship Ceph]] uses the three "second" disks. So, [[#The Farm]] features three instances of [[ProxmoxVE]] and one instance of [[Ceph]].
===Administrative===
 
:#Administrative access to the ''Metal'', as well as to the [[Hetzner vSwitch]] bridges, goes through the administrative panel and administrative consoles. [[Hetzner]] provides them directly to the customer; the customer can personally grant access to the responsible administrators.
 
:#Administrative access to the [[ProxmoxVE]] virtual environments and, further, to the files of the user applications goes through the [[IP адрес|IP addresses]] bound to the ''Metal''. The access credentials are classified and stored in [[Брацка Крынка]].
 
:#Administrative access to the user applications goes through the [[IP адрес|IP addresses]] bound to the applications. At the moment, the bureaucrats of [[Оплёт]] grant the accesses manually.
 
  
===User===
+
: While experimenting with [[OpenZFS]] and [[RAID]], [[#The Team]] also tried another model, in which the second disks served as reserve copies of the first ones. Since every disk is just 512 GB, that model halved [[#The Farm]]'s capacity: both [[#The User Apps]] and their storage needed to fit into the same 512 GB.
:#Users who are not administrators must not have access to the ''Metal'' and the ''Environments''.
 
:#Users who are not administrators access the user applications through the [[IP адрес|IP addresses]] bound to the applications. Accesses are granted automatically by [[Оплёт]] and manually by the bureaucrats of [[Оплёт]].
 
  
==Инфраструктура==
+
: In the current model, [[#The User Apps]] don't share their 512 GB with the storage. On the other hand, the raw capacity of [[#The Farm]]'s [[Educaship Ceph]] is about 3 * 512 GB = 1,536 GB.
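: The capacity arithmetic above can be sketched in a few lines. This is only an illustration: the disk size and disk count come from the text, while the replica count of 3 is an assumption (it is the usual default for replicated [[Ceph]] pools, but the actual pool settings of [[#The Farm]] are not stated on this wikipage).

```python
# Illustrative capacity arithmetic; the replication factor is an assumption.
DISK_GB = 512      # size of each disk, taken from the text above
CEPH_DISKS = 3     # one "second" disk per Node


def raw_capacity_gb(disks: int = CEPH_DISKS, disk_gb: int = DISK_GB) -> int:
    """Raw capacity of the Ceph disks before any replication overhead."""
    return disks * disk_gb


def usable_capacity_gb(raw_gb: int, replicas: int) -> float:
    """Approximate usable capacity when every object is stored `replicas` times."""
    return raw_gb / replicas


raw = raw_capacity_gb()
print(raw)                         # 1536
print(usable_capacity_gb(raw, 3))  # 512.0
```

: Under that assumed replica count, the space actually available for data would be noticeably smaller than the raw figure.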
The infrastructure of the ''Farm'' is three ''Metal'' servers joined into a single bundle.
 
  
===Metal provider===
+
===UI of the Storage===
:[[Hetzner]] is the hosting provider from which the ''Metal'' is rented. Cooperation with this provider has lasted since 2016. Other providers are considered periodically, but none has offered lower prices on a long-term basis.
+
: With regards to [[#User interfaces (UIs)]], [[#The Storage]] features its administrative interface, which belongs to [[#Dashboards]].
  
===Выбор Железа===
+
==The Gateway==
:Because of the lower cost, the ''Metal'' was selected at the auction -- https://www.hetzner.com/sb?hdd_from=500&hdd_to=1000 based on the following premises:
+
For the purposes of this very wikipage, the ''Gateway'' refers to the composition of software that is built on the external ''Bridge''. The ''Gateway'' is the hub for both [[#The Farm]]'s [[wide area network]] ([[wide area network|WAN]]) and [[local area network]] ([[local area network|LAN]]). To power the ''Gateway'', [[Educaship pfSense]] is deployed.
:*The target working hard-disk capacity for this ''Farm'' is 512 GB.
 
:*At least one server, the primary one, is selected with an [[SSD]] and, preferably, [[NVMe]], and with 64 GB of memory.
 
:*At least two "load-bearing" servers are selected in the same data center. Although [[Hetzner]] does not charge for traffic, this circumstance increases the speed of the ''Farm''. If a second server were not available in the same data center, we would look for it in other data centers of the same city or location.
 
:*The contractor preferred a server with the Intel Xeon E3-1275v5 processor to one with the Intel Core i7-7700.
 
:*The requirements for the third ''Metal'' server are lower than for the "load-bearing" ones. One candidate argued that its capacity could be smaller, since only [[ProxmoxVE]] may be installed on it.
 
:The [[#Характеристики Железа|Metal characteristics]] are presented below.
 
  
===Соединители===
+
The ''Gateway'' is a composition of software, such as a [[load balancer]] or [[reverse proxy]], that is built on the [[#External Bridge]].
:To join the ''Nodes'' into networks, [[Hetzner vSwitch]] tools are used. They are owned by the hosting provider; the team can order up to 5 bridges attached to one ''Metal'' server. The right to attach bridges comes with the ''Metal'' rental.
 
  
:An internal and an external network are built on the bridges. Each bridge has its own [[IP адрес|IP address]], internal or external:
+
===COTS for the Gateway===
:*Internal bridges carry data between the ''Nodes''. Above all, this transfer is vital for synchronizing the storages of the individual ''Nodes''.
+
: As [[#The COTS]] for the ''Gateway'', [[#The Team]] utilizes [[Educaship pfSense]]. For a while, [[#The Team]] also tried [[iptables]] as a [[firewall]] together with [[Fail2ban]]. [[Fail2ban]] monitors log files (e.g. /var/log/auth.log, /var/log/apache/access.log, etc.) for selected entries and runs scripts based on them. Most commonly, it is used to block IP addresses that may belong to hosts trying to breach the system's security: it can ban any host IP address that makes too many login attempts or performs any other unwanted action within a time frame defined by the administrator. [[Fail2ban]] supports both [[IPv4]] and [[IPv6]].
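: The "too many attempts within a time frame" rule can be illustrated with a small sliding-window counter. This is a simplified sketch of the general technique and not [[Fail2ban]]'s actual implementation; the class name <code>BanTracker</code> and its parameters <code>max_retry</code> and <code>find_time</code> are hypothetical, chosen here only for illustration.

```python
from collections import defaultdict, deque


class BanTracker:
    """Ban an IP address after too many failures inside a time window (sketch)."""

    def __init__(self, max_retry: int, find_time: float):
        self.max_retry = max_retry    # failures allowed before a ban
        self.find_time = find_time    # window length in seconds
        self._failures = defaultdict(deque)
        self.banned = set()

    def record_failure(self, ip: str, timestamp: float) -> bool:
        """Register one failed attempt; return True if the IP is banned."""
        window = self._failures[ip]
        window.append(timestamp)
        # Forget failures that are older than the window.
        while window and timestamp - window[0] > self.find_time:
            window.popleft()
        if len(window) >= self.max_retry:
            self.banned.add(ip)
        return ip in self.banned


tracker = BanTracker(max_retry=3, find_time=600)
for moment in (0, 10, 20):
    is_banned = tracker.record_failure("203.0.113.5", moment)
print(is_banned)  # True
```

: A real deployment would feed such a tracker from parsed log entries and translate a ban into a firewall rule; both steps are left out of the sketch.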
:*Bridges with external [[IP адрес|IP addresses]], which are reachable from the Internet, distribute requests from the Internet among the ''Nodes'' and return the responses of the ''Nodes'' back to the Internet.
 
:The tools of the ''Farm'' do not and cannot support high availability of the bridges. Their owner, the hosting provider [[Hetzner]], is responsible for the fault tolerance of the bridges.
 
  
==Характеристики Железа==
+
===Gateway features===
As a result of the [[#Выбор Железа|Metal selection]] process, servers with the following characteristics were selected:
 
  
===Metal 1===
+
===Gateway functions===
:1 x Dedicated Root Server "Server Auction"
+
[[FreeBSD]], HA, VPN, LDAP, backups, CARP VIP
:* Intel Xeon E3-1275v5
 
:* 2x SSD M.2 NVMe 512 GB
 
:* 4x RAM 16384 MB DDR4 ECC
 
:* NIC 1 Gbit Intel I219-LM
 
:* Location: FSN1-DC1
 
:* Rescue system (English)
 
:* 1 x Primary IPv4
 
  
===Metal 2===
+
: [[#The Gateway]] can be compared to an executive secretary, who (a) takes external clients' requests, (b) serves as a [[gatekeeper]] by checking the validity of those requests, (c) when a request is valid, selects the internal resource to dispatch it to, (d) dispatches the request to the selected resource, (e) gets internal responses, and (f) returns them to the client in the outside world.
:1 x Dedicated Root Server "Server Auction"
 
:* Intel Xeon E3-1275v5
 
:* 2x SSD M.2 NVMe 512 GB
 
:* 4x RAM 16384 MB DDR4 ECC
 
:* NIC 1 Gbit Intel I219-LM
 
:* Location: FSN1-DC1
 
:* Rescue system (English)
 
:* 1 x Primary IPv4
 
  
===Metal 3===
+
: Thus, [[#The Gateway]]:
:1 x Dedicated Root Server "Server Auction"
+
:# '''Monitors''' (constantly) the state of internal resources of [[#The Farm]].
:* Intel Core i7-7700
+
:# '''Receives''' requests from the world outside of [[#The Farm]].
:* 2x SSD SATA 512 GB
+
:# '''Checks''' the validity of external requests, while serving as a [[firewall]].
:* 2x RAM 16384 MB DDR4
+
:# When a request is valid, '''selects''' the one of [[#The Node]]s to dispatch it to. [[#The Gateway]] is responsible for dispatching external requests to those and only those internal resources that [[#Cluster monitoring]] has identified as operational.
:* NIC 1 Gbit Intel I219-LM
+
:# '''Dispatches''' the request to [[#The Node]] that was selected.
:* Location: FSN1-DC1
+
:# '''Collects''' internal responses.
:* Rescue system (English)
+
:# '''Returns''' those responses to the outside world.
:* 1 x Primary IPv4
 
  
===Metal 4===
+
: To be more accessible to its clients, [[#The Gateway]] utilizes public [[IPv4 address]]es.
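: The sequence of steps that [[#The Gateway]] performs can be sketched as one function. This is a conceptual illustration only, not how [[Educaship pfSense]] is implemented; the <code>Request</code> class, the function name, and the callbacks are all hypothetical stand-ins for monitoring, firewall rules, and network dispatch.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional


@dataclass
class Request:
    """A simplified external request (hypothetical structure)."""
    source_ip: str
    path: str


def handle_request(
    request: Request,
    nodes: List[str],
    is_operational: Callable[[str], bool],
    is_valid: Callable[[Request], bool],
    send: Callable[[str, Request], str],
) -> Optional[str]:
    """Walk one received request through the Gateway steps."""
    # Check validity (the firewall role); invalid requests are dropped.
    if not is_valid(request):
        return None
    # Consider only the Nodes that monitoring reports as operational.
    candidates = [node for node in nodes if is_operational(node)]
    if not candidates:
        return None
    selected = candidates[0]
    # Dispatch, collect the internal response, and return it to the caller.
    return send(selected, request)


# Usage with stand-in callbacks; "node1" is pretended to be down.
response = handle_request(
    Request(source_ip="203.0.113.10", path="/"),
    nodes=["node1", "node2", "node3"],
    is_operational=lambda node: node != "node1",
    is_valid=lambda req: req.path.startswith("/"),
    send=lambda node, req: f"{node} served {req.path}",
)
print(response)  # node2 served /
```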
:1 x Dedicated Root Server "Server Auction"
 
:* Intel Xeon E5-1650V3
 
:* 2x HDD SATA 2,0 TB Enterprise
 
:* 8x RAM 16384 MB DDR4 ECC reg.
 
:* NIC 1 Gbit Intel I210
 
:* Location: FSN1
 
:* Rescue system (English)
 
:* 1 x Primary IPv4
 
  
==Farm Nodes==
+
===Gateway provisions===
The ''Farm'' is run by three ''Nodes''. Each ''Node'' is a separate ''Metal'' server powered by several kinds of software.
+
: [[#The Gateway]] is deployed in a [[virtual machine]] ([[virtual machine|VM]]) of [[#The Environment]].
  
===Backup===
+
===UI of the Gateway===
:[[OpenZFS]] or [[RAID]] creates backup copies and can be used to restore the data of the ''Metal'' after an accident. The hard disks of every ''Metal'' server are paired, for example, 2x SSD SATA 512 GB. [[RAID]] or [[OpenZFS]] copies the data of the primary disk of the ''Metal'' to the reserve disk. If the primary disk loses data because of a failure, the reserve disk is used to restore the data to the primary disk. [[RAID]] or [[OpenZFS]] is installed directly on the ''Metal''.
+
: With regards to [[#User interfaces (UIs)]], [[#The Gateway]] features its administrative interface, which belongs to [[#Dashboards]].
  
===Виртуальные среды===
+
==Gateway components==
:[[ProxmoxVE]], currently v.7.2, creates the virtual environments (hereinafter, the ''Environments''). This software interacts with the ''Metal'' through the [[Debian]] [[операционная система|operating system]], for which it is configured. This operating system comes bundled with [[ProxmoxVE]].
+
[[#The Gateway]] includes [[#Firewall and router]], [[#Load balancer]], and [[#Web server]].
  
==Node networks==
+
===Firewall and router===
The network of each ''Node'' uses a bridge following the default model described in [https://pve.proxmox.com/wiki/Network_Configuration#_default_configuration_using_a_bridge Network Configuration].
+
: [[Educaship pfSense]] plays the roles of [[firewall]], [[reverse proxy]], and the platform to which [[#Load balancer]] and [[#Web server]] are attached.
  
===Хранилища Узлов===
+
===Load balancer===
:To store data, every ''Node'' uses the [[Ceph]] distributed storage platform. The storages of the individual ''Nodes'' are synchronized through the internal network of the infrastructure.
+
: As a [[load balancer]], [[Educaship pfSense]] uses a select version of [[HAProxy]] that is specifically configured as a [[pfSense]] add-on. As of summer 2023, no full [[HAProxy Manager]] exists in [[#The Farm]], and a [[round robin]] model is activated for load balancing.
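: The [[round robin]] idea itself is simple: requests are handed to the backends in a fixed, repeating order. The sketch below shows only that idea with hypothetical Node names; it ignores server weights and health checks, which a real balancer such as [[HAProxy]] also takes into account.

```python
from itertools import cycle
from typing import Iterable, Iterator


def round_robin(nodes: Iterable[str]) -> Iterator[str]:
    """Return an endless iterator that repeats the nodes in order."""
    return cycle(list(nodes))


# Hypothetical Node names; each incoming request takes the next one.
picker = round_robin(["node1", "node2", "node3"])
first_four = [next(picker) for _ in range(4)]
print(first_four)  # ['node1', 'node2', 'node3', 'node1']
```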
  
:Thus, the whole storage of the ''Farm'' comprises specially reserved disk spaces of the ''Metal'' servers and software whose operation is distributed across all ''Nodes''. Thanks to this software, the storages of the individual ''Nodes'' are synchronized with one another to eliminate a single point of failure.
+
===Web server===
 +
: As its [[web server]], [[pfSense]] utilizes [[lighttpd]]. Prior to the deployment of [[Educaship pfSense]], [[#The Team]] utilized two [[web server]]s to communicate with the outside world via [[HTTP]]. [[Nginx]] handled requests initially, and [[Apache HTTP Server]] handled those requests that hadn't been handled by [[Nginx]].
  
===IP addresses===
+
==Web architecture==
:In the [[ProxmoxVE]] networks, we use three types of [[IP адрес|IP addresses]]:
+
For the purposes of this wikipage, "web architecture" refers to [[#The Farm]]'s outline of [[DNS record]]s and [[IP address]]es.
:#To manage the [[ProxmoxVE]] environments, we use the [[IPv4 адрес|IPv4 addresses]] and [[IPv6 адрес|IPv6 addresses]] of the individual ''Metal'' servers.
 
:#For the internal network of the three ''Metal'' servers, which is assembled on one [[Hetzner vSwitch]], a private [[IP адрес|IP address]] is used. This network is not accessible from the [[Интернет|Internet]]; above all, the storages of the ''Metal'' servers are synchronized through it. For this network, an address with the type "/24" is selected.
 
:#The external network requires purchasing additional [[IP адрес|IP addresses]]; [[IPv4 адрес|IPv4 addresses]] are expensive, while [[IPv6 адрес|IPv6 addresses]] may not provide stable operation. At the moment, we have bought one [[IPv6 адрес|IPv6 address]] and are testing it. This [[IPv6 адрес|IPv6 address]] will be assigned to all [[VM]]s and [[контейнер|containers]] that will be created in the infrastructure. To work with the resources of the ''Farm'', users will request exactly this address. This network is also assembled on the same ''Metal'' servers by another [[Hetzner vSwitch]].
 
  
===Web server===
+
===Channels and networks===
:As the web server, some applications will use [[apache]], while others will use [[nginx]].
+
: [[#The Farm]]'s communication channels are built on [[#The Metal]] and [[#The Bridges]]. Currently, [[#The Cluster]] uses three communication channels, each of which serves one of the networks as follows:
 +
:# '''[[wide area network|WAN]]''' ([[wide area network]]), which is [[#The Farm]]'s public network that uses external, public [[IPv4 address]]es to integrate [[#The Gateway]] into the [[Internet]]. The public network is described in the [[#The Gateway]] section of this wikipage.
 +
:# '''[[local area network|LAN]]''' ([[local area network]]), which is [[#The Farm]]'s private network that uses internal, private [[IPv6 address]]es to integrate [[#The Gateway]] and [[#The Node]]s into one network cluster. This network cluster is described in [[#The Environment]] section of this very wikipage.
 +
:# '''[[storage area network|SAN]]''' ([[storage area network]]), which is [[#The Farm]]'s private network that uses internal, private [[IPv6 address]]es to integrate storage spaces of [[#The Node]]s into one storage cluster. This storage cluster is described in [[#The Storage]] section of this wikipage.
 +
: [[#The Farm]]'s usage of [[IP address]]es is best described in the [[#IP addresses]] section.
  
===DNS zone===
+
===DNS zone===
:To connect to the Internet, the following records are created in the [[DNS]] zone:
+
: To locate [[#The Farm]]'s public resources in the [[Internet]], the following [[DNS record]]s are created in [[#The Farm]]'s [[DNS zone]]:
 
:{|class="wikitable"
 
:{|class="wikitable"
!Resource record!!Type!!Data!!Comment (not a part of the record)
+
!Field!!Type!!Data!!Comment (not a part of the records)||Review
|-
 
|pm1.bskol.com||AAAA||2a01:4f8:10a:439b::2||Environment 1
 
|-
 
|pm2.bskol.com||AAAA||2a01:4f8:10a:1791::2||Environment 2
 
|-
 
|pm3.bskol.com||AAAA||2a01:4f8:10b:cdb::2||Environment 3
 
 
|-
 
|-
|pbs.bskol.com||AAAA||2a01:4f8:fff0:53::6||Backup server of the Environments
+
|pm1.bskol.com||[[AAAA record]]||2a01:4f8:10a:439b::2||Node 1||No data
 
|-
 
|-
|pf.bskol.com||AAAA||2a01:4f8:fff0:53::6||rowspan="2"|pfsense
+
|pm2.bskol.com||[[AAAA record]]||2a01:4f8:10a:1791::2||Node 2||No data
 
|-
 
|-
|pf.bskol.com||A||88.99.71.85
+
|pm?.bskol.com||[[AAAA record]]||&nbsp;||Node ?||No data
 
|-
 
|-
|npm1.bskol.com||A||88.99.218.172||NGINX of Environment 1
+
|pf.bskol.com||[[A record]]||88.99.71.85||[[Educaship pfSense]]||Record is not operational
 
|-
 
|-
|npm2.bskol.com||A||88.99.71.85||NGINX of Environment 2
+
|talk.cnmcyber.com||[[A record]]||188.34.147.106||[[CNM Talk]] ([[Educaship Jitsi]])||Passed
 
|-
 
|-
|npm3.bskol.com||A||94.130.8.161||NGINX of Environment 3
+
|corp.cnmcyber.com||[[A record]]||188.34.147.106||[[CNM Corp]] ([[Educaship Odoo]])||Passed
 
|-
 
|-
|jitsi.bskol.com||AAAA||2a01:4f8:fff0:53::2||[[Брацки Жици|Жици]] ([[Jitsi]])
+
|social.cnmcyber.com||[[A record]]||188.34.147.106||[[CNMCyber.com]] ([[Educaship HumHub]])||Passed
 
|-
 
|-
|jitsi1.bskol.com||A||88.99.218.172||IPv4 access to [[Брацки Жици|Жици]] ([[Jitsi]])
+
|portainer.cnmcyber.com||[[A record]]||188.34.147.107||Docker server; Docker containers are used for all monitoring||Passed
 
|-
 
|-
|sprava.bskol.com||AAAA||2a01:4f8:fff0:53::3||[[Брацка Справа|Справа]] ([[Odoo]])
+
|dash-status.cnmcyber.com||[[A record]]||188.34.147.107||Dashboard for monitoring status powered by [[Uptime Kuma]]||Passed
 
|-
 
|-
|sprava2.bskol.com||A||88.99.71.85||IPv4 access to [[Брацка Справа|Справа]] ([[Odoo]])
+
|status.cnmcyber.com||[[A record]]||188.34.147.107||||Passed
 
|-
 
|-
|setka.bskol.com||AAAA||2a01:4f8:fff0:53::4||[[Брацка Сетка|Сетка]] ([[HumHub]])
+
|influxdb.cnmcyber.com||[[A record]]||188.34.147.107||[[InfluxDB]]||Passed
 
|-
 
|-
|setka2.bskol.com||A||88.99.71.85||IPv4 access to [[Брацка Сетка|Сетка]] ([[HumHub]])
+
|monitor.cnmcyber.com||[[A record]]||188.34.147.107||[[Grafana]]||Passed
 
|-
 
|-
|svazka.bskol.com||AAAA||2a01:4f8:fff0:53::5||[[Брацки Связка|Связка]] ([[SuiteCRM]])
+
|npm.cnmcyber.com||[[A record]]||188.34.147.107||[[Nginx Proxy Manager]]||Passed
 
|-
 
|-
|svazka2.bskol.com||A||88.99.71.85||IPv4 access to [[Брацки Связка|Связка]] ([[SuiteCRM]])
+
|pass.cnmcyber.com||[[A record]]||188.34.147.107||[[Passbolt]]||Passed
 
|}
 
|}
  
==Пользовательские прилады==
+
==Web server files==
The ''Farm'' provides high availability of [[Брацка Сетка]], [[Брацка Справа]], and, possibly, other applications that belong to [[Делово Бюро]].
 
  
===Сетка===
+
==Legacy==
*[[Брацка Сетка]] is an application that serves as a social network support system; it is built on the ready-made [[HumHub]] software solution.
 
  
===Справа===
+
haproxy.bskol.com 86400 A 0 185.213.25.206
*[[Брацка Справа]] is a tool for managing the human and material resources of an enterprise; it is built on the [[Odoo]] software.
+
influx.bskol.com 86400 A 0 49.12.5.41
 +
monitor.bskol.com 86400 A 0 49.12.5.41
 +
pbs.bskol.com 86400 A 0 88.99.214.92
 +
pf.bskol.com 86400 A 0 88.99.71.85
 +
pm1.bskol.com 86400 A 0 88.99.218.172
 +
pm2.bskol.com 86400 A 0 88.99.71.85
 +
pm3.bskol.com 86400 A 0 88.99.214.92
 +
zabbix.bskol.com 86400 A 0 167.235.255.244
 +
pbs.bskol.com 86400 AAAA 0 2a01:4f8:10a:3f60::2
 +
pf.bskol.com 86400 AAAA 0 2a01:4f8:fff0:53::6
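The legacy records above share a five-column layout: name, TTL, type, a fourth column that is always 0, and the address. As a convenience for anyone auditing them, the sketch below parses such lines into structured records. The meaning of the fourth column is not stated on this wikipage, so the sketch simply ignores it; the <code>Record</code> type and <code>parse_records</code> function are hypothetical names.

```python
from typing import List, NamedTuple


class Record(NamedTuple):
    name: str
    ttl: int
    rtype: str
    data: str


def parse_records(text: str) -> List[Record]:
    """Parse 'name ttl type 0 data' lines; skip lines of any other shape."""
    records = []
    for line in text.splitlines():
        parts = line.split()
        if len(parts) != 5:
            continue
        name, ttl, rtype, _unused, data = parts  # fourth column is ignored
        records.append(Record(name, int(ttl), rtype, data))
    return records


SAMPLE = """\
pf.bskol.com 86400 A 0 88.99.71.85
pf.bskol.com 86400 AAAA 0 2a01:4f8:fff0:53::6
"""

records = parse_records(SAMPLE)
for record in records:
    print(record.name, record.rtype, record.data)
```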
  
===Жици===
+
===IP addresses===
*[[Брацки Жици]] is the tool of Брацка Школа for organizing video and audio conferences; it is built on the [[Jitsi]] software.
+
: To locate its resources in the [[#Communication channels]], [[#The Farm]] uses three types of [[IP address]]es:
 +
:# To access [[#The Environment]] from the outside world, [[#The Farm]] features public [[IPv6 address]]es. One address is assigned to each of [[#The Node]]s. Since there are three of them, three addresses of that type are created.
 +
:# For an internal network of [[#The Node]]s, which is assembled on the [[#Internal Bridge]], a private [[IP address]] is used. This network is not accessible from the Internet and not included in [[#The Farm]]'s [[DNS zone]]. For instance, [[#The Storage]] utilizes this network to synchronize its data. For this network, an address with the type "/24" is selected.
 +
:# For an external network of three ''Nodes'', which is assembled on the [[#External Bridge]], [[#The Farm]] features public [[IPv4 address]]es. They are handled by [[#Web intermediaries]].
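: To illustrate what an address of the "/24" type means for the internal network, the sketch below inspects a sample private network with Python's standard <code>ipaddress</code> module. The network <code>10.0.0.0/24</code> is a hypothetical example; the actual private range used at [[#The Farm]] is not stated on this wikipage.

```python
import ipaddress

# Hypothetical private network; only the "/24" prefix length comes from the text.
network = ipaddress.ip_network("10.0.0.0/24")

total_addresses = network.num_addresses   # all addresses in the block
usable_hosts = list(network.hosts())      # excludes network and broadcast addresses

print(total_addresses)      # 256
print(len(usable_hosts))    # 254
print(network.netmask)      # 255.255.255.0
print(network.is_private)   # True
```

: A block of that size leaves ample room for three ''Nodes''.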
  
===Связка===
+
===SSL certificates===
*[[Брацка Связка]] is an application that serves as a customer relationship management system; it is built on the SuiteCRM software.
 
  
==Monitoring==
+
==Backup box==
:No special features are currently used.
+
A backup box is deployed on a 1 TB, unlimited traffic storage box BX-11 that has been rented for that purpose.
:Candidates' proposals:
 
:#Stack -- [[prometheus]] + node-exporter + [[grafana]]
 
:#[[Prometheus]] to monitor VMs, Influx to monitor PVE nodes, [[Grafana]] for the dashboard
 
:#(M) [[grafana]] + [[influxdb]] + [[telegraf]], as well as [[zabbix]]. To monitor the website, use [[uptimerobot]]
 
  
==Development history==
+
===COTS for backups===
 +
: [[#The Team]] utilizes no additional software beyond [[ProxmoxVE]] for backups. Initially, [[Proxmox Backup Server]] was used. However, it consumed the active storage. As a result, the storage box was simply attached to [[#The Environment]], and backups go directly between that storage and [[#The Environment]].
  
===Background===
+
===Box features===
:The user applications of [[Делово Бюро]] were originally installed on what is now called [[Кампусна Ферма]]. While the [[Брацки Техобзор]] course was being created, the idea emerged to move them to a separate platform. The idea developed further when it was decided to build a cluster on bare-metal servers.
+
: 10 concurrent connections, 100 sub-accounts, 10 snapshots, 10 automated snapshots, FTP, FTPS, SFTP, SCP, Samba/CIFS, BorgBackup, Restic, Rclone, rsync via SSH, HTTPS, WebDAV, usable as a network drive
  
===Own attempts===
+
===Box functions===
:In spring 2022, a second bare-metal server was rented in addition to the bare-metal server of [[Опытна Ферма]]. For a couple of months, Natalia and the contractor Andrey tried to make a cluster out of them on their own.
+
: [[#The Provider]]'s description: Storage Boxes provide you with safe and convenient online storage for your data. Score a Storage Box from one of Hetzner Online's German or Finnish data centers! With Hetzner Online Storage Boxes, you can access your data on the go wherever you have internet access. Storage Boxes can be used like an additional storage drive that you can conveniently access from your home PC, your smartphone, or your tablet. Hetzner Online Storage Boxes are available with various standard protocols which all support a wide array of apps. We have an assortment of diverse packages, so you can choose the storage capacity that best fits your individual needs. And upgrading or downgrading your choice at any time is hassle-free!
  
===Advice on creation===
+
===Box provisions===
:As a result of our own attempts, it became clear that the team lacked the competencies to create a cluster. To obtain outside expertise, the following posting was placed on [[Upwork]]:<blockquote><p><strong>Hetzner/RAID/Proxmox consultant is needed</strong>&nbsp;<i>Tech Support</i></p><p>Hey, guys, I need a consultant for this project - https://pravka.bskol.com/en/CNMC_bare-metal. I plan to buy a bare-metal server at hetzner.de, setup RAID and Proxmox, as well as start setting up the rest of technology</p></blockquote>
 
:Based on the consultations, the assignments for creating the cluster were formulated. [[Деловы прилады]] were chosen as the "residents" of this cluster because the infrastructure requirements of [[кампусны прилады]] are lower.
 
  
===Cluster creation===
+
===UI for backups===
:In summer 2022, the cluster creation project was formalized and Karolina was appointed its coordinator. She engaged contractors to build the ''Farm'' according to the requirements developed on this wikipage.
+
: [[#User interfaces (UIs)]]
  
:The one-time job posting was published on [[Upwork]]:<blockquote><p><strong>HA Proxmox Hetzner cluster is needed</strong>&nbsp;<i>Systems Administration</i></p><p>Guys, we need the most affordable well-documented HA (high availability) ProxmoxVE 7.2 cluster that is assembled on three Hetzner nodes:</p><ul><li>2 Intel Xeon E3-1275v5/2x SSD M.2 NVMe 512 GB/4x RAM 16384 MB DDR4 ECC/NIC 1 Gbit Intel I219-LM and</li><li>One Intel Core i7-7700/2x SSD SATA 512 GB/2x RAM 16384 MB DDR4/NIC 1 Gbit Intel I219-LM.</li></ul><p>Ceph, iptables. We plan that each node would have one VM or container for testing. We will assign a domain name to that.</p><p>Each has a primary IPv4. However, there are some unresolved issues related to the network. Initially, we planned to use vSwitch; however, it seems to require additional IP addresses, from which IPv4 are expensive and IPv6 may not be able to deliver HA. Thus, we plan to offer two different contract prices -- one is if we need to buy additional IPv4 addresses for vSwitch and another is if we don't.</p><p>We see two parts of acceptance testing. If both are successful, the contract shall be considered completed.</p><ol><li>During software testing, we will shut down 2 of 3 nodes to see whether the cluster is still available.</li><li>During documentation testing, we will erase the software from one, implement the rescue, and one expert will try to restore the software using your documentation. She will video-record her attempts and, if not successful, will provide you with the recording, so either you can show her errors or correct yours.</li></ol><p>What else do you need? If nothing, please give your minimum project budget (your project fare + initial costs of additional purchases such as setup fees for additional IP addresses, if any, required by the contract + first year costs of additional purchases, if any) and timeframe up to 2-3 weeks.</p><p>This project is a fixed-price one. 
When we send you an offer, we will change the terms. To complete the project, the selected contractor will be given an admin access, but not full robot credentials, for three weeks. After three weeks, that access would be revoked; if you don't complete the project by that time, you will never finish it!</p></blockquote>
+
==Used terms==
 +
On this very wikipage, a few abbreviations and terms are commonly used.
 +
:* ''Bridge''. Either of the two [[Hetzner vSwitch]]es that [[#The Farm]] utilizes.
 +
:* ''UI''. A [[user interface]], which is the ''COTS'' feature that allows a ''COTS'' instance and its users to interact.
  
:The contractors were selected on the following principle. We need a cluster built from scratch, and we will award the contract to the one
+
===The COTS===
:#Who is able to do it,
+
: [[Commercial off-the-shelf]] ([[commercial off-the-shelf|COTS]]) software.
:#Whose schedule does not stretch the completion of the contract beyond two months, and
 
:#Whose budget is the lowest. This means the whole budget, including both the contractors' pay and the costs of purchasing and annually maintaining the ''Farm''.
 
  
:The contract was awarded on July 8 and was valid until August 12, 2022.
+
===The Farm===
 +
[[CNM Bureau Farm]], which this very wikipage describes.
  
:The contractor completed the work under the contract; the result is a ProxmoxVE 7.2 HA cluster implemented on three Hetzner nodes using vSwitch. During development and at the acceptance stage, a number of problems were identified, and their resolution was deferred to the next project. The main problem is access to the cluster; at the moment, access is implemented through an additional IPv6 address. However, this solution does not make our cluster sufficiently accessible, since not all Internet service providers work with IPv6. Buying IPv4 addresses for this purpose is a good solution, but a very costly one. Therefore, a decision was made to look for other options. Many questions also remain about improving the security, backup, and monitoring of the cluster.
+
===The Node===
 +
One hardware [[bare-metal server]] of [[#The Infrastructure]] with all of the software installed on top of it.
  
===IPv4 access===
+
===The Team===
:The ProxmoxVE 7.2 HA cluster is implemented on three Hetzner nodes using vSwitch. IPv6 is used for external access to the cluster, which limits its accessibility on the Internet. An additional IPv4 address could be used. We are interested in other ways to solve the entry problem.
+
[[CNMCyber Team]].
  
:The one-time job posting was published on Upwork:
+
==See also==
:Guys, we need IPv4 access to our HA (high availability) ProxmoxVE 7.2 cluster that is assembled on three Hetzner nodes through vSwitch.
 
  
:We know that we can add a IPv4 net to vSwitch. Do we have other options?
+
===Related lectures===
 
+
:* [[What CNM Farms Are]].
:As the same or separate projects, we would also like to consider some improvements to its security (right now, we use iptables only), backup (right now, we deploy Proxmox backup only, no ZFS, no RAID1 nor anything else), and monitoring (no any special features is deployed).
 
 
 
===Entry options===
 
:#Put [[HAProxy]] at the entry on a separate VPS.
 
 
 
===Security===
 
:At the moment, only [[iptables]] is used. Questions about improving security are deferred to a separate project.
 
:Candidates' proposals:
 
:# [[ipset]] + [[fail2ban]] + [[iptables]]
 
:# [[Pfsense FW]]
 
:#(M) firewall + [[fail2ban]].
 
 
 
===Backup===
 
:At the moment, Proxmox backup ([[Proxmox Backup Server]]) is used. Other backup options are being considered.
 
 
 
===Commissioning===
 
 
 
==Handover and acceptance==
 
 
 
===Scope of work===
 
:We provide the contractor with the [[#Инфраструктура|Infrastructure]] and the requirements set out on this wikipage. The contractor must present us with the acceptance object: excellently documented [[#Виртуальные среды|Virtual environments]] with highly available [[#Пользовательские прилады|User applications]] installed.
 
 
 
===Acceptance tests===
 
:To make sure that what the contractor presents is what we need (that is, it meets the acceptance criteria), the following acceptance testing procedure is established:
 
:#We test the created ''Farm'' by forcibly shutting down two randomly chosen ''Metal'' servers out of the three available. If the ''Farm'' keeps working, the build is accepted.
 
:#The software of a randomly chosen ''Node'' (one of the three) is erased, and our specialist, Natly, restores it using the created documentation while video-recording the restoration. The restored ''Farm'' is tested in the same way as the created one. If the ''Farm'' keeps working, the documentation is accepted. If not, the video is handed to the contractor so that the contractor can either improve the documentation or point out Natly's errors.
 
 
 
==Questions to clarify==
 
===Архитектура платформы===
 
:There are two not fully resolved problems concerning the [[#ПО платформы|Platform software]]:
 
:*One contractor proposes using [[TrueNAS]] instead of [[Ceph]].
 
:*Before the project started, one specialist proposed using a [[Microtik]] router to set up two IP addresses on the proxy: the first would be used for the internal virtual machines, if they work properly, and the second would be put into a bridge for the external servers, with the incoming traffic divided by Linux means such as a firewall. In addition, he proposed installing a [[DHCP]] server on the same proxy to hand out addresses to the machines. Another specialist believed that there are no secure [[DHCP]] servers on the market. As a result, neither the routers nor the [[DHCP]] server were installed.
 
  
===Getting started===
+
===Useful recommendations===
:*What is needed from us, apart from awarding the contract, decisions on the [[#Архитектура платформы|Platform architecture]], and the ''Metal'' data?
+
:* https://www.informaticar.net/how-to-setup-proxmox-cluster-ha/ (using Ceph without [[Hetzner vSwitch]])
:*How useful are the [[#Полезные рекоммендации|Useful recommendations]] for this development?
+
:* https://community.hetzner.com/tutorials/hyperconverged-proxmox-cloud (using Ceph with [[Hetzner vSwitch]])
 +
:* https://pve.proxmox.com/wiki/High_Availability (general [[ProxmoxVE]] [[HA]] functionality)
 +
:* https://docs.hetzner.com/robot/dedicated-server/network/vswitch/ (general [[Hetzner vSwitch]] functionality)
  
==Полезные рекоммендации==
+
Jitsi/Odoo/HumHub, DNS, [[Educaship pfSense]], monitoring -- [[Telegraf]] + [[InfluxDB]] + [[Grafana]], [[Uptime Kuma]], [[Passbolt]], mail server, LDAP, DNS zone, IPv4, what else to do? HA test, alpha testing, DNS
*https://www.informaticar.net/how-to-setup-proxmox-cluster-ha/ (using Ceph without [[Hetzner vSwitch]])
 
*https://community.hetzner.com/tutorials/hyperconverged-proxmox-cloud (using Ceph with [[Hetzner vSwitch]])
 
*https://pve.proxmox.com/wiki/High_Availability (general [[ProxmoxVE]] [[HA]] functionality)
 
*https://docs.hetzner.com/robot/dedicated-server/network/vswitch/ (general [[Hetzner vSwitch]] functionality)
 
  
 
[[Category:CNM Cloud products]][[Category: CNM Cyber Orientation]][[Category: Articles]]
 

Latest revision as of 15:57, 14 April 2024

CNM Bureau Farm (formerly known as CNM EndUser Farm; hereinafter, #The Farm) is the segment of Opplet that is responsible for providing users with CNM Corp and CNMCyber.com. Most likely, #The Farm will also handle the applications that CNM Fed Farm currently provides.

CNMCyber Team (hereinafter, #The Team) develops and administers #The Farm. To sustain #The Farm online, #The Team rents Bureau Infrastructure.


In the nutshell

For the purposes of this very wikipage, end-users are called #The End-Users. Collectively, end-user applications are called #The User Apps.

Architecture

While using #The Farm, #The End-Users work with #The User Apps that are installed in #The Cluster that is, consequently, hosted by #The Infrastructure. #The Cluster consists of #The Storage, #The Environment, #The Gateway. #The Infrastructure includes #The Bridges and #The Metal, which is #The Farm's hardware.

Cluster-based

To mitigate a single point of failure (SPOF), #The Farm is built on not just one, but three bare-metal servers. Each of those hardware servers, with all of the software installed on top of it (hereinafter, #The Node), is self-sufficient to host #The User Apps. Various solutions such as #The uptime tools orchestrate coordination between #The Nodes.

COTS-powered

#The Farm's software is a collection of commercial off-the-shelf (COTS) packages (hereinafter, #The COTS). In plain English, #The COTS is the software that is already available on the market. Not a single line of programming code has been written specifically for #The Farm. Only time-tested, market-proven solutions have been used. #The User Apps use HumHub, Jitsi, and Odoo instances. #The Cluster uses Ceph, Proxmox, and pfSense.

Addressing the needs

Development and sustenance of #The Farm address two business needs of CNMCyber Team (hereinafter, #The Team). For #The Team, #The Farm shall serve as both #Tech side of the Apps and #Worksite.

Tech side of the Apps

#The Team needs to provide #The End-Users with the services of #The User Apps 24 hours a day, 7 days a week. #The Farm shall power #The User Apps technologically; their content is outside of #The Farm's scope.

Worksite

#The Team needs to provide those incumbents of CNMCyber practices who work with #The Farm's technologies with their worksite.

Farm users

For the purposes of this very wikipage, a farm user refers to any user of #The Farm. The farm user's authentication and authorization administration is a part of identity and access management (IAM).

The End-Users

For the purposes of this very wikipage, the Patron refers to an end-user of #The User Apps. The Patrons access #The User Apps via the graphical user interfaces (GUIs) that are associated with the particular application. At #The Farm, those #User interfaces (UIs) are located at IPv4 addresses. Opplet.net provides the Patrons with access automatically; alternatively, #The Power-Users grant it manually. The Patrons can access #The User Apps without administrative tools only. The Patrons may or may not be members of #The Team.

The Power-Users

For the purposes of this very wikipage, the Power-User refers to a power-user of #The User Apps. By definition, the Power-Users are those members of #The Team who are authorized to access more of #The User Apps' resources than #The End-Users. Normally, those resources are administrative tools that allow the Power-Users to administer one or more of #The User Apps.
While having administrative access, the Power-Users serve as application-level administrators. They access #The User Apps via the same graphical user interfaces (GUIs) as #The End-Users, but administrative-level UIs display administration tools in addition to those tools that are available to non-privileged end-users.
At the moment, #The Sysadmins can grant the Power-User rights to one or more of #The End-Users manually. Administrative access credentials are classified and securely stored in the application project spaces of CNM Lab. Those incumbents of CNMCyber practice who work with #The Farm's technologies may or may not be the Power-Users.

The Sysadmins

For the purposes of this very wikipage, the Sysadmin refers to a system administrator of #The Cluster. By definition, the Sysadmins are those members of #The Team who are authorized to access at least some of #The Cluster's resources.
#The Superusers can grant the Sysadmin rights to one or more of #The End-Users manually. Administrative access credentials are classified and securely stored in the cluster project spaces of CNM Lab. No incumbent of CNMCyber practice can be the Sysadmin; only those who work in CNMCyber Office can.

The Superusers

For the purposes of this very wikipage, the Superuser refers to a superuser of #The Farm. By definition, the Superusers are those members of #The Team who are authorized to access any part of #The Farm starting with its hardware. Access to #The Metal is carried out via #UI for the Metal.
#The Provider grants the Superuser rights to CNMCyber Customer, who can manually pass it to one or more members of CNMCyber Office. The Superuser credentials are not stored in CNM Lab.

User interfaces (UIs)

For the purposes of this very wikipage, a user interface (UI) refers to #The COTS' feature that allows #The COTS' instance and its users to interact. UIs of #The User Apps are described in the wikipages that are dedicated to particular applications. The other UIs of #The Farm are described in the #UI of the Environment, #UI of the Storage, #UI of the Gateway, and #UI for the Metal sections of this very wikipage.

Dashboards

For the purposes of this very wikipage, a Dashboard refers to a graphical user interface (GUI) that either belongs to any #The COTS package installed in #The Farm or is provided by #The Provider. This screen-based interface allows #Farm users to interact with #The User Apps and other software through graphical buttons, icons, or hyperlinked texts rather than typed commands in command line interfaces (CLIs).

Third-party UIs

For the purposes of this very wikipage, a Third-party UI refers to any user interface (UI) that neither belongs to any #The COTS package installed in #The Farm nor is provided by #The Provider. #The Sysadmins and #The Superusers may use access tools such as PuTTY and Midnight Commander to access #UI for the Metal.

The User Apps

For the purposes of this very wikipage, the User Apps refer to those end-user applications with which #The End-Users interact.

HumHub

CNMCyber.com, which is the end-user instance of Educaship HumHub.

Jitsi

CNM Talk, which is the end-user instance of Educaship Jitsi.

Odoo

CNM Corp, which is the end-user instance of Educaship Odoo.

The Cluster

For the purposes of this very wikipage, the Cluster refers to all of the software between #The User Apps and #The Infrastructure.

Cluster components

#The Cluster consists of #The Nodes and their management tools. The following components compose #The Cluster:
  1. #The Storage that is powered by Educaship Ceph to provide #The Environment with stored objects, blocks, and files. Three storage spaces of #The Nodes create one distributed storage foundation.
  2. #The Environment that is powered by Educaship Proxmox to make containers and virtual machines (VMs) available to #The User Apps, so #The User Apps can function properly. Each of #The Nodes features its own environment; #The uptime tools orchestrate them all.
  3. #The Gateway that is powered by Educaship pfSense to create a gateway between #The Environment and the outside world. There is only one gateway; if it fails, #The Farm fails.

Choice of COTS

While building #The Farm generally and #Cluster components specifically, #The Team utilized only #The COTS that is both open-source and free of charge. Other considerations for the choice are stated in the #COTS for the Environment, #COTS for the Storage, #COTS for the Gateway, #COTS for backups sections of this very wikipage.

Cluster provisioning

Provisioning of #The Cluster occurs in the following sequence:
  1. Since #The Environment is installed on top of #The Infrastructure, the #Environment provisions shall be accommodated first.
  2. Since #The Storage is a part of #The Environment, #Storage provisions shall be accommodated second.
  3. Since #The Gateway is installed in #The Environment, #Gateway provisions shall be accommodated third.

Cluster monitoring

Monitoring features are to be identified. Previously, various candidates offered three options:
  1. Stack -- Prometheus + node-exporter + Grafana
  2. Prometheus to monitor VMs, InfluxDB to monitor PVE nodes, Grafana for dashboards
  3. Grafana + InfluxDB + Telegraf, as well as Zabbix. To monitor websites, use UptimeRobot

Observability vs. APM vs. Monitoring

application performance management

Cluster recovery

High availability (HA)

Generally speaking, high availability (HA) of any system assumes its higher uptime in comparison with a similar system without higher uptime ability. HA of #The Farm assumes its higher uptime in comparison with a similar farm built on one of #The Nodes. Before #The uptime tools were deployed, #The Farm functioned only on one of #The Nodes and, when it failed, services of #The User Apps were no longer available until the failure was fixed. Now, as long as at least one of #The Nodes is operational, #The Farm is operational.
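
The uptime gain described above can be estimated with a short calculation. The sketch below is only an illustration: it assumes that #The Nodes fail independently of each other and that one operational node is enough, as the text states. The 0.99 per-node availability is a hypothetical figure, not a measurement of #The Farm.

```python
def parallel_availability(node_availability: float, nodes: int) -> float:
    """Probability that at least one of `nodes` independent nodes is up."""
    if not 0.0 <= node_availability <= 1.0:
        raise ValueError("availability must be between 0 and 1")
    if nodes < 1:
        raise ValueError("at least one node is required")
    # The farm is down only when every node is down at the same time.
    return 1.0 - (1.0 - node_availability) ** nodes


if __name__ == "__main__":
    a = 0.99  # hypothetical per-node availability, not a measured value
    print(f"one node:    {parallel_availability(a, 1):.6f}")
    print(f"three nodes: {parallel_availability(a, 3):.6f}")
```

The estimate ignores any component that exists in a single copy; such a component would cap the overall figure at its own availability.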

The uptime tools

Both #The Environment and #The Storage feature advanced tools for #High availability (HA).

Uptime limitations

Generally speaking, HA comes with significant costs. So do #The uptime tools. At the very least, running three of #The Nodes is more expensive than running one. The cost cannot exceed the benefit, so high availability (HA) cannot be equal to failure tolerance.

Uptime management

To manage redundant resources, #The uptime tools:
  • Monitor their resources to identify whether they are operational or failed as described in the #Monitoring section of this very wikipage.
  • Fence those resources that are identified as failed. As a result, non-operational resources are withdrawn from the list of available resources.
  • Restore those resources that are fenced. The #Recovery supports that feature by constantly creating snapshots and reserve copies of #The Farm and its parts in order to make them available for restoring when needed.
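
The monitor, fence, and restore cycle above can be sketched in a few lines. This is a simplified illustration, not the API of ProxmoxVE or Ceph: the `ResourcePool` class, its method names, and the node names are hypothetical, and the health check and recovery routine are passed in as plain functions.

```python
from dataclasses import dataclass, field
from typing import Callable, List


@dataclass
class ResourcePool:
    """Hypothetical bookkeeping for redundant resources (not a Proxmox or Ceph API)."""
    available: List[str]
    fenced: List[str] = field(default_factory=list)

    def monitor(self, is_operational: Callable[[str], bool]) -> List[str]:
        """Return the available resources that the health check reports as failed."""
        return [name for name in self.available if not is_operational(name)]

    def fence(self, failed: List[str]) -> None:
        """Withdraw failed resources from the list of available ones."""
        for name in failed:
            if name in self.available:
                self.available.remove(name)
                self.fenced.append(name)

    def restore(self, recover: Callable[[str], bool]) -> None:
        """Try to recover each fenced resource; return recovered ones to service."""
        for name in list(self.fenced):
            if recover(name):
                self.fenced.remove(name)
                self.available.append(name)


if __name__ == "__main__":
    pool = ResourcePool(available=["node1", "node2", "node3"])
    health = {"node1": True, "node2": False, "node3": True}
    pool.fence(pool.monitor(lambda name: health[name]))
    print(pool.available, pool.fenced)
```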

Uptime principles

Principally, #High availability (HA) of #The Cluster is based on:

The Environment

For the purposes of this very wikipage, the Environment refers to the virtual environment (VE) of #The Cluster or, allegorically, to the environment where #The User Apps "live".

COTS for the Environment

As #The COTS for the Environment, #The Team utilizes Educaship Proxmox. For a while, #The Team also tried OpenStack and VirtualBox as its virtualization tools. The trials suggested that, in comparison with ProxmoxVE, OpenStack required more hardware resources and VirtualBox didn't allow for the required sophistication. So, ProxmoxVE has been chosen as #The COTS for #The Farm's virtualization.

Environment features

#The Team uses virtualization to divide hardware resources of #The Node's bare-metal servers into smaller containers and virtual machines (VMs), which are created in the Environment to run #The User Apps, #The Gateway, and other applications. In #The Farm, the ProxmoxVE instance tightly integrates KVM hypervisor, LXC containers, Educaship Ceph as software-defined storage, as well as networking functionality on a single virtualization platform.

Environment functions

#The Environment executes four major functions. It:
  1. Runs #The User Apps, #The Gateway, and other applications. They can be deployed utilizing two models:
    1. Using containers; they already contain operating systems tailored specifically to the needs of the App.
    2. In virtual machines (VM) and without containers. In that model, the App is installed on the operating system of its VM.
  2. Hosts #The Storage and #Backup box
  3. Connects the applications it runs and the storages it hosts to each other and to #The Bridges, while creating networks.
  4. Creates backups and accommodates its own recovery when requested.

Environment provisions

Every instance of ProxmoxVE requires one "physical" bare-metal server. The interaction between ProxmoxVE instances and #The Infrastructure is carried out by Debian operating system (OS) that comes in the same "box" of #The COTS as ProxmoxVE and is specifically configured for that interaction. #The Farm's ProxmoxVE also hosts #The Storage as its storage.

UI of the Environment

With regards to #User interfaces (UIs), #The Environment features its administrative interface, which belongs to #Dashboards.

The Storage

For the purposes of this very wikipage, the Storage refers to the storage platform or database management system (DBMS) that provides #The User Apps with the storage they need to operate. Thus, the Storage supports #The Environment's non-emergency operations and differs from the #Backup box that comes into play in emergencies.

COTS for the Storage

As #The COTS for #The Storage, #The Team utilizes Educaship Ceph. Any ProxmoxVE instance requires some storage to operate.
Before deploying #The uptime tools, #The Team used RAID to make the double hard disks redundant. So, the ProxmoxVE instance was just installed on one disk and replicated to the other disk automatically. ProxmoxVE, however, allows for more flexible usage of hard disks. It can be configured to host, as #The COTS, many storage-type software packages such as ZFS, NFS, GlusterFS, and so on.
Initially, the cluster developer proposed using Ceph. Later, #The Team substituted one node with another that had a larger hard disk, but no SSD and NVMe; as a result, #The Farm's storage collapsed. The substituted node was disconnected (today, it serves as hardware for CNM Lab Farm), a new bare-metal server was purchased (today, it is the #Node 3 hardware), and Ceph was restored.
As #The COTS, ProxmoxVE comes with OpenZFS. #The Team has deployed the combination of both in its CNM Lab Farm.

Storage features

#The Storage features are:
  • File system
  • Distributed
  • Fault tolerant
  • Object storage
  • Block device

Storage functions

To make objects, blocks, and files immediately available for #The User Apps' operations, #The Cluster uses a common distributed cluster foundation that orchestrates storage spaces of #The Nodes.

Storage provisions

Since #The Storage is installed on top of #The Environment, the Storage provisioning entails configuring a ProxmoxVE instance to work with an Educaship Ceph instance.
At #The Farm, Educaship Ceph is deployed at all of #The Nodes. Each of #The Node's servers features doubled hard disks. Physically, a ProxmoxVE instance is installed on one disk of each of #The Nodes; Educaship Ceph uses three "second" disks. So, #The Farm features three instances of ProxmoxVE and one instance of Ceph.
While experimenting with OpenZFS and RAID, #The Team has also tried another model. The second disks then served as reserve copies of the first ones. Since every disk is just 512 GB, that model cut #The Farm's capacity in half since both #The User Apps and their storage needed to fit the 512 GB limitation together.
In the current model, #The User Apps don't have to share their 512 GB with the storage. On the other hand, #The Farm's raw Educaship Ceph capacity is about 3 * 512 GB = 1,536 GB (roughly 1.5 TB).
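
The capacity arithmetic above can be checked with a short sketch. The disk size and node count come from the text. The replica count is an assumption used only to show how storing several copies of the data would reduce the usable space; the text does not state the actual replication setting of #The Storage.

```python
DISK_GB = 512  # size of each "second" disk, from the text
NODES = 3      # number of Nodes, from the text


def raw_capacity_gb(disks: int, disk_gb: int) -> int:
    """Total capacity of all disks before any redundancy is applied."""
    return disks * disk_gb


def usable_capacity_gb(raw_gb: int, replicas: int) -> float:
    """Usable capacity if every piece of data is stored `replicas` times."""
    if replicas < 1:
        raise ValueError("replicas must be at least 1")
    return raw_gb / replicas


if __name__ == "__main__":
    raw = raw_capacity_gb(NODES, DISK_GB)
    print(f"raw capacity: {raw} GB")
    # Hypothetical replica count, not a documented setting of the Farm.
    print(f"usable with 3 replicas: {usable_capacity_gb(raw, 3):.0f} GB")
```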

UI of the Storage

With regards to #User interfaces (UIs), #The Storage features its administrative interface, which belongs to #Dashboards.

The Gateway

For the purposes of this very wikipage, the Gateway refers to the composition of software that is built on the external Bridge. The Gateway is the hub for both #The Farm's wide area network (WAN) and local area network (LAN). To power the Gateway, Educaship pfSense is deployed.

The composition of software such as a load balancer or reverse proxy that is built on the #External Bridge.

COTS for the Gateway

As #The COTS for the Gateway, #The Team utilizes Educaship pfSense. For a while, #The Team has also tried iptables as a firewall and Fail2ban, which operates by monitoring log files (e.g. /var/log/auth.log, /var/log/apache/access.log, etc.) for selected entries and running scripts based on them. Most commonly this is used to block selected IP addresses that may belong to hosts that are trying to breach the system's security. It can ban any host IP address that makes too many login attempts or performs any other unwanted action within a time frame defined by the administrator. Includes support for both IPv4 and IPv6.
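
The banning rule described above (too many failed attempts within a time frame) can be illustrated with a small sketch. This is not Fail2ban's own code or configuration format: the `AttemptTracker` class is hypothetical, and its `max_retry` and `find_time` parameters only mirror the idea of a retry limit and a time window. The IP addresses are taken from ranges reserved for documentation.

```python
from collections import defaultdict, deque
from typing import Deque, Dict, Set


class AttemptTracker:
    """Illustrative ban logic; not Fail2ban's actual code or configuration."""

    def __init__(self, max_retry: int, find_time: float) -> None:
        self.max_retry = max_retry  # failures tolerated inside the window
        self.find_time = find_time  # window length in seconds
        self._failures: Dict[str, Deque[float]] = defaultdict(deque)
        self.banned: Set[str] = set()

    def record_failure(self, ip: str, timestamp: float) -> bool:
        """Register a failed attempt; return True if the address is now banned."""
        if ip in self.banned:
            return True
        window = self._failures[ip]
        window.append(timestamp)
        # Drop failures that fell out of the time window.
        while window and timestamp - window[0] > self.find_time:
            window.popleft()
        if len(window) >= self.max_retry:
            self.banned.add(ip)
        return ip in self.banned


if __name__ == "__main__":
    tracker = AttemptTracker(max_retry=3, find_time=60.0)
    for second in (0.0, 10.0, 20.0):
        print(second, tracker.record_failure("203.0.113.5", second))
```

A real deployment would also lift bans after some time; that part is left out to keep the sketch short.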

Gateway features

Gateway functions

FreeBSD, HA, VPN, LDAP, backups, CARP VIP

#The Gateway can be compared to an executive secretary, who (a) takes external client's requests, (b) serves as a gatekeeper, while checking validity of those requests, (c) when the request is valid, selects to which internal resource to dispatch it, (d) dispatches those requests to the selected resource, (e) gets internal responses, and (f) returns them back to the client in the outside world.
Thus, #The Gateway:
  1. Constantly monitors the state of internal resources of #The Farm.
  2. Receives requests from the world outside of #The Farm.
  3. Checks validity of external requests, while serving as a firewall.
  4. When the request is valid, selects to which of #The Nodes to dispatch it. #The Gateway is responsible for dispatching external requests to those and only to those internal resources that #Cluster monitoring has identified as operational.
  5. Dispatches those requests to #The Node that was selected.
  6. Collects internal responses.
  7. Returns those responses to the outside world.
To be more accessible to its clients, #The Gateway utilizes public IPv4 addresses.
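
The request flow listed above can be sketched as a single function. This is a simplified illustration rather than pfSense's behavior: the `handle_request` function, its parameters, and the node names are hypothetical, and the validity check, the node status table, and the transport are supplied by the caller.

```python
from typing import Callable, Dict, Optional


def handle_request(
    request: str,
    is_valid: Callable[[str], bool],
    node_status: Dict[str, bool],
    send: Callable[[str, str], str],
) -> Optional[str]:
    """Validate a request, pick an operational node, dispatch, return the response."""
    if not is_valid(request):
        # Firewall-style check: invalid requests are dropped.
        return None
    # Only the nodes that monitoring reports as operational are candidates.
    operational = [name for name, up in node_status.items() if up]
    if not operational:
        return None
    node = operational[0]  # simplest possible selection rule
    # Dispatch the request and hand the node's response back to the caller.
    return send(node, request)


if __name__ == "__main__":
    status = {"node1": False, "node2": True, "node3": True}
    reply = handle_request(
        "GET /",
        lambda r: r.startswith("GET"),
        status,
        lambda node, r: f"{node} served {r}",
    )
    print(reply)
```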

Gateway provisions

#The Gateway is deployed in a virtual machine (VM) of #The Environment.

UI of the Gateway

With regards to #User interfaces (UIs), #The Gateway features its administrative interface, which belongs to #Dashboards.

Gateway components

#The Gateway includes #Firewall and router, #Load balancer, and #Web server.

Firewall and router

Educaship pfSense plays the roles of a firewall, a reverse proxy, and the platform to which #Load balancer and #Web server are attached.

Load balancer

As a load balancer, Educaship pfSense uses the select version of HAProxy that is specifically configured as pfSense's add-on. As of summer of 2023, no full HAProxy Manager exists in #The Farm, and a round robin model is activated for load balancing.
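
The round robin model mentioned above hands each new request to the next backend in a fixed order and wraps around at the end of the list. The sketch below shows only that selection rule; the `RoundRobin` class and the backend names are hypothetical and do not reproduce HAProxy's implementation.

```python
from typing import List


class RoundRobin:
    """Cycle through backends in order, wrapping around at the end."""

    def __init__(self, backends: List[str]) -> None:
        if not backends:
            raise ValueError("at least one backend is required")
        self._backends = list(backends)
        self._next = 0

    def pick(self) -> str:
        """Return the next backend and advance the position."""
        backend = self._backends[self._next]
        self._next = (self._next + 1) % len(self._backends)
        return backend


if __name__ == "__main__":
    balancer = RoundRobin(["node1", "node2", "node3"])
    print([balancer.pick() for _ in range(4)])
```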

Web server

As its web server, pfSense utilizes lighttpd. Prior to deployment of Educaship pfSense, #The Team utilized two web servers to communicate with the outside world via HTTP. Nginx handled requests initially and Apache HTTP Server handled those requests that had not been handled by Nginx.

Web architecture

For the purposes of this wikipage, "web architecture" refers to #The Farm's outline of DNS records and IP addresses.

Channels and networks

#The Farm's communication channels are built on #The Metal and #The Bridges. Currently, #The Cluster uses three communication channels, each of which serves one of the networks as follows:
  1. WAN (wide area network), which is #The Farm's public network that uses external, public IPv4 addresses to integrate #The Gateway into the Internet. The public network is described in the #The Gateway section of this wikipage.
  2. LAN (local area network), which is #The Farm's private network that uses internal, private IPv6 addresses to integrate #The Gateway and #The Nodes into one network cluster. This network cluster is described in #The Environment section of this very wikipage.
  3. SAN (storage area network), which is #The Farm's private network that uses internal, private IPv6 addresses to integrate storage spaces of #The Nodes into one storage cluster. This storage cluster is described in #The Storage section of this wikipage.
#The Farm's usage of IP addresses is best described in the #IP addresses section.

DNS zone

To locate #The Farm's public resources in the Internet, the following DNS records are created in #The Farm's DNS zone:
{| class="wikitable"
! Field !! Type !! Data !! Comment (not a part of the records) !! Review
|-
| pm1.bskol.com || AAAA record || 2a01:4f8:10a:439b::2 || Node 1 || No data
|-
| pm2.bskol.com || AAAA record || 2a01:4f8:10a:1791::2 || Node 2 || No data
|-
| pm?.bskol.com || AAAA record || || Node ? || No data
|-
| pf.bskol.com || A record || 88.99.71.85 || Educaship pfSense || Record is not operational
|-
| talk.cnmcyber.com || A record || 188.34.147.106 || CNM Talk (Educaship Jitsi) || Passed
|-
| corp.cnmcyber.com || A record || 188.34.147.106 || CNM Corp (Educaship Odoo) || Passed
|-
| social.cnmcyber.com || A record || 188.34.147.106 || CNMCyber.com (Educaship HumHub) || Passed
|-
| portainer.cnmcyber.com || A record || 188.34.147.107 || Docker server, dockers are used for all monitoring || Passed
|-
| dash-status.cnmcyber.com || A record || 188.34.147.107 || Dashboard for monitoring status powered by Uptime Kuma || Passed
|-
| status.cnmcyber.com || A record || 188.34.147.107 || || Passed
|-
| influxdb.cnmcyber.com || A record || 188.34.147.107 || InfluxDB || Passed
|-
| monitor.cnmcyber.com || A record || 188.34.147.107 || Grafana || Passed
|-
| npm.cnmcyber.com || A record || 188.34.147.107 || Nginx Proxy Manager || Passed
|-
| pass.cnmcyber.com || A record || 188.34.147.107 || Passbolt || Passed
|}

Web server files

Legacy

haproxy.bskol.com 86400 A 0 185.213.25.206
influx.bskol.com 86400 A 0 49.12.5.41
monitor.bskol.com 86400 A 0 49.12.5.41
pbs.bskol.com 86400 A 0 88.99.214.92
pf.bskol.com 86400 A 0 88.99.71.85
pm1.bskol.com 86400 A 0 88.99.218.172
pm2.bskol.com 86400 A 0 88.99.71.85
pm3.bskol.com 86400 A 0 88.99.214.92
zabbix.bskol.com 86400 A 0 167.235.255.244
pbs.bskol.com 86400 AAAA 0 2a01:4f8:10a:3f60::2
pf.bskol.com 86400 AAAA 0 2a01:4f8:fff0:53::6

IP addresses

To locate its resources in the #Communication channels, #The Farm uses three types of IP addresses:
  1. To access #The Environment from the outside world, #The Farm features public IPv6 addresses. One address is assigned to each of #The Nodes. Since there are three of them, three addresses of that type are created.
  2. For an internal network of #The Nodes, which is assembled on the #Internal Bridge, a private IP address is used. This network is not accessible from the Internet and not included in #The Farm's DNS zone. For instance, #The Storage utilizes this network to synchronize its data. For this network, an address range with the "/24" prefix length is selected.
  3. For an external network of three Nodes, which is assembled on the #External Bridge, #The Farm features public IPv4 addresses. They are handled by #Web intermediaries.
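
What a "/24" range provides can be checked with Python's standard `ipaddress` module. The 10.0.0.0/24 range below is a hypothetical example; the text does not state which private range #The Farm actually uses.

```python
import ipaddress

# Hypothetical private range; the Farm's real internal addresses are not given in the text.
internal = ipaddress.ip_network("10.0.0.0/24")

# A /24 prefix leaves 8 bits for hosts.
print("netmask:", internal.netmask)
print("addresses in range:", internal.num_addresses)

# hosts() skips the network and broadcast addresses.
usable_hosts = list(internal.hosts())
print("usable host addresses:", len(usable_hosts))
print("private range:", internal.is_private)
```

Such a range offers far more host addresses than three Nodes need, which leaves room for additional internal machines.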

SSL certificates

Backup box

A backup box is deployed on a 1 TB, unlimited traffic storage box BX-11 that has been rented for that purpose.

COTS for backups

#The Team utilizes no additional software beyond ProxmoxVE for backups. Initially, Proxmox Backup Server was used. However, it consumed the active storage. As a result, the storage box was just attached to #The Environment, and backups now go directly between #The Environment and that storage box.

Box features

10 concurrent connections, 100 sub-accounts, 10 snapshots, 10 automated snapshots, FTP, FTPS, SFTP, SCP, Samba/CIFS, BorgBackup, Restic, Rclone, rsync via SSH, HTTPS, WebDAV, Usable as network drive

Box functions

#The Provider's description: Storage Boxes provide you with safe and convenient online storage for your data. Score a Storage Box from one of Hetzner Online's German or Finnish data centers! With Hetzner Online Storage Boxes, you can access your data on the go wherever you have internet access. Storage Boxes can be used like an additional storage drive that you can conveniently access from your home PC, your smartphone, or your tablet. Hetzner Online Storage Boxes are available with various standard protocols which all support a wide array of apps. We have an assortment of diverse packages, so you can choose the storage capacity that best fits your individual needs. And upgrading or downgrading your choice at any time is hassle-free!

Box provisions

UI for backups

#User interfaces (UIs)

Used terms

On this very wikipage, a few abbreviations and terms are commonly used.

The COTS

Commercial off-the-shelf (COTS) software.

The Farm

CNM Bureau Farm, this very wikipage describes it.

The Node

One hardware, bare-metal server of #The Infrastructure with all of the software installed on top of it.

The Team

CNMCyber Team.

See also

Related lectures

Useful recommendations

Jitsi/Odoo/HumHub, DNS, Educaship pfSense, monitoring -- Telegraf + InfluxDB + Grafana, Uptime Kuma, Passbolt, mail server, LDAP, DNS zone, IPv4, what else to do? HA test, alpha testing, DNS