Upgrade from 7.7.2 to 8.1.0.1

1. Upgrade From 3.7.2 to 8.1.0.1 and 7.7.2 to 8.1.0.1

Non-Kubernetes

RDAF Infra Upgrade: 1.0.3, 1.0.3.3 (haproxy) to 1.0.4

RDAF Platform: From 3.7.2 to 8.1.0.1

OIA (AIOps) Application: From 7.7.2 to 8.1.0.1

RDAF Deployment rdaf CLI: From 1.3.2 to 1.4.1

RDAF Client rdac CLI: From 3.7.2 to 8.1.0.1

Python: From 3.7 to 3.12

1.1. Prerequisites

Before proceeding with this upgrade, please make sure and verify the below prerequisites are met.

Non-Kubernetes

RDAF Deployment CLI version: 1.3.2
Infra Services tag: 1.0.3 / 1.0.3.3 (haproxy)
Platform Services and RDA Worker tag: 3.7.2
OIA Application Services tag: 7.7.2
Each OpenSearch node requires an additional 100 GB of disk space to support both the ingestion of new alert payloads and the migration of alert history data to the pstream.
Python Version: 3.7.4
CloudFabrix recommends taking VMware VM snapshots where RDA Fabric infra/platform/applications are deployed

Important

The Webhook URL is currently configured with port 7443 and should be updated to port 443. Below are the steps to update Webhook URL:
Login to UI → Click on Administration → Organization → click on Configure → click on Alert Endpoints → click on required Endpoint and edit to update the port
Any pipelines or external target sources using port 7443 will also need to be updated to port 443.

Note

Check the Disk space of all the Platform and Service Vm's using the below mentioned command, the highlighted disk size should be less than 80%
```
df -kh
```

Example Output

rdauser@oia-125-216:~/collab-3.7-upgrade$ df -kh
Filesystem                         Size  Used Avail Use% Mounted on
udev                                32G     0   32G   0% /dev
tmpfs                              6.3G  357M  6.0G   6% /run
/dev/mapper/ubuntu--vg-ubuntu--lv   48G   12G   34G  26% /
tmpfs                               32G     0   32G   0% /dev/shm
tmpfs                              5.0M     0  5.0M   0% /run/lock
tmpfs                               32G     0   32G   0% /sys/fs/cgroup
/dev/loop0                          64M   64M     0 100% /snap/core20/2318
/dev/loop2                          92M   92M     0 100% /snap/lxd/24061
/dev/sda2                          1.5G  309M  1.1G  23% /boot
/dev/sdf                            50G  3.8G   47G   8% /var/mysql
/dev/loop3                          39M   39M     0 100% /snap/snapd/21759
/dev/sdg                            50G  541M   50G   2% /minio-data
/dev/loop4                          92M   92M     0 100% /snap/lxd/29619
/dev/loop5                          39M   39M     0 100% /snap/snapd/21465
/dev/sde                            15G  140M   15G   1% /zookeeper
/dev/sdd                            30G  884M   30G   3% /kafka-logs
/dev/sdc                            50G  3.3G   47G   7% /opt
/dev/sdb                            50G   29G   22G  57% /var/lib/docker
/dev/sdi                            25G  294M   25G   2% /graphdb
/dev/sdh                            50G   34G   17G  68% /opensearch
/dev/loop6                          64M   64M     0 100% /snap/core20/2379

Warning

Make sure all of the above pre-requisites are met before proceeding with the upgrade process.

Warning

Non-Kubernetes: Upgrading RDAF Platform and AIOps application services is a disruptive operation. Schedule a maintenance window before upgrading RDAF Platform and AIOps services to newer version.

Important

Please make sure full backup of the RDAF platform system is completed before performing the upgrade.

Non-Kubernetes: Please run the below backup command to take the backup of application data.

rdaf backup --dest-dir <backup-dir>

Note: Please make sure this backup-dir is mounted across all infra,cli vms.

Verify that RDAF deployment rdaf cli version is 1.3.2 on the VM where CLI was installed for docker on-prem registry managing Kubernetes or Non-kubernetes deployments.

rdaf --version

Example

RDAF CLI version: 1.3.2

On-premise docker registry service version is 1.0.3

docker ps | grep docker-registry

Example

ff6b1de8515f   cfxregistry.CloudFabrix.io:443/docker-registry:1.0.3   "/entrypoint.sh /bin…"   7 days ago   Up 7 days             deployment-scripts-docker-registry-1

RDAF Infrastructure services version is 1.0.3 except for below services.
rda-minio: version is RELEASE.2023-09-30T07-02-29Z
haproxy: version is 1.0.3.3

Run the below command to get RDAF Infra service details

rdaf infra status

RDAF Platform services version is 3.7.2

Run the below command to get RDAF Platform services details

rdaf platform status

RDAF OIA Application services version is 7.7.2

Run the below command to get RDAF App services details

rdaf app status

Warning

Before starting the upgrade of the RDAF platform's version to 8.1.0.1 release, please complete the below 2 steps which are mandatory.

Before starting the upgrade of RDAF CLI from version 1.3.2 to 1.4.1. These steps are mandatory and only applicable if the CFX RDAF AIOps (OIA) application services are installed.
Upgrade python version from 3.7 to 3.12

1.1.1 Upgrade Python from 3.7 to 3.12

Please refer Python Upgrade Guide From Python 3.7 to Python 3.12

RDAF Deployment CLI Upgrade:

Please follow the below given steps.

Note

Upgrade RDAF Deployment CLI on both on-premise docker registry VM and RDAF Platform's management VM if provisioned separately.

Login into the VM where rdaf deployment CLI was installed for docker on-premise registry and managing Non-kubernetes deployment.

Non-Kubernetes

With Internet AccessWithout Internet Access

Download the RDAF Deployment CLI's newer version 1.4.1 bundle

wget https://macaw-amer.s3.us-east-1.amazonaws.com/releases/rdaf-platform/1.4.1/rdafcli-1.4.1.tar.gz

Upgrade the rdaf CLI to version 1.4.1

pip install --user rdafcli-1.4.1.tar.gz

Verify the installed rdaf CLI version is upgraded to 1.4.1

rdaf --version

Example Output

RDAF CLI version: 1.4.1

Download the RDAF Deployment CLI's newer version 1.4.1 bundle and copy it to RDAF management VM on which rdaf deployment CLI was installed.

For Ubuntu OS Environment

wget https://macaw-amer.s3.us-east-1.amazonaws.com/releases/rdaf-platform/1.4.1/offline-ubuntu-1.4.1.tar.gz

Extract the rdaf CLI software bundle contents

tar -xvzf offline-ubuntu-1.4.1.tar.gz

Change the directory to the extracted directory

cd offline-ubuntu-1.4.1

Upgrade the rdafCLI to version 1.4.1

pip install --user rdafcli-1.4.1.tar.gz -f ./ --no-index

Verify the installed rdaf CLI version

rdaf --version

Example Output

RDAF CLI version: 1.4.1

1.2. Upgrade Steps

1.2.1 Migration of On-Prem Registry

Non-Kubernetes

Please download the below python script (rdaf_upgrade_132_141.py)

wget https://macaw-amer.s3.us-east-1.amazonaws.com/releases/rdaf-platform/1.4.1/rdaf_upgrade_132_141.py

The below step will generate values.yaml.latest files for all RDAF Infrastructure, Platform and Application services in the /opt/rdaf/deployment-scripts directory.

Please run the downloaded python upgrade script rdaf_upgrade_132_141.py as shown below

python rdaf_upgrade_132_141.py -h

Note

The above command will show the available options for the upgrade script

usage: rdaf_upgrade_132_141.py [-h] {registry_upgrade,upgrade,migrate_dataset} ...

options:
    -h, --help            show this help message and exit

options:
    {registry_upgrade,upgrade,migrate_dataset}
                            Available options
        registry_upgrade    Upgrade on prem registry if any
        upgrade             upgrade the existing setup
        migrate_dataset     Migrate dataset using script

If there is already an on-premises registry, upgrade it with the command below.

python rdaf_upgrade_132_141.py registry_upgrade

Example Output

Creating backup rdaf-registry.cfg
Login Succeeded

WARNING! Using --password via the CLI is insecure. Use --password-stdin.
WARNING! Your password will be stored unencrypted in /home/rdauser/.docker/config.json.
Configure a credential helper to remove this warning. See
https://docs.docker.com/engine/reference/commandline/login/#credentials-store


Updating docker-compose binary...

1.2.2 Upgrade On-Prem Registry

Please update the registry by using the below command

rdaf registry upgrade --tag 1.0.4

1.2.3 Download the new Docker Images

Download the new docker image tags for RDAF Platform and OIA (AIOps) Application services and wait until all of the images are downloaded.

Non-Kubernetes

To fetch registry please use the below command

rdaf registry fetch --tag 1.0.4,8.1.0.1
rdaf registry fetch --minio-tag RELEASE.2024-12-18T13-15-44Z

Note

If the Download of the images fail, Please re-execute the above command

Run the below command to verify above mentioned tags are downloaded for all of the RDAF Platform and OIA (AIOps) Application services.

rdaf registry list-tags

Please make sure 1.0.4 Image tag is downloaded for the Infra service

nats - 1.0.4
minio with tag RELEASE.2024-12-18T13-15-44Z
mariadb - 1.0.4
opensearch - 1.0.4
kafka - 1.0.4
graphdb - 1.0.4
haproxy - 1.0.4

Please make sure 8.1.0.1 image tag is downloaded for the below RDAF Platform services.

rda-client-api-server
rda-registry
rda-scheduler
rda-collector
rda-identity
rda-fsm
rda-asm
rda-access-manager
rda-resource-manager
rda-user-preferences
onprem-portal
onprem-portal-nginx
rda-worker-all
onprem-portal-dbinit
cfxdx-nb-nginx-all
rda-event-gateway
rda-chat-helper
rdac
bulk_stats

Please make sure 8.1.0.1 image tag is downloaded for the below RDAF OIA (AIOps) Application services.

cfx-rda-app-controller
cfx-rda-alert-processor
cfx-rda-file-browser
cfx-rda-smtp-server
cfx-rda-ingestion-tracker
cfx-rda-reports-registry
cfx-rda-ml-config
cfx-rda-event-consumer
cfx-rda-webhook-server
cfx-rda-irm-service
cfx-rda-alert-ingester
cfx-rda-collaboration
cfx-rda-notification-service
cfx-rda-configuration-service
cfx-rda-alert-processor-companion

Downloaded Docker images are stored under the below path.

/opt/rdaf-registry/data/docker/registry/v2/ or /opt/rdaf/data/docker/registry/v2/

Run the below command to check the filesystem's disk usage on offline registry VM where docker images are pulled.

df -h /opt

If necessary, older image tags that are no longer in use can be deleted to free up disk space using the command below.

Note

Run the command below if /opt occupies more than 80% of the disk space or if the free capacity of /opt is less than 25GB.

rdaf registry delete-images --tag <tag1,tag2>

Important

Before setting up graphDB make sure graphDB isn't already installed, if its already installed please Ignore the below steps

If the /opt/rdaf/rdaf.cfg file contains only GraphDB configuration entries, and the GraphDB service is either not running or the GraphDB mount point is not mounted, then remove the GraphDB entries from /opt/rdaf/rdaf.cfg.

Warning

For GraphDB installation, an additional disk must be provisioned on the RDA Fabric Infrastructure VMs. Click Here to perform this action

It is a pre-requisite and this step need to be completed before installing the GraphDB service.

Please run the below python script to setup GraphDB

Note

1.Please take a backup of /opt/rdaf/deployment-scripts/values.yaml and /opt/rdaf/config/network_config/config.json

wget https://macaw-amer.s3.amazonaws.com/releases/rdaf-platform/1.2.2/rdaf_upgrade_120_121_to_122.py

python rdaf_upgrade_120_121_to_122.py graphdb_setup

It will ask for the IPs to set the GraphDB configs

If it is a cluster setup please provide all 3 infra IPs with comma separated. If it is a stand-alone setup please provide the IP of Infra VM.

Once provided the IP Address it will ask for the username and password, please enter the username and password and make a note of them for future usage.

Install the GraphDB service using below command

rdaf infra install --tag 1.0.4 --service graphdb

Non-Kubernetes

Please run the downloaded python upgrade script rdaf_upgrade_132_141.py as shown below

python rdaf_upgrade_132_141.py upgrade

Example Output

rdauser@aiaperf-sv10851:~$ python  rdaf_upgrade_132_141.py upgrade
Updating docker compose binary..
updating docker binary on 192.168.108.51
updating docker binary on 192.168.108.54
updating docker binary on 192.168.108.53
updating docker binary on 192.168.108.50
updating docker binary on 192.168.108.52
updating docker binary on 192.168.108.58
updating docker binary on 192.168.108.56
cleaning up expiring certificates...
Cleanup complete!
cleaning up expiring certificates...
Cleanup complete!
Updating policy json configuration.
Creating backup policy.json
Encrypting policy user credentials.
Updating the policy.json in platform and service hosts.
Copying policy.json to hosts: 192.168.108.52
Updating the opensearch policy user permissions...
{"status":"OK","message":"'role-38fb12901221480083eaf050d44c839b-dataplane-policy' updated."}
{"status":"OK","message":"'role-38fb12901221480083eaf050d44c839b' updated."}
Creating backup of existing haproxy.cfg on host 192.168.108.50
Updating haproxy configs on host 192.168.108.50..
Creating backup of existing haproxy.cfg on host 192.168.108.56
Updating haproxy configs on host 192.168.108.56..
Copied /opt/rdaf/deployment-scripts/worker.yaml to /opt/rdaf/deployment-scripts/192.168.108.53
Copied /opt/rdaf/deployment-scripts/worker.yaml to /opt/rdaf/deployment-scripts/192.168.108.54
Copied /opt/rdaf/deployment-scripts/platform.yaml to /opt/rdaf/deployment-scripts/192.168.108.51
Copied /opt/rdaf/deployment-scripts/platform.yaml to /opt/rdaf/deployment-scripts/192.168.108.52
Copying /opt/rdaf/rdaf.cfg  to host 192.168.108.54
Creating directory /opt/rdaf/config/runtime and setting ownership to user 1000 and group to group 1000 on host 192.168.108.54
Copying /opt/rdaf/rdaf.cfg  to host 192.168.108.53
Creating directory /opt/rdaf/config/runtime and setting ownership to user 1000 and group to group 1000 on host 192.168.108.53
Copying /opt/rdaf/rdaf.cfg  to host 192.168.108.50
Creating directory /opt/rdaf/config/runtime and setting ownership to user 1000 and group to group 1000 on host 192.168.108.50
Copying /opt/rdaf/rdaf.cfg  to host 192.168.108.52
Creating directory /opt/rdaf/config/runtime and setting ownership to user 1000 and group to group 1000 on host 192.168.108.52
Copying /opt/rdaf/rdaf.cfg  to host 192.168.108.58
Creating directory /opt/rdaf/config/runtime and setting ownership to user 1000 and group to group 1000 on host 192.168.108.58
Copying /opt/rdaf/rdaf.cfg  to host 192.168.108.56
Creating directory /opt/rdaf/config/runtime and setting ownership to user 1000 and group to group 1000 on host 192.168.108.56
backing up existing values.yaml..
Removing rda_asset_dependency and AIA entries from the values.yaml file
Going to remove platform-rda_asset_dependency-1

Container platform-rda_asset_dependency-1  Stopping
Container platform-rda_asset_dependency-1  Stopped
Container platform-rda_asset_dependency-1  Removing
Container platform-rda_asset_dependency-1  Removed

Removing rda_asset_dependency entries from the platform_yaml
[+] Stopping 1/1
✔ Container platform-rda_asset_dependency-1  Stopped                     10.4s
Going to remove platform-rda_asset_dependency-1
[+] Removing 1/1
✔ Container platform-rda_asset_dependency-1  Removed                      0.2s

Removing rda_asset_dependency entries from the platform_yaml
backing up existing nats.conf on host 192.168.108.50
JetStream section removed successfully.
backing up existing nats.conf on host 192.168.108.56
JetStream section removed successfully.

The upgrade script makes the following changes:

OpenSearch Certificate Cleanup

Cleans up expired OpenSearch certificates.

Connects to all VMs via SSH to perform the cleanup.

Policy File Update

Copies policy.json to /opt/rdaf/config/policy.json on platform and service hosts.

Takes backup of the existing policy.json.

Updates policy user credentials within the file.

IP Address Directory Creation

Creates a directory for each platform and worker host at /opt/rdaf/deployment-scripts/192.168.xx.xx.

Moves corresponding YAML files into their respective IP address directories.

Runtime Folder Creation

Creates an empty runtime folder at /opt/rdaf/config.

AIA Dependency Removal

Removes AIA dependency configuration from values.yaml.

Asset Dependency Service Removal

Removes the asset-dependency service entry from platform.yaml.

NATS JetStream Removal

Removes the JetStream configuration section from /opt/rdaf/config/nats.conf.

HAProxy Configuration Update

Creates a backup of the existing haproxy.cfg file.

Updates /opt/rdaf/config/haproxy/haproxy.cfg with the following configuration under backend webhook:

backend webhook
mode http
balance roundrobin
stick-table type ip size 10k expire 10m
stick on src
option httpchk GET /healthcheck
http-check expect rstatus (2|3)[0-9][0-9]
http-check disable-on-404
http-response set-header Cache-Control no-store
http-response set-header Pragma no-cache
default-server inter 10s downinter 5s fall 3 rise 2
cookie SERVERID insert indirect nocache maxidle 30m maxlife 24h httponly secure
server rdaf-webhook-1 192.168.108.51:8888 check cookie rdaf-webhook-1
server rdaf-webhook-2 192.168.108.52:8888 check cookie rdaf-webhook-2

In the HAProxy Config file check for the frontend portal section, it should have the below highlighted line.

frontend portal
    bind *:80
    bind *:443 ssl crt /opt/certificates/haproxy.pem
    acl WEBHOOK_PATH path_beg -i /webhooks/
    rate-limit sessions 250
    use_backend webhook if WEBHOOK_PATH
    timeout client 30s
    mode http
    http-request set-header X-Forwarded-Port %[dst_port]
    http-request set-header X-Forwarded-Proto https if { ssl_fc }
    redirect scheme https unless { ssl_fc }
    default_backend portal

Portal Backend Update in values.yaml

File path: /opt/rdaf/deployment-scripts/values.yaml

Updates the portal-backend environment variables section to be dynamically injected via CLI instead of hardcoded:

portal-backend:
mem_limit: 4G
memswap_limit: 4G
environment:
CFX_URL_PREFIX: ''
DATABASE_SQLALCHEMY_POOL_SIZE: 10
DATABASE_SQLALCHEMY_MAX_OVERFLOW: 10
deployment: true
cap_add:
- SYS_PTRACE
privileged: true

FSM Environment Updates in values.yaml

File path: /opt/rdaf/deployment-scripts/values.yaml

Under rda_fsm service, the value PURGE_STALE_INSTANCES_DAYS is updated from 120 to 90.

Adds a new environment variable FSM_INSTANCE_CACHE_SIZE with value 2000

rda_fsm:
mem_limit: 4G
memswap_limit: 4G
privileged: true
cap_add:
- SYS_PTRACE
environment:
RDA_ENABLE_TRACES: 'yes'
DISABLE_REMOTE_LOGGING_CONTROL: 'no'
RDA_SELF_HEALTH_RESTART_AFTER_FAILURES: 3
PURGE_COMPLETED_INSTANCES_DAYS: 1
PURGE_STALE_INSTANCES_DAYS: 90
FSM_INSTANCE_CACHE_SIZE: 2000
KAFKA_CONSUMER_BATCH_MAX_SIZE: 100
KAFKA_CONSUMER_BATCH_MAX_TIME_SECONDS: 1

Add Hosts Section for All Platform Services

Add hosts section to all the below given platform services path /opt/rdaf/deployment-scripts/values.yaml

rda_api_server
rda_registry
rda_scheduler
rda_collector
rda_identity
rda_asm
rda_fsm
rda_chat_helper
cfx-rda-access-manager
cfx-rda-resource-manager
cfx-rda-user-preferences
portal-backend
portal-frontend

Below is the Example for API Server

rda_api_server:
    mem_limit: 4G
    memswap_limit: 4G
    privileged: true
    cap_add:
    - SYS_PTRACE
    environment:
      RDA_STUDIO_URL: '""'
      RDA_ENABLE_TRACES: 'no'
      DISABLE_REMOTE_LOGGING_CONTROL: 'no'
      RDA_SELF_HEALTH_RESTART_AFTER_FAILURES: 3
    deployment: true
    hosts:
    - 192.168.108.51
    - 192.168.108.52

Important

Ensure hosts IP address should be matched with platform services running host IP addresses

Add geodr_api_server Service in /opt/rdaf/deployment-scripts/values.yaml

rda_geodr_api_server:
    mem_limit: 2G
    memswap_limit: 2G
    privileged: true
    cap_add:
    - SYS_PTRACE
    environment:
      RDA_ENABLE_TRACES: 'no'
      DISABLE_REMOTE_LOGGING_CONTROL: 'no'
      RDA_SELF_HEALTH_RESTART_AFTER_FAILURES: 3
    deployment: false
    hosts:
    - 192.168.108.51
    - 192.168.108.52

Increase Memory for Collaboration Service in /opt/rdaf/deployment-scripts/values.yaml

cfx-rda-collaboration:
    mem_limit: 6G
    memswap_limit: 6G
    privileged: true
    cap_add:
    - SYS_PTRACE
    environment:
      DISABLE_REMOTE_LOGGING_CONTROL: 'no'
      RDA_ENABLE_TRACES: 'no'
      RDA_SELF_HEALTH_RESTART_AFTER_FAILURES: 3
    deployment: true
    hosts:
    - 192.168.108.51
    - 192.168.108.52

1.2.4 Upgrade RDAF Infra Services

Upgrade Infra service using below command. Although the tag provided is the same as the existing one,

rdaf infra upgrade --tag 1.0.4

Please use the below mentioned command to see infra service is up and in Running state

rdaf infra status

Example Output

+------------------------+------------------+---------------+--------------+------------------------------+
| Name                   | Host             | Status        | Container Id | Tag                          |
+------------------------+------------------+---------------+--------------+------------------------------+
| nats                   | 192.168.108.56   | Up 56 minutes | 41f35b3e8a03 | 1.0.4                        |
| minio                  | 192.168.108.50   | Up 56 minutes | f12a7f8f6f85 | RELEASE.2024-12-18T13-15-44Z |
| minio                  | 192.168.108.56   | Up 56 minutes | 43ae0b473698 | RELEASE.2024-12-18T13-15-44Z |
| minio                  | 192.168.108.58   | Up 56 minutes | 48829343c2f6 | RELEASE.2024-12-18T13-15-44Z |
| minio                  | 192.168.108.51   | Up 56 minutes | 2424ed057dee | RELEASE.2024-12-18T13-15-44Z |
| mariadb                | 192.168.108.50   | Up 55 minutes | c435c7f38ba3 | 1.0.4                        |
| mariadb                | 192.168.108.56   | Up 55 minutes | b7f7416c5e3f | 1.0.4                        |
| mariadb                | 192.168.108.58   | Up 55 minutes | dc78e416f180 | 1.0.4                        |
| opensearch             | 192.168.108.50   | Up 55 minutes | 85a0df23e3f7 | 1.0.4                        |
| opensearch             | 192.168.108.56   | Up 55 minutes | 6f76f281aca8 | 1.0.4                        |
| opensearch             | 192.168.108.58   | Up 55 minutes | b2f36099113e | 1.0.4                        |
| kafka                  | 192.168.108.50   | Up 54 minutes | 4fcdb0d6c942 | 1.0.4                        |
| kafka                  | 192.168.108.56   | Up 54 minutes | 6810698a8b30 | 1.0.4                        |
| kafka                  | 192.168.108.58   | Up 54 minutes | 21f1c70953f0 | 1.0.4                        |
| graphdb[operator]      | 192.168.108.50   | Up 54 minutes | bb0686761330 | 1.0.4                        |
| graphdb[agent]         | 192.168.108.50   | Up 54 minutes | 8ace86d77247 | 1.0.4                        |
| graphdb[server]        | 192.168.108.56   | Up 54 minutes | bb9754e230f0 | 1.0.4                        |
| graphdb[coordinator]   | 192.168.108.50   | Up 54 minutes | 11217b9360ea | 1.0.4                        |
| graphdb[operator]      | 192.168.108.56   | Up 54 minutes | 828b36784ff3 | 1.0.4                        |
| graphdb[agent]         | 192.168.108.56   | Up 54 minutes | 546a17d4fede | 1.0.4                        |
+------------------------+------------------+---------------+--------------+------------------------------+

Run the below RDAF command to check infra healthcheck status

rdaf infra healthcheck

Example Output

+------------+-----------------+--------+--------+----------------+--------------+
| Name       | Check           | Status | Reason | Host           | Container Id |
+------------+-----------------+--------+--------+----------------+--------------+
| nats       | Port Connection | OK     | N/A    | 192.168.108.50 | 178176a0cc79 |
| nats       | Service Status  | OK     | N/A    | 192.168.108.50 | 178176a0cc79 |
| nats       | Firewall Port   | OK     | N/A    | 192.168.108.50 | 178176a0cc79 |
| nats       | Port Connection | OK     | N/A    | 192.168.108.56 | 41f35b3e8a03 |
| nats       | Service Status  | OK     | N/A    | 192.168.108.56 | 41f35b3e8a03 |
| nats       | Firewall Port   | OK     | N/A    | 192.168.108.56 | 41f35b3e8a03 |
| minio      | Port Connection | OK     | N/A    | 192.168.108.50 | f12a7f8f6f85 |
| minio      | Service Status  | OK     | N/A    | 192.168.108.50 | f12a7f8f6f85 |
| minio      | Firewall Port   | OK     | N/A    | 192.168.108.50 | f12a7f8f6f85 |
| minio      | Port Connection | OK     | N/A    | 192.168.108.56 | 43ae0b473698 |
| minio      | Service Status  | OK     | N/A    | 192.168.108.56 | 43ae0b473698 |
| minio      | Firewall Port   | OK     | N/A    | 192.168.108.56 | 43ae0b473698 |
| minio      | Port Connection | OK     | N/A    | 192.168.108.58 | 48829343c2f6 |
| minio      | Service Status  | OK     | N/A    | 192.168.108.58 | 48829343c2f6 |
| minio      | Firewall Port   | OK     | N/A    | 192.168.108.58 | 48829343c2f6 |
| minio      | Port Connection | OK     | N/A    | 192.168.108.51 | 2424ed057dee |
| minio      | Service Status  | OK     | N/A    | 192.168.108.51 | 2424ed057dee |
| minio      | Firewall Port   | OK     | N/A    | 192.168.108.51 | 2424ed057dee |
| mariadb    | Port Connection | OK     | N/A    | 192.168.108.50 | c435c7f38ba3 |
| mariadb    | Service Status  | OK     | N/A    | 192.168.108.50 | c435c7f38ba3 |
| mariadb    | Firewall Port   | OK     | N/A    | 192.168.108.50 | c435c7f38ba3 |
| mariadb    | Port Connection | OK     | N/A    | 192.168.108.56 | bf7f416c5e3f |
| mariadb    | Service Status  | OK     | N/A    | 192.168.108.56 | bf7f416c5e3f |
| mariadb    | Firewall Port   | OK     | N/A    | 192.168.108.56 | bf7f416c5e3f |
| mariadb    | Port Connection | OK     | N/A    | 192.168.108.58 | dc78e416f180 |
| mariadb    | Service Status  | OK     | N/A    | 192.168.108.58 | dc78e416f180 |
| mariadb    | Firewall Port   | OK     | N/A    | 192.168.108.58 | dc78e416f180 |
| opensearch | Port Connection | OK     | N/A    | 192.168.108.50 | 85a0df23e3f7 |
| opensearch | Service Status  | OK     | N/A    | 192.168.108.50 | 85a0df23e3f7 |
| opensearch | Firewall Port   | OK     | N/A    | 192.168.108.50 | 85a0df23e3f7 |
| opensearch | Port Connection | OK     | N/A    | 192.168.108.56 | 6f76f281aca8 |
| opensearch | Service Status  | OK     | N/A    | 192.168.108.56 | 6f76f281aca8 |
| opensearch | Firewall Port   | OK     | N/A    | 192.168.108.56 | 6f76f281aca8 |
| opensearch | Port Connection | OK     | N/A    | 192.168.108.58 | b2f36099113e |
| opensearch | Service Status  | OK     | N/A    | 192.168.108.58 | b2f36099113e |
| opensearch | Firewall Port   | OK     | N/A    | 192.168.108.58 | b2f36099113e |
+------------+-----------------+--------+--------+----------------+--------------+

Note

If external opensearch is already installed, upgrade it using the following command

rdaf opensearch_external upgrade --tag 1.0.4

To check the status of External opensearch

rdaf opensearch_external status

Example Output

+---------------------+----------------+------------+--------------+-------+
| Name                | Host           | Status     | Container Id | Tag   |
+---------------------+----------------+------------+--------------+-------+
| opensearch_external | 192.168.108.52 | Up 3 hours | 7952e3b28c6e | 1.0.4 |
| opensearch_external | 192.168.108.53 | Up 3 hours | 1b3f3ec30ecb | 1.0.4 |
| opensearch_external | 192.168.108.54 | Up 3 hours | 8604c5b4b012 | 1.0.4 |
+---------------------+----------------+------------+--------------+-------+

Note

Internet access from the API server is required for users to download packs directly from the public GitHub repository at fabrix.ai.

If users do not have internet access from the API server container and prefer not to configure a proxy on the API server, they can manually download the packs from the public GitHub repository to their local desktop and then use the Upload option instead of "Upload from Catalog."

In case of proxy environment please add below proxy settings to api-server in /opt/rdaf/deployment-scripts/values.yaml

Need to add the IPs in no_proxy, NO_PROXY which are part of rdaf product.

rda_api_server:
  mem_limit: 4G
  memswap_limit: 4G
  privileged: true
  environment:
    RDA_STUDIO_URL: '""'
    RDA_ENABLE_TRACES: 'no'
    DISABLE_REMOTE_LOGGING_CONTROL: 'no'
    RDA_SELF_HEALTH_RESTART_AFTER_FAILURES: 3
    no_proxy: localhost,127.0.0.1,192.168.133.60,192.168.133.61,192.168.133.62,192.168.133.63,192.168.133.64,192.168.133.65,192.168.133.66
    NO_PROXY: localhost,127.0.0.1,192.168.133.60,192.168.133.61,192.168.133.62,192.168.133.63,192.168.133.64,192.168.133.65,192.168.133.66
    http_proxy:  "http://test:[email protected]:3128"
    https_proxy: "http://test:[email protected]:3128"
    HTTP_PROXY:  "http://test:[email protected]:3128"
    HTTPS_PROXY: "http://test:[email protected]:3128"
  deployment: true
  cap_add:
  - SYS_PTRACE

Note

Make sure Virtual IP

Note

For the document on SAML Configuration Update ("strict": false) for Portals Behind a URL Prefix with SAML SSO, Please Click Here

1.2.5 Upgrade RDAF Platform Services

Non-Kubernetes

Warning

For Non-Kubernetes deployment, upgrading RDAF Platform and AIOps application services is a disruptive operation when rolling-upgrade option is not used. Please schedule a maintenance window before upgrading RDAF Platform and AIOps services to newer version.

Run the below command to initiate upgrading RDAF Platform services with zero downtime

rdaf platform upgrade --tag 8.1.0.1 --rolling-upgrade --timeout 10

Note

timeout <10> mentioned in the above command represents as Seconds

Note

The rolling-upgrade option upgrades the Platform services running in high-availability mode on one VM at a time in sequence. It completes the upgrade of Platform services running on VM-1 before upgrading them on VM-2, followed by VM-3, and so on.

During this upgrade sequence, RDAF platform continues to function without any impact to the application traffic.

After completing the Platform services upgrade on all VMs, it will ask for user confirmation to delete the older version Platform service PODs. The user has to provide YES to delete the old docker containers (in non-k8s)

Example

192.168.133.95:5000/onprem-portal-nginx:3.7.2
2024-08-12 02:21:58,875 [rdaf.component.platform] INFO     - Gathering platform container details.
2024-08-12 02:22:01,326 [rdaf.component.platform] INFO     - Gathering rdac pod details.
+----------+----------------------+---------+---------+--------------+-------------+------------+
| Pod ID   | Pod Type             | Version | Age     | Hostname     | Maintenance | Pod Status |
+----------+----------------------+---------+---------+--------------+-------------+------------+
| 3a5ff878 | api-server           | 3.7.2   | 2:34:09 | 5119921f9c1c | None        | True       |
| 689c2574 | registry             | 3.7.2   | 3:23:10 | d21676c0465b | None        | True       |
| 0d03f649 | scheduler            | 3.7.2   | 2:34:46 | dd699a1d15af | None        | True       |
| 0496910a | collector            | 3.7.2   | 3:22:40 | 1c367e3bf00a | None        | True       |
| c4a88eb7 | asset-dependency     | 3.7.2   | 3:22:25 | cdb3f4c76deb | None        | True       |
| 9562960a | authenticator        | 3.7.2   | 3:22:09 | 8bda6c86a264 | None        | True       |
| ae8b58e5 | asm                  | 3.7.2   | 3:21:54 | 8f0f7f773907 | None        | True       |
| 1cea350e | fsm                  | 3.7.2   | 3:21:37 | 1ea1f5794abb | None        | True       |
| 32fa2f93 | chat-helper          | 3.7.2   | 3:21:23 | 811cbcfba7a2 | None        | True       |
| 0e6f375c | cfxdimensions-app-   | 3.7.2   | 3:21:07 | 307c140f99c2 | None        | True       |
|          | access-manager       |         |         |              |             |            |
| 4130b2d4 | cfxdimensions-app-   | 3.7.2   | 2:24:23 | 2d73c36426fe | None        | True       |
|          | resource-manager     |         |         |              |             |            |
| 29caf947 | user-preferences     | 3.7.2   | 3:20:36 | 3e2b5b7e6cb4 | None        | True       |
+----------+----------------------+---------+---------+--------------+-------------+------------+
Continue moving above pods to maintenance mode? [yes/no]: yes
2024-08-12 02:23:04,389 [rdaf.component.platform] INFO     - Initiating Maintenance Mode...
2024-08-12 02:23:10,048 [rdaf.component.platform] INFO     - Following container are in maintenance mode
+----------+----------------------+---------+---------+--------------+-------------+------------+
| Pod ID   | Pod Type             | Version | Age     | Hostname     | Maintenance | Pod Status |
+----------+----------------------+---------+---------+--------------+-------------+------------+
| 3a5ff878 | api-server           | 3.7.2   | 2:34:49 | 5119921f9c1c | maintenance | False      |
| ae8b58e5 | asm                  | 3.7.2   | 3:22:34 | 8f0f7f773907 | maintenance | False      |
| c4a88eb7 | asset-dependency     | 3.7.2   | 3:23:05 | cdb3f4c76deb | maintenance | False      |
| 9562960a | authenticator        | 3.7.2   | 3:22:49 | 8bda6c86a264 | maintenance | False      |
| 0e6f375c | cfxdimensions-app-   | 3.7.2   | 3:21:47 | 307c140f99c2 | maintenance | False      |
|          | access-manager       |         |         |              |             |            |
| 4130b2d4 | cfxdimensions-app-   | 3.7.2   | 2:25:03 | 2d73c36426fe | maintenance | False      |
|          | resource-manager     |         |         |              |             |            |
| 32fa2f93 | chat-helper          | 3.7.2   | 3:22:03 | 811cbcfba7a2 | maintenance | False      |
| 0496910a | collector            | 3.7.2   | 3:23:20 | 1c367e3bf00a | maintenance | False      |
| 1cea350e | fsm                  | 3.7.2   | 3:22:17 | 1ea1f5794abb | maintenance | False      |
| 689c2574 | registry             | 3.7.2   | 3:23:50 | d21676c0465b | maintenance | False      |
| 0d03f649 | scheduler            | 3.7.2   | 2:35:26 | dd699a1d15af | maintenance | False      |
| 29caf947 | user-preferences     | 3.7.2   | 3:21:16 | 3e2b5b7e6cb4 | maintenance | False      |
+----------+----------------------+---------+---------+--------------+-------------+------------+
2024-08-12 02:23:10,052 [rdaf.component.platform] INFO     - Waiting for timeout of 5 seconds...
2024-08-12 02:23:15,060 [rdaf.component.platform] INFO     - Upgrading service: rda_api_server on host 192.168.133.92

Run the below command to initiate upgrading RDAF Platform services without zero downtime

rdaf platform upgrade --tag 8.1.0.1

Please wait till all of the new platform services are in Up state and run the below command to verify their status and make sure all of them are running with 8.1.0.1 version.

rdaf platform status

Example Output

+--------------------------+----------------+-------------------------------+--------------+---------+
| Name                     | Host           | Status                        | Container Id | Tag     |
+--------------------------+----------------+-------------------------------+--------------+---------+
| rda_api_server           | 192.168.108.51 | Up 4 hours                    | dc2dd806e6a6 | 8.1.0.1 |
| rda_api_server           | 192.168.108.52 | Up 4 hours                    | a76257df0330 | 8.1.0.1 |
| rda_registry             | 192.168.108.51 | Up 4 hours                    | f23455c6b85b | 8.1.0.1 |
| rda_registry             | 192.168.108.52 | Up 4 hours                    | 3b8deb15ad1f | 8.1.0.1 |
| rda_scheduler            | 192.168.108.51 | Up 4 hours                    | 1864f7e88bfb | 8.1.0.1 |
| rda_scheduler            | 192.168.108.52 | Up 4 hours                    | 62089081e902 | 8.1.0.1 |
| rda_collector            | 192.168.108.51 | Up 4 hours                    | 50c81f436fd9 | 8.1.0.1 |
| rda_collector            | 192.168.108.52 | Up 4 hours                    | 754db49f2804 | 8.1.0.1 |
| rda_identity             | 192.168.108.51 | Up 4 hours                    | 37625fde83e8 | 8.1.0.1 |
| rda_identity             | 192.168.108.52 | Up 4 hours                    | bb60423a47fa | 8.1.0.1 |
| rda_asm                  | 192.168.108.51 | Up 4 hours                    | 5ae15e7d661e | 8.1.0.1 |
| rda_asm                  | 192.168.108.52 | Up 4 hours                    | 80181bb0f80e | 8.1.0.1 |
| rda_fsm                  | 192.168.108.51 | Up 4 hours                    | bfaf7206eacb | 8.1.0.1 |
| rda_fsm                  | 192.168.108.52 | Up 4 hours                    | 8c470b9d7b08 | 8.1.0.1 |
+--------------------------+----------------+-------------------------------+--------------+---------+

Run the below command to check the rda-scheduler service is elected as a leader under Site column.

rdac pods

Run the below command to check if all services has ok status and does not throw any failure messages.

rdac healthcheck

Example Output

+-----------+----------------------------------------+--------------+----------+-------------+-----------------------------------------------------+----------+-------------------------------------------------------------+
| Cat       | Pod-Type                               | Host         | ID       | Site        | Health Parameter                                    | Status   | Message                                                     |
|-----------+----------------------------------------+--------------+----------+-------------+-----------------------------------------------------+----------+-------------------------------------------------------------|
| rda_app   | alert-ingester                         | 7f75047e9e44 | daa8c414 |             | service-status                                      | ok       |                                                             |
| rda_app   | alert-ingester                         | 7f75047e9e44 | daa8c414 |             | minio-connectivity                                  | ok       |                                                             |
| rda_app   | alert-ingester                         | 7f75047e9e44 | daa8c414 |             | service-dependency:configuration-service            | ok       | 2 pod(s) found for configuration-service                    |
| rda_app   | alert-ingester                         | 7f75047e9e44 | daa8c414 |             | service-initialization-status                       | ok       |                                                             |
| rda_app   | alert-ingester                         | 7f75047e9e44 | daa8c414 |             | kafka-connectivity                                  | ok       | Cluster=NTc1NWU1MTQxYmY3MTFlZg, Broker=1, Brokers=[1, 2, 3] |
| rda_app   | alert-ingester                         | f9ec55862be0 | f9b9231c |             | service-status                                      | ok       |                                                             |
| rda_app   | alert-ingester                         | f9ec55862be0 | f9b9231c |             | minio-connectivity                                  | ok       |                                                             |
| rda_app   | alert-ingester                         | f9ec55862be0 | f9b9231c |             | service-dependency:configuration-service            | ok       | 2 pod(s) found for configuration-service                    |
| rda_app   | alert-ingester                         | f9ec55862be0 | f9b9231c |             | service-initialization-status                       | ok       |                                                             |
| rda_app   | alert-ingester                         | f9ec55862be0 | f9b9231c |             | kafka-connectivity                                  | ok       | Cluster=NTc1NWU1MTQxYmY3MTFlZg, Broker=3, Brokers=[1, 2, 3] |
| rda_app   | alert-processor                        | c6cc7b04ab33 | b4ebfb06 |             | service-status                                      | ok       |                                                             |
| rda_app   | alert-processor                        | c6cc7b04ab33 | b4ebfb06 |             | minio-connectivity                                  | ok       |                                                             |
+-----------+----------------------------------------+--------------+----------+-------------+-----------------------------------------------------+----------+-------------------------------------------------------------+

1.2.6 Upgrade `rdac` CLI

Non-Kubernetes

Run the below command to upgrade the rdac CLI

rdaf rdac_cli upgrade --tag 8.1.0.1

1.2.7 Upgrade RDA Worker Services

Non-Kubernetes

Note

If the worker was deployed in a HTTP proxy environment, please make sure the required HTTP proxy environment variables are added in /opt/rdaf/deployment-scripts/values.yaml file under rda_worker configuration section as shown below before upgrading RDA Worker services.

Example

rda_worker:
  mem_limit: 8G
  memswap_limit: 8G
  privileged: false
  environment:
    RDA_ENABLE_TRACES: 'no'
    RDA_SELF_HEALTH_RESTART_AFTER_FAILURES: 3
    http_proxy:  "http://test:[email protected]:3128"
    https_proxy: "http://test:[email protected]:3128"
    HTTP_PROXY:  "http://test:[email protected]:3128"
    HTTPS_PROXY: "http://test:[email protected]:3128"

Upgrade RDA Worker Services

Please run the below command to initiate upgrading the RDA Worker Service with zero downtime

rdaf worker upgrade --tag 8.1.0.1 --rolling-upgrade --timeout 10

Note

timeout <10> mentioned in the above command represents as seconds

Note

The rolling-upgrade option upgrades the Worker services running in high-availability mode on one VM at a time in sequence. It completes the upgrade of Worker services running on VM-1 before upgrading them on VM-2, followed by VM-3, and so on.

After completing the Worker services upgrade on all VMs, it will ask for user confirmation, the user has to provide YES to delete the older version Worker service PODs.

Example

2024-08-12 02:56:11,573 [rdaf.component.worker] INFO     - Collecting worker details for rolling upgrade
2024-08-12 02:56:14,301 [rdaf.component.worker] INFO     - Rolling upgrade worker on 192.168.133.96
+----------+----------+---------------+---------+--------------+-------------+------------+
| Pod ID   | Pod Type | Version       | Age     | Hostname     | Maintenance | Pod Status |
+----------+----------+---------------+---------+--------------+-------------+------------+
| c8a37db9 | worker   | 8.1.0.1       |  3:32:31 | fffe44b43708 | None       | True       |
+----------+----------+---------------+------------------------+---+---------+------------+
Continue moving above pod to maintenance mode? [yes/no]: yes
2024-08-12 02:57:17,346 [rdaf.component.worker] INFO     - Initiating maintenance mode for pod c8a37db9
2024-08-12 02:57:22,401 [rdaf.component.worker] INFO     - Waiting for worker to be moved to maintenance.
2024-08-12 02:57:35,001 [rdaf.component.worker] INFO     - Following worker container is in maintenance mode
+----------+----------+-------------+---------+--------------+-------------+------------+
| Pod ID   | Pod Type | Version     | Age     | Hostname     | Maintenance | Pod Status |
+----------+----------+-------------+---------+--------------+-------------+------------+
| c8a37db9 | worker   | 8.1.0.1     | 3:33:52 | fffe44b43708 | maintenance | False      |
+----------+----------+-------------+---------+--------------+-------------+------------+
2024-08-12 02:57:35,002 [rdaf.component.worker] INFO     - Waiting for timeout of 3 seconds.

Please run the below command to initiate upgrading the RDA Worker Service without zero downtime

rdaf worker upgrade --tag 8.1.0.1

Please wait for 120 seconds to let the newer version of RDA Worker service containers join the RDA Fabric appropriately. Run the below commands to verify the status of the newer RDA Worker service containers.

rdac pods | grep worker

| Infra | worker      | True        | 6eff605e72c4 | a318f394 | rda-site-01 | 13:45:13 |      4 |        31.21 | 0             | 0            |
| Infra | worker      | True        | ae7244d0d10a | 554c2cd8 | rda-site-01 | 13:40:40 |      4 |        31.21 | 0             | 0            |

rdaf worker status

Example Output

+------------+----------------+------------+--------------+---------+
| Name       | Host           | Status     | Container Id | Tag     |
+------------+----------------+------------+--------------+---------+
| rda_worker | 192.168.108.53 | Up 4 hours | ea187f89505f | 8.1.0.1 |
| rda_worker | 192.168.108.54 | Up 4 hours | a62b3230bbaa | 8.1.0.1 |
+------------+----------------+------------+--------------+---------+

Run the below command to check if all RDA Worker services has ok status and does not throw any failure messages.

rdac healthcheck

+-----------+----------------------------------------+--------------+----------+-------------+-----------------------------------------------------+----------+-----------------------------------------------------------------------------------------------------------------------------+
| Cat       | Pod-Type                               | Host         | ID       | Site        | Health Parameter                                    | Status   | Message                                                                                                                     |
|-----------+----------------------------------------+--------------+----------+-------------+-----------------------------------------------------+----------+-----------------------------------------------------------------------------------------------------------------------------|
| rda_infra | api-server                             | 1b0542719618 | 1845ae67 |             | service-status                                      | ok       |                                                                                                                             |
| rda_infra | api-server                             | 1b0542719618 | 1845ae67 |             | minio-connectivity                                  | ok       |                                                                                                                             |
| rda_infra | api-server                             | d4404cffdc7a | a4cfdc6d |             | service-status                                      | ok       |                                                                                                                             |
| rda_infra | api-server                             | d4404cffdc7a | a4cfdc6d |             | minio-connectivity                                  | ok       |                                                                                                                             |
| rda_infra | asm                                    | 8d3d52a7a475 | 418c9dc1 |             | service-status                                      | ok       |                                                                                                                             |
| rda_infra | asm                                    | 8d3d52a7a475 | 418c9dc1 |             | minio-connectivity                                  | ok       |                                                                                                                             |
| rda_infra | asm                                    | ab172a9b8229 | 2ac1d67a |             | service-status                                      | ok       |                                                                                                                             |
| rda_infra | asm                                    | ab172a9b8229 | 2ac1d67a |             | minio-connectivity                                  | ok       |                                                                                                                             |
| rda_app   | asset-dependency                       | 6ac69ca1085c | c2e9dcb9 |             | service-status                                      | ok       |                                                                                                                             |
| rda_app   | asset-dependency                       | 6ac69ca1085c | c2e9dcb9 |             | minio-connectivity                                  | ok       |                                                                                                                             |
| rda_app   | asset-dependency                       | 58a5f4f460d3 | 0b91caac |             | service-status                                      | ok       |                                                                                                                             |
| rda_app   | asset-dependency                       | 58a5f4f460d3 | 0b91caac |             | minio-connectivity                                  | ok       |                                                                                                                             |
| rda_app   | authenticator                          | 9011c2aef498 | 9f7efdc3 |             | service-status                                      | ok       |                                                                                                                             |
| rda_app   | authenticator                          | 9011c2aef498 | 9f7efdc3 |             | minio-connectivity                                  | ok       |                                                                                                                             |
| rda_app   | authenticator                          | 9011c2aef498 | 9f7efdc3 |             | DB-connectivity                                     | ok       |                                                                                                                             |
| rda_app   | authenticator                          | 148621ed8c82 | dbf16b82 |             | service-status                                      | ok       |                                                                                                                             |
| rda_app   | authenticator                          | 148621ed8c82 | dbf16b82 |             | minio-connectivity                                  | ok       |                                                                                                                             |
| rda_app   | authenticator                          | 148621ed8c82 | dbf16b82 |             | DB-connectivity                                     | ok       |                                                                                                                             |
| rda_app   | cfx-app-controller                     | 75ec0f30cfa3 | 1198fdee |             | service-status                                      | ok       |                                                                                                                             |
| rda_app   | cfx-app-controller                     | 75ec0f30cfa3 | 1198fdee |             | minio-connectivity                                  | ok       |                                                                                                                             |
| rda_app   | cfx-app-controller                     | 75ec0f30cfa3 | 1198fdee |             | service-initialization-status                       | ok       |                                                                                                                             |
| rda_app   | cfx-app-controller                     | 75ec0f30cfa3 | 1198fdee |             | DB-connectivity                                     | ok       |                          
+-----------+----------------------------------------+--------------+----------+-------------+-----------------------------------------------------+----------+-----------------------------------------------------------------------------------------------------------------------------+

1.2.8 Update Environment Variables in values.yaml

Non-Kubernetes

Step 1: Alert Ingester- Add Environment Variables

Before upgrading the Alert Ingester service, ensure the following environment variables are added under the cfx-rda-alert-ingester section in the values.yaml file. file path /opt/rdaf/deployment-scripts/values.yaml
Environment Variables to Add

INBOUND_PARTITION_WORKERS_MAX
OUTBOUND_TOPIC_WORKERS_MAX

Example Configuration

cfx-rda-alert-ingester:
   mem_limit: 6G
   memswap_limit: 6G
   privileged: true
environment:
  DISABLE_REMOTE_LOGGING_CONTROL: 'no'
  RDA_ENABLE_TRACES: 'yes'
  RDA_SELF_HEALTH_RESTART_AFTER_FAILURES: 3
  INBOUND_PARTITION_WORKERS_MAX:  1
  OUTBOUND_TOPIC_WORKERS_MAX:  1
hosts:
- 192.168.109.53
- 192.168.109.54
cap_add:
- SYS_PTRACE

Step 2: Event Consumer- Add Environment Variable

Before upgrading the event consumer service, add the following environment variable under the event_consumer section of values.yaml.
Environment Variable to Add

OUTBOUND_WORKERS_MAX

Example Configuration

cfx-rda-event-consumer:
   mem_limit: 6G
   memswap_limit: 6G
   privileged: true
environment:
  DISABLE_REMOTE_LOGGING_CONTROL: 'no'
  RDA_ENABLE_TRACES: 'yes'
  RDA_SELF_HEALTH_RESTART_AFTER_FAILURES: 3
  OUTBOUND_WORKERS_MAX: 3
hosts:
- 192.168.109.53
- 192.168.109.54
cap_add:
- SYS_PTRACE

Note

For above Environment Variables Configuration needs to be updated as per the Production Deployments, Please refer this Document

1.2.9 Upgrade OIA Application Services

Non-Kubernetes

Run the below commands to initiate upgrading the RDA Fabric OIA Application services with zero downtime

rdaf app upgrade OIA --tag 8.1.0.1 --rolling-upgrade --timeout 10

Note

timeout <10> mentioned in the above command represents as Seconds

Note

The rolling-upgrade option upgrades the OIA application services running in high-availability mode on one VM at a time in sequence. It completes the upgrade of OIA application services running on VM-1 before upgrading them on VM-2, followed by VM-3, and so on.

After completing the OIA application services upgrade on all VMs, it will ask for user confirmation to delete the older version OIA application service PODs.

Example

2024-08-12 03:18:08,705 [rdaf.component.oia] INFO     - Gathering OIA app container details.
2024-08-12 03:18:10,719 [rdaf.component.oia] INFO     - Gathering rdac pod details.
+----------+----------------------+---------+---------+--------------+-------------+------------+
| Pod ID   | Pod Type             | Version | Age     | Hostname     | Maintenance | Pod Status |
+----------+----------------------+---------+---------+--------------+-------------+------------+
| 2992fe69 | cfx-app-controller   | 7.7.2   | 3:44:53 | 0500f773a8ff | None        | True       |
| 336138c8 | reports-registry     | 7.7.2   | 3:44:12 | 92a5e0daa942 | None        | True       |
| ccc5f3ce | cfxdimensions-app-   | 7.7.2   | 3:43:34 | 99192de47ea4 | None        | True       |
|          | notification-service |         |         |              |             |            |
| 03614007 | cfxdimensions-app-   | 7.7.2   | 3:42:54 | fbdf4e5c16c3 | None        | True       |
|          | file-browser         |         |         |              |             |            |
| a4949804 | configuration-       | 7.7.2   | 3:42:15 | 4ea08c8cbf2e | None        | True       |
|          | service              |         |         |              |             |            |
| 8f37c520 | alert-ingester       | 7.7.2   | 3:41:35 | e9e3a3e69cac | None        | True       |
| 249b7104 | webhook-server       | 7.7.2.1 | 3:12:04 | 1df43cebc888 | None        | True       |
| 76c64336 | smtp-server          | 7.7.2.1 | 3:08:57 | 03725b0cb91f | None        | True       |
| ad85cb4c | event-consumer       | 7.7.2.1 | 3:09:58 | 8a7d349da513 | None        | True       |
| 1a788ef3 | alert-processor      | 7.7.2.1 | 3:11:01 | a7c5294cba3d | None        | True       |
| 970b90b1 | cfxdimensions-app-   | 7.7.2   | 3:38:14 | 01d4245bb90e | None        | True       |
|          | irm_service          |         |         |              |             |            |
| 153aa6ac | ml-config            | 7.7.2   | 3:37:33 | 10d5d6766354 | None        | True       |
| 5aa927a4 | cfxdimensions-app-   | 7.7.2   | 3:36:53 | dcfda7175cb5 | None        | True       |
|          | collaboration        |         |         |              |             |            |
| 6833aa86 | ingestion-tracker    | 7.7.2   | 3:36:13 | ef0e78252e48 | None        | True       |
| afe77cb9 | alert-processor-     | 7.7.2   | 3:35:33 | 6f03c7fdba51 | None        | True       |
|          | companion            |         |         |              |             |            |
+----------+----------------------+---------+---------+--------------+-------------+------------+
Continue moving above pods to maintenance mode? [yes/no]: yes
2024-08-12 03:18:27,159 [rdaf.component.oia] INFO     - Initiating Maintenance Mode...
2024-08-12 03:18:32,978 [rdaf.component.oia] INFO     - Waiting for services to be moved to maintenance.
2024-08-12 03:18:55,771 [rdaf.component.oia] INFO     - Following container are in maintenance mode
+----------+----------------------+---------+---------+--------------+-------------+------------+

Run the below command to initiate upgrading the RDA Fabric OIA Application services without zero downtime

rdaf app upgrade OIA --tag 8.1.0.1

Please wait till all of the new OIA application service containers are in Up state and run the below command to verify their status and make sure they are running with 8.1.0.1 version.

rdaf app status

Example Output

+-----------------------------------+----------------+------------+--------------+---------+
| Name                              | Host           | Status     | Container Id | Tag     |
+-----------------------------------+----------------+------------+--------------+---------+
| cfx-rda-app-controller            | 192.168.108.51 | Up 3 hours | 2f5970c9ba3f | 8.1.0.1 |
| cfx-rda-app-controller            | 192.168.108.52 | Up 3 hours | 831cb384c807 | 8.1.0.1 |
| cfx-rda-reports-registry          | 192.168.108.51 | Up 4 hours | ae6dfcf1fb88 | 8.1.0.1 |
| cfx-rda-reports-registry          | 192.168.108.52 | Up 4 hours | 3387e3ac2e8b | 8.1.0.1 |
| cfx-rda-notification-service      | 192.168.108.51 | Up 4 hours | 757acc39018c | 8.1.0.1 |
| cfx-rda-notification-service      | 192.168.108.52 | Up 4 hours | a14d8ea906f7 | 8.1.0.1 |
| cfx-rda-file-browser              | 192.168.108.51 | Up 4 hours | 1e83162f75ce | 8.1.0.1 |
| cfx-rda-file-browser              | 192.168.108.52 | Up 4 hours | bff3cca26363 | 8.1.0.1 |
| cfx-rda-configuration-service     | 192.168.108.51 | Up 4 hours | 3c6598ce38e2 | 8.1.0.1 |
| cfx-rda-configuration-service     | 192.168.108.52 | Up 4 hours | de793664be3a | 8.1.0.1 |
| cfx-rda-alert-ingester            | 192.168.108.51 | Up 4 hours | 6df94614f4c2 | 8.1.0.1 |
| cfx-rda-alert-ingester            | 192.168.108.52 | Up 4 hours | b1de17f5c587 | 8.1.0.1 |
| cfx-rda-webhook-server            | 192.168.108.51 | Up 4 hours | 6de31a1f5101 | 8.1.0.1 |
| cfx-rda-webhook-server            | 192.168.108.52 | Up 4 hours | e70a6570d922 | 8.1.0.1 |
| cfx-rda-smtp-server               | 192.168.108.51 | Up 4 hours | efcebbe2a1ee | 8.1.0.1 |
| cfx-rda-smtp-server               | 192.168.108.52 | Up 4 hours | 93b36a17f7f3 | 8.1.0.1 |
+-----------------------------------+----------------+------------+--------------+---------+

Run the below command to verify all OIA application services are up and running.

rdac pods

Example Output

+-------+----------------------------------------+-------------+----------------+----------+-------------+----------+--------+--------------+---------------+--------------+
| Cat   | Pod-Type                               | Pod-Ready   | Host           | ID       | Site        | Age      |   CPUs |   Memory(GB) | Active Jobs   | Total Jobs   |
|-------+----------------------------------------+-------------+----------------+----------+-------------+----------+--------+--------------+---------------+--------------|
| App   | alert-ingester                         | True        | rda-alert-inge | 6a6e464d |             | 19:22:36 |      8 |        31.33 |               |              |
| App   | alert-ingester                         | True        | rda-alert-inge | 7f6b42a0 |             | 19:22:53 |      8 |        31.33 |               |              |
| App   | alert-processor                        | True        | rda-alert-proc | a880e491 |             | 19:23:21 |      8 |        31.33 |               |              |
| App   | alert-processor                        | True        | rda-alert-proc | b684609e |             | 19:23:18 |      8 |        31.33 |               |              |
| App   | alert-processor-companion              | True        | rda-alert-proc | 874f3b33 |             | 19:22:24 |      8 |        31.33 |               |              |
| App   | alert-processor-companion              | True        | rda-alert-proc | 70cadaa7 |             | 19:22:05 |      8 |        31.33 |               |              |
| App   | asset-dependency                       | True        | rda-asset-depe | bde06c15 |             | 19:47:50 |      8 |        31.33 |               |              |
| App   | asset-dependency                       | True        | rda-asset-depe | 47b9eb02 |             | 19:47:38 |      8 |        31.33 |               |              |
| App   | authenticator                          | True        | rda-identity-d | faa33e1b |             | 19:47:52 |      8 |        31.33 |               |              |
| App   | authenticator                          | True        | rda-identity-d | 36083c36 |             | 19:47:46 |      8 |        31.33 |               |              |
| App   | cfx-app-controller                     | True        | rda-app-contro | 5fd3c3f4 |             | 19:23:09 |      8 |        31.33 |               |              |
| App   | cfx-app-controller                     | True        | rda-app-contro | d66e5ce8 |             | 19:22:56 |      8 |        31.33 |               |              |
| App   | cfxdimensions-app-access-manager       | True        | rda-access-man | ecbb535c |             | 19:47:46 |      8 |        31.33 |               |              |
| App   | cfxdimensions-app-access-manager       | True        | rda-access-man | 9a05db5a |             | 19:47:36 |      8 |        31.33 |               |              |
| App   | cfxdimensions-app-collaboration        | True        | rda-collaborat | 61b3c53b |             | 19:22:18 |      8 |        31.33 |               |              |
| App   | cfxdimensions-app-collaboration        | True        | rda-collaborat | 09b9474e |             | 19:21:57 |      8 |        31.33 |               |              |
| App   | cfxdimensions-app-file-browser         | True        | rda-file-brows | 00495640 |             | 19:22:45 |      8 |        31.33 |               |              |
| App   | cfxdimensions-app-file-browser         | True        | rda-file-brows | 640f0653 |             | 19:22:29 |      8 |        31.33 |               |              |
| App   | cfxdimensions-app-irm_service          | True        | rda-irm-servic | 27e345c5 |             | 19:21:43 |      8 |        31.33 |               |              |
| App   | cfxdimensions-app-irm_service          | True        | rda-irm-servic | 23c7e082 |             | 19:21:56 |      8 |        31.33 |               |              |
| App   | cfxdimensions-app-notification-service | True        | rda-notificati | bbb5b08b |             | 19:23:20 |      8 |        31.33 |               |              |
| App   | cfxdimensions-app-notification-service | True        | rda-notificati | 9841bcb5 |             | 19:23:02 |      8 |        31.33 |               |              |
+-------+----------------------------------------+-------------+----------------+----------+-------------+----------+--------+--------------+---------------+--------------+

Run the below command to check if all services has ok status and does not throw any failure messages.

rdac healthcheck

Example Output

+-----------+----------------------------------------+--------------+----------+-------------+-----------------------------------------------------+----------+-------------------------------------------------------------+
| Cat       | Pod-Type                               | Host         | ID       | Site        | Health Parameter                                    | Status   | Message                                                     |
|-----------+----------------------------------------+--------------+----------+-------------+-----------------------------------------------------+----------+-------------------------------------------------------------|
| rda_app   | alert-ingester                         | 7f75047e9e44 | daa8c414 |             | service-status                                      | ok       |                                                             |
| rda_app   | alert-ingester                         | 7f75047e9e44 | daa8c414 |             | minio-connectivity                                  | ok       |                                                             |
| rda_app   | alert-ingester                         | 7f75047e9e44 | daa8c414 |             | service-dependency:configuration-service            | ok       | 2 pod(s) found for configuration-service                    |
| rda_app   | alert-ingester                         | 7f75047e9e44 | daa8c414 |             | service-initialization-status                       | ok       |                                                             |
| rda_app   | alert-ingester                         | 7f75047e9e44 | daa8c414 |             | kafka-connectivity                                  | ok       | Cluster=NTc1NWU1MTQxYmY3MTFlZg, Broker=1, Brokers=[1, 2, 3] |
| rda_app   | alert-ingester                         | f9ec55862be0 | f9b9231c |             | service-status                                      | ok       |                                                             |
| rda_app   | alert-ingester                         | f9ec55862be0 | f9b9231c |             | minio-connectivity                                  | ok       |                                                             |
| rda_app   | alert-ingester                         | f9ec55862be0 | f9b9231c |             | service-dependency:configuration-service            | ok       | 2 pod(s) found for configuration-service                    |
| rda_app   | alert-ingester                         | f9ec55862be0 | f9b9231c |             | service-initialization-status                       | ok       |                                                             |
| rda_app   | alert-ingester                         | f9ec55862be0 | f9b9231c |             | kafka-connectivity                                  | ok       | Cluster=NTc1NWU1MTQxYmY3MTFlZg, Broker=2, Brokers=[1, 2, 3] |
| rda_app   | alert-processor                        | c6cc7b04ab33 | b4ebfb06 |             | service-status                                      | ok       |                                                             |
| rda_app   | alert-processor                        | c6cc7b04ab33 | b4ebfb06 |             | minio-connectivity                                  | ok       |                                                             |
+-----------+----------------------------------------+--------------+----------+-------------+-----------------------------------------------------+----------+-------------------------------------------------------------+

1.2.9.1 Upgrade Migrate Datasets

Non-Kubernetes

Please run the Python Upgrade Script to Migrate Datasets

python rdaf_upgrade_132_141.py migrate_dataset

Note

After executing successfully, the script will display the count of successful and failed dataset objects, as shown below. If any dataset objects fail, please re-run the script to ensure all dataset objects are migrated.

Example Output

rdauser@kubofflinereg10812:~$ python rdaf_upgrade_132_141.py migrate_dataset
2025-02-13 14:49:51,668 [PID=829:TID=MainThread:cfx.__main__:add_prefix_to_unhashed_objects:197] INFO - Found 7 unhashed dataset objects to update.
2025-02-13 14:49:51,748 [PID=829:TID=ThreadPoolExecutor-0_0:cfx.__main__:_process_single_dataset:280] INFO - Successfully updated dataset: cfxdm-saved-data/vcenter_159_151-esxi-host-inventory-meta.yml
2025-02-13 14:49:51,800 [PID=829:TID=ThreadPoolExecutor-0_0:cfx.__main__:_process_single_dataset:280] INFO - Successfully updated dataset: cfxdm-saved-data/vcenter_159_151-host-network-inventory-meta.yml
2025-02-13 14:49:51,867 [PID=829:TID=ThreadPoolExecutor-0_0:cfx.__main__:_process_single_dataset:280] INFO - Successfully updated dataset: cfxdm-saved-data/vcenter_159_151-storage-inventory-meta.yml
2025-02-13 14:49:51,927 [PID=829:TID=ThreadPoolExecutor-0_0:cfx.__main__:_process_single_dataset:280] INFO - Successfully updated dataset: cfxdm-saved-data/vcenter_159_151-vcenter-datastore-inventory-meta.yml
2025-02-13 14:49:51,979 [PID=829:TID=ThreadPoolExecutor-0_0:cfx.__main__:_process_single_dataset:280] INFO - Successfully updated dataset: cfxdm-saved-data/vcenter_159_151-vcenter-summary-inventory-meta.yml
2025-02-13 14:49:52,246 [PID=829:TID=ThreadPoolExecutor-0_0:cfx.__main__:_process_single_dataset:280] INFO - Successfully updated dataset: cfxdm-saved-data/vcenter_159_151-vm-inventory-meta.yml
2025-02-13 14:49:52,304 [PID=829:TID=ThreadPoolExecutor-0_0:cfx.__main__:_process_single_dataset:280] INFO - Successfully updated dataset: cfxdm-saved-data/vcenter_159_151-vswitch-inventory-meta.yml
2025-02-13 14:49:52,305 [PID=829:TID=MainThread:cfx.__main__:add_prefix_to_unhashed_objects:222] INFO - Total dataset objects processed: 1501, Success: 1501, Failed: 0
2025-02-13 14:49:52,305 [PID=829:TID=MainThread:cfx.__main__:add_prefix_to_unhashed_objects:226] INFO - Total time taken: 507.64 seconds
Dataset migration script execution completed.

1.2.10 Upgrade Event Gateway Services

Important

This Upgrade is for Non-K8s only

Step 1. Prerequisites

Event Gateway with 3.7.2 tag should be already installed

Note

If a user deployed the event gateway using the RDAF CLI, follow Step 2 and skip Step 3 or if the user did not deploy event gateway in RDAF CLI go to Step 3

Step 2. Upgrade Event Gateway Using RDAF CLI

To upgrade the event gateway, log in to the rdaf cli VM and execute the following command.
```
rdaf event_gateway upgrade --tag 8.1.0.1
```

Step 3. Upgrade Event Gateway Using Docker Compose File

Login to the Event Gateway installed VM
Navigate to the location where Event Gateway was previously installed, using the following command
```
cd /opt/rdaf/event_gateway
```

Edit the docker-compose file for the Event Gateway using a local editor (e.g. vi) update the tag and save it

vi event-gateway-docker-compose.yml

Example

version: '3.1'
services:
rda_event_gateway:
image: docker1.cloudfabrix.io:443/external/ubuntu-rda-event-gateway:8.1.0.1
restart: always
network_mode: host
mem_limit: 6G
memswap_limit: 6G
volumes:
- /opt/rdaf/network_config:/network_config
- /opt/rdaf/event_gateway/config:/event_gw_config
- /opt/rdaf/event_gateway/certs:/certs
- /opt/rdaf/event_gateway/logs:/logs
- /opt/rdaf/event_gateway/log_archive:/tmp/log_archive
logging:
    driver: "json-file"
    options:
    max-size: "25m"
    max-file: "5"
environment:
    RDA_NETWORK_CONFIG: /network_config/rda_network_config.json
    EVENT_GW_MAIN_CONFIG: /event_gw_config/main/main.yml
    EVENT_GW_SNMP_TRAP_CONFIG: /event_gw_config/snmptrap/trap_template.json
    EVENT_GW_SNMP_TRAP_ALERT_CONFIG: /event_gw_config/snmptrap/trap_to_alert_go.yaml
    AGENT_GROUP: event_gateway_site01
    EVENT_GATEWAY_CONFIG_DIR: /event_gw_config
    LOGGER_CONFIG_FILE: /event_gw_config/main/logging.yml

Please run the following commands

docker-compose -f event-gateway-docker-compose.yml down
docker-compose -f event-gateway-docker-compose.yml pull
docker-compose -f event-gateway-docker-compose.yml up -d

Use the command as shown below to ensure that the RDA docker instances are up and running.
```
docker ps -a | grep event
```
Use the below mentioned command to check docker logs for any errors
```
docker logs -f  -tail 200 <event gateway containerid>
```

Tip

In version 3.6.1 or above, the RDA Event Gateway agent introduces enhanced Syslog TCP/UDP endpoints, developed in Go (lang), to boost event processing rates significantly and optimize system resource utilization.

New Syslog TCP Endpoint Type: syslog_tcp_go
New Syslog UDP Endpoint Type: syslog_udp_go

1.2.11 Install RDAF Bulkstats Services

Non- Kubernetes

Note

The RDAF Bulkstats service is optional and only necessary if the Bulkstats data ingestion feature is required. Otherwise, you may ignore the steps below and go to next section.

Run the below command to install bulk_stats services

rdaf bulk_stats install --tag 8.1.0.1 --host <ip> --ssh-password <pwd>

A comma can be used to identify two hosts for HA Setups.

Example

rdaf bulk_stats install --tag 8.1.0.1 --host 192.168.108.17,192.168.108.18

Note

When deploying bulk stats on New VM, make sure the username and password matches with the existing VM's

Run the below command to get the bulk_stats status

rdaf bulk_stats status

Example Output

+----------------+----------------+------------+--------------+---------+
| Name           | Host           | Status     | Container Id | Tag     |
+----------------+----------------+------------+--------------+---------+
| rda_bulk_stats | 192.168.108.51 | Up 4 hours | 2b92d66234c8 | 8.1.0.1 |
| rda_bulk_stats | 192.168.108.52 | Up 4 hours | 9bd8564aaa52 | 8.1.0.1 |
+----------------+----------------+------------+--------------+---------+

1.2.11.1 Install RDAF File Object Services

Note

This service is applicable for Non-K8s only, The RDAF File Object service is optional and only necessary if the Bulkstats data ingestion feature is required. Otherwise, you may ignore the steps below and go to next section

Stop file object service using docker-compose file

docker-compose -f <file_object.yaml> down

Remove rda_file_object entries in rdaf.cfg file if bulkstats already deployed with older versions

Run the below command to install File Object services and provision service instances across multiple hosts, ensuring that all VMs use the same username and password.

rdaf file_object install --host <ip> --host <ip> --ssh-password <pwd> --tag 8.1.0.1

Log in to each file object node and update the permissions for the /opt/public folder.

ssh rdauser@<ip> sudo chown rdauser:rdauser /opt/public

Example Command

ssh [email protected] sudo chown rdauser:rdauser /opt/public

Run the below command to get the file_object status

rdaf file_object status

Example Output

+-----------------+----------------+---------------+--------------+---------+
| Name            | Host           | Status        | Container Id | Tag     |
+-----------------+----------------+---------------+--------------+---------+
| rda_file_object | 192.168.108.51 | Up 54 seconds | d1733d6d8995 | 8.1.0.1 |
| rda_file_object | 192.168.108.52 | Up 52 seconds | 1e1342ceac1c | 8.1.0.1 |
+-----------------+----------------+---------------+--------------+---------+

1.2.12 Nginx Load Balancer for Event gateway

Note

Update the nginx configuration file to enable log rotation for the Event Gateway only when Load Balancer is deployed

Add the below mentioned configuration file with the specified content and restart the Nginx container

vi /etc/logrotate.d/nginx

Paste the below content and save it

/opt/rdaf/logs/nginx/*.log
{
daily
rotate 1
compress
missingok
copytruncate
notifempty
}

1.3. Post Upgrade Steps

Non-Kubernetes

Step 1: Manually restart both instances of "app-controller" service using docker command

docker restart <app-controller container id>

Step 2: Update Cleared Alerts Data Retention in Database Property to 8760 hours (1 year) retention. Path to update Cleared Alerts Data Retention property Main Menu --> Administration --> Configurations --> Click on row level action for Cleared Alerts Data Retention in Database.

Step 3: Purge Resolved/Closed incidents data from IRM/AP/Collab DB and pstreams. This change is needed so that incidents/Alerts/collab can be maintained consistently across the system.

Currently, the purging mechanism relies on the retention_days setting defined per pstream. As a result, related data (e.g., alerts or collab messages) may be retained for different durations, leading to inconsistencies in how incident-related information is managed throughout the system.

Note

Need this change so that the collector will not purge data from pstreams.

Go to Main Menu --> Configuration --> RDA Administration --> Persistent Streams --> Persistent Streams Update below pstream definitions to remove retention_days and retention_purge_extra_filter attributes if a pstream has defined these properties.

Old Config New Config

"default_values": {},
"retention_days": 30,
"case_insensitive": true

"default_values": {},
"case_insensitive": true

a) oia-incidents-stream

b) oia-incident-inserts-stream

c) oia-incidents-delta-stream

d) oia-incidents-external-tickets-stream

e) oia-incidents-collaboration-stream

f) oia-collab-messagesharing-stream

g) oia-alerts-stream

h) oia-alerts-payload

Note

Check the RDA Packs Report Go to (Main Menu --> Configuration --> RDA Administration --> Packs) Please skip the Step 4 if no Packs found.

Step 4: In 8.0.0, system bundles were converted to packs at the start of the api server and uploaded so that they can be available in Packs page by default.

In 8.1.0.1, not all system bundles are needed and the relevant ones have been converted to packs that can be uploaded on demand. As a result of this change the environments that is upgraded from 8.0.0 to 8.1.0.1 need to run the script to remove the "system bundles" that were added to packs page.

Upgrade script that deletes the "bundle packs" from Packs page if they are not activated. Steps:

Download the Script delete_bundle_packs_8.0_to_8.1_upgrade.py

wget https://macaw-amer.s3.us-east-1.amazonaws.com/releases/rdaf-platform/1.4.1/delete_bundle_packs_8.0_to_8.1_upgrade.py

copy the downloaded file inside the api server container.

docker cp delete_bundle_packs_8.0_to_8.1_upgrade.py <api server-container-id>:/tmp

Run the script as follows

Run in test mode to see what would be deleted

python delete_bundle_packs_8.0_to_8.1_upgrade.py --test

Example Output

(cfx_venv) root@2602fff46f91:/tmp# python delete_bundle_packs_8.0_to_8.1_upgrade.py --test
/cfx_venv/lib/python3.12/site-packages/google/protobuf/runtime_version.py:98: UserWarning: Protobuf gencode version 5.29.0 is exactly one major version older than the runtime version 6.30.2 at nats.proto. Please update the gencode to avoid compatibility violations in the next runtime release.
warnings.warn(
/cfx_venv/lib/python3.12/site-packages/cfxql/__init__.py:8: UserWarning: pkg_resources is deprecated as an API. See https://setuptools.pypa.io/en/latest/pkg_resources.html. The pkg_resources package is slated for removal as early as 2025-11-30. Refrain from using this package or pin to Setuptools<81.
import pkg_resources
Running in TEST MODE - no actual deletions will be performed
Starting 8.0 to 8.1 upgrade pack cleanup process...
Processing 33 packs for 8.0 to 8.1 upgrade cleanup...
2025-07-07 14:23:10,890 [PID=2651:TID=Thread-2 (subscibe_over_grpc_internal):cfx.rda_messaging.nats.nats_subscriber:subscibe_over_grpc_internal:274] INFO - Subscribing to tenants.98e005500460423c886d8e30d8a9acf6.streams.rda-systc65fa61756c5f235299dc80f85c48356 using grpc
2025-07-07 14:23:10,891 [PID=2651:TID=Thread-2 (subscibe_over_grpc_internal):cfx.rda_nats.nats_grpc_util:subscribe:101] INFO - Subscribing on subject tenants.98e005500460423c886d8e30d8a9acf6.streams.rda-systc65fa61756c5f235299dc80f85c48356/1a0deaf4. ID: e4a04f90-e4e9-47ce-8873-eedbdb73c9ea
2025-07-07 14:23:10,899 [PID=2651:TID=NATS_SUB_299dc80f85c48356:cfx.rda_nats.nats_grpc_util:_subscription_worker:142] INFO - Received response on subscription on subject tenants.98e005500460423c886d8e30d8a9acf6.streams.rda-systc65fa61756c5f235299dc80f85c48356/1a0deaf4. ID: e4a04f90-e4e9-47ce-8873-eedbdb73c9ea. Time taken 8 msec
2025-07-07 14:23:10,955 [PID=2651:TID=MainThread:cfx.rda_messaging.dataplane_policy:__load_from_file:66] INFO - Loading DataPlanePolicy from file: /network_config/policy.json
2025-07-07 14:23:10,957 [PID=2651:TID=MainThread:cfx.rda_messaging.dataplane_policy:__load_from_file:106] INFO - Loaded Dataplane policy with 1 configs, 3 pstream-mappings
2025-07-07 14:23:10,957 [PID=2651:TID=MainThread:cfx.rda_messaging.dataplane_policy:__load_from_file:110] INFO - Dataplane custom routing enabled
2025-07-07 14:23:10,963 [PID=2651:TID=Thread-3 (subscibe_over_grpc_internal):cfx.rda_messaging.nats.nats_subscriber:subscibe_over_grpc_internal:274] INFO - Subscribing to tenants.98e005500460423c886d8e30d8a9acf6.streams.rda-systc65fa61756c5f235299dc80f85c48356 using grpc
2025-07-07 14:23:10,963 [PID=2651:TID=Thread-3 (subscibe_over_grpc_internal):cfx.rda_nats.nats_grpc_util:subscribe:101] INFO - Subscribing on subject tenants.98e005500460423c886d8e30d8a9acf6.streams.rda-systc65fa61756c5f235299dc80f85c48356/19cab3a1. ID: 730545b5-afe4-4033-8c9b-7944ef747413
2025-07-07 14:23:10,966 [PID=2651:TID=NATS_SUB_299dc80f85c48356:cfx.rda_nats.nats_grpc_util:_subscription_worker:142] INFO - Received response on subscription on subject tenants.98e005500460423c886d8e30d8a9acf6.streams.rda-systc65fa61756c5f235299dc80f85c48356/19cab3a1. ID: 730545b5-afe4-4033-8c9b-7944ef747413. Time taken 2 msec
2025-07-07 14:23:11,142 [PID=2651:TID=MainThread:cfx.rda_packs.rda_pack_mgmt:does_pack_exists:2037] INFO - Pack Cisco BPA version 1.0.0 exists
[TEST MODE] Pack 'Cisco BPA' version '1.0.0' would be deleted (not activated)
2025-07-07 14:23:11,213 [PID=2651:TID=MainThread:cfx.rda_packs.rda_pack_mgmt:does_pack_exists:2037] INFO - Pack Topology Path Visualisation version 1.0.0 exists
[TEST MODE] Pack 'Topology Path Visualisation' version '1.0.0' would be deleted (not activated)
2025-07-07 14:23:11,245 [PID=2651:TID=MainThread:cfx.rda_packs.rda_pack_mgmt:does_pack_exists:2037] INFO - Pack Topology version 1.0.0 exists
[TEST MODE] Pack 'Topology' version '1.0.0' would be deleted (not activated)
Pack 'System' version '1.0.0' does not exist, skipping
Pack 'Synthetic Metrics' version '1.0.0' does not exist, skipping

8.0 to 8.1 upgrade cleanup complete. 29 packs processed for deletion.
2025-07-07 14:23:12,213 [PID=2651:TID=MainThread:cfx.rda_messaging.nats.nats_subscriber:handle_exit:285] INFO - Initiating clean shutdown for subscription tenants.98e005500460423c886d8e30d8a9acf6.streams.rda-systc65fa61756c5f235299dc80f85c48356/19cab3a1

Run the actual deletion

python delete_bundle_packs_8.0_to_8.1_upgrade.py

Example Output

(cfx_venv) root@2602fff46f91:/tmp# python delete_bundle_packs_8.0_to_8.1_upgrade.py
/cfx_venv/lib/python3.12/site-packages/google/protobuf/runtime_version.py:98: UserWarning: Protobuf gencode version 5.29.0 is exactly one major version older than the runtime version 6.30.2 at nats.proto. Please update the gencode to avoid compatibility violations in the next runtime release.
warnings.warn(
/cfx_venv/lib/python3.12/site-packages/cfxql/__init__.py:8: UserWarning: pkg_resources is deprecated as an API. See https://setuptools.pypa.io/en/latest/pkg_resources.html. The pkg_resources package is slated for removal as early as 2025-11-30. Refrain from using this package or pin to Setuptools<81.
import pkg_resources
Starting 8.0 to 8.1 upgrade pack cleanup process...
Processing 33 packs for 8.0 to 8.1 upgrade cleanup...
2025-07-07 14:27:00,757 [PID=2677:TID=Thread-2 (subscibe_over_grpc_internal):cfx.rda_messaging.nats.nats_subscriber:subscibe_over_grpc_internal:274] INFO - Subscribing to tenants.98e005500460423c886d8e30d8a9acf6.streams.rda-systc65fa61756c5f235299dc80f85c48356 using grpc
2025-07-07 14:27:00,757 [PID=2677:TID=Thread-2 (subscibe_over_grpc_internal):cfx.rda_nats.nats_grpc_util:subscribe:101] INFO - Subscribing on subject tenants.98e005500460423c886d8e30d8a9acf6.streams.rda-systc65fa61756c5f235299dc80f85c48356/02ad5eca. ID: 4c0091c8-249f-4cd5-9f8a-937381dea8f8
2025-07-07 14:27:00,760 [PID=2677:TID=NATS_SUB_299dc80f85c48356:cfx.rda_nats.nats_grpc_util:_subscription_worker:142] INFO - Received response on subscription on subject tenants.98e005500460423c886d8e30d8a9acf6.streams.rda-systc65fa61756c5f235299dc80f85c48356/02ad5eca. ID: 4c0091c8-249f-4cd5-9f8a-937381dea8f8. Time taken 3 msec
2025-07-07 14:27:00,806 [PID=2677:TID=MainThread:cfx.rda_messaging.dataplane_policy:__load_from_file:66] INFO - Loading DataPlanePolicy from file: /network_config/policy.json
2025-07-07 14:27:00,806 [PID=2677:TID=MainThread:cfx.rda_messaging.dataplane_policy:__load_from_file:106] INFO - Loaded Dataplane policy with 1 configs, 3 pstream-mappings
2025-07-07 14:27:00,807 [PID=2677:TID=MainThread:cfx.rda_messaging.dataplane_policy:__load_from_file:110] INFO - Dataplane custom routing enabled
2025-07-07 14:27:00,810 [PID=2677:TID=Thread-3 (subscibe_over_grpc_internal):cfx.rda_messaging.nats.nats_subscriber:subscibe_over_grpc_internal:274] INFO - Subscribing to tenants.98e005500460423c886d8e30d8a9acf6.streams.rda-systc65fa61756c5f235299dc80f85c48356 using grpc
2025-07-07 14:27:00,811 [PID=2677:TID=Thread-3 (subscibe_over_grpc_internal):cfx.rda_nats.nats_grpc_util:subscribe:101] INFO - Subscribing on subject tenants.98e005500460423c886d8e30d8a9acf6.streams.rda-systc65fa61756c5f235299dc80f85c48356/79c09d31. ID: 3d7d6d4a-7d10-4cc1-b6a6-25e5ed21d0ab
2025-07-07 14:27:00,813 [PID=2677:TID=NATS_SUB_299dc80f85c48356:cfx.rda_nats.nats_grpc_util:_subscription_worker:142] INFO - Received response on subscription on subject tenants.98e005500460423c886d8e30d8a9acf6.streams.rda-systc65fa61756c5f235299dc80f85c48356/79c09d31. ID: 3d7d6d4a-7d10-4cc1-b6a6-25e5ed21d0ab. Time taken 1 msec
2025-07-07 14:27:00,946 [PID=2677:TID=MainThread:cfx.rda_packs.rda_pack_mgmt:does_pack_exists:2037] INFO - Pack Cisco BPA version 1.0.0 exists
Deleting pack 'Cisco BPA' version '1.0.0'
2025-07-07 14:27:00,993 [PID=2677:TID=MainThread:cfx.rda_packs.rda_pack_mgmt:remove_pack:612] INFO - Checking if pack Cisco BPA 1.0.0 exists before removing
2025-07-07 14:27:01,170 [PID=2677:TID=MainThread:cfx.rda_packs.rda_pack_mgmt:is_any_customer_enabled:1130] INFO - Unable to find any enabled customers package with scope query: id is 'Cisco BPA' and version is '1.0.0' and status is 'ACTIVATED'
2025-07-07 14:27:01,187 [PID=2677:TID=MainThread:cfx.rda_packs.rda_pack_mgmt:is_enabled_for_single_tenant:1185] INFO - Unable to find any enabled customers package with scope query: id is 'Cisco BPA' and version is '1.0.0' and single_tenant_status is 'ACTIVATED'
2025-07-07 14:27:01,202 [PID=2677:TID=MainThread:cfx.rda_packs.rda_pack_mgmt:remove_pack:623] INFO - Deleting decription MD file for pack Cisco BPA version 1_0_0
Done deleting objects
2025-07-07 14:27:01,393 [PID=2677:TID=MainThread:cfx.rda_messaging.sync_artifacts:delete_artifact:1607] INFO - Deleting rda_objects rda-objects/data/Cisco BPA/1_0_0/77be143c-description_Cisco BPA_1_0_0.data from pstream
2025-07-07 14:27:12,751 [PID=2677:TID=MainThread:cfx.rda_packs.rda_pack_mgmt:remove_pack:628] INFO - Deleting pack HP Network Automation version 1_0_0
2025-07-07 14:27:12,768 [PID=2677:TID=MainThread:cfx.rda_messaging.sync_artifacts:delete_artifact:1607] INFO - Deleting rda_packs rda_packs/HP Network Automation from pstream
2025-07-07 14:27:12,800 [PID=2677:TID=MainThread:cfx.rda_messaging.sync_artifacts:delete_artifact:1644] INFO - Response from deleting: {'status': 'ok', 'reason': '', 'data': {'took': 11, 'timed_out': False, 'total': 1, 'deleted': 1, 'batches': 1, 'version_conflicts': 0, 'noops': 0, 'retries': {'bulk': 0, 'search': 0}, 'throttled_millis': 0, 'requests_per_second': -1.0, 'throttled_until_millis': 0}, 'now': '2025-07-07T14:27:12.798267'}
2025-07-07 14:27:12,800 [PID=2677:TID=MainThread:cfx.rda_packs.rda_pack_mgmt:delete_from_customer_stream:178] INFO - Deleting pack from customer-rda-packs: id='HP Network Automation' and version='1.0.0'
Successfully deleted pack 'HP Network Automation' version '1.0.0'
Processing summary:
- Packs processed: 33
- Packs marked for deletion: 29
- Packs successfully deleted: 29
- Packs skipped (activated): 0
- Packs skipped (not found): 4
8.0 to 8.1 upgrade cleanup complete. 29 packs processed for deletion.

Run to get the stats of packs that are in Packs page (total # of packs, activated packs, # that will be deleted)

python delete_bundle_packs_8.0_to_8.1_upgrade.py --stats

Example Output

(cfx_venv) root@2602fff46f91:/tmp# python delete_bundle_packs_8.0_to_8.1_upgrade.py --stats
/cfx_venv/lib/python3.12/site-packages/google/protobuf/runtime_version.py:98: UserWarning: Protobuf gencode version 5.29.0 is exactly one major version older than the runtime version 6.30.2 at nats.proto. Please update the gencode to avoid compatibility violations in the next runtime release.
warnings.warn(
/cfx_venv/lib/python3.12/site-packages/cfxql/__init__.py:8: UserWarning: pkg_resources is deprecated as an API. See https://setuptools.pypa.io/en/latest/pkg_resources.html. The pkg_resources package is slated for removal as early as 2025-11-30. Refrain from using this package or pin to Setuptools<81.
import pkg_resources
Gathering pack statistics...
2025-07-07 16:10:15,072 [PID=2897:TID=MainThread:cfx.rda_packs.rda_pack_mgmt:list_packs_from_minio:707] INFO - Getting minio path: rda_packs/
2025-07-07 16:10:15,116 [PID=2897:TID=MainThread:cfx.rda_packs.rda_pack_mgmt:list_packs_from_minio:711] INFO - Getting manifest file for rda_packs/Cisco Meraki/9_0_1/Cisco Meraki.tar.gz
2025-07-07 16:10:15,116 [PID=2897:TID=MainThread:cfx.rda_packs.rda_pack_mgmt:list_packs_from_minio:711] INFO - Getting manifest file for rda_packs/Cisco Meraki/9_0_1/manifest.yaml
2025-07-07 16:10:15,116 [PID=2897:TID=MainThread:cfx.rda_packs.rda_pack_mgmt:list_packs_from_minio:713] INFO - Reading manifest file: rda_packs/Cisco Meraki/9_0_1/manifest.yaml
2025-07-07 16:10:15,121 [PID=2897:TID=MainThread:cfx.rda_packs.rda_pack_mgmt:list_packs_from_minio:711] INFO - Getting manifest file for rda_packs/Cisco vManage/9_0_1/Cisco vManage.tar.gz
2025-07-07 16:10:15,121 [PID=2897:TID=MainThread:cfx.rda_packs.rda_pack_mgmt:list_packs_from_minio:711] INFO - Getting manifest file for rda_packs/Cisco vManage/9_0_1/manifest.yaml
2025-07-07 16:10:15,121 [PID=2897:TID=MainThread:cfx.rda_packs.rda_pack_mgmt:list_packs_from_minio:713] INFO - Reading manifest file: rda_packs/Cisco vManage/9_0_1/manifest.yaml
2025-07-07 16:10:15,128 [PID=2897:TID=MainThread:cfx.rda_packs.rda_pack_mgmt:list_packs_from_minio:711] INFO - Getting manifest file for rda_packs/Fabrix AIOps Asset Correlation Regression/9_0_0/Fabrix AIOps Asset Correlation Regression.tar.gz
2025-07-07 16:10:15,128 [PID=2897:TID=MainThread:cfx.rda_packs.rda_pack_mgmt:list_packs_from_minio:711] INFO - Getting manifest file for rda_packs/Fabrix AIOps Asset Correlation Regression/9_0_0/manifest.yaml
2025-07-07 16:10:15,128 [PID=2897:TID=MainThread:cfx.rda_packs.rda_pack_mgmt:list_packs_from_minio:713] INFO - Reading manifest file: rda_packs/Fabrix AIOps Asset Correlation Regression/9_0_0/manifest.yaml
2025-07-07 16:10:15,214 [PID=2897:TID=MainThread:cfx.rda_packs.rda_pack_mgmt:list_packs_from_minio:711] INFO - Getting manifest file for rda_packs/Fabrix AIOps Fault Management Base/8_1_1/Fabrix AIOps Fault Management Base.tar.gz
2025-07-07 16:10:15,214 [PID=2897:TID=MainThread:cfx.rda_packs.rda_pack_mgmt:list_packs_from_minio:711] INFO - Getting manifest file for rda_packs/Fabrix AIOps Fault Management Base/8_1_1/manifest.yaml
2025-07-07 16:10:15,214 [PID=2897:TID=MainThread:cfx.rda_packs.rda_pack_mgmt:list_packs_from_minio:713] INFO - Reading manifest file: rda_packs/Fabrix AIOps Fault Management Base/8_1_1/manifest.yaml
2025-07-07 16:10:15,220 [PID=2897:TID=MainThread:cfx.rda_packs.rda_pack_mgmt:list_packs_from_minio:711] INFO - Getting manifest file for rda_packs/Fabrix AIOps Fault Management Base/9_0_1/Fabrix AIOps Fault Management Base.tar.gz
2025-07-07 16:10:15,221 [PID=2897:TID=MainThread:cfx.rda_packs.rda_pack_mgmt:list_packs_from_minio:711] INFO - Getting manifest file for rda_packs/Fabrix AIOps Fault Management Base/9_0_1/manifest.yaml
2025-07-07 16:10:15,221 [PID=2897:TID=MainThread:cfx.rda_packs.rda_pack_mgmt:list_packs_from_minio:713] INFO - Reading manifest file: rda_packs/Fabrix AIOps Fault Management Base/9_0_1/manifest.yaml
2025-07-07 16:10:15,227 [PID=2897:TID=MainThread:cfx.rda_packs.rda_pack_mgmt:list_packs_from_minio:711] INFO - Getting manifest file for rda_packs/Fabrix AIOps Fault Management Base/9_0_10/Fabrix AIOps Fault Management Base.tar.gz
2025-07-07 16:10:15,227 [PID=2897:TID=MainThread:cfx.rda_packs.rda_pack_mgmt:list_packs_from_minio:711] INFO - Getting manifest file for rda_packs/Fabrix AIOps Fault Management Base/9_0_10/manifest.yaml
2025-07-07 16:10:15,227 [PID=2897:TID=MainThread:cfx.rda_packs.rda_pack_mgmt:list_packs_from_minio:713] INFO - Reading manifest file: rda_packs/Fabrix AIOps Fault Management Base/9_0_10/manifest.yaml
2025-07-07 16:10:15,234 [PID=2897:TID=MainThread:cfx.rda_packs.rda_pack_mgmt:list_packs_from_minio:711] INFO - Getting manifest file for rda_packs/Fabrix AIOps Fault Management Base/9_0_3/Fabrix AIOps Fault Management Base.tar.gz
2025-07-07 16:10:15,234 [PID=2897:TID=MainThread:cfx.rda_packs.rda_pack_mgmt:list_packs_from_minio:711] INFO - Getting manifest file for rda_packs/Fabrix AIOps Fault Management Base/9_0_3/manifest.yaml
2025-07-07 16:10:15,234 [PID=2897:TID=MainThread:cfx.rda_packs.rda_pack_mgmt:list_packs_from_minio:713] INFO - Reading manifest file: rda_packs/Fabrix AIOps Fault Management Base/9_0_3/manifest.yaml
2025-07-07 16:10:15,241 [PID=2897:TID=MainThread:cfx.rda_packs.rda_pack_mgmt:list_packs_from_minio:711] INFO - Getting manifest file for rda_packs/Fabrix AIOps Fault Management Base/9_0_8/Fabrix AIOps Fault Management Base.tar.gz
2025-07-07 16:10:15,241 [PID=2897:TID=MainThread:cfx.rda_packs.rda_pack_mgmt:list_packs_from_minio:711] INFO - Getting manifest file for rda_packs/Fabrix AIOps Fault Management Base/9_0_8/manifest.yaml
2025-07-07 16:10:15,241 [PID=2897:TID=MainThread:cfx.rda_packs.rda_pack_mgmt:list_packs_from_minio:713] INFO - Reading manifest file: rda_packs/Fabrix AIOps Fault Management Base/9_0_8/manifest.yaml
2025-07-07 16:10:15,247 [PID=2897:TID=MainThread:cfx.rda_packs.rda_pack_mgmt:list_packs_from_minio:711] INFO - Getting manifest file for rda_packs/Fabrix AIOps Fault Management Base/9_0_9/Fabrix AIOps Fault Management Base.tar.gz
2025-07-07 16:10:15,248 [PID=2897:TID=MainThread:cfx.rda_packs.rda_pack_mgmt:list_packs_from_minio:711] INFO - Getting manifest file for rda_packs/Fabrix AIOps Fault Management Base/9_0_9/manifest.yaml
2025-07-07 16:10:15,248 [PID=2897:TID=MainThread:cfx.rda_packs.rda_pack_mgmt:list_packs_from_minio:713] INFO - Reading manifest file: rda_packs/Fabrix AIOps Fault Management Base/9_0_9/manifest.yaml
2025-07-07 16:10:15,255 [PID=2897:TID=MainThread:cfx.rda_packs.rda_pack_mgmt:list_packs_from_minio:711] INFO - Getting manifest file for rda_packs/Fabrix AIOps ML Metrics Regression/9_0_0/Fabrix AIOps ML Metrics Regression.tar.gz
2025-07-07 16:10:15,255 [PID=2897:TID=MainThread:cfx.rda_packs.rda_pack_mgmt:list_packs_from_minio:711] INFO - Getting manifest file for rda_packs/Fabrix AIOps ML Metrics Regression/9_0_0/manifest.yaml
2025-07-07 16:10:15,255 [PID=2897:TID=MainThread:cfx.rda_packs.rda_pack_mgmt:list_packs_from_minio:713] INFO - Reading manifest file: rda_packs/Fabrix AIOps ML Metrics Regression/9_0_0/manifest.yaml
2025-07-07 16:10:15,259 [PID=2897:TID=MainThread:cfx.rda_packs.rda_pack_mgmt:list_packs_from_minio:711] INFO - Getting manifest file for rda_packs/Fabrix AIOps ML/9_0_0/Fabrix AIOps ML.tar.gz
Pack Statistics:
2025-07-07 16:10:15,926 [PID=2897:TID=MainThread:cfx.rda_packs.rda_pack_mgmt:is_pack_activated:413] INFO - Pack VMWare vCenter version 9.0.2 is already in ACTIVATED state.
{
"total_packs": 16,
"activated_packs": 13,
"non_activated_packs": 3,
"pack_details": [
    {
    "pack_name": "Cisco Meraki",
    "version": "9.0.1",
    "activated": false
    },
    {
    "pack_name": "Cisco vManage",
    "version": "9.0.1",
    "activated": true
    },

Step 5: Upload following RDA Packs (Go to Main Menu --> Configuration --> RDA Administration --> Packs --> Click Upload Packs from Catalog), and activate the packs in the below given order to get the latest dashboard changes for OIA Alerts and Incidents.

Fabrix Inventory Collection Base Pack Version 7.2.0
Fabrix AIOps Fault Management Base Version 9.0.13

Step 6: Upload following RDA Packs (Go to Main Menu --> Configuration --> RDA Administration --> Packs --> Click Upload Packs from Catalog), and activate the packs to get the latest dashboard changes for VCenter and Network Topology

VMWare vCenter with version 9.0.2
Network Device Discovery with version 9.0.0

Step 7:Upload following RDA Packs (Go to Main Menu --> Configuration --> RDA Administration --> Packs --> Click Upload Packs from Catalog), and activate the packs to get the latest dashboard changes for ML

Fabrix AIOps ML with version 9.0.0
Fabrix AIOps Asset Correlation Regression with version 9.0.1

Step 8. After the upgrade, check the following Platform, Worker, OIA Services, Event-gateway, Bulkstats YAML files in CLI VM located at /opt/rdaf/deployment-scripts/values.yaml

Please check the SYS_PTRACE within the capabilities section for each service, as illustrated in the following example.

Example

rda_api_server:
mem_limit: 4G
memswap_limit: 4G
privileged: true
environment:
  RDA_STUDIO_URL: '""'
  RDA_ENABLE_TRACES: 'no'
  DISABLE_REMOTE_LOGGING_CONTROL: 'no'
  RDA_SELF_HEALTH_RESTART_AFTER_FAILURES: 3
deployment: true
hosts:
- 192.168.109.50
- 192.168.109.51
cap_add:
- SYS_PTRACE

Step 9: Purge of Alerthistory Data from the Database

This task includes the purging of CLEARED alerts data from the alerthistory table and migrating CLEARED alert payloads from the alerthistory table to the oia-alert-payload PStream.

Execute the purge script as detailed in the provided link. please refer to Manual Purge History Alerts Document. This task involves the purging of alerts history data from the alerthistory table and the Alert payload migration from the alerthistory table to the alert payload PStream.
Upon successful execution of the Purge script, update the Cleared Alerts Data Retention in Database setting to 1 hour. Path to update Cleared Alerts Data Retention property Main Menu --> Administration --> Configurations --> Click on row level action for Cleared Alerts Data Retention in Database.

Step 10: Copy Policies From DB To PStream

Suppression/Correlation policies should be copied to pstream from the database to simplify the writing of rda_packs, dashboards, snapshots, and other artifacts.

Execute the CopyPoliciesFromDBToPStream Script as detailed in the provided Copy Policies From DB to Pstream

Upgrade from 7.7.2 to 8.1.0.1

1. Upgrade From 3.7.2 to 8.1.0.1 and 7.7.2 to 8.1.0.1

1.1. Prerequisites

1.1.1 Upgrade Python from 3.7 to 3.12

1.2. Upgrade Steps

1.2.1 Migration of On-Prem Registry

1.2.2 Upgrade On-Prem Registry

1.2.3 Download the new Docker Images

1.2.4 Upgrade RDAF Infra Services

1.2.5 Upgrade RDAF Platform Services

1.2.6 Upgrade rdac CLI

1.2.7 Upgrade RDA Worker Services

1.2.8 Update Environment Variables in values.yaml

1.2.9 Upgrade OIA Application Services

1.2.9.1 Upgrade Migrate Datasets

1.2.10 Upgrade Event Gateway Services

1.2.11 Install RDAF Bulkstats Services

1.2.11.1 Install RDAF File Object Services

1.2.12 Nginx Load Balancer for Event gateway

1.3. Post Upgrade Steps

1.2.6 Upgrade `rdac` CLI