Hot Remove CPU and Memory for Oracle production workloads

The previous blog, On Demand Scaling up resources for Oracle production workloads – Hot Add CPU and Hot Add Memory, focused on how to stop hoarding much-needed infrastructure resources by scaling up effectively as needed.

This blog focuses on Hot Remove of CPU and Memory for Oracle workloads and on the current limitations of these capabilities.

Current caveats for CPU “Hot-Remove” support

  • vSphere currently does not offer a CPU Hot Remove capability in the Web Client.
  • Red Hat Enterprise Linux 7.1 (RHEL 7.1) and later support “Hot-Remove” of physical CPU and Memory, as long as the underlying hardware supports this function.
  • CPU Hot Remove is an in-guest capability from RHEL 7.1 onwards. See the Red Hat KB article CPU/Memory “Hot-Add” and “Hot-Remove” Support in Red Hat Enterprise Linux version 7.
  • The in-guest CPU Hot Remove capability does not reduce the number of vCPUs allocated to the VM. It is only an OS-level Hot Online/Offline operation.

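Current caveats for Memory “Hot-Remove” support

  • vSphere currently does not offer a Memory Hot Remove capability in the Web Client.
  • In-guest Memory Hot Remove is currently not supported on VMware.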

Test Setup

The VM ‘SB-OL76-ORA19C’ has 8 vCPUs and 32 GB RAM and runs OEL 7.6 with Oracle Database 19c.

[root@sb_ol76_ora19c ~]# cat /etc/oracle-release
Oracle Linux Server release 7.6
[root@sb_ol76_ora19c ~]#

[root@sb_ol76_ora19c ~]# cat /etc/redhat-release
Red Hat Enterprise Linux Server release 7.6 (Maipo)
[root@sb_ol76_ora19c ~]#

[root@sb_ol76_ora19c ~]# uname -a
Linux sb_ol76_ora19c.corp.localdomain 4.1.12-112.14.15.el7uek.x86_64 #2 SMP Thu Feb 8 09:58:19 PST 2018 x86_64 x86_64 x86_64 GNU/Linux
[root@sb_ol76_ora19c ~]#

The Oracle instance ‘ora19c’ has sga_max_size set to 16 GB, and cpu_count shows 8.

oracle@sb_ol76_ora19c:ora19c:/home/oracle> sqlplus / as sysdba
SQL*Plus: Release 19.0.0.0.0 - Production on Wed Sep 15 12:28:07 2021
Version 19.3.0.0.0
Copyright (c) 1982, 2019, Oracle. All rights reserved.
Connected to:
Oracle Database 19c Enterprise Edition Release 19.0.0.0.0 - Production
Version 19.3.0.0.0

SQL> select name, value from v$parameter where name = 'sga_max_size';
NAME                      VALUE
------------------------- ----------------
sga_max_size              17179869184
SQL>

SQL> select name, value from v$parameter where name = 'cpu_count';
NAME                      VALUE
------------------------- ----------------
cpu_count                 8
SQL> Disconnected from Oracle Database 19c Enterprise Edition Release 19.0.0.0.0 - Production
Version 19.3.0.0.0
oracle@sb_ol76_ora19c:ora19c:/home/oracle>

Test Case – “Hot Remove” CPU

Red Hat Enterprise Linux 7.1 (RHEL 7.1) and later support “Hot-Remove” of physical CPU and Memory, as long as the underlying hardware supports this function. Learn more in the Red Hat KB article CPU/Memory “Hot-Add” and “Hot-Remove” Support in Red Hat Enterprise Linux version 7.

vSphere currently does not offer a CPU Hot Remove capability in the Web Client. CPU Hot Remove is an in-guest capability from RHEL 7.1 onwards.

The VM ‘SB-OL76-ORA19C’ has 8 vCPUs.

The command below lists all vCPUs/CPUs present on the system:

[root@sb_ol76_ora19c ~]# ls -l /sys/devices/system/cpu | grep cpu
drwxr-xr-x 6 root root 0 Sep 12 11:13 cpu0
drwxr-xr-x 6 root root 0 Sep 12 11:13 cpu1
drwxr-xr-x 6 root root 0 Sep 12 11:13 cpu2
drwxr-xr-x 6 root root 0 Sep 12 11:13 cpu3
drwxr-xr-x 6 root root 0 Sep 12 11:13 cpu4
drwxr-xr-x 6 root root 0 Sep 12 11:13 cpu5
drwxr-xr-x 6 root root 0 Sep 12 11:13 cpu6
drwxr-xr-x 6 root root 0 Sep 12 11:13 cpu7
drwxr-xr-x 2 root root 0 Sep 12 19:53 cpuidle
[root@sb_ol76_ora19c ~]#

The command below lists all online vCPUs/CPUs on the system:

[root@sb_ol76_ora19c ~]# cat /proc/cpuinfo |grep processor
processor : 0
processor : 1
processor : 2
processor : 3
processor : 4
processor : 5
processor : 6
processor : 7
[root@sb_ol76_ora19c ~]#
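The two listings above can be combined into a single per-CPU view. The following is a minimal sketch, not part of the original test: the function name list_cpu_states is hypothetical, and the optional directory argument exists only so the function can be pointed at a scratch directory instead of the real sysfs tree. It assumes that a CPU directory without an online file cannot be taken offline and reports it as n/a.

```shell
# Print every cpuN directory under a sysfs CPU root with its online flag.
# Hypothetical helper; the root defaults to the real sysfs location.
list_cpu_states() (
    root="${1:-/sys/devices/system/cpu}"
    for dir in "$root"/cpu[0-9]*; do
        # Skip the unexpanded pattern when nothing matches.
        [ -d "$dir" ] || continue
        if [ -r "$dir/online" ]; then
            state="$(cat "$dir/online")"
        else
            # Assumption: no 'online' file means the CPU is not hot-pluggable.
            state="n/a"
        fi
        printf '%s %s\n' "${dir##*/}" "$state"
    done
)
```

Calling list_cpu_states with no argument reads /sys/devices/system/cpu; entries such as cpuidle are ignored because only names of the form cpu followed by a digit are matched.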

The command below takes CPU0 offline (the in-guest “Hot-Remove”):

[root@sb_ol76_ora19c ~]# echo 0 > /sys/devices/system/cpu/cpu0/online

The OS /var/log/messages file shows the following message:

Sep 12 20:04:39 sb_ol76_ora19c kernel: smpboot: CPU 0 is now offline

Check the online/offline vCPUs via the commands below:

[root@sb_ol76_ora19c ~]# cat /sys/devices/system/cpu/online
1-7
[root@sb_ol76_ora19c ~]#

[root@sb_ol76_ora19c ~]# cat /sys/devices/system/cpu/offline
0
[root@sb_ol76_ora19c ~]#
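The online and offline files report CPUs as a comma-separated list of IDs and ranges, such as 1-7 above. As an illustrative sketch (the function name count_cpus is hypothetical), such a list can be turned into a CPU count like this:

```shell
# Count the CPUs described by a sysfs-style list such as "1-7" or "0-3,5,7".
# The function body runs in a subshell so the IFS change stays local.
count_cpus() (
    total=0
    IFS=','
    for part in $1; do
        lo="${part%-*}"   # text before the dash (or the whole ID)
        hi="${part#*-}"   # text after the dash (or the whole ID)
        total=$(( total + $hi - $lo + 1 ))
    done
    echo "$total"
)
```

For example, count_cpus "$(cat /sys/devices/system/cpu/online)" would print 7 for the state shown above, and an empty list yields 0.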

The Oracle database alert.log file contains the following messages, which reflect the Hot Removal of CPU0:

2021-09-12T19:55:55.961847-07:00
Detected change in CPU count to 7
* Load Monitor used for high load check
* New Low - High Load Threshold Range = [0 - 0]
PDB$SEED(2):Adjusting the altered value of parameter parallel_max_servers
PDB$SEED(2):from 140 to 100 due to the ROOT Container Value Setting
PDB1(3):Adjusting the altered value of parameter parallel_max_servers
PDB1(3):from 140 to 100 due to the ROOT Container Value Setting
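To pull only the CPU count changes out of a long alert.log, a small filter can pair each “Detected change in CPU count” line with the timestamp line that precedes it. This is a sketch under assumptions: the function name cpu_count_changes is hypothetical, the alert.log path is passed in by the caller because it is not shown in this post, and the timestamp format is assumed to match the excerpt above.

```shell
# Print "<timestamp> <new cpu count>" for every CPU count change in an alert log.
# $1 is the path to the alert log (supplied by the caller).
cpu_count_changes() {
    awk '
        # Remember the most recent timestamp line (YYYY-MM-DDT...).
        /^[0-9][0-9][0-9][0-9]-[0-9][0-9]-[0-9][0-9]T/ { ts = $0; next }
        # The last field of the matching message is the new CPU count.
        /Detected change in CPU count to/ { print ts, $NF }
    ' "$1"
}
```

Run against the excerpt above, it would print the timestamp followed by 7.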

The SQL statement below reflects the reduction in the number of CPUs to 7:

SQL> select name, value from v$parameter where name = 'cpu_count';
NAME
--------------------------------------------------------------------------------
VALUE
--------------------------------------------------------------------------------
cpu_count
7

The command below brings CPU0 back online:

[root@sb_ol76_ora19c ~]# echo 1 > /sys/devices/system/cpu/cpu0/online

The OS /var/log/messages file shows the following message:

Sep 12 20:07:56 sb_ol76_ora19c kernel: smpboot: Booting Node 0 Processor 0 APIC 0x0

Check the online vCPUs via the commands below:

[root@sb_ol76_ora19c ~]# cat /proc/cpuinfo |grep processor
processor : 0
processor : 1
processor : 2
processor : 3
processor : 4
processor : 5
processor : 6
processor : 7
[root@sb_ol76_ora19c ~]#

[root@sb_ol76_ora19c ~]# cat /sys/devices/system/cpu/online
0-7
[root@sb_ol76_ora19c ~]#
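The offline and online steps above can be wrapped in a helper that writes the flag and then reads it back, so a silently ignored request is noticed. This is a minimal sketch, not the procedure used in the test: set_cpu_online is a hypothetical name, and the third argument (the sysfs CPU root) is only there so the helper can be tried against a scratch directory. On a real system it needs root privileges.

```shell
# Set the online flag of one CPU and verify the result.
# Usage: set_cpu_online <cpu-number> <0|1> [sysfs-cpu-root]
set_cpu_online() (
    cpu="$1"
    value="$2"
    root="${3:-/sys/devices/system/cpu}"
    ctl="$root/cpu$cpu/online"
    case "$value" in
        0|1) ;;
        *) echo "value must be 0 or 1" >&2; exit 2 ;;
    esac
    # A missing or read-only control file means this CPU cannot be toggled here.
    [ -w "$ctl" ] || { echo "cpu$cpu: no writable online file" >&2; exit 1; }
    echo "$value" > "$ctl" || exit 1
    # Succeed only if the flag now holds the requested value.
    [ "$(cat "$ctl")" = "$value" ]
)
```

For example, set_cpu_online 0 0 corresponds to the offline step and set_cpu_online 0 1 to the online step shown earlier.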

The Oracle database alert.log file contains the following messages, which reflect the Hot Add of CPU0:
….
2021-09-12T19:56:55.995881-07:00
Detected change in CPU count to 8
* Load Monitor used for high load check
* New Low - High Load Threshold Range = [0 - 0]
PDB$SEED(2):Adjusting the altered value of parameter parallel_max_servers
PDB$SEED(2):from 160 to 100 due to the ROOT Container Value Setting
PDB1(3):Adjusting the altered value of parameter parallel_max_servers
PDB1(3):from 160 to 100 due to the ROOT Container Value Setting
2021-09-12T20:04:56.220823-07:00

The SQL statement below reflects the increase in the number of CPUs back to 8:

SQL> select name, value from v$parameter where name = 'cpu_count';

NAME
--------------------------------------------------------------------------------
VALUE
--------------------------------------------------------------------------------
cpu_count
8

We were able to successfully Hot Offline / Online CPUs on the RHEL 7.6 OS. vSphere currently does not offer a CPU Hot Remove capability in the Web Client; CPU Hot Remove is an in-guest capability from RHEL 7.1 onwards.

The in-guest CPU Hot Remove capability does not reduce the number of vCPUs allocated to the VM. It is only an OS-level Hot Online/Offline operation.

Test Case – “Hot Remove” Memory

RHEL 7.0 and later support “Hot-Remove” of physical Memory, as long as the underlying hardware supports this function.

According to the Red Hat KB article How to hot-add and hot-remove memory in RHEL7?, the hardware or virtualization provider must support hot plug functionality. Historically VMware has supported hot-add memory, but not hot-remove memory. Verify functionality with your hardware or virtualization provider before proceeding.

vSphere currently does not offer a Memory Hot Remove capability in the Web Client. In-guest Memory Hot Remove is currently not supported on VMware.

The VM ‘SB-OL76-ORA19C’ has 32 GB vRAM.

Check the current memory status with the commands below. A memory block can be removed only if /sys/devices/system/memory/memoryX/removable is 1.

[root@sb_ol76_ora19c memory1]# cat /sys/devices/system/memory/memory1/state
online
[root@sb_ol76_ora19c memory1]# cat /sys/devices/system/memory/memory1/removable
1
[root@sb_ol76_ora19c memory1]#
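Checking one block at a time gets tedious on a 32 GB VM, so the same two files can be read for every block in a loop. The sketch below is illustrative only: list_memory_blocks is a hypothetical name, the optional argument is the sysfs memory root (overridable for trying it against a scratch directory), and a file that is missing on a given kernel is reported as n/a.

```shell
# Print state and removable flag for every memoryN block under a sysfs root.
list_memory_blocks() (
    root="${1:-/sys/devices/system/memory}"
    for dir in "$root"/memory[0-9]*; do
        # Skip the unexpanded pattern when nothing matches.
        [ -d "$dir" ] || continue
        state="$(cat "$dir/state" 2>/dev/null || echo n/a)"
        removable="$(cat "$dir/removable" 2>/dev/null || echo n/a)"
        printf '%s state=%s removable=%s\n' "${dir##*/}" "$state" "$removable"
    done
)
```

Piping the output through grep 'removable=1' would narrow the list to the candidate blocks.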

The offline operation does not succeed, and the memory block is still online:
[root@sb_ol76_ora19c memory1]# echo offline > /sys/devices/system/memory/memory1/state
[root@sb_ol76_ora19c memory1]# cat /sys/devices/system/memory/memory1/state
online
[root@sb_ol76_ora19c memory1]# cat /sys/devices/system/memory/memory1/removable
1
[root@sb_ol76_ora19c memory1]#
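Because the echo above returns without any visible error even though the block stays online, it helps to re-read the state file after every offline request. A minimal sketch of that check follows; offline_memory_block is a hypothetical name, and the second argument (the sysfs memory root) is an assumption added so the helper can be exercised against a scratch directory.

```shell
# Request offlining of one memory block, then report the state actually seen.
# Usage: offline_memory_block <block-number> [sysfs-memory-root]
offline_memory_block() (
    block="$1"
    root="${2:-/sys/devices/system/memory}"
    ctl="$root/memory$block/state"
    [ -w "$ctl" ] || { echo "memory$block: no writable state file" >&2; exit 2; }
    # The write itself may fail or be undone; the re-read below is what counts.
    echo offline > "$ctl" 2>/dev/null
    state="$(cat "$ctl")"
    echo "memory$block: $state"
    [ "$state" = "offline" ]
)
```

On the VM in this test, offline_memory_block 1 would be expected to print memory1: online and return a non-zero status, matching the result above.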

The OS /var/log/messages file has these messages:
Sep 12 21:13:22 sb_ol76_ora19c kernel: Offlined Pages 32768
Sep 12 21:13:22 sb_ol76_ora19c systemd: Stopping Crash recovery kernel arming…
Sep 12 21:13:22 sb_ol76_ora19c kdumpctl: kexec: unloaded kdump kernel
Sep 12 21:13:22 sb_ol76_ora19c kdumpctl: Stopping kdump: [OK]
Sep 12 21:13:22 sb_ol76_ora19c systemd: Stopped Crash recovery kernel arming.
Sep 12 21:13:22 sb_ol76_ora19c systemd: Starting Crash recovery kernel arming…
Sep 12 21:13:23 sb_ol76_ora19c kdumpctl: kexec: loaded kdump kernel
Sep 12 21:13:23 sb_ol76_ora19c kdumpctl: Starting kdump: [OK]
Sep 12 21:13:23 sb_ol76_ora19c systemd: Started Crash recovery kernel arming.
Sep 12 21:13:23 sb_ol76_ora19c systemd: Stopping Crash recovery kernel arming…
Sep 12 21:13:23 sb_ol76_ora19c kdumpctl: kexec: unloaded kdump kernel
Sep 12 21:13:23 sb_ol76_ora19c kdumpctl: Stopping kdump: [OK]
Sep 12 21:13:23 sb_ol76_ora19c systemd: Stopped Crash recovery kernel arming.
Sep 12 21:13:23 sb_ol76_ora19c systemd: Starting Crash recovery kernel arming…
Sep 12 21:13:25 sb_ol76_ora19c kdumpctl: kexec: loaded kdump kernel
Sep 12 21:13:25 sb_ol76_ora19c kdumpctl: Starting kdump: [OK]
Sep 12 21:13:25 sb_ol76_ora19c systemd: Started Crash recovery kernel arming.
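The first log line above reports 32768 offlined pages. As a quick worked conversion, assuming the usual 4 KiB page size on x86_64 (it can be confirmed with getconf PAGESIZE), that corresponds to 128 MiB. The function name pages_to_mib is hypothetical:

```shell
# Convert a page count to MiB; the page size defaults to 4096 bytes (assumption).
pages_to_mib() (
    pages="$1"
    page_size="${2:-4096}"
    echo $(( $pages * $page_size / 1024 / 1024 ))
)

pages_to_mib 32768   # prints 128
```

The size of one sysfs memory block on a given system is exposed, in hexadecimal, in /sys/devices/system/memory/block_size_bytes, which can be compared against this figure.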

The Oracle database is still online and is not affected by the above operation.

We were not able to Hot Remove / Offline memory on the RHEL 7.6 OS on VMware. vSphere currently does not offer a Memory Hot Remove capability in the Web Client, and in-guest Memory Hot Remove is currently not supported on VMware.

Summary

Current caveats for CPU “Hot-Remove” support

  • vSphere currently does not offer a CPU Hot Remove capability in the Web Client.
  • Red Hat Enterprise Linux 7.1 (RHEL 7.1) and later support “Hot-Remove” of physical CPU and Memory, as long as the underlying hardware supports this function.
  • CPU Hot Remove is an in-guest capability from RHEL 7.1 onwards. See the Red Hat KB article CPU/Memory “Hot-Add” and “Hot-Remove” Support in Red Hat Enterprise Linux version 7.
  • The in-guest CPU Hot Remove capability does not reduce the number of vCPUs allocated to the VM. It is only an OS-level Hot Online/Offline operation.

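Current caveats for Memory “Hot-Remove” support

  • vSphere currently does not offer a Memory Hot Remove capability in the Web Client.
  • In-guest Memory Hot Remove is currently not supported on VMware.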

All Oracle on vSphere white papers, including Oracle licensing on vSphere/vSAN, Oracle best practices, RAC deployment guides, and the workload characterization guide, can be found at the URL below:

Oracle on VMware Collateral – One Stop Shop
https://blogs.vmware.com/apps/2017/01/oracle-vmware-collateral-one-stop-shop.html
