Solution – RKE Cluster MetalLB provides Services with IP Addresses but doesn’t ARP for the address

I ran in to the the same issue detailed here working with a RKE cluster

https://github.com/metallb/metallb/issues/1154

After looking around for a few hours digging in to the logs i figured out the issue, hopefully this helps some one else our there in the situation save some time.

Make sure the IPVS mode is enabled on the cluster configuration

If you are using :

RKE2 – edit the cluster.yaml file

RKE1 – Edit the cluster configuration from the rancher UI > Cluster management > Select the cluster > edit configuration > edit as YAML

Locate the services field under rancher_kubernetes_engine_config and add the following options to enable IPVS

    kubeproxy:
      extra_args:
        ipvs-scheduler: lc
        proxy-mode: ipvs

https://www.suse.com/support/kb/doc/?id=000020035

Make sure the Kernel modules are enabled on the nodes running control planes

Background

Example Rancher – RKE1 cluster

sudo docker ps | grep proxy # find the container ID for kubproxy

sudo docker logs ####containerID###

0313 21:44:08.315888  108645 feature_gate.go:245] feature gates: &{map[]}
I0313 21:44:08.346872  108645 proxier.go:652] "Failed to load kernel module with modprobe, you can ignore this message when kube-proxy is running inside container without mounting /lib/modules" moduleName="nf_conntrack_ipv4"
E0313 21:44:08.347024  108645 server_others.go:107] "Can't use the IPVS proxier" err="IPVS proxier will not be used because the following required kernel modules are not loaded: [ip_vs_lc]"

Kubproxy is trying to load the needed kernel modules and failing to enable IPVS

Lets enable the kernel modules

sudo nano /etc/modules-load.d/ipvs.conf

ip_vs_lc
ip_vs
ip_vs_rr
ip_vs_wrr
ip_vs_sh
nf_conntrack_ipv4

Install ipvsadm to confirm the changes

sudo dnf install ipvsadm -y

Reboot the VM or the Baremetal server

use the sudo ipvsadm to confirm ipvs is enabled

sudo ipvsadm

Testing

kubectl get svc -n #namespace | grep load
arping -I ens192 192.168.94.140
ARPING 192.168.94.140 from 192.168.94.65 ens192
Unicast reply from 192.168.94.140 [00:50:56:96:E3:1D] 1.117ms
Unicast reply from 192.168.94.140 [00:50:56:96:E3:1D] 0.737ms
Unicast reply from 192.168.94.140 [00:50:56:96:E3:1D] 0.845ms
Unicast reply from 192.168.94.140 [00:50:56:96:E3:1D] 0.668ms
Sent 4 probes (1 broadcast(s))
Received 4 response(s)

If you have the service type load balancer on a deployment now you should be able to reach it if the container is responding on the service

helpful Links

https://metallb.universe.tf/configuration/troubleshooting/

https://github.com/metallb/metallb/issues/1154

https://github.com/rancher/rke2/issues/3710

How to extend root (cs-root) Filesystem using LVM Cent OS/RHEL/Almalinux

This guide will walk you through on how to extend and increase space for the root filesystem on a alma linux. Cent OS, REHL Server/Desktop/VM

Check the current space available

sudo dh -f 
Notes -
If you have 0% ~1MB left on the cs-root command auto-complete with tab and some of the later commands wont work, You should clear up atleast 4-10mb by clearing log files, temp files, etc

Mount an additional disk to the VM (Assuming this is a VM) and make sure the disk is visible on the OS level

sudo lvmdiskscan

OR

sudo fdisk -l

Confirm the volume group name

sudo vgs

Lets increase the space

First lets initialize the new disk we mounted

sudo mkfs.xfs /dev/sdb

Create the Physical volume

sudo pvcreate /dev/sdb

extend the volume group

sudo vgextend cs /dev/sdb
  Volume group "cs" successfully extended


Extend the logical volume

sudo lvextend -l +100%FREE /dev/cs/root

Grow the file system size

sudo xfs_growfs /dev/cs/root

Confirm the changes

sudo df -h

Setup guide for VSFTPD FTP Server – SELinux enforced with fail2ban (RHEL, CentOS, Almalinux)

Few things to note

  • if you want to prevent directory traversal we need to setup chroot with vsftpd (not covered on this KB)
  • For the demo I just used Unencrypted FTP on port 21 to keep things simple, Please utilize SFTP with the letsencrypt certificate for better security. i will cover this on another article and link it here

Update and Install packages we need

sudo dnf update
sudo dnf install net-tools lsof unzip zip tree policycoreutils-python-utils-2.9-20.el8.noarch vsftpd nano setroubleshoot-server -y

Setup Groups and Users and security hardening

if you want to prevent directory traversal we need to setup chroot with vsftpd (not covered on this KB)

Create the Service admin account

sudo useradd ftpadmin
sudo passwd ftpadmin

Create the group

sudo groupadd FTP_Root_RW

Create FTP only user shell for the FTP users

echo -e '#!/bin/sh\necho "This account is limited to FTP access only."' | sudo tee -a /bin/ftponly
sudo chmod a+x /bin/ftponly

echo "/bin/ftponly" | sudo tee -a /etc/shells

Create FTP users

sudo useradd ftpuser01 -m -s /bin/ftponly
sudo useradd ftpuser02 -m -s /bin/ftponly
user passwd ftpuser01 
user passwd ftpuser02

Add the users to the group

sudo usermod -a -G FTP_Root_RW ftpuser01
sudo usermod -a -G FTP_Root_RW ftpuser02

sudo usermod -a -G FTP_Root_RW ftpadmin

Disable SSH Access for the FTP users.

Edit sshd_config

sudo nano /etc/ssh/sshd_config

Add the following line to the end of the file

DenyUsers ftpuser01 ftpuser02

Open ports on the VM Firewall

sudo firewall-cmd --permanent --add-port=20-21/tcp

#Allow the passive Port-Range we will define it later on the vsftpd.conf
sudo firewall-cmd --permanent --add-port=60000-65535/tcp

#Reload the ruleset
sudo firewall-cmd --reload

Setup the Second Disk for FTP DATA

Attach another disk to the VM and reboot if you haven’t done this already

lsblk to check the current disks and partitions detected by the system

lsblk 

Create the XFS partition

sudo mkfs.xfs /dev/sdb
# use mkfs.ext4 for ext4

Why XFS? https://access.redhat.com/articles/3129891

Create the folder for the mount point

sudo mkdir /FTP_DATA_DISK

Update the etc/fstab file and add the following line

sudo nano etc/fstab
/dev/sdb /FTP_DATA_DISK xfs defaults 1 2

Mount the disk

sudo mount -a

Testing

mount | grep sdb

Setup the VSFTPD Data and Log Folders

Setup the FTP Data folder

sudo mkdir /FTP_DATA_DISK/FTP_Root -p

Create the log directory

sudo mkdir /FTP_DATA_DISK/_logs/ -p

Set permissions

sudo chgrp -R FTP_Root_RW /FTP_DATA_DISK/FTP_Root/
sudo chmod 775 -R /FTP_DATA_DISK/FTP_Root/

Setup the VSFTPD Config File

Backup the default vsftpd.conf and create a newone

sudo mv /etc/vsftpd/vsftpd.conf /etc/vsftpd/vsftpdconfback
sudo nano /etc/vsftpd/vsftpd.conf
#KB Link - ####

anonymous_enable=NO
local_enable=YES
write_enable=YES
local_umask=002
dirmessage_enable=YES
ftpd_banner=Welcome to multicastbits Secure FTP service.
chroot_local_user=NO
chroot_list_enable=NO
chroot_list_file=/etc/vsftpd/chroot_list
listen=YES
listen_ipv6=NO

userlist_file=/etc/vsftpd/user_list
pam_service_name=vsftpd
userlist_enable=YES
userlist_deny=NO
listen_port=21
connect_from_port_20=YES
local_root=/FTP_DATA_DISK/FTP_Root/

xferlog_enable=YES
vsftpd_log_file=/FTP_DATA_DISK/_logs/vsftpd.log
log_ftp_protocol=YES
dirlist_enable=YES
download_enable=NO

pasv_enable=Yes
pasv_max_port=65535
pasv_min_port=60000

Add the FTP users to the userlist file

Backup the Original file

sudo mv /etc/vsftpd/user_list /etc/vsftpd/user_listBackup
echo "ftpuser01" | sudo tee -a /etc/vsftpd/user_list
echo "ftpuser02" | sudo tee -a /etc/vsftpd/user_list
sudo systemctl start vsftpd

sudo systemctl enable vsftpd

sudo systemctl status vsftpd

Setup SELinux

instead of putting our hands up and disabling SElinux, we are going to setup the policies correctly

Find the available policies using getsebool -a | grep ftp

getsebool -a | grep ftp

ftpd_anon_write --> off
ftpd_connect_all_unreserved --> off
ftpd_connect_db --> off
ftpd_full_access --> off
ftpd_use_cifs --> off
ftpd_use_fusefs --> off
ftpd_use_nfs --> off
ftpd_use_passive_mode --> off
httpd_can_connect_ftp --> off
httpd_enable_ftp_server --> off
tftp_anon_write --> off
tftp_home_dir --> off
[[email protected] _logs]$ 
[[email protected] _logs]$ 
[[email protected] _logs]$ getsebool -a | grep ftp
ftpd_anon_write --> off
ftpd_connect_all_unreserved --> off
ftpd_connect_db --> off
ftpd_full_access --> off
ftpd_use_cifs --> off
ftpd_use_fusefs --> off
ftpd_use_nfs --> off
ftpd_use_passive_mode --> off
httpd_can_connect_ftp --> off
httpd_enable_ftp_server --> off
tftp_anon_write --> off
tftp_home_dir --> off

Set SELinux boolean values

sudo setsebool -P ftpd_use_passive_mode on

sudo setsebool -P ftpd_use_cifs on

sudo setsebool -P ftpd_full_access 1

    "setsebool" is a tool for setting SELinux boolean values, which control various aspects of the SELinux policy.

    "-P" specifies that the boolean value should be set permanently, so that it persists across system reboots.

    "ftpd_use_passive_mode" is the name of the boolean value that should be set. This boolean value controls whether the vsftpd FTP server should use passive mode for data connections.

    "on" specifies that the boolean value should be set to "on", which means that vsftpd should use passive mode for data connections.

    Enable ftp_home_dir --> on if you are using chroot

Add a new file context rule to the system.

sudo semanage fcontext -a -t public_content_rw_t "/FTP_DATA_DISK/FTP_Root/(/.*)?"
    "fcontext" is short for "file context", which refers to the security context that is associated with a file or directory.

    "-a" specifies that a new file context rule should be added to the system.

    "-t" specifies the new file context type that should be assigned to files or directories that match the rule.

    "public_content_rw_t" is the name of the new file context type that should be assigned to files or directories that match the rule. In this case, "public_content_rw_t" is a predefined SELinux type that allows read and write access to files and directories in public directories, such as /var/www/html.

    "/FTP_DATA_DISK/FTP_Root/(/.)?" specifies the file path pattern that the rule should match. The pattern includes the "/FTP_DATA_DISK/FTP_Root/" directory and any subdirectories or files beneath it. The regular expression "/(.)?" matches any file or directory name that may follow the "/FTP_DATA_DISK/FTP_Root/" directory path.

In summary, this command sets the file context type for all files and directories under the "/FTP_DATA_DISK/FTP_Root/" directory and its subdirectories to "public_content_rw_t", which allows read and write access to these files and directories.

Reset the SELinux security context for all files and directories under the “/FTP_DATA_DISK/FTP_Root/”

sudo restorecon -Rvv /FTP_DATA_DISK/FTP_Root/
    "restorecon" is a tool that resets the SELinux security context for files and directories to their default values.

    "-R" specifies that the operation should be recursive, meaning that the security context should be reset for all files and directories under the specified directory.

    "-vv" specifies that the command should run in verbose mode, which provides more detailed output about the operation.

"/FTP_DATA_DISK/FTP_Root/" is the path of the directory whose security context should be reset.

Setup Fail2ban

Install fail2ban

sudo dnf install fail2ban

Create the jail.local file

This file is used to overwrite the config blocks in /etc/fail2ban/fail2ban.conf
sudo nano /etc/fail2ban/jail.local
vsftpd]
enabled = true
port = ftp,ftp-data,ftps,ftps-data
logpath = /FTP_DATA_DISK/_logs/vsftpd.log
maxretry = 5
bantime = 7200

Make sure to update the logpath directive to match the vsftpd log file we defined on the vsftpd.conf file

sudo systemctl start fail2ban

sudo systemctl enable fail2ban

sudo systemctl status fail2ban
journalctl -u fail2ban  will help you narrow down any issues with the service

Testing

sudo tail -f /var/log/fail2ban.log

Fail2ban injects and manages the following rich rules

Client will fail to connect using FTP until the ban is lifted

Remove the ban IP list

#get the list of banned IPs 
sudo fail2ban-client get vsftpd banned

#Remove a specific IP from the list 
sudo fail2ban-client set vsftpd unbanip <IP>

#Remove/Reset all the the banned IP lists
sudo fail2ban-client unban --all

This should get you up and running, For the demo I just used Unencrypted FTP on port 21 to keep things simple, Please utilize SFTP with the letsencrypt certificate for better security. i will cover this on another article and link it here

Change the location of the Docker overlay2 storage directory

If you found this page you already know why you are looking for this, your server /dev/mapper/cs-root is filled due to /var/lib/docker taking up most of the space

Yes, you can change the location of the Docker overlay2 storage directory by modifying the daemon.json file. Here’s how to do it:

Open or create the daemon.json file using a text editor:

sudo nano /etc/docker/daemon.json

{
    "data-root": "/path/to/new/location/docker"
}

Replace “/path/to/new/location/docker” with the path to the new location of the overlay2 directory.

If the file already contains other configuration settings, add the "data-root" setting to the file under the "storage-driver" setting:

{
    "storage-driver": "overlay2",
    "data-root": "/path/to/new/location/docker"
}

Save the file and Restart docker

sudo systemctl restart docker

Don’t forget to remove the old data

rm -rf /var/lib/docker/overlay2

PowerShell remoting (WinRM) over HTTPS using a AD CS PKI (CA) signed client Certificate

This is a guide to show you how to enroll your servers/desktops to allow powershell remoting (WINRM) over HTTPS

Assumptions

  • You have a working Root CA on the ADDS environment – Guide
  • CRL and AIA is configured properly – Guide
  • Root CA cert is pushed out to all Servers/Desktops – This happens by default

Contents

  1. Setup CA Certificate template
  2. Deploy Auto-enrolled Certificates via Group Policy
  3. Powershell logon script to set the WinRM listener
  4. Deploy the script as a logon script via Group Policy
  5. Testing
1 – Setup CA Certificate template to allow Client Servers/Desktops to checkout the certificate from the CA

Connect to the The Certification Authority Microsoft Management Console (MMC)

Navigate to Certificate Templates > Manage

On the “Certificate templates Console” window > Select Web Server > Duplicate Template

Under the new Template window Set the following attributes

General – Pick a Name and Validity Period – This is up to you

Compatibility – Set the compatibility attributes (You can leave this on the default values, It up to you)

Subject Name – Set ‘Subject Name’ attributes (Important)

Security – Add “Domain Computers” Security Group and Set the following permissions

  • Read – Allow
  • Enroll – Allow
  • Autoenroll – Allow

Click “OK” to save and close out of “Certificate template console”

Issue to the new template

Go back to the “The Certification Authority Microsoft Management Console” (MMC)

Under templates (Right click the empty space) > Select New > Certificate template to Issue

Under the Enable Certificate template window > Select the Template you just created

Allow few minutes for ADDS to replicate and pick up the changes with in the forest

2 – Deploy Auto-enrolled Certificates via Group Policy

Create a new GPO

Windows Settings > Security Settings > Public Key Policies/Certificate Services Client – Auto-Enrollment Settings

Link the GPO to the relevant OU with in your ADDS environment

Note – You can push out the root CA cert as a trusted root certificate with this same policy if you want to force computers to pick up the CA cert,

Testing

If you need to test it gpupdate/force or reboot your test machine, The Server VM/PC will pickup a certificate from ADCS PKI

3 – Powershell logon script to set the WINRM listener

Dry run

  • Setup the log file
  • Check for the Certificate matching the machines FQDN Auto-enrolled from AD CS
  • If exist
    • Set up the HTTPS WInRM listener and bind the certificate
    • Write log
  • else
    • Write log
#Malinda Rathnayake- 2020
#
#variable
$Date = Get-Date -Format "dd_MM_yy"
$port=5986
$SessionRunTime = Get-Date -Format "dd_yyyy_HH-mm"
#
#Setup Logs folder and log File
$ScriptVersion = '1.0'
$locallogPath = "C:\_Scripts\_Logs\WINRM_HTTPS_ListenerBinding"
#
$logging_Folder = (New-Item -Path $locallogPath -ItemType Directory -Name $Date -Force)
$ScriptSessionlogFile = New-Item $logging_Folder\ScriptSessionLog_$SessionRunTime.txt -Force
$ScriptSessionlogFilePath = $ScriptSessionlogFile.VersionInfo.FileName
#
#Check for the the auto-enrolled SSL Cert
$RootCA = "Company-Root-CA" #change This
$hostname = ([System.Net.Dns]::GetHostByName(($env:computerName))).Hostname
$certinfo = (Get-ChildItem -Path Cert:\LocalMachine\My\ |? {($_.Subject -Like "CN=$hostname") -and ($_.Issuer -Like "CN=$RootCA*")})
$certThumbprint = $certinfo.Thumbprint
#
#Script-------------------------------------------------------
#
#Remove the existing WInRM Listener if there is any
Get-ChildItem WSMan:\Localhost\Listener | Where -Property Keys -eq "Transport=HTTPS" | Remove-Item -Recurse -Force
#
#If the client certificate exists Setup the WinRM HTTPS listener with the cert else Write log
if ($certThumbprint){
#
New-Item -Path WSMan:\Localhost\Listener -Transport HTTPS -Address * -CertificateThumbprint $certThumbprint -HostName $hostname -Force
#
netsh advfirewall firewall add rule name="Windows Remote Management (HTTPS-In)" dir=in action=allow protocol=TCP localport=$port
#
Add-Content -Path $ScriptSessionlogFilePath -Value "Certbinding with the HTTPS WinRM HTTPS Listener Completed"
Add-Content -Path $ScriptSessionlogFilePath -Value "$certinfo.Subject"}
else{
Add-Content -Path $ScriptSessionlogFilePath -Value "No Cert matching the Server FQDN found, Please run gpupdate/force or reboot the system"
}

Script is commented with Explaining each section (should have done functions but i was pressed for time, never got around to do it, if you do fix it up and improve this please let me know in the comments :D)

5 – Deploy the script as a logon script via Group Policy

Setup a GPO and set this script as a logon Powershell script

Im using a user policy with GPO Loop-back processing set to Merge applied to the server OU

Testing

To confirm WinRM is listening on HTTPS, type the following commands:

winrm enumerate winrm/config/listener
Winrm get http://schemas.microsoft.com/wbem/wsman/1/config

Sources that helped me

https://docs.microsoft.com/en-us/troubleshoot/windows-client/system-management-components/configure-winrm-for-https

https://gmusumeci.medium.com/get-rid-of-those-annoying-self-signed-certificates-with-microsoft-certificate-services-part-3-9d4b8e819f45

http://vcloud-lab.com/entries/powershell/powershell-remoting-over-https-using-self-signed-ssl-certificate

ArubaOS CX Virtual Switching Extension – VSX Stacking Guide

What is VSX?

VSX is a cluster technology that allows the two VSX switches to run with independent control planes (OSPF/BGP) and present themselves as different routers in the network. In the datapath, however, they function as a single router and support active-active forwarding.

VSX allows you to mitigate inherent issues with a shared control plane that comes with traditional stacking while maintaining all the benefits

  • Control plane: Inter-Switch-Link and Keepalive
  • Data plane L2: MCLAGs
  • Data plane L3: Active gateway

This is a very similar technology compared to Dell VLT stacking with Dell OS10

Basic feature Comparison with Dell VLT Stacking

Dell VLT StackingAruba VSX
Supports Multi chassis Lag
independent control planes
All active-gateway configuration (L3 load balancing)✅(VLT Peer routing)(VSX Active forwarding)
Layer 3 Packet load balancing
Can Participate in Spanning tree MST/RSTP
Gateway IP Redundancy ✅VRRP✅(VSX Active Gateway or VRRP)

Setup Guide

What you need?

  • 10/25/40/100GE Port for the interswitch link
  • VSX supported switch, VSX is only supported on switches above CX6300 SKU
Switch SeriesVSX
CX 6200 seriesX
CX 6300 seriesX
CX 6400 series
CX 8200 series
CX 8320/8325 series
CX 8360 series
**Updated 2020-Dec

For this guide im using a 8325 series switch

Dry run

  • Setup LAG interface for the inter-switch link (ISL)
  • Create the VSX cluster
  • Setup a keepalive link and a new VRF for the keepalive traffic

Setup LAG interface for the inter-switch link (ISL)

In order to form the VSX cluster, we need a LAG interface for the inter switch communication

Naturally i pick the fastest ports on the switch to create this 10/25/40/100GE

Depending on what switch you have, The ISL bandwidth can be a limitation/Bottleneck, Account for this factor when designing a VSX based solution 
Utilize VSX-Activeforwarding or Active gateways to mitigate this

Create the LAG interface

This is a regular Port channel no special configurations, you need to create this on both switches

  • Native VLAN needs to be the default VLAN
  • Trunk port and All VLANs allowed
CORE01#

interface lag 256
no shutdown
description VSX-LAG
no routing
vlan trunk native 1 tag
vlan trunk allowed all
lacp mode active
exit


-------------------------------

CORE02#

interface lag 256
no shutdown
description VSX-LAG
no routing
vlan trunk native 1 tag
vlan trunk allowed all
lacp mode active
exit
Add/Assign the physical ports to the LAG interface

I’m using two 100GE ports for the ISL LAG

CORE01#

interface 1/1/55
no shutdown
lag 256
exit
interface 1/1/56
no shutdown
lag 256
exit

-------------------------------

CORE02#

interface 1/1/55
no shutdown
lag 256
exit
interface 1/1/56
no shutdown
lag 256
exit

Do the same configuration on the VSX Peer switch (Second Switch)

Connect the cables via DAC/Optical and confirm the Port-channel health

CORE01# sh lag 256
System-ID       : b8:d4:e7:d5:36:00
System-priority : 65534

Aggregate lag256 is up
 Admin state is up
 Description : VSX-LAG
 Type                        : normal
 MAC Address                 : b8:d4:e7:d5:36:00
 Aggregated-interfaces       : 1/1/55 1/1/56
 Aggregation-key             : 256
 Aggregate mode              : active
 Hash                        : l3-src-dst
 LACP rate                   : slow
 Speed                       : 200000 Mb/s
 Mode                        : trunk


-------------------------------------------------------------------

CORE02# sh lag 256
System-ID       : b8:d4:e7:d5:f3:00
System-priority : 65534

Aggregate lag256 is up
 Admin state is up
 Description : VSX-LAG
 Type                        : normal
 MAC Address                 : b8:d4:e7:d5:f3:00
 Aggregated-interfaces       : 1/1/55 1/1/56
 Aggregation-key             : 256
 Aggregate mode              : active
 Hash                        : l3-src-dst
 LACP rate                   : slow
 Speed                       : 200000 Mb/s
 Mode                        : trunk


Form the VSX Cluster

under the configuration mode, go in to the VSX context by entering “vsx” and issue the following commands on both switches

CORE01#

vsx
    inter-switch-link lag 256
    role primary
    linkup-delay-timer 30

-------------------------------

CORE02#

vsx
    inter-switch-link lag 256
    role secondary
    linkup-delay-timer 30

Check the VSX Status

CORE01# sh vsx status
VSX Operational State
---------------------
  ISL channel             : In-Sync
  ISL mgmt channel        : operational
  Config Sync Status      : In-Sync
  NAE                     : peer_reachable
  HTTPS Server            : peer_reachable

Attribute           Local               Peer
------------        --------            --------
ISL link            lag256              lag256
ISL version         2                   2
System MAC          b8:d4:e7:d5:36:00   b8:d4:e7:d5:f3:00
Platform            8325                8325
Software Version    GL.10.06.0001       GL.10.06.0001
Device Role         primary             secondary

----------------------------------------

CORE02# sh vsx status
VSX Operational State
---------------------
  ISL channel             : In-Sync
  ISL mgmt channel        : operational
  Config Sync Status      : In-Sync
  NAE                     : peer_reachable
  HTTPS Server            : peer_reachable

Attribute           Local               Peer
------------        --------            --------
ISL link            lag256              lag256
ISL version         2                   2
System MAC          b8:d4:e7:d5:f3:00   b8:d4:e7:d5:36:00
Platform            8325                8325
Software Version    GL.10.06.0001       GL.10.06.0001
Device Role         secondary           primary

Setup the Keepalive Link

its recommended to set up a Keepalive link to avoid Split-brain scenarios if the ISL goes down, We are trying to prevent both switches from thinking they are the active devices creating STP loops and other issues on the network

This is not a must-have, it’s nice to have, As of Aruba CX OS 10.06.x you need to sacrifice one of your data ports for this

Dell OS10 VLT archives this via the OOBM network ports, Supposedly Keepalive over OOBM is something Aruba is working on for future releases

Few things to note

  • It’s recommended using a routed port in a separate VRF for the keepalive link
  • can use a 1Gbps link for this if needed

Provision the port and VRF

CORE01#

vrf KEEPALIVE

interface 1/1/48
no shutdown
vrf attach KEEPALIVE
description VSX-keepalive-Link
ip address 192.168.168.1/24
exit

-----------------------------------------

CORE02#

vrf KEEPALIVE

interface 1/1/48
no shutdown
vrf attach KEEPALIVE
description VSX-keepalive-Link
ip address 192.168.168.2/24
exit


Define the Keepalive link

Note – Remember to define the vrf id in the keepalive statement

Thanks /u/illumynite for pointing that out

CORE01#

vsx
    inter-switch-link lag 256
    role primary
    keepalive peer 192.168.168.2 source 192.168.168.1 vrf KEEPALIVE
    linkup-delay-timer 30

-----------------------------------------

CORE02#

vsx
    inter-switch-link lag 256
    role secondary
    keepalive peer 192.168.168.1 source 192.168.168.2 vrf KEEPALIVE
    linkup-delay-timer 30

Next up…….

  • VSX MC-LAG
  • VSX Active forwarding
  • VSX Active gateway

References

AOS-CX 10.06 Virtual SwitchingExtension (VSX) Guide

As always if you notice any mistakes please do let me know in the comments

External Pi-hole with IPv6 – Setup a secured Pi-hole DNS service on Docker using Linode/AWS

Let me address the question of why I decided to put a DNS server (Pihole) exposed to the internet (not fully open but still).

I needed/wanted to set up an Umbrella/NextDNS/CF type DNS server that’s publicly accessible but secured to certain IP addresses.

Sure NextDNS is an option and its cheap with similar features, but i wanted roll my own solution so i can learn a few things along the way

I can easily set this up for my family members with minimal technical knowledge and unable to deal with another extra device (Raspberry pi) plugged into their home network.

This will also serve as a quick and dirty guide on how to use Docker compose and address some Issues with Running Pi-hole, Docker with UFW on Ubuntu 20.x

So lets get stahhhted…….

Scope

  • Setup Pi-hole as a docker container on a VM
  • Enable IPV6 support
  • Setup UFW rules to prune traffic and a cronjob to handle the rules to update with the dynamic WAN IPs
  • Deploy and test

What we need

  • Linux VM (Ubuntu, Hardened BSD, etc)
  • Docker and Docker Compose
  • Dynamic DNS service to track the changing IP (Dyndns,no-Ip, etc)

Deployment

Setup Dynamic DNS solution to track your Dynamic WAN IP

for this demo, we are going to use DynDNS since I already own a paid account and its supported on most platforms (Routers, UTMs, NAS devices, IP camera-DVRs, etc)

Use some google-fu there are multiple ways to do this without having to pay for the service, all we need is a DNS record that's up-to-date with your current Public IP address. 

For Network A and Network B, I’m going to use the routers built-in DDNS update features

Network A gateway – UDM Pro

Network B Gateway – Netgear R6230

Confirmation

Setup the VM with Docker-compose

Pick your service provider, you can and should be able to use a free tier VM for this since its just DNS

  • Linode
  • AWS lightsail
  • IBM cloud
  • Oracle cloud
  • Google Compute
  • Digital Ocean droplet

Make sure you have a dedicated (static) IPv4 and IPv6 address attached to the resource

For this deployment, I’m going to use a Linode – Nanode, due to their native IPv6 support and cause I prefer their platform for personal projects

Setup your Linode VM – Getting started Guide

SSH in to the VM or use weblish console

Update your packages and sources

sudo apt-get update 
install Docker and Docker Compose

Assuming you already have SSH access to the VM with a static IPv4 and IPv6 address

Guide to installing Docker Engine on Ubuntu

Guide to Installing Docker-Compose

Once you have this setup confirm the docker setup

docker-compose version

Setup the Pi-hole Docker Image

Lets Configure the docker networking side to fit our Needs

Create a Seperate Bridge network for the Pi-hole container

I guess you could use the default bridge network, but I like to create one to keep things organized and this way this service can be isolated from the other containers I have

docker network create --ipv6 --driver bridge --subnet "fd01::/64" Piholev6

verification

We will use this network later in docker compose

With the new ubuntu version 20.x, Systemd will start a local DNS stub client that runs on 127.0.0.53:53

which will prevent the container from starting. because Pi-hole binds to the same port UDP 53

we could disable the service but that breaks DNS resolution on the VM causing more headaches and pain for automation and updates

After some google fu and trickering around this this is the workaround i found.

  • Disable the stub-listener
  • Change the symlink to the /etc/resolved.conf to /run/systemd/resolve/resolv.conf
  • push the external name servers so the VM won’t look at loopback to resolve DNS
  • Restart systemd-resolved
Resolving Conflicts with the systemd-resolved stub listener

We need to disable the stub listener thats bound to port 53, as i mentioned before this breaks the local dns resolution we will fix it in a bit.

sudo nano /etc/systemd/resolved.conf

Find and uncomment the line “DNSStubListener=yes” and change it to “no”

After this we need to push the external DNS servers to the box, this setting is stored on the following file

/etc/resolv.conf
#     DO NOT EDIT THIS FILE BY HAND -- YOUR CHANGES WILL BE OVERWRITTEN
# 127.0.0.53 is the systemd-resolved stub resolver.
# run "systemd-resolve --status" to see details about the actual nameservers.

nameserver 127.0.0.53

But we cant manually update this file with out own DNS servers, lets investigate

Cartoon of a detective investigate following footprints | Premium ...
ls -l /etc/resolv.conf

its a symlink to the another system file

/run/systemd/resolve/stub-resolv.conf

When you take a look at the directory where that file resides, there are two files

When you look at the other file you will see that /run/systemd/resolve/resolv.conf is the one which really is carrying the external name servers

You still can’t manually edit This file, and it gets updated by whatever the IPs provided as DNS servers via DHCP. netplan will dictate the IPs based on the static DNS servers you configure on Netplan YAML file

i can see there two entries, and they are the default Linode DNS servers discovered via DHCP, I’m going to keep them as is, since they are good enough for my use case

If you want to use your own servers here – Follow this guide

 Lets change the symlink to this file instead of the stub-resolve.conf

$ sudo ln -sf /run/systemd/resolve/resolv.conf /etc/resolv.conf

Now that its pointing to the right file

Lets restart the systemd-resolved

systemctl restart systemd-resolved

Now you can resolve DNS and install packages, etc

Docker compose script file for the PI-Hole

sudo mkdir /Docker_Images/
sudo mkdir /Docker_Images/Piholev6/

Lets navigate to this directory and start setting up our environment

nano /Docker_Images/Piholev6/docker-compose.yml
version: '3.4'
services:

   Pihole:
    container_name: pihole_v6
    image: pihole/pihole:latest
    hostname: Multicastbits-DNSService
    ports:
      - "53:53/tcp"
      - "53:53/udp"
      - "8080:80/tcp"
      - "4343:443/tcp"
    environment:
      TZ: America/New_York
      DNS1: 1.1.1.1
      DNS2: 8.8.8.8
      WEBPASSWORD: F1ghtm4_Keng3n4sura
      ServerIP: 45.33.73.186
      enable_ipv6: "true"
      ServerIPv6: 2600:3c03::f03c:92ff:feb9:ea9c
    volumes:
       - '${ROOT}/pihole/etc-pihole/:/etc/pihole/'
       - '${ROOT}/pihole/etc-dnsmasq.d/:/etc/dnsmasq.d/'
    dns:
      - 127.0.0.1
      - 1.1.1.1
    cap_add:
      - NET_ADMIN
    restart: always

networks:
  default:
    external:
      name: Piholev6
networks:
  default:
    external:
      name: Piholev6

Lets break this down a littlebit

  • Version – Declare Docker compose version
  • container_name – This is the name of the container on the docker container registry
  • image – What image to pull from the Docker Hub
  • hostname – This is the host-name for the Docker container – this name will show up on your lookup when you are using this Pi-hole
  • ports – What ports should be NATed via the Docker Bridge to the host VM
  • TZ – Time Zone
  • DNS1 – DNS server used with in the image
  • DNS2 – DNS server used with in the image
  • WEBPASSWORD – Password for the Pi-Hole web console
  • ServerIP – Use the IPv4 address assigned to the VMs network interface(You need this for the Pi-Hole to respond on the IP for DNS queries)
  • IPv6 – Enable Disable IPv6 support
  • ServerIPv6 – Use the IPv4 address assigned to the VMs network interface (You need this for the Pi-Hole to respond on the IP for DNS queries)
  • volumes – These volumes will hold the configuration data so the container settings and historical data will persist reboots
  • cap_add:- NET_ADMIN – Add Linux capabilities to edit the network stack – link
  • restart: always – This will make sure the container gets restarted every time the VM boots up – Link
  • networks:default:external:name: Piholev6 – Set the container to use the network bridge we created before

Now lets bring up the Docker container

docker-compose up -d

-d switch will bring up the Docker container in the background

Run ‘Docker ps’ to confirm

Now you can access the web interface and use the Pihole

verifying its using the bridge network you created

Grab the network ID for the bridge network we create before and use the inspect switch to check the config

docker network ls
docker network inspect f7ba28db09ae

This will bring up the full configuration for the Linux bridge we created and the containers attached to the bridge will be visible under the “Containers”: tag

Testing

I manually configured my workstations primary DNS to the Pi-Hole IPs

Updating the docker Image

Pull the new image from the Registry

docker pull pihole/pihole

Take down the current container

docker-compose down

Run the new container

docker-compose up -d

Your settings will persist this update

Securing the install

now that we have a working Pi-Hole with IPv6 enabled, we can login and configure the Pihole server and resolve DNS as needed

but this is open to the public internet and will fall victim to DNS reflection attacks, etc

lets set up firewall rules and open up relevant ports (DNS, SSH, HTTPS) to the relevant IP addresses before we proceed

Disable IPtables from the docker daemon

Ubuntu uses UFW (uncomplicated firewall) as an obfuscation layer to make things easier for operators, but by default, Docker will open ports using IPtables with higher precedence, Rules added via UFW doesn’t take effect

So we need to tell docker not to do this when launching a container so we can manage the firewall rules via UFW

This file may not exist already if so nano will create it for you

sudo nano /etc/docker/daemon.json

Add the following lines to the file

{
"iptables": false
}

restart the docker services

sudo systemctl restart docker

now doing this might disrupt communication with the container until we allow them back in using UFW commands, so keep that in mind.

Automatically updating Firewall Rules based on the DYN DNS Host records

we are going to create a shell script and run it every hour using crontab

Shell Script Dry run

  • Get the IP from the DYNDNS Host records
  • remove/Cleanup existing rules
  • Add Default deny Rules
  • Add allow rules using the resolved IPs as the source

Dynamic IP addresses are updated on the following DNS records

  • trusted-Network01.selfip.net
  • trusted-Network02.selfip.net

Lets start by creating the script file under /bin/*

sudo touch /bin/PIHolefwruleupdate.sh
sudo chmod +x /bin/PIHolefwruleupdate.sh
sudo nano /bin/PIHolefwruleupdate.sh

now lets build the script

#!/bin/bash
PATH=/sbin:/bin:/usr/sbin:/usr/bin
now=$(date +"%m/%d/%T")
DYNDNSNetwork01="trusted-Network01.selfip.net"
DYNDNSNetwork02="trusted-Network02.selfip.com"
#Get the network IP using dig
Network01_CurrentIP=`dig +short $DYNDNSNetwork01`
Network02_CurrentIP=`dig +short $DYNDNSNetwork02`
echo "-----------------------------------------------------------------"
echo Network A WAN IP $Network01_CurrentIP
echo Network B WAN IP $Network02_CurrentIP
echo "Script Run time : $now"
echo "-----------------------------------------------------------------"
#update firewall Rules
#reset firewall rules
#
sudo ufw --force reset
#
#Re-enable Firewall
#
sudo ufw --force enable
#
#Enable inbound default Deny firewall Rules
#
sudo ufw default deny incoming
#
#add allow Rules to the relevant networks
#
sudo ufw allow from $Network01_CurrentIP to any port 22 proto tcp
sudo ufw allow from $Network01_CurrentIP to any port 8080 proto tcp
sudo ufw allow from $Network01_CurrentIP to any port 53 proto udp
sudo ufw allow from $Network02_CurrentIP to any port 53 proto udp
#add the ipV6 DNS allow all Rule - Working on finding an effective way to lock this down, with IPv6 rick is minimal
sudo ufw allow 53/udp
#find and delete the allow any to any IPv4 Rule for port 53
sudo ufw --force delete $(ufw status numbered | grep '53*.*Anywhere.' | grep -v v6 | awk -F"[][]" '{print $2}')
echo "--------------------end Script------------------------------"

Lets run the script to make sure its working

I used a online port scanner to confirm

Setup Scheduled job with logging

lets use crontab and setup a scheduled job to run this script every hour

Make sure the script is copied to the /bin folder with the executable permissions

using crontab -e (If you are launching this for the first time it will ask you to pick the editor, I picked Nano)

crontab -e

Add the following line

0 * * * * /bin/PIHolefwruleupdate.sh >> /var/log/PIHolefwruleupdate_Cronoutput.log 2>&1
Lets break this down
0 * * * *

this will run the script every time minutes hit zero which is usually every hour

/bin/PIHolefwruleupdate.sh

Script Path to execute

/var/log/PIHolefwruleupdate_Cronoutput.log 2>&1

Log file with errors captured