VRF Setup with route leaking guide Dell S4112F-ON – OS 10.5.1.3

Scope –

Create Three VRFs for Three separate clients

Create a Shared VRF

Leak routes from each VRF to the Shared_VRF

Logical overview

Create the VRFs

ip vrf Shared_VRF
ip vrf Tenant01_VRF
ip vrf Tenant02_VRF
ip vrf Tenant03_VRF

Create and initialize the Interfaces (SVI, Layer 3 interface, Loopback)

We are creating Layer 3 SVIs Per tenant

interface vlan200
 mode L3
 description Tenant01_NET01
 no shutdown
 ip vrf forwarding Tenant01_VRF
 ip address 10.251.100.254/24
!
interface vlan201
 mode L3
 description Tenant01_NET02
 no shutdown
 ip vrf forwarding Tenant01_VRF
 ip address 10.251.101.254/24
!
interface vlan210
 mode L3
 description Tenant02_NET01
 no shutdown
 ip vrf forwarding Tenant02_VRF
 ip address 172.17.100.254/24
!
interface vlan220
 mode L3
 description Tenant03_NET01
 no shutdown
 ip vrf forwarding Tenant03_VRF
 ip address 192.168.110.254/24
!
interface vlan250
 mode L3
 description OSPF_Routing
 no shutdown
 ip vrf forwarding Shared_VRF
 ip address 10.252.250.6/29

Confirmation

LABCORE# show ip interface brief
Interface Name            IP-Address          OK       Method       Status     Protocol
=========================================================================================
Vlan 200                   10.251.100.254/24   YES      manual       up          up
Vlan 201                   10.251.101.254/24   YES      manual       up          up
Vlan 210                   172.17.100.254/24   YES      manual       up          up
Vlan 220                   192.168.110.254/24  YES      manual       up          up
Vlan 250                   10.252.250.6/29     YES      manual       up          up
LABCORE# show ip vrf
VRF-Name                          Interfaces

Shared_VRF                        Vlan250

Tenant01_VRF                      Vlan200-201

Tenant02_VRF                      Vlan210

Tenant03_VRF                      Vlan220

default                           Vlan1

management                        Mgmt1/1/1

Route leaking

For this example, we are going to leak routes from each of the tenant VRFs into the Shared VRF

This design allows the VLANs within the VRFs to see each other, which can be a security issue. However, you can easily control this by

  • narrowing the routes down to hosts
  • using access lists (not ideal, but if you have a playbook you can program this in without any issues)

Real-world use cases may differ. Use this as a template for leaking routes between VRFs and update your config as needed

Create the route import and export statements within the VRFs

ip vrf Shared_VRF
 ip route-import 2:100
 ip route-import 3:100
 ip route-import 4:100
 ip route-export 1:100
ip vrf Tenant01_VRF
 ip route-export 2:100
 ip route-import 1:100
ip vrf Tenant02_VRF
 ip route-export 3:100
 ip route-import 1:100
ip vrf Tenant03_VRF
 ip route-export 4:100
 ip route-import 1:100

Let's explain this a bit

ip vrf Shared_VRF
 ip route-import 2:100 -----------> Import Leaked routes from target 2:100
 ip route-import 3:100 -----------> Import Leaked routes from target 3:100
 ip route-import 4:100 -----------> Import Leaked routes from target 4:100
 ip route-export 1:100  -----------> Export routes to target 1:100

If you need to control which routes get imported, use a route-map with prefix lists to filter them
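To make the import/export pairing easier to follow, here is a small Python sketch that models the statements above. The VRF names and targets are copied from the config; the dictionary layout and the import_sources helper are my own illustration and have nothing to do with how OS10 stores this internally.

```python
# Hypothetical model of the route-target statements shown above.
# Names and targets come from the config; the structure is only an illustration.
VRFS = {
    "Shared_VRF":   {"export": {"1:100"}, "import": {"2:100", "3:100", "4:100"}},
    "Tenant01_VRF": {"export": {"2:100"}, "import": {"1:100"}},
    "Tenant02_VRF": {"export": {"3:100"}, "import": {"1:100"}},
    "Tenant03_VRF": {"export": {"4:100"}, "import": {"1:100"}},
}


def import_sources(vrfs):
    """Map each VRF to the set of VRFs whose exported target it imports."""
    sources = {}
    for name, cfg in vrfs.items():
        sources[name] = {
            other
            for other, other_cfg in vrfs.items()
            if other != name and cfg["import"] & other_cfg["export"]
        }
    return sources


if __name__ == "__main__":
    for vrf, learned_from in sorted(import_sources(VRFS).items()):
        print(f"{vrf} imports from: {', '.join(sorted(learned_from)) or 'nothing'}")
```

The Shared VRF ends up importing from all three tenants, while each tenant imports only from the Shared VRF. No tenant imports another tenant's target directly.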

Setup static routes per VRF as needed

ip route vrf Tenant01_VRF 10.251.100.0/24 interface vlan200
ip route vrf Tenant01_VRF 10.251.101.0/24 interface vlan201
!
ip route vrf Tenant02_VRF 172.17.100.0/24 interface vlan210
!
ip route vrf Tenant03_VRF 192.168.110.0/24 interface vlan220
!
ip route vrf Shared_VRF 0.0.0.0/0 10.252.250.1 interface vlan250
  • These static routes will now be leaked to and learned by the Shared VRF
  • The default route on the Shared VRF will be learned downstream by the tenant VRFs
  • If you replace the default route on the Shared VRF with a route scoped to a specific IP or subnet, you can prevent traffic from routing between the tenant VRFs via the Shared VRF
  • If you need routes leaked directly between tenants, add the matching ip route-import statements on those VRFs as needed
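The point about the default route is easier to see with a quick longest-prefix-match sketch in Python. The route tables below are simplified lists of prefixes taken from this guide; the lookup helper, the example host address and the "scoped" table are my own assumptions for illustration, not output from the switch.

```python
import ipaddress


def lookup(table, destination):
    """Return the most specific prefix in `table` that contains `destination`, or None."""
    dest = ipaddress.ip_address(destination)
    matches = [ipaddress.ip_network(p) for p in table if dest in ipaddress.ip_network(p)]
    if not matches:
        return None
    return str(max(matches, key=lambda net: net.prefixlen))


# Tenant01_VRF as built in this guide: two connected networks plus the leaked default route.
tenant01_with_default = ["10.251.100.0/24", "10.251.101.0/24", "0.0.0.0/0"]

# Hypothetical alternative: the Shared VRF exports only its transit subnet, not a default route.
tenant01_scoped = ["10.251.100.0/24", "10.251.101.0/24", "10.252.250.0/29"]

tenant02_host = "172.17.100.10"  # example host address inside Tenant02_NET01

print(lookup(tenant01_with_default, tenant02_host))  # matches the default route
print(lookup(tenant01_scoped, tenant02_host))        # no matching route
```

With the default route in place, traffic from Tenant01 to a Tenant02 host still finds a match and gets handed to the Shared VRF, which knows the way to Tenant02. With the scoped route there is no match, so the tenants stay isolated from each other.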

Confirmation

Routes are being distributed via internal BGP process

LABCORE# show ip route vrf Tenant01_VRF
Codes: C - connected
       S - static
       B - BGP, IN - internal BGP, EX - external BGP, EV - EVPN BGP
       O - OSPF, IA - OSPF inter area, N1 - OSPF NSSA external type 1,
       N2 - OSPF NSSA external type 2, E1 - OSPF external type 1,
       E2 - OSPF external type 2, * - candidate default,
       + - summary route, > - non-active route
Gateway of last resort is via 10.252.250.1 to network 0.0.0.0
  Destination                 Gateway                                        Dist/Metric       Last Change
----------------------------------------------------------------------------------------------------------
  *B IN 0.0.0.0/0           via 10.252.250.1                                 200/0             12:17:42
  C     10.251.100.0/24     via 10.251.100.254       vlan200                 0/0               12:43:46
  C     10.251.101.0/24     via 10.251.101.254       vlan201                 0/0               12:43:46
LABCORE#
LABCORE# show ip route vrf Tenant02_VRF
Codes: C - connected
       S - static
       B - BGP, IN - internal BGP, EX - external BGP, EV - EVPN BGP
       O - OSPF, IA - OSPF inter area, N1 - OSPF NSSA external type 1,
       N2 - OSPF NSSA external type 2, E1 - OSPF external type 1,
       E2 - OSPF external type 2, * - candidate default,
       + - summary route, > - non-active route
Gateway of last resort is via 10.252.250.1 to network 0.0.0.0
  Destination                 Gateway                                        Dist/Metric       Last Change
----------------------------------------------------------------------------------------------------------
  *B IN 0.0.0.0/0           via 10.252.250.1                                 200/0             12:17:45
  C     172.17.100.0/24     via 172.17.100.254       vlan210                 0/0               12:43:49
LABCORE#
LABCORE# show ip route vrf Tenant03_VRF
Codes: C - connected
       S - static
       B - BGP, IN - internal BGP, EX - external BGP, EV - EVPN BGP
       O - OSPF, IA - OSPF inter area, N1 - OSPF NSSA external type 1,
       N2 - OSPF NSSA external type 2, E1 - OSPF external type 1,
       E2 - OSPF external type 2, * - candidate default,
       + - summary route, > - non-active route
Gateway of last resort is via 10.252.250.1 to network 0.0.0.0
  Destination                 Gateway                                        Dist/Metric       Last Change
----------------------------------------------------------------------------------------------------------
  *B IN 0.0.0.0/0           via 10.252.250.1                                 200/0             12:17:48
  C     192.168.110.0/24    via 192.168.110.254      vlan220                 0/0               12:43:52
LABCORE# show ip route vrf Shared_VRF
Codes: C - connected
       S - static
       B - BGP, IN - internal BGP, EX - external BGP, EV - EVPN BGP
       O - OSPF, IA - OSPF inter area, N1 - OSPF NSSA external type 1,
       N2 - OSPF NSSA external type 2, E1 - OSPF external type 1,
       E2 - OSPF external type 2, * - candidate default,
       + - summary route, > - non-active route
Gateway of last resort is via 10.252.250.1 to network 0.0.0.0
  Destination                 Gateway                                        Dist/Metric       Last Change
----------------------------------------------------------------------------------------------------------
  *S    0.0.0.0/0           via 10.252.250.1         vlan250                 1/0               12:21:33
  B  IN 10.251.100.0/24     Direct,Tenant01_VRF      vlan200                 200/0             09:01:28
  B  IN 10.251.101.0/24     Direct,Tenant01_VRF      vlan201                 200/0             09:01:28
  C     10.252.250.0/29     via 10.252.250.6         vlan250                 0/0               12:42:53
  B  IN 172.17.100.0/24     Direct,Tenant02_VRF      vlan210                 200/0             09:01:28
  B  IN 192.168.110.0/24    Direct,Tenant03_VRF      vlan220                 200/0             09:02:09

We can ping outside to the internet from the VRF IPs
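If you save the routing tables above to a text file, a few lines of Python can pull out the leaked routes for you. This is a rough sketch that assumes the column layout shown in this article; the parse_routes helper is my own and the sample rows are copied from the Shared_VRF table.

```python
SAMPLE = """\
  *S    0.0.0.0/0           via 10.252.250.1         vlan250                 1/0               12:21:33
  B  IN 10.251.100.0/24     Direct,Tenant01_VRF      vlan200                 200/0             09:01:28
  B  IN 10.251.101.0/24     Direct,Tenant01_VRF      vlan201                 200/0             09:01:28
  C     10.252.250.0/29     via 10.252.250.6         vlan250                 0/0               12:42:53
  B  IN 172.17.100.0/24     Direct,Tenant02_VRF      vlan210                 200/0             09:01:28
  B  IN 192.168.110.0/24    Direct,Tenant03_VRF      vlan220                 200/0             09:02:09
"""


def parse_routes(output):
    """Return a list of {'code', 'prefix', 'gateway'} dicts, one per route row."""
    routes = []
    for line in output.splitlines():
        tokens = line.split()
        # In a route row the prefix (e.g. 10.251.100.0/24) is the 2nd or 3rd token.
        index = next(
            (i for i, tok in enumerate(tokens[:3])
             if i > 0 and "/" in tok and tok[0].isdigit()),
            None,
        )
        if index is None:
            continue  # header, legend or prompt line
        routes.append({
            "code": " ".join(tokens[:index]).lstrip("*"),
            "prefix": tokens[index],
            # Everything between the prefix and the last two columns (Dist/Metric, Last Change).
            "gateway": " ".join(tokens[index + 1:-2]),
        })
    return routes


# "B IN" rows are the routes learned through the internal BGP process, i.e. the leaked ones.
leaked = [r for r in parse_routes(SAMPLE) if r["code"] == "B IN"]
for route in leaked:
    print(route["prefix"], "<-", route["gateway"])
```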

Redistribute leaked routes via IGP

You can use an internal BGP process to pick up routes from any VRF and redistribute them into other IGP processes as needed – Check the Article for that information

Vagrant Ansible LAB Guide – Bridged network

Here’s a quick guide to get you started with an “Ansible core lab” using Vagrant.

Alright, let’s get started

TLDR Version

  • Install Vagrant
  • Install VirtualBox
  • Create a project folder and cd into it
vagrant init
  • Vagrantfile – link
  • Vagrant Provisioning Shell Script to Deploy Ansible – link
  • Install the vagrant-vbguest plugin to deploy the missing guest tools
vagrant plugin install vagrant-vbguest
  • Bring up the Vagrant environment
vagrant up

Install Vagrant and Virtual box

For this demo we are using Windows 10 1909, but you can use the same guide for macOS

Windows

Download Vagrant and VirtualBox and install them the good old way –

https://www.vagrantup.com/downloads.html

https://www.virtualbox.org/wiki/Downloads

https://www.vagrantmanager.com/downloads/

Install the vagrant-vbguest plugin (We need this with newer versions of Ubuntu)

vagrant plugin install vagrant-vbguest

Or Using chocolatey

choco install vagrant
choco install virtualbox
choco install vagrant-manager

Install the vagrant-vbguest plugin (We need this with newer versions of Ubuntu)

vagrant plugin install vagrant-vbguest

macOS – using Homebrew Cask

Install virtual box

$ brew cask install virtualbox

Now install Vagrant either from the website or use homebrew for installing it.

$ brew cask install vagrant

Vagrant-Manager is a nice way to manage all your virtual machines in one place directly from the menu bar.

$ brew cask install vagrant-manager

Install the vagrant-vbguest plugin (We need this with newer versions of Ubuntu)

vagrant plugin install vagrant-vbguest

Setup the Vagrant Environment

Open Powershell

To get started, let’s check our environment

vagrant version

Create a project directory and Initialize the environment

For the project directory I’m using D:\vagrant

Open powershell and run

mkdir D:\vagrant
cd D:\vagrant

Initialize the environment under the project folder

vagrant init

This will create two items

.vagrant – Hidden folder holding Base Machines and meta data

Vagrantfile – Vagrant config file

Lets Create the Vagrantfile to deploy the VMs

https://www.vagrantup.com/docs/vagrantfile/

Vagrantfiles are written in Ruby syntax, which gives us a lot of flexibility to program logic into the file

I’m using Atom to edit the Vagrantfile

Vagrant.configure("2") do |config|
  # Ansible controller VM
  config.vm.define "controller" do |controller|
    controller.vm.box = "ubuntu/trusty64"
    controller.vm.hostname = "LAB-Controller"
    controller.vm.network "public_network", bridge: "Intel(R) I211 Gigabit Network Connection", ip: "172.17.10.120"
    controller.vm.provider "virtualbox" do |vb|
      vb.memory = "2048"
    end
    controller.vm.provision :shell, path: 'Ansible_LAB_setup.sh'
  end

  # Three identical member nodes: vls-node1 to vls-node3
  (1..3).each do |i|
    config.vm.define "vls-node#{i}" do |node|
      node.vm.box = "ubuntu/trusty64"
      node.vm.hostname = "vls-node#{i}"
      node.vm.network "public_network", bridge: "Intel(R) I211 Gigabit Network Connection", ip: "172.17.10.12#{i}"
      node.vm.provider "virtualbox" do |vb|
        vb.memory = "1024"
      end
    end
  end
end

You can grab the code from my Repo

https://github.com/malindarathnayake/Ansible_Vagrant_LAB/blob/master/Vagrantfile

Let’s talk a little bit about this code and unpack this

Vagrant API version

Vagrant uses API versions for its configuration file; this is how it stays backward compatible. So in every Vagrantfile we need to specify which version to use. The current one is version 2, which works with Vagrant 1.1 and up.

Provisioning the Ansible VM

This will

  • Provision the controller Ubuntu VM
  • Create a bridged network adapter
  • Set the host-name – LAB-Controller
  • Set the static IP – 172.17.10.120/24
  • Run the Shell script that installs Ansible using apt-get install (We will get to this below)

Let’s start digging in…

Specifying the Controller VM Name, base box and hostname

Vagrant uses a base image to clone a virtual machine quickly. These base images are known as “boxes” in Vagrant, and specifying the box to use for your Vagrant environment is always the first step after creating a new Vagrantfile.

You can find different base boxes from app.vagrantup.com

Or you can create custom base boxes for pretty much anything including “CiscoVIRL(CML)” images – keep an eye out for the next article on this

Network configurations

controller.vm.network "public_network", bridge: "Intel(R) I211 Gigabit Network Connection", ip: "your IP"

In this case, we are asking it to create a bridged adapter using the Intel(R) I211 NIC and set the IP address defined under the ip attribute

You can find the relevant interface name using

get-netadapter

You can also create a host-only private network

controller.vm.network :private_network, ip: "10.0.0.10"

For more info, check out the networking section in the KB

https://www.vagrantup.com/docs/networking/

Define the provider and VM resources

We are declaring VirtualBox (which we installed earlier) as the provider and setting the VM memory to 2048 MB

You can get more granular with this, refer to the below KB

https://www.vagrantup.com/docs/virtualbox/configuration.html

Define the shell script to customize the VM config and install the Ansible Package

This is where we define the provisioning shell script

This script installs Ansible and sets the hosts file entries to make your life easier

In case you are wondering, VLS stands for V – virtual, L – Linux, S – server.

I use this naming scheme for my VMs. Feel free to use anything you want; just make sure it matches what you defined in the Vagrantfile under node.vm.hostname

#!/bin/bash
# Install Ansible from the Ansible PPA
sudo apt-get update
sudo apt-get install software-properties-common -y
sudo apt-add-repository -y ppa:ansible/ansible
sudo apt-get update
sudo apt-get install ansible -y
# Add the lab hosts to /etc/hosts so the VMs can reach each other by name
echo "
172.17.10.120 LAB-Controller
172.17.10.121 vls-node1
172.17.10.122 vls-node2
172.17.10.123 vls-node3" >> /etc/hosts

Create this file and save it as Ansible_LAB_setup.sh in the project folder

In this case I’m going to save it under D:\vagrant

You can also do this inline with a script block instead of using a separate file

https://www.vagrantup.com/docs/provisioning/basic_usage.html

Provisioning the Member servers for the lab

We covered most of the code above. The only difference here is that we are using the each method to create 3 VMs from the same template (I’m lazy and it’s more convenient)

This will create three Ubuntu VMs with the following hostnames and IP addresses. You should update these values to match your LAN, or use a private adapter

vls-node1 – 172.17.10.121

vls-node2 – 172.17.10.122

vls-node3 – 172.17.10.123
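Before you change these values for your own LAN, it can help to sanity-check the plan. Here is a small Python sketch that mirrors the naming and addressing used in the Vagrantfile and checks that every address sits inside the lab subnet with no duplicates. The lab_hosts and check helpers are my own and not part of Vagrant; the 172.17.10.0/24 subnet is the one used in this guide.

```python
import ipaddress

LAN = ipaddress.ip_network("172.17.10.0/24")  # subnet from this guide; change to match your LAN


def lab_hosts():
    """Return (hostname, ip) pairs mirroring the Vagrantfile: one controller and three nodes."""
    hosts = [("LAB-Controller", "172.17.10.120")]
    hosts += [(f"vls-node{i}", f"172.17.10.12{i}") for i in range(1, 4)]
    return hosts


def check(hosts, lan):
    """Raise ValueError if an address is duplicated or falls outside the LAN subnet."""
    seen = set()
    for name, ip in hosts:
        addr = ipaddress.ip_address(ip)
        if addr not in lan:
            raise ValueError(f"{name}: {ip} is outside {lan}")
        if addr in seen:
            raise ValueError(f"{name}: {ip} is already assigned")
        seen.add(addr)


hosts = lab_hosts()
check(hosts, LAN)
for name, ip in hosts:
    print(f"{ip} {name}")  # same layout as the /etc/hosts entries in the setup script
```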

So now that we are done with explaining the code, let’s run this

Building the Lab environment using Vagrant

Issue the following command to check your syntax

vagrant status

Issue the following command to bring up the environment

vagrant up

If you get an error about hardware virtualization at this point, reboot into the UEFI setup and make sure virtualization is enabled

Intel – VT-x

AMD Ryzen – SVM

If everything is kumbaya you will see vagrant firing up the deployment

It will provision 4 VMs as we specified

Notice since we have the “vagrant-vbguest” plugin installed, it will reinstall the relevant guest tools along with the dependencies for the OS

==> vls-node3: Machine booted and ready!
[vls-node3] No Virtualbox Guest Additions installation found.
rmmod: ERROR: Module vboxsf is not currently loaded
rmmod: ERROR: Module vboxguest is not currently loaded
Reading package lists...
Building dependency tree...
Reading state information...
Package 'virtualbox-guest-x11' is not installed, so not removed
The following packages will be REMOVED:
  virtualbox-guest-utils*
0 upgraded, 0 newly installed, 1 to remove and 0 not upgraded.
After this operation, 5799 kB disk space will be freed.
(Reading database ... 61617 files and directories currently installed.)
Removing virtualbox-guest-utils (6.0.14-dfsg-1) ...
Processing triggers for man-db (2.8.7-3) ...
(Reading database ... 61604 files and directories currently installed.)
Purging configuration files for virtualbox-guest-utils (6.0.14-dfsg-1) ...
Processing triggers for systemd (242-7ubuntu3.7) ...
Reading package lists...
Building dependency tree...
Reading state information...
linux-headers-5.3.0-51-generic is already the newest version (5.3.0-51.44).
linux-headers-5.3.0-51-generic set to manually installed.

Check the status

vagrant status

Testing

Connecting via SSH to your VMs

vagrant ssh controller

“controller” is the VM name we defined before, not the hostname. You can find it by running vagrant status in PowerShell or your terminal

We are going to connect to our controller and check everything

A little more information on the networking side

Vagrant adds two interfaces to each VM

NIC 1 – NAT’d to the host (control plane for Vagrant to manage the VMs)

NIC 2 – Bridged adapter we provisioned in the Vagrantfile with the IP address

The default route is set via the private (NAT’d) interface (you can’t change it)

Netplan configs

Vagrant creates a custom netplan yaml for interface configs


Destroy/Tear-down the environment

vagrant destroy -f

https://www.vagrantup.com/intro/getting-started/teardown.html

I hope this helped someone. When I started with Vagrant a few years back, it took me a few tries to figure out the system and the logic behind it. This should give you a basic understanding of how things are plugged together.

Let me know in the comments if you see any issues or mistakes.

Until next time…

Azure AD Sync Connect No-Start-Connection status

Issue

Received the following error from Azure AD stating that Password Synchronization was not working on the tenant.

When I manually initiate a delta sync, I see the following logs

"The Specified Domain either does not exist or could not be contacted"


Checked the following

  • Restarted ADsync Services
  • Resolve the ADDS Domain FQDN and DNS – Working
  • Test required ports for AD-sync using portqry – issues with the Primary ADDS server defined on the DNS values

Root Cause

It turns out the domain controller defined as the primary DNS server was going through updates. It was still responding on DNS but didn’t return any data (brown-out state)

Assumption

Because the primary DNS server still accepts connections, Windows doesn’t move on to the secondary and tertiary servers defined under DNS servers.

This might also happen if you are using an ADDS server over a S2S tunnel/MPLS when the latency goes high

Resolution

Make sure the ADDS DNS servers defined on the AD-Sync server are alive and responding

In my case I just updated the “Primary” DNS value with the Umbrella appliance IP (this acts as a proxy and handles the fail-over)
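A server in this brown-out state still answers, so a plain ping or port test won't catch it. Here is a rough Python sketch that sends a DNS query straight to one server and only counts it as healthy if the reply actually carries records. It uses nothing but the standard library; the function names are my own, and it only handles a simple A-record query over UDP, so treat it as a starting point rather than a full health check.

```python
import socket
import struct


def build_query(name, query_id=0x1234):
    """Build a minimal DNS query packet asking for the A record of `name`."""
    # Header: ID, flags (recursion desired), 1 question, 0 answer/authority/additional.
    header = struct.pack("!HHHHHH", query_id, 0x0100, 1, 0, 0, 0)
    qname = b"".join(
        bytes([len(label)]) + label.encode("ascii")
        for label in name.rstrip(".").split(".")
    ) + b"\x00"
    return header + qname + struct.pack("!HH", 1, 1)  # QTYPE=A, QCLASS=IN


def answer_count(response):
    """Return (rcode, number of answer records) from a raw DNS response."""
    _id, flags, _qd, ancount, _ns, _ar = struct.unpack("!HHHHHH", response[:12])
    return flags & 0x000F, ancount


def server_returns_data(server, name, timeout=2.0):
    """True only if `server` replies to the query AND the reply contains answer records."""
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.settimeout(timeout)
        try:
            sock.sendto(build_query(name), (server, 53))
            response, _ = sock.recvfrom(512)
        except OSError:
            return False  # timed out or unreachable
    if len(response) < 12:
        return False  # too short to be a valid DNS reply
    rcode, ancount = answer_count(response)
    return rcode == 0 and ancount > 0
```

You would call it once per DNS server configured on the AD-Sync box, for example server_returns_data("10.0.0.10", "corp.example.com"), where both the IP and the domain name are placeholders for your own values.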

Hybrid Exchange mailbox On-boarding : Target user already has a primary mailbox – Fix

During an Office 365 migration in a hybrid environment with AAD Connect, I ran into the following scenario:

  • Hybrid Co-Existence Environment with AAD-Sync
  • User [email protected] has a mailbox on-premises. Jon is represented as a Mail User in the cloud with an office 365 license
  • [email protected] had a cloud-only mailbox prior to the initial AD-sync was run
  • A user account is registered as a mail-user and has a valid license attached
  • During the Office 365 remote mailbox move, we end up with the following error during validation. Removing the immutable ID and remapping to the on-premises account won’t fix the issue
Target user 'Sam fisher' already has a primary mailbox.
+ CategoryInfo : InvalidArgument: (tsu:MailboxOrMailUserIdParameter) [New-MoveRequest], RecipientTaskException
+ FullyQualifiedErrorId : [Server=Pl-EX001,RequestId=19e90208-e39d-42bc-bde3-ee0db6375b8a,TimeStamp=11/6/2019 4:10:43 PM] [FailureCategory=Cmdlet-RecipientTaskException] 9418C1E1,Microsoft.Exchange.Management.Migration.MailboxRep
lication.MoveRequest.NewMoveRequest
+ PSComputerName : Pl-ex001.Paladin.org

It turns out this happens due to an unclean cloud object in MSOL. Exchange Online keeps pointers indicating that there used to be a mailbox in the cloud for this user

Option 1 (nuclear option)

One way to fix this problem is to delete the *MSOL User Object* for Sam and re-sync it from on-premises. This would delete [email protected] from the cloud – but it would delete him/her from all workloads, not only Exchange. This is problematic because Sam is already using Teams, OneDrive and SharePoint.

Option 2

Clean up only the office 365 mailbox pointer information

PS C:\> Set-User [email protected] -PermanentlyClearPreviousMailboxInfo 
Confirm
Are you sure you want to perform this action?
Delete all existing information about user "[email protected]"?. This operation will clear existing values from
Previous home MDB and Previous Mailbox GUID of the user. After deletion, reconnecting to the previous mailbox that
existed in the cloud will not be possible and any content it had will be unrecoverable PERMANENTLY. Do you want to
continue?
[Y] Yes [A] Yes to All [N] No [L] No to All [?] Help (default is "Y"): a

Executing this leaves you with a clean object without the duplicate-mailbox problem.

In some cases when you run this command you will get the following output

 “Command completed successfully, but no user settings were changed.”

If this happens

Remove the license from the user temporarily and run the command again to remove the previous mailbox data

Then you can re-add the license

 

Upgrading VMware ESXi Hosts using vCenter Update Manager Baseline (6.5 to 6.7 Update 2)

Update Manager has been bundled in the vCenter Server Appliance since version 6.5. It’s a plug-in that runs on the vSphere Web Client. We can use the component to

  • patch/upgrade hosts
  • deploy .vib files within vCenter
  • scan your vCenter environment and report on any out-of-compliance hosts

Hardcore/Experienced VMware operators will scoff at this article, but I have seen many organizations still using ILO/IDRAC to mount an ISO to update hosts and they have no idea this function even exists.

Now that’s out of the way, let’s get to the how-to part of this

In vCenter, click “Menu” and drill down to “Update Manager”

This blade will show you all the nerd knobs and an overview of your current updates and compliance levels

Click on the “Baselines” Tab

You will have two predefined baselines for security patches created by vCenter; let’s keep those aside for now

Navigate to the “ESXi Images” Tab, and Click “Import”

Once the Upload is complete, Click on “New Baseline”

Fill in a Name and Description that make sense to anyone who logs in and click Next

Select the image you just uploaded on the next screen, then continue through the wizard and complete it

Note – If you have other 3rd party software for ESXi, you can create separate baselines for those and use baseline groups to push out upgrades and VIB files at the same time

Now click “Menu” and navigate back to “Hosts and Clusters”

You can now apply the baseline at various levels within the vCenter hierarchy

vCenter | DataCenter | Cluster | Host

Depending on your use case pick the right level

Excerpt from the KB 

For ESXi hosts in a cluster, the remediation process is sequential by default. With Update Manager, you can select to run host remediation in parallel.

When you remediate a cluster of hosts sequentially and one of the hosts fails to enter maintenance mode, Update Manager reports an error, and the process stops and fails. The hosts in the cluster that are remediated stay at the updated level. The ones that are not remediated after the failed host remediation are not updated. If a host in a DRS enabled cluster runs a virtual machine on which Update Manager or vCenter Server are installed, DRS first attempts to migrate the virtual machine running vCenter Server or Update Manager to another host so that the remediation succeeds. In case the virtual machine cannot be migrated to another host, the remediation fails for the host, but the process does not stop. Update Manager proceeds to remediate the next host in the cluster.

The host upgrade remediation of ESXi hosts in a cluster proceeds only if all hosts in the cluster can be upgraded.

Remediation of hosts in a cluster requires that you temporarily disable cluster features such as VMware DPM and HA admission control. Also, turn off FT if it is enabled on any of the virtual machines on a host, and disconnect the removable devices connected to the virtual machines on a host, so that they can be migrated with vMotion. Before you start a remediation process, you can generate a report that shows which cluster, host, or virtual machine has the cluster features enabled.

Link to KB on Remediation


Moving on: for this example, since I have only 2 hosts, we are going to apply the baseline at the cluster level but run the remediation at the host level

Host 1 > Enter Maintenance > Remediation > Update complete and online

Host 2 > Enter Maintenance > Remediation > Update complete and online

Select the cluster, click the “Updates” tab and click “Attach” in the Attached Baselines section

Select and attach the baseline we created before

Click “Check Compliance” to scan and get a report

Select the host in the cluster, enter maintenance mode

Click “REMEDIATE” to start the upgrade. (If you do this at the cluster level and you have DRS, Update Manager will update each node)

This will reboot the host and go through the update process

Foot Notes –

You might run into the following issue

“vCenter cannot deploy Host upgrade agent to host”

Cause 1

The scratch partition is full. Use vCenter to change the scratch folder location

VMWARE KB

Creating a persistent scratch location for ESXi  – https://kb.vmware.com/s/article/1033696

Cause 2

Hardware is not compatible.

I had this issue due to 6.7 dropping support for an LSI RAID card on older firmware. You need to do some footwork and check the log files to figure out why it’s failing

VMware HCL – Link

ESXi and vCenter log file locations – link

“System logs on hosts are stored on non-persistent storage” message on vCenter

Ran into this pesky little error message recently in a vCenter environment

If the logs are stored on a local scratch disk, vCenter will display an alert stating –  “System logs on host xxx are stored on non-persistent storage”

Configure ESXi Syslog location – vSphere Web Client

vCenter > Select “Host” > Configure > Advanced System Settings

Click on Edit and search for “Syslog.global.logDir”

Edit the value and in this case, I’m going to use the local data store (Localhost_DataStore01) to store the syslogs.

You can also define a remote syslog server using the “Syslog.global.logHost” setting

Configure ESXi Syslog location – ESXCLI

SSH into the host

Check the current location

esxcli system syslog config get

*logs stored on the local scratch disk

Manually Set the Path

esxcli system syslog config set --logdir=/vmfs/directory/path

you can find the VMFS volume names/UUIDs under  –

/vmfs/volumes

remote syslog server can be set using

esxcli system syslog config set --loghost='tcp://hostname:port'

Load the configuration changes with the syslog reload command

esxcli system syslog reload

The logs will immediately begin populating the specified location.
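If you script this across many hosts, it is worth checking the loghost value before pushing it, since a copy-paste with a wrong protocol or a non-numeric port is easy to miss. Below is a small Python sketch for that. The parse_loghost helper is my own, and the list of accepted protocols (udp, tcp, ssl) is my assumption based on the ESXi syslog documentation, so verify it against your ESXi version.

```python
from urllib.parse import urlsplit

ALLOWED_SCHEMES = {"udp", "tcp", "ssl"}  # assumption: protocols accepted by ESXi syslog


def parse_loghost(value):
    """Split a loghost value such as 'tcp://hostname:514' into (scheme, host, port).

    Raises ValueError for an unknown protocol, a missing hostname or a non-numeric port.
    """
    parts = urlsplit(value)
    if parts.scheme not in ALLOWED_SCHEMES:
        raise ValueError(f"unsupported protocol: {parts.scheme!r}")
    if not parts.hostname:
        raise ValueError("missing hostname")
    return parts.scheme, parts.hostname, parts.port  # .port raises ValueError if not a number


print(parse_loghost("tcp://syslog.example.com:514"))  # hypothetical syslog server
```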

Unable to upgrade vCenter 6.5/6.7 to U2: Root password expired

As part of my pre-flight check for vCenter upgrades, I like to mount the ISO and go through the first 3 steps. During this I noticed the installer could not connect to the source appliance and logged this error

2019-05-01T20:05:02.052Z - info: Stream :: close
2019-05-01T20:05:02.052Z - info: Password not expired
2019-05-01T20:05:02.054Z - error: sourcePrecheck: error in getting source Info: ServerFaultCode: Failed to authenticate with the guest operating system using the supplied credentials.
2019-05-01T20:05:03.328Z - error: Request timed out after 30000 ms, url: https://vcenter.companyABC.local:443/
2019-05-01T20:05:09.675Z - info: Log file was saved at: C:\Users\MCbits\Desktop\installer-20190501-160025555.log

Trying to reset the password via the admin interface or the DCUI didn’t work. After digging around, I found a way to reset it by forcing the vCenter appliance to boot into single user mode

Procedure:

  1. Take a snapshot or backup of the vCenter Server Appliance before proceeding. Do not skip this step.
  2. Reboot the vCenter Server Appliance.
  3. After the OS starts, press the e key to enter the GNU GRUB Edit Menu.
  4. Locate the line that begins with the word linux.
  5. Append these entries to the end of the line: rw init=/bin/bash

After adding the statement, press F10 to continue booting

The vCenter appliance will boot into single user mode

Type passwd to reset the root password

if you run into the following error message

"Authentication token lock busy"

you need to remount the root filesystem in read-write mode (a remount lets you switch between read-only and read-write). This will allow you to make changes

mount -o remount,rw /

Until next time !!!

 

Guide – Secure UniFi Cloud Controller on AWS lightsail signed with Lets-encrypt SSL

I found a solution for how to navigate cloud key issues and wanted to set up a ZTP solution for Unifi hardware so I can direct ship equipment to the site, and provision it securely via internet without having to stand up a L2L tunnel.

Alright, let’s get started…

This guide is applicable for any Ubuntu based install, but I’m going to utilize Amazon Lightsail for the demo, since at the time of writing, it’s the cheapest option I can find with enough compute resources and a static IP included.

2 GB RAM, 1 vCPU, 60 GB SSD

OpEx (recurring cost) – $10 per month – as of February 2019

Guide

Dry Run

1. Set up Lightsail instance
2. Create and attach static IP
3. Open necessary ports
4. Set up UniFi packages
5. Set up SSL using certbot and letsencrypt
6. Add the certs to the UniFi controller
7. Set up a cron job for SSL auto-renewal
8. Adopting UniFi devices

1. Set up LightSail instance

Login to – https://lightsail.aws.amazon.com

Spin up a Lightsail instance:

Set a name for the instance and provision it.

2. Create and attach static IP

Click on the instance name and click on the networking tab:

Click “Create Static IP”:

3. Open necessary ports

TCP or UDP    Port Number    Usage
TCP           80             Port used inform-URL for adoption.
TCP           443            Port used for Cloud Access service.
UDP           3478           Port used for STUN.
TCP           8080           Port used for device and controller communication.
TCP           8443           Port used for controller GUI/API as seen in a web browser.
TCP           8880           Port used for HTTP portal redirection.
TCP           8843           Port used for HTTPS portal redirection.

You can disable or lock down the ports as needed using iptables, depending on your security posture
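Once the controller is installed, a quick way to confirm the firewall rules from outside is a simple TCP connect test. Here is a minimal Python sketch using only the standard library. The port list comes from the table above; the function names are my own, and UDP 3478 is left out because a TCP connect can't verify a UDP port.

```python
import socket

# TCP ports from the table above; UDP 3478 (STUN) cannot be verified with a TCP connect.
UNIFI_TCP_PORTS = [80, 443, 8080, 8443, 8880, 8843]


def tcp_port_open(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port succeeds within `timeout` seconds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False  # refused, filtered, unreachable or timed out


def check_ports(host, ports=UNIFI_TCP_PORTS):
    """Map each port to True (reachable) or False (closed, filtered or timed out)."""
    return {port: tcp_port_open(host, port) for port in ports}
```

Run check_ports against your static IP from a machine outside AWS. Keep in mind that a port only shows as open when something is actually listening on it, so run this after the controller is up.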

Post spotlight-

https://www.lammertbies.nl/comm/info/iptables.html#intr

4. Set up UniFi packages

https://help.ubnt.com/hc/en-us/articles/209376117-UniFi-Install-a-UniFi-Cloud-Controller-on-Amazon-Web-Services#7

Add the Ubiquiti repository to /etc/apt/sources.list:
echo "deb http://www.ubnt.com/downloads/unifi/debian stable ubiquiti" | sudo tee -a /etc/apt/sources.list
Add the Ubiquiti GPG Key:
sudo apt-key adv --keyserver keyserver.ubuntu.com --recv 06E85760C0A52C50

Update the server’s repository information:

sudo apt-get update

Install Java 8 runtime

You need Java Runtime 8 to run the UniFi Controller.

Add the Oracle Java PPA (Personal Package Archive) to your list of sources so that Ubuntu knows where to check for updates. Use the add-apt-repository command for that.

sudo add-apt-repository ppa:webupd8team/java -y
sudo apt install java-common oracle-java8-installer

Update your package repository by issuing the following command:

sudo apt-get update

The oracle-java8-set-default package will automatically set Oracle JDK8 as default. Once the installation is complete we can check Java version.

java -version

java version "1.8.0_191"

MongoDB

Add the MongoDB signing key and the 3.4 repository:

sudo apt-key adv --keyserver hkp://keyserver.ubuntu.com:80 --recv 0C49F3730359A14518585931BC711F9BA15703C6

echo "deb http://repo.mongodb.org/apt/ubuntu xenial/mongodb-org/3.4 multiverse" | sudo tee /etc/apt/sources.list.d/mongodb-org-3.4.list

Retrieve the latest package information:

sudo apt update

sudo apt-get install apt-transport-https

Install UniFi Controller packages.

sudo apt install unifi

You should now be able to access the web interface and go through the initial setup wizard.

https://yourIPaddress:8443

5. Set up SSL using certbot and Let's Encrypt

Let's get that green lock up in here, shall we?

So, a few things to note here… UniFi doesn’t really have a straightforward way to import certificates. You have to use the Java keystore commands to import the cert, but there is a very handy script built by Steve Jenkins that makes this super easy.

First, we need to request a cert and have it signed by the Let's Encrypt certificate authority.

Let’s start by adding the repository and installing the EFF certbot package:

sudo apt-get update
sudo apt-get install software-properties-common 
sudo add-apt-repository universe 
sudo add-apt-repository ppa:certbot/certbot 
sudo apt-get update 
sudo apt-get install certbot

5.1 Update/add your DNS record and make sure it has propagated (this is important)

Note - The DNS name should point to the static IP we attached to our Lightsail instance.
I'm going to use the following A record for this example:

unifyctrl01.multicastbits.com

Ping from the controller and make sure the server can resolve it.

ping unifyctrl01.multicastbits.com


You won't see any echo replies because ICMP is not allowed by the firewall rules in AWS. Leave it as is; we just need to see the DNS A record resolving to the static IP.

5.2 Request the certificate

Issue the following command to start certbot in certonly mode

sudo certbot certonly
usage: 
  certbot [SUBCOMMAND] [options] [-d DOMAIN] [-d DOMAIN] ...

Certbot can obtain and install HTTPS/TLS/SSL certificates. By default,
it will attempt to use a webserver both for obtaining and installing the
certificate. The most common SUBCOMMANDS and flags are:

obtain, install, and renew certificates:
    (default) run   Obtain & install a certificate in your current webserver
    certonly        Obtain or renew a certificate, but do not install it
    renew           Renew all previously obtained certificates that are near expiry
    enhance         Add security enhancements to your existing configuration
   -d DOMAINS       Comma-separated list of domains to obtain a certificate for

 

5.3 Follow the wizard

Select the first option #1 (Spin up a temporary web server)

Enter all the information requested for the cert request.

This will save the certificate and the generated private key to the following directory:

/etc/letsencrypt/live/DNSName/

All you need to worry about are these files:

  • cert.pem
  • fullchain.pem
  • privkey.pem
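Before importing anything, it can be worth confirming that the issued certificate is readable and not close to expiry. A minimal sketch using openssl's -checkend option follows; cert_valid_for is a made-up helper name, and the path in the example simply reuses the directory layout shown above.

```shell
#!/usr/bin/env bash
# Sketch: check that a PEM certificate is still valid a given number of days from now.
# cert_valid_for is a hypothetical helper built on "openssl x509 -checkend".

cert_valid_for() {
  local cert_file="$1" days="$2"
  # -checkend takes seconds and exits 0 when the cert will NOT have expired by then.
  openssl x509 -in "$cert_file" -noout -checkend "$(( days * 86400 ))" >/dev/null
}

# Example (run as root, the live directory is typically not readable by normal users):
# cert_valid_for /etc/letsencrypt/live/DNSName/cert.pem 30 && echo "more than 30 days left"
```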

6. Import the certificate to the UniFi controller

You can do this manually using keytool -import

https://crosstalksolutions.com/secure-unifi-controller/
https://docs.oracle.com/javase/tutorial/security/toolsign/rstep2.html

But for this we are going to use the handy SSL import script made by Steve Jenkins

6.1 Download Steve Jenkins' UniFi SSL Import Script

Copy the unifi_ssl_import.sh script to your server

wget https://raw.githubusercontent.com/stevejenkins/unifi-linux-utils/master/unifi_ssl_import.sh

6.2 Modify Script

Install Nano if you don’t have it (it’s better than VI in my opinion. Some disagree, but hey, I’m entitled to my opinion)

sudo apt-get install nano
nano unifi_ssl_import.sh

Change hostname.example.com to the actual hostname you wish to use, i.e. the DNS record from step 5.1:

UNIFI_HOSTNAME=your_DNS_Record

Since we are using Ubuntu, leave the following three lines for Fedora/RedHat/CentOS commented out

#UNIFI_DIR=/opt/UniFi
#JAVA_DIR=${UNIFI_DIR}
#KEYSTORE=${UNIFI_DIR}/data/keystore

Uncomment the following three lines for Debian/Ubuntu

UNIFI_DIR=/var/lib/unifi
JAVA_DIR=/usr/lib/unifi
KEYSTORE=${UNIFI_DIR}/keystore

Since we are using Let's Encrypt, enable LE mode

LE_MODE=yes

Here’s what I used for this demo

#!/usr/bin/env bash

# unifi_ssl_import.sh
# UniFi Controller SSL Certificate Import Script for Unix/Linux Systems
# by Steve Jenkins <http://www.stevejenkins.com/>
# Part of https://github.com/stevejenkins/ubnt-linux-utils/
# Incorporates ideas from https://source.sosdg.org/brielle/lets-encrypt-scripts
# Version 2.8
# Last Updated Jan 13, 2017

# CONFIGURATION OPTIONS
UNIFI_HOSTNAME=unifyctrl01.multicastbits.com
UNIFI_SERVICE=unifi

# Uncomment following three lines for Fedora/RedHat/CentOS
#UNIFI_DIR=/opt/UniFi
#JAVA_DIR=${UNIFI_DIR}
#KEYSTORE=${UNIFI_DIR}/data/keystore

# Uncomment following three lines for Debian/Ubuntu
UNIFI_DIR=/var/lib/unifi
JAVA_DIR=/usr/lib/unifi
KEYSTORE=${UNIFI_DIR}/keystore

# Uncomment following three lines for CloudKey
#UNIFI_DIR=/var/lib/unifi
#JAVA_DIR=/usr/lib/unifi
#KEYSTORE=${JAVA_DIR}/data/keystore

# FOR LET'S ENCRYPT SSL CERTIFICATES ONLY
# Generate your Let's Encrtypt key & cert with certbot before running this script
LE_MODE=yes
LE_LIVE_DIR=/etc/letsencrypt/live

# THE FOLLOWING OPTIONS NOT REQUIRED IF LE_MODE IS ENABLED
PRIV_KEY=/etc/ssl/private/hostname.example.com.key
SIGNED_CRT=/etc/ssl/certs/hostname.example.com.crt
CHAIN_FILE=/etc/ssl/certs/startssl-chain.crt

#rest of the script Omitted

6.3 Make script executable:
chmod a+x unifi_ssl_import.sh
6.4 Run script:
sudo ./unifi_ssl_import.sh

This script will:

  • Back up the old keystore file (very handy, something I always forget to do)
  • Update the relevant keystore file with the LE cert
  • Restart the services to apply the new cert

7. Set up automatic certificate renewal

Let's Encrypt certs expire every 3 months. You can easily renew them by using

sudo certbot renew

This will use the existing config you used to generate the cert and renew it.

Then run the SSL import script to update the controller cert.

You can automate this using a cron job.

Copy the modified import script you used in Step 6 to “/bin/certupdate/unifi_ssl_import.sh”

sudo mkdir /bin/certupdate/
sudo cp /home/user/unifi_ssl_import.sh /bin/certupdate/unifi_ssl_import.sh

Switch to root, edit root's crontab and add the following lines. Entries in a crontab edited with crontab -e have no user field, so the command follows the five time fields directly.

sudo su
crontab -e
0 1 * * 1 certbot renew
15 1 * * 1 /bin/certupdate/unifi_ssl_import.sh

Save and exit nano by doing CTRL+X followed by Y.

Check crontab for root and confirm

crontab -l

At 01:00 every Monday the command will attempt to renew the cert. certbot only renews certs that are within 30 days of expiry, so most runs will do nothing.

At 01:15 every Monday it will update the keystore with the current cert. The import script restarts the UniFi service each time it runs, so pick a quiet hour.
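An alternative to two fixed cron entries (not what this guide uses) is certbot's --deploy-hook option, which runs a command only after a certificate has actually been renewed, so the controller is not restarted on runs where nothing changed. A minimal sketch, assuming the script path from above; renew_and_import and the CERTBOT override variable are names I introduced for illustration.

```shell
#!/usr/bin/env bash
# Sketch: renew with certbot and run the UniFi import script only when a
# certificate was actually renewed. renew_and_import and the CERTBOT variable
# are hypothetical; the default script path is the one used in this guide.

renew_and_import() {
  local import_script="${1:-/bin/certupdate/unifi_ssl_import.sh}"
  # --deploy-hook runs its command once for each successfully renewed certificate.
  "${CERTBOT:-certbot}" renew --quiet --deploy-hook "$import_script"
}

# Example: call renew_and_import from a root cron job, or put the same
# certbot command directly into a single crontab entry.
```

With this approach the job can run daily, since both the renewal and the import are skipped until the cert is close to expiry.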

 

Useful links –

https://kvz.io/blog/2007/07/29/schedule-tasks-on-linux-using-crontab/

https://crontab.guru/#

8. Adopting UniFi devices to the new Controller with SSH or other L3 adoption methods

If you can SSH into the AP, it’s possible to do L3-adoption via CLI command:

1. Make sure the AP is running the same firmware as the controller. If it is not, see this guide: UniFi – Changing the Firmware of a UniFi Device.

2. Make sure the AP is in factory default state. If it’s not, do:

syswrapper.sh restore-default

3. SSH into the device and type the following and hit enter:

set-inform http://ip-of-controller:8080/inform

4. After issuing the set-inform, the UniFi device will show up for adoption. Once you click adopt, the device will appear to go offline.

5. Once the device goes offline, issue the set-inform command from step 3 again. This will permanently save the inform address, and the device will start provisioning.

https://help.ubnt.com/hc/en-us/articles/204909754-UniFi-Device-Adoption-Methods-for-Remote-UniFi-Controllers

Managing the UniFi controller services

# to stop the controller
$ sudo service unifi stop

# to start the controller
$ sudo service unifi start

# to restart the controller
$ sudo service unifi restart

# to view the controller's current status
$ sudo service unifi status

Troubleshooting issues

cat /var/log/unifi/server.log

Go through the system logs and google the issue. The best part about Ubiquiti gear is the strong community support.
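The log can get long, so it helps to pull out only the interesting lines before searching for them. The sketch below assumes the log marks problems with words like ERROR, WARN or Exception, which is what I would expect from a Java application but is worth checking against your own file; recent_unifi_errors is a made-up helper name.

```shell
#!/usr/bin/env bash
# Sketch: show the most recent problem lines from the UniFi server log.
# recent_unifi_errors is a hypothetical helper; the match words are an assumption.

recent_unifi_errors() {
  local log_file="${1:-/var/log/unifi/server.log}" count="${2:-20}"
  # Keep lines mentioning ERROR, WARN or Exception, then show the last $count.
  grep -E 'ERROR|WARN|Exception' "$log_file" | tail -n "$count"
}

# Example: last 20 matching lines from the default log location.
# recent_unifi_errors
```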

 


ASICs in Cisco Catalyst switches

Preface-

After I started working with open networking switches, I wanted to know more about the Cisco Catalyst range I work with every day.

Information on older ASICs is very hard to find, but recently Cisco has started to talk a lot about new chips like UADP 2.0 with the Catalyst 9K / Nexus launch. This is most likely due to the rise of disaggregated network operating systems (DNOS) such as Cumulus and PICA8, which force customers to be more aware of what’s under the hood rather than believing shiny PDF files with a laundry list of features.

The information was there but scattered all over the web. I went through CiscoLive and TechFieldDay slides/videos, interviews, partner update PDFs, press releases, whitepapers and even LinkedIn profiles to gather it.

If you notice a mistake, please let me know.

Scope –

We are going to focus on the ASICs used in the well-known 2960S/X/XR, the 36xx, 37xx and 38xx, and the new Cat 9K series.

Timeline

 

Summary

Cisco bought a bunch of companies to acquire the switching technology it needed, which later bloomed into the switching platforms we know today:

  • Crescendo Communications (1993) – Catalyst 5K and 6K chassis
  • Kalpana (1994) – Catalyst 3K (fun fact: they invented VLANs, which later got standardized as 802.1Q)
  • Grand Junction (1995) – Catalyst 17xx, 19xx, 28xx, 29xx
  • Granite Systems (1996) – Catalyst 4K (K series ASIC)

After the original Cisco 3750/2950 switches, the Cisco 3xxx/2xxx-G (G for Gigabit) series was released.

Next, the Cisco 3xxx-E series with enterprise management functions was released.

Later, Cisco developed the Cisco 3750-V series as an energy-saving version of the -E series; it was later replaced by the Cisco 3750 V2 series (possibly a die shrink).

The G series and E series were later phased out and integrated into the Cisco X series, which is still being sold and supported.

In 2017-2018 Cisco released the Catalyst 9K family to replace the 2K and 3K families.

Sasquatch ASIC

From what I could find, there are two variants of this ASIC.

The initial release in 2003

  • Fixed pipeline ASIC
  • 180-nanometer process
  • 60 million transistors

Shipped with the 10/100 3750 and 2960

Die shrink to 130nm in 2005

  • Fixed pipeline ASIC
  • 130-nanometer process
  • 210 million transistors

Shipped in the 2960-G/3560-G/3750-G series

I couldn’t find much info about the chip design. I will update this section when I find more.

Strider ASIC

Initially released in 2010

  • Fixed pipeline ASIC
  • Built on the 65-nanometer process
  • 1.3 billion transistors

The Strider ASIC (circa 2010) was an improved design based on the 3750-E series and first shipped with the 2960-S family.

S88G ASIC

Later, in 2012-2013, with a die shrink to 45 nanometers, they managed to fit 2 ASICs in the same silicon. This shipped with the 2960-X/XR, which replaced the 2960-S.

  • Higher stack speeds and features
  • Limited layer 3 capabilities with the IP Lite feature set (2960-XR only)
  • Better QoS and NetFlow Lite

Later down the line they silently moved the ASIC design to a 32-nanometer process for better yield and cheaper manufacturing costs.

This switch is still being sold as a cheaper access layer switch, with no EOL announced.

On a side note – in 2017 Cisco released another version of the 2960 family, the WS-2960-L. This is a cheaper version built on a Marvell ASIC (same as the SG55x) with a web UI and fanless design. I personally think this is the next evolution of their SMB market-oriented family, the popular Cisco SG-5xx series. For the first time, the 2960 family had a fairly usable and pleasant web interface for configuration and management. The new 9K series seems to contain a more polished version of the web UI.

Unified Access Data Plane (UADP)

Due to the limitations of the fixed pipeline architecture and the costs involved in re-rolling silicon to fix bugs, they needed something new and had three options.

As a compromise between all three, Cisco started dreaming up this programmable ASIC design in 2007-2008. The idea was to build a chip with programmable stages that can be updated with firmware updates instead of writing the logic into the silicon permanently.

They initially released the programmable ASIC technology in the QFP (Quantum Flow Processor) ASIC in the ISR router family to meet the needs of service providers and large enterprises.

This chip allowed them to support advanced routing technologies and new protocols via firmware updates without changing hardware, improving the longevity of the investment and allowing them to make more money out of the chip's extended life cycle.

Eventually, this technology trickled downstream and the Doppler 1.0 was born.

Improvements and features in UADP 1.0/1.1/2.0

  • Programmable stages in the pipeline
  • Cisco intent-driven networking support – DNA Center with ISE
  • Integrated stacking support with StackPower – the ASIC is built with pinouts for the stacking fabric, allowing faster stacking performance
  • Rapid recirculation (encapsulation such as MPLS, VXLAN)
  • TrustSec
  • Advanced on-chip QoS
  • Software-defined networking support – integrated NetFlow, SD-Access
  • Flex Parser – programmable packet parser allowing them to introduce support for new protocols with firmware updates
  • On-chip micro-engines – highly specialized engines within the chip to perform repetitive tasks such as
    • Reassembly
    • Encryption/Decryption
    • Fragmentation
  • CAPWAP – the switch can function as a wireless LAN controller
    • Mobility agent – offloads QoS and other functions from the WLC (IMO works really nicely with multi-site wireless deployments)
    • Mobility controller – full-blown wireless LAN controller (WLC)
  • Extended life cycle allowed integration of Cisco security technologies such as Cisco DNA + ISE later down the line, even on the first-generation switches
  • Multigig and 40GE speed support
  • Advanced malware detection capabilities via packet fingerprinting

Legacy Fixed pipeline architecture

Programmable pipeline architecture

Doppler/UADP 1.0 (2013)

The Doppler 1.0 programmable ASIC handles the data plane, coupled with a Cavium CPU for the control plane. The first generation of switches to ship with this chip was the 3650-x and 3850-x gigabit versions.

  • Built on 65-nanometer process
  • 1.3 billion transistors

UADP 1.1 (2015)

  • Die shrink to 45 nanometer
  • 3 billion transistors

UADP 2.0 (2017)

  • Built on 28nm/16nm process
  • Equipped with an Intel Xeon D (dual-core x86) CPU for the control plane
  • Open IOS-XE
  • 7.4 billion transistors

Flexible ASIC Templates –

Allows Cisco to write templates that optimize the chip resources for different use cases.

The new Catalyst 9000 series will replace the following campus switching families built on the older Strider and the more recent UADP 1.0 and 1.1 ASICs:

  • Catalyst 2K -> Catalyst 9200
  • Catalyst 3K -> Catalyst 9300
  • Catalyst 4K -> Catalyst 9400
  • Catalyst 6K -> Catalyst 9500

I will update/fix this post when I find more info about UADP 2.0 and the next evolution. Stay tuned for a few more articles on the silicon used in open networking x86 chassis.

MS Exchange 2016 [ERROR] Cannot find path ‘..\Exchange_Server_V15\UnifiedMessaging\grammars’ because it does not exist.


So recently I ran into this annoying error message with Exchange 2016 CU11 Update.

Environment info-

  • Exchange 2016 upgrade from CU8 to CU11
  • Exchange binaries are installed under D:\Microsoft\Exchange_Server_V15\..

Microsoft.PowerShell.Commands.GetItemCommand.ProcessRecord()".
[12/04/2018 16:41:43.0233] [1] [ERROR] Cannot find path 'D:\Microsoft\Exchange_Server_V15\UnifiedMessaging\grammars' because it does not exist.
[12/04/2018 16:41:43.0233] [1] [ERROR-REFERENCE] Id=UnifiedMessagingComponent___99d8be02cb8d413eafc6ff15e437e13d Component=EXCHANGE14:\Current\Release\Shared\Datacenter\Setup
[12/04/2018 16:41:43.0234] [1] Setup is stopping now because of one or more critical errors.
[12/04/2018 16:41:43.0234] [1] Finished executing component tasks.
[12/04/2018 16:41:43.0318] [1] Ending processing Install-UnifiedMessagingRole
[12/04/2018 16:44:51.0116] [0] CurrentResult setupbase.maincore:396: 0
[12/04/2018 16:44:51.0118] [0] End of Setup
[12/04/2018 16:44:51.0118] [0] **********************************************

Root Cause

I ran the setup again and it failed with the same error.
While going through the log files, I noticed that setup looks for this file path while configuring the "Mailbox role: Unified Messaging service" (Stage 6 on the GUI installer)

$grammarPath = join-path $RoleInstallPath "UnifiedMessaging\grammars\*";

There was no folder named grammars under the path specified in the error.

Just to confirm, I checked another server on CU8, and the grammars folder is there.

Not sure why the folder got removed; it may have happened during the first run of the CU11 setup that failed.

Resolution

My first thought was to copy the folder from an existing CU8 server, but just to avoid any issues (since Exchange is sensitive to file versions)
I created an empty folder with the name "grammars" under D:\Microsoft\Exchange_Server_V15\UnifiedMessaging\

Ran the setup again and it continued the upgrade process and completed without any issues...¯\_(ツ)_/¯

[12/04/2018 18:07:50.0416] [2] Ending processing Set-ServerComponentState
[12/04/2018 18:07:50.0417] [2] Beginning processing Write-ExchangeSetupLog
[12/04/2018 18:07:50.0420] [2] Install is complete. Server state has been set to Active.
[12/04/2018 18:07:50.0421] [2] Ending processing Write-ExchangeSetupLog
[12/04/2018 18:07:50.0422] [1] Finished executing component tasks.
[12/04/2018 18:07:50.0429] [1] Ending processing Start-PostSetup
[12/04/2018 18:07:50.0524] [0] CurrentResult setupbase.maincore:396: 0
[12/04/2018 18:07:50.0525] [0] End of Setup
[12/04/2018 18:07:50.0525] [0] **********************************************

Considering the cost of this software, M$ really has to be better about error handling IMO. I have run into silly issues like this way too many times since Exchange 2010.