Wim Update Automation – ConfigMgr

Our desktop support folks were looking to keep their WIMs up to date for desktop OSD, and our server folks tagged along for the ride. Of course, we went ahead and automated the entire process.

We leverage all of the previous work of the OSD gurus to build an MDT-based CTGlobal Image Factory. That ground is already well covered – go forth and seek out the Deployment Bunny (aka Mikael Nystrom), Kent Agerlund, Ami Arwidmark, Johan Arwidmark, and so on – so it is not repeated here.

The next step is to build even more automation around this. The full workflow is as follows…

  1. An Azure Automation schedule triggers a runbook monthly to kick it all off
  2. Clean up any existing test VMs in my test cluster… decommission the old tests
  3. Create multiple local running processes on the ImageFactory server
    1. One each to build the local VM
    2. One each to monitor for completion
  4. The completion monitor kicks off new runbooks via webhooks when the builds complete
  5. Refresh OSImage package properties + refresh the package source
  6. Check status, then deploy a VM (ours are in VMware) specifying the test TS
  7. The last package in the test TS is a PowerShell script that calls… another runbook
  8. Unit testing – for lack of a better term – of the test VM with its shiny new OS
    1. Verify the .NET version is the latest
    2. Windows Update verification against the Microsoft Catalog per OS
    3. PowerShell / WMF version verification
    4. SMBv1 / IIS / WCF framework disabled checks
    5. Insert a check of your own :)
  9. If all unit tests pass – refresh the prod OSImage package properties and source


To start, gather all the MDT task sequence IDs from ImageFactory, the operating system image package IDs from ConfigMgr, and some sort of unique identifier – I simply picked the OS level being automated. All of this also assumes that you have already automated server VM builds in your environment.

## Servers to rebuild
$osList = @("2012","2016","2016-Core")

$osList | ForEach-Object {
    $os = $_

    if ($os -eq '2012')
    {
        $taskID = 'Server2012'
        $osID = 'ABC0000E'
        $deployOS = '2012R2'
    }
    elseif ($os -eq '2016')
    {
        $taskID = 'Server2016'
        $osID = 'ABC00012'
        $deployOS = '2016'
    }
    elseif ($os -eq '2016-Core')
    {
        $taskID = 'Server2016-Core'
        $osID = 'ABC00010'
        $deployOS = '2016-Core'
    }

    # $computer holds the name of the existing test VM for this OS level
    C:\Automation\ServerDecom.ps1 -computer $computer -rebuild $True
}

A quick note on the decommissioning part. My automation flags the system for rebuild: instead of deleting the AD computer object, it resets the object's password. This is very important later on, when the newly deployed test VM has a new IP address and needs to dynamically update AD DNS. If you delete the computer object… say goodbye to your computer's access to update any of its existing DNS records.

# Reset the computer account password to a known value (the SAM account name)
# so the rebuilt VM can re-use the existing object and its DNS records
$sam = $computer + '$'
$password = ConvertTo-SecureString -String $sam -AsPlainText -Force
Get-ADComputer $sam -Credential $adCreds | Set-ADAccountPassword -NewPassword:$password -Reset:$true -Credential $adCreds

Moving on – the runbook still needs to kick off the MDT build. This takes a little prep work: a couple of PowerShell scripts saved locally on the Image Factory server. The reason for doing it this way is to keep a runbook from running for an excessive amount of time while waiting for the MDT build to complete.

Invoke-Command -ComputerName 'ImageFactory Server' -Credential $adCreds -ScriptBlock {
    param($deployOS)
    # Spawn the build and monitor scripts as detached processes so they
    # survive the end of this remote session
    $process = "powershell.exe -file c:\scripts\$deployOS`.ps1"
    $process2 = "powershell.exe -file c:\scripts\Monitor-$deployOS`.ps1"
    ([WMICLASS]"\\.\ROOT\CIMV2:win32_process").Create($process)
    ([WMICLASS]"\\.\ROOT\CIMV2:win32_process").Create($process2)
} -ArgumentList $deployOS

The reason this was done with process creation is that the remote PowerShell session ends all of its processes on exit. However, processes kicked off this way inside the remote session keep running after the session ends.

The two scripts that do the local heavy lifting…
Build VM

# Load the CTGlobal Image Factory module and kick off a single build
Remove-Module CTImageFactory -ErrorAction SilentlyContinue
Import-Module "e:\ImgFactory\Scripts\CTImageFactory.psm1" -WarningAction SilentlyContinue
Set-Location "e:\ImgFactory\Scripts\"
Get-GlobalVariables
Start-Build -BuildType 'Single' -TaskSequenceID 'Server2016'

Monitor VM

# Poll every 20 minutes; the Image Factory VM powers itself off when the build completes
do
{
    Start-Sleep -Seconds 1200
    $state = (Get-VM | Where-Object Name -eq 'Server2016').State
}
until ($state -eq 'Off')

# Hand off to the next runbook via its webhook
Invoke-RestMethod -Method Post -Uri 'https://s1events.azure-automation.net/webhooks?token=webhookToken' -Body (ConvertTo-Json -InputObject @{'osID'='ABC00010';'computer'='Test Server Name';'deployOS'='2016-Core'}) -ErrorAction Stop

The last bit calls the next runbook in the gravy train via webhook. This is where it checks in with the site server to make sure the newly captured WIM files are updated and distributed, and then moves on to deploy the WIM to a test VM for verification.

$siteCode = Get-AutomationVariable -Name 'ConfigMgr-SiteCode'
$siteServer = Get-AutomationVariable -Name 'ConfigMgr-SiteServer'
# Credential asset name here is an example; adjust to your own
$sccmServiceAccount = Get-AutomationPSCredential -Name 'ConfigMgr-ServiceAccount'

################### Update wim file
$imageProperties = Get-WmiObject -ComputerName $siteServer -Namespace "root/SMS/site_$siteCode" -Class 'SMS_ImagePackage' -Credential $sccmServiceAccount | Where-Object -Property PackageID -eq $osID
$imageProperties.ReloadImageProperties()
$imageProperties.RefreshPkgSource()

Import-Module "C:\Program Files (x86)\Microsoft Configuration Manager\AdminConsole\bin\ConfigurationManager.psd1" -Force
$drive = Get-PSDrive -Name $siteCode -ErrorAction SilentlyContinue
If(!$drive){New-PSDrive -Name $siteCode -PSProvider "AdminUI.PS.Provider\CMSite" -Root $siteServer -Credential $sccmServiceAccount -Description "SCCM Site"}
$location = $siteCode + ":"
Push-Location
Set-Location $location
$date = Get-Date -Format d
Set-CMOperatingSystemImage -Id $osID -Description $date -Version "I'm NEW and Shiny"
Pop-Location
Remove-PSDrive $siteCode

# Check on status of distribution
$count = 0
do
{
    $count
    Start-Sleep -Seconds 120
    $status = Get-WmiObject -Credential $sccmServiceAccount -Namespace root\sms\site_$siteCode -Query "SELECT PackageID,PackageType,State,ServerNALPath,LastCopied FROM SMS_PackageStatusDistPointsSummarizer where PackageID = '$($osID)'" -ComputerName $siteServer | Select-Object PackageID,PackageType,State,LastCopied,ServerNALPath
    # State 0 = the package is installed on that distribution point
    if(($status.State | Where-Object {$_ -eq 0}).count -gt 1){$complete = $true}
    $count++
}until($complete -or $count -eq 10)
if(-not($complete)){throw "SCCM Package failed to update in time"}

Phew! All that just to update content and distribution, set some descriptions, and check that it's all where it needs to be. Once it completes, I call our internal automated server build scripts. We use boot media attached to the VM to automatically call the appropriate task sequence. In addition, we write OSD TS variables to a text file accessible from the OSD environment to quickly set all the needed variables – no waiting on device collection membership and so forth.
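
As a rough illustration of that variables-file trick – not our exact script – a step early in the test TS could read key=value pairs from the file and push them into the task sequence environment. The file path and variable names here are made up:

# Hypothetical sketch: read key=value pairs from a share reachable in WinPE
# and set each one as a task sequence variable. Path and names are examples only.
$tsenv = New-Object -ComObject Microsoft.SMS.TSEnvironment
Get-Content '\\deployserver\osd$\TSVariables.txt' | ForEach-Object {
    $name, $value = $_ -split '=', 2
    if ($name -and $value) {
        $tsenv.Value($name.Trim()) = $value.Trim()  # e.g. OSDComputerName=TESTVM01
    }
}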

The last step of the task sequence is to call… you guessed it, another runbook via a webhook. This one is all about verifying the final product. Does it have the latest updates? Did the MDT TS do everything I needed it to do? Let's find out. This is also where having clean AD DNS really comes into play.

$session = New-PSSession -ComputerName $computer -Credential $creds
$dotNetTest = Get-AutomationVariable -Name 'dotnetTest'

# Test for dotNet version
$dotNetValidation = Invoke-Command -Session $session -ScriptBlock {
    param($dotNetTest)
    # The Release value under the 'Full' key identifies the installed .NET Framework build
    $dotNetVersions = (Get-ChildItem 'HKLM:\SOFTWARE\Microsoft\NET Framework Setup\NDP' -Recurse | Get-ItemProperty -Name Release -EA 0 | Where-Object { $_.PSChildName -eq 'Full' } | Select-Object Release).Release
    If ($dotNetVersions -ge $dotNetTest) {$dotNetValidation = 'Pass'}
    Else {$dotNetValidation = 'Failed'}
    $dotNetValidation
} -ArgumentList $dotNetTest

Test for Microsoft Update CU

$updateValidation = Invoke-Command -Session $session -ScriptBlock {
    $date = Get-Date -Format yyyy-MM
    $updateTest = "$date Cumulative Update" # Needs to match $date + Cumulative Update *
    $2012updateTest1 = "$date Preview of Monthly Quality Rollup for Windows Server 2012 R2"
    $2012updateTest2 = "$date Security Monthly Quality Rollup for Windows Server 2012 R2"
    $catalogURI = "https://www.catalog.update.microsoft.com/Search.aspx?q="
    $hotfixList = Get-HotFix | Sort-Object -Property HotfixID
    # Grab the last three hotfix IDs installed
    $list = $hotfixList[($hotfixList.count - 3)..($hotfixList.count - 1)].HotfixID
    $updateValidation = 'Failed'
    $list | ForEach-Object {
        $KB = $_
        $uri = $catalogURI + $KB
        try {
            # Have to use -UseBasicParsing and the raw content as IE has not run yet
            $site = Invoke-WebRequest -Uri $uri -UseBasicParsing
            $content = $site.RawContent
            If ($content -like "*$updateTest*") {$updateValidation = 'Pass'}
            ElseIf ($content -like "*$2012updateTest1*") {$updateValidation = 'Pass'}
            ElseIf ($content -like "*$2012updateTest2*") {$updateValidation = 'Pass'}
        }
        Catch {Write-Warning "Failed to lookup $KB"}
    }
    $updateValidation
}

Getting into the rest of the validations — enter your own as needed

$powerShellValidation = Invoke-Command -Session $session -ScriptBlock {
    $powerShellVersion = '5.1'
    [string]$major = $PSVersionTable.psversion.major 
    [string]$minor = $PSVersionTable.psversion.minor
    $powerShellTest = $major + ".$minor"

    If ($powerShellTest -eq $powerShellVersion) {$powerShellValidation = 'Pass'}
    Else {$powerShellValidation = 'Failed'}
    $powerShellValidation
    }

$features = Invoke-Command -Session $session -ScriptBlock {Get-WindowsFeature | Where-Object -Property 'InstallState' -eq 'Installed'}

$smbTest = $features | Where-Object -Property Name -eq 'FS-SMB1'
If ($smbTest) {$smbValidation = 'Failed'}
Else {$smbValidation = 'Pass'}

$iisTest = $features | Where-Object -Property Name -like 'Web*'
If ($iisTest) {$iisValidation = 'Failed'}
Else {$iisValidation = 'Pass'}

$wcfTest = $features | Where-Object -Property Name -like 'NET-WCF-*'
If ($wcfTest) {$wcfValidation = 'Failed'}
Else {$wcfValidation = 'Pass'}

$testing = @{
netvalidation = $dotNetValidation;
updatevalidation = $updateValidation;
powershellvalidation = $powerShellValidation;
smbvalidation = $smbValidation;
iisvalidation = $iisValidation;
wcfvalidation = $wcfValidation
}

# Any validation that is not 'Pass' routes us to the Email branch
If ($testing.Values | Where-Object { $_ -ne 'Pass' }) {$nextStep = 'Email'}
Else {$nextStep = 'UpdateProd'}

At this point, if anything did not pass, it simply emails our ticketing queue with what failed so we can go fix the MDT TS or whatever the case may be. Otherwise, it moves on and updates the production OS image package the same way as the test one.
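
The notification step can be as simple as the hedged sketch below; the SMTP server and addresses are placeholders for your environment.

# Example only – SMTP server and addresses are placeholders
$failed = $testing.GetEnumerator() | Where-Object { $_.Value -ne 'Pass' }
if ($failed) {
    $body = ($failed | ForEach-Object { "$($_.Key): $($_.Value)" }) -join "`n"
    Send-MailMessage -SmtpServer 'smtp.domain.local' -From 'osd-automation@domain.local' -To 'ticket-queue@domain.local' -Subject "OSD image validation failed for $computer" -Body $body
}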

Woah… kind of a lot.

So you want to automate against ConfigMgr… do you?

Automation is great. It really is. I'm using Azure Automation Hybrid Runbook Workers for just about everything these days – that's a post of its own – but I wanted to touch on some key interactions with ConfigMgr.

First, the premise. The automation servers are all running Server 2016 Core. So… no ConfigMgr console, no first launch of the console to set up the site connection for PowerShell – really, no good way to get any of the default ConfigMgr goodness at all. For the remainder of the post, the examples all deal with my automation around updating OS image files.

So clearly I just copy over the PowerShell module files with the .dlls in order to work with the site server remotely. The simple route is to copy the bin path C:\Program Files (x86)\Microsoft Configuration Manager\AdminConsole\bin to the same location on your Core server. This will at least get you access to the PowerShell modules.
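
Something like this rough sketch does the copy; the source host name is a placeholder for any machine that already has the console installed.

# Rough sketch: copy the console bin folder from a machine that has the
# console installed to the same path on the Core server. 'console-host' is a placeholder.
$bin = 'C:\Program Files (x86)\Microsoft Configuration Manager\AdminConsole\bin'
Copy-Item -Path "\\console-host\C$\Program Files (x86)\Microsoft Configuration Manager\AdminConsole\bin" -Destination $bin -Recurse -Force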

Simple… but this is really messy in an automated world. The important part is creating a new PSDrive and providing your service account credentials for the task at hand, because many of the ConfigMgr cmdlets do not offer a -Credential parameter to provide access inline.

Import-Module "C:\Program Files (x86)\Microsoft Configuration Manager\AdminConsole\bin\ConfigurationManager.psd1" -Force
$drive = get-psdrive -name $siteCode -ErrorAction SilentlyContinue
If(!$drive){New-psdrive -Name $siteCode -PSProvider "AdminUI.PS.Provider\CMSite" -root $siteServer -Credential $sccmServiceAccount  -Description "SCCM Site"}
$location = $siteCode+":"
push-location
Set-location $location
$date = get-date -format d
Set-CMOperatingSystemImage -Id $osID -Description $date -version "I'm a NEW and Shiny'"
Pop-Location
Remove-Psdrive $siteCode

Alternatively, you could certainly create a PowerShell session with credentials and then invoke a script block in that session. Sure, this works, but it requires allowing PowerShell remoting for your service account. Just something to consider if you're comfortable putting all of your automation in script blocks; a sketch of that approach follows.
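
This is a sketch only, assuming the console (and thus the module) is installed on the site server itself and remoting is allowed for the service account:

# Sketch of the remoting alternative: run the ConfigMgr cmdlets on the site
# server itself, inside a credentialed session. Assumptions as noted above.
Invoke-Command -ComputerName $siteServer -Credential $sccmServiceAccount -ScriptBlock {
    param($siteCode, $osID)
    Import-Module "$($ENV:SMS_ADMIN_UI_PATH)\..\ConfigurationManager.psd1"
    Set-Location "$($siteCode):"
    Set-CMOperatingSystemImage -Id $osID -Description (Get-Date -Format d)
} -ArgumentList $siteCode, $osID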

Now… into the rabbit hole. I don't like having to copy files around from a console. What happens when a ConfigMgr site version change updates the PowerShell cmdlets – now which servers need what copied where? There are many other reasons to keep this stuff off my core infrastructure, but honestly, I just don't want to have to think about it. Keep it simple. Keep it to PowerShell.

So, some other good options. Obviously CIM/WMI work well, since so much of ConfigMgr is accessible via that channel, and they also accept provided credentials – so letting a service account handle a well-scoped task isn't a big hurdle. The hurdle is that not all WMI objects expose the methods needed to write data back to your site server.

# Reload the image properties before refreshing the package on the distribution points
$imageProperties = Get-WmiObject -ComputerName $siteServer -Namespace "root/SMS/site_$siteCode" -Class 'SMS_ImagePackage' -Credential $sccmServiceAccount | Where-Object -Property PackageID -eq $osID
$imageProperties.ReloadImageProperties()
$imageProperties.RefreshPkgSource()

While this worked great for the first task – updating the files in the system – there was no way to update the description or version via the WMI object. When you pipe it to Get-Member, the properties exist, but there is no Put() method.

The final option – the bottom of the rabbit hole I went down – is to connect directly to the SMS Provider using WMI. This requires creating an SWbemServices object. Say what? No need to explain it here – there's a Microsoft Docs page for that. Thank you, docsMsft!

I rewrote the VB sample in PowerShell a while back and submitted a pull request. Side note – if you see something, write something, and pull request it back!

$siteCode = ''
$siteServer = 'server.domain'

$credentials = Get-Credential
$username = $credentials.UserName

# The connector does not understand a PSCredential. The following command pulls your PSCredential password out into a plain-text string.
$password = [System.Runtime.InteropServices.Marshal]::PtrToStringAuto([System.Runtime.InteropServices.Marshal]::SecureStringToBSTR($credentials.Password))

$NameSpace = "root\sms\site_$siteCode"
$SWbemLocator = New-Object -ComObject "WbemScripting.SWbemLocator"
$SWbemLocator.Security_.AuthenticationLevel = 6   # 6 = packet privacy
$connection = $SWbemLocator.ConnectServer($siteServer,$NameSpace,$username,$password)


With a connector in place, you can do something like the below to update the description/version and Put_() it back:

$wim = $connection.Get("SMS_ImagePackage.PackageID='$osID'")
$wim.Properties_.Item("Version").value = $version
$wim.Properties_.Item("Description").value = $Description
$wim.Put_()

Here's the short of it: I can still provide credentials to run as whatever service principal I need, and then I get the benefit of being able to do… well, anything I want, since it's a direct connection to the SMS Provider itself.

There you have it, folks – a full repertoire of ways to get things done when coding against ConfigMgr in an automated way. Pick your poison wisely, or just mix and match as needed.

Strong Cryptography – .NET + PowerShell

I spend a lot of time using the Invoke cmdlets in PowerShell. Over the past year there has been a need to address how Invoke-RestMethod and Invoke-WebRequest handle SSL/TLS connections, as service providers and API endpoints drop older versions of SSL and TLS. The good news is that providers are finally dropping these insecure channels; the bad news is that Microsoft applications still default to allowing them as a client.

A few things for consideration. Disabling TLS 1.0 and all of SSL is rather simple – a few registry keys and a reboot – and done. See the end of the post for the SCHANNEL reg keys.

It came as a surprise, then, when after setting the SecurityProviders keys I started to get connection failures with 'The underlying connection has failed' – very common wording for an SSL/TLS handshake failure. What's going on, PowerShell? Why are you failing me?

Turns out… it's because of .NET. If you pop open a PowerShell session, you can run

[Net.ServicePointManager]::SecurityProtocol

– and you will likely get the following output: "Ssl, Tls"

Not good, PowerShell. I said SSL is disabled – why wouldn't you handshake to something better? Turns out PowerShell isn't as smart as you would think. You have a couple of options.

First – in the worst case, if you still need to use SSL… please don't… but if you must, you can set a PowerShell session to use TLS 1.2 only by setting

[Net.ServicePointManager]::SecurityProtocol = [Net.SecurityProtocolType]::Tls12 

But please don't leave this as a workaround in your scripting. Please get rid of whatever other SSL requirements are preventing a global system change.

Second – you can tell your OS to set the .NET Framework to use strong cryptography! You would think you would only need the native key and not Wow6432Node – who runs PowerShell as a 32-bit process? – but even when you check your environment and are running as a 64-bit process, it will still read the 32-bit .NET Framework settings. Set the following, reboot, and your .NET calls from PowerShell via the ServicePointManager will default to Tls, Tls11, Tls12.

Set-ItemProperty -Path 'HKLM:\SOFTWARE\Wow6432Node\Microsoft\.NetFramework\v4.0.30319' -Name 'SchUseStrongCrypto' -Value '1' -Type DWord

Set-ItemProperty -Path 'HKLM:\SOFTWARE\Microsoft\.NetFramework\v4.0.30319' -Name 'SchUseStrongCrypto' -Value '1' -Type DWord

No more connection issues!

Windows Registry Editor Version 5.00
[HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Control\SecurityProviders\SCHANNEL\Protocols]
[HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Control\SecurityProviders\SCHANNEL\Protocols\SSL 2.0]
[HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Control\SecurityProviders\SCHANNEL\Protocols\SSL 2.0\Client]
"DisabledByDefault"=dword:00000001
"Enabled"=dword:00000000
[HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Control\SecurityProviders\SCHANNEL\Protocols\SSL 2.0\Server]
"Enabled"=hex(b):00,00,00,00,00,00,00,00
"DisabledByDefault"=dword:00000001
[HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Control\SecurityProviders\SCHANNEL\Protocols\SSL 3.0]
[HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Control\SecurityProviders\SCHANNEL\Protocols\SSL 3.0\Client]
"DisabledByDefault"=dword:00000001
"Enabled"=dword:00000000
[HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Control\SecurityProviders\SCHANNEL\Protocols\SSL 3.0\Server]
"Enabled"=hex(b):00,00,00,00,00,00,00,00
"DisabledByDefault"=dword:00000001
[HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Control\SecurityProviders\SCHANNEL\Protocols\TLS 1.0]
[HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Control\SecurityProviders\SCHANNEL\Protocols\TLS 1.0\Client]
"DisabledByDefault"=dword:00000001
"Enabled"=dword:00000000
[HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Control\SecurityProviders\SCHANNEL\Protocols\TLS 1.0\Server]
"DisabledByDefault"=dword:00000001
"Enabled"=dword:00000000
[HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Control\SecurityProviders\SCHANNEL\Protocols\TLS 1.1]
[HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Control\SecurityProviders\SCHANNEL\Protocols\TLS 1.1\Client]
"DisabledByDefault"=dword:00000000
"Enabled"=dword:00000001
[HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Control\SecurityProviders\SCHANNEL\Protocols\TLS 1.1\Server]
"DisabledByDefault"=dword:00000000
"Enabled"=dword:00000001
[HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Control\SecurityProviders\SCHANNEL\Protocols\TLS 1.2]
[HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Control\SecurityProviders\SCHANNEL\Protocols\TLS 1.2\Client]
"Enabled"=dword:00000001
"DisabledByDefault"=dword:00000000
[HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Control\SecurityProviders\SCHANNEL\Protocols\TLS 1.2\Server]
"DisabledByDefault"=dword:00000000
"Enabled"=dword:00000001

Managing Multiple ConfigMgr Sites with PowerShell

We're in the middle of migrating from a single ConfigMgr site to two separate sites for servers and desktops. Along with test sites, that's a lot of sites to manage! When you're running PowerShell on a machine that is managed by a site, you can easily cd, Set-Location, or Push-Location to that site's drive. But what if you want to manage a site other than the one managing your machine? You can open a PowerShell terminal or ISE session directly from the console, but that can be a hassle, and it also won't work for things not run interactively. I've taken to putting a snippet like the sketch below at the top of all of my scripts.
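
A minimal sketch of the idea – not the exact snippet from the full post – assuming the console is installed locally, and with placeholder values for the site you want to manage:

# Minimal sketch: connect to an arbitrary site regardless of which site
# manages this machine. $siteCode and $siteServer are placeholders.
$siteCode   = 'AB1'
$siteServer = 'cm1.domain.local'
Import-Module "$($ENV:SMS_ADMIN_UI_PATH)\..\ConfigurationManager.psd1"
if (-not (Get-PSDrive -Name $siteCode -ErrorAction SilentlyContinue)) {
    New-PSDrive -Name $siteCode -PSProvider 'AdminUI.PS.Provider\CMSite' -Root $siteServer | Out-Null
}
Set-Location "$($siteCode):"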


CIS Server Hardening and ConfigMgr

I recently worked on hardening a ConfigMgr environment using the CIS Windows Server 2016 hardening benchmarks. We're a CIS member, so I have access to the GPO template; after reading through the benchmark document, I removed the few settings I knew I didn't want.

After applying this policy to my site systems, clients were no longer showing activity in the console, and they'd lost their green check marks. I traced the path of a hardware inventory from one client: it was able to successfully send the inventory to the management point, but it still wasn't updating in the console. I cranked up logging on the management point and looked in mpfdm.log. The log was full of errors like this:

**ERROR: Cannot connect to the inbox source, sleep 30 seconds and try again.

Occasionally I also had this error:

CFileDispMgr::GetStagingLocation(\) failed with 0X80070057

Microsoft (Azure) APIs

I wanted to share some finer thoughts on getting your head around all the wonderful, and sometimes frighteningly complicated, Microsoft APIs. It's not so bad!

Recently I did a user group talk on this topic, and was shocked by the number of hands that went up when I asked how many developers were in the room. An Azure user group, and almost every single person raised their hand. So I pushed them: how many have written their own API? Again, almost all hands were in the air. By the end, though, one of them said – this is too much info! This is a quick breakdown of what I hoped they took away from the talk. In short… read the manual.

  1. In the case of APIs, good documentation will make your day. Microsoft does an amazing job of documenting. Let that sink in. Yes. Yep. OK, go check out Microsoft Docs.
  2. I wasn't at all surprised that, with this many developers at an Azure user group, a lot of .NET (Core)/C# developers were in attendance. If a particular language is what you know… go read the manual again on the Docs page. The point is that Microsoft does amazing work documenting the REST API reference for many languages, and supports many different client libraries, such as:
  • .NET
  • Java
  • Node.js
  • Python
  • Azure CLI


In the end, it's all the same basic components and concepts no matter the language you develop in: build a URI, call the API endpoint with an action (HTTP verb), and include a header and some body data for the thing you are trying to do. For the third time (and done): go read the manual on your API. It will tell you everything you need to know.
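
As a hedged PowerShell illustration of that shape – the subscription ID and token below are placeholders you would obtain beforehand, and the api-version differs by endpoint:

# The generic shape of a REST call: URI + verb + headers (+ optional body).
# $subId and $token are placeholders, not values from this post.
$uri = "https://management.azure.com/subscriptions/$subId/resourcegroups?api-version=2021-04-01"
$headers = @{ Authorization = "Bearer $token" }
Invoke-RestMethod -Method Get -Uri $uri -Headers $headers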

Azure Kubernetes Service (AKS) with Advanced Networking

The bulk of what you need to know is in Microsoft's online docs here. It's good information, so I won't repeat what's in it; I will, however, cover a few gotchas I came across.

First thing to know: when you create the AKS cluster from the portal, it creates two resource groups (RGs). One has the AKS servers along with the VNET, and the other has all the nodes, scale sets, load balancers once they get created, etc. The process also creates a service principal (Enterprise Application) that is used when you run kubectl commands and so on. You can find it by looking in the IAM section of the second RG that gets created. The problem is that, by default, it doesn't have any access to the VNET in the main RG. So when you try to apply a deployment that has a load balancer, it can't bind to the VNET. If you run:

az aks browse --resource-group $resGrp --name $aks

to launch the dashboard, you'll see the error. Go into the IAM section of the VNET and add the service principal; I added it as a Contributor.
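
If you'd rather script that than click through IAM, a sketch like this should work; the service principal app ID and VNET resource ID are placeholders you'd look up first.

# Sketch: grant the AKS service principal rights on the VNET via the CLI.
# $spAppId and $vnetId are placeholders.
az role assignment create --assignee $spAppId --role Contributor --scope $vnetId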

One of the reasons to use advanced networking is so you can peer AKS with other networks, including one that has a site-to-site VPN connection to your on-prem site. The thing it took me a while to find in the docs is that you have to create the peering on both the Kubernetes VNET and the VNET with the VPN connection, as in the sketch below.
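
A sketch of both directions with the az CLI; all of the names and IDs here are placeholders, and older az versions may name the remote-VNET flag differently.

# Sketch: the peering must be created in BOTH directions. Names are placeholders.
az network vnet peering create --name aks-to-hub --resource-group $aksRG --vnet-name $aksVnet --remote-vnet $hubVnetId --allow-vnet-access
az network vnet peering create --name hub-to-aks --resource-group $hubRG --vnet-name $hubVnet --remote-vnet $aksVnetId --allow-vnet-access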

So this brings us to one more networking quirk. The Kubernetes service IP ranges and the Docker bridge IPs don't show up anywhere in Azure – squirreled away somewhere in Kubernetes land, I guess. But that means the VNET with the VPN connection doesn't know where those addresses are, and they're the ones that matter – at least the Kubernetes service IPs do. (Still working on the routing situation for those.)