Exploring the RunAs feature in PowerShell DSC

PowerShell DSC is a great tool to configure and manage system settings. One highly requested feature was to be able to use DSC to manage user settings as well – for example per user registry settings. Another example would be to install an MSI which only works when run as a user.

To enable such scenarios, Microsoft published augmented versions of the WindowsProcess and Package resources on the gallery.

With the latest WMF version, you get this feature out of the box: DSC natively supports configuring user settings!

With the new WMF, you can specify, for each resource, the user context under which that resource runs. You do this via the ‘PsDscRunAsCredential’ property that is part of every resource (think of it along the lines of ‘DependsOn’, which also gets added to every resource).

Let us see it in action with some examples.

A simple example is changing the console color using a DSC configuration:

Configuration foo
{
    Node ("localhost")
    {
        Registry r
        {
            Key                  = "HKEY_CURRENT_USER\Software\Microsoft\Command Processor"
            ValueName            = "DefaultColor"
            ValueData            = '1F'
            ValueType            = "DWORD"
            Ensure               = "Present"
            Force                = $true
            Hex                  = $true
            PsDscRunAsCredential = (Get-Credential)
        }
    }
}

 

 

 

 

$configData = @{
    AllNodes = @(
        @{
            NodeName                    = "localhost"
            PSDscAllowPlainTextPassword = $true
        }
    )
}

foo -ConfigurationData $configData

[Screenshot: BlueConsole]

There you have it, a blue console! If you try the same configuration without specifying the ‘PsDscRunAsCredential’ property, you will not see any change in the console color the next time you fire up the cmd prompt. The reason is that, by default, the DSC engine (LCM) runs under the System account, so HKEY_CURRENT_USER refers to that account's registry hive rather than yours.

 

Next, let us look at a scenario where the PsDscRunAsCredential provided is that of an admin, and the resource accesses a network share as that admin. This demonstrates that we do not need to enable CredSSP and that there is no double-hop problem.

 

Configuration foo
{
    Node ('localhost')
    {
        Script s
        {
            PsDscRunAsCredential = (Get-Credential)
            GetScript            = '@{}'
            TestScript           = '$false'
            SetScript            = {New-Item -ItemType File -Path \\scratch2\scratch\abhikcha\Demo.txt}
        }
    }
}

 

 

$configData = @{
    AllNodes = @(
        @{
            NodeName                    = "localhost"
            PSDscAllowPlainTextPassword = $true
        }
    )
}

foo -ConfigurationData $configData

Output:

PS C:\Windows\system32> Start-DscConfiguration -Wait -Verbose -Path .\foo
VERBOSE: Perform operation 'Invoke CimMethod' with following parameters, ''methodName' = SendConfigurationApply,'className' = MSFT_DSCLocalConfigurationManager,'namespaceName' = root/Microsoft/Windows/DesiredStateConfiguration'.
VERBOSE: An LCM method call arrived from computer WIN-IP51C1HOSRH with user sid S-1-5-21-2127521184-1604012920-1887927527-10118509.
VERBOSE: [WIN-IP51C1HOSRH]: LCM:  [ Start  Set      ]
VERBOSE: [WIN-IP51C1HOSRH]: LCM:  [ Start  Resource ]  [[Script]s]
VERBOSE: [WIN-IP51C1HOSRH]: LCM:  [ Start  Test     ]  [[Script]s]
VERBOSE: [WIN-IP51C1HOSRH]: LCM:  [ End    Test     ]  [[Script]s]  in 1.0270 seconds.
VERBOSE: [WIN-IP51C1HOSRH]: LCM:  [ Start  Set      ]  [[Script]s]
VERBOSE: [WIN-IP51C1HOSRH]:                            [[Script]s] Performing the operation "Set-TargetResource" on target "Executing the SetScript with the user supplied credential".
VERBOSE: [WIN-IP51C1HOSRH]: LCM:  [ End    Set      ]  [[Script]s]  in 1.0780 seconds.
VERBOSE: [WIN-IP51C1HOSRH]: LCM:  [ End    Resource ]  [[Script]s]
VERBOSE: [WIN-IP51C1HOSRH]: LCM:  [ End    Set      ]    in  3.1380 seconds.
VERBOSE: Operation 'Invoke CimMethod' complete.
VERBOSE: Time taken for configuration job to complete is 3.347 seconds

PS C:\Windows\system32> dir \\scratch2\scratch\abhikcha\demo.txt

    Directory: \\scratch2\scratch\abhikcha

Mode                LastWriteTime         Length Name
----                -------------         ------ ----
-a----        4/29/2015   5:07 PM                demo.txt

One thing to note is that an admin can provide the credentials of a non-admin user, and DSC works in that case too. This can come in handy when you want to configure some settings based on which user is logged in (for example, the number of days a cookie remains valid for Internet Explorer).

Getting under the hood.

Now, for the really fun part, let us dig deeper and see the system environment when DSC executes a resource as a user. Take a deep breath and follow along!

We are going to use the new DSC resource debugging features and see them in conjunction with the ‘RunAs’ feature to peek at the internals of the runspace, user environment, etc. First, let us see how to run LCM so that it breaks into the debugger whenever we apply a configuration. The ‘DebugMode’ is enhanced with a ‘ResourceScriptBreakAll’ mode to enable this:

LocalConfigurationManager
{
    DebugMode = "ResourceScriptBreakAll"
}

PS C:\Windows\system32> Set-DscLocalConfigurationManager -Verbose -Path .\foo
VERBOSE: Performing the operation "Start-DscConfiguration: SendMetaConfigurationApply" on target "MSFT_DSCLocalConfigurationManager".
VERBOSE: Perform operation 'Invoke CimMethod' with following parameters, ''methodName' = SendMetaConfigurationApply,'className' = MSFT_DSCLocalConfigurationManager,'namespaceName' = root/Microsoft/Windows/DesiredStateConfiguration'.
VERBOSE: An LCM method call arrived from computer APRILWMF with user sid S-1-5-21-2127521184-1604012920-1887927527-10118509.
VERBOSE: [APRILWMF]: LCM: [ Start Set ]
VERBOSE: [APRILWMF]: LCM: [ Start Resource ] [MSFT_DSCMetaConfiguration]
VERBOSE: [APRILWMF]: LCM: [ Start Set ] [MSFT_DSCMetaConfiguration]
VERBOSE: [APRILWMF]: LCM: [ End Set ] [MSFT_DSCMetaConfiguration] in 0.1100 seconds.
VERBOSE: [APRILWMF]: LCM: [ End Resource ] [MSFT_DSCMetaConfiguration]
VERBOSE: [APRILWMF]: LCM: [ End Set ]
VERBOSE: [APRILWMF]: LCM: [ End Set ] in 0.1570 seconds.
VERBOSE: Operation 'Invoke CimMethod' complete.
VERBOSE: Set-DscLocalConfigurationManager finished in 1.156 seconds.

Let us now apply the configuration:

PS C:\Windows\system32> Start-DscConfiguration -Wait -Verbose -Path .\foo
VERBOSE: Perform operation 'Invoke CimMethod' with following parameters, ''methodName' = SendConfigurationApply,'className' = MSFT_DSCLocalConfigurationManager,'namespaceName' = root/Microsoft/Windows/DesiredStateConfiguration'.
VERBOSE: An LCM method call arrived from computer APRILWMF with user sid S-1-5-21-2127521184-1604012920-1887927527-10118509.
VERBOSE: [APRILWMF]: LCM: [ Start Set ]
WARNING: [APRILWMF]: [DSCEngine] Warning LCM is in Debug 'ResourceScriptBreakAll' mode. Resource script processing will be stopped to wait for PowerShell script debugger to attach.
VERBOSE: [APRILWMF]: LCM: [ Start Resource ] [[Registry]r]
VERBOSE: [APRILWMF]: LCM: [ Start Test ] [[Registry]r]
WARNING: [APRILWMF]: [[Registry]r] Resource is waiting for PowerShell script debugger to attach.
Use the following commands to begin debugging this resource script:
Enter-PSSession -ComputerName APRILWMF -Credential <credentials>
Enter-PSHostProcess -Id 3628 -AppDomainName DscPsPluginWkr_AppDomain
Debug-Runspace -Id 4

As you can see, the engine stops and gives you instructions on how to debug. Once you see these instructions, open up another instance of ISE and enter the three commands listed above.

You are in the debugger and have all the power at your fingertips! At this point, you can see the environment by printing out the value of $env:temp for example and see that it points to the ‘PsDscRunAsCredential’ user’s temp folder path. You can step in / step out / set breakpoints – do all the normal debugging tasks that you are familiar with. This is an incredibly powerful way to root cause complex issues during resource development and authoring.

As you may have noticed, the debugger starts with the Test-TargetResource function of the resource. This is because LCM first calls the Test-TargetResource when you apply a configuration. Once you are done debugging, press F5. After that, hit CTRL+C and type ‘exit’. This causes the debugger to come out of the runspace where Test-TargetResource was executing. LCM then continues applying the configuration:

VERBOSE: [APRILWMF]: LCM: [ End Test ] [[Registry]r] in 101.3310 seconds.
VERBOSE: [APRILWMF]: LCM: [ Start Set ] [[Registry]r]
WARNING: [APRILWMF]: [[Registry]r] Resource is waiting for PowerShell script debugger to attach.
Use the following commands to begin debugging this resource script:
Enter-PSSession -ComputerName APRILWMF -Credential <credentials>
Enter-PSHostProcess -Id 3628 -AppDomainName DscPsPluginWkr_AppDomain
Debug-Runspace -Id 3

Executing the instructions from the second ISE window will cause the debugger to hit the beginning of the Set-TargetResource function, and you can continue with your debugging.

‘RunAs’ is a really cool feature in the latest version of WMF. I can imagine it lighting up various new scenarios that were simply not possible before.

Give it a try and share your experiences!


Writing a Concurrent Queue

In this post, I am going to describe writing a non-blocking queue using the Interlocked family of functions, as in my previous post. This time, however, the implementation is in C#. Because C# is garbage collected, the ABA problem should not occur; at least, I did not see it in my testing.

Concurrent Queue

A concurrent queue is basically a queue which provides protection against multiple threads mutating its state and thus causing inconsistencies.

A naive way to implement a concurrent queue is to just slap locks around the parts of its enqueue and dequeue functions that modify the head and tail. On my GitHub page, I have provided a sample implementation of this approach in the class ConcurrentQueueUsingLocks. A more sophisticated implementation may use condition variables as a means for the enqueue and dequeue functions to communicate.

The complexity increases significantly when we try to go lock-free. There are many implementations that you can find on the web for a lock-free queue. In this post, I will show a sample implementation and peek into some of the race conditions that may occur and how the code provides protection against them. In general, simulating race conditions is hard since we don’t have control over when the CPU will schedule a thread. However, if we are inside a debugger, we can ‘freeze’ and ‘thaw’ threads to get an understanding of what may occur in a real world scenario. Below are the enqueue and dequeue functions reproduced for convenience.

Enqueue:

[Screenshot: Enqueue]

Dequeue:

[Screenshot: Dequeue]
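For readers who want something to step through, here is a minimal sketch of a Michael-Scott style non-blocking queue. It is not the C# code from my GitHub page: it is written in C++ with std::atomic instead of the Interlocked functions, and the names LockFreeQueue, try_dequeue and fifo_demo are made up for illustration. Since C++ has no garbage collector, this sketch never frees a dequeued node until the queue itself is destroyed, which sidesteps ABA at the cost of memory that keeps growing.

```cpp
#include <atomic>
#include <cassert>

// Hypothetical sketch: a Michael-Scott style queue built on std::atomic.
// Dequeued nodes stay linked and are freed only in the destructor.
template <typename T>
class LockFreeQueue {
    struct Node {
        T value{};
        std::atomic<Node*> next{nullptr};
    };

    Node* const first_;        // original dummy node; start of the full chain
    std::atomic<Node*> head_;  // current dummy node
    std::atomic<Node*> tail_;  // last node, or the node just before it

public:
    LockFreeQueue() : first_(new Node), head_(first_), tail_(first_) {}
    LockFreeQueue(const LockFreeQueue&) = delete;
    LockFreeQueue& operator=(const LockFreeQueue&) = delete;

    ~LockFreeQueue() {
        // Every node ever allocated is still reachable from first_.
        for (Node* n = first_; n != nullptr;) {
            Node* next = n->next.load();
            delete n;
            n = next;
        }
    }

    void enqueue(const T& value) {
        Node* node = new Node;
        node->value = value;
        for (;;) {
            Node* tail = tail_.load();
            Node* next = tail->next.load();
            if (next != nullptr) {
                // Another enqueue linked its node but has not moved the
                // tail yet: fix the tail for it, then retry.
                tail_.compare_exchange_strong(tail, next);
                continue;
            }
            Node* expected = nullptr;
            if (tail->next.compare_exchange_strong(expected, node)) {
                // The last compare-exchange. If it fails, some other
                // thread has already advanced the tail on our behalf.
                tail_.compare_exchange_strong(tail, node);
                return;
            }
        }
    }

    bool try_dequeue(T& out) {
        for (;;) {
            Node* head = head_.load();
            Node* tail = tail_.load();
            Node* next = head->next.load();
            if (head == tail) {
                if (next == nullptr) return false;  // queue is empty
                // Head and tail match but tail's next is not null:
                // the tail is lagging, so advance it and loop again.
                tail_.compare_exchange_strong(tail, next);
                continue;
            }
            T value = next->value;  // safe: nodes are never freed early
            if (head_.compare_exchange_strong(head, next)) {
                out = value;
                return true;
            }
        }
    }
};

// Single-threaded usage example: values come out in FIFO order.
inline void fifo_demo() {
    LockFreeQueue<int> q;
    int v = 0;
    q.enqueue(10);
    q.enqueue(20);
    assert(q.try_dequeue(v) && v == 10);
    assert(q.try_dequeue(v) && v == 20);
    assert(!q.try_dequeue(v));  // empty again
}
```

The two branches that call compare_exchange_strong on tail_ with next are the "fix the tail" steps that the freeze and thaw experiments below are meant to trigger.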

Let us see an example when a thread might be suspended before it has got the chance to update the tail of the queue. To do so, call the enqueue function twice on two different threads. When the first thread reaches the last CompareExchange call, we ‘freeze’ it:

[Screenshot: FreezeJustBeforeLastCompareExchange]

This means that although the node after the tail is the new node just created, the tail itself has not been modified yet. After freezing this thread, the second thread runs:

[Screenshot: localTailNextUpdated]

As you can see, the tail's next pointer is not null, so the tail is lagging behind the real end of the queue. We atomically fix the tail.

A similar exercise can be performed for the dequeue operation. However, it gets a little tricky there. We need to call enqueue on one thread and, when that thread reaches the last CompareExchange instruction, freeze it and give the other thread, which performs the dequeue, a chance to run (notice the placement of the breakpoints):

[Screenshot: breakpoint placement]

Press F5:

[Screenshot: debugger state after pressing F5]

Although the head and the tail are the same, tail’s next is not null! This is a real possibility when the CPU suspends a thread just before it has had a chance to update the tail and schedules another thread which may perform a dequeue operation. In this case we need to atomically update the tail and continue with our loop.

You can download the code from my GitHub page and play around with the code to simulate the other race conditions that may occur (according to the comments).

Comparison with the blocking queue

To compare the blocking and the non-blocking queues, I wrote a quick and dirty test which apart from measuring the running time, also tests for correctness. The idea is to enqueue a set of known values and then pop them. Since we know the values that we enqueued, we can test for two things:

  1. All values are popped.
  2. No value is popped twice.
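The two checks above can be sketched roughly as follows. This is not my actual test code: it is a C++ illustration, and LockedQueue, popped_exactly_once and run_check are hypothetical names. A std::queue guarded by a mutex stands in for the blocking queue, and the checker is a template so that any queue with enqueue and try_dequeue could be plugged in.

```cpp
#include <atomic>
#include <cassert>
#include <mutex>
#include <queue>
#include <thread>
#include <vector>

// Stand-in for the blocking queue: a std::queue guarded by one mutex.
class LockedQueue {
    std::mutex m_;
    std::queue<int> q_;

public:
    void enqueue(int v) {
        std::lock_guard<std::mutex> lock(m_);
        q_.push(v);
    }
    bool try_dequeue(int& out) {
        std::lock_guard<std::mutex> lock(m_);
        if (q_.empty()) return false;
        out = q_.front();
        q_.pop();
        return true;
    }
};

// Enqueues 0..total-1 from `producers` threads, pops from `consumers`
// threads, and returns true only if every value was popped exactly once.
// Note: a queue that loses values would make the consumers spin forever.
template <typename Queue>
bool popped_exactly_once(Queue& q, int total, int producers, int consumers) {
    std::atomic<int> popped{0};
    std::vector<std::vector<int>> seen(consumers);  // one log per consumer
    std::vector<std::thread> threads;

    for (int p = 0; p < producers; ++p) {
        threads.emplace_back([&q, p, total, producers] {
            // Producer p owns the values p, p + producers, p + 2*producers...
            for (int v = p; v < total; v += producers) q.enqueue(v);
        });
    }
    for (int c = 0; c < consumers; ++c) {
        threads.emplace_back([&q, &popped, &seen, c, total] {
            int v = 0;
            while (popped.load() < total) {
                if (q.try_dequeue(v)) {
                    seen[c].push_back(v);
                    popped.fetch_add(1);
                } else {
                    std::this_thread::yield();
                }
            }
        });
    }
    for (auto& t : threads) t.join();

    // Merge the per-consumer logs and count how often each value appeared.
    std::vector<int> count(total, 0);
    for (const auto& log : seen) {
        for (int v : log) {
            if (v < 0 || v >= total) return false;  // unknown value
            ++count[v];
        }
    }
    for (int c : count) {
        if (c != 1) return false;  // 0 = never popped, >1 = popped twice
    }
    return true;
}

// Example run against the stand-in queue.
inline void run_check() {
    LockedQueue q;
    assert(popped_exactly_once(q, 100000, 4, 4));
}
```

Each consumer writes only to its own log, so the bookkeeping itself needs no extra locking; the logs are merged after all threads have joined.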

After verifying that both versions worked correctly for inputs ranging from one million to 10 million integers, I found that the non-blocking queue was not much faster than the blocking queue when the number of threads was small (2). However, with around 8-10 threads, the non-blocking queue outperformed the blocking queue.

(Sorry my Excel skills are not that great, I would have produced a graph otherwise ;))

Hope you liked this post!

Writing a Concurrent Stack

The stack is one of the most basic data structures and also one of the most widely used. In this post, I am going to take a stab at writing a concurrent stack: one that may be mutated simultaneously by more than one thread. I have provided a sample C++ implementation on my GitHub page.

The nodes of a stack can be represented by the following struct:

template<typename T>
struct Node
{
    T value;
    Node<T>* next;
};

The three main operations performed on a stack are push, pop and top (to peek at the top of the stack).

A typical push operation involves:

  1. Creating a new node.
  2. Setting the next pointer of the new node to the top of the stack
  3. Modifying the pointer to the top of the stack to the new node.

In a concurrent environment, we have a race condition between steps 2 and 3.

Consider a thread T1 which has run till step 2. At this point, another thread, say T2 is scheduled and it runs all the three steps above. This will cause the pointer to the top of the stack to mutate and hold the new value inserted by T2. T2 unwinds and T1 is resumed. T1 has the stale address of the stack top. It runs step 3. After this, we have lost the node inserted by T2. Apart from causing undefined behavior, this will also cause a memory leak since the memory allocated for the node by T2 will never be freed.
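The interleaving above can be replayed deterministically on a single thread. The sketch below is only an illustration (simulate_lost_push and reachable are made-up names, and the nodes live on the C++ call stack so nothing actually leaks); it performs the steps of T1 and T2 in exactly the order described and counts how many nodes are still reachable from the stack top.

```cpp
#include <cassert>

struct Node {
    int value;
    Node* next;
};

// Counts the nodes reachable from the given stack top.
int reachable(const Node* top) {
    int n = 0;
    for (; top != nullptr; top = top->next) ++n;
    return n;
}

// Replays the T1/T2 interleaving in a fixed order on one thread.
// Returns how many nodes are reachable from the stack top at the end.
int simulate_lost_push() {
    Node base{0, nullptr};  // the stack starts with one node
    Node* stackTop = &base;

    Node n1{1, nullptr};    // T1, step 1: create a new node
    Node n2{2, nullptr};    // T2, step 1: create a new node

    n1.next = stackTop;     // T1, step 2; T1 is then suspended

    n2.next = stackTop;     // T2, step 2
    stackTop = &n2;         // T2, step 3
    assert(reachable(stackTop) == 2);  // n2 and base

    stackTop = &n1;         // T1 resumes: step 3 with a stale view
    return reachable(stackTop);        // n1 and base; n2 is gone
}
```

Three nodes exist at the end, but only two can be reached from the stack top: the node pushed by T2 has been silently dropped.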

There are multiple ways to solve this issue. The simplest and the most obvious is to serialize the threads when executing steps 2 and 3 by putting them inside a lock (or some other critical section primitive). However, in a highly concurrent environment, this can impact performance due to high contention.

We can achieve higher performance with non-blocking synchronization based on the compare-and-swap (CAS) pattern. In Win32, the Interlocked family of functions implements this.

Before executing step 3, we need to check whether the stack top was modified. If it was, we loop back and execute step 2 again. If it was not modified, we can safely execute step 3. The check and the update must happen as one atomic operation, which is achieved using the following:

do
{
    newNode->next = stackTop;
} while (InterlockedCompareExchangePointer((volatile PVOID*)&stackTop, newNode, newNode->next) != newNode->next);

A typical pop operation involves:

  1. Caching the pointer to the stack top.
  2. Advancing the stack top to the next node.
  3. Deleting the cached pointer.

Again, there is a race condition between steps 2 and 3. It can cause a double deletion of the same memory address if two threads read the same value for the stack top. Another subtle issue is that there is also a race condition between steps 1 and 2 when the stack has just one element. Consider two threads, say T1 and T2. T1 has executed step 1. At this point, T2 comes in and executes all the steps. At a future instant, the CPU reschedules T1. T1 tries to execute step 2 and dereferences a null pointer (the stack top has been set to null by T2).

We can again use InterlockedCompareExchangePointer to solve the race conditions. We also need to perform a null check between steps 1 and 2:

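Node<T>* topElem;
do
{
    topElem = stackTop;
    if (NULL == topElem)
        return; // empty stack: nothing to pop
    // Swap in topElem->next, not stackTop->next: stackTop may have changed (or become NULL) since it was cached.
} while (InterlockedCompareExchangePointer((volatile PVOID*)&stackTop, topElem->next, topElem) != topElem);
delete topElem;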
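For readers without Win32 at hand, the same push and pop loops can be sketched with std::atomic. This is not the implementation from the GitHub page: LockFreeStack, try_pop and lifo_demo are hypothetical names, and, unlike the snippet above, popped nodes are not deleted right away. They are parked on a second "retired" list and freed only in the destructor, which is a deliberately simple way to make sure no thread ever reads freed memory and to keep ‘ABA’ out of the picture.

```cpp
#include <atomic>
#include <cassert>

// Hypothetical portable counterpart of the Interlocked-based loops.
template <typename T>
class LockFreeStack {
    struct Node {
        T value;
        Node* next;     // link while the node is on the stack
        Node* retired;  // link while the node is on the retired list
    };

    std::atomic<Node*> top_{nullptr};
    std::atomic<Node*> retired_{nullptr};

public:
    LockFreeStack() = default;
    LockFreeStack(const LockFreeStack&) = delete;
    LockFreeStack& operator=(const LockFreeStack&) = delete;

    ~LockFreeStack() {
        // Free whatever is still on the stack...
        for (Node* n = top_.load(); n != nullptr;) {
            Node* next = n->next;
            delete n;
            n = next;
        }
        // ...and every node that was popped earlier.
        for (Node* n = retired_.load(); n != nullptr;) {
            Node* next = n->retired;
            delete n;
            n = next;
        }
    }

    void push(const T& value) {
        Node* node = new Node{value, top_.load(), nullptr};
        // On failure, compare_exchange_weak stores the current top into
        // node->next, so every retry redoes step 2 automatically.
        while (!top_.compare_exchange_weak(node->next, node)) {
        }
    }

    bool try_pop(T& out) {
        Node* top = top_.load();
        // On failure, `top` is refreshed with the current stack top.
        while (top != nullptr &&
               !top_.compare_exchange_weak(top, top->next)) {
        }
        if (top == nullptr) return false;  // empty stack

        out = top->value;
        // Park the node instead of deleting it: another thread may still
        // be reading top->next inside its own retry loop.
        top->retired = retired_.load();
        while (!retired_.compare_exchange_weak(top->retired, top)) {
        }
        return true;
    }
};

// Single-threaded usage example: values come out in LIFO order.
inline void lifo_demo() {
    LockFreeStack<int> s;
    int v = 0;
    s.push(1);
    s.push(2);
    assert(s.try_pop(v) && v == 2);
    assert(s.try_pop(v) && v == 1);
    assert(!s.try_pop(v));  // empty again
}
```

The price of this choice is that memory is only reclaimed when the stack is destroyed; a production implementation would need a real reclamation scheme.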

The implementation provided on the GitHub page has a few other functions which you can try out.

I will soon be writing another post to show the race conditions by putting the code inside the debugger.

NOTE: Most lock-free implementations of a linked list in an unmanaged language like C or C++ suffer from a problem known as ‘ABA’. Since this is a simple introduction to the world of non-blocking synchronization, I have not attempted to solve the ‘ABA’ problem. There is a ton of material out there which you can Bing/Google to find out more about it and how to solve it.

Hope you liked this post!