Create and use a "provisioning agent" for end to end provisioning

Published on 29 August 2012 by damcuvelier

Problem:

Provisioning a device often involves the following steps:

1 - Prepare the hard drive and deploy the OS (either through image deployment or scripted install)

2 - Join the domain and other first start actions

3 - Install the LANDesk Agent

4 - Using the LANDesk Agent, install software deployment packages, patch the system, etc.

There is a known issue that can cause seemingly random software deployment failures in step 4. When the LANDesk agent is installed, it adds entries to the client machine's local scheduler to run a security scan and a policy sync. Because of the way our scheduler works, these tasks become due immediately and launch after a random delay of up to 60 minutes. (Note: This is a default setting and can be changed.)

The problem occurs if the security scan or a policy task has been initiated and is still running when a distribute software provisioning action starts. In that scenario the distribute software action attempts to run its task, detects that the other task is already running, and exits with a failure.

If you have multiple distribute software actions chained together, you will likely see the process fail on different packages from one run to the next. Because the start time includes a random delay, the error may appear early in the process, later on, or not at all.

Description of Local Scheduler Behavior on Agent Deploy:

The LANDesk Local Scheduler service launches a task once a series of filters have been passed. These filters include reaching a certain date and time, day of week, IP address changes, logon events, and others. Our default daily tasks use only the filter that checks whether a certain date and time has already passed. By default the security scan and the policy sync are set to run once every 24 hours with a 1 hour random delay, and the first run time is set to the current date and time. This means that when the agent is deployed the start times of all these tasks have already passed, so the scheduler will launch them immediately, with (by default) up to 1 hour of random delay. Usually this is the desired behavior, because the device will try to bring itself up to date on all tasks from the core, but during provisioning it can cause the problem listed above (see next section for details).
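The time filter described above can be illustrated with a small Python sketch. This is not LANDesk code: the function name first_launch, its parameters, and the way the random delay is applied are assumptions made purely for illustration. It shows why a start time equal to the install time means the task fires within the hour, while a start time in the year 3000 keeps it from firing during provisioning.

```python
from datetime import datetime, timedelta
import random


def first_launch(start, now, max_delay, rng):
    """Return when a task guarded by a 'start time has passed' filter would first run.

    Hypothetical model: the filter passes once 'now' reaches 'start', and a
    random delay of up to 'max_delay' is then added before the launch.
    """
    eligible = max(start, now)
    delay = timedelta(seconds=rng.uniform(0, max_delay.total_seconds()))
    return eligible + delay


install_time = datetime(2012, 8, 29, 9, 0)  # example agent install time
rng = random.Random(0)                       # seeded so runs are repeatable

# Default agent: the first run time equals the install time.
default_launch = first_launch(install_time, install_time, timedelta(hours=1), rng)

# Provisioning agent: the first run time is pushed far into the future.
deferred_launch = first_launch(datetime(3000, 1, 1), install_time, timedelta(hours=1), rng)

print("default agent scan launches after:", default_launch - install_time)
print("provisioning agent scan launches in year:", deferred_launch.year)
```

With the default settings the launch always lands somewhere in the first hour after install, which is exactly the window in which provisioning is still distributing software.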

Cause:

When our software distribution agent (sdclient.exe) runs, it first checks that no other task is running. It looks for other instances of sdclient, for instances of vulscan (the security scanner), and performs a few other checks that aren't relevant here. If it sees sdclient.exe or vulscan.exe already running, it returns an exit code indicating that a task is already in progress and does not run. As described in the previous section, our agent deployment will launch vulscan.exe within an hour of agent deployment. This can cause a provisioning distribute software action to fail.
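As a rough illustration of that check, the sketch below takes a list of running process names and reports which of them would block a new distribution. The real check lives inside sdclient.exe and includes other conditions; the function names and the BLOCKING set here are hypothetical and only mirror the two process names mentioned above.

```python
# Process names that, per the description above, block a new distribution task.
BLOCKING = {"sdclient.exe", "vulscan.exe"}


def blocking_tasks(running_processes):
    """Return the sorted names of running processes that would block a distribution."""
    running = {name.lower() for name in running_processes}  # Windows names are case-insensitive
    return sorted(running & BLOCKING)


def can_start_distribution(running_processes):
    """True when no blocking process is present in the given process list."""
    return not blocking_tasks(running_processes)


print(can_start_distribution(["explorer.exe", "svchost.exe"]))
print(blocking_tasks(["explorer.exe", "VulScan.exe"]))
```

A check like this only describes the state at one instant; it does not prevent the scanner from starting a moment later, which is why the workaround below changes the schedule instead.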

Besides the possibility that another task simply starts before our software deployment agent can run, vulscan.exe (the security scanner) is very persistent. Because it relates to the security of the device, it does not fail immediately if it sees a different task already running; it waits until the other task finishes. This is the most common way for the tasks to fail. The process looks something like this:

1 - Agent gets installed. Security scan is set to trigger within 1 hour.

2 - Software deployment tasks start and are running, as per the provisioning template.

3 - During one distribute software action the security scan is launched by the agent.

4 - The security scanner sees that a task is running, so it remains memory resident waiting for the other task to finish and for sdclient.exe to close.

5 - The distribute software action finishes and sdclient.exe closes until the next action can be started.

6 - The security scanner receives notice that it can now begin to scan, because there are no other currently running software deployment tasks.

7 - The next distribute software action starts, launching sdclient.exe to handle the process.

8 - Sdclient.exe sees that the security scanner is already running, returns a failed status code, and the provisioning template fails.

Because the security scan can start at a different time on each run, you will see the failure at different points in the overall provisioning process, or sometimes not at all.
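The sequence above can be modelled in a few lines of Python to show how the failing action moves with the scan's trigger time. Everything here is a simplified assumption: the action durations, the scan length, the function name first_failed_action, and the idea that actions run back to back with the waiting scanner starting the instant sdclient.exe closes.

```python
def first_failed_action(durations, scan_trigger, scan_minutes=10.0):
    """Return the 1-based index of the first failing action, or None if all succeed.

    durations:    minutes each distribute software action takes (hypothetical)
    scan_trigger: minutes after agent install at which the scan is launched
    scan_minutes: how long the scan runs once it actually starts (hypothetical)
    """
    t = 0.0            # minutes since the agent was installed
    scan_start = None  # time at which the scanner actually begins scanning
    for index, duration in enumerate(durations, start=1):
        # sdclient starts: if the scanner is running, it exits with a failure.
        if scan_start is not None and scan_start <= t < scan_start + scan_minutes:
            return index
        end = t + duration
        # Scan triggered during this action: it waits for sdclient to close.
        if scan_start is None and t <= scan_trigger < end:
            scan_start = end
        t = end
    return None


actions = [10, 15, 5, 20]  # four chained actions, durations in minutes
for trigger in (5, 27, 45, 55):
    print(f"scan triggered at {trigger} min -> first failed action:",
          first_failed_action(actions, trigger))
```

In this model the action that fails is always the one following the action during which the scan was triggered, and a trigger that falls in the last action or after the chain has finished causes no failure at all.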

Resolution:

To work around this known problem you can create an agent to be used for provisioning that does not start the security scan or the policy sync immediately upon installing the agent. We don't have the option to disable the tasks completely, but we can set them to start at some point in the future, such as the year 3000. That should give us enough time.

After all distribute software or patch system tasks have completed you can then finish the provisioning template by installing the agent you would actually like to use on the device long term.

To configure an agent in this way the following settings are recommended:

Change the Policy sync to not run until the year 3000 (or some other point in the distant future), as seen in the screenshots below.

Change the Security scanner to not run until the year 3000 (or some other point in the distant future), as seen in the screenshots below.

Conclusion:

Following the above steps should improve the success rate of end to end provisioning tasks.


The WinOps Blog