Alerting Options using Intel AMT for Workstations

After engaging with a few workstation customers on the introduction of Intel AMT, we discovered that many workstations are considered mission critical and require additional reliability, availability, and serviceability (RAS) support beyond that of typical office automation clients.

What are typical workstation usages?

Based on feedback from our customers, we have found that workstations are used as high-end systems that run critical applications such as CAD/CAM, CAE, and engineering analysis. In some cases, they are used as remote servers. Geographically, these machines can be located locally within a facility or at different sites in the field. Customers are therefore looking for solutions that monitor the system and send alerts when it experiences a critical event. Once received, an alert could trigger a support ticket, which in turn drives an action that resolves the issue. The need is to have these alerting and break-fix solutions as automated as possible.

How did we gather customer feedback on workstation usages?

We carried out Walk the Flow (WTF) discussions with customers to gather their manageability requirements and workstation usages. These discussions with key enterprise customers about the workstation and its integration into the overall compute ecosystem have been very educational. First, a quick overview of what Walk the Flows are: interactive discussions that we hold with key end-user accounts that have deployed Intel vPro systems and technology into their overall IT infrastructure.

Typical WTF discussions touch on workstation usages:

  • Workstation statistics (desktops, mobile, servers)
  • Operating systems used
  • Workstation location (local versus remote sites)
  • Compute model
  • Workstation considerations (futures)

To date, we have had a number of WTF discussions with end users about the new workstations that are being released in the next 6 to 9 months.

One area of interest for workstation customers is the value of improved RAS support for workstations with mission-critical usages. The usages range from brokerage firms using them on stock traders' desks, to real-time engineering analysis of experiments, to oil and gas exploration analysis in the field. The requirements for successful alerting are: (1) filtering for the critical events of interest, (2) sending the events to a central management console/database, (3) automatically generating a service ticket for the event, and (4) having the help desk service the ticket and disposition the request.
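The four requirements can be sketched as a small pipeline. This is only an illustration in Python, not an Intel AMT or ISV API: the `Console` class, the `Ticket` record, the `CRITICAL_EVENTS` set, and the host name are all hypothetical stand-ins, and the "console" here is just an in-memory list.

```python
from dataclasses import dataclass, field

# Hypothetical filter list: the event descriptions a help desk cares about.
CRITICAL_EVENTS = {
    "Over-temperature problem",
    "Generic critical fan failure",
    "OS Boot Failure",
}


@dataclass
class Ticket:
    ticket_id: int
    host: str
    event: str
    status: str = "open"


@dataclass
class Console:
    """In-memory stand-in for a central management console/database."""
    events: list = field(default_factory=list)
    tickets: list = field(default_factory=list)

    def receive(self, host, event):
        # (2) Record the event centrally.
        self.events.append((host, event))
        # (3) Automatically generate a service ticket for it.
        ticket = Ticket(len(self.tickets) + 1, host, event)
        self.tickets.append(ticket)
        return ticket


def handle_event(console, host, event):
    # (1) Keep only the critical events of interest.
    if event not in CRITICAL_EVENTS:
        return None
    return console.receive(host, event)


def disposition(ticket, resolution):
    # (4) The help desk services the ticket and records the outcome.
    ticket.status = resolution
    return ticket
```

In this sketch a non-critical event such as a link-up notification is dropped at step (1), while a boot failure flows through to a ticket that the help desk can later close with `disposition(ticket, "resolved")`.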

Intel vPro with Intel AMT has the answer for workstation alerting. Intel AMT 7.0 supports the alerts listed in the table below, which line up with the critical system issues that customers are interested in.

Type: Sensor Events

Temperature Problems
* Generic critical temperature problem
* Generic temperature warning
* Over-temperature problem
* Over-temperature warning
* Under-temperature problem
* Under-temperature warning

Voltage Problems
* Generic critical voltage problem
* Over-voltage problem
* Under-voltage problem

Fan Problems
* Generic critical fan failure
* Generic predictive fan failure
* Fan speed problem (fan speed below expected speed, cooling still adequate)

Type: Case Intrusion
* Physical security (chassis intrusion)

Type: System FW Errors
* Unrecoverable system board failure
* No bootable media
* Hang during option ROM initialization (specified via watchdog set command)
* Unrecoverable multi-processor configuration mismatch

Type: System FW Progress Events
* System firmware started. The presence of this progress code indicates that at least one CPU is properly executing
* Starting memory initialization and test
* Completed memory initialization and test
* Starting hard disk initialization and test
* Waiting for the user-password entry
* Entering BIOS setup
* Starting system resource configuration
* Starting OS boot process
* Starting option ROM initialization
* Starting secondary processor(s) initialization

Type: Watchdog Events
* System Boot Failure
* OS Boot Failure

Type: AMT Generated Events
* Link Up Event
* Watchdog Events
* Password Attack Events
* Circuit Breaker Event
* Agent Presence Events
* FW update Events
* Bring Up Events (CPU missing, CPU DOA, DIMM missing, BIOS hang)
* User notification alert
* AMT notification alert
* Auditor Notification Alert
* Host Wake Up Notification Alert
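For filtering, it can help to hold the alert table in a form that scripts can query. The sketch below is a hypothetical Python representation of an excerpt of the table above; the `ALERT_CATALOG` name and the `alert_type` helper are illustrative and are not part of Intel AMT.

```python
# Excerpt of the alert table, keyed by alert type (illustrative, not exhaustive).
ALERT_CATALOG = {
    "Sensor Events": [
        "Generic critical temperature problem",
        "Generic critical voltage problem",
        "Generic critical fan failure",
    ],
    "Case Intrusion": ["Physical security (chassis intrusion)"],
    "System FW Errors": ["Unrecoverable system board failure", "No bootable media"],
    "Watchdog Events": ["System Boot Failure", "OS Boot Failure"],
    "AMT Generated Events": [
        "Link Up Event",
        "Password Attack Events",
        "Circuit Breaker Event",
    ],
}

# Reverse index: lower-cased event description -> alert type.
_TYPE_BY_EVENT = {
    description.lower(): alert_type_name
    for alert_type_name, descriptions in ALERT_CATALOG.items()
    for description in descriptions
}


def alert_type(description):
    """Return the alert type for an event description, or None if unknown."""
    return _TYPE_BY_EVENT.get(description.strip().lower())
```

A console could use a lookup like this to route, for example, all "System FW Errors" to one support queue and all "Sensor Events" to another.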

Alerting - Current Options:

The first option is ISV support. Symantec's Altiris Management Console has a Notification Server that can be set up so that a support engineer can:

  • Select and subscribe to specific alerts
  • Have an alert sent to the location defined by the subscription when a hardware event is triggered
  • Respond to a received alert with a defined set of actions
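The subscribe, deliver, and respond model described above can be sketched generically. The Python below is an assumption-laden illustration of that pattern only; it does not use or reproduce the Altiris Notification Server API, and `SubscriptionRegistry`, its methods, and the example destination are hypothetical.

```python
from collections import defaultdict


class SubscriptionRegistry:
    """Hypothetical registry mapping alert names to destinations and actions."""

    def __init__(self):
        # alert name -> list of (destination, actions) subscriptions
        self._subscriptions = defaultdict(list)

    def subscribe(self, alert, destination, actions=()):
        # Select and subscribe to a specific alert.
        self._subscriptions[alert].append((destination, tuple(actions)))

    def trigger(self, alert, host):
        # Deliver the alert to every subscribed destination and run
        # the actions defined for that subscription.
        deliveries = []
        for destination, actions in self._subscriptions.get(alert, []):
            results = [action(host, alert) for action in actions]
            deliveries.append((destination, results))
        return deliveries


def open_ticket(host, alert):
    # Placeholder action; a real deployment would call a ticketing system.
    return f"ticket for {host}: {alert}"
```

For example, after `registry.subscribe("Generic critical fan failure", "helpdesk@example.com", [open_ticket])`, calling `registry.trigger("Generic critical fan failure", "ws-01")` returns one delivery for that destination, while an alert nobody subscribed to returns an empty list.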

The next option is the AMT DTK (AMT Developers Toolkit), which has an AMTEventData class that can be used to customize alerts. The custom alerts can be categorized into FW errors, FW progress events, watchdog events, and AMT-generated events.

The last option is Microsoft's PowerShell scripting tool with the Intel vPro Technology module. With this, an IT administrator can write custom scripts that create alert subscriptions and integrate them into the overall IT support structure. More info on this can be found on this link.

Customized Alerts: On the same alerting topic, customers want to be able to trigger on more specific information. For example, they want to be notified if the temperature or voltage of their workstation has exceeded a threshold they have set.
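A customer-defined threshold of this kind might look like the following minimal sketch. It assumes sensor readings are already available as numbers; the `Threshold` type, the `check_reading` function, the sensor name, and the limits are hypothetical and do not correspond to an Intel AMT interface.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Threshold:
    """Customer-defined limits for one sensor (either bound may be omitted)."""
    sensor: str
    low: Optional[float] = None
    high: Optional[float] = None


def check_reading(threshold, value):
    """Return an alert message if value is outside the limits, else None."""
    if threshold.high is not None and value > threshold.high:
        return f"{threshold.sensor} above threshold: {value} > {threshold.high}"
    if threshold.low is not None and value < threshold.low:
        return f"{threshold.sensor} below threshold: {value} < {threshold.low}"
    return None
```

With `Threshold("cpu_temp_c", high=85.0)`, a reading of 91.5 produces an alert message and a reading of 60.0 produces none; the same structure covers an under-voltage limit by setting `low` instead of `high`.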

Questions to the reader:

Are there additional requirements for workstations? What usages are driving the needs? Are you using other tools? If you are using the DTK or the Intel vPro Technology module with Microsoft's PowerShell scripting tool, is there a library that contains working sample scripts that customers can take and use?