Based on diags release 1.44
Stig Telfer, Alpha Processor Inc, 8 December 2000
Some API motherboards, starting with the UP1000, carry ROM-resident firmware that tests the board and attempts to diagnose hardware defects. This firmware is broadly known as the Alpha Diagnostic Environment. Its objective is to report low-level diagnostics in detail while remaining simple enough to require minimal user training. The Alpha architecture has design features that enable extremely powerful low-level diagnostic access.
The diagnostic environment has been developed with several design goals:
First-pass manufacturing self-test. Once a board's circuits are validated electronically, a power-on test is made using the diagnostics. The goal is to probe-test all the embedded slots and devices, and stress-test the memory and IO systems. The interface must be as clear as possible while still providing enough detail when an error occurs.
Field hardware diagnosis. For field hardware failures, the diagnostics provide a possible mechanism for fault isolation. The system's strength is its ability to interact with the hardware components individually. Since the only hardware dependency is a functional CPU, errors in critical components (such as level-2 cache, system bus) can be analysed in an interactive environment. Failure recovery is a priority when developing for this uncertain environment.
Reseller hardware verification. Once a bare board has been shipped, a reseller (or technically sophisticated end-user) may wish to verify peripherals or memory that they are adding to the board. The diagnostic environment provides this opportunity, as a secondary function.
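The memory stress test named under the manufacturing goal is not described in detail in this document. As a rough illustration of the kind of check involved, the sketch below runs a walking-ones pattern against a simulated memory word and reports any data bits that fail to read back. The `SimulatedMemory` class, the `walking_ones` function and the 32-bit word width are all assumptions made for the example; they are not taken from the diagnostics source.

```python
from typing import List, Optional


class SimulatedMemory:
    """Word-addressed memory model with an optional stuck-at-zero data bit.

    Hypothetical stand-in for real hardware, used only for illustration.
    """

    def __init__(self, size: int, stuck_low_bit: Optional[int] = None) -> None:
        self._cells = [0] * size
        self._mask = 0xFFFFFFFF
        if stuck_low_bit is not None:
            # Clear one bit of the mask so that bit can never be stored as 1.
            self._mask &= ~(1 << stuck_low_bit)

    def write(self, addr: int, value: int) -> None:
        self._cells[addr] = value & self._mask

    def read(self, addr: int) -> int:
        return self._cells[addr]


def walking_ones(mem: SimulatedMemory, addr: int, width: int = 32) -> List[int]:
    """Write a single 1 bit in each position and list the bits that fail."""
    bad_bits = []
    for bit in range(width):
        pattern = 1 << bit
        mem.write(addr, pattern)
        if mem.read(addr) != pattern:
            bad_bits.append(bit)
    return bad_bits
```

A healthy model returns an empty list, while `walking_ones(SimulatedMemory(16, stuck_low_bit=5), 3)` isolates bit 5. Reporting the failing bit positions rather than a bare pass or fail is in keeping with the stated goal of providing enough detail in cases of error.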
The Diagnostic firmware also offers extended functionality in other directions. Principally, it adds the ability to boot an embedded operating system kernel (Linux) directly from ROM, bypassing the need for a BIOS to set up and control a boot device. This feature (referred to as DBLX) has some clear advantages in specific situations:
The Diagnostic firmware has added flexibility through the ability for the advanced user to download and execute test applets. In this way, tests conceivably could be developed to meet specific problems in the field that are not covered by the standard firmware release.
High-level faults such as hard disk corruptions or operating system related issues are beyond the scope of this system. Complementary OS-level tests should be used.
The firmware evolved from the source tree of the DEC Debug Monitor (DBM), but has moved away from being a development environment towards being suitable for unskilled users. The design goal behind this shift is a tool that is usable for diagnostic work in the field.
The Reset PALcode. In EV6 Alpha systems, the SROM (serial ROM) contains the power-on internal CPU initialisation code. It is directly connected to the processor and is loaded when the CPU first comes out of reset. This component is capable of reporting details of failures in the level-2 cache, system bus, memory controller, system memory, PCI bus, southbridge and flash ROM.
The native mode Alpha Diagnostics. In the Alpha architecture, "native mode" refers to execution outside of PAL mode. This level depends on various system components functioning correctly. When the SROM code deduces that the system is capable of supporting the higher-level environment, it fetches this image from the flash ROM and transfers control to it. This flash-resident firmware image operates in an environment that is analogous to the early stages of an x86 BIOS.
The diagnostics are built as these two modules because the low-level reset PALcode executes in an environment that does not assume the presence of any devices external to the EV6 core, including memory. For this reason, it is carefully written in PAL mode assembler. The high-level native diagnostics module is written in C and enjoys a richer execution environment. In particular, it depends on a memory system that is at least intermittently reliable for stack, data and code.
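The hand-off between the two modules can be modelled as an ordered series of checks that stops at the first failure. The following is only a sketch of that decision logic, not the firmware itself (which the text says is PAL mode assembler and C). The component names come from the list of failures the reset PALcode can report; using them as the probe order, and the `probe` callback, `run_reset_stage` and `boot` names, are assumptions for illustration.

```python
from typing import Callable, List, Optional, Tuple

# Components the reset PALcode is described as able to report on.
# Treating this as the actual probe order is an assumption.
STAGES: List[str] = [
    "level-2 cache",
    "system bus",
    "memory controller",
    "system memory",
    "PCI bus",
    "southbridge",
    "flash ROM",
]


def run_reset_stage(probe: Callable[[str], bool]) -> Tuple[bool, Optional[str]]:
    """Probe each component in order and stop at the first failure.

    `probe` is a hypothetical callback that returns True when the named
    component passes. Returns (ok_to_load_native, first_failed_component).
    """
    for name in STAGES:
        if not probe(name):
            return False, name
    return True, None


def boot(probe: Callable[[str], bool]) -> str:
    """Describe where this model ends up after the reset stage."""
    ok, failed = run_reset_stage(probe)
    if ok:
        return "load native diagnostics from flash"
    return f"stay in reset PALcode: {failed} failed"
```

For example, `boot(lambda name: name != "system memory")` returns `"stay in reset PALcode: system memory failed"`, reflecting the point that the native module is only entered once the components it depends on have been checked.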
PALcode (Privileged Architecture Library). The Alpha implements a scheme which replaces microcode with software (PALcode) that executes in a special CPU mode (PAL mode). This gives the diagnostic system the flexibility to catch many machine interactions that could not be caught on other architectures.
Soft UART. Alpha processors implement a serial communications link directly connected to the processor. This link can be used to report and interact with the user, even in the earliest stages of initialisation.
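A software-driven UART has to build each serial frame bit by bit. The sketch below shows the framing step only, assuming conventional 8N1 asynchronous framing (one start bit, eight data bits sent least significant bit first, one stop bit). The document does not state the line format used by the diagnostics, and the function names here are hypothetical.

```python
from typing import List


def frame_byte(value: int) -> List[int]:
    """Return the 10 line levels for one byte: start, 8 data bits, stop."""
    if not 0 <= value <= 0xFF:
        raise ValueError("value must fit in one byte")
    data_bits = [(value >> i) & 1 for i in range(8)]  # LSB first
    return [0] + data_bits + [1]  # start bit is 0, stop bit is 1


def unframe(bits: List[int]) -> int:
    """Recover the byte from one 10-bit frame, checking start and stop bits."""
    if len(bits) != 10 or bits[0] != 0 or bits[9] != 1:
        raise ValueError("framing error")
    return sum(bit << i for i, bit in enumerate(bits[1:9]))


def frame_message(text: str) -> List[int]:
    """Frame every byte of an ASCII message back to back."""
    levels: List[int] = []
    for byte in text.encode("ascii"):
        levels.extend(frame_byte(byte))
    return levels
```

In a real soft UART each of these levels would be driven onto the transmit pin for one bit period; the timing loop is omitted here because it depends on hardware details the document does not give.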
At the time of writing, Alpha Diagnostics is available for every API platform with the exception of the DP264. However, it is not shipped by default in every platform's firmware configuration.
Platforms shipped with Alpha Diagnostics include:
Platforms that support Alpha Diagnostics but ship without it include:
For the platforms listed above, it is not difficult to upgrade the diagnostic firmware in the field.