API motherboards, starting with the UP1000 and UP2000, have the ability embedded in their ROMs to test themselves and attempt to diagnose hardware failures. This document describes how to use this ability and what it is capable of.
On the UP1000, the Alpha Diagnostic Environment (ADE) is invoked every time you power on the machine. It runs quickly, before the next-level firmware (AlphaBIOS or SRM console, depending on your configuration), and you do not normally notice it except for a sequence of beeps (the "happy chime").
However, it can be called in an interactive way to perform detailed diagnostics and stress-tests of the system, or to recover damaged ROMs on systems that no longer run the AlphaBIOS. To do this, a jumper must be set on the motherboard.
When Diags is invoked in this way, the first thing you see is a welcome screen. If you have a video card plugged in, it will be on the local monitor. If not, it will be on the SROM UART diagnostic serial port. This port is a jumper block on the motherboard.
From here, a full test of the system can be started by the command 'go'. This test sets up the motherboard and runs a sequence of tests designed to stress the main system components. Then it may attempt to start the AlphaBIOS to run operating system tests.
Diags has two forms of output that you can control and direct: interactive console output and log output.
To control these two forms of output, there are two commands, tty (for console output) and log (for log output). Each takes an argument that describes the new destination for that output, which can be any of the following:
srom | The SROM debug port. This link is a jumper block on the motherboard. This port has very little hardware dependency, so use it in situations where the hardware is very unreliable. The baud rate used here is the same as that used by the SROM mini-debugger (if invoked), or is calculated automatically. If in doubt, set the terminal program (e.g. HyperTerminal) to 115K baud, 8 data bits, 1 stop bit and no flow control. |
com1 | The first serial port, which is set up at a baud rate of 115K |
com2 | The second serial port, which is set up at a baud rate of 115K |
com3 | The third serial port. This port is not present unless an expansion card is plugged in. It is set at a baud rate of 115K |
com4 | The fourth serial port. This port is not present unless an expansion card is plugged in. It is set at a baud rate of 115K |
lpt1 | The first parallel port. |
lpt2 | The second parallel port. This port is not present unless an expansion card is plugged in. |
local | Local screen and keyboard. Both a keyboard and a VGA-compatible graphics card must be connected for this option. |
Examples of usage:
There are two ways of invoking a command in Diags: choosing its number from the menu, or typing the command name at the prompt.
A simple command is the PC-compatibility devices test. This test initialises and briefly tests various system components that conform to the standard PC IO-bus interface. To run this test, you can either choose the menu option ('2') or type the command name ('isa'). In this case both have the same effect.
You will see the isa test running on the console, and if you have enabled log output then you will see some more detailed information on the log stream.
All modern computers hold the system date and time in a piece of battery backed-up electronics called the CMOS Real-Time Clock, sometimes shortened to CMOS or RTC. This is a clock that ticks even when the computer is switched off to keep track of time.
To read the system time, the cmos command can be used, as shown here being read over a serial console:
+-------------------Alpha Diagnostics 1.44pre4 (Oct 31 2000)-------------------+
|                                                                              |
|  TEST OPTIONS                               SUBSYSTEM STATUS                 |
|  1) Run full test set                       Memory......Set up               |
|  2) Scan PCI buses                          PCI I/O.....Set up               |
|  3) Test integrated peripherals             AGP I/O.....Unknown              |
|  4) Test interrupt mechanisms               ISA I/O.....Set up               |
|  5) Stress memory                           Video.......Unknown              |
|  6) Reset system                            Interrupts..Unknown              |
|  7) Exit Alpha Diagnostics                                                   |
|                                                                              |
|                                                                              |
|                                                                              |
|                                                                              |
|                                                                              |
|                                                                              |
|                                                                              |
|                                                                              |
|+-----------------------------Date/Time settings-----------------------------+|
||Date & time are set to: Fri Nov 3 12:35:57 2000                             ||
||                                                                            ||
|+------------------------------Press Any Key...------------------------------+|
| Swordfish> cmos...                                                           |
+------------------------------------------------------------------------------+
To write a new system date and time, the command is cmos set. The current date and time will be displayed. The new date and time must be entered in exactly the same format, including the case of the letters, as shown in the next example:
Date/Time is currently set to: Fri Nov 3 12:37:47 2000
Time is in 24-hour format.
+--------------------------Date/Time update complete-------------------------+
|Clock now set to: Fri Nov 3 12:20:01 2000                                   |
|                                                                            |
+------------------------------Press Any Key...------------------------------+
Swordfish> Fri Nov 3 12:20:00 2000
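Because the new date must match the displayed format exactly, it can help to generate the string rather than type it by hand. The following is a minimal Python sketch, not part of Diags. It assumes the format is abbreviated weekday, abbreviated month, unpadded day, 24-hour time and four-digit year, as it appears in the example above; the helper name diags_date_string is hypothetical, and Diags may pad single-digit days differently.

```python
from datetime import datetime

# English abbreviations are spelled out so the result does not depend on locale.
DAYS = ["Mon", "Tue", "Wed", "Thu", "Fri", "Sat", "Sun"]
MONTHS = ["Jan", "Feb", "Mar", "Apr", "May", "Jun",
          "Jul", "Aug", "Sep", "Oct", "Nov", "Dec"]


def diags_date_string(when: datetime) -> str:
    """Format a datetime in the style shown by cmos set (assumed layout)."""
    return (f"{DAYS[when.weekday()]} {MONTHS[when.month - 1]} {when.day} "
            f"{when:%H:%M:%S} {when.year}")


# Build the string for the time used in the example above.
print(diags_date_string(datetime(2000, 11, 3, 12, 20, 0)))
```

The output can then be pasted at the prompt through the terminal program.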
SRM console and AlphaBIOS have different mechanisms for storing the system clock year value in CMOS. They differ in the year from which counting starts, which is referred to as the 'epoch year'. Alpha Diagnostics is capable of working with both systems. To change from one system to the other, the cmos epoch command is used.
+--------------------Alpha Diagnostics 1.4-6 (Feb 21 2001)---------------------+
|                                                                              |
|  TEST OPTIONS                               SUBSYSTEM STATUS                 |
|  1) Run full test set                       Memory......Set up               |
|  2) Scan PCI buses                          PCI I/O.....Set up               |
|  3) Test integrated peripherals             ISA I/O.....Set up               |
|  4) Test interrupt mechanisms               Video.......Unknown              |
|  5) Stress memory                           Interrupts..Unknown              |
|  6) Reset system                            SMP.........Unknown              |
|  7) Exit Alpha Diagnostics                  I2C.........Set up               |
|+------------------------------Change Epoch year-----------------------------+|
||Available Epoch year options are (current 1952):                            ||
||  1: SRM console firmware epoch (1952)                                      ||
||  2: ARC/AlphaBIOS firmware epoch (1980)                                    ||
||  3: MS-DOS epoch [not normally used on Alpha] (1900)                       ||
||  4: Quit without changing epoch year                                       ||
||Please choose an option:                                                    ||
|+----------------------------------------------------------------------------+|
|                                                                              |
|                                                                              |
|                                                                              |
| Type 'help' for command summary                                              |
| Swordfish> cmos epoch...                                                     |
+------------------------------------------------------------------------------+
The required year encoding scheme can be chosen by selecting from the numbered list provided.
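To see why the epoch year matters, consider what happens when one firmware writes the year and another reads it. The Python sketch below is illustrative only: it assumes the stored value is simply the number of years since the epoch year, which this document does not confirm, and the function names are hypothetical. The epoch years themselves are taken from the menu above.

```python
# Epoch years as listed by the cmos epoch menu.
EPOCH_YEARS = {"srm": 1952, "arc": 1980, "msdos": 1900}


def encode_year(year: int, firmware: str) -> int:
    """Return the offset stored in CMOS (assumed: years since the epoch)."""
    offset = year - EPOCH_YEARS[firmware]
    if offset < 0:
        raise ValueError(f"{year} is before the {firmware} epoch year")
    return offset


def decode_year(stored: int, firmware: str) -> int:
    """Return the calendar year a firmware would read back (assumed encoding)."""
    return EPOCH_YEARS[firmware] + stored


stored = encode_year(2000, "srm")
# The same stored value read under two different epoch settings.
print(stored, decode_year(stored, "srm"), decode_year(stored, "arc"))
```

Under these assumptions, a year written with the SRM epoch and read with the AlphaBIOS epoch is wrong by 28 years, which is the kind of mismatch the cmos epoch command is there to correct.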
Alpha Diagnostics, the API NetWorks Linux Flash Driver and SRM console all share the notion of a common system NVRAM. This space is used for storing system settings and non-volatile environment variables that influence firmware operations, such as OS booting.
Alpha Diagnostics can be used to view and change the NVRAM variables, which may affect terminal operation, boot procedures or influence the operation of SRM console.
The main interface to the system NVRAM is via the 'nvram' command.
+---------------------System Firmware Environment Settings---------------------+
|                                                                              |
|  0: diags_action     = diags                                                 |
|  1: auto_action      = [No value set in NVRAM]                               |
|  2: boot_dev         = [No value set in NVRAM]                               |
|  3: bootdef_dev      = rom                                                   |
|  4: booted_dev       = rom                                                   |
|  5: boot_file        = [No value set in NVRAM]                               |
|  6: booted_file      = [No value set in NVRAM]                               |
|  7: boot_osflags     = nfsroot=192.168.0.12:/opt/exports/pugwash             |
|  8: booted_osflags   = nfsroot=192.168.0.12:/opt/exports/pugwash             |
|  9: boot_reset       = [No value set in NVRAM]                               |
| 10: dblx_use_initrd  = [No value set in NVRAM]                               |
| 11: enable_bootmem   = ON                                                    |
| 12: bootmem_alloc_mb = 1                                                     |
| 13: tty_dev          = com1                                                  |
| 14: com1_baud        = 115200                                                |
| 15: shutdown_temp    = [No value set in NVRAM]                               |
| 16: therm_hyst       = [No value set in NVRAM]                               |
|                                                                              |
|                                                                              |
| Enter number of the setting to change (press RETURN to quit):                |
| NVRAM>                                                                       |
+------------------------------------------------------------------------------+
Using the nvram command, environment variables may be changed by selecting the desired variable by number and entering a new value when prompted.
It is possible that the firmware images in the ROM can become corrupted, either by failing hardware, an aborted upgrade, or errant software. In anticipation of this, Diags provides a mechanism for recovering the system firmware. This module is sometimes referred to as a fail-safe booter (FSB).
To invoke the FSB, either power up the system with the FSB jumper in position, or start Diags and type the 'flash' command.
The process is completed in three steps, invoked in this sequence:
After scanning, the user is presented with a map of the system ROM, which contains several varieties of firmware. The FSB is capable of updating any of these images.
After the reflash operation, the ROM is rescanned to verify the newly programmed image. There are several other commands that do not form part of the standard firmware upgrade procedure, and do not appear on the menu:
Asset information EEPROMs are provided on all API NetWorks motherboards and processor modules, although the exact number and configuration of the EEPROMs depends on the product used. The EEPROMs are not used by API NetWorks and are supplied in their blank state. Customers can use these EEPROMs however they choose to do so.
The last byte of each EEPROM is reserved by API NetWorks for hardware testing purposes and its usage should be avoided if possible.
The Alpha Diagnostics asset command is used to examine and edit the contents of all asset information EEPROMs in the system. When first viewed, all asset information EEPROMs will be in their blank state.
+------------------Motherboard and Processor Asset Information-----------------+
|Asset Information EEPROMs in this system:                                     |
|                                                                              |
| 0) Primary Module asset info   : ................................            |
| 1) Secondary Module asset info : ................................            |
| 2) Motherboard asset info      : ................................            |
|                                                                              |
|                                                                              |
|                                                                              |
|                                                                              |
|                                                                              |
|                                                                              |
|                                                                              |
|                                                                              |
|                                                                              |
|                                                                              |
|                                                                              |
|                                                                              |
|                                                                              |
|                                                                              |
|                                                                              |
| Please select an EEPROM to examine or modify (or RETURN to quit):            |
| EEPROM>                                                                      |
+------------------------------------------------------------------------------+
Next to each listed EEPROM are a few bytes of its contents. In the screenshot above all the EEPROMs are listed as being blank. To view an entire EEPROM, select it by number from the list.
+------------------Motherboard and Processor Asset Information-----------------+
|Asset Information EEPROMs in this system:                                     |
|                                                                              |
| 0) Primary Module asset info   : ................................            |
| 1) Secondary Module asset info : ................................            |
| 2) Motherboard asset info      : ................................            |
|                                                                              |
|                                                                              |
|                                                                              |
|                                                                              |
|+---------------------------Motherboard asset info---------------------------+|
||                                                                            ||
||      ................................................................      ||
||      ................................................................      ||
||      ................................................................      ||
||      ................................................................      ||
||                                                                            ||
|+----------------------------------------------------------------------------+|
|+-----------------------------Modify this EEPROM?----------------------------+|
||Press 'y' to modify the contents of this EEPROM.                            ||
||Press any other key to return.                                              ||
|+------------------------------Press Any Key...------------------------------+|
| EEPROM> 2                                                                    |
+------------------------------------------------------------------------------+
The full contents of the EEPROM are displayed (in this case, 256 bytes). To change the contents of the EEPROM, choose 'y' as indicated to enter the editing mode.
+--------------------Edit Asset Information EEPROM Contents--------------------+
|EEPROM: Motherboard asset info                                                |
|                                                                              |
|Original Contents:                                                            |
|                                                                              |
|       ................................................................       |
|       ................................................................       |
|       ................................................................       |
|       ................................................................       |
|                                                                              |
|                                                                              |
|New Contents:                                                                 |
|                                                                              |
|       API NetWorks UP2000 Motherboard.................................       |
|       ................................................................       |
|       ................................................................       |
|       ................................................................       |
|                                                                              |
|                                                                              |
|                                                                              |
|                                                                              |
|         Use cursor keys to navigate, press RETURN to finish editing          |
|                                                                              |
+------------------------------------------------------------------------------+
The new asset information is entered by overtyping the old data using a simple editor. When satisfied with the contents, press RETURN to commit the new information to the asset EEPROM.
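If asset strings are prepared in advance (for example, from an inventory database), it is worth checking that they fit and that they stay clear of the reserved last byte. The Python sketch below only models the EEPROM contents in memory and does not talk to any hardware. The 256-byte size comes from the example above; the blank value 0xFF and the helper name build_asset_image are assumptions.

```python
EEPROM_SIZE = 256   # size of the motherboard EEPROM in the example above
BLANK = 0xFF        # assumed value of an unprogrammed byte


def build_asset_image(text: str, size: int = EEPROM_SIZE,
                      reserved_byte: int = BLANK) -> bytes:
    """Lay out asset text in an EEPROM image, leaving the last byte alone."""
    data = text.encode("ascii")
    usable = size - 1                     # the last byte is reserved for testing
    if len(data) > usable:
        raise ValueError(f"asset text needs {len(data)} bytes, only {usable} usable")
    padding = bytes([BLANK]) * (usable - len(data))
    return data + padding + bytes([reserved_byte])


image = build_asset_image("API NetWorks UP2000 Motherboard")
print(len(image), image[:31])
```

Keeping the reserved byte as a separate parameter makes it explicit that user data never overwrites it.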
Read asset information EEPROMs on motherboard and processor modules. The exact number and configuration of those EEPROMs depends on the product used. The asset information EEPROMs are for user purposes and are supplied blank.
See Viewing and Changing Asset Information for more details.
The default action for this command is to search the ROM for either AlphaBIOS or SRM console firmware and transfer to it.
Firmware of a particular type can be scanned for by passing a numerical Firmware ID. Currently-defined firmware IDs are:
0 | Debug monitor firmware. Note: this firmware may not be accessible from diagnostics because it operates from the same address range in memory. |
1 | AlphaBIOS/ARC firmware for Windows NT |
2 | SRM console firmware for Linux/Tru64/OpenVMS |
6 | Fail-safe booter. Note: this firmware may not be accessible from diagnostics because it operates from the same address range in memory |
7 | MILO, Linux loader |
8 | VxWorks Real-time OS |
9 | Diagnostics |
10 | SROM microbios. This firmware is in a format that is not loadable from diagnostics |
11 | Embedded Linux kernel. This firmware is not in a format that can be directly loaded by the bios command. Use the linux command instead. |
12 | CBOX settings table. This is Alpha CPU configuration data and cannot be executed |
13 | OSF PALcode. Used during the Linux loading process. This firmware is not in a format that can be successfully loaded and started by the bios command, but is required by the linux command for embedded booting |
A firmware .rom file can be specified to load next-level firmware from a DOS-format floppy disk.
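The firmware ID table can be captured in a small lookup, which is handy when scripting around Diags output. This Python sketch is not part of Diags; the names and the two sets are a reading of the notes in the table above, and the function bios_can_load is hypothetical.

```python
# Firmware IDs and names as listed in the table above.
FIRMWARE_IDS = {
    0: "Debug monitor",
    1: "AlphaBIOS/ARC",
    2: "SRM console",
    6: "Fail-safe booter",
    7: "MILO",
    8: "VxWorks",
    9: "Diagnostics",
    10: "SROM microbios",
    11: "Embedded Linux kernel",
    12: "CBOX settings table",
    13: "OSF PALcode",
}

# IDs the table describes as not loadable or not executable from diagnostics.
NOT_LOADABLE = {10, 11, 12, 13}
# IDs the table says may be inaccessible because they share an address range.
MAY_BE_INACCESSIBLE = {0, 6}


def bios_can_load(firmware_id: int) -> bool:
    """Return True if the table gives no reason the bios command cannot load it."""
    if firmware_id not in FIRMWARE_IDS:
        raise KeyError(f"unknown firmware ID {firmware_id}")
    return firmware_id not in NOT_LOADABLE


for fw_id, name in FIRMWARE_IDS.items():
    note = " (may be inaccessible)" if fw_id in MAY_BE_INACCESSIBLE else ""
    print(fw_id, name, bios_can_load(fw_id), note)
```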
Commands relating to the CMOS real-time clock. cmos without arguments displays the current system date and verifies that the clock is ticking.
See the section about changing the system date and time for more information about the cmos command.
Display 128 bytes of CMOS memory, which contain the system date and other information about the system state.
Verify that the real-time clock is generating periodic interrupts, and return the count of such interrupts received since enabling the clock.
Set a new date and time for the CMOS clock. See the section about changing the system date and time for more information about the cmos set command.
Test the calibration of the CMOS interval timer modes and report on their accuracy.
Change the epoch year (the year from which the firmware counts years; AlphaBIOS and SRM console use different epoch years).
Download a code module that can execute within Diags. The creation and usage of diagnostic applets is not covered by this document.
This command is only enabled in factory and DVT releases of Alpha Diagnostics.
Test all asset information EEPROMs in the system. The exact number and configuration of these EEPROMs depends on the product used.
The test consists of reading all the EEPROM contents. Writeability is verified using the last byte in the EEPROM. The last byte in the EEPROM is reserved for this purpose and its usage should be avoided if possible.
See Viewing and Changing Asset Information for more details about asset information EEPROMs.
Read all NVRAM environment settings stored in the system. The env command lists all variables found in the NVRAM, including any you have created yourself, whereas the nvram command lists only the pre-defined variables that have meaning in Diags or SRM console. See also nvram and set.
For further information on working with NVRAM, see the section Viewing and Changing the System NVRAM.
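The difference between the env and nvram views can be modelled as two ways of reading the same store. The Python sketch below is only an illustration of that distinction: the pre-defined names are copied from the nvram screen shown earlier, while the dictionary store, the function names and the user variable site_tag are hypothetical.

```python
# Pre-defined variable names, as listed on the nvram screen.
PREDEFINED = [
    "diags_action", "auto_action", "boot_dev", "bootdef_dev", "booted_dev",
    "boot_file", "booted_file", "boot_osflags", "booted_osflags", "boot_reset",
    "dblx_use_initrd", "enable_bootmem", "bootmem_alloc_mb", "tty_dev",
    "com1_baud", "shutdown_temp", "therm_hyst",
]


def nvram_view(store: dict) -> dict:
    """Pre-defined variables only; unset ones are reported as None."""
    return {name: store.get(name) for name in PREDEFINED}


def env_view(store: dict) -> dict:
    """Everything found in the store, including user-created variables."""
    return dict(store)


store = {"tty_dev": "com1", "com1_baud": "115200", "site_tag": "lab-3"}
print(sorted(env_view(store)))
print(nvram_view(store)["tty_dev"], nvram_view(store)["auto_action"])
```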
Flash ROM management. The usage of this command is covered in the description of the fail-safe booter (FSB).
A test of the floppy disk. The floppy uses direct memory access (DMA). Note: this test will destroy any data stored on the floppy disk!
Run a full set of tests. This is the standard hardware qualification procedure.
A summary list of available commands.
Initialise the platform I2C bus and poll all I2C system health monitors. The sensors available depend on the motherboard design under test. Typical sensors include temperature sensors, fan RPM sensors and voltage sensors. If a system health monitor reads beyond its threshold values, a warning is issued and the test stops. The default number of test passes is 10.
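The polling behaviour described here (a fixed number of passes, stopping at the first reading outside its thresholds) can be sketched as follows. This is a Python model of the control flow only, not the Diags implementation; the Monitor type, the sensor names and the threshold values are all hypothetical.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional


@dataclass
class Monitor:
    name: str
    read: Callable[[], float]   # returns the current reading
    low: float                  # lower threshold
    high: float                 # upper threshold


def poll_monitors(monitors: List[Monitor], passes: int = 10) -> Optional[str]:
    """Poll every monitor on each pass; return a warning and stop on the first
    out-of-range reading, or return None if all passes complete."""
    for n in range(1, passes + 1):
        for m in monitors:
            value = m.read()
            if not (m.low <= value <= m.high):
                return f"pass {n}: {m.name} reading {value} outside {m.low}..{m.high}"
    return None


healthy = [Monitor("cpu_temp", lambda: 45.0, 0.0, 70.0)]
stalled = [Monitor("fan_rpm", lambda: 0.0, 1000.0, 6000.0)]
print(poll_monitors(healthy), poll_monitors(stalled))
```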
IDE block read/write test. Note: an IDE device must be connected for this test. This test is not implemented in current versions of Diags.
Read a byte from PC-compatibility IO-space. With these routines individual IO-space locations can be peeked and poked. See also out.
Alpha Diagnostics records useful information about hardware configuration which can be viewed using the info command.
The information gathered is sorted into subject categories, such as processor, memory, firmware, chipset, and motherboard information. Each item consists of an attribute (for example, processor clock frequency), a value (for example, CPU 0: 833 MHz), and a source of the information (for example, motherboard jumper configuration). Often the same information can come from different sources, so viewing all sources of information side-by-side can be a useful way of identifying discrepancies in the hardware configuration.
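The idea of comparing sources for the same attribute can be shown with a few records. The Python sketch below is an illustration only; the record tuples, source names and values are made up, and the function name find_discrepancies is hypothetical.

```python
from collections import defaultdict


def find_discrepancies(records):
    """Group (attribute, value, source) records by attribute and return the
    attributes for which different sources report different values."""
    by_attribute = defaultdict(dict)
    for attribute, value, source in records:
        by_attribute[attribute][source] = value
    return {
        attribute: sources
        for attribute, sources in by_attribute.items()
        if len(set(sources.values())) > 1
    }


# Hypothetical records in the attribute / value / source form described above.
records = [
    ("processor clock frequency", "833 MHz", "motherboard jumper configuration"),
    ("processor clock frequency", "750 MHz", "measured"),
    ("memory size", "512 MB", "memory controller"),
    ("memory size", "512 MB", "DIMM data"),
]
print(find_discrepancies(records))
```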
Change the interrupt priority level. This mechanism enables or disables interrupts on a scale from IPL 0 (all enabled) to IPL 7 (all disabled).
ipl with no arguments returns the current interrupt priority level.
Interrupt routing test. Many system devices generate interrupts to interact with the processor and this test attempts to test all those mechanisms.
PC-compatibility devices probing and testing. Various motherboard components are probed for, initialised and tested.
Fetch and boot a Linux kernel. The kernel and PALcode may be ROM-resident, or can alternatively be supplied either on DOS formatted floppy or transferred by serial link using the Xmodem transfer protocol.
When downloading via a serial link and Xmodem, or loading from DOS-format floppy disk, the kernel should be in the standard (gzipped ELF, as in vmlinux.gz) format. The PALcode should be a raw binary file (without ROM header).
When loading from ROM, both PALcode and kernel must have appropriate ROM headers. Using the DBLX developers kit, a kernel in the expected format is produced by the make romboot command.
With the -nosmp option, SMP startup is skipped. This may be useful, for example, on a board with broken SMP support, or when the secondary processors must be left in the SROM state (for example, when using Linux BootMem functionality to load SRM console by ROM-in-RAM).
Enable logging output on the specified device. The device argument can be any of: srom, local, com1, com2, com3, com4, lpt1, lpt2, rom.
See also tty.
Repeat a command, either for an (optional) number of iterations, or indefinitely. On failure, the loop will stop.
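The repeat semantics (optional iteration count, otherwise indefinite, stop on the first failure) can be expressed compactly. The Python sketch below models that behaviour for any callable that returns True on success; it is not the Diags implementation, and the function name repeat is chosen only for illustration.

```python
from typing import Callable, Optional


def repeat(command: Callable[[], bool], iterations: Optional[int] = None) -> int:
    """Run command until it fails or the iteration count is reached.
    With iterations=None the loop only ends on failure.
    Returns the number of successful runs."""
    completed = 0
    while iterations is None or completed < iterations:
        if not command():
            break                 # on failure, the loop stops
        completed += 1
    return completed


# A stand-in test that succeeds twice and then fails.
outcomes = iter([True, True, False, True])
print(repeat(lambda: next(outcomes), 10))
```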
Memory test. A variety of memory tests are run, with the intention to stress the system and find transient memory errors. Try running ipl 0 first, to enable error interrupt reporting from the system.
If the system has multiple processors, the test is run on all initialised CPUs. The memory test region is partitioned between the processors in different ways, depending on the test routine being run.
To test a particular region of memory, give two arguments, where the arguments are addresses in megabytes, in decimal.
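Since the arguments are given in decimal megabytes, a byte-address range has to be converted first. The Python sketch below shows one way to do that. It assumes 1 megabyte is 2^20 bytes and that the two arguments are the start and end of the region, neither of which is stated in this document; the helper name region_args is hypothetical.

```python
MB = 1024 * 1024   # assumed: 1 megabyte = 2**20 bytes


def region_args(start_addr: int, end_addr: int):
    """Convert a byte-address range into decimal megabyte arguments,
    rounding outwards so the whole range is covered."""
    if end_addr <= start_addr:
        raise ValueError("end address must be above start address")
    start_mb = start_addr // MB        # round down
    end_mb = -(-end_addr // MB)        # round up
    return start_mb, end_mb


start_mb, end_mb = region_args(0x04000000, 0x08000000)
print(f"{start_mb} {end_mb}")
```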
Combined memory and IO stress test, designed to give maximum throughput through the memory controller. This test is not implemented on early versions of Diags.
A module enabling the reading and configuration of NVRAM environment variables. There is a standard defined set of configuration variables that have meaning in Diagnostics and SRM console. Variables of this type set in diagnostics can be used to configure or influence SRM console, for example by programming the boot device or kernel parameters, or telling SRM to auto-boot.
Environment variables may also be set using the set command and read using the env command.
For further information on working with NVRAM, see the section Viewing and Changing the System NVRAM.
Write a byte to PC-compatibility IO-space. This function can be used to control devices individually. See also in.
PCI bus configuration and display. Any PCI and AGP buses in the system are scanned and configured, and the details of the PCI/AGP devices detected are displayed.
Read an entire PCI configuration space header. This is the best way to see all CSRs at a glance. The bus, device and function arguments are all decimal. See also pciin and pciout.
Read a byte from PCI configuration space. The bus, device, and function arguments are all decimal, and register is given in hexadecimal. See also pciout.
Write a byte to PCI configuration space. The bus, device, and function arguments are all decimal. The register and value arguments are given in hexadecimal. See also pciin.
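The mix of decimal and hexadecimal arguments is easy to get wrong, so a script that drives Diags over a serial link might format them in one place. The Python sketch below builds only the argument string. The argument order (bus, device, function, register, value) and the absence of a 0x prefix are assumptions; the range limits are those of standard PCI configuration space, and the helper name pci_args is hypothetical.

```python
def pci_args(bus: int, device: int, function: int,
             register: int = None, value: int = None) -> str:
    """Format PCI configuration arguments: bus, device and function in
    decimal, register and value in hexadecimal (assumed order, no prefix)."""
    if not 0 <= bus <= 255:
        raise ValueError("bus must be 0..255")
    if not 0 <= device <= 31:
        raise ValueError("device must be 0..31")
    if not 0 <= function <= 7:
        raise ValueError("function must be 0..7")
    parts = [str(bus), str(device), str(function)]
    for item in (register, value):
        if item is not None:
            if not 0 <= item <= 0xFF:
                raise ValueError("register and value must fit in one byte")
            parts.append(f"{item:x}")
    return " ".join(parts)


print("pciin " + pci_args(0, 10, 0, 0x3C))
print("pciout " + pci_args(1, 5, 0, 0x04, 0x07))
```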
Write a new environment variable into the system NVRAM. Standard, pre-defined environment variables can be used to configure diags and SRM console (see the nvram command). However, this command allows you to create and set any environment variable you wish.
Test SMP functionality in the system chipset. smp start must be run first to probe for and start any secondary processors.
Test SMP functionality in the system chipset. smp ipi sends inter-processor interrupts between on-line processors repeatedly to test the mechanism.
Direct interactive console output to the specified device. The device argument can be any of: srom, local, com1, com2, com3, com4. See also log.
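The tty and log commands accept overlapping but different sets of devices. The Python sketch below records the two sets as given in the descriptions of those commands and checks a destination before it is sent to Diags; the function name check_destination is hypothetical.

```python
# Devices accepted by tty, per its description.
TTY_DEVICES = {"srom", "local", "com1", "com2", "com3", "com4"}
# log accepts the same devices plus the parallel ports and rom.
LOG_DEVICES = TTY_DEVICES | {"lpt1", "lpt2", "rom"}


def check_destination(command: str, device: str) -> bool:
    """Return True if the device is listed as valid for the given command."""
    allowed = {"tty": TTY_DEVICES, "log": LOG_DEVICES}
    if command not in allowed:
        raise ValueError(f"unknown output command {command!r}")
    return device in allowed[command]


print(check_destination("tty", "com1"), check_destination("tty", "lpt1"),
      check_destination("log", "lpt1"))
```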
Probe for and initialise a VGA card and monitor, if present. This command first tests the integrity of the PCI option ROM, and then executes the code in that ROM through BIOS emulation.
Use the video card's memory as a buffer for testing the PCI bus. Not fully implemented in pre-release versions of Diags.