API Manufacturing Diagnostics Developer Guide

Contents

Introduction
Diags code structure
User interface routines
Hardware drivers for Northbridge chipsets
Hardware drivers for Southbridge chipsets
Interface for platform-dependent code

Warning: this information is out of date


Introduction

The target audience for this document is developers intending to work with the source code for the API manufacturing diagnostics system (aka mo-bo diags). Portions of this code are available under copyright from DEC/Compaq and Alpha Processor Inc., where mentioned in the program headers.

The diagnostic environment inherits most of its code from the Compaq 21264 EBSDK, particularly from the debug monitor. However, it has been altered and augmented with several purposes in mind:


Internal Code Structure

The code has been structured in a way that mimics the hardware structure of a typical system: Each module in the above categories is implemented to a common interface defined for that module. This enables other device drivers to use a general mechanism for interfacing with the hardware.

Input and Output

This software inherits from the debug monitor a flexible mechanism for redirecting input and output. With the debug monitor, a command-line interface could be driven from the local screen and keyboard, the serial ports, or the SROM UART. The diagnostic firmware extends this with support for simple ANSI terminal commands, which makes it possible to build structured user interfaces that are more user-friendly than a command prompt.

Formatted screens and menus make it difficult to present large volumes of detailed output. For this purpose the diagnostic firmware introduces a second output stream, referred to as the log stream, on which detailed technical information can be placed. Typically this is directed to a parallel port printer or to a serial port that is logging to a file, so that a persistent record of any detailed output is kept while the diagnostics run.
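
The idea of a separate log stream can be sketched as follows. This is a minimal illustration only, assuming ordinary C stdio streams stand in for the firmware's output devices. The names log_stream, set_log_stream and sketch_logf are hypothetical and are not taken from the diagnostics source.

```c
#include <assert.h>
#include <stdarg.h>
#include <stdio.h>

/* Hypothetical stand-in for the log stream; NULL means "not attached". */
static FILE *log_stream;

/* Redirect the log stream, e.g. to a file that captures a serial port. */
void set_log_stream(FILE *fp)
{
    assert(fp != NULL);
    log_stream = fp;
}

/* printf-style output sent only to the log stream.
 * Returns the number of characters written, or -1 if no log is attached. */
int sketch_logf(const char *fmt, ...)
{
    va_list ap;
    int n;

    if (log_stream == NULL)
        return -1;
    va_start(ap, fmt);
    n = vfprintf(log_stream, fmt, ap);
    va_end(ap);
    fflush(log_stream);   /* keep the persistent record up to date */
    return n;
}
```

In this sketch the user interface would continue to write to its own stream, while calls such as sketch_logf("ECC error at %#x\n", addr) go only to whatever was attached with set_log_stream.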


User Interface

#include "uilib.h"

The user interface implements several simple components for interaction:
struct Point                                    A point on the screen (top left is 1,1)
struct Rect                                     A rectangular region of the screen
mobo_cls( void )                                Clears the whole screen, moves cursor to top left
mobo_zap( Rect )                                Clear a given region of the screen
mobo_goto( Point )                              Moves cursor to the given point. Note that in ANSI terminal conventions, the coordinate system has (1,1) as the top left corner
mobo_putstr( const String, ... )                A printf clone
mobo_key( void )                                Poll for and return a single keypress
mobo_box( Rect, String )                        Draws a box in ASCII-art, with given title
mobo_alertf( const String, const String, ... )  Draws an alert/error dialog box, with given title and printf-style format message. It then prompts for a keypress and returns
mobo_logf( const String, ... )                  printf-format messages directed to the logging stream (typically a second terminal or parallel printer)
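
The cursor and screen routines above rest on standard ANSI escape sequences. The following is a hedged sketch of what such routines might emit, not the code in uilib.c. The field names of struct Point and the sketch_ function names are assumptions made for illustration; the escape sequences themselves (ESC [ row ; col H and ESC [ 2 J) are the standard ANSI ones.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical layout; the real struct Point in uilib.h may differ. */
struct Point { int x; int y; };   /* x = column, y = row, both 1-based */

/* Format the ANSI "cursor position" sequence ESC [ row ; col H into buf.
 * Returns the number of characters written (excluding the NUL). */
int sketch_goto_seq(char *buf, size_t len, struct Point p)
{
    assert(p.x >= 1 && p.y >= 1);   /* top left is (1,1) */
    return snprintf(buf, len, "\033[%d;%dH", p.y, p.x);
}

/* Format "clear screen" (ESC [ 2 J) followed by "cursor home" (ESC [ H). */
int sketch_cls_seq(char *buf, size_t len)
{
    return snprintf(buf, len, "\033[2J\033[H");
}
```

Note that the ANSI sequence takes the row before the column, so a routine that accepts an (x, y) point has to swap the order when it builds the string.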

Platform

#include "platform.h"

Any feature that may be implemented in a platform-specific way forms part of this interface:
String Prompt               Prompt string for this platform
plat_rombase, plat_romsize  Base and size of ROM. Base is typically zero unless there is a good reason for it to be otherwise.
plat_fixup( void )          Once the standard diagnostics initialisations are made, this function is called to apply any further platform-specific initialisations
plat_inroml( unsigned )     Read a longword from the ROM
plat_inromb( unsigned )     Read a byte from the ROM
plat_bios( unsigned )       Jump to BIOS image in firmware
plat_reset( unsigned )      Perform a software reset
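
To illustrate how the byte and longword ROM reads relate, here is a small sketch that builds a longword read from four byte reads. It is an assumption-laden stand-in, not the platform code: the ROM is simulated by an array, the return types are guesses, the little-endian byte order is assumed (as is usual on Alpha), and every sketch_ name is hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical ROM image used in place of real hardware. */
static const uint8_t fake_rom[8] = {
    0x78, 0x56, 0x34, 0x12, 0xEF, 0xBE, 0xAD, 0xDE
};
static const unsigned sketch_rombase = 0;               /* base is typically zero */
static const unsigned sketch_romsize = sizeof fake_rom;

/* Stand-in for a byte read from the ROM at the given offset. */
uint8_t sketch_inromb(unsigned offset)
{
    assert(offset < sketch_romsize);
    return fake_rom[sketch_rombase + offset];
}

/* Stand-in for a longword (32-bit) read: four bytes, least significant first. */
uint32_t sketch_inroml(unsigned offset)
{
    uint32_t v = 0;
    unsigned i;

    for (i = 0; i < 4; i++)
        v |= (uint32_t)sketch_inromb(offset + i) << (8 * i);
    return v;
}
```

A real platform would replace the array access with whatever bus cycles its ROM requires, which is the reason these reads sit behind the platform interface.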

Northbridge (Memory Controller)

#include "northbridge.h"

The features of a typical Northbridge are accessed with this interface:
nb_fixup( void )  Once the standard diagnostics initialisations are made, this function is called to apply any further chipset-specific initialisations
nb_ecc( bool )    Enables or disables ECC control in the chipset.

Southbridge (Standard IO Controller)

#include "southbridge.h"

The features of a typical Southbridge are accessed with this interface:
sb_fixup( void )  Once the standard diagnostics initialisations are made, this function is called to apply any further chipset-specific initialisations


Details of Operation

Memory Map

A conscious effort should be made to use as little memory as possible, and at as low an address range as possible. The diagnostics use the following memory map:

0 - 0x7000     Unused
0x7000         PALcode impure area. Contains data passed from PALcode to the diagnostics
0x8000         PALcode
0x10000 (64K)  Diagnostic code and data
8M             Top of stack (nb: could be lower)
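
The memory map above can be written down as constants, as in the sketch below. The addresses come from the table; the macro names, the enum and the helper functions are hypothetical and exist only to illustrate the layout.

```c
#include <assert.h>

/* Addresses from the memory map; the names are hypothetical. */
#define DIAG_IMPURE_BASE  0x7000UL             /* PALcode impure area */
#define DIAG_PAL_BASE     0x8000UL             /* PALcode */
#define DIAG_CODE_BASE    0x10000UL            /* diagnostic code and data (64K) */
#define DIAG_STACK_TOP    (8UL * 1024 * 1024)  /* top of stack (8M) */

enum diag_region {
    REGION_UNUSED,    /* 0 up to the impure area */
    REGION_IMPURE,    /* data passed from PALcode to the diagnostics */
    REGION_PALCODE,
    REGION_DIAG,      /* diagnostic code, data and stack */
    REGION_OUTSIDE    /* at or above the top of stack */
};

/* Classify an address according to the map. */
enum diag_region diag_region_of(unsigned long addr)
{
    if (addr < DIAG_IMPURE_BASE) return REGION_UNUSED;
    if (addr < DIAG_PAL_BASE)    return REGION_IMPURE;
    if (addr < DIAG_CODE_BASE)   return REGION_PALCODE;
    if (addr < DIAG_STACK_TOP)   return REGION_DIAG;
    return REGION_OUTSIDE;
}

/* Bytes available for diagnostic code, data and stack together. */
unsigned long diag_space(void)
{
    assert(DIAG_STACK_TOP > DIAG_CODE_BASE);
    return DIAG_STACK_TOP - DIAG_CODE_BASE;
}
```

With the values shown, the diagnostics have 8M minus 64K at their disposal, and lowering the top of stack shrinks that figure directly.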

Startup Procedure

Typically the diagnostics are started via the SROM, although other entry mechanisms should not be ruled out. Entry into the diagnostics comes via the PALcode POWERUP routine (at the base of the PALcode image).

dbm/dbmentry.S  This code is at the start of the diagnostics image. It writes to the LEDs, sets up entry points for interactions with the PALcode and calls main.
dbm/main.c      Initialises the console and writes first messages. Sets up internal data structures and does the least device initialisation possible.
dbm/uilib.c     Runs the terminal interface (mobo) and interacts with the user

Interrupt Handling Procedure

Interrupts may come from several sources. The interrupts that are handled by the diagnostic firmware can take any of the following routes through PALcode:
INTERRUPT External interrupt from a system device (or another processor)
MCHK Machine Check interrupt - for example a multi-bit ECC error
CRD Correctable Read Error - a single-bit ECC error

The PALcode performs initial processing of the interrupt, including any scrubbing necessary. It then calls the interrupt handler entry point (simplehandler), in dbm/dbmentry.S. This routine performs a context save, and then jumps to the C environment of dbm/handler.c. After processing, the original context is restored.

For an external device interrupt, the handler performs the necessary processing and raises an alert if the interrupt was not expected. For a machine check or CRD, the handler calls an interpretation routine (cpu_mchkinterp) which analyses the cause of the error. As much information as possible is put into the log stream (via mobo_logf) if a machine check or CRD occurs.
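
The dispatch described above can be sketched in C as follows. This is not the code in dbm/handler.c: the enum, the result structure and sketch_handler are hypothetical, and the alert and logging steps are reduced to counters so that the control flow is visible on its own.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical codes for the three routes through PALcode. */
enum sketch_route { ROUTE_INTERRUPT, ROUTE_MCHK, ROUTE_CRD };

/* Outcomes recorded by the sketch in place of real alerts and log output. */
struct sketch_result {
    int alerts;   /* alerts raised to the user */
    int logged;   /* detailed entries written to the log stream */
};

/* Dispatch one interrupt after the context save.
 * 'expected' is only meaningful for external device interrupts. */
void sketch_handler(enum sketch_route route, int expected,
                    struct sketch_result *r)
{
    assert(r != NULL);
    switch (route) {
    case ROUTE_INTERRUPT:
        /* Device interrupt: alert only if nobody was waiting for it. */
        if (!expected)
            r->alerts++;
        break;
    case ROUTE_MCHK:
    case ROUTE_CRD:
        /* The real handler would call cpu_mchkinterp here and write
         * its findings to the log stream. */
        r->logged++;
        break;
    }
}
```

Both the machine check and CRD routes share one branch in this sketch because the text describes the same interpretation and logging step for each.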